Prometheus is a beast of a monitoring system, but Grafana dashboards can easily bring it to its knees. If you have ever set up a company-wide wallboard that auto-refreshes every ten seconds, or if you have multiple SRE teams querying the exact same raw metrics, you have probably seen your Prometheus CPU usage spike into a terrifying sawtooth pattern.

The obvious solution is to slap an HTTP cache in front of it. But if you try doing that with a generic proxy like Varnish or Nginx, you will quickly find that it almost never serves a hit.

Prometheus query URLs contain high-precision, floating-point timestamps (e.g., time=1626359042.123 or start=1626359012.000&end=1626359042.000). Because these timestamps are dynamically generated by the client browser on every single auto-refresh, the URL is never the same twice. Your cache hit rate drops to zero percent, and your Prometheus instance continues to thrash.
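To make the problem concrete, here is a minimal sketch of what a generic cache effectively does: it keys on the raw request URI. The function name naiveCacheKey and the prometheus:9090 host are illustrative assumptions, not part of PromCache or of any particular proxy.

```go
package main

import (
	"fmt"
	"net/url"
)

// naiveCacheKey mimics a generic HTTP cache by using the raw request URI
// (path plus query string) as the cache key. Hypothetical helper for
// illustration only.
func naiveCacheKey(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	return u.RequestURI(), nil
}

func main() {
	// Two refreshes of the same panel, half a second apart.
	first, _ := naiveCacheKey("http://prometheus:9090/api/v1/query?query=up&time=1626359042.123")
	second, _ := naiveCacheKey("http://prometheus:9090/api/v1/query?query=up&time=1626359042.623")

	fmt.Println(first == second) // false: same query, different keys
}
```

Every refresh produces a key the cache has never seen, so every request falls through to Prometheus.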

To fix this properly, I built PromCache: a lightweight caching proxy written in Go, designed specifically to normalize and cache Prometheus API requests.

graph TD
    Grafana[Grafana / Client] -->|GET /api/v1/query_range?start=1626359042.123| PC[PromCache Proxy]
    PC -->|1. Sort Parameters| Norm[Normalize Query]
    PC -->|2. Round Timestamps to TTL| Norm
    Norm -->|Generate Key| Cache{In-Memory Cache}
    Cache -->|Hit: X-Cache: HIT| Grafana
    Cache -->|Miss: Forward & Save| Prom[Upstream Prometheus]

Intelligent timestamp alignment

The core of PromCache is its timestamp alignment step. When an incoming GET request hits the proxy, PromCache intercepts the query parameters and snaps the high-precision time arguments (time, start, end) to a TTL boundary: down to the start of the current window, or up to the next boundary when the caller asks for it.

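// roundTimeParameter snaps a Unix-timestamp query parameter to a TTL boundary:
// down to the start of its window, or up to the next boundary when roundUp is set.
// Requires the "math", "net/url", and "strconv" packages.
func (p *HTTPCacheProxy) roundTimeParameter(query url.Values, paramName string, ttlSeconds int64, roundUp bool) {
	if ttlSeconds <= 0 {
		return // Avoid dividing by zero on a misconfigured TTL
	}
	paramStr := query.Get(paramName)
	if paramStr == "" {
		return
	}
	paramTime, err := strconv.ParseFloat(paramStr, 64)
	if err != nil || math.IsNaN(paramTime) || math.IsInf(paramTime, 0) {
		return // Skip anything that is not a finite Unix timestamp (e.g. RFC 3339 strings)
	}

	var roundedTime int64
	if roundUp {
		// Ceil first: truncating 1626359040.5 would leave it on a boundary
		// that lies before the requested time.
		seconds := int64(math.Ceil(paramTime))
		roundedTime = ((seconds + ttlSeconds - 1) / ttlSeconds) * ttlSeconds
	} else {
		// Round down to the start of the current TTL window
		roundedTime = (int64(math.Floor(paramTime)) / ttlSeconds) * ttlSeconds
	}

	query.Set(paramName, strconv.FormatInt(roundedTime, 10))
}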

If your cache TTL is set to fifteen seconds, queries executed at 1626359042 and 1626359044 are both rounded down to the same bucket boundary, 1626359040.

By collapsing every timestamp inside a window onto one boundary, PromCache forces identical queries executed within that window to map to the exact same cache key. The trade-off is that a response can be up to one TTL old. The browser still gets fresh-enough data, but Prometheus only has to calculate the query once every fifteen seconds.
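To make the bucketing arithmetic concrete, here is a small standalone sketch of the round-down case. The helper name bucketStart is an assumption for illustration; it is not taken from PromCache, and it assumes non-negative Unix timestamps and a positive TTL.

```go
package main

import (
	"fmt"
	"math"
)

// bucketStart returns the start of the TTL-wide window that contains ts.
// Assumes ts is a non-negative Unix timestamp and ttlSeconds > 0.
func bucketStart(ts float64, ttlSeconds int64) int64 {
	seconds := int64(math.Floor(ts)) // drop the sub-second part
	return (seconds / ttlSeconds) * ttlSeconds
}

func main() {
	const ttl = 15
	for _, ts := range []float64{1626359040.000, 1626359042.123, 1626359044.900, 1626359055.001} {
		fmt.Printf("%.3f -> %d\n", ts, bucketStart(ts, ttl))
	}
	// The first three timestamps land in bucket 1626359040;
	// the last one starts the next bucket, 1626359055.
}
```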

Alphabetical query normalization

In addition to temporal alignment, PromCache normalizes the query arguments themselves. Prometheus clients can append parameters in any order, and repeated parameters such as match[] can arrive in a different order from different tools.

To guarantee cache key stability, PromCache extracts all query keys, sorts them alphabetically, and sorts any multi-value arrays before constructing the final key.

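// normalizeQueryString builds a canonical query string for use in the cache key:
// keys sorted alphabetically, each key's values sorted, everything URL-escaped.
func (p *HTTPCacheProxy) normalizeQueryString(query url.Values) string {
	if len(query) == 0 {
		return ""
	}

	keys := make([]string, 0, len(query))
	for k := range query {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	b.Grow(128)

	for _, k := range keys {
		// Sort a copy so the caller's url.Values keeps its original value order
		values := append([]string(nil), query[k]...)
		sort.Strings(values)

		for _, v := range values {
			if b.Len() > 0 {
				b.WriteByte('&')
			}
			// Escape so a literal '&' or '=' inside a value cannot collide with another key
			b.WriteString(url.QueryEscape(k))
			b.WriteByte('=')
			b.WriteString(url.QueryEscape(v))
		}
	}
	return b.String()
}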

This guarantees that query=up&step=15 and step=15&query=up map to the identical cache entry, so parameter ordering never costs you a cache hit.
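A quick way to check that property is to run both orderings through a normalizer and compare the results. The sketch below is a standalone approximation of the idea: cacheKey is a hypothetical helper that parses a raw query string with url.ParseQuery, and it is not the method PromCache exposes.

```go
package main

import (
	"fmt"
	"net/url"
	"sort"
	"strings"
)

// cacheKey parses a raw query string and re-serializes it with sorted keys
// and sorted values. Hypothetical helper, for illustration only.
func cacheKey(rawQuery string) (string, error) {
	values, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", err
	}

	keys := make([]string, 0, len(values))
	for k := range values {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var parts []string
	for _, k := range keys {
		vs := append([]string(nil), values[k]...) // copy before sorting
		sort.Strings(vs)
		for _, v := range vs {
			parts = append(parts, url.QueryEscape(k)+"="+url.QueryEscape(v))
		}
	}
	return strings.Join(parts, "&"), nil
}

func main() {
	a, _ := cacheKey("query=up&step=15")
	b, _ := cacheKey("step=15&query=up")

	fmt.Println(a)      // query=up&step=15
	fmt.Println(a == b) // true
}
```

Sorting repeated values is only safe when their order carries no meaning, which this sketch assumes.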

Metrics and integration

PromCache is designed to be as non-intrusive as possible. It is a single, self-contained binary that you can drop directly in front of your Prometheus datasource in Grafana.

It exports its own Prometheus-compatible metrics, letting you monitor your cache hit ratio, response times, and memory footprint in real time. If you are serious about keeping your monitoring stack lightweight and resilient, this is the missing piece of the puzzle.