Metric types: counter, gauge, histogram, summary

Why different types exist

A metric is not just a number. Its semantics decide how you aggregate it.

http_requests_total rose by 1200 over a minute, so rate = 1200 / 60 s = 20 req/s. A counter is monotonic, and the difference between two measurements divided by the elapsed time is the rate.
memory_usage_bytes = 2.3 GB right now. A gauge is read directly and can swing up and down, so averaging it over an hour is meaningful.
request_duration_seconds is a distribution, so you need percentiles (p50, p99). The mean alone is of little use: it hides tail latency.

Each type is a contract between the application and the query system (PromQL/MetricsQL). Using rate() on a gauge gives garbage. Using avg() on a histogram loses the meaning.

Counter, monotonically increasing

Semantics: how many times an event happened. Up only, with a reset on process restart.

http_requests_total{method="GET",status="200"} 12345

In PromQL, never use the raw counter. Always go through rate() or increase():

rate(http_requests_total[5m])           # average req/s over 5 min

increase(http_requests_total[1h])       # how much it grew over an hour

sum by (status)(rate(http_requests_total[5m]))  # by status

rate() automatically handles a counter reset (on restart).
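To make the reset handling concrete, here is a minimal Python sketch of the idea. It is not Prometheus's implementation: it skips the extrapolation to the window boundaries that the real rate() and increase() perform, and the function names and sample data are hypothetical.

```python
def increase(samples):
    """Total growth of a counter over a list of (timestamp, value) samples.

    A drop in value is treated as a counter reset: the counter is assumed
    to have restarted from 0, so the whole new value counts as growth.
    """
    total = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        total += cur - prev if cur >= prev else cur
    return total


def rate(samples):
    """Average per-second growth between the first and last sample."""
    elapsed = samples[-1][0] - samples[0][0]
    return increase(samples) / elapsed


# Scraped every 15 s; the process restarts between t=30 and t=45.
samples = [(0, 1000), (15, 1300), (30, 1600), (45, 200), (60, 500)]
print(increase(samples))  # 300 + 300 + 200 + 300 = 1100.0
print(rate(samples))      # 1100 / 60 s, about 18.3 per second
```

Without the reset rule, the drop from 1600 to 200 would show up as a large negative delta, which is why the raw counter is not graphed directly.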

Examples of counters in real systems:

process_cpu_seconds_total, total CPU
node_network_receive_bytes_total, bytes received
kafka_consumer_messages_consumed_total

Gauge, current value

Semantics: the value right now. It can rise and fall.

node_memory_MemAvailable_bytes 4521234432

goroutines_active 142

queue_depth{queue="orders"} 87

In PromQL, use it directly:

node_memory_MemAvailable_bytes / 1024 / 1024 / 1024

avg_over_time(queue_depth[10m])

max(queue_depth) by (queue)

Do not use rate() on a gauge! rate(temperature[5m]) is meaningless.
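One way to see why: counter functions assume that any decrease is a reset to 0. The sketch below applies that assumption to a gauge. The helper name and the readings are hypothetical, and the logic is a simplification of what rate() does internally.

```python
def counter_style_increase(values):
    """Sum the deltas the way a counter function would: a drop is taken
    to be a reset to 0, so the full new value is added as growth."""
    total = 0.0
    for prev, cur in zip(values, values[1:]):
        total += cur - prev if cur >= prev else cur
    return total


# Hypothetical queue_depth readings: the queue fills up, then drains.
queue_depth = [10, 80, 20, 25]

bogus_growth = counter_style_increase(queue_depth)   # 70 + 20 + 5 = 95.0
net_change = queue_depth[-1] - queue_depth[0]        # 15, the real change
average = sum(queue_depth) / len(queue_depth)        # 33.75, what avg_over_time reports
print(bogus_growth, net_change, average)
```

The drop from 80 to 20 is counted as 20 units of growth, so the result describes nothing real about the queue.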

Histogram, a distribution with buckets

Semantics: how many events fell into each bucket by value.

http_request_duration_seconds_bucket{le="0.1"}  4500

http_request_duration_seconds_bucket{le="0.25"} 4800

http_request_duration_seconds_bucket{le="0.5"}  4920

http_request_duration_seconds_bucket{le="1"}    4980

http_request_duration_seconds_bucket{le="+Inf"} 5000

http_request_duration_seconds_sum               350.5

http_request_duration_seconds_count             5000

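Each bucket is a counter, and buckets are cumulative: le="0.5" counts every event that took <=0.5 sec, including those already counted in le="0.1" and le="0.25". The le="+Inf" bucket therefore always equals _count.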
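A small Python sketch of how such bucket counts come about, assuming a hypothetical list of observed durations and the bucket bounds used in the example above:

```python
import math


def cumulative_buckets(observations, bounds):
    """Return {upper_bound: number of observations <= upper_bound}.

    The +Inf bucket is always appended, so it equals the total count.
    """
    all_bounds = list(bounds) + [math.inf]
    return {le: sum(1 for v in observations if v <= le) for le in all_bounds}


# Hypothetical request durations in seconds.
durations = [0.03, 0.08, 0.2, 0.2, 0.4, 0.9, 3.0]
buckets = cumulative_buckets(durations, bounds=[0.1, 0.25, 0.5, 1])

for le, count in buckets.items():
    # Python prints math.inf as "inf"; the exposition format writes "+Inf".
    print(f'le="{le}" {count}')
```

The 3.0 s observation lands only in the +Inf bucket, which is how values beyond the largest finite bound are still counted.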

The percentile is computed at query time:

histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))
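The estimate is a linear interpolation inside the bucket that contains the requested rank. The following Python sketch shows that idea on the raw cumulative counts from the example above. The real function works on per-second bucket rates rather than raw counts, and the code here is an illustration, not Prometheus's source.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) pairs.

    Assumes observations are spread evenly inside each bucket and that
    the lowest bucket starts at 0.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]            # the +Inf bucket holds the total count
    if total == 0:
        return math.nan               # no observations, nothing to estimate
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                return lower_bound    # cannot interpolate into +Inf
            in_bucket = count - lower_count
            if in_bucket == 0:
                return upper_bound
            return lower_bound + (upper_bound - lower_bound) * (rank - lower_count) / in_bucket
        lower_bound, lower_count = upper_bound, count
    return math.nan


buckets = [(0.1, 4500), (0.25, 4800), (0.5, 4920), (1, 4980), (math.inf, 5000)]
p99 = histogram_quantile(0.99, buckets)  # rank 4950 falls in (0.5, 1], giving about 0.75
print(p99)
```

The answer can only be as precise as the bucket it falls into: here the true p99 could be anywhere between 0.5 s and 1 s.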

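A histogram is aggregatable across instances: sum the bucket rates before taking the quantile, for example sum by (le, status). The le label must survive the aggregation, otherwise histogram_quantile() has no buckets to work with. This is unlike summary.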

Choose buckets carefully. The default 11 buckets from 5ms to 10s suit HTTP, but not DB queries (which need microseconds) or batch jobs (which run for minutes). Too many buckets lead to cardinality explosion (see cardinality-explosion).
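For latencies that span several orders of magnitude, exponentially spaced bounds are a common choice. The helper below is a hypothetical Python sketch, similar in spirit to the exponential-bucket helpers that some client libraries ship. The two layouts are illustrative guesses for the DB and batch cases mentioned above, not recommendations.

```python
def exponential_buckets(start, factor, count):
    """Return `count` upper bounds: start, start*factor, start*factor**2, ..."""
    if start <= 0 or factor <= 1 or count < 1:
        raise ValueError("need start > 0, factor > 1, count >= 1")
    return [start * factor ** i for i in range(count)]


# DB queries: 100 microseconds up to roughly 1.6 s in 8 buckets.
db_query_buckets = exponential_buckets(start=0.0001, factor=4, count=8)

# Batch jobs: 1 s up to 2048 s (about 34 minutes) in 12 buckets.
batch_job_buckets = exponential_buckets(start=1, factor=2, count=12)

print(db_query_buckets)
print(batch_job_buckets)
```

Each extra bucket is one more time series per label combination, so the count is worth keeping small.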

Summary, pre-computed quantiles

The client computes the quantiles itself (p50, p95, p99) and exports them:

http_request_duration_seconds{quantile="0.5"}  0.012

http_request_duration_seconds{quantile="0.95"} 0.087

http_request_duration_seconds{quantile="0.99"} 0.241

http_request_duration_seconds_sum              350.5

http_request_duration_seconds_count            5000

Problems with summary:

Not aggregatable: you cannot average or sum the quantiles of several instances, because the result is mathematically wrong. A quantile only makes sense per instance.
Expensive on CPU and memory, since the client holds a running quantile estimator.
Quantiles are fixed in the client code, so you cannot compute p99.9 if the client does not export it.

Use summary only when p99 is cheap to compute in the client and aggregation is not needed. In most cases, histogram is better.
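The aggregation problem is easy to reproduce. This Python sketch uses two hypothetical instances with made-up latencies and compares the average of their p99 values with the p99 of the combined traffic:

```python
import math


def quantile(values, q):
    """Nearest-rank quantile: the smallest value with at least a share q
    of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


# Instance A is fast and busy, instance B is slow and handles little traffic.
instance_a = [0.01] * 995 + [0.05] * 5
instance_b = [0.5] * 90 + [2.0] * 10

p99_a = quantile(instance_a, 0.99)                   # 0.01
p99_b = quantile(instance_b, 0.99)                   # 2.0
averaged = (p99_a + p99_b) / 2                       # about 1.005
true_p99 = quantile(instance_a + instance_b, 0.99)   # 0.5
print(averaged, true_p99)
```

The averaged figure is about twice the real fleet-wide p99, because it ignores how much traffic each instance handled. Histogram buckets avoid this: counts from different instances can simply be added.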

Histogram vs Summary

Criterion	Histogram	Summary
Where it is computed	server (PromQL)	client
Aggregatable across instances	yes	no
Quantile precision	bucket-bounded	exact-ish
Quantile changeable after the fact	yes (new query)	no
Memory in client	low	medium
Cardinality	bucket × labels	quantile × labels

Rule: use histogram. Summary is only for legacy.

Native histogram (Prom 2.40+)

The problem with a classic histogram is fixed buckets, either many (cardinality) or few (poor precision).

A native histogram (a.k.a. sparse histogram) builds buckets on the fly, with a logarithmic scale and sparse encoding:

metric_native_histogram{} {schema:1, count:5000, sum:350.5,

                          positive_buckets: ...sparse...}

One time series per metric (instead of N buckets)
Precision of about 1-3% at any quantile
100x less storage than a classic histogram with the same precision
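A rough Python sketch of the logarithmic, sparse bucketing idea. It assumes the exponential scheme in which bucket boundaries are powers of base = 2^(2^-schema); the function names are hypothetical and only positive observations are handled.

```python
import math


def bucket_index(value, schema):
    """Index i of the bucket (base**(i-1), base**i] that holds a positive
    value, where base = 2 ** (2 ** -schema)."""
    if value <= 0:
        raise ValueError("this sketch only covers positive observations")
    return math.ceil(math.log2(value) * 2 ** schema)


def bucket_bounds(index, schema):
    """Lower (exclusive) and upper (inclusive) bound of bucket `index`."""
    base = 2 ** (2 ** -schema)
    return base ** (index - 1), base ** index


def sparse_counts(observations, schema):
    """Only buckets that actually received observations are stored."""
    counts = {}
    for v in observations:
        i = bucket_index(v, schema)
        counts[i] = counts.get(i, 0) + 1
    return counts


counts = sparse_counts([0.3, 0.31, 3, 100], schema=1)
print(counts)                      # three populated buckets, nothing in between
print(bucket_bounds(4, schema=1))  # the bucket that holds 3
```

A higher schema gives narrower buckets: under this assumption, schema 5 makes each bucket about 2.2% wider than the previous one, which is in line with the precision figure above.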

It requires:

A client SDK with support (Go >=1.16, Python >=0.18, Java >=1.0)
Prometheus 2.40+ with --enable-feature=native-histograms
Grafana 10+ for visualization

Production-ready as of Prom 2.50+ (2024). It should become the default.

Exemplars, a bridge to traces

An exemplar is a concrete sample attached to a bucket:

http_request_duration_seconds_bucket{le="0.5"} 4920 # {trace_id="abc123"} 0.42 1683456789.123

"One of the 4920 requests in this bucket had trace_id=abc123." In Grafana you can click a point on the graph and jump into the trace (tracing-basics).

Support:

Prometheus 2.26+ (requires the OpenMetrics format)
SDK: Go, Java, Python, .NET
Grafana 8+
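To show how the exemplar line shown earlier breaks down into fields, here is a minimal Python parsing sketch. The regular expressions are hypothetical and deliberately simple: they do not handle escaped quotes or a closing brace inside a label value, so they are not a substitute for a real OpenMetrics parser.

```python
import re

EXEMPLAR_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'   # metric name
    r'(?:\{(?P<labels>[^}]*)\})?\s+'          # optional label set
    r'(?P<value>\S+)'                         # sample value
    r'\s+#\s+\{(?P<ex_labels>[^}]*)\}\s+'     # exemplar labels
    r'(?P<ex_value>\S+)'                      # exemplar value
    r'(?:\s+(?P<ex_ts>\S+))?$'                # optional exemplar timestamp
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_exemplar(line):
    """Split a sample line with an exemplar into its parts, or return None."""
    m = EXEMPLAR_LINE.match(line.strip())
    if m is None:
        return None
    return {
        "metric": m.group("name"),
        "labels": dict(LABEL.findall(m.group("labels") or "")),
        "value": float(m.group("value")),
        "exemplar_labels": dict(LABEL.findall(m.group("ex_labels"))),
        "exemplar_value": float(m.group("ex_value")),
        "exemplar_timestamp": float(m.group("ex_ts")) if m.group("ex_ts") else None,
    }


line = 'http_request_duration_seconds_bucket{le="0.5"} 4920 # {trace_id="abc123"} 0.42 1683456789.123'
parsed = parse_exemplar(line)
print(parsed["exemplar_labels"]["trace_id"], parsed["exemplar_value"])
```

The exemplar value (0.42) is the duration of that one traced request, while 4920 remains the cumulative bucket count.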

OpenMetrics, the formal spec

OpenMetrics is a CNCF standard (RFC-style) for metrics that extends the Prometheus exposition format:

UTF-8, not ASCII
A # UNIT line
Exemplars are formalized
Protobuf serialization (optional, alongside the mandatory text format)

In 2025, most SDKs export to OpenMetrics automatically. Prometheus and the [[opentelemetry|OTel collector]] both understand it.
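For orientation, this is roughly what an OpenMetrics text exposition looks like when produced by hand. The renderer below is a hypothetical Python sketch for a single gauge family: it does no escaping or validation, and real applications would rely on a client library instead.

```python
def render_gauge(name, unit, help_text, samples):
    """Render one gauge family as OpenMetrics text.

    `samples` is a list of (labels_dict, value) pairs. The metric name is
    expected to end with the unit, e.g. ..._bytes for unit "bytes".
    """
    lines = [
        f"# TYPE {name} gauge",
        f"# UNIT {name} {unit}",
        f"# HELP {name} {help_text}",
    ]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}" if label_str else f"{name} {value}")
    lines.append("# EOF")  # OpenMetrics expositions end with an explicit EOF marker
    return "\n".join(lines) + "\n"


text = render_gauge(
    "node_memory_MemAvailable_bytes",
    "bytes",
    "Memory available for new allocations.",
    [({}, 4521234432)],
)
print(text, end="")
```

The # UNIT line and the trailing # EOF marker are the visible differences from the older Prometheus text format in this small example.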

When things go wrong

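rate(my_metric[5m]) returns 0: the value did not change inside the window. PromQL does not look at the metric name, so check that the metric really is a counter that gets incremented (a gauge sitting at a constant value also yields 0). Name counters with the _total suffix so the type is obvious.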
p99 latency jumps around: too few samples in the window (rate over [5m] for 0.1 req/s is 30 events, which is noisy). Widen the window to [30m].
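histogram_quantile() returns NaN: there were no observations in the window, so every bucket rate is 0. Check that the rate of _count over the same window is > 0. Values above the largest finite bucket do not produce NaN; the result is capped at that bucket's upper bound.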
Cardinality explosion: you added endpoint=/api/v1/user/123, so every user_id creates a new series. Refactor to endpoint=/api/v1/user/:id.
Summary quantile is inaccurate after a restart: the client estimator resets. This is a property of summary, not a bug.
le written as a number, not a string: label values are always quoted strings, so Prom expects le="0.1", le="0.5". An unquoted le=0.1 breaks parsing.
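For the cardinality case above, the endpoint label can be normalized before it reaches the metric. The following Python sketch uses hypothetical rules (numeric segments and UUIDs count as IDs). Where the web framework exposes the matched route template, using that template is usually the more reliable option.

```python
import re

# Hypothetical rule: purely numeric segments and UUIDs are treated as IDs.
_ID_SEGMENT = re.compile(
    r"^(?:\d+|[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})$"
)


def endpoint_label(path):
    """Collapse ID-like path segments so the label has a bounded set of values."""
    path = path.split("?", 1)[0]  # drop the query string
    segments = [":id" if _ID_SEGMENT.match(s) else s for s in path.split("/")]
    return "/".join(segments)


print(endpoint_label("/api/v1/user/123"))          # /api/v1/user/:id
print(endpoint_label("/api/v1/user/456?full=1"))   # /api/v1/user/:id
```

Both requests now map to the same series instead of one series per user.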

