The Hidden Cost of High Cardinality in Observability Platforms

Your self-hosted Prometheus setup was running fine. Memory was manageable, queries were fast, dashboards loaded in under a second. Then you migrated to a managed metrics platform (Datadog, Grafana Cloud, New Relic, or one of the alternatives), and three months later the bill is two or three times what the sales estimate projected. Your engineering manager wants an explanation. The data volume is roughly the same. Event counts are flat. What changed?

Cardinality. Specifically, label cardinality — the number of unique label-value combinations your metrics carry. When you were self-hosted, high cardinality was a performance problem: more RAM, slower queries, more disk. Now that you're on a managed platform, it's a billing problem: more unique time series, larger invoice. The data volume didn't change. The pricing model did.

How Label Explosion Works: The Math That Compounds

A time series in Prometheus (and most TSDB-based metric backends) is uniquely identified by its metric name plus its complete label set. http_requests_total with labels {status_code="200", service="payment-svc"} is one time series. http_requests_total with labels {status_code="500", service="payment-svc"} is a different time series.

The cardinality of a metric is the count of all unique label combinations it can produce. For http_requests_total with three labels:

status_code: 6 values (200, 201, 301, 302, 404, 500)
service: 20 services
method: 4 values (GET, POST, PUT, DELETE)

That's 6 × 20 × 4 = 480 time series. Completely manageable. Now add a fourth label:

pod_name: 200 pods (10 replicas across 20 services)

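Multiplied naively, that is 6 × 20 × 4 × 200 = 96,000 time series for one metric. Because each pod belongs to exactly one service, the realized count is closer to 6 × 4 × 200 = 4,800, but the naive product is the ceiling to budget against whenever label values are independent of each other. Datadog bills every distinct combination as a custom metric. At Datadog's indexed metrics pricing (approximately $0.05 per custom metric per month at published rates), 96,000 series would cost roughly $4,800/month, and even 4,800 series cost about $240/month, for a single metric.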
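
The multiplication above can be sketched in a few lines of Python. This is a minimal illustration, not a billing calculator: the function names are made up for this example, and the price constant is an assumption taken from the approximate figure quoted above. The product is a worst-case ceiling; labels that are correlated with each other (a pod belongs to one service) produce fewer series in practice.

```python
from math import prod

# Assumed list price: about $5 per 100 custom metrics per month
# (equivalent to the ~$0.05 per series figure quoted in the text).
PRICE_PER_100_SERIES_USD = 5.0


def series_upper_bound(label_cardinalities: dict[str, int]) -> int:
    """Worst-case series count: the product of distinct values per label."""
    return prod(label_cardinalities.values())


def monthly_cost_usd(series: int) -> float:
    """Estimated monthly cost under the assumed per-series price."""
    return series / 100 * PRICE_PER_100_SERIES_USD


base = {"status_code": 6, "service": 20, "method": 4}
with_pod = {**base, "pod_name": 200}

for name, labels in [("before", base), ("after", with_pod)]:
    n = series_upper_bound(labels)
    print(f"{name}: up to {n:,} series, ~${monthly_cost_usd(n):,.0f}/month")
```

Running the same calculation against a proposed label before it ships is a cheap way to catch the multiplication ahead of the invoice.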

Adding one high-cardinality label to a metric emitted by 200 pods across 5 clusters creates the kind of multiplication that shows up in the next billing cycle, not in any alert. The pods and clusters were already there. The label was a one-line addition to your OpenTelemetry configuration. Nobody expected it to multiply a metric's cost by as much as 200x.

The Labels That Actually Cause This

Not all high-cardinality labels are accidental. Some are introduced deliberately for good reasons, then left in place after the reason no longer applies. Others are added without understanding the cardinality implications. The recurring offenders:

Container ID and Pod Name: In Kubernetes environments, container_id and pod_name are easy to include in metric labels because they're available in the metadata. The problem: neither is stable. A container ID changes on every container restart, and a pod name changes whenever a pod is replaced by a rollout, rescheduling, or eviction. A workload whose pods are replaced 10 times in a month produces 10 distinct values for that label. Across a fleet of 200 pods with normal Kubernetes churn, this can create thousands of distinct label values that accumulate over your retention window.

Deployment SHA or Build Version: version="a3f8c2d" is useful context. As a metric label, it creates a new set of time series for every deploy. A team shipping 10 times per day creates 300 distinct version values per month. Without explicit label value expiry, these accumulate indefinitely.

User ID, Session ID, Request ID: These belong in logs and traces, not in aggregate metrics. A metric labeled with user_id at 50,000 active users creates 50,000 time series per metric. Request ID is worse: one value per request, which at any meaningful scale creates cardinality that no TSDB can handle efficiently and that no managed platform will accept without a harsh lesson about its overage policy.

Endpoint Path Templates vs. Raw Paths: /api/orders/{order_id} as a template label has bounded cardinality (one value per route). /api/orders/48302 as a raw path value has unbounded cardinality (one value per order). HTTP middleware and auto-instrumentation libraries that generate labels from raw URL paths are a common source of this class of cardinality explosion.
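
Where the framework does not expose the matched route template, raw paths can be collapsed before they become label values. The sketch below is one hedged way to do that: normalize_path and its two patterns are hypothetical and only cover numeric IDs and UUIDs, so other ID formats would need their own rules. Using the router's own template is preferable when it is available.

```python
import re

# Hypothetical patterns for ID-like path segments; extend for your own formats.
_NUMERIC = re.compile(r"^\d+$")
_UUID = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")


def normalize_path(raw_path: str) -> str:
    """Replace ID-like path segments with placeholders to bound cardinality."""
    path = raw_path.split("?", 1)[0]  # the query string never belongs in a label
    segments = []
    for segment in path.split("/"):
        if _NUMERIC.match(segment):
            segments.append("{id}")
        elif _UUID.match(segment):
            segments.append("{uuid}")
        else:
            segments.append(segment)
    return "/".join(segments)


print(normalize_path("/api/orders/48302"))
print(normalize_path("/api/users/123e4567-e89b-12d3-a456-426614174000/cart"))
```

With this in place, every order maps to the single label value /api/orders/{id}, so the label's cardinality tracks the number of routes rather than the number of orders.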

Prometheus Self-Hosted vs. Managed: The Pricing Model Inversion

When you ran Prometheus yourself, high cardinality was operationally painful but not financially visible. The Prometheus TSDB tracks every unique time series separately, with its own index entries and its own chunks in memory and on disk. High cardinality meant a larger WAL (write-ahead log), more memory pressure, and slower range queries, with noticeable degradation above roughly 100,000–200,000 active series depending on your hardware. It was a performance conversation, not a finance conversation.

Managed platforms inverted this. They absorb the operational complexity of running a scalable TSDB, and they charge for it in proportion to the cardinality you push to them. The per-series pricing model is economically rational from the platform's perspective — high-cardinality ingestion genuinely costs more to store and query at scale. It's just not the model most teams internalize before they start the migration.

The conversation that should happen before a managed metrics migration: count your current active series in Prometheus. The query count({__name__=~".+"}) returns the number of active time series in your Prometheus instance. Because it touches every series, it can be slow on a large instance, where the prometheus_tsdb_head_series gauge gives a comparable figure cheaply. If the answer is 2 million series, you should know what 2 million series costs on your target platform before you sign the contract, not after the first invoice.
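
A total is a starting point; the per-metric breakdown usually shows where the cost concentrates. The sketch below parses a response in the shape returned by Prometheus's /api/v1/query endpoint for the query count by (__name__) ({__name__=~".+"}) and ranks metrics by series count. The sample payload and the top_metrics helper are hypothetical, and fetching the response from a live server is left out so the example stays self-contained.

```python
import json

# Hypothetical response body for: count by (__name__) ({__name__=~".+"})
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up"}, "value": [1700000000, "200"]},
            {"metric": {"__name__": "http_requests_total"}, "value": [1700000000, "96000"]},
            {"metric": {"__name__": "rpc_duration_seconds_bucket"}, "value": [1700000000, "24000"]},
        ],
    },
})


def top_metrics(response_body: str, limit: int = 10) -> list[tuple[str, int]]:
    """Return (metric name, series count) pairs, largest first."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("query did not succeed")
    counts = [
        (item["metric"]["__name__"], int(item["value"][1]))  # sample values arrive as strings
        for item in payload["data"]["result"]
    ]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)[:limit]


for name, series in top_metrics(SAMPLE_RESPONSE):
    print(f"{series:>8,}  {name}")
```

The handful of metrics at the top of this list are the ones worth pricing out on the target platform first.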

Query Degradation at High Cardinality: The Performance Problem That Predates Billing

There's a parallel problem that affects self-hosted Prometheus and managed platforms alike: query performance degrades non-linearly above approximately 100,000–500,000 active series, depending on query type. Range queries that scan across high-cardinality label dimensions — sum by (pod_name) (http_requests_total) across 10,000 pod names — require iterating across all matching series before aggregation. This is fine at low cardinality. At 100,000+ series for a single metric, a simple sum query can take seconds to minutes.

The degradation pattern is particularly sharp for dashboards that perform real-time aggregation at query time rather than pre-computing aggregations via recording rules. A Grafana dashboard with 6 panels, each running a range query across a high-cardinality metric, can bring a Prometheus instance to its knees during the 5-minute evaluation window of a high-traffic incident.

We're not saying high cardinality always causes query problems — it depends heavily on query patterns and hardware. We're saying that teams who discover the query degradation problem often also discover the billing problem at the same time when they migrate, and neither problem was visible during development testing with 10 services and 50 pods.

Prometheus Relabeling: The Primary Cardinality Defense

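Prometheus's relabel_configs and metric_relabel_configs let you drop, rewrite, or normalize labels before ingestion: relabel_configs rewrites target labels before the scrape, and metric_relabel_configs rewrites or drops labels on scraped samples before they are stored. This is your primary tool for managing cardinality at the source, because it is cheaper to drop a label at scrape time than to pay for it in storage or on a managed platform's invoice.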

# prometheus.yml: drop container_id and pod_name before ingest, keep service + namespace
scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    metric_relabel_configs:
      # Drop container_id (unbounded cardinality).
      - action: labeldrop
        regex: container_id
      # Drop pod_name. Series must stay uniquely labeled after a labeldrop,
      # so confirm the remaining labels still distinguish them.
      - action: labeldrop
        regex: pod_name
      # Normalize semver-style versions to major.minor (v1.4.2 -> v1.4).
      # Values that do not match the regex are left unchanged.
      - source_labels: [version]
        target_label: version
        regex: 'v(\d+\.\d+)\.\d+.*'
        replacement: 'v$1'
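
It is worth checking what the version rule does to real label values before relying on it. The sketch below mimics the rule in Python under two assumptions worth stating: Prometheus anchors relabel regexes at both ends (reproduced here with re.fullmatch), and a replace rule leaves the label untouched when the regex does not match. normalize_version is a hypothetical helper, not part of Prometheus.

```python
import re

# Same pattern as the relabel rule; fullmatch reproduces Prometheus's anchoring.
VERSION_RE = re.compile(r"v(\d+\.\d+)\.\d+.*")


def normalize_version(value: str) -> str:
    """Mimic the relabel rule: keep major.minor, pass through non-matching values."""
    match = VERSION_RE.fullmatch(value)
    if match is None:
        return value  # no match: the label keeps its original value
    return "v" + match.group(1)


for raw in ["v1.4.2", "v1.4.2-rc1", "a3f8c2d", "1.4.2"]:
    print(f"{raw!r} -> {normalize_version(raw)!r}")
```

The pass-through case matters: a commit SHA such as a3f8c2d does not match the pattern, so this rule bounds semver-style values only. SHA-valued version labels need to be dropped or mapped separately.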

If your pipeline already exports to a managed platform through the OpenTelemetry Collector, the same stripping can happen in the Collector, before export:

# otel-collector-config.yaml
processors:
  transform/drop-high-cardinality:
    metric_statements:
      - context: datapoint
        statements:
          # Remove unbounded attributes before export to the managed backend.
          - delete_key(attributes, "container_id")
          - delete_key(attributes, "pod_name")
          - delete_key(attributes, "request_id")
# The processor only takes effect once it is listed in the processors
# of a metrics pipeline under service.pipelines.

Aggregation at Collection Time vs. Query Time

There are two places to reduce high-cardinality metrics to low-cardinality aggregates: at collection time (Prometheus recording rules, OTEL Collector transforms) or at query time (PromQL sum/avg/count). Collection-time aggregation is almost always preferable for metrics that will be used in dashboards or alerts.

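A recording rule that pre-computes sum by (service, status_code) (rate(http_requests_total[5m])) runs once per evaluation interval and stores the result as a new low-cardinality series. Taking the rate before summing keeps counter resets on individual pods from distorting the total. Because the rule reads the raw series from the local TSDB, the high-cardinality metric cannot simply be dropped at scrape time. Once the rule is validated, exclude the raw metric from what you forward to the managed platform (for example with write_relabel_configs on remote_write), so it only costs local storage. Dashboard queries against the recording rule result are fast because the aggregation has already been done.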
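
The effect of that aggregation on series count can be illustrated outside Prometheus. The sketch below is a toy model, not how recording rules are implemented: the per-series request rates and pod names are invented, and sum_by_service_and_status simply drops the pod_name dimension the way a sum by (service, status_code) would.

```python
from collections import defaultdict

# Hypothetical per-series request rates, keyed by (service, status_code, pod_name).
raw_rates = {
    ("payment-svc", "200", "payment-svc-7d9f-abcde"): 12.0,
    ("payment-svc", "200", "payment-svc-7d9f-fghij"): 8.0,
    ("payment-svc", "500", "payment-svc-7d9f-abcde"): 0.5,
    ("cart-svc", "200", "cart-svc-5c4b-klmno"): 20.0,
}


def sum_by_service_and_status(rates):
    """Collapse the pod_name dimension, keeping one series per (service, status_code)."""
    aggregated = defaultdict(float)
    for (service, status_code, _pod_name), rate in rates.items():
        aggregated[(service, status_code)] += rate
    return dict(aggregated)


aggregated = sum_by_service_and_status(raw_rates)
print(f"{len(raw_rates)} raw series -> {len(aggregated)} aggregated series")
for key, rate in sorted(aggregated.items()):
    print(key, rate)
```

The aggregated series count depends only on the number of services and status codes, so it stays flat as replicas are added or pods churn.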

The pattern that causes billing and query problems simultaneously: keeping the high-cardinality raw metric for "just in case" detailed queries while also using it in dashboards. The high-cardinality metric pays cardinality cost at all times. It only provides incremental value over the pre-aggregated recording rule when you need the pod-level or container-level granularity for a specific investigation — which happens much less often than the metric is being stored.

When High Cardinality Is Intentional and Worth the Cost

Not every high-cardinality label is an accident. Some are genuinely useful and worth their cost — particularly when the cardinality is bounded and predictable.

HTTP route path at the template level (/api/orders/{id} normalized to the route, not the value) usually has 50–500 distinct values for a typical API surface. This is entirely manageable and provides real value for error-rate-by-route dashboards and SLO tracking per endpoint.

Country or region codes (typically 50–300 distinct values) for latency analysis by geography are often worth the cardinality expansion if you have multi-region deployments and geography-correlated performance issues.

The test for intentional high-cardinality labels: can you state the maximum number of distinct values this label will ever take? If yes, multiply that bound through your label set and calculate the resulting time series count. If the number is manageable and the diagnostic value is clear, keep it. If the bound is "it grows with user count" or "it grows with request volume," that label belongs in logs and traces, not in aggregate metrics.
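
That test can be written down as a small review helper. The sketch below is one possible shape, under stated assumptions: evaluate_labels and the 10,000-series budget are hypothetical, and a bound of None stands for a label whose value count grows with users or requests.

```python
from math import prod
from typing import Optional

SERIES_BUDGET = 10_000  # hypothetical per-metric budget; pick your own


def evaluate_labels(bounds: dict[str, Optional[int]], budget: int = SERIES_BUDGET) -> str:
    """Apply the bounded-label test: None means no maximum can be stated."""
    unbounded = sorted(name for name, bound in bounds.items() if bound is None)
    if unbounded:
        return "move to logs/traces: " + ", ".join(unbounded)
    total = prod(bounds.values())
    if total <= budget:
        return f"keep ({total} series)"
    return f"over budget ({total} series)"


print(evaluate_labels({"service": 20, "status_code": 6, "route": 50}))
print(evaluate_labels({"service": 20, "status_code": 6, "route": 500}))
print(evaluate_labels({"service": 20, "user_id": None}))
```

The budget is a judgment call per metric; the useful part is that a label with no stated maximum is rejected before any arithmetic happens.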

The fundamental principle: cardinality and signal type should match. Aggregate, bounded-cardinality signals belong in metrics. Individual, unbounded-cardinality signals belong in logs and traces. The billing problems come almost entirely from putting the second type into the first category — usually because metrics were the first observability signal type a team built, and adding dimensions to existing metrics feels easier than routing the signal to a different tool.