Cardinality: Why Your Datadog Bill Is Out of Control

Detecting and Controlling Cardinality Cost
Problem – Datadog Custom Metrics and High Cardinality Cost
What Is High Cardinality, and Why Does It Cost So Much?
In the Datadog world, cardinality refers to the number of unique combinations of tags or attributes your telemetry generates, particularly in custom metrics within the Infrastructure and Datadog APM modules. Take a metric like api.request.count. It seems harmless, until it's tagged with env, region, user_id, and container_id. Now you have thousands (or millions) of unique time series, and each of them gets billed. Every Datadog metric query across them slows down. And nobody realizes it until the graphs start lagging and the Datadog pricing model kicks in, leaving you with high Datadog cardinality costs.
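To see how quickly this compounds, here is a rough back-of-the-envelope sketch in Python. The per-tag value counts are invented for illustration, not measurements from any real environment; the point is that the worst-case number of time series is the product of the per-tag cardinalities.

```python
# Rough estimate of how many distinct time series a single metric can generate.
# The per-tag value counts below are assumptions for illustration only.
from math import prod

tag_cardinalities = {
    "env": 3,             # e.g. dev / staging / prod
    "region": 6,
    "user_id": 50_000,    # volatile and effectively unbounded
    "container_id": 400,  # churns with every deploy or autoscale event
}

# Worst case: every combination of tag values appears at least once,
# so the metric fans out into the product of the per-tag counts.
series = prod(tag_cardinalities.values())
print(f"api.request.count -> up to {series:,} time series")  # 360,000,000

# Drop the two volatile tags and the same metric collapses to a handful.
stable = prod(v for k, v in tag_cardinalities.items() if k in ("env", "region"))
print(f"without user_id/container_id -> {stable} time series")  # 18
```

Two tags account for nearly all of the fan-out, which is why identifying the volatile dimensions matters more than shrinking the metric count.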
The issue gets even more expensive with Datadog logs. If every log line includes dynamic fields — session tokens, headers, request payloads — you’re ingesting a massive volume of indexed log data. Even if it’s rarely queried, you’re still paying to ingest, store, and process it. And now, increasingly, Datadog tracing data is adding to the problem.
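To make that cost mechanism concrete, here is a tiny sketch (field names and values are invented) contrasting a log event full of per-request values with the same event after its volatile fields are stripped. The first never aggregates or deduplicates; the second collapses into one cheap, queryable shape.

```python
# Why dynamic log fields are expensive: every line carries unique values,
# so nothing aggregates. All field names and values here are made up.
import json

noisy_event = {
    "msg": "checkout failed",
    "service": "checkout",
    "session_token": "eyJhbGciOiJIUzI1NiJ9.9f83ab",  # unique per request
    "request_payload": {"cart_id": "c-48213", "items": 7},
}

# Keep only the stable, low-cardinality fields before the event is indexed.
KEEP = {"msg", "service"}
lean_event = {k: v for k, v in noisy_event.items() if k in KEEP}

print(json.dumps(lean_event))  # {"msg": "checkout failed", "service": "checkout"}
```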
Traces: The New Frontier of High Cardinality
Datadog trace search is essential for understanding how requests move through your system — but traces are also full of dimensions. Every span can carry tags like:
- Service name
- Endpoint
- Status code
- Container, pod, or host ID
- User context
- Feature flags
- Region and environment
Multiply that by thousands of traces per minute, and suddenly your Datadog APM pipeline is ingesting millions of unique tag combinations from traces alone. Because traces often reflect dynamic, real-time request data, the cardinality footprint is unpredictable — which makes cost control in Datadog harder and surprise bills more likely.
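One practical way to get ahead of this is to measure the footprint directly: count the distinct tag combinations your spans actually emit. The sketch below does this over a list of span dicts, which stand in for whatever your tracing export provides (the tag names and sample spans are illustrative assumptions).

```python
# Count distinct tag combinations per service to gauge trace cardinality
# before it shows up on the bill. Span shape and tag names are illustrative.
from collections import defaultdict

TRACKED_TAGS = ("service", "endpoint", "status_code", "pod_id", "region")

def cardinality_by_service(spans):
    """Return the number of unique tag combinations observed per service."""
    combos = defaultdict(set)
    for span in spans:
        tags = span.get("tags", {})
        key = tuple(tags.get(t) for t in TRACKED_TAGS)
        combos[tags.get("service", "unknown")].add(key)
    return {svc: len(keys) for svc, keys in combos.items()}

spans = [
    {"tags": {"service": "checkout", "endpoint": "/pay", "status_code": "200",
              "pod_id": "pod-abc123", "region": "us-east-1"}},
    {"tags": {"service": "checkout", "endpoint": "/pay", "status_code": "500",
              "pod_id": "pod-def456", "region": "us-east-1"}},
]
print(cardinality_by_service(spans))  # {'checkout': 2}
```

Running something like this per minute, per service, shows which tags (often pod or user identifiers) are responsible for most of the unique combinations.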
Why This Gets Dangerous — Fast
The financial impact is obvious: more data, more cardinality, more cost. But the operational impact is just as painful:
- Dashboards slow down or time out
- Queries become unreliable
- Engineers lose trust in the tools
- Teams start turning things off to save money — and fly blind
- Eventually someone says, “Should we just leave Datadog?”
But switching platforms isn’t the solution. The real question is: How can we make Datadog more cost-efficient?
Solution – Get a Bird’s-Eye View and Take Precise Action
That’s where ControlTheory comes in. We help you understand what’s actually flowing into Datadog — not just telemetry volume, but telemetry shape:
- Where high cardinality is coming from
- Which dimensions are driving up Datadog costs
- How traces, logs, and metrics contribute
This gives you the visibility you need — a bird’s-eye view of your telemetry. Once you can see what’s happening, you’re in a position to optimize it. We work with you to implement Datadog cost optimization strategies through telemetry pipelines that:
- Filter out low-value data before ingestion
- Downsample or aggregate metrics
- Remove volatile dimensions (like user_id)
- Offload raw logs to cold storage
These policies are implemented between Datadog agents and the backend, often using OpenTelemetry for transformation and control.
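The collector-level configuration itself isn't shown here. As a minimal, SDK-side sketch of the same "remove volatile dimensions" idea, the snippet below uses an OpenTelemetry Python SDK metric View to keep only env and region on a counter, so user_id and container_id never become separate time series. The metric name, tags, and console exporter are illustrative assumptions, not ControlTheory's actual pipeline.

```python
# SDK-side illustration: drop volatile attributes from a metric's aggregation
# using an OpenTelemetry View. A pipeline would do the equivalent in the
# collector, between the agent and the backend.
from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import (
    ConsoleMetricExporter,
    PeriodicExportingMetricReader,
)
from opentelemetry.sdk.metrics.view import View

# Keep only the stable, low-cardinality attributes on this metric.
view = View(
    instrument_name="api.request.count",
    attribute_keys={"env", "region"},
)

reader = PeriodicExportingMetricReader(ConsoleMetricExporter())
provider = MeterProvider(metric_readers=[reader], views=[view])
metrics.set_meter_provider(provider)

meter = metrics.get_meter("cardinality-demo")
counter = meter.create_counter("api.request.count")

# user_id and container_id are recorded by the app but dropped by the view,
# so they never fan out into separate time series.
counter.add(1, {"env": "prod", "region": "us-east-1",
                "user_id": "12345", "container_id": "abc"})

provider.force_flush()  # export immediately so the console output is visible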
You Don’t Have to Throw Anything Away
Here’s the smart part: just because data isn’t going to Datadog doesn’t mean it’s gone. We route non-critical telemetry to low-cost S3 storage, where it’s retained for compliance, rehydration, or ingestion into other tools like OpenSearch or ClickHouse. You’re not deleting — you’re taking control of your observability pipeline.
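Here is a minimal sketch of that offload pattern, assuming boto3, AWS credentials, and an invented bucket name and key layout. A production pipeline would typically do this inside the telemetry collector rather than in application code, but the shape of the idea is the same: batch, compress, and park.

```python
# "Offload, don't delete": batch raw log records, compress them, and store
# them in low-cost S3 for later rehydration. Bucket name, key layout, and
# storage class are assumptions for illustration.
import gzip
import json
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")

def archive_logs(log_records, bucket="observability-archive"):
    """Write a batch of log records to S3 as a gzipped NDJSON object."""
    body = gzip.compress(
        "\n".join(json.dumps(r) for r in log_records).encode("utf-8")
    )
    key = datetime.now(timezone.utc).strftime("raw-logs/%Y/%m/%d/%H%M%S.ndjson.gz")
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        StorageClass="STANDARD_IA",  # cheaper tier for rarely-queried data
    )
    return key
```

Because the objects are plain compressed NDJSON, they can later be rehydrated into Datadog or loaded into tools like OpenSearch or ClickHouse without any proprietary format in the way.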
The Outcome?
- Datadog becomes faster and more responsive
- Trace and log data is curated, not chaotic
- Engineers spend less time wrangling noisy dashboards
- Your bill drops — without losing visibility
- You get sustainable observability without overpaying
If your Datadog bill feels disconnected from the value you’re getting, you’re not alone. But you’re not stuck, either. ControlTheory helps you optimize Datadog usage, reduce cost, and build a smarter observability stack.