Smarter Tracing in Datadog: Service Reliability, Lower Costs

Capture Only the Traces You Need

Tail-sample to identify and send only the important traces

Pay for only the traces you need

Improve Datadog APM while managing cost

Problem – Collecting the Right Traces While Managing Datadog APM Cost

Datadog APM is a powerful observability solution and a top choice for distributed tracing. But as your system scales, so do two critical challenges: gaps in visibility and unexpected costs tied to Datadog pricing and billing models. Building resilient systems means collecting the traces you need and not overpaying for data you don’t. If you’ve ever asked:

“Why can’t I see what happened inside that queue or async worker?”
“Why is our Datadog bill growing faster than our infrastructure?”
“Why does our tracing data still leave questions unanswered during an incident?”

You’re not alone. These are common signs that your Datadog APM tracing pipeline is due for smarter tracing and a more targeted approach — one focused on service reliability, data quality, and cost control.

At scale, even well-instrumented systems hit the same friction points:

Head-Based Sampling Drops Critical Traces:
- Datadog trace sampling decisions are made too early — before knowing if a trace contains errors or high latency. That means slowdowns, retries, and failures often go unobserved.
Trace Context Breaks Across Boundaries:
- Async jobs, message queues, and legacy systems frequently break context, resulting in fragmented Datadog trace search results.
High-Cardinality Tags Drive Up Datadog Costs:
- Tags like user IDs, session tokens, or dynamic URLs increase the number of unique time series and the storage they consume. These high-cardinality tags inflate Datadog custom metrics and billing.

These aren’t minor annoyances: they drive up Datadog APM costs, add dashboard noise, and weaken incident response.

Solution – Smarter Tracing Without Replacing Datadog

ControlTheory enhances your existing Datadog APM setup to provide fine-grained control over what gets traced, sampled, and stored — so you get more value from every trace while reducing Datadog billing shocks.

Here’s how:

1. Tail-Based Sampling

Instead of making the sampling decision at the start of a request, tail sampling waits until the trace completes and then decides whether to keep it. Tail-based sampling allows you to:

Keep traces with errors, slowdowns, or retries
Drop unimportant traffic
Prioritize critical paths for training or debugging

This reduces ingestion costs and improves signal quality in Datadog trace search.
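To make the idea concrete, here is a minimal sketch of a tail-sampling decision in Python. It is an illustration only, not ControlTheory's implementation: the Span record, the 500 ms latency threshold, and the 5% baseline rate are all assumptions chosen for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Span:
    # Hypothetical span record; real spans carry many more fields.
    trace_id: str
    name: str
    duration_ms: float
    error: bool = False
    retries: int = 0


def keep_trace(spans, slow_ms=500.0, baseline_rate=0.05, rng=random.random):
    """Decide on a whole trace, after all of its spans have been buffered."""
    # Always keep traces that contain an error or a retry.
    if any(s.error or s.retries > 0 for s in spans):
        return True
    # Always keep traces that contain at least one slow span.
    if max((s.duration_ms for s in spans), default=0.0) >= slow_ms:
        return True
    # Keep a small random share of unremarkable traffic as a baseline.
    return rng() < baseline_rate


def tail_sample(spans, slow_ms=500.0, baseline_rate=0.05, rng=random.random):
    """Group spans by trace ID, then keep or drop each trace as a unit."""
    traces = {}
    for span in spans:
        traces.setdefault(span.trace_id, []).append(span)
    return {
        trace_id: group
        for trace_id, group in traces.items()
        if keep_trace(group, slow_ms, baseline_rate, rng)
    }
```

The key difference from head-based sampling is that keep_trace sees every span in the trace before it decides, so an error deep in a downstream service still causes the whole trace to be kept. The trade-off is that spans must be buffered until the trace is considered complete.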

2. Trace Consolidation

ControlTheory makes it possible to combine native Datadog traces with OpenTelemetry, eBPF, and LD_PRELOAD-based instrumentation, giving you a complete and unified view of system behavior:

OpenTelemetry for flexible cross-platform instrumentation
eBPF for container- and kernel-level insight without code changes
LD_PRELOAD for auto-instrumenting legacy and third-party binaries

We stitch these together into complete, end-to-end Datadog-compatible traces — giving you full context even across async jobs, queues, and service boundaries.
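One common reason context breaks at a queue is that the trace identifiers are never copied into the message. The sketch below shows the general technique of carrying a W3C Trace Context traceparent header (the format OpenTelemetry propagates by default) in message headers. It is a simplified illustration, not ControlTheory's stitching logic: the inject and extract helpers are hypothetical names, and only version 00 of the header is handled.

```python
import re

# traceparent layout: version - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def inject(headers, trace_id, span_id, sampled=True):
    """Producer side: copy the current trace context into message headers."""
    flags = "01" if sampled else "00"
    headers["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"
    return headers


def extract(headers):
    """Consumer side: recover the trace context, or None if absent or invalid."""
    match = _TRACEPARENT.fullmatch(headers.get("traceparent", ""))
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero IDs are defined as invalid by the W3C specification.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }
```

A worker that calls extract on each message and starts its span with the returned trace_id and parent_id continues the producer's trace rather than starting a new, disconnected one.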

“Increasingly complex systems and ballooning telemetry volumes have made observability costs and processes an operational challenge for many organizations. Concepts like controllability aim to address these issues and necessarily evolve how we think about observability by focusing on actively governing, shaping, and optimizing telemetry rather than just collecting it.”

Kelly Fitzpatrick, Senior Analyst at RedMonk

“By monitoring and routing logs, traces and metrics as they move across different data silos and to leading observability platforms, customers get application visibility while controlling costs and reducing data vendor lock-in.”

Jason English, partner and principal analyst at Intellyx

“ControlTheory’s approach merges observability with feedback-driven control-ability, using a closed-loop control plane that balances cost and value, delivering exactly the data you need, precisely when you need it.”

Brian Ducharme, VMBlog.com

“ControlTheory is pushing the boundaries of observability by introducing the crucial concept of Controllability, which empowers businesses to immediately manage costs, optimize performance, and position themselves for the AI-enabled future.”

Kip McClanahan, General Partner at Silverton Partners

“We’re excited to welcome ControlTheory to the CNCF as a new member. The future of observability is open—with projects like OpenTelemetry leading the way as the 2nd highest velocity open source project behind Kubernetes. ControlTheory’s innovative approach to controllability empowers organizations to regain control of their current observability, optimize existing stacks, and accelerate their journey toward an open, interoperable future.”

Chris Aniszczyk, CTO, CNCF

3. Cardinality Management

ControlTheory helps reduce Datadog tracing and APM costs by managing tag explosion at the source. You can:

Normalize or strip volatile high-cardinality fields
Enrich only business-critical traces
Prevent noisy tags from inflating Datadog custom metrics and logs

This keeps your dashboards clean and your Datadog billing predictable.
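As a rough sketch of what normalizing and stripping can look like, the Python below collapses dynamic URL segments into placeholders and drops volatile tags before they leave the service. The tag names (user.id, session.token, http.path) and the placeholder format are assumptions made for this example, not a ControlTheory or Datadog convention.

```python
import re

# Hypothetical tags whose values are unique per user or session.
DROP_TAGS = {"user.id", "session.token"}

_UUID = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}", re.IGNORECASE
)
# A path segment made up only of digits, such as /orders/987.
_NUMERIC_SEGMENT = re.compile(r"(?<=/)\d+(?=/|$)")


def normalize_path(path):
    """Turn a dynamic URL path into a low-cardinality route template."""
    path = path.split("?", 1)[0]  # discard the query string
    path = _UUID.sub("{uuid}", path)
    return _NUMERIC_SEGMENT.sub("{id}", path)


def scrub_tags(tags):
    """Return a copy of tags with volatile fields removed or normalized."""
    clean = {}
    for key, value in tags.items():
        if key in DROP_TAGS:
            continue
        if key == "http.path":
            value = normalize_path(value)
        clean[key] = value
    return clean
```

With this in place, /users/12345/orders/987?page=2 and /users/67890/orders/12 both become /users/{id}/orders/{id}, so they count as one tag value instead of two.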

4. Intelligent Telemetry Pipelines Enable Smarter Tracing

ControlTheory acts as a telemetry control plane between your services and Datadog. With flexible pipelines, you can:

Downsample low-value spans
Filter or transform spans in-flight
Send full-fidelity traces to cold storage (like S3)
Route only actionable data into Datadog

This approach ensures observability ROI and cost-efficiency — not just more data ingestion.
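The routing steps above can be sketched as a single function. This is a simplified illustration under stated assumptions rather than the product's pipeline: the Span record, the NOISE set, and the two in-memory lists standing in for cold storage and Datadog are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical span names treated as pure noise.
NOISE = {"GET /health", "GET /ready"}


@dataclass
class Span:
    trace_id: str  # hex string, as trace IDs usually are
    name: str
    duration_ms: float
    error: bool = False


def route(spans, archive, datadog, slow_ms=500.0, keep_one_in=10):
    """Archive every span, but forward only the actionable ones."""
    for span in spans:
        # Full-fidelity copy goes to cheap storage (for example, S3).
        archive.append(span)
        # Filter: health checks never reach the paid backend.
        if span.name in NOISE:
            continue
        # Actionable: errors and slow spans are always forwarded.
        if span.error or span.duration_ms >= slow_ms:
            datadog.append(span)
            continue
        # Downsample: keep roughly 1 in N of the remaining spans, keyed on
        # the trace ID so every span of a trace gets the same decision.
        if int(span.trace_id, 16) % keep_one_in == 0:
            datadog.append(span)
```

Keying the downsampling on the trace ID rather than on a random draw per span is a deliberate choice in this sketch: it avoids forwarding half of a trace and dropping the other half.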

The Outcome: Smarter Tracing – Just the Traces You Need, Controlled Cost, Confident Reliability

With ControlTheory and Datadog working together, you get:

End-to-end visibility across modern and legacy systems
Reduced Datadog costs through cardinality control and sampling
Better incident response with clean, actionable traces
Smarter AI automation, thanks to high-quality telemetry
Total control with policy-driven data pipelines

It’s the easiest way to improve Datadog trace search performance and reduce custom metric bloat — without leaving the Datadog platform.

Building Observability That Actually Works, Better

If you’re scaling your systems, and your observability spend along with them, ControlTheory can help you shift from tracing chaos to clarity. ControlTheory helps you:

Get the right traces
Maintain trace context
Filter noise and reduce Datadog custom metrics
Avoid Datadog cost surprises
Improve reliability at scale

Build smarter tracing in Datadog that’s more affordable, with controllability from ControlTheory.