
Observability Next: Tackling The Why

May 30, 2025
By Bob Quillin
The Why of Observability

To seriously consider what is Observability Next, we need to start with the most fundamental question we should be asking our teams:

Why are we collecting all this data?

This singular question opens the floodgates to more why’s. Why do we need to store all these logs and metrics? Even with all these logs, metrics, and traces, why can’t we find the root cause of this issue? Why did it take us three days to track down that problem?

If you start with the why’s, observability objectives become crystal clear. Observability today relies too heavily on the philosophy of “collect and store everything, ask why later.”

Why is Observability Costing Us So Much?

Rising observability costs are painful, but they’re a symptom of a more foundational issue. We pay to ingest this data, index it, store it, and when necessary, rehydrate it. But back to the why:

Why are we observing our infrastructure and application performance in the first place?

The end game of observability is not to observe but to:

  • Prevent and solve problems
  • Reduce MTTR
  • Accelerate root cause analysis
  • Unearth critical new business KPIs

So why is observability costing so much? If we worked backwards from the problems we’re trying to solve, we could collect just what we need, when we need it, and for whom. That would lead us back to the intent behind what we’re collecting.

In the absence of that clarity, we fall back on collecting and storing everything. The observability ecosystem today meets the market exactly where it is—delivering products that do just that.

Fat Telemetry Pipes and Massive Data Lakes

So how did we get here? Proprietary agents make it trivially simple to collect everything. Fat telemetry pipes stream data up to ever-growing, amazingly efficient data stores—meeting the market need for more and cheaper telemetry storage.

This model of fat “dumb” pipes and massive data lakes unfortunately reinforces the worst lazy engineering habits we’ve developed over the last decade of observability.

If you don’t know what to collect or how to analyze it, you store everything you can and slap the data into an ever-increasing number of… dashboards.

The Goal of Observability Isn’t the Dashboard—Is It?

Why do we have so many dashboards?

That’s one of my favorite “why” questions. Did you ever stop to think that there has to be more to observability than graphs, charts, and dashboards? It makes sense—since so much telemetry is flooding in, the only way to make sense of it seems to be yet another dashboard.

For evidence, just count the number of dashboard tabs you keep open in your browser at any point in time.

Observability Next—to me—is imagining observability without dashboards. What if:

  • Observability solutions presented answers instead of dashboards
  • The only telemetry you sent was the telemetry needed for those answers
  • Analytics could provide feedback to collectors and pipelines to request more data when needed, then dial it down when done
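As a thought experiment, that feedback loop might look something like the sketch below. Everything here is hypothetical (there is no real `AdaptiveCollector` API): the analytics layer signals the collector to raise its sampling rate during an incident, then dial it back down once the signal clears.

```python
import random

class AdaptiveCollector:
    """Hypothetical sketch: a collector whose sampling rate is driven
    by the analytics layer instead of being fixed at ingest time."""

    def __init__(self, base_rate=0.01, burst_rate=1.0):
        self.base_rate = base_rate    # steady-state sample rate (1%)
        self.burst_rate = burst_rate  # rate while investigating (100%)
        self.rate = base_rate

    def on_feedback(self, error_rate, threshold=0.05):
        # Two-way feedback: analytics requests full-fidelity telemetry
        # during an incident, and dials it down when done.
        self.rate = self.burst_rate if error_rate > threshold else self.base_rate

    def should_send(self):
        # Probabilistic sampling at the current rate.
        return random.random() < self.rate

collector = AdaptiveCollector()
collector.on_feedback(error_rate=0.10)  # incident detected upstream
assert collector.rate == 1.0            # send everything while it matters
collector.on_feedback(error_rate=0.01)  # signal has cleared
assert collector.rate == 0.01           # back to the cheap steady state
```

The point of the sketch is the direction of the arrows: telemetry volume becomes an output of the analysis, not a fixed input to it.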

Disrupting the Observability Supply Chain

The building blocks of Observability Next are already forming across the observability supply chain. Starting with collection, then moving to control, and finally to analysis—each layer requires the others:

  • Collection and instrumentation: OpenTelemetry
  • Distribution and control: Adaptive Control Planes
  • Intelligent problem solving: Inference Engines

OpenTelemetry is already reshaping observability by breaking vendor lock-in at the data collection layer. Instead of proprietary instrumentation that ties you to a single platform, OpenTelemetry offers open, flexible, and vendor-neutral telemetry—putting data ownership and control back in the hands of engineering teams.
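That vendor neutrality is visible in a minimal OpenTelemetry Collector configuration: the instrumented code only speaks OTLP, and switching backends is a one-line exporter change in the pipeline, not a re-instrumentation project. (The endpoint below is a placeholder, not a real backend.)

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch:

exporters:
  otlphttp:
    # Swap backends here, not in your application code.
    endpoint: https://backend.example.com:4318

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
```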

The Need for Control

Building scalable cloud-native systems requires a clear separation between the data plane and the control plane. Only then can a full management plane emerge to drive the next generation of intelligent problem solving.

OpenTelemetry has laid the foundation for open data collection, but it needs a robust control plane to give organizations real feedback loops and control over their telemetry. By evolving from one-way data pipelines to two-way feedback systems, teams can actively manage telemetry to achieve desired outcomes—just like in traditional control systems where controllability and observability work hand in hand.
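To make the control-systems analogy concrete, here is an illustrative sketch (hypothetical names, not a real product API) of a proportional controller that treats telemetry volume as the controlled variable, nudging the sample rate toward an ingest budget:

```python
class TelemetryController:
    """Illustrative sketch: a proportional controller over telemetry
    volume, in the spirit of classic control theory where
    controllability and observability work hand in hand."""

    def __init__(self, target_gb_per_day, kp=0.2):
        self.target = target_gb_per_day
        self.kp = kp              # proportional gain
        self.sample_rate = 1.0    # start by sending everything

    def adjust(self, observed_gb_per_day):
        # Error: how far actual ingest is from budget, as a fraction.
        error = (self.target - observed_gb_per_day) / self.target
        # Nudge the rate toward the target, clamped to [0.01, 1.0].
        self.sample_rate = min(1.0, max(0.01, self.sample_rate * (1 + self.kp * error)))
        return self.sample_rate

ctrl = TelemetryController(target_gb_per_day=100)
rate = ctrl.adjust(observed_gb_per_day=400)  # ingesting 4x the budget
assert 0.01 <= rate < 1.0                    # controller dials sampling down
```

Each call to `adjust` is one pass around the feedback loop: observe the output, compare it to the desired outcome, and act on the pipeline accordingly.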

Why This Matters for Engineering Leaders

As engineering leaders, you’re constantly balancing competing priorities:

  • Accelerating feature delivery while maintaining system reliability
  • Managing operational costs while improving visibility and control
  • Reducing downtime and human churn while ensuring critical issues are addressed

Observability Next must directly address these challenges by going back to the Why:

  • Reducing mean time to resolution (MTTR) through better APM
  • Optimizing telemetry to focus on high-value signals that actually solve problems
  • Discovering how to master new AI tech with a human in the loop

Observability Next – A Newsletter for You

For ongoing conversation and engagement, check out the “Observability Next” newsletter on LinkedIn. It’s designed for engineering leaders navigating the rapidly evolving observability landscape.

Each edition explores trends, strategic shifts, and practical ideas to help teams move beyond the status quo—toward faster resolution, clearer insight, and smarter operations. I share industry patterns, personal experiences, and approaches that challenge conventional thinking.

