The OpenTelemetry Collector is a fundamental component of the OpenTelemetry architecture but can be a little complicated to sort through, especially when you add in important concepts like pipelines, receivers, processors, exporters, connectors, agents, and gateways. Let’s break it down piece by piece and clear up any confusion.
OpenTelemetry Collector
At the top level, the OpenTelemetry Collector simply receives telemetry, processes or filters it, then sends or exports it out to observability applications or other consumers. The Collector can receive telemetry signals – logs, metrics, traces, with more to come – from a wide variety of sources, integrating with existing instrumentation such as Datadog, Jaeger, and Prometheus, or operating natively as its own collection agent. The Collector architecture is extensible, enabling it to support a broad range of existing and new sources and protocols, such as profiling, which OpenTelemetry is embracing as a new key signal type.
OpenTelemetry Collector Configuration Files
The Collector configuration file is in YAML format and defines all the components that make up a specific Collector instance, such as Receivers, Processors, Exporters, Pipelines, and other optional components.
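For illustration, here is a minimal sketch of such a configuration, wiring one receiver, one processor, and one exporter into a traces pipeline (the backend endpoint is a placeholder):

```yaml
receivers:
  otlp:                 # accept OTLP data over gRPC
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch:                # batch telemetry before export

exporters:
  otlphttp:
    endpoint: https://backend.example.com:4318   # placeholder backend

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
```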
Collector Agent vs Gateway Deployment Pattern
There are two primary deployment patterns for the OpenTelemetry Collector: Agent and Gateway. The Agent deployment is the more common pattern: the collector runs as a daemon that can be independently deployed “close to the workload” (e.g., on the same host), similar to most instrumentation agents. In Kubernetes, for example, the collector can be deployed as a DaemonSet so that an instance of the collector runs on each node in the cluster.
In the Gateway deployment, the collector runs as a standalone service, collecting telemetry data from other agents before forwarding it. This can be useful for load balancing or aggregation, and it also enables changes to telemetry flows without modifying the agents themselves.
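As a sketch of the Agent-to-Gateway handoff, each agent only needs an exporter pointing at the gateway; the gateway address below is a placeholder:

```yaml
# Agent-side configuration: forward all telemetry to the gateway over OTLP
exporters:
  otlp:
    endpoint: otel-gateway.example.com:4317   # placeholder gateway address
    tls:
      insecure: true   # for illustration only; use TLS in production
```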
OpenTelemetry Pipeline
Inside each Collector, OpenTelemetry Pipelines define how data is received, processed, and exported. Each Collector can have one or more pipelines. This is particularly helpful when, for example, you want to split or filter data across different endpoints (e.g., some to your observability platform and some to AWS S3 storage).
A Pipeline consists of the following sub-components:
- Receivers
- Processors
- Exporters
Each pipeline operates on a single telemetry data type: traces, metrics, or logs. You can, however, define multiple pipelines for the same type, as in the sketch below.
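For example, the following service section sketches two trace pipelines fed by the same receiver: one sends full-fidelity data to the backend, the other archives it via the contrib awss3 exporter (whose bucket configuration is shown later):

```yaml
service:
  pipelines:
    traces:              # full-fidelity traces to the observability backend
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
    traces/archive:      # the same traces, archived to S3
      receivers: [otlp]
      processors: [batch]
      exporters: [awss3]
```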
OpenTelemetry Receivers
Receivers are responsible for collecting the data. You may have one or more Receivers per pipeline. Receivers can collect data from your existing telemetry sources, for example via a Datadog receiver or a Prometheus receiver. For sources that natively support OpenTelemetry instrumentation, an OTLP (OpenTelemetry Protocol) receiver can be used. Some receivers (such as the OTLP receiver) support multiple signal types.
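A receivers section can declare several such sources side by side; here is a sketch with the core OTLP receiver and the contrib Prometheus receiver (the scrape target is a placeholder):

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  prometheus:            # scrape existing Prometheus endpoints
    config:
      scrape_configs:
        - job_name: app-metrics            # placeholder scrape job
          static_configs:
            - targets: ["localhost:9090"]  # placeholder target
```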
OpenTelemetry Processors
Processors transform the received data, for example to filter, enrich, sample, or otherwise massage the data close to the source before moving it on. While they are optional, Processors are powerful components that provide much more control at the edge, helping you avoid more expensive operations later and improve the fidelity of what is sent so only the most important telemetry is exported. As you can imagine, there are many ways to process data, so there are multiple types of Processors available to manipulate and manage your observability data.
In a Filter Processor, the OpenTelemetry Transformation Language (OTTL) defines the conditions for filtering or dropping telemetry. OTTL can be used with metrics, traces, or logs and is one of the superpowers of the OpenTelemetry Collector: most of these signals are extremely redundant, so separating the important telemetry from the large majority of wasteful data is key.
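For instance, a Filter Processor can drop spans that match an OTTL condition; a sketch, assuming health-check spans are identifiable by an http.route attribute:

```yaml
processors:
  filter/drop-health-checks:
    error_mode: ignore
    traces:
      span:
        # Drop any span whose route is the (hypothetical) health-check path
        - 'attributes["http.route"] == "/healthz"'
```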
Processors can be chained sequentially one after the other. The first Processor receives the data from the one or more Receivers set up for the pipeline; the last Processor in the chain runs before the telemetry data is handed to the one or more Exporters configured in the pipeline. The Processors work serially, transforming the data at each step before forwarding it. This can include adding or removing attributes, dropping data, or analyzing the data to create new, more useful aggregate measurements.
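In configuration terms, the order of the processors list is the order of the chain. A sketch, assuming each named processor is defined in the processors section as above:

```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      # Runs left to right: guard memory, drop noise, then batch for export
      processors: [memory_limiter, filter/drop-health-checks, batch]
      exporters: [otlphttp]
```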
OpenTelemetry Exporters
Exporters send the data out from the collector. There may also be multiple Exporters within a Pipeline. Exporters usually send the data to a management application (an “observability backend”), but they can also write to a debug endpoint, an AWS S3 bucket, a file, and many other destinations. Some exporters (like the OTLP exporter) support multiple signal types.
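An exporters section can define several destinations side by side; a sketch, where the backend endpoint and S3 bucket are placeholders and awss3 is a contrib component:

```yaml
exporters:
  debug:
    verbosity: detailed     # print telemetry to the Collector's console
  otlphttp:
    endpoint: https://backend.example.com:4318   # placeholder backend
  awss3:
    s3uploader:
      region: us-east-1                   # placeholder region
      s3_bucket: my-telemetry-archive     # placeholder bucket
```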
OpenTelemetry Extensions
Extensions are optional components that provide capabilities on top of the primary functionality of the collector. Common examples include authentication and health checks.
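For example, the health_check extension exposes an HTTP endpoint reporting the Collector's status. Note that extensions are enabled in the service section, outside any pipeline:

```yaml
extensions:
  health_check:
    endpoint: 0.0.0.0:13133   # default health-check port

service:
  extensions: [health_check]
```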
OpenTelemetry Connectors
An OpenTelemetry Connector is an optional component that joins two pipelines together, acting as both an exporter and a receiver: it sits as an exporter at the end of one pipeline and as a receiver at the start of the next, passing the data forward. Connectors can be useful for merging, (conditional) routing, and replicating telemetry data across streams.
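As an example, the contrib count connector can terminate a traces pipeline as an exporter and feed a metrics pipeline as a receiver, turning span counts into a metric; a sketch:

```yaml
connectors:
  count:                        # counts spans and emits them as metrics

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [count]        # acts as an exporter here
    metrics:
      receivers: [count]        # and as a receiver here
      exporters: [otlphttp]
```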
Open Agent Management Protocol (OpAMP)
Once you have a fleet of OpenTelemetry Collectors, their management becomes more complex. To eliminate proprietary vendor approaches to agent management, the Open Agent Management Protocol (OpAMP) was introduced to enable critical functions such as remote configuration, secure updates, health and internal telemetry reporting, and security. The OpAMP server typically sits in the “control plane” of the OpenTelemetry architecture and orchestrates the fleet of collector agents at the data plane level; each agent acts as an OpAMP client, providing a client-side implementation of the OpAMP protocol.
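On the collector side, OpAMP support can be enabled via the contrib opamp extension (still evolving at the time of writing); a sketch, with a placeholder server address:

```yaml
extensions:
  opamp:
    server:
      ws:
        endpoint: wss://opamp-server.example.com:4320/v1/opamp   # placeholder

service:
  extensions: [opamp]
```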
OpenTelemetry Control Plane
For cloud-scale architectures, it is essential to separate the data plane (the collectors and the telemetry flowing through them) and control plane (command and control) to maximize scalability, efficiency, security, and policy enforcement. The OpenTelemetry architecture consists of the OpAMP server at the control plane and OpAMP clients in the data plane (in the collectors). The separation of control planes and data planes has been an essential part of every successful cloud provider architecture, a foundational part of network design, and the basis for server, network, and storage virtualization.