Phillip Carter shares why he believes OpenTelemetry is the best choice for instrumenting your services, and why it shouldn’t be owned by any one vendor.
Instrumenting your services is table stakes for modern services work. If your services are properly instrumented, debugging stops being guesswork, and you can understand your systems at a much deeper level than just knowing when something goes wrong. And when something is table stakes in software development, it shouldn’t be controlled by any one vendor.
In this post, I’ll introduce the OpenTelemetry observability framework, sharing why you should be adopting it to instrument your services, how you can work with different vendors to export your telemetry data to a backend, and why I believe it’s essential that OpenTelemetry isn’t tied to a specific vendor and their offerings.
The rise of OpenTelemetry
The OpenTelemetry observability framework is emerging as the new standard for instrumenting your apps and services. It’s the second most active project in the CNCF (Cloud Native Computing Foundation), right after Kubernetes, with support from dozens of major vendors and cloud providers who regularly contribute and help maintain various projects. If you’re in the business of building services, you should adopt OpenTelemetry.
Some teams write code that sends traces to Zipkin, others to Jaeger. Some services report metrics to Prometheus, others to Wavefront. All these different formats tie instrumentation to a particular storage backend and UI – some even to a particular pricing plan. This lack of portability is a problem! OpenTelemetry is the standard that sits between them all, letting you convert and interoperate between different data formats and export data to different backends.
Writing instrumentation code is expensive, especially if you have to rewrite it whenever you switch backends for your data. With OpenTelemetry, you can be sure that work will stay relevant and even grow in value as instrumentation query tools and products improve.
What is OpenTelemetry?
OpenTelemetry is a vendor-agnostic observability framework for instrumenting, generating, collecting, and exporting telemetry data. Concretely, it’s made up of a few things:
- A cross-language specification for distributed traces, metrics, and logs
- Per-language APIs and SDKs
- Automatic instrumentation for major libraries and frameworks
- The Collector, a proxy and Swiss Army knife for collecting, processing, and exporting data
APIs implement the specification, providing data types for all relevant concepts like traces and metrics, and defining how you construct and interact with them.
SDKs package and implement APIs for each language. Each language has a vendor-neutral SDK. Different vendors, like Honeycomb, can create their own specific distributions of an SDK that make things like configuration simpler than with the vendor-neutral SDKs and optionally add a few nice features like deterministic sampling. Critically, swapping out one vendor’s SDK with another vendor’s SDK means that all your instrumentation code stays the same: It’s all the same APIs!
Automatic instrumentation is also done on a per-language basis. For example, using the Java SDK means you can automatically instrument a Spring Boot application. In some cases, you can also deploy an agent as a sidecar that will detect packages you’re using and provide automatic instrumentation if it’s available. The list of libraries and frameworks that OpenTelemetry offers automatic instrumentation for grows every day.
Finally, the OpenTelemetry Collector is a proxy that you can deploy as an agent or in standalone mode. You can use it to receive traces and metrics from other sources (e.g., Prometheus metrics) and marshal them into OpenTelemetry formats. You can export OpenTelemetry data into other formats too (e.g., Jaeger traces). You can also use it to perform sampling, filtering, or any custom processing of data you might need.
Why is OpenTelemetry the best choice for instrumenting code?
The two most critical aspects of OpenTelemetry that make it a better choice than proprietary instrumentation libraries are:
- OpenTelemetry is vendor neutral and it lets you change backends without rewriting your code. It even lets you send your telemetry data to multiple backends.
- OpenTelemetry SDKs give best-of-breed automatic instrumentation, in part because of the vast community of contributors and vendors who contribute features upstream so that they don’t have to maintain bespoke integrations.
By decoupling proprietary instrumentation from your code, OpenTelemetry helps you avoid vendor lock-in. It gives you the flexibility to instrument once and send your data anywhere, at any time, to as many places as you want – some of which you probably couldn’t send to before.
Working with vendors to export your telemetry data to a backend
Your telemetry data must ultimately be consumed by some sort of backend to be useful to you. With OpenTelemetry, you have a wealth of choices, ranging from something like a self-hosted instance of Zipkin to a vendor backend like Honeycomb. There are three primary ways to get started.
First, if you’re using a vendor backend, check to see if they have a distribution of a relevant OpenTelemetry SDK for your language. These typically make the configuration and export as simple as possible, and then expose the same vendor-neutral API. If you change backends/vendors you’ll likely need to change this SDK distribution as well, but critically, all of your instrumentation code will stay the same!
Second, if you’d rather not use a vendor SDK distribution or it isn’t applicable to your scenario, you can configure a stock exporter from the vendor-neutral SDK. This typically involves a bit more code than a vendor SDK, and you’ll also have to decide which wire format to use (gRPC or HTTP). You can consult the OpenTelemetry documentation for ways to configure an exporter in each SDK. For example, in a Go program it can work out to be a dozen or so lines of code, depending on the additional metadata you need to attach to HTTP headers and/or if you’re sending data to an OpenTelemetry Collector instance. If you change backends or vendors, you’ll have the option of either adjusting your code or removing it in favor of using a vendor SDK distribution.
Finally, if you’re already instrumented with something like OpenTracing or Jaeger, or have a more advanced scenario like filtering spans or exporting to multiple sources, you can configure and use the OpenTelemetry Collector to export your data. You’ll need to stand up the collector as a proxy in your infrastructure, pick the wire format, and specify things like an endpoint and/or API key for the backend(s) you wish to export data to.
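To show the shape of that setup, here is a minimal, illustrative Collector configuration – the endpoint and `x-api-key` values are placeholders, and the exact receivers and exporters available depend on your Collector distribution. It accepts both OTLP and Jaeger-format traces, batches them, and exports to a backend over OTLP.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:
  jaeger:
    protocols:
      thrift_http:

processors:
  batch:

exporters:
  otlp:
    endpoint: "api.example.com:443"
    headers:
      x-api-key: "YOUR_API_KEY"

service:
  pipelines:
    traces:
      receivers: [otlp, jaeger]
      processors: [batch]
      exporters: [otlp]
```

Adding a second entry under `exporters` (and listing it in the pipeline) is all it takes to fan the same data out to multiple backends.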
A consistent theme here is flexibility and neutrality. Even if you go down the route of a vendor’s SDK distribution, you can still easily change vendors and not have it affect your systems much. This is one of the true powers of OpenTelemetry.
Vendor neutrality is the future of observability data
The trajectory of important technology tends towards vendor neutrality. As concepts and components become more fundamental to your sociotechnical systems, it becomes more and more precarious to tie them to a specific vendor and their offerings.
For example, look no further than the broad adoption of Kubernetes, even though most organizations pick only one cloud provider for their infrastructure today. As this happens, a feedback loop forms: developers learn vendor-neutral technologies to future-proof their careers, while employers seek out vendor-neutral technologies to access the talent they need in their organizations. This, in turn, pushes vendors to adopt clean interfaces with their systems and “play well” with everyone.
Instrumentation via OpenTelemetry is but another chapter in the story of vendor-neutral technologies pushing our industry to be better. There’s no better time to hop on the OpenTelemetry train than now!