Observability is a hot topic of debate at the moment, principally around the argument that observability is different from monitoring. This article sheds some light on this contentious subject and will hopefully bring some clarity.
Why is Application Observability Important?
To run the CI/CD process effectively there must be some form of feedback. It does not make any sense to continually push out changes without knowing if they make things better or worse.
The DevOps loop. CI/CD excellence is achieved by automating as much of this loop as possible
The MONITOR section of the DevOps loop provides the all-important feedback that drives future iterations.
The Relationship Between Monitoring and Observability
Monitoring and observability are in a symbiotic relationship, summarised by the following statement:
If you are observable, I can understand you.
Simply put, observability is achieved when data is made available from within the system that you wish to monitor. Monitoring is the actual task of collecting and displaying this data. There is one more significant term when having the “observability v. monitoring” discussion and that term is “analysis”. After you’ve made the system observable, and after you’ve collected the data using a monitoring tool, you must perform analysis either manually or automatically. Without meaningful analysis you’ve fallen short of the whole purpose of creating observability and performing monitoring in the first place. The better your analysis capabilities are, the more valuable your investments in observability and monitoring become.
The Pyramid of Power – Observability, Monitoring, and Analysis combine to provide actionable intelligence
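The three roles described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `OrderService`, `Monitor`, and `error_rate` are hypothetical names invented for this example, and the "business logic" is a placeholder. The service exposes internal counters (observability), the monitor collects them (monitoring), and a separate function turns them into a figure someone can act on (analysis).

```python
from dataclasses import dataclass, field


@dataclass
class OrderService:
    """Hypothetical service that makes itself observable via internal counters."""
    requests: int = 0
    errors: int = 0

    def handle(self, order_id: int) -> bool:
        self.requests += 1
        ok = order_id >= 0  # placeholder for real business logic
        if not ok:
            self.errors += 1
        return ok

    def stats(self) -> dict:
        # Observability: internal data is made available to the outside.
        return {"requests": self.requests, "errors": self.errors}


@dataclass
class Monitor:
    """Monitoring: collects and stores what the service exposes."""
    samples: list = field(default_factory=list)

    def collect(self, service: OrderService) -> None:
        self.samples.append(service.stats())


def error_rate(samples: list) -> float:
    """Analysis: reduce the most recent sample to an error rate."""
    if not samples or samples[-1]["requests"] == 0:
        return 0.0
    latest = samples[-1]
    return latest["errors"] / latest["requests"]


service = OrderService()
monitor = Monitor()
for order_id in (1, 2, -1, 4):
    service.handle(order_id)
    monitor.collect(service)

print(f"error rate: {error_rate(monitor.samples):.0%}")
```

Removing any one of the three pieces breaks the chain: without `stats()` there is nothing to collect, without `collect()` nothing is recorded, and without `error_rate()` the recorded numbers never become a decision.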
Despite its current popularity, observability is nothing new. Logging has been around since the dawn of programming, making execution observable by writing out useful messages. Another example is JMX (JSR-160, 2001), which provides a common approach to making runtime data available for Java programs.
Of course it’s possible to monitor a service endpoint, for example, even if it does not make itself observable, by simply calling it every 10 seconds and recording success or failure and response times. This is referred to as “Synthetic Monitoring”.
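A synthetic check of this kind might look like the following minimal sketch. The names `ProbeResult`, `http_check`, `probe`, and `run` are hypothetical and exist only for this example; the demonstration at the bottom uses stand-in checks so that it runs without a network.

```python
import time
import urllib.request
from typing import Callable, List, NamedTuple


class ProbeResult(NamedTuple):
    ok: bool
    seconds: float


def http_check(url: str, timeout: float = 5.0) -> None:
    """Raise if the endpoint is unreachable or answers with an HTTP error status."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()


def probe(check: Callable[[], None]) -> ProbeResult:
    """Run one check and record success or failure plus the response time."""
    start = time.perf_counter()
    try:
        check()
        ok = True
    except Exception:
        ok = False
    return ProbeResult(ok, time.perf_counter() - start)


def run(check: Callable[[], None], interval: float, count: int) -> List[ProbeResult]:
    """Probe `count` times, waiting `interval` seconds between probes."""
    results = []
    for i in range(count):
        results.append(probe(check))
        if i < count - 1:
            time.sleep(interval)
    return results


def always_down() -> None:
    # Stand-in for an endpoint that is failing.
    raise ConnectionError("service unavailable")


# Demonstration with stand-in checks and no waiting between probes.
print(run(lambda: None, interval=0.0, count=2))
print(run(always_down, interval=0.0, count=2))

# Against a real endpoint (hypothetical URL), every 10 seconds:
#   run(lambda: http_check("http://localhost:8080/health"), interval=10.0, count=6)
```

The service under test needs no changes at all, which is both the strength and the limit of this approach: the probe sees only success, failure, and latency, and nothing of what happened inside.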
The most difficult form of observability is distributed tracing within and between application services. Creating this form of observability in an efficient and effective way requires strong experience and an understanding of the underlying principles of tracing requests that flow between services. If you can completely automate the process of creating distributed tracing observability (as Instana does) you will have found the Holy Grail of observability and monitoring.
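The underlying principle is that every service records a span and passes the trace context on with each downstream call. The sketch below shows that idea in a single process. It is not how Instana or any particular tracer implements it: the header names `X-Trace-Id` and `X-Parent-Span-Id`, the `SPANS` list standing in for a tracing backend, and both services are assumptions made for illustration.

```python
import secrets
from dataclasses import dataclass
from typing import Optional

SPANS = []  # stand-in for a tracing backend


@dataclass
class Span:
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    name: str


def start_span(name: str, headers: dict) -> Span:
    """Continue the caller's trace if the headers carry one, else start a new trace."""
    span = Span(
        trace_id=headers.get("X-Trace-Id") or secrets.token_hex(16),
        span_id=secrets.token_hex(8),
        parent_id=headers.get("X-Parent-Span-Id"),
        name=name,
    )
    SPANS.append(span)
    return span


def outgoing_headers(span: Span) -> dict:
    """Headers to attach to a downstream call so the trace stays connected."""
    return {"X-Trace-Id": span.trace_id, "X-Parent-Span-Id": span.span_id}


def payment_service(headers: dict) -> str:
    start_span("payment", headers)
    return "paid"


def checkout_service(headers: dict) -> str:
    span = start_span("checkout", headers)
    # In a real system this would be an HTTP or messaging call.
    return payment_service(outgoing_headers(span))


checkout_service({})  # an external request arrives without trace headers
for span in SPANS:
    print(span.name, span.trace_id, span.parent_id)
```

Both spans share one trace ID, and the payment span names the checkout span as its parent. If a single service in the chain forgets to forward the headers, the trace breaks in two, which is why doing this by hand across many services is error-prone.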
Contemporary Application Observability
Over the past 20 years there have been many monitoring tools that created observability of application servers. The entire APM market was built around the concept of creating observability. Initially all observability data was created using manual methodologies, such as adding API calls to your code or adding method names to configuration files. Eventually, APM tools like AppDynamics, New Relic, and Dynatrace got really good at using automated methodologies to create code-level observability (production profiling) in monolithic and SOA applications.
Modern application delivery has shifted to CI/CD, containerization, microservices, and polyglot environments, creating a new problem for APM vendors and for observability in general. New software is deployed so quickly, in so many small components, that the production profilers of the SOA generation have trouble keeping pace. They have trouble identifying and connecting dependencies between microservices, especially at the individual request level. Those production profilers employ various algorithms that limit the amount of data collected and therefore provide only partial data or metadata for most of the requests flowing through the system. This strategy MIGHT be acceptable for SOA applications but is completely unacceptable in the microservices world. The problem is so pervasive that the Cloud Native Computing Foundation (CNCF) has multiple open source observability projects in either the Incubation or Graduated phase.
Looking at the CNCF stack, the popular frameworks are Prometheus for time series metrics and Jaeger for distributed tracing. These frameworks provide a collection and storage platform along with an API for generating the observed data. Observability generated with open-source tools is therefore more than just technology. Like DevOps, it also has to be an organisational culture, with observability embraced as part of the CI/CD process. Using open-source tools, Developers are required to write code that emits the necessary data, and Operations are required to configure and manage the collection infrastructure (the monitoring tool).
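To make the developer side of that split concrete, here is a minimal hand-rolled sketch of manual instrumentation. The `Counter` class is a stand-in written for this example, not a real client library API (a real project would normally use an official client library), and `handle_request` and the metric name are hypothetical. The output imitates the Prometheus text exposition format that a collector would scrape.

```python
from collections import defaultdict


class Counter:
    """Hand-rolled counter, a stand-in for a metrics client library."""

    def __init__(self, name: str, help_text: str):
        self.name = name
        self.help_text = help_text
        self.values = defaultdict(int)

    def inc(self, **labels: str) -> None:
        # One time series per distinct combination of label values.
        self.values[tuple(sorted(labels.items()))] += 1

    def render(self) -> str:
        """Render the counter in the style of the Prometheus text format."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for labels, value in sorted(self.values.items()):
            label_text = ",".join(f'{k}="{v}"' for k, v in labels)
            lines.append(f"{self.name}{{{label_text}}} {value}")
        return "\n".join(lines)


REQUESTS = Counter("http_requests_total", "Total HTTP requests handled.")


def handle_request(path: str) -> int:
    status = 200 if path == "/orders" else 404   # placeholder for business logic
    REQUESTS.inc(path=path, status=str(status))  # the line a developer has to add
    return status


for path in ("/orders", "/orders", "/missing"):
    handle_request(path)

print(REQUESTS.render())
```

Every handler that should be visible needs its own instrumentation line, and someone in Operations still has to run the system that scrapes, stores, and charts the result.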
In the context of the CI/CD process and developing microservice applications, the current focus on observability is concerned with improving the quality of the monitoring data by making those services observable, that is, by providing internal data to augment external measurements. The resulting high-fidelity monitoring data improves the quality of feedback in the CI/CD loop.
The practical cost of implementing open-source or manual observability can be significant, placing a considerable additional load on both developers and operators. The burden of monitoring as code results in valuable programming resource being consumed writing instrumentation rather than functional (business) code. The setup and maintenance of the data collection, dashboarding, and alerting systems takes operators away from running core systems.
Automation is already extensively used throughout the CI/CD process, and automating observability frees up DevOps resource to concentrate on core tasks. The good news is that there is already an APM product that automates observability, and that product is Instana APM. Instana automatically detects language runtimes as they are deployed or scaled, then automatically performs code instrumentation to create observability. Instana collects the data, all the data (a distributed trace for every request), and automatically analyses it all using an intelligent backend. This provides actionable information to aid with troubleshooting and optimisation; not monitoring as code but monitoring by robot. We’ve named our robot Stan.
AutoTrace – Automatic, zero touch, no restart
Table of supported language runtimes for Instana automatic observability. See the Instana APM documentation for full details.
End to End Automation
Making applications observable by instrumenting the code to provide an internal view of their execution is nothing new, and the high-fidelity data that this technique provides is invaluable to maintaining the speed of the CI/CD process. Unfortunately there is no gain without pain. In this case the pain is the extra resources required from both developers and operators to collect and manage this extra data. Utilising Instana’s automation capabilities eliminates or considerably reduces the effort required to generate and acquire the data and get actionable information from it.
Automation is a key ingredient for the success of CI/CD – why not automate application observability and monitoring too? Sign up for your free trial of Instana today.
To continue your journey on the path to understanding Observability and how it relates to Monitoring, here are a few great resources:
- eBook about Observability and Developers
- Analyst Report from APM Experts Ranking Observability Solutions
For an even deeper dive into observability and monitoring concepts you should read “From Observability to Monitoring and then Controllability”.