Application Monitoring

Concepts

Traditional Application Performance Management (APM) solutions manage the performance and availability of applications.

For APM tools, an application is a static set of code runtimes (e.g. JVM or CLR) that are monitored using an agent. Normally, the application is defined as a configuration parameter on each agent.

This concept, which was a good model for classical 3-tier applications, no longer works for modern (micro)service applications. A service does not always belong to exactly one application. Think of a credit card payment service that is used in a company's online store and also in its point-of-sale solution. One solution to this problem would be to define every service as an application, but that would introduce new issues:

  • Too many applications to monitor. Treating every service as an application would result in hundreds or thousands of applications. Monitoring them using dashboards would not work: there is simply too much data for humans to process.
  • Loss of context. Because every service is treated separately, it would not be possible to understand its dependencies or its role in the context of a problem.

Instana introduces the next generation of APM with its application hierarchy of services, endpoints, and application perspectives across them. Our main goal is to simplify the monitoring of your business's service quality. Based on the data we collect from traces and component sensors, we discover your application landscape directly from the services as they are actually implemented.
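
To illustrate how one service can appear in several application perspectives, here is a minimal sketch of the idea. It is not Instana's API or data model: the `Call` and `Perspective` classes, the `channel=...` tags, and the service names are all hypothetical and chosen only to mirror the payment-service example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    service: str
    endpoint: str
    tags: frozenset  # hypothetical "key=value" tags attached to a call


@dataclass(frozen=True)
class Perspective:
    name: str
    required_tag: str  # hypothetical filter: calls carrying this tag belong here

    def matches(self, call):
        return self.required_tag in call.tags


def services_by_perspective(calls, perspectives):
    """Group services under every perspective whose filter matches their calls."""
    result = {p.name: set() for p in perspectives}
    for call in calls:
        for p in perspectives:
            if p.matches(call):
                result[p.name].add(call.service)
    return result


calls = [
    Call("shop-frontend", "GET /cart", frozenset({"channel=online"})),
    Call("payment", "POST /charge", frozenset({"channel=online"})),
    Call("pos-terminal", "POST /sale", frozenset({"channel=pos"})),
    Call("payment", "POST /charge", frozenset({"channel=pos"})),
]
perspectives = [
    Perspective("Online Store", "channel=online"),
    Perspective("Point of Sale", "channel=pos"),
]
membership = services_by_perspective(calls, perspectives)
```

In this sketch the `payment` service shows up under both perspectives without being defined as an application of its own.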

Summary

Latency Distribution

The Latency Distribution chart is well suited to investigating latency-related issues of your applications, services, or endpoints. You can select a latency range on the chart and then use the "View in Analytics" menu item to further explore the specific calls in Unbounded Analytics.
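
Conceptually, a latency distribution is a histogram of call latencies, and selecting a range narrows the view to the calls that fall inside it. The following sketch shows that idea only. The bucket width, the field names, and the sample calls are assumptions, and it does not reflect how the chart is computed internally.

```python
from collections import Counter


def latency_histogram(latencies_ms, bucket_ms=100):
    """Count calls per latency bucket; keys are the lower bound of each bucket."""
    counts = Counter((int(value) // bucket_ms) * bucket_ms for value in latencies_ms)
    return dict(sorted(counts.items()))


def calls_in_range(calls, low_ms, high_ms):
    """Return the calls whose latency lies in [low_ms, high_ms)."""
    return [c for c in calls if low_ms <= c["latency_ms"] < high_ms]


# Hypothetical sample data.
calls = [
    {"endpoint": "GET /cart", "latency_ms": 35},
    {"endpoint": "GET /cart", "latency_ms": 80},
    {"endpoint": "POST /charge", "latency_ms": 120},
    {"endpoint": "POST /charge", "latency_ms": 950},
    {"endpoint": "POST /charge", "latency_ms": 1020},
]

histogram = latency_histogram(c["latency_ms"] for c in calls)
slow_calls = calls_in_range(calls, 900, 1100)
```

Here `slow_calls` plays the role of the calls you would go on to explore after selecting the 900 to 1100 ms range.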


Infrastructure Issues & Changes

Infrastructure issues and changes related to your applications, services, or endpoints are shown on the "Summary" tab of the respective dashboard. They help you find correlations with notable changes in application metrics, e.g. an increase in "Erroneous Call Rate" or "Latency".


To learn more about specific issues or changes, select the desired time range on the chart and click the "View Events" context menu item, which takes you to the "Events" view.

Time Shift

To compare metrics with a past timeframe, use the Time Shift functionality. Be aware that precision decreases when comparing metrics against historical data.
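
The comparison itself can be pictured as evaluating the same metric over the selected timeframe and over a copy of that timeframe moved into the past. The sketch below assumes a hypothetical list of `(timestamp, latency)` samples and a seven-day shift. It illustrates the arithmetic only and does not model the reduced precision of historical data.

```python
from datetime import datetime, timedelta


def shifted_timeframe(start, end, shift):
    """Move a timeframe back in time by `shift`."""
    return start - shift, end - shift


def mean_in(samples, start, end):
    """Average of the sample values with start <= timestamp < end, or None."""
    values = [value for ts, value in samples if start <= ts < end]
    return sum(values) / len(values) if values else None


# Hypothetical latency samples in milliseconds.
samples = [
    (datetime(2024, 1, 1, 10, 15), 100.0),
    (datetime(2024, 1, 1, 10, 45), 120.0),
    (datetime(2024, 1, 8, 10, 15), 150.0),
    (datetime(2024, 1, 8, 10, 45), 170.0),
]

start, end = datetime(2024, 1, 8, 10, 0), datetime(2024, 1, 8, 11, 0)
past_start, past_end = shifted_timeframe(start, end, timedelta(days=7))

current_mean = mean_in(samples, start, end)
past_mean = mean_in(samples, past_start, past_end)
change = current_mean - past_mean
```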

Application Dependency Map

The dependency map is available for each application and provides:

  • an overview of the service dependencies within your application.
  • a visual representation of calls between services to understand communication paths and throughput.
  • different layouts to quickly gain an understanding of the application's architecture.
  • easy access to service views (dashboards, flows, calls and issues).
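
A dependency map of this kind can be thought of as a directed graph whose edges are calls between services, weighted by how many calls were observed. The sketch below builds such a graph from a hypothetical list of `(caller, callee)` pairs. The service names are made up, and the real map is derived from traces rather than from a hand-written list.

```python
from collections import defaultdict


def build_dependency_map(calls):
    """Aggregate (caller, callee) pairs into edges weighted by call count."""
    edges = defaultdict(int)
    for source, destination in calls:
        edges[(source, destination)] += 1
    return dict(edges)


def downstream(edges, service):
    """Services that `service` calls directly, in alphabetical order."""
    return sorted({dst for (src, dst) in edges if src == service})


# Hypothetical observed calls.
observed_calls = [
    ("shop-frontend", "cart"),
    ("shop-frontend", "payment"),
    ("pos-terminal", "payment"),
    ("shop-frontend", "payment"),
    ("payment", "card-gateway"),
]

edges = build_dependency_map(observed_calls)
frontend_dependencies = downstream(edges, "shop-frontend")
```

The edge weights correspond to throughput on the communication paths, and `downstream` answers the basic "what does this service depend on" question that the map visualizes.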

Application Dependency Map Overview

Error Messages

Error messages are the messages collected from errors that occur during code execution of a service. For example, if an exception is thrown during processing and is not caught and handled by the application code, the call is listed together with its error message on the "Error Messages" tab. A typical case is an unhandled exception in a Servlet's doGet method that causes the request to be answered with HTTP 500.
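
The following sketch mimics that situation in plain Python rather than in a Servlet container. The `handle_request` function stands in for a hypothetical framework layer: an exception the handler does not catch is turned into a 500 response, and the exception text is what would surface as the error message. None of these names come from Instana or from the Servlet API.

```python
def handle_request(handler, path):
    """Invoke `handler`; an exception it does not catch becomes an HTTP 500.

    Returns a (status, body, error_message) tuple.
    """
    try:
        return 200, handler(path), None
    except Exception as exc:
        # The exception type and text are what would be recorded with the call.
        error_message = f"{type(exc).__name__}: {exc}"
        return 500, "Internal Server Error", error_message


def get_order(path):
    """Hypothetical handler that does not guard against unknown paths."""
    orders = {"/orders/1": "order #1"}
    return orders[path]  # raises KeyError for an unknown path


ok = handle_request(get_order, "/orders/1")
failed = handle_request(get_order, "/orders/42")
```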

Log Messages

Log messages are collected from instrumented logging libraries and frameworks (see, for example, the section "Logging" in the list of supported libraries). When a service writes a log message with severity WARN or higher via a logging library, the message is displayed on the "Log Messages" tab. Captured log messages are also shown in the trace details, in the context of their trace. A log message written with severity ERROR or higher is marked as an error. Note that log messages with a severity lower than WARN are not tracked.
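
The severity rule can be sketched with Python's standard `logging` module, where the WARN level is called `WARNING`. The `CapturingHandler` class and the logger name below are hypothetical. The sketch only reproduces the filtering described above (keep WARN and higher, flag ERROR and higher) and is not how the instrumentation itself works.

```python
import logging


class CapturingHandler(logging.Handler):
    """Keeps records at WARNING or above and flags ERROR or above as errors."""

    def __init__(self):
        super().__init__(level=logging.WARNING)
        self.captured = []

    def emit(self, record):
        self.captured.append({
            "message": record.getMessage(),
            "severity": record.levelname,
            "is_error": record.levelno >= logging.ERROR,
        })


logger = logging.getLogger("payment-service")  # hypothetical service logger
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the example output confined to our handler
handler = CapturingHandler()
logger.addHandler(handler)

logger.info("charge started")               # below WARNING: not captured
logger.warning("retrying card gateway")     # captured, not an error
logger.error("card gateway unreachable")    # captured and marked as an error
```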

Infrastructure

From the Application Perspective view or the Services dashboard, you can navigate to the corresponding infrastructure component shown in the Infrastructure Monitoring view.

The "Unmonitored" Infrastructure Component

The list of infrastructure components for an application or service might sometimes include the "Unmonitored" host, container, or process.

The "Unmonitored" component indicates that, for some or all calls to this service, we were unable to link the calls to a specific infrastructure component. Because Services are "logical" entities, we can often link them to infrastructure components via the monitored process. This does not hold, for example, for third-party web services, which we don't monitor but for which we still create Services and Endpoints based on host name + path. Since no host or process is known for these services, the "Unmonitored" infrastructure component is shown.
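
The fallback can be pictured as follows. In this sketch each call optionally carries the process that served it. When that information is missing, as for a third-party web service, the call is attributed to a placeholder component. The field names, the process identifiers, and the example services are hypothetical.

```python
UNMONITORED = "Unmonitored"


def infrastructure_components(calls):
    """Map each service to the set of components its calls were linked to."""
    components = {}
    for call in calls:
        # No known process means the call cannot be linked to infrastructure.
        component = call.get("process") or UNMONITORED
        components.setdefault(call["service"], set()).add(component)
    return components


# Hypothetical calls: one monitored service, one third-party web service.
calls = [
    {"service": "payment", "process": "java@host-a"},
    {"service": "payment", "process": "java@host-b"},
    {"service": "api.example.com/v1", "process": None},
]

linked = infrastructure_components(calls)
```

A service whose calls are only partly linkable would end up with both real components and the placeholder in its set, which corresponds to the "some or all calls" case described above.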