How Developers Can Tame Microservices Sprawl

Microservice architectures pose the simultaneous challenges of information hiding and information discovery. Irrelevant services are hidden so that the system stays understandable: you focus on what is important and ignore everything else. However, the system is constantly changing, so you also need to see inter-service dependencies to understand what is relevant. This is the information hiding puzzle. Application Perspectives (AP) is a capability that solves this puzzle by letting you dynamically scope visibility to “just the right” size for your needs, such as:

  • by zone or cluster;
  • by technology;
  • by business transaction or user journey;
  • by deployment engine;
  • by version or release;
  • or any combination.

This “divide and conquer” approach shrinks microservice sprawl into a scoped perspective using automatic discovery, automatic instrumentation, automatic tracing, and automatic dashboard creation. The first article in this series introduced Application Perspectives. This second article builds on it to explain how an AP is defined.

The Call as a Building Block

As discussed in the prior article, there are four components to an Application Perspective: the Application Perspective (AP) itself, services, endpoints, and calls. The most important of these is the call because it is the building block from which the other components are formed: one or more calls form an endpoint; endpoints form a service; and services form an AP. As expected, a call represents the message sent from a source to a destination. The call object is annotated with meta-data (i.e., tags) that is used when matching a query; if the call meta-data matches the query, then the call, its endpoint, and its service belong in the AP.
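The containment hierarchy described above can be pictured as a small data model. The sketch below is purely illustrative: the class and field names (Call, Endpoint, Service, Perspective) and the sample values are assumptions made for this example and do not reflect Instana's internal types.

```python
from dataclasses import dataclass, field


@dataclass
class Call:
    source: str        # service that sent the message
    destination: str   # service that received it
    tags: dict[str, str] = field(default_factory=dict)  # call meta-data


@dataclass
class Endpoint:
    name: str
    calls: list[Call] = field(default_factory=list)


@dataclass
class Service:
    name: str
    endpoints: list[Endpoint] = field(default_factory=list)


@dataclass
class Perspective:
    name: str
    services: list[Service] = field(default_factory=list)

    def call_count(self) -> int:
        # Calls roll up through endpoints and services to the perspective.
        return sum(len(e.calls) for s in self.services for e in s.endpoints)


# Hypothetical sample data: one call, one endpoint, one service.
call = Call("gateway", "shipping", {"call.type": "HTTP"})
endpoint = Endpoint("GET /shipping", [call])
service = Service("shipping", [endpoint])
ap = Perspective("demo", [service])
```

Reading the model bottom-up mirrors the text: the call carries the tags, and everything above it is an aggregation of calls.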

Application Perspectives Developer Overview

How are calls found? Instana’s AutoTrace instrumentation records distributed traces automatically, including the sent and received request information used to construct the call. The trace spans are further processed into one or more calls. Additional infrastructure meta-data is added to the call from both the source process and the destination process of a request.

How are endpoints formed? An endpoint is an operation exposed by a service (e.g., GET{user-id}). It is formed by applying pattern matching techniques to one or more calls. An endpoint is automatically assigned a commonsense name according to predetermined rules. Custom names for endpoints can also be specified.
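To make the idea of forming endpoints by pattern matching concrete, here is a minimal sketch. It assumes one very simple rule (numeric path segments are replaced with a placeholder); this rule, the function names, and the sample paths are assumptions for illustration and are not Instana's actual endpoint-naming rules.

```python
import re
from collections import defaultdict


def endpoint_name(method: str, path: str) -> str:
    # Assumed rule: replace purely numeric path segments with "{id}".
    generalized = re.sub(r"/\d+(?=/|$)", "/{id}", path)
    return f"{method} {generalized}"


def group_calls(calls):
    """Group (method, path) calls under the endpoint name they map to."""
    endpoints = defaultdict(list)
    for method, path in calls:
        endpoints[endpoint_name(method, path)].append((method, path))
    return dict(endpoints)


# Hypothetical calls: two of them collapse into a single endpoint.
calls = [("GET", "/users/42"), ("GET", "/users/7"), ("POST", "/users")]
endpoints = group_calls(calls)
```

Here the two GET calls end up under one endpoint while the POST call forms its own, which is the "one or more calls form an endpoint" relationship in miniature.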

How are services formed? An Instana service is what you would expect it to be when talking about a microservice architecture: it is code that is deployed with endpoints. You rarely need to remember that a service is, in detail, a logical grouping of related endpoints and calls. It is constructed using an aggregation technique called service mapping. Services are intuitively named using a set of well-documented service mapping rules. A user can also create a custom service mapping rule.

Defining an AP by Choosing Tags

An AP is defined using a declarative query which matches call meta-data called tags. An AP is automatically kept up to date because the query is continuously evaluated for each input trace. So, if a new container receives a request that is captured in a trace, that request is converted to a call, the call’s tags are matched against an AP query, and both the endpoint and its service are automatically included in that AP when there is a match. For example, the tag/value pair technology=’springbootApplicationContainer’ defines an AP that includes all Spring Boot applications, where each application is a service. If a new Spring Boot application ‘Bar’ is deployed, then this is modeled as a new service ‘Bar’ which is automatically added to the AP when Bar receives its first request. So selecting the right tags is key to building a meaningful AP query.
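The continuous evaluation described above can be sketched in a few lines. This is a simplified illustration, assuming a query is just a set of tag/value pairs that must all be present on a call; the names matches and PerspectiveIndex, the service ‘Foo’, and the tag value someOtherTechnology are hypothetical.

```python
def matches(call_tags: dict, query: dict) -> bool:
    """True when every tag/value pair in the query is present on the call."""
    return all(call_tags.get(key) == value for key, value in query.items())


class PerspectiveIndex:
    """Tracks which services belong to an AP as calls are observed."""

    def __init__(self, query: dict):
        self.query = query
        self.services = set()

    def observe(self, call_tags: dict, destination_service: str) -> None:
        # Every incoming call is checked, so membership stays up to date.
        if matches(call_tags, self.query):
            self.services.add(destination_service)


ap = PerspectiveIndex({"technology": "springbootApplicationContainer"})
ap.observe({"technology": "springbootApplicationContainer"}, "Foo")
ap.observe({"technology": "someOtherTechnology"}, "Baz")
# 'Bar' joins the AP as soon as its first matching request is seen.
ap.observe({"technology": "springbootApplicationContainer"}, "Bar")
```

No one edits the AP when ‘Bar’ is deployed; its membership follows from the first call whose tags satisfy the query.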

Tag selection is aided by intellisense capabilities for tag names and tag values. Tags follow a naming convention like the dot convention of a Java package or a domain name. For example, the first part of the name is the most general (e.g., ‘call’), then another part to narrow the scope some more (e.g., call.http), followed by the actual item (e.g., call.http.header). Finding a tag is relatively easy because you can provide any part of the tag name and a list of candidate tags is recommended for you to pick from (e.g., enter ‘http’ or ‘header’). After selecting the tag name, a list of candidate values is shown to select from. It is quite popular to add user defined tag/value pairs using the Instana SDK and this same intellisense works for them.
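The lookup by any part of a tag name amounts to a substring search over the known tag names. The sketch below assumes a small, fixed list of tags taken from this article; the function name suggest and the list KNOWN_TAGS are hypothetical, and the real product may rank or filter candidates differently.

```python
# Tag names mentioned in this article; a real catalog would be much larger.
KNOWN_TAGS = ["call.http.header", "call.http.params", "call.type", "call.tag"]


def suggest(fragment: str, tags=KNOWN_TAGS) -> list:
    """Return candidate tag names containing the typed fragment."""
    fragment = fragment.lower()
    return sorted(tag for tag in tags if fragment in tag.lower())


http_candidates = suggest("http")
header_candidates = suggest("header")
```

Because the match is on any part of the dotted name, typing either the middle part (‘http’) or the last part (‘header’) is enough to surface call.http.header.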

When choosing a tag, you may need to specify which entity it applies to. There are three possibilities. The first is that the tag is independent of source and destination because it is an attribute of the call itself (e.g., call.http.params); this is shown with the combined source/destination icon. Alternatively, a tag can apply to either the source or the destination, each shown with its own icon, with the default being the destination. The distinction matters for a tag that identifies a service, because the choice determines whether the query matches the source service or the destination service of the call.

Using the Same AP for Monitoring or Troubleshooting

Each AP has two principal views, which address two different use cases. The first is an opaque, monitoring view for a group of services or even just one service. This opaque view is helpful for monitoring whether clients are experiencing issues. The second view is an end-to-end flow of the calls, which is discovered at run time.

These two views are controlled using two sets of options: the Downstream Services option and the Dashboard View option. Downstream Services determines what additional data is collected and is usually specified when defining the AP. Dashboard View determines the default dashboard view. It can be changed at any time using a menu, so you can easily switch from the monitoring view to the end-to-end view and vice versa.

Monitoring for Client Problems
If you are interested in monitoring for client problems, then set Downstream Services to “No downstream services” and set Dashboard View to “Inbound Calls”. Then, the following AP query

AP Declarative Query 1

results in the following (unsurprising) Dependency Graph with the specified, single service called shipping.

Application Perspective Shipping Service

The automatically constructed dashboard Summary tab shows the metrics for that single service. These metrics correspond to a client’s experience for its shipping requests.

Application Perspective Edge Service Dashboard

Troubleshooting a Client Problem using the End-to-End Flow
If you are interested in diagnosing a client problem, then set Downstream Services to “All downstream services”. This option is helpful because it automatically includes all the services you are interested in. Technically, it transitively finds and includes all services that are downstream of the services that originally matched the AP query. This is best explained by a medical analogy. An angiogram is a medical procedure in which a contrast dye is injected into the bloodstream; the dye flows through the arteries while an x-ray records the flow to check for blockages. By analogy, Downstream Services detects the entire end-to-end request flow(s) by injecting a software dye (the distributed trace context) that marks all the services involved, whether they are specified by the query or discovered by following the dye.
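The transitive inclusion of downstream services is, in graph terms, a reachability search starting from the services that matched the query. The sketch below assumes the call relationships are already known as a simple mapping from caller to callees; in practice they are discovered from distributed traces. The function name and the sample graph (including the ‘web’ and ‘catalogue’ services) are hypothetical.

```python
from collections import deque


def downstream_closure(matched, calls_to):
    """Return the matched services plus everything reachable downstream."""
    seen = set(matched)
    queue = deque(matched)
    while queue:
        service = queue.popleft()
        for callee in calls_to.get(service, ()):
            if callee not in seen:  # also guards against call cycles
                seen.add(callee)
                queue.append(callee)
    return seen


# Hypothetical call graph: caller -> services it calls.
calls_to = {
    "web": ["shipping"],
    "shipping": ["cart", "cities"],
    "cart": ["catalogue"],
}
scope = downstream_closure({"shipping"}, calls_to)
```

Note that ‘web’ is not added: it is upstream of shipping, and only services that shipping reaches, directly or indirectly, are pulled into the scope.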

The Dashboard View is set to “All Calls”. The “All Calls” option expands the dashboard metrics to include all the calls, endpoints, and their services found in the distributed traces. With these settings, the AP is no longer just the lone shipping service. It now also includes the services that shipping calls, the services that those services depend upon, and so on. This is shown below where there are three additional services that complete the distributed trace.

Application Perspective Dependent Service Map

The dashboard Summary metrics include these additional services and aggregate them together. This is seen below where the Total Calls count increased from 44K calls in the “Inbound” view to 172K calls for the “All Calls” view because the downstream calls are included.

Application Perspective End To End Service Dashboard

The troubleshooting process proceeds using the “divide and conquer” algorithm: drilling down from the summary tab to more detailed tabs, like the services table in the Services tab; the Dependency Graph that highlights services with issues; the error or log tab; etc. This often leads you to a distributed trace timeline (shown below) to identify the root cause in the code so that you can quickly fix the problem.

Distributed Trace Call Span Details

A View for Insightful Development

There is a third view available if you want to understand what is happening with a few services and their associated datastores. This is helpful when you need something in between the opaque and end-to-end views, like when you are developing code for services with their own databases. For this third view, set the Dashboard View to “All Calls”. Also set Downstream Services to “Immediate downstream database and messaging services”, which needs a little explanation: the AP query determines the core set of services that match it, and then this core set is expanded to include the database and messaging services that the core set directly interacts with. An example is shown below where the ‘cities’ (database) service is automatically added to the AP because it is directly used by the ‘shipping’ service.

Dependent Services View
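The “Immediate downstream database and messaging services” expansion differs from the full downstream search in two ways: it goes only one hop, and it keeps only callees of certain kinds. A minimal sketch follows, assuming each service has a known kind; the function name, the kind labels, and the ‘cart-db’ service are hypothetical.

```python
def with_immediate_datastores(core, calls_to, kind):
    """Expand the core set by one hop, keeping only database/messaging callees."""
    expanded = set(core)
    for service in core:
        for callee in calls_to.get(service, ()):
            if kind.get(callee) in {"database", "messaging"}:
                expanded.add(callee)
    return expanded


# Hypothetical call graph and service kinds.
calls_to = {"shipping": ["cities", "cart"], "cart": ["cart-db"]}
kind = {"cities": "database", "cart": "http", "cart-db": "database"}
scope = with_immediate_datastores({"shipping"}, calls_to, kind)
```

In this sample, ‘cities’ is added because shipping uses it directly, while ‘cart’ (not a datastore) and ‘cart-db’ (two hops away) stay out of the perspective.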

Now you can build your own personalized development dashboard to highlight errors or find key distributed traces. The first step is to install the Instana agent in your development environment with its zone set (for example, through the INSTANA_ZONE environment variable discussed under “Picking an Environment”). After you have sent some requests, confirm that monitoring is configured properly by checking that your services appear in Instana’s Services List. Then create your personal development Application Perspective by: (i) forming an AP query using the tags that identify your environment and your services, and (ii) setting Downstream Services to the “Immediate downstream database and messaging services” value (more details on how to do this appear further below). You now have a dashboard, analytics running in the background, and distributed traces to review.

Example APs for Developers

Hiding infrastructure details and treating services as logical entities improves the signal-to-noise ratio. This section provides several example APs to do just that.

Specifying a Group of Services
A team of people may be responsible for several related services or web sites. In this case, the services are well known, fairly static, and the primary focus. It is quite easy to construct an AP query for those services.

The AP query below does exactly that and highlights some of the nuances of AP construction. First, when the Boolean expressions are evaluated, AND takes priority over OR. This can be viewed as implicitly adding brackets around the AND expressions so they are evaluated first, resulting in the effective query: (service name == cart AND type == HTTP) OR (service name == shipping) OR (service name == ratings AND type == HTTP).

Secondly, it may be required to specify both the service name and service type together. A service is a logical entity which has its name automatically created, so there can be rare instances of a service name clash; additional information is needed to remove the clash. In the example below, there is an HTTP cart service which needs to be differentiated from a database service that has the same name. That is why the call.type is set to ‘HTTP’.

AP Declarative Query 3
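The precedence rule described above is the same one most programming languages use, so it can be checked directly. The sketch below writes the query once without brackets and once with the implicit brackets made explicit; Python's `and` also binds tighter than `or`. The function names and the ‘DATABASE’ type value are assumptions for illustration.

```python
def in_perspective(service_name: str, call_type: str) -> bool:
    # Written as in the query: no brackets, AND binds tighter than OR.
    return (
        service_name == "cart" and call_type == "HTTP"
        or service_name == "shipping"
        or service_name == "ratings" and call_type == "HTTP"
    )


def in_perspective_explicit(service_name: str, call_type: str) -> bool:
    # The same query with the implicit brackets spelled out.
    return (
        (service_name == "cart" and call_type == "HTTP")
        or (service_name == "shipping")
        or (service_name == "ratings" and call_type == "HTTP")
    )


http_cart = in_perspective("cart", "HTTP")          # the HTTP cart service
database_cart = in_perspective("cart", "DATABASE")  # same name, other type
```

The second call shows why the type condition matters: a database service that happens to be named cart is kept out of the perspective.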

Kubernetes Application by Version or Release
A Kubernetes application can have an AP created for it which is scoped by version. The example AP below is for the Kubernetes application named “MyApp” for version 1.0.1. The standard Kubernetes labels are used.

AP Declarative Query 2

Monitoring a Technology by an SME
An AP can be constructed for a Community of Practice. This could be technology based, such as all database administrators monitoring all databases from one dashboard. The query below creates an AP to monitor database calls for MongoDB. Similar APs can be created for other technology types, such as Kubernetes, RabbitMQ, Elasticsearch, AWS Lambda, Jenkins, MySQL, etc.

AP Declarative Query 4

Picking an Environment
When the same service is running in different environments, such as PRODUCTION or STAGING, it may be desirable to create different APs for them. This can be done in several ways, but the most common approach is to use the Instana data collection agent’s information, because different environments typically run on separate infrastructure which is often captured by the Instana agent. In this case, the environment is specified by the agent’s INSTANA_ZONE environment variable or by its configuration file using the com.instana.plugin.generic.hardware label. This data is available as a tag. An example AP query that uses the agent zone to scope the perspective to the production environment is shown below.

AP Declarative Query 5

Adding Your Own Custom Tags
As previously mentioned, custom tags can also be added by the developer using the Instana SDK which is available for many languages. Two short examples for PHP and Python are:

// PHP: annotate the current entry span with two custom tags
$entrySpan = \Instana\Tracer::getEntrySpan();
$entrySpan->annotate('account', 'Universal Sprockets');
$entrySpan->annotate('customer', '12345678');

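# Python: tag the active span through the OpenTracing API
import opentracing
span = opentracing.global_tracer().active_span
if span is not None:  # no span is active outside a traced request
    span.set_tag('account', 'Universal Sprockets')
    span.set_tag('customer', '12345678')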

Both implementations create two new custom tags: one for the customer account (account) and one for the user ID (customer). It is assumed that the account and ID information is available in an internal database and is added to the distributed trace via the SDK. The AP query uses the call.tag tag: you provide the key and then select a value to complete the AP query.

AP Configuration Example

In this example, the call.tag.account key, along with the value “Universal Sprockets”, defines an AP which auto-generates a dashboard for the Universal Sprockets customer.

AP Declarative Query 6

It is also possible to create an AP for an individual user in this fashion. This can be used to resolve intermittent problems for a specific customer which, in this example, has the ID ‘12345678’. In this situation, a temporary AP is defined for that specific customer, which captures all the traces related to that customer so the intermittent issue can be investigated. This type of information is useful to the support and development teams.


The Application Perspective concept is unique to Instana and is a key enabler for cutting through the noise of any application environment. The uses, structure, and definition of an Application Perspective have been explained so that you are equipped to construct an AP for your own purposes. The example APs shown here can help developers dynamically shrink the scope so the team can focus. The next article in the series will discuss how APs can mix service and infrastructure information to define a mixed monitoring scope using logical and physical meta-data. New types of services are presented too. This is useful information for developers, SREs, DevOps, and IT operations.
