Analyze Traces & Calls
Examine traces in Unbounded Analytics, where you can investigate the traces and calls collected by Instana. To help you understand how an application behaves with each call, Instana monitors every call as it comes into the system.
- On the sidebar, click Applications.
- On the Applications dashboard, select an application or service.
- On the application or services dashboard, click Analyze Calls.
- On the Analytics dashboard you can analyze calls by application, service, and endpoint, breaking down the data that Instana is presenting by service, endpoint and call names, respectively. Under Applications, select Calls or Traces.
- Click a group and then select a trace.
On the Analytics dashboard, traces or calls can be filtered and grouped using arbitrary tags. Filters are connected using the AND logic operator, so a trace or a call needs to match all the filters, and the default grouping is the endpoint name. To inspect the individual traces and calls that match the filters, the grouping can be removed.
The preceding example filters by service=catalogue and lists the calls grouped by the endpoint name.
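The filter and grouping behavior described above can be sketched in a few lines of Python. This is only an illustration of the AND semantics and the default grouping, not how Instana implements it. The call records, the `service.name` tag key, and the helper names `matches_all` and `group_calls` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical call records; the tag names mirror those used in the text.
calls = [
    {"service.name": "catalogue", "endpoint.name": "GET /catalogue", "latency_ms": 12},
    {"service.name": "catalogue", "endpoint.name": "GET /catalogue/{id}", "latency_ms": 7},
    {"service.name": "catalogue", "endpoint.name": "GET /catalogue", "latency_ms": 30},
    {"service.name": "cart", "endpoint.name": "POST /cart", "latency_ms": 18},
]

def matches_all(call, filters):
    """AND semantics: a call passes only if every filter matches."""
    return all(call.get(tag) == value for tag, value in filters.items())

def group_calls(calls, filters, group_by="endpoint.name"):
    """Filter calls, then group them by a tag; group_by=None lists them ungrouped."""
    selected = [c for c in calls if matches_all(c, filters)]
    if group_by is None:
        return selected
    groups = defaultdict(list)
    for call in selected:
        groups[call.get(group_by)].append(call)
    return dict(groups)

grouped = group_calls(calls, {"service.name": "catalogue"})
for endpoint, members in grouped.items():
    print(endpoint, len(members))
```

Passing `group_by=None` corresponds to removing the grouping so that the individual matching calls are listed.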
By selecting All Filters, you can apply the endpoint.name tags, along with the infrastructure entity tags, such as host.name, to both the source and the destination of a call. By default, a tag is applied to the destination. To apply it to the source instead, click All Filters -> Source.
By combining source and destination, you can create queries such as Show me all the calls between these two services or Show me all the calls that are issued from my agent.zone 'production' towards the
To apply grouping tags, click Group by and select the tags. The default grouping uses the endpoint name (endpoint.name) tag. To inspect the individual traces and calls that match the filters, the grouping can be removed. Tags can be applied to the call's source or destination, so you can express queries such as Show me all the calls towards this one service, broken down by caller. Calls that do not match any group are shown in a separate group; for instance, when grouping by agent.zone, this group is named Calls without the 'agent.zone' tag. To remove the unmatched agent.zone group from the results, apply an additional filter with the is present operator.
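As a rough model of this behavior, the sketch below groups calls by a tag on either the source or the destination, collects untagged calls in a separate group, and then removes that group with an "is present" style filter. The data layout and the function names `group_by_tag` and `is_present` are hypothetical and exist only for this example.

```python
from collections import Counter

# Hypothetical calls towards one service; "source" and "destination" hold entity tags.
calls = [
    {"source": {"agent.zone": "production"}, "destination": {"service.name": "catalogue"}},
    {"source": {"agent.zone": "staging"}, "destination": {"service.name": "catalogue"}},
    {"source": {}, "destination": {"service.name": "catalogue"}},
]

def group_by_tag(calls, tag, side="destination"):
    """Count calls per tag value on the chosen side; untagged calls get their own group."""
    counts = Counter()
    for call in calls:
        value = call[side].get(tag)
        key = value if value is not None else f"Calls without the '{tag}' tag"
        counts[key] += 1
    return dict(counts)

def is_present(calls, tag, side="destination"):
    """Keep only the calls that carry the tag on the chosen side."""
    return [c for c in calls if tag in c[side]]

# "All calls towards this service, broken down by the caller's zone."
by_caller_zone = group_by_tag(calls, "agent.zone", side="source")

# The same query with the unmatched group filtered out.
without_unmatched = group_by_tag(
    is_present(calls, "agent.zone", side="source"), "agent.zone", side="source"
)
```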
The option to apply filters and grouping on either the source or destination is not available for call tags such as call.tag, which are properties on the call itself and are independent of the source or destination.
Grouping by source and destination is also not available in Analyze Traces, as the available groups in that view are independent from source or destination of any one particular call.
Trace and call latency can be inspected using the Latency Distribution chart. When selecting a latency range on the chart, filters above the chart will be adjusted accordingly. The results in the table below will be updated to show only traces or calls within the specified latency range.
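Conceptually, the chart is a histogram over latencies, and selecting a range acts as an extra latency filter. The following is a minimal sketch under that assumption; the bucket width, the sample latencies, and the function names are invented for illustration.

```python
from collections import Counter

latencies_ms = [3, 8, 12, 15, 47, 52, 110, 480]  # hypothetical call latencies

def latency_histogram(latencies, bucket_ms=50):
    """Count calls per fixed-width bucket, keyed by the bucket's lower bound."""
    counts = Counter((value // bucket_ms) * bucket_ms for value in latencies)
    return dict(sorted(counts.items()))

def within_range(latencies, low_ms, high_ms):
    """Keep latencies in the selected range (lower bound inclusive, upper exclusive)."""
    return [value for value in latencies if low_ms <= value < high_ms]

histogram = latency_histogram(latencies_ms)
selected = within_range(latencies_ms, 50, 150)
```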
To display a trace view, on the Analytics dashboard select a group, and then click the trace. Selecting a call displays the call in the context of its trace.
The summary details of a trace include:
- The trace name (usually an HTTP entry).
- The name of the service it occurred on.
- The type or technology.
The core KPIs:
- Sub calls to other services.
- The number of erroneous calls.
- The number of errors within the trace.
- The number of warnings within the trace.
- The total latency.
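To make the KPIs concrete, the sketch below derives them from a list of spans. The span fields are hypothetical, and the example assumes that the total latency equals the duration of the root span and that every non-root span counts as a sub call.

```python
# Hypothetical spans of one trace; the field names are illustrative only.
spans = [
    {"id": "a", "parent": None, "start_ms": 0, "end_ms": 120, "errors": 0, "warnings": 0},
    {"id": "b", "parent": "a", "start_ms": 10, "end_ms": 70, "errors": 2, "warnings": 1},
    {"id": "c", "parent": "a", "start_ms": 75, "end_ms": 110, "errors": 0, "warnings": 0},
]

def summarize(spans):
    """Compute the summary KPIs of a trace from its spans."""
    root = next(s for s in spans if s["parent"] is None)
    return {
        "sub_calls": len(spans) - 1,  # assumption: every non-root span is a sub call
        "erroneous_calls": sum(1 for s in spans if s["errors"] > 0),
        "errors": sum(s["errors"] for s in spans),
        "warnings": sum(s["warnings"] for s in spans),
        "total_latency_ms": root["end_ms"] - root["start_ms"],  # assumption: root duration
    }

summary = summarize(spans)
```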
The trace timeline displays the following:
- When the trace started.
- The chronological order of services that have been called throughout the trace.
The call chains hang from the root element (span). On simple three-tier systems, the typical depth is four levels. In contrast, on systems with a distributed service or microservices architecture, you can expect to see much longer icicles. When a trace has long subcalls or periodic call patterns, such as one HTTP call per database entry, the timeline gives you an excellent overview of the call structure.
To view details of the span, click the span within the timeline graph. To view details of where the time was spent within a specific call, hover over the call displayed on the timeline graph. The call details include Self (within the call), Waiting on another call, or on the
The services list, under the timeline graph, summarizes all the calls per service and shows the number of calls, the aggregated time, and the errors that have occurred. Each service has its own color (in this example, shop = blue and productsearch = green). Select a service to view its details in the applications and services dashboard.
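The per-service summary amounts to a simple aggregation over the calls of a trace. A minimal sketch, assuming hypothetical span records with a service name, a duration, and an error count:

```python
from collections import defaultdict

# Hypothetical spans; service names are borrowed from the example in the text.
spans = [
    {"service": "shop", "duration_ms": 120, "errors": 0},
    {"service": "productsearch", "duration_ms": 60, "errors": 2},
    {"service": "productsearch", "duration_ms": 35, "errors": 0},
]

def summarize_by_service(spans):
    """Aggregate call count, total time, and errors per service."""
    summary = defaultdict(lambda: {"calls": 0, "time_ms": 0, "errors": 0})
    for span in spans:
        row = summary[span["service"]]
        row["calls"] += 1
        row["time_ms"] += span["duration_ms"]
        row["errors"] += span["errors"]
    return dict(summary)

per_service = summarize_by_service(spans)
```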
The trace tree displays the structure of the upstream and downstream service calls, along with the type of the call. To explore specific calls, expand and collapse individual parts of the trace tree. Select a call to view its details in the services and endpoints dashboard.
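One way to picture the trace tree is as spans linked by parent references. The sketch below builds such a tree, computes its depth, and estimates the time a call spends in itself as opposed to waiting on its direct children. The span tuples are invented, and the self-time calculation assumes that sibling calls do not overlap in time.

```python
from collections import defaultdict

# Hypothetical spans: (id, parent id, start ms, end ms).
spans = [
    ("http", None, 0, 100),
    ("svc", "http", 10, 90),
    ("db1", "svc", 20, 40),
    ("db2", "svc", 50, 80),
]

children = defaultdict(list)
by_id = {}
for span_id, parent, start, end in spans:
    by_id[span_id] = (start, end)
    if parent is not None:
        children[parent].append(span_id)

def depth(span_id):
    """Number of levels in the subtree rooted at span_id."""
    return 1 + max((depth(child) for child in children[span_id]), default=0)

def self_time(span_id):
    """Duration minus the time spent waiting on direct children (no overlap assumed)."""
    start, end = by_id[span_id]
    waiting = sum(by_id[child][1] - by_id[child][0] for child in children[span_id])
    return (end - start) - waiting
```

With these sample spans, `depth("http")` counts three levels, and `self_time("svc")` subtracts the two database calls from the 80 ms duration of the service call.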
To display the call detail sidebar, select a call in the timeline graph. The details displayed include the source and destination of the call, errors, a status code, along with the stack trace.
Instana automatically captures errors when a service returns a bad response or when a log entry with a WARN level (or similar, depending on the framework) is detected.
Instana always endeavors to give you the best understanding of service interactions, while also minimizing impact on the actual application. Certain scenarios, however, require Instana to drop data in order to achieve that.
A very common problem in systems is the so-called 1+N query problem, which describes a situation where code performs 1 database call to get a list of items, followed by N individual calls to retrieve the individual items. The problem can usually be fixed by performing only one call and joining the other calls to it.
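The pattern and its fix can be illustrated with Python's built-in sqlite3 module. The schema, the data, and the function names below are made up for the example; the point is only the difference in the number of queries issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE items (id INTEGER PRIMARY KEY, order_id INTEGER, name TEXT);
    INSERT INTO orders VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO items VALUES (1, 1, 'book'), (2, 1, 'pen'), (3, 2, 'lamp');
""")

def items_one_plus_n(conn):
    """1 query for the list of orders, then N queries, one per order."""
    result, queries = {}, 0
    orders = conn.execute("SELECT id FROM orders ORDER BY id").fetchall()
    queries += 1
    for (order_id,) in orders:
        rows = conn.execute(
            "SELECT name FROM items WHERE order_id = ? ORDER BY id", (order_id,)
        ).fetchall()
        queries += 1
        result[order_id] = [name for (name,) in rows]
    return result, queries

def items_joined(conn):
    """A single JOIN returns the same data with one query."""
    result = {}
    rows = conn.execute(
        "SELECT o.id, i.name FROM orders o JOIN items i ON i.order_id = o.id "
        "ORDER BY o.id, i.id"
    ).fetchall()
    for order_id, name in rows:
        result.setdefault(order_id, []).append(name)
    return result, 1

slow_result, slow_queries = items_one_plus_n(conn)
fast_result, fast_queries = items_joined(conn)
```

Both functions return the same items per order here, but the first issues one query per order on top of the initial one. Note that the inner join omits orders without items, so a LEFT JOIN would be needed if those must appear.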
The icon next to the call name indicates how many requests were batched together. Call details match those of the most significant service invocation, for example the request with the highest duration or the one that has errors. The duration and error count of the shown call are aggregated from all batched calls.
The aggregation of service interactions only happens within the following constraints:
- High-frequency, repetitive access patterns of a similar type
- Individual service invocations take less than 10 ms
- Time between invocations is less than 10 ms
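The constraints above can be modeled roughly as follows. This is an illustrative sketch, not Instana's actual batching algorithm: it treats calls with the same name as being of a similar type, does not model a frequency threshold, and all names and sample data are hypothetical.

```python
MAX_DURATION_MS = 10  # individual invocations must be shorter than this
MAX_GAP_MS = 10       # time between invocations must be shorter than this

# Hypothetical calls, ordered by start time.
calls = [
    {"name": "SELECT item", "start_ms": 0, "duration_ms": 2, "errors": 0},
    {"name": "SELECT item", "start_ms": 4, "duration_ms": 5, "errors": 1},
    {"name": "SELECT item", "start_ms": 12, "duration_ms": 3, "errors": 0},
    {"name": "SELECT item", "start_ms": 200, "duration_ms": 2, "errors": 0},
    {"name": "GET /slow", "start_ms": 210, "duration_ms": 40, "errors": 0},
]

def can_join(batch, call):
    """A call joins a batch if it is similar, short, and close to the previous call."""
    last = batch[-1]
    gap = call["start_ms"] - (last["start_ms"] + last["duration_ms"])
    return (
        call["name"] == last["name"]
        and call["duration_ms"] < MAX_DURATION_MS
        and last["duration_ms"] < MAX_DURATION_MS
        and gap < MAX_GAP_MS
    )

def merge(batch):
    """Show the most significant call; aggregate duration and errors over the batch."""
    top = max(batch, key=lambda c: (c["errors"] > 0, c["duration_ms"]))
    return {
        "name": top["name"],
        "batch_size": len(batch),
        "duration_ms": sum(c["duration_ms"] for c in batch),
        "errors": sum(c["errors"] for c in batch),
    }

def batch_calls(calls):
    """Group consecutive calls that satisfy the batching constraints."""
    batches = []
    for call in calls:
        if batches and can_join(batches[-1], call):
            batches[-1].append(call)
        else:
            batches.append([call])
    return [merge(batch) for batch in batches]

batched = batch_calls(calls)
```

In this sample, the first three calls collapse into one entry, while the call 185 ms later and the slow call each remain on their own.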
Due to impact concerns, at the moment the tracing sensors of Instana do not automatically capture method parameters or method return values. To capture additional data on demand, use the SDKs.
Due to timeouts, high load, or any number of other environmental conditions, calls can take a significant amount of time to respond. Traces can contain tens or even hundreds of such calls. Because we don't want to wait until all calls have responded before delivering tracing information, long-running spans are replaced with a placeholder. When the span finally returns, the placeholder is replaced with the correct call information.
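The placeholder mechanism can be thought of as a two-step update of the same entry in a trace. The sketch below is a simplified model with hypothetical function and field names, not a description of Instana's internals.

```python
# Hypothetical trace view keyed by span id.
trace = {}

def report_pending(trace, span_id, name):
    """Show a placeholder for a call that has started but not yet responded."""
    trace[span_id] = {"name": name, "placeholder": True}

def report_finished(trace, span_id, name, duration_ms, errors=0):
    """Replace the placeholder (if any) with the final call information."""
    trace[span_id] = {
        "name": name,
        "placeholder": False,
        "duration_ms": duration_ms,
        "errors": errors,
    }

report_pending(trace, "span-1", "GET /report")
pending_view = dict(trace["span-1"])  # what a viewer sees while the call is running

report_finished(trace, "span-1", "GET /report", duration_ms=45000)
final_view = trace["span-1"]
```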
Instana stores all traces and calls for 7 days. Past this period, our retention strategy retains statistically significant traces and calls to prevent unbounded storage growth.
Traces and calls that rarely occur may not be represented in such scenarios.