Apache Spark Monitoring and Performance Management
Apache Spark is one of the largest open source data processing projects, providing a fast engine for big data processing and deep analytics. Instana’s Apache Spark Monitoring covers Spark deployed through AWS EMR as well as Spark running under the Standalone Cluster Manager. Spark performance monitoring revolves around monitoring the Spark Driver instance, and Instana’s Spark Monitoring Sensor supports both of these deployment methods.
Spark Performance and Health Monitoring
Depending on how the application has been deployed (EMR or Standalone), different data is collected and used for monitoring.
Spark Performance and Configuration Monitoring
For Spark instances running on AWS EMR, install the Instana agent on the Amazon EC2 instances within the EMR cluster. For automated deployment of the Spark monitoring sensor, the Instana agent must be installed on all nodes in the EMR cluster.
Instana’s Spark Monitoring includes an automatically built summary dashboard that centers on application KPIs, including response time and load. The dashboard also shows key infrastructure configuration and performance metrics, as well as Spark-specific processing metrics. It lets DevOps and IT Ops see all relevant Spark data on one screen, making it easy to understand the state of their Spark instances.
Monitoring the health and performance of Apache Spark instances requires both an understanding of Spark itself and the ability to see the interactions and dependencies between clustered Spark instances and other microservices (both upstream and downstream). Instana’s Spark monitoring sensor automatically identifies and collects the relevant metrics.
Spark Monitoring Data
| Batch Applications | Streaming Applications | Configuration | Metrics |
|---|---|---|---|
| Jobs | Batching | Host | Alive Workers |
| Stages | Scheduling Delay | Port | Dead Workers |
| Longest Completed Steps | Total Delay | Rest URI | Decommissioned Workers |
| Executors | Processing Time | Version | Workers in Unknown State |
| | Output Operations | Status | Used Memory |
| | Input Records | | Total Memory |
| | Receivers | | Used Cores |
| | Executors | | Total Cores |
| | | | Data and Metrics per Worker |
| | | | Most Recent Apps |
| | | | Most Recent Drivers |
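To make the Metrics column more concrete, here is a minimal Python sketch of how worker and resource counts like those above could be derived from a cluster-state payload. It is not Instana's implementation. The function name `summarize_master_state` is hypothetical, and the field names (`workers`, `state`, `cores`, `coresused`, `memory`, `memoryused`) are assumed to follow the JSON state that a Spark standalone master reports, with memory assumed to be in megabytes.

```python
from collections import Counter
from typing import Any, Dict


def worker_state(worker: Dict[str, Any]) -> str:
    """Return the worker's state in upper case; a missing state counts as UNKNOWN."""
    return str(worker.get("state", "UNKNOWN")).upper()


def summarize_master_state(state: Dict[str, Any]) -> Dict[str, int]:
    """Derive worker counts and resource totals from a master-state payload.

    Field names are assumptions modelled on a Spark standalone master's JSON.
    """
    workers = state.get("workers", [])
    by_state = Counter(worker_state(w) for w in workers)
    # Only alive workers contribute usable cores and memory.
    alive = [w for w in workers if worker_state(w) == "ALIVE"]
    return {
        "alive_workers": by_state["ALIVE"],
        "dead_workers": by_state["DEAD"],
        "decommissioned_workers": by_state["DECOMMISSIONED"],
        "unknown_workers": by_state["UNKNOWN"],
        "used_cores": sum(w.get("coresused", 0) for w in alive),
        "total_cores": sum(w.get("cores", 0) for w in alive),
        "used_memory_mb": sum(w.get("memoryused", 0) for w in alive),
        "total_memory_mb": sum(w.get("memory", 0) for w in alive),
    }


# Illustrative payload; the values are made up for the example.
sample = {
    "workers": [
        {"id": "w1", "state": "ALIVE", "cores": 8, "coresused": 4,
         "memory": 16384, "memoryused": 8192},
        {"id": "w2", "state": "ALIVE", "cores": 8, "coresused": 2,
         "memory": 16384, "memoryused": 4096},
        {"id": "w3", "state": "DEAD", "cores": 8, "coresused": 0,
         "memory": 16384, "memoryused": 0},
        {"id": "w4", "state": "DECOMMISSIONED", "cores": 8, "coresused": 0,
         "memory": 16384, "memoryused": 0},
    ]
}

summary = summarize_master_state(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Counting only alive workers toward the core and memory totals is a design choice in this sketch: dead or decommissioned workers cannot run executors, so including them would overstate available capacity.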
Spark Monitoring Sensor Installation: Getting Started
Ready to start monitoring Spark? Begin by signing up for an Instana Trial or Account. Once you have an account, see the Spark Management Documentation for details on how to configure the different Spark driver and deployment types.