Monitoring Amazon ElasticMapReduce (EMR)

Note: Learn about the other supported AWS services with our AWS documentation.

This sensor monitors AWS ElasticMapReduce (EMR) environments and their instances.

Sensor (Data Collection)

Cluster Details

  • Cluster Id
  • Cluster Name
  • Cluster Creation Time
  • Cluster Version
  • Cluster State
  • Grouping zone (region)

Metrics

Cluster Metrics

Name                Description
Apps Running        The number of applications currently running in the cluster.
Apps Pending        The number of applications pending for the cluster.
Apps Failed         The number of applications that failed in the cluster.
Memory Allocated    The amount of memory allocated to the cluster in bytes.
Memory Reserved     The amount of memory reserved in bytes.
Memory Available    The amount of memory available to be allocated in bytes.
Containers Running  The number of containers running in the cluster.

Node Metrics

Name                  Description
Active Nodes          The number of nodes currently running MapReduce tasks within the cluster.
Lost Nodes            The number of nodes allocated to MapReduce tasks with a LOST state.
Unhealthy Nodes       The number of nodes allocated to MapReduce tasks with an UNHEALTHY state.
Decommissioned Nodes  The number of nodes allocated to MapReduce tasks with a DECOMMISSIONED state.

Input/Output Metrics

Name                 Description
Bytes Written to S3  The number of bytes written to the S3 bucket by the cluster.
Bytes Read from S3   The number of bytes read from the S3 bucket by the cluster.
HDFS Utilization     The percentage of HDFS storage currently being used.
Total Load           The total number of concurrent data transfers.

Required Permissions

  • cloudwatch:GetMetricStatistics
  • cloudwatch:GetMetricData
  • elasticmapreduce:ListClusters
  • elasticmapreduce:DescribeCluster
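
As an illustration, the permissions above could be granted through a CloudFormation managed policy. This is a minimal sketch, not an official template: the logical name InstanaEmrMonitoringPolicy and the wildcard Resource are assumptions, and the policy still has to be attached to whichever role or user supplies the agent's credentials.

Resources:
  InstanaEmrMonitoringPolicy: # hypothetical logical name
    Type: AWS::IAM::ManagedPolicy
    Properties:
      Description: Read-only access for EMR monitoring
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action:
              - cloudwatch:GetMetricStatistics
              - cloudwatch:GetMetricData
              - elasticmapreduce:ListClusters
              - elasticmapreduce:DescribeCluster
            Resource: "*" # assumption: not scoped to specific resources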

Configuration

Metrics for EMR are pulled every 300 seconds by default. To change the interval, edit the agent configuration in <agent_install_dir>/etc/instana/configuration.yml:

com.instana.plugin.aws.emr:
  cloudwatch_period: 300
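
For example, to poll less frequently, the interval could be raised to 600 seconds. The value 600 is only illustrative. CloudWatch expects the period of a regular-resolution metric to be a multiple of 60 seconds, so a value such as 450 is best avoided.

com.instana.plugin.aws.emr:
  cloudwatch_period: 600 # illustrative value, in seconds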

To disable monitoring of EMR instances, use the following configuration:

com.instana.plugin.aws.emr:
  enabled: false

Monitoring multiple AWS accounts

Refer to the Monitoring multiple AWS accounts documentation to set up monitoring of multiple AWS accounts with one AWS agent in the same region.

To override which profiles should be used to monitor ElasticMapReduce, use the following configuration:

com.instana.plugin.aws.emr:
  profile_names:
    - 'profile2'
    - 'profile3'

Note: Profiles defined at the service level override the global AWS configuration.

Filtering

Multiple tags can be defined, separated by commas. Each tag is a key-value pair separated by a colon (:). You can define which tags to include in discovery and which tags to exclude from it. If a tag is defined in both lists, the exclude list takes priority. If you do not need to filter services, omit this configuration. You do not have to define every value to enable filtering.

You can specify how often sensors poll AWS for tagged resources with the tagged-services-poll-rate configuration property (default: 300 seconds).

Note: Tags are only available in conjunction with the AWS Agent.

To define how often sensors poll the tagged resources, use the following configuration:

com.instana.plugin.aws:
  tagged-services-poll-rate: 60 # in seconds; default is 300

To include services in discovery by tag, use the following configuration:

com.instana.plugin.aws.emr:
  include_tags: # Comma separated list of tags in key:value format (e.g. env:prod,env:staging)

To exclude services from discovery by tag, use the following configuration:

com.instana.plugin.aws.emr:
  exclude_tags: # Comma separated list of tags in key:value format (e.g. env:dev,env:test)
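
The two lists can be combined in one block. The sketch below uses hypothetical tag values (env:prod and env:staging) to show the priority rule: env:staging appears in both lists, so the exclude list wins and only env:prod remains an include filter.

com.instana.plugin.aws.emr:
  include_tags: env:prod,env:staging
  exclude_tags: env:staging # listed in both lists, so exclude takes priority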

Instana Agent Tags

Tags are only available in conjunction with the AWS Agent. For more details on using tags, see the Instana agent tags documentation.