Monitoring IBM Z HMC

Supported versions

zHMC is supported as a platform. Collection of metrics and configuration data is confirmed for Hardware Management Console (HMC) version 2.15.0 and later.

Configuration

To connect to a zHMC server, configure the following fields in the agent configuration file <agent_install_dir>/etc/instana/configuration.yaml:

Note: Only remote monitoring is supported. To monitor multiple HMCs, add one list entry per HMC under remote, in the format shown below:

com.instana.plugin.zhmc:
  remote:
    - host: ''             # IP address of the HMC
      port: ''             # HMC port
      user: ''             # userid on the HMC to be used for logging on
      password: ''         # password for the userid
      availabilityZone: '' # default is 'ZHMC Remote Monitoring'
      poll_rate: 15        # metrics poll rate in seconds.

The configured remote zHMC instance will then be shown as a separate box in the specified availabilityZone.
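
As an illustration of the note on multiple HMCs, the following sketch configures two HMCs as separate entries under remote. The IP addresses, user IDs, passwords, and zone names are hypothetical placeholders, and port 6794 is an assumption (it is the usual HMC Web Services API port). Substitute the values that apply to your environment.

```yaml
com.instana.plugin.zhmc:
  remote:
    # First HMC (all values are hypothetical placeholders)
    - host: '192.0.2.10'                 # IP address of the first HMC
      port: '6794'                       # assumed HMC Web Services API port
      user: 'monitor-a'                  # userid on the first HMC
      password: '<password-a>'           # password for that userid
      availabilityZone: 'ZHMC Site A'    # hypothetical zone name
      poll_rate: 15                      # metrics poll rate in seconds
    # Second HMC, added as another list entry under remote
    - host: '192.0.2.20'                 # IP address of the second HMC
      port: '6794'                       # assumed HMC Web Services API port
      user: 'monitor-b'                  # userid on the second HMC
      password: '<password-b>'           # password for that userid
      availabilityZone: 'ZHMC Site B'    # hypothetical zone name
      poll_rate: 15                      # metrics poll rate in seconds
```

With a layout like this, each HMC would be shown as its own box in the availabilityZone named in its entry.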

Metrics collection

Currently, the following 11 metric groups are supported across the Classic and DPM operational modes.

No.  Metric group name              Mode
1    cpc-usage-overview             C
2    logical-partition-usage        C
3    channel-usage                  C
4    dpm-system-usage-overview      D
5    partition-usage                D
6    zcpc-environmentals-and-power  C+D
7    zcpc-processor-usage           C+D
8    crypto-usage                   C
9    flash-memory-usage             C
10   adapter-usage                  D
11   network-physical-adapter-port  D

Note: C = Classic mode, D = DPM mode, C+D = both modes.

Performance metrics

CPC overview (C)

This metric group reports the aggregated processor usage and channel usage, the ambient temperature, and total system power consumption for each system. The cpc-processor-usage is the average of the percentages of processing capacity for all the physical processors in the CPC. The channel-usage is the average of the percentages of I/O capacity for all the channels and adapters in the CPC.

The following metrics are provided in each entry of this metric group:

Metric Description Granularity
CPC Processor Usage The processor percent usage for Central Processor Complex processors. 15 seconds
Channel Usage The channel percent usage. 15 seconds
Power Consumption Watts The total system power consumption in watts. 15 seconds
Temperature Celsius The ambient temperature in degrees Celsius. 15 seconds
CP Shared Processor Usage The processor percent usage for shared Central Processors. 15 seconds
CP Dedicated Processor Usage The processor percent usage for dedicated Central Processors. 15 seconds
IFL Shared Processor Usage The processor percent usage for shared Integrated Facility for Linux processors. 15 seconds
IFL Dedicated Processor Usage The processor percent usage for dedicated Integrated Facility for Linux processors. 15 seconds
ICF Shared Processor Usage The processor percent usage for shared Internal Coupling Facility processors. 15 seconds
ICF Dedicated Processor Usage The processor percent usage for dedicated Internal Coupling Facility processors. 15 seconds
IIP Shared Processor Usage The processor percent usage for shared Integrated Information Processors. 15 seconds
IIP Dedicated Processor Usage The processor percent usage for dedicated Integrated Information Processors. 15 seconds
AAP Shared Processor Usage The processor percent usage for shared Application Assist Processors. 15 seconds
AAP Dedicated Processor Usage The processor percent usage for dedicated Application Assist Processors. 15 seconds
ALL Shared Processor Usage The processor percent usage for all the shared processors, combined together. 15 seconds
ALL Dedicated Processor Usage The processor percent usage for all the dedicated processors, combined together. 15 seconds
CP ALL Processor Usage The processor percent usage for all the Central Processors, combined together. 15 seconds
IFL ALL Processor Usage The processor percent usage for all the Integrated Facility for Linux processors, combined together. 15 seconds
ICF ALL Processor Usage The processor percent usage for all the Internal Coupling Facility processors, combined together. 15 seconds
IIP ALL Processor Usage The processor percent usage for all the Integrated Information Processors, combined together. 15 seconds
CBP Shared Processor Usage The processor percent usage for shared Container Based Processors. 15 seconds
CBP Dedicated Processor Usage The processor percent usage for dedicated Container Based Processors. 15 seconds
CBP ALL Processor Usage The processor percent usage for all the Container Based Processors. 15 seconds

Logical partitions (C)

This metric group reports the processor usage and z/VM paging rate for each active logical partition (Image, LPAR Image, Zone, PR/SM virtual server) on the system.

The following metrics are provided in each entry of this metric group:

Metric Description Granularity
Processor Usage The processor percent usage of the Logical Partition. 15 seconds
ZVM Paging Rate The z/VM paging rate. Only returned for logical partitions running z/VM level 6.1 or higher that have the appropriate agent running in them. 15 seconds
CP Processor Usage The processor percent usage for Central Processors. 15 seconds
IFL Processor Usage The processor percent usage for Integrated Facility for Linux processors. 15 seconds
ICF Processor Usage The processor percent usage for Internal Coupling Facility processors. 15 seconds
IIP Processor Usage The processor percent usage for Integrated Information Processors. 15 seconds
CBP Processor Usage The processor percent usage for Container Based Processors. 15 seconds

Channel usage (C)

This metric group reports the channel usage for each channel on the system. An instance of this metric group is created for each channel of a CPC.

The following metrics are provided in each entry of this metric group:

Metric Description Granularity
Channel Name The name of the channel in the form channel subsystem path id. 15 seconds
Shared Channel True if the channel is shared among logical partitions, and false if it is not. 15 seconds
Logical Partition Name The name of the owning logical partition or the value "shared" if the channel is shared. 15 seconds
Channel Usage The channel percent usage (0-100%). 15 seconds

DPM system overview (D)

This metric group reports the aggregated processor usage, network usage, storage usage, accelerator usage, crypto usage, power consumption, and temperature for each DPM-enabled system.

The following metrics are provided in each entry of this metric group:

Metric Description Granularity
Processor usage The processor percent usage. 15 seconds
Network usage The network percent usage. 15 seconds
Storage usage The storage percent usage. 15 seconds
Accelerator usage The accelerator percent usage. 15 seconds
Crypto usage The crypto percent usage. 15 seconds
Power consumption watts The power consumption in watts. 15 seconds
Temperature celsius The ambient temperature. 15 seconds
CP shared processor usage The processor percent usage for all CP shared processors. 15 seconds
CP all processor usage The processor percent usage for all CP processors. 15 seconds
IFL shared processor usage The processor percent usage for all IFL shared processors. 15 seconds
IFL all processor usage The processor percent usage for all IFL processors. 15 seconds
All shared processor usage The processor percent usage for all shared processors. 15 seconds

Partitions (D)

This metric group reports the processor usage, network usage, storage usage, accelerator usage, and crypto usage for each active partition on a DPM-enabled system.

The following metrics are provided in each entry of this metric group:

Metric Description Granularity
Processor usage The processor percent usage. 15 seconds
Network usage The network percent usage. 15 seconds
Storage usage The storage percent usage. 15 seconds
Accelerator usage The accelerator percent usage. 15 seconds
Crypto usage The crypto percent usage. 15 seconds

zCPC environmentals and power (C+D)

This metric group reports environmental data and power consumption for the zCPC.

The following metrics are provided in each entry of this metric group:

Metric Description Granularity
Temperature celsius The ambient temperature 15 seconds
Humidity The relative humidity 15 seconds
Dew point celsius The dew point 15 seconds
Power consumption watts The power consumption in watts 15 seconds
Heat load The total heat load of the system (heat load forced-air + heat load water) 15 seconds
Heat load forced air The heat load covered by forced-air 15 seconds
Heat load water The heat load covered by water 15 seconds
Exhaust temperature celsius The exhaust temperature 15 seconds

zCPC processors (C+D)

This metric group reports the processor usage for each physical zCPC processor on the system. This includes the System Assist Processors (SAPs). An instance of this metric group is created for each processor of a CPC.

The following metrics are provided in each entry of this metric group:

Metric Description Granularity
Processor name The name of the zCPC processor in the form processor-type + processor ID. 15 seconds
Processor type The type of zCPC processor. 15 seconds
Processor usage The processor percent usage. 15 seconds
SMT usage The percentage of time the processor is running in simultaneous multithreading (SMT) mode. 15 seconds
Thread 0 usage The percent usage of thread 0 when the processor is running in simultaneous multithreading (SMT) mode. 15 seconds
Thread 1 usage The percent usage of thread 1 when the processor is running in simultaneous multithreading (SMT) mode. 15 seconds

Cryptos (C)

This metric group reports the adapter usage for each crypto on the system. An instance of this metric group is created for each crypto adapter. This metric group is not used for a DPM system. For DPM, crypto adapters are reported in the Adapters metric group.

The following metrics are provided in each entry of this metric group:

Metric Description Granularity
Channel id The physical channel identifier of the crypto 15 seconds
Crypto id The crypto identifier of the crypto, decimal 0-15 15 seconds
Adapter usage The adapter percent usage (0-100%) 15 seconds

Adapters (D)

This metric group reports the adapter usage for each adapter on a DPM-enabled system. An instance of this metric group is created for each adapter.

The following metrics are provided in each entry of this metric group:

Metric Description Granularity
Adapter usage The adapter percent usage (0-100%) 15 seconds

Flash memory adapters (C)

This metric group reports the adapter usage for each Flash memory (Flash Express) adapter on the system. An instance of this metric group is created for each Flash memory adapter of the CPC. If a CPC has no flash memory adapters, then no data will appear in this metric group for that CPC.

The following metrics are provided in each entry of this metric group:

Metric Description Granularity
Channel id The physical channel identifier of the Flash memory adapter 15 seconds
Adapter usage The adapter percent usage (0-100%) 15 seconds

Network adapter port metric group (D)

OSA and RoCE network adapters have up to two physical ports that connect to the network. On a DPM-enabled system, metrics are collected from these ports. Each entry of this metric group contains the metrics for one physical port.

The following metrics are provided in each entry of this metric group:

Metric Description Granularity
network-port-id Numerical value corresponding to the network adapter's physical port. 15 seconds
bytes-sent Number of bytes this physical port sent out to the attached network. 15 seconds
bytes-received Number of bytes this physical port received from the attached network. 15 seconds
packets-sent Number of unicast packets this physical port sent out to the attached network. 15 seconds
packets-received Number of unicast packets this physical port received from the attached network. 15 seconds
packets-sent-dropped Number of packets that were dropped when this physical port was sending them out to the attached network. 15 seconds
packets-received-dropped Number of packets that were dropped when this physical port was receiving them from the attached network. 15 seconds
packets-sent-discarded Number of packets that were discarded when this physical port was sending them out to the attached network. 15 seconds
packets-received-discarded Number of packets that were discarded when this physical port was receiving them from the attached network. 15 seconds
multicast-packets-sent Number of multicast packets this physical port sent out to the attached network. 15 seconds
multicast-packets-received Number of multicast packets this physical port received from the attached network. 15 seconds
broadcast-packets-sent Number of broadcast packets this physical port sent out to the attached network. 15 seconds
broadcast-packets-received Number of broadcast packets this physical port received from the attached network. 15 seconds
interval-bytes-sent Number of bytes sent by this physical port over the collection interval. 15 seconds
interval-bytes-received Number of bytes received by this physical port over the collection interval. 15 seconds
bytes-per-second-sent Number of bytes sent per second by this physical port over the collection interval. 15 seconds
bytes-per-second-received Number of bytes received per second by this physical port over the collection interval. 15 seconds
utilization Link utilization expressed as usage percentage of overall link bandwidth. 15 seconds
mac-address The MAC address of this uplink, if known. 15 seconds
flags Flags indicating the types of metrics that are supported by this interface. 15 seconds