ESMI PMDA for Performance Co-Pilot
This PMDA (Performance Metrics Domain Agent) exports AMD EPYC System Management Interface (ESMI) metrics to Performance Co-Pilot (PCP).
By integrating ESMI as a PMDA, the metrics are exposed to the entire standard PCP stack, enabling monitoring for ESMI/HSMP metrics via CLI, stored archives, the PCP GUI, REST API, and Grafana.
Most metrics also work on Threadripper Pro 7000 and 9000, although AMD does not officially support these SKUs. The ESMI PMDA is developed and tested on a Threadripper Pro 7965WX.
Example Grafana
Table of Contents
- Hardware Layer
- Kernel Interfaces
- CPU Frequency and Power Management
- AMD Metric Concepts
- E-SMI Library
- Performance Co-Pilot (PCP)
- Visualization Stack
- Data Flow
- Prerequisites
- Installation
- Grafana Configuration
- Metrics Reference
- Usage Examples
- pmlogger Integration
- Troubleshooting
- Dependencies
- Security
- Development
- References
Features
- 52 metrics covering power, energy, frequency, thermal, and C-state data
- Per-socket and per-core instance domains for detailed monitoring
- amd-pstate driver integration for CPU frequency scaling metrics
- cpuidle C-state residency per core
- Automatic pmlogger integration with optimized logging intervals
Architecture
This section describes how AMD EPYC system management metrics flow from processor hardware to Grafana dashboards. The stack consists of:
- Hardware interfaces (HSMP mailbox, MSR registers)
- Kernel drivers (amd_hsmp, msr-safe)
- Userspace library (E-SMI)
- Metrics framework (PCP with ESMI PMDA)
- Time series storage (Redis)
- Visualization (Grafana with grafana-pcp)
The ESMI PMDA translates E-SMI library calls into PCP metrics. PCP archives metrics via pmlogger, pmproxy pushes them to Redis, and Grafana queries Redis for display.
Hardware Layer
AMD EPYC Processor Topology
AMD EPYC processors use a chiplet architecture. The hierarchy from largest to smallest:
| Level | Description |
|---|---|
| Socket | Physical processor package |
| CCD | Core Complex Die, contains 1-2 CCXs |
| CCX | Core Complex, contains 4-8 cores |
| Core | Individual CPU core |
| Thread | Hardware thread (2 per core with SMT) |
A dual-socket EPYC Genoa system may contain 2 sockets × 12 CCDs × 8 cores = 192 cores (384 threads).
Metrics are reported at different levels. Socket-level metrics include total power draw and memory bandwidth. Core-level metrics include frequency, boost limits, and energy consumption. The ESMI PMDA attaches socket_id and die_id labels to core metrics to indicate topology.
Host System Management Port (HSMP)
HSMP is a mailbox interface for communication with the System Management Unit (SMU) firmware. The SMU manages power distribution, thermal limits, and frequency scaling across the processor.
HSMP provides read access to:
- Current socket power consumption
- DDR memory bandwidth utilization
- Boost frequency limits
- PROCHOT (thermal throttling) status
- Fabric clock frequencies
HSMP also supports write operations (power limits, frequency caps) but the ESMI PMDA only reads metrics.
The interface uses memory-mapped I/O and PCI configuration space. The kernel driver abstracts these details.
Model-Specific Registers (MSRs)
MSRs provide low-level access to processor features via RDMSR/WRMSR instructions. Relevant MSRs for EPYC include:
| MSR | Purpose |
|---|---|
| 0xC001_0299 | Energy unit scaling factor |
| 0xC001_029A | Core energy accumulator |
| 0xC001_029B | Package energy accumulator |
Energy MSRs are 64-bit registers that accumulate energy consumption in units defined by the unit MSR. The counter wraps at 2^32 × unit value. For a typical unit of 15.3 µJ that is about 65.7 kJ: roughly 18 hours at an average draw of 1 W, but under five minutes for a package drawing 240 W.
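The wraparound interval follows from the counter width and the energy unit. The sketch below assumes a 32-bit accumulator and a unit of 2^-ESU joules (ESU = 16 gives the 15.3 µJ unit); wrap_seconds is an illustrative helper, not part of the PMDA or E-SMI, and the wattages are example values rather than measurements.

```shell
# wrap_seconds ESU WATTS
# Seconds until a 32-bit energy counter wraps at a constant power draw.
# Assumes one counter tick equals 2^-ESU joules (ESU=16 -> about 15.3 uJ).
wrap_seconds() {
    awk -v esu="$1" -v watts="$2" \
        'BEGIN { printf "%.1f\n", (2 ^ (32 - esu)) / watts }'
}

# Example calls:
#   wrap_seconds 16 1     # a core averaging 1 W
#   wrap_seconds 16 240   # a package drawing 240 W
```

Whatever reads the raw counter has to sample more often than the shortest interval this returns for the expected load, otherwise a wrap can go unnoticed.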
Raw MSR access via /dev/cpu/*/msr requires root and provides no safety checking. This PMDA requires msr-safe for controlled access.
Kernel Interfaces
HSMP Driver
The amd_hsmp kernel module provides userspace access to the HSMP mailbox via /dev/hsmp. The E-SMI library uses this interface via ioctl(2).
On kernel 5.18+, the driver is mainline. The upstream amd_hsmp repository may include newer HSMP protocol versions with additional features not yet merged into the kernel.
Check the current HSMP protocol version via ESMI:
$ pminfo -f esmi.hsmp.proto_version
See Prerequisites for BIOS configuration and installation.
msr-safe
msr-safe is a kernel module that provides allowlist-controlled MSR access. It is a required dependency for this PMDA.
Unlike the standard msr driver, msr-safe:
- Restricts access to explicitly allowed MSR addresses
- Supports bit-level masking (read/write permissions per bit)
- Enables batch reads across multiple CPUs
- Allows non-root access when configured
The module exposes:
| Device | Purpose |
|---|---|
| /dev/cpu/*/msr_safe | Per-CPU safe MSR access |
| /dev/cpu/msr_batch | Batch operations across CPUs |
| /dev/cpu/msr_allowlist | Allowlist configuration |
The ESMI PMDA configures the allowlist for AMD energy counter MSRs (read-only access to 0xC0010299, 0xC001029A, 0xC001029B).
See Prerequisites for installation.
CPU Frequency and Power Management
Frequency Domains: FCLK, MCLK, CCLK
AMD EPYC processors have multiple independent clock domains:
| Clock | Domain | Description |
|---|---|---|
| CCLK | Core | Per-core frequency, varies with load and thermal conditions |
| FCLK | Fabric | Infinity Fabric frequency, affects inter-die and inter-socket latency |
| MCLK | Memory | DDR memory controller frequency |
| UCLK | Unified Memory | Memory controller frequency (often locked to MCLK) |
CCLK varies per-core based on workload, power budget, and thermal headroom. The other clocks are typically fixed at boot based on BIOS settings.
The ESMI PMDA reports:
- esmi.clock.fclk - Fabric clock (per socket)
- esmi.clock.mclk - Memory clock (per socket)
- esmi.cclk_limit.core - CCLK frequency limit (per core)
- esmi.boostlimit.core - Maximum boost frequency (per core)
P-States and Precision Boost
Traditional P-States define discrete voltage/frequency operating points. AMD EPYC with Precision Boost 2 extends this model:
- Hardware continuously adjusts frequency within a range
- Per-core frequency based on individual thermal/power headroom
- Frequency can exceed base clock when conditions permit
Key frequency values:
- Base frequency: Guaranteed sustained all-core frequency (TDP-limited)
- Boost frequency: Maximum single-core frequency under optimal conditions
- Current frequency: Actual operating frequency (varies continuously)
The esmi.boostlimit.core metric reports the maximum frequency the SMU will allow per core. This may be lower than the rated boost frequency due to power or thermal limits.
AMD P-State Driver
The amd-pstate cpufreq driver interfaces with AMD’s Collaborative Processor Performance Control (CPPC).
Two modes:
- Passive (amd_pstate): Kernel selects frequency, hardware provides hints
- Active (amd_pstate_epp): Hardware autonomously controls frequency
Sysfs interface per CPU (/sys/devices/system/cpu/cpu0/cpufreq/):
| File | Description |
|---|---|
| scaling_driver | amd-pstate or amd-pstate-epp |
| scaling_governor | Current governor policy |
| scaling_cur_freq | Current frequency (kHz) |
| scaling_min_freq | Configured minimum |
| scaling_max_freq | Configured maximum |
| amd_pstate_max_freq | Hardware maximum |
| amd_pstate_lowest_nonlinear_freq | Efficiency threshold |
| energy_performance_preference | EPP hint (active mode only) |
The ESMI PMDA reads these sysfs files for the esmi.amd_pstate.* metrics.
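To see what such a read involves, here is a minimal shell sketch that converts scaling_cur_freq from kHz to MHz. The helper name cur_freq_mhz is hypothetical, and the PMDA itself is written in C, so its own implementation may differ.

```shell
# cur_freq_mhz CPUFREQ_DIR
# Print the current frequency in MHz from a cpufreq policy directory.
# scaling_cur_freq is reported in kHz, so divide by 1000 (integer division).
cur_freq_mhz() {
    khz=$(cat "$1/scaling_cur_freq") || return 1
    echo $(( khz / 1000 ))
}

# Example call on a live system:
#   cur_freq_mhz /sys/devices/system/cpu/cpu0/cpufreq
```

Passing the directory as an argument keeps the helper usable for any CPU and makes it easy to try against a copy of the sysfs files.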
CPU Frequency Governors
Governors determine frequency scaling policy:
| Governor | Behavior |
|---|---|
| performance | Maximum frequency |
| powersave | Minimum frequency |
| schedutil | Scheduler-driven, responds to utilization |
| ondemand | Ramp up quickly, ramp down slowly |
| conservative | Ramp up and down slowly |
With amd-pstate-epp, the governor is secondary. The energy_performance_preference setting (performance, balance_performance, balance_power, power) controls hardware behavior.
The PMDA reports the active governor via esmi.amd_pstate.scaling_governor.
References:
- Arch Wiki: CPU frequency scaling - Practical configuration guide
C-States (Idle States)
C-States are progressively deeper idle states:
| State | Description | Typical Exit Latency |
|---|---|---|
| C0 | Active (executing) | 0 |
| C1 | Halt (clock stopped) | ~1 µs |
| C1E | Enhanced halt (voltage reduced) | ~10 µs |
| C6 | Deep sleep (context saved to SRAM) | ~100+ µs |
Deeper states save more power but have higher wakeup latency. Latency-sensitive workloads may disable deep C-states.
The PMDA reports:
- esmi.c0_residency.socket - Percentage of time in C0 (active) per socket
- esmi.cpuidle.* - Time in each idle state per core
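Idle-state residency can be derived from the kernel's cumulative per-state time counters (cpuidle/stateN/time, in microseconds). The following sketch shows the arithmetic only; residency_pct is an illustrative helper, and whether the PMDA computes its percentages exactly this way is an assumption.

```shell
# residency_pct PREV_US CUR_US INTERVAL_S
# Percentage of a sampling interval spent in one idle state, given two
# readings of its cumulative time counter in microseconds.
residency_pct() {
    awk -v prev="$1" -v cur="$2" -v secs="$3" \
        'BEGIN { printf "%.1f\n", (cur - prev) / (secs * 1000000) * 100 }'
}

# Example on a live system (state index 2 is an assumption; check stateN/name):
#   f=/sys/devices/system/cpu/cpu0/cpuidle/state2/time
#   a=$(cat "$f"); sleep 1; b=$(cat "$f"); residency_pct "$a" "$b" 1
```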
References:
- CPU Idle Documentation
- turbostat(8) - Report processor frequency and idle statistics
AMD Metric Concepts
Power Measurement: Socket vs TDP
The PMDA provides two power metrics per socket:
| Metric | Source | Description |
|---|---|---|
| esmi.power.socket | HSMP | Actual measured power draw (milliwatts) |
| esmi.power.socket_limit | HSMP | Configured TDP limit (milliwatts) |
esmi.power.socket is a real-time measurement from the SMU. It fluctuates with workload and may briefly exceed the limit during transients.
esmi.power.socket_limit is the configured power cap. The SMU throttles frequency to stay within this budget over time.
Comparing these values shows power headroom. If socket power consistently approaches the limit, the workload is power-constrained and cannot boost to higher frequencies.
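As a small illustration of that comparison, the helper below expresses headroom as a percentage of the limit. The name power_headroom_pct is hypothetical; the only requirement is that both arguments use the same unit.

```shell
# power_headroom_pct POWER LIMIT
# Remaining power budget as a percentage of the limit.
# Both arguments must use the same unit (for example milliwatts).
power_headroom_pct() {
    awk -v p="$1" -v lim="$2" \
        'BEGIN { printf "%.1f\n", (lim - p) / lim * 100 }'
}

# Example call with illustrative values:
#   power_headroom_pct 150000 200000
```

A result near zero over a sustained period points to a power-constrained workload; a briefly negative result corresponds to the transients described above.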
Energy Counters
Energy MSRs (similar to Intel’s RAPL) provide cumulative energy consumption in microjoules. Unlike instantaneous power, energy counters enable accurate power calculation over arbitrary intervals:
Power (W) = ΔEnergy (J) / Δtime (s)
This approach handles variable sampling intervals correctly, unlike averaging instantaneous power readings.
The PMDA reports:
- esmi.energy.core - Per-core accumulated energy (counter semantic)
- esmi.energy.socket - Per-package accumulated energy (counter semantic)
PCP automatically computes rate-of-change for counter metrics. In Grafana, use rate() to convert to watts.
Energy counters wrap at 2^32 × energy_unit. The PMDA tracks wraparound internally.
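The delta-based power formula and the wraparound handling can be sketched in a few lines. This is an illustration that assumes raw 32-bit samples, at most one wrap between samples, and an energy unit of 2^-ESU joules; energy_delta and average_watts are hypothetical helpers, not the PMDA's internal code.

```shell
# energy_delta PREV CUR
# Ticks elapsed between two raw 32-bit counter samples. Adding 2^32 before
# taking the remainder handles a single wrap between the samples.
energy_delta() {
    echo $(( ($2 - $1 + 4294967296) % 4294967296 ))
}

# average_watts TICKS ESU SECONDS
# Convert ticks to joules (one tick = 2^-ESU J) and divide by the interval.
average_watts() {
    awk -v t="$1" -v esu="$2" -v s="$3" \
        'BEGIN { printf "%.3f\n", t / (2 ^ esu) / s }'
}

# Example: two samples taken 2 seconds apart with ESU=16
#   average_watts "$(energy_delta 100 6553700)" 16 2
```

Because the result depends only on the two samples and the elapsed time, the sampling interval does not need to be regular.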
DDR Bandwidth
Memory bandwidth metrics come from HSMP:
| Metric | Description |
|---|---|
| esmi.ddr.max_bandwidth | Theoretical maximum bandwidth (GB/s) |
| esmi.ddr.utilized_bandwidth | Current utilized bandwidth (GB/s) |
| esmi.ddr.utilized_percent | Utilization as percentage |
Maximum bandwidth depends on:
- Number of populated memory channels
- Memory speed (DDR4-3200, DDR5-4800, etc.)
- Memory rank configuration
Utilized bandwidth reflects actual memory traffic. High utilization (>70%) may indicate memory-bound workloads.
Note: HSMP bandwidth values are estimates from performance counters, not direct measurements. Accuracy is typically within 10%.
Infinity Fabric
The Infinity Fabric (IF) interconnect links CCDs within a socket and sockets within a system. Fabric frequency (FCLK) affects:
- Inter-CCD latency (cores on different CCDs)
- Inter-socket latency (NUMA nodes)
- Memory access latency when FCLK ≠ MCLK
For optimal latency, FCLK should match MCLK (1:1 ratio). Some configurations run FCLK:MCLK at 1:2 to support higher memory speeds.
The PMDA reports esmi.clock.fclk per socket. Compare with esmi.clock.mclk to verify the ratio.
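A quick way to check the ratio from two clock readings is sketched below. The helper name fclk_mclk_ratio is hypothetical and the frequencies in the example calls are illustrative, not values read from hardware.

```shell
# fclk_mclk_ratio FCLK_MHZ MCLK_MHZ
# Print the FCLK:MCLK ratio normalised to FCLK = 1.
fclk_mclk_ratio() {
    awk -v f="$1" -v m="$2" 'BEGIN { printf "1:%.2f\n", m / f }'
}

# Example calls:
#   fclk_mclk_ratio 1600 1600   # matched clocks
#   fclk_mclk_ratio 1200 2400   # memory clock at twice the fabric clock
```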
Thermal Throttling (PROCHOT)
PROCHOT (Processor Hot) indicates thermal throttling is active. When asserted:
- Core frequencies are reduced to limit heat
- Performance degrades proportionally
- Condition persists until temperature drops
The PMDA reports esmi.prochot.status (boolean per socket). A value of 1 indicates active thermal throttling. Persistent throttling suggests inadequate cooling.
E-SMI Library
The E-SMI (EPYC System Management Interface) library provides a C API for HSMP and sysfs access.
Functions used by the PMDA:
esmi_init() // Initialize library
esmi_socket_power_get() // Current power (mW)
esmi_socket_power_cap_get() // Power limit (mW)
esmi_ddr_bw_get() // DDR bandwidth
esmi_core_boostlimit_get() // Per-core boost limit
esmi_prochot_status_get() // Thermal throttling
esmi_fclk_mclk_get() // Fabric/memory clocks
The library requires HSMP driver access. It is built and statically linked into the PMDA.
Performance Co-Pilot (PCP)
Architecture Overview
PCP is a framework for collecting, archiving, and analyzing system metrics.
graph TB
subgraph Hardware
CPU[AMD EPYC CPU]
HSMP[HSMP Mailbox]
MSR[MSR Registers]
end
subgraph Kernel
HSMP_DRV[amd_hsmp driver]
MSR_SAFE[msr-safe driver]
SYSFS[cpufreq sysfs]
end
subgraph Userspace
ESMI_LIB[E-SMI Library]
ESMI_PMDA[ESMI PMDA]
PMCD[PMCD]
OTHER_PMDA[Other PMDAs]
PMLOGGER[pmlogger]
PMPROXY[pmproxy]
end
subgraph Storage
ARCHIVES[PCP Archives]
REDIS[(Redis)]
end
subgraph Visualization
GRAFANA[Grafana]
end
CPU --> HSMP
CPU --> MSR
HSMP --> HSMP_DRV
MSR --> MSR_SAFE
CPU --> SYSFS
HSMP_DRV --> ESMI_LIB
MSR_SAFE --> ESMI_LIB
SYSFS --> ESMI_LIB
ESMI_LIB --> ESMI_PMDA
ESMI_PMDA <--> PMCD
OTHER_PMDA <--> PMCD
PMCD --> PMLOGGER
PMLOGGER --> ARCHIVES
ARCHIVES --> PMPROXY
PMPROXY --> REDIS
REDIS --> GRAFANA
Core components:
- PMCD: Daemon that coordinates metric collection
- PMDA: Plugin that collects metrics from a specific source
- pmlogger: Archives metrics to disk
- pmproxy: REST API and Redis gateway
PMDA Interface
A PMDA defines metrics with:
- Name: Hierarchical path (e.g., esmi.power.socket)
- Type: Data type (32/64-bit, signed/unsigned, float/double)
- Semantics: Counter (monotonic), instant (point-in-time), discrete (categorical)
- Units: Dimensional units for automatic conversion
- Instance domain: For multi-valued metrics (per-socket, per-core)
- Labels: Key-value metadata
PMDAs run as:
- DSO: Loaded into PMCD (lower overhead)
- Daemon: Separate process (isolated, restartable)
The ESMI PMDA runs as a daemon for isolation from PMCD.
ESMI PMDA Instance Domains and Labels
This PMDA exports AMD EPYC metrics via E-SMI.
Instance domains:
- SOCKET_INDOM: socket0, socket1, ...
- CORE_INDOM: cpu0, cpu1, ...
Labels:
- socket_id: Socket number (0, 1, ...)
- die_id: CCD number within socket
- disp_instance: Display name for Grafana
Metric groups:
| Prefix | Scope | Description |
|---|---|---|
| esmi.power.* | socket | Power draw and limits |
| esmi.energy.* | socket, core | Cumulative energy counters |
| esmi.ddr.* | socket | Memory bandwidth |
| esmi.clock.* | socket | FCLK, MCLK frequencies |
| esmi.boostlimit.core | core | Per-core boost limit |
| esmi.cclk_limit.core | core | Per-core CCLK limit |
| esmi.c0_residency.socket | socket | Active time percentage |
| esmi.prochot.status | socket | Thermal throttling status |
| esmi.amd_pstate.* | core | amd-pstate driver metrics |
| esmi.cpuidle.* | core | Idle state percentages |
pmlogger
pmlogger archives metrics at configurable intervals. The ESMI PMDA installs pmlogconf groups:
| Group | Interval | Metrics |
|---|---|---|
| realtime | 1 second | power, bandwidth, thermal |
| limits | 5 seconds | power caps, frequency limits |
| config | 5 minutes | topology, static configuration |
Archives are stored in /var/log/pcp/pmlogger/.
pmproxy
pmproxy provides:
- REST API for metric queries
- Redis gateway for time series storage
When PCP_REDIS_SERVERS is configured, pmproxy watches archives and pushes data to Redis.
Default port: 44322
Visualization Stack
Redis
Redis stores PCP time series with the RedisTimeSeries module.
Data organization:
pcp:series:<SHA1> # Time series data points
pcp:map:series.name # Metric name to series ID
pcp:map:label.name # Label indexes
When metrics change (renamed, added, removed), stale entries may remain. Use purge_esmi.sh to clean up.
Grafana
The grafana-pcp plugin provides:
- PCP Redis: Query historical data from Redis
- PCP Vector: Live streaming from pmproxy
- PCP bpftrace: Dynamic eBPF instrumentation
Configuration:
- Install grafana-pcp plugin
- Add PCP Redis datasource pointing to pmproxy (http://localhost:44322)
- Query metrics by name: esmi.power.socket
- Filter by label: {socket_id="0"}
The disp_instance label supports Grafana’s "Labels to fields" transformation for friendly column names.
Data Flow
sequenceDiagram
participant HW as Hardware
participant Kernel as Kernel Drivers
participant ESMI as E-SMI Library
participant PMDA as ESMI PMDA
participant PMCD as PMCD
participant Logger as pmlogger
participant Proxy as pmproxy
participant Redis as Redis
participant Grafana as Grafana
HW->>Kernel: HSMP response / MSR read
Kernel->>ESMI: ioctl / read
ESMI->>PMDA: API call
PMDA->>PMCD: metric values
PMCD->>Logger: archive data
Logger->>Proxy: new archive data
Proxy->>Redis: TS.ADD
Grafana->>Proxy: REST query
Proxy->>Redis: TS.RANGE
Redis->>Proxy: data points
Proxy->>Grafana: JSON response
Latency from hardware to Grafana is typically:
- Hardware to PMDA: <10ms (HSMP mailbox latency)
- PMDA to archive: depends on pmlogger interval
- Archive to Redis: <1 second (pmproxy poll)
- Redis to Grafana: <10ms (query latency)
Prerequisites
PCP (Performance Co-Pilot) >= 7.0.3
Distribution packages are often outdated. Install from the official PCP repository:
# Download the repository setup script
wget -O pcp-repo.sh https://packagecloud.io/install/repositories/performancecopilot/pcp/script.deb.sh
# Run with your OS and distribution codename (e.g. noble for Ubuntu 24.04)
sudo os=ubuntu dist=noble bash pcp-repo.sh
On Ubuntu derivatives (e.g., Linux Mint), determine the underlying Ubuntu codename:
grep UBUNTU_CODENAME /etc/os-release
Install PCP and development packages (minimum version 7.0.3):
sudo apt-get update
sudo apt-get install pcp libpcp-pmda3-dev libpcp3-dev
See packagecloud.io/performancecopilot/pcp for available versions.
If this is your first time using PCP, you should set up the system services and configure other metrics sources you’re interested in:
sudo systemctl enable --now pmcd pmlogger pmie
# install services required by other PMDAs, for example:
sudo apt install smartmontools nvme-cli
Add any PMDAs (metrics) you want from ls /var/lib/pcp/pmdas:
cd /var/lib/pcp/pmdas/bash
sudo ./Install
# repeat for any others, e.g.: linux, proc, disk, network, hwmon, bpftrace, docker, kvm, libvirt, lmsensors, mounts, nvidia, redis, samba, smart, systemd
Now install Redis to store the metrics:
sudo apt install redis-server
sudo systemctl enable --now redis-server
sudo systemctl enable --now pmproxy
E-SMI Library >= 5.0.1
Building from source is recommended to get the latest fixes.
# Install build dependencies
sudo apt-get install cmake doxygen
# Clone and build
git clone https://github.com/amd/esmi_ib_library
cd esmi_ib_library
mkdir build && cd build
cmake -DENABLE_STATIC_LIB=1 -DCMAKE_INSTALL_PREFIX=/usr/local ..
make
sudo make install
Alternatively, download pre-built packages from AMD E-SMI Downloads.
Verify installation (minimum version 5.0.1):
e_smi_tool -V
AMD HSMP Module (Required)
BIOS Configuration:
HSMP must be enabled in the BIOS; it is disabled by default. Navigate to:
Advanced > AMD CBS > NBIO Common Options > SMU Common Options > HSMP Support
Set the value to Enabled. If HSMP is left disabled (the default "Auto" setting), all HSMP calls will time out.
Supported Processors:
- Family 19h Model 00-1Fh, 30-3Fh, 90-9Fh, A0-AFh (Milan, Genoa)
- Family 1Ah Model 00-1Fh, 50-5Fh (Turin)
- Threadripper Pro 7000/9000 (unofficial; only some models work, and they may work without the BIOS setting if the firmware does not offer it)
Load the driver:
sudo modprobe amd_hsmp
For newer HSMP protocol features, build from the upstream amd_hsmp repository instead of using the kernel-included driver.
Verify:
ls -l /dev/hsmp
dmesg | grep -i hsmp
msr-safe (Required)
msr-safe provides allowlist-controlled MSR access. Installation will fail if msr-safe is not present.
# Build and install via DKMS (https://github.com/dell/dkms)
sudo make msr-safe
# Configure AMD energy MSR allowlist
sudo make msr-safe-allow
Verify:
ls /dev/cpu/msr_allowlist
ls /dev/cpu/0/msr_safe
See msr-safe on GitHub for manual installation.
Installation
⚠️ You must complete and verify all steps in Prerequisites first.
# Install msr-safe first
sudo make msr-safe
# Install PMDA
sudo make install
This will:
- Build and install the PMDA binary
- Configure MSR allowlist for energy metrics
- Set up udev rules for persistent permissions
- Install pmlogconf configuration for automatic logging
- Restart pmlogger to pick up new metrics
Removal
sudo make uninstall
Exporting metrics to Grafana
If you do not have Grafana running, install it now, and add PCP as a data source:
wget -q -O - https://apt.grafana.com/gpg.key | sudo gpg --dearmor -o /usr/share/keyrings/grafana.gpg
echo "deb [signed-by=/usr/share/keyrings/grafana.gpg] https://apt.grafana.com stable main" \
| sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt update
sudo apt install grafana
sudo systemctl enable --now grafana-server
sudo grafana-cli plugins install performancecopilot-pcp-app
sudo systemctl restart grafana-server
Grafana now runs at http://localhost:3000
In Grafana, add or confirm the PCP plugin is installed:
Then add a PCP Redis (Valkey) data source and use http://localhost:44322 as the URL, leave the rest blank:
Then use it in queries, for example:
Note the variables in Legend, e.g. $instance to format the legend string. See grafana-pcp-docs-language for other options.
The query language itself is pmseries, read the docs for more examples.
Use Transformations for complex visualizations.
Use rate conversions to natively transform monotonic counters into time-based units, e.g. per-core joules to watts.
Metrics Reference
Power (per socket)
| Metric | Description | Units |
|---|---|---|
| esmi.power.socket | Instantaneous power consumption | Watts |
| esmi.power.socket_limit | Current power limit | Watts |
| esmi.power.socket_limit_max | Maximum power limit | Watts |
| esmi.power.svi_telemetry | SVI-based power telemetry | Watts |
| esmi.power.efficiency_mode | Power efficiency mode | 0-5 |
Energy
| Metric | Scope | Description | Units |
|---|---|---|---|
| esmi.energy.socket | socket | Cumulative energy consumption | Joules |
| esmi.energy.core | core | Cumulative energy consumption | Joules |
CPU Activity
| Metric | Scope | Description | Units |
|---|---|---|---|
| esmi.c0_residency.socket | socket | C0 (active) residency | % |
| esmi.cpuidle.poll_residency | core | POLL idle state residency | % |
| esmi.cpuidle.c1_residency | core | C1 idle state residency | % |
| esmi.cpuidle.c2_residency | core | C2 idle state residency | % |
Memory Bandwidth (per socket)
| Metric | Description | Units |
|---|---|---|
| esmi.ddr.max_bandwidth | DDR maximum theoretical bandwidth | GB/s |
| esmi.ddr.utilized_bandwidth | DDR utilized bandwidth | GB/s |
| esmi.ddr.utilized_percent | DDR bandwidth utilization | % |
Frequency
| Metric | Scope | Description | Units |
|---|---|---|---|
| esmi.freq.active_limit | socket | Active frequency limit | MHz |
| esmi.freq.limit_source | socket | Source of frequency limit | string |
| esmi.freq.max | socket | Maximum frequency | MHz |
| esmi.freq.min | socket | Minimum frequency | MHz |
| esmi.boostlimit.core | core | Boost limit | MHz |
| esmi.cclk_limit.core | core | CCLK frequency limit | MHz |
Clocks (per socket)
| Metric | Description | Units |
|---|---|---|
| esmi.clock.fclk | Data Fabric clock (FCLK) | MHz |
| esmi.clock.mclk | Memory clock (MCLK) | MHz |
| esmi.clock.cclk_limit | Socket-level CCLK limit | MHz |
Thermal (per socket)
| Metric | Description | Units |
|---|---|---|
| esmi.temperature.socket | CPU temperature | °C |
| esmi.prochot.status | PROCHOT assertion status | 0/1 |
P-State (per socket)
| Metric | Description |
|---|---|
| esmi.df_pstate.max | Data Fabric P-state max |
| esmi.df_pstate.min | Data Fabric P-state min |
| esmi.xgmi_pstate.max | xGMI P-state max (multi-socket) |
| esmi.xgmi_pstate.min | xGMI P-state min (multi-socket) |
| esmi.xgmi_width.min | xGMI link width min |
| esmi.xgmi_width.max | xGMI link width max |
amd-pstate Driver (per core unless noted)
| Metric | Description | Units |
|---|---|---|
| esmi.amd_pstate.scaling_cur_freq | Current CPU frequency | MHz |
| esmi.amd_pstate.energy_perf_pref | Energy performance preference | string |
| esmi.amd_pstate.scaling_governor | CPU frequency governor | string |
| esmi.amd_pstate.boost | Boost enabled | 0/1 |
| esmi.amd_pstate.prefcore_ranking | Preferred core ranking | 0-255 |
| esmi.amd_pstate.highest_perf | Highest performance level | 0-255 |
| esmi.amd_pstate.lowest_nonlinear_freq | Lowest non-linear frequency | MHz |
| esmi.amd_pstate.max_freq | Maximum frequency | MHz |
| esmi.amd_pstate.driver | Active cpufreq driver (system) | string |
| esmi.amd_pstate.hw_prefcore | Hardware prefcore support (system) | string |
System Information
| Metric | Description |
|---|---|
| esmi.cpu.family | CPU family number |
| esmi.cpu.model | CPU model number |
| esmi.cpu.sockets | Number of CPU sockets |
| esmi.cpu.cores | Number of physical cores |
| esmi.cpu.threads_per_core | Threads per core (SMT) |
| esmi.hsmp.proto_version | HSMP protocol version |
| esmi.hsmp.smu_fw_version | SMU firmware version |
| esmi.hsmp.driver_version | HSMP driver version |
C-State/Power Control
| Metric | Description |
|---|---|
| esmi.cstate.pc6_enabled | PC6 package C-state enabled |
| esmi.cstate.cc6_enabled | CC6 core C-state enabled |
| esmi.dfc.enabled | Data Fabric C-state enabled |
Usage Examples
PCP provides command-line tools for querying and monitoring metrics. See pminfo(1), pmval(1), and pmrep(1) for details.
# View all metrics
pminfo -f esmi
# Monitor socket power (1 second interval)
pmval -t 1sec esmi.power.socket
# Monitor temperature and power together
pmrep esmi.power.socket esmi.temperature.socket -t 1sec
# View per-core frequencies
pminfo -f esmi.amd_pstate.scaling_cur_freq
# Check C-state residency per core
pminfo -f esmi.cpuidle.c2_residency
# Export to CSV
pmrep -o csv -t 1sec esmi.power.socket esmi.temperature.socket > power.csv
# View historical data from archive
pmval -a /var/log/pcp/pmlogger/$(hostname)/$(date +%Y%m%d) esmi.power.socket
pmlogger Integration
The PMDA installs pmlogconf configuration for automatic metric logging:
| Group | Interval | Metrics |
|---|---|---|
| realtime | 1 second | Power, energy, temperature, frequencies, C-state residency |
| limits | 5 seconds | Power limits, P-state config, amd_pstate settings |
| config | 5 minutes | CPU info, HSMP versions, driver info (static) |
Verify logging
# Check pmlogger status
systemctl status pmlogger
# View logged metrics
pminfo -a /var/log/pcp/pmlogger/$(hostname)/$(date +%Y%m%d) esmi
# Dump recent values
pmval -a /var/log/pcp/pmlogger/$(hostname)/$(date +%Y%m%d) -S -5min esmi.power.socket
Troubleshooting
No metrics available
# Check PMDA is installed
pminfo esmi 2>&1 | head -5
# Check PMCD status
systemctl status pmcd
# View PMDA logs
sudo cat /var/log/pcp/pmcd/esmi.log
Energy metrics return errors
# Check msr-safe is loaded (see lsmod(8))
lsmod | grep msr_safe
# Check allowlist
cat /dev/cpu/msr_allowlist
# Reconfigure
sudo e_smi_tool --writemsrallowlist
sudo make install
Permission denied errors
# Check device permissions
ls -la /dev/cpu/0/msr_safe
ls -la /dev/cpu/msr_allowlist
# PMDA runs as 'pcp' user - verify group membership
groups pcp
Dependencies
Required:
| Component | Purpose |
|---|---|
| amd_hsmp driver | HSMP mailbox access |
| msr-safe | Safe MSR access (energy counters) |
| E-SMI library | AMD API abstraction |
| PCP (pcp, pcp-devel) | Metrics framework |
Optional:
| Component | Purpose |
|---|---|
| Redis + RedisTimeSeries | Time series storage |
| pmproxy | Redis gateway |
| Grafana + grafana-pcp | Visualization |
The PMDA installation checks for msr-safe and fails if not present.
Security
Privilege Requirements
| Component | Privilege | Notes |
|---|---|---|
| HSMP driver | root | Kernel module |
| msr-safe | root or allowlist | Configured per-MSR |
| PMCD | root (init), pcp (runtime) | Drops privileges |
| pmlogger | pcp | Writes to /var/log/pcp |
| pmproxy | pcp | Binds to 44322 |
| Redis | redis | Binds to 6379 |
| Grafana | grafana | Binds to 3000 |
Network Ports
| Service | Default | Binding |
|---|---|---|
| pmproxy | 44322 | localhost |
| Redis | 6379 | localhost |
| Grafana | 3000 | all interfaces |
For remote access, configure pmproxy via /etc/pcp/pmproxy/pmproxy.options. Consider TLS termination for production.
License
LGPL-2.1-or-later
References
AMD
- AMD EPYC Specifications
- AMD Technical Documentation Hub - PPRs, tuning guides, datasheets
- AMD HSMP Driver
- E-SMI Library
- E-SMI API Reference
Development
Building
cd esmi
make
If ESMI is installed in a non-standard location:
make ESMI_INC_DIR=/path/to/esmi/include ESMI_LIB_DIR=/path/to/esmi/lib
Testing with dbpmda
Before installing, test the PMDA using dbpmda(1):
# Start dbpmda (requires root for ESMI access)
sudo dbpmda
# Open the PMDA (use domain number 470)
dbpmda> open pipe ./pmdaesmi -d 470
# Check metric descriptor (use numeric PMID)
dbpmda> desc 470.0.0
# Fetch metric values
dbpmda> fetch 470.0.0
# List all instances for socket metrics
dbpmda> instance 470.0
# List all instances for core metrics
dbpmda> instance 470.1
# Exit
dbpmda> quit
Useful dbpmda commands:
desc <pmid> - Show metric descriptor
fetch <pmid> - Fetch metric value(s)
instance <indom> - List instances in domain
status - Show PMDA status
timer on - Enable timing output
Debugging
Enable debug logging:
# Run PMDA with debug flags
sudo ./pmdaesmi -D appl0 -d 470 -l pmda.log
# Check log
tail -f pmda.log
Common debug flags:
- appl0 - Application-level debugging
- fetch - Fetch callback tracing
- indom - Instance domain operations
- pmns - Namespace operations
Check PMCD logs:
# PMCD daemon log
sudo tail -f /var/log/pcp/pmcd/pmcd.log
# PMDA-specific log
sudo tail -f /var/log/pcp/pmcd/esmi.log
PMDA Architecture
Instance Domains
| Indom | ID | Description |
|---|---|---|
| SOCKET_INDOM | 470.0 | One instance per CPU socket (socket0, socket1, ...) |
| CORE_INDOM | 470.1 | One instance per logical CPU (cpu0, cpu1, ...) |
Metric Clusters
| Cluster | PMID Range | Description |
|---|---|---|
| 0 | 470.0.* | Power metrics (socket) |
| 1 | 470.1.* | Energy - socket |
| 2 | 470.2.* | C0 residency (socket) |
| 3 | 470.3.* | DDR bandwidth (socket) |
| 4 | 470.4.* | Frequency limits (socket) |
| 5 | 470.5.* | Energy - core |
| 6 | 470.6.* | Boost limit (core) |
| 7 | 470.7.* | CCLK limit (core) |
| 8 | 470.8.* | Clocks (socket) |
| 9 | 470.9.* | Temperature (socket) |
| 10 | 470.10.* | PROCHOT (socket) |
| 11 | 470.11.* | DF P-state (socket) |
| 12 | 470.12.* | xGMI P-state (socket) |
| 13 | 470.13.* | xGMI width (socket) |
| 14 | 470.14.* | HSMP info (system) |
| 15 | 470.15.* | CPU info (system) |
| 16 | 470.16.* | C-state config (system) |
| 17 | 470.17.* | DFC config (system) |
| 18 | 470.18.* | amd_pstate (core) |
| 19 | 470.19.* | amd_pstate (system) |
| 20 | 470.20.* | cpuidle residency (core) |
Adding New Metrics
- Add metric definition in metrictab[] array in pmdaesmi.c
- Add fetch handler in esmi_fetchCallBack() switch statement
- Add PMNS entry in pmns file
- Add help text in help file
- Add to pmlogconf if metric should be logged automatically
Example adding a new socket metric:
// In metrictab[] - add after last metric in appropriate cluster
/* esmi.new.metric - cluster X, item Y (X and Y are placeholders) */
{ NULL,
  { PMDA_PMID(X,Y), PM_TYPE_U32, SOCKET_INDOM, PM_SEM_INSTANT,
    PMDA_PMUNITS(0, 0, 0, 0, 0, 0) } },

// In esmi_fetchCallBack() - add case in cluster switch
case X:
    switch (item) {
    case Y: /* esmi.new.metric */
        /* esmi_new_metric_get() stands in for the E-SMI call being wrapped */
        ret = esmi_new_metric_get(inst, &value_u32);
        if (ret != ESMI_SUCCESS)
            return PM_ERR_AGAIN;
        atom->ul = value_u32;   /* PM_TYPE_U32 values go in atom->ul */
        return PMDA_FETCH_STATIC;
    default:
        return PM_ERR_PMID;     /* unknown item in this cluster */
    }
MSR-Safe Development
Building msr-safe from source:
sudo make msr-safe
This clones, builds, and installs the msr-safe kernel module via DKMS.
The ESMI library requires these MSRs for energy metrics:
- 0xC0010299 - Energy units
- 0xC001029A - Core energy status
- 0xC001029B - Package energy status
- 0xC00102F0 - Additional energy register
- 0xC00102F1 - Additional energy register
Write allowlist with:
sudo e_smi_tool --writemsrallowlist
Source Files
| File | Purpose |
|---|---|
| pmdaesmi.c | Main PMDA source code |
| domain.h | PMDA domain number (470) |
| pmns | Performance Metrics Name Space |
| help | Metric help text |
| root | PMNS root file |
| Makefile | Build and install targets |
| Install | PCP installation script |
| Remove | PCP removal script |
| pmlogconf/ | pmlogger configuration files |
| scripts/ | Installation helper scripts |
Testing Changes
# Quick test cycle
make && sudo make install && pminfo -f esmi.power.socket
# Test specific metric
pminfo -df esmi.your.new.metric
# Verify no errors
pminfo -v esmi 2>&1 | grep -i error
pmlogconf Testing
# Validate syntax
pmlogconf -c /var/lib/pcp/config/pmlogconf/esmi/realtime
# Check what would be logged
sudo pmlogconf -r /var/log/pcp/pmlogger/$(hostname)/config.pmlogger 2>&1 | grep -i esmi