Both Intel and AMD call this unit the PMU (Performance Monitoring Unit), so the term is common, at least within x86.

The PMU can be used in two modes:

  • Counting. This is what Intel Performance Counter Monitor and Likwid use. For example, the PMU counts how many cache misses occurred.
  • Sampling. This is what Intel VTune Amplifier XE and Linux perf use. The PMU takes a sample each time an event has occurred a certain number of times. For example, one sample is collected every 1M cache misses.

perf stat uses counting mode; perf record uses sampling mode.

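The PMU hardware itself only has counters, so natively it can only count. Sampling is built on top of counting: a counter is set to overflow after a given number of events, and the overflow triggers the collection of a sample.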
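
To make the relationship between the two modes concrete, here is a minimal toy simulation. It is only a sketch: `ToyCounter`, `run`, the sample period of 3, and the instruction addresses are all made up for illustration and do not correspond to any real hardware or perf interface.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCounter:
    """Toy model of one PMU counter (hypothetical, not a hardware interface)."""
    sample_period: int = 0   # 0 means pure counting: never overflow
    count: int = 0           # total events seen; this is what counting mode reports
    samples: list = field(default_factory=list)  # addresses captured on overflow
    _until_overflow: int = 0

    def __post_init__(self):
        self._until_overflow = self.sample_period

    def on_event(self, instruction_address: int) -> None:
        """Called once per occurrence of the monitored event."""
        self.count += 1
        if self.sample_period == 0:
            return
        self._until_overflow -= 1
        if self._until_overflow == 0:
            # Overflow: real hardware would raise an interrupt here, and the
            # handler would record where the program was executing.
            self.samples.append(instruction_address)
            self._until_overflow = self.sample_period


def run(counter: ToyCounter, n_events: int) -> ToyCounter:
    # Made-up workload: event i occurs at one of four fake addresses.
    for i in range(n_events):
        counter.on_event(0x1000 + (i % 4) * 0x10)
    return counter


counting = run(ToyCounter(), 10)                 # like "perf stat"
sampling = run(ToyCounter(sample_period=3), 10)  # like "perf record"

print("counting:", counting.count, counting.samples)
print("sampling:", sampling.count, [hex(a) for a in sampling.samples])
```

In this model the same counter serves both modes; the only difference is whether an overflow period is configured and whether something is recorded when the overflow fires.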

PMU Specification / PMU Documentation

There is no standalone document that covers everything about the PMU. The material is in the Intel SDM, Chapter 19, PERFORMANCE MONITORING.

Performance analysis tools

Tool      Platform   By
PerfMon   Windows    Intel

PEBS (Precise Event Based Sampling)

PEBS is a feature of the PMU; in other words, it is part of the PMU rather than a separate mechanism.

PEBS is an extension of PMU sampling. When a sample is taken, the PMU is instructed to collect additional information, for example the precise instruction pointer, registers, or flags.

PMU Event Counter

PMU Events

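The supported PMU events differ across architectures, across vendors, and even across processor generations from the same vendor (for example, Intel's Skylake versus Haswell).

I find that reading the PMU event list directly helps a lot in understanding the CPU microarchitecture.

Events fall into two types:

  • Architectural PerfMon events: supported by every generation. There are relatively few of these.
  • Non-architectural events: these may change from one generation to the next.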

The first class supports events for monitoring performance using counting or interrupt-based event sampling usage. These events are non-architectural and vary from one processor model to another. These non-architectural performance monitoring events are specific to the microarchitecture and may change with enhancements. Non-architectural events for a given microarchitecture cannot be enumerated using CPUID.

The second class of performance monitoring capabilities is referred to as architectural performance monitoring. This class supports the same counting and Interrupt-based event sampling usages, with a smaller set of available events. The visible behavior of architectural performance events is consistent across processor implementations. Availability of architectural performance monitoring capabilities is enumerated using the CPUID.0AH.
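
As a sketch of what "enumerated using CPUID.0AH" means in practice, the snippet below decodes the register values that leaf 0AH returns. Python cannot execute CPUID directly, so the function takes raw EAX/EBX/EDX values as arguments; the example values are invented for illustration, and `decode_cpuid_0ah` is a hypothetical helper name. The bit layout follows the SDM description of this leaf as I understand it; note that a set bit in EBX means the corresponding event is not available.

```python
# Architectural events in CPUID.0AH:EBX bit order (bit 0 first).
# A SET bit means the event is NOT available.
ARCH_EVENTS = [
    "UnHalted Core Cycles",
    "Instruction Retired",
    "UnHalted Reference Cycles",
    "LLC Reference",
    "LLC Misses",
    "Branch Instruction Retired",
    "Branch Misses Retired",
]


def decode_cpuid_0ah(eax: int, ebx: int, edx: int) -> dict:
    """Decode the architectural PerfMon fields of CPUID leaf 0AH."""
    vector_len = (eax >> 24) & 0xFF          # number of valid bits in EBX
    n = min(vector_len, len(ARCH_EVENTS))
    available = [
        name
        for bit, name in enumerate(ARCH_EVENTS[:n])
        if not (ebx >> bit) & 1               # bit clear => event available
    ]
    return {
        "version": eax & 0xFF,
        "gp_counters": (eax >> 8) & 0xFF,         # per logical processor
        "gp_counter_width": (eax >> 16) & 0xFF,   # in bits
        "fixed_counters": edx & 0x1F,
        "fixed_counter_width": (edx >> 5) & 0xFF,
        "available_events": available,
    }


# Invented example values, not read from a real machine.
info = decode_cpuid_0ah(eax=0x07300804, ebx=0x00000000, edx=0x00000603)
for key, value in info.items():
    print(f"{key}: {value}")
```

Non-architectural events have no equivalent of this: they have to be looked up in the per-microarchitecture event lists.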

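Taking Intel as an example, the events supported by each generation of Intel CPU are listed in the intel/perfmon repository. For instance, the events for EMR (Emerald Rapids) are in perfmon/EMR/events/emeraldrapids_core.json on the main branch of intel/perfmon. They can also be browsed at https://perfmon-events.intel.com/

There is no need to take notes on each event here; just read the description that comes with it.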
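
For browsing the descriptions programmatically, a small script along these lines may help. This is a sketch under assumptions: the field names (`EventCode`, `UMask`, `EventName`, `BriefDescription`) and the top-level `Events` key reflect my understanding of the intel/perfmon JSON layout and should be checked against the actual file; the embedded `SAMPLE` is a made-up excerpt, not copied from the repository; `load_events` and `raw_code` are hypothetical helper names.

```python
import json

# Made-up excerpt in the shape the event files are assumed to have.
SAMPLE = """
{
  "Header": {"Info": "illustrative excerpt, not a real file"},
  "Events": [
    {"EventCode": "0x2E", "UMask": "0x41",
     "EventName": "LONGEST_LAT_CACHE.MISS",
     "BriefDescription": "Requests that missed the last level cache."},
    {"EventCode": "0x3C", "UMask": "0x00",
     "EventName": "CPU_CLK_UNHALTED.THREAD_P",
     "BriefDescription": "Cycles while the thread is not halted."}
  ]
}
"""


def load_events(text: str) -> list:
    """Return the list of event dicts from a perfmon-style JSON document."""
    data = json.loads(text)
    # Assumption: either {"Events": [...]} or a bare list of events.
    return data["Events"] if isinstance(data, dict) else data


def raw_code(event: dict) -> str:
    """Build an x86 raw event string: umask in bits 15:8, event code in bits 7:0."""
    # Assumption: some entries list several comma-separated codes; take the first.
    code = int(event["EventCode"].split(",")[0], 16)
    umask = int(event["UMask"], 16)
    return f"r{(umask << 8) | code:04x}"


events = load_events(SAMPLE)
for ev in events:
    print(f'{ev["EventName"]:28} {raw_code(ev)}  {ev["BriefDescription"]}')
```

To use it on a real file, replace `SAMPLE` with the contents of a downloaded event file, for example `open("emeraldrapids_core.json").read()`. The resulting raw string follows the `rNNNN` form that perf accepts for raw x86 events (for example `perf stat -e r412e`).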