PMU
Both Intel and AMD call this unit the PMU, so the term is generic (at least within x86).
The PMU can be used in two modes:
- Counting. This is what Intel Performance Counter Monitor or Likwid use. For example, it counts how many cache misses occurred.
- Sampling. This is what Intel VTune Amplifier XE or Linux Perf use. The PMU is used to take a sample each time an event has occurred a certain number of times. For example, one collects a sample after every 1M cache misses.
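`perf stat` uses counting mode; `perf record` uses sampling mode.
The PMU hardware itself only provides counters, so natively it only counts. Sampling is built on top of counting: a counter is set up to overflow after a chosen number of events, and a sample is taken when the overflow is signalled.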
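To make the relationship between the two modes concrete, here is a minimal sketch in Python. It is a toy model only: `ToyCounter`, `run`, and the `trace` of instruction pointers are hypothetical names invented for illustration, not a real PMU or perf interface. It shows that sampling is just counting plus "do something every N events".

```python
from dataclasses import dataclass, field


@dataclass
class ToyCounter:
    """Toy model of a single event counter (hypothetical, not a hardware API)."""
    period: int = 0   # 0 = counting only; N = take a sample every N events
    count: int = 0    # total number of events observed
    samples: list = field(default_factory=list)  # instruction pointers captured

    def event(self, ip: int) -> None:
        """Record one occurrence of the monitored event at instruction pointer ip."""
        self.count += 1
        # Stand-in for counter overflow: every `period` events, capture a sample.
        if self.period and self.count % self.period == 0:
            self.samples.append(ip)


def run(counter: ToyCounter, ips) -> ToyCounter:
    """Feed a sequence of event locations into the counter."""
    for ip in ips:
        counter.event(ip)
    return counter


# Made-up trace: the instruction pointer at which each event (e.g. a cache miss) happened.
trace = [0x10, 0x14, 0x18, 0x1C, 0x20, 0x24, 0x28, 0x2C, 0x30, 0x34]

counting = run(ToyCounter(), trace)           # counting mode: only the total matters
sampling = run(ToyCounter(period=4), trace)   # sampling mode: one sample per 4 events

print(counting.count, [hex(ip) for ip in sampling.samples])
```

In counting mode only the final total is read; in sampling mode the recorded locations show where in the program the events cluster.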
PMU Specification / PMU Documentation
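There is no standalone document that covers everything about the PMU. The material is in the Intel SDM, Chapter 19, PERFORMANCE MONITORING (the chapter number may differ between SDM revisions).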
Performance analysis tools
| Tool | Platform | By |
|---|---|---|
| PerfMon | Windows | Intel |
PEBS (Precise Event Based Sampling)
PEBS is a feature of the PMU; in other words, the PMU contains PEBS.
PEBS is an extension of PMU "sampling". The PMU is instructed to collect additional information when a sample is taken. For example, the precise instruction pointer, registers, or flags are recorded.
PMU Event Counter
PMU Events
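The supported PMU events differ across architectures, across vendors, and even across processor generations from the same vendor (for example, Intel's Skylake versus Haswell).
I find that reading the PMU event list directly helps a lot in understanding the CPU microarchitecture.
Events fall into two types:
- Architectural PerfMon events: supported by every generation; there are probably relatively few of these.
- Non-architectural events: these may change with every generation.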
The first class supports events for monitoring performance using counting or interrupt-based event sampling usage. These events are non-architectural and vary from one processor model to another. These non-architectural performance monitoring events are specific to the microarchitecture and may change with enhancements. Non-architectural events for a given microarchitecture cannot be enumerated using CPUID.
The second class of performance monitoring capabilities is referred to as architectural performance monitoring. This class supports the same counting and Interrupt-based event sampling usages, with a smaller set of available events. The visible behavior of architectural performance events is consistent across processor implementations. Availability of architectural performance monitoring capabilities is enumerated using the CPUID.0AH.
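As an illustration of what CPUID.0AH enumerates, the sketch below decodes raw EAX and EBX values from that leaf. Two things are assumptions here: the field layout is written from my reading of the SDM (EAX holds the version ID, the number and width of general-purpose counters, and the EBX vector length; a set bit in EBX means the corresponding architectural event is *not* available), and the input value `0x07300404` is only an illustrative example. `decode_cpuid_0a` is a hypothetical helper name, and obtaining the raw register values is outside this sketch.

```python
# Architectural events in EBX bit order (bit 0 first), as I understand the SDM.
ARCH_EVENTS = [
    "core cycles",
    "instructions retired",
    "reference cycles",
    "LLC references",
    "LLC misses",
    "branch instructions retired",
    "branch mispredicts retired",
]


def decode_cpuid_0a(eax: int, ebx: int) -> dict:
    """Decode CPUID.0AH EAX/EBX into a readable summary (sketch, assumed layout)."""
    vector_length = (eax >> 24) & 0xFF
    # An event is available if its bit is within the vector and the bit is clear.
    available = [
        name
        for bit, name in enumerate(ARCH_EVENTS)
        if bit < vector_length and not (ebx >> bit) & 1
    ]
    return {
        "version": eax & 0xFF,
        "gp_counters": (eax >> 8) & 0xFF,
        "counter_width": (eax >> 16) & 0xFF,
        "ebx_vector_length": vector_length,
        "available_events": available,
    }


info = decode_cpuid_0a(0x07300404, 0x0)  # illustrative input, not read from a real CPU
print(info)
```

Non-architectural events have no equivalent enumeration, which is why they have to be looked up in per-microarchitecture event lists.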
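Taking Intel as an example, the events supported by each generation of Intel CPU are listed in intel/perfmon. For EMR, for instance, see perfmon/EMR/events/emeraldrapids_core.json on the main branch of intel/perfmon. They can also be browsed at https://perfmon-events.intel.com/
There is no need to take notes on each individual event here; reading the description attached to each event is enough.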
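Since the event lists are plain JSON, the descriptions can be searched with a few lines of Python. This is a sketch under assumptions: it assumes each file is either a top-level list of events or an object with an `Events` list, and that each event carries `EventName` and `BriefDescription` fields, which should be checked against the actual file. `load_events` and `find_events` are hypothetical helper names, and `SAMPLE` is a made-up stand-in, not real data from intel/perfmon.

```python
import json

# Made-up sample in the assumed schema; the event names are not real.
SAMPLE = """
{
  "Header": {"Info": "made-up sample"},
  "Events": [
    {"EventName": "EXAMPLE.CACHE_MISS", "EventCode": "0x01", "UMask": "0x02",
     "BriefDescription": "Hypothetical event counting cache misses"},
    {"EventName": "EXAMPLE.BRANCH_MISS", "EventCode": "0x03", "UMask": "0x04",
     "BriefDescription": "Hypothetical event counting mispredicted branches"}
  ]
}
"""


def load_events(text: str) -> list:
    """Return the list of event dicts from JSON text (assumed schema)."""
    data = json.loads(text)
    # Accept both a wrapped object and a bare list of events.
    return data["Events"] if isinstance(data, dict) else data


def find_events(events: list, keyword: str) -> list:
    """Names of events whose brief description mentions the keyword."""
    keyword = keyword.lower()
    return [
        e["EventName"]
        for e in events
        if keyword in e.get("BriefDescription", "").lower()
    ]


events = load_events(SAMPLE)
for name in find_events(events, "cache"):
    print(name)
```

To use it on a real file, replace `SAMPLE` with the contents of a downloaded event list, for example via `open(path).read()`.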