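perfmon2 is no longer developed or maintained.
[[http://perfmon2.sourceforge.net/|perfmon2]] requires patching the Linux kernel. The kernel interface and libpfm each carry their own version number: perfmon2.x refers to the version of the kernel interface, while ''pfmon'' is the tool. Since Linux 2.6.31 the kernel has native support for performance monitoring (the Linux Performance Counter subsystem), so the kernel no longer needs to be patched. libpfm4 is a library built on top of that native support.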
====== pfmon & libpfm3 ======
For how to use ''pfmon'' and ''libpfm3'', see the [[http://perfmon2.sourceforge.net/docs.html|perfmon2 documentation]].
# http://git.kernel.org/?p=linux/kernel/git/eranian/linux-2.6.git;a=summary
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/eranian/linux-2.6.git
$ wget -O libpfm-3.10.tar.gz http://sourceforge.net/projects/perfmon2/files/libpfm/libpfm-3.10.tar.gz/download
$ tar xvf libpfm-3.10.tar.gz; cd libpfm-3.10
# the examples_v* directories hold sample programs for the different perfmon kernel interface versions
$ cd examples_v2.x
# shows the kernel perfmon version and the libpfm version
$ pfmon -I
pfmlib version: 3.9
kernel perfmon version: 2.9
# kernel perfmon version is 2.9
$ ./self
sycall base 297
major version 2
minor version 9
# version of the pfmon tool itself
$ pfmon -V
pfmon version 3.8 Date: Jul 17 2009
Copyright (C) 2001-2007 Hewlett-Packard Company
# list all events
$ pfmon -l
# show detailed information about one event
$ pfmon -i UNC_QMC_NORMAL_READS
Name : UNC_QMC_NORMAL_READS
Code : 0x2c
Counters : [ 20 21 22 23 24 25 26 27 ]
Desc : QMC channel 0 normal read requests
Umask-00 : 0x01 : [CH0] : QMC channel 0 normal read requests
Umask-01 : 0x02 : [CH1] : QMC channel 1 normal read requests
Umask-02 : 0x04 : [CH2] : QMC channel 2 normal read requests
Umask-03 : 0x07 : [ANY] : QMC normal read requests
PEBS : No
Uncore : Yes
PEBS stands for Precise Event-Based Sampling; it is used to avoid inaccuracy in where samples are attributed ([[http://perfmon2.sourceforge.net/pfmon_usersguide.html#output|10.3 Sampling modules]], [[http://perfmon2.sourceforge.net/pfmon_intel_atom.html#pebs|5. Precise Event-Based Sampling (PEBS)]]). To use PEBS, the perfmon kernel interface must be version 2.81 or later, and a specific kernel module must be loaded.
$ pfmon --smpl-module=pebs
[[wp>Uncore]] means that this is a socket-level counter, not one private to a single core. A Umask (unit mask) can be used to qualify an event ([[http://perfmon2.sourceforge.net/pfmon_usersguide.html#umasks|4.9 Using events with unit masks]]).
$ pfmon --system-wide --cpu-list=1 -u -k -e UNC_QMC_NORMAL_READS:CH2 ls
===== Sampling =====
For the sampling workflow of ''pfmon'', see [[http://perfmon2.sourceforge.net/pfmon_usersguide.html#samp|10. Sampling with pfmon]]. ''pfmon'' provides two options, ''--short-smpl-periods'' and ''--long-smpl-periods'', that set how many event occurrences elapse between samples; the latter exists to cancel out the error introduced when ''pfmon'' accesses the kernel buffer during sampling[([http://perfmon2.sourceforge.net/pfmon_usersguide.html#smplp])].
--short-smpl-periods=500000 --smpl-periods-random=1000000:1000000
''--smpl-periods-random'' randomizes the sampling period; note that its seed field is no longer used from ''perfmon'' 2.2 onward[([http://perfmon2.sourceforge.net/pfmon_usersguide.html#random])]. The adjusted range is mask +/- periods.
====== perf & libpfm4 ======
===== Introduction =====
The Linux kernel provides ''perf_events'', also known as the Linux Performance Counter subsystem; its header file is ''/usr/include/linux/perf_event.h''. For an introduction, see [[http://lwn.net/Articles/310176/|Performance Counters for Linux]] and [[http://lwn.net/Articles/357481/|The future of perf events]]. ''perf'' is the counterpart of ''pfmon'' and can be built and installed from ''/usr/src/linux/tools/perf''; its documentation is found under the same directory. When installing, check whether the ''tools/perf'' code needs to be updated.[(http://article.gmane.org/gmane.linux.kernel.perf.user/460)] For the discussion thread on ''perf_events'' development, see [[http://thread.gmane.org/gmane.comp.linux.perfmon2.devel/1312/focus=1350]].
===== Installation and usage =====
# http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=summary
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
# requires kernel 2.6.31 or later
$ uname -r
2.6.38-rc7-00051-gcbdbb4c-dirty
$ cd /usr/src/linux/tools/perf
$ sudo make NO_LIBPYTHON=1
$ sudo cp perf perf-archive /usr/local/bin
$ git clone git://perfmon2.git.sourceforge.net/gitroot/perfmon2/libpfm4
$ cd libpfm4; make
$ cd examples
# list the supported events, both platform-specific and generic; -h lists the other options
$ ./showevtinfo [-h]
#-----------------------------
IDX : 90177536
PMU name : ppc970 (PPC970)
Name : PM_LSU_REJECT_RELOAD_CDF
#-----------------------------
IDX : 106954771
PMU name : perf (perf_events generic PMU)
Name : PERF_COUNT_HW_CACHE_BPU
# show the encoding of an event
$ ./check_events PERF_COUNT_HW_CACHE_ITLB
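''perf list'' lists only the generic event names. Sampling a platform-specific event requires its raw encoding; see ''Documentation/perf-list.txt''. ''evt2raw'' converts an event name into the encoding that ''perf'' accepts. The events a platform offers can be queried with ''showevtinfo''.[([http://www.mail-archive.com/perfmon2-devel@lists.sourceforge.net/msg02847.html])]
# x86 platform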
$ cd libpfm4/perf_examples
$ ./evt2raw inst_retired:any_p
r5300c0
$ perf stat -e $(./evt2raw inst_retired:any_p) /bin/ls
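The raw code printed by ''evt2raw'' can also be assembled by hand on x86. The sketch below is only an illustration and is not how libpfm4 is implemented: it assumes the IA32_PERFEVTSELx bit layout described in the Intel SDM, ''raw_event'' is a hypothetical helper name, and the interrupt and enable bits are always set simply because the codes shown on this page have them set.

```python
# Bit positions assumed from the IA32_PERFEVTSELx layout (Intel SDM).
USR = 1 << 16   # count while in user mode (modifier u)
OS = 1 << 17    # count while in kernel mode (modifier k)
EDGE = 1 << 18  # edge detect (modifier e)
INT = 1 << 20   # raise an interrupt on counter overflow
EN = 1 << 22    # enable the counter
INV = 1 << 23   # invert the counter-mask comparison (modifier i)


def raw_event(code, umask=0, user=True, kernel=True, edge=False, inv=False, cmask=0):
    """Build a perf raw event string (hypothetical helper, x86 only)."""
    for name, field in (("code", code), ("umask", umask), ("cmask", cmask)):
        if not 0 <= field <= 0xFF:
            raise ValueError("%s must fit in 8 bits" % name)
    value = code | (umask << 8) | (cmask << 24) | INT | EN
    if user:
        value |= USR
    if kernel:
        value |= OS
    if edge:
        value |= EDGE
    if inv:
        value |= INV
    return "r%x" % value


# Event code 0xc0 with unit mask 0x00 gives the code evt2raw printed above.
print(raw_event(0xC0))  # r5300c0
```

The resulting string can be passed to ''perf stat -e'' in the same way as the output of ''evt2raw''.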
$ cd libpfm4/perf_examples
$ ./task_smpl -e PERF_COUNT_HW_CACHE_ITLB:period=100 ls
In ''libpfm4/perf_examples'', ''perf_util.h'' and ''perf_util.c'' provide helper functions for ''perf_events''. For tracepoints, see [[http://lwn.net/Articles/346470/|Fun with tracepoints]].
Apart from the cycles event, sampling of every other event with ''perf'' is imprecise[([http://www.mail-archive.com/perfmon2-devel@lists.sourceforge.net/msg02895.html])]. ''perf'' lets you append a modifier to the sampled event (see ''Documentation/perf-list.txt'') to switch to a more precise sampling mechanism such as PEBS ([[http://software.intel.com/sites/products/documentation/hpc/amplifierxe/en-us/win/ug_docs/reference/pmbk/events/about_precise_event_based_sampling_performance_tuning_events.html|About Precise Event Based Sampling Performance Tuning Events]]).
$ perf record -e branch-misses:p
===== showevtinfo =====
#-----------------------------
IDX : 23068780
PMU name : core (Intel Core)
Name : X87_OPS_RETIRED
Equiv : None
Flags : [precise]
Desc : FXCH instructions retired
Code : 0xc1
Umask-00 : 0x01 : PMU : [FXCH] : None : FXCH instructions retired
Umask-01 : 0xfe : PMU : [ANY] : [default] [precise] : Retired floating-point computational operations (Precise Event)
Modif-00 : 0x00 : PMU : [k] : monitor at priv level 0 (boolean)
Modif-01 : 0x01 : PMU : [u] : monitor at priv level 1, 2, 3 (boolean)
Modif-02 : 0x02 : PMU : [e] : edge level (boolean)
Modif-03 : 0x03 : PMU : [i] : invert (boolean)
Modif-04 : 0x04 : PMU : [c] : counter-mask in range [0-255] (integer)
If none of the Umasks is marked ''[default]'', one Umask must be appended after the ''Name''.
$ evt2raw -v X87_OPS_RETIRED
r53fec1 core::X87_OPS_RETIRED:ANY:e=0:i=0:c=0:u=1:k=1:precise=0
$ evt2raw -v X87_OPS_RETIRED:FXCH
r5301c1 core::X87_OPS_RETIRED:FXCH:e=0:i=0:c=0:u=1:k=1:precise=0
Modifiers (Modif) can be used as follows:
$ ./perf_examples/evt2raw -v X87_OPS_RETIRED:precise=1:i=1
rd3fec1 core::X87_OPS_RETIRED:ANY:e=0:i=1:c=0:u=1:k=1:precise=1
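To see how the modifiers end up in the raw code, the values printed by ''evt2raw'' can be split back into fields. This is a rough sketch that assumes the x86 IA32_PERFEVTSELx layout from the Intel SDM (event code in bits 0-7, unit mask in bits 8-15, u/k/e in bits 16-18, i in bit 23, c in bits 24-31); ''decode_raw'' is a hypothetical name, not a libpfm4 function.

```python
def decode_raw(raw):
    """Split a perf raw event code such as 'rd3fec1' into its fields."""
    digits = raw[1:] if raw.startswith("r") else raw
    value = int(digits, 16)
    return {
        "code": value & 0xFF,           # event select
        "umask": (value >> 8) & 0xFF,   # unit mask
        "u": (value >> 16) & 1,         # user mode
        "k": (value >> 17) & 1,         # kernel mode
        "e": (value >> 18) & 1,         # edge detect
        "i": (value >> 23) & 1,         # invert
        "c": (value >> 24) & 0xFF,      # counter mask
    }


# Decode the three codes from the transcripts above.
for raw in ("r53fec1", "r5301c1", "rd3fec1"):
    fields = decode_raw(raw)
    flags = " ".join("%s=%d" % (name, fields[name]) for name in ("e", "i", "c", "u", "k"))
    print("%s code=%#x umask=%#x %s" % (raw, fields["code"], fields["umask"], flags))
```

Under this layout ''r53fec1'' and ''rd3fec1'' differ only in bit 23 (''i''), which suggests that ''precise=1'' is not carried in the raw code itself.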
===== PEBS =====
''perf list'' does not tell you whether an event supports PEBS; query it with ''showevtinfo'' instead.
# check whether the CPU and the kernel support PEBS
$ dmesg | grep "Performance Events"
Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
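The same check can be scripted. The sketch below assumes the kernel log line looks like the one shown above (a ''Performance Events:'' line containing ''PEBS fmt...''); the wording may differ between kernel versions, and ''pebs_format'' is a hypothetical helper name.

```python
import re


def pebs_format(kernel_log):
    """Return the PEBS format token (e.g. 'fmt0+') or None if PEBS is not reported."""
    for line in kernel_log.splitlines():
        if "Performance Events:" not in line:
            continue
        match = re.search(r"PEBS\s+(fmt[^,\s]+)", line)
        if match:
            return match.group(1)
    return None


sample = "Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver."
print(pebs_format(sample))  # fmt0+
```

On a real system the text passed in would be the output of ''dmesg''.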
===== Miscellaneous =====
$ perf stat -e branches:pp ls
No permission to collect stats.
Consider tweaking /proc/sys/kernel/perf_event_paranoid.
''/proc/sys/kernel/perf_event_paranoid'' controls what an unprivileged user is allowed to do with ''perf''[([http://www.spinics.net/lists/linux-perf-users/msg00716.html])]. See the comment in ''kernel/perf_event.c'':
/*
* perf event paranoia level:
* -1 - not paranoid at all
* 0 - disallow raw tracepoint access for unpriv
* 1 - disallow cpu events for unpriv
* 2 - disallow kernel profiling for unpriv
*/
int sysctl_perf_event_paranoid __read_mostly = 1;
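As a small illustration of how the levels stack up, the sketch below turns a paranoia level into the list of restrictions in effect. It assumes the levels are cumulative (each level keeps the restrictions of the lower ones), and the function names are made up for this example.

```python
# Restrictions from the kernel comment above, keyed by the lowest level
# at which each one applies to unprivileged users.
RESTRICTIONS = (
    (0, "raw tracepoint access"),
    (1, "cpu events"),
    (2, "kernel profiling"),
)


def restrictions(level):
    """List what an unprivileged user is denied at the given paranoia level."""
    return [what for minimum, what in RESTRICTIONS if level >= minimum]


def current_level(path="/proc/sys/kernel/perf_event_paranoid"):
    """Read the current level from procfs (requires a kernel with perf_events)."""
    with open(path) as handle:
        return int(handle.read().strip())


print(restrictions(1))  # ['raw tracepoint access', 'cpu events']
```

With the default of 1 shown above, kernel profiling is still permitted; calling ''restrictions(current_level())'' reports the state of the running system.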
====== Glossary ======
* Branch Trace Store (BTS)
* Branch Trace Buffer (BTB)
* Last Branch Record (LBR) [([http://www.mail-archive.com/perfmon2-devel@lists.sourceforge.net/msg01598.html])][([http://thread.gmane.org/gmane.comp.linux.perfmon2.devel/2026])]
* [[wp>Machine state register|Machine State Register (MSR)]]
* [[wp>Model-specific register|Model-specific register (MSR)]]
===== LBR =====
At present the kernel uses LBR only internally, for the purposes listed below, and does not expose LBR to ordinary users [([http://lkml.org/lkml/2010/4/7/145])][([http://lkml.org/lkml/2010/3/4/160])].
- [[http://lkml.org/lkml/2010/3/4/153|use LBR for PEBS IP+1 fixup]]
- [[http://lkml.org/lkml/2010/3/29/141|Use LBR for machine/oops debugging]]
====== Hardware performance counters ======
===== Intel =====
* [[http://software.intel.com/file/30388|Performance Monitoring Unit Sharing Guide]]
* [[http://www.intel.com/Assets/PDF/manual/248966.pdf|Intel® 64 and IA-32 Architectures Optimization Reference Manual]]
* [[http://software.intel.com/file/30320|Intel® Microarchitecture Codename Nehalem Performance Monitoring Unit Programming Guide]]
* [[http://software.intel.com/en-us/articles/intel-performance-counter-monitor/|Intel® Performance Counter Monitor - A better way to measure CPU utilization]]
* [[http://www.csksoft.net/blog/post/bts_setup.html|The Branch Trace Store feature provided by Intel x86 (in Chinese)]]
===== PowerPC =====
* [[http://www.ibm.com/developerworks/aix/library/au-counteranalyzer/index.html|Performance Monitor Counter data analysis using Counter Analyzer]]
===== SPARC =====
* [[http://blogs.sun.com/jonh/entry/performance_counter_generic_events|Performance Counter Generic Events]]
====== External links ======
* [[http://perfmon2.sourceforge.net/|perfmon2]]
* [[http://www.kernel.org/doc/ols/2006/ols2006v1-pages-269-288.pdf|Perfmon2: a flexible performance monitoring interface for Linux]]
* [[http://www.hpl.hp.com/techreports/2004/HPL-2004-200R1.pdf|The perfmon2 interface specification]]
* [[http://blog.csdn.net/bluebeach/archive/2010/09/28/5912062.aspx|Perf: an introduction to the Linux system performance tuning tool (in Chinese)]]
* [[http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Developer_Guide/perf.html|Performance Counters for Linux (PCL) Tools and perf]]
* [[http://web.eecs.utk.edu/~vweaver1/projects/perf-events/|Unofficial Linux Perf Events Web-Page]] - regularly tracks the state of ''perf_events'' development.
* [[http://vger.kernel.org/~acme/perf/perf.pdf|Performance Counters on Linux The New Tools]]
* [[http://pdxplumbers.osuosl.org/2010/ocw/system/presentations/579/original/perf-plumbers2010.pdf|Perf Tools: Recent Improvements]]
* [[http://stackoverflow.com/questions/7107825/using-hardware-performance-counters-in-linux|Using Hardware Performance Counters in Linux]]
* [[https://wiki.linaro.org/KenWerner/Sandbox/perf|perf]]
* [[https://perf.wiki.kernel.org/|Perf Wiki]]