perfmon2 is no longer developed or maintained. [[http://perfmon2.sourceforge.net/|perfmon2]] requires a patched Linux kernel. The kernel interface and libpfm carry separate version numbers: "perfmon 2.x" denotes the version of the kernel interface, while ''pfmon'' is the command-line tool. Since Linux 2.6.31 the kernel has native support for performance monitoring (the Linux Performance Counter subsystem), so patching the kernel is no longer necessary. libpfm4 is a library built on that native support.

====== pfmon & libpfm3 ======

For how to use ''pfmon'' and ''libpfm3'', see the [[http://perfmon2.sourceforge.net/docs.html|perfmon2 documentation]].

  # http://git.kernel.org/?p=linux/kernel/git/eranian/linux-2.6.git;a=summary
  $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/eranian/linux-2.6.git
  $ wget -O libpfm-3.10.tar.gz http://sourceforge.net/projects/perfmon2/files/libpfm/libpfm-3.10.tar.gz/download
  $ tar xvf libpfm-3.10.tar.gz; cd libpfm-3.10
  # the examples_v* directories hold sample programs for the different perfmon kernel interface versions
  $ cd examples_v2.x
  # shows both the kernel interface version and the libpfm version
  $ pfmon -I
  pfmlib version: 3.9
  kernel perfmon version: 2.9
  # perfmon kernel interface version 2.9
  $ ./self
  sycall base 297
  major version 2
  minor version 9
  # version of the pfmon tool
  $ pfmon -V
  pfmon version 3.8 Date: Jul 17 2009
  Copyright (C) 2001-2007 Hewlett-Packard Company
  # list all events
  $ pfmon -l
  # show the details of one event
  $ pfmon -i UNC_QMC_NORMAL_READS
  Name     : UNC_QMC_NORMAL_READS
  Code     : 0x2c
  Counters : [ 20 21 22 23 24 25 26 27 ]
  Desc     : QMC channel 0 normal read requests
  Umask-00 : 0x01 : [CH0] : QMC channel 0 normal read requests
  Umask-01 : 0x02 : [CH1] : QMC channel 1 normal read requests
  Umask-02 : 0x04 : [CH2] : QMC channel 2 normal read requests
  Umask-03 : 0x07 : [ANY] : QMC normal read requests
  PEBS     : No
  Uncore   : Yes

PEBS stands for Precise Event-Based Sampling, a mechanism for avoiding sampling inaccuracy (see [[http://perfmon2.sourceforge.net/pfmon_usersguide.html#output|10.3 Sampling modules]] and [[http://perfmon2.sourceforge.net/pfmon_intel_atom.html#pebs|5. Precise Event-Based Sampling (PEBS)]]). To use PEBS, the perfmon kernel interface must be version 2.81 or later, and the corresponding kernel module must be loaded.

  $ pfmon --smpl-module=pebs

[[wp>Uncore]] means the counter is socket-level, that is, not private to a single core. A unit mask (Umask) can be used to qualify an event ([[http://perfmon2.sourceforge.net/pfmon_usersguide.html#umasks|4.9 Using events with unit masks]]).

  $ pfmon --system-wide --cpu-list=1 -u -k -e UNC_QMC_NORMAL_READS:CH2 ls

===== Sampling =====

For the sampling workflow of ''pfmon'', see [[http://perfmon2.sourceforge.net/pfmon_usersguide.html#samp|10. Sampling with pfmon]]. ''pfmon'' provides two options, ''--short-smpl-period'' and ''--long-smpl-periods'', that set how many event occurrences trigger one sample; the latter exists to remove the error introduced when ''pfmon'' accesses the kernel buffer during sampling[([http://perfmon2.sourceforge.net/pfmon_usersguide.html#smplp])].

  --short-smpl-period=500,000 --smpl-periods-random=1000000:1000000

''--smpl-periods-random'' adjusts the sampling period; note that its seed field is no longer used as of ''perfmon'' 2.2[([http://perfmon2.sourceforge.net/pfmon_usersguide.html#random])]. The range after adjustment is mask +/- periods.

====== perf & libpfm4 ======

===== Introduction =====

The Linux kernel provides ''perf_events'', also known as the Linux Performance Counter subsystem; its header file is ''/usr/include/linux/perf_event.h''. For an introduction, see [[http://lwn.net/Articles/310176/|Performance Counters for Linux]] and [[http://lwn.net/Articles/357481/|The future of perf events]]. ''perf'' is the counterpart of ''pfmon'' and can be installed from ''/usr/src/linux/tools/perf''. Its documentation is in the same tree, under ''/usr/src/linux/tools/perf'' (the ''Documentation'' subdirectory). When installing, check whether the ''tools/perf'' code needs to be updated.[(http://article.gmane.org/gmane.linux.kernel.perf.user/460)]

For the discussion thread on ''perf_events'' development, see [[http://thread.gmane.org/gmane.comp.linux.perfmon2.devel/1312/focus=1350]].

===== Installation and usage =====

  # http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=summary
  $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
  # requires kernel 2.6.31 or later
  $ uname -r
  2.6.38-rc7-00051-gcbdbb4c-dirty
  $ cd /usr/src/linux/tools/perf
  $ sudo make NO_LIBPYTHON=1
  $ sudo cp perf perf-archive /usr/local/bin

  $ git clone git://perfmon2.git.sourceforge.net/gitroot/perfmon2/libpfm4
  $ cd libpfm4; make
  $ cd examples
  # lists the supported events, both platform-specific and generic; -h lists the other options
  $ ./showevtinfo [-h]
  #-----------------------------
  IDX      : 90177536
  PMU name : ppc970 (PPC970)
  Name     : PM_LSU_REJECT_RELOAD_CDF
  #-----------------------------
  IDX      : 106954771
  PMU name : perf (perf_events generic PMU)
  Name     : PERF_COUNT_HW_CACHE_BPU
  # show the encoding of an event
  $ ./check_events PERF_COUNT_HW_CACHE_ITLB

''perf list'' only lists generic event names. Sampling a platform-specific event requires its raw encoding; see ''Documentation/perf-list.txt''. ''evt2raw'' converts an event name into the raw code that ''perf'' accepts. The events a platform provides can be looked up with ''showevtinfo''.[([http://www.mail-archive.com/perfmon2-devel@lists.sourceforge.net/msg02847.html])]

  # x86 platform
  $ cd libpfm4/perf_examples
  $ ./evt2raw inst_retired:any_p
  r5300c0
  $ perf stat -e `./evt2raw inst_retired:any_p` /bin/ls

  $ cd libpfm4/perf_examples
  $ ./task_smpl -e PERF_COUNT_HW_CACHE_ITLB:period=100 ls

In ''libpfm4/perf_examples'', ''perf_util.h'' and ''perf_util.c'' provide helper functions for ''perf_events''. For tracepoints, see [[http://lwn.net/Articles/346470/|Fun with tracepoints]].

With ''perf'', sampling of every event other than cycles is subject to error[([http://www.mail-archive.com/perfmon2-devel@lists.sourceforge.net/msg02895.html])]. A modifier can be appended to the sampled event (see ''Documentation/perf-list.txt'') to request a more precise sampling mechanism such as PEBS ([[http://software.intel.com/sites/products/documentation/hpc/amplifierxe/en-us/win/ug_docs/reference/pmbk/events/about_precise_event_based_sampling_performance_tuning_events.html|About Precise Event Based Sampling Performance Tuning Events]]).

  $ perf record -e branch-misses:p

===== showevtinfo =====

  #-----------------------------
  IDX      : 23068780
  PMU name : core (Intel Core)
  Name     : X87_OPS_RETIRED
  Equiv    : None
  Flags    : [precise]
  Desc     : FXCH instructions retired
  Code     : 0xc1
  Umask-00 : 0x01 : PMU : [FXCH] : None : FXCH instructions retired
  Umask-01 : 0xfe : PMU : [ANY] : [default] [precise] : Retired floating-point computational operations (Precise Event)
  Modif-00 : 0x00 : PMU : [k] : monitor at priv level 0 (boolean)
  Modif-01 : 0x01 : PMU : [u] : monitor at priv level 1, 2, 3 (boolean)
  Modif-02 : 0x02 : PMU : [e] : edge level (boolean)
  Modif-03 : 0x03 : PMU : [i] : invert (boolean)
  Modif-04 : 0x04 : PMU : [c] : counter-mask in range [0-255] (integer)

If none of the Umasks is marked ''[default]'', one of them must be appended after the ''Name''.

  $ evt2raw -v X87_OPS_RETIRED
  r53fec1 core::X87_OPS_RETIRED:ANY:e=0:i=0:c=0:u=1:k=1:precise=0
  $ evt2raw -v X87_OPS_RETIRED:FXCH
  r5301c1 core::X87_OPS_RETIRED:FXCH:e=0:i=0:c=0:u=1:k=1:precise=0

Modifiers (Modif) are used as follows:

  $ ./perf_examples/evt2raw -v X87_OPS_RETIRED:precise=1:i=1
  rd3fec1 core::X87_OPS_RETIRED:ANY:e=0:i=1:c=0:u=1:k=1:precise=1

===== PEBS =====

''perf list'' does not show whether an event supports PEBS. Use ''showevtinfo'' to find out.

  # check whether the CPU and the kernel support PEBS
  $ dmesg | grep "Performance Events"
  Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.

===== Miscellaneous =====

  $ perf stat -e branches:pp ls
  No permission to collect stats.
  Consider tweaking /proc/sys/kernel/perf_event_paranoid.

''/proc/sys/kernel/perf_event_paranoid'' controls what unprivileged users are allowed to do with ''perf''[([http://www.spinics.net/lists/linux-perf-users/msg00716.html])]. See the comment in ''kernel/perf_event.c'':

  /*
   * perf event paranoia level:
   *  -1 - not paranoid at all
   *   0 - disallow raw tracepoint access for unpriv
   *   1 - disallow cpu events for unpriv
   *   2 - disallow kernel profiling for unpriv
   */
  int sysctl_perf_event_paranoid __read_mostly = 1;

====== Glossary ======

  * Branch Trace Store (BTS)
  * Branch Trace Buffer (BTB)
  * Last Branch Record (LBR) [([http://www.mail-archive.com/perfmon2-devel@lists.sourceforge.net/msg01598.html])][([http://thread.gmane.org/gmane.comp.linux.perfmon2.devel/2026])]
  * [[wp>Machine state register|Machine State Register (MSR)]]
  * [[wp>Model-specific register|Model-specific register (MSR)]]

===== LBR =====

At present the kernel uses the LBR only internally, for the purposes listed below, and does not expose it to ordinary users [([http://lkml.org/lkml/2010/4/7/145])][([http://lkml.org/lkml/2010/3/4/160])].

  - [[http://lkml.org/lkml/2010/3/4/153|use LBR for PEBS IP+1 fixup]]
  - [[http://lkml.org/lkml/2010/3/29/141|Use LBR for machine/oops debugging]]

====== Hardware performance counters ======

===== Intel =====

  * [[http://software.intel.com/file/30388|Performance Monitoring Unit Sharing Guide]]
  * [[http://www.intel.com/Assets/PDF/manual/248966.pdf|Intel® 64 and IA-32 Architectures Optimization Reference Manual]]
  * [[http://software.intel.com/file/30320|Intel® Microarchitecture Codename Nehalem Performance Monitoring Unit Programming Guide]]
  * [[http://software.intel.com/en-us/articles/intel-performance-counter-monitor/|Intel® Performance Counter Monitor - A better way to measure CPU utilization]]
  * [[http://www.csksoft.net/blog/post/bts_setup.html|The Branch Trace Store feature provided by Intel x86 (in Chinese)]]

===== PowerPC =====

  * [[http://www.ibm.com/developerworks/aix/library/au-counteranalyzer/index.html|Performance Monitor Counter data analysis using Counter Analyzer]]

===== SPARC =====

  * [[http://blogs.sun.com/jonh/entry/performance_counter_generic_events|Performance Counter Generic Events]]

====== External links ======

  * [[http://perfmon2.sourceforge.net/|perfmon2]]
  * [[http://www.kernel.org/doc/ols/2006/ols2006v1-pages-269-288.pdf|Perfmon2: a flexible performance monitoring interface for Linux]]
  * [[http://www.hpl.hp.com/techreports/2004/HPL-2004-200R1.pdf|The perfmon2 interface specification]]
  * [[http://blog.csdn.net/bluebeach/archive/2010/09/28/5912062.aspx|Perf: an introduction to the Linux performance tuning tool (in Chinese)]]
  * [[http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Developer_Guide/perf.html|Performance Counters for Linux (PCL) Tools and perf]]
  * [[http://web.eecs.utk.edu/~vweaver1/projects/perf-events/|Unofficial Linux Perf Events Web-Page]] - regularly tracks ''perf_events'' development.
  * [[http://vger.kernel.org/~acme/perf/perf.pdf|Performance Counters on Linux The New Tools]]
  * [[http://pdxplumbers.osuosl.org/2010/ocw/system/presentations/579/original/perf-plumbers2010.pdf|Perf Tools: Recent Improvements]]
  * [[http://stackoverflow.com/questions/7107825/using-hardware-performance-counters-in-linux|Using Hardware Performance Counters in Linux]]
  * [[https://wiki.linaro.org/KenWerner/Sandbox/perf|perf]]
  * [[https://perf.wiki.kernel.org/|Perf Wiki]]
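
====== Appendix: reading an evt2raw code ======

The raw codes that ''evt2raw'' prints in the sections above (''r5300c0'', ''r53fec1'', ''rd3fec1'') can be taken apart by hand. The following is a minimal Python sketch, not part of libpfm4 or ''perf''; the function name ''decode_raw_event'' is hypothetical, and the bit positions assume the Intel IA32_PERFEVTSELx register layout, so it applies only to x86 Intel PMUs such as the ''core'' PMU shown earlier.

```python
def decode_raw_event(raw):
    """Split a perf raw event code such as 'r53fec1' into its fields.

    Assumption: the hex value follows the Intel IA32_PERFEVTSELx layout.
    The keys u, k, e, i, c mirror the modifier names printed by showevtinfo.
    """
    if not raw.startswith("r"):
        raise ValueError("expected a raw code of the form r<hex>")
    value = int(raw[1:], 16)
    return {
        "code": value & 0xFF,          # bits 0-7: event select ("Code" in showevtinfo)
        "umask": (value >> 8) & 0xFF,  # bits 8-15: unit mask
        "u": (value >> 16) & 1,        # bit 16: count at privilege levels 1, 2, 3
        "k": (value >> 17) & 1,        # bit 17: count at privilege level 0
        "e": (value >> 18) & 1,        # bit 18: edge detect
        "int": (value >> 20) & 1,      # bit 20: interrupt on counter overflow
        "en": (value >> 22) & 1,       # bit 22: counter enable
        "i": (value >> 23) & 1,        # bit 23: invert the counter-mask comparison
        "c": (value >> 24) & 0xFF,     # bits 24-31: counter mask
    }


if __name__ == "__main__":
    # Codes taken from the evt2raw examples on this page.
    for raw in ("r5300c0", "r53fec1", "r5301c1", "rd3fec1"):
        fields = decode_raw_event(raw)
        print(raw, {name: hex(val) for name, val in fields.items()})
```

Read this way, ''r53fec1'' and ''rd3fec1'' differ only in bit 23, which matches the ''i=0'' versus ''i=1'' modifier in the verbose ''evt2raw -v'' output. Note that ''precise'' is not one of these register bits, which is consistent with ''precise=1'' leaving no visible trace in the raw code.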