summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2012-07-02perf kvm: Fix segfault with report and mixed guestmount useDavid Ahern
Using the guestmount option on record: $ perf kvm --guest --host --guestmount=/tmp/guest-mount record -ag But not the subsequent report: $ perf kvm report causes a SEGFAULT in the usual place: (gdb) bt 0 0x0000000000470356 in machine__mmap_name (self=0x0, bf=0x7fffffffbdb0 " z\370\367\377\177", size= 4096) at util/map.c:712 1 0x00000000004453e8 in perf_event__process_kernel_mmap (tool=0x7fffffffde10, event=0x7ffff7f87e38, machine=0x0) at util/event.c:550 2 0x00000000004458c9 in perf_event__process_mmap (tool=0x7fffffffde10, event=0x7ffff7f87e38, sample= 0x7fffffffd2a0, machine=0x0) at util/event.c:656 3 0x00000000004733e0 in perf_session_deliver_event (session=0x91aca0, event=0x7ffff7f87e38, sample= 0x7fffffffd2a0, tool=0x7fffffffde10, file_offset=7736) at util/session.c:979 ... The MMAP events in this case already contain the full path to the module. No need to require it for the report path to. Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1341241977-71535-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-07-02perf kvm: Fix regression with guest machine creationDavid Ahern
Commit 743eb868657bdb1b26c7b24077ca21c67c82c777 reworked when the machines were created. Prior to this commit guest machines could be created in perf_event__process_kernel_mmap() while processing kernel MMAP events. This commit assumes that the machines exist by the time perf_session_deliver_event is called (e.g., during processing of build id events) - which is not always correct. One example is the use of default guest args (--guestkallsyms and --guestmodules) for short times where no samples hit within a guest module. For this case no build id is added to the file header. No build id == no machine created. That leads to the next example -- the use of no-buildid (-B) on the record for all perf-kvm invocations. In both cases perf report dies with a SEGFAULT of the form: (gdb) bt 0 0x000000000046dd7b in machine__mmap_name (self=0x0, bf=0x7fffffffbd20 "q\021", size=4096) at util/map.c:715 1 0x0000000000444161 in perf_event__process_kernel_mmap (tool=0x7fffffffdd80, event=0x7ffff7fb4120, machine=0x0) at util/event.c:562 2 0x0000000000444642 in perf_event__process_mmap (tool=0x7fffffffdd80, event=0x7ffff7fb4120, sample=0x7fffffffd210, machine=0x0) at util/event.c:668 3 0x0000000000470e0b in perf_session_deliver_event (session=0x915ca0, event=0x7ffff7fb4120, sample=0x7fffffffd210, tool=0x7fffffffdd80, file_offset=8480) at util/session.c:979 4 0x000000000047032e in flush_sample_queue (s=0x915ca0, tool=0x7fffffffdd80) at util/session.c:679 5 0x0000000000471c8d in __perf_session__process_events (session=0x915ca0, data_offset=400, data_size=150448, file_size=150848, tool= 0x7fffffffdd80) at util/session.c:1363 6 0x0000000000471d42 in perf_session__process_events (self=0x915ca0, tool=0x7fffffffdd80) at util/session.c:1379 7 0x000000000042484a in __cmd_report (rep=0x7fffffffdd80) at builtin-report.c:368 8 0x0000000000425bf1 in cmd_report (argc=0, argv=0x915b00, prefix=0x0) at builtin-report.c:756 9 0x0000000000438505 in __cmd_report (argc=4, argv=0x7fffffffe260) at builtin-kvm.c:84 10 0x000000000043882a in cmd_kvm (argc=4, argv=0x7fffffffe260, prefix=0x0) at builtin-kvm.c:131 11 0x00000000004152cd in run_builtin (p=0x7a54e8, argc=9, argv=0x7fffffffe260) at perf.c:273 12 0x00000000004154c7 in handle_internal_command (argc=9, argv=0x7fffffffe260) at perf.c:345 13 0x0000000000415613 in run_argv (argcp=0x7fffffffe14c, argv=0x7fffffffe140) at perf.c:389 14 0x0000000000415899 in main (argc=9, argv=0x7fffffffe260) at perf.c:487 Fix by allowing the machine to be created in perf_session_deliver_event. Tested with --guestmount option and default guest args, with and without -B arg on record for both and for short (10 seconds) and long (10 minutes) windows. Reported-by: Pradeep Kumar Surisetty <psuriset@linux.vnet.ibm.com> Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pradeep Kumar Surisetty <psuriset@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1341180697-64515-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-07-02perf script: Fix format regression due to libtraceevent mergeDavid Ahern
Consider the commands: perf record -e sched:sched_switch -fo /tmp/perf.data -a -- sleep 1 perf script -i /tmp/perf.data In v3.4 the output has the form (lines wrapped here) perf 29214 [005] 821043.582596: sched_switch: prev_comm=perf prev_pid=29214 prev_prio=120 prev_state=S ==> next_comm=swapper/5 next_pid=0 next_prio=120 In 3.5 that same line has become: perf 29214 [005] 821043.582596: sched_switch: <...>-29214 [005] 0.000000000: sched_switch: prev_comm=perf prev_pid=29214 prev_prio=120 prev_state=S ==> next_comm=swapper/5 next_pid=0 next_prio=120 Note the duplicates in the output -- pid, cpu, event name. With this patch the v3.4 output is restored: perf 29214 [005] 821043.582596: sched_switch: prev_comm=perf prev_pid=29214 prev_prio=120 prev_state=S ==> next_comm=swapper/5 next_pid=0 next_prio=120 v3: Remove that pesky newline too. Output now matches v3.4 (pre-libtracevent). v2: Change print_trace_event function local to perf per Steve's comments. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1339698977-68962-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-06-22Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar. * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ftrace: Make all inline tags also include notrace perf: Use css_tryget() to avoid propping up css refcount perf tools: Fix synthesizing tracepoint names from the perf.data headers perf stat: Fix default output file perf tools: Fix endianity swapping for adds_features bitmask
2012-06-12perf tools: Fix synthesizing tracepoint names from the perf.data headersArnaldo Carvalho de Melo
We need to use the per event info snapshoted at record time to synthesize the events name, so do it just after reading the perf.data headers, when we already processed the /sys events data, otherwise we'll end up using the local /sys that only by sheer luck will have the same tracepoint ID -> real event association. Example: # uname -a Linux felicio.ghostprotocols.net 3.4.0-rc5+ #1 SMP Sat May 19 15:27:11 BRT 2012 x86_64 x86_64 x86_64 GNU/Linux # perf record -e sched:sched_switch usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.015 MB perf.data (~648 samples) ] # cat /t/events/sched/sched_switch/id 279 # perf evlist -v sched:sched_switch: sample_freq=1, type: 2, config: 279, size: 80, sample_type: 1159, read_format: 7, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1 # So on the above machine the sched:sched_switch has tracepoint id 279, but on the machine were we'll analyse it it has a different id: $ cat /t/events/sched/sched_switch/id 56 $ perf evlist -i /tmp/perf.data kmem:mm_balancedirty_writeout $ cat /t/events/kmem/mm_balancedirty_writeout/id 279 With this fix: $ perf evlist -i /tmp/perf.data sched:sched_switch Reported-by: Dmitry Antipov <dmitry.antipov@linaro.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-auwks8fpuhmrdpiefs55o5oz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-06-11perf stat: Fix default output fileStephane Eranian
The following commit: commit 56f3bae70638b33477a6015fd362ccfe354fd3ee Author: Jim Cromie <jim.cromie@gmail.com> Date: Wed Sep 7 17:14:00 2011 -0600 perf stat: Add --log-fd <N> option to redirect stderr elsewhere introduced a bug in the way perf stat outputs the results by default, i.e., without the --log-fd or --output option. It would default to writing to file descriptor 0, i.e., stdin. Writing to stdin is allowed and is equivalent to writing to stdout. However, there is a major difference for any script that was already capturing the output of perf stat via redirection: perf stat >/tmp/log .... or perf stat 2>/tmp/log .... They would not capture anything anymore. They would have to do: perf stat 0>/tmp/log ... This breaks compatibility with existing scripts and does not look very natural. This patch fixes the problem by looking at output_fd only when it was modified by user (> 0). It also checks that the value if positive. Passing --log-fd 0 is ignored. I would also argue that defaulting to stderr for the results is not the right thing to do, though this patch does not address this specific issue. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jim Cromie <jim.cromie@gmail.com> Link: http://lkml.kernel.org/r/20120515111111.GA9870@quad Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-06-11perf tools: Fix endianity swapping for adds_features bitmaskDavid Ahern
Based on Jiri's latest attempt: https://lkml.org/lkml/2012/5/16/61 Basically, adds_features should be byte swapped assuming unsigned longs are either 8-bytes (u64) or 4-bytes (u32). Fixes 32-bit ppc dumping 64-bit x86 feature data: ======== captured on: Sun May 20 19:23:23 2012 hostname : nxos-vdc-dev3 os release : 3.4.0-rc7+ perf version : 3.4.rc4.137.g978da3 arch : x86_64 nrcpus online : 16 nrcpus avail : 16 cpudesc : Intel(R) Xeon(R) CPU E5540 @ 2.53GHz cpuid : GenuineIntel,6,26,5 total memory : 24680324 kB ... Verified 64-bit x86 can still dump feature data for 32-bit ppc. Signed-off-by: David Ahern <dsahern@gmail.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/4FBBB539.5010805@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-06-10Tools: hv: verify origin of netlink connector messageOlaf Hering
The SuSE security team suggested to use recvfrom instead of recv to be certain that the connector message is originated from kernel. CVE-2012-2669 Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Marcus Meissner <meissner@suse.de> Signed-off-by: Sebastian Krahmer <krahmer@suse.de> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-08Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "A bit larger than what I'd wish for - half of it is due to hw driver updates to Intel Ivy-Bridge which info got recently released, cycles:pp should work there now too, amongst other things. (but we are generally making exceptions for hardware enablement of this type.) There are also callchain fixes in it - responding to mostly theoretical (but valid) concerns. The tooling side sports perf.data endianness/portability fixes which did not make it for the merge window - and various other fixes as well." * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits) perf/x86: Check user address explicitly in copy_from_user_nmi() perf/x86: Check if user fp is valid perf: Limit callchains to 127 perf/x86: Allow multiple stacks perf/x86: Update SNB PEBS constraints perf/x86: Enable/Add IvyBridge hardware support perf/x86: Implement cycles:p for SNB/IVB perf/x86: Fix Intel shared extra MSR allocation x86/decoder: Fix bsr/bsf/jmpe decoding with operand-size prefix perf: Remove duplicate invocation on perf_event_for_each perf uprobes: Remove unnecessary check before strlist__delete perf symbols: Check for valid dso before creating map perf evsel: Fix 32 bit values endianity swap for sample_id_all header perf session: Handle endianity swap on sample_id_all header data perf symbols: Handle different endians properly during symbol load perf evlist: Pass third argument to ioctl explicitly perf tools: Update ioctl documentation for PERF_IOC_FLAG_GROUP perf tools: Make --version show kernel version instead of pull req tag perf tools: Check if callchain is corrupted perf callchain: Make callchain cursors TLS ...
2012-06-06Merge tag 'perf-urgent-for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf fixes from Arnaldo Carvalho de Melo: * Endianness fixes from Jiri Olsa * Fixes for make perf tarball * Fix for DSO name in perf script callchains, from David Ahern * Segfault fixes for perf top --callchain, from Namhyung Kim * Minor function result fixes from Srikar Dronamraju * Add missing 3rd ioctl parameter, from Namhyung Kim * Fix pager usage in minimal embedded systems, from Avik Sil Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-03tools/power turbostat: fix IVB supportLen Brown
Initial IVB support went into turbostat in Linux-3.1: 553575f1ae048aa44682b46b3c51929a0b3ad337 (tools turbostat: recognize and run properly on IVB) However, when running on IVB, turbostat would fail to report the new couters added with SNB, c7, pc2 and pc7. So in scenarios where these counters are non-zero on IVB, turbostat would report erroneous residencey results. In particular c7 time would be added to c1 time, since c1 time is calculated as "that which is left over". Also, turbostat reports MHz capabilities when passed the "-v" option, and it would incorrectly report 133MHz bclk instead of 100MHz bclk for IVB, which would inflate GHz reported with that option. This patch is a backport of a fix already included in turbostat v2. Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-03tools/power turbostat: fix un-intended affinity of forked programLen Brown
Linux 3.4 included a modification to turbostat to lower cross-call overhead by using scheduler affinity: 15aaa34654831e98dd76f7738b6c7f5d05a66430 (tools turbostat: reduce measurement overhead due to IPIs) In the use-case where turbostat forks a child program, that change had the un-intended side-effect of binding the child to the last cpu in the system. This change removed the binding before forking the child. This is a back-port of a fix already included in turbostat v2. Signed-off-by: Len Brown <len.brown@intel.com>
2012-05-31Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds
Merge misc patches from Andrew Morton: - the "misc" tree - stuff from all over the map - checkpatch updates - fatfs - kmod changes - procfs - cpumask - UML - kexec - mqueue - rapidio - pidns - some checkpoint-restore feature work. Reluctantly. Most of it delayed a release. I'm still rather worried that we don't have a clear roadmap to completion for this work. * emailed from Andrew Morton <akpm@linux-foundation.org>: (78 patches) kconfig: update compression algorithm info c/r: prctl: add ability to set new mm_struct::exe_file c/r: prctl: extend PR_SET_MM to set up more mm_struct entries c/r: procfs: add arg_start/end, env_start/end and exit_code members to /proc/$pid/stat syscalls, x86: add __NR_kcmp syscall fs, proc: introduce /proc/<pid>/task/<tid>/children entry sysctl: make kernel.ns_last_pid control dependent on CHECKPOINT_RESTORE aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector() eventfd: change int to __u64 in eventfd_signal() fs/nls: add Apple NLS pidns: make killed children autoreap pidns: use task_active_pid_ns in do_notify_parent rapidio/tsi721: add DMA engine support rapidio: add DMA engine support for RIO data transfers ipc/mqueue: add rbtree node caching support tools/selftests: add mq_perf_tests ipc/mqueue: strengthen checks on mqueue creation ipc/mqueue: correct mq_attr_ok test ipc/mqueue: improve performance of send/recv selftests: add mq_open_tests ...
2012-05-31syscalls, x86: add __NR_kcmp syscallCyrill Gorcunov
While doing the checkpoint-restore in the user space one need to determine whether various kernel objects (like mm_struct-s of file_struct-s) are shared between tasks and restore this state. The 2nd step can be solved by using appropriate CLONE_ flags and the unshare syscall, while there's currently no ways for solving the 1st one. One of the ways for checking whether two tasks share e.g. mm_struct is to provide some mm_struct ID of a task to its proc file, but showing such info considered to be not that good for security reasons. Thus after some debates we end up in conclusion that using that named 'comparison' syscall might be the best candidate. So here is it -- __NR_kcmp. It takes up to 5 arguments - the pids of the two tasks (which characteristics should be compared), the comparison type and (in case of comparison of files) two file descriptors. Lookups for pids are done in the caller's PID namespace only. At moment only x86 is supported and tested. [akpm@linux-foundation.org: fix up selftests, warnings] [akpm@linux-foundation.org: include errno.h] [akpm@linux-foundation.org: tweak comment text] Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Andrey Vagin <avagin@openvz.org> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Glauber Costa <glommer@parallels.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Tejun Heo <tj@kernel.org> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Vasiliy Kulikov <segoon@openwall.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Valdis.Kletnieks@vt.edu Cc: Michal Marek <mmarek@suse.cz> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31tools/selftests: add mq_perf_testsDoug Ledford
Add the mq_perf_tests tool I used when creating my mq performance patch. Also add a local .gitignore to keep the binaries from showing up in git status output. [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Doug Ledford <dledford@redhat.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31selftests: add mq_open_testsDoug Ledford
Add a directory to house POSIX message queue subsystem specific tests. Add first test which checks the operation of mq_open() under various corner conditions. Signed-off-by: Doug Ledford <dledford@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Joe Korty <joe.korty@ccur.com> Cc: Amerigo Wang <amwang@redhat.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31perf uprobes: Remove unnecessary check before strlist__deleteSrikar Dronamraju
Since strlist__delete() itself checks, the additional check before calling strlist__delete() is redundant. No Functional change. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Suggested-by: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20120531114643.23691.38666.sendpatchset@srdronam.in.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31perf symbols: Check for valid dso before creating mapSrikar Dronamraju
dso__new() can return NULL. Hence verify dso before creating a new map. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Suggested-by: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20120531114656.23691.54223.sendpatchset@srdronam.in.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31perf evsel: Fix 32 bit values endianity swap for sample_id_all headerJiri Olsa
We swap the sample_id_all header by u64 pointers. Some members of the header happen to be 32 bit values. We need to handle them separatelly. Together with other endianity patches, this change fixies perf report discrepancies on origin and target systems as described in test 1 below, e.g. following perf report diff: ... 0.12% ps [kernel.kallsyms] [k] clear_page - 0.12% awk bash [.] alloc_word_desc + 0.12% awk bash [.] yyparse 0.11% beah-rhts-task libpython2.6.so.1.0 [.] 0x5560e 0.10% perf libc-2.12.so [.] __ctype_toupper_loc - 0.09% rhts-test-runne bash [.] maybe_make_export_env + 0.09% rhts-test-runne bash [.] 0x385a0 0.09% ps [kernel.kallsyms] [k] page_fault ... Note, running following to test perf endianity handling: test 1) - origin system: # perf record -a -- sleep 10 (any perf record will do) # perf report > report.origin # perf archive perf.data - copy the perf.data, report.origin and perf.data.tar.bz2 to a target system and run: # tar xjvf perf.data.tar.bz2 -C ~/.debug # perf report > report.target # diff -u report.origin report.target - the diff should produce no output (besides some white space stuff and possibly different date/TZ output) test 2) - origin system: # perf record -ag -fo /tmp/perf.data -- sleep 1 - mount origin system root to the target system on /mnt/origin - target system: # perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \ --kallsyms /mnt/origin/proc/kallsyms - complete perf.data header is displayed Signed-off-by: Jiri Olsa <jolsa@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1338380624-7443-4-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31perf session: Handle endianity swap on sample_id_all header dataJiri Olsa
Adding endianity swapping for event header attached via sample_id_all. Currently we dont do that and it's causing wrong data to be read when running report on architecture with different endianity than the record. The perf is currently able to process 32-bit PPC samples on 32-bit and 64-bit x86. Together with other endianity patches, this change fixies perf report discrepancies on origin and target systems as described in test 1 below, e.g. following perf report diff: ... 0.12% ps [kernel.kallsyms] [k] clear_page - 0.12% awk bash [.] alloc_word_desc + 0.12% awk bash [.] yyparse 0.11% beah-rhts-task libpython2.6.so.1.0 [.] 0x5560e 0.10% perf libc-2.12.so [.] __ctype_toupper_loc - 0.09% rhts-test-runne bash [.] maybe_make_export_env + 0.09% rhts-test-runne bash [.] 0x385a0 0.09% ps [kernel.kallsyms] [k] page_fault ... Note, running following to test perf endianity handling: test 1) - origin system: # perf record -a -- sleep 10 (any perf record will do) # perf report > report.origin # perf archive perf.data - copy the perf.data, report.origin and perf.data.tar.bz2 to a target system and run: # tar xjvf perf.data.tar.bz2 -C ~/.debug # perf report > report.target # diff -u report.origin report.target - the diff should produce no output (besides some white space stuff and possibly different date/TZ output) test 2) - origin system: # perf record -ag -fo /tmp/perf.data -- sleep 1 - mount origin system root to the target system on /mnt/origin - target system: # perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \ --kallsyms /mnt/origin/proc/kallsyms - complete perf.data header is displayed Signed-off-by: Jiri Olsa <jolsa@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1338380624-7443-3-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31perf symbols: Handle different endians properly during symbol loadJiri Olsa
Currently we dont care about the file object's endianness. It's possible we read buildid file object from different architecture than we are currentlly running on. So we need to care about properly reading such object's data - handle different endianness properly. Adding: needs_swap DSO field dso__swap_init function to initialize DSO's needs_swap DSO__SWAP to read the data with proper swaps Together with other endianity patches, this change fixies perf report discrepancies on origin and target systems as described in test 1 below, e.g. following perf report diff: ... 0.12% ps [kernel.kallsyms] [k] clear_page - 0.12% awk bash [.] alloc_word_desc + 0.12% awk bash [.] yyparse 0.11% beah-rhts-task libpython2.6.so.1.0 [.] 0x5560e 0.10% perf libc-2.12.so [.] __ctype_toupper_loc - 0.09% rhts-test-runne bash [.] maybe_make_export_env + 0.09% rhts-test-runne bash [.] 0x385a0 0.09% ps [kernel.kallsyms] [k] page_fault ... Note, running following to test perf endianity handling: test 1) - origin system: # perf record -a -- sleep 10 (any perf record will do) # perf report > report.origin # perf archive perf.data - copy the perf.data, report.origin and perf.data.tar.bz2 to a target system and run: # tar xjvf perf.data.tar.bz2 -C ~/.debug # perf report > report.target # diff -u report.origin report.target - the diff should produce no output (besides some white space stuff and possibly different date/TZ output) test 1) - origin system: # perf record -ag -fo /tmp/perf.data -- sleep 1 - mount origin system root to the target system on /mnt/origin - target system: # perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \ --kallsyms /mnt/origin/proc/kallsyms - complete perf.data header is displayed Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1338380624-7443-2-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31perf evlist: Pass third argument to ioctl explicitlyNamhyung Kim
The ioctl on perf event fd wants 3 arguments but we only passed 2. As the only user of the functions is perf record and it calls them for every event (regardless of group setting), just pass 0 for now. Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1338443506-25009-3-git-send-email-namhyung.kim@lge.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31perf tools: Update ioctl documentation for PERF_IOC_FLAG_GROUPNamhyung Kim
The ioctl interface of perf event fd receives 3 arguments to control event group behavior but it lacked documentation. Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1338443506-25009-2-git-send-email-namhyung.kim@lge.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31perf tools: Make --version show kernel version instead of pull req tagArnaldo Carvalho de Melo
Before: $ perf --version perf version perf.urgent.for.mingo.5.g37da28 After: $ perf --version perf version 3.4.8941.g37da28.dirty Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-vc9b4e6023iegz9kabr3yvyv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31perf tools: Check if callchain is corruptedNamhyung Kim
We faced segmentation fault on perf top -G at very high sampling rate due to a corrupted callchain. While the root cause was not revealed (I failed to figure it out), this patch tries to protect us from the segfault on such cases. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Sunjin Yang <fan4326@gmail.com> Link: http://lkml.kernel.org/r/1338443007-24857-2-git-send-email-namhyung.kim@lge.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31perf callchain: Make callchain cursors TLSNamhyung Kim
perf top -G has a race on callchain cursor between main thread and display thread. Since the callchain cursors are used locally make them thread-local data would solve the problem. Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Reported-by: Sunjin Yang <fan4326@gmail.com> Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Sunjin Yang <fan4326@gmail.com> Link: http://lkml.kernel.org/r/1338443007-24857-1-git-send-email-namhyung.kim@lge.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-30Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar. * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) perf ui browser: Stop using 'self' perf annotate browser: Read perf config file for settings perf config: Allow '_' in config file variable names perf annotate browser: Make feature toggles global perf annotate browser: The idx_asm field should be used in asm only view perf tools: Convert critical messages to ui__error() perf ui: Make --stdio default when TUI is not supported tools lib traceevent: Silence compiler warning on 32bit build perf record: Fix branch_stack type in perf_record_opts perf tools: Reconstruct event with modifiers from perf_event_attr perf top: Fix counter name fixup when fallbacking to cpu-clock perf tools: fix thread_map__new_by_pid_str() memory leak in error path perf tools: Do not use _FORTIFY_SOURCE when DEBUG=1 is specified tools lib traceevent: Fix signature of create_arg_item() tools lib traceevent: Use proper function parameter type tools lib traceevent: Fix freeing arg on process_dynamic_array() tools lib traceevent: Fix a possibly wrong memory dereference tools lib traceevent: Fix a possible memory leak tools lib traceevent: Allow expressions in __print_symbolic() fields perf evlist: Explicititely initialize input_name ...
2012-05-30perf tools: Fix pager on minimal-install embedded systemsAvik Sil
Some Distributions may lack "less" package being included by default, e.g., Linaro nano rootfs. In those cases use the portable "pager" command instead of "less". Signed-off-by: Avik Sil <avik.sil@linaro.org> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1338287725-26382-1-git-send-email-avik.sil@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-30perf tools: Fix make tarballsArnaldo Carvalho de Melo
The patch series that introduced the top level tools/ makefile and the libtraceevent broke this feature where files needed to build in a detached tarball were not included in the MANIFEST file and thus not included in the tarball. Fix it by adding the relevant files to the MANIFEST. Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/n/tip-z3mjj74927xvqwhlmu18kj80@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-30perf script: Fix regression in callchain dso nameDavid Ahern
$ perf script -i /tmp/perf.data ... gcc 13623 544315.062858: context-switches: ffffffff815f65c9 __schedule ([kernel.kallsyms]) ffffffff81087cea __cond_resched ([kernel.kallsyms]) ffffffff815f6b92 _cond_resched ([kernel.kallsyms]) ffffffff815fb87a do_page_fault ([kernel.kallsyms]) ffffffff815f8465 page_fault ([kernel.kallsyms]) 2b7a71ea0303 _dl_lookup_symbol_x ([kernel.kallsyms]) 2b7a71ea1eb5 _dl_relocate_object ([kernel.kallsyms]) 2b7a71e99b2e dl_main ([kernel.kallsyms]) 2b7a71eab7f4 _dl_sysdep_start ([kernel.kallsyms]) All DSO's in a callchain are printed as [kernel.kallsyms]. git bisect chased it to: 547a92e0aedb88129e7fbd804697a11949de2e5a is the first bad commit commit 547a92e0aedb88129e7fbd804697a11949de2e5a Author: Akihiro Nagai <akihiro.nagai.hw@hitachi.com> Date: Mon Jan 30 13:42:57 2012 +0900 perf script: Unify the expressions indicating "unknown" The perf script command uses various expressions to indicate "unknown". It is unfriendly for user scripts to parse it. So, this patch unifies the expressions to "[unknown]". Looks like a copy-paste in that the other references use al.map but this one should be node->map. With this patch you get: $ perf script -i /tmp/perf.data ... gcc 13623 544315.062858: context-switches: ffffffff815f65c9 __schedule ([kernel.kallsyms]) ffffffff81087cea __cond_resched ([kernel.kallsyms]) ffffffff815f6b92 _cond_resched ([kernel.kallsyms]) ffffffff815fb87a do_page_fault ([kernel.kallsyms]) ffffffff815f8465 page_fault ([kernel.kallsyms]) 2b7a71ea0303 _dl_lookup_symbol_x (/lib64/ld-2.14.90.so) 2b7a71ea1eb5 _dl_relocate_object (/lib64/ld-2.14.90.so) 2b7a71e99b2e dl_main (/lib64/ld-2.14.90.so) 2b7a71eab7f4 _dl_sysdep_start (/lib64/ld-2.14.90.so) Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Akihiro Nagai <akihiro.nagai.hw@hitachi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1338353906-60706-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-30perf stat: Initialize default events wrt exclude_{guest,host}Arnaldo Carvalho de Melo
When no event is specified the tools use perf_evlist__add_default(), that will call event_attr_init to initialize the KVM exclusion bits. When the change was made to the tools so that by default guest samples would be excluded, the changes were made just to the parsing routines and to perf_evlist__add_default(), not to perf_evlist__add_attrs, that is used so far just by perf stat to add multiple events, according to the level of detail specified. Recently the tools were changed to reconstruct the event name from all the details in perf_event_attr, not just from .type and .config, but taking into account all the feature bits (.exclude_{guest,host,user,kernel,etc}, .precise_ip, etc). That is when we noticed that the default for perf stat wasn't the one for the rest of the tools, i.e. the .exclude_guest bit wasn't being set. I.e. the default, that doesn't call event_attr_init was showing the :HG modifier: $ perf stat usleep 1 Performance counter stats for 'usleep 1': 0.942119 task-clock # 0.454 CPUs utilized 1 context-switches # 0.001 M/sec 0 CPU-migrations # 0.000 K/sec 126 page-faults # 0.134 M/sec 693,193 cycles:HG # 0.736 GHz [40.11%] 407,461 stalled-cycles-frontend:HG # 58.78% frontend cycles idle [72.29%] 365,403 stalled-cycles-backend:HG # 52.71% backend cycles idle 465,982 instructions:HG # 0.67 insns per cycle # 0.87 stalled cycles per insn 89,760 branches:HG # 95.275 M/sec 6,178 branch-misses:HG # 6.88% of all branches 0.002077228 seconds time elapsed While if one explicitely specifies the same events, which will make the parsing code to be called and thus event_attr_init is called: $ perf stat -e task-clock,context-switches,migrations,page-faults,cycles,stalled-cycles-frontend,stalled-cycles-backend,instructions,branches,branch-misses usleep 1 Performance counter stats for 'usleep 1': 1.040349 task-clock # 0.500 CPUs utilized 2 context-switches # 0.002 M/sec 0 CPU-migrations # 0.000 K/sec 127 page-faults # 0.122 M/sec 587,966 cycles # 0.565 GHz [13.18%] 459,167 stalled-cycles-frontend # 78.09% frontend cycles idle 390,249 stalled-cycles-backend # 66.37% backend cycles idle 504,006 instructions # 0.86 insns per cycle # 0.91 stalled cycles per insn 96,455 branches # 92.714 M/sec 6,522 branch-misses # 6.76% of all branches [96.12%] 0.002078681 seconds time elapsed Fix it by introducing a perf_evlist__add_default_attrs method that will call evlist_attr_init in all the perf_event_attr entries before adding the events. Reported-by: Ingo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-4eysr236r0pgiyum9epwxw7s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-30perf annotate browser: Fix help window entry for navigating to hottest lineArnaldo Carvalho de Melo
Its 'H', not 'h'. The later is for getting to the help window. Reported-by: Ingo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-7zvwphhm815y2zczoxgstzuf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-30perf report: Use the right symbol for annotationArnaldo Carvalho de Melo
In non symbolic views, i.e. --sort without "symbol", as in: perf report --sort comm We're segfaulting in the --tui because we're testing the symbol resolved and then trying to use the symbol on the histogram entry where we're coalescing all hits for a COMM, and the first hist_entry for a comm may have a NULL symbol, i.e. the RIP didn't resolve to any symbol. In this case we're segfaulting, fix it by testing against the symbol in the histogram entry. Reported-by: William Cohen <wcohen@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-8ylwubbcmu27ucc9ffrku3yv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-30Merge branch 'linus' into perf/urgentIngo Molnar
Merge back Linus's latest branch so that we pick up the uprobes changes. ( I tested this branch locally and while it's one from the middle of the merge window it's a good one to base further work off. ) Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-29perf ui browser: Stop using 'self'Arnaldo Carvalho de Melo
Stop using this python/OOP convention, doesn't really helps. Will do more from time to time till we get it cleaned up in all of /perf. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-5dyxyb8o0gf4yndk27kafbd1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-29perf annotate browser: Read perf config file for settingsArnaldo Carvalho de Melo
The defaults are: [annotate] hide_src_code = false use_offset = true jump_arrows = true show_nr_jumps = false Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-q4egci70rjgxh7bogbbfpcyf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-29perf config: Allow '_' in config file variable namesArnaldo Carvalho de Melo
For annotate I want to be able to have variables that are the same as the ones representing feature toggles. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-7rhhf6m0a72p2wja4tgv1itg@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-29perf annotate browser: Make feature toggles globalArnaldo Carvalho de Melo
So that when navigating to another function from a call site or when going to another annotation browser thru the main report/top browser the options (hide source code, jump arrows, jumpy lines, etc) remains the last ones selected. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-0h0tah1zj59p01581snjufne@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-29perf annotate browser: The idx_asm field should be used in asm only viewArnaldo Carvalho de Melo
When hide_src_view is true we can't use browser_disasm_line->idx, that takes into account also non asm lines, we must use browser_disasm_line->idx_asm instead, otherwise we may end up with an index after the number of entries, oops, fix it. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-o1szpyjh3z87yi0n6x0cr8uu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-29tools/vm/page-types.c: cleanupsUlrich Drepper
Compiling page-type.c with a recent compiler produces many warnings, mostly related to signed/unsigned comparisons. This patch cleans up most of them. One remaining warning is about an unused parameter. The <compiler.h> file doesn't define a __unused macro (or the like) yet. This can be addressed later. Signed-off-by: Ulrich Drepper <drepper@gmail.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29kbuild: install kernel-page-flags.hUlrich Drepper
Programs using /proc/kpageflags need to know about the various flags. The <linux/kernel-page-flags.h> provides them and the comments in the file indicate that it is supposed to be used by user-level code. But the file is not installed. Install the headers and mark the unstable flags as out-of-bounds. The page-type tool is also adjusted to not duplicate the definitions Signed-off-by: Ulrich Drepper <drepper@gmail.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29perf tools: Convert critical messages to ui__error()Namhyung Kim
There were places where use ui__warning (or even fprintf) to show critical messages. This patch converts them to ui__error so that the front-end code can implement appropriate behavior. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1338265382-6872-3-git-send-email-namhyung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-28perf ui: Make --stdio default when TUI is not supportedNamhyung Kim
The commit dc41b9b8f02db ("perf ui: Change fallback policy of setup_browser") changed default behavior of the function but missed setting the use_browser variable to 0 accidently. So perf report ends up doing nothing in such cases. Fix it. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1338216802-5675-1-git-send-email-namhyung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-26tools lib traceevent: Silence compiler warning on 32bit buildNamhyung Kim
The gcc complains about casting a pointer to unsigned long long directly: SUBDIR ../lib/traceevent/ CC FPIC event-parse.o CC FPIC trace-seq.o CC FPIC parse-filter.o /home/namhyung/project/linux/tools/lib/traceevent/parse-filter.c: In function ‘get_value’: /home/namhyung/project/linux/tools/lib/traceevent/parse-filter.c:1588: warning: cast from pointer to integer of different size CC FPIC parse-utils.o BUILD STATIC LIB libtraceevent.a Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1338003691-3141-1-git-send-email-namhyung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-25perf record: Fix branch_stack type in perf_record_optsStephane Eranian
The attr.branch_sample_type field is defined as u64 by the API. As such, we need to ensure the variable holding the value of the branch stack filters is also u64 otherwise we may lose bits in the future. Note also that the bogus definition of the field in perf_record_opts caused problems on big-endian PPC systems. Thanks to Anshuman Khandual for tracking the problem on PPC. Reported-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20120525211344.GA7729@quad Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-25perf tools: Reconstruct event with modifiers from perf_event_attrArnaldo Carvalho de Melo
The modifiers: k kernel space u user space h hypervisor G guest H host p, pp, ppp precision level (PEBS) that can be suffixed to an event were lost when tools used event_name() to reconstruct them from the perf_event_attr entries in a perf.data file. Fix it by following the defaults used for these modifiers in the current codebase, so: $ perf record -e instructions:u usleep 1 2> /dev/null $ perf evlist instructions:u $ perf record -e cycles:k usleep 1 2> /dev/null $ perf evlist cycles:k $ perf record -e cycles:kh usleep 1 2> /dev/null $ perf evlist cycles:kh $ perf record -e cache-misses:G usleep 1 2> /dev/null $ perf evlist cache-misses:G $ perf record -e cycles:ppk usleep 1 2> /dev/null $ perf evlist cycles:kpp $ Also works with 'top', 'report', etc. More work needed to cover tracepoints and software events while not dragging lots of baggage to the python binding, this is a minimal fix for v3.5. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-4hl5glle0hxlklw4usva1mkt@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-25perf top: Fix counter name fixup when fallbacking to cpu-clockArnaldo Carvalho de Melo
In 40491eaa "perf top: Update event name when falling back to cpu-clock" we freed counter->name but didn't reset it to NULL, then when setting it to the result of event_name(), event_name() would use the cached value, which by now was overwritten and thus we got garbage or a zero lenght string. Fix it by just freeing and setting counter->name to NULL, this way event_name() when called afterwards, will find the right counter name and cache it again. Found while trying 'cycles:pp' on a machine were :pp couldn't be honoured. Probably the best fallback here is to tell the user that that level of precision is not available on the PMU and then go removing 'p', levels of precision till we get to play 'cycles' and if even that fails, _then_ get to 'cpu-clock'. But that is the matter for another patch, this one just needs to fix the caching issue, which in the end will show 'cpu-clock' when tools ask for the event name being used, which clarifies things for the user, that will see that 'cycles:pp' or whatever not support event is not being used, some sort of fallback happened. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-w1neie2dqli89we1bzwkf4id@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-25perf tools: fix thread_map__new_by_pid_str() memory leak in error pathFranck Bui-Huu
The namelist array (including its content) was not freed if we fail to realloc a new 'threads' structure. Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com> Cc: David Ahern <dsahern@gmail.com> Link: http://lkml.kernel.org/r/1337952109-31995-1-git-send-email-fbuihuu@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-24Merge branch 'perf-uprobes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull user-space probe instrumentation from Ingo Molnar: "The uprobes code originates from SystemTap and has been used for years in Fedora and RHEL kernels. This version is much rewritten, reviews from PeterZ, Oleg and myself shaped the end result. This tree includes uprobes support in 'perf probe' - but SystemTap (and other tools) can take advantage of user probe points as well. Sample usage of uprobes via perf, for example to profile malloc() calls without modifying user-space binaries. First boot a new kernel with CONFIG_UPROBE_EVENT=y enabled. If you don't know which function you want to probe you can pick one from 'perf top' or can get a list all functions that can be probed within libc (binaries can be specified as well): $ perf probe -F -x /lib/libc.so.6 To probe libc's malloc(): $ perf probe -x /lib64/libc.so.6 malloc Added new event: probe_libc:malloc (on 0x7eac0) You can now use it in all perf tools, such as: perf record -e probe_libc:malloc -aR sleep 1 Make use of it to create a call graph (as the flat profile is going to look very boring): $ perf record -e probe_libc:malloc -gR make [ perf record: Woken up 173 times to write data ] [ perf record: Captured and wrote 44.190 MB perf.data (~1930712 $ perf report | less 32.03% git libc-2.15.so [.] malloc | --- malloc 29.49% cc1 libc-2.15.so [.] malloc | --- malloc | |--0.95%-- 0x208eb1000000000 | |--0.63%-- htab_traverse_noresize 11.04% as libc-2.15.so [.] malloc | --- malloc | 7.15% ld libc-2.15.so [.] malloc | --- malloc | 5.07% sh libc-2.15.so [.] malloc | --- malloc | 4.99% python-config libc-2.15.so [.] malloc | --- malloc | 4.54% make libc-2.15.so [.] malloc | --- malloc | |--7.34%-- glob | | | |--93.18%-- 0x41588f | | | --6.82%-- glob | 0x41588f ... Or: $ perf report -g flat | less # Overhead Command Shared Object Symbol # ........ ............. ............. .......... # 32.03% git libc-2.15.so [.] malloc 27.19% malloc 29.49% cc1 libc-2.15.so [.] malloc 24.77% malloc 11.04% as libc-2.15.so [.] malloc 11.02% malloc 7.15% ld libc-2.15.so [.] malloc 6.57% malloc ... The core uprobes design is fairly straightforward: uprobes probe points register themselves at (inode:offset) addresses of libraries/binaries, after which all existing (or new) vmas that map that address will have a software breakpoint injected at that address. vmas are COW-ed to preserve original content. The probe points are kept in an rbtree. If user-space executes the probed inode:offset instruction address then an event is generated which can be recovered from the regular perf event channels and mmap-ed ring-buffer. Multiple probes at the same address are supported, they create a dynamic callback list of event consumers. The basic model is further complicated by the XOL speedup: the original instruction that is probed is copied (in an architecture specific fashion) and executed out of line when the probe triggers. The XOL area is a single vma per process, with a fixed number of entries (which limits probe execution parallelism). The API: uprobes are installed/removed via /sys/kernel/debug/tracing/uprobe_events, the API is integrated to align with the kprobes interface as much as possible, but is separate to it. Injecting a probe point is privileged operation, which can be relaxed by setting perf_paranoid to -1. You can use multiple probes as well and mix them with kprobes and regular PMU events or tracepoints, when instrumenting a task." Fix up trivial conflicts in mm/memory.c due to previous cleanup of unmap_single_vma(). * 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) perf probe: Detect probe target when m/x options are absent perf probe: Provide perf interface for uprobes tracing: Fix kconfig warning due to a typo tracing: Provide trace events interface for uprobes tracing: Extract out common code for kprobes/uprobes trace events tracing: Modify is_delete, is_return from int to bool uprobes/core: Decrement uprobe count before the pages are unmapped uprobes/core: Make background page replacement logic account for rss_stat counters uprobes/core: Optimize probe hits with the help of a counter uprobes/core: Allocate XOL slots for uprobes use uprobes/core: Handle breakpoint and singlestep exceptions uprobes/core: Rename bkpt to swbp uprobes/core: Make order of function parameters consistent across functions uprobes/core: Make macro names consistent uprobes: Update copyright notices uprobes/core: Move insn to arch specific structure uprobes/core: Remove uprobe_opcode_sz uprobes/core: Make instruction tables volatile uprobes: Move to kernel/events/ uprobes/core: Clean up, refactor and improve the code ...
2012-05-24perf tools: Do not use _FORTIFY_SOURCE when DEBUG=1 is specifiedArnaldo Carvalho de Melo
As: make DEBUG=1 -C tools/perf disables optimizations and _FORTIFY_SOURCE in recent distros requires optimizations to be enabled, seen on a Fedora 17 system: [acme@Fedora17 linux]$ make DEBUG=1 O=/home/acme/git/build/perf/ -C tools/perf install In file included from /usr/include/sys/types.h:26:0, from /usr/include/libelf.h:53, from /usr/include/gelf.h:53, from /usr/include/elfutils/libdw.h:53, from <stdin>:2: /usr/include/features.h:314:4: error: #warning _FORTIFY_SOURCE requires compiling with optimization (-O) [-Werror=cpp Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-4ccyiebqju4uatm31ky7725b@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>