From: Linus Torvalds
Date: Wed, 4 Sep 2013 15:25:35 +0000 (-0700)
Subject: Merge branches 'perf-urgent-for-linus' and 'perf-core-for-linus' of git://git.kernel...
X-Git-Tag: firefly_0821_release~176^2~5456
X-Git-Url: http://demsky.eecs.uci.edu/git/?a=commitdiff_plain;h=0d99b7087324978b09b59d8c7a0736214c4a42b1;p=firefly-linux-kernel-4.4.55.git

Merge branches 'perf-urgent-for-linus' and 'perf-core-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf changes from Ingo Molnar:
 "As a first remark I'd like to point out that the obsolete '-f'
  (--force) option, which has not done anything for several releases,
  has been removed from 'perf record' and related utilities.  Everyone
  please update muscle memory accordingly! :-)

  Main changes on the perf kernel side:

   - Performance optimizations:
      . for trace events, by Steve Rostedt.
      . for time values, by Peter Zijlstra

   - New hardware support:
      . for Intel Silvermont (22nm Atom) CPUs, by Zheng Yan
      . for Intel SNB-EP uncore PMUs, by Zheng Yan

   - Enhanced hardware support:
      . for Intel uncore PMUs: add filter support for QPI boxes, by
        Zheng Yan

   - Core perf events code enhancements and fixes:
      . for full-nohz feature handling, by Frederic Weisbecker
      . for group events, by Jiri Olsa
      . for call chains, by Frederic Weisbecker
      . for event stream parsing, by Adrian Hunter

   - New ABI details:
      . Add attr->mmap2 attribute, by Stephane Eranian
      . Add PERF_EVENT_IOC_ID ioctl to return event ID, by Jiri Olsa
      . Export u64 time_zero on the mmap header page to allow TSC
        calculation, by Adrian Hunter
      . Add dummy software event, by Adrian Hunter.
      . Add a new PERF_SAMPLE_IDENTIFIER to make samples always
        parseable, by Adrian Hunter.
      . Make Power7 events available via sysfs, by Runzhen Wang.

   - Code cleanups and refactorings:
      . for nohz-full, by Frederic Weisbecker
      . for group events, by Jiri Olsa

   - Documentation updates:
      . for perf_event_type, by Peter Zijlstra

  Main changes on the perf tooling side (some of these tooling changes
  utilize the above kernel side changes):

   - Lots of 'perf trace' enhancements:

      . Make 'perf trace' command line arguments consistent with 'perf
        record', by David Ahern.

      . Allow specifying syscalls a la strace, by Arnaldo Carvalho de
        Melo.

      . Add --verbose and -o/--output options, by Arnaldo Carvalho de
        Melo.

      . Support ! in -e expressions, to filter a list of syscalls, by
        Arnaldo Carvalho de Melo.

      . Arg formatting improvements to allow masking arguments in
        syscalls such as futex and open, where some arguments are
        ignored and thus should not be printed depending on other
        args, by Arnaldo Carvalho de Melo.

      . Beautify open, openat, open_by_handle_at, lseek and futex
        syscalls, by Arnaldo Carvalho de Melo.

      . Add option to analyze events in a file versus live, so that
        one can do:

          [root@zoo ~]# perf record -a -e raw_syscalls:* sleep 1
          [ perf record: Woken up 0 times to write data ]
          [ perf record: Captured and wrote 25.150 MB perf.data (~1098836 samples) ]
          [root@zoo ~]# perf trace -i perf.data -e futex --duration 1
              17.799 ( 1.020 ms): 7127 futex(uaddr: 0x7fff3f6c6674, op: 393, val: 1, utime: 0x7fff3f6c6470, ua
             113.344 (95.429 ms): 7127 futex(uaddr: 0x7fff3f6c6674, op: 393, val: 1, utime: 0x7fff3f6c6470, uaddr2: 0x7fff3f6c6648, val3: 4294967
             133.778 ( 1.042 ms): 18004 futex(uaddr: 0x7fff3f6c6674, op: 393, val: 1, utime: 0x7fff3f6c6470, uaddr2: 0x7fff3f6c6648, val3: 429496
          [root@zoo ~]#

        By David Ahern.

      . Honor target pid / tid options when analyzing a file, by David
        Ahern.

      . Introduce better formatting of syscall arguments, including so
        far beautifiers for mmap, madvise, syscall return values, by
        Arnaldo Carvalho de Melo.

      . Handle HUGEPAGE defines in the mmap beautifier, by David Ahern.

   - 'perf report/top' enhancements:

      . Do annotation using /proc/kcore and /proc/kallsyms when
        available, removing the forced need for a vmlinux file for
        kernel assembly annotation.  This also improves this use case
        because vmlinux has just the initial kernel image, not what is
        actually in use after various code patchings by things like
        alternatives.  By Adrian Hunter.

      . Add --ignore-callees= option to collapse undesired parts of
        call graphs, by Greg Price.

      . Simplify symbol filtering by doing it at machine class level,
        by Adrian Hunter.

      . Add support for callchains in the gtk UI, by Namhyung Kim.

      . Add --objdump option to 'perf top', by Sukadev Bhattiprolu.

   - 'perf kvm' enhancements:

      . Add option to print only events that exceed a specified time
        duration, by David Ahern.

      . Improve stack trace printing, by David Ahern.

      . Update documentation of the live command, by David Ahern.

      . Add perf kvm stat live mode that combines aspects of 'perf kvm
        stat' record and report, by David Ahern.

      . Add option to analyze specific VM in perf kvm stat report, by
        David Ahern.

      . Do not require /lib/modules/* on a guest, by Jason Wessel.

   - 'perf script' enhancements:

      . Fix symbol offset computation for some dsos, by David Ahern.

      . Fix named threads support, by David Ahern.

      . Don't install scripting files when perl/python support is
        disabled, by Arnaldo Carvalho de Melo.

   - 'perf test' enhancements:

      . Add various improvements and fixes to the "vmlinux matches
        kallsyms" 'perf test' entry, related to the /proc/kcore
        annotation feature.  By Adrian Hunter.

      . Add sample parsing test, by Adrian Hunter.

      . Add test for reading object code, by Adrian Hunter.

      . Add attr record group sampling test, by Jiri Olsa.

      . Misc testing infrastructure improvements and other details, by
        Jiri Olsa.

   - 'perf list' enhancements:

      . Skip unsupported hardware events, by Namhyung Kim.

      . List pmu events, by Andi Kleen.

   - 'perf diff' enhancements:

      . Add support for more than two files comparison, by Jiri Olsa.

   - 'perf sched' enhancements:

      . Various improvements, including removing reliance on some
        scheduler tracepoints that provide the same information as the
        PERF_RECORD_{FORK,EXIT} events.  By David Ahern.

      . Remove odd build stall by moving a large struct initialization
        from a local variable to a global one, by Namhyung Kim.

   - 'perf stat' enhancements:

      . Add --initial-delay option to skip measuring for a defined
        startup phase, by Andi Kleen.

   - Generic perf tooling infrastructure/plumbing changes:

      . Tidy up sample parsing validation, by Adrian Hunter.

      . Fix up jobserver setup in libtraceevent Makefile, by Arnaldo
        Carvalho de Melo.

      . Debug improvements, by Adrian Hunter.

      . Fix correlation of samples coming after PERF_RECORD_EXIT event,
        by David Ahern.

      . Improve robustness of the topology parsing code, by Stephane
        Eranian.

      . Add group leader sampling, which allows just one event in a
        group to sample while the other events just have their values
        read, by Jiri Olsa.

      . Add support for a new modifier "D", which requests that the
        event, or group of events, be pinned to the PMU.  By Michael
        Ellerman.

      . Support callchain sorting based on addresses, by Andi Kleen.

      . Prep work for multi perf data file storage, by Jiri Olsa.

      . libtraceevent cleanups, by Namhyung Kim.

  And lots and lots of other fixes and code reorganizations that did not
  make it into the list, see the shortlog, diffstat and the Git log for
  details!"
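
The time_zero export listed under "New ABI details" above is what lets a
tool convert between TSC values and perf timestamps.  A minimal sketch of
that arithmetic follows, assuming the three fields have already been
copied out of the mmap header page and that the kernel reports the
capability as available.  struct tsc_conv and both function names are
hypothetical and exist only for this illustration; the formulas follow
the ones documented alongside struct perf_event_mmap_page in the uapi
perf_event.h header.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical holder for the conversion fields a tool would copy out of
 * the perf mmap header page.  Not a kernel or perf tool structure.
 */
struct tsc_conv {
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_zero;
};

/* Convert a TSC cycle count to a perf timestamp in nanoseconds. */
uint64_t tsc_to_perf_time(uint64_t cyc, const struct tsc_conv *tc)
{
	/* Split cyc so the multiplication by time_mult keeps precision. */
	uint64_t quot = cyc >> tc->time_shift;
	uint64_t rem  = cyc & (((uint64_t)1 << tc->time_shift) - 1);

	return tc->time_zero + quot * tc->time_mult +
	       ((rem * tc->time_mult) >> tc->time_shift);
}

/* Convert a perf timestamp in nanoseconds back to a TSC cycle count. */
uint64_t perf_time_to_tsc(uint64_t ns, const struct tsc_conv *tc)
{
	uint64_t t, quot, rem;

	/* A zero multiplier would mean the fields were never filled in. */
	assert(tc->time_mult != 0);

	t    = ns - tc->time_zero;
	quot = t / tc->time_mult;
	rem  = t % tc->time_mult;

	return (quot << tc->time_shift) +
	       (rem << tc->time_shift) / tc->time_mult;
}
```

With time_shift = 10 and time_mult = 1024 one cycle maps to exactly one
nanosecond, which makes the round trip easy to check by hand.  The sketch
does not guard against overflow for very large cycle counts.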

[ Also merge a leftover from the 3.11 cycle ]

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Prevent race in unthrottling code

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (237 commits)
  perf trace: Tell arg formatters the arg index
  perf trace: Add beautifier for open's flags arg
  perf trace: Add beautifier for lseek's whence arg
  perf tools: Fix symbol offset computation for some dsos
  perf list: Skip unsupported events
  perf tests: Add 'keep tracking' test
  perf tools: Add support for PERF_COUNT_SW_DUMMY
  perf: Add a dummy software event to keep tracking
  perf trace: Add beautifier for futex 'operation' parm
  perf trace: Allow syscall arg formatters to mask args
  perf: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node()
  perf: Export struct perf_branch_entry to userspace
  perf: Add attr->mmap2 attribute to an event
  perf/x86: Add Silvermont (22nm Atom) support
  perf/x86: use INTEL_UEVENT_EXTRA_REG to define MSR_OFFCORE_RSP_X
  perf trace: Handle missing HUGEPAGE defines
  perf trace: Honor target pid / tid options when analyzing a file
  perf trace: Add option to analyze events in a file versus live
  perf evlist: Add tracepoint lookup by name
  perf tests: Add a sample parsing test
  ...
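
The kernel/events/core.c hunk of this merge moves the teardown
bookkeeping out of free_event() into unaccount_event(), with the per-CPU
part split off into unaccount_event_cpu().  A rough user-space model of
that split is sketched here.  Everything prefixed toy_ is hypothetical:
plain ints stand in for the kernel's atomics and static keys, only a few
of the counters are modelled, and toy_account_event() is an assumed
mirror image added so the model can be exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 4	/* hypothetical CPU count for the model */

/* Hypothetical stand-in for the few struct perf_event fields used here. */
struct toy_event {
	struct toy_event *parent;	/* inherited events are not accounted */
	int cpu;			/* index into the per-CPU counters */
	bool attach_task;		/* models PERF_ATTACH_TASK */
	bool mmap;			/* models attr.mmap */
	bool comm;			/* models attr.comm */
	bool cgroup;			/* models is_cgroup_event() */
};

int toy_sched_events, toy_nr_mmap_events, toy_nr_comm_events;
int toy_cgroup_events[TOY_NR_CPUS];

/* Apply delta to every counter this event contributes to. */
static void toy_adjust(const struct toy_event *e, int delta)
{
	if (e->parent)
		return;

	/* Global part, corresponding to unaccount_event(). */
	if (e->attach_task)
		toy_sched_events += delta;
	if (e->mmap)
		toy_nr_mmap_events += delta;
	if (e->comm)
		toy_nr_comm_events += delta;
	if (e->cgroup)
		toy_sched_events += delta;

	/* Per-CPU part, corresponding to unaccount_event_cpu(). */
	if (e->cgroup) {
		assert(e->cpu >= 0 && e->cpu < TOY_NR_CPUS);
		toy_cgroup_events[e->cpu] += delta;
	}
}

void toy_account_event(const struct toy_event *e)
{
	toy_adjust(e, +1);
}

void toy_unaccount_event(const struct toy_event *e)
{
	toy_adjust(e, -1);
}
```

Keeping the increments and decrements in one place, as this model does,
is one way to make it obvious that teardown undoes exactly what setup
did.  The kernel code in the diff keeps them as separate functions.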
---

0d99b7087324978b09b59d8c7a0736214c4a42b1
diff --cc kernel/events/core.c
index 9300f5226077,258eaaffe95a,c7ee497c39a7..2207efc941d1
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@@@ -3131,36 -3128,36 -3129,63 +3132,63 @@@@ static void free_event_rcu(struct rcu_h
   static void ring_buffer_put(struct ring_buffer *rb);
   static void ring_buffer_detach(struct perf_event *event, struct ring_buffer *rb);
-- static void free_event(struct perf_event *event)
++ static void unaccount_event_cpu(struct perf_event *event, int cpu)
   {
-- 	irq_work_sync(&event->pending);
++ 	if (event->parent)
++ 		return;
++
++ 	if (has_branch_stack(event)) {
++ 		if (!(event->attach_state & PERF_ATTACH_TASK))
++ 			atomic_dec(&per_cpu(perf_branch_stack_events, cpu));
++ 	}
++ 	if (is_cgroup_event(event))
++ 		atomic_dec(&per_cpu(perf_cgroup_events, cpu));
++ }
+
++ static void unaccount_event(struct perf_event *event)
++ {
++ 	if (event->parent)
++ 		return;
++
++ 	if (event->attach_state & PERF_ATTACH_TASK)
++ 		static_key_slow_dec_deferred(&perf_sched_events);
++ 	if (event->attr.mmap || event->attr.mmap_data)
++ 		atomic_dec(&nr_mmap_events);
++ 	if (event->attr.comm)
++ 		atomic_dec(&nr_comm_events);
++ 	if (event->attr.task)
++ 		atomic_dec(&nr_task_events);
++ 	if (event->attr.freq)
++ 		atomic_dec(&nr_freq_events);
++ 	if (is_cgroup_event(event))
++ 		static_key_slow_dec_deferred(&perf_sched_events);
++ 	if (has_branch_stack(event))
++ 		static_key_slow_dec_deferred(&perf_sched_events);
++
++ 	unaccount_event_cpu(event, event->cpu);
++ }
+
++ static void __free_event(struct perf_event *event)
++ {
   	if (!event->parent) {
-- 		if (event->attach_state & PERF_ATTACH_TASK)
-- 			static_key_slow_dec_deferred(&perf_sched_events);
-- 		if (event->attr.mmap || event->attr.mmap_data)
-- 			atomic_dec(&nr_mmap_events);
-- 		if (event->attr.comm)
-- 			atomic_dec(&nr_comm_events);
-- 		if (event->attr.task)
-- 			atomic_dec(&nr_task_events);
   		if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)
   			put_callchain_buffers();
-- 		if (is_cgroup_event(event)) {
-- 			atomic_dec(&per_cpu(perf_cgroup_events, event->cpu));
-- 			static_key_slow_dec_deferred(&perf_sched_events);
-- 		}
--
-- 		if (has_branch_stack(event)) {
-- 			static_key_slow_dec_deferred(&perf_sched_events);
-- 			/* is system-wide event */
-- 			if (!(event->attach_state & PERF_ATTACH_TASK)) {
-- 				atomic_dec(&per_cpu(perf_branch_stack_events,
-- 						    event->cpu));
-- 			}
-- 		}
   	}
++ 	if (event->destroy)
++ 		event->destroy(event);
++
++ 	if (event->ctx)
++ 		put_ctx(event->ctx);
++
++ 	call_rcu(&event->rcu_head, free_event_rcu);
++ }
++ static void free_event(struct perf_event *event)
++ {
++ 	irq_work_sync(&event->pending);
++
++ 	unaccount_event(event);
++
   	if (event->rb) {
   		struct ring_buffer *rb;