firefly-linux-kernel-4.4.55.git
10 years agosched: Fix asymmetric scheduling for POWER7
Vaidyanathan Srinivasan [Wed, 30 Oct 2013 03:12:42 +0000 (08:42 +0530)]
sched: Fix asymmetric scheduling for POWER7

Asymmetric scheduling within a core is a scheduler loadbalancing
feature that is triggered when SD_ASYM_PACKING flag is set.  The goal
for the load balancer is to move tasks to lower order idle SMT threads
within a core on a POWER7 system.

In nohz_kick_needed(), we intend to check if our sched domain (core)
is completely busy or we have idle cpu.

The following check for SD_ASYM_PACKING:

    (cpumask_first_and(nohz.idle_cpus_mask, sched_domain_span(sd)) < cpu)

already covers the case of checking if the domain has an idle cpu,
because cpumask_first_and() will not yield any set bits if this domain
has no idle cpu.

Hence, nr_busy check against group weight can be removed.

Reported-by: Michael Neuling <michael.neuling@au1.ibm.com>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: vincent.guittot@linaro.org
Cc: bitbucket@online.de
Cc: benh@kernel.crashing.org
Cc: anton@samba.org
Cc: Morten.Rasmussen@arm.com
Cc: pjt@google.com
Link: http://lkml.kernel.org/r/20131030031242.23426.13019.stgit@preeti.in.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoperf/x86/intel: Add Ivy Bridge-EP uncore IRP box support
Yan, Zheng [Thu, 31 Oct 2013 05:36:55 +0000 (13:36 +0800)]
perf/x86/intel: Add Ivy Bridge-EP uncore IRP box support

Unlike other uncore boxes, IRP boxes live in PCI buses with no UBOX
device. For PCI bus without UBOX device, we find the next bus that
has UBOX device and use its 'bus to socket' mapping.

Besides the counter/control registers in IRP boxes are not properly
aligned.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: eranian@google.com
Cc: "Yan Zheng" <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1383197815-17706-2-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoperf/x86/intel/uncore: Add filter support for IvyBridge-EP QPI boxes
Yan, Zheng [Thu, 31 Oct 2013 05:36:54 +0000 (13:36 +0800)]
perf/x86/intel/uncore: Add filter support for IvyBridge-EP QPI boxes

The encoding for filter registers of IvyBridge-EP uncore QPI boxes is
completely the same as SandyBridge-EP.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: eranian@google.com
Cc: "Yan Zheng" <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1383197815-17706-1-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoperf: Factor out strncpy() in perf_event_mmap_event()
Oleg Nesterov [Thu, 17 Oct 2013 18:24:17 +0000 (20:24 +0200)]
perf: Factor out strncpy() in perf_event_mmap_event()

While this is really minor, but strncpy() does the unnecessary
zero-padding till the end of tmp[16] and it is called every time
we are going to use the string literal.

Turn these strncpy()'s into the single strlcpy() under the new
label, saves 72 bytes.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131017182417.GA17753@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agotools/perf: Add required memory barriers
Peter Zijlstra [Wed, 30 Oct 2013 10:42:46 +0000 (11:42 +0100)]
tools/perf: Add required memory barriers

To match patch bf378d341e48 ("perf: Fix perf ring buffer memory
ordering") change userspace to also adhere to the ordering outlined.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Link: http://lkml.kernel.org/r/20131030104246.GH16117@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoperf: Fix arch_perf_out_copy_user default
Peter Zijlstra [Wed, 30 Oct 2013 20:16:22 +0000 (21:16 +0100)]
perf: Fix arch_perf_out_copy_user default

The arch_perf_output_copy_user() default of
__copy_from_user_inatomic() returns bytes not copied, while all other
argument functions given DEFINE_OUTPUT_COPY() return bytes copied.

Since copy_from_user_nmi() is the odd duck out by returning bytes
copied where all other *copy_{to,from}* functions return bytes not
copied, change it over and ammend DEFINE_OUTPUT_COPY() to expect bytes
not copied.

Oddly enough DEFINE_OUTPUT_COPY() already returned bytes not copied
while expecting its worker functions to return bytes copied.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: will.deacon@arm.com
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20131030201622.GR16117@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoperf: Update a stale comment
Peter Zijlstra [Thu, 31 Oct 2013 16:41:23 +0000 (17:41 +0100)]
perf: Update a stale comment

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-9s5mze78gmlz19agt39i8rii@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoperf: Optimize perf_output_begin() -- address calculation
Peter Zijlstra [Thu, 31 Oct 2013 16:36:25 +0000 (17:36 +0100)]
perf: Optimize perf_output_begin() -- address calculation

Rewrite the handle address calculation code to be clearer.

Saves 8 bytes on x86_64-defconfig.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-3trb2n2henb9m27tncef3ag7@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoperf: Optimize perf_output_begin() -- lost_event case
Peter Zijlstra [Thu, 31 Oct 2013 16:29:29 +0000 (17:29 +0100)]
perf: Optimize perf_output_begin() -- lost_event case

Avoid touching the lost_event and sample_data cachelines twince. Its
not like we end up doing less work, but it might help to keep all
accesses to these cachelines in one place.

Due to code shuffle, this looses 4 bytes on x86_64-defconfig.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-zfxnc58qxj0eawdoj31hhupv@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoperf: Optimize perf_output_begin()
Peter Zijlstra [Thu, 31 Oct 2013 16:25:38 +0000 (17:25 +0100)]
perf: Optimize perf_output_begin()

There's no point in re-doing the memory-barrier when we fail the
cmpxchg(). Also placing it after the space reservation loop makes it
clearer it only separates the userpage->tail read from the data
stores.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-c19u6egfldyx86tpyc3zgkw9@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoperf: Add unlikely() to the ring-buffer code
Peter Zijlstra [Thu, 31 Oct 2013 16:20:25 +0000 (17:20 +0100)]
perf: Add unlikely() to the ring-buffer code

Add unlikely() annotations to 'slow' paths:

When having a sampling event but no output buffer; you have bigger
issues -- also the bail is still faster than actually doing the work.

When having a sampling event but a control page only buffer, you have
bigger issues -- again the bail is still faster than actually doing
work.

Optimize for the case where you're not loosing events -- again, not
doing the work is still faster but make sure that when you have to
actually do work its as fast as possible.

The typical watermark is 1/2 the buffer size, so most events will not
take this path.

Shrinks perf_output_begin() by 16 bytes on x86_64-defconfig.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-wlg3jew3qnutm8opd0hyeuwn@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoperf: Simplify the ring-buffer code
Peter Zijlstra [Thu, 31 Oct 2013 09:19:59 +0000 (10:19 +0100)]
perf: Simplify the ring-buffer code

By using CIRC_SPACE() we can obviate the need for perf_output_space().

Shrinks the size of perf_output_begin() by 17 bytes on
x86_64-defconfig.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-vtb0xb0llebmsdlfn1v5vtfj@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agometag: handle low level kicks directly
James Hogan [Tue, 5 Nov 2013 14:42:03 +0000 (14:42 +0000)]
metag: handle low level kicks directly

Kick interrupts trigger the LWK (low level kick) signal, usually handled
by the __TBIDoStdLWK() function which is the only handler inherited from
the bootloader. The LWK signal is converted either to a SWK (plain
software kick) or a SWS (software kick with an attached message).

Linux has kick_handler() to handle SWK and call registered kick handlers
(IPIs and inter-thread comms), but SWS is as far as I'm aware unused
with Linux.

Therefore remove that abstraction and have Linux handle LWK directly.
This will reduce kick latency slightly, and reduce our dependence on the
bootloader, which makes it easier to directly boot a kernel in QEMU
(particularly for SMP).

Signed-off-by: James Hogan <james.hogan@imgtec.com>
10 years agoarm64: KVM: vgic: byteswap GICv2 access on world switch if BE
Marc Zyngier [Tue, 5 Nov 2013 18:29:46 +0000 (18:29 +0000)]
arm64: KVM: vgic: byteswap GICv2 access on world switch if BE

Ensure that accesses to the GICH_* registers are byteswapped
when the kernel is compiled as big-endian.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
10 years agoarm64: KVM: initialize HYP mode following the kernel endianness
Marc Zyngier [Tue, 5 Nov 2013 18:29:45 +0000 (18:29 +0000)]
arm64: KVM: initialize HYP mode following the kernel endianness

Force SCTLR_EL2.EE to 1 if the kernel is compiled as BE.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
10 years agomicroblaze: Remove unused NO_MMU Kconfig parameter
Michael Opdenacker [Mon, 4 Nov 2013 08:30:18 +0000 (09:30 +0100)]
microblaze: Remove unused NO_MMU Kconfig parameter

This removes the NO_MMU Kconfig parameter,
which was no longer used anywhere in the source code
and Makefiles.

This also updates a comment refering to this parameter.

Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
10 years agox86/cpu: Increase max CPU count to 8192
Josh Boyer [Tue, 5 Nov 2013 14:38:16 +0000 (09:38 -0500)]
x86/cpu: Increase max CPU count to 8192

The MAXSMP option is intended to enable silly large numbers of
CPUs for testing purposes.  The current value of 4096 isn't very
silly any longer as there are actual SGI machines that approach
6096 CPUs when taking HT into account.

Increase the value to a nice round 8192 to account for this and
allow for short term future increases.

Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Cc: prarit@redhat.com
Cc: Russ Anderson <rja@sgi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20131105143816.GK9944@hansolo.jdub.homelinux.org
[ Tweaked it so that MAXSMP simply sets the maximum of the normal range. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agox86/cpu: Allow higher NR_CPUS values
Josh Boyer [Tue, 5 Nov 2013 14:37:29 +0000 (09:37 -0500)]
x86/cpu: Allow higher NR_CPUS values

The current range for SMP configs is 2 - 512 CPUs, or a full
4096 in the case of MAXSMP.  There are machines that have 1024
CPUs in them today and configuring a kernel for that means you
are forced to set MAXSMP.  This adds additional unnecessary
overhead.  While that overhead might be considered tiny for
large machines, it isn't necessarily so if you are building a
kernel that runs across a wide variety of machines.

To cover the range of more common machines today, we allow
NR_CPUS to be up to 4096 when CPUMASK_OFFSTACK is enabled.

Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Cc: prarit@redhat.com
Cc: Russ Anderson <rja@sgi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20131105143728.GJ9944@hansolo.jdub.homelinux.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agox86/cpu: Always print SMP information in /proc/cpuinfo
HATAYAMA Daisuke [Mon, 4 Nov 2013 17:15:48 +0000 (02:15 +0900)]
x86/cpu: Always print SMP information in /proc/cpuinfo

Currently show_cpuinfo_core() displays cpu core information only if
the number of threads per a whole cores is 2 or larger.

However, this condition doesn't care about the number of
sockets. For example, this condition doesn't hold on systems
with two logical cpus consisting of two sockets and a single
core on each socket - yet the topology information would be
interesting to see in that case as well.

I don't know whether or not there are processors in real world
by which such configurations are possible, but at least on
vitual machine environments, such configuration can occur,
typically when no explicit SMP information is provided in
advance.

For example, on qemu/KVM, SMP information is specified via -smp
command-line option, more specifically, its syntax is:

  -smp n[,cores=cores][,threads=threads][,sockets=sockets][,maxcpus=maxcpus]

If this is not specified, qemu tells configuration with
n-sockets, 1-core and 1-thread to the guest machine, on which
guest, MP information is not displayed in /proc/cpuinfo.

I saw this situation on VMWare guest environment, too.

To fix this issue, this patch simply removes the condition
because this information is useful even if there's only 1
thread.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/5277D644.4090707@jp.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agosched: Move completion code from core.c to completion.c
Peter Zijlstra [Fri, 4 Oct 2013 20:06:53 +0000 (22:06 +0200)]
sched: Move completion code from core.c to completion.c

Completions already have their own header file: linux/completion.h
Move the implementation out of kernel/sched/core.c and into its own
file: kernel/sched/completion.c.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-x2y49rmxu5dljt66ai2lcfuw@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agosched: Move wait code from core.c to wait.c
Peter Zijlstra [Fri, 4 Oct 2013 15:24:35 +0000 (17:24 +0200)]
sched: Move wait code from core.c to wait.c

For some reason only the wait part of the wait api lives in
kernel/sched/wait.c and the wake part still lives in kernel/sched/core.c;
ammend this.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-ftycee88naznulqk7ei5mbci@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agosched: Move wait.c into kernel/sched/
Peter Zijlstra [Thu, 31 Oct 2013 17:07:08 +0000 (18:07 +0100)]
sched: Move wait.c into kernel/sched/

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-5q5yqvdaen0rmapwloeaotx3@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoMerge tag 'v3.12' into x86/cpu, to refresh the branch before queueing up more changes
Ingo Molnar [Wed, 6 Nov 2013 05:50:21 +0000 (06:50 +0100)]
Merge tag 'v3.12' into x86/cpu, to refresh the branch before queueing up more changes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Wed, 6 Nov 2013 05:28:23 +0000 (06:28 +0100)]
Merge tag 'perf-core-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

  * Check maximum frequency rate for record/top, emitting better error
    messages, from Jiri Olsa.

  * Disable live kvm command if timerfd is not supported, from David Ahern.

  * Add usage to 'perf list', from David Ahern.

  * Fix detection of non-core features, from David Ahern.

  * Consolidate __hists__add_*entry(), cleanup from Namhyung Kim.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoARC: [SMP] Fix build failures for large NR_CPUS
Vineet Gupta [Fri, 1 Nov 2013 05:16:40 +0000 (10:46 +0530)]
ARC: [SMP] Fix build failures for large NR_CPUS

ST.as only takes S9 (255) for offset. This was going out of range when
accessing a task_struct field with 4k NR_CPUS (due to 128b of coumaks
itself in there).

Workaround by using an intermediate register to do the address scaling.

There is some duplication of fix for ctx_sw.c and ctx_sw_asm.S however
given that C version will go away soon I'm not bothering to factor out
the common code.

Reported-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoARC: [SMP] enlarge possible NR_CPUS
Noam Camus [Mon, 3 Jun 2013 12:19:59 +0000 (15:19 +0300)]
ARC: [SMP] enlarge possible NR_CPUS

Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoARC: [SMP] TLB flush
Vineet Gupta [Sun, 27 Oct 2013 09:19:02 +0000 (14:49 +0530)]
ARC: [SMP] TLB flush

- Add mm_cpumask setting (aggregating only, unlike some other arches)
  used to restrict the TLB flush cross-calling

- cross-calling versions of TLB flush routines (thanks to Noam)

Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoARC: [SMP] ASID allocation
Vineet Gupta [Fri, 23 Aug 2013 13:46:34 +0000 (19:16 +0530)]
ARC: [SMP] ASID allocation

-Track a Per CPU ASID counter
-mm-per-cpu ASID (multiple threads, or mm migrated around)

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoarc: export symbol for pm_power_off in reset.c
Chen Gang [Mon, 28 Oct 2013 03:49:47 +0000 (11:49 +0800)]
arc: export symbol for pm_power_off in reset.c

Need export symbol for it, or can not pass compiling, the related error
with allmodconfig:

    MODPOST 2994 modules
  ERROR: "pm_power_off" [drivers/mfd/retu-mfd.ko] undefined!
  ERROR: "pm_power_off" [drivers/char/ipmi/ipmi_poweroff.ko] undefined!

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoarc: export symbol for save_stack_trace() in stacktrace.c
Chen Gang [Mon, 28 Oct 2013 03:00:38 +0000 (11:00 +0800)]
arc: export symbol for save_stack_trace() in stacktrace.c

Need export its symbol just like other architectures done, or can not
pass compiling with allmodconfig, the related error:

    MODPOST 2994 modules
  ERROR: "save_stack_trace" [kernel/backtracetest.ko] undefined!
  ERROR: "save_stack_trace" [drivers/md/persistent-data/dm-persistent-data.ko] undefined!

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoarc: remove '__init' for get_hw_config_num_irq()
Chen Gang [Wed, 23 Oct 2013 03:02:51 +0000 (11:02 +0800)]
arc: remove '__init' for get_hw_config_num_irq()

get_hw_config_num_irq() may be called by normal iss_model_init_smp()
which is a function pointer for 'init_smp' which may be called by
first_lines_of_secondary() which also need be normal too.

The related warning (with allmodconfig):

    MODPOST vmlinux.o
  WARNING: vmlinux.o(.text+0x5814): Section mismatch in reference from the function iss_model_init_smp() to the function .init.text:get_hw_config_num_irq()
  The function iss_model_init_smp() references
  the function __init get_hw_config_num_irq().
  This is often because iss_model_init_smp lacks a __init
  annotation or the annotation of get_hw_config_num_irq is wrong.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
10 years agoarc: remove '__init' for first_lines_of_secondary()
Chen Gang [Wed, 23 Oct 2013 02:16:38 +0000 (10:16 +0800)]
arc: remove '__init' for first_lines_of_secondary()

first_lines_of_secondary() is a '__init' function, but it may be called
by __cpu_up() by _cpu_up() by cpu_up() which is a normal export symbol
function. So recommend to remove '__init'.

The related warning (with allmodconfig):

    MODPOST vmlinux.o
  WARNING: vmlinux.o(.text+0x315c): Section mismatch in reference from the function __cpu_up() to the function .init.text:first_lines_of_secondary()
  The function __cpu_up() references
  the function __init first_lines_of_secondary().
  This is often because __cpu_up lacks a __init
  annotation or the annotation of first_lines_of_secondary is wrong.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
10 years agoarc: remove '__init' for setup_processor() and arc_init_IRQ()
Chen Gang [Wed, 23 Oct 2013 02:12:05 +0000 (10:12 +0800)]
arc: remove '__init' for setup_processor() and arc_init_IRQ()

They haven't '__init' in definition, but has '__init' in declaration.
And normal function start_kernel_secondary() may call setup_processor()
which will call arc_init_IRQ().

So need remove '__init' for both of them. The related warning (with
allmodconfig):

    MODPOST vmlinux.o
  WARNING: vmlinux.o(.text+0x3084): Section mismatch in reference from the function start_kernel_secondary() to the function .init.text:setup_processor()
  The function start_kernel_secondary() references
  the function __init setup_processor().
  This is often because start_kernel_secondary lacks a __init
  annotation or the annotation of setup_processor is wrong.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
10 years agoarc: kgdb: add default implementation for kgdb_roundup_cpus()
Chen Gang [Thu, 24 Oct 2013 03:50:09 +0000 (11:50 +0800)]
arc: kgdb: add default implementation for kgdb_roundup_cpus()

arc supports kgdb, but need update -- add function kgdb_roundup_cpus(),
or can not pass compiling. At present, add the simple generic one just
like other architectures(e.g. tile, mips ...).

The related error (with allmodconfig):

  kernel/built-in.o: In function `kgdb_cpu_enter':
  kernel/debug/debug_core.c:580: undefined reference to `kgdb_roundup_cpus'

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoARC: Fix bogus gcc warning and micro-optimise TLB iteration loop
Vineet Gupta [Fri, 27 Sep 2013 12:50:06 +0000 (18:20 +0530)]
ARC: Fix bogus gcc warning and micro-optimise TLB iteration loop

------------------>8----------------------
arch/arc/mm/tlb.c: In function â€˜do_tlb_overlap_fault’:
arch/arc/mm/tlb.c:688:13: warning: array subscript is above array bounds
[-Warray-bounds]
         (pd0[n] & PAGE_MASK)) {
             ^
------------------>8----------------------

While at it, remove the usless last iteration of outer loop when reading
a TLB SET for duplicate entries.

Suggested-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoARC: Add support for irqflags tracing and lockdep
Vineet Gupta [Fri, 6 Sep 2013 08:48:17 +0000 (14:18 +0530)]
ARC: Add support for irqflags tracing and lockdep

Lockdep required a small fix to stacktrace API which was incorrectly
unwindign out of __switch_to for the current call frame.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoARC: Reset the value of Interrupt Priority Register
Vineet Gupta [Thu, 12 Sep 2013 08:23:06 +0000 (13:53 +0530)]
ARC: Reset the value of Interrupt Priority Register

In case bootloader has changed the priority of one/more IRQ lines

Reported-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoARC: Reduce #ifdef'ery for unaligned access emulation
Vineet Gupta [Wed, 18 Sep 2013 12:38:01 +0000 (18:08 +0530)]
ARC: Reduce #ifdef'ery for unaligned access emulation

Emulation not enabled is treated as if the fixup failed, so no need for
special #ifdef checks.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoARC: Change calling convention of do_page_fault()
Vineet Gupta [Wed, 18 Sep 2013 10:55:40 +0000 (16:25 +0530)]
ARC: Change calling convention of do_page_fault()

switch the args (address, pt_regs) to match with all the other "C"
exception handlers.

This removes the awkwardness in EV_ProtV for page fault vs. unaligned
access.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoARC: cacheflush optim - PTAG can be loop invariant if V-P is const
Vineet Gupta [Thu, 5 Sep 2013 09:15:51 +0000 (14:45 +0530)]
ARC: cacheflush optim - PTAG can be loop invariant if V-P is const

Line op needs vaddr (indexing) and paddr (tag match). For page sized
flushes (V-P const), each line op will need a different index, but the
tag bits wil remain constant, hence paddr can be setup once outside the
loop.

This improves select LMBench numbers for Aliasing dcache where we have
more "preventive" cache flushing.

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                             call  I/O stat clos TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
3.11-rc7- Linux 3.11.0-   80 4.66 8.88 69.7 112. 268. 8.60 28.0 3489 13.K 27.K # Non alias ARC700
3.11-rc7- Linux 3.11.0-   80 4.64 8.51 68.6 98.5 271. 8.58 28.1 4160 15.K 32.K # Aliasing
3.11-rc7- Linux 3.11.0-   80 4.64 8.51 69.8 99.4 270. 8.73 27.5 3880 15.K 31.K # PTAG loop Inv

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoARC: cacheflush refactor #3: Unify the {d,i}cache flush leaf helpers
Vineet Gupta [Thu, 5 Sep 2013 08:13:03 +0000 (13:43 +0530)]
ARC: cacheflush refactor #3: Unify the {d,i}cache flush leaf helpers

With Line length being constant now, we can fold the 2 helpers into 1.
This allows applying any optimizations (forthcoming) to single place.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoARC: cacheflush refactor #2: I and D caches lines to have same size
Vineet Gupta [Thu, 5 Sep 2013 07:47:49 +0000 (13:17 +0530)]
ARC: cacheflush refactor #2: I and D caches lines to have same size

Having them be different seems an obscure configuration.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoARC: cacheflush refactor #1: push aux reg ascertaining into leaf routine
Vineet Gupta [Thu, 5 Sep 2013 07:33:35 +0000 (13:03 +0530)]
ARC: cacheflush refactor #1: push aux reg ascertaining into leaf routine

ARC dcache supports 3 ops - Inv, Flush, Flush-n-Inv.
The programming model however provides 2 commands FLUSH, INV.
INV will either discard or flush-n-discard (based on DT_CTRL bit)

The leaf helper __dc_line_loop() used to take the AUX register
(corresponding to the 2 commands). Now we push that to within the
helper, paving way for code consolidations to follow.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoARC: use __weak instead of __attribute__((weak))
Vineet Gupta [Thu, 31 Oct 2013 08:23:54 +0000 (13:53 +0530)]
ARC: use __weak instead of __attribute__((weak))

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoARC: Annotate some functions as static
Vineet Gupta [Wed, 4 Sep 2013 10:43:35 +0000 (16:13 +0530)]
ARC: Annotate some functions as static

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
10 years agoarc: Replace __get_cpu_var uses
Christoph Lameter [Wed, 28 Aug 2013 19:48:15 +0000 (19:48 +0000)]
arc: Replace __get_cpu_var uses

__get_cpu_var() is used for multiple purposes in the kernel source. One of them is
address calculation via the form &__get_cpu_var(x). This calculates the address for
the instance of the percpu variable of the current processor based on an offset.

Other use cases are for storing and retrieving data from the current processors percpu area.
__get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment.

__get_cpu_var() is defined as :

#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))

__get_cpu_var() always only does an address determination. However, store and retrieve operations
could use a segment prefix (or global register on other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use
optimized assembly code to read and write per cpu variables.

This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr()
or into a use of this_cpu operations that use the offset. Thereby address calcualtions are avoided
and less registers are used when code is generated.

At the end of the patchset all uses of __get_cpu_var have been removed so the macro is removed too.

The patchset includes passes over all arches as well. Once these operations are used throughout then
specialized macros can be defined in non -x86 arches as well in order to optimize per cpu access by
f.e. using a global register that may be set to the per cpu base.

Transformations done to __get_cpu_var()

1. Determine the address of the percpu instance of the current processor.

DEFINE_PER_CPU(int, y);
int *x = &__get_cpu_var(y);

    Converts to

int *x = this_cpu_ptr(&y);

2. Same as #1 but this time an array structure is involved.

DEFINE_PER_CPU(int, y[20]);
int *x = __get_cpu_var(y);

    Converts to

int *x = this_cpu_ptr(y);

3. Retrieve the content of the current processors instance of a per cpu variable.

DEFINE_PER_CPU(int, u);
int x = __get_cpu_var(y)

   Converts to

int x = __this_cpu_read(y);

4. Retrieve the content of a percpu struct

DEFINE_PER_CPU(struct mystruct, y);
struct mystruct x = __get_cpu_var(y);

   Converts to

memcpy(this_cpu_ptr(&x), y, sizeof(x));

5. Assignment to a per cpu variable

DEFINE_PER_CPU(int, y)
__get_cpu_var(y) = x;

   Converts to

this_cpu_write(y, x);

6. Increment/Decrement etc of a per cpu variable

DEFINE_PER_CPU(int, y);
__get_cpu_var(y)++

   Converts to

this_cpu_inc(y)

Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
10 years agopowerpc: Fix fatal SLB miss when restoring PPR
Benjamin Herrenschmidt [Tue, 5 Nov 2013 05:33:22 +0000 (16:33 +1100)]
powerpc: Fix fatal SLB miss when restoring PPR

When restoring the PPR value, we incorrectly access the thread structure
at a time where MSR:RI is clear, which means we cannot recover from nested
faults. However the thread structure isn't covered by the "bolted" SLB
entries and thus accessing can fault.

This fixes it by splitting the code so that the PPR value is loaded into
a GPR before MSR:RI is cleared.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
10 years agopowerpc/powernv: Reserve the correct PE number
Gavin Shan [Mon, 4 Nov 2013 08:32:47 +0000 (16:32 +0800)]
powerpc/powernv: Reserve the correct PE number

We're assigning PE numbers after the completion of PCI probe. During
the PCI probe, we had PE#0 as the super container to encompass all
PCI devices. However, that's inappropriate since PELTM has ascending
order of priority on search on P7IOC. So we need PE#127 takes the
role that PE#0 has previously. For PHB3, we still have PE#0 as the
reserved PE.

The patch supposes that the underly firmware has built the RID to
PE# mapping after resetting IODA tables: all PELTM entries except
last one has invalid mapping on P7IOC, but all RTEs have binding
to PE#0. The reserved PE# is being exported by firmware by device
tree.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
10 years agopowerpc/powernv: Add PE to its own PELTV
Gavin Shan [Mon, 4 Nov 2013 08:32:46 +0000 (16:32 +0800)]
powerpc/powernv: Add PE to its own PELTV

We need add PE to its own PELTV. Otherwise, the errors originated
from the PE might contribute to other PEs. In the result, we can't
clear up the error successfully even we're checking and clearing
errors during access to PCI config space.

Cc: stable@vger.kernel.org
Reported-by: kalshett@in.ibm.com
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
10 years agopowerpc/powernv: Add support for indirect XSCOM via debugfs
Benjamin Herrenschmidt [Thu, 10 Oct 2013 08:19:15 +0000 (19:19 +1100)]
powerpc/powernv: Add support for indirect XSCOM via debugfs

Indirect XSCOM addresses normally have the top bit set (of the 64-bit
address). This doesn't work via the normal debugfs interface, so we use
a different encoding, which we need to convert before calling OPAL.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
10 years agopowerpc/scom: Improve debugfs interface
Benjamin Herrenschmidt [Mon, 4 Nov 2013 02:20:35 +0000 (13:20 +1100)]
powerpc/scom: Improve debugfs interface

The current debugfs interface to scom is essentially unused
and racy. It uses two different files "address" and "data"
to perform accesses which is at best impractical for anything
but manual use by a developer.

This replaces it with an "access" file which represent the entire
scom address space which can be lseek/read/writen too.

This file only supports accesses that are 8 bytes aligned and
multiple of 8 bytes in size. The offset is logically the SCOM
address multiplied by 8.

Since nothing in userspace exploits that file at the moment, the ABI
change is a no-brainer.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
10 years agopowerpc/scom: Enable 64-bit addresses
Benjamin Herrenschmidt [Thu, 10 Oct 2013 08:18:02 +0000 (19:18 +1100)]
powerpc/scom: Enable 64-bit addresses

On P8, XSCOM addresses has a special "indirect" form that
requires more than 32-bits, so let's use u64 everywhere in
the code instead of u32.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
10 years agoperf tools: Finish the removal of 'self' arguments
Arnaldo Carvalho de Melo [Tue, 5 Nov 2013 18:32:36 +0000 (15:32 -0300)]
perf tools: Finish the removal of 'self' arguments

They convey no information, perhaps I was bitten by some snake at some
point, complete the detox by naming the last of those arguments more
sensibly.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-u1r0dnjoro08dgztiy2g3t2q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf tools: Check maximum frequency rate for record/top
Jiri Olsa [Tue, 5 Nov 2013 14:14:47 +0000 (15:14 +0100)]
perf tools: Check maximum frequency rate for record/top

Adding the check for maximum allowed frequency rate defined in following
file:

  /proc/sys/kernel/perf_event_max_sample_rate

When we cross the maximum value we fail and display detailed error
message with advise.

  $ perf record -F 3000 ls
  Maximum frequency rate (2000) reached.
  Please use -F freq option with lower value or consider
  tweaking /proc/sys/kernel/perf_event_max_sample_rate.

In case user does not specify the frequency and the default value cross
the maximum, we display warning and set the frequency value to the
current maximum.

  $ perf record ls
  Lowering default frequency rate to 2000.
  Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.

Same messages are used for 'perf top'.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383660887-1734-4-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf fs: Add procfs support
Jiri Olsa [Tue, 5 Nov 2013 14:14:46 +0000 (15:14 +0100)]
perf fs: Add procfs support

Adding procfs support into fs class.

The interface function:
  const char *procfs__mountpoint(void);

provides existing mountpoint path for procfs.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383660887-1734-3-git-send-email-jolsa@redhat.com
[ Fixup namespace ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf fs: Rename NAME_find_mountpoint() to NAME__mountpoint()
Arnaldo Carvalho de Melo [Tue, 5 Nov 2013 17:48:50 +0000 (14:48 -0300)]
perf fs: Rename NAME_find_mountpoint() to NAME__mountpoint()

Shorten it, "finding" it is an implementation detail, what callers want
is the pathname, not to ask for it to _always_ do the lookup.

And the existing implementation already caches it, i.e. it doesn't
"finds" it on every call.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-r24wa4bvtccg7mnkessrbbdj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoarm64: compat: Clear the IT state independent of the 32-bit ARM or Thumb-2 mode
T.J. Purtell [Tue, 5 Nov 2013 17:07:18 +0000 (17:07 +0000)]
arm64: compat: Clear the IT state independent of the 32-bit ARM or Thumb-2 mode

The ARM architecture reference specifies that the IT state bits in the
PSR must be all zeros in ARM mode or behavior is unspecified. If an ARM
function is registered as a signal handler, and that signal is delivered
inside a block of instructions following an IT instruction, some of the
instructions at the beginning of the signal handler may be skipped if
the IT state bits of the Program Status Register are not cleared by the
kernel.

Signed-off-by: T.J. Purtell <tj@mobisocial.us>
[catalin.marinas@arm.com: code comment and commit log updated]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
10 years agoperf tools: Factor sysfs code into generic fs object
Jiri Olsa [Tue, 5 Nov 2013 14:14:45 +0000 (15:14 +0100)]
perf tools: Factor sysfs code into generic fs object

Moving sysfs code into generic fs object and preparing it to carry
procfs support.

This should be merged with tools/lib/lk/debugfs.c at some point in the
future.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383660887-1734-2-git-send-email-jolsa@redhat.com
[ Added fs__ namespace qualifier to some more functions ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf list: Add usage
David Ahern [Wed, 30 Oct 2013 16:28:29 +0000 (10:28 -0600)]
perf list: Add usage

Currently 'perf list' is not very helpful if you forget the syntax:

  $ perf list -h

  List of pre-defined events (to be used in -e):

After:
  $ perf list -h

   usage: perf list [hw|sw|cache|tracepoint|pmu|event_glob]

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/527133AD.4030003@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf list: Remove a level of indentation
David Ahern [Wed, 30 Oct 2013 16:15:06 +0000 (10:15 -0600)]
perf list: Remove a level of indentation

With a return after the if check an indentation level can be removed.
Indentation shift only; no functional changes.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383149707-1008-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoarm64: Use 42-bit address space with 64K pages
Catalin Marinas [Wed, 23 Oct 2013 15:50:07 +0000 (16:50 +0100)]
arm64: Use 42-bit address space with 64K pages

This patch expands the VA_BITS to 42 when the 64K page configuration is
enabled allowing 2TB kernel linear mapping. Linux still uses 2 levels of
page tables in this configuration with pgd now being a full page.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
10 years agotools/perf/build: Fix detection of non-core features
David Ahern [Tue, 29 Oct 2013 16:43:15 +0000 (10:43 -0600)]
tools/perf/build: Fix detection of non-core features

feature_check needs to be invoked through call, and LDFLAGS may not be
set so quotes are needed.

Thanks to Jiri for spotting the quotes around LDFLAGS; that one was
driving me nuts with the upcoming timerfd feature detection.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383064996-20933-1-git-send-email-dsahern@gmail.com
[ Fixed conflict with 8a0c4c2843d3 ("perf tools: Fix libunwind build and feature detection for 32-bit build") ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf kvm: Disable live command if timerfd is not supported
David Ahern [Tue, 29 Oct 2013 16:43:16 +0000 (10:43 -0600)]
perf kvm: Disable live command if timerfd is not supported

If the OS does not have timerfd support (e.g., older OS'es like RHEL5)
disable perf kvm stat live.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383064996-20933-2-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoarm64: module: ensure instruction is little-endian before manipulation
Will Deacon [Tue, 5 Nov 2013 10:16:52 +0000 (10:16 +0000)]
arm64: module: ensure instruction is little-endian before manipulation

Relocations that require an instruction immediate to be re-encoded must
ensure that the instruction pattern is represented in a little-endian
format for the manipulation code to work correctly.

This patch converts the loaded instruction into native-endianess prior
to encoding and then converts back to little-endian byteorder before
updating memory.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Tested-by: Matthew Leach <matthew.leach@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
10 years agoarm64: defconfig: Enable CONFIG_PREEMPT by default
Catalin Marinas [Tue, 5 Nov 2013 10:03:53 +0000 (10:03 +0000)]
arm64: defconfig: Enable CONFIG_PREEMPT by default

This way we can spot early bugs when just testing with the default
config.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
10 years agoarm64: fix access to preempt_count from assembly code
Marc Zyngier [Mon, 4 Nov 2013 20:14:58 +0000 (20:14 +0000)]
arm64: fix access to preempt_count from assembly code

preempt_count is defined as an int. Oddly enough, we access it
as a 64bit value. Things become interesting when running a BE
kernel, and looking at the current CPU number, which is stored
as an int next to preempt_count. Like in a per-cpu interrupt
handler, for example...

Using a 32bit access fixes the issue for good.

Cc: Matthew Leach <matthew.leach@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
10 years agox86, defconfig: Add DEVTMPFS and DEVTMPFS_MOUNT to *86*_defconfig
Chen Gang [Tue, 5 Nov 2013 01:46:39 +0000 (09:46 +0800)]
x86, defconfig: Add DEVTMPFS and DEVTMPFS_MOUNT to *86*_defconfig

The defconfig kernel can not run under neither fedora16 x86_64 laptop
nor fedora17 x86_64 pc. After enable DEVTMPFS* in x86_64_defconfig, it
will be OK.

DEVTMPFS* is only related with software, so for i386_defconfig may also
need them (at least, it has no negative effect for defconfig).

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Link: http://lkml.kernel.org/r/52784DFF.8040004@asianux.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
10 years agoperf hists: Consolidate __hists__add_*entry()
Namhyung Kim [Thu, 31 Oct 2013 06:56:03 +0000 (15:56 +0900)]
perf hists: Consolidate __hists__add_*entry()

The __hists__add_{branch,mem}_entry() does almost the same thing that
__hists__add_entry() does.  Consolidate them into one.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383202576-28141-2-git-send-email-namhyung@kernel.org
[ Fixup clash with new COMM infrastructure ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agopowerpc/boot: Properly handle the base "of" boot wrapper
Benjamin Herrenschmidt [Mon, 4 Nov 2013 23:09:11 +0000 (10:09 +1100)]
powerpc/boot: Properly handle the base "of" boot wrapper

The wrapper script needs an explicit rule for the "of" boot
wrapper (generic wrapper, similar to pseries). Before
0c9fa29149d3726e14262aeb0c8461a948cc9d56 it was hanlded
implicitly by the statement:

platformo=$object/"$platform".o

But now that epapr.o needs to be added, that doesn't work
and an explicit rule must be added.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
10 years agoNFSv4.2: Remove redundant checks in nfs_setsecurity+nfs4_label_init_security
Trond Myklebust [Mon, 4 Nov 2013 18:46:40 +0000 (13:46 -0500)]
NFSv4.2: Remove redundant checks in nfs_setsecurity+nfs4_label_init_security

We already check for nfs_server_capable(inode, NFS_CAP_SECURITY_LABEL)
in nfs4_label_alloc()
We check the minor version in _nfs4_server_capabilities before setting
NFS_CAP_SECURITY_LABEL.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
10 years agoNFSv4: Sanity check the server reply in _nfs4_server_capabilities
Trond Myklebust [Mon, 4 Nov 2013 20:20:20 +0000 (15:20 -0500)]
NFSv4: Sanity check the server reply in _nfs4_server_capabilities

We don't want to be setting capabilities and/or requesting attributes
that are not appropriate for the NFSv4 minor version.

- Ensure that we clear the NFS_CAP_SECURITY_LABEL capability when appropriate
- Ensure that we limit the attribute bitmasks to the mounted_on_fileid
  attribute and less for NFSv4.0
- Ensure that we limit the attribute bitmasks to suppattr_exclcreat and
  less for NFSv4.1
- Ensure that we limit it to change_sec_label or less for NFSv4.2

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
10 years agoNFSv4.2: encode_readdir - only ask for labels when doing readdirplus
Trond Myklebust [Mon, 4 Nov 2013 18:23:59 +0000 (13:23 -0500)]
NFSv4.2: encode_readdir - only ask for labels when doing readdirplus

Currently, if the server is doing NFSv4.2 and supports labeled NFS, then
our on-the-wire READDIR request ends up asking for the label information,
which is then ignored unless we're doing readdirplus.
This patch ensures that READDIR doesn't ask the server for label information
at all unless the readdir->bitmask contains the FATTR4_WORD2_SECURITY_LABEL
attribute, and the readdir->plus flag is set.

While we're at it, optimise away the 3rd bitmap field if it is zero.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
10 years agonfs: set security label when revalidating inode
Jeff Layton [Sat, 2 Nov 2013 10:57:18 +0000 (06:57 -0400)]
nfs: set security label when revalidating inode

Currently, we fetch the security label when revalidating an inode's
attributes, but don't apply it. This is in contrast to the readdir()
codepath where we do apply label changes.

Cc: Dave Quigley <dpquigl@davequigley.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
10 years agoNFSv4.2: Fix a mismatch between Linux labeled NFS and the NFSv4.2 spec
Trond Myklebust [Mon, 4 Nov 2013 19:38:05 +0000 (14:38 -0500)]
NFSv4.2: Fix a mismatch between Linux labeled NFS and the NFSv4.2 spec

In the spec, the security label attribute id is '80', which means that
it should be bit number 80-64 == 16 in the 3rd word of the bitmap.

Fixes: 4488cc96c581: NFS: Add NFSv4.2 protocol constants
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Steve Dickson <steved@redhat.com>
Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
10 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Mon, 4 Nov 2013 20:14:04 +0000 (21:14 +0100)]
Merge tag 'perf-core-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

  * Add new COMM infrastructure, further improving histogram processing, from
    Frédéric Weisbecker, one fix from Namhyung Kim.

  * Enhance option parse error message, showing just the help lines of the
    options affected, from Namhyung Kim.

  * Fixup PERF_SAMPLE_TRANSACTION handling in sample synthesizing and
    'perf test', from Adrian Hunter.

  * Set up output options for in-stream attributes, from Adrian Hunter.

  * Fix 32-bit cross build, from Adrian Hunter.

  * Fix libunwind build and feature detection for 32-bit build, from Adrian Hunter.

  * Always use perf_evsel__set_sample_bit to set sample_type, from Adrian Hunter.
    perf evlist: Add a debug print if event buffer mmap fails

  * Add missing data.h into LIB_H headers, fix from Jiri Olsa.

  * libtraceevent updates from upstream trace-cmd repo, from Steven Rostedt.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoarm64: move enabling of GIC before CPUs are set online
Marc Zyngier [Mon, 4 Nov 2013 16:55:22 +0000 (16:55 +0000)]
arm64: move enabling of GIC before CPUs are set online

Commit 53ae3acd (arm64: Only enable local interrupts after the CPU
is marked online) moved the enabling of the GIC after the CPUs are
marked online.

This has some interesting effect:
[...]
[<ffffffc0002eefd8>] gic_raise_softirq+0xf8/0x160
[<ffffffc000088f58>] smp_send_reschedule+0x38/0x40
[<ffffffc0000c8728>] resched_task+0x84/0xc0
[<ffffffc0000c8cdc>] check_preempt_curr+0x58/0x98
[<ffffffc0000c8d38>] ttwu_do_wakeup+0x1c/0xf4
[<ffffffc0000c8f90>] ttwu_do_activate.constprop.84+0x64/0x70
[<ffffffc0000cad30>] try_to_wake_up+0x1d4/0x2b4
[<ffffffc0000cae6c>] default_wake_function+0x10/0x18
[<ffffffc0000c5ca4>] __wake_up_common+0x60/0xa0
[<ffffffc0000c7784>] complete+0x48/0x64
[<ffffffc000088bec>] secondary_start_kernel+0xe8/0x110
[...]

Here, we end-up calling gic_raise_softirq without having initialized
the interrupt controller for this CPU. While this goes unnoticed
with GICv2 (the distributor is always accessible), it explodes with
GICv3.

The fix is to move the call to notify_cpu_starting before we set
the secondary CPU online.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
10 years agoarm64: use generic RW_DATA_SECTION macro in linker script
Mark Salter [Mon, 4 Nov 2013 16:38:47 +0000 (16:38 +0000)]
arm64: use generic RW_DATA_SECTION macro in linker script

The .data section in the arm64 linker script currently lacks a
definition for page-aligned data. This leads to a .page_aligned
section being placed between the end of data and start of bss.
This patch corrects that by using the generic RW_DATA_SECTION
macro which includes support for page-aligned data.

Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
10 years agotools lib traceevent: Add pevent_print_func_field() helper function
Steven Rostedt [Fri, 1 Nov 2013 21:54:00 +0000 (17:54 -0400)]
tools lib traceevent: Add pevent_print_func_field() helper function

Add the pevent_print_func_field() that will look up a field that is
expected to be a function pointer, and it will print the function name
and offset of the address given by the field.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20131101215501.869542711@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agotools lib traceevent: Add flags NOHANDLE and PRINTRAW to individual events
Steven Rostedt [Fri, 1 Nov 2013 21:53:59 +0000 (17:53 -0400)]
tools lib traceevent: Add flags NOHANDLE and PRINTRAW to individual events

Add the flags EVENT_FL_NOHANDLE and EVENT_FL_PRINTRAW to the event flags
to have the event either ignore the register handler or to ignore the
handler and also print the raw format respectively.

This allows a tool to force a raw format or non handle for an event.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20131101215501.655258742@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agotools lib traceevent: Check for spaces in character array
Steven Rostedt (Red Hat) [Fri, 1 Nov 2013 21:53:58 +0000 (17:53 -0400)]
tools lib traceevent: Check for spaces in character array

Currently when using the raw format for fields, when looking at a
character array, to determine if it is a string or not, we make sure all
characters are "isprint()". If not, then we consider it a numeric array,
and print the hex numbers of the characters instead.

But it seems that '\n' fails the isprint() check! Add isspace() to the
check as well, such that if all characters pass isprint() or isspace()
it will assume the character array is a string.

Reported-by: Xenia Ragiadakou <burzalodowa@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Xenia Ragiadakou <burzalodowa@gmail.com>
Link: http://lkml.kernel.org/r/20131101215501.465091682@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agotools lib traceevent: Have bprintk output the same as the kernel does
Steven Rostedt (Red Hat) [Fri, 1 Nov 2013 21:53:57 +0000 (17:53 -0400)]
tools lib traceevent: Have bprintk output the same as the kernel does

The trace_bprintk() in the kernel looks like:

 ring_buffer_producer_thread: Missed:   0
 ring_buffer_producer_thread: Hit:      62174350
 ring_buffer_producer_thread: Entries per millisec: 6296
 ring_buffer_producer_thread: 158 ns per entry
 ring_buffer_producer_thread: Sleeping for 10 secs
 ring_buffer_producer_thread: Starting ring buffer hammer
 ring_buffer_producer_thread: End ring buffer hammer

But the current output looks like this:

 ring_buffer_producer_thread : Time:     9407018 (usecs)
 ring_buffer_producer_thread : Overruns: 43285485
 ring_buffer_producer_thread : Read:     4405365  (by events)
 ring_buffer_producer_thread : Entries:  0
 ring_buffer_producer_thread : Total:    47690850
 ring_buffer_producer_thread : Missed:   0
 ring_buffer_producer_thread : Hit:      47690850

Remove the space between the function and the colon.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20131101215501.272654481@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agotools lib traceevent: Handle __print_hex(__get_dynamic_array(fieldname), len)
Howard Cochran [Fri, 1 Nov 2013 21:53:56 +0000 (17:53 -0400)]
tools lib traceevent: Handle __print_hex(__get_dynamic_array(fieldname), len)

The kernel has a few events with a format similar to this excerpt:
        field:unsigned int len;     offset:12;      size:4; signed:0;
        field:__data_loc unsigned char[] data_array;  offset:16;      size:4; signed:0;
print fmt: "%s", __print_hex(__get_dynamic_array(data_array), REC->len)

trace-cmd could already parse that arg correctly, but print_str_arg()
was unable to handle the first parameter being a dynamic array. (It
just printed a "field not found" warning).

Teach print_str_arg's PRINT_HEX case to handle the nested
PRINT_DYNAMIC_ARRAY correctly. The output now matches the kernel's own
formatting for this case.

Signed-off-by: Howard Cochran <hcochran@lexmark.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1381503349-12271-1-git-send-email-hcochran@lexmark.com
[ Removed "polish compare", we don't do that here ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agotools lib traceevent: If %s is a pointer, check printk formats
Steven Rostedt (Red Hat) [Fri, 1 Nov 2013 21:53:55 +0000 (17:53 -0400)]
tools lib traceevent: If %s is a pointer, check printk formats

If the format string of TP_printk() contains a %s, and the argument is
not a string, check if the argument is a pointer that might match the
printk_formats that were stored.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20131101215500.698924777@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agotools lib traceevent: Update printk formats when entered
Steven Rostedt (Red Hat) [Fri, 1 Nov 2013 21:53:54 +0000 (17:53 -0400)]
tools lib traceevent: Update printk formats when entered

Instead of cropping off the '"' and '\n"' from a printk format every
time it is referenced, do it when it's added. This makes it easier to
reference a printk_map and should speed things up a little.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20131101215500.495619312@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agotools lib traceevent: Add support for extracting trace_clock in report
Yoshihiro YUNOMAE [Fri, 1 Nov 2013 21:53:53 +0000 (17:53 -0400)]
tools lib traceevent: Add support for extracting trace_clock in report

If trace-cmd extracts trace_clock, trace-cmd reads trace_clock data from
the trace.dat and switches outputting format of timestamp for each
trace_clock.

Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20130424231305.14877.86147.stgit@yunodevel
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoMIPS: Random whitespace clean-ups
Maciej W. Rozycki [Fri, 1 Nov 2013 23:47:05 +0000 (23:47 +0000)]
MIPS: Random whitespace clean-ups

Another whitespace clean-up, this removes tabs from between sentences in
some comments.

Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6103/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
10 years agoperf stat: Enhance option parse error message
Namhyung Kim [Fri, 1 Nov 2013 07:33:15 +0000 (16:33 +0900)]
perf stat: Enhance option parse error message

Print related option help messages only when it failed to process
options.  While at it, modify parse_options_usage() to skip usage part
so that it can be used for showing multiple option help messages
naturally like below:

  $ perf stat -Bx, ls
  -B option not supported with -x

   usage: perf stat [<options>] [<command>]

      -B, --big-num         print large numbers with thousands' separators
      -x, --field-separator <separator>
                            print counts with custom separator

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Enthusiastically-Supported-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383291195-24386-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf top: Use parse_options_usage() for -s option failure
Namhyung Kim [Fri, 1 Nov 2013 07:33:14 +0000 (16:33 +0900)]
perf top: Use parse_options_usage() for -s option failure

The -s (--sort) option was processed after normal option parsing so that
it cannot call the parse_options_usage() automatically.  Currently it
calls usage_with_options() which shows entire help messages for event
option.  Fix it by showing just -s options.

  $ perf top -s help
    Error: Unknown --sort key: `help'

   usage: perf top [<options>]

      -s, --sort <key[,key2...]>
                            sort by key(s): pid, comm, dso, symbol, ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Enthusiastically-Supported-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383291195-24386-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf report: Use parse_options_usage() for -s option failure
Namhyung Kim [Fri, 1 Nov 2013 07:33:13 +0000 (16:33 +0900)]
perf report: Use parse_options_usage() for -s option failure

The -s (--sort) option was processed after normal option parsing so that
it cannot call the parse_options_usage() automatically.  Currently it
calls usage_with_options() which shows entire help messages for event
option.  Fix it by showing just -s options.

  $ perf report -s help
    Error: Unknown --sort key: `help'

   usage: perf report [<options>]

      -s, --sort <key[,key2...]>
                            sort by key(s): pid, comm, dso, symbol, ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Enthusiastically-Supported-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383291195-24386-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf report: Postpone setting up browser after parsing options
Namhyung Kim [Fri, 1 Nov 2013 07:33:12 +0000 (16:33 +0900)]
perf report: Postpone setting up browser after parsing options

If setup_browser() called earlier than option parsing, the actual error
message can be discarded during the terminal reset.  So move it after
setup_sorting() checks whether the sort keys are valid.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Enthusiastically-Supported-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383291195-24386-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf tools: Show single option when failed to parse
Namhyung Kim [Fri, 1 Nov 2013 07:33:11 +0000 (16:33 +0900)]
perf tools: Show single option when failed to parse

Current option parser outputs whole option help string when it failed to
parse an option.  However this is not good for user if the command has
many option, she might feel hard which one is related easily.

Fix it by just showing the help message of the given option only.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Requested-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Enthusiastically-Supported-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383291195-24386-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf evsel: Synthesize PERF_SAMPLE_TRANSACTION
Adrian Hunter [Fri, 1 Nov 2013 13:51:38 +0000 (15:51 +0200)]
perf evsel: Synthesize PERF_SAMPLE_TRANSACTION

Add missing PERF_SAMPLE_TRANSACTION to perf_event__synthesize_sample()
and perf_event__sample_event_size().

This makes the "sample parsing" test pass.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383313899-15987-11-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf test: Update "sample parsing" test for PERF_SAMPLE_TRANSACTION
Adrian Hunter [Fri, 1 Nov 2013 13:51:37 +0000 (15:51 +0200)]
perf test: Update "sample parsing" test for PERF_SAMPLE_TRANSACTION

In fact the "sample parsing" test does not automatically check new
sample type bits - they must be added to the comparison logic.

Doing that shows that the test fails because the functions
perf_event__synthesize_sample() and perf_event__sample_event_size() have
not been updated with PERF_SAMPLE_TRANSACTION either.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383313899-15987-10-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf evsel: Add missing overflow check for TRANSACTION
Adrian Hunter [Fri, 1 Nov 2013 13:51:36 +0000 (15:51 +0200)]
perf evsel: Add missing overflow check for TRANSACTION

Add missing overflow check for PERF_SAMPLE_TRANSACTION in
perf_evsel__parse_sample().

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383313899-15987-9-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf evsel: Always use perf_evsel__set_sample_bit()
Adrian Hunter [Fri, 1 Nov 2013 13:51:35 +0000 (15:51 +0200)]
perf evsel: Always use perf_evsel__set_sample_bit()

Always use perf_evsel__set_sample_bit() rather than just setting the
bit.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383313899-15987-8-git-send-email-adrian.hunter@intel.com
[ Cope with 3090ffb "perf: Disable PERF_RECORD_MMAP2 support" ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf evlist: Add a debug print if event buffer mmap fails
Adrian Hunter [Fri, 1 Nov 2013 13:51:33 +0000 (15:51 +0200)]
perf evlist: Add a debug print if event buffer mmap fails

Add a debug print if mmap of the perf event ring buffer fails.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383313899-15987-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf tools: Fix libunwind build and feature detection for 32-bit build
Adrian Hunter [Fri, 1 Nov 2013 13:51:32 +0000 (15:51 +0200)]
perf tools: Fix libunwind build and feature detection for 32-bit build

Use -lunwind-x86 instead of -lunwind-x86_64 for 32-bit build.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383313899-15987-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf tools: Fix 32-bit cross build
Adrian Hunter [Fri, 1 Nov 2013 13:51:31 +0000 (15:51 +0200)]
perf tools: Fix 32-bit cross build

Setting EXTRA_CFLAGS=-m32 did not work because it was not passed around.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383313899-15987-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf script: Set up output options for in-stream attributes
Adrian Hunter [Fri, 1 Nov 2013 13:51:30 +0000 (15:51 +0200)]
perf script: Set up output options for in-stream attributes

Attributes (struct perf_event_attr) are recorded separately in the
perf.data file.  perf script uses them to set up output options.
However attributes can also be in the event stream, for example when the
input is a pipe (i.e. live mode).  This patch makes perf script process
in-stream attributes in the same way as on-file attributes.

Here is an example:

Before this patch:

$ perf record uname | perf script
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.015 MB (null) (~655 samples) ]
:4220  4220 [-01] 2933367.838906: cycles:

:4220  4220 [-01] 2933367.838910: cycles:

:4220  4220 [-01] 2933367.838912: cycles:

:4220  4220 [-01] 2933367.838914: cycles:

:4220  4220 [-01] 2933367.838916: cycles:

:4220  4220 [-01] 2933367.838918: cycles:

uname  4220 [-01] 2933367.838938: cycles:

uname  4220 [-01] 2933367.839207: cycles:

After this patch:

$ perf record uname | perf script
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.015 MB (null) (~655 samples) ]
           :4582  4582 2933425.707724: cycles:  ffffffff81043ffa native_write_msr_safe ([kernel.kallsyms])
           :4582  4582 2933425.707728: cycles:  ffffffff81043ffa native_write_msr_safe ([kernel.kallsyms])
           :4582  4582 2933425.707730: cycles:  ffffffff81043ffa native_write_msr_safe ([kernel.kallsyms])
           :4582  4582 2933425.707732: cycles:  ffffffff81043ffa native_write_msr_safe ([kernel.kallsyms])
           :4582  4582 2933425.707734: cycles:  ffffffff81043ffa native_write_msr_safe ([kernel.kallsyms])
           :4582  4582 2933425.707736: cycles:  ffffffff81309a24 memcpy ([kernel.kallsyms])
           uname  4582 2933425.707760: cycles:  ffffffff8109c1c7 enqueue_task_fair ([kernel.kallsyms])
           uname  4582 2933425.707978: cycles:  ffffffff81308457 clear_page_c ([kernel.kallsyms])

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383313899-15987-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years agoperf evsel: Add a debug print if perf_event_open fails
Adrian Hunter [Fri, 1 Nov 2013 13:51:29 +0000 (15:51 +0200)]
perf evsel: Add a debug print if perf_event_open fails

There is a debug print (at verbose level 2) for each call to
perf_event_open.  Add another debug print if the call fails, and print
the error number.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383313899-15987-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>