Ingo Molnar [Tue, 2 Dec 2008 08:20:29 +0000 (09:20 +0100)]
Merge branch 'tracing/urgent' into tracing/core
Conflicts:
kernel/trace/ring_buffer.c
Ingo Molnar [Thu, 27 Nov 2008 09:56:13 +0000 (10:56 +0100)]
Merge branches 'tracing/blktrace', 'tracing/ftrace', 'tracing/function-graph-tracer' and 'tracing/power-tracer' into tracing/core
Lai Jiangshan [Thu, 27 Nov 2008 02:21:46 +0000 (10:21 +0800)]
ftrace: prevent recursion
Impact: prevent unnecessary stack recursion
if the resched flag was set before we entered, then don't reschedule.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 26 Nov 2008 10:59:56 +0000 (11:59 +0100)]
blktrace: port to tracepoints, update
Port to the new tracepoints API: split DEFINE_TRACE() and DECLARE_TRACE()
sites. Spread them out to the usage sites, as suggested by
Mathieu Desnoyers.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Arnaldo Carvalho de Melo [Thu, 30 Oct 2008 07:34:33 +0000 (08:34 +0100)]
blktrace: port to tracepoints
This was a forward port of work done by Mathieu Desnoyers, I changed it to
encode the 'what' parameter on the tracepoint name, so that one can register
interest in specific events and not on classes of events to then check the
'what' parameter.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Arjan van de Ven [Mon, 24 Nov 2008 00:49:58 +0000 (16:49 -0800)]
tracing: add "power-tracer": C/P state tracer to help power optimization
Impact: new "power-tracer" ftrace plugin
This patch adds a C/P-state ftrace plugin that will generate
detailed statistics about the C/P-states that are being used,
so that we can look at detailed decisions that the C/P-state
code is making, rather than the too high level "average"
that we have today.
An example way of using this is:
mount -t debugfs none /sys/kernel/debug
echo cstate > /sys/kernel/debug/tracing/current_tracer
echo 1 > /sys/kernel/debug/tracing/tracing_enabled
sleep 1
echo 0 > /sys/kernel/debug/tracing/tracing_enabled
cat /sys/kernel/debug/tracing/trace | perl scripts/trace/cstate.pl > out.svg
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 26 Nov 2008 05:16:27 +0000 (00:16 -0500)]
ftrace: add cpu annotation for function graph tracer
Impact: enhancement for function graph tracer
When run on a SMP box, the function graph tracer is confusing because
it shows the different CPUS as changes in the trace.
This patch adds the annotation of 'CPU[###]' where ### is a three digit
number. The output will look similar to this:
CPU[001] dput() {
CPU[000] } 726
CPU[001] } 487
CPU[000] do_softirq() {
CPU[001] } 2221
CPU[000] __do_softirq() {
CPU[000] __local_bh_disable() {
CPU[001] unroll_tree_refs() {
CPU[000] } 569
CPU[001] } 501
CPU[000] rcu_process_callbacks() {
CPU[001] kfree() {
What makes this nice is that now you can grep the file and produce
readable format for a particular CPU.
# cat /debug/tracing/trace > /tmp/trace
# grep '^CPU\[000\]' /tmp/trace > /tmp/trace0
# grep '^CPU\[001\]' /tmp/trace > /tmp/trace1
Will give you:
# head /tmp/trace0
CPU[000] ------------8<---------- thread sshd-3899 ------------8<----------
CPU[000] inotify_dentry_parent_queue_event() {
CPU[000] } 2531
CPU[000] inotify_inode_queue_event() {
CPU[000] } 505
CPU[000] } 69626
CPU[000] } 73089
CPU[000] audit_syscall_exit() {
CPU[000] path_put() {
CPU[000] dput() {
# head /tmp/trace1
CPU[001] ------------8<---------- thread pcscd-3446 ------------8<----------
CPU[001] } 4186
CPU[001] dput() {
CPU[001] } 543
CPU[001] vfs_permission() {
CPU[001] inode_permission() {
CPU[001] shmem_permission() {
CPU[001] generic_permission() {
CPU[001] } 501
CPU[001] } 2205
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 26 Nov 2008 05:16:26 +0000 (00:16 -0500)]
ftrace: add thread comm to function graph tracer
Impact: enhancement to function graph tracer
Export the trace_find_cmdline so the function graph tracer can
use it to print the comms of the threads.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 26 Nov 2008 05:16:25 +0000 (00:16 -0500)]
ftrace: let function tracing and function return run together
Impact: feature
This patch enables function tracing and function return to run together.
I've tested this by enabling the stack tracer and return tracer, where
both the function entry and function return are used together with
dynamic ftrace.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 26 Nov 2008 05:16:24 +0000 (00:16 -0500)]
ftrace: use code patching for ftrace graph tracer
Impact: more efficient code for ftrace graph tracer
This patch uses the dynamic patching, when available, to patch
the function graph code into the kernel.
This patch will ease the way for letting both function tracing
and function graph tracing run together.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 26 Nov 2008 05:16:23 +0000 (00:16 -0500)]
ftrace: add function tracing to single thread
Impact: feature to function trace a single thread
This patch adds the ability to function trace a single thread.
The file:
/debugfs/tracing/set_ftrace_pid
contains the pid to trace. Valid pids are any positive integer.
Writing any negative number to this file will disable the pid
tracing and the function tracer will go back to tracing all of
threads.
This feature works with both static and dynamic function tracing.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Liming Wang [Wed, 26 Nov 2008 02:29:26 +0000 (10:29 +0800)]
ftrace: adding other non-leaving .text sections
Impact: widen the scope of recordmcount.pl
Besides .text section, there are three .text sections that won't
be freed after kernel booting. They are: .sched.text, .spinlock.text
and .kprobes.text, which contain functions we can trace. But the last
section ".kprobes.text" is particular, which has been marked as "notrace",
we ignore it. Thus we add other two sections.
Signed-off-by: Liming Wang <liming.wang@windriver.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 26 Nov 2008 02:10:01 +0000 (03:10 +0100)]
tracing: function graph tracer, fix
fix return-tracer => graph-tracer namespace rename fallout.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Tue, 25 Nov 2008 23:57:25 +0000 (00:57 +0100)]
tracing/function-return-tracer: set a more human readable output
Impact: feature
This patch sets a C-like output for the function graph tracing.
For this aim, we now call two handler for each function: one on the entry
and one other on return. This way we can draw a well-ordered call stack.
The pid of the previous trace is loosely stored to be compared against
the one of the current trace to see if there were a context switch.
Without this little feature, the call tree would seem broken at
some locations.
We could use the sched_tracer to capture these sched_events but this
way of processing is much more simpler.
2 spaces have been chosen for indentation to fit the screen while deep
calls. The time of execution in nanosecs is printed just after closed
braces, it seems more easy this way to find the corresponding function.
If the time was printed as a first column, it would be not so easy to
find the corresponding function if it is called on a deep depth.
I plan to output the return value but on 32 bits CPU, the return value
can be 32 or 64, and its difficult to guess on which case we are.
I don't know what would be the better solution on X86-32: only print
eax (low-part) or even edx (high-part).
Actually it's thee same problem when a function return a 8 bits value, the
high part of eax could contain junk values...
Here is an example of trace:
sys_read() {
fget_light() {
} 526
vfs_read() {
rw_verify_area() {
security_file_permission() {
cap_file_permission() {
} 519
} 1564
} 2640
do_sync_read() {
pipe_read() {
__might_sleep() {
} 511
pipe_wait() {
prepare_to_wait() {
} 760
deactivate_task() {
dequeue_task() {
dequeue_task_fair() {
dequeue_entity() {
update_curr() {
update_min_vruntime() {
} 504
} 1587
clear_buddies() {
} 512
add_cfs_task_weight() {
} 519
update_min_vruntime() {
} 511
} 5602
dequeue_entity() {
update_curr() {
update_min_vruntime() {
} 496
} 1631
clear_buddies() {
} 496
update_min_vruntime() {
} 527
} 4580
hrtick_update() {
hrtick_start_fair() {
} 488
} 1489
} 13700
} 14949
} 16016
msecs_to_jiffies() {
} 496
put_prev_task_fair() {
} 504
pick_next_task_fair() {
} 489
pick_next_task_rt() {
} 496
pick_next_task_fair() {
} 489
pick_next_task_idle() {
} 489
------------8<---------- thread 4 ------------8<----------
finish_task_switch() {
} 1203
do_softirq() {
__do_softirq() {
__local_bh_disable() {
} 669
rcu_process_callbacks() {
__rcu_process_callbacks() {
cpu_quiet() {
rcu_start_batch() {
} 503
} 1647
} 3128
__rcu_process_callbacks() {
} 542
} 5362
_local_bh_enable() {
} 587
} 8880
} 9986
kthread_should_stop() {
} 669
deactivate_task() {
dequeue_task() {
dequeue_task_fair() {
dequeue_entity() {
update_curr() {
calc_delta_mine() {
} 511
update_min_vruntime() {
} 511
} 2813
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Tue, 25 Nov 2008 20:07:04 +0000 (21:07 +0100)]
tracing/function-return-tracer: change the name into function-graph-tracer
Impact: cleanup
This patch changes the name of the "return function tracer" into
function-graph-tracer which is a more suitable name for a tracing
which makes one able to retrieve the ordered call stack during
the code flow.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 26 Nov 2008 00:58:05 +0000 (01:58 +0100)]
Merge branches 'tracing/hw-branch-tracing' and 'tracing/branch-tracer' into tracing/core
Markus Metzger [Tue, 25 Nov 2008 08:24:15 +0000 (09:24 +0100)]
x86, bts, ftrace: a BTS ftrace plug-in prototype
Impact: add new ftrace plugin
A prototype for a BTS ftrace plug-in.
The tracer collects branch trace in a cyclic buffer for each cpu.
The tracer is not configurable and the trace for each snapshot is
appended when doing cat /debug/tracing/trace.
This is a proof of concept that will be extended with future patches
to become a (hopefully) useful tool.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Markus Metzger [Tue, 25 Nov 2008 08:12:31 +0000 (09:12 +0100)]
x86, ftrace: call trace->open() before stopping tracing; add trace->print_header()
Add a callback to allow an ftrace plug-in to write its own header.
Move the call to trace->open() up a few lines.
The changes are required by the BTS ftrace plug-in.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Markus Metzger [Tue, 25 Nov 2008 08:05:27 +0000 (09:05 +0100)]
x86, bts, ptrace: move BTS buffer allocation from ds.c into ptrace.c
Impact: restructure DS memory allocation to be done by the usage site of DS
Require pre-allocated buffers in ds.h.
Move the BTS buffer allocation for ptrace into ptrace.c.
The pointer to the allocated buffer is stored in the traced task's
task_struct together with the handle returned by ds_request_bts().
Removes memory accounting code.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Markus Metzger [Tue, 25 Nov 2008 08:01:25 +0000 (09:01 +0100)]
x86, bts: base in-kernel ds interface on handles
Impact: generalize the DS code to shared buffers
Change the in-kernel ds.h interface to identify the tracer via a
handle returned on ds_request_~().
Tracers used to be identified via their task_struct.
The changes are required to allow DS to be shared between different
tasks, which is needed for perfmon2 and for ftrace.
For ptrace, the handle is stored in the traced task's task_struct.
This should probably go into a (arch-specific) ptrace context some
time.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Tue, 25 Nov 2008 16:30:25 +0000 (17:30 +0100)]
Merge branches 'tracing/core', 'x86/urgent' and 'x86/ptrace' into tracing/hw-branch-tracing
This pulls together all the topic branches that are needed
for the DS/BTS/PEBS tracing work.
Markus Metzger [Tue, 25 Nov 2008 07:52:56 +0000 (08:52 +0100)]
x86, bts: fix wrmsr and spinlock over kmalloc
Impact: fix sleeping-with-spinlock-held bugs/crashes
- Turn a wrmsr to write the DS_AREA MSR into a wrmsrl.
- Use irqsave variants of spinlocks.
- Do not allocate memory while holding spinlocks.
Reported-by: Stephane Eranian <eranian@googlemail.com>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Markus Metzger [Tue, 25 Nov 2008 07:49:06 +0000 (08:49 +0100)]
x86, pebs: fix PEBS record size configuration
Impact: fix DS hw enablement on 64-bit x86
Fix the PEBS record size in the DS configuration.
Reported-by: Stephane Eranian <eranian@googlemail.com>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Markus Metzger [Tue, 25 Nov 2008 07:47:19 +0000 (08:47 +0100)]
x86, bts: turn macro into static inline function
Impact: cleanup
Replace a macro with a static inline function.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Markus Metzger [Tue, 25 Nov 2008 07:45:13 +0000 (08:45 +0100)]
x86, bts: exclude ds.c from build when disabled
Impact: cleanup
Move the CONFIG guard from the .c file into the makefile.
Reported-by: Andi Kleen <andi-suse@firstfloor.org>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Julia Lawall [Tue, 25 Nov 2008 13:13:03 +0000 (14:13 +0100)]
arch/x86/kernel/pci-calgary_64.c: change simple_strtol to simple_strtoul
Impact: fix theoretical option string parsing overflow
Since bridge is unsigned, it would seem better to use simple_strtoul that
simple_strtol.
A simplified version of the semantic patch that makes this change is as
follows: (http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@r2@
long e;
position p;
@@
e = simple_strtol@p(...)
@@
position p != r2.p;
type T;
T e;
@@
e =
- simple_strtol@p
+ simple_strtoul
(...)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: muli@il.ibm.com
Cc: jdmason@kudzu.us
Cc: discuss@x86-64.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Tue, 25 Nov 2008 05:42:37 +0000 (00:42 -0500)]
x86: use limited register constraint for setnz
Impact: build fix with certain compilers
GCC can decide to use %dil when "r" is used, which is not valid for
setnz.
This bug was brought out by Stephen Rothwell's merging of the
branch tracer into linux-next.
[ Thanks to Uros Bizjak for recommending 'q' over 'Q' ]
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Tue, 25 Nov 2008 07:58:11 +0000 (08:58 +0100)]
tracing, tty: fix warnings caused by branch tracing and tty_kref_get()
Stephen Rothwell reported tht this warning started triggering in
linux-next:
In file included from init/main.c:27:
include/linux/tty.h: In function ‘tty_kref_get’:
include/linux/tty.h:330: warning: ‘______f’ is static but declared in inline function ‘tty_kref_get’ which is not static
Which gcc emits for 'extern inline' functions that nevertheless define
static variables. Change it to 'static inline', which is the norm
in the kernel anyway.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 24 Nov 2008 16:46:24 +0000 (17:46 +0100)]
Merge branches 'tracing/branch-tracer', 'tracing/fastboot', 'tracing/ftrace', 'tracing/function-return-tracer', 'tracing/power-tracer', 'tracing/powerpc', 'tracing/ring-buffer', 'tracing/stack-tracer' and 'tracing/urgent' into tracing/core
Török Edwin [Sun, 23 Nov 2008 21:24:53 +0000 (23:24 +0200)]
vfs, seqfile: fix comment style on mangle_path
Impact: use standard docbook tags
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Sun, 23 Nov 2008 17:43:39 +0000 (18:43 +0100)]
tracing/function-return-tracer: free the return stack on free_task()
Impact: avoid losing some traces when a task is freed
do_exit() is not the last function called when a task finishes.
There are still some functions which are to be called such as
ree_task(). So we delay the freeing of the return stack to the
last moment.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Pekka Paalanen [Sun, 23 Nov 2008 19:24:59 +0000 (21:24 +0200)]
tracing, doc: update mmiotrace documentation
Impact: update documentation
Update to reflect the current state of the tracing framework:
- "none" tracer has been replaced by "nop" tracer
- tracing_enabled must be toggled when changing buffer size
Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Pekka Paalanen [Sun, 23 Nov 2008 19:24:30 +0000 (21:24 +0200)]
x86, mmiotrace: fix buffer overrun detection
Impact: fix mmiotrace overrun tracing
When ftrace framework moved to use the ring buffer facility, the buffer
overrun detection was broken after 2.6.27 by commit
| commit
3928a8a2d98081d1bc3c0a84a2d70e29b90ecf1c
| Author: Steven Rostedt <rostedt@goodmis.org>
| Date: Mon Sep 29 23:02:41 2008 -0400
|
| ftrace: make work with new ring buffer
|
| This patch ports ftrace over to the new ring buffer.
The detection is now fixed by using the ring buffer API.
When mmiotrace detects a buffer overrun, it will report the number of
lost events. People reading an mmiotrace log must know if something was
missed, otherwise the data may not make sense.
Signed-off-by: Pekka Paalanen <pq@iki.fi>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Sun, 23 Nov 2008 16:33:12 +0000 (17:33 +0100)]
tracing/function-return-tracer: don't trace kfree while it frees the return stack
Impact: fix a crash
While I killed the cat process, I got sometimes the following (but rare)
crash:
[ 65.689027] Pid: 2969, comm: cat Not tainted (2.6.28-rc6-tip #83) AMILO Li 2727
[ 65.689027] EIP: 0060:[<
00000000>] EFLAGS:
00010082 CPU: 1
[ 65.689027] EIP is at 0x0
[ 65.689027] EAX:
00000000 EBX:
f66cd780 ECX:
c019a64a EDX:
f66cd780
[ 65.689027] ESI:
00000286 EDI:
f66cd780 EBP:
f630be2c ESP:
f630be24
[ 65.689027] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 65.689027] Process cat (pid: 2969, ti=
f630a000 task=
f66cd780 task.ti=
f630a000)
[ 65.689027] Stack:
[ 65.689027]
00000012 f630bd54 f630be7c c012c853 00000000 c0133cc9 f66cda54 f630be5c
[ 65.689027]
f630be68 f66cda54 f66cd88c f66cd878 f7070000 00000001 f630be90 c0135dbc
[ 65.689027]
f614a614 f630be68 f630be68 f65ba200 00000002 f630bf10 f630be90 c012cad6
[ 65.689027] Call Trace:
[ 65.689027] [<
c012c853>] ? do_exit+0x603/0x850
[ 65.689027] [<
c0133cc9>] ? next_signal+0x9/0x40
[ 65.689027] [<
c0135dbc>] ? dequeue_signal+0x8c/0x180
[ 65.689027] [<
c012cad6>] ? do_group_exit+0x36/0x90
[ 65.689027] [<
c013709c>] ? get_signal_to_deliver+0x20c/0x390
[ 65.689027] [<
c0102b69>] ? do_notify_resume+0x99/0x8b0
[ 65.689027] [<
c02e6d1a>] ? tty_ldisc_deref+0x5a/0x80
[ 65.689027] [<
c014db9b>] ? trace_hardirqs_on+0xb/0x10
[ 65.689027] [<
c02e6d1a>] ? tty_ldisc_deref+0x5a/0x80
[ 65.689027] [<
c02e39b0>] ? n_tty_write+0x0/0x340
[ 65.689027] [<
c02e1812>] ? redirected_tty_write+0x82/0x90
[ 65.689027] [<
c019ee99>] ? vfs_write+0x99/0xd0
[ 65.689027] [<
c02e1790>] ? redirected_tty_write+0x0/0x90
[ 65.689027] [<
c019f342>] ? sys_write+0x42/0x70
[ 65.689027] [<
c01035ca>] ? work_notifysig+0x13/0x19
[ 65.689027] Code: Bad EIP value.
[ 65.689027] EIP: [<
00000000>] 0x0 SS:ESP 0068:
f630be24
This is because on do_exit(), kfree is called to free the return addresses stack
but kfree is traced and stored its return address in this stack.
This patch fixes it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sun, 23 Nov 2008 12:47:54 +0000 (13:47 +0100)]
Merge branch 'ppc/ftrace' of git://git./linux/kernel/git/rostedt/linux-2.6-trace into tracing/powerpc
Ian Campbell [Fri, 21 Nov 2008 10:21:33 +0000 (10:21 +0000)]
xen: pin correct PGD on suspend
Impact: fix Xen guest boot failure
commit
eefb47f6a1e855653d275cb90592a3587ea93a09 ("xen: use
spin_lock_nest_lock when pinning a pagetable") changed xen_pgd_walk to
walk over mm->pgd rather than taking pgd as an argument.
This breaks xen_mm_(un)pin_all() because it makes init_mm.pgd readonly
instead of the pgd we are interested in and therefore the pin subsequently
fails.
(XEN) mm.c:2280:d15 Bad type (saw
00000000e8000001 != exp
0000000060000000) for mfn bc464 (pfn 21ca7)
(XEN) mm.c:2665:d15 Error while pinning mfn bc464
[ 14.586913] 1 multicall(s) failed: cpu 0
[ 14.586926] Pid: 14, comm: kstop/0 Not tainted 2.6.28-rc5-x86_32p
-xenU-00172-gee2f6cc #200
[ 14.586940] Call Trace:
[ 14.586955] [<
c030c17a>] ? printk+0x18/0x1e
[ 14.586972] [<
c0103df3>] xen_mc_flush+0x163/0x1d0
[ 14.586986] [<
c0104bc1>] __xen_pgd_pin+0xa1/0x110
[ 14.587000] [<
c015a330>] ? stop_cpu+0x0/0xf0
[ 14.587015] [<
c0104d7b>] xen_mm_pin_all+0x4b/0x70
[ 14.587029] [<
c022bcb9>] xen_suspend+0x39/0xe0
[ 14.587042] [<
c015a330>] ? stop_cpu+0x0/0xf0
[ 14.587054] [<
c015a3cd>] stop_cpu+0x9d/0xf0
[ 14.587067] [<
c01417cd>] run_workqueue+0x8d/0x150
[ 14.587080] [<
c030e4b3>] ? _spin_unlock_irqrestore+0x23/0x40
[ 14.587094] [<
c014558a>] ? prepare_to_wait+0x3a/0x70
[ 14.587107] [<
c0141918>] worker_thread+0x88/0xf0
[ 14.587120] [<
c01453c0>] ? autoremove_wake_function+0x0/0x50
[ 14.587133] [<
c0141890>] ? worker_thread+0x0/0xf0
[ 14.587146] [<
c014509c>] kthread+0x3c/0x70
[ 14.587157] [<
c0145060>] ? kthread+0x0/0x70
[ 14.587170] [<
c0109d1b>] kernel_thread_helper+0x7/0x10
[ 14.587181] call 1/3: op=14 arg=[
c0415000] result=0
[ 14.587192] call 2/3: op=14 arg=[
e1ca2000] result=0
[ 14.587204] call 3/3: op=26 arg=[
c1808860] result=-22
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Török Edwin [Sun, 23 Nov 2008 11:08:10 +0000 (13:08 +0200)]
tracing/stack-tracer: avoid races accessing file
Impact: fix race
vma->vm_file reference is only stable while holding the mmap_sem,
so move usage of it to within the critical section.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sun, 23 Nov 2008 11:16:57 +0000 (12:16 +0100)]
Merge branch 'oprofile-for-tip' of git://git./linux/kernel/git/rric/oprofile into x86/urgent
Thomas Gleixner [Fri, 21 Nov 2008 19:16:48 +0000 (11:16 -0800)]
x86: revert irq number limitation
Impact: fix MSIx not enough irq numbers available regression
The manual revert of the sparse_irq patches missed to bring the number
of possible irqs back to the .27 status. This resulted in a regression
when two multichannel network cards were placed in a system with only
one IO_APIC - causing the networking driver to not have the right
IRQ and the device not coming up.
Remove the dynamic allocation logic leftovers and simply return
NR_IRQS in probe_nr_irqs() for now.
Fixes: http://lkml.org/lkml/2008/11/19/354
Reported-by: Jesper Dangaard Brouer <hawk@diku.dk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jesper Dangaard Brouer <hawk@diku.dk>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Török Edwin [Sun, 23 Nov 2008 10:39:08 +0000 (12:39 +0200)]
tracing/stack-tracer: introduce CONFIG_USER_STACKTRACE_SUPPORT
Impact: cleanup
User stack tracing is just implemented for x86, but it is not x86 specific.
Introduce a generic config flag, that is currently enabled only for x86.
When other arches implement it, they will have to
SELECT USER_STACKTRACE_SUPPORT.
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Török Edwin [Sun, 23 Nov 2008 10:39:07 +0000 (12:39 +0200)]
tracing/stack-tracer: fix locking and refcounts
Impact: fix refcounting/object-access bug
Hold mmap_sem while looking up/accessing vma.
Hold the RCU lock while using the task we looked up.
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Török Edwin [Sun, 23 Nov 2008 10:39:06 +0000 (12:39 +0200)]
tracing/stack-tracer: fix style issues
Impact: cleanup
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Fri, 21 Nov 2008 19:44:57 +0000 (14:44 -0500)]
trace: fix compiler warning in branch profiler
Impact: fix compiler warning
The ftrace_pointers used in the branch profiler are constant values.
They should never change. But the compiler complains when they are
passed into the debugfs_create_file as a data pointer, because the
function discards the qualifier.
This patch typecasts the parameter to debugfs_create_file back to
a void pointer. To remind the callbacks that they are pointing to
a constant value, I also modified the callback local pointers to
be const struct ftrace_pointer * as well.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Fri, 21 Nov 2008 17:59:38 +0000 (12:59 -0500)]
ftrace: add ftrace_off_permanent
Impact: add new API to disable all of ftrace on anomalies
It case of a serious anomaly being detected (like something caught by
lockdep) it is a good idea to disable all tracing immediately, without
grabing any locks.
This patch adds ftrace_off_permanent that disables the tracers, function
tracing and ring buffers without a way to enable them again. This should
only be used when something serious has been detected.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Fri, 21 Nov 2008 17:41:55 +0000 (12:41 -0500)]
ring-buffer: add tracing_off_permanent
Impact: feature to permanently disable ring buffer
This patch adds a API to the ring buffer code that will permanently
disable the ring buffer from ever recording. This should only be
called when some serious anomaly is detected, and the system
may be in an unstable state. When that happens, shutting down the
recording to the ring buffers may be appropriate.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jim Radford [Fri, 21 Nov 2008 03:48:39 +0000 (19:48 -0800)]
ftrace: scripts/recordmcount.pl support for ARM
Impact: extend scripts/recordmcount.pl to ARM
Arm uses %progbits instead of @progbits and requires only 4 byte alignment.
[ Thanks to Sam Ravnborg for mentioning that ARM uses %progbits ]
Signed-off-by: Jim Radford <radford@galvanix.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Matt Fleming [Thu, 20 Nov 2008 21:49:52 +0000 (21:49 +0000)]
ftrace: specify $alignment for sh architecture
Impact: extend scripts/recordmcount.pl with default alignment for SH
Set $alignment=2 for the sh architecture so that a ".align 2" directive
will be emitted for all __mcount_loc sections. Fix a whitspace error
while I'm here (converted spaces to tabs).
Signed-off-by: Matt Fleming <mjf@gentoo.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Fri, 21 Nov 2008 06:30:54 +0000 (01:30 -0500)]
trace: profile all if conditionals
Impact: feature to profile if statements
This patch adds a branch profiler for all if () statements.
The results will be found in:
/debugfs/tracing/profile_branch
For example:
miss hit % Function File Line
------- --------- - -------- ---- ----
0 1 100 x86_64_start_reservations head64.c 127
0 1 100 copy_bootdata head64.c 69
1 0 0 x86_64_start_kernel head64.c 111
32 0 0 set_intr_gate desc.h 319
1 0 0 reserve_ebda_region head.c 51
1 0 0 reserve_ebda_region head.c 47
0 1 100 reserve_ebda_region head.c 42
0 0 X maxcpus main.c 165
Miss means the branch was not taken. Hit means the branch was taken.
The percent is the percentage the branch was taken.
This adds a significant amount of overhead and should only be used
by those analyzing their system.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Fri, 21 Nov 2008 06:51:53 +0000 (01:51 -0500)]
trace: branch profiling should not print percent without data
Impact: cleanup on output of branch profiler
When a branch has not been taken, it does not make sense to show
a percentage incorrect or hit. This patch changes the behaviour
to print out a 'X' when the branch has not been executed yet.
For example:
correct incorrect % Function File Line
------- --------- - -------- ---- ----
2096 0 0 do_arch_prctl process_64.c 832
0 0 X do_arch_prctl process_64.c 804
2604 0 0 IS_ERR err.h 34
130228 5765 4 __switch_to process_64.c 673
0 0 X enable_TSC process_64.c 448
0 0 X disable_TSC process_64.c 431
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Fri, 21 Nov 2008 05:40:40 +0000 (00:40 -0500)]
trace: consolidate unlikely and likely profiler
Impact: clean up to make one profiler of like and unlikely tracer
The likely and unlikely profiler prints out the file and line numbers
of the annotated branches that it is profiling. It shows the number
of times it was correct or incorrect in its guess. Having two
different files or sections for that matter to tell us if it was a
likely or unlikely is pretty pointless. We really only care if
it was correct or not.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Fri, 21 Nov 2008 04:57:47 +0000 (23:57 -0500)]
trace: remove extra assign in branch check
Impact: clean up of branch check
The unlikely/likely profiler does an extra assign of the f.line.
This is not needed since it is already calculated at compile time.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Thu, 20 Nov 2008 20:07:34 +0000 (15:07 -0500)]
ftrace: create default variables for archs in recordmcount.pl
Impact: cleanup of recordmcount.pl
Now that more architectures are being ported to the MCOUNT_RECORD
method, there is no reason to have each declare their own arch
specific variable if most of them share the same value. This patch
creates a set of default values for the arch specific variables
based off of i386.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Thu, 20 Nov 2008 15:16:16 +0000 (07:16 -0800)]
ftrace: add support for powerpc to recordmcount.pl script
Impact: Add PowerPC port to recordmcount.pl script
This patch updates the recordmcount.pl script to process
PowerPC.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Matt Fleming [Wed, 12 Nov 2008 11:11:47 +0000 (20:11 +0900)]
sh: dynamic ftrace support.
First cut at dynamic ftrace support.
[
Steven Rostedt - only updated the recordmcount.pl file.
There are updates for PowerPC that will conflict with this,
and we need to base off of these changes.
]
Signed-off-by: Matt Fleming <mjf@gentoo.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Will Newton [Fri, 21 Nov 2008 22:08:59 +0000 (14:08 -0800)]
init/main.c: use ktime accessor function in initcall_debug code
Impact: fix initcall debug output on non-scalar ktime platforms (32-bit embedded)
The initcall_debug code access the tv64 member of ktime. This won't work
correctly for large deltas on platforms that don't use the scalar ktime
implementation.
Signed-off-by: Will Newton <will.newton@gmail.com>
Acked-by: Tim Bird <tim.bird@am.sony.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sun, 23 Nov 2008 09:37:12 +0000 (10:37 +0100)]
tracing: allow tracing of suspend/resume & hibernation code again
Impact: widen function-tracing to suspend+resume (and hibernation) sequences
Now that the ftrace kernel thread is gone, we can allow tracing
during suspend/resume again.
So revert these two commits:
f42ac38c5 "ftrace: disable tracing for suspend to ram"
41108eb10 "ftrace: disable tracing for hibernation"
This should be tested very carefully, as it could interact with
altneratives instruction patching, etc.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Török Edwin [Sat, 22 Nov 2008 11:28:48 +0000 (13:28 +0200)]
tracing: identify which executable object the userspace address belongs to
Impact: modify+improve the userstacktrace tracing visualization feature
Store thread group leader id, and use it to lookup the address in the
process's map. We could have looked up the address on thread's map,
but the thread might not exist by the time we are called. The process
might not exist either, but if you are reading trace_pipe, that is
unlikely.
Example usage:
mount -t debugfs nodev /sys/kernel/debug
cd /sys/kernel/debug/tracing
echo userstacktrace >iter_ctrl
echo sym-userobj >iter_ctrl
echo sched_switch >current_tracer
echo 1 >tracing_enabled
cat trace_pipe >/tmp/trace&
.... run application ...
echo 0 >tracing_enabled
cat /tmp/trace
You'll see stack entries like:
/lib/libpthread-2.7.so[+0xd370]
You can convert them to function/line using:
addr2line -fie /lib/libpthread-2.7.so 0xd370
Or:
addr2line -fie /usr/lib/debug/libpthread-2.7.so 0xd370
For non-PIC/PIE executables this won't work:
a.out[+0x73b]
You need to run the following: addr2line -fie a.out 0x40073b
(where 0x400000 is the default load address of a.out)
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Török Edwin [Sat, 22 Nov 2008 11:28:48 +0000 (13:28 +0200)]
vfs, seqfile: make mangle_path() global
Impact: expose new VFS API
make mangle_path() available, as per the suggestions of Christoph Hellwig
and Al Viro:
http://lkml.org/lkml/2008/11/4/338
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Török Edwin [Sat, 22 Nov 2008 11:28:47 +0000 (13:28 +0200)]
tracing: add support for userspace stacktraces in tracing/iter_ctrl
Impact: add new (default-off) tracing visualization feature
Usage example:
mount -t debugfs nodev /sys/kernel/debug
cd /sys/kernel/debug/tracing
echo userstacktrace >iter_ctrl
echo sched_switch >current_tracer
echo 1 >tracing_enabled
.... run application ...
echo 0 >tracing_enabled
Then read one of 'trace','latency_trace','trace_pipe'.
To get the best output you can compile your userspace programs with
frame pointers (at least glibc + the app you are tracing).
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sun, 23 Nov 2008 08:18:56 +0000 (09:18 +0100)]
tracing/function-return-tracer: clean up task start/exit callbacks
Impact: cleanup
Eliminate #ifdefs in core code by using empty inline functions.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Sun, 23 Nov 2008 05:22:56 +0000 (06:22 +0100)]
tracing/function-return-tracer: store return stack into task_struct and allocate it dynamically
Impact: use deeper function tracing depth safely
Some tests showed that function return tracing needed a more deeper depth
of function calls. But it could be unsafe to store these return addresses
to the stack.
So these arrays will now be allocated dynamically into task_struct of current
only when the tracer is activated.
Typical scheme when tracer is activated:
- allocate a return stack for each task in global list.
- fork: allocate the return stack for the newly created task
- exit: free return stack of current
- idle init: same as fork
I chose a default depth of 50. I don't have overruns anymore.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sun, 23 Nov 2008 08:10:32 +0000 (09:10 +0100)]
Merge branches 'tracing/profiling', 'tracing/options' and 'tracing/urgent' into tracing/core
Ingo Molnar [Fri, 21 Nov 2008 19:55:09 +0000 (20:55 +0100)]
Merge commit 'v2.6.28-rc6' into x86/urgent
Liming Wang [Fri, 21 Nov 2008 03:00:18 +0000 (11:00 +0800)]
function tracing: fix wrong position computing of stack_trace
Impact: make output of stack_trace complete if buffer overruns
When read buffer overruns, the output of stack_trace isn't complete.
When printing records with seq_printf in t_show, if the read buffer
has overruned by the current record, then this record won't be
printed to user space through read buffer, it will just be dropped in
this printing.
When next printing, t_start should return the "*pos"th record, which
is the one dropped by previous printing, but it just returns
(m->private + *pos)th record.
Here we use a more sane method to implement seq_operations which can
be found in kernel code. Thus we needn't initialize m->private.
About testing, it's not easy to overrun read buffer, but we can use
seq_printf to print more padding bytes in t_show, then it's easy to
check whether or not records are lost.
This commit has been tested on both condition of overrun and non
overrun.
Signed-off-by: Liming Wang <liming.wang@windriver.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Fri, 21 Nov 2008 02:08:09 +0000 (18:08 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 5330/1: mach-pxa: Fixup reset for systems using reboot=cold or other strings
[ARM] pxa: fix incorrect PCMCIA PSKTSEL pin configuration for spitz
[ARM] pxa: fix I2C controller device being registered twice on Akita
pxafb: only initialize the smart panel thread when dealing with a smartpanel
pxafb: introduce LCD_TYPE_MASK and use it.
Linus Torvalds [Thu, 20 Nov 2008 23:19:22 +0000 (15:19 -0800)]
Linux 2.6.28-rc6
Linus Torvalds [Thu, 20 Nov 2008 23:07:40 +0000 (15:07 -0800)]
Merge branch 'release' of git://git./linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] xen: fix xen_get_eflags.
[IA64] ia64/pv_ops/pv_cpu_ops: fix _IA64_REG_IP case.
[IA64] remove duplicate include iommu.h
[IA64] use mprintk instead of printk, in ia64_mca_modify_original_stack
[IA64] Rationalize kernel mode alignment checking
Linus Torvalds [Thu, 20 Nov 2008 21:53:21 +0000 (13:53 -0800)]
Merge git://git./linux/kernel/git/gregkh/usb-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
USB:
ACE1001 patch for cp2101.c
USB: usbmon: fix read(2)
USB: gadget rndis: send notifications
USB: gadget rndis: stop windows self-immolation
USB: storage: update unusual_devs entries for Nokia 5300 and 5310
USB: storage: updates unusual_devs entry for the Nokia 6300
usb: musb: fix bug in musb_schedule
USB: fix SB700 usb subsystem hang bug
Isaku Yamahata [Tue, 18 Nov 2008 10:20:51 +0000 (19:20 +0900)]
[IA64] xen: fix xen_get_eflags.
fix xen_get_eflags. It doesn't take any argument.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Isaku Yamahata [Tue, 18 Nov 2008 10:19:50 +0000 (19:19 +0900)]
[IA64] ia64/pv_ops/pv_cpu_ops: fix _IA64_REG_IP case.
pv_cpu_ops.getreg(_IA64_REG_IP) returned constant.
But the returned ip valued should be the one in the caller, not of the callee.
This patch fixes that.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Huang Weiyi [Thu, 20 Nov 2008 21:38:16 +0000 (13:38 -0800)]
[IA64] remove duplicate include iommu.h
arch/ia64/kernel/pci-dma.c only needs to include iommu once.
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Hidetoshi Seto [Mon, 17 Nov 2008 01:18:08 +0000 (10:18 +0900)]
[IA64] use mprintk instead of printk, in ia64_mca_modify_original_stack
Using printk from MCA/INIT context is unsafe since it can cause deadlock.
The ia64_mca_modify_original_stack is called from both of mca handler and
init handler, so it should use mprintk instead of printk.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tony Luck [Thu, 20 Nov 2008 21:27:12 +0000 (13:27 -0800)]
[IA64] Rationalize kernel mode alignment checking
Itanium processors can handle some misaligned data accesses. They
also provide a mode where all such accesses are forced to trap. The
kernel was schizophrenic about use of this mode:
* Base kernel code ran in permissive mode where the only traps
generated were from those cases that the h/w could not handle.
* Interrupt, syscall and trap code ran in strict mode where all
unaligned accesses caused traps to the 0x5a00 unaligned reference
vector.
Use strict alignment checking throughout the kernel, but make
sure that we continue to let user mode use more relaxed mode
as the default.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Matthew Wilcox [Thu, 20 Nov 2008 21:09:33 +0000 (14:09 -0700)]
x86: Fix interrupt leak due to migration
When we migrate an interrupt from one CPU to another, we set the
move_in_progress flag and clean up the vectors later once they're not
being used. If you're unlucky and call destroy_irq() before the vectors
become un-used, the move_in_progress flag is never cleared, which causes
the interrupt to become unusable.
This was discovered by Jesse Brandeburg for whom it manifested as an
MSI-X device refusing to use MSI-X mode when the driver was unloaded
and reloaded repeatedly.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Trond Myklebust [Thu, 20 Nov 2008 21:06:21 +0000 (16:06 -0500)]
SUNRPC: Fix a performance regression in the RPC authentication code
Fix a regression reported by Max Kellermann whereby kernel profiling
showed that his clients were spending 45% of their time in
rpcauth_lookup_credcache.
It turns out that although his processes had identical uid/gid/groups,
generic_match() was failing to detect this, because the task->group_info
pointers were not shared. This again lead to the creation of a huge number
of identical credentials at the RPC layer.
The regression is fixed by comparing the contents of task->group_info
if the actual pointers are not identical.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 20 Nov 2008 21:14:16 +0000 (13:14 -0800)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
[CIFS] Do not attempt to close invalidated file handles
[CIFS] fix check for dead tcon in smb_init
Linus Torvalds [Thu, 20 Nov 2008 21:13:48 +0000 (13:13 -0800)]
Merge branch 'upstream' of git://ftp.linux-mips.org/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
MIPS: csrc-r4k: Fix declaration depending on the wrong CONFIG_ symbol.
MIPS: csrc-r4k: Fix spelling mistake.
MIPS: RB532: Provide functions for gpio configuration
MIPS: IP22: Make indy_sc_ops variable static
MIPS: RB532: GPIO register offsets are relative to GPIOBASE
MIPS: Malta: Fix include paths in malta-amon.c
Linus Torvalds [Thu, 20 Nov 2008 21:13:03 +0000 (13:13 -0800)]
Merge branch 'core-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
intel-iommu: fix compile warnings
Linus Torvalds [Thu, 20 Nov 2008 21:12:14 +0000 (13:12 -0800)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (23 commits)
net: fix tiny output corruption of /proc/net/snmp6
atl2: don't request irq on resume if netif running
ipv6: use seq_release_private for ip6mr.c /proc entries
pkt_sched: fix missing check for packet overrun in qdisc_dump_stab()
smc911x: Fix printf format typo in smc911x driver.
asix: Fix asix-based cards connecting to 10/100Mbs LAN.
mv643xx_eth: fix recycle check bound
mv643xx_eth: fix the order of mdiobus_{unregister, free}() calls
sh: sh_eth: Update to change of mii_bus
TPROXY: supply a struct flowi->flags argument in inet_sk_rebuild_header()
TPROXY: fill struct flowi->flags in udp_sendmsg()
net: ipg.c fix bracing on endian swapping
phylib: Fix auto-negotiation restart avoidance
net: jme.c rxdesc.flags is __le16, other missing endian swaps
phylib: fix phy name example in documentation
net: Do not fire linkwatch events until the device is registered.
phonet: fix compilation with gcc-3.4
ixgbe: fix compilation with gcc-3.4
pktgen: fix multiple queue warning
net: fix ip_mr_init() error path
...
Linus Torvalds [Thu, 20 Nov 2008 21:11:21 +0000 (13:11 -0800)]
Merge branch 'tracing-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
ftrace: fix dyn ftrace filter selection
ftrace: make filtered functions effective on setting
ftrace: fix set_ftrace_filter
trace: introduce missing mutex_unlock()
tracing: kernel/trace/trace.c: introduce missing kfree()
Linus Torvalds [Thu, 20 Nov 2008 21:09:32 +0000 (13:09 -0800)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: uaccess_64: fix return value in __copy_from_user()
x86: quirk for reboot stalls on a Dell Optiplex 330
Helge Deller [Thu, 20 Nov 2008 09:54:09 +0000 (10:54 +0100)]
parisc: fix bug in compat_arch_ptrace
Commit
81e192d6ce303b6792aa38ff35f41a1a7357f23a ("parisc: convert to
generic compat_sys_ptrace") introduced a bug which segfaults the parisc
64bit kernel when stracing 32bit applications:
Kernel Fault: Code=15 regs=
00000000bafa42b0 (Addr=
00000001baf5ab57)
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW:
00001000000001101111111100001011 Tainted: G W
r00-03
000000ff0806ff0b 000000004068edc0 00000000401203f8 00000000fb3e2508
r04-07
0000000040686dc0 00000000baf5a800 fffffffffffffffc fffffffffb3e2508
r08-11
00000000baf5a800 000000000004b068 00000000000402b0 0000000000040d68
r12-15
0000000000042a9c 0000000000040a9c 0000000000040d60 0000000000042e9c
r16-19
000000000004b060 000000000004b058 0000000000042d9c ffffffffffffffff
r20-23
000000000800000b 0000000000000000 000000000800000b fffffffffb3e2508
r24-27
00000000fffffffc 0000000000000003 00000000fffffffc 0000000040686dc0
r28-31
00000001baf5a7ff 00000000bafa4280 00000000bafa42b0 00000000000001d7
sr00-03
0000000000fca000 0000000000000000 0000000000000000 0000000000fca000
sr04-07
0000000000000000 0000000000000000 0000000000000000 0000000000000000
IASQ:
0000000000000000 0000000000000000 IAOQ:
0000000040120400 0000000040120404
IIR:
4b9a06b0 ISR:
0000000000000000 IOR:
00000001baf5ab57
CPU: 0 CR30:
00000000bafa4000 CR31:
00000000d22344e0
ORIG_R28:
00000000fb3e2248
IAOQ[0]: compat_arch_ptrace+0xb8/0x160
IAOQ[1]: compat_arch_ptrace+0xbc/0x160
RP(r2): compat_arch_ptrace+0xb0/0x160
Backtrace:
[<
00000000401612ac>] compat_sys_ptrace+0x15c/0x180
[<
0000000040104ef8>] syscall_exit+0x0/0x14
The problem is that compat_arch_ptrace() enters with an addr value of
type compat_ulong_t and calls translate_usr_offset() to translate the
address offset into a struct pt_regs offset like this:
addr = translate_usr_offset(addr)
this means that any return value of translate_usr_offset() is stored
back as compat_ulong_t type into the addr variable.
But since translate_usr_offset() returns -1 for invalid offsets, addr
can now get the value 0xffffffff which then fails the next return-value
sanity check and thus the kernel tries to access invalid memory:
if (addr < 0)
break;
Fix this bug by modifying translate_usr_offset() to take and return
values of type compat_ulong_t, and by returning the value
"sizeof(struct pt_regs)" as an error indicator.
Additionally change the sanity check to check for return values
for >= sizeof(struct pt_regs).
This patch survived my compile and run-tests.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steve French [Thu, 20 Nov 2008 20:00:44 +0000 (20:00 +0000)]
[CIFS] Do not attempt to close invalidated file handles
If a connection with open file handles has gone down
and come back up and reconnected without reopening
the file handle yet, do not attempt to send an SMB close
request for this handle in cifs_close. We were
checking for the connection being invalid in cifs_close
but since the connection may have been reconnected
we also need to check whether the file handle
was marked invalid (otherwise we could close the
wrong file handle by accident).
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Ralf Baechle [Mon, 3 Nov 2008 11:32:34 +0000 (11:32 +0000)]
MIPS: csrc-r4k: Fix declaration depending on the wrong CONFIG_ symbol.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Mon, 3 Nov 2008 11:31:54 +0000 (11:31 +0000)]
MIPS: csrc-r4k: Fix spelling mistake.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Phil Sutter [Sat, 1 Nov 2008 14:13:21 +0000 (15:13 +0100)]
MIPS: RB532: Provide functions for gpio configuration
As gpiolib doesn't support pin multiplexing, it provides no way to
access the GPIOFUNC register. Also there is no support for setting
interrupt status and level. These functions provide access to them and
are needed by the CompactFlash driver.
Signed-off-by: Phil Sutter <n0-1@freewrt.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Dmitri Vorobiev [Fri, 31 Oct 2008 17:54:11 +0000 (19:54 +0200)]
MIPS: IP22: Make indy_sc_ops variable static
The indy_sc_ops variable in arch/mips/mm/sc-ip22.c is needlessly defined
global, and this patch makes it static.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.fi>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
---
Florian Fainelli [Fri, 31 Oct 2008 13:24:29 +0000 (14:24 +0100)]
MIPS: RB532: GPIO register offsets are relative to GPIOBASE
This patch fixes the wrong use of GPIO register offsets
in devices.c. To avoid further problems, use gpio_get_value
to return the NAND status instead of our own expanded code.
Also define the zero offset of the alternate function register to allow
consistent access.
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: Phil Sutter <n0-1@freewrt.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Sat, 20 Sep 2008 17:16:36 +0000 (10:16 -0700)]
MIPS: Malta: Fix include paths in malta-amon.c
On linux-queue, malta doesn't build after the include file relocation.
This should fix it.
There some occurrences of 'asm-mips' in the comments of quite a few
files, but this is the only place I found it in any code.
Signed-off-by: David Daney <ddaney@avtrex.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Steven Rostedt [Sat, 15 Nov 2008 07:39:05 +0000 (02:39 -0500)]
powerpc/ppc32: ftrace, dynamic ftrace to handle modules
Impact: add ability to trace modules on 32 bit PowerPC
This patch performs the necessary trampoline calls to handle
modules with dynamic ftrace on 32 bit PowerPC.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Sat, 15 Nov 2008 04:47:03 +0000 (20:47 -0800)]
powerpc/ppc64: ftrace, handle module trampolines for dyn ftrace
Impact: Allow 64 bit PowerPC to trace modules with dynamic ftrace
This adds code to handle the PPC64 module trampolines, and allows for
PPC64 to use dynamic ftrace.
Thanks to Paul Mackerras for these updates:
- fix the mod and rec->arch.mod NULL checks.
- fix to is_bl_op compare.
Thanks to Milton Miller for:
- finding the nasty race with using two nops, and recommending
instead that I use a branch 8 forward.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Sat, 15 Nov 2008 00:21:20 +0000 (16:21 -0800)]
powerpc: ftrace, use probe_kernel API to modify code
Impact: use cleaner probe_kernel API over assembly
Using probe_kernel_read/write interface is a much cleaner approach
than the current assembly version.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Sat, 15 Nov 2008 00:21:19 +0000 (16:21 -0800)]
powerpc: ftrace, convert to new dynamic ftrace arch API
Impact: update to PowerPC ftrace arch API
This patch converts PowerPC to use the new dynamic ftrace arch API.
Thanks to Paul Mackennas for pointing out the mistakes of my original
test_24bit_addr function.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Sat, 15 Nov 2008 00:21:19 +0000 (16:21 -0800)]
powerpc: ftrace, do not latency trace idle
Impact: fix for irq off latency tracer
When idle is called, interrupts are disabled, but the idle function
will still wake up on an interrupt. The problem is that the interrupt
disabled latency tracer will take this call to idle as a latency.
This patch disables the latency tracing when going into idle.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Rakib Mullick [Thu, 20 Nov 2008 13:12:50 +0000 (19:12 +0600)]
x86: fixing __cpuinit/__init tangle, xsave_cntxt_init()
Annotate xsave_cntxt_init() as "can be called outside of __init".
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Rakib Mullick [Thu, 20 Nov 2008 13:08:45 +0000 (19:08 +0600)]
x86: fix __cpuinit/__init tangle in init_thread_xstate()
Impact: fix incorrect __init annotation
This patch removes the following section mismatch warning. A patch set
was send previously (http://lkml.org/lkml/2008/11/10/407). But
introduce some other problem, reported by Rufus
(http://lkml.org/lkml/2008/11/11/46). Then Ingo Molnar suggest that,
it's best to remove __init from xsave_cntxt_init(void). Which is the
second patch in this series. Now, this one removes the following
warning.
WARNING: arch/x86/kernel/built-in.o(.cpuinit.text+0x2237): Section
mismatch in reference from the function cpu_init() to the function
.init.text:init_thread_xstate()
The function __cpuinit cpu_init() references
a function __init init_thread_xstate().
If init_thread_xstate is only used by cpu_init then
annotate init_thread_xstate with a matching annotation.
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Alexey Dobriyan [Thu, 20 Nov 2008 12:20:10 +0000 (04:20 -0800)]
net: fix tiny output corruption of /proc/net/snmp6
Because "name" is static, it can be occasionally be filled with
somewhat garbage if two processes read /proc/net/snmp6.
Also, remove useless casts and "-1" -- snprintf() correctly terminates it's
output.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alan Jenkins [Thu, 20 Nov 2008 12:18:25 +0000 (04:18 -0800)]
atl2: don't request irq on resume if netif running
If the device is suspended with the cable disconnected, then
resumed with the cable connected, dev->open is called before
resume. During resume, we request an IRQ, but the IRQ was
already assigned during dev->open, resulting in the warning
shown below.
Don't request an IRQ if the device is running.
Call Trace:
[<
c011b89a>] warn_on_slowpath+0x40/0x59
[<
c023df15>] raw_pci_read+0x4d/0x55
[<
c023dff3>] pci_read+0x1c/0x21
[<
c01bcd81>] __pci_find_next_cap_ttl+0x44/0x70
[<
c01bce86>] __pci_find_next_cap+0x1a/0x1f
[<
c01bcef9>] pci_find_capability+0x28/0x2c
[<
c01c4144>] pci_msi_check_device+0x53/0x62
[<
c01c49c2>] pci_enable_msi+0x3a/0x1cd
[<
e019f17b>] atl2_write_phy_reg+0x40/0x5f [atl2]
[<
c01061b1>] dma_generic_alloc_coherent+0x0/0xd7
[<
e019f107>] atl2_request_irq+0x15/0x49 [atl2]
[<
e01a1481>] atl2_open+0x20b/0x297 [atl2]
[<
c024a35c>] dev_open+0x62/0x91
[<
c0248b9a>] dev_change_flags+0x93/0x141
[<
c024f308>] do_setlink+0x238/0x2d5
[<
c02501b2>] rtnl_setlink+0xa9/0xbf
[<
c0297f0c>] mutex_lock+0xb/0x19
[<
c024ffa7>] rtnl_dump_ifinfo+0x0/0x69
[<
c0250109>] rtnl_setlink+0x0/0xbf
[<
c024fe42>] rtnetlink_rcv_msg+0x185/0x19f
[<
c0240fd1>] sock_rmalloc+0x23/0x57
[<
c024fcbd>] rtnetlink_rcv_msg+0x0/0x19f
[<
c0259457>] netlink_rcv_skb+0x2d/0x71
[<
c024fcb7>] rtnetlink_rcv+0x14/0x1a
[<
c025929e>] netlink_unicast+0x184/0x1e4
[<
c025992a>] netlink_sendmsg+0x233/0x240
[<
c023f405>] sock_sendmsg+0xb7/0xd0
[<
c0129131>] autoremove_wake_function+0x0/0x2b
[<
c0129131>] autoremove_wake_function+0x0/0x2b
[<
c0147796>] mempool_alloc+0x2d/0x9e
[<
c020c923>] scsi_pool_alloc_command+0x35/0x4f
[<
c0297f0c>] mutex_lock+0xb/0x19
[<
c028e867>] unix_stream_recvmsg+0x357/0x3e2
[<
c01b81c9>] copy_from_user+0x23/0x4f
[<
c02452ea>] verify_iovec+0x3e/0x6c
[<
c023f5ab>] sys_sendmsg+0x18d/0x1f0
[<
c023ffa8>] sys_recvmsg+0x146/0x1c8
[<
c0240016>] sys_recvmsg+0x1b4/0x1c8
[<
c0118f48>] __wake_up+0xf/0x15
[<
c02586cd>] netlink_table_ungrab+0x17/0x19
[<
c01b83ba>] copy_to_user+0x25/0x3b
[<
c023fe4a>] move_addr_to_user+0x50/0x68
[<
c0240266>] sys_getsockname+0x6f/0x9a
[<
c0240280>] sys_getsockname+0x89/0x9a
[<
c015046a>] do_wp_page+0x3ae/0x41a
[<
c0151525>] handle_mm_fault+0x4c5/0x540
[<
c02405d0>] sys_socketcall+0x176/0x1b0
[<
c010376d>] sysenter_do_call+0x12/0x21
Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: Jay Cliburn <jcliburn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benjamin Thery [Thu, 20 Nov 2008 12:16:12 +0000 (04:16 -0800)]
ipv6: use seq_release_private for ip6mr.c /proc entries
In ip6mr.c, /proc entries /proc/net/ip6_mr_cache and /proc/net/ip6_mr_vif
are opened with seq_open_private(), thus seq_release_private() should be
used to release them.
Should fix a small memory leak.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 20 Nov 2008 12:07:14 +0000 (04:07 -0800)]
pkt_sched: fix missing check for packet overrun in qdisc_dump_stab()
nla_nest_start() might return NULL, causing a NULL pointer dereference.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>