Frederic Weisbecker [Tue, 11 Nov 2008 06:14:25 +0000 (07:14 +0100)]
tracing: add a tracer to catch execution time of kernel functions
Impact: add new tracing plugin which can trace full (entry+exit) function calls
This tracer uses the low level function return ftrace plugin to
measure the execution time of the kernel functions.
The first field is the caller of the function, the second is the
measured function, and the last one is the execution time in
nanoseconds.
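The three fields described above can be pictured with a small userspace sketch. It is not the tracer's code: the struct, the function name, and the single-space separator are assumptions made for illustration, and only the field order (caller, measured function, duration in nanoseconds) comes from the message above.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record for one traced call (not a kernel structure). */
struct demo_call {
	const char *caller;   /* function that made the call */
	const char *func;     /* function whose run time is measured */
	uint64_t entry_ns;    /* timestamp taken on entry */
	uint64_t exit_ns;     /* timestamp taken on return */
};

/*
 * Format one line as "caller function duration_ns".
 * Returns the number of characters snprintf() would have written.
 */
static int demo_format_call(char *buf, size_t len, const struct demo_call *c)
{
	assert(c->exit_ns >= c->entry_ns);
	return snprintf(buf, len, "%s %s %" PRIu64,
			c->caller, c->func, c->exit_ns - c->entry_ns);
}
```

The duration is simply the exit timestamp minus the entry timestamp, which is why both the entry and the return of a function have to be hooked.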
- v3:
- HAVE_FUNCTION_RET_TRACER has been added. Each arch that supports ftrace return
should enable it.
- ftrace_return_stub becomes ftrace_stub.
- CONFIG_FUNCTION_RET_TRACER depends now on CONFIG_FUNCTION_TRACER
- Return traces printing can be used for other tracers on trace.c
- Adapt to the new tracing API (no more ctrl_update callback)
- Correct the check of "disabled" during insertion.
- Minor changes...
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Tue, 11 Nov 2008 06:03:45 +0000 (07:03 +0100)]
tracing, x86: add low level support for ftrace return tracing
Impact: add infrastructure for function-return tracing
Add low level support for ftrace return tracing.
This plug-in stores return addresses on the thread_info structure of
the current task.
The index of the current return address is initialized when the task
is the first one (init) and when a process forks (the child). It is
not needed when a task does a sys_execve because after this syscall,
it still needs to return on the kernel functions it called.
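The per-task storage described above behaves like a small stack with an index. The following is a minimal userspace sketch under stated assumptions: the struct name, the helper names, and the depth of 20 are hypothetical and are not taken from the patch.

```c
#include <assert.h>

#define RET_STACK_DEPTH 20   /* hypothetical depth, chosen for illustration */

/* Hypothetical stand-in for the per-task state kept in thread_info. */
struct ret_stack {
	int curr;                             /* index of the newest entry, -1 when empty */
	unsigned long ret[RET_STACK_DEPTH];   /* saved original return addresses */
};

/* Done for the first task (init) and for each forked child. */
static void ret_stack_init(struct ret_stack *s)
{
	s->curr = -1;
}

/*
 * On function entry: remember where the function would have returned.
 * Returns 0 on success, -1 if the stack is full.
 */
static int ret_stack_push(struct ret_stack *s, unsigned long ret_addr)
{
	if (s->curr + 1 >= RET_STACK_DEPTH)
		return -1;
	s->ret[++s->curr] = ret_addr;
	return 0;
}

/* On function exit: hand back the original return address. */
static unsigned long ret_stack_pop(struct ret_stack *s)
{
	assert(s->curr >= 0);
	return s->ret[s->curr--];
}
```

In this model, resetting the index at exec time would discard return addresses that are still pending, which matches the reasoning above for leaving it alone after sys_execve.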
Note that the code of return_to_handler was suggested by Steven
Rostedt, as were almost all of the improvement ideas in this V3.
For safety, arch/x86/kernel/process_32.c is not traced
because __switch_to() changes the current task during its execution.
That could cause inconsistency in the stored return address of this
function, although I did not see any crash when testing with tracing
enabled on this function.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Tue, 11 Nov 2008 08:58:36 +0000 (09:58 +0100)]
Merge branches 'tracing/ftrace' and 'tracing/urgent' into tracing/core
Steven Rostedt [Tue, 11 Nov 2008 04:07:30 +0000 (23:07 -0500)]
ring-buffer: replace most bug ons with warn on and disable buffer
This patch replaces most of the BUG_ONs in the ring_buffer code with
RB_WARN_ON variants. It adds some more variants as needed for the
replacement. This lets the buffer die nicely and still warn the user.
One BUG_ON remains in the code, and that is because it detects a
bad pointer passed in by the calling function, and not a bug by
the ring buffer code itself.
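The "warn and disable" behaviour described above can be sketched in userspace as follows. This is not the kernel's RB_WARN_ON macro: the struct and the function names are hypothetical, and a plain function is used in place of a macro to keep the sketch portable.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified buffer: only the field needed for the sketch. */
struct demo_buffer {
	int disabled;   /* non-zero once an internal inconsistency was seen */
};

/*
 * Warn-and-disable check: report the failed condition, mark the buffer
 * unusable, and tell the caller to bail out, instead of halting the system.
 */
static int demo_warn_on(struct demo_buffer *b, int cond, const char *what)
{
	if (cond) {
		fprintf(stderr, "WARNING: ring buffer check failed: %s\n", what);
		b->disabled = 1;
	}
	return cond;
}

/* Returns 0 on success, -1 if the buffer is (or just became) disabled. */
static int demo_write(struct demo_buffer *b, int index, int size)
{
	assert(b != NULL);
	if (b->disabled)
		return -1;
	if (demo_warn_on(b, index < 0 || index >= size, "index out of range"))
		return -1;
	return 0;
}
```

Once a check fails, every later write is refused, so the rest of the system keeps running while the buffer stays out of use.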
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Tue, 11 Nov 2008 04:07:30 +0000 (23:07 -0500)]
ftrace: prevent ftrace_special from recursion
Impact: stop ftrace_special from recursion
The ftrace_special is used to help debug areas of the kernel.
Because of this, if it is put in certain locations, the fact that
it allows recursion can become a problem if the kernel developer
using it does not realize that.
This patch changes ftrace_special to not allow recursion into itself
to make it more robust.
It also changes from disabling preemption to disabling interrupts, to
prevent any loss of trace entries.
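A recursion guard of the kind described above might look like the sketch below. It models a single CPU in userspace; the struct, the flag, and the function names are hypothetical, and the interrupt disabling mentioned in the message is not modeled.

```c
#include <assert.h>

/* Hypothetical per-CPU state; a single CPU is modeled here. */
struct cpu_trace_state {
	int in_special;   /* non-zero while the special trace hook is running */
	int recorded;     /* number of entries written */
};

static void demo_special(struct cpu_trace_state *st, int depth);

/* Stand-in for code that the hook calls and that is itself instrumented. */
static void demo_instrumented_helper(struct cpu_trace_state *st, int depth)
{
	if (depth > 0)
		demo_special(st, depth - 1);
}

static void demo_special(struct cpu_trace_state *st, int depth)
{
	assert(st != 0);

	/* Re-entry on the same CPU: drop the nested call instead of recursing. */
	if (st->in_special)
		return;
	st->in_special = 1;

	st->recorded++;
	demo_instrumented_helper(st, depth);

	st->in_special = 0;
}
```

However deep the instrumented helper would have recursed, only the outermost call records an entry, and the flag is cleared again so later calls still work.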
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Tue, 11 Nov 2008 08:40:18 +0000 (09:40 +0100)]
Merge branch 'tracing/urgent' into tracing/ftrace
Conflicts:
kernel/trace/trace.c
Ingo Molnar [Tue, 11 Nov 2008 08:16:20 +0000 (09:16 +0100)]
Merge branch 'devel' of git://git./linux/kernel/git/rostedt/linux-2.6-trace into tracing/urgent
Steven Rostedt [Tue, 11 Nov 2008 02:46:01 +0000 (21:46 -0500)]
ring-buffer: prevent infinite looping on time stamping
Impact: removal of unnecessary looping
The lockless part of the ring buffer allows for reentry into the code
from interrupts. A timestamp is taken, a test is performed and if it
detects that an interrupt occurred that did tracing, it tries again.
The problem arises if the timestamp code itself causes a trace.
The detection will detect this and loop again. The difference between
this and an interrupt doing tracing, is that this will fail every time,
and cause an infinite loop.
Currently, we test if the loop happens 1000 times, and if so, it will
produce a warning and disable the ring buffer.
The problem with this approach is that it makes it difficult to perform
some types of tracing (tracing the timestamp code itself).
Each trace entry has a delta timestamp from the previous entry.
If a trace entry is reserved but an interrupt occurs and traces before
the previous entry is committed, the delta timestamp for that entry will
be zero. This actually makes sense in terms of tracing, because the
interrupt entry happened before the preempted entry was committed, so
one may consider the two happening at the same time. The order is
still preserved in the buffer.
With this idea, instead of trying to get a new timestamp if an interrupt
made it in between the timestamp and the test, the entry could simply
make the delta zero and continue. This will prevent interrupts or
tracers in the timer code from causing the above loop.
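The zero-delta idea can be illustrated with a simplified sketch. It is an assumption-laden model, not the ring buffer code: the struct, the function name, and the way the race is detected (comparing the new timestamp against the last stored one) are all hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified writer state. */
struct demo_rb {
	uint64_t write_stamp;   /* timestamp of the most recently reserved entry */
};

/*
 * Compute the delta for a new entry from the timestamp `ts` that was read
 * before reserving. If something else (for example an interrupt) reserved
 * an entry after `ts` was read, the stored stamp is already >= ts; in that
 * case use a zero delta and continue rather than retrying.
 */
static uint64_t demo_reserve_delta(struct demo_rb *rb, uint64_t ts)
{
	uint64_t delta = 0;

	assert(rb != NULL);
	if (ts > rb->write_stamp) {
		delta = ts - rb->write_stamp;
		rb->write_stamp = ts;
	}
	return delta;
}
```

Because there is no retry path in this model, a tracer firing inside the timestamp code cannot make the writer loop.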
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Tue, 11 Nov 2008 02:46:00 +0000 (21:46 -0500)]
ftrace: disable tracing on resize
Impact: fix for bug on resize
This patch addresses the bug found here:
http://bugzilla.kernel.org/show_bug.cgi?id=11996
When ftrace converted to the new unified trace buffer, the resizing of
the buffer was not protected as much as it was originally. If tracing
is performed while the resize occurs, then the buffer can be corrupted.
This patch disables all ftrace buffer modifications before a resize
takes place.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Linus Torvalds [Mon, 10 Nov 2008 17:13:37 +0000 (09:13 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda - Make the HP EliteBook 8530p use AD1884A model laptop
ALSA: gusextreme: Fix build errors
ALSA: hdsp: check for iobox and upload firmware during ioctl
ALSA: HDSP: check for io box before uploading firmware
ALSA: hda - Add another HP model (6730s) for AD1884A
alsa: fix snd_BUG_on() and friends
ALSA: hda - Add a quirk for MEDION MD96630
ALSA: hda - Limit the number of GPIOs show in proc
Takashi Iwai [Mon, 10 Nov 2008 16:58:46 +0000 (17:58 +0100)]
Merge branches 'topic/fix/misc' and 'topic/fix/hda' into for-linus
Travis Place [Mon, 10 Nov 2008 16:56:23 +0000 (17:56 +0100)]
ALSA: hda - Make the HP EliteBook 8530p use AD1884A model laptop
Added a QUIRK to patch_analog.c for the HP Elitebook 8530p
(IDs 0x103c:0x30e7) to use AD1884A model 'laptop' by default.
Playback and Capture confirmed working.
Signed-off-by: Travis Place <wishie@wishie.net>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Tejun Heo [Mon, 10 Nov 2008 05:48:21 +0000 (14:48 +0900)]
libata: revert convert-to-block-tagging patches
This patch reverts the following three commits which convert libata to
use block layer tagging.
43a49cbdf31e812c0d8f553d433b09b421f5d52c
e013e13bf605b9e6b702adffbe2853cfc60e7806
2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e
Although using block layer tagging is the right direction, due to the
tight coupling among tag number, data structure allocation and
hardware command slot allocation, libata doesn't work correctly with
the current conversion.
The biggest problem is guaranteeing that tag 0 is always used for
non-NCQ commands. Due to the way blk-tag is implemented and how SCSI
starts and finishes requests, such guarantee can't be made. I'm not
sure whether this would actually break any low level driver but it
doesn't look like a good idea to break such assumption given the
frailty of ATA controllers.
So, for the time being, keep using the old dumb in-libata qc
allocation.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ville Syrjala [Sun, 9 Nov 2008 18:32:40 +0000 (20:32 +0200)]
ALSA: gusextreme: Fix build errors
gusextreme depends on opl3 support. Add the appropriate select to Kconfig.
Also remove the unnecessary hwdep select.
Relevant build errors:
ERROR: "snd_opl3_hwdep_new" [sound/isa/gus/snd-gusextreme.ko] undefined!
ERROR: "snd_opl3_create" [sound/isa/gus/snd-gusextreme.ko] undefined!
Signed-off-by: Ville Syrjala <syrjala@sci.fi>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Linus Torvalds [Mon, 10 Nov 2008 00:36:15 +0000 (16:36 -0800)]
Linux 2.6.28-rc4
Arjan van de Ven [Sun, 9 Nov 2008 20:45:10 +0000 (12:45 -0800)]
regression: disable timer peek-ahead for 2.6.28
It's showing up as regressions; disabling it very likely just papers
over an underlying issue, but time is running out for 2.6.28, so let's
get back to this for 2.6.29.
Fixes: #11826 and #11893
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 10 Nov 2008 00:20:49 +0000 (16:20 -0800)]
Merge branch 'master' of git://git./linux/kernel/git/sam/kbuild-fixes
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes:
kbuild: Fixup deb-pkg target to generate separate firmware deb
Jonathan McDowell [Sat, 13 Sep 2008 16:08:31 +0000 (17:08 +0100)]
kbuild: Fixup deb-pkg target to generate separate firmware deb
The below is a simplistic fix for "make deb-pkg"; it splits the
firmware out to a linux-firmware-image package and adds an
(unversioned) Suggests to the linux package for this firmware.
Signed-Off-By: Jonathan McDowell <noodles@earth.li>
Acked-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Linus Torvalds [Sun, 9 Nov 2008 20:47:04 +0000 (12:47 -0800)]
Don't ask twice about not including staging drivers
The "Exclude staging drivers" question is there so that we don't build
staging drivers for allyesconfig or allnoconfig settings, but it's very
irritating when you've already said "no" to staging drivers earlier.
There is absolutely no point in declining twice - once you've declined
the staging drivers, you're done.
So make the second question depend on the first question having been
answered in the affirmative.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 9 Nov 2008 20:25:44 +0000 (12:25 -0800)]
Merge branch 'for-2.6.28' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.28' of git://linux-nfs.org/~bfields/linux:
Fix nfsd truncation of readdir results
Linus Torvalds [Sun, 9 Nov 2008 20:20:56 +0000 (12:20 -0800)]
Merge branch 'cpus4096' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'cpus4096' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
cpumask: introduce new API, without changing anything, v3
cpumask: new API, v2
cpumask: introduce new API, without changing anything
Doug Nazar [Wed, 5 Nov 2008 11:16:28 +0000 (06:16 -0500)]
Fix nfsd truncation of readdir results
Commit 8d7c4203 "nfsd: fix failure to set eof in readdir in some
situations" introduced a bug: on a directory in an exported ext3
filesystem with dir_index unset, a READDIR will only return about 250
entries, even if the directory was larger.
Bisected it back to this commit; reverting it fixes the problem.
It turns out that in this case ext3 reads a block at a time, then
returns from readdir, which means we can end up with buf.full==0 but
with more entries in the directory still to be read. Before 8d7c4203
(but after c002a6c797 "Optimise NFS readdir hack slightly"), this would
cause us to return the READDIR result immediately, but with the eof bit
unset. That could cause a performance regression (because the client
would need more roundtrips to the server to read the whole directory),
but no loss in correctness, since the cleared eof bit caused the client
to send another readdir. After 8d7c4203, the setting of the eof bit
made this a correctness problem.
So, move nfserr_eof into the loop and remove the buf.full check so that
we loop until buf.used==0. The following seems to do the right thing
and reduces the network traffic since we don't return a READDIR result
until the buffer is full.
Tested on an empty directory & large directory; eof is properly sent and
there are no more short buffers.
Signed-off-by: Doug Nazar <nazard@dragoninc.ca>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Rusty Russell [Sat, 8 Nov 2008 09:24:19 +0000 (20:24 +1100)]
cpumask: introduce new API, without changing anything, v3
Impact: cleanup
Clean up based on feedback from Andrew Morton and others:
- change to inline functions instead of macros
- add __init to bootmem method
- add a missing debug check
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Miklos Szeredi [Sun, 9 Nov 2008 14:23:57 +0000 (15:23 +0100)]
net: unix: fix inflight counting bug in garbage collector
Previously I assumed that the receive queues of candidates don't
change during the GC. This is only half true, nothing can be received
from the queues (see comment in unix_gc()), but buffers could be added
through the other half of the socket pair, which may still have file
descriptors referring to it.
This can result in inc_inflight_move_tail() erroneously increasing the
"inflight" counter for a unix socket for which dec_inflight() wasn't
previously called. This in turn can trigger the "BUG_ON(total_refs <
inflight_refs)" in a later garbage collection run.
Fix this by only manipulating the "inflight" counter for sockets which
are candidates themselves. Duplicating the file references in
unix_attach_fds() is also needed to prevent a socket becoming a
candidate for GC while the skb that contains it is not yet queued.
Reported-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nicolas Pitre [Sun, 9 Nov 2008 05:27:53 +0000 (00:27 -0500)]
clarify usage expectations for cnt32_to_63()
Currently, all existing users of cnt32_to_63() are fine since the CPU
architectures where it is used don't do read access reordering, and user
mode preemption is disabled already. It is nevertheless a good idea to
better elaborate usage requirements wrt preemption, and use an explicit
memory barrier on SMP to avoid different CPUs accessing the counter
value in the wrong order. On UP a simple compiler barrier is
sufficient.
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 9 Nov 2008 19:14:16 +0000 (11:14 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/drzeus/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
mmc: struct device - replace bus_id with dev_name(), dev_set_name()
mmc: increase SD write timeout for crappy cards
Takashi Iwai [Thu, 30 Oct 2008 14:57:05 +0000 (15:57 +0100)]
regulator: Use menuconfig in Kconfig
Use menuconfig instead of flat configs so that you can disable/enable
regulator items with one selection. Also, use depends instead of
reverse selections to make life easier, too.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Tim Blechmann [Sat, 8 Nov 2008 13:42:18 +0000 (14:42 +0100)]
ALSA: hdsp: check for iobox and upload firmware during ioctl
Currently, the error message shown when running hdspmixer or hdspconf
while the breakout box is not connected is somewhat misleading, since it
asks the user to upload the firmware.
This patch adds a test for whether the breakout box is connected, and
tries to upload the firmware if it is not present, e.g. because of a
power failure of the breakout box.
[Minor coding-style fixes by tiwai]
Signed-off-by: Tim Blechmann <tim@klingt.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Tim Blechmann [Sun, 9 Nov 2008 11:50:52 +0000 (12:50 +0100)]
ALSA: HDSP: check for io box before uploading firmware
Currently the hdsp driver tries to upload the firmware even if the
io box is not connected. This patch adds a check for the io box
before trying to upload the firmware.
Thus, instead of messages complaining about the fifo status and firmware
loading failure, the driver gives a message that no multiface or
digiface is connected.
[A minor coding-style fix by tiwai]
Signed-off-by: Tim Blechmann <tim@klingt.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Michel Marti [Sat, 8 Nov 2008 10:33:32 +0000 (11:33 +0100)]
ALSA: hda - Add another HP model (6730s) for AD1884A
Added model=laptop for another HP machine (103c:3614) with AD1884A
codec.
Signed-off-by: Michel Marti <mma@objectxp.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Kay Sievers [Sat, 8 Nov 2008 20:37:46 +0000 (21:37 +0100)]
mmc: struct device - replace bus_id with dev_name(), dev_set_name()
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-Off-By: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Sun, 26 Oct 2008 11:37:25 +0000 (12:37 +0100)]
mmc: increase SD write timeout for crappy cards
It seems that some cards are slightly out of spec and occasionally
will not be able to complete a write in the allotted 250 ms [1].
Increase the timeout slightly to allow even these cards to function
properly.
[1] http://lkml.org/lkml/2008/9/23/390
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Linus Torvalds [Sat, 8 Nov 2008 18:24:28 +0000 (10:24 -0800)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: optimize sched_clock() a bit
sched: improve sched_clock() performance
Linus Torvalds [Sat, 8 Nov 2008 18:22:38 +0000 (10:22 -0800)]
Merge branch 'oprofile-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'oprofile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
oprofile: Fix p6 counter overflow check
Cell OProfile: Incorrect local array size in activate spu profiling function
Revert "Cell OProfile: Incorrect local array size in activate spu profiling function"
oprofile: fix memory ordering
Cell OProfile: Incorrect local array size in activate spu profiling function
Change UTF8 chars in Kconfig help text about Oprofile AMD barcelona
Linus Torvalds [Sat, 8 Nov 2008 18:22:00 +0000 (10:22 -0800)]
Merge git://git./linux/kernel/git/gregkh/staging-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6:
Staging: make usbip depend on CONFIG_NET
Staging: only build the tree if we really want to
Rafael J. Wysocki [Sat, 8 Nov 2008 12:53:33 +0000 (13:53 +0100)]
Fix __pfn_to_page(pfn) for CONFIG_DISCONTIGMEM=y
Fix the __pfn_to_page(pfn) macro so that it doesn't evaluate its
argument twice in the CONFIG_DISCONTIGMEM=y case, because 'pfn' may
be a result of a function call having side effects.
For example, the hibernation code applies pfn_to_page(pfn) to the
result of a function returning the pfn corresponding to the next set
bit in a bitmap and the current bit position is modified on each
call. This leads to "interesting" failures for CONFIG_DISCONTIGMEM=y
due to the current behavior of __pfn_to_page(pfn).
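The double-evaluation problem described above can be reproduced in a few lines of userspace C. The macro, the helper, and the arithmetic below are hypothetical stand-ins and not the kernel's __pfn_to_page(); an ordinary function is used as a portable way to show the "evaluate once" shape.

```c
#include <assert.h>

static int demo_calls;   /* counts how often the argument expression ran */

/* Hypothetical helper with a side effect, like a bitmap iterator. */
static unsigned long demo_next_pfn(void)
{
	return 100UL + (unsigned long)demo_calls++;
}

/* Buggy shape: `pfn` appears twice in the body, so it is evaluated twice. */
#define DEMO_LOOKUP_BUGGY(pfn)  (((pfn) / 16UL) * 1000UL + ((pfn) % 16UL))

/* Fixed shape: the argument is evaluated once, then the local is reused. */
static unsigned long demo_lookup_fixed(unsigned long pfn)
{
	return (pfn / 16UL) * 1000UL + (pfn % 16UL);
}
```

Calling DEMO_LOOKUP_BUGGY(demo_next_pfn()) advances the iterator twice for a single lookup, while demo_lookup_fixed(demo_next_pfn()) advances it once; the hibernation case described above is the same pattern with a bitmap position instead of a counter.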
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ingo Molnar [Sat, 8 Nov 2008 16:05:38 +0000 (17:05 +0100)]
sched: optimize sched_clock() a bit
sched_clock() uses cycles_2_ns() needlessly - which is an irq-disabling
variant of __cycles_2_ns().
Most of the time sched_clock() is called with irqs disabled already.
The few places that call it with irqs enabled need to be updated.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sat, 8 Nov 2008 15:19:55 +0000 (16:19 +0100)]
sched: improve sched_clock() performance
In scheduler-intense workloads native_read_tsc() overhead accounts for
20% of the system overhead:
659567 system_call 41222.9375
686796 schedule 435.7843
718382 __switch_to 665.1685
823875 switch_mm 4526.7857
1883122 native_read_tsc 55385.9412
9761990 total 2.8468
This is in large part due to the rdtsc_barrier() that is done before
and after reading the TSC.
But sched_clock() is not a precise clock in the GTOD sense, so using such
barriers is completely pointless. So remove the barriers and only use
them in vget_cycles().
This improves lat_ctx performance by about 5%.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Sat, 8 Nov 2008 03:36:02 +0000 (22:36 -0500)]
ftrace: display start of CPU buffer in trace output
Impact: change in trace output
Because the trace buffers are per cpu ring buffers, the start of
the trace can be confusing. If one CPU is very active at the
end of the trace, its history will not go as far back as the
other CPU traces. This means that output for a particular CPU
may not appear for the first part of a trace.
To help annotate what is happening, and to prevent any more
confusion, this patch adds a line that annotates the start of
a CPU buffer output.
For example:
automount-3495 [001] 184.596443: dnotify_parent <-vfs_write
[...]
automount-3495 [001] 184.596449: dput <-path_put
automount-3496 [002] 184.596450: down_read_trylock <-do_page_fault
[...]
sshd-3497 [001] 184.597069: up_read <-do_page_fault
<idle>-0 [000] 184.597074: __exit_idle <-exit_idle
[...]
automount-3496 [002] 184.597257: filemap_fault <-__do_fault
<idle>-0 [003] 184.597261: exit_idle <-smp_apic_timer_interrupt
Note, parsers of a trace output should always ignore any lines that
start with a '#'.
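For parser authors, the rule above amounts to a one-character check per line. A minimal sketch, with hypothetical function names:

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if a trace-output line is an annotation that a parser should skip. */
static int demo_is_annotation(const char *line)
{
	assert(line != NULL);
	return line[0] == '#';
}

/* Counts the lines in `lines` that carry actual trace entries. */
static size_t demo_count_entries(const char *const *lines, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (!demo_is_annotation(lines[i]))
			count++;
	return count;
}
```

A parser written this way keeps working when new annotation lines are added to the output, since it never depends on their exact text.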
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Sat, 8 Nov 2008 03:36:02 +0000 (22:36 -0500)]
ftrace: force pass of preemptoff selftest
Impact: preemptoff not tested in selftest
Due to the BKL not being preemptable anymore, the selftest of the
preemptoff code can not be tested. It requires that it is called
with preemption enabled, but since the BKL is held, that is no
longer the case.
This patch simply skips those tests if it detects that the context
is not preemptable. The following will now show up in the tests:
Testing tracer preemptoff: can not test ... force PASSED
Testing tracer preemptirqsoff: can not test ... force PASSED
When the BKL is removed, or it becomes preemptable once again, then
the tests will be performed.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Matt Fleming [Fri, 7 Nov 2008 13:26:25 +0000 (13:26 +0000)]
ftrace: align __mcount_loc sections
Impact: add alignment option for recordmcount.pl script
Align the __mcount_loc sections so that architectures with strict
alignment requirements need not worry about performing unaligned
accesses.
This fixes an issue where I was seeing unaligned accesses, which are not
supported on our architecture (the results of an unaligned access are
undefined).
Signed-off-by: Matt Fleming <matthew.fleming@imgtec.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Sat, 8 Nov 2008 03:36:02 +0000 (22:36 -0500)]
ftrace: remove trace array ctrl
Impact: remove obsolete variable in trace_array structure
With the new start / stop method of ftrace, the ctrl variable
in the trace_array structure is now obsolete. Remove it.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Sat, 8 Nov 2008 03:36:02 +0000 (22:36 -0500)]
ftrace: remove ctrl_update method
Impact: Remove the ctrl_update tracer method
With the new quick start/stop method of tracing, the ctrl_update
method is out of date.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Sat, 8 Nov 2008 03:36:02 +0000 (22:36 -0500)]
ftrace: enable trace_printk by default
Impact: have the ftrace_printk enabled on startup
It is confusing to have to "echo trace_printk > /debug/tracing/iter_ctrl"
after adding ftrace_printk in the kernel.
Currently the trace_printk is set to off by default. ftrace_printk
should only be in open kernel code when used for debugging, and thus
it should be enabled by default.
It may also be used to record data within a tracer, but those ftrace_printks
should be within wrappers that are either enabled by trace_points or
have a variable protecting the code path from being entered when the
tracer is disabled.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Sat, 8 Nov 2008 03:36:02 +0000 (22:36 -0500)]
ftrace: irqsoff tracer incorrect reset
Impact: fix to irqsoff tracer output
In converting to the new start / stop ftrace handling, the irqsoff
tracer start called the irqsoff reset function. irqsoff tracer is
not the same as the other traces, and it resets the buffers while
searching for the longest latency.
The reset that the irqsoff stop method calls disables the function
tracing. That means that, by starting the tracer, the function
tracer is disabled incorrectly.
This patch simply removes the call to reset which keeps the function
tracing enabled. Reset is not needed for the irqsoff stop method.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Sat, 8 Nov 2008 03:36:02 +0000 (22:36 -0500)]
ftrace: fix sched_switch API
Impact: fix for sched_switch that broke dynamic ftrace startup
The commit: tracing/fastboot: use sched switch tracer from boot tracer
broke the API of the sched_switch trace. The use of the
tracing_start/stop_cmdline record is for only recording the cmdline,
NOT recording the schedule switches themselves.
Seeing that the boot tracer broke the API to do something that it
wanted, this patch adds a new interface for the API while
putting back the original interface of the old API.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Sat, 8 Nov 2008 03:36:02 +0000 (22:36 -0500)]
ftrace: fix boot trace sched startup
Impact: boot tracer startup modified
The boot tracer calls into some of the schedule tracing private functions
that should not be exported. This patch cleans it up, and makes
way for further changes in the ftrace infrastructure.
This patch adds an API to assign a tracer array to the schedule
context switch tracer.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Sat, 8 Nov 2008 03:36:02 +0000 (22:36 -0500)]
ftrace: fix set_ftrace_filter
Impact: fix of output of set_ftrace_filter
The commit "ftrace: do not show freed records in available_filter_functions"
removed a bit too much from the set_ftrace_filter code, so we now see
all functions in the set_ftrace_filter file even when we set a filter.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sat, 8 Nov 2008 08:34:35 +0000 (09:34 +0100)]
Merge branches 'tracing/ftrace', 'tracing/fastboot', 'tracing/nmisafe' and 'tracing/urgent' into tracing/core
Greg Kroah-Hartman [Wed, 29 Oct 2008 17:44:55 +0000 (10:44 -0700)]
Staging: make usbip depend on CONFIG_NET
Thanks to Randy Dunlap for finding this problem.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Greg Kroah-Hartman [Sat, 8 Nov 2008 05:12:17 +0000 (21:12 -0800)]
Staging: only build the tree if we really want to
This Kconfig change allows the common 'make allmodconfig' and
'make allyesconfig' build options to skip the staging tree, which is
probably what you want to have happen anyway.
This makes the linux-next developer's life a lot easier so he doesn't
have to worry about changes that break the staging tree, that's for me
to worry about...
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Ingo Molnar [Fri, 7 Nov 2008 18:22:10 +0000 (19:22 +0100)]
Merge branch 'oprofile-for-tip' of git://git./linux/kernel/git/rric/oprofile into x86/urgent
Linus Torvalds [Fri, 7 Nov 2008 18:09:28 +0000 (10:09 -0800)]
Merge branch 'release' of git://git./linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Reserve elfcorehdr memory in CONFIG_CRASH_DUMP
[IA64] fix boot panic caused by offline CPUs
[IA64] reorder Kconfig options to match x86
[IA64] Build VT-D iommu support into generic kernel
[IA64] remove dead BIO_VMERGE_BOUNDARY definition
[IA64] remove duplicated #include from pci-dma.c
[IA64] use common header for software IO/TLB
[IA64] fix the difference between node_mem_map and node_start_pfn
[IA64] Add error_recovery_info field to SAL section header
[IA64] Add UV watchlist support.
[IA64] Simplify SGI uv vs. sn2 driver issues
Jay Lan [Fri, 7 Nov 2008 17:51:55 +0000 (09:51 -0800)]
[IA64] Reserve elfcorehdr memory in CONFIG_CRASH_DUMP
IA64 kdump kernel failed to initialize /proc/vmcore in 2.6.28-rc2.
A bug was introduced in this patch commit:
d9a9855d0b06ca6d6cc92596fedcc03f8512e062
always reserve elfcore header memory in crash kernel
The problem was that the call to reserve_elfcorehdr() should be placed
in CONFIG_CRASH_DUMP rather than in CONFIG_CRASH_KERNEL, which does
not exist.
Signed-off-by: Jay Lan <jlan@sgi.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Linus Torvalds [Fri, 7 Nov 2008 17:18:14 +0000 (09:18 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: fix range check on mmapped sysfs resource files
PCI: remove excess kernel-doc notation
PCI: annotate return value of pci_ioremap_bar with __iomem
PCI: fix VPD limit quirk for Broadcom 5708S
Linus Torvalds [Fri, 7 Nov 2008 17:17:59 +0000 (09:17 -0800)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, xen: fix use of pgd_page now that it really does return a page
Linus Torvalds [Fri, 7 Nov 2008 17:17:46 +0000 (09:17 -0800)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: fine-tune SD_SIBLING_INIT
sched: fine-tune SD_MC_INIT
sched: fix memory leak in a failure path
sched: fix a bug in sched domain degenerate
Linus Torvalds [Fri, 7 Nov 2008 17:17:21 +0000 (09:17 -0800)]
Merge branch 'core-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
xen: make sure stray alias mappings are gone before pinning
vmap: cope with vm_unmap_aliases before vmalloc_init()
Andi Kleen [Fri, 7 Nov 2008 13:02:49 +0000 (14:02 +0100)]
oprofile: Fix p6 counter overflow check
Fix the counter overflow check for CPUs with counter width > 32
I had a similar change in a different patch that I didn't submit
and I didn't notice the problem earlier because it was always
tested together.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Alan Cox [Fri, 7 Nov 2008 16:07:02 +0000 (16:07 +0000)]
trivial: MPT fusion - remove long dead code
This triggers false bug reports as it does a bogus kmalloc with locks held
but is never really compiled into the kernel.
Closes #8329
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Fri, 7 Nov 2008 16:03:46 +0000 (16:03 +0000)]
trivial: dmi_scan typo
As we've lost our trivial maintainer for the moment I'll send this
directly. Only touches a comment
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 7 Nov 2008 16:15:18 +0000 (08:15 -0800)]
Merge branch 'for_linus' of git://git./linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: add checksum calculation when clearing UNINIT flag in ext4_new_inode
ext4: Mark the buffer_heads as dirty and uptodate after prepare_write
ext4: calculate journal credits correctly
ext4: wait on all pending commits in ext4_sync_fs()
ext4: Convert to host order before using the values.
ext4: fix missing ext4_unlock_group in error path
jbd2: deregister proc on failure in jbd2_journal_init_inode
jbd2: don't give up looking for space so easily in __jbd2_log_wait_for_space
jbd: don't give up looking for space so easily in __log_wait_for_space
Ingo Molnar [Fri, 7 Nov 2008 15:09:23 +0000 (16:09 +0100)]
sched: fine-tune SD_SIBLING_INIT
fine-tune the HT sched-domains parameters as well.
On a HT capable box, this increases lat_ctx performance from 23.87
usecs to 1.49 usecs:
# before
$ ./lat_ctx -s 0 2
"size=0k ovr=1.89
2 23.87
# after
$ ./lat_ctx -s 0 2
"size=0k ovr=1.84
2 1.49
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Galbraith [Fri, 7 Nov 2008 14:26:50 +0000 (15:26 +0100)]
sched: fine-tune SD_MC_INIT
Tune SD_MC_INIT the same way as SD_CPU_INIT:
unset SD_BALANCE_NEWIDLE, and set SD_WAKE_BALANCE.
This improves vmark by 5%:
vmark 132102 125968 125497 messages/sec avg 127855.66 .984
vmark 139404 131719 131272 messages/sec avg 134131.66 1.033
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Bohe [Fri, 7 Nov 2008 14:21:01 +0000 (09:21 -0500)]
ext4: add checksum calculation when clearing UNINIT flag in ext4_new_inode
When initializing an uninitialized block group in ext4_new_inode(),
its block group checksum must be re-calculated. This fixes a race
when several threads try to allocate a new inode in an UNINIT'd group.
There is some question whether we need to be initializing the block
bitmap in ext4_new_inode() at all, but for now, if we are going to
init the block group, let's eliminate the race.
Signed-off-by: Frederic Bohe <frederic.bohe@bull.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Aneesh Kumar K.V [Fri, 7 Nov 2008 14:06:45 +0000 (09:06 -0500)]
ext4: Mark the buffer_heads as dirty and uptodate after prepare_write
We need to make sure we mark the buffer_heads as dirty and uptodate
so that block_write_full_page write them correctly.
This fixes mmap corruptions that can occur in low memory situations.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Rusty Russell [Fri, 7 Nov 2008 00:12:29 +0000 (11:12 +1100)]
cpumask: new API, v2
- add cpumask_of()
- add free_bootmem_cpumask_var()
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jeremy Fitzhardinge [Tue, 28 Oct 2008 08:23:06 +0000 (19:23 +1100)]
xen: make sure stray alias mappings are gone before pinning
Xen requires that all mappings of pagetable pages are read-only, so
that they can't be updated illegally. As a result, if a page is being
turned into a pagetable page, we need to make sure all its mappings
are RO.
If the page had been used for ioremap or vmalloc, it may still have
left over mappings as a result of not having been lazily unmapped.
This change makes sure we explicitly mop them all up before pinning
the page.
Unlike aliases created by kmap, there can be vmalloc aliases even
for non-high pages, so we must do the flush unconditionally.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Linux Memory Management List <linux-mm@kvack.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jeremy Fitzhardinge [Tue, 28 Oct 2008 08:22:34 +0000 (19:22 +1100)]
vmap: cope with vm_unmap_aliases before vmalloc_init()
Xen can end up calling vm_unmap_aliases() before vmalloc_init() has
been called. In this case it's safe to make it a simple no-op.
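The early-return idea can be sketched in user space as follows. The flag and the `_sketch` function names are hypothetical stand-ins; the real functions operate on kernel data structures that are not modeled here.

```c
#include <stdbool.h>

static bool vmap_ready;  /* hypothetical flag, set once initialization has run */
static int flushes;      /* counts real flushes, for illustration only */

void vmalloc_init_sketch(void)
{
    vmap_ready = true;
}

void vm_unmap_aliases_sketch(void)
{
    /* Before init nothing can have been lazily unmapped, so do nothing. */
    if (!vmap_ready)
        return;
    flushes++;  /* stand-in for purging the lazy aliases */
}

int flush_count(void)
{
    return flushes;
}
```

A call made before `vmalloc_init_sketch()` leaves the flush count untouched; the same call afterwards performs the flush.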
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Linux Memory Management List <linux-mm@kvack.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Li Zefan [Fri, 7 Nov 2008 06:47:21 +0000 (14:47 +0800)]
sched: fix memory leak in a failure path
Impact: fix rare memory leak in the sched-domains manual reconfiguration code
In the failure path, rd is not attached to a sched domain,
so it causes a leak.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Li Zefan [Thu, 6 Nov 2008 01:45:16 +0000 (09:45 +0800)]
sched: fix a bug in sched domain degenerate
Impact: re-add incorrectly eliminated sched domain layers
(1) on i386 with SCHED_SMT and SCHED_MC enabled
# mount -t cgroup -o cpuset xxx /mnt
# echo 0 > /mnt/cpuset.sched_load_balance
# mkdir /mnt/0
# echo 0 > /mnt/0/cpuset.cpus
# dmesg
CPU0 attaching sched-domain:
domain 0: span 0 level CPU
groups: 0
(2) on i386 with SCHED_MC enabled but SCHED_SMT disabled
# same as (1)
# dmesg
CPU0 attaching NULL sched-domain.
The bug is that some sched domains may be skipped unintentionally when
degenerating (optimizing) sched domains.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Fri, 7 Nov 2008 00:43:44 +0000 (16:43 -0800)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
net: Fix recursive descent in __scm_destroy().
iwl3945: fix deadlock on suspend
iwl3945: do not send scan command if channel count zero
iwl3945: clear scanning bits upon failure
ath5k: correct handling of rx status fields
zd1211rw: Add 2 device IDs
Fix logic error in rfkill_check_duplicity
iwlagn: avoid sleep in softirq context
iwlwifi: clear scanning bits upon failure
Revert "ath5k: honor FIF_BCN_PRBRESP_PROMISC in STA mode"
tcp: Fix recvmsg MSG_PEEK influence of blocking behavior.
netfilter: netns ct: walk netns list under RTNL
ipv6: fix run pending DAD when interface becomes ready
net/9p: fix printk format warnings
net: fix packet socket delivery in rx irq handler
xfrm: Have af-specific init_tempsel() initialize family field of temporary selector
Andrew Morton [Thu, 6 Nov 2008 20:05:21 +0000 (12:05 -0800)]
alsa: fix snd_BUG_on() and friends
sound/pci/pcxhr/pcxhr_core.c: In function 'pcxhr_set_pipe_cmd_params':
sound/pci/pcxhr/pcxhr_core.c:700: warning: statement with no effect
sound/pci/pcxhr/pcxhr_core.c:706: warning: statement with no effect
sound/pci/pcxhr/pcxhr_core.c:710: warning: statement with no effect
Due to
try to fix this, and be more conventional about the empty stubs.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Linus Torvalds [Thu, 6 Nov 2008 23:57:24 +0000 (15:57 -0800)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
Revert "x86: default to reboot via ACPI"
x86: align DirectMap in /proc/meminfo
AMD IOMMU: fix lazy IO/TLB flushing in unmap path
x86: add smp_mb() before sending INVALIDATE_TLB_VECTOR
x86: remove VISWS and PARAVIRT around NR_IRQS puzzle
x86: mention ACPI in top-level Kconfig menu
x86: size NR_IRQS on 32-bit systems the same way as 64-bit
x86: don't allow nr_irqs > NR_IRQS
x86/docs: remove noirqbalance param docs
x86: don't use tsc_khz to calculate lpj if notsc is passed
x86, voyager: fix smp_intr_init() compile breakage
AMD IOMMU: fix detection of NP capable IOMMUs
Linus Torvalds [Thu, 6 Nov 2008 23:56:29 +0000 (15:56 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] xsc3: fix xsc3_l2_inv_range
[ARM] mm: fix page table initialization
[ARM] fix naming of MODULE_START / MODULE_END
ARM: OMAP: Fix define for twl4030 irqs
ARM: OMAP: Fix get_irqnr_and_base to clear spurious interrupt bits
ARM: OMAP: Fix debugfs_create_*'s error checking method for arm/plat-omap
ARM: OMAP: Fix compiler warnings in gpmc.c
[ARM] fix VFP+softfloat binaries
Linus Torvalds [Thu, 6 Nov 2008 23:55:34 +0000 (15:55 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ieee1394/linux1394-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
ieee1394: dv1394: fix possible deadlock in multithreaded clients
ieee1394: raw1394: fix possible deadlock in multithreaded clients
ieee1394: struct device - replace bus_id with dev_name(), dev_set_name()
firewire: struct device - replace bus_id with dev_name(), dev_set_name()
Linus Torvalds [Thu, 6 Nov 2008 23:53:47 +0000 (15:53 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
Block: use round_jiffies_up()
Add round_jiffies_up and related routines
block: fix __blkdev_get() for removable devices
generic-ipi: fix the smp_mb() placement
blk: move blk_delete_timer call in end_that_request_last
block: add timer on blkdev_dequeue_request() not elv_next_request()
bio: define __BIOVEC_PHYS_MERGEABLE
block: remove unused ll_new_mergeable()
David S. Miller [Thu, 6 Nov 2008 23:52:00 +0000 (15:52 -0800)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-2.6
Linus Torvalds [Thu, 6 Nov 2008 23:50:54 +0000 (15:50 -0800)]
Merge git://git./linux/kernel/git/wim/linux-2.6-watchdog
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
[WATCHDOG] SAM9 watchdog - supported on all SAM9 and CAP9 processors
[WATCHDOG] SAM9 watchdog - update for moved headers
Linus Torvalds [Thu, 6 Nov 2008 23:50:11 +0000 (15:50 -0800)]
Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
md: linear: Fix a division by zero bug for very small arrays.
md: fix bug in raid10 recovery.
md: revert the recent addition of a call to the BLKRRPART ioctl.
Linus Torvalds [Thu, 6 Nov 2008 23:46:28 +0000 (15:46 -0800)]
Merge branch 'merge' of git://git./linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc: Fix "unused variable" warning in pci_dlpar.c
powerpc/cell: Fix compile error in ras.c
powerpc/ps3: Fix compile error in ps3-lpm.c
Linus Torvalds [Thu, 6 Nov 2008 23:45:57 +0000 (15:45 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
net/9p: fix printk format warnings
unsigned fid->fid cannot be negative
9p: rdma: remove duplicated #include
p9: Fix leak of waitqueue in request allocation path
9p: Remove unneeded free of fcall for Flush
9p: Make all client spin locks IRQ safe
9p: rdma: Set trans prior to requesting async connection ops
Linus Torvalds [Thu, 6 Nov 2008 23:45:40 +0000 (15:45 -0800)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: re-tune balancing
sched: fix buddies for group scheduling
sched: backward looking buddy
sched: fix fair preempt check
sched: cleanup fair task selection
David S. Miller [Thu, 6 Nov 2008 23:45:32 +0000 (15:45 -0800)]
net: Fix recursive descent in __scm_destroy().
__scm_destroy() walks the list of file descriptors in the scm_fp_list
pointed to by the scm_cookie argument.
Those, in turn, can close sockets and invoke __scm_destroy() again.
There is nothing which limits how deeply this can occur.
The idea for how to fix this is from Linus. Basically, we do all of
the fput()s at the top level by collecting all of the scm_fp_list
objects hit by an fput(). Inside of the initial __scm_destroy() we
keep running the list until it is empty.
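The work-list approach described above can be sketched in user space like this. `struct batch` is a hypothetical stand-in for scm_fp_list, the nested teardown stands in for an fput() that closes a socket, and the depth counters exist only to show that nesting stays bounded. Memory is owned by the caller in this sketch, so nothing is freed.

```c
#include <stddef.h>

/* Hypothetical stand-in for scm_fp_list: tearing one down may release another. */
struct batch {
    struct batch *next;    /* link on the pending work list */
    struct batch *nested;  /* batch released when this one is torn down, or NULL */
};

static struct batch *pending;  /* batches queued for teardown */
static int draining;           /* set while the outermost call runs the list */
static int depth, max_depth;   /* instrumentation: call nesting observed */

void batch_destroy(struct batch *b)
{
    if (++depth > max_depth)
        max_depth = depth;

    /* Always queue; only the outermost caller does the actual work. */
    b->next = pending;
    pending = b;

    if (!draining) {
        draining = 1;
        while (pending) {
            struct batch *cur = pending;

            pending = cur->next;
            /* Stand-in for fput(): may re-enter batch_destroy(). */
            if (cur->nested)
                batch_destroy(cur->nested);
        }
        draining = 0;
    }
    depth--;
}

/* Link n (>= 1) batches into a chain, destroy the head, report the deepest nesting. */
int chain_teardown_depth(struct batch *chain, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        chain[i].nested = (i + 1 < n) ? &chain[i + 1] : NULL;
    max_depth = 0;
    batch_destroy(&chain[0]);
    return max_depth;
}
```

However long the chain is, a re-entrant call only queues its batch and returns, so the nesting never exceeds two levels in this sketch.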
Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Wed, 5 Nov 2008 17:38:47 +0000 (17:38 +0000)]
Fix accidental implicit cast in HR-timer conversion
Fix the hrtimer_add_expires_ns() function. It should take a 'u64 ns' argument,
but rather takes an 'unsigned long ns' argument - which might only be 32-bits.
On FRV, this results in the kernel locking up because hrtimer_forward() passes
the result of a 64-bit multiplication to this function, for which the compiler
discards the top 32-bits - something that didn't happen when ktime_add_ns() was
called directly.
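The truncation can be reproduced in user space. This sketch assumes a 32-bit 'unsigned long' and models it with uint32_t; the function names are hypothetical and only mirror the shape of the call chain described above.

```c
#include <stdint.h>

/* Model of the buggy prototype on a 32-bit arch: the ns argument is 32 bits. */
uint64_t add_expires_ns_narrow(uint64_t expires, uint32_t ns)
{
    return expires + ns;
}

/* Model of the fixed prototype: the full 64-bit value is passed through. */
uint64_t add_expires_ns_wide(uint64_t expires, uint64_t ns)
{
    return expires + ns;
}

uint64_t forward_narrow(uint64_t expires, uint64_t interval_ns, uint64_t overruns)
{
    /* The 64-bit product is implicitly converted to 32 bits at this call. */
    return add_expires_ns_narrow(expires, interval_ns * overruns);
}

uint64_t forward_wide(uint64_t expires, uint64_t interval_ns, uint64_t overruns)
{
    return add_expires_ns_wide(expires, interval_ns * overruns);
}
```

For a 1 s interval (1000000000 ns) and 5 overruns, the narrow variant advances the expiry by only 705032704 ns instead of 5000000000 ns, because the top 32 bits of the product are dropped.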
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 6 Nov 2008 23:43:13 +0000 (15:43 -0800)]
Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6:
[JFFS2] fix race condition in jffs2_lzo_compress()
[MTD] [NOR] Fix cfi_send_gen_cmd handling of x16 devices in x8 mode (v4)
[JFFS2] Fix lack of locking in thread_should_wake()
[JFFS2] Fix build failure with !CONFIG_JFFS2_FS_WRITEBUFFER
[MTD] [NAND] OMAP2: remove duplicated #include
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:58 +0000 (12:53 -0800)]
fat: i_blocks warning fix
The blkcnt_t type depends on CONFIG_LSF, so always use unsigned long long
for printk(). To avoid typing that out each time, add "llu" and use it.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:57 +0000 (12:53 -0800)]
fat: ->i_pos race fix
i_pos is a 64-bit value, so updating it is not atomic.
The only place where this matters is fat_write_inode(); the other
unlocked accesses are just for printk().
This adds a lock for "BITS_PER_LONG == 32" kernels.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:57 +0000 (12:53 -0800)]
fat: mmu_private race fix
mmu_private is a 64-bit value, so updating it is not atomic.
So, the access rule for mmu_private is that we must hold ->i_mutex. But
the fat_get_block() path doesn't follow the rule on the non-allocation path.
This fixes it by using i_size instead on the non-allocation path.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:56 +0000 (12:53 -0800)]
fat: Add printf attribute to fat_fs_panic()
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:56 +0000 (12:53 -0800)]
fat: Fix _fat_bmap() race
fat_get_cluster() assumes the requested blocknr isn't truncated during
read. _fat_bmap() doesn't follow this rule.
This protects it by ->i_mutex.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:55 +0000 (12:53 -0800)]
fat: Fix ATTR_RO for directory
FAT has the ATTR_RO (read-only) attribute. But on Windows, the ATTR_RO
of a directory is actually just ignored, and is used only by
applications as a flag. E.g. it is set for customized folders by
Explorer.
http://msdn2.microsoft.com/en-us/library/aa969337.aspx
This adds a "rodir" option. If the user specifies it, ATTR_RO is used as
the read-only flag even for directories. Otherwise, inode->i_mode
is not used to hold ATTR_RO (i.e. fat_mode_can_save_ro() returns 0).
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:54 +0000 (12:53 -0800)]
fat: Fix ATTR_RO in the case of (~umask & S_WUGO) == 0
If inode->i_mode doesn't have S_WUGO, current code assumes it means
ATTR_RO. However, if (~[ufd]mask & S_WUGO) == 0, inode->i_mode can't
hold S_WUGO. Therefore the updated directory entry will always have
ATTR_RO.
This adds fat_mode_can_hold_ro() to check for that case. If inode->i_mode
can't hold it, ->i_attrs is used to hold ATTR_RO instead.
With this, we don't set ATTR_RO unless users change it via ioctl() if
(~[ufd]mask & S_WUGO) == 0.
On the FAT_IOCTL_GET_ATTRIBUTES path, this also takes ->i_mutex so that
attributes partially updated by FAT_IOCTL_SET_ATTRIBUTES are not returned
to userland.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:54 +0000 (12:53 -0800)]
fat: Cleanup FAT attribute stuff
This adds three helpers:
fat_make_attrs() - makes FAT attributes from inode.
fat_make_mode() - makes mode_t from FAT attributes.
fat_save_attrs() - saves FAT attributes to inode.
Then this replaces: MSDOS_MKMODE() by fat_make_mode(), fat_attr() by
fat_make_attrs(), ->i_attrs = attr & ATTR_UNUSED by fat_save_attrs().
And for the root inode, these are used with ATTR_DIR instead of the bogus
ATTR_NONE.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:53 +0000 (12:53 -0800)]
fat: Cleanup msdos_lookup()
Use the same style as vfat_lookup().
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:52 +0000 (12:53 -0800)]
fat: Kill d_invalidate() in vfat_lookup()
d_invalidate() for a positive dentry doesn't work in some cases
(vfsmount, nfsd, and maybe others). The shrink_dcache_parent() done by
d_invalidate() is also pointless for vfat.
So, this kills it, and uses d_move() instead.
To keep the old behavior, this simply returns the alias for a directory
(so pwd etc. don't change). Directory lookup shouldn't be important for
performance.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:51 +0000 (12:53 -0800)]
fat: Fix/Cleanup dcache handling for vfat
- Add comments for handling dcache of vfat.
- Separate case-sensitive case and case-insensitive to
vfat_revalidate() and vfat_ci_revalidate().
vfat_revalidate() doesn't need to drop case-insensitive negative
dentry on creation path.
- Current code fails to set ->d_revalidate on the negative dentry
created by unlink etc.
This sets ->d_revalidate always, and returns 1 for positive
dentry. Now, we don't need to change ->d_op dynamically anymore,
so this just uses sb->s_root->d_op to set ->d_op.
- d_find_alias() may return DCACHE_DISCONNECTED dentry. It's not
the interesting dentry there. This checks it.
- Add missing LOOKUP_PARENT check. We don't need to drop the valid
negative dentry for (LOOKUP_CREATE | LOOKUP_PARENT) lookup.
- For consistent filename on creation path, this drops negative dentry
if we can't see intent.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:51 +0000 (12:53 -0800)]
vfat: Fix vfat_find() error path in vfat_lookup()
Current vfat_lookup() creates a negative dentry blindly if vfat_find()
returned an error. That is wrong: if the error isn't -ENOENT, just
return the error.
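The intended decision can be sketched as a small classifier. The enum and the function name are hypothetical; only the -ENOENT distinction comes from the description above.

```c
#include <errno.h>

/* Hypothetical outcomes of a lookup, for illustration only. */
enum lookup_action {
    LOOKUP_POSITIVE,  /* entry found: attach the inode */
    LOOKUP_NEGATIVE,  /* entry does not exist: cache a negative dentry */
    LOOKUP_FAIL       /* any other error: propagate it to the caller */
};

enum lookup_action classify_find_result(int err)
{
    if (err == 0)
        return LOOKUP_POSITIVE;
    if (err == -ENOENT)
        return LOOKUP_NEGATIVE;
    return LOOKUP_FAIL;
}
```

An I/O error such as -EIO therefore no longer ends up cached as "file does not exist".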
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:50 +0000 (12:53 -0800)]
fat: use fat_detach() in fat_clear_inode()
Use fat_detach() instead of opencoding it.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:49 +0000 (12:53 -0800)]
fat: Fix fat_ent_update_ptr() for FAT12
This fixes the missing update for bhs/nr_bhs in case the caller
accessed from block boundary to first block of boundary.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>