Peter Zijlstra [Wed, 25 May 2011 00:12:03 +0000 (17:12 -0700)]
lockdep, mutex: provide mutex_lock_nest_lock
In order to convert i_mmap_lock to a mutex we need a mutex equivalent to
spin_lock_nest_lock(), thus provide the mutex_lock_nest_lock() annotation.
As with spin_lock_nest_lock(), mutex_lock_nest_lock() allows annotation of
the locking pattern where an outer lock serializes the acquisition order
of nested locks. That is, if every time you lock multiple locks A, say A1
and A2 you first acquire N, the order of acquiring A1 and A2 is
irrelevant.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Wed, 25 May 2011 00:12:01 +0000 (17:12 -0700)]
mm: extended batches for generic mmu_gather
Instead of using a single batch (the small on-stack, or an allocated
page), try and extend the batch every time it runs out and only flush once
either the extend fails or we're done.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Requested-by: Nick Piggin <npiggin@kernel.dk>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Wed, 25 May 2011 00:12:00 +0000 (17:12 -0700)]
mm, powerpc: move the RCU page-table freeing into generic code
In case other architectures require RCU freed page-tables to implement
gup_fast() and software filled hashes and similar things, provide the
means to do so by moving the logic into generic code.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Requested-by: David Miller <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Wed, 25 May 2011 00:11:58 +0000 (17:11 -0700)]
mm: now that all old mmu_gather code is gone, remove the storage
Fold all the mmu_gather rework patches into one for submission
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reported-by: Hugh Dickins <hughd@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Wed, 25 May 2011 00:11:57 +0000 (17:11 -0700)]
um: mmu_gather rework
Fix up the um mmu_gather code to conform to the new API.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Wed, 25 May 2011 00:11:55 +0000 (17:11 -0700)]
ia64: mmu_gather rework
Fix up the ia64 mmu_gather code to conform to the new API.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Wed, 25 May 2011 00:11:54 +0000 (17:11 -0700)]
sh: mmu_gather rework
Fix up the sh mmu_gather code to conform to the new API.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Wed, 25 May 2011 00:11:53 +0000 (17:11 -0700)]
arm: mmu_gather rework
Fix up the arm mmu_gather code to conform to the new API.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Wed, 25 May 2011 00:11:51 +0000 (17:11 -0700)]
s390: mmu_gather rework
Adapt the stand-alone s390 mmu_gather implementation to the new
preemptible mmu_gather interface.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Wed, 25 May 2011 00:11:50 +0000 (17:11 -0700)]
sparc: mmu_gather rework
Rework the sparc mmu_gather usage to conform to the new world order :-)
Sparc mmu_gather does two things:
- tracks vaddrs to unhash
- tracks pages to free
Split these two things like powerpc has done and keep the vaddrs
in per-cpu data structures and flush them on context switch.
The remaining bits can then use the generic mmu_gather.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: David Miller <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Wed, 25 May 2011 00:11:48 +0000 (17:11 -0700)]
powerpc: mmu_gather rework
Fix up powerpc to the new mmu_gather stuff.
PPC has an extra batching queue to RCU free the actual pagetable
allocations, use the ARCH extentions for that for now.
For the ppc64_tlb_batch, which tracks the vaddrs to unhash from the
hardware hash-table, keep using per-cpu arrays but flush on context switch
and use a TLF bit to track the lazy_mmu state.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Wed, 25 May 2011 00:11:45 +0000 (17:11 -0700)]
mm: mmu_gather rework
Rework the existing mmu_gather infrastructure.
The direct purpose of these patches was to allow preemptible mmu_gather,
but even without that I think these patches provide an improvement to the
status quo.
The first 9 patches rework the mmu_gather infrastructure. For review
purpose I've split them into generic and per-arch patches with the last of
those a generic cleanup.
The next patch provides generic RCU page-table freeing, and the followup
is a patch converting s390 to use this. I've also got 4 patches from
DaveM lined up (not included in this series) that uses this to implement
gup_fast() for sparc64.
Then there is one patch that extends the generic mmu_gather batching.
After that follow the mm preemptibility patches, these make part of the mm
a lot more preemptible. It converts i_mmap_lock and anon_vma->lock to
mutexes which together with the mmu_gather rework makes mmu_gather
preemptible as well.
Making i_mmap_lock a mutex also enables a clean-up of the truncate code.
This also allows for preemptible mmu_notifiers, something that XPMEM I
think wants.
Furthermore, it removes the new and universially detested unmap_mutex.
This patch:
Remove the first obstacle towards a fully preemptible mmu_gather.
The current scheme assumes mmu_gather is always done with preemption
disabled and uses per-cpu storage for the page batches. Change this to
try and allocate a page for batching and in case of failure, use a small
on-stack array to make some progress.
Preemptible mmu_gather is desired in general and usable once i_mmap_lock
becomes a mutex. Doing it before the mutex conversion saves us from
having to rework the code by moving the mmu_gather bits inside the
pte_lock.
Also avoid flushing the tlb batches from under the pte lock, this is
useful even without the i_mmap_lock conversion as it significantly reduces
pte lock hold times.
[akpm@linux-foundation.org: fix comment tpyo]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Wed, 25 May 2011 00:11:44 +0000 (17:11 -0700)]
mm: make expand_downwards() symmetrical with expand_upwards()
Currently we have expand_upwards exported while expand_downwards is
accessible only via expand_stack or expand_stack_downwards.
check_stack_guard_page is a nice example of the asymmetry. It uses
expand_stack for VM_GROWSDOWN while expand_upwards is called for
VM_GROWSUP case.
Let's clean this up by exporting both functions and make those names
consistent. Let's use expand_{upwards,downwards} because expanding
doesn't always involve stack manipulation (an example is
ia64_do_page_fault which uses expand_upwards for registers backing store
expansion). expand_downwards has to be defined for both
CONFIG_STACK_GROWS{UP,DOWN} because get_arg_page calls the downwards
version in the early process initialization phase for growsup
configuration.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Wed, 25 May 2011 00:11:43 +0000 (17:11 -0700)]
mm/vmalloc: remove guard page from between vmap blocks
The vmap allocator is used to, among other things, allocate per-cpu vmap
blocks, where each vmap block is naturally aligned to its own size.
Obviously, leaving a guard page after each vmap area forbids packing vmap
blocks efficiently and can make the kernel run out of possible vmap blocks
long before overall vmap space is exhausted.
The new interface to map a user-supplied page array into linear vmalloc
space (vm_map_ram) insists on allocating from a vmap block (instead of
falling back to a custom area) when the area size is below a certain
threshold. With heavy users of this interface (e.g. XFS) and limited
vmalloc space on 32-bit, vmap block exhaustion is a real problem.
Remove the guard page from the core vmap allocator. vmalloc and the old
vmap interface enforce a guard page on their own at a higher level.
Note that without this patch, we had accidental guard pages after those
vm_map_ram areas that happened to be at the end of a vmap block, but not
between every area. This patch removes this accidental guard page only.
If we want guard pages after every vm_map_ram area, this should be done
separately. And just like with vmalloc and the old interface on a
different level, not in the core allocator.
Mel pointed out: "If necessary, the guard page could be reintroduced as a
debugging-only option (CONFIG_DEBUG_PAGEALLOC?). Otherwise it seems
reasonable."
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Dave Chinner <david@fromorbit.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Hansen [Wed, 25 May 2011 00:11:42 +0000 (17:11 -0700)]
include/linux/gfp.h: convert BUG_ON() into VM_BUG_ON()
VM_BUG_ON() if effectively a BUG_ON() undef #ifdef CONFIG_DEBUG_VM. That
is exactly what we have here now, and two different folks have suggested
doing it this way.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Hansen [Wed, 25 May 2011 00:11:41 +0000 (17:11 -0700)]
include/linux/gfp.h: work around apparent sparse confusion
Running sparse on page_alloc.c today, it errors out:
include/linux/gfp.h:254:17: error: bad constant expression
include/linux/gfp.h:254:17: error: cannot size expression
which is a line in gfp_zone():
BUILD_BUG_ON((GFP_ZONE_BAD >> bit) & 1);
That's really unfortunate, because it ends up hiding all of the other
legitimate sparse messages like this:
mm/page_alloc.c:5315:59: warning: incorrect type in argument 1 (different base types)
mm/page_alloc.c:5315:59: expected unsigned long [unsigned] [usertype] size
mm/page_alloc.c:5315:59: got restricted gfp_t [usertype] <noident>
...
Having sparse be able to catch these very oopsable bugs is a lot more
important than keeping a BUILD_BUG_ON(). Kill the BUILD_BUG_ON().
Compiles on x86_64 with and without CONFIG_DEBUG_VM=y. defconfig boots
fine for me.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Wed, 25 May 2011 00:11:40 +0000 (17:11 -0700)]
oom: replace PF_OOM_ORIGIN with toggling oom_score_adj
There's a kernel-wide shortage of per-process flags, so it's always
helpful to trim one when possible without incurring a significant penalty.
It's even more important when you're planning on adding a per- process
flag yourself, which I plan to do shortly for transparent hugepages.
PF_OOM_ORIGIN is used by ksm and swapoff to prefer current since it has a
tendency to allocate large amounts of memory and should be preferred for
killing over other tasks. We'd rather immediately kill the task making
the errant syscall rather than penalizing an innocent task.
This patch removes PF_OOM_ORIGIN since its behavior is equivalent to
setting the process's oom_score_adj to OOM_SCORE_ADJ_MAX.
The process's old oom_score_adj is stored and then set to
OOM_SCORE_ADJ_MAX during the time it used to have PF_OOM_ORIGIN. The old
value is then reinstated when the process should no longer be considered a
high priority for oom killing.
Signed-off-by: David Rientjes <rientjes@google.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Izik Eidus <ieidus@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrea Arcangeli [Wed, 25 May 2011 00:11:38 +0000 (17:11 -0700)]
mm/compaction: reverse the change that forbade sync migraton with __GFP_NO_KSWAPD
It's uncertain this has been beneficial, so it's safer to undo it. All
other compaction users would still go in synchronous mode if a first
attempt at async compaction failed. Hopefully we don't need to force
special behavior for THP (which is the only __GFP_NO_KSWAPD user so far
and it's the easier to exercise and to be noticeable). This also make
__GFP_NO_KSWAPD return to its original strict semantics specific to bypass
kswapd, as THP allocations have khugepaged for the async THP
allocations/compactions.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Alex Villacis Lasso <avillaci@fiec.espol.edu.ec>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KOSAKI Motohiro [Wed, 25 May 2011 00:11:33 +0000 (17:11 -0700)]
mm, mem-hotplug: update pcp->stat_threshold when memory hotplug occur
Currently, cpu hotplug updates pcp->stat_threshold, but memory hotplug
doesn't. There is no reason for this.
[akpm@linux-foundation.org: fix CONFIG_SMP=n build]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KOSAKI Motohiro [Wed, 25 May 2011 00:11:32 +0000 (17:11 -0700)]
mm, mem-hotplug: recalculate lowmem_reserve when memory hotplug occurs
Currently, memory hotplug calls setup_per_zone_wmarks() and
calculate_zone_inactive_ratio(), but doesn't call
setup_per_zone_lowmem_reserve().
It means the number of reserved pages aren't updated even if memory hot
plug occur. This patch fixes it.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KOSAKI Motohiro [Wed, 25 May 2011 00:11:31 +0000 (17:11 -0700)]
mm, mem-hotplug: fix section mismatch. setup_per_zone_inactive_ratio() should be __meminit.
Commit
bce7394a3e ("page-allocator: reset wmark_min and inactive ratio of
zone when hotplug happens") introduced invalid section references. Now,
setup_per_zone_inactive_ratio() is marked __init and then it can't be
referenced from memory hotplug code.
This patch marks it as __meminit and also marks caller as __ref.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KOSAKI Motohiro [Wed, 25 May 2011 00:11:30 +0000 (17:11 -0700)]
x86,mm: make pagefault killable
When an oom killing occurs, almost all processes are getting stuck at the
following two points.
1) __alloc_pages_nodemask
2) __lock_page_or_retry
1) is not very problematic because TIF_MEMDIE leads to an allocation
failure and getting out from page allocator.
2) is more problematic. In an OOM situation, zones typically don't have
page cache at all and memory starvation might lead to greatly reduced IO
performance. When a fork bomb occurs, TIF_MEMDIE tasks don't die quickly,
meaning that a fork bomb may create new process quickly rather than the
oom-killer killing it. Then, the system may become livelocked.
This patch makes the pagefault interruptible by SIGKILL.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KOSAKI Motohiro [Wed, 25 May 2011 00:11:29 +0000 (17:11 -0700)]
mm: introduce wait_on_page_locked_killable()
commit
2687a356 ("Add lock_page_killable") introduced killable
lock_page(). Similarly this patch introdues killable
wait_on_page_locked().
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KOSAKI Motohiro [Wed, 25 May 2011 00:11:28 +0000 (17:11 -0700)]
mm: per-node vmstat: show proper vmstats
commit
2ac390370a ("writeback: add
/sys/devices/system/node/<node>/vmstat") added vmstat entry. But
strangely it only show nr_written and nr_dirtied.
# cat /sys/devices/system/node/node20/vmstat
nr_written 0
nr_dirtied 0
Of course, It's not adequate. With this patch, the vmstat show all vm
stastics as /proc/vmstat.
# cat /sys/devices/system/node/node0/vmstat
nr_free_pages 899224
nr_inactive_anon 201
nr_active_anon 17380
nr_inactive_file 31572
nr_active_file 28277
nr_unevictable 0
nr_mlock 0
nr_anon_pages 17321
nr_mapped 8640
nr_file_pages 60107
nr_dirty 33
nr_writeback 0
nr_slab_reclaimable 6850
nr_slab_unreclaimable 7604
nr_page_table_pages 3105
nr_kernel_stack 175
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 260
nr_dirtied 1050
nr_written 938
numa_hit 962872
numa_miss 0
numa_foreign 0
numa_interleave 8617
numa_local 962872
numa_other 0
nr_anon_transparent_hugepages 0
[akpm@linux-foundation.org: no externs in .c files]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michael Rubin <mrubin@google.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Namhyung Kim [Wed, 25 May 2011 00:11:27 +0000 (17:11 -0700)]
mm: nommu: fix a compile warning in do_mmap_pgoff()
Because 'ret' is declared as int, not unsigned long, no need to cast the
error contants into unsigned long. If you compile this code on a 64-bit
machine somehow, you'll see following warning:
CC mm/nommu.o
mm/nommu.c: In function `do_mmap_pgoff':
mm/nommu.c:1411: warning: overflow in implicit constant conversion
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Namhyung Kim [Wed, 25 May 2011 00:11:26 +0000 (17:11 -0700)]
mm: nommu: fix a potential memory leak in do_mmap_private()
If f_op->read() fails and sysctl_nr_trim_pages > 1, there could be a
memory leak between @region->vm_end and @region->vm_top.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Namhyung Kim [Wed, 25 May 2011 00:11:25 +0000 (17:11 -0700)]
mm: nommu: check the vma list when unmapping file-mapped vma
Now we have the sorted vma list, use it in do_munmap() to check that we
have an exact match.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Namhyung Kim [Wed, 25 May 2011 00:11:24 +0000 (17:11 -0700)]
mm: nommu: find vma using the sorted vma list
Now we have the sorted vma list, use it in the find_vma[_exact]() rather
than doing linear search on the rb-tree.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Namhyung Kim [Wed, 25 May 2011 00:11:23 +0000 (17:11 -0700)]
mm: nommu: don't scan the vma list when deleting
Since commit
297c5eee3724 ("mm: make the vma list be doubly linked") made
it a doubly linked list, we don't need to scan the list when deleting
@vma.
And the original code didn't update the prev pointer. Fix it too.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Namhyung Kim [Wed, 25 May 2011 00:11:22 +0000 (17:11 -0700)]
mm: nommu: sort mm->mmap list properly
When I was reading nommu code, I found that it handles the vma list/tree
in an unusual way. IIUC, because there can be more than one
identical/overrapped vmas in the list/tree, it sorts the tree more
strictly and does a linear search on the tree. But it doesn't applied to
the list (i.e. the list could be constructed in a different order than
the tree so that we can't use the list when finding the first vma in that
order).
Since inserting/sorting a vma in the tree and link is done at the same
time, we can easily construct both of them in the same order. And linear
searching on the tree could be more costly than doing it on the list, it
can be converted to use the list.
Also, after the commit
297c5eee3724 ("mm: make the vma list be doubly
linked") made the list be doubly linked, there were a couple of code need
to be fixed to construct the list properly.
Patch 1/6 is a preparation. It maintains the list sorted same as the tree
and construct doubly-linked list properly. Patch 2/6 is a simple
optimization for the vma deletion. Patch 3/6 and 4/6 convert tree
traversal to list traversal and the rest are simple fixes and cleanups.
This patch:
@vma added into @mm should be sorted by start addr, end addr and VMA
struct addr in that order because we may get identical VMAs in the @mm.
However this was true only for the rbtree, not for the list.
This patch fixes this by remembering 'rb_prev' during the tree traversal
like find_vma_prepare() does and linking the @vma via __vma_link_list().
After this patch, we can iterate the whole VMAs in correct order simply by
using @mm->mmap list.
[akpm@linux-foundation.org: avoid duplicating __vma_link_list()]
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sergey Senozhatsky [Wed, 25 May 2011 00:11:21 +0000 (17:11 -0700)]
mm: remove unused zone_idx variable from set_migratetype_isolate
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Shaohua Li [Wed, 25 May 2011 00:11:20 +0000 (17:11 -0700)]
mmap: avoid merging cloned VMAs
Avoid merging a VMA with another VMA which is cloned from the parent process.
The cloned VMA shares the anon_vma lock with the parent process's VMA. If
we do the merge, more vmas (even the new range is only for current
process) use the perent process's anon_vma lock. This introduces
scalability issues. find_mergeable_anon_vma() already considers this
case.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Shaohua Li [Wed, 25 May 2011 00:11:19 +0000 (17:11 -0700)]
mmap: avoid unnecessary anon_vma lock
If we only change vma->vm_end, we can avoid taking anon_vma lock even if
'insert' isn't NULL, which is the case of split_vma.
As I understand it, we need the lock before because rmap must get the
'insert' VMA when we adjust old VMA's vm_end (the 'insert' VMA is linked
to anon_vma list in __insert_vm_struct before).
But now this isn't true any more. The 'insert' VMA is already linked to
anon_vma list in __split_vma(with anon_vma_clone()) instead of
__insert_vm_struct. There is no race rmap can't get required VMAs. So
the anon_vma lock is unnecessary, and this can reduce one locking in brk
case and improve scalability.
Signed-off-by: Shaohua Li<shaohua.li@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Shaohua Li [Wed, 25 May 2011 00:11:18 +0000 (17:11 -0700)]
mmap: add alignment for some variables
Make some variables have correct alignment/section to avoid cache issue.
In a workload which heavily does mmap/munmap, the variables will be used
frequently.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Wed, 25 May 2011 00:11:16 +0000 (17:11 -0700)]
arch, mm: filter disallowed nodes from arch specific show_mem functions
Architectures that implement their own show_mem() function did not pass
the filter argument to show_free_areas() to appropriately avoid emitting
the state of nodes that are disallowed in the current context. This patch
now passes the filter argument to show_free_areas() so those nodes are now
avoided.
This patch also removes the show_free_areas() wrapper around
__show_free_areas() and converts existing callers to pass an empty filter.
ia64 emits additional information for each node, so skip_free_areas_zone()
must be made global to filter disallowed nodes and it is converted to use
a nid argument rather than a zone for this use case.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Helge Deller <deller@gmx.de>
Cc: James Bottomley <jejb@parisc-linux.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sebastian Andrzej Siewior [Wed, 25 May 2011 00:11:15 +0000 (17:11 -0700)]
xtensa/mm: remove WANT_PAGE_VIRTUAL
This is not useful: it provides page->virtual and is used with highmem.
xtensa has no support for highmem and those HIGHMEM bits which are found
by grep are partly implemented. The interesting functions like kmap() are
missing. If someone actually implements the complete HIGHMEM support he
could use HASHED_PAGE_VIRTUAL like most others do.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Chris Zankel <chris@zankel.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sebastian Andrzej Siewior [Wed, 25 May 2011 00:11:14 +0000 (17:11 -0700)]
xtensa/mm: remove pgtable.c
It is not referenced by any Makefile. pte_alloc_one_kernel() and
pte_alloc_one() is implemented in arch/xtensa/include/asm/pgalloc.h.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Chris Zankel <chris@zankel.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Liu Yuan [Wed, 25 May 2011 00:11:12 +0000 (17:11 -0700)]
drivers/video/backlight/adp5520_bl.c: check strict_strtoul() return value
It should check if strict_strtoul() succeeds.
[akpm@linux-foundation.org: don't override strict_strtoul() return value]
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Acked-by: Michael Hennerich <michael.hennerich@analog.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Wed, 25 May 2011 00:11:11 +0000 (17:11 -0700)]
mm: vmscan: correctly check if reclaimer should schedule during shrink_slab
It has been reported on some laptops that kswapd is consuming large
amounts of CPU and not being scheduled when SLUB is enabled during large
amounts of file copying. It is expected that this is due to kswapd
missing every cond_resched() point because;
shrink_page_list() calls cond_resched() if inactive pages were isolated
which in turn may not happen if all_unreclaimable is set in
shrink_zones(). If for whatver reason, all_unreclaimable is
set on all zones, we can miss calling cond_resched().
balance_pgdat() only calls cond_resched if the zones are not
balanced. For a high-order allocation that is balanced, it
checks order-0 again. During that window, order-0 might have
become unbalanced so it loops again for order-0 and returns
that it was reclaiming for order-0 to kswapd(). It can then
find that a caller has rewoken kswapd for a high-order and
re-enters balance_pgdat() without ever calling cond_resched().
shrink_slab only calls cond_resched() if we are reclaiming slab
pages. If there are a large number of direct reclaimers, the
shrinker_rwsem can be contended and prevent kswapd calling
cond_resched().
This patch modifies the shrink_slab() case. If the semaphore is
contended, the caller will still check cond_resched(). After each
successful call into a shrinker, the check for cond_resched() remains in
case one shrinker is particularly slow.
[mgorman@suse.de: preserve call to cond_resched after each call into shrinker]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Tested-by: Colin King <colin.king@canonical.com>
Cc: Raghavendra D Prabhu <raghu.prabhu13@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: <stable@kernel.org> [2.6.38+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Wed, 25 May 2011 00:11:09 +0000 (17:11 -0700)]
mm: vmscan: correct use of pgdat_balanced in sleeping_prematurely
There are a few reports of people experiencing hangs when copying large
amounts of data with kswapd using a large amount of CPU which appear to be
due to recent reclaim changes. SLUB using high orders is the trigger but
not the root cause as SLUB has been using high orders for a while. The
root cause was bugs introduced into reclaim which are addressed by the
following two patches.
Patch 1 corrects logic introduced by commit
1741c877 ("mm: kswapd:
keep kswapd awake for high-order allocations until a percentage of
the node is balanced") to allow kswapd to go to sleep when
balanced for high orders.
Patch 2 notes that it is possible for kswapd to miss every
cond_resched() and updates shrink_slab() so it'll at least reach
that scheduling point.
Chris Wood reports that these two patches in isolation are sufficient to
prevent the system hanging. AFAIK, they should also resolve similar hangs
experienced by James Bottomley.
This patch:
Johannes Weiner poined out that the logic in commit
1741c877 ("mm: kswapd:
keep kswapd awake for high-order allocations until a percentage of the
node is balanced") is backwards. Instead of allowing kswapd to go to
sleep when balancing for high order allocations, it keeps it kswapd
running uselessly.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Tested-by: Colin King <colin.king@canonical.com>
Cc: Raghavendra D Prabhu <raghu.prabhu13@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: <stable@kernel.org> [2.6.38+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Lameter [Wed, 25 May 2011 14:47:43 +0000 (09:47 -0500)]
slub: Fix double bit unlock in debug mode
Commit
442b06bcea23 ("slub: Remove node check in slab_free") added a
call to deactivate_slab() in the debug case in __slab_alloc(), which
unlocks the current slab used for allocation. Going to the label
'unlock_out' then does it again.
Also, in the debug case we do not need all the other processing that the
'unlock_out' path does. We always fall back to the slow path in the
debug case. So the tid update is useless.
Similarly, ALLOC_SLOWPATH would just be incremented for all allocations.
Also a pretty useless thing.
So simply restore irq flags and return the object.
Signed-off-by: Christoph Lameter <cl@linux.com>
Reported-and-bisected-by: James Morris <jmorris@namei.org>
Reported-by: Ingo Molnar <mingo@elte.hu>
Reported-by: Jens Axboe <jaxboe@fusionio.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 24 May 2011 23:39:23 +0000 (16:39 -0700)]
Merge branch 'for-linus/2640/i2c' of git://git.fluff.org/bjdooks/linux
* 'for-linus/2640/i2c' of git://git.fluff.org/bjdooks/linux: (21 commits)
mach-ux500: set proper I2C platform data from MOP500s
i2c-nomadik: break out single messsage transmission
i2c-nomadik: reset the hw after status check
i2c-nomadik: remove the unnecessary delay
i2c-nomadik: change the TX and RX threshold
i2c-nomadik: add code to retry on timeout failure
i2c-nomadik: use pm_runtime API
i2c-nomadik: print abort cause only on abort tag
i2c-nomadik: correct adapter timeout initialization
i2c-nomadik: remove the redundant error message
i2c-nomadik: corrrect returned error numbers
i2c-nomadik: fix speed enumerator
i2c-nomadik: make i2c timeout specific per i2c bus
i2c-nomadik: add regulator support
i2c: i2c-sh_mobile bus speed platform data V2
i2c: i2c-sh_mobile clock string removal
i2c-eg20t: Support new device ML7223 IOH
i2c: tegra: Add de-bounce cycles.
i2c: tegra: fix repeated start handling
i2c: tegra: recover from spurious interrupt storm
...
Ben Dooks [Tue, 24 May 2011 23:25:55 +0000 (00:25 +0100)]
Merge branches 'for-2639/i2c-eg20t', 'for-2639/i2c-shmobile', 'for-2639/i2c-tegra' and 'for-2639/i2c-nomadik2' into for-linus/2640/i2c
Linus Walleij [Fri, 13 May 2011 10:31:13 +0000 (12:31 +0200)]
mach-ux500: set proper I2C platform data from MOP500s
This specifies the new per-platform timeout per I2C bus and
switches the I2C buses to fast mode, and increase the FIFO
depth to 8 for reads and writes.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Linus Walleij [Fri, 13 May 2011 10:31:01 +0000 (12:31 +0200)]
i2c-nomadik: break out single messsage transmission
Reduce code size in the message transfer function by factoring out
a single-message transfer function.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Virupax Sadashivpetimath [Fri, 13 May 2011 10:30:53 +0000 (12:30 +0200)]
i2c-nomadik: reset the hw after status check
In case of I2C timeout, reset the HW only after the HW status
is read, otherwise the staus will be lost.
Signed-off-by: Virupax Sadashivpetimath <virupax.sadashivpetimath@stericsson.com>
Reviewed-by: Jonas Aberg <jonas.aberg@stericsson.com>
Reviewed-by: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Virupax Sadashivpetimath [Fri, 13 May 2011 10:30:42 +0000 (12:30 +0200)]
i2c-nomadik: remove the unnecessary delay
The delay in the driver seems to be not needed, so remove it.
Signed-off-by: Virupax Sadashivpetimath <virupax.sadashivpetimath@stericsson.com>
Reviewed-by: Markus Grape <markus.grape@stericsson.com>
Tested-by: Per Persson <per.xb.persson@stericsson.com>
Tested-by: Chethan Krishna N <chethan.krishna@stericsson.com>
Reviewed-by: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Virupax Sadashivpetimath [Fri, 13 May 2011 10:30:34 +0000 (12:30 +0200)]
i2c-nomadik: change the TX and RX threshold
1) Increase RX FIFO threshold so that there is a reduction in
the number of interrupts handled to complete a transaction.
2) Fill TX FIFO in the write function.
Signed-off-by: Virupax Sadashivpetimath <virupax.sadashivpetimath@stericsson.com>
Reviewed-by: Jonas Aberg <jonas.aberg@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Virupax Sadashivpetimath [Fri, 13 May 2011 10:30:23 +0000 (12:30 +0200)]
i2c-nomadik: add code to retry on timeout failure
It is seen that i2c-nomadik controller randomly stops generating the
interrupts leading to a i2c timeout. As a workaround to this problem,
add retries to the on going transfer on failure.
Signed-off-by: Virupax Sadashivpetimath <virupax.sadashivpetimath@stericsson.com>
Reviewed-by: Jonas ABERG <jonas.aberg@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Rabin Vincent [Fri, 13 May 2011 10:30:07 +0000 (12:30 +0200)]
i2c-nomadik: use pm_runtime API
Use the pm_runtime API for pins control.
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Reviewed-by: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Reviewed-by: Jonas Aberg <jonas.aberg@stericsson.com>
[deleted some surplus runtime PM code]
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Virupax Sadashivpetimath [Fri, 13 May 2011 10:29:55 +0000 (12:29 +0200)]
i2c-nomadik: print abort cause only on abort tag
Modify the code to:
1)Print the cause of i2c failure only if the status is set to ABORT.
2)Print slave address on send/receive fail, will help in which slave
failed.
Signed-off-by: Virupax Sadashivpetimath <virupax.sadashivpetimath@stericsson.com>
Reviewed-by: Jonas Aberg <jonas.aberg@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Virupax Sadashivpetimath [Fri, 13 May 2011 10:29:46 +0000 (12:29 +0200)]
i2c-nomadik: correct adapter timeout initialization
Correct the incorrect initialization of adapter timeout not to be
in milliseconds, as it needs to be done in jiffies.
Signed-off-by: Virupax Sadashivpetimath <virupax.sadashivpetimath@stericsson.com>
Reviewed-by: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
srinidhi kasagar [Fri, 13 May 2011 10:29:38 +0000 (12:29 +0200)]
i2c-nomadik: remove the redundant error message
The abort cause string itself is an error, so remove the redundant
explicit error message.
Signed-off-by: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Reviewed-by: Jonas Aberg <jonas.aberg@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Virupax Sadashivpetimath [Fri, 13 May 2011 10:29:28 +0000 (12:29 +0200)]
i2c-nomadik: corrrect returned error numbers
The code was returning bad error numbers or just -1 in some cases.
Signed-off-by: Virupax Sadashivpetimath <virupax.sadashivpetimath@stericsson.com>
Reviewed-by: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Linus Walleij [Fri, 13 May 2011 10:29:20 +0000 (12:29 +0200)]
i2c-nomadik: fix speed enumerator
The I2C speed enumerators in the i2c-nomadik header file were in
the wrong order.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Virupax Sadashivpetimath [Fri, 13 May 2011 10:29:12 +0000 (12:29 +0200)]
i2c-nomadik: make i2c timeout specific per i2c bus
Add option to have different i2c timeout delay for different i2c buses
specified in platform data. Default to the old value unless specified.
Signed-off-by: Virupax Sadashivpetimath <virupax.sadashivpetimath@stericsson.com>
Reviewed-by: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Jonas Aberg [Fri, 13 May 2011 10:29:02 +0000 (12:29 +0200)]
i2c-nomadik: add regulator support
This on-chip I2C controller needs to fetch the regulator
representing its voltage domain so that it won't be switched off.
Signed-off-by: Jonas Aberg <jonas.aberg@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Linus Torvalds [Tue, 24 May 2011 22:20:02 +0000 (15:20 -0700)]
Merge branch 'i2c-for-linus' of git://git./linux/kernel/git/jdelvare/staging
* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
i2c-parport: Various cleanups
i2c-i801: Don't depend on other kernel driver config options
i2c-i801: Check for vendor Fujitsu before probing for apanel
i2c-i801: Don't probe for slaves on IDF channels
i2c-i801: SMBus patch for Intel Panther Point DeviceIDs
i2c/writing-clients: Fix foo_driver.id_table
Linus Torvalds [Tue, 24 May 2011 22:11:46 +0000 (15:11 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
jbd: Fix comment to match the code in journal_start()
jbd/jbd2: remove obsolete summarise_journal_usage.
jbd: Fix forever sleeping process in do_get_write_access()
ext2: fix error msg when mounting fs with too-large blocksize
jbd: fix fsync() tid wraparound bug
ext3: Fix fs corruption when make_indexed_dir() fails
ext3: Fix lock inversion in ext3_symlink()
Linus Torvalds [Tue, 24 May 2011 22:04:00 +0000 (15:04 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/teigland/dlm
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm:
dlm: make plock operation killable
dlm: remove shared message stub for recovery
dlm: delayed reply message warning
dlm: Remove superfluous call to recalc_sigpending()
Linus Torvalds [Tue, 24 May 2011 20:38:19 +0000 (13:38 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (43 commits)
TOMOYO: Fix wrong domainname validation.
SELINUX: add /sys/fs/selinux mount point to put selinuxfs
CRED: Fix load_flat_shared_library() to initialise bprm correctly
SELinux: introduce path_has_perm
flex_array: allow 0 length elements
flex_arrays: allow zero length flex arrays
flex_array: flex_array_prealloc takes a number of elements, not an end
SELinux: pass last path component in may_create
SELinux: put name based create rules in a hashtable
SELinux: generic hashtab entry counter
SELinux: calculate and print hashtab stats with a generic function
SELinux: skip filename trans rules if ttype does not match parent dir
SELinux: rename filename_compute_type argument to *type instead of *con
SELinux: fix comment to state filename_compute_type takes an objname not a qstr
SMACK: smack_file_lock can use the struct path
LSM: separate LSM_AUDIT_DATA_DENTRY from LSM_AUDIT_DATA_PATH
LSM: split LSM_AUDIT_DATA_FS into _PATH and _INODE
SELINUX: Make selinux cache VFS RCU walks safe
SECURITY: Move exec_permission RCU checks into security modules
SELinux: security_read_policy should take a size_t not ssize_t
...
Linus Torvalds [Tue, 24 May 2011 20:31:37 +0000 (13:31 -0700)]
Merge branch 'kbuild' of git://git./linux/kernel/git/mmarek/kbuild-2.6
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
kbuild: make KBUILD_NOCMDDEP=1 handle empty built-in.o
scripts/kallsyms.c: fix potential segfault
scripts/gen_initramfs_list.sh: Convert to a /bin/sh script
kbuild: Fix GNU make v3.80 compatibility
kbuild: Fix passing -Wno-* options to gcc 4.4+
kbuild: move scripts/basic/docproc.c to scripts/docproc.c
kbuild: Fix Makefile.asm-generic for um
kbuild: Allow to combine multiple W= levels
kbuild: Disable -Wunused-but-set-variable for gcc 4.6.0
Fix handling of backlash character in LINUX_COMPILE_BY name
kbuild: asm-generic support
kbuild: implement several W= levels
kbuild: Fix build with binutils <= 2.19
initramfs: Use KBUILD_BUILD_TIMESTAMP for generated entries
kbuild: Allow to override LINUX_COMPILE_BY and LINUX_COMPILE_HOST macros
kbuild: Drop unused LINUX_COMPILE_TIME and LINUX_COMPILE_DOMAIN macros
kbuild: Use the deterministic mode of ar
kbuild: Call gzip with -n
kbuild: move KALLSYMS_EXTRA_PASS from Kconfig to Makefile
Kconfig: improve KALLSYMS_ALL documentation
Fix up trivial conflict in Makefile
Linus Torvalds [Tue, 24 May 2011 20:28:35 +0000 (13:28 -0700)]
Merge git://git./linux/kernel/git/brodo/pcmcia-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6:
pcmcia: Make struct pcmcia_device_id const, sound drivers edition
staging: pcmcia: Convert pcmcia_device_id declarations to const
pcmcia: Convert pcmcia_device_id declarations to const
pcmcia: Make declaration and uses of struct pcmcia_device_id const
pcmcia/sa1100: put sa11x0_pcmcia_hw_init[] to .devinit.data
Linus Torvalds [Tue, 24 May 2011 19:06:40 +0000 (12:06 -0700)]
Merge branch 'drm-core-next' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (169 commits)
drivers/gpu/drm/radeon/atom.c: fix warning
drm/radeon/kms: bump kms version number
drm/radeon/kms: properly set num banks for fusion asics
drm/radeon/kms/atom: move dig phy init out of modesetting
drm/radeon/kms/cayman: fix typo in register mask
drm/radeon/kms: fix typo in spread spectrum code
drm/radeon/kms: fix tile_config value reported to userspace on cayman.
drm/radeon/kms: fix incorrect comparison in cayman setup code.
drm/radeon/kms: add wait idle ioctl for eg->cayman
drm/radeon/cayman: setup hdp to invalidate and flush when asked
drm/radeon/evergreen/btc/fusion: setup hdp to invalidate and flush when asked
agp/uninorth: Fix lockups with radeon KMS and >1x.
drm/radeon/kms: the SS_Id field in the LCD table if for LVDS only
drm/radeon/kms: properly set the CLK_REF bit for DCE3 devices
drm/radeon/kms: fixup eDP connector handling
drm/radeon/kms: bail early for eDP in hotplug callback
drm/radeon/kms: simplify hotplug handler logic
drm/radeon/kms: rewrite DP handling
drm/radeon/kms/atom: add support for setting DP panel mode
drm/radeon/kms: atombios.h updates for DP panel mode
...
Linus Torvalds [Tue, 24 May 2011 19:06:02 +0000 (12:06 -0700)]
Merge branch 'for-linus' of git://git390.marist.edu/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: (29 commits)
[S390] cpu hotplug: fix external interrupt subclass mask handling
[S390] oprofile: dont access lowcore
[S390] oprofile: add missing irq stats counter
[S390] Ignore sendmmsg system call note wired up warning
[S390] s390,oprofile: fix compile error for !CONFIG_SMP
[S390] s390,oprofile: fix alert counter increment
[S390] Remove unused includes in process.c
[S390] get CPC image name
[S390] sclp: event buffer dissection
[S390] chsc: process channel-path-availability information
[S390] refactor page table functions for better pgste support
[S390] merge page_test_dirty and page_clear_dirty
[S390] qdio: prevent compile warning
[S390] sclp: remove unnecessary sendmask check
[S390] convert old cpumask API into new one
[S390] pfault: cleanup code
[S390] pfault: cpu hotplug vs missing completion interrupts
[S390] smp: add __noreturn attribute to cpu_die()
[S390] percpu: implement arch specific irqsafe_cpu_ops
[S390] vdso: disable gcov profiling
...
Jean Delvare [Tue, 24 May 2011 18:58:49 +0000 (20:58 +0200)]
i2c-parport: Various cleanups
* Fix white space.
* Rename labels to something meaningful.
* Prefix defines with PORT_ to avoid collision with macros from
<linux/parport.h>.
* Add const markers where possible.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Linus Torvalds [Tue, 24 May 2011 18:58:49 +0000 (11:58 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (40 commits)
Input: ADP5589 - new driver for I2C Keypad Decoder and I/O Expander
Input: tsc2007 - add X, Y and Z fuzz factors to platform data
Input: tsc2007 - add poll_period parameter to platform data
Input: tsc2007 - add poll_delay parameter to platform data
Input: tsc2007 - add max_rt parameter to platform data
Input: tsc2007 - debounce pressure measurement
Input: ad714x - fix captouch wheel option algorithm
Input: ad714x - allow platform code to specify irqflags
Input: ad714x - fix threshold and completion interrupt masks
Input: ad714x - fix up input configuration
Input: elantech - remove support for proprietary X driver
Input: elantech - report multitouch with proper ABS_MT messages
Input: elantech - export pressure and width when supported
Input: elantech - describe further the protocol
Input: atmel_tsadcc - correct call to input_free_device
Input: add driver FSL MPR121 capacitive touch sensor
Input: remove useless synchronize_rcu() calls
Input: ads7846 - fix gpio_pendown configuration
Input: ads7846 - add possibility to use external vref on ads7846
Input: rotary-encoder - add support for half-period encoders
...
Jean Delvare [Tue, 24 May 2011 18:58:49 +0000 (20:58 +0200)]
i2c-i801: Don't depend on other kernel driver config options
Don't let other driver config options influence us, as it makes the
code more complex and fragile for a small benefit. There's nothing
wrong with instantiating I2C devices even if they don't have a driver.
And we're talking about 835 extra bytes in the binary on x86-64,
that's hardly worth arguing about.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: David Woodhouse <david.woodhouse@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Jean Delvare [Tue, 24 May 2011 18:58:49 +0000 (20:58 +0200)]
i2c-i801: Check for vendor Fujitsu before probing for apanel
Scanning the BIOS memory for the apanel information is costly, so
avoid doing it on non-Fujitsu machines.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Jean Delvare [Tue, 24 May 2011 18:58:49 +0000 (20:58 +0200)]
i2c-i801: Don't probe for slaves on IDF channels
I don't know if Fujitsu is ever going to produce Patsburg-based
machines, but if they do, I'd rather not probe the secondary (IDF)
SMBus channels. At least not until we have a good reason for doing so.
On a side note, I'm not even sure if it is right to enable detection
of HWMON and DDC devices on the IDF channels. Time will tell...
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Seth Heasley [Tue, 24 May 2011 18:58:49 +0000 (20:58 +0200)]
i2c-i801: SMBus patch for Intel Panther Point DeviceIDs
This patch adds the SMBus controller DeviceID for the Intel Panther Point PCH.
Signed-off-by: Seth Heasley <seth.heasley@intel.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Vikram Narayanan [Tue, 24 May 2011 18:58:48 +0000 (20:58 +0200)]
i2c/writing-clients: Fix foo_driver.id_table
The i2c_device_id structure variable's name is not used in the
i2c_driver structure.
Signed-off-by: Vikram Narayanan <vikram186@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Linus Torvalds [Tue, 24 May 2011 18:53:42 +0000 (11:53 -0700)]
Merge branch 'for-2.6.40' of git://git./linux/kernel/git/tj/percpu
* 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: Unify input section names
percpu: Avoid extra NOP in percpu_cmpxchg16b_double
percpu: Cast away printk format warning
percpu: Always align percpu output section to PAGE_SIZE
Fix up fairly trivial conflict in arch/x86/include/asm/percpu.h as per Tejun
Linus Torvalds [Tue, 24 May 2011 18:51:26 +0000 (11:51 -0700)]
Merge branch 'linux-next' of git://git.infradead.org/ubi-2.6
* 'linux-next' of git://git.infradead.org/ubi-2.6:
UBI: switch to dynamic printks
UBI: turn some macros into static inline
UBI: improve checking in debugging prints
UBI: fix typo in a message
UBI: fix minor stylistic issues
UBI: use __packed instead of __attribute__((packed))
UBI: cleanup comments around volume properties
UBI: re-name set volume properties ioctl
UBI: make the control character device non-seekable
Linus Torvalds [Tue, 24 May 2011 18:51:07 +0000 (11:51 -0700)]
Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6
* 'linux-next' of git://git.infradead.org/ubifs-2.6: (52 commits)
UBIFS: switch to dynamic printks
UBIFS: fix kernel-doc comments
UBIFS: fix extremely rare mount failure
UBIFS: simplify LEB recovery function further
UBIFS: always cleanup the recovered LEB
UBIFS: clean up LEB recovery function
UBIFS: fix-up free space on mount if flag is set
UBIFS: add the fixup function
UBIFS: add a superblock flag for free space fix-up
UBIFS: share the next_log_lnum helper
UBIFS: expect corruption only in last journal head LEBs
UBIFS: synchronize write-buffer before switching to the next bud
UBIFS: remove BUG statement
UBIFS: change bud replay function conventions
UBIFS: substitute the replay tree with a replay list
UBIFS: simplify replay
UBIFS: store free and dirty space in the bud replay entry
UBIFS: remove unnecessary stack variable
UBIFS: double check that buds are replied in order
UBIFS: make 2 functions static
...
Linus Torvalds [Tue, 24 May 2011 18:50:27 +0000 (11:50 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/gerg/m68knommu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: (22 commits)
m68knommu: Use generic show_interrupts()
coldfire_qspi compile fix
m68k: merge the mmu and non-mmu versions of sys_m68k.c
m68knommu: use asm-generic/bitops/ext2-atomic.h
m68knommu: Remove obsolete #include <linux/sys.h>
m68k: merge mmu and non-mmu versions of asm-offsets.c
m68k: merge non-mmu and mmu versions of m68k_ksyms.c
m68knommu: remove un-needed exporting of COLDFIRE symbols
m68knommu: move EXPORT of kernel_thread to function definition
m68knommu: move EXPORT of local checksumming functions to definitions
m68knommu: move EXPORT of dump_fpu to function definition
m68knommu: clean up mm/init_no.c
m68k: merge the mmu and non-mmu mm/Makefile
m68k: mv kmap_mm.c to kmap.c
m68knommu: remove stubs for __ioremap() and iounmap()
m68knommu: remove unused kernel_set_cachemode()
m68k: let Makefile sort out compiling mmu and non-mmu lib/checksum.c
m68k: remove duplicate memcpy() implementation
m68k: remove duplicate memset() implementation
m68k: remove duplicate memmove() implementation
...
Linus Torvalds [Tue, 24 May 2011 18:49:16 +0000 (11:49 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, ioapic: Restore ioapic entries during resume properly
x86: Get rid of asmregparm
um: Use RWSEM_GENERIC_SPINLOCK on x86
Suresh Siddha [Tue, 24 May 2011 17:45:31 +0000 (10:45 -0700)]
x86, ioapic: Restore ioapic entries during resume properly
In mask/restore_ioapic_entries() we should be restoring ioapic
entries when ioapics[apic].saved_registers is not NULL.
Fix the typo and address the resume hang regression reported by
Linus.
This was not found sooner because the systems where these
changes were tested on kept the IO-APIC entries intact over
resume.
Reported-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Daniel J Blueman <daniel.blueman@gmail.com>
Link: http://lkml.kernel.org/r/1306259131.7171.7.camel@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
James Morris [Tue, 24 May 2011 13:20:19 +0000 (23:20 +1000)]
Merge branch 'master' of git://git.infradead.org/users/eparis/selinux into for-linus
Conflicts:
lib/flex_array.c
security/selinux/avc.c
security/selinux/hooks.c
security/selinux/ss/policydb.c
security/smack/smack_lsm.c
Manually resolve conflicts.
Signed-off-by: James Morris <jmorris@namei.org>
James Morris [Tue, 24 May 2011 12:55:24 +0000 (22:55 +1000)]
Merge branch 'next' into for-linus
Richard Weinberger [Mon, 23 May 2011 22:18:05 +0000 (00:18 +0200)]
x86: Get rid of asmregparm
As UML does no longer need asmregparm we can remove it.
Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: namhyung@gmail.com
Cc: davem@davemloft.net
Cc: fweisbec@gmail.com
Cc: dhowells@redhat.com
Link: http://lkml.kernel.org/r/%3C1306189085-29896-1-git-send-email-richard%40nod.at%3E
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Richard Weinberger [Mon, 23 May 2011 20:51:33 +0000 (22:51 +0200)]
um: Use RWSEM_GENERIC_SPINLOCK on x86
Commit d12337 (rwsem: Remove redundant asmregparm annotation)
broke rwsem on UML.
As we cannot compile UML with -mregparm=3 and keeping asmregparm only
for UML is inadequate the easiest solution is using RWSEM_GENERIC_SPINLOCK.
Thanks to Thomas Gleixner for the idea.
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Tested-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: <stable@kernel.org> # .39.x
Link: http://lkml.kernel.org/r/%3C1306183893-26655-1-git-send-email-richard%40nod.at%3E
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tejun Heo [Tue, 24 May 2011 07:59:36 +0000 (09:59 +0200)]
Merge branch 'fixes-2.6.39' into for-2.6.40
Dmitry Torokhov [Tue, 24 May 2011 07:06:26 +0000 (00:06 -0700)]
Merge branch 'next' into for-linus
Geert Uytterhoeven [Sat, 30 Apr 2011 21:15:07 +0000 (23:15 +0200)]
m68knommu: Use generic show_interrupts()
Apart from whitespace differences, /proc/interrupts doesn't change by
enabling GENERIC_IRQ_SHOW.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Steven King [Sun, 24 Apr 2011 17:48:07 +0000 (10:48 -0700)]
coldfire_qspi compile fix
The m68k/m68knommu merge broke the qspi build.
Signed-off-by: Steven King <sfking@fdwdc.com>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Greg Ungerer [Thu, 21 Apr 2011 02:48:07 +0000 (12:48 +1000)]
m68k: merge the mmu and non-mmu versions of sys_m68k.c
There is a lot of common code in the sys_m68k.c files. The mmu and non-mmu
versions can easily be merged into a single file.
There is really only 2 functions that differ in the 2 cases. A single
ifdef on CONFIG_MMU can take care of this. Alternatively we could break
those 2 functions out and maintain sys_m68k_no.c and sys_m68k_mm.c with
just this code in it (Makefile could then just build the right one).
Does anyone have strong feelings on which way they want this done?
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Akinobu Mita [Sun, 17 Apr 2011 13:57:29 +0000 (22:57 +0900)]
m68knommu: use asm-generic/bitops/ext2-atomic.h
m68knommu can use generic implementation of ext2 atomic bitops.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Geert Uytterhoeven [Wed, 6 Apr 2011 19:43:19 +0000 (21:43 +0200)]
m68knommu: Remove obsolete #include <linux/sys.h>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Greg Ungerer [Tue, 29 Mar 2011 05:55:36 +0000 (15:55 +1000)]
m68k: merge mmu and non-mmu versions of asm-offsets.c
It is strait forward to merge the mmu and non-mmu versions of
asm-offstes.c. Some name changes are required for the preempt and
thread_info.flags in the non-mmu entry.S assembler to make them
consistent for both setups.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Linus Torvalds [Tue, 24 May 2011 04:24:07 +0000 (21:24 -0700)]
Merge branch 'sh-latest' of git://git./linux/kernel/git/lethal/sh-2.6
* 'sh-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (23 commits)
sh: Ignore R_SH_NONE module relocations.
SH: SE7751: Fix pcibios_map_platform_irq prototype.
sh: remove warning and warning_symbol from struct stacktrace_ops
sh: wire up sys_sendmmsg.
clocksource: sh_tmu: Runtime PM support
clocksource: sh_tmu: __clocksource_updatefreq_hz() update
clocksource: sh_cmt: Runtime PM support
clocksource: sh_cmt: __clocksource_updatefreq_hz() update
dmaengine: shdma: synchronize RCU before freeing, simplify spinlock
dmaengine: shdma: add runtime- and system-level power management
dmaengine: shdma: fix locking
sh: sh-sci: sh7377 and sh73a0 build fixes
sh: cosmetic improvement: use an existing pointer
serial: sh-sci: suspend/resume wakeup support V2
serial: sh-sci: Runtime PM support
sh: select IRQ_FORCED_THREADING.
sh: intc: Set virtual IRQs as nothread.
sh: fixup fpu.o compile order
i2c: add a module alias to the sh-mobile driver
ALSA: add a module alias to the FSI driver
...
Linus Torvalds [Tue, 24 May 2011 04:20:48 +0000 (21:20 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools: Fix sample type size calculation in 32 bits archs
profile: Use vzalloc() rather than vmalloc() & memset()
Linus Torvalds [Tue, 24 May 2011 04:12:49 +0000 (21:12 -0700)]
Merge branch 'v4l_for_linus' of git://git./linux/kernel/git/mchehab/linux-2.6
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (247 commits)
[media] gspca - sunplus: Fix some warnings and simplify code
[media] gspca: Fix some warnings tied to 'no debug'
[media] gspca: Unset debug by default
[media] gspca - cpia1: Remove a bad conditional compilation instruction
[media] gspca - main: Remove USB traces
[media] gspca - main: Version change to 2.13
[media] gspca - stk014 / t613: Accept the index 0 in querymenu
[media] gspca - kinect: Remove __devinitdata
[media] gspca - cpia1: Fix some warnings
[media] video/Kconfig: Fix mis-classified devices
[media] support for medion dvb stick 1660:1921
[media] tm6000: fix uninitialized field, change prink to dprintk
[media] cx231xx: Add support for Iconbit U100
[media] saa7134 add new TV cards
[media] Use a more consistent value for RC repeat period
[media] cx18: Move spinlock and vb_type initialisation into stream_init
[media] tm6000: remove tm6010 sif audio start and stop
[media] tm6000: remove unused exports
[media] tm6000: add pts logging
[media] tm6000: change from ioctl to unlocked_ioctl
...
Linus Torvalds [Tue, 24 May 2011 04:11:38 +0000 (21:11 -0700)]
Merge git://git./linux/kernel/git/hirofumi/fatfs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/hirofumi/fatfs-2.6:
fat: Fix statfs->f_namelen
fat: Replace all printk with fat_msg()
fat: Add fat_msg() function for preformated FAT messages
fat: Convert fat_fs_error to use %pV
fat: Fix possible null deref in fat_cache_add()
fat: use new setup() for ->dir_ops too
Guenter Roeck [Mon, 23 May 2011 21:05:38 +0000 (14:05 -0700)]
hwmon: (coretemp) Add comments describing the handling of HT CPUs
The coretemp driver provides a single set of device attributes for each
physical core of a HT CPU to avoid duplicate sensors. This
functionality was introduced with commit
d883b9f09772 ("hwmon:
(coretemp) Skip duplicate CPU entries").
Commit
e40cc4bdfd4b ("x86/hwmon: register alternate sibling upon CPU
removal") extends this functionality to register the HT sibling of a CPU
which is taken offline, to ensure that sensor attributes are provided if
at least one HT sibling of a core is online.
Add comments into the code describing the functionality in some more
detail.
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 24 May 2011 04:07:40 +0000 (21:07 -0700)]
kernel/watchdog.c: Use proper ANSI C prototypes
We try to enforce it by using -Wstrict-prototypes, but apparently they
sometimes get through. Introduced by
4eec42f39204 ("watchdog: Change
the default timeout and configure nmi watchdog period based").
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Frederic Weisbecker [Tue, 24 May 2011 01:31:26 +0000 (03:31 +0200)]
perf tools: Fix sample type size calculation in 32 bits archs
The shift used here to count the number of bits set in
the mask doesn't work above the low part for archs that
are not 64 bits.
Fix the constant used for the shift.
This fixes a 32-bit perf top failure reported by Eric Dumazet:
Can't parse sample, err = -14
Can't parse sample, err = -14
...
Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com
Link: http://lkml.kernel.org/r/1306200686-17317-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Greg Ungerer [Tue, 29 Mar 2011 05:06:37 +0000 (15:06 +1000)]
m68k: merge non-mmu and mmu versions of m68k_ksyms.c
After cleaning up m68k_ksyms_no.c it is now strait forward to merge
the non-mmu and mmu versions of m68k_ksyms.c. The need for the extra
gcc functions is not strictly based on having an MMU or not. It is
based on the family the processor belongs too, so use an appropriate
conditional check.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Greg Ungerer [Tue, 29 Mar 2011 04:25:14 +0000 (14:25 +1000)]
m68knommu: remove un-needed exporting of COLDFIRE symbols
There is no reason most of the symbols enclosed in a conditional
on CONFIG_COLDFIRE need to be exported. And they sure don't need to
be doing it in m68k_ksyms_no.c. Move the dma symbols export (which
are currently needed) to the definitions of those, and remove the
rest of the exporting here.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Greg Ungerer [Tue, 29 Mar 2011 04:14:21 +0000 (14:14 +1000)]
m68knommu: move EXPORT of kernel_thread to function definition
The EXPORT_SYMBOL(kernel_thread) belongs at the definition of that function,
not in some other random code file. So move it there.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>