David S. Miller [Sun, 11 Apr 2010 09:44:30 +0000 (02:44 -0700)]
Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
David S. Miller [Sun, 11 Apr 2010 09:40:49 +0000 (02:40 -0700)]
Revert "tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb"
This reverts commit
2626419ad5be1a054d350786b684b41d23de1538.
It causes regressions for people with IGB cards. Connection
requests don't complete etc. The true cause of the issue is
still not known, but we should sort this out in net-next-2.6
not net-2.6
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 9 Apr 2010 18:53:06 +0000 (11:53 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/mlx4: Check correct variable for allocation failure
RDMA/nes: Correct cap.max_inline_data assignment in nes_query_qp()
RDMA/cm: Set num_paths when manually assigning path records
IB/cm: Fix device_create() return value check
Linus Torvalds [Fri, 9 Apr 2010 18:52:48 +0000 (11:52 -0700)]
Merge branch 'for-linus' of git://git390.marist.edu/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
[S390] Update default configuration.
[S390] nss: add missing .previous statement to asm function
[S390] increase default size of vmalloc area
[S390] s390: disable change bit override
[S390] fix io_return critical section cleanup
[S390] sclp_async: potential buffer overflow
[S390] arch/s390/kernel: Add missing unlock
Linus Torvalds [Fri, 9 Apr 2010 18:50:29 +0000 (11:50 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (34 commits)
cfq-iosched: Fix the incorrect timeslice accounting with forced_dispatch
loop: Update mtime when writing using aops
block: expose the statistics in blkio.time and blkio.sectors for the root cgroup
backing-dev: Handle class_create() failure
Block: Fix block/elevator.c elevator_get() off-by-one error
drbd: lc_element_by_index() never returns NULL
cciss: unlock on error path
cfq-iosched: Do not merge queues of BE and IDLE classes
cfq-iosched: Add additional blktrace log messages in CFQ for easier debugging
i2o: Remove the dangerous kobj_to_i2o_device macro
block: remove 16 bytes of padding from struct request on 64bits
cfq-iosched: fix a kbuild regression
block: make CONFIG_BLK_CGROUP visible
Remove GENHD_FL_DRIVERFS
block: Export max number of segments and max segment size in sysfs
block: Finalize conversion of block limits functions
block: Fix overrun in lcm() and move it to lib
vfs: improve writeback_inodes_wb()
paride: fix off-by-one test
drbd: fix al-to-on-disk-bitmap for 4k logical_block_size
...
Linus Torvalds [Fri, 9 Apr 2010 18:50:01 +0000 (11:50 -0700)]
Merge branch 'drm-linus' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (29 commits)
drm/nouveau: bail out of auxch transaction if we repeatedly recieve defers
drm/nv50: implement gpio set/get routines
drm/nv50: parse/use some more de-magiced parts of gpio table entries
drm/nouveau: store raw gpio table entry in bios gpio structs
drm/nv40: Init some tiling-related PGRAPH state.
drm/nv50: Add NVA3 support in ctxprog/ctxvals generator.
drm/nv50: another dodgy DP hack
drm/nv50: punt hotplug irq handling out to workqueue
drm/nv50: preserve an unknown SOR_MODECTRL value for DP encoders
drm/nv50: Allow using the NVA3 new compute class.
drm/nv50: cleanup properly if PDISPLAY init fails
drm/nouveau: fixup the init failure paths some more
drm/nv50: fix instmem init on IGPs if stolen mem crosses 4GiB mark
drm/nv40: add LVDS table quirk for Dell Latitude D620
drm/nv40: rework lvds table parsing
drm/nouveau: detect vram amount once, and save the value
drm/nouveau: remove some unused members from drm_nouveau_private
drm/nouveau: Make use of TTM busy_placements.
drm/nv50: add more 0x100c80 flushy magic
drm/nv50: fix fbcon when framebuffer above 4GiB mark
...
David Howells [Tue, 6 Apr 2010 21:36:20 +0000 (22:36 +0100)]
radix_tree_tag_get() is not as safe as the docs make out [ver #2]
radix_tree_tag_get() is not safe to use concurrently with radix_tree_tag_set()
or radix_tree_tag_clear(). The problem is that the double tag_get() in
radix_tree_tag_get():
if (!tag_get(node, tag, offset))
saw_unset_tag = 1;
if (height == 1) {
int ret = tag_get(node, tag, offset);
may see the value change due to the action of set/clear. RCU is no protection
against this as no pointers are being changed, no nodes are being replaced
according to a COW protocol - set/clear alter the node directly.
The documentation in linux/radix-tree.h, however, says that
radix_tree_tag_get() is an exception to the rule that "any function modifying
the tree or tags (...) must exclude other modifications, and exclude any
functions reading the tree".
The problem is that the next statement in radix_tree_tag_get() checks that the
tag doesn't vary over time:
BUG_ON(ret && saw_unset_tag);
This has been seen happening in FS-Cache:
https://www.redhat.com/archives/linux-cachefs/2010-April/msg00013.html
To this end, remove the BUG_ON() from radix_tree_tag_get() and note in various
comments that the value of the tag may change whilst the RCU read lock is held,
and thus that the return value of radix_tree_tag_get() may not be relied upon
unless radix_tree_tag_set/clear() and radix_tree_delete() are excluded from
running concurrently with it.
Reported-by: Romain DEGEZ <romain.degez@smartjog.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pekka Enberg [Wed, 7 Apr 2010 16:23:41 +0000 (19:23 +0300)]
slub: Fix kmem_ptr_validate() for non-kernel pointers
As suggested by Linus, fix up kmem_ptr_validate() to handle non-kernel pointers
more graciously. The patch changes kmem_ptr_validate() to use the newly
introduced kern_ptr_validate() helper to check that a pointer is a valid kernel
pointer before we attempt to convert it into a 'struct page'.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pekka Enberg [Wed, 7 Apr 2010 16:23:40 +0000 (19:23 +0300)]
slab: Generify kernel pointer validation
As suggested by Linus, introduce a kern_ptr_validate() helper that does some
sanity checks to make sure a pointer is a valid kernel pointer. This is a
preparational step for fixing SLUB kmem_ptr_validate().
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 9 Apr 2010 17:05:33 +0000 (10:05 -0700)]
Revert "memory-hotplug: add 0x prefix to HEX block_size_bytes"
This reverts commit
ba168fc37dea145deeb8fa9e7e71c748d2e00d74.
It changes user-visible sysfs interfaces, and breaks some existing user
space applications which apparently rely on the fact that the output
does not contain the "0x" prefix.
Requested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David S. Miller [Fri, 9 Apr 2010 17:03:35 +0000 (10:03 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-2.6
Roland Dreier [Fri, 9 Apr 2010 16:14:21 +0000 (09:14 -0700)]
Merge branches 'cma', 'misc', 'mlx4' and 'nes' into for-linus
Martin Schwidefsky [Fri, 9 Apr 2010 11:43:04 +0000 (13:43 +0200)]
[S390] Update default configuration.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Fri, 9 Apr 2010 11:43:03 +0000 (13:43 +0200)]
[S390] nss: add missing .previous statement to asm function
The savesys_ipl_nss asm function is put into the .init.text section
however it is missing a ".previous" section which would restore the
previous section.
Luckily all functions in early.c are init functions so it doesn't
matter currently.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Schwidefsky [Fri, 9 Apr 2010 11:43:02 +0000 (13:43 +0200)]
[S390] increase default size of vmalloc area
The default size of the vmalloc area is currently 1 GB. The memory resource
controller uses about 10 MB of vmalloc space per gigabyte of memory. That
turns a system with more than ~100 GB memory unbootable with the default
vmalloc size. It costs us nothing to increase the default size to some
more adequate value, e.g. 128 GB.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Christian Borntraeger [Fri, 9 Apr 2010 11:43:01 +0000 (13:43 +0200)]
[S390] s390: disable change bit override
commit
6a985c6194017de2c062916ad1cd00dee0302c40
([S390] s390: use change recording override for kernel mapping)
deactivated the change bit recording for the kernel mapping to
improve the performance. This works most of the time, but there
are cases (e.g. kernel runs in home space, futex atomic compare xcmg)
where we modify user memory with the kernel mapping instead of the
user mapping.
Instead of fixing these cases, this patch just deactivates change bit
override to avoid future problems with other kernel code that might
use the kernel mapping for user memory.
CC: stable@kernel.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Schwidefsky [Fri, 9 Apr 2010 11:43:00 +0000 (13:43 +0200)]
[S390] fix io_return critical section cleanup
If a machine check interrupts the io interrupt handler on one of the
instructions between io_return and io_leave the critical section
cleanup code will move the return psw to io_work_loop. By doing that
the switch from the asynchronous interrupt stack to the process stack
is skipped. If e.g. TIF_NEED_RESCHED is set things break because
the scheduler is called with the asynchronous interrupts stack.
Moving the psw back to io_return instead fixes the problem.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Dan Carpenter [Fri, 9 Apr 2010 11:42:59 +0000 (13:42 +0200)]
[S390] sclp_async: potential buffer overflow
"len" hasn't been properly range checked so we shouldn't use it as an
array offset. This can only be written to by root but it would still be
annoying to accidentally write more than 3 characters and corrupt your
memory.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Julia Lawall [Fri, 9 Apr 2010 11:42:58 +0000 (13:42 +0200)]
[S390] arch/s390/kernel: Add missing unlock
In the default case the lock is not unlocked. The return is
converted to a goto, to share the unlock at the end of the function.
A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@r exists@
expression E1;
identifier f;
@@
f (...) { <+...
* spin_lock_irq (E1,...);
... when != E1
* return ...;
...+> }
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Divyesh Shah [Fri, 9 Apr 2010 07:29:57 +0000 (09:29 +0200)]
cfq-iosched: Fix the incorrect timeslice accounting with forced_dispatch
When CFQ dispatches requests forcefully due to a barrier or changing iosched,
it runs through all cfqq's dispatching requests and then expires each queue.
However, it does not activate a cfqq before flushing its IOs resulting in
using stale values for computing slice_used.
This patch fixes it by calling activate queue before flushing reuqests from
each queue.
This is useful mostly for barrier requests because when the iosched is changing
it really doesnt matter if we have incorrect accounting since we're going to
break down all structures anyway.
We also now expire the current timeslice before moving on with the dispatch
to accurately account slice used for that cfqq.
Signed-off-by: Divyesh Shah<dpshah@google.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Dave Airlie [Fri, 9 Apr 2010 04:27:51 +0000 (14:27 +1000)]
Merge remote branch 'nouveau/for-airlied' of ../drm-nouveau-next into drm-linus
* 'nouveau/for-airlied' of ../drm-nouveau-next: (21 commits)
drm/nouveau: bail out of auxch transaction if we repeatedly recieve defers
drm/nv50: implement gpio set/get routines
drm/nv50: parse/use some more de-magiced parts of gpio table entries
drm/nouveau: store raw gpio table entry in bios gpio structs
drm/nv40: Init some tiling-related PGRAPH state.
drm/nv50: Add NVA3 support in ctxprog/ctxvals generator.
drm/nv50: another dodgy DP hack
drm/nv50: punt hotplug irq handling out to workqueue
drm/nv50: preserve an unknown SOR_MODECTRL value for DP encoders
drm/nv50: Allow using the NVA3 new compute class.
drm/nv50: cleanup properly if PDISPLAY init fails
drm/nouveau: fixup the init failure paths some more
drm/nv50: fix instmem init on IGPs if stolen mem crosses 4GiB mark
drm/nv40: add LVDS table quirk for Dell Latitude D620
drm/nv40: rework lvds table parsing
drm/nouveau: detect vram amount once, and save the value
drm/nouveau: remove some unused members from drm_nouveau_private
drm/nouveau: Make use of TTM busy_placements.
drm/nv50: add more 0x100c80 flushy magic
drm/nv50: fix fbcon when framebuffer above 4GiB mark
...
Ben Skeggs [Mon, 15 Mar 2010 22:45:07 +0000 (08:45 +1000)]
drm/nouveau: bail out of auxch transaction if we repeatedly recieve defers
There's one known case where we never stop recieving DEFER, and loop here
forever. Lets not do that..
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Wed, 7 Apr 2010 02:57:35 +0000 (12:57 +1000)]
drm/nv50: implement gpio set/get routines
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Wed, 7 Apr 2010 02:05:32 +0000 (12:05 +1000)]
drm/nv50: parse/use some more de-magiced parts of gpio table entries
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Wed, 7 Apr 2010 02:00:14 +0000 (12:00 +1000)]
drm/nouveau: store raw gpio table entry in bios gpio structs
And use our own version of the GPIO table for the INIT_GPIO opcode.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Francisco Jerez [Tue, 6 Apr 2010 19:11:58 +0000 (21:11 +0200)]
drm/nv40: Init some tiling-related PGRAPH state.
Fixes garbled 3D on an nv46 card.
Reported-by: Francesco Marella <francesco.marella@gmail.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Marcin Kościelnicki [Fri, 2 Apr 2010 10:28:18 +0000 (10:28 +0000)]
drm/nv50: Add NVA3 support in ctxprog/ctxvals generator.
Signed-off-by: Marcin Kościelnicki <koriakin@0x04.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 30 Mar 2010 06:01:41 +0000 (16:01 +1000)]
drm/nv50: another dodgy DP hack
Allows *some* DP cards to keep working in some corner cases that most
people shouldn't hit. I hit it all the time with development, so this
can stay for now.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 30 Mar 2010 05:14:41 +0000 (15:14 +1000)]
drm/nv50: punt hotplug irq handling out to workqueue
On DP outputs we'll likely end up running vbios init tables here, which
may sleep.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Mon, 29 Mar 2010 00:06:09 +0000 (10:06 +1000)]
drm/nv50: preserve an unknown SOR_MODECTRL value for DP encoders
This value interacts with some registers we don't currently know how to
program properly ourselves. The default of 5 that we were using matches
what the VBIOS on early DP cards do, but later ones use 6, which would
cause nouveau to program an incorrect mode on these chips.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Marcin Kościelnicki [Wed, 24 Mar 2010 13:43:16 +0000 (13:43 +0000)]
drm/nv50: Allow using the NVA3 new compute class.
Signed-off-by: Marcin Kościelnicki <koriakin@0x04.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 25 Mar 2010 06:01:04 +0000 (16:01 +1000)]
drm/nv50: cleanup properly if PDISPLAY init fails
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 25 Mar 2010 06:00:09 +0000 (16:00 +1000)]
drm/nouveau: fixup the init failure paths some more
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Fri, 19 Mar 2010 02:49:59 +0000 (12:49 +1000)]
drm/nv50: fix instmem init on IGPs if stolen mem crosses 4GiB mark
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 18 Mar 2010 03:38:04 +0000 (13:38 +1000)]
drm/nv40: add LVDS table quirk for Dell Latitude D620
Should fix:
https://bugzilla.redhat.com/show_bug.cgi?id=505132
https://bugzilla.redhat.com/show_bug.cgi?id=543091
https://bugzilla.redhat.com/show_bug.cgi?id=530425
https://bugs.edge.launchpad.net/ubuntu/+source/xserver-xorg-video-nouveau/
+bug/539730
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 18 Mar 2010 02:05:43 +0000 (12:05 +1000)]
drm/nv40: rework lvds table parsing
All indications seem to be that the version 0x30 table should be handled
the same way as 0x40 (as used on G80), at least for the parts that we
currently try use.
This commit cleans up the parsing to make it clearer about what we're
actually trying to achieve, and unifies the 0x30/0x40 parsing.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Wed, 17 Mar 2010 23:45:20 +0000 (09:45 +1000)]
drm/nouveau: detect vram amount once, and save the value
As opposed to repeatedly reading the amount back from the GPU every
time we need to know the VRAM size.
We should now fail to load gracefully on detecting no VRAM, rather than
something potentially messy happening.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Wed, 17 Mar 2010 23:23:19 +0000 (09:23 +1000)]
drm/nouveau: remove some unused members from drm_nouveau_private
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Francisco Jerez [Thu, 18 Mar 2010 12:07:47 +0000 (13:07 +0100)]
drm/nouveau: Make use of TTM busy_placements.
Previously we were filling it the same as "placements", but in some
cases there're valid alternatives that we were ignoring completely.
Keeping a back-up memory type helps on several low-mem situations.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Mon, 15 Mar 2010 06:43:47 +0000 (16:43 +1000)]
drm/nv50: add more 0x100c80 flushy magic
Fixes the !vbo_fifo path in the 3D driver on certain chipsets. Still not
really any good idea of what exactly the magic achieves, but it makes
things work.
While we're at it, in the PCIEGART path, flush on unbinding also.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 16 Mar 2010 03:20:58 +0000 (13:20 +1000)]
drm/nv50: fix fbcon when framebuffer above 4GiB mark
This can't actually happen right now, but lets fix it anyway.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Marcin Kościelnicki [Wed, 17 Mar 2010 00:58:47 +0000 (00:58 +0000)]
drm/nv50: Fix NEWCTX_DONE flag number
Signed-off-by: Marcin Kościelnicki <koriakin@0x04.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Wey-Yi Guy [Thu, 8 Apr 2010 20:17:37 +0000 (13:17 -0700)]
iwlwifi: need check for valid qos packet before free
For 4965, need to check it is valid qos frame before free, only valid
QoS frame has the tid used to free the packets.
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Nikanth Karthikesan [Thu, 8 Apr 2010 19:39:31 +0000 (21:39 +0200)]
loop: Update mtime when writing using aops
Update mtime when writing to backing filesystem using the address space
operations write_begin and write_end.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Linus Torvalds [Thu, 8 Apr 2010 18:58:14 +0000 (11:58 -0700)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
not overwriting file_lock structure after GET_LK
cifs: Fix a kernel BUG with remote OS/2 server (try #3)
[CIFS] initialize nbytes at the beginning of CIFSSMBWrite()
[CIFS] Add mmap for direct, nobrl cifs mount types
David S. Miller [Thu, 8 Apr 2010 18:32:30 +0000 (11:32 -0700)]
tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb
Back in commit
04a0551c87363f100b04d28d7a15a632b70e18e7
("loopback: Drop obsolete ip_summed setting") we stopped
setting CHECKSUM_UNNECESSARY in the loopback xmit.
This is because such a setting was a lie since it implies that the
checksum field of the packet is properly filled in.
Instead what happens normally is that CHECKSUM_PARTIAL is set and
skb->csum is calculated as needed.
But this was only happening for TCP data packets (via the
skb->ip_summed assignment done in tcp_sendmsg()). It doesn't
happen for non-data packets like ACKs etc.
Fix this by setting skb->ip_summed in the common non-data packet
constructor. It already is setting skb->csum to zero.
But this reminds us that we still have things like ip_output.c's
ip_dev_loopback_xmit() which sets skb->ip_summed to the value
CHECKSUM_UNNECESSARY, which Herbert's patch teaches us is not
valid. So we'll have to address that at some point too.
Signed-off-by: David S. Miller <davem@davemloft.net>
Jorge Boncompte [DTI2] [Thu, 8 Apr 2010 04:56:48 +0000 (04:56 +0000)]
udp: fix for unicast RX path optimization
Commits
5051ebd275de672b807c28d93002c2fb0514a3c9 and
5051ebd275de672b807c28d93002c2fb0514a3c9 ("ipv[46]: udp: optimize unicast RX
path") broke some programs.
After upgrading a L2TP server to 2.6.33 it started to fail, tunnels going up an
down, after the 10th tunnel came up. My modified rp-l2tp uses a global
unconnected socket bound to (INADDR_ANY, 1701) and one connected socket per
tunnel after parameter negotiation.
After ten sockets were open and due to mixed parameters to
udp[46]_lib_lookup2() kernel started to drop packets.
Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 8 Apr 2010 17:02:02 +0000 (10:02 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata: Fix accesses at LBA28 boundary (old bug, but nasty) (v2)
Mark Lord [Wed, 7 Apr 2010 17:52:08 +0000 (13:52 -0400)]
libata: Fix accesses at LBA28 boundary (old bug, but nasty) (v2)
Most drives from Seagate, Hitachi, and possibly other brands,
do not allow LBA28 access to sector number 0x0fffffff (2^28 - 1).
So instead use LBA48 for such accesses.
This bug could bite a lot of systems, especially when the user has
taken care to align partitions to 4KB boundaries. On misaligned systems,
it is less likely to be encountered, since a 4KB read would end at
0x10000000 rather than at 0x0fffffff.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Linus Torvalds [Thu, 8 Apr 2010 15:37:05 +0000 (08:37 -0700)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Fix sched_getaffinity()
Linus Torvalds [Thu, 8 Apr 2010 14:45:36 +0000 (07:45 -0700)]
Merge git://git./linux/kernel/git/davem/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6:
ide: Fix IDE taskfile with cfq scheduler
ide: Must hold queue lock when requeueing
ide: Requeue request after DMA timeout
Linus Torvalds [Thu, 8 Apr 2010 14:44:53 +0000 (07:44 -0700)]
Merge branch 'release' of git://git./linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPI / PM: Move ACPI video resume to a PM notifier
ACPI: Reduce ACPI resource conflict message to KERN_WARNING, printk cleanup
ACPI: battery drivers should call power_supply_changed()
ACPI: battery: Fix CONFIG_ACPI_SYSFS_POWER=n
PNPACPI: truncate _CRS windows with _LEN > _MAX - _MIN + 1
ACPI: Don't send KEY_UNKNOWN for random video notifications
ACPI: NUMA: map pxms to low node ids
ACPI: use _HID when supplied by root-level devices
ACPI / ACPICA: Do not check reference counters in acpi_ev_enable_gpe()
ACPI: fixes a false alarm from lockdep
ACPI dock: support multiple ACPI dock devices
ACPI: EC: Allow multibyte access to EC
Brice Goglin [Thu, 8 Apr 2010 05:23:45 +0000 (22:23 -0700)]
myri10ge: fix rx_pause in myri10ge_set_pauseparam
Fix rx_pause management in myri10ge_set_pauseparam().
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick Loschmidt [Thu, 8 Apr 2010 04:52:07 +0000 (21:52 -0700)]
net: corrected documentation for hardware time stamping
The current documentation for hardware time stamping does not
correctly specify the available kernel functions since the
implementation was changed later on.
Signed-off-by: Patrick Loschmidt <Patrick.Loschmidt@oeaw.ac.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 8 Apr 2010 04:50:08 +0000 (21:50 -0700)]
stmmac: use resource_size()
Resource size should be calculated as end - start + 1 because we start
counting at zero. I changed the code to resource_size() to do the
calculation.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hughes [Sun, 4 Apr 2010 06:48:10 +0000 (06:48 +0000)]
x.25 attempts to negotiate invalid throughput
The current X.25 code has some bugs in throughput negotiation:
1. It does negotiation in all cases, usually there is no need
2. It incorrectly attempts to negotiate the throughput class in one
direction only. There are separate throughput classes for input
and output and if either is negotiated both mist be negotiates.
This is bug https://bugzilla.kernel.org/show_bug.cgi?id=15681
This bug was first reported by Daniel Ferenci to the linux-x25 mailing
list on 6/8/2004, but is still present.
The current (2.6.34) x.25 code doesn't seem to know that the X.25
throughput facility includes two values, one for the required
throughput outbound, one for inbound.
This causes it to attempt to negotiate throughput 0x0A, which is
throughput 9600 inbound and the illegal value "0" for inbound
throughput.
Because of this some X.25 devices (e.g. Cisco 1600) refuse to connect
to Linux X.25.
The following patch fixes this behaviour. Unless the user specifies a
required throughput it does not attempt to negotiate. If the user
does not specify a throughput it accepts the suggestion of the remote
X.25 system. If the user requests a throughput then it validates both
the input and output throughputs and correctly negotiates them with
the remote end.
Signed-off-by: John Hughes <john@calva.com>
Tested-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hughes [Thu, 8 Apr 2010 04:29:25 +0000 (21:29 -0700)]
x25: Patch to fix bug 15678 - x25 accesses fields beyond end of packet.
Here is a patch to stop X.25 examining fields beyond the end of the packet.
For example, when a simple CALL ACCEPTED was received:
10 10 0f
x25_parse_facilities was attempting to decode the FACILITIES field, but this
packet contains no facilities field.
Signed-off-by: John Hughes <john@calva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Thu, 8 Apr 2010 04:20:47 +0000 (21:20 -0700)]
bridge: Fix IGMP3 report parsing
The IGMP3 report parsing is looking at the wrong address for
group records. This patch fixes it.
Reported-by: Banyeer <banyeer@yahoo.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Thu, 8 Apr 2010 03:53:54 +0000 (20:53 -0700)]
cnic: Fix crash during bnx2x MTU change.
cnic_service_bnx2x() irq handler can be called during chip reset from
MTU change. Need to check that the cnic's device state is up before
handling the irq.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 8 Apr 2010 01:49:20 +0000 (18:49 -0700)]
Merge git://git./linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
hvc_console: Fix race between hvc_close and hvc_remove
virtio: disable multiport console support.
virtio: console makes incorrect assumption about virtio API
virtio: console: Fix early_put_chars usage
MAINTAINERS: Put the virtio-console entry in correct alphabetical order
Anton Blanchard [Tue, 6 Apr 2010 11:42:38 +0000 (21:42 +1000)]
hvc_console: Fix race between hvc_close and hvc_remove
I don't claim to understand the tty layer, but it seems like hvc_open and
hvc_close should be balanced in their kref reference counting.
Right now we get a kref every call to hvc_open:
if (hp->count++ > 0) {
tty_kref_get(tty); <----- here
spin_unlock_irqrestore(&hp->lock, flags);
hvc_kick();
return 0;
} /* else count == 0 */
tty->driver_data = hp;
hp->tty = tty_kref_get(tty); <------ or here if hp->count was 0
But hvc_close has:
tty_kref_get(tty);
if (--hp->count == 0) {
...
/* Put the ref obtained in hvc_open() */
tty_kref_put(tty);
...
}
tty_kref_put(tty);
Since the outside kref get/put balance we only do a single kref_put when
count reaches 0.
The patch below changes things to call tty_kref_put once for every
hvc_close call, and with that my machine boots fine.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Wed, 31 Mar 2010 18:56:42 +0000 (21:56 +0300)]
virtio: disable multiport console support.
Move MULTIPORT feature and related config changes
out of exported headers, and disable the feature
at runtime.
At this point, it seems less risky to keep code around
until we can enable it than rip it out completely.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Thu, 8 Apr 2010 15:46:16 +0000 (09:46 -0600)]
virtio: console makes incorrect assumption about virtio API
The get_buf() API sets the second arg to the number of bytes *written*
by the other side; in this case it should be zero as these are output buffers.
lguest gets this right (obviously kvm's console doesn't), resulting in
continual buildup of console writes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Amit Shah <amit.shah@redhat.com>
François Diakhaté [Tue, 23 Mar 2010 12:53:15 +0000 (18:23 +0530)]
virtio: console: Fix early_put_chars usage
Currently early_put_chars is not used by virtio_console because it can
only be used once a port has been found, at which point it's too late
because it is no longer needed. This patch should fix it.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Amit Shah [Tue, 23 Mar 2010 12:53:09 +0000 (18:23 +0530)]
MAINTAINERS: Put the virtio-console entry in correct alphabetical order
Move around the entry for virtio-console to keep the file sorted.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
David S. Miller [Wed, 7 Apr 2010 23:52:29 +0000 (16:52 -0700)]
Merge branch 'vhost' of git://git./linux/kernel/git/mst/vhost
Amit Kumar Salecha [Wed, 7 Apr 2010 23:51:49 +0000 (16:51 -0700)]
qlcnic: fix set mac addr
If interface is down, mac address request are not sent to fw
but it is getting add in driver mac list.
Driver mac list should be in sync with fw i.e addresses communicated
to fw.
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 7 Apr 2010 23:50:58 +0000 (16:50 -0700)]
r6040: fix r6040_multicast_list
As reported in <https://bugzilla.kernel.org/show_bug.cgi?id=15355>, r6040_
multicast_list currently crashes. This is due a wrong maximum of multicast
entries. This patch fixes the following issues with multicast:
- number of maximum entries if off-by-one (4 instead of 3)
- the writing of the hash table index is not necessary and leads to invalid
values being written into the MCR1 register, so the MAC is simply put in a non
coherent state
- when we exceed the maximum number of mutlticast address, writing the
broadcast address should be done in registers MID_1{L,M,H} instead of
MID_O{L,M,H}, otherwise we would loose the adapter's MAC address
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 7 Apr 2010 23:41:03 +0000 (16:41 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-2.6
Kevin Hilman [Wed, 7 Apr 2010 18:52:46 +0000 (11:52 -0700)]
rwsem generic spinlock: use IRQ save/restore spinlocks
rwsems can be used with IRQs disabled, particularily in early boot
before IRQs are enabled. Currently the spin_unlock_irq() usage in the
slow-patch will unconditionally enable interrupts and cause problems
since interrupts are not yet initialized or enabled.
This patch uses save/restore versions of IRQ spinlocks in the slowpath
to ensure interrupts are not unintentionally disabled.
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Wed, 7 Apr 2010 23:06:07 +0000 (00:06 +0100)]
Have nfs ->d_revalidate() report errors properly
If nfs atomic open implementation ends up doing open request from
->d_revalidate() codepath and gets an error from server, return that error
to caller explicitly and don't bother with lookup_instantiate_filp() at all.
->d_revalidate() can return an error itself just fine...
See
http://bugzilla.kernel.org/show_bug.cgi?id=15674
http://marc.info/?l=linux-kernel&m=
126988782722711&w=2
for original report.
Reported-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Wed, 7 Apr 2010 09:39:01 +0000 (09:39 +0000)]
IB/mlx4: Check correct variable for allocation failure
The intent here is to check the "mfrpl->mapped_page_list" allocation.
We checked "mfrpl->ibfrpl.page_list" earlier.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Chien Tung [Thu, 25 Mar 2010 13:39:50 +0000 (13:39 +0000)]
RDMA/nes: Correct cap.max_inline_data assignment in nes_query_qp()
cap.max_inline_data is incorrectly set in init_attr instead of attr.
Set it in attr so subsequent init_attr.cap assignment will get the
correct value.
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Sean Hefty [Thu, 25 Mar 2010 19:12:36 +0000 (19:12 +0000)]
RDMA/cm: Set num_paths when manually assigning path records
When manually assigning the path records to use for a connection, save
the number of paths that were set. Otherwise, checks against num_path
will show 0, even though path record data is available.
This was discovered by manually setting the path records from user
space, then querying the kernel to see if the correct path records
were assigned, only to discover that the kernel returned 0 path
records to the query.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Linus Torvalds [Wed, 7 Apr 2010 21:01:51 +0000 (14:01 -0700)]
Merge branch 'perf-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf, x86: Enable Nehalem-EX support
perf kmem: Fix breakage introduced by
5a0e3ad slab.h script
Linus Torvalds [Wed, 7 Apr 2010 18:03:06 +0000 (11:03 -0700)]
Merge branch 'davinci-fixes-for-linus' of git://git./linux/kernel/git/khilman/linux-davinci
* 'davinci-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-davinci:
davinci: fix compile warning: <mach/da8xx.h>: #include <linux/platform_device.h>
davinci: DM365: fix duplicate default IRQ priorities
davinci: edma: clear events in edma_start()
davinci: da8xx/omap-l1: fix build error when CONFIG_DAVINCI_MUX is undefined
davinci: timers: don't enable timer until clocksource is initialized
Linus Torvalds [Wed, 7 Apr 2010 18:02:23 +0000 (11:02 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/x86/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip:
x86: Fix double enable_IR_x2apic() call on SMP kernel on !SMP boards
x86: Increase CONFIG_NODES_SHIFT max to 10
ibft, x86: Change reserve_ibft_region() to find_ibft_region()
x86, hpet: Fix bug in RTC emulation
x86, hpet: Erratum workaround for read after write of HPET comparator
bootmem, x86: Fix 32bit numa system without RAM on node 0
nobootmem, x86: Fix 32bit numa system without RAM on node 0
x86: Handle overlapping mptables
x86: Make e820_remove_range to handle all covered case
x86-32, resume: do a global tlb flush in S4 resume
Sergei Shtylyov [Fri, 26 Mar 2010 14:56:58 +0000 (17:56 +0300)]
davinci: fix compile warning: <mach/da8xx.h>: #include <linux/platform_device.h>
This hushes the following warning:
arch/arm/mach-davinci/include/mach/da8xx.h:104: warning: ‘struct platform_device’
declared inside parameter list
arch/arm/mach-davinci/include/mach/da8xx.h:104: warning: its scope is only this
definition or declaration, which is probably not what you want
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Linus Torvalds [Wed, 7 Apr 2010 15:48:39 +0000 (08:48 -0700)]
Merge branch 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze
* 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Remove unused variable from ptrace
microblaze: io.h: Add io big-endian function
microblaze: Enable memory leak detector
microblaze: Fix futex code
microblaze: Fix ftrace_update_ftrace_func panic
Linus Torvalds [Wed, 7 Apr 2010 15:42:25 +0000 (08:42 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: mixart: range checking proc file
ALSA: hda - Fix a wrong array range check in patch_realtek.c
ALSA: ASoC: move dma_data from snd_soc_dai to snd_soc_pcm_stream
ALSA: hda - Enable amplifiers on Acer Inspire 6530G
ASoC: Only do WM8994 bias off transition from standby
ASoC: Don't use DCS_DATAPATH_BUSY for WM hubs devices
ASoC: Don't do runtime wm_hubs DC servo updates if using offset correction
ASoC: Support second DC servo readback method for wm_hubs
ASoC: Avoid wraparound in wm_hubs DC servo correction
ALSA: echoaudio - Eliminate use after free
ALSA: i2c: cleanup: change parameter to pointer
ALSA: hda - Add MSI blacklist for Aopen MZ915-M
ASoC: OMAP: Fix capture pointer handling for OMAP1510 to work correctly with recent ALSA PCM code
ALSA: hda - Update document about MSI and interrupts
ALSA: hda: Fix 0 dB offset for Lenovo Thinkpad models using AD1981
ALSA: hda - Add missing printk argument in previous patch
ASoC: Fix passing platform_data to ac97 bus users and fix a leak
ALSA: hda - Fix ADC/MUX assignment of ALC269 codec
ALSA: hda - Fix invalid bit values passed to snd_hda_codec_amp_stereo()
ASoC: wm8994: playback => capture
Linus Torvalds [Wed, 7 Apr 2010 15:38:27 +0000 (08:38 -0700)]
Merge branch 'slabh' of git://git./linux/kernel/git/tj/misc
* 'slabh' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc:
nodemask: include slab.h from drivers/base/node.c
David Howells [Tue, 6 Apr 2010 21:35:09 +0000 (14:35 -0700)]
fs-cache: order the debugfs stats correctly
Order the debugfs statistics correctly. The values displayed through a
seq_printf() statement should be in the same order as the names in the
format string.
In the 'Lookups' line, objects created ('crt=') and lookups timed out
('tmo=') have their values transposed.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Tue, 6 Apr 2010 21:35:09 +0000 (14:35 -0700)]
frv: fix kernel/user segment handling in NOMMU mode
In NOMMU mode, the FRV segment handling is broken because KERNEL_DS ==
USER_DS. This causes tests of the following sort:
/* don't pin down non-user-based iovecs */
if (segment_eq(get_fs(), KERNEL_DS))
return NULL;
to malfunction.
To fix this, make USER_DS the top of RAM instead of the top of the non-IO
address space, and make KERNEL_DS one more than the top of the non-IO
address space.
Also get rid of FRV's __addr_ok() as nothing uses it.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Tue, 6 Apr 2010 21:35:08 +0000 (14:35 -0700)]
frv: hide uncached_access() when pgprot_noncached is not #defined
Hide uncached_access() when pgprot_noncached is not #defined. This prevents
the following warning:
CC drivers/char/mem.o
drivers/char/mem.c:229: warning: 'uncached_access' defined but not used
Repairs
d7d4d849b4e3acc405ec222884936800ffb26d48 ("drivers/char/mem.c:
cleanups").
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vladimir Zapolskiy [Tue, 6 Apr 2010 21:35:07 +0000 (14:35 -0700)]
rtc-mxc: multiple fixes in rtc-mxc probe method
On exit paths in mxc_rtc_probe() method some resources are not freed
correctly.
This patch fixes:
* unrequested memory region containing imx RTC registers
* iounmap() isn't called on exit_free_pdata branch
* clock get rate is called for freed clock source
* clock isn't disabled on exit_put_clk branch
To simplify the fix managed device resources are used.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Zapolskiy <vzapolskiy@gmail.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Daniel Mack <daniel@caiaq.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Tue, 6 Apr 2010 21:35:05 +0000 (14:35 -0700)]
memcg: fix race in file_mapped accounting
Presently, memcg's FILE_MAPPED accounting has following race with
move_account (happens at rmdir()).
increment page->mapcount (rmap.c)
mem_cgroup_update_file_mapped() move_account()
lock_page_cgroup()
check page_mapped() if
page_mapped(page)>1 {
FILE_MAPPED -1 from old memcg
FILE_MAPPED +1 to old memcg
}
.....
overwrite pc->mem_cgroup
unlock_page_cgroup()
lock_page_cgroup()
FILE_MAPPED + 1 to pc->mem_cgroup
unlock_page_cgroup()
Then,
old memcg (-1 file mapped)
new memcg (+2 file mapped)
This happens because move_account see page_mapped() which is not guarded
by lock_page_cgroup(). This patch adds FILE_MAPPED flag to page_cgroup
and move account information based on it. Now, all checks are synchronous
with lock_page_cgroup().
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Balbir Singh <balbir@in.ibm.com>
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Andrea Righi <arighi@develer.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Tue, 6 Apr 2010 21:35:04 +0000 (14:35 -0700)]
pagemap: fix pfn calculation for hugepage
When we look into pagemap using page-types with option -p, the value of
pfn for hugepages looks wrong (see below.) This is because pte was
evaluated only once for one vma although it should be updated for each
hugepage. This patch fixes it.
$ page-types -p 3277 -Nl -b huge
voffset offset len flags
7f21e8a00 11e400 1 ___U___________H_G________________
7f21e8a01 11e401 1ff ________________TG________________
^^^
7f21e8c00 11e400 1 ___U___________H_G________________
7f21e8c01 11e401 1ff ________________TG________________
^^^
One hugepage contains 1 head page and 511 tail pages in x86_64 and each
two lines represent each hugepage. Voffset and offset mean virtual
address and physical address in the page unit, respectively. The
different hugepages should not have the same offset value.
With this patch applied:
$ page-types -p 3386 -Nl -b huge
voffset offset len flags
7fec7a600 112c00 1 ___UD__________H_G________________
7fec7a601 112c01 1ff ________________TG________________
^^^
7fec7a800 113200 1 ___UD__________H_G________________
7fec7a801 113201 1ff ________________TG________________
^^^
OK
More info:
- This patch modifies walk_page_range()'s hugepage walker. But the
change only affects pagemap_read(), which is the only caller of hugepage
callback.
- Without this patch, hugetlb_entry() callback is called per vma, that
doesn't match the natural expectation from its name.
- With this patch, hugetlb_entry() is called per hugepte entry and the
callback can become much simpler.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yong Zhang [Tue, 6 Apr 2010 21:35:03 +0000 (14:35 -0700)]
ratelimit: fix the return value when __ratelimit() fails to acquire the lock
The log of commit
edaac8e3167501cda336231d00611bf59c164346 ("ratelimit:
Fix/allow use in atomic contexts"), indicates that we want to suppress the
callback when the trylock fails.
Signed-off-by: Yong Zhang <yong.zhang@windriver.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yong Zhang [Tue, 6 Apr 2010 21:35:02 +0000 (14:35 -0700)]
kernel.h: fix wrong usage of __ratelimit()
When __ratelimit() returns 1 this means that we can go ahead.
Signed-off-by: Yong Zhang <yong.zhang@windriver.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yong Zhang [Tue, 6 Apr 2010 21:35:01 +0000 (14:35 -0700)]
ratelimit: annotate ___ratelimit()
To prevent from wrongly using the return value.
[akpm@linux-foundation.org: fix spello]
Signed-off-by: Yong Zhang <yong.zhang@windriver.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Dumazet [Tue, 6 Apr 2010 21:35:01 +0000 (14:35 -0700)]
/dev/mem: allow rewinding
commit
dcefafb6 ("/dev/mem: dont allow seek to last page") inadvertently
disabled rewinding on /dev/mem.
This broke x86info for example.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Tue, 6 Apr 2010 21:35:00 +0000 (14:35 -0700)]
vfs: rename block_fsync() to blkdev_fsync()
Requested by hch, for consistency now it is exported.
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Anton Blanchard <anton@samba.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anton Blanchard [Tue, 6 Apr 2010 21:34:58 +0000 (14:34 -0700)]
raw: fsync method is now required
Commit
148f948ba877f4d3cdef036b1ff6d9f68986706a (vfs: Introduce new
helpers for syncing after writing to O_SYNC file or IS_SYNC inode) broke
the raw driver.
We now call through generic_file_aio_write -> generic_write_sync ->
vfs_fsync_range. vfs_fsync_range has:
if (!fop || !fop->fsync) {
ret = -EINVAL;
goto out;
}
But drivers/char/raw.c doesn't set an fsync method.
We have two options: fix it or remove the raw driver completely. I'm
happy to do either, the fact this has been broken for so long suggests it
is rarely used.
The patch below adds an fsync method to the raw driver. My knowledge of
the block layer is pretty sketchy so this could do with a once over.
If we instead decide to remove the raw driver, this patch might still be
useful as a backport to 2.6.33 and 2.6.32.
Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <jens.axboe@oracle.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Tested-by: Jeff Moyer <jmoyer@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Shishkin [Tue, 6 Apr 2010 21:34:57 +0000 (14:34 -0700)]
mb862xxfb: update Valentin's email address
Since Valentin's email address @siemens.com is no longer valid, it's time
to change it to the one that actually works so that I don't have to
manually forward patches against mb862xx to him every time.
Signed-off-by: Alexander Shishkin <virtuoso@slind.org>
Acked-by: Valentin Sitdikov <v.sitdikov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Tue, 6 Apr 2010 21:34:57 +0000 (14:34 -0700)]
mb862xxfb: fix acceleration module license
mb862xxfb_accel built as a separate module, but it does not have a
MODULE_LICENSE, so it taints the kernel. Add a MODULE_LICENSE to it (same
as mb862xxfb license).
mb862xxfb_accel: module license 'unspecified' taints kernel.
Or should mb862xxfb_accel be built into the mb862xxfb binary file instead?
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Alexander Shishkin <virtuoso@slind.org>
Cc: Valentin Sitdikov <v.sitdikov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KOSAKI Motohiro [Tue, 6 Apr 2010 21:34:56 +0000 (14:34 -0700)]
mm: revert "vmscan: get_scan_ratio() cleanup"
Shaohua Li reported his tmpfs streaming I/O test can lead to make oom.
The test uses a 6G tmpfs in a system with 3G memory. In the tmpfs, there
are 6 copies of kernel source and the test does kbuild for each copy. His
investigation shows the test has a lot of rotated anon pages and quite few
file pages, so get_scan_ratio calculates percent[0] (i.e. scanning
percent for anon) to be zero. Actually the percent[0] shoule be a big
value, but our calculation round it to zero.
Although before commit
84b18490 ("vmscan: get_scan_ratio() cleanup") , we
have the same problem too. But the old logic can rescue percent[0]==0
case only when priority==0. It had hided the real issue. I didn't think
merely streaming io can makes percent[0]==0 && priority==0 situation. but
I was wrong.
So, definitely we have to fix such tmpfs streaming io issue. but anyway I
revert the regression commit at first.
This reverts commit
84b18490d1f1bc7ed5095c929f78bc002eb70f26.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reported-by: Shaohua Li <shaohua.li@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anton Blanchard [Tue, 6 Apr 2010 21:34:55 +0000 (14:34 -0700)]
devmem: handle class_create() failure
I hit this when we had a bug in IDR for a few days. Basically sysfs would
fail to create new inodes since it uses an IDR and therefore class_create
would fail.
While we are unlikely to see this fail we may as well handle it instead of
oopsing.
Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wu Fengguang [Tue, 6 Apr 2010 21:34:53 +0000 (14:34 -0700)]
readahead: fix NULL filp dereference
btrfs relocate_file_extent_cluster() calls us with NULL filp:
[ 4005.426805] BUG: unable to handle kernel NULL pointer dereference at
00000021
[ 4005.426818] IP: [<
c109a130>] page_cache_sync_readahead+0x18/0x3e
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Yan Zheng <yanzheng@21cn.com>
Reported-by: Kirill A. Shutemov <kirill@shutemov.name>
Tested-by: Kirill A. Shutemov <kirill@shutemov.name>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wolfram Sang [Tue, 6 Apr 2010 21:34:52 +0000 (14:34 -0700)]
device_attributes: add sysfs_attr_init() for dynamic attributes
Made necessary by
6992f5334995af474c2b58d010d08bc597f0f2fe ("sysfs: Use
one lockdep class per sysfs attribute").
Prevents further "key xxx not in .data" bug-reports. Although some
attributes could probably be converted to static ones, this is left for
people having hardware to test.
Found by this semantic patch:
@ init @
type T;
identifier A;
@@
T {
...
struct device_attribute A;
...
};
@ main extends init @
expression E;
statement S;
identifier err;
T *name;
@@
... when != sysfs_attr_init(&name->A.attr);
(
+ sysfs_attr_init(&name->A.attr);
if (device_create_file(E, &name->A))
S
|
+ sysfs_attr_init(&name->A.attr);
err = device_create_file(E, &name->A);
)
While reviewing, I put the initialization to apropriate places.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg KH <gregkh@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Mike Isely <isely@pobox.com>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Sujith Thomas <sujith.thomas@intel.com>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sergey Senozhatsky [Tue, 6 Apr 2010 21:34:51 +0000 (14:34 -0700)]
drivers/thermal/thermal_sys.c: fix 'key
f70f4b50 not in .data' in thermal_sys
Initialize sysfs attributes before device_create_file call.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=15548
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Greg KH <gregkh@suse.de>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Len Brown <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>