Ingo Molnar [Fri, 6 Jul 2012 14:13:58 +0000 (16:13 +0200)]
Merge branch 'rcu/next' of git://git./linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull the RCU tree from Paul E. McKenney:
"The major features of this series are:
1. Preventing latency spikes of more than 200 microseconds for
kernels built with NR_CPUS=4096, which is reportedly becoming
the default for some distros. This is a first step, as it does
not help with systems that actually -have- 4096 CPUs (work on
this case is in progress, but is not yet ready for mainline).
This category also includes improving concurrency of rcu_barrier(),
placed here due to conflicts. Posted to LKML at:
https://lkml.org/lkml/2012/6/22/381. Note that patches 18-22
of that series have been defered to 3.7, as they have not yet
proven themselves to be mainline-ready (and yes, these are the
ones intended to get rid of RCU's latency spikes for systems
that actually have 4096 CPUs).
2. Updates to documentation and rcutorture fixes, the latter category
including improvements to rcu_barrier() testing. Posted to LKML at:
http://lkml.indiana.edu/hypermail/linux/kernel/1206.1/04094.html.
3. Miscellaneous fixes posted to LKML at:
https://lkml.org/lkml/2012/6/22/500, with the exception of the
last commit, which was posted here:
http://www.gossamer-threads.com/lists/linux/kernel/
1561830
4. RCU_FAST_NO_HZ fixes and improvements. Posted to LKML at:
http://lkml.indiana.edu/hypermail/linux/kernel/1206.1/00006.html
and http://www.gossamer-threads.com/lists/linux/kernel/
1561833.
The first four patches of the first series went into 3.5 to fix
a regression.
5. Code-style fixes. These were posted to LKML at
http://lkml.indiana.edu/hypermail/linux/kernel/1205.2/01180.html and
http://lkml.indiana.edu/hypermail/linux/kernel/1205.2/01181.html.
"
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Paul E. McKenney [Thu, 17 May 2012 22:12:45 +0000 (15:12 -0700)]
rcu: Fix broken strings in RCU's source code.
Although the C language allows you to break strings across lines, doing
this makes it hard for people to find the Linux kernel code corresponding
to a given console message. This commit therefore fixes broken strings
throughout RCU's source code.
Suggested-by: Josh Triplett <josh@joshtriplett.org>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Paul E. McKenney [Thu, 28 Jun 2012 15:08:25 +0000 (08:08 -0700)]
rcu: Fix code-style issues involving "else"
The Linux kernel coding style says that single-statement blocks should
omit curly braces unless the other leg of the "if" statement has
multiple statements, in which case the curly braces should be included.
This commit fixes RCU's violations of this rule.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Fri, 6 Jul 2012 12:59:20 +0000 (05:59 -0700)]
Merge branches 'bigrtm.2012.07.04a', 'doctorture.2012.07.02a', 'fixes.2012.07.06a' and 'fnh.2012.07.02a' into HEAD
bigrtm: First steps towards getting RCU out of the way of
tens-of-microseconds real-time response on systems compiled
with NR_CPUS=4096. Also cleanups for and increased concurrency
of rcu_barrier() family of primitives.
doctorture: rcutorture and documentation improvements.
fixes: Miscellaneous fixes.
fnh: RCU_FAST_NO_HZ fixes and improvements.
Paul E. McKenney [Mon, 25 Jun 2012 19:54:17 +0000 (12:54 -0700)]
rcu: Introduce check for callback list/count mismatch
The recent bug that introduced the RCU callback list/count mismatch
showed the need for a diagnostic to check for this, which this commit
adds.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Linus Torvalds [Thu, 5 Jul 2012 20:20:02 +0000 (13:20 -0700)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"Small fixes on multiple ARM platforms
- A build regression from a previous fix on dove and mv78xx0
- Two fixes for recently (3.5-rc1) changed mmp/pxa code
- multiple omap2+ bug fixes
- two trivial fixes for i.MX
- one v3.5 regression for mxs"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: apx4devkit: fix FEC enabling PHY clock
ARM: OMAP2+: hwmod data: Fix wrong McBSP clock alias on OMAP4
ARM: OMAP4: hwmod data: temporarily comment out data for the usb_host_fs and aess IP blocks
ARM: Orion: Fix WDT compile for Dove and MV78xx0
ARM: mmp: remove mach/gpio-pxa.h
ARM: imx: assert SCC gate stays enabled
ARM: OMAP4: TWL6030: ensure sys_nirq1 is mux'd and wakeup enabled
ARM: OMAP2: Overo: init I2C before MMC to fix MMC suspend/resume failure
ARM: imx27_visstrim_m10: Do not include <asm/system.h>
ARM: pxa: hx4700: Fix basic suspend/resume
Linus Torvalds [Thu, 5 Jul 2012 20:16:21 +0000 (13:16 -0700)]
Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fix from Marcelo Tosatti:
"Memory leak and oops on the x86 mmu code, and sanitization of the
KVM_IRQFD ioctl."
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: MMU: fix shrinking page from the empty mmu
KVM: fix fault page leak
KVM: Sanitize KVM_IRQFD flags
KVM: Add missing KVM_IRQFD API documentation
KVM: Pass kvm_irqfd to functions
Linus Torvalds [Thu, 5 Jul 2012 20:09:37 +0000 (13:09 -0700)]
Merge branch 'fixes-for-3.5' of git://git./linux/kernel/git/cooloney/linux-leds
Pull leds fix from Bryan Wu:
"Fix for heartbeat led trigger driver"
* 'fixes-for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds:
leds: heartbeat: fix bug on panic
Linus Torvalds [Thu, 5 Jul 2012 20:06:25 +0000 (13:06 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs updates from Chris Mason:
"I held off on my rc5 pull because I hit an oops during log recovery
after a crash. I wanted to make sure it wasn't a regression because
we have some logging fixes in here.
It turns out that a commit during the merge window just made it much
more likely to trigger directory logging instead of full commits,
which exposed an old bug.
The new backref walking code got some additional fixes. This should
be the final set of them.
Josef fixed up a corner where our O_DIRECT writes and buffered reads
could expose old file contents (not stale, just not the most recent).
He and Liu Bo fixed crashes during tree log recover as well.
Ilya fixed errors while we resume disk balancing operations on
readonly mounts."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: run delayed directory updates during log replay
Btrfs: hold a ref on the inode during writepages
Btrfs: fix tree log remove space corner case
Btrfs: fix wrong check during log recovery
Btrfs: use _IOR for BTRFS_IOC_SUBVOL_GETFLAGS
Btrfs: resume balance on rw (re)mounts properly
Btrfs: restore restriper state on all mounts
Btrfs: fix dio write vs buffered read race
Btrfs: don't count I/O statistic read errors for missing devices
Btrfs: resolve tree mod log locking issue in btrfs_next_leaf
Btrfs: fix tree mod log rewind of ADD operations
Btrfs: leave critical region in btrfs_find_all_roots as soon as possible
Btrfs: always put insert_ptr modifications into the tree mod log
Btrfs: fix tree mod log for root replacements at leaf level
Btrfs: support root level changes in __resolve_indirect_ref
Btrfs: avoid waiting for delayed refs when we must not
Linus Torvalds [Thu, 5 Jul 2012 18:53:47 +0000 (11:53 -0700)]
Merge branch 'fixes-for-grant' of git://sources.calxeda.com/kernel/linux
Pull DT fixes from Rob Herring:
"Mainly some documentation updates and 2 fixes:
- An export symbol fix for of_platform_populate from Stephen W.
- A fix for the order compatible entries are matched to ensure the
first compatible string is matched when there are multiple matches."
Normally these would go through Grant Likely (thus the "fixes-for-grant"
branch name), but Grant is in the middle of moving to Scotland, and is
practically offline until sometime in August. So pull directly from Rob.
* 'fixes-for-grant' of git://sources.calxeda.com/kernel/linux:
of: match by compatible property first
dt: mc13xxx.txt: Fix gpio number assignment
dt: fsl-fec.txt: Fix gpio number assignment
dt: fsl-mma8450.txt: Add missing 'reg' description
dt: fsl-imx-esdhc.txt: Fix gpio number assignment
dt: fsl-imx-cspi.txt: Fix comment about GPIOs used for chip selects
of: Add Avionic Design vendor prefix
of: export of_platform_populate()
Arnd Bergmann [Thu, 5 Jul 2012 10:16:13 +0000 (12:16 +0200)]
Merge tag 'omap-fixes-for-v3.5-rc5' of git://git./linux/kernel/git/tmlind/linux-omap into fixes
PM related fixes for omaps mostly to get suspend/resume
working again.
* tag 'omap-fixes-for-v3.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP2+: hwmod data: Fix wrong McBSP clock alias on OMAP4
ARM: OMAP4: hwmod data: temporarily comment out data for the usb_host_fs and aess IP blocks
ARM: OMAP4: TWL6030: ensure sys_nirq1 is mux'd and wakeup enabled
ARM: OMAP2: Overo: init I2C before MMC to fix MMC suspend/resume failure
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Thu, 5 Jul 2012 09:06:36 +0000 (11:06 +0200)]
Merge branch 'mxs/fixes-for-3.5' of git://git.linaro.org/people/shawnguo/linux-2.6 into fixes
* 'mxs/fixes-for-3.5' of git://git.linaro.org/people/shawnguo/linux-2.6:
ARM: apx4devkit: fix FEC enabling PHY clock
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tony Lindgren [Thu, 5 Jul 2012 08:12:08 +0000 (01:12 -0700)]
Merge tag 'omap-fixes-b-for-3.5rc' of git://git./linux/kernel/git/pjw/omap-pending into fixes
A few more OMAP fixes for 3.5-rc. These fix some bugs with power
management and McBSP.
Lauri Hintsala [Thu, 5 Jul 2012 07:31:36 +0000 (10:31 +0300)]
ARM: apx4devkit: fix FEC enabling PHY clock
Ethernet stopped to work after mxs clk framework change.
Signed-off-by: Lauri Hintsala <lauri.hintsala@bluegiga.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Benoit Cousson [Wed, 4 Jul 2012 12:55:29 +0000 (06:55 -0600)]
ARM: OMAP2+: hwmod data: Fix wrong McBSP clock alias on OMAP4
The commit
503d0ea24d1d3dd3db95e5e0edd693da7a2a23eb
ARM: OMAP4: hwmod data: Add aliases for McBSP fclk clocks
added a wrong "prcm_clk" alias for PRCM clock whereas the McBSP
driver and previous OMAPs are using "prcm_fck".
It thus lead to the following warning.
[ 47.409729] omap-mcbsp: clks: could not clk_get() prcm_fck
Fix that by changing the opt_clk role to prcm_fck.
Reported-by: Misael Lopez Cruz <misael.lopez@ti.com>
Signed-off-by: Benoit Cousson <b-cousson@ti.com>
Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Sebastien Guiriec <s-guiriec@ti.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Paul Walmsley [Wed, 4 Jul 2012 12:55:29 +0000 (06:55 -0600)]
ARM: OMAP4: hwmod data: temporarily comment out data for the usb_host_fs and aess IP blocks
The OMAP4 usb_host_fs (OHCI) and AESS IP blocks require some special
programming for them to enter idle. Without this programming, they
will prevent the rest of the chip from entering full chip idle.
To implement the idle programming cleanly, this will take some
coordination between maintainers. This is likely to take some time,
so it is probably best to leave this for 3.6 or 3.7. So, in the
meantime, prevent these IP blocks from being registered.
Later, once the appropriate support is available, this patch can be
reverted.
This second version comments out the IP block data since Benoît didn't
like removing it.
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Benoît Cousson <b-cousson@ti.com>
Arnd Bergmann [Wed, 4 Jul 2012 11:49:58 +0000 (13:49 +0200)]
Merge branch 'fixes' of git://github.com/hzhuang1/linux into fixes
From Haojian Zhuang <haojian.zhuang@gmail.com>:
* 'fixes' of git://github.com/hzhuang1/linux:
ARM: mmp: remove mach/gpio-pxa.h
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Wed, 4 Jul 2012 11:46:05 +0000 (13:46 +0200)]
Merge tag 'v3.5-imx-fixes' of git://git.pengutronix.de/git/imx/linux-2.6 into fixes
From Sascha Hauer <s.hauer@pengutronix.de>:
ARM i.MX fixes for v3.5-rc5
* tag 'v3.5-imx-fixes' of git://git.pengutronix.de/git/imx/linux-2.6:
ARM: imx: assert SCC gate stays enabled
ARM: imx27_visstrim_m10: Do not include <asm/system.h>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Andrew Lunn [Fri, 29 Jun 2012 07:25:58 +0000 (09:25 +0200)]
ARM: Orion: Fix WDT compile for Dove and MV78xx0
Commit
0fa1f0609a0c1fe8b2be3c0089a2cb48f7fda521 (ARM: Orion: Fix
Virtual/Physical mixup with watchdog) broke the Dove & MV78xx0
build. Although these two SoC don't use the watchdog, the shared
platform code still needs to build. Add the necessary defines.
Cc: stable@vger.kernel.org
Reported-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Paul Bolle [Mon, 2 Jul 2012 21:40:14 +0000 (23:40 +0200)]
ARM: mmp: remove mach/gpio-pxa.h
Commit
157d2644cb0c1e71a18baaffca56d2b1d0ebf10f ("ARM: pxa: change gpio
to platform device") removed all includes of mach/gpio-pxa.h. It kept
this unused header in the tree. Using it can't work, as it itself
includes the non-existent header plat/gpio-pxa.h. This header can safely
be removed.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Haojian Zhuang <haojian.zhuang@gmail.com>
Alexander Holler [Tue, 3 Jul 2012 06:35:47 +0000 (14:35 +0800)]
leds: heartbeat: fix bug on panic
With commit
49dca5aebfdeadd4bf27b6cb4c60392147dc35a4 I introduced
a bug (visible if CONFIG_PROVE_RCU is enabled) which occures when a panic
has happened:
[ 1526.520230] ===============================
[ 1526.520230] [ INFO: suspicious RCU usage. ]
[ 1526.520230] 3.5.0-rc1+ #12 Not tainted
[ 1526.520230] -------------------------------
[ 1526.520230] /c/kernel-tests/mm/include/linux/rcupdate.h:436 Illegal context switch in RCU read-side critical section!
[ 1526.520230]
[ 1526.520230] other info that might help us debug this:
[ 1526.520230]
[ 1526.520230]
[ 1526.520230] rcu_scheduler_active = 1, debug_locks = 0
[ 1526.520230] 3 locks held by net.agent/3279:
[ 1526.520230] #0: (&mm->mmap_sem){++++++}, at: [<
ffffffff82f85962>] do_page_fault+0x193/0x390
[ 1526.520230] #1: (panic_lock){+.+...}, at: [<
ffffffff82ed2830>] panic+0x37/0x1d3
[ 1526.520230] #2: (rcu_read_lock){.+.+..}, at: [<
ffffffff810b9b28>] rcu_lock_acquire+0x0/0x29
[ 1526.520230]
[ 1526.520230] stack backtrace:
[ 1526.520230] Pid: 3279, comm: net.agent Not tainted 3.5.0-rc1+ #12
[ 1526.520230] Call Trace:
[ 1526.520230] [<
ffffffff810e1570>] lockdep_rcu_suspicious+0x109/0x112
[ 1526.520230] [<
ffffffff810bfe3a>] rcu_preempt_sleep_check+0x45/0x47
[ 1526.520230] [<
ffffffff810bfe5a>] __might_sleep+0x1e/0x19a
[ 1526.520230] [<
ffffffff82f8010e>] down_write+0x26/0x81
[ 1526.520230] [<
ffffffff8276a966>] led_trigger_unregister+0x1f/0x9c
[ 1526.520230] [<
ffffffff8276def5>] heartbeat_reboot_notifier+0x15/0x19
[ 1526.520230] [<
ffffffff82f85bf5>] notifier_call_chain+0x96/0xcd
[ 1526.520230] [<
ffffffff82f85cba>] __atomic_notifier_call_chain+0x8e/0xff
[ 1526.520230] [<
ffffffff81094b7c>] ? kmsg_dump+0x37/0x1eb
[ 1526.520230] [<
ffffffff82f85d3f>] atomic_notifier_call_chain+0x14/0x16
[ 1526.520230] [<
ffffffff82ed28e1>] panic+0xe8/0x1d3
[ 1526.520230] [<
ffffffff811473e2>] out_of_memory+0x15d/0x1d3
So in case of a panic, now just turn of the LED. Other approaches like
scheduling a work to unregister the trigger aren't working because there
isn't much which still runs after a panic occured (except timers).
Signed-off-by: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Bryan Wu <bryan.wu@canonical.com>
Uwe Kleine-König [Mon, 2 Jul 2012 09:30:34 +0000 (11:30 +0200)]
ARM: imx: assert SCC gate stays enabled
The SCC clock is needed in internal boot mode and so must keep enabled.
This same issue was fixed for the pre-common-clk code in commit
3d6e614 (mx35: Fix boot ROM hang in internal boot mode)
Cc: John Ogness <jogness@linutronix.de>
Cc: Hans J. Koch <hjk@hansjkoch.de>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Linus Torvalds [Wed, 4 Jul 2012 01:06:49 +0000 (18:06 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux
Pull fix to common clk framework from Michael Turquette:
"The previous set of common clk fixes for -rc5 left an uninitialized
int which could lead to bad array indexing when switching clock
parents. The issue is fixed with a trivial change to the code flow in
__clk_set_parent."
* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux:
clk: fix parent validation in __clk_set_parent()
Linus Torvalds [Wed, 4 Jul 2012 01:05:35 +0000 (18:05 -0700)]
Merge tag 'md-3.5-fixes' of git://neil.brown.name/md
Pull raid10 build failure fix from NeilBrown:
"I really shouldn't do important things late in the day. It seems that
I get careless."
* tag 'md-3.5-fixes' of git://neil.brown.name/md:
md/raid10: fix careless build error
Linus Torvalds [Wed, 4 Jul 2012 01:01:54 +0000 (18:01 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking update from David Miller:
1) Fix RX sequence number handling in mwifiex, from Stone Piao.
2) Netfilter ipset mis-compares device names, fix from Florian
Westphal.
3) Fix route leak in ipv6 IPVS, from Eric Dumazet.
4) NFS fixes. Several buffer overflows in NCI layer from Dan
Rosenberg, and release sock OOPS'er fix from Eric Dumazet.
5) Fix WEP handling ath9k, we started using a bit the chip provides to
indicate undecrypted packets but that bit turns out to be unreliable
in certain configurations. Fix from Felix Fietkau.
6) Fix Kconfig dependency bug in wlcore, from Randy Dunlap.
7) New USB IDs for rtlwifi driver from Larry Finger.
8) Fix crashes in qmi_wwan usbnet driver when disconnecting, from Bjørn
Mork.
9) Gianfar driver programs coalescing settings properly in single queue
mode, but does not do so in multi-queue mode. Fix from Claudiu
Manoil.
10) Missing module.h include in davinci_cpdma.c, from Daniel Mack.
11) Need dummy handler for IPSET_CMD_NONE otherwise we crash in ipset if
we get this via nfnetlink, fix from Tomasz Bursztyka.
12) Missing RCU unlock in nfnetlink error path, also from Tomasz.
13) Fix divide by zero in igbvf when the user tries to set an RX
coalescing value of 0 usecs, from Mitch A Williams.
14) We can process SCTP sacks for the wrong transport, oops. Fix from
Neil Horman.
15) Remove hw IP payload checksumming from e1000e driver. This has zery
value in our stack, and turning it on creates a very unintuitive
restriction for users when using jumbo MTUs.
Specifically, when IP payload checksums are on you cannot use both
receive hashing offload and jumbo MTU. Fix from Bruce Allan.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
e1000e: remove use of IP payload checksum
sctp: be more restrictive in transport selection on bundled sacks
igbvf: fix divide by zero
netfilter: nfnetlink: fix missing rcu_read_unlock in nfnetlink_rcv_msg
netfilter: ipset: fix crash if IPSET_CMD_NONE command is sent
davinci_cpdma: include linux/module.h
gianfar: Fix RXICr/TXICr programming for multi-queue mode
net: Downgrade CAP_SYS_MODULE deprecated message from error to warning.
net: qmi_wwan: fix Oops while disconnecting
mwifiex: fix memory leak associated with IE manamgement
ath9k: fix panic caused by returning a descriptor we have queued for reuse
mac80211: correct behaviour on unrecognised action frames
ath9k: enable serialize_regmode for non-PCIE AR9287
rtlwifi: rtl8192cu: New USB IDs
NFC: Return from rawsock_release when sk is NULL
iwlwifi: fix activating inactive stations
wlcore: drop INET dependency
ath9k: fix dynamic WEP related regression
NFC: Prevent multiple buffer overflows in NCI
netfilter: update location of my trees
...
NeilBrown [Tue, 3 Jul 2012 23:35:35 +0000 (09:35 +1000)]
md/raid10: fix careless build error
build error introduced by commit
b357f04a67c2aeee8
That function doesn't get extra args until a later patch. Bother.
Reported-by: Fengguang Wu <wfg@linux.intel.com>
Reported-by: Simon Kirby <sim@hostway.ca>
Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: NeilBrown <neilb@suse.de>
Linus Torvalds [Tue, 3 Jul 2012 22:51:22 +0000 (15:51 -0700)]
floppy: cancel any pending fd_timeouts before adding a new one
In commit
070ad7e793dc ("floppy: convert to delayed work and
single-thread wq") the 'fd_timeout' timer was converted to a delayed
work. However, the "del_timer(&fd_timeout)" was lost in the process,
and any previous pending timeouts would stay active when we then
re-queued the timeout.
This resulted in the floppy probe sequence having a (stale) 20s timeout
rather than the intended 3s timeout, and thus made booting with the
floppy driver (but no actual floppy controller) take much longer than it
should.
Of course, there's little reason for most people to compile the floppy
driver into the kernel at all, which is why most people never noticed.
Canceling the delayed work where we used to do the del_timer() fixes the
issue, and makes the floppy probing use the proper new timeout instead.
The three second timeout is still very wasteful, but better than the 20s
one.
Reported-and-tested-by: Andi Kleen <ak@linux.intel.com>
Reported-and-tested-by: Calvin Walton <calvin.walton@kepstin.ca>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 3 Jul 2012 22:45:10 +0000 (15:45 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block bits from Jens Axboe:
"As vacation is coming up, thought I'd better get rid of my pending
changes in my for-linus branch for this iteration. It contains:
- Two patches for mtip32xx. Killing a non-compliant sysfs interface
and moving it to debugfs, where it belongs.
- A few patches from Asias. Two legit bug fixes, and one killing an
interface that is no longer in use.
- A patch from Jan, making the annoying partition ioctl warning a bit
less annoying, by restricting it to !CAP_SYS_RAWIO only.
- Three bug fixes for drbd from Lars Ellenberg.
- A fix for an old regression for umem, it hasn't really worked since
the plugging scheme was changed in 3.0.
- A few fixes from Tejun.
- A splice fix from Eric Dumazet, fixing an issue with pipe
resizing."
* 'for-linus' of git://git.kernel.dk/linux-block:
scsi: Silence unnecessary warnings about ioctl to partition
block: Drop dead function blk_abort_queue()
block: Mitigate lock unbalance caused by lock switching
block: Avoid missed wakeup in request waitqueue
umem: fix up unplugging
splice: fix racy pipe->buffers uses
drbd: fix null pointer dereference with on-congestion policy when diskless
drbd: fix list corruption by failing but already aborted reads
drbd: fix access of unallocated pages and kernel panic
xen/blkfront: Add WARN to deal with misbehaving backends.
blkcg: drop local variable @q from blkg_destroy()
mtip32xx: Create debugfs entries for troubleshooting
mtip32xx: Remove 'registers' and 'flags' from sysfs
blkcg: fix blkg_alloc() failure path
block: blkcg_policy_cfq shouldn't be used if !CONFIG_CFQ_GROUP_IOSCHED
block: fix return value on cfq_init() failure
mtip32xx: Remove version.h header file inclusion
xen/blkback: Copy id field when doing BLKIF_DISCARD.
Xiao Guangrong [Tue, 3 Jul 2012 06:31:43 +0000 (14:31 +0800)]
KVM: MMU: fix shrinking page from the empty mmu
Fix:
[ 3190.059226] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 3190.062224] IP: [<
ffffffffa02aac66>] mmu_page_zap_pte+0x10/0xa7 [kvm]
[ 3190.063760] PGD
104f50067 PUD
112bea067 PMD 0
[ 3190.065309] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 3190.066860] CPU 1
[ ...... ]
[ 3190.109629] Call Trace:
[ 3190.111342] [<
ffffffffa02aada6>] kvm_mmu_prepare_zap_page+0xa9/0x1fc [kvm]
[ 3190.113091] [<
ffffffffa02ab2f5>] mmu_shrink+0x11f/0x1f3 [kvm]
[ 3190.114844] [<
ffffffffa02ab25d>] ? mmu_shrink+0x87/0x1f3 [kvm]
[ 3190.116598] [<
ffffffff81150c9d>] ? prune_super+0x142/0x154
[ 3190.118333] [<
ffffffff8110a4f4>] ? shrink_slab+0x39/0x31e
[ 3190.120043] [<
ffffffff8110a687>] shrink_slab+0x1cc/0x31e
[ 3190.121718] [<
ffffffff8110ca1d>] do_try_to_free_pages
This is caused by shrinking page from the empty mmu, although we have
checked n_used_mmu_pages, it is useless since the check is out of mmu-lock
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Xiao Guangrong [Tue, 3 Jul 2012 06:31:01 +0000 (14:31 +0800)]
KVM: fix fault page leak
fault_page is forgot to be freed
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Rajendra Nayak [Tue, 3 Jul 2012 06:41:41 +0000 (12:11 +0530)]
clk: fix parent validation in __clk_set_parent()
The below commit introduced a bug in __clk_set_parent()
which could cause it to *skip* the parent validation
which makes sure the parent passed to the api is a valid
one.
commit
7975059db572eb47f0fb272a62afeae272a4b209
Author: Rajendra Nayak <rnayak@ti.com>
Date: Wed Jun 6 14:41:31 2012 +0530
clk: Allow late cache allocation for clk->parents
This was identified by the following compiler warning..
drivers/clk/clk.c: In function '__clk_set_parent':
drivers/clk/clk.c:1083:5: warning: 'i' may be used uninitialized in this function [-Wuninitialized]
.. as reported by Marc Kleine-Budde.
There were various options discussed on how to fix this, one
being initing 'i' to clk->num_parents, but the below approach
was found to be more appropriate as it also makes the 'parent
validation' code simpler to read.
Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Rajendra Nayak <rnayak@ti.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
Cc: stable@kernel.org
Linus Torvalds [Tue, 3 Jul 2012 18:10:18 +0000 (11:10 -0700)]
Merge tag 'sound-3.5' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Just a few driver-specific fixes for ASoC and HD-audio."
* tag 'sound-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix no sound from ALC662 after Windows reboot
ASoC: tlv320aic3x: Fix codec pll configure bug
ASoC: wm2200: Add missing BCLK rate
Linus Torvalds [Tue, 3 Jul 2012 18:08:16 +0000 (11:08 -0700)]
Merge tag 'dm-3.5-fixes' of git://git./linux/kernel/git/agk/linux-dm
Pull device-mapper fixes from Alasdair G Kergon:
"Four minor thin provisioning fixes and correct and update dm-verity
documentation."
* tag 'dm-3.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
dm: verity fix documentation
dm persistent data: fix allocation failure in space map checker init
dm persistent data: handle space map checker creation failure
dm persistent data: fix shadow_info_leak on dm_tm_destroy
dm thin: commit metadata before creating metadata snapshot
Linus Torvalds [Tue, 3 Jul 2012 17:59:37 +0000 (10:59 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"One regression fix, two radeon fixes (one for an oops), and an i915
fix to unload framebuffers earlier.
We originally were going to leave the i915 fix until -next, but grub2
in some situations causes vesafb/efifb to be loaded now, and this
causes big slowdowns, and I have reports in rawhide I'd like to have
fixed."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/i915: kick any firmware framebuffers before claiming the gtt
drm: edid: Don't add inferred modes with higher resolution
drm/radeon: fix rare segfault
drm/radeon: fix VM page table setup on SI
Linus Torvalds [Tue, 3 Jul 2012 17:40:43 +0000 (10:40 -0700)]
Merge tag 'md-3.5-fixes' of git://neil.brown.name/md
Pull md fixes from NeilBrown:
"md: collection of bug fixes for 3.5
You go away for 2 weeks vacation and what do you get when you come
back? Piles of bugs :-)
Some found by inspection, some by testing, some during use in the
field, and some while developing for the next window..."
* tag 'md-3.5-fixes' of git://neil.brown.name/md:
md: fix up plugging (again).
md: support re-add of recovering devices.
md/raid1: fix bug in read_balance introduced by hot-replace
raid5: delayed stripe fix
md/raid456: When read error cannot be recovered, record bad block
md: make 'name' arg to md_register_thread non-optional.
md/raid10: fix failure when trying to repair a read error.
md/raid5: fix refcount problem when blocked_rdev is set.
md:Add blk_plug in sync_thread.
md/raid5: In ops_run_io, inc nr_pending before calling md_wait_for_blocked_rdev
md/raid5: Do not add data_offset before call to is_badblock
md/raid5: prefer replacing failed devices over want-replacement devices.
md/raid10: Don't try to recovery unmatched (and unused) chunks.
Linus Torvalds [Tue, 3 Jul 2012 17:39:40 +0000 (10:39 -0700)]
Merge branch 'for-linus2' of git://git./linux/kernel/git/jmorris/linux-security
Pull security layer fixes from James Morris.
A documentation update, and a nommu build fix.
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
security: Fix nommu build.
security: document no_new_privs
Milan Broz [Tue, 3 Jul 2012 11:55:41 +0000 (12:55 +0100)]
dm: verity fix documentation
Veritysetup is now part of cryptsetup package.
Remove on-disk header description (which is not parsed in kernel)
and point users to cryptsetup where it the format is documented.
Mention units for block size paramaters.
Fix target line specification and dmsetup parameters.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mike Snitzer [Tue, 3 Jul 2012 11:55:37 +0000 (12:55 +0100)]
dm persistent data: fix allocation failure in space map checker init
If CONFIG_DM_DEBUG_SPACE_MAPS is enabled and memory is fragmented and a
sufficiently-large metadata device is used in a thin pool then the space
map checker will fail to allocate the memory it requires.
Switch from kmalloc to vmalloc to allow larger virtually contiguous
allocations for the space map checker's internal count arrays.
Reported-by: Vivek Goyal <vgoyal@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mike Snitzer [Tue, 3 Jul 2012 11:55:35 +0000 (12:55 +0100)]
dm persistent data: handle space map checker creation failure
If CONFIG_DM_DEBUG_SPACE_MAPS is enabled and dm_sm_checker_create()
fails, dm_tm_create_internal() would still return success even though it
cleaned up all resources it was supposed to have created. This will
lead to a kernel crash:
general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
...
RIP: 0010:[<
ffffffff81593659>] [<
ffffffff81593659>] dm_bufio_get_block_size+0x9/0x20
Call Trace:
[<
ffffffff81599bae>] dm_bm_block_size+0xe/0x10
[<
ffffffff8159b8b8>] sm_ll_init+0x78/0xd0
[<
ffffffff8159c1a6>] sm_ll_new_disk+0x16/0xa0
[<
ffffffff8159c98e>] dm_sm_disk_create+0xfe/0x160
[<
ffffffff815abf6e>] dm_pool_metadata_open+0x16e/0x6a0
[<
ffffffff815aa010>] pool_ctr+0x3f0/0x900
[<
ffffffff8158d565>] dm_table_add_target+0x195/0x450
[<
ffffffff815904c4>] table_load+0xe4/0x330
[<
ffffffff815917ea>] ctl_ioctl+0x15a/0x2c0
[<
ffffffff81591963>] dm_ctl_ioctl+0x13/0x20
[<
ffffffff8116a4f8>] do_vfs_ioctl+0x98/0x560
[<
ffffffff8116aa51>] sys_ioctl+0x91/0xa0
[<
ffffffff81869f52>] system_call_fastpath+0x16/0x1b
Fix the space map checker code to return an appropriate ERR_PTR and have
dm_sm_disk_create() and dm_tm_create_internal() check for it with
IS_ERR.
Reported-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mike Snitzer [Tue, 3 Jul 2012 11:55:33 +0000 (12:55 +0100)]
dm persistent data: fix shadow_info_leak on dm_tm_destroy
Cleanup the shadow table before destroying the transaction manager.
Reference: leak was identified with kmemleak when running
test_discard_random_sectors in the thinp-test-suite.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Joe Thornber [Tue, 3 Jul 2012 11:55:31 +0000 (12:55 +0100)]
dm thin: commit metadata before creating metadata snapshot
Userland sometimes sees a corrupt metadata block if metadata is changing
rapidly when a metadata snapshot is reserved for userland, To make the
problem go away, commit before we take the metadata snapshot (which is a
sensible thing to do anyway).
The checksums mean userland spots this corruption immediately so there's
no risk of acting on incorrect data. No corruption exists from the
kernel's point of view, and thin_check passes after pool shutdown.
I believe this is to do with shared blocks at the first level of the
{device, mapping} btree. Prior to the metadata-snap support no sharing
at this level was possible, so this patch is only required after commit
cc8394d86f045b86ff303d3c9e4ce47d97148951 ("dm thin: provide userspace
access to pool metadata").
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Paul Mundt [Mon, 2 Jul 2012 05:34:11 +0000 (14:34 +0900)]
security: Fix nommu build.
The security + nommu configuration presently blows up with an undefined
reference to BDI_CAP_EXEC_MAP:
security/security.c: In function 'mmap_prot':
security/security.c:687:36: error: dereferencing pointer to incomplete type
security/security.c:688:16: error: 'BDI_CAP_EXEC_MAP' undeclared (first use in this function)
security/security.c:688:16: note: each undeclared identifier is reported only once for each function it appears in
include backing-dev.h directly to fix it up.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Daniel Vetter [Sun, 1 Jul 2012 15:09:42 +0000 (17:09 +0200)]
drm/i915: kick any firmware framebuffers before claiming the gtt
Especially vesafb likes to map everything as uc- (yikes), and if that
mapping hangs around still while we try to map the gtt as wc the
kernel will downgrade our request to uc-, resulting in abyssal
performance.
Unfortunately we can't do this as early as readon does (i.e. as the
first thing we do when initializing the hw) because our fb/mmio space
region moves around on a per-gen basis. So I've had to move it below
the gtt initialization, but that seems to work, too. The important
thing is that we do this before we set up the gtt wc mapping.
Now an altogether different question is why people compile their
kernels with vesafb enabled, but I guess making things just work isn't
bad per se ...
v2:
- s/radeondrmfb/inteldrmfb/
- fix up error handling
v3: Kill #ifdef X86, this is Intel after all. Noticed by Ben Widawsky.
v4: Jani Nikula complained about the pointless bool primary
initialization.
v5: Don't oops if we can't allocate, noticed by Chris Wilson.
v6: Resolve conflicts with agp rework and fixup whitespace.
This is commit
e188719a2891f01b3100d in drm-next.
Backport to 3.5 -fixes queue requested by Dave Airlie - due to grub
using vesa on fedora their initrd seems to load vesafb before loading
the real kms driver. So tons more people actually experience a
dead-slow gpu. Hence also the Cc: stable.
Cc: stable@vger.kernel.org
Reported-and-tested-by: "Kilarski, Bernard R" <bernard.r.kilarski@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Takashi Iwai [Tue, 3 Jul 2012 09:22:11 +0000 (11:22 +0200)]
drm: edid: Don't add inferred modes with higher resolution
When a monitor EDID doesn't give the preferred bit, driver assumes
that the mode with the higest resolution and rate is the preferred
mode. Meanwhile the recent changes for allowing more modes in the
GFT/CVT ranges give actually more modes, and some modes may be over
the native size. Thus such a mode would be picked up as the preferred
mode although it's no native resolution.
For avoiding such a problem, this patch limits the addition of
inferred modes by checking not to be greater than other modes.
Also, it checks the duplicated mode entry at the same time.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jerome Glisse [Mon, 2 Jul 2012 16:40:54 +0000 (12:40 -0400)]
drm/radeon: fix rare segfault
In gem idle/busy ioctl the radeon object was derefenced after
drm_gem_object_unreference_unlocked which in case the object
have been destroyed lead to use of a possibly free pointer with
possibly wrong data.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
NeilBrown [Tue, 3 Jul 2012 07:45:31 +0000 (17:45 +1000)]
md: fix up plugging (again).
The value returned by "mddev_check_plug" is only valid until the
next 'schedule' as that will unplug things. This could happen at any
call to mempool_alloc.
So just calling mddev_check_plug at the start doesn't really make
sense.
So call it just before, or just after, queuing things for the thread.
As the action that happens at unplug is to wake the thread, this makes
lots of sense.
If we cannot add a plug (which requires a small GFP_ATOMIC alloc) we
wake thread immediately.
RAID5 is a bit different. Requests are queued for the thread and the
thread is woken by release_stripe. So we don't need to wake the
thread on failure.
However the thread doesn't perform certain actions when there is any
active plug, so it is important to install a plug before waking the
thread. So for RAID5 we install the plug *before* queuing the request
and waking the thread.
Without this patch it is possible for raid1 or raid10 to queue a
request without then waking the thread, resulting in the array locking
up.
Also change raid10 to only flush_pending_write when there are not
active plugs, just like raid1.
This patch is suitable for 3.0 or later. I plan to submit it to
-stable, but I'll like to let it spend a few weeks in mainline
first to be sure it is completely safe.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 3 Jul 2012 05:59:06 +0000 (15:59 +1000)]
md: support re-add of recovering devices.
We currently only allow a device to be re-added if it appear to be
in-sync. This is overly restrictive as it may be desirable to re-add
a device that is in the middle of recovery.
So remove the test for "InSync" - the test on rdev->raid_disk is
sufficient to ensure that the re-add will succeed.
Reported-by: Alexander Lyakas <alex.bolshoy@gmail.com>
Tested-by: Alexander Lyakas <alex.bolshoy@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 3 Jul 2012 05:58:42 +0000 (15:58 +1000)]
md/raid1: fix bug in read_balance introduced by hot-replace
When we added hot_replace we doubled the number of devices
that could be in a RAID1 array. So we doubled how far read_balance
would search. Unfortunately we didn't double the point at which
it looped back to the beginning - so it effectively loops over
all non-replacement disks twice.
This doesn't cause bad behaviour, but it pointless and means we
never read from replacement devices.
Signed-off-by: NeilBrown <neilb@suse.de>
Shaohua Li [Tue, 3 Jul 2012 05:57:19 +0000 (15:57 +1000)]
raid5: delayed stripe fix
There isn't locking setting STRIPE_DELAYED and STRIPE_PREREAD_ACTIVE bits, but
the two bits have relationship. A delayed stripe can be moved to hold list only
when preread active stripe count is below IO_THRESHOLD. If a stripe has both
the bits set, such stripe will be in delayed list and preread count not 0,
which will make such stripe never leave delayed list.
Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: NeilBrown <neilb@suse.de>
majianpeng [Tue, 3 Jul 2012 05:57:02 +0000 (15:57 +1000)]
md/raid456: When read error cannot be recovered, record bad block
We may not be able to fix a bad block if:
- the array is degraded
- the over-write fails.
In these cases we currently eject the device, but we should
record a bad block if possible.
Signed-off-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 3 Jul 2012 05:56:52 +0000 (15:56 +1000)]
md: make 'name' arg to md_register_thread non-optional.
Having the 'name' arg optional and defaulting to the current
personality name is no necessary and leads to errors, as when
changing the level of an array we can end up using the
name of the old level instead of the new one.
So make it non-optional and always explicitly pass the name
of the level that the array will be.
Reported-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 3 Jul 2012 05:55:33 +0000 (15:55 +1000)]
md/raid10: fix failure when trying to repair a read error.
commit
58c54fcca3bac5bf9290cfed31c76e4c4bfbabaf
md/raid10: handle further errors during fix_read_error better.
in 3.1 added "r10_sync_page_io" which takes an IO size in sectors.
But we were passing the IO size in bytes!!!
This resulting in bio_add_page failing, and empty request being sent
down, and a consequent BUG_ON in scsi_lib.
[fix missing space in error message at same time]
This fix is suitable for 3.1.y and later.
Cc: stable@vger.kernel.org
Reported-by: Christian Balzer <chibi@gol.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Linus Torvalds [Tue, 3 Jul 2012 02:52:25 +0000 (19:52 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
Pull a couple more powerpc fixes from Benjamin Herrenschmidt:
"Here are two more fixes that I "missed" when scrubbing patchwork last
week which are worth still having in 3.5."
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/kvm: sldi should be sld
powerpc/xmon: Use cpumask iterator to avoid warning
Andy Lutomirski [Mon, 2 Jul 2012 21:03:58 +0000 (14:03 -0700)]
security: document no_new_privs
Document no_new_privs.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
NeilBrown [Tue, 3 Jul 2012 02:13:29 +0000 (12:13 +1000)]
md/raid5: fix refcount problem when blocked_rdev is set.
commit
43220aa0f22cd3ce5b30246d50ccd696d119edea
md/raid5: fix a hang on device failure.
fixed a hang, but introduced a refcounting in-balance so
that if the presence of bad-blocks ever caused an rdev to
be 'blocked' we would increment the refcount on the rdev and
never decrement it.
So added the needed rdev_dec_pending when md_wait_for_blocked_rdev
is not called.
Reported-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
majianpeng [Tue, 3 Jul 2012 02:12:26 +0000 (12:12 +1000)]
md:Add blk_plug in sync_thread.
Add blk_plug in sync_thread will increase the performance of sync.
Because sync_thread did not blk_plug,so when raid sync, the bio merge
not well.
Testing environment:
SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI
Controller.
OS:Linux xxx 3.5.0-rc2+ #340 SMP Tue Jun 12 09:00:25 CST 2012
x86_64 x86_64 x86_64 GNU/Linux.
RAID5: four ST31000524NS disk.
Without blk_plug:recovery speed about 63M/Sec;
Add blk_plug:recovery speed about 120M/Sec.
Using blktrace:
blktrace -d /dev/sdb -w 60 -o -|blkparse -i -
without blk_plug:
Total (8,16):
Reads Queued: 309811, 1239MiB Writes Queued: 0, 0KiB
Read Dispatches: 283583, 1189MiB Write Dispatches: 0, 0KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 273351, 1149MiB Writes Completed: 0, 0KiB
Read Merges: 23533, 94132KiB Write Merges: 0, 0KiB
IO unplugs: 0 Timer unplugs: 0
add blk_plug:
Total (8,16):
Reads Queued: 428697, 1714MiB Writes Queued: 0, 0KiB
Read Dispatches: 3954, 1714MiB Write Dispatches: 0, 0KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 3956, 1715MiB Writes Completed: 0, 0KiB
Read Merges: 424743, 1698MiB Write Merges: 0, 0KiB
IO unplugs: 0 Timer unplugs: 3384
The ratio of merge will be markedly increased.
Signed-off-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
majianpeng [Tue, 3 Jul 2012 02:11:54 +0000 (12:11 +1000)]
md/raid5: In ops_run_io, inc nr_pending before calling md_wait_for_blocked_rdev
In ops_run_io(), the call to md_wait_for_blocked_rdev will decrement
nr_pending so we lose the reference we hold on the rdev.
So atomic_inc it first to maintain the reference.
This bug was introduced by commit
73e92e51b7969ef5477d
md/raid5. Don't write to known bad block on doubtful devices.
which appeared in 3.0, so patch is suitable for stable kernels since
then.
Cc: stable@vger.kernel.org
Signed-off-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
majianpeng [Tue, 12 Jun 2012 00:31:10 +0000 (08:31 +0800)]
md/raid5: Do not add data_offset before call to is_badblock
In chunk_aligned_read() we are adding data_offset before calling
is_badblock. But is_badblock also adds data_offset, so that is bad.
So move the addition of data_offset to after the call to
is_badblock.
This bug was introduced by commit
31c176ecdf3563140e639
md/raid5: avoid reading from known bad blocks.
which first appeared in 3.0. So that patch is suitable for any
-stable kernel from 3.0.y onwards. However it will need minor
revision for most of those (as the comment didn't appear until
recently).
Cc: stable@vger.kernel.org
Signed-off-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 3 Jul 2012 01:46:53 +0000 (11:46 +1000)]
md/raid5: prefer replacing failed devices over want-replacement devices.
If a RAID5 has both a failed device and a device marked as
'WantReplacement', then we should preferentially replace the failed
device.
However the current code replaces whichever is found first.
So split into 2 loops, check fail failed/missing first, and only check
for WantReplacement if nothing is failed or missing.
Reported-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 3 Jul 2012 00:37:30 +0000 (10:37 +1000)]
md/raid10: Don't try to recovery unmatched (and unused) chunks.
If a RAID10 has an odd number of chunks - as might happen when there
are an odd number of devices - the last chunk has no pair and so is
not mirrored. We don't store data there, but when recovering the last
device in an array we retry to recover that last chunk from a
non-existent location. This results in an error, and the recovery
aborts.
When we get to that last chunk we should just stop - there is nothing
more to do anyway.
This bug has been present since the introduction of RAID10, so the
patch is appropriate for any -stable kernel.
Cc: stable@vger.kernel.org
Reported-by: Christian Balzer <chibi@gol.com>
Tested-by: Christian Balzer <chibi@gol.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Alex Williamson [Fri, 29 Jun 2012 15:56:24 +0000 (09:56 -0600)]
KVM: Sanitize KVM_IRQFD flags
We only know of one so far.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Alex Williamson [Fri, 29 Jun 2012 15:56:16 +0000 (09:56 -0600)]
KVM: Add missing KVM_IRQFD API documentation
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Alex Williamson [Fri, 29 Jun 2012 15:56:08 +0000 (09:56 -0600)]
KVM: Pass kvm_irqfd to functions
Prune this down to just the struct kvm_irqfd so we can avoid
changing function definition for every flag or field we use.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Arnd Bergmann [Mon, 2 Jul 2012 21:14:29 +0000 (23:14 +0200)]
Merge branch 'fixes' of git://github.com/hzhuang1/linux into fixes
* 'fixes' of git://github.com/hzhuang1/linux:
ARM: pxa: hx4700: Fix basic suspend/resume
Chris Mason [Mon, 2 Jul 2012 19:29:53 +0000 (15:29 -0400)]
Btrfs: run delayed directory updates during log replay
While we are resolving directory modifications in the
tree log, we are triggering delayed metadata updates to
the filesystem btrees.
This commit forces the delayed updates to run so the
replay code can find any modifications done. It stops
us from crashing because the directory deleltion replay
expects items to be removed immediately from the tree.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
cc: stable@kernel.org
Josef Bacik [Wed, 27 Jun 2012 21:18:41 +0000 (17:18 -0400)]
Btrfs: hold a ref on the inode during writepages
We can race with unlink and not actually be able to do our igrab in
btrfs_add_ordered_extent. This will result in all sorts of problems.
Instead of doing the complicated work to try and handle returning an error
properly from btrfs_add_ordered_extent, just hold a ref to the inode during
writepages. If we cannot grab a ref we know we're freeing this inode anyway
and can just drop the dirty pages on the floor, because screw them we're
going to invalidate them anyway. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Josef Bacik [Wed, 27 Jun 2012 19:10:56 +0000 (15:10 -0400)]
Btrfs: fix tree log remove space corner case
The tree log stuff can have allocated space that we end up having split
across a bitmap and a real extent. The free space code does not deal with
this, it assumes that if it finds an extent or bitmap entry that the entire
range must fall within the entry it finds. This isn't necessarily the case,
so rework the remove function so it can handle this case properly. This
fixed two panics the user hit, first in the case where the space was
initially in a bitmap and then in an extent entry, and then the reverse
case. Thanks,
Reported-and-tested-by: Shaun Reich <sreich@kde.org>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Liu Bo [Tue, 26 Jun 2012 03:59:09 +0000 (21:59 -0600)]
Btrfs: fix wrong check during log recovery
When we're evicting an inode during log recovery, we need to ensure that the inode
is not in orphan state any more, which means inode's run_time flags has _no_
BTRFS_INODE_HAS_ORPHAN_ITEM. Thus, the BUG_ON was triggered because of a wrong
check for the flags.
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Alexander Block [Mon, 25 Jun 2012 21:36:12 +0000 (15:36 -0600)]
Btrfs: use _IOR for BTRFS_IOC_SUBVOL_GETFLAGS
We used the wrong ioctl macro for the getflags ioctl before.
As we don't have the set/getflags ioctls in the user space ioctl.h
at the moment, it's safe to fix it now.
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Alexander Block <ablock84@googlemail.com>
Ilya Dryomov [Fri, 22 Jun 2012 18:24:13 +0000 (12:24 -0600)]
Btrfs: resume balance on rw (re)mounts properly
This introduces btrfs_resume_balance_async(), which, given that
restriper state was recovered earlier by btrfs_recover_balance(),
resumes balance in btrfs-balance kthread.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Ilya Dryomov [Fri, 22 Jun 2012 18:24:12 +0000 (12:24 -0600)]
Btrfs: restore restriper state on all mounts
Fix a bug that triggered asserts in btrfs_balance() in both normal and
resume modes -- restriper state was not properly restored on read-only
mounts. This factors out resuming code from btrfs_restore_balance(),
which is now also called earlier in the mount sequence to avoid the
problem of some early writes getting the old profile.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Josef Bacik [Tue, 19 Jun 2012 14:59:00 +0000 (10:59 -0400)]
Btrfs: fix dio write vs buffered read race
Miao pointed out there's a problem with mixing dio writes and buffered
reads. If the read happens between us invalidating the page range and
actually locking the extent we can bring in pages into page cache. Then
once the write finishes if somebody tries to read again it will just find
uptodate pages and we'll read stale data. So we need to lock the extent and
check for uptodate bits in the range. If there are uptodate bits we need to
unlock and invalidate again. This will keep this race from happening since
we will hold the extent locked until we create the ordered extent, and then
teh read side always waits for ordered extents. There was also a race in
how we updated i_size, previously we were relying on the generic DIO stuff
to adjust the i_size after the DIO had completed, but this happens outside
of the extent lock which means reads could come in and not see the updated
i_size. So instead move this work into where we create the extents, and
then this way the update ordered i_size stuff works properly in the endio
handlers. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Stefan Behrens [Thu, 14 Jun 2012 14:42:31 +0000 (16:42 +0200)]
Btrfs: don't count I/O statistic read errors for missing devices
It is normal behaviour of the low level btrfs function btrfs_map_bio()
to complete a bio with -EIO if the device is missing, instead of just
preventing the bio creation in an earlier step.
This used to cause I/O statistic read error increments and annoying
printk_ratelimited messages. This commit fixes the issue.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Reported-by: Carey Underwood <cwillu@cwillu.com>
Paul E. McKenney [Sun, 24 Jun 2012 17:15:02 +0000 (10:15 -0700)]
rcu: Make RCU_FAST_NO_HZ respect nohz= boot parameter
If the nohz= boot parameter disables nohz, then RCU_FAST_NO_HZ needs to
also disable itself. This commit therefore checks for tick_nohz_enabled
being zero, disabling rcu_prepare_for_idle() if so. This commit assumes
that tick_nohz_enabled can change at runtime: If this is not the case,
then a simpler approach suffices.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Paul E. McKenney [Tue, 5 Jun 2012 22:53:53 +0000 (15:53 -0700)]
rcu: Fix qlen_lazy breakage
Commit
d8169d4c (Make __kfree_rcu() less dependent on compiler choices)
created a macro out of an inline function in order to avoid build
breakage for certain combinations of gcc flags. Unfortunately, it also
converted a kfree_call_rcu() to a call_rcu(), which made the rcu_data
structure's ->qlen_lazy field lose counts. This commit therefore changes
the call_rcu() back to kfree_call_rcu().
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Paul E. McKenney [Tue, 5 Jun 2012 03:45:10 +0000 (20:45 -0700)]
rcu: Round FAST_NO_HZ lazy timeout to nearest second
Currently, if several CPUs in the same package have all lazy RCU
callbacks, their wakeups will be uncorrelated. If all the CPUs are in the
same power domain (as is often the case), this will result in unnecessary
power-ups of the package. This commit therefore uses round_jiffies()
to round the timeouts to a second boundary, increasing the odds that
they can be coalesced with each other or with other timeouts.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Paul E. McKenney [Thu, 10 May 2012 22:37:48 +0000 (15:37 -0700)]
rcu: The rcu_needs_cpu() function is not a quiescent state
The TINY_PREEMPT_RCU() function rcu_preempt_needs_cpu(), which is called
from rcu_needs_cpu(), assumes that it is in a quiescent state with respect
to the CPU. This is no longer the case. This commit therefore updates
rcu_preempt_needs_cpu() to make it aware that it is not running in a
quiescent state.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr>
Paul E. McKenney [Wed, 9 May 2012 22:55:39 +0000 (15:55 -0700)]
rcu: Dump only the current CPU's buffers for idle-entry/exit warnings
Problems in RCU idle entry and exit are almost always confined to the
offending CPU. This commit therefore switches ftrace_dump() from
DUMP_ALL to DUMP_ORIG.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr>
Paul E. McKenney [Thu, 21 Jun 2012 18:26:42 +0000 (11:26 -0700)]
rcu: Add check for CPUs going offline with callbacks queued
If a CPU goes offline with callbacks queued, those callbacks might be
indefinitely postponed, which can result in a system hang. This commit
therefore inserts warnings for this condition.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Paul E. McKenney [Tue, 19 Jun 2012 18:58:27 +0000 (11:58 -0700)]
rcu: Disable preemption in rcu_blocking_is_gp()
It is time to optimize CONFIG_TREE_PREEMPT_RCU's synchronize_rcu()
for uniprocessor optimization, which means that rcu_blocking_is_gp()
can no longer rely on RCU read-side critical sections having disabled
preemption. This commit therefore disables preemption across
rcu_blocking_is_gp()'s scan of the cpu_online_mask.
(Updated from previous version to fix embarrassing bug spotted by
Wu Fengguang.)
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Carsten Emde [Tue, 19 Jun 2012 08:43:16 +0000 (10:43 +0200)]
rcu: Prevent uninitialized string in RCU CPU stall info
An uninitialized string may be displayed at the end of the rcu_preempt
detected stall info such as
0: (1 GPs behind) idle=075/
140000000000000/0 =8?^D=8?^D
^^^^^^^^^^
if CONFIG_RCU_FAST_NO_HZ is not defined.
This trivial patch clears the string in this case.
Signed-off-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Paul E. McKenney [Wed, 6 Jun 2012 14:12:43 +0000 (07:12 -0700)]
rcu: Fix rcu_is_cpu_idle() #ifdef in TINY_RCU
The rcu_is_cpu_idle() function is used if CONFIG_DEBUG_LOCK_ALLOC,
but TINY_RCU defines it only when CONFIG_PROVE_RCU. This causes
build failures when CONFIG_DEBUG_LOCK_ALLOC=y but CONFIG_PROVE_RCU=n.
This commit therefore adjusts the #ifdefs for rcu_is_cpu_idle() so
that it is defined when CONFIG_DEBUG_LOCK_ALLOC=y.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Paul E. McKenney [Wed, 30 May 2012 10:21:48 +0000 (03:21 -0700)]
rcu: Split RCU core processing out of __call_rcu()
The __call_rcu() function is a bit overweight, so this commit splits
it into actual enqueuing of and accounting for the callback (__call_rcu())
and associated RCU-core processing (__call_rcu_core()).
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Sat, 26 May 2012 15:56:01 +0000 (08:56 -0700)]
rcu: Prevent __call_rcu() from invoking RCU core on offline CPUs
The __call_rcu() function will invoke the RCU core, for example, if
it detects that the current CPU has too many callbacks. However, this
can happen on an offline CPU that is on its way to the idle loop, in
which case it is an error to invoke the RCU core, and the excess callbacks
will be adopted in any case. This commit therefore adds checks to
__call_rcu() for running on an offline CPU, refraining from invoking
the RCU core in this case.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Wed, 23 May 2012 05:10:24 +0000 (22:10 -0700)]
rcu: Make __call_rcu() handle invocation from idle
Although __call_rcu() is handled correctly when called from a momentary
non-idle period, if it is called on a CPU that RCU believes to be idle
on RCU_FAST_NO_HZ kernels, the callback might be indefinitely postponed.
This commit therefore ensures that RCU is aware of the new callback and
has a chance to force the CPU out of dyntick-idle mode when a new callback
is posted.
Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Fri, 25 May 2012 21:25:58 +0000 (14:25 -0700)]
rcu: Remove function versions of __kfree_rcu and __is_kfree_rcu_offset
Commit
d8169d4c (Make __kfree_rcu() less dependent on compiler choices)
added cpp macro versions of __kfree_rcu() and __is_kfree_rcu_offset(),
but failed to remove the old inline-function versions. This commit does
this cleanup.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Mon, 21 May 2012 18:58:36 +0000 (11:58 -0700)]
rcu: Consolidate tree/tiny __rcu_read_{,un}lock() implementations
The CONFIG_TREE_PREEMPT_RCU and CONFIG_TINY_PREEMPT_RCU versions of
__rcu_read_lock() and __rcu_read_unlock() are identical, so this commit
consolidates them into kernel/rcupdate.h.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Wed, 16 May 2012 22:51:08 +0000 (15:51 -0700)]
rcu: Remove return value from rcu_assign_pointer()
The return value from rcu_assign_pointer() is not used, and using it
would be quite ugly, for example:
q = rcu_assign_pointer(global_p, p);
To prevent this sort of ugliness from spreading, this commit wraps
rcu_assign_pointer() in a do-while loop.
Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reported-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Wed, 16 May 2012 23:31:38 +0000 (16:31 -0700)]
key: Remove extraneous parentheses from rcu_assign_keypointer()
This commit removes the extraneous parentheses from rcu_assign_keypointer()
so that rcu_assign_pointer() can be wrapped in do-while. It also wraps
rcu_assign_keypointer() in a do-while and parenthesizes its final argument,
as suggested by David Howells.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Wed, 16 May 2012 22:42:30 +0000 (15:42 -0700)]
rcu: Remove return value from RCU_INIT_POINTER()
The return value from RCU_INIT_POINTER() is not used, and using it
would be quite ugly, for example:
q = RCU_INIT_POINTER(global_p, p);
To prevent this sort of ugliness from appearing, this commit wraps
RCU_INIT_POINTER() in a do-while loop.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: David Howells <dhowells@redhat.com>
Paul E. McKenney [Wed, 16 May 2012 22:33:15 +0000 (15:33 -0700)]
rcu: Use new RCU_POINTER_INITIALIZER for gcc-style initializations
This commit applies the INIT_RCU_POINTER() macro to all uses of
RCU_POINTER_INITIALIZER() that were all too cleverly creating gcc-style
initializations.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Paul E. McKenney [Wed, 16 May 2012 22:23:45 +0000 (15:23 -0700)]
rcu: Add a gcc-style structure initializer for RCU pointers
RCU_INIT_POINTER() returns a value that is never used, and which should
be abolished due to terminal ugliness:
q = RCU_INIT_POINTER(global_p, p);
However, there are two uses that cannot be handled by a do-while
formulation because they do gcc-style initialization:
RCU_INIT_POINTER(.real_cred, &init_cred),
RCU_INIT_POINTER(.cred, &init_cred),
This usage is clever, but not necessarily the nicest approach.
This commit therefore creates an RCU_POINTER_INITIALIZER() macro that
is specifically designed for gcc-style initialization.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Paul E. McKenney [Wed, 9 May 2012 22:44:42 +0000 (15:44 -0700)]
rcu: Add ACCESS_ONCE() to ->qlen accesses
The _rcu_barrier() function accesses other CPUs' rcu_data structure's
->qlen field without benefit of locking. This commit therefore adds
the required ACCESS_ONCE() wrappers around accesses and updates that
need it.
ACCESS_ONCE() is not needed when a CPU accesses its own ->qlen, or
in code that cannot run while _rcu_barrier() is sampling ->qlen fields.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Wed, 9 May 2012 22:39:56 +0000 (15:39 -0700)]
rcu: Consolidate duplicate callback-list initialization
There are a couple of open-coded initializations of the rcu_data
structure's RCU callback list. This commit therefore consolidates
them into a new init_callback_list() function.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Wed, 9 May 2012 15:45:12 +0000 (08:45 -0700)]
rcu: Fix detection of abruptly-ending stall
The code that attempts to identify stalls that end just as we detect
them is broken by both flavors of initialization failure. This commit
therefore properly initializes and computes the count of the number
of reasons why the RCU grace period is stalled.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Wed, 30 May 2012 00:50:51 +0000 (17:50 -0700)]
rcu: Make rcutorture fakewriters invoke rcu_barrier()
The current rcutorture rcu_barrier() testing never intentionally runs
more than one instance of rcu_barrier() at a given time. This fails
to test the the shiny new concurrency features of rcu_barrier(). This
commit therefore modifies the rcutorture fakewriter kthread to randomly
invoke rcu_barrier() rather than the usual synchronize_rcu().
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Paul E. McKenney [Fri, 25 May 2012 01:17:48 +0000 (18:17 -0700)]
rcu: Fix diagnostic-printk typo in rcutorture
The rcu_torture_barrier() function has a copy-and-paste typo in the
string passed to rcutorture_shutdown_absorb(), which this commit fixes.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Tue, 29 May 2012 02:21:41 +0000 (19:21 -0700)]
rcu: Fix bug in rcu_barrier() torture test
The child threads in the rcu_torture_barrier_cbs() are improperly
synchronized, which can cause the rcu_barrier() tests to hang. The
failure mode is as follows:
1. CPU 0 running in rcu_torture_barrier() sets barrier_cbs_count
to n_barrier_cbs.
2. CPU 1 running in rcu_torture_barrier_cbs() wakes up, posts
its RCU callback, and atomically decrements barrier_cbs_count.
Because barrier_cbs_count is not zero, it does not do the wake_up().
3. CPU 2 running in rcu_torture_barrier_cbs() wakes up, but
finds that barrier_cbs_count is not equal to n_barrier_cbs,
and so returns to sleep.
4. The value of barrier_cbs_count therefore never reaches zero,
which causes the test to hang.
This commit therefore uses a phase variable to coordinate the test,
preventing this scenario from occurring.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Paul E. McKenney [Tue, 8 May 2012 17:21:50 +0000 (10:21 -0700)]
rcu: Test srcu_barrier() from rcutorture test suite
SRCU now has a call_srcu() and an srcu_barrier(), but rcutorture does not
test them. This commit adds the machinery to allow rcutorture's existing
tests for call_rcu() and rcu_barrier() to apply to the SRCU equivalents.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Mon, 7 May 2012 20:44:44 +0000 (13:44 -0700)]
rcu: Rationalize ordering of torture_ops list
Move the raw SRCU interfaces out of the middle of the normal SRCU
interfaces.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>