firefly-linux-kernel-4.4.55.git
12 years agoMerge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck...
Ingo Molnar [Fri, 6 Jul 2012 14:13:58 +0000 (16:13 +0200)]
Merge branch 'rcu/next' of git://git./linux/kernel/git/paulmck/linux-rcu into core/rcu

Pull the RCU tree from Paul E. McKenney:

"The major features of this series are:

1. Preventing latency spikes of more than 200 microseconds for
kernels built with NR_CPUS=4096, which is reportedly becoming
the default for some distros.  This is a first step, as it does
not help with systems that actually -have- 4096 CPUs (work on
this case is in progress, but is not yet ready for mainline).
This category also includes improving concurrency of rcu_barrier(),
placed here due to conflicts.  Posted to LKML at:
https://lkml.org/lkml/2012/6/22/381.  Note that patches 18-22
of that series have been defered to 3.7, as they have not yet
proven themselves to be mainline-ready (and yes, these are the
ones intended to get rid of RCU's latency spikes for systems
that actually have 4096 CPUs).
2. Updates to documentation and rcutorture fixes, the latter category
including improvements to rcu_barrier() testing.  Posted to LKML at:
http://lkml.indiana.edu/hypermail/linux/kernel/1206.1/04094.html.
3. Miscellaneous fixes posted to LKML at:
https://lkml.org/lkml/2012/6/22/500, with the exception of the
last commit, which was posted here:
http://www.gossamer-threads.com/lists/linux/kernel/1561830
4. RCU_FAST_NO_HZ fixes and improvements.  Posted to LKML at:
http://lkml.indiana.edu/hypermail/linux/kernel/1206.1/00006.html
and http://www.gossamer-threads.com/lists/linux/kernel/1561833.
The first four patches of the first series went into 3.5 to fix
a regression.
5. Code-style fixes.  These were posted to LKML at
http://lkml.indiana.edu/hypermail/linux/kernel/1205.2/01180.html and
http://lkml.indiana.edu/hypermail/linux/kernel/1205.2/01181.html.
"

Signed-off-by: Ingo Molnar <mingo@kernel.org>
12 years agorcu: Fix broken strings in RCU's source code.
Paul E. McKenney [Thu, 17 May 2012 22:12:45 +0000 (15:12 -0700)]
rcu: Fix broken strings in RCU's source code.

Although the C language allows you to break strings across lines, doing
this makes it hard for people to find the Linux kernel code corresponding
to a given console message.  This commit therefore fixes broken strings
throughout RCU's source code.

Suggested-by: Josh Triplett <josh@joshtriplett.org>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
12 years agorcu: Fix code-style issues involving "else"
Paul E. McKenney [Thu, 28 Jun 2012 15:08:25 +0000 (08:08 -0700)]
rcu: Fix code-style issues involving "else"

The Linux kernel coding style says that single-statement blocks should
omit curly braces unless the other leg of the "if" statement has
multiple statements, in which case the curly braces should be included.
This commit fixes RCU's violations of this rule.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
12 years agoMerge branches 'bigrtm.2012.07.04a', 'doctorture.2012.07.02a', 'fixes.2012.07.06a...
Paul E. McKenney [Fri, 6 Jul 2012 12:59:20 +0000 (05:59 -0700)]
Merge branches 'bigrtm.2012.07.04a', 'doctorture.2012.07.02a', 'fixes.2012.07.06a' and 'fnh.2012.07.02a' into HEAD

bigrtm: First steps towards getting RCU out of the way of
tens-of-microseconds real-time response on systems compiled
with NR_CPUS=4096.  Also cleanups for and increased concurrency
of rcu_barrier() family of primitives.
doctorture: rcutorture and documentation improvements.
fixes:  Miscellaneous fixes.
fnh: RCU_FAST_NO_HZ fixes and improvements.

12 years agorcu: Introduce check for callback list/count mismatch
Paul E. McKenney [Mon, 25 Jun 2012 19:54:17 +0000 (12:54 -0700)]
rcu: Introduce check for callback list/count mismatch

The recent bug that introduced the RCU callback list/count mismatch
showed the need for a diagnostic to check for this, which this commit
adds.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
12 years agoMerge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm...
Linus Torvalds [Thu, 5 Jul 2012 20:20:02 +0000 (13:20 -0700)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Arnd Bergmann:
 "Small fixes on multiple ARM platforms
   - A build regression from a previous fix on dove and mv78xx0
   - Two fixes for recently (3.5-rc1) changed mmp/pxa code
   - multiple omap2+ bug fixes
   - two trivial fixes for i.MX
   - one v3.5 regression for mxs"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: apx4devkit: fix FEC enabling PHY clock
  ARM: OMAP2+: hwmod data: Fix wrong McBSP clock alias on OMAP4
  ARM: OMAP4: hwmod data: temporarily comment out data for the usb_host_fs and aess IP blocks
  ARM: Orion: Fix WDT compile for Dove and MV78xx0
  ARM: mmp: remove mach/gpio-pxa.h
  ARM: imx: assert SCC gate stays enabled
  ARM: OMAP4: TWL6030: ensure sys_nirq1 is mux'd and wakeup enabled
  ARM: OMAP2: Overo: init I2C before MMC to fix MMC suspend/resume failure
  ARM: imx27_visstrim_m10: Do not include <asm/system.h>
  ARM: pxa: hx4700: Fix basic suspend/resume

12 years agoMerge git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Thu, 5 Jul 2012 20:16:21 +0000 (13:16 -0700)]
Merge git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fix from Marcelo Tosatti:
 "Memory leak and oops on the x86 mmu code, and sanitization of the
  KVM_IRQFD ioctl."

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: MMU: fix shrinking page from the empty mmu
  KVM: fix fault page leak
  KVM: Sanitize KVM_IRQFD flags
  KVM: Add missing KVM_IRQFD API documentation
  KVM: Pass kvm_irqfd to functions

12 years agoMerge branch 'fixes-for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/coolone...
Linus Torvalds [Thu, 5 Jul 2012 20:09:37 +0000 (13:09 -0700)]
Merge branch 'fixes-for-3.5' of git://git./linux/kernel/git/cooloney/linux-leds

Pull leds fix from Bryan Wu:
 "Fix for heartbeat led trigger driver"

* 'fixes-for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds:
  leds: heartbeat: fix bug on panic

12 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux...
Linus Torvalds [Thu, 5 Jul 2012 20:06:25 +0000 (13:06 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs

Pull btrfs updates from Chris Mason:
 "I held off on my rc5 pull because I hit an oops during log recovery
  after a crash.  I wanted to make sure it wasn't a regression because
  we have some logging fixes in here.

  It turns out that a commit during the merge window just made it much
  more likely to trigger directory logging instead of full commits,
  which exposed an old bug.

  The new backref walking code got some additional fixes.  This should
  be the final set of them.

  Josef fixed up a corner where our O_DIRECT writes and buffered reads
  could expose old file contents (not stale, just not the most recent).
  He and Liu Bo fixed crashes during tree log recover as well.

  Ilya fixed errors while we resume disk balancing operations on
  readonly mounts."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: run delayed directory updates during log replay
  Btrfs: hold a ref on the inode during writepages
  Btrfs: fix tree log remove space corner case
  Btrfs: fix wrong check during log recovery
  Btrfs: use _IOR for BTRFS_IOC_SUBVOL_GETFLAGS
  Btrfs: resume balance on rw (re)mounts properly
  Btrfs: restore restriper state on all mounts
  Btrfs: fix dio write vs buffered read race
  Btrfs: don't count I/O statistic read errors for missing devices
  Btrfs: resolve tree mod log locking issue in btrfs_next_leaf
  Btrfs: fix tree mod log rewind of ADD operations
  Btrfs: leave critical region in btrfs_find_all_roots as soon as possible
  Btrfs: always put insert_ptr modifications into the tree mod log
  Btrfs: fix tree mod log for root replacements at leaf level
  Btrfs: support root level changes in __resolve_indirect_ref
  Btrfs: avoid waiting for delayed refs when we must not

12 years agoMerge branch 'fixes-for-grant' of git://sources.calxeda.com/kernel/linux
Linus Torvalds [Thu, 5 Jul 2012 18:53:47 +0000 (11:53 -0700)]
Merge branch 'fixes-for-grant' of git://sources.calxeda.com/kernel/linux

Pull DT fixes from Rob Herring:
 "Mainly some documentation updates and 2 fixes:

   - An export symbol fix for of_platform_populate from Stephen W.
   - A fix for the order compatible entries are matched to ensure the
     first compatible string is matched when there are multiple matches."

Normally these would go through Grant Likely (thus the "fixes-for-grant"
branch name), but Grant is in the middle of moving to Scotland, and is
practically offline until sometime in August. So pull directly from Rob.

* 'fixes-for-grant' of git://sources.calxeda.com/kernel/linux:
  of: match by compatible property first
  dt: mc13xxx.txt: Fix gpio number assignment
  dt: fsl-fec.txt: Fix gpio number assignment
  dt: fsl-mma8450.txt: Add missing 'reg' description
  dt: fsl-imx-esdhc.txt: Fix gpio number assignment
  dt: fsl-imx-cspi.txt: Fix comment about GPIOs used for chip selects
  of: Add Avionic Design vendor prefix
  of: export of_platform_populate()

12 years agoMerge tag 'omap-fixes-for-v3.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel...
Arnd Bergmann [Thu, 5 Jul 2012 10:16:13 +0000 (12:16 +0200)]
Merge tag 'omap-fixes-for-v3.5-rc5' of git://git./linux/kernel/git/tmlind/linux-omap into fixes

PM related fixes for omaps mostly to get suspend/resume
working again.

* tag 'omap-fixes-for-v3.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: OMAP2+: hwmod data: Fix wrong McBSP clock alias on OMAP4
  ARM: OMAP4: hwmod data: temporarily comment out data for the usb_host_fs and aess IP blocks
  ARM: OMAP4: TWL6030: ensure sys_nirq1 is mux'd and wakeup enabled
  ARM: OMAP2: Overo: init I2C before MMC to fix MMC suspend/resume failure

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
12 years agoMerge branch 'mxs/fixes-for-3.5' of git://git.linaro.org/people/shawnguo/linux-2...
Arnd Bergmann [Thu, 5 Jul 2012 09:06:36 +0000 (11:06 +0200)]
Merge branch 'mxs/fixes-for-3.5' of git://git.linaro.org/people/shawnguo/linux-2.6 into fixes

* 'mxs/fixes-for-3.5' of git://git.linaro.org/people/shawnguo/linux-2.6:
  ARM: apx4devkit: fix FEC enabling PHY clock

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
12 years agoMerge tag 'omap-fixes-b-for-3.5rc' of git://git.kernel.org/pub/scm/linux/kernel/git...
Tony Lindgren [Thu, 5 Jul 2012 08:12:08 +0000 (01:12 -0700)]
Merge tag 'omap-fixes-b-for-3.5rc' of git://git./linux/kernel/git/pjw/omap-pending into fixes

A few more OMAP fixes for 3.5-rc.  These fix some bugs with power
management and McBSP.

12 years agoARM: apx4devkit: fix FEC enabling PHY clock
Lauri Hintsala [Thu, 5 Jul 2012 07:31:36 +0000 (10:31 +0300)]
ARM: apx4devkit: fix FEC enabling PHY clock

Ethernet stopped to work after mxs clk framework change.

Signed-off-by: Lauri Hintsala <lauri.hintsala@bluegiga.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
12 years agoARM: OMAP2+: hwmod data: Fix wrong McBSP clock alias on OMAP4
Benoit Cousson [Wed, 4 Jul 2012 12:55:29 +0000 (06:55 -0600)]
ARM: OMAP2+: hwmod data: Fix wrong McBSP clock alias on OMAP4

The commit 503d0ea24d1d3dd3db95e5e0edd693da7a2a23eb
  ARM: OMAP4: hwmod data: Add aliases for McBSP fclk clocks

added a wrong "prcm_clk" alias for PRCM clock whereas the McBSP
driver and previous OMAPs are using "prcm_fck".

It thus lead to the following warning.

[   47.409729] omap-mcbsp: clks: could not clk_get() prcm_fck

Fix that by changing the opt_clk role to prcm_fck.

Reported-by: Misael Lopez Cruz <misael.lopez@ti.com>
Signed-off-by: Benoit Cousson <b-cousson@ti.com>
Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Sebastien Guiriec <s-guiriec@ti.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>
12 years agoARM: OMAP4: hwmod data: temporarily comment out data for the usb_host_fs and aess...
Paul Walmsley [Wed, 4 Jul 2012 12:55:29 +0000 (06:55 -0600)]
ARM: OMAP4: hwmod data: temporarily comment out data for the usb_host_fs and aess IP blocks

The OMAP4 usb_host_fs (OHCI) and AESS IP blocks require some special
programming for them to enter idle.  Without this programming, they
will prevent the rest of the chip from entering full chip idle.

To implement the idle programming cleanly, this will take some
coordination between maintainers.  This is likely to take some time,
so it is probably best to leave this for 3.6 or 3.7.  So, in the
meantime, prevent these IP blocks from being registered.

Later, once the appropriate support is available, this patch can be
reverted.

This second version comments out the IP block data since Benoît didn't
like removing it.

Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Benoît Cousson <b-cousson@ti.com>
12 years agoMerge branch 'fixes' of git://github.com/hzhuang1/linux into fixes
Arnd Bergmann [Wed, 4 Jul 2012 11:49:58 +0000 (13:49 +0200)]
Merge branch 'fixes' of git://github.com/hzhuang1/linux into fixes

From Haojian Zhuang <haojian.zhuang@gmail.com>:

* 'fixes' of git://github.com/hzhuang1/linux:
  ARM: mmp: remove mach/gpio-pxa.h

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
12 years agoMerge tag 'v3.5-imx-fixes' of git://git.pengutronix.de/git/imx/linux-2.6 into fixes
Arnd Bergmann [Wed, 4 Jul 2012 11:46:05 +0000 (13:46 +0200)]
Merge tag 'v3.5-imx-fixes' of git://git.pengutronix.de/git/imx/linux-2.6 into fixes

From Sascha Hauer <s.hauer@pengutronix.de>:

ARM i.MX fixes for v3.5-rc5

* tag 'v3.5-imx-fixes' of git://git.pengutronix.de/git/imx/linux-2.6:
  ARM: imx: assert SCC gate stays enabled
  ARM: imx27_visstrim_m10: Do not include <asm/system.h>

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
12 years agoARM: Orion: Fix WDT compile for Dove and MV78xx0
Andrew Lunn [Fri, 29 Jun 2012 07:25:58 +0000 (09:25 +0200)]
ARM: Orion: Fix WDT compile for Dove and MV78xx0

Commit 0fa1f0609a0c1fe8b2be3c0089a2cb48f7fda521 (ARM: Orion: Fix
Virtual/Physical mixup with watchdog) broke the Dove & MV78xx0
build. Although these two SoC don't use the watchdog, the shared
platform code still needs to build. Add the necessary defines.

Cc: stable@vger.kernel.org
Reported-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
12 years agoARM: mmp: remove mach/gpio-pxa.h
Paul Bolle [Mon, 2 Jul 2012 21:40:14 +0000 (23:40 +0200)]
ARM: mmp: remove mach/gpio-pxa.h

Commit 157d2644cb0c1e71a18baaffca56d2b1d0ebf10f ("ARM: pxa: change gpio
to platform device") removed all includes of mach/gpio-pxa.h. It kept
this unused header in the tree. Using it can't work, as it itself
includes the non-existent header plat/gpio-pxa.h. This header can safely
be removed.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Haojian Zhuang <haojian.zhuang@gmail.com>
12 years agoleds: heartbeat: fix bug on panic
Alexander Holler [Tue, 3 Jul 2012 06:35:47 +0000 (14:35 +0800)]
leds: heartbeat: fix bug on panic

With commit 49dca5aebfdeadd4bf27b6cb4c60392147dc35a4 I introduced
a bug (visible if CONFIG_PROVE_RCU is enabled) which occures when a panic
has happened:

[ 1526.520230] ===============================
[ 1526.520230] [ INFO: suspicious RCU usage. ]
[ 1526.520230] 3.5.0-rc1+ #12 Not tainted
[ 1526.520230] -------------------------------
[ 1526.520230] /c/kernel-tests/mm/include/linux/rcupdate.h:436 Illegal context switch in RCU read-side critical section!
[ 1526.520230]
[ 1526.520230] other info that might help us debug this:
[ 1526.520230]
[ 1526.520230]
[ 1526.520230] rcu_scheduler_active = 1, debug_locks = 0
[ 1526.520230] 3 locks held by net.agent/3279:
[ 1526.520230]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff82f85962>] do_page_fault+0x193/0x390
[ 1526.520230]  #1:  (panic_lock){+.+...}, at: [<ffffffff82ed2830>] panic+0x37/0x1d3
[ 1526.520230]  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff810b9b28>] rcu_lock_acquire+0x0/0x29
[ 1526.520230]
[ 1526.520230] stack backtrace:
[ 1526.520230] Pid: 3279, comm: net.agent Not tainted 3.5.0-rc1+ #12
[ 1526.520230] Call Trace:
[ 1526.520230]  [<ffffffff810e1570>] lockdep_rcu_suspicious+0x109/0x112
[ 1526.520230]  [<ffffffff810bfe3a>] rcu_preempt_sleep_check+0x45/0x47
[ 1526.520230]  [<ffffffff810bfe5a>] __might_sleep+0x1e/0x19a
[ 1526.520230]  [<ffffffff82f8010e>] down_write+0x26/0x81
[ 1526.520230]  [<ffffffff8276a966>] led_trigger_unregister+0x1f/0x9c
[ 1526.520230]  [<ffffffff8276def5>] heartbeat_reboot_notifier+0x15/0x19
[ 1526.520230]  [<ffffffff82f85bf5>] notifier_call_chain+0x96/0xcd
[ 1526.520230]  [<ffffffff82f85cba>] __atomic_notifier_call_chain+0x8e/0xff
[ 1526.520230]  [<ffffffff81094b7c>] ? kmsg_dump+0x37/0x1eb
[ 1526.520230]  [<ffffffff82f85d3f>] atomic_notifier_call_chain+0x14/0x16
[ 1526.520230]  [<ffffffff82ed28e1>] panic+0xe8/0x1d3
[ 1526.520230]  [<ffffffff811473e2>] out_of_memory+0x15d/0x1d3

So in case of a panic, now just turn of the LED. Other approaches like
scheduling a work to unregister the trigger aren't working because there
isn't much which still runs after a panic occured (except timers).

Signed-off-by: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Bryan Wu <bryan.wu@canonical.com>
12 years agoARM: imx: assert SCC gate stays enabled
Uwe Kleine-König [Mon, 2 Jul 2012 09:30:34 +0000 (11:30 +0200)]
ARM: imx: assert SCC gate stays enabled

The SCC clock is needed in internal boot mode and so must keep enabled.
This same issue was fixed for the pre-common-clk code in commit

3d6e614 (mx35: Fix boot ROM hang in internal boot mode)

Cc: John Ogness <jogness@linutronix.de>
Cc: Hans J. Koch <hjk@hansjkoch.de>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
12 years agoMerge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux
Linus Torvalds [Wed, 4 Jul 2012 01:06:49 +0000 (18:06 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux

Pull fix to common clk framework from Michael Turquette:
 "The previous set of common clk fixes for -rc5 left an uninitialized
  int which could lead to bad array indexing when switching clock
  parents.  The issue is fixed with a trivial change to the code flow in
  __clk_set_parent."

* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux:
  clk: fix parent validation in __clk_set_parent()

12 years agoMerge tag 'md-3.5-fixes' of git://neil.brown.name/md
Linus Torvalds [Wed, 4 Jul 2012 01:05:35 +0000 (18:05 -0700)]
Merge tag 'md-3.5-fixes' of git://neil.brown.name/md

Pull raid10 build failure fix from NeilBrown:
 "I really shouldn't do important things late in the day.  It seems that
  I get careless."

* tag 'md-3.5-fixes' of git://neil.brown.name/md:
  md/raid10: fix careless build error

12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Wed, 4 Jul 2012 01:01:54 +0000 (18:01 -0700)]
Merge git://git./linux/kernel/git/davem/net

Pull networking update from David Miller:

 1) Fix RX sequence number handling in mwifiex, from Stone Piao.

 2) Netfilter ipset mis-compares device names, fix from Florian
    Westphal.

 3) Fix route leak in ipv6 IPVS, from Eric Dumazet.

 4) NFS fixes.  Several buffer overflows in NCI layer from Dan
    Rosenberg, and release sock OOPS'er fix from Eric Dumazet.

 5) Fix WEP handling ath9k, we started using a bit the chip provides to
    indicate undecrypted packets but that bit turns out to be unreliable
    in certain configurations.  Fix from Felix Fietkau.

 6) Fix Kconfig dependency bug in wlcore, from Randy Dunlap.

 7) New USB IDs for rtlwifi driver from Larry Finger.

 8) Fix crashes in qmi_wwan usbnet driver when disconnecting, from Bjørn
    Mork.

 9) Gianfar driver programs coalescing settings properly in single queue
    mode, but does not do so in multi-queue mode.  Fix from Claudiu
    Manoil.

10) Missing module.h include in davinci_cpdma.c, from Daniel Mack.

11) Need dummy handler for IPSET_CMD_NONE otherwise we crash in ipset if
    we get this via nfnetlink, fix from Tomasz Bursztyka.

12) Missing RCU unlock in nfnetlink error path, also from Tomasz.

13) Fix divide by zero in igbvf when the user tries to set an RX
    coalescing value of 0 usecs, from Mitch A Williams.

14) We can process SCTP sacks for the wrong transport, oops.  Fix from
    Neil Horman.

15) Remove hw IP payload checksumming from e1000e driver.  This has zery
    value in our stack, and turning it on creates a very unintuitive
    restriction for users when using jumbo MTUs.

    Specifically, when IP payload checksums are on you cannot use both
    receive hashing offload and jumbo MTU.  Fix from Bruce Allan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
  e1000e: remove use of IP payload checksum
  sctp: be more restrictive in transport selection on bundled sacks
  igbvf: fix divide by zero
  netfilter: nfnetlink: fix missing rcu_read_unlock in nfnetlink_rcv_msg
  netfilter: ipset: fix crash if IPSET_CMD_NONE command is sent
  davinci_cpdma: include linux/module.h
  gianfar: Fix RXICr/TXICr programming for multi-queue mode
  net: Downgrade CAP_SYS_MODULE deprecated message from error to warning.
  net: qmi_wwan: fix Oops while disconnecting
  mwifiex: fix memory leak associated with IE manamgement
  ath9k: fix panic caused by returning a descriptor we have queued for reuse
  mac80211: correct behaviour on unrecognised action frames
  ath9k: enable serialize_regmode for non-PCIE AR9287
  rtlwifi: rtl8192cu: New USB IDs
  NFC: Return from rawsock_release when sk is NULL
  iwlwifi: fix activating inactive stations
  wlcore: drop INET dependency
  ath9k: fix dynamic WEP related regression
  NFC: Prevent multiple buffer overflows in NCI
  netfilter: update location of my trees
  ...

12 years agomd/raid10: fix careless build error
NeilBrown [Tue, 3 Jul 2012 23:35:35 +0000 (09:35 +1000)]
md/raid10: fix careless build error

build error introduced by commit b357f04a67c2aeee8

That function doesn't get extra args until a later patch.  Bother.

Reported-by: Fengguang Wu <wfg@linux.intel.com>
Reported-by: Simon Kirby <sim@hostway.ca>
Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: NeilBrown <neilb@suse.de>
12 years agofloppy: cancel any pending fd_timeouts before adding a new one
Linus Torvalds [Tue, 3 Jul 2012 22:51:22 +0000 (15:51 -0700)]
floppy: cancel any pending fd_timeouts before adding a new one

In commit 070ad7e793dc ("floppy: convert to delayed work and
single-thread wq") the 'fd_timeout' timer was converted to a delayed
work.  However, the "del_timer(&fd_timeout)" was lost in the process,
and any previous pending timeouts would stay active when we then
re-queued the timeout.

This resulted in the floppy probe sequence having a (stale) 20s timeout
rather than the intended 3s timeout, and thus made booting with the
floppy driver (but no actual floppy controller) take much longer than it
should.

Of course, there's little reason for most people to compile the floppy
driver into the kernel at all, which is why most people never noticed.

Canceling the delayed work where we used to do the del_timer() fixes the
issue, and makes the floppy probing use the proper new timeout instead.
The three second timeout is still very wasteful, but better than the 20s
one.

Reported-and-tested-by: Andi Kleen <ak@linux.intel.com>
Reported-and-tested-by: Calvin Walton <calvin.walton@kepstin.ca>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block
Linus Torvalds [Tue, 3 Jul 2012 22:45:10 +0000 (15:45 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block bits from Jens Axboe:
 "As vacation is coming up, thought I'd better get rid of my pending
  changes in my for-linus branch for this iteration.  It contains:

   - Two patches for mtip32xx.  Killing a non-compliant sysfs interface
     and moving it to debugfs, where it belongs.

   - A few patches from Asias.  Two legit bug fixes, and one killing an
     interface that is no longer in use.

   - A patch from Jan, making the annoying partition ioctl warning a bit
     less annoying, by restricting it to !CAP_SYS_RAWIO only.

   - Three bug fixes for drbd from Lars Ellenberg.

   - A fix for an old regression for umem, it hasn't really worked since
     the plugging scheme was changed in 3.0.

   - A few fixes from Tejun.

   - A splice fix from Eric Dumazet, fixing an issue with pipe
     resizing."

* 'for-linus' of git://git.kernel.dk/linux-block:
  scsi: Silence unnecessary warnings about ioctl to partition
  block: Drop dead function blk_abort_queue()
  block: Mitigate lock unbalance caused by lock switching
  block: Avoid missed wakeup in request waitqueue
  umem: fix up unplugging
  splice: fix racy pipe->buffers uses
  drbd: fix null pointer dereference with on-congestion policy when diskless
  drbd: fix list corruption by failing but already aborted reads
  drbd: fix access of unallocated pages and kernel panic
  xen/blkfront: Add WARN to deal with misbehaving backends.
  blkcg: drop local variable @q from blkg_destroy()
  mtip32xx: Create debugfs entries for troubleshooting
  mtip32xx: Remove 'registers' and 'flags' from sysfs
  blkcg: fix blkg_alloc() failure path
  block: blkcg_policy_cfq shouldn't be used if !CONFIG_CFQ_GROUP_IOSCHED
  block: fix return value on cfq_init() failure
  mtip32xx: Remove version.h header file inclusion
  xen/blkback: Copy id field when doing BLKIF_DISCARD.

12 years agoKVM: MMU: fix shrinking page from the empty mmu
Xiao Guangrong [Tue, 3 Jul 2012 06:31:43 +0000 (14:31 +0800)]
KVM: MMU: fix shrinking page from the empty mmu

Fix:

 [ 3190.059226] BUG: unable to handle kernel NULL pointer dereference at           (null)
 [ 3190.062224] IP: [<ffffffffa02aac66>] mmu_page_zap_pte+0x10/0xa7 [kvm]
 [ 3190.063760] PGD 104f50067 PUD 112bea067 PMD 0
 [ 3190.065309] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
 [ 3190.066860] CPU 1
[ ...... ]
 [ 3190.109629] Call Trace:
 [ 3190.111342]  [<ffffffffa02aada6>] kvm_mmu_prepare_zap_page+0xa9/0x1fc [kvm]
 [ 3190.113091]  [<ffffffffa02ab2f5>] mmu_shrink+0x11f/0x1f3 [kvm]
 [ 3190.114844]  [<ffffffffa02ab25d>] ? mmu_shrink+0x87/0x1f3 [kvm]
 [ 3190.116598]  [<ffffffff81150c9d>] ? prune_super+0x142/0x154
 [ 3190.118333]  [<ffffffff8110a4f4>] ? shrink_slab+0x39/0x31e
 [ 3190.120043]  [<ffffffff8110a687>] shrink_slab+0x1cc/0x31e
 [ 3190.121718]  [<ffffffff8110ca1d>] do_try_to_free_pages

This is caused by shrinking page from the empty mmu, although we have
checked n_used_mmu_pages, it is useless since the check is out of mmu-lock

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
12 years agoKVM: fix fault page leak
Xiao Guangrong [Tue, 3 Jul 2012 06:31:01 +0000 (14:31 +0800)]
KVM: fix fault page leak

fault_page is forgot to be freed

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
12 years agoclk: fix parent validation in __clk_set_parent()
Rajendra Nayak [Tue, 3 Jul 2012 06:41:41 +0000 (12:11 +0530)]
clk: fix parent validation in __clk_set_parent()

The below commit introduced a bug in __clk_set_parent()
which could cause it to *skip* the parent validation
which makes sure the parent passed to the api is a valid
one.

    commit 7975059db572eb47f0fb272a62afeae272a4b209
    Author: Rajendra Nayak <rnayak@ti.com>
    Date:   Wed Jun 6 14:41:31 2012 +0530

        clk: Allow late cache allocation for clk->parents

This was identified by the following compiler warning..

    drivers/clk/clk.c: In function '__clk_set_parent':
    drivers/clk/clk.c:1083:5: warning: 'i' may be used uninitialized in this function [-Wuninitialized]

.. as reported by Marc Kleine-Budde.

There were various options discussed on how to fix this, one
being initing 'i' to clk->num_parents, but the below approach
was found to be more appropriate as it also makes the 'parent
validation' code simpler to read.

Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Rajendra Nayak <rnayak@ti.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
Cc: stable@kernel.org
12 years agoMerge tag 'sound-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Linus Torvalds [Tue, 3 Jul 2012 18:10:18 +0000 (11:10 -0700)]
Merge tag 'sound-3.5' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Just a few driver-specific fixes for ASoC and HD-audio."

* tag 'sound-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Fix no sound from ALC662 after Windows reboot
  ASoC: tlv320aic3x: Fix codec pll configure bug
  ASoC: wm2200: Add missing BCLK rate

12 years agoMerge tag 'dm-3.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm
Linus Torvalds [Tue, 3 Jul 2012 18:08:16 +0000 (11:08 -0700)]
Merge tag 'dm-3.5-fixes' of git://git./linux/kernel/git/agk/linux-dm

Pull device-mapper fixes from Alasdair G Kergon:
 "Four minor thin provisioning fixes and correct and update dm-verity
  documentation."

* tag 'dm-3.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
  dm: verity fix documentation
  dm persistent data: fix allocation failure in space map checker init
  dm persistent data: handle space map checker creation failure
  dm persistent data: fix shadow_info_leak on dm_tm_destroy
  dm thin: commit metadata before creating metadata snapshot

12 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Tue, 3 Jul 2012 17:59:37 +0000 (10:59 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "One regression fix, two radeon fixes (one for an oops), and an i915
  fix to unload framebuffers earlier.

  We originally were going to leave the i915 fix until -next, but grub2
  in some situations causes vesafb/efifb to be loaded now, and this
  causes big slowdowns, and I have reports in rawhide I'd like to have
  fixed."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/i915: kick any firmware framebuffers before claiming the gtt
  drm: edid: Don't add inferred modes with higher resolution
  drm/radeon: fix rare segfault
  drm/radeon: fix VM page table setup on SI

12 years agoMerge tag 'md-3.5-fixes' of git://neil.brown.name/md
Linus Torvalds [Tue, 3 Jul 2012 17:40:43 +0000 (10:40 -0700)]
Merge tag 'md-3.5-fixes' of git://neil.brown.name/md

Pull md fixes from NeilBrown:
 "md: collection of bug fixes for 3.5

  You go away for 2 weeks vacation and what do you get when you come
  back? Piles of bugs :-)

  Some found by inspection, some by testing, some during use in the
  field, and some while developing for the next window..."

* tag 'md-3.5-fixes' of git://neil.brown.name/md:
  md: fix up plugging (again).
  md: support re-add of recovering devices.
  md/raid1: fix bug in read_balance introduced by hot-replace
  raid5: delayed stripe fix
  md/raid456: When read error cannot be recovered, record bad block
  md: make 'name' arg to md_register_thread non-optional.
  md/raid10: fix failure when trying to repair a read error.
  md/raid5: fix refcount problem when blocked_rdev is set.
  md:Add blk_plug in sync_thread.
  md/raid5: In ops_run_io, inc nr_pending before calling md_wait_for_blocked_rdev
  md/raid5: Do not add data_offset before call to is_badblock
  md/raid5: prefer replacing failed devices over want-replacement devices.
  md/raid10: Don't try to recovery unmatched (and unused) chunks.

12 years agoMerge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris...
Linus Torvalds [Tue, 3 Jul 2012 17:39:40 +0000 (10:39 -0700)]
Merge branch 'for-linus2' of git://git./linux/kernel/git/jmorris/linux-security

Pull security layer fixes from James Morris.

A documentation update, and a nommu build fix.

* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  security: Fix nommu build.
  security: document no_new_privs

12 years agodm: verity fix documentation
Milan Broz [Tue, 3 Jul 2012 11:55:41 +0000 (12:55 +0100)]
dm: verity fix documentation

Veritysetup is now part of cryptsetup package.
Remove on-disk header description (which is not parsed in kernel)
and point users to cryptsetup where it the format is documented.
Mention units for block size paramaters.
Fix target line specification and dmsetup parameters.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
12 years agodm persistent data: fix allocation failure in space map checker init
Mike Snitzer [Tue, 3 Jul 2012 11:55:37 +0000 (12:55 +0100)]
dm persistent data: fix allocation failure in space map checker init

If CONFIG_DM_DEBUG_SPACE_MAPS is enabled and memory is fragmented and a
sufficiently-large metadata device is used in a thin pool then the space
map checker will fail to allocate the memory it requires.

Switch from kmalloc to vmalloc to allow larger virtually contiguous
allocations for the space map checker's internal count arrays.

Reported-by: Vivek Goyal <vgoyal@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
12 years agodm persistent data: handle space map checker creation failure
Mike Snitzer [Tue, 3 Jul 2012 11:55:35 +0000 (12:55 +0100)]
dm persistent data: handle space map checker creation failure

If CONFIG_DM_DEBUG_SPACE_MAPS is enabled and dm_sm_checker_create()
fails, dm_tm_create_internal() would still return success even though it
cleaned up all resources it was supposed to have created.  This will
lead to a kernel crash:

general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
...
RIP: 0010:[<ffffffff81593659>]  [<ffffffff81593659>] dm_bufio_get_block_size+0x9/0x20
Call Trace:
  [<ffffffff81599bae>] dm_bm_block_size+0xe/0x10
  [<ffffffff8159b8b8>] sm_ll_init+0x78/0xd0
  [<ffffffff8159c1a6>] sm_ll_new_disk+0x16/0xa0
  [<ffffffff8159c98e>] dm_sm_disk_create+0xfe/0x160
  [<ffffffff815abf6e>] dm_pool_metadata_open+0x16e/0x6a0
  [<ffffffff815aa010>] pool_ctr+0x3f0/0x900
  [<ffffffff8158d565>] dm_table_add_target+0x195/0x450
  [<ffffffff815904c4>] table_load+0xe4/0x330
  [<ffffffff815917ea>] ctl_ioctl+0x15a/0x2c0
  [<ffffffff81591963>] dm_ctl_ioctl+0x13/0x20
  [<ffffffff8116a4f8>] do_vfs_ioctl+0x98/0x560
  [<ffffffff8116aa51>] sys_ioctl+0x91/0xa0
  [<ffffffff81869f52>] system_call_fastpath+0x16/0x1b

Fix the space map checker code to return an appropriate ERR_PTR and have
dm_sm_disk_create() and dm_tm_create_internal() check for it with
IS_ERR.

Reported-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
12 years agodm persistent data: fix shadow_info_leak on dm_tm_destroy
Mike Snitzer [Tue, 3 Jul 2012 11:55:33 +0000 (12:55 +0100)]
dm persistent data: fix shadow_info_leak on dm_tm_destroy

Cleanup the shadow table before destroying the transaction manager.

Reference: leak was identified with kmemleak when running
test_discard_random_sectors in the thinp-test-suite.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
12 years agodm thin: commit metadata before creating metadata snapshot
Joe Thornber [Tue, 3 Jul 2012 11:55:31 +0000 (12:55 +0100)]
dm thin: commit metadata before creating metadata snapshot

Userland sometimes sees a corrupt metadata block if metadata is changing
rapidly when a metadata snapshot is reserved for userland,  To make the
problem go away, commit before we take the metadata snapshot (which is a
sensible thing to do anyway).

The checksums mean userland spots this corruption immediately so there's
no risk of acting on incorrect data.  No corruption exists from the
kernel's point of view, and thin_check passes after pool shutdown.

I believe this is to do with shared blocks at the first level of the
{device, mapping} btree.  Prior to the metadata-snap support no sharing
at this level was possible, so this patch is only required after commit
cc8394d86f045b86ff303d3c9e4ce47d97148951 ("dm thin: provide userspace
access to pool metadata").

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
12 years agosecurity: Fix nommu build.
Paul Mundt [Mon, 2 Jul 2012 05:34:11 +0000 (14:34 +0900)]
security: Fix nommu build.

The security + nommu configuration presently blows up with an undefined
reference to BDI_CAP_EXEC_MAP:

security/security.c: In function 'mmap_prot':
security/security.c:687:36: error: dereferencing pointer to incomplete type
security/security.c:688:16: error: 'BDI_CAP_EXEC_MAP' undeclared (first use in this function)
security/security.c:688:16: note: each undeclared identifier is reported only once for each function it appears in

include backing-dev.h directly to fix it up.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
12 years agodrm/i915: kick any firmware framebuffers before claiming the gtt
Daniel Vetter [Sun, 1 Jul 2012 15:09:42 +0000 (17:09 +0200)]
drm/i915: kick any firmware framebuffers before claiming the gtt

Especially vesafb likes to map everything as uc- (yikes), and if that
mapping hangs around still while we try to map the gtt as wc the
kernel will downgrade our request to uc-, resulting in abyssal
performance.

Unfortunately we can't do this as early as readon does (i.e. as the
first thing we do when initializing the hw) because our fb/mmio space
region moves around on a per-gen basis. So I've had to move it below
the gtt initialization, but that seems to work, too. The important
thing is that we do this before we set up the gtt wc mapping.

Now an altogether different question is why people compile their
kernels with vesafb enabled, but I guess making things just work isn't
bad per se ...

v2:
- s/radeondrmfb/inteldrmfb/
- fix up error handling

v3: Kill #ifdef X86, this is Intel after all. Noticed by Ben Widawsky.

v4: Jani Nikula complained about the pointless bool primary
initialization.

v5: Don't oops if we can't allocate, noticed by Chris Wilson.

v6: Resolve conflicts with agp rework and fixup whitespace.

This is commit e188719a2891f01b3100d in drm-next.

Backport to 3.5 -fixes queue requested by Dave Airlie - due to grub
using vesa on fedora their initrd seems to load vesafb before loading
the real kms driver. So tons more people actually experience a
dead-slow gpu. Hence also the Cc: stable.

Cc: stable@vger.kernel.org
Reported-and-tested-by: "Kilarski, Bernard R" <bernard.r.kilarski@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
12 years agodrm: edid: Don't add inferred modes with higher resolution
Takashi Iwai [Tue, 3 Jul 2012 09:22:11 +0000 (11:22 +0200)]
drm: edid: Don't add inferred modes with higher resolution

When a monitor EDID doesn't give the preferred bit, driver assumes
that the mode with the higest resolution and rate is the preferred
mode.  Meanwhile the recent changes for allowing more modes in the
GFT/CVT ranges give actually more modes, and some modes may be over
the native size.  Thus such a mode would be picked up as the preferred
mode although it's no native resolution.

For avoiding such a problem, this patch limits the addition of
inferred modes by checking not to be greater than other modes.
Also, it checks the duplicated mode entry at the same time.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
12 years agodrm/radeon: fix rare segfault
Jerome Glisse [Mon, 2 Jul 2012 16:40:54 +0000 (12:40 -0400)]
drm/radeon: fix rare segfault

In gem idle/busy ioctl the radeon object was derefenced after
drm_gem_object_unreference_unlocked which in case the object
have been destroyed lead to use of a possibly free pointer with
possibly wrong data.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
12 years agomd: fix up plugging (again).
NeilBrown [Tue, 3 Jul 2012 07:45:31 +0000 (17:45 +1000)]
md: fix up plugging (again).

The value returned by "mddev_check_plug" is only valid until the
next 'schedule' as that will unplug things.  This could happen at any
call to mempool_alloc.
So just calling mddev_check_plug at the start doesn't really make
sense.

So call it just before, or just after, queuing things for the thread.
As the action that happens at unplug is to wake the thread, this makes
lots of sense.
If we cannot add a plug (which requires a small GFP_ATOMIC alloc) we
wake thread immediately.

RAID5 is a bit different.  Requests are queued for the thread and the
thread is woken by release_stripe.  So we don't need to wake the
thread on failure.
However the thread doesn't perform certain actions when there is any
active plug, so it is important to install a plug before waking the
thread.  So for RAID5 we install the plug *before* queuing the request
and waking the thread.

Without this patch it is possible for raid1 or raid10 to queue a
request without then waking the thread, resulting in the array locking
up.

Also change raid10 to only flush_pending_write when there are not
active plugs, just like raid1.

This patch is suitable for 3.0 or later.  I plan to submit it to
-stable, but I'll like to let it spend a few weeks in mainline
first to be sure it is completely safe.

Signed-off-by: NeilBrown <neilb@suse.de>
12 years agomd: support re-add of recovering devices.
NeilBrown [Tue, 3 Jul 2012 05:59:06 +0000 (15:59 +1000)]
md: support re-add of recovering devices.

We currently only allow a device to be re-added if it appear to be
in-sync.  This is overly restrictive as it may be desirable to re-add
a device that is in the middle of recovery.

So remove the test for "InSync" - the test on rdev->raid_disk is
sufficient to ensure that the re-add will succeed.

Reported-by: Alexander Lyakas <alex.bolshoy@gmail.com>
Tested-by: Alexander Lyakas <alex.bolshoy@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
12 years agomd/raid1: fix bug in read_balance introduced by hot-replace
NeilBrown [Tue, 3 Jul 2012 05:58:42 +0000 (15:58 +1000)]
md/raid1: fix bug in read_balance introduced by hot-replace

When we added hot_replace we doubled the number of devices
that could be in a RAID1 array.  So we doubled how far read_balance
would search.  Unfortunately we didn't double the point at which
it looped back to the beginning - so it effectively loops over
all non-replacement disks twice.
This doesn't cause bad behaviour, but it pointless and means we
never read from replacement devices.

Signed-off-by: NeilBrown <neilb@suse.de>
12 years agoraid5: delayed stripe fix
Shaohua Li [Tue, 3 Jul 2012 05:57:19 +0000 (15:57 +1000)]
raid5: delayed stripe fix

There isn't locking setting STRIPE_DELAYED and STRIPE_PREREAD_ACTIVE bits, but
the two bits have relationship. A delayed stripe can be moved to hold list only
when preread active stripe count is below IO_THRESHOLD. If a stripe has both
the bits set, such stripe will be in delayed list and preread count not 0,
which will make such stripe never leave delayed list.

Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: NeilBrown <neilb@suse.de>
12 years agomd/raid456: When read error cannot be recovered, record bad block
majianpeng [Tue, 3 Jul 2012 05:57:02 +0000 (15:57 +1000)]
md/raid456: When read error cannot be recovered, record bad block

We may not be able to fix a bad block if:
 - the array is degraded
 - the over-write fails.

In these cases we currently eject the device, but we should
record a bad block if possible.

Signed-off-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
12 years agomd: make 'name' arg to md_register_thread non-optional.
NeilBrown [Tue, 3 Jul 2012 05:56:52 +0000 (15:56 +1000)]
md: make 'name' arg to md_register_thread non-optional.

Having the 'name' arg optional and defaulting to the current
personality name is no necessary and leads to errors, as when
changing the level of an array we can end up using the
name of the old level instead of the new one.

So make it non-optional and always explicitly pass the name
of the level that the array will be.

Reported-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
12 years agomd/raid10: fix failure when trying to repair a read error.
NeilBrown [Tue, 3 Jul 2012 05:55:33 +0000 (15:55 +1000)]
md/raid10: fix failure when trying to repair a read error.

commit 58c54fcca3bac5bf9290cfed31c76e4c4bfbabaf
     md/raid10: handle further errors during fix_read_error better.

in 3.1 added "r10_sync_page_io" which takes an IO size in sectors.
But we were passing the IO size in bytes!!!
This resulting in bio_add_page failing, and empty request being sent
down, and a consequent BUG_ON in scsi_lib.

[fix missing space in error message at same time]

This fix is suitable for 3.1.y and later.

Cc: stable@vger.kernel.org
Reported-by: Christian Balzer <chibi@gol.com>
Signed-off-by: NeilBrown <neilb@suse.de>
12 years agoMerge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Linus Torvalds [Tue, 3 Jul 2012 02:52:25 +0000 (19:52 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc

Pull a couple more powerpc fixes from Benjamin Herrenschmidt:
 "Here are two more fixes that I "missed" when scrubbing patchwork last
  week which are worth still having in 3.5."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/kvm: sldi should be sld
  powerpc/xmon: Use cpumask iterator to avoid warning

12 years agosecurity: document no_new_privs
Andy Lutomirski [Mon, 2 Jul 2012 21:03:58 +0000 (14:03 -0700)]
security: document no_new_privs

Document no_new_privs.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
12 years agomd/raid5: fix refcount problem when blocked_rdev is set.
NeilBrown [Tue, 3 Jul 2012 02:13:29 +0000 (12:13 +1000)]
md/raid5: fix refcount problem when blocked_rdev is set.

commit 43220aa0f22cd3ce5b30246d50ccd696d119edea
    md/raid5: fix a hang on device failure.

fixed a hang, but introduced a refcounting in-balance so
that if the presence of bad-blocks ever caused an rdev to
be 'blocked' we would increment the refcount on the rdev and
never decrement it.

So added the needed rdev_dec_pending when md_wait_for_blocked_rdev
is not called.

Reported-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
12 years agomd:Add blk_plug in sync_thread.
majianpeng [Tue, 3 Jul 2012 02:12:26 +0000 (12:12 +1000)]
md:Add blk_plug in sync_thread.

Add blk_plug in sync_thread will increase the performance of sync.
Because sync_thread did not blk_plug,so when raid sync, the bio merge
not well.

Testing environment:
SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI
Controller.
OS:Linux xxx 3.5.0-rc2+ #340 SMP Tue Jun 12 09:00:25 CST 2012
x86_64 x86_64 x86_64 GNU/Linux.
RAID5: four ST31000524NS disk.

Without blk_plug:recovery speed about 63M/Sec;
Add blk_plug:recovery speed about 120M/Sec.

Using blktrace:
blktrace -d /dev/sdb -w 60  -o -|blkparse -i -

without blk_plug:
Total (8,16):
 Reads Queued:      309811,     1239MiB  Writes Queued:           0,        0KiB
 Read Dispatches:   283583,     1189MiB  Write Dispatches:        0,        0KiB
 Reads Requeued:         0  Writes Requeued:         0
 Reads Completed:   273351,     1149MiB  Writes Completed:        0,        0KiB
 Read Merges:        23533,    94132KiB  Write Merges:            0,        0KiB
 IO unplugs:             0          Timer unplugs:           0

add blk_plug:
Total (8,16):
 Reads Queued:      428697,     1714MiB  Writes Queued:           0,        0KiB
 Read Dispatches:     3954,     1714MiB  Write Dispatches:        0,        0KiB
 Reads Requeued:         0  Writes Requeued:         0
 Reads Completed:     3956,     1715MiB  Writes Completed:        0,        0KiB
 Read Merges:       424743,     1698MiB  Write Merges:            0,        0KiB
 IO unplugs:             0          Timer unplugs:        3384

The ratio of merge will be markedly increased.

Signed-off-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
12 years agomd/raid5: In ops_run_io, inc nr_pending before calling md_wait_for_blocked_rdev
majianpeng [Tue, 3 Jul 2012 02:11:54 +0000 (12:11 +1000)]
md/raid5: In ops_run_io, inc nr_pending before calling md_wait_for_blocked_rdev

In ops_run_io(), the call to md_wait_for_blocked_rdev will decrement
nr_pending so we lose the reference we hold on the rdev.
So atomic_inc it first to maintain the reference.

This bug was introduced by commit  73e92e51b7969ef5477d
    md/raid5.  Don't write to known bad block on doubtful devices.

which appeared in 3.0, so patch is suitable for stable kernels since
then.

Cc: stable@vger.kernel.org
Signed-off-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
12 years agomd/raid5: Do not add data_offset before call to is_badblock
majianpeng [Tue, 12 Jun 2012 00:31:10 +0000 (08:31 +0800)]
md/raid5: Do not add data_offset before call to is_badblock

In chunk_aligned_read() we are adding data_offset before calling
is_badblock.  But is_badblock also adds data_offset, so that is bad.

So move the addition of data_offset to after the call to
is_badblock.

This bug was introduced by commit 31c176ecdf3563140e639
     md/raid5: avoid reading from known bad blocks.
which first appeared in 3.0.  So that patch is suitable for any
-stable kernel from 3.0.y onwards.  However it will need minor
revision for most of those (as the comment didn't appear until
recently).

Cc: stable@vger.kernel.org
Signed-off-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
12 years agomd/raid5: prefer replacing failed devices over want-replacement devices.
NeilBrown [Tue, 3 Jul 2012 01:46:53 +0000 (11:46 +1000)]
md/raid5: prefer replacing failed devices over want-replacement devices.

If a RAID5 has both a failed device and a device marked as
'WantReplacement', then we should preferentially replace the failed
device.
However the current code replaces whichever is found first.
So split into 2 loops, check fail failed/missing first, and only check
for WantReplacement if nothing is failed or missing.

Reported-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
12 years agomd/raid10: Don't try to recovery unmatched (and unused) chunks.
NeilBrown [Tue, 3 Jul 2012 00:37:30 +0000 (10:37 +1000)]
md/raid10: Don't try to recovery unmatched (and unused) chunks.

If a RAID10 has an odd number of chunks - as might happen when there
are an odd number of devices - the last chunk has no pair and so is
not mirrored.  We don't store data there, but when recovering the last
device in an array we retry to recover that last chunk from a
non-existent location.  This results in an error, and the recovery
aborts.

When we get to that last chunk we should just stop - there is nothing
more to do anyway.

This bug has been present since the introduction of RAID10, so the
patch is appropriate for any -stable kernel.

Cc: stable@vger.kernel.org
Reported-by: Christian Balzer <chibi@gol.com>
Tested-by: Christian Balzer <chibi@gol.com>
Signed-off-by: NeilBrown <neilb@suse.de>
12 years agoKVM: Sanitize KVM_IRQFD flags
Alex Williamson [Fri, 29 Jun 2012 15:56:24 +0000 (09:56 -0600)]
KVM: Sanitize KVM_IRQFD flags

We only know of one so far.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
12 years agoKVM: Add missing KVM_IRQFD API documentation
Alex Williamson [Fri, 29 Jun 2012 15:56:16 +0000 (09:56 -0600)]
KVM: Add missing KVM_IRQFD API documentation

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
12 years agoKVM: Pass kvm_irqfd to functions
Alex Williamson [Fri, 29 Jun 2012 15:56:08 +0000 (09:56 -0600)]
KVM: Pass kvm_irqfd to functions

Prune this down to just the struct kvm_irqfd so we can avoid
changing function definition for every flag or field we use.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
12 years agoMerge branch 'fixes' of git://github.com/hzhuang1/linux into fixes
Arnd Bergmann [Mon, 2 Jul 2012 21:14:29 +0000 (23:14 +0200)]
Merge branch 'fixes' of git://github.com/hzhuang1/linux into fixes

* 'fixes' of git://github.com/hzhuang1/linux:
  ARM: pxa: hx4700: Fix basic suspend/resume

12 years agoBtrfs: run delayed directory updates during log replay
Chris Mason [Mon, 2 Jul 2012 19:29:53 +0000 (15:29 -0400)]
Btrfs: run delayed directory updates during log replay

While we are resolving directory modifications in the
tree log, we are triggering delayed metadata updates to
the filesystem btrees.

This commit forces the delayed updates to run so the
replay code can find any modifications done.  It stops
us from crashing because the directory deleltion replay
expects items to be removed immediately from the tree.

Signed-off-by: Chris Mason <chris.mason@fusionio.com>
cc: stable@kernel.org

12 years agoBtrfs: hold a ref on the inode during writepages
Josef Bacik [Wed, 27 Jun 2012 21:18:41 +0000 (17:18 -0400)]
Btrfs: hold a ref on the inode during writepages

We can race with unlink and not actually be able to do our igrab in
btrfs_add_ordered_extent.  This will result in all sorts of problems.
Instead of doing the complicated work to try and handle returning an error
properly from btrfs_add_ordered_extent, just hold a ref to the inode during
writepages.  If we cannot grab a ref we know we're freeing this inode anyway
and can just drop the dirty pages on the floor, because screw them we're
going to invalidate them anyway.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
12 years agoBtrfs: fix tree log remove space corner case
Josef Bacik [Wed, 27 Jun 2012 19:10:56 +0000 (15:10 -0400)]
Btrfs: fix tree log remove space corner case

The tree log stuff can have allocated space that we end up having split
across a bitmap and a real extent.  The free space code does not deal with
this, it assumes that if it finds an extent or bitmap entry that the entire
range must fall within the entry it finds.  This isn't necessarily the case,
so rework the remove function so it can handle this case properly.  This
fixed two panics the user hit, first in the case where the space was
initially in a bitmap and then in an extent entry, and then the reverse
case.  Thanks,

Reported-and-tested-by: Shaun Reich <sreich@kde.org>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
12 years agoBtrfs: fix wrong check during log recovery
Liu Bo [Tue, 26 Jun 2012 03:59:09 +0000 (21:59 -0600)]
Btrfs: fix wrong check during log recovery

When we're evicting an inode during log recovery, we need to ensure that the inode
is not in orphan state any more, which means inode's run_time flags has _no_
BTRFS_INODE_HAS_ORPHAN_ITEM.  Thus, the BUG_ON was triggered because of a wrong
check for the flags.

Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
12 years agoBtrfs: use _IOR for BTRFS_IOC_SUBVOL_GETFLAGS
Alexander Block [Mon, 25 Jun 2012 21:36:12 +0000 (15:36 -0600)]
Btrfs: use _IOR for BTRFS_IOC_SUBVOL_GETFLAGS

We used the wrong ioctl macro for the getflags ioctl before.
As we don't have the set/getflags ioctls in the user space ioctl.h
at the moment, it's safe to fix it now.

Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Alexander Block <ablock84@googlemail.com>
12 years agoBtrfs: resume balance on rw (re)mounts properly
Ilya Dryomov [Fri, 22 Jun 2012 18:24:13 +0000 (12:24 -0600)]
Btrfs: resume balance on rw (re)mounts properly

This introduces btrfs_resume_balance_async(), which, given that
restriper state was recovered earlier by btrfs_recover_balance(),
resumes balance in btrfs-balance kthread.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
12 years agoBtrfs: restore restriper state on all mounts
Ilya Dryomov [Fri, 22 Jun 2012 18:24:12 +0000 (12:24 -0600)]
Btrfs: restore restriper state on all mounts

Fix a bug that triggered asserts in btrfs_balance() in both normal and
resume modes -- restriper state was not properly restored on read-only
mounts.  This factors out resuming code from btrfs_restore_balance(),
which is now also called earlier in the mount sequence to avoid the
problem of some early writes getting the old profile.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
12 years agoBtrfs: fix dio write vs buffered read race
Josef Bacik [Tue, 19 Jun 2012 14:59:00 +0000 (10:59 -0400)]
Btrfs: fix dio write vs buffered read race

Miao pointed out there's a problem with mixing dio writes and buffered
reads.  If the read happens between us invalidating the page range and
actually locking the extent we can bring in pages into page cache.  Then
once the write finishes if somebody tries to read again it will just find
uptodate pages and we'll read stale data.  So we need to lock the extent and
check for uptodate bits in the range.  If there are uptodate bits we need to
unlock and invalidate again.  This will keep this race from happening since
we will hold the extent locked until we create the ordered extent, and then
teh read side always waits for ordered extents.  There was also a race in
how we updated i_size, previously we were relying on the generic DIO stuff
to adjust the i_size after the DIO had completed, but this happens outside
of the extent lock which means reads could come in and not see the updated
i_size.  So instead move this work into where we create the extents, and
then this way the update ordered i_size stuff works properly in the endio
handlers.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
12 years agoBtrfs: don't count I/O statistic read errors for missing devices
Stefan Behrens [Thu, 14 Jun 2012 14:42:31 +0000 (16:42 +0200)]
Btrfs: don't count I/O statistic read errors for missing devices

It is normal behaviour of the low level btrfs function btrfs_map_bio()
to complete a bio with -EIO if the device is missing, instead of just
preventing the bio creation in an earlier step.
This used to cause I/O statistic read error increments and annoying
printk_ratelimited messages. This commit fixes the issue.

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Reported-by: Carey Underwood <cwillu@cwillu.com>
12 years agorcu: Make RCU_FAST_NO_HZ respect nohz= boot parameter
Paul E. McKenney [Sun, 24 Jun 2012 17:15:02 +0000 (10:15 -0700)]
rcu: Make RCU_FAST_NO_HZ respect nohz= boot parameter

If the nohz= boot parameter disables nohz, then RCU_FAST_NO_HZ needs to
also disable itself.  This commit therefore checks for tick_nohz_enabled
being zero, disabling rcu_prepare_for_idle() if so.  This commit assumes
that tick_nohz_enabled can change at runtime: If this is not the case,
then a simpler approach suffices.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
12 years agorcu: Fix qlen_lazy breakage
Paul E. McKenney [Tue, 5 Jun 2012 22:53:53 +0000 (15:53 -0700)]
rcu: Fix qlen_lazy breakage

Commit d8169d4c (Make __kfree_rcu() less dependent on compiler choices)
created a macro out of an inline function in order to avoid build
breakage for certain combinations of gcc flags.  Unfortunately, it also
converted a kfree_call_rcu() to a call_rcu(), which made the rcu_data
structure's ->qlen_lazy field lose counts.  This commit therefore changes
the call_rcu() back to kfree_call_rcu().

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
12 years agorcu: Round FAST_NO_HZ lazy timeout to nearest second
Paul E. McKenney [Tue, 5 Jun 2012 03:45:10 +0000 (20:45 -0700)]
rcu: Round FAST_NO_HZ lazy timeout to nearest second

Currently, if several CPUs in the same package have all lazy RCU
callbacks, their wakeups will be uncorrelated.  If all the CPUs are in the
same power domain (as is often the case), this will result in unnecessary
power-ups of the package.  This commit therefore uses round_jiffies()
to round the timeouts to a second boundary, increasing the odds that
they can be coalesced with each other or with other timeouts.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
12 years agorcu: The rcu_needs_cpu() function is not a quiescent state
Paul E. McKenney [Thu, 10 May 2012 22:37:48 +0000 (15:37 -0700)]
rcu: The rcu_needs_cpu() function is not a quiescent state

The TINY_PREEMPT_RCU() function rcu_preempt_needs_cpu(), which is called
from rcu_needs_cpu(), assumes that it is in a quiescent state with respect
to the CPU.  This is no longer the case.  This commit therefore updates
rcu_preempt_needs_cpu() to make it aware that it is not running in a
quiescent state.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr>
12 years agorcu: Dump only the current CPU's buffers for idle-entry/exit warnings
Paul E. McKenney [Wed, 9 May 2012 22:55:39 +0000 (15:55 -0700)]
rcu: Dump only the current CPU's buffers for idle-entry/exit warnings

Problems in RCU idle entry and exit are almost always confined to the
offending CPU.  This commit therefore switches ftrace_dump() from
DUMP_ALL to DUMP_ORIG.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr>
12 years agorcu: Add check for CPUs going offline with callbacks queued
Paul E. McKenney [Thu, 21 Jun 2012 18:26:42 +0000 (11:26 -0700)]
rcu: Add check for CPUs going offline with callbacks queued

If a CPU goes offline with callbacks queued, those callbacks might be
indefinitely postponed, which can result in a system hang.  This commit
therefore inserts warnings for this condition.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
12 years agorcu: Disable preemption in rcu_blocking_is_gp()
Paul E. McKenney [Tue, 19 Jun 2012 18:58:27 +0000 (11:58 -0700)]
rcu: Disable preemption in rcu_blocking_is_gp()

It is time to optimize CONFIG_TREE_PREEMPT_RCU's synchronize_rcu()
for uniprocessor optimization, which means that rcu_blocking_is_gp()
can no longer rely on RCU read-side critical sections having disabled
preemption.  This commit therefore disables preemption across
rcu_blocking_is_gp()'s scan of the cpu_online_mask.

(Updated from previous version to fix embarrassing bug spotted by
Wu Fengguang.)

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
12 years agorcu: Prevent uninitialized string in RCU CPU stall info
Carsten Emde [Tue, 19 Jun 2012 08:43:16 +0000 (10:43 +0200)]
rcu: Prevent uninitialized string in RCU CPU stall info

An uninitialized string may be displayed at the end of the rcu_preempt
detected stall info such as

0: (1 GPs behind) idle=075/140000000000000/0 =8?^D=8?^D
                                             ^^^^^^^^^^
if CONFIG_RCU_FAST_NO_HZ is not defined.

This trivial patch clears the string in this case.

Signed-off-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
12 years agorcu: Fix rcu_is_cpu_idle() #ifdef in TINY_RCU
Paul E. McKenney [Wed, 6 Jun 2012 14:12:43 +0000 (07:12 -0700)]
rcu: Fix rcu_is_cpu_idle() #ifdef in TINY_RCU

The rcu_is_cpu_idle() function is used if CONFIG_DEBUG_LOCK_ALLOC,
but TINY_RCU defines it only when CONFIG_PROVE_RCU.  This causes
build failures when CONFIG_DEBUG_LOCK_ALLOC=y but CONFIG_PROVE_RCU=n.
This commit therefore adjusts the #ifdefs for rcu_is_cpu_idle() so
that it is defined when CONFIG_DEBUG_LOCK_ALLOC=y.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
12 years agorcu: Split RCU core processing out of __call_rcu()
Paul E. McKenney [Wed, 30 May 2012 10:21:48 +0000 (03:21 -0700)]
rcu: Split RCU core processing out of __call_rcu()

The __call_rcu() function is a bit overweight, so this commit splits
it into actual enqueuing of and accounting for the callback (__call_rcu())
and associated RCU-core processing (__call_rcu_core()).

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
12 years agorcu: Prevent __call_rcu() from invoking RCU core on offline CPUs
Paul E. McKenney [Sat, 26 May 2012 15:56:01 +0000 (08:56 -0700)]
rcu: Prevent __call_rcu() from invoking RCU core on offline CPUs

The __call_rcu() function will invoke the RCU core, for example, if
it detects that the current CPU has too many callbacks.  However, this
can happen on an offline CPU that is on its way to the idle loop, in
which case it is an error to invoke the RCU core, and the excess callbacks
will be adopted in any case.  This commit therefore adds checks to
__call_rcu() for running on an offline CPU, refraining from invoking
the RCU core in this case.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
12 years agorcu: Make __call_rcu() handle invocation from idle
Paul E. McKenney [Wed, 23 May 2012 05:10:24 +0000 (22:10 -0700)]
rcu: Make __call_rcu() handle invocation from idle

Although __call_rcu() is handled correctly when called from a momentary
non-idle period, if it is called on a CPU that RCU believes to be idle
on RCU_FAST_NO_HZ kernels, the callback might be indefinitely postponed.
This commit therefore ensures that RCU is aware of the new callback and
has a chance to force the CPU out of dyntick-idle mode when a new callback
is posted.

Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
12 years agorcu: Remove function versions of __kfree_rcu and __is_kfree_rcu_offset
Paul E. McKenney [Fri, 25 May 2012 21:25:58 +0000 (14:25 -0700)]
rcu: Remove function versions of __kfree_rcu and __is_kfree_rcu_offset

Commit d8169d4c (Make __kfree_rcu() less dependent on compiler choices)
added cpp macro versions of __kfree_rcu() and __is_kfree_rcu_offset(),
but failed to remove the old inline-function versions.  This commit does
this cleanup.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
12 years agorcu: Consolidate tree/tiny __rcu_read_{,un}lock() implementations
Paul E. McKenney [Mon, 21 May 2012 18:58:36 +0000 (11:58 -0700)]
rcu: Consolidate tree/tiny __rcu_read_{,un}lock() implementations

The CONFIG_TREE_PREEMPT_RCU and CONFIG_TINY_PREEMPT_RCU versions of
__rcu_read_lock() and __rcu_read_unlock() are identical, so this commit
consolidates them into kernel/rcupdate.h.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
12 years agorcu: Remove return value from rcu_assign_pointer()
Paul E. McKenney [Wed, 16 May 2012 22:51:08 +0000 (15:51 -0700)]
rcu: Remove return value from rcu_assign_pointer()

The return value from rcu_assign_pointer() is not used, and using it
would be quite ugly, for example:

q = rcu_assign_pointer(global_p, p);

To prevent this sort of ugliness from spreading, this commit wraps
rcu_assign_pointer() in a do-while loop.

Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reported-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
12 years agokey: Remove extraneous parentheses from rcu_assign_keypointer()
Paul E. McKenney [Wed, 16 May 2012 23:31:38 +0000 (16:31 -0700)]
key: Remove extraneous parentheses from rcu_assign_keypointer()

This commit removes the extraneous parentheses from rcu_assign_keypointer()
so that rcu_assign_pointer() can be wrapped in do-while.  It also wraps
rcu_assign_keypointer() in a do-while and parenthesizes its final argument,
as suggested by David Howells.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
12 years agorcu: Remove return value from RCU_INIT_POINTER()
Paul E. McKenney [Wed, 16 May 2012 22:42:30 +0000 (15:42 -0700)]
rcu: Remove return value from RCU_INIT_POINTER()

The return value from RCU_INIT_POINTER() is not used, and using it
would be quite ugly, for example:

q = RCU_INIT_POINTER(global_p, p);

To prevent this sort of ugliness from appearing, this commit wraps
RCU_INIT_POINTER() in a do-while loop.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: David Howells <dhowells@redhat.com>
12 years agorcu: Use new RCU_POINTER_INITIALIZER for gcc-style initializations
Paul E. McKenney [Wed, 16 May 2012 22:33:15 +0000 (15:33 -0700)]
rcu: Use new RCU_POINTER_INITIALIZER for gcc-style initializations

This commit applies the INIT_RCU_POINTER() macro to all uses of
RCU_POINTER_INITIALIZER() that were all too cleverly creating gcc-style
initializations.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
12 years agorcu: Add a gcc-style structure initializer for RCU pointers
Paul E. McKenney [Wed, 16 May 2012 22:23:45 +0000 (15:23 -0700)]
rcu: Add a gcc-style structure initializer for RCU pointers

RCU_INIT_POINTER() returns a value that is never used, and which should
be abolished due to terminal ugliness:

q = RCU_INIT_POINTER(global_p, p);

However, there are two uses that cannot be handled by a do-while
formulation because they do gcc-style initialization:

RCU_INIT_POINTER(.real_cred, &init_cred),
RCU_INIT_POINTER(.cred, &init_cred),

This usage is clever, but not necessarily the nicest approach.
This commit therefore creates an RCU_POINTER_INITIALIZER() macro that
is specifically designed for gcc-style initialization.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
12 years agorcu: Add ACCESS_ONCE() to ->qlen accesses
Paul E. McKenney [Wed, 9 May 2012 22:44:42 +0000 (15:44 -0700)]
rcu: Add ACCESS_ONCE() to ->qlen accesses

The _rcu_barrier() function accesses other CPUs' rcu_data structure's
->qlen field without benefit of locking.  This commit therefore adds
the required ACCESS_ONCE() wrappers around accesses and updates that
need it.

ACCESS_ONCE() is not needed when a CPU accesses its own ->qlen, or
in code that cannot run while _rcu_barrier() is sampling ->qlen fields.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
12 years agorcu: Consolidate duplicate callback-list initialization
Paul E. McKenney [Wed, 9 May 2012 22:39:56 +0000 (15:39 -0700)]
rcu: Consolidate duplicate callback-list initialization

There are a couple of open-coded initializations of the rcu_data
structure's RCU callback list.  This commit therefore consolidates
them into a new init_callback_list() function.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
12 years agorcu: Fix detection of abruptly-ending stall
Paul E. McKenney [Wed, 9 May 2012 15:45:12 +0000 (08:45 -0700)]
rcu: Fix detection of abruptly-ending stall

The code that attempts to identify stalls that end just as we detect
them is broken by both flavors of initialization failure.  This commit
therefore properly initializes and computes the count of the number
of reasons why the RCU grace period is stalled.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
12 years agorcu: Make rcutorture fakewriters invoke rcu_barrier()
Paul E. McKenney [Wed, 30 May 2012 00:50:51 +0000 (17:50 -0700)]
rcu: Make rcutorture fakewriters invoke rcu_barrier()

The current rcutorture rcu_barrier() testing never intentionally runs
more than one instance of rcu_barrier() at a given time.  This fails
to test the the shiny new concurrency features of rcu_barrier().  This
commit therefore modifies the rcutorture fakewriter kthread to randomly
invoke rcu_barrier() rather than the usual synchronize_rcu().

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
12 years agorcu: Fix diagnostic-printk typo in rcutorture
Paul E. McKenney [Fri, 25 May 2012 01:17:48 +0000 (18:17 -0700)]
rcu: Fix diagnostic-printk typo in rcutorture

The rcu_torture_barrier() function has a copy-and-paste typo in the
string passed to rcutorture_shutdown_absorb(), which this commit fixes.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
12 years agorcu: Fix bug in rcu_barrier() torture test
Paul E. McKenney [Tue, 29 May 2012 02:21:41 +0000 (19:21 -0700)]
rcu: Fix bug in rcu_barrier() torture test

The child threads in the rcu_torture_barrier_cbs() are improperly
synchronized, which can cause the rcu_barrier() tests to hang.  The
failure mode is as follows:

1. CPU 0 running in rcu_torture_barrier() sets barrier_cbs_count
     to n_barrier_cbs.

2. CPU 1 running in rcu_torture_barrier_cbs() wakes up, posts
     its RCU callback, and atomically decrements barrier_cbs_count.
     Because barrier_cbs_count is not zero, it does not do the wake_up().

3. CPU 2 running in rcu_torture_barrier_cbs() wakes up, but
     finds that barrier_cbs_count is not equal to n_barrier_cbs,
     and so returns to sleep.

4. The value of barrier_cbs_count therefore never reaches zero,
     which causes the test to hang.

This commit therefore uses a phase variable to coordinate the test,
preventing this scenario from occurring.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
12 years agorcu: Test srcu_barrier() from rcutorture test suite
Paul E. McKenney [Tue, 8 May 2012 17:21:50 +0000 (10:21 -0700)]
rcu: Test srcu_barrier() from rcutorture test suite

SRCU now has a call_srcu() and an srcu_barrier(), but rcutorture does not
test them.  This commit adds the machinery to allow rcutorture's existing
tests for call_rcu() and rcu_barrier() to apply to the SRCU equivalents.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
12 years agorcu: Rationalize ordering of torture_ops list
Paul E. McKenney [Mon, 7 May 2012 20:44:44 +0000 (13:44 -0700)]
rcu: Rationalize ordering of torture_ops list

Move the raw SRCU interfaces out of the middle of the normal SRCU
interfaces.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>