firefly-linux-kernel-4.4.55.git
13 years ago rcu: Fix unpaired rcu_irq_enter() from locking selftests
Frederic Weisbecker [Fri, 20 May 2011 00:09:54 +0000 (02:09 +0200)]
rcu: Fix unpaired rcu_irq_enter() from locking selftests

HARDIRQ_ENTER() maps to irq_enter() which calls rcu_irq_enter().
But HARDIRQ_EXIT() maps to __irq_exit() which doesn't call
rcu_irq_exit().

So for every locking selftest that simulates hardirq disabled,
we create an imbalance in the rcu extended quiescent state
internal state.

As a result, after the first missing rcu_irq_exit(), subsequent
irqs won't re-enter dyntick-idle mode after leaving the interrupt
handler.  This means that RCU won't see the affected CPU as being
in an extended quiescent state, resulting in long grace-period
delays (as in grace periods extending for hours).

To fix this, just use __irq_enter() to simulate the hardirq
context. This is sufficient for the locking selftests as we
don't need to exit any extended quiescent state or perform
any check that irqs normally do when they wake up from idle.

As a side effect, this patch makes it possible to restore
"rcu: Decrease memory-barrier usage based on semi-formal proof",
which eventually helped finding this bug.
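
The imbalance described above can be illustrated with a small userspace
model.  This is only a sketch under stated assumptions: model_nesting and
every model_*() function below are hypothetical stand-ins, not the kernel's
code, and the real dyntick accounting involves more than a single counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU nesting count: 0 means RCU sees the CPU as idle. */
int model_nesting;

/* Stand-ins for rcu_irq_enter() and rcu_irq_exit(). */
void model_rcu_irq_enter(void) { model_nesting++; }
void model_rcu_irq_exit(void)
{
	assert(model_nesting > 0);
	model_nesting--;
}

/* Full entry/exit paths: these do the RCU accounting. */
void model_irq_enter(void) { model_rcu_irq_enter(); }
void model_irq_exit(void)  { model_rcu_irq_exit(); }

/* Light variants modelling __irq_enter()/__irq_exit(): no RCU accounting. */
void model_light_irq_enter(void) { }
void model_light_irq_exit(void)  { }

bool model_cpu_looks_idle(void) { return model_nesting == 0; }

/* A balanced, real interrupt for comparison. */
void model_real_irq(void)
{
	model_irq_enter();
	model_irq_exit();
}

/* Before the fix: full enter paired with light exit leaks one level. */
int selftest_buggy(void)
{
	model_nesting = 0;
	model_irq_enter();
	model_light_irq_exit();
	return model_nesting;
}

/* After the fix: light enter paired with light exit stays balanced. */
int selftest_fixed(void)
{
	model_nesting = 0;
	model_light_irq_enter();
	model_light_irq_exit();
	return model_nesting;
}
```

In this model the buggy pairing leaves the count at one, so the CPU never
looks idle again no matter how many balanced interrupts follow, while the
fixed pairing leaves the count at zero.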

Reported-and-tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stable <stable@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
13 years ago Revert "rcu: Decrease memory-barrier usage based on semi-formal proof"
Paul E. McKenney [Thu, 12 May 2011 08:08:07 +0000 (01:08 -0700)]
Revert "rcu: Decrease memory-barrier usage based on semi-formal proof"

This reverts commit e59fb3120becfb36b22ddb8bd27d065d3cdca499.

This reversion was due to (extreme) boot-time slowdowns on SPARC seen by
Yinghai Lu and on x86 by Ingo.
This is a non-trivial reversion due to intervening commits.

Conflicts:

Documentation/RCU/trace.txt
kernel/rcutree.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years ago net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree
Paul E. McKenney [Mon, 2 May 2011 07:56:57 +0000 (00:56 -0700)]
net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree

The RCU callback prl_entry_destroy_rcu() just calls kfree(), so we can
use kfree_rcu() instead of call_rcu().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu
Paul E. McKenney [Mon, 2 May 2011 07:52:23 +0000 (00:52 -0700)]
batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu

The RCU callback softif_neigh_free_rcu() just calls kfree(), so we can
use kfree_rcu() instead of call_rcu().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Marek Lindner <lindner_marek@yahoo.de>
Cc: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Sven Eckelmann <sven@narfation.org>
13 years ago batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree()
Paul E. McKenney [Mon, 2 May 2011 06:27:50 +0000 (23:27 -0700)]
batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree()

The RCU callback neigh_node_free_rcu() just calls kfree(), so we can use
kfree_rcu() instead of call_rcu().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Marek Lindner <lindner_marek@yahoo.de>
Cc: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Sven Eckelmann <sven@narfation.org>
13 years ago batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu
Paul E. McKenney [Mon, 2 May 2011 06:25:02 +0000 (23:25 -0700)]
batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu

The RCU callback gw_node_free_rcu() just calls kfree(), so we can use
kfree_rcu() instead of call_rcu().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Marek Lindner <lindner_marek@yahoo.de>
Cc: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Sven Eckelmann <sven@narfation.org>
13 years ago net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu()
Lai Jiangshan [Tue, 15 Mar 2011 10:02:42 +0000 (18:02 +0800)]
net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu()

The rcu callback kfree_tid_tx() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(kfree_tid_tx).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: "David S. Miller" <davem@davemloft.net>
13 years ago net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 04:15:02 +0000 (12:15 +0800)]
net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu()

The rcu callback xt_osf_finger_free_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(xt_osf_finger_free_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 04:14:15 +0000 (12:14 +0800)]
net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu()

The rcu callback work_free_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(work_free_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: "John W. Linville" <linville@tuxdriver.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 04:10:25 +0000 (12:10 +0800)]
net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu()

The rcu callback wq_free_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(wq_free_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 04:09:03 +0000 (12:09 +0800)]
net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu()

The rcu callback phonet_device_rcu_free() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(phonet_device_rcu_free).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 04:08:29 +0000 (12:08 +0800)]
perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu()

The rcu callback swevent_hlist_release_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(swevent_hlist_release_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago perf,rcu: convert call_rcu(free_ctx) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 04:07:41 +0000 (12:07 +0800)]
perf,rcu: convert call_rcu(free_ctx) to kfree_rcu()

The rcu callback free_ctx() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(free_ctx).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 04:07:09 +0000 (12:07 +0800)]
net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu()

The rcu callback __nf_ct_ext_free_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(__nf_ct_ext_free_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(net_generic_release) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 04:06:32 +0000 (12:06 +0800)]
net,rcu: convert call_rcu(net_generic_release) to kfree_rcu()

The rcu callback net_generic_release() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(net_generic_release).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 04:04:50 +0000 (12:04 +0800)]
net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu()

The rcu callback netlbl_unlhsh_free_addr6() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(netlbl_unlhsh_free_addr6).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 04:03:56 +0000 (12:03 +0800)]
net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu()

The rcu callback netlbl_unlhsh_free_addr4() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(netlbl_unlhsh_free_addr4).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 04:03:19 +0000 (12:03 +0800)]
security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu()

The rcu callback sel_netif_free() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(sel_netif_free).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 04:02:47 +0000 (12:02 +0800)]
net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu()

The rcu callback xps_dev_maps_release() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(xps_dev_maps_release).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(xps_map_release) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 04:02:20 +0000 (12:02 +0800)]
net,rcu: convert call_rcu(xps_map_release) to kfree_rcu()

The rcu callback xps_map_release() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(xps_map_release).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(rps_map_release) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 04:01:31 +0000 (12:01 +0800)]
net,rcu: convert call_rcu(rps_map_release) to kfree_rcu()

The rcu callback rps_map_release() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(rps_map_release).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(ipv6_mc_socklist_reclaim) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 04:00:50 +0000 (12:00 +0800)]
net,rcu: convert call_rcu(ipv6_mc_socklist_reclaim) to kfree_rcu()

The rcu callback ipv6_mc_socklist_reclaim() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(ipv6_mc_socklist_reclaim).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago macvlan,rcu: convert call_rcu(macvlan_port_rcu_free) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 04:00:07 +0000 (12:00 +0800)]
macvlan,rcu: convert call_rcu(macvlan_port_rcu_free) to kfree_rcu()

The rcu callback macvlan_port_rcu_free() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(macvlan_port_rcu_free).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago ixgbe,rcu: convert call_rcu(ring_free_rcu) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 03:57:21 +0000 (11:57 +0800)]
ixgbe,rcu: convert call_rcu(ring_free_rcu) to kfree_rcu()

The rcu callback ring_free_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(ring_free_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(free_dm_hw_stat) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 03:39:43 +0000 (11:39 +0800)]
net,rcu: convert call_rcu(free_dm_hw_stat) to kfree_rcu()

The rcu callback free_dm_hw_stat() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(free_dm_hw_stat).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(ip_mc_socklist_reclaim) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 03:45:08 +0000 (11:45 +0800)]
net,rcu: convert call_rcu(ip_mc_socklist_reclaim) to kfree_rcu()

The rcu callback ip_mc_socklist_reclaim() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(ip_mc_socklist_reclaim).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(ip_sf_socklist_reclaim) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 03:44:46 +0000 (11:44 +0800)]
net,rcu: convert call_rcu(ip_sf_socklist_reclaim) to kfree_rcu()

The rcu callback ip_sf_socklist_reclaim() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(ip_sf_socklist_reclaim).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(ip_mc_list_reclaim) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 03:44:08 +0000 (11:44 +0800)]
net,rcu: convert call_rcu(ip_mc_list_reclaim) to kfree_rcu()

The rcu callback ip_mc_list_reclaim() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(ip_mc_list_reclaim).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(__gen_kill_estimator) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 03:43:26 +0000 (11:43 +0800)]
net,rcu: convert call_rcu(__gen_kill_estimator) to kfree_rcu()

The rcu callback __gen_kill_estimator() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(__gen_kill_estimator).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(__leaf_info_free_rcu) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 03:42:34 +0000 (11:42 +0800)]
net,rcu: convert call_rcu(__leaf_info_free_rcu) to kfree_rcu()

The rcu callback __leaf_info_free_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(__leaf_info_free_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(fc_rport_free_rcu) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 03:42:11 +0000 (11:42 +0800)]
net,rcu: convert call_rcu(fc_rport_free_rcu) to kfree_rcu()

The rcu callback fc_rport_free_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(fc_rport_free_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago security,rcu: convert call_rcu(user_update_rcu_disposal) to kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 04:11:07 +0000 (12:11 +0800)]
security,rcu: convert call_rcu(user_update_rcu_disposal) to kfree_rcu()

The rcu callback user_update_rcu_disposal() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(user_update_rcu_disposal).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,act_police,rcu: remove rcu_barrier()
Lai Jiangshan [Tue, 15 Mar 2011 10:11:46 +0000 (18:11 +0800)]
net,act_police,rcu: remove rcu_barrier()

Because this module now uses kfree_rcu(), none of its own callbacks can
still be queued, so we can safely remove the rcu_barrier().

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(dn_dev_free_ifa_rcu) to kfree_rcu()
Lai Jiangshan [Tue, 15 Mar 2011 10:10:12 +0000 (18:10 +0800)]
net,rcu: convert call_rcu(dn_dev_free_ifa_rcu) to kfree_rcu()

The rcu callback dn_dev_free_ifa_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(dn_dev_free_ifa_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(ha_rcu_free) to kfree_rcu()
Lai Jiangshan [Tue, 15 Mar 2011 10:08:58 +0000 (18:08 +0800)]
net,rcu: convert call_rcu(ha_rcu_free) to kfree_rcu()

The rcu callback ha_rcu_free() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(ha_rcu_free).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(sctp_local_addr_free) to kfree_rcu()
Lai Jiangshan [Tue, 15 Mar 2011 10:05:02 +0000 (18:05 +0800)]
net,rcu: convert call_rcu(sctp_local_addr_free) to kfree_rcu()

The rcu callback sctp_local_addr_free() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(sctp_local_addr_free).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(listeners_free_rcu) to kfree_rcu()
Lai Jiangshan [Tue, 15 Mar 2011 10:01:42 +0000 (18:01 +0800)]
net,rcu: convert call_rcu(listeners_free_rcu) to kfree_rcu()

The rcu callback listeners_free_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(listeners_free_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(inet6_ifa_finish_destroy_rcu) to kfree_rcu()
Lai Jiangshan [Tue, 15 Mar 2011 10:00:14 +0000 (18:00 +0800)]
net,rcu: convert call_rcu(inet6_ifa_finish_destroy_rcu) to kfree_rcu()

The rcu callback inet6_ifa_finish_destroy_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(inet6_ifa_finish_destroy_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(in6_dev_finish_destroy_rcu) to kfree_rcu()
Lai Jiangshan [Tue, 15 Mar 2011 09:59:14 +0000 (17:59 +0800)]
net,rcu: convert call_rcu(in6_dev_finish_destroy_rcu) to kfree_rcu()

The rcu callback in6_dev_finish_destroy_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(in6_dev_finish_destroy_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(tcf_police_free_rcu) to kfree_rcu()
Lai Jiangshan [Tue, 15 Mar 2011 09:58:00 +0000 (17:58 +0800)]
net,rcu: convert call_rcu(tcf_police_free_rcu) to kfree_rcu()

The rcu callback tcf_police_free_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(tcf_police_free_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago net,rcu: convert call_rcu(tcf_common_free_rcu) to kfree_rcu()
Lai Jiangshan [Tue, 15 Mar 2011 09:57:04 +0000 (17:57 +0800)]
net,rcu: convert call_rcu(tcf_common_free_rcu) to kfree_rcu()

The rcu callback tcf_common_free_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(tcf_common_free_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago cgroup,rcu: convert call_rcu(__free_css_id_cb) to kfree_rcu()
Lai Jiangshan [Tue, 15 Mar 2011 09:56:10 +0000 (17:56 +0800)]
cgroup,rcu: convert call_rcu(__free_css_id_cb) to kfree_rcu()

The rcu callback __free_css_id_cb() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(__free_css_id_cb).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago cgroup,rcu: convert call_rcu(free_cgroup_rcu) to kfree_rcu()
Lai Jiangshan [Tue, 15 Mar 2011 09:55:16 +0000 (17:55 +0800)]
cgroup,rcu: convert call_rcu(free_cgroup_rcu) to kfree_rcu()

The rcu callback free_cgroup_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(free_cgroup_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago cgroup,rcu: convert call_rcu(free_css_set_rcu) to kfree_rcu()
Lai Jiangshan [Tue, 15 Mar 2011 09:53:46 +0000 (17:53 +0800)]
cgroup,rcu: convert call_rcu(free_css_set_rcu) to kfree_rcu()

The rcu callback free_css_set_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(free_css_set_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago rcu: permit rcu_read_unlock() to be called while holding runqueue locks
Paul E. McKenney [Thu, 5 May 2011 04:43:49 +0000 (21:43 -0700)]
rcu: permit rcu_read_unlock() to be called while holding runqueue locks

Avoid calling into the scheduler while holding core RCU locks.  This
allows rcu_read_unlock() to be called while holding the runqueue locks,
but only as long as there was no chance of the RCU read-side critical
section having been preempted.  (Otherwise, if RCU priority boosting
is enabled, rcu_read_unlock() might call into the scheduler in order to
unboost itself, which might allow self-deadlock on the runqueue locks
within the scheduler.)

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
13 years ago rcu: provide rcu_virt_note_context_switch() function.
Gleb Natapov [Wed, 4 May 2011 13:31:03 +0000 (16:31 +0300)]
rcu: provide rcu_virt_note_context_switch() function.

Provide rcu_virt_note_context_switch() for virtualization use to note
a quiescent state during guest entry.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
13 years ago rcu: get rid of signed overflow in check_cpu_stall()
Paul E. McKenney [Tue, 3 May 2011 06:40:04 +0000 (23:40 -0700)]
rcu: get rid of signed overflow in check_cpu_stall()

Signed integer overflow is undefined by the C standard, so move
calculations to unsigned.
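
As an illustration of the idea, a wrap-safe deadline comparison can be done
entirely in unsigned arithmetic.  This is a minimal sketch, not the code
from this commit: the helper names below are hypothetical, and the sketch
assumes the two counter values are less than half the counter range apart.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * True if counter value a is at or after b.  Unsigned subtraction wraps
 * modulo ULONG_MAX + 1, which the C standard defines, so there is no
 * undefined behavior even when the counter has wrapped past zero.
 */
bool model_ulong_cmp_ge(unsigned long a, unsigned long b)
{
	return ULONG_MAX / 2 >= a - b;
}

/* Hypothetical stall check: has the tick counter reached the deadline? */
bool model_stall_deadline_passed(unsigned long now, unsigned long deadline)
{
	return model_ulong_cmp_ge(now, deadline);
}

/* Self-check covering the ordinary and the wrapped cases. */
void model_stall_selfcheck(void)
{
	assert(model_stall_deadline_passed(105UL, 100UL));
	assert(!model_stall_deadline_passed(95UL, 100UL));
	/* The counter wrapped past zero after the deadline was set. */
	assert(model_stall_deadline_passed(2UL, ULONG_MAX - 1UL));
}
```

A signed difference such as (long)now - (long)deadline could overflow for
values near the ends of the range, whereas the unsigned form above is well
defined for every pair of inputs.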

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
13 years ago rcu: optimize rcutiny
Eric Dumazet [Thu, 28 Apr 2011 05:23:45 +0000 (07:23 +0200)]
rcu: optimize rcutiny

rcu_sched_qs() currently calls local_irq_save()/local_irq_restore() up
to three times.

Remove irq masking from rcu_qsctr_help() / invoke_rcu_kthread()
and do it once in rcu_sched_qs() / rcu_bh_qs().

This generates smaller code as well.

   text    data     bss     dec     hex filename
   2314     156      24    2494     9be kernel/rcutiny.old.o
   2250     156      24    2430     97e kernel/rcutiny.new.o

Also fix an outdated comment for rcu_qsctr_help() and move the
invoke_rcu_kthread() definition before its use.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago rcu: prevent call_rcu() from diving into rcu core if irqs disabled
Paul E. McKenney [Fri, 8 Apr 2011 05:47:23 +0000 (22:47 -0700)]
rcu: prevent call_rcu() from diving into rcu core if irqs disabled

This commit marks a first step towards making call_rcu() have
real-time behavior.  If irqs are disabled, don't dive into the
RCU core.  Later on, this new early exit will wake up the
per-CPU kthread, which first must be modified to handle the
cases involving callback storms.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago rcu: further lower priority in rcu_yield()
Paul E. McKenney [Sun, 27 Mar 2011 05:01:35 +0000 (22:01 -0700)]
rcu: further lower priority in rcu_yield()

Although rcu_yield() dropped from real-time to normal priority, there
is always the possibility that the competing tasks have been niced.
So nice to 19 in rcu_yield() to help ensure that other tasks have a
better chance of running.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago rcu: introduce kfree_rcu()
Lai Jiangshan [Fri, 18 Mar 2011 03:15:47 +0000 (11:15 +0800)]
rcu: introduce kfree_rcu()

Many RCU callback functions just call kfree() on the base structure.
These functions are trivial, but their size adds up, and furthermore
when they are used in a kernel module, that module must invoke the
high-latency rcu_barrier() function at module-unload time.

The kfree_rcu() function introduced by this commit addresses this issue.
Rather than encoding a function address in the embedded rcu_head
structure, kfree_rcu() instead encodes the offset of the rcu_head
structure within the base structure.  Because the functions are not
allowed in the low-order 4096 bytes of kernel virtual memory, offsets
up to 4095 bytes can be accommodated.  If the offset is larger than
4095 bytes, a compile-time error will be generated in __kfree_rcu().
If this error is triggered, you can either fall back to use of call_rcu()
or rearrange the structure to position the rcu_head structure into the
first 4096 bytes.

Note that the allowable offset might decrease in the future, for example,
to allow something like kmem_cache_free_rcu().

The new kfree_rcu() function can replace code as follows:

	call_rcu(&p->rcu, simple_kfree_callback);

where "simple_kfree_callback()" might be defined as follows:

	void simple_kfree_callback(struct rcu_head *p)
	{
		struct foo *q = container_of(p, struct foo, rcu);

		kfree(q);
	}

with the following:

	kfree_rcu(p, rcu);

Note that "p" is a pointer to the structure being freed and "rcu" is
the name of the rcu_head field within that structure.  The reason for
passing the structure pointer and the field name, rather than a bare
pointer to the embedded rcu_head, is that this approach allows better
type checking.
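
The offset encoding described above can be sketched in userspace as
follows.  This is a model under stated assumptions rather than the kernel
implementation: struct model_rcu_head, MODEL_OFFSET_LIMIT, and the
model_*() helpers are hypothetical names, free() stands in for kfree(),
the grace period itself is not modelled, and the integer-to-function-pointer
casts rely on behavior that common compilers provide but ISO C leaves
implementation-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct rcu_head. */
struct model_rcu_head {
	struct model_rcu_head *next;
	void (*func)(struct model_rcu_head *head);
};

/* Example structure; "rcu" is the embedded head, as in the text above. */
struct foo {
	int payload;
	struct model_rcu_head rcu;
};

/* Values below this limit are treated as offsets, not function addresses. */
#define MODEL_OFFSET_LIMIT 4096u

int model_offset_frees;		/* objects freed through the offset path */
int model_callback_frees;	/* objects freed through a real callback */

/* The kind of callback the offset encoding makes unnecessary. */
void model_simple_kfree_callback(struct model_rcu_head *head)
{
	struct foo *q = (struct foo *)((char *)head - offsetof(struct foo, rcu));

	free(q);
	model_callback_frees++;
}

/* Record the offset of the head within its base structure in ->func. */
void model_encode_offset(struct model_rcu_head *head, size_t offset)
{
	/* The commit text describes a compile-time check; this is run-time. */
	assert(offset < MODEL_OFFSET_LIMIT);
	head->func = (void (*)(struct model_rcu_head *))(uintptr_t)offset;
}

/* Run once the (unmodelled) grace period has ended. */
void model_reclaim(struct model_rcu_head *head)
{
	uintptr_t word = (uintptr_t)head->func;

	if (word < MODEL_OFFSET_LIMIT) {
		/* Small value: an offset, so free the base structure. */
		free((char *)head - word);
		model_offset_frees++;
	} else {
		/* Otherwise a genuine callback address: invoke it. */
		head->func(head);
	}
}
```

In this model a head prepared with model_encode_offset() is freed without
any per-type callback, while a head whose func field holds a real function
address still takes the ordinary callback path.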

This commit is based on earlier work by Lai Jiangshan and Manfred Spraul:

Lai's V1 patch: http://lkml.org/lkml/2008/9/18/1
Manfred's patch: http://lkml.org/lkml/2009/1/2/115

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago rcu: fix spelling
Paul E. McKenney [Wed, 2 Mar 2011 21:15:15 +0000 (13:15 -0800)]
rcu: fix spelling

The "preemptible" spelling is preferable.  May as well fix it.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago rcu: call __rcu_read_unlock() in exit_rcu for tree RCU
Lai Jiangshan [Fri, 25 Feb 2011 19:37:59 +0000 (11:37 -0800)]
rcu: call __rcu_read_unlock() in exit_rcu for tree RCU

Using __rcu_read_unlock() in place of rcu_read_unlock() leaves any debug
state as it really should be, namely with the lock still held.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years ago rcu: Converge TINY_RCU expedited and normal boosting
Paul E. McKenney [Fri, 25 Feb 2011 03:26:21 +0000 (19:26 -0800)]
rcu: Converge TINY_RCU expedited and normal boosting

This applies a trick from TREE_RCU boosting to TINY_RCU, eliminating
code and adding comments.  The key point is that it is possible for
the booster thread itself to work out whether there is a normal or
expedited boost required based solely on local information.  There
is therefore no need for boost initiation to know or care what type
of boosting is required.  In addition, when boosting is complete for
a given grace period, then by definition there cannot be any more
boosting for that grace period.  This allows eliminating yet more
state and statistics.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: remove useless ->boosted_this_gp field
Paul E. McKenney [Thu, 24 Feb 2011 23:25:21 +0000 (15:25 -0800)]
rcu: remove useless ->boosted_this_gp field

The ->boosted_this_gp field is a holdover from an earlier design that
was to carry out multiple boost operations in parallel.  It is not required
by the current design, which boosts one task at a time.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
13 years agorcu: code cleanups in TINY_RCU priority boosting.
Paul E. McKenney [Thu, 24 Feb 2011 01:03:06 +0000 (17:03 -0800)]
rcu: code cleanups in TINY_RCU priority boosting.

Remove an extraneous semicolon, fix a bad comment, and fold
INIT_LIST_HEAD() into list_del() to get list_del_init().
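
The fold mentioned above can be illustrated with a minimal userspace
list.  This is a sketch only: the functions are modeled on the kernel's
list_head helpers but are not the kernel's implementation, and this
list_del() leaves the removed entry's pointers stale instead of
poisoning them.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

/* Insert "entry" immediately after "head". */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink "entry"; its own pointers are left stale in this sketch. */
static void list_del(struct list_head *entry)
{
	assert(entry->next != NULL && entry->prev != NULL);
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* list_del() followed by INIT_LIST_HEAD(), as a single operation. */
static void list_del_init(struct list_head *entry)
{
	list_del(entry);
	INIT_LIST_HEAD(entry);
}

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}
```

After list_del_init(), the removed entry is itself a valid empty list,
so it can be tested with list_empty() or added to another list.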

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: Switch to this_cpu() primitives
Paul E. McKenney [Wed, 23 Feb 2011 19:10:52 +0000 (11:10 -0800)]
rcu: Switch to this_cpu() primitives

This removes a couple of lines from invoke_rcu_cpu_kthread(), improving
readability.

Reported-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: Use WARN_ON_ONCE for DEBUG_OBJECTS_RCU_HEAD warnings
Paul E. McKenney [Wed, 23 Feb 2011 17:56:00 +0000 (09:56 -0800)]
rcu: Use WARN_ON_ONCE for DEBUG_OBJECTS_RCU_HEAD warnings

Avoid additional multiple-warning confusion in memory-corruption scenarios.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: mark rcutorture boosting callback as being on-stack
Paul E. McKenney [Wed, 30 Mar 2011 16:10:44 +0000 (09:10 -0700)]
rcu: mark rcutorture boosting callback as being on-stack

The CONFIG_DEBUG_OBJECTS_RCU_HEAD facility requires that on-stack RCU
callbacks be flagged explicitly to debug-objects using the
init_rcu_head_on_stack() and destroy_rcu_head_on_stack() functions.
This commit applies those functions to the rcutorture code that tests
RCU priority boosting.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: add DEBUG_OBJECTS_RCU_HEAD check for alignment
Paul E. McKenney [Tue, 29 Mar 2011 19:56:56 +0000 (12:56 -0700)]
rcu: add DEBUG_OBJECTS_RCU_HEAD check for alignment

Verify that rcu_head structures are aligned to a four-byte boundary.
This check is enabled by CONFIG_DEBUG_OBJECTS_RCU_HEAD.
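
A four-byte alignment test amounts to checking that the two low-order
bits of the address are clear.  The sketch below shows that check in
userspace; the function names are hypothetical, and assert() stands in
for whatever warning the kernel actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if the address is a multiple of four (low two bits clear). */
static int is_four_byte_aligned(const void *p)
{
	return ((uintptr_t)p & 0x3) == 0;
}

/* Userspace stand-in for a debug check: abort on a misaligned head. */
static void check_head_alignment(const void *head)
{
	assert(is_four_byte_aligned(head));
}
```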

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: Enable DEBUG_OBJECTS_RCU_HEAD from !PREEMPT
Mathieu Desnoyers [Wed, 23 Feb 2011 17:42:14 +0000 (09:42 -0800)]
rcu: Enable DEBUG_OBJECTS_RCU_HEAD from !PREEMPT

The prohibition of DEBUG_OBJECTS_RCU_HEAD from !PREEMPT was due to the
fixup actions.  So just produce a warning from !PREEMPT.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: Add forward-progress diagnostic for per-CPU kthreads
Paul E. McKenney [Sat, 23 Apr 2011 01:08:51 +0000 (18:08 -0700)]
rcu: Add forward-progress diagnostic for per-CPU kthreads

Increment a per-CPU counter on each pass through rcu_cpu_kthread()'s
service loop, and add it to the rcudata trace output.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: add grace-period age and more kthread state to tracing
Paul E. McKenney [Wed, 6 Apr 2011 23:01:16 +0000 (16:01 -0700)]
rcu: add grace-period age and more kthread state to tracing

This commit adds the age in jiffies of the current grace period along
with the duration in jiffies of the longest grace period since boot
to the rcu/rcugp debugfs file.  It also adds an additional "O" state
to kthread tracing to differentiate between the kthread waiting due to
having nothing to do on the one hand and waiting due to being on the
wrong CPU on the other hand.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
13 years agorcu: fix tracing bug thinko on boost-balk attribution
Paul E. McKenney [Mon, 2 May 2011 10:46:10 +0000 (03:46 -0700)]
rcu: fix tracing bug thinko on boost-balk attribution

The rcu_initiate_boost_trace() function mis-attributed refusals to
initiate RCU priority boosting that were in fact due to its not yet
being time to boost.  This patch fixes the faulty comparison.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
13 years agorcu: update tracing documentation for new rcutorture and rcuboost
Paul E. McKenney [Wed, 6 Apr 2011 22:20:47 +0000 (15:20 -0700)]
rcu: update tracing documentation for new rcutorture and rcuboost

This commit documents the new debugfs rcu/rcutorture and rcu/rcuboost
trace files.  The description has been updated as suggested by Josh
Triplett.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
13 years agorcu: make rcutorture version numbers available through debugfs
Paul E. McKenney [Mon, 4 Apr 2011 04:33:51 +0000 (21:33 -0700)]
rcu: make rcutorture version numbers available through debugfs

It is not possible to accurately correlate rcutorture output with that
of debugfs.  This patch therefore adds a debugfs file that prints out
the rcutorture version number, permitting easy correlation.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: add tracing for RCU's kthread run states.
Paul E. McKenney [Wed, 30 Mar 2011 00:48:28 +0000 (17:48 -0700)]
rcu: add tracing for RCU's kthread run states.

Add tracing to help debugging situations when RCU's kthreads are not
running but are supposed to be.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: add callback-queue information to rcudata output
Paul E. McKenney [Mon, 28 Mar 2011 22:47:07 +0000 (15:47 -0700)]
rcu: add callback-queue information to rcudata output

This commit adds an indication of the state of the callback queue using
a string of four characters following the "ql=" integer queue length.
The first character is "N" if there are callbacks that have been
queued that are not yet ready to be handled by the next grace period, or
"." otherwise.  The second character is "R" if there are callbacks queued
that are ready to be handled by the next grace period, or "." otherwise.
The third character is "W" if there are callbacks waiting for the current
grace period, or "." otherwise.  Finally, the fourth character is "D"
if there are callbacks that have been handled by a prior grace period
and are waiting to be invoked, or "." otherwise.

Note that callbacks that are in the process of being invoked are
not shown.  These callbacks would have been removed from the rcu_data
structure's list by rcu_do_batch() prior to being executed.  (These
callbacks are also not reflected in the "ql=" total, FWIW.)

Also, document the new callback-queue trace information.
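
The four-character indicator described above could be produced along
the following lines.  This is a sketch only: "struct cb_queue_state"
and format_cb_state() are hypothetical names, and the real code derives
these conditions from the rcu_data structure's callback list rather
than from precomputed flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of the four callback-queue segments. */
struct cb_queue_state {
	int has_not_ready;  /* queued, not yet ready for the next grace period */
	int has_next_ready; /* ready to be handled by the next grace period */
	int has_waiting;    /* waiting for the current grace period */
	int has_done;       /* handled by a prior grace period, not yet invoked */
};

/* Write the four-character indicator plus a terminating NUL into buf[5]. */
static void format_cb_state(const struct cb_queue_state *s, char buf[5])
{
	assert(s != NULL && buf != NULL);
	buf[0] = s->has_not_ready ? 'N' : '.';
	buf[1] = s->has_next_ready ? 'R' : '.';
	buf[2] = s->has_waiting ? 'W' : '.';
	buf[3] = s->has_done ? 'D' : '.';
	buf[4] = '\0';
}
```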

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: Update RCU's trace.txt documentation for new format
Paul E. McKenney [Mon, 28 Mar 2011 04:37:58 +0000 (21:37 -0700)]
rcu: Update RCU's trace.txt documentation for new format

The trace.txt file had obsolete output for the debugfs rcu/rcudata
file, so update it.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: Add boosting to TREE_PREEMPT_RCU tracing
Paul E. McKenney [Tue, 22 Feb 2011 21:42:43 +0000 (13:42 -0800)]
rcu: Add boosting to TREE_PREEMPT_RCU tracing

Includes total number of tasks boosted, number boosted on behalf of each
of normal and expedited grace periods, and statistics on attempts to
initiate boosting that failed for various reasons.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: eliminate unused boosting statistics
Paul E. McKenney [Mon, 21 Feb 2011 21:31:55 +0000 (13:31 -0800)]
rcu: eliminate unused boosting statistics

The n_rcu_torture_boost_allocerror and n_rcu_torture_boost_afferror
statistics are not actually incremented anymore, so eliminate them.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: avoid hammering sched with yet another bound RT kthread
Paul E. McKenney [Mon, 18 Apr 2011 06:45:23 +0000 (23:45 -0700)]
rcu: avoid hammering sched with yet another bound RT kthread

The scheduler does not appear to take kindly to having multiple
real-time threads bound to a CPU that is going offline.  So this
commit is a temporary hack-around to avoid that happening.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
13 years agorcu: put per-CPU kthread at non-RT priority during CPU hotplug operations
Paul E. McKenney [Mon, 18 Apr 2011 22:31:26 +0000 (15:31 -0700)]
rcu: put per-CPU kthread at non-RT priority during CPU hotplug operations

If you are doing CPU hotplug operations, it is best not to have
CPU-bound realtime tasks running on the outgoing CPU.
So this commit makes per-CPU kthreads run at non-realtime priority
during that time.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: Force per-rcu_node kthreads off of the outgoing CPU
Paul E. McKenney [Thu, 14 Apr 2011 19:13:53 +0000 (12:13 -0700)]
rcu: Force per-rcu_node kthreads off of the outgoing CPU

The scheduler has had some heartburn in the past when too many real-time
kthreads were affinitied to the outgoing CPU.  So, this commit lightens
the load by forcing the per-rcu_node and the boost kthreads off of the
outgoing CPU.  Note that RCU's per-CPU kthread remains on the outgoing
CPU until the bitter end, as it must in order to preserve correctness.

Also avoid disabling hardirqs across calls to set_cpus_allowed_ptr(),
given that this function can block.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
13 years agorcu: priority boosting for TREE_PREEMPT_RCU
Paul E. McKenney [Mon, 7 Feb 2011 20:47:15 +0000 (12:47 -0800)]
rcu: priority boosting for TREE_PREEMPT_RCU

Add priority boosting for TREE_PREEMPT_RCU, similar to that for
TINY_PREEMPT_RCU.  This is enabled by the default-off RCU_BOOST
kernel parameter.  The priority to which to boost preempted
RCU readers is controlled by the RCU_BOOST_PRIO kernel parameter
(defaulting to real-time priority 1) and the time to wait before
boosting the readers who are blocking a given grace period is
controlled by the RCU_BOOST_DELAY kernel parameter (defaulting to
500 milliseconds).

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: move TREE_RCU from softirq to kthread
Paul E. McKenney [Wed, 12 Jan 2011 22:10:23 +0000 (14:10 -0800)]
rcu: move TREE_RCU from softirq to kthread

If RCU priority boosting is to be meaningful, callback invocation must
be boosted in addition to preempted RCU readers.  Otherwise, in presence
of CPU real-time threads, the grace period ends, but the callbacks don't
get invoked.  If the callbacks don't get invoked, the associated memory
doesn't get freed, so the system is still subject to OOM.

But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit
moves the callback invocations to a kthread, which can be boosted easily.

Also add comments and properly synchronize all accesses to
rcu_cpu_kthread_task, as suggested by Lai Jiangshan.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: merge TREE_PREEMPT_RCU blocked_tasks[] lists
Paul E. McKenney [Tue, 30 Nov 2010 05:56:39 +0000 (21:56 -0800)]
rcu: merge TREE_PREEMPT_RCU blocked_tasks[] lists

Combine the current TREE_PREEMPT_RCU ->blocked_tasks[] lists in the
rcu_node structure into a single ->blkd_tasks list with ->gp_tasks
and ->exp_tasks tail pointers.  This is in preparation for RCU priority
boosting, which will add a third dimension to the combinatorial explosion
in the ->blocked_tasks[] case, but simply a third pointer in the new
->blkd_tasks case.

Also update documentation to reflect the blocked_tasks[] merge.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: Decrease memory-barrier usage based on semi-formal proof
Paul E. McKenney [Tue, 7 Sep 2010 17:38:22 +0000 (10:38 -0700)]
rcu: Decrease memory-barrier usage based on semi-formal proof

Commit d09b62d fixed grace-period synchronization, but left some smp_mb()
invocations in rcu_process_callbacks() that are no longer needed, but
sheer paranoia prevented them from being removed.  This commit removes
them and provides a proof of correctness in their absence.  It also adds
a memory barrier to rcu_report_qs_rsp() immediately before the update to
rsp->completed in order to handle the theoretical possibility that the
compiler or CPU might move massive quantities of code into a lock-based
critical section.  This also proves that the sheer paranoia was not
entirely unjustified, at least from a theoretical point of view.

In addition, the old dyntick-idle synchronization depended on the fact
that grace periods were many milliseconds in duration, so that it could
be assumed that no dyntick-idle CPU could reorder a memory reference
across an entire grace period.  Unfortunately for this design, the
addition of expedited grace periods breaks this assumption, which has
the unfortunate side-effect of requiring atomic operations in the
functions that track dyntick-idle state for RCU.  (There is some hope
that the algorithms used in user-level RCU might be applied here, but
some work is required to handle the NMIs that user-space applications
can happily ignore.  For the short term, better safe than sorry.)

This proof assumes that neither compiler nor CPU will allow a lock
acquisition and release to be reordered, as doing so can result in
deadlock.  The proof is as follows:

1. A given CPU declares a quiescent state under the protection of
its leaf rcu_node's lock.

2. If there is more than one level of rcu_node hierarchy, the
last CPU to declare a quiescent state will also acquire the
->lock of the next rcu_node up in the hierarchy, but only
after releasing the lower level's lock.  The acquisition of this
lock clearly cannot occur prior to the acquisition of the leaf
node's lock.

3. Step 2 repeats until we reach the root rcu_node structure.
Please note again that only one lock is held at a time through
this process.  The acquisition of the root rcu_node's ->lock
must occur after the release of that of the leaf rcu_node.

4. At this point, we set the ->completed field in the rcu_state
structure in rcu_report_qs_rsp().  However, if the rcu_node
hierarchy contains only one rcu_node, then in theory the code
preceding the quiescent state could leak into the critical
section.  We therefore precede the update of ->completed with a
memory barrier.  All CPUs will therefore agree that any updates
preceding any report of a quiescent state will have happened
before the update of ->completed.

5. Regardless of whether a new grace period is needed, rcu_start_gp()
will propagate the new value of ->completed to all of the leaf
rcu_node structures, under the protection of each rcu_node's ->lock.
If a new grace period is needed immediately, this propagation
will occur in the same critical section that ->completed was
set in, but courtesy of the memory barrier in #4 above, is still
seen to follow any pre-quiescent-state activity.

6. When a given CPU invokes __rcu_process_gp_end(), it becomes
aware of the end of the old grace period and therefore makes
any RCU callbacks that were waiting on that grace period eligible
for invocation.

If this CPU is the same one that detected the end of the grace
period, and if there is but a single rcu_node in the hierarchy,
we will still be in the single critical section.  In this case,
the memory barrier in step #4 guarantees that all callbacks will
be seen to execute after each CPU's quiescent state.

On the other hand, if this is a different CPU, it will acquire
the leaf rcu_node's ->lock, and will again be serialized after
each CPU's quiescent state for the old grace period.

On the strength of this proof, this commit therefore removes the memory
barriers from rcu_process_callbacks() and adds one to rcu_report_qs_rsp().
The effect is to reduce the number of memory barriers by one and to
reduce the frequency of execution from about once per scheduling tick
per CPU to once per grace period.
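
Steps 1 through 3 of the proof, in which a quiescent state propagates
up the hierarchy with only one lock held at a time, can be sketched as
follows.  This is a single-threaded model under stated assumptions:
"struct node" is a hypothetical simplification of rcu_node, an integer
flag stands in for ->lock, and the ->completed update and its memory
barrier (step 4) are omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified node; not the kernel's struct rcu_node. */
struct node {
	struct node *parent;   /* NULL at the root */
	unsigned long qsmask;  /* children or CPUs still owing a quiescent state */
	unsigned long grpmask; /* this node's bit in parent->qsmask */
	int locked;            /* stand-in for ->lock */
};

static int locks_held; /* number of node locks currently held */

static void node_lock(struct node *n)
{
	assert(!n->locked);
	n->locked = 1;
	locks_held++;
	assert(locks_held == 1); /* never more than one lock at a time */
}

static void node_unlock(struct node *n)
{
	assert(n->locked);
	n->locked = 0;
	locks_held--;
}

/*
 * Report a quiescent state for bit "mask" at "leaf".  Returns 1 if this
 * report emptied the root's mask, 0 if other reports are still pending.
 */
static int report_qs(struct node *leaf, unsigned long mask)
{
	struct node *n = leaf;

	for (;;) {
		node_lock(n);
		n->qsmask &= ~mask;
		if (n->qsmask != 0) {
			node_unlock(n); /* others still pending at this level */
			return 0;
		}
		if (n->parent == NULL) {
			node_unlock(n); /* root is clear */
			return 1;
		}
		mask = n->grpmask;
		node_unlock(n);         /* release before acquiring the parent */
		n = n->parent;
	}
}
```

The assertion in node_lock() encodes the property the proof relies on:
each level's lock is released before the next level's lock is acquired.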

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agorcu: Remove conditional compilation for RCU CPU stall warnings
Paul E. McKenney [Wed, 9 Feb 2011 01:14:39 +0000 (17:14 -0800)]
rcu: Remove conditional compilation for RCU CPU stall warnings

The RCU CPU stall warnings can now be controlled using the
rcu_cpu_stall_suppress boot-time parameter or via the same parameter
from sysfs.  There is therefore no longer any reason to have
kernel config parameters for this feature.  This commit therefore
removes the RCU_CPU_STALL_DETECTOR and RCU_CPU_STALL_DETECTOR_RUNNABLE
kernel config parameters.  The RCU_CPU_STALL_TIMEOUT parameter remains
to allow the timeout to be tuned and the RCU_CPU_STALL_VERBOSE parameter
remains to allow task-stall information to be suppressed if desired.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
13 years agoLinux 2.6.39-rc6
Linus Torvalds [Wed, 4 May 2011 02:59:13 +0000 (19:59 -0700)]
Linux 2.6.39-rc6

13 years agoMerge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied...
Linus Torvalds [Wed, 4 May 2011 01:52:09 +0000 (18:52 -0700)]
Merge branch 'drm-fixes' of git://git./linux/kernel/git/airlied/drm-2.6

* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon/kms: fix gart setup on fusion parts (v2)
  drm: Send pending vblank events before disabling vblank.
  drm/radeon: fix regression on atom cards with hardcoded EDID record.
  drm/radeon/kms: add some new pci ids

13 years agodrm/radeon/kms: fix gart setup on fusion parts (v2)
Alex Deucher [Tue, 3 May 2011 23:28:02 +0000 (19:28 -0400)]
drm/radeon/kms: fix gart setup on fusion parts (v2)

Out of the entire GART/VM subsystem, the hw designers changed
the location of 3 regs.

v2: airlied: add parameter for userspace to work from.

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agodrm: Send pending vblank events before disabling vblank.
Christopher James Halse Rogers [Wed, 27 Apr 2011 06:10:57 +0000 (16:10 +1000)]
drm: Send pending vblank events before disabling vblank.

This is the least-bad behaviour.  It means that we signal the
vblank event before it actually happens, but since we're disabling
vblanks there's no guarantee that it will *ever* happen otherwise.

This prevents GL applications which use WaitMSC from hanging
indefinitely.

Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agodrm/radeon: fix regression on atom cards with hardcoded EDID record.
Dave Airlie [Sun, 1 May 2011 10:16:30 +0000 (20:16 +1000)]
drm/radeon: fix regression on atom cards with hardcoded EDID record.

Commit fafcf94e2b5732d1e13b440291c53115d2b172e9 introduced an EDID size,
and it seems to have broken this path.

This manifests as an oops, primarily on Lenovo T500 laptops with dual graphics.

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=33812
cc: stable@kernel.org
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agodrm/radeon/kms: add some new pci ids
Alex Deucher [Tue, 3 May 2011 19:15:55 +0000 (15:15 -0400)]
drm/radeon/kms: add some new pci ids

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agologfs: initialize superblock entries earlier
Linus Torvalds [Tue, 3 May 2011 23:10:25 +0000 (16:10 -0700)]
logfs: initialize superblock entries earlier

In particular, s_freeing_list needs to be initialized early, since it is
used on some of the error paths when mounts fail.  The mapping inode,
for example, would be initialized and then free'd on an error path
before s_freeing_list was initialized, but the inode drop operation
needs the s_freeing_list to be set up.

Normally you'd never see this, because not only is logfs fairly rare,
but a successful mount will never have any issues.

Reported-by: werner <w.landgraf@ru.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoMerge branch 'stable/bug-fixes-for-rc5' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 3 May 2011 16:25:42 +0000 (09:25 -0700)]
Merge branch 'stable/bug-fixes-for-rc5' of git://git./linux/kernel/git/konrad/xen

* 'stable/bug-fixes-for-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: mask_rw_pte mark RO all pagetable pages up to pgt_buf_top
  xen/mmu: Add workaround "x86-64, mm: Put early page table high"

13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
Linus Torvalds [Tue, 3 May 2011 16:24:44 +0000 (09:24 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/cjb/mmc

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: sdhci: Check mrq != NULL in sdhci_tasklet_finish
  mmc: sdhci: Check mrq->cmd in sdhci_tasklet_finish
  mmc: tmio: fix .set_ios(MMC_POWER_UP) handling
  mmc: fix a race between card-detect rescan and clock-gate work instances
  mmc: omap: Fix possible NULL pointer deref
  mmc: core: mmc_add_card(): fix missing break in switch statement
  mmc: sdhci-pci: Fix error case in sdhci_pci_probe_slot()

13 years agoMerge branches 'x86-fixes-for-linus' and 'irq-fixes-for-linus' of git://git.kernel...
Linus Torvalds [Tue, 3 May 2011 16:23:44 +0000 (09:23 -0700)]
Merge branches 'x86-fixes-for-linus' and 'irq-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, reboot: Fix relocations in reboot_32.S
  x86, NUMA: Fix empty memblk detection in numa_cleanup_meminfo()
  x86, AMD: Fix APIC timer erratum 400 affecting K8 Rev.A-E processors

* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  genirq: Fix typo CONFIG_GENIRC_IRQ_SHOW_LEVEL

13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Tue, 3 May 2011 03:26:32 +0000 (20:26 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: wm831x-ts - move BTN_TOUCH reporting to data transfer
  Input: wm831x-ts - allow IRQ flags to be specified
  Input: wm831x-ts - fix races with IRQ management

13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Tue, 3 May 2011 01:00:43 +0000 (18:00 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
  sysctl: net: call unregister_net_sysctl_table where needed
  Revert: veth: remove unneeded ifname code from veth_newlink()
  smsc95xx: fix reset check
  tg3: Fix failure to enable WoL by default when possible
  networking: inappropriate ioctl operation should return ENOTTY
  amd8111e: trivial typo spelling: Negotitate -> Negotiate
  ipv4: don't spam dmesg with "Using LC-trie" messages
  af_unix: Only allow recv on connected seqpacket sockets.
  mii: add support of pause frames in mii_get_an
  net: ftmac100: fix scheduling while atomic during PHY link status change
  usbnet: Transfer of maintainership
  usbnet: add support for some Huawei modems with cdc-ether ports
  bnx2: cancel timer on device removal
  iwl4965: fix "Received BA when not expected"
  iwlagn: fix "Received BA when not expected"
  dsa/mv88e6131: fix unknown multicast/broadcast forwarding on mv88e6085
  usbnet: Resubmit interrupt URB if device is open
  iwl4965: fix "TX Power requested while scanning"
  iwlegacy: led stay solid on when no traffic
  b43: trivial: update module info about ucode16_mimo firmware
  ...

13 years agosysctl: net: call unregister_net_sysctl_table where needed
Lucian Adrian Grijincu [Sun, 1 May 2011 01:44:01 +0000 (01:44 +0000)]
sysctl: net: call unregister_net_sysctl_table where needed

ctl_table_headers registered with register_net_sysctl_table should
have been unregistered with the equivalent unregister_net_sysctl_table.

Signed-off-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoRevert: veth: remove unneeded ifname code from veth_newlink()
Jiri Pirko [Sat, 30 Apr 2011 01:28:17 +0000 (01:28 +0000)]
Revert: veth: remove unneeded ifname code from veth_newlink()

84c49d8c3e4abefb0a41a77b25aa37ebe8d6b743 ("veth: remove unneeded
ifname code from veth_newlink()") caused regression on veth
creation. This patch reverts the original one.

Reported-by: Michał Mirosław <mirqus@gmail.com>
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosmsc95xx: fix reset check
Rabin Vincent [Sat, 30 Apr 2011 08:29:27 +0000 (08:29 +0000)]
smsc95xx: fix reset check

The reset loop check should check the MII_BMCR register value for
BMCR_RESET rather than for MII_BMCR (the register address, which also
happens to be zero).
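
The difference between the two checks can be sketched in userspace as
follows.  The constant values match <linux/mii.h>, but everything else
is an assumption for illustration: wait_for_phy_reset() and
fake_read_bmcr() are hypothetical names, and the real driver reads the
register over MDIO and sleeps between polls.

```c
#include <assert.h>
#include <stddef.h>

#define MII_BMCR   0x00   /* register address */
#define BMCR_RESET 0x8000 /* reset bit within the BMCR register value */

/* Hypothetical register-read callback: returns the current BMCR value. */
typedef int (*read_bmcr_fn)(void *ctx);

/*
 * Poll until the PHY clears BMCR_RESET or "max_polls" reads are used up.
 * Returns the number of reads performed, or -1 on timeout.
 */
static int wait_for_phy_reset(read_bmcr_fn read_bmcr, void *ctx, int max_polls)
{
	int polls = 0;
	int bmcr;

	assert(read_bmcr != NULL && max_polls > 0);
	do {
		bmcr = read_bmcr(ctx);
		polls++;
	} while ((bmcr & BMCR_RESET) && polls < max_polls);

	return (bmcr & BMCR_RESET) ? -1 : polls;
}

/* Test double: reports reset-in-progress for the first *ctx reads. */
static int fake_read_bmcr(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return BMCR_RESET;
	}
	return 0;
}
```

Masking with MII_BMCR instead would always yield zero, so the loop
would exit after a single read regardless of the reset state.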

Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotg3: Fix failure to enable WoL by default when possible
Rafael J. Wysocki [Thu, 28 Apr 2011 11:02:15 +0000 (11:02 +0000)]
tg3: Fix failure to enable WoL by default when possible

tg3 is supposed to enable WoL by default on adapters which support
that, but it fails to do so unless the adapter's
/sys/devices/.../power/wakeup file contains 'enabled' during the
initialization of the adapter.  Fix that by making tg3 use
device_set_wakeup_enable() to enable wakeup automatically whenever
WoL should be enabled by default.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonetworking: inappropriate ioctl operation should return ENOTTY
Lifeng Sun [Wed, 27 Apr 2011 22:04:51 +0000 (22:04 +0000)]
networking: inappropriate ioctl operation should return ENOTTY

ioctl() calls against a socket with an inappropriate ioctl operation
are incorrectly returning EINVAL rather than ENOTTY:

  [ENOTTY]
      Inappropriate I/O control operation.
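
The distinction between the two error codes can be sketched with a
hypothetical dispatcher.  The command numbers and sketch_ioctl() are
invented for illustration and do not correspond to any real socket
ioctl; only the errno values come from <errno.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical command numbers, for illustration only. */
enum { SKETCH_CMD_GET = 1, SKETCH_CMD_SET = 2 };

static int sketch_ioctl(unsigned int cmd, int *value)
{
	switch (cmd) {
	case SKETCH_CMD_GET:
		if (value == NULL)
			return -EINVAL; /* known command, bad argument */
		*value = 42;
		return 0;
	case SKETCH_CMD_SET:
		if (value == NULL)
			return -EINVAL;
		return 0;
	default:
		return -ENOTTY;         /* command not understood here */
	}
}

/* Usage: an unknown command yields -ENOTTY, a bad argument -EINVAL. */
static void sketch_ioctl_demo(void)
{
	int v = 0;

	assert(sketch_ioctl(SKETCH_CMD_GET, &v) == 0 && v == 42);
	assert(sketch_ioctl(SKETCH_CMD_GET, NULL) == -EINVAL);
	assert(sketch_ioctl(99, &v) == -ENOTTY);
}
```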

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=33992
Signed-off-by: Lifeng Sun <lifongsun@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agox86, reboot: Fix relocations in reboot_32.S
H. Peter Anvin [Mon, 2 May 2011 21:33:24 +0000 (14:33 -0700)]
x86, reboot: Fix relocations in reboot_32.S

The use of base for %ebx in this file is arbitrary, *except* that we
also use it to compute the real-mode segment.  Therefore, make it so
that r_base really is the true address to which %ebx points.

This resolves kernel bugzilla 33302.

Reported-and-tested-by: Alexey Zaytsev <alexey.zaytsev@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/n/tip-08os5wi3yq1no0y4i5m4z7he@git.kernel.org
13 years agoamd8111e: trivial typo spelling: Negotitate -> Negotiate
Joe Perches [Mon, 2 May 2011 09:59:29 +0000 (09:59 +0000)]
amd8111e: trivial typo spelling: Negotitate -> Negotiate

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoxen: mask_rw_pte mark RO all pagetable pages up to pgt_buf_top
Stefano Stabellini [Tue, 12 Apr 2011 11:19:49 +0000 (12:19 +0100)]
xen: mask_rw_pte mark RO all pagetable pages up to pgt_buf_top

mask_rw_pte is currently checking if a pfn is a pagetable page if it
falls in the range pgt_buf_start - pgt_buf_end but that is incorrect
because pgt_buf_end is a moving target: pgt_buf_top is the real
boundary.
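
The corrected range test can be sketched as follows.  "struct pgt_buf"
and pfn_in_pgt_buf() are hypothetical names used for illustration; the
kernel keeps pgt_buf_start, pgt_buf_end and pgt_buf_top as separate
variables.

```c
#include <assert.h>

/* Hypothetical snapshot of the early page-table buffer, in PFNs. */
struct pgt_buf {
	unsigned long start; /* first PFN of the buffer */
	unsigned long end;   /* next PFN to be handed out; grows over time */
	unsigned long top;   /* one past the last PFN of the buffer */
};

/* Nonzero if "pfn" is, or may later become, a page-table page. */
static int pfn_in_pgt_buf(const struct pgt_buf *b, unsigned long pfn)
{
	assert(b->start <= b->end && b->end <= b->top);
	return pfn >= b->start && pfn < b->top;
}
```

Comparing against "end" instead of "top" would miss every PFN between
the two, which is exactly the set of pages that have not yet been
handed out as page-table pages.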

Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
13 years agoxen/mmu: Add workaround "x86-64, mm: Put early page table high"
Konrad Rzeszutek Wilk [Fri, 29 Apr 2011 15:34:00 +0000 (11:34 -0400)]
xen/mmu: Add workaround "x86-64, mm: Put early page table high"

As a consequence of the commit:

commit 4b239f458c229de044d6905c2b0f9fe16ed9e01e
Author: Yinghai Lu <yinghai@kernel.org>
Date:   Fri Dec 17 16:58:28 2010 -0800

    x86-64, mm: Put early page table high

the Linux kernel crashes under Xen:

mapping kernel into physical memory
Xen: setup ISA identity maps
about to get started...
(XEN) mm.c:2466:d0 Bad type (saw 7400000000000001 != exp 1000000000000000) for mfn b1d89 (pfn bacf7)
(XEN) mm.c:3027:d0 Error while pinning mfn b1d89
(XEN) traps.c:481:d0 Unhandled invalid opcode fault/trap [#6] on VCPU 0 [ec=0000]
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
...

The reason is that at some point init_memory_mapping is going to reach
the pagetable pages area and map those pages too (mapping them as normal
memory that falls in the range of addresses passed to init_memory_mapping
as argument). Some of those pages are already pagetable pages (they are
in the range pgt_buf_start-pgt_buf_end) therefore they are going to be
mapped RO and everything is fine.
Some of these pages are not pagetable pages yet (they fall in the range
pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end) so they
are going to be mapped RW.  When these pages become pagetable pages and
are hooked into the pagetable, Xen will find that the guest already has
a RW mapping of them somewhere and fail the operation.
The reason Xen requires pagetables to be RO is that the hypervisor needs
to verify that the pagetables are valid before using them. The validation
operations are called "pinning" (more details in arch/x86/xen/mmu.c).

In order to fix the issue we mark all the pages in the entire range
pgt_buf_start-pgt_buf_top as RO.  However, when the pagetable allocation
is completed, only the range pgt_buf_start-pgt_buf_end is reserved by
init_memory_mapping. Hence the kernel is going to crash as soon as one
of the pages in the range pgt_buf_end-pgt_buf_top is reused (because
those ranges are RO).

For this reason, this function is introduced which is called _after_
the init_memory_mapping has completed (in a perfect world we would
call this function from init_memory_mapping, but let's ignore that).

Because we are called _after_ init_memory_mapping the pgt_buf_[start,
end,top] have all changed to new values (b/c another init_memory_mapping
is called). Hence, the first time we enter this function, we save
away the pgt_buf_start value and update the pgt_buf_[end,top].

When we detect that the "old" pgt_buf_start through pgt_buf_end
PFNs have been reserved (so memblock_x86_reserve_range has been called),
we immediately set out to RW the "old" pgt_buf_end through pgt_buf_top.

And then we update those "old" pgt_buf_[end|top] with the new ones
so that we can redo this on the next pagetable.

Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Reviewed-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
[v1: Updated with Jeremy's comments]
[v2: Added the crash output]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>