Tobias Klauser [Tue, 24 Jun 2014 13:33:22 +0000 (15:33 +0200)]
net: filter: Use kcalloc/kmalloc_array to allocate arrays
Use kcalloc/kmalloc_array to make it clear we're allocating arrays. No
integer overflow can actually happen here, since len/flen is guaranteed
to be less than BPF_MAXINSNS (4096). However, this changed makes sure
we're not going to get one if BPF_MAXINSNS were ever increased.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tobias Klauser [Tue, 24 Jun 2014 13:33:21 +0000 (15:33 +0200)]
trivial: net: filter: Change kerneldoc parameter order
Change the order of the parameters to sk_unattached_filter_create() in
the kerneldoc to reflect the order they appear in the actual function.
This fix is only cosmetic, in the generated doc they still appear in the
correct order without the fix.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tobias Klauser [Tue, 24 Jun 2014 13:33:20 +0000 (15:33 +0200)]
trivial: net: filter: Fix typo in comment
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Ripard [Mon, 23 Jun 2014 20:49:40 +0000 (22:49 +0200)]
net: allwinner: emac: Add missing free_irq
If the mdio probe function fails in emac_open, the interrupt we just requested
isn't freed. If emac_open is called again, for example because we try to set up
the interface again, the kernel will oops because the interrupt wasn't properly
released.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: <stable@vger.kernel.org> # 3.11+
Signed-off-by: David S. Miller <davem@davemloft.net>
Thadeu Lima de Souza Cascardo [Sat, 21 Jun 2014 12:48:08 +0000 (09:48 -0300)]
cxgb4: use dev_port to identify ports
Commit
3f85944fe207d0225ef21a2c0951d4946fc9a95d ("net: Add sysfs file
for port number") introduce dev_port to network devices. cxgb4 adapters
have multiple ports on the same PCI function, and used dev_id to
identify those ports. That use was removed by commit
8c367fcbe6549195d2eb11e62bea233f811aad41 ("cxgb4: Do not set
net_device::dev_id to VI index"), since dev_id should be used only when
devices share the same MAC address.
Using dev_port for cxgb4 allows different ports on the same PCI function
to be identified.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Liu [Mon, 23 Jun 2014 09:50:17 +0000 (10:50 +0100)]
xen-netback: bookkeep number of active queues in our own module
The original code uses netdev->real_num_tx_queues to bookkeep number of
queues and invokes netif_set_real_num_tx_queues to set the number of
queues. However, netif_set_real_num_tx_queues doesn't allow
real_num_tx_queues to be smaller than 1, which means setting the number
to 0 will not work and real_num_tx_queues is untouched.
This is bogus when xenvif_free is invoked before any number of queues is
allocated. That function needs to iterate through all queues to free
resources. Using the wrong number of queues results in NULL pointer
dereference.
So we bookkeep the number of queues in xen-netback to solve this
problem. This fixes a regression introduced by multiqueue patchset in
3.16-rc1.
There's another bug in original code that the real number of RX queues
is never set. In current Xen multiqueue design, the number of TX queues
and RX queues are in fact the same. We need to set the numbers of TX and
RX queues to the same value.
Also remove xenvif_select_queue and leave queue selection to core
driver, as suggested by David Miller.
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prashant Sreedharan [Sat, 21 Jun 2014 06:28:15 +0000 (23:28 -0700)]
tg3: Change nvram command timeout value to 50ms
Commit
506724c463fcd63477a5e404728a980b71f80bb7 "tg3: Override clock,
link aware and link idle mode during NVRAM dump" changed the timeout
value for nvram command execution from 100ms to 1ms. But the 1ms
timeout value was only sufficient for nvram read operations but not
write operations for most of the devices supported by tg3 driver.
This patch sets the MAX to 50ms. Also it uses usleep_range instead
of udelay.
Signed-off-by: Prashant Sreedharan <prashant@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Fri, 20 Jun 2014 09:32:36 +0000 (17:32 +0800)]
cxgb4: Not need to hold the adap_rcu_lock lock when read adap_rcu_list
cxgb4_netdev maybe lead to dead lock, since it uses a spin lock, and be called
in both thread and softirq context, but not disable BH, the lockdep report is
below; In fact, cxgb4_netdev only reads adap_rcu_list with RCU protection, so
not need to hold spin lock again.
=================================
[ INFO: inconsistent lock state ]
3.14.7+ #24 Tainted: G C O
---------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
radvd/3794 [HC0[0]:SC1[1]:HE1:SE0] takes:
(adap_rcu_lock){+.?...}, at: [<
ffffffffa09989ea>] clip_add+0x2c/0x116 [cxgb4]
{SOFTIRQ-ON-W} state was registered at:
[<
ffffffff810fca81>] __lock_acquire+0x34a/0xe48
[<
ffffffff810fd98b>] lock_acquire+0x82/0x9d
[<
ffffffff815d6ff8>] _raw_spin_lock+0x34/0x43
[<
ffffffffa09989ea>] clip_add+0x2c/0x116 [cxgb4]
[<
ffffffffa0998beb>] cxgb4_inet6addr_handler+0x117/0x12c [cxgb4]
[<
ffffffff815da98b>] notifier_call_chain+0x32/0x5c
[<
ffffffff815da9f9>] __atomic_notifier_call_chain+0x44/0x6e
[<
ffffffff815daa32>] atomic_notifier_call_chain+0xf/0x11
[<
ffffffff815b1356>] inet6addr_notifier_call_chain+0x16/0x18
[<
ffffffffa01f72e5>] ipv6_add_addr+0x404/0x46e [ipv6]
[<
ffffffffa01f8df0>] addrconf_add_linklocal+0x5f/0x95 [ipv6]
[<
ffffffffa01fc3e9>] addrconf_notify+0x632/0x841 [ipv6]
[<
ffffffff815da98b>] notifier_call_chain+0x32/0x5c
[<
ffffffff810e09a1>] __raw_notifier_call_chain+0x9/0xb
[<
ffffffff810e09b2>] raw_notifier_call_chain+0xf/0x11
[<
ffffffff8151b3b7>] call_netdevice_notifiers_info+0x4e/0x56
[<
ffffffff8151b3d0>] call_netdevice_notifiers+0x11/0x13
[<
ffffffff8151c0a6>] netdev_state_change+0x1f/0x38
[<
ffffffff8152f004>] linkwatch_do_dev+0x3b/0x49
[<
ffffffff8152f184>] __linkwatch_run_queue+0x10b/0x144
[<
ffffffff8152f1dd>] linkwatch_event+0x20/0x27
[<
ffffffff810d7bc0>] process_one_work+0x1cb/0x2ee
[<
ffffffff810d7e3b>] worker_thread+0x12e/0x1fc
[<
ffffffff810dd391>] kthread+0xc4/0xcc
[<
ffffffff815dc48c>] ret_from_fork+0x7c/0xb0
irq event stamp: 3388
hardirqs last enabled at (3388): [<
ffffffff810c6c85>]
__local_bh_enable_ip+0xaa/0xd9
hardirqs last disabled at (3387): [<
ffffffff810c6c2d>]
__local_bh_enable_ip+0x52/0xd9
softirqs last enabled at (3288): [<
ffffffffa01f1d5b>]
rcu_read_unlock_bh+0x0/0x2f [ipv6]
softirqs last disabled at (3289): [<
ffffffff815ddafc>]
do_softirq_own_stack+0x1c/0x30
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(adap_rcu_lock);
<Interrupt>
lock(adap_rcu_lock);
*** DEADLOCK ***
5 locks held by radvd/3794:
#0: (sk_lock-AF_INET6){+.+.+.}, at: [<
ffffffffa020b85a>]
rawv6_sendmsg+0x74b/0xa4d [ipv6]
#1: (rcu_read_lock){.+.+..}, at: [<
ffffffff8151ac6b>]
rcu_lock_acquire+0x0/0x29
#2: (rcu_read_lock){.+.+..}, at: [<
ffffffffa01f4cca>]
rcu_lock_acquire.constprop.16+0x0/0x30 [ipv6]
#3: (rcu_read_lock){.+.+..}, at: [<
ffffffff810e09b4>]
rcu_lock_acquire+0x0/0x29
#4: (rcu_read_lock){.+.+..}, at: [<
ffffffffa0998782>]
rcu_lock_acquire.constprop.40+0x0/0x30 [cxgb4]
stack backtrace:
CPU: 7 PID: 3794 Comm: radvd Tainted: G C O 3.14.7+ #24
Hardware name: Supermicro X7DBU/X7DBU, BIOS 6.00 12/03/2007
ffffffff81f15990 ffff88012fdc36a8 ffffffff815d0016 0000000000000006
ffff8800c80dc2a0 ffff88012fdc3708 ffffffff815cc727 0000000000000001
0000000000000001 ffff880100000000 ffffffff81015b02 ffff8800c80dcb58
Call Trace:
<IRQ> [<
ffffffff815d0016>] dump_stack+0x4e/0x71
[<
ffffffff815cc727>] print_usage_bug+0x1ec/0x1fd
[<
ffffffff81015b02>] ? save_stack_trace+0x27/0x44
[<
ffffffff810fbfaa>] ? check_usage_backwards+0xa0/0xa0
[<
ffffffff810fc640>] mark_lock+0x11b/0x212
[<
ffffffff810fca0b>] __lock_acquire+0x2d4/0xe48
[<
ffffffff810fbfaa>] ? check_usage_backwards+0xa0/0xa0
[<
ffffffff810fbff6>] ? check_usage_forwards+0x4c/0xa6
[<
ffffffff810c6c8a>] ? __local_bh_enable_ip+0xaf/0xd9
[<
ffffffff810fd98b>] lock_acquire+0x82/0x9d
[<
ffffffffa09989ea>] ? clip_add+0x2c/0x116 [cxgb4]
[<
ffffffffa0998782>] ? rcu_read_unlock+0x23/0x23 [cxgb4]
[<
ffffffff815d6ff8>] _raw_spin_lock+0x34/0x43
[<
ffffffffa09989ea>] ? clip_add+0x2c/0x116 [cxgb4]
[<
ffffffffa09987b0>] ? rcu_lock_acquire.constprop.40+0x2e/0x30 [cxgb4]
[<
ffffffffa0998782>] ? rcu_read_unlock+0x23/0x23 [cxgb4]
[<
ffffffffa09989ea>] clip_add+0x2c/0x116 [cxgb4]
[<
ffffffffa0998beb>] cxgb4_inet6addr_handler+0x117/0x12c [cxgb4]
[<
ffffffff810fd99d>] ? lock_acquire+0x94/0x9d
[<
ffffffff810e09b4>] ? raw_notifier_call_chain+0x11/0x11
[<
ffffffff815da98b>] notifier_call_chain+0x32/0x5c
[<
ffffffff815da9f9>] __atomic_notifier_call_chain+0x44/0x6e
[<
ffffffff815daa32>] atomic_notifier_call_chain+0xf/0x11
[<
ffffffff815b1356>] inet6addr_notifier_call_chain+0x16/0x18
[<
ffffffffa01f72e5>] ipv6_add_addr+0x404/0x46e [ipv6]
[<
ffffffff810fde6a>] ? trace_hardirqs_on+0xd/0xf
[<
ffffffffa01fb634>] addrconf_prefix_rcv+0x385/0x6ea [ipv6]
[<
ffffffffa0207950>] ndisc_rcv+0x9d3/0xd76 [ipv6]
[<
ffffffffa020d536>] icmpv6_rcv+0x592/0x67b [ipv6]
[<
ffffffff810c6c85>] ? __local_bh_enable_ip+0xaa/0xd9
[<
ffffffff810c6c85>] ? __local_bh_enable_ip+0xaa/0xd9
[<
ffffffff810fd8dc>] ? lock_release+0x14e/0x17b
[<
ffffffffa020df97>] ? rcu_read_unlock+0x21/0x23 [ipv6]
[<
ffffffff8150df52>] ? rcu_read_unlock+0x23/0x23
[<
ffffffffa01f4ede>] ip6_input_finish+0x1e4/0x2fc [ipv6]
[<
ffffffffa01f540b>] ip6_input+0x33/0x38 [ipv6]
[<
ffffffffa01f5557>] ip6_mc_input+0x147/0x160 [ipv6]
[<
ffffffffa01f4ba3>] ip6_rcv_finish+0x7c/0x81 [ipv6]
[<
ffffffffa01f5397>] ipv6_rcv+0x3a1/0x3e2 [ipv6]
[<
ffffffff8151ef96>] __netif_receive_skb_core+0x4ab/0x511
[<
ffffffff810fdc94>] ? mark_held_locks+0x71/0x99
[<
ffffffff8151f0c0>] ? process_backlog+0x69/0x15e
[<
ffffffff8151f045>] __netif_receive_skb+0x49/0x5b
[<
ffffffff8151f0cf>] process_backlog+0x78/0x15e
[<
ffffffff8151f571>] ? net_rx_action+0x1a2/0x1cc
[<
ffffffff8151f47b>] net_rx_action+0xac/0x1cc
[<
ffffffff810c69b7>] ? __do_softirq+0xad/0x218
[<
ffffffff810c69ff>] __do_softirq+0xf5/0x218
[<
ffffffff815ddafc>] do_softirq_own_stack+0x1c/0x30
<EOI> [<
ffffffff810c6bb6>] do_softirq+0x38/0x5d
[<
ffffffffa01f1d5b>] ? ip6_copy_metadata+0x156/0x156 [ipv6]
[<
ffffffff810c6c78>] __local_bh_enable_ip+0x9d/0xd9
[<
ffffffffa01f1d88>] rcu_read_unlock_bh+0x2d/0x2f [ipv6]
[<
ffffffffa01f28b4>] ip6_finish_output2+0x381/0x3d8 [ipv6]
[<
ffffffffa01f49ef>] ip6_finish_output+0x6e/0x73 [ipv6]
[<
ffffffffa01f4a70>] ip6_output+0x7c/0xa8 [ipv6]
[<
ffffffff815b1bfa>] dst_output+0x18/0x1c
[<
ffffffff815b1c9e>] ip6_local_out+0x1c/0x21
[<
ffffffffa01f2489>] ip6_push_pending_frames+0x37d/0x427 [ipv6]
[<
ffffffff81558af8>] ? skb_orphan+0x39/0x39
[<
ffffffffa020b85a>] ? rawv6_sendmsg+0x74b/0xa4d [ipv6]
[<
ffffffffa020ba51>] rawv6_sendmsg+0x942/0xa4d [ipv6]
[<
ffffffff81584cd2>] inet_sendmsg+0x3d/0x66
[<
ffffffff81508930>] __sock_sendmsg_nosec+0x25/0x27
[<
ffffffff8150b0d7>] sock_sendmsg+0x5a/0x7b
[<
ffffffff810fd8dc>] ? lock_release+0x14e/0x17b
[<
ffffffff8116d756>] ? might_fault+0x9e/0xa5
[<
ffffffff8116d70d>] ? might_fault+0x55/0xa5
[<
ffffffff81508cb1>] ? copy_from_user+0x2a/0x2c
[<
ffffffff8150b70c>] ___sys_sendmsg+0x226/0x2d9
[<
ffffffff810fcd25>] ? __lock_acquire+0x5ee/0xe48
[<
ffffffff810fde01>] ? trace_hardirqs_on_caller+0x145/0x1a1
[<
ffffffff8118efcb>] ? slab_free_hook.isra.71+0x50/0x59
[<
ffffffff8115c81f>] ? release_pages+0xbc/0x181
[<
ffffffff810fd99d>] ? lock_acquire+0x94/0x9d
[<
ffffffff81115e97>] ? read_seqcount_begin.constprop.25+0x73/0x90
[<
ffffffff8150c408>] __sys_sendmsg+0x3d/0x5b
[<
ffffffff8150c433>] SyS_sendmsg+0xd/0x19
[<
ffffffff815dc53d>] system_call_fastpath+0x1a/0x1f
Reported-by: Ben Greear <greearb@candelatech.com>
Cc: Casey Leedom <leedom@chelsio.com>
Cc: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suresh Reddy [Mon, 23 Jun 2014 11:11:29 +0000 (16:41 +0530)]
be2net: fix qnq mode detection on VFs
The driver (on PF or VF) needs to detect if the function is in qnq mode for
a HW hack in be_rx_compl_get() to work.
The driver queries this information using the GET_PROFILE_CONFIG cmd
(since the commit below can caused this regression.) But this cmd is not
available on VFs and so the VFs fail to detect qnq mode. This causes
vlan traffic to not work.
The fix is to use the the adapter->function_mode value queried via
QUERY_FIRMWARE_CONFIG cmd on both PFs and VFs to detect the qnq mode.
Also QNQ_MODE was incorrectly named FLEX10_MODE; correcting that too as the
fix reads much better with the name change.
Fixes: f93f160b5 ("refactor multi-channel config code for Skyhawk-R chip")
Signed-off-by: Suresh Reddy <Suresh.Reddy@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Richard Retanubun [Fri, 20 Jun 2014 14:11:07 +0000 (10:11 -0400)]
of: mdio: fixup of_phy_register_fixed_link parsing of new bindings
Fixes commit
3be2a49e5c08 ("of: provide a binding for fixed link PHYs")
Fix the parsing of the new fixed link dts bindings for duplex,
pause, and asym_pause by using the correct device node pointer.
Signed-off-by: Richard Retanubun <rretanubun.work@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Phoebe Buckheister [Wed, 18 Jun 2014 14:28:49 +0000 (16:28 +0200)]
at86rf230: fix irq setup
Commit
8eba0eefae24953962067 ("at86rf230: remove irq_type in
request_irq") removed the trigger configuration when requesting an irq,
and instead relied on the interrupt trigger to be properly configured
already. This does not seem to be an assumption that can be safely made,
since boards disable all interrupt triggers on boot.
On these boards, force the irq to trigger on rising edge, which is also
the default for the chip.
Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fengguang Wu [Sun, 22 Jun 2014 10:32:51 +0000 (12:32 +0200)]
net: phy: at803x: fix coccinelle warnings
drivers/net/phy/at803x.c:196:26-32: ERROR: application of sizeof to pointer
sizeof when applied to a pointer typed expression gives the size of
the pointer
Generated by: scripts/coccinelle/misc/noderef.cocci
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Daniel Mack <zonque@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz [Sun, 22 Jun 2014 10:21:34 +0000 (13:21 +0300)]
net/mlx4_core: Fix the error flow when probing with invalid VF configuration
Single ported VF are currently not supported on configurations where
one or both ports are IB. When we hit this case, the relevant flow in
the driver didn't return error and jumped to the wrong label. Fix that.
Fixes: dd41cc3 ('net/mlx4: Adapt num_vfs/probed_vf params for single port VF')
Reported-by: Shirley Ma <shirley.ma@oracle.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ondrej Zary [Sun, 22 Jun 2014 10:01:12 +0000 (12:01 +0200)]
tulip: Poll link status more frequently for Comet chips
It now takes up to 60 seconds to detect cable (un)plug on ADMtek Comet chips.
That's too slow and might cause people to think that it doesn't work at all.
Poll link status every 2 seconds instead of 60 for ADMtek Comet chips.
That should be fast enough while not stressing the system too much.
Tested with ADMtek AN983B.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Wed, 18 Jun 2014 12:21:24 +0000 (14:21 +0200)]
net: huawei_cdc_ncm: increase command buffer size
Messages from the modem exceeding 256 bytes cause communication
failure.
The WDM protocol is strictly "read on demand", meaning that we only
poll for unread data after receiving a notification from the modem.
Since we have no way to know how much data the modem has to send,
we must make sure that the buffer we provide is "big enough".
Message truncation does not work. Truncated messages are left unread
until the modem has another message to send. Which often won't
happen until the userspace application has given up waiting for the
final part of the last message, and therefore sends another command.
With a proper CDC WDM function there is a descriptor telling us
which buffer size the modem uses. But with this vendor specific
implementation there is no known way to calculate the exact "big
enough" number. It is an unknown property of the modem firmware.
Experience has shown that 256 is too small. The discussion of
this failure ended up concluding that 512 might be too small as
well. So 1024 seems like a reasonable value for now.
Fixes: 41c47d8cfd68 ("net: huawei_cdc_ncm: Introduce the huawei_cdc_ncm driver")
Cc: Enrico Mioso <mrkiko.rs@gmail.com>
Reported-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Acked-By: Enrico Mioso <mrkiko.rs@gmail.com>
Tested-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mugunthan V N [Wed, 18 Jun 2014 11:51:48 +0000 (17:21 +0530)]
drivers: net: cpsw: fix dual EMAC stall when connected to same switch
In commit
629c9a8fd0bbdfc6d702526b327470166ec39c6b (drivers: net: cpsw: Add
default vlan for dual emac case also), api cpsw_add_default_vlan() also
changes the port vlan which is required to seperate the ports which results
in the following behavior
In Dual EMAC mode, when both the Etnernet connected is connected to same
switch, it creates a loop in the switch and when a broadcast packet is
received it is forwarded to the other port which stalls the whole switch
and needs a reset/power cycle to the switch to recover. So intead of using
the api, add only the default VLAN entry in dual EMAC case.
Cc: Yegor Yefremov <yegorslists@googlemail.com>
Cc: Felipe Balbi <balbi@ti.com>
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Tested-by: Yegor Yefremov <yegorslists@googlemail.com>
Tested-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 21 Jun 2014 23:15:04 +0000 (16:15 -0700)]
Merge branch 'xen-netfront'
David Vrabel says:
====================
xen-netfront: fix resume regressions in 3.16-rc1
The introduction of multi-queue support to xen-netfront in 3.16-rc1,
broke resume/migration.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David Vrabel [Wed, 18 Jun 2014 09:47:28 +0000 (10:47 +0100)]
xen-netfront: recreate queues correctly when reconnecting
When reconnecting to the backend (after a resume/migration, for example),
a different number of queues may be required (since the guest may have
moved to a different host with different capabilities). During the
reconnection the old queues are torn down and new ones created.
Introduce xennet_create_queues() and xennet_destroy_queues() that fixes
three bugs during the reconnection.
- The old info->queues was leaked.
- The old queue's napi instances were not deleted.
- The new queue's napi instances were left disabled (which meant no
packets could be received).
The xennet_destroy_queues() calls is deferred until the reconnection
instead of the disconnection (in xennet_disconnect_backend()) because
napi_disable() might sleep.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Vrabel [Wed, 18 Jun 2014 09:47:27 +0000 (10:47 +0100)]
xen-netfront: fix oops when disconnected from backend
xennet_disconnect_backend() was not correctly iterating over all the
queues.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 21 Jun 2014 22:50:45 +0000 (15:50 -0700)]
Merge branch 'at803x'
Daniel Mack says:
====================
Handle stuck TX queue bug in AT8030 PHY
These three small patches circument a hardware bug in AT8030 PHYs that
leads to stuck TX FIFO queues when the link goes away while there are
pending patches in der outbound queue. This bug has been confirmed by
the vendor, and their only proposed fix is to apply a hardware reset
every time the link goes down.
v1 -> v2:
* Rename phy device callback from adjust_state to link_change_notify
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Mack [Wed, 18 Jun 2014 09:01:43 +0000 (11:01 +0200)]
net: phy: at803x: Add support for hardware reset
The AT8030 will enter a FIFO error mode if a packet is transmitted while
the cable is unplugged. This hardware issue is acknowledged by the
vendor, and the only proposed solution is to conduct a hardware reset
via the external pin each time the link goes down. There is apparantly
no way to fix up the state via the register set.
This patch adds support for reading a 'reset-gpios' property from the DT
node of the PHY. If present, this gpio is used to apply a hardware reset
each time a 'link down' condition is detected. All relevant registers
are read out before, and written back after the reset cycle.
Doing this every time the link goes down might seem like overkill, but
there is unfortunately no way of figuring out whether the PHY is in
such a lock-up state. Hence, this is the only way of reliably fixing up
things.
Signed-off-by: Daniel Mack <zonque@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Mack [Wed, 18 Jun 2014 09:01:42 +0000 (11:01 +0200)]
net: phy: at803x: use #defines for supported PHY ids
This removes magic values from two tables and also allows us to match
against specific PHY models at runtime.
Signed-off-by: Daniel Mack <zonque@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Mack [Wed, 18 Jun 2014 09:01:41 +0000 (11:01 +0200)]
net: phylib: add link_change_notify callback to phy device
Add a notify callback to inform phy drivers when the core is about to
do its link adjustment. No change for drivers that do not implement
this callback.
Signed-off-by: Daniel Mack <zonque@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Wed, 18 Jun 2014 05:46:02 +0000 (13:46 +0800)]
8021q: fix a potential memory leak
skb_cow called in vlan_reorder_header does not free the skb when it failed,
and vlan_reorder_header returns NULL to reset original skb when it is called
in vlan_untag, lead to a memory leak.
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 20 Jun 2014 04:32:21 +0000 (21:32 -0700)]
Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless
John W. Linville says:
====================
pull request: wireless 2014-06-18
Please pull this batch of fixes intended for the 3.16 stream!
For the Bluetooth bits, Gustavo says:
"This is our first batch of fixes for 3.16. Be aware that two patches here
are not exactly bugfixes:
*
71f28af57066 Bluetooth: Add clarifying comment for conn->auth_type
This commit just add some important security comments to the code, we found
it important enough to include it here for 3.16 since it is security related.
*
9f7ec8871132 Bluetooth: Refactor discovery stopping into its own function
This commit is just a refactor in a preparation for a fix in the next
commit (
f8680f128b).
All the other patches are fixes for deadlocks and for the Bluetooth protocols,
most of them related to authentication and encryption."
On top of that...
Chin-Ran Lo fixes a problems with overlapping DMA areas in mwifiex.
Michael Braun corrects a couple of issues in order to enable a new
device in rt2800usb.
Rafał Miłecki reverts a b43 patch that caused a regression, fixes a
Kconfig typo, and corrects a frequency reporting error with the G-PHY.
Stanislaw Grsuzka fixes an rfkill regression for rt2500pci, and avoids
a rt2x00 scheduling while atomic BUG.
Please let me know if there are problems!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 18 Jun 2014 21:46:31 +0000 (23:46 +0200)]
net: sctp: check proc_dointvec result in proc_sctp_do_auth
When writing to the sysctl field net.sctp.auth_enable, it can well
be that the user buffer we handed over to proc_dointvec() via
proc_sctp_do_auth() handler contains something other than integers.
In that case, we would set an uninitialized 4-byte value from the
stack to net->sctp.auth_enable that can be leaked back when reading
the sysctl variable, and it can unintentionally turn auth_enable
on/off based on the stack content since auth_enable is interpreted
as a boolean.
Fix it up by making sure proc_dointvec() returned sucessfully.
Fixes: b14878ccb7fa ("net: sctp: cache auth_enable per endpoint")
Reported-by: Florian Westphal <fwestpha@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prashant Sreedharan [Thu, 19 Jun 2014 01:38:13 +0000 (18:38 -0700)]
tg3: Clear NETIF_F_TSO6 flag before doing software GSO
Commit
d3f6f3a1d818410c17445bce4f4caab52eb102f1 ("tg3: Prevent page
allocation failure during TSO workaround") modified driver logic
to use tg3_tso_bug() for any TSO fragment that hits hardware bug
conditions thus the patch increased the scope of work for tg3_tso_bug()
to cover devices that support NETIF_F_TSO6 as well. Prior to the
patch, tg3_tso_bug() would only be used on devices supporting
NETIF_F_TSO.
A regression was introduced for IPv6 packets requiring the workaround.
To properly perform GSO on SKBs with TCPV6 gso_type, we need to call
skb_gso_segment() with NETIF_F_TSO6 feature flag cleared, or the
function will return NULL and cause a kernel oops as tg3 is not handling
a NULL return value. This patch fixes the problem.
Signed-off-by: Prashant Sreedharan <prashant@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neal Cardwell [Thu, 19 Jun 2014 01:15:03 +0000 (21:15 -0400)]
tcp: fix tcp_match_skb_to_sack() for unaligned SACK at end of an skb
If there is an MSS change (or misbehaving receiver) that causes a SACK
to arrive that covers the end of an skb but is less than one MSS, then
tcp_match_skb_to_sack() was rounding up pkt_len to the full length of
the skb ("Round if necessary..."), then chopping all bytes off the skb
and creating a zero-byte skb in the write queue.
This was visible now because the recently simplified TLP logic in
bef1909ee3ed1c ("tcp: fixing TLP's FIN recovery") could find that 0-byte
skb at the end of the write queue, and now that we do not check that
skb's length we could send it as a TLP probe.
Consider the following example scenario:
mss: 1000
skb: seq: 0 end_seq: 4000 len: 4000
SACK: start_seq: 3999 end_seq: 4000
The tcp_match_skb_to_sack() code will compute:
in_sack = false
pkt_len = start_seq - TCP_SKB_CB(skb)->seq = 3999 - 0 = 3999
new_len = (pkt_len / mss) * mss = (3999/1000)*1000 = 3000
new_len += mss = 4000
Previously we would find the new_len > skb->len check failing, so we
would fall through and set pkt_len = new_len = 4000 and chop off
pkt_len of 4000 from the 4000-byte skb, leaving a 0-byte segment
afterward in the write queue.
With this new commit, we notice that the new new_len >= skb->len check
succeeds, so that we return without trying to fragment.
Fixes: adb92db857ee ("tcp: Make SACK code to split only at mss boundaries")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Ilpo Jarvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 20 Jun 2014 01:12:15 +0000 (18:12 -0700)]
Revert "net: return actual error on register_queue_kobjects"
This reverts commit
d36a4f4b472334562b8e7252e35d3d770db83815.
Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Wed, 18 Jun 2014 22:34:57 +0000 (15:34 -0700)]
net: filter: fix upper BPF instruction limit
The original checks (via sk_chk_filter) for instruction count uses ">",
not ">=", so changing this in sk_convert_filter has the potential to break
existing seccomp filters that used exactly BPF_MAXINSNS many instructions.
Fixes: bd4cf0ed331a ("net: filter: rework/optimize internal BPF interpreter's instruction set")
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org # v3.15+
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 18 Jun 2014 23:31:30 +0000 (01:31 +0200)]
net: sctp: propagate sysctl errors from proc_do* properly
sysctl handler proc_sctp_do_hmac_alg(), proc_sctp_do_rto_min() and
proc_sctp_do_rto_max() do not properly reflect some error cases
when writing values via sysctl from internal proc functions such
as proc_dointvec() and proc_dostring().
In all these cases we pass the test for write != 0 and partially
do additional work just to notice that additional sanity checks
fail and we return with hard-coded -EINVAL while proc_do*
functions might also return different errors. So fix this up by
simply testing a successful return of proc_do* right after
calling it.
This also allows to propagate its return value onwards to the user.
While touching this, also fix up some minor style issues.
Fixes: 4f3fdf3bc59c ("sctp: add check rto_min and rto_max in sysctl")
Fixes: 3c68198e7511 ("sctp: Make hmac algorithm selection for cookie generation dynamic")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jie Liu [Tue, 17 Jun 2014 14:32:42 +0000 (22:32 +0800)]
net: return actual error on register_queue_kobjects
Return the actual error code if call kset_create_and_add() failed
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz [Tue, 17 Jun 2014 13:11:09 +0000 (16:11 +0300)]
bonding: Advertize vxlan offload features when supported
When the underlying device supports TCP offloads for VXLAN/UDP
encapulated traffic, we need to reflect that through the hw_enc_features
field of the bonding net-device. This will cause the xmit path
in the core networking stack to provide bonding with encapsulated
GSO frames to offload into the HW etc.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mirko Lindner [Tue, 17 Jun 2014 10:53:39 +0000 (12:53 +0200)]
skge: Added FS A8NE-FM to the list of 32bit DMA boards
Added FUJITSU SIEMENS A8NE-FM to the list of 32bit DMA boards
>From Tomi O.:
After I added an entry to this MB into the skge.c
driver in order to enable the mentioned 64bit dma disable quirk,
the network data corruptions ended and everything is fine again.
Signed-off-by: Mirko Lindner <mlindner@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 18 Jun 2014 23:07:31 +0000 (16:07 -0700)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
netfilter fixes for net
The following patchset contains netfilter updates for your net tree,
they are:
1) Fix refcount leak when dumping the dying/unconfirmed conntrack lists,
from Florian Westphal.
2) Fix crash in NAT when removing a netnamespace, also from Florian.
3) Fix a crash in IPVS when trying to remove an estimator out of the
sysctl scope, from Julian Anastasov.
4) Add zone attribute to the routing to calculate the message size in
ctnetlink events, from Ken-ichirou MATSUZAWA.
5) Another fix for the dying/unconfirmed list which was preventing to
dump more than one memory page of entries (~17 entries in x86_64).
6) Fix missing RCU-safe list insertion in the rule replacement code
in nf_tables.
7) Since the new transaction infrastructure is in place, we have to
upgrade the chain use counter from u16 to u32 to avoid overflow
after more than 2^16 rules are added.
8) Fix refcount leak when replacing rule in nf_tables. This problem
was also introduced in new transaction.
9) Call the ->destroy() callback when releasing nft-xt rules to fix
module refcount leaks.
10) Set the family in the netlink messages that contain set elements
in nf_tables to make it consistent with other object types.
11) Don't dump NAT port information if it is unset in nft_nat.
12) Update the MAINTAINERS file, I have merged the ebtables entry
into netfilter. While at it, also removed the netfilter users
mailing list, the development list should be enough.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville [Wed, 18 Jun 2014 18:39:25 +0000 (14:39 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless into for-davem
Pablo Neira Ayuso [Mon, 16 Jun 2014 11:01:52 +0000 (13:01 +0200)]
MAINTAINERS: merge ebtables into netfilter entry
Moreover, remove reference to the netfilter users mailing list,
so they don't receive patches.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fugang Duan [Wed, 18 Jun 2014 00:33:52 +0000 (08:33 +0800)]
net: fec: Don't clear IPV6 header checksum field when IP accelerator enable
The commit
96c50caa5148 (net: fec: Enable IP header hardware checksum)
enable HW IP header checksum for IPV4 and IPV6, which causes IPV6 TCP/UDP
cannot work. (The issue is reported by Russell King)
For FEC IP header checksum function: Insert IP header checksum. This "IINS"
bit is written by the user. If set, IP accelerator calculates the IP header
checksum and overwrites the IINS corresponding header field with the calculated
value. The checksum field must be cleared by user, otherwise the checksum
always is 0xFFFF.
So the previous patch clear IP header checksum field regardless of IP frame
type.
In fact, IP HW detect the packet as IPV6 type, even if the "IINS" bit is set,
the IP accelerator is not triggered to calculates IPV6 header checksum because
IPV6 frame format don't have checksum.
So this results in the IPV6 frame being corrupted.
The patch just add software detect the current packet type, if it is IPV6
frame, it don't clear IP header checksum field.
Cc: Russell King <linux@arm.linux.org.uk>
Reported-and-tested-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jean Delvare [Tue, 17 Jun 2014 09:59:20 +0000 (11:59 +0200)]
ptp: ptp_pch depends on x86_32
The ptp_pch driver is for a companion chip to the Intel Atom E600
series processors. These are 32-bit x86 processors so the driver is
only needed on X86_32.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stanislaw Gruszka [Mon, 16 Jun 2014 16:45:15 +0000 (18:45 +0200)]
rt2x00: fix rfkill regression on rt2500pci
As reported by Niels, starting rfkill polling during device probe
(commit
e2bc7c5, generally sane change) broke rfkill on rt2500pci
device. I considered that bug as some initalization issue, which
should be fixed on rt2500pci specific code. But after several
attempts (see bug report for details) we fail to find working solution.
Hence I decided to revert to old behaviour on rt2500pci to fix
regression.
Additionally patch also unregister rfkill on device remove instead
of ifconfig down, what was another issue introduced by bad commit.
Bug report:
https://bugzilla.kernel.org/show_bug.cgi?id=73821
Fixes: e2bc7c5f3cb8 ("rt2x00: Fix rfkill_polling register function.")
Cc: stable@vger.kernel.org
Bisected-by: Niels <nille0386@googlemail.com>
Reported-and-tested-by: Niels <nille0386@googlemail.com>
Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Rafał Miłecki [Thu, 12 Jun 2014 20:28:22 +0000 (22:28 +0200)]
b43: fix frequency reported on G-PHY with /new/ firmware
Support for firmware rev 508+ was added years ago, but we never noticed
it reports channel in a different way for G-PHY devices. Instead of
offset from 2400 MHz it simply passes channel id (AKA hw_value).
So far it was (most probably) affecting monitor mode users only, but
the following recent commit made it noticeable for quite everybody:
commit
3afc2167f60a327a2c1e1e2600ef209a3c2b75b7
Author: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Date: Tue Mar 4 16:50:13 2014 +0200
cfg80211/mac80211: ignore signal if the frame was heard on wrong channel
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Cc: stable@vger.kernel.org
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Michael Braun [Thu, 12 Jun 2014 17:33:41 +0000 (19:33 +0200)]
rt2800usb:fix hang during firmware load
The device 057c:8501 (AVM Fritz! WLAN v2 rev. B) boots into a state that does
not actually require loading a firmware file. The vendors driver finds out
about this by checking a firmware state register, so this patch adds this here.
Finally, with this patch applied, my wifi dongle actually becomes
useful (scan + connect to wpa network works).
Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
Acked-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Michael Braun [Thu, 12 Jun 2014 17:33:36 +0000 (19:33 +0200)]
rt2800usb:fix efuse detection
The device 057c:8501 (AVM Fritz! WLAN v2 rev. B) currently does not
load. One thing observed is that the vendors driver detects EFUSE mode
for this device, but rt2800usb does not. This is due to rt2800usb
lacking a check for the firmware mode present in the vendors driver,
that this patch adopts for rt2800usb.
With this patch applied, the 'RF chipset' detection does no longer fail.
Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
Acked-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
John W. Linville [Tue, 17 Jun 2014 18:08:47 +0000 (14:08 -0400)]
Merge branch 'for-upstream' of git://git./linux/kernel/git/bluetooth/bluetooth
Dave Jones [Mon, 16 Jun 2014 20:59:02 +0000 (16:59 -0400)]
hyperv: fix apparent cut-n-paste error in send path teardown
c25aaf814a63: "hyperv: Enable sendbuf mechanism on the send path" added
some teardown code that looks like it was copied from the recieve path
above, but missed a variable name replacement.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Jones [Mon, 16 Jun 2014 20:30:36 +0000 (16:30 -0400)]
tcp: remove unnecessary tcp_sk assignment.
This variable is overwritten by the child socket assignment before
it ever gets used.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chris Metcalf [Mon, 16 Jun 2014 17:14:05 +0000 (13:14 -0400)]
net: tile: fix unused variable warning
'i' is unused in tile_net_dev_init() after commit
d581ebf5a1f
("net: tile: Use helpers from linux/etherdevice.h to check/set MAC").
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Riesch [Mon, 16 Jun 2014 12:46:45 +0000 (14:46 +0200)]
ptp: In the testptp utility, use clock_adjtime from glibc when available
clock_adjtime was included in glibc 2.14. _GNU_SOURCE must be defined
to make it available.
Signed-off-by: Christian Riesch <christian.riesch@omicron.at>
Cc: Richard Cochran <richardcochran@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jean Delvare [Mon, 16 Jun 2014 08:13:55 +0000 (10:13 +0200)]
isdn: hisax: Drop duplicate Kconfig entry
There are 2 HISAX_AVM_A1_PCMCIA Kconfig entries. The kbuild system
ignores the second one, and apparently nobody noticed the problem so
far, so let's remove that second entry.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Karsten Keil <isdn@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jean Delvare [Mon, 16 Jun 2014 08:09:37 +0000 (10:09 +0200)]
isdn: hisax: Merge Kconfig ifs
The first half of the HiSax config options is presented if
ISDN_DRV_HISAX!=n, while the second half of the options is presented
if ISDN_DRV_HISAX. That's the same, so merge both conditionals.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Karsten Keil <isdn@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tyler Hall [Mon, 16 Jun 2014 02:23:17 +0000 (22:23 -0400)]
slcan: Port write_wakeup deadlock fix from slip
The commit "slip: Fix deadlock in write_wakeup" fixes a deadlock caused
by a change made in both slcan and slip. This is a direct port of that
fix.
Signed-off-by: Tyler Hall <tylerwhall@gmail.com>
Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: Andre Naujoks <nautsch2@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Tyler Hall [Mon, 16 Jun 2014 02:23:16 +0000 (22:23 -0400)]
slip: Fix deadlock in write_wakeup
Use schedule_work() to avoid potentially taking the spinlock in
interrupt context.
Commit
cc9fa74e2a ("slip/slcan: added locking in wakeup function") added
necessary locking to the wakeup function and
367525c8c2/
ddcde142be ("can:
slcan: Fix spinlock variant") converted it to spin_lock_bh() because the lock
is also taken in timers.
Disabling softirqs is not sufficient, however, as tty drivers may call
write_wakeup from interrupt context. This driver calls tty->ops->write() with
its spinlock held, which may immediately cause an interrupt on the same CPU and
subsequent spin_bug().
Simply converting to spin_lock_irq/irqsave() prevents this deadlock, but
causes lockdep to point out a possible circular locking dependency
between these locks:
(&(&sl->lock)->rlock){-.....}, at: slip_write_wakeup
(&port_lock_key){-.....}, at: serial8250_handle_irq.part.13
The slip transmit is holding the slip spinlock when calling the tty write.
This grabs the port lock. On an interrupt, the handler grabs the port
lock and calls write_wakeup which grabs the slip lock. This could be a
problem if a serial interrupt occurs on another CPU during the slip
transmit.
To deal with these issues, don't grab the lock in the wakeup function by
deferring the writeout to a workqueue. Also hold the lock during close
when de-assigning the tty pointer to safely disarm the worker and
timers.
This bug is easily reproducible on the first transmit when slip is
used with the standard 8250 serial driver.
[<
c0410b7c>] (spin_bug+0x0/0x38) from [<
c006109c>] (do_raw_spin_lock+0x60/0x1d0)
r5:
eab27000 r4:
ec02754c
[<
c006103c>] (do_raw_spin_lock+0x0/0x1d0) from [<
c04185c0>] (_raw_spin_lock+0x28/0x2c)
r10:
0000001f r9:
eabb814c r8:
eabb8140 r7:
40070193 r6:
ec02754c r5:
eab27000
r4:
ec02754c r3:
00000000
[<
c0418598>] (_raw_spin_lock+0x0/0x2c) from [<
bf3a0220>] (slip_write_wakeup+0x50/0xe0 [slip])
r4:
ec027540 r3:
00000003
[<
bf3a01d0>] (slip_write_wakeup+0x0/0xe0 [slip]) from [<
c026e420>] (tty_wakeup+0x48/0x68)
r6:
00000000 r5:
ea80c480 r4:
eab27000 r3:
bf3a01d0
[<
c026e3d8>] (tty_wakeup+0x0/0x68) from [<
c028a8ec>] (uart_write_wakeup+0x2c/0x30)
r5:
ed68ea90 r4:
c06790d8
[<
c028a8c0>] (uart_write_wakeup+0x0/0x30) from [<
c028dc44>] (serial8250_tx_chars+0x114/0x170)
[<
c028db30>] (serial8250_tx_chars+0x0/0x170) from [<
c028dffc>] (serial8250_handle_irq+0xa0/0xbc)
r6:
000000c2 r5:
00000060 r4:
c06790d8 r3:
00000000
[<
c028df5c>] (serial8250_handle_irq+0x0/0xbc) from [<
c02933a4>] (dw8250_handle_irq+0x38/0x64)
r7:
00000000 r6:
edd2f390 r5:
000000c2 r4:
c06790d8
[<
c029336c>] (dw8250_handle_irq+0x0/0x64) from [<
c028d2f4>] (serial8250_interrupt+0x44/0xc4)
r6:
00000000 r5:
00000000 r4:
c06791c4 r3:
c029336c
[<
c028d2b0>] (serial8250_interrupt+0x0/0xc4) from [<
c0067fe4>] (handle_irq_event_percpu+0xb4/0x2b0)
r10:
c06790d8 r9:
eab27000 r8:
00000000 r7:
00000000 r6:
0000001f r5:
edd52980
r4:
ec53b6c0 r3:
c028d2b0
[<
c0067f30>] (handle_irq_event_percpu+0x0/0x2b0) from [<
c006822c>] (handle_irq_event+0x4c/0x6c)
r10:
c06790d8 r9:
eab27000 r8:
c0673ae0 r7:
c05c2020 r6:
ec53b6c0 r5:
edd529d4
r4:
edd52980
[<
c00681e0>] (handle_irq_event+0x0/0x6c) from [<
c006b140>] (handle_level_irq+0xe8/0x100)
r6:
00000000 r5:
edd529d4 r4:
edd52980 r3:
00022000
[<
c006b058>] (handle_level_irq+0x0/0x100) from [<
c00676f8>] (generic_handle_irq+0x30/0x40)
r5:
0000001f r4:
0000001f
[<
c00676c8>] (generic_handle_irq+0x0/0x40) from [<
c000f57c>] (handle_IRQ+0xd0/0x13c)
r4:
ea997b18 r3:
000000e0
[<
c000f4ac>] (handle_IRQ+0x0/0x13c) from [<
c00086c4>] (armada_370_xp_handle_irq+0x4c/0x118)
r8:
000003ff r7:
ea997b18 r6:
ffffffff r5:
60070013 r4:
c0674dc0
[<
c0008678>] (armada_370_xp_handle_irq+0x0/0x118) from [<
c0013840>] (__irq_svc+0x40/0x70)
Exception stack(0xea997b18 to 0xea997b60)
7b00:
00000001 20070013
7b20:
00000000 0000000b 20070013 eab27000 20070013 00000000 ed10103e eab27000
7b40:
c06790d8 ea997b74 ea997b60 ea997b60 c04186c0 c04186c8 60070013 ffffffff
r9:
eab27000 r8:
ed10103e r7:
ea997b4c r6:
ffffffff r5:
60070013 r4:
c04186c8
[<
c04186a4>] (_raw_spin_unlock_irqrestore+0x0/0x54) from [<
c0288fc0>] (uart_start+0x40/0x44)
r4:
c06790d8 r3:
c028ddd8
[<
c0288f80>] (uart_start+0x0/0x44) from [<
c028982c>] (uart_write+0xe4/0xf4)
r6:
0000003e r5:
00000000 r4:
ed68ea90 r3:
0000003e
[<
c0289748>] (uart_write+0x0/0xf4) from [<
bf3a0d20>] (sl_xmit+0x1c4/0x228 [slip])
r10:
ed388e60 r9:
0000003c r8:
ffffffdd r7:
0000003e r6:
ec02754c r5:
ea717eb8
r4:
ec027000
[<
bf3a0b5c>] (sl_xmit+0x0/0x228 [slip]) from [<
c0368d74>] (dev_hard_start_xmit+0x39c/0x6d0)
r8:
eaf163c0 r7:
ec027000 r6:
ea717eb8 r5:
00000000 r4:
00000000
Signed-off-by: Tyler Hall <tylerwhall@gmail.com>
Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: Andre Naujoks <nautsch2@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Neil Horman [Fri, 13 Jun 2014 14:03:21 +0000 (10:03 -0400)]
vmxnet3: adjust ring sizes when interface is down
If ethtool is used to update ring sizes on a vmxnet3 interface that isn't
running, the change isn't stored, meaning the ring update is effectively is
ignored and lost without any indication to the user.
Other network drivers store the ring size update so that ring allocation uses
the new sizes next time the interface is brought up. This patch modifies
vmxnet3 to behave this way as well
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Shreyas Bhatewara <sbhatewara@vmware.com>
CC: "VMware, Inc." <pv-drivers@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rafał Miłecki [Tue, 10 Jun 2014 11:32:05 +0000 (13:32 +0200)]
b43: fix typo in Kconfig (make B43_BUSES_BCMA_AND_SSB the default for real)
Reported-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Stanislaw Gruszka [Tue, 10 Jun 2014 10:51:06 +0000 (12:51 +0200)]
rt2x00: disable TKIP on USB
On USB we can not get atomically TKIP key. We have to disable support
for TKIP acceleration on USB hardware to avoid bug as showed bellow.
[ 860.827243] BUG: scheduling while atomic: hostapd/3397/0x00000002
<snip>
[ 860.827280] Call Trace:
[ 860.827282] [<
ffffffff81682ea6>] dump_stack+0x4d/0x66
[ 860.827284] [<
ffffffff8167eb9b>] __schedule_bug+0x47/0x55
[ 860.827285] [<
ffffffff81685bb3>] __schedule+0x733/0x7b0
[ 860.827287] [<
ffffffff81685c59>] schedule+0x29/0x70
[ 860.827289] [<
ffffffff81684f8a>] schedule_timeout+0x15a/0x2b0
[ 860.827291] [<
ffffffff8105ac50>] ? ftrace_raw_event_tick_stop+0xc0/0xc0
[ 860.827294] [<
ffffffff810c13c2>] ? __module_text_address+0x12/0x70
[ 860.827296] [<
ffffffff81686823>] wait_for_completion_timeout+0xb3/0x140
[ 860.827298] [<
ffffffff81080fc0>] ? wake_up_state+0x20/0x20
[ 860.827301] [<
ffffffff814d5b3d>] usb_start_wait_urb+0x7d/0x150
[ 860.827303] [<
ffffffff814d5cd5>] usb_control_msg+0xc5/0x110
[ 860.827305] [<
ffffffffa02fb0c6>] rt2x00usb_vendor_request+0xc6/0x160 [rt2x00usb]
[ 860.827307] [<
ffffffffa02fb215>] rt2x00usb_vendor_req_buff_lock+0x75/0x150 [rt2x00usb]
[ 860.827309] [<
ffffffffa02fb393>] rt2x00usb_vendor_request_buff+0xa3/0xe0 [rt2x00usb]
[ 860.827311] [<
ffffffffa023d1a3>] rt2x00usb_register_multiread+0x33/0x40 [rt2800usb]
[ 860.827314] [<
ffffffffa05805f9>] rt2800_get_tkip_seq+0x39/0x50 [rt2800lib]
[ 860.827321] [<
ffffffffa0480f88>] ieee80211_get_key+0x218/0x2a0 [mac80211]
[ 860.827322] [<
ffffffff815cc68c>] ? __nlmsg_put+0x6c/0x80
[ 860.827329] [<
ffffffffa051b02e>] nl80211_get_key+0x22e/0x360 [cfg80211]
Cc: stable@vger.kernel.org
Reported-and-tested-by: Peter Wu <lekensteyn@gmail.com>
Reported-and-tested-by: Pontus Fuchs <pontus.fuchs@gmail.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Chin-Ran Lo [Sat, 7 Jun 2014 02:37:10 +0000 (19:37 -0700)]
mwifiex: fix tx_info/rx_info overlap with PCIe dma_mapping
On PCIe Tx data path, network interface specific tx_info
parameters such as bss_num and bss_type are saved at
"skb->cb + sizeof(dma_addr_t)" (returned by MWIFIEX_SKB_TXCB).
Later mwifiex_map_pci_memory() called from
mwifiex_pcie_send_data() will memcpy
sizeof(struct mwifiex_dma_mapping) bytes to save PCIe DMA
address and length information at beginning of skb->cb.
This accidently overwrites bss_num and bss_type saved in skb->cb
previously because bss_num/bss_type and mwifiex_dma_mapping data
overlap.
Similarly, on PCIe Rx data path, rx_info parameters overlaps
with PCIe DMA address and length information too.
Fix it by defining mwifiex_cb structure and having
MWIFIEX_SKB_TXCB and MWIFIEX_SKB_RXCB return the correct address
of tx_info/rx_info using the structure members.
Also add a BUILD_BUG_ON to maks sure that mwifiex_cb structure
doesn't exceed the size of skb->cb.
Reviewed-by: Aaron Durbin <adurbin@chromium.org>
Signed-off-by: Chin-Ran Lo <crlo@marvell.com>
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Rafał Miłecki [Sat, 31 May 2014 17:40:45 +0000 (19:40 +0200)]
b43: disable 5 GHz on G-PHY
This fixes regression introduced by adding some G-PHY devices to the
list of dual band devices. There is simply no support for 5 GHz on
G-PHY devices in b43. It results in:
WARNING: CPU: 0 PID: 79 at drivers/net/wireless/b43/phy_g.c:75 b43_gphy_channel_switch+0x125/0x130 [b43]()
b43-phy1 ERROR: PHY init: Channel switch to default failed
Regression was introduced by the following commit:
commit
773cfc508f4d64c14547ff8751b5cbd473124364
Author: Rafał Miłecki <zajec5@gmail.com>
Date: Mon May 19 23:18:55 2014 +0200
b43: add more devices to the bands database
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Florian Westphal [Sat, 7 Jun 2014 19:17:04 +0000 (21:17 +0200)]
netfilter: nf_nat: fix oops on netns removal
Quoting Samu Kallio:
Basically what's happening is, during netns cleanup,
nf_nat_net_exit gets called before ipv4_net_exit. As I understand
it, nf_nat_net_exit is supposed to kill any conntrack entries which
have NAT context (through nf_ct_iterate_cleanup), but for some
reason this doesn't happen (perhaps something else is still holding
refs to those entries?).
When ipv4_net_exit is called, conntrack entries (including those
with NAT context) are cleaned up, but the
nat_bysource hashtable is long gone - freed in nf_nat_net_exit. The
bug happens when attempting to free a conntrack entry whose NAT hash
'prev' field points to a slot in the freed hash table (head for that
bin).
We ignore conntracks with null nat bindings. But this is wrong,
as these are in bysource hash table as well.
Restore nat-cleaning for the netns-is-being-removed case.
bug:
https://bugzilla.kernel.org/show_bug.cgi?id=65191
Fixes: c2d421e1718 ('netfilter: nf_nat: fix race when unloading protocol modules')
Reported-by: Samu Kallio <samu.kallio@aberdeencloud.com>
Debugged-by: Samu Kallio <samu.kallio@aberdeencloud.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Tested-by: Samu Kallio <samu.kallio@aberdeencloud.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Ken-ichirou MATSUZAWA [Mon, 16 Jun 2014 11:52:34 +0000 (13:52 +0200)]
netfilter: ctnetlink: add zone size to length
Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Mon, 16 Jun 2014 11:20:39 +0000 (13:20 +0200)]
Merge branch 'ipvs'
Simon Horman says:
====================
Fix for panic due use of tot_stats estimator outside of CONFIG_SYSCTL
It has been present since v3.6.39.
====================
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Fri, 13 Jun 2014 11:45:38 +0000 (13:45 +0200)]
netfilter: nft_nat: don't dump port information if unset
Don't include port information attributes if they are unset.
Reported-by: Ana Rey <anarey@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Wed, 11 Jun 2014 17:05:28 +0000 (19:05 +0200)]
netfilter: nf_tables: indicate family when dumping set elements
Set the nfnetlink header that indicates the family of this element.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Wed, 11 Jun 2014 12:27:46 +0000 (14:27 +0200)]
netfilter: nft_compat: call {target, match}->destroy() to cleanup entry
Otherwise, the reference to external objects (eg. modules) are not
released when the rules are removed.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Tue, 10 Jun 2014 08:53:03 +0000 (10:53 +0200)]
netfilter: nf_tables: fix wrong type in transaction when replacing rules
In
b380e5c ("netfilter: nf_tables: add message type to transactions"),
I used the wrong message type in the rule replacement case. The rule
that is replaced needs to be handled as a deleted rule.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Tue, 10 Jun 2014 08:53:02 +0000 (10:53 +0200)]
netfilter: nf_tables: decrement chain use counter when replacing rules
Thus, the chain use counter remains with the same value after the
rule replacement.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Tue, 10 Jun 2014 08:53:01 +0000 (10:53 +0200)]
netfilter: nf_tables: use u32 for chain use counter
Since
4fefee5 ("netfilter: nf_tables: allow to delete several objects
from a batch"), every new rule bumps the chain use counter. However,
this is limited to 16 bits, which means that it will overrun after
2^16 rules.
Use a u32 chain counter and check for overflows (just like we do for
table objects).
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Tue, 10 Jun 2014 08:53:00 +0000 (10:53 +0200)]
netfilter: nf_tables: use RCU-safe list insertion when replacing rules
The patch
5e94846 ("netfilter: nf_tables: add insert operation") did
not include RCU-safe list insertion when replacing rules.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Sun, 8 Jun 2014 09:41:23 +0000 (11:41 +0200)]
netfilter: ctnetlink: fix refcnt leak in dying/unconfirmed list dumper
'last' keeps track of the ct that had its refcnt bumped during previous
dump cycle. Thus it must not be overwritten until end-of-function.
Another (unrelated, theoretical) issue: Don't attempt to bump refcnt of a conntrack
whose reference count is already 0. Such conntrack is being destroyed
right now, its memory is freed once we release the percpu dying spinlock.
Fixes: b7779d06 ('netfilter: conntrack: spinlock per cpu to protect special lists.')
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Thu, 5 Jun 2014 12:28:44 +0000 (14:28 +0200)]
netfilter: ctnetlink: fix dumping of dying/unconfirmed conntracks
The dumping prematurely stops, it seems the callback argument that
indicates that all entries have been dumped is set after iterating
on the first cpu list. The dumping also may stop before the entire
per-cpu list content is also dumped.
With this patch, conntrack -L dying now shows the dying list content
again.
Fixes: b7779d06 ("netfilter: conntrack: spinlock per cpu to protect special lists.")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Linus Torvalds [Mon, 16 Jun 2014 03:45:28 +0000 (17:45 -1000)]
Linux 3.16-rc1
Linus Torvalds [Mon, 16 Jun 2014 02:37:03 +0000 (16:37 -1000)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix checksumming regressions, from Tom Herbert.
2) Undo unintentional permissions changes for SCTP rto_alpha and
rto_beta sysfs knobs, from Denial Borkmann.
3) VXLAN, like other IP tunnels, should advertize it's encapsulation
size using dev->needed_headroom instead of dev->hard_header_len.
From Cong Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: sctp: fix permissions for rto_alpha and rto_beta knobs
vxlan: Checksum fixes
net: add skb_pop_rcv_encapsulation
udp: call __skb_checksum_complete when doing full checksum
net: Fix save software checksum complete
net: Fix GSO constants to match NETIF flags
udp: ipv4: do not waste time in __udp4_lib_mcast_demux_lookup
vxlan: use dev->needed_headroom instead of dev->hard_header_len
MAINTAINERS: update cxgb4 maintainer
Linus Torvalds [Mon, 16 Jun 2014 02:02:20 +0000 (16:02 -1000)]
Merge tag 'clk-for-linus-3.16-part2' of git://git.linaro.org/people/mike.turquette/linux
Pull more clock framework updates from Mike Turquette:
"This contains the second half the of the clk changes for 3.16.
They are simply fixes and code refactoring for the OMAP clock drivers.
The sunxi clock driver changes include splitting out the one
mega-driver into several smaller pieces and adding support for the A31
SoC clocks"
* tag 'clk-for-linus-3.16-part2' of git://git.linaro.org/people/mike.turquette/linux: (25 commits)
clk: sunxi: document PRCM clock compatible strings
clk: sunxi: add PRCM (Power/Reset/Clock Management) clks support
clk: sun6i: Protect SDRAM gating bit
clk: sun6i: Protect CPU clock
clk: sunxi: Rework clock protection code
clk: sunxi: Move the GMAC clock to a file of its own
clk: sunxi: Move the 24M oscillator to a file of its own
clk: sunxi: Remove calls to clk_put
clk: sunxi: document new A31 USB clock compatible
clk: sunxi: Implement A31 USB clock
ARM: dts: OMAP5/DRA7: use omap5-mpu-dpll-clock capable of dealing with higher frequencies
CLK: TI: dpll: support OMAP5 MPU DPLL that need special handling for higher frequencies
ARM: OMAP5+: dpll: support Duty Cycle Correction(DCC)
CLK: TI: clk-54xx: Set the rate for dpll_abe_m2x2_ck
CLK: TI: Driver for DRA7 ATL (Audio Tracking Logic)
dt:/bindings: DRA7 ATL (Audio Tracking Logic) clock bindings
ARM: dts: dra7xx-clocks: Correct name for atl clkin3 clock
CLK: TI: gate: add composite interface clock to OMAP2 only build
ARM: OMAP2: clock: add DT boot support for cpufreq_ck
CLK: TI: OMAP2: add clock init support
...
Linus Torvalds [Mon, 16 Jun 2014 01:58:03 +0000 (15:58 -1000)]
Merge git://git.infradead.org/users/willy/linux-nvme
Pull NVMe update from Matthew Wilcox:
"Mostly bugfixes again for the NVMe driver. I'd like to call out the
exported tracepoint in the block layer; I believe Keith has cleared
this with Jens.
We've had a few reports from people who're really pounding on NVMe
devices at scale, hence the timeout changes (and new module
parameters), hotplug cpu deadlock, tracepoints, and minor performance
tweaks"
[ Jens hadn't seen that tracepoint thing, but is ok with it - it will
end up going away when mq conversion happens ]
* git://git.infradead.org/users/willy/linux-nvme: (22 commits)
NVMe: Fix START_STOP_UNIT Scsi->NVMe translation.
NVMe: Use Log Page constants in SCSI emulation
NVMe: Define Log Page constants
NVMe: Fix hot cpu notification dead lock
NVMe: Rename io_timeout to nvme_io_timeout
NVMe: Use last bytes of f/w rev SCSI Inquiry
NVMe: Adhere to request queue block accounting enable/disable
NVMe: Fix nvme get/put queue semantics
NVMe: Delete NVME_GET_FEAT_TEMP_THRESH
NVMe: Make admin timeout a module parameter
NVMe: Make iod bio timeout a parameter
NVMe: Prevent possible NULL pointer dereference
NVMe: Fix the buffer size passed in GetLogPage(CDW10.NUMD)
NVMe: Update data structures for NVMe 1.2
NVMe: Enable BUILD_BUG_ON checks
NVMe: Update namespace and controller identify structures to the 1.1a spec
NVMe: Flush with data support
NVMe: Configure support for block flush
NVMe: Add tracepoints
NVMe: Protect against badly formatted CQEs
...
Daniel Borkmann [Sat, 14 Jun 2014 22:59:14 +0000 (00:59 +0200)]
net: sctp: fix permissions for rto_alpha and rto_beta knobs
Commit
3fd091e73b81 ("[SCTP]: Remove multiple levels of msecs
to jiffies conversions.") has silently changed permissions for
rto_alpha and rto_beta knobs from 0644 to 0444. The purpose of
this was to discourage users from tweaking rto_alpha and
rto_beta knobs in production environments since they are key
to correctly compute rtt/srtt.
RFC4960 under section 6.3.1. RTO Calculation says regarding
rto_alpha and rto_beta under rule C3 and C4:
[...]
C3) When a new RTT measurement R' is made, set
RTTVAR <- (1 - RTO.Beta) * RTTVAR + RTO.Beta * |SRTT - R'|
and
SRTT <- (1 - RTO.Alpha) * SRTT + RTO.Alpha * R'
Note: The value of SRTT used in the update to RTTVAR
is its value before updating SRTT itself using the
second assignment. After the computation, update
RTO <- SRTT + 4 * RTTVAR.
C4) When data is in flight and when allowed by rule C5
below, a new RTT measurement MUST be made each round
trip. Furthermore, new RTT measurements SHOULD be
made no more than once per round trip for a given
destination transport address. There are two reasons
for this recommendation: First, it appears that
measuring more frequently often does not in practice
yield any significant benefit [ALLMAN99]; second,
if measurements are made more often, then the values
of RTO.Alpha and RTO.Beta in rule C3 above should be
adjusted so that SRTT and RTTVAR still adjust to
changes at roughly the same rate (in terms of how many
round trips it takes them to reflect new values) as
they would if making only one measurement per
round-trip and using RTO.Alpha and RTO.Beta as given
in rule C3. However, the exact nature of these
adjustments remains a research issue.
[...]
While it is discouraged to adjust rto_alpha and rto_beta
and not further specified how to adjust them, the RFC also
doesn't explicitly forbid it, but rather gives a RECOMMENDED
default value (rto_alpha=3, rto_beta=2). We have a couple
of users relying on the old permissions before they got
changed. That said, if someone really has the urge to adjust
them, we could allow it with a warning in the log.
Fixes: 3fd091e73b81 ("[SCTP]: Remove multiple levels of msecs to jiffies conversions.")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 15 Jun 2014 08:00:56 +0000 (01:00 -0700)]
Merge branch 'csum_fixes'
Tom Herbert says:
====================
Fixes related to some recent checksum modifications.
- Fix GSO constants to match NETIF flags
- Fix logic in saving checksum complete in __skb_checksum_complete
- Call __skb_checksum_complete from UDP if we are checksumming over
whole packet in order to save checksum.
- Fixes to VXLAN to work correctly with checksum complete
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Sun, 15 Jun 2014 06:24:36 +0000 (23:24 -0700)]
vxlan: Checksum fixes
Call skb_pop_rcv_encapsulation and postpull_rcsum for the Ethernet
header to work properly with checksum complete.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Sun, 15 Jun 2014 06:24:28 +0000 (23:24 -0700)]
net: add skb_pop_rcv_encapsulation
This function is used by UDP encapsulation protocols in RX when
crossing encapsulation boundary. If ip_summed is set to
CHECKSUM_UNNECESSARY and encapsulation is not set, change to
CHECKSUM_NONE since the checksum has not been validated within the
encapsulation. Clears csum_valid by the same rationale.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Sun, 15 Jun 2014 06:24:20 +0000 (23:24 -0700)]
udp: call __skb_checksum_complete when doing full checksum
In __udp_lib_checksum_complete check if checksum is being done over all
the data (len is equal to skb->len) and if it is call
__skb_checksum_complete instead of __skb_checksum_complete_head. This
allows checksum to be saved in checksum complete.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Sun, 15 Jun 2014 06:24:03 +0000 (23:24 -0700)]
net: Fix save software checksum complete
Geert reported issues regarding checksum complete and UDP.
The logic introduced in commit
7e3cead5172927732f51fde
("net: Save software checksum complete") is not correct.
This patch:
1) Restores code in __skb_checksum_complete_header except for setting
CHECKSUM_UNNECESSARY. This function may be calculating checksum on
something less than skb->len.
2) Adds saving checksum to __skb_checksum_complete. The full packet
checksum 0..skb->len is calculated without adding in pseudo header.
This value is saved in skb->csum and then the pseudo header is added
to that to derive the checksum for validation.
3) In both __skb_checksum_complete_header and __skb_checksum_complete,
set skb->csum_valid to whether checksum of zero was computed. This
allows skb_csum_unnecessary to return true without changing to
CHECKSUM_UNNECESSARY which was done previously.
4) Copy new csum related bits in __copy_skb_header.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Sun, 15 Jun 2014 06:23:52 +0000 (23:23 -0700)]
net: Fix GSO constants to match NETIF flags
Joseph Gasparakis reported that VXLAN GSO offload stopped working with
i40e device after recent UDP changes. The problem is that the
SKB_GSO_* bits are out of sync with the corresponding NETIF flags. This
patch fixes that. Also, we add BUILD_BUG_ONs in net_gso_ok for several
GSO constants that were missing to avoid the problem in the future.
Reported-by: Joseph Gasparakis <joseph.gasparakis@intel.com>
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 15 Jun 2014 00:49:48 +0000 (19:49 -0500)]
Merge tag 'scsi-for-linus' of git://git./linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"This is just a couple of drivers (hpsa and lpfc) that got left out for
further testing in linux-next. We also have one fix to a prior
submission (qla2xxx sparse)"
* tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (36 commits)
qla2xxx: fix sparse warnings introduced by previous target mode t10-dif patch
lpfc: Update lpfc version to driver version 10.2.8001.0
lpfc: Fix ExpressLane priority setup
lpfc: mark old devices as obsolete
lpfc: Fix for initializing RRQ bitmap
lpfc: Fix for cleaning up stale ring flag and sp_queue_event entries
lpfc: Update lpfc version to driver version 10.2.8000.0
lpfc: Update Copyright on changed files from 8.3.45 patches
lpfc: Update Copyright on changed files
lpfc: Fixed locking for scsi task management commands
lpfc: Convert runtime references to old xlane cfg param to fof cfg param
lpfc: Fix FW dump using sysfs
lpfc: Fix SLI4 s abort loop to process all FCP rings and under ring_lock
lpfc: Fixed kernel panic in lpfc_abort_handler
lpfc: Fix locking for postbufq when freeing
lpfc: Fix locking for lpfc_hba_down_post
lpfc: Fix dynamic transitions of FirstBurst from on to off
hpsa: fix handling of hpsa_volume_offline return value
hpsa: return -ENOMEM not -1 on kzalloc failure in hpsa_get_device_id
hpsa: remove messages about volume status VPD inquiry page not supported
...
Linus Torvalds [Sun, 15 Jun 2014 00:48:43 +0000 (19:48 -0500)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs
Pull more btrfs updates from Chris Mason:
"This has a few fixes since our last pull and a new ioctl for doing
btree searches from userland. It's very similar to the existing
ioctl, but lets us return larger items back down to the app"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: fix error handling in create_pending_snapshot
btrfs: fix use of uninit "ret" in end_extent_writepage()
btrfs: free ulist in qgroup_shared_accounting() error path
Btrfs: fix qgroups sanity test crash or hang
btrfs: prevent RCU warning when dereferencing radix tree slot
Btrfs: fix unfinished readahead thread for raid5/6 degraded mounting
btrfs: new ioctl TREE_SEARCH_V2
btrfs: tree_search, search_ioctl: direct copy to userspace
btrfs: new function read_extent_buffer_to_user
btrfs: tree_search, copy_to_sk: return needed size on EOVERFLOW
btrfs: tree_search, copy_to_sk: return EOVERFLOW for too small buffer
btrfs: tree_search, search_ioctl: accept varying buffer
btrfs: tree_search: eliminate redundant nr_items check
Linus Torvalds [Sun, 15 Jun 2014 00:43:27 +0000 (19:43 -0500)]
Merge git://git.kvack.org/~bcrl/aio-next
Pull aio fix and cleanups from Ben LaHaise:
"This consists of a couple of code cleanups plus a minor bug fix"
* git://git.kvack.org/~bcrl/aio-next:
aio: cleanup: flatten kill_ioctx()
aio: report error from io_destroy() when threads race in io_destroy()
fs/aio.c: Remove ctx parameter in kiocb_cancel
Al Viro [Sat, 14 Jun 2014 06:12:41 +0000 (07:12 +0100)]
fix __swap_writepage() compile failure on old gcc versions
Tetsuo Handa wrote:
"Commit
62a8067a7f35 ("bio_vec-backed iov_iter") introduced an unnamed
union inside a struct which gcc-4.4.7 cannot handle. Name the unnamed
union as u in order to fix build failure"
Let's do this instead: there is only one place in the entire tree that
steps into this breakage. Anon structs and unions work in older gcc
versions; as the matter of fact, we have those in the tree - see e.g.
struct ieee80211_tx_info in include/net/mac80211.h
What doesn't work is handling their initializers:
struct {
int a;
union {
int b;
char c;
};
} x[2] = {{.a = 1, .c = 'a'}, {.a = 0, .b = 1}};
is the obvious syntax for initializer, perfectly fine for C11 and
handled correctly by gcc-4.7 or later.
Earlier versions, though, break on it - declaration is fine and so's
access to fields (i.e. x[0].c = 'a'; would produce the right code), but
members of the anon structs and unions are not inserted into the right
namespace. Tellingly, those older versions will not barf on struct {int
a; struct {int a;};}; - looks like they just have it hacked up somewhere
around the handling of . and -> instead of doing the right thing.
The easiest way to deal with that crap is to turn initialization of
those fields (in the only place where we have such initializer of
iov_iter) into plain assignment.
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 14 Jun 2014 21:51:25 +0000 (14:51 -0700)]
Merge tag 'hsi-for-3.16-fixes1' of git://git./linux/kernel/git/sre/linux-hsi
Pull HSI build fixes from Sebastian Reichel:
- tighten dependency between ssi-protocol and omap-ssi to fix build
failures with randconfig.
- use normal module refcounting in omap driver to fix build with
disabled module support
* tag 'hsi-for-3.16-fixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi:
hsi: omap_ssi_port: use normal module refcounting
HSI: fix omap ssi driver dependency
Linus Torvalds [Sat, 14 Jun 2014 21:49:51 +0000 (14:49 -0700)]
Merge tag 'gpio-v3.16-2' of git://git./linux/kernel/git/linusw/linux-gpio
Pull GPIO fix from Linus Walleij:
"A first GPIO fix for the v3.16 series, this was serious since it
blocks the OMAP boot.
Sending you this vital fix before leaving for a short vacation so it
does not sit collecting dust in my tree for no good reason.
Apart from this, our v3.16 cycle looks like a good start"
* tag 'gpio-v3.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: of: Fix handling for deferred probe for -gpio suffix
Linus Torvalds [Sat, 14 Jun 2014 21:46:29 +0000 (14:46 -0700)]
Merge branch 'x86-vdso-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 vdso fixes from Peter Anvin:
"Fixes for x86/vdso.
One is a simple build fix for bigendian hosts, one is to make "make
vdso_install" work again, and the rest is about working around a bug
in Google's Go language -- two are documentation patches that improves
the sample code that the Go coders took, modified, and broke; the
other two implements a workaround that keeps existing Go binaries from
segfaulting at least"
* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vdso: Fix vdso_install
x86/vdso: Hack to keep 64-bit Go programs working
x86/vdso: Add PUT_LE to store little-endian values
x86/vdso/doc: Make vDSO examples more portable
x86/vdso/doc: Rename vdso_test.c to vdso_standalone_test_x86.c
x86, vdso: Remove one final use of htole16()
Linus Torvalds [Sat, 14 Jun 2014 21:43:23 +0000 (14:43 -0700)]
Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon updates from Guenter Roeck:
- new driver for Sensirion SHTC1 humidity / temperature sensor
- convert ltc4151 and vexpress drivers to use devm functions
- drop generic chip detection from lm85 driver
- avoid forward declarations in atxp1 driver
- fix sign extensions in ina2xx driver
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: vexpress: Use devm helper for hwmon device registration
hwmon: (atxp1) Avoid forward declaration
hwmon: add support for Sensirion SHTC1 sensor
hwmon: (ltc4151) Convert to devm_hwmon_device_register_with_groups
hwmon: (lm85) Drop generic detection
hwmon: (ina2xx) Cast to s16 on shunt and current regs
Eric Dumazet [Thu, 12 Jun 2014 23:13:06 +0000 (16:13 -0700)]
udp: ipv4: do not waste time in __udp4_lib_mcast_demux_lookup
Its too easy to add thousand of UDP sockets on a particular bucket,
and slow down an innocent multicast receiver.
Early demux is supposed to be an optimization, we should avoid spending
too much time in it.
It is interesting to note __udp4_lib_demux_lookup() only tries to
match first socket in the chain.
10 is the threshold we already have in __udp4_lib_lookup() to switch
to secondary hash.
Fixes: 421b3885bf6d5 ("udp: ipv4: Add udp early demux")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: David Held <drheld@google.com>
Cc: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Thu, 12 Jun 2014 18:53:10 +0000 (11:53 -0700)]
vxlan: use dev->needed_headroom instead of dev->hard_header_len
When we mirror packets from a vxlan tunnel to other device,
the mirror device should see the same packets (that is, without
outer header). Because vxlan tunnel sets dev->hard_header_len,
tcf_mirred() resets mac header back to outer mac, the mirror device
actually sees packets with outer headers
Vxlan tunnel should set dev->needed_headroom instead of
dev->hard_header_len, like what other ip tunnels do. This fixes
the above problem.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: stephen hemminger <stephen@networkplumber.org>
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dimitris Michailidis [Fri, 13 Jun 2014 21:11:14 +0000 (14:11 -0700)]
MAINTAINERS: update cxgb4 maintainer
Hari's been doing the patch submissions for a while now and he'll be
taking over as maintainer.
Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Lutomirski [Thu, 12 Jun 2014 15:28:10 +0000 (08:28 -0700)]
x86/vdso: Fix vdso_install
"make vdso_install" installs unstripped versions of the vdso objects
for the benefit of the debugger. This was broken by checkin:
6f121e548f83 x86, vdso: Reimplement vdso.so preparation in build-time C
The filenames are different now, so update the Makefile to cope.
This still installs the 64-bit vdso as vdso64.so. We believe this
will be okay, as the only known user is a patched gdb which is known
to use build-ids, but if it turns out to be a problem we may have to
add a link.
Inspired by a patch from Sam Ravnborg.
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
Tested-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/b10299edd8ba98d17e07dafcd895b8ecf4d99eff.1402586707.git.luto@amacapital.net
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Dan McLeran [Fri, 6 Jun 2014 14:27:27 +0000 (08:27 -0600)]
NVMe: Fix START_STOP_UNIT Scsi->NVMe translation.
This patch contains several fixes for Scsi START_STOP_UNIT. The previous
code did not account for signed vs. unsigned arithmetic which resulted
in an invalid lowest power state caculation when the device only supports
1 power state.
The code for Power Condition == 2 (Idle) was not following the spec. The
spec calls for setting the device to specific power states, depending
upon Power Condition Modifier, without accounting for the number of
power states supported by the device.
The code for Power Condition == 3 (Standby) was using a hard-coded '0'
which is replaced with the macro POWER_STATE_0.
Signed-off-by: Dan McLeran <daniel.mcleran@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@linux.intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Eric Sandeen [Thu, 12 Jun 2014 05:53:44 +0000 (00:53 -0500)]
btrfs: fix error handling in create_pending_snapshot
fcebe456 cut and pasted some code to a later point
in create_pending_snapshot(), but didn't switch
to the appropriate error handling for this stage
of the function.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <clm@fb.com>
Eric Sandeen [Thu, 12 Jun 2014 05:39:58 +0000 (00:39 -0500)]
btrfs: fix use of uninit "ret" in end_extent_writepage()
If this condition in end_extent_writepage() is false:
if (tree->ops && tree->ops->writepage_end_io_hook)
we will then test an uninitialized "ret" at:
ret = ret < 0 ? ret : -EIO;
The test for ret is for the case where ->writepage_end_io_hook
failed, and we'd choose that ret as the error; but if
there is no ->writepage_end_io_hook, nothing sets ret.
Initializing ret to 0 should be sufficient; if
writepage_end_io_hook wasn't set, (!uptodate) means
non-zero err was passed in, so we choose -EIO in that case.
Signed-of-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <clm@fb.com>
Eric Sandeen [Thu, 12 Jun 2014 05:14:59 +0000 (00:14 -0500)]
btrfs: free ulist in qgroup_shared_accounting() error path
If tmp = ulist_alloc(GFP_NOFS) fails, we return without
freeing the previously allocated qgroups = ulist_alloc(GFP_NOFS)
and cause a memory leak.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <clm@fb.com>
Filipe Manana [Thu, 12 Jun 2014 01:47:37 +0000 (02:47 +0100)]
Btrfs: fix qgroups sanity test crash or hang
Often when running the qgroups sanity test, a crash or a hang happened.
This is because the extent buffer the test uses for the root node doesn't
have an header level explicitly set, making it have a random level value.
This is a problem when it's not zero for the btrfs_search_slot() calls
the test ends up doing, resulting in crashes or hangs such as the following:
[ 6454.127192] Btrfs loaded, debug=on, assert=on, integrity-checker=on
(...)
[ 6454.127760] BTRFS: selftest: Running qgroup tests
[ 6454.127964] BTRFS: selftest: Running test_test_no_shared_qgroup
[ 6454.127966] BTRFS: selftest: Qgroup basic add
[ 6480.152005] BUG: soft lockup - CPU#0 stuck for 23s! [modprobe:5383]
[ 6480.152005] Modules linked in: btrfs(+) xor raid6_pq binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc i2c_piix4 i2c_core pcspkr evbug psmouse serio_raw e1000 [last unloaded: btrfs]
[ 6480.152005] irq event stamp: 188448
[ 6480.152005] hardirqs last enabled at (188447): [<
ffffffff8168ef5c>] restore_args+0x0/0x30
[ 6480.152005] hardirqs last disabled at (188448): [<
ffffffff81698e6a>] apic_timer_interrupt+0x6a/0x80
[ 6480.152005] softirqs last enabled at (188446): [<
ffffffff810516cf>] __do_softirq+0x1cf/0x450
[ 6480.152005] softirqs last disabled at (188441): [<
ffffffff81051c25>] irq_exit+0xb5/0xc0
[ 6480.152005] CPU: 0 PID: 5383 Comm: modprobe Not tainted 3.15.0-rc8-fdm-btrfs-next-33+ #4
[ 6480.152005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 6480.152005] task:
ffff8802146125a0 ti:
ffff8800d0d00000 task.ti:
ffff8800d0d00000
[ 6480.152005] RIP: 0010:[<
ffffffff81349a63>] [<
ffffffff81349a63>] __write_lock_failed+0x13/0x20
[ 6480.152005] RSP: 0018:
ffff8800d0d038e8 EFLAGS:
00000287
[ 6480.152005] RAX:
0000000000000000 RBX:
ffffffff8168ef5c RCX:
000005deb8525852
[ 6480.152005] RDX:
0000000000000000 RSI:
0000000000001d45 RDI:
ffff8802105000b8
[ 6480.152005] RBP:
ffff8800d0d038e8 R08:
fffffe12710f63db R09:
ffffffffa03196fb
[ 6480.152005] R10:
ffff8802146125a0 R11:
ffff880214612e28 R12:
ffff8800d0d03858
[ 6480.152005] R13:
0000000000000000 R14:
ffff8800d0d00000 R15:
ffff8802146125a0
[ 6480.152005] FS:
00007f14ff804700(0000) GS:
ffff880215e00000(0000) knlGS:
0000000000000000
[ 6480.152005] CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
[ 6480.152005] CR2:
00007fff4df0dac8 CR3:
00000000d1796000 CR4:
00000000000006f0
[ 6480.152005] Stack:
[ 6480.152005]
ffff8800d0d03908 ffffffff810ae967 0000000000000001 ffff8802105000b8
[ 6480.152005]
ffff8800d0d03938 ffffffff8168e57e ffffffffa0319c16 0000000000000007
[ 6480.152005]
ffff880210500000 ffff880210500100 ffff8800d0d039b8 ffffffffa0319c16
[ 6480.152005] Call Trace:
[ 6480.152005] [<
ffffffff810ae967>] do_raw_write_lock+0x47/0xa0
[ 6480.152005] [<
ffffffff8168e57e>] _raw_write_lock+0x5e/0x80
[ 6480.152005] [<
ffffffffa0319c16>] ? btrfs_tree_lock+0x116/0x270 [btrfs]
[ 6480.152005] [<
ffffffffa0319c16>] btrfs_tree_lock+0x116/0x270 [btrfs]
[ 6480.152005] [<
ffffffffa02b2acb>] btrfs_lock_root_node+0x3b/0x50 [btrfs]
[ 6480.152005] [<
ffffffffa02b81a6>] btrfs_search_slot+0x916/0xa20 [btrfs]
[ 6480.152005] [<
ffffffff811a727f>] ? create_object+0x23f/0x300
[ 6480.152005] [<
ffffffffa02b9958>] btrfs_insert_empty_items+0x78/0xd0 [btrfs]
[ 6480.152005] [<
ffffffffa036041a>] insert_normal_tree_ref.constprop.4+0xa2/0x19a [btrfs]
[ 6480.152005] [<
ffffffffa03605c3>] test_no_shared_qgroup+0xb1/0x1ca [btrfs]
[ 6480.152005] [<
ffffffff8108cad6>] ? local_clock+0x16/0x30
[ 6480.152005] [<
ffffffffa035ef8e>] btrfs_test_qgroups+0x1ae/0x1d7 [btrfs]
[ 6480.152005] [<
ffffffffa03a69d2>] ? ftrace_define_fields_btrfs_space_reservation+0xfd/0xfd [btrfs]
[ 6480.152005] [<
ffffffffa03a6a86>] init_btrfs_fs+0xb4/0x153 [btrfs]
[ 6480.152005] [<
ffffffff81000352>] do_one_initcall+0x102/0x150
[ 6480.152005] [<
ffffffff8103d223>] ? set_memory_nx+0x43/0x50
[ 6480.152005] [<
ffffffff81682668>] ? set_section_ro_nx+0x6d/0x74
[ 6480.152005] [<
ffffffff810d91cc>] load_module+0x1cdc/0x2630
(...)
Therefore initialize the extent buffer as an empty leaf (level 0).
Issue easy to reproduce when btrfs is built as a module via:
$ for ((i = 1; i <=
1000000; i++)); do rmmod btrfs; modprobe btrfs; done
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
Sasha Levin [Wed, 11 Jun 2014 16:00:25 +0000 (12:00 -0400)]
btrfs: prevent RCU warning when dereferencing radix tree slot
Mark the dereference as protected by lock. Not doing so triggers
an RCU warning since the radix tree assumed that RCU is in use.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Wang Shilong [Wed, 11 Jun 2014 02:55:22 +0000 (10:55 +0800)]
Btrfs: fix unfinished readahead thread for raid5/6 degraded mounting
Steps to reproduce:
# mkfs.btrfs -f /dev/sd[b-f] -m raid5 -d raid5
# mkfs.ext4 /dev/sdc --->corrupt one of btrfs device
# mount /dev/sdb /mnt -o degraded
# btrfs scrub start -BRd /mnt
This is because readahead would skip missing device, this is not true
for RAID5/6, because REQ_GET_READ_MIRRORS return 1 for RAID5/6 block
mapping. If expected data locates in missing device, readahead thread
would not call __readahead_hook() which makes event @rc->elems=0
wait forever.
Fix this problem by checking return value of btrfs_map_block(),we
can only skip missing device safely if there are several mirrors.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
Gerhard Heift [Thu, 30 Jan 2014 15:24:03 +0000 (16:24 +0100)]
btrfs: new ioctl TREE_SEARCH_V2
This new ioctl call allows the user to supply a buffer of varying size in which
a tree search can store its results. This is much more flexible if you want to
receive items which are larger than the current fixed buffer of 3992 bytes or
if you want to fetch more items at once. Items larger than this buffer are for
example some of the type EXTENT_CSUM.
Signed-off-by: Gerhard Heift <Gerhard@Heift.Name>
Signed-off-by: Chris Mason <clm@fb.com>
Acked-by: David Sterba <dsterba@suse.cz>