Neil Horman [Wed, 12 Jun 2013 18:26:44 +0000 (14:26 -0400)]
sctp: fully initialize sctp_outq in sctp_outq_init
In commit
2f94aabd9f6c925d77aecb3ff020f1cc12ed8f86
(refactor sctp_outq_teardown to insure proper re-initalization)
we modified sctp_outq_teardown to use sctp_outq_init to fully re-initalize the
outq structure. Steve West recently asked me why I removed the q->error = 0
initalization from sctp_outq_teardown. I did so because I was operating under
the impression that sctp_outq_init would properly initalize that value for us,
but it doesn't. sctp_outq_init operates under the assumption that the outq
struct is all 0's (as it is when called from sctp_association_init), but using
it in __sctp_outq_teardown violates that assumption. We should do a memset in
sctp_outq_init to ensure that the entire structure is in a known state there
instead.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: "West, Steve (NSN - US/Fort Worth)" <steve.west@nsn.com>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: netdev@vger.kernel.org
CC: davem@davemloft.net
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benjamin Poirier [Thu, 13 Jun 2013 13:09:47 +0000 (09:09 -0400)]
netiucv: Hold rtnl between name allocation and device registration.
fixes a race condition between concurrent initializations of netiucv devices
that try to use the same name.
sysfs: cannot create duplicate filename '/devices/iucv/netiucv2'
[...]
Call Trace:
([<
00000000002edea4>] sysfs_add_one+0xb0/0xdc)
[<
00000000002eecd4>] create_dir+0x80/0xfc
[<
00000000002eee38>] sysfs_create_dir+0xe8/0x118
[<
00000000003835a8>] kobject_add_internal+0x120/0x2d0
[<
00000000003839d6>] kobject_add+0x62/0x9c
[<
00000000003d9564>] device_add+0xcc/0x510
[<
000003e00212c7b4>] netiucv_register_device+0xc0/0x1ec [netiucv]
Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
Tested-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neil Horman [Thu, 13 Jun 2013 19:31:28 +0000 (15:31 -0400)]
tulip: Properly check dma mapping result
Tulip throws an error when dma debugging is enabled, as it doesn't properly
check dma mapping results with dma_mapping_error() durring tx ring refills.
Easy fix, just add it in, and drop the frame if the mapping is bad
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Grant Grundler <grundler@parisc-linux.org>
CC: "David S. Miller" <davem@davemloft.net>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yoshihiro Shimoda [Thu, 13 Jun 2013 01:15:45 +0000 (10:15 +0900)]
net: sh_eth: fix incorrect RX length error if R8A7740
This patch fixes an issue that the driver increments the "RX length error"
on every buffer in sh_eth_rx() if the R8A7740.
This patch also adds a description about the Receive Frame Status bits.
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 7 Jun 2013 20:26:05 +0000 (13:26 -0700)]
ip_tunnel: remove __net_init/exit from exported functions
If CONFIG_NET_NS is not set then __net_init is the same as __init and
__net_exit is the same as __exit. These functions will be removed from
memory after the module loads or is removed. Functions that are exported
for use by other functions should never be labeled for removal.
Bug introduced by commit
c54419321455631079c
("GRE: Refactor GRE tunneling code.")
Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mugunthan V N [Tue, 11 Jun 2013 10:02:05 +0000 (15:32 +0530)]
drivers: net: davinci_mdio: restore mdio clk divider in mdio resume
During suspend resume cycle all the register data is lost, so MDIO
clock divier value gets reset. This patch restores the clock divider
value.
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mugunthan V N [Tue, 11 Jun 2013 10:02:04 +0000 (15:32 +0530)]
drivers: net: davinci_mdio: moving mdio resume earlier than cpsw ethernet driver
MDIO driver should resume before CPSW ethernet driver so that CPSW connect
to the phy and start tx/rx ethernet packets, changing the suspend/resume
apis with suspend_late/resume_early.
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saurabh Mohan [Tue, 11 Jun 2013 00:45:10 +0000 (17:45 -0700)]
net/ipv4: ip_vti clear skb cb before tunneling.
If users apply shaper to vti tunnel then it will cause a kernel crash. The
problem seems to be due to the vti_tunnel_xmit function not clearing
skb->opt field before passing the packet to xfrm tunneling code.
Signed-off-by: Saurabh Mohan <saurabh@vyatta.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nithin Sujir [Wed, 12 Jun 2013 18:08:59 +0000 (11:08 -0700)]
tg3: Wait for boot code to finish after power on
Some systems that don't need wake-on-lan may choose to power down the
chip on system standby. Upon resume, the power on causes the boot code
to startup and initialize the hardware. On one new platform, this is
causing the device to go into a bad state due to a race between the
driver and boot code, once every several hundred resumes. The same race
exists on open since we come up from a power on.
This patch adds a wait for boot code signature at the beginning of
tg3_init_hw() which is common to both cases. If there has not been a
power-off or the boot code has already completed, the signature will be
present and poll_fw() returns immediately. Also return immediately if
the device does not have firmware.
Cc: stable@vger.kernel.org
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Wed, 12 Jun 2013 14:07:36 +0000 (16:07 +0200)]
l2tp: Fix sendmsg() return value
PPPoL2TP sockets should comply with the standard send*() return values
(i.e. return number of bytes sent instead of 0 upon success).
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Wed, 12 Jun 2013 14:07:23 +0000 (16:07 +0200)]
l2tp: Fix PPP header erasure and memory leak
Copy user data after PPP framing header. This prevents erasure of the
added PPP header and avoids leaking two bytes of uninitialised memory
at the end of skb's data buffer.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Tue, 11 Jun 2013 22:07:02 +0000 (00:07 +0200)]
bonding: fix igmp_retrans type and two related races
First the type of igmp_retrans (which is the actual counter of
igmp_resend parameter) is changed to u8 to be able to store values up
to 255 (as per documentation). There are two races that were hidden
there and which are easy to trigger after the previous fix, the first is
between bond_resend_igmp_join_requests and bond_change_active_slave
where igmp_retrans is set and can be altered by the periodic. The second
race condition is between multiple running instances of the periodic
(upon execution it can be scheduled again for immediate execution which
can cause the counter to go < 0 which in the unsigned case leads to
unnecessary igmp retransmissions).
Since in bond_change_active_slave bond->lock is held for reading and
curr_slave_lock for writing, we use curr_slave_lock for mutual
exclusion. We can't drop them as there're cases where RTNL is not held
when bond_change_active_slave is called. RCU is unlocked in
bond_resend_igmp_join_requests before getting curr_slave_lock since we
don't need it there and it's pointless to delay.
The decrement is moved inside the "if" block because if we decrement
unconditionally there's still a possibility for a race condition although
it is much more difficult to hit (many changes have to happen in
a very short period in order to trigger) which in the case of 3 parallel
running instances of this function and igmp_retrans == 1
(with check bond->igmp_retrans-- > 1) is:
f1 passes, doesn't re-schedule, but decrements - igmp_retrans = 0
f2 then passes, doesn't re-schedule, but decrements - igmp_retrans = 255
f3 does the unnecessary retransmissions.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Tue, 11 Jun 2013 22:07:01 +0000 (00:07 +0200)]
bonding: reset master mac on first enslave failure
If the bond device is supposed to get the first slave's MAC address and
the first enslavement fails then we need to reset the master's MAC
otherwise it will stay the same as the failed slave device. We do it
after err_undo_flags since that is the first place where the MAC can be
changed and we check if it should've been the first slave and if the
bond's MAC was set to it because that err place is used by multiple
locations prior to changing the master's MAC address.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 12 Jun 2013 14:02:27 +0000 (16:02 +0200)]
packet: packet_getname_spkt: make sure string is always 0-terminated
uaddr->sa_data is exactly of size 14, which is hard-coded here and
passed as a size argument to strncpy(). A device name can be of size
IFNAMSIZ (== 16), meaning we might leave the destination string
unterminated. Thus, use strlcpy() and also sizeof() while we're
at it. We need to memset the data area beforehand, since strlcpy
does not padd the remaining buffer with zeroes for user space, so
that we do not possibly leak anything.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dinh Nguyen [Wed, 12 Jun 2013 16:05:03 +0000 (11:05 -0500)]
net: ethernet: stmicro: stmmac: Fix compile error when STMMAC_XMIT_DEBUG used
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c: In function:
stmmac_xmit drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:1902:74:
error: expected ) before __func__
Signed-off-by: Dinh Nguyen <dinguyen@altera.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
CC: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Somnath Kotur [Tue, 11 Jun 2013 11:48:22 +0000 (17:18 +0530)]
be2net: Fix 32-bit DMA Mask handling
Fix to set the coherent DMA mask only if dma_set_mask() succeeded, and to
error out if either fails.
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 13 Jun 2013 08:26:54 +0000 (01:26 -0700)]
Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge
Included change:
- fix "rtnl locked" concurrent executions by using rtnl_lock instead of
rtnl_trylock. This fix enables batman-adv initialisation to do not fail just
because somewhere else in the system another code path is holding the rtnl
lock. It is easy to see the problem when batman-adv is trying to start
together with other networking components.
- fix the routing protocol forwarding policy by enhancing the duplicate control
packet detection. When the right circumstances trigger the issue, some nodes in
the network become totally unreachable, so breaking the mesh connectivity.
- fix the Bridge Loop Avoidance component by not running the originator address
change handling routine when the component is disabled. The routine was
generating useless packets that were sent over the network.
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Beulich [Tue, 11 Jun 2013 10:00:34 +0000 (11:00 +0100)]
xen-netback: don't de-reference vif pointer after having called xenvif_put()
When putting vif-s on the rx notify list, calling xenvif_put() must be
deferred until after the removal from the list and the issuing of the
notification, as both operations dereference the pointer.
Changing this got me to notice that the "irq" variable was effectively
unused (and was of too narrow type anyway).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael S. Tsirkin [Thu, 13 Jun 2013 07:07:29 +0000 (10:07 +0300)]
macvlan: don't touch promisc without passthrough
commit
df8ef8f3aaa6692970a436204c4429210addb23a
"macvlan: add FDB bridge ops and macvlan flags"
added a way to control NOPROMISC macvlan flag through netlink.
However, with a non passthrough device we never set promisc on open,
even if NOPROMISC is off. As a result:
If userspace clears NOPROMISC on open, then does not clear it on a
netlink command, promisc counter is not decremented on stop and there
will be no way to clear it once macvlan is detached.
If userspace does not clear NOPROMISC on open, then sets NOPROMISC on a
netlink command, promisc counter will be decremented from 0 and overflow
to
fffffffff with no way to clear promisc.
To fix, simply ignore NOPROMISC flag in a netlink command for
non-passthrough devices, same as we do at open/close.
Since we touch this code anyway - check dev_set_promiscuity return code
and pass it to users (though an error here is unlikely).
Cc: "David S. Miller" <davem@davemloft.net>
Reviewed-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 13 Jun 2013 00:18:29 +0000 (17:18 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking update from David Miller:
1) Fix dump iterator in nfnl_acct_dump() and ctnl_timeout_dump() to
dump all objects properly, from Pablo Neira Ayuso.
2) xt_TCPMSS must use the default MSS of 536 when no MSS TCP option is
present. Fix from Phil Oester.
3) qdisc_get_rtab() looks for an existing matching rate table and uses
that instead of creating a new one. However, it's key matching is
incomplete, it fails to check to make sure the ->data[] array is
identical too. Fix from Eric Dumazet.
4) ip_vs_dest_entry isn't fully initialized before copying back to
userspace, fix from Dan Carpenter.
5) Fix ubuf reference counting regression in vhost_net, from Jason
Wang.
6) When sock_diag dumps a socket filter back to userspace, we have to
translate it out of the kernel's internal representation first.
From Nicolas Dichtel.
7) davinci_mdio holds a spinlock while calling pm_runtime, which
sleeps. Fix from Sebastian Siewior.
8) Timeout check in sh_eth_check_reset is off by one, from Sergei
Shtylyov.
9) If sctp socket init fails, we can NULL deref during cleanup. Fix
from Daniel Borkmann.
10) netlink_mmap() does not propagate errors properly, from Patrick
McHardy.
11) Disable powersave and use minstrel by default in ath9k. From Sujith
Manoharan.
12) Fix a regression in that SOCK_ZEROCOPY is not set on tuntap sockets
which prevents vhost from being able to use zerocopy. From Jason
Wang.
13) Fix race between port lookup and TX path in team driver, from Jiri
Pirko.
14) Missing length checks in bluetooth L2CAP packet parsing, from Johan
Hedberg.
15) rtlwifi fails to connect to networking using any encryption method
other than WPA2. Fix from Larry Finger.
16) Fix iwlegacy build due to incorrect CONFIG_* ifdeffing for power
management stuff. From Yijing Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits)
b43: stop format string leaking into error msgs
ath9k: Use minstrel rate control by default
Revert "ath9k_hw: Update rx gain initval to improve rx sensitivity"
ath9k: Disable PowerSave by default
net: wireless: iwlegacy: fix build error for il_pm_ops
rtlwifi: Fix a false leak indication for PCI devices
wl12xx/wl18xx: scan all 5ghz channels
wl12xx: increase minimum singlerole firmware version required
wl12xx: fix minimum required firmware version for wl127x multirole
rtlwifi: rtl8192cu: Fix problem in connecting to WEP or WPA(1) networks
mwifiex: debugfs: Fix out of bounds array access
Bluetooth: Fix mgmt handling of power on failures
Bluetooth: Fix missing length checks for L2CAP signalling PDUs
Bluetooth: btmrvl: support Marvell Bluetooth device SD8897
Bluetooth: Fix checks for LE support on LE-only controllers
team: fix checks in team_get_first_port_txable_rcu()
team: move add to port list before port enablement
team: check return value of team_get_port_by_index_rcu() for NULL
tuntap: set SOCK_ZEROCOPY flag during open
netlink: fix error propagation in netlink_mmap()
...
Linus Torvalds [Thu, 13 Jun 2013 00:08:49 +0000 (17:08 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid
Pull input layer bugfix from Jiri Kosina:
"Memory leak regression fix from Benjamin Tissoires"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: multitouch: prevent memleak with the allocated name
Linus Torvalds [Wed, 12 Jun 2013 23:42:39 +0000 (16:42 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
"Outside of bcache (which really isn't super big), these are all
few-liners. There are a few important fixes in here:
- Fix blk pm sleeping when holding the queue lock
- A small collection of bcache fixes that have been done and tested
since bcache was included in this merge window.
- A fix for a raid5 regression introduced with the bio changes.
- Two important fixes for mtip32xx, fixing an oops and potential data
corruption (or hang) due to wrong bio iteration on stacked devices."
* 'for-linus' of git://git.kernel.dk/linux-block:
scatterlist: sg_set_buf() argument must be in linear mapping
raid5: Initialize bi_vcnt
pktcdvd: silence static checker warning
block: remove refs to XD disks from documentation
blkpm: avoid sleep when holding queue lock
mtip32xx: Correctly handle bio->bi_idx != 0 conditions
mtip32xx: Fix NULL pointer dereference during module unload
bcache: Fix error handling in init code
bcache: clarify free/available/unused space
bcache: drop "select CLOSURES"
bcache: Fix incompatible pointer type warning
Linus Torvalds [Wed, 12 Jun 2013 23:29:53 +0000 (16:29 -0700)]
Merge branch 'akpm' (updates from Andrew Morton)
Merge misc fixes from Andrew Morton:
"Bunch of fixes and one little addition to math64.h"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (27 commits)
include/linux/math64.h: add div64_ul()
mm: memcontrol: fix lockless reclaim hierarchy iterator
frontswap: fix incorrect zeroing and allocation size for frontswap_map
kernel/audit_tree.c:audit_add_tree_rule(): protect `rule' from kill_rules()
mm: migration: add migrate_entry_wait_huge()
ocfs2: add missing lockres put in dlm_mig_lockres_handler
mm/page_alloc.c: fix watermark check in __zone_watermark_ok()
drivers/misc/sgi-gru/grufile.c: fix info leak in gru_get_config_info()
aio: fix io_destroy() regression by using call_rcu()
rtc-at91rm9200: use shadow IMR on at91sam9x5
rtc-at91rm9200: add shadow interrupt mask
rtc-at91rm9200: refactor interrupt-register handling
rtc-at91rm9200: add configuration support
rtc-at91rm9200: add match-table compile guard
fs/ocfs2/namei.c: remove unecessary ERROR when removing non-empty directory
swap: avoid read_swap_cache_async() race to deadlock while waiting on discard I/O completion
drivers/rtc/rtc-twl.c: fix missing device_init_wakeup() when booted with device tree
cciss: fix broken mutex usage in ioctl
audit: wait_for_auditd() should use TASK_UNINTERRUPTIBLE
drivers/rtc/rtc-cmos.c: fix accidentally enabling rtc channel
...
Alex Shi [Wed, 12 Jun 2013 21:05:10 +0000 (14:05 -0700)]
include/linux/math64.h: add div64_ul()
There is div64_long() to handle the s64/long division, but no mocro do
u64/ul division. It is necessary in some scenarios, so add this
function.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Alex Shi <alex.shi@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Wed, 12 Jun 2013 21:05:09 +0000 (14:05 -0700)]
mm: memcontrol: fix lockless reclaim hierarchy iterator
The lockless reclaim hierarchy iterator currently has a misplaced
barrier that can lead to use-after-free crashes.
The reclaim hierarchy iterator consist of a sequence count and a
position pointer that are read and written locklessly, with memory
barriers enforcing ordering.
The write side sets the position pointer first, then updates the
sequence count to "publish" the new position. Likewise, the read side
must read the sequence count first, then the position. If the sequence
count is up to date, it's guaranteed that the position is up to date as
well:
writer: reader:
iter->position = position if iter->sequence == expected:
smp_wmb() smp_rmb()
iter->sequence = sequence position = iter->position
However, the read side barrier is currently misplaced, which can lead to
dereferencing stale position pointers that no longer point to valid
memory. Fix this.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Tejun Heo <tj@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: <stable@kernel.org> [3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Wed, 12 Jun 2013 21:05:08 +0000 (14:05 -0700)]
frontswap: fix incorrect zeroing and allocation size for frontswap_map
The bitmap accessed by bitops must have enough size to hold the required
numbers of bits rounded up to a multiple of BITS_PER_LONG. And the
bitmap must not be zeroed by memset() if the number of bits cleared is
not a multiple of BITS_PER_LONG.
This fixes incorrect zeroing and allocation size for frontswap_map. The
incorrect zeroing part doesn't cause any problem because frontswap_map
is freed just after zeroing. But the wrongly calculated allocation size
may cause the problem.
For 32bit systems, the allocation size of frontswap_map is about twice
as large as required size. For 64bit systems, the allocation size is
smaller than requeired if the number of bits is not a multiple of
BITS_PER_LONG.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chen Gang [Wed, 12 Jun 2013 21:05:07 +0000 (14:05 -0700)]
kernel/audit_tree.c:audit_add_tree_rule(): protect `rule' from kill_rules()
audit_add_tree_rule() must set 'rule->tree = NULL;' firstly, to protect
the rule itself freed in kill_rules().
The reason is when it is killed, the 'rule' itself may have already
released, we should not access it. one example: we add a rule to an
inode, just at the same time the other task is deleting this inode.
The work flow for adding a rule:
audit_receive() -> (need audit_cmd_mutex lock)
audit_receive_skb() ->
audit_receive_msg() ->
audit_receive_filter() ->
audit_add_rule() ->
audit_add_tree_rule() -> (need audit_filter_mutex lock)
...
unlock audit_filter_mutex
get_tree()
...
iterate_mounts() -> (iterate all related inodes)
tag_mount() ->
tag_trunk() ->
create_trunk() -> (assume it is 1st rule)
fsnotify_add_mark() ->
fsnotify_add_inode_mark() -> (add mark to inode->i_fsnotify_marks)
...
get_tree(); (each inode will get one)
...
lock audit_filter_mutex
The work flow for deleting an inode:
__destroy_inode() ->
fsnotify_inode_delete() ->
__fsnotify_inode_delete() ->
fsnotify_clear_marks_by_inode() -> (get mark from inode->i_fsnotify_marks)
fsnotify_destroy_mark() ->
fsnotify_destroy_mark_locked() ->
audit_tree_freeing_mark() ->
evict_chunk() ->
...
tree->goner = 1
...
kill_rules() -> (assume current->audit_context == NULL)
call_rcu() -> (rule->tree != NULL)
audit_free_rule_rcu() ->
audit_free_rule()
...
audit_schedule_prune() -> (assume current->audit_context == NULL)
kthread_run() -> (need audit_cmd_mutex and audit_filter_mutex lock)
prune_one() -> (delete it from prue_list)
put_tree(); (match the original get_tree above)
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Wed, 12 Jun 2013 21:05:04 +0000 (14:05 -0700)]
mm: migration: add migrate_entry_wait_huge()
When we have a page fault for the address which is backed by a hugepage
under migration, the kernel can't wait correctly and do busy looping on
hugepage fault until the migration finishes. As a result, users who try
to kick hugepage migration (via soft offlining, for example) occasionally
experience long delay or soft lockup.
This is because pte_offset_map_lock() can't get a correct migration entry
or a correct page table lock for hugepage. This patch introduces
migration_entry_wait_huge() to solve this.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <stable@vger.kernel.org> [2.6.35+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Xue jiufei [Wed, 12 Jun 2013 21:05:03 +0000 (14:05 -0700)]
ocfs2: add missing lockres put in dlm_mig_lockres_handler
dlm_mig_lockres_handler() is missing a dlm_lockres_put() on an error path.
Signed-off-by: joyce <xuejiufei@huawei.com>
Reviewed-by: shencanquan <shencanquan@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tomasz Stanislawski [Wed, 12 Jun 2013 21:05:02 +0000 (14:05 -0700)]
mm/page_alloc.c: fix watermark check in __zone_watermark_ok()
The watermark check consists of two sub-checks. The first one is:
if (free_pages <= min + lowmem_reserve)
return false;
The check assures that there is minimal amount of RAM in the zone. If
CMA is used then the free_pages is reduced by the number of free pages
in CMA prior to the over-mentioned check.
if (!(alloc_flags & ALLOC_CMA))
free_pages -= zone_page_state(z, NR_FREE_CMA_PAGES);
This prevents the zone from being drained from pages available for
non-movable allocations.
The second check prevents the zone from getting too fragmented.
for (o = 0; o < order; o++) {
free_pages -= z->free_area[o].nr_free << o;
min >>= 1;
if (free_pages <= min)
return false;
}
The field z->free_area[o].nr_free is equal to the number of free pages
including free CMA pages. Therefore the CMA pages are subtracted twice.
This may cause a false positive fail of __zone_watermark_ok() if the CMA
area gets strongly fragmented. In such a case there are many 0-order
free pages located in CMA. Those pages are subtracted twice therefore
they will quickly drain free_pages during the check against
fragmentation. The test fails even though there are many free non-cma
pages in the zone.
This patch fixes this issue by subtracting CMA pages only for a purpose of
(free_pages <= min + lowmem_reserve) check.
Laura said:
We were observing allocation failures of higher order pages (order 5 =
128K typically) under tight memory conditions resulting in driver
failure. The output from the page allocation failure showed plenty of
free pages of the appropriate order/type/zone and mostly CMA pages in
the lower orders.
For full disclosure, we still observed some page allocation failures
even after applying the patch but the number was drastically reduced and
those failures were attributed to fragmentation/other system issues.
Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: <stable@vger.kernel.org> [3.7+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Wed, 12 Jun 2013 21:05:00 +0000 (14:05 -0700)]
drivers/misc/sgi-gru/grufile.c: fix info leak in gru_get_config_info()
The "info.fill" array isn't initialized so it can leak uninitialized stack
information to user space.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Robin Holt <holt@sgi.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kent Overstreet [Wed, 12 Jun 2013 21:04:59 +0000 (14:04 -0700)]
aio: fix io_destroy() regression by using call_rcu()
There was a regression introduced by
36f5588905c1 ("aio: refcounting
cleanup"), reported by Jens Axboe - the refcounting cleanup switched to
using RCU in the shutdown path, but the synchronize_rcu() was done in
the context of the io_destroy() syscall greatly increasing the time it
could block.
This patch switches it to call_rcu() and makes shutdown asynchronous
(more asynchronous than it was originally; before the refcount changes
io_destroy() would still wait on pending kiocbs).
Note that there's a global quota on the max outstanding kiocbs, and that
quota must be manipulated synchronously; otherwise io_setup() could
return -EAGAIN when there isn't quota available, and userspace won't
have any way of waiting until shutdown of the old kioctxs has finished
(besides busy looping).
So we release our quota before kioctx shutdown has finished, which
should be fine since the quota never corresponded to anything real
anyways.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Reported-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Tested-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johan Hovold [Wed, 12 Jun 2013 21:04:57 +0000 (14:04 -0700)]
rtc-at91rm9200: use shadow IMR on at91sam9x5
Add support for the at91sam9x5-family which must use the shadow
interrupt mask due to a hardware issue (causing RTC_IMR to always be
zero).
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: Robert Nelson <Robert.Nelson@digikey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johan Hovold [Wed, 12 Jun 2013 21:04:56 +0000 (14:04 -0700)]
rtc-at91rm9200: add shadow interrupt mask
Add shadow interrupt-mask register which can be used on SoCs where the
actual hardware register is broken.
Note that some care needs to be taken to make sure the shadow mask
corresponds to the actual hardware state. The added overhead is not an
issue for the non-broken SoCs due to the relatively infrequent
interrupt-mask updates. We do, however, only use the shadow mask value
as a fall-back when it actually needed as there is still a theoretical
possibility that the mask is incorrect (see the code for details).
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: Robert Nelson <Robert.Nelson@digikey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johan Hovold [Wed, 12 Jun 2013 21:04:55 +0000 (14:04 -0700)]
rtc-at91rm9200: refactor interrupt-register handling
Add accessors for the interrupt register.
This will allow us to easily add a shadow interrupt-mask register to use
on SoCs where the interrupt-mask register cannot be used.
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: Robert Nelson <Robert.Nelson@digikey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johan Hovold [Wed, 12 Jun 2013 21:04:53 +0000 (14:04 -0700)]
rtc-at91rm9200: add configuration support
Add configuration support which can be used to implement SoC-specific
workarounds for broken hardware.
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: Robert Nelson <Robert.Nelson@digikey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johan Hovold [Wed, 12 Jun 2013 21:04:52 +0000 (14:04 -0700)]
rtc-at91rm9200: add match-table compile guard
The members of Atmel's at91sam9x5 family (9x5) have a broken RTC
interrupt mask register (AT91_RTC_IMR). It does not reflect enabled
interrupts but instead always returns zero.
The kernel's rtc-at91rm9200 driver handles the RTC for the 9x5 family.
Currently when the date/time is set, an interrupt is generated and this
driver neglects to handle the interrupt. The kernel complains about the
un-handled interrupt and disables it henceforth. This not only breaks
the RTC function, but since that interrupt is shared (Atmel's SYS
interrupt) then other things break as well (e.g. the debug port no
longer accepts characters).
Tested on the at91sam9g25. Bug confirmed by Atmel.
This patch (of 5):
Add missing match-table compile guard.
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: Robert Nelson <Robert.Nelson@digikey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Goldwyn Rodrigues [Wed, 12 Jun 2013 21:04:51 +0000 (14:04 -0700)]
fs/ocfs2/namei.c: remove unecessary ERROR when removing non-empty directory
While removing a non-empty directory, the kernel dumps a message:
(rmdir,21743,1):ocfs2_unlink:953 ERROR: status = -39
Suppress the error message from being printed in the dmesg so users
don't panic.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Acked-by: Sunil Mushran <sunil.mushran@gmail.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael Aquini [Wed, 12 Jun 2013 21:04:49 +0000 (14:04 -0700)]
swap: avoid read_swap_cache_async() race to deadlock while waiting on discard I/O completion
read_swap_cache_async() can race against get_swap_page(), and stumble
across a SWAP_HAS_CACHE entry in the swap map whose page wasn't brought
into the swapcache yet.
This transient swap_map state is expected to be transitory, but the
actual placement of discard at scan_swap_map() inserts a wait for I/O
completion thus making the thread at read_swap_cache_async() to loop
around its -EEXIST case, while the other end at get_swap_page() is
scheduled away at scan_swap_map(). This can leave the system deadlocked
if the I/O completion happens to be waiting on the CPU waitqueue where
read_swap_cache_async() is busy looping and !CONFIG_PREEMPT.
This patch introduces a cond_resched() call to make the aforementioned
read_swap_cache_async() busy loop condition to bail out when necessary,
thus avoiding the subtle race window.
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tony Lindgren [Wed, 12 Jun 2013 21:04:48 +0000 (14:04 -0700)]
drivers/rtc/rtc-twl.c: fix missing device_init_wakeup() when booted with device tree
When booted in legacy mode device_init_wakeup() gets called by
drivers/mfd/twl-core.c when the children are initialized. However, when
booted using device tree, the children are created with
of_platform_populate() instead add_children().
This means that the RTC driver will not have device_init_wakeup() set,
and we need to call it from the driver probe like RTC drivers typically
do.
Without this we cannot test PM wake-up events on omaps for cases where
there may not be any physical wake-up event.
Signed-off-by: Tony Lindgren <tony@atomide.com>
Reported-by: Kevin Hilman <khilman@linaro.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stephen M. Cameron [Wed, 12 Jun 2013 21:04:47 +0000 (14:04 -0700)]
cciss: fix broken mutex usage in ioctl
If a new logical drive is added and the CCISS_REGNEWD ioctl is invoked
(as is normal with the Array Configuration Utility) the process will
hang as below. It attempts to acquire the same mutex twice, once in
do_ioctl() and once in cciss_unlocked_open(). The BKL was recursive,
the mutex isn't.
Linux version 3.10.0-rc2 (scameron@localhost.localdomain) (gcc version 4.4.7
20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Fri May 24 14:32:12 CDT 2013
[...]
acu D
0000000000000001 0 3246 3191 0x00000080
Call Trace:
schedule+0x29/0x70
schedule_preempt_disabled+0xe/0x10
__mutex_lock_slowpath+0x17b/0x220
mutex_lock+0x2b/0x50
cciss_unlocked_open+0x2f/0x110 [cciss]
__blkdev_get+0xd3/0x470
blkdev_get+0x5c/0x1e0
register_disk+0x182/0x1a0
add_disk+0x17c/0x310
cciss_add_disk+0x13a/0x170 [cciss]
cciss_update_drive_info+0x39b/0x480 [cciss]
rebuild_lun_table+0x258/0x370 [cciss]
cciss_ioctl+0x34f/0x470 [cciss]
do_ioctl+0x49/0x70 [cciss]
__blkdev_driver_ioctl+0x28/0x30
blkdev_ioctl+0x200/0x7b0
block_ioctl+0x3c/0x40
do_vfs_ioctl+0x89/0x350
SyS_ioctl+0xa1/0xb0
system_call_fastpath+0x16/0x1b
This mutex usage was added into the ioctl path when the big kernel lock
was removed. As it turns out, these paths are all thread safe anyway
(or can easily be made so) and we don't want ioctl() to be single
threaded in any case.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mike Miller <mike.miller@hp.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Wed, 12 Jun 2013 21:04:46 +0000 (14:04 -0700)]
audit: wait_for_auditd() should use TASK_UNINTERRUPTIBLE
audit_log_start() does wait_for_auditd() in a loop until
audit_backlog_wait_time passes or audit_skb_queue has a room.
If signal_pending() is true this becomes a busy-wait loop, schedule() in
TASK_INTERRUPTIBLE won't block.
Thanks to Guy for fully investigating and explaining the problem.
(akpm: that'll cause the system to lock up on a non-preemptible
uniprocessor kernel)
(Guy: "Our customer was in fact running a uniprocessor machine, and they
reported a system hang.")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Guy Streeter <streeter@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Derek Basehore [Wed, 12 Jun 2013 21:04:45 +0000 (14:04 -0700)]
drivers/rtc/rtc-cmos.c: fix accidentally enabling rtc channel
During resume, we call hpet_rtc_timer_init after masking an irq bit in
hpet. This will cause the call to hpet_disable_rtc_channel to be undone
if RTC_AIE is the only bit not masked.
Allowing the cmos interrupt handler to run before resuming caused some
issues where the timer for the alarm was not removed. This would cause
other, later timers to not be cleared, so utilities such as hwclock
would time out when waiting for the update interrupt.
[akpm@linux-foundation.org: coding-style tweak]
Signed-off-by: Derek Basehore <dbasehore@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dmitry Osipenko [Wed, 12 Jun 2013 21:04:44 +0000 (14:04 -0700)]
drivers/rtc/rtc-tps6586x.c: device wakeup flags correction
Use device_init_wakeup() instead of device_set_wakeup_capable() and move
it before rtc dev registering. This fixes alarmtimer not registered
when tps6586x rtc is the only wakeup compatible rtc in the system.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Cc: Laxman Dewangan <ldewangan@nvidia.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrey Vagin [Wed, 12 Jun 2013 21:04:42 +0000 (14:04 -0700)]
memcg: don't initialize kmem-cache destroying work for root caches
struct memcg_cache_params has a union. Different parts of this union
are used for root and non-root caches. A part with destroying work is
used only for non-root caches.
BUG: unable to handle kernel paging request at
0000000fffffffe0
IP: kmem_cache_alloc+0x41/0x1f0
Modules linked in: netlink_diag af_packet_diag udp_diag tcp_diag inet_diag unix_diag ip6table_filter ip6_tables i2c_piix4 virtio_net virtio_balloon microcode i2c_core pcspkr floppy
CPU: 0 PID: 1929 Comm: lt-vzctl Tainted: G D 3.10.0-rc1+ #2
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
RIP: kmem_cache_alloc+0x41/0x1f0
Call Trace:
getname_flags.part.34+0x30/0x140
getname+0x38/0x60
do_sys_open+0xc5/0x1e0
SyS_open+0x22/0x30
system_call_fastpath+0x16/0x1b
Code: f4 53 48 83 ec 18 8b 05 8e 53 b7 00 4c 8b 4d 08 21 f0 a8 10 74 0d 4c 89 4d c0 e8 1b 76 4a 00 4c 8b 4d c0 e9 92 00 00 00 4d 89 f5 <4d> 8b 45 00 65 4c 03 04 25 48 cd 00 00 49 8b 50 08 4d 8b 38 49
RIP [<
ffffffff8116b641>] kmem_cache_alloc+0x41/0x1f0
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Li Zefan <lizefan@huawei.com>
Cc: <stable@vger.kernel.org> [3.9.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Xiaowei.Hu [Wed, 12 Jun 2013 21:04:41 +0000 (14:04 -0700)]
ocfs2: ocfs2_prep_new_orphaned_file() should return ret
If an error occurs, for example an EIO in __ocfs2_prepare_orphan_dir,
ocfs2_prep_new_orphaned_file will release the inode_ac, then when the
caller of ocfs2_prep_new_orphaned_file gets a 0 return, it will refer to
a NULL ocfs2_alloc_context struct in the following functions. A kernel
panic happens.
Signed-off-by: "Xiaowei.Hu" <xiaowei.hu@oracle.com>
Reviewed-by: shencanquan <shencanquan@huawei.com>
Acked-by: Sunil Mushran <sunil.mushran@gmail.com>
Cc: Joe Jin <joe.jin@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chen Gang [Wed, 12 Jun 2013 21:04:40 +0000 (14:04 -0700)]
lib/mpi/mpicoder.c: looping issue, need stop when equal to zero, found by 'EXTRA_FLAGS=-W'.
For 'while' looping, need stop when 'nbytes == 0', or will cause issue.
('nbytes' is size_t which is always bigger or equal than zero).
The related warning: (with EXTRA_CFLAGS=-W)
lib/mpi/mpicoder.c:40:2: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: David Howells <dhowells@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Wed, 12 Jun 2013 21:04:39 +0000 (14:04 -0700)]
kmsg: honor dmesg_restrict sysctl on /dev/kmsg
The dmesg_restrict sysctl currently covers the syslog method for access
dmesg, however /dev/kmsg isn't covered by the same protections. Most
people haven't noticed because util-linux dmesg(1) defaults to using the
syslog method for access in older versions. With util-linux dmesg(1)
defaults to reading directly from /dev/kmsg.
To fix /dev/kmsg, let's compare the existing interfaces and what they
allow:
- /proc/kmsg allows:
- open (SYSLOG_ACTION_OPEN) if CAP_SYSLOG since it uses a destructive
single-reader interface (SYSLOG_ACTION_READ).
- everything, after an open.
- syslog syscall allows:
- anything, if CAP_SYSLOG.
- SYSLOG_ACTION_READ_ALL and SYSLOG_ACTION_SIZE_BUFFER, if
dmesg_restrict==0.
- nothing else (EPERM).
The use-cases were:
- dmesg(1) needs to do non-destructive SYSLOG_ACTION_READ_ALLs.
- sysklog(1) needs to open /proc/kmsg, drop privs, and still issue the
destructive SYSLOG_ACTION_READs.
AIUI, dmesg(1) is moving to /dev/kmsg, and systemd-journald doesn't
clear the ring buffer.
Based on the comments in devkmsg_llseek, it sounds like actions besides
reading aren't going to be supported by /dev/kmsg (i.e.
SYSLOG_ACTION_CLEAR), so we have a strict subset of the non-destructive
syslog syscall actions.
To this end, move the check as Josh had done, but also rename the
constants to reflect their new uses (SYSLOG_FROM_CALL becomes
SYSLOG_FROM_READER, and SYSLOG_FROM_FILE becomes SYSLOG_FROM_PROC).
SYSLOG_FROM_READER allows non-destructive actions, and SYSLOG_FROM_PROC
allows destructive actions after a capabilities-constrained
SYSLOG_ACTION_OPEN check.
- /dev/kmsg allows:
- open if CAP_SYSLOG or dmesg_restrict==0
- reading/polling, after open
Addresses https://bugzilla.redhat.com/show_bug.cgi?id=903192
[akpm@linux-foundation.org: use pr_warn_once()]
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Christian Kujau <lists@nerdbynature.de>
Tested-by: Josh Boyer <jwboyer@redhat.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robin Holt [Wed, 12 Jun 2013 21:04:37 +0000 (14:04 -0700)]
reboot: rigrate shutdown/reboot to boot cpu
We recently noticed that reboot of a 1024 cpu machine takes approx 16
minutes of just stopping the cpus. The slowdown was tracked to commit
f96972f2dc63 ("kernel/sys.c: call disable_nonboot_cpus() in
kernel_restart()").
The current implementation does all the work of hot removing the cpus
before halting the system. We are switching to just migrating to the
boot cpu and then continuing with shutdown/reboot.
This also has the effect of not breaking x86's command line parameter
for specifying the reboot cpu. Note, this code was shamelessly copied
from arch/x86/kernel/reboot.c with bits removed pertaining to the
reboot_cpu command line parameter.
Signed-off-by: Robin Holt <holt@sgi.com>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Russ Anderson <rja@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Srivatsa S. Bhat [Wed, 12 Jun 2013 21:04:36 +0000 (14:04 -0700)]
CPU hotplug: provide a generic helper to disable/enable CPU hotplug
There are instances in the kernel where we would like to disable CPU
hotplug (from sysfs) during some important operation. Today the freezer
code depends on this and the code to do it was kinda tailor-made for
that.
Restructure the code and make it generic enough to be useful for other
usecases too.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Russ Anderson <rja@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David S. Miller [Wed, 12 Jun 2013 20:35:24 +0000 (13:35 -0700)]
Merge branch 'wireless'
John W. Linville says:
====================
For now I have dropped the mac80211 tree from this request.
We are developing a little backlog of fixes and I would like to
avoid introducing any more uncertainty to this pull request for the
3.10 stream. All the other bits are the same as what was in the
2013-06-06 request, including the ath9k fixes intended to address
the problems observed by Linus w/ his Pixel, and a CVE fix for a
potential security issue in the b43 driver.
Regarding the wl12xx bits, Luca says:
"Here are three patches that I'd like to get into 3.10. Two of them, by
me, are related to the firmware version checks in our driver. Without
them, the firmwares fail to load. The other one, by Eliad, fixes a typo
bug in our 5GHz scanning code."
And as for the Bluetooth bits, Gustavo says:
"The following patches are important bug fixes for 3.10, plus the
support for a new device. We do have three fixes from Johan. The first
one is a fix to avoid LE-only devices to rely on the (inexistent)
extended features data. The second patch fixes length checks on
incoming L2CAP signalling PDUs so we can discard PDU whose size
doesn't match the one reported in the header. The last one fixes
the handling of power on failures, we now report proper errors to
mgmt when hci_dev_open()."
Along with that...
Larry Finger corrects an rtlwifi problem that caused some devices to
refuse to connect to non-WPA2 networks if the device had previously
assocated with a WPA2 network. He also adds a one-line fix to prevent
false reports from kmemleak.
Mark A. Greer fixes an out of bounds array access in mwifiex.
Felix Fietkau reverts an earlier ath9k initval patch that reduced rx
sensitivity in a number of ath9k devices with no corresponding benefit.
Kees Cook fixes a potential uid-0 to ring-0 escalation in b43
(CVE-2013-2852).
Sujith Manoharan turns-off powersave mode by default for ath9k, and
also defaults ath9k to use the minstrel_ht rate control algorithm.
Both of these are believed to contribute to greater stability/usability
of ath9k in real-world situations.
Yijing Wang fixes an iwlegacy build error for il_pm_ops if CONFIG_PM
is set but CONFIG_PM_SLEEP is not set.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 12 Jun 2013 18:48:14 +0000 (11:48 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
"Resurrect Alchemy platforms by invoking the WAIT instructions with
interrupts enabled. This still leaves the race condition between
testing TIF_NEED_RESCHED and the WAIT instruction for Alchemy
platforms which need a different fix than other MIPS platforms. But
at least it gets MIPS platforms flying again.
There are also fixes for two build errors (CONFIG_FTRACE=y with
CONFIG_DYNAMIC_FTRACE=n) and CONFIG_VIRTUALIZATION without CONFIG_KVM"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: ftrace: Add missing CONFIG_DYNAMIC_FTRACE
MIPS: include: mmu_context.h: Replace VIRTUALIZATION with KVM
MIPS: Alchemy: fix wait function
Linus Torvalds [Wed, 12 Jun 2013 18:34:26 +0000 (11:34 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Just some GMA500 memory leaks and i915 regression fix due to a
regression fix"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/i915: prefer VBT modes for SVDO-LVDS over EDID
drm/i915: Enable hotplug interrupts after querying hw capabilities.
drm/i915: Fix hotplug interrupt enabling for SDVOC
drm/gma500/cdv: Fix cursor gem obj referencing on cdv
drm/gma500/psb: Fix cursor gem obj referencing on psb
drm/gma500/cdv: Unpin framebuffer on crtc disable
drm/gma500/psb: Unpin framebuffer on crtc disable
drm/gma500: Add fb gtt offset to fb base
Linus Torvalds [Wed, 12 Jun 2013 15:29:11 +0000 (08:29 -0700)]
Merge tag 'trace-fixes-v3.10-rc5' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"Yoshihiro Yunomae fixed a regression in the output format when using
one of the counter clocks.
The new multibuffer code changed the trace_clock file to update the
trace instances tr->clock_id but the actual traces still used the
value from the obsolete global variable trace_clock_id"
* tag 'trace-fixes-v3.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix outputting formats of x86-tsc and counter when use trace_clock
Linus Torvalds [Wed, 12 Jun 2013 15:28:19 +0000 (08:28 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
Pull ceph fixes from Sage Weil:
"There is a pair of fixes for double-frees in the recent bundle for
3.10, a couple of fixes for long-standing bugs (sleep while atomic and
an endianness fix), and a locking fix that can be triggered when osds
are going down"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: fix cleanup in rbd_add()
rbd: don't destroy ceph_opts in rbd_add()
ceph: ceph_pagelist_append might sleep while atomic
ceph: add cpu_to_le32() calls when encoding a reconnect capability
libceph: must hold mutex for reset_changed_osds()
John W. Linville [Wed, 12 Jun 2013 14:57:04 +0000 (10:57 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless into for-davem
Kees Cook [Fri, 10 May 2013 21:48:21 +0000 (14:48 -0700)]
b43: stop format string leaking into error msgs
The module parameter "fwpostfix" is userspace controllable, unfiltered,
and is used to define the firmware filename. b43_do_request_fw() populates
ctx->errors[] on error, containing the firmware filename. b43err()
parses its arguments as a format string. For systems with b43 hardware,
this could lead to a uid-0 to ring-0 escalation.
CVE-2013-2852
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Sujith Manoharan [Thu, 6 Jun 2013 04:36:29 +0000 (10:06 +0530)]
ath9k: Use minstrel rate control by default
The ath9k rate control algorithm has various architectural
issues that make it a poor fit in scenarios like congested
environments etc.
An example: https://bugzilla.redhat.com/show_bug.cgi?id=927191
Change the default to minstrel which is more robust in such cases.
The ath9k RC code is left in the driver for now, maybe it can
be removed altogether later on.
Cc: stable@vger.kernel.org
Cc: Jouni Malinen <jouni@qca.qualcomm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Felix Fietkau [Mon, 3 Jun 2013 09:18:57 +0000 (11:18 +0200)]
Revert "ath9k_hw: Update rx gain initval to improve rx sensitivity"
This reverts commit
68d9e1fa24d9c7c2e527f49df8d18fb8cf0ec943
This change reduces rx sensitivity with no apparent extra benefit.
It looks like it was meant for testing in a specific scenario,
but it was never properly validated.
Cc: rmanohar@qca.qualcomm.com
Cc: stable@vger.kernel.org
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Sujith Manoharan [Sat, 1 Jun 2013 01:38:09 +0000 (07:08 +0530)]
ath9k: Disable PowerSave by default
Almost all the DMA issues which have plagued ath9k (in station mode)
for years are related to PS. Disabling PS usually "fixes" the user's
connection stablility. Reports of DMA problems are still trickling in
and are sitting in the kernel bugzilla. Until the PS code in ath9k is
given a thorough review, disbale it by default. The slight increase
in chip power consumption is a small price to pay for improved link
stability.
Cc: stable@vger.kernel.org
Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Yijing Wang [Fri, 31 May 2013 06:05:32 +0000 (14:05 +0800)]
net: wireless: iwlegacy: fix build error for il_pm_ops
Fix build error for il_pm_ops if CONFIG_PM is set
but CONFIG_PM_SLEEP is not set.
ERROR: "il_pm_ops" [drivers/net/wireless/iwlegacy/iwl4965.ko] undefined!
ERROR: "il_pm_ops" [drivers/net/wireless/iwlegacy/iwl3945.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: netdev@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: Jingoo Han <jg1.han@samsung.com>
Acked-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Larry Finger [Thu, 30 May 2013 21:21:47 +0000 (16:21 -0500)]
rtlwifi: Fix a false leak indication for PCI devices
This false leak indication is avoided with a no-leak annotation to kmemleak.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Eliad Peller [Tue, 7 May 2013 12:41:09 +0000 (15:41 +0300)]
wl12xx/wl18xx: scan all 5ghz channels
Due to a typo, the current code copies only sizeof(cmd->channels_2)
bytes, which is smaller than the correct sizeof(cmd->channels_5)
size, resulting in a partial scan (some channels are skipped).
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Luciano Coelho [Fri, 10 May 2013 07:44:25 +0000 (10:44 +0300)]
wl12xx: increase minimum singlerole firmware version required
The minimum firmware version required for singlerole after recent
driver changes is 6/7.3.10.0.133.
Reported-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Luciano Coelho [Fri, 10 May 2013 07:19:38 +0000 (10:19 +0300)]
wl12xx: fix minimum required firmware version for wl127x multirole
There was a typo in commit 8675f9 (wlcore/wl12xx/wl18xx: verify
multi-role and single-role fw versions), which was causing the
multirole firmware for wl127x (WiLink6) to be rejected. The actual
minimum version needed for wl127x multirole is 6.5.7.0.42.
Reported-by: Levi Pearson <levipearson@gmail.com>
Reported-by: Michael Scott <hashcode0f@gmail.com>
Cc: stable@kernel.org # 3.9+
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Larry Finger [Thu, 30 May 2013 23:05:55 +0000 (18:05 -0500)]
rtlwifi: rtl8192cu: Fix problem in connecting to WEP or WPA(1) networks
Driver rtl8192cu can connect to WPA2 networks, but fails for any other
encryption method. The cause is a failure to set the rate control data
blocks. These changes fix https://bugzilla.redhat.com/show_bug.cgi?id=952793
and https://bugzilla.redhat.com/show_bug.cgi?id=761525.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mark A. Greer [Wed, 29 May 2013 19:25:34 +0000 (12:25 -0700)]
mwifiex: debugfs: Fix out of bounds array access
When reading the contents of '/sys/kernel/debug/mwifiex/p2p0/info',
the following panic occurs:
$ cat /sys/kernel/debug/mwifiex/p2p0/info
Unable to handle kernel paging request at virtual address
74706164
pgd =
de530000
[
74706164] *pgd=
00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: phy_twl4030_usb omap2430 musb_hdrc mwifiex_sdio mwifiex
CPU: 0 PID: 1635 Comm: cat Not tainted
3.10.0-rc1-00010-g1268390 #1
task:
de16b6c0 ti:
de048000 task.ti:
de048000
PC is at strnlen+0xc/0x4c
LR is at string+0x3c/0xf8
pc : [<
c02c123c>] lr : [<
c02c2d1c>] psr:
a0000013
sp :
de049e10 ip :
c06efba0 fp :
de6d2092
r10:
bf01a260 r9 :
ffffffff r8 :
74706164
r7 :
0000ffff r6 :
ffffffff r5 :
de6d209c r4 :
00000000
r3 :
ff0a0004 r2 :
74706164 r1 :
ffffffff r0 :
74706164
Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
Control:
10c5387d Table:
9e530019 DAC:
00000015
Process cat (pid: 1635, stack limit = 0xde048240)
Stack: (0xde049e10 to 0xde04a000)
9e00:
de6d2092 00000002 bf01a25e de6d209c
9e20:
de049e80 c02c438c 0000000a ff0a0004 ffffffff 00000000 00000000 de049e48
9e40:
00000000 2192df6d ff0a0004 ffffffff 00000000 de6d2092 de049ef8 bef3cc00
9e60:
de6b0000 dc358000 de6d2000 00000000 00000003 c02c45a4 bf01790c bf01a254
9e80:
74706164 bf018698 00000000 de59c3c0 de048000 de049f80 00001000 bef3cc00
9ea0:
00000008 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9ec0:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9ee0:
00000000 00000000 00000000 00000001 00000000 00000000 6669776d 20786569
9f00:
20302e31 2e343128 392e3636 3231702e 00202933 00000000 00000003 c0294898
9f20:
00000000 00000000 00000000 00000000 de59c3c0 c0107c04 de554000 de59c3c0
9f40:
00001000 bef3cc00 de049f80 bef3cc00 de049f80 00000000 00000003 c0108a00
9f60:
de048000 de59c3c0 00000000 00000000 de59c3c0 00001000 bef3cc00 c0108b60
9f80:
00000000 00000000 00001000 bef3cc00 00000003 00000003 c0014128 de048000
9fa0:
00000000 c0013f80 00001000 bef3cc00 00000003 bef3cc00 00001000 00000000
9fc0:
00001000 bef3cc00 00000003 00000003 00000001 00000001 00000001 00000003
9fe0:
00000000 bef3cbdc 00011984 b6f1127c 60000010 00000003 18dbdd2c 7f7bfffd
[<
c02c123c>] (strnlen+0xc/0x4c) from [<
c02c2d1c>] (string+0x3c/0xf8)
[<
c02c2d1c>] (string+0x3c/0xf8) from [<
c02c438c>] (vsnprintf+0x1e8/0x3e8)
[<
c02c438c>] (vsnprintf+0x1e8/0x3e8) from [<
c02c45a4>] (sprintf+0x18/0x24)
[<
c02c45a4>] (sprintf+0x18/0x24) from [<
bf01790c>] (mwifiex_info_read+0xfc/0x3e8 [mwifiex])
[<
bf01790c>] (mwifiex_info_read+0xfc/0x3e8 [mwifiex]) from [<
c0108a00>] (vfs_read+0xb0/0x144)
[<
c0108a00>] (vfs_read+0xb0/0x144) from [<
c0108b60>] (SyS_read+0x44/0x70)
[<
c0108b60>] (SyS_read+0x44/0x70) from [<
c0013f80>] (ret_fast_syscall+0x0/0x30)
Code:
e12fff1e e3510000 e1a02000 0a00000d (
e5d03000)
---[ end trace
ca98273dc605a04f ]---
The panic is caused by the mwifiex_info_read() routine assuming that
there can only be four modes (0-3) which is an invalid assumption.
For example, when testing P2P, the mode is '8' (P2P_CLIENT) so the
code accesses data beyond the bounds of the bss_modes[] array which
causes the panic. Fix this by updating bss_modes[] to support the
current list of modes and adding a check to prevent the out-of-bounds
access from occuring in the future when more modes are added.
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Acked-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johan Hedberg [Wed, 29 May 2013 06:51:29 +0000 (09:51 +0300)]
Bluetooth: Fix mgmt handling of power on failures
If hci_dev_open fails we need to ensure that the corresponding
mgmt_set_powered command gets an appropriate response. This patch fixes
the missing response by adding a new mgmt_set_powered_failed function
that's used to indicate a power on failure to mgmt. Since a situation
with the device being rfkilled may require special handling in user
space the patch uses a new dedicated mgmt status code for this.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Cc: stable@vger.kernel.org
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johan Hedberg [Tue, 28 May 2013 10:46:30 +0000 (13:46 +0300)]
Bluetooth: Fix missing length checks for L2CAP signalling PDUs
There has been code in place to check that the L2CAP length header
matches the amount of data received, but many PDU handlers have not been
checking that the data received actually matches that expected by the
specific PDU. This patch adds passing the length header to the specific
handler functions and ensures that those functions fail cleanly in the
case of an incorrect amount of data.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Bing Zhao [Tue, 14 May 2013 01:15:32 +0000 (18:15 -0700)]
Bluetooth: btmrvl: support Marvell Bluetooth device SD8897
The register offsets have been changed in SD8897 and newer chips.
Define a new btmrvl_sdio_card_reg map for SD88xx.
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: Frank Huang <frankh@marvell.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johan Hedberg [Wed, 24 Apr 2013 10:05:32 +0000 (13:05 +0300)]
Bluetooth: Fix checks for LE support on LE-only controllers
LE-only controllers do not support extended features so any kind of host
feature bit checks do not make sense for them. This patch fixes code
used for both single-mode (LE-only) and dual-mode (BR/EDR/LE) to use the
HCI_LE_ENABLED flag instead of the "Host LE supported" feature bit for
LE support tests.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Benjamin Tissoires [Wed, 29 May 2013 08:45:09 +0000 (10:45 +0200)]
HID: multitouch: prevent memleak with the allocated name
mt_free_input_name() was never called during .remove():
hid_hw_stop() removes the hid_input items in hdev->inputs, and so the
list is therefore empty after the call. In the end, we never free the
special names that has been allocated during .probe().
Restore the original name before freeing it to avoid acessing already
freed pointer.
This fixes a regression introduced by
49a5a827a ("HID: multitouch: append " Pen" to
the name of the stylus input")
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Jiri Pirko [Sat, 8 Jun 2013 13:00:55 +0000 (15:00 +0200)]
team: fix checks in team_get_first_port_txable_rcu()
should be checked if "cur" is txable, not "port".
Introduced by commit
6e88e1357c "team: use function team_port_txable()
for determing enabled and up port"
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sat, 8 Jun 2013 13:00:54 +0000 (15:00 +0200)]
team: move add to port list before port enablement
team_port_enable() adds port to port_hashlist. Reader sees port
in team_get_port_by_index_rcu() and returns it, but
team_get_first_port_txable_rcu() tries to go through port_list, where the
port is not inserted yet -> NULL pointer dereference.
Fix this by reordering port_list and port_hashlist insertion.
Panic is easily triggeable when txing packets and adding/removing port
in a loop.
Introduced by commit
3d249d4c "net: introduce ethernet teaming device"
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sat, 8 Jun 2013 13:00:53 +0000 (15:00 +0200)]
team: check return value of team_get_port_by_index_rcu() for NULL
team_get_port_by_index_rcu() might return NULL due to race between port
removal and skb tx path. Panic is easily triggeable when txing packets
and adding/removing port in a loop.
introduced by commit
3d249d4ca "net: introduce ethernet teaming device"
and commit
753f993911b "team: introduce random mode" (for random mode)
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Sat, 8 Jun 2013 06:17:41 +0000 (14:17 +0800)]
tuntap: set SOCK_ZEROCOPY flag during open
Commit
54f968d6efdbf7dec36faa44fc11f01b0e4d1990
(tuntap: move socket to tun_file) forgets to set SOCK_ZEROCOPY flag, which will
prevent vhost_net from doing zercopy w/ tap. This patch fixes this by setting
it during file open.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 12 Jun 2013 06:07:21 +0000 (23:07 -0700)]
Merge branch 'fixes-3.10' of git://git.infradead.org/users/willy/linux-nvme
Pull NVMe fixes from Matthew Wilcox.
* 'fixes-3.10' of git://git.infradead.org/users/willy/linux-nvme:
NVMe: Add MSI support
NVMe: Use dma_set_mask() correctly
Return the result from user admin command IOCTL even in case of failure
NVMe: Do not cancel command multiple times
NVMe: fix error return code in nvme_submit_bio_queue()
NVMe: check for integer overflow in nvme_map_user_pages()
MAINTAINERS: update NVM EXPRESS DRIVER file list
NVMe: Fix a signedness bug in nvme_trans_modesel_get_mp
NVMe: Remove redundant version.h header include
Linus Torvalds [Tue, 11 Jun 2013 18:16:43 +0000 (11:16 -0700)]
Merge branch 'fixes' of git://git./virt/kvm/kvm
Pull kvm bugfixes from Gleb Natapov:
"There is one more fix for MIPS KVM ABI here, MIPS and PPC build
breakage fixes and a couple of PPC bug fixes"
* 'fixes' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm/ppc/booke64: Fix lazy ee handling in kvmppc_handle_exit()
kvm/ppc/booke: Hold srcu lock when calling gfn functions
kvm/ppc/booke64: Disable e6500 support
kvm/ppc/booke64: Fix AltiVec interrupt numbers and build breakage
mips/kvm: Use KVM_REG_MIPS and proper size indicators for *_ONE_REG
kvm: Add definition of KVM_REG_MIPS
KVM: add kvm_para_available to asm-generic/kvm_para.h
Yoshihiro YUNOMAE [Tue, 23 Apr 2013 01:32:39 +0000 (10:32 +0900)]
tracing: Fix outputting formats of x86-tsc and counter when use trace_clock
Outputting formats of x86-tsc and counter should be a raw format, but after
applying the patch(
2b6080f28c7cc3efc8625ab71495aae89aeb63a0), the format was
changed to nanosec. This is because the global variable trace_clock_id was used.
When we use multiple buffers, clock_id of each sub-buffer should be used. Then,
this patch uses tr->clock_id instead of the global variable trace_clock_id.
[ Basically, this fixes a regression where the multibuffer code changed the
trace_clock file to update tr->clock_id but the traces still use the old
global trace_clock_id variable, negating the file's effect. The global
trace_clock_id variable is obsolete and removed. - SR ]
Link: http://lkml.kernel.org/r/20130423013239.22334.7394.stgit@yunodevel
Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Patrick McHardy [Tue, 11 Jun 2013 09:52:47 +0000 (02:52 -0700)]
netlink: fix error propagation in netlink_mmap()
Return the error if something went wrong instead of unconditionally
returning 0.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Thu, 6 Jun 2013 13:53:47 +0000 (15:53 +0200)]
net: sctp: fix NULL pointer dereference in socket destruction
While stress testing sctp sockets, I hit the following panic:
BUG: unable to handle kernel NULL pointer dereference at
0000000000000020
IP: [<
ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
PGD
7cead067 PUD
7ce76067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: sctp(F) libcrc32c(F) [...]
CPU: 7 PID: 2950 Comm: acc Tainted: GF 3.10.0-rc2+ #1
Hardware name: Dell Inc. PowerEdge T410/0H19HD, BIOS 1.6.3 02/01/2011
task:
ffff88007ce0e0c0 ti:
ffff88007b568000 task.ti:
ffff88007b568000
RIP: 0010:[<
ffffffffa0490c4e>] [<
ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
RSP: 0018:
ffff88007b569e08 EFLAGS:
00010292
RAX:
0000000000000000 RBX:
ffff88007db78a00 RCX:
dead000000200200
RDX:
ffffffffa049fdb0 RSI:
ffff8800379baf38 RDI:
0000000000000000
RBP:
ffff88007b569e18 R08:
ffff88007c230da0 R09:
0000000000000001
R10:
0000000000000000 R11:
0000000000000000 R12:
0000000000000000
R13:
ffff880077990d00 R14:
0000000000000084 R15:
ffff88007db78a00
FS:
00007fc18ab61700(0000) GS:
ffff88007fc60000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
CR2:
0000000000000020 CR3:
000000007cf9d000 CR4:
00000000000007e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000ffff0ff0 DR7:
0000000000000400
Stack:
ffff88007b569e38 ffff88007db78a00 ffff88007b569e38 ffffffffa049fded
ffffffff81abf0c0 ffff88007db78a00 ffff88007b569e58 ffffffff8145b60e
0000000000000000 0000000000000000 ffff88007b569eb8 ffffffff814df36e
Call Trace:
[<
ffffffffa049fded>] sctp_destroy_sock+0x3d/0x80 [sctp]
[<
ffffffff8145b60e>] sk_common_release+0x1e/0xf0
[<
ffffffff814df36e>] inet_create+0x2ae/0x350
[<
ffffffff81455a6f>] __sock_create+0x11f/0x240
[<
ffffffff81455bf0>] sock_create+0x30/0x40
[<
ffffffff8145696c>] SyS_socket+0x4c/0xc0
[<
ffffffff815403be>] ? do_page_fault+0xe/0x10
[<
ffffffff8153cb32>] ? page_fault+0x22/0x30
[<
ffffffff81544e02>] system_call_fastpath+0x16/0x1b
Code: 0c c9 c3 66 2e 0f 1f 84 00 00 00 00 00 e8 fb fe ff ff c9 c3 66 0f
1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 <48>
8b 47 20 48 89 fb c6 47 1c 01 c6 40 12 07 e8 9e 68 01 00 48
RIP [<
ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
RSP <
ffff88007b569e08>
CR2:
0000000000000020
---[ end trace
e0d71ec1108c1dd9 ]---
I did not hit this with the lksctp-tools functional tests, but with a
small, multi-threaded test program, that heavily allocates, binds,
listens and waits in accept on sctp sockets, and then randomly kills
some of them (no need for an actual client in this case to hit this).
Then, again, allocating, binding, etc, and then killing child processes.
This panic then only occurs when ``echo 1 > /proc/sys/net/sctp/auth_enable''
is set. The cause for that is actually very simple: in sctp_endpoint_init()
we enter the path of sctp_auth_init_hmacs(). There, we try to allocate
our crypto transforms through crypto_alloc_hash(). In our scenario,
it then can happen that crypto_alloc_hash() fails with -EINTR from
crypto_larval_wait(), thus we bail out and release the socket via
sk_common_release(), sctp_destroy_sock() and hit the NULL pointer
dereference as soon as we try to access members in the endpoint during
sctp_endpoint_free(), since endpoint at that time is still NULL. Now,
if we have that case, we do not need to do any cleanup work and just
leave the destruction handler.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael S. Tsirkin [Thu, 6 Jun 2013 12:20:46 +0000 (15:20 +0300)]
vhost: fix ubuf_info cleanup
vhost_net_clear_ubuf_info didn't clear ubuf_info
after kfree, this could trigger double free.
Fix this and simplify this code to make it more robust: make sure
ubuf info is always freed through vhost_net_clear_ubuf_info.
Reported-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael S. Tsirkin [Thu, 6 Jun 2013 12:20:39 +0000 (15:20 +0300)]
vhost: check owner before we overwrite ubuf_info
If device has an owner, we shouldn't touch ubuf_info
since it might be in use.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Thu, 6 Jun 2013 10:57:02 +0000 (12:57 +0200)]
qmi_wwan/cdc_ether: let qmi_wwan handle the Huawei E1820
Another QMI speaking Qualcomm based device, which should be
driven by qmi_wwan, while cdc_ether should ignore it.
Like on other Huawei devices, the wwan function can appear
either as a single vendor specific interface or as a CDC ECM
class function using separate control and data interfaces.
The ECM control interface protocol is 0xff, likely in an
attempt to indicate that vendor specific management is
required.
In addition to the near standard CDC class, Huawei also add
vendor specific AT management commands to their firmwares.
This is probably an attempt to support non-Windows systems
using standard class drivers. Unfortunately, this part of
the firmware is often buggy. Linux is much better off using
whatever native vendor specific management protocol the
device offers, and Windows uses, whenever possible. This
means QMI in the case of Qualcomm based devices.
The E1820 has been verified to work fine with QMI.
Matching on interface number is necessary to distiguish the
wwan function from serial functions in the single interface
mode, as both function types will have class/subclass/function
set to ff/ff/ff.
The control interface number does not change in CDC ECM mode,
so the interface number matching rule is sufficient to handle
both modes. The cdc_ether blacklist entry is only relevant in
CDC ECM mode, but using a similar interface number based rule
helps document this as a transfer from one driver to another.
Other Huawei 02/06/ff devices are left with the cdc_ether driver
because we do not know whether they are based on Qualcomm chips.
The Huawei specific AT command management is known to be somewhat
hardware independent, and their usage of these class codes may
also be independent of the modem hardware.
Reported-by: Graham Inggs <graham.inggs@uct.ac.za>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Airlie [Tue, 11 Jun 2013 09:38:27 +0000 (19:38 +1000)]
Merge tag 'drm-intel-fixes-2013-06-11' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes
Daniel writes:
Just tiny regression fixes here:
- Two fixes to fix sdvo hotplug which broke in the hpd storm detection
work.
- One fix to patch-up the sdvo lvds regression fixer from the last pull -
we need to prefer the vbt mode over edid modes.
* tag 'drm-intel-fixes-2013-06-11' of git://people.freedesktop.org/~danvet/drm-intel:
drm/i915: prefer VBT modes for SVDO-LVDS over EDID
drm/i915: Enable hotplug interrupts after querying hw capabilities.
drm/i915: Fix hotplug interrupt enabling for SDVOC
Sergei Shtylyov [Wed, 5 Jun 2013 19:54:01 +0000 (23:54 +0400)]
sh_eth: fix result of sh_eth_check_reset() on timeout
When the first loop in sh_eth_check_reset() runs to its end, 'cnt' is 0, so the
following check for 'cnt < 0' fails to catch the timeout. Fix the condition in
this check, so that the timeout is actually reported.
While at it, fix the grammar in the failure message...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sebastian Siewior [Wed, 5 Jun 2013 16:54:00 +0000 (18:54 +0200)]
net/ti davinci_mdio: don't hold a spin lock while calling pm_runtime
was playing with suspend and run into this:
|BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:891
|in_atomic(): 1, irqs_disabled(): 0, pid: 1963, name: bash
|6 locks held by bash/1963:
|CPU: 0 PID: 1963 Comm: bash Not tainted 3.10.0-rc4+ #50
|[<
c0014fdc>] (unwind_backtrace+0x0/0xf8) from [<
c0011da4>] (show_stack+0x10/0x14)
|[<
c0011da4>] (show_stack+0x10/0x14) from [<
c02e8680>] (__pm_runtime_idle+0xa4/0xac)
|[<
c02e8680>] (__pm_runtime_idle+0xa4/0xac) from [<
c0341158>] (davinci_mdio_suspend+0x6c/0x9c)
|[<
c0341158>] (davinci_mdio_suspend+0x6c/0x9c) from [<
c02e0628>] (platform_pm_suspend+0x2c/0x54)
|[<
c02e0628>] (platform_pm_suspend+0x2c/0x54) from [<
c02e52bc>] (dpm_run_callback.isra.3+0x2c/0x64)
|[<
c02e52bc>] (dpm_run_callback.isra.3+0x2c/0x64) from [<
c02e57e4>] (__device_suspend+0x100/0x22c)
|[<
c02e57e4>] (__device_suspend+0x100/0x22c) from [<
c02e67e8>] (dpm_suspend+0x68/0x230)
|[<
c02e67e8>] (dpm_suspend+0x68/0x230) from [<
c0072a20>] (suspend_devices_and_enter+0x68/0x350)
|[<
c0072a20>] (suspend_devices_and_enter+0x68/0x350) from [<
c0072f18>] (pm_suspend+0x210/0x24c)
|[<
c0072f18>] (pm_suspend+0x210/0x24c) from [<
c0071c74>] (state_store+0x6c/0xbc)
|[<
c0071c74>] (state_store+0x6c/0xbc) from [<
c02714dc>] (kobj_attr_store+0x14/0x20)
|[<
c02714dc>] (kobj_attr_store+0x14/0x20) from [<
c01341a0>] (sysfs_write_file+0x16c/0x19c)
|[<
c01341a0>] (sysfs_write_file+0x16c/0x19c) from [<
c00ddfe4>] (vfs_write+0xb4/0x190)
|[<
c00ddfe4>] (vfs_write+0xb4/0x190) from [<
c00de3a4>] (SyS_write+0x3c/0x70)
|[<
c00de3a4>] (SyS_write+0x3c/0x70) from [<
c000e2c0>] (ret_fast_syscall+0x0/0x48)
I don't see a reason why the pm_runtime call must be under the lock.
Further I don't understand why this is a spinlock and not mutex.
Cc: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Scott Wood [Fri, 7 Jun 2013 00:16:32 +0000 (19:16 -0500)]
kvm/ppc/booke64: Fix lazy ee handling in kvmppc_handle_exit()
EE is hard-disabled on entry to kvmppc_handle_exit(), so call
hard_irq_disable() so that PACA_IRQ_HARD_DIS is set, and soft_enabled
is unset.
Without this, we get warnings such as arch/powerpc/kernel/time.c:300,
and sometimes host kernel hangs.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Scott Wood [Fri, 7 Jun 2013 00:16:31 +0000 (19:16 -0500)]
kvm/ppc/booke: Hold srcu lock when calling gfn functions
KVM core expects arch code to acquire the srcu lock when calling
gfn_to_memslot and similar functions.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Scott Wood [Fri, 7 Jun 2013 00:16:30 +0000 (19:16 -0500)]
kvm/ppc/booke64: Disable e6500 support
The previous patch made 64-bit booke KVM build again, but Altivec
support is still not complete, and we can't prevent the guest from
turning on Altivec (which can corrupt host state until state
save/restore is implemented). Disable e6500 on KVM until this is
fixed.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Mihai Caraman [Fri, 7 Jun 2013 00:16:29 +0000 (19:16 -0500)]
kvm/ppc/booke64: Fix AltiVec interrupt numbers and build breakage
Interrupt numbers defined for Book3E follows IVORs definition. Align
BOOKE_INTERRUPT_ALTIVEC_UNAVAIL and BOOKE_INTERRUPT_ALTIVEC_ASSIST to this
rule which also fixes the build breakage.
IVORs 32 and 33 are shared so reflect this in the interrupts naming.
This fixes a build break for 64-bit booke KVM.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
David Daney [Mon, 10 Jun 2013 19:33:48 +0000 (12:33 -0700)]
mips/kvm: Use KVM_REG_MIPS and proper size indicators for *_ONE_REG
The API requires that the GET_ONE_REG and SET_ONE_REG ioctls have this
extra information encoded in the register identifiers.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
David Daney [Mon, 10 Jun 2013 19:33:47 +0000 (12:33 -0700)]
kvm: Add definition of KVM_REG_MIPS
We use 0x7000000000000000ULL as 0x6000000000000000ULL is reserved for
ARM64.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Nicolas Dichtel [Wed, 5 Jun 2013 13:30:55 +0000 (15:30 +0200)]
sock_diag: fix filter code sent to userspace
Filters need to be translated to real BPF code for userland, like SO_GETFILTER.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Greear [Thu, 6 Jun 2013 21:29:49 +0000 (14:29 -0700)]
Fix lockup related to stop_machine being stuck in __do_softirq.
The stop machine logic can lock up if all but one of the migration
threads make it through the disable-irq step and the one remaining
thread gets stuck in __do_softirq. The reason __do_softirq can hang is
that it has a bail-out based on jiffies timeout, but in the lockup case,
jiffies itself is not incremented.
To work around this, re-add the max_restart counter in __do_irq and stop
processing irqs after 10 restarts.
Thanks to Tejun Heo and Rusty Russell and others for helping me track
this down.
This was introduced in 3.9 by commit
c10d73671ad3 ("softirq: reduce
latencies").
It may be worth looking into ath9k to see if it has issues with its irq
handler at a later date.
The hang stack traces look something like this:
------------[ cut here ]------------
WARNING: at kernel/watchdog.c:245 watchdog_overflow_callback+0x9c/0xa7()
Watchdog detected hard LOCKUP on cpu 2
Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
Pid: 23, comm: migration/2 Tainted: G C 3.9.4+ #11
Call Trace:
<NMI> warn_slowpath_common+0x85/0x9f
warn_slowpath_fmt+0x46/0x48
watchdog_overflow_callback+0x9c/0xa7
__perf_event_overflow+0x137/0x1cb
perf_event_overflow+0x14/0x16
intel_pmu_handle_irq+0x2dc/0x359
perf_event_nmi_handler+0x19/0x1b
nmi_handle+0x7f/0xc2
do_nmi+0xbc/0x304
end_repeat_nmi+0x1e/0x2e
<<EOE>>
cpu_stopper_thread+0xae/0x162
smpboot_thread_fn+0x258/0x260
kthread+0xc7/0xcf
ret_from_fork+0x7c/0xb0
---[ end trace
4947dfa9b0a4cec3 ]---
BUG: soft lockup - CPU#1 stuck for 22s! [migration/1:17]
Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
irq event stamp:
835637905
hardirqs last enabled at (
835637904): __do_softirq+0x9f/0x257
hardirqs last disabled at (
835637905): apic_timer_interrupt+0x6d/0x80
softirqs last enabled at (
5654720): __do_softirq+0x1ff/0x257
softirqs last disabled at (
5654725): irq_exit+0x5f/0xbb
CPU 1
Pid: 17, comm: migration/1 Tainted: G WC 3.9.4+ #11 To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M.
RIP: tasklet_hi_action+0xf0/0xf0
Process migration/1
Call Trace:
<IRQ>
__do_softirq+0x117/0x257
irq_exit+0x5f/0xbb
smp_apic_timer_interrupt+0x8a/0x98
apic_timer_interrupt+0x72/0x80
<EOI>
printk+0x4d/0x4f
stop_machine_cpu_stop+0x22c/0x274
cpu_stopper_thread+0xae/0x162
smpboot_thread_fn+0x258/0x260
kthread+0xc7/0xcf
ret_from_fork+0x7c/0xb0
Signed-off-by: Ben Greear <greearb@candelatech.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Pekka Riikonen <priikone@iki.fi>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 11 Jun 2013 00:35:25 +0000 (17:35 -0700)]
Merge tag '9p-3.10-bug-fix-1' of git://git./linux/kernel/git/ericvh/v9fs
Pull net/9p bug fix from Eric Van Hensbergen:
"zero copy error fix"
* tag '9p-3.10-bug-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
net/9p: Handle error in zero copy request correctly for 9p2000.u
Dave Airlie [Mon, 10 Jun 2013 22:16:10 +0000 (08:16 +1000)]
Merge branch 'gma500-fixes' of git://github.com/patjak/drm-gma500 into drm-fixes
Patrik writes:
Two fixes for memory leaks split into Cedarview and Poulsbo versions,
and a fix for properly setting the pipe base when using fbdev. It's on
my todo-list to start unifying the chips since they are very similar,
but until then I'd like to split them up in case there are side-effects
on Cedarview that I cannot currently test.
airled: Verified pull from github matches what I expected.
* 'gma500-fixes' of git://github.com/patjak/drm-gma500:
drm/gma500/cdv: Fix cursor gem obj referencing on cdv
drm/gma500/psb: Fix cursor gem obj referencing on psb
drm/gma500/cdv: Unpin framebuffer on crtc disable
drm/gma500/psb: Unpin framebuffer on crtc disable
drm/gma500: Add fb gtt offset to fb base
Jason Wang [Wed, 5 Jun 2013 08:44:57 +0000 (16:44 +0800)]
tuntap: fix a possible race between queue selection and changing queues
Complier may generate codes that re-read the tun->numqueues during
tun_select_queue(). This may be a race if vlan->numqueues were changed in the
same time and can lead unexpected result (e.g. very huge value).
We need prevent the compiler from generating such codes by adding an
ACCESS_ONCE() to make sure tun->numqueues were only read once.
Bug were introduced by commit
c8d68e6be1c3b242f1c598595830890b65cea64a
(tuntap: multiqueue support).
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Wed, 5 Jun 2013 07:40:46 +0000 (15:40 +0800)]
vhost_net: clear msg.control for non-zerocopy case during tx
When we decide not use zero-copy, msg.control should be set to NULL otherwise
macvtap/tap may set zerocopy callbacks which may decrease the kref of ubufs
wrongly.
Bug were introduced by commit
cedb9bdce099206290a2bdd02ce47a7b253b6a84
(vhost-net: skip head management if no outstanding).
This solves the following warnings:
WARNING: at include/linux/kref.h:47 handle_tx+0x477/0x4b0 [vhost_net]()
Modules linked in: vhost_net macvtap macvlan tun nfsd exportfs bridge stp llc openvswitch kvm_amd kvm bnx2 megaraid_sas [last unloaded: tun]
CPU: 5 PID: 8670 Comm: vhost-8668 Not tainted 3.10.0-rc2+ #1566
Hardware name: Dell Inc. PowerEdge R715/00XHKG, BIOS 1.5.2 04/19/2011
ffffffffa0198323 ffff88007c9ebd08 ffffffff81796b73 ffff88007c9ebd48
ffffffff8103d66b 000000007b773e20 ffff8800779f0000 ffff8800779f43f0
ffff8800779f8418 000000000000015c 0000000000000062 ffff88007c9ebd58
Call Trace:
[<
ffffffff81796b73>] dump_stack+0x19/0x1e
[<
ffffffff8103d66b>] warn_slowpath_common+0x6b/0xa0
[<
ffffffff8103d6b5>] warn_slowpath_null+0x15/0x20
[<
ffffffffa0197627>] handle_tx+0x477/0x4b0 [vhost_net]
[<
ffffffffa0197690>] handle_tx_kick+0x10/0x20 [vhost_net]
[<
ffffffffa019541e>] vhost_worker+0xfe/0x1a0 [vhost_net]
[<
ffffffffa0195320>] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net]
[<
ffffffffa0195320>] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net]
[<
ffffffff81061f46>] kthread+0xc6/0xd0
[<
ffffffff81061e80>] ? kthread_freezable_should_stop+0x70/0x70
[<
ffffffff817a1aec>] ret_from_fork+0x7c/0xb0
[<
ffffffff81061e80>] ? kthread_freezable_should_stop+0x70/0x70
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 10 Jun 2013 20:30:33 +0000 (13:30 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
The following patchset contains four fixes for Netfilter and one fix
for IPVS, they are:
* Fix data leak to user-space via getsockopt IP_VS_SO_GET_DESTS, from
Dan Carpenter.
* Fix xt_TCPMSS if no TCP MSS is specified in syn packets, to avoid the
violation of RFC879, from Phil Oester.
* Fix incomplete dump of objects via nfnetlink_acct and nfnetlink_cttimeout,
from myself.
* Fix missing HW protocol in packets passed to user-space via NFQUEUE,
from myself.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>