Pavel Emelyanov [Wed, 25 Apr 2012 23:43:04 +0000 (23:43 +0000)]
tcp repair: Fix unaligned access when repairing options (v2)
Don't pick __u8/__u16 values directly from raw pointers, but instead use
an array of structures of code:value pairs. This is OK, since the buffer
we take options from is not an skb memory, but a user-to-kernel one.
For those options which don't require any value now, require this to be
zero (for potential future extension of this API).
v2: Changed tcp_repair_opt to use two __u32-s as spotted by David Laight.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
alex.bluesman.smirnov@gmail.com [Wed, 25 Apr 2012 23:35:51 +0000 (23:35 +0000)]
6lowpan: duplicate definition of IEEE802154_ALEN
The same macros is defined in 'include/net/af_ieee802154.h' and is called
IEEE802154_ADDR_LEN. No need another one, so remove it.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
alex.bluesman.smirnov@gmail.com [Wed, 25 Apr 2012 23:35:50 +0000 (23:35 +0000)]
6lowpan: move frame allocation code to a separate function
Separate frame allocation routine from data processing function.
This makes code more human readable and easier for understanding.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Eversberg [Tue, 24 Apr 2012 20:52:14 +0000 (20:52 +0000)]
mISDN: Added support for fragmentation of E1 interfaces of hfcmulti driver.
Fragmentation is usefull if multiple devices are connected to an E1
interface. Each fragment will have a subset of the available timeslots.
These devices require a cascde connection or a multiplexer.
Signed-off-by: Andreas Eversberg <jolly@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Eversberg [Tue, 24 Apr 2012 20:52:13 +0000 (20:52 +0000)]
mISDN: Rework of LED status display for HFC-4S/8S/E1 cards.
LEDs will show RED if layer 1 is disabled or fails.
LEDs will show GREEN if layer 1 is active.
LEDs will blink if traffic on D-channel.
Signed-off-by: Andreas Eversberg <jolly@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Eversberg [Tue, 24 Apr 2012 20:52:12 +0000 (20:52 +0000)]
mISDN: Using FLG_ACTIVE flag to determine if layer 1 is active or not.
We already have the flag for L1 active, so we should use it.
L2 will be solved in a later patch.
Signed-off-by: Andreas Eversberg <andreas@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Eversberg [Tue, 24 Apr 2012 20:52:11 +0000 (20:52 +0000)]
mISDN: Fixed false interruption of audio during bridging change.
Transmitted audio data was interrupted if a bridge was enabled or disabled.
Now transmission seamlessly continues during that action.
Fix in hfcmulti.ko
Signed-off-by: Andreas Eversberg <jolly@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:41:00 +0000 (20:41 +0000)]
atl1c: refine start/enable code for MAC module
merge TXQ/RXQ/MAC start/enable code to one function as they
are started/enabled at the same time, just like stop/disable them
in the function of atl1c_stop_mac.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:40:59 +0000 (20:40 +0000)]
atl1c: add function atl1c_power_saving
This function is used for suspend of S1/S3/S4 and driver remove.
It sets MAC/PHY based on the WoL configuation to get lower power
consumption.
atl1c_phy_power_saving is renamed to atl1c_phy_to_ps_link, this
function is just make PHY enter a link/speed mode to eat less
power.
REG_MAC_CTRL register is refined as well.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:27:15 +0000 (20:27 +0000)]
atl1c: remove PHY reset/init for link down event
it's unnecessary to reset/init phy when link down.
Only L1/L2 chip (supported by atlx) need such action.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:27:14 +0000 (20:27 +0000)]
atl1c: update PHY reset related routine
Many magic data are re-configured for PHY during its reset operation
based on chip type to get better compability and stability.
REG_PHY_CTRL register may be configured by BIOS before enter OS.
so, the driver can't directly write to it without any Read-Op.
this change also affect suspend and phy_disable routines.
PHY debug ports and extension registers are refined as well.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:27:13 +0000 (20:27 +0000)]
atl1c: remove PHY polling from atl1c_open
PHY polling code for FPGA is considered in every MDIO R/W API.
no need to add additional code to atl1c_open.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:27:12 +0000 (20:27 +0000)]
atl1c: refine SERDES-clock related code
bit 17/18 of reg1424 must be clear for l2cb 1.x, or it will cause
the write-reg operation fail without cable connected.
so, please do connect the cable when apply this patch to the driver
to make sure these 2bits are cleared by new driver.
The revised code is move to al1c_reset_mac.
SERDES register definition is refined as well.
when do reset MAC, speed/duplex control right should be transferred
to software before do PHY auto-neg -- by bit MASTER_CTRL_SPEED_MODE_SW.
SERDES register definition is refined as well.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:27:11 +0000 (20:27 +0000)]
atl1c: remove PHY contrl in atl1c_reset_pcie
atl1c_reset_phy follows atl1c_reset_pcie in the whole driver,
so, it's unnecessary to add PHY control code in atl1c_reset_pcie.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:27:10 +0000 (20:27 +0000)]
atl1c: refine phy-register read/write function
phy register is read/write via MDIO control module ---
that module will be affected by the hibernate status,
to access phy regs in hib stutus, slow frequency clk must
be selected.
To access phy extension register, the MDIO related
registers are refined/updated, a _core function is
re-wroted for both regular PHY regs and extension regs.
existing PHY r/w function is revised based on the _core.
PHY extension registers will be used for the comming
patches.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:27:09 +0000 (20:27 +0000)]
atl1c: remove REG_PHY_STATUS
this register is used for l1e(dev=1026)
l1c/l1d/l2cb don't use it.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Wed, 25 Apr 2012 01:47:15 +0000 (01:47 +0000)]
be2net: Fix FW download for BE
Skip flashing a FW component if that component is not present in a
particular FW UFI image.
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Wed, 25 Apr 2012 01:47:03 +0000 (01:47 +0000)]
be2net: Fix wrong status getting returned for MCC commands
MCC Response CQEs are processed as part of NAPI poll routine and
also synchronously. If MCC completions are consumed by NAPI poll
routine, wrong status is returned to synchronously waiting routine.
Fix this by getting status of MCC command from command response
instead of response CQEs.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Wed, 25 Apr 2012 01:46:52 +0000 (01:46 +0000)]
be2net: Fix Lancer statistics
Fix port num sent in command to get stats. Also skip unnecessary
parsing of stats for Lancer.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Wed, 25 Apr 2012 01:46:39 +0000 (01:46 +0000)]
be2net: Fix traffic stall INTx mode
EQ is getting armed wrongly in INTx mode as INTx interrupt is taking
some time to deassert. This can cause another interrupt while NAPI is
scheduled and scheduling a NAPI in interrupt does not take effect.
This causes interrupt to be missed and traffic stalls. Fixing this by
preventing wrong arming of EQ.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Wed, 25 Apr 2012 01:46:28 +0000 (01:46 +0000)]
be2net: Fix ethtool self test for Lancer
Lancer does not support DDR self test. Fix ethtool self test by
skipping this test for Lancer.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Wed, 25 Apr 2012 01:46:18 +0000 (01:46 +0000)]
be2net: Fix FW download in Lancer
Increase time given by driver to adapter for completing FW download
to 30 seconds. Also return correct status when FW download times out.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Wed, 25 Apr 2012 01:46:06 +0000 (01:46 +0000)]
be2net: Fix VLAN/multicast packet reception
VLAN and multicast hardware filters are limited and can get
exhausted in adapters with many PCI functions. If setting
a VLAN or multicast filter fails due to lack of sufficient
hardware resources, these packets get dropped. Fix this by
switching to VLAN or multicast promiscous mode so that these
packets are not dropped.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Keil [Wed, 25 Apr 2012 20:54:48 +0000 (20:54 +0000)]
mISDN: DSP scheduling fix (version 2)
dsp_spl_jiffies need to be the same datatype as jiffies (which is ulong).
If not, on 64 bit systems it will fallback to schedule the DSP every jiffie
tic as soon jiffies become > 2^32.
Signed-off-by: Karsten Keil <kkeil@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Keil [Tue, 24 Apr 2012 02:51:51 +0000 (02:51 +0000)]
mISDN: Fix division by zero
If DTMF debug is set and tresh goes under 100, the printk will cause
a division by zero.
Signed-off-by: Karsten Keil <kkeil@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Eversberg [Tue, 24 Apr 2012 02:51:50 +0000 (02:51 +0000)]
mISDN: Fixed hardware bridging/conference check routine of mISDN_dsp.ko.
In some cases the hardware bridging/conference (2-n parties) was selected,
but still pure software bridging/conference was used.
Signed-off-by: Andreas Eversberg <jolly@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Eversberg [Tue, 24 Apr 2012 02:51:49 +0000 (02:51 +0000)]
mISDN: Fix NULL pointer bug in if-condition of mISDN_dsp
Fix a bug (was introduced by a cut & paste error)
in cases when dsp->conf was NULL.
Signed-off-by: Andreas Eversberg <jolly@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shan Wei [Tue, 24 Apr 2012 18:21:07 +0000 (18:21 +0000)]
net: sock_diag_handler structs can be const
read only, so change it to const.
Signed-off-by: Shan Wei <davidshan@tencent.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 24 Apr 2012 10:17:59 +0000 (10:17 +0000)]
ipv6: call consume_skb() in frag/reassembly
Some kfree_skb() calls should be replaced by consume_skb() to avoid
drop_monitor/dropwatch false positives.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Fri, 20 Apr 2012 09:49:23 +0000 (09:49 +0000)]
net: dcb: add CEE notify calls
This adds code to trigger CEE events when an APP change or setall
command is made from user space. This simplifies user space code
significantly by creating a single interface to listen on that
works with both firmware and userland agents.
And if we end up with multiple agents this keeps every thing in
sync userland agents, firmware agents, and kernel notifier consumers.
For an example agent that listens for these events see:
https://github.com/jrfastab/cgdcbxd
cgdcbxd is a daemon used to monitor DCB netlink events and manage
the net_prio control group sub-system.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Gortmaker [Mon, 23 Apr 2012 04:49:13 +0000 (04:49 +0000)]
tipc: remove inline instances from C source files.
Untie gcc's hands and let it do what it wants within the
individual source files. There are two files, node.c and
port.c -- only the latter effectively changes (gcc-4.5.2).
Objdump shows gcc deciding to not inline port_peernode().
Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 22 Apr 2012 21:30:29 +0000 (21:30 +0000)]
af_netlink: drop_monitor/dropwatch friendly
Need to consume_skb() instead of kfree_skb() in netlink_dump() and
netlink_unicast_kernel() to avoid false dropwatch positives.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 22 Apr 2012 21:30:21 +0000 (21:30 +0000)]
af_netlink: cleanups
netlink_destroy_callback() move to avoid forward reference
CodingStyle cleanups
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 23 Apr 2012 17:48:27 +0000 (17:48 +0000)]
net: skb_can_coalesce returns a boolean
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 23 Apr 2012 17:34:36 +0000 (17:34 +0000)]
tcp: tcp_try_coalesce returns a boolean
This clarifies code intention, as suggested by David.
Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 24 Apr 2012 03:35:04 +0000 (23:35 -0400)]
net: make spd_fill_page() linear argument a bool
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 24 Apr 2012 03:14:36 +0000 (23:14 -0400)]
Merge git://git./linux/kernel/git/davem/net
Fix merge between commit
3adadc08cc1e ("net ax25: Reorder ax25_exit to
remove races") and commit
0ca7a4c87d27 ("net ax25: Simplify and
cleanup the ax25 sysctl handling")
The former moved around the sysctl register/unregister calls, the
later simply removed them.
With help from Stephen Rothwell.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 24 Apr 2012 03:06:11 +0000 (23:06 -0400)]
net: Use bool and remove inline in skb_splice_bits() code.
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 22 Apr 2012 12:26:16 +0000 (12:26 +0000)]
net: speedup skb_splice_bits()
Commit
35f3d14db (pipe: add support for shrinking and growing pipes)
added a slowdown for splice(socket -> pipe), as we might grow the spd
used in skb_splice_bits() for each skb we process in splice() syscall.
Its not needed since skb lengths are capped. The default on-stack arrays
are more than enough.
Use MAX_SKB_FRAGS instead of PIPE_DEF_BUFFERS to describe the reasonable
limit per skb.
Add coalescing support to help splicing of GRO skbs built from linear
skbs (linked into frag_list)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 24 Apr 2012 02:45:19 +0000 (19:45 -0700)]
Merge branch 'rc-fixes' of git://git./linux/kernel/git/mmarek/kbuild
Pull build system failure fix from Michal Marek:
"This fixes build failure with newer gcc that adds some internal
symbols that end in "__mod_*_device_table", but are not actually the
tables themselves."
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
Fix modpost failures in fedora 17
Eric Dumazet [Mon, 23 Apr 2012 07:11:42 +0000 (07:11 +0000)]
tcp: introduce tcp_try_coalesce
commit
c8628155ece3 (tcp: reduce out_of_order memory use) took care of
coalescing tcp segments provided by legacy devices (linear skbs)
We extend this idea to fragged skbs, as their truesize can be heavy.
ixgbe for example uses 256+1024+PAGE_SIZE/2 = 3328 bytes per segment.
Use this coalescing strategy for receive queue too.
This contributes to reduce number of tcp collapses, at minimal cost, and
reduces memory overhead and packets drops.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Barak Witkowski [Mon, 23 Apr 2012 03:05:16 +0000 (03:05 +0000)]
bnx2x: Update driver version to 1.72.50-0
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Mon, 23 Apr 2012 03:05:11 +0000 (03:05 +0000)]
bnx2x: remove gro workaround
Removes GRO workaround, as issue is fixed in FW 7.2.51.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Barak Witkowski [Mon, 23 Apr 2012 03:04:46 +0000 (03:04 +0000)]
bnx2x: add afex support
Following patch adds afex multifunction support to the driver (afex
multifunction is based on vntag header) and updates FW version used to 7.2.51.
Support includes the following:
1. Configure vif parameters in firmware (default vlan, vif id, default
priority, allowed priorities) according to values received from NIC.
2. Configure FW to strip/add default vlan according to afex vlan mode.
3. Notify link up to OS only after vif is fully initialized.
4. Support vif list set/get requests and configure FW accordingly.
5. Supply afex statistics upon request from NIC.
6. Special handling to L2 interface in case of FCoE vif.
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yevgeny Petrilin [Mon, 23 Apr 2012 02:18:50 +0000 (02:18 +0000)]
mlx4_en: Byte Queue Limit support
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yevgeny Petrilin [Mon, 23 Apr 2012 02:18:39 +0000 (02:18 +0000)]
mlx4_en: Moving to Interrupts for TX completions
Moving to interrupts instead of polling fpr TX completions
Avoiding situations where skb can be held in by the driver for
a long time (till timer expires).
The change is also necessary for supporting BQL.
Removing comp_lock that was required because we could handle TX
completions from several contexts: Interrupts, timer, polling.
Now there is only interrupts
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yevgeny Petrilin [Mon, 23 Apr 2012 02:18:33 +0000 (02:18 +0000)]
mlx4_en: Added Ethtool support for TX Interrupt coalescing
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 22 Apr 2012 23:38:54 +0000 (23:38 +0000)]
tcp: sk_add_backlog() is too agressive for TCP
While investigating TCP performance problems on 10Gb+ links, we found a
tcp sender was dropping lot of incoming ACKS because of sk_rcvbuf limit
in sk_add_backlog(), especially if receiver doesnt use GRO/LRO and sends
one ACK every two MSS segments.
A sender usually tweaks sk_sndbuf, but sk_rcvbuf stays at its default
value (87380), allowing a too small backlog.
A TCP ACK, even being small, can consume nearly same truesize space than
outgoing packets. Using sk_rcvbuf + sk_sndbuf as a limit makes sense and
is fast to compute.
Performance results on netperf, single flow, receiver with disabled
GRO/LRO : 7500 Mbits instead of 6050 Mbits, no more TCPBacklogDrop
increments at sender.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 22 Apr 2012 23:34:26 +0000 (23:34 +0000)]
net: add a limit parameter to sk_add_backlog()
sk_add_backlog() & sk_rcvqueues_full() hard coded sk_rcvbuf as the
memory limit. We need to make this limit a parameter for TCP use.
No functional change expected in this patch, all callers still using the
old sk_rcvbuf limit.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Mon, 23 Apr 2012 14:25:49 +0000 (14:25 +0000)]
net ax25: Fix the build when sysctl support is disabled.
Randy Dunlap <rdunlap@xenotime.net> reported:
> On 04/23/2012 12:07 AM, Stephen Rothwell wrote:
>
>> Hi all,
>>
>> Changes since
20120420:
>
>
> include/net/ax25.h:447:75: error: expected ';' before '}' token
>
> static inline int ax25_register_dev_sysctl(ax25_dev *ax25_dev) { return 0 };
> static inline void ax25_unregister_dev_sysctl(ax25_dev *ax25_dev) {};
>
> First function: move ';' inside braces.
> Second function: drop the ';'.
Put the semicolons where it makes sense.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 24 Apr 2012 01:25:01 +0000 (18:25 -0700)]
Merge tag 'md-3.4-fixes' of git://neil.brown.name/md
Pull a few more md bug fixes from NeilBrown:
"2 are tagged for -stable, one being for a fairly serious bug that can
corrupt metadata and make it hard to recovery an array. The other is
for a more recent regression since 3.3"
* tag 'md-3.4-fixes' of git://neil.brown.name/md:
md: fix possible corruption of array metadata on shutdown.
md: don't call ->add_disk unless there is good reason.
DM RAID: Use safe version of rdev_for_each
Linus Torvalds [Tue, 24 Apr 2012 01:22:42 +0000 (18:22 -0700)]
Merge tag 'dlm-fixes-3.4' of git://git./linux/kernel/git/teigland/linux-dlm
Pull dlm fixes from David Teigland:
"This includes one short patch fixing the behavior of the QUECVT flag,
which the gfs2 folks are waiting on."
* tag 'dlm-fixes-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: fix QUECVT when convert queue is empty
Hugh Dickins [Mon, 23 Apr 2012 18:14:50 +0000 (11:14 -0700)]
mm: fix s390 BUG by __set_page_dirty_no_writeback on swap
Mel reports a BUG_ON(slot == NULL) in radix_tree_tag_set() on s390
3.0.13: called from __set_page_dirty_nobuffers() when page_remove_rmap()
tries to transfer dirty flag from s390 storage key to struct page and
radix_tree.
That would be because of reclaim's shrink_page_list() calling
add_to_swap() on this page at the same time: first PageSwapCache is set
(causing page_mapping(page) to appear as &swapper_space), then
page->private set, then tree_lock taken, then page inserted into
radix_tree - so there's an interval before taking the lock when the
radix_tree slot is empty.
We could fix this by moving __add_to_swap_cache()'s spin_lock_irq up
before the SetPageSwapCache. But a better fix is simply to do what's
five years overdue: Ken Chen introduced __set_page_dirty_no_writeback()
(if !PageDirty TestSetPageDirty) for tmpfs to skip all the radix_tree
overhead, and swap is just the same - it ignores the radix_tree tag, and
does not participate in dirty page accounting, so should be using
__set_page_dirty_no_writeback() too.
s390 testing now confirms that this does indeed fix the problem.
Reported-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Ken Chen <kenchen@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Tue, 24 Apr 2012 00:23:16 +0000 (10:23 +1000)]
md: fix possible corruption of array metadata on shutdown.
commit
c744a65c1e2d59acc54333ce8
md: don't set md arrays to readonly on shutdown.
removed the possibility of a 'BUG' when data is written to an array
that has just been switched to read-only, but also introduced the
possibility that the array metadata could be corrupted.
If, when md_notify_reboot gets the mddev lock, the array is
in a state where it is assembled but hasn't been started (as can
happen if the personality module is not available, or in other unusual
situations), then incorrect metadata will be written out making it
impossible to re-assemble the array.
So only call __md_stop_writes() if the array has actually been
activated.
This patch is needed for any stable kernel which has had the above
commit applied.
Cc: stable@vger.kernel.org
Reported-by: Christoph Nelles <evilazrael@evilazrael.de>
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 24 Apr 2012 00:23:14 +0000 (10:23 +1000)]
md: don't call ->add_disk unless there is good reason.
Commit
7bfec5f35c68121e7b18
md/raid5: If there is a spare and a want_replacement device, start replacement.
cause md_check_recovery to call ->add_disk much more often.
Instead of only when the array is degraded, it is now called whenever
md_check_recovery finds anything useful to do, which includes
updating the metadata for clean<->dirty transition.
This causes unnecessary work, and causes info messages from ->add_disk
to be reported much too often.
So refine md_check_recovery to only do any actual recovery checking
(including ->add_disk) if MD_RECOVERY_NEEDED is set.
This fix is suitable for 3.3.y:
Cc: stable@vger.kernel.org
Reported-by: Jan Ceuleers <jan.ceuleers@computer.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Jonathan Brassow [Tue, 24 Apr 2012 00:23:13 +0000 (10:23 +1000)]
DM RAID: Use safe version of rdev_for_each
Fix segfault caused by using rdev_for_each instead of rdev_for_each_safe
Commit
dafb20fa34320a472deb7442f25a0c086e0feb33 mistakenly replaced a safe
iterator with an unsafe one when making some macro changes.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Eric W. Biederman [Mon, 23 Apr 2012 12:13:02 +0000 (12:13 +0000)]
net sysctl: Add place holder functions for when sysctl support is compiled out of the kernel.
Randy Dunlap <rdunlap@xenotime.net> reported:
> On 04/23/2012 12:07 AM, Stephen Rothwell wrote:
>
>> Hi all,
>>
>> Changes since
20120420:
>
>
>
> ERROR: "unregister_net_sysctl_table" [net/phonet/phonet.ko] undefined!
> ERROR: "register_net_sysctl" [net/phonet/phonet.ko] undefined!
>
> when CONFIG_SYSCTL is not enabled.
Add static inline stub functions to gracefully handle the case when sysctl
support is not present.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ajit Khaparde [Sat, 21 Apr 2012 18:53:22 +0000 (18:53 +0000)]
be2net: fix ethtool get settings
ethtool get settings was not displaying all the settings correctly.
use the get_phy_info to get more information about the PHY to fix this.
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Teigland [Wed, 4 Apr 2012 14:49:15 +0000 (09:49 -0500)]
dlm: fix QUECVT when convert queue is empty
The QUECVT flag should not prevent conversions from
being granted immediately when the convert queue is
empty.
Signed-off-by: David Teigland <teigland@redhat.com>
David S. Miller [Mon, 23 Apr 2012 07:21:58 +0000 (03:21 -0400)]
tcp: Fix build warning after tcp_{v4,v6}_init_sock consolidation.
net/ipv4/tcp_ipv4.c: In function 'tcp_v4_init_sock':
net/ipv4/tcp_ipv4.c:1891:19: warning: unused variable 'tp' [-Wunused-variable]
net/ipv6/tcp_ipv6.c: In function 'tcp_v6_init_sock':
net/ipv6/tcp_ipv6.c:1836:19: warning: unused variable 'tp' [-Wunused-variable]
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 23 Apr 2012 04:19:15 +0000 (21:19 -0700)]
Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM fixes from Russell King:
"Here's my usual Sunday push, just for one revert which PeterZ hollered
about after last weeks push. Other than that, all seems strangely
quiet as far as fixes go in non-platform ARM land at the moment."
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
Revert "ARM: 7359/2: smp_twd: Only wait for reprogramming on active cpus"
Linus Torvalds [Mon, 23 Apr 2012 04:07:51 +0000 (21:07 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
Pull powerpc fixes from Benjamin Herrenschmidt:
"Here are a few fixes for powerpc. Note the addition to the generic
irq.h. This is part of a 3-patches regression fix for mpic due to
changes in how IRQ_TYPE_NONE is being handled. Thomas agreed to the
addition of the new IRQ_TYPE_DEFAULT contant, however he hasn't
replied with an Ack to the actual patch yet. I don't to wait much
longer with these patches tho."
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/mpic: Properly set default triggers
irq: Add IRQ_TYPE_DEFAULT for use by PIC drivers
powerpc/mpic: Fix confusion between hw_irq and virq
powerpc/pmac: Don't add_timer() twice
powerpc/eeh: Fix crash caused by null eeh_dev
powerpc/mpc85xx: add MPIC message dts node
powerpc/mpic_msgr: fix offset error when setting mer register
powerpc/mpic_msgr: add lock for MPIC message global variable
powerpc/mpic_msgr: fix compile error when SMP disabled
powerpc: fix build when CONFIG_BOOKE_WDT is enabled
powerpc/85xx: don't call of_platform_bus_probe() twice
Linus Torvalds [Mon, 23 Apr 2012 04:02:57 +0000 (21:02 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix namespace init and cleanup in phonet to fix some oopses, from
Eric W. Biederman.
2) Missing kfree_skb() in AF_KEY, from Julia Lawall.
3) Refcount leak and source address handling fix in l2tp from James
Chapman.
4) Memory leak fix in CAIF from Tomasz Gregorek.
5) When routes are cloned from ipv6 addrconf routes, we don't process
expirations properly. Fix from Gao Feng.
6) Fix panic on DMA errors in atl1 driver, from Tony Zelenoff.
7) Only enable interrupts in 8139cp driver after we've registered the
IRQ handler. From Jason Wang.
8) Fix too many reads of KS_CIDER register in ks8851 during probe,
fixing crashes on spurious interrupts. From Matt Renzelmann.
9) Missing include in ath5k driver and missing iounmap on probe
failure, from Jonathan Bither.
10) Fix RX packet handling in smsc911x driver, from Will Deacon.
11) Fix ixgbe WoL on fiber by leaving the laser on during shutdown.
12) ks8851 needs MAX_RECV_FRAMES increased otherwise the internal MAC
buffers are easily overflown. Fix from Davide Cimingahi.
13) Fix memory leaks in peak_usb CAN driver, from Jesper Juhl.
14) gred packet scheduler can dump in WRED more when doing a netlink
dump. Fix from David Ward.
15) Fix MTU in USB smsc75xx driver, from Stephane Fillod.
16) Dummy device needs ->ndo_uninit handler to properly handle
->ndo_init failures. From Hiroaki SHIMODA.
17) Fix TX fragmentation in ath9k driver, from Sujith Manoharan.
18) Missing RTNL lock in ixgbe PM resume, from Benjamin Poirier.
19) Missing iounmap in farsync WAN driver, from Julia Lawall.
20) With LRO/GRO, tcp_grow_window() is easily tricked into not growing
the receive window properly, and this hurts performance. Fix from
Eric Dumazet.
21) Network namespace init failure can leak net_generic data, fix from
Julian Anastasov.
22) Fix skb_over_panic due to mis-accounting in TCP for partially ACK'd
SKBs. From Eric Dumazet.
23) New IDs for qmi_wwan driver, from Bjørn Mork.
24) Fix races in ax25_exit(), from Eric W. Biederman.
25) IPV6 TCP doesn't handle TCP_MAXSEG socket option properly, copy over
logic from the IPV4 side. From Neal Cardwell.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (59 commits)
tcp: fix TCP_MAXSEG for established IPv6 passive sockets
drivers/net: Do not free an IRQ if its request failed
drop_monitor: allow more events per second
ks8851: Fix request_irq/free_irq mismatch
net/hyperv: Adding cancellation to ensure rndis filter is closed
ks8851: Fix mutex deadlock in ks8851_net_stop()
net ax25: Reorder ax25_exit to remove races.
icplus: fix interrupt for IC+ 101A/G and 1001LF
net: qmi_wwan: support Sierra Wireless MC77xx devices in QMI mode
bnx2x: off by one in bnx2x_ets_e3b0_sp_pri_to_cos_set()
ksz884x: don't copy too much in netdev_set_mac_address()
tcp: fix retransmit of partially acked frames
netns: do not leak net_generic data on failed init
net/sock.h: fix sk_peek_off kernel-doc warning
tcp: fix tcp_grow_window() for large incoming frames
drivers/net/wan/farsync.c: add missing iounmap
davinci_mdio: Fix MDIO timeout check
ipv6: clean up rt6_clean_expires
ipv6: fix rt6_update_expires
arcnet: rimi: Fix device name in debug output
...
Benjamin Herrenschmidt [Thu, 19 Apr 2012 17:30:57 +0000 (17:30 +0000)]
powerpc/mpic: Properly set default triggers
This gets rid of the unused default senses array, and replaces the
incorrect use of IRQ_TYPE_NONE with the new IRQ_TYPE_DEFAULT for
the initial set_trigger() call when mapping an interrupt.
This in turn makes us read the HW state and update the irq desc
accordingly.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Thu, 19 Apr 2012 17:29:42 +0000 (17:29 +0000)]
irq: Add IRQ_TYPE_DEFAULT for use by PIC drivers
This is meant typically to allow a PIC driver's irq domain map() callback
to establish sane defaults for the interrupt (and make sure that the HW
and the irq_desc are in sync as far as the trigger is concerned).
The irq core may not call the set_trigger callback if it thinks the
trigger is already set to the right setting, so we need to ensure new
descriptors are properly synchronized with the hardware.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Thu, 19 Apr 2012 17:29:34 +0000 (17:29 +0000)]
powerpc/mpic: Fix confusion between hw_irq and virq
mpic_is_ipi() takes a virq and immediately converts it to a hw_irq.
However, one of the two call sites calls it with a ... hw_irq. The
other call site also happens to have the hw_irq at hand, so let's
change it to just take that as an argument. Also change mpic_is_tm()
for consistency.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Wed, 18 Apr 2012 22:16:48 +0000 (22:16 +0000)]
powerpc/pmac: Don't add_timer() twice
If the interrupt and the timeout happen roughly at the same
time, we can get into a situation where the timer function
is run while the interrupt has already been processed. In
this case, the timer function might end up doing an add_timer
on an already pending timer, causing a BUG_ON() to trigger.
Instead, just skip the whole timeout operation if we see that
the timer is pending. The spinlock ensures that the only way
that happens is if we already started a new operation and thus
the timeout can be ignored.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Gavin Shan [Mon, 16 Apr 2012 19:55:39 +0000 (19:55 +0000)]
powerpc/eeh: Fix crash caused by null eeh_dev
The problem was reported by Anton Blanchard. While EEH error
happened to the PCI device without the corresponding device
driver, kernel crash was seen. Eventually, I successfully
reproduced the problem on Firebird-L machine with utility
"errinjct". Initially, the device driver for Emulex ethernet
MAC has been disabled from .config and force data parity on
the Emulex ethernet MAC with help of "errinjct". Eventually,
I saw the kernel crash after issueing couple of "lspci -v"
command.
The root cause behind is that the PCI device, including the
reference to the corresponding eeh device, will be removed
from the system while EEH does recovery. Afterwards, the
PCI device will be probed again and added into the system
accordingly. So it's not safe to retrieve the eeh device from
the corresponding PCI device after the PCI device has been removed
and not added again.
The patch fixes the issue and retrieve the eeh device from OF node
instead of PCI device after the PCI device has been removed.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Mon, 23 Apr 2012 00:55:20 +0000 (10:55 +1000)]
Merge remote-tracking branch 'kumar/merge' into merge
Neal Cardwell [Sun, 22 Apr 2012 09:45:47 +0000 (09:45 +0000)]
tcp: fix TCP_MAXSEG for established IPv6 passive sockets
Commit
f5fff5d forgot to fix TCP_MAXSEG behavior IPv6 sockets, so IPv6
TCP server sockets that used TCP_MAXSEG would find that the advmss of
child sockets would be incorrect. This commit mirrors the advmss logic
from tcp_v4_syn_recv_sock in tcp_v6_syn_recv_sock. Eventually this
logic should probably be shared between IPv4 and IPv6, but this at
least fixes this issue.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 21 Apr 2012 21:47:52 +0000 (14:47 -0700)]
Linux 3.4-rc4
Wu Jiajun-B06378 [Thu, 19 Apr 2012 22:54:35 +0000 (22:54 +0000)]
gianfar: add GRO support
Replace netif_receive_skb with napi_gro_receive.
Signed-off-by: Jiajun Wu <b06378@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lee Jones [Thu, 19 Apr 2012 10:36:33 +0000 (10:36 +0000)]
drivers/net: Do not free an IRQ if its request failed
Refrain from attempting to free an interrupt line if the request
fails and hence, there is no IRQ to free.
CC: netdev@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 19 Apr 2012 21:56:11 +0000 (21:56 +0000)]
af_packet: packet_getsockopt() cleanup
Factorize code, since most fetched values are int type.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neal Cardwell [Thu, 19 Apr 2012 09:55:21 +0000 (09:55 +0000)]
tcp: move duplicate code from tcp_v4_init_sock()/tcp_v6_init_sock()
This commit moves the (substantial) common code shared between
tcp_v4_init_sock() and tcp_v6_init_sock() to a new address-family
independent function, tcp_init_sock().
Centralizing this functionality should help avoid drift issues,
e.g. where the IPv4 side is updated without a corresponding update to
IPv6. There was already some drift: IPv4 initialized snd_cwnd to
TCP_INIT_CWND, while the IPv6 side was still initializing snd_cwnd to
2 (in this case it should not matter, since snd_cwnd is also
initialized in tcp_init_metrics(), but the general risks and
maintenance overhead remain).
When diffing the old and new code, note that new tcp_init_sock()
function uses the order of steps from the tcp_v4_init_sock()
implementation (the order is slightly different in
tcp_v6_init_sock()).
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 19 Apr 2012 09:38:17 +0000 (09:38 +0000)]
net: allow better page reuse in splice(sock -> pipe)
splice() from socket to pipe needs linear_to_page() helper to transfert
skb header to part of page.
We can reset the offset in the current sk->sk_sndmsg_page if we are the
last user of the page.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yong Zhang [Thu, 19 Apr 2012 20:28:32 +0000 (20:28 +0000)]
sparc32,leon: add notify_cpu_starting()
Otherwise cpu_active_mask will not set, which lead to other issue.
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Signed-off-by: Konrad Eisele <konrad@gaisler.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 19 Apr 2012 07:16:21 +0000 (07:16 +0000)]
drop_monitor: allow more events per second
It seems there is a logic error in trace_drop_common(), since we store
only 64 drops, even if they are from same location.
This fix is a one liner, but we probably need more work to avoid useless
atomic dec/inc
Now I can watch 1 Mpps drops through dropwatch...
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 20 Apr 2012 04:42:06 +0000 (04:42 +0000)]
team: add per-port option for enabling/disabling ports
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 20 Apr 2012 04:42:05 +0000 (04:42 +0000)]
team: allow to enable/disable ports
This patch changes content of hashlist (used to get port struct by
computed index (0...en_port_count-1)). Now the hash list contains only
enabled ports so userspace will be able to say what ports can be used
for tx/rx. This becomes handy when userspace will need to disable ports
which does not belong to active aggregator. By default, newly added port
is enabled.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 20 Apr 2012 04:42:04 +0000 (04:42 +0000)]
team: lb: let userspace care about port macs
Better to leave this for userspace
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 20 Apr 2012 18:04:01 +0000 (20:04 +0200)]
net: change big iov allocations
iov of more than 8 entries are allocated in sendmsg()/recvmsg() through
sock_kmalloc()
As these allocations are temporary only and small enough, it makes sense
to use plain kmalloc() and avoid sk_omem_alloc atomic overhead.
Slightly changed fast path to be even faster.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mike Waychison <mikew@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Renzelmann [Thu, 19 Apr 2012 07:17:17 +0000 (07:17 +0000)]
ks8851: Fix request_irq/free_irq mismatch
The dev_id parameter passed to free_irq needs to match the one passed
to the corresponding request_irq.
Signed-off-by: Matt Renzelmann <mjr@cs.wisc.edu>
Acked-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Thu, 19 Apr 2012 03:41:57 +0000 (03:41 +0000)]
tcp: Repair connection-time negotiated parameters
There are options, which are set up on a socket while performing
TCP handshake. Need to resurrect them on a socket while repairing.
A new sockoption accepts a buffer and parses it. The buffer should
be CODE:VALUE sequence of bytes, where CODE is standard option
code and VALUE is the respective value.
Only 4 options should be handled on repaired socket.
To read 3 out of 4 of these options the TCP_INFO sockoption can be
used. An ability to get the last one (the mss_clamp) was added by
the previous patch.
Now the restore. Three of these options -- timestamp_ok, mss_clamp
and snd_wscale -- are just restored on a coket.
The sack_ok flags has 2 issues. First, whether or not to do sacks
at all. This flag is just read and set back. No other sack info is
saved or restored, since according to the standart and the code
dropping all sack-ed segments is OK, the sender will resubmit them
again, so after the repair we will probably experience a pause in
connection. Next, the fack bit. It's just set back on a socket if
the respective sysctl is set. No collected stats about packets flow
is preserved. As far as I see (plz, correct me if I'm wrong) the
fack-based congestion algorithm survives dropping all of the stats
and repairs itself eventually, probably losing the performance for
that period.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Thu, 19 Apr 2012 03:41:32 +0000 (03:41 +0000)]
tcp: Report mss_clamp with TCP_MAXSEG option in repair mode
The mss_clamp is the only connection-time negotiated option which
cannot be obtained from the user space. Make the TCP_MAXSEG sockopt
report one in the repair mode.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Thu, 19 Apr 2012 03:41:01 +0000 (03:41 +0000)]
tcp: Repair socket queues
Reading queues under repair mode is done with recvmsg call.
The queue-under-repair set by TCP_REPAIR_QUEUE option is used
to determine which queue should be read. Thus both send and
receive queue can be read with this.
Caller must pass the MSG_PEEK flag.
Writing to queues is done with sendmsg call and yet again --
the repair-queue option can be used to push data into the
receive queue.
When putting an skb into receive queue a zero tcp header is
appented to its head to address the tcp_hdr(skb)->syn and
the ->fin checks by the (after repair) tcp_recvmsg. These
flags flags are both set to zero and that's why.
The fin cannot be met in the queue while reading the source
socket, since the repair only works for closed/established
sockets and queueing fin packet always changes its state.
The syn in the queue denotes that the respective skb's seq
is "off-by-one" as compared to the actual payload lenght. Thus,
at the rcv queue refill we can just drop this flag and set the
skb's sequences to precice values.
When the repair mode is turned off, the write queue seqs are
updated so that the whole queue is considered to be 'already sent,
waiting for ACKs' (write_seq = snd_nxt <= snd_una). From the
protocol POV the send queue looks like it was sent, but the data
between the write_seq and snd_nxt is lost in the network.
This helps to avoid another sockoption for setting the snd_nxt
sequence. Leaving the whole queue in a 'not yet sent' state (as
it will be after sendmsg-s) will not allow to receive any acks
from the peer since the ack_seq will be after the snd_nxt. Thus
even the ack for the window probe will be dropped and the
connection will be 'locked' with the zero peer window.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Thu, 19 Apr 2012 03:40:39 +0000 (03:40 +0000)]
tcp: Initial repair mode
This includes (according the the previous description):
* TCP_REPAIR sockoption
This one just puts the socket in/out of the repair mode.
Allowed for CAP_NET_ADMIN and for closed/establised sockets only.
When repair mode is turned off and the socket happens to be in
the established state the window probe is sent to the peer to
'unlock' the connection.
* TCP_REPAIR_QUEUE sockoption
This one sets the queue which we're about to repair. The
'no-queue' is set by default.
* TCP_QUEUE_SEQ socoption
Sets the write_seq/rcv_nxt of a selected repaired queue.
Allowed for TCP_CLOSE-d sockets only. When the socket changes
its state the other seq-s are changed by the kernel according
to the protocol rules (most of the existing code is actually
reused).
* Ability to forcibly bind a socket to a port
The sk->sk_reuse is set to SK_FORCE_REUSE.
* Immediate connect modification
The connect syscall initializes the connection, then directly jumps
to the code which finalizes it.
* Silent close modification
The close just aborts the connection (similar to SO_LINGER with 0
time) but without sending any FIN/RST-s to peer.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Thu, 19 Apr 2012 03:40:01 +0000 (03:40 +0000)]
tcp: Move code around
This is just the preparation patch, which makes the needed for
TCP repair code ready for use.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Thu, 19 Apr 2012 03:39:36 +0000 (03:39 +0000)]
sock: Introduce named constants for sk_reuse
Name them in a "backward compatible" manner, i.e. reuse or not
are still 1 and 0 respectively. The reuse value of 2 means that
the socket with it will forcibly reuse everyone else's port.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 21 Apr 2012 19:45:52 +0000 (12:45 -0700)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc
Pull "ARM: SoC fixes" from Olof Johansson:
* at91, ux500, imx, omap and bcmring:
- at91 fixes for =m driver build issues, irqdomain fixes and config
dependency fixes
- ux500 kconfig dependency fixes and a smp wakeup bugfix
- imx idle bugfix and build fix due to irq domain changes
- omap uart pinmux fixes, softreset regression revert and misc fixes
- bcmring build error regression fix
* ux500 and imx had some small defconfig updates in this branch
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (27 commits)
ARM: bcmring: fix UART declarations
ARM: imx: Fix imx5 idle logic bug
ARM: imx27-dt: Fix build due to removal of irq_domain_add_simple()
ARM: imx_v4_v5_defconfig: Add support for CONFIG_REGULATOR_FIXED_VOLTAGE
ARM: OMAP1: DMTIMER: fix broken timer clock source selection
ARM: OMAP: serial: Fix the ocp smart idlemode handling bug
ARM: OMAP2+: UART: Fix incorrect population of default uart pads
ARM: OMAP: sram: fix BUG in dpll code for !PM case
dmaengine: Kconfig: fix Atmel at_hdmac entry
USB: gadget/at91_udc: add gpio_to_irq() function to vbus interrupt
USB: ohci-at91: change annotations for probe/remove functions
leds-atmel-pwm.c: Make pwmled_probe() __devinit
ARM: at91: fix at91sam9261ek Ethernet dm9000 irq
ARM: at91: fix rm9200ek flash size
ARM: at91: remove empty at91_init_serial function
ARM: at91: fix typo in at91_pmc_base assembly declaration
ARM: at91: Export at91_matrix_base
ARM: at91: Export at91_pmc_base
ARM: at91: Export at91_ramc_base
ARM: at91: Export at91_st_base
...
Linus Torvalds [Sat, 21 Apr 2012 19:44:37 +0000 (12:44 -0700)]
Merge tag 'mmc-fixes-for-3.4-rc4' of git://git./linux/kernel/git/cjb/mmc
Pull MMC fixes from Chris Ball:
- Build fix for omap_hsmmc with OF against 3.4-rc1.
- Fix CONFIG_MMC_UNSAFE_RESUME semantics regression against 3.3, which
broke hotplug card detection when UNSAFE_RESUME is set.
- Fix a race condition in omap_hsmmc with runtime PM.
- Fix two libertas SDIO-powered-resume regressions.
- Small fixes for discard/sanitize, dw_mmc, cd-gpio and esdhc-imx.
* tag 'mmc-fixes-for-3.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
mmc: core: Do not pre-claim host in suspend
mmc: dw_mmc: prevent NULL dereference for dma_ops
mmc: unbreak sdhci-esdhc-imx on i.MX25
mmc: cd-gpio: Include header to pickup exported symbol prototypes
mmc: sdhci: refine non-removable card checking for card detection
mmc: dw_mmc: Fix switch from DMA to PIO
mmc: remove MMC bus legacy suspend/resume method
mmc: omap_hsmmc: Get rid of of_have_populated_dt() usage
mmc: omap_hsmmc: build fix for CONFIG_OF=y and CONFIG_MMC_OMAP_HS=m
mmc: fixes for eMMC v4.5 sanitize operation
mmc: fixes for eMMC v4.5 discard operation
Linus Torvalds [Sat, 21 Apr 2012 19:43:23 +0000 (12:43 -0700)]
Merge branch 'v4l_for_linus' of git://git./linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- Fixes a regression at DVB core when switching from DVB-S2 to DVB-S on
Kaffeine (Fedora 16 Bugzilla #812895);
- Fixes a mutex unlock at an error condition at drx-k;
- Fix winbond-cir set mode;
- mt9m032: Fix a compilation breakage with some random Kconfig;
- mt9m032: fix two dead locks;
- xc5000: don't require an special firmware (that won't be provided by
the vendor) just because the xtal frequency is different;
- V4L DocBook: fix some typos at multi-plane formats description.
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] xc5000: support 32MHz & 31.875MHz xtal using the 41.024.5 firmware
[media] V4L: mt9m032: fix compilation breakage
[media] V4L: DocBook: Fix typos in the multi-plane formats description
[media] V4L: mt9m032: fix two dead-locks
[media] rc-core: set mode for winbond-cir
[media] drxk: Does not unlock mutex if sanity check failed in scu_command()
[media] dvb_frontend: Fix a regression when switching back to DVB-S
Linus Torvalds [Sat, 21 Apr 2012 19:42:12 +0000 (12:42 -0700)]
Merge tag 'mfd-for-linus-3.4' of git://git./linux/kernel/git/sameo/mfd-2.6
Pull MFD fixes from Samuel Ortiz:
"We have 3 build fixes, a OMAP USB host PHY reset fix and the twl6040
conversion to an i2c driver. The latter may not sound like a fix but
the twl6040 MFD driver won't probe without it, triggering an OMAP4
audio regression."
* tag 'mfd-for-linus-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
mfd: Fix modular builds of rc5t583 regulator support
mfd: Fix asic3_gpio_to_irq
ARM: OMAP3: USB: Fix the EHCI ULPI PHY reset issue
mfd: Convert twl6040 to i2c driver, and separate it from twl core
mfd : Fix dbx500 compilation error
Wenqi Ma [Thu, 19 Apr 2012 00:39:37 +0000 (00:39 +0000)]
net/hyperv: Adding cancellation to ensure rndis filter is closed
Although the network interface is down, the RX packets number which
could be observed by ifconfig may keep on increasing.
This is because the WORK scheduled in netvsc_set_multicast_list()
may be executed after netvsc_close(). That means the rndis filter
may be re-enabled by do_set_multicast() even if it was closed by
netvsc_close().
By canceling possible WORK before close the rndis filter, the issue
could be never happened.
Signed-off-by: Wenqi Ma <wenqi_ma@trendmicro.com.cn>
Reviewed-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Boyd [Wed, 18 Apr 2012 17:25:58 +0000 (17:25 +0000)]
ks8851: Fix mutex deadlock in ks8851_net_stop()
There is a potential deadlock scenario when the ks8851 driver
is removed. The interrupt handler schedules a workqueue which
acquires a mutex that ks8851_net_stop() also acquires before
flushing the workqueue. Previously lockdep wouldn't be able
to find this problem but now that it has the support we can
trigger this lockdep warning by rmmoding the driver after
an ifconfig up.
Fix the possible deadlock by disabling the interrupts in
the chip and then release the lock across the workqueue
flushing. The mutex is only there to proect the registers
anyway so this should be ok.
=======================================================
[ INFO: possible circular locking dependency detected ]
3.0.21-00021-g8b33780-dirty #2911
-------------------------------------------------------
rmmod/125 is trying to acquire lock:
((&ks->irq_work)){+.+...}, at: [<
c019e0b8>] flush_work+0x0/0xac
but task is already holding lock:
(&ks->lock){+.+...}, at: [<
bf00b850>] ks8851_net_stop+0x64/0x138 [ks8851]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&ks->lock){+.+...}:
[<
c01b89c8>] __lock_acquire+0x940/0x9f8
[<
c01b9058>] lock_acquire+0x10c/0x130
[<
c083dbec>] mutex_lock_nested+0x68/0x3dc
[<
bf00bd48>] ks8851_irq_work+0x24/0x46c [ks8851]
[<
c019c580>] process_one_work+0x2d8/0x518
[<
c019cb98>] worker_thread+0x220/0x3a0
[<
c01a2ad4>] kthread+0x88/0x94
[<
c0107008>] kernel_thread_exit+0x0/0x8
-> #0 ((&ks->irq_work)){+.+...}:
[<
c01b7984>] validate_chain+0x914/0x1018
[<
c01b89c8>] __lock_acquire+0x940/0x9f8
[<
c01b9058>] lock_acquire+0x10c/0x130
[<
c019e104>] flush_work+0x4c/0xac
[<
bf00b858>] ks8851_net_stop+0x6c/0x138 [ks8851]
[<
c06b209c>] __dev_close_many+0x98/0xcc
[<
c06b2174>] dev_close_many+0x68/0xd0
[<
c06b22ec>] rollback_registered_many+0xcc/0x2b8
[<
c06b2554>] rollback_registered+0x28/0x34
[<
c06b25b8>] unregister_netdevice_queue+0x58/0x7c
[<
c06b25f4>] unregister_netdev+0x18/0x20
[<
bf00c1f4>] ks8851_remove+0x64/0xb4 [ks8851]
[<
c049ddf0>] spi_drv_remove+0x18/0x1c
[<
c0468e98>] __device_release_driver+0x7c/0xbc
[<
c0468f64>] driver_detach+0x8c/0xb4
[<
c0467f00>] bus_remove_driver+0xb8/0xe8
[<
c01c1d20>] sys_delete_module+0x1e8/0x27c
[<
c0105ec0>] ret_fast_syscall+0x0/0x3c
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&ks->lock);
lock((&ks->irq_work));
lock(&ks->lock);
lock((&ks->irq_work));
*** DEADLOCK ***
4 locks held by rmmod/125:
#0: (&__lockdep_no_validate__){+.+.+.}, at: [<
c0468f44>] driver_detach+0x6c/0xb4
#1: (&__lockdep_no_validate__){+.+.+.}, at: [<
c0468f50>] driver_detach+0x78/0xb4
#2: (rtnl_mutex){+.+.+.}, at: [<
c06b25e8>] unregister_netdev+0xc/0x20
#3: (&ks->lock){+.+...}, at: [<
bf00b850>] ks8851_net_stop+0x64/0x138 [ks8851]
Cc: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 20 Apr 2012 10:56:16 +0000 (10:56 +0000)]
drivers/net: decouple ISA and ISA_DMA_API
The two options are separate, and some platforms (e.g. arm pxa)
have ISA slots but no ISA dma controller, so they cannot build
drivers using the DMA API functions.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 20 Apr 2012 10:56:15 +0000 (10:56 +0000)]
sungem: use mdelay instead of udelay where necessary
Some architectures like ARM cannot handle large numbers as
arguments to udelay, so the drivers should use mdelay when
delaying for multiple miliseconds.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 20 Apr 2012 10:56:14 +0000 (10:56 +0000)]
donauboe: replace excessive udelay with msleep
No driver should spin the CPU for 10ms, so better use
an msleep, which is allowed in the ->suspend function.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 20 Apr 2012 10:56:13 +0000 (10:56 +0000)]
8390: select CRC32 support
The ax88796 driver uses the CRC32 functions, so make sure that
they are actually enabled.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 20 Apr 2012 10:56:12 +0000 (10:56 +0000)]
drivers/net: iwmc3200 depends on EXPERIMENTAL
The iwmc3200 driver selects other code in Kconfig that depends on
EXPERIMENTAL. Kconfig warns about this when CONFIG_EXPERIMENTAL
is not already set, so logically, these options should also
be marked experimental or promoted to stable.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>