Eric W. Biederman [Tue, 11 Mar 2014 21:31:09 +0000 (14:31 -0700)]
bnx2: Don't receive packets when the napi budget == 0
Processing any incoming packets with a with a napi budget of 0
is incorrect driver behavior.
This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Beulich [Tue, 11 Mar 2014 13:56:05 +0000 (13:56 +0000)]
consolidate duplicate code is skb_checksum_setup() helpers
consolidate duplicate code is skb_checksum_setup() helpers
Realizing that the skb_maybe_pull_tail() calls in the IP-protocol
specific portions of both helpers are terminal ones (i.e. no further
pulls are expected), their maximum size to be pulled can be made match
their minimal size needed, thus making the code identical and hence
possible to be moved into another helper.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Paul Durrant <paul.durrant@citrix.com>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 13 Mar 2014 19:02:16 +0000 (15:02 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates
This series contains updates to igb, e1000e, ixgbe and ixgbevf.
Tom Herbert provides changes to e1000e, igb and ixgbe to call skb_set_hash()
to set the hash and its type in an skbuff.
Carolyn provides a fix for igb where using ethtool for EEE settings, which
was not working correctly.
Jacob provides some trivial cleanups and fixes for ixgbe which mainly
dealt with the file headers.
Julia Lawall provides a one fix for ixgbevf where the driver did not need
to adjust the power state on suspend, so the call to pci_set_power_state()
in the resume function was a no-op.
v2:
- dropped patches 4-6 from original series which implemented debugfs for
igb from Carolyn based on feed back from David Miller and Or Gerlitz
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 13 Mar 2014 18:58:47 +0000 (14:58 -0400)]
Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless-next
John W. Linville says:
====================
For the mac80211 bits, Johannes says:
"This time, I have a number of small fixes and improvements, and those
are fairly straight-forward. More interesting changes come from Luca
with some preparations for the CSA work, mostly around interface/channel
combinations checking. One other possibly interesting change is a small
one by myself to add NAPI support back to mac80211, which can help
improve TCP behaviour through GRO."
For the Bluetooth bits, Gustavo says:
"This is our first pull request for 3.15, the main feature here is the addition of
the privacy feature for low energy devices. Other than that we have a bunch of small
improvements, fixes, and clean ups all over the tree."
And...
"Another pull request to 3.15. Here we have the second part of the LE private
feature, the LE auto-connect feature and improvements to the power off
procedures. The rest are small improvements, clean up, and fixes."
For the iwlwifi bits, Emmanuel says:
"I have here a whole bunch of various things. Trivial cleanups,
debugfs handlers and new stuff for the new generation of devices
along with new capabilities for monitor mode. We also have support
for power save for dual interface mode, but that is not supported by
the firmware currently available."
And for the Atheros bits, Kalle says:
"For ath10k Alexander did some cleanup to PCI error cases and switched
ath10k to use pci_enable_msi_range(). Michal implemented AP CSA support
and sta_rc_update() operation. I enabled firmware "STA quick kickout"
functionality for faster detection of disappeared clients.
Also there are lots of small fixes to everywhere from various people."
I pulled the wireless tree to avoid some merge conflicts, and I
reverted the staging patch that I had mistakenly merged previously.
Along with that, mwifiex, brcmfmac, wil6210, ath9k, and a few other
drivers get their usual round of updates. Also notable is the addition
of yet another driver in the rtlwifi family.
...
I have amended this commit request to correct the build problems in
staging, including a warning added to one of the staging drivers by
the wireless-next tree. I also included a fix from Larry Finger to
address an issue found in rtl8723be by Dan Carpenter and smatch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 13 Mar 2014 18:36:18 +0000 (14:36 -0400)]
Merge branch 'cxgb4-next'
Hariprasad Shenai says:
====================
Misc. fixes for cxgb4
This patch series provides miscelleneous fixes for Chelsio T4/T5 adapters
cxgb4 driver related to SGE and MTU.
Also fixes regression in LSO calcuation path.
("cxgb4: Calculate len properly for LSO path")
The patches series is created against David Miller's 'net-next' tree.
And includes patches on cxgb4 driver.
We would like to request this patch series to get merged via David Miller's
'net-next' tree.
We have included all the maintainers of respective drivers. Kindly review the
change and let us know in case of any review comments.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Thu, 13 Mar 2014 15:20:50 +0000 (20:50 +0530)]
cxgb4: Calculate len properly for LSO path
Commit
0034b29 ("cxgb4: Don't assume LSO only uses SGL path in t4_eth_xmit()")
introduced a regression where-in length was calculated wrongly for LSO path,
causing chip hangs.
So, correct the calculation of len.
Fixes: 0034b29 ("cxgb4: Don't assume LSO only uses SGL path in t4_eth_xmit()")
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Thu, 13 Mar 2014 15:20:49 +0000 (20:50 +0530)]
cxgb4: Updates for T5 SGE's Egress Congestion Threshold
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Thu, 13 Mar 2014 15:20:48 +0000 (20:50 +0530)]
cxgb4: Rectify emitting messages about SGE Ingress DMA channels being potentially stuck
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Thu, 13 Mar 2014 15:20:47 +0000 (20:50 +0530)]
cxgb4: Add code to dump SGE registers when hitting idma hangs
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Thu, 13 Mar 2014 15:20:46 +0000 (20:50 +0530)]
cxgb4: Fix some small bugs in t4_sge_init_soft() when our Page Size is 64KB
We'd come in with SGE_FL_BUFFER_SIZE[0] and [1] both equal to 64KB and the
extant logic would flag that as an error.
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville [Thu, 13 Mar 2014 18:21:43 +0000 (14:21 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-next into for-davem
Conflicts:
drivers/net/wireless/ath/ath9k/recv.c
Larry Finger [Mon, 10 Mar 2014 23:53:10 +0000 (18:53 -0500)]
rtlwifi: rtl8723be: Fix array dimension problems
Commit
a619d1abe20c leads to the following static checker warning:
drivers/net/wireless/rtlwifi/rtl8723be/phy.c:667 _rtl8723be_store_tx_power_by_rate()
error: buffer overflow 'rtlphy->tx_power_by_rate_offset[band]' 4 <= 5
This warning arises because the code is testing the indices for the wrong maximum
values. In addition, the tests merely putput a warning, and then procedes to
corrupt memory. With this change, any such invalid memory access is avoided.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
John W. Linville [Thu, 13 Mar 2014 17:08:27 +0000 (13:08 -0400)]
wlan-ng: fixup staging driver for removal of ieee80211_dsss_chan_to_freq
Commit
3ebe8e257307 ("ieee80211: remove function
ieee80211_{dsss_chan_to_freq, freq_to_dsss_chan}") removed
ieee80211_dsss_chan_to_freq, but it neglected to account for this
staging driver...
Cc: Zhao, Gang <gamerh2o@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
John W. Linville [Thu, 13 Mar 2014 16:53:52 +0000 (12:53 -0400)]
rtl8821ae: fixup staging driver for revised ieee80211_is_robust_mgmt_frame
Commit
d8ca16db6bb2 ("mac80211: add length check in
ieee80211_is_robust_mgmt_frame()") changed that API to take an skb,
and added "_ieee80211_is_robust_mgmt_frame" as a direct replacement
for the older API. This is the same fix that was applied to the other
rtlwifi drivers in that commit.
Cc: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Daniel Borkmann [Thu, 13 Mar 2014 13:45:26 +0000 (14:45 +0100)]
net: sctp: remove NULL check in sctp_assoc_update_retran_path
This is basically just to let Coverity et al shut up. Remove an
unneeded NULL check in sctp_assoc_update_retran_path().
It is safe to remove it, because in sctp_assoc_update_retran_path()
we iterate over the list of transports, our own transport which is
asoc->peer.retran_path included. In the iteration, we skip the
list head element and transports in state SCTP_UNCONFIRMED.
Such transports came from peer addresses received in INIT/INIT-ACK
address parameters. They are not yet confirmed by a heartbeat and
not available for data transfers.
We know however that in the list of transports, even if it contains
such elements, it at least contains our asoc->peer.retran_path as
well, so even if next to that element, we only encounter
SCTP_UNCONFIRMED transports, we are always going to fall back to
asoc->peer.retran_path through sctp_trans_elect_best(), as that is
for sure not SCTP_UNCONFIRMED as per
fbdf501c9374 ("sctp: Do no
select unconfirmed transports for retransmissions").
Whenever we call sctp_trans_elect_best() it will give us a non-NULL
element back, and therefore when we break out of the loop, we are
guaranteed to have a non-NULL transport pointer, and can remove
the NULL check.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville [Thu, 13 Mar 2014 15:11:49 +0000 (11:11 -0400)]
Revert "Revert "Staging: rtl8812ae: remove modules field of rate_control_ops""
This reverts commit
161d7855543520cde5f49df788b0ea0553a9f83a.
Reversal of fortune -- I thought this was going to be resolved by other
means, but that hasn't materialized. Plus, apparently we now care more
than I realized about not breaking staging drivers...
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Julia Lawall [Wed, 8 Jan 2014 08:04:32 +0000 (08:04 +0000)]
ixgbevf: delete unneeded call to pci_set_power_state
This driver does not need to adjust the power state on suspend, so the
call to pci_set_power_state in the resume function is a no-op. Drop it,
to make the code more understandable.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Sat, 22 Feb 2014 01:23:52 +0000 (01:23 +0000)]
ixgbe: fix some multiline hw_dbg prints
This patch fixes some formatting on multilined print messages, so that
the text of the print appears on a single line, which aids in grepping
the sourcecode for where the error came from.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Sat, 22 Feb 2014 01:23:51 +0000 (01:23 +0000)]
ixgbe: fixup header for ixgbe_set_rxpba_82598
The header above this function did not match the function prototype.
This patch rewords the comment to specify the correct parameters.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Sat, 22 Feb 2014 01:23:50 +0000 (01:23 +0000)]
ixgbe: add Linux NICS mailing list to contact info
This patch updates the contact information on the ixgbe driver files so
that every file includes the Linux NICS address, as it is still used,
but only a few of the files mentioned it.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Sat, 22 Feb 2014 01:23:49 +0000 (01:23 +0000)]
ixgbe: move setting rx_pb_size into get_invariants
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Masanari Iida [Tue, 14 Jan 2014 16:14:42 +0000 (01:14 +0900)]
ixgbe: Fix format string in ixgbe_fcoe.c
cppcheck detected following warning in ixgbe_fcoe.c
(warning) %d in format string (no. 1) requires 'int' but the
argument type is 'unsigned int'.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Tested-By: Jack Morgan<jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tom Herbert [Wed, 18 Dec 2013 16:47:04 +0000 (16:47 +0000)]
net: ixgbe calls skb_set_hash
Drivers should call skb_set_hash to set the hash and its type
in an skbuff.
Signed-off-by: Tom Herbert <therbert@google.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Carolyn Wyborny [Wed, 12 Mar 2014 03:58:22 +0000 (03:58 +0000)]
igb: Fix for devices using ethtool for EEE settings
This patch fixes a problem where using ethtool for EEE setting was not
working correctly. This patch also fixes a problem where
the function that checks for EEE status on i354 devices was not being
called and was causing warnings with static analysis tools.
Reported-by: Rashika Kheria <rashika.kheria@gmail.com>
Reported-by: Josh Triplett <josh@joshtriplett.org>
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tom Herbert [Wed, 18 Dec 2013 16:46:58 +0000 (16:46 +0000)]
net: igb calls skb_set_hash
Drivers should call skb_set_hash to set the hash and its type
in an skbuff.
Signed-off-by: Tom Herbert <therbert@google.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tom Herbert [Wed, 18 Dec 2013 16:46:48 +0000 (16:46 +0000)]
net: e1000e calls skb_set_hash
Drivers should call skb_set_hash to set the hash and its type
in an skbuff.
Signed-off-by: Tom Herbert <therbert@google.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Wed, 12 Mar 2014 20:22:20 +0000 (16:22 -0400)]
Merge branch 'dev_kfree_skb_any'
Eric W. Biederman says:
====================
Using dev_kfree_skb_any for functions called in multiple contexts
This patchset should be an uncontroversial set of changes to change
dev_kfree_skb to dev_kfree_skb_any for code paths that are called in
hard irq contexts in addition to other contexts. netpoll is the reason
this code gets called in multiple contexts.
There is more coming but these changes are a good starting place, and
stand on their own.
Since the last round changes to the rx path have been removed netpoll
will changed to avoid that.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <edumazet@google.com>
Eric W. Biederman [Tue, 11 Mar 2014 21:20:26 +0000 (14:20 -0700)]
gianfar: Carefully free skbs in functions called by netpoll.
netpoll can call functions in hard irq context that are ordinarily
called in lesser contexts. For those functions use dev_kfree_skb_any
and dev_consume_skb_any so skbs are freed safely from hard irq
context.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Tue, 11 Mar 2014 21:19:50 +0000 (14:19 -0700)]
benet: Call dev_kfree_skby_any instead of kfree_skb.
Replace free_skb with dev_kfree_skb_any in be_tx_compl_process as
which can be called in hard irq by netpoll, softirq context
by normal napi polling, and in normal sleepable context
by the network device close method.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Tue, 11 Mar 2014 21:19:14 +0000 (14:19 -0700)]
mlx4: Call dev_kfree_skby_any instead of dev_kfree_skb.
Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Tue, 11 Mar 2014 21:18:42 +0000 (14:18 -0700)]
ixgb: Call dev_kfree_skby_any instead of dev_kfree_skb.
Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Tue, 11 Mar 2014 21:18:14 +0000 (14:18 -0700)]
tg3: Call dev_kfree_skby_any instead of dev_kfree_skb.
Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Tue, 11 Mar 2014 21:17:41 +0000 (14:17 -0700)]
bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb.
Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Tue, 11 Mar 2014 21:16:58 +0000 (14:16 -0700)]
bonding: Call dev_kfree_skby_any instead of kfree_skb.
Replace kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Tue, 11 Mar 2014 21:16:14 +0000 (14:16 -0700)]
r8169: Call dev_kfree_skby_any instead of dev_kfree_skb.
Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Tue, 11 Mar 2014 21:15:36 +0000 (14:15 -0700)]
8139too: Call dev_kfree_skby_any instead of dev_kfree_skb.
Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Tue, 11 Mar 2014 21:14:58 +0000 (14:14 -0700)]
8139cp: Call dev_kfree_skby_any instead of kfree_skb.
Replace kfree_skb with dev_kfree_skb_any in cp_start_xmit
as it can be called in both hard irq and other contexts.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sathya Perla [Tue, 11 Mar 2014 13:23:09 +0000 (18:53 +0530)]
be2net: update driver version to 10.2
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Tue, 11 Mar 2014 13:23:08 +0000 (18:53 +0530)]
be2net: Fix vlans_added counter
When a VLAN is added by user, adapter->vlans_added is incremented.
But if the VLAN is already programmed in HW, driver ends up
incrementing the counter wrongly.
Increment the counter only if VLAN is not already programmed in the HW.
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Tue, 11 Mar 2014 13:23:07 +0000 (18:53 +0530)]
be2net: Create multiple TXQs on RSS capable multi-channel BE3-R interfaces
Currently the driver creates only a single TXQ on any BE3-R multi-channel
interface.
This patch changes this and creates multiple TXQs on RSS-capable multi-channel
BE3-R interfaces. This change helps improve the TX pps performance on the
affected interface.
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ravikumar Nelavelli [Tue, 11 Mar 2014 13:23:06 +0000 (18:53 +0530)]
be2net: fix pmac_id[] allocation size
The allocation size must be be_max_uc() and not "be_max_uc() + 1"
Signed-off-by: Ravikumar Nelavelli <ravikumar.nelavelli@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ravikumar Nelavelli [Tue, 11 Mar 2014 13:23:05 +0000 (18:53 +0530)]
be2net: log LPVID used in multi-channel configs
Signed-off-by: Ravikumar Nelavelli <ravikumar.nelavelli@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suresh Reddy [Tue, 11 Mar 2014 13:23:04 +0000 (18:53 +0530)]
be2net: Add link state control for VFs
Add support to control VF's link state by implementing the
ndo_set_vf_link_state() hook.
Signed-off-by: Suresh Reddy <suresh.reddy@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suresh Reddy [Tue, 11 Mar 2014 13:23:03 +0000 (18:53 +0530)]
be2net: Use GET_PROFILE_CONFIG cmd for BE3-R to query max-vfs
Use GET_PROFILE_CONFIG_V1 cmd even for BE3-R (it's already used for
Lancer-R and Skyhawk-R), to query max-vfs value supported by the FW.
This is needed as on some configs, the value exported in the PCI-config
space is not accurate.
Signed-off-by: Suresh Reddy <suresh.reddy@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 12 Mar 2014 19:57:26 +0000 (15:57 -0400)]
Merge branch 'mlx4-next'
Or Gerlitz says:
====================
mlx4: Add SRIOV support for RoCE
This series adds SRIOV support for RoCE (RDMA over Ethernet) to the mlx4 driver.
The patches are against net-next, as of commit
2d8d40a "pkt_sched: fq:
do not hold qdisc lock while allocating memory"
changes from V1:
- addressed feedback from Dave on patch #3 and changed get_real_sgid_index()
to be called fill_in_real_sgid_index() and be a void function.
- removed some checkpatch warnings on long lines
changes from V0:
- always check the return code of mlx4_get_roce_gid_from_slave().
The call we fixed is introduced in patch #1 and later removed by
patch #3 that allows guests to have multiple GIDS. The 1..3
separation was done for proper division of patches to logical changes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Wed, 12 Mar 2014 10:00:43 +0000 (12:00 +0200)]
mlx4: Activate RoCE/SRIOV
To activate RoCE/SRIOV, need to remove the following:
1. In mlx4_ib_add, need to remove the error return preventing
initialization of a RoCE port under SRIOV.
2. In update_vport_qp_params (in resource_tracker.c) need to remove
the error return when a RoCE RC or UD qp is detected.
This error return causes the INIT-to-RTR qp transition to fail
in the wrapper function under RoCE/SRIOV.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shani Michaelli [Wed, 12 Mar 2014 10:00:42 +0000 (12:00 +0200)]
mlx4_ib: Fix SIDR support of for UD QPs under SRIOV/RoCE
* Handle CM_SIDR_REQ_ATTR_ID and CM_SIDR_REP_ATTR_ID
in multiplex_cm_handler and demux_cm_handler.
* Handle Service ID Resolution messages and REQ messages
separately, for their formats are different.
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Wed, 12 Mar 2014 10:00:41 +0000 (12:00 +0200)]
mlx4: Implement IP based gids support for RoCE/SRIOV
Since there is no connection between the MAC/VLAN and the GID
when using IP-based addressing, the proxy QP1 (running on the
slave) must pass the source-mac, destination-mac, and vlan_id
information separately from the GID. Additionally, the Host
must pass the remote source-mac and vlan_id back to the slave,
This is achieved as follows:
Outgoing MADs:
1. Source MAC: obtained from the CQ completion structure
(struct ib_wc, smac field).
2. Destination MAC: obtained from the tunnel header
3. vlan_id: obtained from the tunnel header.
Incoming MADs
1. The source (i.e., remote) MAC and vlan_id are passed in
the tunnel header to the proxy QP1.
VST mode support:
For outgoing MADs, the vlan_id obtained from the header is
discarded, and the vlan_id specified by the Hypervisor is used
instead.
For incoming MADs, the incoming vlan_id (in the wc) is discarded, and the
"invalid" vlan (0xffff) is substituted when forwarding to the slave.
Signed-off-by: Moni Shoua <monis@mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Wed, 12 Mar 2014 10:00:40 +0000 (12:00 +0200)]
mlx4: Add ref counting to port MAC table for RoCE
The IB side of RoCE requires the MAC table index of the
MAC address used by its QPs.
To obtain the real MAC index, the IB side registers the
MAC (increasing its ref count, and also returning the
real MAC index) during the modify-qp sequence.
This protects against the ETH side deleting or modifying
that MAC table entry while the QP is active.
Note that until the modify-qp command returns success,
the MAC and VLAN information only has "candidate" status.
If the modify-qp succeeds, the "candidate" info is promoted
to the operational MAC/VLAN info for the qp. If the modify fails,
the candidate MAC/VLAN is unregistered, and the old qp info
is preserved.
The patch is a bit complex, because there are multiple qp
transitions where the primary-path information may be
modified: INIT-to-RTR, and SQD-to-SQD.
Similarly for the alternate path information.
Therefore the code must handle cases where path information
has already been entered into the QP context by previous
qp transitions.
For the MAC address, the success logic is as follows:
1. If there was no previous MAC, simply move the candidate
MAC information to the operational information, and reset
the candidate MAC info.
2. If there was a previous MAC, unregister it. Then move
the MAC information from candidate to operational, and
reset the candidate info (as in 1. above).
The MAC address failure logic is the same for all cases:
- Unregister the candidate MAC, and reset the candidate MAC info.
For Vlan registration, the logic is similar.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Wed, 12 Mar 2014 10:00:39 +0000 (12:00 +0200)]
mlx4: In RoCE allow guests to have multiple GIDS
The GIDs are statically distributed, as follows:
PF: gets 16 GIDs
VFs: Remaining GIDS are divided evenly between VFs activated by the driver.
If the division is not even, lower-numbered VFs get an extra GID.
For an IB interface, the number of gids per guest remains as before: one gid per guest.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Wed, 12 Mar 2014 10:00:38 +0000 (12:00 +0200)]
mlx4_core: For RoCE, allow slaves to set the GID entry at that slave's index
For IB transport, the host determines the slave GIDs. For ETH (RoCE),
however, the slave's GID is determined by the IP address that the slave
itself assigns to the ETH device used by RoCE.
In this case, the slave must be able to write its GIDs to the HCA gid table
(at the GID indices that slave "owns").
This commit adds processing for the SET_PORT_GID_TABLE opcode modifier
for the SET_PORT command wrapper (so that slaves may modify their GIDS
for RoCE).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Wed, 12 Mar 2014 10:00:37 +0000 (12:00 +0200)]
mlx4: Adjust QP1 multiplexing for RoCE/SRIOV
This requires the following modifications:
1. Fix build_mlx4_header to properly fill in the ETH fields
2. Adjust mux and demux QP1 flow to support RoCE.
This commit still assumes only one GID per slave for RoCE.
The commit enabling multiple GIDs is a subsequent commit, and
is done separately because of its complexity.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 12 Mar 2014 19:53:54 +0000 (15:53 -0400)]
Merge branch 'tipc'
Jon Maloy says:
====================
tipc: simplifications in socket and port layer
After the removal of the tipc native API the relation between
tipc_port and its API types is strictly one-to-one, i.e, the latter
can now only be a socket API. This change opens up for
simplifications both in the code, data and locking structure.
We start with this series, where we ensure that port and socket
structures are co-allocated. Note that the first commit in the
series is unrelated to the above.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Wed, 12 Mar 2014 15:31:13 +0000 (11:31 -0400)]
tipc: eliminate redundant lookups in registry
As an artefact from the native interface, the message sending functions
in the port takes a port ref as first parameter, and then looks up in
the registry to find the corresponding port pointer. This despite the
fact that the only currently existing caller, tipc_sock, already knows
this pointer.
We change the signature of these functions to take a struct tipc_port*
argument, and remove the redundant lookups.
We also remove an unmotivated extra lookup in the function
socket.c:auto_connect(), and, as the lookup functions tipc_port_deref()
and ref_deref() now become unused, we remove these two functions.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Wed, 12 Mar 2014 15:31:12 +0000 (11:31 -0400)]
tipc: align usage of variable names and macros in socket
The practice of naming variables in TIPC is inconistent, sometimes
even within the same file.
In this commit we align variable names and declarations within
socket.c, and function and macro names within socket.h. We also
reduce the number of conversion macros to two, in order to make
usage less obsure.
These changes are purely cosmetic.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Wed, 12 Mar 2014 15:31:11 +0000 (11:31 -0400)]
tipc: eliminate redundant locking
The three functions tipc_portimportance(), tipc_portunreliable() and
tipc_portunreturnable() and their corresponding tipc_set* functions,
are all grabbing port_lock when accessing the targeted port. This is
unnecessary in the current code, since these calls only are made from
within socket downcalls, already protected by sock_lock.
We remove the redundant locking. Also, since the functions now become
trivial one-liners, we move them to port.h and make them inline.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Wed, 12 Mar 2014 15:31:10 +0000 (11:31 -0400)]
tipc: eliminate upcall function pointers between port and socket
Due to the original one-to-many relation between port and user API
layers, upcalls to the API have been performed via function pointers,
installed in struct tipc_port at creation. Since this relation now
always is one-to-one, we can instead use ordinary function calls.
We remove the function pointers 'dispatcher' and ´wakeup' from
struct tipc_port, and replace them with calls to the renamed
functions tipc_sk_rcv() and tipc_sk_wakeup().
At the same time we change the name and signature of the functions
tipc_createport() and tipc_deleteport() to reflect their new role
as mere initialization/destruction functions.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Wed, 12 Mar 2014 15:31:09 +0000 (11:31 -0400)]
tipc: aggregate port structure into socket structure
After the removal of the tipc native API the relation between
a tipc_port and its API types is strictly one-to-one, i.e, the
latter can now only be a socket API. There is therefore no need
to allocate struct tipc_port and struct sock independently.
In this commit, we aggregate struct tipc_port into struct tipc_sock,
hence saving both CPU cycles and structure complexity.
There are no functional changes in this commit, except for the
elimination of the separate allocation/freeing of tipc_port.
All other changes are just adaptatons to the new data structure.
This commit also opens up for further code simplifications and
code volume reduction, something we will do in later commits.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Wed, 12 Mar 2014 15:31:08 +0000 (11:31 -0400)]
tipc: remove redundant 'peer_name' field in struct tipc_sock
The field 'peer_name' in struct tipc_sock is redundant, since
this information already is available from tipc_port, to which
tipc_sock has a reference.
We remove the field, and ensure that peer node and peer port
info instead is fetched via the functions that already exist
for this purpose.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Wed, 12 Mar 2014 15:31:07 +0000 (11:31 -0400)]
tipc: replace reference table rwlock with spinlock
The lock for protecting the reference table is declared as an
RWLOCK, although it is only used in write mode, never in read
mode.
We redefine it to become a spinlock.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert [Wed, 12 Mar 2014 08:43:17 +0000 (09:43 +0100)]
flowcache: Fix resource leaks on namespace exit.
We leak an active timer, the hotcpu notifier and all allocated
resources when we exit a namespace. Fix this by introducing a
flow_cache_fini() function where we release the resources before
we exit.
Fixes: ca925cf1534e ("flowcache: Make flow cache name space aware")
Reported-by: Jakub Kicinski <moorray3@wp.pl>
Tested-by: Jakub Kicinski <moorray3@wp.pl>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Wed, 12 Mar 2014 17:22:37 +0000 (10:22 -0700)]
lg-vl600: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite awhile now.
Make these uses consistent with the rest of the kernel.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Wed, 12 Mar 2014 17:22:36 +0000 (10:22 -0700)]
xilinx: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite awhile now.
Make these uses consistent with the rest of the kernel.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Wed, 12 Mar 2014 17:22:30 +0000 (10:22 -0700)]
brocade: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite awhile now.
Make these uses consistent with the rest of the kernel.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Wed, 12 Mar 2014 17:04:20 +0000 (10:04 -0700)]
tipc: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite awhile now.
Make these uses consistent with the rest of the kernel.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Wed, 12 Mar 2014 17:04:18 +0000 (10:04 -0700)]
ieee802154: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite awhile now.
Make these uses consistent with the rest of the kernel.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Wed, 12 Mar 2014 17:04:17 +0000 (10:04 -0700)]
net: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite awhile now.
Make these uses consistent with the rest of the kernel.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Wed, 12 Mar 2014 17:04:15 +0000 (10:04 -0700)]
8021q: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite awhile now.
Make these uses consistent with the rest of the kernel.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Manoil [Tue, 11 Mar 2014 16:01:24 +0000 (18:01 +0200)]
gianfar: Fix multi-queue support checks @probe()
priv is not instantiated at gfar_of_init() time, when
parsing the DT for info on supported HW queues. Before
the netdev can be allocated, the number of supported
queues must be known. Because the number of supported
queues depends on device type, move the compatibility
checks before netdev allocation. Local vars are used
to hold the operation mode info before netdev allocation.
This fixes the null accesses for priv->.., in gfar_of_init.
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hayeswang [Tue, 11 Mar 2014 08:24:19 +0000 (16:24 +0800)]
r8152: support dumping the hw counters
Add dumping the tally counter by ethtool.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Stilwell [Tue, 11 Mar 2014 00:29:25 +0000 (19:29 -0500)]
ieee802154: at86rf230: add support for rf233 chip
The rf233 and rf231 are sufficiently similar that we can treat
rf233 like rf231.
rf233 is missing some features that rf231 has, but we don't currently
make use of them so there's nothing to handle differently yet.
Should we add support in the future for rf231 *_NOCLK or SLEEP states,
or PAD_IO drive strength, exceptions will need to be made for rf233.
Signed-off-by: Thomas Stilwell <stilwellt@openlabs.co>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 12 Mar 2014 03:54:56 +0000 (23:54 -0400)]
Merge branch 'pkt_sched_cond_resched'
Eric Dumazet says:
====================
pkt_sched: allow scheduling points
We have seen delays of more than 50ms in class or qdisc dumps, in case
device is under high TX stress, even with the prior 4KB per skb limit.
With the new 16KB limit, this could translate to 200ms delays.
Add cond_resched() to give a chance to higher prio tasks to get cpu.
But before doing so, we need to remove the rcu locking from tc_dump_qdisc()
as David spotted.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 11 Mar 2014 00:11:43 +0000 (17:11 -0700)]
pkt_sched: add cond_resched() to class and qdisc dump
We have seen delays of more than 50ms in class or qdisc dumps, in case
device is under high TX stress, even with the prior 4KB per skb limit.
Add cond_resched() to give a chance to higher prio tasks to get cpu.
Signed-off-by; Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 11 Mar 2014 00:11:42 +0000 (17:11 -0700)]
pkt_sched: do not use rcu in tc_dump_qdisc()
Like all rtnetlink dump operations, we hold RTNL in tc_dump_qdisc(),
so we do not need to use rcu protection to protect list of netdevices.
This will allow preemption to occur, thus reducing latencies.
Following patch adds explicit cond_resched() calls.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Mon, 10 Mar 2014 16:48:38 +0000 (09:48 -0700)]
bonding: force cast of IP address in options
The option code is taking IP address and putting it into a generic
container. Force cast to silence sparse warnings.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Mon, 10 Mar 2014 16:41:46 +0000 (09:41 -0700)]
netdev: set __percpu attribute on netdev_alloc_pcpu_stats
This patch fixes sparse warnings in vlan driver.
It propagates the sparse __percpu attribute from alloc_percpu
into netdev_alloc_pcpu_stats. I expect it may trigger additional
sparse warnings from other drivers that are missing the __percpu
attribute.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Tue, 11 Mar 2014 02:40:08 +0000 (10:40 +0800)]
ipv6: ip6_forward: perform skb->pkt_type check at the beginning
Packets which have L2 address different from ours should be
already filtered before entering into ip6_forward().
Perform that check at the beginning to avoid processing such packets.
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hayeswang [Tue, 11 Mar 2014 02:20:32 +0000 (10:20 +0800)]
r8152: add skb_cow_head
Call skb_cow_head() before editing the tx packet header. The header
would be reallocated if it is shared.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tobias Klauser [Mon, 10 Mar 2014 12:12:23 +0000 (13:12 +0100)]
net: eth: cpsw: Use net_device_stats from struct net_device
Instead of using an own copy of struct net_device_stats in struct
cpsw_priv, use stats from struct net_device. Also remove the thus
unnecessary .ndo_get_stats function, as it just returns dev->stats,
which is the default.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 10 Mar 2014 14:09:07 +0000 (07:09 -0700)]
flowcache: restore a single flow_cache kmem_cache
It is not legal to create multiple kmem_cache having the same name.
flowcache can use a single kmem_cache, no need for a per netns
one.
Fixes: ca925cf1534e ("flowcache: Make flow cache name space aware")
Reported-by: Jakub Kicinski <moorray3@wp.pl>
Tested-by: Jakub Kicinski <moorray3@wp.pl>
Tested-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gu Zheng [Mon, 10 Mar 2014 01:57:34 +0000 (09:57 +0800)]
net: add a pre-check of net_ns in sk_change_net()
We do not need to switch the net_ns if the target net_ns the same
as the current one, so here we add a pre-check of net_ns to avoid
this as David suggested.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 10 Mar 2014 00:36:02 +0000 (17:36 -0700)]
tcp: timestamp SYN+DATA messages
All skb in socket write queue should be properly timestamped.
In case of FastOpen, we special case the SYN+DATA 'message' as we
queue in socket wrote queue the two fallback skbs:
1) SYN message by itself.
2) DATA segment by itself.
We should make sure these skbs have proper timestamps.
Add a WARN_ON_ONCE() to eventually catch future violations.
Fixes: 740b0f1841f6 ("tcp: switch rtt estimations to usec resolution")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Haiyang Zhang [Sun, 9 Mar 2014 23:10:59 +0000 (16:10 -0700)]
hyperv: Change the receive buffer size for legacy hosts
Due to a bug in the Hyper-V host verion 2008R2, we need to use a slightly smaller
receive buffer size, otherwise the buffer will not be accepted by the legacy hosts.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Aring [Sun, 9 Mar 2014 08:51:40 +0000 (09:51 +0100)]
6lowpan: reassembly: fix access of ctl table entry
Correct offset is 3 of the 6lowpanfrag_max_datagram_size value in proc
entry ctl table and not 2.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 10 Mar 2014 19:52:17 +0000 (15:52 -0400)]
Merge branch 'hyperv-next'
K. Y. Srinivasan says:
====================
Drivers: net: hyperv: Enable various offloads
This patch set enables both checksum as well as segmentation offload.
As part of this effort I have enabled scatter gather I/O a well.
In version 2 of these patches, I addressed comments from David Miller and
Dan Carpenter.
In this version I have addressed the latest comments from David Miller.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
KY Srinivasan [Sun, 9 Mar 2014 03:23:18 +0000 (19:23 -0800)]
Drivers: net: hyperv: Enable large send offload
Enable segmentation offload.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
KY Srinivasan [Sun, 9 Mar 2014 03:23:17 +0000 (19:23 -0800)]
Drivers: net: hyperv: Enable send side checksum offload
Enable send side checksum offload.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
KY Srinivasan [Sun, 9 Mar 2014 03:23:16 +0000 (19:23 -0800)]
Drivers: net: hyperv: Enable receive side IP checksum offload
Enable receive side checksum offload.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
KY Srinivasan [Sun, 9 Mar 2014 03:23:15 +0000 (19:23 -0800)]
Drivers: net: hyperv: Enable offloads on the host
Prior to enabling guest side offloads, enable the offloads on the host.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
KY Srinivasan [Sun, 9 Mar 2014 03:23:14 +0000 (19:23 -0800)]
Drivers: net: hyperv: Cleanup the send path
In preparation for enabling offloads, cleanup the send path.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
KY Srinivasan [Sun, 9 Mar 2014 03:23:13 +0000 (19:23 -0800)]
Drivers: net: hyperv: Enable scatter gather I/O
Cleanup the code and enable scatter gather I/O.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tim Harvey [Sat, 8 Mar 2014 04:59:53 +0000 (20:59 -0800)]
sky2: allow mac to come from dt
The driver reads the mac address from the device registers which would
need to have been programmed by the bootloader. This patch adds
the ability to pull the mac from devicetree via the pci device dt node.
Signed-off-by: Tim Harvey <tharvey@gateworks.com>
Cc: netdev@vger.kernel.org
Cc: devicetree@vger.kernel.org
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Changes since v2:
- eliminated use of stack tmpaddr per feedback
Changes since v1:
- simplified based on feedback
- fixed formatting
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 7 Mar 2014 22:57:43 +0000 (14:57 -0800)]
l2tp: fix unused variable warning
net/l2tp/l2tp_core.c:1111:15: warning: unused variable
'sk' [-Wunused-variable]
Fixes: 31c70d5956fc ("l2tp: keep original skb ownership")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kleber Sacilotto de Souza [Fri, 7 Mar 2014 22:48:25 +0000 (19:48 -0300)]
IB/mlx5_core: remove unreachable function call in module init
The call to mlx5_health_cleanup() in the module init function can never
be reached. Removing it.
Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Acked-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 7 Mar 2014 20:02:33 +0000 (12:02 -0800)]
netlink: autosize skb lengthes
One known problem with netlink is the fact that NLMSG_GOODSIZE is
really small on PAGE_SIZE==4096 architectures, and it is difficult
to know in advance what buffer size is used by the application.
This patch adds an automatic learning of the size.
First netlink message will still be limited to ~4K, but if user used
bigger buffers, then following messages will be able to use up to 16KB.
This speedups dump() operations by a large factor and should be safe
for legacy applications.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Thomas Graf <tgraf@suug.ch>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Fri, 7 Mar 2014 18:27:41 +0000 (18:27 +0000)]
sfc: Use ether_addr_copy and eth_broadcast_addr
Faster than memcpy/memset on some architectures.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 10 Mar 2014 17:17:44 +0000 (13:17 -0400)]
Merge branch 'gianfar-next'
Claudiu Manoil says:
====================
gianfar: Tx timeout issue
There's an older Tx timeout issue showing up on etsec2 devices
with 2 CPUs. I pinned this issue down to processing overhead
incurred by supporting multiple Tx/Rx rings, as explained in
the 2nd patch below. But before this, there's also a concurency
issue leading to Rx/Tx spurrious interrupts, addressed by the
'Tx NAPI' patch below.
The Tx timeout can be triggered with multiple Tx flows,
'iperf -c -N 8' commands, on a 2 CPUs etsec2 based (P1020) board.
Before the patches:
"""
root@p1020rdb-pc:~# iperf -c 172.16.1.3 -n 1000M -P 8 &
[...]
root@p1020rdb-pc:~# NETDEV WATCHDOG: eth1 (fsl-gianfar): transmit queue 1 timed out
WARNING: at net/sched/sch_generic.c:279
Modules linked in:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted
3.13.0-rc3-03386-g89ea59c #23
task:
ed84ef40 ti:
ed868000 task.ti:
ed868000
NIP:
c04627a8 LR:
c04627a8 CTR:
c02fb270
REGS:
ed869d00 TRAP: 0700 Not tainted (
3.13.0-rc3-03386-g89ea59c)
MSR:
00029000 <CE,EE,ME> CR:
44000022 XER:
20000000
[...]
root@p1020rdb-pc:~# [ ID] Interval Transfer Bandwidth
[ 5] 0.0-19.3 sec 1000 MBytes 434 Mbits/sec
[ 8] 0.0-39.7 sec 1000 MBytes 211 Mbits/sec
[ 9] 0.0-40.1 sec 1000 MBytes 209 Mbits/sec
[ 3] 0.0-40.2 sec 1000 MBytes 209 Mbits/sec
[ 10] 0.0-59.0 sec 1000 MBytes 142 Mbits/sec
[ 7] 0.0-74.6 sec 1000 MBytes 112 Mbits/sec
[ 6] 0.0-74.7 sec 1000 MBytes 112 Mbits/sec
[ 4] 0.0-74.7 sec 1000 MBytes 112 Mbits/sec
[SUM] 0.0-74.7 sec 7.81 GBytes 898 Mbits/sec
root@p1020rdb-pc:~# ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:04:9f:00:13:01
inet addr:172.16.1.1 Bcast:172.16.255.255 Mask:255.255.0.0
inet6 addr: fe80::204:9fff:fe00:1301/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:708722 errors:0 dropped:0 overruns:0 frame:0
TX packets:
8717849 errors:6 dropped:0 overruns:1470 carrier:0
collisions:0 txqueuelen:1000
RX bytes:
58118018 (55.4 MiB) TX bytes:
274069482 (261.3 MiB)
Base address:0xa000
"""
After applying the patches:
"""
root@p1020rdb-pc:~# iperf -c 172.16.1.3 -n 1000M -P 8 &
[...]
root@p1020rdb-pc:~# [ ID] Interval Transfer Bandwidth
[ 9] 0.0-70.5 sec 1000 MBytes 119 Mbits/sec
[ 5] 0.0-70.5 sec 1000 MBytes 119 Mbits/sec
[ 6] 0.0-70.7 sec 1000 MBytes 119 Mbits/sec
[ 4] 0.0-71.0 sec 1000 MBytes 118 Mbits/sec
[ 8] 0.0-71.1 sec 1000 MBytes 118 Mbits/sec
[ 3] 0.0-71.2 sec 1000 MBytes 118 Mbits/sec
[ 10] 0.0-71.3 sec 1000 MBytes 118 Mbits/sec
[ 7] 0.0-71.3 sec 1000 MBytes 118 Mbits/sec
[SUM] 0.0-71.3 sec 7.81 GBytes 942 Mbits/sec
root@p1020rdb-pc:~# ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:04:9f:00:13:01
inet addr:172.16.1.1 Bcast:172.16.255.255 Mask:255.255.0.0
inet6 addr: fe80::204:9fff:fe00:1301/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:728446 errors:0 dropped:0 overruns:0 frame:0
TX packets:
8690057 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:
59732650 (56.9 MiB) TX bytes:
271554306 (258.9 MiB)
Base address:0xa000
"""
v2: PATCH 2:
Replaced CPP check with run-time condition to
limit the number of queues. Updated comments.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Manoil [Fri, 7 Mar 2014 12:42:46 +0000 (14:42 +0200)]
gianfar: Use Single-Queue polling for "fsl,etsec2"
For the "fsl,etsec2" compatible models the driver currently
supports 8 Tx and Rx DMA rings (aka HW queues). However, there
are only 2 pairs of Rx/Tx interrupt lines, as these controllers
are integrated in low power SoCs with 2 CPUs at most. As a result,
there are at most 2 NAPI instances that have to service multiple
Tx and Rx queues for these devices. This complicates the NAPI
polling routine having to iterate over the mutiple Rx/Tx queues
hooked to the same interrupt lines. And there's also an overhead
at HW level, as the controller needs to service all the 8 Tx rings
in a round robin manner. The combined overhead shows up for multi
parallel Tx flows transmitted by the kernel stack, when the driver
usually starts returning NETDEV_TX_BUSY leading to NETDEV WATCHDOG
Tx timeout triggering if the Tx path is congested for too long.
As an alternative, this patch makes the driver support only one
Tx/Rx DMA ring per NAPI instance (per interrupt group or pair
of Tx/Rx interrupt lines) by default. The simplified single queue
polling routine (gfar_poll_sq) will be the default napi poll routine
for the etsec2 devices too. Some adjustments needed to be made to
link the Tx/Rx HW queues with each NAPI instance (2 in this case).
The gfar_poll_sq() is already successfully used by older SQ_SG_MODE
(single interrupt group) controllers.
This patch fixes Tx timeout triggering under heavy Tx traffic load
(i.e. iperf -c -P 8) for the "fsl,etsec2" (currently the only
MQ_MG_MODE devices). There's also a significant memory footprint
reduction by supporting 2 Rx/Tx DMA rings (at most), instead of 8,
for these devices.
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Manoil [Fri, 7 Mar 2014 12:42:45 +0000 (14:42 +0200)]
gianfar: Separate out the Tx interrupt handling (Tx NAPI)
There are some concurrency issues on devices w/ 2 CPUs related
to the handling of Rx and Tx interrupts. eTSEC has separate
interrupt lines for Rx and Tx but a single imask register
to mask these interrupts and a single NAPI instance to handle
both Rx and Tx work. As a result, the Rx and Tx ISRs are
identical, both are invoking gfar_schedule_cleanup(), however
both handlers can be entered at the same time when the Rx and
Tx interrupts are taken by different CPUs. In this case
spurrious interrupts (SPU) show up (in /proc/interrupts)
indicating a concurrency issue. Also, Tx overruns followed
by Tx timeout have been observed under heavy Tx traffic load.
To address these issues, the schedule cleanup ISR part has
been changed to handle the Rx and Tx interrupts independently.
The patch adds a separate NAPI poll routine for Tx cleanup to
be triggerred independently by the Tx confirmation interrupts
only. Existing poll functions are modified to handle only
the Rx path processing. The Tx poll routine does not need a
budget, since Tx processing doesn't consume NAPI budget, and
hence it is registered with minimum NAPI weight.
NAPI scheduling does not require locking since there are
different NAPI instances between the Rx and Tx confirmation
paths now.
So, the patch fixes the occurence of spurrious Rx/Tx interrupts.
Tx overruns also occur less frequently now.
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Fri, 7 Mar 2014 10:45:31 +0000 (18:45 +0800)]
vlan: use use ether_addr_equal_64bits to instead of ether_addr_equal
Ether_addr_equal_64bits is more efficient than ether_addr_equal, and
can be used when each argument is an array within a structure that
contains at least two bytes of data beyond the array, so it is safe
to use it for vlan, and make sense for fast path.
Cc: Joe Perches <joe@perches.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>