Programming Languages Research Group: Git - firefly-linux-kernel-4.4.55.git/log

vxlan: add x-netns support

This patch allows to switch the netns when packet is encapsulated or
decapsulated.
The vxlan socket is openned into the i/o netns, ie into the netns where
encapsulated packets are received. The socket lookup is done into this netns to
find the corresponding vxlan tunnel. After decapsulation, the packet is
injecting into the corresponding interface which may stand to another netns.

When one of the two netns is removed, the tunnel is destroyed.

Configuration example:
ip netns add netns1
ip netns exec netns1 ip link set lo up
ip link add vxlan10 type vxlan id 10 group 239.0.0.10 dev eth0 dstport 0
ip link set vxlan10 netns netns1
ip netns exec netns1 ip addr add 192.168.0.249/24 broadcast 192.168.0.255 dev vxlan10
ip netns exec netns1 ip link set vxlan10 up

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bcm63xx_enet: Use ARRAY_SIZE instead of open coding it

Use the ARRAY_SIZE macro to determine the number of entries in
bcm_enet_gstrings_stats.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git./linux/kernel/git/davem/net

Conflicts:
drivers/net/ethernet/intel/igb/e1000_mac.c
net/core/filter.c

Both conflicts were simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>

smc91x: improve definition of debug macros

Redefine some macros that were conditioned upon SMC_DEBUG level.

By allowing compiler to verify parameters used by these macros
unconditionally, we can flag compilation failures.

Compiler will still optimize out the unused code path depending on
SMC_DEBUG, so this is a net gain.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: Add tlb_dynamic_lb parameter for tlb mode

The aggresive load balancing causes packet re-ordering as active
flows are moved from a slave to another within the group. Sometime
this aggresive lb is not necessary if the preference is for less
re-ordering. This parameter if used with value "0" disables
this dynamic flow shuffling minimizing packet re-ordering. Of course
the side effect is that it has to live with the static load balancing
that the hashing distribution provides. This impact is less severe if
the correct xmit-hashing-policy is used for the tlb setup.

The default value of the parameter is set to "1" mimicing the earlier
behavior.

Ran the netperf test with 200 stream for 1 min between two hosts with
4x1G trunk (xmit-lb mode with xmit-policy L3+4) before and after these
changes. Following was the command used for those 200 instances -

    netperf -t TCP_RR -l 60 -s 5 -H <host> -- -r81920,81920

Transactions per second:
    Before change: 1,367.11
    After  change: 1,470.65

Change-Id: Ie3f75c77282cf602e83a6e833c6eb164e72a0990
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: Added bond_tlb_xmit() for tlb mode.

Re-organized the xmit function for the lb mode separating tlb xmit
from the alb mode. This will enable use of the hashing policies
like 802.3ad mode. Also extended use of xmit-hash-policy to tlb mode.

Now the tlb-mode defaults to BOND_XMIT_POLICY_LAYER2 if the xmit policy
module parameter is not set (just like 802.3ad, or Xor mode).

Change-Id: I140257403d272df75f477b380207338d0f04963e
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: Reorg bond_alb_xmit code

Separating the actual xmit part from the function in a separate
function that can be used in the tlb_xmit in the next patch. Also
there is no reason do_tx_balance to be an int so changing it to
bool type.

Change-Id: I9c48ff30487810f68587e621a191db616f49bd3b
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: Changed hashing function to just provide hash

Modified the hash function to return just hash separating from the
modulo operation that can be performed by the caller. This is to
make way for the tlb mode to use the same hashing policies that
are used in the 802.3ad and Xor mode.

Change-Id: I276609e87e0ca213c4d1b17b79c5e0b0f3d0dd6f
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6lowpan: nuke net_ieee802154_lowpan() accessor when 6lowpan is disabled

Johannes noted this is not needed, all of the fragment
accessors don't need CONFIG_NET_NS. This goes test compiled with
CONFIG_BT_6LOWPAN=y and a disabled CONFIG_NET_NS.

CC: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: linux-zigbee-devel@lists.sourceforge.net
Cc: David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to ixgbe, ixgbevf, e1000e, igb and i40e.

Jacob converts the ixgbe low_water into an array which allows the
algorithm to output different values for different TCs and we can
distinguish between them.  Removes vlan_filter_disable() and
vlan_filter_enable() in ixgbe so that we can do the work directly in
set_rx_mode().  Changes the setting of multicast filters only when
the interface is not in promiscuous mode for multicast packets in
ixgbe.  Improves MAC filter handling by adding mac_table API based
on work done for igb, which includes functions to add/delete MAC
filters.

Mark changes register reads in ixgbe to an out-of-line function since
register reads are slow.

Emil provides a ixgbevf patch to update the driver description since
it supports more than just 82599 parts now.

David provides several cleanup patches for e1000e which resolve some
checkpatch issues as well as changing occurrences of returning 0 or 1 in
bool functions to returning true false or true.

Carolyn provides several cleanup patches for igb which fix checkpatch
warnings.

Mitch provides a fix for i40evf where the driver would correctly allow
the virtual function link state to be controlled by 'ip set link', but
would not report it correctly back.  This is fixed by filling out
the appropriate field in the VF info struct.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: filter: initialize A and X registers

exisiting BPF verifier allows uninitialized access to registers,
'ret A' is considered to be a valid filter.
So initialize A and X to zero to prevent leaking kernel memory
In the future BPF verifier will be rejecting such filters

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/phy: Remove return value for void function

This was caught when using a spatch (aka. coccinelle) script
written by Joe Perches.

Cc: Joe Perches <joe@perches.com>
Signed-off-by: Shruti Kanetkar <Shruti@Freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'via-rhine'

Alexey Charkov says:

====================
net: via-rhine: add support for on-chip Rhine controllers

This series introduces platform bus (OpenFirmware) binding for
via-rhine, as used in various ARM-based Systems-on-Chip by
VIA/WonderMedia.

This has been tested in OF configuration by myself on a WM8950-based VIA
APC Rock development board and on a WM8850-based netbook, and in PCI
configuration by Roger.

Please note that the initial version of these patches was signed off by
Roger, but some time has passed since then, so I'm not including his
sign-off until explicit notice.

Changes since v1:
- Fixed indentation of function arguments
- Switched to 'dev_is_pci' instead of string comparison on bus name
- Dropped 'rhine,revision' DT attribute, put the revision into OF match
table instead
- Included actual device tree nodes where applicable
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: via-rhine: add OF bus binding

This should make the driver usable with VIA/WonderMedia ARM-based
Systems-on-Chip integrated Rhine III adapters. Note that these
are always in MMIO mode, and don't have any known EEPROM.

Signed-off-by: Alexey Charkov <alchark@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: via-rhine: reduce usage of the PCI-specific struct

Use more generic data structures instead of struct pci_dev wherever
possible in preparation for OF bus binding

Signed-off-by: Alexey Charkov <alchark@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: via-rhine: switch to generic DMA functions

Remove legacy PCI DMA wrappers and instead use generic DMA functions
directly in preparation for OF bus binding

Signed-off-by: Alexey Charkov <alchark@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Update my email address

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

vxlan: ensure to advertise the right fdb remote

The goal of this patch is to fix rtnelink notification. The main problem was
about notification for fdb entry with more than one remote. Before the patch,
when a remote was added to an existing fdb entry, the kernel advertised the
first remote instead of the added one. Also when a remote was removed from a fdb
entry with several remotes, the deleted remote was not advertised.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/phy: micrel: fix bugged test on device tree loading for ksz9021

In ksz9021_load_values_from_of() val2 to val4 aren't tested against their
initialization value.
This causes the test to always succeed, and this value to be used as if it
was loaded from the devicetree instead of being ignored, in case of a
missing/invalid property in the ethernet OF device node.
As a result, the value "0" is written to the relevant registers.

Change the conditions to test against the right initialization value.

Signed-off-by: Hubert Chaumette <hchaumette@adeneo-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

macvlan: Fix leak and NULL dereference on error path

The recent patch that moved broadcasts to process context added
a couple of bugs on the error path where we may dereference NULL
or leak an skb. This patch fixes them.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'gre_netns'

Nicolas Dichtel says:

====================
gre: allow to switch netns during encap/decap

The goal of this serie is to add x-netns support for the module ip_gre and
ip6_gre, ie the encapsulation addresses and the network device are not owned by
the same namespace.

Example to configure an ipv4 gre tunnel:

modprobe ip_gre
ip netns add netns1
ip netns exec netns1 ip link set lo up
ip link add gre1 type gre remote 10.16.0.121 local 10.16.0.249 ikey 10 okey 10 csum
ip link set gre1 netns netns1
ip netns exec netns1 ip link set gre1 up
ip netns exec netns1 ip addr add dev gre1 192.168.0.249 remote 192.168.0.121
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

ip6gre: add x-netns support

This patch allows to switch the netns when packet is encapsulated or
decapsulated. In other word, the encapsulated packet is received in a netns,
where the lookup is done to find the tunnel. Once the tunnel is found, the
packet is decapsulated and injecting into the corresponding interface which
stands to another netns.

When one of the two netns is removed, the tunnel is destroyed.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

gre: add x-netns support

This patch allows to switch the netns when packet is encapsulated or
decapsulated. In other word, the encapsulated packet is received in a netns,
where the lookup is done to find the tunnel. Once the tunnel is found, the
packet is decapsulated and injecting into the corresponding interface which
stands to another netns.

When one of the two netns is removed, the tunnel is destroyed.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

hyperv: Simplify the send_completion variables

The union contains only one member now, so we use the variables in it directly.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

hyperv: Remove recv_pkt_list and lock

Removed recv_pkt_list and lock, and updated related code, so that
the locking overhead is reduced especially when multiple channels
are in use.

The recv_pkt_list isn't actually necessary because the packets are
processed sequentially in each channel. It has been replaced by a
local variable, and the related lock for this list is also removed.
The is_data_pkt field is not used in receive path, so its assignment
is cleaned up.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

i40e: report VF link state correctly

Although the driver would correctly allow the VF link state to be
controlled by 'ip set link', it would not report it correctly back.

Fix this by filling out the appropriate field in the vf info struct.

Change-ID: I58d8e356438190e1ee9660b424301af6f416cdbe
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

igb: Cleanups to fix incorrect indentation

This patch fixes WARNING:LEADING_SPACE, WARNING:SPACING, ERROR:SPACING,
WARNING:SPACE_BEFORE_TAB and ERROR_CODE_INDENT from checkpatch file check.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

igb: Cleanups to fix braces location warnings

This patch fixes WARNING:BRACES and ERROR:OPEN_BRACE from
checkpatch file check.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

igb: Cleanups for messaging

This patch fixes WARNING:PREFER_PR_LEVEL and WARNING:SPLIT_STRING
from checkpatch file check.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: Cleanup use of deprecated DEFINE_PCI_DEVICE_TABLE

Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: Cleanup checkpatch extra space

Fixing "WARNING:SPACING: Unnecessary space before function pointer arguments"

Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: Cleanup to fix checkpatch missing blank lines

Fixing "WARNING:SPACING: networking uses a blank line after declarations"

Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: Cleanup return values in ethtool

Changing occurrences of returning 0 and 1 from bool functions to false and
true, respectively

Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbevf: remove 82599 from the module description

This patch removes 82599 from the description of the ixgbevf module
since the VF driver is supported on other parts as well.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: improve mac filter handling

Add mac_table API based on work done for igb, which includes functions
to add and delete mac filters. This simplifies code for various entities
that use MAC filters such as VMDQ, SR-IOV, MACVLAN, and such.

Reported-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: change handling of multicast filters

In line with changes done by Alex Duyck regarding unicast filters, we
now only set multicast filters when the interface is not in promiscuous
mode for multicast packets. This also has an impact on the RAR usage
such that SR-IOV has some RARs reserved for its own usage.

Reported-by: Alex Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: remove vlan_filter_disable and enable functions

Previously these functions handled stripping setup as well, but this has
already been removed from these functions. Rather than encapsulating
this into a function, we can just do the work directly in set_rx_mode.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: Use out-of-line function for register reads

Register reads are slow, so don't inline them.

Size before:
   text    data     bss     dec     hex filename
226337    8280     552 235169   396a1 ixgbe.ko

Size after:
   text    data     bss     dec     hex filename
194578    8280     552 203410   31a92 ixgbe.ko

for about a 14% reduction in text size.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: convert low_water into an array

Since fc.high_water is an array, we should treat low_water as an array
also. This allows the algorithm to output different values for different
TCs, and then we can distinguish between them. In addition, this patch
changes one path that didn't honor the return value from ixgbe_setup_fc.

Reported-by: Aaron Salter <aaron.k.salter@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to i40e and i40evf.

Greg provides two patches for i40e, the first adds the netdev ops to support
the addition of static FDB entries in the physical function (PF) MAC/VLAN
filter table so that the virtual functions (VFs) can communicate with
bridged virtual Ethernet ports such as those provided by the virtio
driver.  The second is to fix an issue where the assignment of a port
VLAN after it is already up and running requires the VF driver to be
reloaded, so print a message warning the host administrator about the
need to reload the VF driver.  In addition, knock the VF offline so that
it does not continue to receive traffic not on the port VLAN assigned to
it.

Jesse provides a patch for i40e and i40evf to unhide and enable the
PREFENA field in the receive host memory cache (RX-HMC) for best
performance.

Mitch provides a i40e patch to implement the net device op for Tx
bandwidth setting.

Catherine removes a firmware workaround that is no longer needed with
the latest firmware for i40e.  She also provides some minor cleanups
as well bumps the driver versions.

Anjali provides a fix for i40e displaying IPv4 flow director filters
which needed additional information to be communicated up above in
order for it to be displayed correctly.

Shannon adds tracking of the NVM busy state so that the driver won't
allow a new NVM update command until a completion event is received
from the current update.  Updates the admin queue API to reflect
recent changes in the firmware.  Also rearranges the "if netdev" logic
to prepare for handling non-netdev VSIs.  Lastly rework the fdir
setup and tear down to use the newly created i40e_vsi_open() and
i40e_vsi_close(), which also fixes a memory leak of the FDIR queue
buffer info structs across a reset.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'netlink-bind'

Richard Guy Briggs says:

====================
audit: implement multicast socket for journald

This is a patch set Eric Paris and I have been working on to add a restricted
capability read-only netlink multicast socket to kernel audit to enable
userspace clients such as systemd/journald to receive audit logs, in addition
to the bidirectional auditd userspace client.

Currently, auditd has the CAP_AUDIT_CONTROL and CAP_AUDIT_WRITE capabilities
(but uses CAP_NET_ADMIN).  The CAP_AUDIT_READ capability will be added for use
by read-only AUDIT_NLGRP_READLOG multicast group clients to the kaudit
subsystem.  This will remove the dependence on CAP_NET_ADMIN for the multicast
read-only socket.

Patches 1-3 provide a way for per-protocol bind functions to
signal an error and to be able to clean up after themselves.

The first netfilter cleanup patch has already been accepted by a netfilter
maintainer, though I don't see it upstream yet, so it is included for
completeness.

The second patch adds the per-protocol bind function return code to signal to
the netlink code that no further processing should be done and to undo the work
already done.
V1: This rev fixes a bug introduced by flattening the code in the last posting.
*V2: This rev moves the per-protocol bind call above the socket exposure call
and refactors out the unbind procedure.

The third provides a way per protocol to undo bind actions on DROP.

Patches 4-6 implement the audit multicast socket with capability checking.

The fourth patch adds the bind function capability check to multicast join
requests for audit.

The fifth patch adds the audit log read multicast group.  An assumption has
been made that systemd/journald reside in the initial network namespace.  This
could be changed to check the actual network namespace of systemd/journald
should this assumption no longer be true since audit now supports all network
namespaces.  This version of the patch now directly sends the broadcast when
the packet is ready rather than waiting until it passes the queue.

The sixth checks if any clients actually exist before sending.

Since the net tree is busier than the audit tree, conflicts are more likely and
the audit patches depend on the net patches, it is proposed to have the net
tree carry this entire patchset for 3.16.  Are the net maintainers ok with this?

https://bugzilla.redhat.com/show_bug.cgi?id=887992

First posted:   https://www.redhat.com/archives/linux-audit/2013-January/msg00008.html
                https://lkml.org/lkml/2013/1/27/279

Please find source for a test program at:
http://people.redhat.com/rbriggs/audit-multicast-listen/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

audit: send multicast messages only if there are listeners

Test first to see if there are any userspace multicast listeners bound to the
socket before starting the multicast send work.

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

audit: add netlink multicast group for log read

Add a netlink multicast socket with one group to kaudit for "best-effort"
delivery to read-only userspace clients such as systemd, in addition to the
existing bidirectional unicast auditd userspace client.

Currently, auditd is intended to use the CAP_AUDIT_CONTROL and CAP_AUDIT_WRITE
capabilities, but actually uses CAP_NET_ADMIN. The CAP_AUDIT_READ capability
is added for use by read-only AUDIT_NLGRP_READLOG netlink multicast group
clients to the kaudit subsystem.

This will safely give access to services such as systemd to consume audit logs
while ensuring write access remains restricted for integrity.

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

audit: add netlink audit protocol bind to check capabilities on multicast join

Register a netlink per-protocol bind fuction for audit to check userspace
process capabilities before allowing a multicast group connection.

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netlink: implement unbind to netlink_setsockopt NETLINK_DROP_MEMBERSHIP

Call the per-protocol unbind function rather than bind function on
NETLINK_DROP_MEMBERSHIP in netlink_setsockopt().

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netlink: have netlink per-protocol bind function return an error code.

Have the netlink per-protocol optional bind function return an int error code
rather than void to signal a failure.

This will enable netlink protocols to perform extra checks including
capabilities and permissions verifications when updating memberships in
multicast groups.

In netlink_bind() and netlink_setsockopt() the call to the per-protocol bind
function was moved above the multicast group update to prevent any access to
the multicast socket groups before checking with the per-protocol bind
function. This will enable the per-protocol bind function to be used to check
permissions which could be denied before making them available, and to avoid
the messy job of undoing the addition should the per-protocol bind function
fail.

The netfilter subsystem seems to be the only one currently using the
per-protocol bind function.

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netlink: simplify nfnetlink_bind

Remove duplicity and simplify code flow by moving the rcu_read_unlock() above
the condition and let the flow control exit naturally at the end of the
function.

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

filter: added BPF random opcode

Added a new ancillary load (bpf call in eBPF parlance) that produces
a 32-bit random number. We are implementing it as an ancillary load
(instead of an ISA opcode) because (a) it is simpler, (b) allows easy
JITing, and (c) seems more in line with generic ISAs that do not have
"get a random number" as a instruction, but as an OS call.

The main use for this ancillary load is to perform random packet sampling.

Signed-off-by: Chema Gonzalez <chema@google.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: unnecessary to check if vlan_pcpu_stats is NULL

if allocating memory for vlan_pcpu_stats failed, the device can not be operated

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Support for configurable RSS hash key

This be2net patch implements the get/set_rxfh() ethtool hooks.
RSS_CONFIG device command is invoked to set hashkey and indirection table.
It also uses an initial random value for RSS hash key instead of a
hard-coded value as hard-coded values for a hash-key are usually
considered a security risk.

Signed-off-by: Venkat Duvvuru <VenkatKumar.Duvvuru@Emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethtool: Support for configurable RSS hash key

This ethtool patch primarily copies the ioctl command data structures
from/to the User space and invokes the driver hook.

Signed-off-by: Venkat Duvvuru <VenkatKumar.Duvvuru@Emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

neterion/s2io: remove unused s2io_start_tx_queue routine

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: avoid retransmits of TCP packets hanging in host queues

In commit 0e280af026a5 ("tcp: introduce TCPSpuriousRtxHostQueues SNMP
counter") we added a logic to detect when a packet was retransmitted
while the prior clone was still in a qdisc or driver queue.

We are now confident we can do better, and catch the problem before
we fragment a TSO packet before retransmit, or in TLP path.

This patch fully exploits the logic by simply canceling the spurious
retransmit.
Original packet is in a queue and will eventually leave the host.

This helps to avoid network collapses when some events make the RTO
estimations very wrong, particularly when dealing with huge number of
sockets with synchronized blast.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: support IFA_F_MANAGETEMPADDR for address deletion too

Userspace applications can use IFA_F_MANAGETEMPADDR with RTM_NEWADDR
already to indicate that the kernel should take care of temporary
address management.

This patch adds related functionality to RTM_DELADDR. By setting
IFA_F_MANAGETEMPADDR a userspace application can indicate that the kernel
should delete all related temporary addresses as well.

A corresponding patch for the "ip addr del" command has been applied to
iproute2 already.

Signed-off-by: Heiner Kallweit <heiner.kallweit@web.de>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

hisax/icc: add missing semicolon after label

A label just before a brace needs a following semicolon (empty statement).

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

smc91x: fix compile error when SMC_DEBUG >= 2

When SMC_DEBUG >= 2, we hit the following compilation error:

drivers/net/ethernet/smsc/smc91x.c:85:0:
drivers/net/ethernet/smsc/smc91x.c: In function ‘smc_findirq’:
drivers/net/ethernet/smsc/smc91x.c:1784:9: error: ‘dev’ undeclared (first use in this function)
DBG(2, dev, "%s: %s\n", CARDNAME, __func__);
^
Fix it by passing in the appropriate netdev pointer.

Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'tipc-next'

Ying Xue says:

====================
purge tipc_net_lock

Now tipc routing hierarchy comprises the structures 'node', 'link'and
'bearer'. The whole hierarchy is protected by a big read/write lock,
tipc_net_lock, to ensure that nothing is added or removed while code
is accessing any of these structures. Obviously the locking policy
makes node, link and bearer components closely bound together so that
their relationship becomes unnecessarily complex. In the worst case,
such locking policy not only has a negative influence on performance,
but also it's prone to lead to deadlock occasionally.

In order o decouple the complex relationship between bearer and node
as well as link, the locking policy is adjusted as follows:

- Bearer level
  RTNL lock is used on update side, and RCU is used on read side.
  Meanwhile, all bearer instances including broadcast bearer are
  saved into bearer_list array.

- Node and link level
  All node instances are saved into two tipc_node_list and node_htable
  lists. The two lists are protected by node_list_lock on write side,
  and they are guarded with RCU lock on read side. All members in node
  structure including link instances are protected by node spin lock.

- The relationship between bearer and node
  When link accesses bearer, it first needs to find the bearer with
  its bearer identity from the bearer_list array. When bearer accesses
  node, it can iterate the node_htable hash list with the node address
  to find the corresponding node.

In the new locking policy, every component has its private locking
solution and the relationship between bearer and node is very simple,
that is, they can find each other with node address or bearer identity
from node_htable hash list or bearer_list array.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: fix race in disc create/delete

Commit a21a584d6720ce349b05795b9bcfab3de8e58419 (tipc: fix neighbor
detection problem after hw address change) introduces a race condition
involving tipc_disc_delete() and tipc_disc_add/remove_dest that can
cause TIPC to dereference the pointer to the bearer discovery request
structure after it has been freed since a stray pointer is left in the
bearer structure.

In order to fix the issue, the process of resetting the discovery
request handler is optimized: the discovery request handler and request
buffer are just reset instead of being freed, allocated and initialized.
As the request point is always valid and the request's lock is taken
while the request handler is reset, the race doesn't happen any more.

Reported-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: use bc_lock to protect node map in bearer structure

The node map variable - 'nodes' in bearer structure is only used by
bclink. When bclink accesses it, bc_lock is held. But when change it,
for instance, in tipc_bearer_add_dest() or tipc_bearer_remove_dest()
the bc_lock is not taken at all. To avoid any inconsistent data, we
should always grab bc_lock while accessing node map variable.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: use bearer_disable to disable bearer in tipc_l2_device_event

As bearer pointer is known in tipc_l2_device_event(), it's unnecessary
to search it again in tipc_disable_bearer(). If tipc_disable_bearer()
is replaced with bearer_disable() in tipc_l2_device_event(), this will
help us save a bit time when bearer is disabled.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: make media_ptr pointed netdevice valid

The 'media_ptr' pointer in bearer structure which points to network
device, is protected by RCU. So, before netdevice is released,
synchronize_net() should be involved to prevent no any user of
the netdevice on read side from accessing it after it is freed.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: purge tipc_net_lock lock

Now tipc routing hierarchy comprises the structures 'node', 'link'and
'bearer'. The whole hierarchy is protected by a big read/write lock,
tipc_net_lock, to ensure that nothing is added or removed while code
is accessing any of these structures. Obviously the locking policy
makes node, link and bearer components closely bound together so that
their relationship becomes unnecessarily complex. In the worst case,
such locking policy not only has a negative influence on performance,
but also it's prone to lead to deadlock occasionally.

In order o decouple the complex relationship between bearer and node
as well as link, the locking policy is adjusted as follows:

- Bearer level
  RTNL lock is used on update side, and RCU is used on read side.
  Meanwhile, all bearer instances including broadcast bearer are
  saved into bearer_list array.

- Node and link level
  All node instances are saved into two tipc_node_list and node_htable
  lists. The two lists are protected by node_list_lock on write side,
  and they are guarded with RCU lock on read side. All members in node
  structure including link instances are protected by node spin lock.

- The relationship between bearer and node
  When link accesses bearer, it first needs to find the bearer with
  its bearer identity from the bearer_list array. When bearer accesses
  node, it can iterate the node_htable hash list with the node
  address to find the corresponding node.

In the new locking policy, every component has its private locking
solution and the relationship between bearer and node is very simple,
that is, they can find each other with node address or bearer identity
from node_htable hash list or bearer_list array.

Until now above all changes have been done, so tipc_net_lock can be
removed safely.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: use RCU to protect media_ptr pointer

Now the media_ptr pointer is protected with tipc_net_lock write lock
on write side; tipc_net_lock read lock is used to read side. As the
part of effort of eliminating tipc_net_lock, we decide to adjust the
locking policy of media_ptr pointer protection: on write side, RTNL
lock is use while on read side RCU read lock is applied.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: decouple the relationship between bearer and link

Currently on both paths of message transmission and reception, the
read lock of tipc_net_lock must be held before bearer is accessed,
while the write lock of tipc_net_lock has to be taken before bearer
is configured. Although it can ensure that bearer is always valid on
the two data paths, link and bearer is closely bound together.

So as the part of effort of removing tipc_net_lock, the locking
policy of bearer protection will be adjusted as below: on the two
data paths, RCU is used, and on the configuration path of bearer,
RTNL lock is applied.

Now RCU just covers the path of message reception. To make it possible
to protect the path of message transmission with RCU, link should not
use its stored bearer pointer to access bearer, but it should use the
bearer identity of its attached bearer as index to get bearer instance
from bearer_list array, which can help us decouple the relationship
between bearer and link. As a result, bearer on the path of message
transmission can be safely protected by RCU when we access bearer_list
array within RCU lock protection.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: convert bearer_list to RCU list

Convert bearer_list to RCU list. It's protected by RTNL lock on
update side, and RCU read lock is applied to read side.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: use RTNL lock to protect tipc_net_stop routine

As the tipc network initialization(ie, tipc_net_start routine) is
under RTNL protection, its corresponding deinitialization part(ie,
tipc_net_stop routine) should be protected by RTNL too.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: adjust locking policy of protecting tipc_ptr pointer of net_device

Currently the 'tipc_ptr' pointer is protected by tipc_net_lock
write lock on write side, and RCU read lock is applied to read
side. In addition, there have two paths on write side where we
may change variables pointed by the 'tipc_ptr' pointer: one is
to configure bearer by tipc-config tool and another one is that
bearer status is changed by notification events of its attached
interface. But on the latter path, we improperly deem that
accessing 'tipc_ptr' pointer happens on read side with
rcu_read_lock() although some variables pointed by the 'tipc_ptr'
pointer are changed possibly.

Moreover, as now the both paths are guarded by RTNL lock, it's
better to adjust the locking policy of 'tipc_ptr' pointer
protection, allowing RTNL instead of tipc_net_lock write lock to
protect it on write side, which will help us purge tipc_net_lock
in the future.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: replace config_mutex lock with RTNL lock

There have two paths where we can configure or change bearer status:
one is that bearer is configured from user space with tipc-config
tool; another one is that bearer is changed by notification events
from its attached interface. On the first path, one dedicated
config_mutex lock is guarded; on the latter path, RTNL lock has been
placed to serialize the process of dealing with interface events.
So, if RTNL lock is also used to protect the first path, this will
not only extremely help us simplify current locking policy, but also
config_mutex lock can be deleted as well.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sxgbe: Added phy_found error path

This patch adds phy_found error path when there is no phy device
and changes bus_name.

Signed-off-by: Byungho An <bh74.an@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sxgbe: rearrange dma descriptor

This patch moves cksum_ctl to tx_rd_des23 from cksum_pktlen for correct checksum
offloading and modifies size for Tx/Rx descriptor.

Signed-off-by: Byungho An <bh74.an@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

virtio_net: zero is an invald queue_pairs number

Execute "ethtool -L eth0 combined 0" in guest, if multiqueue
is enabled, virtnet_send_command() will return -EINVAL error,
there is a validation in QEMU.

But if multiqueue is disabled, virtnet_set_queues() will just
return zero (success). We should return error for this situation.

Signed-off-by: Amos Kong <akong@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

arc_emac: write initial MAC address from devicetree to hw

The MAC address retrieved from dt was not actually written to the
hardware. This meant proper communication was only possible after
changing the MAC address.

Fix that by always writing the mac address during probing.

Signed-off-by: Max Schwarz <max.schwarz@online.de>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Fix ns_capable check in sock_diag_put_filterinfo

The caller needs capabilities on the namespace being queried, not on
their own namespace. This is a security bug, although it likely has
only a minor impact.

Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

i40e/i40evf: Bump build versions

Bump i40e to version 0.3.43 and i40evf to version 0.9.21.

Change-ID: Ice4c715731bfa1dfc12dd45418675a3ba6e08d57
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e: Tweak for-loop in i40e_ethtool.c

Tweak a for-loop to make it easier to add conditional stats in the future.

Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e: Cleanup if/else statements

Simplify some if/else statements in i40e_main.c

Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e: rework fdir setup and teardown

Use the newer i40e_vsi_open() and i40e_vsi_close() in the FDIR VSI
lifetime. This makes sure we're using standard methods for all the
VSI open and close paths. This also fixes a memory leak of the
FDIR queue buffer info structs across a reset.

Change-ID: I1b60a1b08ab923afe4f49810c2c7844d850e19b9
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e: use generic vsi_open to unquiesce vsi

Use the new i40e_vsi_open() for waking VSIs back up in order to
be sure all the standard actions happen.

Change-ID: Ic3479410dd3079733f4951dcea69f101e69e77df
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e: abstract the close path for better netdev vsis

Abstract out the vsi close actions into a single function so they
can be used correctly for both netdev and non-netdev based VSIs.

Change-ID: I59e3d115fcb20e614a09477281b7787dd340d276
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e: prep vsi_open logic for non-netdev cases

Rearrange the "if netdev" logic slightly to get ready for handling
non-netdev VSIs.

Change-ID: Ia0bfe13d4c994a2351a3c31fe725b75caeb397ee
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e/i40evf: update AdminQ API

Reflect recent changes in firmware:
- remove storm control
- simplify PHY link management values
- add partition bandwidth configuration

Change-ID: If266ed2f9a89ad176cf8a74aeaef68613af76bc8
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e/i40evf: add tracking to NVM busy state

The NVM updates take some time and are asynchronous actions that signal
their completion with an AdminQ event. This code tracks when there is
an NVM update outstanding and won't allow a new update command until a
completion event is received from the current update.

Change-ID: Ic132fe16bd9dc09b002ed38297a877c1a01553ce
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e: Fix an issue with displaying IPv4 FD filters

The flow spec coming in for IPv4 filters is IP_USER_FLOW, which
needed some more info to be communicated up above in order for it
to be displayed correctly.

Change-ID: Ia968238e0d7c4c4df12908ba81f0c4501280f3ec
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e: Remove a FW workaround

Remove the FW workaround to increment the number of msix vectors.

Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e/i40evf: Bump build versions

Bump i40e to 0.3.41 and i40evf to 0.9.20.

Change-ID: If49251a1a81a0f25e8f74bc8b7d086befb6df676
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e: Enable VF Tx bandwidth setting

Implement the net device op for Tx bandwidth setting. Setting the Tx
bandwidth is done by 'ip link set <PF device> vf <VF num> rate <Tx
rate>', with the rate specified in Mbit/sec. The rate setting is
displayed with 'ip link show'.

Change-ID: I4d45dda8320632fdb6ec92c87d083e51070b46ab
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e: Reset the VF upon conflicting VLAN configuration

If a host VMM administrator hoses his VF by assigning a port VLAN after
it is already up and running with implicit permission to set local
VLANs then we print a message warning the host administrator that the
VF driver needs to be reloaded.

In addition we need to knock the VF offline so that it does not continue
to receive traffic not on the port VLAN assigned to it. So we reset the
VF. The VF will cease operation and the administrator will be forced to
unload and reload the VF driver to make it work again.

Change-ID: Iae1ae006b244e74e30a4ee546b3c5fca5cfb40aa
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e/i40evf: unhide and enable to one prefena field

The PREFENA field in the receive host memory cache (RX-HMC)
must be visible in order to be set to 1 at driver init for
best performance.

Change-ID: I16b0bcd84cf56f4b6c938201ff5e954bee5a1992
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e: Add bridge FDB add/del/dump ops

Add the netdev ops to support addition of static FDB entries in the
physical function (PF) MAC/VLAN filter table so that virtual functions
(VFs) can communicate with bridged virtual Ethernet ports such as those
provided by the virtio driver.

Change-ID: Ifbd6817a75074e3b5cdf945a5635f26440bf15df
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

MAINTAINERS: SXGBE authors update

The mail address for Siva Reddy Kallam is bouncing, remove the email
address from the MAINTAINERS entry for Samsung's SXGBE driver.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ndo_set_config'

Florian Fainelli says:

====================
net: ndo_set_config/ifmap cleanups

This patch series removes a bunch of useless ndo_set_config in non
PCMCIA Ethernet drivers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: remove stmmac_config

stmmac_config() denies changing the base address and interrupt
parameters, and ignores any other settings from the ifmap parameters,
thus making stmmac_config() useless, remove it.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sxgbe: remove sxgbe_config

sxgbe_config() denies changing the base address and interrupt, and
ignores all other 'struct ifmap' members, which means that it is useless
as is, so let's remove it.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: cpmac: remove cpmac_config

cpmac_config() refuses changing the base address parameter, and ignores
all other parameters, which means that it is pretty useless as it is, so
let's remove it.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethoc: remove ethoc_config

ethoc_config() returns -ENOSYS and does not implement anything useful,
let's remove it such that net/core/dev_ioctl.c::dev_ifsioc can return
something meaningful like -EOPNOTSUPP.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

hyperv: Add support for virtual Receive Side Scaling (vRSS)

This feature allows multiple channels to be used by each virtual NIC.
It is available on Hyper-V host 2012 R2.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to e1000e, igb, ixgbe and i40e.

Most notably are Jakub's patches to clean up the Rx time stamping
code for ixgbe and the fix up of debug messages with proper termination.

Jesse's i40e patch fixes an issue reported by Eric Dumazet that the
i40e driver was allowing the hardware to replicate the PSH flag on
all segments of a TSO operation. With this fix, we are now configuring
the CWR bit to only be set in the first packet of a TSO and we
enable TSO_ECN in order to advertise to the stack that we do the right
thing on the wire.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: make tcp_cwnd_application_limited() static

Make tcp_cwnd_application_limited() static and move it from tcp_input.c to
tcp_output.c

Signed-off-by: Weiping Pan <wpan@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6lowpan: include net/net_namespace.h on 6lowpan namepsace header

Don't rely on driver files or other headers having this file included.

CC: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: linux-zigbee-devel@lists.sourceforge.net
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6lowpan: make lowpan_cb static

CC: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: linux-zigbee-devel@lists.sourceforge.net
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>