firefly-linux-kernel-4.4.55.git
Tobias Klauser [Tue, 20 May 2014 08:22:55 +0000 (08:22 +0000)]
e1000: Use is_broadcast_ether_addr/is_multicast_ether_addr helpers

Use the is_broadcast_ether_addr/is_multicast_ether_addr helper functions
from linux/etherdevice.h instead of open coding them.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
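For readers unfamiliar with the two checks, here is a minimal userspace sketch of what they test. The real helpers live in linux/etherdevice.h; the names addr_is_multicast and addr_is_broadcast below are hypothetical stand-ins, not the kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN 6 /* length of an Ethernet MAC address in bytes */

/* Group (multicast) addresses have the least significant bit of the
 * first octet set. Broadcast is a special case of multicast. */
static bool addr_is_multicast(const uint8_t *addr)
{
	assert(addr != NULL);
	return (addr[0] & 0x01) != 0;
}

/* The broadcast address is ff:ff:ff:ff:ff:ff. */
static bool addr_is_broadcast(const uint8_t *addr)
{
	assert(addr != NULL);
	for (int i = 0; i < ETH_ALEN; i++)
		if (addr[i] != 0xff)
			return false;
	return true;
}
```

Using a shared helper instead of open coding keeps the bit test in one place, which is the point of the patch.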
Todd Fujinaka [Thu, 8 May 2014 23:20:24 +0000 (23:20 +0000)]
igb: remove redundant PHY power down register write

One of the registers used to power down the PHY was found to be wrong
(should be bit 2, not bit 1); on further inspection it was also found to
be redundant.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Andi Kleen [Tue, 20 May 2014 08:22:45 +0000 (08:22 +0000)]
e1000e: Out of line __ew32_prepare/__ew32

Out of lining these two common inlines saves about 30k text size,
due to their errata workarounds.

14131431 2008136 1507328 17646895 10d452f vmlinux-before-e1000e
14101415 2004040 1507328 17612783 10cbfef vmlinux-e1000e

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
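The "about 30k" figure can be checked from the two rows above. The sketch below assumes the columns follow the usual `size` layout (text, data, bss, decimal total), which the commit message does not state explicitly; the struct and function names are illustrative only.

```c
#include <assert.h>

/* One row of size output: text, data, bss and their decimal total. */
struct size_row {
	long text, data, bss, dec;
};

/* Figures copied from the commit message. */
static const struct size_row before = { 14131431, 2008136, 1507328, 17646895 };
static const struct size_row after  = { 14101415, 2004040, 1507328, 17612783 };

/* Bytes saved in one column. */
static long saved(long old_bytes, long new_bytes)
{
	assert(old_bytes >= new_bytes);
	return old_bytes - new_bytes;
}
```

Under that assumption, saved(before.text, after.text) is 30016 bytes of text, and the data column shrinks by a further 4096 bytes.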
David Ertman [Tue, 13 May 2014 00:06:26 +0000 (00:06 +0000)]
e1000e: Fix expand setting EEE link info to all affected parts

Previously, the update_phy_task was only calling e1000_set_eee_pchlan()
for phy.type 82579.  This patch is to cause this function to be called
for 82579 and newer phy.types.  This causes the dev_spec->eee_lp_ability
to have the correct value when going into SX states.

Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David Ertman [Tue, 13 May 2014 00:02:12 +0000 (00:02 +0000)]
e1000e: Cleanup parenthesis around return value

Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Todd Fujinaka [Sat, 3 May 2014 06:41:37 +0000 (06:41 +0000)]
e1000e: 82574/82583 TimeSync errata for SYSTIM read

Due to a synchronization error, the value read from SYSTIML/SYSTIMH
might be incorrect.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David Ertman [Tue, 6 May 2014 03:50:17 +0000 (03:50 +0000)]
e1000e: Failure to write SHRA turns on PROMISC mode

Previously, the check to turn on promiscuous mode only took into account
the total number of SHared Receive Address (SHRA) registers and if the
request was for a register within that range.  It is possible that the
Management Engine might have locked a number of SHRA and not allowed a
new address to be written to the requested register.

Add a function to determine the number of unlocked SHRA registers.  Then
determine if the number of registers available is sufficient for our needs,
if not then return -ENOMEM so that UNICAST PROMISC mode is activated.

Since the method by which ME claims SHRA registers is non-deterministic,
also add a return value to the function attempting to write an address
to a SHRA, and return a -E1000_ERR_CONFIG if the write fails.  The error
will be passed up the function chain and allow the driver to also set
UNICAST PROMISC when this happens.

Cc: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
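The capacity check described in this commit can be sketched roughly as follows. This is a simplified userspace model, not the driver code: struct shra_slot and both function names are hypothetical, and the real driver reads lock state from hardware registers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one shared receive address register. */
struct shra_slot {
	bool locked_by_me; /* true if the Management Engine holds it */
};

/* Count the registers the driver may still write. */
static size_t shra_count_unlocked(const struct shra_slot *slots, size_t n)
{
	size_t free_slots = 0;

	assert(slots != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (!slots[i].locked_by_me)
			free_slots++;
	return free_slots;
}

/* Return 0 if `needed` addresses fit, or -ENOMEM so that the caller
 * can fall back to unicast promiscuous mode. */
static int shra_check_capacity(const struct shra_slot *slots, size_t n,
			       size_t needed)
{
	return needed <= shra_count_unlocked(slots, n) ? 0 : -ENOMEM;
}
```

Because the ME may claim a register between the count and the write, the commit additionally makes the write itself report failure; the sketch covers only the up-front count.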
Jacob Keller [Fri, 16 May 2014 05:12:29 +0000 (05:12 +0000)]
ixgbe: avoid duplicate code in suspend and stop paths

Resume path calls .open but suspend path cannot call .stop because
fdirs should not be freed and control over hardware should not be
released until WoL is configured.  To avoid duplicating every change
made in .stop on the suspend path, split out the part of .stop that
is relevant during suspend and call it from both .stop and suspend.

This fix also ensures that ixgbe_ptp_suspend is called during the
suspend path, and helps avoid similar errors. We can't call
ixgbe_ptp_stop, since it will free the PTP clock device, which we
shouldn't be doing during a suspend path.

Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Fri, 16 May 2014 05:12:28 +0000 (05:12 +0000)]
ixgbe: separate the PTP suspend and stop actions

Since we are adding proper support for suspend of PTP, extract out of
ixgbe_ptp_stop those things relevant to suspend. Then, have
ixgbe_ptp_stop call ixgbe_ptp_suspend. The next patch in the series will
have ixgbe_ptp_suspend called from the ixgbe_suspend path.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Fri, 16 May 2014 05:12:27 +0000 (05:12 +0000)]
ixgbe: extract PTP clock device from ptp_init

In order to properly handle a suspend/resume cycle, we cannot destroy
the PTP clock device. As part of this, we should only re-create the
device on first initialization. After a resume, when ixgbe_ptp_init is
called, we won't create a new clock, and we will use the old clock
device. To that end, this patch extracts the clock creation out of
ptp_init, and only calls it if we don't already have a ptp_clock
pointer.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
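The "create the clock only once" idea can be illustrated with a small userspace sketch. Everything here (struct fake_clock, struct fake_adapter, the fake_ptp_* functions) is hypothetical and only mirrors the shape of the change: init may run many times, but the clock device is created only while the pointer is still NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct fake_clock { int id; };        /* stand-in for a PTP clock device */

struct fake_adapter {
	struct fake_clock *ptp_clock; /* NULL until first initialization */
	int clocks_created;           /* how many times we really created one */
};

/* Create the clock device only if we do not already have one. */
static int fake_ptp_create_clock(struct fake_adapter *a)
{
	if (a->ptp_clock)
		return 0;             /* e.g. after resume: reuse old clock */

	a->ptp_clock = malloc(sizeof(*a->ptp_clock));
	if (!a->ptp_clock)
		return -1;
	a->ptp_clock->id = ++a->clocks_created;
	return 0;
}

/* Called on first probe and again after every resume. */
static int fake_ptp_init(struct fake_adapter *a)
{
	assert(a != NULL);
	/* per-init hardware setup would go here */
	return fake_ptp_create_clock(a);
}
```

Calling fake_ptp_init twice leaves the same clock pointer in place, which is the property a suspend/resume cycle needs.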
Jacob Keller [Fri, 16 May 2014 05:12:26 +0000 (05:12 +0000)]
ixgbe: allow ixgbe_ptp_reset to maintain current hwtstamp config

Rather than clearing the hwtstamp configuration, we should use the known
configuration requested by the user and call the function which has now
been separated from the ioctl. This means that after a reset, the
timestamp mode will be maintained rather than lost. We still can't
maintain the clock value, however.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Fri, 16 May 2014 05:12:25 +0000 (05:12 +0000)]
ixgbe: extract the hardware setup from the ixgbe_ptp_set_ts_config

Currently all of the hardware setup logic for the PTP hardware bits is
buried inside of the ioctl which sets the timestamp configuration. This
makes it hard to use this logic in other places (primarily reset), and
this means we can't restore current timestamp mode upon a MAC reset.
Extracting this logic into a separate function will enable future work
for the ixgbe_ptp_reset function.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Fri, 16 May 2014 05:12:24 +0000 (05:12 +0000)]
ixgbe: rename ixgbe_ptp_enable to ixgbe_ptp_feature_enable

Since the name ixgbe_ptp_enable could be misconstrued as a function
which enables the whole PTP core, rename this function so that it is
clear the function is for enabling of the extra features such as PPS
signal.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Tue, 13 May 2014 08:24:00 +0000 (08:24 +0000)]
ixgbe: fix linking at 100Mbps on copper devices with MNG FW enabled

Driver was calling setup_link to make sure that fiber interfaces with MNG FW
enabled will get link on probe because the laser was most likely turned off.
This prevented non-fiber devices with MNG FW from linking at 100Mbps.

This patch adds a check to only call setup_link for fiber devices.

Reported-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Mon, 26 May 2014 05:02:39 +0000 (01:02 -0400)]
Merge branch 'net-sysfs-docs'

Florian Fainelli says:

====================
net: sysfs: documentation updates

This patch set contains some updates to the sysfs Documentation for the
sysfs net class.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 23 May 2014 23:35:42 +0000 (16:35 -0700)]
net: sysfs: document /sys/class/net/statistics/*

Document the network device statistics counters that are exposed as sysfs
attributes.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 23 May 2014 23:35:41 +0000 (16:35 -0700)]
net: sysfs: add documentation entries for /sys/class/<iface>/queues

Add sysfs documentation for the various attributes of a network
interface exposed in /sys/class/<iface>/queues/{rx,tx}-<queue>/

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 23 May 2014 23:35:40 +0000 (16:35 -0700)]
net: sysfs: add missing phys_port_id documentation

Add documentation for the phys_port_id sysfs attribute of a network
device.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira [Sun, 25 May 2014 12:48:33 +0000 (14:48 +0200)]
netfilter: bridge: fix Kconfig unmet dependencies

Before f5efc69 ("netfilter: nf_tables: Add meta expression key for
bridge interface name"), the entire net/bridge/netfilter/ directory
depended on BRIDGE_NF_EBTABLES, i.e. on ebtables. However, that
directory already contained the nf_tables bridge extension that
we should allow to compile separately. In f5efc69, we tried to
generalize this by using CONFIG_BRIDGE_NETFILTER which was not a good
idea since this option already existed and it is dedicated to enable
the Netfilter bridge IP/ARP filtering.

Let's try to fix this mess by:

1) making net/bridge/netfilter/ dependent on the toplevel
   CONFIG_NETFILTER option, just like we do with the net/netfilter and
   net/ipv{4,6}/netfilter/ directories.

2) Changing 'selects' to 'depends on' NETFILTER_XTABLES for
   BRIDGE_NF_EBTABLES. I believe this problem was already present
   before f5efc69:

warning: (BRIDGE_NF_EBTABLES) selects NETFILTER_XTABLES which has
unmet direct dependencies (NET && INET && NETFILTER)

3) Fix ebtables/nf_tables bridge dependencies by making NF_TABLES_BRIDGE
   and BRIDGE_NF_EBTABLES dependent on BRIDGE and NETFILTER:

warning: (NF_TABLES_BRIDGE && BRIDGE_NF_EBTABLES) selects
BRIDGE_NETFILTER which has unmet direct dependencies (NET && BRIDGE &&
NETFILTER && INET && NETFILTER_ADVANCED)

net/built-in.o: In function `br_parse_ip_options':
br_netfilter.c:(.text+0x4a5ba): undefined reference to `ip_options_compile'
br_netfilter.c:(.text+0x4a5ed): undefined reference to `ip_options_rcv_srr'
net/built-in.o: In function `br_nf_pre_routing_finish':
br_netfilter.c:(.text+0x4a8a4): undefined reference to `ip_route_input_noref'
br_netfilter.c:(.text+0x4a987): undefined reference to `ip_route_output_flow'
make: *** [vmlinux] Error 1

Reported-by: Jim Davis <jim.epost@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Engelmayer [Fri, 23 May 2014 21:33:55 +0000 (23:33 +0200)]
of: mdio: fix compile warning in of_mdiobus_register_phy()

Commit de906af1 (net: phy: make of_set_phy_supported work with genphy driver)
removed the last user of variable 'max_speed' in function
of_mdiobus_register_phy(), leading to compile warning "unused variable
‘max_speed’ [-Wunused-variable]". Thus remove it.

Signed-off-by: Christian Engelmayer <cengelma@gmx.at>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prashant Sreedharan [Sat, 24 May 2014 08:32:09 +0000 (01:32 -0700)]
tg3: Override clock, link aware and link idle mode during NVRAM dump

When cable is not present the clock speed of some of the devices is
reduced based upon power saving mode setting in NVRAM. Due to this,
NVRAM reads take a long time to complete and, as a result, a CPU soft
lockup message is seen. The fix is to override the clock and disable
link aware and link idle modes before NVRAM reads, then restore them
after the reads are complete. During this period also check if the
thread needs to be rescheduled and if there are any signals to handle.

Also decrease the NVRAM command execution timeout value to 1ms.

Signed-off-by: Prashant Sreedharan <prashant@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Sun, 25 May 2014 01:53:44 +0000 (09:53 +0800)]
net: driver: stmicro: Remove some useless the lock protection

The kernel always invokes a pair of rtnl_lock and rtnl_unlock to
protect dev_ethtool(), so it's not necessary to invoke spin_lock/unlock
in ethtool_ops.

Signed-off-by: Yang Wei <Wei.Yang@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Sat, 24 May 2014 19:47:46 +0000 (21:47 +0200)]
team: lb: use sizeof(*fprog) in __fprog_create

sock_fprog and sock_fprog_kern are of equal size, however
it's cleaner to just use sizeof(*fprog) instead to always
have correct type.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
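The sizeof(*ptr) idiom this commit switches to can be shown in isolation. The two structs below are hypothetical stand-ins with the same layout as each other (as sock_fprog and sock_fprog_kern are said to be above); prog_kern_dup is an illustrative helper, not the team driver's __fprog_create.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Two look-alike types; only one is the type we actually allocate. */
struct prog_user { unsigned short len; void *filter; };
struct prog_kern { unsigned short len; void *filter; };

/* Duplicate a program descriptor. Sizing by the pointee, sizeof(*fprog),
 * stays correct even if the pointer's type is changed later, whereas
 * sizeof(struct prog_user) would silently keep the old type's size. */
static struct prog_kern *prog_kern_dup(const struct prog_kern *src)
{
	struct prog_kern *fprog;

	assert(src != NULL);
	fprog = malloc(sizeof(*fprog));
	if (!fprog)
		return NULL;
	memcpy(fprog, src, sizeof(*fprog));
	return fprog;
}
```

The caller owns the returned copy and releases it with free().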
Arnaldo Carvalho de Melo [Fri, 23 May 2014 19:55:12 +0000 (15:55 -0400)]
tipc: Don't reset the timeout when restarting

As it may then take longer than what the user specified using
setsockopt(SO_RCVTIMEO).

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
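Why resetting the timeout on restart matters can be seen in a toy model. This is a hedged sketch with made-up names (wait_slice, wait_with_restarts) and abstract "ticks"; it does not reproduce the TIPC receive path, only the accounting problem.

```c
#include <assert.h>

/* One wait step: uses up to `slice` ticks of the remaining budget and
 * returns what is left. */
static long wait_slice(long remaining, long slice)
{
	assert(remaining >= 0 && slice >= 0);
	return remaining > slice ? remaining - slice : 0;
}

/* Wait with up to `restarts` restarts and return the total ticks spent.
 * With reset_on_restart set, every restart starts again from the full
 * timeout, which is the behaviour the commit removes. */
static long wait_with_restarts(long timeout, long slice, int restarts,
			       int reset_on_restart)
{
	long remaining = timeout;
	long spent = 0;

	for (int i = 0; i < restarts && remaining > 0; i++) {
		long left;

		if (reset_on_restart)
			remaining = timeout;
		left = wait_slice(remaining, slice);
		spent += remaining - left;
		remaining = left;
	}
	return spent;
}
```

With a timeout of 10 ticks, slices of 4 and 5 restarts, the carried-over budget stops at 10 ticks, while the resetting variant spends 20.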
David S. Miller [Sat, 24 May 2014 18:03:33 +0000 (14:03 -0400)]
Merge branch 'ks8851'

Stephen Boyd says:

====================
ks8851 DT/regulator/gpio updates

This set of patches properly documents the micrel ks8851 spi ethernet
controller, converts to devm_regulator_get_optional() to make error
paths slightly simpler, and finally adds support for another
optional regulator and a reset gpio. This allows me to use the ks8851
on my MSM8960 CDP board.

Changes since v1:
 * Dropped vendor prefix patch as that should go through DT tree
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Boyd [Fri, 23 May 2014 19:57:20 +0000 (12:57 -0700)]
net: ks8851: Add of match table

Users are currently just providing "ks8851" as the compatible for
this driver in device tree. Add a compatible string that provides
the vendor name along with the device name to be more explicit.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Boyd [Fri, 23 May 2014 19:57:19 +0000 (12:57 -0700)]
net: ks8851: Add optional vdd_io regulator and reset gpio

Allow the ks8851 driver to enable an optional 1.8V vdd_io
regulator and assert the reset pin to the phy if a reset gpio is
present in device tree.

Cc: Nishanth Menon <nm@ti.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Boyd [Fri, 23 May 2014 19:57:18 +0000 (12:57 -0700)]
net: ks8851: Use devm_regulator_get_optional()

This simplifies error paths and removes the need to
regulator_put().

Cc: Nishanth Menon <nm@ti.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Boyd [Fri, 23 May 2014 19:57:17 +0000 (12:57 -0700)]
devicetree: bindings: Properly document micrel ks8851 SPI chips

The ks8851 SPI ethernet wasn't documented, but we documented the
optional regulator supply for it under the mll based ethernet
chip. Furthermore, that compatible string needed another 'l'. Fix
all of this and document the optional vdd-io and reset-gpios
properties.

Cc: Nishanth Menon <nm@ti.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
Cc: Kumar Gala <galak@codeaurora.org>
Cc: <devicetree@vger.kernel.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 24 May 2014 04:32:30 +0000 (00:32 -0400)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts:
drivers/net/bonding/bond_alb.c
drivers/net/ethernet/altera/altera_msgdma.c
drivers/net/ethernet/altera/altera_sgdma.c
net/ipv6/xfrm6_output.c

Several cases of overlapping changes.

The xfrm6_output.c has a bug fix which overlaps the renaming
of skb->local_df to skb->ignore_df.

In the Altera TSE driver cases, the register access cleanups
in net-next overlapped with bug fixes done in net.

Similarly a bug fix to send ALB packets in the bonding driver using
the right source address overlaps with cleanups in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 23 May 2014 22:41:52 +0000 (15:41 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc

Pull sparc fixes from David Miller:
 "A small bunch of bug fixes, in particular:

   1) On older cpus we need a different chunk of virtual address space
      to map the huge page TSB.

   2) Missing memory barrier in Niagara2 memcpy.

   3) trinity showed some places where fault validation was
      unnecessarily loud on sparc64

   4) Some sysfs printf's need a type adjustment, from Toralf Förster"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc64: fix format string mismatch in arch/sparc/kernel/sysfs.c
  sparc64: Add membar to Niagara2 memcpy code.
  sparc64: Fix huge TSB mapping on pre-UltraSPARC-III cpus.
  sparc64: Don't bark so loudly about 32-bit tasks generating 64-bit fault addresses.

Linus Torvalds [Fri, 23 May 2014 22:29:43 +0000 (15:29 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:
 "It looks like a sizable collection but this is nearly 3 weeks of bug
  fixing while you were away.

   1) Fix crashes over IPSEC tunnels with NAT, the latter can reroute
      the packet through a non-IPSEC protected path and the code has to
      be able to handle SKBs attached to routes lacking an attached xfrm
      state.  From Steffen Klassert.

   2) Fix OOPSs in ipv4 and ipv6 ipsec layers for unsupported
      sub-protocols, also from Steffen Klassert.

   3) Set local_df on fragmented netfilter skbs otherwise we won't be
      able to forward successfully, from Florian Westphal.

   4) cdc_mbim ipv6 neighbour code does __vlan_find_dev_deep without
      holding RCU lock, from Bjorn Mork.

   5) local_df test in ip_may_fragment is inverted, from Florian
      Westphal.

   6) jme driver doesn't check for DMA mapping failures, from Neil
      Horman.

   7) qlogic driver doesn't calculate number of TX queues properly, from
      Shahed Shaikh.

   8) fib_info_cnt can drift irreversibly positive if we fail to
      allocate the fi->fib_metrics array, from Sergey Popovich.

   9) Fix use after free in ip6_route_me_harder(), also from Sergey
      Popovich.

  10) When SYSCTL is disabled, we don't handle local_port_range and
      ping_group_range defaults properly at all, from Cong Wang.

  11) Unaccelerated VLAN tagged frames improperly handled by cdc_mbim
      driver, fix from Bjorn Mork.

  12) cassini driver needs nested lock annotations for TX locking, from
      Emil Goode.

  13) On init error ipv6 VTI driver can unregister pernet ops twice,
      oops.  Fix from Mathias Krause.

  14) If macvlan device is down, don't propagate IFF_ALLMULTI changes,
      from Peter Christensen.

  15) Missing NULL pointer check while parsing netlink config options in
      ip6_tnl_validate().  From Susant Sahani.

  16) Fix handling of neighbour entries during ipv6 router reachability
      probing, from Duan Jiong.

  17) x86 and s390 JIT address randomization has some address
      calculation bugs leading to crashes, from Alexei Starovoitov and
      Heiko Carstens.

  18) Clear up those uglies with nop patching and net_get_random_once(),
      from Hannes Frederic Sowa.

  19) Option length miscalculated in ip6_append_data(), fix also from
      Hannes Frederic Sowa.

  20) A while ago we fixed a race during device unregistry when a
      namespace went down, turns out there is a second place that needs
      similar protection.  From Cong Wang.

  21) In the new Altera TSE driver multicast filtering isn't working,
      disable it and just use promisc mode until the cause is found.
      From Vince Bridgers.

  22) When we disable router enabling in ipv6 we have to flush the
      cached routes explicitly, from Duan Jiong.

  23) NBMA tunnels should not cache routes on the tunnel object because
      the key is variable, from Timo Teräs.

  24) With stacked devices GRO information in skb->cb[] can be not setup
      properly, make sure it is in all code paths.  From Eric Dumazet.

  25) Really fix stacked vlan locking, multiple levels of nesting with
      intervening non-vlan devices are possible.  From Vlad Yasevich.

  26) Fallback ipip tunnel device's mtu is not setup properly, from
      Steffen Klassert.

  27) The packet scheduler's tcindex filter can crash because we
      structure copy objects with list_head's inside, oops.  From Cong
      Wang.

  28) Fix CHECKSUM_COMPLETE handling for ipv6 GRE tunnels, from Eric
      Dumazet.

  29) In some configurations 'itag' in __mkroute_input() can end up
      being used uninitialized because of how fib_validate_source()
      works.  Fix it by explicitly initializing itag to zero like all the
      other fib_validate_source() callers do, from Li RongQing"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
  batman: fix a bogus warning from batadv_is_on_batman_iface()
  ipv4: initialise the itag variable in __mkroute_input
  bonding: Send ALB learning packets using the right source
  bonding: Don't assume 802.1Q when sending alb learning packets.
  net: doc: Update references to skb->rxhash
  stmmac: Remove unbalanced clk_disable call
  ipv6: gro: fix CHECKSUM_COMPLETE support
  net_sched: fix an oops in tcindex filter
  can: peak_pci: prevent use after free at netdev removal
  ip_tunnel: Initialize the fallback device properly
  vlan: Fix build error wth vlan_get_encap_level()
  can: c_can: remove obsolete STRICT_FRAME_ORDERING Kconfig option
  MAINTAINERS: Pravin Shelar is Open vSwitch maintainer.
  bnx2x: Convert return 0 to return rc
  bonding: Fix alb mode to only use first level vlans.
  bonding: Fix stacked device detection in arp monitoring
  macvlan: Fix lockdep warnings with stacked macvlan devices
  vlan: Fix lockdep warning with stacked vlan devices.
  net: Allow for more then a single subclass for netif_addr_lock
  net: Find the nesting level of a given device by type.
  ...

David S. Miller [Fri, 23 May 2014 20:48:50 +0000 (16:48 -0400)]
Merge branch 'filter-next'

Daniel Borkmann says:

====================
BPF updates

These were still in my queue. Please see individual patches for
details.

I have rebased these on top of current net-next with Andrew's
gcc union fixup [1] applied to avoid dealing with an unnecessary
merge conflict.

 [1] http://patchwork.ozlabs.org/patch/351577/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Fri, 23 May 2014 16:44:01 +0000 (18:44 +0200)]
net: filter: add test case for jump with holes and ret x variants

This patch adds three more test cases:

 1) long jumps with holes of unreachable code
 2) ret x
 3) ldx + ret x

All three tests are for classical BPF and to make sure that
any changes will not break some exotic behaviour that has
probably existed for decades. The last two tests are expected to
fail by the BPF checker already, as in classic BPF only K
or A are allowed to be returned. Thus, there are now 52 test
cases for BPF.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
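The rule that classic BPF may return only K or A can be sketched as a small opcode check. The constants below are redefined locally and are intended to mirror the classic BPF opcode encoding; ret_insn_is_valid is a hypothetical helper for illustration, not the kernel's checker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the classic BPF opcode fields used here. */
#define BPF_CLASS_MASK 0x07 /* low three bits select the instruction class */
#define BPF_RET        0x06 /* return class */
#define BPF_RVAL_MASK  0x18 /* bits selecting the return value source */
#define BPF_K          0x00 /* constant operand */
#define BPF_X          0x08 /* index register X */
#define BPF_A          0x10 /* accumulator A */

/* A classic BPF return may yield the constant K or the accumulator A.
 * "ret x" is therefore rejected before the filter ever runs. */
static bool ret_insn_is_valid(uint16_t code)
{
	uint16_t src;

	assert((code & BPF_CLASS_MASK) == BPF_RET);
	src = code & BPF_RVAL_MASK;
	return src == BPF_K || src == BPF_A;
}
```

This is why the "ret x" and "ldx + ret x" tests are expected to fail at load time rather than at run time.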
Daniel Borkmann [Fri, 23 May 2014 16:44:00 +0000 (18:44 +0200)]
net: filter: improve test case framework

This patch simplifies and refactors the test case code a
bit and also adds a summary of all tests that passed or
failed in the kernel log, so that it's easier to spot if
something has failed.

Future work could further extend the test framework to also
support different input 'stimuli' i.e. related structures
to seccomp.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Fri, 23 May 2014 16:43:59 +0000 (18:43 +0200)]
net: filter: doc: add section for BPF test suite

Mention the recently added test suite in the documentation file.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Fri, 23 May 2014 16:43:58 +0000 (18:43 +0200)]
net: filter: let unattached filters use sock_fprog_kern

The sk_unattached_filter_create() API is used by BPF filters that
are not directly attached or related to sockets, and are used in
team, ptp, xt_bpf, cls_bpf, etc. As such all users do their own
internal management of obtaining filter blocks and thus already
have them in kernel memory and set up before calling into
sk_unattached_filter_create(). As a result, due to __user annotation
in sock_fprog, sparse triggers false positives (incorrect type in
assignment [different address space]) when filters are set up before
passing them to sk_unattached_filter_create(). Therefore, let
sk_unattached_filter_create() API use sock_fprog_kern to overcome
this issue.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Fri, 23 May 2014 16:43:57 +0000 (18:43 +0200)]
net: filter: remove DL macro

Let's get rid of this macro. After commit 5bcfedf06f7f ("net: filter:
simplify label names from jump-table"), labels have become more
readable due to omission of BPF_ prefix but at the same time more
generic, so that things like `git grep -n` would not find them. As
a middle path, let's get rid of the DL macro as it's not strictly
needed and would otherwise just hide the full name.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 23 May 2014 20:29:04 +0000 (16:29 -0400)]
Merge branch 'inet_csums_part3'

Tom Herbert says:

====================
net: Checksum offload changes - Part III

I am working on overhauling RX checksum offload. Goals of this effort
are:

- Specify what exactly it means when driver returns CHECKSUM_UNNECESSARY
- Preserve CHECKSUM_COMPLETE through encapsulation layers
- Don't do skb_checksum more than once per packet
- Unify GRO and non-GRO csum verification as much as possible
- Unify the checksum functions (checksum_init)
- Simplify code

What is in this third patch set:

- Remove sk_no_check from sunrpc (doesn't seem to have any effect)
- Eliminate no_check from protosw. All protocols are using default of
  zero for this
- Split sk_no_check into sk_no_check_tx and sk_no_check_rx
- Make enabling of UDP6 more restrictive and explicit
- Support zero UDP6 checksums in l2tp

V2: Took out vxlan changes to set zero csums in IPv6, this will
    be in a later patch set.
V3: Fixed bug in restricting UDP6 checksums.

Please review carefully and test if possible, mucking with basic
checksum functions is always a little precarious :-)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Fri, 23 May 2014 15:47:40 +0000 (08:47 -0700)]
l2tp: Add support for zero IPv6 checksums

Added new L2TP configuration options to allow TX and RX of
zero checksums in IPv6. Default is not to use them.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Fri, 23 May 2014 15:47:32 +0000 (08:47 -0700)]
net: Make enabling of zero UDP6 csums more restrictive

RFC 6935 permits zero checksums to be used in IPv6; however, this is
recommended only for certain tunnel protocols. It does not make
checksums completely optional like they are in IPv4.

This patch restricts the use of IPv6 zero checksums that was previously
introduced. no_check6_tx and no_check6_rx have been added to control
the use of checksums in UDP6 RX and TX path. The normal
sk_no_check_{rx,tx} settings are not used (this avoids ambiguity when
dealing with a dual stack socket).

A helper function has been added (udp_set_no_check6) which can be
called by tunnel implementations to allow zero checksums (send on the
socket, and accept them as valid).

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: Split sk_no_check into sk_no_check_{rx,tx}
Tom Herbert [Fri, 23 May 2014 15:47:19 +0000 (08:47 -0700)]
net: Split sk_no_check into sk_no_check_{rx,tx}

Define separate fields in the sock structure for disabling checksums
on TX and RX: sk_no_check_tx and sk_no_check_rx.
The SO_NO_CHECK socket option only affects sk_no_check_tx. Also,
removed UDP_CSUM_* defines since they are no longer necessary.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: Eliminate no_check from protosw
Tom Herbert [Fri, 23 May 2014 15:47:09 +0000 (08:47 -0700)]
net: Eliminate no_check from protosw

It doesn't seem like any protocols are setting anything other
than the default, and allowing checksums to be arbitrarily disabled
for a whole protocol seems dangerous. This can be done on a per
socket basis.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agosunrpc: Remove sk_no_check setting
Tom Herbert [Fri, 23 May 2014 15:46:55 +0000 (08:46 -0700)]
sunrpc: Remove sk_no_check setting

Setting sk_no_check to UDP_CSUM_NORCV seems to have no effect.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Fri, 23 May 2014 20:28:18 +0000 (16:28 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to igb, igbvf, ixgbe, i40e and i40evf.

Jacob provides eight patches to cleanup the ixgbe driver to resolve various
checkpatch.pl warnings/errors as well as minor coding style issues.

Stephen Hemminger and I provide simple cleanups of void functions which
had useless return statements at the end of the function which are not
needed.

v2: Dropped Emil's patch "ixgbe: fix the detection of SFP+ capable interfaces"
    while I wait for his updated patch to be validated.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'mvneta-next'
David S. Miller [Fri, 23 May 2014 19:32:15 +0000 (15:32 -0400)]
Merge branch 'mvneta-next'

Ezequiel Garcia says:

====================
net: ethernet: marvell: Assorted fixes

New round for this assorted fixes and clean-up series. There is more room for
clean-ups, and I'll start preparing more patches once these are accepted.

This series consists of cleanups and minor improvements on mvneta, mv643xx_eth
and mvmdio drivers. None of the patches imply any functionality change, except
for the patch six "Change the number of default rx queues to one".

This patch reduces the driver's allocated resources and makes the multiqueue
path in the poll function not get taken. The previous patchset contains more
details:

  http://permalink.gmane.org/gmane.linux.network/315015

As usual, any feedback on this will be well received!

Changes from v2:

  * Rebased on today's net-next and dropped patch
    "net: mvneta: Factorize feature setting", merged in the recent
    TSO series.

  * As per Sergei's suggestion, used devm_kcalloc or devm_kmalloc_array
    when suitable.

Changes from v1:

  * Added two more clean-up patches to the series.

  * Added Sebastian's Acked-by's.

  * Fixed extra empty line in "net: mv643xx_eth: Simplify
    mv643xx_eth_adjust_link()" as pointed out by David Miller.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: mvneta: Remove unneeded 'weigth' field
Ezequiel Garcia [Thu, 22 May 2014 23:07:03 +0000 (20:07 -0300)]
net: mvneta: Remove unneeded 'weigth' field

The 'weight' field is only used to pass the weight to the napi initialization
function. This commit removes the field, and instead uses a fixed value to
initialize the napi context.

Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: mvmdio: Use devm_* API to simplify the code
Ezequiel Garcia [Thu, 22 May 2014 23:07:02 +0000 (20:07 -0300)]
net: mvmdio: Use devm_* API to simplify the code

This commit makes use of devm_kmalloc_array() for memory allocation and the
recently introduced devm_mdiobus_alloc() API to simplify driver's code.
While here, remove a redundant out of memory error message.

Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: mvneta: Change the number of default rx queues to one
Ezequiel Garcia [Thu, 22 May 2014 23:07:01 +0000 (20:07 -0300)]
net: mvneta: Change the number of default rx queues to one

The driver does not support multiple rx queues, and so it's a waste
of resources to have a default number larger than one (1).

Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: mvneta: Use prepare/commit API to simplify MAC address setting
Ezequiel Garcia [Thu, 22 May 2014 23:07:00 +0000 (20:07 -0300)]
net: mvneta: Use prepare/commit API to simplify MAC address setting

Use eth_prepare_mac_addr_change and eth_commit_mac_addr_change, instead
of manually checking and storing the MAC address, which makes the
code slightly more robust. This fixes the lack of valid MAC address check
in the driver's .ndo_set_mac_address hook.
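
A rough userspace sketch of the prepare/commit split follows. The sketch_*
names and the struct are hypothetical; the validity rule (not multicast,
not all zeros) and the -EADDRNOTAVAIL return value are assumptions about
what the prepare step checks, not a copy of the kernel helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN 6

/* Hypothetical device holding only the field this sketch needs. */
struct sketch_netdev {
	uint8_t dev_addr[SKETCH_ETH_ALEN];
};

/* A usable unicast address is neither multicast/broadcast nor all zeros. */
static bool sketch_is_valid_ether_addr(const uint8_t *addr)
{
	static const uint8_t zero[SKETCH_ETH_ALEN];

	if (addr[0] & 0x01)	/* group bit set: multicast or broadcast */
		return false;
	return memcmp(addr, zero, SKETCH_ETH_ALEN) != 0;
}

/* Prepare step: validate only, touch nothing. */
static int sketch_prepare_mac_change(const uint8_t *addr)
{
	assert(addr != NULL);
	return sketch_is_valid_ether_addr(addr) ? 0 : -EADDRNOTAVAIL;
}

/* Commit step: store the address once it is known to be acceptable. */
static void sketch_commit_mac_change(struct sketch_netdev *dev,
				     const uint8_t *addr)
{
	memcpy(dev->dev_addr, addr, SKETCH_ETH_ALEN);
}

/* What a .ndo_set_mac_address style hook could look like with the split. */
static int sketch_set_mac_address(struct sketch_netdev *dev,
				  const uint8_t *addr)
{
	int ret = sketch_prepare_mac_change(addr);

	if (ret < 0)
		return ret;
	/* a real driver would program its MAC registers here */
	sketch_commit_mac_change(dev, addr);
	return 0;
}
```

Because validation happens before anything is written, a rejected address
leaves the stored one untouched.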

Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: mvneta: Clean-up mvneta_init()
Ezequiel Garcia [Thu, 22 May 2014 23:06:59 +0000 (20:06 -0300)]
net: mvneta: Clean-up mvneta_init()

This commit cleans-up mvneta_init(), which initializes the hardware
and allocates the rx/tx queues. The queue allocation is simplified
by using devm_kcalloc instead of kzalloc. The unused phy_addr parameter
is removed. While here, the 'hal' references in the comments are removed.
This commit makes no functionality change.
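
Why a device-managed allocation simplifies the error paths can be shown
with a small userspace model. Everything below (struct names, sketch_*
functions) is hypothetical and only imitates the idea behind devm_kcalloc:
allocations are tied to the device and released together, so the caller
needs no per-allocation cleanup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One tracked allocation; the caller's array lives in data[]. */
struct sketch_devres {
	struct sketch_devres *next;
	max_align_t data[];	/* keeps the payload suitably aligned */
};

/* Hypothetical device that owns a list of managed allocations. */
struct sketch_device {
	struct sketch_devres *resources;
	unsigned int nr_resources;
};

/* Zeroed, overflow-checked array allocation owned by the device. */
static void *sketch_devm_kcalloc(struct sketch_device *dev, size_t n,
				 size_t size)
{
	struct sketch_devres *res;

	assert(dev != NULL);
	/* refuse requests where header + n * size would overflow */
	if (size != 0 && n > (SIZE_MAX - sizeof(*res)) / size)
		return NULL;

	res = calloc(1, sizeof(*res) + n * size);
	if (!res)
		return NULL;

	res->next = dev->resources;
	dev->resources = res;
	dev->nr_resources++;
	return res->data;
}

/* Called once when the device goes away: frees everything it owns. */
static void sketch_device_release(struct sketch_device *dev)
{
	while (dev->resources) {
		struct sketch_devres *res = dev->resources;

		dev->resources = res->next;
		free(res);
		dev->nr_resources--;
	}
}
```

With this model an init function can allocate the rx and tx queue arrays
and simply return on failure; the single release call covers both.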

Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: mvneta: Check tx queue setup error in mvneta_change_mtu()
Ezequiel Garcia [Thu, 22 May 2014 23:06:58 +0000 (20:06 -0300)]
net: mvneta: Check tx queue setup error in mvneta_change_mtu()

This commit checks the return code of mvneta_setup_txq() call
in mvneta_change_mtu(). Also, use the netdevice pointer directly
instead of dereferencing the port structure. While here, let's
fix a tiny comment typo.

Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: mvneta: Clean-up mvneta_tx_frag_process()
Ezequiel Garcia [Thu, 22 May 2014 23:06:57 +0000 (20:06 -0300)]
net: mvneta: Clean-up mvneta_tx_frag_process()

A tiny clean-up to improve readability. This commit makes no functionality
change.

Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: mv643xx_eth: Simplify mv643xx_eth_adjust_link()
Ezequiel Garcia [Thu, 22 May 2014 23:06:56 +0000 (20:06 -0300)]
net: mv643xx_eth: Simplify mv643xx_eth_adjust_link()

Currently, mv643xx_eth_adjust_link() is only used to call mv643xx_adjust_pscr().
This commit renames the latter to the former, and therefore removes the extra
and useless function.

Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agolib/test_bpf.c: don't use gcc union shortcut
Andrew Morton [Thu, 22 May 2014 17:16:46 +0000 (10:16 -0700)]
lib/test_bpf.c: don't use gcc union shortcut

Older gcc's (mine is gcc-4.4.4) make a mess of this.

lib/test_bpf.c:74: error: unknown field 'insns' specified in initializer
lib/test_bpf.c:75: warning: missing braces around initializer
lib/test_bpf.c:75: warning: (near initialization for 'tests[0].<anonymous>.insns[0]')
lib/test_bpf.c:76: error: extra brace group at end of initializer
lib/test_bpf.c:76: error: (near initialization for 'tests[0].<anonymous>')
lib/test_bpf.c:76: warning: excess elements in union initializer
lib/test_bpf.c:76: warning: (near initialization for 'tests[0].<anonymous>')
lib/test_bpf.c:77: error: extra brace group at end of initializer
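
For illustration, one portable way to avoid initializing through an
anonymous union is to give the union a name and spell out the member path
in the designated initializer. The types and values below are made up for
the sketch and are not the contents of lib/test_bpf.c.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_INSNS 4

struct sketch_insn {
	int code;
	int k;
};

struct sketch_bpf_test {
	const char *descr;
	union {
		struct sketch_insn insns[SKETCH_MAX_INSNS];
		long raw[SKETCH_MAX_INSNS];
	} u;	/* named, so initializers write .u.insns explicitly */
};

static const struct sketch_bpf_test sketch_tests[] = {
	{
		.descr = "load then return",
		.u.insns = {
			{ 0x20, 0 },
			{ 0x06, 1 },
		},
	},
};

#define SKETCH_NR_TESTS (sizeof(sketch_tests) / sizeof(sketch_tests[0]))

/* Small accessor so callers never index past the table. */
static int sketch_insn_code(size_t test, size_t insn)
{
	assert(test < SKETCH_NR_TESTS && insn < SKETCH_MAX_INSNS);
	return sketch_tests[test].u.insns[insn].code;
}
```

Entries not listed in the initializer are zero-filled, so the table can
stay short while the array keeps its fixed size.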

Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet-next:v4: Add support to configure SR-IOV VF minimum and maximum Tx rate through...
Sucheta Chakraborty [Thu, 22 May 2014 13:59:05 +0000 (09:59 -0400)]
net-next:v4: Add support to configure SR-IOV VF minimum and maximum Tx rate through ip tool.

o min_tx_rate puts lower limit on the VF bandwidth. VF is guaranteed
  to have a bandwidth of at least this value.
  max_tx_rate puts cap on the VF bandwidth. VF can have a bandwidth
  of up to this value.

o A new handler set_vf_rate for attr IFLA_VF_RATE has been introduced
  which takes 4 arguments:
  netdev, VF number, min_tx_rate, max_tx_rate

o ndo_set_vf_rate replaces ndo_set_vf_tx_rate handler.

o Drivers that currently implement ndo_set_vf_tx_rate should now call
  ndo_set_vf_rate instead and reject attempt to set a minimum bandwidth
  greater than 0 for IFLA_VF_TX_RATE when IFLA_VF_RATE is not yet
  implemented by driver.

o If user enters only one of either min_tx_rate or max_tx_rate, then,
  userland should read back the other value from driver and set both
  for IFLA_VF_RATE.
  Drivers that have not yet implemented IFLA_VF_RATE should always
  return min_tx_rate as 0 when read from ip tool.

o If both IFLA_VF_TX_RATE and IFLA_VF_RATE options are specified, then
  IFLA_VF_RATE should override.

o Idea is to have consistent display of rate values to user.

o Usage example: -

  ./ip link set p4p1 vf 0 rate 900

  ./ip link show p4p1
  32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
  DEFAULT qlen 1000
    link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 900 (Mbps), max_tx_rate 900Mbps
    vf 1 MAC f6:c6:7c:3f:3d:6c
    vf 2 MAC 56:32:43:98:d7:71
    vf 3 MAC d6:be:c3:b5:85:ff
    vf 4 MAC ee:a9:9a:1e:19:14
    vf 5 MAC 4a:d0:4c:07:52:18
    vf 6 MAC 3a:76:44:93:62:f9
    vf 7 MAC 82:e9:e7:e3:15:1a

  ./ip link set p4p1 vf 0 max_tx_rate 300 min_tx_rate 200

  ./ip link show p4p1
  32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
  DEFAULT qlen 1000
    link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 300 (Mbps), max_tx_rate 300Mbps,
    min_tx_rate 200Mbps
    vf 1 MAC f6:c6:7c:3f:3d:6c
    vf 2 MAC 56:32:43:98:d7:71
    vf 3 MAC d6:be:c3:b5:85:ff
    vf 4 MAC ee:a9:9a:1e:19:14
    vf 5 MAC 4a:d0:4c:07:52:18
    vf 6 MAC 3a:76:44:93:62:f9
    vf 7 MAC 82:e9:e7:e3:15:1a

  ./ip link set p4p1 vf 0 max_tx_rate 600 rate 300

  ./ip link show p4p1
  32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
  DEFAULT qlen 1000
    link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 600 (Mbps), max_tx_rate 600Mbps,
    min_tx_rate 200Mbps
    vf 1 MAC f6:c6:7c:3f:3d:6c
    vf 2 MAC 56:32:43:98:d7:71
    vf 3 MAC d6:be:c3:b5:85:ff
    vf 4 MAC ee:a9:9a:1e:19:14
    vf 5 MAC 4a:d0:4c:07:52:18
    vf 6 MAC 3a:76:44:93:62:f9
    vf 7 MAC 82:e9:e7:e3:15:1a
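
The precedence rules listed above can be modelled in a few lines. This is
only a sketch under stated assumptions: the struct and function names are
hypothetical, and the "min must not exceed a non-zero max" check is an
assumption rather than something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-VF state, in Mbps; 0 means "no limit configured". */
struct sketch_vf_rate {
	int min_tx_rate;
	int max_tx_rate;
};

/* One request as it might arrive: legacy attribute, new attribute, or both. */
struct sketch_vf_request {
	bool has_tx_rate;	/* legacy "rate N" (IFLA_VF_TX_RATE) */
	int tx_rate;
	bool has_rate;		/* new min/max pair (IFLA_VF_RATE) */
	int min_tx_rate;
	int max_tx_rate;
};

/* Stand-in for a driver's set_vf_rate handler. */
static int sketch_set_vf_rate(struct sketch_vf_rate *vf, int min, int max)
{
	if (min < 0 || max < 0)
		return -EINVAL;
	if (max != 0 && min > max)	/* assumption: reject inverted range */
		return -EINVAL;
	vf->min_tx_rate = min;
	vf->max_tx_rate = max;
	return 0;
}

static int sketch_apply_request(struct sketch_vf_rate *vf,
				const struct sketch_vf_request *req)
{
	assert(vf != NULL && req != NULL);

	/* When both are given, the new min/max attribute overrides. */
	if (req->has_rate)
		return sketch_set_vf_rate(vf, req->min_tx_rate,
					  req->max_tx_rate);

	/* Legacy rate only: treat it as the cap and keep the current minimum. */
	if (req->has_tx_rate)
		return sketch_set_vf_rate(vf, vf->min_tx_rate, req->tx_rate);

	return 0;
}
```

Replaying the three commands from the usage example through this model
gives 0/900, then 200/300, then 200/600 for min/max, in line with the
output shown.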

Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agohyperv: Add hash value into RNDIS Per-packet info
Haiyang Zhang [Wed, 21 May 2014 19:55:39 +0000 (12:55 -0700)]
hyperv: Add hash value into RNDIS Per-packet info

It passes the hash value as the RNDIS Per-packet info to the Hyper-V host,
so that the send completion notices can be spread across multiple channels.
MS-TFS: 140273

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvs...
David S. Miller [Fri, 23 May 2014 18:45:18 +0000 (14:45 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/pshelar/openvswitch

Pravin B Shelar says:

====================
Open vSwitch

A set of OVS changes for net-next/3.16.

Most of the changes are related to improving the performance of flow setup
by minimizing critical sections.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 23 May 2014 17:04:04 +0000 (10:04 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
 "The biggest commit is an irqtime accounting loop latency fix, the rest
  are misc fixes all over the place: deadline scheduling, docs, numa,
  balancer and a bad to-idle latency fix"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/numa: Initialize newidle balance stats in sd_numa_init()
  sched: Fix updating rq->max_idle_balance_cost and rq->next_balance in idle_balance()
  sched: Skip double execution of pick_next_task_fair()
  sched: Use CPUPRI_NR_PRIORITIES instead of MAX_RT_PRIO in cpupri check
  sched/deadline: Fix memory leak
  sched/deadline: Fix sched_yield() behavior
  sched: Sanitize irq accounting madness
  sched/docbook: Fix 'make htmldocs' warnings caused by missing description

10 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 23 May 2014 17:02:34 +0000 (10:02 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "The biggest changes are fixes for races that kept triggering Trinity
  crashes, plus liblockdep build fixes and smaller misc fixes.

  The liblockdep bits in perf/urgent are a pull mistake - they should
  have been in locking/urgent - but by the time I noticed other commits
  were added and testing was done :-/ Sorry about that"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Fix a race between ring_buffer_detach() and ring_buffer_attach()
  perf: Prevent false warning in perf_swevent_add
  perf: Limit perf_event_attr::sample_period to 63 bits
  tools/liblockdep: Remove all build files when doing make clean
  tools/liblockdep: Build liblockdep from tools/Makefile
  perf/x86/intel: Fix Silvermont's event constraints
  perf: Fix perf_event_init_context()
  perf: Fix race in removing an event

10 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Fri, 23 May 2014 16:41:33 +0000 (09:41 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm radeon and nouveau fixes from Dave Airlie:
 "Fixes for the other big two.

  The radeon VCE one is large but it fixes some userspace triggerable
  issues, otherwise it's blackscreens and oopses.

  Nouveau fixes a bleeding laptop panel issue when displayport is used
  sometimes"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon/pm: don't allow debugfs/sysfs access when PX card is off (v2)
  drm/radeon: avoid segfault on device open when accel is not working.
  drm/radeon: fix typo in finding PLL params
  drm/radeon: fix register typo on si
  drm/radeon: fix buffer placement under memory pressure v2
  drm/radeon: fix page directory update size estimation
  drm/radeon: handle non-VGA class pci devices with ATRM
  drm/radeon: fix DCE83 check for mullins
  drm/radeon: check VCE relocation buffer range v3
  drm/radeon: also try GART for CPU accessed buffers
  drm/gf119-/disp: fix nasty bug which can clobber SOR0's clock setup
  drm/nvd9/therm: handle another kind of PWM fan

10 years agoMerge branch 'akpm' (incoming from Andrew)
Linus Torvalds [Fri, 23 May 2014 16:38:07 +0000 (09:38 -0700)]
Merge branch 'akpm' (incoming from Andrew)

Merge misc fixes from Andrew Morton:
 "9 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  MAINTAINERS: add closing angle bracket to Vince Bridgers' email address
  Documentation: fix DOCBOOKS=... building
  ocfs2: fix double kmem_cache_destroy in dlm_init
  mm/memory-failure.c: fix memory leak by race between poison and unpoison
  wait: swap EXIT_ZOMBIE(Z) and EXIT_DEAD(X) chars in TASK_STATE_TO_CHAR_STR
  memcg: fix swapcache charge from kernel thread context
  mm: madvise: fix MADV_WILLNEED on shmem swapouts
  mm/filemap.c: avoid always dirtying mapping->flags on O_DIRECT
  hwpoison, hugetlb: lock_page/unlock_page does not match for handling a free hugepage

10 years agoMAINTAINERS: add closing angle bracket to Vince Bridgers' email address
Tobias Klauser [Thu, 22 May 2014 18:54:24 +0000 (11:54 -0700)]
MAINTAINERS: add closing angle bracket to Vince Bridgers' email address

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Cc: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoDocumentation: fix DOCBOOKS=... building
Johannes Berg [Thu, 22 May 2014 18:54:23 +0000 (11:54 -0700)]
Documentation: fix DOCBOOKS=... building

Prior to commit 4266129964b8 ("[media] DocBook: Move all media docbook
stuff into its own directory") it was possible to build only a single
(or more) book(s) by calling, for example

    make htmldocs DOCBOOKS=80211.xml

This now fails:

    cp: target `.../Documentation/DocBook//media_api' is not a directory

Ignore errors from that copy to make this possible again.

Fixes: 4266129964b8 ("[media] DocBook: Move all media docbook stuff into its own directory")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoocfs2: fix double kmem_cache_destroy in dlm_init
Joseph Qi [Thu, 22 May 2014 18:54:22 +0000 (11:54 -0700)]
ocfs2: fix double kmem_cache_destroy in dlm_init

In dlm_init, if creating dlm_lockname_cache fails in
dlm_init_master_caches, dlm_lockres_cache, which was created earlier, is
destroyed twice.  This causes the system to die when loading modules.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agomm/memory-failure.c: fix memory leak by race between poison and unpoison
Naoya Horiguchi [Thu, 22 May 2014 18:54:21 +0000 (11:54 -0700)]
mm/memory-failure.c: fix memory leak by race between poison and unpoison

When a memory error happens on an in-use page or (free and in-use)
hugepage, the victim page is isolated with its refcount set to one.

When you try to unpoison it later, unpoison_memory() calls put_page()
for it twice in order to bring the page back to free page pool (buddy or
free hugepage list).  However, if another memory error occurs on the
page which we are unpoisoning, memory_failure() returns without
releasing the refcount which was incremented in the same call at first,
which results in a memory leak and inconsistent num_poisoned_pages
statistics.  This patch fixes it.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: <stable@vger.kernel.org> [2.6.32+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agowait: swap EXIT_ZOMBIE(Z) and EXIT_DEAD(X) chars in TASK_STATE_TO_CHAR_STR
Masatake YAMATO [Thu, 22 May 2014 18:54:20 +0000 (11:54 -0700)]
wait: swap EXIT_ZOMBIE(Z) and EXIT_DEAD(X) chars in TASK_STATE_TO_CHAR_STR

In commit ad86622b478e ("wait: swap EXIT_ZOMBIE and EXIT_DEAD to hide
EXIT_TRACE from user-space") the order of the task state definitions was
changed: EXIT_DEAD and EXIT_ZOMBIE were swapped.  However, the characters
for the states in the TASK_STATE_TO_CHAR_STR string were not updated.  This
patch synchronizes the string with the order of the definitions.
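
The coupling between the bit order and the string can be shown with a
reduced model. The bit values and the shortened string below are
assumptions chosen to mirror the ordering the commit describes (EXIT_DEAD
before EXIT_ZOMBIE); they are not copied from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bit values, in definition order. */
#define SKETCH_TASK_RUNNING		0
#define SKETCH_TASK_INTERRUPTIBLE	1
#define SKETCH_TASK_UNINTERRUPTIBLE	2
#define SKETCH_TASK_STOPPED		4
#define SKETCH_TASK_TRACED		8
#define SKETCH_EXIT_DEAD		16
#define SKETCH_EXIT_ZOMBIE		32

/* Index 0 is "running"; bit n of the state maps to index n + 1. */
static const char sketch_state_chars[] = "RSDTtXZ";

static char sketch_state_to_char(unsigned int state)
{
	size_t idx = 0;

	if (state != 0) {
		/* find the lowest set bit */
		while (!(state & 1)) {
			state >>= 1;
			idx++;
		}
		idx++;	/* shift past the "running" slot */
	}

	if (idx >= sizeof(sketch_state_chars) - 1)
		return '?';	/* state not covered by this reduced table */
	return sketch_state_chars[idx];
}
```

If the two exit states are swapped in the definitions but the string keeps
the old "ZX" order, a zombie would be reported as 'X', which is the kind of
mismatch the patch corrects.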

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agomemcg: fix swapcache charge from kernel thread context
Michal Hocko [Thu, 22 May 2014 18:54:19 +0000 (11:54 -0700)]
memcg: fix swapcache charge from kernel thread context

Commit 284f39afeaa4 ("mm: memcg: push !mm handling out to page cache
charge function") explicitly checks for page cache charges without any
mm context (from kernel thread context[1]).

This seemed to be the only possible case where memory could be charged
without mm context so commit 03583f1a631c ("memcg: remove unnecessary
!mm check from try_get_mem_cgroup_from_mm()") removed the mm check from
get_mem_cgroup_from_mm().  This however caused another NULL ptr
dereference during early boot when loopback kernel thread splices to
tmpfs as reported by Stephan Kulow:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000360
  IP: get_mem_cgroup_from_mm.isra.42+0x2b/0x60
  Oops: 0000 [#1] SMP
  Modules linked in: btrfs dm_multipath dm_mod scsi_dh multipath raid10 raid456 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx raid1 raid0 md_mod parport_pc parport nls_utf8 isofs usb_storage iscsi_ibft iscsi_boot_sysfs arc4 ecb fan thermal nfs lockd fscache nls_iso8859_1 nls_cp437 sg st hid_generic usbhid af_packet sunrpc sr_mod cdrom ata_generic uhci_hcd virtio_net virtio_blk ehci_hcd usbcore ata_piix floppy processor button usb_common virtio_pci virtio_ring virtio edd squashfs loop ppa]
  CPU: 0 PID: 97 Comm: loop1 Not tainted 3.15.0-rc5-5-default #1
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  Call Trace:
    __mem_cgroup_try_charge_swapin+0x40/0xe0
    mem_cgroup_charge_file+0x8b/0xd0
    shmem_getpage_gfp+0x66b/0x7b0
    shmem_file_splice_read+0x18f/0x430
    splice_direct_to_actor+0xa2/0x1c0
    do_lo_receive+0x5a/0x60 [loop]
    loop_thread+0x298/0x720 [loop]
    kthread+0xc6/0xe0
    ret_from_fork+0x7c/0xb0

Also Branimir Maksimovic reported the following oops which is triggered
for the swapcache charge path from the accounting code for kernel threads:

  CPU: 1 PID: 160 Comm: kworker/u8:5 Tainted: P           OE 3.15.0-rc5-core2-custom #159
  Hardware name: System manufacturer System Product Name/MAXIMUSV GENE, BIOS 1903 08/19/2013
  task: ffff880404e349b0 ti: ffff88040486a000 task.ti: ffff88040486a000
  RIP: get_mem_cgroup_from_mm.isra.42+0x2b/0x60
  Call Trace:
    __mem_cgroup_try_charge_swapin+0x45/0xf0
    mem_cgroup_charge_file+0x9c/0xe0
    shmem_getpage_gfp+0x62c/0x770
    shmem_write_begin+0x38/0x40
    generic_perform_write+0xc5/0x1c0
    __generic_file_aio_write+0x1d1/0x3f0
    generic_file_aio_write+0x4f/0xc0
    do_sync_write+0x5a/0x90
    do_acct_process+0x4b1/0x550
    acct_process+0x6d/0xa0
    do_exit+0x827/0xa70
    kthread+0xc3/0xf0

This patch fixes the issue by reintroducing the mm check into
get_mem_cgroup_from_mm.  We could do the same trick in
__mem_cgroup_try_charge_swapin as we do for the regular page cache path
but it is not worth the trouble.  The check is not that expensive and it is
better to have get_mem_cgroup_from_mm be more robust.
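
As a rough model of the reintroduced check: a lookup that tolerates a
missing mm instead of dereferencing it. The types and names are
hypothetical, and falling back to a root group is an assumption about what
the check does when there is no mm context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct sketch_mem_cgroup {
	const char *name;
};

struct sketch_mm {
	struct sketch_mem_cgroup *owner_memcg;
};

static struct sketch_mem_cgroup sketch_root_memcg = { "root" };

/* Kernel threads have no mm, so the lookup must cope with mm == NULL. */
static struct sketch_mem_cgroup *
sketch_get_memcg_from_mm(const struct sketch_mm *mm)
{
	struct sketch_mem_cgroup *memcg;

	if (mm == NULL || mm->owner_memcg == NULL)
		memcg = &sketch_root_memcg;	/* assumed fallback */
	else
		memcg = mm->owner_memcg;

	assert(memcg != NULL);
	return memcg;
}
```

Both oops traces above come from kernel-thread context, which corresponds
to the mm == NULL branch in this sketch.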

[1] - http://marc.info/?l=linux-mm&m=139463617808941&w=2

Fixes: 03583f1a631c ("memcg: remove unnecessary !mm check from try_get_mem_cgroup_from_mm()")
Reported-and-tested-by: Stephan Kulow <coolo@suse.com>
Reported-by: Branimir Maksimovic <branimir.maksimovic@gmail.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agomm: madvise: fix MADV_WILLNEED on shmem swapouts
Johannes Weiner [Thu, 22 May 2014 18:54:17 +0000 (11:54 -0700)]
mm: madvise: fix MADV_WILLNEED on shmem swapouts

MADV_WILLNEED currently does not read swapped out shmem pages back in.

Commit 0cd6144aadd2 ("mm + fs: prepare for non-page entries in page
cache radix trees") made find_get_page() filter exceptional radix tree
entries but failed to convert all find_get_page() callers that WANT
exceptional entries over to find_get_entry().  One of them is shmem swap
readahead in madvise, which now skips over any swap-out records.

Convert it to find_get_entry().

Fixes: 0cd6144aadd2 ("mm + fs: prepare for non-page entries in page cache radix trees")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agomm/filemap.c: avoid always dirtying mapping->flags on O_DIRECT
Jens Axboe [Thu, 22 May 2014 18:54:16 +0000 (11:54 -0700)]
mm/filemap.c: avoid always dirtying mapping->flags on O_DIRECT

In some testing I ran today (some fio jobs that spread over two nodes),
we end up spending 40% of the time in filemap_check_errors().  That
smells fishy.  Looking further, this is basically what happens:

blkdev_aio_read()
    generic_file_aio_read()
        filemap_write_and_wait_range()
            if (!mapping->nr_pages)
                filemap_check_errors()

and filemap_check_errors() always attempts two test_and_clear_bit() on
the mapping flags, thus dirtying it for every single invocation.  The
patch below tests each of these bits before clearing them, avoiding this
issue.  In my test case (4-socket box), performance went from 1.7M IOPS
to 4.0M IOPS.
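
The "test before clearing" idea can be sketched in userspace with C11
atomics. The function name is hypothetical and this is not the kernel's
bitops API; it only shows why a plain read in the common case avoids
writing to a shared cache line.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Clear a flag bit and report whether it was set, but only issue the
 * atomic read-modify-write when a cheap read says the bit is set.
 */
static bool sketch_test_and_clear_bit_lazy(atomic_ulong *flags,
					   unsigned int bit)
{
	unsigned long mask;

	assert(bit < sizeof(unsigned long) * CHAR_BIT);
	mask = 1UL << bit;

	/* Common case: bit is clear, so the cache line is only read. */
	if (!(atomic_load_explicit(flags, memory_order_relaxed) & mask))
		return false;

	/* Rare case: pay for the write, and recheck atomically. */
	return (atomic_fetch_and(flags, ~mask) & mask) != 0;
}
```

When several CPUs poll the same flags word and the error bits are almost
never set, the read-only path lets the line stay shared between them
instead of bouncing on every call.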

Signed-off-by: Jens Axboe <axboe@fb.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agohwpoison, hugetlb: lock_page/unlock_page does not match for handling a free hugepage
Chen Yucong [Thu, 22 May 2014 18:54:15 +0000 (11:54 -0700)]
hwpoison, hugetlb: lock_page/unlock_page does not match for handling a free hugepage

For handling a free hugepage in memory failure, the race will happen if
another thread hwpoisoned this hugepage concurrently.  So we need to
check PageHWPoison instead of !PageHWPoison.

If hwpoison_filter(p) returns true or a race happens, then we need to
unlock_page(hpage).

Signed-off-by: Chen Yucong <slaoub@gmail.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Tested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: <stable@vger.kernel.org> [2.6.36+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoparisc: 'renameat2()' doesn't need (or have) a separate compat system call
Linus Torvalds [Fri, 23 May 2014 16:23:51 +0000 (09:23 -0700)]
parisc: 'renameat2()' doesn't need (or have) a separate compat system call

The 'renameat2()' system call was incorrectly added as a ENTRY_COMP() in
the parisc system call table by commit 18e480aa07f78 ("parisc: add
renameat2 syscall").  That causes a link-time error due to there not
being any compat version of that system call:

  arch/parisc/kernel/built-in.o: In function `sys_call_table':
  (.rodata+0xad0): undefined reference to `compat_sys_renameat2'
  make: *** [vmlinux] Error 1

Easily fixed by marking the system call as being the same for compat as
for native by using ENTRY_SAME() instead of ENTRY_COMP().

Reported-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoi40e,igb,ixgbe: remove usless return statements
Stephen Hemminger [Thu, 6 Mar 2014 05:28:12 +0000 (05:28 +0000)]
i40e,igb,ixgbe: remove usless return statements

Remove cases where useless bare return is left at end of function.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoigb/ixgbe: remove return statements for void functions
Jeff Kirsher [Wed, 14 May 2014 08:01:09 +0000 (01:01 -0700)]
igb/ixgbe: remove return statements for void functions

Remove useless return statements for void functions which do not need
it.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
10 years agoixgbe: add /* fallthrough */ comment to case statements
Jacob Keller [Wed, 9 Apr 2014 06:03:17 +0000 (06:03 +0000)]
ixgbe: add /* fallthrough */ comment to case statements

This semicomplex switch-case has various fallthrough portions, that were
not indicated by a /* fallthrough */ comment.
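
A small, made-up example of the annotation style (the enum and capability
names are hypothetical and unrelated to the ixgbe sources):

```c
#include <assert.h>

enum sketch_media {
	SKETCH_MEDIA_100M,
	SKETCH_MEDIA_1G,
	SKETCH_MEDIA_10G,
	SKETCH_MEDIA_UNKNOWN,
};

#define SKETCH_CAP_100M	0x1
#define SKETCH_CAP_1G	0x2
#define SKETCH_CAP_10G	0x4

/* Faster media also support every slower speed, hence the cascade. */
static int sketch_link_caps(enum sketch_media media)
{
	int caps = 0;

	switch (media) {
	case SKETCH_MEDIA_10G:
		caps |= SKETCH_CAP_10G;
		/* fallthrough */
	case SKETCH_MEDIA_1G:
		caps |= SKETCH_CAP_1G;
		/* fallthrough */
	case SKETCH_MEDIA_100M:
		caps |= SKETCH_CAP_100M;
		break;
	default:
		break;
	}

	/* any recognised medium ends up with at least the slowest speed */
	assert(caps == 0 || (caps & SKETCH_CAP_100M));
	return caps;
}
```

The comment tells a reader that the missing break is deliberate rather
than a forgotten statement.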

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbe: add space between operands to &
Jacob Keller [Wed, 9 Apr 2014 06:03:16 +0000 (06:03 +0000)]
ixgbe: add space between operands to &

This patch cleans up a checkpatch.pl style warning in the ixgbe code.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbe: don't check NULL for debugfs_remove_recursive
Jacob Keller [Wed, 9 Apr 2014 06:03:15 +0000 (06:03 +0000)]
ixgbe: don't check NULL for debugfs_remove_recursive

The debugfs_remove_recursive function is NULL-safe, so we don't need to
check here ourselves.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbe: add braces around else block
Jacob Keller [Wed, 9 Apr 2014 06:03:14 +0000 (06:03 +0000)]
ixgbe: add braces around else block

This commit fixes a checkpatch.pl warning for style, by adding braces
around the else block, since the if block requires braces.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbe: fix several concatenated strings to single line
Jacob Keller [Wed, 9 Apr 2014 06:03:13 +0000 (06:03 +0000)]
ixgbe: fix several concatenated strings to single line

This patch fixes various log strings that are split over multiple lines
in the ixgbe driver. This cleans up checkpatch.pl warnings, and makes it
easier to search the code for warning strings displayed to the kernel
log.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbe: fix checkpatch style of blank line after declaration
Jacob Keller [Wed, 9 Apr 2014 06:03:12 +0000 (06:03 +0000)]
ixgbe: fix checkpatch style of blank line after declaration

This patch fixes checkpatch warnings in ixgbe, by adding a blank line
between declaration and code blocks.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbe: fix function-like macro, remove semicolon
Jacob Keller [Wed, 9 Apr 2014 06:03:11 +0000 (06:03 +0000)]
ixgbe: fix function-like macro, remove semicolon

This patch removes the semicolon from the end of the do-while(0)
construct in two function-like macros.
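
As a minimal sketch of why the trailing semicolon matters (the macro and
function names below are hypothetical and not taken from the ixgbe
sources): if the macro body already ended in a semicolon, "MACRO(x);"
would expand to two statements and break an unbraced if/else. Leaving
the semicolon to the caller keeps the expansion a single statement.

```c
#include <assert.h>

/* Hypothetical function-like macro; no semicolon after while (0),
 * so the caller's own semicolon completes the statement. */
#define BUMP_TWICE(counter)	\
	do {			\
		(counter)++;	\
		(counter)++;	\
	} while (0)

/* With a trailing semicolon inside the macro, the "else" below would
 * fail to compile, because "BUMP_TWICE(hits);" would expand to two
 * statements and detach the else from its if. */
static int count_event(int enabled)
{
	int hits = 0;

	if (enabled)
		BUMP_TWICE(hits);
	else
		hits = -1;

	return hits;
}
```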

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbe: clean up checkpatch warnings about CODE_INDENT and LEADING_SPACE
Jacob Keller [Wed, 9 Apr 2014 06:03:10 +0000 (06:03 +0000)]
ixgbe: clean up checkpatch warnings about CODE_INDENT and LEADING_SPACE

The contents of this patch were originally generated by
"scripts/checkpatch.pl --fix-inplace --types CODE_INDENT,LEADING_SPACE
drivers/net/ethernet/ixgbe/*.[ch]", and then hand verified for
consistency.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoopenvswitch: Simplify genetlink code.
Pravin B Shelar [Tue, 6 May 2014 23:44:50 +0000 (16:44 -0700)]
openvswitch: Simplify genetlink code.

The following patch gets rid of struct genl_family_and_ops, which is
redundant due to changes to struct genl_family.

Signed-off-by: Kyle Mestery <mestery@noironetworks.com>
Acked-by: Kyle Mestery <mestery@noironetworks.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agoopenvswitch: Minimize ovs_flow_cmd_new|set critical sections.
Jarno Rajahalme [Mon, 5 May 2014 22:22:25 +0000 (15:22 -0700)]
openvswitch: Minimize ovs_flow_cmd_new|set critical sections.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agoopenvswitch: Split ovs_flow_cmd_new_or_set().
Jarno Rajahalme [Mon, 5 May 2014 21:53:51 +0000 (14:53 -0700)]
openvswitch: Split ovs_flow_cmd_new_or_set().

The following patch will be easier to reason about with separate
ovs_flow_cmd_new() and ovs_flow_cmd_set() functions.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agoopenvswitch: Minimize ovs_flow_cmd_del critical section.
Jarno Rajahalme [Mon, 5 May 2014 21:40:13 +0000 (14:40 -0700)]
openvswitch: Minimize ovs_flow_cmd_del critical section.

ovs_flow_cmd_del() now allocates reply (if needed) after the flow has
already been removed from the flow table.  If the reply allocation
fails, a netlink error is signaled with netlink_set_err(), as is
already done in ovs_flow_cmd_new_or_set() in a similar situation.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agoopenvswitch: Reduce locking requirements.
Jarno Rajahalme [Mon, 5 May 2014 21:28:07 +0000 (14:28 -0700)]
openvswitch: Reduce locking requirements.

Reduce and clarify locking requirements for ovs_flow_cmd_alloc_info(),
ovs_flow_cmd_fill_info() and ovs_flow_cmd_build_info().

A datapath pointer is available only when holding a lock.  Change
ovs_flow_cmd_fill_info() and ovs_flow_cmd_build_info() to take a
dp_ifindex directly, rather than a datapath pointer that is then
(only) used to get the dp_ifindex.  This is useful, since the
dp_ifindex is available even when the datapath pointer is not, both
before and after taking a lock, which makes further critical section
reduction possible.

Make ovs_flow_cmd_alloc_info() take an 'acts' argument instead of a
'flow' pointer.  This allows some future patches to do the allocation
before acquiring the flow pointer.

The locking requirements after this patch are:

ovs_flow_cmd_alloc_info(): May be called without locking, must not be
called while holding the RCU read lock (due to memory allocation).
If 'acts' belongs to a flow in the flow table, however, then the
caller must hold ovs_mutex.

ovs_flow_cmd_fill_info(): Either ovs_mutex or RCU read lock must be held.

ovs_flow_cmd_build_info(): This calls both of the above, so the caller
must hold ovs_mutex.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agoopenvswitch: Fix ovs_flow_stats_get/clear RCU dereference.
Jarno Rajahalme [Mon, 5 May 2014 21:17:28 +0000 (14:17 -0700)]
openvswitch: Fix ovs_flow_stats_get/clear RCU dereference.

For ovs_flow_stats_get() using ovsl_dereference() was wrong, since
flow dumps call this with RCU read lock.

ovs_flow_stats_clear() is always called with ovs_mutex, so can use
ovsl_dereference().

Also, make the ovs_flow_stats_get() 'flow' argument const to make
later patches cleaner.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agoopenvswitch: Fix typo.
Jarno Rajahalme [Mon, 5 May 2014 21:15:18 +0000 (14:15 -0700)]
openvswitch: Fix typo.

The incorrect struct name was confusing, even though otherwise
inconsequential.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agoopenvswitch: Minimize dp and vport critical sections.
Jarno Rajahalme [Mon, 5 May 2014 21:13:32 +0000 (14:13 -0700)]
openvswitch: Minimize dp and vport critical sections.

Move most memory allocations away from the ovs_mutex critical
sections.  vport allocations still happen while the lock is taken, as
changing that would require major refactoring. Also, vports are
created very rarely so it should not matter.

ovs_dp_cmd_get() now only takes the rcu_read_lock(), rather than
ovs_lock(), as nothing needs to be changed.  This was done by
ovs_vport_cmd_get() already.
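
The general pattern of keeping allocations out of a critical section
can be sketched in user space as follows. This is only an illustration
under stated assumptions: table_lock, table_entry and table_set() are
hypothetical stand-ins built on a pthread mutex, not the OVS code or
ovs_mutex itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space stand-ins; not the OVS data structures. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static char *table_entry;

/* Allocate and fill the new entry before taking the lock, so the
 * critical section only covers the pointer swap. */
static int table_set(const char *name)
{
	size_t len = strlen(name) + 1;
	char *fresh = malloc(len);
	char *old;

	if (!fresh)
		return -1;	/* failed without ever holding the lock */
	memcpy(fresh, name, len);

	pthread_mutex_lock(&table_lock);
	old = table_entry;
	table_entry = fresh;
	pthread_mutex_unlock(&table_lock);

	free(old);		/* freeing also happens outside the lock */
	return 0;
}
```

An allocation failure is then handled before the lock is ever taken, so
no unlock path is needed for that error case.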

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agoopenvswitch: Make flow mask removal symmetric.
Jarno Rajahalme [Mon, 5 May 2014 20:24:53 +0000 (13:24 -0700)]
openvswitch: Make flow mask removal symmetric.

Masks are inserted when flows are inserted to the table, so it is
logical to correspondingly remove masks when flows are removed from
the table, in ovs_flow_table_remove().

This allows ovs_flow_free() to be called without locking, which will
be used by later patches.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agoopenvswitch: Build flow cmd netlink reply only if needed.
Jarno Rajahalme [Mon, 5 May 2014 20:13:14 +0000 (13:13 -0700)]
openvswitch: Build flow cmd netlink reply only if needed.

Use netlink_has_listeners() and NLM_F_ECHO flag to determine if a
reply is needed or not for OVS_FLOW_CMD_NEW, OVS_FLOW_CMD_SET, or
OVS_FLOW_CMD_DEL.  Currently, OVS userspace does not request a reply
for OVS_FLOW_CMD_NEW, but usually does for OVS_FLOW_CMD_DEL, as stats
may have changed.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agoopenvswitch: Clarify locking.
Jarno Rajahalme [Mon, 5 May 2014 18:32:17 +0000 (11:32 -0700)]
openvswitch: Clarify locking.

Remove unnecessary locking from functions that are always called with
appropriate locking.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agoopenvswitch: Avoid assigning a NULL pointer to flow actions.
Jarno Rajahalme [Mon, 5 May 2014 16:59:40 +0000 (09:59 -0700)]
openvswitch: Avoid assigning a NULL pointer to flow actions.

Flow SET can accept an empty set of actions, with the intended
semantics of leaving existing actions unmodified.  This seems to have
been broken after OVS 1.7, as we have assigned the flow's actions
pointer to NULL in this case, but we never check for the NULL pointer
later on.  This patch restores the intended behavior and documents it
in include/linux/openvswitch.h.
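
A rough sketch of the intended semantics, assuming that "no actions
supplied" is modeled as a NULL argument. The types and the
flow_set_actions() helper are hypothetical simplifications, not the
kernel's sw_flow or its netlink handling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types; not the kernel's sw_flow. */
struct actions { int count; };
struct flow { const struct actions *acts; };

/* Replace the actions only when a new set was supplied.  When none
 * is supplied (modeled here as NULL), flow->acts is left untouched
 * instead of being overwritten with NULL, which later code would
 * dereference without checking. */
static void flow_set_actions(struct flow *flow, const struct actions *new_acts)
{
	if (new_acts)
		flow->acts = new_acts;
}
```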

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agoopenvswitch: Compact sw_flow_key.
Jarno Rajahalme [Mon, 5 May 2014 16:54:49 +0000 (09:54 -0700)]
openvswitch: Compact sw_flow_key.

Minimize padding in sw_flow_key and move 'tp' to the main struct.
These changes simplify code when accessing the transport port numbers
and the tcp flags, and make the sw_flow_key 8 bytes smaller on 64-bit
systems (128->120 bytes).  These changes also make the keys for IPv4
packets fit in one cache line.
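
How member ordering affects padding can be sketched with two small
structs. The field names below are hypothetical and do not reproduce
the real sw_flow_key layout; they only show the same members costing
less space when the widest ones come first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fields; not the real sw_flow_key layout. */
struct key_padded {
	uint8_t  proto;		/* followed by padding before the 64-bit field */
	uint64_t tun_id;
	uint8_t  tos;		/* followed by padding before the 32-bit field */
	uint32_t addr;
};

struct key_compact {
	uint64_t tun_id;	/* widest member first */
	uint32_t addr;
	uint8_t  proto;
	uint8_t  tos;		/* only tail padding remains */
};

/* Number of bytes recovered purely by reordering the members. */
static size_t bytes_saved(void)
{
	return sizeof(struct key_padded) - sizeof(struct key_compact);
}
```

On a typical 64-bit ABI the first struct occupies 24 bytes and the
second 16, but the exact sizes depend on the compiler and target.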

There is a valid concern for safety of packing the struct
ovs_key_ipv4_tunnel, as it would be possible to take the address of
the tun_id member as a __be64 * which could result in unaligned access
in some systems. However:

- sw_flow_key itself is 64-bit aligned, so the tun_id within is always
  64-bit aligned.
- We never make arrays of ovs_key_ipv4_tunnel (which would force every
  second tun_key to be misaligned).
- We never take the address of the tun_id into a __be64 *.
- Wherever we use struct ovs_key_ipv4_tunnel outside the sw_flow_key,
  it is on the stack (in tunnel input functions), where the compiler
  has full control of the alignment.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agobatman: fix a bogus warning from batadv_is_on_batman_iface()
Cong Wang [Thu, 22 May 2014 18:57:17 +0000 (11:57 -0700)]
batman: fix a bogus warning from batadv_is_on_batman_iface()

batman tries to search dev->iflink to check if it's a batman interface,
but ->iflink could be 0, which is not a valid ifindex. It should just
avoid the iflink == 0 case.

Reported-by: Jet Chen <jet.chen@intel.com>
Tested-by: Jet Chen <jet.chen@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Antonio Quartulli <antonio@open-mesh.com>
Cc: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'mlx4-next'
David S. Miller [Thu, 22 May 2014 21:17:34 +0000 (17:17 -0400)]
Merge branch 'mlx4-next'

Amir Vadai says:

====================
net/mlx4_core: Deprecate module parameter use_prio

This small patchset deprecates the mlx4_core module parameter 'use_prio', as
suggested by Carol Soto from IBM in [1].
Also, replaced some calls with the preferred pr_warn/info/devel macros.

Patchset was applied and tested on commit b6052af: "Merge tag
'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge"

[1] - http://marc.info/?l=linux-netdev&m=139871350103432&w=2
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet/mlx4_core: Replace pr_warning() with pr_warn()
Amir Vadai [Thu, 22 May 2014 12:55:40 +0000 (15:55 +0300)]
net/mlx4_core: Replace pr_warning() with pr_warn()

As checkpatch suggests. Also changed some printk's into pr_*

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet/mlx4_core: Deprecate use_prio module parameter
Amir Vadai [Thu, 22 May 2014 12:55:39 +0000 (15:55 +0300)]
net/mlx4_core: Deprecate use_prio module parameter

use_prio was added as part of an infrastructure for running FCoE in A0 mode.
FCoE didn't get into the Mellanox upstream driver, and when it does, it won't
be using A0 steering mode.

Therefore we can safely deprecate this module parameter without hurting any
existing user.

CC: Carol Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec...
David S. Miller [Thu, 22 May 2014 20:00:00 +0000 (16:00 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2014-05-22

This is the last ipsec pull request before I leave for
a three-week vacation tomorrow. David, can you please
take urgent ipsec patches directly into net/net-next
during this time?

I'll continue to run the ipsec/ipsec-next trees as soon
as I'm back.

1) Simplify the xfrm audit handling, from Tetsuo Handa.

2) Coding style cleanup for xfrm_output, from Fabian Frederick.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>