Programming Languages Research Group: Git - firefly-linux-kernel-4.4.55.git/log

Bluetooth: Add clarifying comment for conn->auth_type

When responding to an IO capability request when we're the initiators of
the pairing we will not yet have the remote IO capability information.
Since the conn->auth_type variable is treated as an "absolute"
requirement instead of a hint of what's needed later in the user
confirmation request handler it's important that it doesn't have the
MITM bit set if there's any chance that the remote device doesn't have
the necessary IO capabilities.

This patch adds a clarifying comment so that conn->auth_type is left
untouched in this scenario.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Fix SSP acceptor just-works confirmation without MITM

From the Bluetooth Core Specification 4.1 page 1958:

"if both devices have set the Authentication_Requirements parameter to
one of the MITM Protection Not Required options, authentication stage 1
shall function as if both devices set their IO capabilities to
DisplayOnly (e.g., Numeric comparison with automatic confirmation on
both devices)"

So far our implementation has done user confirmation for all just-works
cases regardless of the MITM requirements, however following the
specification to the word means that we should not be doing confirmation
when neither side has the MITM flag set.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Tested-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: stable@vger.kernel.org

Bluetooth: Fix check for connection encryption

The conn->link_key variable tracks the type of link key in use. It is
set whenever we respond to a link key request as well as when we get a
link key notification event.

These two events do not however always guarantee that encryption is
enabled: getting a link key request and responding to it may only mean
that the remote side has requested authentication but not encryption. On
the other hand, the encrypt change event is a certain guarantee that
encryption is enabled. The real encryption state is already tracked in
the conn->link_mode variable through the HCI_LM_ENCRYPT bit.

This patch fixes a check for encryption in the hci_conn_auth function to
use the proper conn->link_mode value and thereby eliminates the chance
of a false positive result.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: stable@vger.kernel.org

Bluetooth: Fix incorrectly overriding conn->src_type

The src_type member of struct hci_conn should always reflect the address
type of the src_member. It should never be overridden. There is already
code in place in the command status handler of HCI_LE_Create_Connection
to copy the right initiator address into conn->init_addr_type.

Without this patch, if privacy is enabled, we will send the wrong
address type in the SMP identity address information PDU (it'll e.g.
contain our public address but a random address type).

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: stable@vger.kernel.org

Merge git://git./linux/kernel/git/davem/net-next

Pull networking updates from David Miller:

1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov.

2) Multiqueue support in xen-netback and xen-netfront, from Andrew J
    Benniston.

3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn
    Mork.

4) BPF now has a "random" opcode, from Chema Gonzalez.

5) Add more BPF documentation and improve test framework, from Daniel
    Borkmann.

6) Support TCP fastopen over ipv6, from Daniel Lee.

7) Add software TSO helper functions and use them to support software
    TSO in mvneta and mv643xx_eth drivers.  From Ezequiel Garcia.

8) Support software TSO in fec driver too, from Nimrod Andy.

9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli.

10) Handle broadcasts more gracefully over macvlan when there are large
    numbers of interfaces configured, from Herbert Xu.

11) Allow more control over fwmark used for non-socket based responses,
    from Lorenzo Colitti.

12) Do TCP congestion window limiting based upon measurements, from Neal
    Cardwell.

13) Support busy polling in SCTP, from Neal Horman.

14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru.

15) Bridge promisc mode handling improvements from Vlad Yasevich.

16) Don't use inetpeer entries to implement ID generation any more, it
    performs poorly, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
  rtnetlink: fix userspace API breakage for iproute2 < v3.9.0
  tcp: fixing TLP's FIN recovery
  net: fec: Add software TSO support
  net: fec: Add Scatter/gather support
  net: fec: Increase buffer descriptor entry number
  net: fec: Factorize feature setting
  net: fec: Enable IP header hardware checksum
  net: fec: Factorize the .xmit transmit function
  bridge: fix compile error when compiling without IPv6 support
  bridge: fix smatch warning / potential null pointer dereference
  via-rhine: fix full-duplex with autoneg disable
  bnx2x: Enlarge the dorq threshold for VFs
  bnx2x: Check for UNDI in uncommon branch
  bnx2x: Fix 1G-baseT link
  bnx2x: Fix link for KR with swapped polarity lane
  sctp: Fix sk_ack_backlog wrap-around problem
  net/core: Add VF link state control policy
  net/fsl: xgmac_mdio is dependent on OF_MDIO
  net/fsl: Make xgmac_mdio read error message useful
  net_sched: drr: warn when qdisc is not work conserving
  ...

Merge tag 'dm-3.16-changes' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:
"This pull request is later than I'd have liked because I was waiting
  for some performance data to help finally justify sending the
  long-standing dm-crypt cpu scalability improvements upstream.

  Unfortunately we came up short, so those dm-crypt changes will
  continue to wait, but it seems we're not far off.

   . Add dm_accept_partial_bio interface to DM core to allow DM targets
     to only process a portion of a bio, the remainder being sent in the
     next bio.  This enables the old dm snapshot-origin target to only
     split write bios on chunk boundaries, read bios are now sent to the
     origin device unchanged.

   . Add DM core support for disabling WRITE SAME if the underlying SCSI
     layer disables it due to command failure.

   . Reduce lock contention in DM's bio-prison.

   . A few small cleanups and fixes to dm-thin and dm-era"

* tag 'dm-3.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm thin: update discard_granularity to reflect the thin-pool blocksize
  dm bio prison: implement per bucket locking in the dm_bio_prison hash table
  dm: remove symbol export for dm_set_device_limits
  dm: disable WRITE SAME if it fails
  dm era: check for a non-NULL metadata object before closing it
  dm thin: return ENOSPC instead of EIO when error_if_no_space enabled
  dm thin: cleanup noflush_work to use a proper completion
  dm snapshot: do not split read bios sent to snapshot-origin target
  dm snapshot: allocate a per-target structure for snapshot-origin target
  dm: introduce dm_accept_partial_bio
  dm: change sector_count member in clone_info from sector_t to unsigned

Merge tag 'pci-v3.16-changes-2' of git://git./linux/kernel/git/helgaas/pci

Pull more PCI updates from Bjorn Helgaas:
"Here are some more things I'd like to see in v3.16-rc1:

   - DMA alias iterator, part of some work to fix IOMMU issues
   - MVEBU, Tegra, DesignWare changes that I forgot to include before
   - Some whitespace code cleanup

  Details:

  IOMMU
    - Add DMA alias iterator (Alex Williamson)
    - Add DMA alias quirks for ASMedia, ITE, Tundra bridges (Alex Williamson)
    - Add DMA alias quirks for Marvell, Ricoh devices (Alex Williamson)
    - Add DMA alias quirk for HighPoint devices (Jérôme Carretero)

  MSI
    - Fix leak in free_msi_irqs() (Alexei Starovoitov)

  Marvell MVEBU
    - Remove unnecessary use of 'conf_lock' spinlock (Andrew Murray)
    - Avoid setting an undefined window size (Jason Gunthorpe)
    - Allow several windows with the same target/attribute (Thomas Petazzoni)
    - Split PCIe BARs into multiple MBus windows when needed (Thomas Petazzoni)
    - Fix off-by-one in the computed size of the mbus windows (Willy Tarreau)

  NVIDIA Tegra
    - Use new OF interrupt mapping when possible (Lucas Stach)

  Synopsys DesignWare
    - Remove unnecessary use of 'conf_lock' spinlock (Andrew Murray)
    - Use new OF interrupt mapping when possible (Lucas Stach)
    - Split Exynos and i.MX bindings (Lucas Stach)
    - Fix comment for setting number of lanes (Mohit Kumar)
    - Fix iATU programming for cfg1, io and mem viewport (Mohit Kumar)

  Miscellaneous
    - EXPORT_SYMBOL cleanup (Ryan Desfosses)
    - Whitespace cleanup (Ryan Desfosses)
    - Merge multi-line quoted strings (Ryan Desfosses)"

* tag 'pci-v3.16-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (21 commits)
  PCI: Add function 1 DMA alias quirk for HighPoint RocketRaid 642L
  PCI/MSI: Fix memory leak in free_msi_irqs()
  PCI: Merge multi-line quoted strings
  PCI: Whitespace cleanup
  PCI: Move EXPORT_SYMBOL so it immediately follows function/variable
  PCI: Add bridge DMA alias quirk for ITE bridge
  PCI: designware: Split Exynos and i.MX bindings
  PCI: Add bridge DMA alias quirk for ASMedia and Tundra bridges
  PCI: Add support for PCIe-to-PCI bridge DMA alias quirks
  PCI: Add function 1 DMA alias quirk for Marvell devices
  PCI: Add function 0 DMA alias quirk for Ricoh devices
  PCI: Add support for DMA alias quirks
  PCI: Convert pci_dev_flags definitions to bit shifts
  PCI: Add DMA alias iterator
  PCI: mvebu: Use '%pa' for printing 'phys_addr_t' type
  PCI: mvebu: Remove unnecessary use of 'conf_lock' spinlock
  PCI: designware: Remove unnecessary use of 'conf_lock' spinlock
  PCI: designware: Use new OF interrupt mapping when possible
  PCI: designware: Fix iATU programming for cfg1, io and mem viewport
  PCI: designware: Fix comment for setting number of lanes
  ...

Merge tag 'pm+acpi-3.16-rc1-2' of git://git./linux/kernel/git/rafael/linux-pm

Pull more ACPI and power management updates from Rafael Wysocki:
"These are fixups on top of the previous PM+ACPI pull request,
  regression fixes (ACPI hotplug, cpufreq ppc-corenet), other bug fixes
  (ACPI reset, cpufreq), new PM trace points for system suspend
  profiling and a copyright notice update.

  Specifics:

   - I didn't remember correctly that the Hans de Goede's ACPI video
     patches actually didn't flip the video.use_native_backlight
     default, although we had discussed that and decided to do that.
     Since I said we would do that in the previous PM+ACPI pull request,
     make that change for real now.

   - ACPI bus check notifications for PCI host bridges don't cause the
     bus below the host bridge to be checked for changes as they should
     because of a mistake in the ACPI-based PCI hotplug (ACPIPHP)
     subsystem that forgets to add hotplug contexts to PCI host bridge
     ACPI device objects.  Create hotplug contexts for PCI host bridges
     too as appropriate.

   - Revert recent cpufreq commit related to the big.LITTLE cpufreq
     driver that breaks arm64 builds.

   - Fix for a regression in the ppc-corenet cpufreq driver introduced
     during the 3.15 cycle and causing the driver to use the remainder
     from do_div instead of the quotient.  From Ed Swarthout.

   - Resets triggered by panic activate a BUG_ON() in vmalloc.c on
     systems where the ACPI reset register is located in memory address
     space.  Fix from Randy Wright.

   - Fix for a problem with cpufreq governors that decisions made by
     them may be suboptimal due to the fact that deferrable timers are
     used by them for CPU load sampling.  From Srivatsa S Bhat.

   - Fix for a problem with the Tegra cpufreq driver where the CPU
     frequency is temporarily switched to a "stable" level that is
     different from both the initial and target frequencies during
     transitions which causes udelay() to expire earlier than it should
     sometimes.  From Viresh Kumar.

   - New trace points and rework of some existing trace points for
     system suspend/resume profiling from Todd Brandt.

   - Assorted cpufreq fixes and cleanups from Stratos Karafotis and
     Viresh Kumar.

   - Copyright notice update for suspend-and-cpuhotplug.txt from
     Srivatsa S Bhat"

* tag 'pm+acpi-3.16-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / hotplug / PCI: Add hotplug contexts to PCI host bridges
  PM / sleep: trace events for device PM callbacks
  cpufreq: cpufreq-cpu0: remove dependency on THERMAL and REGULATOR
  cpufreq: tegra: update comment for clarity
  cpufreq: intel_pstate: Remove duplicate CPU ID check
  cpufreq: Mark CPU0 driver with CPUFREQ_NEED_INITIAL_FREQ_CHECK flag
  PM / Documentation: Update copyright in suspend-and-cpuhotplug.txt
  cpufreq: governor: remove copy_prev_load from 'struct cpu_dbs_common_info'
  cpufreq: governor: Be friendly towards latency-sensitive bursty workloads
  PM / sleep: trace events for suspend/resume
  cpufreq: ppc-corenet-cpu-freq: do_div use quotient
  Revert "cpufreq: Enable big.LITTLE cpufreq driver on arm64"
  cpufreq: Tegra: implement intermediate frequency callbacks
  cpufreq: add support for intermediate (stable) frequencies
  ACPI / video: Change the default for video.use_native_backlight to 1
  ACPI: Fix bug when ACPI reset register is implemented in system memory

Merge branch 'for-next' of git://git./linux/kernel/git/cooloney/linux-leds

Pull LED updates from Bryan Wu:
"I just found merge window is open and I'm quite busy and almost forget
  to send out this pull request.  Thanks Russell and Alexandre ping me
  about this.

  So basically we got some clean up and leds-pwm fixing patches from
  Russell"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds:
  leds: Remove duplicated OOM message for individual driver
  drivers/leds: Replace __get_cpu_var use through this_cpu_ptr
  leds: lp55xx: add DT bindings for LP55231
  leds: 88pm860x: Fix missing refcount decrement for parent of_node
  leds: 88pm860x: Use of_get_child_by_name
  leds: leds-pwm: add DT support for LEDs wired to supply
  leds: leds-pwm: implement PWM inversion
  leds: leds-pwm: convert OF parsing code to use led_pwm_add()
  leds: leds-pwm: provide a common function to setup a single led-pwm device
  leds: pca9685: Remove leds-pca9685 driver
  dell-led: add mic mute led interface

Merge tag 'backlight-for-linus-3.16' of git://git./linux/kernel/git/lee/backlight

Pull backlight fixes from Lee Jones:
"This merely contains some very basic build/run-time bug fixes"

* tag 'backlight-for-linus-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
  backlight: gpio-backlight: Fix warning when the GPIO is on a I2C chip
  video/backlight: s6e63m0: Fix string type mismatch
  video/backlight: LP8788 needs PWM
  video/backlight: LP855X needs PWM
  video/pxa: LCD_CORGI needs BACKLIGHT_CLASS_DEVICE
  video/backlight: LM3630A needs PWM

Merge tag 'mfd-for-linus-3.16-1' of git://git./linux/kernel/git/lee/mfd

Pull more MFD updates from Lee Jones:
"I missed collecting these patches due to a branch/tag naming
  ambiguity.  Completely my own fault, as I mindlessly named a branch
  and tag identically.  Sorry for the fuss.

  This pull-request contains some misplaced patches from Tony Lindgren
  that should have been part of the initial one"

* tag 'mfd-for-linus-3.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
  mfd: twl4030-power: Add a configuration to turn off oscillator during off-idle
  mfd: twl4030-power: Add support for board specific configuration
  mfd: twl4030-power: Add recommended idle configuration
  mfd: twl4030-power: Add generic reset configuration
  mfd: twl4030-power: Fix some defines for SW_EVENTS
  mfd: twl4030-power: Fix hang on reboot if sleep configuration was loaded earlier

Merge tag 'mmc-v3.16-2' of git://git.linaro.org/people/ulf.hansson/mmc

Pull MMC fixes from Ulf Hansson:
"Here are some mmc fixes for 3.16.

   - fix some various compiler warnings
   - make atmel-mci compile again
   - fix regression for sdhci-msm"

* tag 'mmc-v3.16-2' of git://git.linaro.org/people/ulf.hansson/mmc:
  mmc: simplify SDHCI Kconfig dependencies
  mmc: omap: don't select TPS65010
  mmc: mvsdio: avoid compiler warning
  mmc: atmel-mci: incude asm/cacheclush.h
  mmc: sdhci-msm: Fix fallout from sdhci refactoring
  mmc: usdhi6rol0: fix compiler warnings

Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux

Pull drm updates from Dave Airlie:
"This is the main drm merge window pull request, changes all over the
  place, mostly normal levels of churn.

  Highlights:

  Core drm:
     More cleanups, fix race on connector/encoder naming, docs updates,
     object locking rework in prep for atomic modeset

  i915:
     mipi DSI support, valleyview power fixes, cursor size fixes,
     execlist refactoring, vblank improvements, userptr support, OOM
     handling improvements

  radeon:
     GPUVM tuning and large page size support, gart fixes, deep color
     HDMI support, HDMI audio cleanups

  nouveau:
     - displayport rework should fix lots of issues
     - initial gk20a support
     - gk110b support
     - gk208 fixes

  exynos:
     probe order fixes, HDMI changes, IPP consolidation

  msm:
     debugfs updates, misc fixes

  ast:
     ast2400 support, sync with UMS driver

  tegra:
     cleanups, hdmi + hw cursor for Tegra 124.

  panel:
     fixes existing panels add some new ones.

  ipuv3:
     moved from staging to drivers/gpu"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (761 commits)
  drm/nouveau/disp/dp: fix tmds passthrough on dp connector
  drm/nouveau/dp: probe dpcd to determine connectedness
  drm/nv50-: trigger update after all connectors disabled
  drm/nv50-: prepare for attaching a SOR to multiple heads
  drm/gf119-/disp: fix debug output on update failure
  drm/nouveau/disp/dp: make use of postcursor when its available
  drm/g94-/disp/dp: take max pullup value across all lanes
  drm/nouveau/bios/dp: parse lane postcursor data
  drm/nouveau/dp: fix support for dpms
  drm/nouveau: register a drm_dp_aux channel for each dp connector
  drm/g94-/disp: add method to power-off dp lanes
  drm/nouveau/disp/dp: maintain link in response to hpd signal
  drm/g94-/disp: bash and wait for something after changing lane power regs
  drm/nouveau/disp/dp: split link config/power into two steps
  drm/nv50/disp: train PIOR-attached DP from second supervisor
  drm/nouveau/disp/dp: make use of existing output data for link training
  drm/gf119/disp: start removing direct vbios parsing from supervisor
  drm/nv50/disp: start removing direct vbios parsing from supervisor
  drm/nouveau/disp/dp: maintain receiver caps in response to hpd signal
  drm/nouveau/disp/dp: create subclass for dp outputs
  ...

rtnetlink: fix userspace API breakage for iproute2 < v3.9.0

When running RHEL6 userspace on a current upstream kernel, "ip link"
fails to show VF information.

The reason is a kernel<->userspace API change introduced by commit
88c5b5ce5cb57 ("rtnetlink: Call nlmsg_parse() with correct header length"),
after which the kernel does not see iproute2's IFLA_EXT_MASK attribute
in the netlink request.

iproute2 adjusted for the API change in its commit 63338dca4513
("libnetlink: Use ifinfomsg instead of rtgenmsg in rtnl_wilddump_req_filter").

The problem has been noticed before:
http://marc.info/?l=linux-netdev&m=136692296022182&w=2
(Subject: Re: getting VF link info seems to be broken in 3.9-rc8)

We can do better than tell those with old userspace to upgrade. We can
recognize the old iproute2 in the kernel by checking the netlink message
length. Even when including the IFLA_EXT_MASK attribute, its netlink
message is shorter than struct ifinfomsg.

With this patch "ip link" shows VF information in both old and new
iproute2 versions.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: fixing TLP's FIN recovery

Fix to a problem observed when losing a FIN segment that does not
contain data. In such situations, TLP is unable to recover from
*any* tail loss and instead adds at least PTO ms to the
retransmission process, i.e., RTO = RTO + PTO.

Signed-off-by: Per Hurtig <per.hurtig@kau.se>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Nandita Dukkipati <nanditad@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'fec'

Fugang Duan says:

====================
net: fec: Enable Software TSO to improve the tx performance

Add SG and software TSO support for FEC.
This feature allows to improve outbound throughput performance.
Tested on imx6dl sabresd board, running iperf tcp tests shows:
        * 82% improvement comparing with NO SG & TSO patch

$ ethtool -K eth0 sg on
$ ethtool -K eth0 tso on
[  3] local 10.192.242.108 port 35388 connected with 10.192.242.167 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 3.0 sec   181 MBytes   506 Mbits/sec
* cpu loading is 30%

$ ethtool -K eth0 sg off
$ ethtool -K eth0 tso off
[  3] local 10.192.242.108 port 52618 connected with 10.192.242.167 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 3.0 sec  99.5 MBytes   278 Mbits/sec

FEC HW support IP header and TCP/UDP hw checksum, support multi buffer descriptor transfer
one frame, but don't support HW TSO. And imx6q/dl SOC FEC Gbps speed has HW bus Bandwidth
limitation (400Mbps ~ 700Mbps), imx6sx SOC FEC Gbps speed has no HW bandwidth limitation.

The patch set just enable TSO feature, which is done following the mv643xx_eth driver.

Test result analyze:
imx6dl sabresd board: there have 82% improvement, since imx6dl FEC HW has bandwidth limitation,
                      the performance with SW TSO is a milestone.

Addition test:
imx6sx sdb board:
upstream still don't support imx6sx due to some patches being upstream... they use same FEC IP.
Use the SW TSO patches test imx6sx sdb board in internal kernel tree:
No SW TSO patch: tx bandwidth 840Mbps, cpu loading is 100%.
SW TSO patch:    tx bandwidth 942Mbps, cpu loading is 65%.
It means the patch set have great improvement for imx6sx FEC performance.

V2:
* From Frank Li's suggestion:
Change the API "fec_enet_txdesc_entry_free" name to "fec_enet_get_free_txdesc_num".
* Summary David Laight and Eric Dumazet's thoughts:
RX BD entry number change to 256.
* From ezequiel's suggestion:
Follow the latest TSO fixes from his solution to rework the queue stop/wake-up.
Avoid unmapping the TSO header buffers.
* From Eric Dumazet's suggestion:
Avoid more bytes copy, just copying the unaligned part of the payload into first
descriptor. The suggestion will bring more complex for the driver, and imx6dl FEC
DMA need 16 bytes alignment, but cpu loading is not problem that cpu loading is
30%, the current performance is so better. Later chip like imx6sx Gigbit FEC DMA
support byte alignment, so there don't exist memory copy. So, the V2 version drop
the suggestion.
Anyway, thanks for Eric's response and suggestion.

V3:
* From David Laight's feedback:
Decide to drop RX BD entry number change for the SW TSO patch set.
I will generate one separate patch to increase RX BDs entry for interrupt coalescing feature which
will be supported in my later patch set.

V4:
* From David Laight's feedback:
Remove the conditional in .fec_enet_get_bd_index().

V5:
* Patch #4 update:
  From David Laight's feedback:
"expect fec_enet_get_free_txdesc_num() to return one less than it does currently."
Change the function:
Return space available, 0..size-1.  it always leave one free entry. Which is same as linux circ_buf.

Thanks for Eric and ezequiel's help and idea.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: fec: Add software TSO support

Add software TSO support for FEC.
This feature allows to improve outbound throughput performance.

Tested on imx6dl sabresd board, running iperf tcp tests shows:
- 16.2% improvement comparing with FEC SG patch
- 82% improvement comparing with NO SG & TSO patch

$ ethtool -K eth0 tso on
$ iperf -c 10.192.242.167 -t 3 &
[  3] local 10.192.242.108 port 35388 connected with 10.192.242.167 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 3.0 sec   181 MBytes   506 Mbits/sec

During the testing, CPU loading is 30%.
Since imx6dl FEC Bandwidth is limited to SOC system bus bandwidth, the
performance with SW TSO is a milestone.

CC: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: David Laight <David.Laight@ACULAB.COM>
CC: Li Frank <B20596@freescale.com>
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fec: Add Scatter/gather support

Add Scatter/gather support for FEC.
This feature allows to improve outbound throughput performance.

Tested on imx6dl sabresd board:
Running iperf tests shows a 55.4% improvement.

$ ethtool -K eth0 sg off
$ iperf -c 10.192.242.167 -t 3 &
[  3] local 10.192.242.108 port 52618 connected with 10.192.242.167 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 3.0 sec  99.5 MBytes   278 Mbits/sec

$ ethtool -K eth0 sg on
$ iperf -c 10.192.242.167 -t 3 &
[  3] local 10.192.242.108 port 52617 connected with 10.192.242.167 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 3.0 sec   154 MBytes   432 Mbits/sec

CC: Li Frank <B20596@freescale.com>
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fec: Increase buffer descriptor entry number

In order to support SG, software TSO, let's increase BD entry number.

CC: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fec: Factorize feature setting

In order to enhance the code readable, let's factorize the
feature list.

Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fec: Enable IP header hardware checksum

IP header checksum is calcalated by network layer in default.
To support software TSO, it is better to use HW calculate the
IP header checksum.

FEC hw checksum feature request the checksum field in frame
is zero, otherwise the calculative CRC is not correct.

For segmentated TCP packet, HW calculate the IP header checksum again,
it doesn't bring any impact. For SW TSO, HW calculated checksum bring
better performance.

Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fec: Factorize the .xmit transmit function

Make the code more readable and easy to support other features like
SG, TSO, moving the common transmit function to one api.

And the patch also factorize the getting BD index to it own function.

CC: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: fix compile error when compiling without IPv6 support

Some fields in "struct net_bridge" aren't available when compiling the
kernel without IPv6 support. Therefore adding a check/macro to skip the
complaining code sections in that case.

Introduced by 2cd4143192e8c60f66cb32c3a30c76d0470a372d
("bridge: memorize and export selected IGMP/MLD querier port")

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: fix smatch warning / potential null pointer dereference

"New smatch warnings:
net/bridge/br_multicast.c:1368 br_ip6_multicast_query() error:
we previously assumed 'group' could be null (see line 1349)"

In the rare (sort of broken) case of a query having a Maximum
Response Delay of zero, we could create a potential null pointer
dereference.

Fixing this by skipping the multicast specific MLD Query parsing again
if no multicast group address is available.

Introduced by dc4eb53a996a78bfb8ea07b47423ff5a3aadc362
("bridge: adhere to querier election mechanism specified by RFCs")

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

via-rhine: fix full-duplex with autoneg disable

With some specific configuration (VT6105M on Soekris 5510 and depending
on the device at the other end), fragmented packets were not transmitted
when forcing 100 full-duplex with autoneg disable.

This fix now write full-duplex chips register when forcing full or
half-duplex not only when autoneg is enable.

Signed-off-by: François Cachereul <f.cachereul@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs

Pull vfs updates from Al Viro:
"This the bunch that sat in -next + lock_parent() fix.  This is the
  minimal set; there's more pending stuff.

  In particular, I really hope to get acct.c fixes merged this cycle -
  we need that to deal sanely with delayed-mntput stuff.  In the next
  pile, hopefully - that series is fairly short and localized
  (kernel/acct.c, fs/super.c and fs/namespace.c).  In this pile: more
  iov_iter work.  Most of prereqs for ->splice_write with sane locking
  order are there and Kent's dio rewrite would also fit nicely on top of
  this pile"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (70 commits)
  lock_parent: don't step on stale ->d_parent of all-but-freed one
  kill generic_file_splice_write()
  ceph: switch to iter_file_splice_write()
  shmem: switch to iter_file_splice_write()
  nfs: switch to iter_splice_write_file()
  fs/splice.c: remove unneeded exports
  ocfs2: switch to iter_file_splice_write()
  ->splice_write() via ->write_iter()
  bio_vec-backed iov_iter
  optimize copy_page_{to,from}_iter()
  bury generic_file_aio_{read,write}
  lustre: get rid of messing with iovecs
  ceph: switch to ->write_iter()
  ceph_sync_direct_write: stop poking into iov_iter guts
  ceph_sync_read: stop poking into iov_iter guts
  new helper: copy_page_from_iter()
  fuse: switch to ->write_iter()
  btrfs: switch to ->write_iter()
  ocfs2: switch to ->write_iter()
  xfs: switch to ->write_iter()
  ...

Merge branch 'bnx2x'

Yuval Mintz says:

====================
bnx2x: Bug fixes patch series

This patch series contains various bug fixes - 2 link related fixes,
one sriov-related issue and an additional fix for a theoretical bug
on new boards.

Please consider applying these patches to `net'.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Enlarge the dorq threshold for VFs

A malicious VF might try to starve the other VFs & PF by creating
contineous doorbell floods. In order to negate this, HW has a threshold of
doorbells per client, which will stop the client doorbells from arriving
if crossed.

The threshold currently configured for VFs is too low - under extreme traffic
scenarios, it's possible for a VF to reach the threshold and thus for its
fastpath to stop working.

Signed-off-by: Ariel Elior <ariel.elior@qlogic.com>
Signed-off-by: Yuval Mintz <yuval.mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Check for UNDI in uncommon branch

If L2FW utilized by the UNDI driver has the same version number as that
of the regular FW, a driver loading after UNDI and receiving an uncommon
answer from management will mistakenly assume the loaded FW matches its
own requirement and try to exist the flow via FLR.

Signed-off-by: Yuval Mintz <yuval.mintz@qlogic.com>
Signed-off-by: Ariel Elior <ariel.elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Fix 1G-baseT link

Set the phy access mode even in case of link-flap avoidance.

Signed-off-by: Yaniv Rosner <yaniv.rosner@qlogic.com>
Signed-off-by: Yuval Mintz <yuval.mintz@qlogic.com>
Signed-off-by: Ariel Elior <ariel.elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Fix link for KR with swapped polarity lane

This avoids clearing the RX polarity setting in KR mode when polarity lane
is swapped, as otherwise this will result in failed link.

Signed-off-by: Yaniv Rosner <yaniv.rosner@qlogic.com>
Signed-off-by: Yuval Mintz <yuval.mintz@qlogic.com>
Signed-off-by: Ariel Elior <ariel.elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: Fix sk_ack_backlog wrap-around problem

Consider the scenario:
For a TCP-style socket, while processing the COOKIE_ECHO chunk in
sctp_sf_do_5_1D_ce(), after it has passed a series of sanity check,
a new association would be created in sctp_unpack_cookie(), but afterwards,
some processing maybe failed, and sctp_association_free() will be called to
free the previously allocated association, in sctp_association_free(),
sk_ack_backlog value is decremented for this socket, since the initial
value for sk_ack_backlog is 0, after the decrement, it will be 65535,
a wrap-around problem happens, and if we want to establish new associations
afterward in the same socket, ABORT would be triggered since sctp deem the
accept queue as full.
Fix this issue by only decrementing sk_ack_backlog for associations in
the endpoint's list.

Fix-suggested-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'pm-sleep'

* pm-sleep:
PM / sleep: trace events for device PM callbacks
PM / sleep: trace events for suspend/resume

Merge branch 'pm-cpufreq'

* pm-cpufreq:
  cpufreq: cpufreq-cpu0: remove dependency on THERMAL and REGULATOR
  cpufreq: tegra: update comment for clarity
  cpufreq: intel_pstate: Remove duplicate CPU ID check
  cpufreq: Mark CPU0 driver with CPUFREQ_NEED_INITIAL_FREQ_CHECK flag
  cpufreq: governor: remove copy_prev_load from 'struct cpu_dbs_common_info'
  cpufreq: governor: Be friendly towards latency-sensitive bursty workloads
  cpufreq: ppc-corenet-cpu-freq: do_div use quotient
  Revert "cpufreq: Enable big.LITTLE cpufreq driver on arm64"
  cpufreq: Tegra: implement intermediate frequency callbacks
  cpufreq: add support for intermediate (stable) frequencies

Merge branches 'acpi-general' and 'acpi-video'

* acpi-general:
ACPI: Fix bug when ACPI reset register is implemented in system memory

* acpi-video:
ACPI / video: Change the default for video.use_native_backlight to 1

Merge branch 'acpi-hotplug'

* acpi-hotplug:
ACPI / hotplug / PCI: Add hotplug contexts to PCI host bridges

mmc: simplify SDHCI Kconfig dependencies

We have a number of front-end drivers for SDHCI_PLTFM, some of them
use 'select MMC_SDHCI_PLTFM', others use 'depends on'. This is
inconsistent and confusing, and in one case has also led to a
build error because of incomplete dependencies:

warning: (MMC_SDHCI_PXAV3 && MMC_SDHCI_PXAV2 && MMC_SDHCI_BCM_KONA) selects MMC_SDHCI_PLTFM which has unmet direct dependencies (MMC && MMC_SDHCI)
drivers/built-in.o: In function `sdhci_sirf_resume':
:(.text+0xaaacb4): undefined reference to `sdhci_resume_host'
drivers/built-in.o: In function `sdhci_sirf_suspend':
:(.text+0xaaacf8): undefined reference to `sdhci_suspend_host'
drivers/built-in.o: In function `sdhci_sirf_probe':
:(.text+0xaaaf44): undefined reference to `sdhci_add_host'
:(.text+0xaaaf50): undefined reference to `sdhci_remove_host'

This changes Kconfig to use 'depends on MMC_SDHCI_PLTFM' for all these
cases, to fix the build error and make the logic more logical.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

mmc: omap: don't select TPS65010

The MMC host driver should not select the pmic driver, since that
may have other dependencies, notably i2c in this case. It's not
clear what the exact requirement of the driver is, but to preserve
the behavior, this patch changes the 'select' into 'depends on',
meaning you now have to turn on TPS65010 explicitly and then
MMC_OMAP.

Found during randconfig build testing.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: linux-omap@vger.kernel.org
Cc: Jarkko Nikula <jarkko.nikula@bitmer.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

mmc: mvsdio: avoid compiler warning

gcc correctly points out that hw_state can be used uninitially
in the mvsd_setup_data() function. This rearranges the function
to ensure it always contains a proper value.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Chris Ball <chris@printf.net>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: linux-mmc@vger.kernel.org
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

mmc: atmel-mci: incude asm/cacheclush.h

This avoids a build error due to the use of flush_dcache_page.

drivers/mmc/host/atmel-mci.c: In function 'atmci_read_data_pio':
drivers/mmc/host/atmel-mci.c:1870:5: error: implicit declaration of function 'flush_dcache_page' [-Werror=implicit-function-declaration]
flush_dcache_page(sg_page(sg));
^

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Acked-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

mmc: sdhci-msm: Fix fallout from sdhci refactoring

The sdhci core was refactored recently and some of those
refactorings required changes in every sdhci platform driver.
Those updates happened around the same time as when the msm
driver was merged so the refactorings missed the msm driver.
Hook in the basic library functions so that we can boot apq8074
dragonboards again instead of crashing when we try to jump to
NULL function pointers.

Reported-by: Kevin Hilman <khilman@linaro.org>
Cc: Georgi Djakov <gdjakov@mm-sol.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Kumar Gala <galak@codeaurora.org>
Reviewed-by: Georgi Djakov <gdjakov@mm-sol.com>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

mmc: usdhi6rol0: fix compiler warnings

Fix a number of wrong print formats.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski+renesas@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

lock_parent: don't step on stale ->d_parent of all-but-freed one

Dentry that had been through (or into) __dentry_kill() might be seen
by shrink_dentry_list(); that's normal, it'll be taken off the shrink
list and freed if __dentry_kill() has already finished. The problem
is, its ->d_parent might be pointing to already freed dentry, so
lock_parent() needs to be careful.

We need to check that dentry hasn't already gone into __dentry_kill()
*and* grab rcu_read_lock() before dropping ->d_lock - the latter makes
sure that whatever we see in ->d_parent after dropping ->d_lock it
won't be freed until we drop rcu_read_lock().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Merge commit '9f12600fe425bc28f0ccba034a77783c09c15af4' into for-linus

Backmerge of dcache.c changes from mainline. It's that, or complete
rebase...

Conflicts:
fs/splice.c

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

kill generic_file_splice_write()

no callers left

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ceph: switch to iter_file_splice_write()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

shmem: switch to iter_file_splice_write()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

nfs: switch to iter_splice_write_file()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fs/splice.c: remove unneeded exports

ocfs2 was using a bunch of splice.c guts...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ocfs2: switch to iter_file_splice_write()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

->splice_write() via ->write_iter()

iter_file_splice_write() - a ->splice_write() instance that gathers the
pipe buffers, builds a bio_vec-based iov_iter covering those and feeds
it to ->write_iter(). A bunch of simple cases coverted to that...

[AV: fixed the braino spotted by Cyrill]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Merge tag 'virtio-next-for-linus' of git://git./linux/kernel/git/rusty/linux

Pull virtio updates from Rusty Russell:
"Main excitement is a virtio_scsi fix for alloc holding spinlock on the
  abort path, which I refuse to CC stable since (1) I discovered it
  myself, and (2) it's been there forever with no reports"

* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  virtio_scsi: don't call virtqueue_add_sgs(... GFP_NOIO) holding spinlock.
  virtio-rng: fixes for device registration/unregistration
  virtio-rng: fix boot with virtio-rng device
  virtio-rng: support multiple virtio-rng devices
  virtio_ccw: introduce device_lost in virtio_ccw_device
  virtio: virtio_break_device() to mark all virtqueues broken.

Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost

Pull vhost infrastructure updates from Michael S. Tsirkin:
"This reworks vhost core dropping unnecessary RCU uses in favor of VQ
  mutexes which are used on fast path anyway.  This fixes worst-case
  latency for users which change the memory mappings a lot.  Memory
  allocation for vhost-net now supports fallback on vmalloc (same as for
  vhost-scsi) this makes it possible to create the device on systems
  where memory is very fragmented, with slightly lower performance"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost: move memory pointer to VQs
  vhost: move acked_features to VQs
  vhost: replace rcu with mutex
  vhost-net: extend device allocation to vmalloc

Merge git://git./linux/kernel/git/cmetcalf/linux-tile

Pull arch/tile changes from Chris Metcalf:
"These mostly just address smaller issues reported to me"

* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  arch: tile: kernel: unaligned.c: Cleaning up uninitialized variables
  drivers/tty/hvc/hvc_tile.c: use PTR_ERR_OR_ZERO
  replace strict_strto* call with kstrto*
  tile: Update comments for generic idle conversion
  tile: cleanup the comment in init_pgprot
  tile: use BOOTMEM_DEFAULT instead of magic number 0 for reserve_bootmem flags

Merge tag 'modules-next-for-linus' of git://git./linux/kernel/git/rusty/linux

Pull module updates from Rusty Russell:
"Most of this is cleaning up various driver sysfs permissions so we can
  re-add the perm check (we unified the module param and sysfs checks,
  but the module ones were stronger so we weakened them temporarily).

  Param parsing gets documented, and also "--" now forces args to be
  handed to init (and ignored by the kernel).

  Module NX/RO protections get tightened: we now set them before calling
  parse_args()"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  module: set nx before marking module MODULE_STATE_COMING.
  samples/kobject/: avoid world-writable sysfs files.
  drivers/hid/hid-picolcd_fb: avoid world-writable sysfs files.
  drivers/staging/speakup/: avoid world-writable sysfs files.
  drivers/regulator/virtual: avoid world-writable sysfs files.
  drivers/scsi/pm8001/pm8001_ctl.c: avoid world-writable sysfs files.
  drivers/hid/hid-lg4ff.c: avoid world-writable sysfs files.
  drivers/video/fbdev/sm501fb.c: avoid world-writable sysfs files.
  drivers/mtd/devices/docg3.c: avoid world-writable sysfs files.
  speakup: fix incorrect perms on speakup_acntsa.c
  cpumask.h: silence warning with -Wsign-compare
  Documentation: Update kernel-parameters.tx
  param: hand arguments after -- straight to init
  modpost: Fix resource leak in read_dump()

Merge git://git./linux/kernel/git/davem/net

Conflicts:
net/core/rtnetlink.c
net/core/skbuff.c

Both conflicts were very simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>

net/core: Add VF link state control policy

Commit 1d8faf48c7 (net/core: Add VF link state control) added VF link state
control to the netlink VF nested structure, but failed to add a proper entry
for the new structure into the VF policy table. Add the missing entry so
the table and the actual data copied into the netlink nested struct are in
sync.

Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/fsl: xgmac_mdio is dependent on OF_MDIO

Signed-off-by: Shruti Kanetkar <Shruti@Freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/fsl: Make xgmac_mdio read error message useful

Print the device address, the register number and the PHY ID for
which the MDIO read operation failed

Signed-off-by: Shruti Kanetkar <Shruti@Freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net_sched: drr: warn when qdisc is not work conserving

The DRR scheduler requires that items on the active list are work
conserving, i.e. do not hold on to skbs for throttling purposes, etc.
Attaching e.g. tbf renders DRR useless because all other classes on the
active list are delayed as well.

So, warn users that this configuration won't work as expected; we
already do this in couple of other qdiscs, see e.g.

commit b00355db3f88d96810a60011a30cfb2c3469409d
('pkt_sched: sch_hfsc: sch_htb: Add non-work-conserving warning handler')

The 'const' change is needed to avoid compiler warning ("discards 'const'
qualifier from pointer target type").

tested with:
drr_hier() {
        parent=$1
        classes=$2
        for i in  $(seq 1 $classes); do
                classid=$parent$(printf %x $i)
                tc class add dev eth0 parent $parent classid $classid drr
tc qdisc add dev eth0 parent $classid tbf rate 64kbit burst 256kbit limit 64kbit
        done
}
tc qdisc add dev eth0 root handle 1: drr
drr_hier 1: 32
tc filter add dev eth0 protocol all pref 1 parent 1: handle 1 flow hash keys dst perturb 1 divisor 32

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'inet_csums'

Tom Herbert says:

====================
net: Checksum offload changes - Part IV

I am working on overhauling RX checksum offload. Goals of this effort
are:

- Specify what exactly it means when driver returns CHECKSUM_UNNECESSARY
- Preserve CHECKSUM_COMPLETE through encapsulation layers
- Don't do skb_checksum more than once per packet
- Unify GRO and non-GRO csum verification as much as possible
- Unify the checksum functions (checksum_init)
- Simply code

What is in this fourth patch set:

- Preserve CHECKSUM_COMPLETE instead of changing it to
  CHECKSUM_UNNECESSARY. This allows correct reuse in validating multiple
  csums in a packet.
- When SW needs to compute the packet checksum, save it as
  CHECKSUM_COMPLETE. Also mark that checksum was compute by SW.
- Add skb_gro_postpull_rcsum to udp and vxlan to make GRO work with
  CHECKSUM_COMPLETE.

v2: Removed patch setting skb_encapsulation when validating checksum
    in tcp_gro_receive

Please review carefully and test if possible, mucking with basic
checksum functions is always a little precarious :-)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: Add skb_gro_postpull_rcsum to udp and vxlan

Need to gro_postpull_rcsum for GRO to work with checksum complete.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Save software checksum complete

In skb_checksum complete, if we need to compute the checksum for the
packet (via skb_checksum) save the result as CHECKSUM_COMPLETE.
Subsequent checksum verification can use this.

Also, added csum_complete_sw flag to distinguish between software and
hardware generated checksum complete, we should always be able to trust
the software computation.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Preserve CHECKSUM_COMPLETE at validation

Currently when the first checksum in a packet is validated using
CHECKSUM_COMPLETE, ip_summed is overwritten to be CHECKSUM_UNNECESSARY
so that any subsequent checksums in the packet are not correctly
validated.

This patch adds csum_valid flag in sk_buff and uses that to indicate
validated checksum instead of setting CHECKSUM_UNNECESSARY. The bit
is set accordingly in the skb_checksum_validate_* functions. The flag
is checked in skb_checksum_complete, so that validation is communicated
between checksum_init and checksum_complete sequence in TCP and UDP.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'qlcnic-next'

Shahed Shaikh says:

====================
This series contains an enhancement in the area of firmware minidump collection
and optimization of ring count validation function.

Please apply this series to net-next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Update version to 5.3.60

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Optimize ring count validations

- Check interrupt mode at the start of qlcnic_set_channels().
- Do not validate ring count if they are not going to change.

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Pre-allocate DMA buffer used for minidump collection

Pre-allocate the physically contiguous DMA buffer used for
minidump collection at driver load time, rather than at
run time, to minimize allocation failures. Driver will allocate
the buffer at load time if PEX DMA support capability is indicated
by the adapter.

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip_vti: fix sparse warnings for VTI_ISVTI

This patch fixes the following sparse warnings:

net/ipv4/ip_tunnel.c:245:53: warning: restricted __be16 degrades to integer
net/ipv4/ip_vti.c:321:19: warning: incorrect type in assignment (different base types)
net/ipv4/ip_vti.c:321:19:    expected restricted __be16 [addressable] [assigned] [usertype] i_flags
net/ipv4/ip_vti.c:321:19:    got int
net/ipv4/ip_vti.c:447:24: warning: incorrect type in assignment (different base types)
net/ipv4/ip_vti.c:447:24:    expected restricted __be16 [usertype] i_flags
net/ipv4/ip_vti.c:447:24:    got int

Since VTI_ISVTI is always used with ip_tunnel_parm->i_flags (which is __be16),
we can __force cast VTI_ISVTI to __be16 in header file.

Signed-off-by: Dmitry Popov <ixaphire@qrator.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers: net: davinci_cpdma: double free on error

We recently change the kzalloc() to devm_kzalloc() so freeing "ctlr"
here could lead to a double free.

Fixes: e194312854ed ('drivers: net: davinci_cpdma: Convert kzalloc() to devm_kzalloc().')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

amd-xgbe: unwind on error in xgbe_mdio_register()

There is a typo here so we return directly instead of unwinding.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mrf24j40: add device managed APIs

adds the device managed APIs so that no need worry about
freeing the resources.

Signed-off-by: Varka Bhadram <varkab@cdac.in>
Signed-off-by: David S. Miller <davem@davemloft.net>

ceph: remove bogus extern

Sparse complained about this bogus extern on definition of
a function.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: filter: document internal instruction encoding

This patch adds a description of eBPFs instruction encoding in order
to bring the documentation in line with the implementation.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: filter: mention eBPF terminology as well

Since the term eBPF is used anyway on mailing list discussions, lets
also document that in the main BPF documentation file and replace a
couple of occurrences with eBPF terminology to be more clear.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: fix a race in ip4_datagram_release_cb()

Alexey gave a AddressSanitizer[1] report that finally gave a good hint
at where was the origin of various problems already reported by Dormando
in the past [2]

Problem comes from the fact that UDP can have a lockless TX path, and
concurrent threads can manipulate sk_dst_cache, while another thread,
is holding socket lock and calls __sk_dst_set() in
ip4_datagram_release_cb() (this was added in linux-3.8)

It seems that all we need to do is to use sk_dst_check() and
sk_dst_set() so that all the writers hold same spinlock
(sk->sk_dst_lock) to prevent corruptions.

TCP stack do not need this protection, as all sk_dst_cache writers hold
the socket lock.

[1]
https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel

AddressSanitizer: heap-use-after-free in ipv4_dst_check
Read of size 2 by thread T15453:
[<ffffffff817daa3a>] ipv4_dst_check+0x1a/0x90 ./net/ipv4/route.c:1116
[<ffffffff8175b789>] __sk_dst_check+0x89/0xe0 ./net/core/sock.c:531
[<ffffffff81830a36>] ip4_datagram_release_cb+0x46/0x390 ??:0
[<ffffffff8175eaea>] release_sock+0x17a/0x230 ./net/core/sock.c:2413
[<ffffffff81830882>] ip4_datagram_connect+0x462/0x5d0 ??:0
[<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
[<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
[<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682
[<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b
./arch/x86/kernel/entry_64.S:629

Freed by thread T15455:
[<ffffffff8178d9b8>] dst_destroy+0xa8/0x160 ./net/core/dst.c:251
[<ffffffff8178de25>] dst_release+0x45/0x80 ./net/core/dst.c:280
[<ffffffff818304c1>] ip4_datagram_connect+0xa1/0x5d0 ??:0
[<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
[<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
[<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682
[<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b
./arch/x86/kernel/entry_64.S:629

Allocated by thread T15453:
[<ffffffff8178d291>] dst_alloc+0x81/0x2b0 ./net/core/dst.c:171
[<ffffffff817db3b7>] rt_dst_alloc+0x47/0x50 ./net/ipv4/route.c:1406
[<     inlined    >] __ip_route_output_key+0x3e8/0xf70
__mkroute_output ./net/ipv4/route.c:1939
[<ffffffff817dde08>] __ip_route_output_key+0x3e8/0xf70 ./net/ipv4/route.c:2161
[<ffffffff817deb34>] ip_route_output_flow+0x14/0x30 ./net/ipv4/route.c:2249
[<ffffffff81830737>] ip4_datagram_connect+0x317/0x5d0 ??:0
[<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
[<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
[<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682
[<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b
./arch/x86/kernel/entry_64.S:629

[2]
<4>[196727.311203] general protection fault: 0000 [#1] SMP
<4>[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP macvlan bridge coretemp crc32_pclmul ghash_clmulni_intel gpio_ich microcode ipmi_watchdog ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm tpm_bios ipmi_si ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp pps_core mdio
<4>[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted 3.10.26 #1
<4>[196727.311344] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013
<4>[196727.311364] task: ffff885e6f069700 ti: ffff885e6f072000 task.ti: ffff885e6f072000
<4>[196727.311377] RIP: 0010:[<ffffffff815f8c7f>]  [<ffffffff815f8c7f>] ipv4_dst_destroy+0x4f/0x80
<4>[196727.311399] RSP: 0018:ffff885effd23a70  EFLAGS: 00010282
<4>[196727.311409] RAX: dead000000200200 RBX: ffff8854c398ecc0 RCX: 0000000000000040
<4>[196727.311423] RDX: dead000000100100 RSI: dead000000100100 RDI: dead000000200200
<4>[196727.311437] RBP: ffff885effd23a80 R08: ffffffff815fd9e0 R09: ffff885d5a590800
<4>[196727.311451] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
<4>[196727.311464] R13: ffffffff81c8c280 R14: 0000000000000000 R15: ffff880e85ee16ce
<4>[196727.311510] FS:  0000000000000000(0000) GS:ffff885effd20000(0000) knlGS:0000000000000000
<4>[196727.311554] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[196727.311581] CR2: 00007a46751eb000 CR3: 0000005e65688000 CR4: 00000000000407e0
<4>[196727.311625] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[196727.311669] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>[196727.311713] Stack:
<4>[196727.311733]  ffff8854c398ecc0 ffff8854c398ecc0 ffff885effd23ab0 ffffffff815b7f42
<4>[196727.311784]  ffff88be6595bc00 ffff8854c398ecc0 0000000000000000 ffff8854c398ecc0
<4>[196727.311834]  ffff885effd23ad0 ffffffff815b86c6 ffff885d5a590800 ffff8816827821c0
<4>[196727.311885] Call Trace:
<4>[196727.311907]  <IRQ>
<4>[196727.311912]  [<ffffffff815b7f42>] dst_destroy+0x32/0xe0
<4>[196727.311959]  [<ffffffff815b86c6>] dst_release+0x56/0x80
<4>[196727.311986]  [<ffffffff81620bd5>] tcp_v4_do_rcv+0x2a5/0x4a0
<4>[196727.312013]  [<ffffffff81622b5a>] tcp_v4_rcv+0x7da/0x820
<4>[196727.312041]  [<ffffffff815fd9e0>] ? ip_rcv_finish+0x360/0x360
<4>[196727.312070]  [<ffffffff815de02d>] ? nf_hook_slow+0x7d/0x150
<4>[196727.312097]  [<ffffffff815fd9e0>] ? ip_rcv_finish+0x360/0x360
<4>[196727.312125]  [<ffffffff815fda92>] ip_local_deliver_finish+0xb2/0x230
<4>[196727.312154]  [<ffffffff815fdd9a>] ip_local_deliver+0x4a/0x90
<4>[196727.312183]  [<ffffffff815fd799>] ip_rcv_finish+0x119/0x360
<4>[196727.312212]  [<ffffffff815fe00b>] ip_rcv+0x22b/0x340
<4>[196727.312242]  [<ffffffffa0339680>] ? macvlan_broadcast+0x160/0x160 [macvlan]
<4>[196727.312275]  [<ffffffff815b0c62>] __netif_receive_skb_core+0x512/0x640
<4>[196727.312308]  [<ffffffff811427fb>] ? kmem_cache_alloc+0x13b/0x150
<4>[196727.312338]  [<ffffffff815b0db1>] __netif_receive_skb+0x21/0x70
<4>[196727.312368]  [<ffffffff815b0fa1>] netif_receive_skb+0x31/0xa0
<4>[196727.312397]  [<ffffffff815b1ae8>] napi_gro_receive+0xe8/0x140
<4>[196727.312433]  [<ffffffffa00274f1>] ixgbe_poll+0x551/0x11f0 [ixgbe]
<4>[196727.312463]  [<ffffffff815fe00b>] ? ip_rcv+0x22b/0x340
<4>[196727.312491]  [<ffffffff815b1691>] net_rx_action+0x111/0x210
<4>[196727.312521]  [<ffffffff815b0db1>] ? __netif_receive_skb+0x21/0x70
<4>[196727.312552]  [<ffffffff810519d0>] __do_softirq+0xd0/0x270
<4>[196727.312583]  [<ffffffff816cef3c>] call_softirq+0x1c/0x30
<4>[196727.312613]  [<ffffffff81004205>] do_softirq+0x55/0x90
<4>[196727.312640]  [<ffffffff81051c85>] irq_exit+0x55/0x60
<4>[196727.312668]  [<ffffffff816cf5c3>] do_IRQ+0x63/0xe0
<4>[196727.312696]  [<ffffffff816c5aaa>] common_interrupt+0x6a/0x6a
<4>[196727.312722]  <EOI>
<1>[196727.313071] RIP  [<ffffffff815f8c7f>] ipv4_dst_destroy+0x4f/0x80
<4>[196727.313100]  RSP <ffff885effd23a70>
<4>[196727.313377] ---[ end trace 64b3f14fae0f2e29 ]---
<0>[196727.380908] Kernel panic - not syncing: Fatal exception in interrupt

Reported-by: Alexey Preobrazhensky <preobr@google.com>
Reported-by: dormando <dormando@rydia.ne>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 8141ed9fcedb2 ("ipv4: Add a socket release callback for datagram sockets")
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: filter: add test_bpf module under MAINTAINERS' networking section

Add lib/test_bpf.c entry to maintainers file under networking.
All changes were posted via netdev for review, so make sure
other people Cc it as well when they call get_maintainer.pl.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add __pskb_copy_fclone and pskb_copy_for_clone

There are several instances where a pskb_copy or __pskb_copy is
immediately followed by an skb_clone.

Add a couple of new functions to allow the copy skb to be allocated
from the fclone cache and thus speed up subsequent skb_clone calls.

Cc: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Marek Lindner <mareklindner@neomailbox.ch>
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Cc: Antonio Quartulli <antonio@meshcoding.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Cc: Arvid Brodin <arvid.brodin@alten.se>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: Lauro Ramos Venancio <lauro.venancio@openbossa.org>
Cc: Aloisio Almeida Jr <aloisio.almeida@openbossa.org>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Allan Stephens <allan.stephens@windriver.com>
Cc: Andrew Hendry <andrew.hendry@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Reviewed-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sfc: PIO:Restrict to 64bit arch and use 64-bit writes.

Fixes:ee45fd92c739
("sfc: Use TX PIO for sufficiently small packets")

The linux net driver uses memcpy_toio() in order to copy into
the PIO buffers.
Even on a 64bit machine this causes 32bit accesses to a write-
combined memory region.
There are hardware limitations that mean that only 64bit
naturally aligned accesses are safe in all cases.
Due to being write-combined memory region two 32bit accesses
may be coalesced to form a 64bit non 64bit aligned access.
Solution was to open-code the memory copy routines using pointers
and to only enable PIO for x86_64 machines.

Not tested on platforms other than x86_64 because this patch
disables the PIO feature on other platforms.
Compile-tested on x86 to ensure that works.

The WARN_ON_ONCE() code in the previous version of this patch
has been moved into the internal sfc debug driver as the
assertion was unnecessary in the upstream kernel code.

This bug fix applies to v3.13 and v3.14 stable branches.

Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'bridge-next'

Toshiaki Makita says:

====================
bridge: 802.1ad vlan protocol support

Currently bridge vlan filtering doesn't work fine with 802.1ad protocol.
Only if a bridge is configured without pvid, the bridge receives only
802.1ad tagged frames and no STP is used, it will work.
Otherwise:
- If pvid is configured, it can put only 802.1Q tags but cannot put 802.1ad
  tags.
- If 802.1Q and 802.1ad tagged frames arrive in mixture, it applies filtering
  regardless of their protocols.
- While an 802.1ad bridge should use another mac address for STP BPDU and
  should forward customer's BPDU frames, it can't.
Thus, we can't properly handle frames once 802.1ad is used.

Handling 802.1ad is useful if we want to allow stacked vlans to be used,
e.g., guest VMs wants to use vlan tags and the host also wants to segregate
guest's traffic from other guests' by vlan tags.

Here is the image describing how to configure a bridge to filter VMs traffic.

         +-------+p/u   +-----+  +---------+
+----+  |       |------|vnet0|--|User A VM|
|eth0|--|802.1ad|      +-----+  +---------+
+----+  |bridge |p/u   +-----+  +---------+
         |       |------|vnet1|--|User B VM|
         +-------+      +-----+  +---------+
p/u: pvid/untagged

This patch set enables us to set vlan protocols per bridge.
This tries to implement a bridge like S-VLAN component in IEEE 802.1Q-2011
spec.

Note that there is another possible implementation that sets vlan protocols
per port. Some HW switches seem to take that approach.
However, I think per-bridge approach is better, because;
- I think the typical usage of an 802.1ad bridge is segregating 802.1Q tagged
  traffic (like what is described above), and this doesn't need the ability to
  be set protocols per port. Also, If a bridge has many ports and it supports
  per-port setting, we might have to make much more extra configurations to
  change protocols of all ports.

- I assume that the main perpose to set protocol per port is to assign S-VID
  according to C-VID, or to realize two logical bridges (one is an 802.1Q
  filtering bridge and the other is an 802.1ad filtering bridge) in one bridge.
  The former usually needs additional features such as vlan id mapping, and
  is likely to make bridge's code complicated. If a user wants, such enhanced
  features can be accomplished by a combination of multiple bridges, so it is
  not absolutely necessary to implement these features in a bridge itself.
  The latter is simply unnecessary because we can easily make two bridges of
  which one is an 802.1Q bridge and the other is an 802.1ad bridge.

Here is an example of the enhanced feature that we can realize by using
multiple bridges and veth interfaces. This way is documented in
IEEE 802.1Q-2011 clause 15.4 (C-tagged service interface).

+----+  +-------+p/u         +------+  +----+  +--+
|eth0|--|802.1ad|----veth----|802.1Q|--|vnet|--|VM|
+----+  |bridge |----veth----|bridge|  +----+  +--+
         +-------+p/u         +------+
p/u: pvid/untagged

In this configuration, we can map C-VIDs to any S-VID.
For example;
C-VID 10 and 20 to S-VID 100
C-VID 30 to S-VID 110
This is achieved through the 802.1Q bridge that forwards C-tagged frames to
proper ports of the 802.1ad bridge.

Changes:
v1 -> v2:
- Make the way to forward bridge group addresses more generic by introducing
  new mask, group_fwd_mask_required.

RFC -> v1:
- Add S-TAG tx offload.
- Remove a fix around stacked vlan which has already been fixed.
- Take into account Bridge Group Addresses.
- Separate handling of protocol-mismatch from br_vlan_get_tag().
- Change the way to set vlan_proto from netlink to sysfs because no other
  existing configuration per bridge can be set by netlink.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Support 802.1ad vlan filtering

This enables us to change the vlan protocol for vlan filtering.
We come to be able to filter frames on the basis of 802.1ad vlan tags
through a bridge.

This also changes br->group_addr if it has not been set by user.
This is needed for an 802.1ad bridge.
(See IEEE 802.1Q-2011 8.13.5.)

Furthermore, this sets br->group_fwd_mask_required so that an 802.1ad
bridge can forward the Nearest Customer Bridge group addresses except
for br->group_addr, which should be passed to higher layer.

To change the vlan protocol, write a protocol in sysfs:
# echo 0x88a8 > /sys/class/net/br0/bridge/vlan_protocol

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Prepare for forwarding another bridge group addresses

If a bridge is an 802.1ad bridge, it must forward another bridge group
addresses (the Nearest Customer Bridge group addresses).
(For details, see IEEE 802.1Q-2011 8.6.3.)

As user might not want group_fwd_mask to be modified by enabling 802.1ad,
introduce a new mask, group_fwd_mask_required, which indicates addresses
the bridge wants to forward. This will be set by enabling 802.1ad.

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Prepare for 802.1ad vlan filtering support

This enables a bridge to have vlan protocol informantion and allows vlan
tag manipulation (retrieve, insert and remove tags) according to the vlan
protocol.

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Add 802.1ad tx vlan acceleration

Bridge device doesn't need to embed S-tag into skb->data.

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: xen-netback: include linux/vmalloc.h again

commit e9ce7cb6b107 ("xen-netback: Factor queue-specific data into
queue struct") added a use of vzalloc/vfree to interface.c, but
removed the #include <linux/vmalloc.h> statement at the same time,
which causes this build error:

drivers/net/xen-netback/interface.c: In function 'xenvif_free':
drivers/net/xen-netback/interface.c:754:2: error: implicit declaration of function 'vfree' [-Werror=implicit-function-declaration]
vfree(vif->queues);
^
cc1: some warnings being treated as errors

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Andrew J. Bennieston <andrew.bennieston@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: realtek: register/unregister multiple drivers properly

Using phy_drivers_register/_unregister functions is proper way to
handle multiple PHY drivers registration. For Realtek PHY drivers
module, it fixes incomplete current error-handlings up and adds
missed unregistration for the RTL8201CP driver.

Signed-off-by: Jongsung Kim <neidhard.kim@lge.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sh_eth: Fix timing of RACT setting in sh_eth_rx()

This patch fixes an issue that we cannot use nfs rootfs correctly
on r8a7790 when the command below runs on a host PC.

$ sudo ping -f -l 8 $BOARD_IP_ADDR

Since the driver sets the RACT to 1 in the first while loop of
sh_eth_rx(), the controller accepts a next frame into the next RX
descriptor during the while loop. But, in the first while loop
doesn't allocate a next skb. So, this patch removes the RACT setting
in the first while loop of sh_eth_rx().

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sh_eth: Fix receive packet "exceeded" condition in sh_eth_rx()

This patch fixes the packet "exceeded" condition in sh_eth_rx() when
RACT in an RX descriptor is not set and the "quota" is 0.
Otherwise, kernel panic happens because the "&n->poll_list" is deleted
twice in sh_eth_poll() which calls napi_complete() and net_rx_action().

Signed-off-by: Kouei Abe <kouei.abe.cp@renesas.com>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: filter: fix warning on 32-bit arch

fix compiler warning on 32-bit architectures:

net/core/filter.c: In function '__sk_run_filter':
net/core/filter.c:540:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
net/core/filter.c:550:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
net/core/filter.c:560:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: fix potential bug in function tipc_backlog_rcv

In commit 4f4482dcd9a0606a30541ff165ddaca64748299b ("tipc: compensate
for double accounting in socket rcv buffer") we access 'truesize' of
a received buffer after it might have been released by the function
filter_rcv().

In this commit we correct this by reading the value of 'truesize' to
the stack before delivering the buffer to filter_rcv().

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sxgbe: remove duplicate SXGBE_CORE_L34_ADDCTL_REG define

The SXGBE_CORE_L34_ADDCTL_REG define is cut and pasted twice so we can
delete the second instance.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: remove duplicate QLC_83XX_GET_LSO_CAPABILITY define

The QLC_83XX_GET_LSO_CAPABILITY define is cut and pasted twice so we can
delete the second instance.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Sony Chacko <sony.chacko@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mlx4'

Amir Vadai says:

====================
cpumask,net: affinity hint helper function

This patchset will set affinity hint to influence IRQs to be allocated on the
same NUMA node as the one where the card resides. As discussed in
http://www.spinics.net/lists/netdev/msg271497.html

If number of IRQs allocated is greater than the number of local NUMA cores, all
local cores will be used first, and the rest of the IRQs will be on a remote
NUMA node.
If no NUMA support - IRQ's and cores will be mapped 1:1

Since the utility function to calculate the mapping could be useful in other mq
drivers in the kernel, it was added to cpumask.[ch]

This patchset was tested and applied on top of net-next since the first
consumer is a network device (mlx4_en). Over commit fff1f59 "mac802154:
llsec: add forgotten list_del_rcu in key removal"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Use affinity hint

The “affinity hint” mechanism is used by the user space
daemon, irqbalancer, to indicate a preferred CPU mask for irqs.
Irqbalancer can use this hint to balance the irqs between the
cpus indicated by the mask.

We wish the HCA to preferentially map the IRQs it uses to numa cores
close to it. To accomplish this, we use cpumask_set_cpu_local_first(), that
sets the affinity hint according the following policy:
First it maps IRQs to “close” numa cores. If these are exhausted, the
remaining IRQs are mapped to “far” numa cores.

Signed-off-by: Yuval Atias <yuvala@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cpumask: Utility function to set n'th cpu - local cpu first

This function sets the n'th cpu - local cpu's first.
For example: in a 16 cores server with even cpu's local, will get the
following values:
cpumask_set_cpu_local_first(0, numa, cpumask) => cpu 0 is set
cpumask_set_cpu_local_first(1, numa, cpumask) => cpu 2 is set
...
cpumask_set_cpu_local_first(7, numa, cpumask) => cpu 14 is set
cpumask_set_cpu_local_first(8, numa, cpumask) => cpu 1 is set
cpumask_set_cpu_local_first(9, numa, cpumask) => cpu 3 is set
...
cpumask_set_cpu_local_first(15, numa, cpumask) => cpu 15 is set

Curently this function will be used by multi queue networking devices to
calculate the irq affinity mask, such that as many local cpu's as
possible will be utilized to handle the mq device irq's.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'next' of git://git./linux/kernel/git/rzhang/linux

Pull thermal management update from Zhang Rui:
"Specifics:

   - fix a bug in Exynos thermal driver, which overwrites the hardware
     trip point threshold when updating software trigger levels and
     results in emergency shutdown.  From: Tushar Behera.

   - add thermal sensor support for Armada 375 and 38x SoCs.  From
     Ezequiel Garcia.

   - add TMU (Thermal Management Unit) support for Exynos5260 and
     Exynos5420 SoCs.  From Naveen Krishna Chatradhi.

   - add support for the additional digital temperature sensors in the
     Intel SoCs like Bay Trail.  From: Srinivas Pandruvada.

   - a couple of cleanups and small fixes from Jingoo Han, Bartlomiej
     Zolnierkiewicz, Geert Uytterhoeven, Jacob Pan, Paul Walmsley and
     Lan,Tianyu"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (21 commits)
  thermal: spear: remove unnecessary OOM messages
  thermal: exynos: remove unnecessary OOM messages
  thermal: rcar: remove unnecessary OOM messages
  thermal: armada: Support Armada 380 SoC
  thermal: armada: Support Armada 375 SoC
  thermal: armada: Allow to specify an 'inverted readout' sensor
  thermal: armada: Pass the platform_device to init_sensor()
  thermal: armada: Add generic infrastructure to handle the sensor
  thermal: armada: Add infrastructure to support generic formulas
  thermal: armada: Rename armada_thermal_ops struct
  thermal/intel_powerclamp: add newer cpu ids
  thermal: rcar: Use pm_runtime_put() i.s.o. pm_runtime_put_sync()
  thermal: samsung: Only update available threshold limits
  Thermal/int3403: Fix thermal hysteresis unit conversion
  thermal: Intel SoC DTS thermal
  thermal: samsung: Add TMU support for Exynos5260 SoCs
  thermal: samsung: Add TMU support for Exynos5420 SoCs
  thermal: samsung: change base_common to more meaningful base_second
  thermal: samsung: replace inten_ bit fields with intclr_
  thermal: offer Samsung thermal support only when ARCH_EXYNOS is defined
  ...

Merge tag 'pwm/for-3.16-rc1' of git://git./linux/kernel/git/thierry.reding/linux-pwm

Pull pwm changes from Thierry Reding:
"The majority of these changes are cleanups and fixes across all
  drivers.  Redundant error messages are removed and more PWM
  controllers set the .can_sleep flag to signal that they can't be used
  in atomic context.

  Support is added for the Broadcom Kona family of SoCs and the Intel
  LPSS driver can now probe PCI devices in addition to ACPI devices.
  Upon shutdown, the pwm-backlight driver will now power off the
  backlight.  It also uses the new descriptor-based GPIO API for more
  concise GPIO handling.

  A large chunk of these changes also converts platforms to use the
  lookup mechanism rather than relying on the global number space to
  reference PWM devices.  This is largely in preparation for more
  unification and cleanups in future patches.  Eventually it will allow
  the legacy PWM API to be removed"

* tag 'pwm/for-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (38 commits)
  pwm: fsl-ftm: set pwm_chip can_sleep flag
  pwm: ab8500: Fix wrong value shift for disable/enable PWM
  pwm: samsung: do not set manual update bit in pwm_samsung_config
  pwm: lp3943: Set pwm_chip can_sleep flag
  pwm: atmel: set pwm_chip can_sleep flag
  pwm: mxs: set pwm_chip can_sleep flag
  pwm: tiehrpwm: inline accessor functions
  pwm: tiehrpwm: don't build PM related functions when not needed
  pwm-backlight: retrieve configured PWM period
  leds: leds-pwm: retrieve configured PWM period
  ARM: pxa: hx4700: use PWM_LOOKUP to initialize struct pwm_lookup
  ARM: shmobile: armadillo: use PWM_LOOKUP to initialize struct pwm_lookup
  ARM: OMAP3: Beagle: use PWM_LOOKUP to initialize struct pwm_lookup
  pwm: modify PWM_LOOKUP to initialize all struct pwm_lookup members
  ARM: pxa: hx4700: initialize all the struct pwm_lookup members
  ARM: OMAP3: Beagle: initialize all the struct pwm_lookup members
  pwm: renesas-tpu: remove unused struct tpu_pwm_platform_data
  ARM: shmobile: armadillo: initialize all struct pwm_lookup members
  pwm: add period and polarity to struct pwm_lookup
  pwm: twl: Really disable twl6030 PWMs
  ...

dm thin: update discard_granularity to reflect the thin-pool blocksize

DM thinp already checks whether the discard_granularity of the data
device is a factor of the thin-pool block size. But when using the
dm-thin-pool's discard passdown support, DM thinp was not selecting the
max of the underlying data device's discard_granularity and the
thin-pool's block size.

Update set_discard_limits() to set discard_granularity to the max of
these values. This enables blkdev_issue_discard() to properly align the
discards that are sent to the DM thin device on a full block boundary.
As such each discard will now cover an entire DM thin-pool block and the
block will be reclaimed.

Reported-by: Zdenek Kabelac <zkabelac@redhat.com>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

dm bio prison: implement per bucket locking in the dm_bio_prison hash table

Split the single per bio-prison lock by using per bucket locking. Per
bucket locking benefits both dm-thin and dm-cache targets by reducing
bio-prison lock contention.

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

Merge branches 'pci/msi', 'pci/iommu' and 'pci/cleanup' into next

* pci/msi:
  PCI/MSI: Fix memory leak in free_msi_irqs()

* pci/iommu:
  PCI: Add function 1 DMA alias quirk for HighPoint RocketRaid 642L
  PCI: Add bridge DMA alias quirk for ITE bridge

* pci/cleanup:
  PCI: Merge multi-line quoted strings
  PCI: Whitespace cleanup
  PCI: Move EXPORT_SYMBOL so it immediately follows function/variable