firefly-linux-kernel-4.4.55.git
9 years ago  Bluetooth: Add BT_WARN and bt_dev_warn logging macros
Frederic Danis [Wed, 23 Sep 2015 16:18:07 +0000 (18:18 +0200)]
Bluetooth: Add BT_WARN and bt_dev_warn logging macros

Add warning logging macros to bluetooth subsystem logs.

Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  Bluetooth: btmrvl: add sd8997 chipset support
Amitkumar Karwar [Mon, 21 Sep 2015 10:06:42 +0000 (03:06 -0700)]
Bluetooth: btmrvl: add sd8997 chipset support

This patch adds support for Marvell's new chipset SD8997.
Register offsets and supported feature flags are updated.

Signed-off-by: Zhaoyang Liu <liuzy@marvell.com>
Signed-off-by: Cathy Luo <cluo@marvell.com>
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  Bluetooth: btmrvl: remove extra space in cast
Amitkumar Karwar [Mon, 21 Sep 2015 10:06:41 +0000 (03:06 -0700)]
Bluetooth: btmrvl: remove extra space in cast

Coding style fix: extra spaces are removed to make casts
consistent.

Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: replace magic numbers
Alexander Aring [Mon, 21 Sep 2015 09:24:43 +0000 (11:24 +0200)]
mrf24j40: replace magic numbers

This patch replaces some magic numbers with defines for register bits,
masks and shifts.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: change irq trigger type behaviour
Alexander Aring [Mon, 21 Sep 2015 09:24:42 +0000 (11:24 +0200)]
mrf24j40: change irq trigger type behaviour

This patch changes the irq trigger type passed to devm_request_irq:
IRQF_TRIGGER_LOW is used when no irq type was given. Additionally, we
add support for changing the irq polarity during hw init if a high-level
or low-level triggered irq type is given.

The mrf24j40 can't deal with rising-edge triggered irqs. This races at
the tx completion irq, where we read out the irq status registers while
the irq is disabled. Reading them resets the irq line so other irqs can
occur. While we read out the irq status register the irq is still
disabled, so edge triggered interrupts will be ignored.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: add promiscuous mode support
Alexander Aring [Mon, 21 Sep 2015 09:24:41 +0000 (11:24 +0200)]
mrf24j40: add promiscuous mode support

This patch adds support for promiscuous mode by enabling promiscuous
mode (no frame filtering), disabling automatic ack handling and not
filtering out frames whose crc is invalid.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: add tx power support
Alexander Aring [Mon, 21 Sep 2015 09:24:40 +0000 (11:24 +0200)]
mrf24j40: add tx power support

This patch supports setting the transmit power for the mrf24j40ma
transceiver only. The mrf24j40mc has an amplifier that changes the
transmit power; I am currently not sure what the mapping for this
amplifier looks like.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: add cca ed level support
Alexander Aring [Mon, 21 Sep 2015 09:24:39 +0000 (11:24 +0200)]
mrf24j40: add cca ed level support

This patch adds support for setting the cca energy detection level for
the mrf24j40 transceiver.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: add cca mode support
Alexander Aring [Mon, 21 Sep 2015 09:24:38 +0000 (11:24 +0200)]
mrf24j40: add cca mode support

This patch adds cca mode handling for the mrf24j40.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: add csma params support
Alexander Aring [Mon, 21 Sep 2015 09:24:37 +0000 (11:24 +0200)]
mrf24j40: add csma params support

This patch adds support for changing the CSMA parameters. The datasheet
doesn't say anything about a max_be value. It seems not to be
configurable, so we assume the 802.15.4 default. But this value must
exist because there is a min_be value.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: async interrupt handling
Alexander Aring [Mon, 21 Sep 2015 09:24:36 +0000 (11:24 +0200)]
mrf24j40: async interrupt handling

This patch removes the threaded irq handling and uses a hardirq handler
instead. For this step we need to switch to spi_async to get the irq
status register.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: rework rx handling to async rx handling
Alexander Aring [Mon, 21 Sep 2015 09:24:35 +0000 (11:24 +0200)]
mrf24j40: rework rx handling to async rx handling

This patch prepares the receive handling so that it can run inside
interrupt context by using spi_async. This is necessary for introducing
non-threaded irq handling.

We also drop the bit setting for "RXDECINV" in register "BBREG1" and do
a direct full write of register "BBREG1" instead, because it contains
only the RXDECINV bit.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: rework tx handling to async tx handling
Alexander Aring [Mon, 21 Sep 2015 09:24:34 +0000 (11:24 +0200)]
mrf24j40: rework tx handling to async tx handling

This patch reworks the current transmit API to use spi_async handling.
We removed the error handling check, because mac802154 has no way to
report it. Also, the transmit timeout can't be handled by async xmit
handling; for this use case we need to implement the netdev watchdog.
These are all unlikely cases which we drop now and which should be
covered by the netdev watchdog.

We also drop the bit setting for TXNACKREQ in register TXNCON, as it is
not necessary. The TXNCON register should be set only once for each
frame; previous settings don't matter.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  ieee802154: add helpers for frame control checks
Alexander Aring [Mon, 21 Sep 2015 09:24:33 +0000 (11:24 +0200)]
ieee802154: add helpers for frame control checks

This patch introduces two static inline functions. The first gets the
frame control field from an sk_buff. The second checks the
acknowledgment request bit in the frame control field. Later we can
introduce more functions to check the frame control fields.

These will deprecate the current behaviour, which requires a
host-byteorder conversion and manual bit handling.
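
As an illustration only (not the helpers added by this patch), the
sketch below checks the acknowledgment request bit directly on the
little-endian wire representation, so no host-byteorder conversion is
needed. The names struct fc_le, frame_get_fc and fc_is_ack_req are
hypothetical. The sketch assumes that the frame control field occupies
the first two octets of the frame and that the acknowledgment request
flag is bit 5 of that field, as in IEEE 802.15.4.

```c
#include <stdbool.h>
#include <stdint.h>

/* Frame control field kept in wire (little-endian) order. */
struct fc_le {
	uint8_t octet[2];
};

/* Acknowledgment request flag: bit 5 of the field, i.e. bit 5 of octet 0. */
#define FC_ACK_REQ_OCTET0 0x20u

/* Hypothetical helper: copy the frame control field out of a raw frame. */
static struct fc_le frame_get_fc(const uint8_t *frame)
{
	struct fc_le fc = { { frame[0], frame[1] } };

	return fc;
}

/* Hypothetical helper: test the flag without any byte swapping. */
static bool fc_is_ack_req(struct fc_le fc)
{
	return (fc.octet[0] & FC_ACK_REQ_OCTET0) != 0;
}
```

Because the flag is tested on the octet as it appears on the wire, the
same code gives the same answer on little-endian and big-endian hosts.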

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: change to frame delivery with crc
Alexander Aring [Mon, 21 Sep 2015 09:24:32 +0000 (11:24 +0200)]
mrf24j40: change to frame delivery with crc

This patch changes the frame delivery to mac802154 with crc. This is
useful for monitor interface types which deliver the crc to userspace.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: use regmap for register access
Alexander Aring [Mon, 21 Sep 2015 09:24:31 +0000 (11:24 +0200)]
mrf24j40: use regmap for register access

This patch uses the regmap functions for transceiver register settings
where it's possible. This means everything except the hotpaths like
receive/transmit handling.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: add regmap support
Alexander Aring [Mon, 21 Sep 2015 09:24:30 +0000 (11:24 +0200)]
mrf24j40: add regmap support

This patch introduces regmap support for the short and long address
spaces of the mrf24j40. For the long address range only
regmap_read/write/update_bits can be used. This is because I added a
low-level bus operation: the write operation needs to set the 12th bit
to mark a register write, but regmap only supports setting bits for
register write access in the first byte. We use other regmap register
functions than read/write/update_bits, so this should be fine.
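
To picture the "12th bit" mentioned above, here is a small sketch. It
assumes a 16-bit long-address command word made of a leading 1, a 10-bit
register address, a read/write flag and four unused bits; this layout
and the function names long_read_cmd and long_write_cmd are assumptions
for illustration and should be checked against the datasheet and the
driver.

```c
#include <stdint.h>

/*
 * Assumed layout of the 16-bit long-address command word, MSB first:
 *   bit 15     : 1 selects the long address space
 *   bits 14..5 : 10-bit register address
 *   bit 4      : 1 = write, 0 = read (the 12th bit counted from the MSB)
 *   bits 3..0  : unused
 */
static uint16_t long_read_cmd(uint16_t reg)
{
	return (uint16_t)(0x8000u | ((reg & 0x3ffu) << 5));
}

static uint16_t long_write_cmd(uint16_t reg)
{
	return (uint16_t)(long_read_cmd(reg) | 0x0010u);
}
```

Under this layout the write flag lands in the second byte sent on the
bus, which fits the remark that regmap's own write-flag handling only
covers the first byte.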

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: add more register defines
Alexander Aring [Mon, 21 Sep 2015 09:24:29 +0000 (11:24 +0200)]
mrf24j40: add more register defines

For supporting regmap, this patch will add more register defines to
prepare a full register dump by regmap debugfs.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: add random extended addr generation
Alexander Aring [Mon, 21 Sep 2015 09:24:28 +0000 (11:24 +0200)]
mrf24j40: add random extended addr generation

The mrf24j40 has no source to get a permanent extended address. This
patch will add a randomly generated permanent extended address source.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: add default channel setting
Alexander Aring [Mon, 21 Sep 2015 09:24:27 +0000 (11:24 +0200)]
mrf24j40: add default channel setting

By default the mrf24j40 is on channel 11 after reset. This patch adds
the right phy default value for the channel setting.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: add device-tree support
Alexander Aring [Mon, 21 Sep 2015 09:24:26 +0000 (11:24 +0200)]
mrf24j40: add device-tree support

This patch adds devicetree support to mrf24j40 with proper devicetree
compatible strings.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: remove spi settings overwrite
Alexander Aring [Mon, 21 Sep 2015 09:24:25 +0000 (11:24 +0200)]
mrf24j40: remove spi settings overwrite

This patch removes the spi settings done while probing the mrf24j40.
These settings cannot be overwritten during device probing, where the
spi controller should already be configured. These settings need to be
set up by device tree or platform data.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: calling ieee802154_register_hw at last
Alexander Aring [Mon, 21 Sep 2015 09:24:24 +0000 (11:24 +0200)]
mrf24j40: calling ieee802154_register_hw at last

The function ieee802154_register_hw should always be called last.
Currently we do hardware init and such things after registering the
hardware with the subsystem. It could be that the subsystem already
calls driver operations while hardware init is running.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: use ieee802154_alloc_hw for private data
Alexander Aring [Mon, 21 Sep 2015 09:24:23 +0000 (11:24 +0200)]
mrf24j40: use ieee802154_alloc_hw for private data

This patch removes the driver's own private dataroom allocation, which
called devm_kzalloc for devrec and assigned this pointer to
"devrec->hw->priv". Instead, like all other drivers, we use
ieee802154_alloc_hw and give the size of the private driver dataroom as
the first argument.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mrf24j40: cleanup define identation
Alexander Aring [Mon, 21 Sep 2015 09:24:22 +0000 (11:24 +0200)]
mrf24j40: cleanup define identation

This patch replaces the spaces after define with a tab.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  at86rf230: support edge triggered irq
Alexander Aring [Mon, 21 Sep 2015 07:37:54 +0000 (09:37 +0200)]
at86rf230: support edge triggered irq

This patch adds support for edge triggered irq types. We remove the
locking of irq resources by enable/disable irq and directly allocate
a heap buffer in the isr. We still have an enable/disable irq path, but
this is for level-triggered irqs, which need to be disabled until
spi_async clears the irq line.

There is usually a small race window between "irq line cleared" and
"enable_irq". If an edge triggered irq arrives in this window, we will
not recognize the interrupt. This case can't happen on the at86rf230.
The reason is that we unmask only TRX_END irqs, which indicate a
transmit or receive completion, depending on the current state.

On Transmit:

TRX_END has arrived and the transceiver is in TX_ARET_ON state again; in
this state no other TRX_END can happen until we leave the state.

On Receive:

This is protected with the RX_SAFE_MODE bit which leaves the transceiver
in RX_AACK_BUSY until we have read the framebuffer. In this state no
other TRX_END can happen.

Tested with RPi where I first detected issues between edge/level irq's.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mac802154: tx: add warning if MTU exceeds
Alexander Aring [Fri, 18 Sep 2015 09:30:44 +0000 (11:30 +0200)]
mac802154: tx: add warning if MTU exceeds

When sending over AF_PACKET RAW sockets we can send frames which exceed
the MTU size. To handle this correctly we need to change things in
AF_PACKET so that it knows that on RAW sockets an additional FCS is set
by hardware or by the mac802154 transmit functionality.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  ieee802154: change needed headroom/tailroom
Alexander Aring [Fri, 18 Sep 2015 09:30:43 +0000 (11:30 +0200)]
ieee802154: change needed headroom/tailroom

This patch cleans up the needed_headroom, needed_tailroom and
hard_header_len fields for wpan and lowpan interfaces.

For wpan interfaces the worst case mac header len should be part of
needed_headroom. Currently this is set as hard_header_len, but
hard_header_len should be set to the minimum header length which the
xmit call assumes, and this is the minimum frame length of 802.15.4.
The hard_header_len value is checked inside the send callback of
AF_PACKET raw sockets.

For lowpan interfaces, if fragmentation isn't needed the skb will
call dev_hard_header for the 802154 layer and queue it afterwards. This
happens without a new skb allocation, so we need the same headroom and
tailroom lengths as 802154 inside the 802154 6lowpan layer. At least we
assume an ipv6 header size as the minimum header length.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  ieee802154: introduce wpan_dev_header_ops
Alexander Aring [Fri, 18 Sep 2015 09:30:42 +0000 (11:30 +0200)]
ieee802154: introduce wpan_dev_header_ops

The current header_ops callback structure of the net device is used
mostly by 802.15.4 upper-layers. Because this callback structure is a
very generic one, which is also used by e.g. DGRAM AF_PACKET sockets, we
can't make this callback structure 802.15.4 specific, which it
currently is.

I saw the smallest "constraint" for calling this callback with
dev_hard_header/dev_parse_header in AF_PACKET, which assigns an 8 byte
array for the address void pointers. Currently 802.15.4 specific
protocols like af802154 and 6LoWPAN assign the "struct ieee802154_addr"
as these parameters, which is greater than 8 bytes. The current callback
implementation for header_ops.create always assumes a complete
"struct ieee802154_addr", which AF_PACKET can never handle and which is
greater than 8 bytes.

For that reason we now introduce a "generic" create/parse header_ops
callback which allows handling of intra-pan extended addresses only.
This allows a small use-case with AF_PACKET to send "somehow" a valid
dataframe over DGRAM.

To keep the current dev_hard_header behaviour we introduce a similar
callback structure "wpan_dev_header_ops" which contains 802.15.4 specific
upper-layer header creation functionality, which can be called by
wpan_dev_hard_header.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  ieee802154: header_ops: fix frame control setting
Alexander Aring [Fri, 18 Sep 2015 09:30:41 +0000 (11:30 +0200)]
ieee802154: header_ops: fix frame control setting

Sometimes upper-layer protocols want to generate a new mac header by
filling "struct ieee802154_hdr" only. These upper-layers set the source
and dest fields for the address settings, but not the fc fields that
indicate the source and dest address mode. This patch changes the
"ieee802154_hdr_push" function so the fc address fields are set
according to the source and dest fields of "struct ieee802154_hdr".
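
The idea can be sketched in plain C as follows. This is an illustration
only and not the code of "ieee802154_hdr_push": struct mac_hdr, enum
addr_mode and hdr_sync_fc_addr_modes are hypothetical names, and the
sketch assumes the IEEE 802.15.4 layout in which the destination
addressing mode sits in bits 10-11 and the source addressing mode in
bits 14-15 of the frame control field.

```c
#include <stdint.h>

/* Addressing modes as encoded in the frame control field. */
enum addr_mode { ADDR_NONE = 0, ADDR_SHORT = 2, ADDR_LONG = 3 };

#define FC_DAMODE_SHIFT 10
#define FC_SAMODE_SHIFT 14
#define FC_DAMODE_MASK (0x3u << FC_DAMODE_SHIFT)
#define FC_SAMODE_MASK (0x3u << FC_SAMODE_SHIFT)

/* Hypothetical, simplified header: fc in host order plus address modes. */
struct mac_hdr {
	uint16_t fc;
	enum addr_mode dest_mode;
	enum addr_mode source_mode;
};

/* Derive the fc addressing-mode bits from the address fields. */
static void hdr_sync_fc_addr_modes(struct mac_hdr *hdr)
{
	unsigned int fc = hdr->fc;

	/* Drop whatever the caller left in the mode bits. */
	fc &= ~(FC_DAMODE_MASK | FC_SAMODE_MASK);
	fc |= (unsigned int)hdr->dest_mode << FC_DAMODE_SHIFT;
	fc |= (unsigned int)hdr->source_mode << FC_SAMODE_SHIFT;
	hdr->fc = (uint16_t)fc;
}
```

With this, a caller that only fills in the addresses cannot end up with
mode bits that disagree with them.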

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  mac802154: llsec: fix device deletion from list
Alexander Aring [Fri, 18 Sep 2015 09:30:40 +0000 (11:30 +0200)]
mac802154: llsec: fix device deletion from list

This patch adds a missing list_del when a device description is
deleted.

Cc: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  Bluetooth: btmrvl: fix firmware dump issue
Nachiket Kukade [Fri, 18 Sep 2015 13:40:40 +0000 (06:40 -0700)]
Bluetooth: btmrvl: fix firmware dump issue

The first firmware dump attempt from the user works fine, but the
firmware goes into a bad state after this. Subsequent attempts fail.

As required by the firmware dump implementation, this change writes
FW_DUMP_READ_DONE value to dump ctrl register to address this issue.

Signed-off-by: Nachiket Kukade <kukaden@marvell.com>
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years ago  drivers/net/ieee802154/at86rf230.c: seq_printf() now returns NULL
Stephen Rothwell [Tue, 22 Sep 2015 03:41:44 +0000 (20:41 -0700)]
drivers/net/ieee802154/at86rf230.c: seq_printf() now returns NULL

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Alexander Aring <alex.aring@gmail.com>
Cc: Stefan Schmidt <stefan@osg.samsung.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago  Merge branch 'cpsw-macid-no-of'
David S. Miller [Tue, 22 Sep 2015 00:21:47 +0000 (17:21 -0700)]
Merge branch 'cpsw-macid-no-of'

Mugunthan V N says:

====================
Add support for reading macid when DT macid not found

Did a boot test on dra7-evm [1] and am437x-gp-evm [2].
Pushed a branch [3] for others to test the patch.

[1]: http://pastebin.ubuntu.com/12513420/
[2]: http://pastebin.ubuntu.com/12513428/
[3]: git://git.ti.com/~mugunthanvnm/ti-linux-kernel/linux.git cpsw-macid-read-support
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago  arm: dts: am4372: add syscon phandle to cpsw node
Mugunthan V N [Mon, 21 Sep 2015 10:26:53 +0000 (15:56 +0530)]
arm: dts: am4372: add syscon phandle to cpsw node

There are 2 MACIDs stored in the control module of the am4372.
These are read by the cpsw driver if no valid MACID was found
in the devicetree.

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago  arm: dts: dra7: add syscon phandle to cpsw node
Mugunthan V N [Mon, 21 Sep 2015 10:26:52 +0000 (15:56 +0530)]
arm: dts: dra7: add syscon phandle to cpsw node

There are 2 MACIDs stored in the control module of the dra7.
These are read by the cpsw driver if no valid MACID was found
in the devicetree.

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago  drivers: net: cpsw-common: add support for reading mac address for dra7 and am437x...
Mugunthan V N [Mon, 21 Sep 2015 10:26:51 +0000 (15:56 +0530)]
drivers: net: cpsw-common: add support for reading mac address for dra7 and am437x platforms

Add support for reading the mac address using the syscon driver for
dra7 and am437x platforms.

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago  drivers: net: cpsw: davinci_emac: move reading mac id to common file
Mugunthan V N [Mon, 21 Sep 2015 10:26:50 +0000 (15:56 +0530)]
drivers: net: cpsw: davinci_emac: move reading mac id to common file

Move mac address reading from the ethernet driver to a common
file for better maintenance and code reuse.

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago  Merge tag 'linux-can-next-for-4.4-20150921' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Tue, 22 Sep 2015 00:15:03 +0000 (17:15 -0700)]
Merge tag 'linux-can-next-for-4.4-20150921' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2015-09-17

this is a pull request of 8 patches for net-next/master.

All 8 patches are by me and cleanup the flexcan driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago  net: bcmgenet: Remove duplicate test for tx_coalesce_usecs_high
Florian Fainelli [Fri, 18 Sep 2015 21:16:53 +0000 (14:16 -0700)]
net: bcmgenet: Remove duplicate test for tx_coalesce_usecs_high

We were checking twice for ec->tx_coalesce_usecs_high, remove the
duplicate test.

Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Reported-by: kbuild-all@01.org
Fixes: 2f9130709d2c19 ("net: bcmgenet: Implement TX coalescing control knobs")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago  tcp: send loss probe after 1s if no RTT available
Yuchung Cheng [Fri, 18 Sep 2015 18:40:33 +0000 (11:40 -0700)]
tcp: send loss probe after 1s if no RTT available

This patch makes TLP use a 1 sec timer by default when RTT is
not available due to SYN/ACK retransmission or SYN cookies.

Prior to this change, the lack of RTT prevents TLP so the first
data packets sent can only be recovered by fast recovery or RTO.
If the fast recovery fails to trigger, the RTO is 3 seconds when
SYN/ACK is retransmitted. With this patch we can trigger fast
recovery in 1 sec instead.

Note that we need to check Fast Open more properly. A Fast Open
connection could be (accepted then) closed before it receives
the final ACK of 3WHS so the state is FIN_WAIT_1. Without the
new check, TLP will retransmit FIN instead of SYN/ACK.
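
A minimal sketch of the timer choice described above, for illustration
only. The function name probe_timeout_us is hypothetical, and the
"twice the smoothed RTT" rule used when a sample exists is a simplifying
assumption; the real kernel logic has further terms (for example for
delayed ACKs) that are left out here.

```c
#include <stdint.h>

/* The 1 sec default named in this commit, in microseconds. */
#define PROBE_TIMEOUT_NO_RTT_US 1000000u

/*
 * Hypothetical helper: pick the loss probe timeout.
 * srtt_us == 0 stands for "no RTT sample available".
 */
static uint32_t probe_timeout_us(uint32_t srtt_us)
{
	if (srtt_us == 0)
		return PROBE_TIMEOUT_NO_RTT_US;

	/* Assumed base rule: probe after two smoothed RTTs. */
	return 2u * srtt_us;
}
```

For a connection whose SYN/ACK was retransmitted, this returns the 1 sec
default instead of leaving recovery to the 3 second RTO.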

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago  tcp: usec resolution SYN/ACK RTT
Yuchung Cheng [Fri, 18 Sep 2015 18:36:14 +0000 (11:36 -0700)]
tcp: usec resolution SYN/ACK RTT

Currently SYN/ACK RTT is measured in jiffies. For LAN the SYN/ACK
RTT is often measured as 0ms or sometimes 1ms, which would affect
RTT estimation and min RTT sampling used by some congestion control.
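
The effect can be reproduced with a little arithmetic. The sketch below
is an illustration only: the function names are hypothetical and it
assumes a clock that advances once per tick (1000 usec per tick
corresponds to HZ=1000). The same 300 usec round trip reads as 0 or 1
ticks depending on where it falls relative to a tick boundary.

```c
#include <stdint.h>

/* RTT as seen by a clock that only advances once per tick. */
static uint64_t rtt_in_ticks(uint64_t sent_us, uint64_t acked_us,
			     uint64_t tick_us)
{
	return acked_us / tick_us - sent_us / tick_us;
}

/* RTT as seen by a microsecond-resolution clock. */
static uint64_t rtt_in_usec(uint64_t sent_us, uint64_t acked_us)
{
	return acked_us - sent_us;
}
```

With tick_us = 1000, a SYN/ACK sent at 100 usec and acknowledged at
400 usec measures 0 ticks, while one sent at 900 usec and acknowledged
at 1200 usec measures 1 tick; the usec clock reports 300 in both cases.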

This patch improves SYN/ACK RTT to usec resolution if the platform
supports it. While the timestamping of SYN/ACK is done in request
sock, the RTT measurement is carefully arranged to avoid storing
another u64 timestamp in tcp_sock.

For a regular handshake without SYN/ACK retransmission, the RTT is
sampled right after the child socket is created and right before the
request sock is released (tcp_check_req() in tcp_minisocks.c).

For Fast Open the child socket is already created when SYN/ACK was
sent, so the RTT is sampled in tcp_rcv_state_process() after processing
the final ACK and right before the request socket is released.

If the SYN/ACK was retransmitted or SYN-cookie was used, we rely
on TCP timestamps to measure the RTT. The sample is taken at the
same place in tcp_rcv_state_process() after the timestamp values
are validated in tcp_validate_incoming(). Note that we do not store
TS echo value in request_sock for SYN-cookies, because the value
is already stored in tp->rx_opt used by tcp_ack_update_rtt().

One side benefit is that the RTT measurement now happens before
initializing congestion control (of the passive side). Therefore
the congestion control can use the SYN/ACK RTT.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago  Merge branch 's390-next'
David S. Miller [Mon, 21 Sep 2015 23:03:05 +0000 (16:03 -0700)]
Merge branch 's390-next'

Ursula Braun says:

====================
s390: qeth and iucv patches

here is version 2 of some s390 related qeth patches for net-next. The patch by
Thomas Richter adds a new feature to the qeth layer2 code; the remaining
patches are minor improvements.
Version 2 of patch 4 uses the desired indentation in function declarations
and definitions spanning multiple lines in almost all cases. Thomas ran into a
conflict with the maximum number of columns once. Thus you will still see one
function definition using an earlier column before the opening parenthesis.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago  s390/iucv: do not use arrays as argument
Ursula Braun [Fri, 18 Sep 2015 14:06:52 +0000 (16:06 +0200)]
s390/iucv: do not use arrays as argument

The iucv code uses arrays as arguments. Even though this does not
really cause a problem, it could be misleading, since the compiler
turns array arguments into just a pointer argument. To be more
precise this patch changes the array arguments into pointers.
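
The adjustment from array argument to pointer argument can be seen in a
few lines of standalone C. This is an illustration only and not iucv
code; the names count_set_array, count_set_ptr and MASK_LEN are
hypothetical.

```c
#include <stddef.h>

#define MASK_LEN 8

/* Declared with an array parameter: the bound is not part of the type. */
static size_t count_set_array(const unsigned char mask[MASK_LEN])
{
	size_t i, n = 0;

	for (i = 0; i < MASK_LEN; i++)
		if (mask[i])
			n++;
	return n;
}

/* Declared with a pointer parameter: says what the compiler really sees. */
static size_t count_set_ptr(const unsigned char *mask)
{
	size_t i, n = 0;

	for (i = 0; i < MASK_LEN; i++)
		if (mask[i])
			n++;
	return n;
}

/* Both functions have the same type, so both fit this pointer type. */
typedef size_t (*count_fn)(const unsigned char *);

static const count_fn via_array = count_set_array;
static const count_fn via_ptr = count_set_ptr;
```

That both initializers compile without a cast shows that the array
parameter was already a pointer parameter as far as the compiler is
concerned, so the pointer spelling is the less misleading one.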

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago  qeth: add layer 2 RX/TX checksum offloading
Thomas Richter [Fri, 18 Sep 2015 14:06:51 +0000 (16:06 +0200)]
qeth: add layer 2 RX/TX checksum offloading

Checksum offloading for send and receive is already
supported for layer 3 (IP layer). This patch
adds support for RX and TX hardware checksum offloading
for layer 2 (MAC layer). The hardware calculates the checksum
for IP UDP and TCP packets.

This patch moves the hardware checksum offloading setup
to the set of common functions in qeth_core_main.c.
Layer 2 and layer 3 now simply call the same common functions.

Also note that TX checksum offloading is always enabled.
The device driver relies on the TCP/IP stack to make use of
this feature.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Reviewed-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago  qeth: move OSA portname into deprecated status
Ursula Braun [Fri, 18 Sep 2015 14:06:50 +0000 (16:06 +0200)]
qeth: move OSA portname into deprecated status

An OSA-Express port name was required to identify a shared OSA port.
All operating system instances that shared the port had to use the
same port name. This requirement no longer applies.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago  qeth: no write permission for readonly sysattr
Lakhvich Dmitriy [Fri, 18 Sep 2015 14:06:49 +0000 (16:06 +0200)]
qeth: no write permission for readonly sysattr

The user is not allowed to write into the bridge_state sysfs file.
Fix the attribute so that it does not mislead the user.

Signed-off-by: Lakhvich Dmitriy <ldmitriy@ru.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Reported-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Reviewed-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
Reviewed-by: Thomas Richter <tmricht@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago  qeth: remove extraneous length from %pM format
Eugene Crosser [Fri, 18 Sep 2015 14:06:48 +0000 (16:06 +0200)]
qeth: remove extraneous length from %pM format

Length specifier in the %pM format is not supported (at least, not
documented). Remove it, and also an extraneous '&' for the array.

Signed-off-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago  Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetoot...
David S. Miller [Mon, 21 Sep 2015 23:00:44 +0000 (16:00 -0700)]
Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next

Johan Hedberg says:

====================
pull request: bluetooth-next 2015-09-18

Here's the first bluetooth-next pull request for the 4.4 kernel:

 - ieee802154 cleanups & fixes
 - debugfs support for the at86rf230 driver
 - Support for quirky (seemingly counterfeit) CSR Bluetooth controllers
 - Power management and device config improvements for Intel controllers
 - Fix for devices with incorrect advertising data length
 - Fix for closing HCI user channel socket

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago  can: flexcan: enable interrupts atomically at the end of flexcan_chip_start()
Marc Kleine-Budde [Thu, 27 Aug 2015 12:24:48 +0000 (14:24 +0200)]
can: flexcan: enable interrupts atomically at the end of flexcan_chip_start()

This patch defers the writing of the interrupt bits of the CTRL register in
order to enable all interrupts atomically at the end of the
flexcan_chip_start() function.

Suggested-by: Torsten Lang <torsten.lang@uweschneider.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
9 years ago  can: flexcan: give member of flexcan_priv holding mailboxes a sensible name
Marc Kleine-Budde [Tue, 25 Aug 2015 08:39:19 +0000 (10:39 +0200)]
can: flexcan: give member of flexcan_priv holding mailboxes a sensible name

This patch gives the member of flexcan_priv holding mailboxes a sensible name,
by renaming from "cantxfg" to "mb":

    struct flexcan_priv::cantxfg -> struct flexcan_priv::mb

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
9 years ago  can: flexcan: use pointer to struct regs instead of void pointer for mmio address...
Marc Kleine-Budde [Fri, 8 May 2015 07:32:58 +0000 (09:32 +0200)]
can: flexcan: use pointer to struct regs instead of void pointer for mmio address space

This patch renames the pointer to the mmio address space from "base" to "regs"
and changes the type from "void __iomem *" to "struct flexcan_regs __iomem *".

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
9 years ago  can: flexcan: rename feature into quirks
Marc Kleine-Budde [Fri, 8 May 2015 13:22:36 +0000 (15:22 +0200)]
can: flexcan: rename feature into quirks

This patch renames the "features" member of struct flexcan_devtype_data to
"quirks". The corresponding defines are renamed too, to reflect what they
actually do.

    FLEXCAN_HAS_V10_FEATURES      -> FLEXCAN_QUIRK_DISABLE_RXFG
    FLEXCAN_HAS_BROKEN_ERR_STATE  -> FLEXCAN_QUIRK_BROKEN_ERR_STATE
    FLEXCAN_HAS_MECR_FEATURES     -> FLEXCAN_QUIRK_DISABLE_MECR
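
A minimal sketch of how such quirk flags are typically tested. The flag names
come from the list above; the bit positions, struct demo_devtype_data and
demo_has_quirk() are assumptions for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Names are from the commit; the bit positions are assumed. */
#define FLEXCAN_QUIRK_BROKEN_ERR_STATE (1u << 1)
#define FLEXCAN_QUIRK_DISABLE_RXFG     (1u << 2)
#define FLEXCAN_QUIRK_DISABLE_MECR     (1u << 3)

/* Hypothetical stand-in for struct flexcan_devtype_data. */
struct demo_devtype_data {
	uint32_t quirks;
};

/* True when the given quirk bit is set for this device type. */
bool demo_has_quirk(const struct demo_devtype_data *d, uint32_t quirk)
{
	return (d->quirks & quirk) != 0;
}
```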

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
9 years agocan: flexcan: flexcan_chip_start(): cleanup writing of reg_mcr
Marc Kleine-Budde [Mon, 31 Aug 2015 19:32:34 +0000 (21:32 +0200)]
can: flexcan: flexcan_chip_start(): cleanup writing of reg_mcr

This patch changes the order the individual bits of the mcr register in
flexcan_chip_start() are or'ed together to match the datasheet. The inline
documentation is adjusted accordingly.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
9 years agocan: flexcan: remove unused header files
Marc Kleine-Budde [Sat, 9 May 2015 16:25:05 +0000 (18:25 +0200)]
can: flexcan: remove unused header files

This patch removes unused header files from the flexcan driver.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
9 years agocan: headers: make header files self contained
Marc Kleine-Budde [Sat, 9 May 2015 15:47:52 +0000 (17:47 +0200)]
can: headers: make header files self contained

This patch adds the missing #include-s to the dev.h and led.h, so that they can
be used without including further header files.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
9 years agocan: flexcan: cleanup coding style and fix typos
Marc Kleine-Budde [Thu, 6 Aug 2015 12:53:57 +0000 (14:53 +0200)]
can: flexcan: cleanup coding style and fix typos

This patch fixes up the coding style to make checkpatch happier. Some typos are
also fixed.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
9 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Mon, 21 Sep 2015 05:26:58 +0000 (22:26 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-09-17

This series contains updates to i40e and i40evf.

Shannon provides updates to i40e and i40evf to resolve an issue with the
nvmupdate utility.  First renames a variable name to reduce confusion and
to differentiate it from the actual user variable.  Then added the ability
to save the admin queue write back descriptor if a caller supplies a
buffer for it to be saved into.  Added a new GetStatus command so that
the NVM update tool can query the current status instead of doing fake
write requests to probe for readiness.  Added wait states to the NVM
update state machine to signify when waiting for an update operation to
finish, whether we are in the middle of a set of write operations, or we
are now idle but waiting.  Then added a facility to run admin queue
commands through the NVM update utility in order to allow the update
tools to interact with the firmware and do special commands needed for
updates and configuration changes.  Also added a facility to recover the
result of a previously run admin queue command.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge tag 'linux-can-next-for-4.4-20150917' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Mon, 21 Sep 2015 04:58:23 +0000 (21:58 -0700)]
Merge tag 'linux-can-next-for-4.4-20150917' of git://git./linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2015-09-17

this is a pull request of two patches for net-next/master.

Gerhard Bertelsmann adds support for the CAN controller found on the
Allwinner A10/A20 SoC.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agorxrpc: Replace get_seconds with ktime_get_seconds
Ksenija Stanojevic [Thu, 17 Sep 2015 16:12:53 +0000 (18:12 +0200)]
rxrpc: Replace get_seconds with ktime_get_seconds

Replace the time_t type and the get_seconds function, which are not y2038
safe on 32-bit systems. The ktime_get_seconds function uses monotonic instead
of real time and therefore will not cause overflow.
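
For context, a signed 32-bit time_t can hold at most 2^31 - 1 = 2147483647
seconds since the epoch, which is reached in January 2038. The helper below
is a hypothetical user-space check, not kernel code; it only shows where that
boundary lies.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when a seconds count still fits a signed 32-bit time_t. */
bool demo_fits_time32(int64_t secs)
{
	return secs >= INT32_MIN && secs <= INT32_MAX;
}
```

A monotonic seconds count starts near zero at boot, so it stays far below
this limit for any realistic uptime.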

Signed-off-by: Ksenija Stanojevic <ksenija.stanojevic@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'hsilicon-net-subsys'
David S. Miller [Mon, 21 Sep 2015 04:42:58 +0000 (21:42 -0700)]
Merge branch 'hsilicon-net-subsys'

huangdaode says:

====================
net: Hisilicon Network Subsystem support

This is V2 of the Hisilicon Network Subsystem (HNS) patchset, addressing the
LKML comments.

The changes are listed in the change log below.
This patchset is rebased on the mainline Linux 4.3-rc1 kernel.

[PATCH v2 1/5] Device Tree Binding Documentation
[PATCH v2 2/5] Merge MDIO Module
[PATCH v2 3/5] Hisilicon Network Acceleration Engine Framework
[PATCH v2 4/5] Distributed System Area Fabric Module
[PATCH v2 5/5] Basic Ethernet Driver Module

Changes from V1:
1. Remove "inline" in C file (according to LKML comment, same in below).
2. Fix a bug about class_find_device.
3. Change the DTS pattern on hnae, restructure it to be compatible with the
   Hi1610 SoC.
4. Unify hip04_mdio and hip05_mdio into hns_mdio, which is more usual for
   later SoCs.

V1 Patches Reference: https://lkml.org/lkml/2015/8/14/165
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: add Hisilicon Network Subsystem basic ethernet support
huangdaode [Thu, 17 Sep 2015 06:51:50 +0000 (14:51 +0800)]
net: add Hisilicon Network Subsystem basic ethernet support

This adds basic ethernet support for HNS. It is one of the ways to
use the HNS acceleration engine, but most of the decoding/encoding
capability of the AE cannot be used in this way.

This submission contains the basic features of an ethernet driver. More will
be added later.

Signed-off-by: huangdaode <huangdaode@hisilicon.com>
Signed-off-by: Kenneth Lee <liguozhu@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: add Hisilicon Network Subsystem DSAF support
huangdaode [Thu, 17 Sep 2015 06:51:49 +0000 (14:51 +0800)]
net: add Hisilicon Network Subsystem DSAF support

DSAF, namely Distributed System Area Fabric, is one of the HNS
acceleration engine implementations. This patch adds the DSAF driver to the
system.

hns_ae_adapt: the adaptor for registering the driver to HNAE framework
hns_dsaf_mac: MAC cover interface for GE and XGE
hns_dsaf_gmac: GE (10/100/1000G Ethernet) MAC function
hns_dsaf_xgmac: XGE (10000+G Ethernet) MAC function
hns_dsaf_main: the platform device driver for the whole hardware
hns_dsaf_misc: some misc helper function, such as LED support
hns_dsaf_ppe: packet process engine function
hns_dsaf_rcb: ring buffer function

Signed-off-by: huangdaode <huangdaode@hisilicon.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: Kenneth Lee <liguozhu@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: add Hisilicon Network Subsystem hnae framework support
huangdaode [Thu, 17 Sep 2015 06:51:48 +0000 (14:51 +0800)]
net: add Hisilicon Network Subsystem hnae framework support

HNAE (Hisilicon Network Acceleration Engine) is a framework to provide a
unified ring buffer interface for Hisilicon Network Acceleration
Engines.

With the interface, upper layer can work as ethernet driver, ODP driver
or other service driver on purpose.

Signed-off-by: huangdaode <huangdaode@hisilicon.com>
Signed-off-by: Kenneth Lee <liguozhu@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: add Hisilicon Network Subsystem MDIO support
huangdaode [Thu, 17 Sep 2015 06:51:47 +0000 (14:51 +0800)]
net: add Hisilicon Network Subsystem MDIO support

MDIO support for the Hisilicon Network Subsystem. It is used in the Hisilicon
hip04, hip05 and Hi1610 SoCs to control the external PHY.

Signed-off-by: huangdaode <huangdaode@hisilicon.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: Kenneth Lee <liguozhu@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: add Hisilicon Network Subsystem support (config and documents)
huangdaode [Thu, 17 Sep 2015 06:51:46 +0000 (14:51 +0800)]
net: add Hisilicon Network Subsystem support (config and documents)

The Hisilicon Network Subsystem is a long term evolution IP which is
supposed to be used in Hisilicon ICT SoC. The IP, which is called hns
for short, is a TCP/IP acceleration engine, which can directly decode
TCP/IP stream and distribute them to different ring buffers.

HNS can be configured to work in different modes for different scenarios.
This patch makes use of only some of the modes to make it work as a standard
ethernet NIC. The other modes will be added soon.

The whole function has 4 kernel sub-modules:

hnae: the HNS acceleration engine framework. It provides an abstract
interface between the engine and the upper layers, which make use of the
engine by ring buffer.

hns_enet_drv: a standard ethernet driver that is based on the ring buffer.

hns_dsaf: one of the implementations of the HNS acceleration engine, which is
applied on Hisilicon hip05, Hi1610 and other later SoCs

hns_mdio: the mdio control to the PHY, used by acceleration engine

This submission adds the basic config and documents.

Signed-off-by: huangdaode <huangdaode@hisilicon.com>
Signed-off-by: Kenneth Lee <liguozhu@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoxen-netfront: always set num queues if possible
chas williams [Wed, 16 Sep 2015 20:28:25 +0000 (16:28 -0400)]
xen-netfront: always set num queues if possible

If netfront connects with two (or more) queues and then reconnects with
only one queue it fails to delete or rewrite the multi-queue-num-queues
key and netback will try to use the wrong number of queues.

Always write the num-queues field if the backend has multi-queue support.

Signed-off-by: Chas Williams <3chas3@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoBluetooth: Fix reporting incorrect EIR in device found mgmt event
Szymon Janc [Wed, 16 Sep 2015 18:21:54 +0000 (20:21 +0200)]
Bluetooth: Fix reporting incorrect EIR in device found mgmt event

Some remote devices (e.g. Gigaset G-Tag) misbehave with ADV data length.
This can lead to incorrect EIR format in device found event when
ADV_DATA and SCAN_RSP are merged (terminator field before SCAN_RSP
part).

Fix this by inspecting ADV_DATA and correcting its length if a terminator
is found.

> HCI Event: LE Meta Event (0x3e) plen 42              [hci0] 32.172182
      LE Advertising Report (0x02)
        Num reports: 1
        Event type: Connectable undirected - ADV_IND (0x00)
        Address type: Public (0x00)
        Address: 7C:2F:80:94:97:5A (Gigaset Communications GmbH)
        Data length: 30
        Flags: 0x06
          LE General Discoverable Mode
          BR/EDR Not Supported
        Company: Gigaset Communications GmbH (384)
          Data: 021512348094975abbc5
        16-bit Service UUIDs (partial): 1 entry
          Battery Service (0x180f)
        RSSI: -65 dBm (0xbf)
> HCI Event: LE Meta Event (0x3e) plen 27              [hci0] 32.172191
      LE Advertising Report (0x02)
        Num reports: 1
        Event type: Scan response - SCAN_RSP (0x04)
        Address type: Public (0x00)
        Address: 7C:2F:80:94:97:5A (Gigaset Communications GmbH)
        Data length: 15
        Name (complete): Gigaset G-tag
        RSSI: -59 dBm (0xc5)

Note "Data length: 30" in ADV_DATA, which results in 9 extra zero bytes
after the Battery Service UUID. The terminator field present in the middle
of the EIR in the Device Found event caused userspace to stop parsing the
EIR and skip the device name.

@ Device Found: 7C:2F:80:94:97:5A (1) rssi -59 flags 0x0000
      02 01 06 0d ff 80 01 02 15 12 34 80 94 97 5a bb  ..........4...Z.
      c5 03 02 0f 18 00 00 00 00 00 00 00 00 00 0e 09  ................
      47 69 67 61 73 65 74 20 47 2d 74 61 67           Gigaset G-tag

With this fix EIR with merged ADV_DATA and SCAN_RSP in device found
event is properly formatted:

@ Device Found: 7C:2F:80:94:97:5A (1) rssi -59 flags 0x0000
      02 01 06 0d ff 80 01 02 15 12 34 80 94 97 5a bb  ..........4...Z.
      c5 03 02 0f 18 0e 09 47 69 67 61 73 65 74 20 47  .......Gigaset G
      2d 74 61 67                                      -tag
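
The length correction described above can be sketched as a walk over the
length/type/value fields that stops at the first zero length byte. This is an
illustrative user-space helper with a hypothetical name (demo_adv_real_len),
not the code added by the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the number of bytes occupied by well-formed length/type/value
 * fields. A zero length byte is treated as the terminator. */
size_t demo_adv_real_len(const uint8_t *data, size_t len)
{
	size_t i = 0;

	while (i < len && data[i] != 0) {
		size_t field = (size_t)data[i] + 1; /* length byte + payload */

		if (field > len - i)
			break; /* field would run past the buffer */
		i += field;
	}
	return i;
}
```

For the 30-byte ADV_DATA shown in the trace above, the fields cover 3 + 14 + 4
bytes, so the helper reports 21 and the 9 padding bytes are dropped before
SCAN_RSP is appended.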

Signed-off-by: Szymon Janc <ext.szymon.janc@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoBluetooth: Add BT_ERR_RATELIMITED
Szymon Janc [Wed, 16 Sep 2015 18:21:53 +0000 (20:21 +0200)]
Bluetooth: Add BT_ERR_RATELIMITED

This patch adds ratelimited version of the BT_ERR macro.

Signed-off-by: Szymon Janc <ext.szymon.janc@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agosch_dsmark: improve memory locality
Eric Dumazet [Thu, 17 Sep 2015 23:37:13 +0000 (16:37 -0700)]
sch_dsmark: improve memory locality

Memory placement in sch_dsmark is silly : Better place mask/value
in the same cache line.

Also, we can embed small arrays in the first cache line and
remove a potential cache miss.
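
A minimal sketch of the layout idea, assuming the mask and value were
previously kept in two separate arrays. The struct and function names are
hypothetical, and the remarking formula (keep the masked bits, then OR in the
value) is stated here as an assumption about how the pair is used.

```c
#include <stddef.h>
#include <stdint.h>

/* Before: two parallel arrays, so mask[i] and value[i] sit 64 bytes apart. */
struct demo_split {
	uint8_t mask[64];
	uint8_t value[64];
};

/* After: each mask sits directly next to its value. */
struct demo_pair {
	uint8_t mask;
	uint8_t value;
};

/* Apply one entry: keep the masked bits, then OR in the value. */
uint8_t demo_remark(const struct demo_pair *mv, uint8_t dsfield)
{
	return (uint8_t)((dsfield & mv->mask) | mv->value);
}
```

With the paired layout a single cache line fetch brings in both bytes needed
to remark a packet.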

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'bcmgenet-irq-coalesce'
David S. Miller [Fri, 18 Sep 2015 05:17:14 +0000 (22:17 -0700)]
Merge branch 'bcmgenet-irq-coalesce'

Florian Fainelli says:

====================
net: bcmgenet: Interrupt coalescing

This patch series adds support for interrupt coalescing for GENET
adapters.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: bcmgenet: Implement RX coalescing control knobs
Florian Fainelli [Wed, 16 Sep 2015 23:47:40 +0000 (16:47 -0700)]
net: bcmgenet: Implement RX coalescing control knobs

Add support for the ethtool rx-frames coalescing parameter, which allows
defining the number of frames received per RX interrupt. The RDMA
engine supports a configurable timeout with a resolution of
approximately 8.192 us.
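
Given the 8.192 us resolution, a requested timeout in microseconds has to be
converted into hardware ticks. The conversion below is a sketch that rounds
up so that a non-zero request never becomes zero ticks; the rounding choice
and the names DEMO_TICK_NS and demo_usecs_to_ticks() are assumptions.

```c
#include <stdint.h>

#define DEMO_TICK_NS 8192u /* one timeout tick is about 8.192 us */

/* Convert microseconds to timeout ticks, rounding up. */
uint32_t demo_usecs_to_ticks(uint32_t usecs)
{
	uint64_t ns = (uint64_t)usecs * 1000u;

	return (uint32_t)((ns + DEMO_TICK_NS - 1) / DEMO_TICK_NS);
}
```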

We can no longer enable the BDONE/PDONE interrupts as those would
fire for each packet/buffer received, which would defeat the MBDONE
interrupt purpose. The MBDONE interrupt is guaranteed to correspond to a
PDONE/BDONE interrupt when the threshold is set to 1.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: bcmgenet: Implement TX coalescing control knobs
Florian Fainelli [Wed, 16 Sep 2015 23:47:39 +0000 (16:47 -0700)]
net: bcmgenet: Implement TX coalescing control knobs

Configuring the ethtool tx-frames property, which translates into N
packets before a TX interrupt, is the simplest configuration scheme
because it requires no locking at either the software or the hardware
level, and is completely independent of the link speed. Since ethtool
does not allow per-tx queue coalescing parameters, we apply the same
setting to any transmit queue.

We can no longer enable the BDONE/PDONE interrupts as those would fire
for each packet/buffer received, which would defeat the MBDONE interrupt
purpose. The MBDONE interrupt is guaranteed to correspond to a
PDONE/BDONE interrupt when the threshold is set to 1, but offers
interrupt coalescing when the value is > 1.

Since the HW is configured to generate an interrupt when the ring
becomes empty, we have to deny any timeout/timer settings coming from
user-space to indicate we can only generate an interrupt every <N>
packets.

While we are at it, fix the DMA_INTR_THRESHOLD_MASK value which was off
by one bit (0xff vs. 0x1ff).
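
To see what a one-bit-short mask does, the sketch below applies both widths
to a threshold value. It assumes 0xff is the old value and 0x1ff the
corrected one; the macro and function names are hypothetical.

```c
#include <stdint.h>

#define DEMO_THRESHOLD_MASK_8BIT 0xffu  /* assumed old value, one bit short */
#define DEMO_THRESHOLD_MASK_9BIT 0x1ffu /* assumed corrected value */

/* Keep only the threshold bits selected by the mask. */
uint32_t demo_apply_mask(uint32_t threshold, uint32_t mask)
{
	return threshold & mask;
}
```

With the narrower mask a threshold of 256 is silently truncated to 0, while
the 9-bit mask preserves it.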

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agolan78xx: Remove not defined MAC_CR_GMII_EN_ bit from MAC_CR.
Woojung.Huh@microchip.com [Wed, 16 Sep 2015 23:41:19 +0000 (23:41 +0000)]
lan78xx: Remove not defined MAC_CR_GMII_EN_ bit from MAC_CR.

Remove not defined MAC_CR_GMII_EN_ bit from MAC_CR.

Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agolan78xx: Create lan78xx_get_mdix_status() and lan78xx_set_mdix_status() for MDIX...
Woojung.Huh@microchip.com [Wed, 16 Sep 2015 23:41:14 +0000 (23:41 +0000)]
lan78xx: Create lan78xx_get_mdix_status() and lan78xx_set_mdix_status() for MDIX control.

Create lan78xx_get_mdix_status() and lan78xx_set_mdix_status() for MDIX control.

Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agolan78xx: Remove phy defines in lan78xx.h and use defines in include/linux/microchipphy.h
Woojung.Huh@microchip.com [Wed, 16 Sep 2015 23:41:07 +0000 (23:41 +0000)]
lan78xx: Remove phy defines in lan78xx.h and use defines in include/linux/microchipphy.h

Remove phy defines in lan78xx.h and use defines in include/linux/microchipphy.h.

Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agolan78xx: Update to use phylib instead of mii_if_info.
Woojung.Huh@microchip.com [Wed, 16 Sep 2015 23:40:54 +0000 (23:40 +0000)]
lan78xx: Update to use phylib instead of mii_if_info.

Update to use phylib instead of mii_if_info.

Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agolan78xx: Add PHYLIB and MICROCHIP_PHY as default config.
Woojung.Huh@microchip.com [Wed, 16 Sep 2015 23:40:47 +0000 (23:40 +0000)]
lan78xx: Add PHYLIB and MICROCHIP_PHY as default config.

Add PHYLIB and MICROCHIP_PHY as default configuration for lan78xx.

Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agolan78xx: Check device ready bit (PMT_CTL_READY_) after reset the PHY
Woojung.Huh@microchip.com [Wed, 16 Sep 2015 23:40:39 +0000 (23:40 +0000)]
lan78xx: Check device ready bit (PMT_CTL_READY_) after reset the PHY

Check the device ready bit (PMT_CTL_READY_) after resetting the PHY.
Depending on configuration, the device may not be ready even if PHY_RST_ is
cleared.
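
A generic sketch of polling a ready bit with a bounded number of attempts.
The bit position, the callback-based register read and the fake register are
all assumptions for illustration; a real driver would also sleep between
reads and use its own register accessors.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PMT_CTL_READY (1u << 7) /* bit position is an assumption */

/* Poll a register through a caller-supplied read function until the ready
 * bit is set or the attempt budget runs out. */
bool demo_wait_ready(uint32_t (*read_pmt_ctl)(void *ctx), void *ctx,
		     unsigned int max_polls)
{
	while (max_polls--) {
		if (read_pmt_ctl(ctx) & DEMO_PMT_CTL_READY)
			return true;
	}
	return false;
}

/* Fake register for illustration: reports ready once the counter that ctx
 * points to has been decremented to zero by earlier reads. */
uint32_t demo_fake_read(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining == 0)
		return DEMO_PMT_CTL_READY;
	(*remaining)--;
	return 0;
}
```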

Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: Initialize table in fib result
David Ahern [Wed, 16 Sep 2015 16:16:39 +0000 (10:16 -0600)]
net: Initialize table in fib result

Sergey, Richard and Fabio reported an oops in ip_route_input_noref. e.g., from Richard:

[    0.877040] BUG: unable to handle kernel NULL pointer dereference at 0000000000000056
[    0.877597] IP: [<ffffffff8155b5e2>] ip_route_input_noref+0x1a2/0xb00
[    0.877597] PGD 3fa14067 PUD 3fa6e067 PMD 0
[    0.877597] Oops: 0000 [#1] SMP
[    0.877597] Modules linked in: virtio_net virtio_pci virtio_ring virtio
[    0.877597] CPU: 1 PID: 119 Comm: ifconfig Not tainted 4.2.0+ #1
[    0.877597] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    0.877597] task: ffff88003fab0bc0 ti: ffff88003faa8000 task.ti: ffff88003faa8000
[    0.877597] RIP: 0010:[<ffffffff8155b5e2>]  [<ffffffff8155b5e2>] ip_route_input_noref+0x1a2/0xb00
[    0.877597] RSP: 0018:ffff88003ed03ba0  EFLAGS: 00010202
[    0.877597] RAX: 0000000000000046 RBX: 00000000ffffff8f RCX: 0000000000000020
[    0.877597] RDX: ffff88003fab50b8 RSI: 0000000000000200 RDI: ffffffff8152b4b8
[    0.877597] RBP: ffff88003ed03c50 R08: 0000000000000000 R09: 0000000000000000
[    0.877597] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003fab6f00
[    0.877597] R13: ffff88003fab5000 R14: 0000000000000000 R15: ffffffff81cb5600
[    0.877597] FS:  00007f6de5751700(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
[    0.877597] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.877597] CR2: 0000000000000056 CR3: 000000003fa6d000 CR4: 00000000000006e0
[    0.877597] Stack:
[    0.877597]  0000000000000000 0000000000000046 ffff88003fffa600 ffff88003ed03be0
[    0.877597]  ffff88003f9e2c00 697da8c0017da8c0 ffff880000000000 000000000007fd00
[    0.877597]  0000000000000000 0000000000000046 0000000000000000 0000000400000000
[    0.877597] Call Trace:
[    0.877597]  <IRQ>
[    0.877597]  [<ffffffff812bfa1f>] ? cpumask_next_and+0x2f/0x40
[    0.877597]  [<ffffffff8158e13c>] arp_process+0x39c/0x690
[    0.877597]  [<ffffffff8158e57e>] arp_rcv+0x13e/0x170
[    0.877597]  [<ffffffff8151feec>] __netif_receive_skb_core+0x60c/0xa00
[    0.877597]  [<ffffffff81515795>] ? __build_skb+0x25/0x100
[    0.877597]  [<ffffffff81515795>] ? __build_skb+0x25/0x100
[    0.877597]  [<ffffffff81521ff6>] __netif_receive_skb+0x16/0x70
[    0.877597]  [<ffffffff81522078>] netif_receive_skb_internal+0x28/0x90
[    0.877597]  [<ffffffff8152288f>] napi_gro_receive+0x7f/0xd0
[    0.877597]  [<ffffffffa0017906>] virtnet_receive+0x256/0x910 [virtio_net]
[    0.877597]  [<ffffffffa0017fd8>] virtnet_poll+0x18/0x80 [virtio_net]
[    0.877597]  [<ffffffff815234cd>] net_rx_action+0x1dd/0x2f0
[    0.877597]  [<ffffffff81053228>] __do_softirq+0x98/0x260
[    0.877597]  [<ffffffff8164969c>] do_softirq_own_stack+0x1c/0x30

The root cause is use of res.table uninitialized.

Thanks to Nikolay for noticing the uninitialized use amongst the maze of
gotos.

As Nikolay pointed out the second initialization is not required to fix
the oops, but rather to fix a related problem where a valid lookup should
be invalidated before creating the rth entry.
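
A minimal sketch of the failure mode, not the actual routing code: a result
struct is filled in on one path only, while a goto can jump past the lookup to
code that reads the field. Setting the field before the first goto keeps
every path well defined. All names below are hypothetical.

```c
#include <stddef.h>

struct demo_fib_result {
	const int *table; /* stands in for the FIB table pointer */
};

static const int demo_table = 254;

/* Returns the table id, or -1 when no table was looked up. */
int demo_route_input(int skip_lookup)
{
	struct demo_fib_result res;

	res.table = NULL; /* initialise before any goto can bypass the lookup */

	if (skip_lookup)
		goto make_route;

	res.table = &demo_table; /* normal path: the lookup fills the field */

make_route:
	return res.table ? *res.table : -1;
}
```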

Fixes: b7503e0cdb5d ("net: Add FIB table id to rtable")
Reported-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Reported-by: Richard Alpe <richard.alpe@ericsson.com>
Reported-by: Fabio Estevam <festevam@gmail.com>
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'bpf_avoid_clone'
David S. Miller [Fri, 18 Sep 2015 04:09:07 +0000 (21:09 -0700)]
Merge branch 'bpf_avoid_clone'

Alexei Starovoitov says:

====================
bpf: performance improvements

v1->v2: dropped redundant iff_up check in patch 2

At plumbers we discussed different options on how to get rid of skb_clone
from bpf_clone_redirect(), the patch 2 implements the best option.
Patch 1 adds 'integrated exts' to cls_bpf to improve performance by
combining simple actions into bpf classifier.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobpf: add bpf_redirect() helper
Alexei Starovoitov [Wed, 16 Sep 2015 06:05:43 +0000 (23:05 -0700)]
bpf: add bpf_redirect() helper

Existing bpf_clone_redirect() helper clones skb before redirecting
it to RX or TX of destination netdev.
Introduce bpf_redirect() helper that does that without cloning.

Benchmarked with two hosts using 10G ixgbe NICs.
One host is doing line rate pktgen.
Another host is configured as:
$ tc qdisc add dev $dev ingress
$ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \
   action bpf run object-file tcbpf1_kern.o section clone_redirect_xmit drop
so it receives the packet on $dev and immediately xmits it on $dev + 1
The section 'clone_redirect_xmit' in tcbpf1_kern.o file has the program
that does bpf_clone_redirect() and performance is 2.0 Mpps

$ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \
   action bpf run object-file tcbpf1_kern.o section redirect_xmit drop
which is using bpf_redirect() - 2.4 Mpps

and using cls_bpf with integrated actions as:
$ tc filter add dev $dev root pref 10 \
  bpf run object-file tcbpf1_kern.o section redirect_xmit integ_act classid 1
performance is 2.5 Mpps

To summarize:
u32+act_bpf using clone_redirect - 2.0 Mpps
u32+act_bpf using redirect - 2.4 Mpps
cls_bpf using redirect - 2.5 Mpps

For comparison linux bridge in this setup is doing 2.1 Mpps
and ixgbe rx + drop in ip_rcv - 7.8 Mpps

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agocls_bpf: introduce integrated actions
Daniel Borkmann [Wed, 16 Sep 2015 06:05:42 +0000 (23:05 -0700)]
cls_bpf: introduce integrated actions

Often cls_bpf classifier is used with single action drop attached.
Optimize this use case and let cls_bpf return both classid and action.
For backwards compatibility reasons enable this feature under
TCA_BPF_FLAG_ACT_DIRECT flag.

Then more interesting programs like the following are easier to write:
int cls_bpf_prog(struct __sk_buff *skb)
{
  /* classify arp, ip, ipv6 into different traffic classes
   * and drop all other packets
   */
  switch (skb->protocol) {
  case htons(ETH_P_ARP):
    skb->tc_classid = 1;
    break;
  case htons(ETH_P_IP):
    skb->tc_classid = 2;
    break;
  case htons(ETH_P_IPV6):
    skb->tc_classid = 3;
    break;
  default:
    return TC_ACT_SHOT;
  }

  return TC_ACT_OK;
}

Joint work with Daniel Borkmann.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: only check perm protocol when register proto
Junwei Zhang [Fri, 18 Sep 2015 04:00:05 +0000 (00:00 -0400)]
net: only check perm protocol when register proto

The permanent protocol nodes are at the head of the list,
so only these nodes need to be checked.

Whether or not the new node is permanent,
insert it after the last permanent protocol node.

If the new node conflicts with an existing permanent node,
return an error.
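
The registration rule above can be sketched with a plain singly linked list.
This is an illustration under the stated assumption that permanent entries
are grouped at the head; struct demo_proto, demo_register() and the -1 error
value are hypothetical and do not mirror the kernel's data structures.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_proto {
	int protocol;
	bool permanent;
	struct demo_proto *next;
};

/* Insert 'p' after the last permanent entry of the list at *head.
 * Only the permanent entries at the head are inspected. Returns 0 on
 * success, -1 if a permanent entry already claims the same protocol. */
int demo_register(struct demo_proto **head, struct demo_proto *p)
{
	struct demo_proto **link = head;

	while (*link && (*link)->permanent) {
		if ((*link)->protocol == p->protocol)
			return -1;
		link = &(*link)->next;
	}
	p->next = *link;
	*link = p;
	return 0;
}
```

The loop stops at the first non-permanent entry, so the cost of registration
depends only on the number of permanent protocols.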

Signed-off-by: Martin Zhang <martinbj2008@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobonding: use l4 hash if available
Eric Dumazet [Tue, 15 Sep 2015 22:24:28 +0000 (15:24 -0700)]
bonding: use l4 hash if available

If skb carries a l4 hash, no need to perform a flow dissection.

Performance is slightly better :

lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.39012e+06
lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.39393e+06
lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.39988e+06

After patch :

lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.43579e+06
lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.44304e+06
lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.44312e+06
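
The shortcut described at the top of this message can be sketched as follows.
struct demo_skb is a hypothetical stand-in for the relevant sk_buff fields,
and the XOR mixing in demo_dissect_hash() is a placeholder, not the real flow
dissector.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, minimal stand-in for the relevant sk_buff fields. */
struct demo_skb {
	uint32_t hash;
	bool l4_hash; /* true when 'hash' already covers the L4 ports */
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

/* Slow path: derive a hash from the flow tuple (placeholder mixing). */
uint32_t demo_dissect_hash(const struct demo_skb *skb)
{
	uint32_t h = skb->saddr ^ skb->daddr;

	h ^= ((uint32_t)skb->sport << 16) | skb->dport;
	return h;
}

/* Pick the transmit hash, reusing the stored one when it is an L4 hash. */
uint32_t demo_xmit_hash(const struct demo_skb *skb)
{
	if (skb->l4_hash)
		return skb->hash; /* no dissection needed */
	return demo_dissect_hash(skb);
}
```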

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotcp: provide skb->hash to synack packets
Eric Dumazet [Tue, 15 Sep 2015 22:24:20 +0000 (15:24 -0700)]
tcp: provide skb->hash to synack packets

In commit b73c3d0e4f0e ("net: Save TX flow hash in sock and set in skbuf
on xmit"), Tom provided a l4 hash to most outgoing TCP packets.

We'd like to provide one as well for SYNACK packets, so that all packets
of a given flow share same txhash, to later enable bonding driver to
also use skb->hash to perform slave selection.

Note that a SYNACK retransmit shuffles the tx hash, as Tom did
in commit 265f94ff54d62 ("net: Recompute sk_txhash on negative routing
advice") for established sockets.

This has the nice effect of making TCP flows resilient to some kinds of black
holes, even at the connection establishment phase.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoi40e/i40evf: Bump i40e to 1.3.21 and i40evf to 1.3.13
Catherine Sullivan [Fri, 28 Aug 2015 21:56:01 +0000 (17:56 -0400)]
i40e/i40evf: Bump i40e to 1.3.21 and i40evf to 1.3.13

Bump.

Change-ID: If7ce84218361defa209142d1d8c6f69d48c2d7ad
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years agoi40e/i40evf: add get AQ result command to nvmupdate utility
Shannon Nelson [Fri, 28 Aug 2015 21:55:51 +0000 (17:55 -0400)]
i40e/i40evf: add get AQ result command to nvmupdate utility

Add a facility to recover the result of a previously run AQ command.

Change-ID: I21afec2c20c1a5e6ba60c7fbfcbedfff78c10e45
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years agoi40e/i40evf: add exec_aq command to nvmupdate utility
Shannon Nelson [Fri, 28 Aug 2015 21:55:50 +0000 (17:55 -0400)]
i40e/i40evf: add exec_aq command to nvmupdate utility

Add a facility to run AQ commands through the nvmupdate utility in order
to allow the update tools to interact with the FW and do special
commands needed for updates and configuration changes.

Change-ID: I5c41523e4055b37f8e4ee479f7a0574368f4a588
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years agoi40e/i40evf: add wait states to NVM state machine
Shannon Nelson [Fri, 28 Aug 2015 21:55:49 +0000 (17:55 -0400)]
i40e/i40evf: add wait states to NVM state machine

This adds wait states to the NVM update state machine to signify when
waiting for an update operation to finish, whether we're in the middle
of a set of Write operations, or we're now idle but waiting.
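
One way to picture the wait states is as two extra states that each remember
where to return once the pending operation completes. The enum values and
the transition function below are hypothetical names for illustration, not
the driver's definitions.

```c
enum demo_nvm_state {
	DEMO_NVM_INIT,       /* idle */
	DEMO_NVM_WRITING,    /* in the middle of a set of writes */
	DEMO_NVM_INIT_WAIT,  /* operation pending, idle afterwards */
	DEMO_NVM_WRITE_WAIT, /* operation pending, more writes follow */
};

/* State to move to once the pending operation completes. */
enum demo_nvm_state demo_nvm_on_complete(enum demo_nvm_state s)
{
	switch (s) {
	case DEMO_NVM_INIT_WAIT:
		return DEMO_NVM_INIT;
	case DEMO_NVM_WRITE_WAIT:
		return DEMO_NVM_WRITING;
	default:
		return s; /* nothing was pending */
	}
}
```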

Change-ID: Iabe91d6579ef6a2ea560647e374035656211ab43
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years agoi40e/i40evf: add GetStatus command for nvmupdate
Shannon Nelson [Fri, 28 Aug 2015 21:55:48 +0000 (17:55 -0400)]
i40e/i40evf: add GetStatus command for nvmupdate

This adds a new GetStatus command so that the NVM update tool can query
the current status instead of doing fake write requests to probe for
readiness.

Change-ID: I671ec6ccd4dfc9dbac3a03b964589d693fda5cd8
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years agoi40e/i40evf: add handling of writeback descriptor
Shannon Nelson [Fri, 28 Aug 2015 21:55:47 +0000 (17:55 -0400)]
i40e/i40evf: add handling of writeback descriptor

If the writeback descriptor buffer was previously created, this gives it
to the AQ command request to be used to save the results.

Change-ID: I8c8a1af81e6ebed6d0a15ed31697fe1a6c4e3708
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years agoi40e/i40evf: save aq writeback for future inspection
Shannon Nelson [Thu, 27 Aug 2015 15:42:42 +0000 (11:42 -0400)]
i40e/i40evf: save aq writeback for future inspection

Add the ability to save the AdminQ write back descriptor if a
caller supplies a buffer for it to be saved into.
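
The optional save can be sketched as a copy that only happens when the caller
passed a buffer. struct demo_aq_desc and demo_save_writeback() are
hypothetical; the real descriptor has a different layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed descriptor; not the real AdminQ layout. */
struct demo_aq_desc {
	uint16_t flags;
	uint16_t opcode;
	uint16_t retval;
};

/* Copy the completed descriptor out only if the caller asked for it. */
void demo_save_writeback(const struct demo_aq_desc *completed,
			 struct demo_aq_desc *wb_desc)
{
	if (wb_desc)
		memcpy(wb_desc, completed, sizeof(*wb_desc));
}
```

Callers that do not care about the write back simply pass NULL.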

Change-ID: I3d1301d26360b39a2d66dc8569e851f54133a3af
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years agoi40e: rename variable to prevent clash of understanding
Shannon Nelson [Thu, 23 Jul 2015 20:54:33 +0000 (16:54 -0400)]
i40e: rename variable to prevent clash of understanding

This code returns something that becomes the errno value from ethtool and
passes around a pointer to an errno variable.  This patch changes the name
slightly to differentiate it from the actual user errno variable.

Change-ID: Idaa37845c069e66f4cea072e90f471bb2142454d
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years agoMerge branch 'nf_hook_netns'
David S. Miller [Fri, 18 Sep 2015 00:18:38 +0000 (17:18 -0700)]
Merge branch 'nf_hook_netns'

Eric W. Biederman says:

====================
Passing net through the netfilter hooks

My primary goal with this patchset and its follow-ups is to clean up the
network routing paths so that we do not look at the output device to
derive the network namespace.  My plan is to pass the network namespace
of the transmitting socket through the output path, to replace code that
looks at the output network device today.  Once that is done we can have
routes with output devices outside of the current network namespace.
Which should allow reception and transmission of packets in network
namespaces to be as fast as normal packet reception and transmission
with early demux disabled, because it will use the same code path.

Once skb_dst(skb)->dev is a little better under control I think it will
also be possible to use RCU to clean up the ancient hack that sets
dst->dev to loopback_dev when a network device is removed.

The work to get there is a series of code cleanups.  I am starting with
passing net into the netfilter hooks and into the functions that are
called after the netfilter hooks.  This removes from netfilter the
need to guess which network namespace it is working on.

To get there I perform a series of minor prep patches so the big changes
at the end are possible to audit without getting lost in the noise.  In
particular I have a lot of patches computing net into a local variable
and then using it throughout the function.

So this patchset encompasses removing dead code; sorting out the _sk
functions that were added the last time someone pushed a prototype
change through the post-netfilter functions; cleaning up individual
functions' use of the network namespace; passing net into the netfilter
hooks; passing net into the post-netfilter functions; and using
state->net in the netfilter code where it is available and trivially
usable.

Pablo, Dave, I don't know whose tree this makes more sense to go
through.  I am assuming, at least initially, Pablo's, as netfilter is
involved.  From what I have seen there will be a lot of back and forth
between the netfilter code paths and the routing code paths.

The patches are also available (against 4.3-rc1) at:
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/net-next.git master
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonetfilter: Add blank lines in callers of netfilter hooks
Eric W. Biederman [Thu, 17 Sep 2015 22:21:31 +0000 (17:21 -0500)]
netfilter: Add blank lines in callers of netfilter hooks

In code review it was noticed that I had failed to add some blank lines
in places where they are customarily used.  Taking a second look at the
code, I have to agree blank lines would be nice, so I have added them
here.

Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonetfilter: Pass net into okfn
Eric W. Biederman [Wed, 16 Sep 2015 01:04:18 +0000 (20:04 -0500)]
netfilter: Pass net into okfn

This is immediately motivated by the bridge code that chains functions that
call into netfilter.  Without passing net into the okfns the bridge code would
need to guess about the best expression for the network namespace to process
packets in.

As net is frequently one of the first things computed in continuation functions
after netfilter has done its job, passing in the desired network namespace is in
many cases a code simplification.

To support this change the function dst_output_okfn is introduced to
simplify passing dst_output as an okfn.  For the moment dst_output_okfn
just silently drops the struct net.
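
A minimal sketch of such an adapter, using mock types rather than the
kernel's definitions.  The (sk, skb) signature given to dst_output here
and the run_hook helper are assumptions made for illustration only.

```c
/* Mock stand-ins for kernel types; not the real definitions. */
struct net { int id; };
struct sock { int unused; };
struct sk_buff { int sent; };

/* Shape of an okfn once struct net is passed through. */
typedef int (*okfn_t)(struct net *, struct sock *, struct sk_buff *);

/* Mock output function that has no struct net parameter. */
int dst_output(struct sock *sk, struct sk_buff *skb)
{
	(void)sk;
	skb->sent = 1;
	return 0;
}

/* Adapter: accepts net to match okfn_t, then silently drops it. */
int dst_output_okfn(struct net *net, struct sock *sk, struct sk_buff *skb)
{
	(void)net;
	return dst_output(sk, skb);
}

/* Hypothetical hook runner that ends by calling the continuation. */
int run_hook(struct net *net, struct sock *sk, struct sk_buff *skb,
	     okfn_t okfn)
{
	return okfn(net, sk, skb);
}
```

The adapter keeps callers uniform while the output function itself
remains unchanged.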

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonetfilter: Use nf_hook_state.net
Eric W. Biederman [Wed, 16 Sep 2015 01:04:17 +0000 (20:04 -0500)]
netfilter: Use nf_hook_state.net

Instead of saying "net = dev_net(state->in?state->in:state->out)"
just say "state->net".  That information is now available, and using
it is much less confusing and much less error prone.
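
The difference can be sketched with mock types (these are simplified
stand-ins, not the kernel's structures, and hook_net_old/hook_net_new
are hypothetical names):

```c
#include <stddef.h>

/* Mock stand-ins for kernel types; not the real definitions. */
struct net { int id; };
struct net_device { struct net *nd_net; };

struct nf_hook_state {
	struct net_device *in;
	struct net_device *out;
	struct net *net;	/* filled in by the caller of the hook */
};

static struct net *dev_net(const struct net_device *dev)
{
	return dev->nd_net;
}

/* Old style: guess the namespace from whichever device is present. */
struct net *hook_net_old(const struct nf_hook_state *state)
{
	return dev_net(state->in ? state->in : state->out);
}

/* New style: use the namespace the caller passed in. */
struct net *hook_net_new(const struct nf_hook_state *state)
{
	return state->net;
}
```

The old form also dereferences a NULL pointer if neither device is set;
the new form has no such dependency on the devices.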

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonetfilter: Pass struct net into the netfilter hooks
Eric W. Biederman [Wed, 16 Sep 2015 01:04:16 +0000 (20:04 -0500)]
netfilter: Pass struct net into the netfilter hooks

Pass a network namespace parameter into the netfilter hooks.  At the
call site of the netfilter hooks the path a packet is taking through
the network stack is well known, which allows the network namespace to
be determined easily and reliably.

This allows the replacement of magic code like
"dev_net(state->in?:state->out)" that appears at the start of most
netfilter hooks with "state->net".

In almost all cases the network namespace passed in is derived
from the first network device passed in, guaranteeing those
paths will not see any changes in practice.

The exceptions are:
xfrm/xfrm_output.c:xfrm_output_resume()         xs_net(skb_dst(skb)->xfrm)
ipvs/ip_vs_xmit.c:ip_vs_nat_send_or_cont()      ip_vs_conn_net(cp)
ipvs/ip_vs_xmit.c:ip_vs_send_or_cont()          ip_vs_conn_net(cp)
ipv4/raw.c:raw_send_hdrinc()                    sock_net(sk)
ipv6/ip6_output.c:ip6_xmit()                    sock_net(sk)
ipv6/ndisc.c:ndisc_send_skb()                   dev_net(skb->dev) not dev_net(dst->dev)
ipv6/raw.c:raw6_send_hdrinc()                   sock_net(sk)
br_netfilter_hooks.c:br_nf_pre_routing_finish() dev_net(skb->dev) before skb->dev is set to nf_bridge->physindev

In all cases these exceptions seem to be a better expression for the
network namespace the packet is being processed in than the historic
"dev_net(in?in:out)".  I am documenting them in case something odd
pops up and someone starts trying to track down what happened.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobridge: Add br_netif_receive_skb remove netif_receive_skb_sk
Eric W. Biederman [Wed, 16 Sep 2015 01:04:15 +0000 (20:04 -0500)]
bridge: Add br_netif_receive_skb remove netif_receive_skb_sk

netif_receive_skb_sk is only called once in the bridge code; replace
it with a bridge-specific function that calls netif_receive_skb.
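
A minimal sketch of such a wrapper, with mock types and a mock
netif_receive_skb; the (sk, skb) parameter list is an assumption about
the callback shape at this point in the series.

```c
/* Mock stand-ins for kernel types; not the real definitions. */
struct sock { int unused; };
struct sk_buff { int delivered; };

/* Mock of the generic receive entry point, which takes only the skb. */
int netif_receive_skb(struct sk_buff *skb)
{
	skb->delivered = 1;
	return 0;
}

/* Bridge-local wrapper: matches the callback shape, ignores sk. */
int br_netif_receive_skb(struct sock *sk, struct sk_buff *skb)
{
	(void)sk;
	return netif_receive_skb(skb);
}
```

Keeping the wrapper local to the bridge lets the generic _sk variant be
removed.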

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>