firefly-linux-kernel-4.4.55.git
9 years agofakelb: add support for async xmit handling
Alexander Aring [Sun, 17 May 2015 19:45:09 +0000 (21:45 +0200)]
fakelb: add support for async xmit handling

This patch will add async xmit support for the fakelb driver.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agofakelb: remove fakelb_hw_deliver
Alexander Aring [Sun, 17 May 2015 19:45:08 +0000 (21:45 +0200)]
fakelb: remove fakelb_hw_deliver

This patch cleanups the fakelb_hw_deliver function by removing them.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agofakelb: add virtual phy reset defaults
Alexander Aring [Sun, 17 May 2015 19:45:07 +0000 (21:45 +0200)]
fakelb: add virtual phy reset defaults

This patch adds reset defaults for the fakelb phy. I used the channel 13
and page 0 which is the same default like at86rf233.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agofakelb: use own channel and page attributes
Alexander Aring [Sun, 17 May 2015 19:45:06 +0000 (21:45 +0200)]
fakelb: use own channel and page attributes

This patch adds an own phy attribute for page and channel into
fakelb_phy. The current way is to use the internal mac802154 stored phy
pib values which can occur in locking issues while using it inside the
driver layer.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agofakelb: introduce fakelb ifup phys list
Alexander Aring [Sun, 17 May 2015 19:45:05 +0000 (21:45 +0200)]
fakelb: introduce fakelb ifup phys list

This patch introduce a fakelb ifup phys list, which stores all
registered phys which are in an operated mode. This will reduce the
iterations of non-operated phys while transmit frames. There exists two
locks now, one rwlock for the operated interfaces and the spinlock for
protecting the list for all registered virtual phys.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agofakelb: move lock out of iteration
Alexander Aring [Sun, 17 May 2015 19:45:04 +0000 (21:45 +0200)]
fakelb: move lock out of iteration

The list need to be protected while iteration which is need when other
list iterates at the same time over this list.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agofakelb: declare fakelb list static
Alexander Aring [Sun, 17 May 2015 19:45:03 +0000 (21:45 +0200)]
fakelb: declare fakelb list static

This patch moves the fakelb list of all registered phy's in a static
declaration in the fakelb driver.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agofakelb: declare rwlock static
Alexander Aring [Sun, 17 May 2015 19:45:02 +0000 (21:45 +0200)]
fakelb: declare rwlock static

This patch moves the rwlock declarition into a static variable.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agofakelb: don't deliver when one phy
Alexander Aring [Sun, 17 May 2015 19:45:01 +0000 (21:45 +0200)]
fakelb: don't deliver when one phy

A real phy don't transmit the transmitted frame back to the phy. This
behaviour will confuse the 6LoWPAN and the upper IPv6 neighbour
discovery cache. We need now at least two virtual phy's to creating a
virtual wpan network.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agofakelb: rename fakelb_dev_priv to fakelb_phy
Alexander Aring [Sun, 17 May 2015 19:45:00 +0000 (21:45 +0200)]
fakelb: rename fakelb_dev_priv to fakelb_phy

This patch renames fakelb_dev_priv to fakelb_phy. We don't faking
devices here, we fake wpan phys. This avoids also confusing with the
variable priv, which is used several times in this driver to represent
this structure.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agofakelb: use list_for_each_entry_safe
Alexander Aring [Sun, 17 May 2015 19:44:59 +0000 (21:44 +0200)]
fakelb: use list_for_each_entry_safe

Iterate and removing items from a list, we should use
list_for_each_entry_safe instead list_for_each_entry to avoid accidents
by removing.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agofakelb: creating two virtual phys per default
Alexander Aring [Sun, 17 May 2015 19:44:58 +0000 (21:44 +0200)]
fakelb: creating two virtual phys per default

This patch change the default virtual wpan phys of fakelb driver from
one to two. To have one virtual phy makes no sense, because it need at
least two virtual phy's to speak to each other.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoieee802154: add support for atusb transceiver
Alexander Aring [Sun, 17 May 2015 19:44:57 +0000 (21:44 +0200)]
ieee802154: add support for atusb transceiver

This patch adds support for the atusb transceiver.

The current driver supports basic functionality only. Possible further
tasks would be to sync functionality with the at86rf230 driver, because
the atusb use internally an at86rf231 transceiver. Some of these
features need a firmware update like AACK and ARET handling.

I did small changes to this driver to work with xmit_async callback and
setting of a random extended perm address.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Cc: Werner Almesberger <werner@almesberger.net>
Cc: Stefan Schmidt <s.schmidt@samsung.com>
Cc: Richard Sharpe <realrichardsharpe@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agomac802154: tx: allow xmit complete from hard irq
Alexander Aring [Sun, 17 May 2015 19:44:56 +0000 (21:44 +0200)]
mac802154: tx: allow xmit complete from hard irq

Replace consume_skb with dev_consume_skb_any in ieee802154_xmit_complete
which can be called in hard irq and other contexts.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoat86rf230: fix callback for aret handling
Alexander Aring [Sun, 17 May 2015 19:44:55 +0000 (21:44 +0200)]
at86rf230: fix callback for aret handling

This patch fix the complete callback for going into aret on. Currently
we starting after a state change into aret on again a state change into
aret on which is not necessary.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agomac802154: fakelb: Fix potential NULL pointer dereference.
Martin Townsend [Sun, 17 May 2015 19:44:54 +0000 (21:44 +0200)]
mac802154: fakelb: Fix potential NULL pointer dereference.

fakelb_hw_deliver creates a copy of the skb's header which can
potentially return NULL so we now check for this before actually
delivering to the 802.15.4 MAC layer.

Signed-off-by: Martin Townsend <martin.townsend@xsilon.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agonl802154: add support for dump phy capabilities
Alexander Aring [Sun, 17 May 2015 19:44:53 +0000 (21:44 +0200)]
nl802154: add support for dump phy capabilities

This patch add support to nl802154 to dump all phy capabilities which is
inside the wpan_phy_supported struct. Also we introduce a new method to
dumping supported channels. The new method will offer a easier interface
and has lesser netlink traffic.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoat86rf230: add reset states of tx power level
Alexander Aring [Sun, 17 May 2015 19:44:52 +0000 (21:44 +0200)]
at86rf230: add reset states of tx power level

This patch adds the reset states for tx power levels after running reset
procedure.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoat86rf230: add cca ed level reset value
Alexander Aring [Sun, 17 May 2015 19:44:51 +0000 (21:44 +0200)]
at86rf230: add cca ed level reset value

This patch adds reset values for cca ed level values after running reset
procedure.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoat86rf230: rework tx cca energy detection level
Alexander Aring [Sun, 17 May 2015 19:44:50 +0000 (21:44 +0200)]
at86rf230: rework tx cca energy detection level

This patch reworks the cca energy detection level handling. This
contains a calculation which works on all transceiver types and
add support for dump cca energy detection levels.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoat86rf230: rework tx power support
Alexander Aring [Sun, 17 May 2015 19:44:49 +0000 (21:44 +0200)]
at86rf230: rework tx power support

This patch adds support for get transmit power levels and reworks the
set of transmit power levels which was broken before.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoat86rf230: set cca_modes supported flags
Alexander Aring [Sun, 17 May 2015 19:44:48 +0000 (21:44 +0200)]
at86rf230: set cca_modes supported flags

This patch sets the at86rf230 supported cca modes. In case of at86rf212
it also can support listen before transmit.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoieee802154: add iftypes capability
Alexander Aring [Sun, 17 May 2015 19:44:47 +0000 (21:44 +0200)]
ieee802154: add iftypes capability

This patch adds capability flags for supported interface types.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agocfg802154: introduce wpan phy flags
Alexander Aring [Sun, 17 May 2015 19:44:46 +0000 (21:44 +0200)]
cfg802154: introduce wpan phy flags

This patch introduce a flag property for the wpan phy structure.
The current flag settings in ieee802154_hw are accessable in mac802154
layer only which is okay for flags which indicates MAC handling which
are done by phy. For real PHY layer settings like cca mode, transmit
power, cca energy detection level.

The difference between these flags are that the MAC handling flags are
only handled in mac802154/HardMac layer e.g. on an interface up. The phy
settings are direct netlink calls from nl802154 into the driver layer
and the nl802154 need to have a chance to check if the driver supports
this handling before sending to the next layer.

We also check now on PHY flags while dumping and setting pib attributes.
In comparing with MIB attributes the 802.15.4 gives us an default value
which we assume when a transceiver implement less functionality. In case
of MIB settings the nl802154 layer doesn't need to check on the
ieee802154_hw flags then.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agomac802154: remove check if operation is supported
Alexander Aring [Sun, 17 May 2015 19:44:45 +0000 (21:44 +0200)]
mac802154: remove check if operation is supported

This patch removes the check if operation is supported by driver layer.
This is done now by capabilities flags, if these are valid then the
driver should support the operation, otherwise a WARN_ON occurs.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agomac802154: check for really changes
Alexander Aring [Sun, 17 May 2015 19:44:44 +0000 (21:44 +0200)]
mac802154: check for really changes

This patch adds check if the value is really changed inside pib/mib.
If a transceiver do support only one value for e.g. max_be then this
will also handle that the driver layer doesn't need to care about
handling to set one value only.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoieee802154: add several phy supported handling
Alexander Aring [Sun, 17 May 2015 19:44:43 +0000 (21:44 +0200)]
ieee802154: add several phy supported handling

This patch adds support for phy supported handling for all other already
existing handling 802.15.4 functionality. We assume now a fully 802.15.4
complaint transceiver at phy allocation. If a transceiver can support
802.15.4 default values only, then the values should be overwirtten by
values the transceiver supports. If the transceiver doesn't set the
according hardware flags, we assume the 802.15.4 defaults now which
cannot be changed.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Suggested-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoieee802154: introduce wpan_phy_supported
Alexander Aring [Sun, 17 May 2015 19:44:42 +0000 (21:44 +0200)]
ieee802154: introduce wpan_phy_supported

This patch introduce the wpan_phy_supported struct for wpan_phy. There
is currently no way to check if a transceiver can handle IEEE 802.15.4
complaint values. With this struct we can check before if the
transceiver supports these values before sending to driver layer.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Suggested-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Acked-by: Varka Bhadram <varkabhadram@gmail.com>
Cc: Alan Ott <alan@signal11.us>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoieee802154: change cca ed level to mbm
Alexander Aring [Sun, 17 May 2015 19:44:41 +0000 (21:44 +0200)]
ieee802154: change cca ed level to mbm

This patch change the handling of cca energy detection level from dbm to
mbm. This prepares to handle floating point cca energy detection levels
values. The old netlink 802.15.4 will convert the dbm value to mbm for
handling backward compatibility.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoieee802154: change transmit power to mbm
Alexander Aring [Sun, 17 May 2015 19:44:40 +0000 (21:44 +0200)]
ieee802154: change transmit power to mbm

This patch change the handling of transmit power level from dbm to mbm.
This prepares to handle floating point transmit power levels values. The
old netlink 802.15.4 will convert the dbm value to mbm for handling
backward compatibility.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoieee802154: change transmit power to s32
Alexander Aring [Sun, 17 May 2015 19:44:39 +0000 (21:44 +0200)]
ieee802154: change transmit power to s32

This patch change the transmit power from s8 to s32. This prepares to store a
mbm value instead dbm inside the transmit power variable. The old
interface keep the a s8 dbm value, which should be backward compatibility
when assign s8 to s32.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoieee802154: move validation check out of softmac
Alexander Aring [Sun, 17 May 2015 19:44:38 +0000 (21:44 +0200)]
ieee802154: move validation check out of softmac

This patch moves the value validation out of softmac layer. We need
to be sure now that this value is accepted by the transceiver/mac802154 or
"possible" hardmac drivers before calling rdev-ops.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agonl802154: cleanup invalid argument handling
Alexander Aring [Sun, 17 May 2015 19:44:37 +0000 (21:44 +0200)]
nl802154: cleanup invalid argument handling

This patch cleanups the -EINVAL cases by combining them in one
condition.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoBluetooth: btbcm: Fix calls to __hci_cmd_sync()
Frederic Danis [Fri, 15 May 2015 09:58:42 +0000 (11:58 +0200)]
Bluetooth: btbcm: Fix calls to __hci_cmd_sync()

Remove test of command reply status as it is already performed by
__hci_cmd_sync().

__hci_cmd_sync_ev() function already returns an error if it got a
non-zero status either through a Command Complete or a Command
Status event.

For both of these events the status is collected up in the event
handlers called by hci_event_packet() and then passed as the second
parameter to req_complete_skb(). The req_complete_skb() callback in
turn is hci_req_sync_complete() for __hci_cmd_sync_ev() which stores
the status in hdev->req_result. The hdev->req_result is then further
converted through bt_to_errno() back in __hci_cmd_sync_ev().

Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoBluetooth: btintel: Fix calls to __hci_cmd_sync()
Frederic Danis [Fri, 15 May 2015 09:58:41 +0000 (11:58 +0200)]
Bluetooth: btintel: Fix calls to __hci_cmd_sync()

Remove test of command reply status as it is already performed by
__hci_cmd_sync().

__hci_cmd_sync_ev() function already returns an error if it got a
non-zero status either through a Command Complete or a Command
Status event.

For both of these events the status is collected up in the event
handlers called by hci_event_packet() and then passed as the second
parameter to req_complete_skb(). The req_complete_skb() callback in
turn is hci_req_sync_complete() for __hci_cmd_sync_ev() which stores
the status in hdev->req_result. The hdev->req_result is then further
converted through bt_to_errno() back in __hci_cmd_sync_ev().

Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoBluetooth: btusb: Fix calls to __hci_cmd_sync()
Frederic Danis [Fri, 15 May 2015 09:58:40 +0000 (11:58 +0200)]
Bluetooth: btusb: Fix calls to __hci_cmd_sync()

Remove test of command reply status as it is already performed by
__hci_cmd_sync().

__hci_cmd_sync_ev() function already returns an error if it got a
non-zero status either through a Command Complete or a Command
Status event.

For both of these events the status is collected up in the event
handlers called by hci_event_packet() and then passed as the second
parameter to req_complete_skb(). The req_complete_skb() callback in
turn is hci_req_sync_complete() for __hci_cmd_sync_ev() which stores
the status in hdev->req_result. The hdev->req_result is then further
converted through bt_to_errno() back in __hci_cmd_sync_ev().

Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoBluetooth: Fix calls to __hci_cmd_sync()
Frederic Danis [Fri, 15 May 2015 09:58:39 +0000 (11:58 +0200)]
Bluetooth: Fix calls to __hci_cmd_sync()

Remove test of command reply status as it is already performed by
__hci_cmd_sync().

__hci_cmd_sync_ev() function already returns an error if it got a
non-zero status either through a Command Complete or a Command
Status event.

For both of these events the status is collected up in the event
handlers called by hci_event_packet() and then passed as the second
parameter to req_complete_skb(). The req_complete_skb() callback in
turn is hci_req_sync_complete() for __hci_cmd_sync_ev() which stores
the status in hdev->req_result. The hdev->req_result is then further
converted through bt_to_errno() back in __hci_cmd_sync_ev().

Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoBluetooth: btrtl: Create separate module for Realtek BT driver
Carlo Caione [Thu, 14 May 2015 08:49:09 +0000 (10:49 +0200)]
Bluetooth: btrtl: Create separate module for Realtek BT driver

As already done for btintel and btbcm export setup as separate function
in a vendor-specific module to hold all the Realtek specific commands.

Signed-off-by: Carlo Caione <carlo@endlessm.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoBluetooth: btmrvl: fix compilation warning
Xinming Hu [Tue, 21 Apr 2015 13:59:56 +0000 (06:59 -0700)]
Bluetooth: btmrvl: fix compilation warning

This patch fixes a compile warnning "dump_num maybe used uninitialized in
this function".

Signed-off-by: Xinming Hu <huxm@marvell.com>
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agoBluetooth: btwilink: remove DEBUG define
Leo Yan [Tue, 5 May 2015 07:09:17 +0000 (15:09 +0800)]
Bluetooth: btwilink: remove DEBUG define

Remove the DEBUG define as the debug code; so can remove mass debug info
from log buffer when using dmesg.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
9 years agonet: kill useless net_*_ingress_queue() definitions when NET_CLS_ACT is unset
Pablo Neira [Tue, 12 May 2015 18:28:07 +0000 (20:28 +0200)]
net: kill useless net_*_ingress_queue() definitions when NET_CLS_ACT is unset

This fixes 4577139b2dabf589 ("net: use jump label patching for ingress qdisc in
__netif_receive_skb_core").

The only client of this is sch_ingress and it depends on NET_CLS_ACT. So
there is no way these definition can be of any help.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'packet_rollover'
David S. Miller [Wed, 13 May 2015 19:43:01 +0000 (15:43 -0400)]
Merge branch 'packet_rollover'

Willem de Bruijn says:

====================
refine packet socket rollover:

1. mitigate a case of lock contention
2. avoid exporting resource exhaustion to other sockets,
   by migrating only to a victim socket that has ample room
3. avoid reordering of most flows on the socket,
   by migrating first the flow responsible for load imbalance
4. help processes detect load imbalance,
   by exporting rollover counters

Context: rollover implements flow migration in packet socket fanout
groups in case of extreme load imbalance. It is a specific
implementation of migration that minimizes reordering by selecting
the same victim socket when possible (and by selecting subsequent
victims in a round robin fashion, from which its name derives).

Changes:
  v2 -> v3:
    - statistics: replace unsigned long with __aligned_u64
  v1 -> v2:
    - huge flow detection: run lockless
    - huge flow detection: replace stored index with random
    - contention avoidance: test in packet_poll while lock held
    - contention avoidance: clear pressure sooner

          packet_poll and packet_recvmsg would clear only if the sock
          is empty to avoid taking the necessary lock. But,
          * packet_poll already holds this lock, so a lockless variant
            __packet_rcv_has_room is cheap.
          * packet_recvmsg is usually called only for non-ring sockets,
            which also runs lockless.

    - preparation: drop "single return" patch

          packet_rcv_has_room is now a locked wrapper around
          __packet_rcv_has_room, achieving the same (single footer).

The benchmark mentioned in the patches is at
https://github.com/wdebruij/kerneltools/blob/master/tests/bench_rollover.c
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agopacket: rollover statistics
Willem de Bruijn [Tue, 12 May 2015 15:56:50 +0000 (11:56 -0400)]
packet: rollover statistics

Rollover indicates exceptional conditions. Export a counter to inform
socket owners of this state.

If no socket with sufficient room is found, rollover fails. Also count
these events.

Finally, also count when flows are rolled over early thanks to huge
flow detection, to validate its correctness.

Tested:
  Read counters in bench_rollover on all other tests in the patchset

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agopacket: rollover huge flows before small flows
Willem de Bruijn [Tue, 12 May 2015 15:56:49 +0000 (11:56 -0400)]
packet: rollover huge flows before small flows

Migrate flows from a socket to another socket in the fanout group not
only when the socket is full. Start migrating huge flows early, to
divert possible 4-tuple attacks without affecting normal traffic.

Introduce fanout_flow_is_huge(). This detects huge flows, which are
defined as taking up more than half the load. It does so cheaply, by
storing the rxhashes of the N most recent packets. If over half of
these are the same rxhash as the current packet, then drop it. This
only protects against 4-tuple attacks. N is chosen to fit all data in
a single cache line.

Tested:
  Ran bench_rollover for 10 sec with 1.5 Mpps of single flow input.

    lpbb5:/export/hda3/willemb# ./bench_rollover -l 1000 -r -s
    cpu         rx       rx.k     drop.k   rollover     r.huge   r.failed
      0         14         14          0          0          0          0
      1         20         20          0          0          0          0
      2         16         16          0          0          0          0
      3    6168824    6168824          0    4867721    4867721          0
      4    4867741    4867741          0          0          0          0
      5         12         12          0          0          0          0
      6         15         15          0          0          0          0
      7         17         17          0          0          0          0

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agopacket: rollover lock contention avoidance
Willem de Bruijn [Tue, 12 May 2015 15:56:48 +0000 (11:56 -0400)]
packet: rollover lock contention avoidance

Rollover has to call packet_rcv_has_room on sockets in the fanout
group to find a socket to migrate to. This operation is expensive
especially if the packet sockets use rings, when a lock has to be
acquired.

Avoid pounding on the lock by all sockets by temporarily marking a
socket as "under memory pressure" when such pressure is detected.
While set, only the socket owner may call packet_rcv_has_room on the
socket. Once it detects normal conditions, it clears the flag. The
socket is not used as a victim by any other socket in the meantime.

Under reasonably balanced load, each socket writer frequently calls
packet_rcv_has_room and clears its own pressure field. As a backup
for when the socket is rarely written to, also clear the flag on
reading (packet_recvmsg, packet_poll) if this can be done cheaply
(i.e., without calling packet_rcv_has_room). This is only for
edge cases.

Tested:
  Ran bench_rollover: a process with 8 sockets in a single fanout
  group, each pinned to a single cpu that receives one nic recv
  interrupt. RPS and RFS are disabled. The benchmark uses packet
  rx_ring, which has to take a lock when determining whether a
  socket has room.

  Sent 3.5 Mpps of UDP traffic with sufficient entropy to spread
  uniformly across the packet sockets (and inserted an iptables
  rule to drop in PREROUTING to avoid protocol stack processing).

  Without this patch, all sockets try to migrate traffic to
  neighbors, causing lock contention when searching for a non-
  empty neighbor. The lock is the top 9 entries.

    perf record -a -g sleep 5

    -  17.82%   bench_rollover  [kernel.kallsyms]    [k] _raw_spin_lock
       - _raw_spin_lock
          - 99.00% spin_lock
      + 81.77% packet_rcv_has_room.isra.41
      + 18.23% tpacket_rcv
          + 0.84% packet_rcv_has_room.isra.41
    +   5.20%      ksoftirqd/6  [kernel.kallsyms]    [k] _raw_spin_lock
    +   5.15%      ksoftirqd/1  [kernel.kallsyms]    [k] _raw_spin_lock
    +   5.14%      ksoftirqd/2  [kernel.kallsyms]    [k] _raw_spin_lock
    +   5.12%      ksoftirqd/7  [kernel.kallsyms]    [k] _raw_spin_lock
    +   5.12%      ksoftirqd/5  [kernel.kallsyms]    [k] _raw_spin_lock
    +   5.10%      ksoftirqd/4  [kernel.kallsyms]    [k] _raw_spin_lock
    +   4.66%      ksoftirqd/0  [kernel.kallsyms]    [k] _raw_spin_lock
    +   4.45%      ksoftirqd/3  [kernel.kallsyms]    [k] _raw_spin_lock
    +   1.55%   bench_rollover  [kernel.kallsyms]    [k] packet_rcv_has_room.isra.41

  On net-next with this patch, this lock contention is no longer a
  top entry. Most time is spent in the actual read function. Next up
  are other locks:

    +  15.52%  bench_rollover  bench_rollover     [.] reader
    +   4.68%         swapper  [kernel.kallsyms]  [k] memcpy_erms
    +   2.77%         swapper  [kernel.kallsyms]  [k] packet_lookup_frame.isra.51
    +   2.56%     ksoftirqd/1  [kernel.kallsyms]  [k] memcpy_erms
    +   2.16%         swapper  [kernel.kallsyms]  [k] tpacket_rcv
    +   1.93%         swapper  [kernel.kallsyms]  [k] mlx4_en_process_rx_cq

  Looking closer at the remaining _raw_spin_lock, the cost of probing
  in rollover is now comparable to the cost of taking the lock later
  in tpacket_rcv.

    -   1.51%         swapper  [kernel.kallsyms]  [k] _raw_spin_lock
       - _raw_spin_lock
          + 33.41% packet_rcv_has_room
          + 28.15% tpacket_rcv
          + 19.54% enqueue_to_backlog
          + 6.45% __free_pages_ok
          + 2.78% packet_rcv_fanout
          + 2.13% fanout_demux_rollover
          + 2.01% netif_receive_skb_internal

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agopacket: rollover only to socket with headroom
Willem de Bruijn [Tue, 12 May 2015 15:56:47 +0000 (11:56 -0400)]
packet: rollover only to socket with headroom

Only migrate flows to sockets that have sufficient headroom, where
sufficient is defined as having at least 25% empty space.

The kernel has three different buffer types: a regular socket, a ring
with frames (TPACKET_V[12]) or a ring with blocks (TPACKET_V3). The
latter two do not expose a read pointer to the kernel, so headroom is
not computed easily. All three needs a different implementation to
estimate free space.

Tested:
  Ran bench_rollover for 10 sec with 1.5 Mpps of single flow input.

  bench_rollover has as many sockets as there are NIC receive queues
  in the system. Each socket is owned by a process that is pinned to
  one of the receive cpus. RFS is disabled. RPS is enabled with an
  identity mapping (cpu x -> cpu x), to count drops with softnettop.

    lpbb5:/export/hda3/willemb# ./bench_rollover -r -l 1000 -s
    Press [Enter] to exit

    cpu         rx       rx.k     drop.k   rollover     r.huge   r.failed
      0         16         16          0          0          0          0
      1         21         21          0          0          0          0
      2    5227502    5227502          0          0          0          0
      3         18         18          0          0          0          0
      4    6083289    6083289          0    5227496          0          0
      5         22         22          0          0          0          0
      6         21         21          0          0          0          0
      7          9          9          0          0          0          0

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agopacket: rollover prepare: per-socket state
Willem de Bruijn [Tue, 12 May 2015 15:56:46 +0000 (11:56 -0400)]
packet: rollover prepare: per-socket state

Replace rollover state per fanout group with state per socket. Future
patches will add fields to the new structure.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agopacket: rollover prepare: move code out of callsites
Willem de Bruijn [Tue, 12 May 2015 15:56:45 +0000 (11:56 -0400)]
packet: rollover prepare: move code out of callsites

packet_rcv_fanout calls fanout_demux_rollover twice. Move all rollover
logic into the callee to simplify these callsites, especially with
upcoming changes.

The main differences between the two callsites is that the FLAG
variant tests whether the socket previously selected by another
mode (RR, RND, HASH, ..) has room before migrating flows, whereas the
rollover mode has no original socket to test.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoipv4: __ip_local_out_sk() is static
Eric Dumazet [Tue, 12 May 2015 13:31:48 +0000 (06:31 -0700)]
ipv4: __ip_local_out_sk() is static

__ip_local_out_sk() is only used from net/ipv4/ip_output.c

net/ipv4/ip_output.c:94:5: warning: symbol '__ip_local_out_sk' was not
declared. Should it be static?

Fixes: 7026b1ddb6b8 ("netfilter: Pass socket pointer down through okfn().")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotcp/dccp: tw_timer_handler() is static
Eric Dumazet [Tue, 12 May 2015 13:22:56 +0000 (06:22 -0700)]
tcp/dccp: tw_timer_handler() is static

tw_timer_handler() is only used from net/ipv4/inet_timewait_sock.c

Fixes: 789f558cfb36 ("tcp/dccp: get rid of central timewait timer")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'cls_flower'
David S. Miller [Wed, 13 May 2015 19:19:48 +0000 (15:19 -0400)]
Merge branch 'cls_flower'

Jiri Pirko says:

====================
introduce programable flow dissector and cls_flower

Per Davem's request, I prepared this patchset which introduces programmable
flow dissector. For current users of flow_keys, there is a wrapper
skb_flow_dissect_flow_keys which maintains the previous behaviour.
For purposes of cls_flower, couple of new dissection keys were introduced.

Note that this dissector can be also eventually used by openvswitch code.

Also, as a next step, I plan to get rid of *skb_flow_get_ports(export)
and *__skb_get_poff as their functionality can be now implemented by
skb_flow_dissect as well.

v2->v3:
- remove TCA_FLOWER_POLICE attr suggested by Jamal

v1->v2:
- move __skb_tx_hash rather to dev.c as suggested by Alex
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotc: introduce Flower classifier
Jiri Pirko [Tue, 12 May 2015 12:56:21 +0000 (14:56 +0200)]
tc: introduce Flower classifier

This patch introduces a flow-based filter. So far, the very essential
packet fields are supported.

This patch is only the first step. There is a lot of potential performance
improvements possible to implement. Also a lot of features are missing
now. They will be addressed in follow-up patches.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoflow_dissector: change port array into src, dst tuple
Jiri Pirko [Tue, 12 May 2015 12:56:20 +0000 (14:56 +0200)]
flow_dissector: change port array into src, dst tuple

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoflow_dissector: introduce support for Ethernet addresses
Jiri Pirko [Tue, 12 May 2015 12:56:19 +0000 (14:56 +0200)]
flow_dissector: introduce support for Ethernet addresses

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoflow_dissector: introduce support for ipv6 addressses
Jiri Pirko [Tue, 12 May 2015 12:56:18 +0000 (14:56 +0200)]
flow_dissector: introduce support for ipv6 addressses

So far, only hashes made out of ipv6 addresses could be dissected. This
patch introduces support for dissection of full ipv6 addresses.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoflow_dissector: add missing header includes
Jiri Pirko [Tue, 12 May 2015 12:56:17 +0000 (14:56 +0200)]
flow_dissector: add missing header includes

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoflow_dissect: use programable dissector in skb_flow_dissect and friends
Jiri Pirko [Tue, 12 May 2015 12:56:16 +0000 (14:56 +0200)]
flow_dissect: use programable dissector in skb_flow_dissect and friends

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoflow_dissector: introduce programable flow_dissector
Jiri Pirko [Tue, 12 May 2015 12:56:15 +0000 (14:56 +0200)]
flow_dissector: introduce programable flow_dissector

Introduce dissector infrastructure which allows user to specify which
parts of skb he wants to dissect.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoflow_dissector: fix doc for skb_get_poff
Jiri Pirko [Tue, 12 May 2015 12:56:14 +0000 (14:56 +0200)]
flow_dissector: fix doc for skb_get_poff

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: move netdev_pick_tx and dependencies to net/core/dev.c
Jiri Pirko [Tue, 12 May 2015 12:56:13 +0000 (14:56 +0200)]
net: move netdev_pick_tx and dependencies to net/core/dev.c

next to its user. No relation to flow_dissector so it makes no sense to
have it in flow_dissector.c

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: move __skb_tx_hash to dev.c
Jiri Pirko [Tue, 12 May 2015 12:56:12 +0000 (14:56 +0200)]
net: move __skb_tx_hash to dev.c

__skb_tx_hash function has no relation to flow_dissect so just move it
to dev.c

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: move __skb_get_hash function declaration to flow_dissector.h
Jiri Pirko [Tue, 12 May 2015 12:56:11 +0000 (14:56 +0200)]
net: move __skb_get_hash function declaration to flow_dissector.h

Since the definition of the function is in flow_dissector.c, it makes
sense to have the declaration in flow_dissector.h

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoflow_dissector: fix doc for __skb_get_hash and remove couple of empty lines
Jiri Pirko [Tue, 12 May 2015 12:56:10 +0000 (14:56 +0200)]
flow_dissector: fix doc for __skb_get_hash and remove couple of empty lines

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: move *skb_get_poff declarations into correct header
Jiri Pirko [Tue, 12 May 2015 12:56:09 +0000 (14:56 +0200)]
net: move *skb_get_poff declarations into correct header

Since these functions are defined in flow_dissector.c, move header
declarations from skbuff.h into flow_dissector.h

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoflow_dissector: remove unused function flow_get_hlen declaration
Jiri Pirko [Tue, 12 May 2015 12:56:08 +0000 (14:56 +0200)]
flow_dissector: remove unused function flow_get_hlen declaration

commit 56193d1bce ("net: Add function for parsing the header length out
of linear ethernet frames") added this function declaration but it is
defined nowhere.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: change name of flow_dissector header to match the .c file name
Jiri Pirko [Tue, 12 May 2015 12:56:07 +0000 (14:56 +0200)]
net: change name of flow_dissector header to match the .c file name

add couple of empty lines on the way.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'sfc-next'
David S. Miller [Wed, 13 May 2015 19:11:52 +0000 (15:11 -0400)]
Merge branch 'sfc-next'

Edward Cree says:

====================
sfc: Bowdlerise PTP MCDI errors

When the NIC doesn't support PTP, probe-time MCDI commands fail in
predictable ways.  Instead of logging cryptic MCDI errors, just log that
PTP isn't supported.

v2: Hopefully stop Thunderbird mangling the patches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agosfc: suppress some MCDI error messages in PTP
Edward Cree [Tue, 12 May 2015 12:05:09 +0000 (13:05 +0100)]
sfc: suppress some MCDI error messages in PTP

Also, remove a needless netif_err() from efx_ptp_update_stats() - if the
 MCDI fails it'll print its own error message, we don't need another that
 adds no information.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agosfc: nicer log message on PTP probe fail
Edward Cree [Tue, 12 May 2015 12:04:52 +0000 (13:04 +0100)]
sfc: nicer log message on PTP probe fail

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: sched: use counter to break reclassify loops
Florian Westphal [Mon, 11 May 2015 17:50:41 +0000 (19:50 +0200)]
net: sched: use counter to break reclassify loops

Seems all we want here is to avoid endless 'goto reclassify' loop.
tc_classify_compat even resets this counter when something other
than TC_ACT_RECLASSIFY is returned, so this skb-counter doesn't
break hypothetical loops induced by something other than perpetual
TC_ACT_RECLASSIFY return values.

skb_act_clone is now identical to skb_clone, so just use that.

Tested with following (bogus) filter:
tc filter add dev eth0 parent ffff: \
 protocol ip u32 match u32 0 0 police rate 10Kbit burst \
 64000 mtu 1500 action reclassify

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Wed, 13 May 2015 18:31:43 +0000 (14:31 -0400)]
Merge git://git./linux/kernel/git/davem/net

Four minor merge conflicts:

1) qca_spi.c renamed the local variable used for the SPI device
   from spi_device to spi, meanwhile the spi_set_drvdata() call
   got moved further up in the probe function.

2) Two changes were both adding new members to codel params
   structure, and thus we had overlapping changes to the
   initializer function.

3) 'net' was making a fix to sk_release_kernel() which is
   completely removed in 'net-next'.

4) In net_namespace.c, the rtnl_net_fill() call for GET operations
   had the command value fixed, meanwhile 'net-next' adjusted the
   argument signature a bit.

This also matches example merge resolutions posted by Stephen
Rothwell over the past two days.

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agofix missing copy_from_user in macvtap
Justin Cormack [Wed, 13 May 2015 18:19:02 +0000 (19:19 +0100)]
fix missing copy_from_user in macvtap

Fix missing copy_from_user in macvtap SIOCSIFHWADDR ioctl.

Signed-off-by: Justin Cormack <justin@netbsd.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoswitchdev: don't use anonymous union on switchdev attr/obj structs
Scott Feldman [Wed, 13 May 2015 18:16:50 +0000 (11:16 -0700)]
switchdev: don't use anonymous union on switchdev attr/obj structs

Older gcc versions (e.g.  gcc version 4.4.6) don't like anonymous unions
which was causing build issues on the newly added switchdev attr/obj
structs.  Fix this by using named union on structs.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'switchdev-cleanups'
David S. Miller [Wed, 13 May 2015 16:26:28 +0000 (12:26 -0400)]
Merge branch 'switchdev-cleanups'

Scott Feldman says:

====================
switchdev: more (minor) cleanups

Fix some sparse warnings and include some documentation review comments that
didn't get picked up in the switchdev Spring Cleanup series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoswitchdev: apply review comments on documentation
Scott Feldman [Wed, 13 May 2015 06:03:54 +0000 (23:03 -0700)]
switchdev: apply review comments on documentation

There were a few review comments on the switchdev.txt documentation that
didn't get included with the Spring Cleanup series, so include them now.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoswitchdev: align comment with other comments in block
Scott Feldman [Wed, 13 May 2015 06:03:53 +0000 (23:03 -0700)]
switchdev: align comment with other comments in block

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoswitchdev: sparse warning: pass ipv4 fib dst as network-byte order
Scott Feldman [Wed, 13 May 2015 06:03:52 +0000 (23:03 -0700)]
switchdev: sparse warning: pass ipv4 fib dst as network-byte order

And let driver convert it to host-byte order as needed.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoswitchdev: sparse warning: make __switchdev_port_obj_add static
Scott Feldman [Wed, 13 May 2015 06:03:51 +0000 (23:03 -0700)]
switchdev: sparse warning: make __switchdev_port_obj_add static

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Wed, 13 May 2015 04:10:38 +0000 (21:10 -0700)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Handle max TX power properly wrt VIFs and the MAC in iwlwifi, from
    Avri Altman.

 2) Use the correct FW API for scan completions in iwlwifi, from Avraham
    Stern.

 3) FW monitor in iwlwifi accidently uses unmapped memory, fix from Liad
    Kaufman.

 4) rhashtable conversion of mac80211 station table was buggy, the
    virtual interface was not taken into account.  Fix from Johannes
    Berg.

 5) Fix deadlock in rtlwifi by not using a zero timeout for
    usb_control_msg(), from Larry Finger.

 6) Update reordering state before calculating loss detection, from
    Yuchung Cheng.

 7) Fix off by one in bluetooth firmward parsing, from Dan Carpenter.

 8) Fix extended frame handling in xiling_can driver, from Jeppe
    Ledet-Pedersen.

 9) Fix CODEL packet scheduler behavior in the presence of TSO packets,
    from Eric Dumazet.

10) Fix NAPI budget testing in fm10k driver, from Alexander Duyck.

11) macvlan needs to propagate promisc settings down the the lower
    device, from Vlad Yasevich.

12) igb driver can oops when changing number of rings, from Toshiaki
    Makita.

13) Source specific default routes not handled properly in ipv6, from
    Markus Stenberg.

14) Use after free in tc_ctl_tfilter(), from WANG Cong.

15) Use softirq spinlocking in netxen driver, from Tony Camuso.

16) Two ARM bpf JIT fixes from Nicolas Schichan.

17) Handle MSG_DONTWAIT properly in ring based AF_PACKET sends, from
    Mathias Kretschmer.

18) Fix x86 bpf JIT implementation of FROM_{BE16,LE16,LE32}, from Alexei
    Starovoitov.

19) ll_temac driver DMA maps TX packet header with incorrect length, fix
    from Michal Simek.

20) We removed pm_qos bits from netdevice.h, but some indirect
    references remained.  Kill them.  From David Ahern.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (90 commits)
  net: Remove remaining remnants of pm_qos from netdevice.h
  e1000e: Add pm_qos header
  net: phy: micrel: Fix regression in kszphy_probe
  net: ll_temac: Fix DMA map size bug
  x86: bpf_jit: fix FROM_BE16 and FROM_LE16/32 instructions
  netns: return RTM_NEWNSID instead of RTM_GETNSID on a get
  Update be2net maintainers' email addresses
  net_sched: gred: use correct backlog value in WRED mode
  pppoe: drop pppoe device in pppoe_unbind_sock_work
  net: qca_spi: Fix possible race during probe
  net: mdio-gpio: Allow for unspecified bus id
  af_packet / TX_RING not fully non-blocking (w/ MSG_DONTWAIT).
  bnx2x: limit fw delay in kdump to 5s after boot
  ARM: net: delegate filter to kernel interpreter when imm_offset() return value can't fit into 12bits.
  ARM: net fix emit_udiv() for BPF_ALU | BPF_DIV | BPF_K intruction.
  mpls: Change reserved label names to be consistent with netbsd
  usbnet: avoid integer overflow in start_xmit
  netxen_nic: use spin_[un]lock_bh around tx_clean_lock (2)
  net: xgene_enet: Set hardware dependency
  net: amd-xgbe: Add hardware dependency
  ...

9 years agonet: Remove remaining remnants of pm_qos from netdevice.h
David Ahern [Tue, 12 May 2015 15:37:00 +0000 (09:37 -0600)]
net: Remove remaining remnants of pm_qos from netdevice.h

Commit e2c6544829f removed pm_qos from struct net_device but left the
comment and header file. Remove those.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoe1000e: Add pm_qos header
David Ahern [Tue, 12 May 2015 15:36:59 +0000 (09:36 -0600)]
e1000e: Add pm_qos header

Commit e2c6544829f moved pm_qos_req to e1000_adapter. Add the header file
that defines the struct.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: make skb_dst_pop routine static
Ying Xue [Tue, 12 May 2015 10:29:44 +0000 (18:29 +0800)]
net: make skb_dst_pop routine static

As xfrm_output_one() is the only caller of skb_dst_pop(), we should
make skb_dst_pop() localized.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: phy: micrel: Fix regression in kszphy_probe
Niklas Cassel [Tue, 12 May 2015 07:43:14 +0000 (09:43 +0200)]
net: phy: micrel: Fix regression in kszphy_probe

Don't do clock-mode-select if clk == NULL,
since when building without CONFIG_HAVE_CLK,
clk_get returns NULL and clk_get_rate returns 0.

Doing clock-mode-select in this cause causes kszphy_probe to
return -EINVAL and thus prevents the device from being probed.

The original code (before regression) would return 0
when building without CONFIG_HAVE_CLK.

Cc: stable <stable@vger.kernel.org> # 3.18+
Fixes: 1fadee0c3645 ("net/phy: micrel: Add clock support for
KSZ8021/KSZ8031")
Reviewed-by: Fabio Estevam <fabio.estevam@freescale.com>
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Niklas Cassel <niklass@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: ll_temac: Fix DMA map size bug
Michal Simek [Tue, 12 May 2015 06:06:15 +0000 (08:06 +0200)]
net: ll_temac: Fix DMA map size bug

DMA allocates skb->len instead of headlen
which is used for DMA.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotest_bpf: add 173 new testcases for eBPF
Michael Holzheu [Tue, 12 May 2015 05:22:44 +0000 (22:22 -0700)]
test_bpf: add 173 new testcases for eBPF

add an exhaustive set of eBPF tests bringing total to:
test_bpf: Summary: 233 PASSED, 0 FAILED, [0/226 JIT'ed]

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agosamples/bpf: fix in-source build of samples with clang
Brenden Blanco [Tue, 12 May 2015 04:25:51 +0000 (21:25 -0700)]
samples/bpf: fix in-source build of samples with clang

in-source build of 'make samples/bpf/' was incorrectly
using default compiler instead of invoking clang/llvm.
out-of-source build was ok.

Fixes: a80857822b0c ("samples: bpf: trivial eBPF program in C")
Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agox86: bpf_jit: fix FROM_BE16 and FROM_LE16/32 instructions
Alexei Starovoitov [Tue, 12 May 2015 06:25:16 +0000 (23:25 -0700)]
x86: bpf_jit: fix FROM_BE16 and FROM_LE16/32 instructions

FROM_BE16:
'ror %reg, 8' doesn't clear upper bits of the register,
so use additional 'movzwl' insn to zero extend 16 bits into 64

FROM_LE16:
should zero extend lower 16 bits into 64 bit

FROM_LE32:
should zero extend lower 32 bits into 64 bit

Fixes: 89aa075832b0 ("net: sock: allow eBPF programs to be attached to sockets")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agocxgb4/cxgb4vf: Cleanup macros, add comments and add new MACROS
Hariprasad Shenai [Mon, 11 May 2015 23:13:43 +0000 (04:43 +0530)]
cxgb4/cxgb4vf: Cleanup macros, add comments and add new MACROS

Cleanup few MACROS left out in t4_hw.h to be consistent with the
existing ones. Also replace few hardcoded values with MACROS. Also
update comments for some code

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agohv_netvsc: Use the xmit_more skb flag to optimize signaling the host
KY Srinivasan [Mon, 11 May 2015 22:39:46 +0000 (15:39 -0700)]
hv_netvsc: Use the xmit_more skb flag to optimize signaling the host

Based on the information given to this driver (via the xmit_more skb flag),
we can defer signaling the host if more packets are on the way. This will help
make the host more efficient since it can potentially process a larger batch of
packets. Implement this optimization.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agopktgen: fix packet generation
Alexei Starovoitov [Mon, 11 May 2015 22:19:48 +0000 (15:19 -0700)]
pktgen: fix packet generation

pkt_gen->last_ok was not set properly, so after the first burst
pktgen instead of allocating new packet, will reuse old one, advance
eth_type_trans further, which would mean the stack will be seeing very
short bogus packets.

Fixes: 62f64aed622b ("pktgen: introduce xmit_mode '<start_xmit|netif_receive>'")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'systemport-irq-coalesce'
David S. Miller [Wed, 13 May 2015 03:08:47 +0000 (23:08 -0400)]
Merge branch 'systemport-irq-coalesce'

Florian Fainelli says:

====================
net: systemport: interrupt coalescing support

This patch series adds support for RX & TX interrupt coalescing in the
systemport driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: systemport: Implement RX coalescing control knobs
Florian Fainelli [Mon, 11 May 2015 22:12:42 +0000 (15:12 -0700)]
net: systemport: Implement RX coalescing control knobs

Similarly to the TX path, allow the RX path to be configured with both
'rx-frames' and 'rx-usecs' coalescing parameters.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: systemport: Implement TX coalescing control knobs
Florian Fainelli [Mon, 11 May 2015 22:12:41 +0000 (15:12 -0700)]
net: systemport: Implement TX coalescing control knobs

Add the ability to configure both 'tx-frames' which controls how many frames
are doing to trigger a single interrupt and 'tx-usecs' which dictates how long
to wait before an interrupt should be services.

Since our timer resolution is close to 8.192 us, we round up to the nearest
value the 'tx-usecs' timeout value.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: deinline netif_tx_stop_all_queues(), remove WARN_ON in netif_tx_stop_queue()
Denys Vlasenko [Mon, 11 May 2015 19:17:53 +0000 (21:17 +0200)]
net: deinline netif_tx_stop_all_queues(), remove WARN_ON in netif_tx_stop_queue()

These functions compile to 60 bytes of machine code each.
With this .config: http://busybox.net/~vda/kernel_config
there are 617 calls of netif_tx_stop_queue()
and 49 calls of netif_tx_stop_all_queues() in vmlinux.

To fix this, remove WARN_ON in netif_tx_stop_queue()
as suggested by davem, and deinline netif_tx_stop_all_queues().

Change in code size is about 20k:

   text      data      bss       dec     hex filename
82426986 22255416 20627456 125309858 77813a2 vmlinux.before
82406248 22255416 20627456 125289120 777c2a0 vmlinux

gcc-4.7.2 still creates deinlined version of netif_tx_stop_queue
sometimes:

$ nm --size-sort vmlinux | grep netif_tx_stop_queue | wc -l
190

ffffffff81b558a8 <netif_tx_stop_queue>:
ffffffff81b558a8:       55                      push   %rbp
ffffffff81b558a9:       48 89 e5                mov    %rsp,%rbp
ffffffff81b558ac:       f0 80 8f e0 01 00 00    lock orb $0x1,0x1e0(%rdi)
ffffffff81b558b3:       01
ffffffff81b558b4:       5d                      pop    %rbp
ffffffff81b558b5:       c3                      retq

This needs additional fixing.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Alexei Starovoitov <alexei.starovoitov@gmail.com>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: Joe Perches <joe@perches.com>
CC: David S. Miller <davem@davemloft.net>
CC: Jiri Pirko <jpirko@redhat.com>
CC: linux-kernel@vger.kernel.org
CC: netdev@vger.kernel.org
CC: netfilter-devel@vger.kernel.org
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomacvtap add missing ioctls - fix wrapping
Justin Cormack [Mon, 11 May 2015 19:00:10 +0000 (20:00 +0100)]
macvtap add missing ioctls - fix wrapping

The macvtap driver tries to emulate all the ioctls supported by a normal
tun/tap driver, however it was missing the generic SIOCGIFHWADDR and
SIOCSIFHWADDR ioctls to get and set the mac address that are supported
by tun/tap. This patch adds these.

Signed-off-by: Justin Cormack <justin@netbsd.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Linus Torvalds [Tue, 12 May 2015 23:02:06 +0000 (16:02 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus

Pull MIPS fixes from Ralf Baechle:
 "One build fix for build breakage of all MIPS SMP kernels caused by
  Rusty's fix of obsolete use of cpu mask helpers, another to fix the FP
  ABI selection when loading an ELF binary"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: fix FP mode selection in lieu of .MIPS.abiflags data
  MIPS: SMP: Fix build error.

9 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Linus Torvalds [Tue, 12 May 2015 22:54:54 +0000 (15:54 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/dledford/rdma

Pull rdma fixes from Doug Ledford:
 - update MAINTAINERS git repo pointer
 - printk garbage fix
 - fix for qib and iw_cxgb4 bugs introduced in 4.1 window
 - fix for an older iWARP netlink bug
 - fix a memcpy issue in ehca driver

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  infiniband: Remove duplicated KERN_<LEVEL> from pr_<level> uses
  IB/qib: fix test of unsigned variable
  RDMA/core: Fix for parsing netlink string attribute
  MAINTAINERS: update the official rdma git repo
  iw_cxgb4: use wildcard mapping for getting remote addr info
  IB/ehca: use correct destination for memcpy

9 years agonetns: return RTM_NEWNSID instead of RTM_GETNSID on a get
Nicolas Dichtel [Mon, 11 May 2015 13:57:31 +0000 (15:57 +0200)]
netns: return RTM_NEWNSID instead of RTM_GETNSID on a get

Usually, RTM_NEWxxx is returned on a get (same as a dump).

Fixes: 0c7aecd4bde4 ("netns: add rtnl cmd to add and get peer netns ids")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge tag 'for-v4.1-rc' of git://git.infradead.org/battery-2.6
Linus Torvalds [Tue, 12 May 2015 22:49:29 +0000 (15:49 -0700)]
Merge tag 'for-v4.1-rc' of git://git.infradead.org/battery-2.6

Pull power supply and reset fixes from Sebastian Reichel:
 "misc fixes"

* tag 'for-v4.1-rc' of git://git.infradead.org/battery-2.6:
  power: bq27x00_battery: Add missing MODULE_ALIAS
  power: reset: Add MFD_SYSCON depends for brcmstb
  power: reset: ltc2952: Remove bogus hrtimer_start() return value checks
  power_supply: fix oops in collie_battery driver
  power/reset: at91: fix return value check in at91_reset_platform_probe()
  MAINTAINERS: Add me as maintainer of Nokia N900 power supply drivers
  axp288_fuel_gauge: Add original author details

9 years agoMerge branch 'switchdev_spring_cleanup'
David S. Miller [Tue, 12 May 2015 22:43:56 +0000 (18:43 -0400)]
Merge branch 'switchdev_spring_cleanup'

Scott Feldman says:

====================
switchdev: spring cleanup

v7:

Address review comments:

 - [Jiri] split the br_setlink and br_dellink reverts into their own patches
 - [Jiri] some parameter cleanup of rocker's memory allocators
 - [Jiri] pass trans mode as formal parameter rather than hanging off of
     rocker_port.

v6:

Address review comments:

 - [Jiri] split a couple of patches into one-logical-change per patch
 - [Joe Perches] revert checkpatch -f changes for wrapped lines with long
     symbols.

v5:

Address review comments:

 - [Jiri] include Jiri's s/swdev/switchdev rename patches up front.
 - [Jiri] squash some patches.  Now setlink/dellink/getlink patches are in
     three parts: new implementation, convert drivers to new, delete old impl.
 - [Jiri] some minor variable renames
 - [Jiri] use BUG_ON rather than WARN when COMMIT phase fails when PREPARE
     phase said it was safe to come into the water.
 - [Simon] rocker: fix a few transaction prepare-commit cases that were wrong.
     This was the bulk of the changes in v5.

v4:

Well, it was a lot of work, but now prepare-commit transaction model is how
davem advises: if prepare fails, abort the transaction.  The driver must do
resource reservations up front in prepare phase and return those resources if
aborting.  Commit phase would use reserved resources.  The good news is the
driver code (for rocker) now handles resource allocation failures better by not
leaving partially device or driver states.  This is a side-effect of the
prepare phase where state isn't modified; only validation of inputs and
resource reservations happen in the prepare phase.  Since we're supporting
setting attrs and add objs across lower devs in the stacked case, we need to
hold rtnl_lock (or ensure rtnl_lock is held) so lower devs don't move on us
during the prepare-commit transaction.  DSA driver code skips the prepare phase
and goes straight for the commit phase since no up-front allocations are done
and no device failures (that could be detected in the prepare phase) can
happen.

Remove NETIF_F_HW_SWITCH_OFFLOAD from rocker and the swdev_attr_set/get
wrappers.  DSA doesn't set NETIF_F_HW_SWITCH_OFFLOAD, so it can't be in
swdev_attr_set/get.  rocker doesn't need it; or rather can't support
NETIF_F_HW_SWITCH_OFFLOAD being set/cleared at run-time after the device
port is already up and offloading L2/L3.  NETIF_F_HW_SWITCH_OFFLOAD is still
left as a feature flag for drivers that can use it.

Drop the renaming patch for netdev_switch_notifier.  Other renames are a
result of moving to the attr get/set or obj add/del model.  Everything
but the netdev_switch_notifier is still prefixed with "swdev_".

v3:

Move to two-phase prepare-commit transaction model for attr set and obj add.
Driver gets a change in prepare phase to NACK transaction if lack of resources
or support in device.

v2:

Address review comments:

 - [Jiri] squash a few related patches
 - [Roopa] don't remove NETIF_F_HW_SWITCH_OFFLOAD
 - [Roopa] address VLAN setlink/dellink
 - [Ronen] print warning is attr set revert fails

Not address:

 - Using something other than "swdev_" prefix
 - Vendor extentions

The patch set grew a bit to not only support port attr get/set but also add
support for port obj add/del.  Example of port objs are VLAN, FDB entries, and
FIB entries.  The VLAN support now allows the swdev driver to get VLAN ranges
and flags like PVID and "untagged".  Sridhar will be adding FDB obj support
in follow-on patch.

v1:

The main theme of this patch set is to cleanup swdev in preparation for
new features or fixes to be added soon.  We have a pretty good idea now how
to handle stacked drivers in swdev, but there where some loose ends.  For
example, if a set failed in the middle of walking the lower devs, we would
leave the system in an undefined state...there was no way to recover back to
the previous state.  Speaking of sets, also recognize a pattern that most
swdev API accesses are gets or sets of port attributes, so go ahead and make
port attr get/set the central swdev API, and convert everything that is
set-ish/get-ish to this new API.

Features/fixes that should follow from this cleanup:

 - solve the duplicate pkt forwarding issue
 - get/set bridge attrs, like ageing_time, from/to device
 - get/set more bridge port attrs from/to device

There are some rename cleanups tagging along at the end, to give swdev
consistent naming.

And finally, some much needed updates to the switchdev.txt documentation to
hopefully capture the state-of-the-art of swdev.  Hopefully, we can do a better
job keeping this document up-to-date.

Tested with rocker, of course, to make sure nothing functional broke.  There
are a couple minor tweaks to DSA code for getting switch ID and setting STP
updates to use new API, but not expecting amy breakage there.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>