firefly-linux-kernel-4.4.55.git
Andrew Rybchenko [Thu, 28 Nov 2013 04:28:13 +0000 (08:28 +0400)]
sfc: remove unused 'enum efx_rx_alloc_method'

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Andrew Rybchenko [Sat, 23 Nov 2013 04:42:07 +0000 (08:42 +0400)]
sfc: remove unused 'refcnt' from efx_rx_page_state

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Thu, 21 Nov 2013 19:15:03 +0000 (19:15 +0000)]
sfc: Implement efx_nic_type::filter_clear_rx operation for EF10

The operation can now fail, so change its return type to int.

Remove the inline wrapper while we're changing the signature.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Thu, 21 Nov 2013 19:11:47 +0000 (19:11 +0000)]
sfc: Allow filter removal only with exactly matching priority

Currently a higher priority client can remove a lower priority
client's filter with equal match-expression.  This might happen if (a)
the higher priority client has a double-free bug, or (b) another
client with sufficient priority replaced and then removed an equal
filter, allowing the low priority client to insert an equal filter.

In neither case does it actually make sense to carry out the removal;
we should say the filter doesn't exist, as the filter currently
present is not the one that the high-priority client is referring to.
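
The rule above can be sketched in plain C as follows. This is only an illustration under stated assumptions: the table layout, the priority names and filter_remove_exact() are hypothetical and are not the driver's actual data structures.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical priority levels, lowest first; names are illustrative only. */
enum filter_priority { PRI_HINT, PRI_MANUAL, PRI_REQUIRED };

struct filter_entry {
    bool in_use;
    enum filter_priority priority;
};

/*
 * Remove the filter in slot 'id' only if it was inserted at exactly the
 * caller's priority.  A filter owned by a different-priority client is
 * reported as "not found" rather than removed.
 */
static int filter_remove_exact(struct filter_entry *table, size_t n,
                               size_t id, enum filter_priority priority)
{
    if (id >= n || !table[id].in_use)
        return -ENOENT;
    if (table[id].priority != priority)
        return -ENOENT; /* present filter is not the caller's filter */
    table[id].in_use = false;
    return 0;
}
```

With this shape, a double free by a high-priority client leaves a low-priority client's equal filter in place.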

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Thu, 21 Nov 2013 19:02:22 +0000 (19:02 +0000)]
sfc: Don't refer to 'stack' in filter implementation

Change all the 'stack' naming to 'auto' (or other meaningful term);
the device address list is based on more than just what the network
stack wants, and the no-match filters aren't really what the stack
wants at all.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Thu, 21 Nov 2013 19:02:18 +0000 (19:02 +0000)]
sfc: Change priority and flags for automatic MAC filters

MAC filters inserted automatically by the driver, based on the device
address list (EF10) or no-match filters (Siena), should be overridable
at MANUAL or REQUIRED priority.  Currently they themselves have
REQUIRED priority and this requires some odd special-casing.

We also can't reliably tell whether such a MAC filter has or has
not been overridden.  We just remember that it is wanted by the
stack (RX_STACK flag).

Add another priority level, AUTO, between HINT and MANUAL, and
use this for the automatic filters while they have not been
overridden.  Remove the RX_STACK flag.  Add an RX_OVER_AUTO
flag which is set only when an AUTO filter has been overridden
(or was requested to be inserted while a higher-priority filter
existed).
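
A minimal sketch of how the AUTO level and the RX_OVER_AUTO flag might interact, assuming one filter per slot. The struct, the names and filter_insert() are hypothetical; the handling of cases not described above (for example equal priorities) is a simplification.

```c
#include <stdbool.h>

/* Priority order as described above: HINT < AUTO < MANUAL < REQUIRED. */
enum filter_priority { PRI_HINT, PRI_AUTO, PRI_MANUAL, PRI_REQUIRED };

#define FLAG_RX_OVER_AUTO 0x1u /* hypothetical flag value */

struct filter_slot {
    bool in_use;
    enum filter_priority priority;
    unsigned int flags;
};

static void filter_insert(struct filter_slot *slot,
                          enum filter_priority priority)
{
    if (!slot->in_use) {
        slot->in_use = true;
        slot->priority = priority;
        slot->flags = 0;
        return;
    }
    if (priority == PRI_AUTO && slot->priority > PRI_AUTO) {
        /* AUTO requested while a higher-priority filter exists:
         * keep that filter, but remember that AUTO is also wanted. */
        slot->flags |= FLAG_RX_OVER_AUTO;
        return;
    }
    if (priority > PRI_AUTO && slot->priority == PRI_AUTO) {
        /* A higher-priority filter overrides the automatic one. */
        slot->priority = priority;
        slot->flags |= FLAG_RX_OVER_AUTO;
        return;
    }
    /* Simplification: otherwise an equal or higher priority replaces. */
    if (priority >= slot->priority)
        slot->priority = priority;
}
```

The flag lets the driver tell "overriding an automatic filter" apart from "the only filter wanted here", which the old RX_STACK flag could not do reliably.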

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Andrew Rybchenko [Thu, 14 Nov 2013 05:00:27 +0000 (09:00 +0400)]
sfc: Change efx_nic_type::rx_push_indir_table to push hash key as well

The EF10 implementation already does this, and it makes more logical
sense to group the RSS hash key and indirection table together.
Rename the operation to rx_push_rss_config.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Fri, 1 Nov 2013 16:42:44 +0000 (16:42 +0000)]
sfc: Add more information to many warnings using WARN() and netdev_WARN()

In case of certain hardware and firmware errors it can be useful to
have more context than just the file and line number.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Thu, 28 Nov 2013 18:58:12 +0000 (18:58 +0000)]
sfc: Remove unnecessary condition for processing the TX timestamp queue

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Thu, 28 Nov 2013 18:58:11 +0000 (18:58 +0000)]
sfc: Don't clear timestamps in efx_ptp_rx()

A freshly allocated skb starts with timestamps clear.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Thu, 5 Dec 2013 21:28:42 +0000 (21:28 +0000)]
sfc: Enable PTP clock and timestamping for all functions on EF10

The SFC9100 family has only one clock per controller, shared by all
functions.  Therefore only create a clock device under the primary
function, and make all other functions refer to the primary's clock
device.

Since PTP functionality is limited to port 0 and PF 0 on the earlier
SFN[56]322F boards, and we also set the primary flag for that
function, we can make the creation of a clock device conditional only
on this flag.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Fri, 18 Oct 2013 18:21:45 +0000 (19:21 +0100)]
sfc: Associate primary and secondary functions of controller

The primary function of an EF10 controller will share its clock
device with other functions in the same domain (which we call
secondary functions).  To this end, we need to associate functions
on the same controller.

We do not control probe order, so allow primary and secondary
functions to appear in any order.  Maintain global lists of all
primary functions and of unassociated secondary functions,
and a list of secondary functions on each primary function.

Use the VPD serial number to tell whether functions are part of the
same controller.  VPD will not be readable by virtual functions, so
this may need to be revisited later.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Thu, 5 Dec 2013 20:13:22 +0000 (20:13 +0000)]
sfc: Store VPD serial number at probe time

Original version by Stuart Hodgson.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Jon Cooper [Mon, 18 Nov 2013 12:54:41 +0000 (12:54 +0000)]
sfc: Add RX packet timestamping for EF10

The EF10 firmware can optionally insert RX timestamps in the packet
prefix.  These only include the clock minor value.  We must also
enable periodic time sync events on each event queue which provide
the high bits of the clock value.

[bwh: Combined and rebased several changes.
 Added the above description and some sanity checks for inline vs
 separate timestamps.
 Changed efx_rx_skb_attach_timestamp() to read the packet prefix
 from the skb head area.]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Thu, 28 Nov 2013 18:58:11 +0000 (18:58 +0000)]
sfc: Copy RX prefix into skb head area in efx_rx_mk_skb()

We can potentially pull the entire packet contents into the head area
and then free the page it was in.  In order to read an inline
timestamp safely, we need to copy the prefix into the head area as
well.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Daniel Pieczko [Thu, 21 Nov 2013 17:11:25 +0000 (17:11 +0000)]
sfc: split setup of hardware timestamping into NIC-type operation

I added efx_ptp_get_mode() to avoid moving the definition for
efx_ptp_data, since the current PTP mode is needed for
siena.c:siena_set_ptp_hwtstamp.

[bwh: Also move the rx_filters mask, and add kernel-doc]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Laurence Evans [Wed, 4 Dec 2013 23:47:56 +0000 (23:47 +0000)]
sfc: Add support for SFC9100 timestamp format

The clock minor tick on the SFC9100 family is 2^-27 s, not 1 ns.
There are also various pipeline delays which we need to correct for
when interpreting timestamps.

We query the firmware for the clock format and corrections at run-time.
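
As a worked example of the unit conversion, a minor value counted in 2^-27 s ticks can be turned into nanoseconds with a 64-bit multiply and shift. The helper names and the single signed correction_ns parameter are assumptions for illustration; the real corrections are the values queried from the firmware.

```c
#include <stdint.h>

#define NSEC_PER_SEC_LL  1000000000LL
#define MINOR_TICK_SHIFT 27 /* one minor tick is 2^-27 s */

/* Convert a sub-second tick count (< 2^27) to nanoseconds, rounding down. */
static uint32_t minor_ticks_to_ns(uint32_t minor)
{
    return (uint32_t)(((uint64_t)minor * (uint64_t)NSEC_PER_SEC_LL)
                      >> MINOR_TICK_SHIFT);
}

/* Combine seconds, minor ticks and a signed correction into nanoseconds. */
static int64_t timestamp_to_ns(uint32_t major, uint32_t minor,
                               int32_t correction_ns)
{
    return (int64_t)major * NSEC_PER_SEC_LL
           + (int64_t)minor_ticks_to_ns(minor)
           + (int64_t)correction_ns;
}
```

Since minor is below 2^27 and 10^9 is below 2^30, the intermediate product stays below 2^57 and cannot overflow 64 bits.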

[bwh: Combined and rebased several changes]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Laurence Evans [Thu, 21 Nov 2013 10:38:24 +0000 (10:38 +0000)]
sfc: Tidy up PTP synchronization code

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Laurence Evans [Thu, 21 Nov 2013 10:38:24 +0000 (10:38 +0000)]
sfc: PTP - tidy up unused/useless variables

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Fri, 6 Dec 2013 18:52:34 +0000 (18:52 +0000)]
sfc: Remove kernel-doc for efx_ptp_data fields not present in this version

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Wed, 16 Oct 2013 17:32:41 +0000 (18:32 +0100)]
sfc: Initialise efx_ptp_data::phc_clock_info from a static template

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Wed, 16 Oct 2013 17:32:39 +0000 (18:32 +0100)]
sfc: Do not use MAC address as clock name

We'll be sharing clocks between multiple functions with their own MAC
addresses.  The name field is now documented as 'A short "friendly
name" to identify the clock ...' and '... not meant to be a unique
id.'  So use the name 'sfc'.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Wed, 16 Oct 2013 17:32:34 +0000 (18:32 +0100)]
sfc: Store flags from MC_CMD_DRV_ATTACH for later use

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Tue, 15 Oct 2013 16:54:56 +0000 (17:54 +0100)]
sfc: Remove dependency of PTP on having a dedicated channel

We need a dedicated channel on Siena to ensure we can match up
the separate RX and timestamp events for each PTP packet.  We won't
do this for EF10 as timestamps are delivered inline.

Pass a channel index of 0 to MC_CMD_PTP_OP_ENABLE when there is no
dedicated channel.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Tue, 15 Oct 2013 16:54:56 +0000 (17:54 +0100)]
sfc: Split PTP multicast filter insertion/removal out of efx_ptp_{start,stop}()

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Wed, 9 Oct 2013 13:17:27 +0000 (14:17 +0100)]
sfc: Return EBUSY for filter insertion on EF10, matching Falcon/Siena

The MC firmware will return error MC_CMD_ERR_ENOSPC if filter
insertion fails due to lack of resources.  The net driver's filter
implementation for Falcon-architecture returns EBUSY.  They should
behave consistently, so for EF10 change ENOSPC to EBUSY.
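
The translation amounts to a one-line mapping. In this sketch filter_insert_errno() is a hypothetical helper name, and rc is assumed to be the already-converted negative errno from the firmware call.

```c
#include <errno.h>

/*
 * Report "out of filter resources" as -EBUSY so that EF10 matches the
 * Falcon-architecture behaviour; all other results pass through unchanged.
 */
static int filter_insert_errno(int rc)
{
    return rc == -ENOSPC ? -EBUSY : rc;
}
```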

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Wed, 9 Oct 2013 13:14:41 +0000 (14:14 +0100)]
sfc: Expose NVRAM_PARTITION_TYPE_LICENSE on EF10

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Tue, 8 Oct 2013 16:33:20 +0000 (17:33 +0100)]
sfc: Fold efx_flush_all() into efx_stop_port() and update comments

efx_flush_all() is a really misleading name - it has nothing to do
with e.g. flushing DMA queues.  Since it's called immediately after
efx_stop_port() and is highly dependent on what that does, combine
the two functions.

Update comments to explain what this is doing a little better.
Also update a related and erroneous comment in efx_start_port().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Tue, 8 Oct 2013 15:36:58 +0000 (16:36 +0100)]
sfc: Map MCDI error MC_CMD_ERR_ENOTSUP to Linux EOPNOTSUPP

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Edward Cree [Fri, 31 May 2013 17:36:12 +0000 (18:36 +0100)]
sfc: Log all unexpected MCDI errors

Split each of efx_mcdi_rpc, efx_mcdi_rpc_finish, and efx_mcdi_rpc_async into
a normal and a _quiet version.  The former logs MCDI errors with netif_err
(and includes the raw MCDI error code); the latter never logs them at all.
Change various callers: where some errors are expected (but others are
not), call the _quiet version and then, if necessary, log the MCDI error
in the caller.  The logging is done by the new efx_mcdi_display_error.

Callers of efx_mcdi_rpc*_quiet functions which may want to log the error
need to ensure that their outbuf is big enough to hold an MCDI error; to
this end, they now use MCDI_DECLARE_BUF_OUT_OR_ERR, which always allocates
at least 8 bytes.
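
The normal/_quiet split follows a common wrapper pattern, sketched here in userspace C. Everything in it is a stand-in: rpc_quiet() fakes the firmware, log_count replaces netif_err for testability, and none of the signatures match the real efx_mcdi_* functions.

```c
#include <errno.h>
#include <stdio.h>

static int log_count; /* counts logged errors, for illustration only */

/* Stand-in for the firmware call: command 1 succeeds, others fail. */
static int rpc_quiet(unsigned int cmd, int *raw_err)
{
    if (cmd == 1) {
        *raw_err = 0;
        return 0;
    }
    *raw_err = 42; /* arbitrary raw error code for this sketch */
    return -EIO;
}

/* Shared error reporter, usable by callers of the quiet variant too. */
static void rpc_display_error(unsigned int cmd, int rc, int raw_err)
{
    log_count++;
    fprintf(stderr, "cmd %u failed: rc=%d (raw %d)\n", cmd, rc, raw_err);
}

/* Normal variant: same call, but every error is logged. */
static int rpc(unsigned int cmd)
{
    int raw_err;
    int rc = rpc_quiet(cmd, &raw_err);

    if (rc)
        rpc_display_error(cmd, rc, raw_err);
    return rc;
}
```

A caller that expects some errors would call rpc_quiet(), filter out the expected codes, and pass anything else to rpc_display_error() itself.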

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Wed, 4 Dec 2013 20:17:28 +0000 (20:17 +0000)]
sfc: Add new sensor names

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Edward Cree [Thu, 3 Oct 2013 18:06:18 +0000 (19:06 +0100)]
sfc: Revise sensor names to be more understandable and consistent

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Edward Cree [Mon, 30 Sep 2013 09:52:49 +0000 (10:52 +0100)]
sfc: Report units in sensor warnings

Add units to the "Sensor reports condition X for raw value Y" messages.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Jon Cooper [Mon, 30 Sep 2013 16:36:50 +0000 (17:36 +0100)]
sfc: Correct RX dropped count for drops while interface is down

We don't directly control RX ingress on Siena or any later
controllers, and so we cannot prevent packets from entering the RX
datapath while the RX queues are not set up.  This results in
the hardware incrementing RX_NODESC_DROP_CNT, but it's not an
error and we should not include it in error stats.

When bringing an interface up or down, pull (or wait for) stats and
count the number of packets that were dropped while the interface was
down.  Subtract this from the reported RX dropped count.
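
The bookkeeping can be sketched as below, assuming a monotonically increasing hardware counter that is sampled on each up/down transition. The struct and function names are hypothetical.

```c
#include <stdint.h>

struct rx_drop_stats {
    uint64_t drops_while_down; /* total drops accumulated while down */
    uint64_t count_at_down;    /* hardware counter when we went down */
};

/* Interface going down: remember the current hardware counter. */
static void rx_drops_on_down(struct rx_drop_stats *s, uint64_t hw_count)
{
    s->count_at_down = hw_count;
}

/* Interface coming up: everything counted since going down is benign. */
static void rx_drops_on_up(struct rx_drop_stats *s, uint64_t hw_count)
{
    s->drops_while_down += hw_count - s->count_at_down;
}

/* Value to report: hardware count minus drops that happened while down. */
static uint64_t rx_drops_reported(const struct rx_drop_stats *s,
                                  uint64_t hw_count)
{
    return hw_count - s->drops_while_down;
}
```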

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Jon Cooper [Wed, 2 Oct 2013 10:04:14 +0000 (11:04 +0100)]
sfc: Make initial fill of RX descriptors synchronous

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Tue, 24 Sep 2013 22:21:57 +0000 (23:21 +0100)]
sfc: Tighten the check for RX merged completion events

The addition of RX event merging support means we don't reliably
detect dropped RX events now.  Currently we will only detect them if
the previous event for the RX queue had the CONT bit set.

Only accept RX completion events as merged if the
GET_CAPABILITIES_OUT_RX_BATCHING bit is set in datapath_caps (which it
won't be for the low-latency datapath) and the CONT bit is not set on
the event.
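
The tightened condition reduces to a two-part predicate. In this sketch the bit position of CAP_RX_BATCHING is a placeholder (the real GET_CAPABILITIES_OUT_RX_BATCHING value comes from the MCDI definitions) and rx_event_may_be_merged() is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

#define CAP_RX_BATCHING (1u << 0) /* placeholder bit position */

/*
 * An RX completion may be treated as merged only when the datapath
 * advertises batching and the event does not have its CONT bit set.
 */
static bool rx_event_may_be_merged(uint32_t datapath_caps, bool cont_bit)
{
    return (datapath_caps & CAP_RX_BATCHING) != 0 && !cont_bit;
}
```

Events that fail this check fall back to the ordinary path, where a gap in the sequence is treated as a dropped event.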

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Jon Cooper [Mon, 16 Sep 2013 13:18:51 +0000 (14:18 +0100)]
sfc: Add MC BISTs to ethtool offline self test on EF10

To run BISTs the MC goes down into a special mode where it will only
respond to MCDI from the testing PF, and TX, RX and event queues are
torn down. Other PFs get a message as it goes down to tell them it's
going down.

When the other PFs get this message, they check the soft status
register to tell when the MC has rebooted after BIST mode and they can
start recovery.

[bwh: Convert the test result to 1 or -1 as for earlier NICs]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Wed, 4 Dec 2013 19:48:07 +0000 (19:48 +0000)]
sfc: Update MCDI protocol definitions

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Robert Stonehouse [Mon, 7 Oct 2013 17:44:17 +0000 (18:44 +0100)]
sfc: Demote "MC Scheduler error" messages

The MC firmware is cooperatively multitasking and its scheduler will
send an event when a task yields after running for more than the
expected maximum time.  This can be useful for firmware development
but does not usually indicate a serious error and does not help to
detect a lockup (there is a hardware watchdog that does that).
Change the message and reduce log level accordingly.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Fri, 6 Dec 2013 22:28:18 +0000 (22:28 +0000)]
Merge branch 'sfc-3.13' into master

Merge sfc fixes destined for 3.13, as development for 3.14+ depends on
some of them.

Robert Stonehouse [Wed, 9 Oct 2013 10:52:48 +0000 (11:52 +0100)]
sfc: Poll for MCDI completion once before timeout occurs

There is an as-yet unexplained bug that sometimes prevents (or delays)
the driver seeing the completion event for a completed MCDI request on
the SFC9120.  The requested configuration change will have happened
but the driver assumes it to have failed, and this can result in
further failures.  We can mitigate this by polling for completion
after unsuccessfully waiting for an event.
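
The mitigation has the following control flow, sketched with stub functions. The struct request fields and the two helpers stand in for the real timed wait and completion poll; only the ordering of the checks is the point.

```c
#include <errno.h>
#include <stdbool.h>

struct request {
    bool event_seen; /* completion event reached the driver */
    bool hw_done;    /* hardware actually completed the request */
};

/* Stand-in for a timed wait on the completion event. */
static bool wait_for_event(const struct request *r)
{
    return r->event_seen;
}

/* Stand-in for a single poll of the completion state. */
static bool poll_once(const struct request *r)
{
    return r->hw_done;
}

static int await_completion(const struct request *r)
{
    if (wait_for_event(r))
        return 0;
    /* The wait timed out: poll once in case the event was lost. */
    if (poll_once(r))
        return 0;
    return -ETIMEDOUT;
}
```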

Fixes: 8127d661e77f ('sfc: Add support for Solarflare SFC9100 family')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Robert Stonehouse [Wed, 9 Oct 2013 10:52:43 +0000 (11:52 +0100)]
sfc: Refactor efx_mcdi_poll() by introducing efx_mcdi_poll_once()

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Andrew Rybchenko [Sat, 16 Nov 2013 07:02:27 +0000 (11:02 +0400)]
sfc: RX buffer allocation takes prefix size into account in IP header alignment

rx_prefix_size is 4-byte aligned on Falcon/Siena (16 bytes), but it is equal
to 14 on EF10.  So it must be taken into account if the arch requires the IP
header to be 4-byte aligned (via NET_IP_ALIGN).
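
A worked example of the arithmetic, assuming a 14-byte Ethernet header follows the prefix. rx_ip_align() is a hypothetical helper that takes NET_IP_ALIGN as a parameter so both cases can be shown; it is one way to compute the padding, not necessarily the driver's exact expression.

```c
#define ETH_HEADER_LEN 14u

/*
 * Padding to place before the RX prefix so that the IP header, which
 * follows the prefix and the Ethernet header, lands on a 4-byte boundary.
 * Returns 0 when the architecture does not ask for alignment.
 */
static unsigned int rx_ip_align(unsigned int net_ip_align,
                                unsigned int rx_prefix_size)
{
    return net_ip_align ? (rx_prefix_size + net_ip_align) % 4 : 0;
}
```

With NET_IP_ALIGN = 2, a 16-byte prefix needs 2 bytes of padding (2 + 16 + 14 = 32) while a 14-byte prefix needs none (0 + 14 + 14 = 28).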

Fixes: 8127d661e77f ('sfc: Add support for Solarflare SFC9100 family')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Thu, 5 Dec 2013 17:24:06 +0000 (17:24 +0000)]
sfc: Maintain current frequency adjustment when applying a time offset

There is a single MCDI PTP operation for setting the frequency
adjustment and applying a time offset to the hardware clock.  When
applying a time offset we should not change the frequency adjustment.

These two operations can now be requested separately but this requires
a flash firmware update.  Keep using the single operation, but
remember and repeat the previous frequency adjustment.
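
The "remember and repeat" approach looks roughly like this. The hw_clock model, hw_adjust() and the ptp_* helpers are hypothetical stand-ins for the single combined MCDI operation and its callers.

```c
#include <stdint.h>

struct hw_clock {
    int64_t time_ns;
    int64_t freq_adj_ppb;
};

struct ptp_state {
    int64_t current_adjfreq; /* last frequency adjustment we applied */
    struct hw_clock hw;
};

/* Stand-in for the combined operation: always sets both values. */
static void hw_adjust(struct hw_clock *hw, int64_t freq_adj_ppb,
                      int64_t offset_ns)
{
    hw->freq_adj_ppb = freq_adj_ppb;
    hw->time_ns += offset_ns;
}

static void ptp_adjfreq(struct ptp_state *p, int64_t delta_ppb)
{
    hw_adjust(&p->hw, delta_ppb, 0);
    p->current_adjfreq = delta_ppb; /* remember for later offsets */
}

static void ptp_adjtime(struct ptp_state *p, int64_t offset_ns)
{
    /* Repeat the remembered frequency so the offset does not reset it. */
    hw_adjust(&p->hw, p->current_adjfreq, offset_ns);
}
```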

Fixes: 7c236c43b838 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Alexandre Rames [Fri, 8 Nov 2013 10:20:31 +0000 (10:20 +0000)]
sfc: Stop/re-start PTP when stopping/starting the datapath.

This disables PTP when we bring the interface down to avoid getting
unmatched RX timestamp events, and tries to re-enable it when bringing
the interface up.

[bwh: Make efx_ptp_stop() safe on Falcon. Introduce
 efx_ptp_{start,stop}_datapath() functions; we'll expand them later.]

Fixes: 7c236c43b838 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Fri, 6 Dec 2013 22:10:46 +0000 (22:10 +0000)]
sfc: Rate-limit log message for PTP packets without a matching timestamp event

In case of a flood of PTP packets, the timestamp peripheral and MC
firmware on the SFN[56]322F boards may not be able to provide
timestamp events for all packets.  Don't complain too much about this.

Fixes: 7c236c43b838 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Laurence Evans [Mon, 28 Jan 2013 14:51:17 +0000 (14:51 +0000)]
sfc: PTP: Moderate log message on event queue overflow

Limit syslog flood if a PTP packet storm occurs.

Fixes: 7c236c43b838 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Fri, 6 Dec 2013 19:26:40 +0000 (19:26 +0000)]
sfc: Add length checks to efx_xmit_with_hwtstamp() and efx_ptp_is_ptp_tx()

efx_ptp_is_ptp_tx() must be robust against skbs from raw sockets that
have invalid IPv4 and UDP headers.

Add checks that:
- the transport header has been found
- there is enough space between network and transport header offset
  for an IPv4 header
- there is enough space after the transport header offset for a
  UDP header
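
The three checks can be sketched on a simplified packet descriptor. struct pkt and headers_look_sane() are hypothetical; the real code works on skb header offsets, and the 20-byte and 8-byte constants are the minimum IPv4 and UDP header sizes.

```c
#include <stdbool.h>
#include <stddef.h>

#define IPV4_HLEN_MIN 20u
#define UDP_HLEN      8u

struct pkt {
    size_t len;           /* total bytes available */
    size_t network_off;   /* offset of the network header */
    size_t transport_off; /* offset of the transport header */
    bool transport_set;   /* transport header offset is valid */
};

static bool headers_look_sane(const struct pkt *p)
{
    /* The transport header has been found. */
    if (!p->transport_set)
        return false;
    /* Room for an IPv4 header between the two offsets. */
    if (p->transport_off < p->network_off ||
        p->transport_off - p->network_off < IPV4_HLEN_MIN)
        return false;
    /* Room for a UDP header after the transport offset. */
    if (p->transport_off > p->len ||
        p->len - p->transport_off < UDP_HLEN)
        return false;
    return true;
}
```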

Fixes: 7c236c43b838 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Eric Dumazet [Fri, 6 Dec 2013 06:36:05 +0000 (22:36 -0800)]
tcp: auto corking

With the introduction of TCP Small Queues, TSO auto sizing, and TCP
pacing, we can implement Automatic Corking in the kernel, to help
applications doing small write()/sendmsg() to TCP sockets.

The idea is to change tcp_push() to check whether the current skb payload is
under the optimal skb size (a multiple of MSS bytes).

If under 'size_goal', and at least one packet is still in Qdisc or
NIC TX queues, set the TCP Small Queue Throttled bit, so that the push
will be delayed up to TX completion time.

This delay might allow the application to coalesce more bytes
in the skb in following write()/sendmsg()/sendfile() system calls.

The exact duration of the delay depends on the dynamics
of the system, and might be zero if no packet for this flow
is actually held in Qdisc or NIC TX ring.
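
The decision described above can be written as a single predicate. This is a simplified sketch: should_autocork() and its parameters are hypothetical, and the real check lives inside tcp_push() and operates on socket state.

```c
#include <stdbool.h>

/*
 * Defer the push only when autocorking is enabled, the current skb is
 * still below the size goal, and at least one earlier packet of this flow
 * is still queued in the Qdisc or NIC TX ring (so a TX completion will
 * arrive later and trigger the deferred push).
 */
static bool should_autocork(bool autocorking_enabled,
                            unsigned int skb_len,
                            unsigned int size_goal,
                            bool packets_still_queued)
{
    return autocorking_enabled &&
           skb_len < size_goal &&
           packets_still_queued;
}
```

When nothing is queued the predicate is false and the push happens immediately, which is why the added delay can be zero.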

Using FQ/pacing is a way to increase the probability of
autocorking being triggered.

Add a new sysctl (/proc/sys/net/ipv4/tcp_autocorking) to control
this feature and default it to 1 (enabled)

Add a new SNMP counter : nstat -a | grep TcpExtTCPAutoCorking
This counter is incremented every time we detect that an skb was underused
and its flush was deferred.

Tested:

Interesting effects when using line buffered commands under ssh.

Excellent performance results in terms of CPU usage and total throughput.

lpq83:~# echo 1 >/proc/sys/net/ipv4/tcp_autocorking
lpq83:~# perf stat ./super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128
9410.39

 Performance counter stats for './super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128':

      35209.439626 task-clock                #    2.901 CPUs utilized
             2,294 context-switches          #    0.065 K/sec
               101 CPU-migrations            #    0.003 K/sec
             4,079 page-faults               #    0.116 K/sec
    97,923,241,298 cycles                    #    2.781 GHz                     [83.31%]
    51,832,908,236 stalled-cycles-frontend   #   52.93% frontend cycles idle    [83.30%]
    25,697,986,603 stalled-cycles-backend    #   26.24% backend  cycles idle    [66.70%]
   102,225,978,536 instructions              #    1.04  insns per cycle
                                             #    0.51  stalled cycles per insn [83.38%]
    18,657,696,819 branches                  #  529.906 M/sec                   [83.29%]
        91,679,646 branch-misses             #    0.49% of all branches         [83.40%]

      12.136204899 seconds time elapsed

lpq83:~# echo 0 >/proc/sys/net/ipv4/tcp_autocorking
lpq83:~# perf stat ./super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128
6624.89

 Performance counter stats for './super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128':
      40045.864494 task-clock                #    3.301 CPUs utilized
               171 context-switches          #    0.004 K/sec
                53 CPU-migrations            #    0.001 K/sec
             4,080 page-faults               #    0.102 K/sec
   111,340,458,645 cycles                    #    2.780 GHz                     [83.34%]
    61,778,039,277 stalled-cycles-frontend   #   55.49% frontend cycles idle    [83.31%]
    29,295,522,759 stalled-cycles-backend    #   26.31% backend  cycles idle    [66.67%]
   108,654,349,355 instructions              #    0.98  insns per cycle
                                             #    0.57  stalled cycles per insn [83.34%]
    19,552,170,748 branches                  #  488.244 M/sec                   [83.34%]
       157,875,417 branch-misses             #    0.81% of all branches         [83.34%]

      12.130267788 seconds time elapsed

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yijing Wang [Fri, 6 Dec 2013 06:34:55 +0000 (14:34 +0800)]
3c59x/net: Use dev_is_pci() instead of hardcoding

Use PCI standard macro dev_is_pci() instead of hardcoding.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yijing Wang [Fri, 6 Dec 2013 06:34:19 +0000 (14:34 +0800)]
net/fddi: Replace local macro with PCI standard macro

Replace local macro DFX_BUS_PCI() with PCI standard macro
dev_is_pci().

Acked-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 6 Dec 2013 06:31:30 +0000 (22:31 -0800)]
tcp: optimize some skb_shinfo(skb) uses

The compiler doesn't know that the skb_shinfo(skb) pointer is usually constant.

By using a temporary variable, we help it generate smaller code.

For example, tcp_init_nondata_skb() is inlined after this patch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 6 Dec 2013 05:44:27 +0000 (21:44 -0800)]
gro: small napi_get_frags() optim

Remove one useless conditional branch :
napi->skb is NULL, so nothing bad can happen.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 6 Dec 2013 05:38:06 +0000 (21:38 -0800)]
ipv6: consistent use of IP6_INC_STATS_BH() in ip6_forward()

ip6_forward() runs from softirq context, so we can use the SNMP macros
that assume this.

Use same indentation for all IP6_INC_STATS_BH() calls.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Duan Jiong [Fri, 6 Dec 2013 05:29:36 +0000 (13:29 +0800)]
packet: use macro GET_PBDQC_FROM_RB to simplify the codes

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Fri, 6 Dec 2013 04:42:58 +0000 (20:42 -0800)]
tun: spelling fixes

Fix spelling errors in tun driver.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher [Fri, 6 Dec 2013 17:13:44 +0000 (09:13 -0800)]
net/*: Fix FSF address in file headers

Several files refer to an old address for the Free Software Foundation
in the file header comment.  Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.

CC: John Fastabend <john.r.fastabend@intel.com>
CC: Alex Duyck <alexander.h.duyck@intel.com>
CC: Marcel Holtmann <marcel@holtmann.org>
CC: Gustavo Padovan <gustavo@padovan.org>
CC: Johan Hedberg <johan.hedberg@gmail.com>
CC: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher [Fri, 6 Dec 2013 17:13:43 +0000 (09:13 -0800)]
net/irda: Fix FSF address in file headers

Several files refer to an old address for the Free Software Foundation
in the file header comment.  Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.

CC: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher [Fri, 6 Dec 2013 17:13:42 +0000 (09:13 -0800)]
netfilter: Fix FSF address in file headers

Several files refer to an old address for the Free Software Foundation
in the file header comment.  Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.

CC: netfilter@vger.kernel.org
CC: Pablo Neira Ayuso <pablo@netfilter.org>
CC: Patrick McHardy <kaber@trash.net>
CC: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher [Fri, 6 Dec 2013 17:13:41 +0000 (09:13 -0800)]
netlabel: Fix FSF address in file headers

Several files refer to an old address for the Free Software Foundation
in the file header comment.  Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.

CC: Paul Moore <paul@paul-moore.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher [Fri, 6 Dec 2013 17:13:40 +0000 (09:13 -0800)]
include/net/: Fix FSF address in file headers

Several files refer to an old address for the Free Software Foundation
in the file header comment.  Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher [Fri, 6 Dec 2013 17:13:39 +0000 (09:13 -0800)]
ipv4/ipv6: Fix FSF address in file headers

Several files refer to an old address for the Free Software Foundation
in the file header comment.  Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.

CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
CC: James Morris <jmorris@namei.org>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago sctp: Fix FSF address in file headers
Jeff Kirsher [Fri, 6 Dec 2013 14:28:48 +0000 (06:28 -0800)]
sctp: Fix FSF address in file headers

Several files refer to an old address for the Free Software Foundation
in the file header comment.  Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.

CC: Vlad Yasevich <vyasevich@gmail.com>
CC: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago drivers/net/*: Fix FSF address in file headers
Jeff Kirsher [Fri, 6 Dec 2013 14:28:47 +0000 (06:28 -0800)]
drivers/net/*: Fix FSF address in file headers

Several files refer to an old address for the Free Software Foundation
in the file header comment.  Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Veaceslav Falico <vfalico@redhat.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Paul Mackerras <paulus@samba.org>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago usb: Fix FSF address in file headers
Jeff Kirsher [Fri, 6 Dec 2013 14:28:46 +0000 (06:28 -0800)]
usb: Fix FSF address in file headers

Several files refer to an old address for the Free Software Foundation
in the file header comment.  Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.

CC: Oliver Neukum <oliver@neukum.org>
CC: Steve Glendinning <steve.glendinning@shawell.net>
CC: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago irda: Fix FSF address in file headers
Jeff Kirsher [Fri, 6 Dec 2013 14:28:44 +0000 (06:28 -0800)]
irda: Fix FSF address in file headers

Several files refer to an old address for the Free Software Foundation
in the file header comment.  Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.

CC: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago ethernet: Fix FSF address in file headers
Jeff Kirsher [Fri, 6 Dec 2013 14:28:43 +0000 (06:28 -0800)]
ethernet: Fix FSF address in file headers

Several files refer to an old address for the Free Software Foundation
in the file header comment.  Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.

CC: Santosh Raspatur <santosh@chelsio.com>
CC: Dimitris Michailidis <dm@chelsio.com>
CC: Michael Chan <mchan@broadcom.com>
CC: Santiago Leon <santil@linux.vnet.ibm.com>
CC: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
CC: Olof Johansson <olof@lixom.net>
CC: Manish Chopra <manish.chopra@qlogic.com>
CC: Sony Chacko <sony.chacko@qlogic.com>
CC: Rajesh Borundia <rajesh.borundia@qlogic.com>
CC: Nicolas Pitre <nico@fluxnic.net>
CC: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago macvlan: Support creating macvtaps from macvlans
Kevin Wallace [Tue, 3 Dec 2013 10:55:22 +0000 (02:55 -0800)]
macvlan: Support creating macvtaps from macvlans

When running in a network namespace whose only link to the outside
world is a macvlan device, not being able to create a macvtap off of
it is a real pain.

So modify macvtap creation to automatically forward a creation of a
macvtap on a macvlan to become a creation of a macvtap on the
underlying network device, just like is currently done with
macvlan-on-macvlan devices.

v2: Use netif_is_macvlan and macvlan_dev_real_dev helpers to make it
    more clear what we're doing.

Signed-off-by: Kevin Wallace <kevin@pentabarf.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago Merge branch 'siocghwtstamp' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh...
David S. Miller [Fri, 6 Dec 2013 00:45:14 +0000 (19:45 -0500)]
Merge branch 'siocghwtstamp' of git://git./linux/kernel/git/bwh/sfc-next

Ben Hutchings says:

====================
SIOCGHWTSTAMP ioctl

1. Add the SIOCGHWTSTAMP ioctl and update the timestamping
documentation.
2. Implement SIOCGHWTSTAMP in most drivers that support SIOCSHWTSTAMP.
3. Add a test program to exercise SIOC{G,S}HWTSTAMP.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville...
David S. Miller [Thu, 5 Dec 2013 21:02:56 +0000 (16:02 -0500)]
Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless

John W. Linville says:

====================
Please pull this batch of fixes intended for the 3.13 stream!

For the mac80211 bits, Johannes says:

"For now I have various fixes all over, mostly for issues introduced in
relatively recent patches. There's no real pattern to it. Some of the
issues go back longer, but still seemed 3.13 material."

And...

"These are just two patches disabling the broken CSA code. Once this
goes into your tree I'll merge it into mac80211-next and revert there
(since we fixed the bugs there)."

For the iwlwifi bits, Emmanuel says:

"I have here a few fixes for BT Coex. One of them is a NULL pointer
dereference. Another one avoids enabling a feature that can make the
firmware unhappy since the firmware isn't ready for it yet. We also
avoid a WARNING that can be triggered upon association in not-so-bad
cases even if the association succeeded. We add support for new NICs
(not yet on the market) and bump the API so that 3.13 will be able to
work with the new firmware that will be out soon hopefully.
I also have a boundary check from Johannes."

In addition to those...

- Arend van Spriel fixes a brcmfmac problem that could use an
uninitialized variable in an error path.

- Borislav Petkov fixes a Kconfig-based build breakage problem for
brcmsmac.

- Michal Nazarewicz fixes a couple of NULL pointer dereference problems
in ath9k and wcn36xx.

- Sujith Manoharan fixes a couple of ath9k problems related to
incorrect interpretation of EEPROM configuration data.

- Ujjal Roy fixes a memory leak in mwifiex.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
John W. Linville [Thu, 5 Dec 2013 14:29:56 +0000 (09:29 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless into for-davem

11 years ago Merge branch 'cxgb4'
David S. Miller [Tue, 3 Dec 2013 21:55:49 +0000 (16:55 -0500)]
Merge branch 'cxgb4'

Hariprasad Shenai says:

====================
Fixes T5 adapter init, due to incorrect FW version check

This patch series fixes a Chelsio T5 adapter initialization failure caused by
an incorrect firmware version check. It also modifies the firmware
flashing mechanism for T4/T5 adapters.

The patch series moves chip type from struct adapter to struct adapter_params.
It changes the references of chip type in cxgb4 and cxgb4vf drivers such that
build failure is avoided.

Patch 3/3 is dependent on patch 1/3
Patch 2/3 is also dependent on patch 1/3

We would like to request this patch series to get merged via David Miller's
'net' tree.

We have included all the maintainers of respective drivers. Kindly review the
change and let us know in case of any review comments.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago cxgb4: Add new scheme to update T4/T5 firmware
Hariprasad Shenai [Tue, 3 Dec 2013 11:35:58 +0000 (17:05 +0530)]
cxgb4: Add new scheme to update T4/T5 firmware

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago cxgb4vf: added much cleaner implementation of is_t4()
Hariprasad Shenai [Tue, 3 Dec 2013 11:35:57 +0000 (17:05 +0530)]
cxgb4vf: added much cleaner implementation of is_t4()

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago cxgb4: Much cleaner implementation of is_t4()/is_t5()
Hariprasad Shenai [Tue, 3 Dec 2013 11:35:56 +0000 (17:05 +0530)]
cxgb4: Much cleaner implementation of is_t4()/is_t5()

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago net/mlx4_core: destroy workqueue when driver fails to register
Wei Yang [Tue, 3 Dec 2013 02:04:10 +0000 (10:04 +0800)]
net/mlx4_core: destroy workqueue when driver fails to register

When driver registration fails, we need to clean up the resources allocated
before. mlx4_core missed destroying the workqueue allocated.

This patch destroys the workqueue when registration fails.
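
The unwind described above follows the usual goto-based error path. The
following is a minimal user-space sketch of that pattern, not the mlx4_core
code: model_init(), model_register_driver() and the wq_alive flag are
hypothetical names, and a plain int stands in for the workqueue.

```c
#include <assert.h>

static int wq_alive;	/* models the allocated workqueue (hypothetical) */

/* Stand-in for driver registration; fails on request. */
static int model_register_driver(int fail)
{
	return fail ? -1 : 0;
}

static int model_init(int register_fails)
{
	int ret;

	assert(!wq_alive);		/* must not be initialised twice */
	wq_alive = 1;			/* "create" the workqueue */

	ret = model_register_driver(register_fails);
	if (ret < 0)
		goto err_wq;
	return 0;

err_wq:
	wq_alive = 0;			/* destroy the workqueue on failure */
	return ret;
}
```

With the err_wq label in place, a failed registration leaves no workqueue
behind, so a later retry starts from a clean state.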

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago rds: prevent BUG_ON triggered on congestion update to loopback
Venkat Venkatsubra [Mon, 2 Dec 2013 23:41:39 +0000 (15:41 -0800)]
rds: prevent BUG_ON triggered on congestion update to loopback

After a congestion update on a local connection, when rds_ib_xmit returns
fewer bytes than there are in the message, rds_send_xmit calls
rds_ib_xmit again with an offset that causes BUG_ON(off & RDS_FRAG_SIZE)
to trigger.

For a 4Kb PAGE_SIZE rds_ib_xmit returns min(8240,4096)=4096 when actually
the message contains 8240 bytes. rds_send_xmit thinks there is more to send
and calls rds_ib_xmit again with a data offset "off" of 4096-48(rds header)
=4048 bytes thus hitting the BUG_ON(off & RDS_FRAG_SIZE) [RDS_FRAG_SIZE=4k].

The commit 6094628bfd94323fc1cea05ec2c6affd98c18f7f
"rds: prevent BUG_ON triggering on congestion map updates" introduced
this regression. That change was addressing the triggering of a different
BUG_ON in rds_send_xmit() on PowerPC architecture with 64Kbytes PAGE_SIZE:
  BUG_ON(ret != 0 &&
      conn->c_xmit_sg == rm->data.op_nents);
This was the sequence it was going through:
(rds_ib_xmit)
/* Do not send cong updates to IB loopback */
if (conn->c_loopback
   && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
   rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
     return sizeof(struct rds_header) + RDS_CONG_MAP_BYTES;
}
rds_ib_xmit returns 8240
rds_send_xmit:
  c_xmit_data_off = 0 + 8240 - 48 (rds header accounted only the first time)
     = 8192
  c_xmit_data_off < 65536 (sg->length), so calls rds_ib_xmit again
rds_ib_xmit returns 8240
rds_send_xmit:
  c_xmit_data_off = 8192 + 8240 = 16432, calls rds_ib_xmit again
  and so on (c_xmit_data_off 24672,32912,41152,49392,57632)
rds_ib_xmit returns 8240
On this iteration this sequence causes the BUG_ON in rds_send_xmit:
    while (ret) {
     tmp = min_t(int, ret, sg->length - conn->c_xmit_data_off);
     [tmp = 65536 - 57632 = 7904]
     conn->c_xmit_data_off += tmp;
     [c_xmit_data_off = 57632 + 7904 = 65536]
     ret -= tmp;
     [ret = 8240 - 7904 = 336]
     if (conn->c_xmit_data_off == sg->length) {
     conn->c_xmit_data_off = 0;
     sg++;
     conn->c_xmit_sg++;
     BUG_ON(ret != 0 &&
     conn->c_xmit_sg == rm->data.op_nents);
     [c_xmit_sg = 1, rm->data.op_nents = 1]

What the current fix does:
Since the congestion update over loopback is not actually transmitted
as a message, all that rds_ib_xmit needs to do is let the caller think
the full message has been transmitted and not return partial bytes.
It will return 8240 (RDS_CONG_MAP_BYTES+48) when PAGE_SIZE is 4Kb.
And 64Kb+48 when page size is 64Kb.
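
The arithmetic above can be replayed in user space. This is only a sketch
under stated assumptions, not the RDS code: replay_send_xmit() is a
hypothetical name, the congestion map is taken to be 8192 bytes (8240 - 48)
spread over page-sized scatterlist entries, and the transmit routine is
assumed to return the same value on every call.

```c
#include <assert.h>

#define HDR_BYTES      48u	/* rds header size, per the text above */
#define CONG_MAP_BYTES 8192u	/* 8240 - 48, per the text above */

/* Returns 1 if the modelled BUG_ON would fire, 0 if the message completes. */
static int replay_send_xmit(unsigned int page_size, unsigned int xmit_ret)
{
	unsigned int nents = (CONG_MAP_BYTES + page_size - 1) / page_size;
	unsigned int sg = 0, off = 0;
	int hdr_done = 0;

	assert(page_size > 0 && xmit_ret > HDR_BYTES);

	while (sg < nents) {
		unsigned int ret = xmit_ret;

		if (!hdr_done) {		/* header accounted only once */
			ret -= HDR_BYTES;
			hdr_done = 1;
		}
		while (ret) {
			unsigned int room = page_size - off;
			unsigned int tmp = ret < room ? ret : room;

			off += tmp;
			ret -= tmp;
			if (off == page_size) {
				off = 0;
				sg++;
				if (ret != 0 && sg == nents)
					return 1;	/* BUG_ON condition */
			}
		}
	}
	return 0;
}
```

In this model, replay_send_xmit(65536, 8240) hits the BUG_ON condition with
336 bytes left over, as in the trace above, while returning the full
64Kb+48 (or 8240 with a 4Kb page) lets the loop finish.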

Reported-by: Josh Hunt <joshhunt00@gmail.com>
Tested-by: Honggang Li <honli@redhat.com>
Acked-by: Bang Nguyen <bang.nguyen@oracle.com>
Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago xen-netback: clear vif->task on disconnect
Paul Durrant [Tue, 3 Dec 2013 14:06:25 +0000 (14:06 +0000)]
xen-netback: clear vif->task on disconnect

xenvif_start_xmit() relies on checking vif->task for NULL to determine
whether the vif is ready to accept packets. The task thread is stopped in
xenvif_disconnect() but task is not set to NULL. Thus, on a re-connect the
check will give a false positive.

Also since commit ea732dff5cfa10789007bf4a5b935388a0bb2a8f (Handle backend
state transitions in a more robust way) it should not be possible for
xenvif_connect() to be called if the vif is already connected so change the
check of vif->tx_irq to a BUG_ON() and also add a BUG_ON(vif->task).
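
A minimal user-space sketch of the state handling described above, assuming
nothing about the driver beyond what the text says: struct model_vif and the
model_*() helpers are hypothetical, and an assert() stands in for the added
BUG_ON(vif->task).

```c
#include <assert.h>
#include <stddef.h>

struct model_vif {
	void *task;	/* non-NULL while the worker thread exists */
};

/* Models the readiness check made on the transmit path. */
static int model_ready_for_xmit(const struct model_vif *vif)
{
	return vif->task != NULL;
}

static void model_connect(struct model_vif *vif, void *task)
{
	assert(vif->task == NULL);	/* stands in for BUG_ON(vif->task) */
	vif->task = task;
}

static void model_disconnect(struct model_vif *vif)
{
	/* the thread would be stopped here */
	vif->task = NULL;		/* the fix: clear the stale pointer */
}
```

Without the assignment in model_disconnect(), the readiness check would keep
reporting true after the thread is gone, which is the false positive the
message describes.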

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago Revert "net: Handle CHECKSUM_COMPLETE more adequately in pskb_trim_rcsum()."
David S. Miller [Mon, 2 Dec 2013 22:26:05 +0000 (17:26 -0500)]
Revert "net: Handle CHECKSUM_COMPLETE more adequately in pskb_trim_rcsum()."

This reverts commit 018c5bba052b3a383d83cf0c756da0e7bc748397.

It causes regressions for people using chips driven by the sungem
driver.  Suspicion is that the skb->csum value isn't being adjusted
properly.

The change also has a bug in that if __pskb_trim() fails, we'll leave
a corrupted skb->csum value in there.  We would really need to revert
it to its original value in that case.

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago net: do not pretend FRAGLIST support
Eric Dumazet [Mon, 2 Dec 2013 16:51:13 +0000 (08:51 -0800)]
net: do not pretend FRAGLIST support

Few network drivers really support frag_list: only virtual drivers do.

Some drivers wrongly advertise the NETIF_F_FRAGLIST feature.

If an skb with a frag_list is given to them, the packet on the wire will be
corrupt.

Remove this flag, as core networking stack will make sure to
provide packets that can be sent without corruption.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Cc: Anirudha Sarangi <anirudh@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago IPv6: Fixed support for blackhole and prohibit routes
Kamala R [Mon, 2 Dec 2013 14:25:21 +0000 (19:55 +0530)]
IPv6: Fixed support for blackhole and prohibit routes

The behaviour of blackhole and prohibit routes has been corrected by setting
the input and output pointers of the dst variable appropriately. For
blackhole routes, both are set to dst_discard; for prohibit routes, they are
set to ip6_pkt_prohibit and ip6_pkt_prohibit_out respectively.

ipv6: ip6_pkt_prohibit(_out) should not depend on
CONFIG_IPV6_MULTIPLE_TABLES

We need ip6_pkt_prohibit(_out) available without
CONFIG_IPV6_MULTIPLE_TABLES

Signed-off-by: Kamala R <kamala@aristanetworks.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago ipv6: fix third arg of anycast_dst_alloc(), must be bool.
François-Xavier Le Bail [Mon, 2 Dec 2013 10:28:49 +0000 (11:28 +0100)]
ipv6: fix third arg of anycast_dst_alloc(), must be bool.

Signed-off-by: Francois-Xavier Le Bail <fx.lebail@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago net: fec_main: dma_map() only the length of the skb
Sebastian Siewior [Mon, 2 Dec 2013 09:52:55 +0000 (10:52 +0100)]
net: fec_main: dma_map() only the length of the skb

On tx submit the driver always dma_map_single()s FEC_ENET_TX_FRSIZE (=2048)
bytes. This works because we don't overwrite any memory after the data buffer;
we only remove it from the cache if it was there. So we hurt performance in
case the mapping of a smaller area makes a difference.
There is also a bug: if the data area starts shortly before the end of
RAM, say at 0xc7fffa10 with RAM ending at 0xc8000000, then we have enough
space to fit the data area (according to skb->len) but we would map beyond
the end of RAM if we are using 2048. In v2.6.31 (the kernel against which this
patch was made) there is the following check in dma_cache_maint():

|BUG_ON(!virt_addr_valid(start) || !virt_addr_valid(start + size - 1));

Since the area starting at 0xc8000000 is no longer virt_addr_valid() we
BUG() during dma_map_single(). The BUG() statement was removed in v3.5-rc1 as
per 2dc6a016 ("ARM: dma-mapping: use asm-generic/dma-mapping-common.h").

This patch was tested on v2.6.31 and then forward-ported and compile
tested only against the net tree. I think it is still worth fixing
mainline even after the BUG() statement is gone.
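
The address arithmetic in the example above can be checked with a small
user-space sketch. maps_past_ram_end() is a hypothetical helper, not a kernel
function, and ram_end is treated as the first address past the end of RAM.

```c
#include <assert.h>
#include <stdint.h>

#define FEC_ENET_TX_FRSIZE 2048u	/* fixed mapping size named in the text */

/* Returns 1 if [start, start + size) extends past ram_end (exclusive). */
static int maps_past_ram_end(uint32_t start, uint32_t size, uint32_t ram_end)
{
	assert(size > 0 && start < ram_end);
	return (uint64_t)start + size > ram_end;
}
```

With start = 0xc7fffa10 and ram_end = 0xc8000000 only 0x5f0 = 1520 bytes
remain, so a 2048-byte mapping crosses the end of RAM while a mapping of
skb->len bytes (1514 is used as an illustrative frame length) does not.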

Tested-by: Fugang Duan <B38611@freescale.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago drivers: net: cpsw: fix dt probe for one port ethernet
Mugunthan V N [Mon, 2 Dec 2013 07:23:39 +0000 (12:53 +0530)]
drivers: net: cpsw: fix dt probe for one port ethernet

When only one of the two ports is pinned out, the dt probe fails because
the second port's phy is not found. Fix this by checking the number of
slaves and breaking out of the loop.

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago PCI / tg3: Give up chip reset and carrier loss handling if PCI device is not present
Rafael J. Wysocki [Sun, 1 Dec 2013 01:34:37 +0000 (02:34 +0100)]
PCI / tg3: Give up chip reset and carrier loss handling if PCI device is not present

Modify tg3_chip_reset() and tg3_close() to check if the PCI network
adapter device is accessible at all in order to skip poking it or
trying to handle a carrier loss in vain when that's not the case.
Introduce a special PCI helper function pci_device_is_present()
for this purpose.

Of course, this uncovers the lack of the appropriate RTNL locking
in tg3_suspend() and tg3_resume(), so add that locking in there
too.

These changes prevent tg3 from burning a CPU at 100% load level for
several solid seconds after the Thunderbolt link is disconnected from
a Matrox DS1 docking station.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago ipv6: judge the accept_ra_defrtr before calling rt6_route_rcv
Duan Jiong [Tue, 26 Nov 2013 07:46:56 +0000 (15:46 +0800)]
ipv6: judge the accept_ra_defrtr before calling rt6_route_rcv

When dealing with an RA message, if accept_ra_defrtr is false,
the kernel will not add the default route, and then goes on to process
the route information options that follow. Unfortunately, those
options may contain a default route, so let's check
accept_ra_defrtr before calling rt6_route_rcv.
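
A minimal sketch of the check described above, assuming that a default route
carried in a route information option is recognised by its zero prefix
length. struct ra_route_info and should_process_route_info() are hypothetical
names used for illustration only.

```c
#include <assert.h>
#include <stddef.h>

struct ra_route_info {
	unsigned int prefix_len;	/* 0 means ::/0, i.e. a default route */
};

/* Returns 1 if the option should be handed on to route processing. */
static int should_process_route_info(const struct ra_route_info *ri,
				     int accept_ra_defrtr)
{
	assert(ri != NULL);

	/* Skip default routes when the sysctl says not to accept them. */
	if (ri->prefix_len == 0 && !accept_ra_defrtr)
		return 0;
	return 1;
}
```

Options with a non-zero prefix length are unaffected by the setting in this
model.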

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years ago Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
John W. Linville [Mon, 2 Dec 2013 18:20:03 +0000 (13:20 -0500)]
Merge branch 'for-john' of git://git./linux/kernel/git/jberg/mac80211

11 years ago Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 2 Dec 2013 18:15:39 +0000 (10:15 -0800)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
 - Correction of fuzzy and fragile IRQ_RETVAL macro
 - IRQ related resume fix affecting only XEN
 - ARM/GIC fix for chained GIC controllers

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip: Gic: fix boot for chained gics
  irq: Enable all irqs unconditionally in irq_resume
  genirq: Correct fuzzy and fragile IRQ_RETVAL() definition

11 years ago Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 2 Dec 2013 18:13:44 +0000 (10:13 -0800)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
 "Various smaller fixlets, all over the place"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/doc: Fix generation of device-drivers
  sched: Expose preempt_schedule_irq()
  sched: Fix a trivial typo in comments
  sched: Remove unused variable in 'struct sched_domain'
  sched: Avoid NULL dereference on sd_busy
  sched: Check sched_domain before computing group power
  MAINTAINERS: Update file patterns in the lockdep and scheduler entries

11 years ago Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 2 Dec 2013 18:13:09 +0000 (10:13 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "Misc kernel and tooling fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tools lib traceevent: Fix conversion of pointer to integer of different size
  perf/trace: Properly use u64 to hold event_id
  perf: Remove fragile swevent hlist optimization
  ftrace, perf: Avoid infinite event generation loop
  tools lib traceevent: Fix use of multiple options in processing field
  perf header: Fix possible memory leaks in process_group_desc()
  perf header: Fix bogus group name
  perf tools: Tag thread comm as overriden

11 years ago Merge tag 'stable/for-linus-3.13-rc2-tag' of git://git.kernel.org/pub/scm/linux/kerne...
Linus Torvalds [Mon, 2 Dec 2013 18:12:01 +0000 (10:12 -0800)]
Merge tag 'stable/for-linus-3.13-rc2-tag' of git://git./linux/kernel/git/xen/tip

Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
 "Fixes to patches that went in this merge window along with a latent
  bug:
   - Fix lazy flushing in case m2p override fails.
   - Fix module compile issues with ARM/Xen
   - Add missing call to DMA map page for Xen SWIOTLB for ARM"

* tag 'stable/for-linus-3.13-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/gnttab: leave lazy MMU mode in the case of a m2p override failure
  xen/arm: p2m_init and p2m_lock should be static
  arm/xen: Export phys_to_mach to fix Xen module link errors
  swiotlb-xen: add missing xen_dma_map_page call

11 years ago brcmfmac: fix uninitialized warning
Arend van Spriel [Fri, 29 Nov 2013 22:00:31 +0000 (23:00 +0100)]
brcmfmac: fix uninitialized warning

Building brcmfmac for sparc64 gave the following warning:

  CC [M]  drivers/net/wireless/brcm80211/brcmfmac/bcmsdh_sdmmc.o
    bcmsdh_sdmmc.c: In function 'brcmf_sdioh_request_byte':
     bcmsdh_sdmmc.c:89:6: warning: 'err_ret' may be used uninitialized
                          in this function [-Wuninitialized]

Inspection showed that the code indeed had a path of execution in
which the return value was used uninitialized. This patch
fixes that code path.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Franky Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago net: wireless: wcn36xx: fix potential NULL pointer dereference
Michal Nazarewicz [Mon, 2 Dec 2013 13:09:34 +0000 (14:09 +0100)]
net: wireless: wcn36xx: fix potential NULL pointer dereference

If kmalloc fails, wcn36xx_smd_rsp_process will attempt to dereference
a NULL pointer.  There might be a better error recovery than just
printing an error, but printing an error message is better than the
current behaviour.

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago net: wireless: ath9k: avoid possible NULL pointer dereference
Michal Nazarewicz [Fri, 29 Nov 2013 17:06:46 +0000 (18:06 +0100)]
net: wireless: ath9k: avoid possible NULL pointer dereference

Code in ath9k_hw_set_clockrate function indicates that ah->curchan
(and thus chan local variable) may be NULL.  If that is indeed the
case, IS_CHAN_HT40(chan) check has to be performed only in branch
where chan is not NULL.  Moving the code under already existing
if condition fixes this issue.

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago ath9k: Fix XLNA bias strength
Sujith Manoharan [Tue, 26 Nov 2013 01:51:39 +0000 (07:21 +0530)]
ath9k: Fix XLNA bias strength

The EEPROM parameter to determine whether the bias
strength values for XLNA have to be applied is part
of the miscConfiguration field and not featureEnable.

Cc: stable@vger.kernel.org
Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago ath9k: Fix QuickDrop usage
Sujith Manoharan [Tue, 26 Nov 2013 01:51:08 +0000 (07:21 +0530)]
ath9k: Fix QuickDrop usage

Bit 5 in the miscConfiguration field of the base EEPROM
header denotes whether QuickDrop is enabled or not. Fix
the incorrect usage of BIT(1) and also make sure that
this is done only for the required chips.
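
As a sketch of the bit test described above: the macro name
MISC_CFG_QUICK_DROP and the helper quick_drop_enabled() are hypothetical, and
the miscConfiguration field is assumed to be 8 bits wide. The chip-specific
gating mentioned in the text is not modelled.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (1u << (n))
#define MISC_CFG_QUICK_DROP BIT(5)	/* hypothetical name for bit 5 */

/* Returns 1 if bit 5 of the miscConfiguration value is set. */
static int quick_drop_enabled(uint8_t misc_configuration)
{
	return (misc_configuration & MISC_CFG_QUICK_DROP) != 0;
}

/* Usage: 0x20 has bit 5 set, 0x02 only has bit 1 set. */
static void quick_drop_example(void)
{
	assert(quick_drop_enabled(0x20));
	assert(!quick_drop_enabled(0x02));
}
```

Testing BIT(1) instead would report the feature as enabled for 0x02 and
disabled for 0x20, the opposite of what the EEPROM layout in the text implies.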

Cc: stable@vger.kernel.org
Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
11 years ago Merge tag 'spi-v3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Linus Torvalds [Mon, 2 Dec 2013 18:10:55 +0000 (10:10 -0800)]
Merge tag 'spi-v3.13-rc2' of git://git./linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "A smattering of driver specific fixes here, including a bunch for a
  long standing common pattern in the error handling paths, and a fix
  for an embarrassing thinko in the new devm master registration code"

* tag 'spi-v3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi/pxa2xx: Restore private register bits.
  spi/qspi: Fix qspi remove path.
  spi/qspi: cleanup pm_runtime error check.
  spi/qspi: set correct platform drvdata in ti_qspi_probe()
  spi/pxa2xx: add new ACPI IDs
  spi: core: invert success test in devm_spi_register_master
  spi: spi-mxs: fix reference leak to master in mxs_spi_remove()
  spi: bcm63xx: fix reference leak to master in bcm63xx_spi_remove()
  spi: txx9: fix reference leak to master in txx9spi_remove()
  spi: mpc512x: fix reference leak to master in mpc512x_psc_spi_do_remove()
  spi: rspi: use platform drvdata correctly in rspi_remove()
  spi: bcm2835: fix reference leak to master in bcm2835_spi_remove()

11 years ago Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Mon, 2 Dec 2013 18:09:07 +0000 (10:09 -0800)]
Merge git://git./linux/kernel/git/davem/net

Pull networking updates from David Miller:
 "Here is a pile of bug fixes that accumulated while I was in Europe"

 1) In fixing kernel leaks to userspace during copying of socket
    addresses, we broke a case that used to work, namely the user
    providing a buffer larger than the in-kernel generic socket address
    structure.  This broke Ruby amongst other things.  Fix from Dan
    Carpenter.

 2) Fix regression added by byte queue limit support in 8139cp driver,
    from Yang Yingliang.

 3) The addition of MSG_SENDPAGE_NOTLAST buggered up a few sendpage
    implementations, they should just treat it the same as MSG_MORE.
    Fix from Richard Weinberger and Shawn Landden.

 4) Handle icmpv4 errors received on ipv6 SIT tunnels correctly, from
    Oussama Ghorbel.  In particular we should send an ICMPv6 unreachable
    in such situations.

 5) Fix some regressions in the recent genetlink fixes, in particular
    get the pmcraid driver to use the new safer interfaces correctly.
    From Johannes Berg.

 6) macvtap was converted to use a per-cpu set of statistics, but some
    code was still bumping tx_dropped elsewhere.  From Jason Wang.

 7) Fix build failure of xen-netback due to missing include on some
    architectures, from Andy Whitecroft.

 8) macvtap double counts received packets in statistics, fix from Vlad
    Yasevich.

 9) Fix various cases of using *_STATS_BH() when *_STATS() is more
    appropriate.  From Eric Dumazet and Hannes Frederic Sowa.

10) Pktgen ipsec mode doesn't update the ipv4 header length and checksum
    properly after encapsulation.  Fix from Fan Du.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
  net/mlx4_en: Remove selftest TX queues empty condition
  {pktgen, xfrm} Update IPv4 header total len and checksum after tranformation
  virtio_net: make all RX paths handle erors consistently
  virtio_net: fix error handling for mergeable buffers
  virtio_net: Fixed a trivial typo (fitler --> filter)
  netem: fix gemodel loss generator
  netem: fix loss 4 state model
  netem: missing break in ge loss generator
  net/hsr: Support iproute print_opt ('ip -details ...')
  net/hsr: Very small fix of comment style.
  MAINTAINERS: Added net/hsr/ maintainer
  ipv6: fix possible seqlock deadlock in ip6_finish_output2
  ixgbe: Make ixgbe_identify_qsfp_module_generic static
  ixgbe: turn NETIF_F_HW_L2FW_DOFFLOAD off by default
  ixgbe: ixgbe_fwd_ring_down needs to be static
  e1000: fix possible reset_task running after adapter down
  e1000: fix lockdep warning in e1000_reset_task
  e1000: prevent oops when adapter is being closed and reset simultaneously
  igb: Fixed Wake On LAN support
  inet: fix possible seqlock deadlocks
  ...

11 years ago Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwif...
John W. Linville [Mon, 2 Dec 2013 17:57:23 +0000 (12:57 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/iwlwifi/iwlwifi-fixes

11 years ago vfs: fix subtle use-after-free of pipe_inode_info
Linus Torvalds [Mon, 2 Dec 2013 17:44:51 +0000 (09:44 -0800)]
vfs: fix subtle use-after-free of pipe_inode_info

The pipe code was trying (and failing) to be very careful about freeing
the pipe info only after the last access, with a pattern like:

        spin_lock(&inode->i_lock);
        if (!--pipe->files) {
                inode->i_pipe = NULL;
                kill = 1;
        }
        spin_unlock(&inode->i_lock);
        __pipe_unlock(pipe);
        if (kill)
                free_pipe_info(pipe);

where the final freeing is done last.

HOWEVER.  The above is actually broken, because while the freeing is
done at the end, if we have two racing processes releasing the pipe
inode info, the one that *doesn't* free it will decrement the ->files
count, and unlock the inode i_lock, but then still use the
"pipe_inode_info" afterwards when it does the "__pipe_unlock(pipe)".

This is *very* hard to trigger in practice, since the race window is
very small, and adding debug options seems to just hide it by slowing
things down.

Simon originally reported this way back in July as an Oops in
kmem_cache_allocate due to a single bit corruption (due to the final
"spin_unlock(pipe->mutex.wait_lock)" incrementing a field in a different
allocation that had re-used the free'd pipe-info), it's taken this long
to figure out.

Since the 'pipe->files' accesses aren't even protected by the pipe lock
(we very much use the inode lock for that), the simple solution is to
just drop the pipe lock early.  And since there were two users of this
pattern, create a helper function for it.
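
The ordering described above can be sketched in user space. This is an
illustrative single-threaded model, not the fs/pipe.c change itself: the
model_* names are hypothetical, plain int flags stand in for the pipe mutex
and the inode spinlock, and free() stands in for free_pipe_info().

```c
#include <assert.h>
#include <stdlib.h>

struct model_pipe  { int locked; int files; };
struct model_inode { int i_locked; struct model_pipe *i_pipe; };

/* Drop one reference; the caller has already released the pipe lock. */
static int model_put_pipe_info(struct model_inode *inode,
			       struct model_pipe *pipe)
{
	int kill = 0;

	inode->i_locked = 1;		/* spin_lock(&inode->i_lock) */
	if (!--pipe->files) {
		inode->i_pipe = NULL;
		kill = 1;
	}
	inode->i_locked = 0;		/* spin_unlock(&inode->i_lock) */

	/* *pipe is not touched again unless we are the one freeing it. */
	if (kill)
		free(pipe);
	return kill;
}

static int model_pipe_release(struct model_inode *inode,
			      struct model_pipe *pipe)
{
	assert(pipe->locked);
	pipe->locked = 0;		/* __pipe_unlock(pipe) comes first */
	return model_put_pipe_info(inode, pipe);
}
```

Because the pipe lock is dropped before the ->files count is decremented, a
releaser that does not end up freeing the structure never touches it after
another releaser may have freed it.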

Introduced by commit ba5bb147330a ("pipe: take allocation and freeing of
pipe_inode_info out of ->i_mutex").

Reported-by: Simon Kirby <sim@hostway.ca>
Reported-by: Ian Applegate <ia@cloudflare.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org # v3.10+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>