firefly-linux-kernel-4.4.55.git
David S. Miller [Thu, 1 Dec 2011 19:16:04 +0000 (14:16 -0500)]
net: Make ndo_neigh_destroy return void.

The return value isn't used.

Suggested by Ben Hutchings.

Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 30 Nov 2011 19:00:53 +0000 (19:00 +0000)]
ipv4: use a 64bit load/store in output path

gcc compiler is smart enough to use a single load/store if we
memcpy(dptr, sptr, 8) on x86_64, regardless of
CONFIG_CC_OPTIMIZE_FOR_SIZE

In the IP header, daddr immediately follows saddr, and this won't change in the
future. We only need to make sure our flowi4 (saddr,daddr) fields won't
break the rule.
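
As a rough userspace sketch of the idea (the struct and function names here are hypothetical stand-ins, not the kernel's iphdr or flowi4), the two adjacent 32-bit addresses can be copied with one 8-byte memcpy(), with a compile-time check guarding the layout assumption:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the address pair in the IP header and in the flow key. */
struct hdr_addrs  { uint32_t saddr; uint32_t daddr; };
struct flow_addrs { uint32_t saddr; uint32_t daddr; };

/* The single 8-byte copy is only valid while daddr directly follows saddr. */
static_assert(offsetof(struct hdr_addrs, daddr) ==
              offsetof(struct hdr_addrs, saddr) + sizeof(uint32_t),
              "hdr_addrs: daddr must follow saddr");
static_assert(offsetof(struct flow_addrs, daddr) ==
              offsetof(struct flow_addrs, saddr) + sizeof(uint32_t),
              "flow_addrs: daddr must follow saddr");

/* Copy both addresses at once; a compiler may turn this into one 64-bit load/store. */
static void copy_addrs(struct hdr_addrs *dst, const struct flow_addrs *src)
{
    memcpy(&dst->saddr, &src->saddr, 2 * sizeof(uint32_t));
}
```

Whether the compiler really emits a single load/store depends on the target and optimization level; the static assertions only protect the layout rule the commit message describes.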

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 1 Dec 2011 18:28:34 +0000 (13:28 -0500)]
dccp: Evaluate ip_hdr() only once in dccp_v4_route_skb().

This also works around a bogus gcc warning generated by an
upcoming patch from Eric Dumazet that rearranges the layout
of struct flowi4.

Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfgang Grandegger [Wed, 30 Nov 2011 23:41:21 +0000 (23:41 +0000)]
powerpc: tqm8548/tqm8xx: add and update CAN device nodes

This patch enables or updates support for the CC770 and AN82527
CAN controller on the TQM8548 and TQM8xx boards.

CC: devicetree-discuss@lists.ozlabs.org
CC: linuxppc-dev@ozlabs.org
CC: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfgang Grandegger [Wed, 30 Nov 2011 23:41:20 +0000 (23:41 +0000)]
can: cc770: add platform bus driver for the CC770 and AN82527

This driver works with both static platform data and device tree
bindings. It has been tested on a TQM855L board with two AN82527
CAN controllers on the local bus.

CC: Devicetree-discuss@lists.ozlabs.org
CC: linuxppc-dev@ozlabs.org
CC: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfgang Grandegger [Wed, 30 Nov 2011 23:41:19 +0000 (23:41 +0000)]
can: cc770: add legacy ISA bus driver for the CC770 and AN82527

This patch adds support for legacy Bosch CC770 and Intel AN82527 CAN
controllers on the ISA or PC-104 bus. The I/O port or memory address
and the IRQ number must be specified via module parameters:

  insmod cc770_isa.ko port=0x310,0x380 irq=7,11

for ISA devices using I/O ports or:

  insmod cc770_isa.ko mem=0xd1000,0xd1000 irq=7,11

for memory mapped ISA devices.

Indirect access via address and data port is supported as well:

  insmod cc770_isa.ko port=0x310,0x380 indirect=1 irq=7,11

Furthermore, the following mode parameters can be defined:

  clk: External oscillator clock frequency (default=16000000 [16 MHz])
  cir: CPU interface register (default=0x40 [DSC])
  bcr: Bus configuration register (default=0x40 [CBY])
  cor: Clockout register (default=0x00)

Note: for clk, cir, bcr and cor, the first argument re-defines the
default for all other devices, e.g.:

  insmod cc770_isa.ko mem=0xd1000,0xd1000 irq=7,11 clk=24000000

is equivalent to

  insmod cc770_isa.ko mem=0xd1000,0xd1000 irq=7,11 clk=24000000,24000000

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfgang Grandegger [Wed, 30 Nov 2011 23:41:18 +0000 (23:41 +0000)]
can: cc770: add driver core for the Bosch CC770 and Intel AN82527

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick Kelle [Thu, 1 Dec 2011 17:54:46 +0000 (12:54 -0500)]
icplus: mdio_write(), remove unnecessary for loop

At this point the variable j is always set to 7 and the code within
the loop has to run only once anyway.

As suggested by David Miller:
"You can simply this even further since p[7] is what is used here,
and this means len is one, the inner loop therefore executes only
once, and the p[7].field value is not used (it's zero in the table)
and the write to it is completely thrown away."

Signed-off-by: Patrick Kelle <patrick.kelle81@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 30 Nov 2011 21:42:26 +0000 (21:42 +0000)]
net: net_device flags is an unsigned int

commit b00055aacdb ([NET] core: add RFC2863 operstate) changed
net_device flags from unsigned short to unsigned int.

Some core functions still assume it's an unsigned short.
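
To illustrate why the narrower type matters, here is a small standalone sketch (FLAG_BIT16 and both helpers are hypothetical, not kernel symbols) showing that any flag above bit 15 is silently lost when the value passes through a 16-bit variable:

```c
#include <stdint.h>

#define FLAG_BIT16 (1u << 16)  /* hypothetical flag that needs more than 16 bits */

/* Models a helper that still keeps the flags in a 16-bit variable. */
static unsigned int flags_through_short(unsigned int flags)
{
    uint16_t narrow = (uint16_t)flags;  /* drops bit 16 and everything above it */
    return narrow;
}

/* Models the same helper after switching to unsigned int. */
static unsigned int flags_through_int(unsigned int flags)
{
    unsigned int wide = flags;
    return wide;
}
```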

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Axel Lin [Wed, 30 Nov 2011 22:07:18 +0000 (22:07 +0000)]
dsa: Include linux/if_ether.h to fix build error

Include linux/if_ether.h to fix below build errors:

  CC      arch/arm/mach-kirkwood/common.o
In file included from arch/arm/mach-kirkwood/common.c:19:
include/net/dsa.h: In function 'dsa_uses_dsa_tags':
include/net/dsa.h:192: error: 'ETH_P_DSA' undeclared (first use in this function)
include/net/dsa.h:192: error: (Each undeclared identifier is reported only once
include/net/dsa.h:192: error: for each function it appears in.)
include/net/dsa.h: In function 'dsa_uses_trailer_tags':
include/net/dsa.h:197: error: 'ETH_P_TRAILER' undeclared (first use in this function)
make[1]: *** [arch/arm/mach-kirkwood/common.o] Error 1
make: *** [arch/arm/mach-kirkwood] Error 2

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 30 Nov 2011 23:32:14 +0000 (23:32 +0000)]
netem: fix build error on 32bit arches

ERROR: "__udivdi3" [net/sched/sch_netem.ko] undefined!

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
sjur.brandeland@stericsson.com [Wed, 30 Nov 2011 13:02:32 +0000 (13:02 +0000)]
caif: Remove unused attributes from struct cflayer

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sjur.brandeland@stericsson.com [Wed, 30 Nov 2011 09:22:48 +0000 (09:22 +0000)]
caif: Remove unused enum and parameter in cfserl

Remove unused enum cfcnfg_phy_type and the parameter to cfserl_create.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sjur.brandeland@stericsson.com [Wed, 30 Nov 2011 09:22:47 +0000 (09:22 +0000)]
caif: Restructure how link caif link layer enroll

The enrollment of CAIF link layers is refactored.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sjur.brandeland@stericsson.com [Wed, 30 Nov 2011 09:22:46 +0000 (09:22 +0000)]
caif: Allow cfpkt_extr_head to process empty message

Allow NULL pointer in cfpkt_extr_head in order to
skip past header data.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hagen Paul Pfeifer [Wed, 30 Nov 2011 12:20:26 +0000 (12:20 +0000)]
netem: rate extension

Currently netem is not able to emulate channel bandwidth. Only static
delay (and optional random jitter) can be configured.

To emulate the channel rate, the token bucket filter (sch_tbf) can be used. But
TBF has some major emulation flaws. The buffer (token bucket depth/rate) cannot
be 0. Also, the idea behind TBF is that the credit (tokens in the bucket) fills
up while no packet is transmitted, so there is always a "positive" credit for new
packets. In real life this behavior contradicts the law of nature that
nothing can travel faster than the speed of light. E.g.: on an emulated 1000 byte/s
link, a small IPv4/TCP SYN packet of ~50 bytes requires ~0.05 seconds, not 0
seconds.
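
The arithmetic behind that example can be sketched as follows. The helper name and the treatment of a zero rate as "no limit" are assumptions made for illustration, not taken from the patch:

```c
#include <stdint.h>

/*
 * Time in nanoseconds needed to serialize len_bytes onto a link that
 * carries rate_bytes_per_sec. A rate of 0 is treated here as "no rate
 * limit" (an assumption of this sketch) and yields no delay.
 */
static uint64_t tx_time_ns(uint32_t len_bytes, uint32_t rate_bytes_per_sec)
{
    if (rate_bytes_per_sec == 0)
        return 0;
    return (uint64_t)len_bytes * 1000000000ULL / rate_bytes_per_sec;
}
```

For the 50-byte packet on a 1000 byte/s link this gives 50,000,000 ns, i.e. the 0.05 seconds quoted above.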

Netem is an excellent place to implement a rate limiting feature: static
delay is already implemented, tfifo already has time information and the
user can skip TBF configuration completely.

This patch implements a rate feature which can be configured via tc, e.g.:

tc qdisc add dev eth0 root netem rate 10kbit

To emulate a link of 5000byte/s and add an additional static delay of 10ms:

tc qdisc add dev eth0 root netem delay 10ms rate 5KBps

Note: similar to TBF, the rate extension is bound to the kernel timing
system. Depending on the architecture's timer granularity, higher rates (e.g.
10mbit/s and higher) tend toward transmission bursts. Also note that further
queues live in network adaptors; see ethtool(8).

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>
Jun Zhao [Wed, 30 Nov 2011 06:21:05 +0000 (06:21 +0000)]
ipv6 : mcast : Delete useless parameter in ip6_mc_add1_src()

There is no need to use the 'delta' flag when adding a single source to the
interface filter source list.

Signed-off-by: Jun Zhao <mypopydev@gmail.com>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>
Jun Zhao [Wed, 30 Nov 2011 06:21:04 +0000 (06:21 +0000)]
ipv4 : igmp : Delete useless parameter in ip_mc_add1_src()

There is no need to use the 'delta' flag when adding a single source to the
interface filter source list.

Signed-off-by: Jun Zhao <mypopydev@gmail.com>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>
David Miller [Mon, 25 Jul 2011 00:01:41 +0000 (00:01 +0000)]
atm: clip: Use device neigh support on top of "arp_tbl".

Instead of instantiating an entire new neigh_table instance
just for ATM handling, use the neigh device private facility.

Signed-off-by: David S. Miller <davem@davemloft.net>
David Miller [Mon, 25 Jul 2011 00:01:38 +0000 (00:01 +0000)]
neigh: Add device constructor/destructor capability.

If the neigh entry has device private state, it will need
constructor/destructor ops.

Signed-off-by: David S. Miller <davem@davemloft.net>
David Miller [Mon, 25 Jul 2011 00:01:33 +0000 (00:01 +0000)]
atm: clip: Convert over to neighbour_priv()

Signed-off-by: David S. Miller <davem@davemloft.net>
David Miller [Mon, 25 Jul 2011 00:01:28 +0000 (00:01 +0000)]
neigh: Do not set tbl->entry_size in ipv4/ipv6 neigh tables.

Let the core self-size the neigh entry based upon the key length.

Signed-off-by: David S. Miller <davem@davemloft.net>
David Miller [Mon, 25 Jul 2011 00:01:25 +0000 (00:01 +0000)]
neigh: Add infrastructure for allocating device neigh privates.

netdev->neigh_priv_len records the private area length.

This will trigger for neigh_table objects which set tbl->entry_size
to zero, and the first instances of this will be forthcoming.

Signed-off-by: David S. Miller <davem@davemloft.net>
David Miller [Mon, 25 Jul 2011 00:01:22 +0000 (00:01 +0000)]
neigh: Get rid of neigh_table->kmem_cachep

We are going to allocate device-specific private areas for
neighbour entries, and in order to do that we have to move
away from the fixed allocation size enforced by using
neigh_table->kmem_cachep.

As a nice side effect we can now use kfree_rcu().

Signed-off-by: David S. Miller <davem@davemloft.net>
David Miller [Mon, 25 Jul 2011 00:01:17 +0000 (00:01 +0000)]
neigh: Create mechanism for generic neigh private areas.

The implementation private sits right after the primary_key memory.
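
A simplified userspace sketch of that layout is shown below. The struct and helper names are hypothetical, and alignment of the private area is deliberately ignored here, so the private bytes should only be accessed as raw bytes in this form:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified entry: key bytes first, private area right after them. */
struct entry {
    size_t key_len;
    unsigned char primary_key[];  /* key_len bytes of key, then the private area */
};

/* Allocate one block that holds the header, the key and the private area. */
static struct entry *entry_alloc(const void *key, size_t key_len, size_t priv_len)
{
    struct entry *e = calloc(1, sizeof(*e) + key_len + priv_len);

    if (!e)
        return NULL;
    e->key_len = key_len;
    memcpy(e->primary_key, key, key_len);
    return e;
}

/* The private area starts immediately after the key. */
static void *entry_priv(struct entry *e)
{
    return e->primary_key + e->key_len;
}
```

Because the private area is found by offset from the key, the allocation size has to vary with both the key length and the private length, which is why a fixed-size allocator no longer fits.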

Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 30 Nov 2011 22:12:27 +0000 (17:12 -0500)]
sfc: fix race in efx_enqueue_skb_tso()

As soon as skb is pushed to hardware, it can be completed and freed, so
we should not dereference skb anymore.
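
The general pattern can be sketched in userspace like this. struct pkt and hw_push() are hypothetical stand-ins, and the hand-off frees the packet immediately to model the worst case; this is not the sfc code itself:

```c
#include <stddef.h>
#include <stdlib.h>

struct pkt { size_t len; };  /* hypothetical stand-in for struct sk_buff */

/*
 * Hypothetical hand-off: once called, the "hardware" owns the packet and
 * may complete and free it at any time. Freeing right away models the
 * worst case.
 */
static void hw_push(struct pkt *p)
{
    free(p);
}

/* Enqueue a packet and account its length without touching it after the hand-off. */
static size_t enqueue(struct pkt *p, size_t *tx_bytes)
{
    size_t len = p->len;  /* read everything we need before the hand-off */

    hw_push(p);           /* p must not be dereferenced after this point */
    *tx_bytes += len;     /* uses the saved copy, not p->len */
    return len;
}
```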

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 29 Nov 2011 11:53:05 +0000 (11:53 +0000)]
bnx2: Support for byte queue limits

Changes to bnx2 to use byte queue limits.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 30 Nov 2011 01:02:41 +0000 (01:02 +0000)]
tcp: inherit listener congestion control for passive cnx

Rick Jones reported that TCP_CONGESTION sockopt performed on a listener
was ignored for its children sockets : right after accept() the
congestion control for new socket is the system default one.

This seems an oversight of the initial design (quoted from Stephen)

Based on prior investigation and patch from Rick.

Reported-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Yuchung Cheng <ycheng@google.com>
Tested-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 30 Nov 2011 21:00:48 +0000 (16:00 -0500)]
can: Revert outdated cc770 driver patches.

Newer versions have been floating about, and I applied
the older variant unfortunately.

Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Tue, 29 Nov 2011 15:15:33 +0000 (15:15 +0000)]
sky2: add bql support

This adds support for byte queue limits and aggregates statistics
update (suggestion from Eric).

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>
Dmitry Kravkov [Mon, 28 Nov 2011 12:31:49 +0000 (12:31 +0000)]
bnx2x: handle iSCSI SD mode

In iSCSI SD mode the bnx2x device is assigned a single MAC address,
which is supposed to be the iSCSI MAC. If this mode is recognized,
bnx2x will disable LRO and decrease the number of queues to 1 and the rx
ring size to the minimum allowed by FW, in order to minimize memory use.
It will transfer the MAC for iSCSI usage and zero the primary MAC of the netdev.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfgang Grandegger [Thu, 24 Nov 2011 02:07:28 +0000 (02:07 +0000)]
can: cc770: legacy CC770 ISA bus driver

This patch adds support for legacy Bosch CC770 and Intel AN82527 CAN
controllers on the ISA or PC-104 bus. The I/O port or memory address
and the IRQ number must be specified via module parameters:

  insmod cc770_isa.ko port=0x310,0x380 irq=7,11

for ISA devices using I/O ports or:

  insmod cc770_isa.ko mem=0xd1000,0xd1000 irq=7,11

for memory mapped ISA devices.

Indirect access via address and data port is supported as well:

  insmod cc770_isa.ko port=0x310,0x380 indirect=1 irq=7,11

Furthermore, the following mode parameters can be defined:

  clk: External oscillator clock frequency (default=16000000 [16 MHz])
  cir: CPU interface register (default=0x40 [CPU_DSC])
  bcr: Bus configuration register (default=0x00)
  cor: Clockout register (default=0x00)

Note: for clk, cir, bcr and cor, the first argument re-defines the
default for all other devices, e.g.:

  insmod cc770_isa.ko mem=0xd1000,0xd1000 irq=7,11 clk=24000000

is equivalent to

  insmod cc770_isa.ko mem=0xd1000,0xd1000 irq=7,11 clk=24000000,24000000

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfgang Grandegger [Thu, 24 Nov 2011 02:07:27 +0000 (02:07 +0000)]
can: cc770: add driver core for the Bosch CC770 and Intel AN82527

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Robert Marklund [Thu, 24 Nov 2011 01:03:07 +0000 (01:03 +0000)]
net/smsc911x: Add regulator support

Add some basic regulator support for the power pins, as needed
by the ST-Ericsson Snowball platform that powers up the SMSC911
chip using an external regulator.

Platforms that use regulators and the smsc911x and have no defined
regulator for the smsc911x and claim complete regulator
constraints with no dummy regulators will need to provide it, for
example using a fixed voltage regulator. It appears that this may
affect (apart from Ux500 Snowball) possibly these archs/machines
that from some grep:s appear to define both CONFIG_SMSC911X and
CONFIG_REGULATOR:

- ARM Freescale mx3 and OMAP 2 plus, Raumfeld machines
- Blackfin
- Super-H

Cc: Paul Mundt <lethal@linux-sh.org>
Cc: linux-sh@vger.kernel.org
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Tony Lindgren <tony@atomide.com>
Cc: linux-omap@vger.kernel.org
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: uclinux-dist-devel@blackfin.uclinux.org
Reviewed-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Robert Marklund <robert.marklund@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfgang Grandegger [Wed, 23 Nov 2011 23:58:22 +0000 (23:58 +0000)]
can: sja1000_isa: convert to platform driver to support x86_64 systems

This driver is currently not supported on x86_64 systems because the
"isa_driver" interface is used (CONFIG_ISA=y). To overcome this
limitation, the driver is converted to a platform driver, similar to
the serial 8250 driver.

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RongQing.Li [Wed, 23 Nov 2011 23:10:52 +0000 (23:10 +0000)]
ipv4: remove useless codes in ipmr_device_event()

Commit 7dc00c82 added a 'notify' parameter for vif_delete() to
distinguish whether to unregister the device.

notify=1 means we do not need to unregister the device,
so calling unregister_netdevice_many() is useless.

Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfgang Grandegger [Wed, 23 Nov 2011 23:08:35 +0000 (23:08 +0000)]
can: sja1000_isa: fix "limited range" compiler warnings

This patch fixes the compiler warnings: "comparison is always
false due to limited range of data type" by using "0xff" instead
of "-1" for unsigned values.
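
The reason the comparison can never be true is integer promotion. The sketch below spells the promotion out explicitly; CFG_UNSET and both helper names are hypothetical, chosen only for this illustration:

```c
#include <stdint.h>

#define CFG_UNSET 0xff  /* hypothetical "not set" marker for an 8-bit parameter */

/*
 * Spells out what "v == -1" does for an unsigned 8-bit v: the value is
 * promoted to int, so it lies in 0..255 and can never equal -1.
 */
static int matches_minus_one(uint8_t v)
{
    int promoted = v;

    return promoted == -1;
}

/* Comparing against 0xff stays within the range of the 8-bit type. */
static int is_unset(uint8_t v)
{
    return v == CFG_UNSET;
}
```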

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Maravic [Fri, 25 Nov 2011 07:44:54 +0000 (07:44 +0000)]
net: Fix skb_update_prio RCU usage.

Change function rcu_dereference to rcu_dereference_bh to avoid warning

[ INFO: suspicious RCU usage. ]
-------------------------------
net/core/dev.c:2459 suspicious rcu_dereference_check() usage!

because we are locking with

rcu_read_lock_bh();

in function dev_queue_xmit(struct sk_buff *skb)

Signed-off-by: Igor Maravic <igorm@etf.rs>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 29 Nov 2011 04:22:15 +0000 (04:22 +0000)]
sch_choke: use skb_flow_dissect()

Instead of using a custom flow dissector, use skb_flow_dissect() and
benefit from tunnelling support.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 29 Nov 2011 03:40:45 +0000 (03:40 +0000)]
sch_sfq: use skb_flow_dissect()

Instead of using a custom flow dissector, use skb_flow_dissect() and
benefit from tunnelling support.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 28 Nov 2011 22:41:47 +0000 (22:41 +0000)]
tcp: avoid frag allocation for small frames

tcp_sendmsg() uses select_size() helper to choose skb head size when a
new skb must be allocated.

If GSO is enabled for the socket, current strategy is to force all
payload data to be outside of headroom, in PAGE fragments.

This strategy is not welcome for small packets, wasting memory.

Experiments show that best results are obtained when using 2048 bytes
for skb head (This includes the skb overhead and various headers)

This patch provides better len/truesize ratios for packets sent to the
loopback device, and reduces memory needs for in-flight loopback packets,
particularly on arches with big pages.

If a sender sends many 1-byte packets to an unresponsive application,
receiver rmem_alloc will grow faster and will stop queuing these packets
sooner, or will collapse its receive queue to free excess memory.

netperf -t TCP_RR results are improved by ~4 %, and many workloads are
improved as well (tbench, mysql...)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 28 Nov 2011 20:30:35 +0000 (20:30 +0000)]
flow_dissector: use a 64bit load/store

On Monday 28 November 2011 at 19:06 -0500, David Miller wrote:
> From: Dimitris Michailidis <dm@chelsio.com>
> Date: Mon, 28 Nov 2011 08:25:39 -0800
>
> >> +bool skb_flow_dissect(const struct sk_buff *skb, struct flow_keys
> >> *flow)
> >> +{
> >> + int poff, nhoff = skb_network_offset(skb);
> >> + u8 ip_proto;
> >> + u16 proto = skb->protocol;
> >
> > __be16 instead of u16 for proto?
>
> I'll take care of this when I apply these patches.

( CC trimmed )

Thanks David !

Here is a small patch to use one 64bit load/store on x86_64 instead of
two 32bit load/stores.

[PATCH net-next] flow_dissector: use a 64bit load/store

gcc compiler is smart enough to use a single load/store if we
memcpy(dptr, sptr, 8) on x86_64, regardless of
CONFIG_CC_OPTIMIZE_FOR_SIZE

In the IP header, daddr immediately follows saddr, and this won't change in the
future. We only need to make sure our flow_keys (src,dst) fields won't
break the rule.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Mon, 28 Nov 2011 16:33:43 +0000 (16:33 +0000)]
sfc: Support for byte queue limits

Changes to sfc to use byte queue limits.

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Mon, 28 Nov 2011 16:33:37 +0000 (16:33 +0000)]
bnx2x: Support for byte queue limits

Changes to bnx2x to use byte queue limits.

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Mon, 28 Nov 2011 16:33:30 +0000 (16:33 +0000)]
tg3: Support for byte queue limits

Changes to tg3 to use byte queue limits.

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Mon, 28 Nov 2011 16:33:23 +0000 (16:33 +0000)]
forcedeth: Support for byte queue limits

Changes to forcedeth to use byte queue limits.

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Mon, 28 Nov 2011 16:33:16 +0000 (16:33 +0000)]
e1000e: Support for byte queue limits

Changes to e1000e to use byte queue limits.

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Mon, 28 Nov 2011 16:33:09 +0000 (16:33 +0000)]
bql: Byte queue limits

Networking stack support for byte queue limits, uses dynamic queue
limits library.  Byte queue limits are maintained per transmit queue,
and a dql structure has been added to netdev_queue structure for this
purpose.

Configuration of bql is in the tx-<n> sysfs directory for the queue
under the byte_queue_limits directory.  Configuration includes:
limit_min, bql minimum limit
limit_max, bql maximum limit
hold_time, bql slack hold time

Also under the directory are:
limit, current byte limit
inflight, current number of bytes on the queue

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Mon, 28 Nov 2011 16:33:02 +0000 (16:33 +0000)]
xps: Add xps_queue_release function

This patch moves the xps specific parts in netdev_queue_release into
its own function which netdev_queue_release can call.  This allows
netdev_queue_release to be more generic (for adding new attributes
to tx queues).

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Mon, 28 Nov 2011 16:32:52 +0000 (16:32 +0000)]
net: Add netdev interfaces for recording sends/comp

Add interfaces for drivers to call for recording number of packets and
bytes at send time and transmit completion.  Also, added a function to
"reset" a queue.  These will be used by Byte Queue Limits.

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Mon, 28 Nov 2011 16:32:44 +0000 (16:32 +0000)]
net: Add queue state xoff flag for stack

Create separate queue state flags so that either the stack or drivers
can turn on XOFF.  Added a set of functions used in the stack to determine
if a queue is really stopped (either by stack or driver)
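
One way to picture the separate flags is the sketch below. The flag names, struct txq and the helpers are hypothetical and only model the idea that the driver and the stack each own one bit, while "really stopped" means either bit is set:

```c
/* Hypothetical state bits: one owned by the driver, one by the stack. */
#define QUEUE_XOFF_DRV   (1u << 0)
#define QUEUE_XOFF_STACK (1u << 1)
#define QUEUE_XOFF_ANY   (QUEUE_XOFF_DRV | QUEUE_XOFF_STACK)

struct txq { unsigned int state; };  /* hypothetical transmit queue */

static void txq_stop_driver(struct txq *q) { q->state |= QUEUE_XOFF_DRV; }
static void txq_wake_driver(struct txq *q) { q->state &= ~QUEUE_XOFF_DRV; }
static void txq_stop_stack(struct txq *q)  { q->state |= QUEUE_XOFF_STACK; }
static void txq_wake_stack(struct txq *q)  { q->state &= ~QUEUE_XOFF_STACK; }

/* Stopped by the driver only. */
static int txq_driver_stopped(const struct txq *q)
{
    return (q->state & QUEUE_XOFF_DRV) != 0;
}

/* Really stopped: either the stack or the driver has turned XOFF on. */
static int txq_stopped(const struct txq *q)
{
    return (q->state & QUEUE_XOFF_ANY) != 0;
}
```

Keeping the bits separate means a wake from one side cannot accidentally restart a queue that the other side still wants stopped.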

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Mon, 28 Nov 2011 16:32:35 +0000 (16:32 +0000)]
dql: Dynamic queue limits

Implementation of dynamic queue limits (dql).  This is a library which
allows a queue limit to be dynamically managed.  The goal of dql is
to keep the queue limit (the number of objects allowed in the queue) as
small as possible without allowing the queue to be starved.

dql would be used with a queue which has these properties:

1) Objects are queued up to some limit which can be expressed as a
   count of objects.
2) Periodically a completion process executes which retires consumed
   objects.
3) Starvation occurs when limit has been reached, all queued data has
   actually been consumed but completion processing has not yet run,
   so queuing new data is blocked.
4) Minimizing the amount of queued data is desirable.

A canonical example of such a queue would be a NIC HW transmit queue.

The queue limit is dynamic, it will increase or decrease over time
depending on the workload.  The queue limit is recalculated each time
completion processing is done.  Increases occur when the queue is
starved and can exponentially increase over successive intervals.
Decreases occur when more data is being maintained in the queue than
needed to prevent starvation.  The number of extra objects, or "slack",
is measured over successive intervals, and to avoid hysteresis the
limit is only reduced by the minimum slack seen over a configurable
time period.

dql API provides routines to manage the queue:
- dql_init is called to initialize the dql structure
- dql_reset is called to reset dynamic values
- dql_queued is called when objects are being enqueued
- dql_avail returns availability in the queue
- dql_completed is called when objects have been consumed in the queue

Configuration consists of:
- max_limit, maximum limit
- min_limit, minimum limit
- slack_hold_time, time to measure instances of slack before reducing
  queue limit
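
To make the enqueue/complete cycle concrete, here is a toy model. It is only a sketch: the toy_dql names echo the API list above, but the fields and the doubling rule are simplifying assumptions, and the slack-based decrease and hold time are left out entirely. It is not the kernel's algorithm.

```c
#include <stdbool.h>

/* Toy model only; fields and adjustment rule are assumptions of this sketch. */
struct toy_dql {
    unsigned int limit;          /* current queue limit */
    unsigned int min_limit;      /* lower bound for limit */
    unsigned int max_limit;      /* upper bound for limit */
    unsigned int num_queued;     /* total objects ever queued */
    unsigned int num_completed;  /* total objects ever completed */
    bool hit_limit;              /* limit reached since the last completion run */
};

static void toy_dql_init(struct toy_dql *d, unsigned int min_limit,
                         unsigned int max_limit)
{
    d->min_limit = min_limit;
    d->max_limit = max_limit;
    d->limit = min_limit;
    d->num_queued = 0;
    d->num_completed = 0;
    d->hit_limit = false;
}

/* Remaining room in the queue; zero or negative means "stop queuing". */
static int toy_dql_avail(const struct toy_dql *d)
{
    return (int)d->limit - (int)(d->num_queued - d->num_completed);
}

/* Record that count objects were enqueued. */
static void toy_dql_queued(struct toy_dql *d, unsigned int count)
{
    d->num_queued += count;
    if (toy_dql_avail(d) <= 0)
        d->hit_limit = true;
}

/* Record that count objects were consumed, and grow the limit if starved. */
static void toy_dql_completed(struct toy_dql *d, unsigned int count)
{
    d->num_completed += count;
    /* Starved: the limit was reached and everything queued has been consumed. */
    if (d->hit_limit && d->num_completed == d->num_queued) {
        unsigned int grown = d->limit ? d->limit * 2 : 1;

        d->limit = grown > d->max_limit ? d->max_limit : grown;
    }
    d->hit_limit = false;
}
```

A caller would check toy_dql_avail() before enqueuing and call toy_dql_completed() from its completion handler; the limit only grows after a completion run that found the queue both full and fully drained.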

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rob Herring [Tue, 22 Nov 2011 17:18:19 +0000 (17:18 +0000)]
net: add calxeda xgmac ethernet driver

Add support for the XGMAC 10Gb ethernet device in the Calxeda Highbank
SOC.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rick Jones [Tue, 22 Nov 2011 14:06:26 +0000 (14:06 +0000)]
corral some wayward N/A fw_version dust bunnies

Round-up some wayward "N/A" fw_version dust bunnies as part of that
clean-up.

Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neal Cardwell [Mon, 21 Nov 2011 17:15:14 +0000 (17:15 +0000)]
tcp: do not scale TSO segment size with reordering degree

Since 2005 (c1b4a7e69576d65efc31a8cea0714173c2841244)
tcp_tso_should_defer has been using tcp_max_burst() as a target limit
for deciding how large to make outgoing TSO packets when not using
sysctl_tcp_tso_win_divisor. But since 2008
(dd9e0dda66ba38a2ddd1405ac279894260dc5c36) tcp_max_burst() returns the
reordering degree. We should not have tcp_tso_should_defer attempt to
build larger segments just because there is more reordering. This
commit splits the notion of deferral size used in TSO from the notion
of burst size used in cwnd moderation, and returns the TSO deferral
limit to its original value.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pascal Hambourg [Wed, 17 Aug 2011 06:37:52 +0000 (08:37 +0200)]
atm: br2684: Avoid alignment issues

Use memcmp() instead of cast to u16 when checking the PAD field.
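
A minimal sketch of the difference, assuming a two-byte all-zero PAD field (the pad contents and the helper name are assumptions for illustration):

```c
#include <stdint.h>
#include <string.h>

static const uint8_t pad[2] = { 0x00, 0x00 };  /* assumed two-byte zero PAD */

/*
 * memcmp() reads byte by byte, so p may have any alignment. Reading the
 * field as *(const uint16_t *)p instead would be an unaligned access
 * whenever p sits at an odd address.
 */
static int pad_is_valid(const uint8_t *p)
{
    return memcmp(p, pad, sizeof(pad)) == 0;
}
```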

Signed-off-by: Pascal Hambourg <pascal@plouf.fr.eu.org>
Signed-off-by: chas williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pascal Hambourg [Wed, 17 Aug 2011 06:37:18 +0000 (08:37 +0200)]
atm: br2684: Make headroom and hard_header_len depend on the payload type

Routed payload requires less headroom than bridged payload.
So do not reallocate headroom if not needed.
Also, add worst case AAL5 overhead to netdev->hard_header_len.

Signed-off-by: Pascal Hambourg <pascal@plouf.fr.eu.org>
Signed-off-by: chas williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: optimize socket timestamping
Eric Dumazet [Mon, 28 Nov 2011 12:04:18 +0000 (12:04 +0000)]
net: optimize socket timestamping

We can test/set multiple bits from sk_flags at once, to slightly shorten
the socket setup/dismantle phase.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
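
The idea of testing or clearing several flag bits in one operation can be
sketched as follows. This is a userspace illustration only; the bit numbers
and the names FLAG_*, FLAGS_TS_MASK, any_timestamp_flag() and
clear_timestamp_flags() are hypothetical and do not come from the kernel
sources.

```c
/* Hypothetical bit numbers, standing in for individual socket flags. */
enum {
	FLAG_TIMESTAMP       = 0,
	FLAG_TIMESTAMPING_RX = 1,
	FLAG_OTHER           = 2,
};

/* One mask that covers both timestamp-related bits. */
#define FLAGS_TS_MASK \
	((1UL << FLAG_TIMESTAMP) | (1UL << FLAG_TIMESTAMPING_RX))

/* A single AND answers "is either timestamp flag set?". */
static int any_timestamp_flag(unsigned long flags)
{
	return (flags & FLAGS_TS_MASK) != 0;
}

/* A single AND-NOT clears both bits and leaves the others alone. */
static unsigned long clear_timestamp_flags(unsigned long flags)
{
	return flags & ~FLAGS_TS_MASK;
}
```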
12 years agonet: dont call jump_label_dec from irq context
Eric Dumazet [Mon, 28 Nov 2011 11:16:50 +0000 (11:16 +0000)]
net: dont call jump_label_dec from irq context

Igor Maravic reported an error caused by jump_label_dec() being called
from IRQ context :

 BUG: sleeping function called from invalid context at kernel/mutex.c:271
 in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper
 1 lock held by swapper/0:
  #0:  (&n->timer){+.-...}, at: [<ffffffff8107ce90>] call_timer_fn+0x0/0x340
 Pid: 0, comm: swapper Not tainted 3.2.0-rc2-net-next-mpls+ #1
Call Trace:
 <IRQ>  [<ffffffff8104f417>] __might_sleep+0x137/0x1f0
 [<ffffffff816b9a2f>] mutex_lock_nested+0x2f/0x370
 [<ffffffff810a89fd>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff8109a37f>] ? local_clock+0x6f/0x80
 [<ffffffff810a90a5>] ? lock_release_holdtime.part.22+0x15/0x1a0
 [<ffffffff81557929>] ? sock_def_write_space+0x59/0x160
 [<ffffffff815e936e>] ? arp_error_report+0x3e/0x90
 [<ffffffff810969cd>] atomic_dec_and_mutex_lock+0x5d/0x80
 [<ffffffff8112fc1d>] jump_label_dec+0x1d/0x50
 [<ffffffff81566525>] net_disable_timestamp+0x15/0x20
 [<ffffffff81557a75>] sock_disable_timestamp+0x45/0x50
 [<ffffffff81557b00>] __sk_free+0x80/0x200
 [<ffffffff815578d0>] ? sk_send_sigurg+0x70/0x70
 [<ffffffff815e936e>] ? arp_error_report+0x3e/0x90
 [<ffffffff81557cba>] sock_wfree+0x3a/0x70
 [<ffffffff8155c2b0>] skb_release_head_state+0x70/0x120
 [<ffffffff8155c0b6>] __kfree_skb+0x16/0x30
 [<ffffffff8155c119>] kfree_skb+0x49/0x170
 [<ffffffff815e936e>] arp_error_report+0x3e/0x90
 [<ffffffff81575bd9>] neigh_invalidate+0x89/0xc0
 [<ffffffff81578dbe>] neigh_timer_handler+0x9e/0x2a0
 [<ffffffff81578d20>] ? neigh_update+0x640/0x640
 [<ffffffff81073558>] __do_softirq+0xc8/0x3a0

Since jump_label_{inc|dec} must be called from process context only,
we must defer jump_label_dec() if net_disable_timestamp() is called
from interrupt context.

Reported-by: Igor Maravic <igorm@etf.rs>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
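
The deferral described in the jump_label_dec() fix above can be sketched in
userspace roughly as follows. This is an analogy under stated assumptions,
not the kernel patch: the plain counter `users` stands in for the jump label
reference count, `in_irq_context` is a hypothetical stand-in for the kernel's
interrupt-context check, and flush_deferred() is an invented name for
whatever process-context path later applies the pending decrements.

```c
#include <stdatomic.h>

static int users;            /* stands in for the jump label refcount */
static atomic_int deferred;  /* decrements requested from "IRQ" context */
static int in_irq_context;   /* hypothetical stand-in for in_interrupt() */

/* May be called from any context. */
static void disable_feature(void)
{
	if (in_irq_context) {
		/* Must not sleep here: only record the request. */
		atomic_fetch_add(&deferred, 1);
		return;
	}
	users--;
}

/* Must be called from process context only. */
static void flush_deferred(void)
{
	int n = atomic_exchange(&deferred, 0);

	while (n-- > 0)
		users--;
}
```

The atomic exchange takes all pending requests in one step, so a request
recorded concurrently is either applied now or left for the next flush.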
12 years agonet/ethernet: convert drivers/net/ethernet/* to use module_platform_driver()
Axel Lin [Sun, 27 Nov 2011 16:44:17 +0000 (16:44 +0000)]
net/ethernet: convert drivers/net/ethernet/* to use module_platform_driver()

This patch converts the drivers in drivers/net/ethernet/* to use the
module_platform_driver() macro which makes the code smaller and a bit
simpler.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Pantelis Antoniou <pantelis.antoniou@gmail.com>
Cc: Vitaly Bordug <vbordug@ru.mvista.com>
Cc: Wan ZongShun <mcuos.com@gmail.com>
Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jiri Pirko <jpirko@redhat.com>
Cc: Daniel Hellstrom <daniel@gaisler.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Tobias Klauser <tklauser@distanz.ch>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Richard Cochran <richard.cochran@omicron.at>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Sebastian Poehn <sebastian.poehn@belden.com>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Wan ZongShun <mcuos.com@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet/can: convert drivers/net/can/* to use module_platform_driver()
Axel Lin [Sun, 27 Nov 2011 15:42:31 +0000 (15:42 +0000)]
net/can: convert drivers/net/can/* to use module_platform_driver()

This patch converts the drivers in drivers/net/can/* to use the
module_platform_driver() macro which makes the code smaller and a bit
simpler.

Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Bhupesh Sharma <bhupesh.sharma@st.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Anatolij Gustschin <agust@denx.de>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: Kurt Van Dijck <kurt.van.dijck@eia.be>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoNET: NETROM: Fix formatting.
Ralf Baechle [Thu, 24 Nov 2011 23:54:10 +0000 (23:54 +0000)]
NET: NETROM: Fix formatting.

The Linux coding style wants the return statement on its own line.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoNET: NETROM: Cleanup argument SIOCADDRT ioctl argument checking.
Ralf Baechle [Thu, 24 Nov 2011 23:09:00 +0000 (23:09 +0000)]
NET: NETROM: Cleanup argument SIOCADDRT ioctl argument checking.

nr_route.ndigis is unsigned int so the nr_route.ndigis < 0 expression is
never true and can be dropped.  Doing the nr_ax25_dev_get call later
allows the nr_route.ndigis test to bail out without having to dev_put.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Osterried <thomas@osterried.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
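
The two points of the SIOCADDRT cleanup above (an unsigned value needs no
"< 0" test, and validating before acquiring a reference avoids a release on
the error path) can be sketched like this. It is a userspace illustration;
MAX_DIGIS, device_refs and add_route() are hypothetical names, and the
counter merely models taking and dropping a device reference.

```c
#include <errno.h>

#define MAX_DIGIS 8	/* hypothetical upper bound for the example */

static int device_refs;	/* models references currently held */

static int add_route(unsigned int ndigis)
{
	/* ndigis is unsigned, so only the upper bound can be violated. */
	if (ndigis > MAX_DIGIS)
		return -EINVAL;	/* nothing acquired yet, nothing to put */

	device_refs++;		/* acquire only after validation */
	/* ... the route would be added here ... */
	device_refs--;		/* release on the normal path */
	return 0;
}
```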
12 years agoNET: NETROM: When adding a route verify length of mnemonic string.
Ralf Baechle [Thu, 24 Nov 2011 23:08:49 +0000 (23:08 +0000)]
NET: NETROM: When adding a route verify length of mnemonic string.

struct nr_route_struct's mnemonic permits a string of up to 7 bytes to be
used.  If userland passes a string that is not zero-terminated to the kernel,
adding a node to the routing table might result in the kernel attempting to
copy a string that is too long.

Mnemonic is part of the NET/ROM routing protocol; NET/ROM routing table
updates only broadcast 6 bytes.  The 7th byte in the mnemonic array exists
only as a \0 termination character for the kernel code's convenience.

Fixed by rejecting mnemonic strings that have no terminating \0 in the first
7 characters.  Do this test only for NETROM_NODE to avoid breaking
NETROM_NEIGH, where userland might be passing an uninitialized mnemonic field.

Initial patch by Dan Carpenter <dan.carpenter@oracle.com>.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Walter Harms <wharms@bfs.de>
Cc: Thomas Osterried <thomas@osterried.de>
Acked-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
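
The mnemonic check described above amounts to looking for a \0 within the
fixed-size field before treating it as a string. A minimal userspace sketch,
assuming a 7-byte field as stated in the commit message; the names
MNEMONIC_LEN and check_mnemonic() are made up for the example and the kernel
code may express the test differently:

```c
#include <errno.h>
#include <string.h>

#define MNEMONIC_LEN 7	/* size of the mnemonic field, per the text above */

/* Reject a field that contains no terminating \0 in its 7 bytes. */
static int check_mnemonic(const char mnemonic[MNEMONIC_LEN])
{
	if (memchr(mnemonic, '\0', MNEMONIC_LEN) == NULL)
		return -EINVAL;
	return 0;
}
```

A 6-character mnemonic followed by \0 passes; 7 non-zero bytes are rejected.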
12 years agoNET: AX.25: Check ioctl arguments to avoid overflows further down the road.
Ralf Baechle [Thu, 24 Nov 2011 06:12:59 +0000 (06:12 +0000)]
NET: AX.25: Check ioctl arguments to avoid overflows further down the road.

Very large, nonsensical arguments or use in very extreme conditions could
result in integer overflows.  Check ioctl arguments to avoid such
overflows and return -EINVAL for too large arguments.

To allow the use of AX.25 for even the most extreme setup (think packet
radio to the Phase 5E mars probe) we make no further attempt to clamp the
argument range.

Originally reported by Fan Long <longfancn@gmail.com> and a first patch
was sent by Xi Wang <xi.wang@gmail.com>.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Xi Wang <xi.wang@gmail.com>
Cc: Joerg Reuter <jreuter@yaina.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Thomas Osterried <thomas@osterried.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
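
An overflow check of the kind described in the AX.25 commit above can be
sketched as a division against the type's maximum before multiplying. This
is a userspace illustration under assumptions: TICKS_PER_SEC is a
hypothetical stand-in for the kernel tick rate, seconds_to_ticks() is an
invented helper, and rejecting a value of 0 is a choice made for the example.

```c
#include <errno.h>
#include <limits.h>

#define TICKS_PER_SEC 1000UL	/* hypothetical stand-in for the tick rate */

/*
 * Convert seconds to ticks, refusing values whose product would not
 * fit in an unsigned long. No other clamping is applied.
 */
static int seconds_to_ticks(unsigned long secs, unsigned long *ticks)
{
	if (secs < 1 || secs > ULONG_MAX / TICKS_PER_SEC)
		return -EINVAL;

	*ticks = secs * TICKS_PER_SEC;
	return 0;
}
```

Any value up to ULONG_MAX / TICKS_PER_SEC is accepted, which matches the
stated intent of not restricting the argument range further.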
12 years agodsa: Move switch drivers to new directory drivers/net/dsa
Ben Hutchings [Sun, 27 Nov 2011 17:08:33 +0000 (17:08 +0000)]
dsa: Move switch drivers to new directory drivers/net/dsa

Support for specific hardware belongs under drivers/net/ not net/.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agodsa: Move all definitions needed by drivers into <net/dsa.h>
Ben Hutchings [Sun, 27 Nov 2011 17:06:08 +0000 (17:06 +0000)]
dsa: Move all definitions needed by drivers into <net/dsa.h>

Any headers included by drivers should be under include/, and
any definitions they use are not really private to the core as
the name "dsa_priv.h" suggests.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agodsa: Remove unnecessary exports
Ben Hutchings [Sun, 27 Nov 2011 17:05:06 +0000 (17:05 +0000)]
dsa: Remove unnecessary exports

I mistakenly exported functions from slave.c that are only called from
dsa.c, part of the same module.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville...
David S. Miller [Tue, 29 Nov 2011 00:21:10 +0000 (19:21 -0500)]
Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless-next

12 years agotg3: Scale back code that modifies MRRS
Matt Carlson [Mon, 28 Nov 2011 09:41:04 +0000 (09:41 +0000)]
tg3: Scale back code that modifies MRRS

Tg3 normally gets a performance boost by increasing the PCI Maximum Read
Request Size (MRRS) to 4k.  Unfortunately, this is causing some problems
on particular hardware platforms.  This patch removes all code that
modifies the MRRS except for one case.

As part of a solution to fix an internal FIFO problem on the 5719, the
driver artificially capped the MRRS to 2k for the entire 5719, and later
5720, ASIC revs.  This was overly aggressive and only really needed to
be done for the 5719 A0.  In the spirit of the rest of this patch, the
driver will only reprogram the MRRS for this device if the value exceeds
the 2k cap.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
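
The remaining MRRS rule described above (reprogram only when the current
value exceeds the 2k cap) reduces to a guarded clamp. A minimal sketch with
hypothetical names MRRS_CAP and cap_mrrs(); it does not touch PCI
configuration space and is not the tg3 code:

```c
#define MRRS_CAP 2048U	/* the 2k cap mentioned in the text */

/*
 * Returns 1 if the value had to be lowered to the cap,
 * 0 if it was left untouched.
 */
static int cap_mrrs(unsigned int *mrrs)
{
	if (*mrrs <= MRRS_CAP)
		return 0;

	*mrrs = MRRS_CAP;
	return 1;
}
```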
12 years agotg3: Fix TSO CAP for 5704 devs w / ASF enabled
Matt Carlson [Mon, 28 Nov 2011 09:41:03 +0000 (09:41 +0000)]
tg3: Fix TSO CAP for 5704 devs w / ASF enabled

On the earliest TSO capable devices, TSO was accomplished through
firmware.  The TSO cannot coexist with ASF management firmware though.
The tg3 driver determines whether or not ASF is enabled by calling
tg3_get_eeprom_hw_cfg(), which checks a particular bit of NIC memory.
Commit dabc5c670d3f86d15ee4f42ab38ec5bd2682487d, entitled "tg3: Move
TSO_CAPABLE assignment", accidentally moved the code that determines
TSO capabilities earlier than the call to tg3_get_eeprom_hw_cfg().  As a
consequence, the driver was attempting to determine TSO capabilities
before it had all the data it needed to make the decision.

This patch fixes the problem by revisiting and reevaluating the decision
after tg3_get_eeprom_hw_cfg() is called.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agosch_sfb: use skb_flow_dissect()
Eric Dumazet [Mon, 28 Nov 2011 05:25:02 +0000 (05:25 +0000)]
sch_sfb: use skb_flow_dissect()

The current SFB double hashing does not fulfill SFB theory if two flows
share the same rxhash value.

Using skb_flow_dissect() gives better hash dispersion and adds
tunnelling support as well.

The double hashing point was mentioned by Florian Westphal.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agocls_flow: use skb_flow_dissect()
Eric Dumazet [Mon, 28 Nov 2011 05:24:18 +0000 (05:24 +0000)]
cls_flow: use skb_flow_dissect()

Instead of using a custom flow dissector, use skb_flow_dissect() and
benefit from tunnelling support.

This lack of tunnelling support was mentioned by Dan Siemon.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: use skb_flow_dissect() in __skb_get_rxhash()
Eric Dumazet [Mon, 28 Nov 2011 05:23:23 +0000 (05:23 +0000)]
net: use skb_flow_dissect() in __skb_get_rxhash()

No functional changes.

This uses the code we factorized in skb_flow_dissect()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: introduce skb_flow_dissect()
Eric Dumazet [Mon, 28 Nov 2011 05:22:18 +0000 (05:22 +0000)]
net: introduce skb_flow_dissect()

We use at least two flow dissectors in the network stack, with known
limitations and code duplication.

Introduce skb_flow_dissect() to factorize this, highly inspired by the
existing dissector in __skb_get_rxhash().

Note: We extensively use skb_header_pointer(), which permits us to not
touch the skb at all.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Change value comparison order
Yaniv Rosner [Mon, 28 Nov 2011 00:49:53 +0000 (00:49 +0000)]
bnx2x: Change value comparison order

Change comparison order such that the variable will come before the compared value.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Cosmetic changes
Yaniv Rosner [Mon, 28 Nov 2011 00:49:52 +0000 (00:49 +0000)]
bnx2x: Cosmetic changes

Fix spelling, alignment, empty lines, relocate the is_4_port_mode function, and split bnx2x_link_status_update function.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Fix self test of BCM57800
Yaniv Rosner [Mon, 28 Nov 2011 00:49:51 +0000 (00:49 +0000)]
bnx2x: Fix self test of BCM57800

Fix the MAC test of the 1G port of the BCM57800 to use the UMAC instead of the XMAC.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Add known PHY type check
Yaniv Rosner [Mon, 28 Nov 2011 00:49:50 +0000 (00:49 +0000)]
bnx2x: Add known PHY type check

The populate function will fail in case an unknown external PHY is detected.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Change Warpcore MDIO work around mode
Yaniv Rosner [Mon, 28 Nov 2011 00:49:49 +0000 (00:49 +0000)]
bnx2x: Change Warpcore MDIO work around mode

This patch enables the usage of simpler MDC/MDIO work-around when accessing Warpcore registers.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Fix BCM84833 link and LED behavior
Yaniv Rosner [Mon, 28 Nov 2011 00:49:48 +0000 (00:49 +0000)]
bnx2x: Fix BCM84833 link and LED behavior

This patch contains several fixes for the BCM84833. This PHY is still not in bnx2x production, hence this patch can be considered an enhancement.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Warpcore HW reset following fan failure
Yaniv Rosner [Mon, 28 Nov 2011 00:49:47 +0000 (00:49 +0000)]
bnx2x: Warpcore HW reset following fan failure

Put Warpcore in low power mode in case of fan failure to reduce heat.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: ETS changes
Yaniv Rosner [Mon, 28 Nov 2011 00:49:46 +0000 (00:49 +0000)]
bnx2x: ETS changes

Fix a problem where the ETS does not conform when a new traffic class is created with 0% BW.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: PFC changes
Yaniv Rosner [Mon, 28 Nov 2011 00:49:45 +0000 (00:49 +0000)]
bnx2x: PFC changes

Change BRB to work in per class guaranteed mode and handle cases for BW 0%.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: tcp_sendmsg() wrong access to sk_route_caps
Eric Dumazet [Mon, 28 Nov 2011 00:27:47 +0000 (00:27 +0000)]
tcp: tcp_sendmsg() wrong access to sk_route_caps

Now that sk_route_caps is u64, it's dangerous to use an integer to store
the result of an AND operator. It won't work if NETIF_F_SG is moved to the
upper part of the u64.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
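
The hazard described in the tcp_sendmsg() fix above can be reproduced in a
few lines. This is a userspace sketch under assumptions: FEATURE_HIGH is a
hypothetical flag placed in the upper half of a 64-bit word, and uint32_t is
used for the narrow variable (instead of int, as in the commit message) so
that the truncation is well defined in standard C.

```c
#include <stdint.h>

#define FEATURE_HIGH (1ULL << 40)	/* hypothetical flag in the upper 32 bits */

/* Buggy pattern: the AND result is narrowed before it is tested. */
static int has_feature_narrow(uint64_t caps)
{
	uint32_t sg = (uint32_t)(caps & FEATURE_HIGH);	/* upper bits are lost */

	return sg != 0;
}

/* Safe pattern: compare the full-width result. */
static int has_feature_wide(uint64_t caps)
{
	return (caps & FEATURE_HIGH) != 0;
}
```

With the flag set, the narrow version reports it as absent while the wide
version reports it correctly.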
12 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
John W. Linville [Mon, 28 Nov 2011 19:11:18 +0000 (14:11 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-next into for-davem

12 years agonet/irda: convert drivers/net/irda/* to use module_platform_driver()
Axel Lin [Mon, 28 Nov 2011 01:29:11 +0000 (20:29 -0500)]
net/irda: convert drivers/net/irda/* to use module_platform_driver()

This patch converts the drivers in drivers/net/irda/* to use the
module_platform_driver() macro which makes the code smaller and a bit
simpler.

Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: skip cwnd moderation in TCP_CA_Open in tcp_try_to_open
Neal Cardwell [Wed, 16 Nov 2011 08:58:05 +0000 (08:58 +0000)]
tcp: skip cwnd moderation in TCP_CA_Open in tcp_try_to_open

The problem: Senders were overriding cwnd values picked during an undo
by calling tcp_moderate_cwnd() in tcp_try_to_open().

The fix: Don't moderate cwnd in tcp_try_to_open() if we're in
TCP_CA_Open, since doing so is generally unnecessary and specifically
would override a DSACK-based undo of a cwnd reduction made in fast
recovery.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: allow undo from reordered DSACKs
Neal Cardwell [Wed, 16 Nov 2011 08:58:04 +0000 (08:58 +0000)]
tcp: allow undo from reordered DSACKs

Previously, SACK-enabled connections hung around in TCP_CA_Disorder
state while snd_una==high_seq, just waiting to accumulate DSACKs and
hopefully undo a cwnd reduction. This could and did lead to the
following unfortunate scenario: if some incoming ACKs advance snd_una
beyond high_seq then we were setting undo_marker to 0 and moving to
TCP_CA_Open, so if (due to reordering in the ACK return path) we
shortly thereafter received a DSACK then we were no longer able to
undo the cwnd reduction.

The change: Simplify the congestion avoidance state machine by
removing the behavior where SACK-enabled connections hung around in
the TCP_CA_Disorder state just waiting for DSACKs. Instead, when
snd_una advances to high_seq or beyond we typically move to
TCP_CA_Open immediately and allow an undo in either TCP_CA_Open or
TCP_CA_Disorder if we later receive enough DSACKs.

Other patches in this series will provide other changes that are
necessary to fully fix this problem.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: use SACKs and DSACKs that arrive on ACKs below snd_una
Neal Cardwell [Wed, 16 Nov 2011 08:58:03 +0000 (08:58 +0000)]
tcp: use SACKs and DSACKs that arrive on ACKs below snd_una

The bug: When the ACK field is below snd_una (which can happen when
ACKs are reordered), senders ignored DSACKs (preventing undo) and did
not call tcp_fastretrans_alert, so they did not increment
prr_delivered to reflect newly-SACKed sequence ranges, and did not
call tcp_xmit_retransmit_queue, thus passing up chances to send out
more retransmitted and new packets based on any newly-SACKed packets.

The change: When the ACK field is below snd_una (the "old_ack" goto
label), call tcp_fastretrans_alert to allow undo based on any
newly-arrived DSACKs and try to send out more packets based on
newly-SACKed packets.

Other patches in this series will provide other changes that are
necessary to fully fix this problem.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: use DSACKs that arrive when packets_out is 0
Neal Cardwell [Wed, 16 Nov 2011 08:58:02 +0000 (08:58 +0000)]
tcp: use DSACKs that arrive when packets_out is 0

The bug: Senders ignored DSACKs after recovery when there were no
outstanding packets (a common scenario for HTTP servers).

The change: when there are no outstanding packets (the "no_queue" goto
label), call tcp_fastretrans_alert() in order to use DSACKs to undo
congestion window reductions.

Other patches in this series will provide other changes that are
necessary to fully fix this problem.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: make is_dupack a parameter to tcp_fastretrans_alert()
Neal Cardwell [Wed, 16 Nov 2011 08:58:01 +0000 (08:58 +0000)]
tcp: make is_dupack a parameter to tcp_fastretrans_alert()

Allow callers to decide whether an ACK is a duplicate ACK. This is a
prerequisite to allowing fastretrans_alert to be called from new
contexts, such as the no_queue and old_ack code paths, from which we
have extra info that tells us whether an ACK is a dupack.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet/mlx4_en: bug fix for the case of vlan id 0 and UP 0
Amir Vadai [Sat, 26 Nov 2011 19:55:23 +0000 (19:55 +0000)]
net/mlx4_en: bug fix for the case of vlan id 0 and UP 0

When using vlan 0 and UP 0, the vlan header wasn't placed.

Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet/mlx4_en: adding loopback support
Amir Vadai [Sat, 26 Nov 2011 19:55:19 +0000 (19:55 +0000)]
net/mlx4_en: adding loopback support

Device must be in promiscuous mode or DMAC must be same as the host MAC, or
else packet will be dropped by the HW rx filtering.

Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet/mlx4_en: fix WOL handlers were always looking at port2 capability bit
Oren Duer [Sat, 26 Nov 2011 19:55:15 +0000 (19:55 +0000)]
net/mlx4_en: fix WOL handlers were always looking at port2 capability bit

There are 2 capability bits for WOL, one for each port.
WOL handlers were looking only at the second bit, regardless of the port.

Signed-off-by: Oren Duer <oren@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
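
Selecting the capability bit by port, as the WOL fix above describes, can be
sketched as follows. The bit positions and the names CAP_WOL_PORT1,
CAP_WOL_PORT2 and port_supports_wol() are hypothetical and chosen only for
illustration; they are not taken from the mlx4 headers.

```c
#include <stdint.h>

/* Hypothetical capability bits, one per port. */
#define CAP_WOL_PORT1 (1ULL << 37)
#define CAP_WOL_PORT2 (1ULL << 38)

/* Pick the mask that belongs to the port instead of always using port 2. */
static int port_supports_wol(uint64_t caps, int port)
{
	uint64_t mask = (port == 1) ? CAP_WOL_PORT1 : CAP_WOL_PORT2;

	return (caps & mask) != 0;
}
```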
12 years agonet/mlx4_en: using non collapsed CQ on TX
Yevgeny Petrilin [Sat, 26 Nov 2011 19:55:10 +0000 (19:55 +0000)]
net/mlx4_en: using non collapsed CQ on TX

Move to the regular (not collapsed) Completion Queue implementation.
The completion for each transmitted packet is written to a new entry.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet/mlx4_en: fix sparse warning on a cast which truncates bits from constant value
Or Gerlitz [Sat, 26 Nov 2011 19:55:06 +0000 (19:55 +0000)]
net/mlx4_en: fix sparse warning on a cast which truncates bits from constant value

The MLX4_EN_WOL_DO_MODIFY flag, which is defined through an enum, targets
bit 63. This triggers a "cast truncate bits from constant value
(8000000000000000 becomes 0)" warning from sparse. Fix that by using
a define instead of an enum.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet/mlx4: fix UDP RSS related settings
Or Gerlitz [Sat, 26 Nov 2011 19:55:02 +0000 (19:55 +0000)]
net/mlx4: fix UDP RSS related settings

Using RSS which takes into account UDP headers is controlled by
a module param. Fix the setting of the HW RSS context to align
with that scheme. So far it was unconditionally allowing hashing
on the UDP headers.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet/mlx4: move RSS related definitions to be global
Or Gerlitz [Sat, 26 Nov 2011 19:54:58 +0000 (19:54 +0000)]
net/mlx4: move RSS related definitions to be global

Towards adding RSS support for IB drivers/applications that use
the mlx4 HW, make the RSS related definitions global and change
the mlx4_en driver to use them.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoisdn/gigaset: report ISDN4Linux interface only once
Tilman Schmidt [Sun, 27 Nov 2011 07:39:22 +0000 (07:39 +0000)]
isdn/gigaset: report ISDN4Linux interface only once

Move the "ISDN4Linux interface" message from device registration,
where it is emitted for each device, to driver registration, where
it is emitted only once, for consistency with the CAPI variant.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>