firefly-linux-kernel-4.4.55.git
14 years agoigb: add registers etc. printout code just before resetting adapters
Taku Izumi [Tue, 27 Apr 2010 14:39:30 +0000 (14:39 +0000)]
igb: add registers etc. printout code just before resetting adapters

This patch adds registers (,tx/rx rings' status and so on) printout
code just before resetting adapters. This will be helpful for detecting
the root cause of adapters reset.

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoe1000e: add registers etc. printout code just before resetting adapters
Taku Izumi [Tue, 27 Apr 2010 14:39:08 +0000 (14:39 +0000)]
e1000e: add registers etc. printout code just before resetting adapters

This patch adds registers (,tx/rx rings' status and so on) printout
code just before resetting adapters. This will be helpful for detecting
the root cause of adapters reset.

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoe1000: Use netdev_<level>, pr_<level> and dev_<level>
Emil Tantilov [Tue, 27 Apr 2010 14:02:58 +0000 (14:02 +0000)]
e1000: Use netdev_<level>, pr_<level> and dev_<level>

This patch is an alternative to similar patch provided by Joe Perches.

Substitute DPRINTK macro for e_<level> that uses netdev_<level> and dev_<level>
similar to e1000e.
- Convert printk to pr_<level> where applicable.
- Use common #define pr_fmt for the driver.
- Use dev_<level> for displaying text in parts of the driver where the interface
  name is not assigned (like e1000_param.c).
- Better align test with the new macros.

CC: Joe Perches <joe@perches.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoRevert "bridge: Use hlist_for_each_entry_rcu() in br_multicast_add_router()"
David S. Miller [Tue, 27 Apr 2010 23:49:58 +0000 (16:49 -0700)]
Revert "bridge: Use hlist_for_each_entry_rcu() in br_multicast_add_router()"

This reverts commit ff65e8275f6c96a5eda57493bd84c4555decf7b3.

As explained by Stephen Hemminger, the traversal doesn't require
RCU handling as we hold a lock.

The list addition et al. calls, on the other hand, do.

Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoixgbevf: use DMA API instead of PCI DMA functions
Nick Nunley [Tue, 27 Apr 2010 13:10:50 +0000 (13:10 +0000)]
ixgbevf: use DMA API instead of PCI DMA functions

Signed-off-by: Nicholas Nunley <nicholasx.d.nunley@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoixgbe: use DMA API instead of PCI DMA functions
Nick Nunley [Tue, 27 Apr 2010 13:10:27 +0000 (13:10 +0000)]
ixgbe: use DMA API instead of PCI DMA functions

Signed-off-by: Nicholas Nunley <nicholasx.d.nunley@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoixgb: use DMA API instead of PCI DMA functions
Nick Nunley [Tue, 27 Apr 2010 13:10:03 +0000 (13:10 +0000)]
ixgb: use DMA API instead of PCI DMA functions

Signed-off-by: Nicholas Nunley <nicholasx.d.nunley@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoigbvf: use DMA API instead of PCI DMA functions
Nick Nunley [Tue, 27 Apr 2010 13:09:44 +0000 (13:09 +0000)]
igbvf: use DMA API instead of PCI DMA functions

Signed-off-by: Nicholas Nunley <nicholasx.d.nunley@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoigb: convert igb from using PCI DMA functions to using DMA API functions
Alexander Duyck [Tue, 27 Apr 2010 13:09:25 +0000 (13:09 +0000)]
igb: convert igb from using PCI DMA functions to using DMA API functions

This patch makes it so that igb now uses the DMA API functions instead of
the PCI API functions.  To do this the pci_dev pointer that was in the
rings has been replaced with a device pointer, and as a result all
references to [tr]x_ring->pdev have been replaced with [tr]x_ring->dev.

This patch is based of of work originally done by Nicholas Nunley.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoe1000e: use DMA API instead of PCI DMA functions
Nick Nunley [Tue, 27 Apr 2010 13:09:05 +0000 (13:09 +0000)]
e1000e: use DMA API instead of PCI DMA functions

Signed-off-by: Nicholas Nunley <nicholasx.d.nunley@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoe1000: use DMA API instead of PCI DMA functions
Nick Nunley [Tue, 27 Apr 2010 13:08:45 +0000 (13:08 +0000)]
e1000: use DMA API instead of PCI DMA functions

Signed-off-by: Nicholas Nunley <nicholasx.d.nunley@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobridge: Use hlist_for_each_entry_rcu() in br_multicast_add_router()
David S. Miller [Tue, 27 Apr 2010 23:26:49 +0000 (16:26 -0700)]
bridge: Use hlist_for_each_entry_rcu() in br_multicast_add_router()

Noticed by Michał Mirosław.

Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4: set skb->rxhash
Dimitris Michailidis [Tue, 27 Apr 2010 23:22:42 +0000 (16:22 -0700)]
cxgb4: set skb->rxhash

Implement the ->set_flags ethtool method to control NETIF_F_RXHASH and
set skb->rxhash to the HW calculated hash accordingly.

Follow Eric Dumazet's suggestion and use the hash value raw.

Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet/sb1250: setup the pdevice within the soc code
Sebastian Andrzej Siewior [Tue, 27 Apr 2010 22:54:50 +0000 (15:54 -0700)]
net/sb1250: setup the pdevice within the soc code

doing it within the driver does not look good.
And surely isn't how platform devices were meat to be used.

Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet/sb1250: remove CONFIG_SIBYTE_STANDALONE
Sebastian Andrzej Siewior [Tue, 27 Apr 2010 22:53:50 +0000 (15:53 -0700)]
net/sb1250: remove CONFIG_SIBYTE_STANDALONE

CONFIG_SIBYTE_STANDALONE is gone since v2.6.31-rc1 ("MIPS: Sibyte:
Remove standalone kernel support")
This is a missing piece.

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: disallow to use net_assign_generic externally
Jiri Pirko [Fri, 23 Apr 2010 01:40:47 +0000 (01:40 +0000)]
net: disallow to use net_assign_generic externally

Now there's no need to use this fuction directly because it's handled by
register_pernet_device. So to make this simple and easy to understand,
make this static to do not tempt potentional users.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4: increase serial number length
Dimitris Michailidis [Tue, 27 Apr 2010 12:24:16 +0000 (12:24 +0000)]
cxgb4: increase serial number length

Some boards have longer serial numbers in their VPD, up to 24 bytes.

Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb4: parse the VPD instead of relying on a static VPD layout
Dimitris Michailidis [Tue, 27 Apr 2010 12:24:15 +0000 (12:24 +0000)]
cxgb4: parse the VPD instead of relying on a static VPD layout

Some boards' VPDs contain additional keywords or have longer serial numbers,
meaning the keyword locations are variable.  Ditch the static layout and
use the pci_vpd_* family of functions to parse the VPD instead.

Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: sk_add_backlog() take rmem_alloc into account
Eric Dumazet [Tue, 27 Apr 2010 22:13:20 +0000 (15:13 -0700)]
net: sk_add_backlog() take rmem_alloc into account

Current socket backlog limit is not enough to really stop DDOS attacks,
because user thread spend many time to process a full backlog each
round, and user might crazy spin on socket lock.

We should add backlog size and receive_queue size (aka rmem_alloc) to
pace writers, and let user run without being slow down too much.

Introduce a sk_rcvqueues_full() helper, to avoid taking socket lock in
stress situations.

Under huge stress from a multiqueue/RPS enabled NIC, a single flow udp
receiver can now process ~200.000 pps (instead of ~100 pps before the
patch) on a 8 core machine.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: batch skb dequeueing from softnet input_pkt_queue
Changli Gao [Tue, 27 Apr 2010 22:07:33 +0000 (15:07 -0700)]
net: batch skb dequeueing from softnet input_pkt_queue

batch skb dequeueing from softnet input_pkt_queue to reduce potential lock
contention when RPS is enabled.

Note: in the worst case, the number of packets in a softnet_data may
be double of netdev_max_backlog.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: Make RFS socket operations not be inet specific.
David S. Miller [Tue, 27 Apr 2010 22:05:31 +0000 (15:05 -0700)]
net: Make RFS socket operations not be inet specific.

Idea from Eric Dumazet.

As for placement inside of struct sock, I tried to choose a place
that otherwise has a 32-bit hole on 64-bit systems.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
14 years agoixgbe: Properly display 1 gig downshift warning for backplane
Anjali Singhai [Tue, 27 Apr 2010 11:31:25 +0000 (11:31 +0000)]
ixgbe: Properly display 1 gig downshift warning for backplane

Description: When using Intel smartspeed, the patch displays a
warning when the link down shifts to 1 Gig.

Signed-off-by: Anjali Singhai <anjali.singhai@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoixgbe: cleanup ethtool autoneg input
Don Skidmore [Tue, 27 Apr 2010 11:31:06 +0000 (11:31 +0000)]
ixgbe: cleanup ethtool autoneg input

The way we were setting autoneg via ethtool was inconstant with that
of our other drivers.  It will change the following:

If autoneg is off:
>ethtool -a eth0
Pause parameters for eth0:

Autonegotiate:  off
RX:             off
TX:             off

Before:
>ethtool -A eth0 autoneg on
>ethtool -a eth0
Pause parameters for eth0:

Autonegotiate:  off
RX:             off
TX:             off

Now:
>ethtool -A eth0 autoneg on
>ethtool -a eth0
Pause parameters for eth0:

Autonegotiate:  on
RX:             on
TX:             on

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoixgbevf: Fix link speed display
Greg Rose [Tue, 27 Apr 2010 11:31:45 +0000 (11:31 +0000)]
ixgbevf: Fix link speed display

The ixgbevf driver would always report 10Gig speeds even when the link
speed is downshifted to 1Gig.  This patch fixes that problem.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: reimplement softnet_data.output_queue as a FIFO queue
Changli Gao [Mon, 26 Apr 2010 23:06:24 +0000 (23:06 +0000)]
net: reimplement softnet_data.output_queue as a FIFO queue

reimplement softnet_data.output_queue as a FIFO queue to keep the
fairness among the qdiscs rescheduled.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
----
 include/linux/netdevice.h |    1 +
 net/core/dev.c            |   22 ++++++++++++----------
 2 files changed, 13 insertions(+), 10 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/ipmr-2.6
David S. Miller [Tue, 27 Apr 2010 19:57:39 +0000 (12:57 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/kaber/ipmr-2.6

14 years agoixgb: Use pr_<level> and netdev_<level>
Joe Perches [Tue, 27 Apr 2010 00:50:58 +0000 (00:50 +0000)]
ixgb: Use pr_<level> and netdev_<level>

Convert DEBUGOUTx to pr_debug
Convert DEBUGFUNC to more commonly used ENTER
Convert mac address output to %pM
Use #define pr_fmt
Convert a few printks to pr_<level>
Improve ixgb_mc_addr_list_update: use a temporary for current mc address
Use etherdevice.h functions for mac address testing

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoixgbe: ixgbe_down needs to stop dev_watchdog
John Fastabend [Tue, 27 Apr 2010 02:13:39 +0000 (02:13 +0000)]
ixgbe: ixgbe_down needs to stop dev_watchdog

There is a small race between when the tx queues are stopped
and when netif_carrier_off() is called in ixgbe_down.  If the
dev_watchdog() timer fires during this time it is possible for
a false tx timeout to occur.

This patch moves the netif_carrier_off() so that it is called before
the tx queues are stopped preventing the dev_watchdog timer from
detecting false tx timeouts.  The race is seen occosionally when
FCoE or DCB settings are being configured or changed.

Testing note, running ifconfig up/down will not reproduce this
issue because dev_open/dev_close call dev_deactivate() and then
dev_activate().

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoigb: add support for reporting 5GT/s during probe on PCIe Gen2
Alexander Duyck [Tue, 27 Apr 2010 01:02:40 +0000 (01:02 +0000)]
igb: add support for reporting 5GT/s during probe on PCIe Gen2

This change corrects the fact that we were not reporting Gen2 link speeds
when we were in fact connected at Gen2 rates.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoixgbe: fix bug when EITR=0 causing no writebacks
Jesse Brandeburg [Tue, 27 Apr 2010 01:37:41 +0000 (01:37 +0000)]
ixgbe: fix bug when EITR=0 causing no writebacks

writebacks can be held indefinitely by hardware if EITR=0, when
combined with TXDCTL.WTHRESH=8.  When EITR=0, WTHRESH should be
set back to zero.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoixgbe: enable extremely low latency
Jesse Brandeburg [Tue, 27 Apr 2010 01:37:20 +0000 (01:37 +0000)]
ixgbe: enable extremely low latency

82598/82599 can support EITR == 0, which allows for the
absolutely lowest latency setting in the hardware.  This disables
writeback batching and anything else that relies upon a delayed
interrupt. This patch enables the feature of "override" when a
user sets rx-usecs to zero, the driver will respect that setting
over using RSC, and automatically disable RSC.  If rx-usecs is
used to set the EITR value to 0, then the driver should disable
LRO (aka RSC) internally until EITR is set to non-zero again.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoigbvf: double increment nr_frags
Koki Sanagi [Tue, 27 Apr 2010 01:01:39 +0000 (01:01 +0000)]
igbvf: double increment nr_frags

There is no need to increment nr_frags because skb_fill_page_desc increments
it.

Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoigb: double increment nr_frags
Koki Sanagi [Tue, 27 Apr 2010 01:01:19 +0000 (01:01 +0000)]
igb: double increment nr_frags

There is no need to increment nr_frags because skb_fill_page_desc increments
it.

Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: fix a lockdep rcu warning in __sk_dst_set()
Eric Dumazet [Mon, 26 Apr 2010 20:40:43 +0000 (20:40 +0000)]
net: fix a lockdep rcu warning in __sk_dst_set()

__sk_dst_set() might be called while no state can be integrated in a
rcu_dereference_check() condition.

So use rcu_dereference_raw() to shutup lockdep warnings (if
CONFIG_PROVE_RCU is set)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agorps: inet_rps_save_rxhash() argument is not const
Eric Dumazet [Tue, 27 Apr 2010 02:42:51 +0000 (02:42 +0000)]
rps: inet_rps_save_rxhash() argument is not const

const qualifier on sock argument is misleading, since we can modify rxhash.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoTCP: avoid to send keepalive probes if receiving data
Flavio Leitner [Mon, 26 Apr 2010 18:33:27 +0000 (18:33 +0000)]
TCP: avoid to send keepalive probes if receiving data

RFC 1122 says the following:
...
  Keep-alive packets MUST only be sent when no data or
  acknowledgement packets have been received for the
  connection within an interval.
...

The acknowledgement packet is reseting the keepalive
timer but the data packet isn't. This patch fixes it by
checking the timestamp of the last received data packet
too when the keepalive timer expires.

Signed-off-by: Flavio Leitner <fleitner@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobridge: multicast router list manipulation
stephen hemminger [Tue, 27 Apr 2010 07:13:11 +0000 (07:13 +0000)]
bridge: multicast router list manipulation

I prefer that the hlist be only accessed through the hlist macro
objects. Explicit twiddling of links (especially with RCU) exposes
the code to future bugs.

Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobridge: use is_multicast_ether_addr
stephen hemminger [Tue, 27 Apr 2010 07:13:06 +0000 (07:13 +0000)]
bridge: use is_multicast_ether_addr

Use existing inline function.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoMerge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
David S. Miller [Tue, 27 Apr 2010 19:49:13 +0000 (12:49 -0700)]
Merge branch 'master' of /linux/kernel/git/davem/net-2.6

Conflicts:
drivers/net/e100.c
drivers/net/e1000e/netdev.c

14 years agonet: suppress RCU lockdep false positive in twsk_net()
Paul E. McKenney [Tue, 27 Apr 2010 06:22:01 +0000 (06:22 +0000)]
net: suppress RCU lockdep false positive in twsk_net()

Calls to twsk_net() are in some cases protected by reference counting
as an alternative to RCU protection.  Cases covered by reference counts
include __inet_twsk_kill(), inet_twsk_free(), inet_twdr_do_twkill_work(),
inet_twdr_twcal_tick(), and tcp_timewait_state_process().  RCU is used
by inet_twsk_purge().  Locking is used by established_get_first()
and established_get_next().  Finally, __inet_twsk_hashdance() is an
initialization case.

It appears to be non-trivial to locate the appropriate locks and
reference counts from within twsk_net(), so used rcu_dereference_raw().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agocxgb3: Wait longer for control packets on initialization
Andre Detsch [Mon, 26 Apr 2010 05:38:27 +0000 (05:38 +0000)]
cxgb3: Wait longer for control packets on initialization

In some Power7 platforms, when using VIOS (Virtual I/O Server), we
need to wait longer for control packets to finish transfer during
initialization.
Without this change, initialization may fail prematurely.

Signed-off-by: Wen Xiong <wenxiong@us.ibm.com>
Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoe1000e: enable/disable ASPM L0s and L1 and ERT according to hardware errata
Bruce Allan [Tue, 27 Apr 2010 03:33:04 +0000 (03:33 +0000)]
e1000e: enable/disable ASPM L0s and L1 and ERT according to hardware errata

Prompted by a previous patch submitted by Matthew Garret <mjg@redhat.com>,
further digging into errata documentation reveals the current enabling or
disabling of ASPM L0s and L1 states for certain parts supported by this
driver are incorrect.  82571 and 82572 should always disable L1.  For
standard frames, 82573/82574/82583 can enable L1 but L0s must be disabled,
and for jumbo frames 82573/82574 must disable L1.  This allows for some
parts to enable L1 in certain configurations leading to better power
savings.

Also according to the same errata, Early Receive (ERT) should be disabled
on 82573 when using jumbo frames.

Cc: Matthew Garret <mjg@redhat.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoixgbe: Power down PHY during driver resets
Peter Waskiewicz [Tue, 27 Apr 2010 00:38:15 +0000 (00:38 +0000)]
ixgbe: Power down PHY during driver resets

The PHY laser is still on during driver init.  It's allowing
garbage to hit our FIFO, which eventually can cause the entire
device to die.  Power down the laser while setting up the device,
and re-enable the laser before getting link.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobridge: Fix build of ipv6 multicast code.
David S. Miller [Tue, 27 Apr 2010 17:16:54 +0000 (10:16 -0700)]
bridge: Fix build of ipv6 multicast code.

Based upon a report from Stephen Rothwell:

--------------------
net/bridge/br_multicast.c: In function 'br_ip6_multicast_alloc_query':
net/bridge/br_multicast.c:469: error: implicit declaration of function 'csum_ipv6_magic'

Introduced by commit 08b202b6726459626c73ecfa08fcdc8c3efc76c2 ("bridge
br_multicast: IPv6 MLD support") from the net tree.

csum_ipv6_magic is declared in net/ip6_checksum.h ...
--------------------

Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agor8169: more broken register writes workaround
françois romieu [Mon, 26 Apr 2010 11:42:58 +0000 (11:42 +0000)]
r8169: more broken register writes workaround

78f1cd02457252e1ffbc6caa44a17424a45286b8 ("fix broken register writes")
does not work for Al Viro's r8169 (XID 18000000).

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agor8169: failure to enable mwi should not be fatal
françois romieu [Mon, 26 Apr 2010 11:42:06 +0000 (11:42 +0000)]
r8169: failure to enable mwi should not be fatal

Few (6) network drivers enable mwi explicitly. Fewer worry about a
failure.

It is not a fix but it should avoid some annoyance like
http://bugzilla.kernel.org/show_bug.cgi?id=15454

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Conrad Kostecki <conikost@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobridge br_multicast: Ensure to initialize BR_INPUT_SKB_CB(skb)->mrouters_only.
YOSHIFUJI Hideaki / 吉藤英明 [Sun, 25 Apr 2010 08:06:40 +0000 (08:06 +0000)]
bridge br_multicast: Ensure to initialize BR_INPUT_SKB_CB(skb)->mrouters_only.

Even with commit 32dec5dd0233ebffa9cae25ce7ba6daeb7df4467 ("bridge
br_multicast: Don't refer to BR_INPUT_SKB_CB(skb)->mrouters_only
without IGMP snooping."), BR_INPUT_SKB_CB(skb)->mrouters_only is
not appropriately initialized if IGMP/MLD snooping support is
compiled and disabled, so we can see garbage.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobridge br_multicast: Ensure to initialize BR_INPUT_SKB_CB(skb)->mrouters_only.
YOSHIFUJI Hideaki / 吉藤英明 [Sun, 25 Apr 2010 08:59:07 +0000 (08:59 +0000)]
bridge br_multicast: Ensure to initialize BR_INPUT_SKB_CB(skb)->mrouters_only.

Even with commit 32dec5dd0233ebffa9cae25ce7ba6daeb7df4467 ("bridge
br_multicast: Don't refer to BR_INPUT_SKB_CB(skb)->mrouters_only
without IGMP snooping."), BR_INPUT_SKB_CB(skb)->mrouters_only is
not appropriately initialized if IGMP snooping support is
compiled and disabled, so we can see garbage.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoieee802154: Fix oops during ieee802154_sock_ioctl
Stefan Schmidt [Mon, 26 Apr 2010 18:20:32 +0000 (11:20 -0700)]
ieee802154: Fix oops during ieee802154_sock_ioctl

Trying to run izlisten (from lowpan-tools tests) on a device that does not
exists I got the oops below. The problem is that we are using get_dev_by_name
without checking if we really get a device back. We don't in this case and
writing to dev->type generates this oops.

[Oops code removed by Dmitry Eremin-Solenikov]

If possible this patch should be applied to the current -rc fixes branch.

Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: use sk_sleep()
Eric Dumazet [Sun, 25 Apr 2010 22:20:06 +0000 (22:20 +0000)]
net: use sk_sleep()

Commit aa395145 (net: sk_sleep() helper) missed three files in the
conversion.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agopppoe: use pppoe_pernet instead of directly net_generic
Jiri Pirko [Mon, 26 Apr 2010 01:46:12 +0000 (01:46 +0000)]
pppoe: use pppoe_pernet instead of directly net_generic

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agophonet: use phonet_pernet instead of directly net_generic
Jiri Pirko [Mon, 26 Apr 2010 03:41:00 +0000 (03:41 +0000)]
phonet: use phonet_pernet instead of directly net_generic

As in for example pppoe introduce phonet_pernet and use it instead of calling
net_generic directly.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agotg3: Fix INTx fallback when MSI fails
Andre Detsch [Mon, 26 Apr 2010 07:27:07 +0000 (07:27 +0000)]
tg3: Fix INTx fallback when MSI fails

tg3: Fix INTx fallback when MSI fails

MSI setup changes the value of irq_vec in struct tg3 *tp.
This attribute must be taken into account and restored before
we try to do a new request_irq for INTx fallback.

In powerpc, the original code was leading to an EINVAL return within
request_irq, because the driver was trying to use the disabled MSI
virtual irq number instead of tp->pdev->irq.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: ipmr: add support for dumping routing tables over netlink
Patrick McHardy [Mon, 26 Apr 2010 14:02:08 +0000 (16:02 +0200)]
net: ipmr: add support for dumping routing tables over netlink

The ipmr /proc interface (ip_mr_cache) can't be extended to dump routes
from any tables but the main table in a backwards compatible fashion since
the output format ends in a variable amount of output interfaces.

Introduce a new netlink interface to dump multicast routes from all tables,
similar to the netlink interface for regular routes.

Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agonet: rtnetlink: decouple rtnetlink address families from real address families
Patrick McHardy [Mon, 26 Apr 2010 14:02:05 +0000 (16:02 +0200)]
net: rtnetlink: decouple rtnetlink address families from real address families

Decouple rtnetlink address families from real address families in socket.h to
be able to add rtnetlink interfaces to code that is not a real address family
without increasing AF_MAX/NPROTO.

This will be used to add support for multicast route dumping from all tables
as the proc interface can't be extended to support anything but the main table
without breaking compatibility.

This partialy undoes the patch to introduce independant families for routing
rules and converts ipmr routing rules to a new rtnetlink family. Similar to
that patch, values up to 127 are reserved for real address families, values
above that may be used arbitrarily.

Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agonet: fib_rules: mark arguments to fib_rules_register const and __net_initdata
Patrick McHardy [Mon, 26 Apr 2010 14:02:04 +0000 (16:02 +0200)]
net: fib_rules: mark arguments to fib_rules_register const and __net_initdata

fib_rules_register() duplicates the template passed to it without modification,
mark the argument as const. Additionally the templates are only needed when
instantiating a new namespace, so mark them as __net_initdata, which means
they can be discarded when CONFIG_NET_NS=n.

Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agoipv6: Fix inet6_csk_bind_conflict()
Eric Dumazet [Sun, 25 Apr 2010 22:09:42 +0000 (15:09 -0700)]
ipv6: Fix inet6_csk_bind_conflict()

Commit fda48a0d7a84 (tcp: bind() fix when many ports are bound)
introduced a bug on IPV6 part.
We should not call ipv6_addr_any(inet6_rcv_saddr(sk2)) but
ipv6_addr_any(inet6_rcv_saddr(sk)) because sk2 can be IPV4, while sk is
IPV6.

Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonetns: rename unregister_pernet_subsys parameter
Jiri Pirko [Sun, 25 Apr 2010 07:49:56 +0000 (00:49 -0700)]
netns: rename unregister_pernet_subsys parameter

Stay consistent with other functions and with comment also and name
pernet_operations parameter properly.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agorps: optimize rps_get_cpu()
Changli Gao [Sun, 25 Apr 2010 05:50:10 +0000 (22:50 -0700)]
rps: optimize rps_get_cpu()

optimize rps_get_cpu().

don't initialize ports when we can get the ports. one memory access
for ports than two.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoe100: Fix the TX workqueue race
Alan Cox [Sun, 25 Apr 2010 04:09:29 +0000 (21:09 -0700)]
e100: Fix the TX workqueue race

Nothing stops the workqueue being left to run in parallel with close or a
few other operations. This causes double unmaps and the like.

See kerneloops.org #1041230 for an example

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosky2: add support for receive hashing
Stephen Hemminger [Sun, 25 Apr 2010 03:04:12 +0000 (20:04 -0700)]
sky2: add support for receive hashing

Sky2 hardware supports hardware receive hash calculation.
Now that Receive Packet Steering is available, add support
to enable it.

This version does not depend on CONFIG_RPS. Also set_flags rejects
all values except RXHASH, so driver won't have to change next time
somebody adds a new one.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoMerge branch 'net-next-2.6_20100423a/br/br_multicast_v3' of git://git.linux-ipv6...
David S. Miller [Sat, 24 Apr 2010 06:37:24 +0000 (23:37 -0700)]
Merge branch 'net-next-2.6_20100423a/br/br_multicast_v3' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-next

14 years agoIPv6: Complete IPV6_DONTFRAG support
Brian Haley [Fri, 23 Apr 2010 11:26:09 +0000 (11:26 +0000)]
IPv6: Complete IPV6_DONTFRAG support

Finally add support to detect a local IPV6_DONTFRAG event
and return the relevant data to the user if they've enabled
IPV6_RECVPATHMTU on the socket.  The next recvmsg() will
return no data, but have an IPV6_PATHMTU as ancillary data.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoIPv6: Add dontfrag argument to relevant functions
Brian Haley [Fri, 23 Apr 2010 11:26:08 +0000 (11:26 +0000)]
IPv6: Add dontfrag argument to relevant functions

Add dontfrag argument to relevant functions for
IPV6_DONTFRAG support, as well as allowing the value
to be passed-in via ancillary cmsg data.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoIPv6: data structure changes for new socket options
Brian Haley [Fri, 23 Apr 2010 11:26:07 +0000 (11:26 +0000)]
IPv6: data structure changes for new socket options

Add underlying data structure changes and basic setsockopt()
and getsockopt() support for IPV6_RECVPATHMTU, IPV6_PATHMTU,
and IPV6_DONTFRAG.  IPV6_PATHMTU is actually fully functional
at this point.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agol2tp_eth: fix memory allocation
Jiri Pirko [Fri, 23 Apr 2010 01:01:52 +0000 (01:01 +0000)]
l2tp_eth: fix memory allocation

Since .size is set properly in "struct pernet_operations l2tp_eth_net_ops",
allocating space for "struct l2tp_eth_net" by hand is not correct, even causes
memory leakage.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agol2tp: fix memory allocation
Jiri Pirko [Fri, 23 Apr 2010 00:53:39 +0000 (00:53 +0000)]
l2tp: fix memory allocation

Since .size is set properly in "struct pernet_operations l2tp_net_ops",
allocating space for "struct l2tp_net" by hand is not correct, even causes
memory leakage.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agogianfar: Fix potential oops during OF address translation
Anton Vorontsov [Fri, 23 Apr 2010 07:12:44 +0000 (07:12 +0000)]
gianfar: Fix potential oops during OF address translation

gianfar driver may pass NULL pointer to the of_translate_address(),
which may lead to a kernel oops. Fix this by using of_iomap(), which
is also much simpler and shorter.

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agofsl_pq_mdio: Fix kernel oops during OF address translation
Anton Vorontsov [Fri, 23 Apr 2010 07:12:35 +0000 (07:12 +0000)]
fsl_pq_mdio: Fix kernel oops during OF address translation

Old P1020RDB device trees were not specifing tbipa address for
MDIO nodes, which is now causing this kernel oops:

 ...
 eth2: TX BD ring size for Q[6]: 256
 eth2: TX BD ring size for Q[7]: 256
 Unable to handle kernel paging request for data at address 0x00000000
 Faulting instruction address: 0xc0015504
 Oops: Kernel access of bad area, sig: 11 [#1]
 ...
 NIP [c0015504] memcpy+0x3c/0x9c
 LR [c000a9f8] __of_translate_address+0xfc/0x21c
 Call Trace:
 [df839e00] [c000a94c] __of_translate_address+0x50/0x21c (unreliable)
 [df839e50] [c01a33e8] get_gfar_tbipa+0xb0/0xe0
 ...

The old device trees are buggy, though having a dead ethernet is
better than a dead kernel, so fix the issue by using of_iomap().

Also, a somewhat similar issue exist in the probe() routine, though
there the oops is only a possibility. Nonetheless, fix it too.

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoMerge branch 'master' into for-davem
John W. Linville [Fri, 23 Apr 2010 18:43:45 +0000 (14:43 -0400)]
Merge branch 'master' into for-davem

Conflicts:
drivers/net/wireless/ath/ath9k/phy.c
drivers/net/wireless/iwlwifi/iwl-6000.c
drivers/net/wireless/iwlwifi/iwl-debugfs.c

14 years agobnx2x: add support for receive hashing
Tom Herbert [Fri, 23 Apr 2010 07:10:52 +0000 (00:10 -0700)]
bnx2x: add support for receive hashing

Add support to bnx2x to extract Toeplitz hash out of the receive
descriptor for use in skb->rxhash.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobridge br_multicast: IPv6 MLD support.
YOSHIFUJI Hideaki [Thu, 22 Apr 2010 16:54:22 +0000 (01:54 +0900)]
bridge br_multicast: IPv6 MLD support.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
14 years agobridge br_multicast: Make functions less ipv4 dependent.
YOSHIFUJI Hideaki [Sun, 18 Apr 2010 03:42:07 +0000 (12:42 +0900)]
bridge br_multicast: Make functions less ipv4 dependent.

Introduce struct br_ip{} to store ip address and protocol
and make functions more generic so that we can support
both IPv4 and IPv6 with less pain.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
14 years agoipv6 mcast: Introduce include/net/mld.h for MLD definitions.
YOSHIFUJI Hideaki [Sun, 18 Apr 2010 03:42:05 +0000 (12:42 +0900)]
ipv6 mcast: Introduce include/net/mld.h for MLD definitions.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
14 years agotcp: bind() fix when many ports are bound
Eric Dumazet [Wed, 21 Apr 2010 09:26:15 +0000 (09:26 +0000)]
tcp: bind() fix when many ports are bound

Port autoselection done by kernel only works when number of bound
sockets is under a threshold (typically 30000).

When this threshold is over, we must check if there is a conflict before
exiting first loop in inet_csk_get_port()

Change inet_csk_bind_conflict() to forbid two reuse-enabled sockets to
bind on same (address,port) tuple (with a non ANY address)

Same change for inet6_csk_bind_conflict()

Reported-by: Gaspar Chilingarov <gasparch@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosky2: size status ring based on Tx/Rx ring
stephen hemminger [Thu, 22 Apr 2010 13:42:56 +0000 (13:42 +0000)]
sky2: size status ring based on Tx/Rx ring

Sky2 status ring must be big enough to handle worst case number
of status messages. It was being oversized (to handle dual port cards),
and excessive number of tx ring entries were allowed. This patch reduces
the footprint and makes sure the value is enough.

Later patch to add RSS increases the number of possible Rx status elements.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoremove DCB_PROTO_VERSION as we don't do netlink versioning
Scott Feldman [Thu, 22 Apr 2010 14:38:03 +0000 (14:38 +0000)]
remove DCB_PROTO_VERSION as we don't do netlink versioning

remove DCB_PROTO_VERSION as we don't do netlink versioning

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoX25: Update X25 interface documentation
andrew hendry [Mon, 19 Apr 2010 13:30:36 +0000 (13:30 +0000)]
X25: Update X25 interface documentation

Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoX25: Use identifiers for hdlc x25 device to x25 interface
andrew hendry [Mon, 19 Apr 2010 13:30:24 +0000 (13:30 +0000)]
X25: Use identifiers for hdlc x25 device to x25 interface

Change magic numbers to identifiers for X25 interface.

Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Acked-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoX25: Use identifiers for cyclades device to x25 interface
andrew hendry [Mon, 19 Apr 2010 13:30:13 +0000 (13:30 +0000)]
X25: Use identifiers for cyclades device to x25 interface

Change magic numbers to identifiers for X25 interface.

Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoX25: Use identifiers for lapbether device to x25 interface
andrew hendry [Mon, 19 Apr 2010 13:30:02 +0000 (13:30 +0000)]
X25: Use identifiers for lapbether device to x25 interface

Change magic numbers to identifiers for X25 interface.

Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoX25: Use identifiers for x25 async device to x25 interface
andrew hendry [Mon, 19 Apr 2010 13:29:47 +0000 (13:29 +0000)]
X25: Use identifiers for x25 async device to x25 interface

Change magic numbers to identifiers for X25 interface.

Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoX25: Use identifiers for isdn device to x25 interface
andrew hendry [Mon, 19 Apr 2010 13:29:06 +0000 (13:29 +0000)]
X25: Use identifiers for isdn device to x25 interface

Change magic numbers to identifiers for X25 interface.
also minor check patch formatting.

Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Acked-by: Karsten Keil <isdn@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoX25: Add if_x25.h and x25 to device identifiers
Andrew Hendry [Thu, 22 Apr 2010 23:12:36 +0000 (16:12 -0700)]
X25: Add if_x25.h and x25 to device identifiers

V2 Feedback from John Hughes.
- Add header for userspace implementations such as xot/xoe to use
- Use explicit values for interface stability
- No changes to driver patches

V1
- Use identifiers instead of magic numbers for X25 layer 3 to device interface.
- Also fixed checkpatch notes on updated code.

[ Add new user header to include/linux/Kbuild  -DaveM ]

Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agodst: rcu check refinement
Eric Dumazet [Thu, 22 Apr 2010 23:06:59 +0000 (16:06 -0700)]
dst: rcu check refinement

__sk_dst_get() might be called from softirq, with socket lock held.

[  159.026180] include/net/sock.h:1200 invoked rcu_dereference_check()
without protection!
[  159.026261]
[  159.026261] other info that might help us debug this:
[  159.026263]
[  159.026425]
[  159.026426] rcu_scheduler_active = 1, debug_locks = 0
[  159.026552] 2 locks held by swapper/0:
[  159.026609]  #0:  (&icsk->icsk_retransmit_timer){+.-...}, at:
[<ffffffff8104fc15>] run_timer_softirq+0x105/0x350
[  159.026839]  #1:  (slock-AF_INET){+.-...}, at: [<ffffffff81392b8f>]
tcp_write_timer+0x2f/0x1e0
[  159.027063]
[  159.027064] stack backtrace:
[  159.027172] Pid: 0, comm: swapper Not tainted
2.6.34-rc5-03707-gde498c8-dirty #36
[  159.027252] Call Trace:
[  159.027306]  <IRQ>  [<ffffffff810718ef>] lockdep_rcu_dereference
+0xaf/0xc0
[  159.027411]  [<ffffffff8138e4f7>] tcp_current_mss+0xa7/0xb0
[  159.027537]  [<ffffffff8138fa49>] tcp_write_wakeup+0x89/0x190
[  159.027600]  [<ffffffff81391936>] tcp_send_probe0+0x16/0x100
[  159.027726]  [<ffffffff81392cd9>] tcp_write_timer+0x179/0x1e0
[  159.027790]  [<ffffffff8104fca1>] run_timer_softirq+0x191/0x350
[  159.027980]  [<ffffffff810477ed>] __do_softirq+0xcd/0x200

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: Socket filter ancilliary data access for skb->dev->type
Paul LeoNerd Evans [Thu, 22 Apr 2010 03:32:22 +0000 (03:32 +0000)]
net: Socket filter ancilliary data access for skb->dev->type

Add an SKF_AD_HATYPE field to the packet ancilliary data area, giving
access to skb->dev->type, as reported in the sll_hatype field.

When capturing packets on a PF_PACKET/SOCK_RAW socket bound to all
interfaces, there doesn't appear to be a way for the filter program to
actually find out the underlying hardware type the packet was captured
on. This patch adds such ability.

This patch also handles the case where skb->dev can be NULL, such as on
netlink sockets.

Signed-off-by: Paul Evans <leonerd@leonerd.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agotcp: fix outsegs stat for TSO segments
Tom Herbert [Thu, 22 Apr 2010 07:00:24 +0000 (07:00 +0000)]
tcp: fix outsegs stat for TSO segments

Account for TSO segments of an skb in TCP_MIB_OUTSEGS counter.  Without
doing this, the counter can be off by orders of magnitude from the
actual number of segments sent.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agordma: potential ERR_PTR dereference
Dan Carpenter [Wed, 21 Apr 2010 23:55:27 +0000 (23:55 +0000)]
rdma: potential ERR_PTR dereference

In the original code, the "goto out" calls "rdma_destroy_id(cm_id);"
That isn't needed here and would cause problems because "cm_id" is an
ERR_PTR.  The new code just returns directly.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agortnetlink: potential ERR_PTR dereference
Dan Carpenter [Wed, 21 Apr 2010 23:53:27 +0000 (23:53 +0000)]
rtnetlink: potential ERR_PTR dereference

In the original code, if rtnl_create_link() returned an ERR_PTR then that
would get passed to rtnl_configure_link() which dereferences it.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoniu: Add skb->rxhash support.
David S. Miller [Thu, 22 Apr 2010 22:48:17 +0000 (15:48 -0700)]
niu: Add skb->rxhash support.

Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: update version 5.0.2
Amit Kumar Salecha [Thu, 22 Apr 2010 02:51:42 +0000 (02:51 +0000)]
qlcnic: update version 5.0.2

Update version to indicate IDC(fw recovery) changes.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: protect resource access
Amit Kumar Salecha [Thu, 22 Apr 2010 02:51:41 +0000 (02:51 +0000)]
qlcnic: protect resource access

We do netif_device_attach, even if resource allocation fails.
Driver callbacks can be called, if device is attached.
All these callbacks need to be protected by ADAPTER_UP_MAGIC check.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: fix rcv buffer leak
Amit Kumar Salecha [Thu, 22 Apr 2010 02:51:40 +0000 (02:51 +0000)]
qlcnic: fix rcv buffer leak

Rcv producer value should be read in spin-lock.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: fix pci semaphore checks
Amit Kumar Salecha [Thu, 22 Apr 2010 02:51:39 +0000 (02:51 +0000)]
qlcnic: fix pci semaphore checks

Driver should not go ahead with fw recovery if fails to acquire
semaphore.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: define macro for driver state
Amit Kumar Salecha [Thu, 22 Apr 2010 02:51:38 +0000 (02:51 +0000)]
qlcnic: define macro for driver state

Defining macro to set and clear driver state.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: fix fw initialization responsibility
Amit Kumar Salecha [Thu, 22 Apr 2010 02:51:37 +0000 (02:51 +0000)]
qlcnic: fix fw initialization responsibility

Now any pci-func can start fw, whoever sees the reset ack first.
Before this, pci-func which sets the RESET state has the responsibility
to start fw.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: fix defines as per IDC document
Amit Kumar Salecha [Thu, 22 Apr 2010 02:51:36 +0000 (02:51 +0000)]
qlcnic: fix defines as per IDC document

Different class of drivers co-exist for CNA device,
there is some minimal interaction that will be required amongst
the drivers for performing some device level operations.

All the driver should follow inter driver coexistence document.

Fixing polling interval and spelling mistake.

Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlcnic: additional driver statistics
Amit Kumar Salecha [Thu, 22 Apr 2010 02:51:35 +0000 (02:51 +0000)]
qlcnic: additional driver statistics

Added additional driver statistics to track errors in rcv/tx path.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoIPv6: Generic TTL Security Mechanism (final version)
Stephen Hemminger [Thu, 22 Apr 2010 22:24:53 +0000 (15:24 -0700)]
IPv6: Generic TTL Security Mechanism (final version)

This patch adds IPv6 support for RFC5082 Generalized TTL Security Mechanism.

Not to users of mapped address; the IPV6 and IPV4 socket options are seperate.
The server does have to deal with both IPv4 and IPv6 socket options
and the client has to handle the different for each family.

On client:
int ttl = 255;
getaddrinfo(argv[1], argv[2], &hint, &result);

for (rp = result; rp != NULL; rp = rp->ai_next) {
s = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
if (s < 0) continue;

if (rp->ai_family == AF_INET) {
setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));
} else if (rp->ai_family == AF_INET6) {
setsockopt(s, IPPROTO_IPV6,  IPV6_UNICAST_HOPS,
&ttl, sizeof(ttl)))
}

if (connect(s, rp->ai_addr, rp->ai_addrlen) == 0) {
   ...

On server:
int minttl = 255 - maxhops;

getaddrinfo(NULL, port, &hints, &result);
for (rp = result; rp != NULL; rp = rp->ai_next) {
s = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
if (s < 0) continue;

if (rp->ai_family == AF_INET6)
setsockopt(s, IPPROTO_IPV6,  IPV6_MINHOPCOUNT,
&minttl, sizeof(minttl));
setsockopt(s, IPPROTO_IP, IP_MINTTL, &minttl, sizeof(minttl));

if (bind(s, rp->ai_addr, rp->ai_addrlen) == 0)
break
...

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: Orphan and de-dst skbs earlier in xmit path.
David S. Miller [Thu, 22 Apr 2010 08:02:07 +0000 (01:02 -0700)]
net: Orphan and de-dst skbs earlier in xmit path.

This way GSO packets don't get handled differently.

With help from Eric Dumazet.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>