Ursula Braun [Thu, 12 Nov 2009 21:46:29 +0000 (21:46 +0000)]
s390: remove cu3088 layer for lcs and ctcm
The cu3088-driver used as common base for lcs- and ctcm-devices
makes it difficult to assign the appropriate driver to an lcs-device
or a ctcm-device. This patch eliminates the cu3088-driver and thus
the root device "cu3088". Path /sys/devices/cu3088 is replaced with
the pathes /sys/devices/lcs and /sys/devices/ctcm.
Patch is based on a proposal from Cornelia Huck.
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Frank Blaschka [Thu, 12 Nov 2009 21:46:28 +0000 (21:46 +0000)]
ctcm: suspend has to wait for outstanding I/O
State transition to DEV_STATE_STOPPED indicates all outstanding I/O has
finished. Add wait queue to wait for this state.
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Thu, 12 Nov 2009 21:46:27 +0000 (21:46 +0000)]
iucv: add work_queue cleanup for suspend
If iucv_work_queue is not empty during kernel freeze, a kernel panic
occurs. This suspend-patch adds flushing of the work queue for
pending connection requests and severing of remaining pending
connections.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 12 Nov 2009 09:33:09 +0000 (09:33 +0000)]
inetpeer: Optimize inet_getid()
While investigating for network latencies, I found inet_getid() was a
contention point for some workloads, as inet_peer_idlock is shared
by all inet_getid() users regardless of peers.
One way to fix this is to make ip_id_count an atomic_t instead
of __u16, and use atomic_add_return().
In order to keep sizeof(struct inet_peer) = 64 on 64bit arches
tcp_ts_stamp is also converted to __u32 instead of "unsigned long".
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 12 Nov 2009 04:11:50 +0000 (04:11 +0000)]
ipv6: speedup inet6_dump_addr()
When handling large number of netdevices, inet6_dump_addr()
is very slow because it has O(N^2) complexity.
Instead of scanning one single list, we can use the NETDEV_HASHENTRIES
sub lists of the dev_index hash table, and RCU lookups.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lucian Adrian Grijincu [Thu, 12 Nov 2009 05:07:26 +0000 (05:07 +0000)]
inet: fix inet_bind_bucket_for_each
The first "node" is supposed to be the cursor used in the for_each.
The second "node" is ment literally and should not be macro expanded:
it's the name of the hlist_node field from the inet_bind_bucket.
This currently works because when inet_bind_bucket_for_each is called
it's argument is still "node".
Signed-off-by: Lucian Adrian Grijincu <lgrijincu@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 12 Nov 2009 07:44:25 +0000 (07:44 +0000)]
ipv4: speedup inet_dump_ifaddr()
Stephen Hemminger a écrit :
> On Thu, 12 Nov 2009 15:11:36 +0100
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>> When handling large number of netdevices, inet_dump_ifaddr()
>> is very slow because it has O(N^2) complexity.
>>
>> Instead of scanning one single list, we can use the NETDEV_HASHENTRIES
>> sub lists of the dev_index hash table, and RCU lookups.
>>
>> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>
> You might be able to make RCU critical section smaller by moving
> it into loop.
>
Indeed. But we dump at most one skb (<= 8192 bytes ?), so rcu_read_lock
holding time is small, unless we meet many netdevices without
addresses. I wonder if its really common...
Thanks
[PATCH net-next-2.6] ipv4: speedup inet_dump_ifaddr()
When handling large number of netdevices, inet_dump_ifaddr()
is very slow because it has O(N2) complexity.
Instead of scanning one single list, we can use the NETDEV_HASHENTRIES
sub lists of the dev_index hash table, and RCU lookups.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PJ Waskiewicz [Thu, 12 Nov 2009 23:50:43 +0000 (23:50 +0000)]
ixgbe: Make queue pairs on single MSI-X interrupts
This patch pairs similar-numbered Rx and Tx queues onto a single
MSI-X vector. For example, Tx queue 0 and Rx queue 0's interrupt
with be ethX-RxTx-0. This allows for more efficient cleanup, since
fewer interrupts will be firing during device operation. It also
helps with a cleaner CPU affinity for IRQ affinity.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nelson, Shannon [Thu, 12 Nov 2009 18:47:11 +0000 (18:47 +0000)]
ixgbe: Flush the LSC mask change to prevent repeated interrupts
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Thu, 12 Nov 2009 18:38:35 +0000 (18:38 +0000)]
igb: only recycle page if it is on our numa node
This patch makes it so that we only recycle pages when they are from the
local NUMA node. Non-local pages are freed and replaced with locally
allocated pages.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Thu, 12 Nov 2009 18:38:16 +0000 (18:38 +0000)]
igb: check for packets on all tx rings when link is down
We were previously only checking the first tx ring to see if it had any
packets in it when the link when down. However we should be checking all
of the rings so this patch makes it so that all of the rings are now being
checked.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Thu, 12 Nov 2009 18:37:56 +0000 (18:37 +0000)]
igb: removed unused tx/rx total bytes/packets from adapter struct
This patch removes unused variables total_tx_bytes, total_tx_packets,
total_rx_bytes, and total_rx_packets from the adapter struct.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Thu, 12 Nov 2009 18:37:38 +0000 (18:37 +0000)]
igb: Rework how netdev->stats is handled
This patch does some refactoring work that I felt was needed after reviewing
the changes recently submitted relating to the replacement of net_stats with
netdev->stats.
This patch essentially creates two different collections of stats. The
first handles the adapter specific states and is stored in gstring_stats,
and the second is for netdev specific stats and is stored in
gstring_net_stats.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Thu, 12 Nov 2009 18:37:19 +0000 (18:37 +0000)]
igb: when number of CPUs > 4 combine tx/rx queues to allow more queues
This patch makes it so that nics such as 82576 and newer can support more
hardware queues when there are more than 4 cpus by combining a tx/rx queue
pair onto one interrupt so that 8 queue pairs can be supported and thus
allow for more queues.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Thu, 12 Nov 2009 18:37:00 +0000 (18:37 +0000)]
igb: move timesync init into a seperate function
Current code is quite large and making igb_probe difficult to read.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Thu, 12 Nov 2009 18:36:41 +0000 (18:36 +0000)]
igb: change type for ring sizes to u16 in igb_set_ring_param
Change the type for the ring size values to u16 and use min/max_t instead of
min/max.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wan ZongShun [Wed, 11 Nov 2009 04:35:22 +0000 (04:35 +0000)]
ARM: fix bug of checking on signed return value using unsigned statement in w90p910 platform
To fix the bug of checking on signed return value using unsigned statement.
Thanks Roel Kluin for digging out it.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Wan ZongShun <mcuos.com@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 11 Nov 2009 17:48:52 +0000 (17:48 +0000)]
igmp: Use next_net_device_rcu()
We need to use next_det_device_rcu() in RCU protected section.
We also can avoid in_dev_get()/in_dev_put() overhead (code size mainly)
in rcu_read_lock() sections.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 11 Nov 2009 17:34:30 +0000 (17:34 +0000)]
ipv6: use RCU to walk list of network devices
No longer need read_lock(&dev_base_lock), use RCU instead.
We also can avoid taking references on inet6_dev structs.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
William Allen Simpson [Tue, 10 Nov 2009 09:51:18 +0000 (09:51 +0000)]
net: TCP_MSS_DEFAULT, TCP_MSS_DESIRED
Define two symbols needed in both kernel and user space.
Remove old (somewhat incorrect) kernel variant that wasn't used in
most cases. Default should apply to both RMSS and SMSS (RFC2581).
Replace numeric constants with defined symbols.
Stand-alone patch, originally developed for TCPCT.
Signed-off-by: William.Allen.Simpson@gmail.com
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 10 Nov 2009 06:14:24 +0000 (06:14 +0000)]
vlan/macvlan: propagate transmission state to upper layers
Both vlan and macvlan devices usually don't use a qdisc and immediately
queue packets to the underlying device. Propagate transmission state of
the underlying device to the upper layers so they can react on congestion
and/or inform the sending process.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 10 Nov 2009 06:14:14 +0000 (06:14 +0000)]
net: allow to propagate errors through ->ndo_hard_start_xmit()
Currently the ->ndo_hard_start_xmit() callbacks are only permitted to return
one of the NETDEV_TX codes. This prevents any kind of error propagation for
virtual devices, like queue congestion of the underlying device in case of
layered devices, or unreachability in case of tunnels.
This patches changes the NET_XMIT codes to avoid clashes with the NETDEV_TX
codes and changes the two callers of dev_hard_start_xmit() to expect either
errno codes, NET_XMIT codes or NETDEV_TX codes as return value.
In case of qdisc_restart(), all non NETDEV_TX codes are mapped to NETDEV_TX_OK
since no error propagation is possible when using qdiscs. In case of
dev_queue_xmit(), the error is propagated upwards.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Mon, 9 Nov 2009 18:05:45 +0000 (18:05 +0000)]
niu.c: Use correct length in strncmp
Untested, no hardware
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Wed, 11 Nov 2009 03:45:22 +0000 (03:45 +0000)]
net/atm: move all compat_ioctl handling to atm/ioctl.c
We have two implementations of the compat_ioctl handling for ATM, the
one that we have had for ages in fs/compat_ioctl.c and the one added to
net/atm/ioctl.c by David Woodhouse. Unfortunately, both versions are
incomplete, and in practice we use a very confusing combination of the
two.
For ioctl numbers that have the same identifier on 32 and 64 bit systems,
we go directly through the compat_ioctl socket operation, for those that
differ, we do a conversion in fs/compat_ioctl.c.
This patch moves both variants into the vcc_compat_ioctl() function,
while preserving the current behaviour. It also kills off the COMPATIBLE_IOCTL
definitions that we never use here.
Doing it this way is clearly not a good solution, but I hope it is a
step into the right direction, so that someone is able to clean up this
mess for real.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Wed, 11 Nov 2009 03:39:40 +0000 (03:39 +0000)]
net/compat: fix dev_ifsioc emulation corner cases
Handling for SIOCSHWTSTAMP is broken on architectures
with a split user/kernel address space like s390,
because it passes a real user pointer while using
set_fs(KERNEL_DS).
A similar problem might arise the next time somebody
adds code to dev_ifsioc.
Split up dev_ifsioc into three separate functions for
SIOCSHWTSTAMP, SIOC*IFMAP and all other numbers so
we can get rid of set_fs in all potentially affected
cases.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Patrick Ohly <patrick.ohly@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Dooks [Tue, 10 Nov 2009 07:22:24 +0000 (07:22 +0000)]
DM9000: Wake on LAN support
Add support for Wake on LAN (WOL) reception and waking the device up from
this signal via the ethtool interface. Currently we are only supporting
the magic-packet variant of wakeup.
WOL is enabled by specifying a second interrupt resource to the driver
which indicates where the interrupt for the WOL is being signalled. This
then enables the necessary ethtool calls to leave the device in a state
to receive WOL frames when going into suspend.
Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: Simtec Linux Team <linux@simtec.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ali Gholami Rudi [Tue, 10 Nov 2009 06:40:06 +0000 (06:40 +0000)]
ixgbe: r_idx not used in ixgbe_msix_clean_rx()
The values of r_idx and rx_ring are not used after the last time they
are set in ixgbe_msix_clean_rx(), so they can be removed.
Signed-off-by: Ali Gholami Rudi <ali@rudi.ir>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Wed, 11 Nov 2009 07:40:36 +0000 (07:40 +0000)]
decnet: convert dndev_lock to spinlock
There is no reason for this lock to be reader/writer since
the reader only has lock held for a very brief period.
The overhead of read_lock is more expensive than spinlock.
Compile tested only, I am not a decnet user.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Wed, 11 Nov 2009 07:39:27 +0000 (07:39 +0000)]
decnet: add RTNL lock when reading address list
Add missing locking in the case of auto binding to the
default device. The address list might change while this code is looking
at the list.
Compile tested only, I am not a decnet user.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Tue, 10 Nov 2009 07:20:34 +0000 (07:20 +0000)]
netdev: fold name hash properly (v3)
The full_name_hash function does not produce well distributed values in
the lower bits, so most code uses hash_32() to fold it. This is really
a bug introduced when name hashing was added, back in 2.5 when I added
name hashing.
hash_32 is all that is needed since full_name_hash returns unsigned int
which is only 32 bits on 64 bit platforms.
Also, there is no point in using hash_32 on ifindex, because the is naturally
sequential and usually well distributed.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ron Mercer [Wed, 11 Nov 2009 12:54:06 +0000 (12:54 +0000)]
qlge: Change version to v1.00.00.23.00.00-01.
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ron Mercer [Wed, 11 Nov 2009 12:54:05 +0000 (12:54 +0000)]
qlge: Clean up module parameter name.
Change it to match qlge_xxx convention.
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ron Mercer [Wed, 11 Nov 2009 12:54:04 +0000 (12:54 +0000)]
qlge: Add asic reset to open call.
Force asic to known state at open().
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ron Mercer [Wed, 11 Nov 2009 12:54:03 +0000 (12:54 +0000)]
qlge: Do not change frame routing during suspend.
We do not need to change the frame routing to direct all frames to the
management fifo during suspend. This is now done by the firmware.
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Nov 2009 03:06:30 +0000 (19:06 -0800)]
clocksource/timecompare: Fix symbol exports to be GPL'd.
Noticed by Thomas GLeixner.
Signed-off-by: David S. Miller <davem@davemloft.net>
Anton Vorontsov [Tue, 10 Nov 2009 14:11:10 +0000 (14:11 +0000)]
gianfar: Revive SKB recycling
Before calling gfar_clean_tx_ring() the driver grabs an irqsave
spinlock, and then tries to recycle skbs. But since
skb_recycle_check() returns 0 with IRQs disabled, we'll never
recycle any skbs.
It appears that gfar_clean_tx_ring() and gfar_start_xmit() are
mostly idependent and can work in parallel, except when they
modify num_txbdfree.
So we can drop the lock from most sections and thus fix the skb
recycling.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Anton Vorontsov [Tue, 10 Nov 2009 14:11:08 +0000 (14:11 +0000)]
gianfar: Fix race between gfar_error() and gfar_start_xmit()
gfar_error() can arrive at the middle of gfar_start_xmit() processing,
and so it can trigger transfers of BDs that we don't yet expect to
be transmitted.
Fix this by locking the tx queues in gfar_error().
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Anton Vorontsov [Tue, 10 Nov 2009 14:11:07 +0000 (14:11 +0000)]
gianfar: Fix thinko in gfar_set_rx_stash_index()
We obviously want to write a modified 'temp' value back to the
register, not the saved IRQ flags.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Anton Vorontsov [Tue, 10 Nov 2009 14:11:05 +0000 (14:11 +0000)]
gianfar: Fix build with CONFIG_PM=y
commit
fba4ed030cfae7efdb6b79a57b0c5a9d72c9 ("gianfar: Add Multiple
Queue Support") introduced the following build failure:
CC gianfar.o
gianfar.c: In function 'gfar_restore':
gianfar.c:1249: error: request for member 'napi' in something not a structure or union
This patch fixes the issue.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Anton Vorontsov [Tue, 10 Nov 2009 14:11:03 +0000 (14:11 +0000)]
gianfar: Remove 'Interrupt problem!' warning
It is OK to poll with disabled IRQs, so remove the warning.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Anton Vorontsov [Tue, 10 Nov 2009 14:11:01 +0000 (14:11 +0000)]
skbuff: Do not allow skb recycling with disabled IRQs
NAPI drivers try to recycle SKBs in their polling routine, but we
generally don't know the context in which the polling will be called,
and the skb recycling itself may require IRQs to be enabled.
This patch adds irqs_disabled() test to the skb_recycle_check()
routine, so that we'll not let the drivers hit the skb recycling
path with IRQs disabled.
As a side effect, this patch actually disables skb recycling for some
[broken] drivers. E.g. gianfar driver grabs an irqsave spinlock during
TX ring processing, and then tries to recycle an skb, and that caused
the following badness:
nf_conntrack version 0.5.0 (1008 buckets, 4032 max)
------------[ cut here ]------------
Badness at kernel/softirq.c:143
NIP:
c003e3c4 LR:
c423a528 CTR:
c003e344
...
NIP [
c003e3c4] local_bh_enable+0x80/0xc4
LR [
c423a528] destroy_conntrack+0xd4/0x13c [nf_conntrack]
Call Trace:
[
c15d1b60] [
c003e32c] local_bh_disable+0x1c/0x34 (unreliable)
[
c15d1b70] [
c423a528] destroy_conntrack+0xd4/0x13c [nf_conntrack]
[
c15d1b80] [
c02c6370] nf_conntrack_destroy+0x3c/0x70
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Nov 2009 02:53:00 +0000 (18:53 -0800)]
ipv6: Remove unused var in inet6_dump_ifinfo()
Reported by Stephen Rothwell:
--------------------
Today's linux-next build (x86_64 allmodconfig) produced this warning:
net/ipv6/addrconf.c: In function 'inet6_dump_ifinfo':
net/ipv6/addrconf.c:3833: warning: unused variable 'err'
Introduced by commit
84d2697d9649339215675551eae28ba04068dea1 ("ipv6:
speedup inet6_dump_ifinfo()").
--------------------
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville [Wed, 11 Nov 2009 21:04:42 +0000 (13:04 -0800)]
iwlwifi: fix iwl1000 "RTS/CTS for HT" merge damage
I may have botched my merge conflict resolution instructions for Dave...
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Nov 2009 19:38:16 +0000 (11:38 -0800)]
Merge branch 'master' of /linux/kernel/git/davem/net-2.6
Conflicts:
drivers/net/wireless/iwlwifi/iwl-1000.c
drivers/net/wireless/iwlwifi/iwl-6000.c
drivers/net/wireless/iwlwifi/iwl-core.h
stephen hemminger [Tue, 10 Nov 2009 07:54:56 +0000 (07:54 +0000)]
CAN: use dev_get_by_index_rcu
Use new function to avoid doing read_lock().
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Tue, 10 Nov 2009 07:54:55 +0000 (07:54 +0000)]
IPV4: use rcu to walk list of devices in IGMP
This also needs to be optimized for large number of devices.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Tue, 10 Nov 2009 07:54:53 +0000 (07:54 +0000)]
decnet: use RCU to find network devices
When showing device statistics use RCU rather than read_lock(&dev_base_lock)
Compile tested only.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Tue, 10 Nov 2009 07:54:52 +0000 (07:54 +0000)]
s390: use RCU to walk list of network devices
This is similar to other cases where for_each_netdev_rcu
can be used when gathering information.
By inspection, don't have platform or cross-build environment
to validate.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Tue, 10 Nov 2009 07:54:49 +0000 (07:54 +0000)]
net: use rcu for network scheduler API
Use RCU to walk list of network devices in qdisc dump.
This could be optimized for large number of devices.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Tue, 10 Nov 2009 07:54:48 +0000 (07:54 +0000)]
vlan: eliminate use of dev_base_lock
Do not need to use read_lock(&dev_base_lock), use RCU instead.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Tue, 10 Nov 2009 07:54:47 +0000 (07:54 +0000)]
netdev: add netdev_continue_rcu
This adds an RCU macro for continuing search, useful for some
network devices like vlan.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Brian Haley [Mon, 9 Nov 2009 12:05:53 +0000 (12:05 +0000)]
IPv6: use ipv6_addr_v4mapped()
Change udp6_portaddr_hash() to use ipv6_addr_v4mapped()
inline instead of ipv6_addr_type().
Signed-off-by: Brian Haley <brian.haley@hp.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 9 Nov 2009 18:07:28 +0000 (18:07 +0000)]
parisc: led: Use for_each_netdev_rcu()
Use for_each_netdev_rcu() and dont lock dev_base_lock anymore
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Mon, 9 Nov 2009 08:42:01 +0000 (08:42 +0000)]
sit: Clean up DF code by copying from IPIP
This patch rearranges the SIT DF bit handling using the new IPIP DF
code. The only externally visible effect should be the case where
PMTU is enabled and the MTU is exactly 1280 bytes. In this case the
previous code would send packets out with DF off while the new code
would set the DF bit. This is inline with RFC 4213.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Thanks,
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 9 Nov 2009 07:40:17 +0000 (07:40 +0000)]
ipv6: Allow inet6_dump_addr() to handle more than 64 addresses
Apparently, inet6_dump_addr() is not able to handle more than
64 ipv6 addresses per device. We must break from inner loops
in case skb is full, or else cursor is put at the end of list.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 9 Nov 2009 12:11:28 +0000 (12:11 +0000)]
ipv6: speedup inet6_dump_ifinfo()
When handling large number of netdevice, inet6_dump_ifinfo()
is very slow because it has O(N^2) complexity.
Instead of scanning one single list, we can use the 256 sub lists
of the dev_index hash table, and RCU lookups.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cyrill Gorcunov [Sun, 8 Nov 2009 05:51:19 +0000 (05:51 +0000)]
net: netlink_getname, packet_getname -- use DECLARE_SOCKADDR guard
Use guard DECLARE_SOCKADDR in a few more places which allow
us to catch if the structure copied back is too big.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Wed, 4 Nov 2009 15:29:52 +0000 (15:29 +0000)]
usbnet: Set link down initially for drivers that update link state
Some usbnet drivers update link state while others do not due to
hardware limitations. Add a flag to distinguish those that do, and
set the link down initially for their devices.
This is intended to fix this bug: http://bugs.debian.org/444043
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marin Mitov [Sun, 8 Nov 2009 05:59:27 +0000 (05:59 +0000)]
niu: Use DMA_BIT_MASK(44) instead of deprecated DMA_44BIT_MASK
Use DMA_BIT_MASK(44) instead of deprecated DMA_44BIT_MASK
Signed-off-by: Marin Mitov <mitov@issp.bas.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 9 Nov 2009 05:26:33 +0000 (05:26 +0000)]
udp: bind() optimisation
UDP bind() can be O(N^2) in some pathological cases.
Thanks to secondary hash tables, we can make it O(N)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eilon Greenstein [Mon, 9 Nov 2009 06:09:37 +0000 (06:09 +0000)]
bnx2x: version 1.52.1-4
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eilon Greenstein [Mon, 9 Nov 2009 06:09:35 +0000 (06:09 +0000)]
bnx2x: Change coalescing granularity to 4us
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eilon Greenstein [Mon, 9 Nov 2009 06:09:28 +0000 (06:09 +0000)]
bnx2x: Remove misleading error print
Failing to allocate MSI-X vectors is not an error and should not be
printed as such
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eilon Greenstein [Mon, 9 Nov 2009 06:09:22 +0000 (06:09 +0000)]
bnx2x: GSO implies CSUM offload
Making sure that whenever the FW/HW is configured for GSO, it is also
configured to CSUM offload
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rémi Denis-Courmont [Mon, 9 Nov 2009 04:06:40 +0000 (04:06 +0000)]
Phonet: allocate and copy for pipe TX without sock lock
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rémi Denis-Courmont [Mon, 9 Nov 2009 02:17:01 +0000 (02:17 +0000)]
Phonet: put sockets in a hash table
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Wed, 11 Nov 2009 04:30:37 +0000 (20:30 -0800)]
speedfax: declare MODULE_FIRMWARE
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Sat, 7 Nov 2009 12:04:09 +0000 (12:04 +0000)]
pcnet-cs: declare MODULE_FIRMWARE
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Sat, 7 Nov 2009 11:55:20 +0000 (11:55 +0000)]
tms380tr: declare MODULE_FIRMWARE
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Sat, 7 Nov 2009 11:55:07 +0000 (11:55 +0000)]
spider-net: declare MODULE_FIRMWARE
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Sat, 7 Nov 2009 11:54:44 +0000 (11:54 +0000)]
myri10ge: declare MODULE_FIRMWARE
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Sat, 7 Nov 2009 11:53:52 +0000 (11:53 +0000)]
cxgb3: declare MODULE_FIRMWARE
Replace run-time string formatting with preprocessor string
manipulation.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Sat, 7 Nov 2009 11:53:39 +0000 (11:53 +0000)]
bnx2x: declare MODULE_FIRMWARE
Replace run-time string formatting with preprocessor string
manipulation.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Sat, 7 Nov 2009 11:46:07 +0000 (11:46 +0000)]
ambassador: declare MODULE_FIRMWARE
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Sat, 7 Nov 2009 11:40:32 +0000 (11:40 +0000)]
solos-pci: declare MODULE_FIRMWARE
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Sat, 7 Nov 2009 11:37:36 +0000 (11:37 +0000)]
netx: declare MODULE_FIRMWARE
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wey-Yi Guy [Fri, 6 Nov 2009 23:17:05 +0000 (15:17 -0800)]
iwlwifi: Fix issue on file transfer stalled in HT mode
Turn on RTS/CTS for HT to prevent uCode TX fifo underrun
This is fix for
http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2103
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Tested-by: Jiajia Zheng <jiajia.zheng@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Wey-Yi Guy [Fri, 6 Nov 2009 23:17:04 +0000 (15:17 -0800)]
iwlwifi: Use RTS/CTS as the preferred protection mechanism for 6000 series
When 802.11g was introduced, we had RTS/CTS and CTS-to-Self protection
mechanisms. In an HT Beacon, HT stations use the "Operating Mode" field
in the HT Information Element to determine whether or not to use
protection.
The Operating Mode field has 4 possible settings: 0-3:
Mode 0: If all stations in the BSS are 20/40 MHz HT capable, or if the
BSS is 20/40 MHz capable, or if all stations in the BSS are 20 MHz HT
stations in a 20 MHz BSS
Mode 1: used if there are non-HT stations or APs using the primary or
secondary channels
Mode 2: if only HT stations are associated in the BSS and at least one
20 MHz HT station is associated.
Mode 3: used if one or more non-HT stations are associated in the BSS.
When in operating modes 1 or 3, and the Use_Protection field is 1 in the
Beacon's ERP IE, all HT transmissions must be protected using RTS/CTS or
CTS-to-Self.
By default, CTS-to-self is the preferred protection mechanism for less
overhead and higher throughput; but using the full RTS/CTS will better
protect the inner exchange from interference, especially in
highly-congested environment.
For 6000 series WIFI NIC, RTS/CTS protection mechanism is the
recommended choice for HT traffic based on the HW design.
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Bing Zhao [Tue, 10 Nov 2009 02:04:13 +0000 (18:04 -0800)]
Libertas: fix issues while configuring host sleep using ethtool wol
Configuration of wake-on-lan for unicast, multicast, broadcast, physical
activity was not working. Kernel panic issue was there when user tries to
disable WOL. Fixed them.
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Bing Zhao [Tue, 10 Nov 2009 02:04:12 +0000 (18:04 -0800)]
Libertas: coding style cleanup in ethtool.c
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Larry Finger [Mon, 9 Nov 2009 22:56:06 +0000 (16:56 -0600)]
rtl8187: Fix sparse warnings
Due to a missing header include, sparse generates the following warnings:
CHECK drivers/net/wireless/rtl818x/rtl8187_rfkill.c
warning: symbol 'rtl8187_rfkill_init' was not declared. Should it be static?
warning: symbol 'rtl8187_rfkill_poll' was not declared. Should it be static?
warning: symbol 'rtl8187_rfkill_exit' was not declared. Should it be static?
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Bob Copeland [Mon, 9 Nov 2009 02:59:02 +0000 (21:59 -0500)]
ath5k: add LED definition for BenQ Joybook R55v
Setup the GPIOs for the BenQ Joybook netbook.
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Bob Copeland [Mon, 9 Nov 2009 02:59:01 +0000 (21:59 -0500)]
ath5k: add LED support for HP Compaq CQ60
Add GPIO configuration for the Compaq CQ60 laptop
Reported-by: David Dreggors <ddreggors@jumptv.com>
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Bob Copeland [Mon, 9 Nov 2009 02:59:00 +0000 (21:59 -0500)]
ath5k: don't reset mcast filter when configuring the mode
We should not zero out the multicast hash when configuring
the operating mode, since a zero value means all multicast
frames will get dropped. Also, ath5k_mode_setup() gets
called after any reset, so the hash already set up in
configure_filter() is lost.
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ivo van Doorn [Sat, 7 Nov 2009 18:14:47 +0000 (19:14 +0100)]
rt2x00: update MAINTAINERS
Although I have always been the active maintainer of the rt2x00 drivers,
I was not mentioned explicitely in the MAINTAINERS file as such.
Update the rt2x00 entry in the MAINTAINERS file to add my name and
email address.
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Felix Fietkau [Sat, 7 Nov 2009 17:37:37 +0000 (18:37 +0100)]
b43: work around a locking issue in ->set_tim()
ops->set_tim() must be atomic, so b43 trying to acquire a mutex leads
to a kernel crash. This patch trades an easy to trigger crash in AP
mode for an unlikely race condition. According to Michael, the real
fix would be to allow set_tim() to sleep, since b43 is not the only
driver that needs to sleep in all callbacks.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Martin Fuzzey [Fri, 6 Nov 2009 20:21:27 +0000 (21:21 +0100)]
ssb-pcmcia: Fix 32bit register access in early bus scanning
The scan function was using 32 bit access which does not
work on 16bit CF cards.
This patch corrects this by doing two 16 bit reads like
ssb_pcmcia_read32 already does.
mb -- Removed locking. That early in init there's no need for locking.
Signed-off-by: Martin Fuzzey <mfuzzey@gmail.com>
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
David S. Miller [Mon, 9 Nov 2009 19:17:24 +0000 (11:17 -0800)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-next-2.6
David S. Miller [Mon, 9 Nov 2009 07:00:54 +0000 (23:00 -0800)]
Merge branch 'master' of /linux/kernel/git/davem/net-2.6
Conflicts:
drivers/net/can/usb/ems_usb.c
Yury Polyanskiy [Mon, 9 Nov 2009 04:58:41 +0000 (20:58 -0800)]
xfrm: SAD entries do not expire correctly after suspend-resume
This fixes the following bug in the current implementation of
net/xfrm: SAD entries timeouts do not count the time spent by the machine
in the suspended state. This leads to the connectivity problems because
after resuming local machine thinks that the SAD entry is still valid, while
it has already been expired on the remote server.
The cause of this is very simple: the timeouts in the net/xfrm are bound to
the old mod_timer() timers. This patch reassigns them to the
CLOCK_REALTIME hrtimer.
I have been using this version of the patch for a few months on my
machines without any problems. Also run a few stress tests w/o any
issues.
This version of the patch uses tasklet_hrtimer by Peter Zijlstra
(commit 9ba5f0).
This patch is against 2.6.31.4. Please CC me.
Signed-off-by: Yury Polyanskiy <polyanskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Mon, 9 Nov 2009 04:57:03 +0000 (20:57 -0800)]
net/compat_ioctl: support SIOCWANDEV
This adds compat_ioctl support for SIOCWANDEV, which has
always been missing.
The definition of struct compat_ifreq was missing an
ifru_settings fields that is needed to support SIOCWANDEV,
so add that and clean up the whitespace damage in the
struct definition.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Mon, 9 Nov 2009 04:56:21 +0000 (20:56 -0800)]
net, compat_ioctl: fix SIOCGMII ioctls
SIOCGMIIPHY and SIOCGMIIREG return data through ifreq,
so it needs to be converted on the way out as well.
SIOCGIFPFLAGS is unused, but has the same problem in theory.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 8 Nov 2009 10:20:19 +0000 (10:20 +0000)]
udp: multicast RX should increment SNMP/sk_drops counter in allocation failures
When skb_clone() fails, we should increment sk_drops and SNMP counters.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 8 Nov 2009 10:18:52 +0000 (10:18 +0000)]
ipv6: udp: Optimise multicast reception
IPV6 UDP multicast rx path is a bit complex and can hold a spinlock
for a long time.
Using a small (32 or 64 entries) stack of socket pointers can help
to perform expensive operations (skb_clone(), udp_queue_rcv_skb())
outside of the lock, in most cases.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 8 Nov 2009 10:18:44 +0000 (10:18 +0000)]
ipv4: udp: Optimise multicast reception
UDP multicast rx path is a bit complex and can hold a spinlock
for a long time.
Using a small (32 or 64 entries) stack of socket pointers can help
to perform expensive operations (skb_clone(), udp_queue_rcv_skb())
outside of the lock, in most cases.
It's also a base for a future RCU conversion of multicast recption.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Lucian Adrian Grijincu <lgrijincu@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 8 Nov 2009 10:18:30 +0000 (10:18 +0000)]
ipv6: udp: optimize unicast RX path
We first locate the (local port) hash chain head
If few sockets are in this chain, we proceed with previous lookup algo.
If too many sockets are listed, we take a look at the secondary
(port, address) hash chain.
We choose the shortest chain and proceed with a RCU lookup on the elected chain.
But, if we chose (port, address) chain, and fail to find a socket on given address,
we must try another lookup on (port, in6addr_any) chain to find sockets not bound
to a particular IP.
-> No extra cost for typical setups, where the first lookup will probabbly
be performed.
RCU lookups everywhere, we dont acquire spinlock.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 8 Nov 2009 10:18:11 +0000 (10:18 +0000)]
ipv4: udp: optimize unicast RX path
We first locate the (local port) hash chain head
If few sockets are in this chain, we proceed with previous lookup algo.
If too many sockets are listed, we take a look at the secondary
(port, address) hash chain we added in previous patch.
We choose the shortest chain and proceed with a RCU lookup on the elected chain.
But, if we chose (port, address) chain, and fail to find a socket on given address,
we must try another lookup on (port, INADDR_ANY) chain to find socket not bound
to a particular IP.
-> No extra cost for typical setups, where the first lookup will probabbly
be performed.
RCU lookups everywhere, we dont acquire spinlock.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 8 Nov 2009 10:17:58 +0000 (10:17 +0000)]
udp: secondary hash on (local port, local address)
Extends udp_table to contain a secondary hash table.
socket anchor for this second hash is free, because UDP
doesnt use skc_bind_node : We define an union to hold
both skc_bind_node & a new hlist_nulls_node udp_portaddr_node
udp_lib_get_port() inserts sockets into second hash chain
(additional cost of one atomic op)
udp_lib_unhash() deletes socket from second hash chain
(additional cost of one atomic op)
Note : No spinlock lockdep annotation is needed, because
lock for the secondary hash chain is always get after
lock for primary hash chain.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 8 Nov 2009 10:17:30 +0000 (10:17 +0000)]
udp: split sk_hash into two u16 hashes
Union sk_hash with two u16 hashes for udp (no extra memory taken)
One 16 bits hash on (local port) value (the previous udp 'hash')
One 16 bits hash on (local address, local port) values, initialized
but not yet used. This second hash is using jenkin hash for better
distribution.
Because the 'port' is xored later, a partial hash is performed
on local address + net_hash_mix(net)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 8 Nov 2009 10:17:05 +0000 (10:17 +0000)]
udp: add a counter into udp_hslot
Adds a counter in udp_hslot to keep an accurate count
of sockets present in chain.
This will permit to upcoming UDP lookup algo to chose
the shortest chain when secondary hash is added.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>