firefly-linux-kernel-4.4.55.git
17 years ago[XFRM]: Export SAD info.
Jamal Hadi Salim [Thu, 26 Apr 2007 07:10:29 +0000 (00:10 -0700)]
[XFRM]: Export SAD info.

On a system with a lot of SAs, counting SAD entries chews useful
CPU time since you need to dump the whole SAD to user space;
i.e. something like ip xfrm state ls | grep -i src | wc -l
I have seen this take literally minutes with 40K SAs when the system
is swapping.
With this patch, some of the SAD info (that was already being tracked)
is exposed to user space, i.e. you do:
ip xfrm state count
And you get the count; you can also pass -s to the command line and
get the hash info.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[BRIDGE]: Missing rtnl.
Stephen Hemminger [Thu, 26 Apr 2007 05:08:46 +0000 (22:08 -0700)]
[BRIDGE]: Missing rtnl.

Writing to /sys/class/net/brX/bridge/stp_state causes a warning because
RTNL is not held when calling into br_stp_if.c.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[BRIDGE]: if no STP then forward all BPDUs
Stephen Hemminger [Thu, 26 Apr 2007 05:07:58 +0000 (22:07 -0700)]
[BRIDGE]: if no STP then forward all BPDUs

If a bridge is not running STP, then it has no way to detect a cycle
in the network. But if it is not running STP and some other machine
or device is running STP, then forwarding the STP BPDUs lets that
device detect the cycle.

This is how the old 2.4 and early 2.6 code worked.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[BRIDGE]: drop PAUSE frames
Stephen Hemminger [Thu, 26 Apr 2007 05:05:55 +0000 (22:05 -0700)]
[BRIDGE]: drop PAUSE frames

Pause frames should never make it out of the network device into
the stack. But if a device was misconfigured, it might happen.
So drop pause frames in bridge.
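
As an illustration only (this is not the bridge code from the patch), a PAUSE frame can be recognised from its Ethernet header: IEEE 802.3x MAC control frames carry EtherType 0x8808, and PAUSE frames are typically addressed to the reserved multicast address 01:80:C2:00:00:01. The helper name is_pause_frame() and the raw-buffer interface are assumptions made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* EtherType used by IEEE 802.3x MAC control frames (PAUSE included). */
#define ETHERTYPE_MAC_CONTROL 0x8808

/* Reserved multicast destination that PAUSE frames are normally sent to. */
static const uint8_t pause_dest[6] = { 0x01, 0x80, 0xC2, 0x00, 0x00, 0x01 };

/*
 * Hypothetical helper: 'frame' points at a raw Ethernet header
 * (6-byte destination, 6-byte source, 2-byte EtherType in network order).
 * Returns true when the header looks like a PAUSE frame.
 */
static bool is_pause_frame(const uint8_t *frame, size_t len)
{
    uint16_t ethertype;

    if (len < 14)
        return false;   /* too short to hold an Ethernet header */

    ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    return ethertype == ETHERTYPE_MAC_CONTROL &&
           memcmp(frame, pause_dest, sizeof(pause_dest)) == 0;
}
```

A receive path that wanted the behaviour described above would drop the frame whenever such a check returns true, instead of passing it up the stack.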

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[BRIDGE]: don't change packet type
Stephen Hemminger [Thu, 26 Apr 2007 05:03:10 +0000 (22:03 -0700)]
[BRIDGE]: don't change packet type

The change to forward STP BPDUs (for usermode STP) through the normal path
changed the packet type in the process. Since link local stuff is multicast, it
should stay pkt_type = PACKET_MULTICAST.  The code was probably copy/pasted
incorrectly from the bridge pseudo-device receive path.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IPV6] NDISC: Unify main process of sending ND messages.
YOSHIFUJI Hideaki [Tue, 24 Apr 2007 11:44:52 +0000 (20:44 +0900)]
[IPV6] NDISC: Unify main process of sending ND messages.

ndisc_send_na(), ndisc_send_ns() and ndisc_send_rs()
are almost identical, so let's unify their common part.

With gcc (GCC) 3.3.5 (Debian 1:3.3.5-13) on i386,
Before:
   text    data     bss     dec     hex filename
  14689     364      24   15077    3ae5 net/ipv6/ndisc.o
After:
   text    data     bss     dec     hex filename
  12317     364      24   12705    31a1 net/ipv6/ndisc.o

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
17 years ago[IPV6] XFRM: Use ip6addr_any where applicable.
YOSHIFUJI Hideaki [Tue, 24 Apr 2007 11:44:50 +0000 (20:44 +0900)]
[IPV6] XFRM: Use ip6addr_any where applicable.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
17 years ago[IPV6]: Export in6addr_any for future use.
YOSHIFUJI Hideaki [Tue, 24 Apr 2007 11:44:49 +0000 (20:44 +0900)]
[IPV6]: Export in6addr_any for future use.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
17 years ago[IPV4] IP_GRE: Unify code path to get hash array index.
YOSHIFUJI Hideaki [Tue, 24 Apr 2007 11:44:48 +0000 (20:44 +0900)]
[IPV4] IP_GRE: Unify code path to get hash array index.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
17 years ago[IPV4] IPIP: Unify code path to get hash array index.
YOSHIFUJI Hideaki [Tue, 24 Apr 2007 11:44:47 +0000 (20:44 +0900)]
[IPV4] IPIP: Unify code path to get hash array index.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
17 years ago[IPV6] SIT: Unify code path to get hash array index.
YOSHIFUJI Hideaki [Tue, 24 Apr 2007 11:44:47 +0000 (20:44 +0900)]
[IPV6] SIT: Unify code path to get hash array index.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
17 years ago[IPV6]: Fix Makefile thinko.
David S. Miller [Wed, 25 Apr 2007 05:15:40 +0000 (22:15 -0700)]
[IPV6]: Fix Makefile thinko.

obj-$(CONFIG_PROC_FS) --> ipv6-$(CONFIG_PROC_FS)

Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IPV6]: Consolidate common SNMP code
Herbert Xu [Wed, 25 Apr 2007 04:54:09 +0000 (21:54 -0700)]
[IPV6]: Consolidate common SNMP code

This patch moves the non-proc SNMP code into addrconf.c and reuses
IPv4 SNMP code where applicable.

As a result we can skip proc.o if /proc is disabled.

Note that I've made a number of functions static since they're only
used by addrconf.c for now.  If they ever get used elsewhere we can
always remove the static.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IPV4]: Consolidate common SNMP code
Herbert Xu [Wed, 25 Apr 2007 04:53:35 +0000 (21:53 -0700)]
[IPV4]: Consolidate common SNMP code

This patch moves the SNMP code shared between IPv4/IPv6 from proc.c
into net/ipv4/af_inet.c.  This makes sense because these functions
aren't specific to /proc.

As a result we can again skip proc.o if /proc is disabled.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IPV4]: Fix build without procfs.
YOSHIFUJI Hideaki [Tue, 24 Apr 2007 23:22:42 +0000 (16:22 -0700)]
[IPV4]: Fix build without procfs.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[TCP]: Fix linkage errors on i386.
YOSHIFUJI Hideaki [Tue, 24 Apr 2007 23:21:38 +0000 (16:21 -0700)]
[TCP]: Fix linkage errors on i386.

To avoid raw division, use ktime_to_timeval() to get usec.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[TIPC]: Enhancements to msg_set_bits() routine
Allan Stephens [Tue, 24 Apr 2007 21:51:55 +0000 (14:51 -0700)]
[TIPC]: Enhancements to msg_set_bits() routine

This patch makes two enhancements to msg_set_bits():

1) It now ignores any bits of the new field value that are not
   covered by the mask being used.  (Previously, if the new value
   exceeded the size of the mask the extra bits could corrupt
   other fields in the message header word being updated.)

2) The code has been optimized to minimize the number of run-time
   endianness conversion operations by leveraging the fact that the
   mask (and, in some cases, the value as well) is constant and the
   necessary conversion can be performed by the compiler.
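
The two points above can be sketched in user-space C. This is a minimal sketch, not the TIPC source: set_bits() and get_bits() are hypothetical names, and the header word is assumed to be a 32-bit value stored in network byte order. The value is clipped to the mask before shifting, and the byte-order conversion is applied to the shifted mask and value rather than to the stored word, so a compiler can fold htonl() when the mask and position are constants.

```c
#include <arpa/inet.h>
#include <stdint.h>

/*
 * Hypothetical helper: update the field of width 'mask' at bit offset 'pos'
 * inside a header word that is kept in network byte order.
 */
static void set_bits(uint32_t *word_be, unsigned pos, uint32_t mask, uint32_t val)
{
    val = (val & mask) << pos;   /* bits outside the mask are ignored */
    mask <<= pos;
    *word_be &= ~htonl(mask);    /* clear the old field contents */
    *word_be |= htonl(val);      /* store the new field contents */
}

/* Hypothetical helper: read the same field back in host byte order. */
static uint32_t get_bits(const uint32_t *word_be, unsigned pos, uint32_t mask)
{
    return (ntohl(*word_be) >> pos) & mask;
}
```

For example, writing the value 0x123 into a 4-bit field stores only 0x3, and the neighbouring bits of the word are left untouched.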

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Jon Paul Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[WIRELESS] cfg80211: Update comment for locking.
Johannes Berg [Tue, 24 Apr 2007 21:07:27 +0000 (14:07 -0700)]
[WIRELESS] cfg80211: Update comment for locking.

This patch adds a comment that was part of my rtnl locking patch for
cfg80211 but which I forgot for the merge.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET]: Warn about GSO/checksum abuse
Herbert Xu [Tue, 24 Apr 2007 05:36:13 +0000 (22:36 -0700)]
[NET]: Warn about GSO/checksum abuse

Now that Patrick has added the code to deal with GSO in netfilter,
we no longer need the crutch that computes partial checksums just
before transmission.

This patch turns this into a warning again.  If this goes OK, we
can then turn it into a BUG_ON and remove the gso_send_check cruft.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[TCP] TCP YEAH: Use vegas dont copy it.
Stephen Hemminger [Tue, 24 Apr 2007 05:28:23 +0000 (22:28 -0700)]
[TCP] TCP YEAH: Use vegas dont copy it.

Rather than using a copy of vegas code, the YEAH code should just have
it exported so there is common code.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[TCP]: Congestion control API update.
Stephen Hemminger [Tue, 24 Apr 2007 05:26:16 +0000 (22:26 -0700)]
[TCP]: Congestion control API update.

Do some simple changes to make congestion control API faster/cleaner.
* use ktime_t rather than timeval
* merge rtt sampling into existing ack callback
  this means one indirect call versus two per ack.
* use flags bits to store options/settings

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[TCP]: TCP Illinois update.
Stephen Hemminger [Tue, 24 Apr 2007 05:24:32 +0000 (22:24 -0700)]
[TCP]: TCP Illinois update.

This version more closely matches the paper, and fixes several
math errors. The biggest difference is that it updates alpha/beta
once per RTT.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[WIRELESS] drivers/net/wireless/Kconfig: correct minor typo
John W. Linville [Mon, 23 Apr 2007 20:28:49 +0000 (13:28 -0700)]
[WIRELESS] drivers/net/wireless/Kconfig: correct minor typo

Correct minor typo in drivers/net/wireless/Kconfig identified by
Stefano Brivio <stefano.brivio@polimi.it>.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[WIRELESS]: Remove wext over netlink.
Johannes Berg [Mon, 23 Apr 2007 19:20:55 +0000 (12:20 -0700)]
[WIRELESS]: Remove wext over netlink.

As scheduled, this patch removes the pointless wext over netlink code.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[WIRELESS] cfg80211: New wireless config infrastructure.
Johannes Berg [Mon, 23 Apr 2007 19:20:05 +0000 (12:20 -0700)]
[WIRELESS] cfg80211: New wireless config infrastructure.

This patch creates the core cfg80211 code along with some sysfs bits.
This is a stripped down version to allow mac80211 to function, but
doesn't include any configuration yet except for creating and removing
virtual interfaces.

This patch includes the nl80211 header file but it only contains the
interface types which the cfg80211 interface for creating virtual
interfaces relies on.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[WIRELESS]: Refactor wireless Kconfig.
Johannes Berg [Mon, 23 Apr 2007 19:19:12 +0000 (12:19 -0700)]
[WIRELESS]: Refactor wireless Kconfig.

This patch refactors the wireless Kconfig all over and already
introduces net/wireless/Kconfig with just the WEXT bit for now,
the cfg80211 patch will add to that as well.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[WIRELESS]: Update MAINTAINERS for wireless mailing list.
Johannes Berg [Mon, 23 Apr 2007 19:18:20 +0000 (12:18 -0700)]
[WIRELESS]: Update MAINTAINERS for wireless mailing list.

This patch adds the linux-wireless mailing list to all appropriate
entries in the MAINTAINERS file.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET]: Prevent much sadness in qdisc_lock_tree().
Andrew Morton [Mon, 23 Apr 2007 06:22:24 +0000 (23:22 -0700)]
[NET]: Prevent much sadness in qdisc_lock_tree().

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IPV6] SNMP: Use put_unaligned() instead of memcpy().
YOSHIFUJI Hideaki [Sun, 22 Apr 2007 02:52:04 +0000 (19:52 -0700)]
[IPV6] SNMP: Use put_unaligned() instead of memcpy().

Hint from David Miller <davem@davemloft.net>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IPV6] SNMP: Fix several warnings without procfs.
YOSHIFUJI Hideaki [Sat, 21 Apr 2007 11:13:44 +0000 (20:13 +0900)]
[IPV6] SNMP: Fix several warnings without procfs.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
17 years ago[IPV6] SNMP: Avoid unaligned accesses.
YOSHIFUJI Hideaki [Sat, 21 Apr 2007 11:12:43 +0000 (20:12 +0900)]
[IPV6] SNMP: Avoid unaligned accesses.

Because stats pointer may not be aligned for u64, use memcpy
to fill u64 values.
Issue reported by David Miller <davem@davemloft.net>.
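
A minimal user-space sketch of the technique, assuming a byte buffer whose 64-bit slots are not guaranteed to be 8-byte aligned. The helper names store_u64_unaligned() and load_u64_unaligned() are invented for this example; in-kernel code would more likely use the put_unaligned() helper mentioned in the later commit.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: write a 64-bit value to a destination that may not be
 * 8-byte aligned. memcpy() makes no alignment assumption, unlike a direct
 * store through a uint64_t pointer.
 */
static void store_u64_unaligned(void *dst, uint64_t value)
{
    memcpy(dst, &value, sizeof(value));
}

/* Hypothetical helper: read the value back without an aligned load. */
static uint64_t load_u64_unaligned(const void *src)
{
    uint64_t value;

    memcpy(&value, src, sizeof(value));
    return value;
}
```

On architectures that trap on misaligned 64-bit accesses, copying byte-wise like this avoids the fault, and on architectures that tolerate them compilers usually reduce the memcpy() to a plain load or store.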

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
17 years ago[TCP]: Sed magic converts func(sk, tp, ...) -> func(sk, ...)
Ilpo Järvinen [Sat, 21 Apr 2007 05:18:02 +0000 (22:18 -0700)]
[TCP]: Sed magic converts func(sk, tp, ...) -> func(sk, ...)

This is a (mostly) automated change using sed magic:

sed -e '/struct sock \*sk/ N' -e '/struct sock \*sk/ N'
    -e '/struct sock \*sk/ N' -e '/struct sock \*sk/ N'
    -e 's|struct sock \*sk,[\n\t ]*struct tcp_sock \*tp\([^{]*\n{\n\)|
  struct sock \*sk\1\tstruct tcp_sock *tp = tcp_sk(sk);\n|g'
    -e 's|struct sock \*sk, struct tcp_sock \*tp|
  struct sock \*sk|g' -e 's|sk, tp\([^-]\)|sk\1|g'

Fixed four unused variable (tp) warnings that were introduced.

In addition, manually added newlines after local variables and
tweaked function arguments positioning.

$ gcc --version
gcc (GCC) 4.1.1 20060525 (Red Hat 4.1.1-1)
...
$ codiff -fV built-in.o.old built-in.o.new
net/ipv4/route.c:
  rt_cache_flush |  +14
 1 function changed, 14 bytes added

net/ipv4/tcp.c:
  tcp_setsockopt |   -5
  tcp_sendpage   |  -25
  tcp_sendmsg    |  -16
 3 functions changed, 46 bytes removed

net/ipv4/tcp_input.c:
  tcp_try_undo_recovery |   +3
  tcp_try_undo_dsack    |   +2
  tcp_mark_head_lost    |  -12
  tcp_ack               |  -15
  tcp_event_data_recv   |  -32
  tcp_rcv_state_process |  -10
  tcp_rcv_established   |   +1
 7 functions changed, 6 bytes added, 69 bytes removed, diff: -63

net/ipv4/tcp_output.c:
  update_send_head          |   -9
  tcp_transmit_skb          |  +19
  tcp_cwnd_validate         |   +1
  tcp_write_wakeup          |  -17
  __tcp_push_pending_frames |  -25
  tcp_push_one              |   -8
  tcp_send_fin              |   -4
 7 functions changed, 20 bytes added, 63 bytes removed, diff: -43

built-in.o.new:
 18 functions changed, 40 bytes added, 178 bytes removed, diff: -138

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET]: Fix comments for register_netdev().
Borislav Petkov [Sat, 21 Apr 2007 05:14:10 +0000 (22:14 -0700)]
[NET]: Fix comments for register_netdev().

Correct the function name in the comments supplied with
register_netdev()

Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IrDA]: Misc spelling corrections.
G. Liakhovetski [Sat, 21 Apr 2007 05:12:48 +0000 (22:12 -0700)]
[IrDA]: Misc spelling corrections.

Spelling corrections, from "to" to "too".

Signed-off-by: G. Liakhovetski <gl@dsa-ac.de>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IrDA]: Adding carriage returns to mcs7780 debug statements
Samuel Ortiz [Sat, 21 Apr 2007 05:12:07 +0000 (22:12 -0700)]
[IrDA]: Adding carriage returns to mcs7780 debug statements

Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IrDA] af_irda: IRDA_ASSERT cleanups
Samuel Ortiz [Sat, 21 Apr 2007 05:10:13 +0000 (22:10 -0700)]
[IrDA] af_irda: IRDA_ASSERT cleanups

In af_irda.c, the multiple IRDA_ASSERT() are either hiding bugs, useless, or
returning the wrong value.
Let's clean that up.

Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IrDA] af_irda: irda_accept cleanup
Samuel Ortiz [Sat, 21 Apr 2007 05:09:33 +0000 (22:09 -0700)]
[IrDA] af_irda: irda_accept cleanup

This patch removes a cut'n'paste copy of wait_event_interruptible
from irda_accept.

Signed-off-by: Samuel Ortiz <samuel@ortiz.org>
Acked-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IrDA] af_irda: Silence kernel message in irda_recvmsg_stream
Olaf Kirch [Sat, 21 Apr 2007 05:08:15 +0000 (22:08 -0700)]
[IrDA] af_irda: Silence kernel message in irda_recvmsg_stream

This patch silences an IRDA_ASSERT in irda_recvmsg_stream. As described in
http://bugzilla.kernel.org/show_bug.cgi?id=7512, irda_disconnect_indication
would set sk->sk_err to ECONNRESET, and a subsequent call to recvmsg
would print an irritating kernel message and return -1.

When a connected socket is closed by the peer, recvmsg should return 0
rather than an error. This patch fixes this.
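
The convention that a stream recv() returns 0 once the peer has closed can be observed from user space. The sketch below uses an AF_UNIX socket pair rather than IrDA, purely because it needs no hardware or privileges; recv_after_peer_close() is a name made up for this example.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical demo: create a connected stream socket pair, close one end,
 * and return what recv() reports on the other end. An orderly shutdown by
 * the peer is expected to show up as a return value of 0, not as an error.
 * Returns -1 if the socket pair cannot be created.
 */
static ssize_t recv_after_peer_close(void)
{
    int sv[2];
    char buf[16];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    close(sv[1]);                          /* peer goes away */
    n = recv(sv[0], buf, sizeof(buf), 0);  /* end-of-stream indication */
    close(sv[0]);
    return n;
}
```

Applications commonly treat that 0 as end-of-stream and stop reading, which is why returning -1 in the same situation is surprising to callers.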

Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IrDA] af_irda: irda_recvmsg_stream cleanup
Olaf Kirch [Sat, 21 Apr 2007 05:05:27 +0000 (22:05 -0700)]
[IrDA] af_irda: irda_recvmsg_stream cleanup

This patch cleans up some code in irda_recvmsg_stream, replacing some
homebrew code with prepare_to_wait/finish_wait, and making the
code honor sock_rcvtimeo.

Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET]: Move sk_setup_caps() out of line.
Andi Kleen [Sat, 21 Apr 2007 00:12:43 +0000 (17:12 -0700)]
[NET]: Move sk_setup_caps() out of line.

It is far too large to be an inline and not in any hot paths.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[TCP]: Uninline tcp_done().
Andi Kleen [Sat, 21 Apr 2007 00:11:46 +0000 (17:11 -0700)]
[TCP]: Uninline tcp_done().

The function is quite big and has several call sites and nothing
to collapse by compiler optimization on inlining.

Besides it's nicer to read in a .c file.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET]: cleanup extra semicolons
Stephen Hemminger [Sat, 21 Apr 2007 00:09:22 +0000 (17:09 -0700)]
[NET]: cleanup extra semicolons

Spring cleaning time...

There seem to be a lot of places in the network code that have
extra bogus semicolons after conditionals.  The most common is a
bogus semicolon after: switch() { }

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[TCP]: TCP Illinois congestion control (rev3)
Stephen Hemminger [Sat, 21 Apr 2007 00:07:51 +0000 (17:07 -0700)]
[TCP]: TCP Illinois congestion control (rev3)

This is an implementation of TCP Illinois invented by Shao Liu
at University of Illinois. It is another variant of Reno which adapts
the alpha and beta parameters based on RTT. The basic idea is to increase
the window less rapidly as delay approaches the maximum. See the papers
and talks to get a more complete description.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET]: Get rid of netdev_nit
Stephen Hemminger [Sat, 21 Apr 2007 00:02:45 +0000 (17:02 -0700)]
[NET]: Get rid of netdev_nit

It isn't any faster to test a boolean global variable than do a simple
check for empty list.
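
To see why the two tests cost about the same, here is a minimal sketch of a circular doubly-linked list in the style the kernel uses. The type and function names are made up for this example and are not the kernel's list API; the point is that the emptiness check is a single pointer comparison, so a separately maintained flag buys nothing and is one more piece of state to keep in sync.

```c
#include <stdbool.h>

/* Hypothetical circular doubly-linked list node; the head is a node too. */
struct list_node {
    struct list_node *next;
    struct list_node *prev;
};

/* An empty list is a head that points at itself. */
static void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

/* Emptiness is a single pointer comparison. */
static bool list_is_empty(const struct list_node *head)
{
    return head->next == head;
}

/* Link 'node' in just before 'head', i.e. at the tail of the list. */
static void list_add_tail_node(struct list_node *node, struct list_node *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Unlink 'node' and leave it pointing at itself. */
static void list_del_node(struct list_node *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->next = node;
    node->prev = node;
}
```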

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[PPPOE]: Fix device tear-down notification.
Michal Ostrowski [Fri, 20 Apr 2007 23:59:24 +0000 (16:59 -0700)]
[PPPOE]: Fix device tear-down notification.

pppoe_flush_dev() kicks all sockets bound to a device that is going down.
In doing so, locks must be taken in the right order consistently (sock lock,
followed by the pppoe_hash_lock).  However, the scan process is based on
us holding the sock lock.  So, when something is found in the scan we must
release the lock we're holding and grab the sock lock.

This patch fixes race conditions between this code and pppoe_release(),
both of which perform similar functions but would naturally prefer to grab
locks in opposing orders.  Both code paths are now going after these locks
in a consistent manner.

pppoe_hash_lock protects the contents of the "pppox_sock" objects that reside
inside the hash.  Thus, NULL'ing out the pppoe_dev field should be done
under the protection of this lock.

Signed-off-by: Michal Ostrowski <mostrows@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[PPPOE]: memory leak when socket is release()d before PPPIOCGCHAN has been called...
Florian Zumbiehl [Fri, 20 Apr 2007 23:58:14 +0000 (16:58 -0700)]
[PPPOE]: memory leak when socket is release()d before PPPIOCGCHAN has been called on it

below you find a patch that fixes a memory leak when a PPPoE socket is
release()d after it has been connect()ed, but before the PPPIOCGCHAN ioctl
ever has been called on it.

This is somewhat of a security problem, too, since PPPoE sockets can be
created by any user, so any user can easily allocate all the machine's
RAM to non-swappable address space and thus DoS the system.

Is there any specific reason for PPPoE sockets being available to any
unprivileged process, BTW? After all, you need a packet socket for the
discovery stage anyway, so it's unlikely that any unprivileged process
will ever need to create a PPPoE socket, no? Allocating all session IDs
for a known AC is a kind of DoS, too, after all - with Juniper ERXes,
this is really easy, actually, since they don't ever assign session ids
above 8000 ...

Signed-off-by: Florian Zumbiehl <florz@florz.de>
Acked-by: Michal Ostrowski <mostrows@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[PPPOE]: race between interface going down and connect()
Florian Zumbiehl [Fri, 20 Apr 2007 23:57:27 +0000 (16:57 -0700)]
[PPPOE]: race between interface going down and connect()

below you find a patch that (hopefully) fixes a race between an interface
going down and a connect() to a peer on that interface. Before,
connect() would determine that an interface is up, then the interface
could go down and all entries referring to that interface in the
item_hash_table would be marked as ZOMBIEs and their references to
the device would be freed, and after that, connect() would put a new
entry into the hash table referring to the device that meanwhile is
down already - which also would cause unregister_netdevice() to wait
until the socket has been release()d.

This patch does not suffice if we are not allowed to accept connect()s
referring to a device that we already acked a NETDEV_GOING_DOWN for
(that is: all references are only guaranteed to be freed after
NETDEV_DOWN has been acknowledged, not necessarily after the
NETDEV_GOING_DOWN already). And if we are allowed to, we could avoid
looking through the hash table upon NETDEV_GOING_DOWN completely and
only do that once we get the NETDEV_DOWN ...

mostrows:
pppoe_flush_dev is called on NETDEV_GOING_DOWN and NETDEV_DOWN to deal with
this "late connect" issue.  Ideally one would hope to notify users at the
"NETDEV_GOING_DOWN" phase (just to pretend to be nice).  However, it is the
NETDEV_DOWN scan that takes all the responsibility for ensuring nobody is
hanging around at that time.

Signed-off-by: Florian Zumbiehl <florz@florz.de>
Acked-by: Michal Ostrowski <mostrows@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[PPPoE]: miscellaneous smaller cleanups
Florian Zumbiehl [Fri, 20 Apr 2007 23:56:31 +0000 (16:56 -0700)]
[PPPoE]: miscellaneous smaller cleanups

below is a patch that just removes dead code/initializers without any
effect (first access is an assignment) that I stumbled across while
reading the source.

Signed-off-by: Florian Zumbiehl <florz@florz.de>
Acked-by: Michal Ostrowski <mostrows@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET] skbuff: skb_store_bits const is backwards
Stephen Hemminger [Fri, 20 Apr 2007 23:40:01 +0000 (16:40 -0700)]
[NET] skbuff: skb_store_bits const is backwards

Getting warnings because skb_store_bits has skb as constant,
but the function overwrites it. Looks like const was on the
wrong side.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[BRIDGE]: Fix warning in net-2.6.22
Stephen Hemminger [Fri, 20 Apr 2007 23:39:17 +0000 (16:39 -0700)]
[BRIDGE]: Fix warning in net-2.6.22

The following is leftover from earlier change in net-2.6.22.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[AX25/NETROM/ROSE]: Convert to use modern wait queue API
Ralf Baechle [Fri, 20 Apr 2007 23:06:45 +0000 (16:06 -0700)]
[AX25/NETROM/ROSE]: Convert to use modern wait queue API

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[AF_PACKET]: Add option to return orig_dev to userspace.
Peter P. Waskiewicz Jr [Fri, 20 Apr 2007 23:05:39 +0000 (16:05 -0700)]
[AF_PACKET]: Add option to return orig_dev to userspace.

Add a packet socket option to allow the orig_dev index to be returned
to userspace when passing traffic through a decapsulated device, such
as the bonding driver.

This is very useful for layer 2 traffic being able to report which
physical device actually received the traffic, instead of having the
encapsulating device hide that information.

The new option is called PACKET_ORIGDEV.
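
A sketch of how a user-space program might turn the option on, assuming the usual setsockopt() interface at level SOL_PACKET and a Linux system whose headers define PACKET_ORIGDEV. The helper name enable_origdev() is hypothetical, and creating the packet socket itself normally needs CAP_NET_RAW, so that part is left to the caller.

```c
#include <sys/socket.h>
#include <linux/if_packet.h>

#ifndef SOL_PACKET
#define SOL_PACKET 263   /* Linux value, in case the libc headers lack it */
#endif

/*
 * Hypothetical helper: ask an already opened AF_PACKET socket to report the
 * original receiving device. Returns 0 on success, or -1 with errno set.
 */
static int enable_origdev(int fd)
{
    int one = 1;

    return setsockopt(fd, SOL_PACKET, PACKET_ORIGDEV, &one, sizeof(one));
}
```

With the option enabled, the interface index reported alongside each received packet is expected to be that of the physical device rather than the bonding master; treat that as an assumption to verify against the packet(7) documentation for the kernel in use.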

Signed-off-by: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IPV6] SNMP: Export statistics via netlink without CONFIG_PROC_FS.
YOSHIFUJI Hideaki [Fri, 20 Apr 2007 22:57:45 +0000 (15:57 -0700)]
[IPV6] SNMP: Export statistics via netlink without CONFIG_PROC_FS.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IPV4] SNMP: Move some statistic bits to net/ipv4/proc.c.
YOSHIFUJI Hideaki [Fri, 20 Apr 2007 22:57:15 +0000 (15:57 -0700)]
[IPV4] SNMP: Move some statistic bits to net/ipv4/proc.c.

This also fixes memory leak in error path.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IPV6] SNMP: Move some statistic bits to net/ipv6/proc.c.
YOSHIFUJI Hideaki [Fri, 20 Apr 2007 22:56:48 +0000 (15:56 -0700)]
[IPV6] SNMP: Move some statistic bits to net/ipv6/proc.c.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IPV6] SNMP: Netlink interface.
YOSHIFUJI Hideaki [Fri, 20 Apr 2007 22:56:20 +0000 (15:56 -0700)]
[IPV6] SNMP: Netlink interface.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[INET]: Add IP(V6)_PMTUDISC_RPOBE
John Heffner [Fri, 20 Apr 2007 22:53:27 +0000 (15:53 -0700)]
[INET]: Add IP(V6)_PMTUDISC_RPOBE

Add IP(V6)_PMTUDISC_PROBE value for IP(V6)_MTU_DISCOVER.  This option forces
us not to fragment, but does not make use of the kernel path MTU discovery.
That is, it allows for user-mode MTU probing (or, packetization-layer path
MTU discovery).  This is particularly useful for diagnostic utilities, like
traceroute/tracepath.
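
For illustration, this is roughly how a diagnostic tool could request the new mode on an IPv4 UDP socket. It is a sketch under the assumption that the libc headers expose IP_PMTUDISC_PROBE; the helper names open_probe_socket() and get_pmtudisc_mode() are invented for this example.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a UDP socket that sets DF on outgoing packets
 * but ignores the kernel's cached path MTU, leaving probing to user space.
 * Returns the socket descriptor, or -1 on failure.
 */
static int open_probe_socket(void)
{
    int val = IP_PMTUDISC_PROBE;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &val, sizeof(val)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: read the current IP_MTU_DISCOVER mode, -1 on error. */
static int get_pmtudisc_mode(int fd)
{
    int val = -1;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &val, &len) < 0)
        return -1;
    return val;
}
```

A tracepath-style tool would then send datagrams of varying sizes on this socket and infer the path MTU from which ones get through or trigger ICMP errors.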

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IPV6]: MTU discovery check in ip6_fragment()
John Heffner [Fri, 20 Apr 2007 22:52:39 +0000 (15:52 -0700)]
[IPV6]: MTU discovery check in ip6_fragment()

Adds a check in ip6_fragment() mirroring ip_fragment() for packets
that we can't fragment, and sends an ICMP Packet Too Big message
in response.

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED]: ingress: switch back to using ingress_lock
Patrick McHardy [Tue, 17 Apr 2007 00:07:08 +0000 (17:07 -0700)]
[NET_SCHED]: ingress: switch back to using ingress_lock

Switch ingress queueing back to use ingress_lock. qdisc_lock_tree now locks
both the ingress and egress qdiscs on the device. All changes to data that
might be used on both ingress and egress need to be protected by using
qdisc_lock_tree instead of manually taking dev->queue_lock. Additionally
the qdisc stats_lock needs to be initialized to ingress_lock for ingress
qdiscs.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED]: Eliminate qdisc_tree_lock
Patrick McHardy [Tue, 17 Apr 2007 00:02:10 +0000 (17:02 -0700)]
[NET_SCHED]: Eliminate qdisc_tree_lock

Since we're now holding the rtnl during the entire dump operation, we
can remove qdisc_tree_lock, whose only purpose is to protect dump
callbacks from concurrent changes to the qdisc tree.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETLINK]: don't reinitialize callback mutex
Patrick McHardy [Wed, 25 Apr 2007 21:01:17 +0000 (14:01 -0700)]
[NETLINK]: don't reinitialize callback mutex

Don't reinitialize the callback mutex the netlink_kernel_create caller
handed in, it is supposed to already be initialized and could already
be held by someone.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[RTNETLINK]: Remove unnecessary locking in dump callbacks
Patrick McHardy [Tue, 17 Apr 2007 00:00:53 +0000 (17:00 -0700)]
[RTNETLINK]: Remove unnecessary locking in dump callbacks

Since we're now holding the rtnl during the entire dump operation, we can
remove additional locking for rtnl protected data. This patch does that
for all simple cases (dev_base_lock for dev_base walking, RCU protection
for FIB rule dumping).

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[RTNETLINK]: Hold rtnl_mutex during netlink dump callbacks
Patrick McHardy [Mon, 16 Apr 2007 23:59:10 +0000 (16:59 -0700)]
[RTNETLINK]: Hold rtnl_mutex during netlink dump callbacks

Hold rtnl_mutex during the entire netlink dump operation. This allows
locking in the dump callbacks to be simplified, since they can now rely on
no concurrent changes happening.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETLINK]: Switch cb_lock spinlock to mutex and allow to override it
Patrick McHardy [Fri, 20 Apr 2007 21:14:21 +0000 (14:14 -0700)]
[NETLINK]: Switch cb_lock spinlock to mutex and allow to override it

Switch cb_lock to mutex and allow netlink kernel users to override it
with a subsystem specific mutex for consistent locking in dump callbacks.
All netlink_dump_start users have been audited not to rely on any
side-effects of the previously used spinlock.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: ipt_ULOG: add compat conversion functions
Patrick McHardy [Fri, 13 Apr 2007 05:17:05 +0000 (22:17 -0700)]
[NETFILTER]: ipt_ULOG: add compat conversion functions

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: nfnetlink_log: remove fallback to group 0
Patrick McHardy [Fri, 13 Apr 2007 05:16:38 +0000 (22:16 -0700)]
[NETFILTER]: nfnetlink_log: remove fallback to group 0

Don't fall back to group 0 if no instance can be found for the given group.
This potentially confuses the listener and is not what the user configured.
Also remove the ring buffer spamming that happens when rules are set up
before the logging daemon is started.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: {eb,ip6,ip}t_LOG: remove remains of LOG target overloading
Patrick McHardy [Fri, 13 Apr 2007 05:16:18 +0000 (22:16 -0700)]
[NETFILTER]: {eb,ip6,ip}t_LOG: remove remains of LOG target overloading

All LOG targets always use their internal logging function nowadays, so
remove the incorrect error message and handle real errors (!= -EEXIST)
by failing to load.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: nf_nat: use HW checksumming when possible
Patrick McHardy [Fri, 13 Apr 2007 05:15:50 +0000 (22:15 -0700)]
[NETFILTER]: nf_nat: use HW checksumming when possible

When mangling packets forwarded to a HW checksumming capable device,
offload recalculation of the checksum instead of doing it in software.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: ebt_arp: add gratuitous arp filtering
Bart De Schuymer [Fri, 13 Apr 2007 05:15:06 +0000 (22:15 -0700)]
[NETFILTER]: ebt_arp: add gratuitous arp filtering

The attached patch adds gratuitous arp filtering, more precisely: it
allows checking that the IPv4 source address matches the IPv4
destination address inside the ARP header. It also adds a check for the
hardware address type when matching MAC addresses (nothing critical,
just for better consistency).
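
As a rough illustration of the gratuitous ARP test described above, the
sketch below compares the sender and target IPv4 addresses of an
Ethernet/IPv4 ARP payload. The struct and function names are made up for
this example and are not the ones used in ebt_arp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Address part of an Ethernet/IPv4 ARP packet (follows the 8-byte ARP header).
 * Hypothetical layout helper for this example only. */
struct toy_arp_ipv4 {
    uint8_t sha[6];   /* sender hardware address */
    uint8_t spa[4];   /* sender IPv4 address */
    uint8_t tha[6];   /* target hardware address */
    uint8_t tpa[4];   /* target IPv4 address */
};

/* A gratuitous ARP carries the same IPv4 address as sender and target. */
bool toy_is_gratuitous_arp(const struct toy_arp_ipv4 *p)
{
    assert(p != NULL);
    return memcmp(p->spa, p->tpa, sizeof(p->spa)) == 0;
}
```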

Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Acked-by: Carl-Daniel Hailfinger <c-d.hailfinger.devel.2006@gmx.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: bridge-nf: filter bridged IPv4/IPv6 encapsulated in pppoe traffic
Michael Milner [Fri, 13 Apr 2007 05:14:23 +0000 (22:14 -0700)]
[NETFILTER]: bridge-nf: filter bridged IPv4/IPv6 encapsulated in pppoe traffic

The attached patch by Michael Milner adds support for using iptables and
ip6tables on bridged traffic encapsulated in pppoe frames, similar to
what's already supported for vlan.

Signed-off-by: Michael Milner <milner@blissisland.ca>
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[DCCP]: Complete documentation of dccp_sock
Gerrit Renker [Fri, 20 Apr 2007 20:57:21 +0000 (13:57 -0700)]
[DCCP]: Complete documentation of dccp_sock

This fills in missing documentation for dccp_sock fields.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[DCCP]: Debug statements for Elapsed Time option
Gerrit Renker [Fri, 20 Apr 2007 20:56:47 +0000 (13:56 -0700)]
[DCCP]: Debug statements for Elapsed Time option

This prints the value of the parsed Elapsed Time when received via a
Timestamp Echo option [RFC 4342, 13.3].

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[DCCP]: Fix bug in the calculation of very low sending rates
Gerrit Renker [Fri, 20 Apr 2007 20:02:55 +0000 (13:02 -0700)]
[DCCP]: Fix bug in the calculation of very low sending rates

This fixes an error in the calculation of t_ipi when X converges towards
very low sending rates (between 1 and 64 bytes per second).

Although this case may not sound likely, it can be reproduced by connecting,
hitting enter (1 byte sent) and waiting for some time, during which the
nofeedback timer halves the sending rate until finally it reaches the region
1..64 bytes/sec. Computing X is handled correctly (tested separately); but by
dividing X _before_ entering the calculation of t_ipi, X becomes zero as
a result.  This in turn triggers a BUG condition caught in scaled_div().

Fixed by replacing with equivalent statement and explicit typecast for good
measure.

Calculation verified and effect of patch tested: afterwards the rate was
never reduced below 1 byte per 64 seconds, i.e. no divide-by-zero can occur.
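
The truncation can be reproduced outside the kernel. The sketch below
assumes X is kept scaled by 64 (consistent with the 1 byte per 64 seconds
floor mentioned above); scaled_div_sketch() is a stand-in written for this
example, not the kernel's scaled_div(), and the other names are
hypothetical too.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in helper: a * 10^6 / b. The divisor must be non-zero. */
uint64_t scaled_div_sketch(uint64_t a, uint64_t b)
{
    assert(b != 0);
    return a * 1000000ULL / b;
}

/* Old ordering: X is shifted down first, so 1..63 truncate to 0
 * and the value can no longer be used as a divisor. */
uint64_t x_prescaled(uint64_t x_scaled)
{
    return x_scaled >> 6;
}

/* Equivalent statement with the shift moved to the dividend:
 * the divisor stays X itself, which is at least 1. */
uint64_t t_ipi_fixed(uint32_t s, uint64_t x_scaled)
{
    return scaled_div_sketch((uint64_t)s << 6, x_scaled);
}
```

With s = 1 and the smallest scaled rate of 1, t_ipi_fixed() yields
64000000 microseconds, i.e. one byte per 64 seconds.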

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[S390]: Fix build on 31-bit.
David S. Miller [Wed, 11 Apr 2007 05:10:39 +0000 (22:10 -0700)]
[S390]: Fix build on 31-bit.

Allow s390 to properly override the generic
__div64_32() implementation by:

1) Using obj-y for div64.o in s390's makefile instead
   of lib-y

2) Adding the weak attribute to the generic implementation.

Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[SK_BUFF]: Fix missing offset adjustment in skb_copy_expand
Patrick McHardy [Wed, 11 Apr 2007 01:30:09 +0000 (18:30 -0700)]
[SK_BUFF]: Fix missing offset adjustment in skb_copy_expand

skb_copy_expand changes the headroom, so it needs to adjust the header
offsets by the difference between the old and the new value.
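
A minimal sketch of that adjustment, assuming the header positions are
stored as offsets from the start of the buffer. The struct and function
below are illustrative only and do not mirror struct sk_buff.

```c
#include <assert.h>
#include <stddef.h>

/* Header positions kept as offsets from the buffer head (toy type). */
struct toy_hdr_offsets {
    int transport;
    int network;
    int mac;
};

/* When the copy gets a different headroom, every header offset moves
 * by the difference between the new and the old headroom. */
void toy_adjust_headers(struct toy_hdr_offsets *h,
                        int old_headroom, int new_headroom)
{
    int off;

    assert(h != NULL && old_headroom >= 0 && new_headroom >= 0);
    off = new_headroom - old_headroom;
    h->transport += off;
    h->network += off;
    h->mac += off;
}
```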

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET]: loopback driver can use loopback_dev integrated net_device_stats
Eric Dumazet [Tue, 10 Apr 2007 20:25:40 +0000 (13:25 -0700)]
[NET]: loopback driver can use loopback_dev integrated net_device_stats

Rusty added a new 'stats' field to struct net_device.

The loopback driver can use it instead of declaring another struct
net_device_stats. This saves some memory.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years agobridge: check kmem_cache_create() error
Akinobu Mita [Sat, 7 Apr 2007 09:57:07 +0000 (18:57 +0900)]
bridge: check kmem_cache_create() error

This patch checks kmem_cache_create() error and aborts loading module
on failure.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
17 years agobridge: allow changing hardware address to any valid address
Stephen Hemminger [Mon, 9 Apr 2007 18:49:58 +0000 (11:49 -0700)]
bridge: allow changing hardware address to any valid address

When bridging pseudo devices that get created and destroyed (Xen), we
need to allow setting the address to any valid value.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
17 years agobridge: change when netlink events go to STP
Stephen Hemminger [Thu, 22 Mar 2007 21:08:46 +0000 (14:08 -0700)]
bridge: change when netlink events go to STP

Need to tell STP daemon about more events, like any time a
device is added even when it is down.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
17 years agobridge: add support for user mode STP
Stephen Hemminger [Wed, 21 Mar 2007 21:22:44 +0000 (14:22 -0700)]
bridge: add support for user mode STP

This patchset, based on work by Aji_Srinivas@emc.com, allows
spanning tree to be controlled from userspace.  Like hotplug, it
uses call_usermodehelper when spanning tree is enabled so there
is no visible API change. If the call to start usermode STP fails,
it falls back to the existing kernel STP.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
17 years agobridge: add sysfs hook to flush forwarding table
Stephen Hemminger [Mon, 9 Apr 2007 19:57:54 +0000 (12:57 -0700)]
bridge: add sysfs hook to flush forwarding table

The RSTP daemon needs to be able to flush all dynamic forwarding
entries in the case of topology change.

This is a temporary interface. It will change to a netlink interface
before RSTP daemon is officially released.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
17 years agobridge: simpler hash with salt
Stephen Hemminger [Wed, 21 Mar 2007 20:42:33 +0000 (13:42 -0700)]
bridge: simpler hash with salt

Instead of hashing the whole Ethernet address, it should be faster
to just use the last 4 bytes. Add a random salt value to the hash
to make it more difficult to construct worst case DoS hash chains.
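
The idea can be sketched as follows. The mixing function here is a simple
multiplicative hash chosen for this example, not the function the bridge
code uses, and the table size and names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_HASH_BITS 8
#define TOY_HASH_SIZE (1u << TOY_HASH_BITS)

/* Stand-in mixer: fold the salt into the key, then multiply and shift. */
uint32_t toy_mix32(uint32_t key, uint32_t salt)
{
    uint32_t h = (key ^ salt) * 2654435761u;
    return h ^ (h >> 16);
}

/* Hash only the last 4 bytes of the 6-byte Ethernet address. */
unsigned int toy_mac_hash(const uint8_t mac[6], uint32_t salt)
{
    uint32_t key;

    assert(mac != NULL);
    memcpy(&key, mac + 2, sizeof(key));   /* alignment-safe load */
    return toy_mix32(key, salt) & (TOY_HASH_SIZE - 1);
}
```

Because the salt is chosen at random at run time, an attacker who does not
know it cannot precompute a set of addresses that all land in one chain.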

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
17 years agobridge: don't route packets while learning
Stephen Hemminger [Wed, 21 Mar 2007 20:42:06 +0000 (13:42 -0700)]
bridge: don't route packets while learning

While in the STP learning state, don't route packets; wait until
forwarding delay has expired. The purpose of the forwarding delay
is to detect loops in the network, and if a brouter started up
and started forwarding, it could cause a flood.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
17 years agobridge: eliminate call by reference
Stephen Hemminger [Wed, 21 Mar 2007 20:38:47 +0000 (13:38 -0700)]
bridge: eliminate call by reference

Change the bridging hook to be a simple function with a return value
rather than modifying the skb argument. This could generate better
code and is cleaner.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
17 years ago[NET]: Treat CHECKSUM_PARTIAL as CHECKSUM_UNNECESSARY
Herbert Xu [Mon, 9 Apr 2007 18:59:39 +0000 (11:59 -0700)]
[NET]: Treat CHECKSUM_PARTIAL as CHECKSUM_UNNECESSARY

When a transmitted packet is looped back directly, CHECKSUM_PARTIAL
maps to the semantics of CHECKSUM_UNNECESSARY.  Therefore we should
treat it as such in the stack.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETDRV]: Perform missing csum_offset conversions
Herbert Xu [Tue, 24 Apr 2007 00:06:40 +0000 (17:06 -0700)]
[NETDRV]: Perform missing csum_offset conversions

When csum_offset was introduced we did a conversion from csum to
csum_offset where applicable.  A couple of drivers were missed in
this process.

It was harmless to begin with since the two fields coincided.  Now
that we've made them different with the addition of csum_start, the
missed drivers must be converted or they can't send packets out at
all that require checksum offload.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET]: Use csum_start offset instead of skb_transport_header
Herbert Xu [Mon, 9 Apr 2007 18:59:07 +0000 (11:59 -0700)]
[NET]: Use csum_start offset instead of skb_transport_header

The skb transport pointer is currently used to specify the start
of the checksum region for transmit checksum offload.  Unfortunately,
the same pointer is also used during receive side processing.

This creates a problem when we want to retransmit a received
packet with partial checksums since the skb transport pointer
would be overwritten.

This patch solves this problem by creating a new 16-bit csum_start
offset value to replace the skb transport header for the purpose
of checksums.  This offset is calculated from skb->head so that
it does not have to change when skb->data changes.

No extra space is required since csum_offset itself fits within
a 16-bit word so we can use the other 16 bits for csum_start.
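
The layout described above can be sketched with a toy buffer type. Only
the field names csum_start and csum_offset come from this message; the
struct and helpers are illustrative and are not the kernel's sk_buff API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy buffer, not struct sk_buff. */
struct toy_skb {
    uint8_t *head;          /* start of the allocation, never moves */
    uint8_t *data;          /* start of packet data, moves on push/pull */
    uint16_t csum_start;    /* offset from head where checksumming starts */
    uint16_t csum_offset;   /* where to store the sum, relative to csum_start */
};

/* Record the checksum start relative to head, so later changes to
 * data (or to a receive-side transport pointer) do not disturb it. */
void toy_set_csum_start(struct toy_skb *skb, const uint8_t *start)
{
    ptrdiff_t off = start - skb->head;

    assert(off >= 0 && off <= UINT16_MAX);
    skb->csum_start = (uint16_t)off;
}

/* Recover the pointer when the checksum is actually computed. */
uint8_t *toy_csum_ptr(const struct toy_skb *skb)
{
    return skb->head + skb->csum_start;
}
```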

For backwards compatibility, just before we push a packet with
partial checksums off into the device driver, we set the skb
transport header to what it would have been under the old scheme.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[XFRM]: beet: fix worst case header_len calculation
Patrick McHardy [Mon, 9 Apr 2007 18:47:58 +0000 (11:47 -0700)]
[XFRM]: beet: fix worst case header_len calculation

esp_init_state doesn't account for the beet pseudo header in the header_len
calculation, which may result in undersized skbs hitting xfrm4_beet_output,
causing unnecessary reallocations in ip_finish_output2.

The skbs should still always have enough room to avoid causing
skb_under_panic in skb_push since we have at least 16 bytes available
from LL_RESERVED_SPACE in xfrm_state_check_space.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[XFRM]: Optimize MTU calculation
Patrick McHardy [Mon, 9 Apr 2007 18:47:18 +0000 (11:47 -0700)]
[XFRM]: Optimize MTU calculation

Replace the probing based MTU estimation, which usually takes 2-3 iterations
to find a fitting value and may underestimate the MTU, by an exact calculation.

Also fix underestimation of the XFRM trailer_len, which causes unnecessary
reallocations.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[XFRM]: esp: fix skb_tail_pointer conversion bug
Patrick McHardy [Mon, 9 Apr 2007 18:46:17 +0000 (11:46 -0700)]
[XFRM]: esp: fix skb_tail_pointer conversion bug

Fix incorrect replacement of the "trailer" skb with "skb" during the
skb_tail_pointer conversion:

-       *(u8*)(trailer->tail - 1) = top_iph->protocol;
+       *(skb_tail_pointer(skb) - 1) = top_iph->protocol;

-       *(u8 *)(trailer->tail - 1) = *skb_network_header(skb);
+       *(skb_tail_pointer(skb) - 1) = *skb_network_header(skb);

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[SK_BUFF]: Fix missing offset adjustment in pskb_expand_head
Patrick McHardy [Mon, 9 Apr 2007 18:45:04 +0000 (11:45 -0700)]
[SK_BUFF]: Fix missing offset adjustment in pskb_expand_head

Since we're increasing the headroom, the header offsets need to be
increased by the same amount as well.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IPV6] FIB6RULE: Find source address during looking up route.
YOSHIFUJI Hideaki [Fri, 6 Apr 2007 18:45:39 +0000 (11:45 -0700)]
[IPV6] FIB6RULE: Find source address during looking up route.

When looking up route for destination with rules with
source address restrictions, we may need to find a source
address for the traffic if not given.

Based on patch from Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[XFRM]: beet: minor cleanups
Patrick McHardy [Thu, 5 Apr 2007 23:04:04 +0000 (16:04 -0700)]
[XFRM]: beet: minor cleanups

Remove unnecessary initialization/variable.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[RTNL]: Improve error codes for unsupported operations
Thomas Graf [Thu, 5 Apr 2007 21:35:52 +0000 (14:35 -0700)]
[RTNL]: Improve error codes for unsupported operations

The most common trigger of these errors is that the config option
which would make the functionality available hasn't been enabled.
Therefore returning EOPNOTSUPP gives a better idea of what is
going wrong.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET]: Move generic skbuff stuff from XFRM code to generic code
David Howells [Tue, 3 Apr 2007 03:19:53 +0000 (20:19 -0700)]
[NET]: Move generic skbuff stuff from XFRM code to generic code

Move generic skbuff stuff from XFRM code to generic code so that
AF_RXRPC can use it too.

The kdoc comments I've attached to the functions need to be checked
by whoever wrote them as I had to make some guesses about the workings
of these functions.

Signed-off-By: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[CREDITS]: Update Arnaldo entry
Arnaldo Carvalho de Melo [Sat, 31 Mar 2007 15:05:49 +0000 (12:05 -0300)]
[CREDITS]: Update Arnaldo entry

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
17 years ago[SK_BUFF]: Some more conversions to skb_copy_from_linear_data
Arnaldo Carvalho de Melo [Sat, 31 Mar 2007 14:55:45 +0000 (11:55 -0300)]
[SK_BUFF]: Some more conversions to skb_copy_from_linear_data

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
17 years ago[SK_BUFF]: Introduce skb_copy_to_linear_data{_offset}
Arnaldo Carvalho de Melo [Sat, 31 Mar 2007 14:55:19 +0000 (11:55 -0300)]
[SK_BUFF]: Introduce skb_copy_to_linear_data{_offset}

To clearly state the intent of copying to linear sk_buffs, _offset being an
overly long variant but interesting for the sake of saving some bytes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
17 years ago[NET]: Fix warnings in 3c523.c and ni52.c
David S. Miller [Fri, 30 Mar 2007 02:16:03 +0000 (19:16 -0700)]
[NET]: Fix warnings in 3c523.c and ni52.c

We have to put back the cast to "char *" because these
pointers are volatile.

Reported by Andrew Morton.

Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET]: Inline net_device_stats
Rusty Russell [Wed, 28 Mar 2007 21:29:08 +0000 (14:29 -0700)]
[NET]: Inline net_device_stats

Network drivers which keep stats allocate their own stats structure
then write a get_stats() function to return them.  It would be nice if
this were done by default.

1) Add a new "stats" field to "struct net_device".
2) Add a new feature field to say "this driver uses the internal one"
3) Have a default "get_stats" which returns NULL if that feature is not set.
4) Change callers to check result of get_stats call for NULL, not if
   ->get_stats is set.
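
The four steps can be sketched with toy types as below. The flag name and
all struct and function names are invented for this example; they are not
the identifiers the patch adds.

```c
#include <assert.h>
#include <stddef.h>

struct toy_stats {
    unsigned long rx_packets;
    unsigned long tx_packets;
};

#define TOY_F_INTERNAL_STATS 0x1u   /* hypothetical feature bit (step 2) */

struct toy_netdev {
    unsigned int features;
    struct toy_stats stats;                                   /* step 1 */
    struct toy_stats *(*get_stats)(struct toy_netdev *dev);
};

/* Step 3: default hook, NULL unless the driver opted in. */
struct toy_stats *toy_default_get_stats(struct toy_netdev *dev)
{
    assert(dev != NULL);
    if (dev->features & TOY_F_INTERNAL_STATS)
        return &dev->stats;
    return NULL;
}

/* Step 4: callers test the returned pointer, not the hook itself. */
unsigned long toy_rx_packets(struct toy_netdev *dev)
{
    struct toy_stats *s = dev->get_stats(dev);

    return s ? s->rx_packets : 0;
}
```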

This should not break backwards compatibility with older drivers, yet
allow modern drivers to shed some boilerplate code.

Lightly tested: works for a modified lguest network driver.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>