firefly-linux-kernel-4.4.55.git
11 years agonet: xfrm: use __this_cpu_read per-cpu helper
Shan Wei [Tue, 13 Nov 2012 12:36:00 +0000 (20:36 +0800)]
net: xfrm: use __this_cpu_read per-cpu helper

this_cpu_ptr/this_cpu_read is faster than per_cpu_ptr(p, smp_processor_id())
and can reduce memory accesses.
The latter helper needs to find the offset for the current cpu,
and needs more assembler instructions, as the objdump output below shows.

this_cpu_ptr relocates an address. this_cpu_read() relocates the address
and performs the fetch. this_cpu_read() saves you more instructions
since it can do the relocation and the fetch in one instruction.

per_cpu_ptr(p, smp_processor_id()):
  1e:   65 8b 04 25 00 00 00 00         mov    %gs:0x0,%eax
  26:   48 98                           cltq
  28:   31 f6                           xor    %esi,%esi
  2a:   48 c7 c7 00 00 00 00            mov    $0x0,%rdi
  31:   48 8b 04 c5 00 00 00 00         mov    0x0(,%rax,8),%rax
  39:   c7 44 10 04 14 00 00 00         movl   $0x14,0x4(%rax,%rdx,1)

this_cpu_ptr(p)
  1e:   65 48 03 14 25 00 00 00 00      add    %gs:0x0,%rdx
  27:   31 f6                           xor    %esi,%esi
  29:   c7 42 04 14 00 00 00            movl   $0x14,0x4(%rdx)
  30:   48 c7 c7 00 00 00 00            mov    $0x0,%rdi

Signed-off-by: Shan Wei <davidshan@tencent.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
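
The difference between the two helpers in the commit above can be modelled in user space. The sketch below is only an illustration under stated assumptions: the names model_per_cpu_ptr, model_this_cpu_ptr, percpu_area, cpu_offset and this_cpu_off are hypothetical, and the "current CPU" is a plain variable rather than a real processor id. It shows that the per_cpu_ptr style needs the CPU number plus a table lookup before the add, while the this_cpu_ptr style adds one already-known base (which x86-64 keeps reachable through %gs).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4                 /* hypothetical CPU count */

/* One slab of per-CPU storage per CPU (hypothetical layout). */
static uint32_t percpu_area[MODEL_NR_CPUS][16];

/* Byte offset of each CPU's slab from CPU 0's slab. */
static ptrdiff_t cpu_offset[MODEL_NR_CPUS];

static int current_cpu;                 /* stands in for smp_processor_id() */
static ptrdiff_t this_cpu_off;          /* stands in for the per-CPU base */

/* Pretend the code now runs on another CPU. */
static void model_switch_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    current_cpu = cpu;
    this_cpu_off = cpu_offset[cpu];
}

static void model_init(void)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        cpu_offset[cpu] = (ptrdiff_t)(cpu * sizeof(percpu_area[0]));
    model_switch_cpu(0);
}

/* per_cpu_ptr(p, cpu) style: CPU number, then table lookup, then add. */
static void *model_per_cpu_ptr(void *base, int cpu)
{
    return (unsigned char *)base + cpu_offset[cpu];
}

/* this_cpu_ptr(p) style: a single add of the current CPU's base. */
static void *model_this_cpu_ptr(void *base)
{
    return (unsigned char *)base + this_cpu_off;
}
```

Both helpers return the same address for the running CPU; only the amount of work needed to get there differs.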
11 years agoxfrm: remove redundant replay_esn check
Ulrich Weber [Thu, 8 Nov 2012 10:15:44 +0000 (11:15 +0100)]
xfrm: remove redundant replay_esn check

x->replay_esn is already checked in the if clause,
so remove the check and indent properly

Signed-off-by: Ulrich Weber <ulrich.weber@sophos.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
11 years agoMerge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge
David S. Miller [Thu, 8 Nov 2012 00:08:42 +0000 (19:08 -0500)]
Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge

Included changes:
- minimal fixes to the packet layout to avoid the __packed attribute when not
  needed
- new packet type called UNICAST_4ADDR: in this packet it is possible to find
  both source and destination node (in the classic UNICAST header only the
  destination field exists).
- a new feature: Distributed ARP Table (D.A.T.). It aims to reduce ARP lookup
  latency by means of a DHT-like approach.

11 years agondisc: fix a typo in a comment in ndisc_recv_na()
Nicolas Dichtel [Wed, 7 Nov 2012 05:05:38 +0000 (05:05 +0000)]
ndisc: fix a typo in a comment in ndisc_recv_na()

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoksz884x: use module_pci_driver to simplify the code
Wei Yongjun [Wed, 7 Nov 2012 02:54:30 +0000 (02:54 +0000)]
ksz884x: use module_pci_driver to simplify the code

Use the module_pci_driver() macro to make the code simpler
by eliminating module_init and module_exit calls.

dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: Support loading cnic resources at run-time
Merav Sicron [Wed, 7 Nov 2012 00:45:48 +0000 (00:45 +0000)]
bnx2x: Support loading cnic resources at run-time

This patch replaces the BCM_CNIC define with a flag which can change at run-time
and which does not use the CONFIG_CNIC kconfig option.
For the PF/hypervisor driver cnic is always supported, however allocation of
cnic resources and configuration of the HW for offload mode is done only when
the cnic module registers bnx2x.

Signed-off-by: Merav Sicron <meravs@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: HSI change for 'update' ramrod
Merav Sicron [Wed, 7 Nov 2012 00:45:47 +0000 (00:45 +0000)]
bnx2x: HSI change for 'update' ramrod

This patch updates the driver-FW HSI to support changes to the 'update' ramrod
(FW supports this change since 7.8.2). This ramrod is sent when the cnic module
registers bnx2x, to enable changing the nic_mode configuration in HW at
run-time.

Signed-off-by: Merav Sicron <meravs@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agopacket: tx_ring: allow the user to choose tx data offset
Paul Chavent [Tue, 6 Nov 2012 23:10:47 +0000 (23:10 +0000)]
packet: tx_ring: allow the user to choose tx data offset

The tx data offset of the packet mmap tx ring used to be:
(TPACKET2_HDRLEN - sizeof(struct sockaddr_ll))

The problem is that, with SOCK_RAW socket, the payload (14 bytes after
the beginning of the user data) is misaligned.

This patch lets the user specify an offset for the tx data if
desired.

Set sock option PACKET_TX_HAS_OFF to 1, then specify in each frame of
your tx ring tp_net for SOCK_DGRAM, or tp_mac for SOCK_RAW.

Signed-off-by: Paul Chavent <paul.chavent@onera.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
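
The misalignment described in the commit above is simple arithmetic, which the following sketch works through. It is an illustration only: aligned_tx_offset is a hypothetical helper, and the value 32 for (TPACKET2_HDRLEN - sizeof(struct sockaddr_ll)) is an assumption about the header layout, not something stated in the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_TX_HDR_END 32u  /* assumed TPACKET2_HDRLEN - sizeof(struct sockaddr_ll) */
#define MODEL_ETH_HLEN   14u  /* Ethernet header length */

/*
 * Smallest data offset >= min_off such that the byte following the
 * link-layer header (min_off + l2_len) lands on an `align` boundary.
 */
static size_t aligned_tx_offset(size_t min_off, size_t l2_len, size_t align)
{
    assert(align != 0);
    size_t payload = min_off + l2_len;
    size_t pad = (align - payload % align) % align;
    return min_off + pad;
}
```

With the assumed values, the fixed offset puts the payload at byte 46, which is not a multiple of 4; moving the frame start to offset 34 puts it at byte 48. With PACKET_TX_HAS_OFF set, such an offset would be the value written to tp_mac for a SOCK_RAW socket.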
11 years agonet: fec: reduce spin lock time in fec_ptp_adjfreq
Frank Li [Tue, 6 Nov 2012 20:14:49 +0000 (20:14 +0000)]
net: fec: reduce spin lock time in fec_ptp_adjfreq

Move the calculation below out of the spin lock section:
diff = fep->cc.mult;
diff *= ppb;
diff = div_u64(diff, 1000000000ULL);

diff is a local variable and does not need to be computed under the spin lock

Signed-off-by: Frank Li <Frank.Li@freescale.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
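
The pattern in the commit above (do the arithmetic on locals first, then take the lock only for the shared update) can be sketched in user space. This is a model under assumptions, not the driver code: struct model_clock, MODEL_CC_MULT and the atomic_flag-based spin lock are hypothetical stand-ins for the driver's clock state and its spin lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MODEL_CC_MULT (1u << 31)        /* hypothetical nominal multiplier */

struct model_clock {
    atomic_flag lock;                   /* toy spin lock */
    uint32_t mult;                      /* shared state protected by lock */
};

static void model_clock_init(struct model_clock *clk)
{
    atomic_flag_clear(&clk->lock);
    clk->mult = MODEL_CC_MULT;
}

static void model_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;                               /* spin until the flag is released */
}

static void model_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Adjust the multiplier by ppb parts per billion; returns the new value. */
static uint32_t model_adjfreq(struct model_clock *clk, int32_t ppb)
{
    int neg = ppb < 0;
    uint32_t mult = MODEL_CC_MULT;
    uint32_t new_mult;
    uint64_t diff;

    assert(ppb > INT32_MIN);            /* so that -ppb cannot overflow */
    if (neg)
        ppb = -ppb;

    /* Pure arithmetic on locals: no lock needed. */
    diff = mult;
    diff *= (uint32_t)ppb;
    diff /= 1000000000ULL;
    new_mult = neg ? mult - (uint32_t)diff : mult + (uint32_t)diff;

    /* Only the shared update sits inside the critical section. */
    model_lock(&clk->lock);
    clk->mult = new_mult;
    model_unlock(&clk->lock);

    return new_mult;
}
```

The critical section shrinks to a single store, so other contexts waiting on the lock are held up for less time.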
11 years agonet: fec: default select FEC_PTP at mx6 platform
Frank Li [Tue, 6 Nov 2012 20:14:43 +0000 (20:14 +0000)]
net: fec: default select FEC_PTP at mx6 platform

Remove PPS.
Limit FEC_PTP option for i.MX chip only.
FEC_PTP default is on at mx6 platform.

Signed-off-by: Frank Li <Frank.Li@freescale.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet/at91_ether: fix comment and style issues
Joachim Eastwood [Wed, 7 Nov 2012 08:14:57 +0000 (08:14 +0000)]
net/at91_ether: fix comment and style issues

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet/at91_ether: clean up print outs
Joachim Eastwood [Wed, 7 Nov 2012 08:14:56 +0000 (08:14 +0000)]
net/at91_ether: clean up print outs

Convert all printk's to netdev_ counterparts and fix up some
printed texts.

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet/at91_ether: drop board_data private struct member
Joachim Eastwood [Wed, 7 Nov 2012 08:14:55 +0000 (08:14 +0000)]
net/at91_ether: drop board_data private struct member

No longer used after gpio phy interrupt support was
removed from at91_ether.

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet/at91_ether: use stat function from macb
Joachim Eastwood [Wed, 7 Nov 2012 08:14:54 +0000 (08:14 +0000)]
net/at91_ether: use stat function from macb

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet/at91_ether: use macb functions for get/set hwaddr
Joachim Eastwood [Wed, 7 Nov 2012 08:14:53 +0000 (08:14 +0000)]
net/at91_ether: use macb functions for get/set hwaddr

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet/macb: export macb_set_hwaddr and macb_get_hwaddr
Joachim Eastwood [Wed, 7 Nov 2012 08:14:52 +0000 (08:14 +0000)]
net/macb: export macb_set_hwaddr and macb_get_hwaddr

for usage in at91_ether driver.

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet/macb: support reversed hw addr
Joachim Eastwood [Wed, 7 Nov 2012 08:14:51 +0000 (08:14 +0000)]
net/macb: support reversed hw addr

This is used on one AT91RM9200 board where a bootloader stores
the Ethernet address in the wrong order.

Support this on macb so address setting functions can be shared
with the at91_ether driver.

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
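
A rough sketch of what "reversed hw addr" means in the commit above: the six address bytes are unpacked from a pair of registers, and the reversed variant fills the output array from the other end. The function name model_get_hwaddr and the register split (a 32-bit "bottom" word and a 16-bit "top" word) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Unpack a 6-byte Ethernet address from two register values.
 * bottom holds the first four bytes, top the last two (assumed layout).
 * When rev is true the bytes are stored in the opposite order, as a
 * bootloader that wrote them backwards would require.
 */
static void model_get_hwaddr(uint32_t bottom, uint16_t top, bool rev,
                             uint8_t addr[6])
{
    assert(addr != NULL);

    const uint8_t raw[6] = {
        (uint8_t)(bottom & 0xff),
        (uint8_t)((bottom >> 8) & 0xff),
        (uint8_t)((bottom >> 16) & 0xff),
        (uint8_t)((bottom >> 24) & 0xff),
        (uint8_t)(top & 0xff),
        (uint8_t)((top >> 8) & 0xff),
    };

    for (int i = 0; i < 6; i++)
        addr[i] = rev ? raw[5 - i] : raw[i];
}
```

Keeping the ordering behind one flag is what lets a single pair of get/set functions serve both drivers.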
11 years agonet/macb: check all address registers sets
Joachim Eastwood [Wed, 7 Nov 2012 08:14:50 +0000 (08:14 +0000)]
net/macb: check all address registers sets

The macb driver in u-boot uses the first register set while
the at91_ether driver in u-boot uses the second register set.

By checking all register sets, like at91_ether does, this code
can be shared between the drivers.

This only changes behavior on macb if no valid address
is found in the first register set.

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobe2net: remove adapter->eq_next_idx
Sathya Perla [Tue, 6 Nov 2012 17:49:01 +0000 (17:49 +0000)]
be2net: remove adapter->eq_next_idx

It's not used anywhere

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobe2net: remove roce on lancer
Sathya Perla [Tue, 6 Nov 2012 17:49:00 +0000 (17:49 +0000)]
be2net: remove roce on lancer

The roce interface is supported only on Skyhawk-R.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobe2net: fix access to SEMAPHORE reg
Sathya Perla [Tue, 6 Nov 2012 17:48:59 +0000 (17:48 +0000)]
be2net: fix access to SEMAPHORE reg

The SEMAPHORE register was being accessed from the csr BAR space. This BAR
may not be available in some Skyhawk-R configurations. Instead, access this
register via the PCI config space (it's available there too).

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobe2net: re-factor bar mapping code
Sathya Perla [Tue, 6 Nov 2012 17:48:58 +0000 (17:48 +0000)]
be2net: re-factor bar mapping code

1) separate NIC and roce bar mapping code
2) parse sli_intf::if_type inside be_map_pci_bars() as if_type must be
   used only to identify bars.
3) Use pci_iomap/unmap() routines

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobe2net: do not use sli_family to identify skyhawk-R chip
Sathya Perla [Tue, 6 Nov 2012 17:48:57 +0000 (17:48 +0000)]
be2net: do not use sli_family to identify skyhawk-R chip

SKYHAWK_FAMILY will not identify all revisions of the chip.
Use device-id check (skyhawk_chip() macro) instead.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobe2net: fix wrong usage of adapter->generation
Sathya Perla [Tue, 6 Nov 2012 17:48:56 +0000 (17:48 +0000)]
be2net: fix wrong usage of adapter->generation

adapter->generation was being incorrectly set as BE_GEN3 for Skyhawk-R.
Replace generation usage with XXX_chip() macros to identify the chip.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobe2net: remove LANCER A0 workaround
Sathya Perla [Tue, 6 Nov 2012 17:48:55 +0000 (17:48 +0000)]
be2net: remove LANCER A0 workaround

It's not needed anymore.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agomlx4: change TX coalescing defaults
Eric Dumazet [Mon, 5 Nov 2012 16:20:42 +0000 (16:20 +0000)]
mlx4: change TX coalescing defaults

mlx4 currently uses too high a tx coalescing setting, deferring
TX completion interrupts by up to 128 us.

With the recent skb_orphan() removal in commit 8112ec3b872,
performance of a single TCP flow is capped to ~4 Gbps, unless
we increase tcp_limit_output_bytes.

I suggest using 16 us instead of 128 us, allowing a finer control.

Performance of a single TCP flow is restored to previous levels,
while keeping TCP small queues fully enabled with default sysctl.

This patch is also a BQL prereq.

Reported-by: Vimalkumar <j.vimal@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yevgeny Petrilin <yevgenyp@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobatman-adv: enable fast client detection using unicast_4addr packets
Antonio Quartulli [Sun, 14 Oct 2012 15:19:19 +0000 (17:19 +0200)]
batman-adv: enable fast client detection using unicast_4addr packets

The "early client detection mechanism" can be extended to find new clients by
means of unicast_4addr packets.

Like the broadcast packet (which is currently used in this mechanism), the
unicast_4addr packet contains the address of the originating node and can
therefore be used to install new entries in the Global Translation Table.

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
11 years agobatman-adv: Add get_ethtool_stats() support for DAT
Martin Hundebøll [Fri, 20 Apr 2012 15:02:45 +0000 (17:02 +0200)]
batman-adv: Add get_ethtool_stats() support for DAT

Added additional counters for D.A.T.

Signed-off-by: Martin Hundebøll <martin@hundeboll.net>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
11 years agobatman-adv: Distributed ARP Table - add runtime switch
Antonio Quartulli [Wed, 8 Aug 2012 16:50:57 +0000 (18:50 +0200)]
batman-adv: Distributed ARP Table - add runtime switch

This patch adds a runtime switch that enables the user to turn the DAT feature
on or off at runtime

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
11 years agobatman-adv: Distributed ARP Table - add compile option
Antonio Quartulli [Sun, 6 Nov 2011 11:23:55 +0000 (12:23 +0100)]
batman-adv: Distributed ARP Table - add compile option

This patch makes it possible to decide whether to include DAT within the
batman-adv binary or not.
It is extremely useful when the user wants to reduce the size of the resulting
module by cutting off any unneeded feature.

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
11 years agobatman-adv: Distributed ARP Table - add snooping functions for ARP messages
Antonio Quartulli [Sun, 26 Jun 2011 01:37:18 +0000 (03:37 +0200)]
batman-adv: Distributed ARP Table - add snooping functions for ARP messages

In case of an ARP message going in or out of the soft_iface, it is intercepted and
a special action is performed. In particular the DHT helper functions previously
implemented are used to store all the ARP entries belonging to the network in
order to provide a fast and unicast lookup instead of the classic broadcast
flooding mechanism.
Each node stores the entries it is responsible for (following the DHT rules) in
its soft_iface ARP table. This makes it possible to reuse the kernel data
structures and functions for ARP management.

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
11 years agobatman-adv: Distributed ARP Table - add ARP parsing functions
Antonio Quartulli [Thu, 2 Jun 2011 10:29:51 +0000 (12:29 +0200)]
batman-adv: Distributed ARP Table - add ARP parsing functions

ARP messages are now parsed to make it possible to trigger special actions
depending on their types (snooping).

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
11 years agobatman-adv: Distributed ARP Table - implement local storage
Antonio Quartulli [Sat, 30 Jun 2012 18:01:19 +0000 (20:01 +0200)]
batman-adv: Distributed ARP Table - implement local storage

Since batman-adv cannot inter-operate with the host ARP table, this patch
introduces a batman-adv private storage for ARP entries exchanged within DAT.
This storage will represent the node local cache in the DAT protocol.

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
11 years agobatman-adv: Distributed ARP Table - create DHT helper functions
Antonio Quartulli [Wed, 23 Nov 2011 10:35:44 +0000 (11:35 +0100)]
batman-adv: Distributed ARP Table - create DHT helper functions

Add all the relevant functions in order to manage a Distributed Hash Table over
the B.A.T.M.A.N.-adv network. It will later be used to store several ARP entries
and implement DAT (Distributed ARP Table)

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
11 years agobatman-adv: Distributed ARP Table - add a new debug log level
Antonio Quartulli [Mon, 1 Oct 2012 07:57:36 +0000 (09:57 +0200)]
batman-adv: Distributed ARP Table - add a new debug log level

A new log level has been added to concentrate messages regarding DAT: ARP
snooping, requests, response and DHT related messages.
The new log level is named BATADV_DBG_DAT

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
11 years agobatman-adv: add UNICAST_4ADDR packet type
Antonio Quartulli [Mon, 1 Oct 2012 07:57:35 +0000 (09:57 +0200)]
batman-adv: add UNICAST_4ADDR packet type

The current unicast packet type does not contain the orig source address. This
patch adds a new unicast packet (called UNICAST_4ADDR) which provides two new
fields: the originator source address and the subtype (the type of the data
contained in the packet payload). The former is useful to identify the node
which injected the packet into the network and the latter is useful to avoid
creating new unicast packet types in the future: a macro defining a new subtype
will be enough.

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
11 years agobatman-adv: Mark correctly aligned headers not as __packed
Sven Eckelmann [Mon, 5 Nov 2012 20:25:26 +0000 (21:25 +0100)]
batman-adv: Mark correctly aligned headers not as __packed

Headers which are already perfectly aligned and create a 4 byte boundary
non-ethernet header payload can have the __packed attribute removed. The
__packed attribute doesn't change the appearance of the packet for these
headers because no extra padding is necessary to align the data members. The
compiler will also create slightly faster code for loads of multi-byte members.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
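
The claim in the commit above, that __packed changes nothing for a header whose members are already naturally aligned, can be checked at compile time. The header below is a made-up example (struct model_hdr_* and its fields are hypothetical, not the batman-adv layout), and __attribute__((packed)) assumes a GCC- or Clang-compatible compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header whose members already sit on natural boundaries. */
struct model_hdr_packed {
    uint8_t  type;
    uint8_t  version;
    uint8_t  ttl;
    uint8_t  flags;
    uint32_t seqno;      /* offset 4: 4-byte aligned */
    uint8_t  orig[6];    /* offset 8 */
    uint16_t crc;        /* offset 14: 2-byte aligned */
} __attribute__((packed));

/* Same members without the attribute. */
struct model_hdr_plain {
    uint8_t  type;
    uint8_t  version;
    uint8_t  ttl;
    uint8_t  flags;
    uint32_t seqno;
    uint8_t  orig[6];
    uint16_t crc;
};

/* No padding is inserted, so both variants have the same wire layout. */
static_assert(sizeof(struct model_hdr_packed) == sizeof(struct model_hdr_plain),
              "layout must not depend on packed");
static_assert(offsetof(struct model_hdr_packed, seqno) ==
              offsetof(struct model_hdr_plain, seqno),
              "seqno offset must match");
```

Without the attribute the compiler may assume seqno is 4-byte aligned and load it in one access, which is the "slightly faster code" the message refers to.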
11 years agobatman-adv: Reserve extra bytes in skb for better alignment
Sven Eckelmann [Sun, 4 Nov 2012 16:11:45 +0000 (17:11 +0100)]
batman-adv: Reserve extra bytes in skb for better alignment

The ethernet header is 14 bytes long. Therefore, the data after it is not 4
byte aligned and may cause problems on systems without unaligned data access.
Reserving NET_IP_ALIGN more bytes can fix the misalignment of the ethernet
header.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
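
The arithmetic behind the commit above is short enough to spell out. In this sketch MODEL_NET_IP_ALIGN is assumed to be 2, a common value for the kernel's NET_IP_ALIGN but one that some architectures override, and payload_offset is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_ETH_HLEN     14u  /* Ethernet header length in bytes */
#define MODEL_NET_IP_ALIGN  2u  /* assumed value of NET_IP_ALIGN */

/* With the extra reservation the data after the header is 4-byte aligned. */
static_assert((MODEL_NET_IP_ALIGN + MODEL_ETH_HLEN) % 4 == 0,
              "reservation must realign the payload");

/* Offset of the data following the Ethernet header in the buffer. */
static size_t payload_offset(size_t reserved)
{
    return reserved + MODEL_ETH_HLEN;
}

static bool is_aligned4(size_t off)
{
    return off % 4 == 0;
}
```

Without the reservation the payload starts at offset 14; with it, at offset 16.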
11 years agousbnet: runtime wake up device before calling usbnet_{read|write}_cmd
Ming Lei [Tue, 6 Nov 2012 04:53:08 +0000 (04:53 +0000)]
usbnet: runtime wake up device before calling usbnet_{read|write}_cmd

This patch gets the runtime PM reference count before calling
usbnet_{read|write}_cmd, and puts it after completion of the
usbnet_{read|write}_cmd, so that the usb control message can always
be sent to one active device in the non-PM context.

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agousbnet: smsc95xx: apply the introduced usbnet_{read|write}_cmd_nopm
Ming Lei [Tue, 6 Nov 2012 04:53:07 +0000 (04:53 +0000)]
usbnet: smsc95xx: apply the introduced usbnet_{read|write}_cmd_nopm

This patch applies the introduced usbnet_read_cmd_nopm() and
usbnet_write_cmd_nopm() in the callback of resume and suspend
to avoid deadlock once USB runtime PM is taken into account in
usbnet_read_cmd() and usbnet_write_cmd().

Cc: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agousbnet: smsc95xx: fix memory leak in smsc95xx_suspend
Ming Lei [Tue, 6 Nov 2012 04:53:06 +0000 (04:53 +0000)]
usbnet: smsc95xx: fix memory leak in smsc95xx_suspend

This patch fixes a memory leak in smsc95xx_suspend.

Also, it isn't necessary to bother mm to allocate 8 or 16 bytes,
since a stack variable can be used safely.

Acked-By: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agousbnet: smsc75xx: apply the introduced usbnet_{read|write}_cmd_nopm
Ming Lei [Tue, 6 Nov 2012 04:53:05 +0000 (04:53 +0000)]
usbnet: smsc75xx: apply the introduced usbnet_{read|write}_cmd_nopm

This patch applies the introduced usbnet_read_cmd_nopm() and
usbnet_write_cmd_nopm() in the callback of resume and suspend
to avoid deadlock once USB runtime PM is taken into account in
usbnet_read_cmd() and usbnet_write_cmd().

Cc: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agousbnet: introduce usbnet_{read|write}_cmd_nopm
Ming Lei [Tue, 6 Nov 2012 04:53:04 +0000 (04:53 +0000)]
usbnet: introduce usbnet_{read|write}_cmd_nopm

This patch introduces the below two helpers to prepare for solving
the usbnet runtime PM problem, which may cause some network utilities
(ifconfig, ethtool, ...) to touch a suspended device.

usbnet_read_cmd_nopm()
usbnet_write_cmd_nopm()

The above two helpers should be called by usbnet resume/suspend
callback to avoid deadlock.

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: calxedaxgmac: ip align receive buffers
Rob Herring [Mon, 5 Nov 2012 06:22:24 +0000 (06:22 +0000)]
net: calxedaxgmac: ip align receive buffers

On gcc 4.7, we will get alignment traps in the ip stack if we don't align
the ip headers on receive. The h/w can support this, so use ip aligned
allocations.

Cut down the unnecessary padding on the allocation. The buffer can start on
any byte alignment, but the size including the beginning offset must be 8
byte aligned. So the h/w buffer size must include the NET_IP_ALIGN offset.

Thanks to Eric Dumazet for the initial patch highlighting the padding issues.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: calxedaxgmac: rework transmit ring handling
Rob Herring [Mon, 5 Nov 2012 06:22:23 +0000 (06:22 +0000)]
net: calxedaxgmac: rework transmit ring handling

Only generate tx interrupts on every ring size / 4 descriptors. Move the
netif_stop_queue call to the end of the xmit function rather than
checking at the beginning.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
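
The two ideas in the commit above can be sketched as small predicates. Everything named here is hypothetical (MODEL_TX_RING_SZ, tx_desc_wants_irq, tx_should_stop_queue, and the "one skb worth of descriptors" threshold); the sketch only illustrates requesting a completion interrupt on every (ring size / 4)-th descriptor and deciding about stopping the queue after a frame has been queued.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_TX_RING_SZ 128u           /* hypothetical ring size */

static_assert(MODEL_TX_RING_SZ % 4 == 0, "ring size must divide by 4");

/* Request a tx completion interrupt only on every (ring size / 4)-th slot. */
static bool tx_desc_wants_irq(unsigned int head)
{
    return ((head + 1) % (MODEL_TX_RING_SZ / 4)) == 0;
}

/* Number of descriptors that request an interrupt in one lap of the ring. */
static unsigned int irqs_per_lap(void)
{
    unsigned int n = 0;

    for (unsigned int head = 0; head < MODEL_TX_RING_SZ; head++)
        if (tx_desc_wants_irq(head))
            n++;
    return n;
}

/*
 * Evaluated at the end of xmit: stop the queue when the remaining free
 * descriptors could not hold another worst-case frame (assumed threshold).
 */
static bool tx_should_stop_queue(unsigned int free_desc, unsigned int max_frags)
{
    return free_desc < max_frags + 1;
}
```

Checking at the end of xmit means the queue is stopped before the stack hands over a frame that would not fit, rather than after.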
11 years agonet: calxedaxgmac: drop some unnecessary register writes
Rob Herring [Mon, 5 Nov 2012 06:22:22 +0000 (06:22 +0000)]
net: calxedaxgmac: drop some unnecessary register writes

The interrupts have already been cleared, so we don't need to clear them
again. Also, we could miss interrupts if they are cleared, but we don't
process the packet.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: calxedaxgmac: use raw i/o accessors in rx and tx paths
Rob Herring [Mon, 5 Nov 2012 06:22:21 +0000 (06:22 +0000)]
net: calxedaxgmac: use raw i/o accessors in rx and tx paths

The standard readl/writel accessors involve a spinlock and cache sync
operation on ARM platforms with an outer cache. Only DMA triggering
accesses need this, so use the raw variants instead in the critical paths.

The relaxed variants would be more appropriate, but don't exist on all
arches.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: calxedaxgmac: remove explicit rx dma buffer polling
Rob Herring [Mon, 5 Nov 2012 06:22:20 +0000 (06:22 +0000)]
net: calxedaxgmac: remove explicit rx dma buffer polling

New received frames will trigger the rx DMA to poll the DMA descriptors,
so there is no need to tell the h/w to poll. We also want to enable
dropping frames from the fifo when there is no buffer.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: calxedaxgmac: enable operate on 2nd frame mode
Rob Herring [Mon, 5 Nov 2012 06:22:19 +0000 (06:22 +0000)]
net: calxedaxgmac: enable operate on 2nd frame mode

Enable the tx dma to start reading the next frame while sending the current
frame.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agohtb: fix two bugs
Eric Dumazet [Mon, 5 Nov 2012 16:40:49 +0000 (16:40 +0000)]
htb: fix two bugs

Commit 56b765b79e9 (htb: improved accuracy at high rates)
introduced two bugs:

1) one bstats_update() was inadvertently removed from
   htb_dequeue_tree(), breaking statistics/rate estimation.

2) Missing qdisc_put_rtab() calls in htb_change_class(),
   leaking kernel memory, now that struct htb_class no longer
   retains pointers to qdisc_rate_table structs.

   Since only rate is used, don't use qdisc_get_rtab() calls
   copying data we ignore anyway.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vimalkumar <j.vimal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agotg3: Call tg3_netif_stop() from tg3_stop()
Nithin Nayak Sujir [Mon, 5 Nov 2012 14:26:30 +0000 (14:26 +0000)]
tg3: Call tg3_netif_stop() from tg3_stop()

instead of making separate tg3_napi_disable() and netif_tx_disable() calls.

Update version to 3.126.

Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agotg3: Support 5717 C0
Michael Chan [Mon, 5 Nov 2012 14:26:29 +0000 (14:26 +0000)]
tg3: Support 5717 C0

Add support for 5717C0 which is a 5720A0 with special bonds-out option.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: at91_ether: add pinctrl support
Jean-Christophe PLAGNIOL-VILLARD [Sun, 4 Nov 2012 21:34:52 +0000 (21:34 +0000)]
net: at91_ether: add pinctrl support

If no pinctrl is available, just report a warning, as some architectures may not
need to do anything.

Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Tested-by: Joachim Eastwood <manabian@gmail.com>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: at91_ether: add dt support
Jean-Christophe PLAGNIOL-VILLARD [Sun, 4 Nov 2012 21:34:51 +0000 (21:34 +0000)]
net: at91_ether: add dt support

Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Tested-by: Joachim Eastwood <manabian@gmail.com>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agodrivers: ethernet: qlogic: netxen_nic_ethtool.c: Fixed a coding style issue
Kumar Amit Mehta [Sun, 4 Nov 2012 19:46:08 +0000 (19:46 +0000)]
drivers: ethernet: qlogic: netxen_nic_ethtool.c: Fixed a coding style issue

Fixed some coding style issues.

Signed-off-by: Kumar Amit Mehta <gmate.amit@gmail.com>
Acked-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agodrivers: ethernet: qlogic: qlge_dbg.c: Fixed a coding style issue
Kumar Amit Mehta [Sun, 4 Nov 2012 19:11:32 +0000 (19:11 +0000)]
drivers: ethernet: qlogic: qlge_dbg.c: Fixed a coding style issue

checkpatch.pl throws an error message for the current code. This patch fixes
this coding style issue.

Signed-off-by: Kumar Amit Mehta <gmate.amit@gmail.com>
Acked-by: Jitendra Kalsaria <Jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agosparc: bpf_jit_comp: add VLAN instructions for BPF JIT
Daniel Borkmann [Sun, 4 Nov 2012 16:59:30 +0000 (16:59 +0000)]
sparc: bpf_jit_comp: add VLAN instructions for BPF JIT

This patch is a follow-up for patch "net: filter: add vlan tag access"
to support the new VLAN_TAG/VLAN_TAG_PRESENT accessors in BPF JIT.

Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agor8169: enable internal ASPM and clock request settings
hayeswang [Thu, 1 Nov 2012 16:46:28 +0000 (16:46 +0000)]
r8169: enable internal ASPM and clock request settings

The following chips need to enable internal settings to let ASPM
and clock request work.

RTL8111E-VL, RTL8111F, RTL8411, RTL8111G
RTL8105, RTL8402, RTL8106

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobridge: Avoid 'statement with no effect' compiler warnings
Lee Jones [Sat, 3 Nov 2012 22:02:30 +0000 (23:02 +0100)]
bridge: Avoid 'statement with no effect' compiler warnings

Instead of issuing (0) statements when !CONFIG_SYSFS which will cause
'warning: ', we'll use inline statements instead. This will effectively
do the same thing, but suppress any unnecessary warnings.

Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: bridge@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
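
The change described in the commit above follows a common pattern for configuration stubs. The sketch below uses hypothetical names (model_sysfs_addbr, model_bridge_setup) and is not the bridge code itself: a macro stub that expands to (0) leaves a bare expression statement behind when the result is unused, whereas a static inline stub is an ordinary function call.

```c
#include <assert.h>
#include <stddef.h>

struct net_device;                      /* opaque for this sketch */

/*
 * Old style (shown as a comment only):
 *     #define model_sysfs_addbr(dev) (0)
 * A call written as a statement, `model_sysfs_addbr(dev);`, then expands
 * to `(0);`, which compilers may flag as a statement with no effect.
 */

/* New style: an inline stub with the same result and a real signature. */
static inline int model_sysfs_addbr(struct net_device *dev)
{
    (void)dev;                          /* nothing to do without sysfs */
    return 0;
}

static int model_bridge_setup(struct net_device *dev)
{
    int err;

    model_sysfs_addbr(dev);             /* plain call statement, no warning */

    err = model_sysfs_addbr(dev);       /* result can still be checked */
    assert(err == 0);
    return err;
}
```

The inline version also keeps type checking of the argument, which the macro silently discarded.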
12 years agodrivers/net/ethernet/ibm/emac/mal.c: use WARN
Julia Lawall [Sat, 3 Nov 2012 00:58:31 +0000 (00:58 +0000)]
drivers/net/ethernet/ibm/emac/mal.c: use WARN

Use WARN rather than printk followed by WARN_ON(1), for conciseness.

A simplified version of the semantic patch that makes this transformation
is as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression list es;
@@

-printk(
+WARN(1,
  es);
-WARN_ON(1);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoatp: remove set_rx_mode_8012()
Paul Bolle [Fri, 2 Nov 2012 23:53:15 +0000 (23:53 +0000)]
atp: remove set_rx_mode_8012()

Building atp.o triggers this GCC warning:
    drivers/net/ethernet/realtek/atp.c: In function ‘set_rx_mode’:
    drivers/net/ethernet/realtek/atp.c:871:26: warning: ‘mc_filter[0]’ may be used uninitialized in this function [-Wuninitialized]

GCC is correct. In promiscuous mode 'mc_filter' will be used
uninitialized in set_rx_mode_8012(), which is apparently inlined into
set_rx_mode().

But it turns out set_rx_mode_8012() will never be called, since
net_local.chip_type will always be RTL8002. So we can just remove
set_rx_mode_8012() and do some related cleanups.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agocpsw: fix leaking IO mappings
Richard Cochran [Fri, 2 Nov 2012 22:25:30 +0000 (22:25 +0000)]
cpsw: fix leaking IO mappings

The CPSW driver remaps two different IO regions, but fails to unmap them
both. This patch fixes the issue by calling iounmap in the appropriate
places.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agocpsw: rename register banks to match the reference manual, part 2
Richard Cochran [Fri, 2 Nov 2012 22:25:29 +0000 (22:25 +0000)]
cpsw: rename register banks to match the reference manual, part 2

The code mixes up the CPSW_SS and the CPSW_WR register naming. This patch
changes the names to conform to the published Technical Reference Manual
from TI, in order to make working on the code less confusing.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: fix bridge notify hook to manage flags correctly
John Fastabend [Fri, 2 Nov 2012 16:32:36 +0000 (16:32 +0000)]
net: fix bridge notify hook to manage flags correctly

The bridge notify hook rtnl_bridge_notify() did not handle the
case where the master flag was set, or where both flags were set.
First, the flags are not passed correctly; second, the logic that
parses them is broken.

This patch passes the original flags value and fixes the
logic.

Reported-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agomacb: Keep driver's speed/duplex in sync with actual NCFGR
Vitalii Demianets [Fri, 2 Nov 2012 07:09:24 +0000 (07:09 +0000)]
macb: Keep driver's speed/duplex in sync with actual NCFGR

When the underlying phy driver restores its state so quickly after being
brought down and up that macb_handle_link_change() is never called with
link state "down", the driver's internal representation of phy speed and
duplex (bp->speed and bp->duplex) does not change. The macb driver then sees
no reason to write to the NCFGR register, although the speed and duplex
settings in that register were reset when the interface was brought down
and up. In that case the actual phy speed and duplex differ from the NCFGR
settings. The patch fixes that by keeping the driver's internal
representation of speed and duplex in sync with the actual content of NCFGR.

Signed-off-by: Vitalii Demianets <vitas@nppfactor.kiev.ua>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: neterion: Do not break word unregister.
YOSHIFUJI Hideaki / 吉藤英明 [Fri, 2 Nov 2012 04:45:24 +0000 (04:45 +0000)]
net: neterion: Do not break word unregister.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: sh_eth: Fix a typo - replace regist with register.
YOSHIFUJI Hideaki / 吉藤英明 [Fri, 2 Nov 2012 04:45:07 +0000 (04:45 +0000)]
net: sh_eth: Fix a typo - replace regist with register.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoptp: fixup Kconfig for two PHC drivers.
Richard Cochran [Thu, 1 Nov 2012 23:57:38 +0000 (23:57 +0000)]
ptp: fixup Kconfig for two PHC drivers.

Ben Hutchings recently came up with a better way to handle the kconfig
dependencies for the PTP hardware clocks. This patch converts one new and
one older driver to the new scheme.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agohtb: improved accuracy at high rates
Vimalkumar [Wed, 31 Oct 2012 06:04:11 +0000 (06:04 +0000)]
htb: improved accuracy at high rates

Current HTB (and TBF) use a rate table computed by the "tc"
userspace program, which has the following issue:

The rate table has 256 entries to map packet lengths
to tokens (time units).  With TSO-sized packets, the
256-entry granularity leads to loss/gain of rate,
making the token bucket inaccurate.

Thus, instead of relying on the rate table, this patch
explicitly computes the time and accounts for packet
transmission times with nanosecond granularity.

This greatly improves accuracy of HTB with a wide
range of packet sizes.
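
The difference between the two accounting schemes can be sketched in
userspace C. This is only an illustration under stated assumptions, not the
code from the patch: the function names, the cell_log parameter and the rule
used to fill a table slot are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000ULL

/* Table-style lookup: the length only selects one of 256 slots, so every
 * length inside a slot is charged the same transmission time. */
static uint64_t table_len_to_ns(uint32_t len, unsigned int cell_log,
                                uint64_t rate_bytes_per_sec)
{
    uint32_t slot = len >> cell_log;
    uint64_t slot_len;

    assert(rate_bytes_per_sec > 0);
    if (slot > 255)
        slot = 255;
    /* Assumed fill rule: a slot holds the time for its upper bound. */
    slot_len = ((uint64_t)slot + 1) << cell_log;
    return slot_len * NSEC_PER_SEC_SKETCH / rate_bytes_per_sec;
}

/* Explicit computation: nanoseconds needed to send exactly len bytes. */
static uint64_t exact_len_to_ns(uint32_t len, uint64_t rate_bytes_per_sec)
{
    assert(rate_bytes_per_sec > 0);
    return (uint64_t)len * NSEC_PER_SEC_SKETCH / rate_bytes_per_sec;
}
```

At 5Gbit (625,000,000 bytes per second) and an assumed cell_log of 8,
packets of 64800 and 65000 bytes fall into the same slot and are charged the
same time, while the explicit computation charges 103680 ns and 104000 ns.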

Example:

tc qdisc add dev $dev root handle 1: \
        htb default 1

tc class add dev $dev classid 1:1 parent 1: \
        rate 5Gbit mtu 64k

Here is an example of inaccuracy:

$ iperf -c host -t 10 -i 1

With old htb:
eth4:   34.76 Mb/s In  5827.98 Mb/s Out -  65836.0 p/s In  481273.0 p/s Out
[SUM]  9.0-10.0 sec   669 MBytes  5.61 Gbits/sec
[SUM]  0.0-10.0 sec  6.50 GBytes  5.58 Gbits/sec

With new htb:
eth4:   28.36 Mb/s In  5208.06 Mb/s Out -  53704.0 p/s In  430076.0 p/s Out
[SUM]  9.0-10.0 sec   594 MBytes  4.98 Gbits/sec
[SUM]  0.0-10.0 sec  5.80 GBytes  4.98 Gbits/sec

The bit rate on the wire is still 5200Mb/s with new HTB
because qdisc accounts for packet length using skb->len, which
is smaller than total bytes on the wire if GSO is used.  But
that is for another patch regardless of how time is accounted.

Many thanks to Eric Dumazet for review and feedback.

Signed-off-by: Vimalkumar <j.vimal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agovxlan: allow a user to set TTL value
Vincent Bernat [Tue, 30 Oct 2012 10:27:16 +0000 (10:27 +0000)]
vxlan: allow a user to set TTL value

"ip link add ... type vxlan ... ttl X" allows a user to set the TTL
used by a VXLAN for encapsulation. The provided value was ignored by
vxlan module and the default value of 1 was used when encapsulating
multicast packets.

Signed-off-by: Vincent Bernat <bernat@luffy.cx>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agosmsc75xx: add wol support for more frame types
Steve Glendinning [Tue, 30 Oct 2012 07:46:32 +0000 (07:46 +0000)]
smsc75xx: add wol support for more frame types

This patch adds support for wol wakeup on unicast, broadcast,
multicast and arp frames.

Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoif_ether.h: add B.A.T.M.A.N.-Advanced Ethertype
Antonio Quartulli [Tue, 30 Oct 2012 04:08:41 +0000 (04:08 +0000)]
if_ether.h: add B.A.T.M.A.N.-Advanced Ethertype

Add Ethertype 0x4305 (not an officially registered id).
This Ethertype is used by every frame generated by B.A.T.M.A.N.-Advanced. Its
definition is currently batman-adv local only and since it is not officially
registered it is better to make its definition kernel-wide so that we avoid
collisions given by future unofficial uses of the same Ethertype.

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agor8169: Kill SafeMtu macro
Kirill Smelkov [Mon, 29 Oct 2012 07:55:12 +0000 (07:55 +0000)]
r8169: Kill SafeMtu macro

After d58d46b5 (r8169: jumbo fixes.) max frame len is stored in
rtl_chip_infos[].jumbo_max for each chip and SafeMtu should be gone.

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: introduce ip6_rt_put()
Amerigo Wang [Mon, 29 Oct 2012 00:13:19 +0000 (00:13 +0000)]
ipv6: introduce ip6_rt_put()

As suggested by Eric, we could introduce a helper function
for ipv6 too, to avoid checking if rt is NULL before
dst_release().

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: avoid a test in ip_rt_put()
Eric Dumazet [Sun, 28 Oct 2012 22:33:23 +0000 (22:33 +0000)]
ipv4: avoid a test in ip_rt_put()

We can save a test in ip_rt_put(), considering that dst_release() accepts
a NULL parameter, and that dst is the first element in rtable.

Add a BUILD_BUG_ON() to catch any change that could break this
assertion.
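
A userspace sketch of the idea, with hypothetical stand-in types
(struct rtable_sketch, struct dst_entry_sketch) rather than the kernel
structures. static_assert plays the role of BUILD_BUG_ON(), and the sketch
uses a pointer cast where the kernel may use a different expression.

```c
#include <assert.h>
#include <stddef.h>

struct dst_entry_sketch {
    int refcnt;
};

struct rtable_sketch {
    struct dst_entry_sketch dst;    /* must stay the first member */
    int rt_flags;
};

/* Compile-time guard: breaks the build if dst is moved. */
static_assert(offsetof(struct rtable_sketch, dst) == 0,
              "dst must be the first member of rtable_sketch");

static int releases_sketch;

/* Accepts NULL, as the commit message says dst_release() does. */
static void dst_release_sketch(struct dst_entry_sketch *dst)
{
    if (dst) {
        dst->refcnt--;
        releases_sketch++;
    }
}

static void rt_put_sketch(struct rtable_sketch *rt)
{
    /* No NULL test here: because dst sits at offset 0, a NULL rt
     * converts to a NULL dst pointer, which the callee tolerates. */
    dst_release_sketch((struct dst_entry_sketch *)rt);
}
```

If a later change inserted a member before dst, the static_assert would fail
at compile time instead of letting a non-NULL garbage pointer reach the
release function.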

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Cong Wang <amwang@redhat.com>
Acked-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agosctp: Clean up type-punning in sctp_cmd_t union
Neil Horman [Mon, 29 Oct 2012 08:32:13 +0000 (08:32 +0000)]
sctp: Clean up type-punning in sctp_cmd_t union

Lots of points in the sctp_cmd_interpreter function treat the sctp_cmd_t arg as
a void pointer, even though they are written as various other types.  There's no
need for this as doing so just leads to possible type-punning issues that could
cause crashes, and if we remain type-consistent we can actually just remove the
void * member of the union entirely.

Change Notes:

v2)
* Dropped chunk that modified SCTP_NULL to create a marker pattern
 should anyone try to use a SCTP_NULL() assigned sctp_arg_t. Assigning
 to .zero provides the same effect and should be faster, per Vlad Y.

v3)
* Reverted part of v2, opting to use memset instead of .zero, so that
 the entire union is initialized, thus avoiding the ia64 speculative load
 problems previously encountered, per Dave M.  Also rewrote
 SCTP_[NO]FORCE so as to use common infrastructure a little more.
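
The memset-based initialization described in the notes above could look
roughly like this. It is a sketch with a hypothetical union (arg_sketch_t)
and hypothetical constructor names, not the sctp_arg_t definition itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Every member has a concrete type; there is no void * catch-all. */
typedef union {
    int32_t i32;
    uint16_t u16;
    uint8_t u8;
    const char *name;   /* hypothetical typed pointer member */
} arg_sketch_t;

/* Writing only a narrow member would leave the remaining bytes untouched. */
static_assert(sizeof(arg_sketch_t) > sizeof(uint16_t),
              "narrow members do not cover the whole union");

/* Zero the whole union first so that no byte is left uninitialized. */
static arg_sketch_t arg_null_sketch(void)
{
    arg_sketch_t a;

    memset(&a, 0, sizeof(a));
    return a;
}

/* Typed constructor: zero everything, then store through the right member. */
static arg_sketch_t arg_u16_sketch(uint16_t v)
{
    arg_sketch_t a = arg_null_sketch();

    a.u16 = v;
    return a;
}
```

Callers then read back through the same member they constructed, for example
arg_u16_sketch(7).u16, instead of going through an untyped pointer.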

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: linux-sctp@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: remove a useless NULL check
Amerigo Wang [Sun, 28 Oct 2012 17:43:53 +0000 (17:43 +0000)]
ipv6: remove a useless NULL check

In dev_forward_change(), it is useless to check if idev->dev
is NULL; it is always non-NULL here.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agopktgen: clean up ktime_t helpers
Daniel Borkmann [Sun, 28 Oct 2012 08:27:19 +0000 (08:27 +0000)]
pktgen: clean up ktime_t helpers

Some years ago, the ktime_t helper functions ktime_now() and ktime_lt()
were introduced. Instead of being defined inside pktgen.c, they
should either be replaced by ktime_t library functions or, if none are
available, be defined in ktime.h, so that others can benefit from them too.
ktime_compare() is introduced along the same lines as timespec_compare().
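
A three-way comparison in that style might be sketched as follows. The
ktime_ns_t typedef (a plain signed nanosecond count) and the _sketch
function names are assumptions of this example, not the kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ktime_t: signed nanoseconds. */
typedef int64_t ktime_ns_t;

static_assert((ktime_ns_t)-1 < 0,
              "a signed type is needed so earlier times compare as smaller");

/* Returns <0 if a is earlier than b, 0 if equal, >0 if a is later. */
static int ktime_compare_sketch(ktime_ns_t a, ktime_ns_t b)
{
    if (a < b)
        return -1;
    if (a > b)
        return 1;
    return 0;
}

/* A "less than" helper becomes a one-liner on top of the comparison. */
static int ktime_lt_sketch(ktime_ns_t a, ktime_ns_t b)
{
    return ktime_compare_sketch(a, b) < 0;
}
```

Comparing explicitly, rather than returning a - b, avoids truncating a
64-bit difference into an int.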

Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: better retrans tracking for defer-accept
Eric Dumazet [Sat, 27 Oct 2012 23:16:46 +0000 (23:16 +0000)]
tcp: better retrans tracking for defer-accept

For passive TCP connections using the TCP_DEFER_ACCEPT facility,
we incorrectly increment req->retrans each time the timeout triggers,
even when no SYNACK is sent.

SYNACKs are not sent for TCP_DEFER_ACCEPT requests that were established (for
which we received the ACK from the client). Only the last SYNACK is sent
so that we can again receive an ACK from the client, to move the req into
the accept queue. We plan to change this later to avoid the useless
retransmit (and a potential problem, as this SYNACK could be lost).

TCP_INFO later gives wrong information to user, claiming imaginary
retransmits.

Decouple the req->retrans field into two independent fields:

num_retrans : number of retransmits
num_timeout : number of timeouts

num_timeout is the counter that is incremented at each timeout,
regardless of actual SYNACK being sent or not, and used to
compute the exponential timeout.

Introduce inet_rtx_syn_ack() helper to increment num_retrans
only if ->rtx_syn_ack() succeeded.

Use inet_rtx_syn_ack() from tcp_check_req() to increment num_retrans
when we re-send a SYNACK in answer to a (retransmitted) SYN.
Prior to this patch, we were not counting these retransmits.

Change tcp_v[46]_rtx_synack() to increment TCP_MIB_RETRANSSEGS
only if a synack packet was successfully queued.
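
The split between the two counters can be illustrated with a small userspace
sketch. The struct, the callback type and the helper names below are
hypothetical; only the two field names come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct req_sketch {
    unsigned int num_retrans;   /* SYNACKs that were really re-sent */
    unsigned int num_timeout;   /* timer expirations, drives the backoff */
};

/* Hypothetical transmit hook: returns 0 on success, nonzero on failure. */
typedef int (*rtx_fn_sketch)(struct req_sketch *req);

/* Count a retransmit only if the SYNACK was actually sent. */
static int rtx_syn_ack_sketch(struct req_sketch *req, rtx_fn_sketch rtx)
{
    int err;

    assert(req != NULL && rtx != NULL);
    err = rtx(req);
    if (!err)
        req->num_retrans++;
    return err;
}

/* Timer path: the timeout is always counted, the retransmit only on success. */
static void on_timeout_sketch(struct req_sketch *req, rtx_fn_sketch rtx,
                              bool send_synack)
{
    if (send_synack)
        rtx_syn_ack_sketch(req, rtx);
    req->num_timeout++;
}

/* Two trivial hooks used to exercise both outcomes. */
static int rtx_ok_sketch(struct req_sketch *req)   { (void)req; return 0; }
static int rtx_fail_sketch(struct req_sketch *req) { (void)req; return -1; }
```

With this split, a deferred request whose timer fires without sending
anything advances num_timeout only, so the reported retransmit count stays
truthful.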

Reported-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Vijay Subramanian <subramanian.vijay@gmail.com>
Cc: Elliott Hughes <enh@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: Fix continued iteration in rtnl_bridge_getlink()
Ben Hutchings [Fri, 2 Nov 2012 12:56:52 +0000 (12:56 +0000)]
net: Fix continued iteration in rtnl_bridge_getlink()

Commit e5a55a898720096f43bc24938f8875c0a1b34cd7 ('net: create generic
bridge ops') broke the handling of a non-zero starting index in
rtnl_bridge_getlink() (based on the old br_dump_ifinfo()).

When the starting index is non-zero, we need to increment the current
index for each entry that we are skipping.  Also, we need to check the
index before both cases, since we may previously have stopped
iteration between getting information about a device from its master
and from itself.
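
The resumable iteration described above can be sketched in userspace C. The
device model (struct dev_sketch), the budget parameter standing in for a
full output buffer, and all function names are assumptions of this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct dev_sketch {
    bool has_master_info;   /* entry reported via the device's master */
    bool has_own_info;      /* entry reported by the device itself */
};

/* Returns false when the output is full and the dump must stop. */
static bool visit_sketch(size_t idx, size_t start, size_t budget,
                         size_t *emitted)
{
    if (idx < start)
        return true;        /* already dumped earlier: skip it */
    if (*emitted == budget)
        return false;
    (*emitted)++;
    return true;
}

/* Dumps entries with index >= start; returns the index to resume from. */
static size_t dump_sketch(const struct dev_sketch *devs, size_t ndevs,
                          size_t start, size_t budget, size_t *emitted)
{
    size_t idx = 0;
    size_t i;

    assert(emitted != NULL);
    *emitted = 0;
    for (i = 0; i < ndevs; i++) {
        /* The index is checked before both cases and advanced for
         * every entry, whether it is emitted or skipped. */
        if (devs[i].has_master_info) {
            if (!visit_sketch(idx, start, budget, emitted))
                return idx;
            idx++;
        }
        if (devs[i].has_own_info) {
            if (!visit_sketch(idx, start, budget, emitted))
                return idx;
            idx++;
        }
    }
    return idx;
}
```

A dump that stops after the master entry of a device returns an index that
points at that device's own entry, so the next call resumes exactly there.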

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Tested-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6/multipath: remove flag NLM_F_EXCL after the first nexthop
Nicolas Dichtel [Thu, 1 Nov 2012 22:58:22 +0000 (22:58 +0000)]
ipv6/multipath: remove flag NLM_F_EXCL after the first nexthop

fib6_add_rt2node() will reject the nexthop if this flag is set, so
we perform the check only for the first nexthop.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoeth: Rename and properly align br_reserved_address array
Ben Hutchings [Thu, 1 Nov 2012 09:12:02 +0000 (09:12 +0000)]
eth: Rename and properly align br_reserved_address array

Since this array is no longer part of the bridge driver, it should
have an 'eth' prefix not 'br'.

We also assume that either it's 16-bit-aligned or the architecture has
efficient unaligned access.  Ensure the first of these is true by
explicitly aligning it.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoeth: Make is_link_local() consistent with other address tests
Ben Hutchings [Thu, 1 Nov 2012 09:11:11 +0000 (09:11 +0000)]
eth: Make is_link_local() consistent with other address tests

Function name should include '_ether_addr'.
Return type should be bool.
Parameter name should be 'addr' not 'dest' (also matching kernel-doc).
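
A function following these conventions might look like the sketch below. It
is a byte-wise userspace illustration with a hypothetical name suffix, and it
assumes the reserved block is 01:80:C2:00:00:00 through 01:80:C2:00:00:0F;
it does not reproduce the kernel's word-based comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Base of the reserved block; the low 4 bits of the last byte vary. */
static const uint8_t eth_reserved_base_sketch[6] = {
    0x01, 0x80, 0xc2, 0x00, 0x00, 0x00
};

/* bool return type, 'addr' parameter, '_ether_addr' in the name. */
static bool is_link_local_ether_addr_sketch(const uint8_t *addr)
{
    assert(addr != NULL);
    return memcmp(addr, eth_reserved_base_sketch, 5) == 0 &&
           (addr[5] & 0xf0) == 0;
}
```

Comparing bytes keeps the sketch independent of alignment, at the cost of the
efficiency that a 16-bit-aligned comparison would give.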

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobridge: Use is_link_local() in store_group_addr()
Ben Hutchings [Thu, 1 Nov 2012 09:10:04 +0000 (09:10 +0000)]
bridge: Use is_link_local() in store_group_addr()

Parse the string into an array of bytes rather than ints, so we can
use is_link_local() rather than reimplementing it.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agosfc: Select PTP_1588_CLOCK
Ben Hutchings [Thu, 1 Nov 2012 11:22:22 +0000 (11:22 +0000)]
sfc: Select PTP_1588_CLOCK

This was missed in commit a24006ed12616bde1bbdb26868495906a212d8dc
('ptp: Enable clock drivers along with associated net/PHY drivers')
which enabled sfc's clock driver unconditionally.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agovhost-net: reduce vq polling on tx zerocopy
Michael S. Tsirkin [Thu, 1 Nov 2012 09:16:55 +0000 (09:16 +0000)]
vhost-net: reduce vq polling on tx zerocopy

It seems that to avoid deadlocks it is enough to poll the vq before
we use the last buffer.  This is faster than
c70aa540c7a9f67add11ad3161096fb95233aa2e.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agovhost-net: select tx zero copy dynamically
Michael S. Tsirkin [Thu, 1 Nov 2012 09:16:51 +0000 (09:16 +0000)]
vhost-net: select tx zero copy dynamically

Even when vhost-net is in zero-copy transmit mode,
net core might still decide to copy the skb later
which is somewhat slower than a copy in user
context: data copy overhead is added to the cost of
page pin/unpin. The result is that enabling tx zero copy
option leads to higher CPU utilization for guest to guest
and guest to host traffic.

To fix this, suppress zero copy tx once a given number of
packets have triggered a late data copy. Re-enable it periodically
to detect workload changes.
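
One way to picture this heuristic is the userspace sketch below. The window
size, the late-copy threshold and all names are made up for illustration;
the actual driver may use a different rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ZC_WINDOW_SKETCH 1024u          /* assumed: packets per window */
#define ZC_MAX_LATE_COPIES_SKETCH 16u   /* assumed: tolerated late copies */

struct zc_state_sketch {
    unsigned int packets;       /* packets seen in the current window */
    unsigned int late_copies;   /* of those, how many were copied late */
};

/* Zero copy stays on while late copies in this window are below the limit. */
static bool use_zcopy_sketch(const struct zc_state_sketch *s)
{
    assert(s != NULL);
    return s->late_copies < ZC_MAX_LATE_COPIES_SKETCH;
}

/* Account one transmitted packet; start a fresh window periodically so
 * that zero copy is tried again and workload changes are noticed. */
static void account_packet_sketch(struct zc_state_sketch *s, bool late_copy)
{
    assert(s != NULL);
    if (late_copy)
        s->late_copies++;
    if (++s->packets >= ZC_WINDOW_SKETCH) {
        s->packets = 0;
        s->late_copies = 0;
    }
}
```

Resetting both counters at the end of a window is what re-enables zero copy
here; a workload that keeps triggering late copies disables it again quickly.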

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agovhost: move -net specific code out
Michael S. Tsirkin [Thu, 1 Nov 2012 09:16:46 +0000 (09:16 +0000)]
vhost: move -net specific code out

Zerocopy handling code is vhost-net specific.
Move it from vhost.c/vhost.h out to net.c

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agovhost: track zero copy failures using DMA length
Michael S. Tsirkin [Thu, 1 Nov 2012 09:16:42 +0000 (09:16 +0000)]
vhost: track zero copy failures using DMA length

This will be used to disable zerocopy when error rate
is high.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agovhost-net: cleanup macros for DMA status tracking
Michael S. Tsirkin [Thu, 1 Nov 2012 09:16:37 +0000 (09:16 +0000)]
vhost-net: cleanup macros for DMA status tracking

Better document macros for DMA tracking. Add an
explicit one for DMA in progress instead of
relying on user supplying len != 1.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotun: report orphan frags errors to zero copy callback
Michael S. Tsirkin [Thu, 1 Nov 2012 09:16:32 +0000 (09:16 +0000)]
tun: report orphan frags errors to zero copy callback

When tun transmits a zero copy skb, it orphans the frags
which might need to allocate extra memory, in atomic context.
If that fails, notify ubufs callback before freeing the skb
as a hint that device should disable zerocopy mode.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoskb: api to report errors for zero copy skbs
Michael S. Tsirkin [Thu, 1 Nov 2012 09:16:28 +0000 (09:16 +0000)]
skb: api to report errors for zero copy skbs

Orphaning frags for zero copy skbs needs to allocate data in atomic
context, so it has a chance to fail. If it does, we currently discard
the skb, which is safe, but we don't report anything to the caller,
so it cannot recover by e.g. disabling zero copy.

Add an API to free skb reporting such errors: this is used
by tun in case orphaning frags fails.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoskb: report completion status for zero copy skbs
Michael S. Tsirkin [Thu, 1 Nov 2012 09:16:22 +0000 (09:16 +0000)]
skb: report completion status for zero copy skbs

Even if skb is marked for zero copy, net core might still decide
to copy it later which is somewhat slower than a copy in user context:
besides copying the data we need to pin/unpin the pages.

Add a parameter reporting such cases through zero copy callback:
if this happens a lot, device can take this into account
and switch to copying in user context.

This patch updates all users but ignores the passed value for now:
it will be used by follow-up patches.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Fri, 2 Nov 2012 22:45:35 +0000 (18:45 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next

Jeff Kirsher says:

====================
This series contains updates to igb, ixgbe and e1000.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agovlan: use IS_ENABLED()
Amerigo Wang [Mon, 29 Oct 2012 17:22:28 +0000 (17:22 +0000)]
vlan: use IS_ENABLED()

#if defined(CONFIG_FOO) || defined(CONFIG_FOO_MODULE)

can be replaced by

#if IS_ENABLED(CONFIG_FOO)
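
For readers curious how such a macro can work both in #if and in ordinary
expressions, here is a self-contained userspace sketch. The helper macro
names and IS_ENABLED_SKETCH are this example's own; it assumes, as Kconfig
does, that an enabled option is defined to 1 and a modular one defines the
_MODULE variant.

```c
#include <assert.h>

/* If the option expands to 1, the pasted name below expands to "0,",
 * which shifts a literal 1 into the second argument position. */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
#define IS_DEFINED_3(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define IS_DEFINED_2(val) IS_DEFINED_3(ARG_PLACEHOLDER_##val)
#define IS_DEFINED_1(x) IS_DEFINED_2(x)
#define IS_ENABLED_SKETCH(option) \
    (IS_DEFINED_1(option) || IS_DEFINED_1(option##_MODULE))

#define CONFIG_FOO 1            /* built in */
#define CONFIG_BAR_MODULE 1     /* built as a module */
/* CONFIG_BAZ is deliberately left undefined. */

static_assert(IS_ENABLED_SKETCH(CONFIG_FOO), "CONFIG_FOO is built in");

/* Usable in preprocessor conditionals ... */
static int foo_enabled_sketch(void)
{
#if IS_ENABLED_SKETCH(CONFIG_FOO)
    return 1;
#else
    return 0;
#endif
}

/* ... and as a constant in ordinary C expressions. */
static int bar_enabled_sketch(void) { return IS_ENABLED_SKETCH(CONFIG_BAR); }
static int baz_enabled_sketch(void) { return IS_ENABLED_SKETCH(CONFIG_BAZ); }
```

The single macro call covers both the built-in and the modular case, which
is what lets it replace the two-part defined() test.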

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: use IS_ENABLED()
Amerigo Wang [Mon, 29 Oct 2012 16:23:10 +0000 (16:23 +0000)]
ipv6: use IS_ENABLED()

#if defined(CONFIG_FOO) || defined(CONFIG_FOO_MODULE)

can be replaced by

#if IS_ENABLED(CONFIG_FOO)

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agortnl/ipv4: use netconf msg to advertise rp_filter status
Nicolas Dichtel [Mon, 29 Oct 2012 04:53:27 +0000 (04:53 +0000)]
rtnl/ipv4: use netconf msg to advertise rp_filter status

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoppp: make ppp_get_stats64 static
stephen hemminger [Mon, 29 Oct 2012 08:34:02 +0000 (08:34 +0000)]
ppp: make ppp_get_stats64 static

This was picked up by sparse.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoFEC: Add time stamping code and a PTP hardware clock
Frank Li [Tue, 30 Oct 2012 18:25:31 +0000 (18:25 +0000)]
FEC: Add time stamping code and a PTP hardware clock

This patch adds a driver for the FEC(MX6) that offers time
stamping and a PTP hardware clock. Because FEC/ENET(MX6)
hardware frequency adjustment is complex, we have implemented
this in software by changing the multiplication factor of the
timecounter.
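
Adjusting frequency through the multiplier can be sketched as below. The
base multiplier, the shift and all names are assumptions chosen so that one
cycle equals one nanosecond at zero adjustment; the driver's real constants
are not shown here.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_MULT_SKETCH (1u << 30)     /* assumed nominal multiplier */
#define SHIFT_SKETCH 30u                /* ns = (cycles * mult) >> shift */

struct cc_sketch {
    uint32_t mult;
};

/* Scale the multiplier by ppb parts per billion around its nominal value. */
static void adjfreq_sketch(struct cc_sketch *cc, int32_t ppb)
{
    uint64_t mag, diff;

    /* Keep the adjustment below 100% so mult can neither wrap nor reach 0. */
    assert(ppb > -1000000000 && ppb < 1000000000);
    mag = (uint64_t)(ppb < 0 ? -(int64_t)ppb : (int64_t)ppb);
    diff = (uint64_t)BASE_MULT_SKETCH * mag / 1000000000ULL;
    cc->mult = (uint32_t)(ppb < 0 ? BASE_MULT_SKETCH - diff
                                  : BASE_MULT_SKETCH + diff);
}

/* Convert raw counter cycles to nanoseconds with the current multiplier. */
static uint64_t cycles_to_ns_sketch(const struct cc_sketch *cc, uint64_t cycles)
{
    return (cycles * cc->mult) >> SHIFT_SKETCH;
}
```

The hardware counter keeps running at its fixed rate; only the conversion
from cycles to nanoseconds speeds up or slows down.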

Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoARM: imx6q: Set enet tx reference clk from anatop to support 1588
Frank Li [Tue, 30 Oct 2012 18:25:22 +0000 (18:25 +0000)]
ARM: imx6q: Set enet tx reference clk from anatop to support 1588

Set GRP1 BIT21 ENET_CLK_SEL:
  Enet tx reference clk from internal clock from anatop
  (loopback through pad); this clock is also sent out to external PHY

Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>