firefly-linux-kernel-4.4.55.git
13 years agonet: Add Open vSwitch kernel components.
Jesse Gross [Wed, 26 Oct 2011 02:26:31 +0000 (19:26 -0700)]
net: Add Open vSwitch kernel components.

Open vSwitch is a multilayer Ethernet switch targeted at virtualized
environments.  In addition to supporting a variety of features
expected in a traditional hardware switch, it enables fine-grained
programmatic extension and flow-based control of the network.
This control is useful in a wide variety of applications but is
particularly important in multi-server virtualization deployments,
which are often characterized by highly dynamic endpoints and the need
to maintain logical abstractions for multiple tenants.

The Open vSwitch datapath provides an in-kernel fast path for packet
forwarding.  It is complemented by a userspace daemon, ovs-vswitchd,
which is able to accept configuration from a variety of sources and
translate it into packet processing rules.

See http://openvswitch.org for more information and userspace
utilities.

Signed-off-by: Jesse Gross <jesse@nicira.com>
13 years agoipv6: Add fragment reporting to ipv6_skip_exthdr().
Jesse Gross [Thu, 1 Dec 2011 01:05:51 +0000 (17:05 -0800)]
ipv6: Add fragment reporting to ipv6_skip_exthdr().

While parsing through IPv6 extension headers, fragment headers are
skipped making them invisible to the caller.  This reports the
fragment offset of the last header in order to make it possible to
determine whether the packet is fragmented and, if so whether it is
a first or last fragment.

Signed-off-by: Jesse Gross <jesse@nicira.com>
13 years agovlan: Move vlan_set_encap_proto() to vlan header file
Pravin B Shelar [Fri, 18 Nov 2011 21:15:54 +0000 (13:15 -0800)]
vlan: Move vlan_set_encap_proto() to vlan header file

Open vSwitch needs this function for vlan handling.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
13 years agogenetlink: Add rcu_dereference_genl and genl_dereference.
Jesse Gross [Sun, 20 Nov 2011 00:21:37 +0000 (16:21 -0800)]
genetlink: Add rcu_dereference_genl and genl_dereference.

This adds rcu_dereference_genl and genl_dereference, which are genl
variants of the RTNL functions to enforce proper locking with lockdep
and sparse.

Signed-off-by: Jesse Gross <jesse@nicira.com>
13 years agogenetlink: Add lockdep_genl_is_held().
Pravin B Shelar [Fri, 11 Nov 2011 03:14:51 +0000 (19:14 -0800)]
genetlink: Add lockdep_genl_is_held().

Open vSwitch uses genl_mutex locking to protect datapath
data-structures like flow-table, flow-actions. Following patch adds
lockdep_genl_is_held() which is used for rcu annotation to prove
locking.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
13 years agogenetlink: Add genl_notify()
Pravin B Shelar [Fri, 11 Nov 2011 03:14:37 +0000 (19:14 -0800)]
genetlink: Add genl_notify()

Open vSwitch uses Generic Netlink interface for communication
between userspace and kernel module. genl_notify() is used
for sending notification back to userspace.

genl_notify() is analogous to rtnl_notify() but uses genl_sock
instead of rtnl.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
13 years agoatm: clip: Remove code commented out since eternity.
David S. Miller [Fri, 2 Dec 2011 19:27:11 +0000 (14:27 -0500)]
atm: clip: Remove code commented out since eternity.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Fri, 2 Dec 2011 18:49:21 +0000 (13:49 -0500)]
Merge git://git./linux/kernel/git/davem/net

13 years agoforcedeath: Fix bql support for forcedeath
Igor Maravic [Thu, 1 Dec 2011 23:48:20 +0000 (23:48 +0000)]
forcedeath: Fix bql support for forcedeath

Moved netdev_completed_queue() out of while loop in function nv_tx_done_optimized().
Because this function was in while loop,
BUG_ON(count > dql->num_queued - dql->num_completed)
was hit in dql_completed().

Signed-off-by: Igor Maravic <igorm@etf.rs>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoniu: Fix typo in comment.
David S. Miller [Fri, 2 Dec 2011 17:37:33 +0000 (12:37 -0500)]
niu: Fix typo in comment.

Reported-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Fri, 2 Dec 2011 04:09:08 +0000 (20:09 -0800)]
Merge git://git./linux/kernel/git/davem/net

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
  netfilter: Remove ADVANCED dependency from NF_CONNTRACK_NETBIOS_NS
  ipv4: flush route cache after change accept_local
  sch_red: fix red_change
  Revert "udp: remove redundant variable"
  bridge: master device stuck in no-carrier state forever when in user-stp mode
  ipv4: Perform peer validation on cached route lookup.
  net/core: fix rollback handler in register_netdevice_notifier
  sch_red: fix red_calc_qavg_from_idle_time
  bonding: only use primary address for ARP
  ipv4: fix lockdep splat in rt_cache_seq_show
  sch_teql: fix lockdep splat
  net: fec: Select the FEC driver by default for i.MX SoCs
  isdn: avoid copying too long drvid
  isdn: make sure strings are null terminated
  netlabel: Fix build problems when IPv6 is not enabled
  sctp: better integer overflow check in sctp_auth_create_key()
  sctp: integer overflow in sctp_auth_create_key()
  ipv6: Set mcast_hops to IPV6_DEFAULT_MCASTHOPS when -1 was given.
  net: Fix corruption in /proc/*/net/dev_mcast
  mac80211: fix race between the AGG SM and the Tx data path
  ...

13 years agonetfilter: Remove ADVANCED dependency from NF_CONNTRACK_NETBIOS_NS
David S. Miller [Fri, 2 Dec 2011 03:19:01 +0000 (22:19 -0500)]
netfilter: Remove ADVANCED dependency from NF_CONNTRACK_NETBIOS_NS

firewalld in Fedora 16 needs this.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoniu: Add support for byte queue limits.
David S. Miller [Fri, 2 Dec 2011 03:15:47 +0000 (22:15 -0500)]
niu: Add support for byte queue limits.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoniu: Remove redundant PHY ID test.
David S. Miller [Fri, 2 Dec 2011 02:59:07 +0000 (21:59 -0500)]
niu: Remove redundant PHY ID test.

Reported-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: flush route cache after change accept_local
Peter Pan(潘卫平) [Thu, 1 Dec 2011 15:47:06 +0000 (15:47 +0000)]
ipv4: flush route cache after change accept_local

After reset ipv4_devconf->data[IPV4_DEVCONF_ACCEPT_LOCAL] to 0,
we should flush route cache, or it will continue receive packets with local
source address, which should be dropped.

Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosch_red: fix red_change
Eric Dumazet [Thu, 1 Dec 2011 11:06:34 +0000 (11:06 +0000)]
sch_red: fix red_change

Le mercredi 30 novembre 2011 à 14:36 -0800, Stephen Hemminger a écrit :

> (Almost) nobody uses RED because they can't figure it out.
> According to Wikipedia, VJ says that:
>  "there are not one, but two bugs in classic RED."

RED is useful for high throughput routers, I doubt many linux machines
act as such devices.

I was considering adding Adaptative RED (Sally Floyd, Ramakrishna
Gummadi, Scott Shender), August 2001

In this version, maxp is dynamic (from 1% to 50%), and user only have to
setup min_th (target average queue size)
(max_th and wq (burst in linux RED) are automatically setup)

By the way it seems we have a small bug in red_change()

if (skb_queue_empty(&sch->q))
red_end_of_idle_period(&q->parms);

First, if queue is empty, we should call
red_start_of_idle_period(&q->parms);

Second, since we dont use anymore sch->q, but q->qdisc, the test is
meaningless.

Oh well...

[PATCH] sch_red: fix red_change()

Now RED is classful, we must check q->qdisc->q.qlen, and if queue is empty,
we start an idle period, not end it.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoLinux 3.2-rc4
Linus Torvalds [Thu, 1 Dec 2011 22:56:01 +0000 (14:56 -0800)]
Linux 3.2-rc4

13 years agoMerge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec...
Linus Torvalds [Thu, 1 Dec 2011 22:55:34 +0000 (14:55 -0800)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jlbec/ocfs2

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (31 commits)
  ocfs2: avoid unaligned access to dqc_bitmap
  ocfs2: Use filemap_write_and_wait() instead of write_inode_now()
  ocfs2: honor O_(D)SYNC flag in fallocate
  ocfs2: Add a missing journal credit in ocfs2_link_credits() -v2
  ocfs2: send correct UUID to cleancache initialization
  ocfs2: Commit transactions in error cases -v2
  ocfs2: make direntry invalid when deleting it
  fs/ocfs2/dlm/dlmlock.c: free kmem_cache_zalloc'd data using kmem_cache_free
  ocfs2: Avoid livelock in ocfs2_readpage()
  ocfs2: serialize unaligned aio
  ocfs2: Implement llseek()
  ocfs2: Fix ocfs2_page_mkwrite()
  ocfs2: Add comment about orphan scanning
  ocfs2: Clean up messages in the fs
  ocfs2/cluster: Cluster up now includes network connections too
  ocfs2/cluster: Add new function o2net_fill_node_map()
  ocfs2/cluster: Fix output in file elapsed_time_in_ms
  ocfs2/dlm: dlmlock_remote() needs to account for remastery
  ocfs2/dlm: Take inflight reference count for remotely mastered resources too
  ocfs2/dlm: Cleanup dlm_wait_for_node_death() and dlm_wait_for_node_recovery()
  ...

13 years agoocfs2: avoid unaligned access to dqc_bitmap
Akinobu Mita [Tue, 15 Nov 2011 22:56:34 +0000 (14:56 -0800)]
ocfs2: avoid unaligned access to dqc_bitmap

The dqc_bitmap field of struct ocfs2_local_disk_chunk is 32-bit aligned,
but not 64-bit aligned.  The dqc_bitmap is accessed by ocfs2_set_bit(),
ocfs2_clear_bit(), ocfs2_test_bit(), or ocfs2_find_next_zero_bit().  These
are wrapper macros for ext2_*_bit() which need to take an unsigned long
aligned address (though some architectures are able to handle unaligned
address correctly)

So some 64bit architectures may not be able to access the dqc_bitmap
correctly.

This avoids such unaligned access by using another wrapper functions for
ext2_*_bit().  The code is taken from fs/ext4/mballoc.c which also need to
handle unaligned bitmap access.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Joel Becker <jlbec@evilplan.org>
13 years agoMerge branch 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur...
Linus Torvalds [Thu, 1 Dec 2011 19:53:54 +0000 (11:53 -0800)]
Merge branch 'fixes' of ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm

* 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm:
  ARM: 7182/1: ARM cpu topology: fix warning
  ARM: 7181/1: Restrict kprobes probing SWP instructions to ARMv5 and below
  ARM: 7180/1: Change kprobes testcase with unpredictable STRD instruction
  ARM: 7177/1: GIC: avoid skipping non-existent PPIs in irq_start calculation
  ARM: 7176/1: cpu_pm: register GIC PM notifier only once
  ARM: 7175/1: add subname parameter to mfp_set_groupg callers
  ARM: 7174/1: Fix build error in kprobes test code on Thumb2 kernels
  ARM: 7172/1: dma: Drop GFP_COMP for DMA memory allocations
  ARM: 7171/1: unwind: add unwind directives to bitops assembly macros
  ARM: 7170/2: fix compilation breakage in entry-armv.S
  ARM: 7168/1: use cache type functions for arch_get_unmapped_area
  ARM: perf: check that we have a platform device when reserving PMU
  ARM: 7166/1: Use PMD_SHIFT instead of PGDIR_SHIFT in dma-consistent.c
  ARM: 7165/2: PL330: Fix typo in _prepare_ccr()
  ARM: 7163/2: PL330: Only register usable channels
  ARM: 7162/1: errata: tidy up Kconfig options for PL310 errata workarounds
  ARM: 7161/1: errata: no automatic store buffer drain
  ARM: perf: initialise used_mask for fake PMU during validation
  ARM: PMU: remove pmu_init declaration
  ARM: PMU: re-export release_pmu symbol to modules

13 years agodccp: Fix compile warning in probe code.
David S. Miller [Thu, 1 Dec 2011 19:45:49 +0000 (14:45 -0500)]
dccp: Fix compile warning in probe code.

Commit 1386be55e32a3c5d8ef4a2b243c530a7b664c02c ("dccp: fix
auto-loading of dccp(_probe)") fixed a bug but created a new
compiler warning:

net/dccp/probe.c: In function ‘dccpprobe_init’:
net/dccp/probe.c:166:2: warning: the omitted middle operand in ?: will always be ‘true’, suggest explicit middle operand [-Wparentheses]

try_then_request_module() is built for situations where the
"existence" test is some lookup function that returns a non-NULL
object on success, and with a reference count of some kind held.

Here we're looking for a success return of zero from the jprobe
registry.

Instead of fighting the way try_then_request_module() works, simply
open code what we want to happen in a local helper function.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: Make ndo_neigh_destroy return void.
David S. Miller [Thu, 1 Dec 2011 19:16:04 +0000 (14:16 -0500)]
net: Make ndo_neigh_destroy return void.

The return value isn't used.

Suggested by Ben Hucthings.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoRevert "udp: remove redundant variable"
David S. Miller [Thu, 1 Dec 2011 19:12:55 +0000 (14:12 -0500)]
Revert "udp: remove redundant variable"

This reverts commit 81d54ec8479a2c695760da81f05b5a9fb2dbe40a.

If we take the "try_again" goto, due to a checksum error,
the 'len' has already been truncated.  So we won't compute
the same values as the original code did.

Reported-by: paul bilke <fsmail@conspiracy.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobridge: master device stuck in no-carrier state forever when in user-stp mode
Vitalii Demianets [Fri, 25 Nov 2011 00:16:37 +0000 (00:16 +0000)]
bridge: master device stuck in no-carrier state forever when in user-stp mode

When in user-stp mode, bridge master do not follow state of its slaves, so
after the following sequence of events it can stuck forever in no-carrier
state:
1) turn stp off
2) put all slaves down - master device will follow their state and also go in
no-carrier state
3) turn stp on with bridge-stp script returning 0 (go to the user-stp mode)
Now bridge master won't follow slaves' state and will never reach running
state.

This patch solves the problem by making user-stp and kernel-stp behavior
similar regarding master following slaves' states.

Signed-off-by: Vitalii Demianets <vitas@nppfactor.kiev.ua>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: Perform peer validation on cached route lookup.
David S. Miller [Thu, 1 Dec 2011 18:38:59 +0000 (13:38 -0500)]
ipv4: Perform peer validation on cached route lookup.

Otherwise we won't notice the peer GENID change.

Reported-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: use a 64bit load/store in output path
Eric Dumazet [Wed, 30 Nov 2011 19:00:53 +0000 (19:00 +0000)]
ipv4: use a 64bit load/store in output path

gcc compiler is smart enough to use a single load/store if we
memcpy(dptr, sptr, 8) on x86_64, regardless of
CONFIG_CC_OPTIMIZE_FOR_SIZE

In IP header, daddr immediately follows saddr, this wont change in the
future. We only need to make sure our flowi4 (saddr,daddr) fields wont
break the rule.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agodccp: Evaluate ip_hdr() only once in dccp_v4_route_skb().
David S. Miller [Thu, 1 Dec 2011 18:28:34 +0000 (13:28 -0500)]
dccp: Evaluate ip_hdr() only once in dccp_v4_route_skb().

This also works around a bogus gcc warning generated by an
upcoming patch from Eric Dumazet that rearranges the layout
of struct flowi4.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agopowerpc: tqm8548/tqm8xx: add and update CAN device nodes
Wolfgang Grandegger [Wed, 30 Nov 2011 23:41:21 +0000 (23:41 +0000)]
powerpc: tqm8548/tqm8xx: add and update CAN device nodes

This patch enables or updates support for the CC770 and AN82527
CAN controller on the TQM8548 and TQM8xx boards.

CC: devicetree-discuss@lists.ozlabs.org
CC: linuxppc-dev@ozlabs.org
CC: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agocan: cc770: add platform bus driver for the CC770 and AN82527
Wolfgang Grandegger [Wed, 30 Nov 2011 23:41:20 +0000 (23:41 +0000)]
can: cc770: add platform bus driver for the CC770 and AN82527

This driver works with both, static platform data and device tree
bindings. It has been tested on a TQM855L board with two AN82527
CAN controllers on the local bus.

CC: Devicetree-discuss@lists.ozlabs.org
CC: linuxppc-dev@ozlabs.org
CC: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agocan: cc770: add legacy ISA bus driver for the CC770 and AN82527
Wolfgang Grandegger [Wed, 30 Nov 2011 23:41:19 +0000 (23:41 +0000)]
can: cc770: add legacy ISA bus driver for the CC770 and AN82527

This patch adds support for legacy Bosch CC770 and Intel AN82527 CAN
controllers on the ISA or PC-104 bus. The I/O port or memory address
and the IRQ number must be specified via module parameters:

  insmod cc770_isa.ko port=0x310,0x380 irq=7,11

for ISA devices using I/O ports or:

  insmod cc770_isa.ko mem=0xd1000,0xd1000 irq=7,11

for memory mapped ISA devices.

Indirect access via address and data port is supported as well:

  insmod cc770_isa.ko port=0x310,0x380 indirect=1 irq=7,11

Furthermore, the following mode parameter can be defined:

  clk: External oscillator clock frequency (default=16000000 [16 MHz])
  cir: CPU interface register (default=0x40 [DSC])
  bcr: Bus configuration register (default=0x40 [CBY])
  cor: Clockout register (default=0x00)

Note: for clk, cir, bcr and cor, the first argument re-defines the
default for all other devices, e.g.:

  insmod cc770_isa.ko mem=0xd1000,0xd1000 irq=7,11 clk=24000000

is equivalent to

  insmod cc770_isa.ko mem=0xd1000,0xd1000 irq=7,11 clk=24000000,24000000

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agocan: cc770: add driver core for the Bosch CC770 and Intel AN82527
Wolfgang Grandegger [Wed, 30 Nov 2011 23:41:18 +0000 (23:41 +0000)]
can: cc770: add driver core for the Bosch CC770 and Intel AN82527

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoicplus: mdio_write(), remove unnecessary for loop
Patrick Kelle [Thu, 1 Dec 2011 17:54:46 +0000 (12:54 -0500)]
icplus: mdio_write(), remove unnecessary for loop

At this point the variable j is always set to 7 and the code within
the loop has to run only once anyway.

As suggested by David Miller:
"You can simply this even further since p[7] is what is used here,
and this means len is one, the inner loop therefore executes only
once, and the p[7].field value is not used (it's zero in the table)
and the write to it is completely thrown away."

Signed-off-by: Patrick Kelle <patrick.kelle81@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: net_device flags is an unsigned int
Eric Dumazet [Wed, 30 Nov 2011 21:42:26 +0000 (21:42 +0000)]
net: net_device flags is an unsigned int

commit b00055aacdb ([NET] core: add RFC2863 operstate) changed
net_device flags from unsigned short to unsigned int.

Some core functions still assume its an unsigned short.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agodsa: Include linux/if_ether.h to fix build error
Axel Lin [Wed, 30 Nov 2011 22:07:18 +0000 (22:07 +0000)]
dsa: Include linux/if_ether.h to fix build error

Include linux/if_ether.h to fix below build errors:

  CC      arch/arm/mach-kirkwood/common.o
In file included from arch/arm/mach-kirkwood/common.c:19:
include/net/dsa.h: In function 'dsa_uses_dsa_tags':
include/net/dsa.h:192: error: 'ETH_P_DSA' undeclared (first use in this function)
include/net/dsa.h:192: error: (Each undeclared identifier is reported only once
include/net/dsa.h:192: error: for each function it appears in.)
include/net/dsa.h: In function 'dsa_uses_trailer_tags':
include/net/dsa.h:197: error: 'ETH_P_TRAILER' undeclared (first use in this function)
make[1]: *** [arch/arm/mach-kirkwood/common.o] Error 1
make: *** [arch/arm/mach-kirkwood] Error 2

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonetem: fix build error on 32bit arches
Eric Dumazet [Wed, 30 Nov 2011 23:32:14 +0000 (23:32 +0000)]
netem: fix build error on 32bit arches

ERROR: "__udivdi3" [net/sched/sch_netem.ko] undefined!

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux...
Linus Torvalds [Thu, 1 Dec 2011 16:28:53 +0000 (08:28 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: fix meta data raid-repair merge problem
  Btrfs: skip allocation attempt from empty cluster
  Btrfs: skip block groups without enough space for a cluster
  Btrfs: start search for new cluster at the beginning
  Btrfs: reset cluster's max_size when creating bitmap
  Btrfs: initialize new bitmaps' list
  Btrfs: fix oops when calling statfs on readonly device
  Btrfs: Don't error on resizing FS to same size
  Btrfs: fix deadlock on metadata reservation when evicting a inode
  Fix URL of btrfs-progs git repository in docs
  btrfs scrub: handle -ENOMEM from init_ipath()

13 years agoBtrfs: fix meta data raid-repair merge problem
Jan Schmidt [Thu, 1 Dec 2011 14:30:36 +0000 (09:30 -0500)]
Btrfs: fix meta data raid-repair merge problem

Commit 4a54c8c16 introduced raid-repair, killing the individual
readpage_io_failed_hook entries from inode.c and disk-io.c. Commit
4bb31e92 introduced new readahead code, adding a readpage_io_failed_hook to
disk-io.c.

The raid-repair commit had logic to disable raid-repair, if
readpage_io_failed_hook is set. Thus, the readahead commit effectively
disabled raid-repair for meta data.

This commit changes the logic to always attempt raid-repair when needed and
call the readpage_io_failed_hook in case raid-repair fails. This is much
more straight forward and should have been like that from the beginning.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Reported-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years agonet/core: fix rollback handler in register_netdevice_notifier
RongQing.Li [Thu, 1 Dec 2011 04:43:07 +0000 (23:43 -0500)]
net/core: fix rollback handler in register_netdevice_notifier

Within nested statements, the break statement terminates only the
do, for, switch, or while statement that immediately encloses it,
So replace the break with goto.

Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agocaif: Remove unused attributes from struct cflayer
sjur.brandeland@stericsson.com [Wed, 30 Nov 2011 13:02:32 +0000 (13:02 +0000)]
caif: Remove unused attributes from struct cflayer

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agocaif: Remove unused enum and parameter in cfserl
sjur.brandeland@stericsson.com [Wed, 30 Nov 2011 09:22:48 +0000 (09:22 +0000)]
caif: Remove unused enum and parameter in cfserl

Remove unused enum cfcnfg_phy_type and the parameter to cfserl_create.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agocaif: Restructure how link caif link layer enroll
sjur.brandeland@stericsson.com [Wed, 30 Nov 2011 09:22:47 +0000 (09:22 +0000)]
caif: Restructure how link caif link layer enroll

Enrolling CAIF link layers are refactored.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agocaif: Allow cfpkt_extr_head to process empty message
sjur.brandeland@stericsson.com [Wed, 30 Nov 2011 09:22:46 +0000 (09:22 +0000)]
caif: Allow cfpkt_extr_head to process empty message

Allow NULL pointer in cfpkt_extr_head in order to
skip past header data.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosch_red: fix red_calc_qavg_from_idle_time
Eric Dumazet [Wed, 30 Nov 2011 12:10:53 +0000 (12:10 +0000)]
sch_red: fix red_calc_qavg_from_idle_time

Since commit a4a710c4a7490587 (pkt_sched: Change PSCHED_SHIFT from 10 to
6) it seems RED/GRED are broken.

red_calc_qavg_from_idle_time() computes a delay in us units, but this
delay is now 16 times bigger than real delay, so the final qavg result
smaller than expected.

Use standard kernel time services since there is no need to obfuscate
them.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonetem: rate extension
Hagen Paul Pfeifer [Wed, 30 Nov 2011 12:20:26 +0000 (12:20 +0000)]
netem: rate extension

Currently netem is not in the ability to emulate channel bandwidth. Only static
delay (and optional random jitter) can be configured.

To emulate the channel rate the token bucket filter (sch_tbf) can be used.  But
TBF has some major emulation flaws. The buffer (token bucket depth/rate) cannot
be 0. Also the idea behind TBF is that the credit (token in buckets) fills if
no packet is transmitted. So that there is always a "positive" credit for new
packets. In real life this behavior contradicts the law of nature where
nothing can travel faster as speed of light. E.g.: on an emulated 1000 byte/s
link a small IPv4/TCP SYN packet with ~50 byte require ~0.05 seconds - not 0
seconds.

Netem is an excellent place to implement a rate limiting feature: static
delay is already implemented, tfifo already has time information and the
user can skip TBF configuration completely.

This patch implement rate feature which can be configured via tc. e.g:

tc qdisc add dev eth0 root netem rate 10kbit

To emulate a link of 5000byte/s and add an additional static delay of 10ms:

tc qdisc add dev eth0 root netem delay 10ms rate 5KBps

Note: similar to TBF the rate extension is bounded to the kernel timing
system. Depending on the architecture timer granularity, higher rates (e.g.
10mbit/s and higher) tend to transmission bursts. Also note: further queues
living in network adaptors; see ethtool(8).

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>
13 years agoipv6 : mcast : Delete useless parameter in ip6_mc_add1_src()
Jun Zhao [Wed, 30 Nov 2011 06:21:05 +0000 (06:21 +0000)]
ipv6 : mcast : Delete useless parameter in ip6_mc_add1_src()

Need not to used 'delta' flag when add single-source to interface
filter source list.

Signed-off-by: Jun Zhao <mypopydev@gmail.com>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>
13 years agoipv4 : igmp : Delete useless parameter in ip_mc_add1_src()
Jun Zhao [Wed, 30 Nov 2011 06:21:04 +0000 (06:21 +0000)]
ipv4 : igmp : Delete useless parameter in ip_mc_add1_src()

Need not to used 'delta' flag when add single-source to interface
filter source list.

Signed-off-by: Jun Zhao <mypopydev@gmail.com>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>
13 years agobonding: only use primary address for ARP
Henrik Saavedra Persson [Wed, 23 Nov 2011 23:37:15 +0000 (23:37 +0000)]
bonding: only use primary address for ARP

Only use the primary address of the bond device
for master_ip. This will prevent changing the ARP source
address in Active-Backup mode whenever a secondry address
is added to the bond device.

Signed-off-by: Henrik Saavedra Persson <henrik.e.persson@ericsson.com>
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>
13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland...
Linus Torvalds [Thu, 1 Dec 2011 00:25:02 +0000 (16:25 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB: Fix RCU lockdep splats
  IB/ipoib: Prevent hung task or softlockup processing multicast response
  IB/qib: Fix over-scheduling of QSFP work
  RDMA/cxgb4: Fix retry with MPAv1 logic for MPAv2
  RDMA/cxgb4: Fix iw_cxgb4 count_rcqes() logic
  IB/qib: Don't use schedule_work()

13 years agoMerge branch 'dt-for-linus' of git://sources.calxeda.com/kernel/linux
Linus Torvalds [Thu, 1 Dec 2011 00:24:43 +0000 (16:24 -0800)]
Merge branch 'dt-for-linus' of git://sources.calxeda.com/kernel/linux

* 'dt-for-linus' of git://sources.calxeda.com/kernel/linux:
  of: Add Silicon Image vendor prefix
  of/irq: of_irq_init: add check for parent equal to child node

13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie...
Linus Torvalds [Thu, 1 Dec 2011 00:24:24 +0000 (16:24 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/broonie/regulator

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: twl: fix twl4030 support for smps regulators
  regulator: fix use after free bug
  regulator: aat2870: Fix the logic of checking if no id is matched in aat2870_get_regulator

13 years agoMerge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Linus Torvalds [Thu, 1 Dec 2011 00:23:59 +0000 (16:23 -0800)]
Merge branch 'fixes' of git://git./linux/kernel/git/arm/arm-soc

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (45 commits)
  ARM: ux500: update defconfig
  ARM: u300: update defconfig
  ARM: at91: enable additional boards in existing soc defconfig files
  ARM: at91: refresh soc defconfig files for 3.2
  ARM: at91: rename defconfig files appropriately
  ARM: OMAP2+: Fix Compilation error when omap_l3_noc built as module
  ARM: OMAP2+: Remove empty io.h
  ARM: OMAP2: select ARM_AMBA if OMAP3_EMU is defined
  ARM: OMAP: smartreflex: fix IRQ handling bug
  ARM: OMAP: PM: only register TWL with voltage layer when device is present
  ARM: OMAP: hwmod: Fix the addr space, irq, dma count APIs
  arm: mx28: fix bit operation in clock setting
  ARM: imx: export imx_ioremap
  ARM: imx/mm-imx3: conditionally compile i.MX31 and i.MX35 code
  ARM: mx5: Fix checkpatch warnings in cpu-imx5.c
  MAINTAINERS: Add missing directory
  ARM: imx: drop 'ARCH_MX31' and 'ARCH_MX35'
  ARM: imx6q: move clock register map to machine_desc.map_io
  ARM: pxa168/gplugd: add the correct SSP device
  ARM: Update mach-types to fix mxs build breakage
  ...

13 years agoARM: 7182/1: ARM cpu topology: fix warning
Vincent Guittot [Tue, 29 Nov 2011 14:50:20 +0000 (15:50 +0100)]
ARM: 7182/1: ARM cpu topology: fix warning

kernel/sched.c:7354:2: warning: initialization from incompatible pointer type

Align cpu_coregroup_mask prototype interface with sched_domain_mask_f typedef
use int cpu instead of unsigned int cpu

Cc: <stable@vger.kernel.org>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
13 years agoARM: 7181/1: Restrict kprobes probing SWP instructions to ARMv5 and below
Jon Medhurst (Tixy) [Tue, 29 Nov 2011 07:16:02 +0000 (08:16 +0100)]
ARM: 7181/1: Restrict kprobes probing SWP instructions to ARMv5 and below

The SWP instruction is deprecated on ARMv6 and with ARMv7 it will be
UNDEFINED when CONFIG_SWP_EMULATE is selected. In this case, probing a
SWP instruction will cause an oops when the kprobes emulation code
executes an undefined instruction.

As the SWP instruction should be rare or non-existent in kernels for
ARMv6 and later, we can simply avoid these problems by not allowing
probing of these.

Reported-by: Leif Lindholm <leif.lindholm@arm.com>
Tested-by: Leif Lindholm <leif.lindholm@arm.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Jon Medhurst <tixy@yxit.co.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
13 years agoARM: 7180/1: Change kprobes testcase with unpredictable STRD instruction
Jon Medhurst (Tixy) [Tue, 29 Nov 2011 07:14:35 +0000 (08:14 +0100)]
ARM: 7180/1: Change kprobes testcase with unpredictable STRD instruction

There is a kprobes testcase for the instruction "strd r2, [r3], r4".
This has unpredictable behaviour as it uses r3 for register writeback
addressing and also stores it to memory.

On a cortex A9, this testcase would fail because the instruction writes
the updated value of r3 to memory, whereas the kprobes emulation code
writes the original value.

Fix this by changing testcase to used r5 instead of r3.

Reported-by: Leif Lindholm <leif.lindholm@arm.com>
Tested-by: Leif Lindholm <leif.lindholm@arm.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Jon Medhurst <tixy@yxit.co.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
13 years agoatm: clip: Use device neigh support on top of "arp_tbl".
David Miller [Mon, 25 Jul 2011 00:01:41 +0000 (00:01 +0000)]
atm: clip: Use device neigh support on top of "arp_tbl".

Instead of instantiating an entire new neigh_table instance
just for ATM handling, use the neigh device private facility.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoneigh: Add device constructor/destructor capability.
David Miller [Mon, 25 Jul 2011 00:01:38 +0000 (00:01 +0000)]
neigh: Add device constructor/destructor capability.

If the neigh entry has device private state, it will need
constructor/destructor ops.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoatm: clip: Convert over to neighbour_priv()
David Miller [Mon, 25 Jul 2011 00:01:33 +0000 (00:01 +0000)]
atm: clip: Convert over to neighbour_priv()

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoneigh: Do not set tbl->entry_size in ipv4/ipv6 neigh tables.
David Miller [Mon, 25 Jul 2011 00:01:28 +0000 (00:01 +0000)]
neigh: Do not set tbl->entry_size in ipv4/ipv6 neigh tables.

Let the core self-size the neigh entry based upon the key length.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoneigh: Add infrastructure for allocating device neigh privates.
David Miller [Mon, 25 Jul 2011 00:01:25 +0000 (00:01 +0000)]
neigh: Add infrastructure for allocating device neigh privates.

netdev->neigh_priv_len records the private area length.

This will trigger for neigh_table objects which set tbl->entry_size
to zero, and the first instances of this will be forthcoming.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoneigh: Get rid of neigh_table->kmem_cachep
David Miller [Mon, 25 Jul 2011 00:01:22 +0000 (00:01 +0000)]
neigh: Get rid of neigh_table->kmem_cachep

We are going to alloc for device specific private areas for
neighbour entries, and in order to do that we have to move
away from the fixed allocation size enforced by using
neigh_table->kmem_cachep

As a nice side effect we can now use kfree_rcu().

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoneigh: Create mechanism for generic neigh private areas.
David Miller [Mon, 25 Jul 2011 00:01:17 +0000 (00:01 +0000)]
neigh: Create mechanism for generic neigh private areas.

The implementation private sits right after the primary_key memory.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: fix lockdep splat in rt_cache_seq_show
Eric Dumazet [Tue, 29 Nov 2011 20:05:55 +0000 (20:05 +0000)]
ipv4: fix lockdep splat in rt_cache_seq_show

After commit f2c31e32b378 (fix NULL dereferences in check_peer_redir()),
dst_get_neighbour() should be guarded by rcu_read_lock() /
rcu_read_unlock() section.

Reported-by: Miles Lane <miles.lane@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosfc: fix race in efx_enqueue_skb_tso()
Eric Dumazet [Wed, 30 Nov 2011 22:12:27 +0000 (17:12 -0500)]
sfc: fix race in efx_enqueue_skb_tso()

As soon as skb is pushed to hardware, it can be completed and freed, so
we should not dereference skb anymore.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosch_teql: fix lockdep splat
Eric Dumazet [Wed, 30 Nov 2011 04:08:58 +0000 (04:08 +0000)]
sch_teql: fix lockdep splat

We need rcu_read_lock() protection before using dst_get_neighbour(), and
we must cache its value (pass it to __teql_resolve())

teql_master_xmit() is called under rcu_read_lock_bh() protection, its
not enough.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobnx2: Support for byte queue limits
Eric Dumazet [Tue, 29 Nov 2011 11:53:05 +0000 (11:53 +0000)]
bnx2: Support for byte queue limits

Changes to bnx2 to use byte queue limits.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: fec: Select the FEC driver by default for i.MX SoCs
Fabio Estevam [Wed, 30 Nov 2011 22:07:21 +0000 (17:07 -0500)]
net: fec: Select the FEC driver by default for i.MX SoCs

Since commit 230dec6 (net/fec: add imx6q enet support) the FEC driver is no
longer built by default for i.MX SoCs.

Let the FEC driver be built by default again.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Suggested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotcp: inherit listener congestion control for passive cnx
Eric Dumazet [Wed, 30 Nov 2011 01:02:41 +0000 (01:02 +0000)]
tcp: inherit listener congestion control for passive cnx

Rick Jones reported that TCP_CONGESTION sockopt performed on a listener
was ignored for its children sockets : right after accept() the
congestion control for new socket is the system default one.

This seems an oversight of the initial design (quoted from Stephen)

Based on prior investigation and patch from Rick.

Reported-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Yuchung Cheng <ycheng@google.com>
Tested-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agocan: Revert outdated cc770 driver patches.
David S. Miller [Wed, 30 Nov 2011 21:00:48 +0000 (16:00 -0500)]
can: Revert outdated cc770 driver patches.

Newer versions have been floating about, and I applied
to older variant unfortunately.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
John W. Linville [Wed, 30 Nov 2011 19:14:42 +0000 (14:14 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless into for-davem

13 years agoBtrfs: skip allocation attempt from empty cluster
Alexandre Oliva [Wed, 30 Nov 2011 18:43:00 +0000 (13:43 -0500)]
Btrfs: skip allocation attempt from empty cluster

If we don't have a cluster, don't bother trying to allocate from it,
jumping right away to the attempt to allocate a new cluster.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years agoBtrfs: skip block groups without enough space for a cluster
Alexandre Oliva [Wed, 30 Nov 2011 18:43:00 +0000 (13:43 -0500)]
Btrfs: skip block groups without enough space for a cluster

We test whether a block group has enough free space to hold the
requested block, but when we're doing clustered allocation, we can
save some cycles by testing whether it has enough room for the cluster
upfront, otherwise we end up attempting to set up a cluster and
failing.  Only in the NO_EMPTY_SIZE loop do we attempt an unclustered
allocation, and by then we'll have zeroed the cluster size, so this
patch won't stop us from using the block group as a last resort.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years agoBtrfs: start search for new cluster at the beginning
Alexandre Oliva [Wed, 30 Nov 2011 18:43:00 +0000 (13:43 -0500)]
Btrfs: start search for new cluster at the beginning

Instead of starting at zero (offset is always zero), request a cluster
starting at search_start, that denotes the beginning of the current
block group.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years agoBtrfs: reset cluster's max_size when creating bitmap
Alexandre Oliva [Wed, 30 Nov 2011 18:43:00 +0000 (13:43 -0500)]
Btrfs: reset cluster's max_size when creating bitmap

The field that indicates the size of the largest contiguous chunk of
free space in the cluster is not initialized when setting up bitmaps,
it's only increased when we find a larger contiguous chunk.  We end up
retaining a larger value than appropriate for highly-fragmented
clusters, which may cause pointless searches for large contiguous
groups, and even cause clusters that do not meet the density
requirements to be set up.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years agoBtrfs: initialize new bitmaps' list
Alexandre Oliva [Mon, 28 Nov 2011 14:04:43 +0000 (12:04 -0200)]
Btrfs: initialize new bitmaps' list

We're failing to create clusters with bitmaps because
setup_cluster_no_bitmap checks that the list is empty before inserting
the bitmap entry in the list for setup_cluster_bitmap, but the list
field is only initialized when it is restored from the on-disk free
space cache, or when it is written out to disk.

Besides a potential race condition due to the multiple use of the list
field, filesystem performance severely degrades over time: as we use
up all non-bitmap free extents, the try-to-set-up-cluster dance is
done at every metadata block allocation.  For every block group, we
fail to set up a cluster, and after failing on them all up to twice,
we fall back to the much slower unclustered allocation.

To make matters worse, before the unclustered allocation, we try to
create new block groups until we reach the 1% threshold, which
introduces additional bitmaps and thus block groups that we'll iterate
over at each metadata block request.

13 years agoBtrfs: fix oops when calling statfs on readonly device
Li Zefan [Mon, 28 Nov 2011 08:43:00 +0000 (16:43 +0800)]
Btrfs: fix oops when calling statfs on readonly device

To reproduce this bug:

  # dd if=/dev/zero of=img bs=1M count=256
  # mkfs.btrfs img
  # losetup -r /dev/loop1 img
  # mount /dev/loop1 /mnt
  OOPS!!

It triggered BUG_ON(!nr_devices) in btrfs_calc_avail_data_space().

To fix this, instead of checking write-only devices, we check all open
deivces:

  # df -h /dev/loop1
  Filesystem            Size  Used Avail Use% Mounted on
  /dev/loop1            250M   28K  238M   1% /mnt

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
13 years agoBtrfs: Don't error on resizing FS to same size
Mike Fleetwood [Fri, 18 Nov 2011 18:55:01 +0000 (18:55 +0000)]
Btrfs: Don't error on resizing FS to same size

It seems overly harsh to fail a resize of a btrfs file system to the
same size when a shrink or grow would succeed.  User app GParted trips
over this error.  Allow it by bypassing the shrink or grow operation.

Signed-off-by: Mike Fleetwood <mike.fleetwood@googlemail.com>
13 years agoBtrfs: fix deadlock on metadata reservation when evicting a inode
Miao Xie [Fri, 18 Nov 2011 09:43:00 +0000 (17:43 +0800)]
Btrfs: fix deadlock on metadata reservation when evicting a inode

When I ran the xfstests, I found the test tasks was blocked on meta-data
reservation.

By debugging, I found the reason of this bug:
   start transaction
        |
v
   reserve meta-data space
|
v
   flush delay allocation -> iput inode -> evict inode
^ |
| v
   wait for delay allocation flush <- reserve meta-data space

And besides that, the flush on evicting inode will block the thread, which
is reclaiming the memory, and make oom happen easily.

Fix this bug by skipping the flush step when evicting inode.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
13 years agoFix URL of btrfs-progs git repository in docs
Arnd Hannemann [Wed, 16 Nov 2011 16:35:37 +0000 (17:35 +0100)]
Fix URL of btrfs-progs git repository in docs

The location of the btrfs-progs repository has been changed.
This patch updates the documentation accordingly.

Signed-off-by: Arnd Hannemann <arnd@arndnet.de>
13 years agobtrfs scrub: handle -ENOMEM from init_ipath()
Dan Carpenter [Wed, 16 Nov 2011 08:28:01 +0000 (11:28 +0300)]
btrfs scrub: handle -ENOMEM from init_ipath()

init_ipath() can return an ERR_PTR(-ENOMEM).

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
13 years agoMerge branches 'cxgb4', 'ipoib', 'misc' and 'qib' into for-next
Roland Dreier [Wed, 30 Nov 2011 02:01:53 +0000 (18:01 -0800)]
Merge branches 'cxgb4', 'ipoib', 'misc' and 'qib' into for-next

13 years agosky2: add bql support
stephen hemminger [Tue, 29 Nov 2011 15:15:33 +0000 (15:15 +0000)]
sky2: add bql support

This adds support for byte queue limits and aggregates statistics
update (suggestion from Eric).

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>
13 years agobnx2x: handle iSCSI SD mode
Dmitry Kravkov [Mon, 28 Nov 2011 12:31:49 +0000 (12:31 +0000)]
bnx2x: handle iSCSI SD mode

in iSCSI SD mode to bnx2x device assigned single mac address
which is supposted to be iscsi mac. If this mode is recognized
bnx2x will disable LRO, decrease number of queues to 1 and rx ring
size to the minumum allowed by FW, this in order minimize memory use.
It will tranfer mac for iscsi usage and zero primary mac of the netdev.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoisdn: avoid copying too long drvid
Dan Carpenter [Thu, 24 Nov 2011 02:42:09 +0000 (02:42 +0000)]
isdn: avoid copying too long drvid

"cfg->drvid" comes from the user so there is a possibility they
didn't NUL terminate it properly.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoisdn: make sure strings are null terminated
Dan Carpenter [Thu, 24 Nov 2011 02:41:49 +0000 (02:41 +0000)]
isdn: make sure strings are null terminated

These strings come from the user.  We strcpy() them inside
cf_command() so we should check that they are NULL terminated and
return an error if not.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agocan: cc770: legacy CC770 ISA bus driver
Wolfgang Grandegger [Thu, 24 Nov 2011 02:07:28 +0000 (02:07 +0000)]
can: cc770: legacy CC770 ISA bus driver

This patch adds support for legacy Bosch CC770 and Intel AN82527 CAN
controllers on the ISA or PC-104 bus. The I/O port or memory address
and the IRQ number must be specified via module parameters:

  insmod cc770_isa.ko port=0x310,0x380 irq=7,11

for ISA devices using I/O ports or:

  insmod cc770_isa.ko mem=0xd1000,0xd1000 irq=7,11

for memory mapped ISA devices.

Indirect access via address and data port is supported as well:

  insmod cc770_isa.ko port=0x310,0x380 indirect=1 irq=7,11

Furthermore, the following mode parameter can be defined:

  clk: External oscillator clock frequency (default=16000000 [16 MHz])
  cir: CPU interface register (default=0x40 [CPU_DSC])
  ocr, Bus configuration register (default=0x00)
  cor, Clockout register (default=0x00)

Note: for clk, cir, bcr and cor, the first argument re-defines the
default for all other devices, e.g.:

  insmod cc770_isa.ko mem=0xd1000,0xd1000 irq=7,11 clk=24000000

is equivalent to

  insmod cc770_isa.ko mem=0xd1000,0xd1000 irq=7,11 clk=24000000,24000000

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agocan: cc770: add driver core for the Bosch CC770 and Intel AN82527
Wolfgang Grandegger [Thu, 24 Nov 2011 02:07:27 +0000 (02:07 +0000)]
can: cc770: add driver core for the Bosch CC770 and Intel AN82527

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet/smsc911x: Add regulator support
Robert Marklund [Thu, 24 Nov 2011 01:03:07 +0000 (01:03 +0000)]
net/smsc911x: Add regulator support

Add some basic regulator support for the power pins, as needed
by the ST-Ericsson Snowball platform that powers up the SMSC911
chip using an external regulator.

Platforms that use regulators and the smsc911x and have no defined
regulator for the smsc911x and claim complete regulator
constraints with no dummy regulators will need to provide it, for
example using a fixed voltage regulator. It appears that this may
affect (apart from Ux500 Snowball) possibly these archs/machines
that from some grep:s appear to define both CONFIG_SMSC911X and
CONFIG_REGULATOR:

- ARM Freescale mx3 and OMAP 2 plus, Raumfeld machines
- Blackfin
- Super-H

Cc: Paul Mundt <lethal@linux-sh.org>
Cc: linux-sh@vger.kernel.org
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Tony Lindgren <tony@atomide.com>
Cc: linux-omap@vger.kernel.org
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: uclinux-dist-devel@blackfin.uclinux.org
Reviewed-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Robert Marklund <robert.marklund@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agocan: sja1000_isa: convert to platform driver to support x86_64 systems
Wolfgang Grandegger [Wed, 23 Nov 2011 23:58:22 +0000 (23:58 +0000)]
can: sja1000_isa: convert to platform driver to support x86_64 systems

This driver is currently not supported on x86_64 systems because the
"isa_driver" interface is used (CONFIG_ISA=y). To overcome this
limitation, the driver is converted to a platform driver, similar to
the serial 8250 driver.

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: remove useless codes in ipmr_device_event()
RongQing.Li [Wed, 23 Nov 2011 23:10:52 +0000 (23:10 +0000)]
ipv4: remove useless codes in ipmr_device_event()

Commit 7dc00c82 added a 'notify' parameter for vif_delete() to
distinguish whether to unregister the device.

When notify=1 means we does not need to unregister the device,
so calling unregister_netdevice_many is useless.

Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agocan: sja1000_isa: fix "limited range" compiler warnings
Wolfgang Grandegger [Wed, 23 Nov 2011 23:08:35 +0000 (23:08 +0000)]
can: sja1000_isa: fix "limited range" compiler warnings

This patch fixes the compiler warnings: "comparison is always
false due to limited range of data type" by using "0xff" instead
of "-1" for unsigned values.

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: Fix skb_update_prio RCU usage.
Igor Maravic [Fri, 25 Nov 2011 07:44:54 +0000 (07:44 +0000)]
net: Fix skb_update_prio RCU usage.

Change function rcu_dereference to rcu_dereference_bh to avoid warning

[ INFO: suspicious RCU usage. ]
-------------------------------
net/core/dev.c:2459 suspicious rcu_dereference_check() usage!

because we are locking with

rcu_read_lock_bh();

in function dev_queue_xmit(struct sk_buff *skb)

Signed-off-by: Igor Maravic <igorm@etf.rs>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Tue, 29 Nov 2011 22:43:22 +0000 (14:43 -0800)]
Merge branch 'pm-fixes' of git://git./linux/kernel/git/rafael/linux-pm

* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: Update comments describing device power management callbacks
  PM / Sleep: Update documentation related to system wakeup
  PM / Runtime: Make documentation follow the new behavior of irq_safe
  PM / Sleep: Correct inaccurate information in devices.txt
  PM / Domains: Document how PM domains are used by the PM core
  PM / Hibernate: Do not leak memory in error/test code paths

13 years agonetlabel: Fix build problems when IPv6 is not enabled
Paul Moore [Tue, 29 Nov 2011 10:10:54 +0000 (10:10 +0000)]
netlabel: Fix build problems when IPv6 is not enabled

A recent fix to the the NetLabel code caused build problem with
configurations that did not have IPv6 enabled; see below:

 netlabel_kapi.c: In function 'netlbl_cfg_unlbl_map_add':
 netlabel_kapi.c:165:4:
  error: implicit declaration of function 'netlbl_af6list_add'

This patch fixes this problem by making the IPv6 specific code conditional
on the IPv6 configuration flags as we done in the rest of NetLabel and the
network stack as a whole.  We have to move some variable declarations
around as a result so things may not be quite as pretty, but at least it
builds cleanly now.

Some additional IPv6 conditionals were added to the NetLabel code as well
for the sake of consistency.

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoIB: Fix RCU lockdep splats
Eric Dumazet [Tue, 29 Nov 2011 21:31:23 +0000 (22:31 +0100)]
IB: Fix RCU lockdep splats

Commit f2c31e32b37 ("net: fix NULL dereferences in check_peer_redir()")
forgot to take care of infiniband uses of dst neighbours.

Many thanks to Marc Aurele who provided a nice bug report and feedback.

Reported-by: Marc Aurele La France <tsi@ualberta.ca>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
13 years agoIB/ipoib: Prevent hung task or softlockup processing multicast response
Mike Marciniszyn [Mon, 21 Nov 2011 13:43:54 +0000 (08:43 -0500)]
IB/ipoib: Prevent hung task or softlockup processing multicast response

This following can occur with ipoib when processing a multicast reponse:

    BUG: soft lockup - CPU#0 stuck for 67s! [ib_mad1:982]
    Modules linked in: ...
    CPU 0:
    Modules linked in: ...
    Pid: 982, comm: ib_mad1 Not tainted 2.6.32-131.0.15.el6.x86_64 #1 ProLiant DL160 G5
    RIP: 0010:[<ffffffff814ddb27>]  [<ffffffff814ddb27>] _spin_unlock_irqrestore+0x17/0x20
    RSP: 0018:ffff8802119ed860  EFLAGS: 00000246
    0000000000000004 RBX: ffff8802119ed860 RCX: 000000000000a299
    RDX: ffff88021086c700 RSI: 0000000000000246 RDI: 0000000000000246
    RBP: ffffffff8100bc8e R08: ffff880210ac229c R09: 0000000000000000
    R10: ffff88021278aab8 R11: 0000000000000000 R12: ffff8802119ed860
    R13: ffffffff8100be6e R14: 0000000000000001 R15: 0000000000000003
    FS:  0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 00000000006d4840 CR3: 0000000209aa5000 CR4: 00000000000406f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Call Trace:
    [<ffffffffa032c247>] ? ipoib_mcast_send+0x157/0x480 [ib_ipoib]
    [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20
    [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20
    [<ffffffffa03283d4>] ? ipoib_path_lookup+0x124/0x2d0 [ib_ipoib]
    [<ffffffffa03286fc>] ? ipoib_start_xmit+0x17c/0x430 [ib_ipoib]
    [<ffffffff8141e758>] ? dev_hard_start_xmit+0x2c8/0x3f0
    [<ffffffff81439d0a>] ? sch_direct_xmit+0x15a/0x1c0
    [<ffffffff81423098>] ? dev_queue_xmit+0x388/0x4d0
    [<ffffffffa032d6b7>] ? ipoib_mcast_join_finish+0x2c7/0x510 [ib_ipoib]
    [<ffffffffa032dab8>] ? ipoib_mcast_sendonly_join_complete+0x1b8/0x1f0 [ib_ipoib]
    [<ffffffffa02a0946>] ? mcast_work_handler+0x1a6/0x710 [ib_sa]
    [<ffffffffa015f01e>] ? ib_send_mad+0xfe/0x3c0 [ib_mad]
    [<ffffffffa00f6c93>] ? ib_get_cached_lmc+0xa3/0xb0 [ib_core]
    [<ffffffffa02a0f9b>] ? join_handler+0xeb/0x200 [ib_sa]
    [<ffffffffa029e4fc>] ? ib_sa_mcmember_rec_callback+0x5c/0xa0 [ib_sa]
    [<ffffffffa029e79c>] ? recv_handler+0x3c/0x70 [ib_sa]
    [<ffffffffa01603a4>] ? ib_mad_completion_handler+0x844/0x9d0 [ib_mad]
    [<ffffffffa015fb60>] ? ib_mad_completion_handler+0x0/0x9d0 [ib_mad]
    [<ffffffff81088830>] ? worker_thread+0x170/0x2a0
    [<ffffffff8108e160>] ? autoremove_wake_function+0x0/0x40
    [<ffffffff810886c0>] ? worker_thread+0x0/0x2a0
    [<ffffffff8108ddf6>] ? kthread+0x96/0xa0
    [<ffffffff8100c1ca>] ? child_rip+0xa/0x20

Coinciding with stack trace is the following message:

    ib0: ib_address_create failed

The code below in ipoib_mcast_join_finish() will note the above
failure in the address handle but otherwise continue:

                ah = ipoib_create_ah(dev, priv->pd, &av);
                if (!ah) {
                        ipoib_warn(priv, "ib_address_create failed\n");
                } else {

The while loop at the bottom of ipoib_mcast_join_finish() will attempt
to send queued multicast packets in mcast->pkt_queue and eventually
end up in ipoib_mcast_send():

        if (!mcast->ah) {
                if (skb_queue_len(&mcast->pkt_queue) < IPOIB_MAX_MCAST_QUEUE)
                        skb_queue_tail(&mcast->pkt_queue, skb);
                else {
                        ++dev->stats.tx_dropped;
                        dev_kfree_skb_any(skb);
                }

My read is that the code will requeue the packet and return to the
ipoib_mcast_join_finish() while loop and the stage is set for the
"hung" task diagnostic as the while loop never sees a non-NULL ah, and
will do nothing to resolve.

There are GFP_ATOMIC allocates in the provider routines, so this is
possible and should be dealt with.

The test that induced the failure is associated with a host SM on the
same server during a shutdown.

This patch causes ipoib_mcast_join_finish() to exit with an error
which will flush the queued mcast packets.  Nothing is done to unwind
the QP attached state so that subsequent sends from above will retry
the join.

Reviewed-by: Ram Vepa <ram.vepa@qlogic.com>
Reviewed-by: Gary Leshner <gary.leshner@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
13 years agosctp: better integer overflow check in sctp_auth_create_key()
Xi Wang [Tue, 29 Nov 2011 09:26:30 +0000 (09:26 +0000)]
sctp: better integer overflow check in sctp_auth_create_key()

The check from commit 30c2235c is incomplete and cannot prevent
cases like key_len = 0x80000000 (INT_MAX + 1).  In that case, the
left-hand side of the check (INT_MAX - key_len), which is unsigned,
becomes 0xffffffff (UINT_MAX) and bypasses the check.

However this shouldn't be a security issue.  The function is called
from the following two code paths:

 1) setsockopt()

 2) sctp_auth_asoc_set_secret()

In case (1), sca_keylength is never going to exceed 65535 since it's
bounded by a u16 from the user API.  As such, the key length will
never overflow.

In case (2), sca_keylength is computed based on the user key (1 short)
and 2 * key_vector (3 shorts) for a total of 7 * USHRT_MAX, which still
will not overflow.

In other words, this overflow check is not really necessary.  Just
make it more correct.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'slab/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg...
Linus Torvalds [Tue, 29 Nov 2011 19:13:22 +0000 (11:13 -0800)]
Merge branch 'slab/urgent' of git://git./linux/kernel/git/penberg/linux

* 'slab/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
  slub: avoid potential NULL dereference or corruption
  slub: use irqsafe_cpu_cmpxchg for put_cpu_partial
  slub: move discard_slab out of node lock
  slub: use correct parameter to add a page to partial list tail

13 years agosch_choke: use skb_flow_dissect()
Eric Dumazet [Tue, 29 Nov 2011 04:22:15 +0000 (04:22 +0000)]
sch_choke: use skb_flow_dissect()

Instead of using a custom flow dissector, use skb_flow_dissect() and
benefit from tunnelling support.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosch_sfq: use skb_flow_dissect()
Eric Dumazet [Tue, 29 Nov 2011 03:40:45 +0000 (03:40 +0000)]
sch_sfq: use skb_flow_dissect()

Instead of using a custom flow dissector, use skb_flow_dissect() and
benefit from tunnelling support.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotcp: avoid frag allocation for small frames
Eric Dumazet [Mon, 28 Nov 2011 22:41:47 +0000 (22:41 +0000)]
tcp: avoid frag allocation for small frames

tcp_sendmsg() uses select_size() helper to choose skb head size when a
new skb must be allocated.

If GSO is enabled for the socket, current strategy is to force all
payload data to be outside of headroom, in PAGE fragments.

This strategy is not welcome for small packets, wasting memory.

Experiments show that best results are obtained when using 2048 bytes
for skb head (This includes the skb overhead and various headers)

This patch provides better len/truesize ratios for packets sent to
loopback device, and reduce memory needs for in-flight loopback packets,
particularly on arches with big pages.

If a sender sends many 1-byte packets to an unresponsive application,
receiver rmem_alloc will grow faster and will stop queuing these packets
sooner, or will collapse its receive queue to free excess memory.

netperf -t TCP_RR results are improved by ~4 %, and many workloads are
improved as well (tbench, mysql...)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>