Programming Languages Research Group: Git - firefly-linux-kernel-4.4.55.git/log

drm/nvc0/ltcg: mask off intr 0x10

NVIDIA do that at startup too on Fermi, so perhaps the heap of 0x10
intrs we receive are normal and we can ignore them.

On Kepler NVIDIA *don't* do this, but the hardware appears to come up
with the bit masked off by default - so that's probably why :)

This should silence some interrupt spam seen on Fermi+ boards.

Backported patch from reworked nouveau kernel tree.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: silence a debug message triggered by newer userspace

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

netfilter: xt_limit: have r->cost != 0 case work

Commit v2.6.19-rc1~1272^2~41 tells us that r->cost != 0 can happen when
a running state is saved to userspace and then reinstated from there.

Make sure that private xt_limit area is initialized with correct values.
Otherwise, random matchings due to use of uninitialized memory.

Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Merge git://git./linux/kernel/git/davem/net

Pull more networking fixes from David Miller:

1) Eric Dumazet discovered and fixed what turned out to be a family of
    bugs.  These functions were using pskb_may_pull() which might need
    to reallocate the linear SKB data buffer, but the callers were not
    expecting this possibility.  The callers have cached pointers to the
    packet header areas, and would need to reload them if we were to
    continue using pskb_may_pull().

    So they could end up reading garbage.

    It's easier to just change these RAW4/RAW6/MIP6 routines to use
    skb_header_pointer() instead of pskb_may_pull(), which won't modify
    the linear SKB data area.

2) Dave Jone's syscall spammer caught a case where a non-TCP socket can
    call down into the TCP keepalive code.  The case basically involves
    creating a raw socket with sk_protocol == IPPROTO_TCP, then calling
    setsockopt(sock_fd, SO_KEEPALIVE, ...)

    Fixed by Eric Dumazet.

3) Bluetooth devices do not get configured properly while being powered
    on, resulting in always using legacy pairing instead of SSP.  Fix
    from Andrzej Kaczmarek.

4) Bluetooth cancels delayed work erroneously, put stricter checks in
    place.  From Andrei Emeltchenko.

5) Fix deadlock between cfg80211_mutex and reg_regdb_search_mutex in
    cfg80211, from Luis R.  Rodriguez.

6) Fix interrupt double release in iwlwifi, from Emmanuel Grumbach.

7) Missing module license in bcm87xx driver, from Peter Huewe.

8) Team driver can lose port changed events when adding devices to a
    team, fix from Jiri Pirko.

9) Fix endless loop when trying ot unregister PPPOE device in zombie
    state, from Xiaodong Xu.

10) batman-adv layer needs to set MAC address of software device
    earlier, otherwise we call tt_local_add with it uninitialized.

11) Fix handling of KSZ8021 PHYs, it's matched currently by KS8051 but
    that doesn't program the device properly.  From Marek Vasut.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  ipv6: mip6: fix mip6_mh_filter()
  ipv6: raw: fix icmpv6_filter()
  net: guard tcp_set_keepalive() to tcp sockets
  phy/micrel: Add missing header to micrel_phy.h
  phy/micrel: Rename KS80xx to KSZ80xx
  phy/micrel: Implement support for KSZ8021
  batman-adv: Fix symmetry check / route flapping in multi interface setups
  batman-adv: Fix change mac address of soft iface.
  pppoe: drop PPPOX_ZOMBIEs in pppoe_release
  team: send port changed when added
  ipv4: raw: fix icmp_filter()
  net/phy/bcm87xx: Add MODULE_LICENSE("GPL") to GPL driver
  iwlwifi: don't double free the interrupt in failure path
  cfg80211: fix possible circular lock on reg_regdb_search()
  Bluetooth: Fix not removing power_off delayed work
  Bluetooth: Fix freeing uninitialized delayed works
  Bluetooth: mgmt: Fix enabling LE while powered off
  Bluetooth: mgmt: Fix enabling SSP while powered off

ipv6: mip6: fix mip6_mh_filter()

mip6_mh_filter() should not modify its input, or else its caller
would need to recompute ipv6_hdr() if skb->head is reallocated.

Use skb_header_pointer() instead of pskb_may_pull()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge

Included fixes:
- fix the behaviour of batman-adv in case of virtual interface MAC change event
- fix symmetric link check in neighbour selection

Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: raw: fix icmpv6_filter()

icmpv6_filter() should not modify its input, or else its caller
would need to recompute ipv6_hdr() if skb->head is reallocated.

Use skb_header_pointer() instead of pskb_may_pull() and
change the prototype to make clear both sk and skb are const.

Also, if icmpv6 header cannot be found, do not deliver the packet,
as we do in IPv4.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge remote-tracking branch 'tip/core/rcu' into next.2012.09.25b

Resolved conflict in kernel/sched/core.c using Peter Zijlstra's
approach from https://lkml.org/lkml/2012/9/5/585.

Merge remote-tracking branch 'tip/smp/hotplug' into next.2012.09.25b

The conflicts between kernel/rcutree.h and kernel/rcutree_plugin.h
were due to adjacent insertions and deletions, which were resolved
by simply accepting the changes on both branches.

Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh

Pull SuperH fix from Paul Mundt:
"One last minute regression fix.."

* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh:
sh: pfc: Fix up GPIO mux type reconfig case.

Merge branch 'akpm' (sundry from Andrew)

Merge misc fixes from Andrew Morton:
"One maintainer change and three bugfixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (4 commits)
  c/r: prctl: fix build error for no-MMU case
  lib/flex_proportions.c: fix corruption of denominator in flexible proportions
  checksyscalls: fix "here document" handling
  pwm-backlight: take over maintenance

c/r: prctl: fix build error for no-MMU case

Commit 1ad75b9e1628 ("c/r: prctl: add minimal address test to
PR_SET_MM") added some address checking to prctl_set_mm() used by
checkpoint-restore.  This causes a build error for no-MMU systems:

   kernel/sys.c: In function 'prctl_set_mm':
   kernel/sys.c:1868:34: error: 'mmap_min_addr' undeclared (first use in this function)

The test for mmap_min_addr doesn't make a lot of sense for no-MMU code
as noted in commit 6e1415467614 ("NOMMU: Optimise away the
{dac_,}mmap_min_addr tests").

This patch defines mmap_min_addr as 0UL in the no-MMU case so that the
compiler will optimize away tests for "addr < mmap_min_addr".

Signed-off-by: Mark Salter <msalter@redhat.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: <stable@vger.kernel.org> [3.6.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lib/flex_proportions.c: fix corruption of denominator in flexible proportions

When racing with CPU hotplug, percpu_counter_sum() can return negative
values for the number of observed events.

This confuses fprop_new_period(), which uses unsigned type and as a
result number of events is set to big *positive* number.  From that
moment on, things go pear shaped and can result e.g.  in division by
zero as denominator is later truncated to 32-bits.

This bug causes a divide-by-zero oops in bdi_dirty_limit() in Borislav's
3.6.0-rc6 based kernel.

Fix the issue by using a signed type in fprop_new_period().  That makes
us bail out from the function without doing anything (mistakenly)
thinking there are no events to age.  That makes aging somewhat
inaccurate but getting accurate data would be rather hard.

Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Borislav Petkov <bp@amd64.org>
Reported-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

checksyscalls: fix "here document" handling

"echo" doesn't read from stdin, therefore the checksyscalls script didn't
warn about not implemented system calls anymore since 29dc54c6
("checksyscalls: Use arch/x86/syscalls/syscall_32.tbl as source").

Use "cat" instead of "echo" which handles this correctly.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

pwm-backlight: take over maintenance

Since the pwm-backlight driver is lacking a proper maintainer and is the
heaviest user of the PWM framework I'm taking over maintenance.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Acked-by: Arun Murthy <arun.murthy@stericsson.com>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Robert Morell <rmorell@nvidia.com>
Cc: Dilan Lee <dilee@nvidia.com>
Cc: Axel Lin <axel.lin@gmail.com>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

HID: lg4ff: Minor coding style fixes in lg4ff and hid-lg

Fixes a couple of minor coding style issues.

Signed-off-by: Michal Malý <madcatxster@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

HID: hid-lg4ff: Set absolute axes parametes on DFP

The lg4ff driver doesn't fill the "input_absinfo" struct so it is left
with default values. Applications with rely on information in this struct
therefore do not work correctly with the wheel.

Other Logitech wheels probably need this fix too, but again I do not have
enough information to write it.

Signed-off-by: Michal Malý <madcatxster@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

HID: hid-lg4ff: Adjust X axis input value accordingly to selected range.

Range limiting command for the Driving Force Pro wheel is only a FF_SPRING
effect so that the wheel creates resistance when the user tries to turn it past
the limit. It is however possible to overpower the FFB motors quite easily which
leads to the X axis value exceeding the expected limit. This confuses
games which dynamically adjust calibration using the highest/lowest min and max
values reported by the wheel. Joydev device driver also doesn't take in account
any changes in an axis range after the joystick device is created.

This patch recalculates received ABS_X axis value so it is always in
<0; 16383> range where 0 is the left limit and 16383 the right limit.
Logitech driver for Windows does the same thing. As for any concerns about
possible loss of precision, I compared a large set of raw/adjusted values
generated by "mult_frac" to values returned by the Windows driver and I got
a 100% match.

Other Logitech wheels will probably need a similar fix, but I currently lack
the information needed to write one.

Signed-off-by: Michal Malý <madcatxster@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

HID: hid-lg4ff: Minor code cleanup to improve readability

This patch replaces all occurrences of "report->field[0]->value[n]" with just
"value[n]" to get rid of the lengthy trains we have now.

Signed-off-by: Michal Malý <madcatxster@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

HID: ntrig: change default value of logical/physical width/height to 1

Since something will be divided by these variables in
show_min_width()/show_min_height() and show_activate_width()/
show_activate_height(), a divided error would be triggered if
they are zero.

Signed-off-by: Wen-chien Jesse Sung <jesse.sung@canonical.com>
Acked-by: Rafi Rubin <rafi@seas.upenn.edu>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

regulator: tps6586x: remove regulator-compatible from DT docs

Commit "regulator: deprecate regulator-compatible DT property" deprecated
the use of the regulator-compatible DT property. Update the DT example in
the TPS6586x binding documentation to reflect this.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>

regulator: tps65217.txt: remove regulator-compatible from DT docs

Commit "regulator: deprecate regulator-compatible DT property" deprecated
the use of the regulator-compatible DT property. Update the DT example in
the TPS65217 binding documentation to reflect this.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>

regulator: deprecate regulator-compatible DT property

When the bindings for the TPS6586x regulator were being proposed, I
asserted that DT node naming rules for bus child nodes should also be
applied to nodes inside the TPS6586x regulator node itself. In other
words, that each node providing regulator init data should be named
after the type of object it represented ("regulator") and hence that
some other property was required to indicate which regulator the node
described ("regulator-compatible"). In turn this led to multiple nodes
having the same name, thus requiring node names to use a unit address
to make them unique, thus requiring reg properties within the nodes and

However, subsequent discussion indicates that the rules I was asserting
only applies to standardized bus nodes, and within a device's own node,
the binding can basically do anything sane that it wants.

Hence, this change deprecates the register-compatible property, and
instead uses node names to replace this functionality. This greatly
simplifies the device tree content, making them smaller and more legible.

The code is changed such that old device trees continue to work.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>

sb_edac: Avoid overflow errors at memory size calculation

Sandy bridge EDAC is calculating the memory size with overflow.
Basically, the size field and the integer calculation is using 32 bits.
More bits are needed, when the DIMM memories have high density.

The net result is that memories are improperly reported there, when
high-density DIMMs are used:

EDAC DEBUG: in drivers/edac/sb_edac.c, line at 591: mc#0: channel 0, dimm 0, -16384 Mb (-4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800
EDAC DEBUG: in drivers/edac/sb_edac.c, line at 591: mc#0: channel 1, dimm 0, -16384 Mb (-4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800

As the number of pages value is handled at the EDAC core as unsigned
ints, the driver shows the 16 GB memories at sysfs interface as 16760832
MB! The fix is simple: calculate the number of pages as unsigned 64-bits
integer.

After the patch, the memory size (16 GB) is properly detected:

EDAC DEBUG: in drivers/edac/sb_edac.c, line at 592: mc#0: channel 0, dimm 0, 16384 Mb (4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800
EDAC DEBUG: in drivers/edac/sb_edac.c, line at 592: mc#0: channel 1, dimm 0, 16384 Mb (4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800

Cc: stable@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

i5000: Fix the memory size calculation with 2R memories

When 2R memories are found, the memory size should be multiplied
by two, otherwise, it will report half of the memory size:

       +-----------------------------------------------+
       |                      mc0                      |
       |        branch0        |        branch1        |
       | channel0  | channel1  | channel0  | channel1  |
-------+-----------------------------------------------+
slot3: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
slot2: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
-------+-----------------------------------------------+
slot1: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
slot0: |  1024 MB  |  1024 MB  |  1024 MB  |  1024 MB  |
-------+-----------------------------------------------+

(the above machine have 4 x 2GB 2R memories)

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

i3200_edac: Fix memory rank size

commit a895bf8b1e1ea4c032a8fa8a09475a2ce09fe77a incorrectly
changed the logic that fills the memory bank size. Fix it.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

iommu: static inline iommu group stub functions

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

Merge tag 'v3.6-rc7' into core/rcu

Merge Linux 3.6-rc7, to pick up fixes and to resolve a conflict in an
upcoming pull.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge branches 'bigrt.2012.09.23a', 'doctorture.2012.09.23a', 'fixes.2012.09.23a', 'hotplug.2012.09.23a' and 'idlechop.2012.09.23a' into HEAD

bigrt.2012.09.23a contains additional commits to reduce scheduling latency
from RCU on huge systems (many hundrends or thousands of CPUs).

doctorture.2012.09.23a contains documentation changes and rcutorture fixes.

fixes.2012.09.23a contains miscellaneous fixes.

hotplug.2012.09.23a contains CPU-hotplug-related changes.

idle.2012.09.23a fixes architectures for which RCU no longer considered
the idle loop to be a quiescent state due to earlier
adaptive-dynticks changes. Affected architectures are alpha,
cris, frv, h8300, m32r, m68k, mn10300, parisc, score, xtensa,
and ia64.

sh: pfc: Fix up GPIO mux type reconfig case.

Some drivers need to switch pin states between GPIO and pin function at
runtime, which was inadvertently broken in the pinctrl driver for GPIOs
being bound to a specific direction.

This fixes up the request path to ensure that previously configured GPIOs
don't cause us to inadvertently error out with an unsupported mux on
reconfig, which in practice is primarily aimed at trapping pull-up/down
users that have yet to be implemented under the new API.

Fixes up regressions in the TPU PWM driver, amongst others.

Reported-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Tested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>

Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless

John W. Linville says:

====================
Please pull this last(?) batch of fixes intended for 3.6...

For the Bluetooth bits, Gustavo says this:

"Here goes probably my last update to 3.6. It includes the two patches
you were ok last week(from Andrzej Kaczmarek), those are critical
ones, and two other fixes one for a system crash and the other for
a missing lockdep annotation."

The referenced fixes from Andrzej prevent attempts to configure devices
that are powered-off.

Along with the Bluetooth fixes, there are a couple of 802.11 fixes.
Emmanuel Grumbach gives us an iwlwifi fix to prevent releasing an
interrupt twice. Luis R. Rodriguez provides a fix for a possible
circular lock dependency in the cfg80211 regulatory enforcement code.

All of these have been in linux-next for a few days. I hope they are
not too late to make the 3.6 release!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git./linux/kernel/git/cmetcalf/linux-tile

Pull tile gxio ABI fix from Chris Metcalf:
"This fixes a last-minute change in the Tilera hypervisor ABI for TRIO
  (PCI root complex) support.  We've locked in this ABI going forward
  and will make sure no further ABI changes like this occur."

* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: gxio iorpc numbering change for TRIO interface

Merge tag 'vfio-for-linus' of git://github.com/awilliam/linux-vfio

Pull vfio fixes from Alex Williamson:
"VFIO doc update and virqfd race fix"

* tag 'vfio-for-linus' of git://github.com/awilliam/linux-vfio:
vfio: Fix virqfd release race
vfio: Trivial Documentation correction

Merge tag 'stable/for-linus-3.6-rc7-tag' of git://git./linux/kernel/git/konrad/xen

Pull a Xen fix from Konrad Rzeszutek Wilk:
"It is a bug-fix when we run the initial PV guest on a AMD K8 machine
  and have CONFIG_AMD_NUMA enabled and detect the NUMA topology from the
  Northbridge.

  We end up in the situation where the initial domain gets too much
  information and gets confused and crashes - the fix is to restrict the
  domain to get the information - and we do it by just disabling NUMA on
  the PV guest (the hypervisor is still able to do its proper NUMA
  allocations of guests).

  It is OK to disable the PV guest from accessing NUMA data as right now
  we do not inject any NUMA node information to the PV guests.  When we
  do get to that point, then this patch will have to be reverted."

* Disable PV NUMA support as we do not do anything with it (yet) and it
   can cause bootup crashes on certain AMD machines.

* tag 'stable/for-linus-3.6-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/boot: Disable NUMA for PV guests.

Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client

Pull two ceph fixes from Sage Weil:
"The first fixes a leak in the rbd setup error path, and the second
  fixes a more serious problem with mismatched kmap/kunmap that surfaced
  after the recent refactoring work."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  libceph: only kunmap kmapped pages
  rbd: drop dev reference on error in rbd_open()

HID: picoLCD: bounds check in dump_buff_as_hex()

Make sure we keep enough space for terminating NUL character after last
newline. If we have too much data, replace last byte with '.'s to
make overflow visible.

Using hex_dump_to_buffer() is not interesting as it adds more overhead
and does not append the trailing linefeed.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

net: guard tcp_set_keepalive() to tcp sockets

Its possible to use RAW sockets to get a crash in
tcp_set_keepalive() / sk_reset_timer()

Fix is to make sure socket is a SOCK_STREAM one.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

gpio-lpc32xx: Fix value handling of gpio_direction_output()

For GPIOs of gpio-lpc32xx, gpio_direction_output() ignores the value argument
(initial value of output). This patch fixes this by setting the level
accordingly.

Cc: stable@kernel.org
Signed-off-by: Roland Stigge <stigge@antcom.de>
Acked-by: Alexandre Pereira da Silva <aletes.xgr@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

phy/micrel: Add missing header to micrel_phy.h

The license header was missing in micrel_phy.h . This patch adds
one.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David J. Choi <david.choi@micrel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

phy/micrel: Rename KS80xx to KSZ80xx

There is no such part as KS8001, KS8041 or KS8051. There are only
KSZ8001, KSZ8041 and KSZ8051. Rename these parts as such to match
the Micrel naming.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David J. Choi <david.choi@micrel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Cc: Linux ARM kernel <linux-arm-kernel@lists.infradead.org>
Cc: Fabio Estevam <fabio.estevam@freescale.com>
Cc: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

phy/micrel: Implement support for KSZ8021

The KSZ8021 PHY was previously caught by KS8051, which is not correct.
This PHY needs additional setup if it is strapped for address 0. In such
case an reserved bit must be written in the 0x16, "Operation Mode Strap
Override" register. According to the KS8051 datasheet, that bit means
"PHY Address 0 in non-broadcast" and it indeed behaves as such on KSZ8021.
The issue where the ethernet controller (Freescale FEC) did not communicate
with network is fixed by writing this bit as 1.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David J. Choi <david.choi@micrel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tile: gxio iorpc numbering change for TRIO interface

An ABI numbering change was made in the hypervisor for Tilera's 4.1
MDE release (just shipped). It's incompatible with the previous 4.0
release ABI numbering, so we track the new numbering going forward.
We plan to avoid modifying ABI numbering for these interfaces again.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>

c6x: use asm-generic/barrier.h

A recent patch in the linux-next tree caused a build failure on
C6X because C6X didn't define a read_barrier_depends() macro. C6X
does not support SMP and the architecture doesn't provide any
special memory ordering instructions, so it makes sense to just
use the generic barrier.h rather than patching the existing c6x
specific header.

Signed-off-by: Mark Salter <msalter@redhat.com>

xen/boot: Disable NUMA for PV guests.

The hypervisor is in charge of allocating the proper "NUMA" memory
and dealing with the CPU scheduler to keep them bound to the proper
NUMA node. The PV guests (and PVHVM) have no inkling of where they
run and do not need to know that right now. In the future we will
need to inject NUMA configuration data (if a guest spans two or more
NUMA nodes) so that the kernel can make the right choices. But those
patches are not yet present.

In the meantime, disable the NUMA capability in the PV guest, which
also fixes a bootup issue. Andre says:

"we see Dom0 crashes due to the kernel detecting the NUMA topology not
by ACPI, but directly from the northbridge (CONFIG_AMD_NUMA).

This will detect the actual NUMA config of the physical machine, but
will crash about the mismatch with Dom0's virtual memory. Variation of
the theme: Dom0 sees what it's not supposed to see.

This happens with the said config option enabled and on a machine where
this scanning is still enabled (K8 and Fam10h, not Bulldozer class)

We have this dump then:
NUMA: Warning: node ids are out of bound, from=-1 to=-1 distance=10
Scanning NUMA topology in Northbridge 24
Number of physical nodes 4
Node 0 MemBase 0000000000000000 Limit 0000000040000000
Node 1 MemBase 0000000040000000 Limit 0000000138000000
Node 2 MemBase 0000000138000000 Limit 00000001f8000000
Node 3 MemBase 00000001f8000000 Limit 0000000238000000
Initmem setup node 0 0000000000000000-0000000040000000
  NODE_DATA [000000003ffd9000 - 000000003fffffff]
Initmem setup node 1 0000000040000000-0000000138000000
  NODE_DATA [0000000137fd9000 - 0000000137ffffff]
Initmem setup node 2 0000000138000000-00000001f8000000
  NODE_DATA [00000001f095e000 - 00000001f0984fff]
Initmem setup node 3 00000001f8000000-0000000238000000
Cannot find 159744 bytes in node 3
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81d220e6>] __alloc_bootmem_node+0x43/0x96
Pid: 0, comm: swapper Not tainted 3.3.6 #1 AMD Dinar/Dinar
RIP: e030:[<ffffffff81d220e6>]  [<ffffffff81d220e6>] __alloc_bootmem_node+0x43/0x96
.. snip..
  [<ffffffff81d23024>] sparse_early_usemaps_alloc_node+0x64/0x178
  [<ffffffff81d23348>] sparse_init+0xe4/0x25a
  [<ffffffff81d16840>] paging_init+0x13/0x22
  [<ffffffff81d07fbb>] setup_arch+0x9c6/0xa9b
  [<ffffffff81683954>] ? printk+0x3c/0x3e
  [<ffffffff81d01a38>] start_kernel+0xe5/0x468
  [<ffffffff81d012cf>] x86_64_start_reservations+0xba/0xc1
  [<ffffffff81007153>] ? xen_setup_runstate_info+0x2c/0x36
  [<ffffffff81d050ee>] xen_start_kernel+0x565/0x56c
"

so we just disable NUMA scanning by setting numa_off=1.

CC: stable@vger.kernel.org
Reported-and-Tested-by: Andre Przywara <andre.przywara@amd.com>
Acked-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

GFS2: Write out dirty inode metadata in delayed deletes

If a dirty GFS2 inode was being deleted but was in use by another node, its
metadata was not getting written out before GFS2 checked for dirty buffers in
gfs2_ail_flush(). GFS2 was relying on inode_go_sync() to write out the
metadata when the other node tried to free the file, but it failed the error
check before it got that far. This patch writes out the metadata before calling
gfs2_ail_flush()

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: fix s_writers.counter imbalance in gfs2_ail_empty_gl

gfs2_ail_empty_gl() contains an "inline version" of gfs2_trans_begin(),
so it needs an explicit sb_start_intwrite() as well, to balance the
sb_end_intwrite() which will be called by gfs2_trans_end().

With this, xfstest 068 passes on lock_nolock local gfs2.
Without it, we reach a writer count of -1 and get stuck.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Fix infinite loop in rbm_find

This patch fixes an infinite loop in gfs2_rbm_find that was introduced
by the previous patch. The problem occurred when the length was less
than 3 but the rbm block was byte-aligned, causing it to improperly
return a extent length of zero, which caused it to spin.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Tested-by: Bob Peterson <rpeterso@redhat.com>
Tested-by: Barry Marson <bmarson@redhat.com>

GFS2: Consolidate free block searching functions

With the recently added block reservation code, an additional function
was added to search for free blocks. This had a restriction of only being
able to search for aligned extents of free blocks. As a result the
allocation patterns when reserving blocks were suboptimal when the
existing allocation of blocks for an inode was not aligned to the same
boundary.

This patch resolves that problem by adding the ability for gfs2_rbm_find
to search for extents of a particular minimum size. We can then use
gfs2_rbm_find for both looking for reservations, and also looking for
free blocks on an individual basis when we actually come to do the
allocation later on. As a result we only need a single set of code
to deal with both situations.

The function gfs2_rbm_from_block() is moved up rgrp.c so that it
occurs before all of its callers.

Many thanks are due to Bob for helping track down the final issue in
this patch. That fix to the rb_tree traversal and to not share
block reservations from a dirctory to its children is included here.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

GFS2: Get rid of I_MUTEX_QUOTA usage

GFS2 uses i_mutex on its system quota inode to synchronize writes to
quota file. Since this is an internal inode to GFS2 (not part of directory
hiearchy or visible by user) we are safe to define locking rules for it. So
let's just get it its own locking class to make it clear.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Stop block extents at the end of bitmaps

This patch stops multiple block allocations if a nonzero
return code is received from gfs2_rbm_from_block. Without
this patch, if enough pressure is put on the file system,
you get a kernel warning quickly followed by:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffffa04f47e8>] gfs2_alloc_blocks+0x2c8/0x880 [gfs2]
With this patch, things run normally.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Fix unclaimed_blocks() wrapping bug and clean up

When rgd->rd_free_clone is less than rgd->rd_reserved, the
unclaimed_blocks() calculation would wrap and produce
incorrect results. This patch checks for this condition
when this function is called from gfs2_mblk_search()

In addition, the use of this particular function in other
places in the code has been dropped by means of a general
clean up of gfs2_inplace_reserve(). This function is now
much easier to follow.

Also the setting of the rgd->rd_last_alloc field is corrected.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Improve block reservation tracing

This patch improves the tracing of block reservations by
removing some corner cases and also providing more useful
detail in the traces.

A new field is added to the reservation structure to contain
the inode number. This is used since in certain contexts it is
not possible to access the inode itself to obtain this information.
As a result we can then display the inode number for all tracepoints
and also in case we dump the resource group.

The "del" tracepoint operation has been removed. This could be called
with the reservation rgrp set to NULL. That resulted in not printing
the device number, and thus making the information largely useless
anyway. Also, the conditional on the rgrp being NULL can then be
removed from the tracepoint. After this change, all the block
reservation tracepoint calls will be called with the rgrp information.

The existing ins,clm and tdel calls to the block reservation tracepoint
are sufficient to track the entire life of the block reservation.

In gfs2_block_alloc() the error detection is updated to print out
the inode number of the problematic inode. This can then be compared
against the information in the glock dump,tracepoints, etc.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Fall back to ignoring reservations, if there are no other blocks left

When we get to the stage of allocating blocks, we know that the
resource group in question must contain enough free blocks, otherwise
gfs2_inplace_reserve() would have failed. So if we are left with only
free blocks which are reserved, then we must use those. This can happen
if another node has sneeked in and use some blocks reserved on this
node, for example. Generally this will happen very rarely and only
when the resouce group is nearly full.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Fix ->show_options() for statfs slow

The ->show_options() function for GFS2 was not correctly displaying
the value when statfs slow in in use.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reported-by: Milos Jakubicek <xjakub@fi.muni.cz>

GFS2: Use rbm for gfs2_setbit()

Use the rbm structure for gfs2_setbit() in order to simplify the
arguments to the function. We have to add a bool to control whether
the clone bitmap should be updated (if it exists) but otherwise it
is a more or less direct substitution.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Use rbm for gfs2_testbit()

Change the arguments to gfs2_testbit() so that it now just takes an
rbm specifying the position of the two bit entry to return.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Eliminate unnecessary check for state > 3 in bitfit

Function gfs2_bitfit was checking for state > 3, but that's
impossible since it is only called from rgblk_search, which receives
only GFS2_BLKST_ constants.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Eliminate redundant calls to may_grant

Function add_to_queue was checking may_grant for the passed-in
holder for every iteration of its gh2 loop. Now it only checks it
once at the beginning to see if a try lock is futile.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Combine functions gfs2_glock_dq_wait and wait_on_demote

Function gfs2_glock_dq_wait called two-line function wait_on_demote,
so they were combined.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Combine functions gfs2_glock_wait and wait_on_holder

Function gfs2_glock_wait only called function wait_on_holder and
returned its return code, so they were combined for readability.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: inline __gfs2_glock_schedule_for_reclaim

Since function gfs2_glock_schedule_for_reclaim is only two
significant lines, we can eliminate it, simplifying the code
and making it more readable.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: change function gfs2_direct_IO to use a normal gfs2_glock_dq

This patch changes function gfs2_direct_IO so that it uses a normal
call to gfs2_glock_dq rather than a call to a multiple-dq of one item.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: rbm code cleanup

This patch fixes a few small rbm related things. First, it fixes
a corner case where the rbm needs to switch bitmaps and wasn't
adjusting its buffer pointer. Second, there's a white space issue
fixed. Third, the logic in function gfs2_rbm_from_block was optimized
a bit. Lastly, a check for goal block overflows was added to function
gfs2_alloc_blocks.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Fix case where reservation finished at end of rgrp

One corner case which the original patch failed to take into
account was when there is a reservation which ended such that
the following block was one beyond the end of the rgrp in
question. This extra test fixes that case.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reported-by: Bob Peterson <rpeterso@redhat.com>
Tested-by: Bob Peterson <rpeterso@redhat.com>

GFS2: Use RB_CLEAR_NODE() rather than rb_init_node()

gfs2 calls RB_EMPTY_NODE() to check if nodes are not on an rbtree.
The corresponding initialization function is RB_CLEAR_NODE().
rb_init_node() was never clearly defined and is going away.

Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Update rgblk_free() to use rbm

Replace open coded version with a call to gfs2_rbm_from_block()

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Update gfs2_get_block_type() to use rbm

Use the new gfs2_rbm_from_block() function to replace an open
coded version of the same code.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Replace rgblk_search with gfs2_rbm_find

This is part of a series of patches which are introducing the
gfs2_rbm structure throughout the block allocation code. The
main aim of this part is to create a search function which can
deal directly with struct gfs2_rbm. In this case it specifies
the initial position at which to start the search and also the
point at which the search terminates.

The net result of this is to clean up the search code and make
it rather more readable, and the various possible exceptions which
may occur during the search are partitioned into their own functions.

There are some bug fixes too. We should not be checking the reservations
while allocating extents - the time for that is when we are searching
for where to put the extent, not when we've already made that decision.

Also, rgblk_search had two uses, and in only one of those cases did
it make sense to check for reservations. This is fixed in the new
gfs2_rbm_find function, which has a cleaner interface.

The reservation checking has been improved by always checking for
contiguous reservations, and returning the first free block after
all contiguous reservations. This is done under the spin lock to
ensure consistancy of the tree.

The allocation of extents is now in all cases done by the existing
allocation code, and if there is an active reservation, that is updated
after the fact. Again this is done under the spin lock, since it entails
changing the lookup key for the reservation in question.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Add structure to contain rgrp, bitmap, offset tuple

This patch introduces a new structure, gfs2_rbm, which is a
tuple of a resource group, a bitmap within the resource group
and an offset within that bitmap. This is designed to make
manipulating these sets of variables easier. There is also a
new helper function which converts this representation back
to a disk block address.

In addition, the rbtree nodes which are used for the reservations
were not being correctly initialised, which is now fixed. Also,
the tracing was not passing through the inode where it should
have been. That is mostly fixed aside from one corner case. This
needs to be revisited since there can also be a NULL rgrp in
some cases which results in the device being incorrect in the
trace.

This is intended to be the first step towards cleaning up some
of the allocation code, and some further bug fixes.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Remove rs_requested field from reservations

The rs_requested field is left over from the original allocation
code, however this should have been a parameter passed to the
various functions from gfs2_inplace_reserve() and not a member of the
reservation structure as the value is not required after the
initial allocation.

This also helps simplify the code since we no longer need to set
the rs_requested to zero. Also the gfs2_inplace_release()
function can also be simplified since the reservation structure
will always be defined when it is called, and the only remaining
task is to unlock the rgrp if required. It can also now be
called unconditionally too, resulting in a further simplification.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

GFS2: Merge two nearly identical xattr functions

There were two functions in the xattr code which were nearly
identical, the only difference being that one was copy data into
the unstuffed xattrs and the other was copying data out from it.

This patch merges the two functions such that the code which deal
with iteration over the unstuffed xattrs is no longer duplicated.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

ARM: dma-mapping: Fix potential memory leak in atomic_pool_init()

When either of __alloc_from_contiguous or __alloc_remap_buffer fails
to provide a valid pointer, allocated memory is freed up and an error
is returned. 'pages' was however not freed before returning error.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>

md/raid5: add missing spin_lock_init.

commit b17459c05000fdbe8d10946570a26510f86ec0f
raid5: add a per-stripe lock

added a spin_lock to the 'stripe_head' struct.
Unfortunately there are two places where this struct is allocated
but the spin lock was only initialised in one of them.

So add the missing spin_lock_init.

Signed-off-by: NeilBrown <neilb@suse.de>

Linux 3.6-rc7

Merge branch 'rc-fixes' of git://git./linux/kernel/git/mmarek/kbuild

Pull kbuild fixes from Michal Marek:
"There are two more kbuild fixes for 3.6.

  One fixes a race between x86's archscripts target and the rule
  (re)building scripts/basic/fixdep.  The second is a fix for the
  previous attempt at fixing make firmware_install with make 3.82.
  This new solution should work with any version of GNU make"

* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  x86/kbuild: archscripts depends on scripts_basic
  firmware: fix directory creation rule matching with make 3.80

Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/jdelvare/staging

Pull hwmon subsystem fixes from Jean Delvare.

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  hwmon: (fam15h_power) Tweak runavg_range on resume
  hwmon: (coretemp) Use get_online_cpus to avoid races involving CPU hotplug
  hwmon: (via-cputemp) Use get_online_cpus to avoid races involving CPU hotplug

Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"This is a set of four essential fixes: two oops related (bnx2i,
  virtio-scsi), one data corruption related (hpsa) and one failure to
  boot due to interrupt routing issues (mpt2ss).

Signed-off-by: James Bottomley <JBottomley@Parallels.com>"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  [SCSI] hpsa: fix handling of protocol error
  [SCSI] mpt2sas: Fix for issue - Unable to boot from the drive connected to HBA
  [SCSI] bnx2i: Fixed NULL ptr deference for 1G bnx2 Linux iSCSI offload
  [SCSI] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

edac_mc: edac_mc_free() cannot assume mem_ctl_info is registered in sysfs.

Fix potential NULL pointer dereference in edac_unregister_sysfs() on
system boot introduced in 3.6-rc1.

Since commit 7a623c039 ("edac: rewrite the sysfs code to use struct
device") edac_mc_alloc() no longer initializes embedded kobjects in
struct mem_ctl_info.  Therefore edac_mc_free() can no longer simply
decrement a kobject reference count to free the allocated memory unless
the memory controller driver module had also called edac_mc_add_mc().

Now edac_mc_free() will check if the newly embedded struct device has
been registered with sysfs before using either the standard device
release functions or freeing the data structures itself with logic
pulled out of the error path of edac_mc_alloc().

The BUG this patch resolves for me:

  BUG: unable to handle kernel NULL pointer dereference at   (null)
  EIP is at __wake_up_common+0x1a/0x6a
  Process modprobe (pid: 933, ti=f3dc6000 task=f3db9520 task.ti=f3dc6000)
  Call Trace:
    complete_all+0x3f/0x50
    device_pm_remove+0x23/0xa2
    device_del+0x34/0x142
    edac_unregister_sysfs+0x3b/0x5c [edac_core]
    edac_mc_free+0x29/0x2f [edac_core]
    e7xxx_probe1+0x268/0x311 [e7xxx_edac]
    e7xxx_init_one+0x56/0x61 [e7xxx_edac]
    local_pci_probe+0x13/0x15
  ...

Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

edac_mc: fix messy kfree calls in the error path

coccinelle warns about:

+ drivers/edac/edac_mc.c:429:9-23: ERROR: reference preceded by free on line 429

   421         if (mci->csrows) {
> 422                 for (chn = 0; chn < tot_channels; chn++) {
   423                         csr = mci->csrows[chn];
   424                         if (csr) {
> 425                                 for (chn = 0; chn < tot_channels; chn++)
   426                                          kfree(csr->channels[chn]);
   427                                  kfree(csr);
   428                          }
> 429                          kfree(mci->csrows[i]);
   430                  }
   431                  kfree(mci->csrows);
   432          }

and that code block seem to mess things up in several ways (double free, memory
leak, out-of-bound reads etc.):

L422: The iterator "chn" and bound "tot_channels" are totally wrong. Should be
      "row" and "tot_csrows" respectively. Which means either memory leak, or
      out-of-bound reads (which if does not trigger an immediate page fault
      error, will further lead to kfree() on random addresses).

L425: The inner loop is reusing the same iterator "chn" as the outer loop,
      which could lead to premature end of the outer loop, and hence memory leak.

L429: The array index 'i' in mci->csrows[i] is a temporary value used in
      previous loops, and won't change at all in the current loop. Which
      means either out-of-bound read and possibly kfree(random number), or the
      same mci->csrows[i] get freed once and again, and possibly double free
      for the kfree(csr) in L427.

L426/L427: a kfree(csr->channels) is needed in between to avoid leaking the memory.

The buggy code was introduced by commit de3910eb ("edac: change the mem
allocation scheme to make Documentation/kobject.txt happy") in the 3.6-rc1
merge window. Fix it by freeing up resources in this order:

  free csrows[i]->channels[j]
  free csrows[i]->channels
  free csrows[i]
  free csrows

CC: Mauro Carvalho Chehab <mchehab@redhat.com>
CC: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

batman-adv: Fix symmetry check / route flapping in multi interface setups

If receiving an OGM from a neighbor other than the currently selected
and if it has the same TQ then we are supposed to switch if this
neighbor provides a more symmetric link than the currently selected one.

However this symmetry check currently is broken if the interface of the
neighbor we received the OGM from and the one of the currently selected
neighbor differ: We are currently trying to determine the symmetry of the
link towards the selected router via the link we received the OGM from
instead of just checking via the link towards the currently selected
router.

This leads to way more route switches than necessary and can lead to
permanent route flapping in many common multi interface setups.

This patch fixes this issue by using the right interface for this
symmetry check.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>

batman-adv: Fix change mac address of soft iface.

Into function interface_set_mac_addr, the function tt_local_add was
invoked before updating dev->dev_addr. The new MAC address was not
tagged as NoPurge.

Signed-off-by: Def <def@laposte.net>

hwmon: (fam15h_power) Tweak runavg_range on resume

The quirk introduced with commit
00250ec90963b7ef6678438888f3244985ecde14 (hwmon: fam15h_power: fix
bogus values with current BIOSes) is not only required during driver
load but also when system resumes from suspend. The BIOS might set the
previously recommended (but unsuitable) initilization value for the
running average range register during resume.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Tested-by: Andreas Hartmann <andihartmann@01019freenet.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@vger.kernel.org # 3.0+

hwmon: (coretemp) Use get_online_cpus to avoid races involving CPU hotplug

coretemp_init loops with for_each_online_cpu, adding platform_devices
and sysfs interfaces, then calls register_hotcpu_notifier.  There is a
race if a CPU is offlined or onlined after the loop, but before
register_hotcpu_notifier.  The race might result in the absence of a
platform_device+sysfs interface for an online CPU, or the presence of
a platform_device+sysfs interface for an offline CPU.  A similar race
occurs during coretemp_exit, after the module calls
unregister_hotcpu_notifier, but before it unregisters all devices, a
CPU might offline and a device for an offline CPU will exist for a
short while.

This fix surrounds for_each_online_cpu and register_hotcpu_notifier
with get_online_cpus+put_online_cpus; and surrounds
unregister_hotcpu_notifier and device unregistering with
get_online_cpus+put_online_cpus.

Build tested.

Signed-off-by: Silas Boyd-Wickizer <sbw@mit.edu>
Signed-off-by: Jean Delvare <khali@linux-fr.org>

hwmon: (via-cputemp) Use get_online_cpus to avoid races involving CPU hotplug

via_cputemp_init loops with for_each_online_cpu, adding
platform_devices, then calls register_hotcpu_notifier. If a CPU is
offlined between the loop and register_hotcpu_notifier, then later
onlined, via_cputemp_device_add will attempt to add platform devices
with the same ID. A similar race occurs during via_cputemp_exit,
after the module calls unregister_hotcpu_notifier, a CPU might offline
and a device will exist for a CPU that is offline.

This fix surrounds for_each_online_cpu and register_hotcpu_notifier
with get_online_cpus+put_online_cpus; and surrounds
unregister_hotcpu_notifier and device unregistering with
get_online_cpus+put_online_cpus.

Build tested.

Signed-off-by: Silas Boyd-Wickizer <sbw@mit.edu>
Acked-by: Harald Welte <laforge@gnumonks.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>

ia64: Add missing RCU idle APIs on idle loop

Traditionally, the entire idle task served as an RCU quiescent state.
But when RCU read side critical sections started appearing within the
idle loop, this traditional strategy became untenable. The fix was to
create new RCU APIs named rcu_idle_enter() and rcu_idle_exit(), which
must be called by each architecture's idle loop so that RCU can tell
when it is safe to ignore a given idle CPU.

Unfortunately, this fix was never applied to ia64, a shortcoming remedied
by this commit.

Reported by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

xtensa: Add missing RCU idle APIs on idle loop

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the xtensa's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

score: Add missing RCU idle APIs on idle loop

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in scores's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

parisc: Add missing RCU idle APIs on idle loop

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the parisc's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Parisc <linux-parisc@vger.kernel.org>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

mn10300: Add missing RCU idle APIs on idle loop

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the mn10300's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

m68k: Add missing RCU idle APIs on idle loop

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the m68k's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: m68k <linux-m68k@lists.linux-m68k.org>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

m32r: Add missing RCU idle APIs on idle loop

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the m32r's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

h8300: Add missing RCU idle APIs on idle loop

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the h8300's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

frv: Add missing RCU idle APIs on idle loop

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the Frv's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@vger.kernel.org> # 3.3+
Acked-by: David Howells <dhowells@redhat.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

cris: Add missing RCU idle APIs on idle loop

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the Cris's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Cris <linux-cris-kernel@axis.com>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

alpha: Add missing RCU idle APIs on idle loop

In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the Alpha's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Michael Cree <mcree@orcon.net.nz>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: alpha <linux-alpha@vger.kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> # 3.3+
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

alpha: Fix preemption handling in idle loop

cpu_idle() is called on the boot CPU by the init code with
preemption disabled. But the cpu_idle() function in alpha
doesn't handle this when it calls schedule() directly.

Fix it by converting it into schedule_preempt_disabled().

Also disable preemption before calling cpu_idle() from
secondary CPU entry code to stay consistent with this
state.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Michael Cree <mcree@orcon.net.nz>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: alpha <linux-alpha@vger.kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

Use get_online_cpus to avoid races involving CPU hotplug

If arch/x86/kernel/cpuid.c is a module, a CPU might offline or online
between the for_each_online_cpu() loop and the call to
register_hotcpu_notifier in cpuid_init or the call to
unregister_hotcpu_notifier in cpuid_exit.  The potential races can
lead to leaks/duplicates, attempts to destroy non-existant devices, or
random pointer dereferences.

For example, in cpuid_exit if:

        for_each_online_cpu(cpu)
                cpuid_device_destroy(cpu);
        class_destroy(cpuid_class);
        __unregister_chrdev(CPUID_MAJOR, 0, NR_CPUS, "cpu/cpuid");
        <----- CPU onlines
        unregister_hotcpu_notifier(&cpuid_class_cpu_notifier);

the hotcpu notifier will attempt to create a device for the
cpuid_class, which the module already destroyed.

This fix surrounds for_each_online_cpu and register_hotcpu_notifier or
unregister_hotcpu_notifier with get_online_cpus+put_online_cpus.

Tested on a VM.

Signed-off-by: Silas Boyd-Wickizer <sbw@mit.edu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Use get_online_cpus to avoid races involving CPU hotplug

If arch/x86/kernel/msr.c is a module, a CPU might offline or online
between the for_each_online_cpu(i) loop and the call to
register_hotcpu_notifier in msr_init or the call to
unregister_hotcpu_notifier in msr_exit. The potential races can lead
to leaks/duplicates, attempts to destroy non-existant devices, or
random pointer dereferences.

For example, in msr_init if:

        for_each_online_cpu(i) {
                err = msr_device_create(i);
                if (err != 0)
                        goto out_class;
        }
        <----- CPU offlines
        register_hotcpu_notifier(&msr_class_cpu_notifier);

and the CPU never onlines before msr_exit, then the module will never
call msr_device_destroy for the associated CPU.

This fix surrounds for_each_online_cpu and register_hotcpu_notifier or
unregister_hotcpu_notifier with get_online_cpus+put_online_cpus.

Tested on a VM.

Signed-off-by: Silas Boyd-Wickizer <sbw@mit.edu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

sched: Fix load avg vs cpu-hotplug

Rabik and Paul reported two different issues related to the same few
lines of code.

Rabik's issue is that the nr_uninterruptible migration code is wrong in
that he sees artifacts due to this (Rabik please do expand in more
detail).

Paul's issue is that this code as it stands relies on us using
stop_machine() for unplug, we all would like to remove this assumption
so that eventually we can remove this stop_machine() usage altogether.

The only reason we'd have to migrate nr_uninterruptible is so that we
could use for_each_online_cpu() loops in favour of
for_each_possible_cpu() loops, however since nr_uninterruptible() is the
only such loop and its using possible lets not bother at all.

The problem Rabik sees is (probably) caused by the fact that by
migrating nr_uninterruptible we screw rq->calc_load_active for both rqs
involved.

So don't bother with fancy migration schemes (meaning we now have to
keep using for_each_possible_cpu()) and instead fold any nr_active delta
after we migrate all tasks away to make sure we don't have any skewed
nr_active accounting.

[ paulmck: Move call to calc_load_migration to CPU_DEAD to avoid
miscounting noted by Rakib. ]

Reported-by: Rakib Mullick <rakib.mullick@gmail.com>
Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>

rcu: Disallow callback registry on offline CPUs

Posting a callback after the CPU_DEAD notifier effectively leaks
that callback unless/until that CPU comes back online. Silence is
unhelpful when attempting to track down such leaks, so this commit emits
a WARN_ON_ONCE() and unconditionally leaks the callback when an offline
CPU attempts to register a callback. The rdp->nxttail[RCU_NEXT_TAIL] is
set to NULL in the CPU_DEAD notifier and restored in the CPU_UP_PREPARE
notifier, allowing _call_rcu() to determine exactly when posting callbacks
is illegal.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>