firefly-linux-kernel-4.4.55.git
8 years agoARM: OMAP3: hwmod data: Add sysc information for DSI
Sebastian Reichel [Fri, 24 Jun 2016 01:59:33 +0000 (03:59 +0200)]
ARM: OMAP3: hwmod data: Add sysc information for DSI

commit b46211d6dcfb81a8af66b8684a42d629183670d4 upstream.

Add missing sysconfig/sysstatus information
to OMAP3 hwmod. The information has been
checked against OMAP34xx and OMAP36xx TRM.

Without this change DSI block is not reset
during boot, which is required for working
Nokia N950 display.

Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoARM: kirkwood: ib62x0: fix size of u-boot environment partition
Simon Baatz [Fri, 12 Aug 2016 17:12:50 +0000 (19:12 +0200)]
ARM: kirkwood: ib62x0: fix size of u-boot environment partition

commit a778937888867aac17a33887d1c429120790fbc2 upstream.

Commit 148c274ea644 ("ARM: kirkwood: ib62x0: add u-boot environment
partition") split the "u-boot" partition into "u-boot" and "u-boot
environment".  However, instead of the size of the environment, an offset
was given, resulting in overlapping partitions.

Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Fixes: 148c274ea644 ("ARM: kirkwood: ib62x0: add u-boot environment partition")
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Gregory Clement <gregory.clement@free-electrons.com>
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Cc: Luka Perkov <luka@openwrt.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoARM: imx6: add missing BM_CLPCR_BYPASS_PMIC_READY setting for imx6sx
Anson Huang [Mon, 22 Aug 2016 15:53:25 +0000 (23:53 +0800)]
ARM: imx6: add missing BM_CLPCR_BYPASS_PMIC_READY setting for imx6sx

commit 8aade778f787305fdbfd3c1d54e6b583601b5902 upstream.

i.MX6SX has bypass PMIC ready function, as this function
is normally NOT enabled on the board design, so we need
to bypass the PMIC ready pin check during DSM mode resume
flow, otherwise, the internal DSM resume logic will be
waiting for this signal to be ready forever and cause
resume fail.

Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Fixes: ff843d621bfc ("ARM: imx: add suspend support for i.mx6sx")
Tested-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoARM: imx6: add missing BM_CLPCR_BYP_MMDC_CH0_LPM_HS setting for imx6ul
Peter Chen [Tue, 9 Aug 2016 08:24:43 +0000 (16:24 +0800)]
ARM: imx6: add missing BM_CLPCR_BYP_MMDC_CH0_LPM_HS setting for imx6ul

commit f5a49057c71433e35a4712ab8d8f00641b3e1ec0 upstream.

There is a missing BM_CLPCR_BYP_MMDC_CH0_LPM_HS setting for imx6ul,
without it, the "standby" mode can't work well, the system can't be
resumed.

With this commit, the "standby" mode works well.

Signed-off-by: Peter Chen <peter.chen@nxp.com>
Cc: Anson Huang <anson.huang@nxp.com>
Fixes: ee4a5f838c84 ("ARM: imx: add suspend/resume support for i.mx6ul")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoARM: AM43XX: hwmod: Fix RSTST register offset for pruss
Keerthy [Mon, 20 Jun 2016 03:52:25 +0000 (09:22 +0530)]
ARM: AM43XX: hwmod: Fix RSTST register offset for pruss

commit b00ccf5b684992829610d162e78a7836933a1b19 upstream.

pruss hwmod RSTST register wrongly points to PWRSTCTRL register in case of
am43xx. Fix the RSTST register offset value.

This can lead to setting of wrong power state values for PER domain.

Fixes: 1c7e224d ("ARM: OMAP2+: hwmod: AM335x: runtime register update")
Signed-off-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agocpuset: make sure new tasks conform to the current config of the cpuset
Zefan Li [Tue, 9 Aug 2016 03:25:01 +0000 (11:25 +0800)]
cpuset: make sure new tasks conform to the current config of the cpuset

commit 06f4e94898918bcad00cdd4d349313a439d6911e upstream.

A new task inherits cpus_allowed and mems_allowed masks from its parent,
but if someone changes cpuset's config by writing to cpuset.cpus/cpuset.mems
before this new task is inserted into the cgroup's task list, the new task
won't be updated accordingly.

Signed-off-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonet: thunderx: Fix OOPs with ethtool --register-dump
David Daney [Tue, 16 Aug 2016 20:30:36 +0000 (13:30 -0700)]
net: thunderx: Fix OOPs with ethtool --register-dump

commit 1423661fed2c40d6d71b5e2e3aa390f85157f9d5 upstream.

The ethtool_ops .get_regs function attempts to read the nonexistent
register NIC_QSET_SQ_0_7_CNM_CHG, which produces a "bus error" type
OOPs.

Fix by not attempting to read, and removing the definition of,
NIC_QSET_SQ_0_7_CNM_CHG.  A zero is written into the register dump to
keep the layout unchanged.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoUSB: change bInterval default to 10 ms
Alan Stern [Fri, 16 Sep 2016 14:24:26 +0000 (10:24 -0400)]
USB: change bInterval default to 10 ms

commit 08c5cd37480f59ea39682f4585d92269be6b1424 upstream.

Some full-speed mceusb infrared transceivers contain invalid endpoint
descriptors for their interrupt endpoints, with bInterval set to 0.
In the past they have worked out okay with the mceusb driver, because
the driver sets the bInterval field in the descriptor to 1,
overwriting whatever value may have been there before.  However, this
approach was never sanctioned by the USB core, and in fact it does not
work with xHCI controllers, because they use the bInterval value that
was present when the configuration was installed.

Currently usbcore uses 32 ms as the default interval if the value in
the endpoint descriptor is invalid.  It turns out that these IR
transceivers don't work properly unless the interval is set to 10 ms
or below.  To work around this mceusb problem, this patch changes the
endpoint-descriptor parsing routine, making the default interval value
be 10 ms rather than 32 ms.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Wade Berrier <wberrier@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoARM: dts: STiH410: Handle interconnect clock required by EHCI/OHCI (USB)
Lee Jones [Thu, 8 Sep 2016 09:11:00 +0000 (11:11 +0200)]
ARM: dts: STiH410: Handle interconnect clock required by EHCI/OHCI (USB)

commit 7e9d2850a8db4e0d85a20bb692198bf2cc4be3b7 upstream.

The STiH4{07,10} platform contains some interconnect clocks which are used
by various IPs.  If this clock isn't handled correctly by ST's EHCI/OHCI
drivers, their hub won't be found, the following error be shown and the
result will be non-working USB:

  [   97.221963] hub 2-1:1.0: hub_ext_port_status failed (err = -110)

Tested-by: Peter Griffin <peter.griffin@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Patrice Chotard <patrice.chotard@st.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agousb: chipidea: udc: fix NULL ptr dereference in isr_setup_status_phase
Clemens Gruber [Mon, 5 Sep 2016 17:29:58 +0000 (19:29 +0200)]
usb: chipidea: udc: fix NULL ptr dereference in isr_setup_status_phase

commit 6f3c4fb6d05e63c9c6d8968302491c3a5457be61 upstream.

Problems with the signal integrity of the high speed USB data lines or
noise on reference ground lines can cause the i.MX6 USB controller to
violate USB specs and exhibit unexpected behavior.

It was observed that USBi_UI interrupts were triggered first and when
isr_setup_status_phase was called, ci->status was NULL, which lead to a
NULL pointer dereference kernel panic.

This patch fixes the kernel panic, emits a warning once and returns
-EPIPE to halt the device and let the host get stalled.
It also adds a comment to point people, who are experiencing this issue,
to their USB hardware design.

Signed-off-by: Clemens Gruber <clemens.gruber@pqgruber.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agousb: renesas_usbhs: fix clearing the {BRDY,BEMP}STS condition
Yoshihiro Shimoda [Mon, 29 Aug 2016 09:00:38 +0000 (18:00 +0900)]
usb: renesas_usbhs: fix clearing the {BRDY,BEMP}STS condition

commit 519d8bd4b5d3d82c413eac5bb42b106bb4b9ec15 upstream.

The previous driver is possible to stop the transfer wrongly.
For example:
 1) An interrupt happens, but not BRDY interruption.
 2) Read INTSTS0. And than state->intsts0 is not set to BRDY.
 3) BRDY is set to 1 here.
 4) Read BRDYSTS.
 5) Clear the BRDYSTS. And then. the BRDY is cleared wrongly.

Remarks:
 - The INTSTS0.BRDY is read only.
  - If any bits of BRDYSTS are set to 1, the BRDY is set to 1.
  - If BRDYSTS is 0, the BRDY is set to 0.

So, this patch adds condition to avoid such situation. (And about
NRDYSTS, this is not used for now. But, avoiding any side effects,
this patch doesn't touch it.)

Fixes: d5c6a1e024dd ("usb: renesas_usbhs: fixup interrupt status clear method")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoUSB: serial: simple: add support for another Infineon flashloader
Daniele Palmas [Fri, 2 Sep 2016 08:37:56 +0000 (10:37 +0200)]
USB: serial: simple: add support for another Infineon flashloader

commit f190fd92458da3e869b4e2c6289e2c617490ae53 upstream.

This patch adds support for Infineon flashloader 0x8087/0x0801.

The flashloader is used in Telit LE940B modem family with Telit
flashing application.

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoserial: 8250: added acces i/o products quad and octal serial cards
Jimi Damon [Thu, 21 Jul 2016 00:00:40 +0000 (17:00 -0700)]
serial: 8250: added acces i/o products quad and octal serial cards

commit c8d192428f52f244130b84650ad616df09f2b1e1 upstream.

Added devices ids for acces i/o products quad and octal serial cards
that make use of existing Pericom PI7C9X7954 and PI7C9X7958
configurations .

Signed-off-by: Jimi Damon <jdamon@accesio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoserial: 8250_mid: fix divide error bug if baud rate is 0
Andy Shevchenko [Fri, 1 Jul 2016 14:21:49 +0000 (17:21 +0300)]
serial: 8250_mid: fix divide error bug if baud rate is 0

commit 47b34d2ef266e2c283b514d65c8963c2ccd42474 upstream.

Since the commit c1a67b48f6a5 ("serial: 8250_pci: replace switch-case by
formula for Intel MID"), the 8250 driver crashes in the byt_set_termios()
function with a divide error. This is caused by the fact that a baud rate of 0
(B0) is not handled properly. Fix it by falling back to B9600 in this case.

Reported-by: "Mendez Salinas, Fernando" <fernando.mendez.salinas@intel.com>
Fixes: c1a67b48f6a5 ("serial: 8250_pci: replace switch-case by formula for Intel MID")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoiio: ensure ret is initialized to zero before entering do loop
Colin Ian King [Mon, 5 Sep 2016 14:39:06 +0000 (15:39 +0100)]
iio: ensure ret is initialized to zero before entering do loop

commit 5dba4b14bafe801083d01e1f400816df7e5a8f2e upstream.

A recent fix to iio_buffer_read_first_n_outer removed ret from being set by
a return from wait_event_interruptible and also added a continue in a loop
which causes the variable ret to not be set when it reaches the end of the
loop.  Fix this by initializing ret to zero.

Also remove extraneous white space at the end of the loop.

Fixes: fcf68f3c0bb2a5 ("fix sched WARNING "do not call blocking ops when !TASK_RUNNING")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoiio:core: fix IIO_VAL_FRACTIONAL sign handling
Gregor Boirie [Fri, 2 Sep 2016 18:27:46 +0000 (20:27 +0200)]
iio:core: fix IIO_VAL_FRACTIONAL sign handling

commit 171c0091837c81ed5c949fec6966bb5afff2d1cf upstream.

7985e7c100 ("iio: Introduce a new fractional value type") introduced a
new IIO_VAL_FRACTIONAL value type meant to represent rational type numbers
expressed by a numerator and denominator combination.

Formating of IIO_VAL_FRACTIONAL values relies upon do_div() usage. This
fails handling negative values properly since parameters are reevaluated
as unsigned values.
Fix this by using div_s64_rem() instead. Computed integer part will carry
properly signed value. Formatted fractional part will always be positive.

Fixes: 7985e7c100 ("iio: Introduce a new fractional value type")
Signed-off-by: Gregor Boirie <gregor.boirie@parrot.com>
Reviewed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoiio: accel: kxsd9: Fix scaling bug
Linus Walleij [Thu, 1 Sep 2016 09:44:35 +0000 (11:44 +0200)]
iio: accel: kxsd9: Fix scaling bug

commit 307fe9dd11ae44d4f8881ee449a7cbac36e1f5de upstream.

All the scaling of the KXSD9 involves multiplication with a
fraction number < 1.

However the scaling value returned from IIO_INFO_SCALE was
unpredictable as only the micros of the value was assigned, and
not the integer part, resulting in scaling like this:

$cat in_accel_scale
-1057462640.011978

Fix this by assigning zero to the integer part.

Tested-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoiio: fix pressure data output unit in hid-sensor-attributes
Kweh, Hock Leong [Mon, 29 Aug 2016 10:50:56 +0000 (18:50 +0800)]
iio: fix pressure data output unit in hid-sensor-attributes

commit 36afb176d3c9580651d7f410ed7f000ec48b5137 upstream.

According to IIO ABI definition, IIO_PRESSURE data output unit is
kilopascal:
http://lxr.free-electrons.com/source/Documentation/ABI/testing/sysfs-bus-iio

This patch fix output unit of HID pressure sensor IIO driver from pascal to
kilopascal to follow IIO ABI definition.

Signed-off-by: Kweh, Hock Leong <hock.leong.kweh@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoiio: accel: bmc150: reset chip at init time
Olof Johansson [Thu, 25 Aug 2016 16:45:33 +0000 (09:45 -0700)]
iio: accel: bmc150: reset chip at init time

commit 1c500840934a138bd6b13556c210516e9301fbee upstream.

In at least one known setup, the chip comes up in a state where reading
the chip ID returns garbage unless it's been reset, due to noise on the
wires during system boot.

All supported chips have the same reset method, and based on the
datasheets they all need 1.3 or 1.8ms to recover after reset. So, do
the conservative thing here and always reset the chip.

Signed-off-by: Olof Johansson <olof@lixom.net>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoiio: adc: at91: unbreak channel adc channel 3
Anders Darander [Mon, 8 Aug 2016 12:42:16 +0000 (14:42 +0200)]
iio: adc: at91: unbreak channel adc channel 3

commit c2ab447454d498e709d9011c0f2d2945ee321f9b upstream.

The driver always assumes that an input device has been created when
reading channel 3. This causes a kernel panic when dereferencing
st->ts_input.

The change was introduced in
commit 84882b060301 ("iio: adc: at91_adc: Add support for touchscreens
without TSMR"). Earlier versions only entered that part of the if-else
statement if only the following flags are set:

AT91_ADC_IER_XRDY | AT91_ADC_IER_YRDY | AT91_ADC_IER_PRDY

Signed-off-by: Anders Darander <anders@chargestorm.se>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoiio: ad799x: Fix buffered capture for ad7991/ad7995/ad7999
Lars-Peter Clausen [Mon, 11 Jul 2016 11:54:17 +0000 (13:54 +0200)]
iio: ad799x: Fix buffered capture for ad7991/ad7995/ad7999

commit 7d3cc21dab5313a02f2f3ca8164529b828a030d1 upstream.

The data buffer for captured mode for the ad799x driver is allocated in the
update_scan_mode() callback. This callback is not set in the iio_info
struct for the ad7791/ad7995/ad7999, which means that the data buffer is
not allocated when a captured transfer is started. As a result the driver
crashes when the first sample is received. To fix this properly set the
update_scan_mode() callback.

Fixes: d8dca33027c1 ("staging:iio:ad799x: Preallocate sample buffer")
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoiio: adc: ti_am335x_adc: Increase timeout value waiting for ADC sample
Vignesh R [Wed, 17 Aug 2016 12:13:01 +0000 (17:43 +0530)]
iio: adc: ti_am335x_adc: Increase timeout value waiting for ADC sample

commit 7175cce1c3f1d8c8840d2004f78f96a3904249b5 upstream.

Now that open delay and sample delay for each channel is configurable
via DT, the default IDLE_TIMEOUT value is not enough as this is
calculated based on hardcoded macros. This results in driver returning
EBUSY sometimes. Fix this by increasing the timeout
value based on maximum value possible to open delay and sample delays
for each channel.

Fixes: 5dc11e810676e ("iio: adc: ti_am335x_adc: make sample delay, open delay, averaging DT parameters")
Signed-off-by: Vignesh R <vigneshr@ti.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoiio: adc: ti_am335x_adc: Protect FIFO1 from concurrent access
Vignesh R [Wed, 17 Aug 2016 12:13:00 +0000 (17:43 +0530)]
iio: adc: ti_am335x_adc: Protect FIFO1 from concurrent access

commit 90c43ec6997a892448f1f86180a515f59cafd8a3 upstream.

It is possible that two or more ADC channels can be simultaneously
requested for raw samples, in which case there can be race in access to
FIFO data resulting in loss of samples.
If am335x_tsc_se_set_once() is called again from tiadc_read_raw(), when
ADC is still acquired to sample one of the channels, the second process
might be put into uninterruptible sleep state. Fix these issues, by
protecting FIFO access and channel configurations with a mutex. Since
tiadc_read_raw() might take anywhere between few microseconds to few
milliseconds to finish execution (depending on averaging and delay
values supplied via DT), its better to use mutex instead of spinlock.

Fixes: 7ca6740cd1cd4 ("mfd: input: iio: ti_amm335x: Rework TSC/ADC synchronization")
Signed-off-by: Vignesh R <vigneshr@ti.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoiio: adc: rockchip_saradc: reset saradc controller before programming it
Caesar Wang [Wed, 27 Jul 2016 14:24:04 +0000 (22:24 +0800)]
iio: adc: rockchip_saradc: reset saradc controller before programming it

commit 543852af8e5902aee8f7c72c89e1513663e0f696 upstream.

SARADC controller needs to be reset before programming it, otherwise
it will not function properly.

Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-iio@vger.kernel.org
Cc: linux-rockchip@lists.infradead.org
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoiio: proximity: as3935: set up buffer timestamps for non-zero values
Alison Schofield [Mon, 11 Jul 2016 15:26:56 +0000 (08:26 -0700)]
iio: proximity: as3935: set up buffer timestamps for non-zero values

commit f8adf645db03345af2d9a8b6095b02327ea50885 upstream.

Use the iio_pollfunc_store_time parameter during triggered buffer
set-up to get valid timestamps.

Signed-off-by: Alison Schofield <amsfield22@gmail.com>
Cc: Daniel Baluta <daniel.baluta@gmail.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoiio: accel: kxsd9: Fix raw read return
Linus Walleij [Tue, 16 Aug 2016 13:33:28 +0000 (15:33 +0200)]
iio: accel: kxsd9: Fix raw read return

commit 7ac61a062f3147dc23e3f12b9dfe7c4dd35f9cb8 upstream.

Any readings from the raw interface of the KXSD9 driver will
return an empty string, because it does not return
IIO_VAL_INT but rather some random value from the accelerometer
to the caller.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agokvm-arm: Unmap shadow pagetables properly
Suzuki K Poulose [Thu, 8 Sep 2016 15:25:49 +0000 (16:25 +0100)]
kvm-arm: Unmap shadow pagetables properly

commit 293f293637b55db4f9f522a5a72514e98a541076 upstream.

On arm/arm64, we depend on the kvm_unmap_hva* callbacks (via
mmu_notifiers::invalidate_*) to unmap the stage2 pagetables when
the userspace buffer gets unmapped. However, when the Hypervisor
process exits without explicit unmap of the guest buffers, the only
notifier we get is kvm_arch_flush_shadow_all() (via mmu_notifier::release
) which does nothing on arm. Later this causes us to access pages that
were already released [via exit_mmap() -> unmap_vmas()] when we actually
get to unmap the stage2 pagetable [via kvm_arch_destroy_vm() ->
kvm_free_stage2_pgd()]. This triggers crashes with CONFIG_DEBUG_PAGEALLOC,
which unmaps any free'd pages from the linear map.

 [  757.644120] Unable to handle kernel paging request at virtual address
  ffff800661e00000
 [  757.652046] pgd = ffff20000b1a2000
 [  757.655471] [ffff800661e00000] *pgd=00000047fffe3003, *pud=00000047fcd8c003,
  *pmd=00000047fcc7c003, *pte=00e8004661e00712
 [  757.666492] Internal error: Oops: 96000147 [#3] PREEMPT SMP
 [  757.672041] Modules linked in:
 [  757.675100] CPU: 7 PID: 3630 Comm: qemu-system-aar Tainted: G      D
 4.8.0-rc1 #3
 [  757.683240] Hardware name: AppliedMicro X-Gene Mustang Board/X-Gene Mustang Board,
  BIOS 3.06.15 Aug 19 2016
 [  757.692938] task: ffff80069cdd3580 task.stack: ffff8006adb7c000
 [  757.698840] PC is at __flush_dcache_area+0x1c/0x40
 [  757.703613] LR is at kvm_flush_dcache_pmd+0x60/0x70
 [  757.708469] pc : [<ffff20000809dbdc>] lr : [<ffff2000080b4a70>] pstate: 20000145
 ...
 [  758.357249] [<ffff20000809dbdc>] __flush_dcache_area+0x1c/0x40
 [  758.363059] [<ffff2000080b6748>] unmap_stage2_range+0x458/0x5f0
 [  758.368954] [<ffff2000080b708c>] kvm_free_stage2_pgd+0x34/0x60
 [  758.374761] [<ffff2000080b2280>] kvm_arch_destroy_vm+0x20/0x68
 [  758.380570] [<ffff2000080aa330>] kvm_put_kvm+0x210/0x358
 [  758.385860] [<ffff2000080aa524>] kvm_vm_release+0x2c/0x40
 [  758.391239] [<ffff2000082ad234>] __fput+0x114/0x2e8
 [  758.396096] [<ffff2000082ad46c>] ____fput+0xc/0x18
 [  758.400869] [<ffff200008104658>] task_work_run+0x108/0x138
 [  758.406332] [<ffff2000080dc8ec>] do_exit+0x48c/0x10e8
 [  758.411363] [<ffff2000080dd5fc>] do_group_exit+0x6c/0x130
 [  758.416739] [<ffff2000080ed924>] get_signal+0x284/0xa18
 [  758.421943] [<ffff20000808a098>] do_signal+0x158/0x860
 [  758.427060] [<ffff20000808aad4>] do_notify_resume+0x6c/0x88
 [  758.432608] [<ffff200008083624>] work_pending+0x10/0x14
 [  758.437812] Code: 9ac32042 8b010001 d1000443 8a230000 (d50b7e20)

This patch fixes the issue by moving the kvm_free_stage2_pgd() to
kvm_arch_flush_shadow_all().

Tested-by: Itaru Kitayama <itaru.kitayama@riken.jp>
Reported-by: Itaru Kitayama <itaru.kitayama@riken.jp>
Reported-by: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agox86/AMD: Apply erratum 665 on machines without a BIOS fix
Emanuel Czirai [Fri, 2 Sep 2016 05:35:50 +0000 (07:35 +0200)]
x86/AMD: Apply erratum 665 on machines without a BIOS fix

commit d1992996753132e2dafe955cccb2fb0714d3cfc4 upstream.

AMD F12h machines have an erratum which can cause DIV/IDIV to behave
unpredictably. The workaround is to set MSRC001_1029[31] but sometimes
there is no BIOS update containing that workaround so let's do it
ourselves unconditionally. It is simple enough.

[ Borislav: Wrote commit message. ]

Signed-off-by: Emanuel Czirai <icanrealizeum@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Yaowu Xu <yaowu@google.com>
Link: http://lkml.kernel.org/r/20160902053550.18097-1-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agox86/paravirt: Do not trace _paravirt_ident_*() functions
Steven Rostedt [Wed, 25 May 2016 17:47:26 +0000 (13:47 -0400)]
x86/paravirt: Do not trace _paravirt_ident_*() functions

commit 15301a570754c7af60335d094dd2d1808b0641a5 upstream.

Łukasz Daniluk reported that on a RHEL kernel that his machine would lock up
after enabling function tracer. I asked him to bisect the functions within
available_filter_functions, which he did and it came down to three:

  _paravirt_nop(), _paravirt_ident_32() and _paravirt_ident_64()

It was found that this is only an issue when noreplace-paravirt is added
to the kernel command line.

This means that those functions are most likely called within critical
sections of the funtion tracer, and must not be traced.

In newer kenels _paravirt_nop() is defined within gcc asm(), and is no
longer an issue.  But both _paravirt_ident_{32,64}() causes the
following splat when they are traced:

 mm/pgtable-generic.c:33: bad pmd ffff8800d2435150(0000000001d00054)
 mm/pgtable-generic.c:33: bad pmd ffff8800d3624190(0000000001d00070)
 mm/pgtable-generic.c:33: bad pmd ffff8800d36a5110(0000000001d00054)
 mm/pgtable-generic.c:33: bad pmd ffff880118eb1450(0000000001d00054)
 NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [systemd-journal:469]
 Modules linked in: e1000e
 CPU: 2 PID: 469 Comm: systemd-journal Not tainted 4.6.0-rc4-test+ #513
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
 task: ffff880118f740c0 ti: ffff8800d4aec000 task.ti: ffff8800d4aec000
 RIP: 0010:[<ffffffff81134148>]  [<ffffffff81134148>] queued_spin_lock_slowpath+0x118/0x1a0
 RSP: 0018:ffff8800d4aefb90  EFLAGS: 00000246
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88011eb16d40
 RDX: ffffffff82485760 RSI: 000000001f288820 RDI: ffffea0000008030
 RBP: ffff8800d4aefb90 R08: 00000000000c0000 R09: 0000000000000000
 R10: ffffffff821c8e0e R11: 0000000000000000 R12: ffff880000200fb8
 R13: 00007f7a4e3f7000 R14: ffffea000303f600 R15: ffff8800d4b562e0
 FS:  00007f7a4e3d7840(0000) GS:ffff88011eb00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f7a4e3f7000 CR3: 00000000d3e71000 CR4: 00000000001406e0
 Call Trace:
   _raw_spin_lock+0x27/0x30
   handle_pte_fault+0x13db/0x16b0
   handle_mm_fault+0x312/0x670
   __do_page_fault+0x1b1/0x4e0
   do_page_fault+0x22/0x30
   page_fault+0x28/0x30
   __vfs_read+0x28/0xe0
   vfs_read+0x86/0x130
   SyS_read+0x46/0xa0
   entry_SYSCALL_64_fastpath+0x1e/0xa8
 Code: 12 48 c1 ea 0c 83 e8 01 83 e2 30 48 98 48 81 c2 40 6d 01 00 48 03 14 c5 80 6a 5d 82 48 89 0a 8b 41 08 85 c0 75 09 f3 90 8b 41 08 <85> c0 74 f7 4c 8b 09 4d 85 c9 74 08 41 0f 18 09 eb 02 f3 90 8b

Reported-by: Łukasz Daniluk <lukasz.daniluk@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoARC: mm: fix build breakage with STRICT_MM_TYPECHECKS
Vineet Gupta [Wed, 17 Aug 2016 01:27:07 +0000 (18:27 -0700)]
ARC: mm: fix build breakage with STRICT_MM_TYPECHECKS

commit 1c3c909303924d30145601f47b6c058fdd2cbc2e upstream.

|  CC      mm/memory.o
| In file included from ../mm/memory.c:53:0:
| ../include/linux/pfn_t.h: In function ‘pfn_t_pte’:
| ../include/linux/pfn_t.h:78:2: error: conversion to non-scalar type requested
|  return pfn_pte(pfn_t_to_pfn(pfn), pgprot);

With STRICT_MM_TYPECHECKS pte_t is a struct and the offending code
forces a cast which ends up shifting a struct and hence the gcc warning.

Note that in recent past some of the arches (aarch64, s390) made
STRICT_MM_TYPECHECKS default, but we don't for ARC as this leads to slightly
worse generated code, given ARC ABI definition of returning structs
(which pte_t would become)

Quoting from ARC ABI...

  "Results of type struct are returned in a caller-supplied temporary
  variable whose address is passed in r0.
  For such functions, the arguments are shifted so that they are
  passed in r1 and up."

So
 - struct to be returned would be allocated on stack requiring extra
   code at call sites
 - callee updates stack memory to facilitate the return (vs. simple
   MOV into return reg r0)

Hence STRICT_MM_TYPECHECKS is not enabled by default for ARC

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoIB/uverbs: Fix race between uverbs_close and remove_one
Jason Gunthorpe [Sun, 3 Jul 2016 12:28:18 +0000 (15:28 +0300)]
IB/uverbs: Fix race between uverbs_close and remove_one

commit d1e09f304a1d9651c5059ebfeb696dc2effc9b32 upstream.

Fixes an oops that might happen if uverbs_close races with
remove_one.

Both contexts may run ib_uverbs_cleanup_ucontext, it depends
on the flow.

Currently, there is no protection for a case that remove_one
didn't make the cleanup it runs to its end, the underlying
ib_device was freed then uverbs_close will call
ib_uverbs_cleanup_ucontext and OOPs.

Above might happen if uverbs_close deleted the file from the list
then remove_one didn't find it and runs to its end.

Fixes to protect against that case by a new cleanup lock so that
ib_uverbs_cleanup_ucontext will be called always before that
remove_one is ended.

Fixes: 35d4a0b63dc0 ("IB/uverbs: Fix race between ib_uverbs_open and remove_one")
Reported-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodm flakey: fix reads to be issued if drop_writes configured
Mike Snitzer [Thu, 25 Aug 2016 01:12:58 +0000 (21:12 -0400)]
dm flakey: fix reads to be issued if drop_writes configured

commit 299f6230bc6d0ccd5f95bb0fb865d80a9c7d5ccc upstream.

v4.8-rc3 commit 99f3c90d0d ("dm flakey: error READ bios during the
down_interval") overlooked the 'drop_writes' feature, which is meant to
allow reads to be issued rather than errored, during the down_interval.

Fixes: 99f3c90d0d ("dm flakey: error READ bios during the down_interval")
Reported-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoaudit: fix exe_file access in audit_exe_compare
Mateusz Guzik [Tue, 23 Aug 2016 14:20:39 +0000 (16:20 +0200)]
audit: fix exe_file access in audit_exe_compare

commit 5efc244346f9f338765da3d592f7947b0afdc4b5 upstream.

Prior to the change the function would blindly deference mm, exe_file
and exe_file->f_inode, each of which could have been NULL or freed.

Use get_task_exe_file to safely obtain stable exe_file.

Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agomm: introduce get_task_exe_file
Mateusz Guzik [Tue, 23 Aug 2016 14:20:38 +0000 (16:20 +0200)]
mm: introduce get_task_exe_file

commit cd81a9170e69e018bbaba547c1fd85a585f5697a upstream.

For more convenient access if one has a pointer to the task.

As a minor nit take advantage of the fact that only task lock + rcu are
needed to safely grab ->exe_file. This saves mm refcount dance.

Use the helper in proc_exe_link.

Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agokexec: fix double-free when failing to relocate the purgatory
Thiago Jung Bauermann [Thu, 1 Sep 2016 23:14:44 +0000 (16:14 -0700)]
kexec: fix double-free when failing to relocate the purgatory

commit 070c43eea5043e950daa423707ae3c77e2f48edb upstream.

If kexec_apply_relocations fails, kexec_load_purgatory frees pi->sechdrs
and pi->purgatory_buf.  This is redundant, because in case of error
kimage_file_prepare_segments calls kimage_file_post_load_cleanup, which
will also free those buffers.

This causes two warnings like the following, one for pi->sechdrs and the
other for pi->purgatory_buf:

  kexec-bzImage64: Loading purgatory failed
  ------------[ cut here ]------------
  WARNING: CPU: 1 PID: 2119 at mm/vmalloc.c:1490 __vunmap+0xc1/0xd0
  Trying to vfree() nonexistent vm area (ffffc90000e91000)
  Modules linked in:
  CPU: 1 PID: 2119 Comm: kexec Not tainted 4.8.0-rc3+ #5
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Call Trace:
    dump_stack+0x4d/0x65
    __warn+0xcb/0xf0
    warn_slowpath_fmt+0x4f/0x60
    ? find_vmap_area+0x19/0x70
    ? kimage_file_post_load_cleanup+0x47/0xb0
    __vunmap+0xc1/0xd0
    vfree+0x2e/0x70
    kimage_file_post_load_cleanup+0x5e/0xb0
    SyS_kexec_file_load+0x448/0x680
    ? putname+0x54/0x60
    ? do_sys_open+0x190/0x1f0
    entry_SYSCALL_64_fastpath+0x13/0x8f
  ---[ end trace 158bb74f5950ca2b ]---

Fix by setting pi->sechdrs an pi->purgatory_buf to NULL, since vfree
won't try to free a NULL pointer.

Link: http://lkml.kernel.org/r/1472083546-23683-1-git-send-email-bauerman@linux.vnet.ibm.com
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoNFSv4.1: Fix the CREATE_SESSION slot number accounting
Trond Myklebust [Sun, 11 Sep 2016 18:50:01 +0000 (14:50 -0400)]
NFSv4.1: Fix the CREATE_SESSION slot number accounting

commit b519d408ea32040b1c7e10b155a3ee9a36660947 upstream.

Ensure that we conform to the algorithm described in RFC5661, section
18.36.4 for when to bump the sequence id. In essence we do it for all
cases except when the RPC call timed out, or in case of the server returning
NFS4ERR_DELAY or NFS4ERR_STALE_CLIENTID.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agopNFS: Ensure LAYOUTGET and LAYOUTRETURN are properly serialised
Trond Myklebust [Sat, 3 Sep 2016 14:39:51 +0000 (10:39 -0400)]
pNFS: Ensure LAYOUTGET and LAYOUTRETURN are properly serialised

commit bf0291dd2267a2b9a4cd74d65249553d11bb45d6 upstream.

According to RFC5661, the client is responsible for serialising
LAYOUTGET and LAYOUTRETURN to avoid ambiguity. Consider the case
where we send both in parallel.

Client Server
====== ======
LAYOUTGET(seqid=X)
LAYOUTRETURN(seqid=X)
LAYOUTGET return seqid=X+1
LAYOUTRETURN return seqid=X+2
Process LAYOUTRETURN
          Forget layout stateid
Process LAYOUTGET
          Set seqid=X+1

The client processes the layoutget/layoutreturn in the wrong order,
and since the result of the layoutreturn was to clear the only
existing layout segment, the client forgets the layout stateid.

When the LAYOUTGET comes in, it is treated as having a completely
new stateid, and so the client sets the wrong sequence id...

Fix is to check if there are outstanding LAYOUTGET requests
before we send the LAYOUTRETURN (note that LAYOUGET will already
wait if it sees an outstanding LAYOUTRETURN).

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonfsd: Close race between nfsd4_release_lockowner and nfsd4_lock
Chuck Lever [Wed, 13 Jul 2016 20:40:14 +0000 (16:40 -0400)]
nfsd: Close race between nfsd4_release_lockowner and nfsd4_lock

commit 885848186fbc2d1d8fb6d2fdc2156638ae289a46 upstream.

nfsd4_release_lockowner finds a lock owner that has no lock state,
and drops cl_lock. Then release_lockowner picks up cl_lock and
unhashes the lock owner.

During the window where cl_lock is dropped, I don't see anything
preventing a concurrent nfsd4_lock from finding that same lock owner
and adding lock state to it.

Move release_lockowner() into nfsd4_release_lockowner and hang onto
the cl_lock until after the lock owner's state cannot be found
again.

Found by inspection, we don't currently have a reproducer.

Fixes: 2c41beb0e5cf ("nfsd: reduce cl_lock thrashing in ... ")
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoNFSv4.x: Fix a refcount leak in nfs_callback_up_net
Trond Myklebust [Mon, 29 Aug 2016 15:15:36 +0000 (11:15 -0400)]
NFSv4.x: Fix a refcount leak in nfs_callback_up_net

commit 98b0f80c2396224bbbed81792b526e6c72ba9efa upstream.

On error, the callers expect us to return without bumping
nn->cb_users[].

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agopNFS: The client must not do I/O to the DS if it's lease has expired
Trond Myklebust [Tue, 23 Aug 2016 15:19:33 +0000 (11:19 -0400)]
pNFS: The client must not do I/O to the DS if it's lease has expired

commit b88fa69eaa8649f11828158c7b65c4bcd886ebd5 upstream.

Ensure that the client conforms to the normative behaviour described in
RFC5661 Section 12.7.2: "If a client believes its lease has expired,
it MUST NOT send I/O to the storage device until it has validated its
lease."

So ensure that we wait for the lease to be validated before using
the layout.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agokernfs: don't depend on d_find_any_alias() when generating notifications
Tejun Heo [Fri, 17 Jun 2016 21:51:17 +0000 (17:51 -0400)]
kernfs: don't depend on d_find_any_alias() when generating notifications

commit df6a58c5c5aa8ecb1e088ecead3fa33ae70181f1 upstream.

kernfs_notify_workfn() sends out file modified events for the
scheduled kernfs_nodes.  Because the modifications aren't from
userland, it doesn't have the matching file struct at hand and can't
use fsnotify_modify().  Instead, it looked up the inode and then used
d_find_any_alias() to find the dentry and used fsnotify_parent() and
fsnotify() directly to generate notifications.

The assumption was that the relevant dentries would have been pinned
if there are listeners, which isn't true as inotify doesn't pin
dentries at all and watching the parent doesn't pin the child dentries
even for dnotify.  This led to, for example, inotify watchers not
getting notifications if the system is under memory pressure and the
matching dentries got reclaimed.  It can also be triggered through
/proc/sys/vm/drop_caches or a remount attempt which involves shrinking
dcache.

fsnotify_parent() only uses the dentry to access the parent inode,
which kernfs can do easily.  Update kernfs_notify_workfn() so that it
uses fsnotify() directly for both the parent and target inodes without
going through d_find_any_alias().  While at it, supply the target file
name to fsnotify() from kernfs_node->name.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Evgeny Vereshchagin <evvers@ya.ru>
Fixes: d911d9874801 ("kernfs: make kernfs_notify() trigger inotify events too")
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rlove@rlove.org>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agopowerpc/mm: Don't alias user region to other regions below PAGE_OFFSET
Paul Mackerras [Fri, 2 Sep 2016 11:47:59 +0000 (21:47 +1000)]
powerpc/mm: Don't alias user region to other regions below PAGE_OFFSET

commit f077aaf0754bcba0fffdbd925bc12f09cd1e38aa upstream.

In commit c60ac5693c47 ("powerpc: Update kernel VSID range", 2013-03-13)
we lost a check on the region number (the top four bits of the effective
address) for addresses below PAGE_OFFSET.  That commit replaced a check
that the top 18 bits were all zero with a check that bits 46 - 59 were
zero (performed for all addresses, not just user addresses).

This means that userspace can access an address like 0x1000_0xxx_xxxx_xxxx
and we will insert a valid SLB entry for it.  The VSID used will be the
same as if the top 4 bits were 0, but the page size will be some random
value obtained by indexing beyond the end of the mm_ctx_high_slices_psize
array in the paca.  If that page size is the same as would be used for
region 0, then userspace just has an alias of the region 0 space.  If the
page size is different, then no HPTE will be found for the access, and
the process will get a SIGSEGV (since hash_page_mm() will refuse to create
a HPTE for the bogus address).

The access beyond the end of the mm_ctx_high_slices_psize can be at most
5.5MB past the array, and so will be in RAM somewhere.  Since the access
is a load performed in real mode, it won't fault or crash the kernel.
At most this bug could perhaps leak a little bit of information about
blocks of 32 bytes of memory located at offsets of i * 512kB past the
paca->mm_ctx_high_slices_psize array, for 1 <= i <= 11.

Fixes: c60ac5693c47 ("powerpc: Update kernel VSID range")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agopowerpc/powernv : Drop reference added by kset_find_obj()
Mukesh Ojha [Mon, 22 Aug 2016 06:47:44 +0000 (12:17 +0530)]
powerpc/powernv : Drop reference added by kset_find_obj()

commit a9cbf0b2195b695cbeeeecaa4e2770948c212e9a upstream.

In a situation, where Linux kernel gets notified about duplicate error log
from OPAL, it is been observed that kernel fails to remove sysfs entries
(/sys/firmware/opal/elog/0xXXXXXXXX) of such error logs. This is because,
we currently search the error log/dump kobject in the kset list via
'kset_find_obj()' routine. Which eventually increment the reference count
by one, once it founds the kobject.

So, unless we decrement the reference count by one after it found the kobject,
we would not be able to release the kobject properly later.

This patch adds the 'kobject_put()' which was missing earlier.

Signed-off-by: Mukesh Ojha <mukesh02@linux.vnet.ibm.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agopowerpc/tm: do not use r13 for tabort_syscall
Nicholas Piggin [Mon, 25 Jul 2016 04:26:51 +0000 (14:26 +1000)]
powerpc/tm: do not use r13 for tabort_syscall

commit cc7786d3ee7e3c979799db834b528db2c0834c2e upstream.

tabort_syscall runs with RI=1, so a nested recoverable machine
check will load the paca into r13 and overwrite what we loaded
it with, because exceptions returning to privileged mode do not
restore r13.

Fixes: b4b56f9ecab4 (powerpc/tm: Abort syscalls in active transactions)
Signed-off-by: Nick Piggin <npiggin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agotipc: move linearization of buffers to generic code
Jon Paul Maloy [Wed, 21 Sep 2016 13:00:02 +0000 (15:00 +0200)]
tipc: move linearization of buffers to generic code

commit c7cad0d6f70cd4ce8644ffe528a4df1cdc2e77f5 upstream.

In commit 5cbb28a4bf65c7e4 ("tipc: linearize arriving NAME_DISTR
and LINK_PROTO buffers") we added linearization of NAME_DISTRIBUTOR,
LINK_PROTOCOL/RESET and LINK_PROTOCOL/ACTIVATE to the function
tipc_udp_recv(). The location of the change was selected in order
to make the commit easily appliable to 'net' and 'stable'.

We now move this linearization to where it should be done, in the
functions tipc_named_rcv() and tipc_link_proto_rcv() respectively.

Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Juerg Haefliger <juerg.haefliger@hpe.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agolightnvm: put bio before return
Wenwei Tao [Thu, 4 Feb 2016 14:13:23 +0000 (15:13 +0100)]
lightnvm: put bio before return

commit 16c6d048d7b74249a4387700887e8adb13028866 upstream.

The bio is not returned if the data page cannot be allocated.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agofscrypto: require write access to mount to set encryption policy
Eric Biggers [Thu, 8 Sep 2016 21:20:38 +0000 (14:20 -0700)]
fscrypto: require write access to mount to set encryption policy

commit ba63f23d69a3a10e7e527a02702023da68ef8a6d upstream.

Since setting an encryption policy requires writing metadata to the
filesystem, it should be guarded by mnt_want_write/mnt_drop_write.
Otherwise, a user could cause a write to a frozen or readonly
filesystem.  This was handled correctly by f2fs but not by ext4.  Make
fscrypt_process_policy() handle it rather than relying on the filesystem
to get it right.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: stable@vger.kernel.org # 4.1+; check fs/{ext4,f2fs}
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoRevert "KVM: x86: fix missed hardware breakpoints"
Paolo Bonzini [Thu, 15 Sep 2016 21:52:43 +0000 (23:52 +0200)]
Revert "KVM: x86: fix missed hardware breakpoints"

[the change is part of 70e4da7a8ff62f2775337b705f45c804bb450454, which
is already in stable kernels 4.1.y to 4.4.y.  this part of the fix
however was later undone, so remove the line again]

The following patches were applied in the wrong order in -stable. This
is the order as they appear in Linus' tree,

 [0] commit 4e422bdd2f84 ("KVM: x86: fix missed hardware breakpoints")
 [1] commit 172b2386ed16 ("KVM: x86: fix missed hardware breakpoints")
 [2] commit 70e4da7a8ff6 ("KVM: x86: fix root cause for missed hardware breakpoints")

but this is the order for linux-4.4.y

 [1] commit fc90441e728a ("KVM: x86: fix missed hardware breakpoints")
 [2] commit 25e8618619a5 ("KVM: x86: fix root cause for missed hardware breakpoints")
 [0] commit 0f6e5e26e68f ("KVM: x86: fix missed hardware breakpoints")

The upshot is that KVM_DEBUGREG_RELOAD is always set when returning
from kvm_arch_vcpu_load() in stable, but not in Linus' tree.

This happened because [0] and [1] are the same patch.  [0] and [1] come from two
different merges, and the later merge is trivially resolved; when [2]
is applied it reverts both of them.  Instead, when using the [1][2][0]
order, patches applies normally but "KVM: x86: fix missed hardware
breakpoints" is present in the final tree.

Reported-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoMIPS: KVM: Check for pfn noslot case
James Hogan [Fri, 19 Aug 2016 13:30:29 +0000 (14:30 +0100)]
MIPS: KVM: Check for pfn noslot case

commit ba913e4f72fc9cfd03dad968dfb110eb49211d80 upstream.

When mapping a page into the guest we error check using is_error_pfn(),
however this doesn't detect a value of KVM_PFN_NOSLOT, indicating an
error HVA for the page. This can only happen on MIPS right now due to
unusual memslot management (e.g. being moved / removed / resized), or
with an Enhanced Virtual Memory (EVA) configuration where the default
KVM_HVA_ERR_* and kvm_is_error_hva() definitions are unsuitable (fixed
in a later patch). This case will be treated as a pfn of zero, mapping
the first page of physical memory into the guest.

It would appear the MIPS KVM port wasn't updated prior to being merged
(in v3.10) to take commit 81c52c56e2b4 ("KVM: do not treat noslot pfn as
a error pfn") into account (merged v3.8), which converted a bunch of
is_error_pfn() calls to is_error_noslot_pfn(). Switch to using
is_error_noslot_pfn() instead to catch this case properly.

Fixes: 858dd5d45733 ("KVM/MIPS32: MMU/TLB operations for the Guest.")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[james.hogan@imgtec.com: Backport to v4.7.y]
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoclocksource/drivers/sun4i: Clear interrupts after stopping timer in probe function
Chen-Yu Tsai [Thu, 25 Aug 2016 06:26:59 +0000 (14:26 +0800)]
clocksource/drivers/sun4i: Clear interrupts after stopping timer in probe function

commit b53e7d000d9e6e9fd2c6eb6b82d2783c67fd599e upstream.

The bootloader (U-boot) sometimes uses this timer for various delays.
It uses it as a ongoing counter, and does comparisons on the current
counter value. The timer counter is never stopped.

In some cases when the user interacts with the bootloader, or lets
it idle for some time before loading Linux, the timer may expire,
and an interrupt will be pending. This results in an unexpected
interrupt when the timer interrupt is enabled by the kernel, at
which point the event_handler isn't set yet. This results in a NULL
pointer dereference exception, panic, and no way to reboot.

Clear any pending interrupts after we stop the timer in the probe
function to avoid this.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agofscrypto: add authorization check for setting encryption policy
Eric Biggers [Thu, 8 Sep 2016 17:57:08 +0000 (10:57 -0700)]
fscrypto: add authorization check for setting encryption policy

commit 163ae1c6ad6299b19e22b4a35d5ab24a89791a98 upstream.

On an ext4 or f2fs filesystem with file encryption supported, a user
could set an encryption policy on any empty directory(*) to which they
had readonly access.  This is obviously problematic, since such a
directory might be owned by another user and the new encryption policy
would prevent that other user from creating files in their own directory
(for example).

Fix this by requiring inode_owner_or_capable() permission to set an
encryption policy.  This means that either the caller must own the file,
or the caller must have the capability CAP_FOWNER.

(*) Or also on any regular file, for f2fs v4.6 and later and ext4
    v4.8-rc1 and later; a separate bug fix is coming for that.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoext4: use __GFP_NOFAIL in ext4_free_blocks()
Konstantin Khlebnikov [Sun, 13 Mar 2016 21:29:06 +0000 (17:29 -0400)]
ext4: use __GFP_NOFAIL in ext4_free_blocks()

commit adb7ef600cc9d9d15ecc934cc26af5c1379777df upstream.

This might be unexpected but pages allocated for sbi->s_buddy_cache are
charged to current memory cgroup. So, GFP_NOFS allocation could fail if
current task has been killed by OOM or if current memory cgroup has no
free memory left. Block allocator cannot handle such failures here yet.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoLinux 4.4.21
Greg Kroah-Hartman [Thu, 15 Sep 2016 06:29:29 +0000 (08:29 +0200)]
Linux 4.4.21

8 years agolib/mpi: mpi_write_sgl(): fix skipping of leading zero limbs
Nicolai Stange [Tue, 22 Mar 2016 12:12:35 +0000 (13:12 +0100)]
lib/mpi: mpi_write_sgl(): fix skipping of leading zero limbs

commit f2d1362ff7d266b3d2b1c764d6c2ef4a3b457f23 upstream.

Currently, if the number of leading zeros is greater than fits into a
complete limb, mpi_write_sgl() skips them by iterating over them limb-wise.

However, it fails to adjust its internal leading zeros tracking variable,
lzeros, accordingly: it does a

  p -= sizeof(alimb);
  continue;

which should really have been a

  lzeros -= sizeof(alimb);
  continue;

Since lzeros never decreases if its initial value >= sizeof(alimb), nothing
gets copied by mpi_write_sgl() in that case.

Instead of skipping the high order zero limbs within the loop as shown
above, fix the issue by adjusting the copying loop's bounds.

Fixes: 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoregulator: anatop: allow regulator to be in bypass mode
Mika Båtsman [Fri, 17 Jun 2016 10:31:37 +0000 (13:31 +0300)]
regulator: anatop: allow regulator to be in bypass mode

commit 8a092e682f20f193f2070dba2ea1904e95814126 upstream.

Bypass support was added in commit d38018f2019c ("regulator: anatop: Add
bypass support to digital LDOs"). A check for valid voltage selectors was
added in commit da0607c8df5c ("regulator: anatop: Fail on invalid voltage
selector") but it also discards all regulators that are in bypass mode. Add
check for the bypass setting. Errors below were seen on a Variscite mx6
board.

anatop_regulator 20c8000.anatop:regulator-vddcore@140: Failed to read a valid default voltage selector.
anatop_regulator: probe of 20c8000.anatop:regulator-vddcore@140 failed with error -22
anatop_regulator 20c8000.anatop:regulator-vddsoc@140: Failed to read a valid default voltage selector.
anatop_regulator: probe of 20c8000.anatop:regulator-vddsoc@140 failed with error -22

Fixes: da0607c8df5c ("regulator: anatop: Fail on invalid voltage selector")
Signed-off-by: Mika Båtsman <mbatsman@mvista.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agohwrng: exynos - Disable runtime PM on probe failure
Krzysztof Kozlowski [Mon, 14 Mar 2016 00:07:14 +0000 (09:07 +0900)]
hwrng: exynos - Disable runtime PM on probe failure

commit 48a61e1e2af8020f11a2b8f8dc878144477623c6 upstream.

Add proper error path (for disabling runtime PM) when registering of
hwrng fails.

Fixes: b329669ea0b5 ("hwrng: exynos - Add support for Exynos random number generator")
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agocpufreq: Fix GOV_LIMITS handling for the userspace governor
Sai Gurrappadi [Fri, 29 Apr 2016 21:44:37 +0000 (14:44 -0700)]
cpufreq: Fix GOV_LIMITS handling for the userspace governor

commit e43e94c1eda76dabd686ddf6f7825f54d747b310 upstream.

Currently, the userspace governor only updates frequency on GOV_LIMITS
if policy->cur falls outside policy->{min/max}. However, it is also
necessary to update current frequency on GOV_LIMITS to match the user
requested value if it can be achieved within the new policy->{max/min}.

This was previously the behaviour in the governor until commit d1922f0
("cpufreq: Simplify userspace governor") which incorrectly assumed that
policy->cur == user requested frequency via scaling_setspeed. This won't
be true if the user requested frequency falls outside policy->{min/max}.
Ex: a temporary thermal cap throttled the user requested frequency.

Fix this by storing the user requested frequency in a seperate variable.
The governor will then try to achieve this request on every GOV_LIMITS
change.

Fixes: d1922f02562f (cpufreq: Simplify userspace governor)
Signed-off-by: Sai Gurrappadi <sgurrappadi@nvidia.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agometag: Fix atomic_*_return inline asm constraints
James Hogan [Tue, 3 May 2016 08:11:21 +0000 (09:11 +0100)]
metag: Fix atomic_*_return inline asm constraints

commit 096a8b6d5e7ab9f8ca3d2474b3ca6a1fe79e0371 upstream.

The argument i of atomic_*_return() operations is given to inline asm
with the "bd" constraint, which means "An Op2 register where Op1 is a
data unit register and the instruction supports O2R", however Op1 is
constrained by "da" which allows an address unit register to be used.

Fix the constraint to use "br", meaning "An Op2 register and the
instruction supports O2R", i.e. not requiring Op1 to be a data unit
register.

Fixes: d6dfe2509da9 ("locking,arch,metag: Fold atomic_ops")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-metag@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoscsi: fix upper bounds check of sense key in scsi_sense_key_string()
Tyrel Datwyler [Fri, 12 Aug 2016 22:20:07 +0000 (17:20 -0500)]
scsi: fix upper bounds check of sense key in scsi_sense_key_string()

commit a87eeb900dbb9f8202f96604d56e47e67c936b9d upstream.

Commit 655ee63cf371 ("scsi constants: command, sense key + additional
sense string") added a "Completed" sense string with key 0xF to
snstext[], but failed to updated the upper bounds check of the sense key
in scsi_sense_key_string().

Fixes: 655ee63cf371 ("[SCSI] scsi constants: command, sense key + additional sense strings")
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: timer: fix NULL pointer dereference on memory allocation failure
Vegard Nossum [Sun, 28 Aug 2016 22:33:51 +0000 (00:33 +0200)]
ALSA: timer: fix NULL pointer dereference on memory allocation failure

commit 8ddc05638ee42b18ba4fe99b5fb647fa3ad20456 upstream.

I hit this with syzkaller:

    kasan: CONFIG_KASAN_INLINE enabled
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] PREEMPT SMP KASAN
    CPU: 0 PID: 1327 Comm: a.out Not tainted 4.8.0-rc2+ #190
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
    task: ffff88011278d600 task.stack: ffff8801120c0000
    RIP: 0010:[<ffffffff82c8ba07>]  [<ffffffff82c8ba07>] snd_hrtimer_start+0x77/0x100
    RSP: 0018:ffff8801120c7a60  EFLAGS: 00010006
    RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000007
    RDX: 0000000000000009 RSI: 1ffff10023483091 RDI: 0000000000000048
    RBP: ffff8801120c7a78 R08: ffff88011a5cf768 R09: ffff88011a5ba790
    R10: 0000000000000002 R11: ffffed00234b9ef1 R12: ffff880114843980
    R13: ffffffff84213c00 R14: ffff880114843ab0 R15: 0000000000000286
    FS:  00007f72958f3700(0000) GS:ffff88011aa00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000603001 CR3: 00000001126ab000 CR4: 00000000000006f0
    Stack:
     ffff880114843980 ffff880111eb2dc0 ffff880114843a34 ffff8801120c7ad0
     ffffffff82c81ab1 0000000000000000 ffffffff842138e0 0000000100000000
     ffff880111eb2dd0 ffff880111eb2dc0 0000000000000001 ffff880111eb2dc0
    Call Trace:
     [<ffffffff82c81ab1>] snd_timer_start1+0x331/0x670
     [<ffffffff82c85bfd>] snd_timer_start+0x5d/0xa0
     [<ffffffff82c8795e>] snd_timer_user_ioctl+0x88e/0x2830
     [<ffffffff8159f3a0>] ? __follow_pte.isra.49+0x430/0x430
     [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
     [<ffffffff815a26fa>] ? do_wp_page+0x3aa/0x1c90
     [<ffffffff8132762f>] ? put_prev_entity+0x108f/0x21a0
     [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
     [<ffffffff816b0733>] do_vfs_ioctl+0x193/0x1050
     [<ffffffff813510af>] ? cpuacct_account_field+0x12f/0x1a0
     [<ffffffff816b05a0>] ? ioctl_preallocate+0x200/0x200
     [<ffffffff81002f2f>] ? syscall_trace_enter+0x3cf/0xdb0
     [<ffffffff815045ba>] ? __context_tracking_exit.part.4+0x9a/0x1e0
     [<ffffffff81002b60>] ? exit_to_usermode_loop+0x190/0x190
     [<ffffffff82001a97>] ? check_preemption_disabled+0x37/0x1e0
     [<ffffffff81d93889>] ? security_file_ioctl+0x89/0xb0
     [<ffffffff816b167f>] SyS_ioctl+0x8f/0xc0
     [<ffffffff816b15f0>] ? do_vfs_ioctl+0x1050/0x1050
     [<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
     [<ffffffff83c32b2a>] entry_SYSCALL64_slow_path+0x25/0x25
    Code: c7 c7 c4 b9 c8 82 48 89 d9 4c 89 ee e8 63 88 7f fe e8 7e 46 7b fe 48 8d 7b 48 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 04 84 c0 7e 65 80 7b 48 00 74 0e e8 52 46
    RIP  [<ffffffff82c8ba07>] snd_hrtimer_start+0x77/0x100
     RSP <ffff8801120c7a60>
    ---[ end trace 5955b08db7f2b029 ]---

This can happen if snd_hrtimer_open() fails to allocate memory and
returns an error, which is currently not checked by snd_timer_open():

    ioctl(SNDRV_TIMER_IOCTL_SELECT)
     - snd_timer_user_tselect()
- snd_timer_close()
   - snd_hrtimer_close()
      - (struct snd_timer *) t->private_data = NULL
        - snd_timer_open()
           - snd_hrtimer_open()
              - kzalloc() fails; t->private_data is still NULL

    ioctl(SNDRV_TIMER_IOCTL_START)
     - snd_timer_user_start()
- snd_timer_start()
   - snd_timer_start1()
      - snd_hrtimer_start()
- t->private_data == NULL // boom

Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: timer: fix division by zero after SNDRV_TIMER_IOCTL_CONTINUE
Vegard Nossum [Sun, 28 Aug 2016 22:33:50 +0000 (00:33 +0200)]
ALSA: timer: fix division by zero after SNDRV_TIMER_IOCTL_CONTINUE

commit 6b760bb2c63a9e322c0e4a0b5daf335ad93d5a33 upstream.

I got this:

    divide error: 0000 [#1] PREEMPT SMP KASAN
    CPU: 1 PID: 1327 Comm: a.out Not tainted 4.8.0-rc2+ #189
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
    task: ffff8801120a9580 task.stack: ffff8801120b0000
    RIP: 0010:[<ffffffff82c8bd9a>]  [<ffffffff82c8bd9a>] snd_hrtimer_callback+0x1da/0x3f0
    RSP: 0018:ffff88011aa87da8  EFLAGS: 00010006
    RAX: 0000000000004f76 RBX: ffff880112655e88 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: ffff880112655ea0 RDI: 0000000000000001
    RBP: ffff88011aa87e00 R08: ffff88013fff905c R09: ffff88013fff9048
    R10: ffff88013fff9050 R11: 00000001050a7b8c R12: ffff880114778a00
    R13: ffff880114778ab4 R14: ffff880114778b30 R15: 0000000000000000
    FS:  00007f071647c700(0000) GS:ffff88011aa80000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000603001 CR3: 0000000112021000 CR4: 00000000000006e0
    Stack:
     0000000000000000 ffff880114778ab8 ffff880112655ea0 0000000000004f76
     ffff880112655ec8 ffff880112655e80 ffff880112655e88 ffff88011aa98fc0
     00000000b97ccf2b dffffc0000000000 ffff88011aa98fc0 ffff88011aa87ef0
    Call Trace:
     <IRQ>
     [<ffffffff813abce7>] __hrtimer_run_queues+0x347/0xa00
     [<ffffffff82c8bbc0>] ? snd_hrtimer_close+0x130/0x130
     [<ffffffff813ab9a0>] ? retrigger_next_event+0x1b0/0x1b0
     [<ffffffff813ae1a6>] ? hrtimer_interrupt+0x136/0x4b0
     [<ffffffff813ae220>] hrtimer_interrupt+0x1b0/0x4b0
     [<ffffffff8120f91e>] local_apic_timer_interrupt+0x6e/0xf0
     [<ffffffff81227ad3>] ? kvm_guest_apic_eoi_write+0x13/0xc0
     [<ffffffff83c35086>] smp_apic_timer_interrupt+0x76/0xa0
     [<ffffffff83c3416c>] apic_timer_interrupt+0x8c/0xa0
     <EOI>
     [<ffffffff83c3239c>] ? _raw_spin_unlock_irqrestore+0x2c/0x60
     [<ffffffff82c8185d>] snd_timer_start1+0xdd/0x670
     [<ffffffff82c87015>] snd_timer_continue+0x45/0x80
     [<ffffffff82c88100>] snd_timer_user_ioctl+0x1030/0x2830
     [<ffffffff8159f3a0>] ? __follow_pte.isra.49+0x430/0x430
     [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
     [<ffffffff815a26fa>] ? do_wp_page+0x3aa/0x1c90
     [<ffffffff815aa4f8>] ? handle_mm_fault+0xbc8/0x27f0
     [<ffffffff815a9930>] ? __pmd_alloc+0x370/0x370
     [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
     [<ffffffff816b0733>] do_vfs_ioctl+0x193/0x1050
     [<ffffffff816b05a0>] ? ioctl_preallocate+0x200/0x200
     [<ffffffff81002f2f>] ? syscall_trace_enter+0x3cf/0xdb0
     [<ffffffff815045ba>] ? __context_tracking_exit.part.4+0x9a/0x1e0
     [<ffffffff81002b60>] ? exit_to_usermode_loop+0x190/0x190
     [<ffffffff82001a97>] ? check_preemption_disabled+0x37/0x1e0
     [<ffffffff81d93889>] ? security_file_ioctl+0x89/0xb0
     [<ffffffff816b167f>] SyS_ioctl+0x8f/0xc0
     [<ffffffff816b15f0>] ? do_vfs_ioctl+0x1050/0x1050
     [<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
     [<ffffffff83c32b2a>] entry_SYSCALL64_slow_path+0x25/0x25
    Code: e8 fc 42 7b fe 8b 0d 06 8a 50 03 49 0f af cf 48 85 c9 0f 88 7c 01 00 00 48 89 4d a8 e8 e0 42 7b fe 48 8b 45 c0 48 8b 4d a8 48 99 <48> f7 f9 49 01 c7 e8 cb 42 7b fe 48 8b 55 d0 48 b8 00 00 00 00
    RIP  [<ffffffff82c8bd9a>] snd_hrtimer_callback+0x1da/0x3f0
     RSP <ffff88011aa87da8>
    ---[ end trace 6aa380f756a21074 ]---

The problem happens when you call ioctl(SNDRV_TIMER_IOCTL_CONTINUE) on a
completely new/unused timer -- it will have ->sticks == 0, which causes a
divide by 0 in snd_hrtimer_callback().

Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: timer: fix NULL pointer dereference in read()/ioctl() race
Vegard Nossum [Sun, 28 Aug 2016 08:13:07 +0000 (10:13 +0200)]
ALSA: timer: fix NULL pointer dereference in read()/ioctl() race

commit 11749e086b2766cccf6217a527ef5c5604ba069c upstream.

I got this with syzkaller:

    ==================================================================
    BUG: KASAN: null-ptr-deref on address 0000000000000020
    Read of size 32 by task syz-executor/22519
    CPU: 1 PID: 22519 Comm: syz-executor Not tainted 4.8.0-rc2+ #169
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2
    014
     0000000000000001 ffff880111a17a00 ffffffff81f9f141 ffff880111a17a90
     ffff880111a17c50 ffff880114584a58 ffff880114584a10 ffff880111a17a80
     ffffffff8161fe3f ffff880100000000 ffff880118d74a48 ffff880118d74a68
    Call Trace:
     [<ffffffff81f9f141>] dump_stack+0x83/0xb2
     [<ffffffff8161fe3f>] kasan_report_error+0x41f/0x4c0
     [<ffffffff8161ff74>] kasan_report+0x34/0x40
     [<ffffffff82c84b54>] ? snd_timer_user_read+0x554/0x790
     [<ffffffff8161e79e>] check_memory_region+0x13e/0x1a0
     [<ffffffff8161e9c1>] kasan_check_read+0x11/0x20
     [<ffffffff82c84b54>] snd_timer_user_read+0x554/0x790
     [<ffffffff82c84600>] ? snd_timer_user_info_compat.isra.5+0x2b0/0x2b0
     [<ffffffff817d0831>] ? proc_fault_inject_write+0x1c1/0x250
     [<ffffffff817d0670>] ? next_tgid+0x2a0/0x2a0
     [<ffffffff8127c278>] ? do_group_exit+0x108/0x330
     [<ffffffff8174653a>] ? fsnotify+0x72a/0xca0
     [<ffffffff81674dfe>] __vfs_read+0x10e/0x550
     [<ffffffff82c84600>] ? snd_timer_user_info_compat.isra.5+0x2b0/0x2b0
     [<ffffffff81674cf0>] ? do_sendfile+0xc50/0xc50
     [<ffffffff81745e10>] ? __fsnotify_update_child_dentry_flags+0x60/0x60
     [<ffffffff8143fec6>] ? kcov_ioctl+0x56/0x190
     [<ffffffff81e5ada2>] ? common_file_perm+0x2e2/0x380
     [<ffffffff81746b0e>] ? __fsnotify_parent+0x5e/0x2b0
     [<ffffffff81d93536>] ? security_file_permission+0x86/0x1e0
     [<ffffffff816728f5>] ? rw_verify_area+0xe5/0x2b0
     [<ffffffff81675355>] vfs_read+0x115/0x330
     [<ffffffff81676371>] SyS_read+0xd1/0x1a0
     [<ffffffff816762a0>] ? vfs_write+0x4b0/0x4b0
     [<ffffffff82001c2c>] ? __this_cpu_preempt_check+0x1c/0x20
     [<ffffffff8150455a>] ? __context_tracking_exit.part.4+0x3a/0x1e0
     [<ffffffff816762a0>] ? vfs_write+0x4b0/0x4b0
     [<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
     [<ffffffff810052fc>] ? syscall_return_slowpath+0x16c/0x1d0
     [<ffffffff83c3276a>] entry_SYSCALL64_slow_path+0x25/0x25
    ==================================================================

There are a couple of problems that I can see:

 - ioctl(SNDRV_TIMER_IOCTL_SELECT), which potentially sets
   tu->queue/tu->tqueue to NULL on memory allocation failure, so read()
   would get a NULL pointer dereference like the above splat

 - the same ioctl() can free tu->queue/to->tqueue which means read()
   could potentially see (and dereference) the freed pointer

We can fix both by taking the ioctl_lock mutex when dereferencing
->queue/->tqueue, since that's always held over all the ioctl() code.

Just looking at the code I find it likely that there are more problems
here such as tu->qhead pointing outside the buffer if the size is
changed concurrently using SNDRV_TIMER_IOCTL_PARAMS.

Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: hda - Enable subwoofer on Dell Inspiron 7559
Kai-Heng Feng [Tue, 30 Aug 2016 07:36:34 +0000 (15:36 +0800)]
ALSA: hda - Enable subwoofer on Dell Inspiron 7559

commit fd06c77eb9200b53d421da5fffe0dcd894b5d72a upstream.

The subwoofer on Inspiron 7559 was disabled originally.
Applying a pin fixup to node 0x1b can enable it and make it work.

Old pin: 0x411111f0
New pin: 0x90170151

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: hda - Add headset mic quirk for Dell Inspiron 5468
Shrirang Bagul [Mon, 29 Aug 2016 07:19:27 +0000 (15:19 +0800)]
ALSA: hda - Add headset mic quirk for Dell Inspiron 5468

commit 311042d1b67d9a1856a8e1294e7729fb86f64014 upstream.

This patch enables headset microphone on some variants of
Dell Inspiron 5468. (Dell SSID 0x07ad)

BugLink: https://bugs.launchpad.net/bugs/1617900
Signed-off-by: Shrirang Bagul <shrirang.bagul@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: rawmidi: Fix possible deadlock with virmidi registration
Takashi Iwai [Tue, 30 Aug 2016 12:45:46 +0000 (14:45 +0200)]
ALSA: rawmidi: Fix possible deadlock with virmidi registration

commit 816f318b2364262a51024096da7ca3b84e78e3b5 upstream.

When a seq-virmidi driver is initialized, it registers a rawmidi
instance with its callback to create an associated seq kernel client.
Currently it's done throughly in rawmidi's register_mutex context.
Recently it was found that this may lead to a deadlock another rawmidi
device that is being attached with the sequencer is accessed, as both
open with the same register_mutex.  This was actually triggered by
syzkaller, as Dmitry Vyukov reported:

======================================================
 [ INFO: possible circular locking dependency detected ]
 4.8.0-rc1+ #11 Not tainted
 -------------------------------------------------------
 syz-executor/7154 is trying to acquire lock:
  (register_mutex#5){+.+.+.}, at: [<ffffffff84fd6d4b>] snd_rawmidi_kernel_open+0x4b/0x260 sound/core/rawmidi.c:341

 but task is already holding lock:
  (&grp->list_mutex){++++.+}, at: [<ffffffff850138bb>] check_and_subscribe_port+0x5b/0x5c0 sound/core/seq/seq_ports.c:495

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (&grp->list_mutex){++++.+}:
    [<ffffffff8147a3a8>] lock_acquire+0x208/0x430 kernel/locking/lockdep.c:3746
    [<ffffffff863f6199>] down_read+0x49/0xc0 kernel/locking/rwsem.c:22
    [<     inline     >] deliver_to_subscribers sound/core/seq/seq_clientmgr.c:681
    [<ffffffff85005c5e>] snd_seq_deliver_event+0x35e/0x890 sound/core/seq/seq_clientmgr.c:822
    [<ffffffff85006e96>] > snd_seq_kernel_client_dispatch+0x126/0x170 sound/core/seq/seq_clientmgr.c:2418
    [<ffffffff85012c52>] snd_seq_system_broadcast+0xb2/0xf0 sound/core/seq/seq_system.c:101
    [<ffffffff84fff70a>] snd_seq_create_kernel_client+0x24a/0x330 sound/core/seq/seq_clientmgr.c:2297
    [<     inline     >] snd_virmidi_dev_attach_seq sound/core/seq/seq_virmidi.c:383
    [<ffffffff8502d29f>] snd_virmidi_dev_register+0x29f/0x750 sound/core/seq/seq_virmidi.c:450
    [<ffffffff84fd208c>] snd_rawmidi_dev_register+0x30c/0xd40 sound/core/rawmidi.c:1645
    [<ffffffff84f816d3>] __snd_device_register.part.0+0x63/0xc0 sound/core/device.c:164
    [<     inline     >] __snd_device_register sound/core/device.c:162
    [<ffffffff84f8235d>] snd_device_register_all+0xad/0x110 sound/core/device.c:212
    [<ffffffff84f7546f>] snd_card_register+0xef/0x6c0 sound/core/init.c:749
    [<ffffffff85040b7f>] snd_virmidi_probe+0x3ef/0x590 sound/drivers/virmidi.c:123
    [<ffffffff833ebf7b>] platform_drv_probe+0x8b/0x170 drivers/base/platform.c:564
    ......

 -> #0 (register_mutex#5){+.+.+.}:
    [<     inline     >] check_prev_add kernel/locking/lockdep.c:1829
    [<     inline     >] check_prevs_add kernel/locking/lockdep.c:1939
    [<     inline     >] validate_chain kernel/locking/lockdep.c:2266
    [<ffffffff814791f4>] __lock_acquire+0x4d44/0x4d80 kernel/locking/lockdep.c:3335
    [<ffffffff8147a3a8>] lock_acquire+0x208/0x430 kernel/locking/lockdep.c:3746
    [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:521
    [<ffffffff863f0ef1>] mutex_lock_nested+0xb1/0xa20 kernel/locking/mutex.c:621
    [<ffffffff84fd6d4b>] snd_rawmidi_kernel_open+0x4b/0x260 sound/core/rawmidi.c:341
    [<ffffffff8502e7c7>] midisynth_subscribe+0xf7/0x350 sound/core/seq/seq_midi.c:188
    [<     inline     >] subscribe_port sound/core/seq/seq_ports.c:427
    [<ffffffff85013cc7>] check_and_subscribe_port+0x467/0x5c0 sound/core/seq/seq_ports.c:510
    [<ffffffff85015da9>] snd_seq_port_connect+0x2c9/0x500 sound/core/seq/seq_ports.c:579
    [<ffffffff850079b8>] snd_seq_ioctl_subscribe_port+0x1d8/0x2b0 sound/core/seq/seq_clientmgr.c:1480
    [<ffffffff84ffe9e4>] snd_seq_do_ioctl+0x184/0x1e0 sound/core/seq/seq_clientmgr.c:2225
    [<ffffffff84ffeae8>] snd_seq_kernel_client_ctl+0xa8/0x110 sound/core/seq/seq_clientmgr.c:2440
    [<ffffffff85027664>] snd_seq_oss_midi_open+0x3b4/0x610 sound/core/seq/oss/seq_oss_midi.c:375
    [<ffffffff85023d67>] snd_seq_oss_synth_setup_midi+0x107/0x4c0 sound/core/seq/oss/seq_oss_synth.c:281
    [<ffffffff8501b0a8>] snd_seq_oss_open+0x748/0x8d0 sound/core/seq/oss/seq_oss_init.c:274
    [<ffffffff85019d8a>] odev_open+0x6a/0x90 sound/core/seq/oss/seq_oss.c:138
    [<ffffffff84f7040f>] soundcore_open+0x30f/0x640 sound/sound_core.c:639
    ......

 other info that might help us debug this:

 Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&grp->list_mutex);
                                lock(register_mutex#5);
                                lock(&grp->list_mutex);
   lock(register_mutex#5);

 *** DEADLOCK ***
======================================================

The fix is to simply move the registration parts in
snd_rawmidi_dev_register() to the outside of the register_mutex lock.
The lock is needed only to manage the linked list, and it's not
necessarily to cover the whole initialization process.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: fireworks: accessing to user space outside spinlock
Takashi Sakamoto [Wed, 31 Aug 2016 13:58:42 +0000 (22:58 +0900)]
ALSA: fireworks: accessing to user space outside spinlock

commit 6b1ca4bcadf9ef077cc5f03c6822ba276ed14902 upstream.

In hwdep interface of fireworks driver, accessing to user space is in a
critical section with disabled local interrupt. Depending on architecture,
accessing to user space can cause page fault exception. Then local
processor stores machine status and handles the synchronous event. A
handler corresponding to the event can call task scheduler to wait for
preparing pages. In a case of usage of single core processor, the state to
disable local interrupt is worse because it don't handle usual interrupts
from hardware.

This commit fixes this bug, performing the accessing outside spinlock. This
commit also gives up counting the number of queued response messages to
simplify ring-buffer management.

Reported-by: Vaishali Thakkar <vaishali.thakkar@oracle.com>
Fixes: 555e8a8f7f14('ALSA: fireworks: Add command/response functionality into hwdep interface')
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: firewire-tascam: accessing to user space outside spinlock
Takashi Sakamoto [Wed, 31 Aug 2016 11:15:32 +0000 (20:15 +0900)]
ALSA: firewire-tascam: accessing to user space outside spinlock

commit 04b2d9c9c319277ad4fbbb71855c256a9f4d5f98 upstream.

In hwdep interface of firewire-tascam driver, accessing to user space is
in a critical section with disabled local interrupt. Depending on
architecture, accessing to user space can cause page fault exception. Then
local processor stores machine status and handle the synchronous event. A
handler corresponding to the event can call task scheduler to wait for
preparing pages. In a case of usage of single core processor, the state to
disable local interrupt is worse because it doesn't handle usual interrupts
from hardware.

This commit fixes this bug, by performing the accessing outside spinlock.

Reported-by: Vaishali Thakkar <vaishali.thakkar@oracle.com>
Fixes: e5e0c3dd257b('ALSA: firewire-tascam: add hwdep interface')
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: usb-audio: Add sample rate inquiry quirk for B850V3 CP2114
Ken Lin [Fri, 12 Aug 2016 18:08:47 +0000 (14:08 -0400)]
ALSA: usb-audio: Add sample rate inquiry quirk for B850V3 CP2114

commit 83d9956b7e6b310c1062df7894257251c625b22e upstream.

Avoid getting sample rate on B850V3 CP2114 as it is unsupported and
causes noisy "current rate is different from the runtime rate" messages
when playback starts.

Signed-off-by: Ken Lin <ken.lin@advantech.com.tw>
Signed-off-by: Akshay Bhat <akshay.bhat@timesys.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agocrypto: caam - fix IV loading for authenc (giv)decryption
Horia Geantă [Mon, 29 Aug 2016 11:52:14 +0000 (14:52 +0300)]
crypto: caam - fix IV loading for authenc (giv)decryption

commit 8b18e2359aff2ab810aba84cebffc9da07fef78f upstream.

For algorithms that implement IV generators before the crypto ops,
the IV needed for decryption is initially located in req->src
scatterlist, not in req->iv.

Avoid copying the IV into req->iv by modifying the (givdecrypt)
descriptors to load it directly from req->src.
aead_givdecrypt() is no longer needed and goes away.

Fixes: 479bcc7c5b9e ("crypto: caam - Convert authenc to new AEAD interface")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agouprobes: Fix the memcg accounting
Oleg Nesterov [Wed, 17 Aug 2016 15:36:29 +0000 (17:36 +0200)]
uprobes: Fix the memcg accounting

commit 6c4687cc17a788a6dd8de3e27dbeabb7cbd3e066 upstream.

__replace_page() wronlgy calls mem_cgroup_cancel_charge() in "success" path,
it should only do this if page_check_address() fails.

This means that every enable/disable leads to unbalanced mem_cgroup_uncharge()
from put_page(old_page), it is trivial to underflow the page_counter->count
and trigger OOM.

Reported-and-tested-by: Brenden Blanco <bblanco@plumgrid.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Fixes: 00501b531c47 ("mm: memcontrol: rewrite charge API")
Link: http://lkml.kernel.org/r/20160817153629.GB29724@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agox86/apic: Do not init irq remapping if ioapic is disabled
Wanpeng Li [Tue, 23 Aug 2016 12:07:19 +0000 (20:07 +0800)]
x86/apic: Do not init irq remapping if ioapic is disabled

commit 2e63ad4bd5dd583871e6602f9d398b9322d358d9 upstream.

native_smp_prepare_cpus
  -> default_setup_apic_routing
    -> enable_IR_x2apic
      -> irq_remapping_prepare
        -> intel_prepare_irq_remapping
          -> intel_setup_irq_remapping

So IR table is setup even if "noapic" boot parameter is added. As a result we
crash later when the interrupt affinity is set due to a half initialized
remapping infrastructure.

Prevent remap initialization when IOAPIC is disabled.

Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Link: http://lkml.kernel.org/r/1471954039-3942-1-git-send-email-wanpeng.li@hotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agovhost/scsi: fix reuse of &vq->iov[out] in response
Benjamin Coddington [Mon, 6 Jun 2016 22:07:59 +0000 (18:07 -0400)]
vhost/scsi: fix reuse of &vq->iov[out] in response

commit a77ec83a57890240c546df00ca5df1cdeedb1cc3 upstream.

The address of the iovec &vq->iov[out] is not guaranteed to contain the scsi
command's response iovec throughout the lifetime of the command.  Rather, it
is more likely to contain an iovec from an immediately following command
after looping back around to vhost_get_vq_desc().  Pass along the iovec
entirely instead.

Fixes: 79c14141a487 ("vhost/scsi: Convert completion path to use copy_to_iter")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agobcache: RESERVE_PRIO is too small by one when prio_buckets() is a power of two.
Kent Overstreet [Thu, 18 Aug 2016 01:21:24 +0000 (18:21 -0700)]
bcache: RESERVE_PRIO is too small by one when prio_buckets() is a power of two.

commit acc9cf8c66c66b2cbbdb4a375537edee72be64df upstream.

This patch fixes a cachedev registration-time allocation deadlock.
This can deadlock on boot if your initrd auto-registeres bcache devices:

Allocator thread:
[  720.727614] INFO: task bcache_allocato:3833 blocked for more than 120 seconds.
[  720.732361]  [<ffffffff816eeac7>] schedule+0x37/0x90
[  720.732963]  [<ffffffffa05192b8>] bch_bucket_alloc+0x188/0x360 [bcache]
[  720.733538]  [<ffffffff810e6950>] ? prepare_to_wait_event+0xf0/0xf0
[  720.734137]  [<ffffffffa05302bd>] bch_prio_write+0x19d/0x340 [bcache]
[  720.734715]  [<ffffffffa05190bf>] bch_allocator_thread+0x3ff/0x470 [bcache]
[  720.735311]  [<ffffffff816ee41c>] ? __schedule+0x2dc/0x950
[  720.735884]  [<ffffffffa0518cc0>] ? invalidate_buckets+0x980/0x980 [bcache]

Registration thread:
[  720.710403] INFO: task bash:3531 blocked for more than 120 seconds.
[  720.715226]  [<ffffffff816eeac7>] schedule+0x37/0x90
[  720.715805]  [<ffffffffa05235cd>] __bch_btree_map_nodes+0x12d/0x150 [bcache]
[  720.716409]  [<ffffffffa0522d30>] ? bch_btree_insert_check_key+0x1c0/0x1c0 [bcache]
[  720.717008]  [<ffffffffa05236e4>] bch_btree_insert+0xf4/0x170 [bcache]
[  720.717586]  [<ffffffff810e6950>] ? prepare_to_wait_event+0xf0/0xf0
[  720.718191]  [<ffffffffa0527d9a>] bch_journal_replay+0x14a/0x290 [bcache]
[  720.718766]  [<ffffffff810cc90d>] ? ttwu_do_activate.constprop.94+0x5d/0x70
[  720.719369]  [<ffffffff810cf684>] ? try_to_wake_up+0x1d4/0x350
[  720.719968]  [<ffffffffa05317d0>] run_cache_set+0x580/0x8e0 [bcache]
[  720.720553]  [<ffffffffa053302e>] register_bcache+0xe2e/0x13b0 [bcache]
[  720.721153]  [<ffffffff81354cef>] kobj_attr_store+0xf/0x20
[  720.721730]  [<ffffffff812a2dad>] sysfs_kf_write+0x3d/0x50
[  720.722327]  [<ffffffff812a225a>] kernfs_fop_write+0x12a/0x180
[  720.722904]  [<ffffffff81225177>] __vfs_write+0x37/0x110
[  720.723503]  [<ffffffff81228048>] ? __sb_start_write+0x58/0x110
[  720.724100]  [<ffffffff812cedb3>] ? security_file_permission+0x23/0xa0
[  720.724675]  [<ffffffff812258a9>] vfs_write+0xa9/0x1b0
[  720.725275]  [<ffffffff8102479c>] ? do_audit_syscall_entry+0x6c/0x70
[  720.725849]  [<ffffffff81226755>] SyS_write+0x55/0xd0
[  720.726451]  [<ffffffff8106a390>] ? do_page_fault+0x30/0x80
[  720.727045]  [<ffffffff816f2cae>] system_call_fastpath+0x12/0x71

The fifo code in upstream bcache can't use the last element in the buffer,
which was the cause of the bug: if you asked for a power of two size,
it'd give you a fifo that could hold one less than what you asked for
rather than allocating a buffer twice as big.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Tested-by: Eric Wheeler <bcache@linux.ewheeler.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoubifs: Fix assertion in layout_in_gaps()
Vincent Stehlé [Fri, 12 Aug 2016 13:26:30 +0000 (15:26 +0200)]
ubifs: Fix assertion in layout_in_gaps()

commit c0082e985fdf77b02fc9e0dac3b58504dcf11b7a upstream.

An assertion in layout_in_gaps() verifies that the gap_lebs pointer is
below the maximum bound. When computing this maximum bound the idx_lebs
count is multiplied by sizeof(int), while C pointers arithmetic does take
into account the size of the pointed elements implicitly already. Remove
the multiplication to fix the assertion.

Fixes: 1e51764a3c2ac05a ("UBIFS: add new flash file system")
Signed-off-by: Vincent Stehlé <vincent.stehle@intel.com>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoovl: fix workdir creation
Miklos Szeredi [Mon, 5 Sep 2016 11:55:20 +0000 (13:55 +0200)]
ovl: fix workdir creation

commit e1ff3dd1ae52cef5b5373c8cc4ad949c2c25a71c upstream.

Workdir creation fails in latest kernel.

Fix by allowing EOPNOTSUPP as a valid return value from
vfs_removexattr(XATTR_NAME_POSIX_ACL_*).  Upper filesystem may not support
ACL and still be perfectly able to support overlayfs.

Reported-by: Martin Ziegler <ziegler@uni-freiburg.de>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: c11b9fdd6a61 ("ovl: remove posix_acl_default from workdir")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoovl: listxattr: use strnlen()
Miklos Szeredi [Thu, 1 Sep 2016 09:12:00 +0000 (11:12 +0200)]
ovl: listxattr: use strnlen()

commit 7cb35119d067191ce9ebc380a599db0b03cbd9d9 upstream.

Be defensive about what underlying fs provides us in the returned xattr
list buffer.  If it's not properly null terminated, bail out with a warning
insead of BUG.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoovl: remove posix_acl_default from workdir
Miklos Szeredi [Thu, 1 Sep 2016 09:11:59 +0000 (11:11 +0200)]
ovl: remove posix_acl_default from workdir

commit c11b9fdd6a612f376a5e886505f1c54c16d8c380 upstream.

Clear out posix acl xattrs on workdir and also reset the mode after
creation so that an inherited sgid bit is cleared.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoovl: don't copy up opaqueness
Miklos Szeredi [Mon, 8 Aug 2016 13:08:49 +0000 (15:08 +0200)]
ovl: don't copy up opaqueness

commit 0956254a2d5b9e2141385514553aeef694dfe3b5 upstream.

When a copy up of a directory occurs which has the opaque xattr set, the
xattr remains in the upper directory. The immediate behavior with overlayfs
is that the upper directory is not treated as opaque, however after a
remount the opaque flag is used and upper directory is treated as opaque.
This causes files created in the lower layer to be hidden when using
multiple lower directories.

Fix by not copying up the opaque flag.

To reproduce:

 ----8<---------8<---------8<---------8<---------8<---------8<----
mkdir -p l/d/s u v w mnt
mount -t overlay overlay -olowerdir=l,upperdir=u,workdir=w mnt
rm -rf mnt/d/
mkdir -p mnt/d/n
umount mnt
mount -t overlay overlay -olowerdir=u:l,upperdir=v,workdir=w mnt
touch mnt/d/foo
umount mnt
mount -t overlay overlay -olowerdir=u:l,upperdir=v,workdir=w mnt
ls mnt/d
 ----8<---------8<---------8<---------8<---------8<---------8<----

output should be:  "foo  n"

Reported-by: Derek McGowan <dmcg@drizz.net>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=151291
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agowrappers for ->i_mutex access
Al Viro [Fri, 22 Jan 2016 20:40:57 +0000 (15:40 -0500)]
wrappers for ->i_mutex access

commit 5955102c9984fa081b2d570cfac75c97eecf8f3b upstream

parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).

Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[only the fs.h change included to make backports easier - gregkh]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agolustre: remove unused declaration
Al Viro [Fri, 22 Jan 2016 20:34:16 +0000 (15:34 -0500)]
lustre: remove unused declaration

commit 57b8f112cfe6622ddddb8c2641206bb5fa8a112d upstream.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agotimekeeping: Avoid taking lock in NMI path with CONFIG_DEBUG_TIMEKEEPING
John Stultz [Tue, 23 Aug 2016 23:08:21 +0000 (16:08 -0700)]
timekeeping: Avoid taking lock in NMI path with CONFIG_DEBUG_TIMEKEEPING

commit 27727df240c7cc84f2ba6047c6f18d5addfd25ef upstream.

When I added some extra sanity checking in timekeeping_get_ns() under
CONFIG_DEBUG_TIMEKEEPING, I missed that the NMI safe __ktime_get_fast_ns()
method was using timekeeping_get_ns().

Thus the locking added to the debug checks broke the NMI-safety of
__ktime_get_fast_ns().

This patch open-codes the timekeeping_get_ns() logic for
__ktime_get_fast_ns(), so can avoid any deadlocks in NMI.

Fixes: 4ca22c2648f9 "timekeeping: Add warnings when overflows or underflows are observed"
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/1471993702-29148-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agotimekeeping: Cap array access in timekeeping_debug
John Stultz [Tue, 23 Aug 2016 23:08:22 +0000 (16:08 -0700)]
timekeeping: Cap array access in timekeeping_debug

commit a4f8f6667f099036c88f231dcad4cf233652c824 upstream.

It was reported that hibernation could fail on the 2nd attempt, where the
system hangs at hibernate() -> syscore_resume() -> i8237A_resume() ->
claim_dma_lock(), because the lock has already been taken.

However there is actually no other process would like to grab this lock on
that problematic platform.

Further investigation showed that the problem is triggered by setting
/sys/power/pm_trace to 1 before the 1st hibernation.

Since once pm_trace is enabled, the rtc becomes unmeaningful after suspend,
and meanwhile some BIOSes would like to adjust the 'invalid' RTC (e.g, smaller
than 1970) to the release date of that motherboard during POST stage, thus
after resumed, it may seem that the system had a significant long sleep time
which is a completely meaningless value.

Then in timekeeping_resume -> tk_debug_account_sleep_time, if the bit31 of the
sleep time happened to be set to 1, fls() returns 32 and we add 1 to
sleep_time_bin[32], which causes an out of bounds array access and therefor
memory being overwritten.

As depicted by System.map:
0xffffffff81c9d080 b sleep_time_bin
0xffffffff81c9d100 B dma_spin_lock
the dma_spin_lock.val is set to 1, which caused this problem.

This patch adds a sanity check in tk_debug_account_sleep_time()
to ensure we don't index past the sleep_time_bin array.

[jstultz: Problem diagnosed and original patch by Chen Yu, I've solved the
 issue slightly differently, but borrowed his excelent explanation of the
 issue here.]

Fixes: 5c83545f24ab "power: Add option to log time spent in suspend"
Reported-by: Janek Kozicki <cosurgi@gmail.com>
Reported-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: linux-pm@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xunlei Pang <xpang@redhat.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Zhang Rui <rui.zhang@intel.com>
Link: http://lkml.kernel.org/r/1471993702-29148-3-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoxfs: fix superblock inprogress check
Dave Chinner [Fri, 26 Aug 2016 06:01:30 +0000 (16:01 +1000)]
xfs: fix superblock inprogress check

commit f3d7ebdeb2c297bd26272384e955033493ca291c upstream.

From inspection, the superblock sb_inprogress check is done in the
verifier and triggered only for the primary superblock via a
"bp->b_bn == XFS_SB_DADDR" check.

Unfortunately, the primary superblock is an uncached buffer, and
hence it is configured by xfs_buf_read_uncached() with:

bp->b_bn = XFS_BUF_DADDR_NULL;  /* always null for uncached buffers */

And so this check never triggers. Fix it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoASoC: atmel_ssc_dai: Don't unconditionally reset SSC on stream startup
Christoph Huber [Mon, 15 Aug 2016 16:59:25 +0000 (18:59 +0200)]
ASoC: atmel_ssc_dai: Don't unconditionally reset SSC on stream startup

commit 3e103a65514c2947e53f3171b21255fbde8b60c6 upstream.

commit cbaadf0f90d6 ("ASoC: atmel_ssc_dai: refactor the startup and
shutdown") refactored code such that the SSC is reset on every
startup; this breaks duplex audio (e.g. first start audio playback,
then start record, causing the playback to stop/hang)

Fixes: cbaadf0f90d6 (ASoC: atmel_ssc_dai: refactor the startup and shutdown)
Signed-off-by: Christoph Huber <c.huber@bct-electronic.com>
Signed-off-by: Peter Meerwald-Stadler <p.meerwald@bct-electronic.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/msm: fix use of copy_from_user() while holding spinlock
Rob Clark [Mon, 22 Aug 2016 19:15:23 +0000 (15:15 -0400)]
drm/msm: fix use of copy_from_user() while holding spinlock

commit 89f82cbb0d5c0ab768c8d02914188aa2211cd2e3 upstream.

Use instead __copy_from_user_inatomic() and fallback to slow-path where
we drop and re-aquire the lock in case of fault.

Reported-by: Vaishali Thakkar <vaishali.thakkar@oracle.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm: Reject page_flip for !DRIVER_MODESET
Daniel Vetter [Sat, 20 Aug 2016 10:22:11 +0000 (12:22 +0200)]
drm: Reject page_flip for !DRIVER_MODESET

commit 6f00975c619064a18c23fd3aced325ae165a73b9 upstream.

Somehow this one slipped through, which means drivers without modeset
support can be oopsed (since those also don't call
drm_mode_config_init, which means the crtc lookup will chase an
uninitalized idr).

Reported-by: Alexander Potapenko <glider@google.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/radeon: fix radeon_move_blit on 32bit systems
Christian König [Wed, 17 Aug 2016 07:46:42 +0000 (09:46 +0200)]
drm/radeon: fix radeon_move_blit on 32bit systems

commit 13f479b9df4e2bbf2d16e7e1b02f3f55f70e2455 upstream.

This bug seems to be present for a very long time.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agos390/sclp_ctl: fix potential information leak with /dev/sclp
Martin Schwidefsky [Mon, 25 Apr 2016 15:54:28 +0000 (17:54 +0200)]
s390/sclp_ctl: fix potential information leak with /dev/sclp

commit 532c34b5fbf1687df63b3fcd5b2846312ac943c6 upstream.

The sclp_ctl_ioctl_sccb function uses two copy_from_user calls to
retrieve the sclp request from user space. The first copy_from_user
fetches the length of the request which is stored in the first two
bytes of the request. The second copy_from_user gets the complete
sclp request, but this copies the length field a second time.
A malicious user may have changed the length in the meantime.

Reported-by: Pengfei Wang <wpengfeinudt@gmail.com>
Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Juerg Haefliger <juerg.haefliger@hpe.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agords: fix an infoleak in rds_inc_info_copy
Kangjie Lu [Thu, 2 Jun 2016 08:11:20 +0000 (04:11 -0400)]
rds: fix an infoleak in rds_inc_info_copy

commit 4116def2337991b39919f3b448326e21c40e0dbb upstream.

The last field "flags" of object "minfo" is not initialized.
Copying this object out may leak kernel stack data.
Assign 0 to it to avoid leak.

Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Juerg Haefliger <juerg.haefliger@hpe.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agopowerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0
Michael Neuling [Tue, 28 Jun 2016 03:01:04 +0000 (13:01 +1000)]
powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0

commit 190ce8693c23eae09ba5f303a83bf2fbeb6478b1 upstream.

Currently we have 2 segments that are bolted for the kernel linear
mapping (ie 0xc000... addresses). This is 0 to 1TB and also the kernel
stacks. Anything accessed outside of these regions may need to be
faulted in. (In practice machines with TM always have 1T segments)

If a machine has < 2TB of memory we never fault on the kernel linear
mapping as these two segments cover all physical memory. If a machine
has > 2TB of memory, there may be structures outside of these two
segments that need to be faulted in. This faulting can occur when
running as a guest as the hypervisor may remove any SLB that's not
bolted.

When we treclaim and trecheckpoint we have a window where we need to
run with the userspace GPRs. This means that we no longer have a valid
stack pointer in r1. For this window we therefore clear MSR RI to
indicate that any exceptions taken at this point won't be able to be
handled. This means that we can't take segment misses in this RI=0
window.

In this RI=0 region, we currently access the thread_struct for the
process being context switched to or from. This thread_struct access
may cause a segment fault since it's not guaranteed to be covered by
the two bolted segment entries described above.

We've seen this with a crash when running as a guest with > 2TB of
memory on PowerVM:

  Unrecoverable exception 4100 at c00000000004f138
  Oops: Unrecoverable exception, sig: 6 [#1]
  SMP NR_CPUS=2048 NUMA pSeries
  CPU: 1280 PID: 7755 Comm: kworker/1280:1 Tainted: G                 X 4.4.13-46-default #1
  task: c000189001df4210 ti: c000189001d5c000 task.ti: c000189001d5c000
  NIP: c00000000004f138 LR: 0000000010003a24 CTR: 0000000010001b20
  REGS: c000189001d5f730 TRAP: 4100   Tainted: G                 X  (4.4.13-46-default)
  MSR: 8000000100001031 <SF,ME,IR,DR,LE>  CR: 24000048  XER: 00000000
  CFAR: c00000000004ed18 SOFTE: 0
  GPR00: ffffffffc58d7b60 c000189001d5f9b0 00000000100d7d00 000000003a738288
  GPR04: 0000000000002781 0000000000000006 0000000000000000 c0000d1f4d889620
  GPR08: 000000000000c350 00000000000008ab 00000000000008ab 00000000100d7af0
  GPR12: 00000000100d7ae8 00003ffe787e67a0 0000000000000000 0000000000000211
  GPR16: 0000000010001b20 0000000000000000 0000000000800000 00003ffe787df110
  GPR20: 0000000000000001 00000000100d1e10 0000000000000000 00003ffe787df050
  GPR24: 0000000000000003 0000000000010000 0000000000000000 00003fffe79e2e30
  GPR28: 00003fffe79e2e68 00000000003d0f00 00003ffe787e67a0 00003ffe787de680
  NIP [c00000000004f138] restore_gprs+0xd0/0x16c
  LR [0000000010003a24] 0x10003a24
  Call Trace:
  [c000189001d5f9b0] [c000189001d5f9f0] 0xc000189001d5f9f0 (unreliable)
  [c000189001d5fb90] [c00000000001583c] tm_recheckpoint+0x6c/0xa0
  [c000189001d5fbd0] [c000000000015c40] __switch_to+0x2c0/0x350
  [c000189001d5fc30] [c0000000007e647c] __schedule+0x32c/0x9c0
  [c000189001d5fcb0] [c0000000007e6b58] schedule+0x48/0xc0
  [c000189001d5fce0] [c0000000000deabc] worker_thread+0x22c/0x5b0
  [c000189001d5fd80] [c0000000000e7000] kthread+0x110/0x130
  [c000189001d5fe30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
  Instruction dump:
  7cb103a6 7cc0e3a6 7ca222a6 78a58402 38c00800 7cc62838 08860000 7cc000a6
  38a00006 78c60022 7cc62838 0b060000 <e8c701a07ccff120 e8270078 e8a70098
  ---[ end trace 602126d0a1dedd54 ]---

This fixes this by copying the required data from the thread_struct to
the stack before we clear MSR RI. Then once we clear RI, we only access
the stack, guaranteeing there's no segment miss.

We also tighten the region over which we set RI=0 on the treclaim()
path. This may have a slight performance impact since we're adding an
mtmsr instruction.

Fixes: 090b9284d725 ("powerpc/tm: Clear MSR RI in non-recoverable TM code")
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonvme: Call pci_disable_device on the error path.
Gabriel Krisman Bertazi [Thu, 8 Sep 2016 21:10:23 +0000 (18:10 -0300)]
nvme: Call pci_disable_device on the error path.

Commit 5706aca74fe4 ("NVMe: Don't unmap controller registers on reset"),
which backported b00a726a9fd8 to the 4.4.y kernel introduced a
regression in which it didn't call pci_disable_device in the error path
of nvme_pci_enable.

Reported-by: Jiri Slaby <jslaby@suse.cz>
Embarassed-developer: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agocgroup: reduce read locked section of cgroup_threadgroup_rwsem during fork
Balbir Singh [Wed, 10 Aug 2016 19:43:06 +0000 (15:43 -0400)]
cgroup: reduce read locked section of cgroup_threadgroup_rwsem during fork

commit 568ac888215c7fb2fabe8ea739b00ec3c1f5d440 upstream.

cgroup_threadgroup_rwsem is acquired in read mode during process exit
and fork.  It is also grabbed in write mode during
__cgroups_proc_write().  I've recently run into a scenario with lots
of memory pressure and OOM and I am beginning to see

systemd

 __switch_to+0x1f8/0x350
 __schedule+0x30c/0x990
 schedule+0x48/0xc0
 percpu_down_write+0x114/0x170
 __cgroup_procs_write.isra.12+0xb8/0x3c0
 cgroup_file_write+0x74/0x1a0
 kernfs_fop_write+0x188/0x200
 __vfs_write+0x6c/0xe0
 vfs_write+0xc0/0x230
 SyS_write+0x6c/0x110
 system_call+0x38/0xb4

This thread is waiting on the reader of cgroup_threadgroup_rwsem to
exit.  The reader itself is under memory pressure and has gone into
reclaim after fork. There are times the reader also ends up waiting on
oom_lock as well.

 __switch_to+0x1f8/0x350
 __schedule+0x30c/0x990
 schedule+0x48/0xc0
 jbd2_log_wait_commit+0xd4/0x180
 ext4_evict_inode+0x88/0x5c0
 evict+0xf8/0x2a0
 dispose_list+0x50/0x80
 prune_icache_sb+0x6c/0x90
 super_cache_scan+0x190/0x210
 shrink_slab.part.15+0x22c/0x4c0
 shrink_zone+0x288/0x3c0
 do_try_to_free_pages+0x1dc/0x590
 try_to_free_pages+0xdc/0x260
 __alloc_pages_nodemask+0x72c/0xc90
 alloc_pages_current+0xb4/0x1a0
 page_table_alloc+0xc0/0x170
 __pte_alloc+0x58/0x1f0
 copy_page_range+0x4ec/0x950
 copy_process.isra.5+0x15a0/0x1870
 _do_fork+0xa8/0x4b0
 ppc_clone+0x8/0xc

In the meanwhile, all processes exiting/forking are blocked almost
stalling the system.

This patch moves the threadgroup_change_begin from before
cgroup_fork() to just before cgroup_canfork().  There is no nee to
worry about threadgroup changes till the task is actually added to the
threadgroup.  This avoids having to call reclaim with
cgroup_threadgroup_rwsem held.

tj: Subject and description edits.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Zefan Li <lizefan@huawei.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoblock: make sure a big bio is split into at most 256 bvecs
Ming Lei [Tue, 23 Aug 2016 13:49:45 +0000 (21:49 +0800)]
block: make sure a big bio is split into at most 256 bvecs

commit 4d70dca4eadf2f95abe389116ac02b8439c2d16c upstream.

After arbitrary bio size was introduced, the incoming bio may
be very big. We have to split the bio into small bios so that
each holds at most BIO_MAX_PAGES bvecs for safety reason, such
as bio_clone().

This patch fixes the following kernel crash:

> [  172.660142] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
> [  172.660229] IP: [<ffffffff811e53b4>] bio_trim+0xf/0x2a
> [  172.660289] PGD 7faf3e067 PUD 7f9279067 PMD 0
> [  172.660399] Oops: 0000 [#1] SMP
> [...]
> [  172.664780] Call Trace:
> [  172.664813]  [<ffffffffa007f3be>] ? raid1_make_request+0x2e8/0xad7 [raid1]
> [  172.664846]  [<ffffffff811f07da>] ? blk_queue_split+0x377/0x3d4
> [  172.664880]  [<ffffffffa005fb5f>] ? md_make_request+0xf6/0x1e9 [md_mod]
> [  172.664912]  [<ffffffff811eb860>] ? generic_make_request+0xb5/0x155
> [  172.664947]  [<ffffffffa0445c89>] ? prio_io+0x85/0x95 [bcache]
> [  172.664981]  [<ffffffffa0448252>] ? register_cache_set+0x355/0x8d0 [bcache]
> [  172.665016]  [<ffffffffa04497d3>] ? register_bcache+0x1006/0x1174 [bcache]

The issue can be reproduced by the following steps:
- create one raid1 over two virtio-blk
- build bcache device over the above raid1 and another cache device
and bucket size is set as 2Mbytes
- set cache mode as writeback
- run random write over ext4 on the bcache device

Fixes: 54efd50(block: make generic_make_request handle arbitrarily sized bios)
Reported-by: Sebastian Roesner <sroesner-kernelorg@roesner-online.de>
Reported-by: Eric Wheeler <bcache@lists.ewheeler.net>
Cc: Shaohua Li <shli@fb.com>
Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoblock: Fix race triggered by blk_set_queue_dying()
Bart Van Assche [Tue, 16 Aug 2016 23:48:36 +0000 (16:48 -0700)]
block: Fix race triggered by blk_set_queue_dying()

commit 1b856086813be9371929b6cc62045f9fd470f5a0 upstream.

blk_set_queue_dying() can be called while another thread is
submitting I/O or changing queue flags, e.g. through dm_stop_queue().
Hence protect the QUEUE_FLAG_DYING flag change with locking.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoext4: avoid modifying checksum fields directly during checksum verification
Daeho Jeong [Sun, 3 Jul 2016 21:51:39 +0000 (17:51 -0400)]
ext4: avoid modifying checksum fields directly during checksum verification

commit b47820edd1634dc1208f9212b7ecfb4230610a23 upstream.

We temporally change checksum fields in buffers of some types of
metadata into '0' for verifying the checksum values. By doing this
without locking the buffer, some metadata's checksums, which are
being committed or written back to the storage, could be damaged.
In our test, several metadata blocks were found with damaged metadata
checksum value during recovery process. When we only verify the
checksum value, we have to avoid modifying checksum fields directly.

Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com>
Signed-off-by: Youngjin Gil <youngjin.gil@samsung.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Török Edwin <edwin@etorok.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoext4: avoid deadlock when expanding inode size
Jan Kara [Thu, 11 Aug 2016 16:38:55 +0000 (12:38 -0400)]
ext4: avoid deadlock when expanding inode size

commit 2e81a4eeedcaa66e35f58b81e0755b87057ce392 upstream.

When we need to move xattrs into external xattr block, we call
ext4_xattr_block_set() from ext4_expand_extra_isize_ea(). That may end
up calling ext4_mark_inode_dirty() again which will recurse back into
the inode expansion code leading to deadlocks.

Protect from recursion using EXT4_STATE_NO_EXPAND inode flag and move
its management into ext4_expand_extra_isize_ea() since its manipulation
is safe there (due to xattr_sem) from possible races with
ext4_xattr_set_handle() which plays with it as well.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoext4: properly align shifted xattrs when expanding inodes
Jan Kara [Thu, 11 Aug 2016 16:00:01 +0000 (12:00 -0400)]
ext4: properly align shifted xattrs when expanding inodes

commit 443a8c41cd49de66a3fda45b32b9860ea0292b84 upstream.

We did not count with the padding of xattr value when computing desired
shift of xattrs in the inode when expanding i_extra_isize. As a result
we could create unaligned start of inline xattrs. Account for alignment
properly.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoext4: fix xattr shifting when expanding inodes part 2
Jan Kara [Thu, 11 Aug 2016 15:58:32 +0000 (11:58 -0400)]
ext4: fix xattr shifting when expanding inodes part 2

commit 418c12d08dc64a45107c467ec1ba29b5e69b0715 upstream.

When multiple xattrs need to be moved out of inode, we did not properly
recompute total size of xattr headers in the inode and the new header
position. Thus when moving the second and further xattr we asked
ext4_xattr_shift_entries() to move too much and from the wrong place,
resulting in possible xattr value corruption or general memory
corruption.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoext4: fix xattr shifting when expanding inodes
Jan Kara [Thu, 11 Aug 2016 15:50:30 +0000 (11:50 -0400)]
ext4: fix xattr shifting when expanding inodes

commit d0141191a20289f8955c1e03dad08e42e6f71ca9 upstream.

The code in ext4_expand_extra_isize_ea() treated new_extra_isize
argument sometimes as the desired target i_extra_isize and sometimes as
the amount by which we need to grow current i_extra_isize. These happen
to coincide when i_extra_isize is 0 which used to be the common case and
so nobody noticed this until recently when we added i_projid to the
inode and so i_extra_isize now needs to grow from 28 to 32 bytes.

The result of these bugs was that we sometimes unnecessarily decided to
move xattrs out of inode even if there was enough space and we often
ended up corrupting in-inode xattrs because arguments to
ext4_xattr_shift_entries() were just wrong. This could demonstrate
itself as BUG_ON in ext4_xattr_shift_entries() triggering.

Fix the problem by introducing new isize_diff variable and use it where
appropriate.

Reported-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoext4: validate that metadata blocks do not overlap superblock
Theodore Ts'o [Mon, 1 Aug 2016 04:51:02 +0000 (00:51 -0400)]
ext4: validate that metadata blocks do not overlap superblock

commit 829fa70dddadf9dd041d62b82cd7cea63943899d upstream.

A number of fuzzing failures seem to be caused by allocation bitmaps
or other metadata blocks being pointed at the superblock.

This can cause kernel BUG or WARNings once the superblock is
overwritten, so validate the group descriptor blocks to make sure this
doesn't happen.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>