Huang, Tao [Wed, 10 Dec 2014 11:40:53 +0000 (19:40 +0800)]
ARM: rockchip: common: convert dsb() to dsb(sy)
Huang, Tao [Wed, 10 Dec 2014 11:36:30 +0000 (19:36 +0800)]
usb: dwc_otg_310: convert dsb() to dsb(sy)
Huang, Tao [Wed, 10 Dec 2014 11:35:54 +0000 (19:35 +0800)]
video: rockchip: rk3288_hdmi: convert dsb() to dsb(sy)
Huang, Tao [Wed, 10 Dec 2014 11:34:41 +0000 (19:34 +0800)]
video: rockchip: iep: convert dsb() to dsb(sy)
Huang, Tao [Wed, 10 Dec 2014 11:32:25 +0000 (19:32 +0800)]
video: rk3368_lcdc: convert dsb() to dsb(sy)
Huang, Tao [Wed, 10 Dec 2014 11:31:34 +0000 (19:31 +0800)]
media: rk30_camera_oneframe: convert dsb() to dsb(sy)
Huang, Tao [Wed, 10 Dec 2014 11:30:04 +0000 (19:30 +0800)]
rk_serial: convert dsb() to dsb(sy)
Huang, Tao [Wed, 10 Dec 2014 11:27:08 +0000 (19:27 +0800)]
clocksource: rockchip_timer: convert dsb() to dsb(sy)
Huang, Tao [Wed, 10 Dec 2014 11:25:48 +0000 (19:25 +0800)]
pwm: rockchip: convert dsb() to dsb(sy)
Huang, Tao [Wed, 10 Dec 2014 11:24:50 +0000 (19:24 +0800)]
irqchip: gic: convert dsb() to dsb(sy)
Huang, Tao [Wed, 10 Dec 2014 11:24:11 +0000 (19:24 +0800)]
mmc: rockchip: convert dsb() to dsb(sy)
Huang, Tao [Wed, 10 Dec 2014 11:23:04 +0000 (19:23 +0800)]
rockchip: clk: convert dsb() to dsb(sy)
Huang, Tao [Wed, 10 Dec 2014 11:21:35 +0000 (19:21 +0800)]
ethernet: gmac: convert dsb() to dsb(sy)
hjc [Tue, 9 Dec 2014 04:19:52 +0000 (12:19 +0800)]
rk3368 dtsi: modify dtsi for display module
Signed-off-by: hjc <hjc@rock-chips.com>
hjc [Tue, 9 Dec 2014 04:18:12 +0000 (12:18 +0800)]
rk3368 lcdc: add lcdc driver
Signed-off-by: hjc <hjc@rock-chips.com>
hjc [Tue, 9 Dec 2014 04:17:28 +0000 (12:17 +0800)]
rk31xx lvds: add support rk3368 lvds transmitter
Signed-off-by: hjc <hjc@rock-chips.com>
Huang, Tao [Tue, 9 Dec 2014 07:10:34 +0000 (15:10 +0800)]
Merge branch develop-3.10 into develop-3.10-next
hjc [Mon, 8 Dec 2014 10:24:22 +0000 (18:24 +0800)]
rk312x lcdc: fix vop csc config error
Mark Yao [Mon, 8 Dec 2014 08:55:21 +0000 (16:55 +0800)]
rk_fb: logo: support display bmp logo from uboot
Get the bmp file data from the bootargs "kernel_logo=xxxxxx" and decode
the bmp file into the framebuffer.
RLE 8-bit and 24-bit bmp files are currently supported.
Signed-off-by: Mark Yao <mark.yao@rock-chips.com>
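The RLE 8-bit format mentioned in this commit can be illustrated with a minimal decoder sketch. This is not the rk_fb code: it assumes the standard BMP RLE8 (BI_RLE8) encoding, and the function name decode_rle8 and its parameters are hypothetical.

```python
from typing import List


def decode_rle8(data: bytes, width: int, height: int) -> List[List[int]]:
    """Decode a BMP RLE8 stream into rows of 8-bit palette indices.

    Rows are returned in file order (BMP files usually store the bottom
    row first). Pixels never written by the stream stay 0.
    """
    rows = [[0] * width for _ in range(height)]
    x = y = 0
    i = 0
    while i + 1 < len(data) and y < height:
        count, value = data[i], data[i + 1]
        i += 2
        if count > 0:
            # Encoded mode: repeat the palette index `value` `count` times.
            for _ in range(count):
                if x < width:
                    rows[y][x] = value
                x += 1
        elif value == 0:
            # Escape 0: end of line.
            x, y = 0, y + 1
        elif value == 1:
            # Escape 1: end of bitmap.
            break
        elif value == 2:
            # Escape 2: delta, the next two bytes advance the cursor.
            if i + 1 >= len(data):
                break
            x += data[i]
            y += data[i + 1]
            i += 2
        else:
            # Absolute mode: `value` literal indices follow, padded to an
            # even number of bytes.
            for index in data[i:i + value]:
                if x < width:
                    rows[y][x] = index
                x += 1
            i += value + (value & 1)
    return rows
```

A 24-bit bmp needs no such step, since its pixel data is stored uncompressed.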
hjc [Mon, 8 Dec 2014 03:53:47 +0000 (11:53 +0800)]
rk fb: update fb config done info.
This commit depends on the hwc update,
including the following directories:
hardware/rk29/hwcomposer_rga/
hardware/rk29/libgralloc_ump/
hardware/libhardware/
typ [Mon, 8 Dec 2014 02:50:18 +0000 (10:50 +0800)]
RK3126/3126B DDR:fix ddr DQS1 drv set err
sugar [Mon, 8 Dec 2014 01:44:14 +0000 (09:44 +0800)]
Merge branch 'develop-3.10' of ssh://10.10.10.29/rk/kernel into rk30/box/4.4_r1/develop
sugar [Mon, 8 Dec 2014 01:40:26 +0000 (09:40 +0800)]
i2s: compatible with rk3126/rk3126b/rk3128.
sugar [Mon, 8 Dec 2014 01:11:51 +0000 (09:11 +0800)]
Revert "rk3126/rk3126b: i2s: use i2s_2ch."
This reverts commit 942d98b30dd2c9907b4f53c160a16ee4216745a0.
Huang, Tao [Fri, 5 Dec 2014 13:18:34 +0000 (21:18 +0800)]
Merge tag 'lsk-v3.10-android-14.11'
LSK Android 14.11 v3.10
Conflicts:
arch/arm/include/asm/cputype.h
lyz [Tue, 25 Nov 2014 11:33:13 +0000 (19:33 +0800)]
usb: cleanup useless struct usb20otg_pdata_id
许盛飞 [Fri, 5 Dec 2014 11:05:59 +0000 (19:05 +0800)]
RK3126B: RK3126B delete the EBC
Signed-off-by: 许盛飞 <xsf@rock-chips.com>
li bing [Fri, 5 Dec 2014 09:20:51 +0000 (17:20 +0800)]
wifi->esp8089: add wifi mac address user-defined function.
sugar [Fri, 5 Dec 2014 06:36:00 +0000 (14:36 +0800)]
rk3126/rk3126b: i2s: use i2s_2ch.
blb [Fri, 5 Dec 2014 05:59:29 +0000 (13:59 +0800)]
rk3128 & rk3036 : change the led color when power up and down
Signed-off-by: blb <blb@rockchips.com>
CMY [Fri, 28 Nov 2014 06:49:59 +0000 (14:49 +0800)]
rk: ion: fix dts parse failure on arm64
li bing [Fri, 5 Dec 2014 02:41:35 +0000 (10:41 +0800)]
wifi->esp8089: update ESP8089 driver to V1.9 (11272014).
Update the esp_prealloc program to V2.3.
The main changes in this update are:
1. Add support for the new MAC address and for the customized MAC address version;
2. Improve the robustness of the code;
3. Improve the consistency of the driver under Linux and Android;
4. Improve compatibility with non-standard APs;
5. Improve the stability of P2P mode.
hjc [Fri, 5 Dec 2014 00:46:11 +0000 (08:46 +0800)]
rk fb: mid does not support hdmi display from uboot, so we distinguish box and mid at switch screen
cl [Thu, 4 Dec 2014 02:12:52 +0000 (10:12 +0800)]
rockchip: avoid change ddr freq before lcd driver is inited
Signed-off-by: cl <cl@rock-chips.com>
cl [Wed, 3 Dec 2014 11:54:35 +0000 (19:54 +0800)]
rk3288: arm pvtm add RK3288_PROCESS_V2
Signed-off-by: cl <cl@rock-chips.com>
dkl [Fri, 28 Nov 2014 02:01:26 +0000 (10:01 +0800)]
rk3368: clk: fix address expression and some errors
Signed-off-by: dkl <dkl@rock-chips.com>
dkl [Thu, 13 Nov 2014 06:54:22 +0000 (14:54 +0800)]
rk3368: clk: add clocks-init and clocks-enable in DTS
Signed-off-by: dkl <dkl@rock-chips.com>
dkl [Wed, 12 Nov 2014 07:15:59 +0000 (15:15 +0800)]
rk3368: clk: add codes to make npll only used by dclk_vop
Signed-off-by: dkl <dkl@rock-chips.com>
dkl [Mon, 10 Nov 2014 11:14:50 +0000 (19:14 +0800)]
rk3368: clk: add rk3368_apllb_table/rk3368_aplll_table
Signed-off-by: dkl <dkl@rock-chips.com>
Huang, Tao [Wed, 3 Dec 2014 11:10:23 +0000 (19:10 +0800)]
Merge branch develop-3.10 into develop-3.10-next
Mark Yao [Wed, 3 Dec 2014 07:05:15 +0000 (15:05 +0800)]
rk_fb: sysfs: make use vmap/vunmap in pairs.
Signed-off-by: Mark Yao <mark.yao@rock-chips.com>
Mark Yao [Wed, 3 Dec 2014 02:05:26 +0000 (10:05 +0800)]
rk_fb: sysfs: add dump_buffer func to fb sysfs
Sometimes we want to know which buffer the vop is scanning. Using the "io"
command to dump the buffer is too complex, so add a sysfs node to help
dump the buffer.
How to use it:
- echo bin > /sys/class/graphics/fb0/disp_info
it will create a bin file at /data/xxx.bin
- or echo bmp > /sys/class/graphics/fb0/disp_info
it will create a bmp file at /data/xxx.bmp,
which is a normal bmp file.
Signed-off-by: Mark Yao <mark.yao@rock-chips.com>
Mark Yao [Wed, 3 Dec 2014 01:48:50 +0000 (09:48 +0800)]
ion: export ion handle get/put
With ion handle get/put, we can easily protect the buffer while we
use it.
Signed-off-by: Mark Yao <mark.yao@rock-chips.com>
Mark Yao [Mon, 1 Dec 2014 09:21:25 +0000 (17:21 +0800)]
rk_fb: use front_regs instead of some global variable
front_regs is the config that the vop device is currently scanning out
Signed-off-by: Mark Yao <mark.yao@rock-chips.com>
lyz [Tue, 2 Dec 2014 10:55:37 +0000 (18:55 +0800)]
usb: dwc_otg: fix incorrect bit operation
typ [Tue, 2 Dec 2014 02:25:05 +0000 (10:25 +0800)]
RK3126B DDR:add supporting DDR change freq
ljf [Mon, 1 Dec 2014 03:35:08 +0000 (11:35 +0800)]
hevc: add scaling list table patch in kernel; fix hevc video playback bug when the scaling list is enabled
Mark Yao [Mon, 1 Dec 2014 00:47:24 +0000 (08:47 +0800)]
rk_fb: fix iommu problem when hdmi plug or unplug.
Two threads update the win config: the update_regs handler
and the hdmi hotplug thread. The win config may be modified
unexpectedly by the other thread; the vop then scans an
unmapped address, which crashes the iommu. So use a mutex
to protect the win config.
Signed-off-by: Mark Yao <mark.yao@rock-chips.com>
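The locking idea in this commit can be sketched in user space as follows. This is only an illustration, not the rk_fb code: the class WinConfig, its fields, and the even/odd address convention are hypothetical, and Python's threading.Lock stands in for the kernel mutex.

```python
import threading


class WinConfig:
    """Shared window config: a buffer address plus whether it is mapped."""

    def __init__(self):
        self._lock = threading.Lock()
        self._addr = 0x1000
        self._mapped = True

    def update(self, addr, mapped):
        # Both fields change under the lock, so no reader sees a mix of
        # one writer's address and another writer's mapping state.
        with self._lock:
            self._addr = addr
            self._mapped = mapped

    def snapshot(self):
        with self._lock:
            return self._addr, self._mapped


def run_demo(iterations=2000):
    """Return the list of inconsistent snapshots seen by the reader."""
    cfg = WinConfig()
    bad = []

    def update_regs():   # stands in for the update_regs handler
        for n in range(iterations):
            cfg.update(0x1000 + n * 2, True)    # even address, mapped

    def hotplug():       # stands in for the hdmi hotplug thread
        for n in range(iterations):
            cfg.update(0x1001 + n * 2, False)   # odd address, unmapped

    def scan():          # stands in for the vop reading the config
        for _ in range(iterations):
            addr, mapped = cfg.snapshot()
            if (addr % 2 == 0) != mapped:
                bad.append((addr, mapped))

    threads = [threading.Thread(target=f) for f in (update_regs, hotplug, scan)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return bad
```

Because every read and write of the pair happens under the same lock, run_demo() returns an empty list.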
dalon.zhang [Sat, 29 Nov 2014 11:38:54 +0000 (19:38 +0800)]
camera : cif : v0.1.a Support rk3288 cif driver
ljf [Fri, 28 Nov 2014 07:10:59 +0000 (15:10 +0800)]
vcodec iommu: fix iommu pagefault caused by decoding some vp8 sources.
omit some iommu table creation.
Signed-off-by: ljf <ljf@rock-chips.com>
CMY [Fri, 28 Nov 2014 07:07:49 +0000 (15:07 +0800)]
rk: ion: change ion's debug node for other r/w
Mark Yao [Fri, 28 Nov 2014 06:32:01 +0000 (14:32 +0800)]
rk_fb: rk3128: fix crash when booting with hdmi plugged in
Enable the iommu when the first ion buffer takes effect.
Zheng Yang [Fri, 28 Nov 2014 06:13:16 +0000 (14:13 +0800)]
rk3036/rk3128 hdmi:
According to HDMI CTS 7-19, GCP SB1~SB6 value must be zero
if color mode is 24bit. So we enable reg04 bit4 which will
set CD[0:3] of SB1 to zero.
许盛飞 [Fri, 28 Nov 2014 01:51:09 +0000 (09:51 +0800)]
rk312xdts: reconfigure rk3126-sdk.dts
Signed-off-by: 许盛飞 <xsf@rock-chips.com>
smj [Fri, 28 Nov 2014 01:09:27 +0000 (09:09 +0800)]
rk3036:SDK enable sdmmc
rk88 disable pwm_regulator
Signed-off-by: smj <smj@rock-chips.com>
许盛飞 [Fri, 28 Nov 2014 01:11:21 +0000 (09:11 +0800)]
rk312x-sleep: arm-off and ddr_selfrefres controlled by software
Signed-off-by: 许盛飞 <xsf@rock-chips.com>
blb [Thu, 27 Nov 2014 12:40:42 +0000 (20:40 +0800)]
rk3128 & rk3036: add the power led support of box-rk88
Signed-off-by: Bai Longbiao <blb@rock-chips.com>
lintao [Thu, 27 Nov 2014 12:04:24 +0000 (20:04 +0800)]
Revert "mmc: add rto for infinit sending timeout loop"
This reverts commit 426495888a245034d2b60e8c032ed5bd725a44ac.
Mark Yao [Thu, 27 Nov 2014 08:33:18 +0000 (16:33 +0800)]
kernel logo: default disable CONFIG_LOGO
Most projects display the uboot logo in the kernel, so disable the
kernel logo by default.
Signed-off-by: Mark Yao <mark.yao@rock-chips.com>
lintao [Fri, 21 Nov 2014 00:32:23 +0000 (08:32 +0800)]
mmc: auto-pin when pm call for udbg
If the data lines are muxed with uart, the driver automatically works
around the pcl setting for the CARD_PRESENT state
Signed-off-by: lintao <lintao@rock-chips.com>
lintao [Thu, 20 Nov 2014 01:47:32 +0000 (09:47 +0800)]
mmc: add rto for infinit sending timeout loop
dw_mci writes the cmd index to the CMD register to trigger the BIU to send the cmd.
However, if the device panics and holds the cmd/data line low, the BIU can never
send the cmd out, so no cmd_done_int will arrive. Moreover, the cmd response timeout
is only valid after the cmd has been sent. Nothing breaks this loop, so we need a s/w
recovery from STATE_SENDING_CMD to STATE_IDLE, with the pending request reported as
-ETIMEDOUT, letting the caller decide how to try again.
Reported-by: roger.hu <hwg@rock-chips.com>
Signed-off-by: lintao <lintao@rock-chips.com>
Reviewed-and-tested-by: roger.hu <hwg@rock-chips.com>
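The software recovery described in this commit can be modelled as a small state machine. The sketch below is not the dw_mci code: the class Host, its tick-based timer, and the method names are hypothetical, and only the state names and the timeout error come from the commit message.

```python
import errno

STATE_IDLE = "STATE_IDLE"
STATE_SENDING_CMD = "STATE_SENDING_CMD"


class Host:
    """Toy model of a host controller with a software command timeout."""

    def __init__(self, timeout_ticks):
        self.state = STATE_IDLE
        self.timeout_ticks = timeout_ticks
        self.elapsed = 0
        self.result = None      # 0 on success, negative errno on failure

    def start_cmd(self):
        # Writing the command starts the send and arms the software timer.
        self.state = STATE_SENDING_CMD
        self.elapsed = 0
        self.result = None

    def cmd_done_irq(self):
        # Normal path: the command-done interrupt ends the send.
        if self.state == STATE_SENDING_CMD:
            self.state = STATE_IDLE
            self.result = 0

    def tick(self):
        # Recovery path: if no interrupt arrives in time, force the state
        # back to idle and report a timeout to the caller.
        if self.state != STATE_SENDING_CMD:
            return
        self.elapsed += 1
        if self.elapsed >= self.timeout_ticks:
            self.state = STATE_IDLE
            self.result = -errno.ETIMEDOUT
```

With the timer in place, a command that never completes ends with a timeout result instead of leaving the host in STATE_SENDING_CMD forever.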
Simon Xue [Thu, 27 Nov 2014 01:27:39 +0000 (09:27 +0800)]
rockchip: iommu: update iommu driver
1. The Audi vpu_combo contains hevc and vpu and needs to switch
between them when hevc or vpu is on, but an issue could cause
hevc or vpu to fail, so the current vpu driver disables/enables
the iommu on each frame to avoid the failure. This produces a lot
of log output, so change dev_info to dev_dbg on iommu attach/detach.
2. AudiB has fixed the vop read problem, so we use soc_is_rk3126 or
soc_is_rk3128 instead of cpu_is_rk312x to identify Audi.
Jaegeuk Kim [Mon, 3 Jun 2013 10:46:19 +0000 (19:46 +0900)]
f2fs: support xattr security labels
This patch adds the support of security labels for f2fs, which will be used
by Linux Security Modules (LSMs).
Quote from http://en.wikipedia.org/wiki/Linux_Security_Modules:
"Linux Security Modules (LSM) is a framework that allows the Linux kernel to
support a variety of computer security models while avoiding favoritism toward
any single security implementation. The framework is licensed under the terms of
the GNU General Public License and is standard part of the Linux kernel since
Linux 2.6. AppArmor, SELinux, Smack and TOMOYO Linux are the currently accepted
modules in the official kernel.".
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
(cherry picked from commit 8ae8f1627f39bae505b90cade50cd8a911b8bda6)
Jaegeuk Kim [Mon, 20 May 2013 01:10:29 +0000 (10:10 +0900)]
f2fs: update inode page after creation
I found a bug when testing power-off-recovery as follows.
[Bug Scenario]
1. create a file
2. fsync the file
3. reboot w/o any sync
4. try to recover the file
- found its fsync mark
- found its dentry mark
: try to recover its dentry
- get its file name
- get its parent inode number
: here we got zero value
The reason why we get the wrong parent inode number is that we didn't
synchronize the inode page with its newly created inode information perfectly.
Especially, previous f2fs stores fi->i_pino and writes it to the cached
node page in a wrong order, which incurs the zero-valued i_pino during the
recovery.
So, this patch modifies the creation flow to fix the synchronization order of
inode page with its inode.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
(cherry picked from commit 44a83ff6a81d84ab83bcb43a49ff1ba6c7e17cd1)
Jaegeuk Kim [Mon, 20 May 2013 00:55:50 +0000 (09:55 +0900)]
f2fs: change get_new_data_page to pass a locked node page
This patch is for passing a locked node page to get_dnode_of_data.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
(cherry picked from commit 64aa7ed98db489d1c41ef140876ada38498678ab)
ljf [Wed, 26 Nov 2014 09:52:19 +0000 (17:52 +0800)]
rk3036,rk312x. merge the hevc and vpu workqueue according to audi vpu_combo feature
dkl [Tue, 25 Nov 2014 08:53:24 +0000 (16:53 +0800)]
clk: rk3126b: add support and fix clk_pll_set_rate_3036_apll
Mark Yao [Tue, 25 Nov 2014 06:16:01 +0000 (14:16 +0800)]
rk-fb: display kernel logo if define CONFIG_LOGO
If the CONFIG_LOGO macro is defined, display the kernel logo;
otherwise display the logo from uboot.
Signed-off-by: Mark Yao <mark.yao@rock-chips.com>
hjc [Tue, 25 Nov 2014 03:44:25 +0000 (11:44 +0800)]
rk312x lcdc: fix fb_par->state error.
许盛飞 [Mon, 24 Nov 2014 04:00:21 +0000 (12:00 +0800)]
test-power: add testpower dts-config
Signed-off-by: 许盛飞 <xsf@rock-chips.com>
Mark Brown [Sat, 22 Nov 2014 11:07:41 +0000 (11:07 +0000)]
Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-android
Mark Brown [Fri, 21 Nov 2014 23:43:29 +0000 (23:43 +0000)]
Merge remote-tracking branch 'lsk/v3.10/topic/mailbox' into linux-linaro-lsk
Conflicts:
drivers/mailbox/mailbox.c
include/linux/mailbox_controller.h
Jassi Brar [Tue, 22 Jul 2014 15:10:04 +0000 (20:40 +0530)]
dt: mailbox: add generic bindings
Define generic bindings for the framework clients to
request mailbox channels.
Reviewed-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
(cherry picked from commit 9f3e3cacb2ffdefe28c7cf490bf543e4dcb2770a)
Signed-off-by: Mark Brown <broonie@kernel.org>
Jassi Brar [Tue, 22 Jul 2014 14:35:58 +0000 (20:05 +0530)]
doc: add documentation for mailbox framework
Some explanations, with examples, of how to implement users
and providers of the mailbox framework.
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
(cherry picked from commit 15320fbcec69dc3a4f217044ed848e4225397e25)
Signed-off-by: Mark Brown <broonie@kernel.org>
Jassi Brar [Thu, 12 Jun 2014 17:01:19 +0000 (22:31 +0530)]
mailbox: Introduce framework for mailbox
Introduce common framework for client/protocol drivers and
controller drivers of Inter-Processor-Communication (IPC).
Client driver developers should have a look at
include/linux/mailbox_client.h to understand the part of
the API exposed to client drivers.
Similarly controller driver developers should have a look
at include/linux/mailbox_controller.h
Reviewed-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
(cherry picked from commit 2b6d83e2b8b7de82331a6a1dcd64b51020a6031c)
Signed-off-by: Mark Brown <broonie@kernel.org>
Suman Anna [Thu, 12 Jun 2014 17:00:34 +0000 (22:30 +0530)]
mailbox: rename pl320-ipc specific mailbox.h
The patch 30058677 "ARM / highbank: add support for pl320 IPC"
added a pl320 IPC specific header file as a generic mailbox.h.
This file has been renamed appropriately to allow the
introduction of the generic mailbox API framework.
Acked-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Suman Anna <s-anna@ti.com>
Reviewed-by: Mark Brown <broonie@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
(cherry picked from commit f2fc42b6ac31f4d808da7a9da460dd433a71e976)
Signed-off-by: Mark Brown <broonie@kernel.org>
Conflicts:
arch/arm/mach-highbank/highbank.c
Mark Brown [Fri, 21 Nov 2014 18:53:31 +0000 (18:53 +0000)]
Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-android
Mark Brown [Fri, 21 Nov 2014 18:53:19 +0000 (18:53 +0000)]
Merge tag 'v3.10.61' into linux-linaro-lsk
This is the 3.10.61 stable release
Mark Brown [Fri, 21 Nov 2014 17:41:22 +0000 (17:41 +0000)]
Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-android
Greg Kroah-Hartman [Fri, 21 Nov 2014 17:23:22 +0000 (09:23 -0800)]
Linux 3.10.61
Johannes Weiner [Wed, 16 Oct 2013 20:46:59 +0000 (13:46 -0700)]
mm: memcg: handle non-error OOM situations more gracefully
commit 4942642080ea82d99ab5b653abb9a12b7ba31f4a upstream.
Commit 3812c8c8f395 ("mm: memcg: do not trap chargers with full
callstack on OOM") assumed that only a few places that can trigger a
memcg OOM situation do not return VM_FAULT_OOM, like optional page cache
readahead. But there are many more and it's impractical to annotate
them all.
First of all, we don't want to invoke the OOM killer when the failed
allocation is gracefully handled, so defer the actual kill to the end of
the fault handling as well. This simplifies the code quite a bit for
added bonus.
Second, since a failed allocation might not be the abrupt end of the
fault, the memcg OOM handler needs to be re-entrant until the fault
finishes for subsequent allocation attempts. If an allocation is
attempted after the task already OOMed, allow it to bypass the limit so
that it can quickly finish the fault and invoke the OOM killer.
Reported-by: azurIt <azurit@pobox.sk>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johannes Weiner [Thu, 12 Sep 2013 22:13:44 +0000 (15:13 -0700)]
mm: memcg: do not trap chargers with full callstack on OOM
commit 3812c8c8f3953921ef18544110dafc3505c1ac62 upstream.
The memcg OOM handling is incredibly fragile and can deadlock. When a
task fails to charge memory, it invokes the OOM killer and loops right
there in the charge code until it succeeds. Comparably, any other task
that enters the charge path at this point will go to a waitqueue right
then and there and sleep until the OOM situation is resolved. The problem
is that these tasks may hold filesystem locks and the mmap_sem; locks that
the selected OOM victim may need to exit.
For example, in one reported case, the task invoking the OOM killer was
about to charge a page cache page during a write(), which holds the
i_mutex. The OOM killer selected a task that was just entering truncate()
and trying to acquire the i_mutex:
OOM invoking task:
mem_cgroup_handle_oom+0x241/0x3b0
mem_cgroup_cache_charge+0xbe/0xe0
add_to_page_cache_locked+0x4c/0x140
add_to_page_cache_lru+0x22/0x50
grab_cache_page_write_begin+0x8b/0xe0
ext3_write_begin+0x88/0x270
generic_file_buffered_write+0x116/0x290
__generic_file_aio_write+0x27c/0x480
generic_file_aio_write+0x76/0xf0 # takes ->i_mutex
do_sync_write+0xea/0x130
vfs_write+0xf3/0x1f0
sys_write+0x51/0x90
system_call_fastpath+0x18/0x1d
OOM kill victim:
do_truncate+0x58/0xa0 # takes i_mutex
do_last+0x250/0xa30
path_openat+0xd7/0x440
do_filp_open+0x49/0xa0
do_sys_open+0x106/0x240
sys_open+0x20/0x30
system_call_fastpath+0x18/0x1d
The OOM handling task will retry the charge indefinitely while the OOM
killed task is not releasing any resources.
A similar scenario can happen when the kernel OOM killer for a memcg is
disabled and a userspace task is in charge of resolving OOM situations.
In this case, ALL tasks that enter the OOM path will be made to sleep on
the OOM waitqueue and wait for userspace to free resources or increase
the group's limit. But a userspace OOM handler is prone to deadlock
itself on the locks held by the waiting tasks. For example one of the
sleeping tasks may be stuck in a brk() call with the mmap_sem held for
writing but the userspace handler, in order to pick an optimal victim,
may need to read files from /proc/<pid>, which tries to acquire the same
mmap_sem for reading and deadlocks.
This patch changes the way tasks behave after detecting a memcg OOM and
makes sure nobody loops or sleeps with locks held:
1. When OOMing in a user fault, invoke the OOM killer and restart the
fault instead of looping on the charge attempt. This way, the OOM
victim can not get stuck on locks the looping task may hold.
2. When OOMing in a user fault but somebody else is handling it
(either the kernel OOM killer or a userspace handler), don't go to
sleep in the charge context. Instead, remember the OOMing memcg in
the task struct and then fully unwind the page fault stack with
-ENOMEM. pagefault_out_of_memory() will then call back into the
memcg code to check if the -ENOMEM came from the memcg, and then
either put the task to sleep on the memcg's OOM waitqueue or just
restart the fault. The OOM victim can no longer get stuck on any
lock a sleeping task may hold.
Debugged by Michal Hocko.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: azurIt <azurit@pobox.sk>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
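The two cases listed in this commit can be modelled with a short user-space sketch. It is a simplified illustration under stated assumptions, not the kernel implementation: the classes Memcg and Task, the field names, and the string return values are all hypothetical; only pagefault_out_of_memory() and the -ENOMEM unwind come from the commit message.

```python
import errno


class Memcg:
    def __init__(self, limit):
        self.limit = limit
        self.usage = 0
        self.oom_handled_elsewhere = False  # another task or userspace owns the OOM
        self.kills = 0                      # times the OOM killer was invoked


class Task:
    def __init__(self):
        self.memcg_oom = None   # memcg remembered by a failed charge
        self.locks_held = 0     # stands in for mmap_sem, i_mutex, ...


def charge(task, memcg, in_user_fault):
    if memcg.usage < memcg.limit:
        memcg.usage += 1
        return 0
    if in_user_fault:
        if memcg.oom_handled_elsewhere:
            # Case 2: do not sleep in the charge context; remember the
            # OOMing memcg in the task for later.
            task.memcg_oom = memcg
        else:
            # Case 1: invoke the OOM killer, then let the fault restart
            # instead of looping on the charge.
            memcg.kills += 1
    # Syscalls and kernel faults just see -ENOMEM.
    return -errno.ENOMEM


def pagefault_out_of_memory(task):
    memcg, task.memcg_oom = task.memcg_oom, None
    if memcg is None:
        return "restart"
    # Sleeping is only allowed once the fault stack is unwound.
    assert task.locks_held == 0
    return "sleep-on-oom-waitqueue"


def handle_user_fault(task, memcg):
    task.locks_held += 1                    # locks taken during the fault
    ret = charge(task, memcg, in_user_fault=True)
    task.locks_held -= 1                    # fault stack fully unwound
    if ret == -errno.ENOMEM:
        return pagefault_out_of_memory(task)
    return "ok"
```

The point of the model is the ordering: the decision to sleep is made in pagefault_out_of_memory(), after locks_held has dropped back to zero, never inside charge().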
Johannes Weiner [Thu, 12 Sep 2013 22:13:43 +0000 (15:13 -0700)]
mm: memcg: rework and document OOM waiting and wakeup
commit fb2a6fc56be66c169f8b80e07ed999ba453a2db2 upstream.
The memcg OOM handler open-codes a sleeping lock for OOM serialization
(trylock, wait, repeat) because the required locking is so specific to
memcg hierarchies. However, it would be nice if this construct would be
clearly recognizable and not be as obfuscated as it is right now. Clean
up as follows:
1. Remove the return value of mem_cgroup_oom_unlock()
2. Rename mem_cgroup_oom_lock() to mem_cgroup_oom_trylock().
3. Pull the prepare_to_wait() out of the memcg_oom_lock scope. This
makes it more obvious that the task has to be on the waitqueue
before attempting to OOM-trylock the hierarchy, to not miss any
wakeups before going to sleep. It just didn't matter until now
because it was all lumped together into the global memcg_oom_lock
spinlock section.
4. Pull the mem_cgroup_oom_notify() out of the memcg_oom_lock scope.
It is protected by the hierarchical OOM-lock.
5. The memcg_oom_lock spinlock is only required to propagate the OOM
lock in any given hierarchy atomically. Restrict its scope to
mem_cgroup_oom_(trylock|unlock).
6. Do not wake up the waitqueue unconditionally at the end of the
function. Only the lockholder has to wake up the next in line
after releasing the lock.
Note that the lockholder kicks off the OOM-killer, which in turn
leads to wakeups from the uncharges of the exiting task. But a
contender is not guaranteed to see them if it enters the OOM path
after the OOM kills but before the lockholder releases the lock.
Thus there has to be an explicit wakeup after releasing the lock.
7. Put the OOM task on the waitqueue before marking the hierarchy as
under OOM as that is the point where we start to receive wakeups.
No point in listening before being on the waitqueue.
8. Likewise, unmark the hierarchy before finishing the sleep, for
symmetry.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
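Point 5 of this commit, propagating the OOM lock through a hierarchy atomically, can be sketched as a trylock with rollback. This is a user-space illustration under assumptions, not the kernel code: the class Memcg, the helper subtree(), and the function names oom_trylock/oom_unlock are hypothetical, and a threading.Lock stands in for the memcg_oom_lock spinlock.

```python
import threading

# Stands in for the global memcg_oom_lock spinlock.
_hierarchy_lock = threading.Lock()


class Memcg:
    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self.oom_lock = False
        if parent is not None:
            parent.children.append(self)


def subtree(root):
    """Yield root and all of its descendants."""
    yield root
    for child in root.children:
        yield from subtree(child)


def oom_trylock(root):
    """Try to mark the whole subtree as OOM-locked; never blocks on it."""
    with _hierarchy_lock:
        taken = []
        for memcg in subtree(root):
            if memcg.oom_lock:
                # Someone else holds part of the hierarchy: roll back.
                for done in taken:
                    done.oom_lock = False
                return False
            memcg.oom_lock = True
            taken.append(memcg)
        return True


def oom_unlock(root):
    """Release the OOM lock on the whole subtree; returns nothing."""
    with _hierarchy_lock:
        for memcg in subtree(root):
            memcg.oom_lock = False
```

The short critical section only covers setting or clearing the per-group flags, which matches the commit's aim of restricting the spinlock to the trylock and unlock helpers.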
Johannes Weiner [Thu, 12 Sep 2013 22:13:42 +0000 (15:13 -0700)]
mm: memcg: enable memcg OOM killer only for user faults
commit 519e52473ebe9db5cdef44670d5a97f1fd53d721 upstream.
System calls and kernel faults (uaccess, gup) can handle an out of memory
situation gracefully and just return -ENOMEM.
Enable the memcg OOM killer only for user faults, where it's really the
only option available.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johannes Weiner [Thu, 12 Sep 2013 22:13:40 +0000 (15:13 -0700)]
x86: finish user fault error path with fatal signal
commit 3a13c4d761b4b979ba8767f42345fed3274991b0 upstream.
The x86 fault handler bails in the middle of error handling when the
task has a fatal signal pending. For a subsequent patch this is a
problem in OOM situations because it relies on pagefault_out_of_memory()
being called even when the task has been killed, to perform proper
per-task OOM state unwinding.
Shortcutting the fault like this is a rather minor optimization that
saves a few instructions in rare cases. Just remove it for
user-triggered faults.
Use the opportunity to split the fault retry handling from actual fault
errors and add locking documentation that reads surprisingly similar to
ARM's.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johannes Weiner [Thu, 12 Sep 2013 22:13:39 +0000 (15:13 -0700)]
arch: mm: pass userspace fault flag to generic fault handler
commit 759496ba6407c6994d6a5ce3a5e74937d7816208 upstream.
Unlike global OOM handling, memory cgroup code will invoke the OOM killer
in any OOM situation because it has no way of telling faults occurring in
kernel context - which could be handled more gracefully - from
user-triggered faults.
Pass a flag that identifies faults originating in user space from the
architecture-specific fault handlers to generic code so that memcg OOM
handling can be improved.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johannes Weiner [Thu, 12 Sep 2013 22:13:38 +0000 (15:13 -0700)]
arch: mm: do not invoke OOM killer on kernel fault OOM
commit 871341023c771ad233620b7a1fb3d9c7031c4e5c upstream.
Kernel faults are expected to handle OOM conditions gracefully (gup,
uaccess etc.), so they should never invoke the OOM killer. Reserve this
for faults triggered in user context when it is the only option.
Most architectures already do this, fix up the remaining few.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johannes Weiner [Thu, 12 Sep 2013 22:13:36 +0000 (15:13 -0700)]
arch: mm: remove obsolete init OOM protection
commit 94bce453c78996cc4373d5da6cfabe07fcc6d9f9 upstream.
The memcg code can trap tasks in the context of the failing allocation
until an OOM situation is resolved. They can hold all kinds of locks
(fs, mm) at this point, which makes it prone to deadlocking.
This series converts memcg OOM handling into a two step process that is
started in the charge context, but any waiting is done after the fault
stack is fully unwound.
Patches 1-4 prepare architecture handlers to support the new memcg
requirements, but in doing so they also remove old cruft and unify
out-of-memory behavior across architectures.
Patch 5 disables the memcg OOM handling for syscalls, readahead, kernel
faults, because they can gracefully unwind the stack with -ENOMEM. OOM
handling is restricted to user triggered faults that have no other
option.
Patch 6 reworks memcg's hierarchical OOM locking to make it a little
more obvious wth is going on in there: reduce locked regions, rename
locking functions, reorder and document.
Patch 7 implements the two-part OOM handling such that tasks are never
trapped with the full charge stack in an OOM situation.
This patch:
Back before smart OOM killing, when faulting tasks were killed directly on
allocation failures, the arch-specific fault handlers needed special
protection for the init process.
Now that all fault handlers call into the generic OOM killer (see commit
609838cfed97: "mm: invoke oom-killer from remaining unconverted page
fault handlers"), which already provides init protection, the
arch-specific leftovers can be removed.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc bits]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johannes Weiner [Mon, 8 Jul 2013 22:59:50 +0000 (15:59 -0700)]
mm: invoke oom-killer from remaining unconverted page fault handlers
commit 609838cfed972d49a65aac7923a9ff5cbe482e30 upstream.
A few remaining architectures directly kill the page faulting task in an
out of memory situation. This is usually not a good idea since that
task might not even use a significant amount of memory and so may not be
the optimal victim to resolve the situation.
Since 2.6.29's 1c0fe6e ("mm: invoke oom-killer from page fault") there
is a hook that architecture page fault handlers are supposed to call to
invoke the OOM killer and let it pick the right task to kill. Convert
the remaining architectures over to this hook.
To have the previous behavior of simply taking out the faulting task the
vm.oom_kill_allocating_task sysctl can be set to 1.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc bits]
Cc: James Hogan <james.hogan@imgtec.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Borkmann [Thu, 9 Oct 2014 20:55:31 +0000 (22:55 +0200)]
net: sctp: fix skb_over_panic when receiving malformed ASCONF chunks
commit 9de7922bc709eee2f609cd01d98aaedc4cf5ea74 upstream.
Commit 6f4c618ddb0 ("SCTP : Add paramters validity check for
ASCONF chunk") added basic verification of ASCONF chunks, however,
it is still possible to remotely crash a server by sending a
special crafted ASCONF chunk, even up to pre 2.6.12 kernels:
skb_over_panic: text:ffffffffa01ea1c3 len:31056 put:30768
head:ffff88011bd81800 data:ffff88011bd81800 tail:0x7950
end:0x440 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:129!
[...]
Call Trace:
<IRQ>
[<ffffffff8144fb1c>] skb_put+0x5c/0x70
[<ffffffffa01ea1c3>] sctp_addto_chunk+0x63/0xd0 [sctp]
[<ffffffffa01eadaf>] sctp_process_asconf+0x1af/0x540 [sctp]
[<ffffffff8152d025>] ? _read_unlock_bh+0x15/0x20
[<ffffffffa01e0038>] sctp_sf_do_asconf+0x168/0x240 [sctp]
[<ffffffffa01e3751>] sctp_do_sm+0x71/0x1210 [sctp]
[<ffffffff8147645d>] ? fib_rules_lookup+0xad/0xf0
[<ffffffffa01e6b22>] ? sctp_cmp_addr_exact+0x32/0x40 [sctp]
[<ffffffffa01e8393>] sctp_assoc_bh_rcv+0xd3/0x180 [sctp]
[<ffffffffa01ee986>] sctp_inq_push+0x56/0x80 [sctp]
[<ffffffffa01fcc42>] sctp_rcv+0x982/0xa10 [sctp]
[<ffffffffa01d5123>] ? ipt_local_in_hook+0x23/0x28 [iptable_filter]
[<ffffffff8148bdc9>] ? nf_iterate+0x69/0xb0
[<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
[<ffffffff8148bf86>] ? nf_hook_slow+0x76/0x120
[<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
[<ffffffff81496ded>] ip_local_deliver_finish+0xdd/0x2d0
[<ffffffff81497078>] ip_local_deliver+0x98/0xa0
[<ffffffff8149653d>] ip_rcv_finish+0x12d/0x440
[<ffffffff81496ac5>] ip_rcv+0x275/0x350
[<ffffffff8145c88b>] __netif_receive_skb+0x4ab/0x750
[<ffffffff81460588>] netif_receive_skb+0x58/0x60
This can be triggered through a simple scripted nmap
connection scan injecting the chunk after the handshake, for
example, ...
-------------- INIT[ASCONF; ASCONF_ACK] ------------->
<----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
-------------------- COOKIE-ECHO -------------------->
<-------------------- COOKIE-ACK ---------------------
------------------ ASCONF; UNKNOWN ------------------>
... where ASCONF chunk of length 280 contains 2 parameters ...
1) Add IP address parameter (param length: 16)
2) Add/del IP address parameter (param length: 255)
... followed by an UNKNOWN chunk of e.g. 4 bytes. Here, the
Address Parameter in the ASCONF chunk is missing as well.
This is just an example, and similarly crafted ASCONF chunks
could be used just as well.
The ASCONF chunk passes through sctp_verify_asconf(), as all
parameters pass the sanity checks and, after walking, we end
up successfully at the chunk end boundary, and thus may invoke
sctp_process_asconf(). Parameter walking is done with
WORD_ROUND() to take padding into account.
In sctp_process_asconf()'s TLV processing, we may fail in
sctp_process_asconf_param() e.g., due to removal of the IP
address that is also the source address of the packet containing
the ASCONF chunk, and thus we need to add all TLVs after the
failure to our ASCONF response to remote via helper function
sctp_add_asconf_response(), which basically invokes a
sctp_addto_chunk() adding the error parameters to the given
skb.
When walking to the next parameter this time, we proceed
with ...
length = ntohs(asconf_param->param_hdr.length);
asconf_param = (void *)asconf_param + length;
... instead of the WORD_ROUND()'ed length, thus resulting here
in an off-by-one that leads to reading the follow-up garbage
parameter length of 12336, and thus throwing an skb_over_panic
for the reply when trying to sctp_addto_chunk() next time,
which implicitly calls the skb_put() with that length.
Fix it by using the sctp_walk_params() macro [ which is also used
in INIT parameter processing ] in the verification *and*
in ASCONF processing: it makes sure that we don't spill over and
that we walk parameters WORD_ROUND()'ed. Moreover, we are
more defensive and guard against unknown parameter types and
mis-sized addresses.
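The difference between stepping by the raw and by the padded parameter length can be sketched in plain user-space C. This is an illustration only, not the kernel code: PAD4(), tlv_len(), walk_params() and sample_walk() are hypothetical names, the parameter type values are arbitrary, and the layout assumed is a 2-byte type followed by a 2-byte big-endian length that includes the 4-byte header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round a length up to the next multiple of 4, the padding rule the
 * commit message attributes to WORD_ROUND(). Hypothetical name. */
#define PAD4(len) (((size_t)(len) + 3u) & ~(size_t)3u)

/* Read the 16-bit big-endian length field of the TLV header at p. */
static size_t tlv_len(const uint8_t *p)
{
    return ((size_t)p[2] << 8) | p[3];
}

/*
 * Walk the TLVs in buf[0..size). Returns the number of parameters
 * visited, or -1 if a header or a declared length does not fit.
 * With padded == 0 the walk steps by the raw length, which is the
 * mistake described above.
 */
static int walk_params(const uint8_t *buf, size_t size, int padded)
{
    size_t off = 0;
    int count = 0;

    assert(buf != NULL);
    while (off < size) {
        size_t len;

        if (size - off < 4)
            return -1;      /* no room for a header */
        len = tlv_len(buf + off);
        if (len < 4 || len > size - off)
            return -1;      /* declared length is bogus */
        off += padded ? PAD4(len) : len;
        count++;
    }
    return count;
}

/* Three parameters with arbitrary type values and lengths 16, 7 and 8.
 * The 7-byte parameter is followed by a single padding byte. */
static int sample_walk(int padded)
{
    uint8_t buf[32];

    memset(buf, 0, sizeof(buf));
    buf[1] = 1;  buf[3] = 16;   /* offset 0:  type 1, length 16 */
    buf[17] = 2; buf[19] = 7;   /* offset 16: type 2, length 7  */
    buf[25] = 3; buf[27] = 8;   /* offset 24: type 3, length 8  */
    return walk_params(buf, sizeof(buf), padded);
}
```

With padding, sample_walk(1) visits all three parameters. Without it, sample_walk(0) lands one byte early on the padding byte, reads a bogus length from the misaligned header and gives up with -1. The sketch rejects the bogus length; the crash described above came from using such a length unchecked.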
Joint work with Vlad Yasevich.
Fixes: b896b82be4ae ("[SCTP] ADDIP: Support for processing incoming ASCONF_ACK chunks.")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Borkmann [Thu, 9 Oct 2014 20:55:32 +0000 (22:55 +0200)]
net: sctp: fix panic on duplicate ASCONF chunks
commit b69040d8e39f20d5215a03502a8e8b4c6ab78395 upstream.
When receiving e.g. a semi-well-formed connection scan in the
form of ...
-------------- INIT[ASCONF; ASCONF_ACK] ------------->
<----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
-------------------- COOKIE-ECHO -------------------->
<-------------------- COOKIE-ACK ---------------------
---------------- ASCONF_a; ASCONF_b ----------------->
... where ASCONF_a equals ASCONF_b chunk (at least both serials
need to be equal), we panic an SCTP server!
The problem is that the ASCONF_ACK chunks we send in reply to
well-formed ASCONF chunks are cached per serial. Thus, when we receive
the same ASCONF chunk twice (e.g. through a lost ASCONF_ACK), we do
not need to process it again on the server side (that was the
idea, also proposed in the RFC). Instead, we know it was cached
and we just resend the cached chunk. So far, so good.
Where things get nasty is in SCTP's side effect interpreter, that
is, sctp_cmd_interpreter():
While incoming ASCONF_a (chunk = event_arg) is being marked
!end_of_packet and !singleton, and we have an association context,
we do not flush the outqueue the first time after processing the
ASCONF_ACK singleton chunk via SCTP_CMD_REPLY. Instead, we keep it
queued up, although we set local_cork to 1. Commit 2e3216cd54b1
changed the precedence, so that as long as we get bundled incoming
chunks, we try possible bundling on the outgoing queue as well. Before
this commit, we would just flush the output queue.
Now, while ASCONF_a's ASCONF_ACK sits in the corked outq, we
continue to process the same ASCONF_b chunk from the packet. As
we have cached the previous ASCONF_ACK, we find it, grab it and
do another SCTP_CMD_REPLY command on it. So, effectively, we rip
the chunk->list pointers and requeue the same ASCONF_ACK chunk
another time. Since we process ASCONF_b, it's correctly marked
with end_of_packet and we enforce an uncork, and thus flush, thus
crashing the kernel.
Fix it by testing if the ASCONF_ACK is currently pending and if
that is the case, do not requeue it. When flushing the output
queue we may relink the chunk for preparing an outgoing packet,
but eventually unlink it when it's copied into the skb right
before transmission.
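The idea of the fix, checking whether a cached reply is still linked into a queue before queueing it again, can be sketched with a minimal list in user-space C. This is a sketch under assumptions, not the SCTP code: struct node, node_pending(), queue_reply_once() and demo_duplicate_reply() are hypothetical names, and the list only mimics the self-pointing convention of the kernel's list heads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list. All names are hypothetical. */
struct node {
    struct node *prev, *next;
};

static void node_init(struct node *n)
{
    n->prev = n->next = n;
}

/* An entry that points at itself is not on any queue. */
static bool node_pending(const struct node *n)
{
    return n->next != n;
}

static void add_tail(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static size_t queue_len(const struct node *head)
{
    size_t len = 0;
    const struct node *n;

    for (n = head->next; n != head; n = n->next)
        len++;
    return len;
}

struct chunk {
    struct node list;
    unsigned int serial;
};

/* Queue a cached reply unless it is already pending. */
static bool queue_reply_once(struct node *outq, struct chunk *reply)
{
    if (node_pending(&reply->list))
        return false;   /* still sitting in the corked queue */
    add_tail(&reply->list, outq);
    return true;
}

/* Reply twice with the same cached chunk; return the queue length. */
static size_t demo_duplicate_reply(void)
{
    struct node outq;
    struct chunk ack = { .serial = 1 };
    bool first, second;

    node_init(&outq);
    node_init(&ack.list);
    first = queue_reply_once(&outq, &ack);
    second = queue_reply_once(&outq, &ack);
    assert(first && !second);
    return queue_len(&outq);
}
```

Without the node_pending() test, the second add_tail() would overwrite the entry's own pointers while it is still linked and leave the queue inconsistent, which corresponds to ripping the chunk->list pointers as described above.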
Joint work with Vlad Yasevich.
Fixes: 2e3216cd54b1 ("sctp: Follow security requirement of responding with 1 packet")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Borkmann [Thu, 9 Oct 2014 20:55:33 +0000 (22:55 +0200)]
net: sctp: fix remote memory pressure from excessive queueing
commit 26b87c7881006311828bb0ab271a551a62dcceb4 upstream.
This scenario is not limited to ASCONF, just taken as one
example triggering the issue. When receiving ASCONF probes
in the form of ...
-------------- INIT[ASCONF; ASCONF_ACK] ------------->
<----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
-------------------- COOKIE-ECHO -------------------->
<-------------------- COOKIE-ACK ---------------------
---- ASCONF_a; [ASCONF_b; ...; ASCONF_n;] JUNK ------>
[...]
---- ASCONF_m; [ASCONF_o; ...; ASCONF_z;] JUNK ------>
... where ASCONF_a, ASCONF_b, ..., ASCONF_z are well-formed
ASCONFs and have increasing serial numbers, we process such
ASCONF chunk(s) marked with !end_of_packet and !singleton,
since we have not yet reached the SCTP packet end. SCTP only
does verification on a chunk-by-chunk basis, as an SCTP
packet is nothing more than a container for a stream of
chunks which it eats up one by one.
We could run into the case that we receive a packet with a
malformed tail, marked above as trailing JUNK. All previous
chunks are well-formed here, so the stack will eat up all
previous chunks up to this point. The association's output
queue is then queued up excessively in any of these cases:
JUNK does not fit into a chunk header and there are no more
chunks in the input queue; JUNK contains a garbage chunk
header whose encoded chunk length would exceed the skb tail;
or we came here from an entirely different scenario and the
chunk has the pdiscard=1 mark (without having had a flush
point). A correct final chunk may then turn it into a response
flood when flushing the queue ;). I ran a simple script with
incremental ASCONF serial numbers and could see the server side
consuming an excessive amount of RAM [before/after: up to 2GB
and more].
The issue at heart is that the chunk train ends with
!end_of_packet and !singleton markers. Since commit
2e3216cd54b1 ("sctp: Follow security requirement of responding
with 1 packet"), this prevents an output queue flush point in
sctp_do_sm() -> sctp_cmd_interpreter() on the input chunk
(chunk = event_arg) even though local_cork is set, because
that commit changed its precedence. In the normal
case, the last chunk with end_of_packet=1 would trigger the
queue flush to accommodate possible outgoing bundling.
In the input queue, sctp_inq_pop() seems to do the right thing
in terms of discarding invalid chunks. So, above JUNK will
not enter the state machine and instead be released and exit
the sctp_assoc_bh_rcv() chunk processing loop. It's simply
the flush point being missing at loop exit. Adding a try-flush
approach on the output queue might not work as the underlying
infrastructure might be long gone at this point due to the
side-effect interpreter run.
One possibility, albeit a bit of a kludge, would be to defer
invalid chunk freeing into the state machine in order to
possibly trigger packet discards and thus indirectly a queue
flush on error. It would surely be better to discard chunks
as in the current, perhaps better controlled environment, but
going back and forth, it's simply architecturally not possible.
I tried various trailing JUNK attack cases and it seems to
look good now.
Joint work with Vlad Yasevich.
Fixes: 2e3216cd54b1 ("sctp: Follow security requirement of responding with 1 packet")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nadav Amit [Tue, 16 Sep 2014 23:50:50 +0000 (02:50 +0300)]
KVM: x86: Don't report guest userspace emulation error to userspace
commit a2b9e6c1a35afcc0973acb72e591c714e78885ff upstream.
Commit fc3a9157d314 ("KVM: X86: Don't report L2 emulation failures to
user-space") disabled the reporting of L2 (nested guest) emulation failures to
userspace due to a race condition between a vmexit and the instruction emulator.
The same rationale also applies to userspace applications that are permitted by
the guest OS to access an MMIO area or perform PIO.
This patch extends the current behavior - of injecting a #UD instead of
reporting it to userspace - to guest userspace code as well.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tomas Henzl [Thu, 1 Aug 2013 13:14:00 +0000 (15:14 +0200)]
SCSI: hpsa: fix a race in cmd_free/scsi_done
commit 2cc5bfaf854463d9d1aa52091f60110fbf102a96 upstream.
When the driver calls scsi_done and afterwards frees its internal
preallocated memory, a new job can be enqueued before
the memory is freed. The allocation then fails and the message
"cmd_alloc returned NULL" is shown.
The patch below fixes it by moving cmd->scsi_done after cmd_free.
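Why the ordering matters can be shown with a deterministic user-space sketch. It is an assumption-laden model, not the hpsa driver: the pool holds a single command, the real race between concurrent contexts is replaced by a completion callback that resubmits a job immediately, and all names (pool, done_and_resubmit(), run_completion()) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 1     /* deliberately tiny preallocated pool */

struct cmd {
    bool in_use;
};

static struct cmd pool[POOL_SIZE];
static int alloc_failures;

static struct cmd *cmd_alloc(void)
{
    size_t i;

    for (i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = true;
            return &pool[i];
        }
    }
    return NULL;        /* pool exhausted */
}

static void cmd_free(struct cmd *c)
{
    c->in_use = false;
}

/* Stand-in for the completion callback: the upper layer reacts by
 * submitting the next job right away, which needs a command slot. */
static void done_and_resubmit(void)
{
    struct cmd *next = cmd_alloc();

    if (next == NULL)
        alloc_failures++;
    else
        cmd_free(next); /* finish at once to keep the sketch short */
}

/* Complete one command; returns the number of failed allocations. */
static int run_completion(bool free_first)
{
    struct cmd *c;
    size_t i;

    for (i = 0; i < POOL_SIZE; i++)
        pool[i].in_use = false;
    alloc_failures = 0;

    c = cmd_alloc();
    assert(c != NULL);
    if (free_first) {
        cmd_free(c);            /* fixed order */
        done_and_resubmit();
    } else {
        done_and_resubmit();    /* old order: slot still busy */
        cmd_free(c);
    }
    return alloc_failures;
}
```

In this model run_completion(false) records one failed allocation and run_completion(true) records none. Code that frees first must take whatever it still needs from the command before releasing it.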
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Cc: Masoud Sharbiani <msharbiani@twitter.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eugenia Emantayev [Thu, 25 Jul 2013 16:21:23 +0000 (19:21 +0300)]
net/mlx4_en: Fix BlueFlame race
commit 2d4b646613d6b12175b017aca18113945af1faf3 upstream.
Fix a race between BlueFlame flow and stamping in post send flow.
Example:
SW: Build WQE 0 on the TX buffer, except the ownership bit
SW: Set ownership for WQE 0 on the TX buffer
SW: Ring doorbell for WQE 0
SW: Build WQE 1 on the TX buffer, except the ownership bit
SW: Set ownership for WQE 1 on the TX buffer
HW: Read WQE 0 and then WQE 1, before doorbell was rung/BF was done for WQE 1
HW: Produce CQEs for WQE 0 and WQE 1
SW: Process the CQEs, and stamp WQE 0 and WQE 1 accordingly (on the TX buffer)
SW: Copy WQE 1 from the TX buffer to the BF register - ALREADY STAMPED!
HW: CQE error with index 0xFFFF - the BF WQE's control segment is STAMPED,
so the BF index is 0xFFFF. Error: Invalid Opcode.
As a result QP enters the error state and no traffic can be sent.
Solution:
When stamping - do not stamp the last completed WQE.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Vinson Lee <vlee@twopensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ben Dooks [Thu, 25 Jul 2013 13:38:03 +0000 (14:38 +0100)]
ARM: Correct BUG() assembly to ensure it is endian-agnostic
commit 63328070eff2f4fd730c86966a0dbc976147c39f upstream.
Currently BUG() uses .word or .hword to create the necessary illegal
instructions. However if we are building BE8 then these get swapped
by the linker into different illegal instructions in the text. This
means that the BUG() macro does not get trapped properly.
Change to using <asm/opcodes.h> to provide the necessary ARM instruction
building as we cannot rely on gcc/gas having the `.inst` directives,
which were added to try and resolve this issue (reported by Dave Martin
<Dave.Martin@arm.com>).
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Vince Weaver [Mon, 14 Jul 2014 19:33:25 +0000 (15:33 -0400)]
perf/x86/intel: Use proper dTLB-load-misses event on IvyBridge
commit 1996388e9f4e3444db8273bc08d25164d2967c21 upstream.
This was discussed back in February:
https://lkml.org/lkml/2014/2/18/956
But I never saw a patch come out of it.
On IvyBridge we share the SandyBridge cache event tables, but the
dTLB-load-miss event is not compatible. Patch it up after
the fact to the proper DTLB_LOAD_MISSES.DEMAND_LD_MISS_CAUSES_A_WALK.
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1407141528200.17214@vincent-weaver-1.umelst.maine.edu
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Hou Pengyang <houpengyang@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alexander Usyskin [Mon, 25 Aug 2014 13:46:53 +0000 (16:46 +0300)]
mei: bus: fix possible boundaries violation
commit cfda2794b5afe7ce64ee9605c64bef0e56a48125 upstream.
The function 'strncpy' can fill the whole fixed-size (32) buffer 'id.name'
with the string value and leave no room for the NUL terminator,
so subsequent string operations may violate the buffer boundaries.
Replace strncpy with strlcpy.
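The difference can be checked in user space. Since strlcpy is not available in every C library, the sketch below uses a small stand-in with the same truncate-and-terminate behavior; copy_terminated(), is_terminated() and the two demo helpers are hypothetical names, and only the buffer size of 32 is taken from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_SIZE 32    /* size of the fixed buffer named above */

/* A 40-character source string, longer than the buffer. */
static const char long_name[] = "0123456789012345678901234567890123456789";

/* strlcpy-like stand-in: copies at most size - 1 bytes and always
 * terminates when size > 0. Returns the length of src. */
static size_t copy_terminated(char *dst, const char *src, size_t size)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);
    if (size != 0) {
        size_t n = len < size - 1 ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

/* Returns 1 if a NUL byte occurs within the first size bytes. */
static int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

static int terminated_after_strncpy(const char *src)
{
    char name[NAME_SIZE];

    strncpy(name, src, sizeof(name));   /* no NUL if src is too long */
    return is_terminated(name, sizeof(name));
}

static int terminated_after_copy(const char *src)
{
    char name[NAME_SIZE];

    copy_terminated(name, src, sizeof(name));
    return is_terminated(name, sizeof(name));
}
```

For long_name, terminated_after_strncpy() reports an unterminated buffer while terminated_after_copy() reports a terminated one, at the cost of truncating the string to 31 characters.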
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pawel Moll [Fri, 13 Jun 2014 15:03:32 +0000 (16:03 +0100)]
perf: Handle compat ioctl
commit b3f207855f57b9c8f43a547a801340bb5cbc59e5 upstream.
When running a 32-bit userspace on a 64-bit kernel (e.g. an i386
application on an x86_64 kernel or a 32-bit arm userspace on an arm64
kernel), some of the perf ioctls must be treated with special
care, as they have a pointer size encoded in the command.
For example, PERF_EVENT_IOC_ID in the 32-bit world will be encoded
as 0x80042407, but a 64-bit kernel will expect 0x80082407. As a
result, the ioctl will fail, returning -ENOTTY.
This patch solves the problem by adding code that fixes up the
size in a compat_ioctl file operation.
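The two command values can be reproduced from the generic ioctl number layout (8 bits number, 8 bits type, 14 bits size, 2 bits direction). The sketch below assumes that layout, which some architectures do not use, and spells out its own constants instead of relying on kernel headers. ioc_read(), ioc_size() and fixup_compat_cmd() are hypothetical helpers, not the functions added by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Generic ioctl number layout, assumed for this sketch. */
#define IOC_TYPESHIFT 8
#define IOC_SIZESHIFT 16
#define IOC_DIRSHIFT  30
#define IOC_SIZEMASK  0x3fffu
#define IOC_READ      2u

/* Build a "read" ioctl command from type, number and argument size. */
static uint32_t ioc_read(uint32_t type, uint32_t nr, uint32_t size)
{
    assert(size <= IOC_SIZEMASK);
    return (IOC_READ << IOC_DIRSHIFT) | (size << IOC_SIZESHIFT) |
           (type << IOC_TYPESHIFT) | nr;
}

/* Extract the encoded argument size. */
static uint32_t ioc_size(uint32_t cmd)
{
    return (cmd >> IOC_SIZESHIFT) & IOC_SIZEMASK;
}

/* If the command encodes the compat pointer size, rewrite it to the
 * native pointer size so that the native handler recognizes it. */
static uint32_t fixup_compat_cmd(uint32_t cmd, uint32_t compat_ptr_size,
                                 uint32_t native_ptr_size)
{
    if (ioc_size(cmd) == compat_ptr_size) {
        cmd &= ~(IOC_SIZEMASK << IOC_SIZESHIFT);
        cmd |= native_ptr_size << IOC_SIZESHIFT;
    }
    return cmd;
}
```

With type '$' (0x24) and number 7, a 4-byte pointer yields 0x80042407 and an 8-byte pointer yields 0x80082407, matching the values quoted above. A real fix-up would apply only to the commands known to carry a pointer-sized argument, since other commands may legitimately encode a 4-byte size.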
Reported-by: Drew Richardson <drew.richardson@arm.com>
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1402671812-9078-1-git-send-email-pawel.moll@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Yoichi Yuasa [Wed, 2 Oct 2013 06:03:03 +0000 (15:03 +0900)]
MIPS: Fix forgotten preempt_enable() when CPU has inclusive pcaches
commit 5596b0b245fb9d2cefb5023b11061050351c1398 upstream.
[ 1.904000] BUG: scheduling while atomic: swapper/1/0x00000002
[ 1.908000] Modules linked in:
[ 1.916000] CPU: 0 PID: 1 Comm: swapper Not tainted 3.12.0-rc2-lemote-los.git-5318619-dirty #1
[ 1.920000] Stack :
0000000031aac000 ffffffff810d0000 0000000000000052 ffffffff802730a4
0000000000000000 0000000000000001 ffffffff810cdf90 ffffffff810d0000
ffffffff8068b968 ffffffff806f5537 ffffffff810cdf90 980000009f0782e8
0000000000000001 ffffffff80720000 ffffffff806b0000 980000009f078000
980000009f290000 ffffffff805f312c 980000009f05b5d8 ffffffff80233518
980000009f05b5e8 ffffffff80274b7c 980000009f078000 ffffffff8068b968
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 980000009f05b520 0000000000000000 ffffffff805f2f6c
0000000000000000 ffffffff80700000 ffffffff80700000 ffffffff806fc758
ffffffff80700000 ffffffff8020be98 ffffffff806fceb0 ffffffff805f2f6c
...
[ 2.028000] Call Trace:
[ 2.032000] [<ffffffff8020be98>] show_stack+0x80/0x98
[ 2.036000] [<ffffffff805f2f6c>] __schedule_bug+0x44/0x6c
[ 2.040000] [<ffffffff805fac58>] __schedule+0x518/0x5b0
[ 2.044000] [<ffffffff805f8a58>] schedule_timeout+0x128/0x1f0
[ 2.048000] [<ffffffff80240314>] msleep+0x3c/0x60
[ 2.052000] [<ffffffff80495400>] do_probe+0x238/0x3a8
[ 2.056000] [<ffffffff804958b0>] ide_probe_port+0x340/0x7e8
[ 2.060000] [<ffffffff80496028>] ide_host_register+0x2d0/0x7a8
[ 2.064000] [<ffffffff8049c65c>] ide_pci_init_two+0x4e4/0x790
[ 2.068000] [<ffffffff8049f9b8>] amd74xx_probe+0x148/0x2c8
[ 2.072000] [<ffffffff803f571c>] pci_device_probe+0xc4/0x130
[ 2.076000] [<ffffffff80478f60>] driver_probe_device+0x98/0x270
[ 2.080000] [<ffffffff80479298>] __driver_attach+0xe0/0xe8
[ 2.084000] [<ffffffff80476ab0>] bus_for_each_dev+0x78/0xe0
[ 2.088000] [<ffffffff80478468>] bus_add_driver+0x230/0x310
[ 2.092000] [<ffffffff80479b44>] driver_register+0x84/0x158
[ 2.096000] [<ffffffff80200504>] do_one_initcall+0x104/0x160
Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: linux-mips@linux-mips.org
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Patchwork: https://patchwork.linux-mips.org/patch/5941/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Alexandre Oliva <lxoliva@fsfla.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>