Shaohua Li [Fri, 1 May 2015 16:59:44 +0000 (09:59 -0700)]
blk-mq: fix FUA request hang
When a FUA request enters its DATA stage of flush pipeline, the
request is added to mq requeue list, the request will then be added to
ctx->rq_list. blk_mq_attempt_merge() might merge the request with a bio.
Later when the request is finished the flush pipeline, the
request->__data_len is 0. Then I only saw the bio gets endio called, the
original request never finish.
Adding REQ_FLUSH_SEQ into REQ_NOMERGE_FLAGS looks an easy fix.
stable: 3.15+
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
NeilBrown [Mon, 27 Apr 2015 04:12:22 +0000 (14:12 +1000)]
block: destroy bdi before blockdev is unregistered.
Because of the peculiar way that md devices are created (automatically
when the device node is opened), a new device can be created and
registered immediately after the
blk_unregister_region(disk_devt(disk), disk->minors);
call in del_gendisk().
Therefore it is important that all visible artifacts of the previous
device are removed before this call. In particular, the 'bdi'.
Since:
commit
c4db59d31e39ea067c32163ac961e9c80198fd37
Author: Christoph Hellwig <hch@lst.de>
fs: don't reassign dirty inodes to default_backing_dev_info
moved the
device_unregister(bdi->dev);
call from bdi_unregister() to bdi_destroy() it has been quite easy to
lose a race and have a new (e.g.) "md127" be created after the
blk_unregister_region() call and before bdi_destroy() is ultimately
called by the final 'put_disk', which must come after del_gendisk().
The new device finds that the bdi name is already registered in sysfs
and complains
> [ 9627.630029] WARNING: CPU: 18 PID: 3330 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x5a/0x70()
> [ 9627.630032] sysfs: cannot create duplicate filename '/devices/virtual/bdi/9:127'
We can fix this by moving the bdi_destroy() call out of
blk_release_queue() (which can happen very late when a refcount
reaches zero) and into blk_cleanup_queue() - which happens exactly when the md
device driver calls it.
Then it is only necessary for md to call blk_cleanup_queue() before
del_gendisk(). As loop.c devices are also created on demand by
opening the device node, we make the same change there.
Fixes: c4db59d31e39ea067c32163ac961e9c80198fd37
Reported-by: Azat Khuzhin <a3at.mail@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org (v4.0)
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Wang YanQing [Sun, 26 Apr 2015 08:43:31 +0000 (16:43 +0800)]
block:bounce: fix call inc_|dec_zone_page_state on different pages confuse value of NR_BOUNCE
Commit
d2c5e30c9a1420902262aa923794d2ae4e0bc391
("[PATCH] zoned vm counters: conversion of nr_bounce to per zone counter")
convert statistic of nr_bounce to per zone and one global value in vm_stat,
but it call inc_|dec_zone_page_state on different pages, then different
zones, and cause us to get unexpected value of NR_BOUNCE.
Below is the result on my machine:
Mar 2 09:26:08 udknight kernel: [144766.778265] Mem-Info:
Mar 2 09:26:08 udknight kernel: [144766.778266] DMA per-cpu:
Mar 2 09:26:08 udknight kernel: [144766.778268] CPU 0: hi: 0, btch: 1 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778269] CPU 1: hi: 0, btch: 1 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778270] Normal per-cpu:
Mar 2 09:26:08 udknight kernel: [144766.778271] CPU 0: hi: 186, btch: 31 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778273] CPU 1: hi: 186, btch: 31 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778274] HighMem per-cpu:
Mar 2 09:26:08 udknight kernel: [144766.778275] CPU 0: hi: 186, btch: 31 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778276] CPU 1: hi: 186, btch: 31 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778279] active_anon:46926 inactive_anon:287406 isolated_anon:0
Mar 2 09:26:08 udknight kernel: [144766.778279] active_file:105085 inactive_file:139432 isolated_file:0
Mar 2 09:26:08 udknight kernel: [144766.778279] unevictable:653 dirty:0 writeback:0 unstable:0
Mar 2 09:26:08 udknight kernel: [144766.778279] free:178957 slab_reclaimable:6419 slab_unreclaimable:9966
Mar 2 09:26:08 udknight kernel: [144766.778279] mapped:4426 shmem:305277 pagetables:784 bounce:0
Mar 2 09:26:08 udknight kernel: [144766.778279] free_cma:0
Mar 2 09:26:08 udknight kernel: [144766.778286] DMA free:3324kB min:68kB low:84kB high:100kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15900kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Mar 2 09:26:08 udknight kernel: [144766.778287] lowmem_reserve[]: 0 822 3754 3754
Mar 2 09:26:08 udknight kernel: [144766.778293] Normal free:26828kB min:3632kB low:4540kB high:5448kB active_anon:4872kB inactive_anon:68kB active_file:1796kB inactive_file:1796kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:892920kB managed:842560kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:4144kB slab_reclaimable:25676kB slab_unreclaimable:39864kB kernel_stack:1944kB pagetables:3136kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:
2412612 all_unreclaimable? yes
Mar 2 09:26:08 udknight kernel: [144766.778294] lowmem_reserve[]: 0 0 23451 23451
Mar 2 09:26:08 udknight kernel: [144766.778299] HighMem free:685676kB min:512kB low:3748kB high:6984kB active_anon:182832kB inactive_anon:1149556kB active_file:418544kB inactive_file:555932kB unevictable:2612kB isolated(anon):0kB isolated(file):0kB present:3001732kB managed:3001732kB mlocked:0kB dirty:0kB writeback:0kB mapped:17704kB shmem:1216964kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:75771152kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Mar 2 09:26:08 udknight kernel: [144766.778300] lowmem_reserve[]: 0 0 0 0
You can see bounce:75771152kB for HighMem, but bounce:0 for lowmem and global.
This patch fix it.
Signed-off-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Chao Yu [Thu, 23 Apr 2015 16:47:44 +0000 (10:47 -0600)]
elevator: fix double release of elevator module
Our issue is descripted in below call path:
->elevator_init
->elevator_init_fn
->{cfq,deadline,noop}_init_queue
->elevator_alloc
->kzalloc_node
fail to call kzalloc_node and then put module in elevator_alloc;
fail to call elevator_init_fn and then put module again in elevator_init.
Remove elevator_put invoking in error path of elevator_alloc to avoid
double release issue.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Tejun Heo [Tue, 21 Apr 2015 20:49:13 +0000 (16:49 -0400)]
writeback: use |1 instead of +1 to protect against div by zero
mm/page-writeback.c has several places where 1 is added to the divisor
to prevent division by zero exceptions; however, if the original
divisor is equivalent to -1, adding 1 leads to division by zero.
There are three places where +1 is used for this purpose - one in
pos_ratio_polynom() and two in bdi_position_ratio(). The second one
in bdi_position_ratio() actually triggered div-by-zero oops on a
machine running a 3.10 kernel. The divisor is
x_intercept - bdi_setpoint + 1 == span + 1
span is confirmed to be (u32)-1. It isn't clear how it ended up that
but it could be from write bandwidth calculation underflow fixed by
c72efb658f7c ("writeback: fix possible underflow in write bandwidth
calculation").
At any rate, +1 isn't a proper protection against div-by-zero. This
patch converts all +1 protections to |1. Note that
bdi_update_dirty_ratelimit() was already using |1 before this patch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
Ming Lei [Tue, 21 Apr 2015 02:00:20 +0000 (10:00 +0800)]
blk-mq: fix CPU hotplug handling
hctx->tags has to be set as NULL in case that it is to be unmapped
no matter if set->tags[hctx->queue_num] is NULL or not in blk_mq_map_swqueue()
because shared tags can be freed already from another request queue.
The same situation has to be considered during handling CPU online too.
Unmapped hw queue can be remapped after CPU topo is changed, so we need
to allocate tags for the hw queue in blk_mq_map_swqueue(). Then tags
allocation for hw queue can be removed in hctx cpu online notifier, and it
is reasonable to do that after mapping is updated.
Cc: <stable@vger.kernel.org>
Reported-by: Dongsu Park <dongsu.park@profitbricks.com>
Tested-by: Dongsu Park <dongsu.park@profitbricks.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Ming Lei [Tue, 21 Apr 2015 02:00:19 +0000 (10:00 +0800)]
blk-mq: fix race between timeout and CPU hotplug
Firstly during CPU hotplug, even queue is freezed, timeout
handler still may come and access hctx->tags, which may cause
use after free, so this patch deactivates timeout handler
inside CPU hotplug notifier.
Secondly, tags can be shared by more than one queues, so we
have to check if the hctx has been unmapped, otherwise
still use-after-free on tags can be triggered.
Cc: <stable@vger.kernel.org>
Reported-by: Dongsu Park <dongsu.park@profitbricks.com>
Tested-by: Dongsu Park <dongsu.park@profitbricks.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Keith Busch [Thu, 23 Apr 2015 16:24:45 +0000 (10:24 -0600)]
NVMe: Fix VPD B0 max sectors translation
Use the namespace's block format for reporting the max transfer length.
Max unmap count is left as-is since NVMe doesn't provide a max, so the
value the driver provided the block layer is valid for any format.
Reported-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Linus Torvalds [Fri, 17 Apr 2015 19:09:51 +0000 (15:09 -0400)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block core fix from Jens Axboe:
"A commit in the previous pull request introduce a regression. So far
only observed on qemu-sparc64, but it's a general bug. Please pull
this single fix to rectify that, thanks"
[ And it turns out that it's been seen outside of that qemu-sparc64
case, and is easy to trigger with small number of CPUs and blk-mq
enabled by default - Linus ]
* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq: fix iteration of busy bitmap
Linus Torvalds [Fri, 17 Apr 2015 19:01:29 +0000 (15:01 -0400)]
Merge tag 'acpica-4.1-rc1' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPICA updates from Rafael Wysocki:
"This updates the kernel's ACPICA code to upstream revision
20150410
and adds a fix for a GPE handling regression introduced during the
3.19 cycle on top of that.
Included are two stable-candidate bug fixes (one of them fixing a 3.16
regression), multiple other fixes and a bunch of cleanups.
Specifics:
- Fix for a GPE handling regression on Dell Latitude D600 that caused
GPE signaling to stop working on that machine, which appears to be
due to a hardware glitch, but it used to work and it can be made
work again in a relativly straightforward way (Rafael J Wysocki).
- Fix for a mutex unlock regression related to the handling of ACPI
tables introduced during the 3.16 development cycle (Octavian
Purdila).
- _REV modification to always return 2 which has been done by all
versions of Windows since NT and the firmware people started to use
it to distinguish between OSes in their AML and do some silly and
wrong things on that basis (Bob Moore).
- Fixes and cleanups related to the acpi_physicall_address data type
including one stable-candidate fix for an issue occasionally
occuring on 64-bit machines running 32-bit kernels where using
offsets provided by the firmware may lead to address overflows (Lv
Zheng).
- External() opcode support infrastructure needed for recompiling
disassembled ACPI tables in some cases including interpreter
modification to ignore that opcode (Bob Moore).
- Support for the "Windows 2015" string in _OSI (Bob Moore).
- GPE debug interface change to return values read from hardware
registers (Lv Zheng).
- Removal of the __DATE__ macro usage in tools (Rasmus Villemoes).
- Assorted minor fixes and cleanups (Lv Zheng, Rickard Strandqvist,
Bob Moore)"
* tag 'acpica-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits)
ACPICA: Store GPE register enable masks upfront
ACPICA: Update version to
20150410.
ACPICA: Fix a couple issues with the local printf module.
ACPICA: Disassembler: Some cleanup of the table dump module.
ACPICA: iASL: Add support for MSDM ACPI table.
ACPICA: Update for SLIC ACPI table.
ACPICA: Add "//" before ascii output of buffers.
ACPICA: Remove unused internal AML opcode.
ACPICA: Permanently set _REV to the value '2'.
ACPICA: Add "Windows 2015" string to _OSI support.
ACPICA: Add infrastructure for External() opcode.
ACPICA: iASL: Enhancement for constant folding.
ACPICA: iASL/Disassembler: Add option to assume table contains valid AML.
ACPICA: Update AML Debugger global variables.
ACPICA: Update Resource descriptor dump module.
ACPICA: Fix a sscanf format string.
ACPICA: Casting changes around acpi_physical_address/acpi_size.
ACPICA: Resources: Correct conditional compilation definitions.
ACPICA: Utilities: Correct conditional compilation definitions.
ACPICA: Tables: Move an iasl specific table function to iasl source file.
...
Jens Axboe [Fri, 17 Apr 2015 14:28:50 +0000 (08:28 -0600)]
blk-mq: fix iteration of busy bitmap
Commit
889fa31f00b2 was a bit too eager in reducing the loop count,
so we ended up missing queues in some configurations. Ensure that
our division rounds up, so that's not the case.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Fixes: 889fa31f00b2 ("blk-mq: reduce unnecessary software queue looping")
Signed-off-by: Jens Axboe <axboe@fb.com>
Linus Torvalds [Fri, 17 Apr 2015 13:04:38 +0000 (09:04 -0400)]
Merge branch 'akpm' (patches from Andrew)
Merge third patchbomb from Andrew Morton:
- various misc things
- a couple of lib/ optimisations
- provide DIV_ROUND_CLOSEST_ULL()
- checkpatch updates
- rtc tree
- befs, nilfs2, hfs, hfsplus, fatfs, adfs, affs, bfs
- ptrace fixes
- fork() fixes
- seccomp cleanups
- more mmap_sem hold time reductions from Davidlohr
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (138 commits)
proc: show locks in /proc/pid/fdinfo/X
docs: add missing and new /proc/PID/status file entries, fix typos
drivers/rtc/rtc-at91rm9200.c: make IO endian agnostic
Documentation/spi/spidev_test.c: fix warning
drivers/rtc/rtc-s5m.c: allow usage on device type different than main MFD type
.gitignore: ignore *.tar
MAINTAINERS: add Mediatek SoC mailing list
tomoyo: reduce mmap_sem hold for mm->exe_file
powerpc/oprofile: reduce mmap_sem hold for exe_file
oprofile: reduce mmap_sem hold for mm->exe_file
mips: ip32: add platform data hooks to use DS1685 driver
lib/Kconfig: fix up HAVE_ARCH_BITREVERSE help text
x86: switch to using asm-generic for seccomp.h
sparc: switch to using asm-generic for seccomp.h
powerpc: switch to using asm-generic for seccomp.h
parisc: switch to using asm-generic for seccomp.h
mips: switch to using asm-generic for seccomp.h
microblaze: use asm-generic for seccomp.h
arm: use asm-generic for seccomp.h
seccomp: allow COMPAT sigreturn overrides
...
Andrey Vagin [Thu, 16 Apr 2015 19:49:38 +0000 (12:49 -0700)]
proc: show locks in /proc/pid/fdinfo/X
Let's show locks which are associated with a file descriptor in
its fdinfo file.
Currently we don't have a reliable way to determine who holds a lock. We
can find some information in /proc/locks, but PID which is reported there
can be wrong. For example, a process takes a lock, then forks a child and
dies. In this case /proc/locks contains the parent pid, which can be
reused by another process.
$ cat /proc/locks
...
6: FLOCK ADVISORY WRITE 324 00:13:13431 0 EOF
...
$ ps -C rpcbind
PID TTY TIME CMD
332 ? 00:00:00 rpcbind
$ cat /proc/332/fdinfo/4
pos: 0
flags:
0100000
mnt_id: 22
lock: 1: FLOCK ADVISORY WRITE 324 00:13:13431 0 EOF
$ ls -l /proc/332/fd/4
lr-x------ 1 root root 64 Mar 5 14:43 /proc/332/fd/4 -> /run/rpcbind.lock
$ ls -l /proc/324/fd/
total 0
lrwx------ 1 root root 64 Feb 27 14:50 0 -> /dev/pts/0
lrwx------ 1 root root 64 Feb 27 14:50 1 -> /dev/pts/0
lrwx------ 1 root root 64 Feb 27 14:49 2 -> /dev/pts/0
You can see that the process with the 324 pid doesn't hold the lock.
This information is required for proper dumping and restoring file
locks.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Acked-by: Jeff Layton <jlayton@poochiereds.net>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nathan Scott [Thu, 16 Apr 2015 19:49:35 +0000 (12:49 -0700)]
docs: add missing and new /proc/PID/status file entries, fix typos
docs: add missing and new /proc/PID/status file entries, fix typos
Signed-off-by: Nathan Scott <nathans@redhat.com>
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks [Thu, 16 Apr 2015 19:49:32 +0000 (12:49 -0700)]
drivers/rtc/rtc-at91rm9200.c: make IO endian agnostic
Change the __raw IO calls to readl/write_relaxed which makes the driver
endian agnostic to run properly on big endian systems.
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Andrew Victor <linux@maxim.org.za>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Thu, 16 Apr 2015 19:49:29 +0000 (12:49 -0700)]
Documentation/spi/spidev_test.c: fix warning
Documentation/spi/spidev_test.c:83:5: warning: no previous prototype for 'unespcape' [-Wmissing-prototypes]
fix spelling too.
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Krzysztof Kozlowski [Thu, 16 Apr 2015 19:49:27 +0000 (12:49 -0700)]
drivers/rtc/rtc-s5m.c: allow usage on device type different than main MFD type
The RTC driver supports two flavors of S5M devices: S5M8767-like and
S2MPS14-like.
On S2MPS13 and S2MPS14 devices the RTC module is the same so we want to
re-use the existing support of S2MPS14. However device type was passed
from parent MFD driver in platform data structure. This way for the
S2MPS13 device the main MFD driver passed device type of 'S2MPS13X'.
Instead decouple detecting of device type between main MFD and RTC driver.
This allows adding support for other S2MPS14 variations (like S2MPS11 and
S2MPS13) easily by adding to mfd/sec-core.c:
static const struct mfd_cell s2mps13_devs[] = {
{ .name = "s2mps14-rtc", }
};
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrey Skvortsov [Thu, 16 Apr 2015 19:49:24 +0000 (12:49 -0700)]
.gitignore: ignore *.tar
Running make tar-pkg results in following:
# Untracked files:
# (use "git add <file>..." to include in what will be committed)
#
# linux-4.0.0-rc3-next-
20150313-150225--x86.tar
This patch makes git ignore *.tar files.
Running 'git ls-files -i --exclude-standard' does not show any
tar files excluded from tracking after the change.
Signed-off-by: Andrey Skvortsov <andrej.skvortzov@gmail.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Benjamin Romer <benjamin.romer@unisys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matthias Brugger [Thu, 16 Apr 2015 19:49:21 +0000 (12:49 -0700)]
MAINTAINERS: add Mediatek SoC mailing list
Add the new list that Mediatek specific patches should also be
directed to.
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Davidlohr Bueso [Thu, 16 Apr 2015 19:49:18 +0000 (12:49 -0700)]
tomoyo: reduce mmap_sem hold for mm->exe_file
The mm->exe_file is currently serialized with mmap_sem (shared) in order
to both safely (1) read the file and (2) compute the realpath by calling
tomoyo_realpath_from_path, making it an absolute overkill. Good users
will, on the other hand, make use of the more standard get_mm_exe_file(),
requiring only holding the mmap_sem to read the value, and relying on
reference
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Davidlohr Bueso [Thu, 16 Apr 2015 19:49:15 +0000 (12:49 -0700)]
powerpc/oprofile: reduce mmap_sem hold for exe_file
In the future mm->exe_file will be done without mmap_sem serialization,
thus isolate and reorganize the related code to make the transition
easier. Good users will, make use of the more standard get_mm_exe_file(),
requiring only holding the mmap_sem to read the value, and relying on
reference counting to make sure that the exe file won't dissappear
underneath us while getting the dcookie.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Robert Richter <rric@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Davidlohr Bueso [Thu, 16 Apr 2015 19:49:12 +0000 (12:49 -0700)]
oprofile: reduce mmap_sem hold for mm->exe_file
sync_buffer() needs the mmap_sem for two distinct operations, both only
occurring upon user context switch handling:
1) Dealing with the exe_file.
2) Adding the dcookie data as we need to lookup the vma that
backs it. This is done via add_sample() and add_data().
This patch isolates 1), for it will no longer need the mmap_sem for
serialization. However, for now, make of the more standard
get_mm_exe_file(), requiring only holding the mmap_sem to read the value,
and relying on reference counting to make sure that the exe file won't
dissappear underneath us while doing the get dcookie.
As a consequence, for 2) we move the mmap_sem locking into where we really
need it, in lookup_dcookie(). The benefits are twofold: reduce mmap_sem
hold times, and cleaner code.
[akpm@linux-foundation.org: export get_mm_exe_file for arch/x86/oprofile/oprofile.ko]
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Robert Richter <rric@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joshua Kinard [Thu, 16 Apr 2015 19:49:09 +0000 (12:49 -0700)]
mips: ip32: add platform data hooks to use DS1685 driver
This modifies the IP32 (SGI O2) platform and reset code to utilize the new
rtc-ds1685 driver. The old mc146818rtc.h header is removed and ip32_defconfig
is updated as well.
Signed-off-by: Joshua Kinard <kumba@gentoo.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Thu, 16 Apr 2015 19:49:07 +0000 (12:49 -0700)]
lib/Kconfig: fix up HAVE_ARCH_BITREVERSE help text
Cc: Yalin Wang <yalin.wang@sonymobile.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Thu, 16 Apr 2015 19:49:04 +0000 (12:49 -0700)]
x86: switch to using asm-generic for seccomp.h
Switch to using the newly created asm-generic/seccomp.h for the seccomp
strict mode syscall definitions. The obsolete sigreturn syscall override
is retained in 32-bit mode, and the ia32 syscall overrides are used in
the compat case. Remaining definitions were identical.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Thu, 16 Apr 2015 19:49:01 +0000 (12:49 -0700)]
sparc: switch to using asm-generic for seccomp.h
Switch to using the newly created asm-generic/seccomp.h for the seccomp
strict mode syscall definitions. The obsolete sigreturn in COMPAT mode
is retained as an override. Remaining definitions are identical. Also
corrected missing #define for header reinclusion protection.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Thu, 16 Apr 2015 19:48:58 +0000 (12:48 -0700)]
powerpc: switch to using asm-generic for seccomp.h
Switch to using the newly created asm-generic/seccomp.h for the seccomp
strict mode syscall definitions. The obsolete sigreturn in COMPAT mode is
retained as an override. Remaining definitions are identical, though they
incorrectly appeared in uapi, which has been corrected.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Thu, 16 Apr 2015 19:48:55 +0000 (12:48 -0700)]
parisc: switch to using asm-generic for seccomp.h
Switch to using the newly created asm-generic/seccomp.h for the seccomp
strict mode syscall definitions. Definitions were identical.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Thu, 16 Apr 2015 19:48:53 +0000 (12:48 -0700)]
mips: switch to using asm-generic for seccomp.h
Switch to using the newly created asm-generic/seccomp.h for the seccomp
strict mode syscall definitions. COMPAT definitions retain their
overrides and the remaining definitions were identical.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Thu, 16 Apr 2015 19:48:50 +0000 (12:48 -0700)]
microblaze: use asm-generic for seccomp.h
Switch to using the newly created asm-generic/seccomp.h for the seccomp
strict mode syscall definitions. Since microblaze is 32-bit, the COMPAT
seccomp defines are unused and can be dropped. The obsolete sigreturn
for seccomp strict mode is retained as an override. Remaining definitions
are identical.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Thu, 16 Apr 2015 19:48:47 +0000 (12:48 -0700)]
arm: use asm-generic for seccomp.h
Switch to using the newly created asm-generic/seccomp.h for the seccomp
strict mode syscall definitions. Definitions were identical.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Thu, 16 Apr 2015 19:48:44 +0000 (12:48 -0700)]
seccomp: allow COMPAT sigreturn overrides
Most architectures don't need to do much special for the strict-mode
seccomp syscall entries. Remove the redundant headers and reduce the
others.
This patch (of 8):
Some architectures may need to override the compat sigreturn definition,
as is already possible in the non-compat case.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Davidlohr Bueso [Thu, 16 Apr 2015 19:48:40 +0000 (12:48 -0700)]
arc: do not export symbols in troubleshoot.c
print_task_path_n_nm() is local to this file, its only user being
show_regs(). Mark the function static and avoid the EXPORT_SYMBOL.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Vineet Gupta <vgupta@synoipsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Simek [Thu, 16 Apr 2015 19:48:38 +0000 (12:48 -0700)]
include/linux/kconfig.h: ese macros which are already defined
It is better to use macros which are already available because then there
is just one location which needs to be change.
[akpm@linux-foundation.org: move IS_ENABLED definition to after the IS_BUILTIN and IS_MODULE definitions]
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Thu, 16 Apr 2015 19:48:35 +0000 (12:48 -0700)]
memstick: mspro_block: add missing curly braces
Using the indenting we can see the curly braces were obviously intended.
This is a static checker fix, but my guess is that we don't read enough
bytes, because we don't calculate "t_len" correctly.
Fixes: f1d82698029b ('memstick: use fully asynchronous request processing')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sanidhya Kashyap [Thu, 16 Apr 2015 19:48:32 +0000 (12:48 -0700)]
bfs: correct return values
In case of failed memory allocation, the return should be ENOMEM instead
of ENOSPC.
Return -EIO when sb_bread() fails.
Signed-off-by: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiri Slaby [Thu, 16 Apr 2015 19:48:29 +0000 (12:48 -0700)]
bfs: bfad_worker cleanup
This kthread is not loop at all due to break at the end of the loop. Make
that function linear, with no while loop.
And remove an unnecessary cast.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Anil Gurumurthy <anil.gurumurthy@qlogic.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sanidhya Kashyap [Thu, 16 Apr 2015 19:48:26 +0000 (12:48 -0700)]
affs: kstrdup() memory handling
There is a possibility of kstrdup() failure upon memory pressure.
Therefore, returning ENOMEM even for new_opts.
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: Taesoo kim <taesoo@gatech.edu>
Cc: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fabian Frederick [Thu, 16 Apr 2015 19:48:24 +0000 (12:48 -0700)]
fs/affs: use affs_test_opt()
Replace mount option test by affs_test_opt().
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fabian Frederick [Thu, 16 Apr 2015 19:48:21 +0000 (12:48 -0700)]
fs/affs/super.c: use affs_set_opt()
Replace direct mount option assignation by affs_set_opt() macro.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fabian Frederick [Thu, 16 Apr 2015 19:48:18 +0000 (12:48 -0700)]
fs/affs/affs.h: add mount option manipulation macros
Add clear/set/test affs mount option macros.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fabian Frederick [Thu, 16 Apr 2015 19:48:15 +0000 (12:48 -0700)]
fs/affs: use AFFS_MOUNT prefix for mount options
Currently, affs still uses direct access on mount_options. This patch
prepares to use affs_clear/set/test_opt() like other filesystems.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sanidhya Kashyap [Thu, 16 Apr 2015 19:48:13 +0000 (12:48 -0700)]
adfs: return correct return values
Fix the wrong values returned by various functions such as EIO and ENOMEM.
Signed-off-by: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: Fabian Frederick <fabf@skynet.be>
Cc: Joe Perches <joe@perches.com>
Cc: Taesoo kim <taesoo@gatech.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrey Ryabinin [Thu, 16 Apr 2015 19:48:10 +0000 (12:48 -0700)]
gcov: fix softlockups
gcov profiling if enabled with other heavy compile-time instrumentation
like KASan could trigger following softlockups:
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1]
Modules linked in:
irq event stamp:
22823276
hardirqs last enabled at (
22823275): [<
ffffffff86e8d10d>] mutex_lock_nested+0x7d9/0x930
hardirqs last disabled at (
22823276): [<
ffffffff86e9521d>] apic_timer_interrupt+0x6d/0x80
softirqs last enabled at (
22823172): [<
ffffffff811ed969>] __do_softirq+0x4db/0x729
softirqs last disabled at (
22823167): [<
ffffffff811edfcf>] irq_exit+0x7d/0x15b
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W
3.19.0-05245-gbb33326-dirty #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
task:
ffff88006cba8000 ti:
ffff88006cbb0000 task.ti:
ffff88006cbb0000
RIP: kasan_mem_to_shadow+0x1e/0x1f
Call Trace:
strcmp+0x28/0x70
get_node_by_name+0x66/0x99
gcov_event+0x4f/0x69e
gcov_enable_events+0x54/0x7b
gcov_fs_init+0xf8/0x134
do_one_initcall+0x1b2/0x288
kernel_init_freeable+0x467/0x580
kernel_init+0x15/0x18b
ret_from_fork+0x7c/0xb0
Kernel panic - not syncing: softlockup: hung tasks
Fix this by sticking cond_resched() in gcov_enable_events().
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Heinrich Schuchardt [Thu, 16 Apr 2015 19:48:07 +0000 (12:48 -0700)]
kernel/sysctl.c: detect overflows when converting to int
When converting unsigned long to int overflows may occur. These currently
are not detected when writing to the sysctl file system.
E.g. on a system where int has 32 bits and long has 64 bits
echo 0x800001234 > /proc/sys/kernel/threads-max
has the same effect as
echo 0x1234 > /proc/sys/kernel/threads-max
The patch adds the missing check in do_proc_dointvec_conv.
With the patch an overflow will result in an error EINVAL when writing to
the the sysctl file system.
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sergey Senozhatsky [Thu, 16 Apr 2015 19:48:04 +0000 (12:48 -0700)]
cpumask: don't perform while loop in cpumask_next_and()
cpumask_next_and() is looking for cpumask_next() in src1 in a loop and
tests if found cpu is also present in src2. remove that loop, perform
cpumask_and() of src1 and src2 first and use that new mask to find
cpumask_next().
Apart from removing while loop, ./bloat-o-meter on x86_64 shows
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-8 (-8)
function old new delta
cpumask_next_and 62 54 -8
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kirill Tkhai [Thu, 16 Apr 2015 19:48:01 +0000 (12:48 -0700)]
fs/exec.c:de_thread: move notify_count write under lock
We set sig->notify_count = -1 between RELEASE and ACQUIRE operations:
spin_unlock_irq(lock);
...
if (!thread_group_leader(tsk)) {
...
for (;;) {
sig->notify_count = -1;
write_lock_irq(&tasklist_lock);
There are no restriction on it so other processors may see this STORE
mixed with other STOREs in both areas limited by the spinlocks.
Probably, it may be reordered with the above
sig->group_exit_task = tsk;
sig->notify_count = zap_other_threads(tsk);
in some way.
Set it under tasklist_lock locked to be sure nothing will be reordered.
Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Davidlohr Bueso [Thu, 16 Apr 2015 19:47:59 +0000 (12:47 -0700)]
prctl: avoid using mmap_sem for exe_file serialization
Oleg cleverly suggested using xchg() to set the new mm->exe_file instead
of calling set_mm_exe_file() which requires some form of serialization --
mmap_sem in this case. For archs that do not have atomic rmw instructions
we still fallback to a spinlock alternative, so this should always be
safe. As such, we only need the mmap_sem for looking up the backing
vm_file, which can be done sharing the lock. Naturally, this means we
need to manually deal with both the new and old file reference counting,
and we need not worry about the MMF_EXE_FILE_CHANGED bits, which can
probably be deleted in the future anyway.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Konstantin Khlebnikov [Thu, 16 Apr 2015 19:47:56 +0000 (12:47 -0700)]
mm: rcu-protected get_mm_exe_file()
This patch removes mm->mmap_sem from mm->exe_file read side.
Also it kills dup_mm_exe_file() and moves exe_file duplication into
dup_mmap() where both mmap_sems are locked.
[akpm@linux-foundation.org: fix comment typo]
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Heinrich Schuchardt [Thu, 16 Apr 2015 19:47:53 +0000 (12:47 -0700)]
Doc/sysctl/kernel.txt: document threads-max
File /proc/sys/kernel/threads-max controls the maximum number of threads
that can be created using fork().
[akpm@linux-foundation.org: fix typo, per Guenter]
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Heinrich Schuchardt [Thu, 16 Apr 2015 19:47:50 +0000 (12:47 -0700)]
kernel/sysctl.c: threads-max observe limits
Users can change the maximum number of threads by writing to
/proc/sys/kernel/threads-max.
With the patch the value entered is checked against the same limits that
apply when fork_init is called.
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Heinrich Schuchardt [Thu, 16 Apr 2015 19:47:47 +0000 (12:47 -0700)]
kernel/fork.c: avoid division by zero
PAGE_SIZE is not guaranteed to be equal to or less than 8 times the
THREAD_SIZE.
E.g. architecture hexagon may have page size 1M and thread size 4096.
This would lead to a division by zero in the calculation of max_threads.
With 32-bit calculation there is no solution which delivers valid results
for all possible combinations of the parameters. The code is only called
once. Hence a 64-bit calculation can be used as solution.
[akpm@linux-foundation.org: use clamp_t(), per Oleg]
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Heinrich Schuchardt [Thu, 16 Apr 2015 19:47:44 +0000 (12:47 -0700)]
kernel/fork.c: new function for max_threads
PAGE_SIZE is not guaranteed to be equal to or less than 8 times the
THREAD_SIZE.
E.g. architecture hexagon may have page size 1M and thread size 4096.
This would lead to a division by zero in the calculation of max_threads.
With this patch the buggy code is moved to a separate function
set_max_threads. The error is not fixed.
After fixing the problem in a separate patch the new function can be
reused to adjust max_threads after adding or removing memory.
Argument mempages of function fork_init() is removed as totalram_pages is
an exported symbol.
The creation of separate patches for refactoring to a new function and for
fixing the logic was suggested by Ingo Molnar.
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jean Delvare [Thu, 16 Apr 2015 19:47:41 +0000 (12:47 -0700)]
fork_init: update max_threads comment
The comment explaining what value max_threads is set to is outdated. The
maximum memory consumption ratio for thread structures was 1/2 until
February 2002, then it was briefly changed to 1/16 before being set to 1/8
which we still use today. The comment was never updated to reflect that
change, it's about time.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Thu, 16 Apr 2015 19:47:38 +0000 (12:47 -0700)]
fork: report pid reservation failure properly
copy_process will report any failure in alloc_pid as ENOMEM currently
which is misleading because the pid allocation might fail not only when
the memory is short but also when the pid space is consumed already.
The current man page even mentions this case:
: EAGAIN
:
: A system-imposed limit on the number of threads was encountered.
: There are a number of limits that may trigger this error: the
: RLIMIT_NPROC soft resource limit (set via setrlimit(2)), which
: limits the number of processes and threads for a real user ID, was
: reached; the kernel's system-wide limit on the number of processes
: and threads, /proc/sys/kernel/threads-max, was reached (see
: proc(5)); or the maximum number of PIDs, /proc/sys/kernel/pid_max,
: was reached (see proc(5)).
so the current behavior is also incorrect wrt. documentation. POSIX man
page also suggest returing EAGAIN when the process count limit is reached.
This patch simply propagates error code from alloc_pid and makes sure we
return -EAGAIN due to reservation failure. This will make behavior of
fork closer to both our documentation and POSIX.
alloc_pid might alsoo fail when the reaper in the pid namespace is dead
(the namespace basically disallows all new processes) and there is no
good error code which would match documented ones. We have traditionally
returned ENOMEM for this case which is misleading as well but as per
Eric W. Biederman this behavior is documented in man pid_namespaces(7)
: If the "init" process of a PID namespace terminates, the kernel
: terminates all of the processes in the namespace via a SIGKILL signal.
: This behavior reflects the fact that the "init" process is essential for
: the correct operation of a PID namespace. In this case, a subsequent
: fork(2) into this PID namespace will fail with the error ENOMEM; it is
: not possible to create a new processes in a PID namespace whose "init"
: process has terminated.
and introducing a new error code would be too risky so let's stick to
ENOMEM for this case.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vladimir Davydov [Thu, 16 Apr 2015 19:47:35 +0000 (12:47 -0700)]
signal: remove warning about using SI_TKILL in rt_[tg]sigqueueinfo
Sending SI_TKILL from rt_[tg]sigqueueinfo was deprecated, so now we issue
a warning on the first attempt of doing it. We use WARN_ON_ONCE, which is
not informative and, what is worse, taints the kernel, making the trinity
syscall fuzzer complain false-positively from time to time.
It does not look like we need this warning at all, because the behaviour
changed quite a long time ago (2.6.39), and if an application relies on
the old API, it gets EPERM anyway and can issue a warning by itself.
So let us zap the warning in kernel.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Thu, 16 Apr 2015 19:47:32 +0000 (12:47 -0700)]
ptrace: ptrace_detach() can no longer race with SIGKILL
ptrace_detach() re-checks ->ptrace under tasklist lock and calls
release_task() if __ptrace_detach() returns true. This was needed because
the __TASK_TRACED tracee could be killed/untraced, and it could even pass
exit_notify() before we take tasklist_lock.
But this is no longer possible after
9899d11f6544 "ptrace: ensure
arch_ptrace/ptrace_request can never race with SIGKILL". We can turn
these checks into WARN_ON() and remove release_task().
While at it, document the setting of child->exit_code.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Labath <labath@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Thu, 16 Apr 2015 19:47:29 +0000 (12:47 -0700)]
ptrace: fix race between ptrace_resume() and wait_task_stopped()
ptrace_resume() is called when the tracee is still __TASK_TRACED. We set
tracee->exit_code and then wake_up_state() changes tracee->state. If the
tracer's sub-thread does wait() in between, task_stopped_code(ptrace => T)
wrongly looks like another report from tracee.
This confuses debugger, and since wait_task_stopped() clears ->exit_code
the tracee can miss a signal.
Test-case:
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <pthread.h>
#include <assert.h>
int pid;
void *waiter(void *arg)
{
int stat;
for (;;) {
assert(pid == wait(&stat));
assert(WIFSTOPPED(stat));
if (WSTOPSIG(stat) == SIGHUP)
continue;
assert(WSTOPSIG(stat) == SIGCONT);
printf("ERR! extra/wrong report:%x\n", stat);
}
}
int main(void)
{
pthread_t thread;
pid = fork();
if (!pid) {
assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);
for (;;)
kill(getpid(), SIGHUP);
}
assert(pthread_create(&thread, NULL, waiter, NULL) == 0);
for (;;)
ptrace(PTRACE_CONT, pid, 0, SIGCONT);
return 0;
}
Note for stable: the bug is very old, but without
9899d11f6544 "ptrace:
ensure arch_ptrace/ptrace_request can never race with SIGKILL" the fix
should use lock_task_sighand(child).
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Pavel Labath <labath@google.com>
Tested-by: Pavel Labath <labath@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Kuleshov [Thu, 16 Apr 2015 19:47:26 +0000 (12:47 -0700)]
fs/fat: comment fix, fat_bits can be also 32
Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Kuleshov [Thu, 16 Apr 2015 19:47:24 +0000 (12:47 -0700)]
fs/fat: remove unnecessary includes
'fat.h' includes <linux/buffer_head.h> which includes <linux/fs.h> which
includes all the header files required for all *.c files fat filesystem.
[akpm@linux-foundation.org: fs/fat/iode.c needs seq_file.h]
[sfr@canb.auug.org.au: put one actually necessary include file back]
Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Kuleshov [Thu, 16 Apr 2015 19:47:21 +0000 (12:47 -0700)]
fs/fat: remove unnecessary defintion
'*sb' never used, so let's remote it and pass inode->i_sb directly to the
MSDOS_SB.
Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Hebb [Thu, 16 Apr 2015 19:47:18 +0000 (12:47 -0700)]
hfsplus: don't store special "osx" xattr prefix on-disk
On Mac OS X, HFS+ extended attributes are not namespaced. Since we want
to be compatible with OS X filesystems and yet still support the Linux
namespacing system, the hfsplus driver implements a special "osx"
namespace that is reported for any attribute that is not namespaced
on-disk. However, the current code for getting and setting these
unprefixed attributes is broken.
hfsplus_osx_setattr() and hfsplus_osx_getattr() are passed names that have
already had their "osx." prefixes stripped by the generic functions. The
functions first, quite correctly, check those names to make sure that they
aren't prefixed with a known namespace, which would allow namespace access
restrictions to be bypassed. However, the functions then prepend "osx."
to the name they're given before passing it on to hfsplus_getattr() and
hfsplus_setattr(). Not only does this cause the "osx." prefix to be
stored on-disk, defeating its purpose, it also breaks the check for the
special "com.apple.FinderInfo" attribute, which is reported for all files,
and as a consequence makes some userspace applications (e.g. GNU patch)
fail even when extended attributes are not otherwise in use.
There are five commits which have touched this particular code:
127e5f5ae51e ("hfsplus: rework functionality of getting, setting and deleting of extended attributes")
b168fff72109 ("hfsplus: use xattr handlers for removexattr")
bf29e886b242 ("hfsplus: correct usage of HFSPLUS_ATTR_MAX_STRLEN for non-English attributes")
fcacbd95e121 ("fs/hfsplus: move xattr_name allocation in hfsplus_getxattr()")
ec1bbd346f18 ("fs/hfsplus: move xattr_name allocation in hfsplus_setxattr()")
The first commit creates the functions to begin with. The namespace is
prepended by the original code, which I believe was correct at the time,
since hfsplus_?etattr() stripped the prefix if found. The second commit
removes this behavior from hfsplus_?etattr() and appears to have been
intended to also remove the prefixing from hfsplus_osx_?etattr().
However, what it actually does is remove a necessary strncpy() call
completely, breaking the osx namespace entirely. The third commit re-adds
the strncpy() call as it was originally, but doesn't mention it in its
commit message. The final two commits refactor the code and don't affect
its functionality.
This commit does what
b168fff attempted to do (prevent the prefix from
being added), but does it properly, instead of passing in an empty buffer
(which is what
b168fff actually did).
Fixes: b168fff72109 ("hfsplus: use xattr handlers for removexattr")
Signed-off-by: Thomas Hebb <tommyhebb@gmail.com>
Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Sergei Antonov <saproj@gmail.com>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Fabian Frederick <fabf@skynet.be>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sergei Antonov [Thu, 16 Apr 2015 19:47:15 +0000 (12:47 -0700)]
hfsplus: fix expand when not enough available space
Fix a bug which is reproduced as follows. Create a file:
echo abc > test_file
Try to expand the file beyond available space:
truncate --size=<size exceeding available space> test_file
Since HFS+ does not support file size > allocated size, truncate should
fail. However, it ends successfully. The driver returns success despite
having been unable to allocate the requested space for the file. Also
filesystem check finds an error:
Checking catalog file.
Incorrect size for file test_file
(It should be
469094400 instead of
1000000000)
Add a piece of code analogous to code in the fat driver. Now a proper
error is returned and filesystem remains consistent.
Signed-off-by: Sergei Antonov <saproj@gmail.com>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
Reviewed-by: Anton Altaparmakov <anton@tuxera.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Sougata Santra <sougata@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chengyu Song [Thu, 16 Apr 2015 19:47:12 +0000 (12:47 -0700)]
hfsplus: incorrect return value
In case of memory allocation error, the return should be -ENOMEM, instead
of -ENOSPC.
Signed-off-by: Chengyu Song <csong84@gatech.edu>
Reviewed-by: Sergei Antonov <saproj@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fabian Frederick [Thu, 16 Apr 2015 19:47:09 +0000 (12:47 -0700)]
fs/hfsplus: replace if/BUG by BUG_ON
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fabian Frederick [Thu, 16 Apr 2015 19:47:07 +0000 (12:47 -0700)]
fs/hfsplus: use bool instead of int for is_known_namespace() return value
is_known_namespace() only returns true/false. Also remove inline and let
compiler decide what to do with static functions.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fabian Frederick [Thu, 16 Apr 2015 19:47:04 +0000 (12:47 -0700)]
fs/hfsplus: atomically set inode->i_flags
According to commit
5f16f3225b06 ("ext4: atomically set inode->i_flags in
ext4_set_inode_flags()").
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fabian Frederick [Thu, 16 Apr 2015 19:47:01 +0000 (12:47 -0700)]
fs/hfsplus: move xattr_name allocation in hfsplus_setxattr()
security/trusted/user/osx setxattr did the same
xattr_name initialization. Move that operation in hfsplus_setxattr().
Tested with security/trusted/user getfattr/setfattr
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fabian Frederick [Thu, 16 Apr 2015 19:46:58 +0000 (12:46 -0700)]
fs/hfsplus: move xattr_name allocation in hfsplus_getxattr()
security/trusted/user/osx getxattr did the same
xattr_name initialization. Move that operation in hfsplus_getxattr().
Tested with security/trusted/user getfattr/setfattr
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Thu, 16 Apr 2015 19:46:56 +0000 (12:46 -0700)]
hfsplus: add missing curly braces in hfsplus_delete_cat()
This doesn't change how the code works, but clearly the curly braces were
intended.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Sougata Santra <sougata@tuxera.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chengyu Song [Thu, 16 Apr 2015 19:46:53 +0000 (12:46 -0700)]
hfs: incorrect return values
In case of memory allocation error, the return should be -ENOMEM, instead
of -ENOSPC.
Signed-off-by: Chengyu Song <csong84@gatech.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Thu, 16 Apr 2015 19:46:50 +0000 (12:46 -0700)]
nilfs2: use inode_set_flags() in nilfs_set_inode_flags()
Use inode_set_flags() to atomically set i_flags instead of clearing out
the S_IMMUTABLE, S_APPEND, etc. flags and then setting them from the
FS_IMMUTABLE_FL, FS_APPEND_FL flags to avoid a race where an immutable
file has the immutable flag cleared for a brief window of time.
This is a similar fix to commit
5f16f3225b06 ("ext4: atomically set
inode->i_flags in ext4_set_inode_flags()").
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Thu, 16 Apr 2015 19:46:47 +0000 (12:46 -0700)]
nilfs2: put out gfp mask manipulation from nilfs_set_inode_flags()
nilfs_set_inode_flags() function adjusts gfp-mask of inode->i_mapping as
well as i_flags, however, this coupling of operations is not appropriate.
For instance, nilfs_ioctl_setflags(), one of three callers of
nilfs_set_inode_flags(), doesn't need to reinitialize the gfp-mask at all.
In addition, nilfs_new_inode(), another caller of
nilfs_set_inode_flags(), doesn't either because it has already initialized
the gfp-mask.
Only __nilfs_read_inode(), the remaining caller, needs it. So, this moves
the gfp mask manipulation to __nilfs_read_inode() from
nilfs_set_inode_flags().
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Thu, 16 Apr 2015 19:46:45 +0000 (12:46 -0700)]
nilfs2: fix gcc warning at nilfs_checkpoint_is_mounted()
Fix the following build warning:
fs/nilfs2/super.c: In function 'nilfs_checkpoint_is_mounted':
fs/nilfs2/super.c:1023:10: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
if (cno < 0 || cno > nilfs->ns_cno)
^
This warning indicates that the comparision "cno < 0" is useless because
variable "cno" has an unsigned integer type "__u64".
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Thu, 16 Apr 2015 19:46:42 +0000 (12:46 -0700)]
nilfs2: improve execution time of NILFS_IOCTL_GET_CPINFO ioctl
The older a filesystem gets, the slower lscp command becomes. This is
because nilfs_cpfile_do_get_cpinfo() function meets more hole blocks
as the start offset of valid checkpoint numbers gets bigger.
This reduces the overhead by skipping hole blocks efficiently with
nilfs_mdt_find_block() helper.
A measurement result of this patch is as follows:
Before:
$ time lscp
CNO DATE TIME MODE FLG BLKCNT ICNT
5769303 2015-02-22 19:31:33 cp - 108 1
5769304 2015-02-22 19:38:54 cp - 108 1
real 0m0.182s
user 0m0.003s
sys 0m0.180s
After:
$ time lscp
CNO DATE TIME MODE FLG BLKCNT ICNT
5769303 2015-02-22 19:31:33 cp - 108 1
5769304 2015-02-22 19:38:54 cp - 108 1
real 0m0.003s
user 0m0.001s
sys 0m0.002s
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Thu, 16 Apr 2015 19:46:39 +0000 (12:46 -0700)]
nilfs2: add helper to find existent block on metadata file
Add a new metadata file function, nilfs_mdt_find_block(), which finds
an existent block on a metadata file in a given range of blocks. This
function skips continuous hole blocks efficiently by using
nilfs_bmap_seek_key().
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Thu, 16 Apr 2015 19:46:36 +0000 (12:46 -0700)]
nilfs2: add bmap function to seek a valid key
Add a new bmap function, nilfs_bmap_seek_key(), which seeks a valid
entry and returns its key starting from a given key. This function
can be used to skip hole blocks efficiently.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Thu, 16 Apr 2015 19:46:34 +0000 (12:46 -0700)]
nilfs2: unify type of key arguments in bmap interface
The type of key arguments in block mapping interface varies depending
on function. For instance, nilfs_bmap_lookup_at_level() takes "__u64"
for its key argument whereas nilfs_bmap_lookup() takes "unsigned
long".
This fits them to "__u64" to eliminate the variation.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Thu, 16 Apr 2015 19:46:31 +0000 (12:46 -0700)]
nilfs2: use bgl_lock_ptr()
Simplify nilfs_mdt_bgl_lock() by utilizing bgl_lock_ptr() helper in
<linux/blockgroup_lock.h>.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Thu, 16 Apr 2015 19:46:28 +0000 (12:46 -0700)]
nilfs2: use set_mask_bits() for operations on buffer state bitmap
nilfs_forget_buffer(), nilfs_clear_dirty_page(), and
nilfs_segctor_complete_write() are using a bunch of atomic bit operations
against buffer state bitmap.
This reduces the number of them by utilizing set_mask_bits() macro.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Thu, 16 Apr 2015 19:46:25 +0000 (12:46 -0700)]
nilfs2: do not use async write flag for segment summary buffers
The async write flag is introduced to nilfs2 in the commit
7f42ec394156
("nilfs2: fix issue with race condition of competition between segments
for dirty blocks"), but the flag only makes sense for data buffers and
btree node buffers. It is not needed for segment summary buffers.
This gets rid of the latter uses as part of refactoring of atomic bit
operations on buffer state bitmap.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fabian Frederick [Thu, 16 Apr 2015 19:46:23 +0000 (12:46 -0700)]
befs: replace typedef befs_inode_info by structure
See Documentation/CodingStyle
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fabian Frederick [Thu, 16 Apr 2015 19:46:20 +0000 (12:46 -0700)]
befs: replace typedef befs_sb_info by structure
See Documenation/CodingStyle
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fabian Frederick [Thu, 16 Apr 2015 19:46:17 +0000 (12:46 -0700)]
befs: replace typedef befs_mount_options by structure
See Documentation/CodingStyle
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Thu, 16 Apr 2015 19:46:14 +0000 (12:46 -0700)]
rtc: use more standard kernel logging styles
Neaten the logging a bit by adding #define pr_fmt
Miscellanea:
o Remove __FILE__/__func__ uses
o Coalesce formats adding missing spaces
o Align arguments
o (rtc-cmos) Integrated 2 consecutive messages
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Joshua Kinard <kumba@gentoo.org>
Cc: Chanwoo Choi <cw00.choi@samsung.com>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Heiko Stübner [Thu, 16 Apr 2015 19:46:11 +0000 (12:46 -0700)]
drivers/rtc/rtc-hym8563.c: fix swapped enable/disable of clockout control bit
The hym8563 datasheet describes the clock output control-bit as "when set
to logic 0, the square wave output is enable, when set to logic 1, the
CLKOUT output is inhibited". But in reality the setting is exactly
opposite.
Before now, the clock output was not really used, but on the rk3288 soc
this generated clock is used to supply the temperature sensor block and
the swapped bit value prevented it from working. With the corrected
value, the tsadc now reports correct values.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Krzysztof Kozlowski [Thu, 16 Apr 2015 19:46:08 +0000 (12:46 -0700)]
drivers/rtc/rtc-s3c.c: remove one superfluous rtc_valid_tm() check
s3c_rtc_gettime() already returns the result of rtc_valid_tm() on the
obtained time so get rid of another call to rtc_valid_tm().
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lokesh Vutla [Thu, 16 Apr 2015 19:46:05 +0000 (12:46 -0700)]
drivers/rtc/rtc-omap.c: use module_platform_driver
module_platform_driver_probe() prevents driver from requesting probe
deferral. So using module_platform_drive() to support probe deferral.
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Paul Walmsley <paul@pwsan.com>
Cc: Tero Kristo <t-kristo@ti.com>
Cc: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lokesh Vutla [Thu, 16 Apr 2015 19:46:03 +0000 (12:46 -0700)]
drivers/rtc/Kconfig: update Kconfig for OMAP RTC
RTC is present in AM43xx and DRA7xx also. Updating the Kconfig to depend
on ARCH_OMAP or ARCH_DAVINCI
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Paul Walmsley <paul@pwsan.com>
Cc: Tero Kristo <t-kristo@ti.com>
Cc: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lokesh Vutla [Thu, 16 Apr 2015 19:46:00 +0000 (12:46 -0700)]
drivers/rtc/rtc-omap.c: unlock and lock rtc registers before and after register writes
RTC module contains a kicker mechanism to prevent any spurious writes from
changing the register values. This mechanism requires two MMR writes to
the KICK0 and KICK1 registers with exact data values before the kicker
lock mechanism is released.
Currently the driver release the lock in the probe and leaves it enabled
until the rtc driver removal. This eliminates the idea of preventing
spurious writes when RTC driver is loaded. So implement rtc lock and
unlock functions before and after register writes.
This is as advised by Paul to implement lock and unlock functions in the
driver and not to unlock and leave it in probe. The same discussion can
be seen here:
http://www.mail-archive.com/linux-omap%40vger.kernel.org/msg111588.html
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Paul Walmsley <paul@pwsan.com>
Cc: Tero Kristo <t-kristo@ti.com>
Cc: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Krzysztof Kozlowski [Thu, 16 Apr 2015 19:45:57 +0000 (12:45 -0700)]
drivers/rtc/rtc-s3c.c: fix failed first read of RTC time
Initialize the device time (if it is wrong) before registering RTC device
to fix following error message during rtc-s3c probe:
[ 2.215414] rtc (null): read_time: fail to read
[ 2.216322] s3c-rtc
10070000.rtc: rtc core: registered s3c as rtc1
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aaro Koskinen [Thu, 16 Apr 2015 19:45:54 +0000 (12:45 -0700)]
rtc: hctosys: use function name in the error log
Use function name in the error log instead of __FILE__.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aaro Koskinen [Thu, 16 Apr 2015 19:45:51 +0000 (12:45 -0700)]
drivers/rtc/interface.c: __rtc_read_time: reduce log level
__rtc_read_time logs should be debug logs instead of error logs.
For example, when the RTC clock is not set, it's not really useful
to print a kernel error log every time someone tries to read the clock:
~ # hwclock -r
[ 604.508263] rtc rtc0: read_time: fail to read
hwclock: RTC_RD_TIME: Invalid argument
If there's a real error, it's likely that lower level or higher level
code will tell it anyway. Make these logs debug logs, and also print
the error code for the read failure.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aaro Koskinen [Thu, 16 Apr 2015 19:45:48 +0000 (12:45 -0700)]
drivers/rtc/class.c: initialize rtc name early
In some error cases RTC name is used before it is initialized:
rtc-rs5c372 0-0032: clock needs to be set
rtc-rs5c372 0-0032: rs5c372b found, 24hr, driver version 0.6
rtc (null): read_time: fail to read
rtc-rs5c372 0-0032: rtc core: registered rtc-rs5c372 as rtc0
Fix by initializing the name early.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Krzysztof Kozlowski [Thu, 16 Apr 2015 19:45:45 +0000 (12:45 -0700)]
drivers/rtc/rtc-s5m.c: add support for S2MPS13 RTC
The S2MPS13 RTC is almost the same as S2MPS14. The differences when
updating alarm are:
1. Set WUDR+AUDR field instead of WUDR+RUDR.
2. Clear the AUDR field later (it is not auto-cleared).
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexandre Belloni [Thu, 16 Apr 2015 19:45:43 +0000 (12:45 -0700)]
MAINTAINERS: Add Alexandre Belloni as an RTC maintainer
I've noticed that most of the patches for the RTC subsystem are currently
either taken directly by Andrew or going through another maintainer's
tree, quite often without an Acked-by or Reviewed-by tag.
I'd like to propose myself as the RTC subsystem co-maintainer, to mainly
help Alessandro reviewing incoming patches and maintain a subsystem tree
to avoid having the RTC patches going through trees when they have no
particular dependencies.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Baruch Siach [Thu, 16 Apr 2015 19:45:40 +0000 (12:45 -0700)]
rtc: driver for Conexant Digicolor CX92755 on-chip RTC
Add driver for the RTC hardware block on the Conexant CX92755 SoC, from
the Digicolor series of SoCs. Tested on the Equinox evaluation board for
the CX92755 chip.
[akpm@linux-foundation.org: build command arrays at compile-time]
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Baruch Siach [Thu, 16 Apr 2015 19:45:37 +0000 (12:45 -0700)]
rtc: digicolor: document device tree binding
Add a device tree binding documentation to the Real Time Clock hardware
block on the Conexant CX92755 SoC. The CX92755 is from the Digicolor SoCs
series. Other SoCs in that series may share the same hardware block.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Heiko Stuebner [Thu, 16 Apr 2015 19:45:34 +0000 (12:45 -0700)]
drivers/rtc/rtc-hym8563.c: return clock rate even when clock is disabled
When the clock is disabled, do not return a rate of 0 but instead return
the rate the clock will be running at after it gets enabled. This
prevents problems when the core clock code is trying to determine a
suitable rate, while the clock is still off.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adam Ward [Thu, 16 Apr 2015 19:45:32 +0000 (12:45 -0700)]
drivers/rtc/rtc-da9052.c: register ability of alarm to wake device from suspend
Signed-off-by: Adam Ward <adam.ward.opensource@diasemi.com>
Tested-by: Adam Ward <adam.ward.opensource@diasemi.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>