Programming Languages Research Group: Git - firefly-linux-kernel-4.4.55.git/log

projects / firefly-linux-kernel-4.4.55.git / log

Chuck Lever [Thu, 17 Oct 2013 18:13:02 +0000 (14:13 -0400)]

NFS: Add basic migration support to state manager thread

Migration recovery and state recovery must be serialized, so handle
both in the state manager thread.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Chuck Lever [Thu, 17 Oct 2013 18:12:56 +0000 (14:12 -0400)]

NFS: Add a super_block backpointer to the nfs_server struct

NFS_SB() returns the pointer to an nfs_server struct, given a
pointer to a super_block. But we have no way to go back the other
way.

Add a super_block backpointer field so that, given an nfs_server
struct, it is easy to get to the filesystem's root dentry.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Chuck Lever [Thu, 17 Oct 2013 18:12:50 +0000 (14:12 -0400)]

NFS: Add method to retrieve fs_locations during migration recovery

The nfs4_proc_fs_locations() function is invoked during referral
processing to perform a GETATTR(fs_locations) on an object's parent
directory in order to discover the target of the referral.  It
performs a LOOKUP in the compound, so the client needs to know the
parent's file handle a priori.

Unfortunately this function is not adequate for handling migration
recovery.  We need to probe fs_locations information on an FSID, but
there's no parent directory available for many operations that
can return NFS4ERR_MOVED.

Another subtlety: recovering from NFS4ERR_LEASE_MOVED is a process
of walking over a list of known FSIDs that reside on the server, and
probing whether they have migrated.  Once the server has detected
that the client has probed all migrated file systems, it stops
returning NFS4ERR_LEASE_MOVED.

A minor version zero server needs to know what client ID is
requesting fs_locations information so it can clear the flag that
forces it to continue returning NFS4ERR_LEASE_MOVED.  This flag is
set per client ID and per FSID.  However, the client ID is not an
argument of either the PUTFH or GETATTR operations.  Later minor
versions have client ID information embedded in the compound's
SEQUENCE operation.

Therefore, by convention, minor version zero clients send a RENEW
operation in the same compound as the GETATTR(fs_locations), since
RENEW's one argument is a clientid4.  This allows a minor version
zero server to identify correctly the client that is probing for a
migration.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Chuck Lever [Thu, 17 Oct 2013 18:12:45 +0000 (14:12 -0400)]

NFS: Export _nfs_display_fhandle()

Allow code in nfsv4.ko to use _nfs_display_fhandle().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Chuck Lever [Thu, 17 Oct 2013 18:12:39 +0000 (14:12 -0400)]

NFS: Introduce a vector of migration recovery ops

The differences between minor version 0 and minor version 1
migration will be abstracted by the addition of a set of migration
recovery ops.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Chuck Lever [Thu, 17 Oct 2013 18:12:34 +0000 (14:12 -0400)]

NFS: Add functions to swap transports during migration recovery

Introduce functions that can walk through an array of returned
fs_locations information and connect a transport to one of the
destination servers listed therein.

Note that NFS minor version 1 introduces "fs_locations_info" which
extends the locations array sorting criteria available to clients.
This is not supported yet.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Chuck Lever [Thu, 17 Oct 2013 18:12:28 +0000 (14:12 -0400)]

NFS: Add nfs4_update_server

New function nfs4_update_server() moves an nfs_server to a different
nfs_client. This is done as part of migration recovery.

Though it may be appealing to think of them as the same thing,
migration recovery is not the same as following a referral.

For a referral, the client has not descended into the file system
yet: it has no nfs_server, no super block, no inodes or open state.
It is enough to simply instantiate the nfs_server and super block,
and perform a referral mount.

For a migration, however, we have all of those things already, and
they have to be moved to a different nfs_client. No local namespace
changes are needed here.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Trond Myklebust [Thu, 17 Oct 2013 18:12:23 +0000 (14:12 -0400)]

SUNRPC: Add a helper to switch the transport of an rpc_clnt

Add an RPC client API to redirect an rpc_clnt's transport from a
source server to a destination server during a migration event.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
[ cel: forward ported to 3.12 ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Chuck Lever [Thu, 17 Oct 2013 18:12:17 +0000 (14:12 -0400)]

SUNRPC: Modify synopsis of rpc_client_register()

The rpc_client_register() helper was added in commit e73f4cc0,
"SUNRPC: split client creation routine into setup and registration,"
Mon Jun 24 11:52:52 2013. In a subsequent patch, I'd like to invoke
rpc_client_register() from a context where a struct rpc_create_args
is not available.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Weston Andros Adamson [Mon, 21 Oct 2013 17:10:13 +0000 (13:10 -0400)]

NFSv4: don't reprocess cached open CLAIM_PREVIOUS

Cached opens have already been handled by _nfs4_opendata_reclaim_to_nfs4_state
and can safely skip being reprocessed, but must still call update_open_stateid
to make sure that all active fmodes are recovered.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Cc: stable@vger.kernel.org # 3.7.x: f494a6071d3: NFSv4: fix NULL dereference
Cc: stable@vger.kernel.org # 3.7.x: a43ec98b72a: NFSv4: don't fail on missin
Cc: stable@vger.kernel.org # 3.7.x
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Trond Myklebust [Mon, 28 Oct 2013 18:57:12 +0000 (14:57 -0400)]

NFSv4: Fix state reference counting in _nfs4_opendata_reclaim_to_nfs4_state

Currently, if the call to nfs_refresh_inode fails, then we end up leaking
a reference count, due to the call to nfs4_get_open_state.
While we're at it, replace nfs4_get_open_state with a simple call to
atomic_inc(); there is no need to do a full lookup of the struct nfs_state
since it is passed as an argument in the struct nfs4_opendata, and
is already assigned to the variable 'state'.

Cc: stable@vger.kernel.org # 3.7.x: a43ec98b72a: NFSv4: don't fail on missing
Cc: stable@vger.kernel.org # 3.7.x
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Weston Andros Adamson [Mon, 21 Oct 2013 17:10:11 +0000 (13:10 -0400)]

NFSv4: don't fail on missing fattr in open recover

This is an unneeded check that could cause the client to fail to recover
opens.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Weston Andros Adamson [Mon, 21 Oct 2013 17:10:10 +0000 (13:10 -0400)]

NFSv4: fix NULL dereference in open recover

_nfs4_opendata_reclaim_to_nfs4_state doesn't expect to see a cached
open CLAIM_PREVIOUS, but this can happen. An example is when there are
RDWR openers and RDONLY openers on a delegation stateid. The recovery
path will first try an open CLAIM_PREVIOUS for the RDWR openers, this
marks the delegation as not needing RECLAIM anymore, so the open
CLAIM_PREVIOUS for the RDONLY openers will not actually send an rpc.

The NULL dereference is due to _nfs4_opendata_reclaim_to_nfs4_state
returning PTR_ERR(rpc_status) when !rpc_done. When the open is
cached, rpc_done == 0 and rpc_status == 0, thus
_nfs4_opendata_reclaim_to_nfs4_state returns NULL - this is unexpected
by callers of nfs4_opendata_to_nfs4_state().

This can be reproduced easily by opening the same file two times on an
NFSv4.0 mount with delegations enabled, once as RDWR and once as RDONLY then
sleeping for a long time.  While the files are held open, kick off state
recovery and this NULL dereference will be hit every time.

An example OOPS:

[   65.003602] BUG: unable to handle kernel NULL pointer dereference at 00000000
00000030
[   65.005312] IP: [<ffffffffa037d6ee>] __nfs4_close+0x1e/0x160 [nfsv4]
[   65.006820] PGD 7b0ea067 PUD 791ff067 PMD 0
[   65.008075] Oops: 0000 [#1] SMP
[   65.008802] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache
snd_ens1371 gameport nfsd snd_rawmidi snd_ac97_codec ac97_bus btusb snd_seq snd
_seq_device snd_pcm ppdev bluetooth auth_rpcgss coretemp snd_page_alloc crc32_pc
lmul crc32c_intel ghash_clmulni_intel microcode rfkill nfs_acl vmw_balloon serio
_raw snd_timer lockd parport_pc e1000 snd soundcore parport i2c_piix4 shpchp vmw
_vmci sunrpc ata_generic mperf pata_acpi mptspi vmwgfx ttm scsi_transport_spi dr
m mptscsih mptbase i2c_core
[   65.018684] CPU: 0 PID: 473 Comm: 192.168.10.85-m Not tainted 3.11.2-201.fc19
.x86_64 #1
[   65.020113] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop
Reference Platform, BIOS 6.00 07/31/2013
[   65.022012] task: ffff88003707e320 ti: ffff88007b906000 task.ti: ffff88007b906000
[   65.023414] RIP: 0010:[<ffffffffa037d6ee>]  [<ffffffffa037d6ee>] __nfs4_close+0x1e/0x160 [nfsv4]
[   65.025079] RSP: 0018:ffff88007b907d10  EFLAGS: 00010246
[   65.026042] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   65.027321] RDX: 0000000000000050 RSI: 0000000000000001 RDI: 0000000000000000
[   65.028691] RBP: ffff88007b907d38 R08: 0000000000016f60 R09: 0000000000000000
[   65.029990] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[   65.031295] R13: 0000000000000050 R14: 0000000000000000 R15: 0000000000000001
[   65.032527] FS:  0000000000000000(0000) GS:ffff88007f600000(0000) knlGS:0000000000000000
[   65.033981] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   65.035177] CR2: 0000000000000030 CR3: 000000007b27f000 CR4: 00000000000407f0
[   65.036568] Stack:
[   65.037011]  0000000000000000 0000000000000001 ffff88007b907d90 ffff88007a880220
[   65.038472]  ffff88007b768de8 ffff88007b907d48 ffffffffa037e4a5 ffff88007b907d80
[   65.039935]  ffffffffa036a6c8 ffff880037020e40 ffff88007a880000 ffff880037020e40
[   65.041468] Call Trace:
[   65.042050]  [<ffffffffa037e4a5>] nfs4_close_state+0x15/0x20 [nfsv4]
[   65.043209]  [<ffffffffa036a6c8>] nfs4_open_recover_helper+0x148/0x1f0 [nfsv4]
[   65.044529]  [<ffffffffa036a886>] nfs4_open_recover+0x116/0x150 [nfsv4]
[   65.045730]  [<ffffffffa036d98d>] nfs4_open_reclaim+0xad/0x150 [nfsv4]
[   65.046905]  [<ffffffffa037d979>] nfs4_do_reclaim+0x149/0x5f0 [nfsv4]
[   65.048071]  [<ffffffffa037e1dc>] nfs4_run_state_manager+0x3bc/0x670 [nfsv4]
[   65.049436]  [<ffffffffa037de20>] ? nfs4_do_reclaim+0x5f0/0x5f0 [nfsv4]
[   65.050686]  [<ffffffffa037de20>] ? nfs4_do_reclaim+0x5f0/0x5f0 [nfsv4]
[   65.051943]  [<ffffffff81088640>] kthread+0xc0/0xd0
[   65.052831]  [<ffffffff81088580>] ? insert_kthread_work+0x40/0x40
[   65.054697]  [<ffffffff8165686c>] ret_from_fork+0x7c/0xb0
[   65.056396]  [<ffffffff81088580>] ? insert_kthread_work+0x40/0x40
[   65.058208] Code: 5c 41 5d 5d c3 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 41 57 41 89 f7 41 56 41 89 ce 41 55 41 89 d5 41 54 53 48 89 fb <4c> 8b 67 30 f0 41 ff 44 24 44 49 8d 7c 24 40 e8 0e 0a 2d e1 44
[   65.065225] RIP  [<ffffffffa037d6ee>] __nfs4_close+0x1e/0x160 [nfsv4]
[   65.067175]  RSP <ffff88007b907d10>
[   65.068570] CR2: 0000000000000030
[   65.070098] ---[ end trace 0d1fe4f5c7dd6f8b ]---

Cc: <stable@vger.kernel.org> #3.7+
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Trond Myklebust [Mon, 28 Oct 2013 18:50:38 +0000 (14:50 -0400)]

NFSv4.1: Don't change the security label as part of open reclaim.

The current caching model calls for the security label to be set on
first lookup and/or on any subsequent label changes. There is no
need to do it as part of an open reclaim.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Jeff Layton [Mon, 21 Oct 2013 13:52:19 +0000 (09:52 -0400)]

nfs: fix handling of invalid mount options in nfs_remount

nfs_parse_mount_options returns 0 on error, not -errno.

Reported-by: Karel Zak <kzak@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Jeff Layton [Fri, 18 Oct 2013 14:03:37 +0000 (10:03 -0400)]

nfs: reject version and minorversion changes on remount attempts

Reported-by: Eric Doutreleau <edoutreleau@genoscope.cns.fr>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Andy Adamson [Thu, 3 Oct 2013 16:23:24 +0000 (12:23 -0400)]

NFSv4 Remove zeroing state kern warnings

As of commit 5d422301f97b821301efcdb6fc9d1a83a5c102d6 we no longer zero the
state.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Trond Myklebust [Thu, 26 Sep 2013 19:22:45 +0000 (15:22 -0400)]

SUNRPC: call_connect_status should recheck bind and connect status on error

Currently, we go directly to call_transmit which sends us to call_status
on error. If we know that the connect attempt failed, we should rather
just jump straight back to call_bind and call_connect.

Ditto for EAGAIN, except do not delay.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Trond Myklebust [Fri, 27 Sep 2013 15:28:40 +0000 (11:28 -0400)]

SUNRPC: Remove redundant initialisations of request rq_bytes_sent

Now that we clear the rq_bytes_sent field on unlock, we don't need
to set it on lock, so we just set it once when initialising the request.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Trond Myklebust [Wed, 25 Sep 2013 17:39:45 +0000 (13:39 -0400)]

SUNRPC: Fix RPC call retransmission statistics

A retransmit should be when you successfully transmit an RPC call to
the server a second time.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Trond Myklebust [Tue, 24 Sep 2013 16:06:07 +0000 (12:06 -0400)]

NFSv4: Ensure that we disable the resend timeout for NFSv4

The spec states that the client should not resend requests because
the server will disconnect if it needs to drop an RPC request.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Trond Myklebust [Tue, 24 Sep 2013 16:00:27 +0000 (12:00 -0400)]

SUNRPC: Add RPC task and client level options to disable the resend timeout

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Trond Myklebust [Wed, 25 Sep 2013 16:17:18 +0000 (12:17 -0400)]

SUNRPC: Clean up - convert xprt_prepare_transmit to return a bool

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Trond Myklebust [Fri, 27 Sep 2013 15:09:53 +0000 (11:09 -0400)]

SUNRPC: Clear the request rq_bytes_sent field in xprt_release_write

Otherwise the tests of req->rq_bytes_sent in xprt_prepare_transmit
will fail if we're dealing with a resend.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Trond Myklebust [Wed, 25 Sep 2013 15:31:54 +0000 (11:31 -0400)]

SUNRPC: Don't set the request connect_cookie until a successful transmit

We're using the request connect_cookie to track whether or not a
request was successfully transmitted on the current transport
connection or not. For that reason we should ensure that it is
only set after we've successfully transmitted the request.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Trond Myklebust [Thu, 26 Sep 2013 14:18:04 +0000 (10:18 -0400)]

SUNRPC: Only update the TCP connect cookie on a successful connect

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Trond Myklebust [Tue, 24 Sep 2013 15:25:22 +0000 (11:25 -0400)]

SUNRPC: Enable the keepalive option for TCP sockets

For NFSv4 we want to avoid retransmitting RPC calls unless the TCP
connection breaks. However we still want to detect TCP connection
breakage as soon as possible. Do this by setting the keepalive option
with the idle timeout and count set to the 'timeo' and 'retrans' mount
options.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Trond Myklebust [Tue, 1 Oct 2013 18:24:58 +0000 (14:24 -0400)]

NFSv4: Fix a use-after-free situation in _nfs4_proc_getlk()

In nfs4_proc_getlk(), when some error causes a retry of the call to
_nfs4_proc_getlk(), we can end up with Oopses of the form

BUG: unable to handle kernel NULL pointer dereference at 0000000000000134
IP: [<ffffffff8165270e>] _raw_spin_lock+0xe/0x30
<snip>
Call Trace:
  [<ffffffff812f287d>] _atomic_dec_and_lock+0x4d/0x70
  [<ffffffffa053c4f2>] nfs4_put_lock_state+0x32/0xb0 [nfsv4]
  [<ffffffffa053c585>] nfs4_fl_release_lock+0x15/0x20 [nfsv4]
  [<ffffffffa0522c06>] _nfs4_proc_getlk.isra.40+0x146/0x170 [nfsv4]
  [<ffffffffa052ad99>] nfs4_proc_lock+0x399/0x5a0 [nfsv4]

The problem is that we don't clear the request->fl_ops after the first
try and so when we retry, nfs4_set_lock_state() exits early without
setting the lock stateid.
Regression introduced by commit 70cc6487a4e08b8698c0e2ec935fb48d10490162
(locks: make ->lock release private data before returning in GETLK case)

Reported-by: Weston Andros Adamson <dros@netapp.com>
Reported-by: Jorge Mora <mora@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: <stable@vger.kernel.org> #2.6.22+

commit | commitdiff | tree

Linus Torvalds [Tue, 1 Oct 2013 00:10:26 +0000 (17:10 -0700)]

Merge tag 'nfs-for-3.12-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
- Stable fix for Oopses in the pNFS files layout driver
- Fix a regression when doing a non-exclusive file create on NFSv4.x
- NFSv4.1 security negotiation fixes when looking up the root
   filesystem
- Fix a memory ordering issue in the pNFS files layout driver

* tag 'nfs-for-3.12-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: Give "flavor" an initial value to fix a compile warning
  NFSv4.1: try SECINFO_NO_NAME flavs until one works
  NFSv4.1: Ensure memory ordering between nfs4_ds_connect and nfs4_fl_prepare_ds
  NFSv4.1: nfs4_fl_prepare_ds - fix bugs when the connect attempt fails
  NFSv4: Honour the 'opened' parameter in the atomic_open() filesystem method

commit | commitdiff | tree

Linus Torvalds [Mon, 30 Sep 2013 21:32:32 +0000 (14:32 -0700)]

Merge branch 'akpm' (fixes from Andrew Morton)

Merge misc fixes from Andrew Morton.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits)
  pidns: fix free_pid() to handle the first fork failure
  ipc,msg: prevent race with rmid in msgsnd,msgrcv
  ipc/sem.c: update sem_otime for all operations
  mm/hwpoison: fix the lack of one reference count against poisoned page
  mm/hwpoison: fix false report on 2nd attempt at page recovery
  mm/hwpoison: fix test for a transparent huge page
  mm/hwpoison: fix traversal of hugetlbfs pages to avoid printk flood
  block: change config option name for cmdline partition parsing
  mm/mlock.c: prevent walking off the end of a pagetable in no-pmd configuration
  mm: avoid reinserting isolated balloon pages into LRU lists
  arch/parisc/mm/fault.c: fix uninitialized variable usage
  include/asm-generic/vtime.h: avoid zero-length file
  nilfs2: fix issue with race condition of competition between segments for dirty blocks
  Documentation/kernel-parameters.txt: replace kernelcore with Movable
  mm/bounce.c: fix a regression where MS_SNAP_STABLE (stable pages snapshotting) was ignored
  kernel/kmod.c: check for NULL in call_usermodehelper_exec()
  ipc/sem.c: synchronize the proc interface
  ipc/sem.c: optimize sem_lock()
  ipc/sem.c: fix race in sem_lock()
  mm/compaction.c: periodically schedule when freeing pages
  ...

commit | commitdiff | tree

Oleg Nesterov [Mon, 30 Sep 2013 20:45:27 +0000 (13:45 -0700)]

pidns: fix free_pid() to handle the first fork failure

"case 0" in free_pid() assumes that disable_pid_allocation() should
clear PIDNS_HASH_ADDING before the last pid goes away.

However this doesn't happen if the first fork() fails to create the
child reaper which should call disable_pid_allocation().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Davidlohr Bueso [Mon, 30 Sep 2013 20:45:26 +0000 (13:45 -0700)]

ipc,msg: prevent race with rmid in msgsnd,msgrcv

This fixes a race in both msgrcv() and msgsnd() between finding the msg
and actually dealing with the queue, as another thread can delete shmid
underneath us if we are preempted before acquiring the
kern_ipc_perm.lock.

Manfred illustrates this nicely:

Assume a preemptible kernel that is preempted just after

    msq = msq_obtain_object_check(ns, msqid)

in do_msgrcv().  The only lock that is held is rcu_read_lock().

Now the other thread processes IPC_RMID.  When the first task is
resumed, then it will happily wait for messages on a deleted queue.

Fix this by checking for if the queue has been deleted after taking the
lock.

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Reported-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: <stable@vger.kernel.org> [3.11]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Manfred Spraul [Mon, 30 Sep 2013 20:45:25 +0000 (13:45 -0700)]

ipc/sem.c: update sem_otime for all operations

In commit 0a2b9d4c7967 ("ipc/sem.c: move wake_up_process out of the
spinlock section"), the update of semaphore's sem_otime(last semop time)
was moved to one central position (do_smart_update).

But since do_smart_update() is only called for operations that modify
the array, this means that wait-for-zero semops do not update sem_otime
anymore.

The fix is simple:
Non-alter operations must update sem_otime.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reported-by: Jia He <jiakernel@gmail.com>
Tested-by: Jia He <jiakernel@gmail.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Wanpeng Li [Mon, 30 Sep 2013 20:45:24 +0000 (13:45 -0700)]

mm/hwpoison: fix the lack of one reference count against poisoned page

The lack of one reference count against poisoned page for hwpoison_inject
w/o hwpoison_filter enabled result in hwpoison detect -1 users still
referenced the page, however, the number should be 0 except the poison
handler held one after successfully unmap.  This patch fix it by hold one
referenced count against poisoned page for hwpoison_inject w/ and w/o
hwpoison_filter enabled.

Before patch:

[   71.902112] Injecting memory failure at pfn 224706
[   71.902137] MCE 0x224706: dirty LRU page recovery: Failed
[   71.902138] MCE 0x224706: dirty LRU page still referenced by -1 users

After patch:

[   94.710860] Injecting memory failure at pfn 215b68
[   94.710885] MCE 0x215b68: dirty LRU page recovery: Recovered

Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Wanpeng Li [Mon, 30 Sep 2013 20:45:23 +0000 (13:45 -0700)]

mm/hwpoison: fix false report on 2nd attempt at page recovery

If the page is poisoned by software injection w/ MF_COUNT_INCREASED
flag, there is a false report during the 2nd attempt at page recovery
which is not truthful.

This patch fixes it by reporting the first attempt to try free buddy
page recovery if MF_COUNT_INCREASED is set.

Before patch:

[  346.332041] Injecting memory failure at pfn 200010
[  346.332189] MCE 0x200010: free buddy, 2nd try page recovery: Delayed

After patch:

[  297.742600] Injecting memory failure at pfn 200010
[  297.742941] MCE 0x200010: free buddy page recovery: Delayed

Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Wanpeng Li [Mon, 30 Sep 2013 20:45:22 +0000 (13:45 -0700)]

mm/hwpoison: fix test for a transparent huge page

PageTransHuge() can't guarantee the page is a transparent huge page
since it returns true for both transparent huge and hugetlbfs pages.

This patch fixes it by checking the page is also !hugetlbfs page.

Before patch:

[  121.571128] Injecting memory failure at pfn 23a200
[  121.571141] MCE 0x23a200: huge page recovery: Delayed
[  140.355100] MCE: Memory failure is now running on 0x23a200

After patch:

[   94.290793] Injecting memory failure at pfn 23a000
[   94.290800] MCE 0x23a000: huge page recovery: Delayed
[  105.722303] MCE: Software-unpoisoned page 0x23a000

Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Wanpeng Li [Mon, 30 Sep 2013 20:45:21 +0000 (13:45 -0700)]

mm/hwpoison: fix traversal of hugetlbfs pages to avoid printk flood

madvise_hwpoison won't check if the page is small page or huge page and
traverses in small page granularity against the range unconditionally,
which result in a printk flood "MCE xxx: already hardware poisoned" if
the page is a huge page.

This patch fixes it by using compound_order(compound_head(page)) for
huge page iterator.

Testcase:

#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <errno.h>

#define PAGES_TO_TEST 3
#define PAGE_SIZE 4096 * 512

int main(void)
{
char *mem;
int i;

mem = mmap(NULL, PAGES_TO_TEST * PAGE_SIZE,
PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, 0, 0);

if (madvise(mem, PAGES_TO_TEST * PAGE_SIZE, MADV_HWPOISON) == -1)
return -1;

munmap(mem, PAGES_TO_TEST * PAGE_SIZE);

return 0;
}

Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Paul Gortmaker [Mon, 30 Sep 2013 20:45:19 +0000 (13:45 -0700)]

block: change config option name for cmdline partition parsing

Recently commit bab55417b10c ("block: support embedded device command
line partition") introduced CONFIG_CMDLINE_PARSER. However, that name
is too generic and sounds like it enables/disables generic kernel boot
arg processing, when it really is block specific.

Before this option becomes a part of a full/final release, add the BLK_
prefix to it so that it is clear in absence of any other context that it
is block specific.

In addition, fix up the following less critical items:
- help text was not really at all helpful.
- index file for Documentation was not updated
- add the new arg to Documentation/kernel-parameters.txt
- clarify wording in source comments

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Cai Zhiyong <caizhiyong@huawei.com>
Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Vlastimil Babka [Mon, 30 Sep 2013 20:45:18 +0000 (13:45 -0700)]

mm/mlock.c: prevent walking off the end of a pagetable in no-pmd configuration

The function __munlock_pagevec_fill() introduced in commit 7a8010cd3627
("mm: munlock: manual pte walk in fast path instead of
follow_page_mask()") uses pmd_addr_end() for restricting its operation
within current page table.

This is insufficient on architectures/configurations where pmd is folded
and pmd_addr_end() just returns the end of the full range to be walked.
In this case, it allows pte++ to walk off the end of a page table
resulting in unpredictable behaviour.

This patch fixes the function by using pgd_addr_end() and pud_addr_end()
before pmd_addr_end(), which will yield correct page table boundary on
all configurations. This is similar to what existing page walkers do
when walking each level of the page table.

Additionaly, the patch clarifies a comment for get_locked_pte() call in the
function.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Cc: Jörn Engel <joern@logfs.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michel Lespinasse <walken@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Rafael Aquini [Mon, 30 Sep 2013 20:45:16 +0000 (13:45 -0700)]

mm: avoid reinserting isolated balloon pages into LRU lists

Isolated balloon pages can wrongly end up in LRU lists when
migrate_pages() finishes its round without draining all the isolated
page list.

The same issue can happen when reclaim_clean_pages_from_list() tries to
reclaim pages from an isolated page list, before migration, in the CMA
path.  Such balloon page leak opens a race window against LRU lists
shrinkers that leads us to the following kernel panic:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
  IP: [<ffffffff810c2625>] shrink_page_list+0x24e/0x897
  PGD 3cda2067 PUD 3d713067 PMD 0
  Oops: 0000 [#1] SMP
  CPU: 0 PID: 340 Comm: kswapd0 Not tainted 3.12.0-rc1-22626-g4367597 #87
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  RIP: shrink_page_list+0x24e/0x897
  RSP: 0000:ffff88003da499b8  EFLAGS: 00010286
  RAX: 0000000000000000 RBX: ffff88003e82bd60 RCX: 00000000000657d5
  RDX: 0000000000000000 RSI: 000000000000031f RDI: ffff88003e82bd40
  RBP: ffff88003da49ab0 R08: 0000000000000001 R09: 0000000081121a45
  R10: ffffffff81121a45 R11: ffff88003c4a9a28 R12: ffff88003e82bd40
  R13: ffff88003da0e800 R14: 0000000000000001 R15: ffff88003da49d58
  FS:  0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000067d9000 CR3: 000000003ace5000 CR4: 00000000000407b0
  Call Trace:
    shrink_inactive_list+0x240/0x3de
    shrink_lruvec+0x3e0/0x566
    __shrink_zone+0x94/0x178
    shrink_zone+0x3a/0x82
    balance_pgdat+0x32a/0x4c2
    kswapd+0x2f0/0x372
    kthread+0xa2/0xaa
    ret_from_fork+0x7c/0xb0
  Code: 80 7d 8f 01 48 83 95 68 ff ff ff 00 4c 89 e7 e8 5a 7b 00 00 48 85 c0 49 89 c5 75 08 80 7d 8f 00 74 3e eb 31 48 8b 80 18 01 00 00 <48> 8b 74 0d 48 8b 78 30 be 02 00 00 00 ff d2 eb
  RIP  [<ffffffff810c2625>] shrink_page_list+0x24e/0x897
   RSP <ffff88003da499b8>
  CR2: 0000000000000028
  ---[ end trace 703d2451af6ffbfd ]---
  Kernel panic - not syncing: Fatal exception

This patch fixes the issue, by assuring the proper tests are made at
putback_movable_pages() & reclaim_clean_pages_from_list() to avoid
isolated balloon pages being wrongly reinserted in LRU lists.

[akpm@linux-foundation.org: clarify awkward comment text]
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
Tested-by: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Felipe Pena [Mon, 30 Sep 2013 20:45:14 +0000 (13:45 -0700)]

arch/parisc/mm/fault.c: fix uninitialized variable usage

The FAULT_FLAG_WRITE flag has been set based on uninitialized variable.

Fixes a regression added by commit 759496ba6407 ("arch: mm: pass
userspace fault flag to generic fault handler")

Signed-off-by: Felipe Pena <felipensp@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Andrew Morton [Mon, 30 Sep 2013 20:45:13 +0000 (13:45 -0700)]

include/asm-generic/vtime.h: avoid zero-length file

patch(1) can't handle zero-length files - it appears to simply not create
the file, so my powerpc build fails.

Put something in here to make life easier.

Cc: Hugh Dickins <hughd@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Vyacheslav Dubeyko [Mon, 30 Sep 2013 20:45:12 +0000 (13:45 -0700)]

nilfs2: fix issue with race condition of competition between segments for dirty blocks

Many NILFS2 users were reported about strange file system corruption
(for example):

   NILFS: bad btree node (blocknr=185027): level = 0, flags = 0x0, nchildren = 768
   NILFS error (device sda4): nilfs_bmap_last_key: broken bmap (inode number=11540)

But such error messages are consequence of file system's issue that takes
place more earlier.  Fortunately, Jerome Poulin <jeromepoulin@gmail.com>
and Anton Eliasson <devel@antoneliasson.se> were reported about another
issue not so recently.  These reports describe the issue with segctor
thread's crash:

  BUG: unable to handle kernel paging request at 0000000000004c83
  IP: nilfs_end_page_io+0x12/0xd0 [nilfs2]

  Call Trace:
   nilfs_segctor_do_construct+0xf25/0x1b20 [nilfs2]
   nilfs_segctor_construct+0x17b/0x290 [nilfs2]
   nilfs_segctor_thread+0x122/0x3b0 [nilfs2]
   kthread+0xc0/0xd0
   ret_from_fork+0x7c/0xb0

These two issues have one reason.  This reason can raise third issue
too.  Third issue results in hanging of segctor thread with eating of
100% CPU.

REPRODUCING PATH:

One of the possible way or the issue reproducing was described by
Jermoe me Poulin <jeromepoulin@gmail.com>:

1. init S to get to single user mode.
2. sysrq+E to make sure only my shell is running
3. start network-manager to get my wifi connection up
4. login as root and launch "screen"
5. cd /boot/log/nilfs which is a ext3 mount point and can log when NILFS dies.
6. lscp | xz -9e > lscp.txt.xz
7. mount my snapshot using mount -o cp=3360839,ro /dev/vgUbuntu/root /mnt/nilfs
8. start a screen to dump /proc/kmsg to text file since rsyslog is killed
9. start a screen and launch strace -f -o find-cat.log -t find
/mnt/nilfs -type f -exec cat {} > /dev/null \;
10. start a screen and launch strace -f -o apt-get.log -t apt-get update
11. launch the last command again as it did not crash the first time
12. apt-get crashes
13. ps aux > ps-aux-crashed.log
13. sysrq+W
14. sysrq+E  wait for everything to terminate
15. sysrq+SUSB

Simplified way of the issue reproducing is starting kernel compilation
task and "apt-get update" in parallel.

REPRODUCIBILITY:

The issue is reproduced not stable [60% - 80%].  It is very important to
have proper environment for the issue reproducing.  The critical
conditions for successful reproducing:

(1) It should have big modified file by mmap() way.

(2) This file should have the count of dirty blocks are greater that
    several segments in size (for example, two or three) from time to time
    during processing.

(3) It should be intensive background activity of files modification
    in another thread.

INVESTIGATION:

First of all, it is possible to see that the reason of crash is not valid
page address:

  NILFS [nilfs_segctor_complete_write]:2100 bh->b_count 0, bh->b_blocknr 13895680, bh->b_size 13897727, bh->b_page 0000000000001a82
  NILFS [nilfs_segctor_complete_write]:2101 segbuf->sb_segnum 6783

Moreover, value of b_page (0x1a82) is 6786.  This value looks like segment
number.  And b_blocknr with b_size values look like block numbers.  So,
buffer_head's pointer points on not proper address value.

Detailed investigation of the issue is discovered such picture:

  [-----------------------------SEGMENT 6783-------------------------------]
  NILFS [nilfs_segctor_do_construct]:2310 nilfs_segctor_begin_construction
  NILFS [nilfs_segctor_do_construct]:2321 nilfs_segctor_collect
  NILFS [nilfs_segctor_do_construct]:2336 nilfs_segctor_assign
  NILFS [nilfs_segctor_do_construct]:2367 nilfs_segctor_update_segusage
  NILFS [nilfs_segctor_do_construct]:2371 nilfs_segctor_prepare_write
  NILFS [nilfs_segctor_do_construct]:2376 nilfs_add_checksums_on_logs
  NILFS [nilfs_segctor_do_construct]:2381 nilfs_segctor_write
  NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111149024, segbuf->sb_segnum 6783

  [-----------------------------SEGMENT 6784-------------------------------]
  NILFS [nilfs_segctor_do_construct]:2310 nilfs_segctor_begin_construction
  NILFS [nilfs_segctor_do_construct]:2321 nilfs_segctor_collect
  NILFS [nilfs_lookup_dirty_data_buffers]:782 bh->b_count 1, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824
  NILFS [nilfs_lookup_dirty_data_buffers]:783 bh->b_assoc_buffers.next ffff8802174a6798, bh->b_assoc_buffers.prev ffff880221cffee8
  NILFS [nilfs_segctor_do_construct]:2336 nilfs_segctor_assign
  NILFS [nilfs_segctor_do_construct]:2367 nilfs_segctor_update_segusage
  NILFS [nilfs_segctor_do_construct]:2371 nilfs_segctor_prepare_write
  NILFS [nilfs_segctor_do_construct]:2376 nilfs_add_checksums_on_logs
  NILFS [nilfs_segctor_do_construct]:2381 nilfs_segctor_write
  NILFS [nilfs_segbuf_submit_bh]:575 bh->b_count 1, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824
  NILFS [nilfs_segbuf_submit_bh]:576 segbuf->sb_segnum 6784
  NILFS [nilfs_segbuf_submit_bh]:577 bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880218bcdf50
  NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111150080, segbuf->sb_segnum 6784, segbuf->sb_nbio 0
  [----------] ditto
  NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111164416, segbuf->sb_segnum 6784, segbuf->sb_nbio 15

  [-----------------------------SEGMENT 6785-------------------------------]
  NILFS [nilfs_segctor_do_construct]:2310 nilfs_segctor_begin_construction
  NILFS [nilfs_segctor_do_construct]:2321 nilfs_segctor_collect
  NILFS [nilfs_lookup_dirty_data_buffers]:782 bh->b_count 2, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824
  NILFS [nilfs_lookup_dirty_data_buffers]:783 bh->b_assoc_buffers.next ffff880219277e80, bh->b_assoc_buffers.prev ffff880221cffc88
  NILFS [nilfs_segctor_do_construct]:2367 nilfs_segctor_update_segusage
  NILFS [nilfs_segctor_do_construct]:2371 nilfs_segctor_prepare_write
  NILFS [nilfs_segctor_do_construct]:2376 nilfs_add_checksums_on_logs
  NILFS [nilfs_segctor_do_construct]:2381 nilfs_segctor_write
  NILFS [nilfs_segbuf_submit_bh]:575 bh->b_count 2, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824
  NILFS [nilfs_segbuf_submit_bh]:576 segbuf->sb_segnum 6785
  NILFS [nilfs_segbuf_submit_bh]:577 bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880222cc7ee8
  NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111165440, segbuf->sb_segnum 6785, segbuf->sb_nbio 0
  [----------] ditto
  NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111177728, segbuf->sb_segnum 6785, segbuf->sb_nbio 12

  NILFS [nilfs_segctor_do_construct]:2399 nilfs_segctor_wait
  NILFS [nilfs_segbuf_wait]:676 segbuf->sb_segnum 6783
  NILFS [nilfs_segbuf_wait]:676 segbuf->sb_segnum 6784
  NILFS [nilfs_segbuf_wait]:676 segbuf->sb_segnum 6785

  NILFS [nilfs_segctor_complete_write]:2100 bh->b_count 0, bh->b_blocknr 13895680, bh->b_size 13897727, bh->b_page 0000000000001a82

  BUG: unable to handle kernel paging request at 0000000000001a82
  IP: [<ffffffffa024d0f2>] nilfs_end_page_io+0x12/0xd0 [nilfs2]

Usually, for every segment we collect dirty files in list.  Then, dirty
blocks are gathered for every dirty file, prepared for write and
submitted by means of nilfs_segbuf_submit_bh() call.  Finally, it takes
place complete write phase after calling nilfs_end_bio_write() on the
block layer.  Buffers/pages are marked as not dirty on final phase and
processed files removed from the list of dirty files.

It is possible to see that we had three prepare_write and submit_bio
phases before segbuf_wait and complete_write phase.  Moreover, segments
compete between each other for dirty blocks because on every iteration
of segments processing dirty buffer_heads are added in several lists of
payload_buffers:

  [SEGMENT 6784]: bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880218bcdf50
  [SEGMENT 6785]: bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880222cc7ee8

The next pointer is the same but prev pointer has changed.  It means
that buffer_head has next pointer from one list but prev pointer from
another.  Such modification can be made several times.  And, finally, it
can be resulted in various issues: (1) segctor hanging, (2) segctor
crashing, (3) file system metadata corruption.

FIX:
This patch adds:

(1) setting of BH_Async_Write flag in nilfs_segctor_prepare_write()
    for every proccessed dirty block;

(2) checking of BH_Async_Write flag in
    nilfs_lookup_dirty_data_buffers() and
    nilfs_lookup_dirty_node_buffers();

(3) clearing of BH_Async_Write flag in nilfs_segctor_complete_write(),
    nilfs_abort_logs(), nilfs_forget_buffer(), nilfs_clear_dirty_page().

Reported-by: Jerome Poulin <jeromepoulin@gmail.com>
Reported-by: Anton Eliasson <devel@antoneliasson.se>
Cc: Paul Fertser <fercerpav@gmail.com>
Cc: ARAI Shun-ichi <hermes@ceres.dti.ne.jp>
Cc: Piotr Szymaniak <szarpaj@grubelek.pl>
Cc: Juan Barry Manuel Canham <Linux@riotingpacifist.net>
Cc: Zahid Chowdhury <zahid.chowdhury@starsolutions.com>
Cc: Elmer Zhang <freeboy6716@gmail.com>
Cc: Kenneth Langga <klangga@gmail.com>
Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Weiping Pan [Mon, 30 Sep 2013 20:45:10 +0000 (13:45 -0700)]

Documentation/kernel-parameters.txt: replace kernelcore with Movable

Han Pingtian found a typo in Documentation/kernel-parameters.txt about
"kernelcore=", that "kernelcore" should be replaced with "Movable" here.

Signed-off-by: Weiping Pan <wpan@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Darrick J. Wong [Mon, 30 Sep 2013 20:45:09 +0000 (13:45 -0700)]

mm/bounce.c: fix a regression where MS_SNAP_STABLE (stable pages snapshotting) was ignored

The "force" parameter in __blk_queue_bounce was being ignored, which
means that stable page snapshots are not always happening (on ext3).
This of course leads to DIF disks reporting checksum errors, so fix this
regression.

The regression was introduced in commit 6bc454d15004 ("bounce: Refactor
__blk_queue_bounce to not use bi_io_vec")

Reported-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: <stable@vger.kernel.org> [3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Tetsuo Handa [Mon, 30 Sep 2013 20:45:08 +0000 (13:45 -0700)]

kernel/kmod.c: check for NULL in call_usermodehelper_exec()

If /proc/sys/kernel/core_pattern contains only "|", a NULL pointer
dereference happens upon core dump because argv_split("") returns
argv[0] == NULL.

This bug was once fixed by commit 264b83c07a84 ("usermodehelper: check
subprocess_info->path != NULL") but was by error reintroduced by commit
7f57cfa4e2aa ("usermodehelper: kill the sub_info->path[0] check").

This bug seems to exist since 2.6.19 (the version which core dump to
pipe was added). Depending on kernel version and config, some side
effect might happen immediately after this oops (e.g. kernel panic with
2.6.32-358.18.1.el6).

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Manfred Spraul [Mon, 30 Sep 2013 20:45:07 +0000 (13:45 -0700)]

ipc/sem.c: synchronize the proc interface

The proc interface is not aware of sem_lock(), it instead calls
ipc_lock_object() directly. This means that simple semop() operations
can run in parallel with the proc interface. Right now, this is
uncritical, because the implementation doesn't do anything that requires
a proper synchronization.

But it is dangerous and therefore should be fixed.

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Manfred Spraul [Mon, 30 Sep 2013 20:45:06 +0000 (13:45 -0700)]

ipc/sem.c: optimize sem_lock()

Operations that need access to the whole array must guarantee that there
are no simple operations ongoing. Right now this is achieved by
spin_unlock_wait(sem->lock) on all semaphores.

If complex_count is nonzero, then this spin_unlock_wait() is not
necessary, because it was already performed in the past by the thread
that increased complex_count and even though sem_perm.lock was dropped
inbetween, no simple operation could have started, because simple
operations cannot start when complex_count is non-zero.

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Rik van Riel <riel@redhat.com>
Reviewed-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Manfred Spraul [Mon, 30 Sep 2013 20:45:04 +0000 (13:45 -0700)]

ipc/sem.c: fix race in sem_lock()

The exclusion of complex operations in sem_lock() is insufficient: after
acquiring the per-semaphore lock, a simple op must first check that
sem_perm.lock is not locked and only after that test check
complex_count.  The current code does it the other way around - and that
creates a race.  Details are below.

The patch is a complete rewrite of sem_lock(), based in part on the code
from Mike Galbraith.  It removes all gotos and all loops and thus the
risk of livelocks.

I have tested the patch (together with the next one) on my i3 laptop and
it didn't cause any problems.

The bug is probably also present in 3.10 and 3.11, but for these kernels
it might be simpler just to move the test of sma->complex_count after
the spin_is_locked() test.

Details of the bug:

Assume:
- sma->complex_count = 0.
- Thread 1: semtimedop(complex op that must sleep)
- Thread 2: semtimedop(simple op).

Pseudo-Trace:

Thread 1: sem_lock(): acquire sem_perm.lock
Thread 1: sem_lock(): check for ongoing simple ops
Nothing ongoing, thread 2 is still before sem_lock().
Thread 1: try_atomic_semop()
<<< preempted.

Thread 2: sem_lock():
        static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
                                      int nsops)
        {
                int locknum;
         again:
                if (nsops == 1 && !sma->complex_count) {
                        struct sem *sem = sma->sem_base + sops->sem_num;

                        /* Lock just the semaphore we are interested in. */
                        spin_lock(&sem->lock);

                        /*
                         * If sma->complex_count was set while we were spinning,
                         * we may need to look at things we did not lock here.
                         */
                        if (unlikely(sma->complex_count)) {
                                spin_unlock(&sem->lock);
                                goto lock_array;
                        }
        <<<<<<<<<
<<< complex_count is still 0.
<<<
        <<< Here it is preempted
        <<<<<<<<<

Thread 1: try_atomic_semop() returns, notices that it must sleep.
Thread 1: increases sma->complex_count.
Thread 1: drops sem_perm.lock
Thread 2:
                /*
                 * Another process is holding the global lock on the
                 * sem_array; we cannot enter our critical section,
                 * but have to wait for the global lock to be released.
                 */
                if (unlikely(spin_is_locked(&sma->sem_perm.lock))) {
                        spin_unlock(&sem->lock);
                        spin_unlock_wait(&sma->sem_perm.lock);
                        goto again;
                }
<<< sem_perm.lock already dropped, thus no "goto again;"

                locknum = sops->sem_num;

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: <stable@vger.kernel.org> [3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

David Rientjes [Mon, 30 Sep 2013 20:45:03 +0000 (13:45 -0700)]

mm/compaction.c: periodically schedule when freeing pages

We've been getting warnings about an excessive amount of time spent
allocating pages for migration during memory compaction without
scheduling. isolate_freepages_block() already periodically checks for
contended locks or the need to schedule, but isolate_freepages() never
does.

When a zone is massively long and no suitable targets can be found, this
iteration can be quite expensive without ever doing cond_resched().

Check periodically for the need to reschedule while the compaction free
scanner iterates.

Signed-off-by: David Rientjes <rientjes@google.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Dan Aloni [Mon, 30 Sep 2013 20:45:02 +0000 (13:45 -0700)]

fs/binfmt_elf.c: prevent a coredump with a large vm_map_count from Oopsing

A high setting of max_map_count, and a process core-dumping with a large
enough vm_map_count could result in an NT_FILE note not being written,
and the kernel crashing immediately later because it has assumed
otherwise.

Reproduction of the oops-causing bug described here:

https://lkml.org/lkml/2013/8/30/50

Rge ussue originated in commit 2aa362c49c31 ("coredump: extend core dump
note section to contain file names of mapped file") from Oct 4, 2012.

This patch make that section optional in that case. fill_files_note()
should signify the error, and also let the info struct in
elf_core_dump() be zero-initialized so that we can check for the
optionally written note.

[akpm@linux-foundation.org: avoid abusing E2BIG, remove a couple of not-really-needed local variables]
[akpm@linux-foundation.org: fix sparse warning]
Signed-off-by: Dan Aloni <alonid@stratoscale.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Reported-by: Martin MOKREJS <mmokrejs@gmail.com>
Tested-by: Martin MOKREJS <mmokrejs@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Joonyoung Shim [Mon, 30 Sep 2013 20:45:01 +0000 (13:45 -0700)]

revert "mm/memory-hotplug: fix lowmem count overflow when offline pages"

This reverts commit cea27eb2a202 ("mm/memory-hotplug: fix lowmem count
overflow when offline pages").

The fixed bug by commit cea27eb was fixed to another way by commit
3dcc0571cd64 ("mm: correctly update zone->managed_pages"). That commit
enhances memory_hotplug.c to adjust totalhigh_pages when hot-removing
memory, for details please refer to:

http://marc.info/?l=linux-mm&m=136957578620221&w=2

As a result, commit cea27eb2a202 currently causes duplicated decreasing
of totalhigh_pages, thus the revert.

Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Linus Torvalds [Mon, 30 Sep 2013 18:13:33 +0000 (11:13 -0700)]

Merge tag 'regulator-v3.12-rc3' of git://git./linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
"Quite a few fixes here, mostly small driver specific ones.

  The stand out thing is a fix for errors generating the documentation
  from Randy Dunlap, otherwise unless you're using the driver in
  question there should be no impact"

* tag 'regulator-v3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: ti-abb: Fix bias voltage glitch in transition to bypass mode
  regulator: wm831x-ldo: Fix max_uV for gp_ldo and aldo linear range settings
  regulator: wm8350: correct the max_uV of LDO
  regulator: fix fatal kernel-doc error
  regulator: palmas: Remove wrong comment for the equation calculating num_voltages
  regulator: da9063: Fix PTR_ERR/ERR_PTR mismatch
  regulator: palmas: configure enable time for LDOs
  regulator: palmas: fix the n_voltages for smps to 122

commit | commitdiff | tree

Linus Torvalds [Mon, 30 Sep 2013 18:12:20 +0000 (11:12 -0700)]

Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/linux-security

Pull apparmor fixes from James Morris:
"Bugfixes for the Apparmor code for regressions introduced in the 3.12
  pull request"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  apparmor: fix suspicious RCU usage warning in policy.c/policy.h
  apparmor: Use shash crypto API interface for profile hashes

commit | commitdiff | tree

Linus Torvalds [Mon, 30 Sep 2013 18:11:28 +0000 (11:11 -0700)]

Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs

Pull assorted vfs fixes from Al Viro:
"A couple of bug fixes + removal of dead code in afs ->d_revalidate()"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  afs: dget_parent() can't return a negative dentry
  ocfs2: needs ->d_lock to poke in ->d_parent->d_inode from ->d_revalidate()
  sysv: Add forgotten superblock lock init for v7 fs

commit | commitdiff | tree

Linus Torvalds [Mon, 30 Sep 2013 17:40:20 +0000 (10:40 -0700)]

Merge branch 'for-linus' of git://git./linux/kernel/git/egtvedt/linux-avr32

Pull AVR32 fixes from Hans-Christian Egtvedt.

Fix build warnings and use the Kbuild infrastructure for generic headers
rather than doing it by hand.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32:
  avr32: cast syscall_return to silence compiler warning
  avr32: fix clockevents kernel warning
  avr32: use Kbuild infrastructure to handle the asm-generic headers

commit | commitdiff | tree

Linus Torvalds [Mon, 30 Sep 2013 17:38:46 +0000 (10:38 -0700)]

Merge tag 'for-linus-20130929' of git://github.com/sctscore/official-linux

Pull S+core fixes from Lennox Wu:
"These updates include updating information of maintainers, fix some
  trivial errors, and add a necessary function for supporting ipv6"

* tag 'for-linus-20130929' of git://github.com/sctscore/official-linux:
  Score: Update the information of Score maintaners
  Score: Modify the Makefile of Score, remove -mlong-calls for compiling
  Score: Implement the function csum_ipv6_magic
  Score: The commit is for compiling successfully

commit | commitdiff | tree

Linus Torvalds [Mon, 30 Sep 2013 17:37:05 +0000 (10:37 -0700)]

Merge tag 'arc-fixes-for-3.12' of git://git./linux/kernel/git/vgupta/arc

Pull ARC Fixes from Vineet Gupta:
- Handle unaligned access in zero delay loops
- spinlock livelock fix for SMP systemC model
- fix 32bit overflow in access_ok
- better setup of clockevents

* tag 'arc-fixes-for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: Use clockevents_config_and_register over clockevents_register_device
  ARC: Workaround spinlock livelock in SMP SystemC simulation
  ARC: Fix 32-bit wrap around in access_ok()
  ARC: Handle zero-overhead-loop in unaligned access handler

commit | commitdiff | tree

Mark Brown [Mon, 30 Sep 2013 11:04:33 +0000 (12:04 +0100)]

Merge remote-tracking branch 'regulator/fix/wm8350' into regulator-linus

commit | commitdiff | tree

Mark Brown [Mon, 30 Sep 2013 11:04:33 +0000 (12:04 +0100)]

Merge remote-tracking branch 'regulator/fix/wm831x' into regulator-linus

commit | commitdiff | tree

Mark Brown [Mon, 30 Sep 2013 11:04:32 +0000 (12:04 +0100)]

Merge remote-tracking branch 'regulator/fix/ti-abb' into regulator-linus

commit | commitdiff | tree

Mark Brown [Mon, 30 Sep 2013 11:04:31 +0000 (12:04 +0100)]

Merge remote-tracking branch 'regulator/fix/palmas' into regulator-linus

commit | commitdiff | tree

Mark Brown [Mon, 30 Sep 2013 11:04:30 +0000 (12:04 +0100)]

Merge remote-tracking branch 'regulator/fix/doc' into regulator-linus

commit | commitdiff | tree

Mark Brown [Mon, 30 Sep 2013 11:04:29 +0000 (12:04 +0100)]

Merge remote-tracking branch 'regulator/fix/da9063' into regulator-linus

commit | commitdiff | tree

Gabor Juhos [Wed, 25 Sep 2013 19:50:01 +0000 (21:50 +0200)]

avr32: cast syscall_return to silence compiler warning

The patch fixes the following compiler warning:
    CC      arch/avr32/kernel/process.o
  arch/avr32/kernel/process.c: In function 'copy_thread':
  arch/avr32/kernel/process.c:292: warning: assignment makes integer \
  from pointer without a cast

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>

commit | commitdiff | tree

Gabor Juhos [Wed, 25 Sep 2013 13:32:35 +0000 (15:32 +0200)]

avr32: fix clockevents kernel warning

Since commit 01426478df3a8791ff5c8b6b82d409e699cfaf38
(avr32: Use generic idle loop) the kernel throws the
following warning on avr32:

  WARNING: at 900322e4 [verbose debug info unavailable]
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Not tainted 3.12.0-rc2 #117
  task: 901c3ecc ti: 901c0000 task.ti: 901c0000
  PC is at cpu_idle_poll_ctrl+0x1c/0x38
  LR is at comparator_mode+0x3e/0x40
  pc : [<900322e4>]    lr : [<90014882>]    Not tainted
  sp : 901c1f74  r12: 00000000  r11: 901c74a0
  r10: 901d2510  r9 : 00000001  r8 : 901db4de
  r7 : 901c74a0  r6 : 00000001  r5 : 00410020  r4 : 901db574
  r3 : 00410024  r2 : 90206fe0  r1 : 00000000  r0 : 007f0000
  Flags: qvnzc
  Mode bits: hjmde....G
  CPU Mode: Supervisor
  Call trace:
   [<90039ede>] clockevents_set_mode+0x16/0x2e
   [<90039f00>] clockevents_shutdown+0xa/0x1e
   [<9003a078>] clockevents_exchange_device+0x58/0x70
   [<9003a78c>] tick_check_new_device+0x38/0x54
   [<9003a1a2>] clockevents_register_device+0x32/0x90
   [<900035c4>] time_init+0xa8/0x108
   [<90000520>] start_kernel+0x128/0x23c

When the 'avr32_comparator' clockevent device is registered,
the clockevent core sets the mode of that clockevent device
to CLOCK_EVT_MODE_SHUTDOWN. Due to this, the 'comparator_mode'
function calls the 'cpu_idle_poll_ctrl' to disables idle poll.
This results in the aforementioned warning because the polling
is not enabled yet.

Change the code to only disable idle poll if it is enabled by
the same function to avoid the warning.

Cc: stable@vger.kernel.org
Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>

commit | commitdiff | tree

Steven Rostedt [Sun, 1 Sep 2013 20:56:21 +0000 (22:56 +0200)]

avr32: use Kbuild infrastructure to handle the asm-generic headers

Use kbuild to add asm-generic headers that do nothing, also remove the arch
specific wrapper headers.

This only affects headers that do nothing but include the generic
equivalent. It does not touch any header that does a little more.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>

commit | commitdiff | tree

Al Viro [Sun, 29 Sep 2013 20:29:04 +0000 (16:29 -0400)]

afs: dget_parent() can't return a negative dentry

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sun, 29 Sep 2013 18:59:30 +0000 (14:59 -0400)]

ocfs2: needs ->d_lock to poke in ->d_parent->d_inode from ->d_revalidate()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Lubomir Rintel [Wed, 18 Sep 2013 10:39:16 +0000 (12:39 +0200)]

sysv: Add forgotten superblock lock init for v7 fs

Superblock lock was replaced with (un)lock_super() removal, but left
uninitialized for Seventh Edition UNIX filesystem in the following commit (3.7):
c07cb01 sysv: drop lock/unlock super

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

John Johansen [Sun, 29 Sep 2013 15:39:22 +0000 (08:39 -0700)]

apparmor: fix suspicious RCU usage warning in policy.c/policy.h

The recent 3.12 pull request for apparmor was missing a couple rcu _protected
access modifiers. Resulting in the follow suspicious RCU usage

[   29.804534] [ INFO: suspicious RCU usage. ]
[   29.804539] 3.11.0+ #5 Not tainted
[   29.804541] -------------------------------
[   29.804545] security/apparmor/include/policy.h:363 suspicious rcu_dereference_check() usage!
[   29.804548]
[   29.804548] other info that might help us debug this:
[   29.804548]
[   29.804553]
[   29.804553] rcu_scheduler_active = 1, debug_locks = 1
[   29.804558] 2 locks held by apparmor_parser/1268:
[   29.804560]  #0:  (sb_writers#9){.+.+.+}, at: [<ffffffff81120a4c>] file_start_write+0x27/0x29
[   29.804576]  #1:  (&ns->lock){+.+.+.}, at: [<ffffffff811f5d88>] aa_replace_profiles+0x166/0x57c
[   29.804589]
[   29.804589] stack backtrace:
[   29.804595] CPU: 0 PID: 1268 Comm: apparmor_parser Not tainted 3.11.0+ #5
[   29.804599] Hardware name: ASUSTeK Computer Inc.         UL50VT          /UL50VT    , BIOS 217     03/01/2010
[   29.804602]  0000000000000000 ffff8800b95a1d90 ffffffff8144eb9b ffff8800b94db540
[   29.804611]  ffff8800b95a1dc0 ffffffff81087439 ffff880138cc3a18 ffff880138cc3a18
[   29.804619]  ffff8800b9464a90 ffff880138cc3a38 ffff8800b95a1df0 ffffffff811f5084
[   29.804628] Call Trace:
[   29.804636]  [<ffffffff8144eb9b>] dump_stack+0x4e/0x82
[   29.804642]  [<ffffffff81087439>] lockdep_rcu_suspicious+0xfc/0x105
[   29.804649]  [<ffffffff811f5084>] __aa_update_replacedby+0x53/0x7f
[   29.804655]  [<ffffffff811f5408>] __replace_profile+0x11f/0x1ed
[   29.804661]  [<ffffffff811f6032>] aa_replace_profiles+0x410/0x57c
[   29.804668]  [<ffffffff811f16d4>] profile_replace+0x35/0x4c
[   29.804674]  [<ffffffff81120fa3>] vfs_write+0xad/0x113
[   29.804680]  [<ffffffff81121609>] SyS_write+0x44/0x7a
[   29.804687]  [<ffffffff8145bfd2>] system_call_fastpath+0x16/0x1b
[   29.804691]
[   29.804694] ===============================
[   29.804697] [ INFO: suspicious RCU usage. ]
[   29.804700] 3.11.0+ #5 Not tainted
[   29.804703] -------------------------------
[   29.804706] security/apparmor/policy.c:566 suspicious rcu_dereference_check() usage!
[   29.804709]
[   29.804709] other info that might help us debug this:
[   29.804709]
[   29.804714]
[   29.804714] rcu_scheduler_active = 1, debug_locks = 1
[   29.804718] 2 locks held by apparmor_parser/1268:
[   29.804721]  #0:  (sb_writers#9){.+.+.+}, at: [<ffffffff81120a4c>] file_start_write+0x27/0x29
[   29.804733]  #1:  (&ns->lock){+.+.+.}, at: [<ffffffff811f5d88>] aa_replace_profiles+0x166/0x57c
[   29.804744]
[   29.804744] stack backtrace:
[   29.804750] CPU: 0 PID: 1268 Comm: apparmor_parser Not tainted 3.11.0+ #5
[   29.804753] Hardware name: ASUSTeK Computer Inc.         UL50VT          /UL50VT    , BIOS 217     03/01/2010
[   29.804756]  0000000000000000 ffff8800b95a1d80 ffffffff8144eb9b ffff8800b94db540
[   29.804764]  ffff8800b95a1db0 ffffffff81087439 ffff8800b95b02b0 0000000000000000
[   29.804772]  ffff8800b9efba08 ffff880138cc3a38 ffff8800b95a1dd0 ffffffff811f4f94
[   29.804779] Call Trace:
[   29.804786]  [<ffffffff8144eb9b>] dump_stack+0x4e/0x82
[   29.804791]  [<ffffffff81087439>] lockdep_rcu_suspicious+0xfc/0x105
[   29.804798]  [<ffffffff811f4f94>] aa_free_replacedby_kref+0x4d/0x62
[   29.804804]  [<ffffffff811f4f47>] ? aa_put_namespace+0x17/0x17
[   29.804810]  [<ffffffff811f4f0b>] kref_put+0x36/0x40
[   29.804816]  [<ffffffff811f5423>] __replace_profile+0x13a/0x1ed
[   29.804822]  [<ffffffff811f6032>] aa_replace_profiles+0x410/0x57c
[   29.804829]  [<ffffffff811f16d4>] profile_replace+0x35/0x4c
[   29.804835]  [<ffffffff81120fa3>] vfs_write+0xad/0x113
[   29.804840]  [<ffffffff81121609>] SyS_write+0x44/0x7a
[   29.804847]  [<ffffffff8145bfd2>] system_call_fastpath+0x16/0x1b

Reported-by: miles.lane@gmail.com
CC: paulmck@linux.vnet.ibm.com
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>

commit | commitdiff | tree

Tyler Hicks [Sun, 29 Sep 2013 15:39:21 +0000 (08:39 -0700)]

apparmor: Use shash crypto API interface for profile hashes

Use the shash interface, rather than the hash interface, when hashing
AppArmor profiles. The shash interface does not use scatterlists and it
is a better fit for what AppArmor needs.

This fixes a kernel paging BUG when aa_calc_profile_hash() is passed a
buffer from vmalloc(). The hash interface requires callers to handle
vmalloc() buffers differently than what AppArmor was doing. Due to
vmalloc() memory not being physically contiguous, each individual page
behind the buffer must be assigned to a scatterlist with sg_set_page()
and then the scatterlist passed to crypto_hash_update().

The shash interface does not have that limitation and allows vmalloc()
and kmalloc() buffers to be handled in the same manner.

BugLink: https://launchpad.net/bugs/1216294/
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=62261
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2013 22:02:38 +0000 (15:02 -0700)]

Linux 3.12-rc3

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2013 20:47:35 +0000 (13:47 -0700)]

Merge tag 'usb-3.12-rc3' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are a number of USB driver fixes for 3.12-rc3.

  These are all for host controller issues that have been reported, and
  there's a fix for an annoying error message that gets printed every
  time you remove a USB 3 device from the system that's been bugging me
  for a while"

* tag 'usb-3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: dwc3: add support for Merrifield
  USB: fsl/ehci: fix failure of checking PHY_CLK_VALID during reinitialization
  USB: Fix breakage in ffs_fs_mount()
  fsl/usb: Resolve PHY_CLK_VLD instability issue for ULPI phy
  usb/core/devio.c: Don't reject control message to endpoint with wrong direction bit
  usb: chipidea: USB_CHIPIDEA should depend on HAS_DMA
  usb: chipidea: udc: free pending TD at removal procedure
  usb: chipidea: imx: Add usb_phy_shutdown at probe's error path
  usb: chipidea: Fix memleak for ci->hw_bank.regmap when removal
  usb: chipidea: udc: fix the oops after rmmod gadget
  USB: fix PM config symbol in uhci-hcd, ehci-hcd, and xhci-hcd
  USB: OHCI: accept very late isochronous URBs
  USB: UHCI: accept very late isochronous URBs
  USB: iMX21: accept very late isochronous URBs
  usbcore: check usb device's state before sending a Set SEL control transfer
  xhci: Fix race between ep halt and URB cancellation
  usb: Fix xHCI host issues on remote wakeup.
  xhci: Ensure a command structure points to the correct trb on the command ring
  xhci: Fix oops happening after address device timeout

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2013 20:47:00 +0000 (13:47 -0700)]

Merge tag 'tty-3.12-rc3' of git://git./linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
"Here are some serial at tty driver fixes for 3.12-rc3

  The serial driver fixes some kref leaks, documentation is moved to the
  proper places, and the tty and n_tty fixes resolve some reported
  regressions.  There is still one outstanding tty regression fix that
  isn't in here yet, as I want to test it out some more, it will be sent
  for 3.12-rc4 if it checks out"

* tag 'tty-3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: ar933x_uart: move devicetree binding documentation
  tty: Fix SIGTTOU not sent with tcflush()
  n_tty: Fix EOF push index when termios changes
  serial: pch_uart: remove unnecessary tty_port_tty_get
  serial: pch_uart: fix tty-kref leak in dma-rx path
  serial: pch_uart: fix tty-kref leak in rx-error path
  serial: tegra: fix tty-kref leak

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2013 20:46:18 +0000 (13:46 -0700)]

Merge tag 'staging-3.12-rc3' of git://git./linux/kernel/git/gregkh/staging

Pull staging fixes from Greg KH:
"Here are some staging driver fixes, MAINTAINER updates, and a new
  device id.  All of these have been in the linux-next tree, and are
  pretty simple patches"

* tag 'staging-3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: r8188eu: Add new device ID
  staging: imx-drm: Fix probe failure
  staging: vt6656: [BUG] iwctl_siwencodeext return if device not open
  staging: vt6656: [BUG] main_usb.c oops on device_close move flag earlier.
  staging: vt6656: rxtx.c [BUG] s_vGetFreeContext dead lock on null apTD.
  Staging: rtl8192u: r819xU_cmdpkt: checking NULL value after doing dev_alloc_skb
  staging: usbip: Orphan usbip
  staging: r8188eu: Add files for new drive: Cocci spatch "noderef"
  staging: r8188eu: Cocci spatch "noderef"
  staging: octeon-usb: Cocci spatch "noderef"
  staging: r8188eu: Add files for new drive: Cocci spatch "noderef"
  MAINTAINERS: staging: dgnc and dgap drivers: add maintainer
  staging: lustre: Cocci spatch "noderef"

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2013 20:45:23 +0000 (13:45 -0700)]

Merge tag 'driver-core-3.12-rc3' of git://git./linux/kernel/git/gregkh/driver-core

Pull driver core / sysfs fixes from Greg KH:
"Here are 2 fixes for 3.12-rc3.  One fixes a sysfs problem with
  mounting caused by 3.12-rc1, and the other is a bug reported by the
  chromeos developers with the driver core.

  Both have been in linux-next for a bit"

* tag 'driver-core-3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  driver core : Fix use after free of dev->parent in device_shutdown
  sysfs: Allow mounting without CONFIG_NET

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2013 20:44:48 +0000 (13:44 -0700)]

Merge tag 'char-misc-3.12-rc3' of git://git./linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
"Here are some HyperV and MEI driver fixes for 3.12-rc3.  They resolve
  some issues that people have been reporting for them"

* tag 'char-misc-3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  Drivers: hv: vmbus: Terminate vmbus version negotiation on timeout
  Drivers: hv: util: Correctly support ws2008R2 and earlier
  mei: cancel stall timers in mei_reset
  mei: bus: stop wait for read during cl state transition
  mei: make me client counters less error prone

commit | commitdiff | tree

Anna Schumaker [Wed, 25 Sep 2013 21:02:48 +0000 (17:02 -0400)]

NFS: Give "flavor" an initial value to fix a compile warning

The previous patch introduces a compile warning by not assigning an initial
value to the "flavor" variable. This could only be a problem if the server
returns a supported secflavor list of length zero, but it's better to
fix this before it's ever hit.

Signed-off-by: Anna Schumaker <bjschuma@netapp.com>
Acked-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Weston Andros Adamson [Tue, 24 Sep 2013 17:58:02 +0000 (13:58 -0400)]

NFSv4.1: try SECINFO_NO_NAME flavs until one works

Call nfs4_lookup_root_sec for each flavor returned by SECINFO_NO_NAME until
one works.

One example of a situation this fixes:

- server configured for krb5
- server principal somehow gets deleted from KDC
- server still thinking krb is good, sends krb5 as first entry in
SECINFO_NO_NAME response
- client tries krb5, but this fails without even sending an RPC because
gssd's requests to the KDC can't find the server's principal

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Trond Myklebust [Thu, 26 Sep 2013 18:32:56 +0000 (14:32 -0400)]

NFSv4.1: Ensure memory ordering between nfs4_ds_connect and nfs4_fl_prepare_ds

We need to ensure that the initialisation of the data server nfs_client
structure in nfs4_ds_connect is correctly ordered w.r.t. the read of
ds->ds_clp in nfs4_fl_prepare_ds.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit | commitdiff | tree

Trond Myklebust [Thu, 26 Sep 2013 18:08:36 +0000 (14:08 -0400)]

NFSv4.1: nfs4_fl_prepare_ds - fix bugs when the connect attempt fails

- Fix an Oops when nfs4_ds_connect() returns an error.
- Always check the device status after waiting for a connect to complete.

Reported-by: Andy Adamson <andros@netapp.com>
Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: <stable@vger.kernel.org> # v3.10+

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2013 17:04:03 +0000 (10:04 -0700)]

Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf revert from Ingo Molnar:
"This fixes the 'perf top' regression Markus Trippelsdorf reported"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "perf symbols: Demangle cloned functions"

commit | commitdiff | tree

Linus Torvalds [Sun, 29 Sep 2013 17:02:40 +0000 (10:02 -0700)]

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"Nothing too major, radeon still has some dpm changes for off by
  default.

  Radeon, intel, msm:
   - radeon: a few more dpm fixes (still off by default), uvd fixes
   - i915: runtime warn backtrace and regression fix
   - msm: iommu changes fallout"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (27 commits)
  drm/msm: use drm_gem_dumb_destroy helper
  drm/msm: deal with mach/iommu.h removal
  drm/msm: Remove iommu include from mdp4_kms.c
  drm/msm: Odd PTR_ERR usage
  drm/i915: Fix up usage of SHRINK_STOP
  drm/radeon: fix hdmi audio on DCE3.0/3.1 asics
  drm/i915: preserve pipe A quirk in i9xx_set_pipeconf
  drm/i915/tv: clear adjusted_mode.flags
  drm/i915/dp: increase i2c-over-aux retry interval on AUX DEFER
  drm/radeon/cik: fix overflow in vram fetch
  drm/radeon: add missing hdmi callbacks for rv6xx
  drm/i915: Use a temporary va_list for two-pass string handling
  drm/radeon/uvd: lower msg&fb buffer requirements on UVD3
  drm/radeon: disable tests/benchmarks if accel is disabled
  drm/radeon: don't set default clocks for SI when DPM is disabled
  drm/radeon/dpm/ci: filter clocks based on voltage/clk dep tables
  drm/radeon/dpm/si: filter clocks based on voltage/clk dep tables
  drm/radeon/dpm/ni: filter clocks based on voltage/clk dep tables
  drm/radeon/dpm/btc: filter clocks based on voltage/clk dep tables
  drm/radeon/dpm: fetch the max clk from voltage dep tables helper
  ...

commit | commitdiff | tree

Ingo Molnar [Sun, 29 Sep 2013 14:12:54 +0000 (16:12 +0200)]

Revert "perf symbols: Demangle cloned functions"

This reverts commit de95ab53645a2f0015e0f68ee723f18dce2b8b51.

Markus Trippelsdorf reported that this commit broke 'perf top':

> I just see a gray screen with no text at all. Sometimes the
> following error messages are printed:
>
>  *** Error in `perf': invalid fastbin entry (free): 0x00000000029b18c0
>  ***
>  *** Error in `perf': malloc(): memory corruption (fast): 0x0000000000ee0b10 ***

While this code is fixable, the commit itself fails on several levels:

- it should have been a separate helper function
- why the heck does it do strchr() twice
- it casts a const char * over into char *
- sloppy style
- it's not even a regression fix!

So lets revert it and re-try the patch in v3.13.

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

commit | commitdiff | tree

Dave Airlie [Sun, 29 Sep 2013 00:06:28 +0000 (10:06 +1000)]

Merge branch 'msm-fixes-3.12-rc2' of git://people.freedesktop.org/~robclark/linux into drm-fixes

A small fix + deal with fallout of iommu changes + use new
drm_gem_dumb_destroy helper.

* 'msm-fixes-3.12-rc2' of git://people.freedesktop.org/~robclark/linux:
  drm/msm: use drm_gem_dumb_destroy helper
  drm/msm: deal with mach/iommu.h removal
  drm/msm: Remove iommu include from mdp4_kms.c
  drm/msm: Odd PTR_ERR usage

commit | commitdiff | tree

Linus Torvalds [Sat, 28 Sep 2013 21:22:17 +0000 (14:22 -0700)]

Merge branches 'sched-urgent-for-linus', 'timers-urgent-for-linus' and 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull scheduler, timer and x86 fixes from Ingo Molnar:
- A context tracking ARM build and functional fix
- A handful of ARM clocksource/clockevent driver fixes
- An AMD microcode patch level sysfs reporting fixlet

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  arm: Fix build error with context tracking calls

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource: em_sti: Set cpu_possible_mask to fix SMP broadcast
  clocksource: of: Respect device tree node status
  clocksource: exynos_mct: Set IRQ affinity when the CPU goes online
  arm: clocksource: mvebu: Use the main timer as clock source from DT

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode/AMD: Fix patch level reporting for family 15h

commit | commitdiff | tree

Linus Torvalds [Sat, 28 Sep 2013 21:21:13 +0000 (14:21 -0700)]

Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"A couple of tooling fixlets and a PMU detection printout fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Fix PMU detection printout when no PMU is detected
  perf symbols: Demangle cloned functions
  perf machine: Fix path unpopulated in machine__create_modules()
  perf tools: Explicitly add libdl dependency
  perf probe: Fix probing symbols with optimization suffix
  perf trace: Add mmap2 handler
  perf kmem: Make it work again on non NUMA machines

commit | commitdiff | tree

Linus Torvalds [Sat, 28 Sep 2013 20:52:05 +0000 (13:52 -0700)]

Merge tag 'xfs-for-linus-v3.12-rc3' of git://oss.sgi.com/xfs/xfs

Pull xfs bugfixes from Ben Myers:
- fix for directory node collapse regression
- fix for recovery over stale on disk structures
- fix for eofblocks ioctl
- fix asserts in xfs_inode_free
- lock the ail before removing an item from it

* tag 'xfs-for-linus-v3.12-rc3' of git://oss.sgi.com/xfs/xfs:
  xfs: fix node forward in xfs_node_toosmall
  xfs: log recovery lsn ordering needs uuid check
  xfs: fix XFS_IOC_FREE_EOFBLOCKS definition
  xfs: asserting lock not held during freeing not valid
  xfs: lock the AIL before removing the buffer item

commit | commitdiff | tree

Linus Torvalds [Sat, 28 Sep 2013 20:44:09 +0000 (13:44 -0700)]

Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
"Some driver bugfixes for the I2C subsystem"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: ismt: initialize DMA buffer
  i2c: designware: 10-bit addressing mode enabling if I2C_DYNAMIC_TAR_UPDATE is set
  i2c: mv64xxx: Do not use writel_relaxed()
  i2c: mv64xxx: Fix some build warnings
  i2c: s3c2410: fix clk_disable/clk_unprepare WARNings

commit | commitdiff | tree

Linus Torvalds [Sat, 28 Sep 2013 20:27:31 +0000 (13:27 -0700)]

Merge tag 'pm+acpi-3.12-rc3' of git://git./linux/kernel/git/rafael/linux-pm

Pull ACPI and power management fixes from Rafael Wysocki:
"These fix one recent cpufreq regression, a few older bugs that may
  harm users and a kerneldoc typo.

  Specifics:

   1) After the recent locking changes in the cpufreq core it is
      possible to trigger BUG_ON(!policy) in lock_policy_rwsem_read() if
      cpufreq_get() is called before registering a cpufreq driver.  Fix
      from Viresh Kumar.

   2) If intel_pstate has been loaded already, it doesn't make sense to
      do anything in acpi_cpufreq_init() and moreover doing something in
      there in that case may be harmful, so make that function return
      immediately if another cpufreq driver is already present.  From
      Yinghai Lu.

   3) The ACPI IPMI driver sometimes attempts to acquire a mutex from
      interrupt context, which can be avoided by replacing that mutex
      with a spinlock.  From Lv Zheng.

   4) A NULL pointer may be dereferenced by the exynos5440 cpufreq
      driver if a memory allocation made by it fails.  Fix from Sachin
      Kamat.

   5) Hanjun Guo's commit fixes a typo in the kerneldoc comment
      documenting acpi_bus_unregister_driver()"

* tag 'pm+acpi-3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / scan: fix typo in comments of acpi_bus_unregister_driver()
  cpufreq: exynos5440: Fix potential NULL pointer dereference
  cpufreq: check cpufreq driver is valid and cpufreq isn't disabled in cpufreq_get()
  acpi-cpufreq: skip loading acpi_cpufreq after intel_pstate
  ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler()

commit | commitdiff | tree

Yinghai Lu [Sat, 28 Sep 2013 20:13:07 +0000 (13:13 -0700)]

PCI: Workaround missing pci_set_master in pci drivers

Ben Herrenschmidt found that commit 928bea964827 ("PCI: Delay enabling
bridges until they're needed") breaks PCI in some powerpc environments.

The reason is that the PCIe port driver will call pci_enable_device() on
the bridge, so the device is enabled, but skips pci_set_master because
pcie_port_auto and no acpi on powerpc.

Because of that, pci_enable_bridge() later on (called as a result of the
child device driver doing pci_enable_device) will see the bridge as
already enabled and will not call pci_set_master() on it.

Fixed by add checking in pci_enable_bridge, and call pci_set_master
if driver skip that.

That will make the code more robot and wade off problem for missing
pci_set_master in drivers.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Linus Torvalds [Sat, 28 Sep 2013 19:36:19 +0000 (12:36 -0700)]

Merge branch 'lockref' of git://git./linux/kernel/git/s390/linux

Pull s390 lockref enablement from Heiko Carstens:
"Enabling the new lockless lockref variant on s390 would have been
  trivial until Tony Luck added a cpu_relax() call into the
  CMPXCHG_LOOP(), with commit d472d9d98b46 ("lockref: Relax in cmpxchg
  loop")

  As already mentioned cpu_relax() is very expensive on s390 since it
  yields() the current virtual cpu.  So we are talking of several
  thousand cycles.  Considering this enabling the lockless lockref
  variant would contradict the intention of the new semantics.  And also
  some quick measurements show performance regressions of 50% and more.

  Simply removing the cpu_relax() call again seems also not very
  desireable since Waiman Long reported that for some workloads the call
  improved performance by 5%."

* 'lockref' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: enable ARCH_USE_CMPXCHG_LOCKREF
  lockref: use arch_mutex_cpu_relax() in CMPXCHG_LOOP()
  mutex: replace CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX with simple ifdef

commit | commitdiff | tree

Jean Delvare [Fri, 27 Sep 2013 20:17:39 +0000 (13:17 -0700)]

kernel/params: fix handling of signed integer types

Commit 6072ddc8520b ("kernel: replace strict_strto*() with kstrto*()")
broke the handling of signed integer types, fix it.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Reported-by: Christian Kujau <lists@nerdbynature.de>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Linus Torvalds [Sat, 28 Sep 2013 18:57:26 +0000 (11:57 -0700)]

Merge tag 'devicetree-fixes' of git://git./linux/kernel/git/robh/linux

Pull DeviceTree fixes from Rob Herring:
"Clean-up to fix some warnings for !OF builds and spelling fixes in
  docs:

   - Clean-up openrisc prom.h
   - Fix warnings caused by of_irq.h ifdefs
   - Spelling fix for Synopsys"

* tag 'devicetree-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dts: Fix misspelling of Synopsys
  of: clean-up ifdefs in of_irq.h
  openrisc: clean-up prom.h

commit | commitdiff | tree

Linus Torvalds [Sat, 28 Sep 2013 18:56:34 +0000 (11:56 -0700)]

Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm

Pull ARM fixes from Russell King:
"Just a few relatively small ARM fixes found since the last merge
  window, nothing too exciting"

* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
  ARM: 7837/3: fix Thumb-2 bug in AES assembler code
  ARM: only allow kernel mode neon with AEABI
  ARM: 7839/1: entry: fix tracing of ARM-private syscalls
  ARM: 7836/1: add __get_user_unaligned/__put_user_unaligned

commit | commitdiff | tree

James Ralston [Tue, 24 Sep 2013 23:47:55 +0000 (16:47 -0700)]

i2c: ismt: initialize DMA buffer

This patch adds code to initialize the DMA buffer to compensate for
possible hardware data corruption.

Signed-off-by: James Ralston <james.d.ralston@intel.com>
[wsa: changed to use 'sizeof']
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>

commit | commitdiff | tree

Rob Clark [Sat, 28 Sep 2013 14:13:04 +0000 (10:13 -0400)]

drm/msm: use drm_gem_dumb_destroy helper

Signed-off-by: Rob Clark <robdclark@gmail.com>

commit | commitdiff | tree

Rob Clark [Sat, 28 Sep 2013 14:07:06 +0000 (10:07 -0400)]

drm/msm: deal with mach/iommu.h removal

We still need an API exported by msm iommu driver (but not visible in
any public header anymore). For now, just declare the prototype
ourselves, but when msm iommu driver provides a better option, use that
instead.

Signed-off-by: Rob Clark <robdclark@gmail.com>

commit | commitdiff | tree

Ingo Molnar [Sat, 28 Sep 2013 13:48:48 +0000 (15:48 +0200)]

perf/x86: Fix PMU detection printout when no PMU is detected

Ran into this cryptic PMU bootup log recently:

[    0.124047] Performance Events:
[    0.125000] smpboot: ...

Turns out we print this if no PMU is detected. Fall back to
the right condition so that the following is printed:

[    0.122381] Performance Events: no PMU driver, software events only.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/tip-u2fwaUffakjp0qkpRfqljgsn@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Unnamed repository; edit this file 'description' to name the repository.

RSS Atom