Programming Languages Research Group: Git - firefly-linux-kernel-4.4.55.git/log

Fix cap_capable to only allow owners in the parent user namespace to have caps.

Andy Lutomirski pointed out that the current behavior of allowing the
owner of a user namespace to have all caps when that owner is not in a
parent user namespace is wrong.  Add a test to ensure the owner of a user
namespace is in the parent of the user namespace to fix this bug.

Thankfully this bug did not apply to the initial user namespace, keeping
the mischief that can be caused by this bug quite small.

This is bug was introduced in v3.5 by commit 783291e6900
"Simplify the user_namespace by making userns->creator a kuid."
But did not matter until the permisions required to create
a user namespace were relaxed allowing a user namespace to be created
inside of a user namespace.

The bug made it possible for the owner of a user namespace to be
present in a child user namespace.  Since the owner of a user nameapce
is granted all capabilities it became possible for users in a
grandchild user namespace to have all privilges over their parent user
namspace.

Reorder the checks in cap_capable.  This should make the common case
faster and make it clear that nothing magic happens in the initial
user namespace.  The reordering is safe because cred->user_ns
can only be in targ_ns or targ_ns->parent but not both.

Add a comment a the top of the loop to make the logic of
the code clear.

Add a distinct variable ns that changes as we walk up
the user namespace hierarchy to make it clear which variable
is changing.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

proc: Usable inode numbers for the namespace file descriptors.

Assign a unique proc inode to each namespace, and use that
inode number to ensure we only allocate at most one proc
inode for every namespace in proc.

A single proc inode per namespace allows userspace to test
to see if two processes are in the same namespace.

This has been a long requested feature and only blocked because
a naive implementation would put the id in a global space and
would ultimately require having a namespace for the names of
namespaces, making migration and certain virtualization tricks
impossible.

We still don't have per superblock inode numbers for proc, which
appears necessary for application unaware checkpoint/restart and
migrations (if the application is using namespace file descriptors)
but that is now allowd by the design if it becomes important.

I have preallocated the ipc and uts initial proc inode numbers so
their structures can be statically initialized.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

proc: Fix the namespace inode permission checks.

Change the proc namespace files into symlinks so that
we won't cache the dentries for the namespace files
which can bypass the ptrace_may_access checks.

To support the symlinks create an additional namespace
inode with it's own set of operations distinct from the
proc pid inode and dentry methods as those no longer
make sense.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

proc: Generalize proc inode allocation

Generalize the proc inode allocation so that it can be
used without having to having to create a proc_dir_entry.

This will allow namespace file descriptors to remain light
weight entitities but still have the same inode number
when the backing namespace is the same.

Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

userns: Allow unprivilged mounts of proc and sysfs

- The context in which proc and sysfs are mounted have no
effect on the the uid/gid of their files so no conversion is
needed except allowing the mount.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct file

To keep things sane in the context of file descriptor passing derive the
user namespace that uids are mapped into from the opener of the file
instead of from current.

When writing to the maps file the lower user namespace must always
be the parent user namespace, or setting the mapping simply does
not make sense. Enforce that the opener of the file was in
the parent user namespace or the user namespace whose mapping
is being set.

Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

procfs: Print task uids and gids in the userns that opened the proc file

Instead of using current_userns() use the userns of the opener
of the file so that if the file is passed between processes
the contents of the file do not change.

Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

userns: Implement unshare of the user namespace

- Add CLONE_THREAD to the unshare flags if CLONE_NEWUSER is selected
  As changing user namespaces is only valid if all there is only
  a single thread.
- Restore the code to add CLONE_VM if CLONE_THREAD is selected and
  the code to addCLONE_SIGHAND if CLONE_VM is selected.
  Making the constraints in the code clear.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

userns: Implent proc namespace operations

This allows entering a user namespace, and the ability
to store a reference to a user namespace with a bind
mount.

Addition of missing userns_ns_put in userns_install
from Gao feng <gaofeng@cn.fujitsu.com>

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

userns: Kill task_user_ns

The task_user_ns function hides the fact that it is getting the user
namespace from struct cred on the task. struct cred may go away as
soon as the rcu lock is released. This leads to a race where we
can dereference a stale user namespace pointer.

To make it obvious a struct cred is involved kill task_user_ns.

To kill the race modify the users of task_user_ns to only
reference the user namespace while the rcu lock is held.

Cc: Kees Cook <keescook@chromium.org>
Cc: James Morris <james.l.morris@oracle.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

userns: Make create_new_namespaces take a user_ns parameter

Modify create_new_namespaces to explicitly take a user namespace
parameter, instead of implicitly through the task_struct.

This allows an implementation of unshare(CLONE_NEWUSER) where
the new user namespace is not stored onto the current task_struct
until after all of the namespaces are created.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

userns: Allow unprivileged use of setns.

- Push the permission check from the core setns syscall into
  the setns install methods where the user namespace of the
  target namespace can be determined, and used in a ns_capable
  call.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

userns: Allow unprivileged users to create new namespaces

If an unprivileged user has the appropriate capabilities in their
current user namespace allow the creation of new namespaces.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

userns: Allow setting a userns mapping to your current uid.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

userns: Allow chown and setgid preservation

- Allow chown if CAP_CHOWN is present in the current user namespace
  and the uid of the inode maps into the current user namespace, and
  the destination uid or gid maps into the current user namespace.

- Allow perserving setgid when changing an inode if CAP_FSETID is
  present in the current user namespace and the owner of the file has
  a mapping into the current user namespace.

Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

userns: Allow unprivileged users to create user namespaces.

Now that we have been through every permission check in the kernel
having uid == 0 and gid == 0 in your local user namespace no
longer adds any special privileges.  Even having a full set
of caps in your local user namespace is safe because capabilies
are relative to your local user namespace, and do not confer
unexpected privileges.

Over the long term this should allow much more of the kernels
functionality to be safely used by non-root users.  Functionality
like unsharing the mount namespace that is only unsafe because
it can fool applications whose privileges are raised when they
are executed.  Since those applications have no privileges in
a user namespaces it becomes safe to spoof and confuse those
applications all you want.

Those capabilities will still need to be enabled carefully because
we may still need things like rlimits on the number of unprivileged
mounts but that is to avoid DOS attacks not to avoid fooling root
owned processes.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped

When performing an exec where the binary lives in one user namespace and
the execing process lives in another usre namespace there is the possibility
that the target uids can not be represented.

Instead of failing the exec simply ignore the suid/sgid bits and run
the binary with lower privileges. We already do this in the case
of MNT_NOSUID so this should be a well tested code path.

As the user and group are not changed this should not introduce any
security issues.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

userns: fix return value on mntns_install() failure

Change return value from -EINVAL to -EPERM when the permission check fails.

Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

vfs: Allow unprivileged manipulation of the mount namespace.

- Add a filesystem flag to mark filesystems that are safe to mount as
  an unprivileged user.

- Add a filesystem flag to mark filesystems that don't need MNT_NODEV
  when mounted by an unprivileged user.

- Relax the permission checks to allow unprivileged users that have
  CAP_SYS_ADMIN permissions in the user namespace referred to by the
  current mount namespace to be allowed to mount, unmount, and move
  filesystems.

Acked-by: "Serge E. Hallyn" <serge@hallyn.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

vfs: Only support slave subtrees across different user namespaces

Sharing mount subtress with mount namespaces created by unprivileged
users allows unprivileged mounts created by unprivileged users to
propagate to mount namespaces controlled by privileged users.

Prevent nasty consequences by changing shared subtrees to slave
subtress when an unprivileged users creates a new mount namespace.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

vfs: Add a user namespace reference from struct mnt_namespace

This will allow for support for unprivileged mounts in a new user namespace.

Acked-by: "Serge E. Hallyn" <serge@hallyn.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

vfs: Add setns support for the mount namespace

setns support for the mount namespace is a little tricky as an
arbitrary decision must be made about what to set fs->root and
fs->pwd to, as there is no expectation of a relationship between
the two mount namespaces.  Therefore I arbitrarily find the root
mount point, and follow every mount on top of it to find the top
of the mount stack.  Then I set fs->root and fs->pwd to that
location.  The topmost root of the mount stack seems like a
reasonable place to be.

Bind mount support for the mount namespace inodes has the
possibility of creating circular dependencies between mount
namespaces.  Circular dependencies can result in loops that
prevent mount namespaces from every being freed.  I avoid
creating those circular dependencies by adding a sequence number
to the mount namespace and require all bind mounts be of a
younger mount namespace into an older mount namespace.

Add a helper function proc_ns_inode so it is possible to
detect when we are attempting to bind mound a namespace inode.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

vfs: Allow chroot if you have CAP_SYS_CHROOT in your user namespace

Once you are confined to a user namespace applications can not gain
privilege and escape the user namespace so there is no longer a reason
to restrict chroot.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

pidns: Support unsharing the pid namespace.

Unsharing of the pid namespace unlike unsharing of other namespaces
does not take affect immediately.  Instead it affects the children
created with fork and clone.  The first of these children becomes the init
process of the new pid namespace, the rest become oddball children
of pid 0.  From the point of view of the new pid namespace the process
that created it is pid 0, as it's pid does not map.

A couple of different semantics were considered but this one was
settled on because it is easy to implement and it is usable from
pam modules.  The core reasons for the existence of unshare.

I took a survey of the callers of pam modules and the following
appears to be a representative sample of their logic.
{
setup stuff include pam
child = fork();
if (!child) {
setuid()
                exec /bin/bash
        }
        waitpid(child);

        pam and other cleanup
}

As you can see there is a fork to create the unprivileged user
space process.  Which means that the unprivileged user space
process will appear as pid 1 in the new pid namespace.  Further
most login processes do not cope with extraneous children which
means shifting the duty of reaping extraneous child process to
the creator of those extraneous children makes the system more
comprehensible.

The practical reason for this set of pid namespace semantics is
that it is simple to implement and verify they work correctly.
Whereas an implementation that requres changing the struct
pid on a process comes with a lot more races and pain.  Not
the least of which is that glibc caches getpid().

These semantics are implemented by having two notions
of the pid namespace of a proces.  There is task_active_pid_ns
which is the pid namspace the process was created with
and the pid namespace that all pids are presented to
that process in.  The task_active_pid_ns is stored
in the struct pid of the task.

Then there is the pid namespace that will be used for children
that pid namespace is stored in task->nsproxy->pid_ns.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

pidns: Consolidate initialzation of special init task state

Instead of setting child_reaper and SIGNAL_UNKILLABLE one way
for the system init process, and another way for pid namespace
init processes test pid->nr == 1 and use the same code for both.

For the global init this results in SIGNAL_UNKILLABLE being set
much earlier in the initialization process.

This is a small cleanup and it paves the way for allowing unshare and
enter of the pid namespace as that path like our global init also will
not set CLONE_NEWPID.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

pidns: Add setns support

- Pid namespaces are designed to be inescapable so verify that the
  passed in pid namespace is a child of the currently active
  pid namespace or the currently active pid namespace itself.

  Allowing the currently active pid namespace is important so
  the effects of an earlier setns can be cancelled.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

pidns: Deny strange cases when creating pid namespaces.

task_active_pid_ns(current) != current->ns_proxy->pid_ns will
soon be allowed to support unshare and setns.

The definition of creating a child pid namespace when
task_active_pid_ns(current) != current->ns_proxy->pid_ns could be that
we create a child pid namespace of current->ns_proxy->pid_ns.  However
that leads to strange cases like trying to have a single process be
init in multiple pid namespaces, which is racy and hard to think
about.

The definition of creating a child pid namespace when
task_active_pid_ns(current) != current->ns_proxy->pid_ns could be that
we create a child pid namespace of task_active_pid_ns(current).  While
that seems less racy it does not provide any utility.

Therefore define the semantics of creating a child pid namespace when
task_active_pid_ns(current) != current->ns_proxy->pid_ns to be that the
pid namespace creation fails.  That is easy to implement and easy
to think about.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

pidns: Wait in zap_pid_ns_processes until pid_ns->nr_hashed == 1

Looking at pid_ns->nr_hashed is a bit simpler and it works for
disjoint process trees that an unshare or a join of a pid_namespace
may create.

Acked-by: "Serge E. Hallyn" <serge@hallyn.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

pidns: Don't allow new processes in a dead pid namespace.

Set nr_hashed to -1 just before we schedule the work to cleanup proc.
Test nr_hashed just before we hash a new pid and if nr_hashed is < 0
fail.

This guaranteees that processes never enter a pid namespaces after we
have cleaned up the state to support processes in a pid namespace.

Currently sending SIGKILL to all of the process in a pid namespace as
init exists gives us this guarantee but we need something a little
stronger to support unsharing and joining a pid namespace.

Acked-by: "Serge E. Hallyn" <serge@hallyn.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

pidns: Make the pidns proc mount/umount logic obvious.

Track the number of pids in the proc hash table.  When the number of
pids goes to 0 schedule work to unmount the kernel mount of proc.

Move the mount of proc into alloc_pid when we allocate the pid for
init.

Remove the surprising calls of pid_ns_release proc in fork and
proc_flush_task.  Those code paths really shouldn't know about proc
namespace implementation details and people have demonstrated several
times that finding and understanding those code paths is difficult and
non-obvious.

Because of the call path detach pid is alwasy called with the
rtnl_lock held free_pid is not allowed to sleep, so the work to
unmounting proc is moved to a work queue.  This has the side benefit
of not blocking the entire world waiting for the unnecessary
rcu_barrier in deactivate_locked_super.

In the process of making the code clear and obvious this fixes a bug
reported by Gao feng <gaofeng@cn.fujitsu.com> where we would leak a
mount of proc during clone(CLONE_NEWPID|CLONE_NEWNET) if copy_pid_ns
succeeded and copy_net_ns failed.

Acked-by: "Serge E. Hallyn" <serge@hallyn.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

pidns: Use task_active_pid_ns where appropriate

The expressions tsk->nsproxy->pid_ns and task_active_pid_ns
aka ns_of_pid(task_pid(tsk)) should have the same number of
cache line misses with the practical difference that
ns_of_pid(task_pid(tsk)) is released later in a processes life.

Furthermore by using task_active_pid_ns it becomes trivial
to write an unshare implementation for the the pid namespace.

So I have used task_active_pid_ns everywhere I can.

In fork since the pid has not yet been attached to the
process I use ns_of_pid, to achieve the same effect.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

pidns: Capture the user namespace and filter ns_last_pid

- Capture the the user namespace that creates the pid namespace
- Use that user namespace to test if it is ok to write to
/proc/sys/kernel/ns_last_pid.

Zhao Hongjiang <zhaohongjiang@huawei.com> noticed I was missing a put_user_ns
in when destroying a pid_ns. I have foloded his patch into this one
so that bisects will work properly.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

procfs: Don't cache a pid in the root inode.

Now that we have s_fs_info pointing to our pid namespace
the original reason for the proc root inode having a struct
pid is gone.

Caching a pid in the root inode has led to some complicated
code. Now that we don't need the struct pid, just remove it.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

procfs: Use the proc generic infrastructure for proc/self.

I had visions at one point of splitting proc into two filesystems.  If
that had happened proc/self being the the part of proc that actually deals
with pids would have been a nice cleanup.  As it is proc/self requires
a lot of unnecessary infrastructure for a single file.

The only user visible change is that a mounted /proc for a pid namespace
that is dead now shows a broken proc symlink, instead of being completely
invisible.  I don't think anyone will notice or care.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

userns: On mips modify check_same_owner to use uid_eq

The kbuild test robot <fengguang.wu@intel.com> report the following error
when building mips with user namespace support enabled.

All error/warnings:
arch/mips/kernel/mips-mt-fpaff.c: In function 'check_same_owner':
arch/mips/kernel/mips-mt-fpaff.c:53:22: error: invalid operands to binary == (have 'kuid_t' and 'kuid_t')
arch/mips/kernel/mips-mt-fpaff.c:54:15: error: invalid operands to binary == (have 'kuid_t' and 'kuid_t')

Replace "a == b" with uid_eq(a, b) removes this error and allows the
code to work with user namespaces enabled.

Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

userns: make each net (net_ns) belong to a user_ns

The user namespace which creates a new network namespace owns that
namespace and all resources created in it. This way we can target
capability checks for privileged operations against network resources to
the user_ns which created the network namespace in which the resource
lives. Privilege to the user namespace which owns the network
namespace, or any parent user namespace thereof, provides the same
privilege to the network resource.

This patch is reworked from a version originally by
Serge E. Hallyn <serge.hallyn@canonical.com>

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

netns: Deduplicate and fix copy_net_ns when !CONFIG_NET_NS

The copy of copy_net_ns used when the network stack is not
built is broken as it does not return -EINVAL when attempting
to create a new network namespace. We don't even have
a previous network namespace.

Since we need a copy of copy_net_ns in net/net_namespace.h that is
available when the networking stack is not built at all move the
correct version of copy_net_ns from net_namespace.c into net_namespace.h
Leaving us with just 2 versions of copy_net_ns. One version for when
we compile in network namespace suport and another stub for all other
occasions.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

userns: Support fuse interacting with multiple user namespaces

Use kuid_t and kgid_t in struct fuse_conn and struct fuse_mount_data.

The connection between between a fuse filesystem and a fuse daemon is
established when a fuse filesystem is mounted and provided with a file
descriptor the fuse daemon created by opening /dev/fuse.

For now restrict the communication of uids and gids between the fuse
filesystem and the fuse daemon to the initial user namespace.  Enforce
this by verifying the file descriptor passed to the mount of fuse was
opened in the initial user namespace.  Ensuring the mount happens in
the initial user namespace is not necessary as mounts from non-initial
user namespaces are not yet allowed.

In fuse_req_init_context convert the currrent fsuid and fsgid into the
initial user namespace for the request that will be sent to the fuse
daemon.

In fuse_fill_attr convert the uid and gid passed from the fuse daemon
from the initial user namespace into kuids and kgids.

In iattr_to_fattr called from fuse_setattr convert kuids and kgids
into the uids and gids in the initial user namespace before passing
them to the fuse filesystem.

In fuse_change_attributes_common called from fuse_dentry_revalidate,
fuse_permission, fuse_geattr, and fuse_setattr, and fuse_iget convert
the uid and gid from the fuse daemon into a kuid and a kgid to store
on the fuse inode.

By default fuse mounts are restricted to task whose uid, suid, and
euid matches the fuse user_id and whose gid, sgid, and egid matches
the fuse group id.  Convert the user_id and group_id mount options
into kuids and kgids at mount time, and use uid_eq and gid_eq to
compare the in fuse_allow_task.

Cc: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

userns: Support autofs4 interacing with multiple user namespaces

Use kuid_t and kgid_t in struct autofs_info and struct autofs_wait_queue.

When creating directories and symlinks default the uid and gid of
the mount requester to the global root uid and gid. autofs4_wait
will update these fields when a mount is requested.

When generating autofsv5 packets report the uid and gid of the mount
requestor in user namespace of the process that opened the pipe,
reporting unmapped uids and gids as overflowuid and overflowgid.

In autofs_dev_ioctl_requester return the uid and gid of the last mount
requester converted into the calling processes user namespace. When the
uid or gid don't map return overflowuid and overflowgid as appropriate,
allowing failure to find a mount requester to be distinguished from
failure to map a mount requester.

The uid and gid mount options specifying the user and group of the
root autofs inode are converted into kuid and kgid as they are parsed
defaulting to the current uid and current gid of the process that
mounts autofs.

Mounting of autofs for the present remains confined to processes in
the initial user namespace.

Cc: Ian Kent <raven@themaw.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Linux 3.7-rc3

Merge tag 'ktest-v3.7-rc2' of git://git./linux/kernel/git/rostedt/linux-ktest

Pull ktest confusion fix from Steven Rostedt:
"With the v3.7-rc2 kernel, the network cards on my target boxes were
  not being brought up.

  I found that the modules for the network was not being installed.
  This was due to the config CONFIG_MODULES_USE_ELF_RELA that came
  before CONFIG_MODULES, and confused ktest in thinking that
  CONFIG_MODULES=y was not found.

  Ktest needs to test all configs and not just stop if something starts
  with CONFIG_MODULES."

* tag 'ktest-v3.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
  ktest: Fix ktest confusion with CONFIG_MODULES_USE_ELF_RELA

Merge tag 'spi-mxs' of git://git./linux/kernel/git/broonie/misc

Pull minor spi MXS fixes from Mark Brown:
"These fixes are both pretty minor ones and are driver local."

* tag 'spi-mxs' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/misc:
spi: mxs: Terminate DMA in case of DMA timeout
spi: mxs: Assign message status after transfer finished

Merge tag 'fixes-for-3.7' of git://git./linux/kernel/git/arm/arm-soc

Pull arm-soc fixes from Arnd Bergmann:
"Bug fixes for a number of ARM platforms, mostly OMAP, imx and at91.

  These come a little later than I had hoped but unfortunately we had a
  few of these patches cause regressions themselves and had to work out
  how to deal with those in the meantime."

* tag 'fixes-for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (38 commits)
  Revert "ARM i.MX25: Fix PWM per clock lookups"
  ARM: versatile: fix versatile_defconfig
  ARM: mvebu: update defconfig with 3.7 changes
  ARM: at91: fix at91x40 build
  ARM: socfpga: Fix socfpga compilation with early_printk() enabled
  ARM: SPEAr: Remove unused empty files
  MAINTAINERS: Add arm-soc tree entry
  ARM: dts: mxs: add the "clock-names" for gpmi-nand
  ARM: ux500: Correct SDI5 address and add some format changes
  ARM: ux500: Specify AMBA Primecell IDs for Nomadik I2C in DT
  ARM: ux500: Fix build error relating to IRQCHIP_SKIP_SET_WAKE
  ARM: at91: drop duplicated config SOC_AT91SAM9 entry
  ARM: at91/i2c: change id to let i2c-at91 work
  ARM: at91/i2c: change id to let i2c-gpio work
  ARM: at91/dts: at91sam9g20ek_common: Fix typos in buttons labels.
  ARM: at91: fix external interrupt specification in board code
  ARM: at91: fix external interrupts in non-DT case
  ARM: at91: at91sam9g10: fix SOC type detection
  ARM: at91/tc: fix typo in the DT document
  ARM: AM33XX: Fix configuration of dmtimer parent clock by dmtimer driverDate:Wed, 17 Oct 2012 13:55:55 -0500
  ...

Lock splice_read and splice_write functions

Functions generic_file_splice_read and generic_file_splice_write access
the pagecache directly. For block devices these functions must be locked
so that block size is not changed while they are in progress.

This patch is an additional fix for commit b87570f5d349 ("Fix a crash
when block device is read and block size is changed at the same time")
that locked aio_read, aio_write and mmap against block size change.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

percpu-rw-semaphores: use rcu_read_lock_sched

Use rcu_read_lock_sched / rcu_read_unlock_sched / synchronize_sched
instead of rcu_read_lock / rcu_read_unlock / synchronize_rcu.

This is an optimization. The RCU-protected region is very small, so
there will be no latency problems if we disable preempt in this region.

So we use rcu_read_lock_sched / rcu_read_unlock_sched that translates
to preempt_disable / preempt_disable. It is smaller (and supposedly
faster) than preemptible rcu_read_lock / rcu_read_unlock.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

percpu-rw-semaphores: use light/heavy barriers

This patch introduces new barrier pair light_mb() and heavy_mb() for
percpu rw semaphores.

This patch fixes a bug in percpu-rw-semaphores where a barrier was
missing in percpu_up_write.

This patch improves performance on the read path of
percpu-rw-semaphores: on non-x86 cpus, there was a smp_mb() in
percpu_up_read. This patch changes it to a compiler barrier and removes
the "#if defined(X86) ..." condition.

From: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Revert "ARM i.MX25: Fix PWM per clock lookups"

This reverts commit 92063cee118655d25b50d04eb77b012f3287357a, it
was applied prematurely, causing this build error for
imx_v4_v5_defconfig:

arch/arm/mach-imx/clk-imx25.c: In function 'mx25_clocks_init':
arch/arm/mach-imx/clk-imx25.c:206:26: error: 'pwm_ipg_per' undeclared (first use in this function)
arch/arm/mach-imx/clk-imx25.c:206:26: note: each undeclared identifier is reported only once for each function it appears in

Sascha Hauer explains:
> There are several gates missing in clk-imx25.c. I have a patch which
> adds support for them and I seem to have missed that the above depends
> on it.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

ARM: versatile: fix versatile_defconfig

With the introduction of CONFIG_ARCH_MULTIPLATFORM, versatile is
no longer the default platform, so we need to enable
CONFIG_ARCH_VERSATILE explicitly in order for that to be selected
rather than the multiplatform configuration.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

ARM: mvebu: update defconfig with 3.7 changes

The split of 370 and XP into two Kconfig options and the multiplatform
kernel support has changed a few Kconfig symbols, so let's update the
mvebu_defconfig file with the latest changes.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

ARM: at91: fix at91x40 build

patch 738a0fd7 "ARM: at91: fix external interrupts in non-DT case"
fixed a run-time error on some at91 platforms but did not apply
the same change to at91x40, which now doesn't build.

This changes at91x40 in the same way that the other platforms
were changed.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>

Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:
"This is what we usually expect at this stage of the game, lots of
  little things, mostly in drivers.  With the occasional 'oops didn't
  mean to do that' kind of regressions in the core code."

1) Uninitialized data in __ip_vs_get_timeouts(), from Arnd Bergmann

2) Reject invalid ACK sequences in Fast Open sockets, from Jerry Chu.

3) Lost error code on return from _rtl_usb_receive(), from Christian
    Lamparter.

4) Fix reset resume on USB rt2x00, from Stanislaw Gruszka.

5) Release resources on error in pch_gbe driver, from Veaceslav Falico.

6) Default hop limit not set correctly in ip6_template_metrics[], fix
    from Li RongQing.

7) Gianfar PTP code requests wrong kind of resource during probe, fix
    from Wei Yang.

8) Fix VHOST net driver on big-endian, from Michael S Tsirkin.

9) Mallenox driver bug fixes from Jack Morgenstein, Or Gerlitz, Moni
    Shoua, Dotan Barak, and Uri Habusha.

10) usbnet leaks memory on TX path, fix from Hemant Kumar.

11) Use socket state test, rather than presence of FIN bit packet, to
    determine FIONREAD/SIOCINQ value.  Fix from Eric Dumazet.

12) Fix cxgb4 build failure, from Vipul Pandya.

13) Provide a SYN_DATA_ACKED state to complement SYN_FASTOPEN in socket
    info dumps.  From Yuchung Cheng.

14) Fix leak of security path in kfree_skb_partial().  Fix from Eric
    Dumazet.

15) Handle RX FIFO overflows more resiliently in pch_gbe driver, from
    Veaceslav Falico.

16) Fix MAINTAINERS file pattern for networking drivers, from Jean
    Delvare.

17) Add iPhone5 IDs to IPHETH driver, from Jay Purohit.

18) VLAN device type change restriction is too strict, and should not
    trigger for the automatically generated vlan0 device.  Fix from Jiri
    Pirko.

19) Make PMTU/redirect flushing work properly again in ipv4, from
    Steffen Klassert.

20) Fix memory corruptions by using kfree_rcu() in netlink_release().
    From Eric Dumazet.

21) More qmi_wwan device IDs, from Bjørn Mork.

22) Fix unintentional change of SNAT/DNAT hooks in generic NAT
    infrastructure, from Elison Niven.

23) Fix 3.6.x regression in xt_TEE netfilter module, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (57 commits)
  tilegx: fix some issues in the SW TSO support
  qmi_wwan/cdc_ether: move Novatel 551 and E362 to qmi_wwan
  net: usb: Fix memory leak on Tx data path
  net/mlx4_core: Unmap UAR also in the case of error flow
  net/mlx4_en: Don't use vlan tag value as an indication for vlan presence
  net/mlx4_en: Fix double-release-range in tx-rings
  bas_gigaset: fix pre_reset handling
  vhost: fix mergeable bufs on BE hosts
  gianfar_ptp: use iomem, not ioports resource tree in probe
  ipv6: Set default hoplimit as zero.
  NET_VENDOR_TI: make available for am33xx as well
  pch_gbe: fix error handling in pch_gbe_up()
  b43: Fix oops on unload when firmware not found
  mwifiex: clean up scan state on error
  mwifiex: return -EBUSY if specific scan request cannot be honored
  brcmfmac: fix potential NULL dereference
  Revert "ath9k_hw: Updated AR9003 tx gain table for 5GHz"
  ath9k_htc: Add PID/VID for a Ubiquiti WiFiStation
  rt2x00: usb: fix reset resume
  rtlwifi: pass rx setup error code to caller
  ...

Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma

Pull slave-dmaengine fixes from Vinod Koul:
"Three fixes for slave dmanegine.

  Two are for typo omissions in sifr dmaengine driver and the last one
  is for the imx driver fixing a missing unlock"

* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: sirf: fix a typo in moving running dma_desc to active queue
  dmaengine: sirf: fix a typo in dma_prep_interleaved
  dmaengine: imx-dma: fix missing unlock on error in imxdma_xfer_desc()

Merge tag 'pm+acpi-for-3.7-rc3' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael J Wysocki:

- Fix for a memory leak in acpi_bind_one() from Jesper Juhl.

- Fix for an error code path memory leak in pm_genpd_attach_cpuidle()
   from Jonghwan Choi.

- Fix for smp_processor_id() usage in preemptible code in powernow-k8
   from Andreas Herrmann.

- Fix for a suspend-related memory leak in cpufreq stats from Xiaobing
   Tu.

- Freezer fix for failure to clear PF_NOFREEZE along with PF_KTHREAD in
   flush_old_exec() from Oleg Nesterov.

- acpi_processor_notify() fix from Alan Cox.

* tag 'pm+acpi-for-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: missing break
  freezer: exec should clear PF_NOFREEZE along with PF_KTHREAD
  Fix memory leak in cpufreq stats.
  cpufreq / powernow-k8: Remove usage of smp_processor_id() in preemptible code
  PM / Domains: Fix memory leak on error path in pm_genpd_attach_cpuidle
  ACPI: Fix memory leak in acpi_bind_one()

Merge tag 'rdma-for-linus' of git://git./linux/kernel/git/roland/infiniband

Pull infiniband fixes from Roland Dreier:
"Small batch of fixes for 3.7:
   - Fix crash in error path in cxgb4
   - Fix build error on 32 bits in mlx4
   - Fix SR-IOV bugs in mlx4"

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  mlx4_core: Perform correct resource cleanup if mlx4_QUERY_ADAPTER() fails
  mlx4_core: Remove annoying debug messages from SR-IOV flow
  RDMA/cxgb4: Don't free chunk that we have failed to allocate
  IB/mlx4: Synchronize cleanup of MCGs in MCG paravirtualization
  IB/mlx4: Fix QP1 P_Key processing in the Primary Physical Function (PPF)
  IB/mlx4: Fix build error on platforms where UL is not 64 bits

Merge tag 'usb-3.7-rc3' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes from Greg Kroah-Hartman:
"Here are a bunch of USB fixes for the 3.7-rc tree.

  There's a lot of small USB serial driver fixes, and one larger one
  (the mos7840 driver changes are mostly just moving code around to fix
  problems.) Thanks to Johan Hovold for finding the problems and fixing
  them all up.

  Other than those, there is the usual new device ids, xhci bugfixes,
  and gadget driver fixes, nothing out of the ordinary.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'usb-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (49 commits)
  xhci: trivial: Remove assigned but unused ep_ctx.
  xhci: trivial: Remove assigned but unused slot_ctx.
  xhci: Fix missing break in xhci_evaluate_context_result.
  xhci: Fix potential NULL ptr deref in command cancellation.
  ehci: Add yet-another Lucid nohandoff pci quirk
  ehci: fix Lucid nohandoff pci quirk to be more generic with BIOS versions
  USB: mos7840: fix port_probe flow
  USB: mos7840: fix port-data memory leak
  USB: mos7840: remove invalid disconnect handling
  USB: mos7840: remove NULL-urb submission
  USB: qcserial: fix interface-data memory leak in error path
  USB: option: fix interface-data memory leak in error path
  USB: ipw: fix interface-data memory leak in error path
  USB: mos7840: fix port-device leak in error path
  USB: mos7840: fix urb leak at release
  USB: sierra: fix port-data memory leak
  USB: sierra: fix memory leak in probe error path
  USB: sierra: fix memory leak in attach error path
  USB: usb-wwan: fix multiple memory leaks in error paths
  USB: keyspan: fix NULL-pointer dereferences and memory leaks
  ...

Merge tag 'tty-3.7-rc3' of git://git./linux/kernel/git/gregkh/tty

Pull serial fix from Greg Kroah-Hartman:
"Here is one patch, a revert of a omap serial driver patch that was
causing problems, for your 3.7-rc tree.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'tty-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
Revert "serial: omap: fix software flow control"

Merge tag 'staging-3.7-rc2' of git://git./linux/kernel/git/gregkh/staging

Pull staging driver fixes from Greg Kroah-Hartman:
"Here are some staging driver fixes for your 3.7-rc tree.

  Nothing major here, a number of iio driver fixups that were causing
  problems, some comedi driver bugfixes, and a bunch of tidspbridge
  warning squashing and other regressions fixed from the 3.6 release.

  All have been in the linux-next releases for a bit.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'staging-3.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (32 commits)
  staging: tidspbridge: delete unused mmu functions
  staging: tidspbridge: ioremap physical address of the stack segment in shm
  staging: tidspbridge: ioremap dsp sync addr
  staging: tidspbridge: change type to __iomem for per and core addresses
  staging: tidspbridge: drop const from custom mmu implementation
  staging: tidspbridge: request the right irq for mmu
  staging: ipack: add missing include (implicit declaration of function 'kfree')
  staging: ramster: depends on NET
  staging: omapdrm: fix allocation size for page addresses array
  staging: zram: Fix handling of incompressible pages
  Staging: android: binder: Allow using highmem for binder buffers
  Staging: android: binder: Fix memory leak on thread/process exit
  staging: comedi: ni_labpc: fix possible NULL deref during detach
  staging: comedi: das08: fix possible NULL deref during detach
  staging: comedi: amplc_pc263: fix possible NULL deref during detach
  staging: comedi: amplc_pc236: fix possible NULL deref during detach
  staging: comedi: amplc_pc236: fix invalid register access during detach
  staging: comedi: amplc_dio200: fix possible NULL deref during detach
  staging: comedi: 8255_pci: fix possible NULL deref during detach
  staging: comedi: ni_daq_700: fix dio subdevice regression
  ...

Merge tag 'driver-core-3.7-rc3' of git://git./linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg Kroah-Hartman:
"Here are a number of firmware core fixes for 3.7, and some other minor
  fixes.  And some documentation updates thrown in for good measure.

  All have been in the linux-next tree for a while.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'driver-core-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  Documentation:Chinese translation of Documentation/arm64/memory.txt
  Documentation:Chinese translation of Documentation/arm64/booting.txt
  Documentation:Chinese translation of Documentation/IRQ.txt
  firmware loader: document kernel direct loading
  sysfs: sysfs_pathname/sysfs_add_one: Use strlcat() instead of strcat()
  dynamic_debug: Remove unnecessary __used
  firmware loader: sync firmware cache by async_synchronize_full_domain
  firmware loader: let direct loading back on 'firmware_buf'
  firmware loader: fix one reqeust_firmware race
  firmware loader: cancel uncache work before caching firmware

Merge tag 'char-misc-3.7-rc3' of git://git./linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg Kroah-Hartman:
"Here are some driver fixes for 3.7.  They include extcon driver fixes,
  a hyper-v bugfix, and two other minor driver fixes.

  All of these have been in the linux-next releases for a while.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'char-misc-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  sonypi: suspend/resume callbacks should be conditionally compiled on CONFIG_PM_SLEEP
  Drivers: hv: Cleanup error handling in vmbus_open()
  extcon : register for cable interest by cable name
  extcon: trivial: kfree missed from remove path
  extcon: driver model release call not needed
  extcon: MAX77693: Add platform data for MUIC device to initialize registers
  extcon: max77693: Use max77693_update_reg for rmw operations
  extcon: Fix kerneldoc for extcon_set_cable_state and extcon_set_cable_state_
  extcon: adc-jack: Add missing MODULE_LICENSE
  extcon: adc-jack: Fix checking return value of request_any_context_irq
  extcon: Fix return value in extcon_register_interest()
  extcon: unregister compat link on cleanup
  extcon: Unregister compat class at module unload to fix oops
  extcon: optimising the check_mutually_exclusive function
  extcon: standard cable names definition and declaration changed
  extcon-max8997: remove usage of ret in max8997_muic_handle_charger_type_detach
  extcon: Remove duplicate inclusion of extcon.h header file

VFS: don't do protected {sym,hard}links by default

In commit 800179c9b8a1 ("This adds symlink and hardlink restrictions to
the Linux VFS"), the new link protections were enabled by default, in
the hope that no actual application would care, despite it being
technically against legacy UNIX (and documented POSIX) behavior.

However, it does turn out to break some applications. It's rare, and
it's unfortunate, but it's unacceptable to break existing systems, so
we'll have to default to legacy behavior.

In particular, it has broken the way AFD distributes files, see

http://www.dwd.de/AFD/

along with some legacy scripts.

Distributions can end up setting this at initrd time or in system
scripts: if you have security problems due to link attacks during your
early boot sequence, you have bigger problems than some kernel sysctl
setting. Do:

echo 1 > /proc/sys/fs/protected_symlinks
echo 1 > /proc/sys/fs/protected_hardlinks

to re-enable the link protections.

Alternatively, we may at some point introduce a kernel config option
that sets these kinds of "more secure but not traditional" behavioural
options automatically.

Reported-by: Nick Bowler <nbowler@elliptictech.com>
Reported-by: Holger Kiehl <Holger.Kiehl@dwd.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # v3.6
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'sound-3.7' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"Slightly a high amount of commits come from Adrian Knoth's HDSPM
  driver fixes.  Other than that, all small trival fixes or quirks that
  are pretty driver-specific."

* tag 'sound-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: wm8994: Only enable extra BCLK cycles when required
  ALSA: als3000: check for the kzalloc return value
  ALSA: sound/isa/opti9xx/miro.c: eliminate possible double free
  ALSA: hda - Fix silent headphone output from Toshiba P200
  ALSA: hdspm - Fix coding style in CTL_ELEM macros
  ALSA: hdspm - Fix typo in kcontrol element on RME MADI cards
  ALSA: hdspm - Fix sync_in detection on AES/AES32
  ALSA: hdspm - Fix sync_in reporting on RME MADI cards
  ALSA: hdspm - Also report autosync_sample_rate on MADI and MADIface
  ALSA: hdspm - Fix reported autosync_sample_rate
  ALSA: hdspm - Fix sync check reporting on all RME HDSPM cards
  ALSA: hdspm - Report external rate in slave mode on PCI MADI
  ALSA: hdspm - Allow DDS/Varispeed to be set from userspace
  ALSA: hda - add dock support for Thinkpad T430
  ASoC: ux500_msp_i2s: Fix devm_* and return code merge error
  ASoC: Ux500: Dispose of device nodes correctly

Merge branch 'fixes_for_linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping

Pull DMA-mapping revert from Marek Szyprowski:
"Due to my mistake, my previous pull request (merged as commit
  cff7b8ba60e3: "Merge branch 'fixes_for_linus' ..") contained a patch
  which is aimed for v3.8 and lacks its dependences.  This pull request
  reverts it and fixes build break of ARM architecture."

* 'fixes_for_linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  Revert "ARM: dma-mapping: support debug_dma_mapping_error"

Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
"This fixes a couple of nasty page table initialization bugs which were
  causing kdump regressions.  A clean rearchitecturing of the code is in
  the works - meanwhile these are reverts that restore the
  best-known-working state of the kernel.

  There's also EFI fixes and other small fixes."

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, mm: Undo incorrect revert in arch/x86/mm/init.c
  x86: efi: Turn off efi_enabled after setup on mixed fw/kernel
  x86, mm: Find_early_table_space based on ranges that are actually being mapped
  x86, mm: Use memblock memory loop instead of e820_RAM
  x86, mm: Trim memory in memblock to be page aligned
  x86/irq/ioapic: Check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt
  x86/efi: Fix oops caused by incorrect set_memory_uc() usage
  x86-64: Fix page table accounting
  Revert "x86/mm: Fix the size calculation of mapping tables"
  MAINTAINERS: Add EFI git repository location

Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"Most of the kernel diffstat relates to a group of Intel P6 and KNC
  (Xeon-Phi Knights Corner) PMU driver fixes, neither of which is in
  heavy use, so we took the fixes.

  The rest is diverse smallish fixes to the tooling and kernel side."

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Remove unused variable in nhmex_rbox_alter_er()
  perf/x86: Enable overflow on Intel KNC with a custom knc_pmu_handle_irq()
  perf/x86: Remove cpuc->enable check on Intl KNC event enable/disable
  perf/x86: Make Intel KNC use full 40-bit width of counters
  perf/x86/uncore: Handle pci_read_config_dword() errors
  perf/x86: Remove P6 cpuc->enabled check
  perf/x86: Update/fix generic events on P6 PMU
  perf/x86: Fix P6 FP_ASSIST event constraint
  perf, cpu hotplug: Use cached value of smp_processor_id()
  perf, cpu hotplug: Run CPU_STARTING notifiers with irqs disabled
  x86/perf: Fix virtualization sanity check
  perf test: Fix exclude_guest parse events tests
  perf tools: do not flush maps on COMM for perf report
  perf help: Fix --help for builtins
  perf trace: Check if sample raw_data field is set
  perf trace: Validate syscall id before growing syscall table

Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs

Pull btrfs fixes from Chris Mason:
"This has our series of fixes for the next rc.  The biggest batch is
  from Jan Schmidt, fixing up some problems in our subvolume quota code
  and fixing btrfs send/receive to work with the new extended inode
  refs."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: do not bug when we fail to commit the transaction
  Btrfs: fix memory leak when cloning root's node
  Btrfs: Use btrfs_update_inode_fallback when creating a snapshot
  Btrfs: Send: preserve ownership (uid and gid) also for symlinks.
  Btrfs: fix deadlock caused by the nested chunk allocation
  btrfs: Return EINVAL when length to trim is less than FSB
  Btrfs: fix memory leak in btrfs_quota_enable()
  Btrfs: send correct rdev and mode in btrfs-send
  Btrfs: extended inode refs support for send mechanism
  Btrfs: Fix wrong error handling code
  Fix a sign bug causing invalid memory access in the ino_paths ioctl.
  Btrfs: comment for loop in tree_mod_log_insert_move
  Btrfs: fix extent buffer reference for tree mod log roots
  Btrfs: determine level of old roots
  Btrfs: tree mod log's old roots could still be part of the tree
  Btrfs: fix a tree mod logging issue for root replacement operations
  Btrfs: don't put removals from push_node_left into tree mod log twice

Merge branch 'master' of git://git./linux/kernel/git/linville/wireless into for-davem

Merge branch 'fixes' of git://git./linux/kernel/git/horms/renesas into fixes

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: r8a7779: I/O address abuse cleanup

Merge branch 'v3.7-samsung-fixes-2' of git://git./linux/kernel/git/kgene/linux-samsung into fixes

From Kukjin Kim <kgene.kim@samsung.com>:

One is spi stuff for fix the device names for the different subtypes of
the spi controller. And the other is adding missing .smp field for
exynos4-dt and fixing memory sections for exynos4210-trats board.

* 'v3.7-samsung-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  ARM: EXYNOS: Set .smp field of machine descriptor for exynos4-dt
  ARM: dts: Split memory into 4 sections for exynos4210-trats
  ARM: SAMSUNG: Add naming of s3c64xx-spi devices

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Merge tag 'efi-for-3.7' of git://git./linux/kernel/git/mfleming/efi into x86/urgent

Pull EFI fixes from Matt Fleming:

"Fix oops with EFI variables on mixed 32/64-bit firmware/kernels and
document EFI git repository location on kernel.org."

Conflicts:
arch/x86/include/asm/efi.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>

tilegx: fix some issues in the SW TSO support

This change correctly computes the header length and data length in
the fragments to avoid a bug where we would end up with extremely
slow performance. Also adopt use of skb_frag_size() accessor.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: stable@vger.kernel.org [v3.6]
Signed-off-by: David S. Miller <davem@davemloft.net>

qmi_wwan/cdc_ether: move Novatel 551 and E362 to qmi_wwan

These devices provide QMI and ethernet functionality via a standard CDC
ethernet descriptor. But when driven by cdc_ether, the QMI
functionality is unavailable because only cdc_ether can claim the USB
interface. Thus blacklist the devices in cdc_ether and add their IDs to
qmi_wwan, which enables both QMI and ethernet simultaneously.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Cc: stable@vger.kernel.org
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: usb: Fix memory leak on Tx data path

Driver anchors the tx urbs and defers the urb submission if
a transmit request comes when the interface is suspended.
Anchoring urb increments the urb reference count. These
deferred urbs are later accessed by calling usb_get_from_anchor()
for submission during interface resume. usb_get_from_anchor()
unanchors the urb but urb reference count remains same.
This causes the urb reference count to remain non-zero
after usb_free_urb() gets called and urb never gets freed.
Hence call usb_put_urb() after anchoring the urb to properly
balance the reference count for these deferred urbs. Also,
unanchor these deferred urbs during disconnect, to free them
up.

Signed-off-by: Hemant Kumar <hemantk@codeaurora.org>
Acked-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Unmap UAR also in the case of error flow

If a failure takes place during the EQ creation, we need to unmap the
UAR memory block too.

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Uri Habusha <urih@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Don't use vlan tag value as an indication for vlan presence

The vlan tag can be zero. This is why it can't serve as an indication
that packet requires VLAN header in the TX flow.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Fix double-release-range in tx-rings

The QP range is reserved as a single block. However, when freeing the
en resources, the tx-ring QPs are released both in mlx4_en_destroy_tx_ring
(one at a time) and in mlx4_en_free_resources (as a block release).

Fix by eliminating the one-at-a-time release in mlx4_en_destroy_tx_ring.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bas_gigaset: fix pre_reset handling

The delayed work function int_in_work() may call usb_reset_device()
and thus, indirectly, the driver's pre_reset method. Trying to
cancel the work synchronously in that situation would deadlock.
Fix by avoiding cancel_work_sync() in the pre_reset method.

If the reset was NOT initiated by int_in_work() this might cause
int_in_work() to run after the post_reset method, with urb_int_in
already resubmitted, so handle that case gracefully.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "ARM: dma-mapping: support debug_dma_mapping_error"

This reverts commit 871ae57adc5ed092c1341f411514d0e8482e2611, which is
scheduled for v3.8 and accidently got into v3.7-rc series.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>

ktest: Fix ktest confusion with CONFIG_MODULES_USE_ELF_RELA

In order to decide if ktest should bother installing modules on the
target box, it checks if the config file has CONFIG_MODULES=y. But it
also checks if the '=y' part exists. It only will install modules if the
config exists and is set with '=y'. But as the regex that was used
tests:

/^CONFIG_MODULES(=y)?/

this will also match:

CONFIG_MODULES_USE_ELF_RELA

as the '=y' part was optional and it did not test the rest of the line.
When this happens, ktest will stop checking the rest of the configs but
it will also think that no modules are needed to be installed. What it
should do is only jump out of the loop if it actually found a
CONFIG_MODULES that is set to true.

Otherwise, ktest wont install the necessary modules needed for proper
booting of the test target.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm radeon fixes from Dave Airlie:
"Just radeon fixes in this one:
   - some new PCI IDs
   - ATPX regression fix
   - async VM regression fixes
   - some module options fixes"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon: fix ATPX regression in acpi rework
  drm/radeon: fix ATPX function documentation
  drm/radeon: move the retry to gem_object_create
  drm/radeon: move size limits to gem_object_create.
  drm/radeon: use vzalloc for gart pages
  drm/radeon: fix and simplify pot argument checks v3
  drm/radeon: fix header size estimation in VM code
  drm/radeon: remove set_page check from VM code
  drm/radeon: fix si_set_page v2
  drm/radeon: fix cayman_vm_set_page v2
  drm/radeon: fix PFP sync in vm_flush
  drm/radeon: add error output if VM CS fails on cayman
  drm/radeon: give each backlight a unique id
  drm/radeon: fix sparse warning
  drm/radeon: add some new SI PCI ids

Merge tag 'nfs-for-3.7-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS bugfixes from Trond Myklebust:

- Fix the NFSv2/v3 kernel statd protocol, which broke due to net
   namespace related changes.

- Fix a number of races in the SUNRPC TCP disconnect/reconnect code.

* tag 'nfs-for-3.7-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  LOCKD: Clear ln->nsm_clnt only when ln->nsm_users is zero
  LOCKD: fix races in nsm_client_get
  SUNRPC: Get rid of the xs_error_report socket callback
  SUNRPC: Prevent races in xs_abort_connection()
  Revert "SUNRPC: Ensure we close the socket on EPIPE errors too..."
  SUNRPC: Clear the connect flag when socket state is TCP_CLOSE_WAIT

Merge branch 'drm-fixes-3.7' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

Alex writes:
"Fixes pull request for radeon.  The main things here are
fixing a ATPX regression from the acpi rework, fixing some
fallout from the async VM work, and fixing some module options
that were broken in certain cases.  Other than that, mainly
just bug fixes."

* 'drm-fixes-3.7' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: fix ATPX regression in acpi rework
  drm/radeon: fix ATPX function documentation
  drm/radeon: move the retry to gem_object_create
  drm/radeon: move size limits to gem_object_create.
  drm/radeon: use vzalloc for gart pages
  drm/radeon: fix and simplify pot argument checks v3
  drm/radeon: fix header size estimation in VM code
  drm/radeon: remove set_page check from VM code
  drm/radeon: fix si_set_page v2
  drm/radeon: fix cayman_vm_set_page v2
  drm/radeon: fix PFP sync in vm_flush
  drm/radeon: add error output if VM CS fails on cayman
  drm/radeon: give each backlight a unique id
  drm/radeon: fix sparse warning
  drm/radeon: add some new SI PCI ids

Merge branch 'akpm' (Andrew's fixes)

Merge misc fixes from Andrew Morton:
"18 total.  15 fixes and some updates to a device_cgroup patchset which
  bring it up to date with the version which I should have merged in the
  first place."

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (18 patches)
  fs/compat_ioctl.c: VIDEO_SET_SPU_PALETTE missing error check
  gen_init_cpio: avoid stack overflow when expanding
  drivers/rtc/rtc-imxdi.c: add missing spin lock initialization
  mm, numa: avoid setting zone_reclaim_mode unless a node is sufficiently distant
  pidns: limit the nesting depth of pid namespaces
  drivers/dma/dw_dmac: make driver's endianness configurable
  mm/mmu_notifier: allocate mmu_notifier in advance
  tools/testing/selftests/epoll/test_epoll.c: fix build
  UAPI: fix tools/vm/page-types.c
  mm/page_alloc.c:alloc_contig_range(): return early for err path
  rbtree: include linux/compiler.h for definition of __always_inline
  genalloc: stop crashing the system when destroying a pool
  backlight: ili9320: add missing SPI dependency
  device_cgroup: add proper checking when changing default behavior
  device_cgroup: stop using simple_strtoul()
  device_cgroup: rename deny_all to behavior
  cgroup: fix invalid rcu dereference
  mm: fix XFS oops due to dirty pages without buffers on s390

ACPI: missing break

We handle NOTIFY_THROTTLING so don't then fall through to unsupported event.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Input: wacom - add touch sensor support for Cintiq 24HD touch

Decode multitouch reports from the touch sensor of the Cintiq 24HD
touch.

Signed-off-by: Jason Gerecke <killertofu@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Input: wacom - handle split-sensor devices with internal hubs

Like our other pen-and-touch products, the Cintiq 24HD touch needs data
to be shared between its two sensors to facilitate proximity-based palm
rejection.

Unlike other tablets that report sensor data through separate interfaces
of the same USB device, the Cintiq 24HD touch has separate USB devices
that are connected to an internal USB hub.

This patch makes it possible to designate the USB VID/PID of the other
device so that the two may share data. To ensure we don't accidentally
link to a sensor from a physically separate device (if several have been
plugged in), we limit the search to siblings (i.e., devices directly
connected to the same hub).

Signed-off-by: Jason Gerecke <killertofu@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Makefile: Documentation for external tool should be correct

If one includes documentation for an external tool, it should be
correct.  This is not:

1. Overriding the input to rngd should typically be neither
   necessary nor desired.  This is especially so since newer
   versions of rngd support a number of different *types* of sources.
2. The default kernel-exported device is called /dev/hwrng not
   /dev/hwrandom nor /dev/hw_random (both of which were used in the
   past; however, kernel and udev seem to have converged on
   /dev/hwrng.)

Overall it is better if the documentation for rngd is kept with rngd
rather than in a kernel Makefile.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm

Pull ARM fixes from Russell King:
"A random collection of various fixes, mainly from Arnd and a few other
  people.  Not thing really stands out here."

* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
  ARM: drop experimental status for hotplug and Thumb2
  ARM: 7560/1: SMP_TWD: use DIV_ROUND_CLOSEST() for periodic mode
  ARM: 7559/1: smp: switch away from the idmap before updating init_mm.mm_count
  ARM: 7556/1: perf: fix updated event period in response to PERF_EVENT_IOC_PERIOD
  ARM: 7555/1: kexec: fix segment memory addresses check
  ARM: warnings in arch/arm/include/asm/uaccess.h
  ARM: binfmt_flat: unused variable 'persistent'
  ARM: be really quiet when building with 'make -s'
  ARM: pass -marm to gcc by default for both C and assembler
  ARM: Xen: fix initial build problems
  ARM: export default read_current_timer
  ARM: Fix another build warning in arch/arm/mm/alignment.c
  ARM: export set_irq_flags
  ARM: kprobes: make more tests conditional

Merge branch 'fixes_for_linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping

Pull CMA and DMA-mapping fixes from Marek Szyprowski:
"This consists mainly of a set of one-liner fixes and cleanups for a
  few minor issues identified in both Contiguous Memory Allocator code
  and ARM DMA-mapping subsystem."

* 'fixes_for_linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  ARM: mm: Remove unused arm_vmregion priv field
  ARM: dma-mapping: fix build warning in __dma_alloc()
  ARM: dma-mapping: support debug_dma_mapping_error
  mm: cma: alloc_contig_range: return early for err path
  drivers: cma: Fix wrong CMA selected region size default value
  drivers: dma-coherent: Fix typo in dma_mmap_from_coherent documentation
  drivers: dma-contiguous: Don't redefine SZ_1M

x86, mm: Undo incorrect revert in arch/x86/mm/init.c

Commit

844ab6f9 x86, mm: Find_early_table_space based on ranges that are actually being mapped

added back some lines back wrongly that has been removed in commit

7b16bbf97 Revert "x86/mm: Fix the size calculation of mapping tables"

remove them again.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/CAE9FiQW_vuaYQbmagVnxT2DGsYc=9tNeAbdBq53sYkitPOwxSQ@mail.gmail.com
Acked-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

fs/compat_ioctl.c: VIDEO_SET_SPU_PALETTE missing error check

The compat ioctl for VIDEO_SET_SPU_PALETTE was missing an error check
while converting ioctl arguments. This could lead to leaking kernel
stack contents into userspace.

Patch extracted from existing fix in grsecurity.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: David Miller <davem@davemloft.net>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: PaX Team <pageexec@freemail.hu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gen_init_cpio: avoid stack overflow when expanding

Fix possible overflow of the buffer used for expanding environment
variables when building file list.

In the extremely unlikely case of an attacker having control over the
environment variables visible to gen_init_cpio, control over the
contents of the file gen_init_cpio parses, and gen_init_cpio was built
without compiler hardening, the attacker can gain arbitrary execution
control via a stack buffer overflow.

  $ cat usr/crash.list
  file foo ${BIG}${BIG}${BIG}${BIG}${BIG}${BIG} 0755 0 0
  $ BIG=$(perl -e 'print "A" x 4096;') ./usr/gen_init_cpio usr/crash.list
  *** buffer overflow detected ***: ./usr/gen_init_cpio terminated

This also replaces the space-indenting with tabs.

Patch based on existing fix extracted from grsecurity.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: PaX Team <pageexec@freemail.hu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/rtc/rtc-imxdi.c: add missing spin lock initialization

Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Roland Stigge <stigge@antcom.de>
Cc: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Roland Stigge <stigge@antcom.de>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm, numa: avoid setting zone_reclaim_mode unless a node is sufficiently distant

Commit 957f822a0ab9 ("mm, numa: reclaim from all nodes within reclaim
distance") caused zone_reclaim_mode to be set for all systems where two
nodes are within RECLAIM_DISTANCE of each other. This is the opposite
of what we actually want: zone_reclaim_mode should be set if two nodes
are sufficiently distant.

Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Julian Wollrath <jwollrath@web.de>
Tested-by: Julian Wollrath <jwollrath@web.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Patrik Kullman <patrik.kullman@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

pidns: limit the nesting depth of pid namespaces

'struct pid' is a "variable sized struct" - a header with an array of
upids at the end.

The size of the array depends on a level (depth) of pid namespaces.  Now a
level of pidns is not limited, so 'struct pid' can be more than one page.

Looks reasonable, that it should be less than a page.  MAX_PIS_NS_LEVEL is
not calculated from PAGE_SIZE, because in this case it depends on
architectures, config options and it will be reduced, if someone adds a
new fields in struct pid or struct upid.

I suggest to set MAX_PIS_NS_LEVEL = 32, because it saves ability to expand
"struct pid" and it's more than enough for all known for me use-cases.
When someone finds a reasonable use case, we can add a config option or a
sysctl parameter.

In addition it will reduce the effect of another problem, when we have
many nested namespaces and the oldest one starts dying.
zap_pid_ns_processe will be called for each namespace and find_vpid will
be called for each process in a namespace.  find_vpid will be called
minimum max_level^2 / 2 times.  The reason of that is that when we found a
bit in pidmap, we can't determine this pidns is top for this process or it
isn't.

vpid is a heavy operation, so a fork bomb, which create many nested
namespace, can make a system inaccessible for a long time.  For example my
system becomes inaccessible for a few minutes with 4000 processes.

[akpm@linux-foundation.org: return -EINVAL in response to excessive nesting, not -ENOMEM]
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/dma/dw_dmac: make driver's endianness configurable

The dw_dmac driver was originally developed for avr32 to be used with the
Synopsys DesignWare AHB DMA controller. Starting from 2.6.38, access to
the device's i/o memory was done with the little-endian readl/writel
functions(1)

This broke the driver for the avr32 platform, because it needs big
(native) endian accessors. This patch makes the endianness configurable
using 'DW_DMAC_BIG_ENDIAN_IO', which will default be true for AVR32

I submitted this patch before(2) but then waited for Andy to finish other
changes to the same module(3).

(1) https://patchwork.kernel.org/patch/608211
(2) https://lkml.org/lkml/2012/8/26/148
(3) https://lkml.org/lkml/2012/9/21/173

Signed-off-by: Hein Tibosch <hein_tibosch@yahoo.es>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: Havard Skinnemoen <havard@skinnemoen.net>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/mmu_notifier: allocate mmu_notifier in advance

While allocating mmu_notifier with parameter GFP_KERNEL, swap would start
to work in case of tight available memory.  Eventually, that would lead to
a deadlock while the swap deamon swaps anonymous pages.  It was caused by
commit e0f3c3f78da29b ("mm/mmu_notifier: init notifier if necessary").

  =================================
  [ INFO: inconsistent lock state ]
  3.7.0-rc1+ #518 Not tainted
  ---------------------------------
  inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
  kswapd0/35 [HC0[0]:SC0[0]:HE1:SE1] takes:
   (&mapping->i_mmap_mutex){+.+.?.}, at: page_referenced+0x9c/0x2e0
  {RECLAIM_FS-ON-W} state was registered at:
     mark_held_locks+0x86/0x150
     lockdep_trace_alloc+0x67/0xc0
     kmem_cache_alloc_trace+0x33/0x230
     do_mmu_notifier_register+0x87/0x180
     mmu_notifier_register+0x13/0x20
     kvm_dev_ioctl+0x428/0x510
     do_vfs_ioctl+0x98/0x570
     sys_ioctl+0x91/0xb0
     system_call_fastpath+0x16/0x1b
  irq event stamp: 825
  hardirqs last  enabled at (825): _raw_spin_unlock_irq+0x30/0x60
  hardirqs last disabled at (824): _raw_spin_lock_irq+0x19/0x80
  softirqs last  enabled at (0): copy_process+0x630/0x17c0
  softirqs last disabled at (0): (null)
  ...

Simply back out the above commit, which was a small performance
optimization.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Reported-by: Andrea Righi <andrea@betterlinux.com>
Tested-by: Andrea Righi <andrea@betterlinux.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Sagi Grimberg <sagig@mellanox.co.il>
Cc: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

tools/testing/selftests/epoll/test_epoll.c: fix build

Latest Linus head run of "make selftests" in the tools directory failed
with references to undefined variables. Reference was to
'write_thread_data' which is the name of a struct that is being used, not
the variable itself. Change reference so it points to the variable.

Signed-off-by: Daniel Hazelton <dshadowwolf@gmail.com>
Cc: "Paton J. Lewis" <palewis@adobe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

UAPI: fix tools/vm/page-types.c

Fix tools/vm/page-types.c to use the UAPI variant of linux/kernel-page-flags.h
lest the following error appear:

  In file included from page-types.c:38:0:
    ../../include/linux/kernel-page-flags.h:4:42: fatal error:
    uapi/linux/kernel-page-flags.h: No such file or directory

Reported-by: Daniel Hazelton <dshadowwolf@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Daniel Hazelton <dshadowwolf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/page_alloc.c:alloc_contig_range(): return early for err path

If start_isolate_page_range() failed, unset_migratetype_isolate() has been
done inside it.

Signed-off-by: Bob Liu <lliubbo@gmail.com>
Cc: Ni zhan Chen <nizhan.chen@gmail.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rbtree: include linux/compiler.h for definition of __always_inline

rb_erase_augmented() is a static function annotated with
__always_inline.  This causes a compile failure when attempting to use
the rbtree implementation as a library (e.g.  kvm tool):

  rbtree_augmented.h:125:24: error: expected `=', `,', `;', `asm' or `__attribute__' before `void'

Include linux/compiler.h in rbtree_augmented.h so that the __always_inline
macro is resolved correctly.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>