Heiko Carstens [Wed, 2 Jul 2014 22:22:37 +0000 (15:22 -0700)]
fs/seq_file: fallback to vmalloc allocation
There are a couple of seq_files which use the single_open() interface.
This interface requires that the whole output must fit into a single
buffer.
E.g. for /proc/stat allocation failures have been observed because an
order-4 memory allocation failed due to memory fragmentation. In such
situations reading /proc/stat is not possible anymore.
Therefore change the seq_file code to fallback to vmalloc allocations
which will usually result in a couple of order-0 allocations and hence
also work if memory is fragmented.
For reference a call trace where reading from /proc/stat failed:
sadc: page allocation failure: order:4, mode:0x1040d0
CPU: 1 PID: 192063 Comm: sadc Not tainted 3.10.0-123.el7.s390x #1
[...]
Call Trace:
show_stack+0x6c/0xe8
warn_alloc_failed+0xd6/0x138
__alloc_pages_nodemask+0x9da/0xb68
__get_free_pages+0x2e/0x58
kmalloc_order_trace+0x44/0xc0
stat_open+0x5a/0xd8
proc_reg_open+0x8a/0x140
do_dentry_open+0x1bc/0x2c8
finish_open+0x46/0x60
do_last+0x382/0x10d0
path_openat+0xc8/0x4f8
do_filp_open+0x46/0xa8
do_sys_open+0x114/0x1f0
sysc_tracego+0x14/0x1a
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: David Rientjes <rientjes@google.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Thorsten Diehl <thorsten.diehl@de.ibm.com>
Cc: Andrea Righi <andrea@betterlinux.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Stefan Bader <stefan.bader@canonical.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:
fs/seq_file.c
Change-Id: I009080dd017b020ffd5e812e5b472bdb8349217a
Al Viro [Tue, 6 May 2014 18:02:53 +0000 (14:02 -0400)]
nick kvfree() from apparmor
too many places open-code it
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Conflicts:
mm/util.c
security/apparmor/include/apparmor.h
Change-Id: Ie8602e0199282dc462921cb7217158d1998853b0
Al Viro [Mon, 6 May 2013 02:10:35 +0000 (03:10 +0100)]
apparmor: no need to delay vfree()
vfree() can be called from interrupt contexts now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Leo Yan [Mon, 1 Sep 2014 03:09:51 +0000 (11:09 +0800)]
arm64: fix bug for reloading FPSIMD state after cpu power off
Now arm64 defers reloading FPSIMD state, but this optimization also
introduces the bug after cpu resume back from low power mode.
The reason is after the cpu has been powered off, s/w need set the
cpu's fpsimd_last_state to NULL so that it will force to reload
FPSIMD state for the thread, otherwise there has the chance to meet
the condition for both the task's fpsimd_state.cpu field contains the
id of the current cpu, and the cpu's fpsimd_last_state per-cpu variable
points to the task's fpsimd_state, so finally kernel will skip to reload
the context during it return back to userland.
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Leo Yan <leoy@marvell.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Catalin Marinas [Fri, 29 Aug 2014 15:08:02 +0000 (16:08 +0100)]
arm64: Add brackets around user_stack_pointer()
Commit
5f888a1d33 (ARM64: perf: support dwarf unwinding in compat mode)
changes user_stack_pointer() to return the compat SP for 32-bit tasks
but without brackets around the whole definition, with possible issues
on the call sites (noticed with a subsequent fix for KSTK_ESP).
Fixes: 5f888a1d33c4 (ARM64: perf: support dwarf unwinding in compat mode)
Reported-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Fri, 29 Aug 2014 15:11:10 +0000 (16:11 +0100)]
arm64: report correct stack pointer in KSTK_ESP for compat tasks
The KSTK_ESP macro is used to determine the user stack pointer for a
given task. In particular, this is used to to report the '[stack]' VMA
in /proc/self/maps, which is used by Android to determine the stack
location for children of the main thread.
This patch fixes the macro to use user_stack_pointer instead of directly
returning sp. This means that we report w13 instead of sp, since the
former is used as the stack pointer when executing in AArch32 state.
Cc: <stable@vger.kernel.org>
Reported-by: Serban Constantinescu <Serban.Constantinescu@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Catalin Marinas [Thu, 10 Jul 2014 10:37:40 +0000 (11:37 +0100)]
arm64: Cast KSTK_(EIP|ESP) to unsigned long
This is for similarity with thread_saved_(pc|sp) and to avoid some
compiler warnings in the audit code.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Jean Pihet [Mon, 3 Feb 2014 18:18:29 +0000 (19:18 +0100)]
ARM64: perf: support dwarf unwinding in compat mode
Add support for unwinding using the dwarf information in compat
mode. Using the correct user stack pointer allows perf to record
the frames correctly in the native and compat modes.
Note that although the dwarf frame unwinding works ok using
libunwind in native mode (on ARMv7 & ARMv8), some changes are
required to the libunwind code for the compat mode. Those changes
are posted separately on the libunwind mailing list.
Tested on ARMv8 platform with v8 and compat v7 binaries, the latter
are statically built.
Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Greg Hackmann [Thu, 28 Aug 2014 21:00:10 +0000 (14:00 -0700)]
arm64: check for upper PAGE_SHIFT bits in pfn_valid()
pfn_valid() returns a false positive when the lower (64 - PAGE_SHIFT)
bits match a valid pfn but some of the upper bits are set. This caused
a kernel panic in kpageflags_read() when a userspace utility parsed
/proc/*/pagemap, neglected to discard the upper flag bits, and tried to
lseek()+read() from the corresponding offset in /proc/kpageflags.
A valid pfn will never have the upper PAGE_SHIFT bits set, so simply
check for this before passing the pfn to memblock_is_memory().
Change-Id: Ief5d8cd4dd93cbecd545a634a8d5885865cb5970
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Ard Biesheuvel [Tue, 24 Sep 2013 07:28:03 +0000 (09:28 +0200)]
arm64: pull in <asm/simd.h> from asm-generic
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Ard Biesheuvel [Tue, 4 Mar 2014 01:10:04 +0000 (01:10 +0000)]
arm64: enable generic CPU feature modalias matching for this architecture
This enables support for the generic CPU feature modalias implementation that
wires up optional CPU features to udev based module autoprobing.
A file <asm/cpufeature.h> is provided that maps CPU feature numbers to
elf_hwcap bits, which is the standard way on arm64 to advertise optional CPU
features both internally and to user space.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[catalin.marinas@arm.com: removed unnecessary "!!"]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Conflicts:
arch/arm64/Kconfig
Change-Id: Ief16b3197cd0564d8cf8aa82e9614bcda6399fe5
Ard Biesheuvel [Tue, 4 Mar 2014 05:28:39 +0000 (13:28 +0800)]
crypto: allow blkcipher walks over AEAD data
This adds the function blkcipher_aead_walk_virt_block, which allows the caller
to use the blkcipher walk API to handle the input and output scatterlists.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ard Biesheuvel [Tue, 4 Mar 2014 05:28:38 +0000 (13:28 +0800)]
crypto: remove direct blkcipher_walk dependency on transform
In order to allow other uses of the blkcipher walk API than the blkcipher
algos themselves, this patch copies some of the transform data members to the
walk struct so the transform is only accessed at walk init time.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ard Biesheuvel [Mon, 24 Feb 2014 14:26:29 +0000 (15:26 +0100)]
arm64: add support for kernel mode NEON in interrupt context
This patch modifies kernel_neon_begin() and kernel_neon_end(), so
they may be called from any context. To address the case where only
a couple of registers are needed, kernel_neon_begin_partial(u32) is
introduced which takes as a parameter the number of bottom 'n' NEON
q-registers required. To mark the end of such a partial section, the
regular kernel_neon_end() should be used.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Conflicts:
arch/arm64/include/asm/neon.h
Change-Id: Ifc7c6aa77e2ab8dd98bb9975cccab54e09693ab7
Ard Biesheuvel [Thu, 8 May 2014 09:20:23 +0000 (11:20 +0200)]
arm64: defer reloading a task's FPSIMD state to userland resume
If a task gets scheduled out and back in again and nothing has touched
its FPSIMD state in the mean time, there is really no reason to reload
it from memory. Similarly, repeated calls to kernel_neon_begin() and
kernel_neon_end() will preserve and restore the FPSIMD state every time.
This patch defers the FPSIMD state restore to the last possible moment,
i.e., right before the task returns to userland. If a task does not return to
userland at all (for any reason), the existing FPSIMD state is preserved
and may be reused by the owning task if it gets scheduled in again on the
same CPU.
This patch adds two more functions to abstract away from straight FPSIMD
register file saves and restores:
- fpsimd_restore_current_state -> ensure current's FPSIMD state is loaded
- fpsimd_flush_task_state -> invalidate live copies of a task's FPSIMD state
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Conflicts:
arch/arm64/kernel/fpsimd.c
Change-Id: Ib1c0d8d0afb3c248cd4d060eb35877530dd92fdc
Ard Biesheuvel [Mon, 24 Feb 2014 14:26:27 +0000 (15:26 +0100)]
arm64: add abstractions for FPSIMD state manipulation
There are two tacit assumptions in the FPSIMD handling code that will no longer
hold after the next patch that optimizes away some FPSIMD state restores:
. the FPSIMD registers of this CPU contain the userland FPSIMD state of
task 'current';
. when switching to a task, its FPSIMD state will always be restored from
memory.
This patch adds the following functions to abstract away from straight FPSIMD
register file saves and restores:
- fpsimd_preserve_current_state -> ensure current's FPSIMD state is saved
- fpsimd_update_current_state -> replace current's FPSIMD state
Where necessary, the signal handling and fork code are updated to use the above
wrappers instead of poking into the FPSIMD registers directly.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Conflicts:
arch/arm64/kernel/fpsimd.c
Change-Id: I53ae7082427cb1c5cc32e1f2ddbd4218115601ba
Ard Biesheuvel [Sat, 8 Feb 2014 12:34:09 +0000 (13:34 +0100)]
cpu: add generic support for CPU feature based module autoloading
This patch adds support for advertising optional CPU features over udev
using the modalias, and for declaring compatibility with/dependency upon
such a feature in a module.
The mapping between feature numbers and actual features should be provided
by the architecture in a file called <asm/cpufeature.h> which exports the
following functions/macros:
- cpu_feature(FEAT), a preprocessor macro that maps token FEAT to a
numeric index;
- bool cpu_have_feature(n), returning whether this CPU has support for
feature #n;
- MAX_CPU_FEATURES, an upper bound for 'n' in the previous function.
The feature can then be enabled by setting CONFIG_GENERIC_CPU_AUTOPROBE
for the architecture.
For instance, a module that registers its module init function using
module_cpu_feature_match(FEAT_X, module_init_function)
will be probed automatically when the CPU's support for the 'FEAT_X'
feature is advertised over udev, and will only allow the module to be
loaded by hand if the 'FEAT_X' feature is supported.
Change-Id: Icae8e3ff347235fc72a5b41279f0afdb34fb161a
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kbuild test robot [Tue, 24 Sep 2013 00:21:29 +0000 (08:21 +0800)]
crypto: ablk_helper - Replace memcpy with struct assignment
tree: git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
head:
48e6dc1b2a1ad8186d48968d5018912bdacac744
commit:
a62b01cd6cc1feb5e80d64d6937c291473ed82cb [20/24] crypto: create generic version of ablk_helper
coccinelle warnings: (new ones prefixed by >>)
>> crypto/ablk_helper.c:97:2-8: Replace memcpy with struct assignment
>> crypto/ablk_helper.c:78:2-8: Replace memcpy with struct assignment
Please consider folding the attached diff :-)
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ard Biesheuvel [Fri, 20 Sep 2013 07:55:40 +0000 (09:55 +0200)]
crypto: create generic version of ablk_helper
Create a generic version of ablk_helper so it can be reused
by other architectures.
Acked-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Mikulas Patocka [Fri, 25 Jul 2014 23:40:20 +0000 (19:40 -0400)]
crypto: arm64-aes - fix encryption of unaligned data
cryptsetup fails on arm64 when using kernel encryption via AF_ALG socket.
See https://bugzilla.redhat.com/show_bug.cgi?id=
1122937
The bug is caused by incorrect handling of unaligned data in
arch/arm64/crypto/aes-glue.c. Cryptsetup creates a buffer that is aligned
on 8 bytes, but not on 16 bytes. It opens AF_ALG socket and uses the
socket to encrypt data in the buffer. The arm64 crypto accelerator causes
data corruption or crashes in the scatterwalk_pagedone.
This patch fixes the bug by passing the residue bytes that were not
processed as the last parameter to blkcipher_walk_done.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Andreas Schwab [Thu, 24 Jul 2014 16:03:26 +0000 (17:03 +0100)]
arm64/crypto: fix makefile rule for aes-glue-%.o
This fixes the following build failure when building with CONFIG_MODVERSIONS
enabled:
CC [M] arch/arm64/crypto/aes-glue-ce.o
ld: cannot find arch/arm64/crypto/aes-glue-ce.o: No such file or directory
make[1]: *** [arch/arm64/crypto/aes-ce-blk.o] Error 1
make: *** [arch/arm64/crypto] Error 2
The $(obj)/aes-glue-%.o rule only creates $(obj)/.tmp_aes-glue-ce.o, it
should use if_changed_rule instead of if_changed_dep.
Signed-off-by: Andreas Schwab <schwab@suse.de>
[ardb: mention CONFIG_MODVERSIONS in commit log]
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ard Biesheuvel [Mon, 16 Jun 2014 10:02:16 +0000 (11:02 +0100)]
arm64/crypto: improve performance of GHASH algorithm
This patches modifies the GHASH secure hash implementation to switch to a
faster, polynomial multiplication based reduction instead of one that uses
shifts and rotates.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ard Biesheuvel [Mon, 16 Jun 2014 10:02:15 +0000 (11:02 +0100)]
arm64/crypto: fix data corruption bug in GHASH algorithm
This fixes a bug in the GHASH algorithm resulting in the calculated hash to be
incorrect if the input is presented in chunks whose size is not a multiple of
16 bytes.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Fixes: fdd2389457b2 ("arm64/crypto: GHASH secure hash using ARMv8 Crypto Extensions")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ard Biesheuvel [Fri, 21 Mar 2014 09:19:17 +0000 (10:19 +0100)]
arm64/crypto: AES-ECB/CBC/CTR/XTS using ARMv8 NEON and Crypto Extensions
This adds ARMv8 implementations of AES in ECB, CBC, CTR and XTS modes,
both for ARMv8 with Crypto Extensions and for plain ARMv8 NEON.
The Crypto Extensions version can only run on ARMv8 implementations that
have support for these optional extensions.
The plain NEON version is a table based yet time invariant implementation.
All S-box substitutions are performed in parallel, leveraging the wide range
of ARMv8's tbl/tbx instructions, and the huge NEON register file, which can
comfortably hold the entire S-box and still have room to spare for doing the
actual computations.
The key expansion routines were borrowed from aes_generic.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Ard Biesheuvel [Mon, 10 Feb 2014 10:26:29 +0000 (11:26 +0100)]
arm64/crypto: AES in CCM mode using ARMv8 Crypto Extensions
This patch adds support for the AES-CCM encryption algorithm for CPUs that
have support for the AES part of the ARM v8 Crypto Extensions.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Ard Biesheuvel [Wed, 5 Feb 2014 17:13:38 +0000 (18:13 +0100)]
arm64/crypto: AES using ARMv8 Crypto Extensions
This patch adds support for the AES symmetric encryption algorithm for CPUs
that have support for the AES part of the ARM v8 Crypto Extensions.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Ard Biesheuvel [Wed, 26 Mar 2014 19:53:05 +0000 (20:53 +0100)]
arm64/crypto: GHASH secure hash using ARMv8 Crypto Extensions
This is a port to ARMv8 (Crypto Extensions) of the Intel implementation of the
GHASH Secure Hash (used in the Galois/Counter chaining mode). It relies on the
optional PMULL/PMULL2 instruction (polynomial multiply long, what Intel call
carry-less multiply).
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Ard Biesheuvel [Thu, 20 Mar 2014 14:35:40 +0000 (15:35 +0100)]
arm64/crypto: SHA-224/SHA-256 using ARMv8 Crypto Extensions
This patch adds support for the SHA-224 and SHA-256 Secure Hash Algorithms
for CPUs that have support for the SHA-2 part of the ARM v8 Crypto Extensions.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
JP Abgrall [Thu, 28 Aug 2014 02:07:30 +0000 (19:07 -0700)]
arm64/crypto: SHA-1 using ARMv8 Crypto Extensions
This patch adds support for the SHA-1 Secure Hash Algorithm for CPUs that
have support for the SHA-1 part of the ARM v8 Crypto Extensions.
Change-Id: I29fafd308e17aff6e0d59938c106fae6ad7fe78e
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Conflicts:
arch/arm64/Makefile
JP Abgrall [Thu, 28 Aug 2014 03:30:29 +0000 (20:30 -0700)]
arm: fixup NR_syscalls to accommodate the new seccomp syscall
This belongs in
commit:
83f1ccba87b06575966b65352db565c363af7bcf
https://android-review.googlesource.com/#/c/104520
Change-Id: Id5037cbebac9b86c863da79c3b8729e627e65f8e
Signed-off-by: JP Abgrall <jpa@google.com>
Kees Cook [Thu, 5 Jun 2014 07:23:17 +0000 (00:23 -0700)]
seccomp: implement SECCOMP_FILTER_FLAG_TSYNC
Applying restrictive seccomp filter programs to large or diverse
codebases often requires handling threads which may be started early in
the process lifetime (e.g., by code that is linked in). While it is
possible to apply permissive programs prior to process start up, it is
difficult to further restrict the kernel ABI to those threads after that
point.
This change adds a new seccomp syscall flag to SECCOMP_SET_MODE_FILTER for
synchronizing thread group seccomp filters at filter installation time.
When calling seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC,
filter) an attempt will be made to synchronize all threads in current's
threadgroup to its new seccomp filter program. This is possible iff all
threads are using a filter that is an ancestor to the filter current is
attempting to synchronize to. NULL filters (where the task is running as
SECCOMP_MODE_NONE) are also treated as ancestors allowing threads to be
transitioned into SECCOMP_MODE_FILTER. If prctrl(PR_SET_NO_NEW_PRIVS,
...) has been set on the calling thread, no_new_privs will be set for
all synchronized threads too. On success, 0 is returned. On failure,
the pid of one of the failing threads will be returned and no filters
will have been applied.
The race conditions against another thread are:
- requesting TSYNC (already handled by sighand lock)
- performing a clone (already handled by sighand lock)
- changing its filter (already handled by sighand lock)
- calling exec (handled by cred_guard_mutex)
The clone case is assisted by the fact that new threads will have their
seccomp state duplicated from their parent before appearing on the tasklist.
Holding cred_guard_mutex means that seccomp filters cannot be assigned
while in the middle of another thread's exec (potentially bypassing
no_new_privs or similar). The call to de_thread() may kill threads waiting
for the mutex.
Changes across threads to the filter pointer includes a barrier.
Based on patches by Will Drewry.
Suggested-by: Julien Tinnes <jln@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Kees Cook [Fri, 27 Jun 2014 22:01:35 +0000 (15:01 -0700)]
seccomp: allow mode setting across threads
This changes the mode setting helper to allow threads to change the
seccomp mode from another thread. We must maintain barriers to keep
TIF_SECCOMP synchronized with the rest of the seccomp state.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Conflicts:
kernel/seccomp.c
Change-Id: I091ffa55d8f4e83ff02558a55e2b4dc76ac26905
Kees Cook [Fri, 27 Jun 2014 22:18:48 +0000 (15:18 -0700)]
seccomp: introduce writer locking
Normally, task_struct.seccomp.filter is only ever read or modified by
the task that owns it (current). This property aids in fast access
during system call filtering as read access is lockless.
Updating the pointer from another task, however, opens up race
conditions. To allow cross-thread filter pointer updates, writes to the
seccomp fields are now protected by the sighand spinlock (which is shared
by all threads in the thread group). Read access remains lockless because
pointer updates themselves are atomic. However, writes (or cloning)
often entail additional checking (like maximum instruction counts)
which require locking to perform safely.
In the case of cloning threads, the child is invisible to the system
until it enters the task list. To make sure a child can't be cloned from
a thread and left in a prior state, seccomp duplication is additionally
moved under the sighand lock. Then parent and child are certain have
the same seccomp state when they exit the lock.
Based on patches by Will Drewry and David Drysdale.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Conflicts:
kernel/fork.c
Change-Id: Ie01ece43b610867013f7d0e0a2a7be0b9077630f
Kees Cook [Fri, 27 Jun 2014 22:16:33 +0000 (15:16 -0700)]
seccomp: split filter prep from check and apply
In preparation for adding seccomp locking, move filter creation away
from where it is checked and applied. This will allow for locking where
no memory allocation is happening. The validation, filter attachment,
and seccomp mode setting can all happen under the future locks.
For extreme defensiveness, I've added a BUG_ON check for the calculated
size of the buffer allocation in case BPF_MAXINSN ever changes, which
shouldn't ever happen. The compiler should actually optimize out this
check since the test above it makes it impossible.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Conflicts:
kernel/seccomp.c
Change-Id: I8d89f80a5b4f2826d90474dcea441c41f0af6594
Kees Cook [Tue, 10 Jun 2014 22:45:09 +0000 (15:45 -0700)]
MIPS: add seccomp syscall
Wires up the new seccomp syscall.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Conflicts:
arch/mips/include/uapi/asm/unistd.h
arch/mips/kernel/scall32-o32.S
arch/mips/kernel/scall64-64.S
arch/mips/kernel/scall64-n32.S
arch/mips/kernel/scall64-o32.S
Change-Id: I7031bdbec7c90292aeb7e255c73cb36e6ec43af2
Kees Cook [Tue, 10 Jun 2014 22:40:23 +0000 (15:40 -0700)]
ARM: add seccomp syscall
Wires up the new seccomp syscall.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Conflicts:
arch/arm/include/uapi/asm/unistd.h
arch/arm/kernel/calls.S
Change-Id: Ia519993681f70bd38699a73d8897ae9b44b3f0af
Kees Cook [Wed, 25 Jun 2014 23:08:24 +0000 (16:08 -0700)]
seccomp: add "seccomp" syscall
This adds the new "seccomp" syscall with both an "operation" and "flags"
parameter for future expansion. The third argument is a pointer value,
used with the SECCOMP_SET_MODE_FILTER operation. Currently, flags must
be 0. This is functionally equivalent to prctl(PR_SET_SECCOMP, ...).
In addition to the TSYNC flag later in this patch series, there is a
non-zero chance that this syscall could be used for configuring a fixed
argument area for seccomp-tracer-aware processes to pass syscall arguments
in the future. Hence, the use of "seccomp" not simply "seccomp_add_filter"
for this syscall. Additionally, this syscall uses operation, flags,
and user pointer for arguments because strictly passing arguments via
a user pointer would mean seccomp itself would be unable to trivially
filter the seccomp syscall itself.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Conflicts:
arch/x86/syscalls/syscall_32.tbl
arch/x86/syscalls/syscall_64.tbl
include/uapi/asm-generic/unistd.h
kernel/seccomp.c
Change-Id: Id7a365079829fd9164315dec75d6ee415c29b176
Kees Cook [Wed, 25 Jun 2014 22:55:25 +0000 (15:55 -0700)]
seccomp: split mode setting routines
Separates the two mode setting paths to make things more readable with
fewer #ifdefs within function bodies.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Kees Cook [Wed, 25 Jun 2014 22:38:02 +0000 (15:38 -0700)]
seccomp: extract check/assign mode helpers
To support splitting mode 1 from mode 2, extract the mode checking and
assignment logic into common functions.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Kees Cook [Wed, 21 May 2014 22:02:11 +0000 (15:02 -0700)]
seccomp: create internal mode-setting function
In preparation for having other callers of the seccomp mode setting
logic, split the prctl entry point away from the core logic that performs
seccomp mode setting.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Kees Cook [Fri, 18 Jul 2014 18:28:33 +0000 (11:28 -0700)]
MAINTAINERS: create seccomp entry
Add myself as seccomp maintainer.
Suggested-by: James Morris <jmorris@namei.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Kees Cook [Wed, 16 Apr 2014 17:54:34 +0000 (10:54 -0700)]
seccomp: fix memory leak on filter attach
This sets the correct error code when final filter memory is unavailable,
and frees the raw filter no matter what.
unreferenced object 0xffff8800d6ea4000 (size 512):
comm "sshd", pid 278, jiffies
4294898315 (age 46.653s)
hex dump (first 32 bytes):
21 00 00 00 04 00 00 00 15 00 01 00 3e 00 00 c0 !...........>...
06 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!.......
backtrace:
[<
ffffffff8151414e>] kmemleak_alloc+0x4e/0xb0
[<
ffffffff811a3a40>] __kmalloc+0x280/0x320
[<
ffffffff8110842e>] prctl_set_seccomp+0x11e/0x3b0
[<
ffffffff8107bb6b>] SyS_prctl+0x3bb/0x4a0
[<
ffffffff8152ef2d>] system_call_fastpath+0x1a/0x1f
[<
ffffffffffffffff>] 0xffffffffffffffff
Reported-by: Masami Ichikawa <masami256@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Masami Ichikawa <masami256@gmail.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
kernel/seccomp.c
Change-Id: Ide3c27bf378397f8faf4218e75c31e4b8bc43c4c
Kees Cook [Fri, 8 Nov 2013 23:51:56 +0000 (00:51 +0100)]
ARM: 7888/1: seccomp: not compatible with ARM OABI
Make sure that seccomp filter won't be built when ARM OABI is in use,
since there is work needed to distinguish calling conventions. Until
that is done (which is likely never since OABI is deprecated), make
sure seccomp filter is unavailable in the OABI world.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Will Drewry <wad@chromium.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
xerox_lin [Thu, 14 Aug 2014 06:48:44 +0000 (14:48 +0800)]
usb: Add support for rndis uplink aggregation
RNDIS protocol supports data aggregation on uplink and can help
reduce mips by reducing number of interrupts on device. Throughput
also improved by 20-30%. Aggregation is disabled by setting
aggregation packet size to 1. To help better UL throughput, set
as ul aggregation support to 3 rndis packets by default. It can be
configured via module parameter: rndis_ul_max_pkt_per_xfer.
Change-Id: I0b62a21a5c3ceb6b04933d0d6da33301dbafe493
Signed-off-by: Vamsi Krishna <vskrishn@codeaurora.org>
Signed-off-by: Xerox Lin <xerox_lin@htc.com>
Dmitry Shmidt [Fri, 22 Aug 2014 21:40:18 +0000 (14:40 -0700)]
Add flags parameter to get_country_code template
Change-Id: Ic3f173db144a301ea104f544fc8ec723241c1d59
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
Colin Cross [Tue, 5 Aug 2014 19:05:17 +0000 (12:05 -0700)]
mm: fix prctl_set_vma_anon_name
prctl_set_vma_anon_name could attempt to set the name across
two vmas at the same time due to a typo, which might corrupt
the vma list. Fix it to use tmp instead of end to limit
the name setting to a single vma at a time.
Change-Id: Ie32d8ddb0fd547efbeedd6528acdab5ca5b308b4
Reported-by: Jed Davis <jld@mozilla.com>
Signed-off-by: Colin Cross <ccross@android.com>
xerox_lin [Mon, 18 Aug 2014 13:54:23 +0000 (21:54 +0800)]
USB: rndis: Free the rndis response queue during REMOTE_NDIS_RESET_MSG
When rndis data transfer is in progress, some Windows7 Host PC is not
sending the GET_ENCAPSULATED_RESPONSE command for receiving the response
for the previous SEND_ENCAPSULATED_COMMAND processed.
The rndis function driver appends each response for the
SEND_ENCAPSULATED_COMMAND in a queue. As the above process got corrupted,
the Host sends a REMOTE_NDIS_RESET_MSG command to do a soft-reset.
As the rndis response queue is not freed, the previous response is sent
as a part of this REMOTE_NDIS_RESET_MSG's reset response and the Host
blocks any more Rndis transfers.
Hence free the rndis response queue as a part of this soft-reset so that
the current response for REMOTE_NDIS_RESET_MSG is sent properly during the
response command.
Change-Id: I8eff3849db452fe01b7d1fe4140ef1f1ad3f4fd4
Signed-off-by: Rajkumar Raghupathy <raghup@codeaurora.org>
Signed-off-by: Xerox Lin <xerox_lin@htc.com>
Maxime Bizon [Thu, 29 Aug 2013 18:28:13 +0000 (20:28 +0200)]
firmware loader: fix pending_fw_head list corruption
Got the following oops just before reboot:
Unable to handle kernel NULL pointer dereference at virtual address
00000000
[<
8028d300>] (__list_del_entry+0x44/0xac)
[<
802e3320>] (__fw_load_abort.part.13+0x1c/0x50)
[<
802e337c>] (fw_shutdown_notify+0x28/0x50)
[<
80034f80>] (notifier_call_chain.isra.1+0x5c/0x9c)
[<
800350ec>] (__blocking_notifier_call_chain+0x44/0x58)
[<
80035114>] (blocking_notifier_call_chain+0x14/0x18)
[<
80035d64>] (kernel_restart_prepare+0x14/0x38)
[<
80035d94>] (kernel_restart+0xc/0x50)
The following race condition triggers here:
_request_firmware_load()
device_create_file(...)
kobject_uevent(...)
(schedule)
(resume)
firmware_loading_store(1)
firmware_loading_store(0)
list_del_init(&buf->pending_list)
(schedule)
(resume)
list_add(&buf->pending_list, &pending_fw_head);
wait_for_completion(&buf->completion);
causing an oops later when walking pending_list after the firmware has
been released.
The proposed fix is to move the list_add() before sysfs attribute
creation.
Signed-off-by: Maxime Bizon <mbizon@freebox.fr>
Acked-by: Ming Lei <ming.lei@canonical.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Takashi Iwai [Wed, 22 May 2013 16:28:38 +0000 (18:28 +0200)]
firmware: Avoid deadlock of usermodehelper lock at shutdown
When a system goes to reboot/shutdown, it tries to disable the
usermode helper via usermodehelper_disable(). This might be blocked
when a driver tries to load a firmware beforehand and it's stuck by
some reason. For example, dell_rbu driver loads the firmware in
non-hotplug mode and waits for user-space clearing the loading sysfs
flag. If user-space doesn't clear the flag, it waits forever, thus
blocks the reboot, too.
As a workaround, in this patch, the firmware class driver registers a
reboot notifier so that it can abort all pending f/w bufs before
issuing usermodehelper_disable().
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Conflicts:
drivers/base/firmware_class.c
Change-Id: I7ff6c198cd34090e55845b9d4035b1e5dc86226b
Laura Abbott [Fri, 24 Jan 2014 23:19:49 +0000 (15:19 -0800)]
staging: android: ashmem: Avoid deadlock with mmap/shrink
Both ashmem_mmap and ashmem_shrink take the ashmem_lock. It may
be possible for ashmem_mmap to invoke ashmem_shrink:
-000|mutex_lock(lock = 0x0)
-001|ashmem_shrink(?, sc = 0x0) <--- try to take ashmem_mutex again
-002|shrink_slab(shrink = 0xDA5F1CC0, nr_pages_scanned = 0, lru_pages
-002|=
-002|124)
-003|try_to_free_pages(zonelist = 0x0, ?, ?, ?)
-004|__alloc_pages_nodemask(gfp_mask = 21200, order = 1, zonelist =
-004|0xC11D0940,
-005|new_slab(s = 0xE4841E80, ?, node = -1)
-006|__slab_alloc.isra.43.constprop.50(s = 0xE4841E80, gfpflags =
-006|
2148925462, ad
-007|kmem_cache_alloc(s = 0xE4841E80, gfpflags = 208)
-008|shmem_alloc_inode(?)
-009|alloc_inode(sb = 0xE480E800)
-010|new_inode_pseudo(?)
-011|new_inode(?)
-012|shmem_get_inode(sb = 0xE480E800, dir = 0x0, ?, dev = 0, flags =
-012|187)
-013|shmem_file_setup(?, ?, flags = 187)
-014|ashmem_mmap(?, vma = 0xC5D64210) <---- Acquire ashmem_mutex
-015|mmap_region(file = 0xDF8E2C00, addr =
1772974080, len = 233472,
-015|flags = 57,
-016|sys_mmap_pgoff(addr = 0, len = 230400, prot = 3, flags = 1, fd =
-016|157, pgoff
-017|ret_fast_syscall(asm)
-->|exception
-018|NUR:0x40097508(asm)
---|end of frame
Avoid this deadlock by using mutex_trylock in ashmem_shrink; if the mutex
is already held, do not attempt to shrink.
Change-Id: I222bbf55856d5849da813b730de0636c80966c8e
Reported-by: Matt Wagantall <mattw@codeaurora.org>
Reported-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
Reported-by: Osvaldo Banuelos <osvaldob@codeaurora.org>
Reported-by: Subbaraman Narayanamurthy <subbaram@codeaurora.org>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
JP Abgrall [Wed, 23 Jul 2014 23:55:07 +0000 (16:55 -0700)]
ext4: Add support for FIDTRIM, a best-effort ioctl for deep discard trim
* What
This provides an interface for issuing an FITRIM which uses the
secure discard instead of just a discard.
Only the eMMC command is "secure", and not how the FS uses it:
due to the fact that the FS might reassign a region somewhere else,
the original deleted data will not be affected by the "trim" which only
handles un-used regions.
So we'll just call it "deep discard", and note that this is a
"best effort" cleanup.
* Why
Once in a while, We want to be able to cleanup most of the unused blocks
after erasing a bunch of files.
We don't want to constantly secure-discard via a mount option.
From an eMMC spec perspective, it tells the device to really get rid of
all the data for the specified blocks and not just put them back into the
pool of free ones (unlike the normal TRIM). The eMMC spec says the
secure trim handling must make sure the data (and metadata) is not available
anymore. A simple TRIM doesn't clear the data, it just puts blocks in the
free pool.
JEDEC Standard No. 84-A441
7.6.9 Secure Erase
7.6.10 Secure Trim
From an FS perspective, it is acceptable to leave some data behind.
- directory entries related to deleted files
- databases entries related to deleted files
- small-file data stored in inode extents
- blocks held by the FS waiting to be re-used (mitigated by sync).
- blocks reassigned by the FS prior to FIDTRIM.
Change-Id: I676a1404a80130d93930c84898360f2e6fb2f81e
Signed-off-by: Geremy Condra <gcondra@google.com>
Signed-off-by: JP Abgrall <jpa@google.com>
Anson Jacob [Mon, 23 Jun 2014 11:14:01 +0000 (19:14 +0800)]
usb: gadget: f_audio_source: Fixed USB Audio Class Interface Descriptor
Fixed Android Issue #56549.
When both Vendor Class and Audio Class are activated for AOA 2.0,
the baInterfaceNr of the AudioControl Interface Descriptor points
to wrong interface numbers. They should be pointing to
Audio Control Device and Audio Streaming interfaces.
Replaced baInterfaceNr with the correct value.
Change-Id: Iaa083f3d97c1f0fc9481bf87852b2b51278a6351
Signed-off-by: Anson Jacob <ansonkuzhumbil@gmail.com>
Anson Jacob [Tue, 1 Jul 2014 10:17:20 +0000 (18:17 +0800)]
usb: gadget: f_audio_source: change max ISO packet size
Re-applying from
https://gitorious.org/shr/linux/commit/
eb4c9d2db894c3492c0a848581bd4f6790f93d5f
Most USB-AUDIO devices are limited to 256 byte for max iso buffer size.
If a IN_EP_MAX_PACKET_SIZE is bigger than a USB-AUDIO device's max iso
buffer size, it will cause noise. This patch will prevent this case as
possibe by reducing packet size. When using 44.1khz, 2ch, 16bit audio
data, if max packet size is bigger than 176 bytes, it's no problem.
Credits to: Iliyan Malchev <malchev@google.com>
Change-Id: Ic2a1c19ea65d5fb42bf12926b51b255b465d7215
Signed-off-by: Anson Jacob <ansonkuzhumbil@gmail.com>
Jonathan Hamilton [Thu, 17 Jul 2014 22:54:44 +0000 (15:54 -0700)]
video: adf: Cleanup sw_sync timeline at adf_device_destroy
If a sw_sync timeline was created by ADF (for drivers that do not implement
ops->complete_fence) we should clean it up when the ADF device is
destroyed.
Change-Id: Idd90180fcae56a87111f7d12bdd80190756a6b80
Signed-off-by: Jonathan Hamilton <jonathan.hamilton@imgtec.com>
Kees Cook [Wed, 28 Aug 2013 20:32:01 +0000 (22:32 +0200)]
HID: check for NULL field when setting values
Defensively check that the field to be worked on is not NULL.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Kees Cook [Wed, 28 Aug 2013 20:31:52 +0000 (22:31 +0200)]
HID: picolcd_core: validate output report details
A HID device could send a malicious output report that would cause the
picolcd HID driver to trigger a NULL dereference during attr file writing.
[jkosina@suse.cz: changed
report->maxfield < 1
to
report->maxfield != 1
as suggested by Bruno].
CVE-2013-2899
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@kernel.org
Reviewed-by: Bruno Prémont <bonbons@linux-vserver.org>
Acked-by: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Kees Cook [Wed, 28 Aug 2013 20:31:44 +0000 (22:31 +0200)]
HID: sensor-hub: validate feature report details
A HID device could send a malicious feature report that would cause the
sensor-hub HID driver to read past the end of heap allocation, leaking
kernel memory contents to the caller.
CVE-2013-2898
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@kernel.org
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Benjamin Tissoires [Wed, 11 Sep 2013 19:56:58 +0000 (21:56 +0200)]
HID: multitouch: validate indexes details
When working on report indexes, always validate that they are in bounds.
Without this, a HID device could report a malicious feature report that
could trick the driver into a heap overflow:
[ 634.885003] usb 1-1: New USB device found, idVendor=0596, idProduct=0500
...
[ 676.469629] BUG kmalloc-192 (Tainted: G W ): Redzone overwritten
Note that we need to change the indexes from s8 to s16 as they can
be between -1 and 255.
CVE-2013-2897
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Benjamin Tissoires [Wed, 11 Sep 2013 19:56:57 +0000 (21:56 +0200)]
HID: validate feature and input report details
When dealing with usage_index, be sure to properly use unsigned instead of
int to avoid overflows.
When working on report fields, always validate that their report_counts are
in bounds.
Without this, a HID device could report a malicious feature report that
could trick the driver into a heap overflow:
[ 634.885003] usb 1-1: New USB device found, idVendor=0596, idProduct=0500
...
[ 676.469629] BUG kmalloc-192 (Tainted: G W ): Redzone overwritten
CVE-2013-2897
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Kees Cook [Wed, 28 Aug 2013 20:31:28 +0000 (22:31 +0200)]
HID: ntrig: validate feature report details
A HID device could send a malicious feature report that would cause the
ntrig HID driver to trigger a NULL dereference during initialization:
[57383.031190] usb 3-1: New USB device found, idVendor=1b96, idProduct=0001
...
[57383.315193] BUG: unable to handle kernel NULL pointer dereference at
0000000000000030
[57383.315308] IP: [<
ffffffffa08102de>] ntrig_probe+0x25e/0x420 [hid_ntrig]
CVE-2013-2896
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@kernel.org
Signed-off-by: Rafi Rubin <rafi@seas.upenn.edu>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Kees Cook [Wed, 11 Sep 2013 19:56:56 +0000 (21:56 +0200)]
HID: logitech-dj: validate output report details
A HID device could send a malicious output report that would cause the
logitech-dj HID driver to leak kernel memory contents to the device, or
trigger a NULL dereference during initialization:
[ 304.424553] usb 1-1: New USB device found, idVendor=046d, idProduct=c52b
...
[ 304.780467] BUG: unable to handle kernel NULL pointer dereference at
0000000000000028
[ 304.781409] IP: [<
ffffffff815d50aa>] logi_dj_recv_send_report.isra.11+0x1a/0x90
CVE-2013-2895
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Kees Cook [Wed, 11 Sep 2013 19:56:55 +0000 (21:56 +0200)]
HID: lenovo-tpkbd: validate output report details
A HID device could send a malicious output report that would cause the
lenovo-tpkbd HID driver to write just beyond the output report allocation
during initialization, causing a heap overflow:
[ 76.109807] usb 1-1: New USB device found, idVendor=17ef, idProduct=6009
...
[ 80.462540] BUG kmalloc-192 (Tainted: G W ): Redzone overwritten
CVE-2013-2894
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Kees Cook [Wed, 11 Sep 2013 19:56:54 +0000 (21:56 +0200)]
HID: LG: validate HID output report details
A HID device could send a malicious output report that would cause the
lg, lg3, and lg4 HID drivers to write beyond the output report allocation
during an event, causing a heap overflow:
[ 325.245240] usb 1-1: New USB device found, idVendor=046d, idProduct=c287
...
[ 414.518960] BUG kmalloc-4096 (Not tainted): Redzone overwritten
Additionally, while lg2 did correctly validate the report details, it was
cleaned up and shortened.
CVE-2013-2893
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Kees Cook [Wed, 28 Aug 2013 20:30:49 +0000 (22:30 +0200)]
HID: pantherlord: validate output report details
A HID device could send a malicious output report that would cause the
pantherlord HID driver to write beyond the output report allocation
during initialization, causing a heap overflow:
[ 310.939483] usb 1-1: New USB device found, idVendor=0e8f, idProduct=0003
...
[ 315.980774] BUG kmalloc-192 (Tainted: G W ): Redzone overwritten
CVE-2013-2892
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Kees Cook [Wed, 11 Sep 2013 19:56:53 +0000 (21:56 +0200)]
HID: steelseries: validate output report details
A HID device could send a malicious output report that would cause the
steelseries HID driver to write beyond the output report allocation
during initialization, causing a heap overflow:
[ 167.981534] usb 1-1: New USB device found, idVendor=1038, idProduct=1410
...
[ 182.050547] BUG kmalloc-256 (Tainted: G W ): Redzone overwritten
CVE-2013-2891
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Kees Cook [Wed, 11 Sep 2013 19:56:51 +0000 (21:56 +0200)]
HID: zeroplus: validate output report details
The zeroplus HID driver was not checking the size of allocated values
in fields it used. A HID device could send a malicious output report
that would cause the driver to write beyond the output report allocation
during initialization, causing a heap overflow:
[ 1442.728680] usb 1-1: New USB device found, idVendor=0c12, idProduct=0005
...
[ 1466.243173] BUG kmalloc-192 (Tainted: G W ): Redzone overwritten
CVE-2013-2889
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Kees Cook [Wed, 11 Sep 2013 19:56:50 +0000 (21:56 +0200)]
HID: provide a helper for validating hid reports
Many drivers need to validate the characteristics of their HID report
during initialization to avoid misusing the reports. This adds a common
helper to perform validation of the report exisitng, the field existing,
and the expected number of values within the field.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Kees Cook [Wed, 28 Aug 2013 20:29:55 +0000 (22:29 +0200)]
HID: validate HID report id size
The "Report ID" field of a HID report is used to build indexes of
reports. The kernel's index of these is limited to 256 entries, so any
malicious device that sets a Report ID greater than 255 will trigger
memory corruption on the host:
[ 1347.156239] BUG: unable to handle kernel paging request at
ffff88094958a878
[ 1347.156261] IP: [<
ffffffff813e4da0>] hid_register_report+0x2a/0x8b
CVE-2013-2888
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Anson Jacob [Mon, 23 Jun 2014 11:07:44 +0000 (19:07 +0800)]
usb: gadget: f_accessory: Enabled Zero Length Packet (ZLP) for acc_write
Accessory connected to Android Device requires
Zero Length Packet (ZLP) to be written when data
transferred out from the Android device are multiples
of wMaxPacketSize (64bytes (Full-Speed) / 512bytes (High-Speed))
to end the transfer.
Change-Id: Ib2c2c0ab98ef9afa10e74a720142deca5c0ed476
Signed-off-by: Anson Jacob <ansonkuzhumbil@gmail.com>
Sreeram Ramachandran [Tue, 8 Jul 2014 18:57:14 +0000 (11:57 -0700)]
Handle 'sk' being NULL in UID-based routing.
Bug:
15413527
Change-Id: Iab1fae9da6053b284591628ef1de878761b137b1
Signed-off-by: Sreeram Ramachandran <sreeram@google.com>
Daniel Rosenberg [Fri, 27 Jun 2014 23:39:35 +0000 (16:39 -0700)]
input: Made keyreset more robust
Switched do_restart to run in a seperate workqueue to handle
cases where kernel_restart hangs.
Change-Id: I1ecd61f8d0859f1a86d37c692351d644b5db9c69
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Dmitry Shmidt [Tue, 1 Jul 2014 21:48:15 +0000 (14:48 -0700)]
net: cfg80211: Fix wiphy_vendor_command 'doit' type
Change-Id: I5b1732eed7ac4f6bc267b4baa2153f6de2e16dc8
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
Yann Soubeyrand [Wed, 18 Jun 2014 12:57:29 +0000 (14:57 +0200)]
net: wireless: fix misplaced #endif in net/wireless/nl80211.c
The patch "nl80211: cumulative vendor command support patch" introduced
compilation error in file net/wireless/nl80211.c. The nl80211_vendor_mcgrp
variable is defined only if the CONFIG_NL80211_TESTMODE preprocessor constant
is defined. However, this variable is later used wether
CONFIG_NL80211_TESTMODE is defined or not. The cause is a misplaced #endif.
Change-Id: I466488285578d57e6554a1f8ebe71d4f3385ecf2
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
Minsung Kim [Wed, 25 Jun 2014 10:44:50 +0000 (19:44 +0900)]
cpufreq: fix sleeping in atomic context when realloc freq_table for all_time_in_state
Commit
40cf2f8 (cpufreq: Persist cpufreq time in state data across hotplug)
causes the following call trace to be spit on boot:
BUG: sleeping function called from invalid context at mm/slub.c:936
in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0
CPU: 6 PID: 1 Comm: swapper/0 Not tainted
3.10.9-20140624.172707-eng-gd6c0f69-dirty #50
Backtrace:
[<
c0012270>] (dump_backtrace+0x0/0x10c) from [<
c001256c>] (show_stack+0x18/0x1c)
r6:
ffff1788 r5:
c0c020c0 r4:
e609c000 r3:
00000000
[<
c0012554>] (show_stack+0x0/0x1c) from [<
c07a2970>] (dump_stack+0x20/0x28)
[<
c07a2950>] (dump_stack+0x0/0x28) from [<
c0057678>] (__might_sleep+0x104/0x120)
[<
c0057574>] (__might_sleep+0x0/0x120) from [<
c00ff000>] (__kmalloc_track_caller+0x144/0x274)
r6:
00000000 r5:
e609c000 r4:
e6802140
[<
c00feebc>] (__kmalloc_track_caller+0x0/0x274) from [<
c00da098>] (krealloc+0x58/0xb0)
[<
c00da040>] (krealloc+0x0/0xb0) from [<
c050266c>] (cpufreq_allstats_create+0x120/0x204)
r8:
e4c4ff00 r7:
c0d266b8 r6:
0013d620 r5:
e4c4e600 r4:
00000001
r3:
e535d6d0
[<
c050254c>] (cpufreq_allstats_create+0x0/0x204) from [<
c0502e38>] (cpufreq_stat_notifier_policy+0xb8/0xd0)
[<
c0502d80>] (cpufreq_stat_notifier_policy+0x0/0xd0) from [<
c00517cc>] (notifier_call_chain+0x4c/0x8c)
r5:
00000000 r4:
fffffffe
[<
c0051780>] (notifier_call_chain+0x0/0x8c) from [<
c00519fc>] (__blocking_notifier_call_chain+0x50/0x68)
r8:
c0cd4d00 r7:
00000002 r6:
e609dd7c r5:
ffffffff r4:
c0d25a4c
r3:
ffffffff
[<
c00519ac>] (__blocking_notifier_call_chain+0x0/0x68) from [<
c0051a34>] (blocking_notifier_call_chain+0x20/0x28)
r7:
c0e24f30 r6:
00000000 r5:
e53e1e00 r4:
e609dd7c
[<
c0051a14>] (blocking_notifier_call_chain+0x0/0x28) from [<
c0500fec>] (__cpufreq_set_policy+0xc0/0x1d0)
[<
c0500f2c>] (__cpufreq_set_policy+0x0/0x1d0) from [<
c0501308>] (cpufreq_add_dev_interface+0x20c/0x270)
r7:
00000008 r6:
00000000 r5:
e53e1e00 r4:
e53e1e58
[<
c05010fc>] (cpufreq_add_dev_interface+0x0/0x270) from [<
c05016a8>] (cpufreq_add_dev+0x33c/0x420)
[<
c050136c>] (cpufreq_add_dev+0x0/0x420) from [<
c03604a4>] (subsys_interface_register+0x80/0xbc)
[<
c0360424>] (subsys_interface_register+0x0/0xbc) from [<
c050035c>] (cpufreq_register_driver+0x8c/0x194)
Change-Id: If77a656d0ea60a8fc4083283d104509fa6c07f8f
Signed-off-by: Minsung Kim <ms925.kim@samsung.com>
Dmitry Shmidt [Thu, 26 Jun 2014 16:26:21 +0000 (09:26 -0700)]
net: wireless: Fix cfg80211_vendor_cmd_alloc_reply_skb
Change-Id: Ia8da6cdacd5668d10f8955972d996177305b7228
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
Lorenzo Colitti [Mon, 31 Mar 2014 07:23:51 +0000 (16:23 +0900)]
net: core: Support UID-based routing.
This contains the following commits:
1.
cc2f522 net: core: Add a UID range to fib rules.
2.
d7ed2bd net: core: Use the socket UID in routing lookups.
3.
2f9306a net: core: Add a RTA_UID attribute to routes.
This is so that userspace can do per-UID route lookups.
4.
8e46efb net: ipv6: Use the UID in IPv6 PMTUD
IPv4 PMTUD already does this because ipv4_sk_update_pmtu
uses __build_flow_key, which includes the UID.
Bug:
15413527
Change-Id: I81bd31dae655de9cce7d7a1f9a905dc1c2feba7c
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Stephen Smalley [Tue, 23 Jul 2013 21:38:41 +0000 (17:38 -0400)]
SELinux: Enable setting security contexts on rootfs inodes.
rootfs (ramfs) can support setting of security contexts
by userspace due to the vfs fallback behavior of calling
the security module to set the in-core inode state
for security.* attributes when the filesystem does not
provide an xattr handler. No xattr handler required
as the inodes are pinned in memory and have no backing
store.
This is useful in allowing early userspace to label individual
files within a rootfs while still providing a policy-defined
default via genfs.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Dmitry Shmidt [Tue, 24 Jun 2014 20:19:46 +0000 (13:19 -0700)]
net: wireless: Add NL80211_FLAG_NEED_WIPHY flag to vendor command
Change-Id: I52ee3bc8a422c2a4c57cccccccd6ba3e721b4c01
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
Dmitry Shmidt [Tue, 24 Jun 2014 16:36:50 +0000 (09:36 -0700)]
net: wireless: Increase scan entry expiration to fit new scan time
Change-Id: I0e23ce45d78d7c17633670973f49943a5ed6032d
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
Dmitry Shmidt [Tue, 17 Jun 2014 20:42:34 +0000 (13:42 -0700)]
nl80211: cumulative vendor command support patch
Based on commit
d3fd06d0259232e1362c6d1da136970d26628467
Author: Johannes Berg <johannes.berg@intel.com>
Date: Sat Jan 25 10:17:18 2014 -0800
nl80211: vendor command support
Change-Id: I832eb4da295fe7b2c9bd8ff69ae80fe7bfe30add
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
Todd Poynor [Thu, 12 Dec 2013 23:59:09 +0000 (15:59 -0800)]
power: Add property CHARGE_COUNTER_EXT and 64-bit precision properties
Add POWER_SUPPLY_PROP_CHARGE_COUNTER_EXT that stores accumulated charge
in nAh units as a signed 64-bit value.
Add generic support for signed 64-bit property values.
Change-Id: I2bd34b1e95ffba24e7bfef81f398f22bd2aaf05e
Signed-off-by: Todd Poynor <toddpoynor@google.com>
Johannes Berg [Tue, 30 Jul 2013 20:34:28 +0000 (22:34 +0200)]
nl80211: fix another nl80211_fam.attrbuf race
commit
c319d50bfcf678c2857038276d9fab3c6646f3bf upstream.
This is similar to the race Linus had reported, but in this case
it's an older bug: nl80211_prepare_wdev_dump() uses the wiphy
index in cb->args[0] as it is and thus parses the message over
and over again instead of just once because 0 is the first valid
wiphy index. Similar code in nl80211_testmode_dump() correctly
offsets the wiphy_index by 1, do that here as well.
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ruchi Kandoi [Sat, 14 Jun 2014 00:03:01 +0000 (17:03 -0700)]
prctl: adds the capable(CAP_SYS_NICE) check to PR_SET_TIMERSLACK_PID.
Adds a capable() check to make sure that arbitary apps do not change the
timer slack for other apps.
Bug:
15000427
Change-Id: I558a2551a0e3579c7f7e7aae54b28aa9d982b209
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Ruchi Kandoi [Sat, 15 Mar 2014 01:27:20 +0000 (18:27 -0700)]
cpufreq: Persist cpufreq time in state data across hotplug
Cpufreq time_in_state data for all CPUs is made persistent across
hotplug and exposed to userspace via sysfs file
/sys/devices/system/cpu/cpufreq/all_time_in_state
Change-Id: I97cb5de24b6de16189bf8b5df9592d0a6e6ddf32
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Ruchi Kandoi [Fri, 13 Jun 2014 23:24:15 +0000 (16:24 -0700)]
cpufreq: interactive: prevents the frequency to directly raise above the
hispeed_freq from a lower frequency.
When the load was below go_hispeed_load, there is a possibility that
choose_freq() would return a frequency which would be higher than the
hispeed_freq. According to the policy we should first jump to the
hispeed_freq, stay there for above_hispeed_delay and then be allowed to
raise higher than that.
Added a check to prevent the frequency to be directly raised to
something higher than the hispeed_freq.
Change-Id: Icda5d848dd9beadcc18835082ddf269732c75bd0
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Sherman Yin [Thu, 12 Jun 2014 21:35:38 +0000 (14:35 -0700)]
netfilter: fix seq_printf type mismatch warning
The return type of atomic64_read() varies depending on arch. The
arm64 version is being changed from long long to long in the mainline
for v3.16, causing a seq_printf type mismatch (%llu) in
guid_ctrl_proc_show().
This commit fixes the type mismatch by casting atomic64_read() to u64.
Change-Id: Iae0a6bd4314f5686a9f4fecbe6203e94ec0870de
Signed-off-by: Sherman Yin <shermanyin@gmail.com>
Mohamad Ayyash [Wed, 11 Jun 2014 21:52:38 +0000 (14:52 -0700)]
arm64: Fix correct dtb clean-files location
This Makefile is evaluated in arch/arm64/boot/Makefile which is what
$(obj) is.
Signed-off-by: Mohamad Ayyash <mkayyash@google.com>
Change-Id: I75355f064e249a8db693e06073f5cf395ca29ab6
Thomas Gleixner [Tue, 3 Jun 2014 12:27:08 +0000 (12:27 +0000)]
futex: Make lookup_pi_state more robust
The current implementation of lookup_pi_state has ambigous handling of
the TID value 0 in the user space futex. We can get into the kernel
even if the TID value is 0, because either there is a stale waiters
bit or the owner died bit is set or we are called from the requeue_pi
path or from user space just for fun.
The current code avoids an explicit sanity check for pid = 0 in case
that kernel internal state (waiters) are found for the user space
address. This can lead to state leakage and worse under some
circumstances.
Handle the cases explicit:
Waiter | pi_state | pi->owner | uTID | uODIED | ?
[1] NULL | --- | --- | 0 | 0/1 | Valid
[2] NULL | --- | --- | >0 | 0/1 | Valid
[3] Found | NULL | -- | Any | 0/1 | Invalid
[4] Found | Found | NULL | 0 | 1 | Valid
[5] Found | Found | NULL | >0 | 1 | Invalid
[6] Found | Found | task | 0 | 1 | Valid
[7] Found | Found | NULL | Any | 0 | Invalid
[8] Found | Found | task | ==taskTID | 0/1 | Valid
[9] Found | Found | task | 0 | 0 | Invalid
[10] Found | Found | task | !=taskTID | 0/1 | Invalid
[1] Indicates that the kernel can acquire the futex atomically. We
came came here due to a stale FUTEX_WAITERS/FUTEX_OWNER_DIED bit.
[2] Valid, if TID does not belong to a kernel thread. If no matching
thread is found then it indicates that the owner TID has died.
[3] Invalid. The waiter is queued on a non PI futex
[4] Valid state after exit_robust_list(), which sets the user space
value to FUTEX_WAITERS | FUTEX_OWNER_DIED.
[5] The user space value got manipulated between exit_robust_list()
and exit_pi_state_list()
[6] Valid state after exit_pi_state_list() which sets the new owner in
the pi_state but cannot access the user space value.
[7] pi_state->owner can only be NULL when the OWNER_DIED bit is set.
[8] Owner and user space value match
[9] There is no transient state which sets the user space TID to 0
except exit_robust_list(), but this is indicated by the
FUTEX_OWNER_DIED bit. See [4]
[10] There is no transient state which leaves owner and user space
TID out of sync.
Backport to 3.13
conflicts: kernel/futex.c
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Will Drewry <wad@chromium.org>
Cc: Darren Hart <dvhart@linux.intel.com>
Cc: stable@vger.kernel.org
Thomas Gleixner [Tue, 3 Jun 2014 12:27:07 +0000 (12:27 +0000)]
futex: Always cleanup owner tid in unlock_pi
If the owner died bit is set at futex_unlock_pi, we currently do not
cleanup the user space futex. So the owner TID of the current owner
(the unlocker) persists. That's observable inconsistant state,
especially when the ownership of the pi state got transferred.
Clean it up unconditionally.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Will Drewry <wad@chromium.org>
Cc: Darren Hart <dvhart@linux.intel.com>
Cc: stable@vger.kernel.org
Thomas Gleixner [Tue, 3 Jun 2014 12:27:06 +0000 (12:27 +0000)]
futex: Validate atomic acquisition in futex_lock_pi_atomic()
We need to protect the atomic acquisition in the kernel against rogue
user space which sets the user space futex to 0, so the kernel side
acquisition succeeds while there is existing state in the kernel
associated to the real owner.
Verify whether the futex has waiters associated with kernel state. If
it has, return -EINVAL. The state is corrupted already, so no point in
cleaning it up. Subsequent calls will fail as well. Not our problem.
[ tglx: Use futex_top_waiter() and explain why we do not need to try
restoring the already corrupted user space state. ]
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Will Drewry <wad@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Tue, 3 Jun 2014 12:27:06 +0000 (12:27 +0000)]
futex-prevent-requeue-pi-on-same-futex.patch futex: Forbid uaddr == uaddr2 in futex_requeue(..., requeue_pi=1)
If uaddr == uaddr2, then we have broken the rule of only requeueing
from a non-pi futex to a pi futex with this call. If we attempt this,
then dangling pointers may be left for rt_waiter resulting in an
exploitable condition.
This change brings futex_requeue() into line with
futex_wait_requeue_pi() which performs the same check as per commit
6f7b0a2a5 (futex: Forbid uaddr == uaddr2 in futex_wait_requeue_pi())
[ tglx: Compare the resulting keys as well, as uaddrs might be
different depending on the mapping ]
Fixes CVE-2014-3153.
Reported-by: Pinkie Pie
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Mark Salyzyn [Fri, 6 Jun 2014 19:06:42 +0000 (12:06 -0700)]
android: base-cfg: disable LOGGER
Bug:
15384806
Change-Id: If8d324ffdb4ebd56e5d68876f8e229547e20eaf4
Signed-off-by: Mark Salyzyn <salyzyn@google.com>
Andreas Henriksson [Thu, 7 Nov 2013 17:26:38 +0000 (18:26 +0100)]
net: Fix "ip rule delete table 256"
[ Upstream commit
13eb2ab2d33c57ebddc57437a7d341995fc9138c ]
When trying to delete a table >= 256 using iproute2 the local table
will be deleted.
The table id is specified as a netlink attribute when it needs more then
8 bits and iproute2 then sets the table field to RT_TABLE_UNSPEC (0).
Preconditions to matching the table id in the rule delete code
doesn't seem to take the "table id in netlink attribute" into condition
so the frh_get_table helper function never gets to do its job when
matching against current rule.
Use the helper function twice instead of peaking at the table value directly.
Originally reported at: http://bugs.debian.org/724783
Reported-by: Nicolas HICHER <nhicher@avencall.com>
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lorenzo Colitti [Wed, 26 Mar 2014 04:03:12 +0000 (13:03 +0900)]
net: support marking accepting TCP sockets
When using mark-based routing, sockets returned from accept()
may need to be marked differently depending on the incoming
connection request.
This is the case, for example, if different socket marks identify
different networks: a listening socket may want to accept
connections from all networks, but each connection should be
marked with the network that the request came in on, so that
subsequent packets are sent on the correct network.
This patch adds a sysctl to mark TCP sockets based on the fwmark
of the incoming SYN packet. If enabled, and an unmarked socket
receives a SYN, then the SYN packet's fwmark is written to the
connection's inet_request_sock, and later written back to the
accepted socket when the connection is established. If the
socket already has a nonzero mark, then the behaviour is the same
as it is today, i.e., the listening socket's fwmark is used.
Black-box tested using user-mode linux:
- IPv4/IPv6 SYN+ACK, FIN, etc. packets are routed based on the
mark of the incoming SYN packet.
- The socket returned by accept() is marked with the mark of the
incoming SYN packet.
- Tested with syncookies=1 and syncookies=2.
Change-Id: I26bc1eceefd2c588d73b921865ab70e4645ade57
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Lorenzo Colitti [Sat, 10 May 2014 02:56:37 +0000 (11:56 +0900)]
net: Use fwmark reflection in PMTU discovery.
Currently, routing lookups used for Path PMTU Discovery in
absence of a socket or on unmarked sockets use a mark of 0.
This causes PMTUD not to work when using routing based on
netfilter fwmark mangling and fwmark ip rules, such as:
iptables -j MARK --set-mark 17
ip rule add fwmark 17 lookup 100
This patch causes these route lookups to use the fwmark from the
received ICMP error when the fwmark_reflect sysctl is enabled.
This allows the administrator to make PMTUD work by configuring
appropriate fwmark rules to mark the inbound ICMP packets.
Black-box tested using user-mode linux by pointing different
fwmarks at routing tables egressing on different interfaces, and
using iptables mangling to mark packets inbound on each interface
with the interface's fwmark. ICMPv4 and ICMPv6 PMTU discovery
work as expected when mark reflection is enabled and fail when
it is disabled.
Change-Id: Id7fefb7ec1ff7f5142fba43db1960b050e0dfaec
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Lorenzo Colitti [Tue, 18 Mar 2014 11:52:27 +0000 (20:52 +0900)]
net: add a sysctl to reflect the fwmark on replies
Kernel-originated IP packets that have no user socket associated
with them (e.g., ICMP errors and echo replies, TCP RSTs, etc.)
are emitted with a mark of zero. Add a sysctl to make them have
the same mark as the packet they are replying to.
This allows an administrator that wishes to do so to use
mark-based routing, firewalling, etc. for these replies by
marking the original packets inbound.
Tested using user-mode linux:
- ICMP/ICMPv6 echo replies and errors.
- TCP RST packets (IPv4 and IPv6).
Change-Id: I6873d973196797bcf32e2e91976df647c7e8b85a
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Lorenzo Colitti [Wed, 26 Mar 2014 10:35:41 +0000 (19:35 +0900)]
net: ipv6: autoconf routes into per-device tables
Currently, IPv6 router discovery always puts routes into
RT6_TABLE_MAIN. This causes problems for connection managers
that want to support multiple simultaneous network connections
and want control over which one is used by default (e.g., wifi
and wired).
To work around this connection managers typically take the routes
they prefer and copy them to static routes with low metrics in
the main table. This puts the burden on the connection manager
to watch netlink to see if the routes have changed, delete the
routes when their lifetime expires, etc.
Instead, this patch adds a per-interface sysctl to have the
kernel put autoconf routes into different tables. This allows
each interface to have its own autoconf table, and choosing the
default interface (or using different interfaces at the same
time for different types of traffic) can be done using
appropriate ip rules.
The sysctl behaves as follows:
- = 0: default. Put routes into RT6_TABLE_MAIN as before.
- > 0: manual. Put routes into the specified table.
- < 0: automatic. Add the absolute value of the sysctl to the
device's ifindex, and use that table.
The automatic mode is most useful in conjunction with
net.ipv6.conf.default.accept_ra_rt_table. A connection manager
or distribution could set it to, say, -100 on boot, and
thereafter just use IP rules.
Change-Id: I82d16e3737d9cdfa6489e649e247894d0d60cbb1
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Lorenzo Colitti [Thu, 27 Feb 2014 04:38:26 +0000 (13:38 +0900)]
net: ipv6: ping: Use socket mark in routing lookup
Change-Id: I5a61e0f9f22f193c51b1aafd270fb0642a2e0fab
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hurley [Wed, 14 May 2014 03:07:33 +0000 (20:07 -0700)]
n_tty: Fix n_tty_write crash when echoing in raw mode
commit
4291086b1f081b869c6d79e5b7441633dc3ace00 upstream.
The tty atomic_write_lock does not provide an exclusion guarantee for
the tty driver if the termios settings are LECHO & !OPOST. And since
it is unexpected and not allowed to call TTY buffer helpers like
tty_insert_flip_string concurrently, this may lead to crashes when
concurrect writers call pty_write. In that case the following two
writers:
* the ECHOing from a workqueue and
* pty_write from the process
race and can overflow the corresponding TTY buffer like follows.
If we look into tty_insert_flip_string_fixed_flag, there is:
int space = __tty_buffer_request_room(port, goal, flags);
struct tty_buffer *tb = port->buf.tail;
...
memcpy(char_buf_ptr(tb, tb->used), chars, space);
...
tb->used += space;
so the race of the two can result in something like this:
A B
__tty_buffer_request_room
__tty_buffer_request_room
memcpy(buf(tb->used), ...)
tb->used += space;
memcpy(buf(tb->used), ...) ->BOOM
B's memcpy is past the tty_buffer due to the previous A's tb->used
increment.
Since the N_TTY line discipline input processing can output
concurrently with a tty write, obtain the N_TTY ldisc output_lock to
serialize echo output with normal tty writes. This ensures the tty
buffer helper tty_insert_flip_string is not called concurrently and
everything is fine.
Note that this is nicely reproducible by an ordinary user using
forkpty and some setup around that (raw termios + ECHO). And it is
present in kernels at least after commit
d945cb9cce20ac7143c2de8d88b187f62db99bdc (pty: Rework the pty layer to
use the normal buffering logic) in 2.6.31-rc3.
js: add more info to the commit log
js: switch to bool
js: lock unconditionally
js: lock only the tty->ops->write call
References: CVE-2014-0196
Reported-and-tested-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cherry-picked from
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
branch: stable/linux-3.10.y
commit:
abb5100737bba3f82b5514350fea89ca361ac66c
Change-Id: I81e79fe209f5c7b25cac35189a44286e5a9ddac0
Signed-off-by: JP Abgrall <jpa@google.com>
Daniel Rosenberg [Wed, 7 May 2014 21:17:47 +0000 (14:17 -0700)]
input: Changed keyreset to act as a wrapper for keycombo.
keyreset now registers a keycombo driver that acts as the old
keyreset driver acted.
Change-Id: I08f5279e3a33b267571b699697f9f54508868983
Signed-off-by: Daniel Rosenberg <drosen@google.com>