Kees Cook [Tue, 10 Jun 2014 22:40:23 +0000 (15:40 -0700)]
ARM: add seccomp syscall
Wires up the new seccomp syscall.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Conflicts:
arch/arm/include/uapi/asm/unistd.h
arch/arm/kernel/calls.S
Signed-off-by: Lee Campbell <leecam@chromium.org>
Lee Campbell [Wed, 8 Oct 2014 21:40:22 +0000 (14:40 -0700)]
seccomp: fix syscall numbers for x86 and x86_64
Correcting syscall numbers for seccomp
Signed-off-by: Lee Campbell <leecam@chromium.org>
Guenter Roeck [Mon, 11 Aug 2014 03:50:30 +0000 (20:50 -0700)]
seccomp: Replace BUG(!spin_is_locked()) with assert_spin_lock
Current upstream kernel hangs with mips and powerpc targets in
uniprocessor mode if SECCOMP is configured.
Bisect points to commit
dbd952127d11 ("seccomp: introduce writer locking").
Turns out that code such as
BUG_ON(!spin_is_locked(&list_lock));
cannot be used in uniprocessor mode because spin_is_locked() always
returns false in this configuration, and that assert_spin_locked()
exists for that very purpose and must be used instead.
Fixes: dbd952127d11 ("seccomp: introduce writer locking")
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Kees Cook <keescook@chromium.org>
Kees Cook [Thu, 5 Jun 2014 07:23:17 +0000 (00:23 -0700)]
seccomp: implement SECCOMP_FILTER_FLAG_TSYNC
Applying restrictive seccomp filter programs to large or diverse
codebases often requires handling threads which may be started early in
the process lifetime (e.g., by code that is linked in). While it is
possible to apply permissive programs prior to process start up, it is
difficult to further restrict the kernel ABI to those threads after that
point.
This change adds a new seccomp syscall flag to SECCOMP_SET_MODE_FILTER for
synchronizing thread group seccomp filters at filter installation time.
When calling seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC,
filter) an attempt will be made to synchronize all threads in current's
threadgroup to its new seccomp filter program. This is possible iff all
threads are using a filter that is an ancestor to the filter current is
attempting to synchronize to. NULL filters (where the task is running as
SECCOMP_MODE_NONE) are also treated as ancestors allowing threads to be
transitioned into SECCOMP_MODE_FILTER. If prctl(PR_SET_NO_NEW_PRIVS,
...) has been set on the calling thread, no_new_privs will be set for
all synchronized threads too. On success, 0 is returned. On failure,
the pid of one of the failing threads will be returned and no filters
will have been applied.
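
A minimal userspace sketch of the call described above, assuming installed kernel headers that define __NR_seccomp and SECCOMP_FILTER_FLAG_TSYNC. The helper names and the allow-all filter are illustrative and are not part of this patch.

```c
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Illustrative one-instruction program: allow every syscall. */
static struct sock_filter allow_all_insns[] = {
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

/* Hypothetical helper: wrap the instructions in a sock_fprog. */
struct sock_fprog make_allow_all_prog(void)
{
	struct sock_fprog prog = {
		.len = sizeof(allow_all_insns) / sizeof(allow_all_insns[0]),
		.filter = allow_all_insns,
	};
	return prog;
}

/*
 * Hypothetical helper: ask the kernel to apply prog to every thread
 * in the caller's thread group. Returns 0 on success, -1 with errno
 * set on error, or a positive thread id when one of the threads
 * could not be synchronized (in which case no filter was applied).
 */
long install_filter_tsync(const struct sock_fprog *prog)
{
	/* Unprivileged callers need no_new_privs before adding a filter. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;
	return syscall(__NR_seccomp, SECCOMP_SET_MODE_FILTER,
		       SECCOMP_FILTER_FLAG_TSYNC, prog);
}
```

A caller would build the program with make_allow_all_prog() and treat any positive return value from install_filter_tsync() as the id of a thread whose filter is not an ancestor of the new one.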
The race conditions against another thread are:
- requesting TSYNC (already handled by sighand lock)
- performing a clone (already handled by sighand lock)
- changing its filter (already handled by sighand lock)
- calling exec (handled by cred_guard_mutex)
The clone case is assisted by the fact that new threads will have their
seccomp state duplicated from their parent before appearing on the tasklist.
Holding cred_guard_mutex means that seccomp filters cannot be assigned
while in the middle of another thread's exec (potentially bypassing
no_new_privs or similar). The call to de_thread() may kill threads waiting
for the mutex.
Changes across threads to the filter pointer include a barrier.
Based on patches by Will Drewry.
Suggested-by: Julien Tinnes <jln@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Kees Cook [Fri, 27 Jun 2014 22:01:35 +0000 (15:01 -0700)]
seccomp: allow mode setting across threads
This changes the mode setting helper to allow threads to change the
seccomp mode from another thread. We must maintain barriers to keep
TIF_SECCOMP synchronized with the rest of the seccomp state.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Conflicts:
kernel/seccomp.c
Kees Cook [Fri, 27 Jun 2014 22:18:48 +0000 (15:18 -0700)]
seccomp: introduce writer locking
Normally, task_struct.seccomp.filter is only ever read or modified by
the task that owns it (current). This property aids in fast access
during system call filtering as read access is lockless.
Updating the pointer from another task, however, opens up race
conditions. To allow cross-thread filter pointer updates, writes to the
seccomp fields are now protected by the sighand spinlock (which is shared
by all threads in the thread group). Read access remains lockless because
pointer updates themselves are atomic. However, writes (or cloning)
often entail additional checking (like maximum instruction counts)
which require locking to perform safely.
In the case of cloning threads, the child is invisible to the system
until it enters the task list. To make sure a child can't be cloned from
a thread and left in a prior state, seccomp duplication is additionally
moved under the sighand lock. Then parent and child are certain to have
the same seccomp state when they exit the lock.
Based on patches by Will Drewry and David Drysdale.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Conflicts:
kernel/fork.c
Kees Cook [Fri, 27 Jun 2014 22:16:33 +0000 (15:16 -0700)]
seccomp: split filter prep from check and apply
In preparation for adding seccomp locking, move filter creation away
from where it is checked and applied. This will allow for locking where
no memory allocation is happening. The validation, filter attachment,
and seccomp mode setting can all happen under the future locks.
For extreme defensiveness, I've added a BUG_ON check for the calculated
size of the buffer allocation in case BPF_MAXINSN ever changes, which
shouldn't ever happen. The compiler should actually optimize out this
check since the test above it makes it impossible.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Conflicts:
kernel/seccomp.c
Kees Cook [Wed, 21 May 2014 22:23:46 +0000 (15:23 -0700)]
sched: move no_new_privs into new atomic flags
Since seccomp transitions between threads requires updates to the
no_new_privs flag to be atomic, the flag must be part of an atomic flag
set. This moves the nnp flag into a separate task field, and introduces
accessors.
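
A userspace sketch of the idea, using C11 atomics in place of the kernel's bit operations. The struct, the bit index, and the accessor names are hypothetical stand-ins, not the names introduced by this patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define FLAG_NO_NEW_PRIVS_BIT 0	/* hypothetical bit index */

/* Hypothetical stand-in for the per-task atomic flag word. */
struct task_model {
	atomic_ulong atomic_flags;	/* only touched with atomic ops */
};

/* Set the flag; safe even when called from another thread. */
void task_model_set_nnp(struct task_model *t)
{
	atomic_fetch_or(&t->atomic_flags, 1UL << FLAG_NO_NEW_PRIVS_BIT);
}

/* Read the flag without taking any lock. */
bool task_model_nnp(struct task_model *t)
{
	return atomic_load(&t->atomic_flags) & (1UL << FLAG_NO_NEW_PRIVS_BIT);
}
```

Keeping the flag in its own atomically updated word means a writer on another thread cannot clobber neighbouring flags through a non-atomic read-modify-write.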
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Conflicts:
kernel/sys.c
Kees Cook [Wed, 25 Jun 2014 23:08:24 +0000 (16:08 -0700)]
seccomp: add "seccomp" syscall
This adds the new "seccomp" syscall with both an "operation" and "flags"
parameter for future expansion. The third argument is a pointer value,
used with the SECCOMP_SET_MODE_FILTER operation. Currently, flags must
be 0. This is functionally equivalent to prctl(PR_SET_SECCOMP, ...).
In addition to the TSYNC flag later in this patch series, there is a
non-zero chance that this syscall could be used for configuring a fixed
argument area for seccomp-tracer-aware processes to pass syscall arguments
in the future. Hence, the use of "seccomp" not simply "seccomp_add_filter"
for this syscall. Additionally, this syscall uses operation, flags,
and user pointer for arguments because strictly passing arguments via
a user pointer would mean seccomp itself would be unable to trivially
filter the seccomp syscall itself.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Conflicts:
arch/x86/syscalls/syscall_32.tbl
arch/x86/syscalls/syscall_64.tbl
include/uapi/asm-generic/unistd.h
kernel/seccomp.c
And fixup of unistd32.h to truly enable sys_seccomp.
Change-Id: I95bea02382c52007d22e5e9dc563c7d055c2c83f
Kees Cook [Wed, 25 Jun 2014 22:55:25 +0000 (15:55 -0700)]
seccomp: split mode setting routines
Separates the two mode setting paths to make things more readable with
fewer #ifdefs within function bodies.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Kees Cook [Wed, 25 Jun 2014 22:38:02 +0000 (15:38 -0700)]
seccomp: extract check/assign mode helpers
To support splitting mode 1 from mode 2, extract the mode checking and
assignment logic into common functions.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Kees Cook [Wed, 21 May 2014 22:02:11 +0000 (15:02 -0700)]
seccomp: create internal mode-setting function
In preparation for having other callers of the seccomp mode setting
logic, split the prctl entry point away from the core logic that performs
seccomp mode setting.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Oleg Nesterov [Tue, 21 Jan 2014 23:49:56 +0000 (15:49 -0800)]
introduce for_each_thread() to replace the buggy while_each_thread()
while_each_thread() and next_thread() should die, almost every lockless
usage is wrong.
1. Unless g == current, the lockless while_each_thread() is not safe.
while_each_thread(g, t) can loop forever if g exits, next_thread()
can't reach the unhashed thread in this case. Note that this can
happen even if g is the group leader, it can exec.
2. Even if while_each_thread() itself was correct, people often use
it wrongly.
It was never safe to just take rcu_read_lock() and loop unless
you verify that pid_alive(g) == T, even the first next_thread()
can point to the already freed/reused memory.
This patch adds signal_struct->thread_head and task->thread_node to
create the normal rcu-safe list with the stable head. The new
for_each_thread(g, t) helper is always safe under rcu_read_lock() as
long as this task_struct can't go away.
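
The difference between the two iteration styles can be modelled in userspace. This is only a sketch under simplifying assumptions: plain singly linked pointers, no RCU, and hypothetical names throughout. The step bound exists so the model can report a walk that would otherwise never terminate.

```c
#include <stdbool.h>
#include <stddef.h>

struct node {
	struct node *next;
};

/*
 * Old style: walk a headless ring starting at g until we are back
 * at g. max_steps bounds the walk so non-termination is detectable.
 */
bool ring_walk_returns(const struct node *g, size_t max_steps)
{
	const struct node *t = g;

	for (size_t i = 0; i < max_steps; i++) {
		t = t->next;
		if (t == g)
			return true;
	}
	return false;	/* never came back to g */
}

/* Unlink victim; it keeps its stale next pointer, as an exited task would. */
void ring_unlink(struct node *prev, struct node *victim)
{
	prev->next = victim->next;
}

/* New style: walk from a stable head that is never unlinked. */
size_t head_walk_count(const struct node *head)
{
	size_t n = 0;

	for (const struct node *t = head->next; t != head; t = t->next)
		n++;
	return n;
}
```

Once the starting node of the headless ring is unlinked, the walk can no longer reach it and only the step bound stops the loop. The walk from the stable head still terminates and simply sees one node fewer.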
Note: of course it is ugly to have both task_struct->thread_node and the
old task_struct->thread_group, we will kill it later, after we change
the users of while_each_thread() to use for_each_thread().
Perhaps we can kill it even before we convert all users, we can
reimplement next_thread(t) using the new thread_head/thread_node. But
we can't do this right now because this will lead to subtle behavioural
changes. For example, do/while_each_thread() always sees at least one
task, while for_each_thread() can do nothing if the whole thread group
has died. Or thread_group_empty(), currently its semantics is not clear
unless thread_group_leader(p) and we need to audit the callers before we
can change it.
So this patch adds the new interface which has to coexist with the old
one for some time, hopefully the next changes will be more or less
straightforward and the old one will go away soon.
Bug 200004307
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Sergey Dyasly <dserrg@gmail.com>
Tested-by: Sergey Dyasly <dserrg@gmail.com>
Reviewed-by: Sameer Nanda <snanda@chromium.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: "Ma, Xindong" <xindong.ma@intel.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: "Tu, Xiaobing" <xiaobing.tu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 0c740d0afc3bff0a097ad03a1c8df92757516f5c)
Signed-off-by: Sri Krishna chowdary <schowdary@nvidia.com>
Change-Id: Id689cb1383ceba2561b66188d88258619b68f5c6
Reviewed-on: http://git-master/r/419041
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Peter Zijlstra [Wed, 6 Nov 2013 13:57:36 +0000 (14:57 +0100)]
arch: Introduce smp_load_acquire(), smp_store_release()
A number of situations currently require the heavyweight smp_mb(),
even though there is no need to order prior stores against later
loads. Many architectures have much cheaper ways to handle these
situations, but the Linux kernel currently has no portable way
to make use of them.
This commit therefore supplies smp_load_acquire() and
smp_store_release() to remedy this situation. The new
smp_load_acquire() primitive orders the specified load against
any subsequent reads or writes, while the new smp_store_release()
primitive orders the specified store against any prior reads or
writes. These primitives allow array-based circular FIFOs to be
implemented without an smp_mb(), and also allow a theoretical
hole in rcu_assign_pointer() to be closed at no additional
expense on most architectures.
In addition, the RCU experience transitioning from explicit
smp_read_barrier_depends() and smp_wmb() to rcu_dereference()
and rcu_assign_pointer(), respectively resulted in substantial
improvements in readability. It therefore seems likely that
replacing other explicit barriers with smp_load_acquire() and
smp_store_release() will provide similar benefits. It appears
that roughly half of the explicit barriers in core kernel code
might be so replaced.
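
As a userspace analogy, the array-based circular FIFO mentioned above can be sketched with C11 acquire loads and release stores, which play the role of smp_load_acquire() and smp_store_release(). The sketch assumes exactly one producer and one consumer; the type and function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define FIFO_SIZE 8	/* illustrative capacity, a power of two */

struct fifo {
	int slots[FIFO_SIZE];
	atomic_size_t head;	/* written by the producer only */
	atomic_size_t tail;	/* written by the consumer only */
};

bool fifo_push(struct fifo *f, int value)
{
	size_t head = atomic_load_explicit(&f->head, memory_order_relaxed);
	/* Acquire pairs with the consumer's release store to tail. */
	size_t tail = atomic_load_explicit(&f->tail, memory_order_acquire);

	if (head - tail == FIFO_SIZE)
		return false;	/* full */
	f->slots[head % FIFO_SIZE] = value;
	/* Release: the slot write is visible before the new head is. */
	atomic_store_explicit(&f->head, head + 1, memory_order_release);
	return true;
}

bool fifo_pop(struct fifo *f, int *value)
{
	size_t tail = atomic_load_explicit(&f->tail, memory_order_relaxed);
	/* Acquire pairs with the producer's release store to head. */
	size_t head = atomic_load_explicit(&f->head, memory_order_acquire);

	if (head == tail)
		return false;	/* empty */
	*value = f->slots[tail % FIFO_SIZE];
	/* Release: the slot read completes before the slot is reused. */
	atomic_store_explicit(&f->tail, tail + 1, memory_order_release);
	return true;
}
```

Neither path needs a full barrier: each side only has to order its slot access against its own index update, which is what the acquire/release pair provides.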
[Changelog by PaulMck]
(cherry picked from commit 47933ad41a86a4a9b50bed7c9b9bd2ba242aac63)
Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/20131213150640.908486364@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
AKASHI Takahiro [Thu, 4 Sep 2014 15:01:08 +0000 (16:01 +0100)]
arm64: add seccomp support
Note: This patch is from v6 of Takahiro's proposed
"arm64: add seccomp support" patchset (leecam@google.com)
secure_computing() is called first in syscall_trace_enter() so that a system
call will be aborted quickly without doing succeeding syscall tracing,
contrary to other cases, if seccomp rules deny that system call.
On compat task, syscall numbers for system calls allowed in seccomp mode 1
are different from those on normal tasks, and so __NR_seccomp_xxx_32's need
to be redefined.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Conflicts:
arch/arm64/Kconfig
arch/arm64/kernel/entry.S
Change-Id: I5ec44507d7e536df7ec9d62d30a418c26ef15100
Eric Paris [Tue, 11 Mar 2014 16:48:43 +0000 (12:48 -0400)]
syscall_get_arch: remove useless function arguments
Every caller of syscall_get_arch() uses current for the task and no
implementors of the function need args. So just get rid of both of
those things. Admittedly, since these are inline functions we aren't
wasting stack space, but it just makes the prototypes better.
Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@linux-mips.org
Cc: linux390@de.ibm.com
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Conflicts:
arch/mips/include/asm/syscall.h
arch/mips/kernel/ptrace.c
AKASHI Takahiro [Thu, 4 Sep 2014 14:48:01 +0000 (15:48 +0100)]
arm64: add SIGSYS siginfo for compat task
Note: This patch is from v6 of Takahiro's proposed
"arm64: add seccomp support" patchset (leecam@google.com)
SIGSYS is primarily used in secure computing to notify the tracer.
This patch allows a signal handler on a compat task to get correct information
with SA_SIGINFO specified when this signal is delivered.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
AKASHI Takahiro [Thu, 4 Sep 2014 14:39:13 +0000 (15:39 +0100)]
add seccomp syscall for compat task
Note: This patch is from v6 of Takahiro's proposed
"arm64: add seccomp support" patchset (leecam@google.com)
This patch allows compat task to issue seccomp() system call.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Conflicts:
arch/arm64/include/asm/unistd32.h
Change-Id: I63d38f68da72b3333327256b4cacba2c3ddb39fc
AKASHI Takahiro [Thu, 4 Sep 2014 14:34:14 +0000 (15:34 +0100)]
asm-generic: add generic seccomp.h for secure computing mode 1
Note: This patch is from v6 of Takahiro's proposed
"arm64: add seccomp support" patchset (leecam@google.com)
Those values (__NR_seccomp_*) are used solely in secure_computing()
to identify mode 1 system calls. If compat system calls have different
syscall numbers, asm/seccomp.h may override them.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
AKASHI Takahiro [Thu, 4 Sep 2014 14:20:53 +0000 (15:20 +0100)]
arm64: ptrace: allow tracer to skip a system call
Note: This patch is from v6 of Takahiro's proposed
"arm64: add seccomp support" patchset (leecam@google.com)
If tracer specifies -1 as a syscall number, this traced system call should
be skipped with a value in x0 used as a return value.
This patch enables this semantics, but there is a restriction here:
when syscall(-1) is issued by user, tracer cannot skip this system call
and modify a return value at syscall entry.
In order to ease this restriction, we need to treat whatever value in x0 as
a return value, but this might result in a bogus value being returned,
especially when tracer doesn't do anything at this syscall.
So we always return ENOSYS instead, while we have another chance to change
a return value at syscall exit.
Please also note:
* syscall entry tracing and syscall exit tracing (ftrace tracepoint and
audit) are always executed, if enabled, even when skipping a system call
(that is, -1).
In this way, we can avoid a potential bug where audit_syscall_entry()
might be called without audit_syscall_exit() at the previous system call
being called, which would cause an oops in audit_syscall_entry().
* syscallno may also be set to -1 if a fatal signal (SIGKILL) is detected
in tracehook_report_syscall_entry(), but since a value set to x0 (ENOSYS)
is not used in this case, we may neglect the case.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Conflicts:
arch/arm64/kernel/entry.S
Change-Id: Ifcdcdbcb7c8cf97e5b5f1086a1ea4107e1d4f9a8
AKASHI Takahiro [Thu, 4 Sep 2014 13:54:29 +0000 (14:54 +0100)]
arm64: ptrace: add PTRACE_SET_SYSCALL
Note: This patch is from v6 of Takahiro's proposed
"arm64: add seccomp support" patchset (leecam@google.com)
To allow tracer to be able to change/skip a system call by re-writing
a syscall number, there are several approaches:
(1) modify x8 register with ptrace(PTRACE_SETREGSET), and handle this case
later on in syscall_trace_enter(), or
(2) support ptrace(PTRACE_SET_SYSCALL) as on arm
Thinking of the fact that user_pt_regs doesn't expose 'syscallno' to
tracer as well as that secure_computing() expects a changed syscall number
to be visible, especially case of -1, before this function returns in
syscall_trace_enter(), we'd better take (2).
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Badhri Jagan Sridharan [Thu, 25 Sep 2014 02:36:33 +0000 (19:36 -0700)]
USB: f_rndis: fix compile error
Change-Id: Ied5dd8ef905bdf84d176a5e560b09e292b68fbc5
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
xerox_lin [Thu, 4 Sep 2014 08:01:59 +0000 (16:01 +0800)]
USB: gadget: rndis: Add module parameter for DL max packets per xfer
Currently DL aggregation is supported in RNDIS driver and is set to
3 by default. And there is no support to change downlink maximum
packets per transfer at runtime through module parameter. Hence add
module parameter for DL maximum packets per transfer to change it at
runtime.
echo 6 > /sys/module/g_android/parameters/rndis_dl_max_pkt_per_xfer
To disable DL aggregation during runtime,
echo 1 > /sys/module/g_android/parameters/rndis_dl_max_pkt_per_xfer
Change-Id: I3a1d0bc97358e2b6f233df7ae8725fb507de50db
Signed-off-by: Xerox Lin <xerox_lin@htc.com>
Signed-off-by: Vijayavardhan Vennapusa <vvreddy@codeaurora.org>
Badhri Jagan Sridharan [Thu, 18 Sep 2014 17:48:48 +0000 (10:48 -0700)]
ndis: Add debug support to disable RNDIS Multipacket Feature
This change adds module param which allows to disable RNDIS
Multi-packet Feature (Aggregation support in Downlink path)
as this feature is enabled by default.
To disable use this param before moving to RNDIS Composition:
echo 1 > /sys/module/g_android/parameters/rndis_multipacket_dl_disable
Also counts errors as Rx errors if received RNDIS packets are
not following RNDIS message format as those packets are being
discarded.
Change-Id: I764430da78f2204af92e14bb279c11b24c7e4c67
Signed-off-by: Mayank Rana <mrana@codeaurora.org>
Badhri Jagan Sridharan [Thu, 18 Sep 2014 17:46:08 +0000 (10:46 -0700)]
RNDIS: Add Data aggregation (multi packet) support
Add data aggregation support using RNDIS Multi Packet feature
to achieve better UDP Downlink throughput. Max 3 RNDIS Packets
aggregated into one RNDIS Packet with this implementation.
With this change, seeing UDP Downlink throughput increase
from 90 Mbps to above 100 Mbps when using Iperf and sending
data more than 100 Mbps.
Change-Id: I21c39482718944bb1b1068bdd02f626531e58f08
Signed-off-by: Mayank Rana <mrana@codeaurora.org>
Signed-off-by: Rajkumar Raghupathy <raghup@codeaurora.org>
Badhri Jagan Sridharan [Thu, 18 Sep 2014 17:42:41 +0000 (10:42 -0700)]
USB: gadget: u_ether: Fix data stall issue in RNDIS tethering mode
For a dual-speed gadget, with the current number of requests (10), there is
a possible corner case where all 10 requests are queued
to HW without the IOC bit set, which could lead to a data stall in
RNDIS tethering and RNDIS local networking.
With this patch, a counter is incremented before queueing each request to
HW and the IOC bit is set for every nth request, which avoids the corner case
of all requests being queued to HW without the IOC bit set.
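
The counting scheme can be sketched as follows. This is a simplified model with hypothetical names, not the driver code; it only shows the "count first, then interrupt on every nth request" decision.

```c
#include <stdbool.h>

/* Hypothetical model of the transmit path state. */
struct tx_model {
	unsigned int queued;	/* requests counted so far */
	unsigned int interval;	/* ask for an interrupt every nth request */
};

/* Count the request first, then decide whether it should set IOC. */
bool tx_model_wants_irq(struct tx_model *tx)
{
	tx->queued++;
	return tx->queued % tx->interval == 0;
}
```

Because the counter is bumped before the decision, any run of interval consecutive requests contains exactly one that asks for a completion interrupt, so a full queue can never be entirely silent as long as interval does not exceed the queue depth.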
Change-Id: I26515bfd9bbc8f7af38be7835692143f7093118a
Signed-off-by: Vijayavardhan Vennapusa <vvreddy@codeaurora.org>
taeju.park [Fri, 14 Sep 2012 05:09:03 +0000 (14:09 +0900)]
usb: gadget: prevent change of Host MAC address of 'usb0' interface
On the Windows 7 platform, a previously allocated IP address is retained.
However, the host MAC address of the 'usb0' interface changes when the
tethering driver is re-enumerated, so the tethering network driver
cannot obtain an IP address from DHCP. This delays the connection
between host and phone for USB tethering.
This patch prevents the host MAC address of the 'usb0' interface from changing.
In other words, it keeps the host MAC address allocated when the tethering
driver was first enumerated, even if the driver is re-enumerated. However,
after a reboot, the host MAC address can change.
Change-Id: I43add9925e9d6d90c56cffbd3ed999104448f818
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Badhri Jagan Sridharan [Thu, 25 Sep 2014 01:58:23 +0000 (18:58 -0700)]
usb: u_ether: Add workqueue as bottom half handler for rx data path
u_ether driver passes rx data to network layer and resubmits the
request back to usb hardware in interrupt context. Network layer
processes rx data by scheduling tasklet. For high throughput
scenarios on rx data path driver is spending lot of time in interrupt
context due to rx data processing by tasklet and continuous completion
and re-submission of the usb requests which results in watchdog bark.
Hence move the rx data processing and usb request submission to a
workqueue bottom half handler.
Change-Id: I316de8e267997137ac189a8b7b2846fa325f4a5a
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
JP Abgrall [Thu, 18 Sep 2014 02:26:43 +0000 (19:26 -0700)]
arm64: Fixup __NR_* compat syscalls count.
Should have gone in the cherry-pick
cfc7e99e9e3900056028a7d90072e9ea0d886f8d
arm64: Add __NR_* definitions for compat syscalls
Change-Id: I69a69e4b1f206aad4ece1a8b06f9e23e99adcbfb
AKASHI Takahiro [Wed, 30 Apr 2014 09:51:32 +0000 (10:51 +0100)]
arm64: is_compat_task is defined both in asm/compat.h and linux/compat.h
Some kernel files may include both linux/compat.h and asm/compat.h directly
or indirectly. Since both header files contain is_compat_task() under
!CONFIG_COMPAT, compiling them with !CONFIG_COMPAT will eventually fail.
Such files include kernel/auditsc.c, kernel/seccomp.c and init/do_mountfs.c
(do_mountfs.c may read asm/compat.h via asm/ftrace.h once ftrace is
implemented).
So this patch proactively
1) removes is_compat_task() under !CONFIG_COMPAT from asm/compat.h
2) replaces asm/compat.h with linux/compat.h in kernel/*.c,
but asm/compat.h is still necessary in ptrace.c and process.c because
they use is_compat_thread().
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Conflicts:
arch/arm64/kernel/hw_breakpoint.c
arch/arm64/kernel/ptrace.c
Change-Id: I5b8330e43ab8bdd383cd410d8223d6c1a39fa0fc
AKASHI Takahiro [Wed, 30 Apr 2014 09:51:31 +0000 (10:51 +0100)]
arm64: Add regs_return_value() in syscall.h
This macro, regs_return_value, is used mainly for audit to record system
call's results, but may also be used in test_kprobes.c.
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
JP Abgrall [Thu, 18 Sep 2014 01:18:11 +0000 (18:18 -0700)]
arm64: audit: Add audit hook in syscall_trace_enter/exit()
This patch adds auditing functions on entry to or exit from
every system call invocation.
Acked-by: Richard Guy Briggs <rgb@redhat.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Conflicts:
arch/arm64/kernel/ptrace.c
Change-Id: I7ebff5df4acbdab56c74e584dbc5fef5d8bfc9a8
AKASHI Takahiro [Fri, 4 Jul 2014 07:28:30 +0000 (08:28 +0100)]
arm64: Add audit support
On AArch64, audit is supported through generic lib/audit.c and
compat_audit.c, and so this patch adds arch specific definitions required.
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Conflicts:
arch/arm64/Kconfig
include/uapi/linux/audit.h
Change-Id: Ia6d7b25786843d43191e67d514928e3ecba11e2f
Dan Aloni [Wed, 28 Aug 2013 13:24:53 +0000 (14:24 +0100)]
Move the EM_ARM and EM_AARCH64 definitions to uapi/linux/elf-em.h
Signed-off-by: Dan Aloni <alonid@stratoscale.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
JP Abgrall [Thu, 18 Sep 2014 00:59:28 +0000 (17:59 -0700)]
arm64: Add __NR_* definitions for compat syscalls
This patch adds __NR_* definitions to asm/unistd32.h, moves the
__NR_compat_* definitions to asm/unistd.h and removes all the explicit
unistd32.h includes apart from the one building the compat syscall
table. The aim is to have the compat __NR_* definitions available but
without colliding with the native syscall definitions (required by
lib/compat_audit.c to avoid duplicating the audit header files between
native and compat).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Conflicts:
arch/arm64/include/asm/unistd32.h
arch/arm64/kernel/kuser32.S
Change-Id: I8776881b5beb39769aadc4c4f14a51ea54325112
AKASHI Takahiro [Wed, 30 Apr 2014 09:51:30 +0000 (10:51 +0100)]
arm64: split syscall_trace() into separate functions for enter/exit
As done in arm, this change makes it easy to confirm we invoke syscall
related hooks, including syscall tracepoint, audit and seccomp which would
be implemented later, in correct order. That is, undoing operations in the
opposite order on exit that they were done on entry.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
AKASHI Takahiro [Wed, 30 Apr 2014 09:51:29 +0000 (10:51 +0100)]
arm64: make a single hook to syscall_trace() for all syscall features
Currently syscall_trace() is called only for ptrace.
With additional TIF_xx flags defined, it is now called in all the cases
of audit, ftrace and seccomp in addition to ptrace.
Acked-by: Richard Guy Briggs <rgb@redhat.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Conflicts:
arch/arm64/include/asm/thread_info.h
arch/arm64/kernel/entry.S
Change-Id: Iee71c44c45b363194a1cc7182906c0afa6b5348b
JP Abgrall [Wed, 17 Sep 2014 22:01:45 +0000 (15:01 -0700)]
seccomp: revert previous patches in prep for updated ones
This reverts the seccomp related patches committed around 2014-08-27.
This allows for a cleaner cherry-pick of newly landed upstream patches.
f56b1aa arm: fixup NR_syscalls to accommodate the new seccomp syscall
81ff7fa seccomp: implement SECCOMP_FILTER_FLAG_TSYNC
d924727 seccomp: allow mode setting across threads
743266a seccomp: introduce writer locking
3497a88 seccomp: split filter prep from check and apply
2c6d7de MIPS: add seccomp syscall
83f1ccba ARM: add seccomp syscall
a75a29b seccomp: add "seccomp" syscall
1a63bce seccomp: split mode setting routines
c208e4e seccomp: extract check/assign mode helpers
6862b01 seccomp: create internal mode-setting function
1ba2ccb MAINTAINERS: create seccomp entry
c2da3eb seccomp: fix memory leak on filter attach
945a225 ARM: 7888/1: seccomp: not compatible with ARM OABI
Change-Id: I3f129263d68a7b3c206d79f84f7f9908d13064f6
Signed-off-by: JP Abgrall <jpa@google.com>
Catalin Marinas [Mon, 2 Sep 2013 15:33:54 +0000 (16:33 +0100)]
arm64: Remove unused cpu_name ascii in arch/arm64/mm/proc.S
This string has been moved to arch/arm64/kernel/cputable.c.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ard Biesheuvel [Mon, 16 Dec 2013 21:04:35 +0000 (21:04 +0000)]
arm64: drop redundant macros from read_cpuid()
asm/cputype.h contains a bunch of #defines for CPU id registers
that essentially map to themselves. Remove the #defines and pass
the tokens directly to the inline asm() that reads the registers.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Greg Hackmann [Mon, 15 Sep 2014 22:34:55 +0000 (15:34 -0700)]
android: base-cfg: enable ARMV7_COMPAT
Enables backwards-compatibility features on arm64, and has no effect
(does not exist) on other architectures
Change-Id: I6fc2f6567437750a0032f8a39a9cde1fb92d4ef4
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Greg Hackmann [Tue, 5 Aug 2014 23:14:27 +0000 (16:14 -0700)]
arm64: restrict effects of ARMV7_COMPAT_CPUINFO to ARMv7 tasks
Since ARMV7_COMPAT_CPUINFO only exists to support existing ARMv7
binaries, restrict its effects to compat tasks
Bug: 16819658
Change-Id: I1092de596c7822d23f5f3f8a05b417a3cb49f593
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Alex Van Brunt [Thu, 20 Feb 2014 18:46:21 +0000 (10:46 -0800)]
arm64: report vfpv3 instead of vfpv3d16
vfpv3 is the correct version for an ARMv8 processor and it is the
version reported by an A15.
Change-Id: I486f3af21a352c27775888cca332a48d7e0c59ce
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/370076
Alex Van Brunt [Thu, 9 Jan 2014 20:51:05 +0000 (12:51 -0800)]
arm64: cpuinfo: ARMv7 compatible cpuinfo option
To be backwards compatible with the output of cpuinfo on an ARMv7,
print the features that were optional in ARMv7 but are required in
ARMv8.
Change-Id: Ic728f71be4a971adc79ef552f25cfbf95a4dac29
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/366095
Reviewed-by: Richard Wiley <rwiley@nvidia.com>
Tested-by: Oskari Jaaskelainen <oskarij@nvidia.com>
Rich Wiley [Wed, 4 Jun 2014 18:44:03 +0000 (11:44 -0700)]
arm64: enable deprecated SETEND instruction in SCTLR compat config
Change-Id: I703d4843f8aab2ec63324f04cc13aaabae88e163
Signed-off-by: Rich Wiley <rwiley@nvidia.com>
Reviewed-on: http://git-master/r/422174
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
Rich Wiley [Wed, 4 Jun 2014 18:41:53 +0000 (11:41 -0700)]
arm64: make SCTLR compat config depend on CONFIG_ARMV7_COMPAT
Conflicts:
arch/arm64/mm/proc.S
Change-Id: I76e0067839c96e3082b42c80d3fc670cf3d371b5
Signed-off-by: Rich Wiley <rwiley@nvidia.com>
Reviewed-on: http://git-master/r/422173
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
Alex Van Brunt [Tue, 28 Jan 2014 20:40:10 +0000 (12:40 -0800)]
arm64: optionally set CP15BEN in SCTLR
Setting CP15BEN allows legacy applications running in AArch32 mode
that use CP15 DMB and similar instructions to continue running.
Change-Id: If76d3c6ee12865ff8c4b4e7aed01146bead87773
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/366096
Reviewed-by: Richard Wiley <rwiley@nvidia.com>
Tested-by: Oskari Jaaskelainen <oskarij@nvidia.com>
Rich Wiley [Mon, 10 Mar 2014 21:01:06 +0000 (14:01 -0700)]
arm64: fix SWP instruction emulation
initial variable values may get overwritten
if they're listed as an output in ASM, even if
they're not explicitly written to.
Change-Id: I2a239e1819850a2a7005a46e83d82deac4ca303b
Signed-off-by: Rich Wiley <rwiley@nvidia.com>
Reviewed-on: http://git-master/r/379646
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Li Li (SW-TEGRA) <lli5@nvidia.com>
Tested-by: Li Li (SW-TEGRA) <lli5@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Alex Van Brunt [Fri, 21 Feb 2014 02:18:53 +0000 (18:18 -0800)]
arm64: add fault handling to SWP emulation
Add exception table and fixup for SWP/SWPB instruction emulation.
This prevents the kernel from panicking when emulating a SWP/SWPB
instruction that accesses unmapped memory.
Change-Id: I4a9ca34fa161a0f306cdb663827d9bee39cec733
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/370278
Alex Van Brunt [Wed, 19 Feb 2014 01:50:57 +0000 (17:50 -0800)]
arm64: fix a warning and a typo in SWP emulation
The store-release-exclusive is missing the "L" that makes it a
release rather than a normal store-exclusive.
Remove a variable that is not used and causes a compiler warning.
Change-Id: I91633a352b805ed9af450b632c9ee394235637c4
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/369076
Reviewed-by: Richard Wiley <rwiley@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Alex Van Brunt [Thu, 30 Jan 2014 23:10:39 +0000 (15:10 -0800)]
arm64: emulate the swp/swpb instruction
The swp and swpb instructions were deprecated in ARMv6. ARMv8
obsoleted the instructions. Despite this, many applications rely on
these instructions.
This patch starts with the version present in the arm architecture.
However, it uses the ldx*()/stx*() functions to implement the handler
in C code. It also removes a lot of code that is not needed.
Change-Id: I6882fbe5f71bfa8f9e9a75d067b2111188c6f2fa
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/366097
Reviewed-by: Richard Wiley <rwiley@nvidia.com>
Tested-by: Oskari Jaaskelainen <oskarij@nvidia.com>
Conflicts:
arch/arm64/Kconfig
arch/arm64/kernel/Makefile
Alex Van Brunt [Tue, 11 Feb 2014 18:08:51 +0000 (10:08 -0800)]
arm64: a backwards compatible config option
Create a config option that when selected configures the kernel to be
as backwards compatible with kernels that ran on an ARMv7 processor
as possible.
Change-Id: I7cd67e6d4174335f9a67aba2a39dfd993f240c27
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/366094
Reviewed-by: Richard Wiley <rwiley@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Tested-by: Oskari Jaaskelainen <oskarij@nvidia.com>
Peng Du [Wed, 23 Jul 2014 18:40:33 +0000 (11:40 -0700)]
arm64: kernel: check mode for get_user in undefinstr
get_user() should be called only for user_mode undef instruction.
Change-Id: Ia654783de0cf72abac6847ac9630236f9f0d6ebb
Signed-off-by: Peng Du <pdu@nvidia.com>
Reviewed-on: http://git-master/r/441348
Reviewed-by: Thomas Cherry <tcherry@nvidia.com>
Reviewed-by: Bo Yan <byan@nvidia.com>
Alex Van Brunt [Wed, 29 Jan 2014 21:41:01 +0000 (13:41 -0800)]
arm64: add undefined instruction handler hooks
Add undefined instruction handler hooks similar to the system in the
arm architecture. One difference is that hooks can only be added at
boot time and they can never be removed. This removes the need for
the spinlock in the handler.
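The append-only hook table described above can be sketched in plain C
as follows. This is a minimal userspace model, not the code from this
patch: the struct layout, the fixed-size array, and every name ending
in _sketch are assumptions made for illustration. The SWP mask/value
pair is the one I recall from the arm swp emulation and should be
treated as an assumption as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook descriptor: match (instr & instr_mask) == instr_val. */
struct undef_hook_sketch {
	uint32_t instr_mask;
	uint32_t instr_val;
	int (*fn)(uint32_t instr);	/* returns 0 if the instruction was handled */
};

#define MAX_HOOKS_SKETCH 8

static struct undef_hook_sketch hooks_sketch[MAX_HOOKS_SKETCH];
static size_t nr_hooks_sketch;

/*
 * Intended to be called at boot only. Entries are appended and never
 * removed, which is why the lookup below can walk the table without a lock.
 */
static int register_undef_hook_sketch(struct undef_hook_sketch hook)
{
	assert(hook.fn != NULL);
	if (nr_hooks_sketch == MAX_HOOKS_SKETCH)
		return -1;
	hooks_sketch[nr_hooks_sketch++] = hook;
	return 0;
}

/* Dispatch to the first matching hook; 1 means "no hook claimed it". */
static int call_undef_hook_sketch(uint32_t instr)
{
	for (size_t i = 0; i < nr_hooks_sketch; i++)
		if ((instr & hooks_sketch[i].instr_mask) == hooks_sketch[i].instr_val)
			return hooks_sketch[i].fn(instr);
	return 1;
}

/* Placeholder handler standing in for an emulation routine. */
static int swp_handler_sketch(uint32_t instr)
{
	(void)instr;
	return 0;
}
```

A caller would register { 0x0fb00ff0, 0x01000090, swp_handler_sketch }
once during init and then pass every undefined A32 opcode through
call_undef_hook_sketch().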
Change-Id: I4684937f5209ca2a64ee63947bb2ab6411ae14f7
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/361736
Reviewed-on: http://git-master/r/365059
Reviewed-by: Richard Wiley <rwiley@nvidia.com>
Tested-by: Oskari Jaaskelainen <oskarij@nvidia.com>
Alex Van Brunt [Wed, 29 Jan 2014 21:45:20 +0000 (13:45 -0800)]
arm64: ptrace: add is_wide_instruction() macro
Add the is_wide_instruction() macro. This was copied from the arm
architecture.
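For readers unfamiliar with the macro, a small sketch of how it is
typically used follows. The macro body is written from memory of the
arm tree and should be treated as an assumption; count_thumb_insns()
is a hypothetical helper that exists only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * A Thumb instruction is 32 bits wide when the top five bits of its
 * first halfword are 0b11101, 0b11110 or 0b11111, i.e. >= 0xe800.
 */
#define is_wide_instruction(instr)	((unsigned)(instr) >= 0xe800)

/* Hypothetical helper: count instructions in a buffer of Thumb halfwords. */
static size_t count_thumb_insns(const uint16_t *hw, size_t n_halfwords)
{
	size_t i = 0, count = 0;

	assert(hw != NULL);
	while (i < n_halfwords) {
		/* A wide instruction consumes two halfwords, a narrow one only one. */
		i += is_wide_instruction(hw[i]) ? 2 : 1;
		count++;
	}
	return count;
}
```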
Change-Id: I28f83b47f5c587fe778dc2846df77673f8dd918b
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/361737
Reviewed-by: Peng Du <pdu@nvidia.com>
Reviewed-on: http://git-master/r/365060
Reviewed-by: Richard Wiley <rwiley@nvidia.com>
Tested-by: Oskari Jaaskelainen <oskarij@nvidia.com>
Alex Van Brunt [Thu, 30 Jan 2014 23:07:34 +0000 (15:07 -0800)]
arm64: copy conditional instruction tests from arm
Copy the code that is used to compute if a conditional instruction
would be executed.
This code is needed to support A32 instruction emulation in the
kernel.
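As a rough illustration of what such a test computes, the sketch below
evaluates an A32 condition field against the N, Z, C and V flags. It
is written directly from the architectural condition table and is not
the code copied by this patch; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Condition flags live in bits 31..28 of the saved program status. */
#define PSR_N	(1u << 31)
#define PSR_Z	(1u << 30)
#define PSR_C	(1u << 29)
#define PSR_V	(1u << 28)

/* Hypothetical helper: would condition code `cond` pass with flags `psr`? */
static bool a32_cond_passes(uint32_t cond, uint32_t psr)
{
	bool n = psr & PSR_N, z = psr & PSR_Z;
	bool c = psr & PSR_C, v = psr & PSR_V;
	bool result;

	assert(cond <= 0xf);

	switch (cond >> 1) {
	case 0: result = z; break;			/* EQ / NE */
	case 1: result = c; break;			/* CS / CC */
	case 2: result = n; break;			/* MI / PL */
	case 3: result = v; break;			/* VS / VC */
	case 4: result = c && !z; break;		/* HI / LS */
	case 5: result = (n == v); break;		/* GE / LT */
	case 6: result = !z && (n == v); break;		/* GT / LE */
	default: return true;				/* AL, and 0b1111 */
	}
	/* The low bit of the condition field inverts the base test. */
	return (cond & 1) ? !result : result;
}

/* The condition field is the top nibble of an A32 opcode. */
static bool a32_insn_would_execute(uint32_t opcode, uint32_t psr)
{
	return a32_cond_passes(opcode >> 28, psr);
}
```

An emulation handler would call this first and skip the instruction
when the condition fails.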
Change-Id: I0bab7537efd8cc317bd20995cd36961cf95165aa
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/362154
Reviewed-on: http://git-master/r/365061
Reviewed-by: Richard Wiley <rwiley@nvidia.com>
Tested-by: Oskari Jaaskelainen <oskarij@nvidia.com>
Will Deacon [Sat, 16 Mar 2013 08:48:13 +0000 (08:48 +0000)]
arm64: debug: consolidate software breakpoint handlers
The software breakpoint handlers are hooked in directly from ptrace,
which makes it difficult to add additional handlers for things like
kprobes and kgdb.
This patch moves the handling code into debug-monitors.c, where we can
dispatch to different debug subsystems more easily.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ard Biesheuvel [Mon, 3 Mar 2014 07:34:46 +0000 (07:34 +0000)]
arm64: advertise ARMv8 extensions to 32-bit compat ELF binaries
This adds support for advertising the presence of ARMv8 Crypto
Extensions in the AArch32 execution state to 32-bit ELF binaries
running in 32-bit compat mode under the arm64 kernel.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ard Biesheuvel [Mon, 3 Mar 2014 07:34:45 +0000 (07:34 +0000)]
arm64: add AT_HWCAP2 support for 32-bit compat
Add support for the ELF auxv entry AT_HWCAP2 when running 32-bit
ELF binaries in compat mode.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ard Biesheuvel [Mon, 3 Mar 2014 07:34:44 +0000 (07:34 +0000)]
binfmt_elf: add ELF_HWCAP2 to compat auxv entries
Add ELF_HWCAP2 to the set of auxv entries that is passed to
a 32-bit ELF program running in 32-bit compat mode under a
64-bit kernel.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Steve Capper [Mon, 16 Dec 2013 21:04:36 +0000 (21:04 +0000)]
arm64: Add hwcaps for crypto and CRC32 extensions.
Advertise the optional cryptographic and CRC32 instructions to
user space where present. Several hwcap bits [3-7] are allocated.
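From user space, the advertised bits can be read through the auxiliary
vector. The sketch below assumes glibc's getauxval() and assumes the
order AES, PMULL, SHA1, SHA2, CRC32 for bits 3-7; the message above
only gives the range, so check <asm/hwcap.h> on the target before
relying on these values. The SKETCH_ names and both helpers are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/auxv.h>	/* getauxval(), AT_HWCAP */

/* Assumed bit assignments within bits 3-7; verify against the uapi header. */
#define SKETCH_HWCAP_AES	(1UL << 3)
#define SKETCH_HWCAP_PMULL	(1UL << 4)
#define SKETCH_HWCAP_SHA1	(1UL << 5)
#define SKETCH_HWCAP_SHA2	(1UL << 6)
#define SKETCH_HWCAP_CRC32	(1UL << 7)

/* True if every bit in `mask` is set in `hwcap`. */
static bool hwcap_has(unsigned long hwcap, unsigned long mask)
{
	assert(mask != 0);
	return (hwcap & mask) == mask;
}

/* Query the running process's AT_HWCAP entry (meaningful on arm64 only). */
static bool cpu_advertises(unsigned long mask)
{
	return hwcap_has(getauxval(AT_HWCAP), mask);
}
```

A crypto library could call cpu_advertises(SKETCH_HWCAP_AES |
SKETCH_HWCAP_PMULL) before selecting an accelerated AES-GCM path.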
Signed-off-by: Steve Capper <steve.capper@linaro.org>
[bit 2 is taken now so use bits 3-7 instead]
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Steve Capper [Wed, 18 Sep 2013 15:14:28 +0000 (16:14 +0100)]
arm64: Widen hwcap to be 64 bit
Under arm64 elf_hwcap is a 32 bit quantity, but it is stored in
a 64 bit auxiliary ELF field and glibc reads hwcap as 64 bit.
This patch widens elf_hwcap to be 64 bit.
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Greg Hackmann [Tue, 9 Sep 2014 19:14:40 +0000 (12:14 -0700)]
arm64: add HWCAP_EVTSTRM and associated hwcap refactoring
Take the hwcaps changes from 46efe547aca8498d51b64460c02366ae4032ca32 to
facilitate cherry-picking later hwcaps changes, while skipping the timer
changes that actually enable event streams for now. The timer changes
depend on some non-trivial changes made after 3.10, and can safely be
dropped: the kernel will just continue reporting that HWCAP_EVTSTRM is
not available.
Bug: 17431179
Change-Id: I41548846f8cd7ae8147a2b115cc0f84708e29552
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Greg Hackmann [Wed, 10 Sep 2014 00:36:05 +0000 (17:36 -0700)]
arm64: process: dump memory around registers when displaying regs
A port of 8608d7c4418c75841c562a90cddd9beae5798a48 to ARM64. Both the
original code and this port are limited to dumping kernel addresses, so
don't bother if the registers are from a userspace process.
Change-Id: Idc76804c54efaaeb70311cbb500c54db6dac4525
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Todd Poynor [Sat, 6 Sep 2014 01:27:38 +0000 (18:27 -0700)]
cpufreq: interactive: make common_tunables static
From: Cylen Yao <cylen.yao@mediatek.com>
common_tunables should be static.
Change-Id: I502ee3062bece5082fea7861eff2f6237e25cede
Signed-off-by: Todd Poynor <toddpoynor@google.com>
JP Abgrall [Thu, 4 Sep 2014 00:36:44 +0000 (17:36 -0700)]
android: base-cfg: enforce the needed XFRM_MODE_TUNNEL (for VPN)
Change-Id: I587023d56877d32806079676790751155c768982
Signed-off-by: JP Abgrall <jpa@google.com>
Heiko Carstens [Wed, 2 Jul 2014 22:22:37 +0000 (15:22 -0700)]
fs/seq_file: fallback to vmalloc allocation
There are a couple of seq_files which use the single_open() interface.
This interface requires that the whole output must fit into a single
buffer.
E.g. for /proc/stat allocation failures have been observed because an
order-4 memory allocation failed due to memory fragmentation. In such
situations reading /proc/stat is not possible anymore.
Therefore change the seq_file code to fallback to vmalloc allocations
which will usually result in a couple of order-0 allocations and hence
also work if memory is fragmented.
For reference a call trace where reading from /proc/stat failed:
sadc: page allocation failure: order:4, mode:0x1040d0
CPU: 1 PID: 192063 Comm: sadc Not tainted 3.10.0-123.el7.s390x #1
[...]
Call Trace:
show_stack+0x6c/0xe8
warn_alloc_failed+0xd6/0x138
__alloc_pages_nodemask+0x9da/0xb68
__get_free_pages+0x2e/0x58
kmalloc_order_trace+0x44/0xc0
stat_open+0x5a/0xd8
proc_reg_open+0x8a/0x140
do_dentry_open+0x1bc/0x2c8
finish_open+0x46/0x60
do_last+0x382/0x10d0
path_openat+0xc8/0x4f8
do_filp_open+0x46/0xa8
do_sys_open+0x114/0x1f0
sysc_tracego+0x14/0x1a
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: David Rientjes <rientjes@google.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Thorsten Diehl <thorsten.diehl@de.ibm.com>
Cc: Andrea Righi <andrea@betterlinux.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Stefan Bader <stefan.bader@canonical.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:
fs/seq_file.c
Change-Id: I009080dd017b020ffd5e812e5b472bdb8349217a
Al Viro [Tue, 6 May 2014 18:02:53 +0000 (14:02 -0400)]
nick kvfree() from apparmor
too many places open-code it
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Conflicts:
mm/util.c
security/apparmor/include/apparmor.h
Change-Id: Ie8602e0199282dc462921cb7217158d1998853b0
Al Viro [Mon, 6 May 2013 02:10:35 +0000 (03:10 +0100)]
apparmor: no need to delay vfree()
vfree() can be called from interrupt contexts now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Leo Yan [Mon, 1 Sep 2014 03:09:51 +0000 (11:09 +0800)]
arm64: fix bug for reloading FPSIMD state after cpu power off
arm64 now defers reloading the FPSIMD state, but this optimization also
introduces a bug when a cpu resumes from a low power mode.
After the cpu has been powered off, software needs to set the cpu's
fpsimd_last_state to NULL to force a reload of the FPSIMD state for the
thread. Otherwise both conditions for skipping the reload can still
hold: the task's fpsimd_state.cpu field contains the id of the current
cpu, and the cpu's fpsimd_last_state per-cpu variable points to the
task's fpsimd_state. The kernel then skips reloading the context when
it returns to userland.
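The two-sided ownership check and the effect of the fix can be modelled
in a few lines of userspace C. Only fpsimd_last_state and the cpu field
of fpsimd_state come from the description above; the array-based
per-cpu storage and the three helper functions are hypothetical
stand-ins, not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS_SKETCH	4
#define NO_CPU_SKETCH	NR_CPUS_SKETCH	/* "not live on any cpu" marker */

struct fpsimd_state {
	unsigned int cpu;	/* cpu whose registers last held this state */
};

/* Per-cpu: the state most recently loaded into that cpu's registers. */
static struct fpsimd_state *fpsimd_last_state[NR_CPUS_SKETCH];

/* The reload may be skipped only if both sides agree on ownership. */
static bool needs_reload(const struct fpsimd_state *st, unsigned int cpu)
{
	assert(cpu < NR_CPUS_SKETCH);
	return !(st->cpu == cpu && fpsimd_last_state[cpu] == st);
}

/* Record that `cpu` now holds `st` in its registers. */
static void bind_state_to_cpu(struct fpsimd_state *st, unsigned int cpu)
{
	assert(cpu < NR_CPUS_SKETCH);
	fpsimd_last_state[cpu] = st;
	st->cpu = cpu;
}

/* What the fix does on the power-off path: forget the cached owner. */
static void cpu_lost_registers(unsigned int cpu)
{
	assert(cpu < NR_CPUS_SKETCH);
	fpsimd_last_state[cpu] = NULL;
}
```

Without the call to cpu_lost_registers(), needs_reload() would keep
returning false after a power cycle even though the register contents
are gone.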
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Leo Yan <leoy@marvell.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Catalin Marinas [Fri, 29 Aug 2014 15:08:02 +0000 (16:08 +0100)]
arm64: Add brackets around user_stack_pointer()
Commit 5f888a1d33 (ARM64: perf: support dwarf unwinding in compat mode)
changes user_stack_pointer() to return the compat SP for 32-bit tasks
but without brackets around the whole definition, with possible issues
on the call sites (noticed with a subsequent fix for KSTK_ESP).
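The kind of call-site problem an unbracketed conditional macro can
cause is shown below. This is a hypothetical model: struct regs_sketch
and both macros are invented for the example and only mimic the shape
of a "native sp or compat sp" selection.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register snapshot with a native and a compat stack pointer. */
struct regs_sketch {
	unsigned long sp;
	unsigned long compat_sp;
	int compat;		/* non-zero for a 32-bit task */
};

/* No outer brackets: operators at the call site bind into the ?: arms. */
#define USER_SP_UNBRACKETED(r) \
	!(r)->compat ? (r)->sp : (r)->compat_sp

/* Same expression, bracketed as a whole. */
#define USER_SP_BRACKETED(r) \
	(!(r)->compat ? (r)->sp : (r)->compat_sp)

static unsigned long sp_plus_unbracketed(const struct regs_sketch *r,
					 unsigned long off)
{
	assert(r != NULL);
	/* Parses as: !compat ? sp : (compat_sp + off) */
	return USER_SP_UNBRACKETED(r) + off;
}

static unsigned long sp_plus_bracketed(const struct regs_sketch *r,
				       unsigned long off)
{
	assert(r != NULL);
	/* Parses as: (!compat ? sp : compat_sp) + off */
	return USER_SP_BRACKETED(r) + off;
}
```

For a native task the unbracketed form silently drops the offset, while
for a compat task both forms happen to agree, which is what makes this
class of bug easy to miss.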
Fixes: 5f888a1d33c4 (ARM64: perf: support dwarf unwinding in compat mode)
Reported-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Fri, 29 Aug 2014 15:11:10 +0000 (16:11 +0100)]
arm64: report correct stack pointer in KSTK_ESP for compat tasks
The KSTK_ESP macro is used to determine the user stack pointer for a
given task. In particular, this is used to report the '[stack]' VMA
in /proc/self/maps, which is used by Android to determine the stack
location for children of the main thread.
This patch fixes the macro to use user_stack_pointer instead of directly
returning sp. This means that we report w13 instead of sp, since the
former is used as the stack pointer when executing in AArch32 state.
Cc: <stable@vger.kernel.org>
Reported-by: Serban Constantinescu <Serban.Constantinescu@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Catalin Marinas [Thu, 10 Jul 2014 10:37:40 +0000 (11:37 +0100)]
arm64: Cast KSTK_(EIP|ESP) to unsigned long
This is for similarity with thread_saved_(pc|sp) and to avoid some
compiler warnings in the audit code.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Jean Pihet [Mon, 3 Feb 2014 18:18:29 +0000 (19:18 +0100)]
ARM64: perf: support dwarf unwinding in compat mode
Add support for unwinding using the dwarf information in compat
mode. Using the correct user stack pointer allows perf to record
the frames correctly in the native and compat modes.
Note that although the dwarf frame unwinding works ok using
libunwind in native mode (on ARMv7 & ARMv8), some changes are
required to the libunwind code for the compat mode. Those changes
are posted separately on the libunwind mailing list.
Tested on ARMv8 platform with v8 and compat v7 binaries, the latter
are statically built.
Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Greg Hackmann [Thu, 28 Aug 2014 21:00:10 +0000 (14:00 -0700)]
arm64: check for upper PAGE_SHIFT bits in pfn_valid()
pfn_valid() returns a false positive when the lower (64 - PAGE_SHIFT)
bits match a valid pfn but some of the upper bits are set. This caused
a kernel panic in kpageflags_read() when a userspace utility parsed
/proc/*/pagemap, neglected to discard the upper flag bits, and tried to
lseek()+read() from the corresponding offset in /proc/kpageflags.
A valid pfn will never have the upper PAGE_SHIFT bits set, so simply
check for this before passing the pfn to memblock_is_memory().
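The false positive and the added check can be reproduced outside the
kernel with a small model. sketch_is_memory() is a hypothetical
stand-in for memblock_is_memory() covering a single made-up RAM bank,
and the page shift is passed in explicitly; none of this is the
patch's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for memblock_is_memory(): one 1 GiB bank at 2 GiB. */
static bool sketch_is_memory(uint64_t addr)
{
	return addr >= 0x80000000ull && addr < 0xc0000000ull;
}

/* Old behaviour: the shift silently discards the upper page_shift bits. */
static bool pfn_valid_unchecked(uint64_t pfn, unsigned int page_shift)
{
	assert(page_shift > 0 && page_shift < 64);
	return sketch_is_memory(pfn << page_shift);
}

/* Fixed behaviour: reject any pfn with one of those upper bits set. */
static bool pfn_valid_checked(uint64_t pfn, unsigned int page_shift)
{
	assert(page_shift > 0 && page_shift < 64);
	if (pfn >> (64 - page_shift))
		return false;
	return sketch_is_memory(pfn << page_shift);
}
```

With 4 KiB pages, a pagemap entry such as 0x80000 with bit 63 still set
passes the unchecked version and is rejected by the checked one.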
Change-Id: Ief5d8cd4dd93cbecd545a634a8d5885865cb5970
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Ard Biesheuvel [Tue, 24 Sep 2013 07:28:03 +0000 (09:28 +0200)]
arm64: pull in <asm/simd.h> from asm-generic
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Ard Biesheuvel [Tue, 4 Mar 2014 01:10:04 +0000 (01:10 +0000)]
arm64: enable generic CPU feature modalias matching for this architecture
This enables support for the generic CPU feature modalias implementation that
wires up optional CPU features to udev based module autoprobing.
A file <asm/cpufeature.h> is provided that maps CPU feature numbers to
elf_hwcap bits, which is the standard way on arm64 to advertise optional CPU
features both internally and to user space.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[catalin.marinas@arm.com: removed unnecessary "!!"]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Conflicts:
arch/arm64/Kconfig
Change-Id: Ief16b3197cd0564d8cf8aa82e9614bcda6399fe5
Ard Biesheuvel [Tue, 4 Mar 2014 05:28:39 +0000 (13:28 +0800)]
crypto: allow blkcipher walks over AEAD data
This adds the function blkcipher_aead_walk_virt_block, which allows the caller
to use the blkcipher walk API to handle the input and output scatterlists.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ard Biesheuvel [Tue, 4 Mar 2014 05:28:38 +0000 (13:28 +0800)]
crypto: remove direct blkcipher_walk dependency on transform
In order to allow other uses of the blkcipher walk API than the blkcipher
algos themselves, this patch copies some of the transform data members to the
walk struct so the transform is only accessed at walk init time.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ard Biesheuvel [Mon, 24 Feb 2014 14:26:29 +0000 (15:26 +0100)]
arm64: add support for kernel mode NEON in interrupt context
This patch modifies kernel_neon_begin() and kernel_neon_end(), so
they may be called from any context. To address the case where only
a couple of registers are needed, kernel_neon_begin_partial(u32) is
introduced which takes as a parameter the number of bottom 'n' NEON
q-registers required. To mark the end of such a partial section, the
regular kernel_neon_end() should be used.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Conflicts:
arch/arm64/include/asm/neon.h
Change-Id: Ifc7c6aa77e2ab8dd98bb9975cccab54e09693ab7
Ard Biesheuvel [Thu, 8 May 2014 09:20:23 +0000 (11:20 +0200)]
arm64: defer reloading a task's FPSIMD state to userland resume
If a task gets scheduled out and back in again and nothing has touched
its FPSIMD state in the mean time, there is really no reason to reload
it from memory. Similarly, repeated calls to kernel_neon_begin() and
kernel_neon_end() will preserve and restore the FPSIMD state every time.
This patch defers the FPSIMD state restore to the last possible moment,
i.e., right before the task returns to userland. If a task does not return to
userland at all (for any reason), the existing FPSIMD state is preserved
and may be reused by the owning task if it gets scheduled in again on the
same CPU.
This patch adds two more functions to abstract away from straight FPSIMD
register file saves and restores:
- fpsimd_restore_current_state -> ensure current's FPSIMD state is loaded
- fpsimd_flush_task_state -> invalidate live copies of a task's FPSIMD state
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Conflicts:
arch/arm64/kernel/fpsimd.c
Change-Id: Ib1c0d8d0afb3c248cd4d060eb35877530dd92fdc
Ard Biesheuvel [Mon, 24 Feb 2014 14:26:27 +0000 (15:26 +0100)]
arm64: add abstractions for FPSIMD state manipulation
There are two tacit assumptions in the FPSIMD handling code that will no longer
hold after the next patch that optimizes away some FPSIMD state restores:
. the FPSIMD registers of this CPU contain the userland FPSIMD state of
task 'current';
. when switching to a task, its FPSIMD state will always be restored from
memory.
This patch adds the following functions to abstract away from straight FPSIMD
register file saves and restores:
- fpsimd_preserve_current_state -> ensure current's FPSIMD state is saved
- fpsimd_update_current_state -> replace current's FPSIMD state
Where necessary, the signal handling and fork code are updated to use the above
wrappers instead of poking into the FPSIMD registers directly.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Conflicts:
arch/arm64/kernel/fpsimd.c
Change-Id: I53ae7082427cb1c5cc32e1f2ddbd4218115601ba
Ard Biesheuvel [Sat, 8 Feb 2014 12:34:09 +0000 (13:34 +0100)]
cpu: add generic support for CPU feature based module autoloading
This patch adds support for advertising optional CPU features over udev
using the modalias, and for declaring compatibility with/dependency upon
such a feature in a module.
The mapping between feature numbers and actual features should be provided
by the architecture in a file called <asm/cpufeature.h> which exports the
following functions/macros:
- cpu_feature(FEAT), a preprocessor macro that maps token FEAT to a
numeric index;
- bool cpu_have_feature(n), returning whether this CPU has support for
feature #n;
- MAX_CPU_FEATURES, an upper bound for 'n' in the previous function.
The feature can then be enabled by setting CONFIG_GENERIC_CPU_AUTOPROBE
for the architecture.
For instance, a module that registers its module init function using
module_cpu_feature_match(FEAT_X, module_init_function)
will be probed automatically when the CPU's support for the 'FEAT_X'
feature is advertised over udev, and will only allow the module to be
loaded by hand if the 'FEAT_X' feature is supported.
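A minimal model of the three items an architecture has to provide is
sketched below. It assumes, for illustration only, that features are
single bits in a capability word; sketch_hwcap, the SKETCH_HWCAP_
values and bit_index() are hypothetical and stand in for whatever the
architecture really uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability word and bit assignments. */
static unsigned long sketch_hwcap;
#define SKETCH_HWCAP_AES	(1UL << 3)
#define SKETCH_HWCAP_CRC32	(1UL << 7)

/* Upper bound for 'n' in cpu_have_feature(). */
#define MAX_CPU_FEATURES	(8 * sizeof(sketch_hwcap))

/* Bit position of a value that has exactly one bit set. */
static unsigned int bit_index(unsigned long single_bit)
{
	unsigned int n = 0;

	assert(single_bit != 0 && (single_bit & (single_bit - 1)) == 0);
	while (single_bit >>= 1)
		n++;
	return n;
}

/* Map token FEAT to a numeric index by pasting it onto the flag prefix. */
#define cpu_feature(FEAT)	bit_index(SKETCH_HWCAP_ ## FEAT)

/* Does this CPU support feature #n? */
static bool cpu_have_feature(unsigned int n)
{
	return n < MAX_CPU_FEATURES && (sketch_hwcap & (1UL << n)) != 0;
}
```

A module would then name its requirement as cpu_feature(AES) rather
than hard-coding the bit number.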
Change-Id: Icae8e3ff347235fc72a5b41279f0afdb34fb161a
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kbuild test robot [Tue, 24 Sep 2013 00:21:29 +0000 (08:21 +0800)]
crypto: ablk_helper - Replace memcpy with struct assignment
tree: git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
head: 48e6dc1b2a1ad8186d48968d5018912bdacac744
commit: a62b01cd6cc1feb5e80d64d6937c291473ed82cb [20/24] crypto: create generic version of ablk_helper
coccinelle warnings: (new ones prefixed by >>)
>> crypto/ablk_helper.c:97:2-8: Replace memcpy with struct assignment
>> crypto/ablk_helper.c:78:2-8: Replace memcpy with struct assignment
Please consider folding the attached diff :-)
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ard Biesheuvel [Fri, 20 Sep 2013 07:55:40 +0000 (09:55 +0200)]
crypto: create generic version of ablk_helper
Create a generic version of ablk_helper so it can be reused
by other architectures.
Acked-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Mikulas Patocka [Fri, 25 Jul 2014 23:40:20 +0000 (19:40 -0400)]
crypto: arm64-aes - fix encryption of unaligned data
cryptsetup fails on arm64 when using kernel encryption via AF_ALG socket.
See https://bugzilla.redhat.com/show_bug.cgi?id=1122937
The bug is caused by incorrect handling of unaligned data in
arch/arm64/crypto/aes-glue.c. Cryptsetup creates a buffer that is aligned
on 8 bytes, but not on 16 bytes. It opens AF_ALG socket and uses the
socket to encrypt data in the buffer. The arm64 crypto accelerator causes
data corruption or crashes in the scatterwalk_pagedone.
This patch fixes the bug by passing the residue bytes that were not
processed as the last parameter to blkcipher_walk_done.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Andreas Schwab [Thu, 24 Jul 2014 16:03:26 +0000 (17:03 +0100)]
arm64/crypto: fix makefile rule for aes-glue-%.o
This fixes the following build failure when building with CONFIG_MODVERSIONS
enabled:
CC [M] arch/arm64/crypto/aes-glue-ce.o
ld: cannot find arch/arm64/crypto/aes-glue-ce.o: No such file or directory
make[1]: *** [arch/arm64/crypto/aes-ce-blk.o] Error 1
make: *** [arch/arm64/crypto] Error 2
The $(obj)/aes-glue-%.o rule only creates $(obj)/.tmp_aes-glue-ce.o, it
should use if_changed_rule instead of if_changed_dep.
Signed-off-by: Andreas Schwab <schwab@suse.de>
[ardb: mention CONFIG_MODVERSIONS in commit log]
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ard Biesheuvel [Mon, 16 Jun 2014 10:02:16 +0000 (11:02 +0100)]
arm64/crypto: improve performance of GHASH algorithm
This patch modifies the GHASH secure hash implementation to switch to a
faster, polynomial multiplication based reduction instead of one that uses
shifts and rotates.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ard Biesheuvel [Mon, 16 Jun 2014 10:02:15 +0000 (11:02 +0100)]
arm64/crypto: fix data corruption bug in GHASH algorithm
This fixes a bug in the GHASH algorithm resulting in the calculated hash being
incorrect if the input is presented in chunks whose size is not a multiple of
16 bytes.
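The general technique for handling input that arrives in chunks of
arbitrary size is to hold back a partial block until it is complete.
The sketch below shows that buffering with a toy accumulator; it is
NOT GHASH and not the code from this fix, and all names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 16

struct blk_hash {
	uint64_t acc;		/* toy digest, stands in for the real state */
	uint8_t buf[BLK];	/* bytes held back until a block is complete */
	size_t fill;		/* number of valid bytes in buf, always < BLK */
};

/* Toy block function: order-sensitive mixing of one 16-byte block. */
static void absorb_block(struct blk_hash *h, const uint8_t *blk)
{
	for (size_t i = 0; i < BLK; i++)
		h->acc = h->acc * 1099511628211ull + blk[i];
}

static void blk_hash_update(struct blk_hash *h, const uint8_t *src, size_t len)
{
	assert(h->fill < BLK);

	if (h->fill) {
		/* Top up the pending partial block first. */
		size_t take = BLK - h->fill;

		if (take > len)
			take = len;
		memcpy(h->buf + h->fill, src, take);
		h->fill += take;
		src += take;
		len -= take;
		if (h->fill < BLK)
			return;
		absorb_block(h, h->buf);
		h->fill = 0;
	}
	/* Process whole blocks straight from the caller's buffer. */
	while (len >= BLK) {
		absorb_block(h, src);
		src += BLK;
		len -= BLK;
	}
	/* Keep the tail for the next call. */
	memcpy(h->buf, src, len);
	h->fill = len;
}

static uint64_t blk_hash_final(struct blk_hash *h)
{
	if (h->fill) {
		/* Zero-pad the last partial block. */
		memset(h->buf + h->fill, 0, BLK - h->fill);
		absorb_block(h, h->buf);
		h->fill = 0;
	}
	return h->acc;
}
```

The property to test is that any split of the input yields the same
result as a single update call.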
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Fixes: fdd2389457b2 ("arm64/crypto: GHASH secure hash using ARMv8 Crypto Extensions")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ard Biesheuvel [Fri, 21 Mar 2014 09:19:17 +0000 (10:19 +0100)]
arm64/crypto: AES-ECB/CBC/CTR/XTS using ARMv8 NEON and Crypto Extensions
This adds ARMv8 implementations of AES in ECB, CBC, CTR and XTS modes,
both for ARMv8 with Crypto Extensions and for plain ARMv8 NEON.
The Crypto Extensions version can only run on ARMv8 implementations that
have support for these optional extensions.
The plain NEON version is a table based yet time invariant implementation.
All S-box substitutions are performed in parallel, leveraging the wide range
of ARMv8's tbl/tbx instructions, and the huge NEON register file, which can
comfortably hold the entire S-box and still have room to spare for doing the
actual computations.
The key expansion routines were borrowed from aes_generic.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Ard Biesheuvel [Mon, 10 Feb 2014 10:26:29 +0000 (11:26 +0100)]
arm64/crypto: AES in CCM mode using ARMv8 Crypto Extensions
This patch adds support for the AES-CCM encryption algorithm for CPUs that
have support for the AES part of the ARM v8 Crypto Extensions.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Ard Biesheuvel [Wed, 5 Feb 2014 17:13:38 +0000 (18:13 +0100)]
arm64/crypto: AES using ARMv8 Crypto Extensions
This patch adds support for the AES symmetric encryption algorithm for CPUs
that have support for the AES part of the ARM v8 Crypto Extensions.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Ard Biesheuvel [Wed, 26 Mar 2014 19:53:05 +0000 (20:53 +0100)]
arm64/crypto: GHASH secure hash using ARMv8 Crypto Extensions
This is a port to ARMv8 (Crypto Extensions) of the Intel implementation of the
GHASH Secure Hash (used in the Galois/Counter chaining mode). It relies on the
optional PMULL/PMULL2 instruction (polynomial multiply long, what Intel call
carry-less multiply).
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Ard Biesheuvel [Thu, 20 Mar 2014 14:35:40 +0000 (15:35 +0100)]
arm64/crypto: SHA-224/SHA-256 using ARMv8 Crypto Extensions
This patch adds support for the SHA-224 and SHA-256 Secure Hash Algorithms
for CPUs that have support for the SHA-2 part of the ARM v8 Crypto Extensions.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
JP Abgrall [Thu, 28 Aug 2014 02:07:30 +0000 (19:07 -0700)]
arm64/crypto: SHA-1 using ARMv8 Crypto Extensions
This patch adds support for the SHA-1 Secure Hash Algorithm for CPUs that
have support for the SHA-1 part of the ARM v8 Crypto Extensions.
Change-Id: I29fafd308e17aff6e0d59938c106fae6ad7fe78e
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Conflicts:
arch/arm64/Makefile
JP Abgrall [Thu, 28 Aug 2014 03:30:29 +0000 (20:30 -0700)]
arm: fixup NR_syscalls to accommodate the new seccomp syscall
This belongs in
commit: 83f1ccba87b06575966b65352db565c363af7bcf
https://android-review.googlesource.com/#/c/104520
Change-Id: Id5037cbebac9b86c863da79c3b8729e627e65f8e
Signed-off-by: JP Abgrall <jpa@google.com>
Kees Cook [Thu, 5 Jun 2014 07:23:17 +0000 (00:23 -0700)]
seccomp: implement SECCOMP_FILTER_FLAG_TSYNC
Applying restrictive seccomp filter programs to large or diverse
codebases often requires handling threads which may be started early in
the process lifetime (e.g., by code that is linked in). While it is
possible to apply permissive programs prior to process start up, it is
difficult to further restrict the kernel ABI to those threads after that
point.
This change adds a new seccomp syscall flag to SECCOMP_SET_MODE_FILTER for
synchronizing thread group seccomp filters at filter installation time.
When calling seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC,
filter) an attempt will be made to synchronize all threads in current's
threadgroup to its new seccomp filter program. This is possible iff all
threads are using a filter that is an ancestor to the filter current is
attempting to synchronize to. NULL filters (where the task is running as
SECCOMP_MODE_NONE) are also treated as ancestors allowing threads to be
transitioned into SECCOMP_MODE_FILTER. If prctl(PR_SET_NO_NEW_PRIVS,
...) has been set on the calling thread, no_new_privs will be set for
all synchronized threads too. On success, 0 is returned. On failure,
the pid of one of the failing threads will be returned and no filters
will have been applied.
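From user space the call described above might look like the sketch
below. It assumes kernel headers that already define __NR_seccomp and
SECCOMP_FILTER_FLAG_TSYNC, uses the raw syscall() wrapper because a
libc seccomp() wrapper is not assumed to exist, and installs a
deliberately permissive one-instruction filter. The names allow_all,
allow_all_prog and install_filter_tsync are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/filter.h>	/* struct sock_filter, struct sock_fprog, BPF_STMT */
#include <linux/seccomp.h>	/* SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC */
#include <sys/prctl.h>		/* prctl(), PR_SET_NO_NEW_PRIVS */
#include <sys/syscall.h>	/* __NR_seccomp */
#include <unistd.h>		/* syscall() */

/* One-instruction program: allow every system call. */
static struct sock_filter allow_all[] = {
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

static struct sock_fprog allow_all_prog = {
	.len = sizeof(allow_all) / sizeof(allow_all[0]),
	.filter = allow_all,
};

/*
 * Install `prog` on every thread of the calling thread group.
 * Returns 0 on success, the id of a thread that could not be
 * synchronized (> 0), or -1 with errno set for other errors.
 */
static long install_filter_tsync(struct sock_fprog *prog)
{
	assert(prog != NULL && prog->len > 0);

	/* Unprivileged callers need no_new_privs before adding a filter. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;

	return syscall(__NR_seccomp, SECCOMP_SET_MODE_FILTER,
		       SECCOMP_FILTER_FLAG_TSYNC, prog);
}
```

A positive return value means no filter was applied anywhere, so the
caller can report the offending thread and retry or abort.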
The race conditions against another thread are:
- requesting TSYNC (already handled by sighand lock)
- performing a clone (already handled by sighand lock)
- changing its filter (already handled by sighand lock)
- calling exec (handled by cred_guard_mutex)
The clone case is assisted by the fact that new threads will have their
seccomp state duplicated from their parent before appearing on the tasklist.
Holding cred_guard_mutex means that seccomp filters cannot be assigned
while in the middle of another thread's exec (potentially bypassing
no_new_privs or similar). The call to de_thread() may kill threads waiting
for the mutex.
Changes across threads to the filter pointer include a barrier.
Based on patches by Will Drewry.
Suggested-by: Julien Tinnes <jln@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Kees Cook [Fri, 27 Jun 2014 22:01:35 +0000 (15:01 -0700)]
seccomp: allow mode setting across threads
This changes the mode setting helper to allow threads to change the
seccomp mode from another thread. We must maintain barriers to keep
TIF_SECCOMP synchronized with the rest of the seccomp state.
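The ordering requirement can be illustrated with C11 atomics in userspace. This is a rough analogue, not the kernel helper: the bar_* names and the flag bit are hypothetical, and C11 fences stand in for the kernel's barrier primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define BAR_TIF_SECCOMP 0x1u  /* hypothetical flag bit for this sketch */

struct bar_task {
    unsigned long mode;   /* plain field, models the seccomp mode */
    atomic_uint flags;    /* models the thread flags word */
};

static void bar_task_init(struct bar_task *t)
{
    assert(t != NULL);
    t->mode = 0;
    atomic_init(&t->flags, 0u);
}

/* Writer side, possibly run on behalf of another thread: publish the mode
 * first, then make the flag visible. */
static void bar_assign_mode(struct bar_task *t, unsigned long mode)
{
    assert(t != NULL);
    t->mode = mode;
    /* The flag must not become visible before the mode store. */
    atomic_thread_fence(memory_order_release);
    atomic_fetch_or_explicit(&t->flags, BAR_TIF_SECCOMP,
                             memory_order_relaxed);
}

/* Reader side: only trust the mode once the flag has been observed.
 * Returns 0 when the flag is not set. */
static unsigned long bar_read_mode(struct bar_task *t)
{
    unsigned int f;

    assert(t != NULL);
    f = atomic_load_explicit(&t->flags, memory_order_relaxed);
    if (!(f & BAR_TIF_SECCOMP))
        return 0;
    /* Pairs with the release fence in bar_assign_mode(). */
    atomic_thread_fence(memory_order_acquire);
    return t->mode;
}
```

A reader that sees the flag is then guaranteed to see the mode that was stored before it, which is the property the barriers are there to preserve.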
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Conflicts:
kernel/seccomp.c
Change-Id: I091ffa55d8f4e83ff02558a55e2b4dc76ac26905
Kees Cook [Fri, 27 Jun 2014 22:18:48 +0000 (15:18 -0700)]
seccomp: introduce writer locking
Normally, task_struct.seccomp.filter is only ever read or modified by
the task that owns it (current). This property aids in fast access
during system call filtering as read access is lockless.
Updating the pointer from another task, however, opens up race
conditions. To allow cross-thread filter pointer updates, writes to the
seccomp fields are now protected by the sighand spinlock (which is shared
by all threads in the thread group). Read access remains lockless because
pointer updates themselves are atomic. However, writes (or cloning)
often entail additional checking (like maximum instruction counts),
which requires locking to perform safely.
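A userspace sketch of the pattern, assuming a spinlock built from atomic_flag as a stand-in for the sighand lock and a made-up instruction limit. All wl_* names are hypothetical. Writers take the lock and run the length check. Readers only load the pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define WL_MAX_INSNS 32u  /* hypothetical limit, chosen for illustration */

struct wl_filter {
    unsigned int len;          /* number of instructions in this filter */
    struct wl_filter *prev;    /* previously attached filter */
};

struct wl_group {
    atomic_flag lock;          /* stands in for the shared sighand lock */
};

struct wl_task {
    struct wl_group *group;
    struct wl_filter *_Atomic filter;
};

static void wl_group_init(struct wl_group *g)
{
    atomic_flag_clear(&g->lock);
}

static void wl_task_init(struct wl_task *t, struct wl_group *g)
{
    t->group = g;
    atomic_init(&t->filter, (struct wl_filter *)NULL);
}

static void wl_lock(struct wl_group *g)
{
    while (atomic_flag_test_and_set_explicit(&g->lock, memory_order_acquire))
        ; /* spin until the flag is released */
}

static void wl_unlock(struct wl_group *g)
{
    atomic_flag_clear_explicit(&g->lock, memory_order_release);
}

/* Lockless read: a single atomic pointer load. */
static struct wl_filter *wl_current_filter(struct wl_task *t)
{
    return atomic_load_explicit(&t->filter, memory_order_acquire);
}

/* Locked write: check the total chain length, then publish the new head.
 * Returns 0 on success, -1 if the chain would exceed the limit. */
static int wl_attach(struct wl_task *t, struct wl_filter *f)
{
    struct wl_filter *head, *walker;
    unsigned int total;
    int ret = -1;

    assert(t != NULL && f != NULL && t->group != NULL);
    wl_lock(t->group);
    head = atomic_load_explicit(&t->filter, memory_order_relaxed);
    total = f->len;
    for (walker = head; walker != NULL; walker = walker->prev)
        total += walker->len;
    if (total <= WL_MAX_INSNS) {
        f->prev = head;
        atomic_store_explicit(&t->filter, f, memory_order_release);
        ret = 0;
    }
    wl_unlock(t->group);
    return ret;
}
```

The check and the pointer store sit in the same critical section, so two writers cannot both pass the limit check against the same old chain.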
In the case of cloning threads, the child is invisible to the system
until it enters the task list. To make sure a child can't be cloned from
a thread and left in a prior state, seccomp duplication is additionally
moved under the sighand lock. Then parent and child are certain to have
the same seccomp state when they exit the lock.
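As a rough model of duplicating the state under the shared lock, the sketch below copies the parent's mode, no_new_privs bit and filter pointer inside one critical section and takes a reference on the filter. The cl_* names, the fields and the atomic_flag spinlock are assumptions made for illustration, not the kernel's fork path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct cl_filter {
    unsigned int usage;        /* reference count, modified under the lock */
};

struct cl_group {
    atomic_flag lock;          /* stands in for the shared sighand lock */
};

struct cl_task {
    struct cl_group *group;
    unsigned long mode;
    int no_new_privs;
    struct cl_filter *filter;
};

static void cl_group_init(struct cl_group *g)
{
    atomic_flag_clear(&g->lock);
}

static void cl_lock(struct cl_group *g)
{
    while (atomic_flag_test_and_set_explicit(&g->lock, memory_order_acquire))
        ; /* spin until the flag is released */
}

static void cl_unlock(struct cl_group *g)
{
    atomic_flag_clear_explicit(&g->lock, memory_order_release);
}

/* Copy the parent's seccomp state into the child inside one critical
 * section, so a concurrent cross-thread update happens either entirely
 * before the copy or entirely after it. */
static void cl_copy_seccomp(struct cl_task *child, struct cl_task *parent)
{
    assert(child != NULL && parent != NULL && parent->group != NULL);
    cl_lock(parent->group);
    if (parent->filter != NULL)
        parent->filter->usage++;   /* the child holds its own reference */
    child->group = parent->group;
    child->mode = parent->mode;
    child->no_new_privs = parent->no_new_privs;
    child->filter = parent->filter;
    cl_unlock(parent->group);
}
```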
Based on patches by Will Drewry and David Drysdale.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Conflicts:
kernel/fork.c
Change-Id: Ie01ece43b610867013f7d0e0a2a7be0b9077630f
Kees Cook [Fri, 27 Jun 2014 22:16:33 +0000 (15:16 -0700)]
seccomp: split filter prep from check and apply
In preparation for adding seccomp locking, move filter creation away
from where it is checked and applied. This will allow for locking where
no memory allocation is happening. The validation, filter attachment,
and seccomp mode setting can all happen under the future locks.
For extreme defensiveness, I've added a BUG_ON check for the calculated
size of the buffer allocation in case BPF_MAXINSNS ever changes, which
shouldn't ever happen. The compiler should actually optimize out this
check since the test above it makes it impossible.
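The split can be pictured with a userspace sketch: one function validates and allocates, and a second, allocation-free function links the result into the chain. The prep_* names are hypothetical, struct prep_insn only mimics the layout of a classic BPF instruction, and PREP_MAXINSNS is set to 4096 on the assumption that this matches BPF_MAXINSNS. assert() stands in for the BUG_ON.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PREP_MAXINSNS 4096u  /* assumed to match BPF_MAXINSNS */

/* Modelled on a classic BPF instruction, defined locally for the sketch. */
struct prep_insn {
    unsigned short code;
    unsigned char jt;
    unsigned char jf;
    unsigned int k;
};

struct prep_filter {
    unsigned int len;
    struct prep_filter *prev;
    struct prep_insn insns[];
};

/* Step 1: validate and allocate. Needs no lock. Returns NULL on a bad
 * length or on allocation failure. */
static struct prep_filter *prep_filter_create(const struct prep_insn *insns,
                                              unsigned int len)
{
    struct prep_filter *f;
    size_t size;

    if (insns == NULL || len == 0 || len > PREP_MAXINSNS)
        return NULL;
    /* Defensive overflow check. The range test above already makes it
     * impossible to trip with the current limit. */
    assert(INT_MAX / len >= sizeof(struct prep_insn));
    size = (size_t)len * sizeof(struct prep_insn);

    f = malloc(sizeof(*f) + size);
    if (f == NULL)
        return NULL;
    f->len = len;
    f->prev = NULL;
    memcpy(f->insns, insns, size);
    return f;
}

/* Step 2: link the prepared filter in. It allocates nothing, so it can
 * run while a lock is held. */
static void prep_filter_attach(struct prep_filter **head,
                               struct prep_filter *f)
{
    assert(head != NULL && f != NULL);
    f->prev = *head;
    *head = f;
}
```

Keeping malloc() out of the attach step is the point of the split: the part that may sleep or fail runs before any lock is taken.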
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Conflicts:
kernel/seccomp.c
Change-Id: I8d89f80a5b4f2826d90474dcea441c41f0af6594