Guenter Roeck [Tue, 3 Feb 2015 18:01:18 +0000 (10:01 -0800)]
regmap: Export regmap_get_val_endian
We'll need to call it from regmap-i2c.c, which can be built as module.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Mark Brown <broonie@kernel.org>
Rob Herring [Tue, 3 Feb 2015 23:03:35 +0000 (17:03 -0600)]
spi: spi-pxa2xx: only include mach/dma.h for legacy DMA
Move the include of mach/dma.h to the legacy PXA DMA code where it is used.
This enables building spi-pxa2xx on ARM64.
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Linus Torvalds [Wed, 4 Feb 2015 18:22:08 +0000 (10:22 -0800)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Three small cifs fixes. One fixes a hang under stress, and the other
two are security related"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix MUST SecurityFlags filtering
Complete oplock break jobs before closing file handle
cifs: use memzero_explicit to clear stack buffer
Linus Torvalds [Wed, 4 Feb 2015 17:42:55 +0000 (09:42 -0800)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"A number of ARM fixes, the biggest is fixing a regression caused by
appended DT blobs exceeding 64K, causing the decompressor fixup code
to fail to patch the DT blob. Another important fix is for the ASID
allocator from Will Deacon which prevents some rare crashes seen on
some systems. Lastly, there's a build fix for v7M systems when printk
support is disabled.
The last two remaining fixes are more cosmetic - the IOMMU one
prevents an annoying harmless warning message, and we disable the
kernel strict memory permissions on non-MMU which can't support it
anyway"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8299/1: mm: ensure local active ASID is marked as allocated on rollover
ARM: 8298/1: ARM_KERNMEM_PERMS only works with MMU enabled
ARM: 8295/1: fix v7M build for !CONFIG_PRINTK
ARM: 8294/1: ATAG_DTB_COMPAT: remove the DT workspace's hardcoded 64KB size
ARM: 8288/1: dma-mapping: don't detach devices without an IOMMU during teardown
Lars Persson [Tue, 3 Feb 2015 16:08:17 +0000 (17:08 +0100)]
MIPS: Fix syscall_get_nr for the syscall exit tracing.
Register 2 is alredy overwritten by the return value when
syscall_trace_leave() is called.
Signed-off-by: Lars Persson <larper@axis.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9187/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Wed, 4 Feb 2015 11:59:43 +0000 (12:59 +0100)]
MIPS: elf2ecoff: Ignore PT_MIPS_ABIFLAGS program headers.
These are generated by very recent toolchains and result in an error
message when attenpting to convert a kernel from ELF to ECOFF.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Mon, 2 Feb 2015 00:01:46 +0000 (01:01 +0100)]
MIPS: elf2ecoff: Rewrite main processing loop to switch.
The if construct was getting hard to read and would be getting even more
complex with the next bug fix.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Nicholas Mc Guire [Mon, 2 Feb 2015 15:43:31 +0000 (10:43 -0500)]
spi: atmel: cleanup wait_for_completion return handling
return type of wait_for_completion_timeout is unsigned long not int, this
patch adds an appropriate variable and fixes up the assignment. It removes
the else branch as the only thing it was doing is assigning ret = 0; - but
ret is never used thereafter so that is not needed. As the string in
dev_err already states "timeout" there is little point in printing the 0.
A typo in "trasfer" -> transfer is also fixed.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Krzysztof Kozlowski [Wed, 4 Feb 2015 10:45:28 +0000 (11:45 +0100)]
regulator: Fix build breakage on !REGULATOR
Add missing stubs for regulator_suspend_prepare() and
regulator_suspend_finish() to fix exynos_defconfig build without
REGULATOR:
arch/arm/mach-exynos/built-in.o: In function `exynos_suspend_finish':
arch/arm/mach-exynos/suspend.c:537: undefined reference to `regulator_suspend_finish'
arch/arm/mach-exynos/built-in.o: In function `exynos_suspend_prepare':
arch/arm/mach-exynos/suspend.c:520: undefined reference to `regulator_suspend_prepare'
make: *** [vmlinux] Error 1
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reported-by: Joerg Roedel <joro@8bytes.org>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Mark Rutland [Wed, 7 Jan 2015 15:01:54 +0000 (15:01 +0000)]
perf: Decouple unthrottling and rotating
Currently the adjusments made as part of perf_event_task_tick() use the
percpu rotation lists to iterate over any active PMU contexts, but these
are not used by the context rotation code, having been replaced by
separate (per-context) hrtimer callbacks. However, some manipulation of
the rotation lists (i.e. removal of contexts) has remained in
perf_rotate_context(). This leads to the following issues:
* Contexts are not always removed from the rotation lists. Removal of
PMUs which have been placed in rotation lists, but have not been
removed by a hrtimer callback can result in corruption of the rotation
lists (when memory backing the context is freed).
This has been observed to result in hangs when PMU drivers built as
modules are inserted and removed around the creation of events for
said PMUs.
* Contexts which do not require rotation may be removed from the
rotation lists as a result of a hrtimer, and will not be considered by
the unthrottling code in perf_event_task_tick.
This patch fixes the issue by updating the rotation ist when events are
scheduled in/out, ensuring that each rotation list stays in sync with
the HW state. As each event holds a refcount on the module of its PMU,
this ensures that when a PMU module is unloaded none of its CPU contexts
can be in a rotation list. By maintaining a list of perf_event_contexts
rather than perf_event_cpu_contexts, we don't need separate paths to
handle the cpu and task contexts, which also makes the code a little
simpler.
As the rotation_list variables are not used for rotation, these are
renamed to active_ctx_list, which better matches their current function.
perf_pmu_rotate_{start,stop} are renamed to
perf_pmu_ctx_{activate,deactivate}.
Reported-by: Johannes Jensen <johannes.jensen@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Will Deacon <Will.Deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150129134511.GR17721@leverpostej
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Mark Rutland [Wed, 7 Jan 2015 14:56:51 +0000 (14:56 +0000)]
perf: Drop module reference on event init failure
When initialising an event, perf_init_event will call try_module_get() to
ensure that the PMU's module cannot be removed for the lifetime of the
event, with __free_event() dropping the reference when the event is
finally destroyed. If something fails after the event has been
initialised, but before the event is installed, perf_event_alloc will
drop the reference on the module.
However, if we fail to initialise an event for some reason (e.g. we ask
an uncore PMU to perform sampling, and it refuses to initialise the
event), we do not drop the refcount. If we try to open such a bogus
event without a precise IDR type, we will loop over each PMU in the pmus
list, incrementing each of their refcounts without decrementing them.
This patch adds a module_put when pmu->event_init(event) fails, ensuring
that the refcounts are balanced in failure cases. As the innards of the
precise and search based initialisation look very similar, this logic is
hoisted out into a new helper function. While the early return for the
failed try_module_get is removed from the search case, this is handled
by the remaining return when ret is not -ENOENT.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1420642611-22667-1-git-send-email-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Jiri Olsa [Wed, 28 Jan 2015 17:54:38 +0000 (18:54 +0100)]
perf: Use POLLIN instead of POLL_IN for perf poll data in flag
Currently we flag available data (via poll syscall) on perf fd with
POLL_IN macro, which is normally used for SIGIO interface.
We've been lucky, because POLLIN (0x1) is subset of POLL_IN (0x20001)
and sys_poll (do_pollfd function) cut the extra bit out (0x20000).
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1422467678-22341-1-git-send-email-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 29 Jan 2015 13:44:34 +0000 (14:44 +0100)]
perf: Fix put_event() ctx lock
So what I suspect; but I'm in zombie mode today it seems; is that while
I initially thought that it was impossible for ctx to change when
refcount dropped to 0, I now suspect its possible.
Note that until perf_remove_from_context() the event is still active and
visible on the lists. So a concurrent sys_perf_event_open() from another
task into this task can race.
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: mark.rutland@arm.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150129134434.GB26304@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra (Intel) [Tue, 27 Jan 2015 10:53:12 +0000 (11:53 +0100)]
perf: Fix move_group() order
Jiri reported triggering the new WARN_ON_ONCE in event_sched_out over
the weekend:
event_sched_out.isra.79+0x2b9/0x2d0
group_sched_out+0x69/0xc0
ctx_sched_out+0x106/0x130
task_ctx_sched_out+0x37/0x70
__perf_install_in_context+0x70/0x1a0
remote_function+0x48/0x60
generic_exec_single+0x15b/0x1d0
smp_call_function_single+0x67/0xa0
task_function_call+0x53/0x80
perf_install_in_context+0x8b/0x110
I think the below should cure this; if we install a group leader it
will iterate the (still intact) group list and find its siblings and
try and install those too -- even though those still have the old
event->ctx -- in the new ctx.
Upon installing the first group sibling we'd try and schedule out the
group and trigger the above warn.
Fix this by installing the group leader last, installing siblings
would have no effect, they're not reachable through the group lists
and therefore we don't schedule them.
Also delay resetting the state until we're absolutely sure the events
are quiescent.
Reported-by: Jiri Olsa <jolsa@redhat.com>
Reported-by: vincent.weaver@maine.edu
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150126162639.GA21418@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Fri, 23 Jan 2015 11:24:14 +0000 (12:24 +0100)]
perf: Fix event->ctx locking
There have been a few reported issues wrt. the lack of locking around
changing event->ctx. This patch tries to address those.
It avoids the whole rwsem thing; and while it appears to work, please
give it some thought in review.
What I did fail at is sensible runtime checks on the use of
event->ctx, the RCU use makes it very hard.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150123125834.209535886@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Fri, 23 Jan 2015 10:20:10 +0000 (11:20 +0100)]
perf: Add a bit of paranoia
Add a few WARN()s to catch things that should never happen.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150123125834.150481799@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
David S. Miller [Wed, 4 Feb 2015 07:06:49 +0000 (23:06 -0800)]
Merge branch 'virtio_net_ufo'
Vladislav Yasevich says:
====================
Restore UFO support to virtio_net devices
commit
3d0ad09412ffe00c9afa201d01effdb6023d09b4
Author: Ben Hutchings <ben@decadent.org.uk>
Date: Thu Oct 30 18:27:12 2014 +0000
drivers/net: Disable UFO through virtio
Turned off UFO support to virtio-net based devices due to issues
with IPv6 fragment id generation for UFO packets. The issue
was that IPv6 UFO/GSO implementation expects the fragment id
to be supplied in skb_shinfo(). However, for packets generated
by the VMs, the fragment id is not supplied which causes all
IPv6 fragments to have the id of 0.
The problem is that turning off UFO support on tap/macvtap
as well as virtio devices caused issues with migrations.
Migrations would fail when moving a vm from a kernel supporting
expecting UFO to work to the newer kernels that disabled UFO.
This series provides a partial solution to address the migration
issue. The series allows us to track whether skb_shinfo()->ip6_frag_id
has been set by treating value of 0 as unset.
This lets GSO code to generate fragment ids if they are necessary
(ex: packet was generated by VM or packet socket).
Since v3:
- Resolved build issue when IPv6 is a module.
- Removed trailing white space.
Since v2:
- Rebase and rebuild to make sure everything works. No changes
to the patches were done.
Since v1:
- Removed the skb bit and use value of 0 as tracker.
- Used Eric's suggestion to set fragment id as 0x80000000 if id
generation procedure yeilded a 0 result.
- Consolidated ipv6 id genration code.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Tue, 3 Feb 2015 21:36:17 +0000 (16:36 -0500)]
Revert "drivers/net: Disable UFO through virtio"
This reverts commit
3d0ad09412ffe00c9afa201d01effdb6023d09b4.
Now that GSO functionality can correctly track if the fragment
id has been selected and select a fragment id if necessary,
we can re-enable UFO on tap/macvap and virtio devices.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Tue, 3 Feb 2015 21:36:16 +0000 (16:36 -0500)]
Revert "drivers/net, ipv6: Select IPv6 fragment idents for virtio UFO packets"
This reverts commit
5188cd44c55db3e92cd9e77a40b5baa7ed4340f7.
Now that GSO layer can track if fragment id has been selected
and can allocate one if necessary, we don't need to do this in
tap and macvtap. This reverts most of the code and only keeps
the new ipv6 fragment id generation function that is still needed.
Fixes: 3d0ad09412ff (drivers/net: Disable UFO through virtio)
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Tue, 3 Feb 2015 21:36:15 +0000 (16:36 -0500)]
ipv6: Select fragment id during UFO segmentation if not set.
If the IPv6 fragment id has not been set and we perform
fragmentation due to UFO, select a new fragment id.
We now consider a fragment id of 0 as unset and if id selection
process returns 0 (after all the pertrubations), we set it to
0x80000000, thus giving us ample space not to create collisions
with the next packet we may have to fragment.
When doing UFO integrity checking, we also select the
fragment id if it has not be set yet. This is stored into
the skb_shinfo() thus allowing UFO to function correclty.
This patch also removes duplicate fragment id generation code
and moves ipv6_select_ident() into the header as it may be
used during GSO.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ingo Molnar [Wed, 4 Feb 2015 06:58:29 +0000 (07:58 +0100)]
Merge tag 'v3.19-rc7' into perf/core, to merge fixes before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Davidlohr Bueso [Mon, 2 Feb 2015 06:16:24 +0000 (22:16 -0800)]
locking/rtmutex: Optimize setting task running after being blocked
We explicitly mark the task running after returning from
a __rt_mutex_slowlock() call, which does the actual sleeping
via wait-wake-trylocking. As such, this patch does two things:
(1) refactors the code so that setting current to TASK_RUNNING
is done by __rt_mutex_slowlock(), and not by the callers. The
downside to this is that it becomes a bit unclear when at what
point we block. As such I've added a comment that the task
blocks when calling __rt_mutex_slowlock() so readers can figure
out when it is running again.
(2) relaxes setting current's state through __set_current_state(),
instead of it's more expensive barrier alternative. There was no
need for the implied barrier as we're obviously not planning on
blocking.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1422857784.18096.1.camel@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Davidlohr Bueso [Mon, 26 Jan 2015 07:36:04 +0000 (23:36 -0800)]
locking/rwsem: Use task->state helpers
Call __set_task_state() instead of assigning the new state
directly. These interfaces also aid CONFIG_DEBUG_ATOMIC_SLEEP
environments, keeping track of who last changed the state.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Jason Low <jason.low2@hp.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1422257769-14083-2-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Nicholas Mc Guire [Fri, 23 Jan 2015 11:41:47 +0000 (12:41 +0100)]
sched/completion: Add lock-free checking of the blocking case
The "thread would block" case can be checked without grabbing ->wait.lock.
[ If the check does not return early then grab the lock and recheck.
A memory barrier is not needed as complete() and complete_all() imply
a barrier.
The ACCESS_ONCE() is needed for calls in a loop that, if inlined, could
optimize out the re-fetching of x->done. ]
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1422013307-13200-1-git-send-email-der.herr@hofr.at
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Nicholas Mc Guire [Sat, 17 Jan 2015 04:05:34 +0000 (05:05 +0100)]
sched/completion: Remove unnecessary ->wait.lock serialization when reading completion state
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1421467534-22834-1-git-send-email-der.herr@hofr.at
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Davidlohr Bueso [Tue, 20 Jan 2015 01:39:21 +0000 (17:39 -0800)]
locking/mutex: Explicitly mark task as running after wakeup
By the time we wake up and get the lock after being asleep
in the slowpath, we better be running. As good practice,
be explicit about this and avoid any mischief.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1421717961.4903.11.camel@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 4 Feb 2015 06:53:17 +0000 (07:53 +0100)]
Merge tag 'v3.19-rc7' into locking/core, to refresh the branch before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Sharon Dvir [Sun, 1 Feb 2015 21:47:32 +0000 (23:47 +0200)]
sched/Documentation: Remove unneeded word
The second 'mutex' shouldn't be there, it can't be about the mutex,
as the mutex can't be freed, but unlocked, the memory where the
mutex resides however, can be freed.
Signed-off-by: Sharon Dvir <sharon.dvir1@mail.huji.ac.il>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1422827252-31363-1-git-send-email-sharon.dvir1@mail.huji.ac.il
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Johan Hedberg [Fri, 30 Jan 2015 11:14:36 +0000 (13:14 +0200)]
sched/wait: Introduce wait_on_bit_timeout()
Add a new wait_on_bit_timeout() helper, basically the same as
wait_on_bit() except that it also takes a 'timeout' parameter.
All the building blocks like bit_wait_timeout() and
out_of_line_wait_on_bit_timeout() are already in place so the
addition is rather simple.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: davem@davemloft.net
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1422616476-2917-2-git-send-email-johan.hedberg@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Frederic Weisbecker [Wed, 28 Jan 2015 00:24:09 +0000 (01:24 +0100)]
sched: Pull resched loop to __schedule() callers
__schedule() disables preemption during its job and re-enables it
afterward without doing a preemption check to avoid recursion.
But if an event happens after the context switch which requires
rescheduling, we need to check again if a task of a higher priority
needs the CPU. A preempt irq can raise such a situation. To handle that,
__schedule() loops on need_resched().
But preempt_schedule_*() functions, which call __schedule(), also loop
on need_resched() to handle missed preempt irqs. Hence we end up with
the same loop happening twice.
Lets simplify that by attributing the need_resched() loop responsibility
to all __schedule() callers.
There is a risk that the outer loop now handles reschedules that used
to be handled by the inner loop with the added overhead of caller details
(inc/dec of PREEMPT_ACTIVE, irq save/restore) but assuming those inner
rescheduling loop weren't too frequent, this shouldn't matter. Especially
since the whole preemption path is now losing one loop in any case.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1422404652-29067-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Xunlei Pang [Mon, 19 Jan 2015 04:49:37 +0000 (04:49 +0000)]
sched/deadline: Remove cpu_active_mask from cpudl_find()
cpu_active_mask is rarely changed (only on hotplug), so remove this
operation to gain a little performance.
If there is a change in cpu_active_mask, rq_online_dl() and
rq_offline_dl() should take care of it normally, so cpudl::free_cpus
carries enough information for us.
For the rare case when a task is put onto a dying cpu (which
rq_offline_dl() can't handle in a timely fashion), it will be
handled through _cpu_down()->...->multi_cpu_stop()->migration_call()
->migrate_tasks(), preventing the task from hanging on the
dead cpu.
Cc: Juri Lelli <juri.lelli@gmail.com>
Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
[peterz: changelog]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1421642980-10045-2-git-send-email-pang.xunlei@linaro.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Wanpeng Li [Wed, 26 Nov 2014 00:44:06 +0000 (08:44 +0800)]
sched: Fix hrtick_start() on UP
The commit
177ef2a6315e ("sched/deadline: Fix a precision problem in
the microseconds range") forgot to change the UP version of
hrtick_start(), do so now.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Fixes: 177ef2a6315e ("sched/deadline: Fix a precision problem in the microseconds range")
[ Fixed the changelog. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Kirill Tkhai <ktkhai@parallels.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1416962647-76792-7-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Wanpeng Li [Wed, 26 Nov 2014 00:44:04 +0000 (08:44 +0800)]
sched/deadline: Avoid pointless __setscheduler()
There is no need to dequeue/enqueue and push/pull if there are
no scheduling parameters changed for the DL class.
Both fair and RT classes already check if parameters changed for
them to avoid unnecessary overhead. This patch add the parameters
changed test for the DL class in order to reduce overhead.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
[ Fixed up the changelog. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Kirill Tkhai <ktkhai@parallels.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1416962647-76792-5-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Wed, 26 Nov 2014 00:44:03 +0000 (08:44 +0800)]
sched/deadline: Fix stale yield state
When we fail to start the deadline timer in update_curr_dl(), we
forget to clear ->dl_yielded, resulting in wrecked time keeping.
Since the natural place to clear both ->dl_yielded and ->dl_throttled
is in replenish_dl_entity(); both are after all waiting for that event;
make it so.
Luckily since
67dfa1b756f2 ("sched/deadline: Implement
cancel_dl_timer() to use in switched_from_dl()") the
task_on_rq_queued() condition in dl_task_timer() must be true, and can
therefore call enqueue_task_dl() unconditionally.
Reported-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kirill Tkhai <ktkhai@parallels.com>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1416962647-76792-4-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Wanpeng Li [Wed, 26 Nov 2014 00:44:01 +0000 (08:44 +0800)]
sched/deadline: Fix hrtick for a non-leftmost task
After update_curr_dl() the current task might not be the leftmost task
anymore. In that case do not start a new hrtick for it.
In this case NEED_RESCHED will be set and the next schedule will start
the hrtick for the new task if and when appropriate.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Acked-by: Juri Lelli <juri.lelli@arm.com>
[ Rewrote the changelog and comment. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kirill Tkhai <ktkhai@parallels.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1416962647-76792-2-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 4 Feb 2015 06:44:00 +0000 (07:44 +0100)]
Merge branch 'sched/urgent' into sched/core, to merge fixes before applying new patches
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Wed, 28 Jan 2015 14:08:03 +0000 (15:08 +0100)]
sched/deadline: Fix deadline parameter modification handling
Commit
67dfa1b756f2 ("sched/deadline: Implement cancel_dl_timer() to
use in switched_from_dl()") removed the hrtimer_try_cancel() function
call out from init_dl_task_timer(), which gets called from
__setparam_dl().
The result is that we can now re-init the timer while its active --
this is bad and corrupts timer state.
Furthermore; changing the parameters of an active deadline task is
tricky in that you want to maintain guarantees, while immediately
effective change would allow one to circumvent the CBS guarantees --
this too is bad, as one (bad) task should not be able to affect the
others.
Rework things to avoid both problems. We only need to initialize the
timer once, so move that to __sched_fork() for new tasks.
Then make sure __setparam_dl() doesn't affect the current running
state but only updates the parameters used to calculate the next
scheduling period -- this guarantees the CBS functions as expected
(albeit slightly pessimistic).
This however means we need to make sure __dl_clear_params() needs to
reset the active state otherwise new (and tasks flipping between
classes) will not properly (re)compute their first instance.
Todo: close class flipping CBS hole.
Todo: implement delayed BW release.
Reported-by: Luca Abeni <luca.abeni@unitn.it>
Acked-by: Juri Lelli <juri.lelli@arm.com>
Tested-by: Luca Abeni <luca.abeni@unitn.it>
Fixes: 67dfa1b756f2 ("sched/deadline: Implement cancel_dl_timer() to use in switched_from_dl()")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Kirill Tkhai <tkhai@yandex.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150128140803.GF23038@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 4 Feb 2015 05:57:24 +0000 (06:57 +0100)]
Merge branch 'liblockdep-fixes-3.19' of git://git./linux/kernel/git/sashal/linux into core/urgent
Pull liblockdep fixes from Sasha Levin.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Wed, 4 Feb 2015 04:12:57 +0000 (20:12 -0800)]
Merge tag 'rdma-for-linus' of git://git./linux/kernel/git/roland/infiniband
Pull infiniband reverts from Roland Dreier:
"Last minute InfiniBand/RDMA changes for 3.19:
- Revert IPoIB driver back to 3.18 state. We had a number of fixes
go into 3.19, but they introduced regressions. We tried to get
everything fixed up but ran out of time, so we'll try again for
3.20.
- Similarly, turn off the new "extended query port" verb. Late in
the cycle we realized the ABI is not quite right, and rather than
freeze something in a rush and make a mistake, we'll take a bit
more time and get it right in 3.20"
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/core: Temporarily disable ex_query_device uverb
Revert "IPoIB: Consolidate rtnl_lock tasks in workqueue"
Revert "IPoIB: Make the carrier_on_task race aware"
Revert "IPoIB: fix MCAST_FLAG_BUSY usage"
Revert "IPoIB: fix mcast_dev_flush/mcast_restart_task race"
Revert "IPoIB: change init sequence ordering"
Revert "IPoIB: Use dedicated workqueues per interface"
Revert "IPoIB: Make ipoib_mcast_stop_thread flush the workqueue"
Revert "IPoIB: No longer use flush as a parameter"
Linus Torvalds [Wed, 4 Feb 2015 03:54:57 +0000 (19:54 -0800)]
Merge tag 'md/3.19-fixes' of git://neil.brown.name/md
Pull two fixes for md from Neil Brown:
- Another live lock, needs backporting
- work-around false positive with new warnings.
* tag 'md/3.19-fixes' of git://neil.brown.name/md:
md/bitmap: fix a might_sleep() warning.
md/raid5: fix another livelock caused by non-aligned writes.
Myron Stowe [Tue, 3 Feb 2015 23:01:24 +0000 (16:01 -0700)]
PCI: Handle read-only BARs on AMD CS553x devices
Some AMD CS553x devices have read-only BARs because of a firmware or
hardware defect. There's a workaround in quirk_cs5536_vsa(), but it no
longer works after
36e8164882ca ("PCI: Restore detection of read-only
BARs"). Prior to
36e8164882ca, we filled in res->start; afterwards we
leave it zeroed out. The quirk only updated the size, so the driver tried
to use a region starting at zero, which didn't work.
Expand quirk_cs5536_vsa() to read the base addresses from the BARs and
hard-code the sizes.
On Nix's system BAR 2's read-only value is 0x6200. Prior to
36e8164882ca,
we interpret that as a 512-byte BAR based on the lowest-order bit set. Per
datasheet sec 5.6.1, that BAR (MFGPT) requires only 64 bytes; use that to
avoid clearing any address bits if a platform uses only 64-byte alignment.
[bhelgaas: changelog, reduce BAR 2 size to 64]
Fixes: 36e8164882ca ("PCI: Restore detection of read-only BARs")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85991#c4
Link: http://support.amd.com/TechDocs/31506_cs5535_databook.pdf
Link: http://support.amd.com/TechDocs/33238G_cs5536_db.pdf
Reported-and-tested-by: Nix <nix@esperi.org.uk>
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v.2.6.27+
Dave Chinner [Wed, 4 Feb 2015 00:29:05 +0000 (19:29 -0500)]
aio: annotate aio_read_event_ring for sleep patterns
Under CONFIG_DEBUG_ATOMIC_SLEEP=y, aio_read_event_ring() will throw
warnings like the following due to being called from wait_event
context:
WARNING: CPU: 0 PID: 16006 at kernel/sched/core.c:7300 __might_sleep+0x7f/0x90()
do not call blocking ops when !TASK_RUNNING; state=1 set at [<
ffffffff810d85a3>] prepare_to_wait_event+0x63/0x110
Modules linked in:
CPU: 0 PID: 16006 Comm: aio-dio-fcntl-r Not tainted 3.19.0-rc6-dgc+ #705
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffffffff821c0372 ffff88003c117cd8 ffffffff81daf2bd 000000000000d8d8
ffff88003c117d28 ffff88003c117d18 ffffffff8109beda ffff88003c117cf8
ffffffff821c115e 0000000000000061 0000000000000000 00007ffffe4aa300
Call Trace:
[<
ffffffff81daf2bd>] dump_stack+0x4c/0x65
[<
ffffffff8109beda>] warn_slowpath_common+0x8a/0xc0
[<
ffffffff8109bf56>] warn_slowpath_fmt+0x46/0x50
[<
ffffffff810d85a3>] ? prepare_to_wait_event+0x63/0x110
[<
ffffffff810d85a3>] ? prepare_to_wait_event+0x63/0x110
[<
ffffffff810bdfcf>] __might_sleep+0x7f/0x90
[<
ffffffff81db8344>] mutex_lock+0x24/0x45
[<
ffffffff81216b7c>] aio_read_events+0x4c/0x290
[<
ffffffff81216fac>] read_events+0x1ec/0x220
[<
ffffffff810d8650>] ? prepare_to_wait_event+0x110/0x110
[<
ffffffff810fdb10>] ? hrtimer_get_res+0x50/0x50
[<
ffffffff8121899d>] SyS_io_getevents+0x4d/0xb0
[<
ffffffff81dba5a9>] system_call_fastpath+0x12/0x17
---[ end trace
bde69eaf655a4fea ]---
There is not actually a bug here, so annotate the code to tell the
debug logic that everything is just fine and not to fire a false
positive.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Grygorii Strashko [Tue, 3 Feb 2015 15:01:58 +0000 (17:01 +0200)]
hwmon: (tmp102) add hibernation callbacks
Setting a dev_pm_ops suspend/resume pair but not a set of
hibernation functions means those pm functions will not be
called upon hibernation.
Fix this by using SIMPLE_DEV_PM_OPS, which appropriately
assigns the suspend and hibernation handlers and move
mp102_suspend/tmp102_resume under CONFIG_PM_SLEEP to avoid
build warnings.
Signed-off-by: Grygorii Strashko <Grygorii.Strashko@linaro.org>
[groeck: Declare tmp102_dev_pm_ops as static variable]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Linus Torvalds [Tue, 3 Feb 2015 19:36:57 +0000 (11:36 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull final block layer fixes from Jens Axboe:
"Unfortunately the hctx/ctx lifetime fix from last pull had some
issues. This pull request contains a revert of the problematic
commit, and a proper rewrite of it.
The rewrite has been tested by the users complaining about the
regression, and it works fine now. Additionally, I've run testing on
all the blk-mq use cases for it and it passes. So we should
definitely get this into 3.19, to avoid regression for some cases"
* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq: release mq's kobjects in blk_release_queue()
Revert "blk-mq: fix hctx/ctx kobject use-after-free"
Linus Torvalds [Tue, 3 Feb 2015 19:26:54 +0000 (11:26 -0800)]
Merge tag 'gpio-v3.19-5' of git://git./linux/kernel/git/linusw/linux-gpio
Pull gpio fixes from Linus Walleij:
"Yet more GPIO fixes for the v3.19 series.
There is a high bug-spot activity in GPIO this merge window, much due
to Johan Hovolds spearheading into actually exercising the removal
path for GPIO chips, something that was never really exercised before.
The other two fixes are augmenting erroneous behaviours in two
specific drivers for minor systems.
Summary from signed tag:
- Two fixes stabilizing that which was never stable before: removal
of GPIO chips, now let's stop leaking memory.
- Make sure OMAP IRQs are usable when the irqchip API is used
orthogonally to the gpiochip API.
- Provide a default GPIO base for the mcp23s08 driver"
* tag 'gpio-v3.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: sysfs: fix memory leak in gpiod_sysfs_set_active_low
gpio: sysfs: fix memory leak in gpiod_export_link
gpio: mcp23s08: handle default gpio base
gpio: omap: Fix bad device access with setup_irq()
Roland Dreier [Tue, 3 Feb 2015 17:29:25 +0000 (09:29 -0800)]
Merge branches 'ipoib' and 'odp' into for-next
Haggai Eran [Sun, 1 Feb 2015 13:35:30 +0000 (15:35 +0200)]
IB/core: Temporarily disable ex_query_device uverb
Commit
5a77abf9a97a ("IB/core: Add support for extended query device caps")
added a new extended verb to query the capabilities of RDMA devices, but the
semantics of this verb are still under debate [1].
Don't expose this verb to userspace until the ABI is nailed down.
[1] [PATCH v1 0/5] IB/core: extended query device caps cleanup for v3.19
http://www.spinics.net/lists/linux-rdma/msg22904.html
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Rafael J. Wysocki [Tue, 3 Feb 2015 13:29:43 +0000 (14:29 +0100)]
Revert "ACPI / LPSS: introduce a 'proxy' device to power on LPSS for DMA"
Revert commit
6c17ee44d524 (ACPI / LPSS: introduce a 'proxy' device
to power on LPSS for DMA), as it introduced registration and probe
ordering problems between devices on the LPSS that may lead to full
hard system hang on boot in some cases.
Eric Nelson [Fri, 30 Jan 2015 21:07:55 +0000 (14:07 -0700)]
ASoC: sgtl5000: add delay before first I2C access
To quote from section 1.3.1 of the data sheet:
The SGTL5000 has an internal reset that is deasserted
8 SYS_MCLK cycles after all power rails have been brought
up. After this time, communication can start
...
1.0us represents 8 SYS_MCLK cycles at the minimum 8.0 MHz SYS_MCLK.
Signed-off-by: Eric Nelson <eric.nelson@boundarydevices.com>
Reviewed-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Will Deacon [Thu, 29 Jan 2015 15:41:46 +0000 (16:41 +0100)]
ARM: 8299/1: mm: ensure local active ASID is marked as allocated on rollover
Commit
e1a5848e3398 ("ARM: 7924/1: mm: don't bother with reserved ttbr0
when running with LPAE") removed the use of the reserved TTBR0 value
for LPAE systems, since the ASID is held in the TTBR and can be updated
atomicly with the pgd of the next mm.
Unfortunately, this patch forgot to update flush_context, which
deliberately avoids marking the local active ASID as allocated, since we
used to switch via ASID zero and didn't need to allocate the ASID of
the previous mm. The side-effect of this is that we can allocate the
same ASID to the next mm and, between flushing the local TLB and updating
TTBR0, we can perform speculative TLB fills for userspace nG mappings
using the page table of the previous mm.
The consequence of this is that the next mm can erroneously hit some
mappings of the previous mm. Note that this was made significantly
harder to hit by
a391263cd84e ("ARM: 8203/1: mm: try to re-use old ASID
assignments following a rollover") but is still theoretically possible.
This patch fixes the problem by removing the code from flush_context
that forces the allocated ASID to zero for the local CPU. Many thanks
to the Broadcom guys for tracking this one down.
Fixes: e1a5848e3398 ("ARM: 7924/1: mm: don't bother with reserved ttbr0 when running with LPAE")
Cc: <stable@vger.kernel.org> # v3.14+
Reported-by: Raymond Ngun <rngun@broadcom.com>
Tested-by: Raymond Ngun <rngun@broadcom.com>
Reviewed-by: Gregory Fong <gregory.0xf0@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Robin Gong [Tue, 3 Feb 2015 02:25:53 +0000 (10:25 +0800)]
spi: imx: use pio mode for i.mx6dl
For TKT238285 hardware issue which may cause txfifo store data twice can only
be caught on i.mx6dl, we use pio mode instead of DMA mode on i.mx6dl.
Fixes: f62caccd12c17e4 (spi: spi-imx: add DMA support)
Signed-off-by: Robin Gong <b38343@freescale.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Ingo Molnar [Tue, 3 Feb 2015 11:24:08 +0000 (12:24 +0100)]
Merge tag 'pr-
20150201-x86-entry' of git://git./linux/kernel/git/luto/linux into x86/asm
Pull "x86: Entry cleanups and a bugfix for 3.20" from Andy Lutomirski:
" This fixes a bug in the RCU code I added in ist_enter. It also includes
the sysret stuff discussed here:
http://lkml.kernel.org/g/cover.
1421453410.git.luto%40amacapital.net "
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Tue, 3 Feb 2015 11:22:47 +0000 (12:22 +0100)]
Merge tag 'pr-
20150201-x86-vdso' of git://git./linux/kernel/git/luto/linux into x86/asm
Pull VDSO fix fro Andy Lutomirski:
"x86, vdso: One trivial last-minute VDSO build improvement
Andrey noticed that the VDSO build wasn't cleaning itself up. This
one-liner fixes it."
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Tue, 3 Feb 2015 11:22:18 +0000 (12:22 +0100)]
Merge tag 'v3.19-rc7' into x86/asm, to refresh the branch before pulling in new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Mikulas Patocka [Mon, 2 Feb 2015 14:39:02 +0000 (09:39 -0500)]
sched/wait: Remove might_sleep() from wait_event_cmd()
The patch
e22b886a8a43 ("sched/wait: Add might_sleep() checks")
introduced a bug in the raid5 subsystem.
The function raid5_quiesce() (and resize_stripes()) uses the 'cmd'
part to release and acquire a spinlock (so we call the sleep
primitives in atomic context), and therefore we cannot do the
might_sleep() check.
Remove it.
Fixes: e22b886a8a43 ("sched/wait: Add might_sleep() checks")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1502020935580.13510@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
David Vrabel [Mon, 2 Feb 2015 16:57:51 +0000 (16:57 +0000)]
xen-netback: stop the guest rx thread after a fatal error
After commit
e9d8b2c2968499c1f96563e6522c56958d5a1d0d (xen-netback:
disable rogue vif in kthread context), a fatal (protocol) error would
leave the guest Rx thread spinning, wasting CPU time. Commit
ecf08d2dbb96d5a4b4bcc53a39e8d29cc8fef02e (xen-netback: reintroduce
guest Rx stall detection) made this even worse by removing a
cond_resched() from this path.
Since a fatal error is non-recoverable, just allow the guest Rx thread
to exit. This requires taking additional refs to the task so the
thread exiting early is handled safely.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reported-by: Julien Grall <julien.grall@linaro.org>
Tested-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Mon, 2 Feb 2015 13:18:42 +0000 (15:18 +0200)]
net/mlx4_core: Fix kernel Oops (mem corruption) when working with more than 80 VFs
Commit
de966c592802 (net/mlx4_core: Support more than 64 VFs) was meant to
allow up to 126 VFs. However, due to leaving MLX4_MFUNC_MAX too low, using
more than 80 VFs resulted in memory corruptions (and Oopses) when more than
80 VFs were requested. In addition, the number of slaves was left too high.
This commit fixes these issues.
Fixes: de966c592802 ("net/mlx4_core: Support more than 64 VFs")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Sun, 1 Feb 2015 20:54:25 +0000 (23:54 +0300)]
isdn: off by one in connect_res()
The bug here is that we use "Reject" as the index into the cau_t[] array
in the else path. Since the cau_t[] has 9 elements if Reject == 9 then
we are reading beyond the end of the array.
My understanding of the code is that it's saying that if Reject is 1 or
too high then that's invalid and we should hang up.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 3 Feb 2015 03:30:53 +0000 (19:30 -0800)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for your net tree,
they are:
1) Validate hooks for nf_tables NAT expressions, otherwise users can
crash the kernel when using them from the wrong hook. We already
got one user trapped on this when configuring masquerading.
2) Fix a BUG splat in nf_tables with CONFIG_DEBUG_PREEMPT=y. Reported
by Andreas Schultz.
3) Avoid unnecessary reroute of traffic in the local input path
in IPVS that triggers a crash in in xfrm. Reported by Florian
Wiessner and fixes by Julian Anastasov.
4) Fix memory and module refcount leak from the error path of
nf_tables_newchain().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Richard Weinberger [Fri, 30 Jan 2015 19:50:44 +0000 (20:50 +0100)]
Documentation: Update netlink_mmap.txt
Update netlink_mmap.txt wrt. commit
4682a0358639b29cf
("netlink: Always copy on mmap TX.").
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
David L Stevens [Fri, 30 Jan 2015 17:29:45 +0000 (12:29 -0500)]
sunvnet: set queue mapping when doing packet copies
This patch fixes a bug where vnet_skb_shape() didn't set the already-selected
queue mapping when a packet copy was required. This results in using the
wrong queue index for stops/starts, hung tx queues and watchdog timeouts
under heavy load.
Signed-off-by: David L Stevens <david.stevens@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Leitner [Fri, 30 Jan 2015 11:56:01 +0000 (09:56 -0200)]
qlge: Fix qlge_update_hw_vlan_features to handle if interface is down
Currently qlge_update_hw_vlan_features() will always first put the
interface down, then update features and then bring it up again. But it
is possible to hit this code while the adapter is down and this causes a
non-paired call to napi_disable(), which will get stuck.
This patch fixes it by skipping these down/up actions if the interface
is already down.
Fixes: a45adbe8d352 ("qlge: Enhance nested VLAN (Q-in-Q) handling.")
Cc: Harish Patil <harish.patil@qlogic.com>
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Airlie [Tue, 3 Feb 2015 01:21:11 +0000 (11:21 +1000)]
Merge tag 'drm-amdkfd-fixes-2015-02-02' of git://people.freedesktop.org/~gabbayo/linux into drm-fixes
Three small fixes that came up during last week, nothing scary:
- Accidently incremented a counter instead of decrementing it (copy-paste error)
- Module parameter of max num of queues must be at least 1 and not 0
- Don't do BUG() as a result from wrong user input
* tag 'drm-amdkfd-fixes-2015-02-02' of git://people.freedesktop.org/~gabbayo/linux:
drm/amdkfd: Don't create BUG due to incorrect user parameter
drm/amdkfd: max num of queues can't be 0
drm/amdkfd: Fix bug in accounting of queues
Dave Airlie [Tue, 3 Feb 2015 01:20:39 +0000 (11:20 +1000)]
Merge branch 'drm-fixes-3.19' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
One last round of fixes for radeon for 3.19:
- fix some fallout from the reservation object integration on the
test/benchmark options
- fix a crash in the gpu vm code if gfx init fails
- fix a pll issue that leads to a blank screen on older IGP parts
* 'drm-fixes-3.19' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: fix the crash in test functions
drm/radeon: fix the crash in benchmark functions
drm/radeon: properly set vm fragment size for TN/RL
drm/radeon: don't init gpuvm if accel is disabled (v3)
drm/radeon: fix PLLs on RS880 and older v2
Bhuvanchandra DV [Sat, 31 Jan 2015 16:33:25 +0000 (22:03 +0530)]
spi: fsl-dspi: Remove possible memory leak of 'chip'
Move the check for spi->bits_per_word
before allocation, to avoid memory leak.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bhuvanchandra DV <bhuvanchandra.dv@toradex.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Nobuhiro Iwamatsu [Fri, 30 Jan 2015 06:11:54 +0000 (15:11 +0900)]
spi: sh-msiof: Update calculation of frequency dividing
sh-msiof of frequency dividing does not perform the calculation, driver have
to manage setting value in the table. It is not possible to set frequency
dividing value close to the actual data in this way. This changes from
frequency dividing of table management to setting by calculation.
This driver is able to set a value close to the actual data.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Takashi Iwai [Fri, 30 Jan 2015 19:29:31 +0000 (20:29 +0100)]
regulator: Build sysfs entries with static attribute groups
Instead of calling device_create_file() manually after the device
registration, put all in attribute groups and filter the unwanted ones
via is_visible callback. This not only simplifies the code but also
avoids the possible race between the device registration and sysfs
registration.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Ian Abbott [Fri, 30 Jan 2015 18:43:33 +0000 (18:43 +0000)]
spi: spidev: Convert buf pointers for 32-bit compat SPI_IOC_MESSAGE(n)
The SPI_IOC_MESSAGE(n) ioctl commands' argument points to an array of n
struct spi_ioc_transfer elements. The spidev's compat_ioctl handler
just converts this pointer and passes it on to the unlocked_ioctl
handler to process it.
The tx_buf and rx_buf members of struct spi_ioc_transfer are of type
__u64 and hold pointer values. A 32-bit userspace application running
in a 64-bit kernel might not have widened the 32-bit pointers correctly
for the kernel. The application might have sign-extended the pointer to
when the kernel expects it to be zero-extended, or vice versa, leading
to an -EFAULT being returned by spidev_message() if the widened pointer
is invalid.
Handle the SPI_IOC_MESSAGE(n) ioctl commands specially in the
compat_ioctl handler, calling new function spidev_compat_ioctl_message()
to handle them. This processes them in the same way as the
unlocked_ioctl handler except that it uses compat_ptr() to convert the
tx_buf and rx_buf members of each struct spi_ioc_transfer element.
To save code, factor out part of the unlocked_ioctl handler into a new
function spidev_get_ioc_message(). This checks the ioctl command code
is a valid SPI_IOC_MESSAGE(n), determines n and copies the array of n
struct spi_ioc_transfer elements from userspace into dynamically
allocated memory, returning either a pointer to the memory, an
ERR_PTR(-err) value, or NULL (for SPI_IOC_MESSAGE(0)).
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Mark Brown <broonie@kernel.org>
Ilija Hadzic [Fri, 30 Jan 2015 05:38:44 +0000 (00:38 -0500)]
drm/radeon: fix the crash in test functions
radeon_copy_dma and radeon_copy_blit must be called with
a valid reservation object. Otherwise a crash will be provoked.
We borrow the object from vram BO.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=88464
Cc: stable@vger.kernel.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ilija Hadzic [Fri, 30 Jan 2015 05:38:43 +0000 (00:38 -0500)]
drm/radeon: fix the crash in benchmark functions
radeon_copy_dma and radeon_copy_blit must be called with
a valid reservation object. Otherwise a crash will be provoked.
We borrow the object from destination BO.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=88464
Cc: stable@vger.kernel.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 30 Jan 2015 14:52:24 +0000 (09:52 -0500)]
drm/radeon: properly set vm fragment size for TN/RL
Should be the same as cayman. We don't use VM by default
on NI parts so this isn't critical.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Alex Deucher [Wed, 28 Jan 2015 19:36:26 +0000 (14:36 -0500)]
drm/radeon: don't init gpuvm if accel is disabled (v3)
If acceleration is disabled, it does not make sense
to init gpuvm since nothing will use it. Moreover,
if radeon_vm_init() gets called it uses accel to try
and clear the pde tables, etc. which results in a bug.
v2: handle vm_fini as well
v3: handle bo_open/close as well
Bug:
https://bugs.freedesktop.org/show_bug.cgi?id=88786
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Christian König [Thu, 29 Jan 2015 15:01:03 +0000 (16:01 +0100)]
drm/radeon: fix PLLs on RS880 and older v2
This is a workaround for RS880 and older chips which seem to have
an additional limit on the minimum PLL input frequency.
v2: fix signed/unsigned warning
bugs:
https://bugzilla.kernel.org/show_bug.cgi?id=91861
https://bugzilla.kernel.org/show_bug.cgi?id=83461
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Guenter Roeck [Fri, 16 Jan 2015 18:12:50 +0000 (10:12 -0800)]
hwmon: (ads2828) Only keep data in device data structure if needed
The variables diff_input, ext_vref, and vref_mv are only used in the probe
function and therefore don't need to be kept in the device data structure.
Reviewed-and-Tested-by: Robert Rosengren <robert.rosengren@axis.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Guenter Roeck [Fri, 16 Jan 2015 18:08:27 +0000 (10:08 -0800)]
hwmon: (ads2828) Convert to use regmap
Simplify code and reduce code size by using regmap to access i2c registers.
Reviewed-and-Tested-by: Robert Rosengren <robert.rosengren@axis.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Charlotte Richardson [Mon, 2 Feb 2015 15:36:23 +0000 (09:36 -0600)]
PCI: Add NEC variants to Stratus ftServer PCIe DMI check
NEC OEMs the same platforms as Stratus does, which have multiple devices on
some PCIe buses under downstream ports.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=51331
Fixes: 1278998f8ff6 ("PCI: Work around Stratus ftServer broken PCIe hierarchy (fix DMI check)")
Signed-off-by: Charlotte Richardson <charlotte.richardson@stratus.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.5+
CC: Myron Stowe <myron.stowe@redhat.com>
Brian King [Thu, 29 Jan 2015 21:54:40 +0000 (15:54 -0600)]
sd: Fix max transfer length for 4k disks
The following patch fixes an issue observed with 4k sector disks
where the max_hw_sectors attribute was getting set too large in
sd_revalidate_disk. Since sdkp->max_xfer_blocks is in units
of SCSI logical blocks and queue_max_hw_sectors is in units of
512 byte blocks, on a 4k sector disk, every time we went through
sd_revalidate_disk, we were taking the current value of
queue_max_hw_sectors and increasing it by a factor of 8. Fix
this by only shifting sdkp->max_xfer_blocks.
Cc: stable@vger.kernel.org
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Mike Christie [Wed, 28 Jan 2015 09:46:53 +0000 (03:46 -0600)]
scsi: fix device handler detach oops
This fixes a regression caused by commit 1d5203 ("scsi: handle more device
handler setup/teardown in common code").
The bug is that the alua detach() callout will try to access the
sddev->scsi_dh_data, but we have already set it to NULL. This patch
moves the clearing of that field to after detach() is called.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Ricardo Ribalda Delgado [Mon, 2 Feb 2015 10:06:56 +0000 (11:06 +0100)]
spi/xilinx: Fix access invalid memory on xilinx_spi_tx
On 1 and 2 bytes per word, the transfer of the 3 last bytes will access
memory outside tx_ptr.
Although this has not trigger any error on real hardware, we should
better fix this.
Fixes: 24ba5e593f391507 (Remove rx_fn and tx_fn pointer)
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Mark Brown [Mon, 2 Feb 2015 12:13:31 +0000 (12:13 +0000)]
regmap: ac97: Clean up indentation
Signed-off-by: Mark Brown <broonie@kernel.org>
Jie Yang [Mon, 2 Feb 2015 08:55:27 +0000 (16:55 +0800)]
MAINTAINERS: ASoC: add maintainer for Intel BDW/HSW ASoC driver
Adding myself as the Intel BDW/HSW ASoC driver maintainer.
Signed-off-by: Jie Yang <yang.jie@intel.com>
Acked-by: Liam Girdwood <lgirdwood@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Oded Gabbay [Tue, 20 Jan 2015 12:57:19 +0000 (14:57 +0200)]
drm/amdkfd: Don't create BUG due to incorrect user parameter
This patch changes a BUG_ON() statement to pr_debug, in case the user tries to
update a non-existing queue.
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Ben Goz <ben.goz@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Oded Gabbay [Thu, 29 Jan 2015 08:33:34 +0000 (10:33 +0200)]
drm/amdkfd: max num of queues can't be 0
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Oded Gabbay [Thu, 29 Jan 2015 08:32:25 +0000 (10:32 +0200)]
drm/amdkfd: Fix bug in accounting of queues
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Eric Dumazet [Fri, 30 Jan 2015 05:35:05 +0000 (21:35 -0800)]
ipv4: tcp: get rid of ugly unicast_sock
In commit
be9f4a44e7d41 ("ipv4: tcp: remove per net tcp_sock")
I tried to address contention on a socket lock, but the solution
I chose was horrible :
commit
3a7c384ffd57e ("ipv4: tcp: unicast_sock should not land outside
of TCP stack") addressed a selinux regression.
commit
0980e56e506b ("ipv4: tcp: set unicast_sock uc_ttl to -1")
took care of another regression.
commit
b5ec8eeac46 ("ipv4: fix ip_send_skb()") fixed another regression.
commit
811230cd85 ("tcp: ipv4: initialize unicast_sock sk_pacing_rate")
was another shot in the dark.
Really, just use a proper socket per cpu, and remove the skb_orphan()
call, to re-enable flow control.
This solves a serious problem with FQ packet scheduler when used in
hostile environments, as we do not want to allocate a flow structure
for every RST packet sent in response to a spoofed packet.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NeilBrown [Mon, 2 Feb 2015 06:08:03 +0000 (17:08 +1100)]
md/bitmap: fix a might_sleep() warning.
commit
8eb23b9f35aae413140d3fda766a98092c21e9b0
sched: Debug nested sleeps
causes false-positive warnings in RAID5 code.
This annotation removes them and adds a comment
explaining why there is no real problem.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 1 Feb 2015 23:44:29 +0000 (10:44 +1100)]
md/raid5: fix another livelock caused by non-aligned writes.
If a non-page-aligned write is destined for a device which
is missing/faulty, we can deadlock.
As the target device is missing, a read-modify-write cycle
is not possible.
As the write is not for a full-page, a recontruct-write cycle
is not possible.
This should be handled by logic in fetch_block() which notices
there is a non-R5_OVERWRITE write to a missing device, and so
loads all blocks.
However since commit
67f455486d2ea2, that code requires
STRIPE_PREREAD_ACTIVE before it will active, and those circumstances
never set STRIPE_PREREAD_ACTIVE.
So: in handle_stripe_dirtying, if neither rmw or rcw was possible,
set STRIPE_DELAYED, which will cause STRIPE_PREREAD_ACTIVE be set
after a suitable delay.
Fixes: 67f455486d2ea20b2d94d6adf5b9b783d079e321
Cc: stable@vger.kernel.org (v3.16+)
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Tested-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Linus Torvalds [Mon, 2 Feb 2015 04:07:21 +0000 (20:07 -0800)]
Linux 3.19-rc7
Linus Torvalds [Sun, 1 Feb 2015 21:20:47 +0000 (13:20 -0800)]
Merge tag 'armsoc-for-linus' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"One more week's worth of fixes. Worth pointing out here are:
- A patch fixing detaching of iommu registrations when a device is
removed -- earlier the ops pointer wasn't managed properly
- Another set of Renesas boards get the same GIC setup fixup as
others have in previous -rcs
- Serial port aliases fixups for sunxi. We did the same to tegra but
we caught that in time before the merge window due to more machines
being affected. Here it took longer for anyone to notice.
- A couple more DT tweaks on sunxi
- A follow-up patch for the mvebu coherency disabling in last -rc
batch"
* tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm: dma-mapping: Set DMA IOMMU ops in arm_iommu_attach_device()
ARM: shmobile: r8a7790: Instantiate GIC from C board code in legacy builds
ARM: shmobile: r8a73a4: Instantiate GIC from C board code in legacy builds
ARM: mvebu: don't set the PL310 in I/O coherency mode when I/O coherency is disabled
ARM: sunxi: dt: Fix aliases
ARM: dts: sun4i: Add simplefb node with de_fe0-de_be0-lcd0-hdmi pipeline
ARM: dts: sun6i: ippo-q8h-v5: Fix serial0 alias
ARM: dts: sunxi: Fix usb-phy support for sun4i/sun5i
Linus Torvalds [Sun, 1 Feb 2015 21:16:40 +0000 (13:16 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input layer updates from Dmitry Torokhov:
"Just a few quirks for PS/2 this time"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elantech - add more Fujtisu notebooks to force crc_enabled
Input: i8042 - add noloop quirk for Medion Akoya E7225 (MD98857)
Input: synaptics - adjust min/max for Lenovo ThinkPad X1 Carbon 2nd
Linus Torvalds [Sun, 1 Feb 2015 20:23:32 +0000 (12:23 -0800)]
sched: don't cause task state changes in nested sleep debugging
Commit
8eb23b9f35aa ("sched: Debug nested sleeps") added code to report
on nested sleep conditions, which we generally want to avoid because the
inner sleeping operation can re-set the thread state to TASK_RUNNING,
but that will then cause the outer sleep loop not actually sleep when it
calls schedule.
However, that's actually valid traditional behavior, with the inner
sleep being some fairly rare case (like taking a sleeping lock that
normally doesn't actually need to sleep).
And the debug code would actually change the state of the task to
TASK_RUNNING internally, which makes that kind of traditional and
working code not work at all, because now the nested sleep doesn't just
sometimes cause the outer one to not block, but will cause it to happen
every time.
In particular, it will cause the cardbus kernel daemon (pccardd) to
basically busy-loop doing scheduling, converting a laptop into a heater,
as reported by Bruno Prémont. But there may be other legacy uses of
that nested sleep model in other drivers that are also likely to never
get converted to the new model.
This fixes both cases:
- don't set TASK_RUNNING when the nested condition happens (note: even
if WARN_ONCE() only _warns_ once, the return value isn't whether the
warning happened, but whether the condition for the warning was true.
So despite the warning only happening once, the "if (WARN_ON(..))"
would trigger for every nested sleep.
- in the cases where we knowingly disable the warning by using
"sched_annotate_sleep()", don't change the task state (that is used
for all core scheduling decisions), instead use '->task_state_change'
that is used for the debugging decision itself.
(Credit for the second part of the fix goes to Oleg Nesterov: "Can't we
avoid this subtle change in behaviour DEBUG_ATOMIC_SLEEP adds?" with the
suggested change to use 'task_state_change' as part of the test)
Reported-and-bisected-by: Bruno Prémont <bonbons@linux-vserver.org>
Tested-by: Rafael J Wysocki <rjw@rjwysocki.net>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Cc: Ilya Dryomov <ilya.dryomov@inktank.com>,
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Hurley <peter@hurleysoftware.com>,
Cc: Davidlohr Bueso <dave@stgolabs.net>,
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rainer Koenig [Tue, 27 Jan 2015 23:15:11 +0000 (15:15 -0800)]
Input: elantech - add more Fujtisu notebooks to force crc_enabled
Add two more Fujitsu LIFEBOOK models that also ship with the Elantech
touchpad and don't work with crc_disabled to the quirk list.
Signed-off-by: Rainer Koenig <Rainer.Koenig@ts.fujitsu.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Olof Johansson [Sun, 1 Feb 2015 16:51:12 +0000 (08:51 -0800)]
Merge tag 'renesas-soc-fixes3-for-v3.19' of git://git./linux/kernel/git/horms/renesas into fixes
Merge "Third Round of Renesas ARM Based SoC Fixes for v3.19" from Simon Horman:
* Instantiate GIC from C board code in legacy builds on r8a7790 and r8a73a4
* tag 'renesas-soc-fixes3-for-v3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: r8a7790: Instantiate GIC from C board code in legacy builds
ARM: shmobile: r8a73a4: Instantiate GIC from C board code in legacy builds
Signed-off-by: Olof Johansson <olof@lixom.net>
Andy Lutomirski [Mon, 7 Jul 2014 18:37:17 +0000 (11:37 -0700)]
x86_64, entry: Remove the syscall exit audit and schedule optimizations
We used to optimize rescheduling and audit on syscall exit. Now
that the full slow path is reasonably fast, remove these
optimizations. Syscall exit auditing is now handled exclusively by
syscall_trace_leave.
This adds something like 10ns to the previously optimized paths on
my computer, presumably due mostly to SAVE_REST / RESTORE_REST.
I think that we should eventually replace both the syscall and
non-paranoid interrupt exit slow paths with a pair of C functions
along the lines of the syscall entry hooks.
Link: http://lkml.kernel.org/r/22f2aa4a0361707a5cfb1de9d45260b39965dead.1421453410.git.luto@amacapital.net
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Andy Lutomirski [Tue, 22 Jul 2014 19:46:50 +0000 (12:46 -0700)]
x86_64, entry: Use sysret to return to userspace when possible
The x86_64 entry code currently jumps through complex and
inconsistent hoops to try to minimize the impact of syscall exit
work. For a true fast-path syscall, almost nothing needs to be
done, so returning is just a check for exit work and sysret. For a
full slow-path return from a syscall, the C exit hook is invoked if
needed and we join the iret path.
Using iret to return to userspace is very slow, so the entry code
has accumulated various special cases to try to do certain forms of
exit work without invoking iret. This is error-prone, since it
duplicates assembly code paths, and it's dangerous, since sysret
can malfunction in interesting ways if used carelessly. It's
also inefficient, since a lot of useful cases aren't optimized
and therefore force an iret out of a combination of paranoia and
the fact that no one has bothered to write even more asm code
to avoid it.
I would argue that this approach is backwards. Rather than trying
to avoid the iret path, we should instead try to make the iret path
fast. Under a specific set of conditions, iret is unnecessary. In
particular, if RIP==RCX, RFLAGS==R11, RIP is canonical, RF is not
set, and both SS and CS are as expected, then
movq 32(%rsp),%rsp;sysret does the same thing as iret. This set of
conditions is nearly always satisfied on return from syscalls, and
it can even occasionally be satisfied on return from an irq.
Even with the careful checks for sysret applicability, this cuts
nearly 80ns off of the overhead from syscalls with unoptimized exit
work. This includes tracing and context tracking, and any return
that invokes KVM's user return notifier. For example, the cost of
getpid with CONFIG_CONTEXT_TRACKING_FORCE=y drops from ~360ns to
~280ns on my computer.
This may allow the removal and even eventual conversion to C
of a respectable amount of exit asm.
This may require further tweaking to give the full benefit on Xen.
It may be worthwhile to adjust signal delivery and exec to try hit
the sysret path.
This does not optimize returns to 32-bit userspace. Making the same
optimization for CS == __USER32_CS is conceptually straightforward,
but it will require some tedious code to handle the differences
between sysretl and sysexitl.
Link: http://lkml.kernel.org/r/71428f63e681e1b4aa1a781e3ef7c27f027d1103.1421453410.git.luto@amacapital.net
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Andy Lutomirski [Sat, 31 Jan 2015 12:53:53 +0000 (04:53 -0800)]
x86, traps: Fix ist_enter from userspace
context_tracking_user_exit() has no effect if in_interrupt() returns true,
so ist_enter() didn't work. Fix it by calling exception_enter(), and thus
context_tracking_user_exit(), before incrementing the preempt count.
This also adds an assertion that will catch the problem reliably if
CONFIG_PROVE_RCU=y to help prevent the bug from being reintroduced.
Link: http://lkml.kernel.org/r/261ebee6aee55a4724746d0d7024697013c40a08.1422709102.git.luto@amacapital.net
Fixes: 959274753857 x86, traps: Track entry into and exit from IST context
Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Eric Dumazet [Fri, 30 Jan 2015 01:30:12 +0000 (17:30 -0800)]
net: sched: fix panic in rate estimators
Doing the following commands on a non idle network device
panics the box instantly, because cpu_bstats gets overwritten
by stats.
tc qdisc add dev eth0 root <your_favorite_qdisc>
... some traffic (one packet is enough) ...
tc qdisc replace dev eth0 root est 1sec 4sec <your_favorite_qdisc>
[ 325.355596] BUG: unable to handle kernel paging request at
ffff8841dc5a074c
[ 325.362609] IP: [<
ffffffff81541c9e>] __gnet_stats_copy_basic+0x3e/0x90
[ 325.369158] PGD
1fa7067 PUD 0
[ 325.372254] Oops: 0000 [#1] SMP
[ 325.375514] Modules linked in: ...
[ 325.398346] CPU: 13 PID: 14313 Comm: tc Not tainted 3.19.0-smp-DEV #1163
[ 325.412042] task:
ffff8800793ab5d0 ti:
ffff881ff2fa4000 task.ti:
ffff881ff2fa4000
[ 325.419518] RIP: 0010:[<
ffffffff81541c9e>] [<
ffffffff81541c9e>] __gnet_stats_copy_basic+0x3e/0x90
[ 325.428506] RSP: 0018:
ffff881ff2fa7928 EFLAGS:
00010286
[ 325.433824] RAX:
000000000000000c RBX:
ffff881ff2fa796c RCX:
000000000000000c
[ 325.440988] RDX:
ffff8841dc5a0744 RSI:
0000000000000060 RDI:
0000000000000060
[ 325.448120] RBP:
ffff881ff2fa7948 R08:
ffffffff81cd4f80 R09:
0000000000000000
[ 325.455268] R10:
ffff883ff223e400 R11:
0000000000000000 R12:
000000015cba0744
[ 325.462405] R13:
ffffffff81cd4f80 R14:
ffff883ff223e460 R15:
ffff883feea0722c
[ 325.469536] FS:
00007f2ee30fa700(0000) GS:
ffff88407fa20000(0000) knlGS:
0000000000000000
[ 325.477630] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 325.483380] CR2:
ffff8841dc5a074c CR3:
0000003feeae9000 CR4:
00000000001407e0
[ 325.490510] Stack:
[ 325.492524]
ffff883feea0722c ffff883fef719dc0 ffff883feea0722c ffff883ff223e4a0
[ 325.499990]
ffff881ff2fa79a8 ffffffff815424ee ffff883ff223e49c 000000015cba0744
[ 325.507460]
00000000f2fa7978 0000000000000000 ffff881ff2fa79a8 ffff883ff223e4a0
[ 325.514956] Call Trace:
[ 325.517412] [<
ffffffff815424ee>] gen_new_estimator+0x8e/0x230
[ 325.523250] [<
ffffffff815427aa>] gen_replace_estimator+0x4a/0x60
[ 325.529349] [<
ffffffff815718ab>] tc_modify_qdisc+0x52b/0x590
[ 325.535117] [<
ffffffff8155edd0>] rtnetlink_rcv_msg+0xa0/0x240
[ 325.540963] [<
ffffffff8155ed30>] ? __rtnl_unlock+0x20/0x20
[ 325.546532] [<
ffffffff8157f811>] netlink_rcv_skb+0xb1/0xc0
[ 325.552145] [<
ffffffff8155b355>] rtnetlink_rcv+0x25/0x40
[ 325.557558] [<
ffffffff8157f0d8>] netlink_unicast+0x168/0x220
[ 325.563317] [<
ffffffff8157f47c>] netlink_sendmsg+0x2ec/0x3e0
Lets play safe and not use an union : percpu 'pointers' are mostly read
anyway, and we have typically few qdiscs per host.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Fixes: 22e0f8b9322c ("net: sched: make bstats per cpu and estimator RCU safe")
Signed-off-by: David S. Miller <davem@davemloft.net>
Haiyang Zhang [Thu, 29 Jan 2015 20:34:49 +0000 (12:34 -0800)]
hyperv: Fix the error processing in netvsc_send()
The existing code frees the skb in EAGAIN case, in which the skb will be
retried from upper layer and used again.
Also, the existing code doesn't free send buffer slot in error case, because
there is no completion message for unsent packets.
This patch fixes these problems.
(Please also include this patch for stable trees. Thanks!)
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 31 Jan 2015 18:34:25 +0000 (10:34 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"i2c driver bugfixes (s3c2410, slave-eeprom, sh_mobile), size
regression "bugfix" (i2c slave), documentation bugfix (st).
Also, one documentation update (da9063), so some devicetrees can now
be verified"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: sh_mobile: terminate DMA reads properly
i2c: Only include slave support if selected
i2c: s3c2410: fix ABBA deadlock by keeping clock prepared
i2c: slave-eeprom: fix boundary check when using sysfs
i2c: st: Rename clock reference to something that exists
DT: i2c: Add devices handled by the da9063 MFD driver
Linus Torvalds [Sat, 31 Jan 2015 03:49:44 +0000 (19:49 -0800)]
Merge tag 'char-misc-3.19-rc7' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are two tiny patches, one fixing up the drivers/Kconfig file, and
one adding a MAINTAINERS entry for the UIO git tree"
* tag 'char-misc-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
drivers/Kconfig: remove duplicate entry for soc
MAINTAINERS: add git url entry for UIO