Srivatsa S. Bhat [Sat, 7 Jun 2014 20:41:43 +0000 (02:11 +0530)]
cpufreq: governor: Be friendly towards latency-sensitive bursty workloads
Cpufreq governors like the ondemand governor calculate the load on the CPU
periodically by employing deferrable timers. A deferrable timer won't fire
if the CPU is completely idle (and there are no other timers to be run), in
order to avoid unnecessary wakeups and thus save CPU power.
However, the load calculation logic is agnostic to all this, and this can
lead to the problem described below.
Time (ms) CPU 1
100 Task-A running
110 Governor's timer fires, finds load as 100% in the last
10ms interval and increases the CPU frequency.
110.5 Task-A running
120 Governor's timer fires, finds load as 100% in the last
10ms interval and increases the CPU frequency.
125 Task-A went to sleep. With nothing else to do, CPU 1
went completely idle.
200 Task-A woke up and started running again.
200.5 Governor's deferred timer (which was originally programmed
to fire at time 130) fires now. It calculates load for the
time period 120 to 200.5, and finds the load is almost zero.
Hence it decreases the CPU frequency to the minimum.
210 Governor's timer fires, finds load as 100% in the last
10ms interval and increases the CPU frequency.
So, after the workload woke up and started running, the frequency was suddenly
dropped to absolute minimum, and after that, there was an unnecessary delay of
10ms (sampling period) to increase the CPU frequency back to a reasonable value.
And this pattern repeats for every wake-up-from-cpu-idle for that workload.
This can be quite undesirable for latency- or response-time sensitive bursty
workloads. So we need to fix the governor's logic to detect such wake-up-from-
cpu-idle scenarios and start the workload at a reasonably high CPU frequency.
One extreme solution would be to fake a load of 100% in such scenarios. But
that might lead to undesirable side-effects such as frequency spikes (which
might also need voltage changes) especially if the previous frequency happened
to be very low.
We just want to avoid the stupidity of dropping down the frequency to a minimum
and then enduring a needless (and long) delay before ramping it up back again.
So, let us simply carry forward the previous load - that is, let us just pretend
that the 'load' for the current time-window is the same as the load for the
previous window. That way, the frequency and voltage will continue to be set
to whatever values they were set at previously. This means that bursty workloads
will get a chance to influence the CPU frequency at which they wake up from
cpu-idle, based on their past execution history. Thus, they might be able to
avoid suffering from slow wakeups and long response-times.
However, we should take care not to over-do this. For example, such a "copy
previous load" logic will benefit cases like this: (where # represents busy
and . represents idle)
##########.........#########.........###########...........##########........
but it will be detrimental in cases like the one shown below, because it will
retain the high frequency (copied from the previous interval) even in a mostly
idle system:
##########.........#.................#.....................#...............
(i.e., the workload finished and the remaining tasks are such that their busy
periods are smaller than the sampling interval, which causes the timer to
always get deferred. So, this will make the copy-previous-load logic copy
the initial high load to subsequent idle periods over and over again, thus
keeping the frequency high unnecessarily).
So, we modify this copy-previous-load logic such that it is used only once
upon every wakeup-from-idle. Thus if we have 2 consecutive idle periods, the
previous load won't get blindly copied over; cpufreq will freshly evaluate the
load in the second idle interval, thus ensuring that the system comes back to
its normal state.
[ The right way to solve this whole problem is to teach the CPU frequency
governors to also track load on a per-task basis, not just a per-CPU basis,
and then use both the data sources intelligently to set the appropriate
frequency on the CPUs. But that involves redesigning the cpufreq subsystem,
so this patch should make the situation bearable until then. ]
Experimental results:
+-------------------+
I ran a modified version of ebizzy (called 'sleeping-ebizzy') that sleeps in
between its execution such that its total utilization can be a user-defined
value, say 10% or 20% (higher the utilization specified, lesser the amount of
sleeps injected). This ebizzy was run with a single-thread, tied to CPU 8.
Behavior observed with tracing (sample taken from 40% utilization runs):
------------------------------------------------------------------------
Without patch:
~~~~~~~~~~~~~~
kworker/8:2-12137 416.335742: cpu_frequency: state=
2061000 cpu_id=8
kworker/8:2-12137 416.335744: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40753 416.345741: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-12137 416.345744: cpu_frequency: state=
4123000 cpu_id=8
kworker/8:2-12137 416.345746: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40753 416.355738: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
<snip> --------------------------------------------------------------------- <snip>
<...>-40753 416.402202: sched_switch: prev_comm=ebizzy ==> next_comm=swapper/8
<idle>-0 416.502130: sched_switch: prev_comm=swapper/8 ==> next_comm=ebizzy
<...>-40753 416.505738: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-12137 416.505739: cpu_frequency: state=
2061000 cpu_id=8
kworker/8:2-12137 416.505741: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40753 416.515739: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-12137 416.515742: cpu_frequency: state=
4123000 cpu_id=8
kworker/8:2-12137 416.515744: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
Observation: Ebizzy went idle at 416.402202, and started running again at
416.502130. But cpufreq noticed the long idle period, and dropped the frequency
at 416.505739, only to increase it back again at 416.515742, realizing that the
workload is in-fact CPU bound. Thus ebizzy needlessly ran at the lowest frequency
for almost 13 milliseconds (almost 1 full sample period), and this pattern
repeats on every sleep-wakeup. This could hurt latency-sensitive workloads quite
a lot.
With patch:
~~~~~~~~~~~
kworker/8:2-29802 464.832535: cpu_frequency: state=
2061000 cpu_id=8
<snip> --------------------------------------------------------------------- <snip>
kworker/8:2-29802 464.962538: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40738 464.972533: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-29802 464.972536: cpu_frequency: state=
4123000 cpu_id=8
kworker/8:2-29802 464.972538: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40738 464.982531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
<snip> --------------------------------------------------------------------- <snip>
kworker/8:2-29802 465.022533: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40738 465.032531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-29802 465.032532: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40738 465.035797: sched_switch: prev_comm=ebizzy ==> next_comm=swapper/8
<idle>-0 465.240178: sched_switch: prev_comm=swapper/8 ==> next_comm=ebizzy
<...>-40738 465.242533: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-29802 465.242535: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40738 465.252531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
Observation: Ebizzy went idle at 465.035797, and started running again at
465.240178. Since ebizzy was the only real workload running on this CPU,
cpufreq retained the frequency at 4.1Ghz throughout the run of ebizzy, no
matter how many times ebizzy slept and woke-up in-between. Thus, ebizzy
got the 10ms worth of 4.1 Ghz benefit during every sleep-wakeup (as compared
to the run without the patch) and this boost gave a modest improvement in total
throughput, as shown below.
Sleeping-ebizzy records-per-second:
-----------------------------------
Utilization Without patch With patch Difference (Absolute and % values)
10% 274767 277046 + 2279 (+0.829%)
20% 543429 553484 + 10055 (+1.850%)
40%
1090744 1107959 + 17215 (+1.578%)
60%
1634908 1662018 + 27110 (+1.658%)
A rudimentary and somewhat approximately latency-sensitive workload such as
sleeping-ebizzy itself showed a consistent, noticeable performance improvement
with this patch. Hence, workloads that are truly latency-sensitive will benefit
quite a bit from this change. Moreover, this is an overall win-win since this
patch does not hurt power-savings at all (because, this patch does not reduce
the idle time or idle residency; and the high frequency of the CPU when it goes
to cpu-idle does not affect/hurt the power-savings of deep idle states).
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Ed Swarthout [Thu, 5 Jun 2014 23:56:17 +0000 (18:56 -0500)]
cpufreq: ppc-corenet-cpu-freq: do_div use quotient
Commit
6712d2931933 (cpufreq: ppc-corenet-cpufreq: Fix __udivdi3 modpost
error) used the remainder from do_div instead of the quotient. Fix that
and add one to ensure minimum is met.
Fixes: 6712d2931933 (cpufreq: ppc-corenet-cpufreq: Fix __udivdi3 modpost error)
Signed-off-by: Ed Swarthout <Ed.Swarthout@freescale.com>
Cc: 3.15+ <stable@vger.kernel.org> # 3.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Thu, 5 Jun 2014 21:50:10 +0000 (23:50 +0200)]
Revert "cpufreq: Enable big.LITTLE cpufreq driver on arm64"
This reverts commit
4920ab84979d (cpufreq: Enable big.LITTLE cpufreq
driver on arm64) that breaks build on arm64.
Fixes: 4920ab84979d (cpufreq: Enable big.LITTLE cpufreq driver on arm64)
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar [Mon, 2 Jun 2014 17:19:29 +0000 (22:49 +0530)]
cpufreq: Tegra: implement intermediate frequency callbacks
Tegra has been switching to intermediate frequency (pll_p_clk) forever.
CPUFreq core has better support for handling notifications for these
frequencies and so we can adapt Tegra's driver to it.
Also do a WARN() if clk_set_parent() fails while moving back to pll_x
as we should have atleast restored to earlier frequency on error.
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar [Mon, 2 Jun 2014 17:19:28 +0000 (22:49 +0530)]
cpufreq: add support for intermediate (stable) frequencies
Douglas Anderson, recently pointed out an interesting problem due to which
udelay() was expiring earlier than it should.
While transitioning between frequencies few platforms may temporarily switch to
a stable frequency, waiting for the main PLL to stabilize.
For example: When we transition between very low frequencies on exynos, like
between 200MHz and 300MHz, we may temporarily switch to a PLL running at 800MHz.
No CPUFREQ notification is sent for that. That means there's a period of time
when we're running at 800MHz but loops_per_jiffy is calibrated at between 200MHz
and 300MHz. And so udelay behaves badly.
To get this fixed in a generic way, introduce another set of callbacks
get_intermediate() and target_intermediate(), only for drivers with
target_index() and CPUFREQ_ASYNC_NOTIFICATION unset.
get_intermediate() should return a stable intermediate frequency platform wants
to switch to, and target_intermediate() should set CPU to that frequency,
before jumping to the frequency corresponding to 'index'. Core will take care of
sending notifications and driver doesn't have to handle them in
target_intermediate() or target_index().
NOTE: ->target_index() should restore to policy->restore_freq in case of
failures as core would send notifications for that.
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Tue, 3 Jun 2014 13:03:27 +0000 (15:03 +0200)]
Merge back earlier cpufreq material.
Conflicts:
arch/mips/loongson/lemote-2f/clock.c
drivers/cpufreq/intel_pstate.c
Doug Smythies [Fri, 30 May 2014 17:10:57 +0000 (10:10 -0700)]
intel_pstate: Improve initial busy calculation
This change makes the busy calculation using 64 bit math which prevents
overflow for large values of aperf/mperf.
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Dirk Brandewie [Thu, 29 May 2014 16:32:24 +0000 (09:32 -0700)]
intel_pstate: add sample time scaling
The PID assumes that samples are of equal time, which for a deferable
timers this is not true when the system goes idle. This causes the
PID to take a long time to converge to the min P state and depending
on the pattern of the idle load can make the P state appear stuck.
The hold-off value of three sample times before using the scaling is
to give a grace period for applications that have high performance
requirements and spend a lot of time idle, The poster child for this
behavior is the ffmpeg benchmark in the Phoronix test suite.
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Dirk Brandewie [Thu, 29 May 2014 16:32:23 +0000 (09:32 -0700)]
intel_pstate: Correct rounding in busy calculation
Changing to fixed point math throughout the busy calculation in
commit
e66c1768 (Change busy calculation to use fixed point
math.) Introduced some inaccuracies by rounding the busy value at two
points in the calculation. This change removes roundings and moves
the rounding to the output of the PID where the calculations are
complete and the value returned as an integer.
Fixes: e66c17683746 (intel_pstate: Change busy calculation to use fixed point math.)
Reported-by: Doug Smythies <dsmythies@telus.net>
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Dirk Brandewie [Thu, 29 May 2014 16:32:22 +0000 (09:32 -0700)]
intel_pstate: Remove C0 tracking
Commit
fcb6a15c (intel_pstate: Take core C0 time into account for core
busy calculation) introduced a regression referenced below. The issue
with "lockup" after suspend that this commit was addressing is now dealt
with in the suspend path.
Fixes: fcb6a15c2e7e (intel_pstate: Take core C0 time into account for core busy calculation)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=66581
Link: https://bugzilla.kernel.org/show_bug.cgi?id=75121
Reported-by: Doug Smythies <dsmythies@telus.net>
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Linus Torvalds [Mon, 2 Jun 2014 02:12:24 +0000 (19:12 -0700)]
Linux 3.15-rc8
Linus Torvalds [Mon, 2 Jun 2014 01:30:07 +0000 (18:30 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
Pull powerpc fix from Ben Herrenschmidt:
"Here's just one trivial patch to wire up sys_renameat2 which I seem to
have completely missed so far.
(My test build scripts fwd me warnings but miss the ones generated for
missing syscalls)"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Wire renameat2() syscall
Linus Torvalds [Mon, 2 Jun 2014 01:28:58 +0000 (18:28 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
"A fair number of fixes across the field. Nothing terribly
complicated; the one liners in below changelog should be fairly
descriptive.
Noteworthy is the SB1 change which the result of changes to binutils
resulting in one big gas warning for most files being assembled as
well as the asid_cache and branch emulation fixes which fix corruption
or possible uninteded behaviour of kernel or application code. The
remainder of fixes are more platforms or subsystem specific"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: R46000: Fix Micro-assembler field overflow for R4600 V2
MIPS: ptrace: Avoid smp_processor_id() in preemptible code
MIPS: Lemote 2F: cs5536: mfgpt: use raw locks
MIPS: SB1: Fix excessive kernel warnings.
MIPS: RC32434: fix broken PCI resource initialization
MIPS: malta: memory.c: Initialize the 'memsize' variable
MIPS: Fix typo when reporting cache and ftlb errors for ImgTec cores
MIPS: Fix inconsistancy of __NR_Linux_syscalls value
MIPS: Fix branch emulation of branch likely instructions.
MIPS: Fix a typo error in AUDIT_ARCH definition
MIPS: Change type of asid_cache to unsigned long
Linus Torvalds [Mon, 2 Jun 2014 01:26:59 +0000 (18:26 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Various fixlets, mostly related to the (root-only) SCHED_DEADLINE
policy, but also a hotplug bug fix and a fix for a NR_CPUS related
overallocation bug causing a suspend/resume regression"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Fix hotplug vs. set_cpus_allowed_ptr()
sched/cpupri: Replace NR_CPUS arrays
sched/deadline: Replace NR_CPUS arrays
sched/deadline: Restrict user params max value to 2^63 ns
sched/deadline: Change sched_getparam() behaviour vs SCHED_DEADLINE
sched: Disallow sched_attr::sched_policy < 0
sched: Make sched_setattr() correctly return -EFBIG
Benjamin Herrenschmidt [Sun, 1 Jun 2014 23:24:27 +0000 (09:24 +1000)]
powerpc: Wire renameat2() syscall
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Linus Torvalds [Sat, 31 May 2014 16:47:55 +0000 (09:47 -0700)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull core futex/rtmutex fixes from Thomas Gleixner:
"Three fixlets for long standing issues in the futex/rtmutex code
unearthed by Dave Jones syscall fuzzer:
- Add missing early deadlock detection checks in the futex code
- Prevent user space from attaching a futex to kernel threads
- Make the deadlock detector of rtmutex work again
Looks large, but is more comments than code change"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rtmutex: Fix deadlock detector for real
futex: Prevent attaching to kernel threads
futex: Add another early deadlock detection check
Linus Torvalds [Sat, 31 May 2014 16:19:02 +0000 (09:19 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Mostly quiet now:
i915:
fixing userspace visiblie issues, all stable marked
radeon:
one more pll fix, two crashers, one suspend/resume regression"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon: Resume fbcon last
drm/radeon: only allocate necessary size for vm bo list
drm/radeon: don't allow RADEON_GEM_DOMAIN_CPU for command submission
drm/radeon: avoid crash if VM command submission isn't available
drm/radeon: lower the ref * post PLL maximum once more
drm/i915: Prevent negative relocation deltas from wrapping
drm/i915: Only copy back the modified fields to userspace from execbuffer
drm/i915: Fix dynamic allocation of physical handles
Linus Torvalds [Sat, 31 May 2014 16:13:21 +0000 (09:13 -0700)]
dcache: add missing lockdep annotation
lock_parent() very much on purpose does nested locking of dentries, and
is careful to maintain the right order (lock parent first). But because
it didn't annotate the nested locking order, lockdep thought it might be
a deadlock on d_lock, and complained.
Add the proper annotation for the inner locking of the child dentry to
make lockdep happy.
Introduced by commit
046b961b45f9 ("shrink_dentry_list(): take parent's
->d_lock earlier").
Reported-and-tested-by: Josh Boyer <jwboyer@fedoraproject.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Daniel Vetter [Fri, 30 May 2014 14:41:23 +0000 (16:41 +0200)]
drm/radeon: Resume fbcon last
So a few people complained that
commit
177cf92de4aa97ec1435987e91696ed8b5023130
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Tue Apr 1 22:14:59 2014 +0200
drm/crtc-helpers: fix dpms on logic
which was merged into 3.15-rc1, broke resume on radeons. Strangely git
bisect lead everyone to
commit
25f397a429dfa43f22c278d0119a60a343aa568f
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Fri Jul 19 18:57:11 2013 +0200
drm/crtc-helper: explicit DPMS on after modeset
which was merged long ago and actually part of 3.14.
Digging deeper I've noticed (again) that the call to
drm_helper_resume_force_mode in the radeon resume handlers was a no-op
previously because everything gets shut down on suspend. radeon does
this with explicit calls to drm_helper_connector_dpms with DPMS_OFF.
But with 177c we now force the dpms state to ON, so suddenly
resume_force_mode actually forced the crtcs back on.
This is the intention of the change after all, the problem is that
radeon resumes the fbdev console layer _before_ restoring the display,
through calling fb_set_suspend. And fbcon does an immediate ->set_par,
which in turn causes the same forced mode restore to happen.
Two concurrent modeset operations didn't lead to happiness. Fix this
by delaying the fbcon resume until the end of the readeon resum
functions.
v2: Fix up a bit of the spelling fail.
References: https://lkml.org/lkml/2014/5/29/1043
References: https://lkml.org/lkml/2014/5/2/388
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=74751
Tested-by: Ken Moffat <zarniwhoop@ntlworld.com>
Cc: Alex Deucher <alexdeucher@gmail.com>
Cc: Ken Moffat <zarniwhoop@ntlworld.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@gmail.com>
Dave Airlie [Fri, 30 May 2014 23:19:05 +0000 (09:19 +1000)]
Merge branch 'drm-fixes-3.15' of git://people.freedesktop.org/~deathsimple/linux into drm-fixes
this is the next pull request for stashed up radeon fixes for 3.15. This is finally calming down with only four patches in this pull request.
* 'drm-fixes-3.15' of git://people.freedesktop.org/~deathsimple/linux:
drm/radeon: only allocate necessary size for vm bo list
drm/radeon: don't allow RADEON_GEM_DOMAIN_CPU for command submission
drm/radeon: avoid crash if VM command submission isn't available
drm/radeon: lower the ref * post PLL maximum once more
Linus Torvalds [Fri, 30 May 2014 19:07:48 +0000 (12:07 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input subsystem fixes from Dmitry Torokhov:
"A couple of driver/build fixups and also redone quirk for Synaptics
touchpads on Lenovo boxes (now using PNP IDs instead of DMI data to
limit number of quirks)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: synaptics - change min/max quirk table to pnp-id matching
Input: synaptics - add a matches_pnp_id helper function
Input: synaptics - T540p - unify with other LEN0034 models
Input: synaptics - add min/max quirk for the ThinkPad W540
Input: ambakmi - request a shared interrupt for AMBA KMI devices
Input: pxa27x-keypad - fix generating scancode
Input: atmel-wm97xx - only build for AVR32
Input: fix ps2/serio module dependency
Linus Torvalds [Fri, 30 May 2014 19:06:15 +0000 (12:06 -0700)]
Merge tag 'firewire-fixes' of git://git./linux/kernel/git/ieee1394/linux1394
Pull firewire fix from Stefan Richter:
"A regression fix for the IEEE 1394 subsystem: re-enable IRQ-based
asynchronous request reception at addresses below 128 TB"
* tag 'firewire-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: revert to 4 GB RDMA, fix protocols using Memory Space
Linus Torvalds [Fri, 30 May 2014 19:04:56 +0000 (12:04 -0700)]
Merge tag 'dm-3.15-fixes-3' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device-mapper fixes from Mike Snitzer:
"A dm-cache stable fix to split discards on cache block boundaries
because dm-cache cannot yet handle discards that span cache blocks.
Really fix a dm-mpath LOCKDEP warning that was introduced in -rc1.
Add a 'no_space_timeout' control to dm-thinp to restore the ability to
queue IO indefinitely when no data space is available. This fixes a
change in behavior that was introduced in -rc6 where the timeout
couldn't be disabled"
* tag 'dm-3.15-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm mpath: really fix lockdep warning
dm cache: always split discards on cache block boundaries
dm thin: add 'no_space_timeout' dm-thin-pool module param
Minchan Kim [Wed, 28 May 2014 06:53:59 +0000 (15:53 +0900)]
x86_64: expand kernel stack to 16K
While I play inhouse patches with much memory pressure on qemu-kvm,
3.14 kernel was randomly crashed. The reason was kernel stack overflow.
When I investigated the problem, the callstack was a little bit deeper
by involve with reclaim functions but not direct reclaim path.
I tried to diet stack size of some functions related with alloc/reclaim
so did a hundred of byte but overflow was't disappeard so that I encounter
overflow by another deeper callstack on reclaim/allocator path.
Of course, we might sweep every sites we have found for reducing
stack usage but I'm not sure how long it saves the world(surely,
lots of developer start to add nice features which will use stack
agains) and if we consider another more complex feature in I/O layer
and/or reclaim path, it might be better to increase stack size(
meanwhile, stack usage on 64bit machine was doubled compared to 32bit
while it have sticked to 8K. Hmm, it's not a fair to me and arm64
already expaned to 16K. )
So, my stupid idea is just let's expand stack size and keep an eye
toward stack consumption on each kernel functions via stacktrace of ftrace.
For example, we can have a bar like that each funcion shouldn't exceed 200K
and emit the warning when some function consumes more in runtime.
Of course, it could make false positive but at least, it could make a
chance to think over it.
I guess this topic was discussed several time so there might be
strong reason not to increase kernel stack size on x86_64, for me not
knowing so Ccing x86_64 maintainers, other MM guys and virtio
maintainers.
Here's an example call trace using up the kernel stack:
Depth Size Location (51 entries)
----- ---- --------
0) 7696 16 lookup_address
1) 7680 16 _lookup_address_cpa.isra.3
2) 7664 24 __change_page_attr_set_clr
3) 7640 392 kernel_map_pages
4) 7248 256 get_page_from_freelist
5) 6992 352 __alloc_pages_nodemask
6) 6640 8 alloc_pages_current
7) 6632 168 new_slab
8) 6464 8 __slab_alloc
9) 6456 80 __kmalloc
10) 6376 376 vring_add_indirect
11) 6000 144 virtqueue_add_sgs
12) 5856 288 __virtblk_add_req
13) 5568 96 virtio_queue_rq
14) 5472 128 __blk_mq_run_hw_queue
15) 5344 16 blk_mq_run_hw_queue
16) 5328 96 blk_mq_insert_requests
17) 5232 112 blk_mq_flush_plug_list
18) 5120 112 blk_flush_plug_list
19) 5008 64 io_schedule_timeout
20) 4944 128 mempool_alloc
21) 4816 96 bio_alloc_bioset
22) 4720 48 get_swap_bio
23) 4672 160 __swap_writepage
24) 4512 32 swap_writepage
25) 4480 320 shrink_page_list
26) 4160 208 shrink_inactive_list
27) 3952 304 shrink_lruvec
28) 3648 80 shrink_zone
29) 3568 128 do_try_to_free_pages
30) 3440 208 try_to_free_pages
31) 3232 352 __alloc_pages_nodemask
32) 2880 8 alloc_pages_current
33) 2872 200 __page_cache_alloc
34) 2672 80 find_or_create_page
35) 2592 80 ext4_mb_load_buddy
36) 2512 176 ext4_mb_regular_allocator
37) 2336 128 ext4_mb_new_blocks
38) 2208 256 ext4_ext_map_blocks
39) 1952 160 ext4_map_blocks
40) 1792 384 ext4_writepages
41) 1408 16 do_writepages
42) 1392 96 __writeback_single_inode
43) 1296 176 writeback_sb_inodes
44) 1120 80 __writeback_inodes_wb
45) 1040 160 wb_writeback
46) 880 208 bdi_writeback_workfn
47) 672 144 process_one_work
48) 528 112 worker_thread
49) 416 240 kthread
50) 176 176 ret_from_fork
[ Note: the problem is exacerbated by certain gcc versions that seem to
generate much bigger stack frames due to apparently bad coalescing of
temporaries and generating too many spills. Rusty saw gcc-4.6.4 using
35% more stack on the virtio path than 4.8.2 does, for example.
Minchan not only uses such a bad gcc version (4.6.3 in his case), but
some of the stack use is due to debugging (CONFIG_DEBUG_PAGEALLOC is
what causes that kernel_map_pages() frame, for example). But we're
clearly getting too close.
The VM code also seems to have excessive stack frames partly for the
same compiler reason, triggered by excessive inlining and lots of
function arguments.
We need to improve on our stack use, but in the meantime let's do this
simple stack increase too. Unlike most earlier reports, there is
nothing simple that stands out as being really horribly wrong here,
apart from the fact that the stack frames are just bigger than they
should need to be. - Linus ]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael S Tsirkin <mst@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: PJ Waskiewicz <pjwaskiewicz@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 30 May 2014 16:52:55 +0000 (09:52 -0700)]
Merge branch 'for-linus-2' of git://git./linux/kernel/git/viro/vfs
Pull vfs dcache livelock fix from Al Viro:
"Fixes for livelocks in shrink_dentry_list() introduced by fixes to
shrink list corruption; the root cause was that trylock of parent's
->d_lock could be disrupted by d_walk() happening on other CPUs,
resulting in shrink_dentry_list() making no progress *and* the same
d_walk() being called again and again for as long as
shrink_dentry_list() doesn't get past that mess.
The solution is to have shrink_dentry_list() treat that trylock
failure not as 'try to do the same thing again', but 'lock them in the
right order'"
* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
dentry_kill() doesn't need the second argument now
dealing with the rest of shrink_dentry_list() livelock
shrink_dentry_list(): take parent's ->d_lock earlier
expand dentry_kill(dentry, 0) in shrink_dentry_list()
split dentry_kill()
lift the "already marked killed" case into shrink_dentry_list()
Al Viro [Thu, 29 May 2014 13:18:26 +0000 (09:18 -0400)]
dentry_kill() doesn't need the second argument now
it's 1 in the only remaining caller.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 29 May 2014 13:11:45 +0000 (09:11 -0400)]
dealing with the rest of shrink_dentry_list() livelock
We have the same problem with ->d_lock order in the inner loop, where
we are dropping references to ancestors. Same solution, basically -
instead of using dentry_kill() we use lock_parent() (introduced in the
previous commit) to get that lock in a safe way, recheck ->d_count
(in case if lock_parent() has ended up dropping and retaking ->d_lock
and somebody managed to grab a reference during that window), trylock
the inode->i_lock and use __dentry_kill() to do the rest.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 29 May 2014 12:54:52 +0000 (08:54 -0400)]
shrink_dentry_list(): take parent's ->d_lock earlier
The cause of livelocks there is that we are taking ->d_lock on
dentry and its parent in the wrong order, forcing us to use
trylock on the parent's one. d_walk() takes them in the right
order, and unfortunately it's not hard to create a situation
when shrink_dentry_list() can't make progress since trylock
keeps failing, and shrink_dcache_parent() or check_submounts_and_drop()
keeps calling d_walk() disrupting the very shrink_dentry_list() it's
waiting for.
Solution is straightforward - if that trylock fails, let's unlock
the dentry itself and take locks in the right order. We need to
stabilize ->d_parent without holding ->d_lock, but that's doable
using RCU. And we'd better do that in the very beginning of the
loop in shrink_dentry_list(), since the checks on refcount, etc.
would need to be redone anyway.
That deals with a half of the problem - killing dentries on the
shrink list itself. Another one (dropping their parents) is
in the next commit.
locking parent is interesting - it would be easy to do rcu_read_lock(),
lock whatever we think is a parent, lock dentry itself and check
if the parent is still the right one. Except that we need to check
that *before* locking the dentry, or we are risking taking ->d_lock
out of order. Fortunately, once the D1 is locked, we can check if
D2->d_parent is equal to D1 without the need to lock D2; D2->d_parent
can start or stop pointing to D1 only under D1->d_lock, so taking
D1->d_lock is enough. In other words, the right solution is
rcu_read_lock/lock what looks like parent right now/check if it's
still our parent/rcu_read_unlock/lock the child.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Christian König [Wed, 28 May 2014 10:24:17 +0000 (12:24 +0200)]
drm/radeon: only allocate necessary size for vm bo list
No need to always allocate the theoretical maximum here.
Signed-off-by: Christian König <christian.koenig@amd.com>
Marek Olšák [Tue, 27 May 2014 00:56:36 +0000 (02:56 +0200)]
drm/radeon: don't allow RADEON_GEM_DOMAIN_CPU for command submission
It hangs the hardware.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org
Christian König [Wed, 21 May 2014 15:43:59 +0000 (17:43 +0200)]
drm/radeon: avoid crash if VM command submission isn't available
Signed-off-by: Christian König <christian.koenig@amd.com>
CC: stable@vger.kernel.org
Christian König [Wed, 21 May 2014 13:25:41 +0000 (15:25 +0200)]
drm/radeon: lower the ref * post PLL maximum once more
Let's be conservative and use 100 here until we find something better.
Bugs: https://bugzilla.kernel.org/show_bug.cgi?id=75241
Signed-off-by: Christian König <christian.koenig@amd.com>
Linus Torvalds [Fri, 30 May 2014 01:31:09 +0000 (18:31 -0700)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"The usual random collection of relatively small ARM fixes"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8063/1: bL_switcher: fix individual online status reporting of removed CPUs
ARM: 8064/1: fix v7-M signal return
ARM: 8057/1: amba: Add Qualcomm vendor ID.
ARM: 8052/1: unwind: Fix handling of "Pop r4-r[4+nnn],r14" opcode
ARM: 8051/1: put_user: fix possible data corruption in put_user
ARM: 8048/1: fix v7-M setup stack location
Linus Torvalds [Thu, 29 May 2014 21:14:43 +0000 (14:14 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fix from Will Deacon:
"Fix CoW regression for transparent hugepages by routing set_pmd_at to
set_pte_at, which correctly handles PTE_WRITE and will mark the
resulting table entry as read-only where appropriate"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: fix pmd_write CoW brokenness
Linus Torvalds [Thu, 29 May 2014 21:05:57 +0000 (14:05 -0700)]
Merge tag 'pm+acpi-3.15-rc8' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
"These are three stable-candidate fixes, one for the ACPI thermal
driver and two for cpufreq drivers.
Specifics:
- A workqueue is destroyed too early during the ACPI thermal driver
module unload which leads to a NULL pointer dereference in the
driver's remove callback. Fix from Aaron Lu.
- A wrong argument is passed to devm_regulator_get_optional() in the
probe routine of the cpu0 cpufreq driver which leads to resource
leaks if the driver is unbound from the cpufreq platform device.
Fix from Lucas Stach.
- A lock is missing in cpufreq_governor_dbs() which leads to memory
corruption and NULL pointer dereferences during system
suspend/resume, for example. Fix from Bibek Basu"
* tag 'pm+acpi-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / thermal: fix workqueue destroy order
cpufreq: cpu0: drop wrong devm usage
cpufreq: remove race while accessing cur_policy
Linus Torvalds [Thu, 29 May 2014 20:59:18 +0000 (13:59 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux
Pull clock fixes from Mike Turquette:
"Small number of user-visible regression fixes for clock drivers.
There is a memory leak fix for an ST platform, an infinite Loop Of
Doom fix for the recent changes to the basic clock divider (hopefully
the last fix for those recent changes) and some Tegra PLL changes
which keep PCI from being hosed on that platform"
* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux:
clk: st: Fix memory leak
clk: divider: Fix table round up function
clk: tegra: Fix enabling of PLLE
clk: tegra: Introduce divider mask and shift helpers
clk: tegra: Fix PLLE programming
Stefan Richter [Thu, 29 May 2014 13:23:26 +0000 (15:23 +0200)]
firewire: revert to 4 GB RDMA, fix protocols using Memory Space
Undo a feature introduced in v3.14 by commit
fcd46b34425d
"firewire: Enable remote DMA above 4 GB". That change raised the
minimum address at which protocol drivers and user programs can register
for request reception from 0x0001'0000'0000 to 0x8000'0000'0000.
It turned out that at least one vendor-specific protocol exists which
uses lower addresses: https://bugzilla.kernel.org/show_bug.cgi?id=76921
For the time being, revert most of commit
fcd46b34425d so that affected
protocols work like with kernel v3.13 and before. Just keep the valid
documentation parts from the regressing commit, and the ability to
identify controllers which could be programmed to accept >32 bit
physical DMA addresses. The rest of
fcd46b34425d should probably be
brought back as an optional instead of default feature.
Reported-by: Fabien Spindler <fabien.spindler@inria.fr>
Cc: <stable@vger.kernel.org> # 3.14+
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Al Viro [Wed, 28 May 2014 17:59:13 +0000 (13:59 -0400)]
expand dentry_kill(dentry, 0) in shrink_dentry_list()
Result will be massaged to saner shape in the next commits. It is
ugly, no questions - the point of that one is to be a provably
equivalent transformation (and it might be worth splitting a bit
more).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 28 May 2014 17:51:12 +0000 (13:51 -0400)]
split dentry_kill()
... into trylocks and everything else. The latter (actual killing)
is __dentry_kill().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Will Deacon [Tue, 27 May 2014 18:11:58 +0000 (19:11 +0100)]
arm64: mm: fix pmd_write CoW brokenness
Commit
9c7e535fcc17 ("arm64: mm: Route pmd thp functions through pte
equivalents") changed the pmd manipulator and accessor functions to
convert the target pmd to a pte, process it with the pte functions, then
convert it back. Along the way, we gained support for PTE_WRITE, however
this is completely ignored by set_pmd_at, and so we fail to set the
PMD_SECT_RDONLY for PMDs, resulting in all sorts of lovely failures (like
CoW not working).
Partially reverting the offending commit (by making use of
PMD_SECT_RDONLY explicitly for pmd_{write,wrprotect,mkwrite} functions)
leads to further issues because pmd_write can then return potentially
incorrect values for page table entries marked as RDONLY, leading to
BUG_ON(pmd_write(entry)) tripping under some THP workloads.
This patch fixes the issue by routing set_pmd_at through set_pte_at,
which correctly takes the PTE_WRITE flag into account. Given that
THP mappings are always anonymous, the additional cache-flushing code
in __sync_icache_dcache won't impose any significant overhead as the
flush will be skipped.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Viresh Kumar [Wed, 21 May 2014 08:59:29 +0000 (14:29 +0530)]
cpufreq: handle calls to ->target_index() in separate routine
Handling calls to ->target_index() has got complex over time and might become
more complex. So, its better to take target_index() bits out in another routine
__target_index() for better code readability. Shouldn't have any functional
impact.
Tested-by: Stephen Warren <swarren@nvidia.com>
Reviewed-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Linus Torvalds [Wed, 28 May 2014 18:17:41 +0000 (11:17 -0700)]
Merge tag 'sound-3.15-rc8' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Just two small stable fixes: an HD-audio fix for the new Intel
chipsets and a PM handling fix in PCM dmaengine core"
* tag 'sound-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix onboard audio on Intel H97/Z97 chipsets
ALSA: pcm_dmaengine: Add check during device suspend
Linus Torvalds [Wed, 28 May 2014 18:15:57 +0000 (11:15 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull vfs fix from Al Viro:
"Oh, well... Still nothing useful on that livelock (I had something
that looked kinda-sorta like a non-invasive solution, but it
deadlocks), so it's just Miklos' vmsplice fix for now"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vfs: fix vmplice_to_user()
Nicolas Pitre [Fri, 23 May 2014 21:31:44 +0000 (22:31 +0100)]
ARM: 8063/1: bL_switcher: fix individual online status reporting of removed CPUs
The content of /sys/devices/system/cpu/cpu*/online is still 1 for those
CPUs that the switcher has removed even though the global state in
/sys/devices/system/cpu/online is updated correctly.
It turns out that commit
0902a9044f ("Driver core: Use generic
offline/online for CPU offline/online") has changed the way those files
retrieve their content by relying on on the generic attribute handling
code. The switcher, by calling cpu_down() directly, bypasses this
handling and the attribute value doesn't get updated.
Fix this by calling device_offline()/device_online() instead.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Thomas Gleixner [Thu, 22 May 2014 03:25:39 +0000 (03:25 +0000)]
rtmutex: Fix deadlock detector for real
The current deadlock detection logic does not work reliably due to the
following early exit path:
/*
* Drop out, when the task has no waiters. Note,
* top_waiter can be NULL, when we are in the deboosting
* mode!
*/
if (top_waiter && (!task_has_pi_waiters(task) ||
top_waiter != task_top_pi_waiter(task)))
goto out_unlock_pi;
So this not only exits when the task has no waiters, it also exits
unconditionally when the current waiter is not the top priority waiter
of the task.
So in a nested locking scenario, it might abort the lock chain walk
and therefor miss a potential deadlock.
Simple fix: Continue the chain walk, when deadlock detection is
enabled.
We also avoid the whole enqueue, if we detect the deadlock right away
(A-A). It's an optimization, but also prevents that another waiter who
comes in after the detection and before the task has undone the damage
observes the situation and detects the deadlock and returns
-EDEADLOCK, which is wrong as the other task is not in a deadlock
situation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20140522031949.725272460@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Linus Torvalds [Wed, 28 May 2014 15:08:03 +0000 (08:08 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Small fixes for x86, slightly larger fixes for PPC, and a forgotten
s390 patch. The PPC fixes are important because they fix breakage
that is new in 3.15"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: s390: announce irqfd capability
KVM: x86: disable master clock if TSC is reset during suspend
KVM: vmx: disable APIC virtualization in nested guests
KVM guest: Make pv trampoline code executable
KVM: PPC: Book3S: ifdef on CONFIG_KVM_BOOK3S_32_HANDLER for 32bit
KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exit
KVM: PPC: Book3S: HV: make _PAGE_NUMA take effect
Linus Torvalds [Wed, 28 May 2014 15:06:50 +0000 (08:06 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
Pull two powerpc fixes from Ben Herrenschmidt:
"Here's a pair of powerpc fixes for 3.15 which are also going to
stable.
One's a fix for building with newer binutils (the problem currently
only affects the BookE kernels but the affected macro might come back
into use on BookS platforms at any time). Unfortunately, the binutils
maintainer did a backward incompatible change to a construct that we
use so we have to add Makefile check.
The other one is a fix for CPUs getting stuck in kexec when running
single threaded. Since we routinely use kexec on power (including in
our newer bootloaders), I deemed that important enough"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc, kexec: Fix "Processor X is stuck" issue during kexec from ST mode
powerpc: Fix 64 bit builds with binutils 2.24
Al Viro [Wed, 28 May 2014 13:48:44 +0000 (09:48 -0400)]
lift the "already marked killed" case into shrink_dentry_list()
It can happen only when dentry_kill() is called with unlock_on_failure
equal to 0 - other callers had dentry pinned until the moment they've
got ->d_lock and DCACHE_DENTRY_KILLED is set only after lockref_mark_dead().
IOW, only one of three call sites of dentry_kill() might end up reaching
that code. Just move it there.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Thomas Bogendoerfer [Tue, 8 Apr 2014 06:58:01 +0000 (08:58 +0200)]
MIPS: R46000: Fix Micro-assembler field overflow for R4600 V2
Fix uasm warning, which triggered because of workaround for R4600 V2 CPUs.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6716/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Alex Smith [Thu, 1 May 2014 11:51:19 +0000 (12:51 +0100)]
MIPS: ptrace: Avoid smp_processor_id() in preemptible code
ptrace_{get,set}_watch_regs access current_cpu_data to get the watch
register count/masks, which calls smp_processor_id(). However they are
run in preemptible context and therefore trigger warnings like so:
[ 6340.092000] BUG: using smp_processor_id() in preemptible [
00000000] code: gdb/367
[ 6340.092000] caller is ptrace_get_watch_regs+0x44/0x220
Since the watch register count/masks should be the same across all
CPUs, use boot_cpu_data instead. Note that this may need to change in
future should a heterogenous system be supported where the count/masks
are not the same across all CPUs (the current code is also incorrect
for this scenario - current_cpu_data here would not necessarily be
correct for the CPU that the target task will execute on).
Signed-off-by: Alex Smith <alex.smith@imgtec.com>
Reviewed-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6879/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Sebastian Andrzej Siewior [Tue, 13 May 2014 15:07:04 +0000 (17:07 +0200)]
MIPS: Lemote 2F: cs5536: mfgpt: use raw locks
The lock is taken in the raw irq path and therefore a rawlock should be
used instead of a normal spinlock.
While here I drop the export symbol on that variable since there are no
other users.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-mips@linux-mips.org
Cc: Hua Yan <yanh@lemote.com>
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Alex Smith <alex.smith@imgtec.com>
Cc: Hongliang Tao <taohl@lemote.com>
Cc: Wu Zhangjin <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/6936/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Wed, 28 May 2014 06:36:23 +0000 (08:36 +0200)]
MIPS: SB1: Fix excessive kernel warnings.
A kernel build with binutils 2.24 is going to emit warnings like
CC kernel/sys.o
{standard input}: Assembler messages:
{standard input}:701: Warning: the 32-bit MIPS architecture does not support the `mdmx' extension
{standard input}:701: Warning: the `mdmx' extension requires 64-bit FPRs
{standard input}:701: Warning: the `mips3d' extension requires MIPS32 revision 2 or greater
{standard input}:701: Warning: the `mips3d' extension requires 64-bit FPRs
for almost every file. This is caused by changes to gas' interpretation
of .set semantics. Fixed by explicitly disabling MIPS3D and MDMX for
Sibyte builds.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Miklos Szeredi [Tue, 27 May 2014 14:41:16 +0000 (16:41 +0200)]
vfs: fix vmplice_to_user()
Commit
6130f5315ee8 "switch vmsplice_to_user() to copy_page_to_iter()" in
v3.15-rc1 broke vmsplice(2).
This patch fixes two bugs:
- count is not initialized to a proper value, which resulted in no data
being copied
- if rw_copy_check_uvector() returns negative then the iov might be leaked.
Tested OK.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Mike Turquette [Wed, 28 May 2014 04:11:08 +0000 (21:11 -0700)]
Merge tag 'clk-tegra-fixes-3.15' of git://nv-tegra.nvidia.com/user/pdeschrijver/linux into clk-fixes
PLLE fixes for 3.15
Srivatsa S. Bhat [Tue, 27 May 2014 10:55:34 +0000 (16:25 +0530)]
powerpc, kexec: Fix "Processor X is stuck" issue during kexec from ST mode
If we try to perform a kexec when the machine is in ST (Single-Threaded) mode
(ppc64_cpu --smt=off), the kexec operation doesn't succeed properly, and we
get the following messages during boot:
[ 0.089866] POWER8 performance monitor hardware support registered
[ 0.089985] power8-pmu: PMAO restore workaround active.
[ 5.095419] Processor 1 is stuck.
[ 10.097933] Processor 2 is stuck.
[ 15.100480] Processor 3 is stuck.
[ 20.102982] Processor 4 is stuck.
[ 25.105489] Processor 5 is stuck.
[ 30.108005] Processor 6 is stuck.
[ 35.110518] Processor 7 is stuck.
[ 40.113369] Processor 9 is stuck.
[ 45.115879] Processor 10 is stuck.
[ 50.118389] Processor 11 is stuck.
[ 55.120904] Processor 12 is stuck.
[ 60.123425] Processor 13 is stuck.
[ 65.125970] Processor 14 is stuck.
[ 70.128495] Processor 15 is stuck.
[ 75.131316] Processor 17 is stuck.
Note that only the sibling threads are stuck, while the primary threads (0, 8,
16 etc) boot just fine. Looking closer at the previous step of kexec, we observe
that kexec tries to wakeup (bring online) the sibling threads of all the cores,
before performing kexec:
[ 9464.131231] Starting new kernel
[ 9464.148507] kexec: Waking offline cpu 1.
[ 9464.148552] kexec: Waking offline cpu 2.
[ 9464.148600] kexec: Waking offline cpu 3.
[ 9464.148636] kexec: Waking offline cpu 4.
[ 9464.148671] kexec: Waking offline cpu 5.
[ 9464.148708] kexec: Waking offline cpu 6.
[ 9464.148743] kexec: Waking offline cpu 7.
[ 9464.148779] kexec: Waking offline cpu 9.
[ 9464.148815] kexec: Waking offline cpu 10.
[ 9464.148851] kexec: Waking offline cpu 11.
[ 9464.148887] kexec: Waking offline cpu 12.
[ 9464.148922] kexec: Waking offline cpu 13.
[ 9464.148958] kexec: Waking offline cpu 14.
[ 9464.148994] kexec: Waking offline cpu 15.
[ 9464.149030] kexec: Waking offline cpu 17.
Instrumenting this piece of code revealed that the cpu_up() operation actually
fails with -EBUSY. Thus, only the primary threads of all the cores are online
during kexec, and hence this is a sure-shot receipe for disaster, as explained
in commit
e8e5c2155b (powerpc/kexec: Fix orphaned offline CPUs across kexec),
as well as in the comment above wake_offline_cpus().
It turns out that cpu_up() was returning -EBUSY because the variable
'cpu_hotplug_disabled' was set to 1; and this disabling of CPU hotplug was done
by migrate_to_reboot_cpu() inside kernel_kexec().
Now, migrate_to_reboot_cpu() was originally written with the assumption that
any further code will not need to perform CPU hotplug, since we are anyway in
the reboot path. However, kexec is clearly not such a case, since we depend on
onlining CPUs, atleast on powerpc.
So re-enable cpu-hotplug after returning from migrate_to_reboot_cpu() in the
kexec path, to fix this regression in kexec on powerpc.
Also, wrap the cpu_up() in powerpc kexec code within a WARN_ON(), so that we
can catch such issues more easily in the future.
Fixes: c97102ba963 (kexec: migrate to reboot cpu)
Cc: stable@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Guenter Roeck [Thu, 15 May 2014 16:33:42 +0000 (09:33 -0700)]
powerpc: Fix 64 bit builds with binutils 2.24
With binutils 2.24, various 64 bit builds fail with relocation errors
such as
arch/powerpc/kernel/built-in.o: In function `exc_debug_crit_book3e':
(.text+0x165ee): relocation truncated to fit: R_PPC64_ADDR16_HI
against symbol `interrupt_base_book3e' defined in .text section
in arch/powerpc/kernel/built-in.o
arch/powerpc/kernel/built-in.o: In function `exc_debug_crit_book3e':
(.text+0x16602): relocation truncated to fit: R_PPC64_ADDR16_HI
against symbol `interrupt_end_book3e' defined in .text section
in arch/powerpc/kernel/built-in.o
The assembler maintainer says:
I changed the ABI, something that had to be done but unfortunately
happens to break the booke kernel code. When building up a 64-bit
value with lis, ori, shl, oris, ori or similar sequences, you now
should use @high and @higha in place of @h and @ha. @h and @ha
(and their associated relocs R_PPC64_ADDR16_HI and R_PPC64_ADDR16_HA)
now report overflow if the value is out of 32-bit signed range.
ie. @h and @ha assume you're building a 32-bit value. This is needed
to report out-of-range -mcmodel=medium toc pointer offsets in @toc@h
and @toc@ha expressions, and for consistency I did the same for all
other @h and @ha relocs.
Replacing @h with @high in one strategic location fixes the relocation
errors. This has to be done conditionally since the assembler either
supports @h or @high but not both.
Cc: <stable@vger.kernel.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Dave Airlie [Tue, 27 May 2014 23:18:32 +0000 (09:18 +1000)]
Merge tag 'drm-intel-fixes-2014-05-27' of git://anongit.freedesktop.org/drm-intel into drm-fixes
Fixes from Chris, all cc: stable.
* tag 'drm-intel-fixes-2014-05-27' of git://anongit.freedesktop.org/drm-intel:
drm/i915: Prevent negative relocation deltas from wrapping
drm/i915: Only copy back the modified fields to userspace from execbuffer
drm/i915: Fix dynamic allocation of physical handles
Linus Torvalds [Tue, 27 May 2014 22:59:34 +0000 (15:59 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull virtio_blk fix from Jens Axboe:
"There's a start/stop queue race in virtio_blk, which causes stalls and
erratic behaviour for some. I've had this queued up for 3.16 for a
while, but I think we should push it into the current series as well.
So I cherry picked the commit and added a stable marker as well, so it
can propagate down"
* 'for-linus' of git://git.kernel.dk/linux-block:
virtio_blk: fix race between start and stop queue
Linus Torvalds [Tue, 27 May 2014 22:58:20 +0000 (15:58 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull two timer fixes from Thomas Gleixner:
"Two small fixlets for ARM SoC clocksource drivers:
- avoid calling functions which might sleep from interrupt [disabled]
context in tcb_clksrc used on Atmel SoCs
- use irq_force_affinity() to pin the per cpu timer interrupt on a
not yet online cpu in the SiRFprimaII driver"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: tcb_clksrc: Make tc_mode interrupt safe
clocksource: marco: Fix the affinity set for local timer of CPU1
Linus Torvalds [Tue, 27 May 2014 20:59:24 +0000 (13:59 -0700)]
Merge tag 'fixes-for-3.15' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A slightly larger set of fixes than we'd like at this point in the
release. Hopefully our very last batch before 3.15:
OMAP:
- Fix boot regression with CPU_IDLE enabled
- Fixes for audio playback on OMAP5
- Clock rate setting fix for OMAP3
- Misc idle/PM fixes
Exynos:
- Removal of a couple of power domains to work around issues with
access when they are powered down
- Enabling missing highspeed-i2c driver to make MMC regulators work
- Secondary CPU spin-up fix for 4212
- Remove MDMA1 engine to avoid conflicts on secure mode platforms
- A few other DT fixes
Marvell:
- PCI-e fixes for clocks and resource allocation
plus a few other smaller fixes, add a MAINTAINERS entry for reset
drivers, etc"
* tag 'fixes-for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (21 commits)
MAINTAINERS: Add reset controller framework entry
ARM: trusted_foundations: fix compile error on non-SMP
ARM: at91: sam9260: fix compilation issues
ARM: mvebu: fix definitions of PCIe interfaces on Armada 38x
ARM: imx: fix error handling in ipu device registration
ARM: OMAP4: Fix the boot regression with CPU_IDLE enabled
ARM: dts: Keep LDO4 always ON for exynos5250-arndale board
ARM: dts: Fix SPI interrupt numbers for exynos5420
ARM: dts: fix incorrect ak8975 compatible for exynos4412-trats2 board
ARM: OMAP2+: Fix DMA hang after off-idle
ARM: OMAP2+: nand: Fix NAND on OMAP2 and OMAP3 boards
ARM: dts: Remove g2d_pd node for exynos5420
ARM: dts: Remove mau_pd node for exynos5420
ARM: exynos_defconfig: enable HS-I2C to fix for mmc partition mount
ARM: dts: disable MDMA1 node for exynos5420
ARM: EXYNOS: fix the secondary CPU boot of exynos4212
ARM: omap5: hwmod_data: Correct IDLEMODE for McPDM
ARM: mvebu: mvebu-soc-id: keep clock enabled if PCIe unit is enabled
ARM: mvebu: mvebu-soc-id: add missing clk_put() call
ARM: at91/dt: sam9260: correct external trigger value
...
Linus Torvalds [Tue, 27 May 2014 20:58:46 +0000 (13:58 -0700)]
Merge tag 'pinctrl-v3.15-4' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pinctrl fix from Linus Walleij:
"A single last pinctrl fix for the v3.15 series: the vt8500 driver was
failing to update the output value when the combined set direction
output and set value was executed"
* tag 'pinctrl-v3.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: vt8500: Ensure value reg is updated when setting direction
Linus Torvalds [Tue, 27 May 2014 20:57:00 +0000 (13:57 -0700)]
Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave-dmaengine fixes from Vinod Koul:
"We have three small fixes.
First one from Andy reverts the devm_request irq as we need to ensure
the tasklet is killed after irq is freed, so we need to do free irq in
our code. Other two from Arnd are fixing the compilation issue in
omap and sa11x0 drivers with ARM randconfigs"
* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: sa11x0: remove broken #ifdef
dmaengine: omap: hide filter_fn for built-in drivers
dmaengine: dw: went back to plain {request,free}_irq() calls
Philipp Zabel [Tue, 27 May 2014 13:58:09 +0000 (15:58 +0200)]
MAINTAINERS: Add reset controller framework entry
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Olof Johansson <olof@lixom.net>
Hannes Reinecke [Mon, 26 May 2014 12:45:39 +0000 (14:45 +0200)]
dm mpath: really fix lockdep warning
lockdep complains about a circular locking. And indeed, we need to
release the lock before calling dm_table_run_md_queue_async().
As such, commit
4cdd2ad ("dm mpath: fix lock order inconsistency in
multipath_ioctl") must also be reverted in addition to fixing the
lock order in the other dm_table_run_md_queue_async() callers.
Reported-by: Bart van Assche <bvanassche@acm.org>
Tested-by: Bart van Assche <bvanassche@acm.org>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Ming Lei [Fri, 16 May 2014 15:31:21 +0000 (23:31 +0800)]
virtio_blk: fix race between start and stop queue
When there isn't enough vring descriptor for adding to vq,
blk-mq will be put as stopped state until some of pending
descriptors are completed & freed.
Unfortunately, the vq's interrupt may come just before
blk-mq's BLK_MQ_S_STOPPED flag is set, so the blk-mq will
still be kept as stopped even though lots of descriptors
are completed and freed in the interrupt handler. The worst
case is that all pending descriptors are freed in the
interrupt handler, and the queue is kept as stopped forever.
This patch fixes the problem by starting/stopping blk-mq
with holding vq_lock.
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
Conflicts:
drivers/block/virtio_blk.c
Heinz Mauelshagen [Fri, 23 May 2014 18:10:01 +0000 (14:10 -0400)]
dm cache: always split discards on cache block boundaries
The DM cache target cannot cope with discards that span multiple cache
blocks, so each discard bio that spans more than one cache block must
get split by the DM core.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # v3.9+
Chris Wilson [Fri, 23 May 2014 06:48:08 +0000 (08:48 +0200)]
drm/i915: Prevent negative relocation deltas from wrapping
This is pure evil. Userspace, I'm looking at you SNA, repacks batch
buffers on the fly after generation as they are being passed to the
kernel for execution. These batches also contain self-referenced
relocations as a single buffer encompasses the state commands, kernels,
vertices and sampler. During generation the buffers are placed at known
offsets within the full batch, and then the relocation deltas (as passed
to the kernel) are tweaked as the batch is repacked into a smaller buffer.
This means that userspace is passing negative relocations deltas, which
subsequently wrap to large values if the batch is at a low address. The
GPU hangs when it then tries to use the large value as a base for its
address offsets, rather than wrapping back to the real value (as one
would hope). As the GPU uses positive offsets from the base, we can
treat the relocation address as the minimum address read by the GPU.
For the upper bound, we trust that userspace will not read beyond the
end of the buffer.
So, how do we fix negative relocations from wrapping? We can either
check that every relocation looks valid when we write it, and then
position each object such that we prevent the offset wraparound, or we
just special-case the self-referential behaviour of SNA and force all
batches to be above 256k. Daniel prefers the latter approach.
This fixes a GPU hang when it tries to use an address (relocation +
offset) greater than the GTT size. The issue would occur quite easily
with full-ppgtt as each fd gets its own VM space, so low offsets would
often be handed out. However, with the rearrangement of the low GTT due
to capturing the BIOS framebuffer, it is already affecting kernels 3.15
onwards. I think only IVB+ is susceptible to this bug, but the workaround
should only kick in rarely, so it seems sensible to always apply it.
v3: Use a bias for batch buffers to prevent small negative delta relocations
from wrapping.
v4 from Daniel:
- s/BIAS/BATCH_OFFSET_BIAS/
- Extract eb_vma_misplaced/i915_vma_misplaced since the conditions
were growing rather cumbersome.
- Add a comment to eb_get_batch explaining why we do this.
- Apply the batch offset bias everywhere but mention that we've only
observed it on gen7 gpus.
- Drop PIN_OFFSET_FIX for now, that slipped in from a feature patch.
v5: Add static to eb_get_batch, spotted by 0-day tester.
Testcase: igt/gem_bad_reloc
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78533
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v3)
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Fri, 23 May 2014 09:45:52 +0000 (10:45 +0100)]
drm/i915: Only copy back the modified fields to userspace from execbuffer
We only want to modifiy a single field in the userspace view of the
execbuffer command buffer, so explicitly change that rather than copy
everything back again.
This serves two purposes:
1. The single fields are much cheaper to copy (constant size so the
copy uses special case code) and much smaller than the whole array.
2. We modify the array for internal use that need to be masked from
the user.
Note: We need this backported since without it the next bugfix will
blow up when userspace recycles batchbuffers and relocations.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Wed, 21 May 2014 11:42:56 +0000 (12:42 +0100)]
drm/i915: Fix dynamic allocation of physical handles
A single object may be referenced by multiple registers fundamentally
breaking the static allotment of ids in the current design. When the
object is used the second time, the physical address of the first
assignment is relinquished and a second one granted. However, the
hardware is still reading (and possibly writing) to the old physical
address now returned to the system. Eventually hilarity will ensue, but
in the short term, it just means that cursors are broken when using more
than one pipe.
v2: Fix up leak of pci handle when handling an error during attachment,
and avoid a double kmap/kunmap. (Ville)
Rebase against -fixes.
v3: And fix the error handling added in v2 (Ville)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77351
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Paul Bolle [Fri, 23 May 2014 15:05:00 +0000 (17:05 +0200)]
cpufreq: s5pv210: drop check for CONFIG_PM_VERBOSE
A pr_err() was added in v3.1. It was guarded by a check for
CONFIG_PM_VERBOSE. The Kconfig symbol PM_VERBOSE was removed in v3.0. So
this pr_err() has never been used. Drop that check and clean up the
message a bit.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stratos Karafotis [Tue, 20 May 2014 18:12:27 +0000 (21:12 +0300)]
cpufreq: intel_pstate: Remove unused member name of cpudata
Although, a value is assigned to member name of struct cpudata,
it is never used.
We can safely remove it.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Hans de Goede [Tue, 20 May 2014 05:54:09 +0000 (22:54 -0700)]
Input: synaptics - change min/max quirk table to pnp-id matching
Most of the affected models share pnp-ids for the touchpad. So switching
to pnp-ids give us 2 advantages:
1) It shrinks the quirk list
2) It will lower the new quirk addition frequency, ie the recently added W540
quirk would not have been necessary since it uses the same LEN0034 pnp ids
as other models already added before it
As an added bonus it actually puts the quirk on the actual psmouse, rather
then on the machine, which is technically more correct.
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Hans de Goede [Tue, 20 May 2014 05:53:23 +0000 (22:53 -0700)]
Input: synaptics - add a matches_pnp_id helper function
This is a preparation patch for simplifying the min/max quirk table.
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Hans de Goede [Tue, 20 May 2014 05:52:30 +0000 (22:52 -0700)]
Input: synaptics - T540p - unify with other LEN0034 models
The T540p has a touchpad with pnp-id LEN0034, all the models with this
pnp-id have the same min/max values, except the T540p where the values are
slightly off. Fix them to be identical.
This is a preparation patch for simplifying the quirk table.
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Rafael J. Wysocki [Mon, 26 May 2014 21:20:16 +0000 (23:20 +0200)]
Merge branches 'pm-cpufreq' and 'acpi-thermal'
* pm-cpufreq:
cpufreq: cpu0: drop wrong devm usage
cpufreq: remove race while accessing cur_policy
* acpi-thermal:
ACPI / thermal: fix workqueue destroy order
Aaron Lu [Mon, 26 May 2014 12:34:07 +0000 (14:34 +0200)]
ACPI / thermal: fix workqueue destroy order
When the thermal module is to be removed, we should destroy the wq
acpi_thermal_pm_queue after the ACPI driver's remove callback is
executed as we will need to flush the workqueue there, or a NULL pointer
access will be hit.
Reported-and-tested-by: Kui Zhang <kuizhang@gmail.com>
References: http://www.spinics.net/lists/kernel/msg1747251.html
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Gabor Juhos [Thu, 15 May 2014 08:35:44 +0000 (10:35 +0200)]
MIPS: RC32434: fix broken PCI resource initialization
The parent field of the 'rc32434_res_pci_mem1' resource points to
the resource itself which is obviously wrong. Due to the broken
initialitazion, the PCI devices on the Mikrotik RB532 boards are
not working since commit
22283178 (MIPS: avoid possible resource
conflict in register_pci_controller).
Remove the field initialization to fix the issue.
Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Reported-by: Waldemar Brodkorb <wbx@openadk.org>
Cc: linux-mips@linux-mips.org
Cc: Gabor Juhos <juhosg@openwrt.org>
Patchwork: https://patchwork.linux-mips.org/patch/6940/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Linus Torvalds [Sun, 25 May 2014 23:06:00 +0000 (16:06 -0700)]
Linux 3.15-rc7
Rabin Vincent [Sat, 24 May 2014 16:38:01 +0000 (17:38 +0100)]
ARM: 8064/1: fix v7-M signal return
According to the ARM ARM, the behaviour is UNPREDICTABLE if the PC read
from the exception return stack is not half word aligned. See the
pseudo code for ExceptionReturn() and PopStack().
The signal handler's address has the bit 0 set, and setup_return()
directly writes this to regs->ARM_pc. Current hardware happens to
discard this bit, but QEMU's emulation doesn't and this makes processes
crash. Mask out bit 0 before the exception return in order to get
predictable behaviour.
Fixes: 19c4d593f0b4 ("ARM: ARMv7-M: Add support for exception handling")
Cc: stable@kernel.org
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
srinik [Thu, 15 May 2014 10:28:44 +0000 (11:28 +0100)]
ARM: 8057/1: amba: Add Qualcomm vendor ID.
This patch adds Qualcomm amba vendor Id to the list. This ID is used in mmci driver. The ID selected in same lines like 0x41 is "A" for ARM, 0x51 is "Q" for Qualcomm.
As there are no physical register on Qcom SOC for amba vendor id, this is a fake ID assigned based on "Q" prefix from Qualcomm.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Nikolay Borisov [Thu, 8 May 2014 14:54:26 +0000 (15:54 +0100)]
ARM: 8052/1: unwind: Fix handling of "Pop r4-r[4+nnn],r14" opcode
The arm EABI states that unwind opcode 10100nnn means pop register r4-4[4+nnn],aditionally there is a similar unwind opcode: 10101nnn which means the same thing plus popping r14. Those two cases are handled by the unwind_exec_pop_r4_to_rN function which checks whether the 4th bit is set and does r14 popping.
However, up until now it has been checking whether the 8th bit was set (mask & 0x80) instead of the 4th (mask & 0x8), a simple to make typo but this meant that we were always popping r14 even if we had the former opcode.
This patch changes the mask so that the 2 unwind opcodes are being handled correctly.
Signed-off-by: Nikolay Borisov <Nikolay.Borisov@arm.com>
Reviewed-by: Anurag Aggarwal <anurag19aggarwal@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Andrey Ryabinin [Wed, 7 May 2014 07:07:25 +0000 (08:07 +0100)]
ARM: 8051/1: put_user: fix possible data corruption in put_user
According to arm procedure call standart r2 register is call-cloberred.
So after the result of x expression was put into r2 any following
function call in p may overwrite r2. To fix this, the result of p
expression must be saved to the temporary variable before the
assigment x expression to __r2.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rabin Vincent [Sat, 3 May 2014 17:19:17 +0000 (18:19 +0100)]
ARM: 8048/1: fix v7-M setup stack location
__v7m_setup_stack currently sits in the .proc.info.init section, and
thus creates a bogus proc info entry (which by the way matches any
unknown CPU IDs, due to the entry's mask being 0). Move it out of
there.
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Torvalds [Sun, 25 May 2014 19:40:36 +0000 (12:40 -0700)]
Merge branch 'afs' of git://git./linux/kernel/git/dhowells/linux-fs
Pull AFS fixes and cleanups from David Howells:
"Here are some patches to the AFS filesystem:
1) Fix problems in the clean-up parts of the cache manager service
handler.
2) Split afs_end_call() introduced in (1) and replace some identical
code elsewhere with a call to the first half of the split function.
3) Fix an error introduced in the workqueue PREPARE_WORK() elimination
commits.
4) Clean up argument passing to functions called from the workqueue as
there's now an insulating layer between them and the workqueue.
This is possible from (3)"
* 'afs' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
AFS: Pass an afs_call* to call->async_workfn() instead of a work_struct*
AFS: Fix kafs module unloading
AFS: Part of afs_end_call() is identical to code elsewhere, so split it
AFS: Fix cache manager service handlers
Linus Torvalds [Sun, 25 May 2014 19:39:08 +0000 (12:39 -0700)]
Merge branch 'rdunlap' (patches from Randy Dunlap)
Merge documentation fixes from Randy Dunlap.
* emailed patches from Randy Dunlap <rdunlap@infradead.org>:
Documentation: update /proc/stat "intr" count summary
Documentation: update java sample wrapper for java 7
Documentation: update thunderbird email client settings
Documentation: fix typos in drm docbook
Jan Moskyto Matejka [Thu, 15 May 2014 20:55:34 +0000 (13:55 -0700)]
Documentation: update /proc/stat "intr" count summary
The sum at the beginning of line "intr" includes also unnumbered
interrupts. It implies that the sum at the beginning isn't the sum
of the remainder of the line, not even an estimation.
Fixed the documentation to mention that.
This behaviour was added to /proc/stat in commit
a2eddfa95919 ("x86:
make /proc/stat account for all interrupts")
Signed-off-by: Jan Moskyto Matejka <mq@suse.cz>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jonathan Callen [Thu, 15 May 2014 20:54:52 +0000 (13:54 -0700)]
Documentation: update java sample wrapper for java 7
The sample wrapper currently fails on some Java 7 .class files. This
updates the wrapper to properly handle those files.
Signed-off-by: Jonathan Callen <jcallen@gentoo.org>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul McQuade [Thu, 15 May 2014 20:54:25 +0000 (13:54 -0700)]
Documentation: update thunderbird email client settings
Added setting to email-clients that is easier to read and is easier to
setup thunderbird. Removed config settings and added GUI settings.
Signed-off-by: Paul McQuade <paulmcquad@gmail.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Masanari Iida [Thu, 15 May 2014 20:54:06 +0000 (13:54 -0700)]
Documentation: fix typos in drm docbook
Fix spelling typo in DocBook/drm.tmpl
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 25 May 2014 17:20:36 +0000 (10:20 -0700)]
Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/jdelvare/staging
Pull hwmon subsystem fixes from Jean Delvare.
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
hwmon: (ntc_thermistor) Fix OF device ID mapping
hwmon: (ntc_thermistor) Fix dependencies
hwmon: Document temp[1-*]_min_hyst sysfs attribute
Linus Torvalds [Sun, 25 May 2014 17:13:50 +0000 (10:13 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull single scsi fix from James Bottomley:
"This is a single fix for a bug exposed by a sysfs change in 3.13 which
now causes libsas to trigger a warn on in device removal"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
[SCSI] scsi_transport_sas: move bsg destructor into sas_rphy_remove
Linus Torvalds [Sun, 25 May 2014 17:08:48 +0000 (10:08 -0700)]
Merge branch 'for-3.15' of git://linux-nfs.org/~bfields/linux
Pull two nfsd bugfixes from Bruce Fields:
"Just two bugfixes, one for a merge-window-introduced ACL regression,
the other for a longer-standing v4 state bug"
* 'for-3.15' of git://linux-nfs.org/~bfields/linux:
nfsd4: warn on finding lockowner without stateid's
nfsd4: remove lockowner when removing lock stateid
nfsd4: fix corruption on setting an ACL.
Jean Delvare [Sun, 25 May 2014 15:23:08 +0000 (17:23 +0200)]
hwmon: (ntc_thermistor) Fix OF device ID mapping
The mapping from OF device IDs to platform device IDs is wrong.
TYPE_NCPXXWB473 is 0, TYPE_NCPXXWL333 is 1, so
ntc_thermistor_id[TYPE_NCPXXWB473] is { "ncp15wb473", TYPE_NCPXXWB473 }
while
ntc_thermistor_id[TYPE_NCPXXWL333] is { "ncp18wb473", TYPE_NCPXXWB473 }.
So the name is wrong for all but the "ntc,ncp15wb473" entry, and the
type is wrong for the "ntc,ncp15wl333" entry.
So map the entries by index, it is neither elegant nor robust but at
least it is correct.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: 9e8269de hwmon: (ntc_thermistor) Add DT with IIO support to NTC thermistor driver
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Cc: Naveen Krishna Chatradhi <ch.naveen@samsung.com>
Cc: Doug Anderson <dianders@chromium.org>
Jean Delvare [Sun, 25 May 2014 15:23:08 +0000 (17:23 +0200)]
hwmon: (ntc_thermistor) Fix dependencies
In commit
9e8269de, support was added for ntc_thermistor devices being
declared in the device tree and implemented on top of IIO. With that
change, a dependency was added to the ntc_thermistor driver:
depends on (!OF && !IIO) || (OF && IIO)
This construct has the drawback that the driver can no longer be
selected when OF is set and IIO isn't, nor when IIO is set and OF is
not. This is a regression for the original users of the driver.
As the new code depends on IIO and is useless without OF, include it
only if both are enabled, and set the dependencies accordingly. This
is clearer, more simple and more correct.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: 9e8269de hwmon: (ntc_thermistor) Add DT with IIO support to NTC thermistor driver
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Cc: Naveen Krishna Chatradhi <ch.naveen@samsung.com>
Cc: Doug Anderson <dianders@chromium.org>
Jean Delvare [Sun, 25 May 2014 15:23:07 +0000 (17:23 +0200)]
hwmon: Document temp[1-*]_min_hyst sysfs attribute
The temp[1-*]_min_hyst sysfs attribute is already implemented by 3
hwmon drivers (adt7x10, lm77 and lm92) but was missing from the
standard interface.
Also add temp[1-*]_lcrit_hyst for consistency, even though no driver
implement that one for the time being.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Markos Chandras [Fri, 23 May 2014 12:31:31 +0000 (13:31 +0100)]
MIPS: malta: memory.c: Initialize the 'memsize' variable
If the 'memsize' environmental variable is not set by the bootloader
the 'memsize' variable is not initialized, leading to potential memory
problems. This patch fixes the problem by setting the initial
value to '0' to force the kernel to set a good default memory size.
Cc: <stable@vger.kernel.org> # v3.15+
Reported-by: Matheus Almeida <Matheus.Almeida@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/6984/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Markos Chandras [Wed, 21 May 2014 11:35:00 +0000 (12:35 +0100)]
MIPS: Fix typo when reporting cache and ftlb errors for ImgTec cores
Introduced by the following two commits:
75b5b5e0a262790fa11043fe45700499c7e3d818
"MIPS: Add support for FTLBs"
6de20451857ed14a4eecc28d08f6de5925d1cf96
"MIPS: Add printing of ES bit for Imgtec cores when cache error occurs"
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Reported-by: Matheus Almeida <Matheus.Almeida@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: stable@vger.kernel.org # v3.14+
Patchwork: https://patchwork.linux-mips.org/patch/6980/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Huacai Chen [Sun, 25 May 2014 02:00:36 +0000 (10:00 +0800)]
MIPS: Fix inconsistancy of __NR_Linux_syscalls value
Originally, __NR_O32_Linux_syscalls, __NR_N32_Linux_syscalls and
__NR_64_Linux_syscalls have the same values as __NR_Linux_syscalls in
corresponding ABIs. But after commit
367f0b50e502d (MIPS: Wire up
renameat2 syscall) they are not the same. I think this is incorrect and
need a fix.
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/6987/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Linus Torvalds [Fri, 23 May 2014 23:52:15 +0000 (16:52 -0700)]
Merge tag 'dmaengine-fixes-3.15-rc5' of git://git./linux/kernel/git/djbw/dmaengine
Pull dmaengine fixes from Dan Williams:
"Two fixes for -stable:
- async_mult() sometimes maps less buffers than initially requested.
We end up freeing dmaengine_unmap_data on an invalid pool.
- mv_xor: register write ordering fix"
* tag 'dmaengine-fixes-3.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine:
dmaengine: fix dmaengine_unmap failure
dma: mv_xor: Flush descriptors before activating a channel
Linus Torvalds [Fri, 23 May 2014 22:41:52 +0000 (15:41 -0700)]
Merge git://git./linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
"A small bunch of bug fixes, in particular:
1) On older cpus we need a different chunk of virtual address space
to map the huge page TSB.
2) Missing memory barrier in Niagara2 memcpy.
3) trinity showed some places where fault validation was
unnecessarily loud on sparc64
4) Some sysfs printf's need a type adjustment, from Toralf Förster"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: fix format string mismatch in arch/sparc/kernel/sysfs.c
sparc64: Add membar to Niagara2 memcpy code.
sparc64: Fix huge TSB mapping on pre-UltraSPARC-III cpus.
sparc64: Don't bark so loudly about 32-bit tasks generating 64-bit fault addresses.