Mike Christie [Fri, 24 Jun 2011 20:11:54 +0000 (15:11 -0500)]
iscsi_tcp: fix locking around iscsi sk user data
commit
03adb5f91280b433c3685d0ee86b2e1424af3d88 upstream.
iscsi_sw_tcp_conn_restore_callbacks could have set
the sk_user_data field to NULL then iscsi_sw_tcp_data_ready
could read that and try to access the NULL pointer. This
adds some checks for NULL sk_user_data in the sk
callback functions and it uses the sk_callback_lock to
set/get that sk_user_data field.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Wanlong Gao [Wed, 21 Sep 2011 08:22:10 +0000 (10:22 +0200)]
blk-cgroup: be able to remove the record of unplugged device
commit
d11bb4462c4cc6ddd45c6927c617ad79fa6fb8fc upstream.
The bug is we're not able to remove the device from blkio cgroup's
per-device control files if it gets unplugged.
To reproduce the bug:
# mount -t cgroup -o blkio xxx /cgroup
# cd /cgroup
# echo "8:0 1000" > blkio.throttle.read_bps_device
# unplug the device
# cat blkio.throttle.read_bps_device
8:0 1000
# echo "8:0 0" > blkio.throttle.read_bps_device
-bash: echo: write error: No such device
After patching, the device removal will succeed.
Thanks for the comments of Paul, Zefan, and Vivek.
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <paul@paulmenage.org>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Don Fry [Thu, 15 Sep 2011 15:36:22 +0000 (08:36 -0700)]
iwlagn: workaround bug crashing some APs
commit
2249b011432ca3dcce112f0f71e0f531b4bb9347 upstream.
This patch reverts commit
9b7688328422b88a7a15dc0dc123ad9ab1a6e22d which
was introduced in 2.6.38-rc1. It works around a problem where the iwlagn
driver stimulates a bug crashing (requiring power cycle to recover) some
APs under heavy traffic.
Signed-off-by: Don Fry <donald.h.fry@intel.com>
SIgned-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Larry Finger [Wed, 14 Sep 2011 21:50:23 +0000 (16:50 -0500)]
rtl2800usb: Fix incorrect storage of MAC address on big-endian platforms
commit
daabead1c32f331edcfb255fd973411c667977e8 upstream.
The eeprom data is stored in little-endian order in the rt2x00 library.
As it was converted to cpu order in the read routines, the data need to
be converted to LE on a big-endian platform.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Rajkumar Manoharan [Wed, 14 Sep 2011 08:58:17 +0000 (14:28 +0530)]
wireless: Reset beacon_found while updating regulatory
commit
aa3d7eef398dd4f29045e9889b817d5161afe03e upstream.
During the association, the regulatory is updated by country IE
that reaps the previously found beacons. The impact is that
after a STA disconnects *or* when for any reason a regulatory
domain change happens the beacon hint flag is not cleared
therefore preventing future beacon hints to be learned.
This is important as a regulatory domain change or a restore
of regulatory settings would set back the passive scan and no-ibss
flags on the channel. This is the right place to do this given that
it covers any regulatory domain change.
Reviewed-by: Luis R. Rodriguez <mcgrof@gmail.com>
Signed-off-by: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Acked-by: Luis R. Rodriguez <mcgrof@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Vrabel [Tue, 13 Sep 2011 14:17:32 +0000 (10:17 -0400)]
xen/e820: if there is no dom0_mem=, don't tweak extra_pages.
commit
e3b73c4a25e9a5705b4ef28b91676caf01f9bc9f upstream.
The patch "xen: use maximum reservation to limit amount of usable RAM"
(
d312ae878b6aed3912e1acaaf5d0b2a9d08a4f11) breaks machines that
do not use 'dom0_mem=' argument with:
reserve RAM buffer:
000000133f2e2000 -
000000133fffffff
(XEN) mm.c:4976:d0 Global bit is set to kernel page
fffff8117e
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
...
The reason being that the last E820 entry is created using the
'extra_pages' (which is based on how many pages have been freed).
The mentioned git commit sets the initial value of 'extra_pages'
using a hypercall which returns the number of pages (if dom0_mem
has been used) or -1 otherwise. If the later we return with
MAX_DOMAIN_PAGES as basis for calculation:
return min(max_pages, MAX_DOMAIN_PAGES);
and use it:
extra_limit = xen_get_max_pages();
if (extra_limit >= max_pfn)
extra_pages = extra_limit - max_pfn;
else
extra_pages = 0;
which means we end up with extra_pages = 128GB in PFNs (
33554432)
- 8GB in PFNs (
2097152, on this specific box, can be larger or smaller),
and then we add that value to the E820 making it:
Xen:
00000000ff000000 -
0000000100000000 (reserved)
Xen:
0000000100000000 -
000000133f2e2000 (usable)
which is clearly wrong. It should look as so:
Xen:
00000000ff000000 -
0000000100000000 (reserved)
Xen:
0000000100000000 -
000000027fbda000 (usable)
Naturally this problem does not present itself if dom0_mem=max:X
is used.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Vrabel [Fri, 19 Aug 2011 14:57:16 +0000 (15:57 +0100)]
xen: use maximum reservation to limit amount of usable RAM
commit
d312ae878b6aed3912e1acaaf5d0b2a9d08a4f11 upstream.
Use the domain's maximum reservation to limit the amount of extra RAM
for the memory balloon. This reduces the size of the pages tables and
the amount of reserved low memory (which defaults to about 1/32 of the
total RAM).
On a system with 8 GiB of RAM with the domain limited to 1 GiB the
kernel reports:
Before:
Memory: 627792k/4472000k available
After:
Memory: 549740k/11132224k available
A increase of about 76 MiB (~1.5% of the unused 7 GiB). The reserved
low memory is also reduced from 253 MiB to 32 MiB. The total
additional usable RAM is 329 MiB.
For dom0, this requires at patch to Xen ('x86: use 'dom0_mem' to limit
the number of pages for dom0') (c/s 23790)
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dave Hansen [Tue, 20 Sep 2011 22:19:41 +0000 (15:19 -0700)]
teach /proc/$pid/numa_maps about transparent hugepages
commit
32ef43848f283e0ef945d3c67e851c143fea3970 upstream.
This is modeled after the smaps code.
It detects transparent hugepages and then does a single gather_stats()
for the page as a whole. This has two benifits:
1. It is more efficient since it does many pages in a single shot.
2. It does not have to break down the huge page.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dave Hansen [Tue, 20 Sep 2011 22:19:39 +0000 (15:19 -0700)]
break out numa_maps gather_pte_stats() checks
commit
3200a8aaab0c9ccdc0f59b0dac2d4a47029137fa upstream.
gather_pte_stats() does a number of checks on a target page
to see whether it should even be considered for statistics.
This breaks that code out in to a separate function so that
we can use it in the transparent hugepage case in the next
patch.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Acked-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Christoph Lameter <cl@gentwo.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dave Hansen [Tue, 20 Sep 2011 22:19:38 +0000 (15:19 -0700)]
make /proc/$pid/numa_maps gather_stats() take variable page size
commit
eb4866d0066ffd5446751c102d64feb3318d8bd1 upstream.
We need to teach the numa_maps code about transparent huge pages. The
first step is to teach gather_stats() that the pte it is dealing with
might represent more than one page.
Note that will we use this in a moment for transparent huge pages since
they have use a single pmd_t which _acts_ as a "surrogate" for a bunch
of smaller pte_t's.
I'm a _bit_ unhappy that this interface counts in hugetlbfs page sizes
for hugetlbfs pages and PAGE_SIZE for normal pages. That means that to
figure out how many _bytes_ "dirty=1" means, you must first know the
hugetlbfs page size. That's easier said than done especially if you
don't have visibility in to the mount.
But, that's probably a discussion for another day especially since it
would change behavior to fix it. But, just in case anyone wonders why
this patch only passes a '1' in the hugetlb case...
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Larry Finger [Wed, 14 Sep 2011 21:50:22 +0000 (16:50 -0500)]
rt2800pci: Fix compiler error on PowerPC
commit
d331eb51e4d4190b2178c30fcafea54a94a577e8 upstream.
Using gcc 4.4.5 on a Powerbook G4 with a PPC cpu, a complicated
if statement results in incorrect flow, whereas the equivalent switch
statement works correctly.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Lasse Collin [Wed, 21 Sep 2011 14:30:50 +0000 (17:30 +0300)]
XZ: Fix incorrect XZ_BUF_ERROR
commit
9c1f8594df4814ebfd6822ca3c9444fb3445888d upstream.
xz_dec_run() could incorrectly return XZ_BUF_ERROR if all of the
following was true:
- The caller knows how many bytes of output to expect and only provides
that much output space.
- When the last output bytes are decoded, the caller-provided input
buffer ends right before the LZMA2 end of payload marker. So LZMA2
won't provide more output anymore, but it won't know it yet and thus
won't return XZ_STREAM_END yet.
- A BCJ filter is in use and it hasn't left any unfiltered bytes in the
temp buffer. This can happen with any BCJ filter, but in practice
it's more likely with filters other than the x86 BCJ.
This fixes <https://bugzilla.redhat.com/show_bug.cgi?id=735408> where
Squashfs thinks that a valid file system is corrupt.
This also fixes a similar bug in single-call mode where the uncompressed
size of a block using BCJ + LZMA2 was 0 bytes and caller provided no
output space. Many empty .xz files don't contain any blocks and thus
don't trigger this bug.
This also tweaks a closely related detail: xz_dec_bcj_run() could call
xz_dec_lzma2_run() to decode into temp buffer when it was known to be
useless. This was harmless although it wasted a minuscule number of CPU
cycles.
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jesse Brandeburg [Tue, 20 Sep 2011 15:13:03 +0000 (15:13 +0000)]
ixgbe: fix possible null buffer error
commit
b811ce9104a7f7663ddae4f7795a194a103b8f90 upstream.
It seems that at least one PPC machine would occasionally give a (valid) 0 as
the return value from dma_map, this caused the ixgbe code to not work
correctly. A fix is pending in the PPC tree to not return 0 from dma map, but
we can also fix the driver to make sure we don't mess up in other arches as
well.
This patch is applicable to all current stable kernels.
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=683611
Reported-by: Neil Horman <nhorman@redhat.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Manual Munz [Sun, 18 Sep 2011 23:24:03 +0000 (18:24 -0500)]
b43: Fix beacon problem in ad-hoc mode
commit
8c23516fbb209ccf8f8c36268311c721faff29ee upstream.
In ad-hoc mode, driver b43 does not issue beacons.
Signed-off-by: Manual Munz <freifunk@somakoma.de>
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Carsten Emde [Wed, 21 Sep 2011 08:22:11 +0000 (10:22 +0200)]
floppy: use del_timer_sync() in init cleanup
commit
6c4867f6469964e34c5f4ee229a2a7f71a34c7ff upstream.
When no floppy is found the module code can be released while a timer
function is pending or about to be executed.
CPU0 CPU1
floppy_init()
timer_softirq()
spin_lock_irq(&base->lock);
detach_timer();
spin_unlock_irq(&base->lock);
-> Interrupt
del_timer();
return -ENODEV;
module_cleanup();
<- EOI
call_timer_fn();
OOPS
Use del_timer_sync() to prevent this.
Signed-off-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Nicolas Pitre [Wed, 14 Sep 2011 05:22:05 +0000 (01:22 -0400)]
ARM: Dove: fix second SPI initialization call
commit
72cc205611879525db0374d9831f84f787112b25 upstream.
Commit
980f9f601a "ARM: orion: Consolidate SPI initialization."
broke it by overwriting the SPI0 registration.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Steve French [Mon, 29 Aug 2011 18:54:12 +0000 (18:54 +0000)]
Fix the conflict between rwpidforward and rw mount options
commit
c9c7fa0064f4afe1d040e72f24c2256dd8ac402d upstream.
Both these options are started with "rw" - that's why the first one
isn't switched on even if it is specified. Fix this by adding a length
check for "rw" option check.
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jeff Layton [Tue, 23 Aug 2011 11:21:28 +0000 (07:21 -0400)]
cifs: fix possible memory corruption in CIFSFindNext
commit
9438fabb73eb48055b58b89fc51e0bc4db22fabd upstream.
The name_len variable in CIFSFindNext is a signed int that gets set to
the resume_name_len in the cifs_search_info. The resume_name_len however
is unsigned and for some infolevels is populated directly from a 32 bit
value sent by the server.
If the server sends a very large value for this, then that value could
look negative when converted to a signed int. That would make that
value pass the PATH_MAX check later in CIFSFindNext. The name_len would
then be used as a length value for a memcpy. It would then be treated
as unsigned again, and the memcpy scribbles over a ton of memory.
Fix this by making the name_len an unsigned value in CIFSFindNext.
Reported-by: Darren Lavender <dcl@hppine99.gbr.hp.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Takashi Iwai [Mon, 19 Sep 2011 09:31:34 +0000 (11:31 +0200)]
ALSA: hda/realtek - Fix auto-mute with HP+LO configuration
commit
8974bd51a77824d91010176f9a5da28513c2e1f5 upstream.
When the system has only the headphone and the line-out jacks without
speakers, the current auto-mute code doesn't work. It's because the
spec->automute_lines flag is wrongly referred in update_speakers().
This flag must be meaningless when spec->automute_hp_lo isn't set, thus
they should be always coupled.
The patch fixes the problem and add a comment to indicate the
relationship briefly.
BugLink: http://bugs.launchpad.net/bugs/851697
Reported-by: David Henningsson <david.henningsson@canonical.com>
Tested-By: Jayne Han <jayne.han@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Johannes Berg [Mon, 12 Sep 2011 19:09:10 +0000 (12:09 -0700)]
iwlagn: fix command queue timeout
commit
282cdb325aea4ebbc42ce753b47cc96145eb54bc upstream.
If the command queue is constantly busy,
which can happen in P2P, the hangcheck
timer will frequently find a command in
it and will eventually reset the device
because nothing sets the timestamp for
this queue when commands are processed.
Fix this by setting the timestamp when
a command completes.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Sarah Sharp [Mon, 19 Sep 2011 23:05:11 +0000 (16:05 -0700)]
USB: xhci: Set change bit when warm reset change is set.
commit
44f4c3ed60fb21e1d2dd98304390ac121e6c7c6d upstream.
Sometimes, when a USB 3.0 device is disconnected, the Intel Panther
Point xHCI host controller will report a link state change with the
state set to "SS.Inactive". This causes the xHCI host controller to
issue a warm port reset, which doesn't finish before the USB core times
out while waiting for it to complete.
When the warm port reset does complete, and the xHC gives back a port
status change event, the xHCI driver kicks khubd. However, it fails to
set the bit indicating there is a change event for that port because the
logic in xhci-hub.c doesn't check for the warm port reset bit.
After that, the warm port status change bit is never cleared by the USB
core, and the xHC stops reporting port status change bits. (The xHCI
spec says it shouldn't report more port events until all change bits are
cleared.) This means any port changes when a new device is connected
will never be reported, and the port will seem "dead" until the xHCI
driver is unloaded and reloaded, or the computer is rebooted. Fix this
by making the xHCI driver set the port change bit when a warm port reset
change bit is set.
A better solution would be to make the USB core handle warm port reset
in differently, merging the current code with the standard port reset
code that does an incremental backoff on the timeout, and tries to
complete the port reset two more times before giving up. That more
complicated fix will be merged next window, and this fix will be
backported to stable.
This should be backported to kernels as old as 3.0, since that was the
first kernel with commit
a11496ebf375 ("xHCI: warm reset support").
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alex Deucher [Fri, 16 Sep 2011 16:04:08 +0000 (12:04 -0400)]
drm/radeon/kms: Make GPU/CPU page size handling consistent in blit code (v2)
commit
003cefe0c238e683a29d2207dba945b508cd45b7 upstream.
The BO blit code inconsistenly handled the page size. This wasn't
an issue on system with 4k pages since the GPU's page size is 4k as
well. Switch the driver blit callbacks to take num pages in GPU
page units.
Fixes lemote mipsel systems using AMD rs780/rs880 chipsets.
v2: incorporate suggestions from Michel.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Ming Lei [Wed, 31 Aug 2011 02:45:46 +0000 (10:45 +0800)]
firewire: ohci: add no MSI quirk for O2Micro controller
commit
f39aa30d7741f40ad964341e9243dbbd7f8ff057 upstream.
This fixes https://bugs.launchpad.net/ubuntu/+source/linux/+bug/801719 .
An O2Micro PCI Express FireWire controller,
"FireWire (IEEE 1394) [0c00]: O2 Micro, Inc. Device [1217:11f7] (rev 05)"
which is a combination device together with an SDHCI controller and some
sort of storage controller, misses SBP-2 status writes from an attached
FireWire HDD. This problem goes away if MSI is disabled for this
FireWire controller.
The device reportedly does not require QUIRK_CYCLE_TIMER.
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Anton Blanchard [Wed, 7 Sep 2011 14:41:05 +0000 (14:41 +0000)]
ibmveth: Checksum offload is always disabled
commit
91aae1e5c407d4fc79f6983e6c6ba04756c004cb upstream.
Commit
b9367bf3ee6d (net: ibmveth: convert to hw_features) reversed
a check in ibmveth_set_csum_offload that results in checksum offload
never being enabled.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Anton Blanchard [Wed, 7 Sep 2011 14:41:04 +0000 (14:41 +0000)]
ibmveth: Fix issue with DMA mapping failure
commit
b93da27f5234198433345e40b39ff59797bc6f6e upstream.
descs[].fields.address is 32bit which truncates any dma mapping
errors so dma_mapping_error() fails to catch it.
Use a dma_addr_t to do the comparison. With this patch I was able
to transfer many gigabytes of data with IOMMU fault injection set
at 10% probability.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Brian King [Wed, 7 Sep 2011 14:41:03 +0000 (14:41 +0000)]
ibmveth: Fix DMA unmap error
commit
33a48ab105a75d37021e422a0a3283241099b142 upstream.
Commit
6e8ab30ec677 (ibmveth: Add scatter-gather support) introduced a
DMA mapping API inconsistency resulting in dma_unmap_page getting
called on memory mapped via dma_map_single. This was seen when
CONFIG_DMA_API_DEBUG was enabled. Fix up this API usage inconsistency.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Arjan van de Ven [Thu, 15 Sep 2011 06:49:25 +0000 (08:49 +0200)]
ALSA: pcm - fix race condition in wait_for_avail()
commit
763437a9e7737535b2fc72175ad4974048769be6 upstream.
wait_for_avail() in pcm_lib.c has a race in it (observed in practice by an
Intel validation group).
The function is supposed to return once space in the buffer has become
available, or if some timeout happens. The entity that creates space (irq
handler of sound driver and some such) will do a wake up on a waitqueue
that this function registers for.
However there are two races in the existing code
1) If space became available between the caller noticing there was no
space and this function actually sleeping, the wakeup is missed and the
timeout condition will happen instead
2) If a wakeup happened but not sufficient space became available, the
code will loop again and wait for more space. However, if the second
wake comes in prior to hitting the schedule_timeout_interruptible(), it
will be missed, and potentially you'll wait out until the timeout
happens.
The fix consists of using more careful setting of the current state (so
that if a wakeup happens in the main loop window, the schedule_timeout()
falls through) and by checking for available space prior to going into the
schedule_timeout() loop, but after being on the waitqueue and having the
state set to interruptible.
[tiwai: the following changes have been added to Arjan's original patch:
- merged akpm's fix for waitqueue adding order into a single patch
- reduction of duplicated code of avail check
]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thomas Tuttle [Wed, 14 Sep 2011 23:22:28 +0000 (16:22 -0700)]
workqueue: lock cwq access in drain_workqueue
commit
fa2563e41c3d6d6e8af437643981ed28ae0cb56d upstream.
Take cwq->gcwq->lock to avoid racing between drain_workqueue checking to
make sure the workqueues are empty and cwq_dec_nr_in_flight decrementing
and then incrementing nr_active when it activates a delayed work.
We discovered this when a corner case in one of our drivers resulted in
us trying to destroy a workqueue in which the remaining work would
always requeue itself again in the same workqueue. We would hit this
race condition and trip the BUG_ON on workqueue.c:3080.
Signed-off-by: Thomas Tuttle <ttuttle@chromium.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Naga Chumbalkar [Wed, 14 Sep 2011 23:22:23 +0000 (16:22 -0700)]
drivers/cpufreq/pcc-cpufreq.c: avoid NULL pointer dereference
commit
e71f5cc402ecb42b407ae52add7b173bf1c53daa upstream.
per_cpu(processors, n) can be NULL, resulting in:
Loading CPUFreq modules[ 437.661360] BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<
ffffffffa0434314>] pcc_cpufreq_cpu_init+0x74/0x220 [pcc_cpufreq]
It's better to avoid the oops by failing the driver, and allowing the
system to boot.
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Johan Hovold [Wed, 14 Sep 2011 23:22:16 +0000 (16:22 -0700)]
drivers/leds/ledtrig-timer.c: fix broken sysfs delay handling
commit
7a5caabd090b8f7d782c40fc1c048d798f2b6fd7 upstream.
Fix regression introduced by commit
5ada28bf7675 ("led-class: always
implement blinking") which broke sysfs delay handling by not storing the
updated value. Consequently it was only possible to set one of the delays
through the sysfs interface as the other delay was automatically restored
to it's default value. Reading the parameters always gave the defaults.
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Florian Fainelli <florian@openwrt.org>
Acked-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Vrabel [Wed, 14 Sep 2011 23:22:02 +0000 (16:22 -0700)]
mm: sync vmalloc address space page tables in alloc_vm_area()
commit
461ae488ecb125b140d7ea29ceeedbcce9327003 upstream.
Xen backend drivers (e.g., blkback and netback) would sometimes fail to
map grant pages into the vmalloc address space allocated with
alloc_vm_area(). The GNTTABOP_map_grant_ref would fail because Xen could
not find the page (in the L2 table) containing the PTEs it needed to
update.
(XEN) mm.c:3846:d0 Could not find L1 PTE for address
fbb42000
netback and blkback were making the hypercall from a kernel thread where
task->active_mm != &init_mm and alloc_vm_area() was only updating the page
tables for init_mm. The usual method of deferring the update to the page
tables of other processes (i.e., after taking a fault) doesn't work as a
fault cannot occur during the hypercall.
This would work on some systems depending on what else was using vmalloc.
Fix this by reverting
ef691947d8a3 ("vmalloc: remove vmalloc_sync_all()
from alloc_vm_area()") and add a comment to explain why it's needed.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Keir Fraser <keir.xen@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Al Viro [Wed, 14 Sep 2011 17:55:41 +0000 (18:55 +0100)]
restore pinning the victim dentry in vfs_rmdir()/vfs_rename_dir()
commit
1d2ef5901483004d74947bbf78d5146c24038fe7 upstream.
We used to get the victim pinned by dentry_unhash() prior to commit
64252c75a219 ("vfs: remove dget() from dentry_unhash()") and ->rmdir()
and ->rename() instances relied on that; most of them don't care, but
ones that used d_delete() themselves do. As the result, we are getting
rmdir() oopses on NFS now.
Just grab the reference before locking the victim and drop it explicitly
after unlocking, same as vfs_rename_other() does.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Simon Kirby <sim@hostway.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Michel Dänzer [Tue, 13 Sep 2011 09:27:35 +0000 (11:27 +0200)]
drm/radeon: Don't read from CP ring write pointer registers.
commit
87463ff83bcda210d8f0ae440bd64d1548f852e7 upstream.
Apparently this doesn't always work reliably, e.g. at resume time.
Just initialize to 0, so the ring is considered empty.
Tested with hibernation on Sumo and Cayman cards.
Should fix https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/ .
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Henningsson [Wed, 14 Sep 2011 11:22:54 +0000 (13:22 +0200)]
ALSA: HDA: Cirrus - fix "Surround Speaker" volume control name
commit
2e1210bc3d065a6e26ff5fef228a9a7e08921d2c upstream.
This patch fixes "Surround Speaker Playback Volume" being cut off.
(Commit
b4dabfc452a10 was probably meant to fix this, but it fixed
only the "Switch" name, not the "Volume" name.)
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thomas Gleixner [Tue, 19 Jul 2011 14:25:42 +0000 (16:25 +0200)]
x86, iommu: Mark DMAR IRQ as non-threaded
commit
477694e71113fd0694b6bb0bcc2d006b8ac62691 upstream.
Mark this lowlevel IRQ handler as non-threaded. This prevents a boot
crash when "threadirqs" is on the kernel commandline. Also the
interrupt handler is handling hardware critical events which should
not be delayed into a thread.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Johannes Berg [Tue, 6 Sep 2011 10:47:39 +0000 (12:47 +0200)]
mac80211: fix missing sta_lock in __sta_info_destroy
commit
4bae7d976976fa52d345805ba686934cd548343e upstream.
Since my commit
34e895075e21be3e21e71d6317440d1ee7969ad0
("mac80211: allow station add/remove to sleep") there is
a race in mac80211 when it clears the TIM bit because a
sleeping station disconnected, the spinlock isn't held
around the relevant code any more. Use the right API to
acquire the spinlock correctly.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
George [Sat, 3 Sep 2011 15:58:48 +0000 (10:58 -0500)]
rtlwifi: Fix problem when switching connections
commit
bac2555c6d86387132930af4d14cb47c4dd3f4f7 upstream.
The driver fails to clear encryption keys making it impossible
to switch connections.
Signed-off-by: George <george0505@realtek.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
George [Sat, 3 Sep 2011 15:58:47 +0000 (10:58 -0500)]
rtlwifi: rtl8192su: Fix problem connecting to HT-enabled AP
commit
3401dc6eba788ebc7c14ce51018d775b1c263399 upstream.
The driver fails to connect to 802.11n-enabled APs. The patch fixes
Bug #42262.
Signed-off-by: George <george0505@realtek.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Geert Uytterhoeven [Sun, 11 Sep 2011 11:59:27 +0000 (13:59 +0200)]
genirq: Make irq_shutdown() symmetric vs. irq_startup again
commit
ed585a651681e822089087b426e6ebfb6d3d9873 upstream.
If an irq_chip provides .irq_shutdown(), but neither of .irq_disable() or
.irq_mask(), free_irq() crashes when jumping to NULL.
Fix this by only trying .irq_disable() and .irq_mask() if there's no
.irq_shutdown() provided.
This revives the symmetry with irq_startup(), which tries .irq_startup(),
.irq_enable(), and irq_unmask(), and makes it consistent with the comment for
irq_chip.irq_shutdown() in <linux/irq.h>, which says:
* @irq_shutdown: shut down the interrupt (defaults to ->disable if NULL)
This is also how __free_irq() behaved before the big overhaul, cfr. e.g.
3b56f0585fd4c02d047dc406668cb40159b2d340 ("genirq: Remove bogus conditional"),
where the core interrupt code always overrode .irq_shutdown() to
.irq_disable() if .irq_shutdown() was NULL.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: linux-m68k@lists.linux-m68k.org
Link: http://lkml.kernel.org/r/1315742394-16036-2-git-send-email-geert@linux-m68k.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Anand Gadiyar [Thu, 18 Aug 2011 10:44:31 +0000 (16:14 +0530)]
mfd: Make omap-usb-host TLL mode work again
commit
e600cffe618ff0da29ae1f8b8d3824ce0e2409fc upstream.
This code section seems to have been accidentally copy pasted.
It causes incorrect bits to be set up in the TLL_CHANNEL_CONF
register and prevents the TLL mode from working correctly.
Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Cc: Keshava Munegowda <keshava_mgowda@ti.com>
Acked-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Johan Hovold [Mon, 15 Aug 2011 10:42:03 +0000 (12:42 +0200)]
mfd: Fix initialisation of tps65910 interrupts
commit
fa948761e685fb03823b3029e5b6bdb52229d6ce upstream.
Fix regression introduced by commit
a2974732ca7614aaf0baf9d6dd3ad893d50ce1c5 (TPS65911: Add new irq
definitions) which caused irq_num to be incorrectly set for tps65910.
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Graeme Gregory <gg@slimlogic.co.uk>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mark Brown [Wed, 3 Aug 2011 09:04:29 +0000 (18:04 +0900)]
mfd: Fix value of WM8994_CONFIGURE_GPIO
commit
8efcc57dedfebc99c3cd39564e3fc47cd1a24b75 upstream.
This needs to be an out of band value for the register and on this device
registers are 16 bit so we must shift left one to the 17th bit.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Lars-Peter Clausen [Mon, 5 Sep 2011 11:49:57 +0000 (13:49 +0200)]
ASoC: Blackfin: bf5xx-ad193x: Fix codec device name
commit
c5d2e650bd805a00ff9af537d5b5dede598a198c upstream.
Fix the codec_name field of the dai_link to match the actual device name
of the codec. Otherwise the card won't be instantiated.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mark Brown [Sun, 4 Sep 2011 15:18:18 +0000 (08:18 -0700)]
ASoC: Fix reporting of partial jack updates
commit
747da0f80e566500421bd7760b2e050fea3fde5e upstream.
We need to report the entire jack state to the core jack code, not just
the bits that were being updated by the caller, otherwise the status
reported by other detection methods will be omitted from the state seen
by userspace.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jeff Kirsher [Wed, 31 Aug 2011 00:58:56 +0000 (20:58 -0400)]
e1000: Fix driver to be used on PA RISC C8000 workstations
commit
e2faeec2de9e2c73958e6ea6065dde1e8cd6f3a2 upstream.
The checksum field in the EEPROM on HPPA is really not a
checksum but a signature (0x16d6). So allow 0x16d6 as the
matching checksum on HPPA systems.
This issue is present on longterm/stable kernels, I have
verified that this patch is applicable back to at least
2.6.32.y kernels.
v2- changed ifdef to use CONFIG_PARISC instead of __hppa__
CC: Guy Martin <gmsoft@tuxicoman.be>
CC: Rolf Eike Beer <eike-kernel@sf-tec.de>
CC: Matt Turner <mattst88@gmail.com>
Reported-by: Mikulas Patocka <mikulas@artax.kerlin.mff.cuni.cz>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Felix Fietkau [Mon, 29 Aug 2011 08:06:14 +0000 (10:06 +0200)]
ath9k_hw: fix calibration on 5 ghz
commit
0e4660cbe51276e86dbdab17228733dbcdb49249 upstream.
ADC calibrations cannot run on 5 GHz with fast clock enabled. They
need to be disabled, otherwise they'll hang and IQ mismatch calibration
will not be run either.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Reported-by: Adrian Chadd <adrian@freebsd.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Len Brown [Tue, 30 Aug 2011 03:01:58 +0000 (23:01 -0400)]
acpica: ACPI_MAX_SLEEP should be 2 sec, not 20
commit
b33c25d6a62ac253caabda2b5f43258abff451c0 upstream.
This limit is a workaround for AML that sleeps too long,
but the workaround didn't work b/c of a typo.
https://bugzilla.kernel.org/show_bug.cgi?id=13195
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Stanislaw Gruszka [Fri, 26 Aug 2011 15:24:59 +0000 (17:24 +0200)]
iwlegacy: fix BUG_ON(info->control.rates[0].idx < 0)
commit
7c2510120e9b43b0caf32c3786a6ab831f7d9e87 upstream.
When trying to connect to 5GHz we can provide negative index to
mac80211 what trigger BUG_ON. Reason of iwl-3945-rs malfunction
on 5GHz is unknown and needs further investigation. For now, to
do not trigger a bug, correct value and just print WARNING.
Address bug:
https://bugzilla.redhat.com/show_bug.cgi?id=730653
Reported-and-tested-by: Jan Teichmann <jan.teichmann@gmail.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Andrew Vasquez [Tue, 16 Aug 2011 18:29:28 +0000 (11:29 -0700)]
qla2xxx: Correct inadvertent loop state transitions during port-update handling.
commit
58b48576966ed0afd3f63ef17480ec12748a7119 upstream.
Transitioning to a LOOP_UPDATE loop-state could cause the driver
to miss normal link/target processing. LOOP_UPDATE is a crufty
artifact leftover from at time the driver performed it's own
internal command-queuing. Safely remove this state.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Stephen M. Cameron [Tue, 9 Aug 2011 13:18:01 +0000 (08:18 -0500)]
hpsa: fix physical device lun and target numbering problem
commit
01350d05539d1c95ef3568d062d864ab76ae7670 upstream.
If a physical device exposed to the OS by hpsa
is replaced (e.g. one hot plug tape drive is replaced
by another, or a tape drive is placed into "OBDR" mode
in which it acts like a CD-ROM device) and a rescan is
initiated, the replaced device will be added to the
SCSI midlayer with target and lun numbers set to -1.
After that, a panic is likely to ensue. When a physical
device is replaced, the lun and target number should be
preserved.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Stephen M. Cameron [Tue, 9 Aug 2011 13:17:30 +0000 (08:17 -0500)]
hpsa: fix problem that OBDR devices are not detected
commit
0b0e1d6cbcc8627970e0399df8f06edd690ec7d9 upstream.
The test to detect OBDR ("One Button Disaster Recovery")
cd-rom devices was comparing against uninitialized data.
Fixed by moving the test for the device to where the
inquiry data is collected, and uninitialized variable
altogether as it wasn't really being used.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Timur Tabi [Tue, 23 Aug 2011 21:48:26 +0000 (16:48 -0500)]
ASoC: MPC5200: replace of_device with platform_device
commit
3bdf28feafc52864bd7f17b39deec64833a89d19 upstream.
'struct of_device' no longer exists, and its functionality has been merged
into platform_device. Update the MPC5200 audio DMA driver (mpc5200_dma)
accordingly. This fixes a build break.
Signed-off-by: Timur Tabi <timur@freescale.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dan Williams [Sat, 30 Jul 2011 00:16:45 +0000 (17:16 -0700)]
isci: fix 32-bit operation when CONFIG_HIGHMEM64G=n
commit
ee33e2b771f9e9e4aaba2bb2ace7b727fe451a8b upstream.
The unsolicited frame control infrastructure requires a table of dma
addresses for the hardware to lookup the frame buffer location by an
index. The hardware expects the elements of this table to be 64-bit
quantities, so we cannot reference these elements as dma_addr_t. All
unsolicited frame protocols are affected, particularly SATA-PIO and SMP
which prevented direct-attached SATA drives and expander-attached drives
to not be discovered.
Reported-by: Jacek Danecki <jacek.danecki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dan Williams [Sat, 30 Jul 2011 00:16:40 +0000 (17:16 -0700)]
isci: fix sata response handling
commit
1a878284473284f9577d44babf16d87152a05c33 upstream.
A bug (likely copy/paste) that has been carried from the original
implementation. The unsolicited frame handling structure returns the
d2h fis in the isci_request.stp.rsp buffer.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jim Garlick [Sat, 20 Aug 2011 18:51:18 +0000 (00:21 +0530)]
fs/9p: Use protocol-defined value for lock/getlock 'type' field.
commit
51b8b4fb32271d39fbdd760397406177b2b0fd36 upstream.
Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Harsh Prateek Bora <harsh@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Aneesh Kumar K.V [Tue, 16 Aug 2011 16:49:28 +0000 (22:19 +0530)]
fs/9p: Always ask new inode in lookup for cache mode disabled
commit
73f507171cfa407b19f254aef95cbb058c8180cf upstream.
This make sure we don't end up reusing the unlinked inode object.
The ideal way is to use inode i_generation. But i_generation is
not available in userspace always.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Aneesh Kumar K.V [Wed, 17 Aug 2011 16:56:04 +0000 (16:56 +0000)]
net/9p: Fix kernel crash with msize 512K
commit
b49d8b5d7007a673796f3f99688b46931293873e upstream.
With msize equal to 512K (PAGE_SIZE * VIRTQUEUE_NUM), we hit multiple
crashes. This patch fix those.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Aneesh Kumar K.V [Wed, 3 Aug 2011 14:25:32 +0000 (19:55 +0530)]
fs/9p: Add OS dependent open flags in 9p protocol
commit
f88657ce3f9713a0c62101dffb0e972a979e77b9 upstream.
Some of the flags are OS/arch dependent we add a 9p
protocol value which maps to asm-generic/fcntl.h values in Linux
Based on the original patch from Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
[extra comments from author as to why this needs to go to stable:
Earlier for different operation such as open we used the values of open
flag as defined by the OS. But some of these flags such as O_DIRECT are
arch dependent. So if we have the 9p client and server running on
different architectures, we end up with client sending client
architecture value of these open flag and server will try to map these
values to what its architecture states. For ex: O_DIRECT on a x86 client
maps to
#define O_DIRECT
00040000
Where as on sparc server it will maps to
#define O_DIRECT 0x100000
Hence we need to map these open flags to OS/arch independent flag
values. Getting these changes to an early version of kernel ensures us
that we work with different combination of client and server. We should
ideally backport this patch to all possible kernel version.]
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Harsh Prateek Bora <harsh@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Aneesh Kumar K.V [Mon, 25 Jul 2011 18:06:33 +0000 (18:06 +0000)]
fs/9p: Don't update file type when updating file attributes
commit
45089142b1497dab2327d60f6c71c40766fc3ea4 upstream.
We should only update attributes that we can change on stat2inode.
Also do file type initialization in v9fs_init_inode.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Aneesh Kumar K.V [Mon, 25 Jul 2011 18:06:32 +0000 (18:06 +0000)]
fs/9p: Add fid before dentry instantiation
commit
5441ae5eb3614d3c28f77073370738a2820c88e4 upstream.
d_instantiate marks the dentry positive. So a parallel lookup and mkdir of
the directory can find dentry that doesn't have fid attached. This can result
in both the code path doing v9fs_fid_add which results in v9fs_dentry leak.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fenghua Yu [Mon, 4 Jul 2011 08:36:16 +0000 (08:36 +0000)]
ACPICA: Do not repair _TSS return package if _PSS is present
commit
8f9c91273e36e5762c617c23e4fd48d5172e0dac upstream.
We can only sort the _TSS return package if there is no _PSS
in the same scope. This is because if _PSS is present, the ACPI
specification dictates that the _TSS Power Dissipation field is
to be ignored, and therefore some BIOSs leave garbage values in
the _TSS Power field(s). In this case, it is best to just return
the _TSS package as-is.
Reported-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Joerg Roedel [Fri, 2 Sep 2011 12:10:32 +0000 (14:10 +0200)]
iommu/amd: Make sure iommu->need_sync contains correct value
commit
f1ca1512e765337a7c09eb875eedef8ea4e07654 upstream.
The value is only set to true but never set back to false,
which causes to many completion-wait commands to be sent to
hardware. Fix it with this patch.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Joerg Roedel [Fri, 2 Sep 2011 12:19:50 +0000 (14:19 +0200)]
iommu/amd: Don't take domain->lock recursivly
commit
e33acde91140f1809952d1c135c36feb66a51887 upstream.
The domain_flush_devices() function takes the domain->lock.
But this function is only called from update_domain() which
itself is already called unter the domain->lock. This causes
a deadlock situation when the dma-address-space of a domain
grows larger than 1GB.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Randy Dunlap [Wed, 22 Jun 2011 03:32:53 +0000 (20:32 -0700)]
irda: fix smsc-ircc2 section mismatch warning
commit
f470e5ae34d68880a38aa79ee5c102ebc2a1aef6 upstream.
Fix section mismatch warning:
WARNING: drivers/net/irda/smsc-ircc2.o(.devinit.text+0x1a7): Section mismatch in reference from the function smsc_ircc_pnp_probe() to the function .init.text:smsc_ircc_open()
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Al Viro [Sat, 23 Jul 2011 06:28:13 +0000 (02:28 -0400)]
9p: close ACL leaks
commit
1ec95bf34d976b38897d1977b155a544d77b05e7 upstream.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Venkateswararao Jujjuri (JV) [Thu, 30 Jun 2011 01:06:33 +0000 (18:06 -0700)]
net/9p: Fix the msize calculation.
commit
c9ffb05ca5b5098d6ea468c909dd384d90da7d54 upstream.
msize represents the maximum PDU size that includes P9_IOHDRSZ.
Signed-off-by: Venkateswararao Jujjuri "<jvrao@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Aneesh Kumar K.V [Wed, 6 Jul 2011 11:02:31 +0000 (16:32 +0530)]
fs/9p: Always ask new inode in create
commit
ed80fcfac2565fa866d93ba14f0e75de17a8223e upstream.
This make sure we don't end up reusing the unlinked inode object.
The ideal way is to use inode i_generation. But i_generation is
not available in userspace always.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Prem Karat [Fri, 6 May 2011 12:54:18 +0000 (18:24 +0530)]
fs/9p: Fix invalid mount options/args
commit
a2dd43bb0d7b9ce28f8a39254c25840c0730498e upstream.
Without this fix, if any invalid mount options/args are passed while mouting
the 9p fs, no error (-EINVAL) is returned and default arg value is assigned.
This fix returns -EINVAL when an invalid arguement is found while parsing
mount options.
Signed-off-by: Prem Karat <prem.karat@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Aneesh Kumar K.V [Mon, 11 Jul 2011 16:40:59 +0000 (16:40 +0000)]
fs/9p: When doing inode lookup compare qid details and inode mode bits.
commit
fd2421f54423f307ecd31bdebdca6bc317e0c492 upstream.
This make sure we don't use wrong inode from the inode hash. The inode number
of the file deleted is reused by the next file system object created
and if we only use inode number for inode hash lookup we could end up
with wrong struct inode.
Also compare inode generation number. Not all Linux file system provide
st_gen in userspace. So it could be 0;
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Aneesh Kumar K.V [Mon, 11 Jul 2011 16:40:58 +0000 (16:40 +0000)]
fs/9p: Fid is not valid after a failed clunk.
commit
5034990e28efb2d232ee82443a9edd62defd17ba upstream.
free the fid even in case of failed clunk.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
jvrao [Thu, 30 Jun 2011 23:18:41 +0000 (23:18 +0000)]
VirtIO can transfer VIRTQUEUE_NUM of pages.
commit
7f781679dd596c8abde8336b4d0d166d6a4aad04 upstream.
Signed-off-by: Venkateswararao Jujjuri "<jvrao@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
jvrao [Thu, 30 Jun 2011 23:18:39 +0000 (23:18 +0000)]
Fix the size of receive buffer packing onto VirtIO ring.
commit
114e6f3a5ede73d5b56e145f04680c61c3dd67c4 upstream.
Signed-off-by: Venkateswararao Jujjuri "<jvrao@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Eric Van Hensbergen [Thu, 14 Jul 2011 00:12:18 +0000 (19:12 -0500)]
net/9p: fix client code to fail more gracefully on protocol error
commit
b85f7d92d7bd7e3298159e8b1eed8cb8cbbb0348 upstream.
There was a BUG_ON to protect against a bad id which could be dealt with
more gracefully.
Reported-by: Natalie Orlin <norlin@us.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Florian Mickler [Wed, 10 Aug 2011 10:05:20 +0000 (07:05 -0300)]
vp7045: fix buffer setup
commit
fc61ccd35fd59d5362d37c8bf9c0526c85086c84 upstream.
dvb_usb_device_init calls the frontend_attach method of this driver which
uses vp7045_usb_ob. In order to have a buffer ready in vp7045_usb_op, it has to
be allocated before that happens.
Luckily we can use the whole private data as the buffer as it gets separately
allocated on the heap via kzalloc in dvb_usb_device_init and is thus apt for
use via usb_control_msg.
This fixes a
BUG: unable to handle kernel paging request at
0000000000001e78
reported by Tino Keitel and diagnosed by Dan Carpenter.
Tested-by: Tino Keitel <tino.keitel@tikei.de>
Signed-off-by: Florian Mickler <florian@mickler.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jarod Wilson [Mon, 8 Aug 2011 20:20:40 +0000 (17:20 -0300)]
nuvoton-cir: simplify raw IR sample handling
commit
de4ed0c111ed078b8729a5cc49c23197740f5bad upstream.
The nuvoton-cir driver was storing up consecutive pulse-pulse and
space-space samples internally, for no good reason, since
ir_raw_event_store_with_filter() already merges back to back like
samples types for us. This should also fix a regression introduced late
in 3.0 that related to a timeout change, which actually becomes correct
when coupled with this change. Tested with RC6 and RC5 on my own
nuvoton-cir hardware atop vanilla 3.0.0, after verifying quirky
behavior in 3.0 due to the timeout change.
Reported-by: Stephan Raue <sraue@openelec.tv>
CC: Stephan Raue <sraue@openelec.tv>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
NeilBrown [Sat, 10 Sep 2011 07:21:28 +0000 (17:21 +1000)]
md: Fix handling for devices from 2TB to 4TB in 0.90 metadata.
commit
27a7b260f71439c40546b43588448faac01adb93 upstream.
0.90 metadata uses an unsigned 32bit number to count the number of
kilobytes used from each device.
This should allow up to 4TB per device.
However we multiply this by 2 (to get sectors) before casting to a
larger type, so sizes above 2TB get truncated.
Also we allow rdev->sectors to be larger than 4TB, so it is possible
for the array to be resized larger than the metadata can handle.
So make sure rdev->sectors never exceeds 4TB when 0.90 metadata is in
used.
Also the sanity check at the end of super_90_load should include level
1 as it used ->size too. (RAID0 and Linear don't use ->size at all).
Reported-by: Pim Zandbergen <P.Zandbergen@macroscoop.nl>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
NeilBrown [Sat, 10 Sep 2011 07:20:21 +0000 (17:20 +1000)]
Avoid dereferencing a 'request_queue' after last close.
commit
94007751bb02797ba87bac7aacee2731ac2039a3 upstream.
On the last close of an 'md' device which as been stopped, the device
is destroyed and in particular the request_queue is freed. The free
is done in a separate thread so it might happen a short time later.
__blkdev_put calls bdev_inode_switch_bdi *after* ->release has been
called.
Since commit
f758eeabeb96f878c860e8f110f94ec8820822a9
bdev_inode_switch_bdi will dereference the 'old' bdi, which lives
inside a request_queue, to get a spin lock. This causes the last
close on an md device to sometime take a spin_lock which lives in
freed memory - which results in an oops.
So move the called to bdev_inode_switch_bdi before the call to
->release.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Acked-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Marcin Slusarz [Mon, 22 Aug 2011 21:14:05 +0000 (23:14 +0200)]
drm/nouveau: properly handle allocation failure in nouveau_sgdma_populate
commit
17c8b960930da3599e47801a54ac0ea1070545d2 upstream.
Not cleaning after alloc failure would result in crash on destroy,
because nouveau_sgdma_clear assumes "ttm_alloced" to be not null when
"pages" is not null.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Linus Walleij [Tue, 2 Aug 2011 15:48:38 +0000 (17:48 +0200)]
ARM: davinci: fix cache flush build error
commit
897a6a1a14837d6d582bfd1fd7aba00be44b6469 upstream.
The TNET variant of DaVinci compiles some code that it shares
with other DaVinci variants, however it has a V6 CPU rather than
an ARM926T, thus the hardcoded call to arm926_flush_kern_cache_all()
in sleep.S will obviously fail, and we need to build with the
v6_flush_kern_cache_all() call instead. This was triggered by
manually altering the DaVinci config to build the TNET version.
Cc: Dave Martin <dave.martin@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Sudhakar Rajashekhara [Tue, 12 Jul 2011 10:28:53 +0000 (15:58 +0530)]
ARM: davinci: da850 EVM: read mac address from SPI flash
commit
810198bc9c109489dfadc57131c5183ce6ad2d7d upstream.
DA850/OMAP-L138 EMAC driver uses random mac address instead of
a fixed one because the mac address is not stuffed into EMAC
platform data.
This patch provides a function which reads the mac address
stored in SPI flash (registered as MTD device) and populates the
EMAC platform data. The function which reads the mac address is
registered as a callback which gets called upon addition of MTD
device.
NOTE: In case the MAC address stored in SPI flash is erased, follow
the instructions at [1] to restore it.
[1] http://processors.wiki.ti.com/index.php/GSG:_OMAP-L138_DVEVM_Additional_Procedures#Restoring_MAC_address_on_SPI_Flash
Modifications in v2:
Guarded registering the mtd_notifier only when MTD is enabled.
Earlier this was handled using mtd_has_partitions() call, but
this has been removed in Linux v3.0.
Modifications in v3:
a. Guarded da850_evm_m25p80_notify_add() function and
da850evm_spi_notifier structure with CONFIG_MTD macros.
b. Renamed da850_evm_register_mtd_user() function to
da850_evm_setup_mac_addr() and removed the struct mtd_notifier
argument to this function.
c. Passed the da850evm_spi_notifier structure to register_mtd_user()
function.
Modifications in v4:
Moved the da850_evm_setup_mac_addr() function within the first
CONFIG_MTD ifdef construct.
Signed-off-by: Sudhakar Rajashekhara <sudhakar.raj@ti.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Linus Walleij [Tue, 6 Sep 2011 07:08:13 +0000 (08:08 +0100)]
ARM: 7081/1: mach-integrator: fix the clocksource
commit
bb9ea77846620ed2b37e74c852d72c7a476b248c upstream.
I was intrigued by the fact that the clock stood still on
the Integrator, but it wasn't strange at all, because the
timer was set up all wrong and probably has been for a
while. With this patch the clock starts ticking again:
make the timer periodic (reload), |= on the divisor bit
and load the timer before starting it.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Guenter Roeck [Sun, 28 Aug 2011 20:01:49 +0000 (13:01 -0700)]
hwmon: (max16065) Fix current calculation
commit
ff71c182f461da5ae9d2d65f8a63f5a9193b9be1 upstream.
Current calculation is completely wrong. Add missing brackets to fix it.
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Konrad Rzeszutek Wilk [Thu, 1 Sep 2011 13:48:27 +0000 (09:48 -0400)]
xen/smp: Warn user why they keel over - nosmp or noapic and what to use instead.
commit
ed467e69f16e6b480e2face7bc5963834d025f91 upstream.
We have hit a couple of customer bugs where they would like to
use those parameters to run an UP kernel - but both of those
options turn of important sources of interrupt information so
we end up not being able to boot. The correct way is to
pass in 'dom0_max_vcpus=1' on the Xen hypervisor line and
the kernel will patch itself to be a UP kernel.
Fixes bug: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=637308
Acked-by: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Igor Mammedov [Thu, 1 Sep 2011 11:46:55 +0000 (13:46 +0200)]
xen: x86_32: do not enable iterrupts when returning from exception in interrupt context
commit
d198d499148a0c64a41b3aba9e7dd43772832b91 upstream.
If vmalloc page_fault happens inside of interrupt handler with interrupts
disabled then on exit path from exception handler when there is no pending
interrupts, the following code (arch/x86/xen/xen-asm_32.S:112):
cmpw $0x0001, XEN_vcpu_info_pending(%eax)
sete XEN_vcpu_info_mask(%eax)
will enable interrupts even if they has been previously disabled according to
eflags from the bounce frame (arch/x86/xen/xen-asm_32.S:99)
testb $X86_EFLAGS_IF>>8, 8+1+ESP_OFFSET(%esp)
setz XEN_vcpu_info_mask(%eax)
Solution is in setting XEN_vcpu_info_mask only when it should be set
according to
cmpw $0x0001, XEN_vcpu_info_pending(%eax)
but not clearing it if there isn't any pending events.
Reproducer for bug is attached to RHBZ 707552
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Girish K S [Fri, 26 Aug 2011 09:28:18 +0000 (14:58 +0530)]
mmc: sdhci-s3c: Fix mmc card I/O problem
commit
49bb1e619568ec84785ceb366f07db2a6f0b64cc upstream.
This patch fixes the problem in sdhci-s3c host driver for Samsung Soc's.
During the card identification stage the mmc core driver enumerates for
the best bus width in combination with the highest available data rate.
It starts enumerating from the highest bus width (8) to lowest width (1).
In case of few MMC cards the 4-bit bus enumeration fails and tries
the 1-bit bus enumeration. When switched to 1-bit bus mode the host driver
has to clear the previous bus width setting and apply the new setting.
The current patch will clear the previous bus mode and apply the new
mode setting.
Signed-off-by: Girish K S <girish.shivananjappa@linaro.org>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mika Westerberg [Thu, 18 Aug 2011 12:23:49 +0000 (15:23 +0300)]
mmc: core: use non-reentrant workqueue for clock gating
commit
50a50f9248497484c678631a9c1a719f1aaeab79 upstream.
The default multithread workqueue can cause the same work to be executed
concurrently on a different CPUs. This isn't really suitable for clock
gating as it might already gated the clock and gating it twice results both
host->clk_old and host->ios.clock to be set to 0.
To prevent this from happening we use system_nrt_wq instead.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mika Westerberg [Thu, 18 Aug 2011 12:23:48 +0000 (15:23 +0300)]
mmc: core: prevent aggressive clock gating racing with ios updates
commit
778e277cb82411c9002ca28ccbd216c4d9eb9158 upstream.
We have seen at least two different races when clock gating kicks in in a
middle of ios structure update.
First one happens when ios->clock is changed outside of aggressive clock
gating framework, for example via mmc_set_clock(). The race might happen
when we run following code:
mmc_set_ios():
...
if (ios->clock > 0)
mmc_set_ungated(host);
Now if gating kicks in right after the condition check we end up setting
host->clk_gated to false even though we have just gated the clock. Next
time a request is started we try to ungate and restore the clock in
mmc_host_clk_hold(). However since we have host->clk_gated set to false the
original clock is not restored.
This eventually will cause the host controller to hang since its clock is
disabled while we are trying to issue a request. For example on Intel
Medfield platform we see:
[ 13.818610] mmc2: Timeout waiting for hardware interrupt.
[ 13.818698] sdhci: =========== REGISTER DUMP (mmc2)===========
[ 13.818753] sdhci: Sys addr: 0x00000000 | Version: 0x00008901
[ 13.818804] sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000
[ 13.818853] sdhci: Argument: 0x00000000 | Trn mode: 0x00000000
[ 13.818903] sdhci: Present: 0x1fff0000 | Host ctl: 0x00000001
[ 13.818951] sdhci: Power: 0x0000000d | Blk gap: 0x00000000
[ 13.819000] sdhci: Wake-up: 0x00000000 | Clock: 0x00000000
[ 13.819049] sdhci: Timeout: 0x00000000 | Int stat: 0x00000000
[ 13.819098] sdhci: Int enab: 0x00ff00c3 | Sig enab: 0x00ff00c3
[ 13.819147] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000
[ 13.819196] sdhci: Caps: 0x6bee32b2 | Caps_1: 0x00000000
[ 13.819245] sdhci: Cmd: 0x00000000 | Max curr: 0x00000000
[ 13.819292] sdhci: Host ctl2: 0x00000000
[ 13.819331] sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000
[ 13.819377] sdhci: ===========================================
[ 13.919605] mmc2: Reset 0x2 never completed.
and it never recovers.
Second race might happen while running mmc_power_off():
static void mmc_power_off(struct mmc_host *host)
{
host->ios.clock = 0;
host->ios.vdd = 0;
[ clock gating kicks in here ]
/*
* Reset ocr mask to be the highest possible voltage supported for
* this mmc host. This value will be used at next power up.
*/
host->ocr = 1 << (fls(host->ocr_avail) - 1);
if (!mmc_host_is_spi(host)) {
host->ios.bus_mode = MMC_BUSMODE_OPENDRAIN;
host->ios.chip_select = MMC_CS_DONTCARE;
}
host->ios.power_mode = MMC_POWER_OFF;
host->ios.bus_width = MMC_BUS_WIDTH_1;
host->ios.timing = MMC_TIMING_LEGACY;
mmc_set_ios(host);
}
If the clock gating worker kicks in while we are only partially updated the
ios structure the host controller gets incomplete ios and might not work as
supposed. Again on Intel Medfield platform we get:
[ 4.185349] kernel BUG at drivers/mmc/host/sdhci.c:1155!
[ 4.185422] invalid opcode: 0000 [#1] PREEMPT SMP
[ 4.185509] Modules linked in:
[ 4.185565]
[ 4.185608] Pid: 4, comm: kworker/0:0 Not tainted 3.0.0+ #240 Intel Corporation Medfield/iCDKA
[ 4.185742] EIP: 0060:[<
c136364e>] EFLAGS:
00010083 CPU: 0
[ 4.185827] EIP is at sdhci_set_power+0x3e/0xd0
[ 4.185891] EAX:
f5ff98e0 EBX:
f5ff98e0 ECX:
00000000 EDX:
00000001
[ 4.185970] ESI:
f5ff977c EDI:
f5ff9904 EBP:
f644fe98 ESP:
f644fe94
[ 4.186049] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 4.186125] Process kworker/0:0 (pid: 4, ti=
f644e000 task=
f644c0e0 task.ti=
f644e000)
[ 4.186219] Stack:
[ 4.186257]
f5ff98e0 f644feb0 c1365173 00000282 f5ff9460 f5ff96e0 f5ff96e0 f644feec
[ 4.186418]
c1355bd8 f644c0e0 c1499c3d f5ff96e0 f644fed4 00000006 f5ff96e0 00000286
[ 4.186579]
f644fedc c107922b f644feec 00000286 f5ff9460 f5ff9700 f644ff10 c135839e
[ 4.186739] Call Trace:
[ 4.186802] [<
c1365173>] sdhci_set_ios+0x1c3/0x340
[ 4.186883] [<
c1355bd8>] mmc_gate_clock+0x68/0x120
[ 4.186963] [<
c1499c3d>] ? _raw_spin_unlock_irqrestore+0x4d/0x60
[ 4.187052] [<
c107922b>] ? trace_hardirqs_on+0xb/0x10
[ 4.187134] [<
c135839e>] mmc_host_clk_gate_delayed+0xbe/0x130
[ 4.187219] [<
c105ec09>] ? process_one_work+0xf9/0x5b0
[ 4.187300] [<
c135841d>] mmc_host_clk_gate_work+0xd/0x10
[ 4.187379] [<
c105ec82>] process_one_work+0x172/0x5b0
[ 4.187457] [<
c105ec09>] ? process_one_work+0xf9/0x5b0
[ 4.187538] [<
c1358410>] ? mmc_host_clk_gate_delayed+0x130/0x130
[ 4.187625] [<
c105f3c8>] worker_thread+0x118/0x330
[ 4.187700] [<
c1496cee>] ? preempt_schedule+0x2e/0x50
[ 4.187779] [<
c105f2b0>] ? rescuer_thread+0x1f0/0x1f0
[ 4.187857] [<
c1062cf4>] kthread+0x74/0x80
[ 4.187931] [<
c1062c80>] ? __init_kthread_worker+0x60/0x60
[ 4.188015] [<
c149acfa>] kernel_thread_helper+0x6/0xd
[ 4.188079] Code: 81 fa 00 00 04 00 0f 84 a7 00 00 00 7f 21 81 fa 80 00 00 00 0f 84 92 00 00 00 81 fa 00 00 0
[ 4.188780] EIP: [<
c136364e>] sdhci_set_power+0x3e/0xd0 SS:ESP 0068:
f644fe94
[ 4.188898] ---[ end trace
a7b23eecc71777e4 ]---
This BUG() comes from the fact that ios.power_mode was still in previous
value (MMC_POWER_ON) and ios.vdd was set to zero.
We prevent these by inhibiting the clock gating while we update the ios
structure.
Both problems can be reproduced by simply running the device in a reboot
loop.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mika Westerberg [Thu, 18 Aug 2011 12:23:47 +0000 (15:23 +0300)]
mmc: rename mmc_host_clk_{ungate|gate} to mmc_host_clk_{hold|release}
commit
08c14071fda4e69abb9d5b1566651cd092b158d3 upstream.
As per suggestion by Linus Walleij:
> If you think the names of the functions are confusing then
> you may rename them, say like this:
>
> mmc_host_clk_ungate() -> mmc_host_clk_hold()
> mmc_host_clk_gate() -> mmc_host_clk_release()
>
> Which would make the usecases more clear
(This is CC'd to stable@ because the next two patches, which fix
observable races, depend on it.)
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Andrey Vagin [Tue, 30 Aug 2011 08:32:36 +0000 (12:32 +0400)]
x86, perf: Check that current->mm is alive before getting user callchain
commit
20afc60f892d285fde179ead4b24e6a7938c2f1b upstream.
An event may occur when an mm is already released.
I added an event in dequeue_entity() and caught a panic with
the following backtrace:
[ 434.421110] BUG: unable to handle kernel NULL pointer dereference at
0000000000000050
[ 434.421258] IP: [<
ffffffff810464ac>] __get_user_pages_fast+0x9c/0x120
...
[ 434.421258] Call Trace:
[ 434.421258] [<
ffffffff8101ae81>] copy_from_user_nmi+0x51/0xf0
[ 434.421258] [<
ffffffff8109a0d5>] ? sched_clock_local+0x25/0x90
[ 434.421258] [<
ffffffff8101b048>] perf_callchain_user+0x128/0x170
[ 434.421258] [<
ffffffff811154cd>] ? __perf_event_header__init_id+0xed/0x100
[ 434.421258] [<
ffffffff81116690>] perf_prepare_sample+0x200/0x280
[ 434.421258] [<
ffffffff81118da8>] __perf_event_overflow+0x1b8/0x290
[ 434.421258] [<
ffffffff81065240>] ? tg_shares_up+0x0/0x670
[ 434.421258] [<
ffffffff8104fe1a>] ? walk_tg_tree+0x6a/0xb0
[ 434.421258] [<
ffffffff81118f44>] perf_swevent_overflow+0xc4/0xf0
[ 434.421258] [<
ffffffff81119150>] do_perf_sw_event+0x1e0/0x250
[ 434.421258] [<
ffffffff81119204>] perf_tp_event+0x44/0x70
[ 434.421258] [<
ffffffff8105701f>] ftrace_profile_sched_block+0xdf/0x110
[ 434.421258] [<
ffffffff8106121d>] dequeue_entity+0x2ad/0x2d0
[ 434.421258] [<
ffffffff810614ec>] dequeue_task_fair+0x1c/0x60
[ 434.421258] [<
ffffffff8105818a>] dequeue_task+0x9a/0xb0
[ 434.421258] [<
ffffffff810581e2>] deactivate_task+0x42/0xe0
[ 434.421258] [<
ffffffff814bc019>] thread_return+0x191/0x808
[ 434.421258] [<
ffffffff81098a44>] ? switch_task_namespaces+0x24/0x60
[ 434.421258] [<
ffffffff8106f4c4>] do_exit+0x464/0x910
[ 434.421258] [<
ffffffff8106f9c8>] do_group_exit+0x58/0xd0
[ 434.421258] [<
ffffffff8106fa57>] sys_exit_group+0x17/0x20
[ 434.421258] [<
ffffffff8100b202>] system_call_fastpath+0x16/0x1b
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1314693156-24131-1-git-send-email-avagin@openvz.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
WANG Cong [Thu, 18 Aug 2011 12:36:57 +0000 (20:36 +0800)]
sched: Fix a memory leak in __sdt_free()
commit
feff8fa0075bdfd43c841e9d689ed81adda988d6 upstream.
This patch fixes the following memory leak:
unreferenced object 0xffff880107266800 (size 512):
comm "sched-powersave", pid 3718, jiffies
4323097853 (age 27495.450s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<
ffffffff81133940>] create_object+0x187/0x28b
[<
ffffffff814ac103>] kmemleak_alloc+0x73/0x98
[<
ffffffff811232ba>] __kmalloc_node+0x104/0x159
[<
ffffffff81044b98>] kzalloc_node.clone.97+0x15/0x17
[<
ffffffff8104cb90>] build_sched_domains+0xb7/0x7f3
[<
ffffffff8104d4df>] partition_sched_domains+0x1db/0x24a
[<
ffffffff8109ee4a>] do_rebuild_sched_domains+0x3b/0x47
[<
ffffffff810a00c7>] rebuild_sched_domains+0x10/0x12
[<
ffffffff8104d5ba>] sched_power_savings_store+0x6c/0x7b
[<
ffffffff8104d5df>] sched_mc_power_savings_store+0x16/0x18
[<
ffffffff8131322c>] sysdev_class_store+0x20/0x22
[<
ffffffff81193876>] sysfs_write_file+0x108/0x144
[<
ffffffff81135b10>] vfs_write+0xaf/0x102
[<
ffffffff81135d23>] sys_write+0x4d/0x74
[<
ffffffff814c8a42>] system_call_fastpath+0x16/0x1b
[<
ffffffffffffffff>] 0xffffffffffffffff
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1313671017-4112-1-git-send-email-amwang@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thomas Gleixner [Wed, 22 Jun 2011 17:47:01 +0000 (19:47 +0200)]
sched: Move blk_schedule_flush_plug() out of __schedule()
commit
9c40cef2b799f9b5e7fa5de4d2ad3a0168ba118c upstream.
There is no real reason to run blk_schedule_flush_plug() with
interrupts and preemption disabled.
Move it into schedule() and call it when the task is going voluntarily
to sleep. There might be false positives when the task is woken
between that call and actually scheduling, but that's not really
different from being woken immediately after switching away.
This fixes a deadlock in the scheduler where the
blk_schedule_flush_plug() callchain enables interrupts and thereby
allows a wakeup to happen of the task that's going to sleep.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-dwfxtra7yg1b5r65m32ywtct@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thomas Gleixner [Wed, 22 Jun 2011 17:47:00 +0000 (19:47 +0200)]
sched: Separate the scheduler entry for preemption
commit
c259e01a1ec90063042f758e409cd26b2a0963c8 upstream.
Block-IO and workqueues call into notifier functions from the
scheduler core code with interrupts and preemption disabled. These
calls should be made before entering the scheduler core.
To simplify this, separate the scheduler core code into
__schedule(). __schedule() is directly called from the places which
set PREEMPT_ACTIVE and from schedule(). This allows us to add the work
checks into schedule(), so they are only called when a task voluntary
goes to sleep.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110622174918.813258321@linutronix.de
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
John Stultz [Fri, 22 Jul 2011 09:12:51 +0000 (09:12 +0000)]
rtc: Fix RTC PIE frequency limit
commit
938f97bcf1bdd1b681d5d14d1d7117a2e22d4434 upstream.
Thomas earlier submitted a fix to limit the RTC PIE freq, but
picked 5000Hz out of the air. Willy noticed that we should
instead use the 8192Hz max from the rtc man documentation.
Cc: Willy Tarreau <w@1wt.eu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
John Stultz [Wed, 10 Aug 2011 17:26:09 +0000 (10:26 -0700)]
alarmtimers: Avoid possible denial of service with high freq periodic timers
commit
6af7e471e5a7746b8024d70b4363d3dfe41d36b8 upstream.
Its possible to jam up the alarm timers by setting very small interval
timers, which will cause the alarmtimer subsystem to spend all of its time
firing and restarting timers. This can effectivly lock up a box.
A deeper fix is needed, closely mimicking the hrtimer code, but for now
just cap the interval to 100us to avoid userland hanging the system.
CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
John Stultz [Thu, 4 Aug 2011 14:51:56 +0000 (07:51 -0700)]
alarmtimers: Memset itimerspec passed into alarm_timer_get
commit
ea7802f630d356acaf66b3c0b28c00a945fc35dc upstream.
Following common_timer_get, zero out the itimerspec passed in.
CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
John Stultz [Thu, 4 Aug 2011 14:25:35 +0000 (07:25 -0700)]
alarmtimers: Avoid possible null pointer traversal
commit
971c90bfa2f0b4fe52d6d9002178d547706f1343 upstream.
We don't check if old_setting is non null before assigning it, so
correct this.
CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Troy Kisky [Fri, 24 Jun 2011 17:52:56 +0000 (10:52 -0700)]
MXC: iomux-v3: correct NO_PAD_CTRL definition
commit
425933b30b0ccfac58065bca6c853ea627443cdf upstream.
iomux-v3.c uses NO_PAD_CTRL as a 32 bit value
so it should not be shifted left by MUX_PAD_CTRL_SHIFT(41)
Previously, anything requesting NO_PAD_CTRL would get
their pad control register set to 0.
Since it is a pad control mask, place it with the other mask values.
Signed-off-by: Troy Kisky <troy.kisky@boundarydevices.com>
Acked-by: Lothar Waßmann <LW@KARO-electronics.de>
Tested-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Cc: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Carolyn Wyborny [Thu, 7 Jul 2011 00:24:56 +0000 (00:24 +0000)]
igb: fix WOL on second port of i350 device
commit
6d337dce664b6872ddf1655f6b1fcab76ce35b08 upstream.
This patch fixes a problem where WOL would fail on second port of i350
device.
Reported-by: Martin Wilck <martin.wilck@ts.fujitsu.com>
Reported-by: Stefan Assmann<sassmann@redhat.com>
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mel Gorman [Tue, 26 Jul 2011 00:12:30 +0000 (17:12 -0700)]
mm: page allocator: reconsider zones for allocation after direct reclaim
commit
76d3fbf8fbf6cc78ceb63549e0e0c5bc8a88f838 upstream.
With zone_reclaim_mode enabled, it's possible for zones to be considered
full in the zonelist_cache so they are skipped in the future. If the
process enters direct reclaim, the ZLC may still consider zones to be full
even after reclaiming pages. Reconsider all zones for allocation if
direct reclaim returns successfully.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stefan Priebe <s.priebe@profihost.ag>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mel Gorman [Tue, 26 Jul 2011 00:12:29 +0000 (17:12 -0700)]
mm: page allocator: initialise ZLC for first zone eligible for zone_reclaim
commit
cd38b115d5ad79b0100ac6daa103c4fe2c50a913 upstream.
There have been a small number of complaints about significant stalls
while copying large amounts of data on NUMA machines reported on a
distribution bugzilla. In these cases, zone_reclaim was enabled by
default due to large NUMA distances. In general, the complaints have not
been about the workload itself unless it was a file server (in which case
the recommendation was disable zone_reclaim).
The stalls are mostly due to significant amounts of time spent scanning
the preferred zone for pages to free. After a failure, it might fallback
to another node (as zonelists are often node-ordered rather than
zone-ordered) but stall quickly again when the next allocation attempt
occurs. In bad cases, each page allocated results in a full scan of the
preferred zone.
Patch 1 checks the preferred zone for recent allocation failure
which is particularly important if zone_reclaim has failed
recently. This avoids rescanning the zone in the near future and
instead falling back to another node. This may hurt node locality
in some cases but a failure to zone_reclaim is more expensive than
a remote access.
Patch 2 clears the zlc information after direct reclaim.
Otherwise, zone_reclaim can mark zones full, direct reclaim can
reclaim enough pages but the zone is still not considered for
allocation.
This was tested on a 24-thread 2-node x86_64 machine. The tests were
focused on large amounts of IO. All tests were bound to the CPUs on
node-0 to avoid disturbances due to processes being scheduled on different
nodes. The kernels tested are
3.0-rc6-vanilla Vanilla 3.0-rc6
zlcfirst Patch 1 applied
zlcreconsider Patches 1+2 applied
FS-Mark
./fs_mark -d /tmp/fsmark-10813 -D 100 -N 5000 -n 208 -L 35 -t 24 -S0 -s 524288
fsmark-3.0-rc6 3.0-rc6 3.0-rc6
vanilla zlcfirs zlcreconsider
Files/s min 54.90 ( 0.00%) 49.80 (-10.24%) 49.10 (-11.81%)
Files/s mean 100.11 ( 0.00%) 135.17 (25.94%) 146.93 (31.87%)
Files/s stddev 57.51 ( 0.00%) 138.97 (58.62%) 158.69 (63.76%)
Files/s max 361.10 ( 0.00%) 834.40 (56.72%) 802.40 (55.00%)
Overhead min 76704.00 ( 0.00%) 76501.00 ( 0.27%) 77784.00 (-1.39%)
Overhead mean
1485356.51 ( 0.00%)
1035797.83 (43.40%)
1594680.26 (-6.86%)
Overhead stddev
1848122.53 ( 0.00%) 881489.88 (109.66%)
1772354.90 ( 4.27%)
Overhead max
7989060.00 ( 0.00%)
3369118.00 (137.13%)
10135324.00 (-21.18%)
MMTests Statistics: duration
User/Sys Time Running Test (seconds) 501.49 493.91 499.93
Total Elapsed Time (seconds) 2451.57 2257.48 2215.92
MMTests Statistics: vmstat
Page Ins 46268 63840 66008
Page Outs
90821596 90671128 88043732
Swap Ins 0 0 0
Swap Outs 0 0 0
Direct pages scanned
13091697 8966863 8971790
Kswapd pages scanned 0
1830011 1831116
Kswapd pages reclaimed 0
1829068 1829930
Direct pages reclaimed
13037777 8956828 8648314
Kswapd efficiency 100% 99% 99%
Kswapd velocity 0.000 810.643 826.346
Direct efficiency 99% 99% 96%
Direct velocity 5340.128 3972.068 4048.788
Percentage direct scans 100% 83% 83%
Page writes by reclaim 0 3 0
Slabs scanned 796672 720640 720256
Direct inode steals
7422667 7160012 7088638
Kswapd inode steals 0
1736840 2021238
Test completes far faster with a large increase in the number of files
created per second. Standard deviation is high as a small number of
iterations were much higher than the mean. The number of pages scanned by
zone_reclaim is reduced and kswapd is used for more work.
LARGE DD
3.0-rc6 3.0-rc6 3.0-rc6
vanilla zlcfirst zlcreconsider
download tar 59 ( 0.00%) 59 ( 0.00%) 55 ( 7.27%)
dd source files 527 ( 0.00%) 296 (78.04%) 320 (64.69%)
delete source 36 ( 0.00%) 19 (89.47%) 20 (80.00%)
MMTests Statistics: duration
User/Sys Time Running Test (seconds) 125.03 118.98 122.01
Total Elapsed Time (seconds) 624.56 375.02 398.06
MMTests Statistics: vmstat
Page Ins
3594216 439368 407032
Page Outs
23380832 23380488 23377444
Swap Ins 0 0 0
Swap Outs 0 436 287
Direct pages scanned
17482342 69315973 82864918
Kswapd pages scanned 0 519123 575425
Kswapd pages reclaimed 0 466501 522487
Direct pages reclaimed
5858054 2732949 2712547
Kswapd efficiency 100% 89% 90%
Kswapd velocity 0.000 1384.254 1445.574
Direct efficiency 33% 3% 3%
Direct velocity 27991.453 184832.737 208171.929
Percentage direct scans 100% 99% 99%
Page writes by reclaim 0 5082 13917
Slabs scanned 17280 29952 35328
Direct inode steals 115257
1431122 332201
Kswapd inode steals 0 0 979532
This test downloads a large tarfile and copies it with dd a number of
times - similar to the most recent bug report I've dealt with. Time to
completion is reduced. The number of pages scanned directly is still
disturbingly high with a low efficiency but this is likely due to the
number of dirty pages encountered. The figures could probably be improved
with more work around how kswapd is used and how dirty pages are handled
but that is separate work and this result is significant on its own.
Streaming Mapped Writer
MMTests Statistics: duration
User/Sys Time Running Test (seconds) 124.47 111.67 112.64
Total Elapsed Time (seconds) 2138.14 1816.30 1867.56
MMTests Statistics: vmstat
Page Ins 90760 89124 89516
Page Outs
121028340 120199524 120736696
Swap Ins 0 86 55
Swap Outs 0 0 0
Direct pages scanned
114989363 96461439 96330619
Kswapd pages scanned
56430948 56965763 57075875
Kswapd pages reclaimed
27743219 27752044 27766606
Direct pages reclaimed 49777 46884 36655
Kswapd efficiency 49% 48% 48%
Kswapd velocity 26392.541 31363.631 30561.736
Direct efficiency 0% 0% 0%
Direct velocity 53780.091 53108.759 51581.004
Percentage direct scans 67% 62% 62%
Page writes by reclaim 385 122 1513
Slabs scanned 43008 39040 42112
Direct inode steals 0 10 8
Kswapd inode steals 733 534 477
This test just creates a large file mapping and writes to it linearly.
Time to completion is again reduced.
The gains are mostly down to two things. In many cases, there is less
scanning as zone_reclaim simply gives up faster due to recent failures.
The second reason is that memory is used more efficiently. Instead of
scanning the preferred zone every time, the allocator falls back to
another zone and uses it instead improving overall memory utilisation.
This patch: initialise ZLC for first zone eligible for zone_reclaim.
The zonelist cache (ZLC) is used among other things to record if
zone_reclaim() failed for a particular zone recently. The intention is to
avoid a high cost scanning extremely long zonelists or scanning within the
zone uselessly.
Currently the zonelist cache is setup only after the first zone has been
considered and zone_reclaim() has been called. The objective was to avoid
a costly setup but zone_reclaim is itself quite expensive. If it is
failing regularly such as the first eligible zone having mostly mapped
pages, the cost in scanning and allocation stalls is far higher than the
ZLC initialisation step.
This patch initialises ZLC before the first eligible zone calls
zone_reclaim(). Once initialised, it is checked whether the zone failed
zone_reclaim recently. If it has, the zone is skipped. As the first zone
is now being checked, additional care has to be taken about zones marked
full. A zone can be marked "full" because it should not have enough
unmapped pages for zone_reclaim but this is excessive as direct reclaim or
kswapd may succeed where zone_reclaim fails. Only mark zones "full" after
zone_reclaim fails if it failed to reclaim enough pages after scanning.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stefan Priebe <s.priebe@profihost.ag>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>