firefly-linux-kernel-4.4.55.git
10 years agoof: Specify initrd location using 64-bit
Santosh Shilimkar [Mon, 1 Jul 2013 18:20:35 +0000 (14:20 -0400)]
of: Specify initrd location using 64-bit

On some PAE architectures, the entire range of physical memory could reside
outside the 32-bit limit.  These systems need the ability to specify the
initrd location using 64-bit numbers.

This patch globally modifies the early_init_dt_setup_initrd_arch() function to
use 64-bit numbers instead of the current unsigned long.

There has been quite a bit of debate about whether to use u64 or phys_addr_t.
It was concluded to stick to u64 to be consistent with rest of the device
tree code. As summarized by Geert, "The address to load the initrd is decided
by the bootloader/user and set at that point later in time. The dtb should not
be tied to the kernel you are booting"

More details on the discussion can be found here:
https://lkml.org/lkml/2013/6/20/690
https://lkml.org/lkml/2012/9/13/544

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
(cherry picked from commit 374d5c9964c10373ba39bbe934f4262eb87d7114)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: add '#ifdef CONFIG_COMPAT' for aarch32_break_handler()
Chen Gang [Mon, 24 Jun 2013 09:27:49 +0000 (10:27 +0100)]
arm64: add '#ifdef CONFIG_COMPAT' for aarch32_break_handler()

If 'COMPAT' not defined, aarch32_break_handler() cannot pass compiling,
and it can work independent with 'COMPAT', so remove dummy definition.

The related error:

  arch/arm64/kernel/debug-monitors.c:249:5: error: redefinition of ‘aarch32_break_handler’
  In file included from arch/arm64/kernel/debug-monitors.c:29:0:
  /root/linux-next/arch/arm64/include/asm/debug-monitors.h:89:12: note: previous definition of ‘aarch32_break_handler’ was here

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit c783c2815e13bbb0c0b99997cc240bd7e91b6bb8)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: Add initial DTS for APM X-Gene Storm SOC and APM Mustang board
Vinayak Kale [Wed, 24 Apr 2013 09:07:00 +0000 (10:07 +0100)]
arm64: Add initial DTS for APM X-Gene Storm SOC and APM Mustang board

This patch adds initial DTS files required for APM Mustang board.

Signed-off-by: Kumar Sankaran <ksankaran@apm.com>
Signed-off-by: Loc Ho <lho@apm.com>
Signed-off-by: Feng Kan <fkan@apm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit ee877b5321c4dfee9dc9f2a12b19ddcd33149f6a)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: Add defines for APM ARMv8 implementation
Vinayak Kale [Wed, 24 Apr 2013 09:06:59 +0000 (10:06 +0100)]
arm64: Add defines for APM ARMv8 implementation

This patch adds defines for APM CPU implementer ID and APM CPU part numbers in asm/cputype.h

Signed-off-by: Kumar Sankaran <ksankaran@apm.com>
Signed-off-by: Loc Ho <lho@apm.com>
Signed-off-by: Feng Kan <fkan@apm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 4ad637a452d5683ca7ff9e9eb994ac4b7a517073)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: Enable APM X-Gene SOC family in the defconfig
Vinayak Kale [Wed, 24 Apr 2013 09:06:58 +0000 (10:06 +0100)]
arm64: Enable APM X-Gene SOC family in the defconfig

This patch enables APM X-Gene SOC family in the defconfig. It also enables 8250 serial driver needed by X-Gene SOC family.

Signed-off-by: Kumar Sankaran <ksankaran@apm.com>
Signed-off-by: Loc Ho <lho@apm.com>
Signed-off-by: Feng Kan <fkan@apm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit c1db16dc9e74addba4ef8c954702f13e457f0c7e)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: Add Kconfig option for APM X-Gene SOC family
Vinayak Kale [Wed, 24 Apr 2013 09:06:57 +0000 (10:06 +0100)]
arm64: Add Kconfig option for APM X-Gene SOC family

This patch adds arm64/Kconfig option for APM X-Gene SOC family.

Signed-off-by: Kumar Sankaran <ksankaran@apm.com>
Signed-off-by: Loc Ho <lho@apm.com>
Signed-off-by: Feng Kan <fkan@apm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 159428538323188158a6058956c16c199909d844)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64/Makefile: provide vdso_install target
Kyle McMartin [Sun, 16 Jun 2013 19:32:44 +0000 (20:32 +0100)]
arm64/Makefile: provide vdso_install target

Provide a vdso_install target in the arm64 Makefile, as other architectures
with a vdso do.

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 3c01742a8ac93a3abf9b099758db970410427afd)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: debug: consolidate software breakpoint handlers
Will Deacon [Sat, 16 Mar 2013 08:48:13 +0000 (08:48 +0000)]
arm64: debug: consolidate software breakpoint handlers

The software breakpoint handlers are hooked in directly from ptrace,
which makes it difficult to add additional handlers for things like
kprobes and kgdb.

This patch moves the handling code into debug-monitors.c, where we can
dispatch to different debug subsystems more easily.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 1442b6ed249d2b3d2cfcf45b65ac64393495c96c)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: extable: sort the exception table at build time
Will Deacon [Wed, 8 May 2013 16:29:24 +0000 (17:29 +0100)]
arm64: extable: sort the exception table at build time

As is done for other architectures, sort the exception table at
build-time rather than during boot.

Since sortextable appears to be a standalone C program relying on the
host elf.h to provide EM_AARCH64, I've had to add a conditional check in
order to allow cross-compilation on machines that aren't running a
bleeding-edge libc-dev.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit adace89562c7a9645b8dc84f6e1ac7ba8756094e)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: device: add iommu pointer to device archdata
Will Deacon [Mon, 10 Jun 2013 18:34:42 +0000 (19:34 +0100)]
arm64: device: add iommu pointer to device archdata

When using an IOMMU for device mappings, it is necessary to keep a
pointer between the device and the IOMMU to which it is attached in
order to obtain the correct IOMMU when attaching the device to a domain.

This patch adds an iommu pointer to the dev_archdata structure, in a
similar manner to other architectures (ARM, PowerPC, x86, ...).

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 73150c983ac1f9b7653cfd3823b1ad4a44aad3bf)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: pgtable: use pte_index instead of __pte_index
Will Deacon [Mon, 10 Jun 2013 18:34:41 +0000 (19:34 +0100)]
arm64: pgtable: use pte_index instead of __pte_index

pte_index is a useful helper outside of arch/arm64, for things like the
ARM SMMU driver, so rename __pte_index to pte_index to be consistent
with both arch/arm/ and also the definitions of pmd_index and pgd_index.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 9ab6d02fddc6831b166812956ff387d7112ff626)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: kernel: compiling issue, need delete read_current_timer()
Chen Gang [Tue, 21 May 2013 09:46:05 +0000 (10:46 +0100)]
arm64: kernel: compiling issue, need delete read_current_timer()

Under arm64, we will calibrate the delay loop statically using a known
timer frequency, so delete read_current_timer(), or it will cause
compiling issue with allmodconfig.

The related error:
  ERROR: "read_current_timer" [lib/rbtree_test.ko] undefined!
  ERROR: "read_current_timer" [lib/interval_tree_test.ko] undefined!
  ERROR: "read_current_timer" [fs/ext4/ext4.ko] undefined!
  ERROR: "read_current_timer" [crypto/tcrypt.ko] undefined!

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 6916b14ea140ff5c915895eefe9431888a39a84d)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: mm: don't bother invalidating the icache in switch_mm
Will Deacon [Wed, 5 Jun 2013 11:40:34 +0000 (12:40 +0100)]
arm64: mm: don't bother invalidating the icache in switch_mm

We don't support software broadcast of cache maintenance operations, so
this flush is not required (__sync_icache_dcache will always affect all
CPUs).

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 737c16dffc458e58ee7840556d43b874cb8e16a0)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: Fix build for __PAGE_NONE define
Mark Brown [Mon, 14 Apr 2014 17:25:05 +0000 (18:25 +0100)]
arm64: Fix build for __PAGE_NONE define

Simple typo.

Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: mm: Add double logical invert to pte accessors
Steve Capper [Tue, 25 Feb 2014 11:38:53 +0000 (11:38 +0000)]
arm64: mm: Add double logical invert to pte accessors

Page table entries on ARM64 are 64 bits, and some pte functions such as
pte_dirty return a bitwise-and of a flag with the pte value. If the
flag to be tested resides in the upper 32 bits of the pte, then we run
into the danger of the result being dropped if downcast.

For example:
gather_stats(page, md, pte_dirty(*pte), 1);
where pte_dirty(*pte) is downcast to an int.

This patch adds a double logical invert to all the pte_ accessors to
ensure predictable downcasting.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
10 years agoarm64: mm: Introduce PTE_WRITE
Steve Capper [Wed, 15 Jan 2014 14:07:13 +0000 (14:07 +0000)]
arm64: mm: Introduce PTE_WRITE

We have the following means for encoding writable or dirty ptes:

                                PTE_DIRTY       PTE_RDONLY
!pte_dirty && !pte_write        0               1
!pte_dirty && pte_write         0               1
pte_dirty && !pte_write         1               1
pte_dirty && pte_write          1               0

So we can't distinguish between writable clean ptes and read only
ptes. This can cause problems with ptes being incorrectly flagged as
read only when they are writable but not dirty.

This patch introduces a new software bit PTE_WRITE which allows us to
correctly identify writable ptes. PTE_RDONLY is now only clear for
valid ptes where a page is both writable and dirty.

Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Conflicts:
arch/arm64/include/asm/pgtable.h

10 years agoarm64: mm: Remove PTE_BIT_FUNC macro
Steve Capper [Wed, 15 Jan 2014 14:07:12 +0000 (14:07 +0000)]
arm64: mm: Remove PTE_BIT_FUNC macro

Expand out the pte manipulation functions. This makes our life easier
when using things like tags and cscope.

Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
10 years agomm/hugetlb.c: call MMU notifiers when copying a hugetlb page range
Andreas Sandberg [Tue, 21 Jan 2014 23:49:09 +0000 (15:49 -0800)]
mm/hugetlb.c: call MMU notifiers when copying a hugetlb page range

When copy_hugetlb_page_range() is called to copy a range of hugetlb
mappings, the secondary MMUs are not notified if there is a protection
downgrade, which breaks COW semantics in KVM.

This patch adds the necessary MMU notifier calls.

Signed-off-by: Andreas Sandberg <andreas@sandberg.pp.se>
Acked-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoarm64: mm: Fix PMD_SECT_PROT_NONE definition
Steve Capper [Thu, 5 Dec 2013 12:04:51 +0000 (12:04 +0000)]
arm64: mm: Fix PMD_SECT_PROT_NONE definition

Modify the value of PMD_SECT_PROT_NONE to match that of PTE_NONE. This
should have been in commit 3676f9ef5481 (Move PTE_PROT_NONE higher up).

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Cc: <stable@vger.kernel.org> # 3.11+: 3676f9ef5481: arm64: Move PTE_PROT_NONE higher up
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
10 years agoarm64: Move PTE_PROT_NONE higher up
Catalin Marinas [Wed, 27 Nov 2013 16:59:27 +0000 (16:59 +0000)]
arm64: Move PTE_PROT_NONE higher up

PTE_PROT_NONE means that a pte is present but does not have any
read/write attributes. However, setting the memory type like
pgprot_writecombine() is allowed and such bits overlap with
PTE_PROT_NONE. This causes mmap/munmap issues in drivers that change the
vma->vm_pg_prot on PROT_NONE mappings.

This patch reverts the PTE_FILE/PTE_PROT_NONE shift in commit
59911ca4325d (ARM64: mm: Move PTE_PROT_NONE bit) and moves PTE_PROT_NONE
together with the other software bits.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Steve Capper <steve.capper@linaro.org>
Cc: <stable@vger.kernel.org> # 3.11+
10 years agoARM64: mm: THP support.
Steve Capper [Fri, 19 Apr 2013 15:23:57 +0000 (16:23 +0100)]
ARM64: mm: THP support.

Bring Transparent HugePage support to ARM. The size of a
transparent huge page depends on the normal page size. A
transparent huge page is always represented as a pmd.

If PAGE_SIZE is 4KB, THPs are 2MB.
If PAGE_SIZE is 64KB, THPs are 512MB.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
10 years agoARM64: mm: Raise MAX_ORDER for 64KB pages and THP.
Steve Capper [Thu, 25 Apr 2013 14:19:21 +0000 (15:19 +0100)]
ARM64: mm: Raise MAX_ORDER for 64KB pages and THP.

The buddy allocator has a default MAX_ORDER of 11, which is too
low to allocate enough memory for 512MB Transparent HugePages if
our base page size is 64KB.

This patch introduces MAX_ZONE_ORDER and sets it to 14 when 64KB
pages are used in conjuction with THP, otherwise the default value
of 11 is used.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
10 years agoARM64: mm: HugeTLB support.
Steve Capper [Wed, 10 Apr 2013 12:48:00 +0000 (13:48 +0100)]
ARM64: mm: HugeTLB support.

Add huge page support to ARM64, different huge page sizes are
supported depending on the size of normal pages:

PAGE_SIZE is 4KB:
   2MB - (pmds) these can be allocated at any time.
1024MB - (puds) usually allocated on bootup with the command line
         with something like: hugepagesz=1G hugepages=6

PAGE_SIZE is 64KB:
 512MB - (pmds) usually allocated on bootup via command line.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
10 years agoARM64: mm: Move PTE_PROT_NONE bit.
Steve Capper [Tue, 28 May 2013 12:35:51 +0000 (13:35 +0100)]
ARM64: mm: Move PTE_PROT_NONE bit.

Under ARM64, PTEs can be broadly categorised as follows:
   - Present and valid: Bit #0 is set. The PTE is valid and memory
     access to the region may fault.

   - Present and invalid: Bit #0 is clear and bit #1 is set.
     Represents present memory with PROT_NONE protection. The PTE
     is an invalid entry, and the user fault handler will raise a
     SIGSEGV.

   - Not present (file or swap): Bits #0 and #1 are clear.
     Memory represented has been paged out. The PTE is an invalid
     entry, and the fault handler will try and re-populate the
     memory where necessary.

Huge PTEs are block descriptors that have bit #1 clear. If we wish
to represent PROT_NONE huge PTEs we then run into a problem as
there is no way to distinguish between regular and huge PTEs if we
set bit #1.

To resolve this ambiguity this patch moves PTE_PROT_NONE from
bit #1 to bit #2 and moves PTE_FILE from bit #2 to bit #3. The
number of swap/file bits is reduced by 1 as a consequence, leaving
60 bits for file and swap entries.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
10 years agoARM64: mm: Make PAGE_NONE pages read only and no-execute.
Steve Capper [Thu, 2 May 2013 15:25:42 +0000 (16:25 +0100)]
ARM64: mm: Make PAGE_NONE pages read only and no-execute.

If we consider the following code sequence:

my_pte = pte_modify(entry, myprot);
x = pte_write(my_pte);
y = pte_exec(my_pte);

If myprot comes from a PROT_NONE page, then x and y will both be
true which is undesireable behaviour.

This patch sets the no-execute and read-only bits for PAGE_NONE
such that the code above will return false for both x and y.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
10 years agoARM64: mm: Restore memblock limit when map_mem finished.
Steve Capper [Tue, 30 Apr 2013 10:00:33 +0000 (11:00 +0100)]
ARM64: mm: Restore memblock limit when map_mem finished.

In paging_init the memblock limit is set to restrict any addresses
returned by early_alloc to fit within the initial direct kernel
mapping in swapper_pg_dir. This allows map_mem to allocate puds,
pmds and ptes from the initial direct kernel mapping.

The limit stays low after paging_init() though, meaning any
bootmem allocations will be from a restricted subset of memory.
Gigabyte huge pages, for instance, are normally allocated from
bootmem as their order (18) is too large for the default buddy
allocator (MAX_ORDER = 11).

This patch restores the memblock limit when map_mem has finished,
allowing gigabyte huge pages (and other objects) to be allocated
from all of bootmem.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
10 years agomm: thp: Correct the HPAGE_PMD_ORDER check.
Steve Capper [Tue, 7 May 2013 13:46:03 +0000 (14:46 +0100)]
mm: thp: Correct the HPAGE_PMD_ORDER check.

All Transparent Huge Pages are allocated by the buddy allocator.

A compile time check is in place that fails when the order of a
transparent huge page is too large to be allocated by the buddy
allocator. Unfortunately that compile time check passes when:
HPAGE_PMD_ORDER == MAX_ORDER
( which is incorrect as the buddy allocator can only allocate
memory of order strictly less than MAX_ORDER. )

This patch updates the compile time check to fail in the above
case.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
10 years agox86: mm: Remove general hugetlb code from x86.
Steve Capper [Tue, 30 Apr 2013 07:03:42 +0000 (08:03 +0100)]
x86: mm: Remove general hugetlb code from x86.

huge_pte_alloc, huge_pte_offset and follow_huge_p[mu]d have
already been copied over to mm.

This patch removes the x86 copies of these functions and activates
the general ones by enabling:
CONFIG_ARCH_WANT_GENERAL_HUGETLB

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm: hugetlb: Copy general hugetlb code from x86 to mm.
Steve Capper [Tue, 30 Apr 2013 07:02:03 +0000 (08:02 +0100)]
mm: hugetlb: Copy general hugetlb code from x86 to mm.

The huge_pte_alloc, huge_pte_offset and follow_huge_p[mu]d
functions in x86/mm/hugetlbpage.c do not rely on any architecture
specific knowledge other than the fact that pmds and puds can be
treated as huge ptes.

To allow other architectures to use this code (and reduce the need
for code duplication), this patch copies these functions into mm,
replaces the use of pud_large with pud_huge and provides a config
flag to activate them:
CONFIG_ARCH_WANT_GENERAL_HUGETLB

If CONFIG_ARCH_WANT_HUGE_PMD_SHARE is also active then the
huge_pmd_share code will be called by huge_pte_alloc (othewise we
call pmd_alloc and skip the sharing code).

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
10 years agox86: mm: Remove x86 version of huge_pmd_share.
Steve Capper [Mon, 29 Apr 2013 13:29:48 +0000 (14:29 +0100)]
x86: mm: Remove x86 version of huge_pmd_share.

The huge_pmd_share code has been copied over to mm/hugetlb.c to
make it accessible to other architectures.

Remove the x86 copy of the huge_pmd_share code and enable the
ARCH_WANT_HUGE_PMD_SHARE config flag. That way we reference the
general one.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
10 years agomm: hugetlb: Copy huge_pmd_share from x86 to mm.
Steve Capper [Tue, 23 Apr 2013 11:35:02 +0000 (12:35 +0100)]
mm: hugetlb: Copy huge_pmd_share from x86 to mm.

Under x86, multiple puds can be made to reference the same bank of
huge pmds provided that they represent a full PUD_SIZE of shared
huge memory that is aligned to a PUD_SIZE boundary.

The code to share pmds does not require any architecture specific
knowledge other than the fact that pmds can be indexed, thus can
be beneficial to some other architectures.

This patch copies the huge pmd sharing (and unsharing) logic from
x86/ to mm/ and introduces a new config option to activate it:
CONFIG_ARCH_WANTS_HUGE_PMD_SHARE

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
10 years agoMerge remote-tracking branch 'lsk/v3.10/topic/arm64-lts' into lsk-v3.10-arm64-misc
Mark Brown [Thu, 15 May 2014 10:42:57 +0000 (11:42 +0100)]
Merge remote-tracking branch 'lsk/v3.10/topic/arm64-lts' into lsk-v3.10-arm64-misc

10 years agoarm64: Remove __flush_dcache_page()
Catalin Marinas [Wed, 1 May 2013 15:38:23 +0000 (16:38 +0100)]
arm64: Remove __flush_dcache_page()

This function is only used in __sync_icache_dcache(), so remove it and
call __flush_dcache_area() directly. The flush_icache_user_range()
function is not used in the arm64 kernel.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Will Deacon <will.deacon@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit ebd88367de80f9509bd30a09342d0a19c925b23e)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: Make DMA coherent and strongly ordered mappings not executable
Catalin Marinas [Wed, 12 Mar 2014 16:07:06 +0000 (16:07 +0000)]
arm64: Make DMA coherent and strongly ordered mappings not executable

commit de2db7432917a82b62d55bb59635586eeca6d1bd upstream.

pgprot_{dmacoherent,writecombine,noncached} don't need to generate
executable mappings with side-effects like __sync_icache_dcache() being
called when the mapping is in user space.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
Tested-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 73e8697b71fa956642403304b96442fd4b57ce97)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: Do not synchronise I and D caches for special ptes
Catalin Marinas [Wed, 12 Mar 2014 16:28:09 +0000 (16:28 +0000)]
arm64: Do not synchronise I and D caches for special ptes

commit 71fdb6bf61bf0692f004f9daf5650392c0cfe300 upstream.

Special pte mappings are not intended to be executable and do not even
have an associated struct page. This patch ensures that we do not call
__sync_icache_dcache() on such ptes.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Steve Capper <Steve.Capper@arm.com>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
Tested-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b1f2bfc9d8bb42b4e5a93470e8a0974cd4c04697)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoARM64: unwind: Fix PC calculation
Olof Johansson [Fri, 14 Feb 2014 19:35:15 +0000 (19:35 +0000)]
ARM64: unwind: Fix PC calculation

commit e306dfd06fcb44d21c80acb8e5a88d55f3d1cf63 upstream.

The frame PC value in the unwind code used to just take the saved LR
value and use that.  That's incorrect as a stack trace, since it shows
the return path stack, not the call path stack.

In particular, it shows faulty information in case the bl is done as
the very last instruction of one label, since the return point will be
in the next label. That can easily be seen with tail calls to panic(),
which is marked __noreturn and thus doesn't have anything useful after it.

Easiest here is to just correct the unwind code and do a -4, to get the
actual call site for the backtrace instead of the return site.

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c56e0dc1b70f1537659acffc6ac8d8d615be2dae)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: add DSB after icache flush in __flush_icache_all()
Vinayak Kale [Wed, 5 Feb 2014 09:34:36 +0000 (09:34 +0000)]
arm64: add DSB after icache flush in __flush_icache_all()

commit 5044bad43ee573d0b6d90e3ccb7a40c2c7d25eb4 upstream.

Add DSB after icache flush to complete the cache maintenance operation.
The function __flush_icache_all() is used only for user space mappings
and an ISB is not required because of an exception return before executing
user instructions. An exception return would behave like an ISB.

Signed-off-by: Vinayak Kale <vkale@apm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 02599bad3774a5147eb8d98df7c9362fdc1a50c6)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: vdso: fix coarse clock handling
Nathan Lynch [Wed, 5 Feb 2014 05:53:04 +0000 (05:53 +0000)]
arm64: vdso: fix coarse clock handling

commit 069b918623e1510e58dacf178905a72c3baa3ae4 upstream.

When __kernel_clock_gettime is called with a CLOCK_MONOTONIC_COARSE or
CLOCK_REALTIME_COARSE clock id, it returns incorrectly to whatever the
caller has placed in x2 ("ret x2" to return from the fast path).  Fix
this by saving x30/LR to x2 only in code that will call
__do_get_tspec, restoring x30 afterward, and using a plain "ret" to
return from the routine.

Also: while the resulting tv_nsec value for CLOCK_REALTIME and
CLOCK_MONOTONIC must be computed using intermediate values that are
left-shifted by cs_shift (x12, set by __do_get_tspec), the results for
coarse clocks should be calculated using unshifted values
(xtime_coarse_nsec is in units of actual nanoseconds).  The current
code shifts intermediate values by x12 unconditionally, but x12 is
uninitialized when servicing a coarse clock.  Fix this by setting x12
to 0 once we know we are dealing with a coarse clock id.

Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit fb569d15d867a06e89b1be8278404b6fbf6b5bde)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: Invalidate the TLB when replacing pmd entries during boot
Catalin Marinas [Tue, 4 Feb 2014 16:01:31 +0000 (16:01 +0000)]
arm64: Invalidate the TLB when replacing pmd entries during boot

commit a55f9929a9b257f84b6cc7b2397379cabd744a22 upstream.

With the 64K page size configuration, __create_page_tables in head.S
maps enough memory to get started but using 64K pages rather than 512M
sections with a single pgd/pud/pmd entry pointing to a pte table.
create_mapping() may override the pgd/pud/pmd table entry with a block
(section) one if the RAM size is more than 512MB and aligned correctly.
For the end of this block to be accessible, the old TLB entry must be
invalidated.

Reported-by: Mark Salter <msalter@redhat.com>
Tested-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit f35f27e775d6f8aa265fb698975c9c95f2757ef4)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: vdso: prevent ld from aligning PT_LOAD segments to 64k
Will Deacon [Tue, 4 Feb 2014 14:41:26 +0000 (14:41 +0000)]
arm64: vdso: prevent ld from aligning PT_LOAD segments to 64k

commit 40507403485fcb56b83d6ddfc954e9b08305054c upstream.

Whilst the text segment for our VDSO is marked as PT_LOAD in the ELF
headers, it is mapped by the kernel and not actually subject to
demand-paging. ld doesn't realise this, and emits a p_align field of 64k
(the maximum supported page size), which conflicts with the load address
picked by the kernel on 4k systems, which will be 4k aligned. This
causes GDB to fail with "Failed to read a valid object file image from
memory" when attempting to load the VDSO.

This patch passes the -n option to ld, which prevents it from aligning
PT_LOAD segments to the maximum page size.

Reported-by: Kyle McMartin <kyle@redhat.com>
Acked-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6737eaebff6297e5b6aeb458a4b4ad4b61df8f57)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: vdso: update wtm fields for CLOCK_MONOTONIC_COARSE
Nathan Lynch [Mon, 3 Feb 2014 19:48:52 +0000 (19:48 +0000)]
arm64: vdso: update wtm fields for CLOCK_MONOTONIC_COARSE

commit d4022a335271a48cce49df35d825897914fbffe3 upstream.

Update wall-to-monotonic fields in the VDSO data page
unconditionally.  These are used to service CLOCK_MONOTONIC_COARSE,
which is not guarded by use_syscall.

Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b666f382900c74f23c78556777041c2ceb7c2b24)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: Use Normal NonCacheable memory for writecombine
Catalin Marinas [Fri, 29 Nov 2013 10:56:14 +0000 (10:56 +0000)]
arm64: Use Normal NonCacheable memory for writecombine

commit 4f00130b70e5eee813cc7bc298e0f3fdf79673cc upstream.

This provides better performance compared to Device GRE and also allows
unaligned accesses. Such memory is intended to be used with standard RAM
(e.g. framebuffers) and not I/O.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3655a197b1ea3ce989d34868768c5f4b6205061c)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: Do not flush the D-cache for anonymous pages
Catalin Marinas [Wed, 1 May 2013 15:34:22 +0000 (16:34 +0100)]
arm64: Do not flush the D-cache for anonymous pages

commit 7249b79f6b4cc3c2aa9138dca52e535a4c789107 upstream.

The D-cache on AArch64 is VIPT non-aliasing, so there is no need to
flush it for anonymous pages.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Will Deacon <will.deacon@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit fc54900e08d840fcfec9e5d2fba2c6f233aa49b9)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: Avoid cache flushing in flush_dcache_page()
Catalin Marinas [Wed, 1 May 2013 11:23:05 +0000 (12:23 +0100)]
arm64: Avoid cache flushing in flush_dcache_page()

commit b5b6c9e9149d8a7c3f1d7b9d0c046c6184e1dd17 upstream.

The flush_dcache_page() function is called when the kernel modified a
page cache page. Since the D-cache on AArch64 does not have aliases
this function can simply mark the page as dirty for later flushing via
set_pte_at()/__sync_icache_dcache() if the page is executable (to ensure
the I-D cache coherency).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Will Deacon <will.deacon@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 1e616427f20943c9966296dfff9e7a2b825846aa)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoclocksource: arch_timer: use virtual counters
Mark Rutland [Wed, 30 Jan 2013 17:51:26 +0000 (17:51 +0000)]
clocksource: arch_timer: use virtual counters

commit 0d651e4e65e96989f72236bf83bd4c6e55eb6ce4 upstream.

Switching between reading the virtual or physical counters is
problematic, as some core code wants a view of time before we're fully
set up. Using a function pointer and switching the source after the
first read can make time appear to go backwards, and having a check in
the read function is an unfortunate block on what we want to be a fast
path.

Instead, this patch makes us always use the virtual counters. If we're a
guest, or don't have hyp mode, we'll use the virtual timers, and as such
don't care about CNTVOFF as long as it doesn't change in such a way as
to make time appear to travel backwards. As the guest will use the
virtual timers, a (potential) KVM host must use the physical timers
(which can wake up the host even if they fire while a guest is
executing), and hence a host must have CNTVOFF set to zero so as to have
a consistent view of time between the physical timers and virtual
counters.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 714c21cb90951905b269870087a99c37f3a7af0c)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: Remove unused cpu_name ascii in arch/arm64/mm/proc.S
Catalin Marinas [Mon, 2 Sep 2013 15:33:54 +0000 (16:33 +0100)]
arm64: Remove unused cpu_name ascii in arch/arm64/mm/proc.S

commit f3a1d7d53dccf51959aec16b574617cc6bfeca09 upstream.

This string has been moved to arch/arm64/kernel/cputable.c.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit ae5092fadd50cfd0d29e762771d30cbf7f2c2fec)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: dts: Reserve the memory used for secondary CPU release address
Catalin Marinas [Thu, 14 Nov 2013 15:15:37 +0000 (15:15 +0000)]
arm64: dts: Reserve the memory used for secondary CPU release address

commit df503ba7f653c590b475ab80bde788edf5af70d5 upstream.

With the spin-table SMP booting method, secondary CPUs poll a location
passed in the DT. The foundation-v8.dts file doesn't have this memory
reserved and there is a risk of Linux using it before secondary CPUs are
started.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 29f37087bb51dfe4ff61daad60bd93f7bcfa3d72)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: check for number of arguments in syscall_get/set_arguments()
AKASHI Takahiro [Thu, 3 Oct 2013 05:47:44 +0000 (06:47 +0100)]
arm64: check for number of arguments in syscall_get/set_arguments()

commit 7b22c03536a539142f931815528d55df455ffe2d upstream.

In ftrace_syscall_enter(),
    syscall_get_arguments(..., 0, n, ...)
        if (i == 0) { <handle orig_x0> ...; n--;}
        memcpy(..., n * sizeof(args[0]));
If 'number of arguments(n)' is zero and 'argument index(i)' is also zero in
syscall_get_arguments(), none of arguments should be copied by memcpy().
Otherwise 'n--' can be a big positive number and unexpected amount of data
will be copied. Tracing system calls which take no argument, say sync(void),
may hit this case and eventually make the system corrupted.
This patch fixes the issue both in syscall_get_arguments() and
syscall_set_arguments().

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit e2956ef5b5ddb3ede70962afe9297d8340787fa6)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: fix possible invalid FPSIMD initialization state
Jiang Liu [Fri, 27 Sep 2013 08:04:41 +0000 (09:04 +0100)]
arm64: fix possible invalid FPSIMD initialization state

commit 6db83cea1c975b9a102e17def7d2795814e1ae2b upstream.

If context switching happens during executing fpsimd_flush_thread(),
stale value in FPSIMD registers will be saved into current thread's
fpsimd_state by fpsimd_thread_switch(). That may cause invalid
initialization state for the new process, so disable preemption
when executing fpsimd_flush_thread().

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jiang Liu <liuj97@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 2bf5861acf602fe6201ac1f82955ac7283e56b6f)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: Change kernel stack size to 16K
Feng Kan [Tue, 23 Jul 2013 17:52:31 +0000 (18:52 +0100)]
arm64: Change kernel stack size to 16K

commit 845ad05ec31e0f3872a321e10dbeaf872022632c upstream.

Written by Catalin Marinas, tested by APM on storm platform. This is needed
because of the failures encountered when running SpecWeb benchmark test.

Signed-off-by: Feng Kan <fkan@apm.com>
Acked-by: Kumar Sankaran <ksankaran@apm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 79f783f05539479676406d3d42c3d86bd203f083)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: virt: ensure visibility of __boot_cpu_mode
Mark Rutland [Tue, 9 Jul 2013 14:16:06 +0000 (15:16 +0100)]
arm64: virt: ensure visibility of __boot_cpu_mode

commit 82b2f495fba338d1e3098dde1df54944a9c19751 upstream.

Secondary CPUs write to __boot_cpu_mode with caches disabled, and thus a
cached value of __boot_cpu_mode may be incoherent with that in memory.
This could lead to a failure to detect mismatched boot modes.

This patch adds flushing to ensure that writes by secondaries to
__boot_cpu_mode are made visible before we test against it.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 5fb08df3dd1f7b8e83936808b042725a8b067562)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: Only enable local interrupts after the CPU is marked online
Catalin Marinas [Fri, 19 Jul 2013 14:08:15 +0000 (15:08 +0100)]
arm64: Only enable local interrupts after the CPU is marked online

commit 53ae3acd4390ffeecb3a11dbd5be347b5a3d98f2 upstream.

There is a slight chance that (timer) interrupts are triggered before a
secondary CPU has been marked online with implications on softirq thread
affinity.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Kirill Tkhai <tkhai@yandex.ru>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 81ca15a3e04961b54d4a0b395e681bffb6cfbc68)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: spinlock: retry trylock operation if strex fails on free lock
Catalin Marinas [Fri, 31 May 2013 15:30:58 +0000 (16:30 +0100)]
arm64: spinlock: retry trylock operation if strex fails on free lock

commit 4ecf7ccb1973fd826456b6ab1e6dfafe9023c753 upstream.

An exclusive store instruction may fail for reasons other than lock
contention (e.g. a cache eviction during the critical section) so, in
line with other architectures using similar exclusive instructions
(alpha, mips, powerpc), retry the trylock operation if the lock appears
to be free but the strex reported failure.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Tony Thompson <anthony.thompson@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 42b5bb47a62b7b2d8cd0b9af58f914c5e83ef76e)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: ptrace: avoid using HW_BREAKPOINT_EMPTY for disabled events
Will Deacon [Tue, 17 Dec 2013 17:09:08 +0000 (17:09 +0000)]
arm64: ptrace: avoid using HW_BREAKPOINT_EMPTY for disabled events

commit cdc27c27843248ae7eb0df5fc261dd004eaa5670 upstream.

Commit 8f34a1da35ae ("arm64: ptrace: use HW_BREAKPOINT_EMPTY type for
disabled breakpoints") fixed an issue with GDB trying to zero breakpoint
control registers. The problem there is that the arch hw_breakpoint code
will attempt to create a (disabled), execute breakpoint of length 0.

This will fail validation and report unexpected failure to GDB. To avoid
this, we treated disabled breakpoints as HW_BREAKPOINT_EMPTY, but that
seems to have broken with recent kernels, causing watchpoints to be
treated as TYPE_INST in the core code and returning ENOSPC for any
further breakpoints.

This patch fixes the problem by prioritising the `enable' field of the
breakpoint: if it is cleared, we simply update the perf_event_attr to
indicate that the thing is disabled and don't bother changing either the
type or the length. This reinforces the behaviour that the breakpoint
control register is essentially read-only apart from the enable bit
when disabling a breakpoint.

Reported-by: Aaron Liu <liucy214@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 11b81802921618b02122855db021a63df7d9fd49)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: perf: fix ARMv8 EVTYPE_MASK to include NSH bit
Will Deacon [Tue, 20 Aug 2013 10:47:42 +0000 (11:47 +0100)]
arm64: perf: fix ARMv8 EVTYPE_MASK to include NSH bit

commit 178cd9ce377232518ec17ff2ecab2e80fa60784c upstream.

This is a port of f2fe09b055e2 ("ARM: 7663/1: perf: fix ARMv7 EVTYPE_MASK
to include NSH bit") to arm64, which fixes the broken evtype mask to
include the NSH bit, allowing profiling at EL2.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 0fe9a0dc92a64c088e76fcd3d35b2ba36b4d7f3c)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: perf: fix group validation when using enable_on_exec
Will Deacon [Tue, 20 Aug 2013 10:47:41 +0000 (11:47 +0100)]
arm64: perf: fix group validation when using enable_on_exec

commit 8455e6ec70f33b0e8c3ffd47067e00481f09f454 upstream.

This is a port of cb2d8b342aa0 ("ARM: 7698/1: perf: fix group validation
when using enable_on_exec") to arm64, which fixes the event validation
checking so that events in the OFF state are still considered when
enable_on_exec is true.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 1f96d83b38937294584ede9473c2b2961528f287)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: perf: fix event validation for software group leaders
Will Deacon [Tue, 20 Aug 2013 10:47:40 +0000 (11:47 +0100)]
arm64: perf: fix event validation for software group leaders

commit ee7538a008a45050c8f706d38b600f55953169f9 upstream.

This is a port of c95eb3184ea1 ("ARM: 7809/1: perf: fix event validation
for software group leaders") to arm64, which fixes a panic in the arm64
perf backend found as a result of Vince's fuzzing tool.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6c6f6c7166e9ee45e545eab850a2862a92c135b2)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: perf: fix array out of bounds access in armpmu_map_hw_event()
Will Deacon [Tue, 20 Aug 2013 10:47:39 +0000 (11:47 +0100)]
arm64: perf: fix array out of bounds access in armpmu_map_hw_event()

commit 868f6fea8fa63f09acbfa93256d0d2abdcabff79 upstream.

This is a port of d9f966357b14 ("ARM: 7810/1: perf: Fix array out of
bounds access in armpmu_map_hw_event()") to arm64, which fixes an oops
in the arm64 perf backend found as a result of Vince's fuzzing tool.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 975a3cd4cc0dff06a635cb16b8fb25fd3ae88234)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: mm: don't treat user cache maintenance faults as writes
Will Deacon [Fri, 19 Jul 2013 14:37:12 +0000 (15:37 +0100)]
arm64: mm: don't treat user cache maintenance faults as writes

commit db6f41063cbdb58b14846e600e6bc3f4e4c2e888 upstream.

On arm64, cache maintenance faults appear as data aborts with the CM
bit set in the ESR. The WnR bit, usually used to distinguish between
faulting loads and stores, always reads as 1 and (slightly confusingly)
the instructions are treated as reads by the architecture.

This patch fixes our fault handling code to treat cache maintenance
faults in the same way as loads.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 88c0a794e5d9bcc29926e636cd1d6eb5c9dcb235)
Signed-off-by: Mark Brown <broonie@linaro.org>
11 years agoLinux 3.10
Linus Torvalds [Sun, 30 Jun 2013 22:13:29 +0000 (15:13 -0700)]
Linux 3.10

11 years agoMerge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Linus Torvalds [Sun, 30 Jun 2013 22:08:15 +0000 (15:08 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc

Pull another powerpc fix from Benjamin Herrenschmidt:
 "I mentioned that while we had fixed the kernel crashes, EEH error
  recovery didn't always recover...  It appears that I had a fix for
  that already in powerpc-next (with a stable CC).

  I cherry-picked it today and did a few tests and it seems that things
  now work quite well.  The patch is also pretty simple, so I see no
  reason to wait before merging it."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/eeh: Fix fetching bus for single-dev-PE

11 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sun, 30 Jun 2013 22:06:25 +0000 (15:06 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is a set of seven bug fixes.  Several fcoe fixes for locking
  problems, initiator issues and a VLAN API change, all of which could
  eventually lead to data corruption, one fix for a qla2xxx locking
  problem which could lead to multiple completions of the same request
  (and subsequent data corruption) and a use after free in the ipr
  driver.  Plus one minor MAINTAINERS file update"

(only six bugfixes in this pull, since I had already pulled the fcoe API
fix directly from Robert Love)

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  [SCSI] ipr: Avoid target_destroy accessing memory after it was freed
  [SCSI] qla2xxx: Fix for locking issue between driver ISR and mailbox routines
  MAINTAINERS: Fix fcoe mailing list
  libfc: extend ex_lock to protect all of fc_seq_send
  libfc: Correct check for initiator role
  libfcoe: Fix Conflicting FCFs issue in the fabric

11 years agopowerpc/eeh: Fix fetching bus for single-dev-PE
Gavin Shan [Wed, 5 Jun 2013 07:34:02 +0000 (15:34 +0800)]
powerpc/eeh: Fix fetching bus for single-dev-PE

While running Linux as guest on top of phyp, we possiblly have
PE that includes single PCI device. However, we didn't return
its PCI bus correctly and it leads to failure on recovery from
EEH errors for single-dev-PE. The patch fixes the issue.

Cc: <stable@vger.kernel.org> # v3.7+
Cc: Steve Best <sbest@us.ibm.com>
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
11 years agoMerge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Linus Torvalds [Sun, 30 Jun 2013 00:02:48 +0000 (17:02 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc

Pull powerpc fixes from Ben Herrenschmidt:
 "We discovered some breakage in our "EEH" (PCI Error Handling) code
  while doing error injection, due to a couple of regressions.  One of
  them is due to a patch (37f02195bee9 "powerpc/pci: fix PCI-e devices
  rescan issue on powerpc platform") that, in hindsight, I shouldn't
  have merged considering that it caused more problems than it solved.

  Please pull those two fixes.  One for a simple EEH address cache
  initialization issue.  The other one is a patch from Guenter that I
  had originally planned to put in 3.11 but which happens to also fix
  that other regression (a kernel oops during EEH error handling and
  possibly hotplug).

  With those two, the couple of test machines I've hammered with error
  injection are remaining up now.  EEH appears to still fail to recover
  on some devices, so there is another problem that Gavin is looking
  into but at least it's no longer crashing the kernel."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/pci: Improve device hotplug initialization
  powerpc/eeh: Add eeh_dev to the cache during boot

11 years agoARM: dt: Only print warning, not WARN() on bad cpu map in device tree
Olof Johansson [Sat, 29 Jun 2013 23:25:14 +0000 (16:25 -0700)]
ARM: dt: Only print warning, not WARN() on bad cpu map in device tree

Due to recent changes and expecations of proper cpu bindings, there are
now cases for many of the in-tree devicetrees where a WARN() will hit
on boot due to badly formatted /cpus nodes.

Downgrade this to a pr_warn() to be less alarmist, since it's not a
new problem.

Tested on Arndale, Cubox, Seaboard and Panda ES. Panda hits the WARN
without this, the others do not.

Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agopowerpc/pci: Improve device hotplug initialization
Guenter Roeck [Mon, 10 Jun 2013 17:18:08 +0000 (10:18 -0700)]
powerpc/pci: Improve device hotplug initialization

Commit 37f02195b (powerpc/pci: fix PCI-e devices rescan issue on powerpc
platform) fixes a problem with interrupt and DMA initialization on hot
plugged devices. With this commit, interrupt and DMA initialization for
hot plugged devices is handled in the pci device enable function.

This approach has a couple of drawbacks. First, it creates two code paths
for device initialization, one for hot plugged devices and another for devices
known during the initial PCI scan. Second, the initialization code for hot
plugged devices is only called when the device is enabled, ie typically
in the probe function. Also, the platform specific setup code is called each
time pci_enable_device() is called, not only once during device discovery,
meaning it is actually called multiple times, once for devices discovered
during the initial scan and again each time a driver is re-loaded.

The visible result is that interrupt pins are only assigned to hot plugged
devices when the device driver is loaded. Effectively this changes the PCI
probe API, since pci_dev->irq and the device's dma configuration will now
only be valid after pci_enable() was called at least once. A more subtle
change is that platform specific PCI device setup is moved from device
discovery into the driver's probe function, more specifically into the
pci_enable_device() call.

To fix the inconsistencies, add new function pcibios_add_device.
Call pcibios_setup_device from pcibios_setup_bus_devices if device setup
is not complete, and from pcibios_add_device if bus setup is complete.

With this change, device setup code is moved back into device initialization,
and called exactly once for both static and hot plugged devices.

[ This also fixes a regression introduced by the above patch which
  causes dev->irq to be overwritten under some cirumstances after
  MSIs have been enabled for the device which leads to crashes due
  to the MSI core "hijacking" dev->irq to store the base MSI number
  and not the LSI. --BenH
]

Cc: Yuanquan Chen <Yuanquan.Chen@freescale.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hiroo Matsumoto <matsumoto.hiroo@jp.fujitsu.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
11 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Sat, 29 Jun 2013 18:34:18 +0000 (11:34 -0700)]
Merge git://git./linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:
 "This fixes a crash in the crypto layer exposed by an SCTP test tool"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: algboss - Hold ref count on larval

11 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Sat, 29 Jun 2013 18:32:05 +0000 (11:32 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm/qxl fix from Dave Airlie:
 "Bad me forgot an access check, possible security issue, but since this
  is the first kernel with it, should be fine to just put it in now"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/qxl: add missing access check for execbuffer ioctl

11 years agoFix: kernel/ptrace.c: ptrace_peek_siginfo() missing __put_user() validation
Mathieu Desnoyers [Fri, 28 Jun 2013 13:49:46 +0000 (09:49 -0400)]
Fix: kernel/ptrace.c: ptrace_peek_siginfo() missing __put_user() validation

This __put_user() could be used by unprivileged processes to write into
kernel memory.  The issue here is that even if copy_siginfo_to_user()
fails, the error code is not checked before __put_user() is executed.

Luckily, ptrace_peek_siginfo() has been added within the 3.10-rc cycle,
so it has not hit a stable release yet.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph...
Linus Torvalds [Sat, 29 Jun 2013 17:31:15 +0000 (10:31 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client

Pull Ceph fix from Sage Weil:
 "This is a recently spotted regression in the snapshot behavior...

  It turns out several tests weren't being run in the nightlies so this
  took a while to spot"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  rbd: send snapshot context with writes

11 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Sat, 29 Jun 2013 17:30:31 +0000 (10:30 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs

Pull ubifs fixes from Al Viro:
 "A couple of ubifs readdir/lseek race fixes.  Stable fodder, really
  nasty..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  UBIFS: fix a horrid bug
  UBIFS: prepare to fix a horrid bug

11 years agoMerge tag 'for-linus-20130628' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowe...
Linus Torvalds [Sat, 29 Jun 2013 17:28:52 +0000 (10:28 -0700)]
Merge tag 'for-linus-20130628' of git://git./linux/kernel/git/dhowells/linux-mn10300

Pull two MN10300 fixes from David Howells:
 "The first fixes a problem with passing arrays rather than pointers to
  get_user() where __typeof__ then wants to declare and initialise an
  array variable which gcc doesn't like.

  The second fixes a problem whereby putting mem=xxx into the kernel
  command line causes init=xxx to get an incorrect value."

* tag 'for-linus-20130628' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-mn10300:
  mn10300: Use early_param() to parse "mem=" parameter
  mn10300: Allow to pass array name to get_user()

11 years agoMerge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 29 Jun 2013 17:27:19 +0000 (10:27 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull timer fix from Thomas Gleixner:
 "Correct an ordering issue in the tick broadcast code.  I really wish
  we'd get compensation for pain and suffering for each line of code we
  write to work around dysfunctional timer hardware."

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tick: Fix tick_broadcast_pending_mask not cleared

11 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 29 Jun 2013 17:26:50 +0000 (10:26 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fix from Ingo Molnar:
 "One more fix for a recently discovered bug"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Disable monitoring on setuid processes for regular users

11 years agoUBIFS: fix a horrid bug
Artem Bityutskiy [Fri, 28 Jun 2013 11:15:15 +0000 (14:15 +0300)]
UBIFS: fix a horrid bug

Al Viro pointed me to the fact that '->readdir()' and '->llseek()' have no
mutual exclusion, which means the 'ubifs_dir_llseek()' can be run while we are
in the middle of 'ubifs_readdir()'.

This means that 'file->private_data' can be freed while 'ubifs_readdir()' uses
it, and this is a very bad bug: not only 'ubifs_readdir()' can return garbage,
but this may corrupt memory and lead to all kinds of problems like crashes an
security holes.

This patch fixes the problem by using the 'file->f_version' field, which
'->llseek()' always unconditionally sets to zero. We set it to 1 in
'ubifs_readdir()' and whenever we detect that it became 0, we know there was a
seek and it is time to clear the state saved in 'file->private_data'.

I tested this patch by writing a user-space program which runds readdir and
seek in parallell. I could easily crash the kernel without these patches, but
could not crash it with these patches.

Cc: stable@vger.kernel.org
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agoUBIFS: prepare to fix a horrid bug
Artem Bityutskiy [Fri, 28 Jun 2013 11:15:14 +0000 (14:15 +0300)]
UBIFS: prepare to fix a horrid bug

Al Viro pointed me to the fact that '->readdir()' and '->llseek()' have no
mutual exclusion, which means the 'ubifs_dir_llseek()' can be run while we are
in the middle of 'ubifs_readdir()'.

First of all, this means that 'file->private_data' can be freed while
'ubifs_readdir()' uses it.  But this particular patch does not fix the problem.
This patch is only a preparation, and the fix will follow next.

In this patch we make 'ubifs_readdir()' stop using 'file->f_pos' directly,
because 'file->f_pos' can be changed by '->llseek()' at any point. This may
lead 'ubifs_readdir()' to returning inconsistent data: directory entry names
may correspond to incorrect file positions.

So here we introduce a local variable 'pos', read 'file->f_pose' once at very
the beginning, and then stick to 'pos'. The result of this is that when
'ubifs_dir_llseek()' changes 'file->f_pos' while we are in the middle of
'ubifs_readdir()', the latter "wins".

Cc: stable@vger.kernel.org
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agomn10300: Use early_param() to parse "mem=" parameter
Akira Takeuchi [Fri, 28 Jun 2013 15:53:03 +0000 (16:53 +0100)]
mn10300: Use early_param() to parse "mem=" parameter

This fixes the problem that "init=" options may not be passed to kernel
correctly.

parse_mem_cmdline() of mn10300 arch gets rid of "mem=" string from
redboot_command_line. Then init_setup() parses the "init=" options from
static_command_line, which is a copy of redboot_command_line, and keeps
the pointer to the init options in execute_command variable.

Since the commit 026cee0 upstream (params: <level>_initcall-like kernel
parameters), static_command_line becomes overwritten by saved_command_line at
do_initcall_level(). Notice that saved_command_line is a command line
which includes "mem=" string.

As a result, execute_command may point to weird string by the length of
"mem=" parameter.
I noticed this problem when using the command line like this:

    mem=128M console=ttyS0,115200 init=/bin/sh

Here is the processing flow of command line parameters.
    start_kernel()
      setup_arch(&command_line)
         parse_mem_cmdline(cmdline_p)
           * strcpy(boot_command_line, redboot_command_line);
           * Remove "mem=xxx" from redboot_command_line.
           * *cmdline_p = redboot_command_line;
      setup_command_line(command_line) <-- command_line is redboot_command_line
        * strcpy(saved_command_line, boot_command_line)
        * strcpy(static_command_line, command_line)
      parse_early_param()
        strlcpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE);
        parse_early_options(tmp_cmdline);
          parse_args("early options", cmdline, NULL, 0, 0, 0, do_early_param);
      parse_args("Booting ..", static_command_line, ...);
        init_setup() <-- save the pointer in execute_command
      rest_init()
        kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);

At this point, execute_command points to "/bin/sh" string.

    kernel_init()
      kernel_init_freeable()
        do_basic_setup()
          do_initcalls()
            do_initcall_level()
              (*) strcpy(static_command_line, saved_command_line);

Here, execute_command gets to point to "200" string !!

Signed-off-by: David Howells <dhowells@redhat.com>
11 years agomn10300: Allow to pass array name to get_user()
Akira Takeuchi [Fri, 28 Jun 2013 15:53:01 +0000 (16:53 +0100)]
mn10300: Allow to pass array name to get_user()

This fixes the following compile error:

CC block/scsi_ioctl.o
block/scsi_ioctl.c: In function 'sg_scsi_ioctl':
block/scsi_ioctl.c:449: error: invalid initializer

Signed-off-by: David Howells <dhowells@redhat.com>
11 years agodrm/qxl: add missing access check for execbuffer ioctl
Dave Airlie [Fri, 28 Jun 2013 03:27:40 +0000 (13:27 +1000)]
drm/qxl: add missing access check for execbuffer ioctl

Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agopowerpc/eeh: Add eeh_dev to the cache during boot
Thadeu Lima de Souza Cascardo [Thu, 27 Jun 2013 21:00:10 +0000 (18:00 -0300)]
powerpc/eeh: Add eeh_dev to the cache during boot

commit f8f7d63fd96ead101415a1302035137a866f8998 ("powerpc/eeh: Trace eeh
device from I/O cache") broke EEH on pseries for devices that were
present during boot and have not been hotplugged/DLPARed.

eeh_check_failure will get the eeh_dev from the cache, and will get
NULL. eeh_addr_cache_build adds the addresses to the cache, but eeh_dev
for the giving pci_device is not set yet. Just reordering the call to
eeh_addr_cache_insert_dev works fine. The ordering is similar to the one
in eeh_add_device_late.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
11 years agorbd: send snapshot context with writes
Josh Durgin [Wed, 26 Jun 2013 19:56:17 +0000 (12:56 -0700)]
rbd: send snapshot context with writes

Sending the right snapshot context with each write is required for
snapshots to work. Due to the ordering of calls, the snapshot context
is never set for any requests. This causes writes to the current
version of the image to be reflected in all snapshots, which are
supposed to be read-only.

This happens because rbd_osd_req_format_write() sets the snapshot
context based on obj_request->img_request. At this point, however,
obj_request->img_request has not been set yet, to the snapshot context
is set to NULL. Fix this by moving rbd_img_obj_request_add(), which
sets obj_request->img_request, before the osd request formatting
calls.

This resolves:
    http://tracker.ceph.com/issues/5465

Reported-by: Karol Jurak <karol.jurak@gmail.com>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>
11 years agoMerge tag 'fcoe1' into fixes
James Bottomley [Thu, 27 Jun 2013 06:08:22 +0000 (23:08 -0700)]
Merge tag 'fcoe1' into fixes

This patch fixes a critical bug that was introduced in 3.9
related to VLAN tagging FCoE frames.

11 years agoMerge tag 'fcoe' into fixes
James Bottomley [Thu, 27 Jun 2013 06:07:53 +0000 (23:07 -0700)]
Merge tag 'fcoe' into fixes

3.10 fixes

11 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Thu, 27 Jun 2013 05:24:37 +0000 (19:24 -1000)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Found via trinity:

    If you connect up an ipv6 socket to an ipv4 mapped address then an
    ipv6 one, sendmsg() can croak because ip6_sk_dst_check() assumes the
    route cached in the socket is an ipv6 one.  In this case there is an
    ipv4 route attached, so it gets stomped on.

    Reported by Dave Jones and Hannes Frederic Sowa, fixed by Eric
    Dumazet.

 2) AF_KEY notifications leak some kernel memory to userspace, fix from
    Mathias Krause.

 3) DLCI calls __dev_get_by_name() without proper locking, and dlci_del
    doesn't validate that the device being deleted is actually a DLCI
    one.  Fixes from Li Zefan.

 4) Length check on bluetooth l2cap information responses is wrong, each
    response type has a different lenth, so we should make sure it's in
    a given range rather than enforce one single valid length.  From
    Jaganath Kanakkassery.

 5) Receive FIFO overflow is really easy to trigger in stress scenerios
    in the sh_eth driver, but the event isn't being handled properly at
    all.  Specifically, the mask of error interrupts doesn't include the
    event so we never clear it, resulting in the driver becomming wedged
    processing an interrupt that never gets cleared.

    Fix from Sergei Shtylyov.

 6) qlcnic sleeps while holding a spinlock, use mdelay() instead of
    msleep().  From Shahed Shaikh.

 7) Missing curly braces causes SIP netfilter NAT module to always drop
    packets.  Fix from Balazs Peter Odor.

 8) ipt_ULOG in netfilter passes the wrong value to timer setup, causing
    the timer to dereference crap when it fires.  Fix from Gao Feng.

 9) Missing RCU protection around txq->axq_acq traversal in
    ath_txq_schedule().  Fix from Felix Fietkau.

10) Idle state transition test in ath9k_htc_config() is reversed, fix
    from Sujith Manoharan.

11) IPV6 forwarding handles unicast Router Alert packets incorrectly.
    It tests the wrong option state.  Previously opt->ra being non-zero
    indicated a router alert marking in the SKB, but now it's indicated
    by a bit in opt->flags.  Fix from YOSHIFUJI Hideaki.

12) SKB leak in GRE tunnel GSO handling, from Eric Dumazet.

13) get_user_pages_fast() error handling in TUN and MACVTAP use the same
    local variable for the base index and the loop iterator for page
    traversal, oops! Fix from Michael S Tsirkin.

14) ipv6_get_lladdr() can fail, and we must therefore check it's return
    value in inet6_set_iftoken().  For from Hannes Frederic Sowa.

15) If you change an interface name and meanwhile can sneak in something
    that looks up the name (like SO_BINDTODEVICE or SIOCGIFNAME) we can
    deadlock with CONFIG_PREEMPT=n.  Fix this by providing a helper
    function that properly uses raw_seqcount_begin().  From Nicolas
    Schichan.

16) Chain noise calibration test is inverted in iwlwifi, fix from
    Nikolay Martynov.

17) Properly set TX iwlwifi descriptor flags for back requests.  Fix
    from Emmanuel Grumbach.

18) We can't assume skb_transport_header() is set in xt_TCPOPTSTRAP
    module, fix from Pablo Neira Ayuso.

19) Some crummy APs don't provide the proper High Throughput info in
    association response frames.  Add a workaround by assume we'll use
    whatever is in the beacon/probe.  Fix from Johannes Berg.

20) mac80211 call to rate_idx_match_mask() swaps two arguments (mask and
    channel width).  Fix from Simon Wunderlich.

21) xt_TCPMSS (like xt_TCPOPTSTRAP) must not try to handle fragmented
    frames.  Fix from Phil Oester.

22) Fix rate control regression causing iwlwifi/iwlegacy chips to use
    1Mbit/s on pre-11n networks.  From Moshe Benji and Stanslaw Gruszka.

23) Disable brcmsmac power-save functions, they cause regressions.  From
    Arend van Spriel.

24) Enforce a sane minimum MTU in l2cap_build_cmd() otherwise we can
    easily crash.  Fix from Anderson Lizardo.

25) If a learning packet arrives during vxlan_stop() we crash, easily
    fixed by checking netif_running().  From Stephen Hemminger.

26) Static vxlan FDB entries should not be migrated, also from Stephen.

27) skb_clone() failures not handled in vxlan_xmit(), oops.  Also from
    Stephen.

28) Add minimal driver for AR816x/AR817x ethernet chips, from Johannes
    Berg.

29) Fix regression in userspace VLAN acceleration control, added by the
    802.1ad support changes.  Fix from Fernando Luis Vazquez Cao.

30) Interval selection for MLD queries in the bridging code was
    reversed.  Fix from Linus Lüssing.

31) ipv6's ndisc_send_redirect() erroneously writes to the packet we
    received not the packet we are building to send out.  Fix from
    Matthias Schiffer.

32) Don't free netdev before unregistering it, in usb_8dev can driver.
    From Marc Kleine-Budde.

33) Fix nl80211 attribute buffer races, from Johannes Berg.

34) Although netlink_diag.h is under uapi/ it isn't present in Kbuild.
    From Stephen Hemminger.

35) Wrong address and family passed to MD5 key lookups in TCP, from
    Aydin Arik.

36) phy_type attribute created by SFC driver should not be writable.
    From Ben Hutchings.

37) Receive/Transmit queue allocations in pxa168_eth and mv643xx_eth
    should use kzalloc().  Otherwise if setup fails half-way, we'll
    dereference garbage when trying to teardown the rings.  From Lubomir
    Rintel.

38) Fix double-allocation of dst (resulting in unfreeable net device) in
    ipv6's init_loopback().  From Gao Feng.

39) Fix fragmentation handling SKB leak in netfilter conntrack, we were
    freeing the wrong skb pointer.  From Phil Oester.

40) Don't report "-1" (SPEED_UNKNOWN) in bond_miimon_commit(), from
    Nikolay Aleksandrov.

41) davinci_cpdma doesn't check for DMA mapping errors, letting the
    device scribble to random addresses.  From Sebastian Siewior.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits)
  dlci: validate the net device in dlci_del()
  dlci: acquire rtnl_lock before calling __dev_get_by_name()
  af_key: fix info leaks in notify messages
  ipv6: ip6_sk_dst_check() must not assume ipv6 dst
  net: fix kernel deadlock with interface rename and netdev name retrieval.
  net/tg3: Avoid delay during MMIO access
  ipv6: check return value of ipv6_get_lladdr
  macvtap: fix recovery from gup errors
  tun: fix recovery from gup errors
  gre: fix a possible skb leak
  ipv6: Process unicast packet with Router Alert by checking flag in skb.
  ath9k_htc: Handle IDLE state transition properly
  ath9k: fix an RCU issue in calling ieee80211_get_tx_rates
  netfilter: ipt_ULOG: fix incorrect setting of ulog timer
  netfilter: ctnetlink: send event when conntrack label was modified
  netfilter: nf_nat_sip: fix mangling
  qlcnic: Do not sleep while holding spinlock
  drivers: net: cpsw: fix compilation error with cpsw driver
  tcp: doc : fix the syncookies default value
  sh_eth: fix misreporting of transmit abort
  ...

11 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Thu, 27 Jun 2013 05:23:15 +0000 (19:23 -1000)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull i915 drm fixes from Dave Airlie:
 "These should be the last two fixes for i915, one is for a fence leak
  killing X on some older GPUs, and one is a late regression partial
  revert for an swiotlb/xen/i915 interaction, Konrad has promised to
  figure out the proper answer, and this patch is the best thing to do
  at this stage to avoid regressing"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/i915: make compact dma scatter lists creation work with SWIOTLB backend.
  drm/i915: Restore fences after resume and GPU resets

11 years agodlci: validate the net device in dlci_del()
Zefan Li [Wed, 26 Jun 2013 07:31:58 +0000 (15:31 +0800)]
dlci: validate the net device in dlci_del()

We triggered an oops while running trinity with 3.4 kernel:

BUG: unable to handle kernel paging request at 0000000100000d07
IP: [<ffffffffa0109738>] dlci_ioctl+0xd8/0x2d4 [dlci]
PGD 640c0d067 PUD 0
Oops: 0000 [#1] PREEMPT SMP
CPU 3
...
Pid: 7302, comm: trinity-child3 Not tainted 3.4.24.09+ 40 Huawei Technologies Co., Ltd. Tecal RH2285          /BC11BTSA
RIP: 0010:[<ffffffffa0109738>]  [<ffffffffa0109738>] dlci_ioctl+0xd8/0x2d4 [dlci]
...
Call Trace:
  [<ffffffff8137c5c3>] sock_ioctl+0x153/0x280
  [<ffffffff81195494>] do_vfs_ioctl+0xa4/0x5e0
  [<ffffffff8118354a>] ? fget_light+0x3ea/0x490
  [<ffffffff81195a1f>] sys_ioctl+0x4f/0x80
  [<ffffffff81478b69>] system_call_fastpath+0x16/0x1b
...

It's because the net device is not a dlci device.

Reported-by: Li Jinyue <lijinyue@huawei.com>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agodlci: acquire rtnl_lock before calling __dev_get_by_name()
Zefan Li [Wed, 26 Jun 2013 07:29:54 +0000 (15:29 +0800)]
dlci: acquire rtnl_lock before calling __dev_get_by_name()

Otherwise the net device returned can be freed at anytime.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoaf_key: fix info leaks in notify messages
Mathias Krause [Wed, 26 Jun 2013 21:52:30 +0000 (23:52 +0200)]
af_key: fix info leaks in notify messages

key_notify_sa_flush() and key_notify_policy_flush() miss to initialize
the sadb_msg_reserved member of the broadcasted message and thereby
leak 2 bytes of heap memory to listeners. Fix that.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoipv6: ip6_sk_dst_check() must not assume ipv6 dst
Eric Dumazet [Wed, 26 Jun 2013 11:15:07 +0000 (04:15 -0700)]
ipv6: ip6_sk_dst_check() must not assume ipv6 dst

It's possible to use AF_INET6 sockets and to connect to an IPv4
destination. After this, socket dst cache is a pointer to a rtable,
not rt6_info.

ip6_sk_dst_check() should check the socket dst cache is IPv6, or else
various corruptions/crashes can happen.

Dave Jones can reproduce immediate crash with
trinity -q -l off -n -c sendmsg -c connect

With help from Hannes Frederic Sowa

Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: fix kernel deadlock with interface rename and netdev name retrieval.
Nicolas Schichan [Wed, 26 Jun 2013 15:23:42 +0000 (17:23 +0200)]
net: fix kernel deadlock with interface rename and netdev name retrieval.

When the kernel (compiled with CONFIG_PREEMPT=n) is performing the
rename of a network interface, it can end up waiting for a workqueue
to complete. If userland is able to invoke a SIOCGIFNAME ioctl or a
SO_BINDTODEVICE getsockopt in between, the kernel will deadlock due to
the fact that read_secklock_begin() will spin forever waiting for the
writer process (the one doing the interface rename) to update the
devnet_rename_seq sequence.

This patch fixes the problem by adding a helper (netdev_get_name())
and using it in the code handling the SIOCGIFNAME ioctl and
SO_BINDTODEVICE setsockopt.

The netdev_get_name() helper uses raw_seqcount_begin() to avoid
spinning forever, waiting for devnet_rename_seq->sequence to become
even. cond_resched() is used in the contended case, before retrying
the access to give the writer process a chance to finish.

The use of raw_seqcount_begin() will incur some unneeded work in the
reader process in the contended case, but this is better than
deadlocking the system.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge tag 'regulator-v3.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 26 Jun 2013 19:18:37 +0000 (09:18 -1000)]
Merge tag 'regulator-v3.10-rc7' of git://git./linux/kernel/git/broonie/regulator

Pull regulator fix from Mark Brown:
 "Fix module loading for tps6586x.

  A simple one liner fix to make module loading work for distros
  (product specific kernels tend to have things built in)"

* tag 'regulator-v3.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  mfd: tps6586x: correct device name of the regulator cell

11 years agoMerge tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux
Linus Torvalds [Wed, 26 Jun 2013 19:08:58 +0000 (09:08 -1000)]
Merge tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux

Pull GPIO regression fix from Grant Likely:
 "It took a while to work out the correct solution to this regression.
  It is sorted now.  This branch was constructed and tested by Tony.
  I've verified that it builds and signed the tag"

* tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux:
  gpio/omap: don't use linear domain mapping for OMAP1

11 years agoMerge tag 'pm+acpi-3.10-late' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Wed, 26 Jun 2013 18:55:03 +0000 (08:55 -1000)]
Merge tag 'pm+acpi-3.10-late' of git://git./linux/kernel/git/rafael/linux-pm

Pull late power management and ACPI fixes from Rafael Wysocki:
 "Sorry about the timing of this, but ACPI-based docking stations with
  PCI devices on them and ATA bays would be hardly usable with 3.10
  without it.  We've been working on these fixes for the last couple of
  weeks and everyone involved appears to be reasonably comfortable with
  them now.

  The PM part is one fix for a cpufreq regression introduced recently

   - Fix for an ACPI dock regression introduced by the recent rework of
     the ACPI-based PCI hotplug code (acpiphp) that caused it to be
     initialized before the ACPI dock driver, which is incorrect (ACPI
     dock has to be initialized before acpiphp so that acpiphp can
     register PCI devices on docking stations with it for PCI hotplug on
     re-dock to work).  From Jiang Liu.

   - Fix for PCI resources allocation in the ACPI-based PCI hotplug code
     (acpiphp) that makes it use the same PCI resources assignment rules
     during runtime hotplug that are used during boot (the BIOS' choices
     are now respected in both cases).  This prevents PCI resource
     allocation failures during hotplug from happening in some cases.
     From Jiang Liu.

   - Fix for ordering and synchronization issues during hot-removal of
     PCI devices on docking stations.  It makes the ACPI dock code carry
     out the PCI devices removal synchronously during undock instead of
     spawning a separate asynchronous work item to remove each of them
     without even bothering to wait for all those work items to
     complete.  The hot-addition part is changed analogously.

   - Fix for a regression (introduced a few releases ago) that removed
     the code to register a hotplug notificaion handler for for ATA
     ports/devices inadvertently which prevented ATA bays hotplug from
     working.  The missing code is added back with some improvements.
     From Aaron Lu.

   - Fix for a recent cpufreq regression causing a NULL pointer
     dereference to trigger in od_set_powersave_bias() in some
     situations from Jacob Shin"

* tag 'pm+acpi-3.10-late' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: fix NULL pointer deference at od_set_powersave_bias()
  libata-acpi: add back ACPI based hotplug functionality
  ACPI / dock / PCI: Synchronous handling of dock events for PCI devices
  PCI / ACPI: Use boot-time resource allocation rules during hotplug
  ACPI / dock: Initialize ACPI dock subsystem upfront

11 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 26 Jun 2013 18:51:44 +0000 (08:51 -1000)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "Three small fixlets"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  hw_breakpoint: Use cpu_possible_mask in {reserve,release}_bp_slot()
  hw_breakpoint: Fix cpu check in task_bp_pinned(cpu)
  kprobes: Fix arch_prepare_kprobe to handle copy insn failures

11 years agoMerge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm
Linus Torvalds [Wed, 26 Jun 2013 18:50:39 +0000 (08:50 -1000)]
Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm

Pull ARM fixes from Russell King:
 "Another round of ARM fixes.  Largest one is the second half of the
  PJ4B fix which was pushed in the previous -rc - this one was delayed
  because its original caused a build regression while trying to fix a
  regression!

  As ever, noMMU gets forgotten when fixing problems on MMU, so we have
  a noMMU fix for a previous fix included in this set.

  A couple of fixes from Lorenzo for problems with the ARM DT CPU code,
  and a one liner to remove the buggy 'wait for interrupt' with FA526
  cores"

* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
  ARM: 7773/1: PJ4B: Add support for errata 4742
  ARM: 7772/1: Fix missing flush_kernel_dcache_page() for noMMU
  ARM: 7763/1: kernel: fix __cpu_logical_map default initialization
  ARM: 7762/1: kernel: fix arm_dt_init_cpu_maps() to skip non-cpu nodes
  ARM: 7760/1: cpu_fa526_do_idle: remove WFI

11 years agoMerge tag 'critical_fix_for_3.9' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 26 Jun 2013 18:48:53 +0000 (08:48 -1000)]
Merge tag 'critical_fix_for_3.9' of git://git./linux/kernel/git/rwlove/fcoe

Pull FCoE fix from Robert W Love:
 "This patch fixes a critical bug that was introduced in 3.9 related to
  VLAN tagging FCoE frames"

* tag 'critical_fix_for_3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rwlove/fcoe:
  fcoe: Use correct API to set vlan tag for FCoE Ethertype skbs

11 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph...
Linus Torvalds [Wed, 26 Jun 2013 18:47:46 +0000 (08:47 -1000)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client

Pull Ceph fix from Sage Weil:
 "This fixes another problem with using v2 images on 3.10 due to the
  order in which fields are read from the image header.

  Hopefully this is the last one"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  rbd: fetch object order before using it

11 years agoperf: Disable monitoring on setuid processes for regular users
Stephane Eranian [Thu, 20 Jun 2013 09:36:28 +0000 (11:36 +0200)]
perf: Disable monitoring on setuid processes for regular users

There was a a bug in setup_new_exec(), whereby
the test to disabled perf monitoring was not
correct because the new credentials for the
process were not yet committed and therefore
the get_dumpable() test was never firing.

The patch fixes the problem by moving the
perf_event test until after the credentials
are committed.

Signed-off-by: Stephane Eranian <eranian@google.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agogpio/omap: don't use linear domain mapping for OMAP1
Javier Martinez Canillas [Mon, 24 Jun 2013 15:13:23 +0000 (17:13 +0200)]
gpio/omap: don't use linear domain mapping for OMAP1

Commit ede4d7a5 ("gpio/omap: convert gpio irq domain to linear mapping")
converted the OMAP GPIO driver to use a linear mapping for the GPIO IRQ
domain instead of using a legacy mapping. Not using a legacy mapping has
a number of benefits but it requires the platform to support SPARSE_IRQ
which currently is not supported on OMAP1.

So this change caused a regression on OMAP1 platforms [1].

Since this issue is not present on all OMAP2+ platforms, there is no need to
revert the driver to use legacy domain mapping for all the platforms.

[1]: http://www.mail-archive.com/linux-omap@vger.kernel.org/msg89005.html

Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Tony Lindgren <tony@atomide.com>
11 years agonet/tg3: Avoid delay during MMIO access
Gavin Shan [Tue, 25 Jun 2013 07:24:32 +0000 (15:24 +0800)]
net/tg3: Avoid delay during MMIO access

When the EEH error is the result of a fenced host bridge, MMIO accesses
can be very slow (milliseconds) to timeout and return all 1's,
thus causing the driver various timeout loops to take way too long and
trigger soft-lockup warnings (in addition to taking minutes to recover).

It might be worthwhile to check if for any of these cases, ffffffff is
a valid possible value, and if not, bail early since that means the HW
is either gone or isolated. In the meantime, checking that the PCI channel
is offline would be workaround of the problem.

Cc: <stable@vger.kernel.org> # v3.0+
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>