firefly-linux-kernel-4.4.55.git
10 years agoipv4: icmp: Fix pMTU handling for rare case
Edward Allcutt [Mon, 30 Jun 2014 15:16:02 +0000 (16:16 +0100)]
ipv4: icmp: Fix pMTU handling for rare case

[ Upstream commit 68b7107b62983f2cff0948292429d5f5999df096 ]

Some older router implementations still send Fragmentation Needed
errors with the Next-Hop MTU field set to zero. This is explicitly
described as an eventuality that hosts must deal with by the
standard (RFC 1191) since older standards specified that those
bits must be zero.

Linux had a generic (for all of IPv4) implementation of the algorithm
described in the RFC for searching a list of MTU plateaus for a good
value. Commit 46517008e116 ("ipv4: Kill ip_rt_frag_needed().")
removed this as part of the changes to remove the routing cache.
Subsequently any Fragmentation Needed packet with a zero Next-Hop
MTU has been discarded without being passed to the per-protocol
handlers or notifying userspace for raw sockets.

When there is a router which does not implement RFC 1191 on an
MTU limited path then this results in stalled connections since
large packets are discarded and the local protocols are not
notified so they never attempt to lower the pMTU.

One example I have seen is an OpenBSD router terminating IPSec
tunnels. It's worth pointing out that this case is distinct from
the BSD 4.2 bug which incorrectly calculated the Next-Hop MTU
since the commit in question dismissed that as a valid concern.

All of the per-protocols handlers implement the simple approach from
RFC 1191 of immediately falling back to the minimum value. Although
this is sub-optimal it is vastly preferable to connections hanging
indefinitely.

Remove the Next-Hop MTU != 0 check and allow such packets
to follow the normal path.

Fixes: 46517008e116 ("ipv4: Kill ip_rt_frag_needed().")
Signed-off-by: Edward Allcutt <edward.allcutt@openmarket.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agotcp: Fix divide by zero when pushing during tcp-repair
Christoph Paasch [Sat, 28 Jun 2014 16:26:37 +0000 (18:26 +0200)]
tcp: Fix divide by zero when pushing during tcp-repair

[ Upstream commit 5924f17a8a30c2ae18d034a86ee7581b34accef6 ]

When in repair-mode and TCP_RECV_QUEUE is set, we end up calling
tcp_push with mss_now being 0. If data is in the send-queue and
tcp_set_skb_tso_segs gets called, we crash because it will divide by
mss_now:

[  347.151939] divide error: 0000 [#1] SMP
[  347.152907] Modules linked in:
[  347.152907] CPU: 1 PID: 1123 Comm: packetdrill Not tainted 3.16.0-rc2 #4
[  347.152907] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  347.152907] task: f5b88540 ti: f3c82000 task.ti: f3c82000
[  347.152907] EIP: 0060:[<c1601359>] EFLAGS: 00210246 CPU: 1
[  347.152907] EIP is at tcp_set_skb_tso_segs+0x49/0xa0
[  347.152907] EAX: 00000b67 EBX: f5acd080 ECX: 00000000 EDX: 00000000
[  347.152907] ESI: f5a28f40 EDI: f3c88f00 EBP: f3c83d10 ESP: f3c83d00
[  347.152907]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  347.152907] CR0: 80050033 CR2: 083158b0 CR3: 35146000 CR4: 000006b0
[  347.152907] Stack:
[  347.152907]  c167f9d9 f5acd080 000005b4 00000002 f3c83d20 c16013e6 f3c88f00 f5acd080
[  347.152907]  f3c83da0 c1603b5a f3c83d38 c10a0188 00000000 00000000 f3c83d84 c10acc85
[  347.152907]  c1ad5ec0 00000000 00000000 c1ad679c 010003e0 00000000 00000000 f3c88fc8
[  347.152907] Call Trace:
[  347.152907]  [<c167f9d9>] ? apic_timer_interrupt+0x2d/0x34
[  347.152907]  [<c16013e6>] tcp_init_tso_segs+0x36/0x50
[  347.152907]  [<c1603b5a>] tcp_write_xmit+0x7a/0xbf0
[  347.152907]  [<c10a0188>] ? up+0x28/0x40
[  347.152907]  [<c10acc85>] ? console_unlock+0x295/0x480
[  347.152907]  [<c10ad24f>] ? vprintk_emit+0x1ef/0x4b0
[  347.152907]  [<c1605716>] __tcp_push_pending_frames+0x36/0xd0
[  347.152907]  [<c15f4860>] tcp_push+0xf0/0x120
[  347.152907]  [<c15f7641>] tcp_sendmsg+0xf1/0xbf0
[  347.152907]  [<c116d920>] ? kmem_cache_free+0xf0/0x120
[  347.152907]  [<c106a682>] ? __sigqueue_free+0x32/0x40
[  347.152907]  [<c106a682>] ? __sigqueue_free+0x32/0x40
[  347.152907]  [<c114f0f0>] ? do_wp_page+0x3e0/0x850
[  347.152907]  [<c161c36a>] inet_sendmsg+0x4a/0xb0
[  347.152907]  [<c1150269>] ? handle_mm_fault+0x709/0xfb0
[  347.152907]  [<c15a006b>] sock_aio_write+0xbb/0xd0
[  347.152907]  [<c1180b79>] do_sync_write+0x69/0xa0
[  347.152907]  [<c1181023>] vfs_write+0x123/0x160
[  347.152907]  [<c1181d55>] SyS_write+0x55/0xb0
[  347.152907]  [<c167f0d8>] sysenter_do_call+0x12/0x28

This can easily be reproduced with the following packetdrill-script (the
"magic" with netem, sk_pacing and limit_output_bytes is done to prevent
the kernel from pushing all segments, because hitting the limit without
doing this is not so easy with packetdrill):

0   socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0  setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0

+0  bind(3, ..., ...) = 0
+0  listen(3, 1) = 0

+0  < S 0:0(0) win 32792 <mss 1460>
+0  > S. 0:0(0) ack 1 <mss 1460>
+0.1  < . 1:1(0) ack 1 win 65000

+0  accept(3, ..., ...) = 4

// This forces that not all segments of the snd-queue will be pushed
+0 `tc qdisc add dev tun0 root netem delay 10ms`
+0 `sysctl -w net.ipv4.tcp_limit_output_bytes=2`
+0 setsockopt(4, SOL_SOCKET, 47, [2], 4) = 0

+0 write(4,...,10000) = 10000
+0 write(4,...,10000) = 10000

// Set tcp-repair stuff, particularly TCP_RECV_QUEUE
+0 setsockopt(4, SOL_TCP, 19, [1], 4) = 0
+0 setsockopt(4, SOL_TCP, 20, [1], 4) = 0

// This now will make the write push the remaining segments
+0 setsockopt(4, SOL_SOCKET, 47, [20000], 4) = 0
+0 `sysctl -w net.ipv4.tcp_limit_output_bytes=130000`

// Now we will crash
+0 write(4,...,1000) = 1000

This happens since ec3423257508 (tcp: fix retransmission in repair
mode). Prior to that, the call to tcp_push was prevented by a check for
tp->repair.

The patch fixes it, by adding the new goto-label out_nopush. When exiting
tcp_sendmsg and a push is not required, which is the case for tp->repair,
we go to this label.

When repairing and calling send() with TCP_RECV_QUEUE, the data is
actually put in the receive-queue. So, no push is required because no
data has been added to the send-queue.

Cc: Andrew Vagin <avagin@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Fixes: ec3423257508 (tcp: fix retransmission in repair mode)
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Acked-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agobnx2x: fix possible panic under memory stress
Eric Dumazet [Thu, 26 Jun 2014 07:44:02 +0000 (00:44 -0700)]
bnx2x: fix possible panic under memory stress

[ Upstream commit 07b0f00964def8af9321cfd6c4a7e84f6362f728 ]

While it is legal to kfree(NULL), it is not wise to use :
put_page(virt_to_head_page(NULL))

 BUG: unable to handle kernel paging request at ffffeba400000000
 IP: [<ffffffffc01f5928>] virt_to_head_page+0x36/0x44 [bnx2x]

Reported-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ariel Elior <ariel.elior@qlogic.com>
Fixes: d46d132cc021 ("bnx2x: use netdev_alloc_frag()")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agonet: fix sparse warning in sk_dst_set()
Eric Dumazet [Wed, 2 Jul 2014 09:39:38 +0000 (02:39 -0700)]
net: fix sparse warning in sk_dst_set()

[ Upstream commit 5925a0555bdaf0b396a84318cbc21ba085f6c0d3 ]

sk_dst_cache has __rcu annotation, so we need a cast to avoid
following sparse error :

include/net/sock.h:1774:19: warning: incorrect type in initializer (different address spaces)
include/net/sock.h:1774:19:    expected struct dst_entry [noderef] <asn:4>*__ret
include/net/sock.h:1774:19:    got struct dst_entry *dst

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 7f502361531e ("ipv4: irq safe sk_dst_[re]set() and ipv4_sk_update_pmtu() fix")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoipv4: irq safe sk_dst_[re]set() and ipv4_sk_update_pmtu() fix
Eric Dumazet [Mon, 30 Jun 2014 08:26:23 +0000 (01:26 -0700)]
ipv4: irq safe sk_dst_[re]set() and ipv4_sk_update_pmtu() fix

[ Upstream commit 7f502361531e9eecb396cf99bdc9e9a59f7ebd7f ]

We have two different ways to handle changes to sk->sk_dst

First way (used by TCP) assumes socket lock is owned by caller, and use
no extra lock : __sk_dst_set() & __sk_dst_reset()

Another way (used by UDP) uses sk_dst_lock because socket lock is not
always taken. Note that sk_dst_lock is not softirq safe.

These ways are not inter changeable for a given socket type.

ipv4_sk_update_pmtu(), added in linux-3.8, added a race, as it used
the socket lock as synchronization, but users might be UDP sockets.

Instead of converting sk_dst_lock to a softirq safe version, use xchg()
as we did for sk_rx_dst in commit e47eb5dfb296b ("udp: ipv4: do not use
sk_dst_lock from softirq context")

In a follow up patch, we probably can remove sk_dst_lock, as it is
only used in IPv6.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Fixes: 9cb3a50c5f63e ("ipv4: Invalidate the socket cached route on pmtu events if possible")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoipv4: fix dst race in sk_dst_get()
Eric Dumazet [Tue, 24 Jun 2014 17:05:11 +0000 (10:05 -0700)]
ipv4: fix dst race in sk_dst_get()

[ Upstream commit f88649721268999bdff09777847080a52004f691 ]

When IP route cache had been removed in linux-3.6, we broke assumption
that dst entries were all freed after rcu grace period. DST_NOCACHE
dst were supposed to be freed from dst_release(). But it appears
we want to keep such dst around, either in UDP sockets or tunnels.

In sk_dst_get() we need to make sure dst refcount is not 0
before incrementing it, or else we might end up freeing a dst
twice.

DST_NOCACHE set on a dst does not mean this dst can not be attached
to a socket or a tunnel.

Then, before actual freeing, we need to observe a rcu grace period
to make sure all other cpus can catch the fact the dst is no longer
usable.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dormando <dormando@rydia.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years ago8021q: fix a potential memory leak
Li RongQing [Wed, 18 Jun 2014 05:46:02 +0000 (13:46 +0800)]
8021q: fix a potential memory leak

[ Upstream commit 916c1689a09bc1ca81f2d7a34876f8d35aadd11b ]

skb_cow called in vlan_reorder_header does not free the skb when it failed,
and vlan_reorder_header returns NULL to reset original skb when it is called
in vlan_untag, lead to a memory leak.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agonet: sctp: check proc_dointvec result in proc_sctp_do_auth
Daniel Borkmann [Wed, 18 Jun 2014 21:46:31 +0000 (23:46 +0200)]
net: sctp: check proc_dointvec result in proc_sctp_do_auth

[ Upstream commit 24599e61b7552673dd85971cf5a35369cd8c119e ]

When writing to the sysctl field net.sctp.auth_enable, it can well
be that the user buffer we handed over to proc_dointvec() via
proc_sctp_do_auth() handler contains something other than integers.

In that case, we would set an uninitialized 4-byte value from the
stack to net->sctp.auth_enable that can be leaked back when reading
the sysctl variable, and it can unintentionally turn auth_enable
on/off based on the stack content since auth_enable is interpreted
as a boolean.

Fix it up by making sure proc_dointvec() returned sucessfully.

Fixes: b14878ccb7fa ("net: sctp: cache auth_enable per endpoint")
Reported-by: Florian Westphal <fwestpha@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agotcp: fix tcp_match_skb_to_sack() for unaligned SACK at end of an skb
Neal Cardwell [Thu, 19 Jun 2014 01:15:03 +0000 (21:15 -0400)]
tcp: fix tcp_match_skb_to_sack() for unaligned SACK at end of an skb

[ Upstream commit 2cd0d743b05e87445c54ca124a9916f22f16742e ]

If there is an MSS change (or misbehaving receiver) that causes a SACK
to arrive that covers the end of an skb but is less than one MSS, then
tcp_match_skb_to_sack() was rounding up pkt_len to the full length of
the skb ("Round if necessary..."), then chopping all bytes off the skb
and creating a zero-byte skb in the write queue.

This was visible now because the recently simplified TLP logic in
bef1909ee3ed1c ("tcp: fixing TLP's FIN recovery") could find that 0-byte
skb at the end of the write queue, and now that we do not check that
skb's length we could send it as a TLP probe.

Consider the following example scenario:

 mss: 1000
 skb: seq: 0 end_seq: 4000  len: 4000
 SACK: start_seq: 3999 end_seq: 4000

The tcp_match_skb_to_sack() code will compute:

 in_sack = false
 pkt_len = start_seq - TCP_SKB_CB(skb)->seq = 3999 - 0 = 3999
 new_len = (pkt_len / mss) * mss = (3999/1000)*1000 = 3000
 new_len += mss = 4000

Previously we would find the new_len > skb->len check failing, so we
would fall through and set pkt_len = new_len = 4000 and chop off
pkt_len of 4000 from the 4000-byte skb, leaving a 0-byte segment
afterward in the write queue.

With this new commit, we notice that the new new_len >= skb->len check
succeeds, so that we return without trying to fragment.

Fixes: adb92db857ee ("tcp: Make SACK code to split only at mss boundaries")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Ilpo Jarvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoip_tunnel: fix ip_tunnel_lookup
Dmitry Popov [Fri, 4 Jul 2014 22:26:37 +0000 (02:26 +0400)]
ip_tunnel: fix ip_tunnel_lookup

[ Upstream commit e0056593b61253f1a8a9941dacda22e73b963cdc ]

This patch fixes 3 similar bugs where incoming packets might be routed into
wrong non-wildcard tunnels:

1) Consider the following setup:
    ip address add 1.1.1.1/24 dev eth0
    ip address add 1.1.1.2/24 dev eth0
    ip tunnel add ipip1 remote 2.2.2.2 local 1.1.1.1 mode ipip dev eth0
    ip link set ipip1 up

Incoming ipip packets from 2.2.2.2 were routed into ipip1 even if it has dst =
1.1.1.2. Moreover even if there was wildcard tunnel like
   ip tunnel add ipip0 remote 2.2.2.2 local any mode ipip dev eth0
but it was created before explicit one (with local 1.1.1.1), incoming ipip
packets with src = 2.2.2.2 and dst = 1.1.1.2 were still routed into ipip1.

Same issue existed with all tunnels that use ip_tunnel_lookup (gre, vti)

2)  ip address add 1.1.1.1/24 dev eth0
    ip tunnel add ipip1 remote 2.2.146.85 local 1.1.1.1 mode ipip dev eth0
    ip link set ipip1 up

Incoming ipip packets with dst = 1.1.1.1 were routed into ipip1, no matter what
src address is. Any remote ip address which has ip_tunnel_hash = 0 raised this
issue, 2.2.146.85 is just an example, there are more than 4 million of them.
And again, wildcard tunnel like
   ip tunnel add ipip0 remote any local 1.1.1.1 mode ipip dev eth0
wouldn't be ever matched if it was created before explicit tunnel like above.

Gre & vti tunnels had the same issue.

3)  ip address add 1.1.1.1/24 dev eth0
    ip tunnel add gre1 remote 2.2.146.84 local 1.1.1.1 key 1 mode gre dev eth0
    ip link set gre1 up

Any incoming gre packet with key = 1 were routed into gre1, no matter what
src/dst addresses are. Any remote ip address which has ip_tunnel_hash = 0 raised
the issue, 2.2.146.84 is just an example, there are more than 4 million of them.
Wildcard tunnel like
   ip tunnel add gre2 remote any local any key 1 mode gre dev eth0
wouldn't be ever matched if it was created before explicit tunnel like above.

All this stuff happened because while looking for a wildcard tunnel we didn't
check that matched tunnel is a wildcard one. Fixed.

Signed-off-by: Dmitry Popov <ixaphire@qrator.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoshmem: fix splicing from a hole while it's punched
Hugh Dickins [Wed, 23 Jul 2014 21:00:13 +0000 (14:00 -0700)]
shmem: fix splicing from a hole while it's punched

commit b1a366500bd537b50c3aad26dc7df083ec03a448 upstream.

shmem_fault() is the actual culprit in trinity's hole-punch starvation,
and the most significant cause of such problems: since a page faulted is
one that then appears page_mapped(), needing unmap_mapping_range() and
i_mmap_mutex to be unmapped again.

But it is not the only way in which a page can be brought into a hole in
the radix_tree while that hole is being punched; and Vlastimil's testing
implies that if enough other processors are busy filling in the hole,
then shmem_undo_range() can be kept from completing indefinitely.

shmem_file_splice_read() is the main other user of SGP_CACHE, which can
instantiate shmem pagecache pages in the read-only case (without holding
i_mutex, so perhaps concurrently with a hole-punch).  Probably it's
silly not to use SGP_READ already (using the ZERO_PAGE for holes): which
ought to be safe, but might bring surprises - not a change to be rushed.

shmem_read_mapping_page_gfp() is an internal interface used by
drivers/gpu/drm GEM (and next by uprobes): it should be okay.  And
shmem_file_read_iter() uses the SGP_DIRTY variant of SGP_CACHE, when
called internally by the kernel (perhaps for a stacking filesystem,
which might rely on holes to be reserved): it's unclear whether it could
be provoked to keep hole-punch busy or not.

We could apply the same umbrella as now used in shmem_fault() to
shmem_file_splice_read() and the others; but it looks ugly, and use over
a range raises questions - should it actually be per page? can these get
starved themselves?

The origin of this part of the problem is my v3.1 commit d0823576bf4b
("mm: pincer in truncate_inode_pages_range"), once it was duplicated
into shmem.c.  It seemed like a nice idea at the time, to ensure
(barring RCU lookup fuzziness) that there's an instant when the entire
hole is empty; but the indefinitely repeated scans to ensure that make
it vulnerable.

Revert that "enhancement" to hole-punch from shmem_undo_range(), but
retain the unproblematic rescanning when it's truncating; add a couple
of comments there.

Remove the "indices[0] >= end" test: that is now handled satisfactorily
by the inner loop, and mem_cgroup_uncharge_start()/end() are too light
to be worth avoiding here.

But if we do not always loop indefinitely, we do need to handle the case
of swap swizzled back to page before shmem_free_swap() gets it: add a
retry for that case, as suggested by Konstantin Khlebnikov; and for the
case of page swizzled back to swap, as suggested by Johannes Weiner.

Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lukas Czerner <lczerner@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoshmem: fix faulting into a hole, not taking i_mutex
Hugh Dickins [Wed, 23 Jul 2014 21:00:10 +0000 (14:00 -0700)]
shmem: fix faulting into a hole, not taking i_mutex

commit 8e205f779d1443a94b5ae81aa359cb535dd3021e upstream.

Commit f00cdc6df7d7 ("shmem: fix faulting into a hole while it's
punched") was buggy: Sasha sent a lockdep report to remind us that
grabbing i_mutex in the fault path is a no-no (write syscall may already
hold i_mutex while faulting user buffer).

We tried a completely different approach (see following patch) but that
proved inadequate: good enough for a rational workload, but not good
enough against trinity - which forks off so many mappings of the object
that contention on i_mmap_mutex while hole-puncher holds i_mutex builds
into serious starvation when concurrent faults force the puncher to fall
back to single-page unmap_mapping_range() searches of the i_mmap tree.

So return to the original umbrella approach, but keep away from i_mutex
this time.  We really don't want to bloat every shmem inode with a new
mutex or completion, just to protect this unlikely case from trinity.
So extend the original with wait_queue_head on stack at the hole-punch
end, and wait_queue item on the stack at the fault end.

This involves further use of i_lock to guard against the races: lockdep
has been happy so far, and I see fs/inode.c:unlock_new_inode() holds
i_lock around wake_up_bit(), which is comparable to what we do here.
i_lock is more convenient, but we could switch to shmem's info->lock.

This issue has been tagged with CVE-2014-4171, which will require commit
f00cdc6df7d7 and this and the following patch to be backported: we
suggest to 3.1+, though in fact the trinity forkbomb effect might go
back as far as 2.6.16, when madvise(,,MADV_REMOVE) came in - or might
not, since much has changed, with i_mmap_mutex a spinlock before 3.0.
Anyone running trinity on 3.0 and earlier? I don't think we need care.

Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lukas Czerner <lczerner@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoshmem: fix faulting into a hole while it's punched
Hugh Dickins [Mon, 23 Jun 2014 20:22:06 +0000 (13:22 -0700)]
shmem: fix faulting into a hole while it's punched

commit f00cdc6df7d7cfcabb5b740911e6788cb0802bdb upstream.

Trinity finds that mmap access to a hole while it's punched from shmem
can prevent the madvise(MADV_REMOVE) or fallocate(FALLOC_FL_PUNCH_HOLE)
from completing, until the reader chooses to stop; with the puncher's
hold on i_mutex locking out all other writers until it can complete.

It appears that the tmpfs fault path is too light in comparison with its
hole-punching path, lacking an i_data_sem to obstruct it; but we don't
want to slow down the common case.

Extend shmem_fallocate()'s existing range notification mechanism, so
shmem_fault() can refrain from faulting pages into the hole while it's
punched, waiting instead on i_mutex (when safe to sleep; or repeatedly
faulting when not).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoiwlwifi: dvm: don't enable CTS to self
Emmanuel Grumbach [Wed, 25 Jun 2014 06:12:30 +0000 (09:12 +0300)]
iwlwifi: dvm: don't enable CTS to self

commit 43d826ca5979927131685cc2092c7ce862cb91cd upstream.

We should always prefer to use full RTS protection. Using
CTS to self gives a meaningless improvement, but this flow
is much harder for the firmware which is likely to have
issues with it.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoigb: do a reset on SR-IOV re-init if device is down
Stefan Assmann [Thu, 10 Jul 2014 10:29:39 +0000 (03:29 -0700)]
igb: do a reset on SR-IOV re-init if device is down

commit 76252723e88681628a3dbb9c09c963e095476f73 upstream.

To properly re-initialize SR-IOV it is necessary to reset the device
even if it is already down. Not doing this may result in Tx unit hangs.

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agohwmon: (adt7470) Fix writes to temperature limit registers
Guenter Roeck [Thu, 17 Jul 2014 00:40:31 +0000 (17:40 -0700)]
hwmon: (adt7470) Fix writes to temperature limit registers

commit de12d6f4b10b21854441f5242dcb29ea96181e58 upstream.

Temperature limit registers are signed. Limits therefore need
to be clamped to (-128, 127) degrees C and not to (0, 255)
degrees C.

Without this fix, writing a limit of 128 degrees C sets the
actual limit to -128 degrees C.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agohwmon: (da9052) Don't use dash in the name attribute
Axel Lin [Wed, 9 Jul 2014 01:18:59 +0000 (09:18 +0800)]
hwmon: (da9052) Don't use dash in the name attribute

commit ee14b644daaa58afe1e91bb9ebd9cf1b18d1f5fa upstream.

Dashes are not allowed in hwmon name attributes.
Use "da9052" instead of "da9052-hwmon".

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agohwmon: (da9055) Don't use dash in the name attribute
Axel Lin [Wed, 9 Jul 2014 01:22:54 +0000 (09:22 +0800)]
hwmon: (da9055) Don't use dash in the name attribute

commit 6b00f440dd678d786389a7100a2e03fe44478431 upstream.

Dashes are not allowed in hwmon name attributes.
Use "da9055" instead of "da9055-hwmon".

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agotracing: Add ftrace_trace_stack into __trace_puts/__trace_bputs
zhangwei(Jovi) [Thu, 18 Jul 2013 08:31:05 +0000 (16:31 +0800)]
tracing: Add ftrace_trace_stack into __trace_puts/__trace_bputs

commit 8abfb8727f4a724d31f9ccfd8013fbd16d539445 upstream.

Currently trace option stacktrace is not applicable for
trace_printk with constant string argument, the reason is
in __trace_puts/__trace_bputs ftrace_trace_stack is missing.

In contrast, when using trace_printk with non constant string
argument(will call into __trace_printk/__trace_bprintk), then
trace option stacktrace is workable, this inconstant result
will confuses users a lot.

Link: http://lkml.kernel.org/p/51E7A7C9.9040401@huawei.com
Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agotracing: Fix graph tracer with stack tracer on other archs
Steven Rostedt (Red Hat) [Tue, 15 Jul 2014 15:05:12 +0000 (11:05 -0400)]
tracing: Fix graph tracer with stack tracer on other archs

commit 5f8bf2d263a20b986225ae1ed7d6759dc4b93af9 upstream.

Running my ftrace tests on PowerPC, it failed the test that checks
if function_graph tracer is affected by the stack tracer. It was.
Looking into this, I found that the update_function_graph_func()
must be called even if the trampoline function is not changed.
This is because archs like PowerPC do not support ftrace_ops being
passed by assembly and instead uses a helper function (what the
trampoline function points to). Since this function is not changed
even when multiple ftrace_ops are added to the code, the test that
falls out before calling update_function_graph_func() will miss that
the update must still be done.

Call update_function_graph_function() for all calls to
update_ftrace_function()

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agofuse: handle large user and group ID
Miklos Szeredi [Mon, 7 Jul 2014 13:28:51 +0000 (15:28 +0200)]
fuse: handle large user and group ID

commit 233a01fa9c4c7c41238537e8db8434667ff28a2f upstream.

If the number in "user_id=N" or "group_id=N" mount options was larger than
INT_MAX then fuse returned EINVAL.

Fix this to handle all valid uid/gid values.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoBluetooth: Ignore H5 non-link packets in non-active state
Loic Poulain [Mon, 23 Jun 2014 15:42:44 +0000 (17:42 +0200)]
Bluetooth: Ignore H5 non-link packets in non-active state

commit 48439d501e3d9e8634bdc0c418e066870039599d upstream.

When detecting a non-link packet, h5_reset_rx() frees the Rx skb.
Not returning after that will cause the upcoming h5_rx_payload()
call to dereference a now NULL Rx skb and trigger a kernel oops.

Signed-off-by: Loic Poulain <loic.poulain@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoDrivers: hv: util: Fix a bug in the KVP code
K. Y. Srinivasan [Mon, 7 Jul 2014 23:34:25 +0000 (16:34 -0700)]
Drivers: hv: util: Fix a bug in the KVP code

commit 9bd2d0dfe4714dd5d7c09a93a5c9ea9e14ceb3fc upstream.

Add code to poll the channel since we process only one message
at a time and the host may not interrupt us. Also increase the
receive buffer size since some KVP messages are close to 8K bytes in size.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomedia: gspca_pac7302: Add new usb-id for Genius i-Look 317
Hans de Goede [Wed, 9 Jul 2014 09:20:44 +0000 (06:20 -0300)]
media: gspca_pac7302: Add new usb-id for Genius i-Look 317

commit 242841d3d71191348f98310e2d2001e1001d8630 upstream.

Tested-and-reported-by: yullaw <yullaw@mageia.cz>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agousb: Check if port status is equal to RxDetect
Gavin Guo [Thu, 17 Jul 2014 17:12:13 +0000 (01:12 +0800)]
usb: Check if port status is equal to RxDetect

commit bb86cf569bbd7ad4dce581a37c7fbd748057e9dc upstream.

When using USB 3.0 pen drive with the [AMD] FCH USB XHCI Controller
[1022:7814], the second hotplugging will experience the USB 3.0 pen
drive is recognized as high-speed device. After bisecting the kernel,
I found the commit number 41e7e056cdc662f704fa9262e5c6e213b4ab45dd
(USB: Allow USB 3.0 ports to be disabled.) causes the bug. After doing
some experiments, the bug can be fixed by avoiding executing the function
hub_usb3_port_disable(). Because the port status with [AMD] FCH USB
XHCI Controlleris [1022:7814] is already in RxDetect
(I tried printing out the port status before setting to Disabled state),
it's reasonable to check the port status before really executing
hub_usb3_port_disable().

Fixes: 41e7e056cdc6 (USB: Allow USB 3.0 ports to be disabled.)
Signed-off-by: Gavin Guo <gavin.guo@canonical.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoMerge branch 'linux-linaro-lsk' into linux-linaro-lsk-android
Mark Brown [Fri, 25 Jul 2014 11:59:39 +0000 (12:59 +0100)]
Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-android

10 years agoMerge remote-tracking branch 'lsk/v3.10/topic/arm64-misc' into linux-linaro-lsk
Mark Brown [Fri, 25 Jul 2014 11:59:32 +0000 (12:59 +0100)]
Merge remote-tracking branch 'lsk/v3.10/topic/arm64-misc' into linux-linaro-lsk

10 years agoarm64: mm: Make icache synchronisation logic huge page aware
Steve Capper [Wed, 2 Jul 2014 10:46:23 +0000 (11:46 +0100)]
arm64: mm: Make icache synchronisation logic huge page aware

The __sync_icache_dcache routine will only flush the dcache for the
first page of a compound page, potentially leading to stale icache
data residing further on in a hugetlb page.

This patch addresses this issue by taking into consideration the
order of the page when flushing the dcache.

Reported-by: Mark Brown <broonie@linaro.org>
Tested-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org> # v3.11+
(cherry picked from commit 923b8f5044da753e4985ab15c1374ced2cdf616c)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoMerge branch 'linux-linaro-lsk' into linux-linaro-lsk-android
Mark Brown [Fri, 25 Jul 2014 11:52:58 +0000 (12:52 +0100)]
Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-android

10 years agoMerge remote-tracking branch 'lsk/v3.10/topic/arm64-misc' into linux-linaro-lsk
Mark Brown [Fri, 25 Jul 2014 11:50:12 +0000 (12:50 +0100)]
Merge remote-tracking branch 'lsk/v3.10/topic/arm64-misc' into linux-linaro-lsk

Conflicts:
arch/arm64/mm/tlb.S

10 years agoarm64: Fix barriers used for page table modifications
Catalin Marinas [Mon, 9 Jun 2014 10:55:03 +0000 (11:55 +0100)]
arm64: Fix barriers used for page table modifications

The architecture specification states that both DSB and ISB are required
between page table modifications and subsequent memory accesses using the
corresponding virtual address. When TLB invalidation takes place, the
tlb_flush_* functions already have the necessary barriers. However, there are
other functions like create_mapping() for which this is not the case.

The patch adds the DSB+ISB instructions in the set_pte() function for
valid kernel mappings. The invalid pte case is handled by tlb_flush_*
and the user mappings in general have a corresponding update_mmu_cache()
call containing a DSB. Even when update_mmu_cache() isn't called, the
kernel can still cope with an unlikely spurious page fault by
re-executing the instruction.

In addition, the set_pmd, set_pud() functions gain an ISB for
architecture compliance when block mappings are created.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Steve Capper <steve.capper@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 54d6ba0ede61f12b2a03d74bdbf004719a9cfefc)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: mm: Optimise tlb flush logic where we have >4K granule
Steve Capper [Fri, 2 May 2014 13:49:00 +0000 (14:49 +0100)]
arm64: mm: Optimise tlb flush logic where we have >4K granule

The tlb maintainence functions: __cpu_flush_user_tlb_range and
__cpu_flush_kern_tlb_range do not take into consideration the page
granule when looping through the address range, and repeatedly flush
tlb entries for the same page when operating with 64K pages.

This patch re-works the logic s.t. we instead advance the loop by
 1 << (PAGE_SHIFT - 12), so avoid repeating ourselves.

Also the routines have been converted from assembler to static inline
functions to aid with legibility and potential compiler optimisations.

The isb() has been removed from flush_tlb_kernel_range(.) as it is
only needed when changing the execute permission of a mapping. If one
needs to set an area of the kernel as execute/non-execute an isb()
must be inserted after the call to flush_tlb_kernel_range.

Cc: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit fa48e6f780a681cdbc7820e33259edfe1a79b9e3)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: use correct register width when retrieving ASID
Matthew Leach [Wed, 25 Sep 2013 15:33:13 +0000 (16:33 +0100)]
arm64: use correct register width when retrieving ASID

The ASID is represented as an unsigned int in mm_context_t and we
currently use the mmid assembler macro to access this element of the
struct. This should be accessed with a register of 32-bit width. If
the incorrect register width is used the ASID will be returned in
bits[32:63] of the register when running under big-endian.

Fix a use of the mmid macro in tlb.S to use a 32-bit access.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Matthew Leach <matthew.leach@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit fc18047c732f6becba92618a397555927687efd3)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoMerge branch 'linux-linaro-lsk' into linux-linaro-lsk-android
Mark Brown [Thu, 24 Jul 2014 22:07:25 +0000 (23:07 +0100)]
Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-android

10 years agoMerge remote-tracking branch 'lsk/v3.10/topic/arm64-efi' into linux-linaro-lsk
Mark Brown [Thu, 24 Jul 2014 22:03:21 +0000 (23:03 +0100)]
Merge remote-tracking branch 'lsk/v3.10/topic/arm64-efi' into linux-linaro-lsk

Conflicts:
arch/arm64/kernel/Makefile
arch/arm64/kernel/head.S

10 years agoMerge branch 'linux-linaro-lsk' into linux-linaro-lsk-android
Mark Brown [Thu, 24 Jul 2014 22:01:03 +0000 (23:01 +0100)]
Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-android

Conflicts:
drivers/of/fdt.c

10 years agoarm64: efi: only attempt efi map setup if booting via EFI
Leif Lindholm [Fri, 23 May 2014 13:16:56 +0000 (14:16 +0100)]
arm64: efi: only attempt efi map setup if booting via EFI

Booting a kernel with CONFIG_EFI enabled on a non-EFI system caused
an oops with the current UEFI support code.
Add the required test to prevent this.

Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit 74bcc2499291d38b6253f9dbd6af33a195222208)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: add EFI runtime services
Mark Salter [Wed, 16 Apr 2014 01:59:30 +0000 (21:59 -0400)]
arm64: add EFI runtime services

This patch adds EFI runtime support for arm64. This runtime support allows
the kernel to access various EFI runtime services provided by EFI firmware.
Things like reboot, real time clock, EFI boot variables, and others.

This functionality is supported for little endian kernels only. The UEFI
firmware standard specifies that the firmware be little endian. A future
patch is expected to add support for big endian kernels running with
little endian firmware.

Signed-off-by: Mark Salter <msalter@redhat.com>
[ Remove unnecessary cache/tlb maintenance. ]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit f84d02755f5a9f3b88e8d15d6384da25ad6dcf5e)
Signed-off-by: Mark Brown <broonie@linaro.org>
Conflicts:
arch/arm64/Kconfig
arch/arm64/kernel/Makefile

10 years agoefi: fdt: Do not report an error during boot if UEFI is not available
Catalin Marinas [Tue, 8 Jul 2014 15:54:18 +0000 (16:54 +0100)]
efi: fdt: Do not report an error during boot if UEFI is not available

Currently, fdt_find_uefi_params() reports an error if no EFI parameters
are found in the DT. This is however a valid case for non-UEFI kernel
booting. This patch checks changes the error reporting to a
pr_info("UEFI not found") when no EFI parameters are found in the DT.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Leif Lindholm <leif.lindholm@linaro.org>
Tested-by: Leif Lindholm <leif.lindholm@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit 29e2435fd6d71e0136e2c2ff0433b7dbeeaaccfd)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoefi/arm64: efistub: remove local copy of linux_banner
Ard Biesheuvel [Fri, 13 Jun 2014 11:11:51 +0000 (13:11 +0200)]
efi/arm64: efistub: remove local copy of linux_banner

The shared efistub code for ARM and arm64 contains a local copy of
linux_banner, allowing it to be referenced from separate executables
such as the ARM decompressor. However, this introduces a dependency on
generated header files, causing unnecessary rebuilds of the stub itself
and, in case of arm64, vmlinux which contains it.

On arm64, the copy is not actually needed since we can reference the
original symbol directly, and as it turns out, there may be better ways
to deal with this for ARM as well, so let's remove it from the shared
code. If it still needs to be reintroduced for ARM later, it should live
under arch/arm anyway and not in shared code.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit a55c072dfe520f8fa03cf11b07b9268a8a17820a)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoefi: Fix compiler warnings (unused, const, type)
Catalin Marinas [Mon, 2 Jun 2014 10:31:06 +0000 (11:31 +0100)]
efi: Fix compiler warnings (unused, const, type)

This patch fixes a few compiler warning in the efi code for unused
variable, discarding const qualifier and wrong pointer type:

drivers/firmware/efi/fdt.c|66 col 22| warning: unused variable â€˜name’ [-Wunused-variable]
drivers/firmware/efi/efi.c|368 col 3| warning: passing argument 3 of â€˜of_get_flat_dt_prop’ from incompatible pointer type [enabled by default]
drivers/firmware/efi/efi.c|368 col 8| warning: assignment discards â€˜const’ qualifier from pointer target type [enabled by default]

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit 6fb8cc82c096fd5ccf277678639193cae07125a0)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoefi/arm64: ignore dtb= when UEFI SecureBoot is enabled
Ard Biesheuvel [Thu, 3 Apr 2014 15:46:58 +0000 (17:46 +0200)]
efi/arm64: ignore dtb= when UEFI SecureBoot is enabled

Loading unauthenticated FDT blobs directly from storage is a security hazard,
so this should only be allowed when running with UEFI Secure Boot disabled.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit 345c736edd07b657a8c48190baed2719b85d0938)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: efi: add EFI stub
Mark Salter [Wed, 16 Apr 2014 02:47:52 +0000 (22:47 -0400)]
arm64: efi: add EFI stub

This patch adds PE/COFF header fields to the start of the kernel
Image so that it appears as an EFI application to UEFI firmware.
An EFI stub is included to allow direct booting of the kernel
Image.

Signed-off-by: Mark Salter <msalter@redhat.com>
[Add support in PE/COFF header for signed images]
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit 3c7f255039a2ad6ee1e3890505caf0d029b22e29)
Signed-off-by: Mark Brown <broonie@linaro.org>
Conflicts:
arch/arm64/Kconfig
arch/arm64/kernel/Makefile

10 years agoarm64: Expand arm64 image header
Roy Franz [Wed, 14 Aug 2013 23:10:00 +0000 (00:10 +0100)]
arm64: Expand arm64 image header

Expand the arm64 image header to allow for co-existance with
PE/COFF header required by the EFI stub.  The PE/COFF format
requires the "MZ" header to be at offset 0, and the offset
to the PE/COFF header to be at offset 0x3c.  The image
header is expanded to allow 2 instructions at the beginning
to accommodate a benign intruction at offset 0 that includes
the "MZ" header, a magic number, and the offset to the PE/COFF
header.

Signed-off-by: Roy Franz <roy.franz@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 4370eec05a887b0cd4392cd5dc5b2713174745c0)
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from commit 3033aae67ae55a86e2a0b73199984ff060effa7b)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoefi: Move facility flags to struct efi
Matt Fleming [Wed, 15 Jan 2014 13:21:22 +0000 (13:21 +0000)]
efi: Move facility flags to struct efi

As we grow support for more EFI architectures they're going to want the
ability to query which EFI features are available on the running system.
Instead of storing this information in an architecture-specific place,
stick it in the global 'struct efi', which is already the central
location for EFI state.

While we're at it, let's change the return value of efi_enabled() to be
bool and replace all references to 'facility' with 'feature', which is
the usual word used to describe the attributes of the running system.

Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit 3e909599215456928e6b42a04f11c2517881570b)
Signed-off-by: Mark Brown <broonie@linaro.org>
Conflicts:
arch/x86/platform/efi/efi.c

10 years agoia64/efi: Implement efi_enabled()
Matt Fleming [Wed, 15 Jan 2014 13:49:51 +0000 (13:49 +0000)]
ia64/efi: Implement efi_enabled()

There's no good reason to keep efi_enabled() under CONFIG_X86 anymore,
since nothing about the implementation is specific to x86.

Set EFI feature flags in the ia64 boot path instead of claiming to
support all features. The old behaviour was actually buggy since
efi.memmap never points to a valid memory map, so we shouldn't be
claiming to support EFI_MEMMAP.

Fortunately, this bug was never triggered because EFI_MEMMAP isn't used
outside of arch/x86 currently, but that may not always be the case.

Reviewed-and-tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit 092063808c498eccac8e891973bf143e7b60d723)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoefi: Add separate 32-bit/64-bit definitions
Matt Fleming [Fri, 10 Jan 2014 13:47:37 +0000 (13:47 +0000)]
efi: Add separate 32-bit/64-bit definitions

The traditional approach of using machine-specific types such as
'unsigned long' does not allow the kernel to interact with firmware
running in a different CPU mode, e.g. 64-bit kernel with 32-bit EFI.

Add distinct EFI structure definitions for both 32-bit and 64-bit so
that we can use them in the 32-bit and 64-bit code paths.

Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit 677703cef0a148ba07d37ced649ad25b1cda2f78)
Signed-off-by: Mark Brown <broonie@linaro.org>
Conflicts:
include/linux/efi.h

10 years agoMerge remote-tracking branch 'lsk/v3.10/topic/libfdt' into linux-linaro-lsk
Mark Brown [Thu, 24 Jul 2014 21:54:49 +0000 (22:54 +0100)]
Merge remote-tracking branch 'lsk/v3.10/topic/libfdt' into linux-linaro-lsk

Conflicts:
drivers/of/fdt.c

10 years agoMerge remote-tracking branch 'lsk/v3.10/topic/arm64-misc' into linux-linaro-lsk
Mark Brown [Thu, 24 Jul 2014 21:52:37 +0000 (22:52 +0100)]
Merge remote-tracking branch 'lsk/v3.10/topic/arm64-misc' into linux-linaro-lsk

Conflicts:
arch/arm64/kernel/head.S

10 years agolib: add fdt_empty_tree.c
Mark Salter [Tue, 4 Feb 2014 16:11:10 +0000 (11:11 -0500)]
lib: add fdt_empty_tree.c

CONFIG_LIBFDT support does not include fdt_empty_tree.c which is
needed by arm64 EFI stub. Add it to libfdt_files.

Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit adaf5687846c25613d58c0a2f5d9e024547cdbec)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoof/fdt: Convert FDT functions to use libfdt
Rob Herring [Wed, 2 Apr 2014 20:10:14 +0000 (15:10 -0500)]
of/fdt: Convert FDT functions to use libfdt

The kernel FDT functions predate libfdt and are much more limited in
functionality. Also, the kernel functions and libfdt functions are
not compatible with each other because they have different definitions
of node offsets. To avoid this incompatibility and in preparation to
add more FDT parsing functions which will need libfdt, let's first
convert the existing code to use libfdt.

The FDT unflattening, top-level FDT scanning, and property retrieval
functions are converted to use libfdt. The scanning code should be
re-worked to be more efficient and understandable by using libfdt to
find nodes directly by path or compatible strings.

Signed-off-by: Rob Herring <robh@kernel.org>
Tested-by: Michal Simek <michal.simek@xilinx.com>
Tested-by: Grant Likely <grant.likely@linaro.org>
Tested-by: Stephen Chivers <schivers@csc.com>
(cherry picked from commit e6a6928c3ea1d0195ed75a091e345696b916c09b)
Signed-off-by: Mark Brown <broonie@linaro.org>
Conflicts:
drivers/of/fdt.c

10 years agoof/fdt: update of_get_flat_dt_prop in prep for libfdt
Mark Brown [Thu, 24 Jul 2014 20:06:21 +0000 (21:06 +0100)]
of/fdt: update of_get_flat_dt_prop in prep for libfdt

Make of_get_flat_dt_prop arguments compatible with libfdt fdt_getprop
call in preparation to convert FDT code to use libfdt. Make the return
value const and the property length ptr type an int.

Signed-off-by: Rob Herring <robh@kernel.org>
Tested-by: Michal Simek <michal.simek@xilinx.com>
Tested-by: Grant Likely <grant.likely@linaro.org>
Tested-by: Stephen Chivers <schivers@csc.com>
(cherry picked from commit 9d0c4dfedd96ee54fc075b16d02f82499c8cc3a6)
Signed-off-by: Mark Brown <broonie@linaro.org>
Conflicts:
arch/arc/kernel/devtree.c
arch/arm/kernel/devtree.c
arch/arm/mach-exynos/exynos.c
arch/arm/plat-samsung/s5p-dev-mfc.c
arch/powerpc/kernel/epapr_paravirt.c
arch/powerpc/kernel/prom.c
arch/powerpc/mm/hash_utils_64.c
arch/powerpc/platforms/powernv/opal.c
arch/xtensa/kernel/setup.c
drivers/of/fdt.c

10 years agoof/fdt: remove unused of_scan_flat_dt_by_path
Rob Herring [Sat, 29 Mar 2014 19:14:17 +0000 (14:14 -0500)]
of/fdt: remove unused of_scan_flat_dt_by_path

of_scan_flat_dt_by_path is unused anywhere in the kernel, so remove it.

Signed-off-by: Rob Herring <robh@kernel.org>
Tested-by: Michal Simek <michal.simek@xilinx.com>
Tested-by: Grant Likely <grant.likely@linaro.org>
Tested-by: Stephen Chivers <schivers@csc.com>
(cherry picked from commit bba04d965d06abbbe10afd3687742389107e198e)
Signed-off-by: Mark Brown <broonie@linaro.org>
Conflicts:
drivers/of/fdt.c

10 years agoof: Fix the section mismatch warnings.
Xiubo Li [Tue, 8 Apr 2014 05:48:07 +0000 (13:48 +0800)]
of: Fix the section mismatch warnings.

In tag next-20140407, building with CONFIG_DEBUG_SECTION_MISMATCH
enabled, the following WARNING is occured:

WARNING: drivers/built-in.o(.text.unlikely+0x2220): Section mismatch
in reference from the function __reserved_mem_check_root() to the
function .init.text:of_get_flat_dt_prop()
The function __reserved_mem_check_root() references
the function __init of_get_flat_dt_prop().
This is often because __reserved_mem_check_root lacks a __init
annotation or the annotation of of_get_flat_dt_prop is wrong.

WARNING: vmlinux.o(.text.unlikely+0xb9d0): Section mismatch in reference
from the function __reserved_mem_check_root() to the (unknown reference)
.init.data:(unknown)
The function __reserved_mem_check_root() references
the (unknown reference) __initdata (unknown).
This is often because __reserved_mem_check_root lacks a __initdata
annotation or the annotation of (unknown) is wrong.

This is cause by :
'drivers: of: add initialization code for dynamic reserved memory'.

Signed-off-by: Xiubo Li <Li.Xiubo@freescale.com>
Signed-off-by: Rob Herring <robh@kernel.org>
(cherry picked from commit 5b6241185e2cded07ca3f5f646b55c641928ba4e)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoof: only scan for reserved mem when fdt present
Josh Cartwright [Thu, 13 Mar 2014 21:36:36 +0000 (16:36 -0500)]
of: only scan for reserved mem when fdt present

When the reserved memory patches hit -next, several legacy (non-DT) boot
failures were detected and bisected down to that commit. There needs to
be some sanity checking whether a DT is even present before parsing the
reserved ranges.

Reported-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Josh Cartwright <joshc@codeaurora.org>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
(cherry picked from commit 2040b52768ebab6e7bd73af0dc63703269c62f17)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agodrivers: of: add support for custom reserved memory drivers
Marek Szyprowski [Fri, 28 Feb 2014 13:42:49 +0000 (14:42 +0100)]
drivers: of: add support for custom reserved memory drivers

Add support for custom reserved memory drivers. Call their init() function
for each reserved region and prepare for using operations provided by them
with by the reserved_mem->ops array.

Based on previous code provided by Josh Cartwright <joshc@codeaurora.org>

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
(cherry picked from commit f618c4703a14672d27bc2ca5d132a844363d6f5f)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agodrivers: of: add initialization code for dynamic reserved memory
Marek Szyprowski [Fri, 28 Feb 2014 13:42:48 +0000 (14:42 +0100)]
drivers: of: add initialization code for dynamic reserved memory

This patch adds support for dynamically allocated reserved memory regions
declared in device tree. Such regions are defined by 'size', 'alignment'
and 'alloc-ranges' properties.

Based on previous code provided by Josh Cartwright <joshc@codeaurora.org>

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
(cherry picked from commit 3f0c8206644836e4f10a6b9fc47cda6a9a372f9b)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agodrivers: of: add initialization code for static reserved memory
Marek Szyprowski [Fri, 28 Feb 2014 13:42:47 +0000 (14:42 +0100)]
drivers: of: add initialization code for static reserved memory

This patch adds support for static (defined by 'reg' property) reserved
memory regions declared in device tree.

Memory blocks can be reliably reserved only during early boot. This must
happen before the whole memory management subsystem is initialized,
because we need to ensure that the given contiguous blocks are not yet
allocated by kernel. Also it must happen before kernel mappings for the
whole low memory are created, to ensure that there will be no mappings
(for reserved blocks). Typically, all this happens before device tree
structures are unflattened, so we need to get reserved memory layout
directly from fdt.

Based on previous code provided by Josh Cartwright <joshc@codeaurora.org>

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
(cherry picked from commit e8d9d1f5485b52ec3c4d7af839e6914438f6c285)
Signed-off-by: Mark Brown <broonie@linaro.org>
Conflicts:
drivers/of/fdt.c
include/linux/of_fdt.h

10 years agodrivers: of: add function to scan fdt nodes given by path
Marek Szyprowski [Mon, 26 Aug 2013 12:41:56 +0000 (14:41 +0200)]
drivers: of: add function to scan fdt nodes given by path

Add a function to scan the flattened device-tree starting from the
node given by the path. It is used to extract information (like reserved
memory), which is required on early boot before we can unflatten the tree.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Tomasz Figa <t.figa@samsung.com>
Reviewed-by: Rob Herring <rob.herring@calxeda.com>
(cherry picked from commit 57d74bcf3072b65bde5aa540cedc976a75c48e5c)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoOF: Add helper for matching against linux,stdout-path
Sascha Hauer [Mon, 5 Aug 2013 12:40:44 +0000 (14:40 +0200)]
OF: Add helper for matching against linux,stdout-path

devicetrees may have a linux,stdout-path property in the chosen
node describing the console device. This adds a helper function
to match a device against this property so a driver can call
add_preferred_console for a matching device.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 5c19e95216b93b0d29c6a4887e69a980edc6fc81)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoof: Specify initrd location using 64-bit
Santosh Shilimkar [Mon, 1 Jul 2013 18:20:35 +0000 (14:20 -0400)]
of: Specify initrd location using 64-bit

On some PAE architectures, the entire range of physical memory could reside
outside the 32-bit limit.  These systems need the ability to specify the
initrd location using 64-bit numbers.

This patch globally modifies the early_init_dt_setup_initrd_arch() function to
use 64-bit numbers instead of the current unsigned long.

There has been quite a bit of debate about whether to use u64 or phys_addr_t.
It was concluded to stick to u64 to be consistent with rest of the device
tree code. As summarized by Geert, "The address to load the initrd is decided
by the bootloader/user and set at that point later in time. The dtb should not
be tied to the kernel you are booting"

More details on the discussion can be found here:
https://lkml.org/lkml/2013/6/20/690
https://lkml.org/lkml/2012/9/13/544

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
(cherry picked from commit 374d5c9964c10373ba39bbe934f4262eb87d7114)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: KVM: define HYP and Stage-2 translation page flags
Marc Zyngier [Fri, 7 Dec 2012 18:35:41 +0000 (18:35 +0000)]
arm64: KVM: define HYP and Stage-2 translation page flags

Add HYP and S2 page flags, for both normal and device memory.

Reviewed-by: Christopher Covington <cov@codeaurora.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit 363116073a26dbc2903d8417047597eebcc05273)
Signed-off-by: Mark Brown <broonie@linaro.org>
Conflicts:
arch/arm64/include/asm/pgtable-hwdef.h
arch/arm64/include/asm/pgtable.h

10 years agoarm64: Fix for the arm64 kern_addr_valid() function
Dave Anderson [Tue, 15 Apr 2014 17:53:24 +0000 (18:53 +0100)]
arm64: Fix for the arm64 kern_addr_valid() function

Fix for the arm64 kern_addr_valid() function to recognize
virtual addresses in the kernel logical memory map.  The
function fails as written because it does not check whether
the addresses in that region are mapped at the pmd level to
2MB or 512MB pages, continues the page table walk to the
pte level, and issues a garbage value to pfn_valid().

Tested on 4K-page and 64K-page kernels.

Signed-off-by: Dave Anderson <anderson@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit da6e4cb67c6dd1f72257c0a4a97c26dc4e80d3a7)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: Clean up the default pgprot setting
Catalin Marinas [Thu, 3 Apr 2014 14:57:15 +0000 (15:57 +0100)]
arm64: Clean up the default pgprot setting

The primary aim of this patchset is to remove the pgprot_default and
prot_sect_default global variables and rely strictly on predefined
values. The original goal was to be able to run SMP kernels on UP
hardware by not setting the Shareability bit. However, it is unlikely to
see UP ARMv8 hardware and even if we do, the Shareability bit is no
longer assumed to disable cacheable accesses.

A side effect is that the device mappings now have the Shareability
attribute set. The hardware, however, should ignore it since Device
accesses are always Outer Shareable.

Following the removal of the two global variables, there is some PROT_*
macro reshuffling and cleanup, including the __PAGE_* macros (replaced
by PAGE_*).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit a501e32430d4232012ab708b8f0ce841f29e0f02)
Signed-off-by: Mark Brown <broonie@linaro.org>
Conflicts:
arch/arm64/include/asm/io.h
arch/arm64/include/asm/pgtable.h
arch/arm64/mm/mmu.c

10 years agoarm64: Add function to create identity mappings
Mark Salter [Wed, 12 Mar 2014 16:28:06 +0000 (12:28 -0400)]
arm64: Add function to create identity mappings

At boot time, before switching to a virtual UEFI memory map, firmware
expects UEFI memory and IO regions to be identity mapped whenever
kernel makes runtime services calls. The existing early boot code
creates an identity map of kernel text/data but this is not sufficient
for UEFI. This patch adds a create_id_mapping() function which reuses
the core code of the existing create_mapping().

Signed-off-by: Mark Salter <msalter@redhat.com>
[ Fixed error message formatting (%pa). ]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit d7ecbddf4caefbac1b99478dd2b679f83dfc2545)
Signed-off-by: Mark Brown <broonie@linaro.org>
Conflicts:
arch/arm64/mm/mmu.c

10 years agoefivars: Add compatibility code for compat tasks
Matt Fleming [Mon, 17 Mar 2014 15:36:37 +0000 (15:36 +0000)]
efivars: Add compatibility code for compat tasks

It seems people are using 32-bit efibootmgr on top of 64-bit kernels,
which will currently fail horribly when using the efivars interface,
which is the traditional efibootmgr backend (the other being efivarfs).

Since there is no versioning info in the data structure, figure out when
we need to munge the structure data via judicious use of
is_compat_task().

Cc: Mike Waychison <mikew@google.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit e33655a386ed3b26ad36fb97a47ebb1c2ca1e928)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoefivars: Refactor sanity checking code into separate function
Matt Fleming [Mon, 17 Mar 2014 15:08:34 +0000 (15:08 +0000)]
efivars: Refactor sanity checking code into separate function

Move a large chunk of code that checks the validity of efi_variable into
a new function, because we'll also need to use it for the compat code.

Cc: Mike Waychison <mikew@google.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit 54d2fbfb0c9d341c891926100ed0e5d4c4b0c987)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoefivars: Stop passing a struct argument to efivar_validate()
Matt Fleming [Mon, 17 Mar 2014 10:57:00 +0000 (10:57 +0000)]
efivars: Stop passing a struct argument to efivar_validate()

In preparation for compat support, we can't assume that user variable
object is represented by a 'struct efi_variable'. Convert the validation
functions to take the variable name as an argument, which is the only
piece of the struct that was ever used anyway.

Cc: Mike Waychison <mikew@google.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit a5d92ad32dad94fd8f3f61778561d532bb3a2f77)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoefivars: Check size of user object
Matt Fleming [Mon, 17 Mar 2014 09:17:28 +0000 (09:17 +0000)]
efivars: Check size of user object

Unbelieavably there are no checks to see whether the data structure
passed to 'new_var' and 'del_var' is the size that we expect. Let's add
some for better robustness.

Cc: Mike Waychison <mikew@google.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit e003bbee2a6a19a4c733335989284caf1b179e0d)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoefivars: Use local variables instead of a pointer dereference
Matt Fleming [Sun, 16 Mar 2014 12:14:49 +0000 (12:14 +0000)]
efivars: Use local variables instead of a pointer dereference

In order to support a compat interface we need to stop passing pointers
to structures around, since the type of structure is going to depend on
whether the current task is a compat task.

Cc: Mike Waychison <mikew@google.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit bafc84d539c0ffa916037840df54428623abc3e6)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoefi: Use NULL instead of 0 for pointer
Daeseok Youn [Thu, 13 Feb 2014 08:16:36 +0000 (17:16 +0900)]
efi: Use NULL instead of 0 for pointer

Fix following sparse warnings:

drivers/firmware/efi/efivars.c:230:66: warning:
 Using plain integer as NULL pointer
drivers/firmware/efi/efi.c:236:27: warning:
 Using plain integer as NULL pointer

Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit 69e608411473ac56358ef35277563982d0565381)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoefivars, efi-pstore: Hold off deletion of sysfs entry until the scan is completed
Seiji Aguchi [Wed, 30 Oct 2013 19:27:26 +0000 (15:27 -0400)]
efivars, efi-pstore: Hold off deletion of sysfs entry until the scan is completed

Currently, when mounting pstore file system, a read callback of
efi_pstore driver runs mutiple times as below.

- In the first read callback, scan efivar_sysfs_list from head and pass
  a kmsg buffer of a entry to an upper pstore layer.
- In the second read callback, rescan efivar_sysfs_list from the entry
  and pass another kmsg buffer to it.
- Repeat the scan and pass until the end of efivar_sysfs_list.

In this process, an entry is read across the multiple read function
calls. To avoid race between the read and erasion, the whole process
above is protected by a spinlock, holding in open() and releasing in
close().

At the same time, kmemdup() is called to pass the buffer to pstore
filesystem during it. And then, it causes a following lockdep warning.

To make the dynamic memory allocation runnable without taking spinlock,
holding off a deletion of sysfs entry if it happens while scanning it
via efi_pstore, and deleting it after the scan is completed.

To implement it, this patch introduces two flags, scanning and deleting,
to efivar_entry.

On the code basis, it seems that all the scanning and deleting logic is
not needed because __efivars->lock are not dropped when reading from the
EFI variable store.

But, the scanning and deleting logic is still needed because an
efi-pstore and a pstore filesystem works as follows.

In case an entry(A) is found, the pointer is saved to psi->data.  And
efi_pstore_read() passes the entry(A) to a pstore filesystem by
releasing  __efivars->lock.

And then, the pstore filesystem calls efi_pstore_read() again and the
same entry(A), which is saved to psi->data, is used for resuming to scan
a sysfs-list.

So, to protect the entry(A), the logic is needed.

[    1.143710] ------------[ cut here ]------------
[    1.144058] WARNING: CPU: 1 PID: 1 at kernel/lockdep.c:2740 lockdep_trace_alloc+0x104/0x110()
[    1.144058] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
[    1.144058] Modules linked in:
[    1.144058] CPU: 1 PID: 1 Comm: systemd Not tainted 3.11.0-rc5 #2
[    1.144058]  0000000000000009 ffff8800797e9ae0 ffffffff816614a5 ffff8800797e9b28
[    1.144058]  ffff8800797e9b18 ffffffff8105510d 0000000000000080 0000000000000046
[    1.144058]  00000000000000d0 00000000000003af ffffffff81ccd0c0 ffff8800797e9b78
[    1.144058] Call Trace:
[    1.144058]  [<ffffffff816614a5>] dump_stack+0x54/0x74
[    1.144058]  [<ffffffff8105510d>] warn_slowpath_common+0x7d/0xa0
[    1.144058]  [<ffffffff8105517c>] warn_slowpath_fmt+0x4c/0x50
[    1.144058]  [<ffffffff8131290f>] ? vsscanf+0x57f/0x7b0
[    1.144058]  [<ffffffff810bbd74>] lockdep_trace_alloc+0x104/0x110
[    1.144058]  [<ffffffff81192da0>] __kmalloc_track_caller+0x50/0x280
[    1.144058]  [<ffffffff815147bb>] ? efi_pstore_read_func.part.1+0x12b/0x170
[    1.144058]  [<ffffffff8115b260>] kmemdup+0x20/0x50
[    1.144058]  [<ffffffff815147bb>] efi_pstore_read_func.part.1+0x12b/0x170
[    1.144058]  [<ffffffff81514800>] ? efi_pstore_read_func.part.1+0x170/0x170
[    1.144058]  [<ffffffff815148b4>] efi_pstore_read_func+0xb4/0xe0
[    1.144058]  [<ffffffff81512b7b>] __efivar_entry_iter+0xfb/0x120
[    1.144058]  [<ffffffff8151428f>] efi_pstore_read+0x3f/0x50
[    1.144058]  [<ffffffff8128d7ba>] pstore_get_records+0x9a/0x150
[    1.158207]  [<ffffffff812af25c>] ? selinux_d_instantiate+0x1c/0x20
[    1.158207]  [<ffffffff8128ce30>] ? parse_options+0x80/0x80
[    1.158207]  [<ffffffff8128ced5>] pstore_fill_super+0xa5/0xc0
[    1.158207]  [<ffffffff811ae7d2>] mount_single+0xa2/0xd0
[    1.158207]  [<ffffffff8128ccf8>] pstore_mount+0x18/0x20
[    1.158207]  [<ffffffff811ae8b9>] mount_fs+0x39/0x1b0
[    1.158207]  [<ffffffff81160550>] ? __alloc_percpu+0x10/0x20
[    1.158207]  [<ffffffff811c9493>] vfs_kern_mount+0x63/0xf0
[    1.158207]  [<ffffffff811cbb0e>] do_mount+0x23e/0xa20
[    1.158207]  [<ffffffff8115b51b>] ? strndup_user+0x4b/0xf0
[    1.158207]  [<ffffffff811cc373>] SyS_mount+0x83/0xc0
[    1.158207]  [<ffffffff81673cc2>] system_call_fastpath+0x16/0x1b
[    1.158207] ---[ end trace 61981bc62de9f6f4 ]---

Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Tested-by: Madper Xie <cxie@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit e0d59733f6b1796b8d6692642c87d7dd862c3e3a)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoefivars: Mark local function as static
Bojan Prtvar [Tue, 3 Sep 2013 06:56:20 +0000 (08:56 +0200)]
efivars: Mark local function as static

This fixes the following sparse warning
drivers/firmware/efi/efivars.c:567:6: warning: symbol 'efivars_sysfs_exit' was not declared. Should it be static?

Signed-off-by: Bojan Prtvar <prtvar.b@gmail.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit 6f9dd30c22da4e48c4b7b837e9641f072e673161)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoefi: Move facility flags to struct efi
Matt Fleming [Wed, 15 Jan 2014 13:21:22 +0000 (13:21 +0000)]
efi: Move facility flags to struct efi

As we grow support for more EFI architectures they're going to want the
ability to query which EFI features are available on the running system.
Instead of storing this information in an architecture-specific place,
stick it in the global 'struct efi', which is already the central
location for EFI state.

While we're at it, let's change the return value of efi_enabled() to be
bool and replace all references to 'facility' with 'feature', which is
the usual word used to describe the attributes of the running system.

Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit 3e909599215456928e6b42a04f11c2517881570b)
Signed-off-by: Mark Brown <broonie@linaro.org>
Conflicts:
arch/x86/include/asm/efi.h
arch/x86/platform/efi/efi.c

10 years agoefi: Add proper definitions for some EFI function pointers.
Roy Franz [Sun, 22 Sep 2013 22:45:26 +0000 (15:45 -0700)]
efi: Add proper definitions for some EFI function pointers.

The x86/AMD64 EFI stubs must use a call wrapper to convert between
the Linux and EFI ABIs, so void pointers are sufficient.  For ARM,
the ABIs are compatible, so we can directly invoke the function
pointers.  The functions that are used by the ARM stub are updated
to match the EFI definitions.
Also add some EFI types used by EFI functions.

Signed-off-by: Roy Franz <roy.franz@linaro.org>
Acked-by: Mark Salter <msalter@redhat.com>
Reviewed-by: Grant Likely <grant.likely@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit ed37ddffe201bfad7be3c45bc08bd65b5298adca)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoefi: create memory map iteration helper
Mark Salter [Fri, 10 Jan 2014 19:26:06 +0000 (14:26 -0500)]
efi: create memory map iteration helper

There are a lot of places in the kernel which iterate through an
EFI memory map. Most of these places use essentially the same
for-loop code. This patch adds a for_each_efi_memory_desc()
helper to clean up all of the existing duplicate code and avoid
more in the future.

Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
(cherry picked from commit e885cd805fc6e65ef5150a211c7bac02f925af04)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: place initial page tables above the kernel
Mark Rutland [Tue, 24 Jun 2014 15:51:35 +0000 (16:51 +0100)]
arm64: place initial page tables above the kernel

Currently we place swapper_pg_dir and idmap_pg_dir below the kernel
image, between PHYS_OFFSET and (PHYS_OFFSET + TEXT_OFFSET). However,
bootloaders may use portions of this memory below the kernel and we do
not parse the memory reservation list until after the MMU has been
enabled. As such we may clobber some memory a bootloader wishes to have
preserved.

To enable the use of all of this memory by bootloaders (when the
required memory reservations are communicated to the kernel) it is
necessary to move our initial page tables elsewhere. As we currently
have an effectively unbound requirement for memory at the end of the
kernel image for .bss, we can place the page tables here.

This patch moves the initial page table to the end of the kernel image,
after the BSS. As they do not consist of any initialised data they will
be stripped from the kernel Image as with the BSS. The BSS clearing
routine is updated to stop at __bss_stop rather than _end so as to not
clobber the page tables, and memory reservations made redundant by the
new organisation are removed.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit bd00cd5f8c8c3c282bb1e1eac6a6679a4f808091)
Signed-off-by: Mark Brown <broonie@linaro.org>
Conflicts:
arch/arm64/mm/init.c

10 years agoarm64: Relax the kernel cache requirements for boot
Catalin Marinas [Wed, 26 Mar 2014 18:25:55 +0000 (18:25 +0000)]
arm64: Relax the kernel cache requirements for boot

With system caches for the host OS or architected caches for guest OS we
cannot easily guarantee that there are no dirty or stale cache lines for
the areas of memory written by the kernel during boot with the MMU off
(therefore non-cacheable accesses).

This patch adds the necessary cache maintenance during boot and relaxes
the booting requirements.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit c218bca74eeafa2f8528b6bbb34d112075fcf40a)
Signed-off-by: Mark Brown <broonie@linaro.org>
Conflicts:
arch/arm64/kernel/head.S

10 years agoarm64: head: create a new function for setting the boot_cpu_mode flag
Matthew Leach [Fri, 11 Oct 2013 13:52:16 +0000 (14:52 +0100)]
arm64: head: create a new function for setting the boot_cpu_mode flag

Currently, the code for setting the __cpu_boot_mode flag is munged in
with el2_setup. This makes things difficult on a BE bringup as a
memory access has to have occurred before el2_setup which is the place
that we'd like to set the endianess on the current EL.

Create a new function for setting __cpu_boot_mode and have el2_setup
return the mode the CPU. Also define a new constant in virt.h,
BOOT_CPU_MODE_EL1, for readability.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Matthew Leach <matthew.leach@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 828e9834e9a5b7e61046aa3c5f603a4fecba2fb4)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoMerge branch 'linux-linaro-lsk' into linux-linaro-lsk-android
Alex Shi [Wed, 23 Jul 2014 07:07:07 +0000 (15:07 +0800)]
Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-android

10 years agoMerge branch 'v3.10/topic/misc' into linux-linaro-lsk
Alex Shi [Wed, 23 Jul 2014 07:02:32 +0000 (15:02 +0800)]
Merge branch 'v3.10/topic/misc' into linux-linaro-lsk

10 years agotty: serial: samsung: drop uart_port->lock before calling tty_flip_buffer_push()
Viresh Kumar [Mon, 19 Aug 2013 14:44:26 +0000 (20:14 +0530)]
tty: serial: samsung: drop uart_port->lock before calling tty_flip_buffer_push()

The current driver triggers a lockdep warning for if tty_flip_buffer_push() is
called with uart_port->lock locked. This never shows up on UP kernels and comes
up only on SMP kernels.

Crash looks like this (produced with samsung.c driver):

-----
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8)
[<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8) from [<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0)
[<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0x38) from [<c020a1a8>] (s3c24xx_serial_rx_chars+0)
[<c020a1a8>] (s3c24xx_serial_rx_chars+0x12c/0x260) from [<c020aae8>] (s3c64xx_serial_handle_irq+)
[<c020aae8>] (s3c64xx_serial_handle_irq+0x48/0x60) from [<c006aaa0>] (handle_irq_event_percpu+0x)
[<c006aaa0>] (handle_irq_event_percpu+0x50/0x194) from [<c006ac20>] (handle_irq_event+0x3c/0x5c)
[<c006ac20>] (handle_irq_event+0x3c/0x5c) from [<c006d864>] (handle_fasteoi_irq+0x80/0x13c)
[<c006d864>] (handle_fasteoi_irq+0x80/0x13c) from [<c006a4a4>] (generic_handle_irq+0x20/0x30)
[<c006a4a4>] (generic_handle_irq+0x20/0x30) from [<c000f454>] (handle_IRQ+0x38/0x94)
[<c000f454>] (handle_IRQ+0x38/0x94) from [<c0008538>] (gic_handle_irq+0x34/0x68)
[<c0008538>] (gic_handle_irq+0x34/0x68) from [<c00123c0>] (__irq_svc+0x40/0x70)
Exception stack(0xc04cdf70 to 0xc04cdfb8)
df60:                                     00000000 00000000 0000166e 00000000
df80: c04cc000 c050278f c050278f 00000001 c04d444c 410fc0f4 c03649b0 00000000
dfa0: 00000001 c04cdfb8 c000f758 c000f75c 60070013 ffffffff
[<c00123c0>] (__irq_svc+0x40/0x70) from [<c000f75c>] (arch_cpu_idle+0x28/0x30)
[<c000f75c>] (arch_cpu_idle+0x28/0x30) from [<c0054888>] (cpu_startup_entry+0x5c/0x148)
[<c0054888>] (cpu_startup_entry+0x5c/0x148) from [<c0497aa4>] (start_kernel+0x334/0x38c)
BUG: spinlock lockup suspected on CPU#0, kworker/0:1/360
 lock: s3c24xx_serial_ports+0x1d8/0x370, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
CPU: 0 PID: 360 Comm: kworker/0:1 Not tainted 3.11.0-rc6-next-20130819-00003-g75485f1 #2
Workqueue: events flush_to_ldisc
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b581c>] (do_raw_spin_lock+0x100/0x17c)
[<c01b581c>] (do_raw_spin_lock+0x100/0x17c) from [<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28)
[<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28) from [<c0203224>] (uart_start+0x18/0x34)
[<c0203224>] (uart_start+0x18/0x34) from [<c01ef890>] (__receive_buf+0x4b4/0x738)
[<c01ef890>] (__receive_buf+0x4b4/0x738) from [<c01efb44>] (n_tty_receive_buf2+0x30/0x98)
[<c01efb44>] (n_tty_receive_buf2+0x30/0x98) from [<c01f2ba8>] (flush_to_ldisc+0xec/0x138)
[<c01f2ba8>] (flush_to_ldisc+0xec/0x138) from [<c0031af0>] (process_one_work+0xfc/0x348)
[<c0031af0>] (process_one_work+0xfc/0x348) from [<c0032138>] (worker_thread+0x138/0x37c)
[<c0032138>] (worker_thread+0x138/0x37c) from [<c0037a7c>] (kthread+0xa4/0xb0)
[<c0037a7c>] (kthread+0xa4/0xb0) from [<c000e5f8>] (ret_from_fork+0x14/0x3c)
-----

Release the port lock before calling tty_flip_buffer_push() and reacquire it
after the call.

Similar stuff was already done for few other drivers in the past, like:

commit 2389b272168ceec056ca1d8a870a97fa9c26e11a
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Tue May 29 21:53:50 2007 +0100

    [ARM] 4417/1: Serial: Fix AMBA drivers locking

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit f5693ea2710cd9f26ed2bfa917f7ed68e9502464)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
10 years agotty: serial: altera: drop uart_port->lock before calling tty_flip_buffer_push()
Viresh Kumar [Mon, 19 Aug 2013 14:44:06 +0000 (20:14 +0530)]
tty: serial: altera: drop uart_port->lock before calling tty_flip_buffer_push()

The current driver triggers a lockdep warning for if tty_flip_buffer_push() is
called with uart_port->lock locked. This never shows up on UP kernels and comes
up only on SMP kernels.

Crash looks like this (produced with samsung.c driver):

-----
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8)
[<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8) from [<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0)
[<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0x38) from [<c020a1a8>] (s3c24xx_serial_rx_chars+0)
[<c020a1a8>] (s3c24xx_serial_rx_chars+0x12c/0x260) from [<c020aae8>] (s3c64xx_serial_handle_irq+)
[<c020aae8>] (s3c64xx_serial_handle_irq+0x48/0x60) from [<c006aaa0>] (handle_irq_event_percpu+0x)
[<c006aaa0>] (handle_irq_event_percpu+0x50/0x194) from [<c006ac20>] (handle_irq_event+0x3c/0x5c)
[<c006ac20>] (handle_irq_event+0x3c/0x5c) from [<c006d864>] (handle_fasteoi_irq+0x80/0x13c)
[<c006d864>] (handle_fasteoi_irq+0x80/0x13c) from [<c006a4a4>] (generic_handle_irq+0x20/0x30)
[<c006a4a4>] (generic_handle_irq+0x20/0x30) from [<c000f454>] (handle_IRQ+0x38/0x94)
[<c000f454>] (handle_IRQ+0x38/0x94) from [<c0008538>] (gic_handle_irq+0x34/0x68)
[<c0008538>] (gic_handle_irq+0x34/0x68) from [<c00123c0>] (__irq_svc+0x40/0x70)
Exception stack(0xc04cdf70 to 0xc04cdfb8)
df60:                                     00000000 00000000 0000166e 00000000
df80: c04cc000 c050278f c050278f 00000001 c04d444c 410fc0f4 c03649b0 00000000
dfa0: 00000001 c04cdfb8 c000f758 c000f75c 60070013 ffffffff
[<c00123c0>] (__irq_svc+0x40/0x70) from [<c000f75c>] (arch_cpu_idle+0x28/0x30)
[<c000f75c>] (arch_cpu_idle+0x28/0x30) from [<c0054888>] (cpu_startup_entry+0x5c/0x148)
[<c0054888>] (cpu_startup_entry+0x5c/0x148) from [<c0497aa4>] (start_kernel+0x334/0x38c)
BUG: spinlock lockup suspected on CPU#0, kworker/0:1/360
 lock: s3c24xx_serial_ports+0x1d8/0x370, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
CPU: 0 PID: 360 Comm: kworker/0:1 Not tainted 3.11.0-rc6-next-20130819-00003-g75485f1 #2
Workqueue: events flush_to_ldisc
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b581c>] (do_raw_spin_lock+0x100/0x17c)
[<c01b581c>] (do_raw_spin_lock+0x100/0x17c) from [<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28)
[<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28) from [<c0203224>] (uart_start+0x18/0x34)
[<c0203224>] (uart_start+0x18/0x34) from [<c01ef890>] (__receive_buf+0x4b4/0x738)
[<c01ef890>] (__receive_buf+0x4b4/0x738) from [<c01efb44>] (n_tty_receive_buf2+0x30/0x98)
[<c01efb44>] (n_tty_receive_buf2+0x30/0x98) from [<c01f2ba8>] (flush_to_ldisc+0xec/0x138)
[<c01f2ba8>] (flush_to_ldisc+0xec/0x138) from [<c0031af0>] (process_one_work+0xfc/0x348)
[<c0031af0>] (process_one_work+0xfc/0x348) from [<c0032138>] (worker_thread+0x138/0x37c)
[<c0032138>] (worker_thread+0x138/0x37c) from [<c0037a7c>] (kthread+0xa4/0xb0)
[<c0037a7c>] (kthread+0xa4/0xb0) from [<c000e5f8>] (ret_from_fork+0x14/0x3c)
-----

Release the port lock before calling tty_flip_buffer_push() and reacquire it
after the call.

Similar stuff was already done for few other drivers in the past, like:

commit 2389b272168ceec056ca1d8a870a97fa9c26e11a
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Tue May 29 21:53:50 2007 +0100

    [ARM] 4417/1: Serial: Fix AMBA drivers locking

Cc: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit dd085ed8ef6c97e0b606da304856b36e0d3496ac)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
10 years agoarm64: Align the kbuild output for VDSOL and VDSOA
Ian Campbell [Tue, 15 Jul 2014 07:38:08 +0000 (08:38 +0100)]
arm64: Align the kbuild output for VDSOL and VDSOA

Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kbuild@vger.kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit ad789ba5f7086138461420d2156478d33fb61077)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: vdso: put vdso datapage in a separate vma
Will Deacon [Wed, 9 Jul 2014 18:22:11 +0000 (19:22 +0100)]
arm64: vdso: put vdso datapage in a separate vma

The VDSO datapage doesn't need to be executable (no code there) or
CoW-able (the kernel writes the page, so a private copy is totally
useless).

This patch moves the datapage into its own VMA, identified as "[vvar]"
in /proc/<pid>/maps.

Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 8715493852783358ef8656a0054a14bf822509cf)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: Remove duplicate (SWAPPER|IDMAP)_DIR_SIZE definitions
Catalin Marinas [Tue, 15 Jul 2014 14:46:02 +0000 (15:46 +0100)]
arm64: Remove duplicate (SWAPPER|IDMAP)_DIR_SIZE definitions

Just keep the asm/page.h definition as this is included in vmlinux.lds.S
as well.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
(cherry picked from commit b2f8c07bcb7d1a3575f41444d2d8048d0c922762)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: head.S: remove unnecessary function alignment
Mark Rutland [Tue, 24 Jun 2014 15:51:34 +0000 (16:51 +0100)]
arm64: head.S: remove unnecessary function alignment

Currently __turn_mmu_on is aligned to 64 bytes to ensure that it doesn't
span any page boundary, which simplifies the idmap and spares us
requiring an additional page table to map half of the function. In
keeping with other important requirements in architecture code, this
fact is undocumented.

Additionally, as the function consists of three instructions totalling
12 bytes with no literal pool data, a smaller alignment of 16 bytes
would be sufficient.

This patch reduces the alignment to 16 bytes and documents the
underlying reason for the alignment. This reduces the required alignment
of the entire .head.text section from 64 bytes to 16 bytes, though it
may still be aligned to a larger value depending on TEXT_OFFSET.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 909a4069da65a5cfca8c968edf9f0d99f694d2f3)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: export __cpu_{clear,copy}_user_page functions
Mark Salter [Tue, 17 Jun 2014 17:14:26 +0000 (18:14 +0100)]
arm64: export __cpu_{clear,copy}_user_page functions

The __cpu_clear_user_page() and __cpu_copy_user_page() functions
are not currently exported. This prevents modules from using
clear_user_page() and copy_user_page().

Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit bec7cedc8a92bfe96d32febe72634b30c63896bd)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoarm64: dts: Add more serial port nodes in APM X-Gene device tree
Vinayak Kale [Wed, 26 Mar 2014 12:19:06 +0000 (12:19 +0000)]
arm64: dts: Add more serial port nodes in APM X-Gene device tree

APM X-Gene Storm SoC supports 4 serial ports. This patch adds device nodes
for serial ports 1 to 3 (a device node for serial port 0 is already present
in the dts file).
This patch also sets the compatible property of serial nodes to "ns16550a".

Signed-off-by: Vinayak Kale <vkale@apm.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 457ced8458605f1935214289d44aabb80bf75756)
Signed-off-by: Mark Brown <broonie@linaro.org>
10 years agoMerge branch 'linux-linaro-lsk' into linux-linaro-lsk-android
Alex Shi [Fri, 18 Jul 2014 06:09:17 +0000 (14:09 +0800)]
Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-android

10 years agoMerge tag 'v3.10.49' into linux-linaro-lsk
Alex Shi [Fri, 18 Jul 2014 06:06:21 +0000 (14:06 +0800)]
Merge tag 'v3.10.49'  into linux-linaro-lsk

This is the 3.10.49 stable release

10 years agoLinux 3.10.49
Greg Kroah-Hartman [Thu, 17 Jul 2014 22:58:15 +0000 (15:58 -0700)]
Linux 3.10.49

10 years agoACPI / battery: Retry to get battery information if failed during probing
Lan Tianyu [Mon, 7 Jul 2014 07:47:12 +0000 (15:47 +0800)]
ACPI / battery: Retry to get battery information if failed during probing

commit 75646e758a0ecbed5024454507d5be5b9ea9dcbf upstream.

Some machines (eg. Lenovo Z480) ECs are not stable during boot up
and causes battery driver fails to be loaded due to failure of getting
battery information from EC sometimes. After several retries, the
operation will work. This patch is to retry to get battery information 5
times if the first try fails.

[ backport to 3.14.5: removed second parameter in acpi_battery_update(),
introduced by the commit 9e50bc14a7f58b5d8a55973b2d69355852ae2dae (ACPI /
battery: Accelerate battery resume callback)]

[naszar <naszar@ya.ru>: backport to 3.14.5]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=75581
Reported-and-tested-by: naszar <naszar@ya.ru>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agox86, ioremap: Speed up check for RAM pages
Roland Dreier [Fri, 2 May 2014 18:18:41 +0000 (11:18 -0700)]
x86, ioremap: Speed up check for RAM pages

commit c81c8a1eeede61e92a15103748c23d100880cc8a upstream.

In __ioremap_caller() (the guts of ioremap), we loop over the range of
pfns being remapped and checks each one individually with page_is_ram().
For large ioremaps, this can be very slow.  For example, we have a
device with a 256 GiB PCI BAR, and ioremapping this BAR can take 20+
seconds -- sometimes long enough to trigger the soft lockup detector!

Internally, page_is_ram() calls walk_system_ram_range() on a single
page.  Instead, we can make a single call to walk_system_ram_range()
from __ioremap_caller(), and do our further checks only for any RAM
pages that we find.  For the common case of MMIO, this saves an enormous
amount of work, since the range being ioremapped doesn't intersect
system RAM at all.

With this change, ioremap on our 256 GiB BAR takes less than 1 second.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Link: http://lkml.kernel.org/r/1399054721-1331-1-git-send-email-roland@kernel.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoScore: Modify the Makefile of Score, remove -mlong-calls for compiling
Lennox Wu [Sat, 14 Sep 2013 06:41:22 +0000 (14:41 +0800)]
Score: Modify the Makefile of Score, remove -mlong-calls for compiling

commit df9e4d1c39c472cb44d81ab2ed2db503fc486e3b upstream.

Signed-off-by: Lennox Wu <lennox.wu@gmail.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoScore: The commit is for compiling successfully.
Lennox Wu [Sat, 14 Sep 2013 05:48:37 +0000 (13:48 +0800)]
Score: The commit is for compiling successfully.

commit 5fbbf8a1a93452b26e7791cf32cefce62b0a480b upstream.

The modifications include:
 1. Kconfig of Score: we don't support ioremap
 2. Missed headfile including
 3. There are some errors in other people's commit not checked by us, we fix it now
 3.1 arch/score/kernel/entry.S: wrong instructions
 3.2 arch/score/kernel/process.c : just some typos

Signed-off-by: Lennox Wu <lennox.wu@gmail.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoScore: Implement the function csum_ipv6_magic
Lennox Wu [Sat, 14 Sep 2013 05:58:40 +0000 (13:58 +0800)]
Score: Implement the function csum_ipv6_magic

commit 1ed62ca648557b884d117a4a8bbcf2ae4e2d1153 upstream.

Signed-off-by: Lennox Wu <lennox.wu@gmail.com>
Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoscore: normalize global variables exported by vmlinux.lds
Jiang Liu [Wed, 3 Jul 2013 22:03:37 +0000 (15:03 -0700)]
score: normalize global variables exported by vmlinux.lds

commit ae49b83dcacfb69e22092cab688c415c2f2d870c upstream.

Generate mandatory global variables _sdata in file vmlinux.lds.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agortmutex: Plug slow unlock race
Thomas Gleixner [Wed, 11 Jun 2014 18:44:04 +0000 (18:44 +0000)]
rtmutex: Plug slow unlock race

commit 27e35715df54cbc4f2d044f681802ae30479e7fb upstream.

When the rtmutex fast path is enabled the slow unlock function can
create the following situation:

spin_lock(foo->m->wait_lock);
foo->m->owner = NULL;
     rt_mutex_lock(foo->m); <-- fast path
free = atomic_dec_and_test(foo->refcnt);
rt_mutex_unlock(foo->m); <-- fast path
if (free)
   kfree(foo);

spin_unlock(foo->m->wait_lock); <--- Use after free.

Plug the race by changing the slow unlock to the following scheme:

     while (!rt_mutex_has_waiters(m)) {
          /* Clear the waiters bit in m->owner */
    clear_rt_mutex_waiters(m);
           owner = rt_mutex_owner(m);
           spin_unlock(m->wait_lock);
           if (cmpxchg(m->owner, owner, 0) == owner)
              return;
           spin_lock(m->wait_lock);
     }

So in case of a new waiter incoming while the owner tries the slow
path unlock we have two situations:

 unlock(wait_lock);
lock(wait_lock);
 cmpxchg(p, owner, 0) == owner
           mark_rt_mutex_waiters(lock);
  acquire(lock);

Or:

 unlock(wait_lock);
lock(wait_lock);
  mark_rt_mutex_waiters(lock);
 cmpxchg(p, owner, 0) != owner
enqueue_waiter();
unlock(wait_lock);
 lock(wait_lock);
 wakeup_next waiter();
 unlock(wait_lock);
lock(wait_lock);
acquire(lock);

If the fast path is disabled, then the simple

   m->owner = NULL;
   unlock(m->wait_lock);

is sufficient as all access to m->owner is serialized via
m->wait_lock;

Also document and clarify the wakeup_next_waiter function as suggested
by Oleg Nesterov.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140611183852.937945560@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agortmutex: Handle deadlock detection smarter
Thomas Gleixner [Thu, 5 Jun 2014 10:34:23 +0000 (12:34 +0200)]
rtmutex: Handle deadlock detection smarter

commit 3d5c9340d1949733eb37616abd15db36aef9a57c upstream.

Even in the case when deadlock detection is not requested by the
caller, we can detect deadlocks. Right now the code stops the lock
chain walk and keeps the waiter enqueued, even on itself. Silly not to
yell when such a scenario is detected and to keep the waiter enqueued.

Return -EDEADLK unconditionally and handle it at the call sites.

The futex calls return -EDEADLK. The non futex ones dequeue the
waiter, throw a warning and put the task into a schedule loop.

Tagged for stable as it makes the code more robust.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brad Mouring <bmouring@ni.com>
Link: http://lkml.kernel.org/r/20140605152801.836501969@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>