firefly-linux-kernel-4.4.55.git
9 years agonet/ipv6/udp: Fix ipv6 multicast socket filter regression
Henning Rogge [Mon, 18 May 2015 19:08:49 +0000 (21:08 +0200)]
net/ipv6/udp: Fix ipv6 multicast socket filter regression

Commit <5cf3d46192fc> ("udp: Simplify__udp*_lib_mcast_deliver")
simplified the filter for incoming IPv6 multicast but removed
the check of the local socket address and the UDP destination
address.

This patch restores the filter to prevent sockets bound to a IPv6
multicast IP to receive other UDP traffic link unicast.

Signed-off-by: Henning Rogge <hrogge@gmail.com>
Fixes: 5cf3d46192fc ("udp: Simplify__udp*_lib_mcast_deliver")
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoxen/events: don't bind non-percpu VIRQs with percpu chip
David Vrabel [Tue, 19 May 2015 17:40:49 +0000 (18:40 +0100)]
xen/events: don't bind non-percpu VIRQs with percpu chip

A non-percpu VIRQ (e.g., VIRQ_CONSOLE) may be freed on a different
VCPU than it is bound to.  This can result in a race between
handle_percpu_irq() and removing the action in __free_irq() because
handle_percpu_irq() does not take desc->lock.  The interrupt handler
sees a NULL action and oopses.

Only use the percpu chip/handler for per-CPU VIRQs (like VIRQ_TIMER).

  # cat /proc/interrupts | grep virq
   40:      87246          0  xen-percpu-virq      timer0
   44:          0          0  xen-percpu-virq      debug0
   47:          0      20995  xen-percpu-virq      timer1
   51:          0          0  xen-percpu-virq      debug1
   69:          0          0   xen-dyn-virq      xen-pcpu
   74:          0          0   xen-dyn-virq      mce
   75:         29          0   xen-dyn-virq      hvc_console

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: <stable@vger.kernel.org>
9 years agoMerge tag 'nfs-for-4.1-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Linus Torvalds [Tue, 19 May 2015 18:20:48 +0000 (11:20 -0700)]
Merge tag 'nfs-for-4.1-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull two NFS client bugfixes from Trond Myklebust:
 "Highlights include:

   - fix a Linux-4.1 regression affecting stat()

   - take an extra reference to fl->fl_file when running a setlk"

* tag 'nfs-for-4.1-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  nfs: take extra reference to fl->fl_file when running a setlk
  nfs: stat(2) fails during cthon04 basic test5 on NFSv4.0

9 years agoMerge tag 'powerpc-4.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Linus Torvalds [Tue, 19 May 2015 18:19:49 +0000 (11:19 -0700)]
Merge tag 'powerpc-4.1-4' of git://git./linux/kernel/git/mpe/linux

Pull powerpc fixes from Michael Ellerman:

 - THP/hugetlb fixes from Aneesh.

 - MCE fix from Daniel.

 - TOC fix from Anton.

* tag 'powerpc-4.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
  powerpc: Align TOC to 256 bytes
  powerpc/mce: fix off by one errors in mce event handling
  powerpc/mm: Return NULL for not present hugetlb page
  powerpc/thp: Serialize pmd clear against a linux page table walk.

9 years agoMerge tag 'pwm/for-4.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry...
Linus Torvalds [Tue, 19 May 2015 18:18:14 +0000 (11:18 -0700)]
Merge tag 'pwm/for-4.1-rc5' of git://git./linux/kernel/git/thierry.reding/linux-pwm

Pull pwm fix from Thierry Reding:
 "A single fix to make the Pistachio driver respect the limits imposed
  by hardware"

* tag 'pwm/for-4.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
  pwm: img: Impose upper and lower timebase steps value

9 years agowatchdog: fix double lock in watchdog_nmi_enable_all
Michal Hocko [Tue, 19 May 2015 07:07:27 +0000 (09:07 +0200)]
watchdog: fix double lock in watchdog_nmi_enable_all

Commit ab992dc38f9a ("watchdog: Fix merge 'conflict'") has introduced an
obvious deadlock because of a typo.  watchdog_proc_mutex should be
unlocked on exit.

Thanks to Miroslav Benes who was staring at the code with me and noticed
this.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Duh-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoinet: properly align icsk_ca_priv
Eric Dumazet [Tue, 19 May 2015 00:06:14 +0000 (17:06 -0700)]
inet: properly align icsk_ca_priv

tcp_illinois and upcoming tcp_cdg require 64bit alignment of
icsk_ca_priv

x86 does not care, but other architectures might.

Fixes: 05cbc0db03e82 ("ipv4: Create probe timer for tcp PMTU as per RFC4821")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Fan Du <fan.du@intel.com>
Acked-by: Fan Du <fan.du@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agopwm: img: Impose upper and lower timebase steps value
Naidu Tellapati [Fri, 8 May 2015 21:47:31 +0000 (18:47 -0300)]
pwm: img: Impose upper and lower timebase steps value

The PWM hardware on Pistachio platform has a maximum timebase steps
value to 255. To fix it, let's introduce a compatible-specific
data structure to contain the SoC-specific details and use it to
specify a maximum timebase.

Also, let's limit the minimum timebase to 16 steps, to allow a sane
range of duty cycle steps.

Fixes: 277bb6a29e00 ("pwm: Imagination Technologies PWM DAC driver")
Signed-off-by: Naidu Tellapati <naidu.tellapati@imgtec.com>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@imgtec.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
9 years agodrm/exynos: dp: Lower level of EDID read success message
Krzysztof Kozlowski [Thu, 14 May 2015 00:03:06 +0000 (09:03 +0900)]
drm/exynos: dp: Lower level of EDID read success message

Don't pollute the dmesg with EDID read success message as an error.
Printing as debug should be fine.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
9 years agodrm/exynos: cleanup exynos_drm_plane
Tobias Jakobi [Sat, 25 Apr 2015 18:06:54 +0000 (20:06 +0200)]
drm/exynos: cleanup exynos_drm_plane

Remove the unused fields of struct exynos_drm_plane.

v2: Remove index_color as well, also unused (thanks Joonyoung).

Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
9 years agodrm/exynos: 'win' is always unsigned
Tobias Jakobi [Wed, 6 May 2015 12:10:22 +0000 (14:10 +0200)]
drm/exynos: 'win' is always unsigned

The index for the hardware layer is always >=0. Previous
code that also used -1 as special index is now gone.

Also apply this to 'ch_enabled' (decon/fimd), since the
variable is on the same line (and is again always unsigned).

Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
9 years agodrm/exynos: mixer: don't dump registers under spinlock
Tobias Jakobi [Wed, 6 May 2015 12:10:21 +0000 (14:10 +0200)]
drm/exynos: mixer: don't dump registers under spinlock

mixer_regs_dump() was called in mixer_run(), which was called
under the register spinlock in mixer_graph_buffer() and
vp_video_buffer().

This would trigger a sysmmu pagefault with drm.debug=0xff because
of the large delay caused by the register dumping.

To keep consistency also move register dumping out of mixer_stop(),
which is the counterpart to mixer_run().

Kernel dump:
[  131.296529] [drm:mixer_win_commit] win: 2
[  131.300693] [drm:mixer_regs_dump] MXR_STATUS = 00000081
[  131.305888] [drm:mixer_regs_dump] MXR_CFG = 000007d5
[  131.310835] [drm:mixer_regs_dump] MXR_INT_EN = 00000000
[  131.316043] [drm:mixer_regs_dump] MXR_INT_STATUS = 00000900
[  131.321598] [drm:mixer_regs_dump] MXR_LAYER_CFG = 00000321
[  131.327066] [drm:mixer_regs_dump] MXR_VIDEO_CFG = 00000000
[  131.332535] [drm:mixer_regs_dump] MXR_GRAPHIC0_CFG = 00310700
[  131.338263] [drm:mixer_regs_dump] MXR_GRAPHIC0_BASE = 20c00000
[  131.344079] [drm:mixer_regs_dump] MXR_GRAPHIC0_SPAN = 00000780
[  131.349895] [drm:mixer_regs_dump] MXR_GRAPHIC0_WH = 07800438
[  131.355537] [drm:mixer_regs_dump] MXR_GRAPHIC0_SXY = 00000000
[  131.361265] [drm:mixer_regs_dump] MXR_GRAPHIC0_DXY = 00000000
[  131.366994] [drm:mixer_regs_dump] MXR_GRAPHIC1_CFG = 00000000
[  131.372723] [drm:mixer_regs_dump] MXR_GRAPHIC1_BASE = 00000000
[  131.378539] [drm:mixer_regs_dump] MXR_GRAPHIC1_SPAN = 00000000
[  131.384354] [drm:mixer_regs_dump] MXR_GRAPHIC1_WH = 00000000
[  131.389996] [drm:mixer_regs_dump] MXR_GRAPHIC1_SXY = 00000000
[  131.395725] [drm:mixer_regs_dump] MXR_GRAPHIC1_DXY = 00000000
[  131.401486] PAGE FAULT occurred at 0x0 by 12e20000.sysmmu(Page table base: 0x6d990000)
[  131.409353]  Lv1 entry: 0x6e0f2401
[  131.412753] ------------[ cut here ]------------
[  131.417339] kernel BUG at drivers/iommu/exynos-iommu.c:358!
[  131.422894] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
[  131.428709] Modules linked in: ecb bridge stp llc bnep btrfs xor xor_neon zlib_inflate zlib_deflate raid6_pq btusb bluetooth usb_storage s5p_jpeg
videobuf2_dma_contig videobuf2_memops v4l2_mem2mem videobuf2_core
[  131.447461] CPU: 0 PID: 2418 Comm: lt-modetest Tainted: G        W       4.0.1-debug+ #3
[  131.455530] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[  131.461607] task: ee194100 ti: ec4fe000 task.ti: ec4fe000
[  131.466995] PC is at exynos_sysmmu_irq+0x2a0/0x2a8
[  131.471766] LR is at vprintk_emit+0x268/0x594
[  131.476103] pc : [<c02781a4>]    lr : [<c00650d0>]    psr: a00001d3
[  131.476103] sp : ec4ff9d8  ip : 00000000  fp : ec4ffa14
[  131.487559] r10: ffffffda  r9 : ee206e28  r8 : ee2d1a10
[  131.492767] r7 : 00000000  r6 : 00000000  r5 : 00000000  r4 : ee206e10
[  131.499277] r3 : c06fca20  r2 : 00000000  r1 : 00000000  r0 : ee28be00
[  131.505788] Flags: NzCv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment user
[  131.513079] Control: 10c5387d  Table: 6c72404a  DAC: 00000015
[  131.518808] Process lt-modetest (pid: 2418, stack limit = 0xec4fe218)
[  131.525231] Stack: (0xec4ff9d8 to 0xec500000)
[  131.529571] f9c0:                                                       ec4ff9e4 c03a0c40
[  131.537732] f9e0: bbfa6e35 6d990000 6d161c3d ee20a900 ee04a7e0 00000028 ee007000 00000000
[  131.545891] fa00: 00000000 c06fb1fc ec4ffa5c ec4ffa18 c0066a34 c0277f10 ee257664 0000000b
[  131.554050] fa20: ec4ffa5c c06fafbb ee04a780 c06fb1e8 00000000 ee04a780 ee04a7e0 ee20a900
[  131.562209] fa40: ee007000 00000015 ec4ffb48 ee008000 ec4ffa7c ec4ffa60 c0066c90 c00669e0
[  131.570369] fa60: 00020000 ee04a780 ee04a7e0 00001000 ec4ffa94 ec4ffa80 c0069c6c c0066c58
[  131.578528] fa80: 00000028 ee004450 ec4ffaac ec4ffa98 c0066028 c0069bac 000000a0 c06e19b4
[  131.586687] faa0: ec4ffad4 ec4ffab0 c0223678 c0066000 c02235dc 00000015 00000000 00000015
[  131.594846] fac0: ec4ffc80 00000001 ec4ffaec ec4ffad8 c0066028 c02235e8 00000089 c06bfc54
[  131.603005] fae0: ec4ffb1c ec4ffaf0 c006633c c0066000 ec4ffb48 f002000c 00000025 00000015
[  131.611165] fb00: c06c680c ec4ffb48 f0020000 ee008000 ec4ffb44 ec4ffb20 c000867c c00662c4
[  131.619324] fb20: c02046ac 60000153 ffffffff ec4ffb7c 00000000 00000101 ec4ffbb4 ec4ffb48
[  131.627483] fb40: c0013240 c0008650 00000001 ee257508 00000002 00000001 ee257504 ee257508
[  131.635642] fb60: 00000000 c06bf27c 00000000 00000101 ee008000 ec4ffbb4 00000000 ec4ffb90
[  131.643802] fb80: c002e124 c02046ac 60000153 ffffffff c002e09c 00000000 c06c6080 00000283
[  131.651960] fba0: 00000001 c06fb1ac ec4ffc0c ec4ffbb8 c002d690 c002e0a8 ee78d080 ee008000
[  131.660120] fbc0: 00400000 c04eb3b0 ffff7c44 c06c6100 c06fdac0 0000000a c06bf2f0 c06c6080
[  131.668279] fbe0: c06bfc54 c06bfc54 00000000 00000025 00000000 00000001 ec4ffc80 ee008000
[  131.676438] fc00: ec4ffc24 ec4ffc10 c002dbb8 c002d564 00000089 c06bfc54 ec4ffc54 ec4ffc28
[  131.684597] fc20: c0066340 c002dafc ec4ffc80 f002000c 0000001c 0000000c c06c680c ec4ffc80
[  131.692757] fc40: f0020000 00000080 ec4ffc7c ec4ffc58 c000867c c00662c4 c04e6624 60000053
[  131.700916] fc60: ffffffff ec4ffcb4 c072df54 ee22d010 ec4ffcdc ec4ffc80 c0013240 c0008650
[  131.709075] fc80: ee22d664 ee194100 00000000 ec4fe000 60000053 00000400 00000002 ee22d420
[  131.717234] fca0: c072df54 ee22d010 00000080 ec4ffcdc ec4ffcc8 ec4ffcc8 c04e6620 c04e6624
[  131.725393] fcc0: 60000053 ffffffff ec4fe000 c072df54 ec4ffd34 ec4ffce0 c02b64d0 c04e6618
[  131.733552] fce0: ec4ffcf8 00000000 00000000 60000053 00010000 00010000 00000000 200cb000
[  131.741712] fd00: 20080000 ee22d664 00000001 ee256000 ee261400 ee22d420 00000080 00000080
[  131.749871] fd20: ee256000 00000280 ec4ffd74 ec4ffd38 c02a8844 c02b5fec 00000080 00000280
[  131.758030] fd40: 000001e0 00000000 00000000 00000280 000001e0 ee22d220 01e00000 00000002
[  131.766189] fd60: ee22d420 ee261400 ec4ffdbc ec4ffd78 c0293cbc c02a87a4 00000080 00000280
[  131.774348] fd80: 000001e0 00000000 00000000 02800000 01e00000 ee261400 ee22d460 ee261400
[  131.782508] fda0: ee22d420 00000000 01e00000 000001e0 ec4ffe24 ec4ffdc0 c0297800 c0293b24
[  131.790667] fdc0: 00000080 00000280 000001e0 00000000 00000000 02800000 01e00000 ec4ffdf8
[  131.798826] fde0: c028db00 00000080 00000080 ee256000 02800000 00000000 ec4ffe24 c06c6448
[  131.806985] fe00: c072df54 000000b7 ee013800 ec4ffe54 edbf7300 ec4ffe54 ec4fff04 ec4ffe28
[  131.815145] fe20: c028a848 c029768c 00000001 c06195d8 ec4ffe5c ec4ffe40 c0297680 c0521f6c
[  131.823304] fe40: 00000030 bed45d38 00000030 c03064b7 ec4ffe8c 00000011 00000015 00000022
[  131.831463] fe60: 00000000 00000080 00000080 00000280 000001e0 00000000 00000000 01e00000
[  131.839622] fe80: 02800000 00000000 00000000 0004b000 00000000 00000000 c00121e4 c0011080
[  131.847781] fea0: c00110a4 00000000 00000000 00000000 ec4ffeec ec4ffec0 c00110f0 c00121cc
[  131.855940] fec0: 00000000 c00e7fec ec4ffeec ec4ffed8 c004af2c dc8ba201 edae4fc0 edbf7000
[  131.864100] fee0: edbf7000 00000003 bed45d38 00000003 bed45d38 ee3f2040 ec4fff7c ec4fff08
[  131.872259] ff00: c010b62c c028a684 edae4fc0 00000000 00000000 b6666000 ec40d108 edae4fc4
[  131.880418] ff20: ec4fff6c ec4fff30 c00e7fec c02207b0 000001f9 00000000 edae5008 ec40d110
[  131.888577] ff40: 00070800 edae5008 edae4fc0 00070800 b6666000 edbf7000 edbf7000 c03064b7
[  131.896736] ff60: bed45d38 00000003 ec4fe000 00000000 ec4fffa4 ec4fff80 c010b84c c010b208
[  131.904896] ff80: 00000022 00000000 bed45d38 c03064b7 00000036 c000ede4 00000000 ec4fffa8
[  131.913055] ffa0: c000ec40 c010b81c 00000000 bed45d38 00000003 c03064b7 bed45d38 00000022
[  131.921214] ffc0: 00000000 bed45d38 c03064b7 00000036 00000080 00000080 00000000 000001e0
[  131.929373] ffe0: b6da4064 bed45d1c b6d98968 b6e8082c 60000050 00000003 00000000 00000000
[  131.937529] Backtrace:
[  131.939967] [<c0277f04>] (exynos_sysmmu_irq) from [<c0066a34>] (handle_irq_event_percpu+0x60/0x278)
[  131.948988]  r10:c06fb1fc r9:00000000 r8:00000000 r7:ee007000 r6:00000028 r5:ee04a7e0
[  131.956799]  r4:ee20a900
[  131.959320] [<c00669d4>] (handle_irq_event_percpu) from [<c0066c90>] (handle_irq_event+0x44/0x64)
[  131.968170]  r10:ee008000 r9:ec4ffb48 r8:00000015 r7:ee007000 r6:ee20a900 r5:ee04a7e0
[  131.975982]  r4:ee04a780
[  131.978504] [<c0066c4c>] (handle_irq_event) from [<c0069c6c>] (handle_level_irq+0xcc/0x144)
[  131.986832]  r6:00001000 r5:ee04a7e0 r4:ee04a780 r3:00020000
[  131.992478] [<c0069ba0>] (handle_level_irq) from [<c0066028>] (generic_handle_irq+0x34/0x44)
[  132.000894]  r5:ee004450 r4:00000028
[  132.004459] [<c0065ff4>] (generic_handle_irq) from [<c0223678>] (combiner_handle_cascade_irq+0x9c/0x108)
[  132.013914]  r4:c06e19b4 r3:000000a0
[  132.017476] [<c02235dc>] (combiner_handle_cascade_irq) from [<c0066028>] (generic_handle_irq+0x34/0x44)
[  132.026847]  r8:00000001 r7:ec4ffc80 r6:00000015 r5:00000000 r4:00000015 r3:c02235dc
[  132.034576] [<c0065ff4>] (generic_handle_irq) from [<c006633c>] (__handle_domain_irq+0x84/0xf0)
[  132.043252]  r4:c06bfc54 r3:00000089
[  132.046815] [<c00662b8>] (__handle_domain_irq) from [<c000867c>] (gic_handle_irq+0x38/0x70)
[  132.055144]  r10:ee008000 r9:f0020000 r8:ec4ffb48 r7:c06c680c r6:00000015 r5:00000025
[  132.062956]  r4:f002000c r3:ec4ffb48
[  132.066520] [<c0008644>] (gic_handle_irq) from [<c0013240>] (__irq_svc+0x40/0x74)
[  132.073980] Exception stack(0xec4ffb48 to 0xec4ffb90)
[  132.079016] fb40:                   00000001 ee257508 00000002 00000001 ee257504 ee257508
[  132.087176] fb60: 00000000 c06bf27c 00000000 00000101 ee008000 ec4ffbb4 00000000 ec4ffb90
[  132.095333] fb80: c002e124 c02046ac 60000153 ffffffff
[  132.100367]  r9:00000101 r8:00000000 r7:ec4ffb7c r6:ffffffff r5:60000153 r4:c02046ac
[  132.108098] [<c002e09c>] (tasklet_hi_action) from [<c002d690>] (__do_softirq+0x138/0x38c)
[  132.116251]  r8:c06fb1ac r7:00000001 r6:00000283 r5:c06c6080 r4:00000000 r3:c002e09c
[  132.123980] [<c002d558>] (__do_softirq) from [<c002dbb8>] (irq_exit+0xc8/0x104)
[  132.131268]  r10:ee008000 r9:ec4ffc80 r8:00000001 r7:00000000 r6:00000025 r5:00000000
[  132.139080]  r4:c06bfc54
[  132.141600] [<c002daf0>] (irq_exit) from [<c0066340>] (__handle_domain_irq+0x88/0xf0)
[  132.149409]  r4:c06bfc54 r3:00000089
[  132.152971] [<c00662b8>] (__handle_domain_irq) from [<c000867c>] (gic_handle_irq+0x38/0x70)
[  132.161300]  r10:00000080 r9:f0020000 r8:ec4ffc80 r7:c06c680c r6:0000000c r5:0000001c
[  132.169112]  r4:f002000c r3:ec4ffc80
[  132.172675] [<c0008644>] (gic_handle_irq) from [<c0013240>] (__irq_svc+0x40/0x74)
[  132.180137] Exception stack(0xec4ffc80 to 0xec4ffcc8)
[  132.185173] fc80: ee22d664 ee194100 00000000 ec4fe000 60000053 00000400 00000002 ee22d420
[  132.193332] fca0: c072df54 ee22d010 00000080 ec4ffcdc ec4ffcc8 ec4ffcc8 c04e6620 c04e6624
[  132.201489] fcc0: 60000053 ffffffff
[  132.204961]  r9:ee22d010 r8:c072df54 r7:ec4ffcb4 r6:ffffffff r5:60000053 r4:c04e6624
[  132.212694] [<c04e660c>] (_raw_spin_unlock_irqrestore) from [<c02b64d0>] (mixer_win_commit+0x4f0/0xcc8)
[  132.222060]  r4:c072df54 r3:ec4fe000
[  132.225625] [<c02b5fe0>] (mixer_win_commit) from [<c02a8844>] (exynos_update_plane+0xac/0xb8)
[  132.234126]  r10:00000280 r9:ee256000 r8:00000080 r7:00000080 r6:ee22d420 r5:ee261400
[  132.241937]  r4:ee256000
[  132.244461] [<c02a8798>] (exynos_update_plane) from [<c0293cbc>] (__setplane_internal+0x1a4/0x2c0)
[  132.253395]  r7:ee261400 r6:ee22d420 r5:00000002 r4:01e00000
[  132.259041] [<c0293b18>] (__setplane_internal) from [<c0297800>] (drm_mode_setplane+0x180/0x244)
[  132.267804]  r9:000001e0 r8:01e00000 r7:00000000 r6:ee22d420 r5:ee261400 r4:ee22d460
[  132.275535] [<c0297680>] (drm_mode_setplane) from [<c028a848>] (drm_ioctl+0x1d0/0x58c)
[  132.283428]  r10:ec4ffe54 r9:edbf7300 r8:ec4ffe54 r7:ee013800 r6:000000b7 r5:c072df54
[  132.291240]  r4:c06c6448
[  132.293763] [<c028a678>] (drm_ioctl) from [<c010b62c>] (do_vfs_ioctl+0x430/0x614)
[  132.301222]  r10:ee3f2040 r9:bed45d38 r8:00000003 r7:bed45d38 r6:00000003 r5:edbf7000
[  132.309034]  r4:edbf7000
[  132.311555] [<c010b1fc>] (do_vfs_ioctl) from [<c010b84c>] (SyS_ioctl+0x3c/0x64)
[  132.318842]  r10:00000000 r9:ec4fe000 r8:00000003 r7:bed45d38 r6:c03064b7 r5:edbf7000
[  132.326654]  r4:edbf7000
[  132.329176] [<c010b810>] (SyS_ioctl) from [<c000ec40>] (ret_fast_syscall+0x0/0x34)
[  132.336723]  r8:c000ede4 r7:00000036 r6:c03064b7 r5:bed45d38 r4:00000000 r3:00000022
[  132.344451] Code: e3130002 0affffaf eb09a67d eaffffad (e7f001f2)
[  132.350528] ---[ end trace d428689b94df895c ]---
[  132.355126] Kernel panic - not syncing: Fatal exception in interrupt
[  132.361465] CPU2: stopping
[  132.364155] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G      D W       4.0.1-debug+ #3
[  132.371791] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[  132.377866] Backtrace:
[  132.380304] [<c0012484>] (dump_backtrace) from [<c001269c>] (show_stack+0x18/0x1c)
[  132.387849]  r6:c06e158c r5:ffffffff r4:00000000 r3:dc8ba201
[  132.393497] [<c0012684>] (show_stack) from [<c04dfb94>] (dump_stack+0x88/0xc8)
[  132.400698] [<c04dfb0c>] (dump_stack) from [<c0014894>] (handle_IPI+0x1c8/0x2c4)
[  132.408073]  r6:c06bfc54 r5:c06bfc54 r4:00000005 r3:ee0b0000
[  132.413718] [<c00146cc>] (handle_IPI) from [<c00086b0>] (gic_handle_irq+0x6c/0x70)
[  132.421267]  r9:f0028000 r8:ee0b1f48 r7:c06c680c r6:fffffff5 r5:00000005 r4:f002800c
[  132.428995] [<c0008644>] (gic_handle_irq) from [<c0013240>] (__irq_svc+0x40/0x74)
[  132.436457] Exception stack(0xee0b1f48 to 0xee0b1f90)
[  132.441493] 1f40:                   00000001 00000000 00000000 c00206c0 c06c6518 c04eb3a4
[  132.449653] 1f60: 00000000 00000000 c06c0dc0 00000001 c06fb774 ee0b1f9c ee0b1fa0 ee0b1f90
[  132.457811] 1f80: c000f82c c000f830 600f0053 ffffffff
[  132.462844]  r9:00000001 r8:c06c0dc0 r7:ee0b1f7c r6:ffffffff r5:600f0053 r4:c000f830
[  132.470575] [<c000f7f0>] (arch_cpu_idle) from [<c005b6e8>] (cpu_startup_entry+0x318/0x4ec)
[  132.478818] [<c005b3d0>] (cpu_startup_entry) from [<c00144d0>] (secondary_start_kernel+0xf4/0x100)
[  132.487755]  r7:c06fd440
[  132.490279] [<c00143dc>] (secondary_start_kernel) from [<40008744>] (0x40008744)
[  132.497651]  r4:6e09006a r3:c000872c
[  132.501210] CPU3: stopping
[  132.503904] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G      D W       4.0.1-debug+ #3
[  132.511539] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[  132.517614] Backtrace:
[  132.520051] [<c0012484>] (dump_backtrace) from [<c001269c>] (show_stack+0x18/0x1c)
[  132.527597]  r6:c06e158c r5:ffffffff r4:00000000 r3:dc8ba201
[  132.533243] [<c0012684>] (show_stack) from [<c04dfb94>] (dump_stack+0x88/0xc8)
[  132.540446] [<c04dfb0c>] (dump_stack) from [<c0014894>] (handle_IPI+0x1c8/0x2c4)
[  132.547820]  r6:c06bfc54 r5:c06bfc54 r4:00000005 r3:ee0b2000
[  132.553466] [<c00146cc>] (handle_IPI) from [<c00086b0>] (gic_handle_irq+0x6c/0x70)
[  132.561014]  r9:f002c000 r8:ee0b3f48 r7:c06c680c r6:fffffff5 r5:00000005 r4:f002c00c
[  132.568743] [<c0008644>] (gic_handle_irq) from [<c0013240>] (__irq_svc+0x40/0x74)
[  132.576205] Exception stack(0xee0b3f48 to 0xee0b3f90)
[  132.581241] 3f40:                   00000001 00000000 00000000 c00206c0 c06c6518 c04eb3a4
[  132.589401] 3f60: 00000000 00000000 c06c0dc0 00000001 c06fb774 ee0b3f9c ee0b3fa0 ee0b3f90
[  132.597558] 3f80: c000f82c c000f830 600f0053 ffffffff
[  132.602591]  r9:00000001 r8:c06c0dc0 r7:ee0b3f7c r6:ffffffff r5:600f0053 r4:c000f830
[  132.610321] [<c000f7f0>] (arch_cpu_idle) from [<c005b6e8>] (cpu_startup_entry+0x318/0x4ec)
[  132.618566] [<c005b3d0>] (cpu_startup_entry) from [<c00144d0>] (secondary_start_kernel+0xf4/0x100)
[  132.627503]  r7:c06fd440
[  132.630023] [<c00143dc>] (secondary_start_kernel) from [<40008744>] (0x40008744)
[  132.637399]  r4:6e09006a r3:c000872c
[  132.640958] CPU1: stopping
[  132.643651] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G      D W       4.0.1-debug+ #3
[  132.651287] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[  132.657362] Backtrace:
[  132.659799] [<c0012484>] (dump_backtrace) from [<c001269c>] (show_stack+0x18/0x1c)
[  132.667344]  r6:c06e158c r5:ffffffff r4:00000000 r3:dc8ba201
[  132.672991] [<c0012684>] (show_stack) from [<c04dfb94>] (dump_stack+0x88/0xc8)
[  132.680194] [<c04dfb0c>] (dump_stack) from [<c0014894>] (handle_IPI+0x1c8/0x2c4)
[  132.687569]  r6:c06bfc54 r5:c06bfc54 r4:00000005 r3:ee0ae000
[  132.693214] [<c00146cc>] (handle_IPI) from [<c00086b0>] (gic_handle_irq+0x6c/0x70)
[  132.700762]  r9:f0024000 r8:ee0aff48 r7:c06c680c r6:fffffff5 r5:00000005 r4:f002400c
[  132.708491] [<c0008644>] (gic_handle_irq) from [<c0013240>] (__irq_svc+0x40/0x74)
[  132.715953] Exception stack(0xee0aff48 to 0xee0aff90)
[  132.720989] ff40:                   00000001 00000000 00000000 c00206c0 c06c6518 c04eb3a4
[  132.729149] ff60: 00000000 00000000 c06c0dc0 00000001 c06fb774 ee0aff9c ee0affa0 ee0aff90
[  132.737306] ff80: c000f82c c000f830 60070053 ffffffff
[  132.742339]  r9:00000001 r8:c06c0dc0 r7:ee0aff7c r6:ffffffff r5:60070053 r4:c000f830
[  132.750069] [<c000f7f0>] (arch_cpu_idle) from [<c005b6e8>] (cpu_startup_entry+0x318/0x4ec)
[  132.758314] [<c005b3d0>] (cpu_startup_entry) from [<c00144d0>] (secondary_start_kernel+0xf4/0x100)
[  132.767251]  r7:c06fd440
[  132.769772] [<c00143dc>] (secondary_start_kernel) from [<40008744>] (0x40008744)
[  132.777146]  r4:6e09006a r3:c000872c
[  132.780709] ---[ end Kernel panic - not syncing: Fatal exception in interrupt

Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
9 years agodrm/exynos: Consolidate return statements in fimd_bind()
Krzysztof Kozlowski [Thu, 7 May 2015 00:04:46 +0000 (09:04 +0900)]
drm/exynos: Consolidate return statements in fimd_bind()

Simplify the code and remove superfluous return statement. Just return
the result of fimd_iommu_attach_devices().

Signed-off-by: Krzysztof Kozlowski <k.kozlowski.k@gmail.com>
Reviewed-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
9 years agodrm/exynos: Constify exynos_drm_crtc_ops
Krzysztof Kozlowski [Thu, 7 May 2015 00:04:45 +0000 (09:04 +0900)]
drm/exynos: Constify exynos_drm_crtc_ops

The Exynos DRM code does not modify the ops provided by CRTC driver in
exynos_drm_crtc_create() call.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski.k@gmail.com>
Reviewed-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
9 years agodrm/exynos: Fix build breakage on !DRM_EXYNOS_FIMD
Krzysztof Kozlowski [Thu, 7 May 2015 00:04:44 +0000 (09:04 +0900)]
drm/exynos: Fix build breakage on !DRM_EXYNOS_FIMD

Disabling the CONFIG_DRM_EXYNOS_FIMD (e.g. by enabling of CONFIG_FB_S3C)
leads to build error:

drivers/built-in.o: In function `exynos_dp_dpms':
binder.c:(.text+0xd6a840): undefined reference to `fimd_dp_clock_enable'
binder.c:(.text+0xd6ab54): undefined reference to `fimd_dp_clock_enable'

Fix this by changing direct call to fimd_dp_clock_enable() into optional
call to exynos_drm_crtc_ops->clock_enable(). Only the DRM_EXYNOS_FIMD
implements this op.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski.k@gmail.com>
Reviewed-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
9 years agodrm/exynos: mixer: Constify platform_device_id
Krzysztof Kozlowski [Fri, 1 May 2015 15:56:36 +0000 (00:56 +0900)]
drm/exynos: mixer: Constify platform_device_id

The platform_device_id is not modified by the driver and core uses it as
const.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski.k@gmail.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
9 years agodrm/exynos: mixer: cleanup pixelformat handling
Tobias Jakobi [Mon, 27 Apr 2015 21:11:59 +0000 (23:11 +0200)]
drm/exynos: mixer: cleanup pixelformat handling

Move the defines for the pixelformats that the mixer supports out
of mixer_graph_buffer() to the top of the source.
Then select the mixer pixelformat (pf) in mixer_graph_buffer() based on
the plane's pf (and not bpp).
Also add handling of RGB565 and XRGB1555 to the switch statement and
exit early if the plane has an unsupported pf.

Partially based on 'drm/exynos: enable/disable blend based on pixel
format' by Gustavo Padovan <gustavo.padovan@collabora.co.uk>.

v2: Use the shorter MXR_FORMAT as prefix.
v3: Re-add ARGB8888 because of compatibility reasons
    (suggested by Joonyoung Shim).

Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Acked-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
9 years agodrm/exynos: mixer: also allow NV21 for the video processor
Tobias Jakobi [Mon, 27 Apr 2015 21:10:16 +0000 (23:10 +0200)]
drm/exynos: mixer: also allow NV21 for the video processor

All the necessary code is already there, just need to
handle the format in the switch statement.

Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Acked-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
9 years agodrm/exynos: mixer: remove buffer count handling in vp_video_buffer()
Tobias Jakobi [Mon, 27 Apr 2015 21:10:15 +0000 (23:10 +0200)]
drm/exynos: mixer: remove buffer count handling in vp_video_buffer()

The video processor (VP) supports four formats: NV12, NV21 and its
tiled variants. All these formats are bi-planar, so the buffer
count in vp_video_buffer() is always 2.

Also properly exit if we're called with an invalid (non-VP) pixelformat.

Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Acked-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
9 years agodrm/exynos: plane: honor buffer offset for dma_addr
Tobias Jakobi [Mon, 27 Apr 2015 21:10:14 +0000 (23:10 +0200)]
drm/exynos: plane: honor buffer offset for dma_addr

Previously we were ignoring the buffer offsets that are
passed through the addfb2 ioctl. This didn't cause any
major issues, since for uni-planar formats (like XRGB8888)
userspace would most of the time just use offsets[0]=0.

However with NV12 offsets[1] is very likely non-zero.
So properly apply the offsets to our dma addresses.

Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Acked-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
9 years agodrm/exynos: fb: use drm_format_num_planes to get buffer count
Tobias Jakobi [Mon, 27 Apr 2015 21:10:13 +0000 (23:10 +0200)]
drm/exynos: fb: use drm_format_num_planes to get buffer count

The previous code had some special case handling for the buffer
count in exynos_drm_format_num_buffers().

This code was incorrect though, since this special case doesn't
exist for DRM. It stemmed from the existence of the special NV12M
V4L2 format. NV12 is a bi-planar format (separate planes for luma
and chroma) and V4L2 differentiates between a NV12 buffer where
luma and chroma is contiguous in memory (so no data between
luma/chroma), and a NV12 buffer where luma and chroma have two
explicit memory locations (which is then called NV12M).

This distinction doesn't exist for DRM. A bi-planar format always
explicitly comes with the information about its two planes (even
if these planes should be contiguous).

Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Acked-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
9 years agox86/fpu: Reorganize fpu/internal.h
Ingo Molnar [Tue, 5 May 2015 13:56:33 +0000 (15:56 +0200)]
x86/fpu: Reorganize fpu/internal.h

fpu/internal.h has grown organically, with not much high level structure,
which hurts its readability.

Organize the various definitions into 5 sections:

 - high level FPU state functions
 - FPU/CPU feature flag helpers
 - fpstate handling functions
 - FPU context switching helpers
 - misc helper functions

Other related changes:

 - Move MXCSR_DEFAULT to fpu/types.h.
 - drop the unused X87_FSW_ES define

No change in functionality.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Add CONFIG_X86_DEBUG_FPU=y FPU debugging code
Ingo Molnar [Tue, 5 May 2015 09:34:49 +0000 (11:34 +0200)]
x86/fpu: Add CONFIG_X86_DEBUG_FPU=y FPU debugging code

There are various internal FPU state debugging checks that never
trigger in practice, but which are useful for FPU code development.

Separate these out into CONFIG_X86_DEBUG_FPU=y, and also add a
couple of new ones.

The size difference is about 0.5K of code on defconfig:

   text        data     bss          filename
   15028906    2578816  1638400      vmlinux
   15029430    2578816  1638400      vmlinux

( Keep this enabled by default until the new FPU code is debugged. )

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Fix the 'nofxsr' boot parameter to also clear X86_FEATURE_FXSR_OPT
Ingo Molnar [Tue, 5 May 2015 09:43:02 +0000 (11:43 +0200)]
x86/fpu: Fix the 'nofxsr' boot parameter to also clear X86_FEATURE_FXSR_OPT

I tried to simulate an ancient CPU via this option, and
found that it still has fxsr_opt enabled, confusing the
FPU code.

Make the 'nofxsr' option also clear FXSR_OPT flag.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Pass 'struct fpu' to fpu__restore()
Ingo Molnar [Mon, 4 May 2015 09:49:58 +0000 (11:49 +0200)]
x86/fpu: Pass 'struct fpu' to fpu__restore()

This cleans up the call sites and the function a bit,
and also makes it more symmetric with the other high
level FPU state handling functions.

It's still only valid for the current task, as we copy
to the FPU registers of the current CPU.

No change in functionality.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu/init: Propagate __init annotations
Ingo Molnar [Mon, 4 May 2015 07:52:42 +0000 (09:52 +0200)]
x86/fpu/init: Propagate __init annotations

Now that all the FPU init function call dependencies are
cleaned up we can propagate __init annotations deeper.

This shrinks the runtime size of the kernel a bit, and
also addresses a few section warnings.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu/xstate: Clean up setup_xstate_comp() call
Ingo Molnar [Mon, 4 May 2015 07:43:55 +0000 (09:43 +0200)]
x86/fpu/xstate: Clean up setup_xstate_comp() call

So call setup_xstate_comp() from the xstate init code, not
from the generic fpu__init_system() code.

This allows us to remove the protytype from xstate.h as well.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Clean up xstate feature reservation
Ingo Molnar [Mon, 4 May 2015 07:04:56 +0000 (09:04 +0200)]
x86/fpu: Clean up xstate feature reservation

Put MPX support into its separate high level structure, and
also replace the fixed YMM, LWP and MPX structures in
xregs_state with just reservations - their exact offsets
in the structure will depend on the CPU and no code actually
relies on those fields.

No change in functionality.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu/xstate: Don't assume the first zero xfeatures zero bit means the end
Ingo Molnar [Mon, 4 May 2015 05:37:47 +0000 (07:37 +0200)]
x86/fpu/xstate: Don't assume the first zero xfeatures zero bit means the end

The current xstate code in setup_xstate_features() assumes that
the first zero bit means the end of xfeatures - but that is not
so, the SDM clearly states that an arbitrary set of xfeatures
might be enabled - and it is also clear from the description
of the compaction feature that holes are possible:

  "13-6 Vol. 1MANAGING STATE USING THE XSAVE FEATURE SET
  [...]

  Compacted format. Each state component i (i â‰¥ 2) is located at a byte
  offset from the base address of the XSAVE area based on the XCOMP_BV
  field in the XSAVE header:

  â€” If XCOMP_BV[i] = 0, state component i is not in the XSAVE area.

  â€” If XCOMP_BV[i] = 1, the following items apply:

  â€¢ If XCOMP_BV[j] = 0 for every j, 2 â‰¤ j < i, state component i is
    located at a byte offset 576 from the base address of the XSAVE
    area. (This item applies if i is the first bit set in bits 62:2 of
    the XCOMP_BV; it implies that state component i is located at the
    beginning of the extended region.)

  â€¢ Otherwise, let j, 2 â‰¤ j < i, be the greatest value such that
    XCOMP_BV[j] = 1. Then state component i is located at a byte offset
    X from the location of state component j, where X is the number of
    bytes required for state component j as enumerated in
    CPUID.(EAX=0DH,ECX=j):EAX. (This item implies that state component i
    immediately follows the preceding state component whose bit is set
    in XCOMP_BV.)"

So don't assume that the first zero xfeatures bit means the end of
all xfeatures - iterate through all of them.

I'm not aware of hardware that triggers this currently.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Move debugging check from kernel_fpu_begin() to __kernel_fpu_begin()
Ingo Molnar [Fri, 1 May 2015 08:54:22 +0000 (10:54 +0200)]
x86/fpu: Move debugging check from kernel_fpu_begin() to __kernel_fpu_begin()

kernel_fpu_begin() is __kernel_fpu_begin() with a preempt_disable().

Move the kernel_fpu_begin() debugging check into __kernel_fpu_begin(),
so that users of __kernel_fpu_begin() may benefit from it as well.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Document the various fpregs state formats
Ingo Molnar [Sat, 2 May 2015 08:22:45 +0000 (10:22 +0200)]
x86/fpu: Document the various fpregs state formats

Document all the structures that make up 'struct fpu'.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Change fpu->fpregs_active from 'int' to 'char', add lazy switching comments
Ingo Molnar [Fri, 1 May 2015 07:59:04 +0000 (09:59 +0200)]
x86/fpu: Change fpu->fpregs_active from 'int' to 'char', add lazy switching comments

Improve the memory layout of 'struct fpu':

 - change ->fpregs_active from 'int' to 'char' - it's just a single flag
   and modern x86 CPUs can do efficient byte accesses.

 - pack related fields closer to each other: often 'fpu->state' will not be
   touched, while the other fields will - so pack them into a group.

Also add comments to each field, describing their purpose, and add
some background information about lazy restores.

Also fix an obsolete, lazy switching related comment in fpu_copy()'s description.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Harmonize FPU register state types
Ingo Molnar [Thu, 30 Apr 2015 15:15:32 +0000 (17:15 +0200)]
x86/fpu: Harmonize FPU register state types

Use these consistent names:

    struct fregs_state           # was: i387_fsave_struct
    struct fxregs_state          # was: i387_fxsave_struct
    struct swregs_state          # was: i387_soft_struct
    struct xregs_state           # was: xsave_struct
    union  fpregs_state          # was: thread_xstate

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Factor out the FPU regset code into fpu/regset.c
Ingo Molnar [Thu, 30 Apr 2015 10:59:30 +0000 (12:59 +0200)]
x86/fpu: Factor out the FPU regset code into fpu/regset.c

So much of fpu/core.c is the regset code, but it just obscures the generic
FPU state machine logic. Factor out the regset code into fpu/regset.c, where
it can be read in isolation.

This affects one API: fpu__activate_stopped() has to be made available
from the core to fpu/regset.c.

No change in functionality.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Factor out fpu/signal.c
Ingo Molnar [Thu, 30 Apr 2015 10:45:38 +0000 (12:45 +0200)]
x86/fpu: Factor out fpu/signal.c

fpu/xstate.c has a lot of generic FPU signal frame handling routines,
move them into a separate file: fpu/signal.c.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Rename all the fpregs, xregs, fxregs and fregs handling functions
Ingo Molnar [Thu, 30 Apr 2015 09:34:09 +0000 (11:34 +0200)]
x86/fpu: Rename all the fpregs, xregs, fxregs and fregs handling functions

Standardize the naming of the various functions that copy register
content in specific FPU context formats:

  copy_fxregs_to_kernel()         # was: fpu_fxsave()
  copy_xregs_to_kernel()          # was: xsave_state()

  copy_kernel_to_fregs()          # was: frstor_checking()
  copy_kernel_to_fxregs()         # was: fxrstor_checking()
  copy_kernel_to_xregs()          # was: fpu_xrstor_checking()
  copy_kernel_to_xregs_booting()  # was: xrstor_state_booting()

  copy_fregs_to_user()            # was: fsave_user()
  copy_fxregs_to_user()           # was: fxsave_user()
  copy_xregs_to_user()            # was: xsave_user()

  copy_user_to_fregs()            # was: frstor_user()
  copy_user_to_fxregs()           # was: fxrstor_user()
  copy_user_to_xregs()            # was: xrestore_user()
  copy_user_to_fpregs_zeroing()   # was: restore_user_xstate()

Eliminate fpu_xrstor_checking(), because it was just a wrapper.

No change in functionality.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Move restore_init_xstate() out of fpu/internal.h
Ingo Molnar [Thu, 30 Apr 2015 09:21:59 +0000 (11:21 +0200)]
x86/fpu: Move restore_init_xstate() out of fpu/internal.h

Move restore_init_xstate() next to its sole caller.

Also rename it to copy_init_fpstate_to_fpregs() and add
some comments about what it does.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Generalize 'init_xstate_ctx'
Ingo Molnar [Thu, 30 Apr 2015 09:07:06 +0000 (11:07 +0200)]
x86/fpu: Generalize 'init_xstate_ctx'

So the handling of init_xstate_ctx has a layering violation: both
'struct xsave_struct' and 'union thread_xstate' have a
'struct i387_fxsave_struct' member:

   xsave_struct::i387
   thread_xstate::fxsave

The handling of init_xstate_ctx is generic, it is used on all
CPUs, with or without XSAVE instruction. So it's confusing how
the generic code passes around and handles an XSAVE specific
format.

What we really want is for init_xstate_ctx to be a proper
fpstate and we use its ::fxsave and ::xsave members, as
appropriate.

Since the xsave_struct::i387 and thread_xstate::fxsave aliases
each other this is not a functional problem.

So implement this, and move init_xstate_ctx to the generic FPU
code in the process.

Also, since init_xstate_ctx is not XSAVE specific anymore,
rename it to init_fpstate, and mark it __read_mostly,
because it's only modified once during bootup, and used
as a reference fpstate later on.

There's no change in functionality.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Create 'union thread_xstate' helper for fpstate_init()
Ingo Molnar [Thu, 30 Apr 2015 08:23:42 +0000 (10:23 +0200)]
x86/fpu: Create 'union thread_xstate' helper for fpstate_init()

fpstate_init() only uses fpu->state, so pass that in to it.

This enables the cleanup we will do in the next patch.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Harmonize the names of the fpstate_init() helper functions
Ingo Molnar [Thu, 30 Apr 2015 08:08:36 +0000 (10:08 +0200)]
x86/fpu: Harmonize the names of the fpstate_init() helper functions

Harmonize the inconsistent naming of these related functions:

                          fpstate_init()
  finit_soft_fpu()   =>   fpstate_init_fsoft()
  fx_finit()         =>   fpstate_init_fxstate()
  fx_finit()         =>   fpstate_init_fstate()       # split out

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Factor out the exception error code handling code
Ingo Molnar [Thu, 30 Apr 2015 07:29:38 +0000 (09:29 +0200)]
x86/fpu: Factor out the exception error code handling code

Factor out the FPU error code handling code from traps.c and fpu/internal.h
and move them close to each other.

Also convert the helper functions to 'struct fpu *', which further simplifies
them.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Remove run-once init quirks
Ingo Molnar [Thu, 30 Apr 2015 07:17:50 +0000 (09:17 +0200)]
x86/fpu: Remove run-once init quirks

Remove various boot quirks that came from the old code.

The new code is cleanly split up into per-system and per-cpu
init sequences, and system init functions are only called once.

Remove the run-once quirks.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Factor out fpu/regset.h from fpu/internal.h
Ingo Molnar [Thu, 30 Apr 2015 06:53:18 +0000 (08:53 +0200)]
x86/fpu: Factor out fpu/regset.h from fpu/internal.h

Only a few places use the regset definitions, so factor them out.

Also fix related header dependency assumptions.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Split out fpu/signal.h from fpu/internal.h for signal frame handling functions
Ingo Molnar [Thu, 30 Apr 2015 06:45:02 +0000 (08:45 +0200)]
x86/fpu: Split out fpu/signal.h from fpu/internal.h for signal frame handling functions

Most of the FPU does not use them, so split it out and include
them in signal.c and ia32_signal.c

Also fix header file dependency assumption in fpu/core.c.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Move is_ia32*frame() helpers out of fpu/internal.h
Ingo Molnar [Thu, 30 Apr 2015 05:26:04 +0000 (07:26 +0200)]
x86/fpu: Move is_ia32*frame() helpers out of fpu/internal.h

Move them to their only user. This makes the code easier to read,
the header is less cluttered, and it also speeds up the build a bit.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Merge fpu__reset() and fpu__clear()
Ingo Molnar [Thu, 30 Apr 2015 05:12:46 +0000 (07:12 +0200)]
x86/fpu: Merge fpu__reset() and fpu__clear()

With recent cleanups and fixes the fpu__reset() and fpu__clear()
functions have become almost identical in functionality: the only
difference is that fpu__reset() assumed that the fpstate
was already active in the eagerfpu case, while fpu__clear()
activated it if it was inactive.

This distinction almost never matters, the only case where such
fpstate activation happens if if the init thread (PID 1) gets exec()-ed
for the first time.

So keep fpu__clear() and change all fpu__reset() uses to
fpu__clear() to simpify the logic.

( In a later patch we'll further simplify fpu__clear() by making
  sure that all contexts it is called on are already active. )

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Move the signal frame handling code closer to each other
Ingo Molnar [Wed, 29 Apr 2015 19:09:18 +0000 (21:09 +0200)]
x86/fpu: Move the signal frame handling code closer to each other

Consolidate more signal frame related functions:

   text      data    bss     dec       filename
   14108070  2575280 1634304 18317654  vmlinux.before
   14107944  2575344 1634304 18317592  vmlinux.after

Also, while moving it, rename alloc_mathframe() to fpu__alloc_mathframe().

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Rename restore_xstate_sig() to fpu__restore_sig()
Ingo Molnar [Wed, 29 Apr 2015 18:55:19 +0000 (20:55 +0200)]
x86/fpu: Rename restore_xstate_sig() to fpu__restore_sig()

restore_xstate_sig() is a misnomer: it's not limited to 'xstate' at all,
it is the high level 'restore FPU state from a signal frame' function
that works with all legacy FPU formats as well.

Rename it (and its helper) accordingly, and also move it to the
fpu__*() namespace.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Move fpu__clear() to 'struct fpu *' parameter passing
Ingo Molnar [Wed, 29 Apr 2015 18:35:33 +0000 (20:35 +0200)]
x86/fpu: Move fpu__clear() to 'struct fpu *' parameter passing

Do it like all other high level FPU state handling functions: they
only know about struct fpu, not about the task.

(Also remove a dead prototype while at it.)

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Move all the fpu__*() high level methods closer to each other
Ingo Molnar [Wed, 29 Apr 2015 18:24:14 +0000 (20:24 +0200)]
x86/fpu: Move all the fpu__*() high level methods closer to each other

The fpu__*() methods are closely related, but they are defined
in scattered places within the FPU code.

Concentrate them, and also uninline fpu__save(), fpu__drop()
and fpu__reset() to save about 5K of kernel text on 64-bit kernels:

   text            data    bss     dec        filename
   14113063        2575280 1634304 18322647   vmlinux.before
   14108070        2575280 1634304 18317654   vmlinux.after

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Rename restore_fpu_checking() to copy_fpstate_to_fpregs()
Ingo Molnar [Wed, 29 Apr 2015 18:10:43 +0000 (20:10 +0200)]
x86/fpu: Rename restore_fpu_checking() to copy_fpstate_to_fpregs()

fpu_restore_checking() is a helper function of restore_fpu_checking(),
but this is not apparent from the naming.

Both copy fpstate contents to fpregs, while the fuller variant does
a full copy without leaking information.

So rename them to:

    copy_fpstate_to_fpregs()
  __copy_fpstate_to_fpregs()

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Synchronize the naming of drop_fpu() and fpu_reset_state()
Ingo Molnar [Wed, 29 Apr 2015 17:04:31 +0000 (19:04 +0200)]
x86/fpu: Synchronize the naming of drop_fpu() and fpu_reset_state()

drop_fpu() and fpu_reset_state() are similar in functionality
and in scope, yet this is not apparent from their names.

drop_fpu() deactivates FPU contents (both the fpregs and the fpstate),
but leaves register contents intact in the eager-FPU case, mostly as an
optimization. It disables fpregs in the lazy FPU case. The drop_fpu()
method can be used to destroy FPU state in an optimized way, when we
know that a new state will be loaded before user-space might see
any remains of the old FPU state:

     - such as in sys_exit()'s exit_thread() where we know this task
       won't execute any user-space instructions anymore and the
       next context switch cleans up the FPU. The old FPU state
       might still be around in the eagerfpu case but won't be
       saved.

     - in __restore_xstate_sig(), where we use drop_fpu() before
       copying a new state into the fpstate and activating that one.
       No user-pace instructions can execute between those steps.

     - in sys_execve()'s fpu__clear(): there we use drop_fpu() in
       the !eagerfpu case, where it's equivalent to a full reinit.

fpu_reset_state() is a stronger version of drop_fpu(): both in
the eagerfpu and the lazy-FPU case it guarantees that fpregs
are reinitialized to init state. This method is used in cases
where we need a full reset:

     - handle_signal() uses fpu_reset_state() to reset the FPU state
       to init before executing a user-space signal handler. While we
       have already saved the original FPU state at this point, and
       always restore the original state, the signal handling code
       still has to do this reinit, because signals may interrupt
       any user-space instruction, and the FPU might be in various
       intermediate states (such as an unbalanced x87 stack) that is
       not immediately usable for general C signal handler code.

     - __restore_xstate_sig() uses fpu_reset_state() when the signal
       frame has no FP context. Since the signal handler may have
       modified the FPU state, it gets reset back to init state.

     - in another branch __restore_xstate_sig() uses fpu_reset_state()
       to handle a restoration error: when restore_user_xstate() fails
       to restore FPU state and we might have inconsistent FPU data,
       fpu_reset_state() is used to reset it back to a known good
       state.

     - __kernel_fpu_end() uses fpu_reset_state() in an error branch.
       This is in a 'must not trigger' error branch, so on bug-free
       kernels this never triggers.

     - fpu__restore() uses fpu_reset_state() in an error path
       as well: if the fpstate was set up with invalid FPU state
       (via ptrace or via a signal handler), then it's reset back
       to init state.

     - likewise, the scheduler's switch_fpu_finish() uses it in a
       restoration error path too.

Move both drop_fpu() and fpu_reset_state() to the fpu__*() namespace
and harmonize their naming with their function:

    fpu__drop()
    fpu__reset()

This clearly shows that both methods operate on the full state of the
FPU, just like fpu__restore().

Also add comments to explain what each function does.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/alternatives, x86/fpu: Add 'alternatives_patched' debug flag and use it in xsave_...
Ingo Molnar [Thu, 30 Apr 2015 07:09:26 +0000 (09:09 +0200)]
x86/alternatives, x86/fpu: Add 'alternatives_patched' debug flag and use it in xsave_state()

We'd like to use xsave_state() earlier, but its SYSTEM_BOOTING check
is too imprecise.

The real condition that xsave_state() would like to check is whether
alternative XSAVE instructions were patched into the kernel image
already.

Add such a (read-mostly) debug flag and use it in xsave_state().

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Better document fpu__clear() state handling
Ingo Molnar [Wed, 29 Apr 2015 06:46:26 +0000 (08:46 +0200)]
x86/fpu: Better document fpu__clear() state handling

So prior to this fix:

  c88d47480d30 ("x86/fpu: Always restore_xinit_state() when use_eager_cpu()")

we leaked FPU state across execve() boundaries on eagerfpu systems:

$ /host/home/mingo/dump-xmm-regs-exec
# XMM state before execve():
XMM0 : 000000000000dede
XMM1 : 000000000000dedf
XMM2 : 000000000000dee0
XMM3 : 000000000000dee1
XMM4 : 000000000000dee2
XMM5 : 000000000000dee3
XMM6 : 000000000000dee4
XMM7 : 000000000000dee5
XMM8 : 000000000000dee6
XMM9 : 000000000000dee7
XMM10: 000000000000dee8
XMM11: 000000000000dee9
XMM12: 000000000000deea
XMM13: 000000000000deeb
XMM14: 000000000000deec
XMM15: 000000000000deed

# XMM state after execve(), in the new task context:
XMM0 : 0000000000000000
XMM1 : 2f2f2f2f2f2f2f2f
XMM2 : 0000000000000000
XMM3 : 0000000000000000
XMM4 : 00000000000000ff
XMM5 : 00000000ff000000
XMM6 : 000000000000dee4
XMM7 : 000000000000dee5
XMM8 : 0000000000000000
XMM9 : 0000000000000000
XMM10: 0000000000000000
XMM11: 0000000000000000
XMM12: 0000000000000000
XMM13: 000000000000deeb
XMM14: 000000000000deec
XMM15: 000000000000deed

Better explain what this function is supposed to do and why.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Initialize fpregs in fpu__init_cpu_generic()
Ingo Molnar [Wed, 29 Apr 2015 08:58:03 +0000 (10:58 +0200)]
x86/fpu: Initialize fpregs in fpu__init_cpu_generic()

FPU fpregs do not get initialized during bootup on secondary CPUs,
on non-xsave capable CPUs.

For example on one of my systems, the secondary CPU has this FPU
state on bootup:

x86: Booting SMP configuration:
.... node  #0, CPUs:      #1
x86/fpu ######################
x86/fpu # FPU register dump on CPU#1:
x86/fpu # ... CWD: ffff0040
x86/fpu # ... SWD: ffff0000
x86/fpu # ... TWD: ffff555a
x86/fpu # ... FIP: 00000000
x86/fpu # ... FCS: 00000000
x86/fpu # ... FOO: 00000000
x86/fpu # ... FOS: ffff0000
x86/fpu # ... FP0: 02 57 00 00 00 00 00 00 ff ff
x86/fpu # ... FP1: 1b e2 00 00 00 00 00 00 ff ff
x86/fpu # ... FP2: 00 00 00 00 00 00 00 00 00 00
x86/fpu # ... FP3: 00 00 00 00 00 00 00 00 00 00
x86/fpu # ... FP4: 00 00 00 00 00 00 00 00 00 00
x86/fpu # ... FP5: 00 00 00 00 00 00 00 00 00 00
x86/fpu # ... FP6: 00 00 00 00 00 00 00 00 00 00
x86/fpu # ... FP7: 00 00 00 00 00 00 00 00 00 00
x86/fpu # ...  SW: dadadada
x86/fpu ######################

Note how CWD and TWD are off their usual init state (0x037f and 0xffff),
and how FP0 and FP1 has non-zero content.

This is normally not a problem, because any user-space FPU state
is initalized properly - but it can complicate the use of FPU
instructions in kernel code via kernel_fpu_begin()/end(): if
the FPU using code does not initialize registers itself, it
might generate spurious exceptions depending on which CPU it
executes on.

Fix this by initializing the x87 state via the FNINIT instruction.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Rename user_has_fpu() to fpregs_active()
Ingo Molnar [Tue, 28 Apr 2015 10:28:08 +0000 (12:28 +0200)]
x86/fpu: Rename user_has_fpu() to fpregs_active()

Rename this function in line with the new FPU nomenclature.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Clarify ancient comments in fpu__restore()
Ingo Molnar [Tue, 28 Apr 2015 10:53:45 +0000 (12:53 +0200)]
x86/fpu: Clarify ancient comments in fpu__restore()

So this function still had ancient language about 'saving current
math information' - but we haven't been doing lazy FPU saves for
quite some time, we are doing lazy FPU restores.

Also remove IRQ13 related comment, which we don't support anymore
either.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Rename save_user_xstate() to copy_fpregs_to_sigframe()
Ingo Molnar [Tue, 28 Apr 2015 10:09:27 +0000 (12:09 +0200)]
x86/fpu: Rename save_user_xstate() to copy_fpregs_to_sigframe()

Move the naming in line with existing names, so that we now have:

  copy_fpregs_to_fpstate()
  copy_fpstate_to_sigframe()
  copy_fpregs_to_sigframe()

... where each function does what its name suggests.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Rename save_xstate_sig() to copy_fpstate_to_sigframe()
Ingo Molnar [Tue, 28 Apr 2015 09:35:20 +0000 (11:35 +0200)]
x86/fpu: Rename save_xstate_sig() to copy_fpstate_to_sigframe()

Standardize the naming of save_xstate_sig() by renaming it to
copy_fpstate_to_sigframe(): this tells us at a glance that
the function copies an FPU fpstate to a signal frame.

This naming also follows the naming of copy_fpregs_to_fpstate().

Don't put 'xstate' into the name: since this is a generic name,
it's expected that the function is able to handle xstate frames
as well, beyond legacy frames.

xstate used to be the odd case in the x86 FPU code - now it's the
common case.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Pass 'struct fpu' to fpstate_sanitize_xstate()
Ingo Molnar [Tue, 28 Apr 2015 09:25:02 +0000 (11:25 +0200)]
x86/fpu: Pass 'struct fpu' to fpstate_sanitize_xstate()

Currently fpstate_sanitize_xstate() has a task_struct input parameter,
but it only uses the fpu structure from it - so pass in a 'struct fpu'
pointer only and update all call sites.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Simplify fpstate_sanitize_xstate() calls
Ingo Molnar [Tue, 28 Apr 2015 09:17:55 +0000 (11:17 +0200)]
x86/fpu: Simplify fpstate_sanitize_xstate() calls

Remove the extra layer of __fpstate_sanitize_xstate():

if (!use_xsaveopt())
return;
__fpstate_sanitize_xstate(tsk);

and move the check for use_xsaveopt() into fpstate_sanitize_xstate().

In general we optimize for the presence of CPU features, not for
the absence of them. Furthermore there's little point in this inlining,
as the call sites are not super hot code paths.

Doing this uninlining shrinks the code a bit:

   text    data     bss     dec     hex filename
   14108751        2573624 1634304 18316679        1177d87 vmlinux.before
   14108627        2573624 1634304 18316555        1177d0b vmlinux.after

Also remove a pointless '!fx' check from fpstate_sanitize_xstate().

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Rename sanitize_i387_state() to fpstate_sanitize_xstate()
Ingo Molnar [Tue, 28 Apr 2015 09:11:10 +0000 (11:11 +0200)]
x86/fpu: Rename sanitize_i387_state() to fpstate_sanitize_xstate()

So the sanitize_i387_state() function has the following purpose:
on CPUs that support optimized xstate saving instructions, an
FPU fpstate might end up having partially uninitialized data.

This function initializes that data.

Note that the function name is a misnomer and confusing on two levels,
not only is it not i387 specific at all, but it is the exact opposite:
it only matters on xstate CPUs.

So rename sanitize_i387_state() and __sanitize_i387_state() to
fpstate_sanitize_xstate() and __fpstate_sanitize_xstate(),
to clearly express the purpose and usage of the function.

We'll further clean up this function in the next patch.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Move asm/xcr.h to asm/fpu/internal.h
Ingo Molnar [Tue, 28 Apr 2015 08:56:54 +0000 (10:56 +0200)]
x86/fpu: Move asm/xcr.h to asm/fpu/internal.h

Now that all FPU internals using drivers are converted to public APIs,
move xcr.h's definitions into fpu/internal.h and remove xcr.h.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu, crypto x86/sha1_mb: Remove FPU internal headers from sha1_mb.c
Ingo Molnar [Tue, 28 Apr 2015 08:59:00 +0000 (10:59 +0200)]
x86/fpu, crypto x86/sha1_mb: Remove FPU internal headers from sha1_mb.c

This file only uses the public FPU APIs, so remove the xcr.h, fpu/xstate.h
and fpu/internal.h headers and add the fpu/api.h include.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu, crypto x86/serpent_avx2: Simplify the init() xfeature checks
Ingo Molnar [Tue, 28 Apr 2015 08:11:24 +0000 (10:11 +0200)]
x86/fpu, crypto x86/serpent_avx2: Simplify the init() xfeature checks

Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.

This has the following advantages to the driver:

 - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.

 - Removes detection complexity from the driver, no more raw XGETBV instruction

 - Shrinks the code a bit.

 - Standardizes feature name error message printouts across drivers

There are also advantages to the x86 FPU code: once all drivers
are decoupled from internals we can move them out of common
headers and we'll also be able to remove xcr.h.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu, crypto x86/sha1_ssse3: Simplify the sha1_ssse3_mod_init() xfeature checks
Ingo Molnar [Tue, 28 Apr 2015 08:11:24 +0000 (10:11 +0200)]
x86/fpu, crypto x86/sha1_ssse3: Simplify the sha1_ssse3_mod_init() xfeature checks

Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.

This has the following advantages to the driver:

 - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.

 - Removes detection complexity from the driver, no more raw XGETBV instruction

 - Shrinks the code a bit.

 - Standardizes feature name error message printouts across drivers

There are also advantages to the x86 FPU code: once all drivers
are decoupled from internals we can move them out of common
headers and we'll also be able to remove xcr.h.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu, crypto x86/cast6_avx: Simplify the cast6_init() xfeature checks
Ingo Molnar [Tue, 28 Apr 2015 08:11:24 +0000 (10:11 +0200)]
x86/fpu, crypto x86/cast6_avx: Simplify the cast6_init() xfeature checks

Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.

This has the following advantages to the driver:

 - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.

 - Removes detection complexity from the driver, no more raw XGETBV instruction

 - Shrinks the code a bit.

 - Standardizes feature name error message printouts across drivers

There are also advantages to the x86 FPU code: once all drivers
are decoupled from internals we can move them out of common
headers and we'll also be able to remove xcr.h.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu, crypto x86/sha512_ssse3: Simplify the sha512_ssse3_mod_init() xfeature checks
Ingo Molnar [Tue, 28 Apr 2015 08:11:24 +0000 (10:11 +0200)]
x86/fpu, crypto x86/sha512_ssse3: Simplify the sha512_ssse3_mod_init() xfeature checks

Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.

This has the following advantages to the driver:

 - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.

 - Removes detection complexity from the driver, no more raw XGETBV instruction

 - Shrinks the code a bit.

 - Standardizes feature name error message printouts across drivers

There are also advantages to the x86 FPU code: once all drivers
are decoupled from internals we can move them out of common
headers and we'll also be able to remove xcr.h.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu, crypto x86/cast5_avx: Simplify the cast5_init() xfeature checks
Ingo Molnar [Tue, 28 Apr 2015 08:11:24 +0000 (10:11 +0200)]
x86/fpu, crypto x86/cast5_avx: Simplify the cast5_init() xfeature checks

Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.

This has the following advantages to the driver:

 - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.

 - Removes detection complexity from the driver, no more raw XGETBV instruction

 - Shrinks the code a bit.

 - Standardizes feature name error message printouts across drivers

There are also advantages to the x86 FPU code: once all drivers
are decoupled from internals we can move them out of common
headers and we'll also be able to remove xcr.h.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu, crypto x86/serpent_avx: Simplify the serpent_init() xfeature checks
Ingo Molnar [Tue, 28 Apr 2015 08:11:24 +0000 (10:11 +0200)]
x86/fpu, crypto x86/serpent_avx: Simplify the serpent_init() xfeature checks

Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.

This has the following advantages to the driver:

 - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.

 - Removes detection complexity from the driver, no more raw XGETBV instruction

 - Shrinks the code a bit.

 - Standardizes feature name error message printouts across drivers

There are also advantages to the x86 FPU code: once all drivers
are decoupled from internals we can move them out of common
headers and we'll also be able to remove xcr.h.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu, crypto x86/twofish_avx: Simplify the twofish_init() xfeature checks
Ingo Molnar [Tue, 28 Apr 2015 08:11:24 +0000 (10:11 +0200)]
x86/fpu, crypto x86/twofish_avx: Simplify the twofish_init() xfeature checks

Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.

This has the following advantages to the driver:

 - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.

 - Removes detection complexity from the driver, no more raw XGETBV instruction

 - Shrinks the code a bit.

 - Standardizes feature name error message printouts across drivers

There are also advantages to the x86 FPU code: once all drivers
are decoupled from internals we can move them out of common
headers and we'll also be able to remove xcr.h.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu, crypto x86/camellia_aesni_avx2: Simplify the camellia_aesni_init() xfeature...
Ingo Molnar [Tue, 28 Apr 2015 08:11:24 +0000 (10:11 +0200)]
x86/fpu, crypto x86/camellia_aesni_avx2: Simplify the camellia_aesni_init() xfeature checks

Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.

This has the following advantages to the driver:

 - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.

 - Removes detection complexity from the driver, no more raw XGETBV instruction

 - Shrinks the code a bit.

 - Standardizes feature name error message printouts across drivers

There are also advantages to the x86 FPU code: once all drivers
are decoupled from internals we can move them out of common
headers and we'll also be able to remove xcr.h.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu, crypto x86/sha256_ssse3: Simplify the sha256_ssse3_mod_init() xfeature checks
Ingo Molnar [Tue, 28 Apr 2015 08:11:24 +0000 (10:11 +0200)]
x86/fpu, crypto x86/sha256_ssse3: Simplify the sha256_ssse3_mod_init() xfeature checks

Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.

This has the following advantages to the driver:

 - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.

 - Removes detection complexity from the driver, no more raw XGETBV instruction

 - Shrinks the code a bit.

 - Standardizes feature name error message printouts across drivers

There are also advantages to the x86 FPU code: once all drivers
are decoupled from internals we can move them out of common
headers and we'll also be able to remove xcr.h.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu, crypto x86/camellia_aesni_avx: Simplify the camellia_aesni_init() xfeature...
Ingo Molnar [Tue, 28 Apr 2015 07:59:44 +0000 (09:59 +0200)]
x86/fpu, crypto x86/camellia_aesni_avx: Simplify the camellia_aesni_init() xfeature checks

Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.

This has the following advantages to the driver:

 - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.

 - Removes detection complexity from the driver, no more raw XGETBV instruction

 - Shrinks the code a bit:

     text    data     bss     dec     hex filename
     2128    2896       0    5024    13a0 camellia_aesni_avx_glue.o.before
     2067    2896       0    4963    1363 camellia_aesni_avx_glue.o.after

 - Standardizes feature name error message printouts across drivers

There are also advantages to the x86 FPU code: once all drivers
are decoupled from internals we can move them out of common
headers and we'll also be able to remove xcr.h.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Move xfeature type enumeration to fpu/types.h
Ingo Molnar [Tue, 28 Apr 2015 07:46:04 +0000 (09:46 +0200)]
x86/fpu: Move xfeature type enumeration to fpu/types.h

So xsave.h is an internal header that FPU using drivers commonly include,
to get access to the xstate feature names, amongst other things.

Move these type definitions to fpu/fpu.h to allow simplification
of FPU using driver code.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Enumerate xfeature bits
Ingo Molnar [Tue, 28 Apr 2015 07:40:26 +0000 (09:40 +0200)]
x86/fpu: Enumerate xfeature bits

Transform the xstate masks into an enumerated list of xfeature bits.

This removes the hard coding of XFEATURES_NR_MAX.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Simplify print_xstate_features()
Ingo Molnar [Tue, 28 Apr 2015 07:17:26 +0000 (09:17 +0200)]
x86/fpu: Simplify print_xstate_features()

We do a boot time printout of xfeatures in print_xstate_features(),
simplify this code to make use of the recently introduced cpu_has_xfeature()
method.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Introduce cpu_has_xfeatures(xfeatures_mask, feature_name)
Ingo Molnar [Tue, 28 Apr 2015 06:51:17 +0000 (08:51 +0200)]
x86/fpu: Introduce cpu_has_xfeatures(xfeatures_mask, feature_name)

A lot of FPU using driver code is querying complex CPU features to be
able to figure out whether a given set of xstate features is supported
by the CPU or not.

Introduce a simplified API function that can be used on any CPU type
to get this information. Also add an error string return pointer,
so that the driver can print a meaningful error message with a
standardized feature name.

Also mark xfeatures_mask as __read_only.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Rename fpu/xsave.c to fpu/xstate.c
Ingo Molnar [Tue, 28 Apr 2015 06:46:23 +0000 (08:46 +0200)]
x86/fpu: Rename fpu/xsave.c to fpu/xstate.c

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Rename fpu/xsave.h to fpu/xstate.h
Ingo Molnar [Tue, 28 Apr 2015 06:41:33 +0000 (08:41 +0200)]
x86/fpu: Rename fpu/xsave.h to fpu/xstate.h

'xsave' is an x86 instruction name to most people - but xsave.h is
about a lot more than just the XSAVE instruction: it includes
definitions and support, both internal and external, related to
xstate and xfeatures support.

As a first step in cleaning up the various xstate uses rename this
header to 'fpu/xstate.h' to better reflect what this header file
is about.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Optimize fpu_copy() some more on lazy switching systems
Ingo Molnar [Mon, 27 Apr 2015 08:08:39 +0000 (10:08 +0200)]
x86/fpu: Optimize fpu_copy() some more on lazy switching systems

The current fpu_copy() code on lazy switching CPUs always saves
into the current fpstate and then copies it over into the child
context:

preempt_disable();
if (!copy_fpregs_to_fpstate(src_fpu))
fpregs_deactivate(src_fpu);
preempt_enable();
memcpy(&dst_fpu->state, &src_fpu->state, xstate_size);

That memcpy() can be avoided on all lazy switching setups except
really old FNSAVE-only systems: change fpu_copy() to directly save
into the child context, for both the lazy and the eager context
switching case.

Note that we still have to do a memcpy() back into the parent
context in the FNSAVE case, but this won't be executed on the
majority of x86 systems that got built in the last 10 years or so.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Optimize fpu_copy()
Ingo Molnar [Mon, 27 Apr 2015 07:59:11 +0000 (09:59 +0200)]
x86/fpu: Optimize fpu_copy()

Optimize fpu_copy() a bit by expanding the ->fpstate_active == 1
portion of fpu__save() into it.

( The main purpose of this change is to enable another, larger
  optimization that will be done in the next patch. )

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Optimize fpu__save()
Ingo Molnar [Mon, 27 Apr 2015 07:45:12 +0000 (09:45 +0200)]
x86/fpu: Optimize fpu__save()

So fpu__save() does this currently:

copy_fpregs_to_fpstate(fpu);
if (!use_eager_fpu())
fpregs_deactivate(fpu);

... which deactivates the FPU on lazy switching systems unconditionally.

Both usecases of fpu__save() use this function to save the
FPU state into a fpstate: fork()/clone() and math error signal handling.

The unconditional disabling of FPU registers in the lazy switching
case is probably a mistaken conversion of old FNSAVE code (that had
to disable FPU registers).

So speed up this code by only disabling FPU registers when absolutely
necessary: when indicated by the copy_fpregs_to_fpstate() return
code:

if (!copy_fpregs_to_fpstate(fpu))
fpregs_deactivate(fpu);

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Simplify fpu__save()
Ingo Molnar [Mon, 27 Apr 2015 07:38:21 +0000 (09:38 +0200)]
x86/fpu: Simplify fpu__save()

Factor out a common call.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Eliminate __save_fpu()
Ingo Molnar [Mon, 27 Apr 2015 07:26:41 +0000 (09:26 +0200)]
x86/fpu: Eliminate __save_fpu()

The current implementation of __save_fpu():

if (use_xsave()) {
xsave_state(&fpu->state.xsave);
} else {
fpu_fxsave(fpu);
}

Is actually a simplified version of copy_fpregs_to_fpstate(),
if use_eager_fpu() is true.

But all call sites of __save_fpu() call it only it when use_eager_fpu()
is true.

So we can eliminate __save_fpu() altogether and use the standard
copy_fpregs_to_fpstate() function. This cleans up the code
by making it use fewer variants of FPU register saving.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Simplify __save_fpu()
Ingo Molnar [Mon, 27 Apr 2015 07:23:43 +0000 (09:23 +0200)]
x86/fpu: Simplify __save_fpu()

__save_fpu() has this pattern:

if (unlikely(system_state == SYSTEM_BOOTING))
xsave_state_booting(&fpu->state.xsave);
else
xsave_state(&fpu->state.xsave);

... but it does not actually get called during system bootup.

So remove the complication and always call xsave_state().

To make sure this assumption is correct, add a WARN_ONCE()
debug check to xsave_state().

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Factor out FPU hw activation/deactivation
Ingo Molnar [Mon, 27 Apr 2015 06:58:45 +0000 (08:58 +0200)]
x86/fpu: Factor out FPU hw activation/deactivation

We have repeat patterns of:

if (!use_eager_fpu())
clts();

... to activate FPU registers, and:

if (!use_eager_fpu())
stts();

... to deactivate them.

Encapsulate these in:

__fpregs_activate_hw();
__fpregs_activate_hw();

and use them accordingly.

Doing this synchronizes the idiom with the fpu->fpregs_active
software-flag's handling functions, creating clear patterns of:

__fpregs_activate_hw();
__fpregs_activate(fpu);

etc., which improves readability.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Rename fpu__unlazy_stopped() to fpu__activate_stopped()
Ingo Molnar [Mon, 27 Apr 2015 05:22:58 +0000 (07:22 +0200)]
x86/fpu: Rename fpu__unlazy_stopped() to fpu__activate_stopped()

In line with the fpstate_activate() change, name
fpu__unlazy_stopped() in a similar fashion as well: its purpose
is to make the fpstate of a stopped task the current and active FPU
context, which may require unlazying and initialization.

The unlazying is just part of the job, the main concept is to make
the fpstate active.

Also clarify the function's description to clarify its exact
usage and the background behind it all.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Simplify fpstate_init_curr() usage
Ingo Molnar [Mon, 27 Apr 2015 05:18:17 +0000 (07:18 +0200)]
x86/fpu: Simplify fpstate_init_curr() usage

Now that fpstate_init_curr() is not doing implicit allocations
anymore, almost all uses of it involve a very simple pattern:

if (!fpu->fpstate_active)
fpstate_init_curr(fpu);

which is basically activating the FPU fpstate if it was not active
before.

So propagate the check into the function itself, and rename the
function according to its new purpose:

fpu__activate_curr(fpu);

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu, kvm: Simplify fx_init()
Ingo Molnar [Mon, 27 Apr 2015 04:58:22 +0000 (06:58 +0200)]
x86/fpu, kvm: Simplify fx_init()

Now that fpstate_init() cannot fail the error return of fx_init()
has lost its purpose. Eliminate the error return and propagate this
change to all callers.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Simplify fpu__unlazy_stopped() error handling
Ingo Molnar [Mon, 27 Apr 2015 04:55:54 +0000 (06:55 +0200)]
x86/fpu: Simplify fpu__unlazy_stopped() error handling

Now that FPU contexts are always allocated, fpu__unlazy_stopped()
cannot fail. Remove its error return and propagate the changes to
the callers.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Rename fpstate_alloc_init() to fpstate_init_curr()
Ingo Molnar [Mon, 27 Apr 2015 04:50:29 +0000 (06:50 +0200)]
x86/fpu: Rename fpstate_alloc_init() to fpstate_init_curr()

Now that there are no FPU context allocations, rename fpstate_alloc_init()
to fpstate_init_curr(), to signal that it initializes the fpstate and
marks it active, for the current task.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Remove failure return from fpstate_alloc_init()
Ingo Molnar [Mon, 27 Apr 2015 04:46:52 +0000 (06:46 +0200)]
x86/fpu: Remove failure return from fpstate_alloc_init()

Remove the failure code and propagate this down to callers.

Note that this function still has an 'init' aspect, which must be
called.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Remove failure paths from fpstate-alloc low level functions
Ingo Molnar [Mon, 27 Apr 2015 03:52:40 +0000 (05:52 +0200)]
x86/fpu: Remove failure paths from fpstate-alloc low level functions

Now that we always allocate the FPU context as part of task_struct there's
no need for separate allocations - remove them and their primary failure
handling code.

( Note that there's still secondary error codes that have become superfluous,
  those will be removed in separate patches. )

Move the somewhat misplaced setup_xstate_comp() call to the core.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Simplify FPU handling by embedding the fpstate in task_struct (again)
Ingo Molnar [Mon, 27 Apr 2015 02:19:39 +0000 (04:19 +0200)]
x86/fpu: Simplify FPU handling by embedding the fpstate in task_struct (again)

So 6 years ago we made the FPU fpstate dynamically allocated:

  aa283f49276e ("x86, fpu: lazy allocation of FPU area - v5")
  61c4628b5386 ("x86, fpu: split FPU state from task struct - v5")

In hindsight this was a mistake:

   - it complicated context allocation failure handling, such as:

/* kthread execs. TODO: cleanup this horror. */
if (WARN_ON(fpstate_alloc_init(fpu)))
force_sig(SIGKILL, tsk);

   - it caused us to enable irqs in fpu__restore():

                local_irq_enable();
                /*
                 * does a slab alloc which can sleep
                 */
                if (fpstate_alloc_init(fpu)) {
                        /*
                         * ran out of memory!
                         */
                        do_group_exit(SIGKILL);
                        return;
                }
                local_irq_disable();

   - it (slightly) slowed down task creation/destruction by adding
     slab allocation/free pattens.

   - it made access to context contents (slightly) slower by adding
     one more pointer dereference.

The motivation for the dynamic allocation was two-fold:

   - reduce memory consumption by non-FPU tasks

   - allocate and handle only the necessary amount of context for
     various XSAVE processors that have varying hardware frame
     sizes.

These days, with glibc using SSE memcpy by default and GCC optimizing
for SSE/AVX by default, the scope of FPU using apps on an x86 system is
much larger than it was 6 years ago.

For example on a freshly installed Fedora 21 desktop system, with a
recent kernel, all non-kthread tasks have used the FPU shortly after
bootup.

Also, even modern embedded x86 CPUs try to support the latest vector
instruction set - so they'll too often use the larger xstate frame
sizes.

So remove the dynamic allocation complication by embedding the FPU
fpstate in task_struct again. This should make the FPU a lot more
accessible to all sorts of atomic contexts.

We could still optimize for the xstate frame size in the future,
by moving the state structure to the last element of task_struct,
and allocating only a part of that.

This change is kept minimal by still keeping the ctx_alloc()/free()
routines (that now do nothing substantial) - we'll remove them in
the following patches.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Optimize copy_fpregs_to_fpstate() by removing the FNCLEX synchronization...
Ingo Molnar [Mon, 27 Apr 2015 01:32:18 +0000 (03:32 +0200)]
x86/fpu: Optimize copy_fpregs_to_fpstate() by removing the FNCLEX synchronization with FP exceptions

So we have the following ancient code in copy_fpregs_to_fpstate():

if (unlikely(fpu->state->fxsave.swd & X87_FSW_ES)) {
asm volatile("fnclex");
goto drop_fpregs;
}

which clears pending FPU exceptions and then drops registers, which
causes the next FP instruction of the saved context to re-load the
saved FPU state, with all pending exceptions marked properly, and
will re-start the exception handling mechanism in the hardware.

Since FPU exceptions are always issued on instruction boundaries,
in particular on the next FP instruction following the exception
generating instruction, there's no fear of getting an FP exception
asynchronously.

They were truly asynchronous back in the IRQ13 days, when the FPU was
a weird and expensive co-processor that did its own processing, and we
had to synchronize with them, but that code is not working anymore:
we don't have IRQ13 mapped in the IDT anymore.

With the introduction of optimized XSAVE support there's a new
complication: if the xstate features bit indicates that a particular
state component is unused (in 'init state'), then the hardware does
not guarantee that the XSAVE (et al) instruction keeps the underlying
FPU state image in memory valid and current. In practice this means
that the hardware won't write it, and the exceptions flag in the
state might be an older version, with it still being set. This
meant that we had to check the xfeatures flag as well, adding
another memory load and branch to a critical hot path of the scheduler.

So optimize all this by removing both the old quirk and the new check,
and straight-line optimizing the most common cases with likely()
hints. Quite a bit of code gets removed this way:

  arch/x86/kernel/process_64.o:

    text    data     bss     dec     filename
    5484       8       0    5492     process_64.o.before
    5416       8       0    5424     process_64.o.after

Now there's also a chance that some weird behavior or erratum was
masked by our IRQ13 handling quirk (or that I misunderstood the
nature of the quirk), and that this change triggers some badness.

There's no real good way to protect against that possibility other
than keeping this change well isolated, well commented and well
bisectable. If you bisect a weird (or not so weird) breakage to
this commit then please let us know!

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Rename fpu_save_init() to copy_fpregs_to_fpstate()
Ingo Molnar [Mon, 27 Apr 2015 00:53:16 +0000 (02:53 +0200)]
x86/fpu: Rename fpu_save_init() to copy_fpregs_to_fpstate()

So fpu_save_init() is a historic name that got its name when the only
way the FPU state was FNSAVE, which cleared (well, destroyed) the FPU
state after saving it.

Nowadays the name is misleading, because ever since the introduction of
FXSAVE (and more modern FPU saving instructions) the 'we need to reload
the FPU state' part is only true if there's a pending FPU exception [*],
which is almost never the case.

So rename it to copy_fpregs_to_fpstate() to make it clear what's
happening. Also add a few comments about why we cannot keep registers
in certain cases.

Also clean up the control flow a bit, to make it more apparent when
we are dropping/keeping FP registers, and to optimize the common
case (of keeping fpregs) some more.

[*] Probably not true anymore, modern instructions always leave the FPU
    state intact, even if exceptions are pending: because pending FP
    exceptions are posted on the next FP instruction, not asynchronously.

    They were truly asynchronous back in the IRQ13 case, and we had to
    synchronize with them, but that code is not working anymore: we don't
    have IRQ13 mapped in the IDT anymore.

    But a cleanup patch is obviously not the place to change subtle behavior.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Uninline the irq_ts_save()/restore() functions
Ingo Molnar [Sun, 26 Apr 2015 14:57:55 +0000 (16:57 +0200)]
x86/fpu: Uninline the irq_ts_save()/restore() functions

Especially the irq_ts_save() function is pretty bloaty, generating
over a dozen instructions, so uninline them.

Even though the API is used rarely, the space savings are measurable:

   text    data     bss     dec     hex filename
   13331995        2572920 1634304 17539219        10ba093 vmlinux.before
   13331739        2572920 1634304 17538963        10b9f93 vmlinux.after

( This also allows the removal of an include file inclusion from fpu/api.h,
  speeding up the kernel build slightly. )

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Move various internal function prototypes to fpu/internal.h
Ingo Molnar [Sun, 26 Apr 2015 14:56:05 +0000 (16:56 +0200)]
x86/fpu: Move various internal function prototypes to fpu/internal.h

There are a number of FPU internal function prototypes and an inline function
in fpu/api.h, mostly placed so historically as the code grew over the years.

Move them over into fpu/internal.h where they belong. (Add sched.h include
to stackprotector.h which incorrectly relied on getting it from fpu/api.h.)

fpu/api.h is now a pure file that only contains FPU APIs intended for driver
use.

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Uninline kernel_fpu_begin()/end()
Ingo Molnar [Sun, 26 Apr 2015 10:07:18 +0000 (12:07 +0200)]
x86/fpu: Uninline kernel_fpu_begin()/end()

Both inline functions call an inline function unconditionally, so we
already pay the function call based clobbering cost. Uninline them.

This saves quite a bit of code in various performance sensitive
code paths:

   text            data    bss     dec             hex     filename
   13321334        2569888 1634304 17525526        10b6b16 vmlinux.before
   13320246        2569888 1634304 17524438        10b66d6 vmlinux.after

Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>