From: Mika Westerberg <mika.westerberg@linux.intel.com>
Date: Mon, 3 Oct 2016 10:17:08 +0000 (+0300)
Subject: x86/irq: Prevent force migration of irqs which are not in the vector domain
X-Git-Tag: firefly_0821_release~176^2~4^2~26^2~8
X-Git-Url: http://demsky.eecs.uci.edu/git/?a=commitdiff_plain;h=eba90a4c76b91db06c307035c06b20dbcb5fa487;p=firefly-linux-kernel-4.4.55.git

x86/irq: Prevent force migration of irqs which are not in the vector domain

commit db91aa793ff984ac048e199ea1c54202543952fe upstream.

When a CPU is about to be offlined we call fixup_irqs() that resets IRQ
affinities related to the CPU in question. The same thing is also done when
the system is suspended to S-states like S3 (mem).

For each IRQ we try to complete any on-going move regardless whether the
IRQ is actually part of x86_vector_domain. For each IRQ descriptor we fetch
its chip_data, assume it is of type struct apic_chip_data and manipulate it
by clearing old_domain mask etc. For irq_chips that are not part of the
x86_vector_domain, like those created by various GPIO drivers, will find
their chip_data being changed unexpectly.

Below is an example where GPIO chip owned by pinctrl-sunrisepoint.c gets
corrupted after resume:

  # cat /sys/kernel/debug/gpio
  gpiochip0: GPIOs 360-511, parent: platform/INT344B:00, INT344B:00:
   gpio-511 (                    |sysfs               ) in  hi

  # rtcwake -s10 -mmem
  <10 seconds passes>

  # cat /sys/kernel/debug/gpio
  gpiochip0: GPIOs 360-511, parent: platform/INT344B:00, INT344B:00:
   gpio-511 (                    |sysfs               ) in  ?

Note '?' in the output. It means the struct gpio_chip ->get function is
NULL whereas before suspend it was there.

Fix this by first checking that the IRQ belongs to x86_vector_domain before
we try to use the chip_data as struct apic_chip_data.

Reported-and-tested-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Link: http://lkml.kernel.org/r/20161003101708.34795-1-mika.westerberg@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---

diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index df6b4eeac0bd..0988e204f1e3 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -659,11 +659,28 @@ void irq_complete_move(struct irq_cfg *cfg)
  */
 void irq_force_complete_move(struct irq_desc *desc)
 {
-	struct irq_data *irqdata = irq_desc_get_irq_data(desc);
-	struct apic_chip_data *data = apic_chip_data(irqdata);
-	struct irq_cfg *cfg = data ? &data->cfg : NULL;
+	struct irq_data *irqdata;
+	struct apic_chip_data *data;
+	struct irq_cfg *cfg;
 	unsigned int cpu;
 
+	/*
+	 * The function is called for all descriptors regardless of which
+	 * irqdomain they belong to. For example if an IRQ is provided by
+	 * an irq_chip as part of a GPIO driver, the chip data for that
+	 * descriptor is specific to the irq_chip in question.
+	 *
+	 * Check first that the chip_data is what we expect
+	 * (apic_chip_data) before touching it any further.
+	 */
+	irqdata = irq_domain_get_irq_data(x86_vector_domain,
+					  irq_desc_get_irq(desc));
+	if (!irqdata)
+		return;
+
+	data = apic_chip_data(irqdata);
+	cfg = data ? &data->cfg : NULL;
+
 	if (!cfg)
 		return;