powerpc/eeh: Fix fenced PHB caused by eeh_slot_error_detail()
author Gavin Shan <gwshan@linux.vnet.ibm.com>
Fri, 28 Aug 2015 01:57:00 +0000 (11:57 +1000)
committer Michael Ellerman <mpe@ellerman.id.au>
Fri, 28 Aug 2015 03:26:31 +0000 (13:26 +1000)
The config space of some PCI devices can't be accessed while their
PEs are in frozen state; doing so can result in a fenced PHB.
Those PEs are identified with flag EEH_PE_CFG_RESTRICTED, meaning
EEH_PE_CFG_BLOCKED is set automatically when the PE is put to
frozen state (EEH_PE_ISOLATED). eeh_slot_error_detail() restores
PCI device BARs with eeh_pe_restore_bars(), which then calls
eeh_ops->restore_config() to reinitialize the PCI device in
(OPAL) firmware. eeh_ops->restore_config() produces PCI config
access that causes the fenced PHB. The problem was reported on the
adapter below:

   0001:01:00.0 0200: 14e4:168e (rev 10)
   0001:01:00.0 Ethernet controller: Broadcom Corporation \
                NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)

This fixes the issue by skipping eeh_pe_restore_bars() in
eeh_slot_error_detail() when EEH_PE_CFG_BLOCKED is set for the PE.
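
The flag relationship described above can be modelled outside the
kernel. The following is a minimal user-space sketch, not the kernel
code: struct demo_pe, the DEMO_PE_* bit values and both helper
functions are hypothetical stand-ins chosen for illustration, and the
real EEH_PE_* constants and struct eeh_pe differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative bit values only; not the kernel's EEH_PE_* constants. */
enum {
	DEMO_PE_ISOLATED       = 1 << 0, /* stand-in for EEH_PE_ISOLATED */
	DEMO_PE_CFG_BLOCKED    = 1 << 1, /* stand-in for EEH_PE_CFG_BLOCKED */
	DEMO_PE_CFG_RESTRICTED = 1 << 2, /* stand-in for EEH_PE_CFG_RESTRICTED */
};

/* Hypothetical, heavily reduced model of a PE. */
struct demo_pe {
	int state;         /* bitmask of DEMO_PE_* flags */
	int bars_restored; /* how many times BARs were restored */
};

/* Freeze the PE; a restricted PE also gets its config space blocked. */
static void demo_pe_freeze(struct demo_pe *pe)
{
	assert(pe != NULL);
	pe->state |= DEMO_PE_ISOLATED;
	if (pe->state & DEMO_PE_CFG_RESTRICTED)
		pe->state |= DEMO_PE_CFG_BLOCKED;
}

/*
 * Models the guarded part of the error-detail path: restore BARs
 * (and, in the real code, dump config space) only when config
 * access is not blocked. Returns true if the restore happened.
 */
static bool demo_slot_error_detail(struct demo_pe *pe)
{
	assert(pe != NULL);
	if (pe->state & DEMO_PE_CFG_BLOCKED)
		return false;
	pe->bars_restored++;
	return true;
}
```

In this model a PE created with DEMO_PE_CFG_RESTRICTED skips the BAR
restore once frozen, while an unrestricted PE still gets it.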

Fixes: b6541db1 ("powerpc/eeh: Block PCI config access upon frozen PE")
Cc: stable@vger.kernel.org # v4.0+
Reported-by: Manvanthara B. Puttashankar <mputtash@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
arch/powerpc/kernel/eeh.c

index 58c598400028c7f18d9e35d85fd7a6fd34014a83..e968533e3e057603eddee6ceaf0dd15ba750e01c 100644
@@ -308,11 +308,26 @@ void eeh_slot_error_detail(struct eeh_pe *pe, int severity)
        if (!(pe->type & EEH_PE_PHB)) {
                if (eeh_has_flag(EEH_ENABLE_IO_FOR_LOG))
                        eeh_pci_enable(pe, EEH_OPT_THAW_MMIO);
+
+               /*
+                * The config space of some PCI devices can't be accessed
+                * when their PEs are in frozen state. Otherwise, fenced
+                * PHB might be seen. Those PEs are identified with flag
+                * EEH_PE_CFG_RESTRICTED, indicating EEH_PE_CFG_BLOCKED
+                * is set automatically when the PE is put to EEH_PE_ISOLATED.
+                *
+                * Restoring BARs possibly triggers PCI config access in
+                * (OPAL) firmware and then causes fenced PHB. If the
+                * PCI config is blocked with flag EEH_PE_CFG_BLOCKED, it's
+                * pointless to restore BARs and dump config space.
+                */
                eeh_ops->configure_bridge(pe);
-               eeh_pe_restore_bars(pe);
+               if (!(pe->state & EEH_PE_CFG_BLOCKED)) {
+                       eeh_pe_restore_bars(pe);
 
-               pci_regs_buf[0] = 0;
-               eeh_pe_traverse(pe, eeh_dump_pe_log, &loglen);
+                       pci_regs_buf[0] = 0;
+                       eeh_pe_traverse(pe, eeh_dump_pe_log, &loglen);
+               }
        }
 
        eeh_ops->get_log(pe, severity, pci_regs_buf, loglen);