Porting from Samsung.
761320: Full cache line writes to the same memory region from at least two processors
might deadlock the processor
Status
Affects: Product Cortex-A9 MPCore.
Fault Type: Programmer Category B (Rare)
Fault Status: Present in: All r0, r1, r2 and r3 revisions Fixed in r4p0
Description
Under very rare circumstances, full cache line writes from (at least) 2 processors on cache lines in hazard with
other requests may cause arbitration issues in the SCU, leading to processor deadlock.
Configurations affected
This erratum affects the configurations of the processor with three or more active coherent agents, which is
either:
- Two or more processors if the ACP is present
- Three or more processors
Conditions
To trigger the erratum, at least three agents need to be working in SMP mode, and accessing coherent memory
regions.
Two or more processors need to perform full cache line writes, to cache lines which are in hazard with other
access requests in the SCU. The hazard in the SCU happens when another processor, or the ACP, is
performing a read or a write of the same cache line.
The following example describes one scenario that might cause this deadlock:
- CPU0 performs a full cache line write to address A, then a full cache line write to address B
- CPU1 performs a full cache line write to address B, then a full cache line write to address A
- CPU2 performs read accesses to addresses A and B
Under certain rare timing circumstances, the requests might create a loop of dependencies, causing a
processor deadlock.
Implications
When the erratum happens, it leads to system deadlock.
It is important to note that any scenario leading to this deadlock situation is uncommon. It requires two
processors writing full cache lines to a coherent memory region, without taking any semaphore, with another
processor or the ACP accessing the same lines at the same time, meaning that these latter accesses are not
deterministic. This, combined with the extremely rare microarchitectural timing conditions under which the defect
can happen, explains why the erratum is not expected to cause any significant malfunction in real systems.
Workaround
This erratum can be worked round by setting bit[21] of the undocumented Diagnostic Control Register to 1. This
register is encoded as CP15 c15 0 c0 1.
The bit can be written in Secure state only, with the following Read/Modify/Write code sequence:
MRC p15,0,rt,c15,c0,1
ORR rt,rt,#0x200000
MCR p15,0,rt,c15,c0,1
When this bit is set, the “direct eviction” optimization in the Bus Interface Unit is disabled, which means this
erratum cannot occur.
Setting this bit might prevent the Cortex-A9 from utilizing the full bandwidth when performing intensive full cache
line writes, and therefore a slight performance drop might be visible.
In addition, this erratum cannot occur if at least one of the following bits in the Diagnostic Control Register is set
to 1:
- bit [23] – Disable Read-Allocate mode
- bit [22] – Disable Write Allocate Wait mode
select RK_TIMER
select HAVE_SMP
select MIGHT_HAVE_CACHE_L2X0
+ select ARM_ERRATA_761320
select ARM_ERRATA_764369
select ARM_ERRATA_754322
select ARM_ERRATA_775420
This workaround defines cpu_relax() as smp_mb(), preventing correctly
written polling loops from denying visibility of updates to memory.
+config ARM_ERRATA_761320
+ bool "ARM errata: no direct eviction"
+ depends on CPU_V7 && SMP
+ help
+ This option enables the workaround for the 761320 Cortex-A9 erratum.
+
config ARM_ERRATA_764369
bool "ARM errata: Data cache line maintenance operation by MVA may not succeed"
depends on CPU_V7 && SMP
orrlt r10, r10, #1 << 11 @ set bit #11
mcrlt p15, 0, r10, c15, c0, 1 @ write diagnostic register
#endif
+#ifdef CONFIG_ARM_ERRATA_761320
+ mrc p15, 0, r10, c15, c0, 1 @ read diagnostic register
+ orr r10, r10, #1 << 21 @ set bit #21
+ mcr p15, 0, r10, c15, c0, 1 @ write diagnostic register
+#endif
3: mov r10, #0
mcr p15, 0, r10, c7, c5, 0 @ I+BTB cache invalidate