From: Brian Norris Date: Wed, 5 Dec 2012 23:29:32 +0000 (-0800) Subject: doc: notes: fence: update fence notes X-Git-Url: http://demsky.eecs.uci.edu/git/?a=commitdiff_plain;h=f6ce9f8de3aea5cd500202b245f3736d81479243;p=cdsspec-compiler.git doc: notes: fence: update fence notes The backtracking and seq-cst/MO constraints are complete, but the release/acquire, etc. --- diff --git a/doc/notes/fence.txt b/doc/notes/fence.txt index b42ad94..190708c 100644 --- a/doc/notes/fence.txt +++ b/doc/notes/fence.txt @@ -9,19 +9,37 @@ Fences provide a few new modification order constraints as well as an interesting extension to release sequences, detailed in 29.3 (p4-p7) and 29.8 (p2-p4). -The Section 29.3 references detail the modification order constraints -established by sequentially-consistent fences. +Section 29.3 details the modification order constraints established by +sequentially-consistent fences. -The Section 29.8 references detail the behavior of release and acquire fences -(note that memory_order_seq_cst is both release and acquire). +Section 29.8 details the behavior of release and acquire fences (note that +memory_order_seq_cst is both release and acquire). The text of these rules are provided at the end of this document for reference. ******************************* -Seq-cst MO constraints (29.3 p4-p7) + Backtracking requirements ******************************* -[INCOMPLETE] +Because we maintain the seq-cst order as consistent with the execution order, +seq-cst fences cannot commute with each other, with seq-cst loads, nor with +seq-cst stores; we backtrack at all such pairs. + +Fences extend release/acquire synchronization beyond just +store-release/load-acquire. We must backtrack with potentially-synchronizing +fences: that is, with any pair of store- or fence-release and load- or +fence-acquire, where the release comes after the acquire in the execution order +(the other ordering is OK, as we will explore both behaviors; where the pair +synchronize and where they don't). + +Note that, for instance, a fence-release may synchronize with a fence-acquire +only in the presence of a appropriate load/store pair (29.8p2); but the +synchronization still occurs between the fences, so the backtracking +requirements are only placed on the release/acquire fences themselves. + +******************************* + Seq-cst MO constraints (29.3 p4-p7) +******************************* The statements given in the specification regarding sequentially consistent fences can be transformed into the following 4 modification order constraints. @@ -31,7 +49,7 @@ fences can be transformed into the following 4 modification order constraints. If is_write(A) && is_read(B) && is_write(X) && is_fence(Y) && is_seqcst(X) && is_seqcst(Y) && A != X && - same_var(X, A, B) && + same_loc(X, A, B) && A --rf-> B && X --sc-> Y --sb-> B then @@ -39,7 +57,8 @@ then Intuition/Implementation: * We may (but don't currently) limit our considertion of X to only the most - recent (in the SC order) store to the same variable as A and B + recent (in the SC order) store to the same location as A and B prior to Y + (note that all prior writes will be ordered prior to X in both SC and MO) * We should consider the "most recent" seq-cst fence Y that precedes B * This search can be combined with the r_modification_order search, since we already iterate through the necessary stores X @@ -49,7 +68,7 @@ Intuition/Implementation: If is_write(A) && is_read(B) && is_write(X) && is_fence(Y) && is_seqcst(B) && is_seqcst(Y) && - same_var(X, A, B) && + same_loc(X, A, B) && A != X && A --rf-> B && X --sb-> Y --sc-> B @@ -57,10 +76,11 @@ then X --mo-> A Intuition/Implementation: + * We only need to examine the "most recent" seq-cst fence Y from each thread * We only need to examine the "most recent" qualifying store X that precedes Y; all other X will provide a weaker MO constraint - * Search for the most recent fence Y in the same thread as A - * We should consider the "most recent" write + * This search can be combined with the r_modification_order search, since we + already iterate through the necessary stores X For atomic operations A and B on an atomic object M, where A modifies M and B takes its value, if there is a memory_order_seq_cst fence X such that A is @@ -70,9 +90,9 @@ A or a later modification of M in its modification order. 29.3p6 If - is_write(A) && is_read(B) && is_write(X) && is_fence(Y) is_fence(Z) && + is_write(A) && is_read(B) && is_write(X) && is_fence(Y) && is_fence(Z) && is_seqcst(Y) && is_seqcst(Z) && - same_var(X, A, B) && + same_loc(X, A, B) && A != X && A --rf-> B && X --sb-> Y --sc-> Z --sb-> B @@ -80,24 +100,42 @@ then X --mo-> A Intuition/Implementation: - * We should consider the "most recent" write - -For atomic operations A and B on an atomic object M, where A modifies M and B -takes its value, if there are memory_order_seq_cst fences X and Y such that A -is sequenced before X, Y is sequenced before B, and X precedes Y in S, then B -observes either the effects of A or a later modification of M in its -modification order. + * We should consider only the "most recent" fence Z in the same thread as B + (prior fences may only yield the same or weaker constraints) + * We may then consider the "most recent" seq-cst fence Y prior to Z (in SC order) + from each thread (prior fences may only yield the same or weaker constraints) + * We should consider only the "most recent" store X (to the same location as A, + B) in the same thread as Y (prior stores may only yield the same or weaker + constraints) + * This search can be combined with the r_modification_order search, since we + already iterate through the necessary stores X 29.3p7 -For atomic operations A and B on an atomic object M, if there are -memory_order_seq_cst fences X and Y such that A is sequenced before X, Y is -sequenced before B, and X precedes Y in S, then B occurs later than A in the -modification order of M. +If + is_write(A) && is_write(B) && is_fence(X) && is_fence(Y) && + is_seqcst(X) && is_seqcst(Y) && + same_loc(A, B) && + A --sb-> X --sc-> Y --sb-> B +then + A --mo-> B + +Intuition/Implementation: + * (Similar to 29.3p6 rules, except using A/B write/write) only search for the + most recent fence Y in the same thread; search for the most recent (prior to + Y) fence X from each thread; search for the most recent store A prior to X + * This search can be combined with the w_modification_order search, since we + already iterate through the necessary stores A + +********************************************************************** + Release/acquire synchronization: extended to fences (29.3 p4-p7) +********************************************************************** + +[INCOMPLETE] ******************************* -Miscellaneous Notes + Miscellaneous Notes ******************************* fence(memory_order_consume) acts like memory_order_release, so if we ever