From 40ad7bdbe6db661c513c66e350ebfa479cb94765 Mon Sep 17 00:00:00 2001 From: Peizhao Ou Date: Thu, 16 Apr 2015 19:23:15 -0700 Subject: [PATCH] changes --- correctness-model/writeup/formalization.tex | 300 +++++++++++--------- correctness-model/writeup/paper.tex | 9 + 2 files changed, 172 insertions(+), 137 deletions(-) diff --git a/correctness-model/writeup/formalization.tex b/correctness-model/writeup/formalization.tex index 0aff8cd..2702945 100644 --- a/correctness-model/writeup/formalization.tex +++ b/correctness-model/writeup/formalization.tex @@ -1,13 +1,59 @@ \mysection{Formalization of Correctness Model}\label{sec:formalization} +\todo{discuss ordering point --- why does ordering point make sense? why do we +want them} + Unlike the SC memory model, finding an appropriate correctness model for -concurrent data structures under the C/C++11 memory model is challenging. For -example, linearizability no longer fits C/C++ by the fact that the C/C++ memory -model allows atomic loads to read from atomic stores that appear later in the -trace and that it is in general impossible to produce a sequential history that -preserves program order for the C/C++ memory model. +concurrent data structures under the C/C++11 memory model is challenging by the +fact that the C/C++ memory model allows atomic loads to read from atomic stores +that appear later in the trace and that it is in general impossible to produce a +sequential history that preserves program order for the C/C++ memory model. + +Developers usually leverage the relaxed atomics provided by the C/C++11 memory +model to gain performace by introducing weaker semantics to their users. +However, the semantics should not be arbitrarily relaxed since users may have a +very difficult time using the data structure implemntations. Intuitively, if +method $A$ has an information flow to method $B$, developers and users generally +assume method $B$ to see the effects before method $A$. Specifically in C/C++11, +we want method $A$ to have \code{release} semantics while method $B$ to have +\code{acquire} semantics so that they establish the happens-before relationship. +For example, for a concurrent queue, it is necessary that a dequeuer synchronize +with the corresponding enqueuer so that when the dequeuer obtains a reference to +an object, it can read the fully initialized value of that object. To some +degree this can also avoid data races. When it comes to C/C++11 data structures, +the ordering points of method calls should have release/acquire semantics on +stores and loads. + +However, even if we require data structures to have release/acquire semantics on +method calls, it still remains a challenge to apply existing correctness model +to C/C++11 data structures to generate a correct sequential history that is +consistent with the program order. In order to further illustrate such a +challenge, we define \moc as the \textit{modification order consistency} +relation between memory operations. Given two operations $X$ and $Y$, the +\moc edges are as follow: + +\squishcount +\item {\bf Modification Order Consistency (write-write):} $X \xrightarrow{mo} Y +\Rightarrow X \xrightarrow{moc} Y$. + +\item {\bf Modification Order Consistency (read-write):} $A \xrightarrow{mo} Y \ +\wedge \ A \xrightarrow{rf} X \Rightarrow X \xrightarrow{moc} Y$. -Consider the following example: +\item {\bf Modification Order Consistency (write-read):} $X \xrightarrow{mo} B \ +\wedge \ B \xrightarrow{rf} Y \Rightarrow X \xrightarrow{moc} Y$. + +\item {\bf Modification Order Consistency (read-read):} $A \xrightarrow{mo} B \ +\wedge \ A \xrightarrow{rf} X \ \wedge \ B \xrightarrow{rf} Y \Rightarrow X +\xrightarrow{moc} Y$. +\countend +\vspace{0.3cm} + +When we generate a sequential history, the \moc relation imposes +restrictions on operations on the same memory location such that every load +operation reads from the most recent store operation in that history. In +general, it can be impossible to generate any sequential history that is +consistent with the \moc relation and the \hb relation. Consider the +following example: { \footnotesize @@ -18,66 +64,32 @@ Consider the following example: \end{verbatim} } -Suppose each operation in this example is the only operation of each method -call, and shared variables \code{x} and \code{y} are both initilly \code{0}. -Each store operation has \code{release} semantics and each load operation has -\code{acquire} semantics. For the execution where both \code{r1} and \code{r2} -obtain the old value \code{0}, we encounter a challenge of generating a -sequential history. Since neither load operation reads -from the corresponding store, they should be ordered before their corresponding -store operation. On the other hand, both stores happen before the other load -(intra-thread ordering), making it impossible to topologically sort the +In this example, shared variables \code{x} and \code{y} are both initilly +\code{0}. Each store operation has \code{release} semantics and each load +operation has \code{acquire} semantics. For the execution where both \code{r1} +and \code{r2} obtain the old value \code{0}, we cannot generate a sequential +history consistent with the \hb and \moc relation. Since both load +operations reads from the initial store operations, there exists an \moc +edge from each load to the corresponding store operation. Since the intra-thread +order (\textit{sb}) is a part of the \hb relation, a cycle exists in the union +of \hb and \moc relation, making it impossible to topologically sort the operations to generate a consistent sequential history. -Intuitively however, we can weaken some constraints, e.g. -the \textit{happens-before} edges in some cases, to generate a reordering of -ordering points such that they yield a sequential history that is consistent -with the specification. We formalize our approach as follow. The goal is to come -up with a correctness model that is weaker than linearizability and SC and yet -is composable. - -First of all, we represent a trace as an \textit{execution graph}, where each -node represents each API method call with a set of ordering points, and edges -between nodes are derived from the \textit{happens-before} edges and the -\textit{modification order} edges between ordering points. We define -\textit{opo} as the \textit{ordering point order} relation between ordering -point. Given two operations $X$ and $Y$ that are both ordering points, the -\textit{modification order consistency} edges are as follow: - -\squishcount - -\item {\bf Modification Order Consistency (write-write):} $X \xrightarrow{mo} Y \Rightarrow X \xrightarrow{opo} Y$. - -\item {\bf Modification Order Consistency (read-write):} $A \xrightarrow{mo} Y \ \wedge \ A \xrightarrow{rf} X \Rightarrow X \xrightarrow{opo} Y$. - -\item {\bf Modification Order Consistency (write-read):} $X \xrightarrow{mo} B \ \wedge \ B \xrightarrow{rf} Y \Rightarrow X \xrightarrow{opo} Y$. - -\item {\bf Modification Order Consistency (read-read):} $A \xrightarrow{mo} B \ \wedge \ A \xrightarrow{rf} X \ \wedge \ B \xrightarrow{rf} Y \Rightarrow X \xrightarrow{opo} Y$. -\countend - -\vspace{0.3cm} - -Intuitively, if method $A$ has an information flow to method $B$, we want method -$B$ to see the effects before method $A$. In C/C++11, on the other hand, we want -method $A$ to have \code{release} semantics while method $B$ to have -\code{acquire} semantics so that they establish the happens-before relationship. -For example, for a concurrent queue, we want a dequeuer synchronizes with the -corresponding enqueuer so that when the dequeuer obtains a reference to an -object, it can read the fully initialized value of that object. To some degree -this can also avoid data races. When it comes to C/C++11 data structures, the -ordering points of method calls should have release/acquire semantics on stores -and loads. - -In order to relax the contraints on the original execution graph, we will have -to disregard some happens-before edges. To make our model intuitive, we want to -keep the happens-before edges from stores to stores and from load operations to -store operations because that can ensure information is only flowing from -earlier store operations. Besides, we also want to keep the happens-before edges -between operations on the same memory location since otherwise the generated -history will become very counter-intuitive. However, such a model does not work -in C/C++ in general. Consider the following example: - -Consider the following example: +However, while maintaining the \moc relation, we can weaken some +constraints, e.g. eliminating some \textit{happens-before} edges, to generate a +history that is consistent with the union of \hb and \moc relation. For +instance, in the above example, we do not necessarily have to maintain the +intra-thread order in both threads because they are independent operations and +the later operation is a load. + +To be specific, in order to make the sequential history intuitive, we want to +keep the happens-before edges from store to store, from load to store because +that can ensure information is only flowing from earlier store operations. +Besides, we also want to keep the happens-before edges between operations on the +same memory location since that is consistent with the dependency between +operations on the same location. However, a correctness model with these +intuitions still does not work in C/C++ in general. Consider the following +example: { \footnotesize @@ -89,79 +101,82 @@ Consider the following example: } We encounter a challenge for the execution where the following holds: -\path{r1 == 2 && r2 == 1 && r3 == 1 && r4 == 2}. For any total order that is -consistent with the happens-before relation, that total order will be -inconsistent with the modification order in either variable \code{x} or -\code{y}. For example, if \code{y = 1} is ordered before \code{y = 2}, then -\code{x = 1} is ordered before \code{x = 2} while \code{r1 = x} is ordered -before \code{r2 = x}. As a result, we cannot possibly move up the second load -operation \code{r2 = x} across \code{r1 = x} to generate a consistent sequential -history. To solve this, one option is that allow loads to move up any operation -including load and store operation on the same location. However, it is -extremely conuter-intuitive for a later load operation to read an older value. -By analysis, the problem here is that the store operations with different values -to the same memory location are not ordered. This is actually a really rare case -because it could be possible that only one store operation takes effect and all -other store operations are useless. We believe that such case of blind stores -from different threads is not the general pattern for concurrent data -structures, and we believe that store operations are ordered. In terms of -ordering points, we believe that in real-world data structures, store operations -of ordering points are ordered by happens-before relation. -\todo{explain why ordered stores are reasonable to concurrent data structures} - -We next define a \textit{relaxing action} that takes input a relation $R$ -and outputs a new relation $R'$. We denote $R_r$ as the new relation after -taking a relaxing action $r$ on relation $R$. - -$ \forall X, Y, address(X)\not=address(Y) \wedge -X \relation{R} Y \wedge -Y \in \textit{LoadOps}\xspace -\Rightarrow -\forall Z, -Z \in \textit{orderingPoints}\xspace \wedge -Z \relation{hb} X, -Z \relation{hb} Y \wedge \neg X \relation{hb} Y $. - -We denote \textit{hb}$_r$ as the happens-before relation after taking a relaxing action -$r$. +\path{r1 == 2 && r2 == 1 && r3 == 1 && r4 == 2}. Operation \code{y = 1} has an +\moc edge to operation \code{y = 2}, and operation \code{x = 2} also has +an edge to operation \code{x = 1}. As a result, there exists a cycle in the +union of the \moc and \hb relation. + +To solve this, one option is to allow load operations to move up any operation +including load and store operations on the same location. However, this can be +extremely conuter-intuitive for a later load operation in \hb relation to read +an older value than its previous load operation. Actually, the problem here is +that the store operations with different values to the same memory location are +not ordered by \hb relation. This is a really rare case because it could be +possible that only one store operation takes effect and all other store +operations are useless. We believe that such a case of blind store operations is +not the general pattern for concurrent data structures implementation. Instead, +we believe that store operations to the same memory location should be ordered +by \hb relation. + +We first define two categories of memory operations as follow: +\begin{eqnarray*} +s \in \StoreOp &=& \{store\} \times \Address \times \Value \cup \\ +&&\{rmw\} \times \Address \times \Value \\ +l \in \LoadOp &=& \{load\} \times \Address. +\end{eqnarray*} + +We next define a \textit{relaxing operation} --- $r(x, y)$ on relation $R$ ($(x, +y) \in R$), and it generates a new relation, denoted as relation $R_{r(x, y)}$ +(or $R_r$ for short) which is defined as follow: + +$ R_{r(x, y)} = \{ (a, b) \mid [ (a, b) \in R \wedge \neg [ (a, b) == (x, y) +\wedge \movable(x, y) ]] \vee [ b == y \wedge (a, x) \in R \wedge \movable(x, y) ]$, +where $ \movable(x, y) = \{ \Address(x) \not= \Address(y) \wedge y \in \LoadOp \}$. + +We also define a \textit{transform operation} --- $T$ on relation $R$, as a +finite number of relaxing operations, $r1, r2,..., rn$ that are cumulatively +operated on $R$, as follow: + +$R_T = R_{r1r2...rn} = (((R_{r1})_{r2})...)_{rn}$ . + +Furthermore, we define a special \textit{transform operation} $UT$ on relation +$R$, where any further \textit{relaxing operation} $r$ on $R_{UT}$ Under these analysis and assumptions, we present Lemma~\ref{seqHistoryExistence}. \begin{lemma} \label{seqHistoryExistence} -For any legal C/C++11 execution, where store / load operations have release / acquire -semantics and store operations to the same memory location are ordered by -\textit{happens-before} relation, there exists a sequential history -that is consistent with the modification order consistency and the -happens-before relation in terms of store to store operations, load to store -operations and operations on the same memory location. +For any legal C/C++11 execution, where store / load operations have release / +acquire semantics and store operations to the same memory location are ordered +by \hb relation, there exists a sequential history that is consistent with the +\moc relation and the \hb relation in terms of store to store, load to store and +operations on the same memory location. \end{lemma} \begin{proof} To prove Lemma~\ref{seqHistoryExistence}, we present in Figure~\ref{fig:algorithmConsistentHistory} our constructive algorithm that -generates a sequential history and prove that its output satisfies the -condition. This algorithm topologically sort operations by \hb, and then it only -moves up certain load operations (Line~\ref{line:moveUpLoads}); thus, the store -to store and load to store \hb relation hold. Since load operations that read -from the correct store is not moved (Line~\ref{line:loadSet}) and store -operations are ordered by \hb relation, the write-write and write-read \mo -consistency holds. By induction, moving up load operations will end up placing -load operations between the store operation that they read from and the next -store operation, and thus the read-read and read-write \mo consistency also -holds. We then illustrates that the \hb relation between the same memory -location also holds by contradiction. The only possibility of moving load -operations across operations on the same location lies in -Line~\ref{line:loadSet} where there exists an operation $o$ between -$w'$ -and $r$, and $o$ is not moved up while some operation $r'$ in $l'$ is -moved up, and $o$ happens before $r'$. If $o$ is a load operation, $o$ must -read from $w$ otherwise $o$ will be moved up. Thus we have the relation: $w -\relation{hb} w' \relation{hb} o \relation{hb} r'$, which contradicts -the fact that $r$ reads from $w$. Likewise, if $o$ is a store operation, since -store operations are ordered by \hb relation, we have the relation: $w -\relation{hb} o \relation{hb} r'$ and this is a contradiciton similarly. -Therefore, the generated sequential history by this algorithm satisfies the -conditions in this Lemma. +generates a sequential history and prove that such a history satisfies the +condition. The algorithm first topologically sort operations by \hb, and then it +only moves up certain load operations (Line~\ref{line:moveUpLoads}); thus, the +\hb relation in terms of store to store and load to store holds. Since load +operations that read from the correct store is not moved +(Line~\ref{line:loadSet}) and store operations are ordered by \hb relation, the +\moc relation in terms of write-write and write-read operations holds. By +induction, moving up load operations will end up placing load operations between +the store operation that they read from and the next store operation, and thus +the \moc relation in terms of read-read and read-write operations also holds. As +a result, the \moc relation holds. We then illustrates that the \hb relation +between the same memory location also holds by contradiction. The only +possibility of moving load operations across operations on the same location +lies in Line~\ref{line:loadSet} where there exists an operation $o$ between $w'$ +and $r$, and $o$ is not moved up while some operation $r'$ in $l'$ is moved up, +and $o$ happens before $r'$. If $o$ is a load operation, $o$ must read from $w$ +otherwise $o$ will be moved up. Thus we have the relation: $w \relation{hb} w' +\relation{hb} o \relation{hb} r'$, which contradicts the fact that $r$ reads +from $w$. Likewise, if $o$ is a store operation, since store operations are +ordered by \hb relation, we have the relation: $w \relation{hb} o \relation{hb} +r'$ and this is a contradiciton similarly. Therefore, the generated sequential +history by this algorithm satisfies the conditions in this Lemma. \end{proof} \begin{figure}[!htbp] @@ -192,21 +207,25 @@ For any legal C/C++11 execution, where store / load operations have release / acquire semantics and store operations to the same memory location are ordered by \textit{happens-before} relation, the sequential history $H$ generated by the algorithm in Figure~\ref{fig:algorithmConsistentHistory} is equivalent to some -topological sorting of the operations by \mo consistency and \hb relation after -finite steps of actions (\textit{transform}). +topological sorting of the operations by the \moc relation and $\hb_T$ relation, +where $T$ is some \textit{transform operation} on the \hb relation. \end{lemma} \begin{proof} In the algorithm showed in Figure~\ref{fig:algorithmConsistentHistory}, during -each iteration it can have a set of load operations $l'$ +each iteration there can be a set of load operations $l'$ (Line~\ref{line:loadSet}) to be moved up across the most recent store operation $w'$. For any load opeartion $l''$ in the set $l'$, as showed in the proof of -Lemma~\ref{seqHistoryExistence}, no other operation on the same memory location -exists between $w'$ and $l''$. When $l''$ moves across an operation $o$ that -happens before $l''$, we make a transform $t$ on the happens-before relation on -$o$ and $l''$. When it finishes generating $H$, we have a finite steps of -transform taken on the happens-before graph, denoted as $T$. We also denote that -\textit{hb}$_T$ as the happens before relation after $T$. +Lemma~\ref{seqHistoryExistence}, any operations on the same location as $l''$ +between $w$ and $r$ must be in $l'$. When $l''$ is moved across an operation +$o$ that happens before $l''$, we add a \textit{relaxing opeartion} $r(o, l')$ +to the \textit{transform operation} $T$ on the \hb relation. When it finishes +generating $H$, we have a \textit{transform} $T$ on the \hb relation, and we +denote that $\hb_T$ as the relation on \hb after taking the operation $T$. Since +each \textit{relaxing operation} $r(o, l')$ eliminates the happens-before edge +from $o$ to $l'$, history $H$ should be consistent with the $\hb_T$ relation and +\moc relation. On the other hand, it means we can topologically sort the +operations by the $\hb_T$ and \moc relation and $H$ is one possible sorting. \end{proof} \todo{formalize composability and show hat our correctness model is composable} @@ -216,3 +235,10 @@ ordered? objects should be disjoint? +First of all, we represent a trace as an \textit{execution graph}, where each +node represents each API method call with a set of ordering points, and edges +between nodes are derived from the \textit{happens-before} edges and the +\textit{modification order} edges between ordering points. We define +\textit{opo} as the \textit{ordering point order} relation between ordering +point. Given two operations $X$ and $Y$ that are both ordering points, the +\textit{modification order consistency} edges are as follow: diff --git a/correctness-model/writeup/paper.tex b/correctness-model/writeup/paper.tex index 00e607d..015155f 100644 --- a/correctness-model/writeup/paper.tex +++ b/correctness-model/writeup/paper.tex @@ -90,6 +90,15 @@ \newcommand{\hb}[0]{\textit{hb}\xspace} \newcommand{\mo}[0]{\textit{mo}\xspace} +\newcommand{\moc}[0]{\textit{moc}\xspace} + +\newcommand{\address}[0]{\texttt{address}\xspace} +\newcommand{\StoreOp}[0]{\textit{StoreOps}\xspace} +\newcommand{\LoadOp}[0]{\textit{LoadOps}\xspace} +\newcommand{\Ops}[0]{\textit{Ops}\xspace} +\newcommand{\Address}[0]{\textit{Address}\xspace} +\newcommand{\Value}[0]{\textit{Value}\xspace} +\newcommand{\movable}[0]{\textit{movable}\xspace} \newcommand{\pushcode}[1][1]{\hskip\dimexpr#1\algorithmicindent\relax} -- 2.34.1