key: cord-0053360-rizfc52n
authors: Krishna, Siddharth; Emmi, Michael; Enea, Constantin; Jovanović, Dejan
title: Verifying Visibility-Based Weak Consistency
date: 2020-04-18
journal: Programming Languages and Systems
DOI: 10.1007/978-3-030-44914-8_11
sha: 4ba205b2721f7d6c909ecc8f49761263c5bb3898
doc_id: 53360
cord_uid: rizfc52n

Multithreaded programs generally leverage efficient and thread-safe concurrent objects like sets, key-value maps, and queues. While some concurrent-object operations are designed to behave atomically, each witnessing the atomic effects of predecessors in a linearization order, others forego such strong consistency to avoid complex control and synchronization bottlenecks. For example, contains (value) methods of key-value maps may iterate through key-value entries without blocking concurrent updates, to avoid unwanted performance bottlenecks, and consequently overlook the effects of some linearization-order predecessors. While such weakly-consistent operations may not be atomic, they still offer guarantees, e.g., only observing values that have been present. In this work we develop a methodology for proving that concurrent object implementations adhere to weak-consistency specifications. In particular, we consider (forward) simulation-based proofs of implementations against relaxed-visibility specifications, which allow designated operations to overlook some of their linearization-order predecessors, i.e., behaving as if they never occurred. Besides annotating implementation code to identify linearization points, i.e., points at which operations’ logical effects occur, we also annotate code to identify visible operations, i.e., operations whose effects are observed; in practice this annotation can be done automatically by tracking the writers to each accessed memory location. We formalize our methodology over a general notion of transition systems, agnostic to any particular programming language or memory model, and demonstrate its application, using automated theorem provers, by verifying models of Java concurrent object implementations.

Programming e cient multithreaded programs generally involves carefully organizing shared memory accesses to facilitate inter-thread communication while avoiding synchronization bottlenecks. Modern software platforms like Java include reusable abstractions which encapsulate low-level shared memory accesses and synchronization into familiar high-level abstract data types (ADTs). These so-called concurrent objects typically include mutual-exclusion primitives like locks, numeric data types like atomic integers, as well as collections like sets, key-value maps, and queues; Java's standardedition platform contains many implementations of each. Such objects typically provide strong consistency guarantees like linearizability [18] , ensuring that each operation appears to happen atomically, witnessing the atomic e ects of predecessors according to some linearization order among concurrently-executing operations.

While such strong consistency guarantees are ideal for logical reasoning about programs which use concurrent objects, these guarantees are too strong for many operations, since they preclude simple and/or e cient implementation -over half of Java's concurrent collection methods forego atomicity for weak-consistency [13] . On the one hand, basic operations like the get and put methods of key-value maps typically admit relatively-simple atomic implementations, since their behaviors essentially depend upon individual memory cells, e.g., where the relevant key-value mapping is stored. On the other hand, making aggregate operations like size and contains (value) atomic would impose synchronization bottlenecks, or otherwise-complex control structures, since their atomic behavior depends simultaneously upon the values stored across many memory cells. Interestingly, such implementations are not linearizable even when their underlying memory operations are sequentially consistent, e.g., as is the case with Java 8's concurrent collections, whose memory accesses are data-race free. 4 For instance, the contains (value) method of Java's concurrent hash map iterates through key-value entries without blocking concurrent updates in order to avoid unreasonable performance bottlenecks. Consequently, in a given execution, a containsvalue-v operation o 1 will overlook operation o 2 's concurrent insertion of k 1 → v for a key k 1 it has already traversed. This oversight makes it possible for o 1 to conclude that value v is not present, and can only be explained by o 1 being linearized before o 2 . In the case that operation o 3 removes k 2 → v concurrently before o 1 reaches key k 2 , but only after o 2 completes, then atomicity is violated since in every possible linearization, either mapping k 2 → v or k 1 → v is always present. Nevertheless, such weakly-consistent operations still o er guarantees, e.g., that values never present are never observed, and initially-present values not removed are observed.

In this work we develop a methodology for proving that concurrent-object implementations adhere to the guarantees prescribed by their weak-consistency speci cations. The key salient aspects of our approach are the lifting of existing sequential ADT speci cations via visibility relaxation [13] , and the harnessing of simple and mechanizable reasoning based on forward simulation [25] by relaxed-visibility ADTs. E ectively, our methodology extends the predominant forward-simulation based linearizabilityproof methodology to concurrent objects with weakly-consistent operations, and enables automation for proving weak-consistency guarantees.

To enable the harnessing of existing sequential ADT speci cations, we adopt the recent methodology of visibility relaxation [13] . As in linearizability [18] , the return value of each operation is dictated by the atomic e ects of its predecessors in some (i.e., existentially quanti ed) linearization order. To allow consistency weakening, operations are allowed, to a certain extent, to overlook some of their linearization-order predecessors, behaving as if they had not occurred. Intuitively, this (also existentially quanti ed) visibility captures the inability or unwillingness to atomically observe the values stored across many memory cells. To provide guarantees, the extent of visibility relaxation is bounded to varying degrees. Notably, the visibility of an absolute operation must include all of its linearization-order predecessors, while the visibility of a monotonic operation must include all happens-before predecessors, along with all operations visible to them. The majority of Java's concurrent collection methods are absolute or monotonic [13] . For instance, in the contains-value example described above, by considering that operation o 2 is not visible to o 1 , the conclusion that v is not present can be justi ed by the linearization o 2 ; o 3 ; o 1 , in which o 1 sees o 3 's removal of k 2 → v yet not o 2 's insertion of k 1 → v. Ascribing the monotonic visibility to the contains-value method amounts to a guarantee that initially-present values are observed unless removed (i.e., concurrently).

While relaxed-visibility speci cations provide a means to describing the guarantees provided by weakly-consistent concurrent-object operations, systematically establishing implementations' adherence requires a strategy for demonstrating simulation [25] , i.e., that each step of the implementation is simulated by some step of (an operational representation of) the speci cation. The crux of our contribution is thus threefold: rst, to identify the relevant speci cation-level actions with which to relate implementation-level transitions; second, to identify implementation-level annotations relating transitions to speci cation-level actions; and third, to develop strategies for devising such annotations systematically. For instance, the existing methodology based on linearization points [18] essentially amounts to annotating implementation-level transitions with the points at which its speci cation-level action, i.e., its atomic e ect, occurs. Relaxed-visibility speci cations require not only a witness for the existentiallyquanti ed linearization order, but also an existentially-quanti ed visibility relation, and thus requires a second kind of annotation to resolve operations' visibilities. We propose a notion of visibility actions which enable operations to declare their visibility of others, e.g., specifying the writers of memory cells it has read.

The remainder of our approach amounts to devising a systematic means for constructing simulation proofs to enable automated veri cation. Essentially, we identify a strategy for systematically annotating implementations with visibility actions, given linearization-point annotations and visibility bounds (i.e., absolute or monotonic), and then encode the corresponding simulation check using an o -the-shelf veri cation tool. For the latter, we leverage [16] , a language and veri er for Owicki-Gries style modular proofs of concurrent programs with arbitrarily-many threads. In principle, since our approach reduces simulation to safety veri cation, any safety veri er could be used, though facilitates reasoning for multithreaded programs by capturing interference at arbitrary program points. Using , we have veri ed monotonicity of the contains-value and size methods of Java's concurrent hash-map and concurrent linked-queue, respectively -and absolute consistency of add and remove operations. Although our models are written in and assume sequentially-consistent memory accesses, they capture the di cult aspects of weak-consistency in Java, including heapbased memory access; furthermore, our models are also sound with respect to Java 8's memory model, since their Java 8 implementations guarantee data-race freedom.

In summary, we present the rst methodology for verifying weakly-consistent operations using sequential speci cations and forward simulation. Contributions include:

the formalization of our methodology over a general notion of transition systems, agnostic to any particular programming language or memory model ( §3); the application of our methodology to verifying a weakly-consistent contains-value method of a key-value map ( §4); and a mechanization of our methodology used for verifying models of weakly-consistent Java methods using automated theorem provers ( §5).

Aside from the outline above, this article summarizes an existing weak-consistency speci cation methodology via visibility relaxation ( §2), summarizes related work ( §6), and concludes ( §7). Proofs of all theorems and lemmas are listed in Appendix A.

Our methodology for verifying weakly-consistent concurrent objects relies both on the precise characterization of weak consistency speci cations, as well as a proof technique for establishing adherence to speci cations. In this section we recall and outline a characterization called visibility relaxation [13] , an extension of sequential abstract data type (ADT) speci cations in which the return values of some operations may not re ect the e ects of previously-e ectuated operations. Notationally, in the remainder of this article, ε denotes the empty sequence, ∅ denotes the empty set, _ denotes an unused binding, and and ⊥ denote the Boolean values true and false, respectively. We write R(x) to denote the inclusion x ∈ R of a tuple x in the relation R; and R[x → y] to denote the extension R ∪ {xy} of R to include xy; and R | X to denote the projection R ∩ X * of R to set X; and R to denote the complement {x : x / ∈ R} of R; and R(x) to denote the image {y : xy ∈ R} of R on x; and R −1 (y) to denote the pre-image {x : xy ∈ R} of R on y; whether R(x) refers to inclusion or an image will be clear from its context. Finally, we write x i to refer to the ith element of tuple x = x 0 x 1 . . ..

For a general notion of ADT speci cations, we consider xed sets M and X of method names and argument or return values, respectively. An operation label λ = m, x, y is a method name m ∈ M along with argument and return values x, y ∈ X. A readonly predicate is a unary relation R(λ) on operation labels, an operation sequence s = λ 0 λ 1 . . . is a sequence of operation labels, and a sequential speci cation S = {s 0 , s 1 , . . .} is a set of operation sequences. We say that R is compatible with S when S is closed under deletion of read-only operations, i.e., λ 0 . . . λ j−1 λ j+1 . . . λ i ∈ S when λ 0 . . . λ i ∈ S and R(λ j ).

The key-value map ADT sequential speci cation S m is the pre x-closed set containing all sequences λ 0 . . . λ i such that λ i is either:

put, kv, b , and b = iff some rem, k, _ follows any prior put, kv, _ ; rem, k, b , and b = iff no other rem, k, _ follows some prior put, kv, _ ; get, k, v , and no put, kv , _ nor rem, k, _ follows some prior put, kv, _ , and v = ⊥ if no such put, kv, _ exists; or has, v, b , and b = iff no prior put, kv , _ nor rem, k, _ follows some prior put, kv, _ .

The read-only predicate R m holds for the following cases:

This is a simpli cation of Java's Map ADT, i.e., with fewer methods. 5 To derive weak speci cations from sequential ones, we consider a set V of exactly two visibility labels from prior work [13] : absolute and monotonic. 6 

Intuitively, absolute visibility requires operations to observe the e ects of all of their linearization-order predecessors. The weaker monotonic visibility requires operations to observe the e ects of all their happens-before (i.e., program-and synchronizationorder) predecessors, along with the e ects already observed by those predecessors, i.e., so that sets of visible e ects are monotonically increasing over happens-before chains of operations; conversely, operations may ignore e ects which have been ignored by their happens-before predecessors, so long as those e ects are not transitively related by program and synchronization order.

De nition 1. A weak-visibility speci cation W = S, R, V is a sequential speci cation S with a compatible read-only predicate R and a visibility annotation V . Java's concurrent hash map appears to be consistent with this speci cation [13] .

We ascribe semantics to speci cations by characterizing the values returned by concurrent method invocations, given constraints on invocation order. In practice, the happens-before order among invocations is determined by a program order, i.e., among invocations of the same thread, and a synchronization order, i.e., among invocations of distinct threads accessing the same atomic objects, e.g., locks. To relate operations' return values in a given history back to sequential speci cations, we consider certain sequencings of those operations. A linearization of a history h = O, _, _, hb is a total order lin ⊇ hb over O which includes hb, and a visibility projection vis of lin maps each operation o ∈ O to a subset vis(o) ⊆ lin −1 (o) of the operations preceding o in lin; note that o 1 , o 2 ∈ vis means o 1 observes o 2 . For a given read-only predicate R, we say o's visibility is monotonic when it includes every happens-before predecessor, and operation visible to a happens-before predecessor, which is not read-only, 7 i.e., vis(o) ⊇ hb −1 (o) ∪ vis(hb −1 (o)) | R. We says o's visibility is absolute when vis(o) = lin −1 (o), and vis is itself absolute when each vis(o)

is. An abstract execution e = h, lin, vis is a history h along with a linearization of h, and a visibility projection vis of lin. An abstract execution is sequential when hb is total, complete when h is, and absolute when vis is.

Example 3. An abstract execution can be de ned using the linearization 8 put, 1, 1 , get, 1, 1 put, 0, 1 , put, 1, 0 , ⊥ has, 1, ⊥ along with a happens-before order that, compared to the linearization order, keeps has, 1, ⊥ unordered w.r.t. put, 0, 1 , and put, 1, 0 , ⊥ , and a visibility projection where the visibility of every put and get includes all the linearization predecessors and the visibility of has, 1, ⊥ consists of put, 1, 1 , and put, 1, 0 , ⊥ . Recall that in the argument k, v to put operations, the key k precedes value v.

To determine the consistency of individual histories against weak-visibility speci cations, we consider adherence of their corresponding abstract executions. Let h = O, inv , ret, hb be a history and e = h, lin, vis a complete abstract execution. Then e is consistent with a visibility annotation V and read-only predicate R if for each operation o ∈ dom(lin) with inv (o) = m, _ , vis(o) is absolute or monotonic, respectively, according to V (m) and R. The labeling λ 0 λ 1 . . . of a total order o 0 ≺ o 1 ≺ . . . of complete operations is the sequence of operation labels, i.e., λ i is the label of o i . Then e is consistent with a sequential speci cation S when the labeling 9 of lin | (vis(o) ∪ {o}) is included in S, for each operation o ∈ dom(lin). 10 Finally, we say e is consistent with a weak-visibility speci cation S, R, V when it is consistent with S, R, and V . Remark 1. Consistency models suited for modern software platforms like Java are based on happens-before relations which abstract away from real-time execution order. Since happens-before, unlike real-time, is not necessarily an interval order, the composition of linearizations of two distinct objects in the same execution may be cyclic, i.e., not linearizable. Recovering compositionality in this setting is orthogonal to our work of proving consistency against a given model, and is explored elsewhere [11] .

The abstract executions E(W ) of a weak-visibility speci cation W = S, R, V include those complete, sequential, and absolute abstract executions derived from sequences of S, i.e., when s = λ 0 . . . λ n ∈ S then each e s labels each o i by λ i , and orders hb(o i , o j ) iff i < j. In addition, when E(W ) includes an abstract execution h, lin, vis with h = O, inv , ret, hb , then E(W ) also includes any:

execution h , lin, vis such that h = O, inv , ret, hb and hb ⊆ hb; and -W -consistent execution h , lin, vis with h = O, inv , ret , hb and vis ⊆ vis.

Note that while happens-before weakening hb ⊆ hb always yields consistent executions, unguarded visibility weakening vis ⊆ vis generally breaks consistency with visibility annotations and sequential speci cations: visibilities can become non-monotonic, and return values can change when operations observe fewer operations' e ects. Example 5. The abstract executions of W m include the complete, sequential, and absolute abstract execution de ned by the following happens-before order

which implies that it also includes one in which just the happens-before order is modied such that has, 1, becomes unordered w.r.t. put, 0, 1 , and put, 1, 0 , ⊥ . Since it includes the latter, it also includes the execution in Example 3 where the visibility of has is weakened which also modi es its return value from to ⊥.

De nition 2. The histories of a weak-visibility speci cation W are the projections H(W ) = {h : h, _, _ ∈ E(W )} of its abstract executions.

To de ne the consistency of implementations against speci cations, we leverage a general model of computation to capture the behavior of typical concurrent systems, e.g., including multiprocess and multithreaded systems. A sequence-labeled transition system Q, A, q, → is a set Q of states, along with a set A of actions, initial state q ∈ Q and transition relation → ∈ Q × A * × Q. An execution is an alternating sequence η = q 0 a 0 q 1 a 1 . . . q n of states and action sequences starting with q 0 = q such that q i ai − → q i+1 for each 0 ≤ i < n. The trace τ ∈ A * of the execution η is its projection a 0 a 1 . . . to individual actions.

To capture the histories admitted by a given implementation, we consider sequencelabeled transition systems (SLTSs) which expose actions corresponding to method call, return, and happens-before constraints. We refer to the actions call(o, m, x), ret(o, y), and hb(o, o ), for o, o ∈ N, m ∈ M, and x, y ∈ X, as the history actions, and a history transition system is an SLTS whose actions include the history actions. We say that an action over operation identi er o is an o-action, and assume that executions are well formed in the sense that for a given operation identi er o: at most one call o-action occurs, at most one ret o-action occurs, and no ret nor hb o-actions occur prior to a call o-action. Furthermore, we assume call o-actions are enabled, so long as no prior call o-action has occurred. The history of a trace τ is de ned inductively by f h (h ∅ , τ ), where h ∅ is the empty history, and,

where h = O, inv , ret, hb , and a is a call, ret, or hb action, andã is not. An implementation I is a history transition system, and the histories H(I) of I are those of its traces. Finally, we de ne consistency against speci cations via history containment.

De nition 3. Implementation I is consistent with speci cation W iff H(I) ⊆ H(W ).

To obtain a consistency proof strategy, we more closely relate implementations to speci cations via their admitted abstract executions. To capture the abstract executions admitted by a given implementation, we consider SLTSs which expose not only historyrelated actions, but also actions witnessing linearization and visibility. We where e = h, lin, vis , and a is a call, ret, hb, lin, or vis action,ã is not, andâ is a call, ret, or hb action. A witnessing implementation I is an abstract-execution transition system, and the abstract executions E(I) of I are those of its traces.

We adopt forward simulation [25] for proving consistency against weak-visibility speci cations. Formally, a simulation relation from one system

that initial states are related, R(χ 1 , χ 2 ), and: for any pair of related states R(q 1 , q 2 ) and sourcesystem transition q 1 a1 − →1 q 1 , there exists a target-system transition q 2 a2 − →2 q 2 to related states, i.e., R(q 1 , q 2 ), over common actions, i.e., ( a 1 | A 2 ) = ( a 2 | A 1 ). We say Σ 2 simulates Σ 1 and write Σ 1 Σ 2 when a simulation relation from Σ 1 to Σ 2 exists.

We derive transition systems to model consistency speci cations in simulation. The following lemma establishes the soundness and completeness of this substitution, and the subsequent theorem asserts the soundness of the simulation-based proof strategy.

De nition 4. The transition system W s of a weak-visibility speci cation W is the AETS whose actions are the abstract execution actions, whose states are abstract executions, whose initial state is the empty execution, and whose transitions include e 1 a − → e 2 iff f e (e 1 , a) = e 2 and e 2 is consistent with W .

A weak-visibility spec. and its transition system have identical histories. Theorem 1. A witnessing implementation I is consistent with a weak-visibility specication W if the transition system W s of W simulates I.

Our notion of simulation is in some sense complete when the sequential speci cation S of a weak-consistency speci cation W = S, R, V is return-value deterministic, i.e., there is a single label m, x, y such that λ · m, x, y ∈ S for any method m, argument-value x, and admitted sequence λ ∈ S. In particular, W s simulates any witnessing implementation I whose abstract executions E(I) are included in E( W s ). 11 This completeness, however, extends only to inclusion of abstract executions, and not all the way to consistency, since consistency is de ned on histories, and any given operation's return value is not completely determined by the other operation labels and happens-before relation of a given history: return values generally depend on linearization order and visibility as well. Nevertheless, sequential speci cations typically are return-value deterministic, and we have used simulation to prove consistency of Java-inspired weakly-consistent objects.

Establishing simulation for an implementation is also helpful when reasoning about clients of a concurrent object. One can use the speci cation in place of the implementation and encode the client invariants using the abstract execution of the speci cation in order to prove client properties, following Sergey et al. 's approach [35] .

Proving simulation between an implementation and its speci cation can generally be achieved via product construction: complete the transition system of the speci cation, replacing non-enabled transitions with error-state transitions; then ensure the synchronized product of implementation and completed-speci cation transition systems is safe, i.e., no error state is reachable. Assuming that the individual transition systems are safe, then the product system is safe i the speci cation simulates the implementation. This reduction to safety veri cation is also generally applicable to implementation and speci cation programs, though we limit our formalization to their underlying transition systems for simplicity. By the upcoming Corollary 1, such reductions enable consistency veri cation with existing safety veri cation tools.

While Theorem 1 establishes forward simulation as a strategy for proving the consistency of implementations against weak-visibility speci cations, its application to real-world implementations requires program-level mechanisms to signal the underlying AETS lin and vis actions. To apply forward simulation, we thus develop a notion of programs whose commands include such mechanisms.

This section illustrates a toy programming language with AETS semantics which provides these mechanisms. The key features are the lin and vis program commands, which emit linearization and visibility actions for the currently-executing operation, along with load, store, and cas (compare-and-swap) commands, which record and return the set of operation identi ers having written to each memory cell. Such augmented memory commands allow programs to obtain handles to the operations whose e ects it has observed, in order to signal the corresponding vis actions.

While one can develop similar mechanisms for languages with any underlying memory model, the toy language presented here assumes a sequentially-consistent memory. Note that the assumption of sequentially-consistent memory operations is practically without loss of generality for Java 8's concurrent collections since they are designed to be data-race free -their anomalies arise not from weak-memory semantics, but from non-atomic operations spanning several memory cells.

For generality, we assume abstract notions of commands and memory, using κ, µ, , and M respectively to denote a program command, memory command, local state, and global memory. So that operations can assert their visibilities, we consider memory which stores, and returns upon access, the identi er(s) of operations which previously accessed a given cell. A program P = init, cmd, idle, done consists of an init(m, x) = function mapping method name m and argument values x to local state , along with a cmd( ) = κ function mapping local state to program command κ, and idle( ) and done( ) predicates on local states . Intuitively, identifying local states with threads, the idle predicate indicates whether a thread is outside of atomic sections, and subject to interference from other threads; meanwhile the done predicate indicates whether whether a thread has terminated. The denotation of a memory command µ is a function µ m from global memory M 1 , argument value x, and operation o to a tuple µ m (M 1 , x, o) = M 2 , y consisting of a global memory M 2 , along with a return value y.

Example 6. A sequentially-consistent memory system which records the set of operations to access each location can be captured by mapping addresses x to value and operation-set pairs M (x) = y, O , along with three memory commands:

where the compare-and-swap (CAS) operation stores value z at address x and returns true when y was previously stored, and otherwise returns false.

The denotation of a program command κ is a function κ c from local state 1 to a tuple κ c ( 1 ) = µ, x, f consisting of a memory command µ and argument value x, and a update continuation f mapping the memory command's return value y to a pair f (y) = 2 , α , where 2 is an updated local state, and α maps an operation o to an LTS action α(o). We assume the denotation ret x c ( 1 ) = nop, ε, λy. 2 , λo.ret(z) of the ret command yields a local state 2 with done( 2 ) without executing memory commands, and outputs a corresponding LTS ret action. Each atomic a − → step of the AETS underlying a given program is built from a sequence of steps for the individual program commands in an atomic section. Individual program commands essentially execute one small step from shared memory and local state M 1 , 1 to M 2 , 2 , invoking memory command µ with De nition 5. A program P is consistent with a speci cation iff its semantics P p is.

Thus the consistency of P with W amounts to the inclusion of P p 's histories in W 's. The following corollary of Theorem 1 follows directly by De nition 5, and immediately yields a program veri cation strategy: validate a simulation relation from the states of P p to the states of W s such that each command of P is simulated by a step of W s . Corollary 1. A program P is consistent with speci cation W if W s simulates P p .

In this section we develop a systematic means to annotating concurrent objects for relaxed-visibility simulation proofs. Besides leveraging an auxiliary memory system which tags memory accesses with the operation identi ers which wrote read values (see §3.2), annotations signal linearization points with lin commands, and indicate visibility of other operations with vis commands. As in previous works [3, 37, 2, 18] we assume linearization points are given, and focus on visibility-related annotations.

As we focus on data-race free implementations (e.g., Java 8's concurrent collections) for which sequential consistency is sound, it can be assumed without loss of generality that the happens-before order is exactly the returns-before order between operations, which orders two operations o 1 and o 2 i the return action of o 1 occurs in real-time before the call action of o 2 . This assumption allows to guarantee that linearizations are consistent with happens-before just by ensuring that the linearization point of each operation occurs in between its call and return action (like in standard linearizability). It is without loss of generality because the clients of such implementations can use auxiliary variables to impose synchronization order constraints between every two operations ordered by returns-before, e.g., writing a variable after each operation returns which is read before each other operation is called (under sequential consistency, every write happens-before every other read which reads the written value).

We illustrate our methodology with the key-value map implementation I chm of Figure 2 , which models Java's concurrent hash map. The lines marked in blue and red represent linearization/visibility commands added by the instrumentation that will be described below. Key-value pairs are stored in an array table indexed by keys. The implementation of put and get are obvious while the implementation of has returns true i the input value is associated to some key consists of a while loop traversing the array and searching for the input value. To simplify the exposition, the shared memory reads and writes are already adapted to the memory system described in Section 3.2 (essentially, this consists in adding new variables storing the set of operation identi ers returned by a shared memory read). While put and get are obviously linearizable, has is weakly consistent, with monotonic visibility. For instance, given the two thread program {get(1); has(1)} || {put(1, 1); put(0, 1); put(1, 0)} it is possible that get(1) returns 1 while has(1) returns false. This is possible in an interleaving where has reads table[0] before put(0,1) writes into it (observing the initial value 0), and table [1] after put(1,0) writes into it (observing value 0 as well). The only abstract execution consistent with the weakly-consistent contains-value map W m (Example 2) which justi es these return values is given in Example 3. We show that this implementation is consistent with a simpli cation of the contains-value map W m , without remove key operations, and where put operations return no value.

Given an implementation I, let L(I) be an instrumentation of I with program commands lin() emitting linearization actions. The execution of lin() in the context of an operation with identi er o emits a linearization action lin(o). We assume that L(I) leads to well-formed executions (e.g., at most one linearization action per operation). Figure 2 , the linearization commands of put and get are executed atomically with the store to table[k] in put and the load of table [k] in get, respectively. The linearization command of has is executed at any point after observing the input value v or after exiting the loop, but before the return. The two choices correspond to di erent return values and only one of them will be executed during an invocation.

Given an instrumentation L(I), a visibility annotation V for I's methods, and a read-only predicate R, we de ne a witnessing implementation V(L(I)) according to a generic heuristic that depends only on V and R. This de nition uses a program command getLin() which returns the set of operations in the current linearization sequence. 12 The current linearization sequence is stored in a history variable which is updated with every linearization action by appending the corresponding operation identi er. For readability, we leave this history variable implicit and omit the corresponding updates. As syntactic sugar, we use a command getModLin() which returns the set of modi ers (non read-only operations) in the current linearization sequence. Therefore, V(L(I)) extends the instrumentation L(I) with commands generating visibility actions as follows:

for absolute methods, each linearization command is preceded by vis(getLin()) which ensures that the visibility of an invocation includes all the predecessors in linearization order. This is executed atomically with lin(). for monotonic methods, the call action is followed by vis(getModLin()) (and executed atomically with this command) which ensures that the visibility of each invocation is monotonic, and every read of a shared variable which has been written by a set of operations O is preceded by vis(O ∩ getModLin()) (and executed atomically with this command). The latter is needed so that the visibility of such an invocation contains enough operations to explain its return value (the visibility command attached to call actions is enough to ensure monotonic visibilities).

Example 9. The blue lines in Figure 2 demonstrate the visibility commands added by the instrumentation V(·) to the key-value map in Figure 2 (in this case, the modi ers are put operations). The rst visibility command in has precedes the procedure body to emphasize the fact that it is executed atomically with the procedure call. Also, note that the read of the array table is the only shared memory read in has. for every modi er o which has been already linearized. Therefore, any operation which has returned before o (i.e., happens-before o) has already been linearized and it will necessarily have a smaller visibility (w.r.t. set inclusion) because the linearization sequence is modi ed only by appending new operations. The instrumentation of shared memory reads may add more visibility actions vis(o, _) but this preserves the monotonicity status of o's visibility. The case of absolute methods is obvious.

The consistency of the abstract executions of V(L(I)) with a given sequential speci cation S, which completes the proof of consistency with a weak-visibility specication W = S, R, V , can be proved by showing that the transition system W s of W simulates V(L(I)) (Theorem 1). De ning a simulation relation between the two systems is in some part implementation speci c, and in the following we demonstrate it for the key-value map implementation V(L(I chm )).

We show that W m s simulates implementation I chm . A state of I chm in Figure 2 1,v) operation, provided that tv = v. Above, the linearization pre x associated to an index j 1 < j 2 should be a pre x of the one associated to j 2 .

A large part of this de nition is applicable to any implementation, only points (5), (6) , and (7) being speci c to the implementation we consider. The points (6) and (7) ensure that the return values of operations are consistent with S and mimic the e ect of the vis commands from Figure 2 .

Theorem 3. R chm is a simulation relation from V(L(I chm )) to W m s .

In this section we e ectuate our methodology by verifying two weakly-consistent concurrent objects: Java's ConcurrentHashMap and ConcurrentLinkedQueue. 13 We use an o -the-shelf deductive veri cation tool called [16] , though any concurrent program veri er could su ce. We chose because comparable veri ers either require a manual encoding of the concurrency reasoning (e.g. Dafny or Viper) which can be error-prone, or require cumbersome reasoning about interleavings of threadlocal histories (e.g. VerCors). An additional bene t of is that it directly proves simulation, thereby tying the mechanized proofs to our theoretical development. Our proofs assume no bound on the number of threads or the size of the memory.

Our use of imposes two restrictions on the implementations we can verify. First, uses the Owicki-Gries method [29] to verify concurrent programs. These methods are unsound for weak memory models [22] , so , and hence our proofs, assume a sequentially-consistent memory model. Second, 's strategy for building the simulation relation requires implementations to have statically-known linearization points because it checks that there exists exactly one atomic section in each code path where the global state is modi ed, and this modi cation is simulated by the speci cation.

Given these restrictions, we can simplify our proof strategy of forward re nement by factoring the simulations we construct through an atomic version of the speci cation transition system. This atomic speci cation is obtained from the speci cation AETS W s by restricting the interleavings between its transitions. Note that the language of W a is included in the language of W s and simulation proofs towards W a apply to W s as well.

Our proofs show that there is a simulation from an implementation to its atomic speci cation, which is encoded as a program whose state consists of the components of an abstract execution, i.e., O, inv , ret, hb, lin, vis . These were encoded as maps from operation identi ers to values, sequences of operation identi ers, and maps from operation identi ers to sets of operation identi ers respectively. Our axiomatization of sequences and sets were adapted from those used by the Dafny veri er [23] . For each method in M, we de ned atomic procedures corresponding to call actions, return actions, and combined visibility and linearization actions in order to obtain exactly the atomic transitions of W a .

It is challenging to encode Java implementations faithfully in , as the latter's input programming language is a basic imperative language lacking many Java features. Most notable among these is dynamic memory allocation on the heap, used by almost all of the concurrent data structure implementations. As is a rst-order prover, we needed an encoding of the heap that lets us perform reachability reasoning on the heap. We adapted the rst-order theory of reachability and footprint sets from the GRASShopper veri er [30] for dynamically allocated data structures. This fragment is decidable, but relies on local theory extensions [36] , which we implemented by using the trigger mechanism of the underlying SMT solver [27, 15] to ensure that quanti ed axioms were only instantiated for program expressions. For instance, here is the "cycle" axiom that says that if a node x has a eld f[x] that points to itself, then any y that it can reach via that eld (encoded using the between predicate Btwn(f, x, y, y)) must be equal to x: We use the trigger known(x), known(y) (known is a dummy function that maps every reference to true) and introduce known(t) terms in our programs for every term t of type Ref (for instance, by adding assert known(t) to the point of the program where t is introduced). This ensures that the cycle axiom is only instantiated for terms that appear in the program, and not for terms that are generated by instantations of axioms (like f[x] in the cycle axiom). This process was key to keeping the veri cation time manageable.

Since we consider ne-grained concurrent implementations, we also needed to reason about interference by other threads and show thread safety. provides Owicki-Gries [29] style thread-modular reasoning, by means of demarcating atomic blocks and providing preconditions for each block that are checked for stability under all possible modi cations by other threads. One of the consequences of this is that these annotations can only talk about the local state of a thread and the shared global state, but not other threads. To encode facts such as distinctness of operation identi ers and ownership of unreachable nodes (e.g. newly allocated nodes) in the shared heap, we use 's linear type system [40] . For instance, the proof of the push method needs to make assertions about the value of the newly-allocated node x. These assertions would not be stable under interference of other threads if we didn't have a way of specifying that the address of the new node is known only by the push thread. We encode this knowledge by marking the type of the variable x as linear -this tells that all values of x across all threads are distinct, which is su cient for the proof.

ensures soundness by making sure that linear variables are not duplicated (for instance, they cannot be passed to another method and then used afterwards).

We evaluate our proof methodology by considering models of two of Java's weaklyconsistent concurrent objects.

One is the ConcurrentHashMap implementation of the Map ADT, consisting of absolute put and get methods and a monotonic has method that follows the algortihm given in Figure 2 . For simplicity, we assume here that keys are integers and the hash function is identity, but note that the proof of monotonicity of has is not a ected by these assumptions. 14 

Code Proof Total Time (s) -85  85  -Executions and Consistency  -30  30  -Heap and Reachability  -35  35  -Map ADT  51  34  85  -Array-map implementation  138 175 313  6   Queue ADT  50  22 72 -Linked Queue implementation 280 325 605 13 Fig. 3 . Case study detail: for each object we show lines of code, lines of proof, total lines, and veri cation time in seconds. We also list common de nitions and axiomatizations separately.

can construct a simulation relation equivalent to the one de ned in De nition 6 automatically, given an inductive invariant that relates the state of the implementation to the abstract execution. A rst attempt at an invariant might be that the value stored at table[k] for every key k is the same as the value returned by adding a get operation on k by the speci cation AETS. This invariant is su cient for to prove that the return value of the absolute methods (put and get) is consistent with the speci cation.

However, it is not enough to show that the return value of the monotonic has method is consistent with its visibility. This is because our proof technique constructs a visibility set for has by taking the union of the memory tags (the set of operations that wrote to each memory location) of each table entry it reads, but without additional invariants this visibility set could entail a di erent return value. We thus strengthen the invariant to say that tableTags[k], the memory tags associated with hash table entry k, is exactly the set of linearized put operations with key k. A consequence of this is that the abstract state encoded by tableTags[k] has the same value for key k as the value stored at table[k] .

can then prove, given the following loop invariant, that the value returned by has is consistent with its visibility set. This loop invariant says that among the entries scanned thus far, the abstract map given by the projection of lin to the current operation's visibility my_vis does not include value v.

Concurrent Linked Queue Our second case study is the ConcurrentLinkedQueue implementation of the Queue ADT, consisting of absolute push and pop methods and a monotonic size method that traverses the queue from head to tail without any locks and returns the number of nodes it sees (see Figure 4 for the full code). We again model the core algorithm (the Michael-Scott queue [26] ) and omit some of Java's optimizations, for instance to speed up garbage collection by setting the next eld of popped nodes to themselves, or setting the values of nodes to null when popping values.

The invariants needed to verify the absolute methods are a straightforward combination of structural invariants (e.g. that the queue is composed of a linked list from the head to null, with the tail being a member of this list) and a relation between the , see our queue case study, this issue is orthogonal to the weak-consistency reasoning that we study here. abstract and concrete states. Once again, we need to strengthen this invariant in order to verify the monotonic size method, because otherwise we cannot prove that the visibility set we construct (by taking the union of the memory tags of nodes in the list during traversal) justi es the return value.

The key additional invariant is that the memory tags for the next eld of each node (denoted x.nextTags for each node x) in the queue contain the operation label of the operation that pushed the next node into the queue (if it exists). Further, the sequence of push operations in lin are exactly the operations in the nextTags eld of nodes in the queue, and in the order they are present in the queue. Figure 5 shows a simpli ed version of the encoding of these invariants. In it, we use the following auxiliary variables in order to avoid quanti er alternation: nextInvoc maps nodes to the operation label (type Invoc in ) contained in the nextTags eld; nextRef maps operations to the nodes whose nextTags eld contains them, i.e. it is the inverse of nextInvoc; and absRefs maps the index of the abstract queue (represented as a mathematical sequence) to the corresponding concrete heap node. We omit the triggers and known predicates for readability; the full invariant can be found in the accompanying proof scripts.

Given these invariants, one can show that the return value s computed by size is consistent with the visibility set it constructs by picking up the memory tags from each node that it traverses. The loop invariant is more involved, as due to concurrent updates size could be traversing nodes that have been popped from the queue; see our proofs for more details.

Results Figure 3 provides a summary of our case studies. We separate the table into sections, one for each case study, and a common section at the top that contains the common theories of sets and sequences and our encoding of the heap. In each case study section, we separate the de nitions of the atomic speci cation of the ADT (which can be reused for other implementations) from the code and proof of the implementation we consider. For each resulting module, we list the number of lines of code, lines of proof, total lines, and 's veri cation time in seconds. Experiments were conducted on an Intel Core i7-4470 3.4 GHz 8-core machine with 16GB RAM.

Our two case studies are representative of the weakly-consistent behaviors exhibited by all the Java concurrent objects studied in [13] , both those using xed-size arrays and those using dynamic memory. As does not direclty support dynamic memory and other Java language features, we were forced to make certain simpli cations to the algorithms in our veri cation e ort. However, the assumptions we make are orthogonal to the reasoning and proof of weak consistency of the monotonic methods. The underlying algorithm used by, and hence the proof argument for monotonicity of, hash map's has method is the same as that in the other monotonic hash map operations such as elements, entrySet, and toString. Similarly, the argument used for the queue's size can be adapted to other monotonic ConcurrentLinkedQueue and LinkedTransferQueue operations like toArray and toString. Thus, our proofs carry over to the full versions of the implementations as the key invariants linking the memory tags and visibility sets to the speci cation state are the same.

In addition, does not currently have any support for inferring the preconditions of each atomic block, which currently accounts for most of the lines of proof in our case studies. However, these problems have been studied and solved in other tools [30, 39] , and in theory can be integrated with in order to simplify these kinds of proofs.

In conclusion, our case studies show that verifying weakly-consistent operations introduces little overhead compared to the proofs of the core absolute operations. The additional invariants needed to prove monotonicity were natural and easy to construct. We also see that our methodology brings weak-consistency proofs within the scope of what is provable by o -the-shelf automated concurrent program veri ers in reasonable time.

Though linearizability [18] has reigned as the de-facto concurrent-object consistency criterion, several recent works proposed weaker criteria, including quantitative relaxation [17] , quiescent consistency [10] , and local linearizability [14] ; these works e ectively permit externally-visible interference among threads by altering objects' sequential speci cations, each in their own way. Motivated by the diversity of these proposals, Sergey et al. [35] proposed the use of Hoare logic for describing a custom consistency speci cation for each concurrent object. Raad et al. [31] continued in this direction by proposing declarative consistency models for concurrent objects atop weak-memory platforms. One common feature between our paper and this line of work (see also [21, 9] ) is encoding and reasoning directly about the concurrent history. The notion of visibility relaxation [13] originates from Burckhardt et al.'s axiomatic speci cations [7] , and leverages traditional sequential speci cations by allowing certain operations to behave as if they are unaware of concurrently-executed linearizationorder predecessors. The linearization (and visibility) actions of our simulation-proof methodology are unique to visibility-relaxation based weak-consistency, since they refer to a global linearization order linking executions with sequential speci cations.

Typical methodologies for proving linearizability are based on reductions to safety veri cation [8, 5] and forward simulation [3, 37, 2] , the latter generally requiring the annotation of per-operation linearization points, each typically associated with a single program statement in the given operation, e.g., a shared memory access. Extensions to this methodology include cooperation [38, 12, 41] , i.e., allowing operations' linearization points to coincide with other operations' statements, and prophecy [33, 24] , i.e., allowing operation' linearization points to depend on future events. Such extensions enable linearizability proofs of objects like the Herlihy-Wing Queue (HWQ). While prophecy [25] , alternatively backward simulation [25] , is generally more powerful than forward simulation alone, Bouajjani et al. [6] described a methodology based on forward simulation capable of proving seemingly future-dependent objects like HWQ by considering xed linearization points only for value removal, and an additional kind of speci cation-simulated action, commit points, corresponding to operations' nal shared-memory accesses. Our consideration of speci cation-simulated visibility actions follows this line of thinking, enabling the forward-simulation based proof of weakly-consistent concurrent objects.

This work develops the rst veri cation methodology for weakly-consistent operations using sequential speci cations and forward simulation, thus reusing existing sequential ADT speci cations and enabling simple reasoning, i.e., without prophecy [1] or backward simulation [25] . This paper demonstrates the application of our methodology to absolute and monotonic methods on sequentially-consistent memory, as these are the consistency levels demonstrated in actual Java implementations of which we are aware. Our formalization is general, and also applicable to the other visibility relaxations, e.g., the peer and weak visibilities [13] , and weaker memory models, e.g., the Java memory model.

Extrapolating, we speculate that handling other visibilities amounts to adding annotations and auxiliary state which mirrors inter-operation communication. For example, while monotonic operations on shared-memory implementations observe mutating linearization-order predecessors -corresponding to a sequence of shared-memory updates -causal operations with message-passing based implementations would observe operations whose messages have (transitively) propagated. The corresponding annotations may require auxiliary state to track message propagation, similar in spirit to the getModLin() auxiliary state that tracks mutating linearization-order predecessors ( §4). Since weak memory models essentially alter the mechanics of inter-operation communication, the corresponding visibility annotations and auxiliary state may similarly re ect this communication. Since this communication is partly captured by the denotations of memory commands ( §3.2), these denotations would be modi ed, e.g., to include not one value and tag per memory location, but multiple. While variations are possible depending on the extent to which the proof of a given implementation relies on the details of the memory model, in the worst case the auxiliary state could capture an existing memory model (e.g., operational) semantics exactly.

As with systematic or automated linearizability-proof methodologies, our proof methodology is susceptible to two potential sources of incompleteness. First, as mentioned in Section 3, methodologies like ours based on forward simulation are only complete when speci cations are return-value deterministic. However, data types are typically designed to be return-value deterministic and this source of incompleteness does not manifest in practice.

Second, methodologies like ours based on annotating program commands, e.g., with linearization points, are generally incomplete since the consistency mechanism employed by any given implementation may not admit characterization according to a given static annotation scheme; the Herlihy-Wing Queue, whose linearization points depend on the results of future actions, is a prototypical example [18] . Likewise, our systematic strategy for annotating implementations with lin and vis commands ( §3) can fail to prove consistency of future-dependent operations. However, we have yet to observe any practical occurrence of such exotic objects; our strategy is su cient for verifying the weakly-consistent algorithms implemented in the Java development kit. As a theoretical curiosity for future work, investigating the potential for complete annotation strategies would be interesting, e.g., for restricted classes of data types and/or implementations.

Finally, while 's high-degree of automation facilitated rapid prototyping of our simulation proofs, its underlying foundation using Owicki-Gries style proof rules limits the potential for modular reasoning. In particular, while our weak-consistency proofs are thread-modular, our invariants and intermediate assertions necessarily talk about state shared among multiple threads. Since our simulation-based methodology and annotations are completely orthogonal to the underlying program logic, it would be interesting future work to apply our methodology using expressive logics like Rely-Guarantee, e.g. [19, 38] , or variations of Concurrent Separation Logic, e.g. [28, 32, 34, 35, 4, 20] . It remains to be seen to what degree increased modularity may sacri ce automation in the application of our weak-consistency proof methodology.

A Appendix: Proofs to Theorems and Lemmas Lemma 1. The abstract executions E(W ) of a speci cation W are consistent with W .

Proof. Any complete, sequential, and absolute execution is consistent by de nition, since the labeling of its linearization is taken from the sequential speci cation. Then, any happens-before weakening is consistent for exactly the same reason as its source execution, since its linearization and visibility projection are both identical. Finally, any visibility weakening is consistent by the condition of W -consistency in its de nition.

A weak-visibility speci cation and its transition system have identical histories.

Proof. It follows almost immediately that the abstract executions of W s are identical to those of W , since W s 's state e ectively records the abstract execution of a given AETS execution, and only enables those returns that are consistent with W . Since histories are the projections of abstract executions, the corresponding history sets are also identical. Theorem 1. A witnessing implementation I is consistent with a weak-visibility specication W if the transition system W s of W simulates I.

Proof. This follows from standard arguments, given that the corresponding SLTSs include ε transitions to ensure that every move of one system can be matched by stuttering from the other: since both systems synchronize on the call, ret, hb, lin, and vis actions, the simulation guarantees that every abstract execution, and thus history, of I is matched by one of W s . Then by Lemma 2, the histories of I are included in W .

Theorem 3. R chm is a simulation relation from I chm to W m s . Proof Sketch. We show that every step of the implementation, i.e., an atomic section or a program command, is simulated by W m s . Given q, e ∈ R chm , we consider the di erent implementation steps which are possible in q.

The case of commands corresponding to procedure calls of put and get is trivial. Executing a procedure call in q leads to a new state q which di ers only by having a new active operation o. We have that e The transition corresponding to the atomic section of put is labeled by a sequence of visibility actions (one for each linearized operation) followed by a linearization action. Let σ denote this sequence of actions. This transition leads to a state q where the array table may have changed (unless writing the same value), and the history variable lin is extended with the put operation o executing this step. We de ne an abstract execution e from e by changing lin to the new value of lin, and de ning an absolute visibility for o. We have that e σ − → e because e is consistent with W m . Also, q , e ∈ R chm because the validity of (3), (4), and (5) follow directly from the de nition of e . The atomic section of get can be handled in a similar way. The simulation of return actions of get operations is a direct consequence of point (6) which ensures consistency with S.

For has, we focus on the atomic sections containing vis commands and the linearization commands (the other internal steps are simulated by steps of W m s , and the simulation of the return step follows directly from (7) which justi es the consistency of the return value). The atomic section around the procedure call corresponds to a transition labeled by a sequence σ of visibility actions (one for each linearized modi er) and leads to a state q with a new active has operation o (compared to q). We have that e σ − → e because e is consistent with W m . Indeed, the visibility of o in e is not constrained since o has not been linearized and the W m -consistency of e follows from the W m -consistency of e. Also, q , e ∈ R chm because index (o) = −1 and (7) is clearly valid. The atomic section around the read of table[k] is simulated by W m s in a similar way, noticing that (7) models precisely the e ect of the visibility commands inside this atomic section. For the simulation of the linearization commands is important to notice that any active has operation in e has a visibility that contains all modi ers which returned before it was called and as explained above, this visibility is monotonic.

The existence of re nement mappings

An integrated speci cation and veri cation technique for highly concurrent data structures for highly concurrent data structures

Comparison under abstraction for verifying linearizability

The vercors tool set: Veri cation of parallel and concurrent software

On reducing linearizability to state reachability

Proving linearizability using forward simulations

Replicated data types: specication, veri cation, optimality

Aspect-oriented linearizability proofs

Concurrent data structures linked in time

Quiescent consistency: De ning and verifying relaxed linearizability

On abstraction and compositionality for weak-memory linearisability

Automatic linearizability proofs of concurrent objects with cooperating updates

Weak-consistency speci cation via visibility relaxation

Local linearizability for concurrent container-type data structures

Automated veri cation of practical garbage collectors

Automated and modular re nement reasoning for concurrent programs

Quantitative relaxation of concurrent data structures

Linearizability: A correctness condition for concurrent objects

Speci cation and design of (parallel) programs. In: IFIP Congress

Iris from the ground up: A modular foundation for higher-order concurrent separation logic

Proving linearizability using partial orders

Owicki-gries reasoning for weak memory models

Dafny: An automatic program veri er for functional correctness

Modular veri cation of linearizability with non-xed linearization points

Forward and backward simulations: I. untimed systems

Simple, fast, and practical non-blocking and blocking concurrent queue algorithms

E-matching for fun and pro t

Resources, concurrency and local reasoning

Verifying properties of parallel programs: An axiomatic approach

Grasshopper -complete heap veri cation with mixed speci cations

On library correctness under weak memory consistency: specifying and verifying concurrent libraries under declarative consistency models

Separation logic: A logic for shared mutable data structures

How to prove algorithms linearisable

Mechanized veri cation of ne-grained concurrent programs

Hoare-style speci cations as correctness conditions for non-linearizable concurrent objects

Hierarchic reasoning in local theory extensions

Shape-value abstraction for verifying linearizability

Automatically proving linearizability

Rgsep action inference

Linear types can change the world! In: Programming Concepts and Methods

Poling: SMT aided linearizability proofs

), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use