key: cord-0059203-94vh80f0
authors: Albert, Elvira; Hähnle, Reiner; Merayo, Alicia; Steinhöfel, Dominic
title: Certified Abstract Cost Analysis
date: 2021-02-24
journal: Fundamental Approaches to Software Engineering
DOI: 10.1007/978-3-030-71500-7_2
sha: 711c61beb368eee58829d9510bb506aa552c412c
doc_id: 59203
cord_uid: 94vh80f0

A program containing placeholders for unspecified statements or expressions is called an abstract (or schematic) program. Placeholder symbols occur naturally in program transformation rules, as used in refactoring, compilation, optimization, or parallelization. We present a generalization of automated cost analysis that can handle abstract programs and, hence, can analyze the impact on the cost of program transformations. This kind of relational property requires provably precise cost bounds which are not always produced by cost analysis. Therefore, we certify by deductive verification that the inferred abstract cost bounds are correct and sufficiently precise. It is the first approach solving this problem. Both, abstract cost analysis and certification, are based on quantitative abstract execution (QAE) which in turn is a variation of abstract execution, a recently developed symbolic execution technique for abstract programs. To realize QAE the new concept of a cost invariant is introduced. QAE is implemented and runs fully automatically on a benchmark set consisting of representative optimization rules.

We present a generalization of automated cost analysis that can handle programs containing placeholders for unspecified statements. Consider the program Q ≡ "i =0; while (i < t) {P; i ++;}", where P is any statement not modifying i or t. We call P an abstract statement; a program like Q containing abstract statements is called abstract program. The (exact or upper bound) cost of executing P is described by a function ac P (x) depending on the variables x occurring in P. We call this function the abstract cost of P. Assuming that executing any statement has unit cost and that t ≥ 0, one can compute the (abstract) cost of Q as 2 + t · (ac P (x) + 2) depending on ac P and t. For any concrete instance of P, we can derive its concrete cost as usual and then obtain the concrete cost of Q simply by instantiating ac P . In this paper, we define and implement an abstract cost analysis to infer abstract cost bounds. Our implementation consists of an automatic abstract cost analysis tool and an automatic certifier for the correctness of inferred abstract bounds. Both steps are performed with an approach called Quantitative Abstract Execution (QAE).

Fine, but what is this good for? Abstract programs occur in program transformation rules used in compilation, optimization, parallelization, refactoring, etc.: Transformations are specified as rules over program schemata which are nothing but abstract programs. If we can perform cost analysis of abstract programs, we can analyze the cost effect of program transformations. Our approach is the first method to analyze the cost impact of program transformations. Automated Cost Analysis. Cost analysis occupies an interesting middle ground between termination checking and full functional verification in the static program analysis portfolio. The main problem in functional verification is that one has to come up with a functional specification of the intended behavior, as well as with auxiliary specifications including loop invariants and contracts [21] . In contrast, termination is a generic property and it is sufficient to come up with a suitable term order or ranking function [6] . For many programs, termination analysis is vastly easier to automate than verification. 1 Computation cost is not a generic property, but it is usually schematic: One fixes a class of cost functions (for example, polynomial) that can be handled. A cost analysis then must come up with parameters (degree, coefficients) that constitute a valid bound (lower, upper, exact) for all inputs of a given program with respect to a cost model (# of instructions, allocated memory, etc.). If this is performed bottom up with respect to a program's call graph, it is possible to infer a cost bound for the top-level function of a program. Such a cost expression is often symbolic, because it depends on the program's input parameters.

A central technique for inferring symbolic cost of a piece of code with high precision is symbolic execution (SE) [9, 25] . The main difficulty is to render SE of loops with symbolic bounds finite. This is achieved with loop invariants that generalize the behavior of a loop body: an invariant is valid at the loop head after arbitrarily many iterations. To infer sufficiently strong invariants automatically is generally an unsolved problem in functional verification, but much easier in the context of cost analysis, because invariants do not need to characterize functional behavior: it suffices that they permit to infer schematic cost expressions.

Abstract Execution. To infer the cost of program transformation schemata requires the capability of analyzing abstract programs. This is not possible with standard SE, because abstract statements have no operational semantics. One way to reason about abstract programs is to perform structural induction over the syntactic definition of statements and expressions whenever an abstract symbol is encountered. Structural induction is done in interactive theorem proving [7, 31] to verify, e.g., compilers. It is labor-intensive and not automatic. Instead, here we perform cost analysis of abstract programs via a recent generalization of SE called abstract execution (AE) [37, 38] . The idea of AE is, quite simply, to symbolically execute a program containing abstract placeholder symbols for expressions and statements, just as if it were a concrete program. It might seem counterintuitive that this is possible: after all, nothing is known about an abstract symbol. But this is not quite true: one can equip an abstract symbol with an abstract description of the behavior of its instances: a set of memory locations its behavior may depend on, commonly called footprint and a (possibly different) set of memory locations it can change, commonly called frame [21] .

Cost Invariants. In automated cost analysis, one infers cost bounds often from loop invariants, ranking functions, and size relations computed during SE [3, 11, 16, 40] . For abstract programs, we need a more general concept, namely a loop invariant expressing a valid abstract cost bound at the beginning of any iteration (e.g., 2 + i * (ac P (x) + 2) for the program Q above). We call this a cost invariant. This is an important technical innovation of this paper, increasing the modularity of cost analysis, because each loop can be verified and certified separately.

Relational Cost Analysis. AE allows specifying and verifying relational program properties [37] , because one can express rule schemata. This extends to QAE and makes it possible, for the first time, to infer and to prove (automatically!), for example, the impact of program transformation on performance.

Certification. Cost annotations inferred by abstract cost analysis, i.e., cost invariants and abstract cost bounds, are automatically certified by a deductive verification system, extending the approach reported in [4] to abstract cost and abstract programs. This is possible because the specification (i.e., the cost bound) and the loop (cost) invariants are inferred by the cost analyzer-the verification system does not need to generate them.

To argue correctness of an abstract cost analysis is complex, because it must be valid for an infinite set of concrete programs. For this reason alone, it is useful to certify the abstract cost inferred for a given abstract program: during development of the abstract cost analysis reported here, several errors in abstract cost computation were detected-analysis of the failed verification attempt gave immediate feedback on the cause. We built a test suite of problems so that any change in the cost analyzer can be validated in the future.

Certification is crucial for the correctness of quantitative relational properties: The inferred cost invariants might not be precise enough to establish, e.g., that a program transformation does not increase cost for any possible program instance and run. This is only established at the certification stage, where relational properties are formally verified. A relational setting requires provably precise cost bounds. This feature is not offered by existing cost analysis methods.

We introduce our approach and terminology informally by means of a motivating example: Code Motion [1] is a compiler optimization technique moving a statement not affected by a loop from the beginning of the loop body to before the loop. This code transformation should preserve behavior provided the loop is executed at least once, but can be expected to improve computation effort, i.e. quantitative properties of the program, such as execution time and memory

//@ assignable x; //@ accessible t, w; //@ cost footprint t, w; bstract statement P; //@ assignable y; //@ accessible i , t, y, z; //@ cost footprint t, z; bstract statement Q;

Program Before int i = 0; //@ assignable x; //@ accessible t, w; //@ cost footprint t, w; bstract statement P;

//@ assignable y; //@ accessible i , t, y, z; //@ cost footprint t, z; bstract statement Q; i ++; } //@ assert 

Compilers: Principles, Techniques, and Tools

Deductive Software Verification -The KeY Book -From Theory to Practice

Cost analysis of object-oriented bytecode programs

A formal verification framework for static analysis -as well as its instantiation to the resource analyzer COSTA and formal verification tool KeY. Software and Systems Modeling

The Parma polyhedra library: Toward a complete set of numerical abstractions for the analysis and verification of hardware and software systems

A new look at the automatic synthesis of linear ranking functions

Interactive Theorem Proving and Program Development -Coq'Art: The Calculus of Inductive Constructions

ABC: algebraic bound computation for loops

SELECT-A formal system for testing and debugging programs by symbolic execution

Alternating runtime and size complexity analysis of integer programs

Automated termination proofs for Java programs with cyclic data

Ensuring the Correctness of Lightweight Tactics for JavaCard Dynamic Logic

Automatic discovery of linear restraints among variables of a program

Resource bound certification

The Why/Krakatoa/Caduceus platform for deductive program verification

Resource analysis of complex programs with cost equations

Proving termination of programs automatically with AProVE

Regression Verification: Proving the Equivalence of Similar Programs

Static energy consumption analysis of LLVM IR programs

SPEED: precise and efficient static estimation of program computational complexity

Deductive verification: from pen-and-paper proofs to industrial tools

Modular, correct compilation with automatic soundness proofs

Amortized resource analysis with polynomial potential

Proving the correctness of reactive systems using sized types

Symbolic execution and program testing

Proving Optimizations Correct Using Parameterized Program Equivalence

Dafny: An automatic program verifier for functional correctness

Inferring parametric energy consumption functions at different software levels: ISA vs. LLVM IR

Practical Verification of Peephole Optimizations with Alive

Isabelle/HOL -A Proof Assistant for Higher-Order Logic

Monadic refinements for relational cost analysis

The KIV-approach to software verification

An automatic verifier for Java-like programs based on dynamic frames

A termination analyzer for Java bytecode based on path-length

REFINITY to Model and Prove Program Transformation Rules

Abstract execution

Abstract Execution: Automatically Proving Infinitely Many Programs

Mechanical program analysis

Bound analysis of imperative programs with the size-change abstraction

Open Access This chapter is licensed under the terms of the Creative Commons

Acknowledgments. This work was funded partially by the Spanish MCIU, AEI and FEDER(EU) project RTI2018-094403-B-C31, by the CM project S2018/TCS-4314 cofunded by EIE Funds of the EU and by the UCM CT42/18-CT43/18 grant.