key: cord-0059203-94vh80f0 authors: Albert, Elvira; Hähnle, Reiner; Merayo, Alicia; Steinhöfel, Dominic title: Certified Abstract Cost Analysis date: 2021-02-24 journal: Fundamental Approaches to Software Engineering DOI: 10.1007/978-3-030-71500-7_2 sha: 711c61beb368eee58829d9510bb506aa552c412c doc_id: 59203 cord_uid: 94vh80f0 A program containing placeholders for unspecified statements or expressions is called an abstract (or schematic) program. Placeholder symbols occur naturally in program transformation rules, as used in refactoring, compilation, optimization, or parallelization. We present a generalization of automated cost analysis that can handle abstract programs and, hence, can analyze the impact on the cost of program transformations. This kind of relational property requires provably precise cost bounds which are not always produced by cost analysis. Therefore, we certify by deductive verification that the inferred abstract cost bounds are correct and sufficiently precise. It is the first approach solving this problem. Both, abstract cost analysis and certification, are based on quantitative abstract execution (QAE) which in turn is a variation of abstract execution, a recently developed symbolic execution technique for abstract programs. To realize QAE the new concept of a cost invariant is introduced. QAE is implemented and runs fully automatically on a benchmark set consisting of representative optimization rules. We present a generalization of automated cost analysis that can handle programs containing placeholders for unspecified statements. Consider the program Q ≡ "i =0; while (i < t) {P; i ++;}", where P is any statement not modifying i or t. We call P an abstract statement; a program like Q containing abstract statements is called abstract program. The (exact or upper bound) cost of executing P is described by a function ac P (x) depending on the variables x occurring in P. We call this function the abstract cost of P. Assuming that executing any statement has unit cost and that t ≥ 0, one can compute the (abstract) cost of Q as 2 + t · (ac P (x) + 2) depending on ac P and t. For any concrete instance of P, we can derive its concrete cost as usual and then obtain the concrete cost of Q simply by instantiating ac P . In this paper, we define and implement an abstract cost analysis to infer abstract cost bounds. Our implementation consists of an automatic abstract cost analysis tool and an automatic certifier for the correctness of inferred abstract bounds. Both steps are performed with an approach called Quantitative Abstract Execution (QAE). Fine, but what is this good for? Abstract programs occur in program transformation rules used in compilation, optimization, parallelization, refactoring, etc.: Transformations are specified as rules over program schemata which are nothing but abstract programs. If we can perform cost analysis of abstract programs, we can analyze the cost effect of program transformations. Our approach is the first method to analyze the cost impact of program transformations. Automated Cost Analysis. Cost analysis occupies an interesting middle ground between termination checking and full functional verification in the static program analysis portfolio. The main problem in functional verification is that one has to come up with a functional specification of the intended behavior, as well as with auxiliary specifications including loop invariants and contracts [21] . In contrast, termination is a generic property and it is sufficient to come up with a suitable term order or ranking function [6] . For many programs, termination analysis is vastly easier to automate than verification. 1 Computation cost is not a generic property, but it is usually schematic: One fixes a class of cost functions (for example, polynomial) that can be handled. A cost analysis then must come up with parameters (degree, coefficients) that constitute a valid bound (lower, upper, exact) for all inputs of a given program with respect to a cost model (# of instructions, allocated memory, etc.). If this is performed bottom up with respect to a program's call graph, it is possible to infer a cost bound for the top-level function of a program. Such a cost expression is often symbolic, because it depends on the program's input parameters. A central technique for inferring symbolic cost of a piece of code with high precision is symbolic execution (SE) [9, 25] . The main difficulty is to render SE of loops with symbolic bounds finite. This is achieved with loop invariants that generalize the behavior of a loop body: an invariant is valid at the loop head after arbitrarily many iterations. To infer sufficiently strong invariants automatically is generally an unsolved problem in functional verification, but much easier in the context of cost analysis, because invariants do not need to characterize functional behavior: it suffices that they permit to infer schematic cost expressions. Abstract Execution. To infer the cost of program transformation schemata requires the capability of analyzing abstract programs. This is not possible with standard SE, because abstract statements have no operational semantics. One way to reason about abstract programs is to perform structural induction over the syntactic definition of statements and expressions whenever an abstract symbol is encountered. Structural induction is done in interactive theorem proving [7, 31] to verify, e.g., compilers. It is labor-intensive and not automatic. Instead, here we perform cost analysis of abstract programs via a recent generalization of SE called abstract execution (AE) [37, 38] . The idea of AE is, quite simply, to symbolically execute a program containing abstract placeholder symbols for expressions and statements, just as if it were a concrete program. It might seem counterintuitive that this is possible: after all, nothing is known about an abstract symbol. But this is not quite true: one can equip an abstract symbol with an abstract description of the behavior of its instances: a set of memory locations its behavior may depend on, commonly called footprint and a (possibly different) set of memory locations it can change, commonly called frame [21] . Cost Invariants. In automated cost analysis, one infers cost bounds often from loop invariants, ranking functions, and size relations computed during SE [3, 11, 16, 40] . For abstract programs, we need a more general concept, namely a loop invariant expressing a valid abstract cost bound at the beginning of any iteration (e.g., 2 + i * (ac P (x) + 2) for the program Q above). We call this a cost invariant. This is an important technical innovation of this paper, increasing the modularity of cost analysis, because each loop can be verified and certified separately. Relational Cost Analysis. AE allows specifying and verifying relational program properties [37] , because one can express rule schemata. This extends to QAE and makes it possible, for the first time, to infer and to prove (automatically!), for example, the impact of program transformation on performance. Certification. Cost annotations inferred by abstract cost analysis, i.e., cost invariants and abstract cost bounds, are automatically certified by a deductive verification system, extending the approach reported in [4] to abstract cost and abstract programs. This is possible because the specification (i.e., the cost bound) and the loop (cost) invariants are inferred by the cost analyzer-the verification system does not need to generate them. To argue correctness of an abstract cost analysis is complex, because it must be valid for an infinite set of concrete programs. For this reason alone, it is useful to certify the abstract cost inferred for a given abstract program: during development of the abstract cost analysis reported here, several errors in abstract cost computation were detected-analysis of the failed verification attempt gave immediate feedback on the cause. We built a test suite of problems so that any change in the cost analyzer can be validated in the future. Certification is crucial for the correctness of quantitative relational properties: The inferred cost invariants might not be precise enough to establish, e.g., that a program transformation does not increase cost for any possible program instance and run. This is only established at the certification stage, where relational properties are formally verified. A relational setting requires provably precise cost bounds. This feature is not offered by existing cost analysis methods. We introduce our approach and terminology informally by means of a motivating example: Code Motion [1] is a compiler optimization technique moving a statement not affected by a loop from the beginning of the loop body to before the loop. This code transformation should preserve behavior provided the loop is executed at least once, but can be expected to improve computation effort, i.e. quantitative properties of the program, such as execution time and memory //@ assignable x; //@ accessible t, w; //@ cost footprint t, w; bstract statement P; //@ assignable y; //@ accessible i , t, y, z; //@ cost footprint t, z; bstract statement Q; Program Before int i = 0; //@ assignable x; //@ accessible t, w; //@ cost footprint t, w; bstract statement P; //@ assignable y; //@ accessible i , t, y, z; //@ cost footprint t, z; bstract statement Q; i ++; } //@ assert Compilers: Principles, Techniques, and Tools Deductive Software Verification -The KeY Book -From Theory to Practice Cost analysis of object-oriented bytecode programs A formal verification framework for static analysis -as well as its instantiation to the resource analyzer COSTA and formal verification tool KeY. Software and Systems Modeling The Parma polyhedra library: Toward a complete set of numerical abstractions for the analysis and verification of hardware and software systems A new look at the automatic synthesis of linear ranking functions Interactive Theorem Proving and Program Development -Coq'Art: The Calculus of Inductive Constructions ABC: algebraic bound computation for loops SELECT-A formal system for testing and debugging programs by symbolic execution Alternating runtime and size complexity analysis of integer programs Automated termination proofs for Java programs with cyclic data Ensuring the Correctness of Lightweight Tactics for JavaCard Dynamic Logic Automatic discovery of linear restraints among variables of a program Resource bound certification The Why/Krakatoa/Caduceus platform for deductive program verification Resource analysis of complex programs with cost equations Proving termination of programs automatically with AProVE Regression Verification: Proving the Equivalence of Similar Programs Static energy consumption analysis of LLVM IR programs SPEED: precise and efficient static estimation of program computational complexity Deductive verification: from pen-and-paper proofs to industrial tools Modular, correct compilation with automatic soundness proofs Amortized resource analysis with polynomial potential Proving the correctness of reactive systems using sized types Symbolic execution and program testing Proving Optimizations Correct Using Parameterized Program Equivalence Dafny: An automatic program verifier for functional correctness Inferring parametric energy consumption functions at different software levels: ISA vs. LLVM IR Practical Verification of Peephole Optimizations with Alive Isabelle/HOL -A Proof Assistant for Higher-Order Logic Monadic refinements for relational cost analysis The KIV-approach to software verification An automatic verifier for Java-like programs based on dynamic frames A termination analyzer for Java bytecode based on path-length REFINITY to Model and Prove Program Transformation Rules Abstract execution Abstract Execution: Automatically Proving Infinitely Many Programs Mechanical program analysis Bound analysis of imperative programs with the size-change abstraction Open Access This chapter is licensed under the terms of the Creative Commons Acknowledgments. This work was funded partially by the Spanish MCIU, AEI and FEDER(EU) project RTI2018-094403-B-C31, by the CM project S2018/TCS-4314 cofunded by EIE Funds of the EU and by the UCM CT42/18-CT43/18 grant.