key: cord-0046676-segf4wj9 authors: Kochemazov, Stepan title: Improving Implementation of SAT Competitions 2017–2019 Winners date: 2020-06-26 journal: Theory and Applications of Satisfiability Testing - SAT 2020 DOI: 10.1007/978-3-030-51825-7_11 sha: bbeccaaf4cf6615773e081a5d2f3a552b2f303e4 doc_id: 46676 cord_uid: segf4wj9 The results of annual SAT competitions are often viewed as the milestones showcasing the progress in SAT solvers. However, their competitive nature leads to the situation when the majority of this year’s solvers are based on previous year’s winner. And since the main focus is always on novelty, it means that there are times when some implementation details have a potential for improvement, but they are just inherited from solver to solver for several years in a row. In this study we propose small modifications of implementations of existing heuristics in several related SAT solvers. These modifications mostly consist in employing a deterministic strategy for switching between branching heuristics and in augmentations of the treatment of Tier2 and Core clauses. In our experiments we show that the proposed changes have a positive effect on solvers’ performance both individually and in combination with each other. The Conflict-Driven Clause Learning (CDCL) solvers [13] form the core of the algorithms for solving the Boolean satisfiability problem (SAT) [4] . Every year the community proposes new heuristics aimed at improving their performance. To test them in close to real-world conditions, the SAT competitions are held. They evaluate the prospective solvers' performance in the same computational environment over the sets of test instances gathered from various areas where the SAT solvers are applied. To show that a new heuristic for SAT solving contributes to state of the art, it is usually implemented on top of one of the well-known SAT-solvers such as MiniSAT [5] , glucose [1] , CryptoMiniSat [20] , Cadical [2] or, more often, the most recent SAT competition(s) winner (e.g. [8, 12, 17] ). The problem with the latter is that the solver with a new heuristic must be compared with the original solver without modifications to show that the increase in performance is not due to fixing some aspects of original solver's implementation. This is not a bad thing as it is, but it can lead to solvers accumulating undesired traits. One of the examples of the latter is the non-deterministic switching between branching heuristics, first introduced in MapleCOMSPS [9] which won in the main track of SAT Competition 2016. This behavioral trait was inherited by the winners of SAT Competitions 2017 and 2018, and many participants of the SAT Race 2019. In the present paper we propose several changes to the common implementation aspects of the solvers that won at the main tracks of SAT Competitions 2017 to 2019. The main contributions are as follows: 1. We analyze possible variants of deterministic switching between branching heuristics and show that some of them yield consistently better results compared to the non-deterministic switching at 2500 s. 2. We adjust the treatment of the so-called Tier2 clauses and show that accumulating them in a slightly different manner results in better overall performance. 3. We show that contrary to the intuition provided in [18] , it is better to sometimes purge some learnts from the Core tier in order to increase the propagation speed and the effectiveness of a solver on hard instances. We use the MapleLCMDistChronoBT solver as the main object for analysis and experimentation. To evaluate improvements and modifications we use the set of SAT instances from the SAT Race 2019. After having figured out the modifications that serve our goals best, we implement them in the winners of SAT Competition 2017 and SAT Race 2019. Then we test the resulting six SAT solvers on a wide range of benchmarks. The presentation of the details on implementation of heuristics in SAT solvers implies that the reader is familiar with the architecture of CDCL SAT-solvers in general and with key advancements in the area during the recent years. The SAT competitions are annual competitive events aimed at the development of SAT solving algorithms. In the course of the main track of the competition, the solvers are launched on each test instance with the time limit of 5000 s. The performance is evaluated using two criteria: Solution-Count Ranking (SCR) -the number of test instances successfully solved by an algorithm within the time limit, and Penalized Average Runtime (PAR-2) computed as the sum of solver runtimes on solved instances plus 2× the time limit for unsolved ones, divided by the total number of tests. In the paper we study the solvers MapleLCMDist, MapleLCMDistChronoBT and MapleLCMDistChronoBT-DL-v3, which won at SAT Competitions 2017 to 2019. All three solvers are based on the MapleCOMSPS solver that won at SAT Competition 2016 [8] . The latter uses the foundation of COMiniSatPS [18] , which combined the better traits of the well-known MiniSAT [5] and Glucose [1] solvers. One of the main novelties of COMiniSatPS was the special treatment of learnt clauses depending on their literal block distance value (lbd). The notion of lbd was first proposed in [1] and is equal to the number of distinct decision levels of the literals in a learnt clause. COMiniSatPS splits all learnt clauses into three tiers: Core, Tier2 and Local and handles them differently. The second major novel feature of COMiniSatPS was to use two sets of activity values for the branching heuristic. In particular, the solver employed two sets of scores maintained via Variable State Independent Decaying Sum (VSIDS) [16] . They used different VSIDS decay values, but what's more important, their use was tied to periodic switching between restart strategies. For years after the introduction of Minisat [5] , the SAT solvers mainly relied on luby restarts [11] , but after the triumphal appearance of Glucose [1] , the CDCL solvers mostly switched to the much faster glucose-restarts. COMiniSatPS combined both, and used one set of VSIDS scores with luby restarts and another set with glucose restarts. In 2016 the Learning Rate Branching (LRB) heuristic [8] was proposed as an alternative to VSIDS. It was implemented in MapleCOMSPS as well as in several other solvers [9] . MapleCOMSPS is based on COMiniSatPS and uses LRB branching together with luby restarts and VSIDS-branching with glucose restarts. Unfortunately, the solver did not inherit the deterministic switching strategy employed by COMiniSatPS to combine different phases of solving. Instead, it relies on LRB + luby restarts for the first 2500 s of the search and VSIDS + glucose restarts for its remainder, which results in a non-deterministic behaviour. Having realized their oversight, the MapleCOMSPS's authors submitted to SAT competitions 2017-2019 only the deterministic variants of their solvers. The winner of the main track of the SAT Competition 2017, MapleLCMDist introduced Learnt Clause Minimization [12] into MapleCOMSPS. It is an expensive inprocessing method [6] periodically applied to learnt clauses from Tier2 and Core with the goal to remove irrelevant literals from them. The main track winner of SAT Competition 2018 called MapleLCMDistChronoBT had augmented MapleLCMDist with chronological backtracking [17] , which was later studied in [15] . Finally, the winner of SAT Race 2019 was MapleLCMDistChronoBT-DL-v3, which used the duplicate learnts (DL) heuristic [7] , that tracks duplicate learnts and sometimes adds them into Tier2 or Core. In this section we describe the three-phase experiment aimed at improving the implementation of MapleLCMDistChronoBT. The main question we strive to answer is if it is possible to retain its good overall performance but make the solver deterministic. In the first phase of experiments we implement several deterministic strategies for switching between branching heuristics and evaluate their performance. In the second phase we use the best-behaved switching strategy from the first phase to check whether it is possible to forego one of the branching heuristics and let the solver work with a single one. In the third phase we propose and implement a small change to handling Tier2 and Core learnt clauses and see how it affects the general performance of a solver. In all experiments we used a single node of the "Academician V.M. Matrosov" computing cluster of Irkutsk Supercomputer Center [14] . It is equipped with two 18-core Intel Xeon E5-2695 CPUs and 128 GB DDR4 RAM. The solvers were launched in 36 simultaneous threads. First we combed through the available deterministic implementations in the solvers from recent SAT competitions and found several variants for switching between branching heuristics that are summed up in Table 1 . All of them assume that VSIDS is used with glucose restarts and LRB with luby restarts, thus we refer to them as to VSIDS and LRB phases, respectively. The fcm column corresponds to the implementation in COMiniSatPS. The variants f1 and f2 denote the implementations first appeared in MapleCOMSPS LRB [9] and MapleCOMSPS LRB VSIDS 2 [10] , respectively. The v3 variant employs the scheme used in the MapleLCMDistChronoBT-DL-v3 solver [7] , which uses propagations to measure the phases' sizes. All variants in Table 1 use phase allotment variable to store the number of conflicts (or propagations) allocated for the next phase. They all first launch the LRB phase and then the VSIDS phase. Some of them allocate more resources to the VSIDS phase, in that case the value in the VSIDS multiplier row of Table 1 is different from 1. The value of phase allotment is multiplied by a constant after each LRB/VSIDS phase and the cycle repeats anew. Actually, all strategies but v3 only increase the value of phase allotment at the end of the VSIDS phase. We implemented the strategies from Table 1 in MapleLCMDistChronoBT. The performance of the resulting 4 solvers on the benchmarks from SAT Race 2019 is presented in form of a cactus plot in Fig. 1 with the corresponding detailed statistics in Table 1 . They are accompanied by the original solver and 5 other solver variants, which will be described in detail below. Two of them (-LRB and -VSIDS) employ only a single branching heuristic and the other three use different combinations of augmented handling of Tier2 clauses (-t) and the procedures for reducing Core database (-rc). MapleLCMDistChronoBT-VSIDS-f2 203 115 88 5589 MapleLCMDistChronoBT-f2-t 236 139 97 4720 MapleLCMDistChronoBT-f2-rc 240 143 97 4631 MapleLCMDistChronoBT-f2-t-rc 242 143 99 4556 As it can be seen from both Table 2 and Fig. 1 , the best deterministic variant in the considered conditions is f2. Its distinctive feature is the phase allotment multiplier equal to 2. Thus, the solver alternates between long phases of rapidly increasing size in which it relies on a single heuristic. The similar behaviour is achieved by the v3 scheme, where a large number of propagations is allocated to phases. Compared to f2, v3 starts from longer phases, that grow in size significantly slower. It was first noted in [18] and later reaffirmed in [3] that it is beneficial to maintain different variable activity values for use with different restart strategies. Also, in [18] it was pointed out that a strong correlation exists between the effectiveness of glucose restarts for proving unsatisfiability and that of Luby restarts for finding satisfying assignments. Based on the success of the solvers that employ Luby restarts with LRB and glucose restarts with VSIDS it is easy to make a conclusion that LRB shows better performance on SAT and VSIDS on UNSAT instances. To evaluate at a glance the highlighted issues we implemented two versions of the MapleLCMDistChronoBT-f2: one that uses only a single set of LRB activity values, and one that relies only on VSIDS. They both still switch between luby restarts and glucose restarts deterministically after the f2 fashion. We denote them as MapleLCMDistChronoBT-LRB-f2 and MapleLCMDistChronoBT-VSIDS-f2, respectively. The results of these solvers are included in Fig. 1 and Table 2 . They point at the following two conclusions. First, that indeed two separate sets of activities for branching during different restart strategies work better than a single set. Second, that the relation between LRB, luby restarts and satisfiable instances, and VSIDS, glucose restarts and unsatisfiable instances is more complex than it might seem and warrants further investigation that goes out of the scope of the present paper. Recall, that in [18, 19] , it was proposed to split learnt clauses into three tiers. Core tier accumulates and indefinitely holds the clauses with lbd not exceeding the value of the core lbd cut parameter, which is equal to 3 or 5 depending on a SAT instance. Tier2 stores for a prolonged time period the clauses with slightly higher lbd than necessary to be included into Core (with core lbd cut