key: cord-0572168-zodv3yoq authors: Fdhila, Walid; Knuplesch, David; Rinderle-Ma, Stefanie; Reichert, Manfred title: Verifying Compliance in Process Choreographies: Foundations, Algorithms, and Implementation date: 2021-10-14 journal: nan DOI: nan sha: df2e921ab8b8988d4d7046aac79bdf5073d86ef2 doc_id: 572168 cord_uid: zodv3yoq The current push towards interoperability drives companies to collaborate through process choreographies. At the same time, they face a jungle of continuously changing regulations, e.g., due to the pandemic and developments such as the BREXIT, which strongly affect cross-organizational collaborations. Think of, for example, supply chains spanning several countries with different and maybe even conflicting COVID19 traveling restrictions. Hence, providing automatic compliance verification in process choreographies is crucial for any cross-organizational business process. A particular challenge concerns the restricted visibility of the partner processes at the presence of global compliance rules (GCR), i.e., rules that span across the process of several partners. This work deals with the question how to verify global compliance if affected tasks are not fully visible. Our idea is to decompose GCRs into so called assertions that can be checked by each affected partner whereby the decomposition is both correct and lossless. The algorithm exploits transitivity properties of the underlying rule specification, and its correctness and complexity are proven, considering advanced aspects such as loops. The algorithm is implemented in a proof-of-concept prototype, including a model checker for verifying compliance. The applicability of the approach is further demonstrated on a real-world manufacturing use case. Gartner regards interoperability as "strategic imperative" 1 for healthcare. Especially the global push by digitalization and the current pandemic require the collaboration and integration of (business) partners and organizations. Process technology serves as enabler for process-oriented collaborations between distributed business partners, realized and implemented through so-called process choreographies. Applications include healthcare (34) , blockchain-based processes (47; 46) , multi-modal logistics scenarios (7; 10), and supply chains (22) . Digitalization and ongoing changes due to, for example, the pandemic situation or the BREXIT flood enterprises and organizations with updated or even new regulations at a fast pace. For example, "bank regulations change every 12 minutes". 2 Regulatory frameworks comprise application-independent frameworks such as the GDPR on "data processing boundaries of the personal data of European Union's citizens" (65) and the ISO 27001 security standard 3 as well as application-specific ones, e.g., the WHO regulations defined in the context of COVID19 4 . As a consequence, in our globalized world, regulations and their changes affect process collaborations (54) and lead to an increased need for compliance verification in process choreographies. What are the particular challenges with respect to compliance verification in process choreographies? Let us illustrate them by an example. Figure 1 depicts the choreography model of a supply chain scenario adapted from (22) . It involves five process partners, i.e., Bulk Buyer, Manufacturer, Middleman, Supplier, and Special Carrier that interact through Verifying Compliance in Process Choreographies message exchanges. First, the Bulk Buyer orders a set of products from the Manufacturer (e.g., an aircraft). The manufacturing of the product requires several sub-products (intermediates) to be provided by different suppliers. In this scenario, we assume that only one intermediate is required and provided by the Supplier. After processing the order, hence, the Manufacturer sends an order for the intermediate (e.g., the fuselage or engines) to the Middleman. Obviously, none of the GCR can be directly verified on basis of the choreography model in Fig. 1 as none of the public and message exchanging tasks corresponds to any of the tasks referred to in the GCR. Let us have a look at the public processes of the partners involved in the choreography as shown in Fig. 3 . These public process models contain all public tasks that are visible to the other partners, including the tasks that exchange messages, but also other visible tasks such as Production at the Manufacturer. Based on the public process models, C1 and C2, Verifying Compliance in Process Choreographies as depicted in Fig. 2 , can be verified: C1 refers to public tasks of the Manufacturer process, which obviously complies with C1, i.e., public task Production is followed by public task Final Test. C2 can be verified over the Supplier and Special Carrier processes. The order between tasks Pack Intermediate and Transport Intermediate is determined by the message exchange between sending and receiving Waybill Intermediate. As opposed to C1 and C2, C3 cannot be verified based on the public processes of the partners as there are no public tasks for Safety check and Get permission of authority. The presumption is that C3 also refers to private tasks of the partners, i.e., tasks that are only present in the private process models of the partners. In general, private process models of the partners implement and possibly extend the behavior of the corresponding public models. As opposed to public tasks, private tasks are not visible to the partners. Figure 4 shows the private process models of partners Special Carrier and Middleman where private tasks are highlighted in gray color. Although private tasks are usually hidden to other partners, restrictions over them might exist. In the supply chain, for example, C3 refers to private tasks Safety Check for partner Special Carrier and Get permission of authority for partner Middleman. If private tasks are affected by a GCR, no information about how and when these tasks are executed, or how they are connected to other nodes of the corresponding private process model, becomes visible to the other partners. Usually, this happens when a collaborating partner p1 imposes the execution of a specific task that must exist in its private process and comply with a given rule involving another partner p2. Partner p1 should then assure the existence of such task and that it follows the imposed rule. As can be seen from the example depicted in Figs. 1-4, GCRs constrain actions of multiple partners and/or the interactions between them. Ensuring the compliance of process choreographies with a GCR is crucial and challenging (32) as a partner "only has the visibility of the portion of the process under its direct control" (48) . Reconsider GCR C3 as an example. It asks for a safety check accomplished by a private task of the Special Carrier. To cope with this issue, assertions can be used. An assertion (A) corresponds to a commitment of a partner guaranteeing that its private/public process complies with the imposed rule (22) . Figure 5 depicts the two assertions A1 and A2: the Middleman agrees to get the permission of the authority before ordering the special transport (A1). Moreover, the Special Carrier commits to perform a safety check before transporting the intermediate (A2). In combination, assertions A1 and A2 enable Verifying Compliance in Process Choreographies Overall, this leads to the overarching research question RQ tackled in this work: In literature, there is a "knowledge gap" when it comes to compliance verification in process choreographies (32) . (48) tackles the problem of checking a GCR on private tasks based on IoT-enabled artifacts. However, not all process choreography settings with compliance requirements feature IoT-enabled artifacts. Hence, this works aims at providing a formal approach that is independent of any technology or application. The central idea is to decompose the GCR into assertions in a lossless way, i.e., the verification of all assertions guarantees the one of the GCR. Note that this solves the problem as assertions may be checked separately by each of the partners. Hence, infringing the privacy of any partner is avoided. The decomposition algorithm presented in this article exploits transitivity properties of the underlying GCR specification and was originally presented in (23) . The decomposition relies on transitivity properties of the underlying GCR specification. The transitivity properties are shown by the example of a translation to first order predicate logic. Similarly, for example, (57) presents declarative patterns based on Linear Temporal Logic (LTL). In our approach, GCRs are specified in a pattern-based and visual way using the eCRG formalism (40) . This means that a GCR may be composed of so called antecedence patterns and consequence patterns. The patterns can be connected reflecting pre-/post-conditions of the respective GCR. C1 in Fig. 2 , for example, connects antecedence pattern Production with consequence pattern Final test, demanding that after the production a final test is required. Note that antecedence and consequence patterns may require occurrence (i.e., something must happen) and absence (i.e., something must not happen). In (23), we relied on simple rules that consist of single antecedent and multiple occurrence patterns. Aside the decomposition algorithm itself, (23) provides basic proofs, simple GCR decomposition scenarios, and the embedding of the approach in the overall digitalized change and compliance management framework C 3 Pro. 5 This article extends and elaborates the results presented in (23) in several directions: • We allow for additional and more complex compliance rules with multiple antecedence patterns. This significantly increases the complexity of the theoretical considerations as well as the one of the provided GCR decomposition scenarios. As a result, we obtain new theorems and algorithms. • The decomposition proofs are extended to cover the additional complexity of the GCR; in particular they now consider loops as well. • The decomposition algorithm with extensions is prototypically implemented and integrated with the C 3 Pro framework, which deals with both change and compliance in process choreographies. • A model checker for verifying decomposition correctness is provided. • A manufacturing use case illustrates the applicability of the approach. Specifically, the use case demonstrates the applicability of the approach beyond regulatory compliance, i.e., it shows how decomposition can be used to lift implicit connections to explicit assertions. • The related work section is significantly extended. The remainder of the paper is structured as follows: Section 2 provides fundamentals required for understanding this work, and Section 3 introduces the foundations for GCR decomposition (including transitivity theorems). Section 4 then presents the decomposition algorithm for global compliance rules, whereas Section 5 deals with the automated verification of the resulting decompositions based on model checking. Sections 6 and 7 cover the evaluation of the approach, i.e., the implementation and application of the algorithms. Section 8 discusses related work. Section 9 concludes the paper with a summary and an outlook. This section presents definitions and formal backgrounds for global compliance rules (GCRs) to be obeyed by a process choreography . A choreography includes three types of overlapping models: (i) a private model representing the executable process and including both private and public activities (see Fig. 4 for examples of private process models), (ii) a public model (also called the interface of the process) that solely includes the public activities and the interactions of a given partner (see Fig. 3 for the public process models of our running example), and (iii) a choreography model providing a global view on the interactions between all partners (see Fig. 1 for the choreography model of our running example) (21) . Compared to (21; 2), this paper assumes that public activities are not necessarily interactions with other partners, but may additionally represent tasks made visible by the corresponding partners. Therefore, both interactions and non-interaction public activities of a single partner are described in a public model. The latter serves as public (restricted) view on the private model of the partner, which "describes the internal logic of a partner including its private and public activities" (6) . For a formal definition of process choreography, we refer to Definition 1. (21)). We define a choreography as a tuple (, , Π, , , Γ, ) where - is the set of all participating partners. - is the choreography model representing the interactions  between partners in  (cf. Fig. 1 ). -Π = { } ∈ is the set of all private models (cf. Fig. 4 ). - = { } ∈ is the set of all public models (cf. Fig. 3 ). -= { ∶ ↔ } ∈ is a partial mapping function between nodes of the public and private models. -Γ: ↔ ′ is a partial mapping function between nodes of different public models. -∶  ↔ is a partial mapping function between nodes of the choreography model and the public models. As depicted in Figures 1, 3 and 4 , choreography, public and private models are defined as graphs, where nodes are either activities (i.e., interaction, public or private activities) or gateways (e.g., sequence, exclusive or parallel), and arrows are the dependencies between them. As described above, each of these three models use specific type of activity nodes (e.g., interaction activities for choreography models). Because the focus of this paper is mainly on GCR decomposition, we abstract their respective formal definitions, but the reader may refer to (21) for more details. While function maps the activities of the public models to those of the private models, function Γ determines the dependencies between the interactions of different public models (e.g., Γ( _ ( _ )) = _ ( )). Finally, function represents the dependencies between the activities in the choreography model and those of the public models (e.g., . Note that in the examples above, connected interaction activities (i.e., the send and the corresponding receive) of different public models have the same labels, while in practice, it is possible to have them different. Based on functions and Γ, certain soundness properties of choreography can be checked, including structural and behavioral compatibility between public models, and consistency between public and private models (15) . Structural consistency requires that for each public activity of the public model of a partner p, there should be a matching element in the corresponding private model of p, but not vice versa (21) . Structural compatibility states that for each interaction activity of the public model of a partner p, there should be a matching interaction activity in the public model of another partner. Note that this is a necessary, but not yet sufficient condition for ensuring compatibility and consistency-the models' behaviors (control flow dependencies) should also be compatible and consistent. In this paper, we assume that the choreography is sound. In previous work (20) , multiple formal languages employed for business process compliance modelling and checking (e.g., linear temporal logic LTL, event calculus EC, extended compliance rule graph eCRG) were compared. It was shown that most of these languages can deal with most qualitative time patterns, and can therefore be used to model the compliance constructs addressed in this paper. Similar results were proven in (30) . In order to specify these constructs as well as transitivity properties required for the GCR decomposition, this work utilizes the visual eCRG (extended Compliance Rule Graph) language (42; 38; 36). The eCRG language offers a visual rule notation for expressing compliance rules over process choreographies and is based on first order predicate logic (cf. Fig. 6 ). To distinguish between a precondition (i.e., antecedence) and corresponding postconditions (i.e., consequences), an eCRG can be partitioned into an antecedence pattern and a consequence pattern. The antecedence pattern specifies when the compliance rule is triggered (i.e., activated), whereas the consequence pattern specifies what the rule demands. As compliance rules may require the occurrence or absence of certain activities or interactions (i.e., message exchanges), the antecedence and consequence patterns are further sub-divided into occurrence and absence nodes. Sequence conditions between these events can be expressed using directed connectors between the respective nodes. We use the following notation: : Antecedence occurrence; : Antecedence absence; : Consequence occurrence; : Consequence absence. Fig. 6 introduces the elements of the eCRG language. For a formal definition of eCRG, we refer to Def. 2. • ∶ →  returning the partner responsible for a node. • ∶ × → { ⤏,→, ∅} returning the sequence flow connector between two nodes, i.e., consequence sequence and antecedence sequence connectors respectively. • ∶ →  ∪  mapping each node to an activity or an interaction (i.e., message exchange). Think of an eCRG as a graph of connected nodes, where each node is assigned to a particular partner (e.g., in C1, ( ) = ). A node may either be a private, non-interaction public activity, or an interaction (see Figure 6 ). Given two nodes of an eCRG, function returns the sequence flow connector as depicted in Figure 6 , where a dashed arrow (i.e., consequence connector) connects an antecedence pattern to set of consequence patterns (e.g, 1: After production a final test is required), and an antecedence connector expresses a relation between antecedence patterns solely (i.e, the pre-condition). For example, assume that we change 3 as follows: Then: if the pre-condition (i.e., execution of activity _ _ _ ℎ followed by the one of activity _ ℎ ) is met, the post condition (i.e., activity _ ) will be triggered. Finally, function pattern evaluates whether a node is an antecedence or consequence, and whether or not it should occur. This section introduces the theoretical foundations for the decomposition of global compliance rules (including theorems and proofs) illustrated by a number of examples, which we have extracted from the application scenario introduced in Section 1. Section 3.1 first describes the basic idea of our approach (i.e., why do we need to decompose a GCR), followed by the presentation, proofs and illustrations of the theorems in Section 3.2. Our method for the decentralized checking of global compliance rules relies on the decomposition of the original GCR into a set of assertions that can be checked locally by each partner and collectively reproduce the behavior of the GCR (cf. Fig. 7) . A communication between partners is only required in the setup phase to deduct the assertions. During runtime, however, no further compliance-related communication becomes necessary for checking the GCR unless a local assertion becomes violated. The decomposition of a GCR into a set of assertions is based on well-grounded theorems, which ensure that if a conjunction of hypotheses is true, the conclusion (GCR) is true as well. In this section, we provide a decomposition method for selected global compliance patterns and show how they can be applied in a collaborative setting. In particular, we prove a set of theorems required for ensuring the correctness of our decomposition method. Each theorem represents a possible decomposition of a given compliance pattern. We illustrate the translation of a GCR into a First Order Logic (FOL) expression using basic equivalences as in Def. 3. Based on (41) , the following equivalences hold by definition. Predicate ( , ) describes that at the point in time an activity (message) of type was executed (i.e., sent or received). . Thereby, relation < expresses a temporal precedence between points in time and . The decomposition algorithm presented in Section 4 exploits the transitivities for GRC as in Theorem 1. Specifically, by combining transitive relations, where each relation can be checked locally by a single partner, it becomes possible to reconstruct the original GCR behavior. Theorem 1 ensures that the behavior of the derived assertions reproduces the behavior of the GCR, but not vice versa. Let , , and be three activity or message types. Then: a. The rightwards transitivity holds: The leftwards transitivity holds: In the following, the correctness of Theorem 1 is proven applying Def. 3. Let , , and be three activities or interactions. Then ⤏ ∧ ⤏ Leftwards transitivity can be proven similarly by replacing '<' with '>'. In the following, we use Examples (1) -(3) (cf. Fig. 8 ), which we extracted from our running example (cf. Figs. 3 and 4), in order to illustrate how we use Theorem 1 for decomposing a simple compliance rule of type ⤏ that involves two private tasks and of two different partners 1 and 2 respectively. • Example (1): In this example, both activities are private, which would normally require and _ to share runtime information about the execution time of the respective activities. In turn, this would require an agreement on a time synchronization protocol that considers network failures and message transmission delays. This can be solved by identifying a transitive relation between the two private activities that include an interaction activity. According to Theorem 1, the interaction activity _ _ between and _ fulfills the conditions The behavioral and structural compatibility (cf. Section 2) between the partner processes ensures that message _ _ sent by shall be correctly received by _ . By locally checking 1 and 2 by and _ respectively, we can ensure that the original GCR is not violated as long as the assertions are not violated. If one assertion is violated, a communication between the two partners will become necessary. Note that this violation does not necessarily mean that the original GCR is violated. For example, assume that for a given process instance, assertion 1 evaluates to true, and _ executeś activity _ ℎ before the message arrival. Although this would result in 2 being evaluated to false, it does not necessarily mean that _ ℎ is executed before _ _ _ ℎ . • Example (2): _ ⤏ _ . In this example, and do not engage in any direct interaction. However, by looking at the public processes of the collaboration model from Fig. 3 , it becomes possible to identify a double transitive relation through , which interacts with both partners. Therefore, using Corollary 1, the transitive relations (assertions): , which has initially not been involved in the GCR, becomes involved in the derived assertions. We call an intermediary partner. In this example, it is not possible to identify any transitive relation between and _ that involve private activities _ and _ ℎ . The interaction activity _ _ connects both partners immediately after the activities in question, which discards any possibility of a transitive relation. In this case, it is not possible to satisfy Theorem 1 and, hence, additional message exchanges become necessary to inform about the execution state of the activities involved in the GCR. Message exchanges can be either synchronous or asynchronous. Asynchronous message exchange only allows for reactive GCR checking and, therefore, detecting violations after their occurrence. Synchronous message exchange, in turn, is proactive as it enforces the GCR with new restrictions to the process models, e.g., the execution of activity _ ℎ becomes enabled only after receiving a synchronization message (i.e., about whether or not _ is executed). shall also inform _ in case activity _ is not executed, as this does not prevent activity _ ℎ from being executed according to the GCR. Rightwards transitivity (cf. Theorem 1.a) directly ensures the correctness of the assertions derived in the above examples. It should be clear that the correctness of the derived assertions in Example (2) can be directly concluded based on Corollary 1. The same examples can be also used to illustrate leftwards transitivity. Let , , and be three activity or message types. Then: a. The rightwards zig zag transitivity holds for the consequence absence: The leftwards zig zag transitivity holds for the consequence absence: Let , , and be activities or interactions. Then: Leftwards zig zag transitivity of absence can be proven similarly by replacing '<' with '>' and '≤' with '≥'. In the following, we use Examples (4) and (5) from Fig. 9 to illustrate and discuss how Theorem 2 can be used to decompose a GCR of type rightwards zigzag ⤏ . Note that these two examples are adopted from the running example we introduced in Section 1 in order to fulfill the decomposition requirements. • Example (4): In this example, _ and _ _ in _ belong to different XOR branches, which means that the execution of activity _ implies the non-execution of activity _ _ and vice versa (fulfilling assertion 1 _ _ ⤏ _ ). Additionally, in , the interaction activity _ _ and the private activity _ _ belong to the same XOR branch, and fulfill assertion 2 _ ⤏ _ _ . According to Theorem 2.a, the conjunction of 1 and 2 reproduces the behavior of the original GCR. Note that process compatibility ensures that whenever sending _ _ occurs in _ , receiving _ _ should occur in as well. At the presence of loops that encapsulate the depicted process part of _ , the XOR fragment can be executed multiple times possibly leading to an alternate execution of the corresponding branches. For example, if in the first loop iteration, _ is executed and _ _ is not executed, then, to this point both derived assertions are satisfied. Let us assume that a future iteration over the XOR fragment in the context of the same process instance triggers _ _ execution, thus, violating the original GCR. iteration 1: } By looking at the combined trace, it becomes clear that the GCR is violated. Unfortunately, this would require both partners to exchange the traces and use a common time stamping system to obtain the same chronological order of activities. Using the theorems, however, the can locally run its derived assertion against its own execution trace of the same process instance, and identify the violation. Note that the decomposition does not enforce the processes with new restrictions (except when no transitivity could be derived), but uses the existing control flow and interactions between partners to derive assertions that can be used for a decentralized checking of the original GCR. • Example (5): The second part of the decomposition can be directly derived from the process control flow of _ : _ ⤏ _ _ . The same reasoning applies to this example at the presence of loops. The correctness of Example (5) concludes from the leftwards zig zag transitivity (cf. Theorem 2.b), whereas Example (4) relies on the rightwards zig zag transitivity of the absence (cf. Theorem 2.a). The decomposition process is not limited to these scenarios and, as aforementioned, the decomposition cannot always be automated, but might require manual interaction and processing. Altogether, the decomposition eases global compliance rule checking, where each partner checks its corresponding derived assertions locally. A GCR is rechecked only if at least one of the derived assertions is not evaluated to true. Note that this does not necessarily imply that the GCR is violated. Let , , , and be activities or interactions such as → ⤏ ⤏ : if and occur, and shall occur afterwards. Let 1 , 2 , and 3 be three interactions such as: ( In this example, all activities involved in the GCR are private and belong to separate partners. According to the process models shown in Figs. 3 and 4, each partner can separately derive the corresponding assertion based on Theorem 3 and involving the corresponding activity in the GCR. _ _ ⤏ In this example, partners will first engage in a setup phase, in which they agree on the interaction activities that satisfy all derived assertions following the assertions' templates of Theorem 3. For example, will start by identifying relations in its process of type _ ⤏ _ _ ℎ , where _ must be a message exchange with that is the partner being responsible for the following antecedence occurrence _ . In this example, and have only one interaction that satisfies the derived assertion (1); however, it is also possible to identify several alternatives. The combination of the four derived assertions reproduce the behavior of the original GCR when all assertions are true. The following theorem represents a generalization of Theorem 3 with antecedences' occurrences and consequences' occurrences. Note that the previous example also illustrates Theorem 4 with = 2 and = 2. should occur afterwards such that ∀ < , < +1 holds. Let , where 1