A Formal Framework for Representing Mechanisms?

Gebharter, A. (2014). A formal framework for representing mechanisms? Philosophy of Science, 81(1), 138-153. https://doi.org/10.1086/674206
Alexander Gebharter*†

In this article I tackle the question of how the hierarchical order of mechanisms can be represented within a causal graph framework. I illustrate an answer to this question proposed by Casini, Illari, Russo, and Williamson and show by means of an example that their formalism does not support two important features of nested mechanisms: (i) a mechanism’s submechanisms typically interact causally with other parts of said mechanism, and (ii) intervening in some of a mechanism’s parts should have some influence on the phenomena the mechanism brings about. Finally, I sketch an alternative approach taking (i) and (ii) into account.

Received January 2013; revised July 2013.

*To contact the author, please write to: Düsseldorf Center for Logic and Philosophy of Science (DCLPS), Heinrich Heine University Düsseldorf, Universitätsstraße 1, 40225 Düsseldorf, Germany; e-mail: alexander.gebharter@phil.hhu.de.

†This work was supported by Deutsche Forschungsgemeinschaft (DFG), research unit Causation | Laws | Dispositions | Explanation (FOR 1063). My thanks go to Lorenzo Casini, Stuart Glennan, Jens Harbecke, Phyllis McKay Illari, Marie I. Kaiser, Gerhard Schurz, Paul Thorn, Matthias Unterhuber, Ioannis Votsis, and Jon Williamson for their input and important discussions. Thanks also to Christian J. Feldbacher, Sebastian Maaß, Alexander G. Mirnig, and Lucia M. Pichler as well as to two anonymous referees for constructive criticism on an earlier version of the article. An earlier version of this article won a best paper award at the 8th International Conference of the Association for Analytic Philosophy (GAP.8).

Philosophy of Science, 81 (January 2014) pp. 138–153. 0031-8248/2014/8101-0008$10.00 Copyright 2014 by the Philosophy of Science Association. All rights reserved.

This content downloaded from 129.125.019.061 on October 29, 2018 04:22:23 AM. All use subject to University of Chicago Press Terms and Conditions (http://www.journals.uchicago.edu/t-and-c).

1. Introduction. In many scientific fields phenomena are explained or predicted by pointing at their underlying mechanisms. Such mechanisms are thought of as concrete entities located at specific regions in space-time that produce the respective phenomena. They are characterized by formulations like this: “A mechanism underlying a behavior is a complex system which produces that behavior by the interaction of a number of parts according to direct causal laws” (Glennan 1996, 52). For alternative formulations, see, for example, Machamer, Darden, and Craver (2000, 3), Bechtel and Abrahamsen (2005, 423), and Illari and Williamson (2012, 120).

According to mechanists, mechanisms are dynamic causal systems; they are wholes consisting of several spatiotemporally arranged and interacting parts producing certain behavior. Besides having these properties, mechanisms are oftentimes (but not always) self-regulating systems including many feedback loops. Typically (but not necessarily), they are also hierarchically organized (i.e., they consist of several interacting submechanisms that may themselves be built up of submechanisms, etc.). The more is known about the structure of these submechanisms, the more accurate the predictions of the phenomena these mechanisms bring about will typically be.

Although characterizations like the one formulated by Glennan (1996) above are intuitively quite clear, they are not as helpful as one may hope when it comes to quantitatively precise explanations or predictions of phenomena of interest.
This deficit can easily be seen by means of the following example: the question of why a car speeds up when the gas pedal is pressed can be answered by pointing at or describing the underlying mechanism (i.e., the motor and how it is connected to the gas pedal, the wheels, the gas tank, etc.), but questions including numerical details, like why the acceleration of the car is a when the gas pedal is pressed with pressure p, cannot be answered that easily. The answer to a question like the latter requires a formalism capable of capturing and computing the numerical effects of specific manipulations of said mechanism.

Such a formalism must be able to represent the above-mentioned characteristic properties of mechanisms in an adequate way. Casini et al. (2011) propose to model mechanisms on the basis of so-called recursive Bayesian networks, which were originally developed by Williamson and Gabbay (2005) to model nested causal relationships. In doing so they focus on an adequate representation of the hierarchical structure of mechanisms and represent submechanisms by means of a recursive Bayesian network’s vertices. In section 2 I will briefly introduce the formal preliminaries needed to take a closer look at their approach and explain their account by means of a very simple toy mechanism. In section 3 I will highlight two problems with Casini et al.’s approach: their approach (i) does not allow for a graphical representation of how a mechanism’s macrovariables are causally connected to the mechanism’s causal microstructure, which is essential when it comes to mechanistic explanation, and it (ii) leads to the fatal consequence that a mechanism’s macrovariables’ values cannot be changed by any intervention on the mechanism’s microstructure whatsoever and, thus, contradicts the fact that scientists regularly perform so-called bottom-up experiments to investigate which of the mechanism’s parts are constitutively relevant.
In section 4 I present an alternative approach for modeling nested mechanisms: submechanisms should not be represented by means of a causal graph’s vertices, as in Casini et al.’s approach, but rather by means of a causal graph’s edges. Finally, I demonstrate using the above-mentioned exemplary mechanism that this approach does not fall prey to problems (i) and (ii).

2. Bayesian Networks, Recursive Bayesian Networks, and the Recursive Bayesian Network Approach. A Bayesian network (BN) is a triple $\langle V, E, P \rangle$ that satisfies the so-called Markov condition (MC). $G = \langle V, E \rangle$ is a graph whose vertices (i.e., the elements of $V$) are random variables that may take a number of different values, while $E$ is a binary relation on $V$ ($E \subseteq V \times V$). The elements $\langle X, Y \rangle$ of $E$ are called edges and can be graphically represented via different kinds of lines or arrows in $G$. A BN’s associated graph is always a directed acyclic graph (DAG), that is, a graph whose edges are arrows ($X \to Y$) and that does not contain a substructure of the form $X \to \dots \to X$. $P$ is a joint probability distribution over the random variables in $V$.

DEFINITION 1. $\langle V, E, P \rangle$ satisfies the Markov condition if and only if $INDEP(X, V \setminus Des(X) \mid Par(X))$ holds for all $X \in V$. (Spirtes, Glymour, and Scheines 2000, 11)

In this definition ‘$Des(X)$’ stands for the descendants (i.e., the successors) of $X$ in graph $G = \langle V, E \rangle$, ‘$Par(X)$’ for the parents (i.e., the direct predecessors) of $X$ in graph $G = \langle V, E \rangle$, and ‘$INDEP(X, Y \mid Z)$’ for probabilistic independence of $X$ and $Y$ conditional on $Z$ (i.e., $P(x \mid y, z) = P(x \mid z)$ for all $X$-, $Y$-, and $Z$-values $x$, $y$, and $z$, respectively, provided $P(y, z) > 0$).
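The independence relation used in definition 1 can be checked numerically on a small case. The following is a minimal sketch, not part of the original formalism: it assumes a hypothetical three-variable chain X → Y → Z with binary variables and made-up conditional probabilities, builds the joint distribution, and tests $P(x \mid y, z) = P(x \mid z)$ directly. The helper names `marginal` and `indep` are illustrative only, and the check compares Z with its nonparent nondescendant X, as is common practice.

```python
from itertools import product

# Hypothetical CPTs for a chain X -> Y -> Z (all variables binary).
p_x = {0: 0.7, 1: 0.3}
p_y_given_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # p_y_given_x[x][y]
p_z_given_y = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}  # p_z_given_y[y][z]

# Joint distribution over value tuples (x, y, z); positions 0, 1, 2.
joint = {
    (x, y, z): p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]
    for x, y, z in product((0, 1), repeat=3)
}

def marginal(joint, keep):
    """Sum the joint over all tuple positions not listed in `keep`."""
    out = {}
    for values, p in joint.items():
        key = tuple(values[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def indep(joint, a, b, cond, tol=1e-12):
    """INDEP(A, B | C): P(a | b, c) = P(a | c) whenever P(b, c) > 0.
    `a` and `b` are tuple positions; `cond` is a list of positions."""
    p_abc = marginal(joint, [a, b] + cond)
    p_bc = marginal(joint, [b] + cond)
    p_ac = marginal(joint, [a] + cond)
    p_c = marginal(joint, cond)
    for (va, vb, *vc), p in p_abc.items():
        vc = tuple(vc)
        if p_bc[(vb, *vc)] > 0:
            lhs = p / p_bc[(vb, *vc)]
            rhs = p_ac[(va, *vc)] / p_c[vc]
            if abs(lhs - rhs) > tol:
                return False
    return True

# Z (position 2) is independent of its nondescendant X (position 0)
# given its parent Y (position 1), but not unconditionally.
z_screened_off = indep(joint, 2, 0, [1])
z_dependent_on_x = not indep(joint, 2, 0, [])
```

With these numbers Z is screened off from X by Y, while X and Z remain dependent when nothing is conditioned on, which is the pattern the Markov condition demands for a chain.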
BNs can be causally interpreted; that is, they can be understood as a certain type of causal model. When doing so, a BN’s associated graph $G$ represents the causal structure of the system of interest: ‘$X \to Y$’ in such a causal graph $G$ stands for ‘$X$ is a direct cause of $Y$ in $G$’, and a chain of (one or more) arrows (i.e., a directed path) going from $X$ to $Y$ stands for ‘$X$ is a (direct/indirect) cause of $Y$ in $G$’. A structure of the form $X \leftarrow \dots \leftarrow Z \to \dots \to Y$ is called a common cause path between $X$ and $Y$.

When one uses BNs for causal modeling, MC is also interpreted causally. Under its causal interpretation, MC becomes the so-called causal Markov condition (CMC) that is satisfied by a causal model $\langle V, E, P \rangle$ if and only if every $X \in V$ is probabilistically independent of all its noneffects conditional on its direct causes (cf. Spirtes et al. 2000, 29).[1]

[1] CMC is the generalization of an idea that can be traced back to Reichenbach (1956): correlated effects are screened off each other by conditionalizing on their common causes; effects are screened off their indirect causes by conditionalizing on their direct causes.

The graph $G = \langle V, E \rangle$ of a BN satisfying MC/CMC determines the following Markov factorization:[2]

$$P(x_1, \dots, x_n) = \prod_i P(x_i \mid par(X_i)) \tag{1}$$

A recursive Bayesian network (RBN) is a BN in which the values of variables in $V$ can be BNs themselves. Such variables are called network variables, while variables that do not have BNs as values are called simple variables. Casini et al. (2011) suggest representing a mechanism by an RBN $\langle V, E, P \rangle$ and a submechanism by a network variable $X \in V$ whose values are BNs representing the possible states of this submechanism.
They propose, in addition to the causal interpretation of MC, a further modeling assumption, the recursive causal Markov condition (RCMC):

DEFINITION 2. $\langle V, E, P \rangle$ satisfies the recursive causal Markov condition if and only if $INDEP(X, NID(X) \mid DSup(X) \cup Par(X))$ holds for all $X \in V$. (Casini et al. 2011, 11)

Set $NID(X)$ is the set of noninferiors and nondescendants of $X$, that is, the set of random variables that are neither inferiors nor descendants of $X$. The inferiors of $X$ are the variables of a lower-level BN representing states of the submechanism described by $X$ at the higher level, the variables of the lower-level BNs representing states of submechanisms of this submechanism, and so on. Set $DSup(X)$ is the set of direct superiors of $X$ and contains those variables of the next-level-up BN representing a submechanism whose states are described by lower-level BNs including $X$. (For an illustration of these notions, see the water dispenser example introduced below.) Casini et al. (2011, sec. 4) suggest interpreting the inferiority/superiority relation as constitutive relevance in the sense of Craver (2007a, 2007b).

Let me now briefly explain how probabilistic interlevel explanation and prediction work in Casini et al.’s (2011) RBN approach. To this end, one first needs to define $\mathcal{V} = \{X_1, \dots, X_m\}$ as the closure of the RBN $\langle V, E, P \rangle$’s variable set $V$ under the transitive closure of the inferiority relation.[3] Let $N = \{X_{j_1}, \dots, X_{j_k}\}$ be the set of network variables in $\mathcal{V}$. Then for every instantiation $n = x_{j_1}, \dots, x_{j_k}$ of network variables in $N$, a simple BN can be constructed: the flattening of the RBN with regard to $n$ ($n{\downarrow}$). The nodes of this new BN $n{\downarrow}$ are the simple variables in $\mathcal{V}$ together with the instantiations $n = x_{j_1}, \dots, x_{j_k}$ of the network variables in $N$. BN $n{\downarrow}$’s set of edges contains an arrow pointing from $X$ to $Y$ if and only if $X$ is a parent or direct superior of $Y$ in the RBN, and $n{\downarrow}$’s probability distribution is determined by the following equation:

$$P(x_i \mid par(X_i), dsup(X_i)) = P_{x_{j_l}}(x_i \mid par(X_i)) \tag{2}$$

where $X_{j_l}$ are the direct superiors of $X_i$.

[2] Note that ‘$par(X_i)$’ stands for the instantiation of $X_i$’s parents to their values $x_1, \dots, x_n$ on the left-hand side of the equation.

[3] The transitive closure $R^*$ of a binary relation $R$ can be defined as $R^* = \{\langle u, v \rangle : \exists w_1, \dots, \exists w_n (\langle u, w_1 \rangle \in R \wedge \dots \wedge \langle w_n, v \rangle \in R)\}$.

The flattenings $n{\downarrow}$ of an RBN determine a unique probability distribution over $\mathcal{V} = \{X_1, \dots, X_m\}$ that allows for quantitative reasoning across the diverse levels of the mechanism represented by the RBN:[4]

$$P(x_1, \dots, x_m) = \prod_i P(x_i \mid par(X_i), dsup(X_i)) \tag{3}$$

[4] The probabilities $P(x_i \mid par(X_i), dsup(X_i))$ on the right-hand side of this equation are determined by the flattening induced by $x_1, \dots, x_m$.

Let me now briefly illustrate how the modeling approach proposed by Casini et al. (2011) works by means of a very simple toy example, the water dispenser mechanism. This device normally dispenses cold water; when its tempering button is pressed, it dispenses water close to room temperature. The water dispenser can be represented by an RBN whose top-level graph is depicted in figure 1.

[Figure 1]

Variable T represents the room temperature, $B = 1/0$ stands for whether the tempering button is pressed or not, W stands for the temperature of the water dispensed, and D is a network variable that represents a submechanism, that is, the water dispenser’s water temperature regulation unit. This regulation unit consists of two lower-level parts: a temperature sensor (S) and a heater (H). Variable D has two possible values: $BN_1$ (water temperature is regulated, and thus, one gets water close to room temperature) and $BN_0$ (water temperature is not regulated, and cold water is dispensed as a result). Values $BN_1$ and $BN_0$ are two BNs with the same topological structure (depicted in fig. 2) but with different associated probability distributions. If $D = BN_1$, then the heater is working at a level corresponding to the input of the temperature sensor. If $D = BN_0$, then H is probabilistically insensitive to S. Note that the singleton of D is the set of direct superiors of S and H ($\{D\} = DSup(S) = DSup(H)$) in our exemplary mechanism, while $\{S, H\}$ is the set of inferiors of D ($\{S, H\} = Inf(D)$).

[Figure 2]

When one wants to use the RBN approach for probabilistic predictions across the levels of a mechanism, one first has to construct the RBN’s flattenings as described above. Figure 3 shows the flattening of the RBN with regard to $D = BN_1$. Note that the two interlevel arrows from D to S and from D to H stand for the direct superiority/inferiority relation and should not be causally interpreted:[5] S and H are not effects of D; they rather stand for constitutively relevant parts of the submechanism represented by D. To indicate this fact, the arrows are dashed in figure 3. According to equation (2), the conditional probability distribution of this flattening is $P(T)$, $P(B)$, $P(D = BN_1) = 1$, $P(W \mid D = BN_1)$, $P(S) = P_{D = BN_1}(S)$, $P(H \mid S) = P_{D = BN_1}(H \mid S)$. The conditional probability distribution of the flattening of the RBN with regard to $D = BN_0$ is $P(T)$, $P(B)$, $P(D = BN_0) = 1$, $P(W \mid D = BN_0)$, $P(S) = P_{D = BN_0}(S)$, $P(H \mid S) = P_{D = BN_0}(H \mid S)$.
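The construction of the two flattenings can be sketched in code. This is only an illustration under stated assumptions: S, H, and W are binarized, all probabilities are made-up numbers, and the names `BN1`, `BN0`, `network_values`, and `flatten` are hypothetical rather than taken from Casini et al. (2011). Each value of the network variable D is itself a small BN over S → H, and flattening with regard to D = d fixes D with probability 1 and lets S and H inherit their distributions from d, as in equation (2).

```python
# Hypothetical, binarized stand-ins for the water dispenser variables.
P_T = {0: 0.5, 1: 0.5}            # room temperature: cool / warm
P_B = {0: 0.6, 1: 0.4}            # tempering button: not pressed / pressed

# Each value of the network variable D is a lower-level BN over S -> H.
BN1 = {  # regulated: the heater responds to the sensor
    "P_S": {0: 0.5, 1: 0.5},
    "P_H_given_S": {0: {0: 0.1, 1: 0.9}, 1: {0: 0.8, 1: 0.2}},
}
BN0 = {  # not regulated: H is probabilistically insensitive to S
    "P_S": {0: 0.5, 1: 0.5},
    "P_H_given_S": {0: {0: 1.0, 1: 0.0}, 1: {0: 1.0, 1: 0.0}},
}
network_values = {"BN1": BN1, "BN0": BN0}

# P(W | D) at the top level, indexed by the value of D.
P_W_given_D = {"BN1": {0: 0.1, 1: 0.9}, "BN0": {0: 0.95, 1: 0.05}}

def flatten(d):
    """Flattening of the RBN with regard to D = d.

    D is fixed to d with probability 1; S and H take the distributions
    of the lower-level BN d (the right-hand side of equation (2))."""
    lower = network_values[d]
    return {
        "P_T": P_T,
        "P_B": P_B,
        "P_D": {d: 1.0},
        "P_W_given_D": {d: P_W_given_D[d]},
        "P_S": lower["P_S"],
        "P_H_given_S": lower["P_H_given_S"],
    }

flat1 = flatten("BN1")
flat0 = flatten("BN0")
```

In `flat0` the two rows of `P_H_given_S` coincide, which encodes that H is insensitive to S when the regulation unit is inactive; in `flat1` they differ.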
According to equation (3), the two flattenings of the RBN determine a joint probability distribution over $\mathcal{V} = \{T, B, D, W, S, H\}$; that is, $P(T, B, D, W, S, H) = P(T)P(B)P(D \mid T, B)P(W \mid D)P(S \mid D)P(H \mid S, D)$, where the probabilities on the right-hand side of the equation are determined by the flattening induced by T, B, D, W, S, H. This probability distribution can be used for quantitative prediction across the two levels of our exemplary mechanism.

[Figure 3]

[5] If such an interlevel arrow is pointing from a variable X to a variable Y, then X is a direct superior of Y, and Y is a direct inferior of X in the RBN. If a directed path of such interlevel arrows is going from X to Y, then X is a (direct or indirect) superior of Y, and Y is a (direct or indirect) inferior of X.

3. Two Problems with the RBN Approach. Let me now expose the two deficits of the RBN approach announced in section 1. Problem (i): while RBNs clearly allow for quantitative reasoning across the diverse levels of mechanisms, they do not tell us how exactly submechanisms are causally connected to their mechanisms. In the case of the water dispenser example, for instance, the RBN’s graph tells us neither how T and B causally influence S and H nor how S and H are causally relevant for W; that is, there are no arrows between those variables in the RBN’s graph, and it is unclear over which causal paths probabilistic influence from T and B is propagated through the mechanism’s microstructure to W. But is the graphical representation of such causal information required at all? Is it not sufficient that the RBN captures the probabilistic dependencies between the variables in $\mathcal{V} = \{T, B, D, W, S, H\}$? The answer to the latter question is no.
One of the reasons for this is simply that mechanistic explanation requires information about how exactly (i.e., over which causal pathways) certain inputs to the system influence the mechanism’s microstructure and how changes in this microstructure bring about the phenomenon (or phenomena) of interest at the macrolevel (cf., e.g., Bechtel 2007, sec. 3).[6] In causal models this information is typically provided by the model’s associated probability distribution together with its graph’s topology. To illustrate with our example: if our RBN model adequately represents the water dispenser mechanism, then the information that the tempering button is not pressed ($B = 0$) will screen W off from T. (The room temperature is only relevant for the temperature of the water dispensed when the tempering button is pressed.) The RBN’s associated probability distribution may give us the correct probabilistic dependencies/independencies, but its graph does not provide the causal information needed to mechanistically explain this probabilistic behavior. So the model does not tell us that the probabilistic influence of T on W breaks down because $B = 0$ fixes the value of H and because H lies on the only directed causal path from T to W.

[6] There is an analogy in the discussion on scientific explanation: for explaining an event $e_2$ by referring to an earlier event $e_1$, knowing that $e_1$ increases $e_2$’s probability is not enough. What one has to know additionally is that $e_1$ is causally relevant to $e_2$; that is, one has to provide a model that shows how $e_1$ causes $e_2$ (cf. Salmon 1984; Woodward 2011, sec. 4).

The representation of such causal information in the model’s graph is important not only for mechanistic explanation but also when it comes to questions of manipulation and control.
(Purely probabilistic models cannot distinguish between observation and manipulation; cf. Pearl 2009, sec. 1.3.1.) So how, for example, could we intervene on the mechanism’s microstructure in such a way that we can amplify or decrease certain external influences? If we want, for instance, to increase or decrease T’s causal effect on W in our exemplary mechanism, then the information (which is not captured by the RBN’s graph) that S lies on a causal path from T to W is crucial. Such knowledge tells us that we can increase or decrease T’s effect on W by manipulating S in certain ways, for example, by placing an additional heat source at the sensor S or by cooling S.[7]

Let me now illustrate problem (ii), which is presumably the more striking of the two problems for the RBN approach: recall that the probability distribution that allows for probabilistic reasoning across all levels of a mechanism is constructed via the flattenings of the RBN (see sec. 2). For our exemplary mechanism this probability distribution would be $P(T, B, D, W, S, H) = P(T)P(B)P(D \mid T, B)P(W \mid D)P(S \mid D)P(H \mid S, D)$, where the probabilities on the right-hand side of the equation are determined by the flattening induced by T, B, D, W, S, H. This probability distribution can be captured by a BN with a graph like the one depicted in the box in figure 4. (Again, the continuous arrows may be causally interpreted, while the dashed ones should not.) Now assume that one would, for example, intervene on S by means of an intervention variable $I_S$. As can be read directly off the topology of the BN’s associated graph (depicted in fig. 4), such an intervention on S would not have any probabilistic influence on any macrovariable at all.

So, according to the RBN approach, intervening on a mechanism’s microvariables does not have any probabilistic influence on any one of the macrovariables whatsoever.
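Problem (ii) can be made concrete with a small numerical sketch. Everything below is an assumption introduced for illustration: all variables are binarized, D = 1 stands in for $BN_1$ and D = 0 for $BN_0$, the probabilities are made up, and the intervention on S is modeled as a surgical one via the standard truncated factorization (the factor $P(S \mid D)$ is dropped and S is clamped). The function name `p_w_do_s` is hypothetical.

```python
from itertools import product

# Hypothetical CPTs matching the factorization
# P(T) P(B) P(D | T, B) P(W | D) P(S | D) P(H | S, D).
P_T = {0: 0.5, 1: 0.5}
P_B = {0: 0.6, 1: 0.4}
# D = 1 (regulated) exactly when the button is pressed.
P_D = {(t, b): {0: 1.0 - b, 1: float(b)} for t, b in product((0, 1), repeat=2)}
P_W = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.1, 1: 0.9}}   # P_W[d][w]
P_H = {                                              # P_H[(s, d)][h]
    (0, 0): {0: 1.0, 1: 0.0}, (1, 0): {0: 1.0, 1: 0.0},  # d = 0: insensitive to s
    (0, 1): {0: 0.1, 1: 0.9}, (1, 1): {0: 0.8, 1: 0.2},  # d = 1: responds to s
}

def p_w_do_s(w, s):
    """P(W = w | do(S = s)) by truncated factorization.

    The factor P(S | D) is removed, S is set to s, and all remaining
    variables are summed out."""
    total = 0.0
    for t, b, d, h in product((0, 1), repeat=4):
        total += P_T[t] * P_B[b] * P_D[(t, b)][d] * P_W[d][w] * P_H[(s, d)][h]
    return total

warm_if_sensor_cooled = p_w_do_s(1, 0)
warm_if_sensor_heated = p_w_do_s(1, 1)
```

Because S has no outgoing path to W in this graph, the two interventional probabilities coincide whatever numbers are chosen for $P(H \mid S, D)$; clamping the sensor leaves the distribution of W untouched, which is exactly the consequence criticized in the text.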
This not only contradicts what we observe when looking at the (bottom-up) experiments scientists perform; it is also inconsistent with one of the core features of mechanisms: a mechanism’s macrobehavior and its constitutively relevant microbehaviors should be mutually manipulable (cf. Craver 2007a, 2007b). Note that the inferiority relation is explicitly intended to represent constitutive relevance within the RBN approach (cf. Casini et al. 2011, sec. 4).

[7] Note that such amplification or decrease of a certain variable’s influence on another one is not possible by means of so-called surgical or ideal interventions in the sense of Woodward (2003) or Pearl (2009), but it is by means of soft interventions (cf. Eberhardt and Scheines 2007).

[Figure 4]

4. An Alternative. Let me now propose an alternative to Casini et al.’s (2011) method for representing nested mechanisms. Instead of BNs I use causal models $\langle V, E, P \rangle$ whose graphs $G = \langle V, E \rangle$ are not restricted like those of BNs. In particular, the causal graphs $G = \langle V, E \rangle$ I use can contain two kinds of edges: $X \to Y$, which means that X is a direct cause of Y in the graph, and $X \leftrightarrow Y$, which means that X and Y are effects of a latent common cause, that is, a cause of X and Y not represented within the graph’s variable set $V$.[8]

Contrary to Casini et al., I suggest representing mechanisms not by means of variables but by means of causal arrows. So the simplest representation of a mechanism’s top level would be a causal model $\langle V, E, P \rangle$ with graphical structure $X \to Y$ or $X \leftrightarrow Y$. In the first case, X would be the mechanism’s input, Y its output, and the arrow ‘$\to$’ would stand for the (not further specified) mechanism at work.
In the latter case, X and Y would both represent different outputs produced by one or more not-further-specified (and maybe yet unknown) common causes. Here, too, ‘$\leftrightarrow$’ would stand for the mechanism at work.

[8] Note that the graph of a causal model that contains bidirected arrows no longer determines the Markov factorization (eq. [1]). Causal models containing bidirected arrows will typically violate the Markov condition as well as its causal interpretation, i.e., the causal Markov condition.

To represent the mechanism’s causal microstructure, one can now assign a second causal model to the top-level causal model $\langle V, E, P \rangle$ that specifies how exactly probabilistic influence between X and Y is propagated through the mechanism’s causal microstructure. Both causal models must fit together with respect to the causal information contained in their associated graphs as well as with respect to the probabilistic information stored in their associated probability distributions. This is guaranteed by the following notion of a restriction of a causal model. This notion is basically a slightly modified version of Steel’s (2005, 11) notion of a restricted graph complemented by conditions for bidirected arrows:

DEFINITION 3. $\langle V, E, P \rangle$ is a restriction of $\langle V^*, E^*, P^* \rangle$ if and only if
a) $V \subset V^*$, and
b) $P^*{\uparrow}V = P$,[9] and
c) for all $X, Y \in V$:
  1. if there is a directed path from X to Y in $\langle V^*, E^* \rangle$ and no vertex on this path different from X and Y is in $V$, then $X \to Y$ in $\langle V, E \rangle$, and
  2. if X and Y are connected by a common cause path $\pi$ in $\langle V^*, E^* \rangle$ or by a path $\pi$ free of colliders containing a bidirected edge in $\langle V^*, E^* \rangle$,[10] and no vertex on this path $\pi$ different from X and Y is in $V$, then $X \leftrightarrow Y$ in $\langle V, E \rangle$, and
d) no path not implied by c is in $\langle V, E \rangle$.

[9] Here $P^*{\uparrow}V$ is the restriction of probability distribution $P^*$ to variable set $V$.

[10] Variable $Z_l$ is called a collider on a causal path $\pi$ if and only if $\pi$ contains a subpath of the form $Z_k$ @→ $Z_l$ ←@ $Z_m$, where ‘@’ is a metasymbol standing for an arrowhead or an arrow’s tail. So ‘X @→ Y’, e.g., stands for ‘$X \to Y$ or $X \leftrightarrow Y$’.

Definition 3 determines for every causal model $\langle V^*, E^*, P^* \rangle$ and for every proper subset $V$ of $V^*$ a unique restriction $\langle V, E, P \rangle$. This restriction is called $\langle V^*, E^*, P^* \rangle$’s restriction to $V$. The introduced notion of a restriction allows for marginalizing out variables in such a way that the causal as well as the probabilistic information captured by the restricted model is preserved: $\langle V, E \rangle$ can be interpreted as a higher-level and $\langle V^*, E^* \rangle$ as a lower-level mechanism’s causal structure in definition 3. Condition a guarantees that the higher-level structure contains fewer variables than the lower-level one, b ensures that $\langle V, E, P \rangle$’s and $\langle V^*, E^*, P^* \rangle$’s probability distributions fit together, and c ensures that their associated causal structures do so as well. Thanks to c1, all components of a mechanism represented at both levels are directly causally connected at the higher level whenever they are directly causally connected at the lower level, so no direct causal connection between two variables represented at both levels gets lost when going from the lower to the higher level. In addition, c1 guarantees that there is a direct causal connection for every directed causal path in the lower-level structure whose intermediate components are not represented in the higher-level model’s associated graph. Condition c2 tells us when we have to draw a bidirected edge ($\leftrightarrow$) between two variables X and Y in the higher-level model’s graph: draw such a bidirected edge whenever there is also one at the lower level, whenever all variables on a common cause path between X and Y are marginalized out when going from the lower- to the higher-level structure, or whenever all variables lying on a lower-level path that indicates a latent common cause of X and Y are marginalized out.[11] Condition d prevents causal connections at the higher level that do not have a counterpart at the lower level. Figure 5 illustrates how marginalizing out variables works according to definition 3 by showing an exemplary causal structure and some of its possible restrictions.

Let me now further develop the above-mentioned idea of representing nested mechanisms by edges instead of vertices. For this purpose I introduce the following notion of a multilevel causal model (MLCM) that is based on the definition of a restriction (definition 3). I propose MLCMs as adequate means for representing the hierarchical organization of mechanisms (below I will demonstrate that MLCMs do not fall prey to the two problems of the RBN approach discussed in sec. 3):

DEFINITION 4. $\langle M_1 = \langle V_1, E_1, P_1 \rangle, \dots, M_n = \langle V_n, E_n, P_n \rangle \rangle$ is a multilevel causal model if and only if
a) $M_1, \dots, M_n$ are causal models, and
b) every $M_i$ with $1 < i \le n$ is a restriction of $M_1$, and
c) $M_1$ satisfies CMC.

According to definition 4, an MLCM is an n-tuple consisting of several causal models (condition a) that are intended to represent causal structures at different levels.
According to b, every causal model $M_i$ in the ordering different from $M_1$ is a restriction of the first causal model $M_1$, so $M_1$ stands for the mechanism’s lowest level, while every $M_i$ different from $M_1$ represents one of its higher levels. Condition c captures a basic assumption of the causal nets approach, namely, that every robust probability distribution is produced (and, thus, can be explained) by some underlying causal model satisfying CMC (cf. Spirtes et al. 2000, 124–25). So an MLCM of a mechanism is complete only when all probabilistic dependencies of any higher-level model can be explained by a lowest-level causal model $M_1$ that satisfies CMC.

[11] Here is an example of such a path: $Z_1 \leftrightarrow Z_2$ in structure $X \leftarrow Z_1 \leftrightarrow Z_2 \to Y$ indicates a latent common cause of $Z_1$ and $Z_2$ and, thus, also of X and Y. When marginalizing out $Z_1$ and $Z_2$, one has to draw a bidirected arrow between X and Y ($X \leftrightarrow Y$) to preserve this piece of causal information.

Figure 5. According to definition 3, the graph of the restriction of a causal model with the graph depicted above would be $X \leftrightarrow Y \leftarrow Z \leftrightarrow W$, if one chooses to marginalize out U. It would be $X \leftrightarrow Y \leftarrow U \to W$, if marginalizing out Z, and $X \leftrightarrow Y \leftrightarrow W$, if marginalizing out Z and U. When one restricts the original model to $V = \{X, Z, U, W\}$, the resulting structure would be $X \quad Z \leftarrow U \to W$ (without an edge between X and Z).

Definition 4 does not directly tell us much about the hierarchical organization of the mechanism and its submechanisms represented by the MLCM’s causal models; it just tells us that $M_1$ stands for the lowest level. Fortunately, a unique level graph $G = \langle V, E \rangle$ can be constructed for every MLCM.
Such a level graph is a kind of metagraph that provides exactly the information requested above: information about the hierarchical relation of nested mechanisms represented by the MLCM:

DEFINITION 5. A graph $G = \langle V, E \rangle$ is called an MLCM $\langle M_1 = \langle V_1, E_1, P_1 \rangle, \dots, M_n = \langle V_n, E_n, P_n \rangle \rangle$’s level graph if and only if
a) $V = \{M_1, \dots, M_n\}$, and
b) for all $M_i = \langle V_i, E_i, P_i \rangle$ and $M_j = \langle V_j, E_j, P_j \rangle$ in $V$, $M_i \to M_j$ in $G$ if and only if $V_i \subset V_j$ and there is no $M_k = \langle V_k, E_k, P_k \rangle$ in $V$ such that $V_i \subset V_k \subset V_j$ holds.

According to a, a level graph $G = \langle V, E \rangle$ is a graph over the causal models $M_1 = \langle V_1, E_1, P_1 \rangle, \dots, M_n = \langle V_n, E_n, P_n \rangle$ of an MLCM. Condition b instructs one to draw a directed edge from one of these $M_i = \langle V_i, E_i, P_i \rangle$ to another $M_j = \langle V_j, E_j, P_j \rangle$ whenever $V_i$ is a proper subset of $V_j$ and there is no $M_k = \langle V_k, E_k, P_k \rangle$ in $V$ such that $V_k$ is a proper subset of $V_j$ and a proper superset of $V_i$. So the directed paths in a level graph correspond to the set-theoretical proper subset relation (i.e., $M_i \to \dots \to M_j$ in $G$ if and only if $V_i \subset V_j$). Because every causal model $M_i = \langle V_i, E_i, P_i \rangle$ of the MLCM different from $M_1 = \langle V_1, E_1, P_1 \rangle$ is a restriction of $M_1$, the vertex set $V_i$ of every such model $M_i$ is a proper subset of $V_1$. So the level graph $G$ will be a DAG containing only one vertex with no exiting arrows (i.e., $M_1 = \langle V_1, E_1, P_1 \rangle$), while there will be a directed path from every $M_i$ different from $M_1$ to $M_1$.

Now some information about the hierarchical organization of the causal models of an MLCM can be read off this MLCM’s level graph $G$: whenever there is a directed path from $M_i$ to $M_j$ in the level graph $G$, $M_i$ represents a higher-level causal structure than $M_j$ does. And whenever a causal model $M_k = \langle V_k, E_k, P_k \rangle$ lies on such a directed path from $M_i$ to $M_j$, $M_k$ represents a causal structure on a level between $M_i$ and $M_j$. So what we basically get by drawing a level graph is a strict order among the causal models of an MLCM.
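Condition b of definition 5 depends only on the models’ variable sets, so it can be sketched in a few lines. The function name `level_graph` and the dictionary representation are assumptions made for this illustration; the variable sets are those of the abstract five-model example discussed in connection with figure 6.

```python
def level_graph(var_sets):
    """Edges (Mi, Mj) of the level graph according to definition 5b.

    `var_sets` maps a model name to the frozenset of its variables.
    An edge Mi -> Mj is drawn iff Vi is a proper subset of Vj and no
    Vk lies strictly between them."""
    edges = set()
    for i, vi in var_sets.items():
        for j, vj in var_sets.items():
            between = any(vi < vk < vj for vk in var_sets.values())
            if vi < vj and not between:
                edges.add((i, j))
    return edges

var_sets = {
    "M1": frozenset("XYZUW"),   # lowest level
    "M2": frozenset("XZUW"),
    "M3": frozenset("YZW"),
    "M4": frozenset("XZW"),
    "M5": frozenset("ZW"),      # top level
}
edges = level_graph(var_sets)

# Vertices without exiting arrows; definition 4b implies this is M1 only.
sinks = {m for m in var_sets if not any(src == m for src, _ in edges)}
```

For these sets the edges are M2 → M1, M3 → M1, M4 → M2, M5 → M3, and M5 → M4, and M1 is the only vertex without exiting arrows. There is no edge between M3 and M4, which reflects that the formalism is silent about whether they lie on the same level.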
Let me now illustrate the MLCM approach for modeling mechanisms by an abstract example. Figure 6 shows the causal structures of the causal models of an MLCM plus the MLCM’s level graph, which connects these models and provides information about the hierarchical order of the levels of the mechanism the MLCM represents. The lowest-level causal model M1’s graph is X ↔ Y ← Z ← U → W. One gets the higher-level model M2 with graph X   Z ← U → W (no edge between X and Z) by marginalizing out Y, and the higher-level model M3 with graph Y ← Z ↔ W by marginalizing out X and U. Note that the MLCM’s level graph does not provide any information about whether these two models (i.e., M2 and M3) represent structures at the same or at different levels of organization. By marginalizing out U from M2, one arrives at the higher-level causal model M4 with structure X   Z ↔ W (again without an edge between X and Z). Note that the formalism again does not provide any information about whether M4 represents a mechanism at the same level as the one represented by M3. One can further restrict M3 and M4 to M5, with causal graph Z ↔ W. Model M5 describes the represented mechanism at the top level. Note that the MLCM’s level graph tells us that causal models M2, M3, and M4 describe the mechanism’s causal structure on levels between the mechanism’s top and its lowest level, represented by M5 and M1, respectively.

As a last step, I will demonstrate, using the exemplary mechanism introduced in section 2 (i.e., the water dispenser), that MLCMs do not share problems (i) and (ii), which Casini et al.’s (2011) RBN approach has to face, and that MLCMs nicely capture another important feature of nested mechanisms: as long as the details of a mechanism are not considered, the same input should lead to the same output on all of the mechanism’s levels.
The water dispenser mechanism can be represented by an MLCM ⟨M1 = ⟨V1, E1, P1⟩, M2 = ⟨V2, E2, P2⟩⟩, where the graph in the upper box in figure 7 shows M2’s and the one in the lower box shows M1’s causal structure.

Figure 6. Boxed graphs are the associated causal graphs of an MLCM’s causal models M1, ..., M5. Dashed lines are the edges of this MLCM’s level graph.

Figure 7. Water dispenser mechanism by means of a two-stage MLCM (causal models M1 and M2). Graph with the dashed edge connecting the MLCM’s two causal models M1 and M2 is the MLCM’s level graph.

Model M1 represents the water dispenser’s submechanism, that is, the water temperature regulation unit, and how it is causally connected to the mechanism’s macrovariables. Note that this submechanism is not represented by a variable, as in Casini et al.’s (2011) RBN approach, but by M2’s graph T → W ← B at the higher level. Variables T and B are this submechanism’s input variables; W is its output variable. The level graph of the MLCM ⟨M1 = ⟨V1, E1, P1⟩, M2 = ⟨V2, E2, P2⟩⟩ is M2 → M1. When we go from M2 to M1, we zoom into the microstructure of the submechanism represented by T → W ← B at the top level. Since M2 is a restriction of M1 in the MLCM, it follows from definition 3b that P1(w|t, b) = P2(w|t, b) holds for arbitrarily chosen W-, T-, and B-values w, t, and b, respectively. So as long as only the variables contained in both causal models’ variable sets are considered, the same input will lead to the same output at both levels, and thus the MLCM captures the aforementioned feature of nested mechanisms.
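The equality P1(w|t, b) = P2(w|t, b) can be checked numerically on a toy parameterization. The sketch below rests on several assumptions that are not part of the example itself: all variables are binary, T and B are independent root variables, M1 has the structure T → S → H ← B with H → W, and P2 is obtained from P1 by marginalizing out S and H. All numerical values and function names are invented for illustration.

```python
from itertools import product

# Hypothetical parameters (not from the text) for binary variables.
p_t = {0: 0.5, 1: 0.5}                      # P(T = t)
p_b = {0: 0.3, 1: 0.7}                      # P(B = b)
p_s1_given_t = {0: 0.2, 1: 0.9}             # P(S = 1 | T = t)
p_h1_given_bs = {(0, 0): 0.0, (0, 1): 0.0,  # P(H = 1 | B = b, S = s);
                 (1, 0): 0.1, (1, 1): 0.8}  # B = 0 fixes H at 0
p_w1_given_h = {0: 0.05, 1: 0.95}           # P(W = 1 | H = h)


def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def joint_m1():
    """P1 over (T, B, S, H, W), factorized along the assumed graph of M1."""
    dist = {}
    for t, b, s, h, w in product((0, 1), repeat=5):
        dist[(t, b, s, h, w)] = (
            p_t[t] * p_b[b]
            * bern(p_s1_given_t[t], s)
            * bern(p_h1_given_bs[(b, s)], h)
            * bern(p_w1_given_h[h], w)
        )
    return dist


def restrict_to_tbw(dist):
    """P2 over (T, B, W): sum S and H out of P1."""
    marginal = {}
    for (t, b, _s, _h, w), p in dist.items():
        marginal[(t, b, w)] = marginal.get((t, b, w), 0.0) + p
    return marginal


def cond_w_given_tb(dist, w, t, b, idx):
    """P(W = w | T = t, B = b); idx holds the key positions of T, B, W."""
    it, ib, iw = idx
    num = sum(p for k, p in dist.items() if (k[it], k[ib], k[iw]) == (t, b, w))
    den = sum(p for k, p in dist.items() if (k[it], k[ib]) == (t, b))
    return num / den


P1 = joint_m1()
P2 = restrict_to_tbw(P1)

# Largest difference between the two conditionals over all value combinations.
max_gap = max(
    abs(cond_w_given_tb(P1, w, t, b, (0, 1, 4))
        - cond_w_given_tb(P2, w, t, b, (0, 1, 2)))
    for t, b, w in product((0, 1), repeat=3)
)
print(max_gap)
```

Under these assumptions the two conditionals agree up to floating-point error for every combination of values, which is just what marginalization guarantees: conditioning on variables shared by both models gives the same result whether one starts from P1 or from P2.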
Since the causal arrows in M1 tell us exactly how the submechanism’s components S and H are causally connected to the rest of the mechanism (i.e., T, B, and W), the MLCM representation captures property (i): the MLCM can graphically represent the causal connections between the represented mechanism’s macro- and microvariables. This gives us causal information that is crucial for questions concerning explanation, manipulation, and control. It tells us why certain inputs (i.e., conditionalizing on certain T- and B-values) bring about (or explain) certain outputs (i.e., probabilities of certain W-values): T is directly causally relevant for S, B and S are direct causes of H, and H is the only direct cause of W in our toy mechanism. This causal information tells us, for example, why T’s probabilistic influence on W breaks down when B = 0: the only productive causal path from T to W goes through H, and B = 0 fixes H’s value, so probability propagation between T and W along this path is blocked. It also tells us that T’s influence on W can be amplified or decreased by manipulating S or H by means of soft interventions, while B’s effect on W can only be modified by changing H’s behavior. The MLCM can also capture property (ii): intervening on the mechanism’s microstructure (i.e., on S or H) will typically have a probabilistic influence on the mechanism’s macrobehavior (i.e., on certain W-values).

Like Casini et al.’s (2011) RBN approach, the MLCM representation provides a unique probability distribution over the set of all variables appearing in the causal models of the MLCM.
Since the first causal model M1 = ⟨V1, E1, P1⟩ in an MLCM’s ordering M1, ..., Mn also contains all variables of the causal models Mi appearing later in that particular ordering, said unique probability distribution is M1’s probability distribution P1. When it comes to quantitative prediction, one can simply choose one of the causal models in the MLCM that contains all the variables of interest and then compute the probabilities for the phenomena of interest accordingly, since every causal model appearing later in the ordering M1, ..., Mn is a restriction of M1.

5. Conclusion. In this article I tackled the question of how mechanisms, and especially their hierarchical organization, can be represented within a causal graph framework. In section 2 I discussed an approach for modeling such nested mechanisms proposed by Casini et al. (2011). I introduced Bayesian networks and recursive Bayesian networks and explained how they can be used for causal modeling. I then illustrated Casini et al.’s RBN approach, which suggests representing submechanisms by network variables of an RBN, by means of a very simple toy example, that is, the water dispenser mechanism. In section 3 I illustrated two problems with the RBN approach by means of the exemplary mechanism introduced in section 2: (i) An RBN does not graphically encode information about how a mechanism’s submechanisms are causally connected to the rest of this mechanism. Such information is, however, relevant when it comes to questions of mechanistic explanation, manipulation, and control. (ii) It follows from the RBN approach that intervening on some of a mechanism’s microvariables cannot have any probabilistic influence whatsoever on some of this mechanism’s macrobehavior. This consequence stands in stark contrast to scientific practice; scientists typically carry out so-called bottom-up experiments to distinguish between a mechanism’s constitutively relevant and its irrelevant parts.
In section 4 I developed an alternative modeling approach for nested mechanisms: the MLCM approach. This approach represents submechanisms not by means of a causal model’s variables but by the edges of its associated graph. I finally demonstrated, again using the exemplary mechanism of the water dispenser, that the MLCM approach does not fall prey to problems (i) and (ii), which Casini et al.’s RBN approach has to face.

REFERENCES

Bechtel, William. 2007. “Reducing Psychology While Maintaining Its Autonomy via Mechanistic Explanations.” In The Matter of the Mind: Philosophical Essays on Psychology, Neuroscience and Reduction, ed. Maurice Schouten and Huib Looren de Jong, 172–98. Oxford: Blackwell.
Bechtel, William, and Adele Abrahamsen. 2005. “Explanation: A Mechanist Alternative.” Studies in History and Philosophy of the Biological and Biomedical Sciences 36:421–41.
Casini, Lorenzo, Phyllis McKay Illari, Federica Russo, and Jon Williamson. 2011. “Models for Prediction, Explanation and Control: Recursive Bayesian Networks.” Theoria 70:5–33.
Craver, Carl. 2007a. “Constitutive Explanatory Relevance.” Journal of Philosophical Research 32:3–20.
Craver, Carl. 2007b. Explaining the Brain. Oxford: Clarendon.
Eberhardt, Frederick, and Richard Scheines. 2007. “Interventions and Causal Inference.” Philosophy of Science 74:981–95.
Glennan, Stuart. 1996. “Mechanisms and the Nature of Causation.” Erkenntnis 44:49–71.
Illari, Phyllis McKay, and Jon Williamson. 2012. “What Is a Mechanism? Thinking about Mechanisms across the Sciences.” European Journal for Philosophy of Science 2:119–35.
Machamer, Peter, Lindley Darden, and Carl Craver. 2000. “Thinking about Mechanisms.” Philosophy of Science 67:1–25.
Pearl, Judea. 2009. Causality. Cambridge: Cambridge University Press.
Reichenbach, Hans. 1956. The Direction of Time. Berkeley: University of California Press.
Salmon, Wesley. 1984. Scientific Explanation and the Causal Structure of the World. Princeton, NJ: Princeton University Press.
Spirtes, Peter, Clark Glymour, and Richard Scheines. 2000. Causation, Prediction, and Search. Cambridge, MA: MIT Press.
Steel, Daniel. 2005. “Indeterminism and the Causal Markov Condition.” British Journal for the Philosophy of Science 56:3–26.
Williamson, Jon, and Dov Gabbay. 2005. “Recursive Causality in Bayesian Networks and Self-Fibring Networks.” In Laws and Models in the Sciences, ed. Donald Gillies, 223–45. London: King’s College.
Woodward, James. 2003. Making Things Happen. Oxford: Oxford University Press.
Woodward, James. 2011. “Scientific Explanation.” In Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta. Stanford, CA: Stanford University. http://plato.stanford.edu/archives/win2011/entries/scientific-explanation/.