key: cord-0042971-xutibzn3
authors: Janssen, Remie; Jones, Mark; Murakami, Yukihiro
title: Combining Networks Using Cherry Picking Sequences
date: 2020-02-01
journal: Algorithms for Computational Biology
DOI: 10.1007/978-3-030-42266-0_7
sha: 5c17f3a6aad88e545fe1c376d27b4a8acf9edf90
doc_id: 42971
cord_uid: xutibzn3

Phylogenetic networks are important for the study of evolution. The number of methods to find such networks is increasing, but most such methods can only reconstruct small networks. To find bigger networks, one can attempt to combine small networks. In this paper, we study the Network Hybridization problem, a problem of combining networks into another network with low complexity. We characterize this complexity via a restricted problem, Tree-child Network Hybridization, and we present an FPT algorithm to efficiently solve this restricted problem.

Evolutionary histories are often represented by phylogenetic trees, and more recently, by phylogenetic networks. Knowing the evolutionary history of a species is vital for understanding their biology. Therefore, it is important to have methods for finding phylogenetic networks that accurately represent the true evolutionary histories. Many methods exist to find evolutionary histories; some are purely combinatorial, others have a likelihood component as well. Here, we focus mainly on the purely combinatorial problems.

One classic combinatorial method is to solve Hybridization: given a set of trees, find the simplest network that displays these trees [1] . Unfortunately, the problem is NP-hard, even on inputs of two trees [2] . For this problem, it is assumed we can construct accurate phylogenetic trees for small parts of the genomes of the studied taxa. When the input consists of only two or three trees, it can be solved relatively efficiently-EPT time [8, 17] -even though the problem is already NP-hard in that case. For an input consisting of three trees or more, there is still an FPT algorithm [9] , but it is not practical. In these cases, it is useful to limit the search space to networks with a restricted structure, such as tree-child networks [7] , or temporal networks [6] .

Another combinatorial approach for finding phylogenetic networks is to combine smaller networks. The smaller networks are often assumed to have certain properties. For example, it may be assumed that we are given all strict subnetworks containing the full set of leaves. In that case, it is possible to reconstruct a level-k tree-child network from all its level-(k − 1) subnetworks [15] . Another assumption could be that the input consists of all subnetworks obtained by removing exactly one leaf [11] . Instead of having almost all leaves, the subnetworks can also be allowed to have only few leaves. For example, low level networks can be reconstructed from their full set of binets [5, 12] , trinets [10, 16] or quarnets [4] .

In practice, it may not be easy to find all subnetworks. This renders many of the previously mentioned methods useless. Furthermore, these methods typically only work for low level networks. This means that they cannot be used when the phylogenetic signal comes from a complicated evolutionary history, or if there is some randomness in the data, complicating the data as well.

In this paper, we combine networks that all contain the full set of leaves, but we do not assume we have all the subnetworks of the network we want to find. The problem we solve is analogous to Hybridization, but with networks as the input, Network Hybridization: Given a set of networks with taxa X, find a network N with minimal reticulation number, that displays all input networks. Since this is a generalization of the Hybridization problem, the problem remains NP-hard in general, even for inputs of two networks. We show that for the restricted problem on tree-child (topologically restricted class of networks; defined formally in Sect. 2) inputs and output, we can use tree-child sequences to obtain an FPT algorithm. This FPT algorithm is an extension of the one introduced in [7] in which they considered tree inputs; the tree-child sequence approach was first introduced in [14] . We also comment briefly on how some measure of an optimal solution to the Network Hybridization problem can be characterized by solving this restricted problem.

Structure of the Paper. We start with a quick introduction of relevant concepts from mathematical phylogenetics in Sect. 2. Then, in Sect. 3, we formally introduce Tree-child Network Hybridization and prove its relation to treechild sequences. This section also relates this problem to the more general Network Hybridization. In Sect. 4.1, we lay the theoretical foundation to extend the algorithm in [7] that takes inputs of trees to also work for inputs of networks. As a last theoretical section in the paper, we present an FPT algorithm that solves Tree-Child Network Hybridization (Sect. 4.2). We conclude the paper with a discussion, including some open questions (Sect. 5).

The main objects of study for this paper are phylogenetic networks. These graphs are used in biology to represent evolutionary scenarios for a given set of species. Definition 1. A (rooted phylogenetic) network on a finite set of taxa X is a directed acyclic graph with -one node of indegree-0 and outdegree-1, the root; -nodes of indegree-1 and outdegree-0 labelled bijectively with X, the leaves; -nodes of indegree-1 and outdegree-2, the tree nodes; -nodes of indegree greater than 1 and outdegree-1, the reticulations.

If all the reticulation nodes have indegree-2, the network is called binary. An edge uv is called a tree edge if v is a tree node or leaf, and a reticulation edge if v is a reticulation. The vertex u is the parent of v, and v is the child of u. The reticulation number r(N ) of a network N is the total number of reticulation edges minus the total number of reticulations.

A network is stack-free if every reticulation has a child that is a tree node or a leaf. A network is tree-child if it is stack-free and every tree node has a child that is a tree node or a leaf. We now define some relevant notation for local structures in phylogenetic networks.

Let N be a network on X and x, y ∈ X two leaves. Then we say N has a cherry {x, y} if the parent of x is the parent of y; we say that N has a reticulated cherry (x, y) if the parent of x is a reticulation, and the parent of y is a parent of this reticulation. If (x, y) is a cherry or a reticulated cherry in N , then it is called a reducible pair.

Tree-child sequences are built on the notion of reducing cherries and reticulated cherries from networks. The reversal of reducing cherries and reticulated cherries can be done by adding ordered pairs of leaves to the network. Definition 4. Let N be a network and let (x, y) be reducible pair. Then we may construct N from N (x, y)-also called add (x, y) to N (x, y)-by applying the following. 1 . If x is a leaf in N (x, y) (i.e., if (x, y) is a reticulated cherry in N ), and (a) if p, the parent of x in N (x, y), is a reticulation then add a node q directly above y, and add an edge qp. (b) otherwise, add nodes p and q directly above x and y respectively, and add an edge qp. 2. If x is not a leaf in N (x, y) (i.e., if (x, y) is a cherry in N ) then add a labelled node x, insert a node q directly above y, and add an edge qx.

The above notion of adding an ordered pair of leaves (x, y) to a network N is well-defined if y is already a leaf in N . If this is indeed the case, we may obtain a network from a sequence of ordered pairs by iteratively adding ordered pairs to an existing network. To do so, we impose the following condition on our sequence of ordered pairs: The second coordinate of each pair has to occur as a first coordinate in the remainder of the sequence, or as the second coordinate of the last pair. Then, the following procedure constructs a network from a sequence.

Input: A sequence of ordered pairs S = (x1, y1) · · · (xn, yn); Output: The network that can be constructed from S; 1 Set N to be the tree on one leaf yn; 2 for i = n, . . . , 1 do Subdivide the incoming edge of xi with a node px; 8 Subdivide the incoming edge of yi with a node py; 9 Add the edge pypx to N ; Subdivide the incoming edge of yi with a node py; 12 Add a new node labelled xi to N ; 13 Add the edge pyxi to N ;

Note that because we only add reticulation edges to existing reticulation nodes wherever possible (Line 4), the network obtained by using the above procedure is always stack-free. Imposing another condition: no first coordinate leaf is used as a second coordinate in the remainder of the sequence on the sequence ensures that the network we obtain is tree-child. With this in mind, we formally define a tree-child sequence (Fig. 1 ).

A tree-child sequence (TCS) is a sequence of ordered pairs of two leaves such that the following conditions hold:

-the second coordinate of each pair has to occur as a first coordinate in the remainder of the sequence or as the second coordinate of the last pair; -no first coordinate leaf is used as a second coordinate in the rest of the sequence.

Let S be a TCS, that involves the leaves X. Then, the weight of S is w(S) = |S| − |X| + 1. Given a sequence of ordered pairs S = S 1 S 2 · · · S |S| , we let NS denote the network

We introduce some notation for subsequences of a sequence S. For i ∈ [|S|], we use the following notation for subsequence of S.

We say that the leaves x 1 , . . . , x i are forbidden for S [:i] . Forbidden leaves do not appear as a second coordinate leaf in a TCS (by the second condition of TCSs).

We say S reduces N to the leaf x if NS is the tree with the single leaf x. Similarly, let N be a set of networks, then we denote by N S the set of reduced networks {NS : N ∈ N }, and we say S reduces N to x if NS is the one leaf tree x for all N ∈ N .

We call a sequence S of ordered pairs a partial TCS if there exists a TCS S such that S [:i] = S for some i.

In this section we formally define the Tree-child Network Hybridization problem, which asks to find a tree-child network with minimal reticulation Fig. 1 . A binary tree-child network N (grey and black) reduced to a leaf 4 by a tree-child sequence S. The reduction is shown as a sequence of networks NS [:i] for i = 0, 1, . . . , 6 from left to right, in which an element of S is applied to the network successively. Each element of S reduces a pair in N . An example of a cherry (3, 1) can be seen in the network NS [:3] , and a reticulated cherry (3, 4) can be seen in the network N . The subnetwork of N consisting of the black edges is also reduced by S, and the embedding can be constructed by building both networks simultaneously and keeping track of the edges added by the pairs that change the subnetwork (black pairs and arrows).

number that displays all input tree-child networks on the same set of taxa. We generalize the results presented in [14] (they considered inputs of trees while we consider inputs of networks) by showing how this problem relates to the more generalized problem of Network Hybridization and also to the Tree-child Weight problem. For the Tree-child Network Hybridization problem, it turns out that there is not always a solution for some given inputs; we also comment on when this is the case.

We start by defining what it means for a network to display another network. If a network N on X is a subnetwork of another network N on X, then an embedding of N into N is the mapping of the nodes of N to the nodes of N such that the leaves of N are mapped to the leaves of N with the same labels, and the edges of N are mapped to edge-disjoint paths of N . Our main focus of this paper is to solve the following problem.

Input: A set of rooted tree-child networks N on X. Output: A tree-child network that displays N with minimal reticulation number if it exists, NO otherwise.

Given an optimal tree-child network to the Tree-child Network Hybridization problem, one may find a TCS that reduces it. We will show that the weight of such a TCS is equal to the weight of an optimal solution to the following related problem.

Input: A set of rooted networks N on X. Output: A minimal weight TCS that reduces N if it exists, NO otherwise.

Let N be a set of networks on X. The reticulation number of an optimal solution to Tree-child Hybridization is denoted h tc (N ). The weight of an optimal solution to Tree-child Weight is denoted s tc (N ).

For a set of trees T , the relation h tc (T ) = s tc (T ) holds. We will extend this result for network inputs. We first recall some key lemmas from [13] . The first lemma loosely states that each TCS reducing a set of networks N gives a tree-child network with corresponding reticulation number that displays N . The second lemma gives the reverse statement: each tree-child network that displays a set of networks N gives a TCS of corresponding weight that reduces N . Lemma 8) . Let N and N be a tree-child network. Suppose there is a TCS S that reduces both N and N , such that each element of S that reduces a pair in N also reduces a reducible pair in N . Then N is a displayed by N (Fig. 1) . Corollary 4) . Let N, N be tree-child networks on X such that N is displayed by N . If a TCS S reduces N , then S also reduces N .

Unlike when the input consists of only trees, a solution to Tree-child Network Hybridization does not always exist when the input may also contain networks (Fig. 1) .

A set of tree-child networks N are tree-child compatible if there exists a tree-child network that displays all networks in N .

Our next step, is to prove that there is a strong connection between tree-child compatibility and TCSs. Proof. Suppose that N is tree-child compatible. Then there exists a tree-child network N that displays N , with minimal reticulation number. Now let S be a TCS for N . By Lemma 8, S also reduces all displayed networks of N , and hence it reduces N . Furthermore, the weight of S is equal to the reticulation number of N by Lemma 3 from [13] , (originally proved slightly less strong in [14] ). Now suppose there exists a TCS S that reduces N . Let N be the tree-child network that can be constructed from S. Then, by Lemma 7, N displays N . Because N is the network corresponding to S, the reticulation number of N is equal to the weight of S.

In the previous subsection, we have found a strong connection between Treechild Network Hybridization and Tree-child Weight for feasible solutions. Not all inputs, however, are feasible. Here, we investigate the feasibility of inputs, and how to deal with infeasible inputs. Lemma 11. Let N be a tree-child network with reticulated cherry (x, y), then any TCS that reduces N must contain the pair (x, y).

Proof. Suppose S is a TCS that reduces N . The only ways to reduce the reticulated cherry (x, y) are by either reducing it directly with the pair (x, y), or by first turning it into a cherry {x, y} and then reducing it with a pair (x, y) or (y, x). This second option, however, leads to a contradiction: To make the reticulated cherry into a cherry, we must reduce a pair of the form (x, ·); however, any sequence that includes (x, ·) and later (y, x) cannot be tree-child. to M , we get the network MZ which is tree-child. Then, adding these leaves in the right places to N 1 and N 2 , we get the set of networks NZ ∈ N + on X ∪ Z, that are displayed by the tree-child network MZ .

Using the connection between tree-child compatibility and the existence of TCSs, we can prove an obstruction to tree-child compatibility of Lemma 12. This obstruction will turn out to be quite valuable in the proofs in the rest of this paper, as it allows us to quickly check whether a set of networks is tree-child compatible.

Lemma 12. Let N 1 , N 2 be tree-child networks on the same set of leaves X. For any pair of leaves x, y, if N 1 contains the reticulated cherry (x, y) and N 2 contains the reticulated cherry (y, x), then N 1 and N 2 are not tree-child compatible.

Proof. Let N be a tree-child network that displays both N 1 and N 2 . Then any TCS S for N reduces both N 1 and N 2 . By Lemma 11, the sequence S must contain the pair (x, y), because N 1 has the reticulated cherry (x, y); similarly, S must contain (y, x). This means S is a TCS, but it includes both pairs (x, y) and (y, x), a contradiction. Hence we conclude that N 1 and N 2 are not tree-child compatible.

Even if an input is infeasible, we still desire a network that displays all input networks. For this purpose, we can relax the tree-child constraint on output (and input) of the Tree-child Network Hybridization problem, giving rise to the following problem.

Input: A set of rooted networks N on X. Output: A network that displays N with minimal reticulation number.

This problem can be viewed as the natural extension of the classic Hybridization problem for trees. Linz and Semple show that Hybridization can be solved by adding leaves in the right place to all input trees, and then solving Tree-child Hybridization [14] . This also holds for the network versions of these problems, as the solution to Network Hybridization can be made treechild by adding leaves, and all networks displayed by a tree-child network are tree-child networks as well (Fig. 2) .

In this section, we give an FPT algorithm for Tree-child Network Combination. We extend the algorithm given in [7] by allowing for inputs to be networks, and by looking for reducible pairs within networks rather than cherries in trees. Given an input N of tree-child networks, we first look for trivial reducible pairs. We show that it is safe to reduce trivial reducible pairs as soon as we encounter one, in any order. We then branch on all possible non-trivial reducible pairs of the network, and by showing that the total number of possible reducible pairs at each branching point is at most 8k for the reticulation number k of the optimal solution, we show that the running time of the algorithm is O((8k) k · poly(|X|, |N |).

Trivial Pairs. The algorithm in [7] reduces trivial cherries (a pair of leaves {x, y} that appear as a cherry in any input tree containing x and y) whenever possible. Here, only looking at trivial cherries is not sufficient. For an input of networks, we will need to reduce trivial reducible pairs (trivial pairs for short) whenever possible. A trivial pair is a pair of leaves (x, y) such that all networks either only have the leaf y, or they have a reducible pair (x, y). In the following two lemmas, we prove that it is safe to reduce such a pair as soon as we encounter one. , y) and (a, x) . This can only occur if a = y: as x is the first as well as the second element of a reducible pair, it must form a cherry with another leaf, namely the leaf y. However, S(y, x)(x, y)S is not a TCS, which contradicts our assumption that S(a, b)(x, y)S is a TCS for N . Hence, for the rest of the proof, we assume b = x. In this case, S(x, y)(a, b)S is a TCS. It remains to prove that it reduces N . This is clear if {x, y}∩{a, b} = ∅. Observe that a = y, as otherwise S(a, b)(x, y)S would not have been a TCS to begin with. Therefore, we still need to check the cases a = x and b = y.

If a = x and a network has both reducible pairs (x, b) and (x, y), then this network has a reticulation with reticulated cherries (x, b) and (x, y) . The order of reducing these pairs obviously does not matter for such networks: both options remove the reticulation edges between the parents of b and y, and the parent of x. For a network N that only has the reducible pair (x, y) after S (and not

This means S(x, y)(x, b)S also reduces N [13] . Hence if a = x, the sequence S(x, y)(a, b)S is a TCS for N . Now suppose b = y. Let N be a network that has both reducible pairs (a, y) and (x, y). But all tree nodes of N are of outdegree-2; this implies that every leaf can be the second coordinate of at most one reducible pair. Therefore such a network cannot exist, and thus this case is not possible. (x, y) . Hence, by repeated application of Lemma 13, there is a sequence S(x, y)S for N . This sequence has the same length as SS , because it is simply a reordering of the pairs. Now suppose S i = (y, x), then x cannot have been the first coordinate in any pair of S, so all networks in N S contain x. Furthermore, S does not contain the pair (x, y), as this would violate the assumption that SS is a TCS. Hence, each network in N S has a cherry or reticulated cherry on x and y, which is ultimately reduced by a pair (y, x) in S . Suppose a network N ∈ N S does not have the cherry {x, y}. Then it has the reticulated cherry (x, y). To make this into a cherry, so that it can be reduced by (y, x), the sequence must first contain a pair of the form (x, z). However, this implies S first contains (x, z) and then (y, x), which contradicts the fact that SS is a TCS. Hence, we may assume that all networks in N S have the cherry {x, y}.

If y is not forbidden after S, we can switch the roles of x and y in the remaining part of the sequence S to get a new TCS SS * for N . In S * , the first occurrence of (x, y) or (y, x) is S * i = (x, y), and we are in the previous case. If y is forbidden after S, repeated application of Lemma 13 on SS and S i gives a sequence S(y, x)S for N .

Bounding Reducible Pairs in Networks with All Leaves. In the algorithm in [7] , a bound on the number of cherries after having reduced all trivial cherries was required to compute the running time. Here, we require something similar; we require a bound on the number of reducible pairs after we have reduced all the trivial pairs. [7] prove such bounds by first focusing on the case where all input trees have the same leaf set. We do the same, by first focusing on the case where all input networks have the same leaf set.

Let N be a set of networks. Then the set of displayed trees of N is the set of all trees that are displayed by the networks of N . N = {N 1 , . . . , N I } be a set of tree-child networks on a common set of leaves such that there exists a TCS S for N . If N does not contain any trivial pairs, then the set of displayed trees of N has no trivial cherries. Lemma 16 ([7] Lemma 10). Let T be a set of phylogenetic trees with leaf set X such that there is a tree-child network N with k reticulations that displays T . If T has no trivial cherries, then the total number of cherries of the trees in T is at most 4k.

Lemmas 15 and 16 gives the bound on the number of reducible pairs for networks with common leaf sets. Lemma 17. Let N be a set of tree-child networks with leaf set X such that there is a tree-child network N with k reticulations that displays N . If N has no trivial pairs, then the total number of reducible pairs of the networks in N is at most 8k.

Proof. Each reducible pair of a network is a cherry in one of its displayed trees, and the set of displayed trees is displayed by the solution network N as well. Hence, by Lemma 16, there are at most 8k reducible pairs in the trees, and therefore at most 8k reducible pairs in the networks.

Bounding Reducible Pairs in General. Recall that the algorithm will build a TCS by successively appending reducible pairs; it terminates upon finding the shortest possible sequence that reduces all the input networks. In the process, it branches on all possible non-trivial pairs that the input network may have. Depending on the sequence that is being built, it is possible that leaves that exist in some of the input networks (after reduction by the existing sequence) may have already been deleted from others. Here, we show that even for these instances, it is still the case that the number of possible reducible pairs that we can branch on is bounded by 8k. This result follows directly from Lemma 7 of [7] : we change the wording of the statement slightly to accommodate for network inputs. The idea of the proof is as follows. Let j be such that N S [:j] has no trivial pairs. Then we find a set of tree-child networksN j on X with the same set of reducible pairs as N S [:j] and tree-child hybridization number at most k. By Lemma 17, this shows that N S [:j] has at most 8k reducible pairs.

The set of networks is constructed by adding back each missing leaf to each network in N S [:j] at the root. The order in which they are placed at the root is the same as the order in which these leaves appear as first element in S [:j] . Now, we may construct a TCS of the same weight as S that reduces this set of networks. By first reducing the part that corresponds to the part in N S [:j] , and then the leaves placed by the root, we have a TCS that reducesN j of weight at most k: y j+2 ) , . . . , (x r , y r ), (x 1 , y r ), (x 2 , y r ), . . . , (x j , y r ).

An example of the corresponding networks and their embeddings can be found in Fig. 3 . 

Our algorithms are the same as those presented in [7] , except for the following changes.

-The input set of trees T is changed into an input set of tree-child networks N ; -trivial cherries are now trivial pairs; -In line 4, the stop condition of a non-pickable reticulated cherry is added;

The first change is necessary for the algorithm to take an input consisting of networks. The second change is necessary as not all reducible pairs are cherries anymore, when the input consists of networks. The while-loop that reduces all the trivial pairs is still correct in the algorithm, because there is an optimal sequence that first reduces all trivial pairs (Lemma 14). The last change makes Procedure TreeChildSequence(N , S, k) Input: A collection N of tree-child networks, a partial TCS S, an integer k; Output: An optimal TCS SS of weight at most k for N if such a sequence exists; Fail otherwise; 1 while There exists a trivial pair (x, y) in N S with y not forbidden by S do 2 Set S = S(x, y); N has a cherry (x, y) sure we stop when the reduced input N S cannot be fully reduced using a TCS that can be appended after the prefix S. Otherwise, the algorithm is still correct. Indeed, the algorithm branches over all non-trivial pairs, to find a shortest sequence that reduces all input networks; and this shortest sequence corresponds to a network with minimal reticulation number that displays all input networks. Furthermore, the running time follows as each branch-out is over at most 8k pairs, and the search depth is at most k.

Theorem 19. Let N be a set of tree-child networks on a set of taxa X. If there exists a tree-child network with at most k reticulations that displays N , then it can be found in O((8k) k · poly(|X|, |N |)) time using TreeChildNetwork(N , k).

In this paper, we have introduced Network Hybridization, the problem of finding a network with minimal reticulation number that displays a set of net-Procedure TreeChildNetwork(N , k) Input: A collection N of tree-child networks, an integer k; Output: A tree-child phylogenetic network N on X that displays N with reticulation number at most k, if such a network exists; otherwise None; works. We showed that the Tree-child Network Hybridization problem, in which we restrict our inputs and output to be tree-child networks, can be solved by making slight adjustments to the FPT algorithm presented in [7] .

In practice, our algorithm can be sped up using the heuristic improvement that was introduced in [7] . We may consider branch reduction, in which we ignore parts of the search tree where no better solution can be found.

For this problem, FPT is essentially the best we can do, because solving the Network Hybridization problem for an input set of tree-child networks is NP-hard. This follows from the fact that it is already NP-hard for an input set of trees. It has recently been shown that if all level-(k−1) subnetworks of a level-k tree-child networks are given, this network can be constructed in polynomial time [15] . In other words, the Tree-child Network Hybridization problem is easy to solve when we are given all level-(k −1) subnetworks of a level-k network. This suggests that the problem becomes easy if the difference in reticulation number between the inputs and the output network is bounded. We wonder if this is still true for networks that are not tree-child, and therefore it would be interesting to see whether the Hybridization problem is FPT with this difference in reticulation number as parameter. And, if this is the case, whether the current algorithm can be proven to have this running time.

Recall that a TCS is a sequence of ordered pairs with two conditions imposed on them: the first condition ensures that we obtain a network from the sequence upon using the ConstructNetworkFromSequence algorithm; the second condition ensures that the network we obtain is tree-child. Upon removing this second condition from sequences of ordered pairs, we obtain what is called a cherry-picking sequence [13] . Networks that can be reduced by a cherry-picking sequence are called orchard networks [3, 13] . A natural extension of the results we have presented in this paper would be to consider the following problem.

Input: A set of orchard networks N on X. Output: An orchard network that displays N with minimal reticulation number.

Ideally, in Algorithm TreeChildSequence, we would simply remove the tree-child condition to obtain an algorithm which works for orchard networks as well. However, simply doing so could potentially result in a much higher running time, as we do not have a bound on the number of reducible pairs for orchard networks (see Fig. 4 ). Nevertheless, solving Orchard Network Hybridization could lead to better upper bounds for the network hybridization number, and the algorithm could still be efficient in practice. In this light, this paper has taken the first step towards finding good solutions for Network Hybridization.

x 1 x 2 x 3

x n1−1 x n1 y 1 y 2 y 3 y n2 y n2−1 y n2−2 z 1 z 2 z 3 z n2−2 z n2−1 z n2 Fig. 4 . An orchard network with n1 + n2 − 1 reticulations such that the set of displayed trees have at least n1n2 cherries.

A framework for representing reticulate evolution

Computing the minimum number of hybridization events for a consistent evolutionary history

A class of phylogenetic networks reconstructable from ancestral profiles

Quarnet inference rules for level-1 networks

Reconstructing phylogenetic level-1 networks from nondense binet and trinet sets

Cherry picking: a characterization of the temporal hybridization number for a set of phylogenies

A practical fixedparameter algorithm for constructing tree-child networks from multiple binary trees

Hybridization number on three rooted binary trees is EPT

A quadratic kernel for computing the hybridization number of multiple trees

Trinets encode tree-child and level-2 phylogenetic networks

Leaf-reconstructibility of phylogenetic networks

Binets: fundamental building blocks for phylogenetic networks

Solving phylogenetic network containment problems using cherry-picking sequences

Attaching leaves and picking cherries to characterise the hybridisation number for a set of phylogenies

Reconstructing tree-child networks from reticulate-edge-deleted subnetworks

Trilonet: piecing together small networks to reconstruct reticulate evolutionary histories

Fixed-parameter algorithms for maximum agreement forests