key: cord-0719903-kay1fonx authors: Meyer, Nuala J; Calfee, Carolyn S title: Novel translational approaches to the search for precision therapies for acute respiratory distress syndrome date: 2017-05-26 journal: Lancet Respir Med DOI: 10.1016/s2213-2600(17)30187-x sha: 0e31cc3ff77d6387aa5ab8f0c3a5fef49682ad50 doc_id: 719903 cord_uid: kay1fonx

In the 50 years since acute respiratory distress syndrome (ARDS) was first described, substantial progress has been made in identifying the risk factors for and the pathogenic contributors to the syndrome and in characterising the protein expression patterns in plasma and bronchoalveolar lavage fluid from patients with ARDS. Despite this effort, however, pharmacological options for ARDS remain scarce. Frequently cited reasons for this absence of specific drug therapies include the heterogeneity of patients with ARDS, the potential for a differential response to drugs, and the possibility that the wrong targets have been studied. Advances in applied biomolecular technology and bioinformatics have enabled breakthroughs for other complex traits, such as cardiovascular disease or asthma, particularly when a precision medicine paradigm, wherein a biomarker or gene expression pattern indicates a patient's likelihood of responding to a treatment, has been pursued. In this Review, we consider the biological and analytical techniques that could facilitate a precision medicine approach for ARDS.

In the 50 years since Ashbaugh and colleagues 1 first described acute respiratory distress syndrome (ARDS), substantial progress has been made in identifying the pathogenic contributors to the syndrome and in improving the ventilatory support of patients. 2 Concurrent improvements in the management of sepsis, 3,4 the most common precipitant of ARDS, and ventilator liberation, sedation practices, and preventive therapies, 5,6 have also contributed to the improved survival of patients with ARDS.
Nonetheless, in observational studies 7,8 from across the world, the mortality of patients with ARDS remains high; about one in three patients diagnosed with ARDS will not survive 60 days. Pharmacological treatments for ARDS remain scarce, with only one drug (cisatracurium besilate) showing a potential, albeit probably non-specific, benefit in a randomised trial of 340 patients. 9 The need for new therapies for ARDS is indisputable, yet numerous well conceived studies have been unable to identify an effective treatment. 10-14 Drugs might have been ineffective because responses to, or the toxic effects of, therapy are non-uniform; because of problems with drug timing, duration, or delivery; or because of incomplete understanding of the pathogenesis or heterogeneity of ARDS. 15-17 A precision approach, whereby therapies are specifically targeted to patients most likely to benefit, might overcome such limitations and is broadly advocated for ARDS. 18,19 In this Review, we consider tools and approaches that might facilitate the identification of novel and potential precision therapies for ARDS by identifying the mechanisms and therapeutic relevance of specific biological pathways, by identifying subgroups of patients likely to respond to a particular therapy based on their biology, or (perhaps ideally) both.

Substantial inflammation in the alveolar and plasma compartments has been a recognised characteristic of patients with ARDS since the 1980s. High concentrations of inflammatory cytokines have been associated with both injurious ventilator strategies and worse outcomes.
20-23 Beyond inflammation, mechanisms including endothelial activation, respiratory epithelial dysfunction, and surfactant depletion have been established as major contributors to the pathogenesis of ARDS, 24,25 and candidate studies 26-30 of individual biomarkers and genes in these pathways have yielded and continue to yield some important associations (table 1). Novel analytical techniques to be discussed in this Review might highlight new uses for some of these accepted markers to refine ARDS endotypes or to serve as enrichment markers in future trials. To date, however, improved molecular understanding of the pathogenic contributors to ARDS has yet to translate into mortality reductions for patients with the syndrome. How could a paradigm-shifting breakthrough be facilitated? Perhaps the field of ARDS has persisted in looking under the light, restricting inquiry to genes and proteins already hypothesised to affect lung injury. 39

• Effective pharmacotherapies for acute respiratory distress syndrome (ARDS) are still elusive, and the clinical and biological heterogeneity of ARDS suggests that precision approaches might be helpful.
• Novel approaches to understanding ARDS biology, identifying discrete subgroups of ARDS, or both are needed to facilitate precision therapies for ARDS.
• Discovery-based approaches, combined with candidate marker analyses, might help to identify new pathways relevant to ARDS for subsequent testing in hypothesis-driven experiments, potentially including novel preclinical models of ARDS.
• New approaches to analyses of complex clinical and biological data might help to identify distinct ARDS subgroups that might be suitable for targeted therapies.
Although investigation of selected candidate pathways, identified from experimental and observational investigations of ARDS, does and should continue, a complementary approach is to use discovery methods independent of any understanding of the pathogenesis of ARDS. Such "bias-free" approaches include assessing large-scale variation in genomic DNA, transcriptome (mRNA) expression, non-coding RNA, proteins, lipids, or metabolites and comparing patients with ARDS with at-risk controls. Generally, the goal of such approaches is to identify novel candidates that can then be investigated in more traditional hypothesis-testing experiments (figure 1). We stress that discovery approaches are not inherently superior to hypothesis-driven, candidate investigations, but rather that discovery approaches can bring new candidates to the pipeline of investigation. Discovery approaches are not predicated on our current understanding of ARDS biology but rather on the investigator's ability to select representative biospecimens and populations of patients and unaffected controls. Thus, such approaches are free from the biases introduced by our current molecular understanding of ARDS, such as which pathways are activated and which molecules incite or perpetuate lung injury. We refer to discovery approaches as bias-free when they are also independent of known genetic or proteomic sequences; for example, sequencing is bias-free, whereas a whole-genome gene expression array requires knowledge about transcripts to design specific probes (table 2). Although a sceptic might view discovery approaches as akin to a fishing expedition, the broad goal of these studies is not merely to identify the most discriminatory marker between samples, but also to expand our consideration of pathways and mechanisms not presently thought to contribute to disease and to explain why an unexpected marker might differ significantly between cases and non-cases or between target tissues.
[Table fragment: (IL8), IL6, IL1B, TNF, CSF2 (GM-CSF), IL10, IL1RN, MBL2, NFKBIA, TIRAP, TLR1, PI3, IRAK3, DARC, ADIPOQ IL1R2, FTL, PI3, S100A2 ·· IL8, IL6, IL1B, IL18, CSF2 (GM-CSF]

Genome-wide association studies attempt to find regions of the genome in which genetic variation among unrelated cases is consistently skewed compared with the background population. 40 An extension of classic family-based genetic linkage studies, genome-wide association studies rely on very large populations to detect common genetic variants with small-to-moderate effect sizes and stipulate an extreme statistical imbalance (p values of less than 5 × 10⁻⁸) to declare significance. One genome-wide association study of trauma-associated ARDS has been reported; 41 no single nucleotide polymorphism (SNP) achieved genome-wide significance, although a replicating variant in the liprin-α gene that warrants further mechanistic investigation was identified. This first attempt at a genome-wide association study of ARDS was instructive for highlighting its insufficient power despite including 600 patients with ARDS and more than 2000 healthy controls in a discovery population, followed by an additional 500 critically ill patients in a replication population. 41 Although a genome-wide association study is a powerful tool with reasonable consensus about analytical strategy, additional considerations might restrict its application to a trait such as ARDS. For example, although ARDS is the cause of many thousands of deaths each year, clinical recognition of ARDS is poor, 7 and no validated method exists to diagnose ARDS with electronic medical record coding. 42

Whereas genome-wide association studies search for common variants that are involved in common diseases, each variant conferring just a small fraction of altered risk, rare-variant analyses use DNA sequencing to detect polymorphisms that could have a large effect but are highly uncommon.
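The genome-wide significance threshold mentioned above is conventionally motivated as a Bonferroni correction of a 5% family-wise error rate across roughly one million independent common variants. The short sketch below works through that arithmetic; the figure of one million tests is the usual approximation rather than a value stated in this Review, and the function names are illustrative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# Assumption: about one million independent common variants are tested.
GENOME_WIDE_THRESHOLD = bonferroni_threshold(0.05, 1_000_000)


def is_genome_wide_significant(p_value: float) -> bool:
    """True when a single-variant p value clears the genome-wide threshold."""
    return p_value < GENOME_WIDE_THRESHOLD


print(GENOME_WIDE_THRESHOLD)
print(is_genome_wide_significant(3e-7))  # a nominally strong signal that still fails
```

A variant with p = 3 × 10⁻⁷ would look striking in a candidate study yet falls short here, which helps to explain why even a study with hundreds of cases can report no genome-wide significant SNP.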
45 Sequencing of genomic coding regions (so-called exome sequencing) was done in a collaborative effort sponsored by the National Heart, Lung, and Blood Institute 46 and included 96 cases of ARDS. When ARDS cases were compared with presumably healthy controls in the 1000 Genomes Project, two SNPs in the genes XK related 3 and arylsulfatase D showed differential expression in ARDS cases, 31

[Table 2 fragment: Targeted amplicon sequencing; next-generation sequencing (also known as metagenomics)]

We refer to candidate testing when an individual known entity (SNP, specific transcript, protein, etc) is quantified, whereas medium-throughput or high-throughput discovery approaches rely on known features but are capable of multiplex assays. For example, genome-wide association studies or cDNA microarrays rely on the knowledge of a genetic sequence to generate probes that will assay each SNP or transcript. By contrast, bias-free methods imply that the features being detected might be unknown; for example, next-generation sequencing sequences all nucleic acids detected, regardless of whether a sequence is recognised. MALDI-TOF=matrix-assisted laser desorption/ionisation time-of-flight. cDNA=complementary DNA. miRNA=microRNA. ncRNA=non-coding RNA. SNP=single-nucleotide polymorphism. *Chromatin-immunoprecipitation sequencing is a method to detect and sequence areas of the genome where proteins interact with DNA to, for example, understand transcription factor binding. †Bisulfite sequencing is a method that uses bisulfite treatment of DNA to uncover DNA methylation patterns that might give clues to DNA regulation. ‡Vary by cell type and time. §Vary by location and time.

detect major deleterious genetic disruptions, has shown that, for a fraction of the population, the clinical phenotype is explained by multiple major genetic variants rather than just one.
47 The potential for multiple variants that affect ARDS coexisting in one patient is even more relevant for common (non-coding) genetic variation and, if unrecognised, this issue might contribute to imprecise risk estimates or restricted reproducibility.

The search for new candidates involved in the pathogenesis of ARDS is not restricted to genomic DNA. Discovery approaches exist to interrogate transcript expression (so-called transcriptomics), DNA modifications (so-called epigenomics), proteins, metabolites, and microbiota. The key technological shift that facilitated the explosion of these 'omic fields was the ability to perform high-throughput feature capture and identification without a-priori designation of which features would be sampled. In the case of transcriptomics, in which mRNA is the feature under investigation, the field has progressed from whole-genome arrays (ie, microchips coated with thousands of oligonucleotide probes specific to the complementary DNA [cDNA] sequence of roughly 22 000 known genes) to RNA sequencing, whereby cDNA or occasionally RNA is sequenced using next-generation sequencing technology. 48 The main advantage of RNA sequencing compared with microarrays is that RNA sequencing does not rely on probes designed to capture cDNA variation based on previous knowledge of the genome, but instead applies sequencing to all cDNA captured from the sample. 49 As a result of RNA sequencing investigations, we are gaining insight into the regulatory roles of non-coding RNA and small RNA species, the potential for alleles that do not change protein sequence to affect expression, and the complexity involved in isoform determination.
50 The challenges involved in applying RNA sequencing to a phenotype such as ARDS are not trivial and can be summarised as obstacles facing the broad field of transcriptomics (including developing bioinformatic solutions to high-dimensional data analysis, quality assurance, and achieving a consensus on significance testing) and burdens that might be unique to ARDS. Burdens unique to ARDS are fundamental questions about study design: which tissue(s) should be profiled by RNA sequencing to discover new markers and inform about ARDS pathogenesis? Lung tissue is rarely available while the patient is alive with early ARDS, 51 so are alveolar macrophages an adequate substitute? Kovach and colleagues 52 reported the first microarray analysis of human alveolar macrophages collected from bronchoalveolar lavage fluid samples from patients with early ARDS (days 0-4) and identified a pattern of relative immune tolerance distinguishing ARDS-derived alveolar macrophages from those of healthy controls. Although the contribution of alveolar macrophages in lung host defence, injury, and resolution is established, 53 it would seem unwise to ignore the potential contributions of lung-resident neutrophils and lymphocytes, as well as the endothelial, epithelial, and stromal cells of the lungs. Attempts to characterise gene expression in whole-blood or circulating mononuclear cells have also been made for ARDS 54,55 and have implicated dysregulation in neutrophil-expressed genes and the ferritin heavy chain, although these findings await replication.

High-throughput approaches are also feasible for analysing the expression patterns of proteins or small molecule metabolites in body fluid or tissue. Rather than attempting to quantify a protein or molecule of known identity, these techniques first separate the population of unknown proteins or small molecules, then quantify and identify them.
Advances in the granular separation of proteins or small molecules with differential light absorption and activation (ie, matrix-assisted laser desorption/ionisation time-of-flight [MALDI-TOF] mass spectrometry), and the resolution of compounds with gas and liquid chromatography, have led to substantial improvements compared with two-dimensional gel electrophoresis. Additionally, mass spectrometry and nuclear magnetic resonance spectroscopy have improved the accuracy and sensitivity of the identification of unknown proteins and metabolites. 56 A high-dimensional proteomic analysis 57 of bronchoalveolar lavage fluid from patients with ARDS identified differential protein expression between ARDS survivors and non-survivors over the first 4 days of ARDS, with non-survivors manifesting decreased expression of proteins related to coagulation, iron homoeostasis, and immune activation and increased expression of proteins related to glycolysis, collagen metabolism, and the actin cytoskeleton. High-throughput proteomic screening of plasma from patients with ARDS identified increased expression of glycoproteins and serum amyloid A and decreased expression of complement factor H and apolipoprotein A, B, and C compared with plasma from healthy controls. 58 However, discovery proteomics in lung injury has been hampered by several factors, including the complexity of the plasma proteome and overwhelming signals from high-abundance plasma proteins (eg, albumin), which have limited the insights gained to date; 59 thus, most of the recent advances in this area have come from candidate marker analyses. Indeed, the expectation is that each promising feature identified through a discovery approach should then be replicated as a candidate marker, as is highlighted by the iterative example in figure 1; the most compelling associations will show converging support from multiple lines of evidence.
Large-scale metabolomics has also been done with both bronchoalveolar lavage fluid and plasma from patients with ARDS compared with healthy controls. 34,35 Whereas both bronchoalveolar lavage fluid and plasma showed increased expression of lactate, citrate, and creatine, other metabolites were specific to each fluid. For example, bronchoalveolar lavage fluid from patients with ARDS was characterised by increased expression of guanosine, xanthine, and hypoxanthine (metabolites of guanosine and uric acid metabolism pathways) and decreased expression of phosphatidylcholines, the phospholipids that constitute pulmonary surfactant. 34 Plasma from patients with ARDS was characterised by metabolites that might indicate disrupted oxidant stress signalling (glutathione), energy homoeostasis (adenosine), endothelial barrier function (sphingomyelin), and apoptosis (phosphatidylserine). 35 Although exciting, each of these proteomic and metabolomic investigations was done with very small sample sizes (n<20 individuals), and the direct translation from protein or metabolite identification to ARDS diagnosis or therapy is not obvious. 34,35,57,58 By contrast, another group examined the metabolome of exhaled breath condensate collected from the ventilators of more than 100 critically ill patients and identified three volatile compounds (octane, acetaldehyde, and 3-methylheptane) in the exhaled breath condensate of patients with ARDS compared with ventilated controls. 60 By use of training and validation sets, the authors of that study showed that these three metabolites had a reasonable diagnostic performance on their own and that discrimination was improved when they were added to the Lung Injury Prediction score compared with the score used alone. 60-62

The lung microbiome represents another opportunity to apply cutting-edge sequencing techniques in a bias-free manner.
Dickson and colleagues 63 showed that, in a mouse model of sepsis induced by caecal ligation and puncture, the lung microbiota was rapidly modified by sepsis and gut-associated bacteria were the dominant community for 5 days. The gastrointestinal tract, and not the upper respiratory tract, was shown to be the source of sepsis-induced lung microbiome alterations for mice with sepsis from systemic insults, whereas this shift from typical lung-resident microbes to gut-associated bacteria was not observed in mice exposed to intratracheal lipopolysaccharide to model direct lung injury. 63 In support of these findings, 63 sequencing of bacterial ribosomal RNA from the bronchoalveolar lavage fluid of 68 patients with ARDS identified gut-associated Bacteroides species in 28 (41%) patients, compared with only one (3%) of 26 healthy controls. Enrichment of gut bacteria in bronchoalveolar lavage fluid was associated with the plasma concentration of inflammatory markers. Thus, for non-pulmonary sepsis-associated ARDS, translocation of bacteria from the gut to the lungs might have a more prominent role than previously recognised, because traditional culture methods are relatively insensitive to anaerobic bacteria. Whether this mechanism can be targeted to prevent or treat ARDS, however, remains untested.

As the potential for each 'omic method to be applied to ARDS is considered, the sheer volume of data generated is potentially staggering. How should this wealth of biological data, often coupled with extensive clinical data from the critical care setting, be analysed to maximise our potential insight into novel treatment approaches for ARDS? Even for candidate marker analyses, novel analytical approaches might be needed to maximise the insight gained and integrate with complex molecular and clinical data.
Traditionally, biological data from human beings with ARDS have been analysed in one of two ways: by identifying the biological phenotypes most strongly associated with a clinical outcome of interest (often death), with the idea that targeting these pathways might thereby reduce mortality, or by dividing patients into categories based on clinically evident or presupposed characteristics and then comparing their clinical and biological phenotypes to identify distinct subgroups that might respond differently to treatment.

A common approach has been to measure biomarkers thought to reflect specific biological pathways and test their association with poor outcomes, typically with regression-based methods. This approach is appealingly straightforward; it is intuitive and, at least in its simplest form, requires minimal advanced training in statistical analysis for either the investigator or the reader of the final product. Moreover, this approach has led to substantial advances in our understanding of the biology of human ARDS. Over the past three decades, studies ranging from small, single-centre investigations to secondary analyses of large multicentre clinical trials have reported associations between poor clinical outcomes and biomarkers of key pathways of lung injury (for example, injury to the lung epithelium and endothelium) as well as numerous inflammatory pathways. Although a comprehensive review of these studies is beyond the scope of this Review, several of the key studies are cited in table 1. This approach is particularly well suited to confirming the human relevance of biological pathways identified in laboratory models of ARDS. This approach has most commonly been used to analyse candidate plasma proteins and genetic polymorphisms, although more recent investigations have focused on unbiased discovery methods such as DNA and RNA sequencing, metabolomics, and proteomics. This approach also has notable limitations.
First and most important is the old adage that correlation does not imply causation; put differently, biological markers associated with mortality in ARDS might not be causal drivers of ARDS mortality and, therefore, might not be good therapeutic targets. Studying these associations in the setting of a randomised trial, and incorporating analysis of the response of the biomarker in question to the randomly assigned therapy, might partially mitigate but not eliminate this issue. 21 Second, the regression-based models typically used for these types of analyses do not mandate or necessarily facilitate analyses of heterogeneity within ARDS, so as to identify treatment-responsive subgroups. Alternative analytical approaches are better suited than regression methods for this particular question.

A second approach often applied to patient data to search for novel treatment strategies for ARDS is to divide patients into subgroups based on pre-existing hypotheses or clinically evident features and then to compare the biology of these subgroups. For example, for decades, investigators have focused on whether direct lung injury (ie, ARDS resulting from direct damage to the lung parenchyma, such as in pneumonia or aspiration of gastric contents) differs from indirect lung injury (ie, ARDS resulting from an insult distant to the lung parenchyma, such as non-pulmonary sepsis or massive blood transfusion). 64 Clinical studies 36,65 have supported this distinction, identifying differences in endothelial injury and inflammation between patients with direct and indirect lung injury. Other similar types of analyses have subdivided ARDS into diffuse versus focal radiographic changes, or into subgroups based on clinical risk factors (eg, trauma vs sepsis vs other).
66-68 Similar to outcome-focused methods, these approaches have the advantage of being intuitive and logical, particularly for clinicians who can recognise in their own practice the obvious clinical differences between subgroups of patients with different clinical features or natural histories. However, these approaches rely on our pre-existing biases about disease classification and, to date, have not led to successful clinical trials targeting specific ARDS subgroups.

As our ability to quantify biological complexity has advanced over the past several decades, so too have novel statistical methods designed to disentangle heterogeneity within complex datasets. The use of these novel statistical methods in other disease phenotypes, such as asthma, has led to substantial progress towards the discovery of disease endotypes (that is, specific subgroups of a syndrome with distinct pathobiologies and differential responses to treatment). Some of these methods are now being used in translational studies of critical illness, with a similar goal.

Cluster-based methods encompass various analytical techniques that share the broad goal of identifying groups (clusters) of observations with similar characteristics. These types of methods, such as hierarchical clustering and k-means clustering, have commonly been applied to genomic data, with the aim of identifying clusters of patients with similar gene expression patterns. Once biological data have been used to identify patient clusters, the clusters can subsequently be analysed for differences in clinical outcomes, clinical phenotypes, or other variables of interest. Multiple examples exist of the use of these techniques in asthma, in which they have helped to identify the T-helper-2-high endotype in several studies. 69,70 In critical care, cluster-based methods have been used to identify subclasses of paediatric septic shock.
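As a rough illustration of the k-means clustering named above, the sketch below groups eight hypothetical patients by two standardised biomarker values. The data, the deterministic farthest-first initialisation, and all function names are assumptions made for this example; they are not taken from any study cited in this Review, and real analyses would normally use an established statistical package.

```python
def dist2(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: dist2(p, centroids[c]))
            for p in points]


def kmeans(points, k, max_iter=100):
    """Plain k-means; k is chosen by the analyst, not learned from the data."""
    # Deterministic farthest-first initialisation (an illustrative choice).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(tuple(sum(v) / len(members) for v in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return assign(points, centroids), centroids


# Hypothetical standardised values of two biomarkers for eight patients.
patients = [(0.2, 0.1), (0.4, 0.3), (0.1, 0.4), (0.3, 0.2),
            (2.1, 1.9), (2.4, 2.2), (1.9, 2.3), (2.2, 2.0)]

labels_k2, _ = kmeans(patients, k=2)
labels_k3, _ = kmeans(patients, k=3)
print(labels_k2)
print(labels_k3)
```

Requesting three clusters from the same data still returns three groups, even though the data were constructed with two; the algorithm partitions the observations into however many clusters it is asked for.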
71,72 These methods have several advantages, including an explicit focus on reducing heterogeneity by identifying subgroups that are internally similar, an unbiased approach free from presupposition, and facile visual representation (ie, the classic heat map). Another advantage for use of clustering in ARDS is that it can be done on baseline characteristics without consideration of outcome. However, one notable disadvantage of cluster-based methods is that they will always identify clusters 73 and do not test a specific hypothesis regarding the number of clusters in a particular class. This parameter must be set externally and, as such, might be prone to bias regarding the number of clusters present, although methods exist to estimate the optimal number of clusters from the data. 74

Another advanced analytical approach, which, similar to cluster analysis, often falls under the rubric of machine learning, is classification trees, often referred to as classification and regression tree analysis. This approach is designed to identify unexpected cutpoints in a set of data that permit classification into subgroups that might not be readily apparent. This approach generates a branching tree-like structure that branches at specific cutpoints of a given variable and ends in multiple terminal nodes, which are often distinguished by differences in outcomes. This method has the advantage of being able to identify patterns in the data that would otherwise be unapparent. However, classification trees are also notorious for overfitting the model to the data and, as such, mandate external validation in independent datasets. Tree-based models have been used in hospital inpatients to identify predictors of clinical deterioration, 75 in adult septic shock to refine prognostic stratification based on plasma biomarkers, 76 and in ARDS to identify clinical features associated with poor outcomes.
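The cutpoint search at the heart of a classification tree can be sketched for a single variable. The example below scans every candidate threshold of one hypothetical biomarker and keeps the split with the lowest weighted Gini impurity, a criterion commonly used in classification and regression tree analysis. The data and function names are invented for illustration; a full tree would repeat this search recursively within each branch and across all variables.

```python
def gini(outcomes):
    """Gini impurity of a list of binary outcomes (0 = survived, 1 = died)."""
    if not outcomes:
        return 0.0
    p = sum(outcomes) / len(outcomes)
    return 2.0 * p * (1.0 - p)


def best_cutpoint(values, outcomes):
    """Return (cutpoint, weighted impurity) for the best single split."""
    pairs = sorted(zip(values, outcomes))
    best_cut, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate identical values
        cut = (pairs[i][0] + pairs[i - 1][0]) / 2.0
        left = [o for _, o in pairs[:i]]
        right = [o for _, o in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_cut, best_score = cut, score
    return best_cut, best_score


# Hypothetical biomarker concentrations and outcomes for eight patients.
biomarker = [1.0, 1.5, 2.0, 2.5, 6.0, 6.5, 7.0, 8.0]
died = [0, 0, 0, 1, 1, 1, 0, 1]

cutpoint, impurity = best_cutpoint(biomarker, died)
print(cutpoint, impurity)
```

Because the threshold is chosen to fit this particular sample as closely as possible, it illustrates why such trees tend to overfit and need validation in independent data.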
77 These trees are constructed based on the association between the measured variables and a specific clinical outcome, so the caveats regarding outcome-based models also apply to these trees (ie, the variables that identify the branches in the tree might not be causally related to the observed differences in outcomes). Similar to cluster-based models, tree-based models also require external and potentially arbitrary decisions regarding the number of branches and terminal nodes, although methods with resampling and cross-validation have been developed to inform these decisions. 75

Latent class analysis is an approach derived from mixture modelling that is explicitly designed to identify hidden (so-called latent) subgroups within a larger group. Latent class analysis has been widely applied in psychiatric research and has been an important contributor to the study of asthma endotypes. 78,79 One major strength of latent class modelling is that it can test the hypothesis that a certain number of classes (k) fits the data better than one fewer class (k - 1), providing more objective confirmation of the optimal number of classes for a given dataset. In ARDS, use of latent class analysis has identified two distinct subphenotypes in independent analyses of three randomised trials. 15,17 The two subphenotypes were clearly distinguished in part on the basis of biomarker profiles, with one subphenotype characterised as hyperinflammatory compared with the other subphenotype, and by their responses to randomly assigned positive end-expiratory pressure and a fluid-conservative management strategy. When applied to trauma cohorts, latent class analysis also identified distinct subgroups of ARDS that were distinguished largely by time to ARDS development, but also by clinical characteristics and plasma biomarker expression.
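To make the comparison of k classes against k - 1 classes concrete, the sketch below fits latent class models with one and two classes to a small set of invented binary indicators and compares them. It uses the Bayesian information criterion (lower is better) as a simple stand-in for the formal tests that dedicated latent class software provides; the data, starting values, and function names are all assumptions made for this illustration.

```python
import math


def _e_step(rows, weights, probs):
    """Posterior class memberships and total log-likelihood."""
    resp, log_lik = [], 0.0
    for x in rows:
        logs = [math.log(w) + sum(math.log(p if xi else 1.0 - p)
                                  for xi, p in zip(x, item_p))
                for w, item_p in zip(weights, probs)]
        top = max(logs)
        total = sum(math.exp(v - top) for v in logs)
        log_lik += top + math.log(total)
        resp.append([math.exp(v - top) / total for v in logs])
    return resp, log_lik


def fit_latent_classes(rows, k, n_iter=200):
    """Fit a k-class model to binary indicators by expectation-maximisation."""
    n, m = len(rows), len(rows[0])
    weights = [1.0 / k] * k
    # Deterministic, class-specific starting values (illustrative choice).
    probs = [[(c + 1) / (k + 1)] * m for c in range(k)]
    for _ in range(n_iter):
        resp, _ = _e_step(rows, weights, probs)
        for c in range(k):
            size = sum(r[c] for r in resp)
            weights[c] = max(size / n, 1e-6)
            for j in range(m):
                p = sum(r[c] * x[j] for r, x in zip(resp, rows)) / max(size, 1e-12)
                probs[c][j] = min(max(p, 1e-6), 1.0 - 1e-6)
    _, log_lik = _e_step(rows, weights, probs)
    return log_lik, weights, probs


def bic(log_lik, k, n_items, n_rows):
    """Bayesian information criterion for a k-class binary-indicator model."""
    n_params = (k - 1) + k * n_items
    return -2.0 * log_lik + n_params * math.log(n_rows)


# Hypothetical data: four binary markers (1 = raised) in 50 patients.
rows = ([[1, 1, 1, 1]] * 15 + [[1, 1, 1, 0]] * 5 + [[1, 0, 1, 1]] * 5 +
        [[0, 0, 0, 0]] * 15 + [[0, 0, 0, 1]] * 5 + [[0, 1, 0, 0]] * 5)

bic_by_k = {k: bic(fit_latent_classes(rows, k)[0], k, 4, len(rows))
            for k in (1, 2)}
print(bic_by_k)
```

In this constructed dataset the two-class model has the lower value, so it would be preferred over a single class; with real data the comparison would be extended to three or more classes and checked for stability.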
80 One disadvantage of latent class analysis is that relatively large datasets (n>300 individuals) are typically needed to confidently fit latent class models.

Recognising that heterogeneity is a cardinal feature of ARDS that might confound traditional genomic approaches, an alternative approach is to apply genomic tools (eg, genome-wide association studies) to an intermediate trait that has meaning in the context of ARDS (eg, a plasma biomarker or expression of an mRNA transcript) and then to apply causal inference methods that might help to evaluate one aspect of the syndrome. For example, candidate studies 81-83 have shown plasma concentrations of angiopoietin-2 (ANG2), a protein secreted by activated endothelium that potentiates vascular permeability, to be a strong marker for ARDS diagnosis, prediction, and prognosis. Discovery genetic approaches also implicate this pathway in ARDS risk, 84 making ANG2 an attractive candidate ARDS intermediate protein. If we consider that every biological process is, to some degree, controlled by our DNA, the genetic regulation of a plasma protein such as ANG2 is far less complex than that of ARDS, because the phenotype of plasma ANG2 concentration is regulated by fewer genes and has a stronger correlation between genetic variation and measured output (ANG2) than does a disease phenotype such as ARDS. Thus, the likelihood of finding genetic variants that explain a high proportion of variance in plasma ANG2 concentration is greater than that of finding variants that explain a large proportion of ARDS risk. 50 Furthermore, quantitative traits are statistically more powerful than dichotomous traits, such that statistically significant results might be detected with only 100 or 200 patients, rather than with the 1000s necessitated by traditional genome-wide association studies, 85 and such sample sizes are far more achievable for populations of critically ill patients.
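The dependence of sample size on effect size described above can be illustrated with a standard normal-approximation formula for a single-variant test on a quantitative trait: the number of patients needed is roughly (z for the significance level + z for the power)² × (1 − R²) / R², where R² is the fraction of trait variance explained by the variant. The sketch below is a back-of-envelope calculation under that assumption; the variance-explained values, the 80% power, and the function name are illustrative and not taken from this Review.

```python
import math
from statistics import NormalDist


def samples_needed(variance_explained, alpha=5e-8, power=0.80):
    """Approximate n to detect a variant explaining a fraction of trait variance.

    Normal approximation for a two-sided, single-variant test in a linear model.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    r2 = variance_explained
    return math.ceil((z_alpha + z_power) ** 2 * (1.0 - r2) / r2)


# Hypothetical effect sizes: strong regulation of a plasma protein versus
# the small effects typical of common variants on disease risk.
for r2 in (0.20, 0.10, 0.01):
    print(r2, samples_needed(r2))
```

Under these assumptions, a variant explaining a fifth of the variance in a plasma marker is detectable at genome-wide significance with fewer than 200 patients, whereas one explaining 1% requires several thousand.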
Genomics can then be used to apply causal inference methods (techniques borrowed from the econometric and social science fields) to infer causality from observational data. The advantage of identifying causal biomarkers in ARDS, rather than markers that merely correlate with ARDS, is that modifying the causal intermediate is more likely to influence disease outcome. Experimental evidence from animal or in vitro models allows us to directly test whether intermediates have direct causal effects; however, it would be unethical to randomly assign people to receive doses of a potentially injurious marker. Thus, methods to infer the effect of a potential mediator (for example, a plasma marker such as ANG2, a bronchoalveolar lavage fluid metabolite, or a whole-blood transcript) on ARDS risk or mortality are desirable. Once a marker is suggested to have a causal role, rigorous experiments in model systems should be done to more definitively show causality.

Mediation analysis is a formal method to explain the mechanism by which a potential explanatory variable influences an outcome via an intermediate, or mediator, variable. Many quantitative traits have a strong genetic component; thus, mediation analysis can be applied to associations between SNPs and disease outcomes to test whether a significant portion of the association is mediated by change in a third variable, the quantitative trait. 86 If the intermediate trait (biomarker, transcript, or metabolite) mediates a significant portion of the observed risk, this provides good evidence that the marker might be mechanistically linked to the disease outcome. Wei and colleagues 87 used mediation analysis to address whether changes in platelet count might explain ARDS risk and mortality.
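The logic of mediation analysis can be sketched with the classic product-of-coefficients decomposition, in which the total effect of an exposure on an outcome splits into a direct effect and an indirect effect passing through the mediator. The example below makes strong simplifying assumptions: linear models, a continuous outcome (a hypothetical severity score rather than a binary ARDS diagnosis), no confounders, and no significance testing. The data and names are invented and are not drawn from the study by Wei and colleagues.

```python
def _cross(u, v):
    """Sum of cross-products about the means (unnormalised covariance)."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def mediation(exposure, mediator, outcome):
    """Decompose the exposure-outcome slope into direct and indirect parts."""
    sxx, smm = _cross(exposure, exposure), _cross(mediator, mediator)
    sxm = _cross(exposure, mediator)
    sxy, smy = _cross(exposure, outcome), _cross(mediator, outcome)
    det = sxx * smm - sxm ** 2
    a = sxm / sxx                            # exposure -> mediator
    b = (smy * sxx - sxy * sxm) / det        # mediator -> outcome, given exposure
    direct = (sxy * smm - smy * sxm) / det   # exposure -> outcome, given mediator
    total = sxy / sxx                        # exposure -> outcome, unadjusted
    indirect = a * b
    return {"total": total, "direct": direct, "indirect": indirect,
            "proportion_mediated": indirect / total}


# Hypothetical data for nine patients.
genotype = [0, 0, 0, 1, 1, 1, 2, 2, 2]                    # copies of a variant allele
platelets = [250, 230, 270, 210, 190, 220, 160, 180, 150]  # mediator, x10^9 per L
severity = [2.0, 2.4, 1.8, 3.1, 3.5, 2.9, 4.2, 3.8, 4.4]  # continuous outcome

result = mediation(genotype, platelets, severity)
print(result)
```

For ordinary least squares the total effect equals the direct effect plus the indirect effect exactly, so the proportion mediated is a well-defined fraction of the unadjusted association; published analyses additionally estimate its uncertainty, for example by resampling.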
The investigators first identified genetic variants that strongly influenced baseline platelet count during critical illness and then assessed whether a SNP strongly associated with platelet count was also associated with ARDS risk; they found that a small but significant portion of the association between a SNP and ARDS was mediated indirectly through changes in baseline platelet count. In a related analysis, 88 the same group used this method to show that a decrease in platelet count mediates a portion of ARDS mortality.

The associations between genetic variants and a quantitative trait such as a plasma marker can also be harnessed for instrumental variable analysis, whereby SNPs that are strongly associated with a biomarker are used to genetically predict the biomarker, and then the association between the genetically predicted biomarker and the outcome is assessed. This approach is sometimes called a mendelian randomisation analysis, because each individual is considered to be randomised, by random assortment of parental alleles, to a genotype that might express high or low levels of a biomarker. Mendelian randomisation studies have provided strong support for the causal contribution of LDL-cholesterol to risk of coronary artery disease; genetic variants associated with the plasma concentration of LDL-cholesterol were also associated with coronary artery disease, and when the concentration of LDL-cholesterol was predicted by genotype, genetically predicted LDL-cholesterol was associated with this disease. 89 Furthermore, dissecting the genetic determinants of LDL-cholesterol concentration has uncovered novel treatment platforms for coronary artery disease, 89, 90 offering an attractive template for ARDS genetic research if appropriate intermediates can be identified.

Another approach to determining the therapeutic relevance of pathways identified with 'omics methods is to test them in novel preclinical models of ARDS.
Although a thorough review of human, animal, and in-vitro models of lung injury is beyond the scope of this Review, several developments that are facilitating the translation from bedside to bench and back again are worth highlighting and are shown in figures 2 and 3. These techniques can interrogate candidates identified by bias-free methods with a rigorous experimental design and, in some cases, can model treatment, while also monitoring the variation in quantitative markers, physiology, or gene expression.

A recognised shortcoming in human ARDS research is the rarity of lung tissue from acute ARDS, which restricts applications that are cell-specific, such as transcriptomic, epigenomic, and proteomic methods. One solution to this scarcity of tissue is to use human lungs that were declined for transplantation. Up to 80% of evaluated lungs are deemed unsuitable for transplant because of poor oxygenation, visible injury, or poor compliance, features that are common to ARDS. 92, 93 Furthermore, to better simulate conditions in vivo, these human lungs can be ventilated and perfused in an ex-vivo lung perfusion (EVLP) system for several hours (figure 2), allowing the observation of physiological measures such as compliance, alveolar fluid clearance, and oxygenation and the sampling of lung tissue, perfusate, and alveolar fluid. 94, 95 Clinical trials (eg, NCT01365429) 92 are being done to test whether EVLP can increase the number of suitable lungs for transplantation while maintaining optimal transplant outcomes. However, these sophisticated systems can also be optimal preclinical models of ARDS in which to screen potential ARDS therapies, 94, 95 because they have the advantages of safety (no human exposure) and access to human lung tissue. By taking biopsy samples of the tissue and using pharmacological inhibitors or agonists, the mechanism of a drug's action can be investigated.
The EVLP system can also be adapted to model a uniform injury by applying bacteria or endotoxin to the lung, making this an adaptable model for hypothesis-testing for ARDS prevention and therapy. 95

Another potential therapeutic screening method uses microfluidic bioengineering and three-dimensional cell culture to produce a microengineered lung-on-a-chip, complete with an alveolar-capillary interface, cyclic stretch to mimic breathing, and perfusion to model circulation (figure 3). 91 This system relies on human cell lines that are capable of persisting in long-term culture; thus, to date, alveolar epithelial cells derived from a lung cancer cell line, rather than primary human alveolar epithelial cells, have been used. Nonetheless, drug-induced permeability pulmonary oedema was modelled with this system and was shown to share many features with clinical toxicity from interleukin-2, including reduced oxygen tension and endothelial and epithelial paracellular gap formation. 91, 96 Most impressively, that model could be used to test whether coadministration of preventive agents (for example, angiopoietin-1 or a transient receptor potential cation channel subfamily V member 4 channel inhibitor) could block the interleukin-2-mediated permeability and, via immunohistochemistry, the mechanism of action could be evaluated. 91 The microfluidic lung-on-a-chip could also be adapted to study therapeutic applications and might be considered a more high-throughput method for drug screening than EVLP because it does not rely on the scarce resource of human lungs. Additionally, if a quantitative trait such as a plasma protein concentration was identified as having causal significance through causal inference modalities or endotype identification, then drug screening could proceed with the secreted protein as an outcome, improving efficiency.
Although promising compounds would still need to be tested in both small and large animal models of ARDS before proceeding to human trials, both EVLP and lung-on-a-chip microdevices might help to select agents most likely to benefit patients.

Finally, the field of genome editing has been revolutionised in the past 5 years by the elucidation of a family of endonucleases, the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein (Cas) systems, that are capable of site-specific DNA cleavage. 97 The simplicity with which the CRISPR/Cas9 system can be adapted to recognise and cleave a specific genomic sequence has been harnessed for gene silencing, to identify transcriptional repressors or enhancers, and even to insert specific point mutations into the genomes of human tissue. To date, CRISPR/Cas9 editing has been used to make mice susceptible to the coronavirus responsible for Middle East respiratory syndrome, overcoming the relative species restriction that Middle East respiratory syndrome coronavirus exhibits and allowing the testing of potential vaccine and antibody treatments. 98 Future uses of this technology in ARDS could be to interrogate the functional significance of polymorphisms or genes identified through genome-wide association studies or transcriptomic screens, to provide or refute evidence for pathogenic causality for ARDS biomarkers, and to potentially allow testing of a precision paradigm whereby a targeted treatment could be tested in different genetic backgrounds.

In this age of rapid technological and analytical advances, the promise of matching patients to therapies most likely to help, and least likely to harm, seems more attainable than at any previous point in history. Numerous challenges remain, however, particularly in identifying new drug targets and selecting patients most likely to benefit.
The approaches reviewed here have already identified plasma biomarkers that could serve as prognostic enrichment factors: markers that identify patients who are more likely to experience ARDS or who are at an increased risk of dying if ARDS is present. 15, 19 Enrichment strategies that select such high-risk patients should improve power for clinical trials by ensuring a population with sufficient outcomes to analyse a potential treatment effect. If causal inference methods or preclinical models of ARDS suggest that a marker has a causal role in the development or progression of ARDS, then the marker might actually identify a biological endotype of ARDS. Endotype-defining markers might be useful for predictive enrichment, whereby a plasma marker could inform about a patient's likely response to a specific therapy. 15, 17 As a direct extension, one could then imagine clinical trials for which eligibility might be predicated on the potential patient expressing a specific plasma marker. This approach has been shown to be highly successful in cancer therapy, especially when using tumour gene expression profiles to select targeted therapy. 99, 100 In asthma, endotype recognition has similarly stimulated new therapies that might be more successful when restricted to patients predicted to benefit. 101 To realise the potential of biomarker-driven clinical trials for ARDS, however, a few things must happen. 102, 103 First, markers must be rapidly available for hospital inpatients to inform potential trial eligibility. By contrast with cancer, a biopsy is unlikely to be available for ARDS, and patients and providers cannot afford to wait long for potential genetic results. Measuring protein biomarkers by ELISA in real time is not a technical barrier, but such assays are generally not commercially available.
Microfluidic cell separation coupled with multiplexed, colour-coded probes might eventually allow rapid gene expression tests at the bedside; 104 however, this technology has not been widely applied to critical illness. 105 Second, biomarker-driven trials would require more knowledge about the performance of potential plasma markers over time during the natural course of ARDS, because the concentration of many markers varies considerably over time. 82, 106 Third, in many cases, biomarker-directed trials would ideally be preceded by testing of potential therapies in preclinical models of ARDS such as EVLP or lung-on-a-chip, in addition to animal models of ARDS, to better predict whether an endotype-defining marker denotes a pathway that is modified by the potential therapy. We acknowledge that, even when a precision paradigm for ARDS is well supported by consistent findings across endotype identification, genomic or proteomic profiling, and even experimental designs such as EVLP, a potential therapy might not improve mortality if it has a deleterious effect on a comorbid condition during critical illness or if it has utility for ARDS prevention but not progression, particularly if ARDS develops rapidly. Well designed studies will be needed to selectively target ARDS prevention and therapy. Clinical trial considerations are addressed more comprehensively in a related review about this issue. 107

In summary, precision medicine is a realistic paradigm for ARDS that could begin to be tested in the near future. As 'omic methods, novel models of lung injury, and new analytical approaches improve the resolution of ARDS endotypes and suggest new mechanisms of treatment, pharmacological breakthroughs might be closer than we realise.

NJM and CSC both contributed to the literature search, figure design, and manuscript drafting.

CSC reports grant funding and consulting fees from GlaxoSmithKline and consulting fees from Boehringer Ingelheim and Bayer.
NJM received grant funding from GlaxoSmithKline and consulting fees from Sobi for serving on an advisory board.

We searched PubMed from inception to Jan 15, 2017, using the search terms "genomic", "polymorphism", "gene expression", "proteomic", "metabolomic", "microbiome", "unsupervised learning", "cluster analysis", and "latent class analysis" combined with "acute respiratory distress syndrome" and "acute lung injury". The search was limited to studies of human beings. Returned lists of articles were then screened manually by reading abstracts to exclude neonatal lung injury and neonatal respiratory distress syndrome. The remaining manuscripts were read in full and their reference lists were reviewed when appropriate. When possible, we have cited comprehensive recent reviews.

Simvastatin in the acute respiratory distress syndrome
for the BALTI-2 study investigators. Effect of intravenous β2 agonist treatment on clinical outcomes in acute respiratory distress syndrome (BALTI-2): a multicentre, randomised controlled trial
for the National Heart, Lung, and Blood Institute Acute Respiratory Distress Syndrome (ARDS) Clinical Trials Network.
Randomized, placebo-controlled clinical trial of an aerosolized β2-agonist for treatment of acute lung injury
Neutrophil elastase inhibition in acute lung injury: results of the STRIVE study
Subphenotypes in acute respiratory distress syndrome: latent class analysis of data from two randomised controlled trials
Endothelial nanomedicine for the treatment of pulmonary disease
Acute respiratory distress syndrome subphenotypes respond differently to randomized fluid management strategy
Personalized medicine for ARDS: the 2035 research agenda
Toward smarter lumping and smarter splitting: rethinking strategies for sepsis and acute respiratory distress syndrome clinical trial design
Effect of mechanical ventilation on inflammatory mediators in patients with acute respiratory distress syndrome: a randomized controlled trial
Lower tidal volume ventilation and plasma cytokine markers of inflammation in patients with acute lung injury
Persistent elevation of inflammatory cytokines predicts a poor outcome in ARDS. Plasma IL-1 beta and IL-6 levels are consistent and efficient predictors of outcome over time
Inflammatory cytokines in the BAL of patients with ARDS. Persistent elevation over time predicts poor outcome
Diffuse alveolar damage: the role of oxygen, shock, and related factors. A review
Differential responses of the endothelial and epithelial barriers of the lung in sheep to Escherichia coli endotoxin
Genetic heterogeneity and risk for ARDS
Beyond single-nucleotide polymorphisms: genetics, genomics, and other 'omic approaches to acute respiratory distress syndrome
A pathophysiologic approach to biomarkers in acute respiratory distress syndrome
Biomarkers of ARDS: what's new?
Polymorphisms in key pulmonary inflammatory pathways and the development of acute respiratory distress syndrome
Identification of novel single nucleotide polymorphisms associated with acute respiratory distress syndrome by Exome-Seq
Association of common genetic variation in the protein C pathway genes with clinical outcomes in acute respiratory distress syndrome
Untargeted LC-MS metabolomics of bronchoalveolar lavage fluid differentiates acute respiratory distress syndrome from health
Metabolic consequences of sepsis-induced acute lung injury revealed by plasma 1H-nuclear magnetic resonance quantitative metabolomics and computational analysis
Distinct molecular phenotypes of direct versus indirect ARDS in single-center and multicenter studies
A pathophysiologic approach to biomarkers in acute respiratory distress syndrome
Type III procollagen is a reliable marker of ARDS-associated lung fibroproliferation
The streetlight effect in type 1 diabetes
Finding the missing heritability of complex diseases
Genome wide association identifies PPFIA1 as a candidate gene for acute lung injury risk following major trauma
Epidemiology of severe sepsis in the United States: analysis of incidence, outcome, and associated costs of care
Bias due to misclassification in the estimation of relative risk
Distinct and replicable genetic risk factors for acute respiratory distress syndrome of pulmonary or extrapulmonary origin
Rare and common variants: twenty arguments
Evolution and functional impact of rare coding variation from deep sequencing of human exomes
Resolution of disease phenotypes resulting from multilocus genomic variation
Landscape of transcription in human cells
RNA-Seq: a revolutionary tool for transcriptomics
Expression quantitative trait locus analysis for translational medicine
Which patients with ARDS benefit from lung biopsy?
Microarray analysis identifies IL-1 receptor type 2 as a novel candidate biomarker in patients with acute respiratory distress syndrome
Diverse macrophage populations mediate acute lung inflammation and resolution
Increased expression of neutrophil-related genes in patients with early sepsis-induced ARDS
Discovery of the gene signature for acute lung injury in patients with sepsis
Applying metabolomics to uncover novel biology in ARDS
Proteomic profiles in acute respiratory distress syndrome differentiates survivors from nonsurvivors
Quantitative proteomic analysis by iTRAQ for identification of candidate biomarkers in plasma from acute respiratory distress syndrome patients
Challenges in translating plasma proteomics from bench to bedside: update from the NHLBI
Exhaled breath metabolomics as a noninvasive diagnostic tool for acute respiratory distress syndrome
Early identification of patients at risk of acute lung injury: evaluation of lung injury prediction score in a multicenter cohort study
Acute lung injury prediction score: derivation and validation in a population-based sample
Enrichment of the lung microbiome with gut bacteria in sepsis and the acute respiratory distress syndrome
The American-European Consensus Conference on ARDS. Definitions, mechanisms, relevant outcomes, and clinical trial coordination
The circulating glycosaminoglycan signature of respiratory failure in critically ill adults
Elevated plasma levels of sRAGE are associated with nonfocal CT-based lung imaging in patients with ARDS: a prospective multicenter study
Trauma-associated lung injury differs clinically and biologically from acute lung injury due to other clinical disorders
Lung morphology predicts response to recruitment maneuver in patients with acute respiratory distress syndrome
Cluster analysis and clinical asthma phenotypes
Identification of asthma phenotypes using cluster analysis in the severe asthma research program
Identification of pediatric septic shock subclasses based on genome-wide expression profiling
Developing a clinically feasible personalized medicine approach to pediatric septic shock
Computational analysis of microarray data
NbClust: an R package for determining the relevant number of clusters in a data set
Multicenter comparison of machine learning methods and conventional regression for predicting clinical deterioration on the wards
A multibiomarker-based outcome risk stratification model for adult septic shock
A simple classification model for hospital mortality in patients with acute lung injury managed with lung protective ventilation
Asthma phenotypes: the evolution from clinical to molecular approaches
Distinguishing asthma phenotypes using machine learning approaches
Heterogeneous phenotypes of the acute respiratory distress syndrome after major trauma
Plasma angiopoietin-2 predicts the onset of acute lung injury in critically ill patients
Plasma angiopoietin-2 in clinical acute lung injury: prognostic and pathogenetic significance
Acute lung injury in patients with traumatic injuries: utility of a panel of biomarkers for diagnosis and pathogenesis
ANGPT2 genetic variant is associated with trauma-associated acute lung injury and altered plasma angiopoietin-2 isoform ratio
Systematic identification of trans-eQTLs as putative drivers of known disease associations
A general approach to causal mediation analysis
Platelet count mediates the contribution of a genetic variant in LRRC16A to ARDS risk
A missense genetic variant in LRRC16A/CAMIL1 improves ARDS survival by attenuating platelet count decline
Low LDL cholesterol in individuals of African descent resulting from frequent nonsense mutations in PCSK9
Emerging LDL therapies: using human genetics to discover new therapeutic targets for plasma lipids
A human disease model of drug toxicity-induced pulmonary edema in a lung-on-a-chip microdevice
Normothermic ex vivo lung perfusion in clinical lung transplantation
Lung donor selection criteria
Clinical grade allogeneic human mesenchymal stem cells restore alveolar fluid clearance in human lungs rejected for transplantation
Therapeutic effects of human mesenchymal stem cells in ex vivo human lungs injured with live bacteria
Plasma angiopoietin-2 concentrations are related to impaired lung function, and organ failure in a clinical cohort receiving high dose interleukin-2 therapy
A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity
A mouse model for MERS coronavirus-induced acute respiratory distress syndrome
Improved survival with MEK inhibition in BRAF-mutated melanoma
Using multiplexed assays of oncogenic drivers in lung cancers to select targeted drugs
Mepolizumab treatment in patients with severe eosinophilic asthma
Biomarker enrichment strategies: matching trial design to biomarker credentials
Biomarker driven population enrichment for adaptive oncology trials with time to event endpoints
Direct multiplexed measurement of gene expression with color-coded probe pairs
Development of a genomic metric that can be rapidly used to predict clinical outcome in severely injured trauma patients
Plasma receptor for advanced glycation end products and clinical outcomes in acute lung injury
Clinical trials in ARDS: challenges and opportunities