key: cord-0312272-1xyvgnhw
authors: Kanai, M.; Elzur, R.; Zhou, W.; Global Biobank Meta-analysis Initiative; Daly, M. J.; Finucane, H. K.
title: Meta-analysis fine-mapping is often miscalibrated at single-variant resolution
date: 2022-03-20
journal: nan
DOI: 10.1101/2022.03.16.22272457
sha: e47950092d0e1f8c1e0a1847f506e417e1e50a05
doc_id: 312272
cord_uid: 1xyvgnhw

Meta-analysis is pervasively used to combine multiple genome-wide association studies (GWAS) into a more powerful whole. To resolve causal variants, meta-analysis studies typically apply summary statistics-based fine-mapping methods as they are applied to single-cohort studies. However, it is unclear whether heterogeneous characteristics of each cohort (e.g., ancestry, sample size, phenotyping, genotyping, or imputation) affect fine-mapping calibration and recall. Here, we first demonstrate that meta-analysis fine-mapping is substantially miscalibrated in simulations when different genotyping arrays or imputation panels are included. To mitigate these issues, we propose a summary statistics-based QC method, SLALOM, that identifies suspicious loci for meta-analysis fine-mapping by detecting outliers in association statistics based on ancestry-matched local LD structure. Having validated SLALOM performance in simulations and the GWAS Catalog, we applied it to 14 disease endpoints from the Global Biobank Meta-analysis Initiative and found that 68% of loci showed suspicious patterns that call into question fine-mapping accuracy. These predicted suspicious loci were significantly depleted for having likely causal variants, such as nonsynonymous variants, as a lead variant (2.8x; Fisher's exact P = 6.2 × 10⁻⁴). Compared to fine-mapping results in individual biobanks, we found limited evidence of fine-mapping improvement in the GBMI meta-analyses. Although a full solution requires complete synchronization across cohorts, our approach identifies likely spurious results in meta-analysis fine-mapping.
We urge extreme caution when interpreting fine-mapping results from meta-analysis.

Meta-analysis is pervasively used to combine multiple genome-wide association studies (GWAS) from different cohorts¹. Previous GWAS meta-analyses have identified thousands of loci associated with complex diseases and traits, such as type 2 diabetes²,³, schizophrenia⁴,⁵, rheumatoid arthritis⁶,⁷, body mass index⁸, and lipid levels⁹. These meta-analyses are typically conducted in large-scale consortia (e.g., the Psychiatric Genomics Consortium [PGC], the Global Lipids Genetics Consortium [GLGC], and the Genetic Investigation of Anthropometric Traits [GIANT] consortium) to increase sample size while harmonizing analysis plans across participating cohorts in every possible aspect (e.g., phenotype definition, quality-control [QC] criteria, statistical model, and analytical software). They share summary statistics rather than individual-level data, thereby avoiding data protection issues and the variable legal frameworks governing individual genome and medical data around the world. The Global Biobank Meta-analysis Initiative (GBMI)¹⁰ is one such large-scale, international effort, which aims to establish a collaborative network spanning 19 biobanks from four continents (total n = 2.1 million) for coordinated GWAS meta-analyses, while addressing the many benefits and challenges in meta-analysis and subsequent downstream analyses.

One such challenging downstream analysis is statistical fine-mapping¹¹⁻¹³. Despite the great success of past GWAS meta-analyses in locus discovery, individual causal variants in associated loci are largely unresolved.
Identifying causal variants from GWAS associations (i.e., fine-mapping) is challenging due to extensive linkage disequilibrium (LD, the correlation among genetic variants), the presence of multiple causal variants, and limited sample sizes, but is rapidly becoming achievable with high confidence in individual cohorts¹⁴⁻¹⁷ owing to the recent development of large-scale biobanks¹⁸⁻²⁰ and scalable fine-mapping methods²¹⁻²³ that enable well-powered, accurate fine-mapping using in-sample LD from large-scale individual-level data.

After conducting GWAS meta-analysis, previous studies²,⁷,⁹,²⁴⁻³⁰ have applied existing summary statistics-based fine-mapping methods (e.g., approximate Bayes factor [ABF]³¹,³², CAVIAR³³, PAINTOR³⁴,³⁵, FINEMAP²¹,²², and SuSiE²³) just as they are applied to single-cohort studies, without considering or accounting for the unavoidable heterogeneity among cohorts (e.g., differences in sample size, phenotyping, genotyping, or imputation). Such heterogeneity could lead to false positives and miscalibration in meta-analysis fine-mapping (Fig. 1). For example, case-control studies enriched with more severe cases or ascertained with different phenotyping criteria may disproportionately contribute to genetic discovery, even when true causal effects for genetic liability are exactly the same between these studies and less severe or unascertained ones. Quantitative traits like biomarkers could have phenotypic heterogeneity arising from different measurement protocols and errors across studies. There might be genuine biological mechanisms too, such as gene-gene (GxG) and gene-environment (GxE) interactions and (population-specific) dominance variation (e.g., rs671 and alcohol dependence³⁶), that introduce additional heterogeneity across studies³⁷,³⁸.
In addition to phenotyping, differences in genotyping and imputation could dramatically undermine fine-mapping calibration and recall at single-variant resolution, because differential patterns of missingness and imputation quality across constituent cohorts of different sample sizes can disproportionately diminish association statistics of potentially causal variants. Finally, although more easily harmonized than phenotyping and genotyping data, subtle differences in QC criteria and analytical software may further exacerbate the effect of heterogeneity on fine-mapping.

Large-scale simulations demonstrate miscalibration in meta-analysis fine-mapping

Existing fine-mapping methods²¹,²³,³¹ assume that all association statistics are derived from a single-cohort study, and thus do not model the per-variant heterogeneity in effect sizes and sample sizes that arises when meta-analyzing multiple cohorts (Figure 1). To evaluate how different characteristics of constituent cohorts in a meta-analysis affect fine-mapping calibration and recall, we conducted a series of large-scale GWAS meta-analysis and fine-mapping simulations (Supplementary Table 1 [...] array, and imputation panel, we conducted 300 GWAS with randomly simulated causal variants that resemble the genetic architecture of a typical complex trait, including minor allele frequency (MAF) dependent causal effect sizes⁵², total SNP heritability⁵³, functional consequences of causal variants¹⁷, and levels of genetic correlation across cohorts (i.e., true effect size heterogeneity; r_g = 1, 0.9, and 0.5; see Methods). We then meta-analyzed the single-cohort GWAS results across 10 independent cohorts based on multiple configurations (different combinations of genotyping arrays and imputation panels for each cohort) to resemble realistic meta-analysis of multiple heterogeneous cohorts (Supplementary Table 4).
We applied ABF fine-mapping to compute a posterior inclusion probability (PIP) for each variant and to derive 95% and 99% credible sets (CS), i.e., the smallest sets of variants covering 95% and 99% of the probability of causality. We evaluated the false discovery rate (FDR, defined as the proportion of variants with PIP > 0.9 that are non-causal) and compared it against the proportion of non-causal variants expected, based on PIP, if the meta-analysis fine-mapping method were calibrated. More details of our simulation pipeline are described in Methods and visually summarized in Supplementary Fig. 2.

We found that FDR varied widely over the different configurations, reaching as high as 37% for the most heterogeneous configurations (Fig. 2). We characterized the contributing factors to the miscalibration. First, lower true effect size correlation r_g (i.e., larger phenotypic heterogeneity) always caused higher miscalibration and lower recall. Second, when using the same imputation panel (1000GP3), use of less dense arrays (MEGA or GSA) led to moderately inflated FDR (up to FDR = 11% vs. expected 1%), while use of multiple genotyping arrays did not cause further FDR inflation (Fig. 2a). Third, when using the same genotyping array (Omni2.5), use of imputation panels (HRC or TOPMed) that do not match our simulation reference significantly increased miscalibration (up to FDR = 17% vs. expected 1%), and using multiple imputation panels increased it further (up to FDR = 35% vs. expected 2%, Fig. 2c); this setup is as bad as the most heterogeneous configuration using multiple genotyping arrays and imputation panels (FDR = 37%). When TOPMed-imputed variants were lifted over from GRCh38 to GRCh37, we observed FDR increases of up to 10%, likely due to genomic build conversion failures (Supplementary Note)⁵⁴.
Fourth, recall was not significantly affected by heterogeneous genotyping arrays or imputation panels (Fig. 2b,d). Fifth, including multiple genetic ancestries did not affect calibration when using the same genotyping array and imputation panel (Omni2.5 and 1000GP3; Fig. 2e) but significantly improved recall if African ancestry was included (Fig. 2f). This is expected, given the shorter LD length in the African population compared to other populations, which improves fine-mapping resolution⁵⁵. Finally, in the most heterogeneous configurations where multiple genotyping arrays and imputation panels existed, we observed an FDR of up to 37% and 28% for European and multi-ancestry meta-analyses, respectively (vs. expected 2% for both), demonstrating that inter-cohort heterogeneity can substantially undermine meta-analysis fine-mapping (Fig. 2g,h).

To further characterize observed miscalibration in meta-analysis fine-mapping, we investigated the availability of GWAS variants in each combination of ancestry, genotyping array, and imputation panel. [...] were available in every combination (Supplementary Fig. 3a). This fraction increased from 68% to 73%, 74%, and 76% as we increased gnomAD MAF thresholds to > 0.005, 0.01, and 0.05, respectively, but never reached 100% (Supplementary Fig. 4). Notably, we observed a substantial number of variants that are unique to a certain genotyping array and an imputation panel, even when we restricted to 344,497 common variants (gnomAD MAF > 0.05) in every ancestry (Supplementary Fig. 3b [...] Fig. 4). The remaining 2,711,356 QC-passing variants in our simulations (gnomAD MAF ≤ 0.001 in at least one ancestry) further exacerbate variable coverage of the available variants (Supplementary Fig. 3c).
Of these, the largest proportion of variants (39%) were only available in African ancestry, followed by variants available in African and European (but not East Asian) ancestries (7%), European-specific variants (6%), and East Asian-specific variants (5%). Furthermore, similar to the aforementioned common variants, we found a substantial number of variants that are unique to a certain combination. Altogether, we observed that only 393,471 variants (12%) out of all the 3,285,617 QC-passing variants were available in every combination (Supplementary Fig. 3d).

CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted March 20, 2022.

To address the challenges in meta-analysis fine-mapping discussed above, we developed SLALOM (suspicious loci analysis of meta-analysis summary statistics), a method that flags suspicious loci for meta-analysis fine-mapping by detecting outliers in association statistics based on deviations from expectation, estimated with local LD structure (Methods). SLALOM consists of three steps: 1) defining loci and lead variants based on a 1 Mb window, 2) detecting outlier variants in each locus using meta-analysis summary statistics and an external LD reference panel, and 3) identifying suspicious loci for meta-analysis fine-mapping (Fig. 3a,b).

To detect outlier variants, we first assume a single causal variant per associated locus. Then the marginal z-score $z_i$ for a variant $i$ should be approximately equal to $r_{i,c} \cdot z_c$, where $z_c$ is the z-score of the causal variant $c$ and $r_{i,c}$ is the correlation between variants $i$ and $c$.
For each variant in meta-analysis summary statistics, we first test this relationship using a simplified version of the DENTIST statistic⁴⁶, DENTIST-S, based on the assumption of a single causal variant. The DENTIST-S statistic for a given variant $i$ is written as

$$T_{\text{DENTIST-S}} = \frac{(z_i - r_{i,c}\, z_c)^2}{1 - r_{i,c}^2},$$

with the z-scores and correlation as defined above, which approximately follows a $\chi^2$ distribution with 1 degree of freedom⁴⁶. Since the true causal variant and LD structure are unknown in real data, we approximate the causal variant as the lead PIP variant in the locus (the variant with the highest PIP) and use a large-scale external LD reference from gnomAD⁵⁷, either an ancestry-matched LD for a single-ancestry meta-analysis or a sample-size-weighted LD by ancestries for a multi-ancestry meta-analysis (Methods).

SLALOM then evaluates whether each locus is "suspicious", that is, has a pattern of meta-analysis statistics and LD that appears inconsistent and therefore calls into question the fine-mapping accuracy. By training on loci with maximum PIP > 0.9 in the simulations, we determined that the best-performing criterion for classifying loci as true or false positives is whether a locus has a variant with r² > 0.6 to the lead and DENTIST-S P-value < 1.0 × 10⁻⁴ (Methods). Using this criterion we achieved an area under the receiver operating characteristic curve (AUROC) of 0.74, 0.76, and 0.80 for identifying whether a true causal variant is a lead PIP variant, in the 95% credible set (CS), and in the 99% CS, respectively (Fig. 3c). We further validated the performance of SLALOM using all the loci in the simulations and observed significantly higher miscalibration in predicted suspicious loci than in non-suspicious loci (up to 16% difference in FDR at PIP > 0.9; Fig. 3d). Given the relatively lower miscalibration and specificity at low PIP thresholds (Fig. 3d, [...]
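The outlier test and the locus-level rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SLALOM implementation: the function names `dentist_s` and `is_suspicious` are hypothetical, the LD correlations `r` to the lead variant are assumed to be supplied by the caller (in the paper they come from a gnomAD reference), and the lead variant is passed in by index rather than being derived from PIPs.

```python
import numpy as np
from math import erfc, sqrt

def dentist_s(z, r, lead):
    """DENTIST-S statistic and P-value for every variant in a locus.

    z    : marginal z-scores of all variants in the locus
    r    : LD correlation of each variant with the lead variant
    lead : index of the lead variant (assumed to stand in for the causal one)
    """
    z = np.asarray(z, dtype=float)
    r = np.asarray(r, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        t = np.square(z - r * z[lead]) / (1.0 - np.square(r))
    t[lead] = np.nan  # undefined for the lead itself (r = 1)
    # Survival function of a chi-square with 1 df: P(X > t) = erfc(sqrt(t / 2)).
    p = np.array([erfc(sqrt(x / 2.0)) if np.isfinite(x) else np.nan for x in t])
    return t, p

def is_suspicious(z, r, lead, r2_thresh=0.6, p_thresh=1e-4, kappa=0):
    """Flag a locus if more than kappa variants are DENTIST-S outliers
    (r^2 to the lead above r2_thresh and P below p_thresh)."""
    _, p = dentist_s(z, r, lead)
    p = np.where(np.isnan(p), 1.0, p)  # undefined tests never count as outliers
    outlier = (np.square(np.asarray(r, dtype=float)) > r2_thresh) & (p < p_thresh)
    outlier[lead] = False
    return int(outlier.sum()) > kappa

# Toy locus: variant 1 is in strong LD with the lead (r = 0.9).
consistent = is_suspicious(z=[10.0, 9.1, 2.0], r=[1.0, 0.9, 0.2], lead=0)
deflated = is_suspicious(z=[10.0, 2.0, 2.0], r=[1.0, 0.9, 0.2], lead=0)
```

In the toy locus, the second call mimics a variant whose association is much weaker than its LD with the lead would predict, which is the pattern the rule is designed to flag. The default thresholds mirror the values reported in the text (r² > 0.6, P < 1.0 × 10⁻⁴, κ = 0).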
[Fig. 3, panel text: 3. Identify suspicious loci for fine-mapping that have outlier variants in LD with a lead variant (P_DENTIST-S < 10⁻⁴ and r² > 0.6). Output: list of predicted suspicious loci for fine-mapping. Labels: additional independent signal; SLALOM.]

[...] in suspicious loci (Fisher's exact P = 3.6 × 10⁻⁷⁹; Fig. 4a). We also tested whether nonsynonymous variants belonged to 95% and 99% CS and again observed significant depletion (1.4x and 1.3x, respectively; Fisher's exact P < 4.6 × 10⁻¹⁰⁰). In addition, when we used a more stringent r² threshold (> 0.8) for selecting loci that contain nonsynonymous variants, we also confirmed significant enrichment (Fisher's exact P < 6.1 × 10⁻⁶⁵; Supplementary Figure 6). To quantify potential fine-mapping miscalibration in the GWAS Catalog, we investigated the difference between mean PIP for lead variants and fraction of lead variants that are nonsynonymous; assuming that nonsynonymous variants in these loci are truly causal, this difference equals the difference between the true and reported fraction of lead PIP variants that are causal. We observed differences between 26-51% and 10-18% under different PIP thresholds in suspicious and non-suspicious loci, respectively (Fig. 4b), reaching 45% and 15% for high-PIP (> 0.9) variants.

We further assessed SLALOM performance in the GWAS Catalog meta-analyses by leveraging high-PIP (> 0.9) complex trait and cis-eQTL variants that were rigorously fine-mapped¹⁶ [...] variants (Fisher's exact P = 2.6 × 10⁻²⁴; Fig. 4d). We observed the same significant depletions of the high-PIP complex trait and cis-eQTL variants in suspicious loci that belonged to 95% and 99% CS (Fig. 4c,d).
[...] and local LD structure) as well as inter-cohort heterogeneity (Fig. 5b-o).

The fraction of suspicious loci in the GBMI meta-analyses (68%) is higher than in the GWAS Catalog (28%). There might be multiple reasons for this discrepancy, including association significance, sample size, ancestral diversity, and study-specific QC criteria. For example, the GBMI summary statistics were generated from multi-ancestry, large-scale meta-analyses with a median sample size of 1.4 million individuals across six ancestries, while 63% of the 467 summary statistics from the GWAS Catalog were from European-ancestry-only studies and 83% had fewer than 0.5 million discovery samples. Nonetheless, predicted suspicious loci for fine-mapping were prevalent in both the GWAS Catalog and the GBMI.

Using nonsynonymous (pLoF and missense) and high-PIP (> 0.9) complex trait and cis-eQTL variants, we recapitulated a significant depletion of these likely causal variants in predicted suspicious loci (2.8x, 5.4x, and 5.2x for nonsynonymous, high-PIP complex trait, and high-PIP cis-eQTL variants being a lead PIP variant, respectively; Fisher's exact P < 6.2 × 10⁻⁴), confirming our observation in the GWAS Catalog analysis (Fig. 6a-c). [...] (P = 1.7 × 10⁻¹¹) which is in LD (r² = 0.92) with a missense variant rs396991 (p.Phe176Val) of FCGR3A (Fig. 6d).
This locus was not previously reported for COPD, but is known for associations with autoimmune diseases (e.g., inflammatory bowel disease⁴⁴, rheumatoid arthritis⁷, and systemic lupus erythematosus⁶⁷) and encodes the low-affinity human Fc-gamma receptors that bind to the Fc region of IgG and activate immune responses⁶⁸. Notably, this locus contains copy number variations that contribute to the disease associations in addition to single-nucleotide variants, which makes genotyping challenging⁶⁸,⁶⁹. Despite strong LD with the lead variant, rs396991 did not achieve genome-wide significance (P = 9.1 × 10⁻³), showing a significant deviation from the expected association (P_DENTIST-S = 5.3 × 10⁻⁴¹; Fig. 6e). This is primarily due to missingness of rs396991 in 8 biobanks out of 17 (N_eff = 76,790 and 36,781 for rs2099684 and rs396991, respectively; Fig. 6f), which is caused by its absence from major imputation reference panels (e.g., 1000GP⁴⁹, HRC⁵⁰, and UK10K⁷⁰) despite having a high MAF in every population (MAF = 0.24-0.34 in African, admixed American, East Asian, European, and South Asian populations of gnomAD⁵⁷).

Sample size imbalance across variants was pervasive in the GBMI meta-analyses⁷¹, and was especially enriched in predicted suspicious loci: 84% of suspicious loci vs. 24% of non-suspicious loci showed a maximum/minimum effective sample size ratio > 2 among variants in LD (r² > 0.6) with lead variants (median ratio = 4.2 and 1.2 in suspicious and non-suspicious loci, respectively; Supplementary Fig. 7).
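The sample size imbalance measure used above is simple to compute from summary statistics. The sketch below is illustrative only: `neff_ratio` is a hypothetical helper, and it assumes that per-variant effective sample sizes and r² values to the lead variant are already available. The two-variant example reuses the N_eff values and r² reported for rs2099684 and rs396991.

```python
import numpy as np

def neff_ratio(n_eff, r2_to_lead, r2_thresh=0.6):
    """Maximum/minimum effective sample size among variants in LD with the lead.

    n_eff      : per-variant effective sample sizes (lead variant included)
    r2_to_lead : r^2 of each variant with the lead variant (1.0 for the lead)
    """
    n_eff = np.asarray(n_eff, dtype=float)
    in_ld = np.asarray(r2_to_lead, dtype=float) > r2_thresh
    selected = n_eff[in_ld]
    return selected.max() / selected.min()

# rs2099684 (lead) and rs396991 (r^2 = 0.92 with the lead), values from the text.
ratio = neff_ratio(n_eff=[76790, 36781], r2_to_lead=[1.0, 0.92])
imbalanced = ratio > 2  # the threshold used in the comparison above
```

A ratio above 2 for this locus is consistent with rs396991 being missing from roughly half of the contributing biobanks.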
[...] observed similar trends regardless of whether variants were in suspicious or non-suspicious loci (Fig. 7b,c). To examine patterns of increased and decreased PIP for individual variants, we also calculated PIP difference between the GBMI and individual biobanks, defined as ΔPIP = PIP (GBMI) − maximum PIP across BBJ, FinnGen, and UKBB (Supplementary Fig. 8,9). We investigated functional enrichment based on ΔPIP bins and observed inconsistent enrichment results using different ΔPIP thresholds (Supplementary Fig. 10). Finally, to test whether fine-mapping resolution was improved in the GBMI over individual biobanks, we compared the size of 95% CS after restricting them to cases where a GBMI CS overlapped with an individual biobank CS from BBJ, FinnGen, or UKBB (Methods). We observed median 95% CS sizes of 2.5 and 2.5 in non-suspicious loci for the GBMI and individual biobanks, respectively, and 5 and 15 in suspicious loci, respectively (Supplementary Fig. 11). The smaller credible set size in suspicious loci in GBMI could be due to improved resolution or to increased miscalibration. These results provide limited evidence of overall fine-mapping improvement in the GBMI meta-analyses over what is achievable by taking the best result from individual biobanks.

Individual examples, however, provide insights into the types of fine-mapping differences that can occur. To characterize the observed differences in fine-mapping confidence and resolution, we further examined non-suspicious loci with ΔPIP > 0.5 in asthma.
In some cases, the increased power and/or ancestral diversity of GBMI led to improved fine-mapping: for example, an intergenic variant rs1888909 (~18 kb upstream of IL33) showed ΔPIP = 0.99 (PIP = 1.0 and 0.008 in GBMI and FinnGen, respectively; Fig. 7d), which was primarily owing to increased association significance in a meta-analysis (P = 3.0 × 10⁻⁸⁶, 7.4 × 10⁻², 3.6 × 10⁻¹⁶, and 1.9 × 10⁻⁵³ in GBMI, BBJ, FinnGen, and UKBB Europeans, respectively) as well as a shorter LD length in the African population than in the European population (LD length = 4 kb vs. 41 kb for variants with r² > 0.6 with rs1888909 in the African and European populations, respectively; N_eff = 4,270 for Africans in the GBMI asthma meta-analysis; Supplementary Fig. 12). This variant was also fine-mapped for eosinophil count in UKBB Europeans (PIP = 1.0; P = 1.3 × 10⁻³¹⁴)¹⁶ and was previously reported to regulate IL33 gene expression in human airway epithelial cells via allele-specific transcription factor binding of OCT-1 (POU2F1)⁷⁷. Likewise, we observed that a missense variant rs16903574 (p.Phe319Leu) in OTULINL showed ΔPIP = 0.79 (PIP = 1.0 and 0.21 in GBMI and UKBB [...]

[...] fine-mapping with an external LD reference is extremely error-prone as previously reported¹⁴⁻¹⁶.
[Fig. 7 caption, partial:] [...] GBMI and UKBB Europeans. A nearby missense variant, rs20541, showed lower PIP than rs1295686 despite having strong LD (r² = 0.96). g. rs12123821 for asthma in the GBMI and UKBB Europeans. Nearby stop-gained rs61816761 was independent of rs12123821 (r² = 0.0) and not fine-mapped in the GBMI due to a single causal variant assumption in the ABF fine-mapping.

In this study, we first demonstrated in simulations that meta-analysis fine-mapping is substantially miscalibrated when constituent cohorts are heterogeneous in phenotyping and imputation. To mitigate this issue, we developed SLALOM, a summary statistics-based QC method for identifying suspicious loci in meta-analysis fine-mapping. Applying SLALOM to 14 disease endpoints from the GBMI meta-analyses¹⁰ as well as 467 summary statistics from the GWAS Catalog⁴⁸, we observed widespread suspicious loci in meta-analysis summary statistics, suggesting that meta-analysis fine-mapping is often miscalibrated in real data too. Indeed, we demonstrated that the predicted suspicious loci were significantly depleted for having likely causal variants as a lead PIP variant, such as nonsynonymous variants and high-PIP (> 0.9) GWAS and cis-eQTL fine-mapped variants from our previous fine-mapping studies¹⁶,¹⁷. Our method provides better calibration in non-suspicious loci for meta-analysis fine-mapping, generating a more reliable set of variants for further functional characterization.

We have found limited evidence of improved fine-mapping in the GBMI meta-analyses over individual biobanks.
A few empirical examples in this study as well as other previous studies⁷,⁹,²⁶,²⁷,³⁰ suggested that multi-ancestry, large-scale meta-analysis could have potential to improve fine-mapping confidence and resolution owing to increased statistical power in associations and differential LD patterns across ancestries. However, we have highlighted that the observed improvement in PIP could also be due to sample size imbalance in a locus, miscalibration, and technical confounding, which further emphasizes the importance of careful investigation of fine-mapped variants identified through meta-analysis fine-mapping.

As high-confidence fine-mapping results in large-scale biobanks and molecular QTLs continue to become available¹⁶,¹⁷,⁶⁰, we propose alternative approaches for prioritizing candidate causal variants in a meta-analysis. First, these high-confidence fine-mapped variants have been a valuable resource to conduct a "PheWAS"¹⁶ to match with associated variants in a meta-analysis, which provides a narrower list of candidate variants assuming they would equally be functional and causal in related complex traits or tissues/cell-types. Second, a traditional approach based on tagging variants (e.g., r² > 0.6 with lead variants, or the PICS⁷⁹ fine-mapping approach that only relies on a lead variant and LD) can still be highly effective, especially for known functional variants such as nonsynonymous coding variants. As we highlighted in this and previous³⁹ studies, potentially causal variants in strong LD with lead variants might not achieve genome-wide significance because of missingness and heterogeneity.

While using an external LD reference for fine-mapping has been shown to be extremely error-prone¹⁴⁻¹⁶, we find here that it can be useful for flagging suspicious loci, even when it does not perfectly represent the in-sample LD structure of the meta-analyzed individuals.
However, our use of an external LD reference comes with several limitations. For example, due to the finite sample size of the external LD reference, rare or low-frequency variants have larger uncertainties around r² than common variants. Moreover, our r² values in a multi-ancestry meta-analysis are currently approximated based on a sample-size-weighted average of r² across ancestries as previously suggested⁸⁰, but this can be different from actual r². These uncertainties around r² affect SLALOM prediction performance and should be modeled appropriately for further method development. On the other hand, we find it challenging to use an LD reference when true causal variants are located within a complex region (e.g., major histocompatibility complex [MHC]), or are entirely missing from standard LD or imputation reference panels, especially for structural variants. These limitations are not specific to meta-analysis fine-mapping, and separate fine-mapping methods based on bespoke imputation references have been developed (e.g., HLA⁸¹, KIR⁸², and variable numbers of tandem repeats [VNTR]⁸³).

In addition, there are several methodological limitations of SLALOM. First, our simulations only include one causal variant per locus. Although additional independent causal variants would not affect SLALOM precision (but decrease recall), multiple correlated causal variants in a locus would violate SLALOM assumptions and could lead to some DENTIST-S outliers that are not due to heterogeneity or missingness but rather simply a product of tagging multiple causal variants in LD.
In fact, our previous studies have illustrated infrequent but non-zero presence of such correlated causal variants in complex traits¹⁶,¹⁷. Second, SLALOM prediction is not perfect. Although fine-mapping calibration is certainly better in non-suspicious vs. suspicious loci, SLALOM has low precision, and we still observe some miscalibration in non-suspicious loci. Finally, SLALOM is a per-locus QC method and does not calibrate per-variant PIPs. Further methodological development that properly models heterogeneity, missingness, multiple causal variants, and LD uncertainty across multiple cohorts and ancestries is needed to refine per-variant calibration and recall in meta-analysis fine-mapping.

We have found evidence in our simulations and real data of severe miscalibration of fine-mapping results from GWAS meta-analysis; for example, we estimate that the difference between true and reported proportion of causal variants is 20% and 45% for high-PIP (> 0.9) variants in suspicious loci from the simulations and the GWAS Catalog, respectively. Our SLALOM method helps to exclude spurious results from meta-analysis fine-mapping; however, even fine-mapping results in SLALOM-predicted "non-suspicious" loci remain somewhat miscalibrated, showing estimated differences between true and reported proportion of causal variants of 4% and 15% for high-PIP variants in the simulations and the GWAS Catalog, respectively. We thus urge extreme caution when interpreting PIPs computed from meta-analyses until improved methods are available. We recommend that researchers looking to identify likely causal variants employ complete synchronization of study design, case/control ascertainment, genomic profiling, and analytical pipeline, or rely more heavily on functional annotations, biobank fine-mapping, or molecular QTLs.
To benchmark fine-mapping performance in meta-analysis, we simulated a large-scale, realistic GWAS meta-analysis and performed fine-mapping under different scenarios. An overview of our simulation pipeline is shown in Supplementary Fig. 2.

Using HAPGEN2⁸⁵ with the 1000 Genomes Project Phase 3 reference⁴⁹, we simulated "true" genotypes of chromosome 3 for multiple independent cohorts from African, East Asian, and European ancestries. For each independent cohort from a given ancestry, we simulated 10,000 individuals using the default parameters, with an ancestry-specific effective population size [...] GRCh38.

To mimic phenotypic heterogeneity across cohorts in real-world meta-analysis (due to, e.g., different ascertainment, measurement error, or true effect size heterogeneity), we introduced a cross-cohort genetic correlation of true effect sizes, r_g, which is set to be one of 1, 0.9, or 0.5. For a true causal variant, true causal effect sizes across cohorts were randomly drawn from [...]

To simulate meta-analyses that resemble real-world settings, we generated multiple configurations of the above GWAS results to meta-analyze across 10 independent cohorts.
Briefly, we chose configurations based on the following settings: 1) 10 EUR cohorts are genotyped and imputed using the same genotyping array (one of GSA, MEGA, or Omni2.5) and the same imputation panel (one of 1000GP3, HRC, TOPMed, or TOPMed-liftover); 2) 10 cohorts consisting of multiple ancestries (9 EUR + 1 AFR/EAS cohorts or 8 EUR + 1 AFR + 1 EAS cohorts), with all cohorts genotyped and imputed using the same array (Omni2.5) and the same panel (1000GP3); 3) 10 EUR or multi-ancestry cohorts are genotyped using the same array (Omni2.5) but imputed using different panels across cohorts; 4) 10 EUR or multi-ancestry cohorts are imputed using the same panel (1000GP3) but genotyped using different arrays across cohorts; 5) 10 EUR or multi-ancestry cohorts are genotyped and imputed using different arrays and panels across cohorts. For settings 3-5, we randomly drew a combination of a genotyping array and an imputation panel for each cohort, five times each for the 10 EUR and multi-ancestry cohorts. In total, we generated 45 configurations as summarized in Supplementary Table 4.

For each configuration, we conducted a fixed-effect meta-analysis based on inverse-variance weighted betas and standard errors using a modified version of PLINK 1.9 (https://github.com/mkanai/plink-ng/tree/add_se_meta).

For each meta-analysis, we defined fine-mapping regions based on a 1 Mb window around each genome-wide significant lead variant and applied ABF 31,32 using a prior effect size variance of σ₀² = 0.04, a value taken from Wakefield et al. 31 that is commonly used in meta-analysis fine-mapping studies 2,7. We computed the posterior inclusion probability (PIP) and 95% credible set (CS) for each locus and evaluated whether true causal variants were correctly fine-mapped.
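The meta-analysis and ABF steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the pipeline actually used (which relied on a modified PLINK 1.9): the function names and toy summary statistics are hypothetical, PIPs are computed assuming exactly one causal variant per region with a flat prior across variants, and the Bayes factor follows the standard Wakefield approximation with prior variance 0.04.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis for one variant."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return beta, se


def log_abf(beta, se, prior_var=0.04):
    """Log approximate Bayes factor (alternative vs. null), Wakefield form."""
    v = se ** 2
    z2 = (beta / se) ** 2
    shrink = prior_var / (v + prior_var)
    return 0.5 * math.log(1.0 - shrink) + 0.5 * z2 * shrink


def pips_from_log_abf(log_abfs):
    """PIPs under a single-causal-variant assumption with a flat prior."""
    m = max(log_abfs)  # subtract the maximum for numerical stability
    w = [math.exp(x - m) for x in log_abfs]
    total = sum(w)
    return [x / total for x in w]


def credible_set(pips, coverage=0.95):
    """Smallest set of variant indices whose cumulative PIP reaches the coverage."""
    order = sorted(range(len(pips)), key=lambda i: pips[i], reverse=True)
    cs, cum = [], 0.0
    for i in order:
        cs.append(i)
        cum += pips[i]
        if cum >= coverage:
            break
    return cs


# Toy per-cohort (betas, standard errors) for three variants in two cohorts (made up).
cohort_stats = [
    ([0.10, 0.12], [0.02, 0.02]),
    ([0.05, 0.04], [0.02, 0.02]),
    ([0.00, 0.01], [0.02, 0.02]),
]
meta = [ivw_meta(b, s) for b, s in cohort_stats]
pips = pips_from_log_abf([log_abf(b, s) for b, s in meta])
cs95 = credible_set(pips)
print(meta[0], pips, cs95)
```

Because the PIPs are normalized within a region, any cohort-specific artifact that inflates or deflates one variant's meta-analysis z-score relative to its LD neighbors directly shifts probability mass between variants, which is the failure mode examined in the simulations.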
The SLALOM method

SLALOM takes GWAS summary statistics and an external LD reference as input and predicts whether a locus is suspicious for fine-mapping. SLALOM consists of the following three steps: [...]

We determined DENTIST-S outlier variants using two thresholds: 1) r² > ρ to the lead and 2) P_DENTIST-S < τ. The thresholds ρ and τ were set to ρ = 0.6 and τ = 1.0 × 10⁻⁴ based on the training in simulations as described below.

We predicted whether a locus is suspicious or non-suspicious for fine-mapping based on whether the number of DENTIST-S outlier variants in the locus is > κ. To determine the best-performing thresholds (ρ, τ, and κ), we used loci with maximum PIP > 0.9 in the simulations for training. Positive conditions were defined as whether a true causal variant in a locus is 1) a lead PIP variant, 2) in 95% CS, and 3) in 99% CS. We computed AUROC across different thresholds (ρ = 0, 0.1, 0.2, …, 0.9; −log10 τ = 0, 0.5, 1, …, 10; and κ = 0, 1, 2, …) and chose ρ = 0.6, τ = 1.0 × 10⁻⁴, and κ = 0, which showed the highest AUROC for all the aforementioned positive conditions. Using all the loci in the simulations, we then evaluated fine-mapping miscalibration (defined as mean PIP − fraction of true causal variants) at different PIP thresholds in suspicious and non-suspicious loci and decided to apply SLALOM only to loci with maximum PIP > 0.1, owing to relatively lower miscalibration and specificity of SLALOM at lower PIP thresholds.

We retrieved full GWAS summary statistics publicly available on the GWAS Catalog 48.
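The suspicious-locus rule defined by the thresholds ρ, τ, and κ in the SLALOM method can be sketched as follows. The definition of the DENTIST-S statistic does not survive in this section, so the sketch assumes the simplified form T = (z − r · z_lead)² / (1 − r²), compared against a chi-square distribution with one degree of freedom; that formula, the function names, and the toy z-scores are assumptions made for illustration, and `r_to_lead` stands in for the ancestry-matched reference LD to the lead variant.

```python
import math


def dentist_s_pvalue(z, z_lead, r):
    """P-value of the assumed DENTIST-S statistic (chi-square, 1 df)."""
    stat = (z - r * z_lead) ** 2 / (1.0 - r ** 2)
    # Survival function of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def is_suspicious(z_scores, r_to_lead, lead_idx, rho=0.6, tau=1e-4, kappa=0):
    """Return (suspicious flag, number of outlier variants) for one locus.

    A variant counts as an outlier when r^2 to the lead exceeds rho and its
    P-value falls below tau; the locus is suspicious when the count exceeds kappa.
    """
    n_outliers = 0
    for i, (z, r) in enumerate(zip(z_scores, r_to_lead)):
        # Skip the lead itself, variants in weak LD, and perfect LD (division by zero).
        if i == lead_idx or r ** 2 <= rho or r ** 2 >= 1.0:
            continue
        if dentist_s_pvalue(z, z_scores[lead_idx], r) < tau:
            n_outliers += 1
    return n_outliers > kappa, n_outliers


# Toy loci (made up): index 0 is the lead; index 1 is in high LD (r = 0.9) with it.
consistent = is_suspicious([10.0, 9.0, 1.0], [1.0, 0.9, 0.5], lead_idx=0)
inconsistent = is_suspicious([10.0, 4.0, 1.0], [1.0, 0.9, 0.5], lead_idx=0)
print(consistent, inconsistent)
```

In the second toy locus, a variant in high LD with the lead has a much weaker association than the LD would predict, so it is counted as an outlier and the locus is flagged.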
Out of 33,052 studies from 5,553 publications registered at the GWAS Catalog (as of January 12, 2022), we selected 467 studies from 96 publications that have 1) full harmonized summary statistics preprocessed by the GWAS Catalog with non-missing variant ID, marginal beta, and standard error columns, 2) a discovery sample size of more than 10,000 individuals, 3) African (including African American, Afro-Caribbean, and Sub-Saharan African), admixed American (Hispanic and Latin American), East Asian, or European samples based on their broad ancestral category metadata, 4) at least one genome-wide significant association (P < 5.0 × 10⁻⁸), and 5) our manual annotation as a meta-analysis rather than a single-cohort study (Supplementary [...] and European) to calculate the weighted average of r². All the variants were harmonized into the human genome assembly GRCh38 by the GWAS Catalog.

We used meta-analysis summary statistics of 14 disease endpoints from the GBMI (Supplementary Table 7). These meta-analyses were conducted using up to 1.8 million individuals across 19 biobanks, representing six different genetic ancestry groups (approximately 33,000 African, 18,000 Admixed American, 31,000 Central and South Asian, 341,000 East Asian, 1.4 million European, and 1,600 Middle Eastern individuals). Detailed procedures of the GBMI meta-analyses were described in the GBMI flagship manuscript 10.

Across the 14 summary statistics, we defined 503 genome-wide significant loci (P < 5.0 × 10⁻⁸) based on a 1 Mb window around each lead variant and merged them if they overlapped. We applied SLALOM to 422 loci with maximum PIP > 0.1 based on the ABF fine-mapping and predicted whether they were suspicious or non-suspicious for fine-mapping.
We used per-variant sample sizes of each ancestry (African, Admixed American, East Asian, Finnish, and non-Finnish European) to calculate the weighted average of r². Since gnomAD LD matrices were not available for Central and South Asian and Middle Eastern ancestries, we did not use their sample sizes for the calculation. All the variants were processed on the human genome assembly GRCh38.

We retrieved our previous fine-mapping results for 1) complex traits in large-scale biobanks (BBJ 58, FinnGen 20, and UKBB 19 Europeans) 16,17 and 2) cis-eQTLs in GTEx 59 v8 and eQTL Catalogue 60. Briefly, we conducted multiple-causal-variant fine-mapping (FINEMAP 21,22 and SuSiE 23) of complex trait GWAS (# unique traits = 148) and cis-eQTL gene expression (# unique tissues/cell-types = 69) using summary statistics and in-sample LD. Detailed fine-mapping methods are described elsewhere 16,17.

In this study, we collected 1) high-PIP GWAS variants that achieved PIP > 0.9 for any traits in any biobank and 2) high-PIP cis-eQTL variants that achieved PIP > 0.9 for any gene expression in any tissues/cell-types. All the variants were originally processed on the human genome assembly GRCh37 and lifted over to GRCh38 for comparison.

Additional fine-mapping results

To compare with the GBMI meta-analyses, we additionally conducted multi-causal-variant fine-mapping of four additional endpoints (gout, heart failure, thyroid cancer, and venous thromboembolism) that were not fine-mapped in our previous study 16,17. We used exactly the same fine-mapping pipeline (FINEMAP 21,22 and SuSiE 23) as described previously 16,17.
For UKBB Europeans, to use the exact same samples that contributed to the GBMI, we used individuals of European ancestry (n = 420,531) as defined in the Pan-UKBB project (https://pan.ukbb.broadinstitute.org), instead of those of "white British ancestry" (n = 361,194) used in our previous study 16,17.

To validate SLALOM performance, we asked whether suspicious and non-suspicious loci were enriched for having likely causal variants as a lead PIP variant, and for containing them in the 95% and 99% CS. We defined likely causal variants using 1) nonsynonymous coding variants, i.e., pLoF and missense variants annotated by the Ensembl Variant Effect Predictor (VEP) 93 v101 (using GRCh38 and GENCODE v35), 2) the high-PIP (> 0.9) complex trait fine-mapped variants, and 3) the high-PIP (> 0.9) cis-eQTL fine-mapped variants from our previous studies as described above.

We estimated enrichment for suspicious and non-suspicious loci as a relative risk (i.e., a ratio of proportion of variants) between being in suspicious/non-suspicious loci and having the annotated likely causal variants as a lead PIP variant (or containing them in the 95% or 99% CS). That is, a relative risk = (proportion of non-suspicious loci having the annotated variants as a lead PIP variant) / (proportion of suspicious loci having the annotated variants as a lead PIP variant). We computed 95% confidence intervals using bootstrapping.

To directly compare with fine-mapping results from the GBMI meta-analyses, we used our fine-mapping results of nine disease endpoints (asthma 64, COPD 64, gout, heart failure 73, IPF 62, primary open angle glaucoma 74, thyroid cancer, stroke 75, and venous thromboembolism 76) in BBJ 58, FinnGen 20, and UKBB 19 Europeans that were also part of the GBMI meta-analyses for the same traits.
For comparison, we computed the maximum PIP for each variant and the minimum size of 95% CS across BBJ, FinnGen, and UKBB. We restricted the 95% CS in biobanks to those that contain the lead variants from the GBMI. We defined the PIP difference between the GBMI and individual biobanks as ΔPIP = PIP (GBMI) − the maximum PIP across the biobanks.

We conducted functional enrichment analysis to compare between the GBMI meta-analysis and individual biobanks because unbiased comparison of PIP requires conditioning on likely causal variants independent of the fine-mapping results, and functional annotations have been shown to be enriched for causal variants. Using functional categories (coding [pLoF, missense, and synonymous], 5'/3' UTR, promoter, and CRE) from our previous study 16,17, we estimated functional enrichments of variants in each functional category based on 1) top PIP rankings and 2) ΔPIP bins. Since fine-mapping PIP in the GBMI meta-analysis can be miscalibrated, we performed a comparison based on top PIP rankings to assess whether the ordering given by GBMI PIPs is more informative than the ordering given by the biobanks. For the top PIP rankings, we took the top 0.5%, 0.1%, and 0.05% variants based on the PIP rankings in the GBMI and individual biobanks. We computed enrichment as a relative risk = (proportion of top X% PIP variants in the GBMI that are in the annotation) / (proportion of top X% PIP variants in the individual biobanks that are in the annotation). For ΔPIP bins, we defined three bins using different thresholds (θ = 0.01, 0.05, and 0.1): 1) decreased PIP bin, ΔPIP < −θ, 2) null bin, −θ ≤ ΔPIP ≤ θ, and 3) increased PIP bin, θ < ΔPIP. We computed enrichment as a relative risk = (proportion of variants in the decreased/increased PIP bin that are in the annotation) / (proportion of variants in the null PIP bin that are in the annotation).
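The ΔPIP binning and the relative-risk calculation above can be illustrated with a short Python sketch. The function names, the default θ, and the toy inputs are assumptions introduced for illustration; the sketch omits the bootstrapped confidence intervals and does not guard against an empty group or a zero denominator.

```python
def delta_pip_bin(pip_gbmi, pip_biobanks, theta=0.05):
    """Assign a variant to the decreased, null, or increased PIP bin.

    delta = PIP in the meta-analysis minus the maximum PIP across biobanks.
    """
    delta = pip_gbmi - max(pip_biobanks)
    if delta < -theta:
        return "decreased"
    if delta > theta:
        return "increased"
    return "null"


def relative_risk(in_annotation_a, in_annotation_b):
    """Ratio of the proportions of annotated variants in group a vs. group b."""
    prop_a = sum(in_annotation_a) / len(in_annotation_a)
    prop_b = sum(in_annotation_b) / len(in_annotation_b)
    return prop_a / prop_b


# Toy examples (made up): one variant per bin.
bins = [
    delta_pip_bin(0.90, [0.20, 0.50, 0.10]),  # large gain in the meta-analysis
    delta_pip_bin(0.10, [0.60, 0.20]),        # large loss in the meta-analysis
    delta_pip_bin(0.30, [0.28]),              # essentially unchanged
]

# Annotation membership flags for the increased bin and the null bin (made up).
rr = relative_risk([True, True, False, False], [True, False, False, False])
print(bins, rr)
```

A relative risk above 1 for the increased bin would suggest that variants gaining PIP in the meta-analysis are more often functionally annotated than variants whose PIP is unchanged.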
We combined coding, UTR, and promoter categories for this analysis due to the limited number of variants for each bin.

Meta-analysis methods for genome-wide association studies and beyond
Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps
Identification of type 2 diabetes loci in 433,540 East Asian individuals
Biological insights from 108 schizophrenia-associated genetic loci
The Schizophrenia Working Group of the Psychiatric Genomics Consortium. Mapping genomic loci prioritises genes and implicates synaptic biology in schizophrenia
Genetics of rheumatoid arthritis contributes to biology and drug discovery
Trans-ancestry genome-wide association study identifies novel genetic mechanisms in rheumatoid arthritis
Genetic studies of body mass index yield new insights for obesity biology
The power of genetic diversity in genome-wide association studies of lipids
Global Biobank Meta-analysis Initiative: powering genetic discovery across human diseases
10 Years of GWAS Discovery: Biology, Function, and Translation
Genomic Medicine-Progress, Pitfalls, and Promise
From genome-wide associations to candidate causal variants by statistical fine-mapping
Interrogation of human hematopoiesis at single-cell and single-variant resolution
Functionally informed fine-mapping and polygenic localization of complex trait heritability
An annotated atlas of causal variants underlying complex traits and gene expression
Overview of the BioBank Japan Project: Study design and profile
The UK Biobank resource with deep phenotyping and genomic data
Unique genetic insights from combining isolated population and national health register data. bioRxiv (2022)
FINEMAP: Efficient variable selection using summary data from genome-wide association studies
A simple new approach to variable selection in regression, with application to genetic fine mapping
Fine mapping of type 1 diabetes susceptibility loci and evidence for colocalization of causal variants with lymphoid gene enhancers
Bi-ancestral depression GWAS in the Million Veteran Program and meta-analysis in >1.2 million individuals highlight new therapeutic directions
Genome-wide meta-analysis identifies 127 open-angle glaucoma loci with consistent effect across ancestries
The trans-ancestral genomic architecture of glycemic traits
GWAS of thyroid stimulating hormone highlights pleiotropic effects and inverse association with thyroid cancer
A genome-wide association study with 1,126,563 individuals identifies new risk loci for Alzheimer's disease
Trans-ethnic and Ancestry-Specific Blood-Cell Genetics in 746,667 Individuals from 5 Global Populations
A Bayesian measure of the probability of false discovery in genetic epidemiology studies
Bayes factors for genome-wide association studies: comparison with P-values
Identifying Causal Variants at Loci with Multiple Signals of Association
Integrating functional data to prioritize causal variants in statistical fine-mapping studies
Leveraging Functional-Annotation Data in Trans-ethnic Fine-Mapping Studies
Strong protective effect of the aldehyde dehydrogenase gene (ALDH2) 504lys (*2) allele against alcoholism and alcohol-induced medical diseases in Asians
Transethnic Genetic-Correlation Estimates from Summary Statistics
Population-specific causal disease effect sizes in functionally important regions impacted by selection
Resolving TYK2 locus genotype-to-phenotype differences in autoimmunity
Tyrosine kinase 2 variant influences T lymphocyte polarization and multiple sclerosis susceptibility
Two rare disease-associated Tyk2 variants are catalytically impaired but signaling competent
RICOPILI: Rapid Imputation for COnsortias PIpeLIne
Fine-mapping inflammatory bowel disease loci to single-variant resolution
Quality control and conduct of genome-wide association meta-analyses
Improved analyses of GWAS summary statistics by reducing data heterogeneity and errors
Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits
The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019
The 1000 Genomes Project Consortium. A global reference for human genetic variation
A reference panel of 64,976 haplotypes for genotype imputation
Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program
Quantification of frequency-dependent genetic architectures in 25 UK Biobank traits reveals action of negative selection
Genome partitioning of genetic variation for complex traits using common SNPs
Converting single nucleotide variants between genome builds: from cautionary tale to solution
Trans-ethnic study design approaches for fine-mapping
Genotype imputation for genome-wide association studies
The mutational constraint spectrum quantified from variation in 141,456 humans
A cross-population atlas of genetic associations for 220 human phenotypes
The GTEx Consortium atlas of genetic regulatory effects across human tissues
A compendium of uniformly processed human gene expression and splicing quantitative trait loci
Alpha-1 Antitrypsin PiMZ Genotype Is Associated with Chronic Obstructive Pulmonary Disease in Two Racial Groups
Multi-ancestry meta-analysis of asthma identifies novel associations and highlights the value of increased power and diversity
Rare and low-frequency coding variants alter human adult height
Transancestral mapping and genetic load in systemic lupus erythematosus
Fcγ receptors: genetic variation, function, and disease
Association analysis of copy numbers of FC-gamma receptor genes for rheumatoid arthritis and other immune-mediated phenotypes
UK10K Consortium et al. The UK10K project identifies rare variants in health and disease
Global biobank analyses provide lessons for computing polygenic risk scores across diverse cohorts
A practical guideline of genomics-driven drug discovery in the era of global biobank meta-analysis. bioRxiv
Polygenic risk score from a multi-ancestry GWAS uncovers susceptibility of heart failure
Genome-wide association meta-analysis identifies novel ancestry-specific primary open-angle glaucoma loci and shared biology with vascular mechanisms and cell proliferation
Multi-ancestry GWAS for venous thromboembolism identifies novel loci followed by experimental validation
Asthma-associated genetic variants induce IL33 differential expression through an enhancer-blocking regulatory region
IL-13 R130Q, a common variant associated with allergy and asthma, enhances effector mechanisms essential for human allergic inflammation
Genetic and epigenetic fine mapping of causal autoimmune disease variants
Genetic analyses of diverse populations improves discovery for complex traits
A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response
Decoding the diversity of killer immunoglobulin-like receptors by deep sequencing and a high-resolution imputation method
Protein-coding repeat polymorphisms strongly shape diverse human phenotypes
Exploring and visualizing large-scale genetic associations by using PheWeb
HAPGEN2: Simulation of multiple disease SNPs
Robust relationship inference in genome-wide association studies
Second-generation PLINK: rising to the challenge of larger and richer datasets
CCR5-∆32 is deleterious in the homozygous state in humans
No statistical evidence for an effect of CCR5-∆32 on lifespan in the UK Biobank cohort
Next-generation genotype imputation service and methods
Reference-based phasing using the Haplotype Reference Consortium panel
Contrasting genetic architectures of schizophrenia and other complex diseases using fast variance-components analysis
The Ensembl Variant Effect Predictor

Global Biobank Meta-analysis Initiative

Arjun Bhattacharya 12, Huiling Zhao 13, Shinichi Namba 5, Ida Surakka 14 Sarah Finer 40, Lars G Fritsche 32 Hilary K Finucane 1,2,3, Lude Franke 18, Eric Gamazon 35 Jordan A Shavit 56 Aarno V Palotie 1,2,21 The HUNT Study, UCLA ATLAS Community Health Initiative 74*, Mark J Daly 1,2,3,21*, Benjamin M Neale 1,2,3*

*These authors jointly supervised the initiative

1 Analytic and Translational Genetics Unit, Department of Medicine 3 Program in Medical and Population Genetics, Broad Institute of MIT and Harvard Department of Public Health and Nursing, Faculty of Medicine and Health MRC Integrative Epidemiology Unit, Population Health Sciences 13 MRC Integrative Epidemiology Unit (IEU) 16 Department of Clinical Genetics Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing 26 Division of Biostatistics, Institute of Epidemiology and Preventive Medicine, College of Public Health 33 The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai 36 Centre for Genomic and Experimental Medicine, Institute of Genetics and Cancer Takatsuki 569-1125, Japan, 44 Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer Psychiatric and Neurodevelopmental Genetics Unit 52 Institute of Precision Health 55 Department of Computational Biology and Medical Sciences, Graduate school of Frontier Sciences, The University of Tokyo 58 Bradford Institute for Health Research, Bradford Teaching Hospitals National Health Service (NHS) Foundation Trust, Bradford, UK, 59 Department of Molecular Genetics 62 Department of Neurology 64 Department of Public Health & Medical Humanities, School of Medicine 66 Medical and Population Genomics Suita 565-0871, Japan, 69 Laboratory for Systems Genetics, RIKEN Center for Integrative Medical Sciences 72 Institute for Genetics and Biomedical Research, National Research Council, Cagliari 09100, Italy, 73 Psychiatric and Neurodevelopmental Genetics 74 Department of Human Genetics