Data availability
The experimental data used in this study are publicly available. In vitro perturbation data were downloaded from the supplementary material of publications (as described and referenced in the manuscript) and from the BioGRID Open Repository of CRISPR Screens (ORCS; version 1.1.16, May 2024) website (https://orcs.thebiogrid.org/). The human pLoF burden association data used in this study were derived from individual participant data from the UK Biobank. Phenotypic and genotypic data from the UK Biobank are available to researchers by application (https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access). Phenome-wide pLoF burden data (same dataset, similar variant masks) from the UK Biobank WGS dataset (ref. 43) are available in the AstraZeneca PheWAS browser (https://azphewas.com/).
Code availability
Standard functions from the TwoSampleMR (0.6.20) and metagen (8.1-0) R packages were used for all empirical GPAT analyses and visualisations as described in the manuscript. Custom R code used for GPAT simulations has been deposited in a public repository (https://zenodo.org/records/17534798). Further queries can be directed to the corresponding author.
References
Wei, J. et al. Genome-wide CRISPR screens reveal host factors critical for SARS-CoV-2 infection. Cell 184, 76–91.e13 (2021).
Baggen, J. et al. Genome-wide CRISPR screening identifies TMEM106B as a proviral host factor for SARS-CoV-2. Nat. Genet. 53, 435–444 (2021).
Przybyla, L. & Gilbert, L. A. A new era in functional genomics screens. Nat. Rev. Genet. 23, 89–103 (2022).
Shi, H., Doench, J. G. & Chi, H. CRISPR screens for functional interrogation of immunity. Nat. Rev. Immunol. 23, 363–380 (2023).
Haley, B. & Roudnicky, F. Functional genomics for cancer drug target discovery. Cancer Cell 38, 31–43 (2020).
Kendirli, A. et al. A genome-wide in vivo CRISPR screen identifies essential regulators of T cell migration to the CNS in a multiple sclerosis model. Nat. Neurosci. 26, 1713–1725 (2023).
Belk, J. A. et al. Genome-wide CRISPR screens of T cell exhaustion identify chromatin remodeling factors that limit T cell persistence. Cancer Cell 40, 768–786.e7 (2022).
Baronas, J. M. et al. Genome-wide CRISPR screening of chondrocyte maturation newly implicates genes in skeletal growth and height-associated GWAS loci. Cell Genom. 3, 100299 (2023).
Rottner, A. K. et al. A genome-wide CRISPR screen identifies CALCOCO2 as a regulator of beta cell function influencing type 2 diabetes risk. Nat. Genet. 55, 54–65 (2023).
Dong, C. et al. A genome-wide CRISPR-Cas9 knockout screen identifies essential and growth-restricting genes in human trophoblast stem cells. Nat. Commun. 13, 2548 (2022).
Bareinboim, E. & Pearl, J. A general algorithm for deciding transportability of experimental results. J. Causal Inference 1, 107–134 (2013).
Pearl, J. & Bareinboim, E. Transportability of causal and statistical relations: a formal approach. Proc. AAAI Conf. Artif. Intell. 25, 247–254 (2011).
Ference, B. A. et al. Effect of long-term exposure to lower low-density lipoprotein cholesterol beginning early in life on the risk of coronary heart disease: a Mendelian randomization analysis. J. Am. Coll. Cardiol. 60, 2631–2639 (2012).
Davey Smith, G. & Hemani, G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum. Mol. Genet. 23, R89–R98 (2014).
Sanderson, E. et al. Mendelian randomization. Nat. Rev. Methods Prim. 2, 6 (2022).
Bowden, J., Davey Smith, G. & Burgess, S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int. J. Epidemiol. 44, 512–525 (2015).
Bock, C. et al. High-content CRISPR screening. Nat. Rev. Methods Prim. 2, 8 (2022).
Lazar, N. H. et al. High-resolution genome-wide mapping of chromosome-arm-scale truncations induced by CRISPR-Cas9 editing. Nat. Genet. 56, 1482–1493 (2024).
Turn, R. E. et al. A genome-wide, CRISPR-based screen reveals new requirements for translation initiation and ubiquitination in driving adipogenic fate change. Genes Dev. 39, 1241–1264 (2025).
Lu, A. et al. CRISPR screens for lipid regulators reveal a role for ER-bound SNX13 in lysosomal cholesterol export. J. Cell Biol. 221, e202105060 (2022).
Chai, A. W. Y. et al. Genome-wide CRISPR screens of oral squamous cell carcinoma reveal fitness genes in the Hippo pathway. Elife 9, e57761 (2020).
Fei, T. et al. Deciphering essential cistromes using genome-wide CRISPR screens. Proc. Natl. Acad. Sci. USA 116, 25186–25195 (2019).
Lee, D. H. et al. Genome-wide CRISPR screening reveals a role for sialylation in the tumorigenesis and chemoresistance of acute myeloid leukemia cells. Cancer Lett. 510, 37–47 (2021).
Xu, S. et al. Genome-wide CRISPR screen identifies ELP5 as a determinant of gemcitabine sensitivity in gallbladder cancer. Nat. Commun. 10, 5492 (2019).
Thelen, A. M. & Zoncu, R. Emerging roles for the lysosome in lipid metabolism. Trends Cell Biol. 27, 833–850 (2017).
Sterling, P. What Is Health? in What Is Health?: Allostasis and the Evolution of Human Design (The MIT Press, 2020).
Davey Smith, G. & Ebrahim, S. ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int. J. Epidemiol. 32, 1–22 (2003).
Garry, D. J. et al. Mice without myoglobin. Nature 395, 905–908 (1998).
Voight, B. F. et al. Plasma HDL cholesterol and risk of myocardial infarction: a mendelian randomisation study. Lancet 380, 572–580 (2012).
Holmes, M. V. et al. Mendelian randomization of blood lipids for coronary heart disease. Eur. Heart J. 36, 539–550 (2015).
Davey Smith, G. & Phillips, A. N. Correlation without a cause: an epidemiological odyssey. Int. J. Epidemiol. 49, 4–14 (2020).
Bowden, J., Davey Smith, G., Haycock, P. C. & Burgess, S. Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genet. Epidemiol. 40, 304–314 (2016).
Gutierrez, S., Glymour, M. M. & Davey Smith, G. Evidence triangulation in health research. Eur. J. Epidemiol. 40, 743–757 (2025).
Replogle, J. M. et al. Mapping information-rich genotype-phenotype landscapes with genome-scale Perturb-seq. Cell 185, 2559–2575.e28 (2022).
Blake, J. A. et al. Mouse Genome Database (MGD): knowledgebase for mouse–human comparative biology. Nucleic Acids Res. 49, D981–D987 (2020).
Lee, S., Abecasis, G. R., Boehnke, M. & Lin, X. Rare-variant association analysis: study designs and statistical tests. Am. J. Hum. Genet. 95, 5–23 (2014).
Li, X. et al. Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Nat. Genet. 52, 969–983 (2020).
The UK Biobank Whole-Genome Sequencing Consortium. Whole-genome sequencing of 490,640 UK Biobank participants. Nature 645, 692–701 (2025).
Mbatchou, J. et al. Computationally efficient whole-genome regression for quantitative and binary traits. Nat. Genet. 53, 1097–1103 (2021).
Denaxas, S. et al. UK phenomics platform for developing and validating electronic health record phenotypes: CALIBER. J. Am. Med. Inform. Assoc. 26, 1545–1559 (2019).
Sun, B. B. et al. Plasma proteomic associations with genetics and health in the UK Biobank. Nature 622, 329–338 (2023).
Hemani, G. et al. The MR-Base platform supports systematic causal inference across the human phenome. Elife 7, e34408 (2018).
Carss, K. et al. Whole-genome sequencing of 490,640 UK Biobank participants. Nature 645, 692–701 (2025).
Acknowledgements
GDS works within the MRC Integrative Epidemiology Unit at the University of Bristol, which is supported by the Medical Research Council (MC_UU_00032/1).
Author information
Authors and Affiliations
GSK, Gunnels Wood Road, Stevenage, United Kingdom
Laurence J. Howe, Yurii S. Aulchenko, Toby Johnson, Tom G. Richardson, Philippe Sanseau, Robert A. Scott, Daniel D. Seaton & Ashwini Sharma
MRC-IEU, University of Bristol, Bristol, UK
George Davey Smith
Division of Psychiatry, University College London, London, UK
Neil M. Davies
GSK, Calle Severo Ochoa, Tres Cantos, Spain
Jorge Esparza-Gordillo
GSK, Collegeville, Collegeville, Pennsylvania, USA
Jimmy Z. Liu
GSK, Meyerhofstrasse 1, Heidelberg, Germany
Adrian Cortes
Authors
- Laurence J. Howe
- Yurii S. Aulchenko
- George Davey Smith
- Neil M. Davies
- Jorge Esparza-Gordillo
- Toby Johnson
- Jimmy Z. Liu
- Tom G. Richardson
- Philippe Sanseau
- Robert A. Scott
- Daniel D. Seaton
- Ashwini Sharma
- Adrian Cortes
Contributions
L.J.H. conceived the project, performed GPAT analyses and drafted the manuscript. T.G.R. and A.C. performed supporting WGS burden analyses. A.S. contributed data extraction for GPAT analyses using ORCS data. L.J.H., Y.S.A., G.D.S., N.M.D., J.E.G., T.J., J.Z.L., T.G.R., P.S., R.A.S., D.D.S., A.S., and A.C. reviewed and critically contributed to the manuscript.
Corresponding author
Correspondence to Laurence J. Howe.
Ethics declarations
Competing interests
LJH, YSA, AC, JEG, TJ, JZL, TGR, PS, RAS, DDS, and AS are all employees and/or stockholders of GSK. GDS reports Scientific Advisory Board Membership for Bristol Myers Squibb, Relation Therapeutics and Insitro. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Xihao Li and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Howe, L.J., Aulchenko, Y.S., Davey Smith, G. et al. Evaluating transportability of in vitro cellular models to in vivo human phenotypes using gene perturbation data. Nat Commun (2025). https://doi.org/10.1038/s41467-025-67199-1
Received: 22 October 2024
Accepted: 25 November 2025
Published: 13 December 2025
DOI: https://doi.org/10.1038/s41467-025-67199-1