Main
Genome-wide association studies (GWAS) and rare variant burden tests have identified tens of thousands of reproducible associations for a wide range of human traits and diseases. These signals have identified many genes that can serve as therapeutic targets4,5,[6](https://www.nature.com/articles/s41586-025-09866-3#ref-CR6 “Minikel, E. V., Painter, J. L., Dong, C. C. & Nelson, M. R. Refin…
Main
Genome-wide association studies (GWAS) and rare variant burden tests have identified tens of thousands of reproducible associations for a wide range of human traits and diseases. These signals have identified many genes that can serve as therapeutic targets4,5,6; driven discoveries of new molecular mechanisms7,8, critical cell types9 and physiological pathways of disease risks or traits10,11,12; and enabled genetic risk prediction for complex diseases13.
But despite these successes, interpreting the vast majority of associations remains challenging. Aside from coarse-grained analyses such as identifying trait-relevant cell types and enriched gene sets, we lack genome-scale approaches for interpreting the molecular pathways and mechanisms through which hundreds, if not thousands, of genes affect a given phenotype.
One challenge for interpreting genetic associations is the observation that many hits act indirectly, via trans-regulation of other genes14,15,16,17,18,19. This observation is formalized in the omnigenic model3,20, which proposes that, for any given trait, only a subset of genes, referred to as core genes, are located within key molecular pathways that act directly on the trait of interest. Meanwhile, many more genes affect the trait indirectly, by regulating core genes through links in gene-regulatory networks. In this model, we can interpret the effect size of a variant in terms of all paths through the network by which it affects core genes.
The central role of trans-regulation underlying many GWAS hits implies that fully understanding the genetic basis of complex traits requires tools to measure how genetic effects flow through networks. However, until recently, we have had very limited information about gene-regulatory networks in any human cell type, with the main information coming from observational data such as trans-expression quantitative trait locus (trans-eQTL) and co-expression mapping14,16,21. However, both approaches have important limitations including low power20,22 and confounding effects of cell-type composition14 in the case of trans-eQTLs, and ambiguous causality in co-expression analysis23,24.
Advances in genome editing and single-cell RNA sequencing, including Perturb-seq, now provide new opportunities to measure causal gene-regulatory connections at genome scale25,26,27,28. In Perturb-seq experiments, a pool of cells is transduced with a library of guide RNAs, each of which causes knockdown (or other perturbation) of a single gene. After allowing the cells time to equilibrate, single-cell sequencing is used to determine which genes were knocked down in each cell and measure the transcriptome of the cell. Critically, Perturb-seq enables measurement of the trans-regulatory effects of each gene in a controlled experimental setting at the genome-wide scale. Recent work has shown that such approaches are a promising tool for interpreting GWAS data, finding that GWAS hits are often enriched in specific transcriptional programs identified by CRISPR perturbations of a subset of genes29,30,31,32,33.
Major challenges remain as we aim to move beyond identifying enriched programs to inferring genome-scale causal cascades of biological information. In this paper, we developed a new systematic approach to this problem. We demonstrate how, by combining loss-of-function (LoF) burden results with Perturb-seq, we can infer an internally coherent graph linking genes to functional programs to traits, and derive biological insight into the key genes and pathways that control these traits (Fig. 1a). The resulting graph helps us to understand not only the trait-relevant pathways but also the functions of genes and programs within the graph, to explain why those genes are associated with the traits. On the basis of our results, we expect that forthcoming efforts to generate perturbation data in a wide variety of cell types will provide a critical interpretative framework for human genetics.
Fig. 1: Study overview and selection of model traits.
a, Overview of the study. The square nodes represent genes, the coloured arrows between genes represent regulatory effects and the arrows from genes to traits represent associations. sgRNA, single guide RNA. b, Heritability enrichment of UKB traits to open chromatin regions in K562. Traits are ordered based on the P value of enrichment, which was estimated using the Jackknife test in S-LDSC. The dashed line indicates the threshold for Bonferroni significance. ATAC-seq, assay for transposase-accessible chromatin using sequencing; Cou, count; HLSR, high light scatter reticulocyte count; MCV, mean corpuscular volume; MRV, mean reticulocyte volume; MSCV, mean sphered corpuscular volume; Per, percentage; RBC, red blood cell; Ret, reticulocyte. c, Schematic of the human haematopoietic tree. Traits of interest are annotated near their relevant cell types. CLP, common lymphoid progenitor; CMP, common myeloid progenitor; GMP, granulocyte–monocyte progenitor; HSC, haematopoietic stem cell; MPP, multipotent progenitor. d, Comparison of heritability enrichment to UKB traits, between MEP and K562 open chromatin regions. P values were estimated using the Jackknife test in S-LDSC. The dashed line indicates the threshold for Bonferroni significance.
Selection of model traits
To integrate genetic association data with Perturb-seq, our first step was to evaluate whether there are any traits with high-quality genetic data where the most relevant cell type (or types) can be well modelled by existing Perturb-seq data. At the time of writing, the only published genome-wide Perturb-seq dataset was collected in a leukemia cell line: K562 (ref. 2). In that experiment, every expressed gene was knocked down using CRISPR interference, one gene per cell, before single-cell RNA sequencing.
To determine which traits could reasonably be modelled in terms of the gene-regulatory networks of K562 cells, we compiled published GWAS and LoF burden test data for a wide range of traits measured in the UK Biobank (UKB)1,34. Of these, we selected 234 quantitative traits with single-nucleotide polymorphism (SNP) heritability > 0.04 for further consideration (Supplementary Table 1) and performed stratified linkage disequilibrium score regression (S-LDSC)9 across all 234 traits. We observed that open chromatin regions in K562 exhibited significant heritability enrichment exclusively for traits related to morphology or quantity of erythroid lineage cells (Fig. 1b).
This result is intuitive, as the K562 cell line was derived from erythroleukaemia cells, which are a neoplastic form of erythroid progenitors (Fig. 1c), and K562 cells retain multipotency and can differentiate into erythroid cells35.
We also performed S-LDSC across the same set of traits for various primary cell types, and found a very similar enrichment for erythroid traits in megakaryocyte–erythroid progenitor cells (MEPs), which are the natural progenitor cells for erythrocytes (Fig. 1c,d and Extended Data Fig. 1a). The open chromatin regions in MEPs were also more similar to those in K562 cells than other cell types (Extended Data Fig. 1b). These results support the notion that K562 cells share similar chromatin features with primary progenitor cells and could serve as a cellular model for studying the gene-regulatory network associated with erythroid traits.
Among the enriched traits, we selected three traits that are relatively independent, with pairwise genetic correlations ranging from −0.39 to 0.15, for detailed analysis (Extended Data Fig. 1c). We focused primarily on mean corpuscular haemoglobin (MCH), which measures the mean amount of haemoglobin per erythrocyte; but, we also analysed red cell distribution width (RDW)—the standard deviation of the size of erythrocytes per individual—and the immature reticulocyte fraction (IRF). For these traits, a considerable amount of SNP heritability was explained by open chromatin regions in the K562 cell line (53%, 44% and 36% of the total SNP heritability, respectively), further supporting the use of K562 Perturb-seq to interpret their genetic associations (Supplementary Table 2).
Pathway enrichment for trait associations
Before attempting to build causal models for these traits, we first explored the genetic associations for MCH, RDW and IRF with standard approaches (Fig. 2 and Supplementary Fig. 1). GWAS of MCH in the UKB identified 634 independent genome-wide significant signals. Many of the lead hits fall into a few significantly enriched pathways, including haem metabolism, haematopoiesis and cell cycle (Fig. 2a,c). These enriched pathways are crucially involved in the maturation of erythrocytes. For example, tight control of cell cycle is important at several steps in erythropoiesis36,37,38,39.
Fig. 2: Pathway enrichments for blood trait associations.
a, Genetic associations identified from UKB GWAS for MCH. Variants located within a 100-kb window centred on the transcription start site of the genes in the gene set are coloured. ‘Macromolecule synthesis + reg’ refers to the positive regulation of the macromolecule biosynthetic process. b, Gene associations with MCH from UKB LoF burden tests. The colours indicate the same gene sets as panel a. Labelled genes have FDR < 0.01 and belong to the gene sets. c, Pathway enrichment of GWAS and LoF burden test top genes. For GWAS, the closest genes from the independent top variants were used. For the LoF burden test, genes were ranked by the absolute posterior effect size from GeneBayes, and the same number of genes as in GWAS was used. P values are from one-sided Fisher’s exact test. d, Comparison of LoF burden test effect sizes after GeneBayes between MCH and RDW. The solid line corresponds to the first principal component.
In addition to GWAS, UKB has also released whole-exome sequencing data for more than 450,000 participants40. Here we focused on the phenotypic effects of LoFs, which are variants such as frameshift and premature stop mutations that are predicted to cause complete LoF of a gene. To estimate the average effect of different LoF variants in the same gene on a phenotype, we compared the phenotypic values for carriers of LoF variants in a given gene versus non-carriers. This approach, known as a burden test, generates a score for each gene that estimates the effect of half loss of gene dosage on the phenotype.
Previously reported burden test statistics for LoF variants1 identified 90 genes associated with MCH at a false discovery rate (FDR) = 0.1 (Fig. 2b). Although the rankings of top hits differ between GWAS and LoF burden tests (Extended Data Fig. 2a), the lead hits from GWAS and LoF are generally enriched in the same pathways (Fig. 2c). This is consistent with the expectation that common and rare variants associated with a trait act through similar biological pathways, but frequently prioritize different genes41,42.
As one might expect, LoF variants in the genes that encode components of adult human haemoglobin, HBB, HBA1 and HBA2, all show strong negative effects on MCH (Fig. 2b). Clinically, these mutations cause α-thalassaemia or β-thalassaemia, in which a decrease in MCH is characteristic. This highlights a key feature of burden tests: in addition to significance testing, they also provide a quantitative, directional estimate of LoF effects, referred to here as γ.
The directions of associations in the burden tests also help us to interpret the pleiotropic effects of genes. When looking at genes associated with MCH and RDW, which have a negative genetic correlation in GWAS (Extended Data Fig. 1c), the LoF effects for most genes were associated in opposite directions (r = −0.53; Fig. 2d). However, a handful of genes had strong same-direction effects on both traits (Fig. 2d). For instance, CAD encodes a multifunctional enzyme of which biallelic mutations cause megaloblastic anaemia43, whereas heterozygous LoFs increase both MCH and RDW (Fig. 2d). One goal of building a causal mechanistic graph for these traits will be to explain these seemingly discordant associations.
For many genes, the LoF γs have large standard errors, due to the low frequency of LoF variants41. To improve estimation of the γs, we applied an empirical Bayes framework called GeneBayes that we developed previously44. Our approach incorporated previous information about gene expression, protein structure and gene constraint to share information across functionally similar genes (Methods). We found that the GeneBayes estimates of γ are far more reproducible than naive estimates in the independent All of Us cohort45 (Extended Data Fig. 2b,c). Furthermore, we observed greater enrichment of genes associated with traits in functional pathways even though we did not directly use that information (Extended Data Fig. 2d,e). These improvements are important for making full use of the beneficial features of LoF burden tests while reducing unwanted noise. Therefore, we used the GeneBayes posterior mean effect sizes in Fig. 2c,d and for the remainder of the paper. For further discussion about the choice of prior information for GeneBayes, see the Supplementary Note.
Gene regulation shapes genetic signals
Next, we investigated whether Perturb-seq from K562 could allow us to interpret genetic associations in the context of the gene-regulatory network. Perturb-seq estimates the effect of knocking down a gene x on the expression of another gene y, which we denote as β**x→y (Methods). β**x→y represents the total effect of x on y, including both direct and indirect pathways through the gene-regulatory network. Previous studies using perturbations to interpret GWAS have identified enrichment of hits in co-regulated gene sets, sometimes referred to as ‘programs’29,30,31,32,33, but have had limited success at identifying GWAS enrichment among program regulators (Supplementary Note).
As an initial proof of concept, we focused on the genes encoding constituents of adult haemoglobin. We focused on the gene HBA1, which is the only one abundantly expressed in K562 cells, and which has one of the largest LoF effect sizes for MCH (γ**HBA1 = − 1.5). We reasoned that if K562 Perturb-seq is relevant for interpreting MCH, then genes that regulate HBA1 should also be associated with MCH. Moreover, we should be able to predict the direction of effect on MCH from the Perturb-seq data: positive regulators of HBA1 should, themselves, have promoting effects on MCH, and vice versa for negative regulators. (Note that we refer to genes with negative β or negative γ from knockdown or LoF, respectively, as promoting and coloured them red; positive β and γ are considered repressing and coloured blue).
As predicted, we found that across all 9,498 genes that were perturbed and also tested in the LoF burden test, the LoF effect of a gene x on MCH, denoted γ**x, is significantly positively correlated with the knockdown effects of that gene on HBA1 expression, β**x→HBA1 (*β-*coefficient = 0.052, P = 3 × 10−7; Fig. 3a). Of note, among the perturbed genes, of the top ten genes ranked by LoF effects on MCH, seven had nominally significant Perturb-seq effects on HBA1, and for all seven, the sign of the Perturb-seq β matched what we predicted from γ.
Fig. 3: Regulatory effects in Perturb-seq explain genetic association signals.
a, Gene effects on MCH can be predicted by regulatory effects on HBA1. Genes perturbed in Perturb-seq experiment are ordered by their effect sizes on MCH from LoF burden test. Perturb-seq β refers to log fold change of HBA1 expression after knockdown of the genes. Significant (P < 0.05) regulatory associations in Perturb-seq are connected with arrows. The protein structure of haemoglobin is presented using UCSF ChimeraX66 based on Protein Data Bank entry 1A3N. The P value is from the linear regression and is two-sided. KDreg, knockdown of a regulator. b, Enrichment analysis testing whether the top n HBA1 regulators (ranked by P values) are enriched at LoF or GWAS top hits. GWAS hits are the closest genes to the independently associated variants (Methods). Points indicate the odds ratio in the exact Fisher’s test. Enrichment was calculated with all the perturbed genes in Perturb-seq as a background. The error bars indicate 95% confidence intervals. The P values for the enrichment of the top 200 HBA1 regulators are 9.6 × 10−5 for the top 90 LoF hits, 0.65 for the top 90 GWAS hits and 0.01 for the top 543 GWAS hits. c, For every expressed gene in K562, regulator–burden correlation is plotted against their γ for MCH. The y axis shows the –log10(P) of the regulator–burden correlation, multiplied by the sign of the correlation. The P values are from the linear regression. Quadrants with a yellow background correspond to ‘concordant’ association, in which the sign of regulator–burden correlation aligns with the sign expected from the γ of the gene. d, Genome-wide QQ-plots for regulator–burden correlations among representative traits. Each dot represents one gene. Traits without significant signals lie along the dotted line. For other traits, see Extended Data Fig. 3c. P values are from the linear regression.
We also attempted a similar analysis for GWAS hits, testing whether significant GWAS hits were enriched near HBA1 regulators (Fig. 3b). We observed that GWAS hits were enriched (OR = 2.1 for the top 200 regulators), but to a lesser extent than for significant LoF burden test hits (OR = 6.3 for the top 200 regulators). This cannot be solely explained by inaccurate gene linking, as the same set of GWAS hits showed high enrichment for some of the gene sets (Fig. 2c and Extended Data Fig. 3a). This suggests a benefit of LoF burden tests over GWAS for identifying the trait-relevant regulatory networks.
We were curious whether similar patterns of correlation between LoF effect and Perturb-seq regulatory effects might be found for other genes or other traits. Consistent with the central role of HBA1 in determining the MCH phenotype, we found that the correlation of γ**x with β**x→y, which we call regulator–burden correlation, was the highest for y = HBA1 among all genes expressed in K562 cells (Fig. 3c). As a negative control, we also tested for correlations between regulatory effects on HBA1 with LoF effects on unrelated traits. As expected, we only detected HBA1 regulator signals for erythroid traits (Extended Data Fig. 3b). These HBA1 regulator signals for traits were also detected if we used raw burden effect estimates without applying GeneBayes, but with weaker significance (Extended Data Fig. 2f,g), supporting our approach.
Another key question for Perturb-seq studies is whether regulatory relationships learned in one cell type—K562 in this case—are useful for studying traits that are determined by less-related cell types. To examine this, we computed the regulator–burden correlation for all expressed genes, with LoF γs for various traits. For each trait, we visualized the distribution of regulator–burden correlations in a two-sided quantile–quantile (QQ)-plot (Fig. 3d).
Starting with our three main erythroid traits, MCH, RDW and IRF, we saw that all three traits show large excesses of both positive and negative correlations compared with the null (x = y line), indicating significant relationships between Perturb-seq and LoF burden tests for many genes. By contrast, there was minimal correlation between regulatory effects and γ for other blood traits, including lymphocyte and eosinophil counts (Fig. 3d and Extended Data Fig. 3c). This suggests that cell types that are not differentiated from MEPs cannot be modelled well using K562 cells (Fig. 1c). This observation implies the importance of obtaining Perturb-seq data in trait-relevant cell types.
However, we were surprised to see that some non-erythroid traits, including serum levels of IGF-1 and CRP, as well as body mass index, did show highly significant correlations of regulatory effects with γ (Extended Data Fig. 3c). The strongest correlations were seen for insulin-like growth factor 1 (IGF-1), which connects the release of growth hormone to cell growth, acting on many cell types46. Further examination revealed that these signals appear to be driven by the regulation of cellular growth markers, including MKI67. We hypothesize that essential programs for cellular growth may be broadly shared across cell types that regulate IGF-1 and other traits that share this signal (Extended Data Fig. 3d). Indeed, with further analysis using Perturb-seq in additional cell lines, we confirmed the broad sharing of regulatory effects on the essential programs associated with IGF-1 (Extended Data Fig. 10 and Supplementary Note).
Together, these results confirm the relevance of gene-regulatory relationships learned from Perturb-seq for interpreting complex traits. They highlight the role of both cell-type-specific pathways—for which the cell type used in Perturb-seq must be closely matched to the trait of interest—and broadly active pathways that may be detectable in many cell types.
Trait-associated program regulations
We next aimed to develop a more comprehensive framework to explain genotype–phenotype associations in terms of the regulatory hierarchy inferred from Perturb-seq data. In principle, one might imagine inferring a complete gene-regulatory network from Perturb-seq that contains all causal gene-to-gene edges. However, the inference of accurate genome-scale causal graphs is extremely challenging, if not infeasible, from current Perturb-seq data.
As a more robust alternative, we followed previous work by clustering genes into co-expressed groups, referred to here as programs30. To identify programs, we applied consensus non-negative matrix factorization (cNMF)47 to the gene expression matrix from Perturb-seq (Fig. 4a). This allowed us to quantify the activity of each program in every cell. Similar to ref. 30, we then used the perturbation data to estimate the causal regulatory effects of knockdown of every gene x on the activity of each program P, denoted as β**x→P.
Fig. 4: Association of program regulation with blood traits.
a,b, Overview of our pipeline for the analysis to find the trait-relevant programs. c–e, Program burden effects (x axis) and regulator–burden correlation (y axis) of 60 programs in three blood traits: MCH (c), RDW (d) and IRF (e). Programs with significant associations after Bonferroni correction (P < 0.05/60) are coloured. Pathway annotations of representative programs are labelled. For annotations of other programs, see Supplementary Table 3. The P values for program burden effects are from the permutation test and are two-sided. The P values for regulator–burden correlations are from the linear regression and are two-sided. f, Schematic for the concordant and discordant patterns between program burden effects and regulator–burden correlation. P, program; R, regulator; T, trait. g, Co-regulation patterns between programs. Each dot represents a gene that has significant regulatory effects on the G2/M phase program. The gene effect size on the program activity was calculated by comparing the program usage of cells between perturbed cells and control cells using linear regression (β**x→P; Methods). The lines and their 95% confidence intervals are from locally estimated scatterplot smoothing. h, A summary of signs of regulatory effects on the programs. i, Average γ and its standard errors for 115 genes in R**A and 154 genes in R**B MCH. j, Program association with MCH in GWAS–trans-eQTL analysis (Methods) and LoF–Perturb-seq analysis.
On the basis of the preliminary analyses, we chose to model the data using 60 programs (see Methods; Extended Data Fig. 4a). We found that a large fraction of the 60 programs successfully captured biological pathways (Supplementary Table 3). Using external ENCODE data48, we found evidence for coordinated transcriptional control of many programs: for 49 of the 60 programs at least one transcription factor showed significant binding site enrichment near program genes and knockdown of that transcription factor significantly changed program expression (Extended Data Fig. 4b and Supplementary Table 3).
We next quantified the average effects of programs, and their regulators, on traits (Fig. 4b). To measure program effects, we note that in NMF, the gene loadings on each program are non-negative by definition. Thus, a natural measure of the effect of a program on a trait is simply to compute the average LoF effects (γs) of highly loaded genes as a measure of the effect of that program on the trait. We refer to this as the program burden effect. A positive program burden effect is interpreted to mean that the program has a repressing function on the trait; a negative value implies that it is promoting. Significance was determined by permutations (Methods).
To measure the effects of regulators of program P on each trait, we needed to account for the fact that distinct regulators can have either positive or negative effects on P. Thus, for each program P, we computed the correlation across regulators, x, of β**x→P with γ**x. We refer to this measure as the regulator–burden correlation; this measure is analogous to the measure of regulatory effects used for single genes above. A positive regulator–burden correlation is interpreted to mean that upregulation of program P promotes the trait; a negative value suggests that upregulation of P has a repressing effect on the trait.
The program effects on each trait are shown in Fig. 4c–e. For MCH, the haemoglobin synthesis program genes and their regulators were both significantly enriched, consistent with our single-gene analysis of HBA1. In addition, five programs associated with the cell cycle were all enriched in the program burden effect axis. This mirrors the enrichment of this pathway from the over-representation analysis of GWAS and LoF top hits (Fig. 2c), but here we can confirm the enrichment of both regulators and program genes for these programs (Fig. 4c).
For RDW, the program reflecting ATP-dependent activity was highlighted from both program and regulator axes (Fig. 4d). Iron is incorporated into haem in the mitochondria, and its dysregulation results in high RDW. In extreme cases, mitochondrial dysfunction leads to sideroblastic anaemia, characterized by high RDW49. The association of the ATP activity program with RDW is consistent with this biology. For IRF, the program representing the maintenance of the erythroid progenitor population was enriched for both program and regulator axes (Fig. 4e). This program showed the enrichment of binding sites for transcription factors that are important for the maintenance of stem cell and progenitor populations, including TAL1, NFIC, MAX and MNT50,51,52 (Extended Data Fig. 4b).
Overall, the Perturb-seq data efficiently captured biological pathways and their regulators, and comparison with gene associations enable us to identify the pathways relevant to each trait.
Complex interplay of programs
Although the significant programs in Fig. 4c–e provide insight into biological controls of these three traits, they also revealed puzzling inconsistencies. Some programs, including haemoglobin synthesis for MCH, show consistent directional effects for program genes and program regulators, but for other programs, the directions of effects initially appeared to be inconsistent (Fig. 4f). Examination of these programs revealed important principles about the regulatory architecture of programs, and design considerations for building regulatory models of complex traits.
The first principle is revealed by three programs with strong effects on MCH: the S and G2/M phase cell-cycle programs, and the autophagy program. For the G2/M phase, the program and regulator effects have directionally concordant effects on MCH, but for the S phase, the program genes and their regulators imply effects with opposite directions. In addition, for autophagy, only the regulators—but not the program genes—show a signal (Fig. 4c).
One piece of this puzzle is explained by considering patterns of co-regulation across the three programs: (1) regulators of the S phase and G2/M phase programs are shared but affect the programs in opposite directions (Fig. 4g); and (2) most G2/M and S phase regulators also affect autophagy, but the knockdown effect on autophagy is almost always positive (Fig. 4g and Extended Data Fig. 5a). These relationships are intuitive: S phase and G2/M phase are mutually exclusive components of the cell cycle; meanwhile, autophagy is suppressed during mitosis, and cell-cycle regulators are known to have a key role in that suppression53.
To describe these patterns in a simple way, we defined two sets of regulator genes, denoted R**A and R**B, according to their effects on G2/M (Fig. 4g). To determine how the regulators of these three programs affect MCH, we fit their effects jointly in a multiple regression model. This analysis showed that G2/M and autophagy regulators both have independent repressive effects on MCH (Extended Data Fig. 5b). The opposite co-regulation of S and G2/M phase programs explains the opposite correlation of these regulators with γ (Fig. 4c). A summary of the joint model of regulator effects is shown in Fig. 4h.
One prediction of this model is that R**A regulators should have stronger (more negative) genetic effects on MCH γs than R**B regulators. This is because R**A genes have a repressive effect on both G2/M and autophagy, and both programs have repressive effects on MCH; whereas for R**B, the positive regulator effects on G2/M and the negative regulator effects on autophagy partially cancel the effects of each other on MCH. Indeed, consistent with this model, we saw that both R**A and R**B have significantly negative γs on average, but R**A is much more strongly negative (Fig. 4i).
These observations emphasize the need for joint modelling of programs and show that the observed effect sizes of regulators on a trait can be modelled as sums of regulatory effects mediated through key pathways. A different form of crosstalk between programs, involving a negative-feedback loop affecting RDW, as well as the distinct relationships of program genes and their regulators with a trait, is discussed in detail in the Supplementary Note.
Validation with GWAS and trans-eQTL
Although the enrichment of GWAS hits to regulators was modest (Fig. 3b), we hypothesized that we might find consistent regulatory effects of GWAS variants on the core pathways if we take the direction of effect into account. We utilized trans-eQTL effects in peripheral blood[14](https: