Main
TEs that integrate into or near genes often disrupt gene function. To limit such disruptions, organisms have epigenetic systems that inhibit TE expression and replication1,2,3. Despite these systems, TEs constitute a substantial (3–80%) fraction of extant genomes4, which indicates that TE-silencing systems are not infallible. Whether organisms also have systems that protect genes from disruption—after TE-silencing systems fail and TEs mobilize into genes—is not well understood.
DNA transposons are TEs that typically have a transposase-encoding gene flanked by inverted terminal repeat elements (ITRs)5. Insertion of DNA transposons into plant or animal genes does not always destroy host-gene function, despite the TE-based introduction of premature stop codons (PTCs) into these genes6,7,8,9. A probable explanation for this paradox was suggested by the observation that DNA transposons can be imprecisely excised from mRNAs in animals6. We wondered whether the excision of TEs from mRNA is an active, host-mediated process that perhaps evolved to protect genes from TE-mediated disruption. Therefore, we set out to explore this possibility.
RSD-3 is an epsin N-terminal homology (ENTH) domain protein required for RNA interference (RNAi) in C. elegans10. RNAi-mediated depletion of dpy-6 mRNA causes an RSD-3-dependent Dumpy (Dpy) phenotype (Fig. 1a). Tc1 is the most active and abundant DNA transposon in C. elegans11. A Tc1 insertion in the first coding exon of rsd-3 (rsd-3(pk2013), henceforth Tc1::rsd-3) does not abolish rsd-3 function in RNAi9 (Fig. 1a) despite the introduction of PTCs. This is because Tc1 is excised from rsd-3 mRNA7. We first used nanopore long-read sequencing to verify that Tc1 is excised from rsd-3 mRNA (Fig. 1b). We also sequenced RNA from animals in which Tc1 was mobilized into coding exons of five other C. elegans genes to assess the generality of this process (Extended Data Fig. 1). To capture all Tc1 excision events, including those that create out-of-frame mRNAs, RNA from smg-2(–);Tc1::rsd-3 animals, which are defective for nonsense-mediated decay12, was sequenced (Fig. 1b). Finally, in vitro synthesized rsd-3 RNAs, which did or did not contain Tc1, were included to ensure that Tc1 excision was not an in vitro artefact of library preparation or sequencing (Fig. 1b). The analysis showed that in all cases, Tc1 was efficiently excised from 90–100% of its host mRNAs in vivo (Fig. 1b and Extended Data Fig. 1). Moreover, PCR with reverse transcription (RT–PCR) analysis did not detect evidence of Tc1 backsplicing to generate circular Tc1 RNA (Extended Data Fig. 1). Also, in all cases, Tc1, which exhibits a low rate of transposition from DNA13, was present in DNA but not mRNA in these animals (Fig. 1c and Extended Data Fig. 1), which indicated that Tc1 excision occurred at the level of RNA. TE excision from mRNA exhibited the following properties, results that support and extend previous studies6,7: (1) excision occurs at multiple sites in or near Tc1 ITRs (Fig. 1d); (2) a subset of the Tc1-excised mRNAs are in-frame (Fig. 1d); (3) excision sites only rarely map to consensus spliceosomal GU-AG14 splice sites (Fig. 1e); and (4) excision leaves short insertions or deletions (indels) in repaired mRNAs (Fig. 1d). Henceforth, we refer to the removal of a TE from its host mRNA as SOS splicing.
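Properties (2) and (3) above can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in this study. The function names, the toy sequences and the read counts are hypothetical, and frame preservation is judged only by whether the net indel length is a multiple of three nucleotides. The sketch does not check whether the retained sequence itself introduces a stop codon.

```python
from typing import Iterable, Tuple


def is_frame_preserving(wild_type_len: int, spliced_len: int) -> bool:
    """True when the net indel left after excision is a multiple of 3 nt."""
    return (spliced_len - wild_type_len) % 3 == 0


def has_gu_ag_borders(unspliced_rna: str, start: int, end: int) -> bool:
    """True when the excised segment unspliced_rna[start:end] begins with GU
    and ends with AG, as expected for a canonical spliceosomal intron."""
    excised = unspliced_rna[start:end].upper().replace("T", "U")
    return len(excised) >= 4 and excised.startswith("GU") and excised.endswith("AG")


def in_frame_fraction(isoforms: Iterable[Tuple[int, int]], wild_type_len: int) -> float:
    """Fraction of reads whose isoform preserves the reading frame.

    `isoforms` holds (spliced_length, read_count) pairs (hypothetical input format).
    """
    isoforms = list(isoforms)
    total = sum(count for _, count in isoforms)
    if total == 0:
        return 0.0
    kept = sum(count for length, count in isoforms
               if is_frame_preserving(wild_type_len, length))
    return kept / total


# Toy example (made-up sequence, not a real Tc1 insertion).
toy_rna = "AUGGCC" + "GUAAAUUUAG" + "GGAUAA"
canonical = has_gu_ag_borders(toy_rna, 6, 16)      # excises GUAAAUUUAG
non_canonical = has_gu_ag_borders(toy_rna, 5, 14)  # excises CGUAAAUUU
fraction = in_frame_fraction([(297, 2), (301, 98)], wild_type_len=300)
```

With these toy inputs, an isoform that is 3 nt shorter than the uninterrupted mRNA counts as in-frame, whereas one that is 1 nt longer does not.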
Fig. 1: SOS splicing excises DNA transposons from C. elegans mRNAs.
a, Left, schematic of wild-type (WT), rsd-3-null (–) or animals with Tc1 insertions treated with dpy-6 dsRNA (RNAi). Right, the percentage of DPY animals caused by dpy-6 RNAi (mean ± s.d., n = 3 biological replicates). One-way analysis of variance (ANOVA; two-tailed) with Tukey’s test was used to compare each group with the WT. b, Nanopore long-read sequencing of in vitro-synthesized rsd-3 RNAs with or without Tc1, and RNA from animals of indicated genotypes. PCR amplicons were generated with primers flanking the Tc1 insertion site. SOS splicing efficiency and borders of 5′ and 3′ splice sites (SS) are shown, with common SOS SS underlined. Tc1::rsd-3 is rsd-3(pk2013). c, SOS splicing visualized by automated DNA electrophoresis via Agilent TapeStation (henceforth, TapeStation) of amplicons from gDNA (left) or cDNA (RT–PCR) (right). WT (N2) (–Tc1) served as the negative control. Unspliced and spliced amplicons are indicated. d, Summary of SOS splicing events in Tc1::rsd-3 (rsd-3(pk2013)), smg-2(–);Tc1::rsd-3 and five other exonic Tc1 insertions (Extended Data Fig. 1). In-frame and out-of-frame isoforms and the total in-frame percentage are shown. e, Left, SOS splice sites for C. elegans smg-2(–);Tc1::rsd-3 and Tc3::unc-22(r750) (this figure), Tc4::ced-4(n1416) (Extended Data Fig. 1), Tc1-traΔ::ITRscr::rsd-3 (Fig. 2) and HSMAR2–GFP in human HEK293T cells (Fig. 4). A total of 3,028, 4,030, 5,100, 160 and 759 reads were analysed. Right, spliceosomal splice sites for 10,000 random C. elegans introns. f, Nanopore sequencing of RNA from Tc3::unc-22(r750) animals with SOS splicing efficiency indicated. g, TapeStation analysis of Tc3::unc-22(r750) gDNA (left) or cDNA (RT–PCR) (right). WT (N2) (–Tc3) served as the negative control. h, SOS splicing isoforms from f, ranked by abundance. Isoforms with >1% of total reads are shown. In-frame percentage is indicated. Tc3::unc-22 is Tc3::unc-22(r750). Tc3::unc-22(r750) animals exhibit an Unc phenotype15, consistent with the low (2%) percentage of in-frame Tc3::unc-22(r750) RNAs.
The analysis of additional TE insertions revealed further features of SOS splicing. First, Tc1 located in the 3′ untranslated region (UTR) of an mRNA was subjected to SOS splicing. However, two intronic Tc1 insertions were not (Extended Data Fig. 2). Second, two additional DNA transposons (Tc3 and Tc4), which do not share homology with Tc1, were also subjected to SOS splicing when present in exons of C. elegans genes (Fig. 1f–h and Extended Data Fig. 1). For the Tc3::unc-22 allele, 98% of SOS spliced RNAs were out-of-frame (Fig. 1h), which may explain why Tc3::unc-22 animals exhibit an uncoordinated (Unc) phenotype despite SOS splicing15. Finally, when Tc1 was inserted into eight different coding sites in the rsd-3 gene, all eight insertions were subjected to SOS splicing. However, SOS splicing only rescued rsd-3 function in RNAi when Tc1 was inserted into regions of rsd-3 that are poorly conserved (that is, not the ENTH domain) (Fig. 1a and Extended Data Fig. 3). Together, these data suggest that SOS splicing operates in C. elegans only on DNA transposons located in mRNA. However, SOS splicing will only rescue gene function when it generates in-frame mRNAs and when the indels it leaves do not disrupt essential protein functions.
We wondered how C. elegans might identify transposon-containing RNAs to initiate SOS splicing. We engineered an mScarlet gene that contains a Tc1 element (Tc1::NmScarlet, where NmScarlet is nuclear mScarlet) that, if subjected to SOS splicing, would produce mScarlet signals localized to the nucleus (Fig. 2a). Injection of this reporter gene into the germline of adult C. elegans resulted in progeny that expressed mScarlet, which indicated that SOS splicing had occurred (Fig. 2a). RNA sequencing confirmed that the Tc1::NmScarlet mRNA was SOS spliced (Fig. 2a). Deleting all sequences, including transposase, located between the ITRs of Tc1::NmScarlet did not interfere with Tc1::NmScarlet expression, which indicated that these sequences are not necessary for triggering SOS splicing (Fig. 2b). Similarly, deletion of the transposase gene from a genomic Tc1::rsd-3 locus (Tc1-traΔ::rsd-3) did not affect SOS splicing, as indicated by nanopore sequencing (Fig. 2c), RT–PCR analysis (Fig. 2d and Extended Data Fig. 4) and RSD-3 functional analyses (Extended Data Fig. 4). However, deleting one or both 54-nucleotide ITRs (ITRΔ) from Tc1::NmScarlet abrogated mScarlet expression (Fig. 2b). Moreover, deleting an ITR element from chromosomally integrated Tc1::rsd-3, Tc1-traΔ::rsd-3 or Tc1::unc-54::Ngfp (henceforth Tc1::Ngfp) genes prevented SOS splicing of Tc1 from its host mRNAs (Fig. 2d and Extended Data Fig. 4). We conclude that ITRs are necessary for SOS splicing.
Fig. 2: SOS splicing is a pattern-recognition system triggered by the RNA structure.
a, Top, schematic of the assay to explore SOS splicing in C. elegans. Plasmids with Tc1::NmScarlet (NmSc) were injected into adult C. elegans (P0). Middle, SOS splicing was visualized using nanopore sequencing of RNA isolated from F1 progeny. Total in-frame percentage is shown. Bottom, SOS splicing was associated with mScarlet expression in F1 progeny. b, Indicated constructs were injected and the percentage of F1 progeny expressing mScarlet signal was quantified. Left, schematic of injected constructs, which all contain NmScarlet. Variants retaining the ability to engage in ITR base-pairing are indicated by hairpin schematics. Data are the mean ± s.d. with all data points shown. n = 3 biologically independent experiments. One-way ANOVA (two-tailed) with Tukey’s test was used to compare each group with Tc1::NmScarlet. c, Nanopore sequencing of RNA isolated from Tc1-traΔ::rsd-3 animals. In vitro-synthesized RNAs were included as controls. d, SOS splicing of chromosomally integrated Tc1-traΔ reporter variants detected with TapeStation. Top, schematic of variants. RT–PCR amplicons for indicated Tc1-traΔ variants are shown. Unspliced and spliced amplicons are indicated with arrowheads. Variants undergoing SOS splicing are indicated with dashed rectangles. Scr, scrambled ITR. e, Nanopore sequencing of RNA isolated from Tc1-traΔ::ITRscr::rsd-3 animals. In vitro-synthesized Tc1-traΔ::ITRscr::rsd-3 RNA is shown. Common SOS splice sites (SS) are underlined. f, SOS splicing isoforms observed in e, ranked by abundance. Isoforms representing >2% of total reads are shown. The total in-frame percentage is indicated. The diagram in a was created using BioRender (https://www.biorender.com).
ITRs are able to base-pair to form double-stranded RNA (dsRNA) hairpin structures, which might be the signal that induces SOS splicing. Consistent with this idea, we observed adenosine to inosine editing of transposon ITRs, which supports the idea that ITRs base-pair in vivo16 (Extended Data Fig. 4). Moreover, SOS splicing of Tc1::NmScarlet required that ITRs are inverted with respect to each other, which suggests that base-pairing of ITRs is needed for SOS splicing to occur (Fig. 2b). To test this idea, we scrambled the Tc1 ITR sequence (ITRscr) and replaced the ITR elements of Tc1::NmScarlet and Tc1-traΔ::rsd-3 with ITRscr in an inverted orientation. Thus, in these ITRscr genes, the ITR sequence is scrambled; however, scrambled ITRs retain the ability to base-pair with each other. mScarlet expression was observed in Tc1-ITRscr::NmScarlet animals (Fig. 2b). Moreover, nanopore sequencing (Fig. 2e,f) and RT–PCR analysis (Fig. 2d) revealed that SOS splicing occurred on Tc1-traΔ::ITRscr::rsd-3 mRNA, albeit with reduced efficacy compared with wild-type ITRs. These data suggest that SOS splicing is a pattern-recognition system triggered by base-pairing of inverted repeats, which are a defining feature of DNA transposons.
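The logic of the scrambled-ITR design can be illustrated with a minimal Python sketch: a shuffled repeat paired with its own reverse complement loses the original sequence but keeps the ability to base-pair, whereas two copies in the same (direct) orientation generally cannot pair. This is a sketch under stated assumptions and not the procedure used to design the ITRscr constructs. The function names and the 12-nucleotide toy sequence are hypothetical, and only perfect Watson–Crick pairing is considered (G–U wobble pairs and mismatches are ignored).

```python
import random
from typing import Tuple

# Watson-Crick complements for RNA bases.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(_COMPLEMENT)[::-1]


def scrambled_inverted_repeats(itr: str, seed: int = 0) -> Tuple[str, str]:
    """Shuffle the bases of `itr` and return (left_arm, right_arm), where the
    right arm is the reverse complement of the shuffled left arm."""
    rng = random.Random(seed)  # seeded so the example is reproducible
    bases = list(itr)
    rng.shuffle(bases)
    left_arm = "".join(bases)
    return left_arm, reverse_complement(left_arm)


def can_form_hairpin(left_arm: str, right_arm: str) -> bool:
    """True when the two arms are perfectly complementary in antiparallel
    orientation, so that they could fold back on each other."""
    return right_arm == reverse_complement(left_arm)


# Toy repeat (made-up sequence, not the 54-nucleotide Tc1 ITR).
toy_itr = "ACAGUGCUGGCC"
left, right = scrambled_inverted_repeats(toy_itr, seed=1)
inverted_pairs = can_form_hairpin(left, right)        # inverted orientation
direct_pairs = can_form_hairpin("GGAUCCAA", "GGAUCCAA")  # direct orientation
```

In this sketch, the inverted arrangement pairs by construction, whereas the direct-repeat example does not, which mirrors the requirement that ITRs be inverted with respect to each other.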
If SOS splicing is an active host response to DNA transposons interrupting an mRNA, then a genetic screen could identify host factors that mediate SOS splicing. We generated C. elegans that express two SOS splicing reporter genes: Tc1::Ngfp and Tc1::rsd-3. Tc1::Ngfp;Tc1::rsd-3 animals expressed GFP and were competent for RNAi (see below), because both TE-containing mRNAs were, as expected, subjected to SOS splicing (Fig. 1b and Extended Data Fig. 4). utp-20 RNAi causes developmental arrest at the larval stage17. We mutagenized Tc1::Ngfp;Tc1::rsd-3 animals and screened around 100,000 haploid genomes to identify 20 mutant animals that did not exhibit arrested development when subjected to utp-20 RNAi and did not express GFP (Fig. 3a). Genome sequencing identified candidate SOS splicing genes in the 20 mutants. We identified 8 mutations in the gene C07H6.8, 11 mutations in sut-2 and 1 mutation in F15D4.2 (Fig. 3b). The chromosomal locations of C07H6.8, sut-2 and F15D4.2 were consistent with positional mapping data (Extended Data Fig. 5). CRISPR–Cas9-based introduction of stop codons into C07H6.8, sut-2 or F15D4.2 resulted in animals that did not express GFP or RSD-3 from Tc1::Ngfp or Tc1::rsd-3, respectively (Fig. 3c). These mutants were also defective for SOS splicing of ITR-containing mRNAs (Fig. 3d and Extended Data Fig. 6), a result that confirmed that C07H6.8, SUT-2 and F15D4.2 are required for rescuing the function of TE-interrupted genes through SOS splicing. We refer to C07H6.8 and F15D4.2 henceforth as akap-17 and caap-1, respectively, for reasons outlined below. akap-17 encodes a protein with an arginine-rich and serine-rich (RS) domain, two protein kinase A-anchoring domains (PBDs) and an RNA recognition motif (RRM) (Fig. 3b). 
The putative mammalian orthologue of AKAP-17 is A-kinase anchoring protein 17 A (AKAP17A; also known as XE7), which is a nuclear speckle-localized protein linked to alternative splicing of reporter minigenes in human cells18,19. The putative mammalian orthologue of CAAP-1 is CAAP1, which promotes chemotherapeutic resistance20,21,22 and may22,23 or may not24 regulate apoptosis. The putative mammalian orthologue of SUT-2 is MSUT2 (also known as ZC3H14 and NAB2), which is a poly(A) RNA-binding and RNA-regulating protein25,26 linked to tauopathy resistance27,28 and circular RNA biogenesis29. The role of SUT-2 in SOS splicing is not explored further here because SUT-2 was not linked to SOS splicing until after most of this work had been completed. We conclude that AKAP-17, SUT-2 and CAAP-1 are conserved proteins required for SOS splicing in C. elegans.
Fig. 3: Identification of three C. elegans factors required for SOS splicing.
a, Schematic of the genetic screen used to identify SOS splicing factors. Animals with Tc1::rsd-3;Tc1::Ngfp (nuclear GFP) were mutagenized with ethyl methanesulfonate (EMS) and F2 progeny were treated with utp-20 RNAi. Mutants that did not arrest at the developmental stage were isolated, and lineages lacking GFP expression were identified. b, Alleles of akap-17, caap-1 and sut-2 identified in the screen (red arrowhead) and generated by CRISPR–Cas9 (black arrowhead). Asterisks indicate stop codons. M1X, initiating methionine mutation; NLS, nuclear localization signal; S2NH, SUT-2 N-terminal homology; SAV, splice acceptor variant; SDV, splice donor variant; ZFD, zinc finger domain. c, RNAi responsiveness (left) and GFP expression (right) in animals with the indicated genotypes. Boxed regions are magnified at the far right. Scale bar, 50 μm. Data are the mean ± s.d. n = 3 biological replicates. One-way ANOVA (two-tailed) with Tukey’s test was used to compare each group with WT. d, TapeStation analysis showing defective SOS splicing in akap-17(gg911), caap-1(gg981) and sut-2(gg1074) animals. Unspliced and spliced amplicons are indicated. e, Confocal images of L4 C. elegans with mScarlet-tagged AKAP-17 or CAAP-1. These animals produce nuclear GFP (NGFP) from the Tc1::Ngfp SOS reporter. Scale bar, 10 μm. f, RIP–seq analysis of endogenous SOS splicing targets. Normalized fold change (log2) versus reproducibility scores from two biological repeats. Results show non-enriched RNAs (blue), reporter RNAs and validated endogenous RNAs (red) and unvalidated endogenous RNAs (pink; fold change > 2, reproducibility > 0.55). g, Nanopore sequencing of W07G4.3 3′ UTR, containing Tc5A-derived inverted repeats, using RNA from WT animals. SOS splicing isoforms are ranked by abundance, with isoforms > 2% of total reads shown. h, TapeStation detection of W07G4.3 3′ UTR SOS splicing. AKAP-17 is required for W07G4.3 3′ UTR SOS splicing. Unspliced and spliced amplicons are indicated. The diagram in a was created using BioRender (https://www.biorender.com).
We introduced amino-terminal mScarlet tags to C. elegans akap-17 and caap-1. The tags did not affect AKAP-17 or CAAP-1 function, as evidenced by proficient SOS splicing in these animals (Extended Data Fig. 7). mScarlet::AKAP-17 and mScarlet::CAAP-1 fluorescence was observed in nuclei of all to most C. elegans cells in both the soma and germline and at all stages of development (Fig. 3e and Extended Data Fig. 7). These data suggest that SOS splicing is a nuclear process that may be active in all C. elegans cell types.
Illumina-based RNA sequencing of RNA from wild-type or akap-17(–) animals identified SOS spliced Tc1::Ngfp RNAs, which depended on AKAP-17 for their production (Extended Data Fig. 7). No obvious defects in canonical splicing, however, were observed in akap-17 mutants (Extended Data Fig. 7), which suggests that the role of AKAP-17 in RNA splicing is focused largely on SOS splicing. We performed RNA immunoprecipitation with sequencing (RIP–seq) of AKAP-17 to try to identify endogenous targets of SOS splicing. RIP–seq identified Tc1::Ngfp and Tc1::rsd-3 as AKAP-17-associated mRNAs (Fig. 3f). Directed RT–qPCR analyses confirmed that AKAP-17 co-precipitated with Tc1-containing mRNAs but not with control mRNAs that lack Tc1 (Extended Data Fig. 6). RIP of AKAP-17 also identified about 20 other AKAP-17-associated RNAs (Fig. 3f). We tested three of these RNAs, which published nanopore sequencing data suggested might be targets of SOS splicing30. We observed complete (Fig. 3g,h) or partial (Extended Data Fig. 6) AKAP-17-dependent SOS splicing in all three RNAs, which occurred near inverted repeat elements. In these RNAs, SOS splicing removed part of a presumed pseudogene exon or parts of 3′ or 5′ UTR elements of protein-coding mRNAs (Fig. 3g,h and Extended Data Fig. 6). In two cases, the SOS-spliced inverted repeats were derived from non-autonomous DNA transposons (Fig. 3g and Extended Data Fig. 6), whereas in the third case, no link between the SOS spliced inverted repeats and transposons could be identified (Extended Data Fig. 6). These data show that AKAP-17 associates with ITR-containing mRNAs, which implies that AKAP-17 could be directly involved in the SOS splicing process. The data also showed that C. elegans mRNAs, which contain TE-derived or non-TE-derived inverted repeats, are subjected to SOS splicing, thereby suggesting that SOS splicing has a potential role in gene regulation during growth and development. 
Incidentally, akap-17(–) animals exhibited an increased rate of excision of Tc1 from chromosomes, which indicated that SOS splicing may also have a role in limiting transposon activity (Extended Data Fig. 6).
Given that AKAP-17, SUT-2 and CAAP-1 are conserved in mammals, we wondered whether SOS splicing might also occur in human cells. To address this question, we constructed plasmids that express a GFP gene interrupted by either C. elegans Tc1 (Tc1–GFP) or the human hsMar2 DNA transposon (HSMAR2–GFP). These reporter genes would not be expected to produce GFP owing to introduced PTCs, unless a SOS splicing system was operational in human cells. Transfection of either plasmid into HEK293T cells resulted in cells that expressed GFP, albeit at lower levels than a control construct lacking a transposon (Fig. 4a). RT–PCR analysis (Extended Data Fig. 8) and nanopore sequencing (Fig. 4b,c) showed that both transposons were excised from their host mRNAs in human cells. The molecular hallmarks of TE excision in human cells resembled those of SOS splicing in C. elegans. That is, TE excision was efficient and was not restricted to spliceosome-associated GU-AG splice sites (Fig. 4c and Extended Data Fig. 8). Notably, although excision sites for both Tc1 and HSMAR2 occurred near their respective ITRs, the precise sites of excision differed. Tc1 excision occurred in Tc1 ITRs, whereas most of the HSMAR2 excision sites occurred 100–393 nucleotides distal to HSMAR2 ITRs (Fig. 4b). Swapping the ITRs of Tc1–GFP and HSMAR2–GFP did not alter these patterns of SOS splicing (Extended Data Fig. 8), which indicated that sequences located outside ITR elements can influence where SOS splicing will occur. Taken together, these data suggest that a SOS splicing-like system is active in human cells.
Fig. 4: SOS splicing in human cells.
a, Representative fluorescence micrographs of HEK293T cells transfected with indicated human SOS splicing reporter plasmids. Left, schematic of transfected reporter constructs. Short and long exposure images show that SOS splicing can restore GFP expression, but only to a low level, which is probably due to the low percentage of in-frame SOS splicing. Scale bar, 20 μm. b, Nanopore sequencing of RNA isolated from AKAP17A+ (WT) or AKAP17A– HEK293T cells transfected with the Tc1–GFP SOS splicing reporter (top) or the HSMAR2–GFP reporter (bottom). See Extended Data Fig. 8 for data demonstrating that cells are AKAP17A–. Bottom, the source of the small deletion seen in HSMAR2–GFP RNAs in AKAP17A– cells is not known. c, SOS splicing isoforms detected in Tc1–GFP mRNA from WT HEK293T cells, ranked by abundance. Isoforms representing >0.2% of total reads are shown, with those >0.5% highlighted. Total in-frame percentage is indicated. d, Flow cytometry of AKAP17A+ and AKAP17A– HEK293T cells transfected with the indicated Tc1–GFP variant constructs. mCherry-expressing plasmid was co-transfected, and mCherry-positive cells were selected for GFP expression analysis. Non-transfected cells served as the negative control. Tc1–GFP constructs that retain ITR base-pairing capability are indicated. Left, schematic of variant reporters. e, Flow cytometry of HEK293T cells with a chromosomally integrated Tc1–GFP SOS splicing knock-in reporter gene. Top, AKAP17A+ or AKAP17A– cells, and AKAP17A– cells complemented with AKAP17A+ via lentiviral transformation. Bottom, CAAP1+ cells and two independently derived clones (1 and 2) of CAAP1– HEK293T cells. See Extended Data Fig. 8 for data showing that cells are CAAP1–.
To explore this idea further, we asked whether ITR elements are necessary and sufficient for TE excision from reporter mRNAs in human cells, similar to C. elegans. Indeed, structure–function analyses of the Tc1–GFP reporter gene showed that the ITRs of Tc1–GFP were necessary and sufficient to induce TE excision from Tc1–GFP. Moreover, excision occurred independently of the underlying ITR sequence (Fig. 4d). We next asked whether the putative mammalian orthologues of AKAP-17 or CAAP-1 are needed for TE excision in human cells. We inserted Tc1–GFP or HSMAR2–GFP reporter genes into the AAVS1 safe-harbour site31 in HEK293T cells. We confirmed that Tc1–GFP and HSMAR2–GFP expressed GFP using flow cytometry (Fig. 4e and Extended Data Fig. 8). We then deleted all copies of AKAP17A or CAAP1 from Tc1–GFP or HSMAR2–GFP cells (Extended Data Fig. 8) and observed that homozygous AKAP17A– and CAAP1– cells no longer expressed GFP (Fig. 4e and Extended Data Fig. 8). Reintroduction of a wild-type copy of AKAP17A into AKAP17A– cells using lentiviral transduction rescued GFP expression from Tc1–GFP and HSMAR2–GFP cells (Fig. 4e and Extended Data Fig. 8). These data show that SOS splicing occurs in human cells and that orthologous proteins and related RNA structures mediate SOS splicing in humans and nematodes.
SOS splicing was recalcitrant to the spliceosome inhibitor pladienolide B (PladB)32 (Fig. 5a). This result, along with the diversity and non-GU-AG nature of many SOS splices (Fig. 1), suggests that SOS splicing is not mediated by the spliceosome. Note that in Extended Data Fig. 9, we present evidence that SOS splicing and the spliceosome can, in specific contexts, act cooperatively to splice RNAs. To further understand the mechanism of SOS splicing, we conducted immunoprecipitation with mass spectrometry (IP–MS) on human CAAP1. IP–MS of CAAP1 from HEK293T cells identified AKAP17A and MSUT2 and the RNA ligase RTCB as candidate CAAP1-interacting proteins (Fig. 5b). Some tRNAs possess introns that are excised by the endonucleases TSEN2 and TSEN34 (ref. 33). The resultant tRNA fragments are ligated by RTCB to generate mature tRNAs34. RTCB also ligates XBP1 mRNA fragments generated during the unfolded protein response35. We wondered whether RTCB might contribute to SOS splicing, perhaps by ligating mRNA fragments created by TE excision. Directed co-immunoprecipitation analyses confirmed the CAAP1 IP–MS results, whereby CAAP1–RTCB and CAAP1–AKAP17A co-immunoprecipitated in both C. elegans and in human cells (Fig. 5c,d and Extended Data Fig. 10). AlphaFold3 simulations showed that CAAP1–RTCB and CAAP1–AKAP17A may interact directly (Extended Data Fig. 10; ipTM precision of prediction values of 0.23–0.47). We used the tripartite split-GFP system to assess where in the cell RTCB might interact with the other SOS splicing factors36. We co-expressed the following constructs: (1) AKAP17A fused to the 10th β-strand of GFP (GFP10); (2) RTCB fused to the 11th β-strand of GFP (GFP11); and (3) β-strands 1–9 of GFP (GFP1–9) in HEK293T cells. In these cells, we observed GFP fluorescence in nuclear foci, which suggested that AKAP17A–RTCB interact in these foci (Fig. 5e). GFP fluorescence was not observed when GFP β-strands were not fused to AKAP17A or RTCB (Fig. 5e). 
Because CAAP1 associates with both RTCB and AKAP17A, we wondered whether CAAP1 might be necessary for RTCB and AKAP17A to interact. Indeed, in CAAP1– cells, RTCB and AKAP17A did not interact via split-GFP (Fig. 5e). Moreover, transiently transfected AKAP17A and RTCB could be co-immunoprecipitated from HEK293T cells, but only when CAAP1 was co-expressed in these cells (Fig. 5f). Thus, AKAP17A and RTCB interact in the nucleus and CAAP1 promotes this interaction. Moreover, because AKAP17A localizes to nuclear speckles18, the interaction between RTCB and AKAP17A probably occurs in nuclear speckles. Related split-GFP analyses showed that CAAP1–AKAP17A interact in nuclear foci and that RTCB–CAAP1 interact throughout cells (Fig. 5e). Taken together, these data suggest that SOS splicing is likely to occur in nuclear speckles in human cells and that CAAP1 recruits the RNA ligase RTCB to these foci.
Fig. 5: The RNA ligase RTCB is required for SOS splicing.
a, TapeStation analysis of amplicons from cDNA (RT–PCR) isolated from HEK293T cells transfected with plasmids for Tc1–GFP, HSMAR2–GFP or DNAJB1. Two days after transfection, cells were treated with 100 nM of the splicing inhibitor PladB, and total RNA was collected at 0, 1, 2 and 3 h after treatment. Splicing of endogenous (endo) DNAJB1 and BRD2 is shown. Unspliced and spliced amplicons are indicated, and intron retention changes compared with 0 h are indicated. b, STRING network analysis of CAAP1 IP–MS results from HEK293T cells. Factors identified in the genetic screen (Fig. 3) and the RNA ligase RTCB are highlighted in bold. c,d, Co-immunoprecipitation (co-IP) assay showing interactions between AKAP17A and CAAP1 (c) and between RTCB and CAAP1 (d) in HEK293T cells transfected with plasmids expressing the indicated tagged proteins. GAPDH served as the loading control. Asterisk indicates a nonspecific band. e, Top, schematic of tripartite split-GFP system. Bottom, representative images showing interactions between AKAP17A, CAAP1 and RTCB. Nuclei were stained with Hoechst 33342. Scale bar, 10 μm. f, Co-IP assay showing AKAP17A–RTCB interactions, observed only with CAAP1 co-transfection. GAPDH served as the loading control. Asterisk indicates a nonspecific band. g, Tc1::Ngfp SOS splicing reporter expression in WT, RTCB-1(C122A) or RTCB-1(H428A) C. elegans. Boxed regions are magnified on the right. Scale bar, 50 μm. h, Flow cytometry of GFP expression in Tc1–GFP reporter knock-in HEK293T cells treated with esiRNA targeting RTCB (siRTCB) or Renilla luciferase (siNC). i, TapeStation analysis of amplicons from RNA extracted from HEK293T cells treated with esiRNA targeting RTCB (KD) or Renilla luciferase (NC) and transfected with Tc1–GFP reporter plasmid. Unspliced and spliced amplicons are indicated. j, Model for SOS splicing. ITR hairpins initiate SOS splicing, and an unknown process (represented by scissors) excises TEs. The AKAP17A–CAAP1–RTCB complex seals the SOS spliced mRNA.
Finally, we asked whether RTCB is required for SOS splicing. Note that RTCB is essential for viability owing to its role in tRNA splicing; therefore, it probably would not have been identified by our genetic screen. We used CRISPR–Cas9 to alter two residues in C. elegans RTCB (RTCB-1(C122A,H428A)), which are required for RTCB-based RNA ligation37,38. Heterozygous rtcb-1(C122A/+) and rtcb-1(H428A/+) animals were isolated and their RTCB-1(C122A) or RTCB-1(H428A) homozygous progeny were arrested at larval stage three (L3) of development (Extended Data Fig. 10). Homozygous RTCB-1(C122A) or RTCB-1(H428A) progeny, which have the Tc1::Ngfp SOS splicing reporter gene, did not express GFP as L2/L3 animals, which suggests that RTCB-1-based RNA ligation is required for SOS splicing (Fig. 5g). Similar results were obtained with animals that expressed the Tc1::NmScarlet SOS splicing reporter gene and were homozygous for a deletion allele of rtcb-1 (Extended Data Fig. 10). Animals with mutations in the catalytic sites of TSEN-2 or TSEN-34 (tRNA intron endonucleases) exhibited arrested development similar to animals with rtcb-1 mutations (L4 stage) (Extended Data Fig. 10). However, these animals did not exhibit defects in SOS splicing (Extended Data Fig. 10), which suggests that the SOS splicing defects observed in animals with rtcb-1 mutations are not an indirect consequence of failing to splice tRNAs, and that the TE excision step of SOS splicing is mediated by another, unknown process (Discussion). Introduction of mutations into the dsRNA endonucleases Dicer or Drosha also did not abrogate SOS splicing (Extended Data Fig. 10). Finally, knockdown of RTCB using endoribonuclease-prepared short interfering RNA (esiRNA) in HEK293T cells decreased the expression of GFP from Tc1–GFP and HSMAR2–GFP cells (Fig. 5h and Extended Data Fig. 10). Moreover, RT–PCR analysis showed that SOS splicing of Tc1–GFP and HSMAR2–GFP became inefficient when RTCB was depleted (Fig. 5i and Extended Data Fig. 10). Together, these data show that the RNA ligase RTCB promotes SOS splicing in C. elegans and human cells. The data suggest that, in human cells, CAAP1 recruits RTCB to nuclear speckles, where it interacts with AKAP17A to promote SOS splicing by ligating the mRNA fragments generated during the TE-excision step of SOS splicing.
Discussion
Here we described a new mode of mRNA splicing, which we term SOS splicing. We showed that SOS splicing is a conserved pattern-recognition system that detects inverted repeats in mRNAs and excises them.