Abstract
The ability to learn and form memories is critical for animals to make choices that promote their survival. The biological processes underlying learning and memory are mediated by a variety of genes in the nervous system, acting at specific times during memory encoding, consolidation, and retrieval. Many studies have utilised candidate gene approaches or random mutagenesis screens in model animals to explore the key molecular drivers for learning and memory. We propose a complementary approach to identify this network of learning regulators: the proximity-labelling tool TurboID, which promiscuously biotinylates neighbouring proteins, to snapshot the proteomic profile of neurons during learning. To do this, we expressed the TurboID enzyme in the entire nervous system of Caenorhabditis elegans and exposed animals to biotin only during the training step of an appetitive gustatory learning paradigm. Our approach revealed hundreds of proteins specific to ‘trained’ worms, including components of molecular pathways previously implicated in memory in multiple species such as insulin signalling, G-protein-coupled receptor signalling, and MAP kinase signalling. Most (87–95%) of the proteins identified are neuronal, with relatively high representation for neuron classes involved in locomotion and learning. We validated several novel regulators of learning, including cholinergic receptors (ACC-1, ACC-3, LGC-46) and putative arginine kinase F46H5.3. These previously uncharacterised learning regulators all showed a clear impact on appetitive gustatory learning, with F46H5.3 showing an additional effect on aversive gustatory memory. Overall, we show that proximity labelling can be used in the brain of a small animal as a feasible and effective method to advance our knowledge on the biology of learning.
Introduction
All animals with a brain have the capacity to change their behaviour in response to changes in the environment. This capacity – to learn and remember – is essential for survival. There are numerous structural and molecular changes in the brain that modulate learning and memory in specific brain regions, occurring in a time- and context-dependent manner (examples in Huckleberry et al., 2016; Lin et al., 2010; Peixoto et al., 2015; Watteyne et al., 2020; reviewed in Bailey et al., 2015). Research using model organisms has been essential to understanding the key regulatory mechanisms underlying learning, many of which involve neurotransmitter signalling, neuromodulator signalling, signal transduction pathways, and cytoskeletal dynamics (Rahmani and Chew, 2021; Peng et al., 2011; Lamprecht, 2014). Importantly, many of these mechanisms appear to be conserved across diverse species (Bailey et al., 2015; Matsumoto et al., 2018; Rahmani and Chew, 2021).
Multiple studies have demonstrated that changes in the neuronal proteome are required for learning and memory formation (Inberg et al., 2013; Barzilai et al., 1989; Rosenberg et al., 2014). New protein synthesis appears to be critical in several contexts, as the addition of a protein synthesis inhibitor (e.g. cycloheximide) has been shown to abolish long-term memory (Chen et al., 2012; Pedreira et al., 1995; Hernandez and Abel, 2008). Moreover, protein degradation together with new protein synthesis has been strongly implicated in synaptic plasticity and memory formation (Lee et al., 2008; Fazeli et al., 1993; Park and Kaang, 2019). There is also evidence that local translation in neurons, specifically the synthesis of specific proteins in dendritic regions (thereby altering local proteome composition), plays a key role in learning (Bradshaw et al., 2003; Das et al., 2023; Smith et al., 2005; Sutton and Schuman, 2006). Additionally, several key regulatory proteins have been shown to be required at specific timepoints, such as during the training/learning step, to trigger memory formation (Stefanoska et al., 2023; Watteyne et al., 2020). Taken together, these findings suggest that the spatiotemporal regulation of protein composition within neurons is critical for learning and memory formation.
The molecular requirements for learning have primarily been identified by combining genetic approaches with behavioural paradigms to test learnt associations, typically through assaying candidate genetic mutants or performing a forward genetics screen. These strategies have been extremely insightful; however, they have some limitations. The first is that candidate genetic screens are time-consuming and labour-intensive, and require subjective selection of which candidate genes to test (for example, Hukema, 2006; Stein and Murphy, 2014). The second is that large-scale screens tend to reveal only the genes with the strongest phenotypes, so genes that have more subtle phenotypes (Hiroki and Iino, 2022; Lindsay et al., 2022), or act in redundant pathways (Feng et al., 2010; Gyurkó et al., 2015; Shahmorad, 2015), may not be identified using these approaches despite their contributions to learning.
To overcome these limitations, and to gain a holistic view of the molecular pathways that contribute to learning, we used an objective proteomics approach to snapshot the protein-level changes that occur specifically during learning. To do this, we expressed the proximity-labelling tool TurboID in the entire Caenorhabditis elegans nervous system and used this to identify the proteins present in neurons during the training step of an associative learning paradigm we call ‘salt associative learning’. TurboID is an enzyme based on the BirA* biotin ligase, engineered to provide greater catalytic efficiency (Branon et al., 2018) compared with the original BirA* enzyme used in BioID experiments (Roux et al., 2012). TurboID catalyses a reaction where biotin is covalently added onto lysine residues – as this process requires biotin, its timing can be controlled by depleting tissues of biotin, then adding it exogenously only at specific time points. Additionally, spatial control can be provided by regulating the site of TurboID expression using cell-specific transgenes. TurboID has been used in multiple studies for identification of protein-protein interactions, usually by tagging a ‘bait’ protein N- or C-terminally with the TurboID enzyme, allowing for rapid biotinylation of bait interactors. Through this approach, TurboID has been used for protein-tagging experiments in C. elegans (Artan et al., 2022; Sanchez et al., 2021; Holzer et al., 2021; Hiroki et al., 2022). For example, this approach identified cytoskeletal proteins in C. elegans proximal to the microtubule-binding protein PTRN-1 (Sanchez et al., 2021), and detected interactors for ELKS-1, which localises other proteins to the presynaptic active zone in the nervous system (Artan et al., 2021).
In our study, rather than focusing on specific protein-protein interactions, we expressed TurboID that was not tagged with any bait protein in the entire nervous system of C. elegans to identify as many proteins as possible within the cytoplasm. Using this approach, we identified hundreds of proteins specific to ‘trained’ worms, which we refer to here as the learning proteome, including those in molecular pathways previously shown to contribute to learning and memory formation in worms and other organisms. In addition, we validated several novel regulators of gustatory learning, including cholinergic receptors (ACC-1, ACC-3, and LGC-46), protein kinase A regulator KIN-2, and putative arginine kinase F46H5.3. These proteins all show a clear impact on appetitive gustatory learning. F46H5.3 showed an additional effect on aversive gustatory learning, suggesting a more general role for this kinase in memory encoding. In summary, we have demonstrated that using proximity labelling to snapshot the brain of a small animal during training is a feasible and effective method to further our understanding of the biology of learning.
Results
TurboID expression in the nervous system of C. elegans successfully labels proteins during learning
To model learning in C. elegans, we used a simple yet robust associative learning paradigm called salt associative learning. Briefly, this assay involves training worms to associate the absence of salt (NaCl) with the presence of food. C. elegans is typically grown in the presence of salt (usually ~50 mM) and displays an attraction toward this concentration when assayed for chemotaxis behaviour on a salt gradient (Kunitomo et al., 2013; Luo et al., 2014). Training/conditioning with ‘no salt + food’ partially attenuates this attraction (this group is referred to as ‘trained’). This is because the presence of abundant food (the unconditioned stimulus) is a strong innate attractive cue, and pairing this with ‘no salt’ (the conditioned stimulus) leads to the animals showing the same behaviour towards the conditioned stimulus as they do to the unconditioned stimulus, that is, attraction towards no salt, reflected as a preference for lower salt concentrations (Hiroki et al., 2022; Nagashima et al., 2019). Similar behavioural paradigms involving pairings between salt/no salt and food/no food have been previously described in the literature (Nagashima et al., 2019). Here, learning experiments were performed by conditioning worms with either ‘no salt + food’ (referred to as ‘salt associative learning’) or ‘salt + no food’ (called ‘salt aversive learning’).
To identify the learning proteome, we adapted this learning paradigm to incorporate TurboID-catalysed biotinylation of proteins specifically during the learning/conditioning step. We did this by (1) performing the salt associative learning assay on transgenic animals expressing TurboID in the entire nervous system (Prab-3::TurboID) and (2) adding biotin only when the worms are being trained (i.e. exposed to both food and ‘no salt’ in the ‘trained’ group, or to food and ‘high salt’ concentrations in the ‘high-salt control’ group). As an additional control, we performed the same assay on non-transgenic (non-Tg) animals that do not express TurboID (Figure 1A and B). We then isolated proteins from >3000 whole worms per group for both ‘high-salt control’ and ‘trained’ groups, most of which were subjected to a sample preparation pipeline for mass spectrometry, and some of which were probed via western blotting to confirm the presence of biotinylated proteins. The same pipeline was used to generate five biological replicates.
Summary of the TurboID approach for protein labelling in all C. elegans neurons during learning.
(A) Workflow for mass spectrometry-based analysis. Biotin-depleted animals without (non-transgenic/Non-Tg, red) or with TurboID (transgenic, yellow) were exposed to 1 mM of exogenous biotin during conditioning by pairing food with ‘no salt’ (orange - trained) or ‘high salt’ (blue - control). >3000 worms were used per group/biological replicate (n=5) – a small proportion of each group was tested in a chemotaxis assay to assess learning capacity, while the rest was subjected to sample preparation steps for mass spectrometry (see Materials and methods). Some harvested protein was probed via western blot for the presence of biotinylated proteins or V5-tagged TurboID (see panel C for representative image from replicate 1). (B) The graph shows chemotaxis assay data for Non-Tg/wild type (WT) and transgenic C. elegans following salt associative learning. Each data point represents a ‘chemotaxis index’ (CI) value for one biological replicate (n=8). Each biological replicate includes three technical replicates (26–260 worms/technical replicate). Statistical analysis: Two-way ANOVA and Tukey’s multiple comparisons test (****p≤0.0001; ns = non-significant). Error bars = mean ± SEM. (C) Western blots to visualise V5-tagged TurboID and biotinylated proteins. The left side shows V5-tagged TurboID visualised using 18 µg total protein from naïve worms per lane (39 kDa). Non-Tg protein lysates acted as a negative control. α tubulin was probed as a loading control. The right side shows biotinylated proteins visualised from 25 µg total protein per lane from control (C) or trained (T) worms with streptavidin-horseradish peroxidase (HRP). (D) Venn diagram comparing all proteins assigned an identity by MASCOT from peptides detected by mass spectrometry from transgenic worms. Values represent the number of proteins listed as detected in ‘TurboID, control’ (blue) and ‘TurboID, trained’ (orange). 
These lists were generated by first subtracting proteins identified in corresponding Non-Tg lists and then comparing both control and trained TurboID lists. The overlap represents proteins unique to ‘TurboID, trained’ worms in ≥1 replicate/s that were also detected in ‘TurboID, control’ worms in ≥1 other replicate/s.
Validation of TurboID-catalysed biotinylation was performed in two ways. First, we compared total protein from naïve/untrained animals that are non-Tg versus TurboID-encoding by western blot and probed for V5-tagged TurboID: as expected, we observed expression in transgenic worms only, at the predicted size (39 kDa) (Figure 1C). Second, we tested whether exposure to biotin increased the biotinylation signal in a TurboID-dependent manner. To do this, we quantified the biotinylation signal in (1) naïve non-Tg worms not exposed to biotin, (2) non-Tg C. elegans exposed to biotin for 6 hr, (3) naïve TurboID worms not exposed to biotin, and (4) TurboID animals exposed to biotin for 6 hr. Although background biotinylation was present in worms not treated with biotin, we found that biotin exposure increased the signal 1.3-fold for non-Tg and 1.7-fold for TurboID C. elegans (Figure 1—figure supplement 1). Taken together, these findings indicate that there is increased biotinylation of proteins in the presence of both biotin and the TurboID enzyme.
Mass spectrometry experiments were performed with the following experimental groups per replicate: (1) non-transgenic/non-Tg high-salt control, (2) non-Tg trained, (3) TurboID high-salt control, and (4) TurboID trained. We did not include no-biotin treatment controls due to the practical challenges of handling >4 groups in the combined learning assay/mass spectrometry pipeline, for which >3000 worms are required per group. Therefore, all groups were exposed to biotin during the 6 hr exposure period to food and either high salt (for control) or no salt (for trained; Figure 1A).
To confirm that each experimental group displayed the expected phenotype after training, a portion of worms from all groups was tested using a chemotaxis assay. The chemotaxis index (CI) was used as a readout of learning performance: a positive CI reflects high salt preference, a CI close to 0 represents a more neutral response, and a negative CI represents low salt preference (Figure 1—figure supplement 2). We confirmed after each learning assay that naïve/untrained worms had a strongly positive CI (~0.7–0.9), whereas trained animals showed a lower CI (~0.0). We also performed a learning control (indicated as high-salt ‘control’) in which the presence of food (the unconditioned stimulus, US) is paired with high salt concentrations – worms in this group are attracted to high salt and showed a strongly positive CI (~0.7–0.9; Figure 1B), displaying a similar behaviour to naïve worms. The behavioural change seen in trained animals, versus the naïve and high-salt control groups, represented successful learning as seen in previous studies (Hiroki et al., 2022; Nagashima et al., 2019). This was observed in both non-Tg and transgenic animals, confirming that introducing the transgene did not perturb learning (Figure 1B).
We also confirmed by western blotting that biotinylated proteins could be observed in TurboID-expressing high-salt control and trained groups (Figure 1C). As in other C. elegans studies utilising TurboID, we saw background biotinylation in non-Tg controls; however, this is visually lower compared with groups from TurboID transgenic worms (Figure 1C; Artan et al., 2021; Sanchez et al., 2021). Quantification of the signal within entire lanes showed a 1.1-fold increase in the ‘TurboID, control’ lane compared with the ‘non-Tg, control’ lane, and a 1.9-fold increase in the ‘TurboID, trained’ lane compared with the ‘non-Tg, trained’ lane. For all replicates, we determined that biotinylated proteins could be observed from total TurboID-positive worm lysate by western blotting before proceeding with downstream proteomic experiments (Figure 1—figure supplement 3, Supplementary file 1B).
Our sample preparation methodology for mass spectrometry is based on similar protocols used in C. elegans and other systems (Artan et al., 2022; Sanchez et al., 2021; Prikas et al., 2020). We performed five biological replicates, in line with other C. elegans studies (Artan et al., 2022; Holzer et al., 2021). To examine the learning proteome, we first removed from each ‘TurboID, trained’ list any proteins also present in the corresponding ‘Non-Tg, trained’ sample, generating a protein list specific to ‘TurboID, trained’ animals for each biological replicate. We next subtracted from ‘TurboID, control’ lists any proteins that appeared in ‘Non-Tg, control’ samples to generate a revised ‘TurboID, control’ protein list specific to each replicate. We then compared revised protein lists for ‘trained’ and ‘control’ worms from all biological replicates and examined both unique and shared proteins between these two groups. We found 304 proteins that were shared between ‘trained’ and ‘control’ TurboID groups, 706 proteins unique to the ‘TurboID, trained’ group, and 388 proteins unique to the ‘TurboID, control’ group (Figure 1D). We define the learning proteome as the set of proteins unique to ‘TurboID, trained’ samples. When generating the learning proteome, we categorised proteins as ‘assigned hits’ based on the criterion that at least one unique peptide was identified by the MASCOT search engine for the protein identity from at least one biological replicate. We also examined peptide sequences in our peak lists that were considered ‘unassigned’ by MASCOT: these sequences were not detected as unique for any protein by the software, but specific protein identities could be found by performing a Basic Local Alignment Search Tool (BLAST) query (https://blast.ncbi.nlm.nih.gov/Blast.cgi; see Materials and methods for details). The Venn diagram in Figure 1D shows assigned hits only. Learning proteome lists for both assigned and unassigned hits are in Supplementary file 1C and D.
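The list subtraction and comparison described in this paragraph can be sketched with set operations. This is a minimal illustration under stated assumptions, not the analysis pipeline used in the study: the function name `learning_proteome`, the dictionary keys, and the toy protein identifiers are hypothetical, and we assume that the per-replicate lists are pooled (by union) across replicates before the trained and control groups are compared.

```python
def learning_proteome(replicates):
    """Return (trained_only, shared, control_only) sets of protein IDs.

    `replicates` is a list with one dict per biological replicate. Each dict
    holds four sets of protein IDs under hypothetical group keys.
    """
    trained, control = set(), set()
    for rep in replicates:
        # Step 1: within each replicate, drop background hits that also
        # appear in the matching non-transgenic (non-Tg) sample.
        trained |= rep["turbo_trained"] - rep["nontg_trained"]
        control |= rep["turbo_control"] - rep["nontg_control"]
    # Step 2: compare the pooled trained and control lists.
    return trained - control, trained & control, control - trained


# Toy input with placeholder IDs (not data from the study).
toy = [
    {"turbo_trained": {"P1", "P2", "P3"}, "nontg_trained": {"P3"},
     "turbo_control": {"P2", "P4"}, "nontg_control": set()},
    {"turbo_trained": {"P4", "P5"}, "nontg_trained": set(),
     "turbo_control": {"P6", "P7"}, "nontg_control": {"P7"}},
]
trained_only, shared, control_only = learning_proteome(toy)
```

In the toy input, `P4` is found in trained worms in one replicate and in control worms in another, so it falls in the shared set rather than in the trained-only set.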
We assessed overlap between biological replicates for individual candidates (Figure 1—figure supplement 4) using two mass spectrometry systems: Thermo-Fisher Q-Exactive Orbitrap (‘QE’) and Orbitrap Exploris (‘Exploris’). Candidates detected in multiple replicates comprised 17% of assigned hits in QE runs, 13% in Exploris, and 21–23% when including unassigned hits (Figure 1—figure supplement 4A–D). Of the 1010 assigned QE hits, 17% were also identified with Exploris, increasing to 29% when including all 2065 protein identities (Figure 1—figure supplement 4E–F). Despite modest overlap (<25%), key learning-related pathways (Figure 2, Supplementary file 1) and other biological processes, including metabolic pathways (Figure 3), were consistently represented, supporting the biological relevance of the identified learning proteome.
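As an illustration of how a between-replicate overlap percentage of this kind could be computed, the sketch below counts the fraction of distinct candidates seen in two or more replicates. The function name and the toy identifiers are hypothetical, and the exact definition of overlap used for the figure supplement may differ.

```python
from collections import Counter


def fraction_in_multiple_replicates(replicate_lists):
    """Fraction of distinct candidates detected in at least two replicates."""
    # Count, for each candidate, the number of replicates it appears in.
    counts = Counter(p for rep in replicate_lists for p in set(rep))
    if not counts:
        return 0.0
    recurrent = sum(1 for n in counts.values() if n >= 2)
    return recurrent / len(counts)


# Toy input with placeholder IDs (not data from the study).
toy_runs = [{"A", "B", "C"}, {"B", "D"}, {"B", "C", "E"}]
overlap = fraction_in_multiple_replicates(toy_runs)
```

Here two of the five distinct candidates (`B` and `C`) recur across replicates.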
Molecular pathways previously implicated in associative learning are detected in our learning proteome.
Proteins detected from ‘TurboID, trained’ worm lysates by mass spectrometry are in bold with circles coloured as orange (‘assigned hits’ assigned protein identities by MASCOT) and/or blue (‘unassigned hits’ given protein identities by bulk BLAST searching, but not MASCOT). Darker colours mean the protein has been detected in more than one biological replicate (see legend).
Schematics for metabolic processes represented in the learning proteome.
The molecular pathways above are (A) carbohydrate metabolism (glycolysis and gluconeogenesis) and (B) fatty acid metabolism (via the tricarboxylic acid or TCA cycle). Each protein is a node in white (not detected by TurboID during learning), orange (an ‘assigned hit’), and/or blue (an ‘unassigned hit’) based on mass spectrometry data from ‘TurboID, trained’ worms. Darker colours mean the protein has been detected in more than one biological replicate (see legend).
Examination of the learning proteome reveals known regulators of learning and memory
Our initial analysis of the learning proteome sought to validate our TurboID-based approach by identifying components of biological pathways previously implicated in learning. We then performed a gene ontology (GO) term analysis of ‘cellular component’ to obtain a broad overview of the subcellular localisation of proteins identified in trained animals (Figure 4—figure supplement 1A). To do this, we generated protein-protein interaction (PPI) networks of assigned protein hits within the learning proteome for subcellular components of interest (Figure 4—figure supplement 1B–G), using data from STRING and curated with the Cytoscape ClueGO tool (Supplementary file 1E and F; Bindea et al., 2009). We found that the largest proportion of proteins was categorised as ‘cytoplasmic’ (28.1%), as expected from our approach, which utilised the TurboID enzyme not tagged to any bait protein; this means that we would anticipate the enzyme to be present relatively evenly across the cell body and to catalyse biotinylation of proteins in this space. We saw that an unexpectedly high proportion of proteins were nuclear (18.1%), despite the presence of a nuclear export signal in our TurboID transgene, which should prevent TurboID from entering the nucleus and biotinylating nuclear proteins – this could be due to some ‘leaky’ entry of the enzyme or biotin reactive species into the nucleus, or because some proteins are categorised solely as ‘nuclear’ in the ClueGO database when they are also present in other cellular components. Importantly, we found that a proportion of proteins are categorised as present in neuronal compartments – the pre-synapse (0.5%), cilia/dendrites (2.7%), and the axon/(synaptic) vesicles (4.0%) – as expected from transgenic expression of TurboID in the nervous system.
Learning and memory formation in organisms with brains of varying sizes has been shown to involve key regulatory pathways including signalling via neurotransmitters/neuromodulators, G-protein-coupled receptors (GPCR), the mitogen-activated protein kinase (MAPK) pathway, and the insulin/insulin-like growth factor pathway (Matsumoto et al., 2018; Myhrer, 2003; Rahmani and Chew, 2021). We next categorised proteins within the learning proteome (Figure 2) based on their known roles within these signalling pathways. This included (1) several GPCR components, including the Gi/o protein subunit GOA-1 and Gα protein subunit GPA-2, (2) regulators of insulin signalling including the DAF-2 insulin receptor, phosphoinositide 3-kinase AGE-1, and serine/threonine protein kinases AKT-1 and SGK-1, which were previously reported to modulate salt-based learning in the worm (Tomioka et al., 2006; Sakai et al., 2017), (3) MAPK signalling components including NSY-1/MAPKKK, MEK-2/MAPKK, and MPK-1/MAPK/ERK, (4) cAMP/PKA (protein kinase A) signalling regulators such as the regulatory PKA subunit KIN-2, and (5) multiple components that modulate synaptic vesicle release, including N-ethyl-maleimide sensitive fusion protein NSF1/NSF-1 and syntaxin/SYX-2. In addition, we identified several proteins relevant to glutamate, acetylcholine, and GABAergic signalling. Several components involved in protein synthesis and degradation were also detected in our learning proteome, in line with studies that suggest changes in total protein composition following memory formation (Inberg et al., 2013; Barzilai et al., 1989). These data are summarised in Figure 2, Figure 2—figure supplement 1, and Supplementary file 1F. In summary, the learning proteome includes both known learning regulators and potentially novel candidates that warrant further study. 
We focused on proteins functioning within these pathways of interest in our subsequent investigations (highlighted nodes in Figure 2, Figure 2—figure supplement 1, and Figure 4—figure supplement 1).
In addition, we consistently observed enrichment of two metabolic pathways, fatty acid metabolism via the TCA cycle and carbohydrate metabolism (gluconeogenesis and glycolysis), in multiple biological replicates of mass spectrometry data, uniquely in TurboID-trained animals (Figure 3). These metabolic pathways play essential roles, including in cellular energy production and macromolecule biosynthesis (Krebs and Johnson, 1937; Goetsch and Lu, 1993). Consequently, their disruption can severely impair animal health. For example, knock-down of mitochondrial components involved in the TCA cycle led to larval arrest and/or severely reduced lifespan in C. elegans (Artan et al., 2022; Liao et al., 2022). This limits the capacity to assess these processes in learning using single-gene mutants or knockdown tools. Therefore, our TurboID approach reveals biological pathways potentially involved in memory formation that are not detectable through conventional forward or reverse genetic screens.
Exploring neuron class representation within the learning proteome
Aside from identifying relevant biological networks, we also used data from the learning proteome to identify potential neuron classes involved in memory formation, using four databases. These included the Wormbase Tissue Enrichment Analysis (TEA) Tool (Angeles-Albores et al., 2016), based on Anatomy Ontology (AO) terms, and single-cell transcriptomics data from the C. elegans Neuronal Gene Expression Network (CeNGEN; Taylor et al., 2021).
Firstly, we employed transcriptome databases to check representation of the nervous system within learning proteome data. The CeNGEN database confirmed that 87–95% of assigned hits and 89–92% of all hits (assigned and unassigned hits) from the learning proteome show neuronal expression, that is, were found in at least one neuron in the database (Table 1). It is important to note that CeNGEN was generated using L4 hermaphrodites rather than the young adult hermaphrodites used here for TurboID, and that transcriptomes differ between the two developmental stages (St Ange et al., 2024). The nervous system transcriptome has been characterised for young adult hermaphrodites, highlighting 7873 genes that are enriched in neurons (versus non-neuronal tissues). Neuron-enriched genes were identified using single-nucleus and bulk neuron RNA-Seq techniques, respectively (St Ange et al., 2024; Kaletsky et al., 2016). The nervous system is highly represented in our proteome data; 75–87% of assigned hits and 75–83% of all hits correspond to neuron-enriched genes identified by St. Ange et al. and Kaletsky et al.
Neuron-specific expression within the learning proteome.
Mass spectrometry runs (n=5) were performed with the ThermoFisher Scientific Q-Exactive Orbitrap (‘QE’) and/or ThermoFisher Scientific Orbitrap Exploris (‘Exploris’), for technical reasons. There are six lists because replicate #3 was run on both mass spectrometers: corresponding protein lists are annotated as ‘3a’ and ‘3b’, respectively. The total ‘#Assigned hits’ versus ‘#All hits’ (assigned + unassigned hits) is shown in the corresponding rows of the table. The CeNGEN database (threshold = 2) was used to determine corresponding percentages for assigned hits versus all hits as ‘% Neuronal for assigned hits’ versus ‘% Neuronal for all hits’ (Taylor et al., 2021). The average percentages across all replicates were 91% for assigned hits only versus 89% for all hits.
| Biological replicate | 1 | 2 | 3a | 3b | 4 | 5 |
|---|---|---|---|---|---|---|
| Mass spectrometer used | QE | QE | QE | Exploris | Exploris | Exploris |
| #Assigned hits | 364 | 159 | 97 | 237 | 202 | 274 |
| #All hits | 675 | 516 | 279 | 455 | 708 | 578 |
| % Neuronal for assigned hits | 93 | 91 | 95 | 91 | 87 | 91 |
| % Neuronal for all hits | 91 | 89 | 92 | 90 | 89 | 89 |
Secondly, we assessed which tissues and neuron classes are most highly represented within the learning proteome. We used the Wormbase TEA tool to search for gene lists corresponding to proteins encoded by (1) assigned hits only and (2) both assigned and unassigned hits within the learning proteome. Anatomical terms were considered enriched when they had a q value <1. We observed enriched terms for pharyngeal neurons (M1, M2, M5, NSM, and I4), sensory neurons (PVD), interneurons (ADA and RIG), ventral nerve cord (VNC) motor neurons (VB2, VB3, VB4, VB5, VB6, VB7, VB8, VB9, VB10, and VB11), and CAN cells from both gene lists. RIS interneurons and DD motor neurons were also enriched when including unassigned hits. Several of these neurons have previously been implicated in learning: RIG interneurons (Zhou et al., 2023) and NSM neurons in butanone olfactory learning (Fadda et al., 2020), VNC neurons through changes in glutamate receptor GLR-1 expression during touch habituation (Rose et al., 2003) and diacetyl aversive learning (Vukojevic et al., 2012), and RIS interneurons in salt aversive learning (Wang et al., 2025). Therefore, neurons enriched within the learning proteome include those known to be required for learning; other neurons not previously identified in this context, such as pharyngeal neurons, may warrant further study.
We complemented this analysis by using the CeNGEN database to search for gene lists encoding proteins (assigned hits only, minus non-transgenic controls) identified in control worms (388 genes) versus trained animals (706 genes) from Figure 1D (Taylor et al., 2021). Using the bulk gene search function in CeNGEN (threshold = 2), we determined the number of genes from each list that are expressed in a specific neuron type. Values for the trained gene list were normalised to account for the ~1.8-fold increase in the number of proteins detected in trained samples compared to the high-salt control. For each neuron class that appeared in both datasets (128 in total), we calculated fold-change values between the number of genes from trained vs control gene lists. Neurons were ranked in descending order of fold-change. This ranked list is based on the relative enrichment of training-associated genes compared to control, with higher ranks suggesting neurons that may be more transcriptionally responsive or involved during training. These data are summarised in Supplementary file 1G. Given that CeNGEN utilises transcriptomic data from L4 (juvenile) animals, neuron classes were also ranked using equivalent datasets for young adult hermaphrodites: Worm-Seq (Ghaddar et al., 2023) and CeSTAAN (Princeton University, 2025; see Materials and methods for details). Importantly, CeSTAAN and Worm-Seq provide data for 79 and 104 neuron classes, respectively (vs 128 from CeNGEN); this section therefore focuses on CeNGEN data due to its greater coverage, with other datasets described in brackets. Moreover, as this analysis is descriptive and does not include statistical testing (e.g. bootstrapping), the rankings should be interpreted as indicative rather than definitive, and future work incorporating formal statistical approaches will be important to validate these observations.
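The ranking procedure in this paragraph can be sketched as follows. This is an illustrative reconstruction rather than the code used for Supplementary file 1G: the function name, the neuron labels, and the per-neuron counts are hypothetical, and we assume that normalisation means dividing each trained count by the ratio of list sizes (706/388, approximately 1.8) before taking the fold-change.

```python
def rank_neuron_classes(trained_counts, control_counts, n_trained, n_control):
    """Rank neuron classes by normalised trained/control fold-change.

    `trained_counts` and `control_counts` map a neuron class to the number
    of genes from the respective list that are expressed in that class.
    """
    # Assumed normalisation: scale trained counts down by the list-size ratio.
    scale = n_trained / n_control
    ranked = []
    # Only neuron classes present in both datasets are compared.
    for neuron in trained_counts.keys() & control_counts.keys():
        if control_counts[neuron] == 0:
            continue  # fold-change is undefined without control genes
        normalised = trained_counts[neuron] / scale
        ranked.append((neuron, normalised / control_counts[neuron]))
    # Descending order of fold-change; rank #1 is the first entry.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Toy counts with placeholder neuron labels (not data from the study).
toy_trained = {"N1": 360, "N2": 180, "N3": 90}
toy_control = {"N1": 100, "N2": 100, "N4": 50}
ranking = rank_neuron_classes(toy_trained, toy_control, 706, 388)
```

A fold-change near 1 after normalisation indicates that a neuron class is represented about equally in both lists once the larger size of the trained list is taken into account. As the paragraph notes, a ranking of this kind is descriptive and carries no statistical test.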
Cholinergic and glutamatergic neurons constituted 15% and 55% of neurons ranked #1–20, respectively (45% and 30% for CeSTAAN; 40% and 20% for Worm-Seq). Glutamate signalling components have previously been implicated in C. elegans learning paradigms involving salt (e.g. NMDA-type glutamate receptor subunits nmr-1 and nmr-2; Kano et al., 2008). Acetylcholine has not been explored extensively in C. elegans for its involvement in learning, but its role has been described in other animal models and in humans (reviewed in Huang et al., 2022). Other neuron classes identified have previously been implicated in salt-based associative learning (ranks in brackets): AVK interneurons (rank #7 for CeNGEN; #37 for CeSTAAN; #76 for Worm-Seq; Beets et al., 2012), RIS interneurons (rank #14 for CeNGEN; #1 for CeSTAAN; #31 for Worm-Seq; Wang et al., 2025), salt-sensing neuron ASEL (rank #18 for CeNGEN; #16 for CeSTAAN; #34 for Worm-Seq as ‘ASE’; Beets et al., 2012), CEP and ADE dopaminergic sensory neurons (individually ranked #22 and #39, respectively, for CeNGEN, vs ‘CEP_ADE_PDE’ ranked #20 and #80 for CeSTAAN and Worm-Seq, respectively; Voglis and Tavernarakis, 2008), and AIB interneurons (rank #21 for CeNGEN; #11 for CeSTAAN; #67 for Worm-Seq; Sato et al., 2021). In summary, although there are some exceptions that may reflect expression differences between adult and L4 animals, the same neuron classes are generally highly represented in trained animals (vs control) across all three transcriptomic datasets.
Interestingly, unlike its counterpart ASEL (rank #18 for CeNGEN), the salt-sensing neuron ASER was ranked only #104/128 (for CeNGEN, Supplementary file 1G). ASER becomes activated in response to a decrease in salt concentration (Suzuki et al., 2008) and its downstream targets likely function to redirect worms toward higher salt concentrations (Appleby, 2012). This activation is suppressed after training that reduces attraction to high salt levels (Sato et al., 2021; Wang et al., 2025). This learning-dependent suppression of ASER activity may explain its lower fold-change in trained versus control groups. Alternatively, given that ASER is ranked #4 using the CeSTAAN database, potentially due to the use of adults rather than L4 animals, it is equally possible that additional proteomic changes are required in ASER (vs ASEL, with rank #16) to trigger salt chemotaxis changes. Nevertheless, these findings imply a molecular and cellular switch facilitated by dual ASE neurons to express gustatory learning in the worm, which complements previous research.
Some neurons identified were not previously implicated in learning: IL1 polymodal head neuron class (rank #1 for CeNGEN; #44 for CeSTAAN; #42 for Worm-Seq), motor neuron DA9 (rank #2 for CeNGEN; #78 for CeSTAAN as ‘PDA_AS_DA_DB’; #95 for Worm-Seq as ‘DA_VA’), and interneuron DVC (rank #5 for CeNGEN; #23 for CeSTAAN; #3 for Worm-Seq). IL1 releases glutamate (Pereira et al., 2015) and mainly functions in regulating foraging behaviour (Hart et al., 1995), potentially indicating a role in food-based responses. Separately, cholinergic neuron DA9 and glutamatergic neuron DVC are involved in backward locomotion (Pereira et al., 2015; Ardiel and Rankin, 2015; Chalfie et al., 1985). Changes in locomotion are critical for learning-dependent modulation of chemotaxis: the incidence of sharp turns or ‘pirouette’ movements in C. elegans is influenced by prior experience in salt-based gustatory learning (Kunitomo et al., 2013). IL1 may influence salt-based learning by signalling through interneurons AVE (rank #24 for CeNGEN; #10 for CeSTAAN; #87 for Worm-Seq) and PVR (rank #114 for CeNGEN; neuron class not available in CeSTAAN; #44 for Worm-Seq) to the DA neurons (Bhatla, 2009), potentially modulating backward locomotion as part of the chemotaxis response. We also identified pharyngeal neurons I3 (rank #4 for CeNGEN; data not available in CeSTAAN or Worm-Seq) and I6 (rank #5; neuron class not available in CeSTAAN or Worm-Seq), which have not previously been implicated in learning. Figure 4D provides a summary of the neural circuits implicated by these analyses, in which neuron classes are highly connected to each other. Investigating the role of specific genes within these circuits opens new avenues for future research into gustatory learning.
Utilising positive candidates involved in cholinergic signalling to illustrate a putative neural circuit containing neuron classes represented by the learning proteome.
(A, B, C) Chemotaxis assay data for C. elegans with mutations targeting cholinergic signalling components acc-1, acc-3, or lgc-46, respectively (n=5). Each data point represents a chemotaxis index (CI) value from one biological replicate (n), with three technical replicates per biological replicate (23–346 animals assayed per technical replicate). Error bars = mean ± SEM. Two-way ANOVA and Tukey’s multiple comparisons tests were performed to analyse these data (****≤0.0001; ***≤0.001; **≤0.01; *≤0.05; ns = non-significant). (D) Neuron classes represented by the learning proteome were identified using the gene enrichment tool from WormBase (Angeles-Albores et al., 2016) and the CeNGEN database (threshold = 2; Taylor et al., 2021). Neurons are represented by pink triangles (sensory), orange pentagons (interneurons), and purple circles (motor neurons). Chemical synapse (black arrows) and gap junction (dotted arrows: grey for gap junctions only or yellow for synapses and gap junctions) information is provided using the software WormWeb (Bhatla, 2009). Learning regulators validated in this study are also represented: ACC-1 (brown rectangles), ACC-3 (pink squares), and LGC-46 (purple diamonds) are annotated above based on single neuron expression profiles from CeNGEN (Taylor et al., 2021). Notably, KIN-2 and F46H5.3, discussed in detail below, are expressed in all neurons shown except for DD.
Validating the requirement of learning proteome components in salt associative learning through single gene studies
Our initial analysis of learning proteome data indicates that there are multiple hits present in biological pathways important for neuron function, and that are potentially relevant to learning and memory formation. To test this directly, we performed salt associative learning experiments on selected learning proteome hits (Figures 4—6, Figure 6—figure supplements 1 and 2). We used the following general rules to interpret our data: if the average chemotaxis indices (CIs) for ‘trained’ worms were higher in a particular strain compared with wild-type, this strain was considered learning-defective, as this reflects a reduced magnitude of the expected behaviour change (an increased preference for low salt demonstrated by CIs closer to 0 or negative CIs). If the average CI for ‘trained’ worms was lower in a strain compared with wild-type, then this strain was considered to display ‘better’ learning, as the lower CI reflects an increased magnitude of the expected behaviour change. In general, we observed no significant difference in CIs between naïve groups for all genotypes, reflecting no gross locomotor or chemotaxis defects in the strains tested (Figures 4—6, Figure 6—figure supplements 1–2).
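The interpretation rules above can be written as a small decision function. The sketch below is illustrative only: it compares mean CIs for trained groups and deliberately omits the statistical testing (two-way ANOVA with Tukey’s multiple comparisons) that is used in this study to decide whether a difference is significant. The function name, labels, and example values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def interpret_trained_ci(mutant_cis: Sequence[float], wild_type_cis: Sequence[float]) -> str:
    """Apply the general interpretation rule to CIs from 'trained' worms.

    A higher mean CI than wild-type means a smaller shift towards low salt
    (learning-defective); a lower mean CI means a larger shift ('better'
    learning). No significance test is applied here.
    """
    mutant_mean = mean(mutant_cis)
    wild_type_mean = mean(wild_type_cis)
    if mutant_mean > wild_type_mean:
        return "learning-defective"
    if mutant_mean < wild_type_mean:
        return "better learning"
    return "no difference"


# Hypothetical CI values, one per biological replicate.
print(interpret_trained_ci([0.5, 0.6, 0.4], [0.1, 0.0, -0.1]))
```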
C. elegans PKA regulatory subunit KIN-2 acts in neurons to regulate salt associative learning.
Salt chemotaxis behaviour was measured in the form of chemotaxis indices (CI) for naive/untrained worms (grey circles), high-salt control (blue squares), and trained worms (orange triangles). This was done for (A and B) wild-type (WT) animals, (A and B) kin-2(ce179) mutants, and (B) transgenic worms with a WT background engineered to overexpress KIN-2 from the ce179 allele in all neurons (10–60% transgenic worms per technical replicate, both non-transgenic (-) and transgenic (+) siblings are plotted above). Each data point represents one biological replicate where (A) n=5 and (B) n=3 (one biological replicate was excluded from high-salt control and trained kin-2(ce179) groups due to insufficient sample size). (A) 32–487 worms and (B) 5–184 worms per technical replicate. Error bars = mean ± SEM. Annotations above graphs represent P-values from Two-way ANOVA and Tukey’s multiple comparison tests (****≤0.0001; ***≤0.001; **≤0.01; *≤0.05; ns = non-significant). (B) Statistical comparisons between WT trained and siblings in transgenic lines are in red (top row), between adjacent trained groups are in green (middle row), and between groups within each line in black (bottom row).
Salt associative learning is dependent on arginine kinase F46H5.3 and not armadillo-domain containing protein C30G12.6.
Chemotaxis indices (CI) are shown for wild-type/WT animals versus mutants for (A) F46H5.3 (non-backcrossed with WT, n=5), (B) F46H5.3 backcrossed with WT (n=4), (C) C30G12.6 (non-backcrossed with WT, n=5), and (D) C30G12.6 backcrossed with WT (n=5). These animals were assessed for salt associative learning by preparing three groups for each line: naïve/untrained (grey circles), high-salt control (blue squares), and trained (orange triangles; 27–395 worms per technical replicate). Each data point is for one biological replicate each comprising three technical replicates. Error bars = mean ± SEM. Statistical analyses were done by Two-way ANOVA and Tukey’s multiple comparison test (****≤0.0001; **≤0.01; *≤0.05; ns = non-significant).
We tested 26 candidates in total for this study. Although this represents a small subset of the 706 proteins identified in the learning proteome, several proteins in the full list are unsuitable for functional testing due to key constraints: (1) having essential roles, with corresponding single-gene mutants being lethal; (2) involvement in neurodevelopment rather than mature neuronal function; and (3) being required for locomotion, with severe locomotion defects precluding assessment using chemotaxis assays.
Candidates tested were classified as either strong (detected in ≥3 biological replicates) or weak (detected in <3 biological replicates) based on the number of mass spectrometry replicates in which they were uniquely identified in TurboID-trained C. elegans (shown in brackets). We determined these numbers by considering both assigned and unassigned protein lists, which contained mostly neuron-expressed proteins (Table 1) including known learning regulators (Figure 2 and Supplementary file 1H). The list of 26 candidates for further testing includes both weak and strong hits. In addition, although candidates tested were mostly detected in more replicates of trained versus control groups, we also assayed seven candidates for which this was not the case. Table 2 summarises the potential learning regulators explored in this study, including strong/weak classifications and replicate numbers between experimental groups.
Summary of candidates assessed for their effect in learning.
The number (#) of biological replicates (total n=5) in which each candidate was detected as an assigned hit (by the MASCOT software) or in assigned + unassigned hits (identified by bulk BLAST search) is provided under ‘# Biological replicates in TurboID trained’ and ‘# Biological replicates in TurboID high-salt control’ columns. These values exclude proteins from non-transgenic trained and non-transgenic high-salt control groups, respectively. Orange highlights indicate candidates detected in more replicates in the TurboID-trained group. Candidates are also defined as ‘weak’ or ‘strong’ based on the frequency of detection across biological replicates.
| Candidates tested | # Biological replicates in TurboID trained (assigned hits) | # Biological replicates in TurboID high-salt control (assigned hits) | # Biological replicates in TurboID trained (assigned + unassigned hits) | Classification for candidate |
|---|---|---|---|---|
| IFT-139 | 4 | 1 | 5 | Strong |
| ACR-2 | 1 | 0 | 4 | Strong |
| F46H5.3 | 3 | 2 | 4 | Strong |
| SAEG-1 | 2 | 0 | 4 | Strong |
| UEV-3 | 4 | 1 | 4 | Strong |
| AEX-3 | 0 | 0 | 3 | Strong |
| C30G12.6 | 0 | 0 | 3 | Strong |
| ELO-6 | 3 | 0 | 3 | Strong |
| ELP-1 | 2 | 1 | 3 | Strong |
| FSN-1 | 0 | 0 | 3 | Strong |
| GAP-2 | 2 | 0 | 3 | Strong |
| RIG-4 | 0 | 1 | 3 | Strong |
| TAG-52 | 1 | 0 | 3 | Strong |
| TAP-1 | 2 | 0 | 3 | Strong |
| VER-3 | 3 | 0 | 3 | Strong |
| ACC-3 | 1 | 0 | 2 | Weak |
| DLK-1 | 1 | 0 | 2 | Weak |
| GBB-2 | 2 | 0 | 2 | Weak |
| GPA-2 | 2 | 0 | 2 | Weak |
| RHO-1 | 2 | 1 | 2 | Weak |
| ACC-1 | 1 | 1 | 1 | Weak |
| GAP-1 | 1 | 0 | 1 | Weak |
| GLR-1 | 0 | 1 | 1 | Weak |
| KIN-2 | 1 | 0 | 1 | Weak |
| LGC-46 | 1 | 1 | 1 | Weak |
| MACO-1 | 1 | 0 | 1 | Weak |
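The strong/weak classification in Table 2 follows a simple threshold on the number of TurboID-trained replicates (assigned + unassigned hits) in which a candidate was detected. The sketch below illustrates that rule with a few rows copied from the table; the function and variable names are hypothetical and this is not the code used in the study.

```python
def classify_candidate(n_trained_replicates: int, threshold: int = 3) -> str:
    """Return 'Strong' if detected in >= threshold replicates, else 'Weak'."""
    return "Strong" if n_trained_replicates >= threshold else "Weak"


# Replicate counts (assigned + unassigned hits) for selected rows of Table 2.
trained_replicates = {
    "IFT-139": 5,
    "F46H5.3": 4,
    "AEX-3": 3,
    "ACC-3": 2,
    "KIN-2": 1,
}

for candidate, count in trained_replicates.items():
    print(f"{candidate}: {classify_candidate(count)}")
```

Note that the rule uses only the trained-group count; detection in the high-salt control group does not enter the classification.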
We first tested the regulatory subunit of PKA, kin-2 (1 replicate), since it is a known regulator of memory and was detected as a weak candidate by TurboID. Adenylyl cyclase is a key signalling effector for Gαs and Gαi proteins and regulates levels of the secondary messenger cyclic AMP (cAMP) within the cell. cAMP binding to PKA regulates its activity, and therefore its downstream effects (Sassone-Corsi, 2012). We tested worms with the ce179 mutant allele in kin-2, in which a conserved residue in the inhibitory domain (which normally functions to keep PKA turned off in the absence of cAMP) is mutated to cause an R92C amino acid change – this results in increased PKA activity (Schade et al., 2005). kin-2 has previously been shown to be required for intermediate-term memory in C. elegans (Stein and Murphy, 2014), with cAMP/PKA signalling previously shown to be involved in memory in multiple systems (Kandel, 2012). We found that these kin-2 mutant animals showed enhanced learning compared with wild-type (i.e. Non-Tg worms; Figure 5A). We next re-expressed KIN-2(R92C) in wild-type worms using a pan-neuronal promoter, and these worms sho