Abstract
How brain circuits are organized to skillfully produce learned sequences of behaviours is still poorly understood. Here we functionally examined how the cortical song premotor region HVC, which is necessary for zebra finch song1, controls the sequential production of learned song syllables. We found that HVC could generate the complete sequence of learned song syllables independently of its main synaptic input pathways. Thalamic input to HVC was needed for song initiation, but it was not required for transitions between syllables or for song completion. We showed that excitation of HVC neurons during song reliably caused vocalizations to skip back to the beginning of the song, in a manner reminiscent of a skipping record. This restarting of syllable sequences could be induced at any moment of the song and relied on local circuits within HVC. We identified and computationally modelled a synaptic network, including intratelencephalic premotor and corticostriatal neurons within HVC that are essential for completing song syllable sequences. Together, our results show that the learned zebra finch song is controlled by a cortical sequence-generating network in HVC that, once started, can sustain production of all song syllables independent of major extrinsic input pathways. Thus, sequential neuronal activity can be organized to fuse well-learned vocal motor sequences, ultimately achieving holistic control of this naturally learned behaviour.
Main
Motor behaviours are considered to be learned by splitting and chunking smaller behavioural units into sequences of neural activity and then concatenating the sequences into a unified premotor plan that supports the fluid production of the entire behaviour2,3,4,5,6,7. Although there is evidence for chunking3,6, identifying the neural origin for unified premotor programs has remained challenging. The control of learned birdsong provides a tractable model for searching for such premotor programs. Songbirds are among the few groups of animals, apart from humans, that learn their vocalizations through imitation. Moreover, birdsong is controlled by dedicated forebrain circuits8.
Zebra finches learn a single courtship song motif. They engage in extensive daily practice to maintain expert performance of this song. Sparse sequential neuronal activity in the pallial song nucleus HVC probably underlies the production of zebra finch song9,10,11,12,13. However, how neural sequences in HVC contribute to the progression of the song motif is still not well understood. Several lines of evidence support the idea that song control may involve reciprocal loops spanning the brainstem, thalamus and pallium14,15,16,17 (Extended Data Fig. 1a), whereas other studies suggest that HVC may be capable of generating neural sequences for song production more autonomously10,11,18,19.
There are various models of how adult songs are controlled: (1) sequential activity in HVC can sustain progression through all song syllables independently of instructive afferent inputs18,19,20; (2) input pathways link shorter neural sequences at syllable or other vocal parameter boundaries14,17,21,22; and (3) HVC sequences are continuously updated by instructive afferent input15,16. Research in this area has relied on correlations between song and electrophysiological recordings or on non-selective circuit manipulations, including electrical stimulation, cooling of brain regions and electrolytic lesioning. Here we combine a series of cell-type, circuit and pathway manipulations with synaptic mapping and computational modelling to causally examine how neural sequences contribute to completing the song motif. This study reveals that, barring a permissive thalamic input important for song initiation, HVC can independently propagate activity for production of all song syllables in the motif, and that this network relies on two synaptically interconnected classes of HVC projection neurons.
Optogenetic restarting of song
Electrical stimulation of HVC has varied effects on song production, including distortion of syllable acoustic features, truncation of song and occasional restarting of song soon after song truncation23,24. However, these studies are difficult to interpret because stimulation cannot be restricted to specific cell types or to cells within a small spatial volume, the stimulated population of neurons is highly dependent on electrode placement and there is inevitable antidromic and orthodromic activation of neurons and passing axons25.
Instead, we selectively controlled HVC activity using viral expression of the excitatory opsin ChRmine (n = 6 birds; Fig. 1a and Extended Data Fig. 1b). This provided experimental control over a population of HVC neurons composed of approximately 20% inhibitory neurons and 80% principal neurons, with a bias towards HVCX projection neurons26. Birds were implanted with fibre optics over HVC, and syllable detection software was used to perform closed-loop optogenetic manipulations while the birds were freely singing. Light stimulation reliably caused song truncation, seen as a rapid decrease in sound amplitude and disruption in syllable acoustic features (stimulation outcome probability: 86.8 ± 3.6% truncation and 10.2 ± 3.7% pause + continuation; latency to silence from onset of stimulation: 66.6 ± 4.1 ms; average ± s.e.m. in six birds; Fig. 1b–g and Extended Data Fig. 1c–g). Truncation was followed by the rapid restarting of the song motif (median, 135.8 ± 25.8 ms; lowest quartile, 87.6 ± 15.5 ms; Fig. 1b–d,h–j and Extended Data Fig. 1c). The birds restarted their song from the beginning, with one or two introductory notes followed by the motif or directly back to the first syllable of the motif, and this resetting behaviour occurred with high probability, independent of when in the song the optogenetic stimulation was triggered (Fig. 1h; all post-truncation trials reported in Extended Data Fig. 1h). When normalized by the likelihood that the birds chained multiple motifs in series, the probability that a stimulated motif was immediately followed by another motif was 108.6 ± 4.9%, suggesting that the optogenetic perturbation caused the song to restart from the beginning of the motif without prematurely ending the song bout (Fig. 1i and Extended Data Fig. 1i).
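The normalization described above can be illustrated with a minimal sketch. It assumes that the normalized value is the per-bird ratio of two probabilities (a motif following a stimulated motif versus a motif following an unstimulated motif), expressed as a percentage and then averaged across birds. The exact procedure is given in the Methods, which are not reproduced here, so the function names and all numbers below are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of per-bird values."""
    m = mean(values)
    # s.e.m. is undefined for a single value; report 0.0 in that case.
    sem = stdev(values) / sqrt(len(values)) if len(values) > 1 else 0.0
    return m, sem


def normalized_restart_probability(p_follow_stim, p_follow_control):
    """Percent ratio of motif-follows-motif probability, stimulated vs control.

    Assumed definition for illustration only; values near 100% mean that
    stimulated motifs are followed by another motif about as often as
    unstimulated motifs are.
    """
    if p_follow_control <= 0:
        raise ValueError("control probability must be positive")
    return 100.0 * p_follow_stim / p_follow_control


# Hypothetical per-bird probabilities (NOT data from the study).
birds = [
    {"stim": 0.66, "control": 0.60},
    {"stim": 0.50, "control": 0.50},
    {"stim": 0.72, "control": 0.80},
]
per_bird = [
    normalized_restart_probability(b["stim"], b["control"]) for b in birds
]
avg, sem = mean_sem(per_bird)
print(f"normalized restart probability: {avg:.1f} ± {sem:.1f}%")
```

Under this assumed definition, a value close to or above 100% indicates that stimulation did not reduce the tendency to chain motifs, which is how the reported 108.6 ± 4.9% is interpreted in the text.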
Fig. 1: Optogenetic excitation of HVC causes truncation and restarting of the song motif.
a, Schematic of closed-loop song-contingent light stimulation of HVC; sample image of HVC ChRmine-expressing neurons. b, Spectrograms (0–11 kHz) of normal song (top) and stimulated song (bottom). Horizontal lines identify song element boundaries, introductory notes ‘i’ followed by syllables (A, B and C) composing the motif. Light stimulation (red bars: 10-ms light) causes motif truncation (blue dashed lines overlaying letters; dashed contour represents the missing portion of the truncated syllable; motif truncation represented by the line being truncated at an angle). Orange dots indicate restart. c, Stacked control (top) and stimulated (bottom) song amplitude plots ordered by latency of stimulation onset (red line; arrow). d, Latency to stimulation (red), motif truncation (blue) and identity of resumed vocalization within 1 s following stimulation. e, Box plots (5th–95th percentile; 25th, 50th and 75th percentiles) showing the outcome of optogenetic stimulation (average probability, n = 6 birds). f, Average latency ± s.e.m. to motif truncation in response to stimulation delivered across the motif (bins, 10%; motif advancement, n = 6 birds). g, Box plots showing truncation latencies computed across all trials, per bird (n = 6). h, Probability (average ± s.e.m.) of post-truncation vocalization resumption by category upon stimulation delivered throughout the motif (bins: 10% motif advancement). i, Normalized probability of post-truncation motif restart, per bird (Methods). j, As in g for motif restart latency, per bird (n = 6). k, Subsyringeal pressure recordings (dotted line indicates ambient pressure; deviations above indicate expiration and deviations below indicate inspiration) aligned at the onset of stimulation (red bar, 50 ms; top, unstimulated trace; bottom, 34 motif traces; grey bar highlights the corresponding point in the unstimulated motif waveform). 
l, As in k, stimulation during quiet respiration or calls (top, sample traces; bottom, 56 traces aligned at the stimulation onset). m, Schematic of the two proposed possible scenarios; song progression is either controlled through extrinsic updates (top, red arrows) to HVC activity or controlled more autonomously by HVC (bottom, red arrow). Scale bars, 200 µm (a), 200 ms (b,k,l), 0.5 a.u. (k,l). Brain outline in a adapted with permission from ref. 60, Wiley.
To better understand how our circuit manipulations affect the motor control of the song, we recorded subsyringeal air sac pressure during optogenetic stimulations. We found that optogenetic stimulation applied during quiet respiration neither induced vocalization nor altered respiratory patterns in the birds. By contrast, stimulation during singing caused rapid cessation of expiration during ongoing syllables (Fig. 1k,l and Extended Data Fig. 1j). Syllable truncations resulted from significant respiratory pressure deviations within 36.4 ± 4.0 ms of light onset, approximately 30 ms before vocalizations were acoustically truncated, consistent with previous studies15,27 (Extended Data Fig. 1k). Finally, we found that optogenetic stimulation trials in which birds did not quickly restart singing could be the result of apnoea. Thus, in some cases, optogenetic activation suppressed involuntary respiration, which effectively blocked the reinitiation of song (apnoea duration: 588.2 ± 216.8 ms; Extended Data Fig. 1l).
Together, these data indicate that HVC can control downstream steady-state respiration circuitry in a state-dependent manner, and that once HVC is engaged, stimulation interrupts the chain of activity in HVC, resulting in abrupt song truncation and resetting of the motif back to its initial state. These attributes are reminiscent of response to perturbation in central pattern-generating (CPG) networks described in the invertebrate and vertebrate nervous systems28,29. Another defining feature of CPG networks is that once initiated, they can produce patterned activity in the absence of instructive patterned input. The seemingly automatic and rapid restarting of song hints that extrinsic inputs to HVC may function permissively, rather than instructively, in song motif production (Fig. 1m), raising the possibility that HVC produces the neuronal sequences for song in the absence of instructive patterned input and may function as a pattern-generating network for song syllable sequences.
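The 'skipping record' behaviour summarized above can be illustrated with a toy chain. This is a deliberately simplified sketch and not the computational model developed in this study: it assumes a single chain of links that advances one link per time step, that a perturbation wipes ongoing activity, and that the chain is re-entered at its first link after a fixed silent gap. The function name `sing` and the parameter `restart_delay` are hypothetical.

```python
def sing(n_links, stim_steps=(), restart_delay=2, max_steps=100):
    """Simulate a toy sequence-generating chain.

    Activity moves from link i to link i + 1 at every time step. A
    stimulation at time step t interrupts the chain; after `restart_delay`
    silent steps, activity re-enters the chain at link 0. Returns the
    active link at each step (None marks silence) up to the step at which
    the last link fires.
    """
    stim_steps = set(stim_steps)
    trace = []
    link = 0         # the chain has already been initiated at its first link
    silent_left = 0  # silent steps remaining after a perturbation
    for t in range(max_steps):
        if t in stim_steps:
            # Perturbation: discard the current position and reset to the start.
            link = 0
            silent_left = restart_delay
        if silent_left > 0:
            trace.append(None)
            silent_left -= 1
            continue
        trace.append(link)
        if link == n_links - 1:
            break  # motif completed
        link += 1
    return trace


# Three-syllable motif (A, B, C) with a perturbation during the second syllable.
labels = "ABC"
trace = sing(len(labels), stim_steps=[1], restart_delay=1)
print("".join("-" if s is None else labels[s] for s in trace))
```

In this toy, no patterned external input is needed after the chain has started, and a perturbation at any step returns the output to the first link. Both properties are built in by assumption and mirror the behaviour described above rather than explaining its mechanism.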
Song initiation needs thalamic input
Input to HVC from the thalamic nucleus Uvaeformis (Uva) is one probable source of instructive signals for producing the song motif18,22,30,31. Electrical stimulation of Uva was reported to cause motif truncation at syllable boundaries17, suggesting that the inputs of Uva to HVC are instructive for motor programs to transition from one syllable to the next. To test this idea, we first used optogenetic excitation of the axon terminals of Uva in HVC through viral expression of eGtACR1, an opsin that potently drives excitation of axon terminals in zebra finches31 (Extended Data Fig. 2a,b). Light stimulation of eGtACR1-expressing Uva–HVC terminals drives strong transient increases in HVC activity (Fig. 2a and Supplementary Table 1). In contrast to thalamic electrical stimulation17, the optogenetic excitation of Uva terminals during singing did not cause motif truncation and left song syntax and spectral characteristics unaffected (Fig. 2b,c, Extended Data Fig. 3a,b and Supplementary Table 1).
Fig. 2: Uva does not instruct transitions between syllables in the song motif.
a, Schematic, sample trace, raster plot and normalized peri-stimulus time histogram (PSTH) of HVC multi-unit activity recording in anaesthetized birds expressing eGtACR1 in Uva; light stimulation of Uva afferents (1 s, red bar); inset, magnified PSTH and scatter plot comparing baseline and stimulation (200 ms, dashed rectangles; n = 30 hemispheres, 17 birds). b, Song-contingent light stimulation (red bar, 200 ms) of Uva terminals in HVC; sample spectrogram (0–11 kHz; horizontal lines identify song elements). c, Violin plots reporting accuracy of song segments with (grey) and without (white) stimulation, per bird (n = 4). d, UvaHVC neurons (labelled by retrograde tracer, green) expressing ChRmine (red). Dashed white lines, fibre-optic tip. e, As in b for UvaHVC stimulation. f, Box plots (5th–95th percentile; 25th, 50th and 75th percentiles) reporting optogenetic stimulation outcome (average probability, n = 3 birds; filled circles, empty box plots from Fig. 1e reported for comparison). g, As in c for UvaHVC stimulation (n = 3 birds). Scale bars, 200 ms (b), 200 µm (d), 20 µm (d (inset)). Brain outlines in a and d adapted with permission from ref. 60, Wiley.
The lack of effects on song, even with prolonged stimulation, prompted us to test whether direct optogenetic excitation in Uva disrupts song. We expressed the excitatory opsin ChRmine in Uva neurons projecting to HVC (UvaHVC) using an intersectional viral strategy (Fig. 2d). We found that even directly stimulating UvaHVC neurons failed to cause song truncation and restarting (1.2 ± 1.2%, motif stop; 98.9 ± 1.2%, no effect; Fig. 2e,f and Supplementary Table 1). Moreover, this manipulation had no detectable impact on the spectral characteristics of song syllables (Fig. 2g, Extended Data Fig. 3c and Supplementary Table 1).
One possibility is that manipulations such as electrical stimulation may drive truncation at syllable boundaries through off-target effects, such as recruiting nearby thalamic regions or fibres of passage. Uva is located within the posterior commissure, which connects midbrain regions critical for vocalizations, audition and vision. It is immediately adjacent to the robust nucleus of the arcopallium (RA) fibre tract, which transmits descending motor commands for song (Supplementary Video 1). Neurons in and surrounding Uva relay visual information to the forebrain32, and sudden visual stimulation with a stroboscope elicits orienting responses in zebra finches27,33 that result in motif truncations at syllable boundaries, similar to those observed with electrical stimulation of Uva17.
To assess whether off-target effects could be involved in truncating motifs at syllable boundaries, we attempted to mimic the effects of electrical stimulation by non-selectively expressing ChRmine in Uva and the surrounding thalamus (Extended Data Fig. 3d). Broader thalamic optogenetic stimulation resulted in reliable motif truncation at syllable boundaries (91.5 ± 3.6%, motif stop; 0.4 ± 0.4%, pause + continuation; Extended Data Fig. 3d–k). In contrast to optogenetic stimulation in HVC or along the Uva–HVC pathway, optogenetic stimulation of the broader thalamus caused birds to momentarily stop movement and blink, both during singing and non-singing states. This suggests that the manipulation causes a visually evoked orienting response, perhaps mimicking responses to strobe-light visual stimulation27,33. Consistent with this, broader thalamic stimulation resulted in significantly longer truncation latencies than direct HVC stimulation (Extended Data Fig. 3g–k) and predominantly led to cessation of singing rather than resetting of song (Extended Data Fig. 3l,m). In the few instances when birds resumed singing, the motif reset latency was significantly longer than what we observed when stimulating HVC (Extended Data Fig. 3n,o). These findings suggest that electrical stimulation-triggered song truncations are the result of off-target stimulation of the peri-Uva thalamus, and that Uva is not instructive for HVC syllable sequence progression.
We next examined whether Uva could play a permissive role in song production. Electrolytic lesions of Uva or peri-Uva regions can abolish courtship song production22,30. However, electrolytic lesions non-selectively ablate neurons and damage axonal fibres. To minimize damage to axonal tracts, we performed bilateral excitotoxic lesions of Uva using a cocktail of ibotenic and quisqualic acid (n = 13 birds). This strategy yielded three outcomes: (1) complete lesions of Uva (99.6 ± 0.4% Uva lesioned) that also included the peri-Uva thalamus and resulted in birds that could no longer sing their motif; (2) large peri-Uva thalamic lesions that mostly spared Uva (10.8 ± 3.8% Uva lesioned) and resulted in birds that also could no longer sing their motif; and (3) almost complete Uva lesions (87.5 ± 7.0% Uva lesioned) that spared the broader peri-Uva thalamus and resulted in birds that could sing their motif within approximately 1 week following lesion (Fig. 3a–c and Extended Data Fig. 4a). This last group of birds demonstrates that HVC can drive production of the entire song motif, even when Uva is significantly lesioned. Nonetheless, we found that these birds chained significantly fewer motifs together in each song bout (Fig. 3d and Supplementary Table 2), and they would often fail to produce their song motif after singing introductory notes (Fig. 3e and Supplementary Table 2). These findings are consistent with Uva lesions disrupting the ability of birds to initiate courtship song performances to female birds30, and they suggest that Uva may be needed for the initiation of the song motif.
Fig. 3: Uva is permissive for motif initiation.
a, Schematic and sample image (NeuN immunofluorescence, grey; HVC retrograde tracer, green) and spectrograms (0–11 kHz; horizontal lines identify song elements) reporting the effect of excitotoxic bilateral lesion of Uva and peri-Uva thalamus. b, Motif self-similarity before (circles) and 1–2 weeks after (triangles) lesions (peri-thalamus + Uva (brown; n = 8); peri-thalamus excluding Uva (grey; n = 2); Uva excluding perithalamic areas (blue; n = 3)). c, Percentage of Uva lesion in the three experimental groups. d, Cumulative probability of motifs per bout sung by the birds before (black) and 30 d after (blue) excitotoxic lesion of Uva that spared peri-Uva thalamic regions (n = 3 birds). e, The rate of motif start failures before (grey circles) and after (blue triangles) the excitotoxic Uva lesions. f, Schematic, sample image and spectrograms (as in a) reporting the effects of TeNT expression in UvaHVC neurons. g, Box plots (5th–95th percentile and 25th, 50th and 75th percentiles) reporting self-similarity between motifs sung before viral injection (grey) and the last motifs produced before complete cessation of singing upon expression of TeNT in UvaHVC neurons for 1–2 weeks (purple triangles; n = 6 birds). h, Cumulative probability of motifs per bout before (black) and 1–2 weeks after expression of TeNT in UvaHVC neurons (purple; n = 6 birds). i, Rate of motif start failures before (grey circles) and after (purple triangles; n = 6 birds) expression of TeNT in UvaHVC neurons. NS, nonsignificant. Scale bars, 1 mm (a), 100 µm (a (insets)), 1 s (a (spectrograms)), 200 µm (f). Brain outlines in a and f adapted with permission from ref. 60, Wiley.
To test the role of Uva in song initiation, we first blocked glutamate release from UvaHVC neurons using viral expression of tetanus neurotoxin (TeNT) (Fig. 3f and Extended Data Fig. 4b). These birds had progressive difficulty initiating their song on a timeline consistent with viral expression (approximately 10–14 days). They showed increasing failures in motif initiation following singing of introductory notes and a decreased number of motifs per song bout. However, in instances when the motif was initiated, the birds consistently produced all song syllables in the motif with high accuracy (Fig. 3f–i, Extended Data Fig. 4c and Supplementary Table 2). These data support the idea that the Uva–HVC pathway is permissive for initiating learned song motifs rather than instructing song syllable transitions17.
To test this, we expressed eGtACR1 in Uva and optogenetically silenced Uva neurons during singing (Extended Data Fig. 4d). We found that silencing Uva during an ongoing song motif did not disrupt the completion or acoustic structure of that motif, but it reduced the probability of initiating and concatenating a subsequent motif (Extended Data Fig. 4e,f). By contrast, using the same birds and placing fibre optics over HVC to excite Uva axon terminals across motif transitions did not suppress initiation of a subsequent song motif (Extended Data Fig. 4g,h). Thus, if Uva input to HVC is excited, birds can continue the ongoing motif and string other motifs together. If instead it is inhibited, birds still complete the ongoing motif but exhibit difficulty starting the next song motif. Together, these results indicate that the Uva–HVC pathway is critical for initiating song motifs, potentially coordinating the two hemispheres, but is not needed for birds to string together syllables within the song motif.
Pallial afferents are not needed for song
HVC receives excitatory input from three auditory and premotor pallial regions that play important roles in song learning: nucleus interfacialis (NIf), nucleus avalanche and medial magnocellular nucleus of the anterior nidopallium (mMAN)30,34,35,36,37. We examined the role of each pathway in adult song performance. Stimulation of eGtACR1-expressing axon terminals in HVC from any of these regions significantly increased HVC multi-unit firing activity (Extended Data Fig. 5a,g,m). However, 200-ms-long or 1-s-long song-contingent light stimulation of any of these input pathways failed to affect spectrotemporal motif characteristics (Extended Data Fig. 5a–r). We therefore tested whether these afferents are necessary for adult song performance. Previous studies indicate that bilateral lesions of either NIf, mMAN or nucleus avalanche in adults do not cause any long-lasting disruptions in song34,36,38. However, it has been shown that compensation by other pathways could account for the lack of sustained effects on song. Therefore, we consecutively lesioned mMAN, NIf and nucleus avalanche in the same birds using ibotenic and quisqualic acid. Bilateral lesions of these nuclei (mMAN, 100.0 ± 0.0%; NIf, 92.9 ± 4.0%; nucleus avalanche, 100.0 ± 0.0%; lateral magnocellular nucleus of the anterior nidopallium (lMAN), 82.5 ± 7.7%; Extended Data Fig. 6a–i) caused only a temporary decrease in motif quality. The song motif quickly recovered to its pre-lesioned state (Extended Data Fig. 6j), and, unlike Uva lesions, these lesions did not impact the number of motifs per bout or cause disruptions in motif initiation (Extended Data Fig. 6k,l). This demonstrates that HVC can generate the sequential activity necessary for completing song independent of its known main excitatory synaptic afferents, further supporting the idea that HVC is the origin of a unified premotor program for the zebra finch song motif.
Song pattern-generating network in HVC
To further define the circuit boundaries of the song pattern-generating network, we examined whether downstream target regions of HVC are critical to pattern generation. We reasoned that the kinetics of the post-truncation restarting of song provides a sensitive behavioural read-out of pattern resetting and could clarify whether those neural circuits are involved in song pattern generation or simply relay the patterned output. Disruption of a pattern generation node would produce truncation and reset latencies similar to or faster than those observed upon HVC stimulation, whereas disruption of a relay node would result in low-latency truncations followed by low-probability and longer-latency motif resetting. HVC has two major output pathways: the descending song motor pathway through the pallial song region RA and the palliostriatal pathway through area X, emerging from HVCRA and HVCX neurons, respectively8.
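The reasoning above amounts to a simple decision rule, which can be sketched as follows. This is an illustration only and not an analysis used in this study: the tolerance factors, the class and function names, and the restart probabilities are hypothetical. Only the HVC truncation latency (66.6 ms) and median restart latency (135.8 ms) are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class StimOutcome:
    truncation_latency_ms: float
    restart_probability: float  # fraction of truncated motifs that restart
    restart_latency_ms: float


def classify_node(test, hvc, latency_tolerance=1.2, probability_tolerance=0.8):
    """Label a stimulated region relative to the HVC reference.

    Tolerances are arbitrary illustrative choices: latencies within 120% of
    the HVC value count as 'similar or faster', and restart probabilities of
    at least 80% of the HVC value count as 'comparable'.
    """
    fast_truncation = (
        test.truncation_latency_ms
        <= latency_tolerance * hvc.truncation_latency_ms
    )
    fast_restart = (
        test.restart_latency_ms <= latency_tolerance * hvc.restart_latency_ms
    )
    likely_restart = (
        test.restart_probability
        >= probability_tolerance * hvc.restart_probability
    )
    if fast_truncation and fast_restart and likely_restart:
        return "pattern-generator-like"
    if fast_truncation:
        # Rapid truncation but rare or slow resetting.
        return "relay-like"
    return "indeterminate"


# HVC reference: latencies from the text, restart probability hypothetical.
hvc_reference = StimOutcome(66.6, 0.9, 135.8)
# Hypothetical downstream region: fast truncation, rare and slow restart.
downstream = StimOutcome(60.0, 0.3, 400.0)
print(classify_node(downstream, hvc_reference))
```

The rule makes explicit that truncation latency alone does not separate the two cases; the probability and latency of restarting carry the distinction.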
We bilaterally expressed ChRmine in either area X or RA and light stimulated each region in freely singing birds. Driving area X neurons rarely caused motif truncation (truncation probability, 2.9 ± 1.6%; no effect, 97.2 ± 1.6%; Extended Data Fig. 7a,b). The truncations we observed occurred at syllable boundaries and were significantly delayed (latency, 146.7 ± 36.7 ms) compared with the uniform song truncations observed with HVC optogenetic stimulation. Nonetheless, stimulation of area X neurons consistently caused a modest increase in the noisiness of stimulated syllables (Extended Data Fig. 7c,f), consistent with the known role of the basal ganglia pathway38.
By contrast, optogenetic stimulation in RA caused rapid motif truncations with high reliability (92.2 ± 2.7%, motif stop; 1.5 ± 1.8%, pause + continuation; Extended Data Fig. 8a–c). These truncations exhibited uniform latency across song, similar to stimulation in HVC (Extended Data Fig. 8d–f). Because RA is downstream of HVC in the song motor pathway, we might expect birds to restart their song as fast as, or faster than, when stimulating in HVC and with equal probability. However, we found that RA stimulation is less likely to be followed by restarting of the motif. When restarting does occur, it takes significantly longer than following HVC stimulation (Extended Data Fig. 8g–k). This argues that the song pattern-generating network is localized to HVC and that RA functions downstream of this network to relay motor commands for song.
To test this prediction, we moved upstream by one synapse and examined whether optogenetic stimulation of HVCRA neurons would produce the truncation and song restarts with the same timing and reliability as our pan-HVC optogenetic manipulations, as shown in Fig. 1. We used an intersectional viral strategy to achieve ChRmine expression only in HVCRA neurons (Extended Data Figs. 9 and 10). As anticipated, song-contingent HVCRA stimulation reliably caused rapid truncations throughout song (86.2 ± 5.6%, motif stop; 2.5 ± 1.8%, pause + continuation; Extended Data Fig. 9a–c). Unexpectedly, the latency to truncation was significantly longer than what we observed with pan-HVC stimulation. In some instances, it seemed to occur closer to syllable boundaries (pan-HVC, 66.6 ± 4.1 ms; HVCRA, 79.1 ± 4.8 ms; Extended Data Figs. 9b,d and 10c,d). Although post-truncation motif reset probability was comparable to that observed with pan-HVC stimulation (Extended Data Figs. 9e,f and 10e,f), restart latency was intermediate between those of pan-HVC and RA stimulation (Extended Data Figs. 9g and 10g–i). Although different latencies to first spike among HVC projection classes may influence truncation and restart dynamics39, this intermediate timing is still surprising because chains of excitatory synaptic connections among HVCRA neurons are considered to be a central component of the network controlling song9,10,11. This prompted us to investigate whether the other main class of HVC projection neurons (HVCX neurons) may contribute to the rapid restarting of the song motif.
HVCX neurons in song pattern generation
Similar to HVCRA neurons, HVCX neurons exhibit temporally precise and sparse activity during production of the song motif12,40,41, but their role in song generation is not known. They are considered to relay timing activity to the basal ganglia rather than directly contributing to song pattern generation42,43,44