Proc. Natl. Acad. Sci. USA
Vol. 82, pp. 5997-6000, September 1985
Neurobiology

Auditory representation of autogenous song in the song system of white-crowned sparrows
(auditory neurophysiology/learning and memory/birdsong/sensorimotor integration/auditory feedback)

DANIEL MARGOLIASH* AND MASAKAZU KONISHI
Division of Biology, California Institute of Technology, Pasadena, CA 91125
Contributed by Masakazu Konishi, May 20, 1985

ABSTRACT The HVc (hyperstriatum ventrale, pars caudale) is a forebrain nucleus in the motor pathway for the control of song. Neurons in the HVc also exhibit auditory responses. A subset of these auditory neurons in the white-crowned sparrow (Zonotrichia leucophrys) have been shown to be highly selective for the individual bird's own (autogenous) song. By using multiunit recording techniques to sample from a large population, we demonstrate that the entire population of auditory neurons in the HVc is selective for autogenous song. The selectivity of these neurons must reflect the song-learning process, for the acoustic parameters of a sparrow's song are acquired by learning. By testing with laboratory-reared birds, we show that HVc auditory neurons prefer autogenous song over the tutor model to which the birds were exposed early in life. Thus, these neurons must be specified at or after the time song crystallizes. Since song is learned by reference to auditory feedback, HVc auditory neurons may guide the development of the motor program for song. The maintenance of a precise auditory representation of autogenous song into adulthood can contribute to the ability to distinguish the fine differences among conspecific songs.

The song of birds is learned. Young birds memorize a song model during an impressionable phase early in life (1, 2). The development of the motor program for song is dependent on comparison of the individual's vocal output with the previously established song memory.
For birds such as the white-crowned sparrow (Zonotrichia leucophrys), the "crystallized" adult song is highly stereotyped throughout life, and the maintenance of adult song does not require vocal feedback (3).

The song of the adult white-crowned sparrow contributes to his reproductive success by influencing mate attraction and defense of the breeding territory. In the wild, a white-crowned sparrow typically competes with conspecifics that sing similar songs from the same dialect (4). Although the songs of neighbors are similar, white-crowned sparrows can distinguish among neighbors and between neighbors and strangers solely on the basis of song (5, 6).

Recently, a projection between the avian auditory telencephalon and the telencephalic nucleus HVc (hyperstriatum ventrale, pars caudale) has been described (7). The HVc is part of the motor pathway controlling song production (8, 9), and neurons in the HVc also exhibit responses to auditory stimuli (9-12). We report here that auditory neurons in the HVc are optimally stimulated by the bird's own adult (autogenous) song. Therefore, these auditory response properties reflect specific modification by vocal feedback during development.

MATERIALS AND METHODS

Experiments were conducted with four awake birds carrying chronically implanted electrodes (see below) and with four birds anesthetized with 20% urethane (Sigma). These and other aspects of the experimental procedures have been described (12). Briefly, birds were collected from Bodega Bay, CA, as nestlings or adults, and housed in individual sound-attenuation chambers (Industrial Acoustics, Bronx, NY). Songs were recorded on analog tape, digitized, and analyzed by zero-crossing techniques (13). The lack of harmonic structure and short-time limited frequency modulation of the white-crowned sparrow's song permits accurate recovery of the song from its zero-crossings.
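To illustrate why a signal without harmonic structure can be described by its zero-crossings, the following Python sketch estimates the frequency of a pure tone from the intervals between successive upward zero-crossings. It is a minimal sketch and not the analysis program used here, which is described in ref. 13; the function names, the 40 kHz sampling rate, and the 4 kHz test tone are assumptions introduced only for this example.

```python
import math


def zero_crossing_times(samples, sample_rate_hz):
    """Return interpolated times (sec) at which the signal crosses zero going upward."""
    times = []
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1], samples[i]
        if prev < 0.0 <= cur:
            # Linear interpolation between the two samples that straddle zero.
            frac = -prev / (cur - prev)
            times.append((i - 1 + frac) / sample_rate_hz)
    return times


def frequencies_from_crossings(times):
    """Treat each interval between successive upward crossings as one period."""
    return [1.0 / (t1 - t0) for t0, t1 in zip(times, times[1:])]


# Hypothetical stimulus: a 4 kHz "whistle" sampled at 40 kHz for 10 msec.
SAMPLE_RATE_HZ = 40_000
tone = [math.sin(2 * math.pi * 4000 * n / SAMPLE_RATE_HZ + 0.3) for n in range(400)]

estimates = frequencies_from_crossings(zero_crossing_times(tone, SAMPLE_RATE_HZ))
mean_estimate_hz = sum(estimates) / len(estimates)
print(f"mean frequency estimate: {mean_estimate_hz:.1f} Hz")
```

Because each period yields its own estimate, the same procedure would follow a slowly modulated tone period by period. A signal with strong harmonics would add extra crossings and defeat this simple approach.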
All songs in this report were presented at peak values adjusted to 70 decibels (dB) (relative to 20 µPa).

Clusters of neurons were recorded, to efficiently sample from a large number of cells (e.g., Fig. 1A). A simple technique was developed to quantify the strength of multineuronal responses. Beginning with the onset of a song stimulus, values representing the strength of response of neuronal activity were calculated for 500 consecutive 10-msec intervals ("bins"). Thus, sampling was continued beyond the end of a stimulus until the neuronal activity recovered to spontaneous levels. For each 10-msec bin the absolute value of the signal (i.e., signal area) was summed across 50 samples acquired at a sampling rate of 5 kHz. An average for the response strength for each bin was maintained across all the presentations of a given stimulus. The final baseline value representing the spontaneous activity was subtracted from the bin values (see Fig. 1B). The sum of all bin values throughout the duration of a song was taken as a measure of overall response strength. The duration of a song stimulus will affect this and all measures of neuronal response in a complex way depending on the relative duration of the song components that do and do not elicit excitation. For the present data, however, differences between the overall duration or duration of corresponding phrases of the bird's own song and test songs are small compared to differences in response strength. Except where noted, correction of response strength for overall duration does not alter the statistical significance of the results.

For multineuronal recordings, this analysis enjoys several advantages over counting spikes. These include obviating the need to arbitrarily choose a threshold level for discrimination of spikes, and increasing the sensitivity as a result of signal averaging.
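The response-strength measure described above can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the original analysis code: the function names are hypothetical, and the baseline is assumed to be the mean of the last few bins of the 5-sec sampling window, since the text does not specify how the spontaneous level was computed. The synthetic trace stands in for a recording.

```python
SAMPLES_PER_BIN = 50   # 10 msec at the 5 kHz sampling rate given in the text
BIN_MSEC = 10


def bin_rectified(trace):
    """Sum |signal| over each consecutive block of 50 samples (one 10-msec bin)."""
    n_bins = len(trace) // SAMPLES_PER_BIN
    return [
        sum(abs(x) for x in trace[b * SAMPLES_PER_BIN:(b + 1) * SAMPLES_PER_BIN])
        for b in range(n_bins)
    ]


def average_bins(trials):
    """Average each bin across all presentations of one stimulus."""
    binned = [bin_rectified(t) for t in trials]
    n_bins = min(len(b) for b in binned)
    return [sum(b[i] for b in binned) / len(binned) for i in range(n_bins)]


def response_strength(trials, song_msec, baseline_bins=10):
    """Baseline-subtracted bin values summed over the duration of the song.

    ASSUMPTION: the baseline is the mean of the last `baseline_bins` bins,
    taken to represent spontaneous activity after the response has recovered.
    """
    mean_bins = average_bins(trials)
    baseline = sum(mean_bins[-baseline_bins:]) / baseline_bins
    song_bins = song_msec // BIN_MSEC
    return sum(v - baseline for v in mean_bins[:song_bins])


def synthetic_trial(song_msec=2000, total_msec=5000):
    """Hypothetical trace: amplitude 3 during the song, amplitude 1 afterward."""
    n_song, n_total = song_msec * 5, total_msec * 5   # 5 samples per msec
    return [(3.0 if i < n_song else 1.0) * (1 if i % 2 == 0 else -1)
            for i in range(n_total)]


trials = [synthetic_trial() for _ in range(3)]
strength = response_strength(trials, song_msec=2000)
print(len(average_bins(trials)), strength)
```

In this synthetic case each song bin sums to 150 and each spontaneous bin to 50, so the 200 song bins give a response strength of 200 x (150 - 50) = 20,000 in arbitrary units. No spike threshold enters the calculation, which is the advantage over spike counting noted in the text.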
The primary disadvantage is that neurons closer to the electrode and those that produce larger spikes are given greater weighting; nevertheless, at any one recording site this bias is constant across stimuli.

Abbreviation: HVc, hyperstriatum ventrale (pars caudale).
*Present address: Department of Biology, Box 1137, Washington University, Lindell and Skinker Boulevards, St. Louis, MO 63110.
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

RESULTS

Many HVc neuronal clusters responded vigorously to song; at these recording sites the individual bird's own song proved to be a most effective stimulus (Fig. 1B).

As an initial test of song selectivity, four white-crowned sparrows were prepared for acute recordings. Two of the birds, R75 and Y73 (Fig. 2A), sang a song of the Bodega Bay dialect. For these birds, the responses of neuronal clusters to autogenous song were compared with the responses to three test songs (W73, G88, W91; Fig. 2C) also of the Bodega Bay dialect. The test songs were chosen as a small sample representative of the range of intradialect variation. The other two birds, Y8 and Y14, had been reared in the laboratory and exposed during the impressionable phase to a computer-synthesized song that comprised unnatural elements, in particular a trill with upward-sweeping frequency modulation (14). This tutoring paradigm results in individual variation in learning (unpublished results), and indeed Y14 formed a good copy of the song, whereas Y8 failed to learn. For these birds, the responses to autogenous song were compared with the responses to the tutor song.
At 47 recording sites throughout the HVc that responded robustly to song, a random sequence of songs was chosen; each song was then presented for 20 repetitions. Y73 and Y14 each received a single row of closely spaced (100 µm) penetrations that covered the rostrocaudal extent of the HVc, whereas R75 and Y8 received more coarsely spaced penetrations in several rows that encompassed the entire HVc. At each of the recording sites, the bird's own song consistently elicited stronger responses than the test songs (Table 1). No systematic differences in song selectivity were observed throughout the HVc. Abnormal components of song never observed in the wild (i.e., reverse frequency modulation for Y14 and a series of eight whistles of descending frequency for Y8) elicited selective responses from HVc neurons in those birds.

Although the individual bird's song elicited the maximal response, the test songs also typically elicited excitatory responses. At 46 of the 47 recording sites, the overall response to the bird's own song was excitatory; of the 95 presentations of the test songs, 6 resulted in overall inhibition, while for 4 there was essentially no response. Furthermore, the response profile to the bird's own song (e.g., response to whistle and trill but not to buzz) typically was similar to the response profile to the other songs (Fig. 1B). Thus, throughout the HVc, conspecific song is an effective stimulus, and almost always the optimal song is the individual's own.

Several experiments were conducted to verify that the selectivity for autogenous song was not capriciously generated by sampling bias. As one control, the degree of intradialect song selectivity of HVc neuronal clusters was investigated further. At two recording sites in each of three birds (R71, R77, Y85; Fig. 2B), a more stringent test of selectivity was applied.
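The P values in Table 1 are stated to come from a one-tailed sign test. As a worked illustration, the Python sketch below computes that probability from the number of recording sites at which the bird's own song gave the stronger response. The function name is hypothetical, and the sketch assumes that no comparison was tied, which the text does not state.

```python
from math import comb


def sign_test_one_tailed(n_bos_stronger, n_sites):
    """P(X >= n_bos_stronger) for X ~ Binomial(n_sites, 0.5).

    Under the null hypothesis, the bird's own song (BOS) and the test song
    are equally likely to give the stronger response at any one site.
    ASSUMPTION: ties are absent or have already been excluded from n_sites.
    """
    tail = sum(comb(n_sites, k) for k in range(n_bos_stronger, n_sites + 1))
    return tail / 2 ** n_sites


# Site counts taken from Table 1 (BOS stronger, total sites).
for label, k, n in [("R75 vs W73", 8, 11), ("R75 vs G88", 9, 11),
                    ("Y73 vs W73", 13, 13), ("Y14 vs tutor", 7, 9)]:
    print(f"{label}: P = {sign_test_one_tailed(k, n):.4f}")
```

For instance, 8 of 11 sites gives P = 232/2048, or about 0.113, and 7 of 9 sites gives P = 46/512, or about 0.09, in agreement with the corresponding entries in Table 1.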
At these sites, the responses to 10 different songs, all from the Bodega Bay dialect, were compared with the response to each bird's own song. These 10 songs were quite similar in general morphology both to each other and to each bird's song; typically these songs comprised a two-part introductory whistle, a buzz of higher center frequency, and a two-part trill, with an occasional terminal buzz. The center frequencies and detailed morphology for the different phrases in the test songs, however, spanned the range of variation found within the Bodega Bay dialect. Nevertheless, the bird's own song was commonly the most effective song (Table 2). Thus, the neuronal song selectivity observed in the HVc was not capriciously generated by use of a limited set of test songs.

FIG. 1. (A) Neuronal activity elicited by five repetitions of the bird's own song (BOS), for bird R71. The multiunit activity is strong during the whistle and trill (first and third phrases) of song and weak for the intervening buzz. (B) Response of the neuronal cluster represented in A to four songs, including BOS (R71). The baseline represents spontaneous activity. Arrows mark the offset of each phrase (see Fig. 2 for sonograms). Note that response to BOS is strongest, whereas the response to other songs, if any, is at the same phrases as response to BOS. Each response was summed over 50 repetitions of the song (1 song per 12 sec).

Table 1. Comparison between the bird's own song (BOS) and three intradialect songs for birds R75 and Y73 or between the BOS and the tutor song for birds Y8 and Y14

                         No. of recording sites
                         with stronger response to
Bird    Test song   n    BOS        Test song    P*
R75     W73         11   8 (10)     3 (1)        0.113 (0.006)
        G88         11   9          2            0.033
        W91         11   6 (11)     5 (0)        0.274 (<0.001)
Y73     W73         13   13         0            <0.001
        G88         13   13         0            <0.001
        W91         13   13 (12)    0 (1)        <0.001 (0.002)
Y8      Tutor       14   14         0            <0.001
Y14     Tutor        9   7          2            0.09
Total               95   83 (89)    12 (6)       <0.001

For values that changed when corrected for song duration (see text), corrected numbers are shown in parentheses. Note that R75 sang an unusually short song, hence the effect of correction for song duration is significant. Each song was repeated 20 times; repetition rate was 1 per 12 sec.
*Based on one-tailed sign test.

FIG. 2. Sonograms (frequency vs. time representations) of songs used to test HVc song-selectivity. (A) Birds R75 and Y73 were tested both with autogenous song and the songs in C, at multiple sites throughout the HVc (see Table 1). (B) Birds R71, R77, and Y85 were tested with autogenous song and 10 other songs, including those in C (see Table 2). (C) Three of the test songs. For A, B, and C all songs are derived from Bodega Bay dialect birds; note that all songs are similar yet vary in slight detail.

Table 2. Strength of response to test songs relative to bird's own song (BOS)

              Relative response to
Bird   Site   G77    G83    G88    O44    W73    W76    W78    W91    Y75    Y80    P*
R71    1      0.82   0.26   0.64   0.86   0.59   1.01   0.72   0.61   0.56   0.49   0.011
       2      0.75   0.29   0.78   0.28   0.28   1.06   0.27   0.21   0.17   0.16   0.011
R77    1      0.93   0.61   0.81   0.89   1.05   0.45   0.50   0.62   0.53   1.03   0.055
       2      1.20   0.24   0.53   1.07   0.67   1.55   0.76   0.36   0.51   0.52   0.172
Y85    1      1.59   0.42   0.80   0.74   0.62   0.84   0.57   0.48   0.68   0.88   0.011
       2      1.26   0.49   0.69   0.85   0.91   0.96   0.51   0.27   0.57   1.32   0.055

For six recording sites in three birds, pair-wise comparisons of the efficacy of each BOS with 10 intradialect test songs. Values <1.0 indicate that the test song elicited a weaker response than the BOS. For R71 and R77, 25 repetitions per song; for Y85, 20 repetitions per song. One song per 12 sec in all cases.
*Based on one-tailed sign test.

As a final control for sampling bias, in four white-crowned sparrows (R71, R77, Y48, Y62), 11 pairs of glass-coated platinum/iridium electrodes were chronically implanted so as to locate the tips (100 µm) within the HVc. The possibility of bias in choosing recording sites on the basis of selectivity for the bird's own song was excluded by implanting the electrodes before the birds were induced to sing and thus before their songs were known. Locations of recording sites were selected when responses to the three test songs verified the existence of robust auditory responses to song. Chronic differential recordings from pairs of electrodes (which reject noise common to both inputs, such as body-movement-induced artifactual signals from awake but restrained birds) were conducted on subsequent days.

The birds came into full song within 1 to 3 weeks after subcutaneous administration of testosterone. Three of the birds, two males and a female, developed normal song, whereas a second female sang an abnormal song, as is occasionally observed for wild-caught females. While the birds were developing song, a daily variability in response strength to the test songs was noted, suggesting fluctuation either in electrode position or in neuronal responses.
Without exception, however, after song crystallized and testing with the bird's own song commenced, all three test songs elicited weaker responses at all of the recording sites in all of the birds (a total of 60 pair-wise comparisons). Two birds survived 77 and 97 days after onset of singing, respectively. The neuronal clusters recorded at all three of the electrode pairs still functioning were optimally responsive to the individual's song.

Numerous tests were conducted to ascertain the acoustic basis for the selectivity for autogenous song. Although a detailed description is beyond the scope of this report, it was observed that reversing the temporal sequence of song, which alters the pattern of time-varying frequency and amplitude modulation while leaving the overall spectrum unaltered, systematically reduced the efficacy of the song stimulus. Other manipulations, including frequency-shifting and modification of the frequency modulation without affecting amplitude modulation, all reduced the efficacy of autogenous song (unpublished data). The results show that HVc auditory neurons are sensitive to the particular acoustic parameters in autogenous song.

DISCUSSION

The results demonstrate that adult HVc auditory neurons reflect a developmental plasticity that is molded by the individual's own song. Auditory neurons in the HVc are explicitly modified by the vocal-feedback experience and therefore are probably specified at the time of song crystallization. The location and response properties of these neurons suggest that during song crystallization they are likely candidates for selecting those motor patterns in the HVc that produce vocal feedback that matches the song template.
That is, if HVc auditory neurons in the juvenile have access both to the memory trace of the tutor song and to vocal feedback, then during song crystallization the response properties of these neurons may become fixed as fortuitous patterns of motor activity produce a match between the template and auditory feedback. In turn, the output of HVc auditory neurons might tend to stabilize those motor patterns.

What advantage does an adult white-crowned sparrow achieve by maintaining an auditory representation of his own song? Once song is crystallized, a white-crowned sparrow can maintain normal song for at least 18 months after being deafened (3). Since song maintenance does not require auditory feedback and since auditory responses in HVc are inhibited during singing (9), maintenance is an unlikely explanation for HVc auditory-response properties. These neurons do exhibit behaviorally relevant song selectivity, however. Thus, they are excellent candidates for mediating song recognition. It has been suggested that a bird may judge the distance of territorial conspecifics by comparing the transmission-induced frequency-degradation of the songs of neighbors with a memorized copy of autogenous song (15-17). If so, the HVc is a candidate site of such memory, serving as an "autogenous reference" for those aspects of conspecific song recognition that may require a reference component. The hypothesis that the song an individual learns early in life affects his perception of conspecific songs as an adult is amenable to further behavioral analysis.

We thank Drs. Nobuo Suga, Dale Purves, and James D. Miller for reviewing an early draft of the manuscript. This work was supported by a Del E. Webb fellowship (to D.M.) and a grant from the Pew Memorial Trust.

1. Thorpe, W. H. (1961) Bird Song (Cambridge Univ. Press, Cambridge, England).
2. Marler, P. (1970) J. Comp. Physiol. Psychol. 71, 1-25.
3. Konishi, M. (1965) Z. Tierpsychol. 22, 770-783.
4. Marler, P. & Tamura, M. (1962) Condor 64, 368-377.
5. Milligan, M. M. & Verner, J. (1971) Condor 73, 208-213.
6. Baker, M. C., Thompson, D. B. & Sherman, G. L. (1981) Condor 83, 265-267.
7. Kelley, D. B. & Nottebohm, F. (1979) J. Comp. Neurol. 183, 455-470.
8. Nottebohm, F., Stokes, T. M. & Leonard, C. M. (1976) J. Comp. Neurol. 165, 457-486.
9. McCasland, J. & Konishi, M. (1981) Proc. Natl. Acad. Sci. USA 78, 7815-7819.
10. Katz, L. C. & Gurney, M. E. (1981) Brain Res. 211, 192-197.
11. Paton, J. A. & Nottebohm, F. (1984) Science 225, 1046-1048.
12. Margoliash, D. (1983) J. Neurosci. 3, 1039-1057.
13. Margoliash, D. (1983) Dissertation (California Institute of Technology, Pasadena, CA).
14. Konishi, M. (1978) in Perception and Experience, eds. Walk, R. D. & Pick, H. L., Jr. (Plenum, New York), pp. 105-118.
15. Richards, D. G. (1981) Auk 98, 127-133.
16. Morton, E. S. (1982) in Acoustic Communication in Birds, eds. Kroodsma, D. E. & Miller, E. H. (Academic, New York), Vol. 1, pp. 183-212.
17. McGregor, P. K., Krebs, J. R. & Ratcliffe, L. M. (1983) Auk 100, 898-906.