The Journal of Neuroscience, June 1986, 6(6): 1643-1661

Preference for Autogenous Song by Auditory Neurons in a Song System Nucleus of the White-Crowned Sparrow

Daniel Margoliash

Division of Biology, California Institute of Technology, Pasadena, California 91125

Neuronal activity in the hyperstriatum ventrale, pars caudale (HVc) is associated with and necessary for the production of song by songbirds. HVc neurons also respond to acoustic stimuli. The present investigation assessed the auditory response properties of neurons in HVc by testing with the individual bird's own (autogenous) song and the songs of conspecific birds. Throughout HVc, multiunit clusters preferentially responded to autogenous song. Selectivity for autogenous song was apparent even when compared to similar intradialect songs, and neuronal clusters preferred autogenous song over the (tutor) song model that birds heard during the impressionable phase early in life. The responses to autogenous song were stable in the adult. HVc neurons were sensitive to the acoustic parameters of autogenous song and consistently exhibited a diminished response to modified song. In contrast, field L neurons, which are presumed to be a source of auditory input to HVc, did not exhibit selectivity for autogenous song and showed no special sensitivity to the acoustic parameters of autogenous song. These observations implicate song (motor) learning in shaping the response properties of HVc, but not field L, auditory neurons. It is proposed that HVc auditory neurons may contribute to a bird's ability to discriminate among conspecific songs by acting as an "autogenous reference" during perception of those songs.

The song of oscine passerines (songbirds) is an attractive system for investigating the neural mechanisms of learning. In many species, song shows local variation, termed dialects (e.g., Marler and Tamura, 1962).
Within each dialect, song conforms to a common pattern, although there are individual differences important for intradialect recognition (Falls, 1982). Dialects are culturally transmitted by young birds learning the songs of adult conspecifics (individuals of the same species), so that the acquisition of song is disrupted by sensory deprivation early in life. In the white-crowned sparrow (Zonotrichia leucophrys), isolation of juveniles from a conspecific song model during an impressionable phase (critical period) that closes at 50-100 d of age renders the adult songs abnormal, although these songs retain some wild-type attributes (Marler, 1970). Furthermore, deafening prior to the acquisition of song renders the adult pattern completely abnormal and unstable (Konishi, 1965). In contrast, deafening an adult white-crowned sparrow has little or no effect on the maintenance of adult song (Konishi, 1965). Thus, the motor substrates for song are fixed during the emergence ("crystallization") of adult song by comparison of vocal feedback to an internalized song reference ("template") acquired during the impressionable phase. These observations demonstrate that the genetic contribution is insufficient to specify the wild-type song of an adult white-crowned sparrow.

Received Aug. 27, 1985; revised Nov. 4, 1985; accepted Nov. 5, 1985.
I thank Dr. Mark Konishi, who provided support and encouragement throughout these experiments. The manuscript benefitted considerably from the comments of Drs. Catherine Carr, Mark Konishi, Terry Takahashi, Susan Volman, and Hermann Wagner. The 40 songs used as stimuli in these experiments will be deposited with the Cornell University Library of Natural Sounds, catalog no. 35241-35280. This work was supported by a Del E. Webb research fellowship to D.M. and NIMH Grant MH 40455 to M.K.
Correspondence should be addressed to Dr. Margoliash, Department of Anatomy, University of Chicago, 1025 E. 57th St., Chicago, IL 60637.
Copyright © 1986 Society for Neuroscience 0270-6474/86/061643-19$02.00/0

Initial insights into the neural correlates of song learning have recently been described. Neurons in the hyperstriatum ventrale, pars caudale (HVc), a telencephalic nucleus necessary for and active during song production (McCasland and Konishi, 1981; Nottebohm et al., 1976), respond to auditory stimuli (Katz and Gurney, 1981; Margoliash, 1983a, 1984; Margoliash and Konishi, 1985; McCasland and Konishi, 1981; Paton and Nottebohm, 1984). One relatively rare class of HVc auditory neurons, the "song-specific" neurons, are strictly combination-sensitive, requiring a temporal sequence of consecutive song phrases to elicit excitation (Margoliash, 1983a). Song-specific neurons exhibit intra- and interdialect song selectivity comparable to the neighbor/stranger discrimination of territorial white-crowned sparrows. The selectivity of these neurons is conferred by their specificity to the acoustic parameters of the individual bird's own (autogenous) song, and it is autogenous song that is consistently the optimal stimulus for song-specific neurons. Specificity for the fine details of an individual's song clearly suggests a role for song learning in establishing the response properties of song-specific neurons. The use of autogenous song to search for these infrequently encountered neurons, however, may have introduced a sampling bias (Margoliash, 1983a). The present study, with multiunit recordings, demonstrates that the response properties of the majority of HVc auditory neurons are also selective for autogenous song, suggesting that these properties result from some aspect of song learning. Furthermore, the response properties of field L neurons, a putative source of auditory input to HVc (Kelley and Nottebohm, 1979), do not exhibit such selectivity.
Thus, auditory response properties in the adult HVc may result from plasticity local to HVc. Brief reports of this work have appeared (Margoliash, 1984; Margoliash and Konishi, 1985).

Materials and Methods

The procedures used for recording the songs of birds, the computer techniques used to manipulate the song stimuli, and the preparation of animals for acute recordings have been described (Margoliash, 1983a, b). Briefly, birds were housed individually in sound-attenuation chambers and induced to sing by subcutaneous implantation of testosterone. An individual's song was tape-recorded and entered into a computer (PDP 11/40). The zero-crossings and amplitude envelope of the signal identified the time-varying frequency and amplitude parameters of the song, permitting accurate reproduction of the original song while facilitating its modification (Margoliash, 1983b). One to several days before the day of the experiment, a stainless-steel pin used to restrain head movements was glued to the skull of anesthetized birds (Equithesin, Jensen Salsbery). During experiments, the birds were acutely anesthetized with 20% urethane (Sigma), and body temperature was monitored intracloacally and maintained with external heating. The head was immobilized by fixing the pin, while the body, wrapped in a cloth jacket and comfortably suspended, was relatively free to move. A small fenestra in the cranium exposed the dura overlying either left or right HVc or field L. Penetrations with glass-coated platinum/iridium electrodes were systematically placed at regular intervals (typically 100 or 200 µm for HVc, 300 µm for field L). Search stimuli included tone and noise bursts as well as song, but in these experiments the use of song to quantify neuronal response properties was emphasized. At the end of an experiment, the bird was administered a lethal dose of Equithesin, exsanguinated with saline, and fixed in 10% formalin by intracardial perfusion. Electrolytic lesions were reliably observed in standard frozen sections (30 µm) stained with cresyl violet.

Figure 1. Quantification of multiunit activity. A, Five consecutive traces of multiunit response to the stimulus (C), the bird's own song. B, Averaged responses from 50 repetitions of the stimulus, termed "summed response" (see text). The baseline represents the level of activity from spontaneous activity. C, Bird's own song, a Bodega Bay dialect song, is represented as functions of frequency and amplitude vs. time. Bodega Bay songs comprise 3 distinct parts, or phrases: the introductory whistle (W), followed by a buzz (B), and trill (T). Often, as in this case, the first phrase comprises 2 distinct whistles. The trill may be further subdivided into syllables and notes. The strong activity to the first and third phrases seen in A is reflected in the histogram in B. Note also the poststimulus inhibition below baseline activity. All traces are time-aligned.

The waveforms produced by multineuronal activity were quantified by measuring the rectified area under the curve. To achieve this, multineuronal activity (e.g., Fig. 1A) was digitally sampled at 5 kHz. Once digitized, each sample was rectified; that is, the absolute value of the difference between each sample point and the value measured at quiescence was calculated. The values for each 50 consecutive samples (i.e., 10 msec) were summed together as an index of the total neuronal activity during the 10 msec bin. Digital sampling of the neuronal activity commenced with the onset of a song stimulus (e.g., Fig. 1C) and ended 5 sec later (i.e., 500 consecutive 10 msec bins were collected), to permit sampling after the recovery of the baseline background activity. (The song stimuli used were never longer than 2.8 sec.) The repetition rate was one song per 12 sec.
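The rectification and binning steps described above can be illustrated with a short sketch. This is not the original PDP 11/40 code; the function name `rectify_and_bin` and its parameters are hypothetical, and the quiescent level is assumed to be supplied by the caller as a single number.

```python
# Illustrative sketch only: names and defaults are assumptions, not the
# author's implementation. Sampling rate and bin width follow the text
# (5 kHz sampling, 50 samples per 10 msec bin).

SAMPLE_RATE_HZ = 5000
SAMPLES_PER_BIN = 50  # 50 samples at 5 kHz = 10 msec


def rectify_and_bin(samples, quiescent_level=0.0, samples_per_bin=SAMPLES_PER_BIN):
    """Rectify each sample about the quiescent level and sum within bins.

    Returns one value per complete bin; a trailing partial bin is dropped.
    """
    n_bins = len(samples) // samples_per_bin
    bins = []
    for b in range(n_bins):
        chunk = samples[b * samples_per_bin:(b + 1) * samples_per_bin]
        # Rectification: absolute deviation from the quiescent value.
        bins.append(sum(abs(s - quiescent_level) for s in chunk))
    return bins


# A 5 sec sweep at 5 kHz yields 25,000 samples, i.e., 500 bins of 10 msec.
sweep = [0.0] * (5 * SAMPLE_RATE_HZ)
print(len(rectify_and_bin(sweep)))
```

Because no spike threshold appears anywhere in this computation, the binned values depend only on the recorded waveform and the quiescent level.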
After each repetition of a song, the values for that iteration were added to a running average of bin values maintained over all stimulus presentations. This function, termed "summed response" (e.g., Fig. 1B), represented the time-varying neuronal response to the stimulus. The value expected from spontaneous activity was calculated as the average value over the last second of the summed response (Fig. 1B). A measure of the overall response strength, in arbitrary units, was calculated over the duration of the stimulus as the sum of all bin values above spontaneous baseline minus the sum of all bin values below spontaneous baseline. The relative response strength of a test song as compared to the individual bird's own song was represented as a ratio of the overall response strength of test song over bird's own song.

This technique offers several advantages over counting spikes in multineuronal activity. First, in contrast to choosing an essentially arbitrary threshold level for counting the spiking activity of a cluster of neurons, the present technique does not require a threshold to be specified. Thus, a potential source of unconscious bias introduced by the experimenter is eliminated. Furthermore, the technique is sensitive to the larger signal produced by several spikes occurring simultaneously and, as a result of averaging, can extract signals embedded in noise. To minimize the greater weighting assigned to larger spikes, the rectified value for each sample was not squared, which would have resulted in a standard measure, the rms value of the signal. In any case, the bias toward larger spikes is constant across all stimuli at a given recording site.

The strong dependency of the magnitude of the measured response on the distance between neurons and the recording surface of the electrode constitutes the primary disadvantage of this technique.
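The summed response, spontaneous baseline, and relative response strength defined above can be sketched as follows. The function names are hypothetical, and one interpretive assumption is made: "bin values above (below) spontaneous baseline" is read as the excess above (deficit below) the baseline in each bin, so that the overall strength is the net deviation from baseline over the stimulus duration.

```python
# Illustrative sketch only; names are assumptions. Each trial is a list of
# 10 msec bin values covering the 5 sec sweep (500 bins), as in the text.

BIN_MS = 10


def summed_response(trials):
    """Average the binned activity across stimulus repetitions."""
    n = len(trials)
    return [sum(column) / n for column in zip(*trials)]


def spontaneous_baseline(summed, last_ms=1000):
    """Mean bin value over the last second of the summed response."""
    tail = summed[-(last_ms // BIN_MS):]
    return sum(tail) / len(tail)


def response_strength(summed, stimulus_ms):
    """Excess above baseline minus deficit below baseline, over the stimulus."""
    baseline = spontaneous_baseline(summed)
    stim_bins = summed[:stimulus_ms // BIN_MS]
    excess = sum(v - baseline for v in stim_bins if v > baseline)
    deficit = sum(baseline - v for v in stim_bins if v < baseline)
    return excess - deficit


def relative_response(test_summed, test_ms, bos_summed, bos_ms):
    """Response to a test song as a ratio of the response to BOS."""
    return response_strength(test_summed, test_ms) / response_strength(bos_summed, bos_ms)


# Toy data: a 2 sec stimulus followed by 3 sec of spontaneous activity.
bos = [3.0] * 200 + [1.0] * 300
test = [2.0] * 200 + [1.0] * 300
print(relative_response(test, 2000, bos, 2000))
```

In this toy case the test song drives half the net excitation of the bird's own song, so the ratio falls below 1.0, the criterion used throughout the paper for a weaker response than BOS.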
To control for any fluctuations in the strength of response, the response to the reference stimulus (autogenous song; see below) was monitored every 15-30 min. In many cases, the magnitude of the response to a stimulus did not vary substantially over a period of several hours. Thus, although this technique is inappropriate when the absolute efficacy of a stimulus is to be quantified across several recording sites, it is well suited for quantification of the relative efficacy of several stimuli at a fixed recording site.

Table 1. Number of comparisons between the bird's own song (BOS) and other songs
Columns: Bird; BOS; No. sites (HVc, Field L); No. comparisons, BOS vs: Rev, TB, Frq, ΔFrq, ΔAmp, Tutor, Dialect
Birds (BOS type): R75 (BBay); Y46 (Ab); Y73 (BBay); Y85 (BBay); R71a (BBay); R77a (BBay); Y62 (BBay); Y48a (Ab); 033 (Chc); Y8 (Failed); Y14 (Rev-FM); Y87 (For-FM)
Cell values as extracted (row and column alignment not recoverable): 20 6 13 3 2 4 1 4 16 14 9 18 4 4 2 2x4 1x4 13 13 13 (10) (10) (10) (10) 4 2 3x4 2x4 8 2 6x4 4x4 2 4 2 5x4 5x4 4 2 1 1x4 1x4 (17) 13 9 9 (18) (13) (17) (18) (4 x 6) (39) (37) (34) (37) (17 x 6); 11 x 3 13 x 3 (10 x 3) 2x3 2 x 10 6x3 2 x 10 8x3 2 x 10 3x3 2x3 16 x 2 (17 x 2) 14 x 2 9 (17)
Of the 8 birds in these experiments wild-caught as adults, 6 sang normal Bodega Bay dialect songs (BBay), while 2 sang an abnormal song (Ab). The 4 other birds were hand-raised and tutored in the laboratory. Two of those learned computer-synthesized songs incorporating forward (For-FM) or reversed (Rev-FM) frequency modulation, while a third failed to learn Rev-FM song (Failed). A fourth bird chose the white-crown song when given a choice between 5 sympatric species songs and conspecific song (Chc). At multiple recording sites in HVc and/or field L, the bird's own song (BOS) was compared to reversed BOS (Rev), tone-burst (TB), and frequency-shuffled (Frq) song variants, frequency (ΔFrq) and amplitude (ΔAmp) shifted versions of BOS, as well as the songs birds were tutored with early in life (Tutor) and intradialect conspecific songs (Dialect). The data for field L are shown in brackets. For pairs of numbers, the first represents the number of recording sites, the second the number of different songs presented at each site.
a Birds with chronically implanted electrodes.

Several experiments necessitated the comparison, at a given recording site, of the overall response strength elicited by song stimuli of different durations. It should be noted that the duration of a complex stimulus will affect any overall measure of response strength in a complex way, depending on the degree of phasic and tonic components of the response and on the relative duration of the components of the stimulus that elicit excitation and inhibition. Fortunately, experimental factors served to moderate these potential problems. First, a given neuronal cluster in HVc typically exhibited a similar profile for the summed response to each intradialect song. Thus, the relative contribution of the phasic and tonic components to the overall response strength was relatively constant across the various song stimuli. Second, many of the songs used in these experiments were samples drawn from one dialect, and thus incorporated phrase components of similar durations. In this report, for all comparisons across songs of differing duration, the overall response strengths were calculated both uncorrected for song duration, as reported herein, and normalized for song duration. Although the values of relative response strength were slightly affected by the correction, the statistical significance of the results remained unaltered (with the exceptions noted).

One control experiment (see Results) required a series of recording sessions in birds carrying electrodes chronically implanted in HVc. For these, the aforementioned preparatory procedure was modified. The end of each electrode was attached in electrical contact with a small pin.
The electrode was designed to permit the placement of as many as 8 electrodes in both hemispheres' HVc. During a single surgical procedure, birds whose individual song was unknown to the experimenter were anesthetized with Equithesin. When an electrode penetration into HVc identified neurons with robust song-related auditory activity, the electrode was fixed into position with acrylic cement so that the shaft of the pin protruded from the cement. An appropriate recording site was defined as a multiunit cluster exhibiting robust excitatory responses to one of a set of the three intradialect songs (see below) to be used in subsequent chronic recording sessions. An attempt was made to position the second electrode of the pair at an appropriate site in HVc within close proximity of the first electrode (typically within 300 µm).

Subsequent to the recovery from surgery, the birds participated in a series of chronic recording sessions. A fully awake individual was restrained as described above and presented with various songs. Recordings were made with each pair of electrodes serving as inputs to a differential amplifier, thus minimizing artifactual signals induced by body movements. While the recording sites and electrodes remained viable, with these procedures it was possible to record chronically from as many as four pairs of electrodes simultaneously over a period of days and months.

Occasionally, while recording from the chronically implanted birds, and infrequently with the acutely anesthetized preparations, the distinctive bursting nature of the background activity in HVc (see Results) disappeared. Concomitant with the change in background activity, the responsiveness to auditory stimuli was compromised. These changes served as accurate physiological predictors of an ensuing brief episode of struggling in response to restraint.
Struggling was not a response to pain: The chronic recording procedure merely involved an initial slip-on connection to the firmly implanted pins. Within 15-45 sec after cessation of struggling, the characteristic background activity returned, and robust auditory responses were again evident. The fluctuation in activity suggests that the responsiveness of HVc auditory neurons is sensitive to the behavioral state of the animal. I have also previously observed (unpublished results) that activity in HVc is quite sensitive to the depth of anesthesia and the anesthetic agent. To minimize variability in the results, therefore, data collection was suspended during episodes of altered background activity.

Figure 2. Response to intradialect songs relative to response to bird's own song (BOS). A, Overall response to each of 3 sample songs: W73 (black), G88 (coarse stipple), and W91 (fine stipple). Virtually all presentations of these songs elicited weaker response (< 1.0) than BOS (n = 48 per sample song). B, Response to each phrase of sample song relative to the response to the corresponding phrase of BOS: whistle (black), buzz (coarse stipple), and trill (fine stipple). At most recording sites, all phrases of each sample song elicited a weaker response than BOS.

The data for this report were obtained from 12 birds (Table 1) induced to sing with exogenous testosterone. No systematic differences were observed in the response properties of the 5 males (033, R71, R75, R77, Y14) and the 7 females, so the data were combined for statistical purposes. For 5 acutely anesthetized animals (033, R75, Y8, Y14, Y73), the responses of 69 neuronal clusters distributed throughout HVc were quantified with respect to their selectivity for song. In 3 of the birds (033, Y14, Y73), an additional 45 clusters thus quantified were also recorded in field L, while in another bird devoted solely to field L (Y87),
30 sites were quantified. Three birds (R71, R77, Y62) that were permanently implanted with 11 pairs of electrodes in HVc prior to the induction of song survived over an extended period ranging from 94 to 130 d, during which the viable recording sites were sampled a total of 60 times. In the final acute recording session with these 3 birds, an extensive battery of 10 intradialect songs was presented. To compensate for a technical failure that resulted in the loss of the final day's data for Y62, another acutely anesthetized individual (Y85) was also presented with the 10 intradialect songs at each of two recording sites. The remaining 2 individuals (Y46, Y48) yielded little data.

Results

Response properties in HVc

Several physiological criteria helped identify the HVc: After penetrating through the overlying area parahippocampus, electrodes encroaching on HVc encountered a dramatic increase in spontaneous activity, which was highly irregular and of a distinctive, bursting nature. Throughout the HVc, many neurons exhibited clear responses to acoustic stimuli. At some locations, neurons responded only to tones of higher or (infrequently) lower frequency than present in song; at these sites, song was a poor stimulus. At other locations, no acoustic stimulus presented was effective, or all acoustic stimuli elicited only weak responses. Often, however, neurons responded to tone and bandpass-filtered noise bursts of frequencies occurring in white-crowned sparrow song (~3-6.5 kHz). For these neurons, song often elicited strong excitation.

Consistently, the individual bird's own song proved to be a most effective stimulus. Throughout HVc, responses to autogenous song shared a number of similarities, many of which are represented in Figure 1. This multiunit cluster comprised at least 4 different units (Fig. 1A).
While the largest of the units exhibited weak or no response to the song stimulus, the other units exhibited stimulus-related activity, as judged by excitation during song and inhibition following the termination of song. The summed response (Fig. 1B) for this and most other recording sites was excitatory throughout the bird's own song. Overall inhibition to autogenous song was observed at only one of the 92 sites that exhibited robust auditory responses, while inhibition of background activity after the termination of song was common, occurring at 54% (50/92) of the recording sites. The 92 presentations of autogenous song comprised a total of 301 phrases, of which only 7 phrases elicited inhibition (see Fig. 1 for terminology of components of song).

Different phrases often elicited varying amounts of excitation. For example, the multiunit cluster of Figure 1 responded strongly to the whistle and trill of the bird's own song, while giving only weak responses to the intervening buzz. In birds whose song comprised 3 or more phrases, for 22% of the recording sites (19/88) one phrase elicited at least half of the total response. That phrase was commonly (13/19) the trill, typically the longest phrase of the white-crowned sparrow's song. In 35% (31/88) of the recording sites, 2 phrases contributed to 80% or more of the response. Those 2 phrases were often the whistle and trill (25/31). The buzz typically elicited the least response, although occasionally (4/88) the buzz was the most effective phrase of autogenous song.

Selectivity for autogenous song

Six of the birds contributing to these experiments sang individual versions of the Bodega Bay, California, dialect, while 2 females from that locale produced unusual songs, with abnormal whistle and lacking either trill or buzz (Table 1). In 7 of these birds, HVc auditory neurons were tested for selectivity among different songs of the same dialect.
For each bird the efficacy of the bird's own song was compared to 3 "sample" songs (W73, G88, W91; see Fig. 4) chosen as a small set representative of the range of variation within the Bodega Bay dialect. Although all these songs were similar to each other, in 135 pairwise comparisons between autogenous song and an intradialect sample song, in only 10 did one of the sample songs elicit a stronger response than the individual's own song (Fig. 2A). Averaged across all 48 comparisons, each sample song elicited roughly half of the response to autogenous song (W73: 0.407 ± 0.644; G88: 0.476 ± 0.464; W91: 0.610 ± 0.346).

The sample songs elicited excitatory responses of a nature similar to the response to the bird's own song. In comparing responses to autogenous and sample songs, song phrases were a useful unit of song for analysis. If a phrase (e.g., whistle) from the bird's own song did not elicit excitation, neither did the corresponding phrase (e.g., whistle) from the conspecific sample songs. The 135 presentations of the sample songs comprised a total of 432 phrases. Only 10% (42/432) of these phrases elicited greater response than the corresponding phrases from the birds' own songs (Fig. 2B). Frequently, a phrase from the bird's own song elicited much stronger excitation than the corresponding phrase from a sample song. Thus, the diminished responses to the sample songs reflected a decrease in the efficacy of all phrases of conspecific song.

The degree of intradialect song selectivity of HVc multiunit clusters was further quantified. At 2 recording sites in each of 3 birds (R71, R77, Y85; see Fig. 3), the responses to 10 different songs, all from the Bodega Bay dialect, were compared with the response to autogenous song. The 10 test songs (Fig. 4) were quite similar in general morphology to each other and to the songs of the experimental subjects. Each test song comprised an initial phrase of 2 whistles of similar frequency, a rapid frequency- and amplitude-modulation buzz typically of higher mean frequency, a two- or three-part trill, and an occasional terminal buzz. These 10 test songs spanned the range of variation found within the Bodega Bay dialect (unpublished observations). Although the test songs were rather similar to autogenous song, of 60 pairwise comparisons (Table 2), for only 10 did a test song elicit a stronger response than the bird's own song (6 if compensated for song duration). Thus, the selectivity for song observed in HVc reflected genuine specificity for autogenous song, rather than being capriciously generated by use of the limited set of sample songs.

Figure 3. Songs of 3 birds (panels R71, R77, Y85). At 2 recording sites for each bird, the response to autogenous song was compared to the responses to an extensive intradialect repertoire (see Fig. 4). Note the songs are similar to each other. Frequency (ordinate) vs time (abscissa).

Stability of song selectivity

A final experiment was conducted to verify that the aforementioned song selectivity was not capriciously generated by experimenter-introduced sampling bias. Four adult birds (R71, R77, Y48, Y62) whose songs were not known to the experimenter were chosen. In these birds, 12 pairs of electrodes were implanted in HVc (see Materials and Methods). Chronic recording sessions commenced following recovery from surgery, and testosterone was subsequently implanted. As song developed, the magnitude of the response elicited by the sample songs exhibited considerable fluctuation from day to day (Fig. 5). Since the magnitude of response for the 3 songs covaried, the fluctuation was probably due to the movement of the electrode relative to nearby neurons, perhaps as a result of morphological modification as HVc responded to hormone.
After the birds came into full song, autogenous song was included as one of the test songs. In light of the significant daily fluctuation in response magnitude, it is remarkable that half of all recording sites exhibited a stronger response when first tested with autogenous song than to any of the prior presentations of the sample songs (Fig. 5). Furthermore, for all 60 pairwise comparisons of autogenous and sample songs in all birds at all recording sites over all the days of experiment, autogenous song elicited the stronger response (5 comparisons changed sign when corrected for song duration).

Figure 4. Songs of 10 Bodega Bay white-crowned sparrows representative of the range of intradialect variation. The trills are more constant between individuals than are the whistles or buzzes. The 3 sample songs (W73, G88, W91) are included. Note the songs are similar to each other.

Although the magnitude of the response to song was labile, the profile of the summed response remained stable over a period of many weeks. For 5 of the recording sites in the chronically implanted birds, neuronal activity could be recorded for at least 25 d. At 3 of these locations, neuronal clusters exhibited largely invariant responses to autogenous song, as judged by the constancy of the shape of the summed response over a period of 97 d (Fig. 6, A-C). At another site, somewhat greater variability was encountered (Fig. 6D), while the fifth site exhibited substantial fluctuation. The invariance of the profile of the summed response to autogenous song suggests that the response properties of HVc auditory neurons are not normally modified in the adult white-crowned sparrow.
Translocation of the electrode to a nearby recording site or cellular damage and recovery over a period of many weeks may reasonably explain the variability exhibited at some recording sites.

Comparison of autogenous song with tutor song

The adult, autogenous song of the white-crowned sparrow may be an accurate reproduction of the song model tutored early in life. Alternatively, a white-crowned sparrow may fail to learn the tutor song model (Konishi, 1985; Marler, 1970). In such a case, a bird sings an abnormal song wholly different from the tutor model. Recordings in 3 birds (033, Y8, Y14) were directed at the question of whether HVc auditory neurons prefer autogenous song over the songs that the birds were tutored with early in life during the impressionable phase. These 3 birds were hand-reared. Their exposure to song during the impressionable phase and their subsequent singing histories were known.

033 was tutored with a choice paradigm: five sympatric species songs, including a Lincoln sparrow song, as well as a conspecific song (designated WCSCHC). Initially, 033 sang 3 distinct songs: 2 improvised alien songs and an accurate copy of the white crown tutor song (Fig. 7; see also Fig. 9, Konishi, 1985). By the time of the present experiments, however, 033 had discarded the 2 alien songs in preference for his conspecific song. 033 received 16 penetrations into HVc spaced at 100 µm. In 15 pairwise comparisons of autogenous song and WCSCHC, the responses to the conspecific tutor song were generally quite strong (0.989 ± 0.310) and the tutor song elicited a stronger response in 6 cases. Thus, in this bird, HVc auditory neurons did not prefer autogenous song to the tutor song (p > 0.1; one-tailed sign test). In contrast, autogenous song elicited a stronger response than the alien Lincoln sparrow tutor song in all 16 pairwise comparisons, and the responses to the alien song were consistently weak (0.324 ± 0.413).
Figure 5. Stability of song selectivity. Response to the sample songs (W73, G88, W91) at 6 recording sites in 3 birds (solid lines). After the birds sang, response to autogenous song was also measured (dashed lines). On any one day, autogenous song always elicited the strongest response. Note that the strength of response for all songs covaried (see text). Response is measured in absolute, arbitrary units. Across graphs, the height of the ordinate represents the same strength of response. Abscissa: days after hormone implant.

Table 2. Pairwise comparison of the efficacy of bird’s own song (BOS) with 10 intradialect test songs

BOS    G77    G83    G88    044    W73    W76    W78    W91    Y75    Y80    p(a)
R71    0.82   0.26   0.64   0.86   0.59   1.01   0.72   0.61   0.56   0.49   0.011
       0.75   0.29   0.78   0.28   0.28   1.06   0.27   0.21   0.17   0.16   0.011
R77    0.93   0.61   0.81   0.89   1.05   0.45   0.50   0.62   0.53   1.03   0.055
       1.20   0.24   0.53   1.07   0.67   1.55   0.76   0.36   0.51   0.52   0.172
Y85    1.59   0.42   0.80   0.74   0.62   0.84   0.57   0.48   0.68   0.88   0.011
       1.26   0.49   0.69   0.85   0.91   0.96   0.51   0.27   0.57   1.32   0.055

Six recording sites in 3 birds, one site per row. Values are strength of response for test songs with respect to BOS; for values < 1.0 the test song elicited a weaker response than BOS. For R71 and R77, 25 repetitions per song; for Y85, 20 repetitions per song. One song/12 sec.
(a) One-tailed sign test.

Figure 6. Long-term stability of profile of summed response to autogenous song. Numbers are days after first test with autogenous song. Each panel represents one recording site. The shape of the response does not change in A and B, changes slightly in C, and is more variable in D. Arrowheads demark the offset of each phrase of autogenous song, which was R71 for A and B, Y62 for C, and R77 for D. Response, in arbitrary units, vs time; abscissa represents 3.0 sec.

Y14 sang a moderately accurate copy of an artificial, computer-synthesized tutor song consisting of artificial introductory tone bursts followed by a trill consisting of unnatural elements with a low to high frequency direction of frequency modulation (Fig. 8; see also Konishi, 1978). Y14’s song shared many important features with the tutor song, including “pure” whistles with minimal frequency modulation, nearly identical mean frequencies for the whistles (3.48 and 4.56 kHz for Y14 vs 3.40 kHz and 4.30 kHz for the tutor song), and reversed direction of frequency modulation in the trill. The details of the trill, however, varied significantly from the tutor model. Y14 received 8 penetrations into HVc. Of 9 HVc recording sites in Y14, for 7 the bird’s own song elicited a stronger response than the tutor song, while at the other 2 sites the tutor song elicited marginally (2 and 15%) stronger responses (p = 0.09; one-tailed sign test). Overall, the responses to the tutor song were reasonably strong (0.667 ± 0.306).

Y8 was also tutored with the reversed frequency-modulated tutor song but, failing to learn it, sang a song wholly different from the tutor song (Fig. 8). Y8 received 14 penetrations into HVc. At all 14 recording sites the bird’s own song elicited a stronger response than the tutor song (p < 0.001; one-tailed sign test). In contrast to Y14, the responses to the tutor song in Y8 were generally very weak (0.287 ± 0.336).
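The one-tailed sign test reported in Table 2 and for birds Y14 and Y8 can be illustrated with a short calculation. The sketch below is not taken from the original analysis; it assumes the test counts, at each site, how many test songs gave a relative response below 1.0 (BOS stronger) and evaluates the exact binomial tail under a null probability of 1/2. The function and variable names are hypothetical.

```python
from math import comb


def sign_test_p(n_favoring: int, n_total: int) -> float:
    """Exact one-tailed sign test: P(X >= n_favoring) for X ~ Binomial(n_total, 1/2)."""
    return sum(comb(n_total, k) for k in range(n_favoring, n_total + 1)) / 2 ** n_total


# Relative response (test song re. BOS) for the 10 test songs, one row per site (Table 2).
TABLE_2 = [
    ("R71", [0.82, 0.26, 0.64, 0.86, 0.59, 1.01, 0.72, 0.61, 0.56, 0.49]),
    ("R71", [0.75, 0.29, 0.78, 0.28, 0.28, 1.06, 0.27, 0.21, 0.17, 0.16]),
    ("R77", [0.93, 0.61, 0.81, 0.89, 1.05, 0.45, 0.50, 0.62, 0.53, 1.03]),
    ("R77", [1.20, 0.24, 0.53, 1.07, 0.67, 1.55, 0.76, 0.36, 0.51, 0.52]),
    ("Y85", [1.59, 0.42, 0.80, 0.74, 0.62, 0.84, 0.57, 0.48, 0.68, 0.88]),
    ("Y85", [1.26, 0.49, 0.69, 0.85, 0.91, 0.96, 0.51, 0.27, 0.57, 1.32]),
]


def site_p_values(table):
    """Sign-test p value per site, counting test songs weaker than BOS (ratio < 1.0)."""
    p_values = []
    for _bird, ratios in table:
        n_bos_stronger = sum(r < 1.0 for r in ratios)
        p_values.append(sign_test_p(n_bos_stronger, len(ratios)))
    return p_values


P_VALUES = site_p_values(TABLE_2)

for (bird, _ratios), p in zip(TABLE_2, P_VALUES):
    print(f"{bird}: p = {p:.3f}")

# Tutor-song comparisons from the text: Y14 (7 of 9 sites) and Y8 (14 of 14 sites).
print(f"Y14: p = {sign_test_p(7, 9):.2f}")
print(f"Y8:  p = {sign_test_p(14, 14):.5f}")
```

Under these assumptions the computed values agree, after rounding, with the p column of Table 2 and with the values quoted for Y14 and Y8. No ratio in the table equals exactly 1.0, so the handling of ties does not arise here.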
In summary, the efficacy of tutor song was measured either by overall response strength to tutor song or by the number of times tutor song elicited a stronger response than autogenous song. Autogenous song typically elicited a stronger response, and a positive correlation was observed between the similarity of autogenous and tutor songs and the efficacy of tutor song.

Lack of local differences in song selectivity in HVc

The spatial distribution of the song selectivity of responses within HVc was explored with 2 different search strategies. Three birds (033, Y14, Y73) received a series of closely spaced penetrations. For example, Y73 received 8 penetrations spaced at 100 µm in a single rostrocaudal row through HVc. In some cases, it was possible to record from 2 or more sites in HVc separated in depth by 200 µm. All 13 of the recording sites were tested with autogenous song and the set of intradialect sample songs. Without fail, at all sites (all but one song at one site if corrected for song duration) Y73’s song elicited a stronger response than each of the sample songs (Fig. 9). As was commonly observed in other birds, throughout HVc the phrases of Y73’s song that elicited strong excitation were predictive of which phrases of conspecific song, if any, would also elicit strong excitation. Furthermore, those phrases of conspecific song that elicited strong excitation at one site typically were effective at all recording sites where the corresponding phrase from the bird’s own song was also effective (Fig. 9).

In two birds (R75, Y8) penetrations were distributed throughout HVc. For example, the 11 electrode penetrations into R75 involved the entire rostrocaudal extent and all but the extreme lateral and medial aspects of HVc (Fig. 10).
The enhanced response to autogenous song as compared to the sample songs was observed throughout the nucleus. As described above, the profile of the response to the sample songs varied systematically with the response profile to autogenous song. For R75, in only 23 of 33 pairwise comparisons did autogenous song elicit a stronger response than a sample song. However, R75’s song was unusually brief (only 1695 msec) compared to 2380, 2239, and 2310 msec for the three sample songs W73, G88, and W91, respectively. With a correction for the overall duration of song, the instances of a sample song supplanting R75 as the most effective stimulus were decreased to 3 (Fig. 10). Even without this correction, the enhanced response to autogenous song is statistically significant (p = 0.0183, z = 2.09). The fact that in only 1 of 18 tests did reversed autogenous song elicit a stronger response than forward song (see below) suggests that the physiology was normal for R75 and that the unusually poor selectivity resulted from limitations of the analysis.

Figure 7. 033 was tutored with the songs of 5 sympatric species, including Lincoln sparrow (LINCOLN), as well as a conspecific model (WCSCHC). The final song the bird developed (BOS) is an accurate copy of the conspecific model, except that it lacks an initial whistle.

To date, no consistent systematic organization of the responses to song of neighboring multiunit clusters has been observed, nor has a clear tonotopic organization for HVc been discerned. During these experiments, however, it was frequently observed that all neurons contributing to the response of a multiunit cluster exhibited similar phrase selectivity. As an example, all the neurons of Figure 1A responded to the whistle and the trill, while none responded to the buzz. Twenty-five of 44 multiunit recording sites tape-recorded for off-line analysis exhibited similar activity as judged by visual inspection. These observations constitute weak evidence that auditory neurons in HVc are topographically organized.

Figure 8. Y8 and Y14 each were exposed to TUTOR song during the impressionable phase. Y14 achieved a good copy, whereas Y8 failed. The TUTOR song is an artificial, computer-synthesized model (Konishi, 1978).

Song parameters underlying selectivity for autogenous song

A series of experiments was designed to delineate the specificity for acoustic parameters of autogenous song that resulted in the aforementioned selectivity for song. The initial tests measured the efficacy of the bird’s own song as a function of changes in the overall amplitude and frequency of song. For changes in the intensity of song, individual clusters exhibited considerable variability in response strength (Fig. 11A). At 13 recording sites in 5 birds, song was presented with peak amplitude values ranging from 40 to 80 dB (re. 0.0002 dyn/cm²) in 10 dB increments. Only 2 of the resultant curves were strictly monotonic, and of the 52 pairs of points separated by 10 dB steps, 17 showed lowered response strength for the greater amplitude song. On average, the population response increased monotonically and roughly linearly with stimulus amplitude (Fig. 11B), 1.19% per dB at these moderate levels of intensity. However, only 38% of the overall variance was accounted for by the change in amplitude.

In contrast to the effect of modification of the intensity of song, changes in the overall frequency of song resulted in a systematic change in response. For 17 multiunit clusters in 5 birds, the response strength was measured while the bird’s own song was frequency shifted by ±1 kHz in 500 Hz increments.
For 16 of the 17 resultant curves, the optimal frequency was the unshifted version of song; of the 68 pairs of points separated by 500 Hz, only one showed a (3%) greater response to the song with the larger frequency shift (Fig. 12A). Thus, the averaged response exhibited a strictly monotonic decrease in strength with increasing frequency shift (Fig. 12B). It should be noted that for several of the songs used in these experiments, frequency shifts of 500 Hz leave these songs within the normal range of intradialect variation (unpublished observations).

Time-varying parameters of song

The contribution of the temporal aspects of song to the responses of the population of HVc auditory neurons was explored initially

Figure 9. Summed responses to bird’s own song (Y73) and the 3 sample songs (W91, G88, W73). The song stimuli are shown as sonagraphs, bottom traces. A systematic relationship exists between the phrases of Y73’s song that elicit excitation and the phrases of the sample songs that elicit excitation. Note that W91 elicited almost as strong a response as Y73 at [illegible] recording sites, while G88 and W73 were less effective stimuli. Twenty repetitions per histogram, baselines are 3 sec in duration.

Figure 10. Distribution of song selectivity within HVc of bird R75. Penetrations in rostrocaudal rows were made at a series of mediolateral locations, from L = 1.6 mm with respect to midline, to L = 2.4 mm with respect to midline (left column). Numerals and dashed lines indicate the location of each penetration, and fiduciary lesions are demarked by bold arrowheads. Small arrowheads delineate the ventral border of HVc; the calibration bar (bottom) is 200 µm.
The response at each penetration to the 3 sample songs (W73, G88, W91) as compared with the response to the bird’s own song (BOS) is shown in the graphs on the right side (see example, bottom right). (For this bird only, the response strength is corrected

Figure 13. Efficacy of reversed song relative to normal forward song. A, For 62 clusters in 7 birds, reversed re. forward autogenous song. Reversed song is almost always less effective (< 1.0). B, Efficacy of reversed sample songs (W73, G88, W91) re. normal sample song. These songs are also less effective backward than forward. Twenty or 50 repetitions per song.

The second song variant also was designed to manipulate the frequency modulation. For each phrase of tone-burst song (Fig. 14C), the frequency was set to the mean (sometimes mode or median) frequency of the original phrase from which it was derived. (The slight frequency modulation that persists in the buzz and trill of tone-burst song is an unavoidable result of rapid amplitude modulation.) Thus, while the amplitude and amplitude modulation of tone-burst song were identical to the original song, the frequency components and therefore the spectrum of each phrase were substantially different. As in frequency-shuffled song, the whistle and buzz were relatively spared by this manipulation as compared to the trill.

Throughout HVc, the frequency-shuffled song variant was less effective than the normal song. Of 31 presentations of frequency-shuffled song, only 3 elicited stronger responses than the response to autogenous song (Fig. 15A). The extent of this effect varied with different phrases of song. The efficacy of the trill was seriously compromised (0.245 ± 0.234, mean ± SD), while the response to the whistle was less affected (0.445 ± 0.290).
Presumably, this is a consequence of the elimination of normal parameters of trill (i.e., linear frequency modulation), while leaving the normal parameters of the whistle (i.e., frequency jitter) relatively unaffected. The response to the buzz was the least affected (0.693 ± 0.257), although the effect of frequency shuffling the buzz is intermediate between the effect on whistle and trill. However, this discrepancy may reflect the relatively minor contribution of the response to the buzz to the overall response.

A similar result was obtained when HVc neurons were presented with tone-burst song. Of 36 recording sites where tone-burst song was presented, at only 3 did it elicit stronger responses than did autogenous song (Fig. 15B). The effect was dramatic for the response to the trill (0.007 ± 0.244) and was relatively weak for the response to the whistle (0.647 ± 0.280) and the response to the buzz (0.720 ± 0.246). Given that the whistle is the first phrase of song and is a rather “pure” sound to begin with, it is remarkable that HVc multiunit clusters consistently (33/36) preferred the fine frequency modulation of the whistle over the spectrally similar tone burst.

Response properties in field L and surrounding areas

That HVc auditory responses are modified during ontogeny does not establish the site of plasticity. To address this issue, a survey of field L and surrounding areas was undertaken. Field L is a telencephalic auditory area that receives direct thalamic input (Karten, 1968). Field L also projects to the “shelf” (Kelley and Nottebohm, 1979), a cell-sparse zone ventral and medial to HVc. The dendrites of HVc neurons invade the shelf (L. Katz, unpublished observations); this putative connection is currently the only known pathway for auditory input to HVc (see Margoliash, in press, for discussion). Penetrations were made through field L and the surrounding mediocaudal neostriatum and caudal hyperstriatum in four birds (033, Y14, Y73, Y87).
Data were also collected from HVc in 3 of the birds, permitting direct comparison. 033, Y14, and Y73 received 11, 8, and 8 penetrations in HVc and 3, 5, and 3 penetrations into field L, respectively. 033 and Y14 had been raised in the laboratory; these birds were tested with the tutor songs. Y73 sang a Bodega Bay dialect song and was tested with the 3 intradialect sample songs previously described. In these 3 birds, of 78 pairwise comparisons of the responses of HVc clusters to autogenous and test songs, in only 7 cases did autogenous song elicit the weaker response. In contrast, of 81 pairwise comparisons in field L, for 49 the test songs elicited greater responses than autogenous song (Fig. 16). Thus, field L did not prefer autogenous song over other intradialect songs, nor did field L prefer autogenous song as compared with the tutor song model.

A systematic representation of frequency is the most prominent organizational feature of field L (Bonke et al., 1979; Langner et al., 1981; Leppelsack, 1981) and was consistently discerned with the multiunit recordings employed here. During the course of these experiments it was observed that this tonotopic organization extends far beyond the classical boundaries of field L to include structures both caudal and rostral to field L proper. The tonotopic organization is continuous across the lamina hyperstriatica (LH), although the sluggish response to tone bursts of neurons in the LH often makes it difficult to determine a best frequency (Müller and Leppelsack, 1985; see also Scheich et al., 1979).

The tonotopic organization of field L provided considerable insight into field L responses to frequency-shifted song. While an attempt was made to sample field L systematically in its entirety, the size of field L and surrounding auditory structures precluded a complete scan. During the most thorough scan conducted to date, Y87 survived for almost 2 d, during which 11 penetrations were made.
These penetrations were spaced 300 µm apart in 2 rows located at 900 and 1200 µm lateral to the midline and covered the full anteroposterior extent of field L. At 17 sites in field L and surrounding structures, responses were measured to autogenous song frequency-shifted ±1.5 kHz in 500 Hz increments. Summed over the entirety of field L, the responses showed no systematic preference for the unmodified song, in clear contrast with HVc (Fig. 17). In the caudal region of field L, where low frequencies are represented, song shifted downward in frequency elicited the strongest response; the opposite was the case in the rostral high-frequency region of field L. Thus, the large error bars associated with field L responses to frequency-shifted song (Fig. 17) are a simple consequence of the underlying tonotopic organization.

Figure 14. A, Normal song of bird Y73. B, Frequency-shuffled version of the song (see text). Note that each phrase has approximately the same range of frequencies and that the amplitude modulation is identical to that in A. C, Tone-burst version of the song (see text). Amplitude modulation is identical to that in A, while spectrum has changed. Abscissa: time (msec).

For 3 birds, the relative efficacy of forward and reversed song was also measured. Of the 60 recording sites, 22 exhibited greater responses to reversed song (Fig. 18). In clear contrast to HVc, the overall response to reversed song in field L was quite strong (0.969 ± 0.654) and at best a slight preference for the forward song emerged (p > 0.02, z = -1.94). Not infrequently a phrase of reversed song elicited a stronger response than the corresponding phrase of forward song. This was rarely observed in HVc recordings.
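Where the text quotes a z value alongside a sign-test count, the numbers are consistent with a normal approximation to the binomial with a continuity correction. The sketch below rests on that assumption, which is an inference from the reported values and is not a method stated in the text; the function names are hypothetical. It uses the 22 of 60 field L sites that favored reversed song and, for comparison, the 23 of 33 comparisons that favored autogenous song in bird R75.

```python
from math import erfc, sqrt


def sign_test_z(k: int, n: int) -> float:
    """z score for k of n outcomes under a 50:50 null, with continuity correction."""
    deviation = k - n / 2.0
    # Continuity correction: move the count half a unit toward the null mean.
    if deviation > 0:
        deviation -= 0.5
    elif deviation < 0:
        deviation += 0.5
    return deviation / (sqrt(n) / 2.0)


def upper_tail_p(z: float) -> float:
    """One-tailed probability P(Z >= z) for a standard normal variable."""
    return 0.5 * erfc(z / sqrt(2.0))


# Field L: 22 of 60 sites responded more strongly to reversed song.
z_field_l = sign_test_z(22, 60)
# Equivalent view: 38 of 60 sites favored forward song.
p_field_l = upper_tail_p(sign_test_z(38, 60))

# HVc, bird R75: autogenous song was stronger in 23 of 33 comparisons.
z_r75 = sign_test_z(23, 33)
p_r75 = upper_tail_p(z_r75)

print(f"field L: z = {z_field_l:.2f}, one-tailed p = {p_field_l:.3f}")
print(f"R75 HVc: z = {z_r75:.2f}, one-tailed p = {p_r75:.3f}")
```

With these inputs the z scores round to -1.94 and 2.09, matching the values quoted for field L and for R75. The one-tailed probability for field L stays above 0.02, in line with the reported "p > 0.02".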
As may be predicted by the efficacy of reversed autogenous song, field L did not show specificity for the frequency modulation within the bird’s own song. Of 65 presentations of frequency-shuffled song (Fig. 19A), 36 elicited stronger responses than autogenous song. Thus, field L did not systematically prefer autogenous song over frequency-shuffled song (p > 0.02, z = 0.74). Similarly, field L did not prefer autogenous song over tone-burst song: 22 of 61 field L clusters responded more vigorously to tone-burst song as compared with autogenous song (Fig. 19B; p > 0.02, z = -2.05). In fact, over the extent of field L investigated, the frequency-shuffled song elicited stronger responses than autogenous song (1.199 ± 1.296) while the responses to the tone-burst song variant were roughly equal to the responses to autogenous song (1.052 ± 0.786).

By themselves, these data do not convey the full magnitude of the differences encountered when comparing field L with HVc. Although very weak responses were often encountered for one or more of the sample songs, the tutor songs, the frequency-shuffled song, the tone-burst song, or reversed song, responses 2, 3, or even 4 times stronger than the response to the bird’s own song were also observed. In hundreds of comparisons in HVc, this was never observed. In field L, the bird’s own song rarely emerged as the optimal stimulus when the complete battery of tests was performed; in contradistinction, this was the rule, not the exception, in HVc.

Discussion

HVc auditory response properties are shaped by autogenous song

The present results demonstrate that auditory neurons in HVc are selective for autogenous (self-produced) song. Throughout HVc, multiunit sites exhibit stronger responses to autogenous song than to a wide variety of conspecific songs (i.e., songs of other individuals of the same species).
The efficacy of autogenous song is apparent even when compared to a repertoire of similar songs from the same dialect area. Thus, HVc auditory neurons are sensitive to slight differences between autogenous song and the songs of conspecifics. Although these results are based on multiunit recordings, and are biased toward clusters exhibiting robust song-related activity, the consistency of the results suggests that the majority of HVc auditory neurons exhibit the selectivity for autogenous song.

Figure 15. Response to song variants. A, Response to frequency-shuffled song relative to bird’s own song (BOS) at 31 multiunit clusters in 7 birds. B, Response to tone-burst song re. BOS at 36 recording sites in 7 birds. At most recording sites the song variants were less effective (< 1.0) than BOS. Twenty repetitions per song.

The song selectivity of HVc auditory neurons is conferred by their sensitivity to the idiosyncratic acoustic parameters of autogenous song. This is evidenced by the systematic degradation of the efficacy of autogenous song on manipulation of its acoustic parameters. For example, as the overall frequency of autogenous song is increased, the strength of the response to song decreases monotonically. Nevertheless, the sensitivity of HVc auditory neurons to autogenous song is not based solely on a simple specificity to static or overall acoustic parameters (see also Margoliash, 1983a). Reversing the song, which does not modify static parameters, consistently degrades the stimulus efficacy. Furthermore, when the dynamic (time-varying) frequency, but not amplitude, components of autogenous song are modified, the stimulus efficacy is also degraded. This is evident even with a song variant that approximates the spectral properties of each phrase of autogenous song.
Thus, the dynamic frequency components of each phrase of song are important parameters for HVc auditory neurons.

The sensitivity of HVc auditory neurons to the direction of song and to the frequency modulation in the whistle, buzz, and trill song phrases are examples of temporal facilitation. A pulsed tone-burst song variant with the same amplitude modulation as autogenous song systematically elicits weaker responses. Clearly, quantification of HVc responses to tone bursts is inadequate to describe the response properties of the majority of HVc neurons. Had these experiments relied on the classical tone-burst paradigm to the exclusion of testing with behaviorally relevant stimuli, a critical aspect of HVc auditory response properties would have escaped notice.

Figure 16. Comparison of song selectivity in field L and HVc within the same birds. The strength of response of various test songs in 3 birds relative to the strength of response of the bird’s own song (BOS). The test songs include tutor songs as well as intradialect conspecific songs (see text). Note that in HVc the responses to the test songs are weaker (< 1.0) than the response to BOS, while field L does not exhibit any systematic selectivity. In the histograms, all values <0.0 or >2.0 are collapsed.

Previously, a limited population of auditory neurons in HVc that exhibit temporal facilitation, “song-specific” neurons, have been described (Margoliash, 1983a). Song-specific neurons also exhibit selectivity for autogenous song. Nevertheless, important aspects of the response properties of song-specific neurons differ from the larger population of HVc auditory neurons.
Song-specific neurons (1) exhibit greater song selectivity; (2) do not respond or respond very weakly to single tone bursts; (3) are strictly song phrase “combination sensitive” (Suga et al., 1978), that is, they respond well to a sequence of two consecutive song phrases but not to the individual phrases; (4) do not respond to nonsequential phrase sequences, reversed song, or at multiple phrases in a song; and (5) exhibit phasic response properties, responding with a short burst of activity after the onset of the second phrase in the sequence. These patterns of activity contrast with the present results. Many multineuronal clusters in HVc respond to tone bursts with varying degrees of strength, and they respond vigorously to individual phrases, typically with a tonic (sustained) component. HVc multiunits also respond to nonsequential phrases (e.g., response to whistle and trill but not to the intervening buzz), to reversed song (albeit less strongly than to normal song), and at several phrases in a song. It is unlikely, therefore, that the present results reflect the contribution of the relatively scarce song-specific neurons. However, these data are consistent with the hypothesis that song-specific neurons derive their response properties from circuits local to HVc (Margoliash, 1983a).

Figure 17. Field L responses to changes in overall frequency of autogenous song. Note flat response and large error bars (see text). Twenty-one recording sites in 2 birds. The corresponding data for HVc (Fig. 12B) are shown for comparison. Abscissa: frequency shift (Hz).

Developmental plasticity of HVc auditory neurons

A plethora of behavioral observations have established that the adult song of the white-crowned sparrow reflects a learning process.
Young white-crowned sparrows of the Nuttallii race can learn to sing interdialect songs (Marler, 1970), computer-synthesized songs (Konishi, 1978), the songs of sympatric species (Konishi, 1985), and even the songs of allopatric species (Baptista and Petrinovich, 1984). Birds reared in acoustic isolation develop abnormal songs (Marler, 1970), as do birds deafened early in life (Konishi, 1965). These observations demonstrate that the action of innate mechanisms is insufficient to generate normal songs. Instead, many parameters of an individual’s song, especially the fine details of the notes and syllables (Marler and Sherman, 1985), are acquired by learning. HVc neurons are “tuned” to those parameters, so that in contrast to the range of songs that can form appropriate tutor models, the responses of HVc auditory neurons are selective for a single song, the bird’s own. Hence, HVc auditory neurons are specifically modified by some aspect of the song-learning experience.

When is this selectivity established? The profound effects of deafening on song development (Konishi, 1965) demonstrate that a bird cannot know how to sing until he practices. The details of an individual’s song are established by a process of overproduction and attrition, improvisation and invention, as well as copying (Marler and Peters, 1981, 1982). Thus, the process of song crystallization is the earliest time during development when the adult properties of HVc auditory neurons can be specified. Indeed, autogenous song elicits stronger responses than the song tutored earlier in life during the impressionable phase (see also Margoliash, 1983a). The preference in HVc for autogenous song is independent of whether a bird accepts or fails to accept the tutor model, demonstrating that the adult auditory response properties of HVc are not specified by the early exposure to the tutor model.
Furthermore, the present results demonstrate the stability of adult HVc auditory response properties over a period of months. If white-crowned sparrow HVc auditory neurons are specified by song, do not exhibit plasticity in the adult, and are specified after exposure to the tutor song model, then they must be specified during song crystallization.

Specification of the response properties of auditory neurons in HVc may be coincident with the establishment of the adult motor pattern for song, suggesting a role for these neurons during the development of song. If neurons in HVc have access to the template, then while a bird practices singing, auditory neurons in HVc may be activated as fortuitous patterns of motor activity induce auditory feedback that matches the input from the template. In turn, activity in these neurons may tend to stabilize the ongoing motor activity. As the appropriate motor patterns are reinforced, perhaps the inhibition of auditory responsiveness during singing (McCasland and Konishi, 1981) emerges, emancipating singing from auditory feedback (see Margoliash, in press).

Figure 18. Effect of reversing autogenous song on responses of field L clusters. Only a slight preference for forward song (38/60) is present. Sixty recording sites in 3 birds. The corresponding data for HVc (Fig. 13A) are shown for comparison.

Site of plasticity: comparison with field L

The present results demonstrate that neurons in field L, which project to the shelf and may be a source of auditory input to HVc (Kelley and Nottebohm, 1979; L. Katz, unpublished observations; see Margoliash, in press), lack selectivity for autogenous song and do not exhibit specificity for the acoustic parameters of autogenous song. In particular, for field L recordings the relative response strength to frequency-shifted versions of autogenous song can largely be predicted from the underlying tonotopic organization of field L. This indicates that an interesting transformation of information occurs between the primary auditory telencephalon and HVc.

Figure 19. Field L responses to frequency-shuffled song variant (A) and tone-burst song variant (B). The corresponding data for HVc (Fig. 15) are shown for comparison. All responses are measured re. response to bird’s own song (BOS). In contrast to HVc, many sites in field L respond more vigorously (> 1.0) to the song variants than to BOS. All values ≤0.0 or ≥2.0 are collapsed.

These differences suggest HVc, but not field L, as a site of plasticity of auditory neurons associated with song (motor) learning. The response properties and location of those field L neurons that project to the shelf have yet to be determined, however. In this regard, it is a significant caveat that the present experiments relied on multiunit recordings and that the full extent of field L was not mapped. Nevertheless, the striking differences between field L and HVc demonstrate that auditory neurons in these two areas are shaped by different developmental processes. Indeed, it has been reported that the response properties of some field L neurons are influenced by exposure to song during the impressionable phase (Leppelsack, 1983). In that experiment, however, the effects of song exposure during the impressionable phase and during adult singing were not distinguished.

HVc as an “autogenous reference”

What is the behavioral significance of auditory responses in the adult HVc? The reproductive success of the white-crowned sparrow is likely to be dependent on the ability to distinguish among a set of rather similar conspecific songs.
The song of the white-crowned sparrow varies within and across dialect and subspecific boundaries (Baptista, 1975; Baptista and King, 1980; Marler and Tamura, 1962). Territorial males can distinguish neighbors from strangers solely on the basis of song, and the songs of strangers from without the dialect elicit distinctly different responses than do the songs of strangers from within the dialect (Baker et al., 1981a; Milligan and Verner, 1971). Female white-crowned sparrows distinguish between songs from their home dialect and from songs of other dialects (Baker et al., 1981b), but whether they mate assortatively with males singing home dialect songs remains controversial (Tomback and Baker, 1984; cf. Petrinovich and Baptista, 1984).

The selectivity for song observed in HVc suffices to distinguish among inter- and intradialect songs, suggesting that HVc contributes to that discrimination. If so, then the representation of autogenous song contributes to song recognition. Certain behavioral observations lend support to this notion. Although the strength of response elicited when territorial males are exposed to playback of autogenous song is intermediate between their responses to neighbors’ and strangers’ songs (Falls, 1982), the degree of response to a stranger’s song may depend on the similarity of that song to the bird’s own (McArthur, 1985). Furthermore, birds are known to distinguish between the songs of neighbors broadcast from within or outside of their territory (Richards, 1981). This discrimination, which requires a determination of the distance from the speaker to the responding bird, is thought to be based on the increasing degradation of song resulting from increasing transmission distances through the acoustic habitat. The proposal that a bird utilizes an internal reference of song to assess the degradation of the songs of neighboring conspecifics (Morton, 1982) has enjoyed some experimental support (McGregor et al., 1983).
I propose that HVc contributes to the discrimination of the songs of conspecifics by providing a reference component (see also Margoliash, in press). In such a scheme, differences between conspecific and autogenous song would be memorized, forming the parametric basis for song recognition. This does not imply that conspecific songs similar to autogenous song need have special behavioral significance. This proposal is a form of a “motor theory” of song perception (Margoliash, 1985) and, like the “motor theory” of speech perception (Liberman et al., 1967), suggests a linkage between production and perception. Unlike another recent hypothesis of song perception (Williams and Nottebohm, 1985), the current proposal does not require activity in brain-stem motor nuclei as an integral part of the perceptual process. For both theories, the possible contribution of HVc to song recognition in females that exhibit a reduced song system and experience little or no singing (e.g., zebra finch) remains unresolved.
The stable representation of autogenous song in the adult white-crowned sparrow’s HVc suggests that the site of plasticity associated with adult song recognition is other than HVc. It is difficult to reconcile these observations with the proposal that plasticity in HVc, perhaps mediated by neurogenesis, contributes to conspecific song recognition in canaries (Nottebohm, 1984). To what extent autogenous song is represented in the HVc of other species, however, has yet to be explored satisfactorily. It should also be noted that in some species, such as the canary, adults acquire new song elements seasonally. The present results suggest that in these species HVc auditory neurons may exhibit plasticity as song changes throughout adulthood.
References
Baker, M. C., D. B. Thompson, and G. L. Sherman (1981a) Neighbor/stranger song discrimination in white-crowned sparrows. Condor 83: 265-267.
Baker, M. C., K. J. Spitler-Nabors, and D. C.
Bradley (1981b) Early experience determines song dialect responsiveness of female sparrows. Science 214: 819-821.
Baptista, L. F. (1975) Song dialects and demes in sedentary populations of the white-crowned sparrow (Zonotrichia leucophrys nuttalli). Univ. Calif. Publ. Zool. 105: 1-52.
Baptista, L. F., and J. R. King (1980) Geographical variation in song and song dialects of montane white-crowned sparrows. Condor 82: 267-284.
Baptista, L. F., and L. Petrinovich (1984) Social interaction, sensitive phases and the song template hypothesis in the white-crowned sparrow. Anim. Behav. 32: 172-181.
Bonke, D., H. Scheich, and G. Langner (1979) Responsiveness of units in the auditory neostriatum of the guinea fowl (Numida meleagris) to species-specific calls and synthetic stimuli. I. Tonotopy and functional zones. J. Comp. Physiol. 132: 243-255.
Falls, J. B. (1982) Individual recognition by sound in birds. In Acoustic Communication in Birds, Vol. 2, Song Learning and Its Consequences, D. E. Kroodsma and E. H. Miller, eds., pp. 237-278, Academic, New York.
Karten, H. (1968) The ascending auditory pathway in the pigeon (Columba livia). II. Telencephalic projections of the nucleus ovoidalis thalami. Brain Res. 11: 134-153.
Katz, L. C., and M. E. Gurney (1981) Auditory responses in the zebra finch’s motor system for song. Brain Res. 211: 192-197.
Kelley, D. B., and F. Nottebohm (1979) Projections of a telencephalic auditory nucleus - field L - in the canary. J. Comp. Neurol. 183: 455-470.
Konishi, M. (1965) The role of auditory feedback in the control of vocalization in the white-crowned sparrow. Z. Tierpsychol. 22: 770-783.
Konishi, M. (1978) Auditory environment and vocal development in birds. In Perception and Experience, R. D. Walk and H. L. Pick, Jr., eds., pp. 105-118, Plenum, New York.
Konishi, M. (1985) Birdsong: From behavior to neuron. Annu. Rev. Neurosci. 8: 125-170.
Langner, G., D. Bonke, and H.
Scheich (1981) Neuronal discrimination of natural and synthetic vowels in field L of trained mynah birds. Exp. Brain Res. 43: 11-24.
Leppelsack, H.-J. (1981) Einfluß von Gesangslernen auf das Antwortverhalten auditorischer Vorderhirnneuronen eines Singvogels. Habilitationsschrift, Ruhr-Universität Bochum, Federal Republic of Germany.
Leppelsack, H.-J. (1983) Analysis of song in the auditory pathway of song birds. In Advances in Vertebrate Neuroethology, J. P. Ewert, R. R. Capranica, and D. J. Ingle, eds., pp. 783-799, Plenum, New York.
Liberman, A. M., F. S. Cooper, D. P. Shankweiler, and M. Studdert-Kennedy (1967) Perception of the speech code. Psychol. Rev. 74: 431-461.
Margoliash, D. (1983a) Acoustic parameters underlying the responses of song-specific neurons in the white-crowned sparrow. J. Neurosci. 3: 1039-1057.
Margoliash, D. (1983b) Songbirds, grandmothers, and templates: A neuroethological approach. Ph.D. thesis, California Institute of Technology, Pasadena.
Margoliash, D. (1984) An auditory representation of the bird’s own song in the adult HVc of white-crowned sparrows. Soc. Neurosci. Abstr. 10: 1022.
Margoliash, D. (1985) An auditory representation of the individual bird’s own song: Evidence for a motor theory of song perception? ARO Ab. 8: 147-148.
Margoliash, D. (in press) Neural plasticity in birdsong learning. In Imprinting and Cortical Plasticity, J. P. Rauschecker and P. Marler, eds., Wiley, New York.
Margoliash, D., and M. Konishi (1985) Auditory representation of autogenous song in the song-system of white-crowned sparrows. Proc. Natl. Acad. Sci. USA 82: 5997-6000.
Marler, P. (1970) A comparative approach to vocal learning: Song development in white-crowned sparrows. J. Comp. Physiol. Psychol., Pt. 2 71(2): 1-25.
Marler, P., and S. Peters (1981) Sparrows learn adult song and more from memory. Science 213: 780-782.
Marler, P., and S.
Peters (1982) Developmental overproduction and selective attrition: New processes in the epigenesis of birdsong. Dev. Psychobiol. 15: 369-378.
Marler, P., and V. Sherman (1985) Innate differences in singing behaviour of sparrows reared in isolation from adult conspecific song. Anim. Behav. 33: 57-71.
Marler, P., and M. Tamura (1962) Song variation in three populations of white-crowned sparrows. Condor 64: 368-377.
McArthur, P. (1986) Similarity of playback songs to self song as a determinant of response strength in song sparrows (Melospiza melodia). Anim. Behav. 34: 199-207.
McCasland, J., and M. Konishi (1981) Interaction between auditory and motor activities in an avian song control nucleus. Proc. Natl. Acad. Sci. USA 78: 7815-7819.
McGregor, P. K., J. R. Krebs, and L. M. Ratcliffe (1983) The reaction of great tits (Parus major) to playback of degraded and undegraded songs: The effect of familiarity with the stimulus song type. Auk 100: 898-906.
Milligan, M. M., and J. Verner (1971) Inter-populational song dialect discrimination in the white-crowned sparrow. Condor 73: 77-80.
Morton, E. S. (1982) Grading, discreteness, redundancy, and motivation-structural rules. In Acoustic Communication in Birds, Vol. 1, Production, Perception, and Design Features of Sounds, D. E. Kroodsma and E. H. Miller, eds., pp. 183-213, Academic, New York.
Müller, C., and H.-J. Leppelsack (1985) Feature extraction and tonotopic organization in the avian forebrain. Exp. Brain Res. 59: 587-599.
Nottebohm, F. (1984) Birdsong as a model in which to study brain processes related to learning. Condor 86: 227-236.
Nottebohm, F., T. M. Stokes, and C. M. Leonard (1976) Central control of song in the canary, Serinus canarius. J. Comp. Neurol. 165: 457-486.
Paton, J. A., and F. Nottebohm (1984) Neurons born in adult brain are recruited into functional circuits. Science 225: 1046-1048.
Petrinovich, L., and L. F.
Baptista (1984) Song dialects, mate selection, and breeding success in white-crowned sparrows. Anim. Behav. 32: 1078-1088.
Richards, D. G. (1981) Estimation of distance of singing conspecifics by the Carolina Wren. Auk 98: 127-133.
Scheich, H., B. A. Bonke, D. Bonke, and G. Langner (1979) Functional organization of some auditory nuclei in the guinea fowl demonstrated by the 2-deoxyglucose technique. Cell Tissue Res. 204: 17-27.
Suga, N., W. E. O’Neill, and T. Manabe (1978) Cortical neurons sensitive to combinations of information-bearing elements of biosonar signals in the mustache bat. Science 200: 778-781.
Tomback, D. F., and M. C. Baker (1984) Assortative mating by white-crowned sparrows at song dialect boundaries. Anim. Behav. 32: 465-469.
Williams, H., and F. Nottebohm (1985) Auditory responses in avian vocal motor neurons: A motor theory for song perception in birds. Science 229: 279-282.