Royal Society Publishing

A review of the generalization of auditory learning

Beverly A Wright, Yuxuan Zhang


The ability to detect and discriminate attributes of sounds improves with practice. Determining how such auditory learning generalizes to stimuli and tasks that are not encountered during training can guide the development of training regimens used to improve hearing abilities in particular populations as well as provide insight into the neural mechanisms mediating auditory performance. Here we review the newly emerging literature on the generalization of auditory learning, focusing on behavioural investigations of generalization on basic auditory tasks in human listeners. The review reveals a variety of generalization patterns across different trained tasks that cannot be summarized with a simple rule, and a diversity of views about the definition, evaluation and interpretation of generalization.


1. Introduction

Performance on many auditory perceptual skills improves with practice. The purpose of the present review is to examine the extent to which this learning generalizes to performance on stimuli and tasks that were not encountered during training. Determining such generalization patterns is of interest for both practical and theoretical reasons. On the practical side, this information can guide the development of training regimens that provide the desired improvement most efficiently, while on the theoretical side, it can be used to gain insight into the neural processes affected by the training.

In comparison with investigations in the visual system, the examination of learning, and especially of generalization, in the auditory system is still in its infancy. In traditional psychoacoustic testing, training has been used simply as a means to reach asymptotic performance. The learning, let alone the generalization that occurred as the result of this training, was rarely reported. More recently, there has been an increase in interest in the influence of auditory training itself. This interest is simultaneously inspired by the richness of the learning literature in the visual system, the potential application of auditory training in the treatment of communication disorders, and the accumulating evidence of neural plasticity in sensory systems. However, while there are several reviews of human auditory learning (Watson 1980; Irvine & Wright 2005; Wright & Zhang 2006, in press), none has emphasized the generalization of that learning.

We restrict our review to behavioural investigations of generalization on basic auditory tasks in human listeners. This literature includes examinations of the generalization of learning on frequency-discrimination, temporal-judgement, spatial-hearing, and signal-detection and intensity-discrimination tasks. The majority of these investigations employed multiple-session training. Those that used single-session training are described in a separate section. We exclude investigations employing musical and speech stimuli. We consider generalization of learning to untrained stimuli (stimulus generalization), to judgments along untrained stimulus dimensions (task generalization), and to untrained testing procedures. We also include generalization of learning between the auditory and other sensory modalities or the motor system. For simplicity, we describe only those aspects of the untrained conditions that differed from the trained one.

In the literature reviewed here, we consider evidence of generalization obtained with four different paradigms. (i) Training was conducted on a single condition for each group of trained listeners, preceded and followed by tests on a variety of conditions; generalization was indicated by improvement on an untrained condition between the pre- and post-training tests (e.g. Wright et al. 1997). (ii) Same as (i), but with a control group who participated in the pre- and post-training tests, but not in the intervening training phase; generalization was indicated by more learning by trained listeners than controls on untrained conditions (e.g. Mossbridge et al. 2006). (iii) Same as (i), but with no pre-training test; generalization was indicated by similar post-training performance on the trained and untrained conditions, under the assumption that the pre-training performance on all conditions would have been equal (e.g. Amitay et al. 2005). (iv) Two sequential training phases each on a different condition; generalization from the first to the second trained condition was indicated by less learning on the second condition than the first (e.g. Demany & Semal 2002).
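The logic of the four paradigms above can be made concrete with a minimal sketch. It assumes that performance is summarized as one threshold per listener, with lower values meaning better performance; the function names and the simple difference-of-means indices are illustrative assumptions, not the analyses used in the cited studies, which relied on their own statistical tests.

```python
from statistics import mean

def improvement(pre, post):
    """Mean pre-minus-post threshold change across listeners.
    Assumes lower thresholds are better, so positive values mean improvement."""
    return mean(p - q for p, q in zip(pre, post))

def paradigm_i(pre_untrained, post_untrained):
    # (i) Improvement of trained listeners on an untrained condition.
    return improvement(pre_untrained, post_untrained)

def paradigm_ii(trained_pre, trained_post, control_pre, control_post):
    # (ii) Improvement of trained listeners on an untrained condition,
    # over and above that of controls who took only the two tests.
    return improvement(trained_pre, trained_post) - improvement(control_pre, control_post)

def paradigm_iii(post_trained, post_untrained):
    # (iii) Post-training gap between untrained and trained conditions.
    # A gap near zero suggests generalization only under the assumption
    # that pre-training performance was equal on the two conditions.
    return mean(post_untrained) - mean(post_trained)

def paradigm_iv(first_phase_learning, second_phase_learning):
    # (iv) Reduction in the amount of learning from the first to the second
    # training phase; a positive value suggests generalization.
    return first_phase_learning - second_phase_learning
```

Each function returns a descriptive index only. Deciding whether an index differs reliably from zero would require an inferential test chosen to suit the experimental design.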

2. Investigations employing multiple-session training

(a) Frequency discrimination

Generalization of learning has been tested most extensively on judgments about the spectral attributes of sounds, including pure-tone frequency discrimination and fundamental-frequency discrimination.

(i) Pure-tone frequency discrimination

For frequency discrimination with pure tones, generalization has been examined across stimulus attributes, ear, presentation style (fixed versus roving standard) and task. The experiments described below assessed the ability of listeners to discriminate small differences in the frequency of pure tones. Except where otherwise noted, training consisted of 750–1200 trials per day for 4–12 days, with a 100–300 ms standard tone. Generalization was evaluated by a variety of methods.

Generalization across frequency

It appears that learning on frequency discrimination generalizes to untrained frequencies, but that this generalization is rarely complete. In a number of experiments, listeners who were trained at one frequency also improved at untrained frequencies. Such generalization has been observed among 0.75, 1.5, 3 and 6 kHz (n=8 per group; Delhommeau et al. 2005), between 5 (n=8) and 8 kHz (n=8) (Irvine et al. 2000), from 1 to 1.1 and 2 kHz (n=5, 350 trials per session; Roth et al. 2003), from 1 to 0.5, 2 and 4 kHz (n=12; Amitay et al. 2005), and from 3 to 1.2 and 6.5 kHz (n=8; Demany & Semal 2002). However, in nearly every case, there was some indication that this generalization was not complete. The lack of complete generalization was indicated by three result patterns. In one case, there was greater improvement at a target frequency in listeners who were trained at that frequency than in those who were trained at a different frequency (Irvine et al. 2000). In another, the learning curves obtained during a second training phase were steeper for previously untrained than for previously trained frequencies (Demany & Semal 2002). In a third instance, thresholds were higher on the first (though not subsequent) estimates on the post-training test for untrained than trained frequencies (Delhommeau et al. 2005). Less improvement also was observed at the untrained than the trained frequencies in another investigation, but the authors attributed this difference to the fact that the trained frequency was always tested first (Roth et al. 2003). In a different study, listeners who were trained at either 88, 250 or 1605 Hz showed similar improvement across the three frequencies (Grimault et al. 2003). However, it is unclear whether the improvement at each frequency resulted from training at that frequency or from generalization from other frequencies, because the data of all three trained groups were combined in the analyses. The authors themselves suggest that the similar improvement across frequencies was induced by exposure to each frequency in the pre-training test rather than by generalization across frequencies during training, because performance on the trained conditions improved rapidly within the first training session (3150 trials), but not thereafter.

An indication of across-frequency generalization has also been observed following training with a standard that roved in frequency trial by trial. Amitay et al. (2005) trained listeners with a standard that was fixed at 1 kHz (n=12), or that roved between either 0.9 and 1.1 kHz, including 1 kHz (n=12), or between 0.57 and 2.15 kHz, excluding 1 kHz (n=15). Although they did not test performance at all frequencies prior to training, there was no difference in post-training thresholds at 0.5, 1, 2 and 4 kHz for each listener group, suggesting that learning generalized across frequencies.

In contrast to these reports, we have observed some evidence for frequency-specific learning on frequency discrimination (Wright & Fitzgerald 2005, plus subsequently collected data). We trained listeners for 900 trials per day for 10 days on frequency discrimination using a standard consisting of two brief 1 kHz tone pips whose onsets were separated by 100 ms (n=8). Control listeners who received no training (n=10) improved between the pre- and post-training tests at 4 kHz but not 1 kHz, suggesting frequency-specific rapid learning induced by the pre-training test itself. In addition, the trained listeners improved more than controls at 1 kHz, but only as much as the controls at 4 kHz, suggesting frequency-specific learning induced by the multiple-hour training. It is not clear whether these results differ from those described above due to the use of brief tone pips rather than a longer continuous tone as the standard stimulus or because the improvement of the trained listeners was evaluated relative to that of controls.

Generalization across stimulus duration

There is also some indication that frequency-discrimination learning generalizes at least partially to untrained stimulus durations and to stimuli with untrained temporal intervals. Frequency-discrimination training with one standard stimulus yielded pre- to post-training improvements in performance for stimuli with untrained durations (200 ms trained versus 40 and 100 ms untrained; n=10; Delhommeau et al. 2002) or an untrained temporal interval (100 ms trained versus 50 ms untrained; Wright & Fitzgerald 2005, plus subsequently collected data), indicating generalization. However, Delhommeau et al. (2002) reported that while the magnitude of the improvement (expressed as the ratio of pre- to post-training performance) for their untrained 100 ms stimulus was similar to that for their trained 200 ms stimulus, that for their untrained 40 ms stimulus was smaller. This observation suggests that the generalization of frequency-discrimination learning to untrained stimulus durations is incomplete.
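The ratio measure mentioned above can be illustrated with a short sketch. It assumes that performance is expressed as a discrimination threshold (lower is better), so that a pre- to post-training ratio above 1 indicates improvement. The function name and the numerical values are hypothetical and are not taken from Delhommeau et al. (2002).

```python
def improvement_ratio(pre_threshold, post_threshold):
    """Ratio of pre- to post-training threshold.
    Assumes thresholds are positive and lower is better, so values > 1 mean improvement."""
    if pre_threshold <= 0 or post_threshold <= 0:
        raise ValueError("thresholds must be positive")
    return pre_threshold / post_threshold

# Hypothetical thresholds (arbitrary units), for illustration only.
trained_ratio = improvement_ratio(8.0, 4.0)     # trained duration
untrained_ratio = improvement_ratio(12.0, 8.0)  # untrained duration

# A smaller ratio on the untrained than the trained condition is the
# pattern interpreted in the text as incomplete generalization.
incomplete_generalization = untrained_ratio < trained_ratio
```

Expressing improvement as a ratio rather than a difference makes conditions with different starting thresholds comparable on a proportional scale, which matters when thresholds vary with stimulus duration.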

Generalization across ear

Learning on frequency discrimination appears to generalize from the trained to the untrained ear. There was improvement in the untrained ear following frequency-discrimination training in the other ear in every case tested, indicating cross-ear generalization (Delhommeau et al. 2002, 2005; Demany & Semal 2002; Roth et al. 2003; Micheyl et al. 2006). In two cases, this generalization seems to have been complete, because the amount of improvement was similar at both ears for trained as well as untrained stimuli: untrained frequencies (Delhommeau et al. 2005) or untrained noise conditions (quiet versus contralateral noise, see fig. 5 in Micheyl et al. 2006). In two other cases, though there was some indication that the generalization was incomplete, this conclusion was counterbalanced by other evidence. Roth et al. (2003) reported greater improvement between the pre- and post-training tests in the trained than the untrained ear at all tested frequencies, but they attributed this across-ear difference to the fact that the trained ear was always tested first. Demany & Semal (2002) observed that additional training yielded learning in the previously untrained, but not the previously trained ear. However, they concluded that learning generalized across ears, with a minor, if any, component of ear specificity, because the slopes of the learning curves resulting from additional training did not differ between the two ears. In the remaining case, across-ear generalization of learning with the trained stimulus appeared to be incomplete. Delhommeau et al. (2002) reported greater improvement in the trained than the untrained ear between the pre- and post-training tests. Interestingly, this incomplete generalization was only observed for the trained but not for the untrained stimulus durations.

Generalization across presentation style

There is some evidence that learning on frequency discrimination generalizes, in large part, between conditions in which the pure-tone standard is fixed at a single frequency or roves across frequencies. In an experiment described above (Amitay et al. 2005), listeners were trained with a standard that was fixed in frequency or roved among a set of either narrowly spaced or widely spaced frequencies. Post-training performance on the fixed condition did not differ across the three trained groups, suggesting that learning generalized from both roving conditions to the fixed condition. By contrast, learning generalized from the fixed condition to the narrow-roving condition for poor, but not good, listeners. Post-training performance on the narrow-roving condition was better for the narrow-roving trained than the fixed-trained groups in the one-third of listeners who had the highest starting thresholds, but was similar between the two groups for the remaining two-thirds of listeners. Note, however, that even with their apparently broader generalization, the poor listeners still had higher post-training thresholds than the good listeners.

Generalization across tasks

Finally, one investigation suggests that frequency-discrimination learning generalizes to fundamental-frequency discrimination with harmonic complexes, but not to amplitude-modulation rate discrimination. Grimault et al. (2003) trained three different groups of listeners on frequency discrimination at either 88, 250 or 1605 Hz (n=3 per group). Based on the pooled data from all three groups, these listeners improved on fundamental-frequency discrimination with fundamental frequencies of 88 and 250 Hz (with harmonics filtered at low, mid and high frequency regions), indicating generalization to fundamental-frequency discrimination. However, the amount of learning was greater when the individual harmonics could be separated by the peripheral auditory system than when they could not. By contrast, these same listeners did not improve on amplitude-modulation rate discrimination with a standard rate at either 88 or 250 Hz (with noise-band carriers either in a low, mid or high frequency region), suggesting a lack of generalization to amplitude-modulation rate discrimination.

(ii) Fundamental-frequency discrimination

Another pitch-based task on which generalization of learning has been examined is fundamental-frequency discrimination. In this task, listeners were asked to distinguish small differences in the fundamental frequency (F0) of a set of harmonics.

The combined results of two separate reports suggest that learning on fundamental-frequency discrimination generalizes to untrained F0s and harmonic frequency regions, but not to pure-tone frequency discrimination, and is at least partially specific to the processing status of the harmonics in the peripheral auditory system (resolved versus unresolved). In one experiment, Grimault et al. (2002) trained listeners for approximately 1800 trials per session for 12 sessions on F0 discrimination with a standard F0 of either 88 Hz (n=4) or 250 Hz (n=4) and harmonics in a mid-frequency region. The trained listeners improved between the pre- and post-training tests at both F0s, regardless of whether the harmonics were presented in the low-, mid- or high-frequency regions, while controls, who received no training, showed no improvement. These results suggest generalization across both F0 and harmonic frequency region. However, the conclusion regarding generalization across F0 should be treated with caution. This is because, in that analysis, the data at each of the two F0s were pooled across listeners who were trained at that F0 and those who were trained at the other F0. Thus it is difficult to determine whether the observed improvement resulted from the training at that F0 or from generalization from the other F0. Interestingly, the amount of improvement was greater on untrained conditions that matched the trained one in terms of whether the harmonics were resolved or unresolved, suggesting some specificity to a characteristic of the harmonic components themselves.

In a related experiment, Demany & Semal (2002) trained listeners for 1100 trials per session for 10 sessions on fundamental-frequency discrimination using resolved harmonic complexes with an F0 of either 100 (n=8) or 500 Hz (n=8). These listeners subsequently improved with six additional training sessions on pure-tone frequency discrimination at 100, 500 and 2500 Hz (frequency varied trial by trial in a fixed order), and the amount of that learning did not depend on whether the pure tone had the same pitch as, or shared a frequency component with, the trained harmonic complex. This subsequent learning suggests that, if there is any generalization from F0 discrimination to pure-tone frequency discrimination, it is not complete.

(b) Temporal judgments

The generalization of learning has also been examined on a number of tasks involving temporal judgments, including temporal-interval discrimination, relative-timing tasks, and amplitude-modulation rate discrimination.

(i) Temporal-interval discrimination

Generalization of learning on auditory temporal-interval discrimination has been examined across stimulus attributes as well as across sensory modalities and the motor system. In the experiments described below, listeners were asked to discriminate small differences in the time interval between two brief auditory markers. Training typically consisted of 500–900 trials per day for approximately 10 days (range: 5–16 days) on a single temporal-interval discrimination condition. Generalization was evaluated by comparing improvement between pre- and post-training tests on the trained and untrained conditions.

The current evidence suggests that learning on temporal-interval discrimination does not generalize to untrained intervals, but generalizes to untrained frequencies and interval-marker types. Listeners who learned as a result of training with a single standard stimulus (learners) did not improve on untrained temporal intervals. Specifically, learning did not generalize from 100 ms to 50, 200 or 500 ms (Wright et al. 1997; n=11 learners), from 100 to 200 ms (Karmarkar & Buonomano 2003; n=10 learners), or from 200 to 100 ms (Karmarkar & Buonomano 2003; n=5 learners). However, the same listeners improved equally at both trained and untrained stimulus frequencies. Learning generalized from 1 kHz to 3.75 or 4 kHz (Wright et al. 1997; Karmarkar & Buonomano 2003). In addition, listeners retained their improvements when they were tested with the trained standard interval filled by a continuous tone rather than marked by two brief tone pips, indicating generalization across interval-marker type (Karmarkar & Buonomano 2003).

There is also evidence that learning on temporal-interval discrimination generalizes from the auditory system to motor performance and from the somatosensory system to the auditory system, but only for the trained temporal interval. In one investigation, participants who were trained on auditory interval discrimination with either a 300 (n=6) or 500 ms (n=6) standard interval improved their ability to produce the trained, but not the other, interval through successive button presses with the right thumb (Meegan et al. 2000). In another experiment, participants who were trained on somatosensory interval discrimination with a 125 ms standard interval improved on auditory interval discrimination, but only with a standard interval that was similar to the trained one (100 ms, but not 50 or 200 ms; Nagarajan et al. 1998). It is worth noting that the learning and generalization pattern in the somatosensory system itself paralleled that observed in the auditory system. Learning generalized to an untrained skin location (contralateral position on the untrained hand), but not to a longer untrained interval (225 versus 125 ms).

Generalization of temporal-interval discrimination learning to untrained stimuli was observed following training on two standard stimuli that were randomly selected trial by trial (Karmarkar & Buonomano 2003). Approximately 50 per cent of the listeners (11 out of 20) improved on both standards as a result of the random-stimulus training. These learners improved equally on the trained (50 ms, 1 kHz and 200 ms, 4 kHz) and untrained (50 ms, 4 kHz and 200 ms, 1 kHz) stimuli between pre- and post-training tests in which each condition was tested separately. This outcome suggests that the learning obtained with the random-stimulus training generalized to performance with a single standard stimulus. Further, given the generalization patterns observed with single-standard training, it seems likely that the improvement on the untrained conditions resulted from cross-frequency rather than cross-temporal-interval generalization.

(ii) Relative-timing tasks

In three of four cases, learning on auditory relative-timing tasks was reported to be specific to the trained tone pair, temporal position and type of judgement. Mossbridge et al. (2006, 2008) examined learning and generalization on asynchrony detection and temporal-order discrimination at sound onset and sound offset. In the asynchrony-detection task, in each trial, listeners determined in which of two presentations the components of a two-tone complex began (onset) or ended (offset) at different times (asynchronously), as opposed to at the same time (synchronously). In the temporal-order discrimination task, listeners determined in which of the two presentations the higher frequency tone began or ended earlier, as opposed to later, than the other tone. In each of four experiments, Mossbridge et al. trained 6–14 listeners in 720 trials per day for 6–8 days on a single relative-timing condition using tones at 0.25 and 4 kHz. In comparison to controls who received no training (n=6–18), all four groups of trained listeners showed more improvement on the trained condition, but not on conditions with untrained frequency pairs (0.75 and 1.25 kHz or 0.5 and 1.5 kHz), suggesting specificity to the trained frequency pair. Further, three of the trained groups (asynchrony onset, order onset and offset) did not improve more than controls at the untrained temporal position (onset versus offset) or on the untrained task (order versus asynchrony), indicating specificity to the position and type of temporal judgement. By contrast, the fourth group (asynchrony offset) improved more than controls both at the untrained temporal position and untrained task, demonstrating broader generalization than the other trained groups.

There is also some indication that asynchrony-detection learning does not generalize across modalities. Virsu et al. (2008) trained participants (n=28) for 30 min per session for eight sessions on an asynchrony-detection task with auditory, visual, tactile, audiovisual, audiotactile and visuotactile stimuli in a fixed order. The amount of improvement between the first and last training sessions was nearly independent across the six trained conditions, suggesting a lack of generalization across modalities. These authors also mention unpublished data in which it appears that the learning did not generalize to untrained stimuli even within the same modality, consistent with other reports (Mossbridge et al. 2006, 2008).

Finally, learning of relative timing in tone sequences appears not to generalize from shorter to longer sequences, but to generalize across some aspect of equal-length sequences. Leek & Watson (1988) reported that listeners (n=5) who learned to apply a different label to each of four three-tone sound segments when each segment was presented alone took longer than the original training to reach the same level of performance (90% accuracy) when the segments were presented in two- to four-segment sequences. This outcome implies a lack of generalization from the shorter to the longer sequences. Later, Barsz (1996) asked listeners to distinguish between a pair of repeating four-tone sequences that differed only in the order of the component tones. One group of listeners (n=5) practiced on a target condition with only one pair of tone sequences, each of which was presented with equal probability. Another group (n=5) practiced the same amount as the first group on each of four conditions, including the target one, and therefore received four times as much practice. The four conditions included two pairs of tone sequences in which each sequence within a pair was presented with either equal or unequal probabilities. The listeners who practiced only the target condition improved less on that condition than those who received the additional training on the other conditions. This result suggests that learning generalized across the four conditions (either across frequency pairs or across presentation probabilities, or both) in the four-condition training.

(iii) Amplitude-modulation rate discrimination

For amplitude-modulation rate discrimination, the generalization of learning has been examined across rate, carrier and task. There are conflicting reports as to whether this learning generalizes to untrained modulation rates. Fitzgerald & Wright (2005) trained listeners (n=9) on amplitude-modulation rate discrimination for 720 trials per day for 6–8 days with a 150 Hz standard rate and a broadband carrier and compared their performance with that of controls (n=9) who received no training. The trained listeners improved more than the controls at the trained 150 Hz modulation rate as well as at an untrained 300 Hz rate, but not at a 30 Hz rate. However, the post-training performance of the trained listeners at 150 Hz was similar to the asymptotic performance previously reported for highly trained listeners at that rate, but did not reach the equivalent asymptotic value at 300 Hz. These results suggest that learning did not generalize to a slower rate and only partially generalized to a faster rate. By contrast, Grimault et al. (2003) concluded that learning generalized between 88 and 250 Hz rates after training on amplitude-modulation rate discrimination for 3150 trials per session for 12 sessions. However, this conclusion should be treated with caution, because it was based on the observation of similar improvements at both rates in a combined group of listeners, half of whom were trained at each rate (n=3 per rate), without a comparison between trained and untrained listeners at each rate.

There is a hint that learning on amplitude-modulation rate discrimination generalizes across carrier frequency regions. Listeners who were trained on amplitude-modulation rate discrimination with a mid-range noise-band carrier (1375–1875 Hz) at a rate of either 88 or 250 Hz (n=3 per group) improved equally with carriers in low-, mid- and high-frequency regions, based on an analysis in which the data of both trained groups were combined (Grimault et al. 2003).

Finally, across-task examinations suggest that learning on amplitude-modulation rate discrimination does not generalize to amplitude-modulation detection, frequency discrimination, rippled-noise discrimination or temporal-interval discrimination. In one investigation (Fitzgerald & Wright 2005), listeners who were trained on amplitude-modulation rate discrimination actually showed poorer post- than pre-training performance on the detection of amplitude modulation at the trained rate, with the trained carrier, a rare example of negative generalization. Further, these listeners did not improve more than controls on pure-tone frequency discrimination or rippled-noise discrimination at the trained pitch. In another investigation (van Wassenhove & Nagarajan 2007), listeners (n=9) were trained for 800 trials for 3 days to discriminate a temporal difference in the inter-stimulus-interval of four consecutive tone pips, a form of amplitude-modulation rate discrimination. These listeners did not improve between pre- and post-training tests in their ability to discriminate the frequency of those tones or to discriminate between the standard intervals marked by only two rather than four tone pips, suggesting a lack of generalization to either frequency or temporal-interval discrimination.

(c) Spatial hearing

Generalization of learning also has been examined on tasks employing the two primary cues to sound-source location on the horizontal plane: interaural level differences (ILDs) and interaural time differences (ITDs; for a review of learning in spatial hearing of humans, see Wright & Zhang 2006).

In the only investigation of the effect of training on ILD discrimination, learning was reported to be specific to the frequency and cue used in training, but to generalize to an untrained standard ILD value (Wright & Fitzgerald 2001). Listeners (n=8) who practiced 720 trials per day for 9 days on ILD discrimination with a 4 kHz tone and a standard ILD of 0 dB improved more than controls who received no training. The trained listeners also improved more than controls with an untrained standard ILD (6 dB), but not with untrained frequencies (0.5 or 6 kHz), or on ITD discrimination (at 0.5 kHz with a 0 μs standard ITD).

For ITD discrimination, the influence of multiple-hour training has been examined in four investigations (Wright & Fitzgerald 2001; Rowan & Lutman 2006, 2007; Zhang & Wright 2007), but improvement that can be attributed to that training was reported in only two. Based on those two investigations, learning on ITD discrimination appears to generalize to untrained stimulus types, and between stimuli with and without an interaural carrier-frequency difference. In one case (Rowan & Lutman 2007), listeners who were trained in 360 trials per day for 6 days on ITD discrimination with either a 128 Hz pure tone (n=6) or a 4 kHz tone amplitude modulated with a half-rectified 128 Hz tone (transposed stimulus; n=8) improved with each of the two trained stimuli as well as with an untrained one (a 4 kHz tone sinusoidally amplitude modulated at 128 Hz). The improvement of the trained listeners on each condition was greater than that of controls who received no training, suggesting that learning generalized across stimulus type (pure tone, transposed stimulus, sinusoidally amplitude-modulated stimulus). In another case (Rowan & Lutman 2006), listeners were trained for 300 trials per day for 6 days with transposed stimuli that either contained an interaural carrier-frequency difference (4.6 kHz tone at one ear and a 5.4 kHz tone at the other; n=9) or did not (5 kHz tone carrier; n=7). The ratio of performance between these two conditions did not differ before and after training in either trained group in the subset of listeners for whom pre-training performance was measurable. This result implies that learning generalized between conditions with and without an interaural carrier-frequency difference. However, it is not clear whether, as a group, the listeners who were included in the generalization test improved on their respective trained conditions, because a considerable number of those listeners did not improve during training.

Learning following multiple-hour training has also been observed in only one of the two investigations of another ITD-based task designed to evaluate the precedence effect (Saberi & Perrott 1990; Litovsky et al. 2000). In this task, listeners were required to discriminate ITDs presented in a stimulus that was preceded by another stimulus with a different ITD value (a simulation of a source and ‘echo’). In the report showing learning, Saberi & Perrott (1990) trained listeners (n=3) with square-wave clicks for 5–20 sessions of 30 minutes. These listeners improved on ITD discrimination with the echo stimulus, indicating a reduced precedence effect. This learning was subsequently observed for untrained sine-wave pulses across a broad range of frequencies (250 Hz to 12 kHz), and for a broad range of stimulus levels (45–110 dB versus the trained 60 dB) with the trained square-wave clicks, suggesting generalization across stimulus type and sound level.

(d) Signal detection and intensity discrimination

(i) Signal detection

We are aware of only three investigations of the generalization of learning following training on tone detection, each of which employed a different task: tone detection in quiet, in a noise masker or in a tone sequence. For tone detection in quiet, learning curves obtained over three sessions of training with a 0.1 kHz tone in listeners who had received four sessions of prior training with a 1 kHz tone (n=8) were similar to those in listeners who had not received this prior training, suggesting a lack of generalization from 1 to 0.1 kHz (Zwislocki et al. 1958). For tone detection in a masker, listeners (n=3) who were trained extensively (over a period of about four months) on the detection of short tones in same-duration gated noise performed better on that task with a 10 ms stimulus, but not with longer stimuli, than did another group of listeners (n=3) who received less training (Tucker et al. 1968). This result suggests some specificity to the trained stimulus duration. Finally, for tone detection in a tone sequence, listeners (n=4) were extensively trained to detect a tone embedded in a single 10-tone sequence, with the target-tone position randomized trial by trial (Leek & Watson 1984). These listeners improved with the trained sequence as well as in an untrained condition in which six new sequences comprising the same 10 tones as the trained one were randomly presented. Thus, the learning with the trained sequence generalized to new sequences, and to greater stimulus uncertainty. Two other aspects of this experiment are noteworthy. First, during training, the listeners appeared to learn to detect the tone at one position in the sequence at a time, implying a lack of generalization across tone positions. Second, two listeners who had considerable difficulty detecting the tone in a particular temporal position during this random-position training improved rapidly when given practice with a longer-duration target tone presented only at that temporal position (salient duration, reduced uncertainty). These listeners later maintained this good performance when tested again in the original condition (with the duration of the target tone restored to its original value, and the temporal position of the target tone selected randomly), suggesting generalization of learning from an easy condition to a difficult one.

(ii) Intensity discrimination

We are aware of only one investigation that provides information on generalization of learning on intensity discrimination, the ability to distinguish small differences in the intensity of a stimulus. Buss (2008) reported that listeners (n=8) improved during six training sessions (approx. 5500–9000 total trials) on intensity discrimination with a 50 dB sound pressure level (SPL) target tone of 948.7 Hz presented simultaneously with tonal maskers at 0.3 and 3 kHz that were roved in intensity from 42 to 58 dB SPL. However, the listeners did not improve on intensity discrimination for the target tone presented in quiet, a condition that was tested at the beginning of every other training session. This outcome implies that learning did not generalize from the masked to the quiet condition.

3. Rapid learning

Though the majority of studies of the generalization of auditory learning employ multiple-session training, there is a small but growing literature on the generalization of learning resulting from brief, typically single-session, training. To date, this literature is restricted to learning on pure-tone frequency discrimination and spatial-hearing tasks.

(a) Pure-tone frequency discrimination

For pure-tone frequency discrimination, learning obtained with relatively short training periods appears to generalize across frequency, ear and testing procedure. In one investigation (Demany 1985), listeners who were trained for 700 trials over two daily sessions at either 0.36 (n=16) or 2.5 kHz (n=16) improved between pre- and post-training sessions at 0.2 kHz as much as listeners who were actually trained at 0.2 kHz (n=16), indicating generalization across frequency. However, listeners who were trained at 6 kHz (n=22) did not improve at 0.2 kHz, suggesting that this generalization was limited to a certain frequency range. In another demonstration of across-frequency generalization (Amitay et al. 2006), listeners who practiced frequency discrimination at 4 kHz for 800 trials in one session improved significantly between pre- and post-training tests at 1 kHz. Interestingly, compared with the frequency-discrimination learning obtained with typical discrimination paradigms, learning resulting from training with no frequency difference between the standard and signal appears to be more frequency specific (Amitay et al. 2006, their supplementary fig. 2 online). Using this no-frequency-difference paradigm, listeners who were trained for 200 trials at each of five frequencies (570, 840, 1170, 1600 and 2150 Hz) improved less at the untrained frequency of 1 kHz than did listeners who received 800 training trials at 1 kHz, suggesting that the cross-frequency generalization, if present, was not complete.

Frequency-discrimination learning resulting from brief training also appears to generalize across ears and testing procedures, but not across tasks. Listeners who practiced frequency discrimination in a single 700-trial training session with a 1 kHz standard either in the left (n=10) or right (n=10) ear had similar performance in both ears 24 hours after training, suggesting generalization across ears (Roth et al. 2004). In another case, listeners who were trained for approximately 500 trials in a single session on frequency discrimination using an AXB procedure (‘was the second tone more like the first tone, or the third tone?’) showed no additional learning during a subsequent approximately 1000 trials of training with a 2I-2AFC (two-interval, two-alternative forced choice) procedure, just as did those who were initially trained with the 2I-2AFC procedure itself (Hawkey et al. 2004). This outcome implies that learning generalized across testing procedures. However, brief training on intensity discrimination or visual-contrast discrimination did not improve frequency-discrimination performance (Hawkey et al. 2004). Listeners who were trained for approximately 500 trials in a single session on either intensity discrimination or visual-contrast discrimination showed learning on frequency discrimination during a subsequent approximately 1000 trials of training, while those who were initially trained on frequency discrimination did not. This result suggests a lack of cross-task generalization.

(b) Spatial hearing

For spatial hearing, there are three investigations of the generalization of rapid learning, each of which examined generalization across a different dimension. In an early investigation (Russell 1976), three groups of listeners (n=78 total) were trained for 200 trials on free-field localization on the horizontal plane with either open ears (normal cues), bilateral ear plugs (spectral cues preserved, sound level reduced) or bilateral ear muffs (spectral cues altered, sound level reduced). All three groups improved between pre- and post-training tests on their respective trained conditions. In addition, the group trained in the open-ear condition showed improvement on the ear-plug condition and vice versa. However, the group trained in the ear-muff condition did not improve on either of the other two conditions, nor did the groups trained in those conditions improve on the ear-muff condition. Thus, it appears that improvement in sound-localization performance on the horizontal plane did not generalize across alterations in spectral cues. In a more recent experiment (Spierer et al. 2007), listeners (n=10) who were trained to discriminate between a fixed pair of ITD values for 40 min with white-noise stimuli improved with the trained ITD pair, but not with untrained pairs. Thus, this learning was specific to the trained ITD values. Note that, in this case, the learning lasted no longer than 6 hours. The results of a third investigation suggest that brief training on other tasks generalizes, but only partially, to ITD discrimination. Ortiz & Wright (in press) trained listeners in a single daily session either on temporal-interval discrimination with a 100 ms interval marked with 4 kHz tone pips (n=17; 1200–1500 trials), ILD discrimination with a 4 kHz tone (n=28; 300 trials and n=18; 1200–1500 trials), or ITD discrimination with a 0.5 kHz tone (n=17; 300 trials and n=14; 1200–1500 trials). The day after this training, all three groups performed better on ITD discrimination than naive listeners, suggesting that there was some generalization from the temporal-interval and ILD discrimination tasks to ITD discrimination. However, ILD-trained listeners did not perform as well as listeners who were trained on ITD itself, indicating that the generalization from ILD to ITD discrimination was incomplete.

4. Discussion

As is revealed by the present review, the literature on generalization of auditory learning is fairly limited. For many tasks, there are only a few, if any, investigations of generalization, making it difficult to draw reliable conclusions about the generalization pattern. In the few cases in which generalization has been examined more extensively, a variety of testing and training methods have been employed, thereby constraining across-experiment comparisons. Overall, the available data reveal no simple rule that can be used to predict the pattern of generalization on a given task. For example, generalization patterns differ across tasks both along a particular stimulus attribute (e.g. learning generalizes across frequency for temporal-interval discrimination, but not for ILD discrimination), and along trained stimulus dimension (e.g. learning is specific to the trained temporal interval for interval discrimination but generalizes across the standard ILD value for ILD discrimination). Nevertheless, this examination of the existing literature provides a valuable opportunity to compare the different approaches and clarify the implicit assumptions used in the evaluation and interpretation of generalization.

(a) Evaluation of generalization

Three interesting points regarding the evaluation of generalization arise from this review. First, there appear to be two different implicit definitions of generalization. In some investigations, generalization was evaluated using only the data of those listeners who improved during the training (learners; e.g. Wright et al. 1997; Karmarkar & Buonomano 2003). The implicit definition of generalization here is improvement on untrained conditions that results from improvement on the trained condition (generalization resulting from learning). Under this definition, any improvement on an untrained condition in listeners who did not improve during training would not be classified as generalization. In other investigations, generalization was evaluated using the data of all trained listeners, regardless of whether each individual improved on the trained condition during training (e.g. Wright & Fitzgerald 2001). Here generalization is implicitly defined as improvement on untrained conditions that results from any changes induced by the training, even when those changes are not revealed by the performance on the trained condition (generalization resulting from training). Under this definition, any improvement on an untrained condition is classified as generalization.
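
The two implicit definitions can be contrasted in a minimal Python sketch. Everything in it is an illustrative assumption rather than a method taken from the studies cited: the field names, the threshold values (invented for the example, in arbitrary units where lower is better), and the rule that a ‘learner’ is any listener whose trained-condition threshold dropped by more than a fixed criterion. Published investigations typically identify learners with statistical tests on individual learning curves.

```python
from statistics import mean

# Hypothetical listeners; all thresholds are invented, lower is better.
LISTENERS = [
    {"trained_pre": 10, "trained_post": 6, "untrained_pre": 12, "untrained_post": 8},
    {"trained_pre": 10, "trained_post": 10, "untrained_pre": 12, "untrained_post": 10},
    {"trained_pre": 9, "trained_post": 5, "untrained_pre": 11, "untrained_post": 9},
]


def improvement(pre, post):
    """Return the threshold reduction from pre- to post-test (positive = better)."""
    return pre - post


def generalization_from_learning(listeners, criterion=0):
    """Mean untrained-condition improvement, using learners only."""
    learners = [
        x for x in listeners
        if improvement(x["trained_pre"], x["trained_post"]) > criterion
    ]
    if not learners:
        return None  # nobody learned, so this definition yields no estimate
    return mean(improvement(x["untrained_pre"], x["untrained_post"]) for x in learners)


def generalization_from_training(listeners):
    """Mean untrained-condition improvement, using every trained listener."""
    return mean(improvement(x["untrained_pre"], x["untrained_post"]) for x in listeners)


if __name__ == "__main__":
    print(generalization_from_learning(LISTENERS))
    print(generalization_from_training(LISTENERS))
```

With these invented numbers, the second listener, who did not improve on the trained condition, contributes to the second estimate but not to the first, so the two definitions give different values for the same data set.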

Second, the paradigms used to assess generalization differ in the inclusion of three elements, each of which provides a different advantage. Common to all of these paradigms is a training period and a subsequent test of performance on conditions not encountered during training. The paradigms differ in the presence or absence of the following elements: (i) a pre-training test of performance on untrained conditions, (ii) controls who participate in the pre- and post-training tests, but not in the training in between and (iii) subsequent training on previously untrained conditions. Including a pre-training test allows a direct assessment of the amount of improvement on trained and untrained conditions, precluding the need to assume equal performance across conditions before training. A possible disadvantage is that the pre-training testing itself may contribute to learning on the trained condition, thereby limiting the ability to evaluate the influence of training on only that condition. Testing controls who do not participate in the training allows the separation of rapid learning induced by exposure to the pre-training test from slower learning induced by the training period. Finally, examining the effect of subsequent training on untrained conditions allows the assessment of immediate generalization to untrained conditions (based on performance at the beginning of the subsequent training) as well as of longer-term consequences on performance on those conditions (based on learning curves in the subsequent training).
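
The role of the no-training controls can be illustrated with a short Python sketch, under the assumption, made here only for illustration, that the contribution of the training period is estimated as the mean improvement of trained listeners minus the mean improvement of controls on the same condition. The thresholds are invented and the function names are hypothetical; actual investigations generally make this comparison with inferential statistics rather than a simple difference.

```python
from statistics import mean


def mean_improvement(pre, post):
    """Mean pre-minus-post threshold change across listeners (positive = better)."""
    return mean(a - b for a, b in zip(pre, post))


def training_effect(trained_pre, trained_post, control_pre, control_post):
    """Improvement in trained listeners beyond that shown by no-training controls."""
    # Controls took only the pre- and post-tests, so their change estimates
    # the learning induced by exposure to the tests themselves.
    test_exposure_effect = mean_improvement(control_pre, control_post)
    total_effect = mean_improvement(trained_pre, trained_post)
    return total_effect - test_exposure_effect


# Invented thresholds on one untrained condition (arbitrary units, lower is better).
TRAINED_PRE, TRAINED_POST = [10, 12, 8], [6, 7, 5]
CONTROL_PRE, CONTROL_POST = [10, 11, 9], [9, 10, 8]

if __name__ == "__main__":
    print(training_effect(TRAINED_PRE, TRAINED_POST, CONTROL_PRE, CONTROL_POST))
```

Without the control group, the full pre-to-post change in the trained listeners would be credited to the training period, including the part that the tests alone may have produced.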

Third, there are two different implicit views about what constitutes complete generalization. According to one view, generalization is deemed to be complete if there is as much learning on the untrained condition as on the trained one (the trained condition ‘gives all that it can give’). In such cases, the extent of generalization is assessed by comparing either the amount of improvement between pre- and post-training tests, or post-training performance alone when there is no pre-training test, between trained and untrained conditions. According to the other view, generalization is considered to be complete if practice on the trained condition yields as much improvement on the untrained condition as can be obtained by training on the untrained condition itself (the untrained condition ‘gets all that it can get’). In this case, assessment of the extent of generalization requires training on the untrained condition. One method is to train a different group of listeners on each condition. Generalization is regarded as complete if the amount of improvement on a given condition is equivalent between listeners who are trained on a different condition and listeners who are trained on that condition itself. The other method is to subsequently train the original trained listeners on a different condition. Generalization is regarded as complete if there is no learning during the subsequent training. The sequential training paradigm allows a test for a unique form of generalization in which training one condition does not directly improve performance, but increases the rate of learning, on another condition.
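
The two views of complete generalization can lead to different verdicts on the same data, as the following Python sketch suggests. The ratio indices, function names and improvement values are illustrative assumptions, not measures taken from the reviewed studies, which typically rely on statistical comparisons between conditions or groups.

```python
def gives_all_index(improvement_trained_cond, improvement_untrained_cond):
    """View 1: untrained- relative to trained-condition improvement in the same listeners."""
    if improvement_trained_cond <= 0:
        raise ValueError("no learning on the trained condition")
    return improvement_untrained_cond / improvement_trained_cond


def gets_all_index(improvement_after_other_training, improvement_after_own_training):
    """View 2: improvement on a condition after training elsewhere, relative to
    the improvement obtained by training on that condition itself."""
    if improvement_after_own_training <= 0:
        raise ValueError("no learning from direct training")
    return improvement_after_other_training / improvement_after_own_training


# Invented mean improvements (arbitrary threshold units).
A_AFTER_TRAINING_A = 4.0  # trained condition A, listeners trained on A
B_AFTER_TRAINING_A = 2.0  # untrained condition B, same listeners
B_AFTER_TRAINING_B = 2.0  # condition B, separate listeners trained on B

if __name__ == "__main__":
    print(gives_all_index(A_AFTER_TRAINING_A, B_AFTER_TRAINING_A))
    print(gets_all_index(B_AFTER_TRAINING_A, B_AFTER_TRAINING_B))
```

In this invented case, generalization from A to B would count as partial under the first view (index 0.5) but complete under the second (index 1.0), because condition B improves less than condition A even with direct training.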

(b) Interpretation of generalization

Patterns of generalization have been used to make inferences about what was learned, as well as which part of the brain was modified, during perceptual training. Depending on the breadth of generalization, learning has been divided into two broad categories: stimulus learning and procedural learning (for a detailed description, see Ortiz & Wright in press). Stimulus learning refers to learning of a single attribute, or a set of attributes, of the trained stimulus, and is indicated by specificity to some aspect of that stimulus. Procedural learning refers to learning of factors that are independent of the trained stimulus (such as the testing method and the laboratory environment), and is indicated by broad generalization across different stimuli and tasks. Based on this convention, the interval-specific learning of temporal-interval discrimination provides an example of stimulus learning (e.g. Wright et al. 1997; Karmarkar & Buonomano 2003), while improvement on ITD discrimination following a single session of training on temporal-interval discrimination is regarded as procedural learning (Ortiz & Wright in press). Investigators tend to attribute improvements that occur rapidly at the beginning of training to procedural learning. However, this interpretation should be treated with caution, because there is considerable evidence that stimulus learning also can occur quite rapidly (see Ortiz & Wright in press). Note that the division of improvement in perceptual performance into procedural and stimulus learning does not specify the location(s) of the neural modifications, such as whether they occur at the sensory encoding stage or at some higher processing stage. It also provides no indication of the mechanism(s) of learning, such as whether the improvements result from the enhancement of the signal representation or the reduction of exogenous or endogenous noise in the system (for descriptions of these mechanisms, see Dosher & Lu (1998) and Gold et al. (1999)).

Generalization patterns also have been used to gain insights into the characteristics of the neural processes that are modified by training. The assumption in such inferences is that training on one condition improves performance on a different condition if and only if the neural processes affected by the training contribute to performance on both conditions. Under this assumption, generalization patterns reveal the tuning characteristics of the affected processes. For example, the demonstration that learning on temporal-interval discrimination is specific to the trained interval but generalizes across frequency and even modality (Wright et al. 1997; Nagarajan et al. 1998; Meegan et al. 2000; Karmarkar & Buonomano 2003) suggests that training modifies neural processes that are tuned to a specific time interval but are involved in processing auditory and somatosensory stimuli as well as in motor performance. This modified process could either consist of a single circuit that is broadly tuned to the stimulus attributes across which learning generalizes, or multiple circuits each of which is specifically tuned to those attributes.

The tuning characteristics of the affected neural processes inferred from the generalization patterns, in turn, have been used to make inferences about the location of those processes. In the visual system, there are a number of demonstrations that the specificity of perceptual learning to particular trained stimulus attributes mirrors the tuning characteristics of neurons in primary sensory cortices (e.g. Karni & Sagi 1991; Poggio et al. 1992; Ahissar & Hochstein 2004). Therefore, perceptual learning has been associated with changes in primary sensory encoding areas. This view has received some support from physiological observations that these areas are most activated during visual learning in humans (Schiltz et al. 1999; Schwartz et al. 2002; Furmanski et al. 2004). However, the changes underlying perceptual learning may not be limited to primary cortex. In a recent investigation, behavioural improvements on a visual-motion task in monkeys were more correlated with changes in an area involved in higher-level cognitive and visual-motor functions than in an area where visual-motion information is encoded (Law & Gold 2008).

It is worth pointing out that training on a single condition might induce multiple modifications in the neural processes engaged in performing that condition. For example, the across-frequency generalization of frequency-discrimination learning has been shown to range from apparently complete to partial to absent, suggesting that, across experiments, training modified, to different extents, a neural mechanism that is tuned to a specific frequency as well as one that processes a broad range of frequencies. Reports that the affected site shifts during the course of training on perceptual and motor tasks are consistent with this idea (Karni et al. 1998; Petersen et al. 1998; Atienza et al. 2002; Gottselig et al. 2004). Among other possibilities, the site of perceptual learning, and therefore the generalization pattern, has been proposed to be determined by the difficulty of the trained task (Ahissar & Hochstein 2004). This idea arose from the observation that learning on a visual task generalized broadly to untrained stimuli at low difficulty levels (high signal-to-noise ratios), but became increasingly specific as the difficulty level increased. This outcome suggests that learning of easy tasks occurs at more central stages, where visual neurons are broadly tuned, while that of more difficult tasks occurs at more peripheral stages, where neurons are tuned more specifically. It remains to be determined to what extent the conclusions drawn from visual learning apply to the auditory system.

Taken together, investigation of the generalization of auditory learning is a newly emerging research field that has a promising future. The outcomes can guide the development of therapeutic and non-therapeutic training regimens as well as provide information about the structure and plasticity of the auditory system.


The comments of two reviewers helped us to improve this manuscript. This work was supported by a grant from NIH/NIDCD.


  • One contribution of 12 to a Theme Issue ‘Sensory learning: from neural mechanisms to rehabilitation’.

