Part II Perception of Linguistic Properties
5 Features in Speech Perception and Lexical Access
SHEILA E. BLUMSTEIN
Brown University, United States
One of the goals of speech research has been to characterize the defining properties of speech and to specify the processes and mechanisms used in speech perception and word recognition. A critical part of this research agenda has been to determine the nature of the representations that are used in perceiving speech and in lexical access. However, there is a lack of consensus in the field about the nature of these representations. This is largely due to evidence showing tremendous variability in the speech signal: there are differences in vocal tract sizes; there is variability in production even within an individual from one utterance to another; speakers have different accents; contextual factors, including vowel quality and phonetic position, affect the ultimate acoustic output; and speech occurs in a noisy channel. This has led researchers to claim that there is a lack of stability in the mapping from acoustic input to phonetic categories (sound segments) and in the mapping from phonetic categories to the lexicon (words). In this view, there are no invariant or stable acoustic properties corresponding to the phonetic categories of speech, nor is there a one‐to‐one mapping between the representations of phonetic categories and lexical access. As a result, although there is general consensus that phonetic categories (sound segments) are critical units in perception and production, studies of word recognition generally bypass the mapping from the auditory input to phonetic categories (i.e. phonetic segments), and assume that abstract representations of phonetic categories and phonetic segments have been derived in some unspecified manner from the auditory input.
Nonetheless, there are some who believe that stable speech representations can be derived from the auditory input. However, there is fundamental disagreement among these researchers about the nature of those representations. In one view, the stability is inherent in motor or speech gestures; in the other, the stability is inherent in the acoustic properties of the input.
In this chapter, we will use behavioral, psychoacoustic, and neural evidence to argue that features (properties of phonetic segments) are basic representational units in speech perception and in lexical access. We will also argue that these features are mapped onto phonetic categories of speech (phonetic segments), and subsequently onto lexical representations; that these features are represented in terms of invariant (stable) acoustic properties; and that, rather than being binary (either present or not), feature representations are graded, providing a mapping by degrees from sounds to words and their meanings during lexical access.
To set the stage for our discussion, it is necessary first to provide a theoretical framework of the functional architecture of the word recognition system. Here, we will briefly specify the various components and stages of processing, identify the proposed representations in each of these components, and describe the nature of the information flow between the components. It is within this framework that we will consider feature representations. As a starting point for the discussion of features as representational units, it is useful to provide motivation and evidence for the theoretical construct of features. We will then turn to the evidence that features are indeed representational units in speech perception and word recognition.
Functional architecture of word recognition
It is assumed in nearly all models of word recognition that there are multiple components or stages of processing in the mapping from sound to words. The first stage of processing involves the transformation of the auditory input from the peripheral auditory system into a spectro‐temporal representation based on the extraction of auditory patterns or properties from the acoustic signal. This representation is in turn converted at the next stage to a more abstract phonetic‐phonological representation corresponding to the phonetic categories of speech. The representational units at this stage of processing are considered to include segments and (as we will claim) features. These units then interface with the lexical processing system, where the segment and feature representations map onto the lexicon (words). Here, a particular lexical entry is ultimately selected from a potential set of lexical candidates or competitors. Each lexical entry in turn activates its lexical semantic network, where the meaning of the lexical entry is ultimately contacted.