1 It is notable that the literature on duplex perception contains meager direct evidence that the auditory and phonetic properties of the duplex acoustic test items are available simultaneously. The empirical evaluations of auditory and phonetic form employed sequential measures, sometimes separated by a week, that assessed the perception of auditory form in one test and phonetic form in another. This provides evidence that phonetic perception is distinct from a generic auditory process, but the literature is silent on the criteria of perceptual organization required for phonetic analysis.
2 Primacy of Multimodal Speech Perception for the Brain and Science
LAWRENCE D. ROSENBLUM AND JOSH DORSI
University of California, Riverside, United States
It may be argued that multimodal speech perception has become one of the most studied topics in all of cognitive psychology. A keyword search for “multimodal speech” in Google Scholar shows that over 192,000 papers citing the topic have been published since early 2005. Over that same period, the seminal study of audiovisual speech, McGurk and MacDonald (1976), has been cited in publications over 4,700 times (Google Scholar search). There are likely many reasons for this explosion in multisensory speech research. Perhaps most importantly, this research has helped usher in a new understanding of the perceptual brain.
In what has been termed the “multisensory revolution” (e.g. Rosenblum, 2013), research is showing that brain areas and perceptual behaviors long thought to serve a single sense are in fact modulated by multiple senses (e.g. Pascual‐Leone & Hamilton, 2001; Reich, Maidenbaum, & Amedi, 2012; Ricciardi et al., 2014; Rosenblum, Dias, & Dorsi, 2016; Striem‐Amit et al., 2011). This work suggests a degree of neurophysiological and behavioral flexibility across perceptual modalities not previously recognized. That research has been extensively reviewed elsewhere and will not be rehashed here. It is relevant, however, that research on audiovisual speech perception has spearheaded this revolution. Certainly, the phenomenological power of the McGurk effect has motivated research into the apparent automaticity with which the senses integrate. Speech also provided the first example of a stimulus that could modulate an area of the human brain thought to be responsible solely for another sense. In that original report, Calvert and her colleagues (1997) showed that lip‐reading a silent face could induce activity in the auditory cortex. Since the publication of that seminal study, hundreds of other studies have shown that visible speech can induce cross‐sensory modulation of the human auditory cortex. More generally, thousands of studies have now demonstrated crossmodal modulation of primary and secondary sensory cortices in humans (for a review, see Rosenblum, Dias, & Dorsi, 2016). These studies have led to a new conception of the brain as a multisensory processing organ rather than a collection of separate sensory processing units.