The Handbook of Speech Perception

The Handbook of Speech Perception: summary, description, and annotation

A wide-ranging and authoritative volume exploring contemporary perceptual research on speech, updated with new original essays by leading researchers.

Speech perception is a dynamic area of study that encompasses a wide variety of disciplines, including cognitive neuroscience, phonetics, linguistics, physiology and biophysics, auditory and speech science, and experimental psychology. The Handbook of Speech Perception, Second Edition, is a comprehensive and up-to-date survey of technical and theoretical developments in perceptual research on human speech. Offering a variety of perspectives on the perception of spoken language, this volume provides original essays by leading researchers on the major issues and most recent findings in the field. Each chapter provides an informed and critical survey, including a summary of current research and debate, clear examples and research findings, and discussion of anticipated advances and potential research directions. The timely second edition of this valuable resource:
- Discusses a uniquely broad range of both foundational and emerging issues in the field
- Surveys the major areas of the field of human speech perception
- Features newly commissioned essays on the relation between speech perception and reading, features in speech perception and lexical access, perceptual identification of individual talkers, and perceptual learning of accented speech
- Includes essential revisions of many chapters original to the first edition
- Offers critical introductions to recent research literature and leading field developments
- Encourages the development of multidisciplinary research on speech perception
- Provides readers with a clear understanding of the aims, methods, challenges, and prospects for advances in the field

The Handbook of Speech Perception, Second Edition, is ideal for both specialists and non-specialists throughout the research community looking for a comprehensive view of the latest technical and theoretical accomplishments in the field.

The Handbook of Speech Perception: online excerpt

This phenomenon of concurrent auditory and phonetic effects of a single acoustic element was described as duplex perception (Liberman, Isenberg, & Rakerd, 1981; Nygaard, 1993; Whalen & Liberman, 1996), and it has been explained as an effect of a peremptory aspect of phonetic organization and analysis.¹ No matter how the evidence ultimately adjudicates the psychophysical claims, it is instructive to note that the generic auditory functions of perceptual organization only succeed in rationalizing the split of the dichotic components into separate streams, and fail to provide a principle by which the combination of elements occurs.

Organization by coordinate variation

A classic understanding of the perception of speech derives from study of the acoustic correlates of phonetic contrasts and the physical and articulatory means by which they are produced (reviewed by Raphael, Chapter 22; also, see Fant, 1960; Liberman et al., 1959; Stevens & House, 1961). In addition to calibrating the perceptual response to natural samples of speech, researchers also used acoustic signals produced synthetically in detailed psychoacoustic studies of phonetic identification and differentiation. In typical terminal analog speech synthesis, the short‐term spectra characteristic of the natural samples are preserved, lending the synthesis a combination of natural vocal timbre and intelligibility (Stevens, 1998). Acoustic analysis of speech, and synthesis that allows for parametric variation of speech acoustics, have been important for understanding the normative aspects of perception, that is, the relation between the typical or likely auditory form of speech sounds encountered by listeners and the perceptual analysis of phonetic properties (Diehl, Molis, & Castleman, 2001; Lindblom, 1996; Massaro, 1994).

However, a singular focus on statistical distributions of natural samples and on synthetic idealizations of natural speech discounts the adaptability and versatility of speech perception, and deflects scientific attention away from the properties of speech that are potentially relevant to understanding perceptual organization. Because grossly distorted speech remains intelligible (e.g. Miller, 1946; Licklider, 1946) when many of the typical acoustic correlates are absent, it is difficult to sustain the hypothesis that finding and following a speech stream crucially depends on meticulous registration of the brief and numerous acoustic correlates of phonetic contrasts described in classic studies. But, if the natural acoustic products of vocalization do not determine the perceptual organization and analysis of speech, what does?

An alternative to this conceptualization was prompted by the empirical use of a technique that combines digital analysis of speech spectra and digital synthesis of time‐varying sinusoids (Remez et al., 1981). This research has revealed the perceptual effectiveness of acoustic patterns that exhibit the gross spectro‐temporal characteristics of speech without incorporating the fine acoustic structure of vocally produced sound. Perceptual research with these acoustic materials and their relatives (noise‐band vocoded speech: Shannon et al., 1995; acoustic chimeras: Smith, Delgutte, & Oxenham, 2002; Remez, 2008) has permitted an estimate of a listener’s sensitivity to the time‐varying patterns of speech spectra independent of the sensory elements of which they are composed.

The premise of sinewave replication is simple, though in practice it is as laborious as other forms of copy synthesis (Remez et al., 2011). Three or four tones, each approximating the center frequency and amplitude of an oral, nasal, or fricative resonance, are created to imitate the coarse‐grain attributes of a speech sample. Lacking the momentary aperiodicities, harmonic spectra, broadband formants, and regular pulsing of natural and most synthetic speech, a sinewave replica of an utterance differs acoustically and qualitatively from speech while remaining intelligible. A spectrogram of a sinewave sentence is shown in the bottom panel of Figure 1.2; a comparison of short‐term spectra of natural speech and both synthetic and sinewave imitations is shown in Figure 1.3.
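The idea of one tone per resonance can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the cited studies: it assumes that the center frequency and amplitude of each resonance have already been estimated frame by frame, and the function name `synthesize_sinewave_replica`, the frame rate, the sample rate, and the demo track values are all hypothetical choices made for the example.

```python
import math


def synthesize_sinewave_replica(tracks, frame_rate=100, sample_rate=8000):
    """Sum a few time-varying sinusoids, one per resonance track.

    tracks: list of tones; each tone is a list of (frequency_hz, amplitude)
            pairs, one pair per analysis frame. All tones must have the
            same number of frames. (Hypothetical input format.)
    Returns a list of float samples.
    """
    n_frames = len(tracks[0])
    if any(len(track) != n_frames for track in tracks):
        raise ValueError("all tone tracks must have the same number of frames")

    samples_per_frame = sample_rate // frame_rate
    n_samples = n_frames * samples_per_frame
    output = [0.0] * n_samples

    for track in tracks:
        phase = 0.0  # accumulated phase keeps the waveform continuous
        for i in range(n_samples):
            # Linearly interpolate frequency and amplitude between frames.
            position = i / samples_per_frame
            k = min(int(position), n_frames - 1)
            k_next = min(k + 1, n_frames - 1)
            frac = position - k
            freq = track[k][0] + frac * (track[k_next][0] - track[k][0])
            amp = track[k][1] + frac * (track[k_next][1] - track[k][1])

            output[i] += amp * math.sin(phase)
            phase += 2.0 * math.pi * freq / sample_rate

    return output


# Invented values for illustration only; these are not measurements
# of any utterance.
demo_tracks = [
    [(500.0, 0.5), (600.0, 0.5), (700.0, 0.4)],     # tone following a low resonance
    [(1500.0, 0.3), (1400.0, 0.3), (1200.0, 0.3)],  # tone following a middle resonance
    [(2500.0, 0.2), (2500.0, 0.2), (2600.0, 0.1)],  # tone following a high resonance
]
demo_signal = synthesize_sinewave_replica(demo_tracks)
```

Each tone is driven by an accumulated phase rather than by `sin(2πft)` evaluated directly, so a change in frequency between frames does not introduce a discontinuity in the waveform. The result contains only the three sinusoids, with none of the harmonic structure, broadband formants, or glottal pulsing that the paragraph above lists as absent from a sinewave replica.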

It is significant that three or four tones reproducing a natural formant pattern evoke an experience in a naive listener of several concurrent whistles changing in pitch and loudness, and do not automatically elicit an impression of speech. The listener’s attention is free to follow the course of the auditory form of each component tone. Certainly, this aspect of a sinewave pattern is salient auditorily, and little of the raw quality prompts attention to the tones as a single compound contour. Studies show that listeners are well able to attend to individual tone components and to focus on the pattern of pitch changes each evokes over the run of a few seconds (Remez & Rubin, 1984, 1993). In other words, the immediate experience of the listener is accurately predicted by a generic auditory account, because acoustic elements that change frequency at different rates to different extents, onsetting and offsetting at different moments in different frequency ranges, are dissimilar along many dimensions that specify separate perceptual streams according to gestalt principles.

Once instructed that the tones compose synthetic speech, a listener readily reports linguistic properties as if hearing the original natural utterance on which the sinewave replica was modeled. If attention to a complex, broadband contour is characteristic of the perceptual organization of speech, its sufficient condition is met in the absence of natural acoustic vocal products. Performance levels reported with this kind of copy synthesis have varied with the proficiency of the synthesis, although it has often been possible to achieve very good intelligibility, rivalling natural speech (for instance, Remez et al., 2008). Within this range of performance levels, these acoustic conditions pose a crucial test of a gestalt‐derived account of perceptual organization, for a perceiver must integrate the tones in order to compose a single sensory contour segregated from the background, ready to analyze for the linguistic properties borne on the pattern of the signal. Several tests support this claim of true integration preliminary to analysis.

In direct assessments, the intelligibility of sinewave replicas of speech exceeded intelligibility predicted from the presentation of individual tones (Remez et al., 1981, 1987, 1994). This superadditive performance is evidence of integration, and it persisted even when the tones came from separate spatial sources, violating similarity in location (Remez et al., 1994; see also Broadbent & Ladefoged, 1957). In combining the individual tones into a single time‐varying coherent stream, however, this complex organization, which is necessary for phonetic analysis, does not exclude an auditory organization as independently resolvable streams of tones (Remez & Rubin, 1984, 1993; Roberts, Summers, & Bailey, 2015). In fact, the perceiver’s resolution of the pitch contour associated with the frequency pattern of tonal constituents is acute whether or not the fusion of the tones supporting phonetic perception occurs (Remez et al., 2001). On this evidence rests the claim that sinewave replicas are bistable, exhibiting two simultaneous and exclusive organizations.
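One way to make the notion of superadditive performance concrete is to compare the score obtained with all tones together against a baseline predicted from the tones presented singly. The sketch below uses probability summation under independence as that baseline. This is an assumption chosen for illustration and is not necessarily the prediction used in the cited studies; the function names are hypothetical, and the scores in the usage lines are invented rather than reported results.

```python
def predicted_from_components(component_scores):
    """Baseline for the combined score if each tone contributed independently.

    component_scores: proportions correct (0.0 to 1.0) obtained when each
    tone is presented alone. Uses probability summation: the combined
    presentation fails only if every component fails.
    """
    miss = 1.0
    for p in component_scores:
        if not 0.0 <= p <= 1.0:
            raise ValueError("scores must be proportions between 0 and 1")
        miss *= 1.0 - p
    return 1.0 - miss


def is_superadditive(combined_score, component_scores):
    """True if the combined score exceeds the independence baseline."""
    return combined_score > predicted_from_components(component_scores)


# Invented proportions for illustration only.
single_tone_scores = [0.1, 0.1, 0.1]
baseline = predicted_from_components(single_tone_scores)
exceeds_baseline = is_superadditive(0.8, single_tone_scores)
```

Under this baseline, a combined score above the prediction cannot be attributed to the tones contributing separately, which is the sense in which superadditivity counts as evidence of integration.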

Figure 1.3 A comparison of the short‐term spectrum of natural speech (top); terminal analog synthetic speech (middle); and sinewave replica (below). Note the broadband resonances and harmonic spectra in natural and synthetic speech, in contrast to the sparse, nonharmonic spectrum of the three tones.