GARY J. OCKEY
Listening is important for in-person communication, with estimates that it accounts for more than 45% of the time spent communicating (Feyten, 1991). Listening is becoming even more important in virtual environments as communication increasingly takes place through technologies such as FaceTime, Second Life, and Skype. It follows that teaching and assessing the listening skill of second language learners is essential. Unfortunately, the assessment of second language listening comprehension has attracted little research attention (Buck, 2018), and, as a result, understanding of how best to assess it is limited.
Current conceptions of the listening process hold that comprehension results from the interaction of numerous sources of information, including the acoustic input and other relevant contextual information. The mind simultaneously processes these incoming stimuli along with information already stored in memory, such as linguistic and world knowledge. Listening comprehension is thus a dynamic process, which continues for as long as new information is made available from any of these sources (Gruba, 1999; Buck, 2001).
Listening is multidimensional, comprising related but discrete lower-level ability components. While agreement on a comprehensive list of these components has not been reached (nor is there an agreed-upon theory of how they operate together), some research indicates that listening ability may include three lower-level abilities: the ability to understand global information, to comprehend specific details, and to draw inferences from implicit information (Min-Young, 2008). Test developers typically draw on these when defining a listening construct in the first stages of test development.
Factors Affecting Listening
Professionals take into account factors that affect listening comprehension when they design and use listening assessments. One of these factors is rate of speech (Brindley & Slatyer, 2002). When listeners comprehend authentic oral communication, they process a large amount of information very rapidly, which can result in cognitive overload or push working memory beyond its capacity. This means that listeners may fail to understand input delivered at a fast rate even though they could process the same input at a slower rate. Research also indicates that background knowledge about the topic affects whether the message is comprehended: test takers with background knowledge on a topic related to the input are generally advantaged (Jensen & Hansen, 1995).
The accent of the speaker is another important factor affecting listening comprehension. Research has shown that the use of different speech varieties can have profound impacts on listening comprehension in assessment contexts, even when those varieties are very similar. Most notably, the greater the strength of an accent, that is, the less similar it is to the speech variety the listener is accustomed to, the more challenging it is to comprehend (Ockey & French, 2016; Ockey, Papageorgiou, & French, 2016), and language learners find familiar accents easier to comprehend than unfamiliar ones (Tauroza & Luk, 1997; Major, Fitzmaurice, Bunta, & Balasubramanian, 2002; Harding, 2012; Ockey & French, 2016).
Other important factors of oral communication known to affect listening comprehension include prosody (Lynch, 1998), phonology (Henrichsen, 1984), and hesitations (Freedle & Kostin, 1999). Brindley and Slatyer (2002) also identify the length, syntax, vocabulary, discourse, and redundancy of the input as important variables.
Types of interaction and relationships among speakers are also important factors to take into account when designing listening assessment inputs. Monologues, dialogues, and discussions among a group of people are all types of interaction that one would be likely to encounter in real‐world listening tasks. Individuals might also expect to listen to input with various levels of formality, depending on the relationship between the speaker and the listener.
Tasks for Assessing Listening
Decisions about the characteristics of listening assessment tasks should be based on the purposes of the test, the test takers' personal characteristics, and the construct the test is designed to measure (Bachman & Palmer, 2010). Buck (2001) provided the following guidelines concerning listening tasks, which may be applicable to most listening test contexts: (a) listening test input should include typical realistic spoken language, commonly used grammatical structures, and some long texts; (b) some questions should require understanding of inferred meaning (as well as global understanding and comprehension of specific details), and all questions should assess linguistic knowledge rather than knowledge dependent on general cognitive abilities; and (c) test takers should have similar background knowledge of the content to be comprehended. Assessment should also target the message conveyed by the input, rather than the exact vocabulary or grammar used to transmit it, and should cover various types of interaction and levels of formality.
In practice, listening assessment tasks require learners to listen to input and then provide evidence of comprehension by responding to questions about the information conveyed. The most common types of comprehension questions are selected response items, including multiple-choice, true/false, and matching. For these item types, test takers must select the most appropriate answer from options provided, which may be based on words, phrases, objects, pictures, or other realia. Selected response items are popular in part because they can be scored quickly and objectively, as the sketch below illustrates. An important design question for selected response items is whether to provide test takers with the questions and possible responses before the input, especially since including them has been shown to favor more proficient test takers (Wu, 1998) and certain item types are affected differentially by the inclusion of item stems, answer options, or both (Koyama, Sun, & Ockey, 2016).
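To make the scoring point concrete, here is a minimal sketch, not drawn from the source, of why selected response items can be scored quickly and objectively: each response is compared against a fixed answer key, so no rater judgment is involved. The item IDs and keyed options are invented for illustration.

```python
# Minimal sketch of objective scoring for selected response listening items.
# The item IDs and keyed answers below are hypothetical, not from any real test.

ANSWER_KEY = {
    "item_01": "B",  # multiple-choice: keyed option
    "item_02": "T",  # true/false: T or F
    "item_03": "C",  # matching: keyed match
}

def score_selected_response(responses: dict) -> int:
    """Count correct answers by direct comparison with the answer key."""
    return sum(
        1
        for item_id, keyed in ANSWER_KEY.items()
        if responses.get(item_id, "").strip().upper() == keyed
    )

test_taker_responses = {"item_01": "B", "item_02": "F", "item_03": "C"}
print(score_selected_response(test_taker_responses))  # prints 2
```

Because the key is fixed in advance, any two scorers (human or machine) will assign identical scores, which is what makes this format objective and fast.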
Constructed response item types, which require test takers to create their own response to a comprehension question, are also commonly used and have become increasingly popular. These item types call for short or long answers and include summaries and the completion of organizational charts, graphs, or figures. One item type that has received increasing attention is the integrated listen–speak item, in which test takers listen to an oral input and then summarize or discuss what they have heard (Ockey & Wagner, 2018). Constructed response item types have been shown to be more difficult for test takers than selected response item types (In'nami & Koizumi, 2009) and may therefore be more appropriate for more proficient learners. Many test developers and users have nonetheless avoided constructed response item types because scoring can be less reliable and requires more resources. Recent developments in computer technology, however, have made the scoring of such productive item types increasingly reliable and practical (Carr, 2014), which may lead to their increased use; a simplified sketch of one automated scoring approach follows.
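As a rough illustration of how a machine might score a short constructed response, the sketch below awards credit based on overlap with keyed content words. This is only one simple technique, assumed here for illustration; the operational systems Carr (2014) discusses are far more sophisticated, and the keyword set and threshold are invented.

```python
import re

# Illustrative keyword-overlap scoring of a short constructed response.
# The keyed content words and the 0.6 threshold are hypothetical choices,
# not taken from any operational scoring system.

def score_short_answer(response: str, keywords: set, threshold: float = 0.6) -> int:
    """Award 1 point if the response contains enough keyed content words."""
    tokens = set(re.findall(r"[a-z']+", response.lower()))
    coverage = len(keywords & tokens) / len(keywords)
    return 1 if coverage >= threshold else 0

keyed_words = {"plants", "sunlight", "energy", "photosynthesis"}
response = "The lecture explained that plants turn sunlight into energy."
print(score_short_answer(response, keyed_words))  # prints 1 (3 of 4 keywords)
```

Even this toy version shows why constructed response scoring is harder to make reliable: the keyword list, tokenization, and threshold are all judgment calls, whereas a selected response key admits exactly one defensible score.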