Operationists Link Concepts to Observable Events
Where, then, does the meaning of concepts in science come from if not from discussions about language? What are the criteria for the appropriate use of a scientific concept? To answer these questions, we must discuss operationism, an idea that is crucial to the construction of theory in science and one that is especially important for evaluating theoretical claims in psychology.
Although there are different forms of operationism, it is most useful for the consumer of scientific information to think of it in the most general way. Operationism is simply the idea that concepts in scientific theories must in some way be grounded in, or linked to, observable events that can be measured. Linking the concept to an observable event makes the concept public. The operational definition removes the concept from the feelings and intuitions of a particular individual and allows it to be tested by anyone who can carry out the measurable operations.
For example, defining the concept hunger as “that gnawing feeling I get in my stomach” is not an operational definition because it is related to the personal experience of a “gnawing feeling” and, thus, is not accessible to other observers. By contrast, definitions that involve some measurable period of food deprivation or some physiological index such as blood sugar levels are operational because they involve observable measurements that anyone can carry out. Similarly, psychologists cannot be content with a definition of anxiety, for example, as “that uncomfortable, tense feeling I get at times” but must define the concept by a number of operations such as questionnaires and physiological measurements. The former definition is tied to a personal interpretation of bodily states and is not replicable by others. The latter puts the concept in the public realm of science.
It is important to realize that a concept in science is defined by a set of operations, not by just a single behavioral event or task. Instead, several slightly different tasks and behavioral events are used to converge on a concept (we will talk more about the idea of converging operations in Chapter 8). For example, educational psychologists define a concept such as reading ability in terms of performance on a standardized instrument such as the Woodcock Reading Mastery Tests (Woodcock, 2011) that contains a whole set of tasks. The total reading ability score on the Woodcock Reading Mastery instrument comprises indicators of performance on a number of different subtests that test slightly different skills, for example, reading a passage and thinking of an appropriate word to fill in a blank in the passage, coming up with a synonym for a word, pronouncing a difficult word correctly in isolation, and several others. Collectively, performance on all of these tasks defines the concept reading ability.
Operational definitions force us to think carefully and empirically—in terms of observations in the real world—about how we want to define a concept. Imagine trying to define operationally something as seemingly conceptually simple as typing ability. Imagine you need to do this because you want to compare two different methods of teaching typing. Think of all the decisions you would have to make. You would want to measure typing speed, of course. But over how long a passage? A passage of only 100 words would seem too short, and a passage of 10,000 words would seem too long. But exactly how long then? How long does speed have to be sustained to match how we best conceive of the theoretical construct typing ability? And what kind of material has to be typed? Should it include numbers and formulas and odd spacing? And how are we going to deal with errors? It seems that both time and errors should come into play when measuring typing ability, but exactly what should the formula be that brings the two together? Do we want time and errors to be equally weighted, or is one somewhat more important than the other? The need for an operational definition would force you to think carefully about all of these things; it would make you think very thoroughly about how to conceptualize typing ability.
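To make the weighting question concrete, here is a minimal sketch of one possible scoring rule. Everything in it is an assumption introduced for illustration and not a standard from the text: the function name `typing_score`, the idea of subtracting a penalty per error, and the `error_weight` parameter. The point is only that different, equally defensible weights can rank the same two typists differently.

```python
def typing_score(words_typed, errors, minutes, error_weight=1.0):
    """Hypothetical operational definition of typing ability.

    Score = (words typed - error_weight * errors) / minutes.
    The penalty form and the weight are illustrative assumptions.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return (words_typed - error_weight * errors) / minutes


# Two hypothetical typists tested on the same 5-minute passage.
fast_but_sloppy = {"words_typed": 300, "errors": 20, "minutes": 5}
slow_but_careful = {"words_typed": 280, "errors": 5, "minutes": 5}

for weight in (1.0, 5.0):
    a = typing_score(**fast_but_sloppy, error_weight=weight)
    b = typing_score(**slow_but_careful, error_weight=weight)
    better = "fast_but_sloppy" if a > b else "slow_but_careful"
    print(f"error_weight={weight}: {a:.1f} vs {b:.1f} -> {better} scores higher")
```

With a penalty of one word per error the faster typist scores higher; with a penalty of five words per error the more careful typist does. Choosing the weight is part of deciding what typing ability means.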
Consider the task of the Food and Drug Administration, which has to decide what is an “unacceptable” level of contamination for various foods as opposed to what are considered “unavoidable defects” (Levy, 2009). A federal agency such as the FDA cannot be subjective about such things. It needs strict operational definitions of its judgments with respect to contaminants in each food that it inspects. So, for example, it comes up with operational definitions of the following sort (Levy, 2009): An “unacceptable” level of contamination in tomato juice is more than 10 fly eggs per 100 grams; an “unacceptable” level of contamination in mushrooms is five or more maggots 2 millimeters or longer per 100 grams. Very gross—but commendably operational!
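The two thresholds quoted above are precise enough to be written as rules that any inspector could apply and get the same verdict. The sketch below does this under stated assumptions: the function names and the way a sample is represented (a count of fly eggs, a list of maggot lengths in millimeters, and a sample weight in grams) are hypothetical, and only the numeric cutoffs come from the text.

```python
def tomato_juice_unacceptable(fly_eggs, sample_grams):
    """True if the sample has more than 10 fly eggs per 100 grams."""
    eggs_per_100g = fly_eggs * 100 / sample_grams
    return eggs_per_100g > 10


def mushrooms_unacceptable(maggot_lengths_mm, sample_grams):
    """True if the sample has five or more maggots of 2 mm or longer per 100 grams."""
    long_maggots = sum(1 for length in maggot_lengths_mm if length >= 2)
    long_per_100g = long_maggots * 100 / sample_grams
    return long_per_100g >= 5


# Hypothetical samples, each weighing 100 grams.
print(tomato_juice_unacceptable(fly_eggs=10, sample_grams=100))  # exactly 10: not "more than 10"
print(tomato_juice_unacceptable(fly_eggs=11, sample_grams=100))
print(mushrooms_unacceptable([2.0, 2.5, 3.0, 1.0, 2.0, 2.0], sample_grams=100))
```

Note how the wording of each definition ("more than" versus "five or more") determines whether a borderline sample passes, which is exactly the kind of decision an operational definition forces into the open.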
Operationalizing a concept in science involves measurement: assigning a number to an observation via some rule. Science writer Charles Seife (2010) makes the point that once we start using numbers in measurement we suddenly start caring about them. His argument is that the nonmathematician rarely cares about the properties of numbers when they are used merely as abstract symbols. We don’t care about the number five, by itself. But as soon as the number five becomes five “pounds” or five “dollars” or five “percent inflation” or five “IQ points”—then suddenly we start to care. Seife (2010) says that “a number without a unit is ethereal and abstract. With a unit, it acquires meaning—but at the same time, it loses its purity” (p. 9). What Seife means by “losing its purity” is that once we are involved with measurement—once the number has a unit attached—we are suddenly concerned that numbers have the “right” properties. What are the “right” properties for a number to have in science? The answer to this question is that, in science, the “right” properties for a number to have are the properties of reliability and validity.
For an operational definition of a concept to be useful, it must display both reliability and validity. Reliability refers to the consistency of a measuring instrument—whether you would arrive at the same measurement if you assessed the same concept multiple times. The scientific concept of reliability is easy to understand because it is very similar to its layperson’s definition and very like one of its dictionary definitions: “an attribute of any system that consistently produces the same results.”
Consider how a layperson might talk about whether something was reliable or not. Imagine a New Jersey commuter catching the bus to work in Manhattan each morning. The bus is scheduled to arrive at the commuter’s stop at 7:20 a.m. One week the bus arrives at 7:20, 7:21, 7:20, 7:19, and 7:20, respectively. We would say that the bus was pretty reliable that week. If the next week the bus arrived at 7:35, 7:10, 7:45, 7:55, and 7:05, respectively, we would say that the bus was very unreliable that week.
The reliability of an operational definition in science is assessed in much the same way. If the measure of a concept yields similar numbers for multiple measurements of the same concept, we say that the measuring device displays high reliability. If we measured the same person’s intelligence with different forms of an IQ test on Monday, Wednesday, and Friday of the same week and got scores of 110, 109, and 110, we would say that that particular IQ test seems to be very reliable. By contrast, if the three scores were 89, 130, and 105, we would say that that particular IQ test does not seem to display high reliability. There are specific statistical techniques for assessing the reliability of different types of measuring instruments, and these are discussed in all standard introductory methodology textbooks.
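The intuition behind the bus and IQ examples can be checked with a few lines of arithmetic. The sketch below is only a rough illustration, not one of the formal reliability statistics covered in methodology textbooks: it summarizes how much repeated measurements of the same thing spread out, using the range and the sample standard deviation. The helper names `spread` and `summarize` are hypothetical, and bus arrival times are coded as minutes after 7:00 a.m.

```python
from statistics import mean, stdev


def spread(measurements):
    """Range of the measurements: largest value minus smallest value."""
    return max(measurements) - min(measurements)


def summarize(label, measurements):
    """Print the mean, range, and sample standard deviation of repeated measurements."""
    print(
        f"{label}: mean={mean(measurements):.1f}, "
        f"range={spread(measurements)}, sd={stdev(measurements):.2f}"
    )


# Bus arrival times from the text, in minutes after 7:00 a.m.
bus_week_1 = [20, 21, 20, 19, 20]
bus_week_2 = [35, 10, 45, 55, 5]

# IQ scores from the text for the same person on three days.
iq_consistent = [110, 109, 110]
iq_inconsistent = [89, 130, 105]

summarize("Bus, week 1", bus_week_1)
summarize("Bus, week 2", bus_week_2)
summarize("IQ test, consistent scores", iq_consistent)
summarize("IQ test, inconsistent scores", iq_inconsistent)
```

A small spread across repeated measurements is what the text calls high reliability. A small spread says nothing about whether the instrument measures the intended concept, only that it measures something consistently.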