Educational Research: The Challenge of Using an Academic Discipline

Susan Notes: I have praised Children and Reading Tests by Clifford Hill and Eric Larsen so often that people must wonder if I get a commission. All I can say is READ THIS BOOK, which describes what happens when children take a reading test and then are asked probing questions about why they chose the answers they did. Researchers' conversations with children reveal many cultural assumptions that just aren't apparent.

I don't know how long TC Record will keep this article free. The second part of Hill's presentation draws from the book. It's worth subscribing to read it.

Cite This Article as: Teachers College Record Volume 114 Number 2, 2012, p. 1-42
http://www.tcrecord.org ID Number: 16241

by Clifford Hill

Background/Context: In 2010, I was invited to give the annual lecture that honors Lawrence Cremin, the historian of American education who became the seventh president of Teachers College, Columbia University. To pay tribute to the way in which Cremin used an academic discipline to bring rigor and depth to educational research, I described my own use of an academic discipline--linguistics and its varied tools of discourse analysis--in conducting research at the College.

Focus of Research: I focused on two major areas of research: (a) ethnocultural variation in processing spatio-temporal information in languages throughout the world and (b) children's interaction with multiple-choice tests of reading comprehension, with particular attention to the ways in which their ethnocultural background affects how they respond.

Research Design and Findings: The first area of research used experimental methods developed by a research team that I directed. The major finding was that distinctive patterns of processing spatiotemporal information by speakers of African languages (e.g., Hausa) and Asian languages (e.g., Chinese) are preserved when African Americans and Chinese Americans speak English in the Western hemisphere. In addition to ethnocultural identity, our research team uncovered other factors such as age and gender that are reflected in the preservation of these patterns. I draw on the model structured heterogeneity (Herzog, Weinreich, & Labov, 1968) to show that what may appear to be random variation in language use can be accounted for by attending to sociocultural factors.

The second area of research used quantitative methods (experimental probes) and qualitative methods (interviews). Our major finding was that children, especially African Americans who live in the inner city, often make inferences when responding to a multiple-choice task that, although stimulated by features in the test item, lead them to select a choice that cannot be justified, given the test makers' highly restricted model of literacy. Our research team drew on the model ethnography of communication (Hymes, 1962) in identifying the contrasting interpretive norms used by test makers and test takers. We then developed a model grounded constructivism (Hill, 2004) that was used to build an alternative approach to assessment in which children respond to an integrated set of tasks that call for three different kinds of response: factual, inferential, and experiential.

Recommendations: An academic discipline can provide greater depth and rigor in educational research, but those who draw on one must seek, much like Lawrence Cremin, to make their research intelligible to an informed public concerned with educational policy.

We can have facts without thinking,

but we cannot have thinking without facts.

--John Dewey

I would like to welcome friends, former students, and colleagues here at Teachers College and from various institutions around metropolitan New York and beyond. The Hausa people in West Africa have a saying, Zumunta a k'afa ta ke ("Friendship is in the foot"). They use these words to express a fundamental value in their culture: if they are to keep friendship alive, they have to get up and go to where their friends are. I'm grateful for every friend's foot that has made it here today.

It is a special honor to give a lecture that celebrates Lawrence A. Cremin. Shortly after my arrival at Teachers College, he became president, and those of us who were new faculty members were in awe of this man who was able to conduct a hands-on presidency while continuing to teach his popular course on the history of American education and publish widely as a historian of education. I remember the excitement at the College when we learned that he had received a Pulitzer Prize for American Education: The National Experience, 1783–1876, the second volume in his three-volume history of American education (1970, 1980, 1988).

Those of us who were new faculty members were inspired--and to be honest, a bit intimidated--by his scholarly example. One of the distinguishing features of Teachers College is its range of specialized programs based on particular academic disciplines: to name just a few, history, philosophy, psychology, sociology, anthropology, and, in my case, linguistics. Such specialization is a great strength of the College, but it also presents a challenge, as indicated by the title of this lecture. How can these disciplines be brought to bear on important issues in education and how can their intellectual rigor be maintained when communicating with a wider audience?

Here Larry Cremin is an exemplary model. As a historian, he did not settle for doing research in some small corner, though he explored plenty of corners as he was preparing to take on the whole of American education. In his three-volume History of American Education, he wrote about the evolution of public education in a lucid and elegant style that was compelling to professional historians and the lay public. He did not take short cuts in his research: his published work was characterized by copious notes, but they were relegated to the bottom of the page or to the back of the article or book. He did not allow them to clog the flow of his masterful narrative.

I'm sure I speak for many faculty and students who have come to the College with a commitment to an academic discipline, but an uncertain grasp on how this discipline might be used in educational research. In my own case, this lack of certainty was exacerbated, as I ended up at the College by accident. I came to Columbia on soft money to direct a program in African languages at the School of International and Public Affairs. An unexpected budget cut by the Nixon administration forced me to search for additional work, and somewhat serendipitously, I ended up filling in for a faculty member at Teachers College who was on leave. This temporary position soon became permanent, and I was faced with the challenge of working out just what kind of research I could do at a graduate school of education.

My dissertation research seemed far removed from the various research agendas that I encountered at Teachers College. It was a rather technical analysis of the language used in a particular oral tradition among the Hausa people in Niger, a French-speaking country located just to the north of Nigeria on the edge of the Sahara desert. I collected more than 6,000 examples of what the Hausa people call karin magana, which translates roughly as "folded speech." A good example of such speech is the one I used at the beginning of the lecture. Using linguistics as a tool, I attempted to identify the ways in which the language used in this tradition differs from ordinary speech.

In responding to the challenge presented by Teachers College, I soon discovered the provocative thinking of John Dewey, who had been a faculty member not only at Teachers College but also in the Department of Philosophy within the larger university. This dual affiliation, no doubt, contributed to the breadth of his thinking about education, and such affiliation is still an important means of sustaining the academic disciplines at the College.

During his presidency, Larry Cremin drew heavily on Dewey in developing a distinctive approach to education for the College. For Cremin, education had to do with all societal institutions, beginning with the family, that transmit knowledge and values from one generation to the next. As I absorbed this way of thinking about education, I began to see my doctoral research in a new light. In rural communities that lack writing, oral tradition is the backbone of education. Traditional societies around the world have used oral tradition--song, story, proverb--to transmit knowledge and values to the next generation. Even today, this means of educating is more vital than the literacy practices of the Western school in many rural communities in Niger.1

I also began to think differently about the distinctive features of language that I had analyzed in my dissertation research: for example, the abundant parallelism used in karin magana makes this way of speaking memorable, which is crucial to its preservation in oral culture. Appropriately cued memory is the key to exacting reproduction of the language that transmits cultural knowledge and values.


As I continued to teach Hausa at the Institute of African Studies, I experimented with teaching methods that I was learning about at Teachers College. One of these experiments took a surprising turn while I was using the silent way to teach the Hausa terms used to describe spatial relations. Silently, I was manipulating objects such as a pen and a piece of paper to encourage students to use the Hausa spatial terms that identify relations between objects on the three primary axes.

a kan "on"

a k'ark'ashin "under"

a gaba "in front"

a baya "in back, behind"

a hagu "to the left"

a dama "to the right"

In addition to the American students in the class, there were two West African students: a native speaker of Ashanti from Ghana who had been exposed to Hausa as a child, and an Igbo speaker from southern Nigeria who was concerned about interethnic relations in Nigeria (this was at the time of the Biafran War).

As the students used the Hausa terms for "front/back," I noticed that the West African students and the American students were interpreting these terms differently. The American students described a nearer object as gaba da "in front of" a further one. As shown in Figure 1, they constructed a familiar field in which the orientational dynamic was running toward themselves. The West African students, however, described the pen as baya da "behind/in back of" the ball. They constructed a field in which the orientational dynamic was running away from themselves (see Figure 2).2

Both constructions can be described as deictic: this term comes from the Greek word for "pointing" and refers to any language use that reflects the language user's immediate point of view. To characterize these contrasting constructions, I introduced the terms mirror imagery and in-tandem imagery in order to capture the active ways in which we construct images--although they remain largely out of awareness--as we make sense out of language.

Clearly, what transpired in the Hausa classroom merited further investigation, so I began to contact various speakers of West African languages in nearby Harlem. I discovered that the deictic imagery used by the Ashanti and Igbo students was, in fact, dominant among these speakers. As I continued to explore this imagery, I became aware that it is at work in time as well as in space. In describing relations between days of the week, for example, Hausa speakers use in-tandem imagery:

Ran Talata tana gaba da Ran Littinin.

"Tuesday comes before Monday."

As can be seen, the presence of the word before in a literal translation leads to a nonsensical proposition that highlights the contrasting use of deictic imagery. The use of after in a functional translation is based on the mirror imagery that native speakers of Standard English are accustomed to:

Ran Talata tana gaba da Ran Littinin.

"Tuesday comes after Monday."

It is worth noting that the terms for "front/back" also represent "before/after" in many languages around the world. This is especially the case in Bantu languages widely spoken on the African continent. In Kikuyu, for example, mbere signals both "front" and "before," thutha both "back" and "after."

When the structure of languages first evolved, human beings apparently viewed relations between two temporal points much like relations between two objects. Again, we can observe the active role of imaging in language use. It is as if time was viewed as running through the human body out toward the horizon. As writing was developed, however, human beings began to shift their perspective and view time as running along other spatial axes that are privileged on the page. In Asian languages that use a vertically oriented script, the terms for "up/down" can be used to describe certain temporal relations. Speakers of Chinese, for example, characterize last month as 上 "up" and the next month as 下 "down":

上个月 shàngge yuè "last month"

下个月 xiàge yuè "next month"

It is as if time is viewed as running from the top of a page to the bottom, which is, of course, an image often exploited in a graphic representation of historical events along a vertical time line (in any cultural tradition, a book page is generally characterized by greater height than width). The relations between space and time can be quite complex, a subject I have explored in greater depth across a number of languages (Hill, 1978, 1991b, 1993, 1994, 1998).


Upon the completion of the academic year, I headed for West Africa to direct a Peace Corps program in Francophone Niger, which gave me the opportunity to explore these constructions more extensively. To help prospective teachers gain experience, we recruited nearly 350 students and set up a summer school in which they could do practice teaching. At the end of each school day, I carried out experimental tasks in the students' first language, whether Hausa or Djerma.3

I was exhilarated to discover that in-tandem imagery was dominant for Hausa-speaking and Djerma-speaking students, although nearly a third of the students adapted to the communicative norms of a Western school and used mirror imagery. This was especially the case with girls: nearly half of them used mirror imagery. This result is not altogether surprising when one considers that within a Muslim society girls do not end up in a French-speaking school unless they come from a family that reflects a strong Western orientation.

After this initial study, a doctoral student, Tijjani Isma'il, conducted research among Hausa speakers in a secondary school in English-speaking Nigeria.4 He designed a bilingual study in which written tasks were administered to 180 randomly selected boys and girls in forms 1, 3, and 5 (the equivalent of grades 7, 9, and 11 in the American system). Half of the students responded to the tasks in Hausa, and the other half responded in English, as shown in Table 1.

Tijjani administered an array of four tasks built around different objects that lack any intrinsically marked orientation. The task shown in Figure 3, utilizing a rock and a ball, closely simulates the oral one I had administered in Niger. As was the case with Hausa-speaking students in Niger, nearly two-thirds of these Nigerian students used in-tandem imagery when responding in Hausa. When responding in English, however, only about one-third of the students used in-tandem imagery, thus showing a remarkable capacity to adapt to native-speaker norms. (Later, Tijjani set up a control group of 180 students in Scarsdale, an affluent suburb north of New York City, and they made almost exclusive use of mirror imagery.)

The ball is ______ the rock.

A. on
B. in front of
C. behind
D. to the left of

Tijjani also explored how these students responded when the reference object--for example, a telephone--reflects a fundamental asymmetry that can be viewed as marking an intrinsic front/back. He developed three tasks using objects in which the intrinsically marked orientation is increasingly salient. The first involved a telephone, a static entity; the second involved a car, an entity that has a capacity for motion but cannot itself initiate movement; and the third involved a person, an entity that can initiate movement. In each task, the reference object was facing to the side so that there was a conflict between the deictic orientation (i.e., based on the language user's point of view) and the nondeictic orientation (i.e., based on the intrinsic orientation of the reference object).

The task in which the reference object is a telephone is shown in Figure 4. If students choose to the right of, they can be considered as making a deictic response. If they, however, choose in front of, they can be considered as making a nondeictic response.

Nearly three-quarters of the students responding in Hausa made a nondeictic choice. When responding in English, slightly less than half of the students made a nondeictic response, thus, once again, showing a capacity to adapt to native-speaker norms (less than one-quarter of the students in Scarsdale made a nondeictic choice).5 Not surprisingly, a nondeictic response became progressively stronger on the tasks in which the reference object was first a car and then a person.

Figure 4

The ball is ______ the telephone.

A. to the right of
B. behind
C. to the left of
D. in front of

Tijjani's research uncovered a correlation that has held up in later research conducted in different parts of the world: students who use in-tandem imagery are more likely to make a nondeictic choice when they respond to a task in which the reference object has intrinsic orientation. From a cognitive perspective, these students can be viewed as using imagery in which they are projected into the field. If the reference point has intrinsically marked orientation, they project to where it is; if it does not, then they project their own orientation on to it: it is as if their front becomes its front, their back its back. I introduced the term dynamic to describe such responses and the term static to describe the use of mirror imagery in which the reference object is viewed as oriented back toward the language user.

After Tijjani defended his dissertation, he and I visited a secondary school in Harlem where he presented his research. He began by asking the students to respond to a task designed to elicit either in-tandem imagery or mirror imagery. After a lively discussion with the students, we rushed back to the College to analyze how they had responded to this task. We discovered that the majority had used in-tandem imagery, and immediately I began to plan a large-scale research project to be conducted in metropolitan New York secondary schools that would explore ethnocultural differences in the use of deictic imagery.6


I was fortunate that a doctoral student named Sheila McKenna took on this large-scale project. To parallel Tijjani's research, she administered tasks to 445 students at the equivalent grade levels, as shown in Table 2.7

Sheila made a major contribution to our research methodology by developing a card game that allowed her to infer whether students were using in-tandem imagery or mirror imagery. In effect, students' attention was shifted from making a language choice to trying to win a game.

The results of this large-scale project were striking. As shown in Figure 5, African American students and European American students differed significantly in using in-tandem imagery. And for both groups, the use of in-tandem imagery was slightly stronger when the game was played orally. Not surprisingly, literacy itself tamps down the use of such imagery.

The European American students in this study differed significantly from the European American students in Tijjani's study. Here only about two-thirds of these students used mirror imagery, whereas in Scarsdale virtually every student used mirror imagery, suggesting that we needed to approach deictic imagery in a more complex way.8

If we add grade level (see Figure 6), we are able to observe an especially revealing contrast between African American and European American students in their use of in-tandem imagery: the African American students use such imagery significantly less as they remain in school, whereas the European American students show virtually no change. The decreasing use of in-tandem imagery among African American students is, no doubt, related to their high dropout rate: those students who are dropping out are presumably less acculturated to mainstream norms of communication. But the decrease also reflects that African American students who remain in school, no doubt, increasingly adapt to mainstream norms. Indeed, the use of deictic imagery can serve as a sensitive index to whether students are adapting to these norms.

Adding gender as well as grade level allows us to observe a maximal contrast in the use of in-tandem imagery.9 As shown in Figure 7, nearly all African American boys in the seventh grade used in-tandem imagery, while nearly all European American boys in the 11th grade used mirror imagery.

Having established a cultural continuity between West Africans and African Americans, the research expanded to other parts of the world, thanks to the many international students who carried out course projects or dissertation research in their first languages. As the research spread geographically, I became increasingly aware that the dynamic processing initially evidenced in West Africa is, in fact, dominant in many parts of the world.


In 1991, I began to travel to the People's Republic of China and, working with Chinese doctoral students there, was able to explore the use of deictic imagery. What was especially intriguing about research conducted by a doctoral student, Wei Yong, with 180 students at Hangzhou University is the degree to which various aspects of their identity could be used to predict the use of deictic imagery. As shown in Figure 8, in-tandem imagery was used almost exclusively by male students who came from a rural area and whose academic major was Chinese when they responded to tasks in Chinese.

By way of contrast, the use of in-tandem imagery was significantly diminished when female students who came from an urban area and whose academic major was English responded in English. As Herzog, Weinreich, and Labov (1968) have pointed out, variation is fundamental to language use, but it is not random. They introduced the term structured heterogeneity to describe the way in which variation in language use reflects dynamic processes, in this case those associated with Westernization, that are at work in the larger society.

Jianguo Ji conducted research among 345 Chinese immigrants in Chinatown, where he discovered a dominant use of in-tandem imagery and thus established a cultural continuity that parallels the one running from West Africa to metropolitan New York. Once again, the use of in-tandem imagery could be predicted by various aspects of individual identity: gender, the amount of education, how long the immigrants had been in this country, and the language they used to respond to a task. Moreover, Jianguo was able to show what had initially been established in Tijjani's research in Nigeria, namely, that a significant majority--in this instance, nearly three-fifths--of the students who made a nondeictic response used in-tandem imagery. In effect, these Chinese students, much like the Hausa students, can be viewed as making a more dynamic response.

As the research expanded to include Chinese speakers, further tasks were introduced: for example, a task in which students were asked to cut a small round cake into four pieces. Over three-quarters of the students who made a more dynamic response (choosing both nondeictic and in-tandem) made the first cut across the cake rather than toward their own body.10

This preference for a horizontal cut can be viewed as reflecting the primacy of the horizontal over the vertical in Chinese ("front/back" is readily assimilated to "up/down" by Chinese speakers), as reflected in the syntactic ordering of terms such as 东北 dōng běi "east north" and 右上 yòu shàng "right upper," which contrasts with the syntactic ordering of parallel terms in English: northeast and upper right. This preference for the horizontal is also evidenced in the order in which strokes are made in forming a Chinese character. Since calligraphy is a fine art, manuals have been prepared to specify the order of strokes: a horizontal stroke should be made before a vertical one.11

As the research expanded, a number of new areas were explored: for example, we became interested in the effects of literacy on how individuals imagine a time line: is it envisioned as running along a vertical axis or a horizontal one? We discovered that individuals from Taiwan who use traditional Chinese script are more prone to imagine a vertical time line, whereas individuals from mainland China are more prone to imagine a horizontal time line, since they use a simplified script that runs from left to right. And Hausa students who attend Western schools are more prone to imagine a time line running from left to right, whereas those who attend Qur'anic schools are more prone to imagine a time line running from right to left.

These results do not reflect the grand claims about literacy that scholars such as Walter Ong (1982) have made: that it was, for example, critical in the emergence of democracy or the development of scientific thinking. The results do, however, indicate that literacy has at least modest effects on restructuring human thought. The time line no longer runs along the front-back axis, as it did when the structure of human language was first evolving; rather it can shift to the axis along which the primary experience of literacy takes place.

I'm going to have to cut short the story of the deixis research, even though there's much more to tell. I would like, however, to address a question that is often voiced in an incredulous tone: Just how can deictic imagery be culturally transmitted across many generations in the absence of a stable language? This is an intriguing subject that requires more time than we have, but I would like to make a couple of points. First, as the research expanded, doctoral students and I discovered that in-tandem imagery was dominant for speakers of languages in many parts of the world--Africa, the Middle East, Asia--and that the dominance of mirror imagery is best viewed as developing relatively late in Europe--here the very term mirror imagery is suggestive since it was during the European Renaissance that the mirror came to be widely used--and spreading to wherever European peoples have moved over the past few centuries.

Moreover, our research established that speakers of all languages have access to both in-tandem imagery and mirror imagery, moving back and forth between them according to subtle factors in the communicative situation: for example, if the further object is obscured, speakers of any language view the field as closed and hence draw on mirror imagery. In addition, if such speakers are in motion--or if there is motion in the field they are describing--they are likely to make use of in-tandem imagery.

But perhaps most important of all are sociopolitical factors. The racial separation practiced by earlier generations is still present in American society, and a significant proportion of the African American population is still not integrated into mainstream society. It is among adolescents in the inner city that we have documented a dominant use of in-tandem imagery. It is important to bear in mind that deictic imagery is a bodily transmitted marker of ethnocultural identity that operates largely out of awareness and is thus readily preserved (see Hill, 1998, for further discussion of the role of what I call embodied memory in transmitting deictic imagery).

The long-term stability of deictic imagery raises a further question: how can it be reconciled with its short-term flexibility, as evidenced in Hausa students' shifting from in-tandem imagery to mirror imagery when they are speaking English? In addressing this question, I would like to briefly mention a small-scale project conducted by Bill Mooney and Lynne Goldstein (1983), who asked an African American high school student to use Black English when administering an oral task to his peers. As students interacted with a peer, nearly two-thirds of them used in-tandem imagery. However, when Bill, as an adult White teacher, administered the same task, less than half of the students used such imagery. In effect, these African American students shifted deictic imagery as they moved from one dialect to another, much as the Hausa students shifted such imagery as they moved from one language to another. As we have seen, the variation that is fundamental to language use is sensitive to sociocultural factors in the communicative setting (see Hill, 1998, for a more extended discussion of the factors that are involved in long-term preservation and short-term shifting of deictic imagery).


As I conducted research seminars in the Applied Linguistics Program at Teachers College, I increasingly turned my attention to a fundamental issue in American education: how children from different ethnocultural backgrounds interact with standardized tests of reading comprehension. Here I would like to acknowledge the support of Ed Gordon, who was then director of the Institute of Urban and Minority Education. Among Ed's many talents is bringing people together from different institutions to work on a common problem. Ed was interested in exploring more deeply the performance of culturally diverse students on standardized tests and asked whether I would head up a research team to be jointly based at Teachers College and the Educational Testing Service. Working with the Institute, I developed a research proposal that received federal funding and assembled a team of doctoral students to carry out the research.


Our research team worked with a corpus of test items that had been piloted for the Gates-MacGinitie Reading Tests. We used data from three major cities that compared the responses of African American children and European American children. Although we were uncomfortable with using simple ethnocultural labels to identify children, we used the categories--African American and European American--that the test makers had set up during the initial pilot testing. We used these data to construct a reading test composed of 22 items--the number used on an actual test--that we administered to third-grade children in inner-city schools in metropolitan New York.

Our research was based on a model known as the ethnography of communication, which was developed by Dell Hymes (1968), an anthropologist whose specialty was linguistics. Based on his approach, we viewed reading tests as involving communication between two parties (test makers and test takers), and hence we were interested in understanding how this communication took place. Central to this task was uncovering the culturally constituted interpretive norms that each party brings to a reading test. We were able to make inferences about the test makers' norms by analyzing the tasks they built, but we also visited the great test factories of America to conduct interviews with them. By the same token, we were able to make inferences about the test takers' interpretive norms from their responses to the tasks, but we also conducted interviews to discover more about why individual children made the particular choices they did.

These individual interviews brought to our attention children's ways of responding that we had not anticipated. Hence, we developed what we called experimental probes to be administered to a larger group of children to explore how widespread certain ways of responding might be. We used two kinds of probes: (a) knowledge probes that explored what children knew about a potentially important feature of a test item and (b) performance probes that explored how children respond when a potentially important feature of a test item is changed (see Hill & Larsen, 2000, for a more detailed presentation of research methodology).

Before we examine test items in which deixis played an important role, let us first consider an item that was especially useful in shedding light on the different ways in which the African American and European American children responded to the reading test that we administered:

The fawn looked at Alice with its large, gentle eyes. It didn't seem at all frightened. "Here then! Here then," Alice said, as she held out her hand and tried to stroke it. It moved back a little and stood looking at her again.

A. How did the fawn's eyes look?


B. What did Alice try to do to the fawn?

help it
hug it
pet it
hide it

Perhaps the most noticeable aspect of the two tasks is that they call for a trivial response to a passage that has been lifted with only slight modifications from an imaginative piece of writing, Lewis Carroll's Through the Looking Glass. Task (A) calls for children to recycle from the passage the word gentle. Task (B) calls for children to define the word stroke by providing the synonym pet. More generally, each of these target responses forces children to abandon the ongoing communicative process that takes place as they read and turn their attention to a tangential activity. To characterize tasks like these that interrupt the onward flow of communication, we came up with the general term acommunicative, and we went on to identify correct answers like those above as acommunicative target responses.

As is often the case, each of these tasks also contains a choice that seemingly sustains the communicative process that the test passage has initiated--in effect, a communicative distractor. Since this passage has been excerpted from a larger text, it is incomplete and thus encourages children to use details from the passage to make inferences that provide a larger frame for this brief encounter between Alice and the fawn. In the case of Task (A), the distractor frightened can activate what we call a motivated inference, since this inference can be directly justified by passage details. There is the sentence It didn't seem at all frightened, which might be taken to mean that the fawn was, in fact, frightened. And then there is the fact that the fawn moved back a little when Alice tried to stroke it. In the case of Task (B), the distractor help it can activate what we call an invited inference, since it is only loosely associated with passage information. Alice is described as speaking gently to the fawn and trying to stroke it.

The pilot testing results show that the European American children are more attracted to the acommunicative target response, whereas the African American children are more attracted to the communicative distractor (see Table 3).

We were curious whether children's knowledge of the word fawn affected their performance on Task (A), so we administered a knowledge probe to 120 children that investigated ethnocultural differences in not only urban schools but also suburban ones:

Do you know what a fawn is?

(1) a baby duck
(2) a small nest
(3) a baby deer
(4) a mythical animal

In establishing choices for this knowledge probe, we were motivated by what children said during the interviews. One thought the fawn was "a mythical animal"--presumably the child had in mind the word spelled as f-a-u-n. Other children described it as "a forest animal," "an elephant," "a squirrel of some kind," "something like a duck." The child who compared the fawn to a duck went on to talk about how "Alice stroked the fawn's feathers."

The ethnocultural differences in choosing baby deer were significantly diminished in the suburban setting, as shown in Table 4.

But whether children knew the meaning of the word fawn did not predict whether they were able to choose the target response for the tasks. In fact, for Task (A) such knowledge correlated more strongly with their choice of the communicative distractor frightened. This is not altogether surprising, since children who know what a fawn is are more likely to view it as skittish and backing off when someone tries to pet it. Indeed, I discovered that for several of our test items what children knew often worked against their selecting the target response. This is one reason, among others, that I am skeptical about reducing the achievement gap simply by introducing content that is more familiar to children who come from culturally diverse backgrounds. [emphasis added]

When we interviewed African American children who attend inner-city schools, we discovered that a number of them focused on the same textual detail in justifying their choice of frightened. One girl said, "The fawn must have been frightened because when she tried to stroke it, she moved back."

A boy provided a more expansive explanation that took account of the fact that the task is focused on the fawn's eyes: "Because it say in the story when she tried to stroke it, he moved back and looked at her. His eyes, I think, were wide open so it must be frightened."12

This boy's attention to the eyes is presumably motivated by the fact that Task (A) is focused not on the fawn, but rather the fawn's eyes. When we interviewed the test makers, they pointed out that children must read the task carefully in order to choose the word (i.e., gentle) that describes the fawn's eyes rather than the fawn itself. But as one child pointed out, we would look at the fawn's eyes if we wanted to know whether it is frightened. The Hausa people are fond of making much the same point with a karin magana: Labarin zuciya a tambayi fuska ("For news of the heart, one must ask the face").

Certain African American children went well beyond textual information in justifying their choice of frightened. As one boy put it, "Because it kind of ran away. Animals run away usually when they are scared. Maybe it thought that she was gonna throw a stone and that's why it moved back." His expansive commentary does conclude with the crucial textual detail that the fawn "moved back."

On Task (B), these textual expansions were very much in evidence when certain African American children explained their choice of the communicative distractor help it. Their expansions often presented the fawn as hurt, as evidenced by the following explanation: "A fawn got hurt and the girl is trying to help it. The fawn kind of ran away, got scared." We can see that, once again, the fawn is presented as "running away."

A hurt fawn also appeared in a story that one African American girl told when she was asked to retell what she had just read (her use of "Once upon a time" signals that she viewed this task not as calling for a brief summary but rather a story of her own):

Once upon a time, Alice was walking through the forest and she saw a fawn. It was beautiful, and she saw how gentle it was looking at her. So she went over there and talked to it and tried to pet it. Then the fawn jerked back. She was wondering why did the fawn jerk back. So she went over there to get her friend. Her friends came. They all surrounded the fawn, and then suddenly she got to it. And then she realized that the fawn had a broken leg.

Notice this girl's skill at dramatizing the events in the passage. When she introduces the word "gentle," it is not to describe the fawn's eyes but rather its way of looking: "she saw how gentle it was looking at her" (here, too, we can detect evidence of moving from the static to the dynamic). And then she introduces "jerked back" to dramatize the fawn's moving back when Alice approaches.13

In closing the discussion of this test item, I would like to present what an African American child wrote after he had been interviewed about his responses to certain test items. Following our protocol, the interviewer ended with a coaching component in which he provided the child a list of target responses and brief explanations of why they were considered correct choices. This child was an imaginative reader and had been attracted by a number of communicative distractors. He was visibly upset when he discovered that these choices were considered wrong. He asked the interviewer if he could write down a story of his own. The following is what he wrote:

I was walking in the woods when suddenly I saw something move in the bushes. I went to see what it was and it was a deer it looked at me and it was a sad look. I looked back and saw it was caught in a trap. I jiggled the trap but it wouldn't open, so I took a rock and smashed it. It opened and the deer was free, it looked at me again and then it ran away.

This story has a special poignancy. It is remarkable how this boy drew on the passage about Alice and the fawn to compose a story that symbolizes the way in which he felt trapped by the reading test.


As we conducted the research on reading tests, we discovered that the deixis research provided a useful perspective not only on structural features of the test items but also on certain confusions that children experienced. Before we proceed, I would like to call attention to a basic feature of a deictic utterance: typically, there is no indication that it is based on the language user's immediate point of view. This omission is not surprising, since generally the language user can assume that a listener shares that point of view and any specification would thus be redundant. Linguists often use the symbol Ø to indicate that the deictic point of view has not been specified (such a Ø can be described as iconic in that it refers to what is absent in language precisely because it is available in the communicative setting itself).

As illustrated in the presentation of the deictic research, this lack of specification can lead to conflicting interpretations. Let me give a couple of examples. First, a spatial example: There's a tree across the road can have a deictic interpretation in which the speaker is located on the opposite side of the road and signals by the use of across that (s)he must cross the road in order to get to where the tree is located. This same sentence has a nondeictic interpretation if the speaker is describing a tree that has fallen across the road. In this case, the relation between the tree and the road is the same, no matter where the speaker happens to be standing.

And here is a temporal example: I can finish the job in an hour has a deictic interpretation if the listener assumes the speaker is expressing that (s)he will be able to complete the job in an hour from now. But in the nondeictic interpretation the listener makes no assumption about when the job will be completed: (s)he simply assumes that it will take the speaker one hour to do the job. In effect, the interpretive focus is on "how long" rather than "when."

In presenting the following test item, I have inserted the symbol Ø to indicate that the little girl continuously assumes from now in whatever she says or thinks.14

She looked at the calendar. "Only two more days Ø and I can go to school. In three days Ø I can read."

Mama chuckled. "Not quite that soon, dear."

"How many days Ø?"

"Some children learn to read in a few months. Some learn in about a year."

A year Ø? A whole year Ø? Not till she was seven? That would be awful.

A. How old is the girl in the story?


B. At first, how long did the little girl think it would take her to learn to read at school?

three days
a few months
one day
a year

It is striking that the Ø is distributed throughout the little girl's speech and thought, but it is altogether absent in her mother's responses. This distribution of Ø signals sharply contrasting approaches to time that can be described with our familiar terms deictic and nondeictic. The little girl's view is limited to her own immediate point of view and thus leads to a distorted understanding of what will happen at school. In responding to her child, the mother is attempting to lead her child toward a broader perspective on time that is not distorted by the urgency of "now."

To select the target response for each task, children have to make sense of the little girl's speech and thought in which deictic Ø continuously occurs. For Task (A), they have to know that the little girl is talking about a year from now when she will be 7 years old. Once they have figured this out, all they have to do is subtract 1 from 7 to arrive at the target response six.

For Task (B), they have to understand that the little girl is counting forward from the present as she speaks to herself: she is thinking that 2 days from now she will go to school and hence in 3 days from now--her first day at school--she will enter a magical world in which suddenly she will be able to read. Once children are able to integrate all that the little girl is thinking, they simply subtract 2 from 3 and arrive at the target response one day.15
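
The two calculations that the tasks require can be written out explicitly. What follows is only an illustrative sketch in Python: the function and variable names are hypothetical and come neither from the test nor from our research. It simply shows that both target responses depend on treating every time expression in the little girl's speech and thought as counted from the same "now."

```python
# Illustrative sketch only: all names here are hypothetical.

def age_now(age_at_future_point: int, years_from_now: int) -> int:
    """Task (A): 'Not till she was seven' is read as 'a year from now'."""
    return age_at_future_point - years_from_now


def days_at_school_before_reading(days_until_reading: int,
                                  days_until_school: int) -> int:
    """Task (B): both counts run forward from the same 'now'."""
    return days_until_reading - days_until_school


# Task (A): seven a year from now, so subtract 1 from 7.
girl_age = age_now(age_at_future_point=7, years_from_now=1)

# Task (B): reading in 3 days, school in 2 days, so subtract 2 from 3.
days_needed = days_at_school_before_reading(days_until_reading=3,
                                            days_until_school=2)

print(girl_age, days_needed)
```

If either number is instead read nondeictically (three days as "how long" rather than "when"), the subtraction never takes place.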

When we asked children to retell what they had read, they had difficulty in making sense out of the passage. Even though the passage deals with a familiar subject (the excitement of a child about her first day of school), its structural features are not readily accessible. Apart from the ubiquitous presence of deictic Ø, the passage also includes free indirect speech (from the French style indirect libre) in the final paragraph to represent what the little girl is thinking. In a literary text, this style is often used to represent what a character is thinking and is thus marked by syntactic fragmentation: A year? A whole year? Not till she was seven? Such fragmentation is especially apt for conveying "the texture of the language we think in. Our thinking moves toward a state of pure predication, one in which words that convey the already known are continuously suppressed" (Hill & Larsen, 2000, p. 109).

We discovered during our interviewing that the use of this sophisticated literary device was confusing to children: for example, many did not understand that she actually referred to the little girl, as illustrated by this exchange between an interviewer and an African American girl (I stands for interviewer and C for child):16

I Who do you think she referred to?

C I think it was the writer probably.

I Is there any way you can tell?

C Because it said she and not I.

Certain children were annoyed when they discovered what the target responses were for the two tasks. As one boy put it, "This is supposed to be a reading test, not a math test." He does have a point, since children reading this passage without the tasks are not likely to make these calculations. We conducted a performance probe in which the passage without the tasks was presented for 25 seconds to 34 children (evenly divided between African American and European American). Once the passage was removed, children were presented the two tasks, and none were able to provide a correct answer to Task (B). Four children, two in each group, did manage to come up with six in response to Task (A).

The pilot test results show that Task (A) was easier than Task (B) for children, presumably because they could use real-world knowledge in selecting six. It is noteworthy that more European American children (40%) made this selection than African American children (26%). During the interviewing, we discovered a possible explanation for this difference. European American children tended to associate starting school with the first grade. As one boy put it in justifying his choice of six, "That means that she is going--like--to first grade; and if someone's got to go to first grade, they got to be six."

The African American children, however, tended to think that children start school at an earlier age. Here is what a girl said when the interviewer asked her why she had chosen four: "'Cause at four years old you go to school." A Haitian American child who had participated in a prekindergarten program in Haiti before moving to this country made a startling observation: "She is four years old in Haiti. But in New York she is five years old."17

We included what this boy said in a research report prepared for the test makers. One of our recommendations in this report is that they should not construct multiple-choice tasks in which the target response can be chosen on the basis of real-world knowledge. This task dramatically illustrates that such knowledge can vary across ethnocultural groups.18

As for Task (B), the pilot test results show that it is especially difficult for all children, as shown in Table 5.

Given children's difficulties with this test item, it is not surprising that neither group managed to perform at the level of chance in selecting the target response one day. The distractor three days attracted a large number of children in both groups, and as can be seen, nearly half of the European American students selected it.19

In making this choice, children seem to be making a nondeictic interpretation of three days (i.e., responding to the question of "how long" rather than "when"). But it is highly doubtful that this is what they are doing. Given children's difficulties in understanding the passage, they may just be latching on to a salient passage detail. As soon as they spot three days as a choice for Task (B), they can just fill in the bubble and move on without any further thought.

It is, of course, possible that certain children who had a better understanding of the passage were attracted to this choice because they assumed it was an acommunicative target response. As we observed in our discussion of the test item about Alice and the fawn, European American children were especially attracted to this kind of choice. In setting up three days as a distractor, test makers may well have ended up luring savvy test takers into a trap, and hence, a more principled response cannot be distinguished from a response made by children who were confused by the passage. A major problem with the multiple-choice format is that we cannot know why an individual child filled in a particular bubble.


The deixis research was centrally concerned with polarities such as "front/back" and "up/down," and so it provided a useful framework for investigating test items in which various kinds of polarities play a prominent role. In constructing our corpus from the pilot test data, we included five items in which polarities found in the passage make their way into the tasks. I would like for us to examine an item, the shortest in our corpus, in which all four choices for Task (B) are based on various permutations of two polarities: "warm/cool" and "wet/dry."

Raisins are made from sweet varieties of grapes. The ripe fruit is usually placed on trays right in the vineyard. There, the fruit dries in the sun. Drying may take several weeks.

A. Raisins are made from grapes that have a lot of


B. What kind of climate is best for making raisins?

warm and dry
warm and wet
cool and dry
cool and wet

The target response in Task (B) is warm and dry, based on the textual detail that in making raisins, the grapes are placed on trays so that they can dry in the sun.
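
The way the four choices are built from the two polarities can be made explicit with a small sketch. The Python below is only an illustration with hypothetical variable names, not a description of how the test makers actually worked: it crosses "warm/cool" with "wet/dry" and yields the four choices in the order in which they appear in the task.

```python
from itertools import product

# Illustrative sketch only: the variable names are hypothetical.
temperature = ("warm", "cool")   # first polarity
moisture = ("dry", "wet")        # second polarity

# Every choice is one pole from each polarity, so 2 x 2 = 4 choices.
choices = [f"{t} and {m}" for t, m in product(temperature, moisture)]

for choice in choices:
    print(choice)
```

Because each choice differs from its neighbors by only one pole, a child who shifts a single pole (for example, from dry to wet) lands on a distractor.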

This kind of multiple-choice task is widely used on reading tests, presumably because test makers find it relatively easy to build a set of four choices around polarities. As we conducted our research, we became increasingly aware that tasks that force children to manipulate polarities often engender a confusion that they would not experience when reading an ordinary text in which polarities are not overworked and are embedded in a naturally occurring context. In Children and Reading Tests (Hill & Larsen, 2000, p. 140), we introduced the term paradigmatic vertigo to describe these induced confusions.20

The deixis research had made us aware of the culturally variant ways in which African American children handle polarities, and so we were especially attentive to their performance on polarity items. During the interviewing, we noticed the children's tendency to make what we described as a polarizing inference. Consider, for example, how one boy justified his choice of the distractor warm and wet in Task (B):

Look, first I didn't understand so much so I picked warm and dry because of this [points to "dries" in the text]. But then when I read it more, I think--uhm--warm and wet, 'cause here it says, "There, it dries in the sun. Drying may take several weeks." Weeks. I think weeks are very long so I don't think dry is very good.

It is disconcerting to observe this boy rejecting his initial choice of the target response warm and dry as he comes up with the polarizing inference. If I were coaching this child, I would be tempted to tell him not to think too much when taking a reading test.

Just as the extended drying was used to justify the choice of warm and wet in Task (B), so it was used to justify the choice of water in Task (A). As one girl put it with great confidence, "It has to be water because of all that drying in the sun." Another girl read out loud the final sentence in the passage to justify her choice of water. "It was water, because here it says, 'Drying may take several weeks.'" These girls, like the boy described in the previous paragraph, adopt a dynamic stance toward textual information, which clearly runs counter to the static orientation that testing calls for. This stance leads them to project from one lexical pole to its opposite.

In presenting the deixis research, I pointed out that such dynamic projection is prominent among African American students. They are more prone to interpret polar terms nondeictically (i.e., they project to the intrinsically marked orientation of the reference point); and even when they interpret these terms deictically, they make greater use of in-tandem imagery (i.e., they project their own orientation onto the external reference point). I suspect that many African American children are prone to respond dynamically not only to polar terms but also to text as a whole.

Certainly, children from any ethnocultural background can have difficulty in adapting to the static processing that reading tests call for, since it runs counter to what they do when they are first learning to read. As they encounter imaginative stories, they often create worlds that go far beyond what is on the page. However, when taking a test, they must learn a circumscribed way of reading in which the world they create is more closely constrained by what is on the page.

During our interviewing, we found that European American children generally were more aware of the restricted way of reading that a test calls for. Once I asked a boy who had breezed through our corpus of test items without a single wrong answer to explain why he had chosen light on Task (A) in the following polarity item:

Near the top of most pencils, there is a symbol, usually a letter or a number. It tells you whether the pencil is soft or hard. If you want your writing to be thick and dark, you select a pencil with soft lead.

A. You choose a hard lead to make lines that are


B. The letters or numbers near the tops of pencils are


His explanation for selecting light models perfectly what children are expected to do when they are confronted with polarities on a multiple-choice task: "Because it says, 'If you want your writing to be thick and dark, you select a pencil with a soft lead,' and this is hard lead so it's got to be the opposite."

I then asked him to explain why "hard lead makes a light line," hoping that he might say something like "Because the lead is hard, less of it rubs off." Rather, he curtly replied, "The story didn't talk about that." As a savvy test taker, he knew not to waste his time thinking about why hard lead makes a light line. Such an approach, however, can be limiting if it is carried over to the other reading that children do. For reading to become an effective tool for building knowledge, children must learn to probe beyond the surface of a text: for example, when they come across a text that states a causal relation but doesn't explore the question of "why" (i.e., stating that a pencil with soft lead makes lines that are thick and dark, but then providing no explanation).21

Unfortunately, the more restricted approach to text that a reading test calls for spreads into the classroom through the use of test prep materials. One of the most powerful criticisms of reading tests is the way in which they distort the curriculum, especially when they are used to determine whether children will be promoted or whether individual teachers, or even an entire school, will be placed on probation. Instead of children pursuing a broadly based language arts curriculum, they often spend a good deal of class time working with test prep material that focuses on the approach to reading required for successful performance on standardized tests.

Even those who support the use of these tests tend to agree that the way in which test prep takes over the classroom is unfortunate. What they generally do not recognize is that test prep material is not as carefully vetted as test material itself. In an op-ed piece in the New York Times (Hill, 2000c), I described how test prep material widely used in New York City schools was poorly constructed. One sample passage, for example, provided misleading information about the origins of testing in ancient China. An ancient scholarly practice was presented as based on purely materialistic motives: to win a job and lots of money for life. This erroneous information was further reflected in the sample tasks: for example, one of the tasks called for children to identify which of the following statements were probably true.22 The children were then provided reasons for accepting or rejecting each of the four choices. They were instructed to reject the choice Few people took the tests for the following reason:

When someone passed the test, he received lots of money. It is reasonable to think that many people took the tests.

In fact, many years of study were required to prepare for the test and relatively few people were thus ever in a position to take it. In reviewing this test prep material, I was disturbed not only by the careless way in which it was put together, but also by its tendency to misrepresent other cultures in order to make them seem more familiar.

To conclude this discussion of reading tests, I should point out that our research

— Clifford Hill
TC Record


