
Scholars Respond to Scientific Research in Education Report

Susan Notes: Teachers College Record features six articles (plus an introduction) in which scholars discuss the implications of the report Scientific Research in Education. This report is important to anyone who cares about public schools. The U.S. Department of Education asked for the report. A committee was assembled to address the question of what constitutes scientific research in education, and, as noted in the introduction below, the committee published its report in 2002, in time to influence ESRA. As Frederick Erickson notes, the current promotion by the federal government of “science” as a unitary paradigm for educational research is a mistaken effort that may be well intended - or may not be.

James Paul Gee argues that the six principles that the National Research Council's report Scientific Research in Education claims define an enterprise as scientific are relatively vacuous generalities.

Read on.

Introduction: Implications of the Scientific Research in Education Report for Qualitative Inquiry

by Patti Lather & Pamela A. Moss - 2005

What constitutes scientifically based research in education? How should it be defined? Who should define it? Should it be privileged over other forms of educational research? Should it be defined at all? What role should qualitative or interpretive methods play in educational research? What are the consequences of different answers to these questions?

While questions about the demarcation of science are not new, the federal government initiated the current debate with its definition of scientifically based research in the Reading Excellence Act of 1999 (Eisenhart & Towne, 2003). Subsequent federal attempts to delimit science followed in the ''original Castle bill'' to reauthorize OERI, the No Child Left Behind Act (NCLB) of 2001, the Education Sciences Reform Act (ESRA) of 2002, which created the Institute of Education Sciences (IES) to replace OERI, and the current proposed priorities for IES. These federal definitions of scientifically based research (SBR) have tended to define scientific research in terms of particular methods and to privilege experimental research as the gold standard for evaluating educational programs. They have delimited and prioritized the kinds of educational programs and research agendas for which federal support would be provided.

At the invitation of the Department of Education's National Educational Research Policy and Priorities Board, the National Research Council (NRC)1 assembled a committee to address the question of what constitutes scientific research in education. The committee published its report, Scientific Research in Education (SRE), in 2002 in time to influence ESRA (Eisenhart & Towne, 2003).2 SRE has been disseminated across a wide range of educational researchers who well recognize that the issues involved have profound implications for the direction of educational research, affecting researchers, the kind of work they do, and the people intended to benefit from their work. Within the educational research community, an increasing array of special issues of journals is being devoted to the ways that ''scientifically based'' methods are being defined.3

These articles are part of this ferment and grow out of a 2003 AERA panel that we organized, ''Yes, But Is It Science? Implications of the Scientific Research in Education Report for Qualitative Inquiry.'' In deciding to organize this session, we shared the concern that the SRE report, while offering a somewhat more flexible definition of science, nevertheless privileged the testing and establishment of replicable causal effects and positioned ''qualitative studies'' as preliminary or supplementary to this task (see Moss, this issue). We were concerned as well about the kinds of research that appeared to be ignored or relegated to the margins of the debate as not scientific and about the effects of these choices. As Lather (2004) describes her concern, in spite of ''the report's oft-repeated intentions of balance across multiple methods, …objectivity is enshrined and prediction, explanation, and verification override description, interpretation, and discovery'' (p. 19), thus reinscribing a unified theory of science ''under duress for some 30 years'' (p. 17). We asked the panelists to consider ways in which those of us who do interpretive or qualitative work might respond resourcefully and effectively to the vision of science portrayed in the NRC report and to the far narrower vision of science portrayed in recent federal legislation and policy statements to which the report responds.

While all of the panelists addressed these questions, they also raised broader concerns about the ways in which the SRE committee responded to its charge. Questions were raised about the marginalizing of forms of research considered not scientific (Erickson), about the advisability of general attempts to demarcate science from nonscience (Gee), about the absence of standards that promote critical dialogue across disciplinary boundaries (Moss), about the adequacy of attention to matters of cultural difference, to relationships between researcher and researched, and to what and whom the report ignores (Siddle-Walker), and about the undervaluing of public critique of educational research in a democratic society (Willinsky). These arguments are presented in this issue along with the response from SRE committee member, Eisenhart, who shares both her reasons for supporting the report and her concerns about why what it says is ''not enough.''

In seeking to improve the quality of educational practice, our concern is about how educational research can address the complexity and the messiness of practice-in-context which call into question the adequacy of conventional methods, the desirability of generally applicable research and policy standards, and the philosophies of science that prescribe narrow views of these issues. Our hope with this collection of papers is to bring such issues before the educational research community in order to foster discussion of what might be called the politics of the science of the U.S. accountability movement in public education, particularly the implications of legislative efforts for qualitative research in education.


1 The NRC is an arm of the National Academy of Sciences. Its charge is ''to synthesize scientific knowledge in a wide range of areas that affect public interest'' (SRE, p. 6).

2 Eisenhart and Towne (2003) trace the influence of SRE by contrasting the earlier definitions of SBR with the definition that appears in ESRA. While they highlight a subtle move away from the reliance on methods-based definitions of SBR, the recently proposed priorities for the IES (Federal Register, Vol. 68, No. 213, 11/4/03) appear to give the gain away by limiting funding to studies with particular design features.

3 In addition to the ongoing discussion in Educational Researcher, see Qualitative Inquiry, 10(1) and 10(2), special issues on ''the new methodological conservativism,'' as well as an in-process issue of Educational Theory, with a symposium organized by Ken Howe. The NRC just released a follow-up report to SRE titled ''Advancing Scientific Research in Education'' (2004) and has ''established a planning committee to oversee the conceptual development of a broad, long-term initiative related to the quality of evidence'' in behavioral and social sciences research (Lisa Towne, personal communication).


Eisenhart, M., & Towne, L. (2003). Contestation and change in national policy on scientifically based research. Educational Researcher, 32(7), 31–38.

Lather, P. (2004). This is your father's paradigm: Government intrusion and the case of qualitative research in education. Qualitative Inquiry, 10(1), 15–34.

National Research Council. (2002). Scientific Research in Education. Washington, DC: National Academy Press.

Cite This Article as: Teachers College Record Volume 107 Number 1, 2005, p. 1-3
http://www.tcrecord.org ID Number: 11682, Date Accessed: 2/23/2005 8:18:05 A

Arts, Humanities, and Sciences in Educational Research and Social Engineering in Federal Education Policy

Frederick Erickson

The argument of this article is that the current promotion by the federal government of ''science'' as a unitary paradigm for educational research is a mistaken effort that may be well intended - or may not be. In either case, regardless of intentions, the consequences of this promotional effort are likely to be negative for educational research as a whole, because the promotion of the ''scientific'' not only misrepresents science itself, but it also undervalues and marginalizes the contributions of the arts and humanities to our understanding of educational aims and practices. The push for more ''science'' in educational research and of ''scientific evidence'' as a warrant for particular educational practices has involved Congress, the U.S. Office of Education, and the White House. It also manifests in a report issued in 2002 by the National Research Council (Shavelson & Towne, 2002). I see this policy orientation as wrong, leading toward dangerous kinds of social engineering, and this article explains why I believe this.

Let me begin the argument by repeating what I said in comments coauthored with my colleague Kris Gutierrez in last November's issue of the Educational Researcher (Erickson & Gutierrez, 2002). For the most part the NRC report does an adequate job of describing scientific research in education, and it makes clear that certain kinds of qualitative research are ''scientific.'' In addition the report makes many reasonable recommendations for the organization of educational research sponsorship in the federal Department of Education.

That said, it seems to me - and now even more than at my earlier writing - that the NRC report is seriously flawed, even when read as a standalone document. When it is read in the wider context of practices and policy being undertaken currently by the federal Department of Education, the NRC report ends up justifying scientism rather than science. There is real danger that the report may mislead a whole generation of educational researchers, and that is especially unfortunate because there is much good material in the report itself and its intent was to protect educational research from an extreme form of naive scientism.

The fundamental problem, in my judgment, is that the committee that produced the report accepted uncritically its charge from the entity that commissioned the report, the National Educational Research Policy and Priorities Board of the U.S. Department of Education. The charge reads (see Shavelson & Towne, 2002, p. 22):

This study will review and synthesize recent literature on the science and practice of scientific education research and consider how to support high quality science in a federal education research agency.


In contemporary (high modernist) parlance ''science'' connotes rigor and certainty - it provides a warrant for true belief, a ground for master narratives. I believe that what we need most fundamentally in educational research is wisdom, insight, courtesy, empathy for the people we study (both in the ways in which we collect data and in the ways in which we characterize those people in our reporting), and scholarly care in the analysis of evidence and in reporting - indeed educational research at its best should be imbued with what the ancient Greeks called the cardinal virtues: prudence, temperance, fortitude, and justice.

I wouldn't worry about calling for educational research to be ''scientific'' if the term were being used in a more generous, premodern sense. In the medieval university, theology was called the Queen of the Sciences. By this was meant that the subject matter of theology - the nature of God's being and will - was the most important thing humans had to think about. Such thinking should be done with the greatest possible care and with the greatest possible scholarly accountability. That's what ''science'' meant - careful thinking accountable to public expert criticism. Such individual theologians as Thomas Aquinas, Anselm, and William of Occam each set forth differing theoretical accounts of God's being and will, and they wrote and spoke in the expectation of public accountability for their discourse. (Indeed, in the oldest extant university building at Oxford University, Duke Humphrey's Library, we can see that the ground floor is furnished as a hall for public disputation between scholars - all of whom were clergy. Two lecterns are placed opposite each other on raised platforms at the center of the hall so that the disputants could face one another directly and be seen and heard clearly by all those present.)

Accountability and rigor also can be found in the scholarly study of the arts. In my own undergraduate and initial graduate education I studied historical musicology at Northwestern University. I hold a master's degree in that field. It is the study of artistic practices in music and their changes over time. In German its name Musikwissenschaft makes even more strongly the point I want to make, since this term is literally translatable as ''the scientific study of music.'' (Wissenschaft being the word referring to physical science as well as to careful study in the human sciences and in the arts.) Let me give you an example of accountability and rigor in historical musicology.

I studied musicology in the heyday of close analytic reading of musical texts, analogous to ''close reading'' of poetic and fictional texts in literary criticism of the same era. (Things have changed since, but that's a story for another time.) My instructors said, ''We're not interested in what Bach had for breakfast - look at the music itself.'' They meant: pay attention to fine details of optional choice in musical form. The most senior professor in the department, John F. Ohl, was a stickler about levels of inference in formal pattern identification. I remember him saying of our claims to have identified a subtle pattern in melody or harmony, ''Well - I just don't see it.'' By chance last week I found in a box one of my papers in a course on music in the middle ages, in which I was enrolled in the fall of 1961, my first paper in my first course with Professor Ohl. Here is what I wrote:

Many of the art forms of antiquity or of preliterate modern societies seem to contain basic stylistic patterns which, by alteration and rearrangement, make up the basic ''vocabulary'' of the art form. This is true of the graphic art of the North American Indian, the literature of the early Hebrews, and the musical idioms of the Post-Augustan Mediterranean area, which gradually became a part of ''Gregorian'' chant.

Therefore it seems appropriate to study the Third Mass for Christmas [the Christmas morning mass] in comparison with other chants, in order to determine whether or not the process of centonization, or patchworking of common stylistic motives, is evident.

I then went on to provide specific musical examples that showed centonization in the Introit ''Puer natus est nobis,'' which is the chant that is sung at the beginning of the Mass. Here is what Professor Ohl wrote in red pencil as a marginal comment: ''I think your criteria for examples of centonization, as you cite them above, must be regarded as too easy-going; the resemblances seem less close to me than you imply. B-minus.''

He then followed that general comment with a detailed, point-by-point critique of my analysis. From that point on I reined in my tendencies to take inferential leaps. And I went on to get A's from Professor Ohl. But at no place in my entire career as a graduate student was I ever more careful with data, even when some years later as a doctoral student I was punching in sums of squares by hand on a Monroe calculator in an introductory statistics course.

I challenge anyone to come up with better examples of scholarly rigor than those to be found in that kind of musicology - higher standards for careful thinking, a more thorough process of accountability for intellectual claims. But is it ''science?'' Not in the sense meant by the NRC report. And in that, the report is at fault for impoverished imagination, for constricted intellectual reach.


The second major flaw in the NRC report comes near the beginning of the introduction (Shavelson & Towne, 2002) in a list of illustrative research topics.

Knowledge is needed on many topics, including: how to motivate children to succeed; how effective schools and classrooms are organized to foster learning; the roots of teenage alienation and violence; how human and economic resources can be used to support effective instruction; effective strategies for preparing teachers and school administrators; the interaction among what children learn in the context of their families, schools, colleges, and the media; the relationship between educational policy and the economic development of society; and the ways that the effects of schooling are moderated by culture and language. In order that society can learn how to improve its efforts to mount effective programs, rigorous evaluations of innovations must also be conducted. (p. 12)

Granted, the report does not present this list as an exhaustive one. But it is called illustrative - and the list of nine topics above includes not one that is primarily descriptive. All invite some kind of causal analysis and the use of some sort of inferential statistics. This narrows the range of envisioning fruitful topics for educational research in a way that excludes a focus on what qualitative research is best at (more on that in a moment). Moreover, in a later chapter on research design (Shavelson & Towne, 2002, p. 99), as three main types of research questions are introduced, the descriptive one is limited in a significant way. The report presents three types of questions - ''What is happening?'' (concerning description), ''Does x cause y?'' (concerning the identification of cause) and ''Why does x cause y?'' (concerning explanation of causal relations).

An interpretive qualitative researcher would say that the question ''What is happening?'' is always accompanied by another question: ''And what do those happenings mean to those who are engaged in them?'' A critical qualitative researcher would add a third question to the previous two: ''And are these happenings just and in the best interests of people generally - do they contribute to or inhibit human life chances and quality of life?'' Answering such questions is what qualitative research is best at.

As the NRC's list of illustrative topics stands, it privileges questions of efficiency and effectiveness over questions of hermeneutical or critical description and analysis. There is not a single question on that list in the form ''What's happening and what do those happenings mean?'' If the NRC committee's charge to define the ''scientific'' had not stacked the deck enough, at the very outset of that committee's deliberations, their report's presentation of exemplary topics - by what it silences implicitly through omission as well as by what it says explicitly - completes the disadvantaging of interpretive and critical studies in education, marginalizing as well those of feminist research, history, and philosophy.

And for the federal Department of Education currently, never mind scrutinizing the ends of education, or considering relationships between ends and means. What's most important is the determination of effects - what works. And these are obvious effects - as conceived by common sense, not subtle effects, and certainly not contradictory and unintended effects. The Department of Education is currently establishing a clearinghouse that will tell you what works - and only evidence derived from randomized field trials will be reported by that clearinghouse. Make no mistake: The Blue Meanies have taken over the Yellow Submarine.

What might other kinds of questions be? Here's a list of fruitful ones, with theoretical foundations, mostly of the type ''What's happening and what do those happenings mean?'': What is it like to be a child in the bottom reading group in a particular first grade class? How does Miss Smith set up her kindergarten classroom so that students learn to listen closely to what each other says? What happened as the math department at Washington High School seriously tried to shift their teaching away from math as algorithms to math as reasoning? Why do the Black kids sit together in the lunchroom and should we as educators care about that? What about the highly scripted early elementary school reading program Open Court and its relation to teacher morale, not considering morale as an abstract outcome variable but as a concrete, palpable state of being, the ''life world'' of daily work life - how do teachers think and feel about teaching the Open Court reading program - experienced teachers, inexperienced teachers, skilled teachers, not so skilled teachers, advocates of whole language and those who have always followed the instructions in the teachers' manual? Why do teachers call the Open Court monitors ''the Open Court Police''? (That's not a question to be answered by experimental analysis of causal relations - it's a matter of human meaning as causal for human social action, as such meaning is constructed in life experience.) What differences in per pupil expenditure between school districts in a given state are fair - how much difference does it take to be demonstrably unjust - what are morally defensible criteria for judgments about justice in school funding? How do children who come to school speaking Spanish use their mother tongue in learning to speak and read English? What, on a large scale, do teachers believe about the ability of poor children of color to do well academically in school - how do they explain the high rates of low achievement by such children? 
Do different kinds of teachers have different views of this? What are the ways of characterizing types of teachers that best fit the full range of variation in their beliefs about children's capacities to learn? Does this look different in Japan, and Sweden, and South Africa? Has this looked different in America over the past 200 years? (And I mean sound evidence about beliefs: How do we get really good evidence about what teachers really think and feel?)

We won't answer those questions by means of randomized assignment of students to treatment and control groups. There is a place in educational research for large scale experiments. But the questions such approaches know how to answer are not the only ones worth asking.


Let me close with a narrative, since stories about what happens in the world can help us think and come to know important things about what is going on around us. In the fall of 2002, in Palo Alto, California, a group of academics were gathered at a party. They were discussing the NRC report and the current federal policy of privileging randomized field trials as the ''Gold Standard'' for educational research. One of the people in the room was a physician. He mentioned a report published in a medical journal that quoted a researcher who had worked for many years at the top laboratory for polio research, the Salk Institute. The medical researcher said that if knowledge development in polio research had had to depend only on conclusive findings from experiments, research on polio would today consist mainly of studies of the treatment effects of the iron lung.

In sum, the current federal agenda for increasing the ''science'' in educational research is a mistake, a course which if continued is likely to result in tragic consequences for educational researchers, practitioners, students, and families. I believe that this agenda is not simply an intellectually neutral search for better knowledge but that it is about knowledge production for social engineering - and we should be aware that this is social engineering toward extreme right wing ends. Make no mistake; these are dangerous times.


Erickson, F. & Gutierrez, K. (2002). Culture, rigor, and science in educational research. Educational Researcher, 31(1), 21–24.

National Research Council. (2002). Scientific research in education. In R. J. Shavelson, & L. Towne (Eds.), Committee on scientific principles for educational research. Washington, DC: National Academy Press.

Cite This Article as: Teachers College Record Volume 107 Number 1, 2005, p. 4-9
http://www.tcrecord.org ID Number: 11683, Date Accessed: 2/23/2005 7:16:56 AM


It's Theories All the Way Down: A Response to Scientific Research in Education

by James Paul Gee - 2005

This article considers the six principles that the National Research Council's report Scientific Research in Education claims define an enterprise as scientific. I argue that these principles are relatively vacuous generalities because one cannot determine anything about any of them from outside specific theories of specific domains (and domains are smaller, usually considerably smaller, than disciplines). Such theories can differ greatly on how they would define the meanings of the words in the six principles. In the end, I argue that we ought to compare specific theories and their track records, caring less about what "science" is and more about using theories with good track records.

Scientific Research in Education (National Research Council 2002, hereafter SRE) undertakes to specify what counts as science in general and science in education in particular. I myself don't believe this task is well advised (witness the vast and contentious literature in the philosophy of science, leaving aside the large literature on the history and sociology of science). I think it even less well advised when done at a time when the federal government wishes to dictate, largely, in my view, on political grounds, what counts as science in education and what pedagogies, curricula, and materials should be used in schools (e.g., consider the No Child Left Behind legislation and the National Reading Panel's 2000 report).

However, I am well aware that debates about this matter quickly deteriorate into insults and injuries, change few people's minds, and often hinge on people's views about the goodness or badness of complicity or cooperation (choose your term) with a federal government they either like or dislike. So let me turn to a discussion of why I think SRE's task is ill-advised from an intellectual, not primarily a political, point of view, though the two are, of course, ultimately connected.

SRE offers six guiding principles about what constitutes science. I consider each in turn. My basic point is that the reader cannot determine anything about any of SRE's principles from outside specific theories of specific domains (and domains are smaller, usually considerably smaller, than disciplines). I use grammatical theory in linguistics as an example, but theories of other domains would have worked as well. What morals I draw from my conclusions are stated at the end of the article.

Let me hasten to say, before starting in earnest, that my position here - really a position within the philosophy of science - is not (for better or worse) a species of postmodernism. By and large, my position is similar to the one the philosopher and logician Willard Van Orman Quine (1951) stated in his famous paper ''Two Dogmas of Empiricism,'' though in that paper Quine was primarily concerned with undermining the analytic-synthetic distinction.

SRE's first principle is ''Scientific inquiry must pose significant questions that can be investigated empirically'' (p. 3). The problem is this: Who determines what counts as significant? I would argue that what counts as significant cannot be determined from outside a particular domain of inquiry. It is a domain-internal decision. Worse yet, it is theory internal, as well, since different theories about a given domain may disagree on what counts as a significant question. Consider the following question from the domain of the study of grammar: In the underlying representation of a sentence like ''John believed the old man to be the guilty party'' has the noun phrase ''the old man'' been raised from the position of subject of ''to be'' to the position of object of ''believe''?

This question is significant or not - in fact it only makes sense at all - relative to a given theory of grammar. In fact, at one time in the history of grammatical theory (say during the time of the American structuralists; see, e.g., Bloomfield, 1935), this question could not be asked; it would have made no sense. The language in which the structuralists' theory of the domain (grammar) was couched could not formulate such a question. Later in the history of grammatical theory (say during the period of Chomsky's Aspects of the Theory of Syntax, 1965) it could be asked and did make sense. Yet later (say today, in work on minimalist theory; see, e.g., Chomsky, 1995) the question isn't asked and again doesn't really make sense.

Note, too, that the question itself contains ''terms of art'' (like ''underlying representation,'' ''noun phrase,'' ''subject,'' ''object,'' ''raising'') that only make sense within a given theory of a given domain (here grammar) and, in fact, change their sense when linguists change their theory of the domain. Even when this question was askable and made sense, competing theories interpreted the terms ''subject'' and ''object'' differently - so the question really meant different things, was a different question, to the two competing theories (compare, e.g., Chomsky, 1965, with Chomsky, 1995, or compare either of these to Bresnan, 2000).

SRE's second principle is ''Scientific inquiry must link research to relevant theory'' (p. 3). As the earlier discussion makes clear, questions - or for that matter, claims - are always couched in the language of a theory about a given domain, and that theory is what gives the language its meaning. The question here, then, is, who says what counts as the relevant theory?

As I mentioned earlier, there was a time (Chomsky, 1965) when the first question stated earlier was significant and meaningful to the current theory of grammar and, indeed, received the answer ''yes.'' Somewhat later the question was significant and meaningful to two competing theories of grammar, receiving the answer ''yes'' in one (Postal, 1974) and ''no'' in the other (Chomsky, 1973) (though, as I mentioned, the two theories did not interpret the terms ''subject'' and ''object'' in the same way formally). Today, there are theories of grammar in which the question, once again (as in the Structuralist era), cannot arise (these theories are formulated so that the question cannot arise in any meaningful way, that is, so that, in fact, it is ''nonsense,'' e.g., Chomsky, 1995, or, in a different theory, Sag & Wasow, 1999). What linguists have debated, at certain points in their history, is precisely which theory is the relevant one to which to link one's research - where this question really amounts to asking which theory is the relevant (right) one in which to couch one's questions and claims.

Note that it is not always a matter of deciding between two competing theories, one of which you think is wrong and one of which you think is right. Sometimes linguists debate whether a given observation should be dealt with by a theory of syntax, a theory of semantics (or logical form), or a theory of phonological/morphological form (see, e.g., Chomsky, 2002). They may think all three theories (of their respective domains) are compatible and correct but disagree as to which is the relevant theory - and, in this case, relevant domain.

The third SRE principle is ''Scientific inquiry must use methods that permit direct investigation of the question'' (p. 3). I don't really know what ''direct'' means here. However, in a sense, no scientific inquiry engages in ''direct investigation'' of any question. What would it mean to investigate the first stated question directly? The only way to investigate it is to formulate it in terms of the language licensed by some theory of the domain and, then, use the procedures (''methods'') of that theory's approach to answer the question. We are never looking directly at language or grammar. We are looking at language or grammar as it has first been specified by a given theory and its apparatus for formulating and attempting to answer questions in the domain. Furthermore, there was a time in the history of the theory of grammar (during the ''Government and Binding'' stage; see, e.g., Chomsky, 1981) when the first question was not so much answered ''directly,'' but, rather, the answers to other questions - questions that turned out to be more important to the theory - implied a given answer to the first question (namely, ''no''). The answer stood nonetheless (indeed, people eventually ceased to ask the first question).

Let me point out here, too, that SRE often talks about ''methods'' and ''different methods'' as if methods were detachable from theories of a domain. They are not. Any method starts with a question and then forms various statements (e.g., hypotheses, descriptions, claims, conjectures, implications) about the domain this question is in - it does not matter whether these statements are in English or in some numerical or statistical symbolic system. All of these statements are significant and meaningful only in terms of the theory of the domain that gives them significance and meaning. The same word or symbol can have - often does have - quite different meanings in different theories, even competing theories of the same domain (e.g., ''subject'' and ''object'' in theories of grammar).

The fourth principle SRE states is ''Scientific inquiry must provide a coherent and explicit chain of reasoning'' (p. 4). The problem here is the same as those stated earlier: Who is to say what counts as coherent and explicit? Again, only a given theory in a domain can define these terms, and different theories, even different theories of the same domain, can define them differently. For example, is it better to have a very abstract theory of grammar (whose technical terms are far removed from being directly translated into observational terms) which then involves long chains of deductive reasoning to draw conclusions about actual utterances or is it better to have a quite concrete theory (whose technical terms are fairly easy to translate into observational terms) which then involves fairly short chains of deductive reasoning to draw conclusions about actual utterances? This question has been fought over several times in the history of the study of grammar (see the debates in Peters, 1971). Proponents of one side of the issue find the reasoning of the other side less than fully coherent.

The question of explicitness is equally relative to given theories of a domain. Is the definition of the ''object of a verb'' as an NP in the configuration: ''VP[V NP]VP'' (i.e., the NP immediately following the verb and immediately dominated by VP) explicit or not? Some theories would say it is, some would say it isn't, and some would say it's too explicit (e.g., see Chomsky, 1981, 2000, and Bresnan, 2000; compare either or both to Chomsky, 1957, or Chomsky, 1965, or, for that matter, to Postal, 1974). In some cases, the theories that say it is not explicit or too explicit have an apparatus for describing sentences that doesn't exist (and therefore isn't used or appealed to) in theories that say it is. In any case, from outside the theory of the domain you can make no judgment about coherence or explicitness since you don't know what things like ''immediately dominate'' mean in a given theory. And, indeed, if you have a theory of the domain in which the notion of ''immediately dominate'' is nonsense, then the claim may be explicit, but it is, nonetheless, incoherent.

The fifth principle of SRE is ''Scientific inquiry must involve claims that can be replicated and generalized across studies'' (p. 4). In the broad sense, this is, in fact, never true. At best, scientific claims are replicated and generalized only across studies that apply the same theory of the domain or a compatible theory of the domain. If one theory of the domain finds the language of another theory meaningless, then obviously people using the former theory don't bother to try to replicate findings from people using the latter. But, then, here too we reach a familiar question: Who determines what counts as replication or generalization?

If I have shown that objects in English can ''prepose'' (as in ''The older man Mary does like''), then it is obviously not replication of what I have done for you to repeat this data (you already agreed with it) or data that is - as far as the relevant theory of the domain claims - the same (as in ''The younger man Mary does like''). On the other hand, it will obviously do no good for you to try preposing things that the theory of the domain we are using does not count as ''objects'' (as in ''*In the doorway, Mary likes the older man'' from ''Mary likes the older man in the doorway'') and tell me I am wrong. The trick is that you have to find cases that are similar enough to count as enough like my data but different enough not just to be the same study I carried out. Obviously you have to get Objects of a sort I have not dealt with (as in ''*His arm, John broke'' from ''John broke his [own] arm''). But now, two things: First, it's the theory that determines what counts as both similar and different enough and, second, when you fail to replicate, I may give up my claim or, on the other hand, I may change my definition of ''object'' (e.g., to exclude ''body part'' objects). What counts as replication or generalization is entirely theory internal.

The final principle SRE states is ''Scientific inquiry must disclose research to encourage professional scrutiny and critique'' (p. 5). But, of course, the question arises: Who counts as the ''professional'' whose scrutiny ought to be sought and whose critique ought to be valued? This is a tricky question. If we answer that such professionals are only those who are competent in a given theory of a domain, then, in actuality, while such professionals regularly question and critique specific claims in the theory, they rarely question and critique the basic assumptions and the basic methodological tools of the theory (if they go too far in this respect, they exit the theory and enter none or a new one).

In practice, theories of a given domain carry implications about who is ''competent'' to scrutinize and critique them. These people need not all be researchers using exactly that theory in the domain. For example, many syntacticians might cede that a theorist of logical form or even of phonology might be a competent critic, since theories in these areas bear some similarity and use some of the same apparatus. But the field of possible competent critics is not all that wide. In addition, as I said earlier, there is a thin line between a fellow professional criticizing my work and that very criticism implying the professional disowns my theory of the domain, thereby rendering him or herself an ''outsider'' and not a competent critic.

Before I turn to the morals I draw from the previous discussion, let me say two things. First is this: One thing I think often confuses issues here is that everyone can see that questions like the first question - or those couched, for example, in the language of physics - are clearly in a language given meaning by a given theory. Outside that theory such questions clearly do not have much meaning, and inside that theory the words they contain clearly have meanings different from everyday English (if they even contain English words). On the other hand, when a volume like SRE surveys research on ''reading,'' for instance, it is easy to think that the questions and claims in this research are just everyday English and mean pretty much what everybody thinks they mean. It does not appear that theories of a domain play such a prominent role here. But, nonetheless, they do.

In much research on phonics, pre- and posttests of ''reading'' are composed of lists of English words out of context or nonwords, or both (e.g., Foorman, Francis, Fletcher, & Schatschneider, 1998). Even when comprehension tests show subjects have not gained on comprehension, they are still said to have improved in ''reading'' if their scores on ''reading'' (i.e., decoding) of words and nonwords on such a list have gone up from pretest to posttest. Here ''read'' means ''decode'' or, perhaps, ''decode with literal comprehension'' (keeping in mind that there are vast debates in linguistics and philosophy as to whether there is such a thing as ''literal meaning'' and, if there is, what exactly it is). The claims of this research derive their significance, coherence, explicitness, and meaning from a technical theory of the domain no less than those we have surveyed earlier. Here, too, alternative theories of the domain (where, for example, ''reading'' may involve the ability to keep a running discourse model of connected print) may distribute significance, coherence, explicitness, and meaning in a different way.

A second point is this: The previous discussion certainly does not imply that one cannot ask a question about a given domain (e.g., syntactic theory) from outside that domain. For example, at various times over the last few decades, psychologists of various types have asked questions about syntactic theory (e.g., ''Is it psychologically real?''; ''Does it predict processing data?''; ''Can it help explain data from patients with brain lesions?''; and many more). However, if one wants to ask a question from one domain about another domain, three things must happen: some theory within some domain (e.g., psycholinguistics) must give the question some specific meaning; one has to understand the domain being asked about in its own terms; and work has to be done bridging between the two domains (and this may not always be possible, see, e.g., Kuhn, 1962, though it clearly does happen and often).

Now I turn to what I take to be the morals of the previous discussion. It makes little sense to ask whether a given ''result'' is ''scientific'' or ''scientifically derived'' or whether a given method is ''scientific.'' These are all relative to given theories of specific domains. It does not make much sense either to ask whether a given theory of a domain is scientific or not - because the theory itself incorporates its own criteria of what counts as evidence, coherence, explicitness, method, significance, and rigor.

What we can do - and, in fact, do all the time - is first state some goal we are interested and invested in (most certainly including the goals members of the general public may have). Then we can ask, for those theories of the relevant domain that, in fact, have those goals, do they have a good track record of meeting them? If they do, we buy and use the theory; if not, we don't. If theories have explanatory goals that we don't care about, then we can and do ignore them (though their practitioners don't, of course). The question of track record is, of course, still a theory-internal issue, but theories can be checked on how ''honest'' they are in acknowledging whether they are meeting their own stated goals or not. Chomsky (e.g., 2000) himself has acknowledged that the jury is still out in the case of his brand of theoretical linguistics (the goals are, as he has pointed out, quite demanding).

One reason - not the only one - that we can assess the track record of specific theories of specific domains is that, in some cases, these theories claim to be about the physical, natural, biological, or social world, and the world ''out there'' bites back - not anything goes (put your hand in a fire and find out). However, even these theories differ greatly in whether their core claims are couched in language close to observations or in language far removed from observations and requiring a long (rather than short) chain of inference to make observational statements (and, in my view, a history of the track records of various theories does not support the claim that theories with short chains are better or more successful in general, though they may be in specific cases). Even for some very powerful theories, we can wait some time before the world bites back, and when it does, it may be quite hard to know what exactly to change in the theory. As Quine (1951) pointed out years ago, when a theory is faced with a counterexample, many different claims, at different levels of abstraction, might be modified. So even ''bite'' is a theory-internal matter.

So, in the end, it's really about goals we as a society are interested in and the track record of various theories of various domains in meeting them. So who decides what goals we ought to be interested and invested in? That is, by and large, and rightly so, a social and ''political'' question that ought to be debated thoroughly in the public sphere. If your goal is to kill people better and faster, I for one am not impressed that the theory you use - one with a good track record - is ''scientific.'' If your goal is to improve the final months of dying people's lives, and your theory with a good track record is not ''scientific'' - whatever that might mean - then I am still impressed.

This is not an anything-goes perspective. Many a theory has died because it eventually had a bad track record in accomplishing its own goals or died because no one but its own practitioners cared about its goals. If we wanted a definition of ''truth'' as it applies to theories, I would offer this: Truth is what we get, in the long haul, from theories with a good track record that we care about (and these, in the end, probably are closer to the way the world ''really'' is; if one wants to quibble with ''really'' here, see Hacking, 2000). ''Care about'' is a social and political notion that must be fought over. Of course, I cannot rule out that certain people will argue to state only certain goals and not others as the litmus test of which theories are ''scientific.'' But I would rather worry over which are true, in the sense just given, not which are ''scientific.''

In the end, then, I would argue that the principles about what is science offered by SRE are relatively vague generalities and dangerous as such since they can be applied in so many different ways based on the political interest of the person or agency applying them. Educational research, in particular, in my view badly needs a focus not on such generalities but on the nature of its theories and their track records in respect to a variety of different goals we value. Too often, research in education is mined for ''results'' only when those results support one's political views (see Bracey, 2003, for discussion). Results that don't support those political views are conveniently ignored or, worse, unfairly denigrated. At the same time, educators and policy makers on all sides of the political fence have spent too little time engaged in rigorous inspection of the theories that give research results and the research that produced them meaning in the first place. Such an approach is not science - at worst, it is the politics of self-interest; at best, it is mindless empiricism.


Bloomfield, L. (1935). Language. London: Allen & Unwin.

Bracey, G. W. (2003). What you should know about the war against America's public schools. Boston: Allyn & Bacon.

Bresnan, J. (2000). Lexical-functional syntax. Oxford, UK: Blackwell.

Chomsky, N. (1957). Syntactic structures. The Hague: Mouton.

Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.

Chomsky, N. (1981). Lectures on government and binding. Dordrecht: Foris Publications.

Chomsky, N. (1995). The minimalist program. Cambridge, MA: MIT Press.

Chomsky, N. (2000). New horizons in the study of language and mind. Cambridge, UK: Cambridge University Press.

Chomsky, N. (2002). On nature and language. Cambridge, UK: Cambridge University Press.

Foorman, B. R., Francis, D. J., Fletcher, J. M., & Schatschneider, C. (1998). The role of instruction in learning to read: Preventing reading failure in at-risk children. Journal of Educational Psychology, 90, 37–55.

Hacking, I. (2000). The social construction of what? Cambridge, MA: Harvard University Press.

Kuhn, T. S. (1962). The structure of scientific revolutions. Chicago: University of Chicago Press.

National Reading Panel. (2000). Teaching children to read: An evidence-based assessment of the scientific literature on reading and its implications for reading instruction. Washington, DC: National Institute of Child Health and Human Development.

National Research Council. (2002). Scientific research in education. Washington, DC: National Academy Press.

Peters, S. (Ed.). (1971). Goals of linguistic theory. Englewood Cliffs, NJ: Prentice-Hall.

Postal, P. M. (1974). On raising: An inquiry into one rule of English grammar and its theoretical implications. Cambridge, MA: MIT Press.

Quine, W. V. O. (1951). Two dogmas of empiricism. The Philosophical Review, 60, 20–43.

Sag, I. A., & Wasow, T. (1999). Syntactic theory: A formal introduction. Stanford, CA: CSLI.

Cite This Article as: Teachers College Record Volume 107 Number 1, 2005, p. 10-18
http://www.tcrecord.org ID Number: 11684, Date Accessed: 2/23/2005 7:25:32 AM

Toward "Epistemic Reflexivity" in Educational Research: A Response to Scientific Research in Education

by Pamela A. Moss - 2005

In this response to Scientific Research in Education (National Research Council, 2002), I argue that the report has provided us with a carefully considered but partial vision of social science that limits the capacity of our field to engage in critical self-reflection. As one counterexample to the vision of social science portrayed in the report, I draw on the work of French sociologist Pierre Bourdieu. Bourdieu's work not only offers an alternative vision of social science, but it also treats academic fields, including social sciences, as objects of study, illuminating the social forces that shape categories of thought and action. As such, it invites a rigorous form of self-reflection that is not promoted within SRE's vision of social science.

As described by its authors, the primary purpose of Scientific Research in Education (SRE) is ''to provide guidance for what constitutes rigorous scientific research in education'' (National Research Council, 2002, p. 24). The SRE authoring committee was assembled by the National Research Council (NRC) to respond to a charge from the U.S. Department of Education to ''review and synthesize recent literature on the science and practice of scientific educational research and consider how to support high quality science in a federal education research agency'' (p. 1). The report has at least two primary audiences: policy makers who support and use educational research and educational researchers for whom the report is intended to ''spark self reflection'' (p. 22). Repeated calls for ''scientifically based research'' in education at the federal level make the stakes over the criteria for rigorous scientific research particularly high at this point in time.

SRE responds to its charge by providing a set of ''guiding principles,'' which it states ''underlie all scientific inquiry'' (p. 2). The guiding principles are accompanied by a discussion of the features of education that influence the way they are instantiated in educational research. These features all point toward the importance of attending to context and ''understanding the extent to which theories and findings may generalize to other times, places, and populations'' (p. 5). The report offers examples of what is described as ''some of the most frequently used and trusted designs for scientifically addressing broad classes of research questions in education'' (p. 98) and a ''set of design principles for fostering science in a federal education agency'' (p. 127) that are intended ''to nurture a scientific culture'' (p. 6). The report defends qualitative research by arguing that qualitative and quantitative research are ''epistemologically quite similar'' (p. 19), that both can be pursued rigorously, and that the research purpose should determine the methods used.

In this response, I focus primarily on the report's potential to provoke self-reflection among the community of educational researchers. I argue that the report has provided us with a carefully wrought, but partial vision of social science and, consequently, that it risks limiting the capacity of the field to engage in the kind of critical self-reflection its authors so clearly value. As one counterexample to the vision of social science portrayed in the report, I draw on the work of French sociologist Pierre Bourdieu. Bourdieu's work not only offers, by argument and example, an instructive contrast to SRE's vision of social science, but it also treats academic fields, including social sciences, as objects of study. As such, it offers a critical perspective that is doubly relevant to a report that defines what counts as rigorous science. A rigorous approach to social science, as Bourdieu envisions it, requires us to ''constantly scrutinize . . . the collective scientific unconscious embedded in the theories, problems, and . . . categories of scholarly judgment'' (Bourdieu, in Bourdieu & Wacquant, 1992, p. 40). It is this sort of critical reflection that Bourdieu's close colleague Loic Wacquant dubs ''epistemic reflexivity.'' To extend the vision of social science portrayed in the SRE report, this article offers an alternative vision of social science and simultaneously illustrates two complementary strategies for promoting epistemic reflexivity: challenging one vision of social science from the perspective of another and illuminating the social forces that shape ''our'' particular visions of social science.

Before criticizing the SRE report, it's important to understand the political situation in which its authors were working. The SRE authors were responding explicitly and critically to a considerably narrower vision of scientifically based research that was articulated in the congressional bill intended to reauthorize OERI. They criticize the bill for ''mandating a list of 'valid' scientific methods . . . erroneously assum[ing] that science is mechanistic and thus can be prescribed'' (p. 130). Against this vision, they argue, ''it is the ['self-regulating norms of the' (p. 130)] scientific community that enables scientific progress, not . . . adherence to any one scientific method'' (p. 19). Thus, the report offers policy makers an expanded and more flexible conception of science, one that locates authority for scientific rigor within the scientific community and that envisions an important role for certain types of qualitative research within a comprehensive and cumulative research agenda. As illustrated in a recent presentation by NRC staff (Feuer & Towne, 2003), a comparison of this earlier bill with the recently enacted Education Sciences Reform Act illuminates differences in the characterization of scientifically based research that can be traced to the SRE report (see also Eisenhart & Towne, 2003). Clearly, the committee has provided a valuable service to the educational research community. Had the authors written a report that responded to the criticism I'm about to make, they might well have undermined the report's ability to influence policy makers to expand their vision of science.

All of that said, I'm concerned about the long-range consequence of the report for the very quality and culture of inquiry its authors so clearly value. By offering a single set of principles that ''underlie all scientific inquiry'' the committee appears to have taken sides on a long-standing debate in the philosophy of social science about whether or not the social sciences should approach the study of social phenomena in the same way the natural sciences have approached the study of natural phenomena (e.g., Martin & McIntyre, 1994; Phillips & Burbules, 2002; Rabinow & Sullivan, 1987). For example, those who take what is sometimes called an ''interpretive'' approach to social science argue that social phenomena differ from natural phenomena because they are meaningful to the actors involved.1 Thus the social domain, unlike the natural domain, is preinterpreted. Further, meanings are seen as embedded in complex social contexts that shape (enable and constrain) what can be understood2 in ways that the actors involved may not perceive. (As Bourdieu, 1975, 1991, reminds us, this is as true of scientists as of the people they study.) From this perspective, a primary aim of social science is to understand what people mean and intend by what they say and do and to locate those understandings within the historical, cultural, institutional, and immediate situational contexts that shape them. This debate is not, primarily, about methods; it is about the most important aims of social science given the nature of social phenomena. Both large-scale statistical studies and intensive descriptive studies can be and have been used to serve these different aims of social science.

In the SRE report, the debate about the unified nature of science is acknowledged briefly in at least two places (pp. 13, 50), but the alternative positions (those that argue against a unified or ''naturalist'' view of science) are neither elaborated nor rebutted. Thus, by asserting that ''scientific inquiry is the same in all fields'' (p. 2), the report risks representing a particular view of social science as universal. When viewed at a surface level, SRE's guiding principles, quoted as follows in short form, appear, for the most part, to be more generally applicable than my criticism implies. Here are the short statements of the principles taken from SRE:

1. Pose Significant Questions That Can Be Investigated Empirically

2. Link Research to Relevant Theory

3. Use Methods That Permit Direct Investigation of the Question

4. Provide a Coherent and Explicit Chain of Reasoning

5. Replicate and Generalize Across Studies

6. Disclose Research to Encourage Professional Scrutiny and Critique

(NRC, 2002, pp. 3–5)

The word ''replicate'' is the only term that seems to raise immediate concerns for researchers who privilege the stance that meanings are situated in social contexts. However, when I trace the way in which the principles are elaborated throughout the document and applied to concrete examples, it appears that a more specific vision of science is being privileged.

Consistent with a unified approach to social science, the report privileges programs of research that culminate in the establishment of replicable causal effects (idealized in randomized experiments), and it positions ''descriptive studies'' as preliminary or supplementary to this task. This statement summarizes a pervasive pattern that is reflected in the way individual studies and programs of research are critiqued, in the description of designs for the conduct of qualitative research that shows how they work together in a program of research, and in the advice that's given about the formation of a federal agency to support educational research, as illustrated below.

Readers are told, for instance, that descriptive studies are ''especially important when good information about the group or setting is non-existent or scant'' (p. 106) or may be more appropriate ''when theory is weak'' (p. 120), that ''these descriptions, when used in conjunction with causal methods, are often critical . . . because they illuminate key contextual factors'' and that ''the results may be used as the basis for new theoretical developments, new experiments, or improved measures on surveys that indicate the extent of generalizability'' (p. 106). Readers are cautioned, however, that ''An area of research that, for example, does not advance beyond the descriptive phase toward more precise scientific investigation of causal effects and mechanisms for a long period of time is clearly not contributing as much to knowledge as one that builds on prior work and moves toward complete understanding of the causal structure. This is not to say that descriptive work cannot generate important breakthroughs. However, the rate of progress should . . . enter into consideration of the support for advanced lines of inquiry'' (p. 101). As this last sentence suggests, the privileging of generalizable causal research is also reflected in advice to policy makers about funding. Similarly, in forming peer review panels, policy makers are cautioned that ''an overemphasis on ensuring broad content area, epistemological, and methodological representation can backfire if such considerations outweigh the overarching need to engage top scientific talent'' (p. 128). This emphasis is reinforced in critiques of existing programs of research, where replication and validity of causal conclusions are the primary criteria cited for distinguishing degrees of scientific rigor.3

Thus, consistent with a unified or naturalist approach to social science, generalizable or replicable predictions and causal conclusions appear to be the hallmarks of scientific research as portrayed in SRE. I want to be clear: I believe that this approach to research can be of great use for certain purposes. And there's no question that the committee has defended qualitative research as an important component of this approach to research. My concern is about privileging this vision over other approaches to social science and risking the implication that it represents the goals of (social) science in general. As Bourdieu (1975) suggests, ''Partial theories of science and its transformation are predisposed to perform ideological functions in the struggles within the scientific field . . . because they universalize the properties attached to particular states of the scientific field'' (p. 39).

If there is a theme that resonates across multiple philosophies of social science, it's the importance of encountering alternative perspectives and practices to illuminate the things that ''we'' in a given research community take for granted and that limit what we can understand about the world (e.g., Bernstein, 1985, 1992; Gadamer, 1979; Habermas, 1990; Hoy & McCarthy, 1994; Messick, 1989). Through these sorts of encounters, we can become aware of ''the problems we didn't know we had'' (Hoy, in Hoy & McCarthy, 1994) and can imagine how things might be otherwise. Following these authors, a rigorous approach to social science entails the ongoing willingness to seek out and learn from these sorts of challenges.

Bourdieu's work provides one such productive challenge to the vision of social science portrayed in SRE. While his research agenda does not fit comfortably within SRE's vision, his credentials as a social scientist would be hard to deny. His clearly programmatic scholarship spans over 40 years and is drawn on by scholars from quite different research traditions, including those who adopt a unified or naturalist view of social science (see, e.g., Bourdieu, 1998, and Bourdieu & Wacquant, 1992, for an accessible and relatively comprehensive introduction). Many of his research projects focus on education, particularly higher education, and, most directly relevant to our discussion, on academics and social scientists themselves (e.g., 1965, 1975, 1988, 1991, 1996).

Bourdieu enacts an instructively different perspective on generalizability or applicability, on causality or contribution, and on the associated implications for how knowledge might progress and change might be fostered. Additionally, through his studies of academic fields, especially his sociology of sociology, he provides an important opportunity - if we are willing to think analogically - for self-reflection about our epistemological moorings.

Like the authors of the SRE report, Bourdieu draws on numerous methods, ''qualitative and quantitative, statistical and ethnographic'' (1998, p. 2); however, he uses them to serve a different aim for social science: that of ''studying an object well-defined in social space and time'' (1998, p. 2):

My entire scientific enterprise is indeed based on the belief that the deepest logic of the social world can be grasped only if one plunges into the particularity of an empirical reality historically located and dated, but with the objective of constructing it as a ''special case

— Multiple authors in Teachers College Record


