'Proficiency for All' – An Oxymoron

**Susan Notes:** Although this paper includes provocative challenges, such as examining the conflicting slogans, 'proficiency-for-all' versus 'first-in-the-world,' surely a call for a return to norm-referenced testing should give us all pause.

See this Wikipedia entry for a brief explanation of what a norm-referenced test is.

That said, there are stunning observations/explanations in this paper. The discussion of how cut scores and NAEP proficiency levels are set is worth the trip. And there's lots more.

*Under NCLB, children with I.Q.s as low as 65 must achieve a standard of proficiency in math which is higher than that achieved by 60 percent of students in Taiwan, the highest scoring country in the world (in math), and a standard of proficiency in reading which is higher than that achieved by 65 percent of students in Sweden, the highest scoring country in the world (in reading).
What NCLB has done is the equivalent of demanding not only that 'C' students become 'A' students nationwide, but that 'D' and 'F' students also become 'A' students. As noted above, this confuses two distinct goals – that of raising the performance of typical students, and that of raising the minimum level of performance we expect of all, or almost all students. Both are reasonable instructional goals. But given the nature of human variability, no single standard can possibly describe both of these accomplishments. If we define proficiency-for-all as the minimum standard, it cannot possibly be challenging for most students. If we define proficiency-for-all as a challenging standard (as does NCLB), the inevitable patterns of individual variability dictate that significant numbers of students will still fail, even if they all improve. This will be true no matter what date is substituted for NCLB's 2014.
Political, not scientific, considerations continue to explain NAGB's stubborn refusal to abandon achievement level cut scores which have no scientific or scholarly credibility.*

November 2006

**By Richard Rothstein, Rebecca Jacobsen, and Tamara Wilder**

Paper prepared for the Symposium, "Examining America's Commitment to Closing Achievement Gaps: NCLB and Its Alternatives," sponsored by the Campaign for Educational Equity, Teachers College, Columbia University, November 13-14, 2006

Richard Rothstein (riroth@epi.org) is a Research Associate of the Economic Policy Institute. Rebecca Jacobsen (rjj7@columbia.edu) and Tamara Wilder (tew2101@columbia.edu) are Ph.D. candidates in Politics and Education at Teachers College, Columbia University.

Research for this paper was supported by the Campaign for Educational Equity, Teachers College, Columbia University. Views expressed in this paper, however, are those of the authors alone, and do not necessarily represent positions of the Campaign for Educational Equity or of Teachers College. We are grateful for the advice and assistance we have received from scholars and policy experts (James Guthrie, Walt Haney, Daniel Koretz, Robert Linn, Lawrence Mishel, Senta Raizen, Michael Rebell, Bella Rosenberg, Jesse Rothstein, Christopher Weiss) and government technical experts (Eugene Owen, Susan Loomis, Larry Feinberg, Gary Phillips, Kelley Rhoney). None of these are responsible for our failure to follow their advice or heed their cautions in all cases, and so the errors of fact or interpretation that remain are the sole responsibility of the authors.

Introduction and Summary
No Child Left Behind (NCLB) requires all students in grades 3 through 8, in each racial, ethnic, and socio-economic group, and whether or not they have special needs or are native English speakers, to be proficient in math and reading by 2014. This is widely understood to be unattainable, but educators and policy makers are insufficiently aware of the causes of our looming failure. Many of the law's supporters believe that the goal of 'proficiency for all' can't be reached primarily because there is too little time between now and 2014 for schools to improve sufficiently, and that the problem can be fixed by making the deadline more distant to allow more time to improve. For this symposium, we have been asked to consider whether such a goal can be reached; if so, how long it might take if, in fact, 2014 is too soon; and if the goal is unattainable no matter how distant, how we might establish more reasonable school goals for narrowing the achievement gap and raising the achievement of all children.

We conclude that the problem is more fundamental than a mis-estimate of how long it might take for all students to achieve proficiency. There is no date by which all (or even nearly all) students in any subgroup, even middle-class white students, can achieve proficiency. Proficiency for all is an oxymoron, as the term 'proficiency' is commonly understood and properly used.

In the following pages, we show why this is impossible, in several steps. First, we attempt to discern the meaning of 'proficiency' in NCLB, and conclude from the language and structure of the legislation that it intends all students to be proficient as defined by the National Assessment of Educational Progress (NAEP). Although the U.S. Department of Education has looked the other way as many states have claimed compliance with NCLB by requiring only low skill levels to pass standardized tests, the law explicitly requires standards of proficiency to be "challenging," a term taken directly from NAEP's achievement level descriptions.

We show that by ignoring the inevitable and natural variation amongst individuals, the conceptual basis of NCLB is deeply flawed; no goal can simultaneously be challenging to and achievable by all students across the entire achievement distribution. A standard can either be a minimal standard which presents no challenge to typical and advanced students, or it can be a challenging standard which is unachievable by most below-average students. No standard can serve both purposes – this is why we call 'proficiency for all' an oxymoron – but this is what NCLB requires.

NCLB's admirable, though difficult, goal of closing the achievement gap can only sensibly mean that the distributions of achievement for disadvantaged and middle-class children should be more similar. If there were no achievement gap, for example, similar proportions of white and black students would be 'proficient' and similar proportions of white and black students would achieve below that level as well. 'Proficiency for all,' which implies the elimination of variation within socioeconomic groups, is inconceivable. Closing the achievement gap, which implies elimination of variation between socioeconomic groups, is extraordinarily difficult, but worth striving for.

We demonstrate that the inevitable distribution of student outcomes is such that if all, not only some, students were to reach NAEP's challenging academic standard of proficiency, impossible gains would be required. By comparing NAEP results to scores on international exams, we show that even the top-performing countries in the world are far from being able to meet a standard of 'proficiency for all,' as NAEP defines it. Indeed, 'first in the world,' a widely ridiculed U.S. education goal from the 1990s that was supplanted by NCLB, is actually much more modest than NCLB's goal of 'proficiency for all'.

It is only in the last 15 years that NAEP results have been reported in terms of proficiency and other achievement levels. We describe the shift from NAEP's original scale and norm-referenced results to this more recent, criterion-referenced reporting. Discussing the methods used by the federal government to develop current NAEP achievement levels, we show that definitions of proficiency are fraught with subjectivity. Even if well-intentioned, making judgments of what students ought to be capable of, rather than basing judgments on observations of what actual students can achieve, yields results that the federal government itself acknowledges should be "interpreted with caution." The movement away from scale and norm-referenced score reports has resulted in the politicization of standardized testing.

The problems we describe cannot be fixed by lowering NCLB's expectation, for example, to a requirement that all students achieve NAEP's basic level rather than proficiency. Such a reduction would effectively return NCLB to the 'minimum competency' accountability standard of the 1970s that NCLB was explicitly designed to reject because it created no incentives to develop the critical thinking skills that today's graduates should possess. Even so, this basic standard still cannot be applied to 99 percent of all students, as NCLB demands. As the performance of 'first in the world' countries demonstrates, many students would still fail a requirement that all students have basic levels of achievement.

The irresponsibility of NCLB's expectation of 'proficiency for all' should not lead to the abandonment of goals for the improvement of student achievement, nor does it suggest that public education systems should not be accountable for realizing challenging degrees of improvement. We describe a simple statistical procedure, inspired by 'benchmarking' practices employed in the business world, which can be used to establish strenuous but realistic goals for improved achievement by students at all points in the distribution. Benchmarking permits a sophisticated return to norm-referenced measures of academic achievement, something not new to education but which has been abandoned in the NCLB legislation.

We conclude by describing reforms in education and youth development that might be necessary to raise achievement and to narrow achievement gaps substantially. Because unacceptably low average achievement for disadvantaged children is established in our current education and social system by age three, and because skill developed at later ages depends on investments in skill at earlier ages, we describe a 19-year program that might bring a birth cohort of children to maturity with high levels of performance. Remedial and compensatory programs may contribute to higher achievement for cohorts already moving through the system, but probably cannot succeed in the realization of goals that inspired the framers of No Child Left Behind.

NCLB and the NAEP Standards
NCLB states that all children shall "reach, at a minimum, proficiency on challenging State academic achievement standards and state academic assessments," and that these standards must "contain coherent and rigorous content" and "encourage the teaching of advanced skills."1 The law does not further define "challenging" standards, but it is reasonable to infer that such a standard challenges typical children to achieve at a higher level than their past performance. This inference is supported by the law's requirement that the National Assessment of Educational Progress (NAEP) be administered biennially in math and reading to a sample of fourth and eighth grade students in each state, providing a standard by which state judgments about proficiency can be compared. Furthermore, NCLB uses language to describe proficiency that parallels that of NAEP, whose definition of proficiency is "demonstrated competency over challenging subject matter."2 As Christopher T. Cross, appointed by the Department of Education in 2002 to coordinate rulemaking for NCLB, recently noted, NAEP "is supposed to be the benchmark for states, and that is why its use was expanded" in the act.

The NCLB requirement that proficiency be "challenging" can also be traced to an influential series of articles on "systemic school reform" in the late 1980s and early 1990s that shaped the development of federal accountability. In these, Marshall Smith and Jennifer O'Day proposed a program to create schools with "coherent and challenging instructional programs, that genuinely engage all, or at least most of their students."* They called for new standardized tests for accountability purposes that would "stand as a serious intellectual challenge for the student."4 The reform goal of "challenging content for all children," Smith and O'Day wrote, should take on "an aura of official policy;" and although NAEP is not explicitly aligned with any state's curriculum, "we expect that it will be moderately sensitive to effects of curricula that emphasize challenging content."5

* Marshall Smith was education advisor to Governor Bill Clinton when the latter co-chaired the National Governors Association education task force at the 1989 Charlottesville Education Summit where federal education goals were adopted; Dr. Smith then chaired the task force on education standards established by federal law in 1991 to develop a national accountability system, and went on to serve as President Clinton's deputy secretary and undersecretary of education.

NCLB specifies that NAEP achievement level definitions shall only be used on "a trial basis" until the Commissioner of Education Statistics evaluates them and determines that they are "reasonable, valid, and informative to the public."6 Yet nearly five years later, there has been no significant reconsideration of historic NAEP definitions of achievement levels, so it is again reasonable to infer that NCLB's implicit definition of proficiency is consistent with NAEP criteria.* In the NAEP administrations immediately prior to the adoption of NCLB, only 22 percent of fourth graders in public schools nationwide were deemed proficient in math and 27 percent in reading. For eighth graders, only 25 percent were deemed proficient in math and 29 percent in reading.7†

This gives us a rough way to estimate how much improvement would be required for all students in all subgroups to be proficient. At present (the most recent data are from 2005), 71 percent of all eighth graders in public schools are below proficiency in reading on the NAEP. For the typical student, becoming proficient would require a gain of 0.6 standard deviations.8‡ In other words, by 2014 the median student would perform similarly to a student who is at about the 72nd percentile of performance today.§ For a student whose performance is below the median, but still similar to that of most same-age students (i.e., those who are below the median but still performing better than the lowest-performing 16 percent of all students), becoming proficient would require a gain of up to 1.6 standard deviations.* In other words, a student who is now at the 16th percentile in today's achievement distribution would also perform similarly to a student who is now at the 72nd percentile. Approximately one-sixth of all students would require a gain even greater than 1.6 standard deviations.
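The arithmetic behind these estimates can be sketched roughly as follows. This is only an illustration, not the authors' own calculation: it assumes a perfectly normal score distribution (as the authors' footnote says their estimates do) and uses the 2005 eighth-grade reading figures given in that footnote (cut score 281, mean 260, standard deviation 35). The function and variable names are hypothetical.

```python
from statistics import NormalDist

# 2005 NAEP eighth-grade reading figures quoted in the paper's footnote.
CUT_SCORE = 281.0  # proficiency cut score
MEAN = 260.0       # mean scale score
SD = 35.0          # standard deviation

# Assumption: scores are treated as perfectly normal, as in the paper's estimates.
scores = NormalDist(MEAN, SD)


def gain_in_sd(percentile: float) -> float:
    """Gain, in standard deviations, needed for a student at the given
    percentile (a fraction between 0 and 1) to reach the proficiency cut."""
    current_score = scores.inv_cdf(percentile)
    return (CUT_SCORE - current_score) / SD


# Typical (median) student: (281 - 260) / 35, i.e. about 0.6 SD.
median_gain = gain_in_sd(0.50)

# Percentile rank of the cut score in today's distribution: about the 72nd.
cut_percentile = scores.cdf(CUT_SCORE)

# Share of students more than one SD below the mean: about 16 percent.
share_below_one_sd = scores.cdf(MEAN - SD)

# A student exactly one SD below the mean needs about 1.6 SD.
low_gain = gain_in_sd(share_below_one_sd)

print(f"median gain: {median_gain:.2f} SD")
print(f"cut score percentile: {cut_percentile:.1%}")
print(f"share below one SD: {share_below_one_sd:.1%}")
print(f"gain at that point: {low_gain:.2f} SD")
```

Under the normality assumption the cut score sits slightly above the observed 71 percent below-proficiency share, which is consistent with the authors' caution that these figures are approximations.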

* As we discuss below, this requirement for a re-evaluation of NAEP achievement levels has been part of the Elementary and Secondary Education Act for 12 years, and ignored throughout that period.
† Data for fourth graders in reading, and for fourth and eighth graders in mathematics, are from NAEP administrations in 2000. Data for eighth graders in reading are from NAEP 1998. NAEP was not given for eighth grade reading in 2000. Data are for all public school students, including those who took the test with accommodations. These data include the percent of all students whose scores were above the proficient cut score, including those whose scores were above the advanced cut score.

‡ These and similar estimates in this paper are approximations because the distributions of test scores are not perfectly normal and therefore the median (or typical) student may not be identical to the mean (or average) student. Our estimates, however, are calculated from the mean, assuming perfect normality. In 2005, the proficiency cut score was 281 in reading, the mean score was 260, and the standard deviation was 35.
§ Throughout this paper, we adopt a convention of describing percentile ranks as ascending with improved performance. In other words, the best-performing 1 percent of students are described as being at or above

For the rest of the paper, a pdf file, go to the url below.

— Richard Rothstein, Rebecca Jacobsen, and Tamara Wilder

*Campaign for Educational Equity*

http://www.epi.org/webfeatures/viewpoints/rothstein_20061114.pdf
