Seeking Alternatives to Standardized Testing

Susan Notes: Deborah Meier's answers to the tired old questions about how the public is going to know when kids are learning should make everybody cheer.

By Jay Mathews
Washington Post Staff Writer

Deborah Meier is my favorite answer to an important question: Is there some other way to make low-income neighborhood schools better than through the annual standardized testing required by the new federal No Child Left Behind law?

In 1974 she created the Central Park East School in the East Harlem neighborhood of New York City. That school showed, through its students' success in college and the workplace, that the children of day laborers, garment workers and welfare recipients could achieve academic success by being treated in public school as if they were graduate students. Instead of being given regular standardized multiple-choice tests, they were evaluated through reviews of their written work and interviews with experts in the subject matter they had studied.

It has been more than a year since I last wrote about Meier, co-principal of the Mission Hill School in Boston, and her latest book, "In Schools We Trust." Her perspective has become even more significant as the No Child Left Behind (NCLB) rules hit all American public schools. So I asked her if she would be willing to have an e-mail conversation with me about what is going on, and what we should be doing about it. Here is the result:

MATHEWS: You are one of our nation's leading educators, with a remarkable record of innovation. You showed at the Central Park East School in East Harlem that inner-city students could thrive in a school that did not tie itself to regular standardized testing, but encouraged deep reading and much writing, with assessments made by panels of experts having long conversations with students and looking at their work. Yours is, in my view, the most hopeful alternative to the testing-kids-and-rating-schools approach taken by the No Child Left Behind law.

Yet I see many good schools in low-income environments, like the KIPP schools, or some of the Edison schools, that seem to be producing well-educated kids by using the testing approach, with some variations, and very few that are having similar success with your approach. I have come to wonder if your approach is too hard to make work unless the staff is unusually gifted, as yours was at Central Park East.

Am I wrong? Why don't we have a large and growing number of schools in low-income neighborhoods following your example?

MEIER: It mostly depends on how one defines being "well-educated." You get what you ask for. For those who define it as KIPP or Edison do, clearly the test-prep approach works, although they'll soon discover its limits. I'm for their right to define education that way and to use assessments that best capture what matters to them. But I think time will demonstrate that narrowing the gap between low- and high-income test takers is a never-never-land game -- for many reasons -- and that a better definition of being well educated offers a better shot at real equity. Richard Rothstein has done a good job of chasing down the so-called miracle schools that confound the odds on testing and proving they are mostly myths.

Alas, in a society determined to increase all the other gaps between low and high income and black, brown and white Americans, it will get harder, not easier, to narrow, much less close, those gaps. But even if one could do it, the gap in test scores is not the one I'm most worried about. Neither is it the one that most affects how kids do after they leave school, or what kind of society we are creating. The kinds of tests we're all focused on simply are not a good measure of what really counts.

C.P.E. [Central Park East] started from a different premise -- the premise that got me interested in becoming a teacher to start with. I do not, for good reasons, view test scores as a measure of intellectual achievement -- or of much else that we know matters in the long haul. The success of C.P.E. was not based on test scores but on the remarkable data we collected on who graduated, who went on to college, and, most important, how they fared in real life afterward.

My concern then, as now, was that few kids in America, and especially few of those attending working-class or low-income schools, got the kind of education that prepared them to be powerful members of the ruling class. In a democracy that should be the nonnegotiable central goal of public schooling. The kind of schooling I got (including my home schooling) prepared me for what I view to be the central functions of education in a democracy: to know how to exercise judgment on matters of considerable complexity, not to mention uncertainty. The schools I subbed at in Chicago in the mid-60s were, at best, focused on right/wrong answers, memorization, dutifulness and social adjustment. I did not believe this was inevitable.

MATHEWS: That is a pretty lofty goal, preparing the ruling class. How do we do that?

MEIER: For the past 40 years we have been creating schools in which both adults and children -- families, teachers and kids -- are obliged to use the school to enhance their own education. In my first book, "The Power of Their Ideas," I show what this meant. It involved teaching kids the power of reading, as well as the know-how; the power of math, as well as some efficient algorithms; the power of history, science and art.

The teachers at C.P.E. and now at Mission Hill are not unusually gifted (at least not by any measurable standard!). But we have unusually high expectations for ourselves and our kids. Since we want kids who take responsibility for exercising judgment, we surround them with adults who are required to do the same. And we make sure the relationships between kids and adults are strong and educative. We invented five habits of mind -- basic questions -- that we decided were at the heart of the usual academic disciplines, as well as all the disciplines of living. We organized curriculum, pedagogy and assessments so that kids and adults would get a lot of practice using their minds in these ways. We wanted a lively community in which we learned through the company we kept. Teachers reading scripts was not an answer compatible with our definition of the job. Means and ends need to fit.

Is this a one-of-a-kind idea? In the details, yes. But in concept not at all. For the first time in history there are hundreds of public schools like ours working with poor and minority kids. It's still a drop in the bucket. We don't have enough schools following our example because the deck has been stacked against them. First they told us that these ideas just wouldn't work for "those kinds" of kids -- meaning poor and black kids above all. When the educational authorities got embarrassed about saying this -- and the C.P.E. data helped show it was nonsense -- they still rarely encouraged it. If they devoted half the energy they've spent discouraging folks from trying out our ideas on encouraging them instead, hundreds would be thousands.

Yes, it would take time for it to become the norm. It's the kind of change that can't just be plugged in. But a system that defines being well-educated as getting higher test scores or else, and that puts considerable coercive power around the "or else" (NCLB), can hardly complain that we're not seeing a "large and growing number of schools" following our example. What's remarkable is that any schools are managing to stay the course of real intellectual rigor in the so-called reform climate facing us today.

MATHEWS: Okay, name two of these hundreds of schools that are following the C.P.E. lead and tell me what evidence there is that they are having the impact you say they are having on these kids' lives.

Competent reading, writing and arithmetic is enough for me. The appreciation of learning that you strive for is of course very important, but we average Americans with limited imaginations need something simple to measure, and the ability to read efficiently, write clearly and do math through algebra works for me. If you go beyond that, without a measuring method that people understand, you leave yourself open to doubt about what you are teaching and '60s-bashing and other unhelpful nonsense.

You don't say this, but a lot of people who take your side of the argument act as if the competitive, numbers-hungry, money-oriented American culture is at fault, and once we all become more sophisticated, we will see how good are the schools that reject those values. That reminds me of the teacher union head at Garfield High School in East Los Angeles who said Jaime Escalante, who against all odds had built a successful Advanced Placement program in the inner city, was not really adding much value. The union representative said the system wouldn't get really better until we had a social revolution in the United States. I am 58 and can't wait that long, and the kids in schools these days have even less time than I do to get a good education.

MEIER: Do you really want me to name just two schools? The Community School in Ithaca, N.Y., and the School Without Walls in Rochester, N.Y., go back almost as far as C.P.E. I could name 40 more in New York State alone. Ask George Wood about the transformation of his school in southern Ohio -- a rural high school that rarely sent kids on to college and now regularly does. Ask Dennis Littky about the public Metro schools with 100 percent college attendance for kids whose parents mostly didn't graduate from high school. At the elementary level there are schools like the Capital City Charter in D.C. (now in its fourth year) and The Central Academy in Middletown, Ohio. I could go on and on.

You want a measure that everyone can understand? Here's one: the number of kids who graduate from high school versus the number who start, the number who go on to college compared to a comparable peer group. New York State appointed a panel to look at the data on graduation rates and post-graduate education at 40 schools using "performance-based" graduation criteria. The panel also looked at examples of the students' work and heard from graduates and their families. The review panel -- most of them testing experts -- found this evidence impressive. They said it was the data on real life success that moved them most.

Rare are the elementary schools that keep such data -- as we did at C.P.E. -- and it's rarer still for researchers to study this kind of outcome data, as David Bensman did for C.P.E.

I do think it would be possible to design a pass/fail reading/writing/basic math test, similar to the written part of the driver's test -- and one that I might find unobjectionable! But it's the performance part of the driver's test that is harder -- and isn't standardized in a single state despite the importance of trusting each other on the road.

Do I think this would satisfy the folks pushing for testing? (And the push is not coming primarily from the public.) Not as long as what they want is a rank ordering system -- a way to compare schools on a scale of best to worst. Have you ever wondered why no one tells you whether they passed the driver's test with a high or low score? Or why we don't keep comparative data between states or communities? Does anyone care whether a lawyer passed the bar high or low? It's the rank-ordering that requires tests that actually are designed to differentiate, to seek those items that will separate the best from the next best, all the way down.

Such a basic test should first be taken by the folks we honor by electing to office. That's the Meier Amendment: the people who legislate or mandate a test should be required first to take it themselves to ensure that it's measuring what they think it is. It's a form of validity checking. They might even have their scores posted!

Some of the things we wish for devoutly are not possible. Wanting a cure in my lifetime (and I'm closer to the end than you, Jay) may not be in the cards -- even if all the stars were aligned our way. And, alas, as long as we keep increasing the gaps between rich and poor in every other aspect of life. Perhaps that's what Escalante's union rep meant. But we are even less likely to hold the gaps steady if the way we measure is something as peculiarly sensitive to both socio-economic status and race as the instruments we now use. That's how they were designed -- to be sensitive to the larger culture (or genes -- take your pick).

It is intriguing how the right-wingers have done a 180-degree turn regarding nurture versus nature when it comes to public education, as well as on whether racism should be measured by inputs or outcomes. It seems we can re-engineer for equity -- if we focus just on public schools. We can do a lot better. I'm amazed at what schooling can accomplish. But we're not likely to have striking results across the board if we don't intend to pour enormously larger amounts of money into education -- and not just the 9-to-3, 180-days-a-year part of education.

Money alone, as critics note, is not a substitute for good schooling (although for some reason rich people insist on it for their kids) but the faster we want to go, the more it will cost us. If we decided to put all our weight into radically serious reforms of the type I've described, even with all the money in the world, it would be a generation until we saw major results for all children. You can't retool even a factory overnight. Education, however, is dependent on changes in cultures in ways factories aren't, and change cannot therefore happen Soviet-style -- through brainwashing or removing children from their benighted parents at earlier and earlier ages. It didn't work for communism either. People have a way of hanging on to their pasts, to their traditions -- which is why Ted Sizer is right that the best reforms are shaped by real human beings out of the stuff at hand. It's amazing, as doctors will tell you, how many myths persist in medical care because neither patients nor doctors can stand not doing some things the way they are used to doing them.

Think, Jay, of all the intolerable things in the world you and I tolerate -- some far, far more horrendous than the average bad school in the U.S.A. -- not because you or I are unconcerned or innately patient, but because getting to the wrong place faster is a disservice to the task of getting to the right place as fast as possible.

MATHEWS: As usual, you have given me a great deal to think about, and much that I agree with. I want to hear more from you on the one part of your message that just did not seem to fit the reality I see in the states -- your belief that No Child Left Behind is not an effort to make sure all kids acquire necessary skills in reading, writing, and math, but instead a ranking exercise to make schools look bad and feed our pecking-order culture's need for mindless differentiation.

I have been watching the reporting of test results in Virginia, Maryland and D.C. carefully, and I don't see much ranking being done. The reporting by those states and the reporting by my newspaper focus instead on a few other questions -- how many schools have reached the AYP [adequate yearly progress] targets? How many haven't? What is being done to improve the schools that missed AYP? And, no less important than these first three, are those schools that have missed AYP really failing in some way, or is this just a quirk of the labeling system in the new law?

The other states that I watch at a distance, particularly New York and Massachusetts, in which you spend most of your time, seem to be focusing on the same things. So where is this ranking going on?

And, more important, if we are agreed that a driver's-license-like test for reading, writing and arithmetic is possible, and useful, then why not accept the tests that we have as a version of that -- most of them appear to be designed to be exactly that -- and work to improve them rather than tossing them out as ideologically incorrect?

MEIER: Some good points. The states and the feds are coming at this with somewhat different intents. The state tests assess individual kids first, schools second. The states rank schools by average test scores (published cumulative numbers) and withhold promotion or diplomas from individual kids based on scores. The feds have a different twist.

You are right that it's harder to rank schools using just NCLB and AYP. This is a yes/no category -- and consists of a whole array of possible grounds for getting a no. I imagine the "good" schools will soon solve this -- by publishing their subscores so that the successful subgroups will be reassured. But I suspect that one reason for not ranking on a federal level is to undermine suburban support for public education. NCLB supporters love saying to the contented middle class, "Just you wait and see, even your schools and your kids are not as good as you think." Of course they're not -- but the thing that's wrong with them is exactly what a focus on testing will make worse.

The impatient folks at the federal and business level are frustrated at the difficult-to-budge support that public education gets in poll after poll. If only, they complain, parents all realized how dumb their schools, kids and teachers are. They remind me of 1960s lefties, ready to bomb our complacent schools. Maybe this law was designed to make us so discouraged with public education that we can move on to unregulated privatized education. By labeling 75 percent of our schools failures, we can soon drop the old American love affair with the "common school" and move on to the marketplace.

But this strategy underestimates the ingenuity of the powerful. For example, they can make sure the underachieving kids don't go to their schools. As we celebrate the 50th anniversary of Brown vs. Board of Education, we have an education law that never mentions integration and rewards homogeneity. Note how few were those who objected to the fact that the only schools that did not fail AYP in Boston were the selective schools. No one said, "Hold on -- that's unfair." This is true nationwide: diversity works against you in surviving NCLB. The suburbs will either discover a similar solution or discover the virtues of privatization -- or, just as bad but better intended, figure out how to do more test prep for "those kids," who mostly need more intellectual prep. (The same thing is evident in pre-schooling -- even though we know that early narrow test-prepping does not have positive long-term effects.)

Note also that if we succeed in raising test scores, a la Texas, it might be that all we're doing is dropping more kids from the registers. As in business, the higher the stakes the more tempting it is to trim the truth.

I like the idea of seeing if you and I could agree on some basic skills tests -- given at about 9th grade (and to state legislators) -- that would undercut one set of legitimate fears. This would focus on the longer-range task of actually raising serious intellectual standards while also undertaking the raising of real school and community standards. That's how democracy works -- by persuasion.

I got into this work because I was appalled at our thoughtlessness about standards. That's why I built the schools I started on standards -- but not on standardized tests. More on these kinds of standards later. Of course, if you are right in assuming that the tests we have now tell us things of importance -- beyond which population is being served -- life would be simpler. It's hard to undermine this belief -- and since I also once held it I'm not surprised.

MATHEWS: Don't you think the AYP results, as annoying as they are, have some merit as a way to focus attention on those kids who need the most help?

MEIER: It's true that AYP is supposed to be a way of tackling this dilemma, measuring not only who tests best but progress and change. It will, perhaps, take time to convince you, but I believe there is sufficient evidence that most fluctuations, forward or backward, are due to luck, or cheating, or a change in who the school is serving.

Furthermore, if we are not measuring something of importance then spending most of our resources on trying to do the unimportant better is bad, not good. The titles of the tests make them look sensible. It takes time, and talking with kids, to realize they are not measuring basics -- or advanced knowledge, either. Even the 6th-grade math test asks questions that neither you nor I have ever faced since we left school. Ditto the tests in literature (so-called reading). Even so-called authentic writing tests are scored according to a formula -- the scorers devote a minute or two per essay. This kind of test misleads us about what constitutes effective written communication, not to mention other forms of communication -- like the spoken word. And God help us when they start testing in science and history!

Not only do the tests not measure basics, but they also distract us from teaching the kind of stuff that might engage kids' minds and hearts, stuff that would force them to engage in the real discipline of intellectual life -- weighing evidence, seeing other ways of looking at the same data or situation, comparing and contrasting, seeking patterns, conjecturing, even arguing. The trouble with such skills is they don't come packaged with right/wrong answers.

I've been asked, "But Deb, how can folks use such higher order skills if they don't first pile into their brains the facts -- the yes/no stuff? Who wants to know your opinion on World War I if you are not knowledgeable about it?" Not a bad point. But since the average school experience never teaches us enough about anything to have an opinion, we never get to the task of knowing how educated people arrive at opinions or judgments -- and not always at the same ones! Where are youngsters supposed to learn about this?

From infancy on, the human brain learns to think by tackling the facts of life in context. We are born scientists. The human being is a natural at this -- if we feed him stuff worth thinking hard about. Facts stored without ideas remain inert and useless. We "remember" by using what we store, in both new and old ways. It's only stuff we truly don't respect, and don't assume anyone will actually use, that we teach in this odd disembodied way. We wouldn't teach anyone to drive, cook, type, negotiate with others, lawyer, doctor, or you name it without real life practice.

If these tests of math and reading were truly about basic skills we'd set the passing score at almost 100 percent -- as we do in driver's tests. Yet in Massachusetts, with its "tough tests," you can pass the exit math test by getting only about 6 out of 10 answers right, and when you take the re-test it's even less (they leave out the harder questions). Imagine a driver's test that you can pass if you can turn right but not left. "Oh well, most of the time you don't turn at all, and you make right turns at least half the time." (Actually, the written "standardized" portion of the driver's test is pretty silly -- e.g., how many feet can you park from the fire hydrant, what's the shape of a yield sign, etc.)

At Mission Hill we actually do have a reading basic skills test. We tape-record every kid reading aloud and answering questions aloud, and score these on a scale of 1 to 7. We consider 6 to be basic reading proficiency. The tapes are accessible to skeptics, families, kids and external assessors. We're willing to stand behind the claim that they've reached proficiency -- that is, can read and understand anything they could also make sense of if it were read to them. They have closed the gap between oral language skill and literacy. We measure science, history and other intellectual disciplines by having kids present and defend a body of work before an external review board, as well as undertake several on-demand timed tasks without assistance. This is a method that is entirely doable by any school that is given the time to devote to it instead of having to focus on standardized test prepping.

MATHEWS: I like that taping idea, but it has the problem that most authentic assessments do. Most parents who have to decide where to send their children, and most taxpayers who have to decide if they should support a rise in taxes or a bond issue to support their public schools, don't have time to drop by your school -- and all the other available schools -- to review the tapes. So you are willing to let us parents and taxpayers have a driver's license written type test of reading, writing, and math so we can be sure we are not being conned? And, if so, how would that test be different from the ones the states are giving our kids under NCLB? And how could you keep that test from distorting and corroding the values that are most important to you in creating your schools?

MEIER: Somewhere along the line we have to decide whose judgment counts most. Openness, external review, and a variety of forms of evidence are all we have to go by when we make all those other important decisions in a democracy -- from deciding life-and-death matters on a jury to selecting the leaders of our land. That's a fundamental leap of faith upon which democracy rests.

In the end doctors, lawyers, auto mechanics, computer engineers and scientists depend on the exercise of informed judgment. I suspect the public is far more ready for this than the policy experts are. We're simply not all in agreement on what evidence matters most to us, in part because we're not all in agreement on the age-old debate about what constitutes a well-educated person. Trying to paper over these differences with technical solutions -- like tests -- won't further the debate, much less solve it. But it sure distracts us from getting on with the hard work that needs to take place in each school to make them the kind of places we would willingly send our own kids. It may require public schools of choice.

But, Jay, some conning can't be avoided -- even in driver's tests. Probably some folks pass the test because they've conned the inspector. (It would help if we required all employers to give parents time off to visit schools, so folks could see for themselves.) One man's powerful argument is another's con. If we had a solution to what we each see as conning, we'd have no need for democracy. I rather like the idea that human affairs are inevitably unpredictable and messy and rest on judgments that always contain an element of uncertainty. But that doesn't mean we can't do a better job helping to train young people -- all of them -- to exercise better and more informed judgment. This in turn means a system of testing that rests on an informed look at a range of evidence that each community finds compelling.

Does the state have a role? Yes, but not the only one. The feds? Except where it's a matter of constitutional rights, no. The founders of the country had that one right.

— Jay Mathews and Deborah Meier
Washington Post

