
Grand Test Auto: The end of Testing

Ohanian NOTE: Take the lead metaphor to heart and you'll see the prediction that soon all kids will come with barcodes. . . Later the metaphor changes to GPS devices in each kid's brain.

It's interesting that the title calls this "the end of testing" when actually it's testing all the time. Note that at the School of One, the computer determines each student's next assignment.

Here's how the author of this article identified himself at Education Next: Tucker is a social entrepreneur who has founded and led both nonprofit organizations and for-profit companies. His expertise ranges from finance to strategic development and technology, he has extensive knowledge of the nonprofit sector, and he has led the growth of several organizations; in 2000, he co-founded SmarterOrg, Inc, an e-learning company, and sold it in 2002 to the Isoph Corporation, a provider of specialized e-learning software, services, and content development. Through 2005 he served as Isoph's chief knowledge officer, co-managing the company's operations and advising organizations such as the American Red Cross, the Land Trust Alliance, and the National Wildlife Federation on organizational learning strategies and business models. Prior to graduate school, Tucker managed the training, conference, and publishing operations at CompassPoint Nonprofit Services, the nation's largest provider of management and technical training to nonprofit organizations.

Also NOTE: Joel Rose, highly praised in this article, is Chief Executive Officer, School of One, New York City Department of Education, and was a member of The Broad Superintendents Academy Class of 2006.

Chief Executive Officer, School of One
New York City Department of Education. Previously, he served as the chief executive for human capital for NYCDOE. Before that, he was a Teach For America corps member in Houston and later a senior executive at Edison, where he also served as the company’s associate general counsel, chief of staff and vice president for school operations. He earned a bachelor’s degree in political science from Tufts University and a law degree from the University of Miami School of Law.

In January 2012, Rose launched New Classrooms (trademarked). Start-up funders are:

  • The Bezos Family Foundation

  • The Bill and Melinda Gates Foundation

  • NewSchools Venture Fund

  • Carnegie Foundation

  • JP Morgan Chase

The Board of Advisors includes the usual suspects:

    Norman Atkins
    Co-Founder and President
    Relay Graduate School of Education

    Ann Bradley
    American Federation of Teachers
    Innovation Fund

    Anthony Bryk
    Carnegie Foundation for the Advancement of Teaching

    Tom Carroll
    National Commission on Teaching and America’s Future

    Linda Darling-Hammond
    Professor of Education
    Stanford University

    Chris Dede
    Professor in Learning Technologies
    Harvard Graduate School of Education

    Mike Feinberg

    Susan Fuhrman
    Teachers College Columbia University

    John Katzman
    Founder and Executive Chairman

    Marguerite Kondracke
    President & CEO
    America's Promise Alliance

    Wendy Kopp
    CEO and Founder
    Teach For America
    CEO and Co-Founder
    Teach For All

    David Levin

    Arthur Levine
    Woodrow Wilson National Fellowship Foundation

    Ellen Moir
    Founder and CEO
    The New Teacher Center

    Tom Payzant
    Professor of Practice
    Harvard Graduate School of Education
    Former Superintendent
    Boston Public Schools

    Jenny Shilling Stein
    Executive Director
    Draper Richards Kaplan Foundation

    Tom Vander Ark
    Getting Smart

    Gene Wilhoit
    Executive Director
    Council of Chief State School Officers

    By Bill Tucker

    In the old days, supermarkets struggled to keep track of the thousands of items on their shelves. Each month, they'd shutter the store so their employees could hand-count every soup can, cereal box, and candy bar. The first electronic scanning systems came along in the 1970s, which helped take a little of the drudgery and inefficiency out of the grocer’s life. Then came waves of advances in computing power and remote sensing technologies. By now, for most retailers, regularly shutting down to conduct inventory is a thing of the past. Instead, they can constantly monitor their shelves through bar codes, scanners, and radio-frequency devices. And as it has turned out, all this technology has given them far more than just a better way to count cans: today, retailers not only keep track of what’s on their shelves, they also use the constant flow of real-time information to predict, analyze, and respond quickly to consumer demand.

    This kind of real-time assessment and response has become a part of modern life in a number of areas. New car owners increasingly rely on remote sensors, not a yearly mechanic’s visit, to detect engine problems and keep tires at the right pressure. And more and more, diabetics no longer have to stop and inject themselves. Instead, they use a continuous glucose monitor to send blood readings to an insulin pump, which warns them if their blood-sugar level spikes and allows them to adjust their level of insulin. In each of these areas, a scientific understanding of systems—whether biological, mechanical, or commercial—has been combined with new technology to develop more useful, productive, and actionable monitoring and measurement. And all of it takes place almost invisibly, in the background.

    Not so in America's classrooms. Schools across the nation still essentially close to conduct inventory--only we don't call it that. We call it "testing." Every year at a given time, regular instruction stops. Teachers enter something called "test prep" mode; it lasts for weeks leading up to the big assessment. Just as grocery-store workers might try to fudge inventory numbers to conceal shortfalls in cash, schools sometimes try to fudge their testing results, and cheating scandals erupt. Then, in a twist, regular classroom instruction resumes only halfheartedly once the big test is over, because there are no stakes attached to what everyone's learning. Learning stops, evaluation begins: that's how it works. But in the not-so-distant future, testing may be as much a thing of the past for educators as the counting of cans is for grocers.

    Zoran Popovic, a computer scientist and the director of the Center for Game Science at the University of Washington in Seattle, is one of a new cadre of researchers pointing the way to a post-testing world. Popovic has designed a prototype of an online, puzzle-based game called Refractions. The game challenges students to use their knowledge of fractions to help provide the right amount of power to animals in marooned spaceships. Using puzzle pieces, students bend lasers and split the energy beams into half, one-third, and even one-twelfth power. In the process, they get a feel for a number of important concepts, such as equal partitioning, addition, multiplication, and common denominators.

    While Refractions looks like a relatively simple game, the real complexity is behind the scenes. The game records hundreds of data points, capturing information each time a player adjusts, redirects, or splits a laser. This data allows Popovic and his colleagues to analyze and visualize students' paths through the puzzles—seeing, for example, whether a student made a beeline for the answer, meandered, or tried a novel approach. Since the data shows not just whether the student solved the puzzle, but also how, it can be used to detect misconceptions or skill gaps. Good math teachers do this all the time when they require students to "show their work"--that is, to write down not just the answer to a math problem on a test, but also the calculations they used to derive the answer. The difference is that Popovic's game essentially "shows the work" of hundreds of thousands of players, recording data automatically in a way that allows teachers and scientists to draw robust inferences about where students tend to go astray. This would be virtually impossible with paper tests. And it's this massive scale that promises not only new insights on student learning but also new tools to help teachers respond.
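    The article does not say how Refractions turns its logged actions into a judgment about a student's path. As a rough sketch only, one could compare the number of logged actions with the length of the shortest known solution. The function names, the action labels, and the 0.8 threshold below are all hypothetical, and the sketch makes no attempt to recognize a "novel approach."

```python
def path_efficiency(actions, shortest_solution_length):
    """Ratio of the shortest known solution length to the actions actually taken.

    `actions` is a list of logged moves such as "adjust", "redirect", "split".
    Both the log format and this metric are assumptions of the sketch.
    """
    if not actions:
        raise ValueError("no actions were logged for this puzzle")
    return min(1.0, shortest_solution_length / len(actions))


def describe_path(actions, shortest_solution_length, beeline_threshold=0.8):
    """Label a path as a 'beeline' or as 'meandering' (hypothetical labels)."""
    efficiency = path_efficiency(actions, shortest_solution_length)
    return "beeline" if efficiency >= beeline_threshold else "meandering"
```

    A student who solves a two-move puzzle in two moves would be labeled a beeline under these assumptions; one who takes ten moves would be labeled meandering.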

    Popovic's game is one of dozens of experiments and research projects being conducted in universities and company labs around the country by scientists and educators all thinking in roughly the same vein. Their aim is to transform assessment from dull misery to an enjoyable process of mastery. They call it "stealth assessment."

    At this point, all this work is still preliminary--the stuff of whiteboards and prototypes. Little if any of it will be included in two new national tests now being designed with federal funds by two consortia of states and universities and scheduled to be rolled out in classrooms around the country beginning in 2014. Still, researchers have a reasonably clear grasp of what they someday--five, ten, or fifteen years from now--hope to achieve: assessments that do not hit "pause" on the learning process but are embedded directly into learning experiences and enable a deeper level of learning at the same time.

    In this vision, students would spend their time in the classroom solving problems, mastering complex projects, or even conducting experiments, as many of them do now. But they’d do much of it through a technological interface: via interactive lessons and simulations, digital instruments, and, above all, games. Information about an individual student’s approach, persistence, and problem-solving strategies, in addition to their record of right and wrong answers, would be collected over time, generating much more detailed and valid evidence about a student’s skills and knowledge than a one-shot test. And all the while, these sophisticated systems would adapt, constantly updating to keep the student challenged, supported, and engaged.

    One way to think of stealth assessment is to compare it to a GPS system--one that has the ability to monitor, assess, and respond to progress along the way. The metaphor is helpful, because it illuminates not only the promise of stealth assessment but also the crucial missing component that we lack now. A GPS system starts with a detailed digital map of all the roads and possible detours in a given terrain; then the system's software constantly tracks your car's location relative to that map. Similarly, stealth assessments will require a detailed understanding--a cognitive model or map--of all the different ways learning can progress in math, science, and various other disciplines. A student's performance would then be tracked against the various routes and pathways that learners tend to follow as their understanding progresses. But while cognitive scientists have made great strides in the past two decades, our understanding of how students learn is not nearly detailed enough to resemble a full map, certainly not one that reflects the whole range of possible routes, detours, intermediary steps, and junctions created by each student's individual strengths and weaknesses.

    One glimpse of this future can be found at School of One, a personalized learning program in New York City that was named one of Time magazine’s "50 Best Inventions of 2009." School of One is an experiment not only in technology-based lessons and assessments but also in competency-based student progress. If a seventh grader is working at a fourth-grade level, instructors focus unapologetically on fourth-grade material, attempting to ensure that, as the student progresses, he has really developed the right fundamental understanding going forward. It’s a big adjustment for teachers and parents, but students respond in striking ways. Joel Rose, former School of One CEO and now co-founder of New Classrooms, says that when students see that assessment results are used, in real time, to help them learn, then their entire relationship with testing changes, so much so that they often naturally draw their own clear distinction between the one-shot “tests” that they face in other classes and ongoing “assessments.”

    The coming revolution in stealth assessment is not without potential dangers, pitfalls, and unintended consequences. If students perceive that the constant monitoring is meant primarily to judge them, rather than help them improve, then they may be less likely to experiment or take risks with their learning. Worse still, it’s conceivable that teachers would just find new ways to teach to the test, focusing their instruction on how to beat a computerized assessment algorithm rather than how to solve a challenging physics problem.

    Eric Klopfer, director of MIT’s Education Arcade and a proponent of stealth assessment, warns against a superficial “gamification” of learning. Just as in traditional classrooms, where the use of gold stars and special awards is only as sound as the underlying relationships among students and teachers, adding game-like rewards to educational lessons only works if the game itself is rewarding. If you give students a reward for things they don’t want to do, Klopfer says, then students stop doing those things as soon as the reward stops. It takes good instruction to challenge and engage learners. The best intrinsic motivation isn’t a flashy game, Klopfer says, but “success through meaningful accomplishments.”

    Still, stealth assessments are at a very early stage in their development, having yet to be proven in a large-scale trial. Their drawbacks, kinks, and breakthroughs will no doubt become far more clear, and perhaps more manageable, over time. Numerous big experiments are on their way. As Popovic works on Refractions, which has been played by more than 100,000 people at this point, he’s also building the equivalent of an open-source platform to accelerate others’ efforts, in hopes of shaving off the time it takes to develop games with embedded assessment from scratch. Ultimately, his goal is to crowdsource designs for new games and assessment challenges from both educators and students. Dynamic Learning Maps, a consortium of thirteen states that was awarded $22 million in federal funding to develop new assessments for students with significant cognitive disabilities, plans to embed items and tasks in day-to-day instruction to map a student’s learning over the course of a year. Klopfer just received a $3 million grant from the Bill & Melinda Gates Foundation to develop a massively multiplayer online game to help high school students learn math and biology. And Pearson, the giant education publisher, led a $33 million investment in Knewton. It’s hard to tell, but at this pace it’s conceivable that the sit-down-stop-everything-else test may, within the decade, seem as old-fashioned as counting tubes of toothpaste on a supermarket shelf.

    But the solution may be self-generating. While both stealth assessments and GPS systems must start from an initial map, they also share another critical capability: the potential to become more accurate over time. As GPS software records millions of data points on destinations and routes, it begins to detect otherwise unknown traffic patterns, leading to better and better routing. The same potential holds for stealth assessments. Researchers can use student performance data from across a variety of tasks to update conceptual models and better understand how students learn. David Kuntz, vice president of research at Knewton, one of the companies developing new “adaptive” learning platforms, notes that, just as data collection improves the recommendations of a GPS system, collecting large data sets in the classroom can help to confirm or disconfirm hypotheses about how students learn. And, by comparing how similar students perform when given different types of content or instructional activities, researchers can also begin to understand which learning interventions work for which students, under which conditions.

    Education, of course, can’t be reduced to a series of online games. More than just a set of concepts to be learned, it’s also a complex set of relationships: between students, teachers, and the environment in which they learn. Florida State University professor Valerie Shute, who coined the term “stealth assessment,” agrees. Her assessments aren’t meant to replace human teachers, but to help teachers understand student misconceptions and provide recommendations for action. She sees automated scoring and machine-based reasoning techniques as tools for teachers to “infer things that would be too hard for humans.” Just as a pilot uses a navigational computer to crunch vast amounts of data for use in flight, teachers should use these tools to play an ever more active role, reviewing students’ progress and providing better-informed guidance and assistance as they solve problems.

    While computers have long been used for drilling facts or equations, Shute is designing her assessments to keep tabs on a deeper kind of learning—the kind that takes greater care and effort to measure but is essential for making sound progress. In math, for example, heavy drilling may help students pass a quiz on, say, fractions. But in order for them to put their knowledge of fractions to good use later on in, say, algebra class, students need a real conceptual grasp of what fractions are and how they work.

    Shute’s new project involves building and embedding stealth assessments in the game Crayon Physics Deluxe (CPD). This is meant to cultivate and measure just that kind of mastery. In CPD’s virtual world, students must discover and/or apply their knowledge of the principles of physics, such as gravity, kinetic energy, and inertia, to propel a red ball through various puzzles toward its destination, marked by a yellow star. But in this world, just as in the real world, students aren’t just given problems with one predefined answer with which to solve them. Instead, students experiment with different approaches in a world largely of their own creation. Using a virtual crayon, they draw their solutions. In one instance, they might draw a ramp to roll the ball across an obstacle. In another, they draw a rock that falls on a lever to thrust the ball upward. The game encourages students to continually refine their approach, rewarding not just what it calls “old school” solutions but also more “elegant” ways to move the ball toward its destination.

    As they play, CPD is assessing their performance constantly, collecting information on both simple indicators, such as the time spent on a particular problem, and complex information, such as the agents of force and motion (a springboard, say) that students use to accomplish a task. As students play, the assessment draws on more and more of the data points, which are constantly mapped against a model to update an estimate of the student’s competencies. In this case, a teacher could use CPD alongside more traditional instruction, ensuring that students understand not only the mathematical equations in physics but also the concepts underlying them.
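    The article does not describe the model that CPD's data points are mapped against. To make the idea of "updating an estimate" concrete, here is a minimal sketch that assumes a single yes-or-no skill and applies Bayes' rule after each observed attempt. The function names and every probability below are hypothetical, chosen only for illustration.

```python
def update_mastery(prior, solved, p_solve_if_mastered=0.8, p_solve_if_not=0.3):
    """Return P(mastered | one observation) using Bayes' rule.

    The two conditional probabilities are made-up illustration values,
    not figures from the article or from any real assessment.
    """
    if solved:
        like_mastered, like_not = p_solve_if_mastered, p_solve_if_not
    else:
        like_mastered, like_not = 1 - p_solve_if_mastered, 1 - p_solve_if_not
    evidence = like_mastered * prior + like_not * (1 - prior)
    return like_mastered * prior / evidence


def estimate_after(observations, prior=0.5):
    """Fold a sequence of solved/unsolved observations into one estimate."""
    estimate = prior
    for solved in observations:
        estimate = update_mastery(estimate, solved)
    return estimate
```

    Under these assumed numbers, each solved problem nudges the estimate up and each failed one nudges it down, so the estimate sharpens as more data points arrive.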

    Another stealth assessment in CPD strives to measure students on their care, organization, and persistence in trying to solve problems—what researchers call “conscientiousness.” Research consistently shows that these skills can predict academic achievement but are independent from intelligence or cognitive ability. They are also essential to success in school and life. In CPD, data on persistence, for example, comes primarily from problems that students have trouble solving. CPD tracks the number of times a student tries to solve each problem and the overall time spent on each try. And the assessment is designed so that even the cleverest students are given problems that challenge them; that way, all students are measured on their level of persistence. Of course, there are various pencil-and-paper tests that can measure these skills. But those tests typically involve self-reported items and are taken in isolation from the learning process, as if persistence were a static quality, unrelated to the actual task or content at hand. Shute’s goal is to build stealth assessments that can be inserted into almost any game or interactive lesson, allowing both students and teachers to see how qualities like persistence and creativity relate to their overall performance throughout the course of learning, and may even be improved over time as a function of game play.
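    The two persistence indicators named above, the number of tries per problem and the time spent on each try, can be sketched as a small summary over an attempt log. The log format, the `Attempt` record, and the rule for what counts as a problem the student "had trouble solving" are assumptions of this sketch, not details from CPD.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Attempt:
    problem_id: str
    seconds: float
    solved: bool


def persistence_summary(attempts):
    """Summarize tries and time per try on problems the student struggled with.

    Assumption: a problem counts as a struggle if it took more than one try
    or was never solved. Problems solved on the first try are skipped.
    """
    by_problem = defaultdict(list)
    for attempt in attempts:
        by_problem[attempt.problem_id].append(attempt)

    summary = {}
    for problem_id, tries in by_problem.items():
        solved = any(t.solved for t in tries)
        if len(tries) == 1 and solved:
            continue  # first-try success says little about persistence
        total_seconds = sum(t.seconds for t in tries)
        summary[problem_id] = {
            "tries": len(tries),
            "mean_seconds_per_try": total_seconds / len(tries),
            "solved": solved,
        }
    return summary
```

    A log with one quick success on a ramp puzzle and three tries on a lever puzzle would, under these assumptions, report only the lever puzzle.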

    So far, all these stealth assessment prototypes fall into the category of what educators call “formative assessments.” That means they are functionally analogous to the kinds of short-term tests, like chapter quizzes, that teachers use for diagnostic purposes: to gauge whether students grasp the lesson just taught, so that instruction can be adjusted in real time. This sets such tests apart from “summative assessments,” the weightier, more stress-inducing tests taken at semester’s or year’s end to judge the performance of students, teachers, and schools. It’s reasonably clear that stealth technology can someday be used for formative assessments. The big question is whether this technology can also eliminate the need for the annual summative testing.

    The answer, in theory anyway, is yes. If done correctly, stealth assessments could help educators amass much greater evidence, over time, and at a deeper level, of what a student knows and is able to do. But doing so will require major changes in instruction—changes that would probably be beneficial for a whole host of reasons.

    Today, a calendar defines what students learn and how they progress. An eighth-grade U.S. history course fits into two eighteen-week semesters, with a test at the end of each. And no matter what knowledge students walk in with or what they manage to absorb in the first eighteen weeks, the teacher must move on to the second eighteen weeks’ worth of content when the schedule dictates. This is what some educators call a “time-based” approach to education; a “competency-based” approach flips this paradigm. In this model, rather than wait for an end-of-year test, students can demonstrate their competency in a subject over time, allowing them to move on as they are deemed ready. Learning, instead of seat time, defines progress.

    Bill Tucker, since 2005 the managing director of Education Sector, a D.C.-based think tank, will soon be joining the Bill & Melinda Gates Foundation as deputy director, policy development, U.S. Program. He has written about education technology, innovation, and policy for publications including Education Next, Education Week, and Educational Leadership.

    — Bill Tucker
    Washington Monthly




