Why Kindergarten-Admission Tests Are Worthless
This is the most-read item in the online edition of New York. It asks the question: Should a child's fate be sealed by an exam he takes at the age of 4? And then answers it: Why kindergarten-admission tests are worthless, at best.
Don't you wish that members of the Obama/Duncan team who are pushing cradle-to-college skills acquisition schedules would read this? Don't you wish the NCTE and IRA Executive Directors and DC lobbyists who are promoting the LEARN [sic] act would read it?
Send it to Al Franken. He's a co-sponsor of LEARN. I am willing to wager Al never thinks of this legislation as child abuse.
60 East Plato Blvd
Saint Paul, MN 55107
by Jennifer Senior
The Junior Meritocracy
Skylar is allowed her potty break. She returns and stands on top of her chair.
"Okay!" says her evaluator, smiling. "So ... what is a house?"
"I already know. A home."
She gives Skylar a playful look and tips her head. "And what's a home?"
Skylar mirrors her tipped head. "A house!"
She laughs. "What's a bird?"
Skylar picks up her Hello Kitty pen and bounces it on her tester's arm. "Look, a hopping marker!"
Her tester smiles. "What's a bird--"
Skylar races the pen up and down. "Vrooooooooom! Magic marker! Vroom vroom!"
Watching this exchange is a reminder of something any parent knows: Four-year-olds, no matter how smart and delightful they may be, have obvious limits as test takers. Many, especially boys, can't sit still for the full duration of an exam; others can't stay awake or concentrate for that long, choosing at some catastrophic point to crawl under their desks and give up. Nor is the context in which these tests are administered exactly relaxing for young children. Both IQ tests require that they sit alone in a room with a tester they probably haven't seen before. In the case of the WPPSI, the tester often isn't allowed to prompt the children to give more complete answers, even if it's clear they're capable of delivering them (and would score better if they did). In the case of the OLSAT, the testers can't even repeat the questions.
"What is a pet?"
"An animal. I have pet goldfish."
Her tester decides to play along this time. "Do they have names?"
"Zoe and Tangerine."
Skylar plants her marker next to a rectangular-shaped sticker she'd gotten as a reward for a previous exercise and admires the shape she's just made. "Look! A flag!"
Stephen J. Bagnato, a professor of pediatrics and psychology at the University of Pittsburgh, is fond of quoting Head Start co-founder Urie Bronfenbrenner, who in 1977 famously wrote, "Much of contemporary developmental psychology is the science of the strange behavior of children in strange situations with strange adults for the briefest possible periods of time." It's hard not to think about that observation in the context of intelligence-testing 4-year-olds. The script is so rigid, the tasks are so narrow and precise. Skylar did extremely well on her evaluation. Yet to me, the loveliest and most intellectually revealing moment was when she blew off all rules and made that whimsical little flag. If it were a real exam, the tester wouldn't even have written it down. "Well, right," says Bagnato. "When the examiner can only say certain things to these kids, and the child can only say certain things back, of course it's too confining. We know that the way kids display their skills best is through creative play and everyday interactions at home and at school."
As it turns out, intelligence tests miss lots of things, not just creativity. And perhaps that explains why IQs alone are not especially good predictors of excellence. In the twenties, for instance, Lewis Terman, a psychologist and deep believer in intelligence testing--it was he who revised Alfred Binet's original test and came up with the Stanford-Binet model--started a now-famous longitudinal study of nearly 1,500 California children with extremely high IQs. He grandiosely called it "Genetic Studies of Genius," and his hope was to show that these children, whom he called "exceptionally superior," would one day form the backbone of the nation's intellectual and creative elite, making crucial advances in sciences and public policy and the arts. But as David Shenk, author of the forthcoming The Genius in All of Us, points out, his subjects only grew less and less remarkable as time wore on. None won Nobel Prizes, though two who were specifically rejected for the study--William Shockley and Luis Alvarez--did, both in physics. None became world-renowned musicians, though two other rejects--Isaac Stern and Yehudi Menuhin--did, for their virtuosic violin-playing. In Outliers, Malcolm Gladwell makes a similar point, noting that one's IQ needn't be super-high to succeed; it simply needs to be high enough. "Once someone has reached an IQ of somewhere around 120," he writes, "having additional IQ points doesn't seem to translate into any measurable real-world advantage." In Genius Revisited, Rena Subotnik, director of the American Psychological Association's Center for Gifted Education Policy, undertook a similar study, with colleagues, looking at Hunter elementary-school alumni all grown up. Their mean IQs were 157. "They were lovely people," she says, "and they were generally happy, productive, and satisfied with their lives. But there really wasn't any wow factor in terms of stellar achievement."
So what do psychologists and educators think makes the difference between good and exceptional? Opportunity, connections, mentors. Perseverance and monomaniacal devotion, or what the psychologist Ellen Winner calls "the rage to master." Creativity, a willingness to fail. Nelson, the head of Calhoun, can go on at urgent, passionate length about this.
"I want a school full of kids who daydream," he says. "I want kids who are occasionally impulsive. I want kids who are fun to be with. I want kids who don't want to answer the questions on those tests in the way the adult wants them to be answered, because that kid is already seeing the world differently. In fact," he adds, after thinking it over for a moment, "I want kids who are cynical enough at age 4 to know that there's really something wrong with someone asking them these things and think, 'I'm going to screw with them in the process!'"
Granted, Calhoun is an unusual school, a place where kids don't even get test scores until they're freshmen. But one needn't be particularly subversive to appreciate Nelson's philosophy of educating 4-year-olds, or his frustration with current practice. "You have to play with blocks," he says. "You have to make up stories. You have to muck around. Arithmetic and decoding language aren't life--they're symbolic representations of other things. And education is being diverted into focusing on these symbolic representations of the very experiences kids are being denied."
Nelson says he's considering scrapping the WPPSI as an admission requirement for Calhoun's lower school, possibly starting as early as next year. As it is, he barely takes a kid's score into account. One of the most compelling reasons to get rid of it, he notes, isn't because the test is intellectually pointless. It's because it's emotionally insidious. "When we resort to any kind of measure of kids that's supposed to be qualitative at a young age," he says, "no matter how cheerfully we do it, no matter how many lollipops we hand out to de-stress the process, young children are extraordinarily discerning. They absorb their parents' anxiety about it, they absorb the kinds of judgments people are making about them. So there's a process of organizing kids in a hierarchy of worth, and it's beginning at an age that's criminal."
The irony is that doing well on these exams can be just as damaging as doing poorly on them. "Gifted" is an awfully uncomfortable label for some children to wear. It can cripple their thinking, make them terrified of risk. "It's not entirely inaccurate to observe that more and more high-achieving students go off to university and don't care about anything," says Nelson. "They don't ask questions, they don't have original ideas. And it's not because there's anything wrong with them, but because they were conditioned to believe that learning is about giving back the right answer." Nelson knows it's heresy to say this, but he wonders if it's true. "These tests, at 4, start that long process of conditioning," he says. "Right then, children start to believe that learning means pleasing the powerful adult in whose presence you are."
It's unlikely that most city schools will follow Nelson's lead and stop testing 4-year-olds. But it is possible that these tests could earn less and less weight in the selection process as they become tainted by excessive prepping and anxiety. That doesn't mean, however, that the selection process will become more democratic. "I'm afraid schools will be judging the child in ways that aren't any better," says Emily Glickman, founder of Abacus Guide Educational Consulting. "There'll just be more weight on the school report, and what the nursery-school director says about the child verbally. And often kids who come from expensive, high-cachet nursery schools have elaborate evaluations written about them, because the preschool directors themselves have a high stake in the class's placement success." And in the case of private schools, she notes, even more emphasis may be given to a family's socioeconomic status: "The kindergarten-admission process has always been about openly judging a 4-year-old and secretly judging the parents' wealth, connections, and likeliness to give."
Giving less weight to these tests doesn't guarantee that the selection process would become more sensible, either, or more sensitive to finding those children who'd profit from an enriched education. After all, what mechanism should schools use?
This is the hardest question. Most education researchers can tell you just what's wrong with intelligence-testing 4-year-olds. But few can tell you what should emerge in its stead. "Before we adopted the OLSAT," says the Department of Education's Commitante, "we had 32 different school districts using a huge -- a tremendous variety of assessments." Some, she says, relied on expensive IQ tests; others required teacher evaluations. The result was a hodgepodge of arbitrary standards--ones that, the city believed, worked against children who spoke English as a second language (the OLSAT is given in eight languages) or had lower incomes (the city gives the OLSAT for free).
Given his druthers, Meisels, at Erikson Institute, says he'd try to get a more comprehensive picture of the child. "And that can only be found through watching children in classroom situations," he says. "And looking at the products of their work. And getting to know them. And that can be done through observational assessments."
I try to interrupt him, but he anticipates my objection. "It's not very practical, I know," he says. "It means teaching teachers how to do it. It'd be more expensive. But you could do it. And then you'd get the right kids into these differentiated programs."
Many researchers agree with him--and will add, as Meisels later does in our conversation, that kids ought never to be evaluated just once. "If one believes that kids do learn and improve," says McCall, "then a few new kids should be eligible for gifted programs each year."
If you're looking for practical answers though, Plucker, of Indiana, has a modest proposal. He suggests that schools assess children at an age when IQs get more stable. And in fact, that's just what City and Country, one of Manhattan's more progressive schools, does. Standardized tests aren't required of their applicants until they're 7 or older. "That way, the kids are further along in their schooling," explains Elise Clark, the school's admissions director. "They're used to an academic setting, they can handle a test-taking situation, and overall, we consider the results more reliable." Even then, she says, her school still doesn't weight IQ scores very much. "If we did, what we'd have is a group of kids with good test-taking skills and ... I don't know what else."
But my money's on the marshmallow test. It's quite compelling and, apparently, quite famous--Shenk talks about it with great relish in The Genius in All of Us. In the sixties, a Stanford psychologist named Walter Mischel rounded up 653 young children and gave them a choice: They could eat one marshmallow at that very moment, or they could wait for an unspecified period of time and eat two. Most chose two, but in the end, only one third of the sample had the self-discipline to wait the fifteen or so minutes for them. Mischel then had the inspired idea to follow up on his young subjects, checking in with them as they were finishing high school. He discovered that the children who'd waited for that second marshmallow had scored, on average, 210 points higher on the SAT.
Two hundred and ten points. Can Princeton Review boast such a gain? Maybe our schools ought to be screening children for self-discipline and the ability to tolerate delayed gratification, rather than intelligence and academic achievement. It seems as good a predictor of future success as any. And Mischel's test subjects, too, were just 4 years old.
Most researchers in the field of childhood development agree that the minds of nursery-school children are far too raw to be judged. Sally Shaywitz, author of Overcoming Dyslexia, is in the midst of a decades-long study that examines reading development in children. She says she couldn't even use the reading data she'd collected from first-graders for some of the longitudinal analyses. "It simply wasn't stable," she says. I tell her that most New York City schools don't share this view. "A young brain is a moving target," she replies. "It should not be treated as if it were fixed."
Complicating matters further, IQs are least stable at the highest end of the spectrum no matter what age they're assessed. The explanation for this is simple: There's more room to fall the higher you go, and hence a greater likelihood that the score will regress toward the mean. Chance figures more prominently into high scores--a good night's sleep, comfort with the tester--and lucky guesses on tough questions are worth more points than answers to midrange questions. In 2006, David Lohman, a psychologist at the University of Iowa, co-authored a paper called "Gifted Today but Not Tomorrow?" in the Journal for the Education of the Gifted, demonstrating just how labile "giftedness" is. It notes that only 45 percent of the kids who scored 130 or above on the Stanford-Binet would do so on another, similar IQ test at the same point in time. Combine this with the instability of 4-year-old IQs, and it becomes pretty clear that judgments about giftedness should be an ongoing affair, rather than a fateful determination made at one arbitrary moment in time. I wrote to Lohman and asked what percentage of 4-year-olds who scored 130 or above would do so again as 17-year-olds. He answered with a careful regression analysis: about 25 percent.
The implications of this number are pretty startling. They mean that three quarters of the seniors in a gifted program would no longer test into that program if asked to retake an IQ test on graduation day. So I wrote Lohman back: Was he certain about this?
"Yes," he replied. "Even people who consider themselves well versed in these matters are often surprised to discover how much movement/noise/instability there is even when correlations seem high." He was careful to note, however, that this doesn't mean IQ tests have no predictive value per se. After all, these tests are better--far better--at predicting which children will have a 130-plus IQ at 17 than any other procedure we've devised. To have some mechanism that can find, during childhood, a quarter of the adults who'll test so well is, if you think about it, impressive. "The problem," wrote Lohman, "is assigning kids to schools for the gifted on the basis of a test score at age 4 or 5 and assuming that their rank order among age mates will be constant over time."
Appelbaum, McCall's co-author, puts an even finer point on the stakes. "No university I know," he says, "would think of using a 4-year-old's data to decide who to admit."
A January 5 thread from the parenting website DCurbanmom:
Can anyone offer advice on whether I should by [sic] Aristotle Circle? I'm in a time crunch. Thanks!
My sister-in-law bought Aristotle Circle workbook and showed it to me. As a child psychologist, the workbook is so close to the real thing, I think it is cheating. That said, my nephew aced the test--
It is so sad that we have to do this--but what to do? [dear child] is at a disadvantage if everyone else is prepping and we are not.
There was a time, not that long ago, when few parents attempted to prep their 4-year-olds for kindergarten-admission exams. But then a few more began to do it, and then a few more after that, and then suddenly, normal-seeming people with normal-seeming values began doing it, too, and an arms-race mentality kicked in. Responding to parents' anxieties and fears, some of the fancier preschools began subtly prepping their students--giving them similar exercises to do with blocks, introducing them to the concept of analogies. Expensive test-prep kits suddenly began to appear on the market. And high-end education consultancies began to bloom, like Aristotle Circle. Founded in 2008 by an M.I.T. graduate and former Wall Street analyst named Suzanne Rheault, it provides tutors, advisers, and--most important--prep books for apprehensive and even merely conscientious parents.
"I can understand people getting offended by 4-year-olds getting tutoring for these exams," says Rheault when we meet in her Soho conference room. "But I'm not the one making them take them."
She dumps a bag of blocks onto the conference table. They're essentially the same ones used on the WPPSI, except hers are white and blue rather than white and red. Then she plops down her meticulous, brightly designed prep book, which she just completed last August. She opens to the "Vocabulary" section, illustrated by a former cartoonist for Disney. "Any vocabulary the child needs," she tells me, "is in this book," whether it's to complete picture analogies or understand questions that are asked of them. Then she flips to a section of the types of questions the children will be asked aloud--What is a villain? What is a liquid?--and a few pages after that, she gets to what she believes is the "core intellectual meat" of the exam: "Concept groupings," or pages of pictures organized by how the objects in them are linked. Containers: picnic baskets, suitcases, matchboxes. Things that open and close: zippers, eyes, locks. Measuring instruments: hourglasses, watches, thermometers. "Any of the abstract groupings the child needs to understand are also here," she tells me.
How does Rheault know all this? I ask her, incredulously. Has she seen one? You have to be specially registered with the publisher to buy the WPPSI. Like most IQ tests, it is updated only periodically, which makes it coveted by parents--if you've seen one lately, you've likely seen the version your child will take.
"I'm not going to talk about it," she replies. "But the people who helped us develop the workbook are psychologists who've seen them."
But copies of this test are obviously floating around. Skylar's mother, for instance, says she was offered a copy of the WPPSI by a fellow mom. Type a few key search words on Urbanbaby.com, and within 30 seconds you'll find this post: Have WWPSI-III to sell. Excellent condition. Complete set. E-mail me if you are serious and discreet. No questions asked. Cost is $3,000. (An e-mail address follows.) This past fall, a parent admitted to a psychologist who administers SB-5 tests for Hunter that he'd purchased a copy of the exam right off the publisher's website. "The type of tests we sell are primarily for special education, so it's never been an issue for us in the past," says Elizabeth Allen, the director of research and development of Pro-Ed Inc., which only recently acquired the rights to the Stanford-Binet. "When I heard, I was like, 'You're kidding me! Some parent paid a thousand dollars so they could get their kid into a gifted program? Wow.'" (The company has since fixed the problem; now only licensed professionals can buy them.)
There are some who insist that studying for these exams can't possibly budge a child's scores. "I don't know how prepping could help on the OLSAT," says Anna Commitante, head of the Gifted and Talented programs for the city's Department of Education. But Rheault can't believe there's still any debate about the subject. "The psychologists we work with," she says, "say that 50 to 60 percent of the material is learnable." Yes, her point of view may be colored by her commercial interests--her WPPSI prep books go for $500, and she's now completing a workbook for the OLSAT and will shortly start one for the SB-5. But she's hardly alone in her beliefs. "When people say this stuff isn't really coachable, I always scratch my head and say, 'Yeah, except for the parts that are,'" says Jonathan Plucker, director of the Center for Evaluation and Education Policy at Indiana University. "I understand the nature/nurture debate. It's a complicated relationship. But to say that families with greater means and more interest in education can't influence test outcomes--I can't understand that reasoning. It's common sense."
The practice of prepping can run families into the thousands of dollars, posing a clear disadvantage to those who can't afford it. But the truth is, even without coaching, children coming from economically and culturally rich backgrounds do far better on these tests. And that's a far more urgent reason to challenge the widespread reliance on them.
"An analogy people use a lot for this is planting corn," says Barnett, from Rutgers. "If you want to know about the properties of different kinds of corn, you have to plant it in land that's well fertilized and well irrigated. If you plant it in soil that's dried up and rocky, you won't know, because nothing will grow." The same, he explains, goes for children. How can one possibly know anything about their minds if they've spent their first four years in unstimulating environments?
"People have the idea that with these tests you can cancel out socioeconomic background and get to some real thing in the kid," agrees Nicholas Lemann, dean of the journalism school at Columbia and author of The Big Test, a history of the SAT. "That's a chimera. If you're a 4-year-old performing well on these tests, it's either because you have fabulous genetic material or because you have cultural advantages. But either way, the point is: You're doing better because of your parents."
Rather than promoting a meritocracy, in other words, these tests instead retard one. They reflect the world as it's already stratified--and then perpetuate that same stratification.
"Instead of giving IQ tests, you could just as easily look at Zip Codes and the education levels of the parents to determine who gets the better schooling--you get a very high correlation between IQ and socioeconomic status in the first seven or eight years of life," says Samuel J. Meisels, assessment expert and president of Chicago's Erikson Institute, the renowned graduate school in childhood development. "Giftedness is a real thing, no question. But giftedness can be extinguished, and it can be nurtured." He mentions a New York Times education analysis from 2008, which noted that after the city streamlined its G&T program, requiring specific cutoff scores for the OLSAT, the percentage of white students had shot up from 33 to 48 percent, while the percentage of black and Hispanic enrollment had fallen. "Sometimes," he says, "you look at a big city's decisions to do this and wonder if it's about nurturing giftedness or if it's about keeping middle-class families in the city limits."
New York Magazine