Study on Teacher Value Uses Data From Before Teach-to-Test Era
I Tweeted this with the comment, "As always, Winerip cuts through the obfuscation." So much hot air has already been spent on this study that tracked students for 20 years to see if teachers raised student test scores. What if they tracked students for 20 years to see if teachers raised
time parents spent reading to their children
books checked out of the library
By Michael Winerip
My four children have all attended public schools in our middle-class suburban district. When my oldest was in fourth grade, in 1998, he took the state tests, and I was not even aware of it. Later, he said the tests were kind of fun; he got to miss his regular classes.
Six years later, in 2004, our daughter was in fourth grade. Long before the state tests, a letter came home. Prep classes were being offered before and after school. While the sessions were not mandatory, students were strongly urged to attend.
Eventually the results were printed in our local newspaper. The news was grim; the nearby districts, in wealthier towns, had creamed us. The following year, our middle school added a mandatory course to prep for the state English test.
That 1998/2004 divide (what happened in the interim was the 2002 No Child Left Behind law) should be kept in mind when analyzing a new, widely publicized study that closely tracked 2.5 million students over 20 years to determine whether teachers who helped raise children’s test scores have a lasting effect on their lives. The researchers conclude that having such a teacher improved students’ odds of going to a good college, the quality of the neighborhoods where they lived and their lifetime earnings.
The results have created a big stir because they seem to say that no matter what we think of all the standardized testing going on in education today, the scores are at least a measure of what matters in the long run.
That is not exactly what the research paper shows. While it is impressive for its scope and creativity, there is a major caution: it is largely based on test scores from the 1990s, that low-stakes era when my son enjoyed his fourth-grade test.
Whether those results are applicable to our post-2004 high-stakes world, we cannot tell. It may well be that teachers under pressure to raise their students’ scores through extensive test preparation will get inflated results that do not carry over positively to adulthood.
“It is a key limitation of the study,” Raj Chetty of Harvard, one of the three researchers, acknowledged in an interview. “This may all change for high-stakes tests.”
Thanks to President Obama’s signature education plan, Race to the Top, student testing has never carried so much weight: it is used to rate teachers, principals and schools, and in many districts to reward them with bonuses.
Educators are bitterly divided over whether higher test scores will, in fact, equate with lifelong learning.
In search of an answer, Professor Chetty and two other economists, John N. Friedman, also of Harvard, and Jonah E. Rockoff of Columbia, used a variety of government data to track 2.5 million students from fourth grade into adulthood in an unnamed urban district. They linked test results with teachers, producing a value-added rating reflecting how much that teacher helped raise student scores. The study found that students of teachers with high value-added ratings were less likely to become teenage parents. By age 20, they were more likely to be attending a good college. By 25, more lived in higher socioeconomic ZIP codes. By 28, their annual incomes were higher.
The economists calculated that replacing a poor teacher (a value-added score in the bottom 5 percent) with an average one would raise a single classroom’s lifetime earnings by $1.4 million.
They concluded that students who had just one teacher with an excellent value-added score would be 0.5 percent more likely to attend college.
The study is full of complex statistical analysis. A one-page executive summary makes no mention of what years of test data were used, but I figured it out by working backward.
The study ends in 2009, when the subjects were 28, the age the researchers had used to measure adult earnings. A 28-year-old in 2009 was born in 1981. That child would have been a 10-year-old fourth grader in 1991, and a 14-year-old eighth grader in 1995.
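For readers who want to check the arithmetic, the backward calculation can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the ages of 10 for fourth grade and 14 for eighth grade are the ones assumed in the paragraph above.

```python
def birth_year(observed_year: int, age_when_observed: int) -> int:
    """Work backward from the year a subject was observed at a given age."""
    return observed_year - age_when_observed


def year_at_age(born: int, age: int) -> int:
    """Calendar year in which someone born in `born` reaches `age`."""
    return born + age


# The study ends in 2009, when the subjects were 28.
born = birth_year(2009, 28)

# Assumed ages, taken from the text: 10 in fourth grade, 14 in eighth grade.
fourth_grade_year = year_at_age(born, 10)
eighth_grade_year = year_at_age(born, 14)

print(born, fourth_grade_year, eighth_grade_year)
```

Under those assumptions, the test scores for this cohort fall in the first half of the 1990s, well before the 2002 law.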
Once you understand how old the data is, the report’s disclaimers make more sense, although some require translation. From Page 5: “An important limitation of our analysis is that teachers were not incentivized based on test scores in the school district and time period we study.”
In other words: we used scores from the 1990s, when teachers were not under pressure to do prep or teach to the test.
That is a far different time from Washington, D.C., in 2010, when the chancellor at the time, Michelle Rhee, aggressively used student test scores to reward or fire teachers. To calculate the effect those teachers might have on their students’ future income, you will have to wait until they are 28 in 2024.
Professor Chetty said it was possible that in high-stakes conditions the usefulness of value-added ratings could be eroded significantly, but implausible that the effect would totally disappear.
“You would still want to put some weight on this for evaluating teachers,” he said.
I found an elegance in the way the researchers created proxies to quantify long-term success, like lower teenage pregnancy rates and higher socioeconomic ZIP codes.
But for me, that also points to the study’s limitations. Economists need to find a way to quantify everything. Teachers with high value-added ratings may indeed have long-term positive impacts on students. But it is also possible that teachers who are excellent at project-based education have an even stronger long-term impact and we would never know it because their results cannot be teased out of a million pieces of data.
The danger is that education policy gets driven by teaching methods that can be given a number.
I suspect that Mr. Noyes, my 11th grade Advanced Placement American history teacher from 40 years ago, had a low value-added rating. As I recall, no one in our class got a top score of 5; I got a 3. There was no prepared curriculum aligned with the test: Mr. Noyes built the lessons. On any given topic, he would assign us several books that viewed history through different lenses: economics, politics, personality.
I have long ago forgotten the content of those lessons, but Mr. Noyes instilled in us something far more important: the understanding that history does not come from one book. While that idea has served me for a lifetime, I do not believe it is quantifiable.
E-mail: oneducation@nytimes.com
New York Times