Passing Muster Fails Muster? An Evaluation of Evaluating Evaluation Systems
That is, a district's evaluation system can weight student test scores however it wants, in balance with other approaches to teacher evaluation. The logic here is a bit contorted from the start: the authors lay out what they believe are the necessary components of an evaluation system, but then claim to be agnostic about how those components are weighted.
But, if you're not agnostic on the components, then saying you're agnostic on the weights is not particularly soothing.
Clearly, they are not agnostic on the components or their weights, because the system goes on to evaluate the validity of each and every component based on the extent to which that component correlates with the subsequent year's value-added measure. This is rather like saying, "We remain agnostic on whether you focus on reading or math this year, but we are going to evaluate your effectiveness by testing you on math." Or more precisely, "We remain agnostic on whether you emphasize conceptual understanding and creative thinking this year, but we are going to evaluate your effectiveness on a pencil-and-paper bubble test of specific mathematics competencies and vocabulary and grammar."
Second, while hanging ratings of evaluation systems entirely on their correlation with "next year's value added," the authors choose to again remain agnostic on the specifics of estimating the value-added effectiveness measures. That is, as I've blogged in the past, the authors express a strong preference that the value-added measures be highly correlated from year to year, but remain agnostic as to whether those measures are actually valid, or instead are highly correlated mainly because the measures contain significant consistent bias -- bias which disadvantages specific teachers in specific schools -- and does so year after year after year!
Here are the steps for evaluating a teacher evaluation system as laid out in Passing Muster:
In formal terms, their system is all reliability and no validity (or, at least, inferring the latter from the former).
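To see why reliability alone proves nothing, consider a minimal simulation (my own illustration, not from the report, with invented numbers): give every teacher a fixed "true effectiveness" plus a consistent bias that recurs each year -- say, from the students that teacher is repeatedly assigned. The resulting measure looks very consistent year to year, yet tracks true effectiveness poorly.

```python
# Illustrative simulation: a value-added measure with large, consistent bias
# can be highly "reliable" (correlated year to year) yet weakly valid
# (weakly correlated with what we actually care about).
import numpy as np

rng = np.random.default_rng(0)
n_teachers = 500

true_effect = rng.normal(0, 1, n_teachers)  # the thing we actually care about
bias = rng.normal(0, 2, n_teachers)         # consistent per-teacher bias,
                                            # e.g., tied to student assignment

def observed_vam(noise_sd=1.0):
    # each year's estimate = truth + the SAME bias + fresh random noise
    return true_effect + bias + rng.normal(0, noise_sd, n_teachers)

year1 = observed_vam()
year2 = observed_vam()

reliability = np.corrcoef(year1, year2)[0, 1]     # year-to-year correlation
validity = np.corrcoef(year1, true_effect)[0, 1]  # correlation with truth

print(f"year-to-year correlation (looks 'consistent'): {reliability:.2f}")
print(f"correlation with true effectiveness:           {validity:.2f}")
```

With these invented parameters, the year-to-year correlation comes out far higher than the correlation with true effectiveness -- precisely the gap between reliability and validity that the report's rating scheme cannot see.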
But, rather than simply having each district evaluate its own evaluation system by correlating its current-year ratings with next year's value-added, the Brookings report suggests that states should evaluate district teacher evaluation systems by measuring the extent to which district teacher evaluations correlate with a state standardized value-added metric for the following year.
But again, the authors remain agnostic on how that model should/might be estimated, favoring only that the state-level model be "consistent" year to year, rather than accurate. After all, how could districts consistently measure the quality of their evaluation systems if the state external benchmark against which they are evaluated is not consistent?
As a result, where a state chooses to adopt a consistently biased statewide standardized value-added model, and use that model to evaluate district teacher evaluation systems, the state in effect backs districts into adopting consistently biased year-to-year teacher evaluations... that have the same consistent biases as the state model.
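The perverse incentive can be made concrete with another hypothetical sketch (again my own invented numbers, not the report's): suppose one district rates teachers on something close to true effectiveness, while another district's ratings happen to share the state model's bias. Scored by correlation with the biased state benchmark, the biased district system "passes muster" far more easily.

```python
# Illustrative simulation: if the state benchmark carries a consistent bias,
# a district system that mirrors that bias scores higher on the proposed
# correlation test than a district system tracking true effectiveness.
import numpy as np

rng = np.random.default_rng(1)
n = 500
truth = rng.normal(0, 1, n)   # true teacher effectiveness
bias = rng.normal(0, 2, n)    # consistent bias shared by the state model

# next year's state standardized value-added metric (biased)
state_benchmark = truth + bias + rng.normal(0, 1, n)

district_valid = truth + rng.normal(0, 0.5, n)          # tracks real quality
district_biased = truth + bias + rng.normal(0, 0.5, n)  # mirrors state's bias

score_valid = np.corrcoef(district_valid, state_benchmark)[0, 1]
score_biased = np.corrcoef(district_biased, state_benchmark)[0, 1]

print(f"'quality' score of the valid district system:  {score_valid:.2f}")
print(f"'quality' score of the biased district system: {score_biased:.2f}")
```

Under these assumptions the district that simply reproduces the state model's bias looks like it has the better evaluation system -- which is the circularity at issue.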
The report does suggest that in the future, there might be other appropriate external benchmarks, but that:
That is, value-added measures -- however well or poorly estimated -- should be the benchmark for whether a teacher evaluation system is a good one, simply because they are available and we think, in some cases, that they may provide meaningful information (though even that remains disputable -- to quote Jesse Rothstein's review of the Gates/Kane Measures of Effective Teaching study: "In particular, the correlations between value-added scores on state and alternative assessments are so small that they cast serious doubt on the entire value-added enterprise." See: http://nepc.colorado.edu/files/TTR-MET-Rothstein.pdf).
I might find some humor in all of this strange logic and circular reasoning if the policy implications weren't so serious.
Bruce Baker is Associate Professor in the Department of Educational Theory, Policy and Administration at Rutgers, The State University of New Jersey. You can follow him on Twitter.