

Data Command Force

 


    Why Standardized Tests Should Not Be Used to Evaluate Teachers (and Teacher Education Programs)

    NOTE: You can--and should--subscribe to receive the latest updates from Vamboozled!, a blog about teacher evaluation, accountability, and value-added models (VAMs).

    Laura H. Chapman Comment: ESSA is chock full of calls for measures, assessments, evaluations of teacher effectiveness, including measures of student "growth." These calls are especially intense in Title II as the primary means of determining whether prospective teachers, including student teachers and those in extended residencies, should be eligible for certification.
    ESSA is, in my opinion, a congressional thumb in the eye to scholars in education and all coursework for teachers that is not strictly nuts and bolts and test prep. A clear expression of this disdain can be read in the doublespeak about charter academies for teacher preparation. With the governor as authorizer in chief, Title II, section 2002, 4A lists five stipulations about what these academies are supposed to do, and 4B removes all those requirements if prospective teachers pass "state-approved content area examinations." What is removed? Think Relay writ large. No need for faculty in those academies to have advanced degrees or do academic research, no course credit requirements, no restrictions on academies via accrediting bodies and so on.

    by Audrey Amrein-Beardsley

    David C. Berliner, Regents' Professor Emeritus here at Arizona State University (ASU), who also just happens to be my former albeit forever mentor, recently took up research on the use of test scores to evaluate teachers, for example, using value-added models (VAMs). While David is world-renowned for his research in educational psychology and, more specific to this case, for his expertise on effective teaching behaviors and how to capture and observe them, he has also now ventured into the VAM-related debates.

    Accordingly, he recently presented his newest, soon-to-be-published research on using standardized tests to evaluate teachers, something he aptly termed in the title of his presentation "A Policy Fiasco." He delivered his speech to an audience in Melbourne, Australia, and you can click here for the full videotaped presentation. The whole presentation takes about one hour to watch, and I must say the full hour is well worth it, but I summarize his key points below. These should certainly be of interest to you all as followers of this blog, and hopefully to others.

    Of main interest are his 14 reasons, "big and small" for [his] judgment that "assessing teacher competence using standardized achievement tests is nearly worthless."

    Here are his fourteen reasons:

    1. "When using standardized achievement tests as the basis for inferences about the quality of teachers, and the institutions from which they came, it is easy to confuse the effects of sociological variables on standardized test scores" and the effects teachers have on those same scores. Sociological variables (e.g., chronic absenteeism) continue to distort even the best attempts to disentangle them from the very instructional variables of interest. These, which we also term biasing variables, should not be inappropriately dismissed as purportedly statistically "controlled for."

    2. In law, we do not hold people accountable for the actions of others; for example, when a child kills another child, the parents are not charged as guilty. Hence, "[t]he logic of holding [teachers and] schools of education responsible for student achievement does not fit into our system of law or into the moral code subscribed to by most western nations." Relatedly, should medical schools or doctors, for that matter, be held accountable for the health of their patients? One of the best parts of his talk, in fact, is about the medical field and the parallels Berliner draws between doctors and medical schools, and teachers and colleges of education, respectively (around the 19-25 minute mark of his video presentation).

    3. Professionals are often held harmless for their lower success rates with clients who have observable difficulties in meeting the demands and the expectations of the professionals who attend to them. In medicine again, for example, when working with impoverished patients, "[t]here is precedent for holding [doctors] harmless for their lowest success rates with clients who have observable difficulties in meeting the demands and expectations of the [doctors] who attend to them, but the dispensation we offer to physicians is not offered to teachers."

    4. There are other quite acceptable sources of data, besides tests, for judging the efficacy of teachers and teacher education programs. "People accept the fact that treatment and medicine may not result in the cure of a disease. Practicing good medicine is the goal, whether or not the patient gets better or lives. It is equally true that competent teaching can occur independent of student learning or of the achievement test scores that serve as proxies for said learning. A teacher can literally 'save lives' and not move the metrics used to measure teacher effectiveness."

    5. Reliance on standardized achievement test scores as the source of data about teacher quality will inevitably promote confusion between "successful" instruction and "good" instruction. "Successful" instruction gets test scores up. "Good" instruction leaves lasting impressions, fosters further interest by the students, makes them feel competent in the area, etc. Good instruction is hard to measure, but remains the goal of our finest teachers.

    6. Relatedly, teachers affect individual students greatly, but affect standardized achievement test scores very little. All can think of how their own teachers impacted their lives in ways that cannot be captured on a standardized achievement test. Standardized achievement test scores are much more related to home, neighborhood and cohort than they are to teachers' instructional capabilities. In more contemporary terms, this is also due to the fact that large-scale standardized tests have (still) never been validated to measure student growth over time, nor have they been validated to attribute that growth to teachers. "Teachers have huge effects, it's just that the tests are not sensitive to them."

    7. Teachers' effects on standardized achievement test scores fade quickly and are barely discernible after a few years. So we might not want to overly worry about most teachers' effects on their students--good or bad--as they are hard to detect on tests after two or so years. To use these ephemeral effects to then hold teacher education programs accountable seems even more problematic.

    8. Observational measures of teacher competency and achievement tests of teacher competency do not correlate well. This suggests nothing more than that one or both of these measures, and likely the latter, are malfunctioning in their capacities to measure the teacher effectiveness construct. See other Vamboozled posts about this here, here, and here.

    9. Different standardized achievement tests, both purporting to measure reading, mathematics, or science at the same grade level, will give different estimates of teacher competency. That is because different test developers have different visions of what it means to be competent in each of these subject areas. Thus one achievement test in these subject areas could find a teacher exemplary, but another test of those same subject areas would find the teacher lacking. What then? Have we an unstable teacher or an ill-defined subject area?

    10. Tests can be administered early or late in the fall, early or late in the spring, and the dates they are given influence the judgments about whether a teacher is performing well or poorly. Teacher competency should not be determined by minor differences in the date of testing, but that happens frequently.

    11. No standardized achievement tests have provided proof that their items are instructionally sensitive. If test items do not, because they cannot, "react to good instruction," how can one make a claim that the test items are "tapping good instruction"?

    12. Teacher effects show up more dramatically on teacher-made tests than on standardized achievement tests because the former are based on the enacted curriculum, while the latter are based on the desired curriculum. Tests are seven times more instructionally sensitive the closer they are to the classroom (i.e., teacher-made tests).

    13. The opt-out testing movement invalidates inferences about teachers and schools that can be made from standardized achievement test results. It's not bad to remove these kids from taking these tests, and perhaps it is even necessary in our over-tested schools, but the tests and the VAM estimates derived via these tests are far less valid when that happens. This is because the students who opt out are likely different in significant ways from those who do take the tests. This severely limits the validity claims that are made.

    14. Assessing new teachers with standardized achievement tests is likely to yield many false negatives. That is, the assessments would identify teachers early in their careers as ineffective in improving test scores, which is, in fact, often the case for new teachers. Two or three years later that could change. Perhaps the last thing we want to do in a time of teacher shortage is discourage new teachers while they acquire their skills.

    — Audrey Amrein-Beardsley
    Vamboozled!
    November 27, 2015
    http://vamboozled.com/why-standardized-tests-should-not-be-used-to-evaluate-teachers-and-teacher-education-programs/





FAIR USE NOTICE
This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available in our efforts to advance understanding of education issues vital to a democracy. We believe this constitutes a 'fair use' of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 107, the material on this site is distributed without profit to those who have expressed a prior interest in receiving the included information for research and educational purposes. For more information click here. If you wish to use copyrighted material from this site for purposes of your own that go beyond 'fair use', you must obtain permission from the copyright owner.