Research that Matters: Putting Testing to the Test
Reading Between the Lines of Test Failure
"At a time of limited resources, responsible accountability calls us to question the trend of putting more and more money into testing instead of into more instructional support for students and more professional development for teachers."
Reading is universally valued as a fundamental skill that opens the doors to learning, higher education, good jobs, and informed citizenship. How well, then, does current large-scale testing assess a student's reading ability and identify where help is needed?
Research by UW College of Education Professor Sheila Valencia and Professor Marsha Riddle Buly of Western Washington University provides a compelling example of the limitations of large-scale tests and the importance of classroom-based assessments in understanding students' reading abilities. In their study of fourth-grade readers who failed the state reading test in Washington (WASL), Valencia and Riddle Buly probed beneath the state scores and found that reading problems are much more complex and multi-faceted than a single score suggests. While their study focuses on Washington State's reading test, their results are not unique, but indicative of a larger, nationwide issue.
DIFFERENT READERS, DIFFERENT PROBLEMS. Using a variety of individualized assessments similar to those many teachers use in the classroom, Valencia and Riddle Buly identified six distinctive profiles of students who had failed the test. These six groups showed various strengths and weaknesses in three major areas of reading: word identification, comprehension, and fluency. For example, among the students who failed the test were fourth-graders whose word identification abilities tested at the ninth-grade level and others who tested at the first-grade level, just learning to read words. Similarly, some of the students read quickly and with expression, while others read haltingly or with little phrasing and expression. Furthermore, although many intervention programs for struggling readers target word identification or phonics, more than 50% of the students who failed were strong in word identification. They had needs in comprehension, not in phonics.
Because of the way most high-stakes assessments are designed and the way scores are reported, students with such vastly different strengths and weaknesses can actually receive the same score. Valencia and Riddle Buly conclude that scores from large-scale tests can mask important individual differences in students' reading abilities that call for dramatically different interventions. It is clear that there is no single solution for students with reading difficulties. Beneath each failing reading score is an individual pattern of performance that holds the key to better reading instruction and, consequently, improved reading ability.
"The more we generalize the remedies, the more students we fail," Valencia says. "Isolated test scores by themselves aren't particularly helpful in understanding what individual students need. And policies based on a single test score can lead to inappropriate decisions. We have to peel back lots of layers to find out what's at the core of a specific student's difficulty. In addition, we need to develop more detailed images of student progress. Even if students haven't read at their grade level on a large-scale test, we need an assessment system that can show when they've made progress in that direction."
Valencia and Riddle Buly conclude that additional testing, such as that promoted by No Child Left Behind, will not improve student achievement. But better instruction, focused on students' needs, will. Results of their study suggest that this kind of instruction cannot be informed by scores from large-scale assessments. However, it can be informed by classroom assessments and implemented by teachers who are supported in developing their expertise.
"At a time of limited resources," says Valencia, "responsible accountability calls us to question the trend of putting more and more money into testing instead of into more instructional support for students and more professional development for teachers. Unless we look beneath test scores, at individual students and the conditions that foster high quality learning (e.g. adequate materials, supplementary instructional supports, healthy schools, strong leadership, and knowledgeable teachers), more and more students will continue to fall below the bar."
POINTER TO MORE RESEARCH:
For more information on the policy-related aspects of this research:
Riddle Buly, M., & Valencia, S. W. (2002). Below the bar: Profiles of students who fail state reading assessment. Educational Evaluation and Policy Analysis, 24(Fall), 219-239.
For more information on the study's practice-related aspects:
Valencia, S. W., & Riddle Buly, M. (2004). Behind test scores: What struggling readers really need. The Reading Teacher, 57(6), 520-531.
College of Education
University of Washington