2015: A School Odyssey. HAL will now grade your standardized test
By Frank Cerabino
You might imagine that it'd be almost impossible to find a new way for standardized testing to degrade public education any more than it already has in Florida.
After all, standardized testing in some schools eats up 60 to 80 days of the 180-day school year. It has demoralized teachers, kept students from deeper educational opportunities, and, perhaps by design, served the political purposes of lawmakers who yearn to establish constitutionally challenged voucher programs that transfer public money to private schools.
But we haven't hit rock bottom. Yet.
Florida soon will use computers, not teachers, to grade student essays in some of these high-stakes standardized tests.
There's a new emphasis on having students write long explanatory responses to test questions, rather than bubbling in answers on multiple-choice tests. Which is fine.
But essays, unlike multiple-choice tests, have needed human graders. Each FCAT essay required two human graders, with a third brought in to resolve differences.
With the new emphasis on essays, that might not even be feasible, wrote Mark Shermis, the dean of the School of Education at the University of Houston -- Clear Lake.
"For example, the state of Florida has approximately 180,000 students in each grade level," Shermis wrote. "If each student in that one state had five essays graded, the state would be required to evaluate almost 11 million documents per year, raising questions as to the feasibility of recruiting a sufficient number of qualified human graders to provide final scores, read reliably, in a timely manner across the entirety of the United States."
Machine scoring of essays makes that possible. And much cheaper. Machine scoring can trim grading costs by as much as 50 percent, according to a study supported by the William and Flora Hewlett Foundation.
But how does a machine grade an essay?
Beginning in March, Florida students from the fourth grade to the 11th grade will be required to take tests that include timed essays. Except for the fourth graders, all those essays will be written on computer keyboards, which will allow machines to score them.
Florida is spending as much as $5.4 million to buy Utah's standardized test, which is known as SAGE -- Student Assessment of Growth and Excellence. And Florida is paying another $220 million to the American Institutes for Research to develop and grade these tests for the next six years.
The company says the upcoming SAGE test will be machine scored. This includes student essays, short answers, drawings, "hot text," equations and interaction choices.
"Scoring engine searches the student response for a match to the correct answer, answers, or characteristics of an answer supplied by the item writer," the company explained.
The essay scoring looks for key words, checks spelling and grammar and develops a "statistical model predicting human score from measured features."
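For readers wondering what "predicting human score from measured features" might look like, here is a rough Python sketch. It is not the company's scoring engine. The key-word list, the three features and the weights are all made up for illustration, and a real system would presumably fit its weights to essays already scored by humans and would also check spelling and grammar, which this sketch skips.

```python
import re

# Hypothetical key words an item writer might supply for a prompt on privacy.
KEYWORDS = {"privacy", "privateness", "humankind", "subjugate"}

# Hand-picked weights standing in for a fitted statistical model (an assumption).
WEIGHTS = {"keyword_hits": 1.0, "avg_word_length": 0.5, "word_count": 0.1}


def measure_features(essay):
    """Count surface features of the text; nothing here looks at meaning."""
    words = re.findall(r"[a-z]+", essay.lower())
    if not words:
        return {"keyword_hits": 0, "avg_word_length": 0.0, "word_count": 0}
    return {
        "keyword_hits": sum(1 for w in words if w in KEYWORDS),
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "word_count": len(words),
    }


def predicted_score(essay):
    """Weighted sum of the measured features."""
    feats = measure_features(essay)
    return sum(WEIGHTS[name] * value for name, value in feats.items())


PLAIN = "People need privacy because it lets them think freely."
GIBBERISH = (
    "Privateness has not been and undoubtedly never will be lauded, "
    "precarious, and decent. Humankind will always subjugate privateness."
)

if __name__ == "__main__":
    print("plain sentence:", predicted_score(PLAIN))
    print("gibberish:     ", predicted_score(GIBBERISH))
```

Under these invented weights, the gibberish passage outscores the plain sentence simply because it is longer, uses bigger words and repeats the key words. Nothing in the sketch asks whether the sentences make sense.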
Many educators are horrified.
"Personally and professionally, as a parent and as an educator, I do not want machines scoring students' writing in Florida because the technology required to grade students fairly and accurately does not exist and because using what does already exist will only promote an inadequate, reductive understanding of writing," said Paul Corrigan, an English professor at Southeastern University in Lakeland. "In a time when almost everyone in the larger discussions on education agrees that writing skills are as important as ever for students to develop, we shouldn't hand off the assessment of writing skills to computers that simply are not up to the job.
"The way we test inevitably shapes the way we teach," he said. "Writing, which requires nuance, thought, and creativity from students, requires humans to teach and to assess."
The website HumanReaders.org catalogues studies showing that computer algorithms used to evaluate writing often fall short. The National Council of Teachers of English says that machine grading of essays uses crude methods that fail to recognize clarity, irony, and logic.
To illustrate that, Les Perelman, the former director of undergraduate writing at MIT, created the Babel Generator, an online tool that uses key words to create nonsense sentences that receive high marks on machine-scored tests.
Computer algorithms, he pointed out, are easily fooled by gibberish such as this: "Privateness has not been and undoubtedly never will be lauded, precarious, and decent. Humankind will always subjugate privateness."
Big words that mean nothing.
Soon to be part of big tests that mean too much.
And you thought the obsession with standardized testing in Florida couldn't get any worse.