As test scores rise, standards are lowered
A Memphis teacher points out that the writers failed to note that the test has been changed every year.
Students and teachers again are gearing up to take achievement tests this spring, but a four-month investigation raises questions about the execution and value of a complicated system designed to leave no child behind.
By Ruma Banerji Kumar and Halimah Abdullah
"The much-discussed 'race to the bottom' appears to have begun. If states ease their standards, construct simple-minded tests, or set low passing scores, they can mislead their own citizens and educators into thinking that just about everyone is proficient."
-- Chester E. Finn Jr., president of the Fordham Foundation, an education reform organization.
"If every state is going to create its own assessments and tools of what is proficient and advanced, then what's the point? We've got to figure out what is proficient as a nation."
-- Memphis City Schools Supt. Carol Johnson.
Achievement test scores are rising, and so are doubts about whether the scores released every school year are measuring higher achievement or lower expectations.
Over the past three years, Tennessee and a dozen other states have relaxed standards for how well students should perform on achievement tests.
Tennessee and 40 other states also are using questionable grading formulas to give schools and districts a better chance at meeting the goals of the 2002 No Child Left Behind law.
Testing experts say those practices have rendered NCLB's annual "Report Cards" to the public confusing, misleading and unreliable.
"What we find happening around the country is that educators in positions of influence don't want their schools to look bad," said James Popham, a testing expert, a professor emeritus at UCLA and author. "As a consequence they will try to do everything the law permits to have fewer schools appear as if they're failing."
NCLB mandated that every child be "proficient" -- working at grade level -- by 2014. The law also requires schools and districts to make "Adequate Yearly Progress" in reaching that goal, or face sanctions.
But the federal law gives Tennessee and other states great flexibility in collecting that data. Each state can define its own standard of proficiency and determine how to test and grade its own students and schools.
Tennessee has revised its system every year.
For example, eighth-graders who took the TCAP reading test three years ago had to correctly answer 36 out of 70 questions -- 51 percent -- to be considered "proficient."
A year later, in 2004, eighth-graders were required to answer fewer questions right, just 29 of 68 -- 43 percent.
The next year, in 2005, they needed to get only 22 of 55 -- 40 percent.
Expectations have been lowered in other grades, as well.
Three years ago, third-graders who took the TCAP reading test had to answer correctly 34 of 62 questions -- 55 percent -- to be considered proficient.
Two years ago, in 2004, they needed 29 of 63 questions -- 46 percent.
The next year, they needed 19 of 50 -- 38 percent right.
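The percentages above follow directly from the cut scores. As a minimal sketch for readers who want to check the arithmetic, the snippet below recomputes them. The figures come from the article; the 2003 label for "three years ago" is inferred from the surrounding dates, and the names `CUT_SCORES` and `percent_required` are illustrative only, not part of any state system.

```python
# (year, grade, correct answers needed, total questions), as reported above.
# The year 2003 is inferred from "three years ago ... a year later, in 2004".
CUT_SCORES = [
    (2003, 8, 36, 70),
    (2004, 8, 29, 68),
    (2005, 8, 22, 55),
    (2003, 3, 34, 62),
    (2004, 3, 29, 63),
    (2005, 3, 19, 50),
]


def percent_required(correct_needed, total_questions):
    """Share of questions a student must answer correctly, as a percentage."""
    return 100 * correct_needed / total_questions


for year, grade, needed, total in CUT_SCORES:
    pct = percent_required(needed, total)
    print(f"{year} grade {grade}: {needed}/{total} = {pct:.0f}%")
```

Both the number of questions and the number of required correct answers changed each year, so the percentage is the fairer basis for comparison across years.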
As the number of required correct answers dropped between 2004 and 2005, the percentage of Tennessee eighth-graders who cleared the NCLB bar jumped from 80 percent to 87 percent. And the percentage of the state's third-graders working at grade level increased from 84 to 90 percent.
In Memphis, the lower standards coincided with dozens of schools making it off a state watch list for not meeting NCLB goals. In 2003, 149 schools were on that list. By 2005, 82 remained.
Federal and state officials acknowledge that NCLB scoring standards are revised every year, but they say the changes are statistically sound and academically valid.
"We are not lowering the standards. I'm not willing to sacrifice the well-being of every child in the state just to look good national(ly)," said Lana Seivers, Tennessee's education commissioner.
State officials defend the scoring changes as valid and say the TCAP doesn't purport to measure what every third-grader or eighth-grader should know. It measures what the average student knows, based on scores of field-tested questions.
According to their system, 20 percent of students should naturally score below proficient, 50 percent at proficient, and 30 percent above proficient.
"That 20-50-30 is what we're shooting for; it follows a natural bell curve," said Mary Reel, the state's testing director. "Assessments aren't designed so students get all the questions correct. Tests are a snapshot in time, a diagnostic tool to show where your strengths and weaknesses are."
But testing experts said there's a flaw in Tennessee's "diagnostic tool." They said that when the state defined proficiency three years ago, it used an approach that lowered standards for some grades.
It began in summer 2003 when 144 teachers and school district administrators gathered to develop the TCAP -- the Tennessee Comprehensive Assessment Program. Their first task was to decide what every child in grades 3-8 should know in math and reading.
For example, they decided every third-grader should be able to master 74 math skills -- including knowing how to add two- and three-digit whole numbers, how to count by tens, hundreds and thousands, and how to identify numbers as odd or even.
They decided every eighth-grader should be able to master 94 math skills -- including knowing how to estimate length, perimeter, circumference, area and volume.
Even though these standards are set, state officials say it's unrealistic to expect students in any grade to master all the skills they're supposed to learn. Proficiency doesn't require mastery, they argue.
"You don't have to get all the questions right on the drivers test to get a license. You can't expect children to get 100 percent correct; there are too many factors that come into play," said Deb Malone, an official with the state's testing department. There are factors like poverty, inadequate early childhood education, lack of parental involvement and ineffective teachers.
Because not all students will answer every question on the test correctly, passing grades, or "cut scores," are revised every year, based on how many questions the average student got right on the test in years past. The testing company CTB McGraw Hill, which has a six-year, $33 million test development contract with the state, sets the cut scores. The TCAP questions change every year. So do the cut scores.
"The much-discussed 'race to the bottom' appears to have begun," said Chester E. Finn, Jr., president of the Fordham Foundation, an education reform organization.
"If states ease their standards, construct simple-minded tests, or set low passing scores, they can mislead their own citizens and educators into thinking that just about everyone is proficient."
Getting extra credit
Tennessee and other states have found other ways to bolster their test scores.
NCLB requires schools and districts to make "Adequate Yearly Progress" (AYP) toward reaching the 2014 goals in proficiency, attendance and graduation rates.
But states are using complicated statistical formulas to gain extra credit in meeting those goals. The two most popular methods are Confidence Interval and Safe Harbor.
Confidence Interval gives schools more latitude in making AYP.
In Tennessee, schools that don't have at least 83 percent of their students scoring proficient in reading and 79 percent scoring proficient in math are identified as "needing improvement" and fail to make adequate progress under NCLB.
Any school or district that fails to make AYP four years in a row can find itself in "corrective action." The school is on probation and the State Department of Education can fire school staff, increase the length of the school day or year, or decrease the authority of the local school district.
But by using Confidence Interval, schools or districts can factor in a "margin of error" in the results, much like the margin of error in opinion polls.
In Tennessee, the margin of error is 5 percentage points. So, even though the state's goal is to get 79 percent of students to score proficient in math, a school can meet the bar by having as few as 74 percent proficient in math. That gives the school or district a better chance of making AYP.
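The effect of the margin can be sketched in a few lines. This is a simplified illustration of the flat 5-point allowance described above, not the state's actual formula; a true statistical confidence interval would normally vary with the number of students tested. The function name `meets_target_with_margin` is hypothetical.

```python
def meets_target_with_margin(percent_proficient, target, margin=5.0):
    """Return True if a school clears the target once the margin is subtracted.

    Assumes a flat margin in percentage points, as the article describes
    for Tennessee; real confidence intervals depend on group size.
    """
    return percent_proficient >= target - margin


# Tennessee's math target is 79 percent; with a 5-point margin the
# effective bar becomes 74 percent.
print(meets_target_with_margin(74, 79))  # school at 74 percent proficient
print(meets_target_with_margin(73, 79))  # school at 73 percent proficient
```

Under these assumptions, a school at 74 percent passes and a school at 73 percent does not.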
Connie Smith, the director of accountability with the state, estimates that roughly 15 percent of the state's 1,677 schools made AYP last year by using Confidence Interval.
Safe Harbor gives schools extra credit for reducing the number of students who fail to test at "proficient levels" by at least 10 percent.
Under Safe Harbor, for example, Grahamwood Elementary received credit for improving math performance among its poor students. In 2004, only 56 percent of that group was proficient in math. In 2005, the figure was 62 percent.
That 62 percent is still far shy of the state's goal of 79 percent, but because the school was able to reduce the number of poor students below proficient in math by 10 percent, it got credit for making adequate progress under NCLB.
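The Grahamwood arithmetic works because Safe Harbor looks at the share of students below proficient, not at the share who are proficient. The sketch below is a rough illustration under one stated assumption: the group's size stayed about the same from year to year, so percentages can stand in for head counts. The function names are hypothetical and the official calculation may differ.

```python
def relative_reduction(prev_proficient_pct, curr_proficient_pct):
    """Fractional drop in the share of students scoring below proficient."""
    prev_below = 100 - prev_proficient_pct
    curr_below = 100 - curr_proficient_pct
    if prev_below <= 0:
        return 0.0  # nobody was below proficient, so there is nothing to reduce
    return (prev_below - curr_below) / prev_below


def safe_harbor_met(prev_proficient_pct, curr_proficient_pct, required=0.10):
    """True if the below-proficient share fell by at least the required fraction."""
    return relative_reduction(prev_proficient_pct, curr_proficient_pct) >= required


# Grahamwood's poor students in math: 56 percent proficient in 2004,
# 62 percent in 2005, so the below-proficient share fell from 44 to 38.
print(f"{relative_reduction(56, 62):.1%}")
print(safe_harbor_met(56, 62))
```

A 6-point gain in proficiency therefore counts as roughly a 14 percent reduction in students below proficient, which clears the 10 percent threshold.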
Safe Harbor helped 70 Tennessee schools make adequate yearly progress last year without meeting the state's bar, including 20 Memphis schools. No Shelby County schools benefited from Safe Harbor in 2005.
Lowering the bar
States began looking for ways to make their scores right after NCLB became law. Early in 2002, then-education secretary Rod Paige scolded state education departments for lowering "the bar of expectations to hide the low performance of their schools." It's an action "not worthy of a great country," he wrote.
But states felt they had little choice. They faced sanctions for not meeting NCLB's annual goals, sanctions that included the loss of millions of dollars in much-needed federal aid. They faced enormous -- some said impossible -- obstacles in getting every child -- including those with special needs -- up to grade level. They also faced losing students to charter schools.
Meanwhile, the U.S. Department of Education came under pressure to show the President's signature program was working. So federal education officials began giving states increasing flexibility to measure their NCLB progress.
"If states are lowering their standards, that's not consistent with what NCLB is supposed to do," said Henry Johnson, assistant secretary of elementary and secondary education for the U.S. Department of Education. "We're not approving plans that have that as their goal."
But over the past four years, the U.S. Department of Education has given states more and more flexibility on everything from what's considered a "failing" school or district to how to count graduates and dropouts.
Maryland, for example, has been allowed to count a gain of one-tenth of 1 percent in graduation and attendance rates as "adequate yearly progress." So Maryland got credit for improving its graduation rate from 84.2 percent to 84.8 percent from 2004 to 2005.
Mississippi has been allowed to identify a district as "failing" only when its elementary, middle and high schools all miss NCLB goals in the same subject area for two consecutive years. Mississippi is one of the few states in the region to request and be given such broad leeway.
In 2004, Tennessee asked for a special deal for its districts and schools that fail to meet NCLB goals because of poor performance by students with disabilities. Now, Tennessee automatically considers 2 percent of those students to be proficient, even if they aren't.
Arkansas asked to use a Confidence Interval of 75 percent -- a 25 percent margin of error. That gives Arkansas schools greater latitude than most in the nation to meet NCLB standards. In Arkansas, if schools are asked to have 79 percent of students proficient in math to meet NCLB, then that state's margin of error allows schools to meet that bar even if only 54 percent of their students are proficient.
NCLB critics said the federal government is quietly granting these changes to help states make it look like they're meeting their goals. The result: The public gets the impression of widespread improvement in public schools, when only incremental progress is being made.
"Rather than coming out and saying they were wrong, little by little, they're coming up with flexibility for states," said Reg Weaver, president of the National Education Association.
But state officials said the flexibility is needed to address realities that can't be measured by numbers.
In fall 2003, for example, with nearly half of Tennessee's 1,677 schools on NCLB's "failing" list, "our resources were being stretched too thin," said Julie McCargar, the director of federal programs at the Tennessee Department of Education.
Tennessee and other states felt NCLB over-identified schools and districts that needed help. Tennessee didn't have the money, she said, to help more than 800 schools improve.
So the state started using Confidence Interval and Safe Harbor formulas. Last year, those formulas helped thousands of schools get off the list in 41 states. In Tennessee, an estimated 253 schools got off the list last year.
"Other states are using it, and the U.S. Department of Education has told us we can use it," said Connie Smith, the state department of education's accountability director.
Changing the system
Experts said that without a national standard, NCLB's concept of "proficiency" has little real value -- and means different things in different states year after year.
"If every state is going to create its own assessments and tools of what is proficient and advanced, then what's the point?" said Memphis City Schools Supt. Carol Johnson. "We've got to figure out what is proficient as a nation."
A national achievement test isn't likely. But some education groups have called for the development of a uniform bar that every student, in every district, in every state should meet.
Still, that's controversial, and states bristle at the notion that the federal government could tell them what students in their states should know.
Some educational groups have also called for "growth models," which hold states accountable for students' progress based on where they started. The models would help show whether at least a year's worth of learning occurred in a year's time, instead of requiring states to meet a generalized annual target.
The growth models, identical to the value-added system Tennessee has been using since 1992, are particularly important for urban districts where a large number of students come to school with learning deficits. They give schools credit for progress they make with each child, no matter the challenge: speech disability, language barrier, improper early childhood education.
"(NCLB proponents) want to make it appear as if the results on high stakes tests determine the success of a child, I don't believe that," said the National Education Association's Weaver. "NEA advocates for growth models, which allows teachers to see progress as it is made."
The Education Department has responded by considering granting 10 states, most likely including Tennessee, the chance to use such models to make adequate yearly progress.
But districts have also taken matters into their own hands. In Memphis, Johnson has required schools to take diagnostic tests every six weeks to help teachers see where the academic strengths and weaknesses lie. Schools use the tests to tweak what and how they're teaching and to focus on areas students struggle with the most.
In southwest Memphis, Westwood High principal Tommie McCarter studies the diagnostic tests closely.
"This way we're not waiting until the fall, when the students who took the test have gone on to the next grade, to see where they needed help," McCarter said.
McCarter said she and her staff are working to go beyond the NCLB mandate to bring equality to the education her students receive.
"We have students who come to us with basic needs: food, shelter, clothing," McCarter said. "I resent that (NCLB) doesn't take that into account. But that's not an excuse. We'll get these kids proficient. It just might take us longer."