The Kids are Alright But NAEP Is in Trouble
Over and over, Bracey gave evidence of this phenomenon. His books cut through the statistical maze and explain what's up.
In short, the NAEP achievement levels have been rejected by the Government Accountability Office, the National Academy of Sciences, the National Academy of Education, the Center for Research on Evaluation, Standards, and Student Testing, and the Brookings Institution, as well as by individual psychometricians. But they continue to be politically useful to those who want to scare the public about public school achievement.
In addition to questions about ethnicity, students are asked probes that get at income. I wonder how many parents are comfortable with their children being asked these questions.
And so on.
Down the Rabbit Hole with the Reading Passages
Student proficiency in reading, of course, gets the press attention. In How Does NAEP Label a Reader? I did what no media person ever does--studied the NAEP reading passages, the questions, student answers, and scorer comments on those answers. I post a revised version here.
Of the eight items NAEP posted as samples from 4th grade reading tests over a decade, topics include an American female astronaut on Mir, crab hunting, wombats, and life in the American Colonies. Two items, a West African tale and a pourquoi story from William Bennett's edited collection The Moral Compass, are in the folklore genre. There are two stories about rural children and their dogs. Highlights magazine holds the copyright on half of the items. NAEP provides online copies of the items and an analysis of the questions and responses. NAEP's Content Classification purports to explain what is being measured: Reading for Literary Experience, Forming a General Understanding, Developing Interpretation, Making Reader/Text Connections, Examining Content and Structure. (see Appendix)
NAEP asserts, "To reach the goal of approximating actual reading experiences, NAEP reading passages are typical of those read by students every day. (emphasis added) The passages are taken from authentic texts found in the environments of students in grades 4, 8, and 12. Selected passages are original sources, not simplifications, paraphrases, abridgments, or modernizations."
Unfortunately NAEP has removed the passages from their website. But they were there when I prepared this critique.
But you don't have to read the selections to know that passages from Highlights, with a subscription costing $26.04 when the test was given and $34.44 now, are not available to most US students to read every day. California, to name one state education system, gave up on the idea of staffing schools with professional librarians decades ago, in many cases even giving up on the idea of libraries, and across the nation public libraries, when they can even remain open, are cutting back on hours and book budgets. For documentation on library disparity, take a look at Prof. Stephen Krashen's research.
Fourth graders may indeed be interested in many things, even crab hunting and wombats. But a government that subjects children to testing that will be trumpeted as indicating the quality of their schools should surely include a few items that break out of the nostalgia/rural/agrarian genre, stories and articles closer to their own backgrounds and experiences. One can make the argument that reading about astronauts, crab hunting, wombats, life in the American Colonies, and rural children and their dogs might not be appropriate in topic, interest, perspective, or fairness. For those who think a West African folktale satisfies an ethnic consideration, consider the specialized language and format of such a tale. Teachers do very well teaching this--and the rest of the nostalgia genre, too--but what is appropriate for a classroom discussion doesn't give students full opportunity on a test, when they must do the whole thing solo.
According to NAEP, "Passages range in difficulty from those that could be read by the least proficient readers (e.g., about second-grade level in a fourth-grade class) to those that could be read by only the most proficient readers (e.g., possibly eighth-grade level in a fourth-grade class), as determined by teachers in specific grades." So when NAEP issues their proclamations about proficiency levels, 71 percent below proficiency nationwide, where do these eighth-grade level passages fit in? After all, fourth graders not proficient on eighth-grade level passages is hardly surprising. But this little detail seems to escape media scrutiny. Always, always we must ask: When we look at NAEP proficiency levels, what are we seeing?
At the time of my research, released reading passages were online at the NAEP website. They are no longer there but can be accessed by searching the titles below.
"How the Brazilian Beetles Got Their Coats" retold by Elsie Eells (from The Moral Compass, edited and with commentary by William J. Bennett, Simon & Schuster, 1995) offers a small example of how background knowledge and experience inform the text. Try asking children to explain the meaning of "set" in this sentence: In Brazil the beetles have such beautifully colored, hard-shelled coats upon their backs that they are sometimes set in pins and necklaces like precious stones. . . .
A few paragraphs later, "set" is used differently: The bright green and gold parrot set the royal palm tree at the top of the cliff as the goal of the race.
Fourth graders who do understand the first meaning are grossed out by the concept.
But this "set" variance is small potatoes compared with green mangoes with golden flushes on their cheeks lying on the ground under the mango trees. I don't know anything about mangoes, and learning they have cheeks stopped me cold in my reading. My husband didn't know about mango cheeks either, dismissing my question with "You know what they mean." Actually, neither one of us quite knew what a mango looks like. An Internet search reveals I can buy frozen mango cheeks and cook a coconut, mango, and lychee terrine. I even found a picture of a mango cheek. My point here is that as a reader this term shut down my reading for about 15 minutes. Test takers can't afford to stop and ponder.
NAEP gave "Full comprehension" to just 35% of children answering this constructed response probe:
Oh my: Like a beetle set in somebody's jewelry?
Now why would this child of few words have chosen a lead character from The Tortoise and the Hare if he had little or no comprehension of the text at hand? Why didn't he compare the beetle or rat to a character in The Velveteen Rabbit or Captain Underpants? Are they rating comprehension or verbosity?
NAEP rated 71% of the responses Unacceptable on this question: Do you think that the beetle is being sneaky by not telling the rat she has wings? Use information from the story to explain why or why not?
Sometimes scorers reward students for repeating the question; sometimes they don't. Note that the student who declares herself well-mannered gets full credit but the one who finds a story heroine sneaky does not.
A frequently used prompt asks readers to suggest a different ending to the story. These two answers for "The River" by Yetti Frenkel received only partial credit: "Minny and/or Cory might have drowned" and "The dog might have drowned," because, in the scorer's words: Both responses provide an appropriate opinion of how the story might have ended differently, but neither supports the opinion with evidence from the story. Thus, although children's answers may exhibit a certain lack of sophistication (or maybe they exhibit the virtue of Robert Frost's matter-of-fact hard floor), even the scorer acknowledges they don't exhibit lack of understanding of the reading selection. And so, once again, people viewing NAEP results on the reading proficiency of American fourth graders need to be aware of just what is being tested. Children aren't being tested on reading comprehension but on the ability to provide a formulaic integration of textual elements with a judgment. Many young responders don't repeat elements from the text because they take the text for granted. It's sitting there: why repeat it? A fourth grader who writes "The dog might have drowned" flunks. If he'd written "The dog might have drowned because the kids couldn't reach him" he would get full credit.
Similarly, when the prompt asks "Suggest a new and different title for the story and explain why it would be a good title," the student who summarized the events of the story in his title "How the Brazilian Beetle Won the Race" and offered this justification--"because I like it"-- received only partial credit because, according to the scorer, "the response offers no support for why it would be a good title."
Since when is a writer's gut feeling not allowed? And again, what is being tested? Surely not whether the student understood what he read. It is clear he did. When a student fails to get full credit for this answer, someone has to admit that there's another agenda here. What's being tested is not reading comprehension but English major-type ability to supply textual evidence to support right answers. Anyone who teaches knows that children's writing sophistication lags behind their reading level. NAEP ignores this, choosing to label lack of writing erudition as deficient reading comprehension. The NAEP mantra seems to be: If you can't--or choose not to--explain your answer in full detail, you don't get reading comprehension credit.
When children were asked, Suggest another title for the story [The River]. Use examples from the story to explain why it would be a good title, only 19% of respondents received full credit. The scorer gave only partial credit for "'Danger on the Ice' because there is danger on the ice," complaining that this is a circular explanation. But full credit was given to "'The Icy River' because the dog got stuck in the icy river." Go figure.
Remember: this kind of arbitrary and capricious judgment determines NAEP proficiency ratings, those ratings broadcast in media headlines across the nation: NAEP Finds 71% of 4th graders score below the proficient level. For New Mexico it was 80%. Does this mean their 4th graders can't read, or does it mean they weren't given credit for their alternate titles for "The River"? This is a critical question, a question ignored by:
a) The Media
d) Think Tanks
e) All of the Above
In another constructed response probe for "The River," children are asked, "Do you think this story was exciting? Use an example from the story to explain why or why not." Children who answered "Yes, because every good story has some danger" and "Yes, because I like rescue stories" were given only partial credit. Here is the scorer's reasoning: Both responses provide an opinion and support it with a general reference to the story. The first response claims that the story has some danger, but does not use examples to describe what is dangerous. The second response provides a general reference to the genre of the story that is not explained in terms of story events.
Are we looking for pre-trial lawyer skills here or evidence of whether children understood the story? Young readers who write "because every good story has some danger" are taking it as a given that the scorer knows that they know the story has danger in it. Shouldn't the word because be worth something here? Fourth graders aren't used to building an evidentiary case to prove their reading savvy, and there are two reasons why schools should not go into the business of training them to do so:
1) It would mean a whole lot less time spent on reading.
2) It would kill the joy of reading for reading's sake.
On Thin Ice
Another constructed response probe asked, What was one of the most important lessons that Cory and Elisa learned from their experience? Here are samples of responses that received full credit and those that didn't:
One doesn't have to read many answers labeled "Evidence of Full Comprehension" to recognize which 10-year-olds have been trained to repeat the question in their answer and which haven't. Teachers, even knowing this is what the Standardistos want, should be very reluctant to deliver it. Teaching in our times means following Walt Whitman: Resist much.
A surefire method of killing any joy in reading for children is to require them to grind out these answers for everything they read. Too many of the Full Comprehension answerers march along like Stepford byproducts, leading the reader to start checking penmanship: most of the "Full Comprehension" answers sound like they were written by the same person. Teachers need to come to grips with the reality that sometimes "full credit" from the testing marauders comes at too high a price.
Evidence of Partial or Surface Comprehension:
1) "Never walk on thin ice."
2) "Keep an eye on the dog."
Scorer's Commentary: Both responses identify a superficial lesson that conveys an understanding of a trivial aspect of the story.
Trivial aspect of the story! Indeed. For children who live near bodies of water that freeze in winter, the lesson about walking on ice isn't trivial at all. Besides that, I must admit I feel a kinship to kids who insist on avoiding all frippery and cutting to the chase. They remind me of my first year of teaching--when on a test I asked students what lesson they learned from Silas Marner. I gave full credit to the student who wrote, "Hide your money in a safe place," feeling she'd put me in my place. And I confess I don't remember any of the other student answers.
Standardistos will label such answers naive and incomplete; the rest of us, recognizing there are many levels of comprehension, must be willing to give credit to students who go for such basics. It would do testocrats well to remember that only in the school world do English major-type answers hold sway. The real world is more pragmatic. And since the world needs pragmatists as well as philosophers, it does us all great harm to label pragmatists as failures.
The plot of "The Box in the Barn" by Barbara Eckfeld Conner involves a young boy who lets a surprise puppy for his sister's birthday out of a box, and the puppy runs off. He's very worried because he'd been told not to touch the box. Fourth graders are asked, If the box had been empty when Jason opened it at the party, what would Jason most likely have said? Give examples from the story that support your answer.
This student response was rated unacceptable: "I think he would say, 'I didn't do it.' That's what most people would say." The student responder even included quotation marks for the direct quote.
Scorer's Commentary: The response doesn't use information from the story to predict Jason's reaction. It provides dialogue that does not demonstrate understanding of Jason's honest character.
The fact is that the reader has gone beyond the story to make an observation about human nature. The scorer's reaction seems especially bizarre when one considers another response that received full credit: "Maybe he might have just kept what he had done a secret. Because he had been trying to act normal." Here, the scorer says, The response uses information from the story to predict how Jason might have reacted if the box had been empty. The response predicts that Jason might have kept it a secret, based on the part of the story when he pretended to act like everything was fine when the party began.
Can we tell from these scorer judgments that one child is a more proficient reader than the other? Again, NAEP's declarations about non-proficiency are worse than suspect; they are wrong.
Another consideration must be whether the children feel oppressed by the incessant writing tasks. This quite insipid story, 1,029 words long, is weighed down by 12 questions, 7 of which require constructed responses providing evidence from the story:
Whew! Someone take pity on these fourth graders required to deconstruct so many motives and responses to one limp story.
Of Fares and Fishes: Grade 8 Reading Passages
The 8th grade test introduces a new category, Reading to Perform a Task. The sample, tiny print filling two pages, is indeed as indecipherable as any transportation fare guide I have ever seen. Certainly, the ability to scan for information is a useful skill, and maybe it's even a relief from the relentless literary probes. But even with this practical task students are required to write mini-essays in answer to constructed response probes such as What is the guide to Metro's Fares and Passes supposed to help you do? And a literary response is expected.
These two are judged Partial or Surface Comprehension:
I can't see that the response labeled Full Comprehension is that much different from either Partial or Little Comprehension, just more verbose:
Obviously, what a fare guide does for you depends on what you're looking for, providing at the same time lots of information most people don't care about. I would congratulate all responders for not being overwhelmed by the massive amount of detail in the brochure:
Metrobus school fares, children ride free, Metrobus fares, Metrorail fares, Metro fare replacement policy, Transfers, and on and on and on. To cut through all this detail and come up with a summary statement is an accomplishment. But if you wanted to judge comprehension of such a text, wouldn't a multiple choice question have been better?
Another item from the 8th grade test, "The Fish" by Elizabeth Bishop, offers yet another example of relentless essay questions which are so similar to one another. Since NAEP samples different parts of the test in different schools, one can only hope nobody had to answer all these questions.
I caught a tremendous fish
and held him beside the boat
half out of water, with my hook
fast in a corner of his mouth.
5 He didn't fight.
He hadn't fought at all.
He hung a grunting weight,
battered and venerable
and homely. Here and there
10 his brown skin hung in strips
like ancient wallpaper,
and its pattern of darker brown
was like wallpaper:
shapes like full-blown roses
15 stained and lost through age.
He was speckled with barnacles,
fine rosettes of lime,
and infested
with tiny white sea-lice,
20 and underneath two or three
rags of green weed hung down.
While his gills were breathing in
the terrible oxygen
--the frightening gills,
25 fresh and crisp with blood,
that can cut so badly--
I thought of the coarse white flesh
packed in like feathers,
the big bones and the little bones,
30 the dramatic reds and blacks
of his shiny entrails,
and the pink swim-bladder
like a big peony.
I looked into his eyes
35 which were far larger than mine
but shallower, and yellowed,
the irises backed and packed
with tarnished tinfoil
seen through the lenses
40 of old scratched isinglass.
They shifted a little, but not
to return my stare.
--It was more like the tipping
of an object toward the light.
45 I admired his sullen face,
the mechanism of his jaw,
and then I saw
that from his lower lip
--if you could call it a lip--
50 grim, wet, and weaponlike,
hung five old pieces of fish-line,
or four and a wire leader
with the swivel still attached,
with all their five big hooks
55 grown firmly in his mouth.
A green line, frayed at the end
where he broke it, two heavier lines,
and a fine black thread
still crimped from the strain and snap
60 when it broke and he got away.
Like medals with their ribbons
frayed and wavering,
a five-haired beard of wisdom
trailing from his aching jaw.
65 I stared and stared
and victory filled up
the little rented boat,
from the pool of bilge
where oil had spread a rainbow
70 around the rusted engine
to the bailer rusted orange,
the sun-cracked thwarts,
the oarlocks on their strings,
the gunnels--until everything
75 was rainbow, rainbow, rainbow!
And I let the fish go.
"The Fish" from The Complete Poems 1927-1979 by Elizabeth Bishop. Copyright (c) 1979, 1983 by Alice Helen Methfessel. Reprinted by permission of Farrar, Straus and Giroux.
It is troubling to note how formulaic the answers receiving Full Comprehension credit are. Of course it's too much to ask that a respondent's "voice" come through on a test, but the NAEP correctors discourage this anyway. Students who draw on their own personal experience are penalized for not referring to the poem. In some cases answers labeled No Comprehension might actually be indicative of a deep, personal response to the poem. One wouldn't know without talking to the students. Talking with children about poetry usually raises more wonderful questions than answers, but NAEP is looking only for answers that fit their formula.
Modest Proposal: Lock up the NAEP Governing Board--and all media who headline NAEP results-- in a room and make them answer all these questions:
1. Describe your reaction to the poem. Explain what it is about the poem that caused you to react that way.
Performance Results: No Comprehension: 16% Partial comprehension: 38% Full Comprehension: 44% Omitted: 2%
2. Which of the following best describes the person speaking in the poem?
A. Thoughtful and observant
B. Tired and victorious
C. Grim and sullen
D. Frightened and lonely
Performance Results: Correct: 82% Incorrect: 18%
3. Why does the person let the fish go? What in the poem makes you think so?
No comprehension: 25% Partial comprehension: 45% Full comprehension: 29% Omitted: 1%
4. Choose an image from the poem and explain what it means.
No comprehension: 28% Partial comprehension: 34% Full comprehension: 31% Omitted: 6% Off task: 1%
What?????! The plot of the poem???? Rainbow isn't an image???
5. Explain how the language used in this poem is different from the language used in everyday speech. Use a line or phrase from the poem as an example.
No comprehension: 39% Partial comprehension: 28% Full comprehension: 26% Omitted: 6% Off task: 1%
One can only wonder what the scorer thinks "Most people would have said" means.
6. When the poet says 'Like medals with their ribbons frayed and wavering' (lines 61 & 62), she is referring to:
D) fish scales
Performance Results: Correct: 53% Incorrect: 46% Omitted: 1%
7. From the time she catches the fish until she lets it go, the speaker's feelings about the fish change. Tell how her feelings change and what causes them to change. Explain your answer by referring specifically to the poem.
Unsatisfactory: 6% Partial: 36% Essential: 38% Extensive: 14% Omitted: 5% Off task: 1%
The student replies labeled "Extensive" are just that: these kids are on track for Ph.D.s in English.
8. Reread the lines beginning with "I admired" (line 45) and ending with "aching jaw" (line 64). What do these lines tell you about the fish's experience?
No comprehension: 21% Partial comprehension: 48% Full comprehension: 29% Omitted: 1%
9. Describe how the speaker's decision to let the fish go is related to something you have experienced, read, or seen.
No comprehension: 28% Partial comprehension: 42% Full comprehension: 29% Omitted: 0%
Interesting. This must be the NAEP version of elegant variation: a child gets full credit by comparing the poem to her own catching of a butterfly but not for her own catching of a fish.
One might also suspect some class bias here.
How Did This Happen?
And so, reading comprehension is assessed by writing polish. And writing proficiency is judged by a narrow formula: providing examples from the texts is a lot safer than pulling them from your own experiences. If any of this bothers you, write the members of the Governing Board. After all, as mandated by NAEP law, the Governing Board reviews the items both before and after field testing for appropriateness and bias.
Remember The House that Jack Built? A "comprehensive national process" conducted under the auspices of the National Assessment Governing Board (NAGB) produced a reading framework, which is a "broadly accepted outline of what hundreds of educators, curriculum experts, policymakers, and members of the general public thought the assessment should test." This framework describes the goals of the assessment, "providing the theoretical basis for the assessment and the directions for what kinds of exercises should be included in the assessment, how those exercises should be designed, and how student responses should be scored."
What Does the NAEP Reading Assessment Measure? (according to NAEP)
After the completion of the framework, the NAEP Reading Committee worked with measurement specialists to create the assessment questions and scoring criteria according to the framework's specifications. "All exercises and scoring criteria were carefully reviewed to ensure that the assessment met the requirements of the reading framework."
How Was the NAEP Reading Assessment Developed? (according to NAEP)
The 2005 NAEP Reading Committee
Peter Afflerbach, University of Maryland; Rosalinda Barrera, University of Illinois at Urbana-Champaign; Susan Biggam, Vermont Reads Institute; Violet Harris, University of Illinois at Urbana-Champaign; Tamara Jetton, James Madison University; Janet Jones, Berry Elementary School; Judith Langer, The State University of New York at Albany; Patricia McGonegal, Mount Mansfield High School; Jane Miller, The Lawrence Barnes School; Patricia Parrish, Sumrall Attendance Center; Jack Pikulski, University of Delaware; Timothy Shanahan, University of Illinois; Alfred Tatum, University of Maryland; Suzanne Wade, University of Utah; Karen Wixson, University of Michigan
Committee on Standards, Design and Methodology
Vice Chair: Michael Ward
Susan Loomis (staff)
Take a look at the 2014 NAEP Governing Board and you will see one chief architect of the Common Core, traveling at NAEP as "curriculum specialist."