

LET’S DO THE NUMBERS: Department of Education's Race to the Top Program Offers Only a Muddled Path to the Finish Line

Susan Notes:

The authors state flat out: "The actual experience of RTT--in which the selection of particular states to receive competitive grants can't reasonably be justified--is further reason to abandon this approach for the future." And much more. Must reading.

You'll find many things you want to quote. Here are a few.


  • Because the awards were based on precise numerical scores, the process was presented as objective and scientific. However, further examination suggests that the selection of Delaware and Tennessee was subjective and arbitrary, more a matter of bias or chance than a result of these states' superior compliance with reform policies.

  • Whatever its merit in flush times, the substitution of competition for uniform funding has no place in this time of state fiscal crisis.

  • Evaluations based on metrics can appear objective even though they in fact reflect only bias or chance.

  • "The Department’s 500-point system [for Race to the Top Grant proporals] is needlessly complex. Its implied precision makes the results seem less affected by human judgment than is the case."

  • When these judging errors are combined with expected errors in the design of the metrics, it is surprising that the Department asserted that the final state scores were correct to one decimal place.

  • Winning could be the result only of individual reviewers' occasional quirkiness.

  • For permitting a period of public comment [before adopting Common Core Standards], Massachusetts was deemed in violation of the competition rules, and the RTT reviewers docked Massachusetts a whopping 15 (out of 20 possible) points on this metric. [So participatory democracy cost MA hundreds of millions.]

  • The Race to the Top 500-point rating system presents a patina of scientific objectivity, but in truth masks a subjective and somewhat random process.

  • Because this trial run of a competitive system has proven to have little credibility, the administration should rethink its approach to federal education aid and its relationship to school improvement.



  By William Peterson and Richard Rothstein

  Table of Contents

  • Introduction

  • Dangers of metrics

  • RTT weights

  • The RTT 500-point system

  • Questions on categories and initial weights

  • RTT panel judgments

  • Devil in the details

  • Needless complexity

  • Conclusions and recommendations


    Introduction

    The American Recovery and Reinvestment Act of 2009 (the "stimulus" bill) provided $4.35 billion to the Department of Education for "Race to the Top" (RTT), a program in which states could apply for funds to implement education reform. Secretary of Education Arne Duncan established a competition to determine which states would receive the funds, and 40 states (plus the District of Columbia) entered. Of these, 16 were named as finalists, and in late March 2010, two states, Delaware and Tennessee, were announced as winners of the first round. The awards were substantial: Delaware got $100 million (or about $800 per pupil), and Tennessee got $500 million (or about $500 per pupil). In each case, the award represents about 7% of the total expenditures in these states for elementary and secondary education.

    To compete for RTT funds, governors were faced with quickly organizing existing resources to underwrite extensive grant-writing efforts. Some invested significant political energy and leadership to persuade school districts and
    teacher unions to endorse the applications, while others had to press legislatures to change state laws on charter schools and teacher evaluation. When the winners were announced, some governors expressed concern and disappointment at what one called an "inscrutable process," leaving them to wonder whether it would be worth participating in future rounds of the competition.1

    Delaware and Tennessee won because they got the
    most points (454.6 and 444.2, respectively) out of a total
    of 500 points available. Five outside panelists2 reviewed
    each state’s application, including interviews with delegations
    from the finalist states, and awarded points for states'
    compliance with policies promoted by Secretary Duncan,
    such as participating in a national consortium to develop
    common standards in reading and math (maximum of 20
    points) or using data to improve instruction (maximum
    of 18 points).
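    Final scores carried to one decimal place (454.6, 444.2) are what you get when a five-member panel's totals are averaged. A minimal sketch of that arithmetic, with invented reviewer totals (only the 500-point scale and the five-reviewer panel come from the paper; the aggregation shown is an assumption for illustration):

```python
# Average five reviewers' 500-point totals into a state's final score.
# The reviewer totals below are hypothetical, chosen only to illustrate
# how averaging whole-number totals yields a one-decimal result.

def final_score(reviewer_totals):
    """Mean of the panel's totals, rounded to one decimal place."""
    return round(sum(reviewer_totals) / len(reviewer_totals), 1)

hypothetical_panel = [460, 448, 455, 452, 458]  # invented totals out of 500
print(final_score(hypothetical_panel))          # prints 454.6
```

    Note that the decimal is an artifact of division by five, not evidence that the underlying judgments are precise to a tenth of a point.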

    Because the awards were based on precise numerical
    scores, the process was presented as objective and scientific.
    However, further examination suggests that the selection
    of Delaware and Tennessee was subjective and arbitrary,
    more a matter of bias or chance than a result of these
    states' superior compliance with reform policies.3

    At a time of widespread fiscal crises in the states,
    when receipt of Race to the Top awards can determine
    whether class sizes will be increased and teachers laid off,
    such capricious decision-making is unfortunate. The
    Department of Education can use its distribution of
    funding as a “carrot” to stimulate states to improve their
    education policies, but when state budgets are as stressed
    as they are today, every state should get a fair share of
    federal funding, excepting only those that refuse to make
    good faith efforts to implement research-based improvements
    in elementary and secondary education.

    The Obama administration intends RTT to be the
    model for a new approach to the distribution of federal
    elementary and secondary education aid. Whatever its
    merit in flush times, the substitution of competition
    for uniform funding has no place in this time of state
    fiscal crisis. The actual experience of RTT--in which
    the selection of particular states to receive competitive
    grants can't reasonably be justified--is further reason to
    abandon this approach for the future.

    Dangers of metrics

    Quantitative metrics are a popular management tool.
    Such metrics can be used to describe objective performance,
    such as total school lunches served per day, or subjective
    factors, such as an evaluator's judgment of a teacher's
    skill in teaching math. When managers use metrics to
    evaluate overall performance, they must assign weights,
    or relative importance, to the various metrics. For example,
    a school's overall rating could be determined by a combination
    of a rating for lunches served (weighted as 25%
    in importance) and a rating for the math teacher’s skill
    (weighted as 75% in importance).

    Subjective judgment is required both for assigning
    weights to metrics, and for making judgments regarding
    performance on most individual metrics. In the latter
    case, dangers of subjectivity can be reduced by providing
    evaluators with detailed checklists (sometimes called
    "rubrics") describing the components of performance (e.g.,
    in the case of the math teacher, assigning so many points
    for demonstrating understanding of the lesson, assigning
    a certain number of points for calling on children from
    different parts of the room, etc.), and by training evaluators
    by asking them to observe identical lessons and comparing
    the ratings to ensure "inter-rater reliability."
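    The "inter-rater reliability" check described above amounts to comparing evaluators' scores on identical lessons and measuring how far apart they land. A minimal sketch with invented rubric scores (nothing here comes from RTT data; the gap measure is one simple choice among many):

```python
from statistics import mean

# Two hypothetical raters score the same three lessons on a 0-20 rubric item.
rater_a = [18, 15, 12]
rater_b = [17, 15, 11]

def mean_abs_disagreement(a, b):
    """Average absolute gap between two raters across identical lessons."""
    return mean(abs(x - y) for x, y in zip(a, b))

gap = mean_abs_disagreement(rater_a, rater_b)
print(gap)  # a small gap suggests the rubric yields consistent ratings
```

    When such a check is skipped, large rater-to-rater gaps pass silently into the final scores, which is exactly the door to bias and chance the authors describe.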

    If such precautions are not taken, or are insufficiently
    taken, then evaluations based on metrics can appear
    objective even though they in fact reflect only bias or
    chance. The RTT 500-point system suffers from several
    such deficiencies.

    RTT weights

    One source of false precision in the use of metrics for
    evaluation stems from the arbitrary assignment of weights
    to various indicators in a system. Some index systems
    make weights more credible by basing them on a survey
    (of opinion leaders, public officials, or the general public),
    asking respondents for their judgments regarding the
    relative importance of a list of factors, and then averaging
    the weights that respondents chose.
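    The survey approach described above reduces to averaging each respondent's weights and renormalizing so the result still sums to the full scale. A small sketch with invented respondents and factor names (only the averaging idea comes from the text):

```python
from statistics import mean

# Three hypothetical respondents assign importance (in %) to three factors.
responses = {
    "standards": [40, 30, 35],
    "data_use":  [35, 40, 30],
    "lunches":   [25, 30, 35],
}

# Average each factor's weights across respondents.
avg = {factor: mean(vals) for factor, vals in responses.items()}

# Renormalize so the averaged weights sum to exactly 100%.
total = sum(avg.values())
weights = {factor: 100 * w / total for factor, w in avg.items()}
print(weights)
```

    Averaging does not make the weights objectively correct, but it does make them traceable to a defined population of judgments rather than to one official's preferences.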

    In the case of RTT, Secretary Duncan and his staff
    chose provisional weights and then revised them after
    reviewing suggestions submitted by members of the
    public as part of a formal regulatory comment period.
    Several of the revisions made in this fashion made sense,
    but other well-founded suggestions were ignored.4 These
    arbitrary weights have enormous consequence.
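    The consequence of arbitrary weights is easy to see with two hypothetical states and two equally defensible weightings: the ranking flips. All numbers below are invented for illustration:

```python
# Two hypothetical states rated 0-100 on two categories.
# "State X" is stronger on charters; "State Y" is stronger on data systems.
scores = {
    "State X": {"charters": 90, "data": 60},
    "State Y": {"charters": 60, "data": 90},
}

def overall(state, weights):
    """Weighted composite: sum of category score times category weight."""
    return sum(scores[state][c] * w for c, w in weights.items())

weights_a = {"charters": 0.6, "data": 0.4}  # one plausible choice
weights_b = {"charters": 0.4, "data": 0.6}  # an equally plausible choice

print(overall("State X", weights_a), overall("State Y", weights_a))
print(overall("State X", weights_b), overall("State Y", weights_b))
```

    Under the first weighting State X wins; under the second, State Y does. Nothing about either state changed, only the weights, which is the sense in which the choice of weights decides the competition.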

    The RTT 500-point system

    The RTT 500-point system, shown in Table 1, has six
    major categories, seven general categories, and various
    subcategories. The primary weighted metrics consist of
    the 30 categories whose points are shown in italics. The
    first column in Table 1 is a list of the various categories
    selected by the Department of Education. We raise many
    questions below concerning the particular categories
    chosen, but this listing and its subjective evaluation are a
    reasonable first step by the Department to describe how
    it believes states should proceed to improve their educational
    programs. However, by assigning numbers to this
    process, the Department implies it has a testable theory
    or empirical data to back up its quantitative method.
    By making RTT a competitive system, the Department
    then locks itself into accepting the numerical scores as
    the specific criteria for selecting winners. The
    subjective judgments required both for category selection
    and weight assignment make a fair competition practically
    impossible, even if the competition is undertaken
    with great care. . . .



    For the rest of this paper, including the charts, go to the URL below.

    William Peterson (bpeterson1931@yahoo.com) is a retired marine engineer with over 35 years' experience in the management and maintenance of large commercial tankers and Navy ships, a lifelong interest in education, and in the use and misuse of numbers--especially by managers.

    "Richard Rothstein (riroth@epi.org) is a research associate of the Economic Policy Institute.

    — William Peterson and Richard Rothstein
    Economic Policy Institute
    2010-04-20
    http://epi.3cdn.net/4835aafd6e80385004_5nm6bn6id.pdf




    FAIR USE NOTICE
    This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available in our efforts to advance understanding of education issues vital to a democracy. We believe this constitutes a 'fair use' of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 107, the material on this site is distributed without profit to those who have expressed a prior interest in receiving the included information for research and educational purposes. If you wish to use copyrighted material from this site for purposes of your own that go beyond 'fair use', you must obtain permission from the copyright owner.