Monday, November 14, 2011

Is Kentucky Committing to a Value-Added Accountability System?

As regular KSN&C readers may know, I'm a bit conflicted when it comes to value-added accountability. On one hand, as a principal I wanted as much data as I could get my hands on to tell me how our school's performance was trending and how we compared to other schools like ours. I also wanted data on how the buses were running, how long a student had to wait in line for lunch, and how long an assembly would run. Data, data, data. On the other hand, I had no interest in using that data as a hammer over my teachers' heads, mostly because I knew and respected the technical limits of student performance data and did not try to make it mean more than it did. To knowingly misuse test data is, in my mind, anti-scientific.

Still, there is a certain allure to finding a set of measures that are based on individual students' prior performance, rather than whatever subgroup the student belongs to. Such data might allow us to buffer the powerful effects of poverty on classroom performance - at least, a whole lot better than we currently do.

At present, comparing the relative performance of any two classrooms of students is always skewed by the uneven distribution of poverty, among other factors. There is no way to account for the non-school differences from one class to another. But what if a 6th grade teacher knew on the first day of school that her class had an incoming aggregate performance of 580 (where 600 was expected for a 6th grade class in which everyone was beginning on level, but no better)? What if the teacher next door had a first-day class aggregate of 615? Both teachers might better understand how much value they added to their classes, having already accounted for differences in poverty and the like. If the first teacher ends the year with students collectively scoring 700 (the expected starting point for 7th grade) and the second teacher does too, then it is easier to see that the first teacher added more value (+120) than did the second (+85), who presumably had a "better" class to begin with.
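To make that arithmetic concrete, here is a minimal sketch in Python. The 600-per-grade scale, the function name, and the class aggregates are hypothetical illustrations taken from the example above, not from any actual Kentucky system:

```python
# A minimal sketch of the growth comparison described above. The scale
# (600 = on-level start of 6th grade, 700 = on-level start of 7th grade)
# and the class aggregates are hypothetical numbers from this post.

def value_added(incoming_aggregate: int, outgoing_aggregate: int) -> int:
    """Growth a class showed over the year: end-of-year aggregate minus
    first-day aggregate, so each class is measured against its own start."""
    return outgoing_aggregate - incoming_aggregate

teacher_one = value_added(incoming_aggregate=580, outgoing_aggregate=700)
teacher_two = value_added(incoming_aggregate=615, outgoing_aggregate=700)

print(f"Teacher one added {teacher_one:+} points")  # Teacher one added +120 points
print(f"Teacher two added {teacher_two:+} points")  # Teacher two added +85 points
```

The point is only that each teacher is measured against her own students' starting line, rather than against a class that may have begun the year in a very different place.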

I had occasion to ask CCSSO chief Gene Wilhoit about this in 2010. I asked him to imagine such a system and he had no trouble doing so, even indicating that he thought it could be done. Still, I realize even this proposal is likely fraught with its own pitfalls, and I only offer it as an improvement over value-added systems that ignore the effects of outside factors.

What's in a Name?

In 2009, the Kentucky General Assembly - which is solely responsible for the public school system - passed Senate Bill 1, marking the "beginning of a new era in public school assessment and accountability," according to the Kentucky Department of Education's website.

Senate Bill 1 addresses what will be tested; how subjects will be tested; when tests are given; what should comprise the public school accountability system; and more...

KDE, CPE, and ESPB are looking for "a strong, effective system of assessment and accountability that provides valuable information to a wide range of users, from educators and parents to legislators and citizens and more."

KDE spokeswoman Lisa Gross told KSN&C that she would not categorize Kentucky's teacher accountability plan as a value-added system and that Commissioner Holliday has not done so either. But, she admits, "student academic growth would be one component." 

Holliday recently told the Associated Press, "We're very confident that Kentucky's application...meets all of the criteria outlined by the U. S. Department of Education. And, we believe that our public school accountability model exceeds the basic requirements..."

The proposed Professional Growth and Effectiveness System (PGES) would focus on four areas: Instruction; Learning Climate; Leadership and Professionalism; and Student Growth. Holliday has said, "proficiency on standardized tests is not always the best way to measure teacher impact. A more valid and reliable method is growth."

Implementing Senate Bill 1 was always going to be a challenge, given the General Assembly's failure to fund it. If it weren't for federal funding, I'm not at all sure how the Kentucky Board of Education could have built a new test. One supposes Kentucky Education Commissioner Terry Holliday faced two choices: go get some funding, or get nothing done.

Kentucky made its decision to latch onto the national school reform movement in February of 2010, when Holliday hitched the state's wagon to the common core standards and told Kentuckians wishing to know where the state is headed educationally to look to KDE's application for Race to the Top grant funds for a road map.

Following failed attempts to secure the large Race to the Top grant, the state has now applied for flexibility in implementing the deeply flawed NCLB. But, as with RTTT, to get the goodies, KDE has to make some promises. 

The big funders, the US Department of Education and the Bill and Melinda Gates Foundation, are fans of value-added assessments. In granting Tennessee $500 million in RTTT funds, Secretary Arne Duncan praised the state's value-added teacher evaluation system and thanked them for having “the courage, capacity and commitment to turn their ideas into practices that can improve outcomes for students.”

Kentucky's system will be piloted statewide next year, with full implementation in 2013-14. The state has not determined what percentage of a teacher's evaluation will come from student assessments. That decision will come after the pilot phase, Gross said.

When the PGES is fully implemented, teacher and principal performance in each standard will be rated according to four performance levels: ineffective, developing, accomplished and exemplary. Teachers and principals will receive overall performance ratings based on their success in meeting the standards. Draft frameworks for the teacher and principal portions of the system are here.

Expect those ratings to be published. 


Are Teachers Replaceable Widgets?

Value-added systems pose major technical problems for assessment scholars, problems that most politicians and school reformers alike seem perfectly willing to faintly acknowledge and then fully ignore. According to American Progress,
High-stakes uses of evaluation results put a premium on fairness and validity of new performance-evaluation systems, and the use of multiple measures of performance can bolster these qualities. But placing too much weight on any one measure can easily undermine fairness or validity...


How much weight value-added estimates can bear without sending a nascent evaluation system crashing down is an open question. One technical reason is that the relationship between value-added estimates and other measures of performance depends on the features of the student-achievement tests involved. Better tests should support more weight, by and large. But such technical matters will be of purely academic interest if teachers are publicly identified with value-added estimates.
Sherman Dorn recently reminded us of the widget effect:
You don’t have to be as flip as Leo Casey to see the problem with Jocelyn Huber’s op-ed in the Tennessean today, which is a generic, bland defense of tying student test scores to teacher and principal evaluation. Huber’s op-ed is almost certainly in response to Monday’s NYT article by Michael Winerip, which identifies (and dramatizes) the concerns of a number of Tennessee educators in the state’s new evaluation system. Like Florida’s and Colorado’s, the Tennessee system has a number of arcane pieces to the algorithm tying test scores to evaluations, and like those and other states, it’s jerry-built.


On the one hand, on principle I think student outcomes should play a role in evaluation. On the other hand, there is something naive or creepy going on when advocates of doing so leave out all the caveats and problems in plunging in without caution. Or, to quote someone with whom I often disagree,
None of this is cause to shy away from incorporating value-added metrics into teacher evaluation and pay. But it’s cause to move deliberately, encourage experimentation, and note that respected, knowledge-based firms like Apple and 3M don’t try to drive all their employees’ evaluations or pay off a handful of uniform data points.
Rick Hess was right in his comments in April, especially the last one: for everyone who cheered the Widget Effect report blasting evaluations and HR policies that treated teachers in standardized fashion, I hope you’re all standing up and fighting evaluation algorithm fetishes in Tennessee, Florida, and elsewhere. Because when you look at it, there is nothing more widgety-absurd than imputing fourth-and-fifth-grade reading scores for the evaluation of a kindergarten teacher or an arts teacher.


All these sparkly-new teacher evaluation systems that put a heavy weight on student test scores for every teacher, willy-nilly? The new widget effect.
And once the new system begins producing data, media outlets will ramp up the consequences.

In New York, freedom of information requests from The New York Times, The Wall Street Journal, the New York Post, the New York Daily News, and local news channel NY1 mirrored those of the Los Angeles Times. The media believe they serve the public interest in improving schools by publishing, by name, teacher ratings based on value-added estimates. Again from Dorn...
At first glance the idea seems to possess intuitive appeal. After all, research using value-added estimates shows that teachers are the most important school-based driver of students’ academic success. So why not turn teacher ratings based on value-added estimates into a vehicle by which interested parties, especially parents, might pressure school officials into making tough, school-improving decisions?


But the decision to publish this information is in fact not so simple. As value-added measures become an accepted component of teacher evaluations, states and school districts will increasingly have to grapple with the question of how much information should be made available to the public and how much should remain private because of the nature of the information about individual teachers.
What Price for Flexibility?
Or, We're Number One!

This from a KDE press release:

FINAL NCLB FLEXIBILITY APPLICATION SUBMITTED
The Kentucky Department of Education (KDE) has submitted the state’s application for flexibility under the Elementary and Secondary Education Act (ESEA) of 1965, which was reauthorized in 2001 as the No Child Left Behind (NCLB) Act. The application and related appendices may be seen on KDE’s Unbridled Learning page, [all the way at the bottom of this page].


To help states move forward with education reforms designed to improve academic achievement and increase the quality of instruction for all students, in September, President Barack Obama and U.S. Education Secretary Arne Duncan outlined how states can get relief from provisions of NCLB in exchange for serious state-led efforts to close achievement gaps, promote rigorous accountability and ensure that all students are on track to graduate college- and career-ready.


The deadline for submission of the flexibility request was November 14, and the U.S. Department of Education will review applications in December. States can request waivers of 10 provisions of NCLB, including determining Adequate Yearly Progress (AYP), implementing school improvement requirements, allocation of federal improvement funding and more.


States must address four principles in their requests for flexibility:
* college- and career-ready expectations for all students
* recognition, accountability and support for schools and districts
* support for effective instruction and leadership
* reduction of duplication and unnecessary reporting requirements


Since the passage of NCLB, Kentucky has used a two-tiered accountability model for its public schools and districts that provides both state- and federal-level designations. If the state’s application for flexibility is accepted, the Unbridled Learning Accountability Model would provide a single designation for both state and federal purposes.
According to the USDOE:
"Each State that receives the ESEA flexibility will set basic guidelines for teacher and principal evaluation and support systems. The State and its districts will develop these systems with input from teachers and principals and will assess their performance based on multiple valid measures, including student progress over time and multiple measures of professional practice, and will use these systems to provide clear feedback to teachers on how to improve instruction." 
See more details on the terms of the flexibility opportunity here.


Value-added accountability is now being tied to NCLB waivers in some states, and that has generated fresh anxiety from several quarters. Virtually every federal objective since the Johnson administration has been leveraged with money, and the states could not resist. Now, the feds are leveraging the granting of waivers from the most objectionable parts of NCLB to force their objectives into play. It's good to be the king.


Beyond the technical issues previously mentioned, some smell a different kind of rat, exemplified by this from Jim Horn at School Matters:

Is Your State Ed Department Off to See the Wizard? 
Agricultural statistician Bill Sanders has been on a non-stop sales campaign since 1992, when he closed his first state deal in Tennessee for his value-added assessment system, while managing to maintain proprietary control over the statistical formulae. Since 1992, Sanders has written widely about his model but rarely in peer-reviewed journals where his calculations would be subject to professional scrutiny. Educators and legislators are finally starting to ask questions, particularly since the stakes have been raised, with student and teacher well-being made dependent upon secret calculations. Today's piece at Valerie Strauss's page is a good example. Even the mainstream press is starting to print the views of those outside the oligarchs' workshops. A clip from Sunday's Plain Dealer:
. . . .Value-added has been widely criticized because SAS Inc., the outside company that Ohio pays to calculate it, keeps part of the calculation process a secret. Since the company refuses to disclose it, many educators refer to it as a "black box." Kenston's Lee said it reminds him of the Wizard of Oz working his magic behind a curtain. "The data goes behind the curtain with the wizard," he said. "Then the wizard comes out and says who made it and who didn't." But John White, a senior manager for value-added at SAS, said the company's formulas are all published and available on the Ohio Department of Education website and can be replicated by anyone with the expertise to do it.
Pioneered by William Sanders, a former University of Tennessee professor now working for SAS, value-added is part of a nationwide shift over the last 10 years toward measuring not just whether students pass or fail a test, but how much they have learned or improved over time. It is gaining in popularity nationwide. SAS also calculates value-added scores for every public school in North Carolina, Pennsylvania and Tennessee, as well as for individual districts in 13 other states. . . .
Writing elsewhere, Horn adds,
The Business Roundtable's Rube Goldberg plan for evaluating teachers in Tennessee (and other venues with RTTT money) has met with almost universal disdain, a response that has brought CEOs running out of their penthouses to dictate responses for the corporate media editorial pages.
The New York Times, which never saw a corporate education idea it didn't like, published its own tribute to the current effort by the Oligarchs of Ed on Saturday, with an editorial in support of Tennessee's ridiculous plan based on "value-added" test scores and an observation instrument that assures that thousands of good teachers will lose their tenure, if not their jobs. And thousands more will lose their love of teaching.
From the Times education experts on Saturday, which, with a date or two changed, could have been written any time during the past 30 years of school and teacher bashing: Tennessee’s need to do better was underscored when the latest National Assessment of Educational Progress, also known as the nation’s report card, ranked the state near the bottom in fourth-grade math performance, just ahead of Alabama, Louisiana and Mississippi. These dismal results — slightly worse than those reported in 2009 — were made public earlier this month during legislative hearings on the evaluation system.
Yes, Tennessee test scores are abysmal, but, first off, let's at least get the ranking correct. As you can see from the scores for 1992-2011 below, NAEP numbers have been flat for a long time, with the flattest stretch coming during the most recent and most intense test-and-punish period of the NCLB conflagration.


Tennessee NAEP Math
1992--211
1996--219
2000--220
2003--228
2005--232
2007--233
2009--232
2011--233


Tennessee NAEP Reading
1992--212
1994--213
1998--212
2002--214
2003--212
2005--214
2007--216
2009--217
2011--215


Most interesting, too, is the fact that the Sanders Miracle of value-added testing (TVAAS) started in 1992...


So with Tennessee test scores pancaked, some other states without the benefit of the Sanders Miracle have moved ahead, and their progress [from 2009 to 2011] lowered Tennessee's rankings even further:
• From 45th to 46th in the nation in fourth-grade math
• From 39th to 41st in fourth-grade reading
• From 43rd to 45th in eighth-grade math
• From 34th to 41st in eighth-grade reading.


Most interesting for those who pretend to believe that more failed corporate education reforms will improve learning in Tennessee, or anywhere else, are the facts related to education spending in Tennessee, whose rank exactly matches Tennessee's testing rank among states.


A Compendium of State Education Rankings ranks Tennessee 44th in per pupil spending at $6,855 (based on NEA data from 2005).


Another coincidence, no doubt: a report earlier this year by U. S. News ranked Tennessee's kids, that's right, 44th in "smartness" based on NAEP data.


And the biggest coincidence yet that relates to Tennessee's ranking on tests? Only 5 other states (Louisiana, Alabama, Mississippi, New Mexico, and Arkansas) have higher percentages of children living in poverty than Tennessee, which is tied with Texas, South Carolina and Kentucky with 26% of its children living in poverty.
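For what it's worth, Horn's "flat" characterization is easy to check against the very numbers he quotes. Here is a minimal Python sketch using only the Tennessee NAEP averages listed above; the dictionary names and the choice of comparison years are mine, purely for illustration:

```python
# A small check of the "flat scores" point above, using only the Tennessee
# NAEP averages quoted in Horn's post. Illustrative arithmetic only.

math_scores = {1992: 211, 1996: 219, 2000: 220, 2003: 228,
               2005: 232, 2007: 233, 2009: 232, 2011: 233}
reading_scores = {1992: 212, 1994: 213, 1998: 212, 2002: 214,
                  2003: 212, 2005: 214, 2007: 216, 2009: 217, 2011: 215}

def net_change(scores: dict, start: int, end: int) -> int:
    """Difference between the scores reported for two years."""
    return scores[end] - scores[start]

print("Math, 1992-2011:", net_change(math_scores, 1992, 2011))        # +22
print("Math, 2003-2011:", net_change(math_scores, 2003, 2011))        # +5
print("Reading, 1992-2011:", net_change(reading_scores, 1992, 2011))  # +3
print("Reading, 2003-2011:", net_change(reading_scores, 2003, 2011))  # +3
```

The gains since 2003, the period covered by NCLB-era testing, amount to a handful of points on both lists, which is the flatness Horn is pointing at.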

1 comment:

Anonymous said...

As a Kentucky Educator, this sort of thing really scares me because the Commissioner and those working with him are willing to embrace and force implementation of just about anything coming down the pipe in the most reactionary, confusing manner possible.