Saturday, March 03, 2012

Applying a Precise Label to a Rough Number

This from Michael Winerip in the New York Times:

I’m delighted that the New York City Education Department has released its teacher data reports. Finally, there are some solid numbers for judging teachers.

Using a complex mathematical formula, the department’s statisticians have calculated how much elementary and middle-school teachers’ students outpaced — or fell short of — expectations on annual standardized tests. They adjusted these calculations for 32 variables, including “whether a child was new to the city in pretest or post-test year” and “whether the child was retained in grade before pretest year.” This enabled them to assign each teacher a score of 1 to 100, representing how much value the teachers added to their students’ education.

Then news organizations did their part by publishing the names of the teachers and their numbers. Miss Smith might seem to be a good teacher, but parents will know she’s a 23.

Some have complained that the numbers are imprecise, which is true, but there is no reason to be too alarmed — unless you are a New York City teacher.

For example, the margin of error is so wide that the average confidence interval around each English rating spans 53 percentiles. This means that a teacher rated a 40 might actually be as dangerous as a 13.5 or as inspiring as a 66.5.
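The arithmetic behind that range is easy to check. A minimal sketch, assuming (as the article's example implies) a symmetric interval 53 percentiles wide centered on the rating:

```python
def confidence_bounds(rating, interval_width):
    """Lower and upper bounds of a symmetric confidence interval
    centered on `rating` and spanning `interval_width` percentiles."""
    half = interval_width / 2
    return rating - half, rating + half

# The article's example: a teacher rated 40, with a 53-percentile interval.
low, high = confidence_bounds(40, 53)
print(low, high)  # 13.5 66.5
```

In other words, the single published number sits in the middle of a band covering more than half the 1-to-100 scale.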

Think of it this way: Mayor Michael R. Bloomberg is seeking re-election and gives his pollsters $1 million to figure out how he’s doing. The pollsters come back and say, “Mr. Mayor, somewhere between 13.5 percent and 66.5 percent of the electorate prefer you.”

There are a few other teensy problems. The ratings date back to 2010. That was the year state education officials decided that their standardized test scores were so inflated and unreliable that they had to use their own complex mathematical formula to recalibrate. One minute 86 percent of state students were proficient in math, the next minute 61 percent were.

Albert Einstein once said, “Not everything that can be counted counts, and not everything that counts can be counted,” but it now appears that he was wrong.

Of course, no one would be foolish enough to think that people would judge a teacher based solely on a number like 37. As Shael Polakow-Suransky, the City Education Department’s No. 2 official, told reporters on Friday, “We would never invite anyone — parents, reporters, principals, teachers — to draw a conclusion based on this score alone.”

Within 24 hours The Daily News had published a front-page headline that read, “NYC’S Best and Worst Teachers.” And inside, on Page 4: “24 teachers stink, but 105 called great.”

The publication of the teacher data reports is a defining moment. A line has been drawn between those who say, “even bad data is better than no data,” and those who say, “Have you no shame?” ...

1 comment:

Anonymous said...

So how exactly is this sort of public tarring and feathering supposed to attract the best and brightest to our profession? The "good" educators I know who have weathered a few rounds of curriculum and assessment shuffling tell me they do it because they feel they are making a difference, and for the appreciation local folks have for their efforts. If we are going to have the papers misrepresent teaching effectiveness based on completely unreliable scores, then I don't see why the current folks will stay in the profession either.

I don't see how we can take months to build a case against a superintendent for vote buying and obstruction of justice, but we can throw a teacher under the bus based on a couple of test scores. SHAME SHAME