Wednesday, February 24, 2010

Hoxby's Hocus Pocus

A few weeks ago I was discussing Caroline Hoxby's most recent study with a fellow blogger, and I felt compelled to confess that I couldn't read it. My blogger buddy said he only reads the executive summaries. But that seemed wrong to me - like asking to be lied to.

Here's why:
Hoxby's regression formula is Greek to me.

So how does one determine whether her methodology is useful? How does one determine if it rises to the level of the "gold standard" as Hoxby claims?

The typical approach is to have professional papers reviewed by knowledgeable peers before the findings are published in a juried journal. But not these days. When somebody gets a whiff of confirming evidence, straight into the news cycle it goes.

So what's a lowly blogger to do? Is this the greatest study of all time or just a bunch of malarkey? Perhaps it's somewhere in the middle.

For help, I decided to seek out the most neutral parties I could find. I was looking for some folks who were very knowledgeable technically, but who were totally disinterested in any particular outcome. I wanted people with strong mathematical/statistical backgrounds. I wanted folks who knew how unbiased research ought to be conducted. I wanted folks who couldn't care less whether it was a good study or a bad one. I found them among the postdocs of a certain research university I know. Not in the education department. Not in economics. Pure mathematicians. These are not political people, and they are unaware, so far as I know, that there is even a debate over charter schools.

I sent Hoxby's 2007 technical report with no commentary other than a request for a review with comments. Here are the early returns, edited to remove any identifying details:

Sorry it's taken me so long to get back to you...I did read through the paper and I have to say I'm not very impressed.

The entire paper is convoluted, and it's hard for me to decipher exactly what the findings are and what kind of implications are conjectured based on the findings. That aside, the mathematical approach doesn't appear to be very rigorous, and there is no clear explanation of the actual mathematical tools employed.

Again, I am not a statistician, so I can't argue the validity of their methods because I don't understand them myself. I can, however, tell you that the explanations of their methodology are poorly written and give the impression that they might not understand their methods either.

For example, when explaining the variables used in the 'estimating equations', the variable epsilon_i is not defined but is claimed to "remind us of the robust standard errors clustered at the student level". This is not a definition, and it is not the rhetoric of a mathematician or statistician. Further, there is no reference to what the "exhaustive set of lottery fixed effects" and other fixed effects are.

Several times, unsubstantiated statements are made, such as "we believe that we should be able to match only about 90% of students, plus or minus a few percents" (p. 13), "Class size has an association with achievement effects that is estimated with a fair degree of precision" (p. 35), and "we estimate the magnitude of the understatement to be about 8 percentage points" (p. 15). Why do they believe that a 90% match is 'good enough'? Where did that statistic come from? What is a 'fair degree of precision'? How did they estimate the understatement to be 8 percentage points?

These statements alone are enough for me to discount the entire validity of the paper, because if we can identify one statistic that was pulled out of thin air, why should we believe that all the others aren't as well?

Also, for several results, reasons are given as to why the result should not be regarded as significant. For example, the authors clearly state why policy effects should be discounted (p. 35). However, they go on to evaluate policy effects as well as speculate on reasons for the results. If policy effects should so clearly be disregarded, then they shouldn't be analyzed.

Finally, I saw many instances of biased remarks, which should be carefully avoided when evaluating statistics.

So there you have it.
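For readers more fluent in this notation than I am, a note on what the reviewer is referring to: in lottery-based charter studies, the 'estimating equation' is typically something of this general form (a generic sketch of the standard setup, with variable names of my own choosing - not necessarily Hoxby's exact specification):

\[
A_i = \beta \,\text{YearsInCharter}_i + \sum_{l} \delta_l \,\text{Lottery}_{il} + X_i'\gamma + \epsilon_i
\]

Here A_i is student i's achievement score, YearsInCharter_i is the time the student has spent in a charter school, the Lottery_il indicators (one per admissions lottery the student entered) are the 'lottery fixed effects', X_i collects any student covariates, and epsilon_i is the error term, with standard errors clustered at the student level. The reviewer's point is that the paper leans on these pieces without spelling them out this plainly.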

The paper has now been passed on to another department at the same university for a confirming/disconfirming review. I'll let readers know if we learn anything from that.
