Sunday, October 19, 2008

Innes smacks Kifer. Kifer smacks back.

We recently had a little dust-up over testing between education analyst Richard Innes and Georgetown/UK Professor Skip Kifer.

The Bluegrass Institute's Richard Innes argues that credibility and stability problems with the CATS are reason enough to dump it. In this argument, the GREAT is the enemy of the GOOD. Innes pounds on the CATS' several flaws, hoping that by demanding perfection he can doom the instrument's utility. He thinks there's a better idea: just throw the CATS out. Maybe replace it with the ACT. After all, the ACT is now a super test with super powers. Just ask ... the folks at ACT.

Ben Oldham recently wrote, "Since the ACT is administered to all Kentucky juniors, there is a tendency to over-interpret the results as a measure of the success of Kentucky schools."

Innes assisted that over-interpretation mightily, and decided he would "school" Professor Oldham, saying,
Oldham pushes out-of-date thinking that the ACT is only a norm-referenced test. The ACT did start out more or less that way, years ago, but the addition of the benchmark scores, which are empirically developed from actual college student performance to indicate a good probability of college success, provides a criterion-referenced element today, as well.

Well, I'm no testing expert, but even I knew that was wrong. Wrong enough that I began to worry that BGI's testing expert might have some holes in his own preparation.

KSN&C responded,

The problem of over-interpretation has been somewhat exacerbated by the inclusion of benchmark scores in the ACT. But benchmarking does not change the construction of the test nor the norming procedures. It does not turn the ACT into a criterion-referenced exam as Innes tries to suggest...
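A quick aside for readers who haven't followed the Benchmark debate: ACT describes its Benchmarks as roughly the scores at which a student has about a 50 percent chance of earning a B or better (and about a 75 percent chance of a C or better) in a corresponding first-year college course. Here is a toy sketch in Python of what that kind of cut-point calculation looks like. The data are simulated and the model is a plain logistic regression, so treat it as an illustration of the idea, not ACT's actual methodology. Notice that everything happens after the test is scored; the Benchmark is an analysis bolted onto existing scores, not a change in how the test is built or normed.

# Toy illustration of a "benchmark" cut score: fit the probability of
# earning a B or higher against ACT score, then find the lowest score
# where that probability reaches 50%. Simulated data; not ACT's method.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
act = rng.integers(10, 37, size=2000)                  # composite scores 10..36
true_prob = 1 / (1 + np.exp(-0.35 * (act - 22)))       # made-up relationship
earned_b = (rng.random(2000) < true_prob).astype(int)  # 1 = B or higher

X = sm.add_constant(act.astype(float))
fit = sm.Logit(earned_b, X).fit(disp=False)

scores = np.arange(1, 37)
probs = fit.predict(sm.add_constant(scores.astype(float)))
benchmark = scores[np.argmax(probs >= 0.5)]            # first score crossing 50%
print("Illustrative benchmark score:", benchmark)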

The National Association for College Admission Counseling Commission, led by William Fitzsimmons, dean of admission and financial aid at Harvard University, issued a report last month that sparked a lot of conversation about how tests like the ACT and SAT were being misused.

This is a recurring theme in the college admissions business, but this year's report warned that the present discussion of standardized testing has come to be “dominated by the media, commercial interests, and organizations outside of the college admission office.” Some of those groups have other items on their agenda.

Skip Kifer mentioned the report's warning about over-emphasizing test scores and argued for prudent use of the ACT in selecting students. He also warned folks not to get snookered by ACT officials' new claims that, without having changed the nature of the test, their scores now tell whether a student "meets expectations" or is "ready" to attend college.

As Kifer pointed out,

The benchmark stuff is statistically indefensible. Hierarchical Linear Modeling (HLM) was invented because people kept confusing at what level to model things and how questions were different at different levels. The fundamental statistical flaw in the benchmark ... is that it ignores institutions. Students are part of institutions and should be modeled that way.
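For anyone whose eyes glaze over at "hierarchical linear modeling," the gist can be sketched in a few lines. Below is a rough Python illustration, on simulated data (not ACT's analysis, and certainly not Kifer's), of the difference between a single-level regression that treats every student as an independent observation and a two-level model that gives each institution its own intercept. The latter is what "modeled that way" means: students are nested within institutions, and the institution matters.

# Contrast: pooled regression that ignores schools vs. a two-level
# mixed-effects model with a random intercept per institution.
# All data simulated for illustration only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for school in range(40):
    school_effect = rng.normal(0, 0.5)               # institution-level difference
    act = rng.normal(21, 4, 50).clip(1, 36)
    gpa = 1.0 + 0.08 * act + school_effect + rng.normal(0, 0.6, 50)
    rows += [{"school": school, "act": a, "gpa": g} for a, g in zip(act, gpa)]
df = pd.DataFrame(rows)

# Single-level: every student treated as independent, institutions ignored
pooled = smf.ols("gpa ~ act", data=df).fit()

# Two-level: students nested within institutions (random intercepts)
nested = smf.mixedlm("gpa ~ act", data=df, groups=df["school"]).fit()

print("Pooled estimate:", round(pooled.params["act"], 3), "SE:", round(pooled.bse["act"], 4))
print("Nested estimate:", round(nested.params["act"], 3), "SE:", round(nested.bse["act"], 4))

Run it and the point estimates for the ACT effect come out similar, but only the two-level model accounts for (and reports) the variation between institutions, which is exactly the piece that gets papered over when student-level results are presented as though institutions didn't exist.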

Innes persisted and tried to play it off, saying,

I guess Kifer and his compatriots at Georgetown College ... will never get the idea behind the ACT’s Benchmark Scores. It really isn’t hard to understand the Benchmarks.

This is when I should have suspected an academic spittin' contest was on the way and somebody was going to have to put up or shut up. (Actually, in the blogosphere, nobody really shuts up, but you know what I mean.)

Stung by the suggestion, and always the professor, Kifer challenged Innes to "Describe the statistical models used to determine the ACT benchmarks." The question would show whether Innes understood the nature of Kifer's argument and, in Kifer's view, would disprove Innes's claims at the same time.

Innes said he was waiting for some information from ACT (which in my experience is a lot like waiting for Godot) and changed the subject to how lousy the CATS is.

We're still waiting to hear him defend his position that the ACT Benchmarks constitute a valid criterion-referenced test and that "the Benchmarks can fairly be considered a real measure of proficiency." That may be like waiting for Godot as well.

KSN&C has taken the position that all social science tests are imperfect. This includes the CATS, the KIRIS, the ACT, the SAT, the CTBS, the NAEP, and the EIEIO.
