Sunday, March 16, 2008

Guskey on Why the ACT is Inappropriate for State-wide Accountability

In today's Herald-Leader, CASA assessment guru Tom Guskey points out the problems associated with using tests for purposes that run counter to their design. Without mentioning Senate Bill 1, he shows why it won't fix what's broken.

Selecting school tests a matter of competence

...What's missing in these discussions about testing is a clear understanding about what...certain tests can and cannot do.

...College entrance exams such as the ACT and SAT help colleges and universities decide whom to admit [but they] do not reflect any particular level of knowledge... rather where each student ranks in relation to others. Ranking makes the selection process easier.

Problems arise when a test designed for one purpose is used for another. ... tests like the ACT and SAT are labeled "instructionally insensitive." If instruction helps most students answer a question correctly, then that question is removed from the test, for it no longer serves its purpose. Even if the question asks about a vitally important concept, it no longer differentiates students and is eliminated.

This is why scores on selection tests are more strongly related to social and economic factors than are scores on competence tests. Aspects other than those influenced by instruction often account for the differences among students. It is also why it makes little sense to use a selection test like the ACT or SAT as a measure of the quality of instructional programs. Doing so would be analogous to using a ruler to measure a person's weight.

Having all students take a selection test such as the ACT or SAT may help some realize that they rank high enough to get into a college or university. That would be a good thing, especially for non-traditional students and those who come from disadvantaged backgrounds.

But to use the results of an "instructionally insensitive" selection test to assess the quality of instructional programs is educational sacrilege. No testing expert would agree to it -- and neither should any legislator or policy-maker.

2 comments:

Anonymous said...

Here is an open-response question for readers of this blog entry.

Do you agree with Dr. Guskey's comments?

Be careful! This question requires higher order thinking to answer.

Consider: now that the ACT provides benchmark scores, is it still just a norm-referenced assessment as Guskey asserts, or is it something new -- a hybrid NRT/CRT? What are the implications of that?

Also, explain why Dr. Guskey might want to duck this obvious benchmark issue raised by the ACT, EXPLORE and PLAN tests. Since all three tests are in use statewide in Kentucky, how do you think Dr. Guskey missed this important testing development?

Here are some more thoughts. Can, and should, teachers really take pride in CATS results for their own students? Don't jump on the answer until you read below.

The KEA Lobby Team's position paper on Senate Bill 1 says that putting in a test with individual student validity and reliability would enable programs to evaluate teachers using those scores. The union is solidly against that. As a consequence, the KEA clearly implies CATS cannot be used to evaluate classroom teachers because it does not generate sufficiently valid and reliable classroom level scores.

Do you agree with Dr. Guskey or the union? Side with the union and you can pretty much forget Guskey's comments about individual teachers taking pride in CATS scores. Side with Dr. Guskey, and you just might give the legislature a green light to hold you accountable with CATS.

Don't you just love these real world open response questions?

Richard Day said...

Wow!

Well, first, Guskey's right...but I don't have time to go into all of that now. (I'm lecturing on education in antebellum Kentucky this morning...and need to get my head into that.)

In the meantime consider this: NRT v. CRT is not an either/or proposition. The ACT is normed. It is an NRT...and the benchmarks do not make it anything else.

The benchmarks you refer to are not criterion referenced - they refer to the relationship between a student's ACT score and the grades that student eventually earns in college. The sample of students used to derive the benchmarks is NOT scientific. It is a convenience sample. In fact, ACT says "there is no guarantee that it is representative of all colleges in the U.S."

The ACT exists to help colleges predict which students are most likely to be successful. It does so by weeding out students - using a curriculum (criteria) that the students may or may not have been taught.

Here's some homework:
http://www.act.org/research/policymakers/pdf/benchmarks.pdf