Sunday, February 01, 2009

CATS Confusion Abounds

The most important issue - by far - facing returning legislators this week is the state budget. But for some legislators, it is the elimination of CATS. Let's hope the big issue does not get lost within the smaller issue.

After reading various comments in the debate over the fate of CATS, it seems to me that the conversation would benefit from a little clarification.

The first thing I think of when I hear someone say, ‘Let’s get rid of CATS,’ is that the speaker is either not fully informed about what CATS really is, or the speaker is assuming everyone understands the real meaning of CATS, and is casually misusing the name on purpose.

I suppose it's no big deal when a lay person makes the mistake. In fact, the media and KDE contribute to that misunderstanding in some ways. And clearly, some teachers, who know better, use CATS as a short-hand reference to the Kentucky Core Content Test (KCCT). But it is bothersome when a legislator fails to understand the implications of such a comment.

So let's clarify a couple of things.

First, CATS is a multi-component system designed to produce some measure of school accountability; it is not a test. The Commonwealth Accountability Testing System has many parts that are still going to be with us long after the General Assembly adjourns - particularly the $9 million component required of Kentucky by No Child Left Behind (NCLB).

But CATS is the KCCT, the ACT, the Explore, the Plan, the alternative assessment, writing portfolio…and everything.

Furthermore, some are confused about the relationship between CATS and NCLB. It seems the public is being led to believe these are two separate tests, when in fact, the NCLB scores are produced by the KCCT. It that sense, NCLB is a portion of the Core Content Test.

But this is a case where the public can be forgiven for being confused, because the data are published in two separate reports. The NCLB report comes out early, in order to meet the federal deadline. The KCCT report comes later. Because they are reported separately, the general public seems to think they are separate tests.

But lately, a new idea has been introduced into the conversation - the idea that CATS should be able to compare the performance of Kentucky students to the performance of students in other states.

KSN&C discussed this idea recently with Kentucky Board of Education member, Dorie Combs, who chairs the Department of Curriculum and Instruction at Eastern Kentucky University where I work.

Combs told KSN&C that,

This idea of, ‘Let’s give a test where we can compare ourselves to other states.’ That’s probably the biggest misunderstanding. And it takes a little explanation.

First of all, every state has to come up with a test to measure that state’s standards. So every state has a different test that they use for NCLB.

And they aren’t using off-the-shelf tests to do this. In most cases they’re doing what we used to do. They’re augmenting an off-the-shelf test. So they might have the CTBS, which they have taken and added to; with additional questions. And then adding in, in many cases, open response questions, which is what we do as well.

This makes state-to-state comparisons inappropriate. Combs says,

There’s really only one test out there that compares the states and that’s the NAEP. [The National Assessment of Educational Progress]

Of course, the concept behind CATS was never to compare Kentucky to other states. It’s a criterion test designed to look at how the schools perform. And any inference that CATS was intended to, designed to, or has done so, is simply wrong.

Combs added,

Even if we used the ACT: Again, the ACT would not yield a fair comparison because not every state takes the ACT. And we’re [testing] juniors, here. Other states test their seniors.

Combs was recently at a meeting with one of the New York Regents (like Kentucky's school board members) and they discussed the long-standing New York Regents Exam. She learned that,

They know that their high school graduation requirement tests at about the ninth or tenth grade level. What does that do?

Coincidentally, I have two students in one of my EDF 203 sections who both passed their high school proficiency exams in Ohio as eighth graders and, to hear them tell it, were never tested again.

Kentucky chose to give the ACT in the fall and the KCCT in the spring. The idea was to get some performance data early in the junior year with the hope of lighting a fire under some kids going into the spring KCCT testing, where the schools are held accountable.

KSN&C: You know Dorie, the fundamental thing behind this test that was so revolutionary was the notion of measuring how well schools performed as opposed to how well kids performed. Do you think that, if there was adequate funding, that the notion of measuring school performance might give way to the notion of measuring individual performance?

Combs: There are those who take that view; that the best way to get kids to perform is to hold their feet to the fire. But what we’ve seen happening in other states – it doesn’t work that way. Inevitably, you get into this cycle - where some students don’t perform well enough to move to the next level, or pass, or graduate – yet they have good grades. The families start suing the school districts. Saying, ‘How can you say they can’t graduate when you’ve passed them every year and given them good grades?’

And, then [those states] come back, and they change the test. Or, they start letting them take it over and over and over. So every state that’s used a student accountability model eventually falls into this quicksand. And then going back and saying ‘Well we’ve got too many people not passing. There must be something wrong with the test.’

I understand that teachers and some school administrators feel like they should not be put under the stress of the student’s performance.

And some are complaining about how schools use test data to fire young teachers; an idea that probably ought to be challenged in court. Teachers do not control enough of the variables. It’s totally unfair. But if that’s the way the game is going to be played, then let’s put the superintendents under the same stress. Just as superintendents have objected to such plans – on justifiable grounds – so too should the teachers be heard on this issue. What’s good for the lower and middle levels of the organizational chart ought to be true for the top as well.

Combs says,

It concerns me that schools use test score data to fire teachers.

I don’t think you can take that test score and say, ‘It’s your fault.’ Because all the way down the line, you have to say, ‘Well, it’s their fault too.’ We can’t isolate it to teachers. I’m opposed to that approach where all you say is, ‘What does this teacher contribute.’ It can make it difficult for that teacher to work with a child.

So this idea of getting rid of teachers is not coming from [the Kentucky Board of Education].

The idea is to say, ‘What can we do to get better?’ Not to beat everybody up and say, ‘This is your fault.’

And when you look at it, the punishment for the school is usually more money.

...arguably, a disincentive to obtaining a target. Yet the most exalted examples among our superintendents and principals seem to be those who move quickly to get the "right people" on the bus and the "wrong people" off.

Combs contends that,

This idea that beginning teachers are disposable is a shame because it doesn’t give teachers the time to develop into mature, strong teachers because there’s so much you learn in that first year…

At present, the proposal to dump CATS is supported by arguments that it would save the state money.

The Herald-Leader recently opined that Senate President David Williams was being disingenuous when he claimed the state would realize savings by eliminating CATS.

He's just using the budget emergency to get in a little pandering. He's playing to that small segment of Republicans and soreheads who like nothing about the public schools and to a larger segment disenchanted with what they see as an overemphasis on testing.

The Commonwealth Accountability Testing System accounts for about 1 percent of the education budget or about $14 million. Kentucky is facing nearly a $500 million shortfall.

Trying to replace CATS with an off-the-shelf achievement test could end up costing Kentucky millions in federal dollars by putting the state out of compliance with the No Child Left Behind law. There also would be costs and penalties for breaking agreements with current testing contractors.

Killing CATS could dig the state even deeper into the hole.

To put a finer point on the actual dollars, Combs points out that the proposed elimination of CATS,

is not going to help us money-wise, because it doesn’t cost that much... Assessment [in Kentucky] is $15 million all together. We can’t get rid of the whole thing. We’ve still got to address NCLB.

While the Obama administration has been fairly clear that growth models are about to become de rigueur for NCLB, that hasn’t happened yet and it could be a while before it does.

Let’s say they vote to [eliminate CATS]. We still have to keep the components that are required for NCLB. So that means keeping, [grades] 4 through 8, reading, writing; science at elementary [through] high school, and reading and math in high school. So we’ve got to continue that. So now we’re already spending $9 million. Now, we’re only going to save five or six million.

Here’s the likely scenario. There’s not going to be any forethought to - when we take this away - What’s going to be in its place?

So we could well end up having to create one test while paying for another.

But in any case, estimates of how much money could potentially be saved should not exceed $6 million, tops.

In my view, we had an imperfect, but decent enough, assessment until NCLB requirements came in on top of CATS. In fact, NCLB did not benefit Kentucky one bit. We already had Senate Bill 168, which requires the disaggregation of student achievement data. If we had wanted to put sharper “teeth” into SB 168, it would have done everything Kentucky needed.

Combs says she hopes,

that changes might be made [to NCLB] to give us some flexibility. And pull some of this, ‘If you’re not meeting 100% of your goals, you’re failing.’

[I just want to] make sure everyone understands that KCCT fills the requirements of NCLB. There’s really no test that is a national test - other than NAEP - and that NAEP does actually make these state comparisons. That’s the best way to compare state-to-state and those data are available.

Has CATS simply become a bad brand? It certainly seems to this observer that the popular sentiment has turned against CATS.

Combs countered,

If I was a teacher – the best thing to make my life easier - it’d be to get rid of that dang test.

Well….duh. That would make everybody glad.

Does that make it the right thing to do?

Not necessarily.

The Council of Chief State School Officers maintains a careful review of state accountability systems. Turns out,

There’s only one state that uses an off-the-shelf achievement test. And, you know who that is?

I’d like to think that if I had a minute, I would have come up with that one. But I forgot - the Iowa Test of Basic Skills. Of course.


Richard Innes said...


Your post itself has a little confusion.

For example, you write, “It (sic, I think you meant “In”) that sense, NCLB is a portion of the Core Content Test.” Actually, it is more correct to say that the Core Content Tests in reading and math are a portion of NCLB. If NCLB changes in certain ways, the Core Content Tests, or whatever we use to comply with NCLB, may have to change, as well.

Of course, we don’t know what the new administration will do with NCLB, but there was sentiment in the US Department of Education under the old administration to require a review of all state standards and NCLB supporting tests because sharp divergences are appearing between those state tests and the NAEP. CATS is culpable and could have problems with a review.

US Ed has also expressed concern about excessively permissive loopholes caused by unreasonable minimum student sample sizes for subgroup score reporting (the “N” number) and abuse of the confidence interval process (no state pushes this loophole harder than Kentucky). Some of these problems may be driven by the fact that CATS was never designed to generate accurate data for either individual students or small groups of students. CATS is only supposed to be designed to develop a school-wide accountability figure with acceptable accuracy. Thus, Kentucky’s basic assessment may be found unsuitable for NCLB, which clearly requires valid and reliable data for small student subgroups. It will be interesting to see what happens in Washington.

You also write about some states, and maybe some districts here, using the results of state tests to fire teachers. I think you over-simplify what actually is composed of 50 different situations. Because CATS isn’t accurate below the school level, if it is even really accurate at that level, it is not appropriate to use CATS scores as the sole reason for firing teachers in Kentucky. In fact, no one test should ever be used that way.

However, there are examples of much better, longitudinal testing programs that can provide good input to the decision process about whether individual teachers are performing well (let me reemphasize, this must be a process that includes more than just test scores from a single test).

The Tennessee Value Added Assessment System provides one example of a better testing model. TVAAS, as it’s called, uses a very sophisticated statistical process to determine annual student growth over the course of the school term. TVAAS adjusts for those “uncontrollable” factors you mention. Thus, inner city school teachers are treated fairly.

TVAAS also starts with tests that provide valid and reliable scores for individual students, by the way, another failing with CATS. And, those tests can provide national norms, which Kentucky parents want. Other leading states like North Carolina are moving to TVAAS-like systems, as well. Surprisingly, a number of Kentucky school districts are using somewhat similar longitudinal tests – which they have to purchase on their own – as well. However, CATS would have to undergo major changes to ever be useful for longitudinal analysis, something its current supporters like Prof. Combs seem unwilling to contemplate.

To close, let’s talk a minute about CATS costs. I don’t really know what the total cost is. You don’t know – and neither does Dr. Combs. In fact, even the Kentucky Auditor of Public Accounts admits they can’t figure it out, either. However, an OEA report from 2005 indicates the true total cost is probably on the order of $40 million. Some of that cost would still be required if we move to another assessment, but the overall savings could be far more substantial than anyone realizes. For example, the CATS writing portfolios now absorb a huge amount of local resources, both in and out of the local classroom. A more rational approach to teaching and assessing writing could be far less costly, and probably more effective (Don’t believe it? Watch this YouTube:

Anyway, the reason no one knows what CATS really costs is that the MUNIS financial system is doing a deplorable job of capturing education expenses in useful and accurate ways. The new auditor report on CATS contract costs doesn’t specifically say so, but MUNIS is where the auditor had to turn for information, and the auditor freely admits that information simply could not be found. That lack of accurate financial information makes it awfully hard to engage in constructive discussions about CATS.

Richard Day said...


Thanks for catching the typo.

The fact that Kentucky can ill afford to turn down federal funds and therefore had to make adjustments to the KCCT hasn't helped CATS, but it hasn't changed the fact that CATS is the more comprehensive of the two. The NCLB scores are derived from KCCT.

You are correct, of course, that we do not yet know how the Obama administration will approach NCLB. Many states ran into trouble administering regulations under the Bush administration, so that's not particularly remarkable, but is certainly a factor moving forward. I'm betting heavily on a new NCLB growth model. But we'll see.

There's little doubt that Kentucky's foray into this new kind of high-stakes accountability system has produced its share of headaches: the top-down design; the absence of an established curriculum for too many years; our early complaints about KIRIS; confidence intervals; backloading... It's not too tough to find something to fault.

But once CATS became stable (and until NCLB was loaded on top of it) I could rely on CATS to direct specific curriculum changes at the school level and predict future outcomes with some clarity. As you know, some of us even enjoyed the "sport" of predicting next year's results in advance.

As a principal, I wanted all the data I could get. BUT - I also wanted to understand the limitations of its proper use. I like the value-added systems conceptually. But we must not be confused. While they may represent an improvement over present systems they are not perfect either. As you know, Daniel Koretz advises against using such data to terminate teachers. This is doubly true for principals who dump rookies in their first year before they can even get their feet on the ground.

(I'm not saying rookies should never be let go, but the practice is overdone in some cases.)

You know, confidence intervals reminded me of something. Remember the Booker T Washington Academy and the "remarkable progress" they made? That was a school that benefitted substantially from the confidence interval. So it begs the question. Were the gains real?

I'm not sure what CATS changes Dr Combs or other board members might be willing to contemplate if the changes were rolled out in a planned fashion. Why not start now to plan a more comprehensive longitudinal system for 2014?

I am aware of KDE's accounting problems, but I don't think that invalidates KDE's cost estimates for continuing the NCLB portion of the KCCT. I've got to trust that KDE estimates are in the neighborhood...

Thanks for the comment.


Anonymous said...

As a teacher I know that the amount of stress involved with CATS is driving teachers away from the profession in droves. Other professionals not in the classroom have no idea what we are subjected to as teachers by students, administrators, and parents. Everyone is telling you how to do your job, and most of them don't have an inkling about how to teach, period.

I was taught to teach by some wonderful professors and I am a great teacher, yet my professional advice goes completely unnoticed by administrators when we are making educational decisions. SBDM councils are supposed to make decisions for the school and teachers that are a representation of input from all teachers, yet it is seldom the case. Principals still crack the whip. Then if it fails it must have been you. We are jumping through hoops for every research based program the districts can buy. I honestly thought that my college training was research based!

I have taught 22 years and go to physical therapy twice a week just to even be able to work. I had my classroom funds taken back, I have to pay to have a tiny refrigerator in my classroom to use, and I get a whopping 20 minute lunch! This is what I spent over six years in college to do? We don't even have a soda machine to even get a drink during the day. We are worse off now than when I first began teaching, when I actually could not wait to get out of bed and go to school. Now, tell me how much we have advanced because of CATS?

I think there has to be a more adequate solution. Teachers cannot keep absorbing the blows for this test. There are wonderful teachers in Kentucky that care for our kids. They unselfishly buy from their own pockets to make sure the kids have hands-on activities to learn skills and concepts. They spend countless hours they are never paid for and are treated very disrespectfully.

Richard Day said...

Thanks for the comment.

You are expressing what I find most troubling about high-stakes testing. In the hands of a nervous principal and superintendent, it can lead to a kind of top-down intervention that actually gets in the way of effective teaching.

It can also lead to administrative CYA when things go wrong.