By Skip Kifer
SB1 proposes a huge amount of testing. Some testing is statewide; other testing is not. Some tests are used for accountability purposes; others are not. Some tests are multiple-choice only; others are not. There are formative tests, summative tests, performance assessments, benchmark tests, interim tests, norm-referenced tests, criterion-referenced tests, end-of-course tests, and writing portfolios, as well as program reviews and program audits, all used to measure students, schools, and districts.
The Kentucky Department of Education (KDE) is directed to integrate the testing into a new assessment system. It has almost three years to complete the task.
I doubt that a defensible new system can be created, even with that amount of time, without additional resources for KDE and more focused testing.
A first step to building a new assessment is to find out whether what is presently being done works. Major components of the existing system should be evaluated. For instance, EPAS (Explore, Plan, and the ACT) produces results presumed to help students learn and school personnel make better decisions. Does it? What is the evidence that the information is used? What is the evidence that using the information makes a difference? On what? Studies of each major component should provide answers to those or similar questions. If a component is not working, it should be eliminated.
There should be a thorough understanding of the implications of moving from an assessment system focused on schools to one focused on students. CATS relied on different forms of the test to get a better sample of achievement within a school. A student might spend an hour taking a test, but with six forms of the test, the assessment contained about six times as much information about each school. Even with that, about 25% of schools were misclassified. Some were said to be progressing when they were not. Others were said to need assistance when they were progressing.
The existing assessment had thousands of misclassifications at the student level, too. Proposing to produce longitudinal data on students places more pressure on the assessment and more obstacles to accurate measurement. Creating measures that can be scaled across grades suggests spending more time testing each student. Is that desirable? Is it practicable?
Finally, the biggie! After almost two decades of educational reform with a spotlight on accountability through high-stakes testing, is there evidence that such testing works?
Research results both in Kentucky and nationally are mixed. Test scores go up, but there are questions about whether that is mainly because of teaching to the test. Because what is taught and tested narrows, higher test scores may not mean mastery of a content area, and content areas not tested may go unemphasized. In addition, states with high-stakes testing do no better, and perhaps worse, on national tests than states without such programs. Reasonable persons would agree that test scores are, at best, a narrow reflection of successful schools. Other aspects are more important. Evidence should be collected, therefore, that indicates whether, after about two decades of high-stakes testing, Kentucky schools:
a) are better places for kids than they were prior to the reform;
b) nurture talent in the ways it should be nurtured; and
c) ensure that those who emerge from the school system have commitments to democratic ideals, participation in democratic communities, and tolerance for varied persons and views.
Armed with good evidence, resources, and time, I hope KDE creates a system that benefits the Commonwealth and its children.