Thursday, March 26, 2009

To Predict or Not to Predict

By Skip Kifer

A colleague and I were doing a session on formative assessment that began with, would you believe, a formative test on creating and using formative assessments. We generated a lively discussion when going over answers to the test, particularly the answer to this question:

Which of the following is NOT a purpose for using formative assessments?

a. To learn which students are doing well and which are not doing so well.
b. To gather data about what has been taught well or not so well.
c. To predict future performance on a norm-referenced test.
d. To provide a basis for future instruction.

C is the correct answer. Some teachers insisted that C was not correct because prediction, too, is a purpose of formative assessments. As evidence, they pointed to what they had recently learned about “benchmark” assessments.

There is no one definition of formative assessment. And there are those, mainly vendors like ACT, who say that benchmark assessments are formative assessments. They are not.

The crux of the difference between benchmark assessments and formative assessments is the difference between predicting and correcting. Benchmark assessments predict; formative assessments correct.

I give formative tests after teaching an instructional unit. The tests:

a. cover what has been done in the unit;
b. are scored but not graded;
c. are discussed immediately upon completion; and
d. are discussed (hopefully) in earnest by the students.

The tests thus become instructional tools. They convey to the students what was important in the unit, and the students convey what they learned. My aim is to get most students to know most of the material in the unit. In doing so, I want high scores but small differences, less variance, among the scores. I want each student to know more and the class to be more alike.

The discussion is the first part of correcting. By getting a conversation going, I hope that students better understand what I thought I was teaching. The next step, however, is to look at the students’ scores and figure out what additional experience each student needs in order to master the material. I also peek at the scores to see what I did well and not so well.

Contrast the formative assessment scenario with a benchmark one.

Benchmark tests may or may not be related to what students have been taught. Alignment studies have to be conducted to determine if they are and to what extent.

Since the questions are not released, a teacher does not know what has been done well or poorly. Students have no idea how well they are doing because the questions and the right answers are never discussed. There is no way to tie the results to what a teacher has been teaching because the test, at that time, does not necessarily reflect what has been covered in the curriculum.

What benchmark tests do is predict scores on other tests. That is, benchmark results are used to estimate the extent to which student scores on a first test are replicated by results on a second test. That prediction is strongest when the variation on the tests is largest. Conversely, the more similar the student scores, the weaker the prediction. If every student scored exactly the same on the first test, there would be no variation left to predict from, and the first test would tell us nothing about who will do better or worse on the second. So my attempts to use formative assessments to get high scores with small differences between them fly in the face of strong prediction.
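
To make that point concrete, here is a minimal sketch in Python, with entirely made-up numbers rather than real student data. A benchmark score is simulated, a later test is modeled as that score plus unrelated noise, and the correlation between the two shrinks as the spread among students shrinks.

```python
# A minimal sketch (hypothetical numbers, not real student data) of why
# correlation-based prediction depends on score variation.  A "benchmark"
# score is simulated, and a later test score is modeled as that benchmark
# plus unrelated noise.  As the spread of benchmark scores shrinks, the
# benchmark tells us less and less about who will score high or low later.

import numpy as np

rng = np.random.default_rng(0)
n_students = 200
noise_sd = 5  # everything about the later test the benchmark cannot explain

for benchmark_sd in (15, 5, 1, 0):  # shrinking spread among students
    benchmark = 70 + benchmark_sd * rng.standard_normal(n_students)
    later_test = benchmark + noise_sd * rng.standard_normal(n_students)

    if benchmark.std() == 0:
        # Every student identical on the benchmark: correlation is undefined,
        # and the benchmark carries no information about later differences.
        print(f"spread {benchmark_sd:2d}: correlation undefined (nothing to predict from)")
    else:
        r = np.corrcoef(benchmark, later_test)[0, 1]
        print(f"spread {benchmark_sd:2d}: correlation with later test = {r:.2f}")
```

With a wide spread the correlation is strong; with no spread at all there is literally nothing left to predict from.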

Imagine a student entering a classroom and the teacher saying she can predict where the student will be at the end of the year: about as far above or below the mean as the student is now. That strong prediction would be desirable from a benchmark point of view. It may not be so desirable from the student’s view. But that’s predicting!

Imagine that same student entering a classroom and a teacher saying we are going to work together to master the material in this course. She will use formative assessments to help the student. That would imply no relationship between the status of students upon entering the class and their final status in the class. That is correcting. And, teaching!
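
Here is another small hypothetical sketch of that contrast, again with invented numbers: in a “predicting” classroom, end-of-year scores largely mirror beginning-of-year standing; in a “correcting” classroom, formative assessment and reteaching push nearly everyone toward mastery, so entering status says little about final status.

```python
# Hypothetical contrast between a "predicting" and a "correcting" classroom.
# Entering scores are simulated; the two classrooms then produce final scores
# in different ways.  Numbers are invented purely for illustration.

import numpy as np

rng = np.random.default_rng(1)
entering = 70 + 12 * rng.standard_normal(300)          # where students start

# Predicting classroom: students end up about where they started (plus noise),
# so rank order is largely preserved.
final_predicting = entering + 4 * rng.standard_normal(300)

# Correcting classroom: formative assessment and reteaching push nearly
# everyone toward mastery, regardless of where they started.
final_correcting = np.clip(92 + 3 * rng.standard_normal(300), 0, 100)

for label, final in [("predicting", final_predicting),
                     ("correcting", final_correcting)]:
    r = np.corrcoef(entering, final)[0, 1]
    print(f"{label}: mean final = {final.mean():.1f}, "
          f"spread = {final.std():.1f}, "
          f"correlation with entering status = {r:.2f}")
```

In the predicting classroom the correlation with entering status is high; in the correcting classroom the mean is high, the spread is small, and the correlation is near zero.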

Just for fun, here is the formative test on formative assessment and an answer key.

A formative assessment on formative assessment…

1. The person who first distinguished between formative and summative approaches was:

a. Paul Black
b. Benjamin Bloom
c. Michael Scriven
d. Richard Stiggins


2. When the chef tastes the soup, it is _____. When the customer tastes the soup, it is _____. Choose the two words that best complete these thoughts.

a. summative, formative
b. formative, summative
c. objective, formative
d. summative, objective

3. Which of the following is NOT a purpose for using formative assessments?

a. To learn which students are doing well and which are not doing so well.
b. To gather data about what has been taught well or not so well.
c. To predict future performance on a norm-referenced test.
d. To provide a basis for future instruction.

4. Formative assessment activities in the classroom used properly:

a. Provide a basis to rank order students.
b. Predict performance on other tests.
c. Help the teacher know what she has done well or poorly.
d. Add to the precision and usefulness of grades.

5. The Taxonomy of Educational Objectives:

a. Orders outcomes according to how difficult it is to teach them.
b. Is based on learning hierarchies.
c. Shows what is desirable.
d. Is ordered by cognitive complexity.

6. Suppose students in a class were asked to memorize the names of those who came over on the Mayflower. A test question then asked them to write down 10 of those names. At what level of the Taxonomy is the response likely to be?

a. Knowledge
b. Comprehension
c. Application
d. Analysis

7. Suppose the list contains occupations of the Mayflower passengers. Students are asked to write an essay that draws inferences about why certain occupations were present on the ship. At what level of the Taxonomy are the responses likely to fall?

a. Knowledge
b. Comprehension
c. Application
d. Analysis

8. Suppose the list of names of persons on the Mayflower also included their occupations. Students are asked to write an essay that draws inferences about why certain occupations were on the ship. That would represent what Depth of Knowledge?

a. DOK 1
b. DOK 2
c. DOK 3
d. DOK 4


Key to Formative Test

1. C. Scriven, M. (1967). The methodology of evaluation. In R. W. Tyler, R. M. Gagné, & M. Scriven (Eds.), Perspectives of curriculum evaluation (AERA Monograph Series on Curriculum Evaluation, No. 1). Chicago: Rand McNally.
2. B. Bob Stake coined this analogy to distinguish between the two.
3. C. Prediction comes from benchmark assessments. A teacher would like all of her students to do well on the formative tests. If so, there would be no correlation (prediction) between the formative test results and some norm-referenced test results.
4. C. First, how well did the teacher do, and on what? Second, which students did well and which did poorly, and on what?
5. D. Bloom called the Taxonomy the most used, least read book in education. The Taxonomy orders outcomes according to cognitive complexity.
6. A. Recall would fall under Knowledge, the least complex cognitively of the levels.
7. C or D. A taxonomy purist would probably say D. I have difficulty getting beyond Application.
8. C or D. I don’t do Depth of Knowledge. I don’t think it adds anything to the taxonomy. There are lots of classification schemes. I am happy being familiar with one.
