Standardized tests will finally ask good essay questions.
But robot grading threatens that progress.
This from Dana Goldstein in Slate:
In 2002, Indiana rolled out computer scoring of its 11th
grade state writing exam. At the time, ETS, the company that developed
Indiana’s software, said automatic writing assessment could help cut the
state’s testing budget in half. But by 2007, Indiana had abandoned the
practice.
Why? Though ETS’s E-Rater proved adept at scoring so-called “naked”
essays based only on personal opinion, it couldn’t reliably handle
questions that required students to demonstrate knowledge from the
curriculum. State testing officials tried making lists of keywords the
software could scan for: in history, for example, “Queen Isabella,”
“Columbus,” and “1492.” But the program didn’t understand the
relationship between those items, and so would have given full credit to
a sentence like, “Queen Isabella sailed 1,492 ships to Columbus, Ohio.”
Cost and time savings never materialized, because most tests also had
to be looked at by human graders.
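To make that failure mode concrete, here is a minimal Python sketch of keyword-only scoring. Everything in it (the function name, the keyword list, the normalization step) is invented for illustration; it is not ETS's actual E-Rater logic, only a toy showing why scanning for terms cannot catch a factually absurd sentence.

    import re

    # Toy keyword checklist, in the spirit of the Indiana history example above.
    KEYWORDS = {"queen isabella", "columbus", "1492"}

    def keyword_score(essay, keywords=KEYWORDS):
        """Return the fraction of required keywords found anywhere in the essay."""
        # Lowercase and drop thousands separators so "1,492" still matches "1492".
        text = re.sub(r"(?<=\d),(?=\d)", "", essay.lower())
        hits = sum(1 for kw in keywords if kw in text)
        return hits / len(keywords)

    correct = "In 1492, Columbus sailed west with ships funded by Queen Isabella."
    nonsense = "Queen Isabella sailed 1,492 ships to Columbus, Ohio."

    print(keyword_score(correct))   # 1.0
    print(keyword_score(nonsense))  # 1.0 -- full credit, facts notwithstanding

Both sentences get full marks because the checker never looks at how the terms relate to one another, which is exactly the gap Indiana's testing officials ran into.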
Indiana’s experience is worth keeping in mind, since although the
technology has not advanced dramatically over the past decade, we’re now
in the midst of a new whirlwind of enthusiasm about electronic writing
assessment. Last month, after a study from Mark Shermis of the University of Akron found that computer programs and people award student-writing samples similar grades, an
NPR headline teased,
“Can a Computer Program Grade Essays As Well As a Human? Maybe Even
Better, Study Says.” Education technology entrepreneur Tom Vander Ark,
who co-directed the Shermis study, hailed the results as proof that
robo-grading is “fast, accurate, and cost-effective.”
He is right about “fast”: E-Rater can reportedly grade 16,000 essays
in 20 seconds. But “accurate” and “cost-effective” are debatable,
especially if we want students to write not only about what they think
and feel, but also about what they know. Testing companies acknowledge
it is easy to game the current generation of robo-graders: Such software
rewards longer word counts, unusual vocabulary, transition words such
as “however” and “therefore,” and grammatical sentences—whether or not
the facts contained within the sentences are correct. To address these
problems, the Hewlett Foundation, which also paid for the Shermis study,
is offering a $100,000 prize to the team of computer programmers that
can make the biggest strides in improving the technology.
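As a rough illustration of how such surface features can be gamed, here is a hypothetical Python sketch of a scorer that rewards only length, unusual vocabulary, and transition words. The feature set follows the description above, but the weights, word lists, and names are made up; this is not a reproduction of any real essay-scoring engine.

    # Hypothetical surface-feature scorer; weights and word lists are invented.
    COMMON_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "was", "it", "that"}
    TRANSITIONS = {"however", "therefore", "moreover", "consequently", "furthermore"}

    def surface_score(essay):
        words = [w.strip(".,;:!?").lower() for w in essay.split()]
        if not words:
            return 0.0
        length = len(words)                                       # longer is "better"
        unusual = sum(1 for w in words if w not in COMMON_WORDS)  # rare-looking vocabulary
        transitions = sum(1 for w in words if w in TRANSITIONS)   # "however", "therefore", ...
        # Arbitrary weighted sum; nothing here checks whether any claim is true.
        return 0.1 * length + 0.5 * (unusual / length) + 1.0 * transitions

    factual = "Columbus sailed in 1492 with support from Queen Isabella."
    padded = ("However, the multifarious navigator, therefore, undertook a "
              "consequential peregrination; moreover, the sovereign acquiesced.")

    print(surface_score(padded) > surface_score(factual))  # True: padding and jargon win

A student (or test-prep coach) who knows the recipe can inflate a score without saying anything true, which is precisely the gaming problem the testing companies acknowledge.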
The recent push for automated essay scoring comes just as we’re on
the verge of making standardized essay tests much more sophisticated in
ways robo-graders will have difficulty dealing with. One of the major
goals of the new Common Core curriculum standards, which 45 states have
agreed to adopt, is to supplant the soft-focus “personal essay” writing
that currently predominates in American classrooms with more
evidence-driven, subject-specific writing. The creators of the Common Core hope machines can soon score these essays cheaply and quickly, saving states
money in a time of harsh education budget cuts. But since robo-graders
can’t broadly distinguish fact from fiction, adopting such software
prematurely could be antithetical to testing students in more
challenging essay-writing genres...
1 comment:
If we are using computers to evaluate human writing, then why are we even learning how to write? Why not just have the computer do it for you? Just speak to the computer and it can translate what you say into a well-written sentence.
Example:
Uneducated, illiterate non-writer:
"It was really bitch'en but it sucked too."
Computer translation:
"It was the best of times, it was the worst of times."
I just don't understand why we expect human teachers to instruct students but have computer programs evaluate whether students learned and instruction occurred. Maybe I am a member of a dying profession?!?!? Soon to be replaced by some automated, computerized machine.