Stephen Sawchuk writes that a new report on value-added scores compares them to measures in other fields, including batting averages in baseball. I think the batting average analogy is a good one. Here's why:
When we look at a player's batting average (the number of hits divided by the number of at bats), we get some information about how good they are at baseball. The value of this information increases as the player accumulates more at bats and more seasons. And the larger the gap between two players' batting averages, the more certain we can be that one player is better than the other.
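To make the sample-size point concrete, here's a small sketch. It treats each at bat as an independent coin flip with the player's "true" skill as the probability of a hit -- a simplifying assumption, since real at bats aren't independent -- and shows how the uncertainty around a .300 average shrinks as at bats pile up:

```python
import math

def batting_average(hits, at_bats):
    """Batting average: hits divided by at bats."""
    return hits / at_bats

def average_std_error(avg, at_bats):
    """Standard error of a batting average, treating each at bat as an
    independent Bernoulli trial (a simplifying assumption)."""
    return math.sqrt(avg * (1 - avg) / at_bats)

# A .300 hitter over 100 at bats vs. a full career's worth:
print(round(average_std_error(0.300, 100), 3))   # roughly .046 -- wide
print(round(average_std_error(0.300, 5000), 3))  # roughly .006 -- narrow
```

Over 100 at bats, a true .300 hitter could easily post anything from about .250 to .350; over 5,000 at bats, the average pins down his skill much more tightly. That's the whole argument for trusting long-run numbers over short-run ones.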
But, at the same time, a player's batting average in one given year gives us only a small amount of information about that player's ability. Averages vary year to year: players suffer injuries, become distracted, grow older, switch teams, play with different teammates, and so on. For example:
Who's a better shortstop: Alexei Ramirez or Derek Jeter? The former hit .282 this year while the latter hit .270.
Who's a better first baseman: Aubrey Huff or Ryan Howard? The former hit .290; the latter .277.
Batting average doesn't do a good job of measuring power, speed, defense, leadership, or any of a million other desirable characteristics. And for that reason, teams have moved beyond batting average when evaluating and signing players. It seems like every year a new stat bubbles up that measures a player's ability in one facet of the game.
Adam Dunn is a career .250 hitter who plays awful defense, but will make millions of dollars next year because he'll probably hit 40 home runs yet again.
Mike Cameron, a .250 lifetime hitter, signed a two-year, $15 million contract last year as a 36-year-old. His power (about 25 HR per season recently) didn't hurt, but the main motivation was his defense.
When we look at a more sophisticated (though still flawed) computation of how many wins a player added over a replacement-level player (Wins Above Replacement, or WAR), we see that Jose Bautista, who hit .260 this year, ranks sixth. The fifth-ranked player, Adrian Beltre, has a lifetime average of .275.
At the same time, nobody would argue that batting average is meaningless -- especially over longer periods of time. There are plenty of Hall of Famers with career averages over .300, but none that I know of with averages of .220.
In other words, the batting average analogy is an excellent one for value-added scores because they have four very important things in common:
1.) Anybody who tells you that it is totally meaningless is totally wrong.
2.) When you see large differences over long periods of time between two people, you can be pretty sure that the one with the larger number is better.
3.) At the same time, that number in and of itself gives us very little information about how good somebody is, particularly over a shorter period of time.
4.) There are other skills not measured by the number that need to be taken into account when evaluating the person.
Unlike in baseball, we don't have better statistics for teacher evaluation. We don't have anything like on-base percentage, let alone change in student motivation over replacement teacher.
What does this mean for schools and value-added scores? If we designed the perfect evaluation system, given current knowledge and tools, value-added scores would have to be included. The only really compelling reason to leave them out is the potential for misuse by people who don't understand #'s 2-4 above. Partly for that reason, it's important that value-added scores be only one component of how a teacher is evaluated. And when making hiring, firing, and tenure decisions, principals should be looking for large differences over long periods of time, not tiny year-to-year differences.
Opponents of value-added scores would be better off arguing that they can be useful -- but only in some circumstances -- while proponents should be looking for supplemental measures to boost the meaningfulness of evaluations. And both sides should keep in mind that the majority of teachers don't teach subjects covered by state tests anyway. So in addition to measuring a teacher's impact on students beyond standardized tests (e.g. motivation, attitude, self-control, creativity, interpersonal skills), we also need better ways to evaluate a teacher's impact in subjects beyond 3rd-8th grade English and math.
In short, value-added scores can aid our evaluations of teachers . . . but only a little bit, and only if used properly.