Thursday, November 18, 2010

Value-Added vs. Batting Averages

Stephen Sawchuk writes that a new report on value-added scores compares them to measures in other fields, including batting averages in baseball.  I think the batting average analogy is a good one.  Here's why:

When we look at a player's batting average (the number of hits divided by the number of at bats), we are provided with some information that reflects how good they are at baseball.  The value of this information increases when the player has more at bats and plays for more years.  And the larger the gap is between two players' batting averages, the more certain we can be that one player is better than the other.
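To put rough numbers on the "more at bats" point, here is a minimal Python sketch.  It rests on a simplifying assumption of mine, not a claim about real baseball: every at bat is treated as an independent trial with the same chance of a hit, so the uncertainty in an average follows the usual binomial standard error.  The .280 hitter and the at-bat totals are made up for illustration.

```python
import math

def batting_average(hits, at_bats):
    """Hits divided by at bats, as defined in the post."""
    return hits / at_bats

def standard_error(average, at_bats):
    """Binomial standard error of an average.

    Assumes each at bat is an independent trial with the same
    hit probability (a simplification, not a baseball fact).
    """
    return math.sqrt(average * (1 - average) / at_bats)

# Hypothetical .280 hitter: one season vs. ten seasons of at bats.
one_season = standard_error(0.280, 500)
ten_seasons = standard_error(0.280, 5000)

print(f"one season:  +/- {one_season:.3f}")
print(f"ten seasons: +/- {ten_seasons:.3f}")
```

Under that assumption, ten times as many at bats shrinks the noise by a factor of the square root of ten, which is why a career average says more than a single season does.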

But, at the same time, a player's batting average in one given year gives us only a small amount of information about that player's ability.  Averages vary year to year: players suffer injuries, become distracted, grow older, switch teams, play with different teammates, and so on.  For example:

Who's a better shortstop: Alexei Ramirez or Derek Jeter?  The former hit .282 this year while the latter hit .270.

Who's a better first baseman: Aubrey Huff or Ryan Howard?  The former hit .290; the latter .277.
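For a rough sense of how little these one-year gaps tell us, here is a small Python sketch.  The averages are the ones quoted above; the 600 at bats per player is an assumed round number, not the players' actual totals, and the calculation treats each at bat as an independent trial with a fixed hit probability, which is a simplification.

```python
import math

def gap_z_score(avg_a, avg_b, at_bats_a, at_bats_b):
    """Gap between two averages, measured in standard errors.

    Uses the binomial variance of each average and assumes the
    two players' at bats are independent of each other.
    """
    var_a = avg_a * (1 - avg_a) / at_bats_a
    var_b = avg_b * (1 - avg_b) / at_bats_b
    return (avg_a - avg_b) / math.sqrt(var_a + var_b)

ASSUMED_AT_BATS = 600  # hypothetical, not the real season totals

shortstops = gap_z_score(0.282, 0.270, ASSUMED_AT_BATS, ASSUMED_AT_BATS)
first_basemen = gap_z_score(0.290, 0.277, ASSUMED_AT_BATS, ASSUMED_AT_BATS)

print(f"Ramirez vs. Jeter: {shortstops:.2f} standard errors")
print(f"Huff vs. Howard:   {first_basemen:.2f} standard errors")
```

With those assumptions, both gaps come out to roughly half a standard error, well short of the two or so that statisticians usually want before calling a difference real.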

Batting average doesn't do a good job of measuring power, speed, defense, leadership, or any of a million other desirable characteristics.  And for that reason, teams have moved beyond batting average when evaluating and signing players.  It seems like every year a new stat bubbles up that measures a player's ability in one facet of the game.

Adam Dunn is a career .250 hitter who plays awful defense, but will make millions of dollars next year because he'll probably hit 40 home runs yet again.

Mike Cameron, a .250 lifetime hitter, signed a 2-year, $15 million contract last year as a 36-year-old.  His power (about 25 HR per season recently) didn't hurt, but the main motivation was his defense.

When we look at a more sophisticated (though still flawed) computation of how many wins a player added over an average replacement player (Wins Above Replacement, or WAR), we see that Jose Bautista, who hit .260 this year, ranks sixth.  The fifth-ranked player, Adrian Beltre, has a lifetime average of .275.

At the same time, nobody would argue that batting average is meaningless -- especially over longer periods of time.  There are plenty of Hall of Famers with career averages over .300, but none that I know of with averages of .220.

In other words, the batting average analogy is an excellent one for value-added scores because they have four very important things in common:

1.) Anybody who tells you that it is totally meaningless is totally wrong.
2.) When you see large differences over long periods of time between two people, you can be pretty sure that the one with the larger number is better.
3.) At the same time, that number in and of itself gives us very little information about how good somebody is, particularly over a shorter period of time.
4.) There are other skills not measured by the number that need to be taken into account when evaluating the person.

Unlike in baseball, we don't have better statistics in teacher evaluation.  We don't have something like on-base percentage, let alone change in student motivation over replacement teacher.

What does this mean for schools and value-added scores?  If we designed the perfect evaluation system, given current knowledge and tools, value-added scores would have to be included.  The only really compelling reason to leave them out is the potential for misuse by people who don't understand points 2-4 above.  Partly for that reason, it's important that value-added scores be only one component of how a teacher is evaluated.  And when making hiring, firing, and tenure decisions, principals should be looking for large differences over long periods of time, not tiny year-to-year differences.

Opponents of value-added scores would be better off arguing that they can be useful -- but only in some circumstances -- while proponents should be looking for supplemental measures to boost the meaningfulness of evaluations.  And both sides should keep in mind that the majority of teachers don't teach subjects tested by state tests anyway.  So in addition to measuring a teacher's impact on outcomes that standardized tests miss (e.g. motivation, attitude, self-control, creativity, and interpersonal skills), we need better ways to evaluate a teacher's impact in subjects and grades beyond 3rd-8th grade English and math.

In short, value-added scores can aid our evaluations of teachers . . . but only a little bit, and only if used properly.

2 comments:

Michael Dunn said...

Value-Added models are portrayed as statistically believable because they compare student test scores at the end of the year (after the teacher has added his value to the student by way of good instruction) with those at the beginning of the year (prior to the teacher’s influence). To be statistically meaningful, students would have to take the same test twice per year. Yet even this will produce highly biased results. A teacher in a middle class school, for example, will have students who are, on average, much more “school ready.” They are more likely to be reading at grade level, have decent study skills, financial security, enriching extracurricular activities, lower stress, and, therefore, likely to show higher gains than those who are less “school ready.”

The entire discourse about teacher accountability and teacher effectiveness is a case of a misguided solution to a misdiagnosed problem. If the problem is low student achievement, then let’s address the main cause: poverty. To get the greatest bang for the buck (or value-added), we must address the growing and untenable gap between the rich and the poor. Improving teachers, schools, curricula can help, but their contributions are infinitesimal compared with socioeconomic changes that help bring more people out of poverty. Richard Rothstein has identified numerous policies that have existed in the past that can help (e.g., increasing the earned income credit, housing vouchers, increased nursing and medical support on school campuses). Decreasing taxes for lower income people (especially regressive taxes like sales tax) and increasing them for corporations and the wealthy (along with closing the tax loopholes they exploit) should also help. Strengthening unions and increasing the numbers of unionized workers can help increase wages. Maintaining the mortgage interest tax deduction will help promote increased home ownership, which is one way to increase familial wealth as well as security.

Roger Sweeny said...

I imagine a world ... where it is illegal to keep statistics on baseball players--aside from whether they show up for work each day.

Instead, in order to play professional baseball, one must go to "baseball school." There one takes courses in human physiology, the professional baseball rule book, theory of hitting, the history of baseball, etc.

Some schools require the aspiring player to play on a team for a period of several months, taking the place of a "master player" for parts of a game and listening to advice from the master player and a sponsor from the baseball school.

After graduation, the aspiring player can play for a real team but the decision on whether to keep the player has to be based on whether the newbie is "good in the clubhouse" and how well he hits at three carefully observed batting practices ("how well" to be determined partly by whether he does what the observer considers to be "best hitting practice").

Meanwhile, some radicals have proposed that records should be kept on what the aspiring player does in real games. At the moment, the major proposal is that "at bats" and "hits" should be combined in a summary statistic called "batting average."

To some people this seems like useful information but the people who work at baseball schools have made many arguments why it is far, far from perfect. Some say it is useless. Some say that allowing it to be calculated will actually make things worse. The statistic will be misinterpreted by decision-makers and players will care more about their individual statistics than about winning games.

A few baseball regulatory commissions are now allowing the use of "batting average" in addition to all the previous requirements.