The LA Times ran an article Sunday that's the first in a series leading up to the release of teacher-effectiveness scores, based on a value-added model, for all 3rd-5th grade math/ELA teachers -- which has already created quite a stir and will likely continue to cause a ruckus. In short, the Times got its hands on 6 years of test data and hired an outside stats guru (the RAND Corporation's Richard Buddin) to analyze the data and rank thousands of teachers according to their effectiveness at raising test scores, controlling for a number of factors.
The general tenor of the article suggests that value-added scores are an underutilized resource and that using them more could make huge differences. The article notes in different ways that teachers are very, very important, that the right ones can make all the difference, and that we should be focusing more on hiring/training better teachers and firing worse ones. None of which is in any way unusual to read in an article on education these days -- and none of which is completely without merit. Teachers are important, and we do need to do a better job of ensuring that we have better teachers in our schools -- especially the schools with the most disadvantaged populations.
If you read the article carefully and then the background report on the methodology, however, a few things jump out:
*Teacher quality varies widely within schools -- just as with test scores, there's far more variation within schools than across them ("Teachers are slightly more effective in high- than in low-API schools, but the gap is small, and the variance across schools is large"). Which means that the highest-performing schools don't have all the best teachers and the lowest-performing schools don't have all the worst teachers. Which means that something other than teacher quality is making schools low- or high-performing. Which means we should probably focus our attention on more than just teacher quality.
*There's an extremely weak correlation between how the schools fare in the state API rating system and how they fare in a measure of "school effects" that controls for all sorts of factors. As Buddin writes, "About a fourth of low-API schools have above average school value added relative to other elementary schools in the district. Similarly, about a fourth of the highest-quartile API schools have below average school effectiveness. The overall message is that many schools with low achievement levels are producing strong achievement gains and many schools with high achievement levels are producing weak achievement gains for their students."
*I'm not sure exactly how large the teacher effects are, but looking at the info they provide, with the exception of a few outliers, they don't appear to be earth-shatteringly huge. The methodology paper says that a student with a teacher one standard deviation above average would move from the 50th to the 58th percentile in ELA. If I'm doing my math right (which I might not be -- it's late), that means that 2/3 of teachers, on average, move their students up or down less than one-fifth of a standard deviation each year. The article mentions a teacher ranked among the top 5% of all elementary school teachers whose students gain, on average, 4 and 5 percentile points in ELA and math, respectively, in a given year.
*The article mentions a teacher held in high esteem at one of the highest-scoring schools who performs far below average according to the value-added scores. According to the article, her principal thinks she's a great teacher, as do the kids and parents in her school. This means that either a.) principals, kids, and parents aren't good judges of teacher quality (at least sometimes), and/or b.) what people define as a good teacher only somewhat overlaps with what teachers do to boost value-added scores.
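For what it's worth, the percentile math above checks out. Here's a quick sketch of the percentile-to-standard-deviation conversion, assuming roughly normal score distributions:

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal distribution

# Moving a student from the 50th to the 58th percentile is a shift of:
effect_sd = nd.inv_cdf(0.58) - nd.inv_cdf(0.50)
print(round(effect_sd, 2))  # ~0.2 student-level standard deviations

# So a teacher 1 SD above average shifts the typical student about 0.2 SD,
# and the roughly 2/3 of teachers within +/-1 SD of average move students
# by less than that each year.
```

In other words, "one standard deviation of teacher quality" translates into about a fifth of a standard deviation of student achievement -- real, but not enormous.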
In future articles I'd really like to see a better description/graphic of how large the differences in impact among teachers are. From what I've read so far, it looks like the vast majority of teachers aren't really all that far apart -- especially considering that previous research has found that you can't simply add one year of effects from a great teacher to the next year's effects from another great teacher (e.g. having three straight teachers who each boost scores 10 percentile points on average won't boost your score 30 percentile points). I'd also like to see more on the stability of these results from year to year -- previous research has found one year's value-added scores to be only loosely correlated with the previous year's scores (I think the latest paper on the topic found that it took three years of data to compute a stable score, which makes it hard to use value-added scores for yearly hiring or bonus decisions).
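The year-to-year instability is easy to see with a toy simulation (all numbers here are invented, not from Buddin's analysis): if each teacher has a stable "true" effect buried under a year's worth of sampling noise, then two consecutive years of single-year scores end up only loosely correlated.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy setup: 1,000 teachers, each with a stable true effect (SD 0.1),
# but each year's value-added estimate adds sampling noise (SD 0.2).
teachers = 1000
true_effect = rng.normal(0, 0.1, teachers)
year1 = true_effect + rng.normal(0, 0.2, teachers)  # noisy one-year estimate
year2 = true_effect + rng.normal(0, 0.2, teachers)  # next year's estimate

# Correlation between the two years' scores -- well below 1 when the
# noise swamps the stable signal.
r = np.corrcoef(year1, year2)[0, 1]
print(round(r, 2))
```

With those made-up noise levels the correlation lands around 0.2 -- which is why averaging several years of data is needed before the scores settle down.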
Also, don't forget that value-added scores a.) only represent part of what teachers do and b.) currently only apply to a small fraction of teachers. If we consider elementary schools to be K-5, only grade 3-5 math/ELA teachers had value-added scores -- the majority of teachers in elementary schools teach either K-2 or something other than math/ELA... so this use of value-added is no magic bullet.
Lastly, I'd like to add a note about the practical application of these findings. When we do a rigorous statistical analysis of teacher effectiveness, we control for all sorts of things, from previous test scores to the test scores of other kids in the class, and so on. In short, the goal is to say "everything else equal, teacher A will raise test scores by x points more than teacher B". But in real life, everything else isn't equal. So even though the results indicate that the teachers in the worst schools are about as good as the teachers in the best schools, practically that doesn't mean that parents will be (or should be) any more likely to want their kids to attend the worst schools. Mr. Jones might raise the average kid's score by 20 points, all else being equal, but that doesn't mean your kid's score is going to go up by 20 points in his class.
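The "everything else equal" logic is essentially a regression with controls. A toy sketch of the idea (made-up data and a deliberately tiny specification -- Buddin's actual model controls for far more than this): regress current scores on prior scores plus a teacher indicator, and the teacher coefficient is the "all else equal" difference.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 students split between two teachers, A and B.
n = 200
prior = rng.normal(0, 1, n)              # prior-year test scores
teacher_b = rng.integers(0, 2, n)        # 1 if the student has teacher B
true_gap = 0.2                           # teacher B adds 0.2 SD, all else equal
current = 0.7 * prior + true_gap * teacher_b + rng.normal(0, 0.5, n)

# OLS: current score ~ intercept + prior score + teacher-B indicator.
X = np.column_stack([np.ones(n), prior, teacher_b])
coef, *_ = np.linalg.lstsq(X, current, rcond=None)

# coef[2] estimates teacher B's value-added, holding prior scores constant;
# it should land near the true gap of 0.2.
print(coef)
```

The estimate is an average effect holding the controls fixed -- which is exactly why it doesn't promise anything about what happens to any individual kid whose circumstances aren't "all else equal."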
Keep your eyes on the situation, because I guarantee there will be lots of exaggerated responses from people on both sides of the issue. Just remember: value-added scores aren't completely worthless, but they also fall far short of solving all of our problems.