Wednesday, May 19, 2010

"One Test on One Day" isn't "BS"

I'm all for calling people out when they say ridiculous things, but it seems like we're a little too hasty with the accusations and eye-rolling nowadays.  Particularly when unions are involved.

A lot of people don't like teachers' unions and union leaders.  I get it, I really do.  Unions exist to protect their members, and union leaders are almost obliged to spin, frame, and spout off in order to do so.  And this not only results in them sounding ridiculous from time to time, but also prevents all sorts of things from occurring that might help, or at least satisfy, other people.  Much like politicians, union leaders often engage in hyperbole and cling to stupid ideas in order to get what their constituents want.  They deserve to be called out when they do this, and I think, increasingly, they are.  Maybe it's just my relatively short experience in the field, but it seems like unions have made more compromises on previously off-limits ideas (e.g. merit pay, performance evaluations, testing, seniority rights, etc.) in the past few years than maybe in the previous few decades (if you're a historian and know otherwise, please feel free to correct me).  And it seems to me that this is largely because criticism of union talking points has been widespread across both parties and newspaper editorial pages from both ends of the political spectrum.  How many times has the NY Times run editorials in the past couple years in favor of things that teachers' unions don't particularly like?  And I think that a lot of good might come from the resulting compromises and experimentation with new ideas.

But I think we may be taking it too far.  While unions have taken somewhat indefensible positions on firing teachers, seniority rights, pay raises, and other topics over the years, there's really no reason to assume that every statement from every union leader's mouth is automatically ridiculous.  Just because all the cool kids are bashing union leaders doesn't mean you have to, too.

Why am I suddenly annoyed by this?  While skimming through my Google Reader before bed, I noticed two otherwise reasonable commentators agreeing that something a union leader said was ridiculous -- but it was really their reactions that were ridiculous.

Stephen Sawchuk's posts are usually -- and I don't mean this in a bad way, I swear -- quite bland.  They're written by a reporter, and they stick pretty close to the facts without a lot of extra fluff and verbiage.  But his latest post is different.  He makes at least two crucial mistakes while rushing to jump onto the "union leaders are ridiculous" bandwagon.

His post is a response to the statement by NEA head Dennis Van Roekel that it's "absurd" to judge teachers based on "one test on one day."  Sawchuk's reaction?  He says that value-added scores use tests from at least two different points in time and, therefore, teachers aren't judged on one test score.  But that's actually a far more ridiculous statement than the first one b/c it's based on a complete misunderstanding of the first statement.

I suppose I could be mistaken, but it seems fairly obvious to me that Van Roekel is referencing the fact that when value-added scores are calculated they're based on only one test (unless a teacher teaches multiple subjects) in the year that the teacher actually teaches the students (whether or not these tests are one day is a legitimate question -- NYC moved from one-day, 50-minute tests my first year to multi-day tests afterward, which I gather is fairly typical).  In other words, the situation he's comparing it to is one in which teacher effectiveness is measured by other measures of student progress in addition to the single state test -- whether that be observations, portfolio assessments, or simply more tests.  And this is a legitimate point.  Ask a psychometrician if we could better measure student ability by giving more than one test in a given subject.  Ask an economist if we could better estimate student growth if tests were given, say, monthly.

Under most current systems, a student's score on the state test may be unrepresentative of their actual ability for any number of reasons (they were sick, they were short on sleep, it was hot that day, or maybe they just guessed right every single time).  If we used multiple tests over a longer period of time, the good days and bad days would tend to average out (think regression to the mean) and we'd get more accurate estimates of what students do and do not know.  It's basic statistics.
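To make the averaging point concrete, here is a rough simulation sketch in Python.  Everything in it is an assumption for illustration: the function name `rmse_of_average`, the made-up score scale, and the idea that test-day noise is independent and normally distributed.  Real test data is messier than this, so treat it as a toy and not as how any state or district actually computes anything.

```python
import math
import random


def rmse_of_average(n_tests, n_students=2000, noise_sd=10.0, seed=0):
    """Root-mean-square error when each student's true score is
    estimated by averaging n_tests noisy test scores.

    All numbers here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    squared_errors = []
    for _ in range(n_students):
        # Hypothetical "true ability" on an arbitrary score scale.
        true_score = rng.gauss(70.0, 8.0)
        # Each test adds independent test-day noise (sick, tired, lucky guesses).
        observed = [true_score + rng.gauss(0.0, noise_sd) for _ in range(n_tests)]
        estimate = sum(observed) / n_tests
        squared_errors.append((estimate - true_score) ** 2)
    return math.sqrt(sum(squared_errors) / n_students)


if __name__ == "__main__":
    for k in (1, 3, 9):
        print(k, "test(s): typical error of about", round(rmse_of_average(k), 2))
```

Under these assumptions the typical error shrinks roughly with the square root of the number of tests, so nine tests get you an estimate about three times as precise as one.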

Van Roekel is right to raise the point, and policymakers would be wise to take it into consideration.  It's the reason why value-added scores are only weakly correlated from year to year (recent research finds that it takes three years of value-added scores to obtain a reasonably stable measure of a particular teacher's effect on student test scores).

So, bottom line, Sawchuk botched that one.  Mistakes happen.  I've botched blog posts before, and it doesn't (I hope) make me a bad guy, so I'm willing to assume that Sawchuk is simply human as well.

But the end of his blog post raises questions in my mind regarding how much time and thought he actually put into the post.  He writes:

But recall that not all that long ago NEA's single test-score line managed to really tick-off House Education and Labor chairman Rep. George Miller. Isn't it a sign that it's time to update a talking point when even lawmakers start to roll their eyes in response?

As I read it, this was supposed to support his argument that many years of data means that teachers aren't judged on only one test.  I was surprised to see that somebody else had made the same mistake, so I clicked on the link to read about Miller saying the same thing.  But it's actually a completely different complaint.  Miller was upset because he thought Van Roekel was insinuating that, under the proposed law, teachers would and could be judged only on test scores and nothing else.  So he strenuously defended the proposal by saying there was nothing in there that prevented other measures from being used as well.  In other words, Miller interpreted Van Roekel's statement the same way I did, and raised a completely different objection to it than does Sawchuk.  And Miller's objection is irrelevant in this case, b/c value-added scores, at least as far as I know, never take any measures other than test scores into account.  So, if anything, Miller's statements refute Sawchuk's point.

But, like I said, one bad post doesn't invalidate a career's worth of respectable work.  What actually annoys me more is the response of Andy Rotherham on his blog.  In a post entitled "When Sawchucks Attack" he simply writes "Don't bring that BS into his house..."  End of post.

When I first started reading education blogs, it was pretty clear that Rotherham's "Eduwonk" blog stood head and shoulders above all others in terms of the sheer amount of information it contained relating to education news.  But over the past couple of years or so, it seems to me like the posts have gotten shorter and the language snarkier.  Where once background information and analysis were provided, he now seems content to simply lob verbal bombs at anyone opposed to reforms he likes and move on with his day.  Perhaps he's gotten busier, perhaps he's grown weary of constantly explaining himself, or maybe he's just frustrated that everybody won't listen to him.  All would be understandable.  But, whatever the reason, the quality of the blog has significantly deteriorated in my eyes.  And his latest post is a perfect example of why.

I'm not quite sure if he's agreeing with Sawchuk or simply pointing readers in that direction b/c he thought it was entertaining -- which is precisely my problem with his posts lately.  I assumed that he was agreeing with Sawchuk when I first read it, b/c if he was disagreeing I'd assume he would've indicated that in some way.  But whether he's just standing back and saying "wow, look at that" or saying "wow, great post," the labeling of Van Roekel's statement as "BS" is off the mark.  It's hyperbolic, sure, but it's also important to take into consideration.  And I take issue with the snarky marginalization of important statements.

Maybe I'm reading too much into this, but it seems like I've been reading more and more of these snippy knee-jerk reactions to anybody opposed to any aspect of the DOE's favored reforms lately.  I agree that there are a lot of bad arguments against charter schools, merit pay, school turnarounds, etc., but that doesn't mean that everybody who points out any weakness of any of the above should automatically be dismissed with the wave of a hand and a bit of derision.

The irony is that one of the greatest weaknesses of value-added scores is their instability (largely due to the small sample sizes) and that testing more would yield more accurate scores.  So proponents of value-added scores who dismiss the "one test" criticism are actually arguing for weaker, less meaningful value-added scores -- which isn't going to help them become ubiquitous any time soon.


din819go said...

With all the complaints around one test on one day I have never understood why kids aren't tested within the first week of school, at the end of the first semester and then late in the 2nd semester. These scores should be posted and then the results will speak for themselves...why isn't this done in a public manner?

our district uses thinklink so i know the tests are given and the data used but why isn't it published?

Unknown said...

Hi Corey,

I’m writing to suggest the addition of as a resource on your website/blog. This website provides helpful insight into the field of school counseling for those who are performing their career exploration or simply interested in learning more. It seems like it would be a worthy link on your blog.

Hope this is helpful,

Seth Sanford

Anonymous said...


As a fellow education researcher: this is an excellent blog.