Wednesday, May 19, 2010

"One Test on One Day" isn't "BS"

I'm all for calling people out when they say ridiculous things, but it seems like we're a little too hasty with the accusations and eye-rolling nowadays.  Particularly when unions are involved.

A lot of people don't like teachers' unions and union leaders.  I get it, I really do.  Unions exist to protect their members, and union leaders are almost obliged to spin, frame, and spout off in order to do so.  And this not only results in them sounding ridiculous from time to time, but also prevents all sorts of things from occurring that might help, or at least satisfy, other people.  Much like politicians, union leaders often engage in hyperbole and cling to stupid ideas in order to get what their constituents want.  They deserve to be called out when they do this, and I think, increasingly, they are.  Maybe it's just my relatively short experience in the field, but it seems like unions have made more compromises on previously off-limits ideas (e.g. merit pay, performance evaluations, testing, seniority rights, etc.) in the past few years than maybe in the previous few decades (if you're a historian and know otherwise, please feel free to correct me).  And it seems to me that this is largely because criticism of union talking points has been widespread across both parties and newspaper editorial pages from both ends of the political spectrum.  How many times has the NY Times run editorials in the past couple years in favor of things that teachers' unions don't particularly like?  And I think that a lot of good might come from the resulting compromises and experimentation with new ideas.

But I think we may be taking it too far.  While unions have taken somewhat indefensible positions on firing teachers, seniority rights, pay raises, and other topics over the years, there's really no reason to assume that every statement out of every union leader's mouth is automatically ridiculous.  Just because all the cool kids are bashing union leaders doesn't mean you have to too.

Why am I suddenly annoyed by this?  While skimming through my Google Reader before bed, I noticed two otherwise reasonable commentators agreeing that something a union leader said was ridiculous -- but it was really their reactions that were ridiculous.

Stephen Sawchuk's posts are usually -- and I don't mean this in a bad way, I swear -- quite bland.  They're written by a reporter, and they stick pretty close to the facts without a lot of extra fluff and verbiage.  But his latest post is different.  He makes at least two crucial mistakes while rushing to jump onto the "union leaders are ridiculous" bandwagon.

His post is a response to the statement by NEA head Dennis Van Roekel that it's "absurd" to judge teachers based on "one test on one day."  Sawchuk's reaction?  He says that value-added scores use tests from at least two different points in time and, therefore, teachers aren't judged on one test score.  But that's actually a far more ridiculous statement than the first one, b/c it's based on a complete misunderstanding of what Van Roekel said.

I suppose I could be mistaken, but it seems fairly obvious to me that Van Roekel is referencing the fact that value-added scores are calculated from only one test (unless a teacher teaches multiple subjects) in the year that the teacher actually teaches the students (whether these tests span one day is a legitimate question -- NYC moved from one-day, 50-minute tests my first year to multi-day tests afterward, which I gather is fairly typical).  In other words, the situation he's comparing it to is one in which teacher effectiveness is measured by other measures of student progress in addition to the singular state test -- whether that be observations, portfolio assessments, or simply more tests.  And this is a legitimate point.  Ask a psychometrician if we could better measure student ability by giving more than one test in a given subject.  Ask an economist if we could better estimate student growth if tests were given, say, monthly.

Under most current systems, a student's score on the state test may be unrepresentative of his or her actual ability for any number of reasons (the student was sick, short on sleep, it was hot that day, or maybe the student just guessed right every single time).  If we used multiple tests over a longer period of time, the random noise would average out and we'd get more accurate estimates of what students do and do not know.  It's basic statistics.
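To make the point concrete, here's a minimal simulation.  The ability and noise numbers are hypothetical (my own assumptions, not anything from the testing literature): each observed score is the student's true ability plus day-to-day noise, and averaging more tests shrinks the measurement error.

```python
import random
import statistics

random.seed(0)

TRUE_ABILITY = 75   # hypothetical "true" skill level we're trying to measure
NOISE_SD = 10       # hypothetical day-to-day swing (sick, tired, lucky guesses)

def observed_score(n_tests):
    """Average of n_tests noisy measurements of the same underlying ability."""
    return statistics.mean(
        TRUE_ABILITY + random.gauss(0, NOISE_SD) for _ in range(n_tests)
    )

# How far off is our estimate, on average, as we add tests?
for k in (1, 4, 12):  # one annual test vs. quarterly vs. monthly
    errors = [abs(observed_score(k) - TRUE_ABILITY) for _ in range(5000)]
    print(f"{k:2d} test(s): average error = {statistics.mean(errors):.2f} points")
```

The average error falls off roughly with the square root of the number of tests, which is exactly why "one test on one day" is the noisiest possible version of the measurement.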

Van Roekel is right to raise the point, and policymakers would be wise to take it into consideration.  It's a big part of the reason why value-added scores are only weakly correlated from year to year (recent research finds that it takes three years of value-added scores to obtain a reasonably stable measure of a particular teacher's effect on student test scores).

So, bottom line, Sawchuk botched that one.  Mistakes happen.  I've botched blog posts before, and it doesn't (I hope) make me a bad guy, so I'm willing to assume that Sawchuk is simply human as well.

But the end of his blog post raises questions in my mind regarding how much time and thought he actually put into the post.  He writes:

But recall that not all that long ago NEA's single test-score line managed to really tick-off House Education and Labor chairman Rep. George Miller. Isn't it a sign that it's time to update a talking point when even lawmakers start to roll their eyes in response?
As I read it, this was supposed to support his argument that many years of data means that teachers aren't judged on only one test.  I was surprised to see that somebody else had made the same mistake, so I clicked on the link to read about Miller saying the same thing.  But it's actually a completely different complaint.  Miller was upset because he thought Van Roekel was insinuating that, under the proposed law, teachers would and could be judged only on test scores and nothing else.  So he strenuously defended the proposal by saying there was nothing in there that prevented other measures from being used as well.  In other words, Miller interpreted Van Roekel's statement the same way I did, and raised a completely different objection to it than does Sawchuk.  And Miller's objection is irrelevant in this case, b/c value-added scores, at least as far as I know, never take any measures other than test scores into account.  So, if anything, Miller's statements refute Sawchuk's point.

But, like I said, one bad post doesn't invalidate a career's worth of respectable work.  What actually annoys me more is the response of Andy Rotherham on his blog.  In a post entitled "When Sawchucks Attack" he simply writes "Don't bring that BS into his house..."  End of post.

When I first started reading education blogs, it was pretty clear that Rotherham's "Eduwonk" blog stood head and shoulders above all others in terms of the sheer amount of information it contained relating to education news.  But over the past couple of years or so, it seems to me like the posts have gotten shorter and the language snarkier.  Where once background information and analysis were provided, he now seems content to simply lob verbal bombs at anyone opposed to reforms he likes and move on with his day.  Perhaps he's gotten busier, perhaps he's grown weary of constantly explaining himself, or maybe he's just frustrated that everybody won't listen to him.  All would be understandable.  But, whatever the reason, the quality of the blog has significantly deteriorated in my eyes.  And his latest post is a perfect example of why.

I'm not quite sure if he's agreeing with Sawchuk or simply pointing readers in that direction b/c he thought it was entertaining -- which is precisely my problem with his posts lately.  I assumed that he was agreeing with Sawchuk when I first read it, b/c if he was disagreeing I'd assume he would've indicated that in some way.  But whether he's just standing back and saying "wow, look at that" or saying "wow, great post," the labeling of Van Roekel's statement as "BS" is off the mark.  It's hyperbolic, sure, but it's also important to take into consideration.  And I take issue with the snarky marginalization of important statements.

Maybe I'm reading too much into this, but it seems like I've been reading more and more of these snippy knee-jerk reactions to anybody opposed to any aspect of the DOE's favored reforms lately.  I agree that there are a lot of bad arguments against charter schools, merit pay, school turnarounds, etc., but that doesn't mean that everybody who points out any weakness in any of the above should automatically be dismissed with a wave of the hand and a bit of derision.

The irony is that one of the greatest weaknesses of value-added scores is their instability (largely due to small sample sizes), and that testing more would yield more accurate scores.  So proponents of value-added scores who dismiss the "one test" criticism are actually arguing for weaker, less meaningful value-added scores -- which isn't going to help them become ubiquitous any time soon.

Monday, May 17, 2010

__________ Shouldn't Attend College

The Times had an interesting little article in yesterday's Week in Review section entitled "Plan B: Skip College".  The article runs through a quick list of scholars advocating alternatives to four-year colleges for high school students, and then mostly focuses on a couple of economists advocating vocational training as a substitute for college.  It calls "urging that some students be directed away from four-year colleges" a "third rail of the education system."

In some ways, the article is all well and good.  Yes, not everybody needs to go to college.  Yes, it can be dangerous to say this.  Yes, college is expensive.  Yes, lots of students begin college and don't finish.  And, yes, vocational training in high schools is fading fast -- to some extent in response to the increased academic focus of the NCLB era.

But I always have the same two problems when I hear these arguments:

1.) College does not exist only to train students for future employment.  Students might benefit from attending college in myriad ways regardless of whether or not it directly relates to their future careers.  Similarly, society may benefit in many ways other than a more skilled labor force if more people attend college.  The article quotes one economist as asking "why 15 percent of mail carriers have bachelor’s degrees" when “some of them could have bought a house for what they spent on their education”.  Some of those mail carriers may regret going to college, but I'll bet many don't -- and democracy is dependent on a more educated populace.

2.) Who, exactly, shouldn't attend college?  It's all well and good to argue that many just won't cut it in -- and/or won't really benefit from -- college, but who doesn't go, and who gets to decide?  Almost all upper and upper-middle class parents will accept nothing less than attendance at a four-year institution from their children, so I don't think the "skip college" movement is going anywhere with that crowd -- those kids will continue to attend college whether economists like it or not.  Which leaves us with kids from middle, working, and lower class families.  Should we really have different expectations for these kids?  The reality of the situation says that maybe we should, but I'm not sure I can find any good justification for channeling these kids, and pretty much only these kids, into less scholarly programs once they reach high school and beyond.  In this sense, one scholar's argument to "get them some intervening credentials, some intervening milestones. Then, if they want to go further in their education, they can" may be a good compromise between what is morally imperative and what is practically feasible.

Given that less than a third of Americans aged 25-29 hold bachelor's degrees, there's certainly a need to both study and improve other avenues to education and career training.  But let's not lose sight of these two problems when we discuss possible solutions.

Wednesday, May 12, 2010

My Revolutionary New Grading System

I taught a section of an undergraduate public policy course this spring (part of the reason the blog updates have been few and far between) and decided to try out a bit of a different grading system for some of the assignments.  I almost blogged about this at the beginning of the semester, but I figured I'd wait and see how it played out first.

First, some background: when I TA'd the class in the past, we had a number of short papers that the students didn't put all that much effort into.  At a place like Vanderbilt, which has become super-competitive in recent years, I figured we'd be teaching some of the best and brightest young minds in the country.  But I can't honestly say that the writing ability of the students terribly impressed me.  So I decided that it would be better for the students if I spent more time emphasizing the quality of the writing than the quantity of the writing.

So this semester, instead of giving weekly reflection papers, I assigned three 1-2 page policy analysis papers (in addition to a final paper, some quizzes, and a debate), but with a twist: the assignment wasn't done until the student turned in an 'A' paper, and the grade was based on how many attempts it took to accomplish this.  At the same time, I tried to give out very specific instructions, sometimes a model paper, and offered copious amounts of help: meeting with students, reading drafts, and giving extensive feedback on their papers (which they actually read, at least if they didn't get an 'A').  I'm not sure how much a student gains from simply slapping together a 'B' paper at the last minute, turning it in, and never thinking about it again (nor am I sure that's a good use of the professor's time).  So not only would all students be almost forced to do their best (or at least high-quality) work, from which they should learn more, but the incentives would be better aligned with potential benefits (a bright student who is content with a B has it easy under the typical system, but under this system earning a B is at least as much work as earning an A).

My initial intention was to award students an 'A' if it took them one try to write an 'A' paper, a 'B' if it took two tries, a 'C' if it took three, a 'D' if it took four, and pray nobody took more than four.  But I decided that was a little too rough around the edges -- it wouldn't be fair to award a student who turned in a 'D' paper and an 'A' paper the same grade as somebody who turned in an 'A-' and an 'A'.  At the same time, I didn't want it to be possible for somebody to earn a higher grade after submitting two papers than somebody else did while submitting only one (which eliminates the simple averaging of all grades that some of the students wanted).  So after some tinkering, I came up with the following system: I'd average a base grade determined by the number of drafts it took (85 for two tries, 75 for three, and so on) with the average of all the papers they'd turned in.  I calculated some mock grades and handed the following chart out to the students:

1st   2nd   3rd   4th   avg.    base   final   grade
 60    93               76.5     85     81      B-
 90    95               92.5     85     89      B+
 91    99               95       85     90      A-
 80    95               87.5     85     86      B
 75    95               85       85     85      B
 70    80    95         81.67    75     78      C+
 75    85    95         85       75     80      B-
 65    88    93         82       75     79      C+
 70    75    90    95   82.5     65     74      C
 85    88    93         88.67    75     82      B-
 95                                     95      A
 98                                     98      A+
So, did it work?  Yes and no.

The main goal was to elicit better writing from the students.  I can honestly say that the quality of writing improved quite a bit over the course of the semester (the average grades on the three papers, in order, were 85, 88, and 92 -- and I'd like to think the grading standards were fairly consistent).  I foresaw a couple drawbacks: 1.) the system was a lot more work for me, and 2.) the students might rebel.

The first was definitely true.  At first I enthusiastically dove into drafts and re-writes and wrote copious comments, but after reading the same papers over and over again it began to get tedious.  After the first draft of the first paper, students figured out that sending me a draft to read over ahead of time was a good idea (they were right), and I gave very few students an 'A' on the first attempt (nobody earned a straight 'A' on more than one of the three papers), meaning that I read drafts of the first version, graded the first versions, read drafts of the second versions, graded the second versions, and so on.  Luckily, most people got it by the second version.

The second issue wasn't as bad as I feared.  I just got my teacher evaluations back and, despite some griping about grading and the papers, they were quite good.  The comments specifically about the papers were decidedly mixed, with a few students obviously bitter over the grading system, but at least as many acknowledging that it helped them.  I've posted all those comments at the end of this post.

So, would I do it again?  Yes . . . well, maybe.  On the one hand, I'm convinced from what I saw and heard that this substantially benefited students in the end.  On the other, it was an awful lot of work for me (and I think it would be horribly draconian and unfair to implement it without offering extra writing feedback).  I'm willing to work on tweaking the grading system a bit to soothe hurt feelings, but it may be as much about framing the issue (e.g. two drafts is supposed to be a B, so turning in a 91 and a 97 and earning an 89 is actually fairer than earning an 85) as it is about the actual grading.  Before doing this again, I think I would try even harder to make expectations and directions comprehensive and crystal clear, both to help students and to potentially reduce the number of drafts and re-writes I have to read.  I think teachers have to walk a fine line between setting a high bar for students and being unnecessarily punitive.  Nobody would accuse me of simply handing out A's just for being there, but I'd like to think I also avoided simply piling on so I could brag about how difficult I made the students' lives.  At least in this case, I think setting a high bar in terms of quality rather than quantity was what was best for the students.

The "revolutionary" in the title was mostly tongue-in-cheek, in case you didn't get that, but I'll continue to tinker with these ideas.  Any feedback or suggestions are most welcome.

Here are the student comments about the papers:

general comments about instructor/course

"I actually learned how to write policy evaluations and proposals and could see my improvement. I really appreciate all the time that Corey took to give feedback on papers no matter how late they were sent in. If every teacher were as helpful as Corey, I would be learning a lot more in my classes."

"The paper grading is atrocious. Requires re-writes until student gets what he deems a 92. Amount of re-writes dictates what average is. For example, one rewrite means the two grades you get are averaged with an 85. I got a 91 on a paper, then a 97, which averages to a 94, but when that is averaged with an 85, it becomes an 89, which is lower than my initial grade. Pointless and penalizes unnecessarily."

The strongest feature of this course was:

"The papers and application to real word current events."

"The emphasis on clear and concise writing and the importance placed on the quality of writing over the quantity."

The weakest feature of this course was:

"the grading policies"

"unclear expectations for policy papers"

"THE PAPERS! Haha. I hated having to do write re-writes, but I understand the reasoning behind it. The grading system is frustrating when one receives a 91 on the first drafts... but I really do understand the importance and value behind doing it this way."

"The grading policy of the papers. I understand the reasoning behind it, but it was still rough to get a 91 on a paper and then actually wind up with a lower grade in the end after rewriting it."

"Grading policy on papers"

"The policy papers were good in theory... they ended up being crap. We cannot read minds and therefore can never satisfy the professor. It is unfair that if someone gets a 91 originally, their grade is LOWERED after improving it."

My suggestions for the improvement of this course are:

"new grading policy"

"Maybe prepare students more for papers because it's hard to learn what exactly is required. It took a few attempts before I fully understood what was needed for a strong paper."

My suggestions for the instructor to improve her/his teaching are:

"Keep doing the re-writes, they were a pain in the rear but improved my writing dramatically. NEVER use the Andersen book and just use CNN articles related to current events."

"Change the grading of paper rewrites so that the three average together instead of how you do it. It seems fairer. Ex. Your Way 91 to 97 = (((91+97)/2)+85)/2 = 89.5 Ex. Ideal Way 91 to 97 = (91+97+85)/3 = 91"

"Paper grading policy. It may seem like a good idea in theory and I agree that it does increase writing skill, but the way grades are calculated is foolish"

"One improvement I would suggest would be tweaking the grading system for the papers. The general system in place is effective in that it challenges the students to learn through experience how to be better writers, but at times it can be a little harsh."

Sunday, May 9, 2010

Why Do Harvard Kids Head to TFA?

James Kwak has an interesting post entitled "Why Do Harvard Kids Head to Wall Street?"  Most interesting to me is that, at least up until the point where he starts discussing money, one could substitute Teach for America for every mention of Goldman Sachs or McKinsey.

For example:

The typical Harvard undergraduate is someone who: (a) is very good at school; (b) has been very successful by conventional standards for his entire life; (c) has little or no experience of the “real world” outside of school or school-like settings; (d) feels either the ambition or the duty to have a positive impact on the world (not well defined); and (e) is driven more by fear of not being a success than by a concrete desire to do anything in particular. (Yes, I know this is a stereotype; that’s why I said “typical.”) Their (our) decisions are motivated by two main decision rules: (1) close down as few options as possible; and (2) only do things that increase the possibility of future overachievement.

followed by:

The recruiting processes of Wall Street firms (and consulting firms, and corporate law firms) exploit these (faulty) decision rules perfectly. The primary selling point of Goldman Sachs or McKinsey is that it leaves open the possibility of future greatness. The main pitch is, “Do this for two years, and afterward you can do anything (like be treasury secretary).”  

and then:

For people who don’t know how to get a job in the open economy, and who have ended each phase of their lives by taking the test to do the most prestigious thing possible in the next phase, all of this comes naturally.

This seems like a pretty good explanation as to why something like 10% of recent Ivy League grads have entered TFA after college.  I didn't go to an Ivy League college, but it's not all that dissimilar to the reasons I started teaching after college (for me, getting a chance to "make a difference" right away was number one and those other things were next).  Which leads me to two conclusions:

1.) TFA might be as savvy as Goldman Sachs and McKinsey in some ways.
2.) Given that these students are doing society more good at TFA than at those types of firms, maybe we should have more public good-oriented programs that recruit in this manner.