Tuesday, September 21, 2010

Live-Blogging the Release of the Nashville Performance Pay Experiment Results

Today marks the release of the National Center on Performance Incentives' report on the Nashville performance pay experiment (known as POINT).  The full report will be available on their website after the discussion.  In the meantime, I'm going to provide you with some snapshots of what's being said and written.  Please check out my previous posts on this study as well.  A live streaming video of the press conference is online here.  The press release (which is pretty good) is now available here.  The full report is available here.

Previous posts:
Part 1: Background Info
Part 2: What to Look For
Part 3: Why it Matters
Part 4: What We Can Learn

12:50pm: The press conference has begun.  I'll start by posting the summary statement from the executive summary of the report:

POINT was focused on the notion that a significant problem in American education is the absence of appropriate incentives, and that correcting the incentive structure would, in and of itself, constitute an effective intervention that improved student outcomes.

By and large, results did not confirm this hypothesis. While the general trend in middle school mathematics performance was upward over the period of the project, students of teachers randomly assigned to the treatment group (eligible for bonuses) did not outperform students whose teachers were assigned to the control group (not eligible for bonuses).


Before you get excited, or disappointed, about this, bear in mind what I've written before -- the most important things we can learn from this study aren't what happened to test scores, they're the insights into teacher behavior from the interviews and surveys.  Keep checking back for more details on this and other topics.

Also, please keep in mind that this study does not definitively prove that merit pay systems are either a good or a bad idea.

1:00pm: The number of teachers who received bonuses remained steady throughout (40, 41, 44), but the number of eligible teachers declined significantly (143, 105, 84) -- meaning that in the final year over half of the eligible teachers received a bonus for exceeding the historical 80th percentile of teacher performance.  This could mean that less successful teachers tended to leave the study -- whether by switching subjects, schools, or careers.  Or it could mean that the tests became easier and more teachers were rewarded.  I would say that it could mean it took three years for the incentives to have an effect, except that the treatment group did not outperform the control group -- even in the final year.
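
For the numerically inclined, here's the arithmetic behind that claim -- a trivial Python sketch using only the counts above, not anything from the report's own analysis:

    # Award rates among eligible treatment teachers, using the counts above.
    winners = [40, 41, 44]     # bonus recipients in years 1-3
    eligible = [143, 105, 84]  # eligible treatment teachers in years 1-3

    for year, (w, e) in enumerate(zip(winners, eligible), start=1):
        print(f"Year {year}: {w}/{e} = {w / e:.1%}")

    # Year 1: 40/143 = 28.0%
    # Year 2: 41/105 = 39.0%
    # Year 3: 44/84  = 52.4%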

from the executive summary: "attrition of teachers from POINT was high. By the end of the project, half of the initial participants had left the experiment."

The report says that the differences in attrition -- and reasons for attrition -- between the control and treatment groups were not statistically significant.  But it reports that 27 control group teachers, and only 15 treatment group teachers, left the study because they'd switched districts during the experiment.  I'd like to know more about this and whether it might be evidence that the incentives made teachers slightly more likely to remain in the Metro Nashville school system.
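
Out of curiosity, here's how one might eyeball that 27-vs-15 gap -- a hedged sketch, not the report's analysis.  I'm assuming roughly 150 teachers per arm as a placeholder; the report has the exact denominators:

    # Hypothetical significance check on district-leavers: 27 control vs. 15 treatment.
    # ASSUMPTION: ~150 teachers per arm (placeholder; the report gives exact sizes).
    from scipy.stats import fisher_exact

    n_control, n_treatment = 150, 150      # placeholder arm sizes (assumption)
    left_control, left_treatment = 27, 15  # district-leavers, from the report

    table = [[left_control, n_control - left_control],
             [left_treatment, n_treatment - left_treatment]]
    odds_ratio, p_value = fisher_exact(table)
    print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.3f}")
    # With these placeholder denominators the p-value hovers near 0.05,
    # which is why the exact group sizes matter before reading much into it.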

about teacher attrition, from the report:


Teachers who left the study tended to differ from stayers on many of the baseline variables. Teachers who dropped out by the end of the second year of the experiment were more likely to be black, less likely to be white. They tended to be somewhat younger than teachers who remained in the study all three years. These dropouts were also hired more recently, on average. They had less experience (including less prior experience outside the district), and more of them were new teachers without tenure compared to teachers who remained in the study at the end of the second year. Dropouts were more likely to have alternative certification and less likely to have professional licensure. Their pre-POINT teaching performance (as measured by an estimate of 2005-06 value added) was lower than that of retained teachers, and they had more days absent. Dropouts completed significantly more mathematics professional development credits than the teachers who stayed. Dropouts also tended to teach classes with relatively more black students and fewer white students. They were more likely to be teaching special education students.

I'm going to need a little time to digest that, but the next table demonstrates that treatment and control group teachers, in all three years, did not differ in pre-experiment effectiveness (as measured by tests) -- meaning that more effective teachers didn't seem more likely to stay when they could earn bonuses.  Treatment group teachers were, however, six percentage points less likely over the three years to leave the middle school in which they started.

1:03pm: There were, however, positive effects for 5th grade teachers in the 2nd and 3rd years, though those effects had faded by the end of 6th grade.  The 5th grade teachers had the same students all day for multiple subjects, so it's possible that they shifted focus to math instruction or that they simply knew their students better and were able to get better results.  The center also analyzed results on other subject tests and found no differences that would indicate teachers ignored those subjects to focus exclusively on math.

33.6% of the treatment group received a bonus in at least one year (out of 152 teachers, 16 won once, 17 won twice, and 18 won all three times).  Analysis done by the researchers before the experiment found that 55% of teachers were within a few correct answers per student of the scores that would have earned them a bonus.
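
Just to double-check that arithmetic (again, a trivial sketch using only the figures above):

    # Repeat-winner counts from the report: 16 once, 17 twice, 18 all three years.
    once, twice, thrice = 16, 17, 18
    treatment_n = 152
    ever_won = once + twice + thrice  # 51 teachers won at least one bonus
    print(f"{ever_won}/{treatment_n} = {ever_won / treatment_n:.1%}")  # 51/152 = 33.6%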

1:08pm: Here are some other interesting tidbits from the executive summary, report, and press conference (which is now in Q&A):

-80% of teachers reported that they were already working as hard as they could and didn't change their effort due to the opportunity to earn an incentive.

-from the executive summary: "The introduction of performance incentives in MNPS middle schools did not set off significant negative reactions of the kind that have attended the introduction of merit pay elsewhere. But neither did it yield consistent and lasting gains in test scores. It simply did not do much of anything."

-from the report: "From an implementation standpoint, POINT was a success. This is not a trivial result, given the widespread perception that teachers are adamantly opposed to merit pay and will resist its implementation in any form."

-I didn't mention this before, but the assignment of teachers to the control/treatment groups -- and whether or not they received a bonus -- was, officially, confidential.  The center didn't distribute this information to principals or teachers, and participating teachers signed statements saying that they wouldn't tell anybody their status.  It looks like this was at least moderately successful, as about 75% of teachers reported that they didn't know whether anybody had won a bonus in their school.

-There were no differences in student attrition between the groups -- meaning there's no evidence that bonus-eligible teachers were more likely to get problem students removed from their classes.  There were also no differences in late enrollment, and the students who left treatment teachers' classes were no lower-scoring than those who left control teachers' classes.

1:12pm: Dale Ballou responds to a question by saying that test scores bounced around a lot before the experiment, making it "difficult to extrapolate" from the data prior to the start of the experiment and tell whether teachers were doing better post-bonus than pre-bonus.

1:16pm: From the report, here are the results from the teacher survey.  In short, nothing huge or shocking (note that TCAP is the Tennessee state test):

There are few survey items on which we have found a significant difference between the responses of treatment teachers and control teachers. (We note all contrasts with p values less than 0.15.) Treatment teachers were more likely to respond that they aligned their mathematics instruction with MNPS standards (p = 0.11). They spent less time re-teaching topics or skills based on students’ performance on classroom tests (p = 0.04). They spent more time having students answer items similar to those on the TCAP (p = 0.09) and using other TCAP-specific preparation materials (p = 0.02). The only other significant differences were in collaborative activities, with treatment teachers replying that they collaborated more on virtually every measured dimension. Data from administrative records and from surveys administered to the district’s math mentors also show few differences between treatment and control groups. Although treatment teachers completed more hours of professional development in core academic subjects, the difference was small (0.14 credit hours when the sample mean was 28) and only marginally significant (p = 0.12). Moreover, there was no discernible difference in professional development completed in mathematics. Likewise, treatment teachers had no more overall contact with the district’s math mentors than teachers in the control group.

1:20pm: One question I really want answered (and that I don't see in the report) is how the bonus winners and losers differed (experience, school type, etc.), if they differed at all.  A questioner just asked something along those lines, and the response was that they've only begun to look at that, and that they're wary of drawing too many conclusions because the sample sizes get smaller the more focused the question becomes.

1:27pm: not sure this is quite verbatim, from the panel, but it's a good point nonetheless: "just because we didn't find an effect from doing incentives this way doesn't mean you wouldn't find results by doing it another way."

1:43pm: The questions aren't really helping to advance the discussion all that much -- in part b/c there are too many speeches and not enough questions.  The line at the microphone is now too long for me to get a question in before the end of the session, but here's what I'd ask:

1.) There are two assumptions behind performance pay that this study was designed to test: first, that teachers will respond to the opportunity to earn incentives by changing their practice; and second, that these changes in practice will result in better student performance.  The report says there were very few differences in reported teacher behavior between the treatment and control groups, but is there enough data to draw any conclusions about the performance of those who did report making changes?

I had a second question, but it now completely escapes me.  I'll try to find an answer to at least that first question.  That's going to do it for now, but expect a more concise summary of the report's findings in the next 24 hours or so.

Wait, just kidding: here are the other four questions that are rattling around in my brain right now.  I ran into one of the researchers in the hallway and asked him the question above in addition to the first three below.  The answer was basically that they're good questions, and ones they plan on looking into more -- the focus for today was completing the analysis of the trends in test scores.

2.) What differences, if any, were there between bonus winners and losers (e.g. were winners more experienced, teaching in better schools, teaching higher- or lower-performing students, teaching a different subject, etc.)?

3.) What differences, if any, were there between the reactions of bonus winners and losers (e.g. more likely to stay/leave, more/less critical of the experiment, more/less enthusiastic, more/less likely to report subsequent changes in behavior)?

4.) I think 15 treatment group and 27 control group teachers left the district during the experiment.  Can we consider that statistic (and maybe any relevant survey/interview questions) evidence that teachers might be less likely to leave an urban district if they can earn a bonus there but not elsewhere?

5.) A much higher percentage of eligible treatment group teachers won bonuses in the final year (over half, compared to less than one-third in the first year).  Is that simply because less experienced teachers were much more likely to leave their assignments, because the test changed, or because of something else?

Now that's (really) all for now.
