The most popular opinion of the last few days seems to be that the primary purpose of merit pay is to re-shape the teacher labor force by attracting and retaining better teachers. The notion that performance incentives would motivate teachers to perform better in the classroom has been implicitly or explicitly derided as silly and/or unimportant.
Did I miss something? Maybe I need to do some archival research, but I could've sworn that before the release of the results there weren't many merit pay proponents making this argument. But since learning of the lack of effect on standardized test scores in the Nashville experiment, it seems to be the only one I hear.
After learning of the results, Rick Hess wrote that
The second school of thought, and the one that interests serious people, is the proposition that rethinking teacher pay can help us reshape the profession to make it more attractive to talented candidates, more adept at using specialization, more rewarding for accomplished professionals, and a better fit for the twenty-first century labor force.
and the Washington Post quotes Eric Hanushek saying
The biggest role of incentives has to do with selection of who enters and who stays in teaching - i.e., how incentives change the teaching corps through entrance and exits . . . I have always thought that the effort effects were small relative to the potential for getting different teachers. Their study has nothing to say about this more important issue.
and Tom Kane writes:
the impact of the specific incentive they tested depends on what underlies the differences in teacher effectiveness–effort vs. talent and accumulated skill. I’ve never believed that lack of teacher effort–as opposed to talent and skills–was the primary issue underlying poor student achievement gains. Rather, the primary hope for merit pay is that it will encourage talented teachers to remain in the classroom or to enter teaching.
And the Obama administration's official position seems to align with that too. Here's how the same Washington Post article described its response:
"While this is a good study, it only looked at the narrow question of whether more pay motivates teachers to try harder," said Peter Cunningham, assistant U.S. education secretary for communications and outreach. "What we are trying to do is change the culture of teaching by giving all educators the feedback they need to get better while rewarding and incentivizing the best to teach in high-need schools, hard to staff subjects. This study doesn't address that objective."
Maybe I'm wrong and there are more people that would've agreed with these four statements a few days ago than I think, but there were certainly more than a couple people arguing that performance incentives would increase teachers' motivation, improve their classroom performance, and subsequently increase the academic performance of their students. I've had conversations with people who've directly told me that lack of motivation is a huge problem in teaching and that providing proper incentives would fix this.
Without more research, I can't tell you whether people have conveniently changed their mind about the primary purpose of performance pay or whether those who believe it should be used primarily to alter the teacher labor force are now simply stepping to the forefront while those who believed in its motivational potential are shrinking into the background. But I'd guess that it's a little of both.
On the plus side, might everyone now agree that teacher pay should be re-fashioned with the primary goal being to encourage the recruitment and retention of excellent teachers? Do I hear a consensus emerging? I guess time will tell . . .
Showing posts with label performance pay.
Thursday, September 23, 2010
Wednesday, September 22, 2010
Nashville Incentive Pay Experiment: Results Wrap-Up
The National Center on Performance Incentives released the final report on the Nashville performance pay experiment (known as POINT) yesterday. The press release is available here, and the full report is available here.
The study involved 296 middle school math teachers in Nashville who were assigned to either a treatment group (eligible for bonuses of $5,000, $10,000, or $15,000) or a control group and then tracked for three years.
The main result was that students assigned to bonus-eligible teachers did not perform any better than students assigned to control group teachers. The lone exception was 5th grade, where students showed gains in years 2 and 3 of the study, but the gains did not persist through the end of 6th grade. The main portion of the executive summary reads as follows:

POINT was focused on the notion that a significant problem in American education is the absence of appropriate incentives, and that correcting the incentive structure would, in and of itself, constitute an effective intervention that improved student outcomes.

By and large, results did not confirm this hypothesis. While the general trend in middle school mathematics performance was upward over the period of the project, students of teachers randomly assigned to the treatment group (eligible for bonuses) did not outperform students whose teachers were assigned to the control group (not eligible for bonuses).
Prior to implementation, researchers calculated that 55% of teachers would be able to obtain bonuses if their students answered only an additional 2-3 questions correctly on the state test. Across the three years of the study, 33.6% of bonus-eligible teachers earned a bonus in one or more years.
This means that most teachers did not dramatically improve, even for one year. It also means that the tendency of test scores to bounce around significantly did not result in different random groups of teachers receiving a bonus each year. About two-thirds of teachers in the treatment group never received a bonus. And about two-thirds of the teachers who received a bonus earned one in multiple years.
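The arithmetic behind those two-thirds figures can be checked against the winner counts reported later in the study (of 152 treatment teachers, 16 won a bonus once, 17 twice, and 18 in all three years). A quick sketch:

```python
# Bonus-winner counts from the POINT report: of 152 treatment teachers,
# 16 won a bonus in one year, 17 in two years, and 18 in all three.
once, twice, thrice = 16, 17, 18
treatment_n = 152

winners = once + twice + thrice                # teachers with at least one bonus
share_any = winners / treatment_n              # fraction ever winning
share_repeat = (twice + thrice) / winners      # winners who won in multiple years

print(f"{share_any:.1%} ever won a bonus")                     # ~33.6%, matching the report
print(f"{1 - share_any:.1%} never won one")                    # roughly two-thirds
print(f"{share_repeat:.1%} of winners won in multiple years")  # roughly two-thirds
```

Both "about two-thirds" claims in the paragraph above fall straight out of these counts.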
Of potentially more import are the results from the teacher interviews and surveys. These will continue to be analyzed in the coming months, but right now the main takeaway point is that teachers in the treatment group really didn't report changing much at all. About 80% of teachers reported that they were already working as hard as they could before the incentives were implemented and were therefore unable to work any harder afterward.
There are two potential bright spots, however, for merit pay proponents:
1.) The project ran smoothly (e.g. the right teachers received the right bonuses at the right time) and didn't suffer any major backlash. That this was truly a partnership between union, university, district, and other groups probably helped in this regard.
2.) It's unclear right now, but the bonuses may have had a small effect on the patterns of teacher attrition for those in the bonus group. 27 teachers in the control group left to teach in another district while only 15 teachers in the treatment group did. The numbers are small, so conclusions are hard to draw, but the most popular criticism of the study by performance pay advocates seems to be that it didn't shed any light on how performance pay might affect teacher recruitment and retention.
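For readers who want to gauge how fragile that 15-vs-27 comparison is, here is one way to run a quick significance check. Note the assumptions: the report itself says the attrition differences were not statistically significant, and it does not give exact group sizes at each point, so the even ~148/148 split below is my guess from the 296 total, not a figure from the report.

```python
# Rough significance check on the 15-vs-27 district-leaver counts.
# ASSUMPTION: the 296 teachers split roughly evenly (~148 per group).
# The report itself found attrition differences not statistically significant.
from scipy.stats import fisher_exact

group_n = 148                        # assumed group size, not stated in the report
left_treatment, left_control = 15, 27

table = [[left_treatment, group_n - left_treatment],
         [left_control, group_n - left_control]]
odds_ratio, p_value = fisher_exact(table)

print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.3f}")
```

With counts this small, the test has little power either way, which is consistent with the report's caution about drawing conclusions here.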
There was also no evidence that treatment group teachers were successful in gaming the system by having more kids (or lower performing kids) removed from their class, preventing new students from transferring into their classes, focusing more efforts on math instruction at the expense of other subjects, or helping their students cheat on the state tests.
What does it mean?
I wrote before that this was one of the most important studies this decade. That said, it's still just one study. And we have to be careful about how we extrapolate from the results of any one study. The main question the study was set up to answer was whether offering performance incentives to individual teachers would result in their students performing better on standardized tests. The study offered no evidence that this was the case, and little reason to believe that this would be the case for similarly designed incentive systems.
Why is this so? The main reason seems to be the lack of any major changes by teachers. But there are a couple of other possibilities. Just because teachers reported that they didn't change anything doesn't necessarily mean that they didn't change anything. So it's not out of the realm of possibility that at least some of the teachers made changes but that these changes didn't yield subsequent results. It's also possible that, despite the fact that teachers received the project fairly well, they wanted to prove the working hypothesis wrong -- that is, that teachers eligible for a bonus, at least on some level, resented the fact that somebody thought they'd teach better if they were offered a carrot, and decided to not work harder despite the carrot looking awfully delicious. At the same time, the control group teachers could've worked a little harder to prove that they were plenty motivated to teach solely because they wanted their students to succeed. While the answer is likely a little of all of the above, I tend to think the most likely scenario is that teachers simply weren't all that much more motivated by the prospect of a bonus fairly far down the road (teachers were paid the following November).
What it doesn't mean is that all merit pay schemes forevermore are doomed to abject failure. We don't know if different types of bonuses awarded in different ways (shorter time spans, group awards, non-monetary awards, etc.) might have a larger effect, and we know little about how performance pay affects the long-term make-up of the teacher labor force. At the same time, it does call into serious question the application of the overly simplistic homo economicus model used by economists. My gut feeling is that economists tend to view teacher motivation and the teaching profession in an overly simplistic manner guided too much by basic economic theories and not enough by the literature on the sociology of teaching or the psychology of motivation.
Where do we go from here?
NCPI has another study utilizing team-based incentives that should be out at some point over the next couple of years. In addition, other non-randomized studies of the myriad incentive systems that have sprouted all over the country the past couple years are certainly underway as well. I'm not sure why the NCPI researchers chose to claim that this was "the first scientific study of performance pay ever conducted in the United States," since the definition of "scientific" is hotly debated, but the literature base will certainly continue to grow in the coming years regardless.
In addition to the continuing analysis of the data from this study, there are plenty of opportunities for other researchers and funding agencies to examine questions surrounding the impact of merit pay on the teacher labor force, on school culture, on student outcomes other than test scores, and many other areas.
In the meantime, it's important that merit pay opponents not claim that this study proves once and for all that merit pay does not, and will not ever, work in schools. If nothing else, it appears likely that better teachers got paid more money than worse teachers -- which is arguably an improvement on the current system. And, at the same time, it's important that merit pay proponents not claim that this study is meaningless and barrel ahead with merit pay in schools at breakneck speed.
Merit pay has grown considerably more popular among teachers over the past decade, so it's eminently possible that districts and unions can work together to design and implement better, more nuanced merit pay systems that might have a better chance of success. I don't think anybody would argue that the status quo is the perfect system, so simply saying that merit pay won't work isn't a solution. But these results indicate that we should proceed with caution: the assumption that teachers will work harder for financial incentives is now a dangerous one to make. As such, the performance pay systems that will undoubtedly continue to emerge should begin with a more nuanced and informed understanding of the practices and motivations of current and prospective teachers. We can only hope that an informed and open-minded approach from both sides will eventually result in compensation systems that attract and reward good teachers in ways that current teachers find meaningful and fair.
For more information, see my previous posts on the experiment:
Part 1: Background Info
Part 2: What to Look For
Part 3: Why it Matters
Part 4: What We Can Learn
Live-Blog of Results
Tuesday, September 21, 2010
Live-Blogging the Release of the Nashville Performance Pay Experiment Results
Today marks the release of the National Center on Performance Incentives' report on the Nashville performance pay experiment (known as POINT). The full report will be available on their website after the discussion. In the meantime, I'm going to provide you with some snapshots of what's being said and written. Please check out my previous posts on this study as well. A live, streaming, video of the press conference is online here. The press release (which is pretty good) is now available here. The full report is available here.
Previous posts:
Part 1: Background Info
Part 2: What to Look For
Part 3: Why it Matters
Part 4: What We Can Learn
12:50pm: The press conference has begun. I'm going to begin by posting the summary statement from the executive summary of the report:
POINT was focused on the notion that a significant problem in American education is the absence of appropriate incentives, and that correcting the incentive structure would, in and of itself, constitute an effective intervention that improved student outcomes.
By and large, results did not confirm this hypothesis. While the general trend in middle school mathematics performance was upward over the period of the project, students of teachers randomly assigned to the treatment group (eligible for bonuses) did not outperform students whose teachers were assigned to the control group (not eligible for bonuses).
Before you get excited, or disappointed, about this, bear in mind what I've written before -- the most important things we can learn from this study aren't what happen to test scores, they're insights into teacher behavior from the interviews and surveys. Keep checking back for more details on this, and other, topics.
Also, please keep in mind that this study does not definitively prove either that merit pay systems are a bad idea or a good idea.
1:00pm: The number of teachers who received bonuses remained steady throughout (40, 41, 44), but the number of eligible teachers declined significantly (143, 105, 84) -- meaning over half of the teachers received a bonus for being above the historical 80th percentile of teachers in the last year. This could mean that less successful teachers tended to leave the study -- whether by switching subjects, schools, or careers. Or it could mean that the tests became easier and more teachers were rewarded. I would say that it could mean that it took three years for the incentives to have an effect, but the treatment group did not outperform the control group -- even in the final year.
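To make the "over half" figure concrete, here is the year-by-year winner share implied by those counts (simple arithmetic, using only the numbers above):

```python
# Winners and eligible teachers by year, as reported at the press conference.
winners = [40, 41, 44]
eligible = [143, 105, 84]

for year, (w, n) in enumerate(zip(winners, eligible), start=1):
    print(f"Year {year}: {w}/{n} = {w / n:.1%} of eligible teachers won a bonus")
# Year 1: 40/143 = 28.0%
# Year 2: 41/105 = 39.0%
# Year 3: 44/84 = 52.4%
```

The winner share climbs each year, but as noted above, that's driven almost entirely by the shrinking pool of eligible teachers, not by a rising number of winners.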
from the executive summary: "attrition of teachers from POINT was high. By the end of the project, half of the initial participants had left the experiment."
The report says that the differences in attrition -- and reasons for attrition -- between control and treatment groups was not statistically significant. But it reports that 27 control group teachers, and only 15 treatment group teachers, left the study because they'd switched districts during the experiment. I'd like to know more about this and whether this might be evidence that incentives made teachers slightly more likely to remain in the Metro Nashville school system.
about teacher attrition, from the report:
Teachers who left the study tended to differ from stayers on many of the baseline variables. Teachers who dropped out by the end of the second year of the experiment were more likely to be black, less likely to be white. They tended to be somewhat younger than teachers who remained in the study all three years. These dropouts were also hired more recently, on average. They had less experience (including less prior experience outside the district), and more of them were new teachers without tenure compared to teachers who remained in the study at the end of the second year. Dropouts were more likely to have alternative certification and less likely to have professional licensure. Their pre-POINT teaching performance (as measured by an estimate of 2005-06 value added) was lower than that of retained teachers, and they had more days absent. Dropouts completed significantly more mathematics professional development credits than the teachers who stayed. Dropouts also tended to teach classes with relatively more black students and fewer white students. They were more likely to be teaching special education students.
I'm going to need a little time to digest that, but the next table demonstrates that treatment and control group teachers, in all three years, did not differ in terms of effectiveness (as measured by tests) in the years prior to the experiment -- meaning that more effective teachers didn't seem more likely to stay if they could possibly earn bonuses. Treatment group teachers, however, were six percentage points less likely to leave the middle school in which they started during the three years.
1:03pm: There were, however, positive effects for 5th grade teachers in the 2nd and 3rd years, though the effects did not persist until the end of 6th grade. The 5th grade teachers had the same students all day for multiple subjects, so it's possible that they shifted focus to math instruction or that they simply knew their students better and were able to get better results. The center did analyze results on other subject tests and found no differences that would indicate teachers ignored these subjects and only focused on math.
33.6% of the treatment group received a bonus in at least one year (out of 152, 16 won once, 17 twice, and 18 thrice). Analysis done by the researchers prior to the experiment found that 55% of teachers were within a few more correct questions per student of attaining scores that would earn them a bonus.
1:08pm: Here are some other interesting tidbits from the executive summary, report, and press conference (which is now in Q&A):
-80% of teachers reported that they were already working as hard as they could and didn't change their effort due to the opportunity to earn an incentive.
-from the executive summary: "The introduction of performance incentives in MNPS middle schools did not set off significant negative reactions of the kind that have attended the introduction of merit pay elsewhere. But neither did it yield consistent and lasting gains in test scores. It simply did not do much of anything."
-from the report: "From an implementation standpoint, POINT was a success. This is not a trivial result, given the widespread perception that teachers are adamantly opposed to merit pay and will resist its implementation in any form."
-I didn't mention this before, but the placement of teachers in control/treatment groups -- and whether or not they received a bonus -- was, officially, confidential. The center didn't distribute this information to principals or teachers, and participating teachers signed statements saying that they wouldn't tell anybody their status. It looks like this was at least moderately successful, as about 75% of teachers reported that they didn't know if anybody had won a bonus in their school.
-There were no differences in student attrition between groups -- meaning that there's no evidence that bonus-eligible teachers were more likely to get problem students removed from their class. There were also no differences in students enrolling late, and students who left treatment teachers' classes were no lower scoring.
1:12pm: Dale Ballou responds to a question by saying that test scores bounced around a lot before the experiment, making it "difficult to extrapolate" from the data prior to the start of the experiment and tell whether teachers were doing better post-bonus than pre-bonus.
1:16pm: From the report, here are the results from the teacher survey. In short, nothing huge or shocking (note TCAP is the TN state test):
There are few survey items on which we have found a significant difference between the responses of treatment teachers and control teachers. (We note all contrasts with p values less than 0.15.) Treatment teachers were more likely to respond that they aligned their mathematics instruction with MNPS standards (p = 0.11). They spent less time re-teaching topics or skills based on students’ performance on classroom tests (p = 0.04). They spent more time having students answer items similar to those on the TCAP (p = 0.09) and using other TCAP-specific preparation materials (p = 0.02). The only other significant differences were in collaborative activities, with treatment teachers replying that they collaborated more on virtually every measured dimension. Data from administrative records and from surveys administered to the district’s math mentors also show few differences between treatment and control groups. Although treatment teachers completed more hours of professional development in core academic subjects, the difference was small (0.14 credit hours when the sample mean was 28) and only marginally significant (p = 0.12). Moreover, there was no discernible difference in professional development completed in mathematics. Likewise, treatment teachers had no more overall contact with the district’s math mentors than teachers in the control group.
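One caveat when reading that passage: with a p < 0.15 screen applied across many survey items, a handful of "significant" contrasts are expected by chance alone. A back-of-the-envelope sketch -- the item count below is hypothetical, since the report doesn't say how many items were compared:

```python
# Expected number of chance "hits" under a p < 0.15 screen.
# ASSUMPTION: the item count is illustrative only; the report does not
# state how many (roughly independent) survey items were compared.
n_items = 50          # hypothetical number of survey items
alpha = 0.15          # the report flags contrasts with p < 0.15

expected_false_positives = n_items * alpha
print(f"Expected chance hits at p < {alpha}: {expected_false_positives:.1f}")
```

If anything like 50 items were tested, several flagged contrasts would be expected even if treatment and control teachers behaved identically -- another reason not to over-read the few differences the report notes.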
1:20pm: One question I really want answered (that I don't see in the report) is how the bonus winners and losers differed (experience, school type, etc.), if they differed at all. A questioner just asked something along those lines, and the response was that they've just begun to look at that, but that they're wary to draw too many conclusions about this because the sample sizes will get smaller the more focused the question is.
1:27pm: not sure this is quite verbatim, from the panel, but it's a good point nonetheless: "just because we didn't find an effect from doing incentives this way doesn't mean you wouldn't find results by doing it another way."
1:43pm: The questions aren't really helping to advance the discussion all that much -- in part b/c there are too many speeches and not enough questions. The line at the microphone is now too long for me to get a question in before the end of the session, but here's what I'd ask:
1.) There are two assumptions behind performance pay that this study was designed to test: one is that teachers will respond to the opportunity to earn incentives by changing their practice, and two is that these changes in practice will result in better student performance. The report says that there were very few differences in reported teacher behavior between treatment and control groups, but is there enough data to draw any conclusions about the performance of those who did report making changes?
I had a second question, but it now completely escapes me. I'll try to find an answer to that first question, at least. That's going to do it for now, but expect a more concise summary of the report's findings in the next 24 hours or so.
Wait, just kidding: here are the other four questions that are rattling around in my brain right now. I ran into one of the researchers in the hallway and asked him the question above in addition to the first three below. The answer was basically that they're good questions, and ones they plan on looking into more -- the focus for today was completing the analysis of the trends in test scores.
2.) What, if any, differences were there between bonus winners and losers (e.g. were winners more experienced, teaching in better schools, teaching higher/lower performing students, teaching a different subject, etc.)?
3.) What, if any differences, were there between the reactions of bonus winners and losers (e.g. more likely to stay/leave, more/less critical of experiment, more/less enthusiastic, more/less likely to report subsequent changes in behaviors)?
4.) I think 15 treatment group and 27 control group teachers left the district during the experiment. Can we consider that statistic (and maybe any relevant survey/interview questions) evidence that teachers might be less likely to leave an urban district if they can earn a bonus there but not elsewhere?
5.) A much higher percentage of the treatment group teachers (over half, compared to less than one-third) won bonuses in the final year. Is that simply because the less experienced teachers were much more likely to leave their assignments, because the test changed, or something else?
Now that's (really) all for now.
Previous posts:
Part 1: Background Info
Part 2: What to Look For
Part 3: Why it Matters
Part 4: What We Can Learn
12:50pm: The press conference has begun. I'm going to begin by posting the summary statement from the executive summary of the report:
POINT was focused on the notion that a significant problem in American education is the absence of appropriate incentives, and that correcting the incentive structure would, in and of itself, constitute an effective intervention that improved student outcomes.
By and large, results did not confirm this hypothesis. While the general trend in middle school mathematics performance was upward over the period of the project, students of teachers randomly assigned to the treatment group (eligible for bonuses) did not outperform students whose teachers were assigned to the control group (not eligible for bonuses).
Before you get excited, or disappointed, about this, bear in mind what I've written before -- the most important things we can learn from this study aren't what happened to test scores; they're the insights into teacher behavior from the interviews and surveys. Keep checking back for more details on this, and other, topics.
Also, please keep in mind that this study does not definitively prove either that merit pay systems are a bad idea or a good idea.
1:00pm: The number of teachers who received bonuses remained steady throughout (40, 41, 44), but the number of eligible teachers declined significantly (143, 105, 84) -- meaning over half of the teachers received a bonus for being above the historical 80th percentile of teachers in the last year. This could mean that less successful teachers tended to leave the study -- whether by switching subjects, schools, or careers. Or it could mean that the tests became easier and more teachers were rewarded. I would say that it could mean that it took three years for the incentives to have an effect, but the treatment group did not outperform the control group -- even in the final year.
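Just to make that arithmetic concrete, here's a quick check of the year-by-year bonus rates implied by those counts. The winner and eligibility counts are the ones reported above; the snippet itself is only an illustration:

```python
# Bonus winners and bonus-eligible teachers by year, as reported in the post.
winners = [40, 41, 44]
eligible = [143, 105, 84]

# Fraction of still-eligible teachers who earned a bonus each year.
rates = [w / e for w, e in zip(winners, eligible)]
for year, rate in enumerate(rates, start=1):
    print(f"Year {year}: {rate:.0%} of eligible teachers earned a bonus")
# The rate climbs from roughly 28% in year one to about 52% in year three,
# driven almost entirely by the shrinking pool of eligible teachers.
```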
from the executive summary: "attrition of teachers from POINT was high. By the end of the project, half of the initial participants had left the experiment."
The report says that the differences in attrition -- and reasons for attrition -- between control and treatment groups were not statistically significant. But it reports that 27 control group teachers, and only 15 treatment group teachers, left the study because they'd switched districts during the experiment. I'd like to know more about this and whether it might be evidence that incentives made teachers slightly more likely to remain in the Metro Nashville school system.
about teacher attrition, from the report:
Teachers who left the study tended to differ from stayers on many of the baseline variables. Teachers who dropped out by the end of the second year of the experiment were more likely to be black, less likely to be white. They tended to be somewhat younger than teachers who remained in the study all three years. These dropouts were also hired more recently, on average. They had less experience (including less prior experience outside the district), and more of them were new teachers without tenure compared to teachers who remained in the study at the end of the second year. Dropouts were more likely to have alternative certification and less likely to have professional licensure. Their pre-POINT teaching performance (as measured by an estimate of 2005-06 value added) was lower than that of retained teachers, and they had more days absent. Dropouts completed significantly more mathematics professional development credits than the teachers who stayed. Dropouts also tended to teach classes with relatively more black students and fewer white students. They were more likely to be teaching special education students.
I'm going to need a little time to digest that, but the next table demonstrates that treatment and control group teachers, in all three years, did not differ in terms of effectiveness (as measured by tests) in the years prior to the experiment -- meaning that more effective teachers didn't seem more likely to stay if they could possibly earn bonuses. Treatment group teachers, however, were six percentage points less likely to leave the middle school in which they started during the three years.
1:03pm: There were, however, positive effects for 5th grade teachers in the 2nd and 3rd years, though the effects did not persist until the end of 6th grade. The 5th grade teachers had the same students all day for multiple subjects, so it's possible that they shifted focus to math instruction or that they simply knew their students better and were able to get better results. The center did analyze results on other subject tests and found no differences that would indicate teachers ignored these subjects and only focused on math.
33.6% of the treatment group received a bonus in at least one year (out of 152 teachers, 16 won once, 17 twice, and 18 thrice). Analysis done by the researchers prior to the experiment found that 55% of teachers were within a few correct questions per student of the scores that would have earned them a bonus.
1:08pm: Here are some other interesting tidbits from the executive summary, report, and press conference (which is now in Q&A):
-80% of teachers reported that they were already working as hard as they could and didn't change their effort due to the opportunity to earn an incentive.
-from the executive summary: "The introduction of performance incentives in MNPS middle schools did not set off significant negative reactions of the kind that have attended the introduction of merit pay elsewhere. But neither did it yield consistent and lasting gains in test scores. It simply did not do much of anything."
-from the report: "From an implementation standpoint, POINT was a success. This is not a trivial result, given the widespread perception that teachers are adamantly opposed to merit pay and will resist its implementation in any form."
-I didn't mention this before, but the placement of teachers in control/treatment groups -- and whether or not they received a bonus -- was, officially, confidential. The center didn't distribute this information to principals or teachers, and participating teachers signed statements saying that they wouldn't tell anybody their status. It looks like this was at least moderately successful, as about 75% of teachers reported that they didn't know if anybody had won a bonus in their school.
-There were no differences in student attrition between groups -- meaning that there's no evidence that bonus-eligible teachers were more likely to get problem students removed from their class. There were also no differences in students enrolling late, and students who left treatment teachers' classes were no lower scoring.
1:12pm: Dale Ballou responds to a question by saying that test scores bounced around a lot before the experiment, making it "difficult to extrapolate" from the data prior to the start of the experiment and tell whether teachers were doing better post-bonus than pre-bonus.
1:16pm: From the report, here are the results from the teacher survey. In short, nothing huge or shocking (note TCAP is the TN state test):
There are few survey items on which we have found a significant difference between the responses of treatment teachers and control teachers. (We note all contrasts with p values less than 0.15.) Treatment teachers were more likely to respond that they aligned their mathematics instruction with MNPS standards (p = 0.11). They spent less time re-teaching topics or skills based on students’ performance on classroom tests (p = 0.04). They spent more time having students answer items similar to those on the TCAP (p = 0.09) and using other TCAP-specific preparation materials (p = 0.02). The only other significant differences were in collaborative activities, with treatment teachers replying that they collaborated more on virtually every measured dimension. Data from administrative records and from surveys administered to the district’s math mentors also show few differences between treatment and control groups. Although treatment teachers completed more hours of professional development in core academic subjects, the difference was small (0.14 credit hours when the sample mean was 28) and only marginally significant (p = 0.12). Moreover, there was no discernible difference in professional development completed in mathematics. Likewise, treatment teachers had no more overall contact with the district’s math mentors than teachers in the control group.
1:20pm: One question I really want answered (that I don't see in the report) is how the bonus winners and losers differed (experience, school type, etc.), if they differed at all. A questioner just asked something along those lines, and the response was that they've just begun to look at that, but that they're wary to draw too many conclusions about this because the sample sizes will get smaller the more focused the question is.
1:27pm: not sure this is quite verbatim, from the panel, but it's a good point nonetheless: "just because we didn't find an effect from doing incentives this way doesn't mean you wouldn't find results by doing it another way."
1:43pm: The questions aren't really helping to advance the discussion all that much -- in part b/c there are too many speeches and not enough questions. The line at the microphone is now too long for me to get a question in before the end of the session, but here's what I'd ask:
1.) There are two assumptions behind performance pay that this study was designed to test: one is that teachers will respond to the opportunity to earn incentives by changing their practice, and two is that these changes in practice will result in better student performance. The report says that there were very few differences in reported teacher behavior between treatment and control groups, but is there enough data to draw any conclusions about the performance of those who did report making changes?
I had a second question, but it now completely escapes me. I'll try to find an answer to at least the first question. That's going to do it for now, but expect a more concise summary of the report's findings in the next 24 hours or so.
Wait, just kidding: here are the other four questions that are rattling around in my brain right now. I ran into one of the researchers in the hallway and asked him the question above in addition to the first three below. The answer was basically that they're good questions, and ones they plan on looking into more -- the focus for today was completing the analysis of the trends in test scores.
2.) What, if any, differences were there between bonus winners and losers (e.g. were winners more experienced, teaching in better schools, teaching higher/lower performing students, teaching a different subject, etc.)?
3.) What, if any, differences were there between the reactions of bonus winners and losers (e.g. more likely to stay/leave, more/less critical of the experiment, more/less enthusiastic, more/less likely to report subsequent changes in behavior)?
4.) I think 15 treatment group and 27 control group teachers left the district during the experiment. Can we consider that statistic (and maybe any relevant survey/interview questions) evidence that teachers might be less likely to leave an urban district if they can earn a bonus there but not elsewhere?
5.) A much higher percentage of the treatment group teachers (over half, compared to less than one-third) won bonuses in the final year. Is that simply because the less experienced teachers were much more likely to leave their assignments, because the test changed, or something else?
Now that's (really) all for now.
A Primer on the Nashville Incentive Pay Experiment, Part 4
Part 4: What Can We Learn?
Previous posts:
Part 1: Background Info
Part 2: What to Look For
Part 3: Why it Matters
Despite what Rick Hess says, we won't learn "nothing" from the results of the study (but do read his post, as most of his points are good ones). So what, exactly, will we learn from the results of the study? It's hard to say exactly, but here are some things that we can and cannot learn from the study:
We can learn how individual teachers respond to a financial incentive offered for individual results.
We cannot learn how individual teachers, groups of teachers, or entire schools respond to financial or other types of incentives offered to groups of teachers or schools.
We can learn whether teachers will change their teaching in ways that will raise student test scores in response to an individual financial incentive.
We cannot learn whether teachers will change their teaching in ways that will increase student engagement, critical thinking, creativity or a myriad of other factors in response to an individual financial incentive.
We can learn whether middle school math teachers in Nashville were more likely to switch schools/districts or leave the profession if they were in the control or treatment group.
We cannot learn whether talented people across the country are more likely to become teachers, and subsequently remain in teaching, if there are performance bonuses in place.
We can learn whether, under this particular system, teachers who are "better" as measured by standardized tests across all three years tend to be rewarded on a year-to-year basis.
We cannot learn whether, under performance pay systems, better teachers (as measured in any number of ways) tend to be paid more.
In short, yes, there are rather severe limitations on what we can learn from this one study. But, at the same time, I'd argue that we can learn more from this study than from most others. Despite what Rick Hess says, we will not learn "nothing of value," in part because different people value different things. Hess might think that the teacher recruitment/retention aspect of performance pay is the most important, but plenty of others think the incentivizing of effort is the most important.
What does this mean for the interested observer watching from afar? It means that your ears should perk up when you hear strongly worded statements from both sides of the debate. This study is one piece in the puzzle -- and an important piece, at that. It's neither the be-all and end-all of research into performance pay nor an utterly useless waste of time that fails to inform the debate in the least.
A Primer on the Nashville Incentive Pay Experiment, Part 3
Part 3: Why it Matters
Previous posts:
Part 1: Background Info
Part 2: What to Look For
You may have noticed that I'm devoting a fair amount of attention to the results of the Nashville incentive pay experiment that are being released today. Let me take a couple of minutes to explain why.
The first, and most obvious, point is that this is the first randomized field trial evaluating the effectiveness of a merit pay system. The debates to date on whether or not we should use some form of performance pay in schools have largely relied on ideology and theory. This will give us the first concrete, empirical, and comprehensive evidence to inform our future policy decisions. Given the importance of merit pay in the national discussion right now, this is one of the most important education studies of the decade.
Now, that is not to insinuate that there still won't be a ton of unanswered questions about merit pay after the results of this are digested (no one study is ever enough to close the book on such a wide-ranging topic) but, rather, that we will know significantly more about how merit pay plays out at the ground level after the release of this study.
Or, at least, we certainly hope we will -- especially given that this represents countless hours of effort by dozens of people over the past five years or so . . . and millions of dollars. Things can always be done bigger and better, but there won't be anything bigger and better than this for quite some time (if ever), so expect the results to be bandied about by both sides of the debate for years to come. In other words, expect this study to be the definitive study on how individual teachers respond to financial incentives well into the future.
My next post will address what, exactly, we might learn from the results. It will likely be followed by an attempt to live-blog the release of the results beginning around 12:30pm central time.
Monday, September 20, 2010
A Primer on the Nashville Incentive Pay Experiment, Part 2
Part 2: What to Look For
The results from the Nashville incentive pay experiment are due to be released tomorrow (see last week's post for background info on the experiment) -- here are a few things to keep an eye out for in the final report:
Stability of scores
One of the issues with value-added scores has been their high variability from year to year. Researchers were worried about the effects of "statistical noise" and random variations in scores before the start of the experiment. In practical terms, you'll want to know how many teachers received a bonus each year versus how many earned a bonus one or two out of the three years -- and how well teachers who ever earned a bonus did across all three years (e.g. did they earn a bonus for being in the 85th percentile one year but then have a score in the 40th percentile the other two years). In other words, were bonuses going to the same teachers who tended to outperform other teachers, or were they just randomly assigned each year?
What, if anything, did teachers do differently?
To me, this is the most interesting question. Did treatment group teachers report putting more effort into teaching because of the incentive? Did they assign more homework? Did they crack down more on misbehaving students? Did they work harder to get certain students out of their class? Did they focus more on their math class than their other classes (some of the teachers taught multiple subjects)? Did they attend more PD? Did they spend more time on test prep? Or did they not bother to change at all? The baseline numbers are going to get all the attention (i.e. did students of treatment group teachers score higher?), but I think answering this question is far more important -- both because it tells us why treatment teachers did or didn't do better and because it gives us valuable insights into how teachers think and how their actions influence student achievement.
Teacher reaction to receipt/non-receipt of bonus
Did teachers who earned a bonus enthusiastically redouble their efforts while those who didn't simply give up and maybe quit? Or did those who failed to earn a bonus decide to redouble their efforts to make sure they got one the next year while those who earned one became cocky and put their teaching on cruise control? In terms of the numbers, it will be interesting to see if there's any divergence between the scores of first-year winners and first-year losers (in terms of bonus receipt) -- does one group go up in the second year and the other down, do both hold steady, or something else? Of course, if the scores are simply random each year, then we'd expect first-year losers to do better than first-year winners in the 2nd/3rd years because of regression to the mean.
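The regression-to-the-mean point is easy to demonstrate with a toy simulation (this is purely illustrative, not data from the study): if teacher scores were pure year-to-year noise with no stable teacher effect, first-year "winners" and "losers" would both drift back toward the overall average in year two, erasing the year-one gap.

```python
# Toy simulation of regression to the mean: two independent yearly scores
# per teacher, with no persistent teacher quality at all.
import random

random.seed(0)
teachers = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(10_000)]

# Split teachers on their year-one score...
winners_y2 = [y2 for y1, y2 in teachers if y1 > 0]   # above average in year 1
losers_y2 = [y2 for y1, y2 in teachers if y1 <= 0]   # below average in year 1

def mean(xs):
    return sum(xs) / len(xs)

# ...and compare year-two averages. Both groups land near the overall mean
# of 0, so year-one losers "catch up" without anyone changing behavior.
print(f"year-2 mean of year-1 winners: {mean(winners_y2):+.3f}")
print(f"year-2 mean of year-1 losers:  {mean(losers_y2):+.3f}")
```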
Demographics of winning/losing teachers
Bonuses are not based on a value-added measure that attempts to control for everything and isolate the individual teacher's effect. The only thing the computed score takes into account is prior achievement of the students. So it will be interesting to see if winning teachers taught in better schools, were more experienced, had smaller classes, used a different curriculum, had lower student transience, or other factors that might influence students' gains on the state math test.
Performance in different types of math classes
Were teachers of advanced algebra classes more or less likely to earn bonuses than teachers of remedial math classes? Since the formula for determining bonuses is somewhat simplified, it may be the case that it's a lot easier to get low-performing students to advance x points than high-performing students, especially if there's a ceiling effect. Or it may be the case that some subjects are better aligned with the contents of that year's state test.
Improvement of scores
For merit pay systems to transform our schools, we need teachers to improve -- and continue to improve -- while they're eligible for these bonuses. Why? Let's say that the treatment teachers, as a group, have a score that ranks them in the 50th percentile historically each year. Now let's say that the treatment group teachers average a score in the 60th percentile each year. That would appear to be strong evidence that incentives make teachers better (at least as measured by gains on Tennessee's state tests, anyway), but it would only offer a limited amount of hope for the future because it would indicate only a one-time boost to scores. Let's say another experiment on a professional development system of some sort yields growth instead of simply a step up -- teachers are in the 50th percentile the first year, the 55th the second, the 60th the third, and the 65th the fourth . . . that system would offer more promise for future growth in school success.
Keep your eyes open for some more pre- and post-release analysis of the experiment . . .
Wednesday, September 15, 2010
A Primer on the Nashville Incentive Pay Experiment
Part 1: Background Information
According to Eduwonk, results from the Nashville incentive pay experiment are due to be released soon. I've been meaning for a while now to write up some background information on the experiment so that we have some context when the results are released, so this seems like as good a time as any.
The National Center on Performance Incentives was started in 2006 with a five-year, $10 million grant from the Department of Education's Institute for Education Sciences. The center is housed at Vanderbilt University's Peabody College and run in conjunction with various partners, including the RAND Corporation and the University of Missouri. Peabody's Matthew Springer and James Guthrie (now of the George W. Bush Institute for Public Policy) are the directors, and the center is staffed by people from a range of institutions across the country (full list). The funding was to cover two experiments plus other related costs. The first experiment was conducted in Nashville from 2006-09 and was dubbed the Project on INcentives in Teaching (POINT).
The center started at Vanderbilt the same time that I did, and I worked there during my first year (2006-07) to earn my keep around here. I haven't been involved with the center since then and have no information on what the results are.
The original experiment design was to encompass 200 middle school math teachers in the Metropolitan Nashville Public Schools -- 100 in the control group and 100 in the treatment group. Teachers in the treatment group were eligible for bonuses of up to $15,000 for each of three consecutive school years. Each teacher received $750 every year for participating as long as they completed all the required surveys, interviews, etc. Teachers were recruited into the experiment in the fall of 2006, not long after the school year had begun.
Bonuses were based on student gain scores* (not quite the same as value-added; see the technical note at the end) on the Tennessee state test (TCAP). Unlike virtually every other state's, Tennessee's assessment system is vertically scaled, meaning that scores can be compared across years on the same scale (a score of, say, 250 in 7th grade means the same thing as a score of 250 in 6th grade). This means that a student who goes from 240 to 260 from 6th to 7th grade gained 20 points. Researchers then looked at the years preceding the experiment to determine the average growth of students at each score level. Continuing the previous example, let's assume that the average TN 6th grader scoring a 240 on the state test scores a 255 the next year. A student who scored 260 would then be 5 points above average, so that student would contribute a score of +5; each student the teacher taught would be scored similarly, and the teacher's score would be the average across his or her students. The purpose of calculating scores this way was to strike a balance between statistical rigor and transparency/ease of communication. The result is a calculation that's not quite as rigorous as a value-added score, but a lot easier for teachers to understand.
Once a teacher's final score has been calculated, it's compared to the historical average for middle school math teachers in Nashville. A teacher scoring at the 80th percentile earns a $5,000 bonus, the 85th percentile earns $10,000, and the 95th percentile earns $15,000. The targets for the bonuses stay the same for the entire three years, so it's possible for every teacher in the treatment group to earn a bonus each year (in other words, they're not competing against each other). It's my understanding that for the first year the bonuses were distributed along with paychecks the following fall, but I don't know what the procedures were the following two years.
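The scoring and bonus logic described above can be sketched in a few lines of code. To be clear, this is just an illustration of the mechanics as I've described them: the lookup table of expected next-year scores is made up, and the percentile cutoffs are the ones from this post, not necessarily the exact rules the researchers used.

```python
# Sketch of the POINT gain-score and bonus logic as described above.
# The expected-score table is hypothetical; the bonus tiers are the
# ones mentioned in this post (80th/85th/95th percentiles).

# Historical lookup: for a given prior-year score, the average score
# students earned the following year (hypothetical numbers).
EXPECTED_NEXT_YEAR = {240: 255, 250: 263, 260: 271}

def student_score(prior, current):
    """Points above/below the historical average for students with this prior score."""
    return current - EXPECTED_NEXT_YEAR[prior]

def teacher_score(students):
    """Average of the per-student scores for one teacher's roster."""
    return sum(student_score(p, c) for p, c in students) / len(students)

def bonus(percentile):
    """Bonus tiers from the post: 80th -> $5,000, 85th -> $10,000, 95th -> $15,000."""
    if percentile >= 95:
        return 15000
    if percentile >= 85:
        return 10000
    if percentile >= 80:
        return 5000
    return 0

# (prior score, current score) pairs for one hypothetical roster
roster = [(240, 260), (250, 263), (260, 266)]
print(teacher_score(roster))  # (5 + 0 + -5) / 3 = 0.0
```

Note that the teacher's final score is then converted to a percentile against the historical distribution before the bonus tiers apply -- the step this sketch leaves out, since it depends on the district's historical data.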
The experiment ended in May 2009, and a large team of researchers has been poring over data from test scores, interviews, surveys, and other sources ever since. This means that a lot of analysis is going to be released at some point -- and that it's going to take a while for even the most informed reader to sort through.
technical note: A "gain score" is simply the gain in a student's score from year to year (260 - 240 = a gain of 20 points), while a "value-added score" is an attempt to isolate a teacher's effect on a student's score and might control not only for a student's previous achievement level but also the other teachers he/she has or has had, the school he/she attends, demographic factors, class size, peer effects, and any number of other things. In other words, a gain score is just the raw growth a student exhibits while a value-added score is a more precise estimate of exactly how a specific teacher influenced that growth (though value-added could be computed for schools, states, etc. as well).
Tuesday, March 23, 2010
Memory Test
Anybody feel like helping me try to find an article? I'm usually pretty good at this kind of thing, but I seem to have struck out on this one.
There was an opinion piece that ran in the NY Times (I think) about 5 years ago or so -- I think it was between 2003 and 2005 -- by, I believe, a former Clinton aide. The op-ed proposed using federal dollars to dramatically increase (double?) teacher salaries in a certain set of high-poverty (urban?) schools. The argument was that this would cost only a fraction of what the federal government currently spends on education and, at the same time, would do more to help close the achievement gap than what we're currently doing by attracting the best and brightest to teach in the neediest schools.
Does anybody else remember reading this? Anybody who sends me a link to the article will receive two gold stars.
Update: The gold stars go to Aaron Pallas -- the article can be found here
As to my memory, I fared reasonably well: the article was written in 2005 by Matthew Miller, who worked for the OMB during the Clinton administration. He mentions urban schools, but never specifically identifies which schools would receive the pay raises (he says "poor schools"). He proposes raising pay by 50% and doubling it for high-performing teachers and shortage subjects. He estimates that such a program would cost $30 billion (so he must have a specific number of schools in mind) -- 7% of the federal education budget.
Thursday, December 17, 2009
An Honest Discussion of Merit Pay?
I find it troubling that discussions of merit pay (or performance incentives, or whatever the iteration at hand or preferred terminology may be) never quite seem to be completely honest. Even when people are trying to be honest, part of the discussion tends to rest on half-truths and misconceptions.
Take this recent anti-merit pay op-ed in EdWeek by Kim Marshall, for example. He points out a number of faulty assumptions that many make when discussing merit pay, and then makes some of his own.
He's more or less correct when he says "The best teachers are already working incredibly long hours, and there’s no evidence that extra pay will make them work harder or smarter". One could argue that other fields provide some evidence, but to date there's virtually no evidence (certainly no experimental evidence in the U.S.) that merit pay will make teachers work harder -- or that if they did work harder that this would subsequently yield better results. It may be the case that many teachers are working pretty much as hard as they can and/or wouldn't be better teachers if they worked harder (they may pursue the wrong strategies or simply become more stressed).
But then he says "Teachers who are rewarded for their own students’ test-score gains are less likely to share ideas with their colleagues." This is demonstrably false. Every merit pay scheme I know of is designed to prevent teachers from competing with teachers at their own school for a share of a defined pool of money. The experiment that just concluded in Nashville compared the performance of teachers' students to the typical historical performance -- not to how other kids in the school performed.
His other points are mostly valid, though not necessarily precise. It's true that researchers say it takes three years of data to accurately estimate the effectiveness of a teacher (as measured by standardized tests). It's true that incentivizing higher test scores also incentivizes more test prep and even cheating -- but by that logic it would also incentivize harder work, which he earlier dismissed. He says half of all teachers teach untested subjects, but in some states it's closer to 70%.
I have yet to find a discussion of merit pay that's both based on facts rather than conjecture and approaches the topic in an unbiased way. People on both sides of the argument are making many dangerous assumptions, often based on incorrect information. The fact is that merit pay is utterly unproven in American schools and that while we can guess how it might affect teachers and schools, we simply can't know for sure until we try.
Right now, the idea is spreading rapidly, and I worry that the continuation or termination of the trend is going to depend more on half-informed arguments rather than sober analysis of research.
Take this recent anti-merit pay op-ed in EdWeek by Kim Marshall, for example. He points out a number of faulty assumptions that many make when discussing merit pay, and then makes some of her own.
He's more or less correct when he says "The best teachers are already working incredibly long hours, and there’s no evidence that extra pay will make them work harder or smarter". One could argue that other fields provide some evidence, but to date there's virtually no evidence (certainly no experimental evidence in the U.S.) that merit pay will make teachers work harder -- or that if they did work harder that this would subsequently yield better results. It may be the case that many teachers are working pretty much as hard as they can and/or wouldn't be better teachers if they worked harder (they may pursue the wrong strategies or simply become more stressed).
But then he says "Teachers who are rewarded for their own students’ test-score gains are less likely to share ideas with their colleagues." This is demonstrably false. Every merit pay scheme I know of is designed to prevent teachers from competing with teachers at their own school for a share of a defined pool of money. The experiment that just concluded in Nashville compared the performance of teachers' students to the typical historical performance -- not to how other kids in the school performed.
His other points are mostly valid, though not necessarily precise. It's true that researchers say it takes three years of data to accurately estimate the effectiveness of a teacher (as measured by standardized tests). It's true that incentivizing higher test scores also incentivizes more test prep and even cheating -- but by that logic it would also incentivize harder work, which he earlier dismissed. He says half of all teachers teach untested subjects, but in some states it's closer to 70%.
I have yet to find a discussion of merit pay that's both based on facts rather than conjecture and approaches the topic in an unbiased way. People on both sides of the argument are making many dangerous assumptions, often based on incorrect information. The fact is that merit pay is utterly unproven in American schools and that while we can guess how it might affect teachers and schools, we simply can't know for sure until we try.
Right now, the idea is spreading rapidly, and I worry that the continuation or termination of the trend is going to depend more on half-informed arguments rather than sober analysis of research.
Wednesday, September 9, 2009
Today's Random Thoughts
-When looking at the browser requirements for filling out the FAFSA, I noticed that anybody who regularly updates their computer is unable to fill one out online. The FAFSA website can only handle up to Internet Explorer 7 and Firefox 2.0 -- they're up to versions 8 and 3.5 now, respectively. This is something the Department of Education should've fixed months ago. While I'm on the subject of the federal govt. updating their websites, why can't they combine the login systems for e-filing tax returns and applying for financial aid? Why must one type in all their same tax info all over again?
-I continue to see little discussion of the fact that about 2/3 of teachers don't teach tested subjects when discussing merit pay. Indeed, NYC recently passed out 12,000 teacher evaluations based on state test scores . . . there are 87,000 teachers in NYC. Assuming these evaluations were accurate, how would we evaluate the other 75,000 teachers?
-Speaking of merit pay and evaluating teachers using standardized test scores . . . many object to the latter on the grounds that we might fire a good teacher based on faulty measures. What if, instead, we reward mediocre teachers for high performance based on test scores -- breaking the bank and not helping our schools at the same time? And, it's clear that faulty measures are being used (check out the last graph in this piece if you believe at all that NYC's school grading system is statistically valid), leading to 97% of NYC schools receiving a grade of A or B. New York's faulty tests also resulted in a doubling of the budget for teacher bonuses this year. Merit pay is supposed to be a fiscally efficient reform, but it's not under these types of conditions.
-The AP has a story on private investment in charter schools. I wonder how this will play out if charter schools continue to expand. If private monies dry up, we might not hear much more about it. But if private monies start going disproportionately to charter schools, it might be hard for public schools to compete. On the other hand, it might make charter schools even better, which would probably be a good thing for those attending charters.
-I wrote before that we seem to be seeing a charterization of urban public districts. Not only are charter schools spreading, but charter-like public schools (read: small, specialized schools often not for just one neighborhood) are as well. The NY Post has a little blurb on the vast array of specialties that this year's crop of new schools offer. If this continues I wonder what the ramifications of having community schools in the suburbs and beyond and specialty schools in the inner-city will be.
Monday, August 17, 2009
Today's Random Thoughts
-By now you've probably heard that Philadelphia is planning a reality show where Tony Danza is a high school teacher. If I thought for a second that it would show what life is really like inside our urban schools, I would be all for it. But Stephen Lentz captures the likely outcome -- handpicked students, special treatment, and an unrealistic view of what's actually happening -- nicely. Nancy Flanagan has a slightly different guess of what will be shown, pointing out that tv producers wanted to show drama, not great teaching, when they came to her school. That could happen as well, but I think the odds are in favor of them scripting a happy ending one way or another -- and that's often not a realistic view into the life of a first-year teacher in an urban school. Hopefully this show never happens, but if it does I hope we're all wrong about how it will play out.
-The NY Times has a set of opinions on the value of graduate degrees for teachers. It seems like everybody these days is willing to deride the value of ed schools or the utility of rewarding teachers for earning master's degrees from such worthless institutions (Martin Kozloff wins the award for the most derisive piece), but they did find a couple people to step up and defend the idea. As little as I gained from my experience in ed school, I'm still somewhat hesitant to decry all ed schools or the whole notion of rewarding teachers for furthering their education. Can't we find a way to reward teachers for attending programs that help them become better teachers?
-Speaking of rewarding teachers, I continue to be baffled by the fact that so many seem to think that merit pay is a simple undertaking. Even if we assume that standardized test scores are accurate, around 2/3 of teachers don't teach a tested subject. Test scores are going to play some sort of role (and probably a large one) in evaluation of teachers and schools for the foreseeable future, but I hope Sherman Dorn is right and that we're also developing evaluation models that take a myriad of factors into account.
Wednesday, August 5, 2009
I'll Pass the AP Test . . . For the Right Price
The biggest education story in the press today (NY Times) is the results of the second year of a pilot program that pays students in NYC up to $1,000 for passing an AP test. I'm not quite sure of the exact details of the scheme because different outlets are reporting different details. The other NY papers (Post, Daily News) are reporting $500 for a 3, $750 for a 4, and $1,000 for a 5, but the Times is reporting $1,000 for a 5 if students attend Saturday prep classes and $500 if they don't, and $750/$400 for a 4. The Times doesn't mention any reward for a 3, even though it takes a 3 to pass. The Reach website confirms the amounts reported by the Times and reports that a 3 earns test takers $500/$300.
At any rate, more students both took and passed AP tests than last year -- when more students took, but fewer passed, the tests. The passing rate only budged from 32% to 33% (still down from 35% before the program started), so the main effect may have been encouraging more students to take tests. As far as I can tell, the number of exams taken and passed were:
2007: 4,275 taken/1,481 passed by ? students (34.6%) ? per student
2008: 4,620 taken/1,476 passed by 1,161 students (31.9%) 4.0 taken/1.27 passed per student
2009: 5,436 taken/1,774 passed by 1,240 students (32.6%) 4.38 taken/1.43 passed per student
(2007 data from this article)
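As a quick sanity check on the figures above (the raw counts come straight from the news reports; the arithmetic is mine):

```python
# Recompute the pass rates and per-student averages from the
# exam counts reported above (2007 omitted: student count unknown).
data = {
    2008: {"taken": 4620, "passed": 1476, "students": 1161},
    2009: {"taken": 5436, "passed": 1774, "students": 1240},
}

for year, d in sorted(data.items()):
    pass_rate = d["passed"] / d["taken"] * 100
    taken_per = d["taken"] / d["students"]
    passed_per = d["passed"] / d["students"]
    print(f"{year}: {pass_rate:.1f}% pass rate, "
          f"{taken_per:.2f} taken / {passed_per:.2f} passed per student")
```

The per-student figures are what make the "more students passed" headline less impressive: the average test taker attempted more than four exams, so the growth in passes largely tracks the growth in attempts.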
In order to tell exactly what effect the program had we need to know, among other things, how many students were in these 31 schools in each of the three years, how many (if any) students re-took exams that they'd failed in a previous year, and whether there were any demographic changes among the school populations and test takers. Not to mention how many students took the tests in 2007, before the program started (I'll keep looking for that figure).
My first reaction to the news was "of course more students passed, I'd have taken and passed more AP tests in high school if I got $1,000 for each one," but then I read more of the details. Without knowing how many total students are in the schools, the results don't seem very impressive. I'd expect the possibility of earning thousands of dollars to yield more of a reaction from high school students. Also note that, nationwide, AP tests have about a 57% pass rate.
For a second, though, let's assume a best-case scenario for the program -- school populations shrunk, and the growth in test-taking was due to weaker and younger students signing up. In other words, let's assume that the numbers actually tell us that the rewards led more students to take and pass AP exams. Even then, I'm not really sure what to think.
The program is doling out nearly $1,000,000 in bonuses per year to students at only 31 schools. And I'm not sure exactly what it means to take or pass an AP test. I don't think most would recognize that as a goal in itself. I think we really need to know the secondary effects of the trial: do students view AP exams and/or school more positively? Do students study more for non-AP subjects? Are students more likely to enroll in college? Are students more likely to complete college now that they have a head start on earning credits? These seem like the real goals of the program.
In other words, the numbers tell us almost nothing other than that the program seems unlikely to have fomented a revolution within these schools. But to actually know what the results are we need to let it play out for a couple more years and, more importantly, answer the questions above.
update: I've been told three interesting pieces of information:
1.) The enrollment for the 31 schools is about 44,000 students
2.) Students were paid for passing the Spanish Literature AP exam in 2008, but only underclassmen were paid for doing so in 2009
3.) Virtually all of the increase in tests taken/passed is black/latino students
Monday, April 13, 2009
What Happens to Student Teaching Under Performance Pay Schemes?
I was reading about student teaching the other day, and a thought occurred to me: what happens to student teaching under performance pay schemes? If you were a teacher and your salary depended upon how many points your students gained on the state test this year, how willing would you be to a.) spend your time training a student teacher instead of helping your students, and b.) let a student teacher take charge of lessons in your classroom?
I don't see any way that test-based performance pay and student teaching can successfully coexist. If the teacher is motivated by the incentive being offered, then they're also going to be less willing to let a student teacher spend time with their class -- unless, of course, they're convinced that the student teacher is at least as good at teaching as they are. If they don't care about the incentive and aren't motivated by it, then they should be as willing as ever to let a student teacher take charge, but the performance pay plan won't really be succeeding if it's not motivating teachers to do more and do better.
If performance pay is based on factors other than test scores, of course, then the dilemma can be dealt with.
Sunday, February 15, 2009
Sunday Commentary: What to Incentivize?
Teachers should be encouraged to do what is best for students, and rewarded when they do -- which is why performance incentives make perfect sense on paper. But a million little details need to be solved before they can meet their promise. Perhaps most notably we need to be sure we incentivize the right behaviors. And teaching is far from the only field that struggles with this.
Today's NY Times Magazine has a fascinating article about basketball statistics. More specifically, it explores the relationship between statistics and good basketball. The story centers on Shane Battier -- a player with below average stats but whose teams win when he is on the court. The Houston Rockets have apparently developed some ways to measure things that contribute to the team in place of the typical stats (points, rebounds, assists, etc.).
The most interesting (and relevant) part of the article is this: "It turns out there is no statistic that a basketball player accumulates that cannot be amassed selfishly. 'We think about this deeply whenever we're talking about contractual incentives,' [Rockets GM Daryl Morey] says. 'We don't want to incent a guy to do things that hurt the team.'"
In other words, basketball teams face the same problem (though in a different way) as do schools: creating incentives that encourage people to do what's best for the team/school. A basketball team does not want to promise Player A a million dollars more if he averages 20 points per game and then watch the team lose because Player A takes a ton of wild shots instead of passing to his teammates.
I don't think the exact same problem exists in school -- Teacher A probably isn't going to hurt the school by trying to raise test scores in their classroom. But a similar problem exists: we're not sure exactly what actions we want people to take, or we're at least not sure how we would measure such actions.
Can you imagine trying to rate a basketball player on whether he hustles, is in the correct position, passes to others when they're open, takes shots when he is open, communicates well with his teammates, and so forth? Similarly, how do we rate teachers on how well they motivate their students, keep students focused, prepare lesson plans, offer useful comments when grading, and so forth?
In basketball, one could give up on trying to measure what a player contributes and simply reward players based on how many games the team wins. It is unclear whether players would want to win more if they had money riding on it, but it seems logical that such a scheme would at least discourage selfish behavior.
What about teaching? One of my colleagues once said that teaching is amenable, but not reducible, to measurement -- and I agree. The only possibility I see of truly measuring how well somebody teaches would take far too much time and effort to ever be feasible. But if we are committed to rewarding teachers based on what we measure, we have to find some sort of proxy. Unfortunately, we do not have a won-loss statistic in education, so there's no similarly simple solution. We do have test scores, but most teachers don't teach tested subjects. As such, we need to recognize that we cannot judge or reward most teachers based on their students' scores.
So what do we want to incentivize? Basketball GMs are starting to figure out that incentivizing ends may not justify the means. Rewarding teachers for high test scores may lead them to do all sorts of things that they should not. But rewarding means is almost impossible. Do we really want to sit around measuring all the little things that make a teacher special?
Whatever the measure is, I think we can learn one thing from basketball. If everyone in the school is rewarded for it then at least there is a possibility that it would foster cooperation and encourage people to raise their expectations of their colleagues.
Rewarding a player for a team's wins may not mean that he scores more points, but it might mean that he corrects the hitch in his teammate's jump shot or sets a valuable pick. Rewarding a teacher for a school's performance may not lead them to transform their teaching style, but it may mean that conversations in the teachers' lounge are a bit more productive, that staying after school to tutor a student (without pay) is less crazy, that the dynamite lesson plan on photosynthesis is shared with all of the 3rd grade teachers, and that the kid from Mrs. Smith's class who is wandering the hall becomes everybody's responsibility. If we can just figure out those other 999,999 details, performance pay should be good to go.
______________________________________________________
Corey Bunje Bower is a Ph.D. student in education policy at Peabody College, Vanderbilt University. Before beginning his studies he taught sixth grade at a low-performing middle school in the Bronx that has since been shuttered. His research focuses on issues surrounding high-poverty urban schools -- including teacher retention, discipline, and school climate.
Sunday Commentary is a running feature on Thoughts on Education Policy. Submissions are open to all who are knowledgeable about education and willing to write a concise, thoughtful piece. Submissions may be sent to corey[at]edpolicythoughts.com.
Monday, February 9, 2009
Today's Random Thoughts
-I'm catching up with the backlog of blog posts in my Google Reader, and noticed that Kevin Carey seems to agree with much of what I wrote in yesterday's Sunday Commentary.
-I also noticed that some people at Fordham sure seem upset that education funding isn't being cut to the same extent as other areas (one example). I can't quite figure this one out. Shouldn't we cut spending on education less than we do in many other areas? Isn't education more important than many other functions of government?
-I haven't yet decided whether Michelle Rhee is toning down her rhetoric or simply clarifying earlier positions in this interesting piece in the Washington Post (more on this later).
-EdWeek hosted a rather boring (in my opinion) chat about performance pay, the transcript is here. It might've been more interesting had they answered my question (boy these grapes are sour).
-Also related to performance pay, a report just out on what to do with "the other 69%" of teachers who don't teach subjects/grade levels that are tested under NCLB.
Thursday, January 29, 2009
Today's Random Thoughts
Some interesting news items out there today:
-According to USA Today, a recent study found that students with recess behaved better in school. They don't describe the research very well, so I'm going to have to track down the actual article to see what it says. The idea that breaks improve attention makes sense to me, but, at the same time, when I was teaching, the hardest times to get students under control were immediately after lunch and immediately after gym.
-An appeals court ruled that four teachers in Lansing, MI cannot sue their district over their perception that four students were not punished severely enough for major offenses, including slapping a teacher and throwing chairs at another. The teachers claim that school-safety law (I think state law, but it's not clear) mandates that students be expelled after such actions, but the students in question were all suspended. I can sympathize with both sides on this one. On the one hand, any signal from above that such behavior is anything other than absolutely unacceptable makes teachers' jobs more difficult (and potentially more dangerous). On the other, giving these students long suspensions doesn't exactly send the message that their actions were okay and, furthermore, the intervention of courts in school decisions always has the potential to really mess things up. Besides, if the teachers wanted to grab headlines with a lawsuit over atrocious behavior they should've sued the kids' parents.
-Houston handed out its performance bonuses today. Here are the two things I found most interesting about this:
1.) Bonuses for the 2007-08 school year were handed out six months after it finished. From what I've heard, research on extrinsic incentives finds that they're much more effective when the time frame being rewarded is shorter and the bonuses are given soon after the performance is completed.
2.) 90% of eligible employees received a bonus. Usually when merit-pay schemes are publicized they aim to reward only the best and brightest. I could see this going either way, though. I'm sure the number of people who receive bonuses makes this plan more acceptable to the union and possibly more popular among teachers, even if the trade off is that some below average performers are receiving bonuses. At the same time, a lot of places in the business world give bonuses to virtually all their employees -- with the size dependent upon both the performance of the business and the performance of the individual. Maybe it makes sense to think of bonuses as a regular part of one's pay that varies rather than something only a few special people receive.