Saturday, February 25, 2012

Is Teacher Quality Really Causing the Achievement Gap?

Yesterday, the NY Times released the value-added scores of thousands of teachers over the past five years.  Before and immediately after the release, people mostly seemed to argue about the merits of the decision to release the data.  But I have a substantive question about the data.

What caught my eye was that, according to the NYT analysis of this particular set of scores, good teachers are evenly distributed between high-poverty and low-poverty schools.  From the article:

there was no relationship between a school’s demographics and its number of high- or low-performing teachers: 26 percent of math teachers serving the poorest of students had high scores, as did 27 percent of teachers of the wealthiest.

The LA Times reported a roughly similar situation in LA when it released teachers' scores a couple of years ago.  Which is really quite shocking in a number of ways.  Most notably, researchers and practitioners have long assumed that higher-poverty schools had worse teachers than lower-poverty schools -- past studies have repeatedly found that teachers in high-poverty schools are less experienced, turn over at a much higher rate, score lower on achievement tests, attend less selective colleges, etc.  Accordingly, at least part of the theory of action behind the teacher quality movement has been that giving low-income students teachers who are as good as or better than those in higher-income schools would significantly narrow the achievement gap.

But these two measures of teacher quality indicate there may be no major differences between low- and high-poverty schools, while we know that large gaps in achievement still exist between low-income and high-income students.  Which means at least one of two things.

1.) Differences in teacher quality are not a major driver of the achievement gap.

2.) These value-added scores are not a good measure of teacher quality.

I don't think anybody seriously doubts -- or at least that anybody serious doubts -- that some teachers are much better than others and that the best teachers can make a large difference.  But if quality teachers, according to these value-added measures, are roughly evenly distributed between high- and low-poverty schools in LA at the same time that we see differences between high- and low-income students growing, then improving the quality of teachers (again, as indicated by these value-added measures) in high-poverty schools seems unlikely to close the achievement gap.  Either other factors influence achievement far more, the effects of quality teachers on students are much less direct than many assume, or what we're measuring isn't what matters.

In short, these data indicate that we need to broaden our focus beyond teacher quality and/or re-evaluate the way we're currently measuring teacher quality.

Wednesday, February 8, 2012

How Education Research is like Football Research

In the fall, I wrote about an instance in which outsiders may be needed in education reform.  Today I'll give you an example of how outsiders can also be dangerous (though this one pertains more to research).

Perhaps the largest change in educational research over the past decade or so has been the sizable increase in large-scale quantitative research, a fair amount of which is conducted by researchers outside of ed schools.  Like any change, this has resulted in both positives and negatives.  But one thing that worries me is that I consistently notice that the people who are most worried about statistical rigor in quantitative analyses (both inside and outside of ed schools) tend to be less concerned with understanding the context and processes of schooling.

And that's incredibly dangerous.

Methodology, statistics, and technical skills are very, very important in the development of good research.  But without a proper understanding of how schools work and what is actually happening on the ground, one can't expect to ask the right questions.  And if one fails to ask the right questions, it really doesn't matter how complex and rigorous their analysis is because the answers to those questions are meaningless.

Here's one example of how such a process can unfold -- it's completely unrelated to ed policy, but I still think it's illustrative.  The Freakonomics Blog posted a brief discussion yesterday of the ending to the Super Bowl.  The post said two things (paraphrasing, of course):

1.) Isn't it amazing that the coaches of both teams realized that the Giants scoring a touchdown with about a minute left was actually a better outcome for the Patriots?  The Patriots' coaches tried to let the Giants run the ball into the endzone while the Giants' coaches instructed their players not to score a TD.  These counter-intuitive behaviors are an excellent example of game theory properly implemented.

2.) But then the Giants failed to take game theory into account when attempting their two point conversion.  Wouldn't it have been much better for them to run time off the clock instead of trying to score to go up 6 points instead of 4?  They might've been able to kill 20 seconds by running the ball 95 yards backwards and around in circles, and certainly being up 4 with 40 seconds left is better than being up 6 with a minute left.  Why didn't the coaches think of this?

There's some clever thinking going on here.  Yes, this is an interesting application of game theory.  And, yes, running 20 seconds off the clock would've been a better strategy.  So the application of economic theory to the situation is exemplary.  In a short space, there's a cogent analysis and a provocative question.  But there are two fatal flaws.

1.) Both coaches did not, in fact, apply game theory.  Tom Coughlin, the Giants' coach, said he preferred that the team take the guaranteed six points rather than run down the clock.  So let's hold off on patting him on the back for correctly applying game theory.

2.) More importantly, the clock doesn't run during two-point conversions.  The Giants could've run around in circles for ten minutes, and there still would've been exactly 59 seconds left on the clock.

So, what we have here is a smart professor who's well-trained in economic theory and statistics.  This training has allowed him to make an important insight about a football game and ask an interesting question.  Except that he doesn't actually seem to know much about the rules of football or the context of the situation.  Which has rendered his question moot.

And I see the same thing (in a much less dramatic and much less foolish way) happening in education research.  Smart people with training in other fields and disciplines and serious methodological credentials come into the field and find some low-hanging fruit ripe for picking.  At first, this seems like a great idea.  We can never have too many smart, well-trained researchers in education.  And the eye of the outsider can be sharp.  But then the research starts and we realize that somebody can be smart and well-trained but, at the same time, fail to truly understand how schools work and the contexts under which students, teachers, principals, schools, etc. operate.  And then we get smart, well-trained people asking the wrong questions (or interpreting their findings in silly ways).  And that neither advances the field nor helps us improve our educational system.

Let's bring the analogy back to football.  Let's say that football were a field in many universities.  Grad students train under faculty who work for Schools of Football and/or Departments of Football Policy, Football Leadership, Football Teaching & Learning, Football Evaluation, Football Foundations, Football Studies, and so on.  And most of the research on football is conducted by faculty and grad students from these schools and departments.  There's no reason why an economist shouldn't do a study on the costs and benefits of attending school on a football scholarship; why a psychologist shouldn't conduct a study on the impact of playing football on one's personality; or why a sociologist shouldn't conduct a study of the impact of playing football on one's social capital.  But in order to do these studies well, they first need to understand how the game of football is played, what a player does on the field, how much he practices, and so on.  Otherwise they're just chucking their theories against a wall and hoping one sticks somewhere.

So, to all the smart economists, psychologists, sociologists, etc. out there who wish to conduct research on education: Welcome, we'd love to hear your insights and figure out if we can apply your theories and methods to help us advance our field and improve our schools.  But before claiming that you've solved a problem none of us have been able to solve for the last 100 years, take some time to learn how schools operate.  Read a massive and wide-ranging stack of literature.  Go visit some schools.  Talk to people who work in schools and education departments.  Talk to people who study those who work in schools and education departments.  Then begin your research.

At the very least, that should save you the embarrassment of asking students how many touchdowns they need to score in order to hit a home run on their fourth grade reading test.

Tuesday, February 7, 2012

The Logistics of "Thinning Out" Bad Teachers

Nick Kristof recently wrote another column calling for more high-quality teachers based on the latest paper on value-added measures of teacher quality.  There's a whole lot to discuss about both the column and the research paper, but let me focus for a minute on one small part of it.

Near the end of the column, Kristof writes that "If we want to recruit and retain the best teachers, we simply have to pay more — while also more aggressively thinning out those who don’t succeed. It’s worth it."  Recruiting, retaining, paying (and training, which is left out of this sentence) are all complex endeavors, but the "thinning out" part of the equation is often taken for granted.

Here's my question for Kristof: even if (and that's a big if) we can find a fair, accurate, and agreeable way to identify and dismiss the worst teachers, how many teachers are we actually going to dismiss in such a scenario?

The first question would obviously be whether we need to fire the bottom 5%, 10%, 25% or some other number.  That's up for discussion.

But the logistical question, then, is how many teachers among the bottom X% 1.) can be readily identified and 2.) are planning on teaching again next year.  This will differ greatly by school and district, but in some places, this is going to be a very small number.  Why?  Let's take a look at what the research says.

First, research consistently finds that it takes 3-5 years for a teacher to reach their potential.  So a good number of the lowest-performing teachers are simply going to be novices who will be better teachers next year.  We don't want to fire a first-year teacher who was in the bottom X% if we have reason to believe they'll be a really good teacher in a couple of years.  That would be incredibly counterproductive.

Second, research has consistently found that value-added measures of teacher effectiveness bounce around considerably from year to year -- particularly for teachers who teach a small number of students (e.g. a 4th grade reading teacher with 18 students versus an 8th grade math teacher with 150 students).  At least one paper has found that averaging scores over three years provides a much better, and more stable, estimate of teacher performance than does any single-year estimate.
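The stabilizing effect of multi-year averaging is easy to see in a quick simulation.  This is just a sketch with made-up numbers -- the "true quality" and classroom-noise variances below are assumptions for illustration, not estimates from the value-added literature:

```python
import random

random.seed(0)

def simulate(n_teachers=1000, years=3, noise=1.0):
    # Each teacher has a fixed "true" effectiveness; each yearly
    # value-added score is that truth plus classroom-level noise.
    truths = [random.gauss(0, 1) for _ in range(n_teachers)]
    single, averaged = [], []
    for t in truths:
        scores = [t + random.gauss(0, noise) for _ in range(years)]
        single.append(scores[0])          # one-year estimate
        averaged.append(sum(scores) / years)  # three-year average

    def corr(xs, ys):
        # Pearson correlation, computed by hand to stay dependency-free.
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
        sx = (sum((x - mx) ** 2 for x in xs) / n) ** 0.5
        sy = (sum((y - my) ** 2 for y in ys) / n) ** 0.5
        return cov / (sx * sy)

    return corr(truths, single), corr(truths, averaged)

one_year, three_year = simulate()
# The 3-year average correlates with true quality noticeably better
# than any single year's score does.
print(one_year, three_year)
```

Smaller classes correspond to a larger `noise` value, which is why single-year scores bounce around most for teachers with few students.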

Third, a number of recent papers have found that, at least in the first few years, many of the least successful teachers exit teaching.  This makes sense -- if you start a new career and find yourself completely overwhelmed, you're not likely to stay very long.

Fourth, teacher attrition is exceedingly high in many high-poverty schools.  The general consensus is that about half of urban teachers leave the field within their first 3 years.

So, what does this mean?  We probably don't want to fire a whole lot of teachers in the first 3-5 years of their career because a.) they're still learning and improving; b.) we can't be that sure who the worst teachers are anyway; and c.) a good portion of the catastrophically bad teachers are self-selecting out of the field anyway.  If we discount the first two years, when teachers are still learning their craft, and then take three more years to compute accurate value-added scores, it would only be teachers who'd taught for 5+ years who would really be ripe for firing due to low value-added scores.

Which means that the main herd we're trying to thin is the teachers who've made it through those first few years, reached their potential, and for whom we have accurate value-added estimates.  But how many teachers is that?  When I looked at high-poverty NYC middle schools a few years ago, I found that in the average school, only 1/3 of teachers had 5 or more years of experience.

Let's say that we're very confident in our ability to recruit and retain teachers who are better than our current teaching force, and so we decide to fire all below-average teachers (a full 50%) -- which would be a far more aggressive plan than any I've seen proposed.  First, the majority of these below-average teachers are novices who are still improving and for whom we don't have particularly good estimates of ability.  Given that the majority of struggling beginning teachers either improve or self-select out of the profession, let's estimate that 2/3 of all teachers in their first 5 years are identified as below average.  Since those novices make up 2/3 of the workforce, below-average novices account for 4/9 of all teachers, which leaves 1/18 of all teachers (1/2 minus 4/9) who are both below average and experienced.  Put another way, only 1/6 of teachers in their sixth year and beyond are below average, and since only 1/3 of teachers are in their sixth year or beyond, only 1/18 of all teachers would both have 5+ years of experience and be rated below average.  This is a little under 6% of all teachers.

The average school in my sample had 72 teachers.  So, that's the equivalent of firing four teachers.  And that's under an extremely aggressive scenario.  Besides, now that you've rid your school of the chaff, who, exactly, do you want to fire next year?  And if you want to argue that we could be more aggressive and fire some of the novice teachers, that would mean there'd be fewer low-performing experienced teachers (since teachers tend to be roughly equally effective pre- and post-tenure).  So, for now, let's stick with the assumption that, under an aggressive plan, we'd fire four teachers this year in the average school.
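The back-of-the-envelope arithmetic above can be written out explicitly with exact fractions.  The input shares (50% fired, 2/3 novices, 2/3 of novices below average) are the post's illustrative assumptions, not empirical estimates:

```python
from fractions import Fraction

# Assumptions taken from the scenario above (illustrative, not empirical):
below_average = Fraction(1, 2)   # fire all below-average teachers (50%)
novice_share = Fraction(2, 3)    # share of teachers in their first 5 years
novice_below = Fraction(2, 3)    # share of those novices rated below average

# Below-average teachers who are novices (still improving or leaving anyway):
below_avg_novices = novice_share * novice_below   # 4/9 of all teachers

# The rest of the below-average group must be experienced (5+ years):
fireable = below_average - below_avg_novices      # 1/18 of all teachers

print(fireable)        # 1/18 -- a little under 6% of all teachers
print(fireable * 72)   # 4 -- teachers dismissed in a 72-teacher school
```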

Now, other districts have far more experienced teachers.  And it might make more of a dent there.  But a good number of our poorest-performing schools and districts are churning through teachers too fast for firing low performers to make much of an impact.  Certainly, we should make every effort to rid our schools of the worst teachers (by increasing the performance of, and/or dismissing, the lowest performers) -- I don't think anybody seriously disputes that notion.  Or, at the very least, I don't think anybody serious disputes that notion.  But will firing the lowest-performing 6% of teachers in high-poverty NYC schools make a difference?  It's possible.  But let's be reasonable -- it's not going to make much of a difference.

So, yes, let's work harder to rid our schools of the worst teachers.  But let's not pretend it will be easy to do.  And, perhaps more importantly, let's not hold our breath while we wait to see if that bullet is actually silver.  In most places, other problems loom far larger.

Monday, February 6, 2012

Evaluating the Evidence on Non-School Interventions

I've been meaning to finish writing this piece for six weeks, and now I finally have.  Enjoy.

One of the most e-mailed articles in the NY Times shortly before Christmas was this piece by Helen Ladd and Edward Fiske on social class and educational achievement, in which the authors call for more non-school interventions ("education policy makers should try to provide poor students with the social support and experiences that middle-class students enjoy as a matter of course"). Overall, I thought it was a pretty good piece, but two things in particular struck me.

1.) That they build an argument for focusing on what happens outside of schools and then their first recommendation is to expand pre-schools.

2.) The recommendations after the pre-school discussion are fairly vague.

While the first is interesting, I'm more intrigued by the second -- and I wonder to what extent it's because they want to recommend that we change 30 things they can't possibly list in the limited space and to what extent it's because they're not sure exactly what to address.

Which raises the question: what do we know about which non-school programs will make a difference?  One particularly promising young scholar has argued that we don't yet know enough (you'll get the joke if you click on the link) to draw many conclusions on the topic.

The authors are certainly right that "Large bodies of research have shown how poor health and nutrition inhibit child development and learning" and they could've included numerous other factors at the family and neighborhood level.  Since we know that these social factors and environmental conditions are causally related to academic performance, trying to ameliorate their impact on low-income children makes all the sense in the world.  But, at the same time, I have yet to find (after extensive searching) a whole lot of evidence that we've been able to successfully do this in ways that rigorous research has found subsequently improved academic performance.  And Russ Whitehurst argues the point even more strongly, writing in a recent report that "There is no compelling evidence that investments in parenting classes, health services, nutritional programs, and community improvement in general have appreciable effects on student achievement in schools in the U.S."

Let's take a look at the few programs they do mention in the piece.  When I search Google Scholar for research on the programs they name, this is about all I can find on the East Durham Children's Initiative, Syracuse's Say Yes to Education program, Omaha's Building Bright Futures, and Boston's Citizen Schools.  Only the last one links the program to any educational outcomes, and it appears to be an internal report.  If there's evidence in peer-reviewed academic journals that these programs have improved students' academic performance, I've yet to see it (note: this is not to say that any of these four aren't working, just that we don't yet have really good evidence that they are).

At this point, some of you may be saying "you forgot about the Harlem Children's Zone!"  That's certainly the most-cited example of social policy impacting academics.  But there's a funny thing about that.  As far as I can tell, only one study has linked HCZ to academic outcomes.  And one thing that recently caught my eye is a chapter by Roland Fryer and others in the new Duncan/Murnane book on inequality and schools (highly recommended, btw).  In particular, I find it interesting how they've changed their tune on HCZ over the past couple years.

In 2009, Fryer put out an NBER working paper with PhD student Will Dobbie arguing that the HCZ had effectively closed the black-white achievement gap.  The paper got all sorts of play in the press, with David Brooks claiming it proved once and for all that the "no excuses" schools were all that we needed and some of the Broader, Bolder folks replying that, no, it proved once and for all that community resources made the difference.

Shortly thereafter, I asked Geoffrey Canada which it was when he visited Vanderbilt -- he said that we needed both and that it was a "terrible, phony debate" to try and separate them.  Nor could Dobbie and Fryer definitively separate them; in the introduction, they write (emphasis theirs) "We cannot, however, disentangle whether communities coupled with high-quality schools drive our results, or whether the high-quality schools alone are enough to do the trick." (p. 4)

But now they've updated the paper and, according to Fryer's Harvard info page, it's been accepted at the American Economic Journal: Applied Economics. This is from the abstract: "We conclude with evidence that suggests high-quality schools are enough to significantly increase academic achievement among the poor. Community programs appear neither necessary nor sufficient."

This would go nicely with the new book chapter (here's a slightly different version) in which they write, on the first page:

The evaluation of the Harlem Children's Zone allows us to conclude that a high-quality school coupled with community-based interventions does not produce better results than a high-quality school alone, offering further evidence that school investments offer higher social returns than community-based interventions.

That seems like a rather sweeping statement to make based on one preliminary estimate of one program's effects but, nonetheless, their findings do put the burden of proof back on those supporting the Broader, Bolder position.

The closest thing I've seen to a collection of research citations indicating that we do have evidence that community-based interventions can work is David Kirp's recent book, but even that involved a good deal of cherry-picking and mostly discussed small programs not explicitly linked with local schools.

So, where does this leave us?  As I wrote above, we have plenty of evidence that a wide range of experiences associated with living in poverty negatively impact kids' academic performance.  And we have plenty of reason to believe that altering these experiences could, potentially, improve kids' academic performance.  But I, and others, would argue that we have precious little empirical evidence that social policy has (or will) alter kids' lives in ways that will subsequently improve their grades, test scores, graduation rate, attainment, etc.  So I find it a bit odd that Ladd and Fiske conclude by writing

But let’s not pretend that family background does not matter and can be overlooked. Let’s agree that we know a lot about how to address the ways in which poverty undermines student learning. Whether we choose to face up to that reality is ultimately a moral question.

I'd make a different pitch if I were they.  I'd write something more along the lines of this: Let's not pretend that family background and living conditions don't matter and can or should be overlooked.  Let's agree that we know a lot about how poverty undermines student learning and how large this impact is.  And let's agree that we urgently need more research on ways to address the links between poverty and education.  The Promise Neighborhoods and other initiatives deserve our full attention and support in the short run and can potentially provide evidence that will help us better address the problem in the long run.

Of course, twice as many words with half the certainty is a really bad formula for an op-ed.  And there's no quicker way to frustrate policymakers than to write "more research is needed."

But, at the same time, I'm not sure it's helping their cause to claim that we know how to solve the problem.  If I'm in charge of a new Promise Neighborhood, my immediate reaction would be "We do? Great!"  Quickly followed by asking "which factors should I aim to address and which programs do we know are best to address these?"  I don't know the answer to that, and I've yet to hear from anyone who does.

So, in the end, I'd say there's about as much empirical evidence that social policy will close the achievement gap as there is that charter schools, merit pay, and vouchers will close the gap.  That is, very little.  So if we insist on arguing for an either/or approach, this leaves us at a standstill.  Both sides can yell that the other side's evidence is weak.  Which doesn't seem particularly productive to me.

As a researcher, this seems like an excellent argument to conduct a lot more research on the links between social policy and academic performance (as well as on in-school interventions).  Were I a policymaker, I'd want to avoid putting all my eggs in one basket.  We know the status quo doesn't work, but we can't really say for sure what else would be better.  That seems like a golden opportunity for policymakers and researchers to work together and experiment (literally) with a wide variety of reforms -- the former would get to hedge their bets and look prudent and open-minded while the latter would get to conduct groundbreaking research on a crucial issue.

In sum: Do we have conclusive evidence that a particular set of non-school interventions will close the achievement gap?  No, we don't.  So let's not claim we do.  But, let's also vow to keep searching for it.