Some evidence that incentives for good teaching can work

In a new NBER Working Paper, Thomas Dee and James Wyckoff report:

Teachers in the United States are compensated largely on the basis of fixed schedules that reward experience and credentials. However, there is a growing interest in whether performance-based incentives based on rigorous teacher evaluations can improve teacher retention and performance. The evidence available to date has been mixed at best. This study presents novel evidence on this topic based on IMPACT, the controversial teacher-evaluation system introduced in the District of Columbia Public Schools by then-Chancellor Michelle Rhee. IMPACT implemented uniquely high-powered incentives linked to multiple measures of teacher performance (i.e., several structured observational measures as well as test performance). We present regression-discontinuity (RD) estimates that compare the retention and performance outcomes among low-performing teachers whose ratings placed them near the threshold that implied a strong dismissal threat. We also compare outcomes among high-performing teachers whose rating placed them near a threshold that implied an unusually large financial incentive. Our RD results indicate that dismissal threats increased the voluntary attrition of low-performing teachers by 11 percentage points (i.e., more than 50 percent) and improved the performance of teachers who remained by 0.27 of a teacher-level standard deviation. We also find evidence that financial incentives further improved the performance of high-performing teachers (effect size = 0.24).
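As a rough check on the abstract's arithmetic, the two ways of stating the attrition effect (11 percentage points, and more than 50 percent) together put a ceiling on the comparison group's attrition rate. The sketch below uses only those two quoted figures; the function name is illustrative and does not come from the paper.

```python
def implied_baseline_ceiling(effect_pp: float, min_relative_increase: float) -> float:
    """Largest baseline rate (in percent) consistent with the effect being
    at least `min_relative_increase` times the baseline."""
    # effect / baseline > min_relative_increase
    #   =>  baseline < effect / min_relative_increase
    return effect_pp / min_relative_increase


# Figures quoted in the abstract: 11 percentage points, "more than 50 percent".
ceiling = implied_baseline_ceiling(11.0, 0.50)
print(f"Baseline attrition must be below {ceiling:.0f} percent")
```

In other words, if an 11-point increase is more than a 50 percent increase, voluntary attrition just above the threshold has to be under 22 percent.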

You will find the paper ungated here.


Are you prepared to go to the next logical step and support significantly higher public ed budgets to attract and retain sufficient high quality teachers?

I argue for just that in Launching the Innovation Renaissance.

Are you prepared to keep spending more and more and more WITHOUT a promise of doing anything better?

Are you prepared to go the next step and advocate much stronger, more universal use of dismissal threats with a much higher number of dismissals?

You'd have to pay me more to deal with coworkers who cannot be fired.

That is a huge factor.

Probably you need both. It just has to be objective and not create a culture of paranoia.

Why "objective"? Most professionals work under extremely subjective conditions. We can be fired simply because the manager doesn't like the cut of our jib, on any single day, every day of the year.

Managing and firing based just on objective numbers is the sign of really bad management. (The one real exception is sales, and even then a sales manager has to consider soft things like customer management.)

That is an unacceptable standard for public school teachers and would probably result in lots of unfounded terminations and "friend" hires, which doesn't serve the students.

Probably would be acceptable and probably wouldn't do what you suspect, although that wouldn't necessarily not serve the students.

But the good news is we don't have to be just as dumb only differently dumb as we have been. You could hire friends and then if you didn't meet the standards you and your friends get fired. Again, the good news is we don't have to even figure out how to do that.

My boss can already fire me for "unfounded" reasons and replace me with his "friend." Yet he hasn't.

Most of the adult world in America functions this way. Certainly the well-paid professional class does.

Now add 30 objective data points on your job, now add that you do a job that is always in high demand at tens of thousands of firms all over the country, now add that we could grade and trade and simply move teachers and kids around without firing anybody to try to find an optimum mesh of expectations and personalities.

Nah!!! sounds hard.

Now add Ratemyprincipaldotcom with "this mo'fuka is so nepotistic!"

Two things:
1. The inability to rely on any undesigned order keeps us in chaos.
2. I just can't see the problem, probably because I don't have the join-a-team gene.

"Most professionals work under extremely subjective conditions. We can be fired simply because the manager doesn’t like the cut of our jib, on any single day, every day of the year. "

Why is this a good thing that should be replicated in the teaching profession?

"Now add 30 objective data points on your job,"

Bad data is bad data, even if you have a large data set.

Actually not.

"Why is this a good thing that should be replicated in the teaching profession? "

Obviously, no one is saying it is. In fact, we are saying the opposite. We are saying that non-education managers would kill for the low-hanging fruit of objective performance data that education leaves rotting on the ground.

And that's for dumb stuff like signing up cell phone customers. Not the livelihoods of our children.

First, there seem to be people that are so far in their own bubble that they are surprised that jobs exist where you can be fired on any day. These people may be so insulated from the real world that they don't realize that, despite the ability to instantly fire workers, millions of workers are not instantly fired every day for petty reasons. In fact, team-oriented workers hate having dead-weight that they have to work around, so the working environment is better in places where people can be let go easily.

Second, if teachers want to be treated and paid like professionals, they should get the other side of being professionals -- at-will employment instead of a job being an asset that you cannot lose without a court case.

Finally, anyone thinking that the way the teaching profession works is the good way that the rest of the economy needs to emulate is probably too far gone to discuss things with.

How is that the next logical step? All the research implies is that higher quality teachers should be paid more than lower quality teachers. That could be accomplished by paying low-ranked teachers a lot less, high-ranked teachers a lot more, or both. Overall I fail to see how the research is anything but agnostic to the aggregated average teacher pay.

Doug, paying high-ranked teachers substantially more is not a complete solution. Assuming there are a fixed number of teaching positions, schools can be improved either by increasing quality of existing teachers or replacing low quality existing teachers. The total solution then is pay commensurate to results in addition to a minimum standard. It should be apparent that paying negative dollars for negative quality would not be a desirable result even if it were mathematically equivalent to paying high dollars for top performance.

Seems like you are saying the same thing as Doug, who never suggested paying negative dollars.

"That could be accomplished by paying low-ranked teachers a lot less, high-ranked teachers a lot more, or both."

It isn't enough to pay teachers in accordance with their effect on students. Assuming a market in which teachers are paid exactly what they are worth, the system still ends up with teachers throughout the spectrum of effectiveness. Paying a low (negative) price for low (negative) quality doesn't seem to suggest (to me) that high quality will emerge.

Assuming teachers are mis-matched, they would tend to find their correct value somewhere else.

Also, the better teachers could "buy" students from the not-as-better teachers.

OMG yes. Just think how many teachers with gigantic brains we could attract if we paid them $100,000, nay, $200,000 a year!

There is no shortage of good teachers willing to work at current pay levels. The secret to retaining good teachers is giving them majority white-Asian schools to teach in. This is the biggest not-so-secret secret in education.

You don't think we'd get smarter teachers if we paid $200,000 per year? Or you don't think IQ matters in job performance?

I'm saying the problem in attracting and retaining good teachers is not the pay.

You could pay teachers per student that they maintain a high standard of value-added for, and as such many schools could easily conceivably have several teachers earning more than $200k/yr.

The problem is we can't change anything.

Again, in 3 months time you matriculate from English/Chemistry/Physics/etc. classes with 25 students to 250 students. What happened? You suddenly got a huge frontal cortex?

We'd get them for about five minutes.

Smart people don't flock to teaching because smart people figure out quickly that teaching in today's schools is a bad job.

Has dealing with kids ever been a good job? Why do we want smart people in this job? We want people who can't help but love kids.

That is a substantial problem with this kind of discussion, as was noted elsewhere on the comments thread. If you reward people who are good teachers, but you misdefine what good teaching is, you accelerate the problems.

Don' what I'm saying.

That's why I propose grade and trade. If the grade is wrong, the trade will compensate.

If I'm understanding your system right, I think it's one of any number of radical changes that could improve the system. But won't, of course, happen, which you know.

There are two dimensions to this: (a) ease of job and (b) remuneration.

You are saying good people would flock to teaching if the task is made easier (i.e. easy to teach students). Sure.

The other approach is to pay them better to motivate them to teach well in spite of difficult students.

And difficult students should pay more to the teachers. Again, I don't know why this is brain surgery. You have 12 years of observational evaluation before you ship these people to prison. It shouldn't be hard to keep discipline records. You just have to figure out how to let teachers "draft" students and trade.

I think letting students and teachers select each other could be a really good model. You would need to fix a few things first, like taking away grades, because otherwise all the students pile onto the teacher that gives easy As.

This would still work for special needs kids. There are teachers, believe it or not, that enjoy working with special needs kids.

Once this selection process starts you have established a market, and then you use money to fill the gaps where needed.

You will get teacher retention at a higher rate in schools where the family culture is such that the family does a bunch of the work for you. That's the same in any job, the job is easier if you don't have to do all of it.

But you won't necessarily retain the best teachers this way, because making teaching easier is not what will motivate the best teachers. The best teachers are very selfish -- they get a kick out of doing their job well, like other professionals. And for teachers, that means teaching a kid something he would never have been able to understand without the teacher's help, something no one thought he'd ever get. The superstar engineers are the ones who want to build the bridge over the biggest chasm, hit the moon, etc. They may never do any of that, but that's who they are. Same with teaching. That doesn't mean they want to teach kids from rough households or anti-academic family traditions. What it means is they want a teaching environment that lets them teach, and then they'll take the kid who can't read and teach him to read, and the white-Asian kid who can and teach him Euclid (or whatever).

Worked with a man in a majority Mexican-American district (all) where most were second language English speakers. He was teaching a group of them higher level math -- stand and deliver idea, if language is an impediment then the kids who want to be excellent can still get there with math. He was teaching pre-algebra and algebra. The superintendent ended the program.

Sure put him on the short-timer's calendar to retirement.

I deal with engineers who speak almost no English. They are Chinese. They aren't even really better at math but they are less distracted and in a Ricardian trade comparative advantage sense they still end up focusing on math. I suppose the Indians would do the math if they didn't speak better English. Again, it's amazing what we do once we get out of public schools- a lot of it what people tell us can't be done that we manage to do at even higher levels when it is even harder.

This is a good argument against enormous salaries for any position, no?

I don't know what you are referring to, but I'm for nearly infinite potential pay for teachers.

Is there some evidence that incentives for good governance can work?

I'd love to run the experiment- as long as I get to bring the incentives with me.

Did they take into account the cheating that was detected in more than a few schools during that time?
"When standardized test scores soared in D.C., were the gains real?"

This is the correct answer. There can be no real effect observed when so much cheating (50%-80% of classrooms) is going on. The only way to really increase quality is to have entry-level salaries comparable to the private sector in particular fields. To make this work, you would also need high turnover. Get rid of the bad teachers quickly and keep the good ones.

We can't stop cheating in schools?

Time to pack it in.

How do we define and identify the bad teachers and the good ones? How many are "bad?" Where are we going to find more-skilled replacements?


"Where are we going to find more-skilled replacements? "

Why assume they need replacing instead of life-long learning and retraining? The rest simply get paid according to their value and if they can make more money elsewhere they can move on.

In addition to cheating you also have unbalanced classrooms. My principal can place 5 special needs students in my 8th grade class while the principal's buddy gets zero special needs kids in her class. This happens all the time.

The administrators are a big part of the problem.

Okay, what part of what I want to do gets in the way of the need to stop that?

Above someone claims that kind of thing might happen if schools did things the way everyone else manages to do things.

But if it already happens...

"... incentives based on rigorous teacher evaluations..."

Bad idea. Formal employee appraisal systems don't work anywhere and never have... despite vast experience & experimentation with them in big corporations, smaller companies, government, and the military.

Such evaluation/appraisal systems are always highly subjective in the formal criteria specified for employee performance, and in the judgments by individual supervisors. They also ignore the variable system/environment that employees must work within... but have no control over. Government school teachers in particular -- typically labor in byzantine bureaucracies, irrational rules/policies and incompetent upper management.

Sorting teachers is not the problem; bad management is the problem.
Most American schooling/education is ultimately managed by politicians.

But this study says they do work. Also they are based in this case mainly on objective measures (test results from one year to the next).

See Megan's comment above. Does it hold if there was widespread cheating? What if it is a not uncommon phenomenon in schools offering incentive pay?

Grade and trade.

Teachers are damn near fungible...or if they aren't we are doin' it wrong.

Every objection is all so frickin' hysterical in light of reality. E.g. "we can't just let teachers move around because we can't make space for them!" I spent half my life in a tin can trailer. We already do about 3x of what "grade and trade" would do, we just don't admit it to ourselves.

I spent all 6th grade in a trailer, and from then on there were always trailers. Now, because of creative (i.e. illegal) financing, schools are Taj Mahals. The problem is all intransigence.

By the way, Michelle Rhee's war on the economic basis of Washington D.C.'s black middle class (easy government jobs) is having major demographic repercussions: as blacks leave D.C., there's a white baby boom going on, with the percentage of white babies born in D.C. up 34% in just the last 3 years.

Once upon a time, the great enterprises focused on hiring the best people so they would not need to figure out how to fire people.

The Federal government was a big innovator by removing hiring from the political crony method and adopted a "scientific" method of civil service exams and investigations of character.

When I was a kid in public school in the 50s and 60s, teachers were leaders in upping the standards just as they had been for my parents.

I have had no kids, so I observe the debate from a distance, but since Reagan, "government employees" are always the problem, so it seems they will never be the leaders in addressing the changing needs of society, and the people in charge of improving the education system are not educators but politicians or industrialists or bookkeepers or equipment suppliers.

The method of improvement to the process that is the latest to fail is to test the students to see if the right teachers were hired. That is like hiring a lot of people to operate lathes and milling machines and then using test gauges on the production just before a huge batch is being shipped to customers to decide if the employees should be kept or fired.

The stupidest idea is to hire people with absolutely no knowledge of machining on the assumption that they will come up with new ways to make parts, say using a milling machine instead of a lathe to produce a long round part.

The federal civil service developed an outstanding, scientifically-validated civil service exam in the mid-1970s, but the Carter administration threw it out in January 1981 in the Luevano capitulation because the PACE test had disparate impact on blacks and Latinos. They promised a new non-disparate impact test would be developed real soon now, but a third of a century later, we're still waiting:

Well that's stupid, but not as stupid as making performance on a standardized test the main criterion for hiring federal jobs.

You could do a hell of a lot worse. There is literature on the factors that affect job performance and IQ is at the top of the list.

Read the article and learn about the alternatives. It's not a question of only relying on tests, it's a question of being able to use tests, just as Harvard looks for test scores and GPA and other factors.

Some departments, such as State, have devised their own test, but many just rely ever more heavily on resumes. One consequence is that the average age of federal new hires has gone up considerably.

The country would be better off with explicit racial quotas for blacks than with these kinds of muddles.

Grade and trade.


I don't get this pervasive hostility to a standardized test. What's wrong with that?

Or are you saying the tests were badly set? If so, it's an argument for a better test question set; not just opposing performance measurement.

It is just a poor predictor of overall performance. Would you believe that in Canada they don't even use standardized tests as part of the college admissions process?

I don't mind test performance being used as _one_ factor of many, but just as almost no private employers rely primarily on a test to target their hiring, I don't think we should do that with government jobs. Even in the case of highly technical work, like programming, employment history and the ability to work on teams, meet deadlines, etc. are usually an important part of the vetting process. What I think government jobs could use more of are performance measures once the person is hired.

You know what is a poor predictor of success? Graduation. It's like grading a programmer on his ability to market and ship a product.

Student value-add is easy compared to anything the private sector has to do.

Standardized tests are a useful tool for evaluating individual students' strengths and weaknesses. They are not a useful tool for evaluating teacher performance.

Why not?

The tests are standardized so you can compare apples-to-apples.

So testing doesn't work at all? Then why do we do it?

Non-standardized tests are also known as tests made by teachers.

I never understand the mentality behind comments such as yours. I suspect it is "the way we do it now doesn't work." I agree.

What I want is choose your own adventure. I'd choose standardized math tests for my kid. They could be administered off-campus. Then I check and see what teachers create the best value-added on math at my kids level and just beyond. Then I want that teacher. None of the fact that we don't allow anyone to do that today changes how I feel about the ideal solution.

The problem is that no reasonable method has been determined to identify good teaching. Standardized tests are just how to identify who is getting kids to memorize set skills. That's not teaching or educating.

What does it mean to "memorize a skill"?

"Tests measure nothing!"

I think it was Matt Yglesias who said that teaching to the test is exactly what you want for some things like literacy.

Then it is time to pack it in.

The thing is, it is.

The problem with value added measures is that they don't *exactly* capture the teacher effect.

A kid does a standardised test and does badly. If he has rich parents they can send him off to tutoring and next year he posts a good score - but no one can say how much was the effect of tutoring, how much was the effect of the teacher. If he has poor parents then they can't send him off to tutoring.

That's one of the reasons why all the high quality teachers are in the richer schools - the parents pay for extra help, or are more able to give extra help, if the kid is doing badly and that gets confounded with the teacher effect. Meanwhile, teachers in the poor schools get disproportionately fired for being low quality teachers.

Grading teachers within a school or maybe a district would solve that problem, then?

Also, maybe you can more reliably give the teachers credit for scores that drop. Unless they stopped the tutoring after the improvement the year before...

Since when does anything have to be exact?

You have 30 students. Do you know how often a private sector factor gets 30 data points? I'd say you are taking drugs, driving in cars, and doing almost everything else that risks your life every day that never got 30 statistical data points.

Has anybody checked whether these kinds of incentive systems have disparate impact on black teachers? The black voters of Washington fired Michelle Rhee's boss Mayor Fenty for her being mean to black teachers, so the political evidence, along with the last 50 years of social science, suggests that better systems of evaluation and reward will leave black teachers on average with the fuzzy end of the lollipop.

Grade and trade. Judge blacks versus blacks. Etc.

Why, for example, would we not want "Affirmative Action" for a black teacher teaching a large number of black students?

Why wouldn't a black teacher be, all else measured equal, marginally more effective for teaching black students?

I'd suspect they would be marginally more effective. And until white teachers completely crowd out black teachers by being paid less you probably need black teachers for white students too.

What is the problem? People act as if we control everything so they can't imagine a change that entails just letting go of things that we already don't really control anyway.

So, you collect the data. Then you compare the test results to the value-added of the black students and white students. Then you handicap the black teachers back to par. All of this is done by people who show, more or less, they aren't blatant racists and it is plenty good enough and is way better than what we are doing today where racism and cronyism hide behind vagaries like assignments, student assignments, crappy committee assignments, etc. At least then a teacher claiming racism could just pull out a sheet of paper with their numbers.

"Why wouldn’t a black teacher be, all else measured equal, marginally more effective for teaching black students?"

The problem James S. Coleman discovered in 1966 is that all else isn't, on average, equal: the white-black IQ gap is a major issue holding back mean black teacher effectiveness.

While doing the Coleman Report of 1966, James Coleman found that teachers' IQs had an impact on how much children learned. Unfortunately, this meant that black teachers, especially black male teachers, tended to be worse for students than white teachers. Since black teachers were much of the basis for the black middle class, he suppressed this finding in the Coleman Report. He apologized for his violation of scientific ethics a couple of decades later.

Documentation here:

Did it account for character education, for example? I could almost care less of how many BS factoids my kid learns (excepting of course, by definition, how to think logically and mathematically, etc.). I could point to a bunch of Youtube videos of black male teachers who I'd love my kid to have as a teacher even if they were marginally worse at spouting trivia.

I don't want Obama deciding what teachers to hire and fire whether he's doing it wrong or (in his mind) right.

Hell, I could say the exact thing about graduate school professors. Give me one unimpressive guy who gives a shit over all the impressive a-holes who I lose ground with every time I interact with them.

I can imagine two types of incentives (surely there are more). One type works on reserves while the other works on development (another could work on matching- if a teacher has no reserves and cannot be developed maybe they need to be "re-matched"). For example, it may be hard to just have a good teacher or a poor performer do better for different reasons. A good teacher that has reserves like staying later to plan lessons or giving individual attention to students early in the morning (how I passed my 3rd grade timed math tests) these could be tapped by the right incentives. Development incentives could be longer-term and could also be partially placed at the feet of the principal.

You don't want it to be like academia where failure to develop an employee is deemed a success.

You don't have to pay a person more to stay after to help students. If you do that, in fact, what you are going to get is a whole system set up around it, with teachers bickering over how to fill out the time cards for it and whether Mrs. S got more than Mrs. G did for whatever.

What schools have to do is let good teachers teach. They don't want them staying after or coming early -- I got dressed down for opening my doors before school hours. And they don't want teachers planning their lessons, they want to hand them federal Common Core standards packaged in Houghton Mifflin curriculum you just follow along with like those old recordings where the kid reads along with the tape and it goes "bing" when you have to turn the page.

If the schools were ever entirely successful in producing the kind of teachers they consider good teachers, we'd have no good teaching going on in the public schools at all.

No, you pay the teachers to increase average and individual value-adds, though you never over-incentivize anything at the expense of everything. Paying them to stay late would be stupid. Don't do what I'm saying.

To generalize, as either Deming or Drucker (or surely both) said, any time you have an indicator and incentive, you have a counter indicator and incentive. You can even over-reward as long as you have an indicator to identify the selfish bastards.

If I were to walk into a school this morning and moon the principal, that would be better than the dys-regulation we are currently doing:
"...that’s still just a tiny fraction of the roughly $100 billion in his budget (much of which the government direct-deposits into the bank accounts of schools, whether they deserve the money or not). "

"where will we find replacement teachers?!?"

Well, for a job that usually requires a degree, 40-50% of teachers leave in their first five years, so obviously what we are doing today ain't fer shit. That is worse than attrition at a typical company for which you don't toss your degree (e.g. if you are no longer a teacher, what good is your education degree?). We'd have to try hard to do any worse.

We can't possibly figure out how to grade teacher performance doing the actual job...but the tests they take towards their education college degree are fine.

What are we doing here people?

Most teachers are women, so I expect they leave to start families. Also, there are only so many white/Asian middle class school districts to go around. If you haven't landed in one by then, you are probably burned out after five years attempting to educate the uneducable.

The Atlantic article you linked and the cited studies work very hard to avoid inadvertently stumbling onto an answer.

The next step is to get rid of public schools and increase competition.

Value-added modeling (VAM) is a useful and statistically valid method for exploring questions such as what proportion of the variance of student outcomes is attributable to school-level factors vs. classroom-level factors. It provides reasonable estimates of the value associated with general factors such as the marginal benefit of a year of teacher experience beyond the first three years or the marginal value of additional years of teacher training and the like.

When aggregated across several years, it can provide a somewhat reliable indication of between-teacher differences in the outcomes measured by the associated tests.

However, these models are nowhere near reliable enough to provide a useful estimate of between-teacher differences in outcomes on a class-by-class or year-by-year basis. By extension, they provide no useful feedback to the teacher for improving performance.

They are completely useless for predicting how a given student is likely to be affected by a choice of teachers, even if the criterion of interest is simply performance on a given standardized test.

It is notable that in the study cited the measure of quality went well beyond VAM. When heavily augmented with more comprehensive methods, such as direct observation, there may be a useful role for VAM, but it can at best play a small role.
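The point above, that estimates aggregated over several years are more reliable than single-year ones, follows from ordinary averaging of noise. A minimal sketch, assuming each year's estimate equals a teacher's true effect plus independent noise of the same size (an illustrative assumption, not something taken from the study; the noise value is made up):

```python
import math


def standard_error(noise_sd: float, years: int) -> float:
    """Standard error of the average of `years` independent yearly estimates."""
    return noise_sd / math.sqrt(years)


noise_sd = 1.0  # hypothetical single-year noise, in teacher-level SD units
for years in (1, 2, 4):
    # Noise in the multi-year average shrinks with the square root of the years pooled.
    print(years, round(standard_error(noise_sd, years), 3))
```

Under that assumption, pooling four years halves the noise of a single-year estimate, which is consistent with the comment's view that year-by-year figures are far less dependable than multi-year averages.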

Given what we know about contracts and incentives in complex multi-task settings (e.g. Feltham and Xie, 1994), it would be more than a little surprising if incentives turned out to be very effective for improving the quality of teaching.
