Are Government Spending Multipliers Greater During Periods of Slack?

The embarrassing result here is that no, it seems they are not:

A key question that has arisen during recent debates is whether government spending multipliers are larger during times when resources are idle. This paper seeks to shed light on this question by analyzing new quarterly historical data covering multiple large wars and depressions in the U.S. and Canada. Using an extension of Ramey’s (2011) military news series and Jordà’s (2005) method for estimating impulse responses, we find no evidence that multipliers are greater during periods of high unemployment in the U.S. In every case, the estimated multipliers are below unity. We do find some evidence of higher multipliers during periods of slack in Canada, with some multipliers above unity.

That is from a new paper by Michael T. Owyang, Valerie A. Ramey, and Sarah Zubairy.

I see a few views (among others) of the multiplier:

1. The crude form of Say’s Law is true and fiscal policy simply remixes funds with no real impact.

2. Say’s Law is not really true as stated, but why should we be so impressed with a one-time uptick in monetary velocity, also called fiscal policy, unless the supply-side effects are also really good?

3. Fiscal policy effectively targets and mobilizes unemployed resources.

4. Fiscal policy postpones adjustment issues (this can be from AD too, it doesn’t have to be a “structural” problem), and may usefully smooth consumption, but it doesn’t do a good job targeting and mobilizing unemployed resources.

5. The size of the multiplier is determined by the expected monetary policy accommodation, and not by the quantity of unemployed resources.

I would say this paper provides evidence against #1 and #3.


"The embarrassing result here is that no, it seems they are not:"

The result is not *embarrassing* in any standard use of the term. There is an important empirical question that needs to be answered, and this is one more paper in a large literature of mixed results. It's a welcome addition to the group of studies, with a longer time series than most work, but by no means without its drawbacks (joys of applied macro). For example, should we think the multiplier in 1933 is the best estimate of the multiplier in 2013 (for a whole host of reasons)? I thought the intro of the paper gave a slightly more accessible setup (though I appreciated the bullet points above too):

"A key question that has arisen during recent debates is whether government spending multipliers are larger during periods of slack. Some researchers and policy makers have argued that while government spending multipliers are estimated to be modest on average, they might become greater during times when resources are underutilized. Auerbach and Gorodnichenko (2012a, 2012b) (AG) test this hypothesis and find larger multipliers during recessions in both quarterly post-WWII U.S. data and in annual cross-country panel data since 1985. Their findings suggest multipliers near zero during expansions but between 1.5 to 2 during recessions. Fazzari, Morley, and Panovska (2012) confirm these findings using different methods and measures of slack in U.S. data since 1967. Gordon and Krenn (2010) find that multipliers are larger before mid-1941 than after in their analysis of U.S. data from 1919 to 1953. In addition, numerous cross-state analyses estimate bigger multipliers during periods of slack. On the other hand, Crafts and Mills (2012) analyze government spending multipliers in U.K. data from 1922 to 1938 period of considerable slack, yet find multipliers between 0.5 to 0.8."

"Mixed results" in this case means that there is no multiplier effect. If there were, then we'd observe it in nearly every study. The fact that it has been analyzed so many different ways over the course of decades and yet the results are still "mixed" means that the profession needs to let this one go.

Imagine if empirical results produced "mixed results" for the Law of Diminishing Marginal Utility.

Like many economic concepts, the multiplier is an interesting theoretical concept. Unlike many economic concepts, it is one that can be tested repeatedly, and quite reliably. After all these years, we are still seeing "mixed results." We now have enough data and enough studies to conclude that this is white noise.

Disagree. I meant "mixed" as in some find a large multiplier and some find a small multiplier... the identifying assumptions are key (context may well matter). The differences in these studies are NOT random. Rather than "letting go" (which seems particularly odd given all the large multiplier results), I am an advocate of digging deeper. The multiplier probably is not some immutable meta-parameter (I doubt many experts would disagree), but it may be well approximated as such under certain conditions. We have a lot more to learn here... admitting gaps in our understanding is not embarrassing or cause to give up.

How much variation do you have to see in a variable before you will be willing to conclude that there is no pattern to the variation? If all estimates of the multiplier were either greater than one or less than one, I would agree with you. But we see multipliers in both categories. How is it possible after all this time to still believe that it is an empirically valid concept? It's not. A less ideological science would have completely thrown the idea away by now. This is the economic equivalent of phrenology.

Why does there have to be one unique value across nations, times and policies? Or were all these papers measuring the same economic snapshot?

"A less ideological science would..."

Which world are we imagining here? The one where economics is less ideological? Or more scientific?

I think Tyler gives too much importance here to one paper in a forest of data. Perhaps because he is amused by the embarrassment that might result if it was true. Wishful thinking.

The reasonable conclusion to me seems to be that either (1) our measurement procedures are noisy or flawed, or (2) the multiplier varies so widely across nations, policies, and eras that talking about it in general does not make sense, or (3) the concept of a multiplier is itself so fuzzy that each paper is measuring a different quantity.

I think the Koch bros have turned up the heat on TC based on the last month or two

From the papers I've read, not all fiscal multipliers are created equal. It depends on the nature of the policy, who or what it's targeting, etc. Given that, the net effect will be some mixture of the effects from the policy particulars. I wouldn't be surprised if some policies had multipliers less than unity and others greater. Any result other than "mixed" would strike me as pretty astounding. I think "mixed" makes the most sense. The task is to delve into those mixes and figure out which policies have good multipliers, which have bad, and how they're influenced by other conditions or other possibly offsetting policies (e.g. monetary).

Exactly. We need program based review. The view from 100,000 feet tells us too little.

As I understand it, the issue of multipliers historically dealt with the question of whether government spending crowded out private activity. A multiplier of less than one suggested that this was so.

In the current context, the question is different, I think. At issue is whether the incremental $5.4 trillion borrowed to support government spending resulted in any meaningful economic activity. If the multiplier was less than one, then the borrowing did not even create as much benefit as its face value, if I understand the concept correctly. In other words, we owe $5.4 trn plus interest, and we didn't even get the face value of consumption as a benefit. That's embarrassing, and frankly, deeply troubling.

But it's even worse. Auerbach suggests the multipliers could be as high as 2.3. OK, so let's do the math on this assumption.

The change in government spending (dG) will peak in 2013 at $1.075 trn compared to 2007 levels. The change in the government deficit (dD, which I consider a fairer representation of the stimulus) peaked at $1.25 trn in 2009, compared to 2007 levels. So that's our dG and dD, the size of the shock to government spending.

If I apply Auerbach multipliers, up to 2.3, then the value of these shocks through 2013, compared to 2007, is $2.0 trn benefits from dG and $2.4 trn benefits from dD, depending on which metric you think is better. So that's the benefit of government spending or deficit since 2007, including multiplier effects.

On the other hand, cumulative debt through 2013 at the Federal level, over 2007 levels, is about $5.4 trn. Thus, the benefits from deficit spending are between 37-44% of debt incurred--and this using Auerbach multipliers.

Now, a multiplier of 2.3 equates to an IRR of about 75%. You can go a long time in the finance biz without seeing an IRR of 75% (at least one you believe). However, Auerbach would have us believe that, when the government sends a check to Mr. Jones for unemployment insurance, and Mr. Jones goes and buys a can of tuna, this act of consumption--not production, consumption--has an economic return equal to an IRR of 75%.

Hard to believe.

If I use a more modest multiplier, say 1.5, then the benefit/debt incurred ratio falls to about 0.27. Put another way, with a multiplier of 1.5, an incremental $1 of GDP has cost us nearly $4 in debt. That's embarrassing.
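
The ratios in the comment above can be checked with a short sketch. It takes the commenter's own figures as given ($2.0 trn and $2.4 trn of benefits, $5.4 trn of added debt, and the 0.27 ratio for the 1.5 multiplier case) rather than re-deriving them, and the function names are hypothetical.

```python
def benefit_to_debt(benefit_trn: float, debt_trn: float) -> float:
    """Share of the added debt that shows up as economic benefit."""
    return benefit_trn / debt_trn


def debt_per_dollar_of_gdp(ratio: float) -> float:
    """Dollars of debt incurred per dollar of benefit, given a benefit/debt ratio."""
    return 1.0 / ratio


DEBT_TRN = 5.4  # cumulative federal debt over 2007 levels, as stated in the comment

print(round(benefit_to_debt(2.0, DEBT_TRN), 2))  # benefit figure based on dG
print(round(benefit_to_debt(2.4, DEBT_TRN), 2))  # benefit figure based on dD
print(round(debt_per_dollar_of_gdp(0.27), 1))    # the 1.5 multiplier case
```

The first two lines reproduce the 37% and 44% shares. The last comes out near $3.70 of debt per dollar of GDP, which the comment rounds up to $4.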

If these multiplier studies do not take into account private credit creation, they are not following the accounting well enough to be able to determine the viability of fiscal multipliers.

The problem with doing macro without following the accounting very, very closely is that it becomes impossible to tease out what's really happening in the economy. This and nearly every paper on fiscal multipliers has this flaw baked into the methodology.

Until these studies are done using the Godley/Lavoie accounting matrix, these studies will continue to find mixed results, and we can pretend to not know if fiscal policy works or not. It's great for having more papers, not so great for finding out what happens when the government changes fiscal policy.

"The embarrassing result here"

As a professional in a science/engineering field, I cringed when I saw this term. If study results are "embarrassing," you're looking at it wrong. Study results should be interesting and informative, not embarrassing. Research should be collaborative, not adversarial.

Collaboration--that's not the way economics, or even science, works. Kuhn retweeted this quote: "Science advances one funeral at a time." -- Max Planck. From the comment by TC, I took it that the two papers are "mixed" in that they refute both points #1 and #3, which are opposite. I agree with Rahul that these parameters probably change with time (as does, for example, the monetarists' 'velocity'). Economics is nonlinear, so 'anything can work' in theory, akin to rebooting a system that has some nonlinear problem: sometimes the transient condition will disappear.

If you convinced the US government to make a, say, Manhattan Project based on research you had strongly promoted and, after trillions of dollars spent, you found out "hey, nuclear fission doesn't release energy, I guess I was wrong about all that stuff. Sorry about the bill" you think you wouldn't be embarrassed? Are you particularly shameless, or have you not thought through the results of Macroeconomic theory on government spending?

A few more points (tl;dr the paper, sorry):

1. Do they (and others) control for the "Sumner critique", i.e. the effects of monetary policy, which cannot possibly be considered independent of fiscal policy, unemployment, etc.?

2. What about the multiplier for spending cuts vs. spending hikes? Do we have good reasons to think they are the same? (Blanchard recently estimated the effect of the former, for example.)

3. Spending cuts/hikes vs. tax raises/cuts?

4. ...affecting low-income/high-income groups?

Show me an empirical study in the 5-dimensional space of (spending; taxes; transfers; central bank policy; unemployment), without assuming linearity, and I'll call that science. Of course it's hard.

To, based on your point #3, you may be interested in Romer and Romer (2007) "The Macroeconomic Effects of Tax Changes." They categorize tax changes by the reason for the change. They find that "legislated tax increases designed to reduce a persistent budget deficit appear to have much smaller output costs than other tax increases." They find a rather large (~3) multiplier.

Carola, am I understanding the Romer and Romer analysis to support the idea of raising taxes in these United States, provided the money was used to reduce the government deficit?

Isn't this a fair characterization of the Bush Sr. and Clinton tax increases in the 1990s, which don't appear to have damaged the economy, though they did cost George his job?

How can there be no citation of Nakamura & Steinsson who find a multiplier of 1.5 in a recent U.S. study, that at first glance appears to have better identification than the rest? Vox summary is below, and the paper is conditionally accepted at the AER according to Emi's CV.

'The embarrassing result here is that no, it seems they are not'

Well, in America, as the study says itself in the last sentence of the summary - 'We do find some evidence of higher multipliers during periods of slack in Canada, with some multipliers above unity.'

Regarding such studies in general, I think Don Boudreaux's observations should be kept in mind -

I think Say's Law has gotten a bum rap, just because Keynes derided it. It is usually thought of as "supply creates its own demand" but it is more subtle than that. In Say's time, the same complaint was heard from merchants: money is scarce. Say's insight was that mere money demand is insufficient; there has to be production backing up that demand. I think current macro policy is designed to create the illusion that things are better than they are: artificial demand, artificial money. Perhaps economic dislocations are inevitable and best cured by time.

More Krugman bait, sweet!

"Fiscal policy" needs to be dissaggregated both empiracally and as policy. Do different kinds of expenditures undertaken at different times of high aggregate unemployment of resources and under different monetary policies, raise long term real income? To what extent is it politically possible to execute a policy based on these empirical differences if found? My guess is that the answer to question 1 is yes and to question 2 is not so much. My conclusion is that it is better to have "automatic stabilizers" that, for example, transfer funds to states and local government to offset recession-induced declines in tax revenue and increases in recession-induced needs for public assistance like Medicaid, unemployment insurance, and food stamps and to use rules for public infrastruture spending that are sensitive to declines in Federal borrowing costs.

Automatic tax cuts like the payroll tax could also be part of the mix. The indicators that trigger the different "automatic" actions would not be the same: infrastructure spending would react to interest rates and contracting bid prices; payroll tax cuts and unemployment insurance to labor market conditions, possibly geographically disaggregated; transfers to states to declines in state incomes and changes in the number of people qualifying for assistance.

I understand that you're not engaging in a comparative analysis, but given that you mention 2 and 5, I'm surprised you mention 4 with a presumably straight face. Fiscal policy does not and monetary policy does?!

Also, V Ramey found low multipliers. And the sun rose in the east again today. I remember her 2011 paper, where she corroborated her results using Blanchard & Perotti 2002 by quoting that B&P's multipliers subsided to nearly 0, 20 years out. But of course they did! What else are they supposed to do?

It's the nature of these debates. Unless we think of AD policy, any AD policy, as choosing between different path dependent wealth equilibria, all multipliers eventually subside to 0. But is the original trend the right benchmark in the first place?

I still think that B&P 2002 remains superior to more recent work in thinking about multipliers in general, and the anti-spending pro-tax cut brigade should add Alesina as cherry on top. Ramey's entire line of research is a non-illuminating digression.

"It is important to point out, though, that because we do not adjust for the fact that taxes often rise at the same time as government spending, these estimated multipliers are not necessarily equal to pure deficit-financed multipliers."

There was some recent discussion on The Economist that pretty emphatically suggested that military spending is not a good proxy for fiscal policy studies because it has a low utility to society. I think that objection is pretty valid in this case.

If you are looking for counter-cyclical spending which has a positive ROI, the military spending would seem to fall short. Dumping high-tech gear in Afghanistan is a pretty extreme form of avoiding returns. I mean, the same number of robots distributed to elementary schools would not guarantee technological nirvana, but it would surely put one or two more kids on the road to innovation.

(So yeah, I must assume that economists who reduce to "spending is spending, show me the multiplier" must either be blinkered in their view of the world, or sharpening a political axe.)

One more paper among lots of others with mixed results. I think the correct answer is that economists just don't know what the multiplier really is. Taking a strong position on either side of the issue seems like mood affiliation.

Is there some reason economists can't, or won't, say "We don't know"?


Saying "I don't know" makes you useless as an economist. Saying "we need fiscal stimulus because the multiplier is X" gets you nice prestigious positions in government.

And there are economists who say "we don't know", they are called Austrians and no one listens to them.

Is it possible to vote 4 and 5?

I'm a big fan of Marginal Revolution, but the word embarrassing makes me wonder. Based on my read of Tyler, I believe the results are actually in line with his thinking so he means embarrassing to his ideological opponents. Which made me wonder - what results has Tyler found recently that were embarrassing to him or his beliefs?

"It is important to point out, though, that because we do not adjust for the fact that taxes often rise at the
same time as government spending, these estimated multipliers are not necessarily equal to
pure de…ficit-fi…nanced multipliers."

This seems to be the key phrase from the paper. So, all it's basically saying is that raising taxes to finance additional spending during times of economic slack isn't a good idea.

It's nice to know that Intermediate Macro holds up.

Just to remind everyone, a multiplier less than one is an attack on the middle class, a reduction in the number of people with the median income, otherwise known as death for nations.

I find it a horror that Krugman, Thoma, and DeLong simply persist in their demand for direct, vicious attacks on the middle class.

Isn't calling this "the embarrassing result" a little bit of confirmation bias, given that other studies (e.g. Auerbach & Gorodnichenko) say the opposite?

This comment builds on ThomasH's important point that fiscal policy needs to be disaggregated into its various forms when discussing "the multiplier." For those interested, following are some connections between recent papers and the types of fiscal policy with which they are associated.

General federal spending shocks identified through structural methods: Too many to count. Fairly standard in the macro literature.
National defense spending shocks: Barro and Redlick; some of Valerie Ramey's work
Localized defense spending shocks: Nakamura and Steinsson
Local shocks associated with earmarks: Cohen, Coval, and Malloy
State government spending shocks financed through own sources (disclaimer: I'm on this one): Clemens and Miran State government spending financed through unexpected pension plan returns: Shoag
State government spending financed through shocks to federal transfers: Serrato and Wingender; Chodorow-Reich, Feiveson, Liscow, and Woolston

The list could go on and there is much that I've left out. Also important are theoretical contributions including some recent work by Emmanuel Farhi and Ivan Werning.

My reading is in line with what some earlier comments have already noted. There is substantial variation in the results obtained in various settings. It remains difficult if not impossible to systematically link these differences to substantive differences in the types of fiscal policy being studied.

Sorry, the list didn't preserve all of the line breaks I meant to add:

General federal spending shocks identified through structural methods: Too many to count. Fairly standard in the macro literature.

National defense spending shocks: Barro and Redlick; some of Valerie Ramey’s work.

Localized defense spending shocks: Nakamura and Steinsson.

Local shocks associated with earmarks: Cohen, Coval, and Malloy.

State government spending shocks financed through own sources (disclaimer: I’m on this one): Clemens and Miran.

State government spending financed through unexpected pension plan returns: Shoag.

State government spending financed through shocks to federal transfers: Serrato and Wingender; Chodorow-Reich, Feiveson, Liscow, and Woolston.

Yesterday I wanted to comment on a DeLong blog post about the same topic:

"An exercise by the Financial Times to replicate and evaluate the IMF’s work, however, showed that the results suggesting very large multipliers – the relationship between deficit reduction efforts and growth – do not easily stand up to a different choice of countries or time period."

"For the countries where the full data are available on the IMF website, the results lose statistical significance if Greece and Germany are excluded."

"the whole exercise of trying to forecast growth for many different countries using essentially a single multiplier, whatever the value may be, is, in and of itself, an exercise in futility”.

What did little Olivier say in June 2008: "the state of macro is good"

Apparently everybody can produce the number fitting his political agenda by just excluding or picking the countries to go into the analysis.

This does serve as a widely used example that one cannot trust the IMF, economists overall, and, foremost, not for 5 cents, an Olivier Blanchard.

Do they run into the same problem as Barro with WWII, in which the economy goes from depressed to so overheated that rationing is needed?

Perhaps when the economy is not doing well, a larger share of projects are make work projects of more marginal benefit, as opposed to when projects are more expensive and harder to mobilize in boom times.

Few thoughts:

1. You can't just 'ditch the multiplier' as a concept. If spending is increased then that spending has to go somewhere. Imagine you were filling a pool with water and after running the hose for 5 days you notice the water level has not risen at all. You can't just call it a day and decide that we can 'ditch' the concept that water takes up volume. The conservation of matter and energy holds: the water going in cannot just disappear. If the water level doesn't rise then there's something else going on in the system which you have to figure out.

2. When the gov't announces some spending, it seems an open question to me when exactly the stimulus hits. For example, suppose the gov't announces it will build an aircraft carrier which will take 4 years to complete. Some of the stimulus may take place before a single dollar is spent (contractors, anticipating winning the contracts, begin to hire or at least not fire key staff people). On the other hand, some of the impact may depend on cash flow. Imagine contractors who tell subcontractors "I'll pay you when I get paid". For them the big payday may not come until months after the project finishes and the final payment is issued. It seems then the dynamics of stimulus may follow a percolation model. If water flows over loose gravel, the ground will absorb it quickly; over clay it will absorb slowly. Think of a spending shock that way: some industries soak up the spending like a sponge and take a while to release it into the rest of the economy, others spit it out almost as fast as it comes in...even faster! Imagine 0.4 of the stimulus happens before you start to look for it; then when you measure 0.7 of stimulus from the time you look, you declare it's less than 1. In reality you have just measured the lower bound of the stimulus. Likewise this would explain why different studies get wildly different amounts for the multiplier. Depending upon how you look, different studies will only see bits and pieces.

3. I see the value of looking at localized shocks to see if you can 'capture the multiplier'. Since some areas have the stimulus and others don't, it would seem you have set up a good randomized control. But this requires that the different areas be independent. Imagine in the rust belt a single company in a small town has the good fortune to score an unexpected contract to give the Pentagon specialized bolts for an aircraft. This would seem to be very stimulative, esp. if the town's single major factory was about to shut down. But since the little town is surrounded by a huge state where everyone is suffering, the stimulus can easily 'leak away' into the larger community. Workers cashing their overtime checks may not hang out at the local bar but may instead take a trip to Atlantic City hundreds of miles away (after all, it's been a while since they saw a good check). Or the local factory owner may buy cheap vacation property at distressed prices in the state next door. The stimulus will indeed work, but it will seem to have been very dampened if you're just measuring the GDP of the town itself.
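
Points 2 and 3 above can be put into a small sketch. Everything in it is an illustrative assumption rather than anything from the paper: the per-period response path is made up to match the 0.4 and 0.7 figures in point 2, and the leakage calculation uses the textbook geometric multiplier with a hypothetical respending share and a hypothetical share of that respending leaving the town.

```python
def measured_multiplier(response_path, first_observed):
    """Sum the output response from the first observed period onward."""
    return sum(response_path[first_observed:])


def local_multiplier(respend_share, leak_share):
    """Geometric multiplier when part of each respending round leaves the area.

    Each dollar received is respent at rate respend_share, and a fraction
    leak_share of that respending goes outside the area being measured.
    """
    kept = respend_share * (1.0 - leak_share)
    return 1.0 / (1.0 - kept)


# Hypothetical output response per period to a $1 spending shock.
# The first entry (0.4) occurs before the study starts measuring.
path = [0.4, 0.3, 0.25, 0.15]
total = measured_multiplier(path, 0)     # whole response
observed = measured_multiplier(path, 1)  # misses the anticipation effect

# Hypothetical town: half of income is respent, and half of that leaks away.
whole_economy = local_multiplier(0.5, 0.0)
town_only = local_multiplier(0.5, 0.5)

print(round(total, 2), round(observed, 2))
print(round(whole_economy, 2), round(town_only, 2))
```

Under these assumptions the study sees 0.7 of a true 1.1, and the town-level estimate is about 1.33 against an economy-wide 2.0. In both cases the measured figure understates the full effect.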
