How Bad is Peer Review? Evidence From Undergraduate Referee Reports on the Currency Unions and Trade Lit

In a recent paper, Glick and Rose (2016) suggest that the Euro led to a staggering 50% increase in trade. To me, this sounded a bit dubious, particularly given my own participation in the previous currency unions and trade literature (which I wrote up here; my own research on this subject is here). This literature includes papers by Robert Barro that imply that currency unions increase trade on a magical 10-fold basis, and a QJE paper which suggests that currency unions even increase growth. In my own eyes, the Euro has been a significant source of economic weakness for many European countries in need of more stimulative policies. (Aside from the difficulty of choosing one monetary policy for all, it also appears that monetary policy has been too tight even for some of the titans of Northern Europe, including Germany. But that’s a separate issue…)
Given my skepticism, I gave my sharp undergraduates at NES a seek-and-destroy mission on the Euro Effect on trade. Indeed, my students found that the apparent large impact of the Euro, and other currency unions, on trade is in fact sensitive to controls for trends, and is likely driven by omitted variables. One pointed out that the Glick and Rose estimation strategy implicitly assumes that the end of the cold war had no impact on trade between the East and the West. Several of today’s Euro countries, or parts of them such as the former East Germany, were previously part of the Warsaw Pact. Any increase in trade between Eastern and Western European countries following the end of the cold war would clearly bias the Glick and Rose (2016) results, which naively compare the entire pre-1999 trade history with trade after the introduction of the Euro. Indeed, Glick and Rose assume that the long history of European integration (including the Exchange Rate Mechanism) culminating with the EU had no effect on trade, but that switching to the Euro from merely fixed exchange rates resulted in a magical 50% increase. Several of my undergraduates pointed out that this effect goes away when a simple time trend control is added. Others noted that the authors only clustered in one direction, rather than in the two or three directions one might naturally expect. In some cases, multi-way clustering reduced the t-scores substantially, although this didn’t seem to be critical. One student reasoned that the preferred regression results from GR (2016) don’t really suggest that CUs have a reliable impact on trade. The estimates from different CU episodes are wildly different: GR found that some CUs contract trade by 80%, while others have no statistically significant effect, some have a large effect, and others have an effect that is simply too large to be believed (50-140%).
Many of my students noted that there is an obvious endogeneity problem at play — countries don’t decide to join or leave currency unions randomly — and the authors did nothing to alleviate this concern. The currency union breakup between India and Pakistan is but one good example of the non-random nature of CU exits.
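The time-trend point raised by my students can be illustrated with a toy example. What follows is a minimal sketch on purely synthetic data, not the Glick and Rose dataset: log trade is assumed to grow at a steady 2% a year with no Euro effect at all, and the 1948 start year, the growth rate, and the function name are all hypothetical choices made for illustration. A before/after Euro dummy still picks up a large "effect", which disappears once a linear trend is included.

```python
import numpy as np


def euro_dummy_coefficients(start=1948, end=2013, euro_year=1999, growth=0.02):
    """Return the Euro-dummy coefficient without and with a linear trend.

    All inputs are hypothetical: the series is synthetic and contains
    NO true Euro effect, only a steady trend in log trade.
    """
    years = np.arange(start, end + 1)
    t = (years - start).astype(float)

    # Synthetic log trade: a pure linear trend, no jump in 1999.
    log_trade = growth * t

    # Dummy equal to 1 from the Euro's introduction onwards.
    euro = (years >= euro_year).astype(float)
    const = np.ones_like(t)

    # Naive before/after comparison: constant + Euro dummy only.
    X_naive = np.column_stack([const, euro])
    b_naive = np.linalg.lstsq(X_naive, log_trade, rcond=None)[0]

    # Same regression with a linear time trend added.
    X_trend = np.column_stack([const, euro, t])
    b_trend = np.linalg.lstsq(X_trend, log_trade, rcond=None)[0]

    return b_naive[1], b_trend[1]


naive, with_trend = euro_dummy_coefficients()
print(f"Euro dummy, no trend control:   {naive:.3f}")
print(f"Euro dummy, with trend control: {with_trend:.3f}")
```

Under these assumptions the naive coefficient is simply the difference between the post-1999 and pre-1999 means of a trending series (0.66 log points here), even though nothing happened in 1999. Real gravity regressions have many more controls, so this only shows the mechanism, not the size of any bias in the published estimates.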
You’d think that a Ph.D.-holding referee for an academic journal which is still ranked in the Top 50 (Recursive, Discounted, last 10 years) might at least be able to highlight some of these legitimate issues raised by undergraduates. You might imagine that a paper which makes some of the errors above might not get published, especially if, indeed, star economists face bias in the publication system. You might also imagine that senior economists, tenured at Berkeley/at the Fed, might not make these kinds of mistakes which can be flagged by undergraduates (no matter how bright) in the first place. You’d of course be wrong.
The results reported and the assumptions used to get there are so bad that you get the feeling these guys could have gotten away with writing “Get me off your fucking mailing list” a hundred times to fill up space.
Before ending I should note that I do support peer review, and also believe that economics research is incredibly useful when done well. But science is also difficult. This example merely highlights that academic economics still has plenty of room for improvement, and that a surprisingly large fraction of published research is probably wrong. I should also add that I don’t mean to pick on this particular journal — if a big name writes a bad article, it is only a question of which journal will accept it. However, this view of the world suggests that comment papers, replications, and robustness checks deserve to be more valued in the profession than they are at present. Much of the problem with that line of work also stems from almost a willful ignorance of history. Thus, it’s also sad to see departments such as MIT scale back their economic history requirements in favor of more math. I don’t see this pattern resulting in better outcomes.
Update: Andrew Rose responds in the comments. Good for him! Here I consider each of his points.
Rose wrote: “-Get them to explain how that they could add a time trend to regression (2), which is literally perfectly collinear with the time-varying exporter and importer fixed effects.”
Sorry, but a Euro*year interactive trend, or, indeed, any country-pair trend, is not going to be even close to collinear with time-varying importer and exporter year fixed effects. The latter would be controlling for a general country trend, but not for trends in country-pair specific relationships. To be fair, regression of one trending variable on another with no control for trends is the most common empirical mistake people make when running panel regressions.
“-Explain to them how time-varying exporter/importer fixed effects automatically wipe out all phenomena such as the effects of the cold war and the long history of European monetary integration.”
Sorry, but that’s also not the case. A France year dummy, to be concrete, won’t do it. That would pick up trade between France and all other countries, including EU, EMU, and former Warsaw Pact countries. You’d need to put in a France*EU interactive dummy, for example. But such a dummy will kill the EMU. Below, I plot the evolution of trade flows over time (dummies in the gravity equation) for (a) all of Western Europe, (b) Western European EU countries, and (c) the original entrants to the Euro Area (plus Greece). What you can see is that, while trade between EMU countries was much higher after the Euro than before (your method), most of the increase happened by the early 1990s, in fact. Relative to 1998, trade even declined a bit by 2013. There’s nothing here to justify pushing a 50% increase.
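The collinearity claim is easy to check numerically. Below is a rough sketch on a made-up panel of four countries and five years; it assumes, hypothetically, that the specification contains exporter-year, importer-year, and country-pair fixed effects, and that countries 0 and 1 are the "Euro" pair. It is not a replication of regression (2). A common time trend adds nothing to the rank of the design matrix, since the country-year dummies already span it, while a trend specific to the Euro pairs adds one, so it is not collinear with those fixed effects.

```python
import itertools
import numpy as np


def design_ranks(n_countries=4, n_years=5, euro=(0, 1)):
    """Rank of a toy gravity design matrix, with and without trends.

    Hypothetical setup: directed pairs (i, j) with i != j, observed every
    year; `euro` lists the countries treated as Euro members.
    """
    pairs = list(itertools.permutations(range(n_countries), 2))
    pair_index = {p: k for k, p in enumerate(pairs)}
    rows = [(i, j, t) for (i, j) in pairs for t in range(n_years)]
    n = len(rows)

    exp_year = np.zeros((n, n_countries * n_years))  # exporter-year dummies
    imp_year = np.zeros((n, n_countries * n_years))  # importer-year dummies
    pair_fe = np.zeros((n, len(pairs)))              # country-pair dummies
    common_trend = np.zeros((n, 1))                  # same trend for everyone
    pair_trend = np.zeros((n, 1))                    # trend for Euro pairs only

    for r, (i, j, t) in enumerate(rows):
        exp_year[r, i * n_years + t] = 1.0
        imp_year[r, j * n_years + t] = 1.0
        pair_fe[r, pair_index[(i, j)]] = 1.0
        common_trend[r, 0] = t
        if i in euro and j in euro:
            pair_trend[r, 0] = t

    base = np.hstack([exp_year, imp_year, pair_fe])
    rank = np.linalg.matrix_rank
    return (
        rank(base),
        rank(np.hstack([base, common_trend])),
        rank(np.hstack([base, pair_trend])),
    )


base, with_common, with_pair = design_ranks()
print("fixed effects only:        rank", base)
print("plus common time trend:    rank", with_common)
print("plus Euro-pair time trend: rank", with_pair)
```

The rank rises by exactly one when the Euro-pair trend is added, which is what it means for that coefficient to be identified alongside the fixed effects.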

Rose also indicated that my undergraduates should “Read the paper a little more carefully. For instance, consider; and a) the language in the paper about endogeneity b) Table 7 which explicitly makes the point about different currency unions.”
Actually, let’s do that. When I search for “endogeneity” in the article, the first hit I get is on page 8, where it is asserted that including country-pair fixed effects controls for endogeneity. Indeed, it does control for time-invariant endogeneity. But if countries, such as India and Pakistan, have changing relations over time (such as before and after partition), this won’t help.
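To make the fixed-effects point concrete, here is a small sketch on invented numbers (three hypothetical country pairs, ten years, and a made-up "conflict" variable; none of this comes from the paper's data). The true currency-union effect is set to zero. One pair leaves its currency union in year 5, one year after relations sour and trade falls. Pair fixed effects alone attribute the fall in trade to the CU exit; controlling for the time-varying shock removes the spurious effect.

```python
import numpy as np


def within_estimates(n_years=10):
    """Pair fixed-effects (within) estimates of a CU 'effect' on toy data.

    Hypothetical data-generating process: log trade depends only on a
    pair-specific level and a time-varying conflict shock. The true
    effect of currency-union membership is zero by construction.
    """
    t = np.arange(n_years)
    levels = np.array([5.0, 3.0, 4.0])  # time-invariant pair effects

    cu = np.zeros((3, n_years))
    conflict = np.zeros((3, n_years))
    cu[0] = t < 5         # pair 0 is in a currency union in years 0-4
    conflict[0] = t >= 4  # relations sour from year 4 onwards

    log_trade = levels[:, None] - 1.0 * conflict  # no CU term at all

    def demean(x):
        # Within transformation: subtract each pair's own mean.
        return (x - x.mean(axis=1, keepdims=True)).ravel()

    y, d, c = demean(log_trade), demean(cu), demean(conflict)

    # Pair fixed effects only: regress demeaned trade on demeaned CU.
    beta_fe = np.linalg.lstsq(d[:, None], y, rcond=None)[0][0]

    # Pair fixed effects plus the time-varying confounder.
    beta_ctrl = np.linalg.lstsq(np.column_stack([d, c]), y, rcond=None)[0][0]

    return beta_fe, beta_ctrl


beta_fe, beta_ctrl = within_estimates()
print(f"CU coefficient, pair FE only:          {beta_fe:.2f}")
print(f"CU coefficient, pair FE plus conflict: {beta_ctrl:.2f}")
```

In this toy setup the fixed-effects-only estimate is 0.8 log points even though membership does nothing. In real data the confounder is rarely observed this cleanly, which is exactly why the exits coinciding with wars and partitions matter.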
The second hit I get is footnote 7: “Our fixed‐effects standard errors are also robust. We do not claim that currency unions are formed exogenously, nor do we attempt to find instrumental variables to handle any potential endogeneity problem. For the same reason we do not consider matching estimation further, particularly given the sui generis nature of EMU.” [the bold is mine.] Actually, a correction here: my undergrads reported that the FE standard errors are actually clustered, but this is a minor point. You may not claim that currency unions are formed exogenously, but, as you admit, your regression results do nothing to try to reduce the endogeneity problem. And this despite the fact that I had already shared my own research with you (ungated version), which showed that your previous results were sensitive to omitting CU switches coterminous with wars, ethnic cleansing episodes, and communist takeovers.
Also, the “For the same reason” above is a bit strange. The sentence preceding it doesn’t give a reason why you don’t try to handle the endogeneity problem. So what is the reason? In fact, a matching-type estimator would be advisable here.
