From a great height, and in the popular-culture memory of the 2020 election, the polls falsely predicted a big and easy Joe Biden win over Donald Trump, and it wound up so close that people were rioting over the results two months later. A more sophisticated take would include Republican over-performance in congressional and especially state legislative contests as well.
So is it finally time, as some Know-Nothings always insist, to abolish polling or at least refuse ever to pay attention to polls? No, it’s not, at least for those of us who agree that even flawed data is better than no data. But an educated reckoning with polling error is in order, and Nate Silver, whose FiveThirtyEight site involves a lot of slicing and dicing of polling data, offers a good first take that is already spurring some debate. Silver is offering this analysis in conjunction with a new and substantially revised Pollster Ranking, which is itself an indication of how 2020 has affected our understanding of who is and isn’t more or less getting things right.
As documented by Silver and others, here are some observations on what we learned about political polling during and after 2020:
Polling inaccuracy was high, but probably not as high as it seemed
According to FiveThirtyEight’s analysis, the combined average polling error of surveys of the presidential, congressional, and gubernatorial races over the 21 days prior to elections in 2019-2020 was 6.3 percent. That was significantly higher than average, but actually lower than in the 2015-2016 cycle. Average error in the presidential general election of 2020, however, was over 10 percent (the same as it was in 2016, actually). Keep in mind that this involves comparing polls somewhat far out with the final results, so part of this “error” could actually reflect late trends rather than inaccurate polling.
Remember as well that national polls reflect the national popular vote, not state-by-state results. Polls showing Biden with a high single-digit lead (the final RealClearPolitics average margin for Biden was 7.2 percent) seem way off after an Election Night that felt very tense and close, and in light of how close Trump came to pulling off an Electoral College win. Still, Biden ultimately won the national popular vote by 4.5 percent, the largest presidential margin since 2008.
The close battle for control of the House and the Senate also came as a surprise to some pollsters. But then again, both expectations and the GOP advantage in individual races distorted perceptions a bit. RealClearPolitics polling averages gave Democrats a 6.8 percent congressional generic ballot advantage. Democrats ultimately won the national House popular vote by 3.1 percent, but that was only enough to give them a narrow House majority at a time when pundits thought they’d build on their existing majority. As for the Senate, FiveThirtyEight’s largely poll-based projections misfired on just one contest, in North Carolina. There were polling errors, but in terms of the bottom-line results, they weren’t off by much. As Silver puts it, the overall dynamics of 2020 meant that “a decidedly mediocre year for the polls was being mistaken for a terrible one when that conclusion wasn’t necessarily justified.”
Polls did skew Democratic to an unusual extent
Polling errors can occur in either direction, of course. And part of what made 2020 polls seem so far off is that they tended to err in a pro-Democratic direction, with an average statistical bias of 4.8 percent, as Silver observes:
Interestingly, the bias was actually smaller for Trump’s presidential race against Biden (4.2 points) than in races for Congress or governor. But either way, that isn’t a good performance: It’s the largest bias in either direction in the cycles covered by our pollster ratings database, exceeding the previous record of a 3.8-point Republican bias in 1998.
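To make the distinction between "average error" and "statistical bias" concrete, here is a minimal sketch. It assumes error is the average absolute miss on the margin and bias is the average signed miss, which is the usual convention but may not match FiveThirtyEight's exact method. The poll numbers and function names are invented for illustration and are not real 2020 data.

```python
from statistics import mean

# Hypothetical polls: margins are the Democratic lead in points.
# These figures are made up for illustration, not taken from any real survey.
polls = [
    {"poll_margin": 8.0, "actual_margin": 4.0},
    {"poll_margin": 3.0, "actual_margin": 5.0},
    {"poll_margin": 6.0, "actual_margin": 1.0},
    {"poll_margin": -2.0, "actual_margin": -4.0},
]

def signed_misses(records):
    """Poll margin minus actual margin; positive means the poll leaned Democratic."""
    return [r["poll_margin"] - r["actual_margin"] for r in records]

def average_error(records):
    """Average size of the miss, ignoring which party it favored."""
    return mean(abs(m) for m in signed_misses(records))

def statistical_bias(records):
    """Average signed miss; misses in opposite directions cancel out."""
    return mean(signed_misses(records))

print(average_error(polls))     # size of the typical miss
print(statistical_bias(polls))  # net lean across all polls
```

In this toy data one poll misses in the Republican direction, so the bias comes out smaller than the average error. When nearly every poll misses the same way, as Silver describes for 2020, the two numbers converge.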
The New York Times’s Nate Cohn thinks this evidence of “systemic bias” is far more troubling than raw error numbers, in part because it suggests systemic mistakes in either sampling voters or weighting the results. But there’s not much consensus as to the nature of the problem. One popular Republican theory (known as the “shy Trump voter,” itself a variation on the “shy Tory voter” hypothesis in the United Kingdom a while back) is that for various reasons GOP voters were loath to respond honestly to pollsters. But a very different explanation (one that David Shor pioneered), and one that may just indicate 2020 was an aberration, is that during the COVID-19 pandemic Democratic voters were more likely to stay home, and thus were more likely to respond to pollsters. Still another 2020-specific explanation is that Democrats were, in this election only, disproportionately prone to voting by mail, and a lot of them ran into obstacles doing so and either didn’t vote or had their votes discarded.
But some analysts — including Cohn — have been insisting since 2016 that pollsters are systemically failing to reach Trump-leaning demographic groups, or are underestimating their likelihood to vote. That could indicate a problem that won’t go away automatically, particularly if Trump himself continues to stay in the public eye.
2020 turned perceptions of good and bad pollsters nearly upside down
For those of us who read and interpret polls as part of our efforts to understand political phenomena, the “who got it right?” looks at the 2020 results have come as a shock. Pollsters thought to have shoddy methodologies and partisan bias did relatively well, as Silver (perhaps ruefully) acknowledged:
[L]et’s give a shout-out to the pollsters with the lowest average error. Those were AtlasIntel (2.2 percentage points), Trafalgar Group (2.6 points), Rasmussen Reports/Pulse Opinion Research (2.8 points), Harris Insights & Analytics (3.3 points) and Opinion Savvy/InsiderAdvantage (3.5 points).
Noting the pro-GOP slant of most of these outlets, Silver observes that having partisan “house effects” doesn’t represent “bias” if you get the results right! Then again, it’s possible they got the results right for the wrong reasons (“house effects” rather than discerning underlying dynamics other pollsters missed).
In any event, it’s equally clear a lot of big-name media-sponsored polls had a bad year in 2020, notably Monmouth (which only called 45 percent of the contests correctly in the final 21 days of the 2020 cycle, and had an average error of over ten percent), Quinnipiac, and SSRS (a frequent CNN polling partner). Interestingly, these poor performers (in 2020, at least) all use a live telephone methodology, while the above 2020 “winners” are all robocallers or online pollsters (or use a combination of non-live-caller methods). And that leads to another realization:
It’s no longer clear there’s such a thing as “gold-standard polls”
The term “gold-standard polls” has in the recent past been applied to surveys that adhered to top industry standards for transparency, and also used live telephone interviews as their primary technique. As Silver explains in great detail, while transparency still does have an impact on the verifiable reliability of polls, he’s no longer willing to grant the traditional live-interview firms special status, or a bonus in FiveThirtyEight’s pollster rankings.
If you read those rankings, you may still see a lot of familiar names near the top that didn’t have a great year in 2020 (e.g., Monmouth with an A) because the system is based on data from multiple election cycles. But there are others that would never have met any traditional “gold standard” definitions (e.g., robocall/online pollsters Emerson College and Trafalgar Group, who both get an A-minus).
Polling averages still make the most sense, for now
The problem with treating all polls as worthless, or choosing polling data selectively based on whose results fit some preconceived expectation of what polls “should” show, is that it really does push political analysis in the direction of spin and genuinely made-up results. Looking at polling averages doesn’t eliminate the risk of industry-wide error or bias, but it does reduce it somewhat. And just because even “good” pollsters have a bad year doesn’t mean they are no better than the charlatans (and they pop up every cycle) who just make up what they are paid to report with no real methodology at all.
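A small worked example shows why averaging helps only "somewhat." It is a sketch under a simplifying assumption: each poll's miss is split into a shared, industry-wide lean plus a pollster-specific miss. All numbers and names are invented for illustration, and the pollster-specific misses are chosen to cancel exactly, which real ones would only do approximately.

```python
from statistics import mean

# Hypothetical setup, not real data.
TRUE_MARGIN = 4.0                     # actual result, in points
SHARED_BIAS = 3.0                     # lean common to every poll
POLLSTER_NOISE = [-4.0, 2.5, -1.5, 3.0]  # individual misses; sum to zero here

def simulate_polls(true_margin, shared_bias, noise):
    """Each poll reports the truth plus the shared lean plus its own miss."""
    return [true_margin + shared_bias + n for n in noise]

def individual_misses(polls, true_margin):
    """Absolute miss of each poll taken on its own."""
    return [abs(p - true_margin) for p in polls]

def average_miss(polls, true_margin):
    """Absolute miss of the simple polling average."""
    return abs(mean(polls) - true_margin)

polls = simulate_polls(TRUE_MARGIN, SHARED_BIAS, POLLSTER_NOISE)
print(individual_misses(polls, TRUE_MARGIN))  # misses vary widely poll to poll
print(average_miss(polls, TRUE_MARGIN))       # only the shared lean remains
```

In this setup the average is closer than the worst individual poll because the pollster-specific misses cancel, but it still carries the full shared lean. That is the industry-wide risk averaging cannot remove.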
Treating polling data as just part of the picture in analyzing political races (which many traditional handicappers like the Cook Political Report and Sabato’s Crystal Ball do) is the safest approach. And even many poll-focused analysts (like, yes, FiveThirtyEight) insist on expressing their projections in terms of probabilities rather than “predictions” or “calls” to remind everyone that no poll is infallible.
But as the proliferation of critiques and defenses of political polling illustrates, these are difficult times for that industry and for the writers, gabbers, campaigns, and elected officials who rely on it. Maybe 2020’s issues were mostly the product of changing methodologies, a pandemic, and the ever-disruptive Donald J. Trump. If so, maybe future polls could show more accurate results.