Image courtesy of American Statistical Association
Who will win, Biden or Trump? Countless polling organizations, academic researchers, news organizations, and campaign consultants are spending hours poring over data and constructing sophisticated models to answer this million-dollar question.
Let’s take a look at the prediction models.
Four categories of prediction models
Based on theoretical approach, prediction models can be sorted into four categories.
First, the most popular and intuitive forecasts are based on polling data. People state their vote intentions, which are aggregated and then reported as percentages of public support. The percentages are generally accompanied by a margin of error, which reflects the uncertainty that comes from surveying a probability sample rather than the entire electorate.
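To make the margin of error concrete, here is a minimal sketch of the textbook calculation for a single poll. It assumes a simple random sample and a 95% confidence level; the poll size and support level are hypothetical, and real polling firms typically apply weighting and design adjustments that this ignores.

```python
import math

def margin_of_error(share, sample_size, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample.

    z=1.96 corresponds to a 95% confidence level (an assumption of this sketch).
    """
    return z * math.sqrt(share * (1.0 - share) / sample_size)

# Hypothetical poll: 1,000 respondents, 50% backing a candidate.
moe = margin_of_error(0.50, 1000)
print(f"+/- {moe * 100:.1f} percentage points")  # prints "+/- 3.1 percentage points"
```

Because the sample size sits under a square root, quadrupling the number of respondents only cuts the margin of error in half.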
A second category ignores polling data and utilizes economic and political data. The idea is that political and economic “fundamentals” structure the vote. It’s like home field advantage in sports. The home field influences the players, teams, and fans, and through them the outcome. Favorable economic and political conditions benefit the incumbent party while unfavorable conditions aid the challenger. Changes in the Gross National Product over the first two quarters of the election year and Gallup’s election year presidential approval ratings are fundamentals that are used to predict election outcomes.
A third category combines polling data and fundamentals. This approach is considerably more sophisticated and generates predictions for both the national popular vote and for states. Observers can then assess the Electoral College contest as well as the popular vote.
The final category includes a variety of unique methods. One model uses performances in presidential primaries to forecast general election winners, while another includes a series of criteria that the incumbent must achieve in order to be declared a winner. Still others rely on stock market thresholds, betting markets, and citizens’ expectations to determine the outcome.
Be cautious about election polls
When asked the million-dollar question, I typically echo the logic of the second category, the pure fundamental modelers.
Election polls fluctuate considerably. In September 2016, for example, Hillary Clinton led Donald Trump by about 5 percentage points. A couple of weeks later, polls showed Clinton over Trump by 14 percentage points. The swings are maddening – see chart below.
The news media then breathlessly report every short-term change in opinion polls as a serious transformation in electoral fortunes. The changes are attributed to many things – perhaps a debate gaffe, a major speech, a hair out of place, or even a scandal.
Nevertheless, the variation in poll results often stems from statistical and sampling errors – not substantive change. Moreover, different polling firms sample different populations (registered voters, likely voters, etc.), at different times, through different modes (online, in person, or a combination), and with very different survey instruments. This yields a picture of considerable change, but not genuine shifts in voter preferences. Finally, even when campaign events influence people’s vote intentions, that does not mean they affect the outcome of the election.
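A small simulation illustrates how much polls can move through sampling error alone. This is a sketch under stated assumptions: the true level of support, the sample size, and the number of polls are all hypothetical, and every simulated poll is a simple random sample from the same unchanging electorate.

```python
import random

def simulate_poll(true_share, sample_size, rng):
    """Return the observed share of support in one simple random sample."""
    hits = sum(1 for _ in range(sample_size) if rng.random() < true_share)
    return hits / sample_size

rng = random.Random(2020)   # fixed seed so the run is repeatable
TRUE_SHARE = 0.52           # hypothetical true support, held constant throughout
SAMPLE_SIZE = 800           # hypothetical number of respondents per poll

# Twenty polls of the same electorate; opinion never changes between them.
polls = [simulate_poll(TRUE_SHARE, SAMPLE_SIZE, rng) for _ in range(20)]
spread = max(polls) - min(polls)
print(f"lowest {min(polls):.3f}, highest {max(polls):.3f}, spread {spread:.3f}")
```

Any gap between the lowest and highest simulated poll arises purely from chance, since no voter changed their mind. Differences in sampled populations, timing, and survey mode would add further variation on top of this.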
Distinct from intense partisans, general election voters in the summer months are just beginning to consider the alternatives. During late September and October, voters will eventually come to support a candidate based on the information they learned during the campaign and based on their initial predispositions. Political scientists Gelman and King call this attitudinal dynamic enlightened preferences. A gradual reduction in poll variations signals the presence of enlightened preferences – see above.
3 things to remember about election polls
The figure below demonstrates three important lessons about election polls. Note that the black line represents changes in support for Republican candidates across the days before the election – these are average changes across thousands of surveys during the respective elections. The dashed line signifies a 50/50 share of the two-party vote. Thus, above that line, the share favors Republicans; below the line, the share favors Democrats. Importantly, the triangle on the right-hand side represents the actual election outcome. Finally, the arrows at approximately -100 days before Election Day stand for party conventions.
- For most election years, opinion polls are inadequate predictors of the actual election outcome. Six months out (-200 days), polls are not even close. In 1992, for example, polls strongly favored George H. W. Bush, showing approximately 60% of people intending to vote for the Republican. Only in 1960, an exceedingly close election, and 1972, a blowout, did election polls present an accurate picture of Election Day results.
- After the party conventions, pre-election polls do much better – though even after the conventions there is substantial variation. The vertical axis below is divided in large increments of .20. If the vertical axis appeared in much smaller increments, of say .05, the variation in polls would be even more pronounced. Gelman and King in fact suggest that for almost the entire campaign, it would be unwise to use polls to forecast election outcomes.
- For virtually every election, polls do ultimately converge to a point very close to the actual outcome.
In summary, now is the time to consider pre-election polls. Dismiss prior polling averages and keep in mind that variation across polls will begin to shrink and then converge near the actual result. As the data show, it’s not always a perfect match. But poll averages a week or so before Election Day are a good approximation of the outcome.
This conclusion strikes many as both intuitive and disappointing. After all, it seems logical that precision increases closer to Election Day. To use a sports analogy, predictions are more accurate as the horses turn for home than when they break from the gates. But of course bets are placed before the race begins.
And that is the chief advantage of the fundamentals model. It forecasts the outcome long before the campaign starts.
Months before the 2016 general election, a prominent fundamentals model predicted a Hillary Clinton popular vote win at 51%. In reality, Clinton secured 48.2%. Six months later, in November, FiveThirtyEight’s poll-based model had Clinton at 45.7% and the RealClearPolitics polling average stood at 46%. All three forecasts missed their target, yet each came within 3 percentage points.
The key point concerns time. The fundamentals model released its prediction in June. The polling models’ final predictions were published 6 months later, just before Election Day.
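The arithmetic behind that comparison is easy to check. The short sketch below uses only the figures quoted in this post; the labels and variable names are illustrative.

```python
ACTUAL = 48.2  # Clinton's share of the popular vote, as quoted in this post

# Forecasts quoted in this post, in percent.
forecasts = {
    "Fundamentals model (months ahead)": 51.0,
    "FiveThirtyEight (November)": 45.7,
    "RealClearPolitics average (November)": 46.0,
}

# Signed error: positive means the forecast overshot the actual share.
errors = {name: round(value - ACTUAL, 1) for name, value in forecasts.items()}
for name, err in errors.items():
    print(f"{name}: {err:+.1f} points")
```

The fundamentals model overshot by 2.8 points, while the two poll-based figures undershot by 2.5 and 2.2 points. The misses are of similar size, but one of them was made months earlier.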
I am reminded of a basketball game where, in the last 2 minutes, one team led the other by 20 points. A real-time win probability model popped up on the TV screen and showed a 99.9% chance the leading team would win. Then in another game, with only seconds remaining and a tie score, the win probability showed 50%.
Hmmm… what is the value added? Is this forecasting?
If you are interested in understanding elections, and making predictions about their outcomes, the fundamentals model offers a theory and then announces a prediction well in advance of the actual competition.
For 2020 predictions, look for next week’s post!
 The other advantage is parsimony. The models generally include three or fewer predictor variables.
 Tracking back to June 2016, the FiveThirtyEight model projected 42.1% support for Clinton and RealClearPolitics 43%.