By Thomas Cooley, Ben Griffy, and Peter Rupert
Three states are facing or currently undergoing a recount of votes cast, after a number of computer scientists reported some evidence of problems with the electronic voting. This finding was heavily disputed in the media, and seemingly little evidence was produced to support the conclusion that there was malfeasance in counties with electronic voting. Indeed, following the initial media response, the lead computer scientist backed away from initial reports, saying that there are flaws in electronic voting that could be easily exploited, and that an audit is important, but there isn’t direct evidence. We use our data to explore the claim that counties with electronic voting exhibited different voting patterns than their paper peers. What we find is definitely troubling: in some of the swing states, and specifically in states that were projected to vote Democratic at the top of the ticket, those with electronic voting had a decrease in the percent of the total vote going for the Clinton-Kaine campaign, and an increase for the Trump-Pence campaign. We try to determine if this is spurious by checking for patterns in other places with electronic voting, as well as during the 2012 election. We only find this correlation for swing states during the 2016 election.
We use the American Community Survey (5-year) for demographics (race, age, gender, education), data from the BLS on unemployment (October 2016 preliminary estimate), and data from the BEA on personal income (2015 estimates; more recent estimates include many fewer counties). We use data from Verified Voting for voting machine type (here), which lists the type, make, and model of voting machine by county for all states. Finally, we use voting data from Politico for the 2012 and 2016 elections, as well as data from CNN for the 2008 election. We have updated our data slightly since our last post, and the updated file is available here.
First, we graphically explore areas where various attributes (i.e., race, gender, education, income, unemployment, population size) do a good job explaining election outcomes, and areas where they do a worse job. We started this in a previous post, and continue along those same lines. We estimate how much of the shift in voting patterns can be explained by these attributes by running a regression (including state fixed effects). We then use the regression’s predictions to assess how far each county is from its predicted outcome. Graphically, these differences are as follows:
A “blue” county is one in which the Clinton campaign outperformed what would be predicted by the county’s demographic and economic characteristics, while a “red” county is one in which the campaign underperformed. The set of attributes does a good job explaining the election outcomes, with more than 90 percent of the counties falling less than 3 percentage points above or below our prediction. There do appear to be geographic patterns, however, in the over- or underperformance. Now, here’s the map of counties with electronic voting machines:
Green counties are those that exclusively employ paper balloting methods, while yellow counties are ones that employed either a mix of paper and electronic voting, or electronic voting exclusively. It’s worth noting that only 76 counties in the entire country use only electronic voting machines, with nearly all of these located in Pennsylvania. Now, as a visual explanation of what we will do, compare the two maps above. If you focus on the swing states (Wisconsin, Pennsylvania, North Carolina, and Florida), what you see is a pattern emerging in which our model underpredicts Democratic support in counties where paper ballot methods are prevalent, and overpredicts Democratic support in counties where electronic voting methods are prevalent. In other words, counties with electronic voting machines are (visually) less likely to vote for Clinton than we would expect given their demographic makeup. Importantly, this pattern does not appear in states that were never considered swing states (e.g., Texas, California, Washington, Illinois), where there is visually no correlation between voting methods and support. Focusing on Wisconsin, Pennsylvania, North Carolina, and Florida, we see:
Here, we remove all counties with only paper voting, and focus on four key states that employ a mix of electronic and paper voting. Yellow counties are those with electronic voting that voted disproportionately for the Republican ticket relative to what their demographics would predict. Key areas, specifically population centers in each state, appear to have voted less frequently for the Democratic ticket than would be predicted by their characteristics. But of course, visual inspection can be deceiving, so we now turn to more robust analysis.
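For readers who would rather see the mechanics than the maps, here is a minimal Python sketch of the residual calculation behind the “unexplained variation” map. It is an illustration, not the Stata code linked at the end of the post: the function name `fit_residuals`, the two covariates, and all of the numbers are made up, and the real specification uses the full set of attributes listed above.

```python
import numpy as np

def fit_residuals(X, y, state_ids, weights):
    """Population-weighted least squares of y on X plus one dummy per state.

    Returns the coefficients on X and each county's residual
    (actual outcome minus predicted outcome).
    """
    states = sorted(set(state_ids))
    # One indicator column per state plays the role of the state fixed effects.
    dummies = np.array([[float(s == t) for t in states] for s in state_ids])
    design = np.hstack([X, dummies])
    # Scaling rows by sqrt(weight) turns weighted least squares into plain least squares.
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    return coef[: X.shape[1]], y - design @ coef

# Hypothetical counties (all numbers invented for illustration).
# Columns: percent with a college degree, unemployment rate.
X = np.array([[30.0, 4.0], [22.0, 6.0], [41.0, 3.5],
              [18.0, 7.0], [35.0, 5.0], [27.0, 4.5]])
state_ids = ["PA", "PA", "PA", "WI", "WI", "WI"]
# Change in Democratic vote share, 2012 to 2016, in percentage points.
y = np.array([-2.0, -8.5, 1.0, -11.0, -3.0, -6.5])
# County population in thousands, used as regression weights.
population = np.array([120.0, 45.0, 300.0, 20.0, 150.0, 60.0])

beta, resid = fit_residuals(X, y, state_ids, population)
# Positive residual: the Democratic ticket did better than predicted ("blue" on the map).
share_within_3 = float(np.mean(np.abs(resid) < 3.0))
print("residuals:", np.round(resid, 2))
print("share of counties within 3 points of prediction:", share_within_3)
```

A positive residual corresponds to a “blue” county and a negative one to a “red” county; the last line is the same kind of summary as the “within 3 percentage points” figure quoted above, computed here on toy data.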
To assess whether there were inconsistencies in swing states for counties with electronic voting, we use the same specification as above, but include an indicator variable for whether a county is in one of Florida, North Carolina, Pennsylvania, or Wisconsin, as well as an indicator for whether the county employs electronic voting machines (EVM in the table below).
The coefficient of interest is the last one: it says that being in a swing state and having electronic voting in a county was associated with a 0.8 percentage point decrease in support for the Clinton campaign relative to support for the Obama campaign in 2012, after controlling for the attributes. This result is statistically significant, meaning that electronic voting machines in a county, or things that might be correlated with electronic voting machines in a county, are able to explain some of the results in these states. Ok, sorry, but here is a little “techy” stuff: we include state fixed effects (i.e., we account for how the overall state changed its vote during the election), employ clustered standard errors, and weight the counties by their population. This result is not limited to these four swing states (the effect is larger if you include states that were considered swing states but went Democratic, like Colorado). Our code and data are available here: code, data, for those who wish to explore this result. We look at these four states because they were predicted to go Democratic before the election, and because exit polling the night of the election also put them squarely in the Democratic column:
If we expand our group of states to include other “swing states,” these results continue to hold as well. One notable exception is Ohio, whose counties exhibited a positive association between electronic voting and the difference in voting patterns. For Ohio, it’s important to note that a large number of votes (over 20%) were cast by mail prior to the election, and that polls as early as October 28th were suggesting that the state would move to the Republican column. This may not be entirely satisfactory, but we wouldn’t necessarily expect to detect an effect if large numbers of ballots were cast in advance. Our exit poll data were obtained from TDMS Research, and are “unadjusted (night of)” exit polls; Edison Research alters its exit polls after the election to better reflect the electorate that it believes voted. It’s worth noting that these unadjusted exit polls have been shown to be unreliable in the past.
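For the curious, the swing-state specification described in this section (state fixed effects, population weights, clustered standard errors, and a swing-state-by-EVM term) can be sketched in a few lines of Python. Treat this as a rough illustration under stated assumptions, not as the Stata code linked at the end of the post: the helper names are hypothetical, we assume clustering at the state level, no small-sample correction is applied, and the data are synthetic with a planted effect purely to exercise the code.

```python
import numpy as np

def build_design(controls, evm, state_ids, swing_states):
    """Columns: controls, EVM, swing x EVM, then one dummy per state."""
    states = sorted(set(state_ids))
    dummies = np.array([[float(s == t) for t in states] for s in state_ids])
    swing = np.array([float(s in swing_states) for s in state_ids])
    # The swing-state main effect is constant within a state, so the state
    # dummies absorb it; only the interaction enters as its own column.
    design = np.hstack([controls, evm[:, None], (swing * evm)[:, None], dummies])
    return design, controls.shape[1] + 1  # index of the swing x EVM column

def wls_cluster(design, y, weights, clusters):
    """Weighted least squares with a cluster-robust (sandwich) covariance."""
    Xw = design * weights[:, None]
    bread = np.linalg.inv(design.T @ Xw)      # (X'WX)^-1
    coef = bread @ (Xw.T @ y)
    resid = y - design @ coef
    cov = np.zeros_like(bread)
    for g in set(clusters):
        idx = np.array([c == g for c in clusters])
        score = Xw[idx].T @ resid[idx]         # sum of w_i * x_i * e_i in cluster g
        influence = bread @ score
        cov += np.outer(influence, influence)  # bread @ meat @ bread, built cluster by cluster
    return coef, np.sqrt(np.diag(cov))

# Synthetic example (assumption: 4 states, 25 counties each, effect of -0.8 planted by hand).
rng = np.random.default_rng(0)
state_ids = [s for s in ["FL", "NC", "TX", "CA"] for _ in range(25)]
n = len(state_ids)
swing_states = {"FL", "NC"}
controls = rng.normal(size=(n, 2))
evm = rng.integers(0, 2, size=n).astype(float)
population = rng.uniform(10.0, 500.0, size=n)
swing = np.array([float(s in swing_states) for s in state_ids])
y = (controls @ np.array([1.0, -0.5]) - 0.8 * swing * evm
     + rng.normal(scale=0.5, size=n))

design, k = build_design(controls, evm, state_ids, swing_states)
coef, se = wls_cluster(design, y, population, state_ids)
print("swing x EVM coefficient:", round(float(coef[k]), 3), "clustered SE:", round(float(se[k]), 3))
```

With only four clusters the standard error here is not something to lean on; the point is only to show where each ingredient of the specification enters.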
Of course, what we find could simply be spurious correlation, or simply a correlation between the placement of electronic voting machines and some underlying factor that was correlated with additional support for the Republican ticket. We can’t directly discount these explanations, but we can explore the variation in voting patterns among states that were never considered swing states. If these “non swing states” exhibit the same type of pattern, i.e., electronic voting machines implied fewer votes for the Democratic ticket, then we would think that electronic voting machines are more common in places that changed their votes in the election for some other reason. We first explore this for four strongly Republican states: Arkansas, Missouri, West Virginia, and Kansas. The counties in these states exhibited approximately the same average change in support for the Democratic ticket as the swing states: -6.6% on average for counties in swing states, and -7.4% for counties in the strongly Republican states. They also have about the same prevalence of electronic voting machines, with 53% of swing counties having electronic voting, and 50% of strongly conservative counties having electronic voting. The results are as follows:
Unlike before, there is no correlation between electronic voting and a change in support for either party. Note that we can include larger strongly conservative states like Texas, and the results still hold. Now, is there any pattern in strongly Democratic-leaning states, like California, Illinois, Washington, and Virginia?
Again, we find no correlation. Note that we use Virginia because it contains variation in electronic voting, though it is arguably still a swing state.
This is pretty strong evidence (we believe) that counties in swing states with electronic voting are different in some important way that isn’t captured by some underlying correlation across the country. If we thought that there was some non-random placement of electronic voting machines across the country, we would expect the pattern from the swing states to hold up nationwide. It does not, which suggests that these differences are limited to places that were expected to be close during the election.
Finally, we repeat the same exercise for swing states during the 2012 election. Data on electronic voting for the 2012 election is also available from Verified Voting, and is included in our data for analysis. For this, we choose Florida, North Carolina, Virginia, and Ohio, states that were expected to be close during the 2012 election and also contain counties with and without electronic voting. What we find is the following:
For the 2012 election, no correlation arises between electronic voting and the change in support in states that were expected to swing the election. This again suggests that our results for the 2016 election are not simply spurious correlations.
It’s also worth noting that even if we assigned all counties in the country paper voting, the size of the effect is not large enough to change the election:
But it’s hard to tell what the real size of the effect would be without more detailed data.
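To make the logic of this back-of-the-envelope exercise concrete, here is a small Python sketch. It is a guess at the arithmetic rather than a description of how the map above was produced: we assume the estimated effect is undone by moving that many percentage points of each electronic-voting county’s total vote from the Republican to the Democratic ticket, and the function name, field names, and county numbers are all hypothetical.

```python
def counterfactual_margins(counties, effect_pp):
    """Statewide Democratic-minus-Republican vote margin after undoing an assumed EVM effect.

    Each county is a dict with keys: state, dem, rep, total, evm.
    effect_pp is the assumed effect size in percentage points of the county's total vote.
    """
    margins = {}
    for c in counties:
        # Assumption: only counties flagged as using electronic voting are adjusted.
        shift = c["total"] * effect_pp / 100.0 if c["evm"] else 0.0
        dem = c["dem"] + shift
        rep = c["rep"] - shift
        margins[c["state"]] = margins.get(c["state"], 0.0) + (dem - rep)
    return margins

# Hypothetical two-county state (numbers invented for illustration).
counties = [
    {"state": "X", "dem": 480, "rep": 520, "total": 1000, "evm": True},
    {"state": "X", "dem": 260, "rep": 240, "total": 500, "evm": False},
]

actual = counterfactual_margins(counties, 0.0)
adjusted = counterfactual_margins(counties, 0.8)
print("actual margin:", actual["X"])
print("margin with the assumed effect undone:", adjusted["X"])
```

A state only flips in this exercise if the adjusted margin changes sign, which is the comparison the statement above is making.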
It’s tough to draw precise conclusions as to what these correlations mean. It’s still possible that there are factors other than electronic voting driving our results. But what we do know is that results in key swing states differ in counties with electronic voting. Further, the patterns in these counties are not exhibited by other similar, but not electorally important, counties across the country. Additionally, electronic voting showed no such correlation in swing states during the 2012 election. Taken together, it seems tough to dismiss the correlations that we have found in the data. While we don’t know how to interpret the findings practically, they certainly lend credence to the efforts to initiate recounts in several of the swing states.
uncleaned data: link
cleaned data: link
Stata code: link
github code (note, some of this code is mildly out of date; will update soon): link
Unexplained Variation map: link
Voting Machines map: link
Exit Polls map: link
Outcome with no Electronic Voting map: link