By Thomas Cooley, Ben Griffy, and Peter Rupert
Three states are facing or currently undergoing a recount of votes cast, after a number of computer scientists reported some evidence of problems with the electronic voting. This finding was heavily disputed in the media, and seemingly little evidence was produced to support the conclusion that there was malfeasance in counties with electronic voting. Indeed, following the initial media response, the lead computer scientist backed away from the initial reports, saying that there are flaws in electronic voting that could be easily exploited, and that an audit is important, but there isn’t direct evidence. We use our data to explore the claim that counties with electronic voting exhibited different voting patterns than their paper peers. What we find is definitely troubling: in some of the swing states, and specifically in states that were projected to vote Democratic at the top of the ticket, counties with electronic voting had a decrease in the percent of the total vote going for the Clinton-Kaine campaign, and an increase for the Trump-Pence campaign. We try to determine if this is spurious by checking for patterns in other places with electronic voting, as well as during the 2012 election. We only find this correlation for swing states during the 2016 election.
Data:
We use the American Community Survey (5-year) for demographics (race, age, gender, education), data from the BLS on unemployment (October 2016 preliminary estimate), and data from the BEA on personal income (2015 estimates; more recent estimates include far fewer counties). We use data from Verified Voting for voting machine type (here), which lists type, make, and model of voting machine by county for all states. Finally, we use voting data from Politico for the 2012 and 2016 elections, as well as data from CNN for the 2008 election. We have updated our data slightly since our last post, and the updated file is available here.
Results:
First, we graphically explore areas where various attributes (i.e. race, gender, education, income, unemployment, population size) do a good job explaining election outcomes, and areas where they do a worse job explaining the outcome. We started this in a previous post, and continue along those same lines. We find how much of the shift in voting patterns can be explained by these attributes by running a regression (including state fixed effects). We then use these predictions to assess how far each county is from their predicted outcome. Graphically, these differences are as follows:
A “blue” county is one in which the Clinton campaign outperformed what would be predicted by the county’s demographic and economic characteristics, while a “red” county is one in which the campaign underperformed. The set of attributes does a good job explaining the election outcomes, with more than 90 percent of the counties falling less than 3 percentage points above or below our prediction. There do appear to be geographic patterns, however, in the over- or underperformance. Now, here’s the map of counties with electronic voting machines:
Green counties signify counties that exclusively employ paper balloting methods, while yellow counties are ones that employed either a mix of paper and electronic voting, or electronic voting exclusively. It’s worth noting that only 76 counties in the entire country use only electronic voting machines, with nearly all of these located in Pennsylvania. Now, as a visual explanation of what we will do, compare the two above maps. If you focus on the swing states (Wisconsin, Pennsylvania, North Carolina, and Florida), what you see is a pattern emerging in which our model underpredicts Democratic support in counties where paper ballot methods are prevalent, and overpredicts Democratic support in counties where electronic voting methods are prevalent. In other words, counties with electronic voting machines are (visually) less likely to vote for Clinton than we would expect given their demographic makeup. Importantly, this pattern does not appear to be visually present in states that were never considered swing states, i.e. Texas, California, Washington, Illinois, where there is visually no correlation between voting methods and support. Focusing on Wisconsin, Pennsylvania, North Carolina, and Florida, we see
Here, we remove all counties with only paper voting, and focus on four key states that employ a mix of electronic and paper voting. Yellow counties are those with electronic voting that disproportionately voted for the Republican ticket when compared to their county demographics. Key areas, specifically population centers in each state, appear to have voted less frequently for the Democratic ticket than would be predicted by their characteristics. But of course, visual inspection can be deceiving, so we now turn to more robust analysis.
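For readers who want to see the mechanics behind the “unexplained variation” map, here is a minimal sketch of the residual calculation: fit a regression of the change in Democratic share on county attributes plus state dummies, then subtract the prediction from the actual outcome. The eight county rows, the two covariates, and the helper names (`solve`, `fit_ols`, `map_color`) are all made up for illustration. Our actual specification uses more attributes and population weights, which this sketch leaves out.

```python
# All county rows are made up for illustration; they are NOT the post's data.
# Fields: state, college-educated share, median income ($1000s),
#         change in Democratic vote share (percentage points).
counties = [
    ("A", 0.30, 52.0, -4.0), ("A", 0.10, 41.0, -9.5),
    ("A", 0.22, 47.0, -6.0), ("A", 0.15, 39.0, -8.0),
    ("B", 0.35, 60.0, -2.0), ("B", 0.12, 44.0, -7.5),
    ("B", 0.28, 50.0, -5.0), ("B", 0.18, 46.0, -6.5),
]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(X, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

states = sorted({c[0] for c in counties})

def design_row(county):
    """One dummy per state (the state fixed effects) followed by the attributes."""
    state, college, income, _ = county
    dummies = [1.0 if state == s else 0.0 for s in states]
    return dummies + [college, income]

X = [design_row(c) for c in counties]
y = [c[3] for c in counties]
beta = fit_ols(X, y)

predicted = [sum(b * x for b, x in zip(beta, row)) for row in X]
residuals = [actual - fitted for actual, fitted in zip(y, predicted)]

def map_color(residual):
    """Blue: the Democratic ticket beat the prediction. Red: it fell short."""
    return "blue" if residual > 0 else "red"

for county, r in zip(counties, residuals):
    print(county[0], round(r, 2), map_color(r))
```

Because the model includes a dummy for every state, the residuals average to zero within each state, so a county’s color only says how it did relative to similar counties in its own state.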
To assess whether there were inconsistencies in swing states for counties with electronic voting, we use the same specification as above, but include an indicator variable for whether a county is in one of Florida, North Carolina, Pennsylvania, or Wisconsin, as well as an indicator for whether the county employs electronic voting machines (EVM in the table below).
The coefficient of interest is the last one. It says that being in a swing state and having electronic voting in a county was associated with a 0.8 percentage point decrease in support for the Clinton campaign relative to support for the Obama campaign in 2012, after controlling for the attributes. This result is statistically significant, meaning that electronic voting machines in a county, or things that might be correlated with electronic voting machines in a county, are able to explain some of the results in these states. Ok, sorry, but here is a little “techy” stuff: we include state fixed effects (i.e., we account for how the overall state changed its vote during the election), employ clustered standard errors, and weight the counties by their population. This result is not limited to these four swing states (it is a larger effect if you include states that were considered swing states, but went Democratic, like Colorado). Our code and data are available here: code, data for those who wish to explore this result. We look at these four states because they were predicted to go Democratic before the election, and because exit polling the night of the election also put them squarely in the Democratic column:
If we expand our group of states to include other “swing states,” these results continue to hold as well. One notable exception is Ohio, whose counties exhibited a positive association between electronic voting and difference in voting patterns. For Ohio, it’s important to note that a large number of votes (over 20%) were cast by mail prior to the election, and that polls as early as October 28th were suggesting that the state would move to the Republican column. This may not be entirely satisfactory, but we wouldn’t necessarily expect to detect an effect if large numbers of ballots were cast in advance. Our exit poll data were obtained from TDMS Research, and are “unadjusted (night of)” exit polls; Edison Research alters its exit polls after the election to better reflect the electorate that it believes voted. It’s worth noting that these unadjusted exit polls have been shown to be unreliable in the past.
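To make the swing-state specification described above concrete, here is a rough sketch of a population-weighted regression with state fixed effects and a swing × EVM interaction. It is not our Stata code. The rows are synthetic, built so that the outcome equals a state effect plus 0.25 for EVM counties minus 1.0 for EVM counties in swing states, which makes it easy to check that the fit recovers what was planted. The names (`weighted_demean`, `fit_interaction`) are hypothetical, the demographic controls are left out, and clustered standard errors are not computed.

```python
from collections import defaultdict

# Synthetic inputs, NOT the post's data. The planted coefficients (0.25 and
# -1.0) are arbitrary and chosen only so the fit can be checked.
STATE_EFFECT = {"S1": -6.0, "S2": -7.0, "N1": -3.0, "N2": -8.0}
SWING = {"S1": 1, "S2": 1, "N1": 0, "N2": 0}
PLANTED_EVM, PLANTED_INTERACTION = 0.25, -1.0

raw = [  # (state, has_evm, population)
    ("S1", 1, 120_000), ("S1", 0, 40_000), ("S1", 1, 15_000),
    ("S2", 0, 300_000), ("S2", 1, 80_000), ("S2", 0, 22_000),
    ("N1", 1, 55_000), ("N1", 0, 250_000), ("N1", 1, 9_000),
    ("N2", 0, 70_000), ("N2", 1, 33_000), ("N2", 0, 18_000),
]
rows = [
    (s, SWING[s], evm, pop,
     STATE_EFFECT[s] + PLANTED_EVM * evm + PLANTED_INTERACTION * SWING[s] * evm)
    for s, evm, pop in raw
]

def weighted_demean(values, states, weights):
    """Subtract each state's population-weighted mean.

    This absorbs the state fixed effects. The swing-state indicator is constant
    within a state, so it is absorbed too and needs no coefficient of its own.
    """
    num, den = defaultdict(float), defaultdict(float)
    for v, s, w in zip(values, states, weights):
        num[s] += w * v
        den[s] += w
    return [v - num[s] / den[s] for v, s in zip(values, states)]

def fit_interaction(rows):
    """Weighted least squares of the outcome on EVM and swing*EVM, within state."""
    states = [r[0] for r in rows]
    w = [float(r[3]) for r in rows]
    evm = weighted_demean([r[2] for r in rows], states, w)
    inter = weighted_demean([r[1] * r[2] for r in rows], states, w)
    y = weighted_demean([r[4] for r in rows], states, w)

    # Weighted normal equations for two regressors, solved by Cramer's rule.
    a11 = sum(wi * e * e for wi, e in zip(w, evm))
    a12 = sum(wi * e * i for wi, e, i in zip(w, evm, inter))
    a22 = sum(wi * i * i for wi, i in zip(w, inter))
    b1 = sum(wi * e * yi for wi, e, yi in zip(w, evm, y))
    b2 = sum(wi * i * yi for wi, i, yi in zip(w, inter, y))
    det = a11 * a22 - a12 * a12
    beta_evm = (b1 * a22 - a12 * b2) / det
    beta_interaction = (a11 * b2 - a12 * b1) / det
    return beta_evm, beta_interaction

beta_evm, beta_interaction = fit_interaction(rows)
print("EVM:", beta_evm, "swing x EVM:", beta_interaction)
```

On these synthetic rows the fit returns the planted values up to floating-point rounding. The interaction coefficient plays the role of the last row in our table: it measures how much more (or less) EVM counties shifted in swing states than EVM counties elsewhere.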
Of course, what we find could simply be spurious correlation, or simply a correlation between the placement of electronic voting machines and some underlying factor that was correlated with additional support for the Republican ticket. We can’t directly discount these explanations, but we can explore the variation in voting patterns among states that were never considered swing states. If these “non swing states” exhibit the same type of pattern, i.e. electronic voting machines implied fewer votes for the Democratic ticket, then we would think that electronic voting machines are more common in places that changed their votes in the election for some other reason. We first explore this for four strongly Republican states: Arkansas, Missouri, West Virginia, and Kansas. The counties in these states exhibited approximately the same average change in support for the Democratic ticket when compared with the swing states, -6.6% on average for counties in swing states, and -7.4% for counties in the strongly Republican states. They also have about the same prevalence of electronic voting machines, with 53% of swing counties having electronic voting, and 50% of strongly conservative counties having electronic voting. The results are as follows:
Unlike before, there is no correlation between electronic voting and a change in support for either party. Note that we can include larger strongly conservative states like Texas, and the results still hold. Now, is there any pattern in strongly Democratic-leaning states, like California, Illinois, Washington, and Virginia?
Again, we find no correlation. Note that we use Virginia because it contains variation in electronic voting, though it is arguably still a swing state.
This is pretty strong evidence (we believe) that counties in swing states with electronic voting are different in some important way that isn’t captured by some underlying correlation across the country. If we thought that there was some non-random placement of electronic voting machines across the country, we would expect the pattern from the swing states to hold up nationwide. It does not, which suggests that these differences are limited to places that were expected to be close during the election.
Finally, we repeat the same exercise for swing states during the 2012 election. Data on electronic voting for the 2012 election is also available from Verified Voting, and is included in our data for analysis. For this, we choose Florida, North Carolina, Virginia, and Ohio, states that were expected to be close during the 2012 election and also contain counties with and without electronic voting. What we find is the following:
For the 2012 election, no correlation arises between electronic voting and the change in support in states that were expected to swing the election. This again suggests that our results for the 2016 election are not simply spurious correlations.
It’s also worth noting that even if we assigned all counties in the country paper voting, the size of the effect is not large enough to change the election:
But, it’s hard to tell what the real size of the effect would be without more detailed data.
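As a back-of-the-envelope illustration of the “no electronic voting” exercise, the sketch below adds an assumed effect back to every EVM county and recomputes the statewide margin. The three counties are invented, and the way the effect is applied is an assumption of the sketch, not something taken from our code: the effect is treated as a share of each EVM county’s total vote that either moves from the Republican to the Democratic ticket (`transfer=True`) or is simply added to the Democratic ticket (`transfer=False`).

```python
# Invented counties for illustration only; NOT the post's data.
counties = [
    {"name": "County A", "evm": True, "total": 100_000, "dem": 46_000, "rep": 51_000},
    {"name": "County B", "evm": False, "total": 50_000, "dem": 27_000, "rep": 21_000},
    {"name": "County C", "evm": True, "total": 200_000, "dem": 92_000, "rep": 104_000},
]

def actual_margin(counties):
    """Democratic minus Republican votes, summed over counties."""
    return sum(c["dem"] - c["rep"] for c in counties)

def counterfactual_margin(counties, effect_pp, transfer=True):
    """Margin if the EVM effect (in percentage points of total vote) were undone.

    Assumption of this sketch: in each EVM county, effect_pp percent of the
    total vote is credited to the Democratic ticket. With transfer=True the
    same votes are also removed from the Republican ticket.
    """
    dem = rep = 0.0
    for c in counties:
        shift = c["total"] * effect_pp / 100.0 if c["evm"] else 0.0
        dem += c["dem"] + shift
        rep += c["rep"] - (shift if transfer else 0.0)
    return dem - rep

before = actual_margin(counties)
after = counterfactual_margin(counties, effect_pp=0.8)
print("margin before:", before, "after:", round(after))
print("winner flips:", (before < 0) != (after < 0))
```

In this toy state the Republican margin shrinks but the winner does not change, which is the same kind of comparison the map above makes state by state. How the shift is split between the two tickets matters a lot for the size of the change, which is one reason the real size of the effect is hard to pin down.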
It’s tough to draw precise conclusions as to what these correlations mean. It’s still possible that there are other factors driving our results, other than electronic voting. But, what we do know is that results in key swing states differ in counties with electronic voting. Further, the patterns in these counties are not exhibited by other similar but not electorally important counties across the country. Additionally, electronic voting showed no association with voting patterns in swing states during the 2012 election. Taken together, it seems tough to dismiss the correlations that we have found in the data. While we don’t know how to interpret the findings practically, they certainly lend credence to the efforts to initiate recounts in several of the swing states.
Links:
uncleaned data: link
cleaned data: link
Stata code: link
github code (note, some of this code is mildly out of date; will update soon): link
Interactive maps:
Unexplained Variation map: link
Voting Machines map: link
Exit Polls map: link
Outcome with no Electronic Voting map: link
Sensible to take the high level perspective, but I suspect you can do more with the data. Eg, in MD, electronic voting is commonly coupled with a paper copy. Hackers of the sophistication necessary to pull off an exploit like this would probably know that, and avoid working where a paper trail would show the machines had been hacked. Controlling for voting machines with paper receipts might be helpful. Voting patterns in counties along the MD-PA border could be particularly revealing. Demographically these counties are very similar, but the voting technologies vary.
Hi Erick, thanks for the comment. Just to be clear, we aren’t saying that the patterns we’ve found are caused by hacking, we are just presenting the data. In response to your question, my understanding is that most voting machines do have some sort of paper receipt. The website referenced in the post, verifiedvoting.org, has really detailed information about which counties have voting machines with paper trails. Notably, Pennsylvania almost exclusively uses electronic voting without paper trails. Your idea to use counties along the MD-PA border is a good one and we will look into it! You might think though that people would sort into one side of the border or the other based on things like income tax, etc., which might be correlated with voting behavior.
Thanks for reading!
My apologies for duplicate comment. Yeah, different state tax regimes etc. might drive location choices or attitudes that influence or are otherwise correlated to how people vote. By comparing within state, you remove that source of differentiation. If there are enough observations to give you statistical insights, you might look at the difference between counties on either side of PA’s borders, holding things including voting technology constant, to see if the within PA difference between a paper county and voting machine without paper county, or a voting machine with a paper trail county and a voting machine without a paper trail, differs from the within MD differences, holding other things constant (if I got that right!). Sure you’re on top of it. Looking forward to what you do next.
Yeah, you’re right, I think that would control for the effects about which I was concerned. I’ll try to look into it over the next couple of days. If you go into our data post from about a week ago, we have all the data on voting machine types, including (I believe) whether or not the machines have a paper trail. The most concerning is that much of PA doesn’t have paper trails, which is pretty shocking.
Scanned this quickly, but AFAICT there is no mention of exit polling. Your preamble says there is “little evidence” of election fraud, but there was very considerable deviation of election results from exit polls – the international gold standard for detection of fraud.
How can any reputable study of 2016 election fraud completely ignore exit polling discrepancy?
Hi, appreciate the comment. We do actually discuss exit polling briefly in the body of the article. We have a graphic about midway through the page that shows the exit polling the night of the election (this is an important distinction). That said, exit polling is a good, though imprecise, measure of outcomes in an election. We think that the statistical evidence we presented is a much stronger indicator of irregularities than deviations from exit polling, though we use these deviations to inform our search among the states.
One important thing to note is that exit polling is “revised” following the election to better reflect the outcome of the election. This makes it very challenging to assess the validity of exit polls post-election, as these initial polls are often no longer available. Because of this revision, it often appears that exit polls are more predictive of the outcome than they were the night of the election, which is why it might seem like they are strongly predictive when looking at previous elections.
We also link to the best source for the unadjusted exit polls: http://tdmsresearch.com/2016/11/10/2016-presidential-election-table/, so feel free to take a look there if you want to see what the predictions looked like the night of the election!
Thanks again for the comment!
Would it be possible to see if the political party of the county clerk (or whoever is designated to run elections) has an impact on the difference between recorded and expected vote?
In theory, yes. In practice, I think this would be a really challenging dataset to assemble. We were able to put together the election dataset by virtue of large amounts of information being available together. Having to scrape the secretary of state’s website in 50 states is a tall order. It’s a good thought though.
Hmm. Evidence is building! https://www.nytimes.com/2017/06/05/us/politics/reality-winner-contractor-leaking-russia-nsa.html