Electronic Voting Machines and the Election

By Thomas Cooley, Ben Griffy, and Peter Rupert

Three states are facing or currently undergoing a recount of votes cast, after a number of computer scientists reported some evidence of problems with the electronic voting. This finding was heavily disputed in the media, and seemingly little evidence was produced to support the conclusion that there was malfeasance in counties with electronic voting. Indeed, following the initial media response, the lead computer scientist backed down from initial reports, saying that there are flaws in electronic voting that could be easily exploited, and that an audit is important, but there isn’t direct evidence. We use our data to explore the claim that counties with electronic voting exhibited different voting patterns than their paper peers. What we find is definitely troubling: in some of the swing states, and specifically in states that were projected to vote Democratic at the top of the ticket, those with electronic voting had a decrease in the percent of the total vote going for the Clinton-Kaine campaign, and an increase for the Trump-Pence campaign. We try to determine if this is spurious by checking for patterns in other places with electronic voting, as well as during the 2012 election. We only find this correlation for swing states during the 2016 election.

Data:

We use the American Community Survey (5-year) for demographics (race, age, gender, education), data from the BLS on unemployment (October 2016 preliminary estimate), and data from the BEA on personal income (2015 estimates; more recent estimates include many fewer counties). We use data from Verified Voting for voting machine type (here), which lists type, make, and model of voting machine by county for all states. Finally, we use voting data from Politico for the 2012 and 2016 elections, as well as data from CNN for the 2008 election. We have updated our data slightly since our last post, and the updated file is available here.

Results:

First, we graphically explore where various attributes (i.e. race, gender, education, income, unemployment, population size) do a good job explaining election outcomes, and where they do a worse job. We started this in a previous post, and continue along those same lines. We find how much of the shift in voting patterns can be explained by these attributes by running a regression (including state fixed effects). We then use these predictions to assess how far each county is from its predicted outcome. Graphically, these differences are as follows:

resid.png
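The calculation behind a map like this can be sketched in a few lines of Python. This is a minimal illustration with one covariate and made-up numbers, not the regression used for the map: it removes each state's average (which is what state fixed effects do), fits a single slope, and reports how far each county sits from its prediction. The function names and data are hypothetical.

```python
from collections import defaultdict

def demean_by_group(values, groups):
    """Subtract each group's mean from its members (the 'within' transform)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]

def fixed_effects_residuals(y, x, states):
    """Slope and residuals from regressing y on x with one intercept per state."""
    y_w = demean_by_group(y, states)
    x_w = demean_by_group(x, states)
    slope = sum(a * b for a, b in zip(x_w, y_w)) / sum(a * a for a in x_w)
    return slope, [yi - slope * xi for yi, xi in zip(y_w, x_w)]

# Made-up numbers, for illustration only.
states = ["A", "A", "A", "B", "B", "B"]
unemployment = [4.0, 5.0, 6.0, 3.0, 5.0, 7.0]              # percent
change_in_support = [-1.0, -2.0, -2.4, -4.0, -5.5, -6.0]   # percentage points

slope, residuals = fixed_effects_residuals(change_in_support, unemployment, states)
for state, r in zip(states, residuals):
    label = "above prediction" if r > 0 else "below prediction"
    print(state, round(r, 2), label)
```

With several covariates the same idea applies, but the slope step becomes a multiple regression on the demeaned variables.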

A “blue” county is one in which the Clinton campaign outperformed what would be predicted by the county’s demographic and economic characteristics, while a “red” county is one in which the campaign underperformed. While the color scale may be slightly deceiving, the set of attributes does a good job explaining the election outcomes, with more than 90 percent of the counties falling less than 3 percentage points above or below our prediction. There do appear to be patterns, however, in the over- or underperformance. Now, here’s the map of counties with electronic voting machines:

voting_machines.png

Green counties signify counties that exclusively employ paper balloting methods, while yellow counties are ones that employed either a mix of paper and electronic voting, or electronic voting exclusively. It’s worth noting that only 76 counties in the entire country use only electronic voting machines, with nearly all of these located in Pennsylvania. Now, as a visual explanation of what we will do, compare the two maps above. If you focus on the swing states (Wisconsin, Pennsylvania, North Carolina, and Florida), what you will perhaps see is a pattern emerging in which our model underpredicts Democratic support in counties where paper ballot methods are prevalent, and overpredicts Democratic support in counties where electronic voting methods are prevalent. Importantly, this pattern does not appear to be visually present in states that were never considered swing states, e.g. Texas, California, Washington, and Illinois, where there is visually no correlation between voting methods and support. Focusing on Wisconsin, Pennsylvania, North Carolina, and Florida, we see:

FL_NC_PA_WI_Diffs.png

 

 

Here, we remove all counties with only paper voting, and focus on four key states that employ a mix of electronic and paper voting. Yellow counties are those with electronic voting that disproportionately voted for the Republican ticket when compared to their county demographics. Key areas, specifically population centers in each state, appear to have voted less frequently for the Democratic ticket than would be predicted by their characteristics. But of course, visual inspection can be deceiving, so we now turn to more robust analysis.

To assess whether there were inconsistencies in swing states for counties with electronic voting, we use the same specification as above, but include an indicator variable for whether a county is in one of Florida, North Carolina, Pennsylvania, or Wisconsin, as well as an indicator for whether the county employs electronic voting machines.

test_2016.PNG

The coefficient of interest is the last one: this says that being in a swing state and having electronic voting in a county was associated with a 0.8 percentage point decrease in support for the Clinton campaign relative to support for the Obama campaign in 2012, after controlling for the attributes. This result is statistically significant, meaning that electronic voting machines in a county, or things that might be correlated with electronic voting machines in a county, are able to explain some of the results in these states. Ok, sorry, but here is a little “techy” stuff: we include state fixed effects (i.e., we account for how the overall state changed its vote during the election), employ clustered standard errors, and weight the counties by their population. This result is not limited to these four swing states (it is a larger effect if you include states that were considered swing states, but went Democratic, like Colorado). Our code and data are available here: code, data for those who wish to explore this result. We look at these four states because they were predicted to go Democratic before the election, and because exit polling the night of the election also put them squarely in the Democratic column:

pres_state_exit_pct.png
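For readers who want to see the shape of the regression described above, here is a rough sketch in plain Python. It is not our Stata specification: it uses made-up counties, drops the demographic controls, and omits clustered standard errors entirely. State fixed effects enter as one dummy per state, counties are weighted by population, and the swing-state indicator appears only through its interaction with electronic voting, since state dummies already absorb anything constant within a state. All names and numbers are hypothetical, and the data are constructed without noise so that the interaction coefficient is -0.8 by design.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def weighted_least_squares(rows, y, weights):
    """Coefficients minimizing sum_i w_i * (y_i - rows_i . b)^2 (normal equations)."""
    k = len(rows[0])
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(k)]
            for i in range(k)]
    xtwy = [sum(w * r[i] * yi for r, yi, w in zip(rows, y, weights)) for i in range(k)]
    return solve(xtwx, xtwy)

# Made-up counties:
# (state, in_swing_state, electronic, population_thousands, change_in_support)
counties = [
    ("A", 1, 0, 10, -6.0),
    ("A", 1, 1, 40, -6.7),
    ("A", 1, 1, 25, -6.7),
    ("B", 0, 0, 15, -7.0),
    ("B", 0, 1, 30, -6.9),
    ("B", 0, 0, 20, -7.0),
]

state_names = sorted({c[0] for c in counties})
names = [f"state_{s}" for s in state_names] + ["electronic", "swing_x_electronic"]
rows = [
    [1.0 if state == s else 0.0 for s in state_names] + [float(e), float(swing * e)]
    for state, swing, e, _, _ in counties
]
y = [c[4] for c in counties]
weights = [c[3] for c in counties]

coef = dict(zip(names, weighted_least_squares(rows, y, weights)))
print({k: round(v, 3) for k, v in coef.items()})
```

The entry `swing_x_electronic` plays the role of the "last coefficient" in the table: the additional change in support associated with electronic voting in a swing state, beyond the association in other states.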

If we expand our group of states to include other “swing states,” these results continue to hold as well. One notable exception is Ohio, whose counties exhibited a positive association between electronic voting and difference in voting patterns. For Ohio, it’s important to note that a large number of votes (over 20%) were cast by mail prior to the election, and that polls as early as October 28th were suggesting that the state would move to the Republican column. This may not be entirely satisfactory, but we wouldn’t necessarily expect to detect an effect if large numbers of ballots were cast in advance. Our exit poll data were obtained from TDMS Research, and are “unadjusted (night of)” exit polls; Edison Research alters their exit polls after the election to better reflect the electorate that they believe voted. It’s worth noting that these unadjusted exit polls have been shown to be unreliable in the past.

Of course, what we find could simply be spurious correlation, or simply a correlation between the placement of electronic voting machines and some underlying factor that was correlated with additional support for the Republican ticket. We can’t directly discount these explanations, but we can explore the variation in voting patterns among states that were never considered swing states. If these “non swing states” exhibit the same type of pattern, i.e. electronic voting machines implied fewer votes for the Democratic ticket, then we would think that electronic voting machines are more common in places that changed their votes in the election for some other reason. We first explore this for four strongly Republican states: Arkansas, Missouri, West Virginia, and Kansas. The counties in these states exhibited approximately the same average change in support for the Democratic ticket when compared with the swing states: -6.6% on average for counties in swing states, and -7.4% for counties in the strongly Republican states. They also have about the same prevalence of electronic voting machines, with 53% of swing counties having electronic voting, and 50% of strongly conservative counties having electronic voting. The results are as follows:

placebo_2.PNG

Unlike before, there is no correlation between electronic voting and a change in support for either party. Note that we can include larger strongly conservative states like Texas, and the results still hold. Now, is there any pattern in strongly Democratic-leaning states, like California, Illinois, Washington, and Virginia?

placebo_1.PNG

Again, we find no correlation. Note that we use Virginia because it contains variation in electronic voting, though it is arguably still a swing state.

This is pretty strong evidence (we believe) that counties in swing states with electronic voting are different in some important way that isn’t captured by some underlying correlation across the country. If we thought that there was some non-random placement of electronic voting machines across the country, we would expect the pattern from the swing states to hold up nationwide. It does not, which suggests that these differences are limited to places that were expected to be close during the election.

Finally, we repeat the same exercise for swing states during the 2012 election. Data on electronic voting for the 2012 election is also available from Verified Voting, and is included in our data for analysis. For this, we choose Florida, North Carolina, Virginia, and Ohio, states that were expected to be close during the 2012 election and also contain counties with and without electronic voting. What we find is the following:

test_2012.PNG

For the 2012 election, no correlation arises between electronic voting and states that were expected to swing the election. This again suggests that our results for the 2016 election are not simply spurious correlations.

It’s also worth noting that even if we assigned all counties in the country paper voting, the size of the effect is not large enough to change the election:

pres_state_pct_no_electronic.png

But, it’s hard to tell what the real size of the effect would be without more detailed data.

It’s tough to draw precise conclusions as to what these correlations mean. It’s still possible that there are other factors driving our results, other than electronic voting. But what we do know is that results in key swing states differ in counties with electronic voting. Further, the patterns in these counties are not exhibited by other similar but not electorally important counties across the country. Additionally, electronic voting showed no such correlation in swing states during the 2012 election. Taken together, it seems tough to dismiss the correlations that we have found in the data. While we don’t know how to interpret the findings practically, they certainly lend credence to the efforts to initiate recounts in several of the swing states.

Links:

uncleaned data: link

cleaned data: link

Stata code: link

github code (note, some of this code is mildly out of date; will update soon): link

Interactive maps:

Unexplained Variation map: link

Voting Machines map: link

Exit Polls map: link

Outcome with no Electronic Voting map: link

 

 

November Employment: so-so

By Thomas Cooley, Ben Griffy, and Peter Rupert

Today the BLS announced that November payroll employment increased by 178,000. This was in line with expectations and consistent with recent months. Several of the headline numbers indicate a very strong jobs report: unemployment declined to its lowest level since August 2007. But these numbers mask the continued truncation in the labor force, as much of this decline was driven by a decline in the labor force participation rate. The establishment survey contained positive results for the employed.

empchgm-2016-12-02

Of the increase in employment, 156,000 were private sector jobs, up from 135,000 in October. The single largest category was the services sectors, providing 139,000 new jobs, which was more than the 128,000 created in October. About half of this came from professional services, while most of the rest was composed of education and health services. Government employment (local, state, and federal) increased by 22,000, up from 7,000 in October. Average weekly hours held constant at 34.4, having changed little over the past year:

avghours-2016-12-02.png

Hourly earnings showed a small decline, moving from $25.92 to $25.89 per hour and breaking a year of positive growth, though the year-over-year growth rate was still positive:

ahe-2016-12-02.png

As with last month’s employment report, the household survey again contained some less positive results for the US labor market. Unemployment continued to trend down, declining to 4.6 percent from 4.9 percent in October, its lowest since before the recession:

 

More inclusive measures (U6) also exhibited this downward trend. The rate for adult men declined from 4.6 to 4.3, and the rate for women declined from 4.3 to 4.2. Superficially, all of these statistics are very positive. However, much of the decline was driven not by new jobs, but by unemployed leaving the labor market, which contributed about a third of the decline in the unemployment rate:

uu6rate-2016-12-03

lfp-2016-12-02.png

Accounting for this decline was a large drop in reentrants and new entrants to the labor market, which combined to account for 144,000 of the overall 387,000 decline in unemployment levels.

unemp-composition-2016-12-02.png
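As a quick check on the magnitudes quoted above, the share of the decline in unemployment attributable to fewer reentrants and new entrants can be computed directly. The variable names are ours; the two figures are the ones cited in the text.

```python
decline_in_unemployment = 387_000   # overall decline in the number unemployed
from_fewer_entrants = 144_000       # part due to reentrants and new entrants

share = from_fewer_entrants / decline_in_unemployment
print(f"{share:.1%}")  # 37.2%
```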

Both of these statistics suggest that unemployment is a very persistent state for some workers, leading to discouragement among workers. Indeed, the household survey also reports a large uptick in marginally attached workers, from 1,700,000 to 1,932,000, with about half of this increase coming from discouraged workers.

The only real take-away is that indicators for the labor market are mixed at this point. For those who are attached to the labor market, there are positive signs about employment opportunities. The continuing concern is the decline in labor force participation. However, this report was sufficiently strong that it should not deter the Fed from making its expected move on interest rates at the next meeting.

 

Download Our Election Data

In order to facilitate broader discussion of the election, we have written a set of Python scripts to download and organize data relating to the election. There is still a fairly high barrier to obtaining election results, so we wanted to make a clean source available for those interested. In addition to the series discussed in our previous post (here), we have included data on voting machines for those who wish to explore questions related to the recount. The code will download election results, graph them, and merge them into a .csv for statistical analysis.

We have made them available through the following sources:

  1. github: here
  2. dropbox: here

To run it, install Python (we suggest Anaconda), edit the options, then open a terminal and run the “Main.py” program from the folder into which it was downloaded. It is likely that with a fresh installation of Python, additional modules will be necessary. These can be installed by opening a terminal and typing “pip install <module name>” without quotes, with the required module substituted for <module name>.
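One way to find out which modules are missing before running anything is a short check like the one below. It is only a sketch: the list of module names is a hypothetical placeholder (look at the import statements at the top of the scripts for the real list), and the name used with pip occasionally differs from the name used in an import statement.

```python
import importlib.util

def missing_modules(names):
    """Return the names that cannot be imported in this Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Hypothetical list; replace with the modules the scripts actually import.
required = ["pandas", "requests", "matplotlib"]

for name in missing_modules(required):
    print(f"pip install {name}")
```

If nothing is printed, every module in the list is already installed.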

Series available (County-level):

  1. 2016 Election (President, House, Senate, Governor)
  2. 2012 Election (President, House, Senate, Governor)
  3. 2008 Election (President, House, Senate, Governor)
  4. 2004 Election (President, House, Senate, Governor)
  5. Economic Statistics (unemployment, income, establishments, industries)
  6. Demographics (race, age, gender, education)

Series provided but not merged (County-level):

  1. 2002 industry composition
  2. Voting Rights Act coverage
  3. Voting machine type (paper, electronic, etc.)

The available options are explained and edited in the “Main.py” file. We will be gradually updating our code to include options for more series, as well as merging the “extra” series currently not merged.

Any coding contributions or comments are much appreciated.

Q3 GDP Revised Up

By Thomas Cooley, Ben Griffy, and Peter Rupert
Today’s second estimate of real GDP from the Bureau of Economic Analysis shows an increase of 3.2% for Q3. The advance estimate for Q3 had an increase of 2.9%. The final estimate for Q2 was also revised up to 1.4% from 1.1%. The year over year change (blue line) had been trending down for the past 5 quarters or so.
gdprealchgm-2016-11-29

The overall rise in real GDP was led by a 2.8% increase in real personal consumption expenditures (PCE) that contributed 1.9 percentage points to the gain in GDP. Compared to other recoveries this one is now quite mature, yet continues to grow at a steady pace.
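The two PCE figures above are linked by a simple rule of thumb: a component's contribution to GDP growth is roughly its share of GDP times its own growth rate. The BEA's published contributions come from a chain-weighted formula, so the snippet below is only an approximation, but it shows that the numbers are consistent with consumption being roughly two-thirds of GDP. The variable names are ours.

```python
pce_growth = 2.8         # percent, annualized growth in real PCE
pce_contribution = 1.9   # percentage points of GDP growth attributed to PCE

# Approximation: contribution ~ share_of_gdp * growth, so share ~ contribution / growth
implied_share = pce_contribution / pce_growth
print(round(implied_share, 2))  # 0.68
```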

pcerealchgm-2016-11-29

gdp-cyc-trough2016-11-29

There was also a large rise in exports, up 10.1%, while imports increased slightly, up 2.1%. Overall, net exports contributed 0.87 percentage points to GDP growth. Investment, on the other hand, continues to be weak, in both nonresidential (up 0.1%) and residential fixed investment (down 4.4%). Spending on equipment has declined in 6 of the last 8 quarters.

nrfireal-q-2016-11-29

residentialinv-2016-11-29

This GDP report certainly provides enough support for the FOMC to raise rates during their December 13-14 meeting. Friday’s jobs report is expected to reinforce the view that the economy is on a stable path and that monetary policy can be normalized.

How Trump Won

By Thomas Cooley, Ben Griffy, and Peter Rupert

At the start of Nov. 8th, most pundits would have been as shocked by a Donald Trump victory as they would have been by Harry Truman rising from his grave clutching a newspaper celebrating his 1948 electoral victory. Almost universally, onlookers predicted a large, if not resounding, victory for Hillary Clinton. And now, a week later, many of those pundits have begun to acknowledge their own hubris in their predictions.

We take the opportunity to explore this and the past several elections, to see what differences might have driven such an unexpected outcome. What we find is interesting: Once we control for the level of education and unemployment in a county, the proportion of white men in a county was not predictive of a higher likelihood of voting for Trump. Counties with higher unemployment and less education were much less likely to vote for the Democratic ticket than they were in 2012, while all race and gender groups appear to have been more likely to increase their vote for Clinton once demographics were included. Additionally, counties that were heavily employed in manufacturing closer to the enactment of NAFTA swung their vote away from Clinton and may have decided the election.

To do this analysis, we combined county-level election results for the previous two elections, 2012 and 2016, with a number of characteristics of those counties, including race and gender, education, unemployment, and employment by industry (2-digit), for the most recent years available (most often 2015). We also include the percent of the county employed in manufacturing jobs for the year 2002 (the earliest year available at the county level) to assess whether a narrative about NAFTA and trade may have had a role in determining the outcome of the election. We further merge information on the counties previously covered by the Voting Rights Act (prior to the Shelby decision, 2013) to see what impact lifting the pre-clearance requirement may have had on the election.

As one might expect, there is a strong geographic component to the outcome of the election. The coasts strongly supported Clinton, while the center overwhelmingly supported Trump:

clinton_dist.png

obama_dist.png

An interactive version of the maps presented here, as well as instructions on how to use them, is available at the bottom of the post. There are subtle but important differences between the geographic distribution of votes in these two elections. Notably, Democratic losses were concentrated in areas that were strongholds as recently as 2012:

change_2012-2016.png

What drove these differences? There’s no doubt that the results are, at least to some degree, a consequence of an undertow of racism, sexism, and homophobia that voters were able to exercise from the privacy of the voting booth. It’s also true that Hillary Clinton was an historically unpopular candidate, exceeding only her rival in popularity among presidential candidates. But it also seems that the economically dispossessed were willing to overlook these flaws to support Trump. The table below reports the marginal effect that a one percentage point change in each of a set of covariates had on the support for Clinton relative to Obama. We measure this change in support as the percent in a county voting for Clinton in 2016 minus the percent in that county that voted for Obama in 2012. The covariates are all on the same scale, between 0 and 100, meaning that a 10 percentage point increase in the unemployment rate in a county implies a 5 percentage point decrease in support for Clinton relative to Obama in 2012 (see the number corresponding to unemployment in the table below). We also use state fixed effects, meaning that these results are relative to the average change in the state.
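To make the definitions concrete, here is a small Python sketch of the outcome variable and of how a coefficient is read. The county figures are made up, the function names are ours, and the coefficient of -0.5 is simply the value implied by the "10 implies 5" example in the paragraph above rather than a number copied from the table.

```python
def change_in_support(clinton_2016_pct, obama_2012_pct):
    """Outcome variable: Clinton's 2016 share minus Obama's 2012 share (both 0-100)."""
    return clinton_2016_pct - obama_2012_pct

def predicted_shift(coefficient, covariate_change):
    """Predicted change in the outcome for a given change in one covariate."""
    return coefficient * covariate_change

# Made-up county: Obama won 47.5 percent in 2012, Clinton 41.0 percent in 2016.
print(change_in_support(41.0, 47.5))   # -6.5

# Coefficient implied by the example in the text (an assumption, see above).
unemployment_coefficient = -0.5
print(predicted_shift(unemployment_coefficient, 10))   # -5.0
```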

results.PNG

A quick read of this table reveals some interesting, and potentially surprising statistics. Counties with higher percentages of Hispanic and Latino voters turned out for Clinton, while counties with higher unemployment aligned with Trump’s populist message. The African American vote did not seem to improve Clinton’s outcomes, and we discuss some causes for this below. As has already been widely reported, counties with higher percentages of white men were more likely to support Trump, relative to 2012, which is shown by the cross-term in row 3 (remember that each variable is 0 to 100, so the cross-term ranges from 0 to 10000, potentially). There is an important subtlety here: once county-level demographic and economic characteristics are controlled for, counties with white men actually increased their vote for Clinton relative to how they voted in 2012, for almost all the combinations of percent male and percent white in the dataset. However, the cross-term in row 3 indicates that as either the percent white or percent male in a county increased, the margin got smaller, suggesting that highly white or male counties were less likely to vote for Clinton than their more diverse peers. Still, for all but the most white counties in the dataset, our model would predict that they would increase their vote for Clinton, relative to Obama in 2012. The dichotomy between what we see in our dataset and what we observed in the election is that the places that were overwhelmingly white and changed their votes to Trump also have higher rates of unemployment and higher percentages of residents with a high school degree or less. Nationally, the distribution of white males is shown below (counties in gray did not have the relevant data):

white_male.png
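The role of the cross-term discussed above is easier to see written out. The notation here is ours and purely illustrative (the estimated coefficients are in the table and are not reproduced): let $W_c$ and $M_c$ be the percent white and percent male in county $c$, both on a 0 to 100 scale, let $X_c$ collect the other covariates, and let $\alpha_{s(c)}$ be the fixed effect for the county's state.

```latex
\begin{align}
\Delta y_c &= \beta_W W_c + \beta_M M_c + \beta_{WM}\, W_c M_c
              + \gamma' X_c + \alpha_{s(c)} + \varepsilon_c \\
\frac{\partial\, \Delta y_c}{\partial W_c} &= \beta_W + \beta_{WM}\, M_c
\end{align}
```

With a negative $\beta_{WM}$, the effect of a higher white share on the change in support shrinks as the male share rises, and symmetrically for the male share. This is the sense in which the margin "got smaller" in highly white or highly male counties even when the predicted change in support for Clinton remained positive.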

This seems at least geographically consistent with the narrative that white men swung the election for Trump. Our interpretation is that covariates that might be strongly correlated with certain geographic regions, like unemployment and education, are strongly correlated with support for Trump. As shown above, those with a high school education or less strongly decreased their support for Clinton relative to their support for Obama in 2012. That distribution geographically is displayed here:

HS_Less.png

Again, it appears that these groups are concentrated in states that had a substantial impact on the election, though not as densely as one might expect. Somewhat surprisingly, repeating the analysis above with a variable that represents the percent of each county made up of white men with a high school education or less does not yield significant results. That is at least partially suggestive that the most common narrative following the election, that low-education white male voters swung states from Clinton to Trump, isn’t consistent with the data. Again, this is probably because there is a strong correlation between these groups and other characteristics. The result that we find most interesting comes from the variable labeled “Percent Employed in Manufacturing (2002),” the earliest year for which employment by sector is available at the county level. This means that counties with higher percentages of their workers employed in manufacturing sectors in 2002 were substantially less likely to vote for Clinton than they were for Obama just four years before. This could of course simply be correlation, but it’s also possible that these workers still hold the Clintons responsible for declining job prospects as a result of NAFTA in 1994. Where were these industries located? See below:

manufacturing_pct.png

We see that the percentage of individuals employed in manufacturing is fairly evenly distributed among states in the Midwest and the South. Remember that several key states in the election, Pennsylvania, Wisconsin, and Michigan, were decided by about 1.5 percent or less of the total vote, meaning that the shift in voting in these manufacturing heavy counties could have played a large role. Equally as important as the percent employed in manufacturing is the number of potential voters who were employed in these industries, and where they were located:

manufacturing_total.png

Unfortunately, many counties lack data on share of manufacturing from the 2002 data source. From the data we can obtain, among counties that switched votes from Democrat to Republican, those in the Midwest had higher percentages of their workforce employed in manufacturing, and larger numbers employed in those industries as well. Furthermore, these industries were highly concentrated in the “Rust Belt,” the states closest to the Great Lakes. These states had been traditional Democratic strongholds, but swung to the Republicans for the first time in several elections. With this data, we can only conjecture about whether this was a cause, but it does appear that counties with jobs that were more likely to leave following the adoption of NAFTA shifted their votes in large quantities to the Republican ticket.

Another interesting and important narrative in this election is the removal of the Voting Rights Act as a protection against impeding voter participation. Could this also have played a role in swinging the election? Prior to Shelby County v. Holder (2013), which ruled the pre-clearance requirement unconstitutional, there were a number of jurisdictions under the purview of Section 5 of the Voting Rights Act (link). When we repeat the same exercise as before, predicting the percent change in Democratic support within a county between 2016 and 2012, we come to an interesting and perhaps counter-intuitive conclusion: support for Clinton was higher in previously covered counties than for Obama in 2012, at least as a percentage of those voting. Doing the same analysis as above, but including an indicator variable for counties that were covered by the Voting Rights Act yielded the following:

vra_results.PNG

What this suggests is that voters in counties that had previously been under the protection of the Voting Rights Act increased their support for Clinton by 2 percent relative to 2012. Some of this could be a result of much negative rhetoric on the Republican side being targeted at the minority groups that were previously protected by the Voting Rights Act.

vra_results_total.PNG

Note that this is not the difference in the total number of ballots cast for the two candidates in this election, but the change in the number of ballots cast in total between 2012 and 2016. Given that the average number of ballots cast in a county was around 40,000, this decrease in counties that had been covered by the VRA is substantial. Overall, the number of ballots cast increased by an average of about 1,000 per county between 2012 and 2016, suggesting that turnout was substantially depressed in counties that were previously covered by the Voting Rights Act, though per our analysis, this didn’t seem to translate into a higher percentage of votes for the conservative on the ticket. And, at least graphically, it doesn’t seem like these differences could have swung the election:

vra.png

Given the geographic location of these covered counties, it seems unlikely that it directly played a role in shaping the presidential election, though it may have impacted North Carolina, and probably did have an impact in down-ballot races.

It’s still not entirely clear what drove such an unexpected result, but we think that the narrative needs some clarification. Having delved into the data, it appears that a long-standing disaffection with free trade may have driven a lot of Midwest voters to switch party allegiances they held as recently as 2012 and vote Republican. In places that determined the outcome of the election, states like Wisconsin, Ohio, Pennsylvania, and Michigan, a disproportionate number of people had been employed in industries (in 2002) that were most likely to be impacted by NAFTA.

Interactive Maps: To use these maps, click on the corresponding link. An html document will either download automatically or you will be prompted to download it. After downloading this document, either double click it or drag and drop it into your internet browser. This will open the interactive data. This slightly convoluted process is necessary because we cannot embed the graphics in WordPress.

Clinton Voting Distribution: link

Obama Voting Distribution: link

Change 2012 to 2016: link

Manufacturing (Percent): link

Manufacturing (Levels): link

Race and Gender: link

Education: link

Voting Rights Act: link