Who Voted in 2016? Using Fuzzy Forests to Understand Voter Turnout

31 January 2020, Version 1
This content is an early or alternative research output and has not been peer-reviewed at the time of posting.


What can machine learning tell us about who voted in 2016? There are numerous competing theories of turnout, and assessing which theory best explains turnout requires many covariates. This paper is a proof of concept that machine learning can help overcome this curse of dimensionality and reveal important insights in studies of political phenomena. We use Fuzzy Forests, an extension of Random Forests, to screen variables for a parsimonious but accurate prediction. Applied to the 2016 Cooperative Congressional Election Study, Fuzzy Forests selected only a few covariates as major correlates of turnout while still achieving high predictive performance. Our analysis yields three conclusions about 2016 turnout: registration and voting procedures were important; political issues were important (especially Obamacare, climate change, and fiscal policy); and few demographic variables other than age were strongly associated with turnout. We conclude that Fuzzy Forests is an important methodology for studying over-determined questions in the social sciences.


fuzzy forests
machine learning
variable screening
2016 election

Supplementary materials

Online Appendices
Online Appendices for Who Voted in 2016? Using Fuzzy Forests to Understand Voter Turnout

