Abstract
Political parties are increasingly homogeneous both ideologically and demographically. Given increased party-line voting, a natural corollary of this sorting is that membership in demographic groups should be increasingly prognostic of vote choice. We argue that the predictability of voting decisions is a useful quantity of interest for testing hypotheses from the literature on partisan and demographic sorting. Contrary to expectations, we find that demographic sorting has not produced a very predictable electorate. Tree-based machine learning models, trained on demographic labels from public opinion surveys between 1952 and 2020, predict only 63.5% of out-of-sample vote choices correctly on average. Moreover, demographics have not grown more predictive over time, whereas partisanship has. Partisanship's diagnosticity has risen in absolute terms, and its relative dominance over ideology has been stable for the last seven decades. Additional data about voters can still yield superior predictions, but its added value decreases over time as partisanship's predictive power grows.