I’m writing this in the context of numerous takes on the implications of Hillary Clinton’s clear victory in the national popular vote, and more generally of other “data-driven” analyses of electoral results that try to explain voter behavior, such as articles confidently asserting that Bernie Sanders would, or would not, have beaten Donald Trump.
The most important point is that the fact that Hillary Clinton won the popular vote, by itself, is approximately meaningless. If we had a system based on the popular vote, maybe more people on either side who took California for granted would have voted. Likewise with Texas, and every other state. No one has any idea. (Obviously both campaigns would have operated differently in a popular vote world, as has been commonly noted, but it’s entirely possible Trump would have won even if neither campaign had changed its strategy.)
The other point is that it’s probably hard to make sense of voting data without modeling voter regret in some way. Republicans in early primary states who didn’t like Trump, but stayed home on the assumption that he wouldn’t win, may have regretted their choice later on. Similarly, Democrats who stayed home in Michigan because the polls were predicting a clear Clinton victory may have regretted that choice on November 9th. This is the more detailed way of saying that Hillary Clinton’s popular vote victory might say more about Joe Biden’s popularity than her own, depending on how you model each vote.
As far as I can tell there’s no obvious way to get around this problem with data alone. Many people thought Donald Trump would lose Florida based on early voting data and historical trends in right-wing Election Day turnout. But the contest between Hillary Clinton and Donald Trump is obviously a historical anomaly, like pretty much every other presidential election, and therefore basing predictions on history is nothing more than a slightly educated guess.
I suspect it means something that Hillary Clinton won the popular vote. But any explanation of what that might be has to account for the fact that voting is a funny behavior: it is inconsistent over time and sensitive to the conditions under which the ballot is cast. Those conditions in turn depend, recursively, on what individual voters expect the rest of the population to do.
This amounts to a rather complicated model of voting that is probably impracticable. Instead we might try disciplined verbal reasoning about knowable facts.