# Democratic Primaries, FiveThirtyEight, and Markov Chains

At the moment I am very busy writing things like grant applications and research papers, so I lack the time for blog posts. But then I wasted part of my evening reading this article on FiveThirtyEight about the Democratic Party primaries in the US. For each Democratic primary contender, they provide the following data:

• How many of their current supporters do not consider voting for another candidate. For instance, for Biden this number is 21.9%, while for Buttigieg it is only 6.2%.
• Which other candidates their supporters are considering. For instance, 52.2% of Biden’s supporters also consider Warren, 39.2% Sanders, 28.9% Harris, and 24.8% Buttigieg.

## The Markov Chain

[Image: Andrey Andreyevich Markov]

With some imagination one can see this as a stochastic matrix that belongs to a Markov chain. In our case, that is a matrix indexed by the candidates Warren, Biden, Sanders, Buttigieg, and Harris. It gives the probabilities of how someone’s vote will change in one step (whatever a step is). For instance, a total of 21.9% of all Biden supporters will not vote for anyone else, so let’s say that the probability of them staying Biden voters in one step is 21.9%, while the remaining 78.1% will change their vote. What are the probabilities of them changing their vote to a particular candidate? Let’s just pretend that this is proportional to the number of Biden’s supporters who consider that particular candidate. When you scale accordingly (and if I did this correctly), then we end up with a probability of 28.0% for a switch to Warren, 21.0% for Sanders, 13.5% for Buttigieg, and 15.6% for Harris.

Doing this for all of the top candidates, that is Warren, Biden, Sanders, Buttigieg, and Harris in this order, we obtain the following transition matrix/stochastic matrix:

$A := \begin{pmatrix}0.116 & 0.3 & 0.232 & 0.179 & 0.174\\0.28 & 0.219 & 0.21 & 0.135 & 0.156\\0.306 & 0.3 & 0.146 & 0.111 & 0.137\\0.345 & 0.273 & 0.153 & 0.062 & 0.168\\0.31 & 0.297 & 0.184 & 0.157 & 0.052\end{pmatrix}$
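The proportional scaling for Biden's row can be sketched in a few lines of Python. This is only an illustration using the poll figures quoted above; the variable names are my own, and the results come out close to, but not exactly equal to, the rounded matrix entries, presumably because the underlying data were not rounded in the same way.

```python
# Poll figures quoted above for Biden's supporters (as fractions, not percent).
stay = 0.219  # share of Biden supporters who consider no other candidate
considering = {
    "Warren": 0.522,
    "Sanders": 0.392,
    "Harris": 0.289,
    "Buttigieg": 0.248,
}

# Distribute the remaining probability mass (1 - stay) proportionally
# to the share of Biden supporters considering each other candidate.
total = sum(considering.values())
biden_row = {name: (1 - stay) * share / total for name, share in considering.items()}
biden_row["Biden"] = stay

# Print the row, largest probability first.
for name, p in sorted(biden_row.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {p:.3f}")
```

By construction the five probabilities sum to 1, which is exactly what a row of a stochastic matrix needs.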

Some remarks:

• In a stochastic matrix all row sums are equal to 1, which is not exactly the case in the matrix above because the entries are rounded. Internally, I used a proper stochastic matrix (with fractions), but that is not very readable.
• I want to emphasize at some point, and this seems to be a good one, that my method of assigning the transition probabilities is very arbitrary. I did this because I like stochastic matrices, not because I want to make a reasonable prediction about the US Democratic primaries.
• Why did I leave out all the other candidates? Because I was too lazy to type in all the matrices.
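The remark about row sums is easy to check numerically. The following sketch uses NumPy and the rounded matrix from the text, assuming the order Warren, Biden, Sanders, Buttigieg, Harris (the order that matches the Biden probabilities quoted above). Rescaling each row is my own workaround for the rounding, not the exact fractions used for the post.

```python
import numpy as np

# Rounded transition matrix from the text; rows and columns are assumed to be
# ordered Warren, Biden, Sanders, Buttigieg, Harris.
A = np.array([
    [0.116, 0.300, 0.232, 0.179, 0.174],
    [0.280, 0.219, 0.210, 0.135, 0.156],
    [0.306, 0.300, 0.146, 0.111, 0.137],
    [0.345, 0.273, 0.153, 0.062, 0.168],
    [0.310, 0.297, 0.184, 0.157, 0.052],
])

# Row sums are close to 1, but not exactly 1, because the entries are rounded.
row_sums = A.sum(axis=1)
print(row_sums)

# Rescale every row so that it sums to exactly 1 (up to floating point).
P = A / row_sums[:, None]
print(P.sum(axis=1))
```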

## The Stationary Distribution

[Image: Elizabeth Warren]

So what can we do with a stochastic matrix? We can figure out the limit distribution. After one step a Biden voter is a Warren voter with probability 0.28, a Biden voter with probability 0.219, a Sanders voter with probability 0.21, a Buttigieg voter with probability 0.135, and a Harris voter with probability 0.156. But what about after two steps? For that we have to consider the matrix $A^2$, whose Biden row gives the answer. Using the rounded entries above, it is roughly $[0.253, 0.278, 0.191, 0.136, 0.142].$
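The two-step probabilities for a Biden supporter are the Biden row of $A^2$. A minimal NumPy sketch, again assuming the candidate order Warren, Biden, Sanders, Buttigieg, Harris and using the rounded matrix entries (so the last digit may differ from a computation with exact fractions):

```python
import numpy as np

candidates = ["Warren", "Biden", "Sanders", "Buttigieg", "Harris"]

# Rounded transition matrix from the text, rows/columns in the order above.
A = np.array([
    [0.116, 0.300, 0.232, 0.179, 0.174],
    [0.280, 0.219, 0.210, 0.135, 0.156],
    [0.306, 0.300, 0.146, 0.111, 0.137],
    [0.345, 0.273, 0.153, 0.062, 0.168],
    [0.310, 0.297, 0.184, 0.157, 0.052],
])

# Row i of A^k holds the distribution after k steps when starting at candidate i.
biden = candidates.index("Biden")
two_step = np.linalg.matrix_power(A, 2)[biden]

for name, p in zip(candidates, two_step):
    print(f"{name:10s} {p:.3f}")
```

Replacing the exponent `2` by a larger number shows how quickly the rows of $A^k$ settle down.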

It is very easy (indeed, I learned this in my first year as a math student) to calculate the limit of the distribution. The whole process is a Markov chain and we want to calculate its stationary distribution. This is explained in the Wikipedia article on stochastic matrices, but I looked it up in Section 6.5 of the (German) book “Linear Algebra” by Huppert and Willems. That is the book from which I learned this many years ago. We have to solve the equation $zA = z$ for a row vector $z$ whose entries sum up to 1. Since all entries of $A$ are positive, this $z$ is unique, and it is the limit distribution no matter where we start. This gives us the following: $[0.256,0.274,0.192,0.135,0.144]$
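One way to solve $zA = z$ together with the normalisation is to write both as a single linear system. The sketch below does this with NumPy's least-squares solver. It is an illustration under my own assumptions (the rounded matrix, rows rescaled to sum to 1, candidate order Warren, Biden, Sanders, Buttigieg, Harris), not the computation used for the post.

```python
import numpy as np

# Rounded transition matrix from the text.
A = np.array([
    [0.116, 0.300, 0.232, 0.179, 0.174],
    [0.280, 0.219, 0.210, 0.135, 0.156],
    [0.306, 0.300, 0.146, 0.111, 0.137],
    [0.345, 0.273, 0.153, 0.062, 0.168],
    [0.310, 0.297, 0.184, 0.157, 0.052],
])

# Rescale rows so that they sum to exactly 1 despite the rounding.
P = A / A.sum(axis=1, keepdims=True)
n = P.shape[0]

# z P = z is equivalent to (P^T - I) z^T = 0.
# Append one more equation, sum(z) = 1, to pin down the scaling.
M = np.vstack([P.T - np.eye(n), np.ones((1, n))])
b = np.zeros(n + 1)
b[-1] = 1.0

# The system is consistent, so least squares returns its exact solution
# (up to floating-point error).
z, *_ = np.linalg.lstsq(M, b, rcond=None)
print(np.round(z, 3))
```

An alternative would be to take the left eigenvector of `P` for the eigenvalue 1 and rescale it so that its entries sum to 1.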

So Warren gets 25.6% of the vote, Biden 27.4%, Sanders 19.2%, Buttigieg 13.5%, and Harris 14.4%. Of course this makes many silly assumptions, in particular that the Democratic primary process takes forever. Or is it really silly? For a non-American, the length of the US primaries definitely feels very infinite. But of course we can also estimate how fast this process converges. This depends on the eigenvalues of $A$. The largest eigenvalue of $A$ is 1: as the row sums are all 1, we have $A j = j$, where $j$ is the all-ones vector. I used Maxima to calculate the remaining eigenvalues. The eigenvalue which determines the rate of convergence towards the stationary distribution is the second largest one in absolute value. In our case that one is roughly $0.2$ in absolute value, so each step shrinks the distance to the stationary distribution by roughly a factor of five. And this brings us back to the main theme of this blog: spectral graph theory.
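The post used Maxima for the eigenvalues; the same check can be sketched with NumPy instead. As before, the matrix is the rounded one from the text with rows rescaled to sum to 1, which is my own assumption rather than the exact fractions used originally.

```python
import numpy as np

# Rounded transition matrix from the text.
A = np.array([
    [0.116, 0.300, 0.232, 0.179, 0.174],
    [0.280, 0.219, 0.210, 0.135, 0.156],
    [0.306, 0.300, 0.146, 0.111, 0.137],
    [0.345, 0.273, 0.153, 0.062, 0.168],
    [0.310, 0.297, 0.184, 0.157, 0.052],
])

# Rescale rows so that they sum to exactly 1 despite the rounding.
P = A / A.sum(axis=1, keepdims=True)

# Eigenvalues may be complex in general, so compare absolute values.
eigenvalues = np.linalg.eigvals(P)
moduli = np.sort(np.abs(eigenvalues))[::-1]

print(moduli[0])  # the eigenvalue 1 that belongs to the all-ones vector
print(moduli[1])  # second largest modulus, which governs the convergence rate
```

After $k$ steps the distance to the stationary distribution shrinks roughly like the $k$-th power of the second value, which is why a handful of steps already gets very close to the limit.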

## Final Predictions

[Image: Joe Biden]

But do I dare to conclude anything about the actual race? Not really. But it surprised me that Biden comes out on top with 27.4%, thanks to his many exclusive supporters. The FiveThirtyEight article highlights the fact that quite a lot of voters consider voting for Warren. A priori I thought that this would make her win in my model, but Biden is not much worse in that regard either, and he easily compensates for the small difference with his 21.9% of exclusive voters, while Warren has only 11.6% of those.

Edit: Changed “voters” to “supporters” in some places in the text, as this is a more accurate description of the poll, which did not ask for a first choice.

Edit 2: If we do not do any of the Markov chain business and simply assign percentages based on the score of the individual candidates, then we end up with a very similar distribution: $[0.265,0.287,0.197,0.126,0.125]$