While everyone presumably has an intuitive understanding of what "left" and "right" mean on the political spectrum, not everyone would agree on precisely how "leftist" or "right-wing" any given politician is. We do not have an absolute measure for this purpose either, but we can gauge politicians' positions relative to one another by looking at their voting records. Media reports often cite the number of times a politician has voted for or against legislative proposals relating to a particular issue.

However, we are not interested in any particular issue, but rather in all of them at the same time. More specifically, we would like to compare members of the House of Representatives of the 112th U.S. Congress by their voting records. So, how should we compare the cumulative voting behavior of our legislators? If we were dealing with only a few votes, we could perhaps measure the legislators' degree of agreement in percent. The sheer number of votes taken in Congress within a legislative period, however, makes this a high-dimensional problem. And "recent research results show that in high dimensional space, the concept of proximity, distance or nearest neighbor may not even be qualitatively meaningful." [1]

To overcome this challenge, we turn to concepts derived from information theory, in particular mutual information. Mutual information is a quantity that measures the mutual dependence of two variables, i.e. how much knowing one variable reduces the uncertainty regarding the other. Intuitively, on a random issue, the vote of a Democrat should be more informative regarding another Democrat's vote than regarding the vote of a Republican.
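To make the quantity concrete, here is a minimal Python sketch that estimates mutual information (in bits) from two observed vote sequences. The three legislators and their votes are made up for illustration, and the plug-in estimate from raw counts is our assumption; it is not necessarily how BayesiaLab computes the value internally.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Plug-in estimate of mutual information (bits) between two vote sequences."""
    if len(x) != len(y):
        raise ValueError("both sequences must cover the same roll calls")
    n = len(x)
    if n == 0:
        return 0.0
    joint = Counter(zip(x, y))   # counts of (vote_x, vote_y) pairs
    px = Counter(x)              # marginal counts for x
    py = Counter(y)              # marginal counts for y
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[a] * py[b]))
    return mi


# Hypothetical votes of three made-up legislators on eight roll calls
dem_a = ["yea", "yea", "nay", "yea", "nay", "nay", "yea", "nay"]
dem_b = ["yea", "yea", "nay", "yea", "nay", "nay", "yea", "yea"]
rep_c = ["yea", "nay", "yea", "nay", "yea", "nay", "yea", "nay"]

print("A vs. B:", mutual_information(dem_a, dem_b))
print("A vs. C:", mutual_information(dem_a, rep_c))
```

In this toy data, legislator C's votes are statistically unrelated to A's, so the pair carries no mutual information, while A and B agree on all but one vote. Note that mutual information measures dependence rather than agreement: two legislators who always voted against each other would also be highly informative about one another.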

As its name implies, the Mutual Information Map layout algorithm that is implemented in BayesiaLab utilizes mutual information. More specifically, Mutual Information Map is a layout algorithm that computes the mutual information matrix between all nodes and then uses a genetic algorithm to search for a node layout in which the distance between two nodes is inversely proportional to their mutual information. Put more simply, the closer two nodes are positioned to each other, the greater the mutual information between them.
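The general idea of such a search can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not BayesiaLab's actual implementation: the target distance 1 / (MI + EPS), the squared-error cost, the population size, the mutation width and the toy matrix are all hypothetical choices.

```python
import math
import random

EPS = 0.1  # assumed offset so that a mutual information of zero gives a finite target


def layout_cost(positions, mi):
    """Sum of squared gaps between each on-map distance and its target 1 / (MI + EPS)."""
    cost = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            target = 1.0 / (mi[i][j] + EPS)
            cost += (math.dist(positions[i], positions[j]) - target) ** 2
    return cost


def evolve_layout(mi, generations=300, population=20, sigma=0.3, seed=0):
    """Toy genetic search: keep the better half, refill by crossover plus mutation."""
    rng = random.Random(seed)
    n = len(mi)

    def random_layout():
        return [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(n)]

    def crossover(a, b):
        # Each node inherits its position from one of the two parents
        return [a[k] if rng.random() < 0.5 else b[k] for k in range(n)]

    def mutate(layout):
        # Nudge one randomly chosen node by Gaussian noise
        k = rng.randrange(n)
        x, y = layout[k]
        child = list(layout)
        child[k] = (x + rng.gauss(0, sigma), y + rng.gauss(0, sigma))
        return child

    def cost(layout):
        return layout_cost(layout, mi)

    pop = sorted((random_layout() for _ in range(population)), key=cost)
    history = [cost(pop[0])]
    for _ in range(generations):
        survivors = pop[: population // 2]  # elitism: the best layout is never lost
        children = [
            mutate(crossover(rng.choice(survivors), rng.choice(survivors)))
            for _ in range(population - len(survivors))
        ]
        pop = sorted(survivors + children, key=cost)
        history.append(cost(pop[0]))
    return pop[0], history


# Hypothetical mutual information matrix: nodes 0-1 and nodes 2-3 form two "parties"
toy_mi = [
    [0.00, 0.90, 0.05, 0.05],
    [0.90, 0.00, 0.05, 0.05],
    [0.05, 0.05, 0.00, 0.90],
    [0.05, 0.05, 0.90, 0.00],
]

best, history = evolve_layout(toy_mi)
print("cost of first best layout:", history[0])
print("cost of final best layout:", history[-1])
```

Because the better half of the population always survives, the best cost can never get worse from one generation to the next. The search can nonetheless stall in a layout that is good but not optimal, and nothing in the procedure signals when the optimum has been reached.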

In our specific study of the U.S. Congress, each House member is represented by a node in an unconnected Bayesian network. Each node can take on the states "yea", "nay" or "not voting/not a member." Here, we treated "not voting/not a member" as a filtered state, so that two frequently absent congressmen would not be interpreted as highly informative regarding their mutual voting behavior. For the House to date, we have a total of 1,505 roll call votes recorded for each representative in the 112th Congress. This allows BayesiaLab to compute the mutual information between nodes/representatives and subsequently display it as a distance on a map.
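The effect of filtering can be illustrated with a small Python sketch. We assume here that "filtered" means a roll call is ignored for a pair of representatives whenever either of them is in the filtered state; this is our reading for illustration, and BayesiaLab's internal treatment may differ. The two absent members and their votes are made up.

```python
from collections import Counter
from math import log2

FILTERED = "not voting/not a member"


def mutual_information(x, y):
    """Plug-in estimate of mutual information (bits) from two equally long sequences."""
    n = len(x)
    if n == 0:
        return 0.0
    joint, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
    return sum((c / n) * log2(c * n / (px[a] * py[b])) for (a, b), c in joint.items())


def filtered_mutual_information(x, y, filtered=FILTERED):
    """Mutual information over only those roll calls where both members cast a vote."""
    kept = [(a, b) for a, b in zip(x, y) if a != filtered and b != filtered]
    if not kept:
        return 0.0
    xs, ys = zip(*kept)
    return mutual_information(xs, ys)


# Two hypothetical members who both missed the same four of eight roll calls
absent_a = [FILTERED] * 4 + ["yea", "nay", "yea", "nay"]
absent_b = [FILTERED] * 4 + ["yea", "yea", "nay", "nay"]

print("unfiltered:", mutual_information(absent_a, absent_b))
print("filtered:  ", filtered_mutual_information(absent_a, absent_b))
```

Without filtering, the shared absences alone make the two members look informative about each other. Once the filtered state is excluded, only the four roll calls on which both voted remain, and on those their votes are unrelated.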

Our source of the U.S. House of Representatives roll call data is Jeff Lewis' and Keith Poole's website (http://adric.sscnet.ucla.edu/rollcall/). Over the years they have systematically compiled voting data and extensively researched voting patterns. Their studies obviously go far beyond what we are attempting to do with our basic demo application today. We should also note that we may very well have overlooked a number of technicalities related to the voting procedures. Thus, we may have counted some of the votes incorrectly.

Also, the nature of a genetic algorithm means that we can never know that we have arrived at the theoretically optimal layout of the map. In our case, we simply stopped the algorithm at some point to retrieve the status quo and took that as the result. So, do not let our Mutual Information Map drive your decision at the next election!

Despite these caveats, the Mutual Information Map of the House of Representatives turns out to be rather intuitive. Each party has its members closely clustered together, and between the two clusters, not surprisingly, is a vast divide. The expression "along party lines" manifests itself here quite literally.

To visualize the genetic nature of the algorithm, we have recorded its progress over millions of iterations until it converges upon a fairly stable layout. We begin with all representatives sorted alphabetically by last name in a rectangular arrangement. The nodes for the representatives are colored following the convention of red for Republicans and blue for Democrats. Once we start the algorithm, we can observe a "busy dance" of nodes and, after a few seconds, clusters of homogeneous color emerge. It takes a few minutes for the picture to stabilize, and after a while only nodes on the periphery continue to move. By then, the big picture is clear, and the gap between parties is obvious.

In this recording, we also zoom in on two prominent House members, whose positions are fairly well known nationwide. As it turns out, we find both Rep. Paul Ryan and Rep. Nancy Pelosi relatively close to the center of their respective party.

It is also worth noting that we created this Mutual Information Map in BayesiaLab before any Bayesian network was learned. The final map therefore remains a fully unconnected network. We will leave the task of unsupervised structural learning on this dataset for a future blog post or white paper.


[1] Aggarwal, Charu C., Alexander Hinneburg, and Daniel A. Keim. “On the Surprising Behavior of Distance Metrics in High Dimensional Space.” In Lecture Notes in Computer Science, 420–434. Springer, 2001.