While everyone presumably has an intuitive understanding of what "left" and "right" mean on the political spectrum, not everyone would agree on the precise degree of being "leftist" or "right-wing" for any given politician. We do not have an absolute measure for this purpose either, but we can gauge politicians' relative positions versus each other by looking at their voting records. Media reports often talk about the number of times a politician has voted in favor of or against legislative proposals relating to a particular issue.

However, we are not interested in any particular issue, but rather in all of them at the same time. More specifically, we would like to compare members of the House of Representatives of the 112th U.S. Congress by their voting records. So, how should we compare the cumulative voting behavior of our legislators? If we dealt with only a few votes, we could perhaps measure the legislators' degree of agreement in percent. The sheer number of votes taken in Congress within a legislative period makes this a high-dimensional problem, though. And, "recent research results show that in high dimensional space, the concept of proximity, distance or nearest neighbor may not even be qualitatively meaningful." [1]

To overcome this challenge, we turn to concepts derived from information theory. In particular, we use mutual information. Mutual information is a quantity that measures the mutual dependence of two variables, i.e. how much knowing one variable reduces the uncertainty regarding the other variable. Intuitively, on a random issue, the vote of a Democrat would be more informative regarding another Democrat's vote than regarding the vote of a Republican.
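
For reference, the standard textbook definition of mutual information between two discrete variables can be written as follows. The symbols are our own choice for illustration: X and Y stand for the votes of two representatives, p(x, y) for the joint frequency of their votes, and H for entropy.

```latex
I(X;Y) \;=\; \sum_{x}\sum_{y} p(x,y)\,\log_2 \frac{p(x,y)}{p(x)\,p(y)} \;=\; H(X) - H(X \mid Y)
```

I(X;Y) is zero when the two votes are statistically independent, and it reaches H(X) when knowing Y removes all uncertainty about X.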

As its name implies, the Mutual Information Map layout algorithm that is implemented in BayesiaLab utilizes mutual information. More specifically, Mutual Information Map is a layout algorithm that computes the mutual information matrix between all nodes and then uses a genetic algorithm to search for a node layout such that the distance between two nodes is inversely proportional to their mutual information. Put more simply, the closer the nodes are positioned relative to each other, the greater the mutual information between them.
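
To make the idea concrete, here is a toy sketch of such a search in Python. It is not BayesiaLab's implementation, whose details are not described here. It assumes a simple genetic scheme (uniform crossover, Gaussian mutation, elitist selection) and a cost that penalizes the gap between each pairwise distance and 1 / MI. The names `layout_cost`, `evolve_layout`, and `min_mi`, as well as the small MI matrix, are hypothetical.

```python
import math
import random


def layout_cost(positions, mi, min_mi=0.1):
    """Sum of squared gaps between pairwise distances and the targets 1 / MI."""
    cost = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            # Assumed floor on MI so that near-zero values do not blow up the target.
            target = 1.0 / max(mi[i][j], min_mi)
            cost += (math.dist(positions[i], positions[j]) - target) ** 2
    return cost


def evolve_layout(mi, generations=500, pop_size=30, step=0.5, seed=0):
    """Toy genetic search: uniform crossover, Gaussian mutation, elitist selection."""
    rng = random.Random(seed)
    n = len(mi)

    def random_layout():
        return [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(n)]

    def make_child(mother, father):
        # Uniform crossover: each node inherits its position from one parent.
        child = [mother[k] if rng.random() < 0.5 else father[k] for k in range(n)]
        # Mutation: nudge one randomly chosen node.
        k = rng.randrange(n)
        x, y = child[k]
        child[k] = (x + rng.gauss(0, step), y + rng.gauss(0, step))
        return child

    population = [random_layout() for _ in range(pop_size)]
    for _ in range(generations):
        children = [make_child(rng.choice(population), rng.choice(population))
                    for _ in range(pop_size)]
        # Elitist selection: keep the pop_size cheapest layouts overall.
        population = sorted(population + children,
                            key=lambda layout: layout_cost(layout, mi))[:pop_size]
    return population[0]


# Hypothetical MI matrix: nodes 0-1 and nodes 2-3 form two tightly coupled pairs.
mi = [
    [0.0, 1.0, 0.1, 0.1],
    [1.0, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 1.0],
    [0.1, 0.1, 1.0, 0.0],
]
best = evolve_layout(mi)
for node, (x, y) in enumerate(best):
    print(node, round(x, 2), round(y, 2))
```

Because selection is elitist, the best cost never increases from one generation to the next, but nothing guarantees that the search ends at the global optimum.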

In our specific study of the U.S. Congress, each House member is represented by a node in an unconnected Bayesian network. Each node can take on the states "yea", "nay" or "not voting/not a member." Here, we treated "not voting/not a member" as a filtered state, so that two frequently absent congressmen would not be interpreted as highly informative regarding their mutual voting behavior. For the House to date, we have a total of 1,505 roll call votes recorded for each representative in the 112th Congress. This allows BayesiaLab to compute the mutual information between nodes/representatives and subsequently display it as a distance on a map.
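
As a rough illustration of why the filtered state matters, the following Python sketch computes mutual information between two vote sequences and skips every roll call on which either member is in the filtered state. This is one plausible reading of filtering, not a description of BayesiaLab's internals. The function name, the state label `"not voting"`, and the toy vote lists are assumptions made for this example.

```python
from collections import Counter
from math import log2

FILTERED = "not voting"  # assumed label for the "not voting/not a member" state


def mutual_information(votes_a, votes_b, filtered=FILTERED):
    """Mutual information in bits between two equally long vote sequences.

    Roll calls on which either member is in the filtered state are ignored.
    """
    pairs = [(a, b) for a, b in zip(votes_a, votes_b)
             if a != filtered and b != filtered]
    n = len(pairs)
    if n == 0:
        return 0.0  # no shared roll calls, so nothing can be learned
    joint = Counter(pairs)
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a, b) / (p(a) * p(b)) simplifies to c * n / (count_a[a] * count_b[b])
        mi += (c / n) * log2(c * n / (count_a[a] * count_b[b]))
    return mi


# Toy data: a and b agree on every vote they both cast and are both absent twice.
a = ["yea", "nay", "yea", "nay", "not voting", "not voting"]
b = ["yea", "nay", "yea", "nay", "not voting", "not voting"]
c = ["yea", "yea", "nay", "nay", "yea", "not voting"]

print(mutual_information(a, b))  # 1.0
print(mutual_information(a, c))  # 0.0
```

If the absences were counted as an ordinary third state, the shared absences of `a` and `b` would raise their mutual information above 1 bit, which is exactly the effect the filtered state is meant to avoid.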

Our source of the U.S. House of Representatives roll call data is Jeff Lewis' and Keith Poole's website (http://adric.sscnet.ucla.edu/rollcall/). Over the years they have systematically compiled voting data and extensively researched voting patterns. Their studies obviously go far beyond what we are attempting to do with our basic demo application today. We should also note that we may very well have overlooked a number of technicalities related to the voting procedures. Thus, we may have counted some of the votes incorrectly.

Also, the nature of a genetic algorithm means that we can never know that we have arrived at the theoretically optimal layout of the map. In our case, we simply stopped the algorithm at some point to retrieve the status quo and took that as the result. So, do not let our Mutual Information Map drive your decision at the next election!

Despite these caveats, the Mutual Information Map of the House of Representatives turns out to be rather intuitive. Each party has its respective members closely clustered together and between them, not surprisingly, is a vast divide. The expression "along party lines" manifests itself here quite literally.

To visualize the genetic nature of the algorithm, we have recorded its progress over millions and millions of iterations until it converges upon a fairly stable layout. We begin with all representatives sorted alphabetically by their last name within a rectangular arrangement. The nodes for the representatives are colored following the convention of red for Republicans and blue for Democrats. Once we start the algorithm, we can observe a "busy dance" of nodes and, after a few seconds, clusters of homogeneous colors emerge. It takes a few minutes for the picture to stabilize, and after a while only nodes on the periphery continue to move. By then, the big picture is clear, and the gap between parties is obvious.

In this recording, we also zoom in on two prominent House members whose positions are fairly well known nationwide. As it turns out, we find both Rep. Paul Ryan and Rep. Nancy Pelosi relatively close to the center of their respective party.

It is also worth noting that we created this Mutual Information Map in BayesiaLab before any Bayesian network was learned. So, the final map remains a fully unconnected network. We will leave the task of unsupervised structural learning on this dataset for a future blog post or white paper.

References:

[1] Aggarwal, Charu C., Alexander Hinneburg, and Daniel A. Keim. “On the Surprising Behavior of Distance Metrics in High Dimensional Space.” In Lecture Notes in Computer Science, 420–434. Springer, 2001.