Child pages
  • Mutual Information and Kullback-Leibler Divergence


Q1: Mutual Information vs KL Divergence

The Mutual Information between two variables X and Y is defined as follows: $$I(X,Y)=\sum_{x \in X}\sum_{y \in Y} p(x,y)\log_2 \frac{p(x,y)}{p(x)p(y)}$$
The KL Divergence allows comparing two probability distributions, P and Q: $$D_{KL}(P({\cal X})\|Q({\cal X}))=\sum_{\cal X}P({\cal X})\log_2\frac{P({\cal X})}{Q({\cal X})}$$
BayesiaLab uses the KL Divergence to measure the strength of a direct relationship between two variables. In that case, P is the Bayesian network with the link, and Q is the same network without the link.
The Mutual Information can be rewritten as: $$I(X,Y)=D_{KL}(p(x,y)\|p(x)p(y))$$

Therefore, Mutual Information (I) and KL Divergence are identical when no spouses (co-parents) are involved in the measured relation.
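The identity between Mutual Information and KL Divergence can be checked numerically. The following is a minimal sketch in plain Python; the joint table `joint` is a made-up example, and the helper names `mutual_information` and `kl_divergence` are illustrative and not part of BayesiaLab.

```python
import math

def mutual_information(joint):
    """I(X,Y) in bits, from a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]          # marginal p(x)
    py = [sum(col) for col in zip(*joint)]    # marginal p(y)
    return sum(
        pxy * math.log2(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )

def kl_divergence(p, q):
    """D_KL(P||Q) in bits, for two distributions over the same outcomes."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical joint distribution p(x, y) for two binary variables.
joint = [[0.4, 0.1],
         [0.2, 0.3]]

px = [sum(row) for row in joint]
py = [sum(col) for col in zip(*joint)]

# P is the joint p(x, y); Q is the product of the marginals p(x)p(y).
p_flat = [pxy for row in joint for pxy in row]
q_flat = [px[i] * py[j] for i in range(len(px)) for j in range(len(py))]

mi = mutual_information(joint)
kl = kl_divergence(p_flat, q_flat)
print(f"I(X,Y) = {mi:.4f} bits, D_KL(p(x,y) || p(x)p(y)) = {kl:.4f} bits")
```

Both quantities are computed from the same terms, so they agree up to floating-point rounding.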


Let's take the following network with two nodes X and Z.

Analyzing the relation with Mutual Information (Validation Mode: Analysis | Visual | Arcs' Mutual Information) and with KL (Validation Mode: Analysis | Visual | Arc Force) returns the same value: 0.3436.


The percentage value in blue in the Mutual Information analysis corresponds to the Normalized Mutual Information $$I_N(X,Z)=\frac{I(X,Z)}{H(Z)}$$
and the one in red corresponds to $$I_N(X,Z)=\frac{I(X,Z)}{H(X)}$$
where H() is the entropy, defined as: $$H(X)=-\sum_{x\in X}p(x)\log_{2}(p(x))$$
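As an illustration of the two normalizations, here is a minimal Python sketch that divides the same Mutual Information by H(Z) and by H(X). The joint table is a made-up example (it does not reproduce the 0.3436 network), and the names `nmi_by_z` and `nmi_by_x` are illustrative only.

```python
import math

def entropy(p):
    """H in bits for a list of probabilities."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def mutual_information(joint):
    """I(X,Z) in bits, from a joint probability table joint[x][z]."""
    px = [sum(row) for row in joint]
    pz = [sum(col) for col in zip(*joint)]
    return sum(
        pxz * math.log2(pxz / (px[i] * pz[j]))
        for i, row in enumerate(joint)
        for j, pxz in enumerate(row)
        if pxz > 0
    )

# Hypothetical joint distribution p(x, z) for two binary variables.
joint = [[0.4, 0.1],
         [0.2, 0.3]]

px = [sum(row) for row in joint]         # marginal of X
pz = [sum(col) for col in zip(*joint)]   # marginal of Z

mi = mutual_information(joint)
nmi_by_z = mi / entropy(pz)   # I(X,Z) / H(Z)
nmi_by_x = mi / entropy(px)   # I(X,Z) / H(X)

print(f"I(X,Z) = {mi:.4f} bits")
print(f"I/H(Z) = {nmi_by_z:.2%}, I/H(X) = {nmi_by_x:.2%}")
```

The two percentages differ whenever X and Z have different entropies, which is why the analysis reports both.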


The percentage in blue in the Arc Force analysis is the relative weight of the link compared to the sum of all the arc forces.

However, as soon as other variables are involved in the relation as co-parents, the KL Divergence takes them into account in the analysis, leading to a more precise result.


Let's take the following deterministic example where Z is an Exclusive Or between X and Y, i.e. true when X and Y are different.

The analysis of the relations with Mutual Information (Validation Mode: Analysis | Visual | Arcs' Mutual Information) returns the following graph, in which the mutual information between X and Z and between Y and Z is zero in both cases.

Indeed, X and Y do not have any impact on Z when they are analyzed separately.

On the other hand, the arc forces computed with KL (Validation Mode: Analysis | Visual | Arc Force) perfectly reflect the deterministic relation of X and Y with Z.
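The Exclusive Or example can be reproduced with a short Python sketch. It assumes uniform priors on X and Y (the page does not state them) and follows the description given earlier: P is the joint distribution of the network with the arc X → Z, and Q is the joint distribution of the same network with that arc removed. It is an illustration of the principle, not BayesiaLab's actual implementation.

```python
import math
from itertools import product

# Assumption: X and Y are independent binary variables with uniform priors.
p_x = {0: 0.5, 1: 0.5}
p_y = {0: 0.5, 1: 0.5}

def p_z_given_xy(z, x, y):
    """Deterministic XOR: Z is true exactly when X and Y differ."""
    return 1.0 if z == (x ^ y) else 0.0

# P: joint distribution of the full network X -> Z <- Y.
joint = {
    (x, y, z): p_x[x] * p_y[y] * p_z_given_xy(z, x, y)
    for x, y, z in product((0, 1), repeat=3)
}

def marginal(dist, keep):
    """Sum a distribution over all positions not listed in keep."""
    out = {}
    for state, p in dist.items():
        key = tuple(state[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def mutual_information(pair):
    """I(A,B) in bits, from a dict {(a, b): p}."""
    pa, pb = marginal(pair, (0,)), marginal(pair, (1,))
    return sum(
        p * math.log2(p / (pa[(a,)] * pb[(b,)]))
        for (a, b), p in pair.items() if p > 0
    )

mi_xz = mutual_information(marginal(joint, (0, 2)))
mi_yz = mutual_information(marginal(joint, (1, 2)))

# Q: same network without the arc X -> Z, so Z depends on Y only:
# Q(x, y, z) = P(x) * P(y, z).
p_yz = marginal(joint, (1, 2))
q_without_xz = {
    (x, y, z): p_x[x] * p_yz[(y, z)]
    for x, y, z in product((0, 1), repeat=3)
}

kl_x_to_z = sum(
    p * math.log2(p / q_without_xz[s]) for s, p in joint.items() if p > 0
)

print(f"I(X,Z) = {mi_xz:.4f}, I(Y,Z) = {mi_yz:.4f}")
print(f"KL for arc X -> Z = {kl_x_to_z:.4f} bits")
```

Under these assumptions both pairwise Mutual Information values are zero, while the KL Divergence for the arc X → Z is one bit, because Z is fully determined once Y is known and the arc is present. By symmetry, the same holds for the arc Y → Z.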

Q2: Normalized Mutual Information

Two clones will have a Normalized Mutual Information I_N(X, X) = 1, but not necessarily a Mutual Information I(X, X) = 1. Since I(X, X) = H(X), the value depends on the initial entropy H(X). You will get I(X, X) = 1 when H(X) = 1 bit, for example with a binary variable X that has a uniform marginal distribution.
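The clone case can be checked with a few lines of Python. This is a minimal sketch: the two marginal distributions are made-up examples, and `clone_measures` is an illustrative helper name.

```python
import math

def entropy(p):
    """H(X) in bits for a list of probabilities."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def clone_measures(p):
    """Return (I(X,X), I_N(X,X)) for a variable with marginal p and its clone.

    The joint distribution of (X, X) is diagonal: p(x, x) = p(x).
    """
    mi = sum(pi * math.log2(pi / (pi * pi)) for pi in p if pi > 0)
    return mi, mi / entropy(p)

# Binary variable with a uniform marginal: H(X) = 1 bit.
mi_uniform, nmi_uniform = clone_measures([0.5, 0.5])

# Hypothetical skewed binary variable: H(X) < 1 bit.
mi_skewed, nmi_skewed = clone_measures([0.9, 0.1])

print(f"uniform: I = {mi_uniform:.4f}, I_N = {nmi_uniform:.4f}")
print(f"skewed:  I = {mi_skewed:.4f}, I_N = {nmi_skewed:.4f}")
```

In both cases the Normalized Mutual Information is 1, but the Mutual Information itself reaches 1 only for the uniform binary variable.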