
...

## Q1: Mutual Information vs KL Divergence

The Mutual Information between two variables X and Y is defined as follows:

$$I(X,Y)=\sum_{x \in X}\sum_{y \in Y} p(x,y)\log_2 \frac{p(x,y)}{p(x)p(y)}$$

The KL Divergence allows comparing two probability distributions, P and Q:

$$D_{KL}(P({\cal X})\|Q({\cal X}))=\sum_{\cal X}P({\cal X})\log_2\frac{P({\cal X})}{Q({\cal X})}$$

We use the KL Divergence in BayesiaLab for measuring the strength of a direct relationship between two variables: P is then the joint distribution represented by the Bayesian network with the link, and Q is the one represented by the network without the link.

The Mutual Information can be rewritten as:

$$I(X,Y)=D_{KL}(p(x,y)\|p(x)p(y))$$
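To make this identity concrete, here is a minimal Python sketch (illustrative only, not BayesiaLab code) that computes I(X,Y) both from the double-sum definition and as the KL Divergence between the joint distribution and the product of the marginals; the joint `p_xy` is an assumed example distribution.

```python
import numpy as np

# Hypothetical joint distribution p(x, y) for two binary variables X and Y
# (illustrative values only; not taken from a BayesiaLab example).
p_xy = np.array([[0.40, 0.10],
                 [0.15, 0.35]])

p_x = p_xy.sum(axis=1)  # marginal p(x)
p_y = p_xy.sum(axis=0)  # marginal p(y)

def kl_divergence(p, q):
    """D_KL(P || Q) in bits, summing only over states where p > 0."""
    p, q = np.asarray(p, float).ravel(), np.asarray(q, float).ravel()
    mask = p > 0
    return np.sum(p[mask] * np.log2(p[mask] / q[mask]))

# Mutual Information from the double-sum definition
mi = sum(p_xy[i, j] * np.log2(p_xy[i, j] / (p_x[i] * p_y[j]))
         for i in range(2) for j in range(2) if p_xy[i, j] > 0)

# The same quantity as a KL Divergence between the joint distribution
# and the product of the marginals
mi_as_kl = kl_divergence(p_xy, np.outer(p_x, p_y))

print(f"I(X,Y) = {mi:.4f} bits")
print(f"D_KL(p(x,y)||p(x)p(y)) = {mi_as_kl:.4f} bits")  # identical value
```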

Therefore, Mutual Information (I) and KL Divergence are identical when no spouses (co-parents) are involved in the measured relationship.

Example

Let's take the following network with two nodes X and Z.

The analysis of the relation with Mutual Information (Validation Mode: Analysis | Visual | Arcs' Mutual Information) and with KL (Validation Mode: Analysis | Visual | Arc Force) returns the same value: 0.3436.

Info

The percentage value in blue in the Mutual Information analysis corresponds to the Normalized Mutual Information

$$I_N(X,Z)=\frac{I(X,Z)}{H(Z)}$$

and the one in red corresponds to

$$I_N(X,Z)=\frac{I(X,Z)}{H(X)}$$

where H() is the entropy, defined as:

$$H(X)=-\sum_{x\in X}p(x)\log_{2}(p(x))$$
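As a sketch of how these normalized values are obtained, the following Python snippet computes I(X,Z)/H(Z) (blue) and I(X,Z)/H(X) (red) from an assumed joint distribution `p_xz`; the numbers are hypothetical and do not reproduce the 0.3436 network above.

```python
import numpy as np

def entropy(p):
    """H(X) = -sum p(x) log2 p(x), in bits."""
    p = np.asarray(p, float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def normalized_mi(p_xz):
    """Return I(X,Z), I(X,Z)/H(Z) (blue) and I(X,Z)/H(X) (red)."""
    p_x = p_xz.sum(axis=1)
    p_z = p_xz.sum(axis=0)
    mi = sum(p_xz[i, j] * np.log2(p_xz[i, j] / (p_x[i] * p_z[j]))
             for i in range(p_xz.shape[0]) for j in range(p_xz.shape[1])
             if p_xz[i, j] > 0)
    return mi, mi / entropy(p_z), mi / entropy(p_x)

# Hypothetical joint distribution (not the network from the example above)
p_xz = np.array([[0.45, 0.05],
                 [0.10, 0.40]])
mi, blue, red = normalized_mi(p_xz)
print(f"I(X,Z) = {mi:.4f}, I(X,Z)/H(Z) = {blue:.2%}, I(X,Z)/H(X) = {red:.2%}")
```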

Info

The percentage in blue in the Arc Force analysis is the relative weight of the link compared to the sum of all the arc forces.

However, as soon as other variables are involved in the relation as co-parents, the KL Divergence integrates them in the analysis, leading to a more precise result.

Example

Let's take the following deterministic example where Z is an Exclusive Or (XOR) between X and Y, i.e. Z is true when X and Y have different values.

The analysis of the relations with Mutual Information (Validation Mode: Analysis | Visual | Arcs' Mutual Information) returns the following graph, where the mutual information values between X and Z and between Y and Z are both null.

Indeed, X and Y do not have any impact on Z when they are analyzed separately.

On the other hand, the force of the arcs computed with KL (Validation Mode: Analysis | Visual | Arc Force) perfectly reflects the deterministic relation of X and Y on Z.
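This can be verified numerically. In the sketch below (an illustrative reconstruction, not BayesiaLab's internal code), P is the joint distribution of the network with the arc X → Z and Q is the joint of the network without it, following the arc-force definition from above: the pairwise marginal p(x,z) is uniform, so I(X,Z) = 0, while D_KL(P‖Q) recovers the full 1 bit of the XOR relation because the co-parent Y stays in the computation.

```python
import numpy as np
from itertools import product

# Z = X XOR Y, with X and Y independent and uniform.
# P is the joint defined by the network WITH the arc X -> Z:
# P(x,y,z) = p(x) p(y) p(z | x, y)
P = {(x, y, z): 0.25 if z == (x ^ y) else 0.0
     for x, y, z in product([0, 1], repeat=3)}

# Pairwise marginal p(x, z): every combination has probability 1/4,
# so I(X,Z) = 0 -- X alone tells us nothing about Z.
p_xz = {(x, z): sum(P[(x, y, z)] for y in [0, 1])
        for x, z in product([0, 1], repeat=2)}
print(p_xz)  # {(0,0): 0.25, (0,1): 0.25, (1,0): 0.25, (1,1): 0.25}

# Q is the joint defined by the network WITHOUT the arc X -> Z:
# Q(x,y,z) = p(x) p(y) p(z | y), and here p(z | y) = 1/2 for all y.
Q = {(x, y, z): 0.5 * 0.5 * 0.5 for x, y, z in product([0, 1], repeat=3)}

# Arc force as D_KL(P || Q): the co-parent Y is kept in the analysis,
# so the deterministic XOR relation is fully recovered.
arc_force = sum(p * np.log2(p / Q[k]) for k, p in P.items() if p > 0)
print(f"Arc force D_KL(P||Q) = {arc_force:.4f} bits")  # 1.0000
```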

## Q2: Normalized Mutual Information

Two clones will always have a Normalized Mutual Information I_N(X, X) = 1, but not necessarily a Mutual Information I(X, X) = 1: since I(X, X) = H(X), the latter depends on the initial entropy H(X). You will get I(X, X) = 1 only with a binary variable X that has a uniform marginal distribution, because H(X) = 1 bit in that case.
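A quick numerical illustration of this point (a Python sketch, with entropy as defined in Q1):

```python
import numpy as np

def entropy(p):
    """H(X) = -sum p(x) log2 p(x), in bits."""
    p = np.asarray(p, float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# For a clone pair, I(X, X) = H(X), hence I_N(X, X) = H(X)/H(X) = 1,
# while I(X, X) itself equals 1 only when H(X) = 1 bit.
cases = {
    "binary, uniform (0.5/0.5)":   [0.5, 0.5],        # H(X) = 1 -> I(X,X) = 1
    "binary, skewed (0.8/0.2)":    [0.8, 0.2],        # H(X) < 1 -> I(X,X) < 1
    "ternary, uniform (1/3 each)": [1/3, 1/3, 1/3],   # H(X) > 1 -> I(X,X) > 1
}
for label, dist in cases.items():
    h = entropy(dist)
    print(f"{label}: I(X,X) = H(X) = {h:.4f} bits, I_N(X,X) = 1.0")
```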

...