Context

Conditional Probability Distributions (CPDs) are typically represented by Conditional Probability Tables (CPTs), which contain one probability distribution for each combination of the states of the parent nodes. The size of a CPT therefore grows exponentially with the number of parents: a child with n binary parents requires 2^n distributions.

Even though the final internal model in BayesiaLab is always in the form of a CPT, the definition of the CPD can be facilitated by using the Deterministic or Equation modes in the Node Editor.

BayesiaLab 5.3 now offers the ability to use Conditional Probability Trees (CPTr) to compactly represent CPDs by exploiting Contextual Independencies, i.e., cases where the state of one parent makes the child node independent of the other parent(s).
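
As a rough illustration of contextual independence (plain Python, not BayesiaLab code; all names are ours), compare a full table with a tree for a child whose distribution no longer depends on parent B once parent A is True:

    # Full CPT: one distribution per combination of parent states,
    # even when some rows are identical.
    cpt = {
        ("True",  "True"):  {"True": 0.9, "False": 0.1},
        ("True",  "False"): {"True": 0.9, "False": 0.1},  # duplicate of the row above
        ("False", "True"):  {"True": 0.7, "False": 0.3},
        ("False", "False"): {"True": 0.2, "False": 0.8},
    }

    # Tree: the duplicate rows collapse into a single leaf, because
    # A = True makes the child independent of B in that context.
    def tree(a: str, b: str) -> dict:
        if a == "True":
            return {"True": 0.9, "False": 0.1}  # B is irrelevant here
        if b == "True":
            return {"True": 0.7, "False": 0.3}
        return {"True": 0.2, "False": 0.8}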

CPTr are available in the Node Editor, so you can specify a Tree manually. They are also integrated into BayesiaLab's machine learning algorithms:

  • For estimating the parameters of a given network: this allows generalizing probability distributions when data is scarce for some combinations of parent states, or when the distributions are not significantly different.
  • For estimating the parameters and learning the structure of a graph: this can return a more complex structure, as adding parents is often less costly with a CPTr than with a CPT.

New Feature: Tree Mode in Node Editor

Node Editor | Probability Distribution | Tree

Example

Let's define a logical OR between three nodes: Tuberculosis, Cancer, and Bronchitis.

The tree below represents this deterministic function by exploiting two contextual independencies (a short code sketch follows this list):

  • When Tuberculosis is True, Cancer and Bronchitis are independent of the node OR.
  • When Tuberculosis is False and Cancer is True, Bronchitis is independent of the node OR.
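
As a conceptual illustration (a minimal Python sketch, not BayesiaLab code; the function name and signature are ours), this tree encodes the OR function with four leaves instead of the 2^3 = 8 rows of a full CPT:

    # Hedged sketch of the Conditional Probability Tree for the OR node;
    # each return value is P(OR = True | context).
    def p_or_true(tuberculosis: bool, cancer: bool, bronchitis: bool) -> float:
        if tuberculosis:       # context: Tuberculosis = True
            return 1.0         # Cancer and Bronchitis are irrelevant here
        if cancer:             # context: Tuberculosis = False, Cancer = True
            return 1.0         # Bronchitis is irrelevant here
        return 1.0 if bronchitis else 0.0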

Right-click on No Selector to select the Parent to add to the Tree.

 

Define the probability distribution for this context, i.e. Tuberculosis=True.

 

Right-click on False to select the Parent to add to the Tree.

 

Define the probability distribution for this context, i.e. Tuberculosis=False and Cancer=True.

 

Right-click on False to select the Parent to add to the Tree.

  

Define the last two probability distributions.

  

Finally, click Validate to generate the CPT.
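
Conceptually, Validate expands the Tree into the full table by enumerating every combination of parent states and walking the Tree for each one. A minimal sketch, reusing the hypothetical p_or_true function from the earlier sketch:

    import itertools

    # Expand the tree into a full CPT: one row per combination of parent states.
    for t, c, b in itertools.product([True, False], repeat=3):
        p = p_or_true(t, c, b)
        print(f"Tuberculosis={t!s:5}  Cancer={c!s:5}  Bronchitis={b!s:5}  ->  P(OR=True) = {p:.0%}")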

A Tree Node can be deleted by right-clicking on the corresponding node.

Trees can be copied, entirely collapsed, or expanded by right-clicking on the highlighted part of the Tree panel.

You can also expand and collapse a Tree Node by double-clicking on it.

Upon Validation, we obtain the following Conditional Probability Table (the deterministic OR, written out in full):

  Tuberculosis  Cancer  Bronchitis |  True   False
  True          True    True       |  100%     0%
  True          True    False      |  100%     0%
  True          False   True       |  100%     0%
  True          False   False      |  100%     0%
  False         True    True       |  100%     0%
  False         True    False      |  100%     0%
  False         False   True       |  100%     0%
  False         False   False      |    0%   100%

New Feature: Parameter Estimation with Trees

Edit | Parameter Estimation with Trees

If this option is checked, the parameters are no longer estimated with Maximum Likelihood Estimation for each combination of parent states. Instead, they are estimated by machine-learning a probabilistic tree that exploits any regularities in the data, with the objective of compactly representing the relationship between the parent nodes and the child node.

  • The MDL (Minimum Description Length) score is used as a measure of the quality of candidate trees with respect to the available data. Derived from Information Theory, this score automatically takes into account both the likelihood of the data given the tree and the tree's structural complexity (a rough sketch of this idea appears after this list).
  • Trees are available for Parameter Learning, Unsupervised Structural Learning (Taboo and Taboo Order only), and Supervised Learning.
  • The Structural Coefficient and Local Structural Coefficients used for computing the MDL score of a Bayesian network are also used for computing the MDL score of Trees.
  • In this context, the Smoothing Parameter is also taken into account.
  • If no regularities can be exploited, tables are used as usual.
  • If the (conditional) dependency between a parent and its child is not strong enough, that parent will not be included in the induced tree.

    Even though the link is not physically removed, it is effectively "soft-deleted" through the parameters.

    In this example, Evoque Joy is not dependent on Flowery given Fresh.
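
BayesiaLab's exact scoring function is not reproduced here; the following is only a rough Python sketch of the MDL idea under common conventions (the function, its arguments, and the alpha weighting are our assumptions). The score adds a structural term, weighted by a coefficient standing in for the Structural Coefficient, to the negative log-likelihood of the data given the tree; the candidate with the lowest score wins.

    import math

    def mdl_score(alpha, leaves, n_samples):
        """Rough MDL-style score for a candidate tree (illustrative only).
        alpha     -- assumed stand-in for the Structural Coefficient
        leaves    -- per-leaf observation counts, e.g. [{"True": 8, "False": 2}, ...]
        n_samples -- total number of observations in the data set
        """
        # Description length of the tree: its free parameters, each
        # encoded with ~log2(N) bits (a common MDL convention).
        n_params = sum(len(leaf) - 1 for leaf in leaves)
        dl_model = alpha * n_params * math.log2(n_samples)

        # Description length of the data given the tree: the negative
        # log-likelihood under each leaf's maximum-likelihood distribution.
        dl_data = 0.0
        for leaf in leaves:
            total = sum(leaf.values())
            for count in leaf.values():
                if count > 0:
                    dl_data -= count * math.log2(count / total)

        return dl_model + dl_data  # lower is better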

Nodes with CPTr are displayed in pink.

Such nodes can be selected by using Edit | Select Nodes | Generalized.

Whenever Trees are used instead of Tables, an icon is added in the lower right corner of the graph window.

Example

Let's take our textbook Perfume example to illustrate the impact of estimating parameters with Trees.

As expected, the learned structure is much more complex (+12 arcs), as adding parents does not necessarily have an exponential impact on the size of the CPD representation: with Trees, a new parent may add only a few leaves, whereas each new binary parent doubles the size of a CPT.

Let's focus on the node Elegant:

Its strongest relationship is with Chic (as learned with EQ and CPT).

Pleasure is only relevant when the evaluation of Chic is Medium (<=6.4).

Easy to wear is only relevant when the evaluation of Chic is High (<=8.2), as sketched below.
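
Reading those two observations as a tree (a purely hypothetical reconstruction for illustration; the tree actually induced by BayesiaLab may differ):

    # Assumed sketch of the induced tree for Elegant: the root tests Chic,
    # and a different co-parent becomes relevant in each context.
    def relevant_coparent(chic):
        if chic <= 6.4:            # Chic = Medium
            return "Pleasure"      # Pleasure only matters in this context
        elif chic <= 8.2:          # Chic = High
            return "Easy to wear"  # Easy to wear only matters in this context
        return None                # no further co-parent is tested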