Question

Is it possible to use BayesiaLab to learn a Bayesian network when some of its arcs and parameters are already specified before the learning process starts, e.g., on the basis of prior knowledge?

Answer

BayesiaLab is ideally suited to take advantage of any available prior knowledge in the context of structural learning. This allows combining existing expert knowledge with knowledge discovery through machine learning. The following example illustrates a workflow in this regard.

See also Maximum Likelihood Estimation.

Example

We begin with a set of nodes that represents responses from a consumer survey.

Initial Unconnected Network

Using a priori knowledge, e.g. from domain experts, we add arcs to reflect known relationships.

This structure will then serve as the starting point of the structural learning algorithm, assuming we check Keep Structure.

However, if these three relationships are not supported by the dataset, they will be removed by the learning algorithms.

The EQ and Taboo learning algorithms are the only algorithms that can utilize predefined arcs. All the other Unsupervised Learning algorithms start by deleting the existing arcs.

The parameters are estimated from the underlying database. However, after the learning process is completed, we can edit the nodes via the Node Editor and paste any a priori probabilities into the conditional probability distributions.

Adding a priori knowledge via the Node Editor

If we want to keep the predefined arcs, we have to fix them by right-clicking on them and selecting Properties | Fix. Fixed arcs are displayed as dotted lines.

We now start the Taboo learning algorithm (Learning | Unsupervised Structural Learning | Taboo) and are prompted to choose whether or not to delete the existing arcs.

Once learning is completed, we see four new arcs in addition to the fixed arcs.

The Taboo learning algorithm is the only algorithm that is guaranteed to maintain such fixed arcs.

EQ can maintain fixed arcs in some cases, but there is no guarantee as the networks have to be converted into Essential Graphs.
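As a rough illustration of why a search over directed graphs can guarantee that fixed arcs survive, the sketch below filters the candidate moves of a local search so that a fixed arc is never deleted or reversed. This is not BayesiaLab's implementation: the function name `candidate_moves`, the arc representation, and the move set are assumptions made for illustration, and the acyclicity check that a real search needs is omitted.

```python
from itertools import permutations

def candidate_moves(nodes, arcs, fixed):
    """List the single-arc moves a local search may try.

    nodes: iterable of node names
    arcs:  set of (parent, child) tuples currently in the graph
    fixed: subset of arcs that must stay untouched (a hypothetical
           stand-in for arcs marked with Properties | Fix)
    Note: no acyclicity check is performed in this sketch.
    """
    moves = []
    for a, b in permutations(nodes, 2):
        if (a, b) in arcs:
            if (a, b) in fixed:
                continue  # fixed arcs are never deleted or reversed
            moves.append(("delete", a, b))
            moves.append(("reverse", a, b))
        elif (b, a) not in arcs:
            # only propose a new arc between nodes that are not yet linked
            moves.append(("add", a, b))
    return moves

# Example: A -> B is fixed, B -> C was added without fixing it.
nodes = ["A", "B", "C"]
arcs = {("A", "B"), ("B", "C")}
fixed = {("A", "B")}
moves = candidate_moves(nodes, arcs, fixed)
for move in moves:
    print(move)
```

Because every move refers to one specific directed arc, excluding the fixed ones is straightforward here. A search over Essential Graphs works with equivalence classes in which some arc directions are not determined, which is consistent with EQ offering no such guarantee.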

All the other unsupervised learning algorithms start by deleting the existing arcs.

The parameters are estimated from the underlying database.

We can also express our prior knowledge by setting a Number of Structure Equivalent Examples (NSEE).

Checking this option creates a "virtual dataset" made of NSEE cases that correspond to the expert knowledge expressed with the initial Bayesian network (structure and parameters).
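To make the idea of a "virtual dataset" more concrete, the sketch below computes how many virtual cases would fall into each configuration of a tiny two-node prior network A → B. It assumes that the virtual counts are simply NSEE multiplied by the joint probability encoded in the prior network. The node names, states, and probabilities are made up for illustration, and BayesiaLab's internal procedure may differ.

```python
def virtual_counts(p_a, p_b_given_a, nsee):
    """Expected number of virtual cases per (A, B) configuration.

    p_a:         {state_of_A: P(A)}
    p_b_given_a: {state_of_A: {state_of_B: P(B | A)}}
    nsee:        number of virtual cases representing the prior knowledge
    Assumption: count = nsee * P(A) * P(B | A).
    """
    counts = {}
    for a, pa in p_a.items():
        for b, pb in p_b_given_a[a].items():
            counts[(a, b)] = nsee * pa * pb
    return counts

# Hypothetical expert network: A -> B
p_a = {"yes": 0.3, "no": 0.7}
p_b_given_a = {
    "yes": {"high": 0.8, "low": 0.2},
    "no": {"high": 0.1, "low": 0.9},
}
counts = virtual_counts(p_a, p_b_given_a, nsee=100)
for config, n in counts.items():
    print(config, round(n, 2))
```

The counts sum to NSEE, so a larger NSEE gives the expert network more weight relative to the real cases.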

A new icon in the lower right corner of the graph window indicates that such virtual cases have been defined.

Right-clicking on this icon displays the structure of the prior knowledge that has been used for generating the virtual cases that will now be mixed with the real ones.

This way of defining expert knowledge can be used with all the learning algorithms.

When using EQ or Taboo, we can choose to either delete the arcs of the initial network or keep them.

The parameters of the final graph are estimated from both the virtual and real datasets (Maximum Likelihood Estimation with Priors).
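A minimal sketch of this kind of estimation follows, under the assumption that virtual and real counts are simply pooled before the conditional probabilities are normalized. The function name `estimate_cpt` and all numbers are hypothetical, and the exact formula used by BayesiaLab is not stated here.

```python
def estimate_cpt(real_counts, prior_counts):
    """Estimate P(child | parent) from pooled real and virtual counts.

    Both arguments map parent_state -> {child_state: count}.
    Assumption: each parent state has a positive pooled count.
    """
    cpt = {}
    for parent, real_row in real_counts.items():
        # pool the observed counts with the virtual (prior) counts
        pooled = {s: real_row[s] + prior_counts[parent][s] for s in real_row}
        total = sum(pooled.values())
        cpt[parent] = {s: c / total for s, c in pooled.items()}
    return cpt

# 10 real cases with parent = "yes": 6 "high", 4 "low"
real = {"yes": {"high": 6, "low": 4}}
# 30 virtual cases reflecting a prior P(high | yes) = 0.8
virtual = {"yes": {"high": 24.0, "low": 6.0}}

cpt = estimate_cpt(real, virtual)
print(cpt["yes"])
```

In this example the data alone would give 6/10 = 0.6 and the prior alone 0.8; the pooled estimate, 30/40 = 0.75, lies between the two and moves toward the data as more real cases are observed.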

Unlike the first two methods, which consist of just adding arcs, this method does not allow focusing on a subset of nodes: the absence of arcs in the initial network is also treated as prior knowledge (marginal independence).