# Contents

The **Learning** menu provides access to a wide range of learning algorithms and related functions.

**Missing Values Processing**

As the name implies, this function allows you to specify the **Missing Values Processing** algorithm:

- **Static Completion**
- **Dynamic Completion**
- **Structural EM**

### Stratification

If the **Target State** of the **Target Node** has a very weak representation, for instance in the case of fraud detection, **Stratification** allows modifying the probability distribution of the **Target Node** by using the internal weights associated with the states. This modification of the probability distribution permits learning a network that is structurally more complex. Once the structure is learned, the parameters, i.e. the **Conditional Probability Tables**, are estimated on the unstratified data.

In the following dialog box, you can specify the proportion of each state of the **Target Node**. By default, the marginal distribution of the **Target Node** is shown. You can now use the sliders to set the proportions to the desired levels, or enter the percentages directly.
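The effect of stratification can be sketched as per-state weights that rescale the observed proportions to the ones set in the dialog box. This is only an illustrative sketch (the function name and weighting scheme are assumptions, not BayesiaLab's internal implementation):

```python
from collections import Counter

def stratification_weights(labels, target_props):
    """Per-state weights that make the weighted state proportions match
    the desired target proportions (illustrative sketch only; the actual
    internal weighting used by the software may differ)."""
    counts = Counter(labels)
    n = len(labels)
    # weight(state) = desired proportion / observed proportion
    return {s: target_props[s] / (counts[s] / n) for s in counts}

# Fraud example: 2% positives in the data, stratified to 50/50.
labels = ["fraud"] * 2 + ["ok"] * 98
w = stratification_weights(labels, {"fraud": 0.5, "ok": 0.5})

# The weighted proportion of "fraud" is now 0.5:
weighted_fraud = 2 * w["fraud"] / (2 * w["fraud"] + 98 * w["ok"])
print(round(weighted_fraud, 3))  # 0.5
```

Each rare-state case thus counts more during structural learning, while the final parameter estimation still uses the unweighted data.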

Once the **Stratification** is set, the corresponding icon is displayed in the status bar. You can also remove the **Stratification** by right-clicking on this icon and selecting **Remove Stratification** from the **Contextual Menu**.

**Estimation of Probabilities**

Allows updating the probability tables by using the frequencies of the cases observed in the database, smoothed or not depending on the choice defined in the **settings**. If the database contains missing values, this algorithm also launches the missing values processing. At the end of the estimation, the score of the Bayesian network (structure and new probabilities) is displayed in the **console** and automatically inserted in the network's **comment**.
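Frequency-based estimation with optional smoothing can be sketched as follows (a generic Laplace-style smoothing sketch; the function name is hypothetical and the exact smoothing defined in the settings may differ):

```python
from collections import Counter

def estimate_distribution(cases, states, smoothing=0.0):
    """Estimate a probability distribution from observed case
    frequencies, optionally smoothed by adding `smoothing` virtual
    counts to every state (illustrative sketch only)."""
    counts = Counter(cases)
    total = len(cases) + smoothing * len(states)
    return {s: (counts[s] + smoothing) / total for s in states}

cases = ["yes", "yes", "yes", "no"]
print(estimate_distribution(cases, ["yes", "no"]))
# raw frequencies: {'yes': 0.75, 'no': 0.25}
print(estimate_distribution(cases, ["yes", "no"], smoothing=1.0))
# smoothed toward uniform: {'yes': 4/6, 'no': 2/6}
```

Smoothing pulls the estimated probabilities toward the uniform distribution, which avoids assigning a zero probability to states that happen to be absent from the database.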

**Structural Learning of Bayesian Networks**

A broad set of learning algorithms is available to perform a wide range of data mining tasks:

- **Unsupervised Learning** for the discovery of probabilistic relationships in data.
- **Supervised** and **Semi-Supervised Learning** for the characterization of a particular variable.
- **Unsupervised Learning** for the identification of new concepts.

If the current Bayesian network has existing arcs, a dialog box appears asking whether this Bayesian network corresponds to a priori knowledge that the learning algorithms have to take into account.

In that case, an equivalent number has to be specified to indicate how many cases have been used for the construction of that network (by learning or expertise). This number is automatically set when the network is learned from a database. It corresponds to the sum of the weights of the database's learning set if weights are associated, or to the number of examples in the learning set. A virtual database representing that knowledge is then added to the current database in order to take this a priori knowledge into account. A new icon is then added to the task bar.
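The equivalent number described above can be sketched as a simple rule (the function name is an assumption for illustration):

```python
def equivalent_sample_size(learning_set, weights=None):
    """Equivalent number of cases carried by the current network:
    the sum of the learning set's weights when weights are associated,
    otherwise the number of examples (illustrative sketch only)."""
    if weights is not None:
        return sum(weights)
    return len(learning_set)

rows = [("a", 1), ("b", 0), ("a", 1)]
print(equivalent_sample_size(rows))                    # 3
print(equivalent_sample_size(rows, [2.0, 0.5, 1.5]))   # 4.0
```

The larger this number, the more weight the virtual database carries relative to the observed data during learning.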

All the learning algorithms in BayesiaLab are based on the minimization of the MDL score (Minimum Description Length). This score takes into account both the adequacy of the Bayesian network to the data, and the structural complexity of the graph. The score values are available in the **console** during learning. **The score of a given network can always be computed by updating its probabilities.**
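A generic two-part MDL score combines exactly the two terms mentioned above: a structural cost that grows with the number of free parameters, and a data cost that shrinks as the network fits the data better. This is a textbook sketch, not BayesiaLab's exact formulation:

```python
import math

def mdl_score(log_likelihood, n_free_params, n_cases):
    """Two-part MDL score (in bits): structural complexity plus data
    fit. Generic textbook sketch; the exact score minimized by the
    software may be parameterized differently."""
    structure_cost = 0.5 * n_free_params * math.log2(n_cases)
    data_cost = -log_likelihood  # log-likelihood assumed in bits
    return structure_cost + data_cost

# A denser network fits the data better (higher log-likelihood) but
# pays a larger structural cost; learning keeps whichever sum is lower.
sparse = mdl_score(log_likelihood=-800.0, n_free_params=5, n_cases=1000)
dense = mdl_score(log_likelihood=-750.0, n_free_params=20, n_cases=1000)
print(sparse < dense)
```

The trade-off prevents learning from always preferring the fully connected graph, which would fit the data perfectly but be maximally complex.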

The **excluded nodes** are not taken into account in the learning algorithms.

The **filtered states** are taken into account.

A **compression rate** is available in the console. This indicator measures the data compression obtained by the network with respect to the previous network (usually, the unconnected network). This rate, which corresponds to the "adequacy of the network to the data" part of the MDL score, thus gives an indication not only of the probabilistic links that are in the network, but also of the strength of these links. For example, with a database containing two binary variables that are strictly identical, the corresponding network will link these variables and describe in the conditional probability table that the value of the second variable is deterministically defined by the first variable. The compression rate will then be equal to 50%.

**Learning Policies**

If the network has **Decision** and **Utility** nodes, action policies can be learned for **static Bayesian networks** and for **Dynamic Bayesian networks**. The scheme used for policy learning relies on dynamic programming and reinforcement learning principles, which makes this learning available with exact as well as approximate inference. Learning policies is only available in **Validation mode**.
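The dynamic-programming principle behind policy learning can be illustrated with plain value iteration on a tiny decision problem. Everything below (state and action names, transitions, utilities) is a hypothetical toy example, not BayesiaLab's implementation:

```python
def value_iteration(states, actions, P, R, gamma=0.9, eps=1e-6):
    """Value iteration: repeatedly back up expected utilities until
    convergence, then read off the greedy policy (generic dynamic-
    programming sketch)."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(
                R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a].items())
                for a in actions
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            break
    # Greedy policy with respect to the converged values.
    policy = {
        s: max(actions, key=lambda a: R[s][a]
               + gamma * sum(p * V[s2] for s2, p in P[s][a].items()))
        for s in states
    }
    return V, policy

# Hypothetical 2-state, 2-action problem.
states, actions = ["s0", "s1"], ["wait", "act"]
P = {
    "s0": {"wait": {"s0": 1.0}, "act": {"s1": 1.0}},
    "s1": {"wait": {"s1": 1.0}, "act": {"s0": 1.0}},
}
R = {"s0": {"wait": 0.0, "act": 1.0}, "s1": {"wait": 2.0, "act": 0.0}}
V, policy = value_iteration(states, actions, P, R)
print(policy)  # best action per state
```

Here the learned policy acts in `s0` to reach the higher-utility `s1`, then waits there, which is the kind of long-term trade-off that dynamic programming resolves and that a greedy one-step choice would miss.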