Context

BayesiaLab requires continuous variables to be discretized. This process essentially consists of creating a clone of the hidden continuous variable with discrete states (usually called bins in this context). Discretization has a major impact on the model because it defines how the domain is perceived.

One of the most important discretization parameters is the number of bins, which has a direct impact on model complexity: the more bins there are, the larger the (conditional) probability tables become.
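To make the growth concrete, the following is a minimal sketch that counts the cells of a conditional probability table. It assumes the standard definition of a table for discrete nodes (one probability per child state for every combination of parent states); the function name `cpt_size` is hypothetical and not part of BayesiaLab.

```python
from math import prod


def cpt_size(child_bins: int, parent_bins: list[int]) -> int:
    """Count the cells in the conditional probability table of one node.

    Assumption: the table holds one probability per child state for each
    combination of parent states. With no parents it is a marginal table.
    """
    return child_bins * prod(parent_bins)


# A node with two parents, where all three variables use the same number of bins.
for k in (2, 3, 5):
    print(f"{k} bins per variable: {cpt_size(k, [k, k])} cells")
```

Under these assumptions, going from 2 to 5 bins per variable grows the table from 8 to 125 cells.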

...

[Tabs: 2 bins, 3 bins, 5 bins (images not available)]

In the context of machine learning, this means that enough samples are needed to estimate all these probabilities. The size of the data set must therefore be taken into account when choosing the number of bins.
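As a rough illustration of this trade-off, the sketch below estimates how many observations are available, on average, for each combination of parent states. It assumes observations are spread evenly over the parent combinations, which real data rarely satisfies, so the result is an optimistic figure. The function name `avg_samples_per_parent_config` is hypothetical and not part of BayesiaLab.

```python
from math import prod


def avg_samples_per_parent_config(n_samples: int, parent_bins: list[int]) -> float:
    """Average number of observations per combination of parent states.

    Assumption: observations are spread evenly over all parent combinations.
    Each combination needs its own estimated distribution over the child's bins.
    """
    return n_samples / prod(parent_bins)


# 1,000 observations, a node with two parents discretized into k bins each.
for k in (2, 3, 5):
    avg = avg_samples_per_parent_config(1000, [k, k])
    print(f"{k} bins per parent: about {avg:.0f} observations per combination")
```

With the same 1,000 observations, moving from 2 to 5 bins per parent reduces the average support for each estimated distribution from 250 observations to 40.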

New Feature: Intervals

As of version 8.0, the default number of bins proposed by the Discretization Wizard is calculated automatically from the number of observations. This number is between 3 and 7.

...