Context

Import/Associate | Discretization and Aggregation Wizard

Node Editor | Curve | Generate a Discretization

Learning | Discretization

A univariate discretization method is based solely on the analysis of the variable's own continuous values. Such methods can be used while importing or associating data. Once data has been loaded, discretization can be started via Learning | Discretization or via Node Editor | Curve | Generate a Discretization.

Updated Feature: Normalized Equal Distance

The Normalized Equal Distance algorithm has been modified to take into account the Minimum Interval Weight, which defines the minimum prior probability of a bin (Window | Preferences | Discretization). As a result, the updated algorithm is now less sensitive to outliers than the familiar Equal Distance algorithm.
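To illustrate why a minimum bin weight reduces sensitivity to outliers, the following minimal sketch applies plain equal-distance binning to data with one extreme value and then merges bins whose share of observations falls below a threshold. It is not BayesiaLab's implementation: the function names, the `min_weight` parameter, and the merge rule are assumptions made for illustration only.

```python
def equal_distance_edges(values, n_bins):
    """Return n_bins + 1 equally spaced edges spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def bin_weights(values, edges):
    """Fraction of observations per bin (the last bin is closed on the right)."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = 0
        # Advance while the value lies at or beyond the next edge;
        # the bound keeps the maximum value inside the last bin.
        while idx < len(counts) - 1 and v >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1
    return [c / len(values) for c in counts]


def merge_light_bins(values, edges, min_weight):
    """Hypothetical repair step: remove an edge of the lightest bin
    until every bin holds at least min_weight of the observations."""
    edges = list(edges)
    while len(edges) > 2:
        weights = bin_weights(values, edges)
        lightest = min(range(len(weights)), key=weights.__getitem__)
        if weights[lightest] >= min_weight:
            break
        # Merge with the right neighbour, or with the left one for the last bin.
        drop = lightest + 1 if lightest < len(weights) - 1 else lightest
        del edges[drop]
    return edges


# Nineteen ordinary values and a single outlier.
data = list(range(1, 20)) + [1000]
edges = equal_distance_edges(data, 4)
weights = bin_weights(data, edges)          # two of the four bins are empty
merged = merge_light_bins(data, edges, 0.02)
```

With the outlier present, equal-distance binning leaves two of the four bins empty and places 95% of the observations in the first bin. Enforcing a minimum weight removes the empty bins, which is the kind of effect the Minimum Interval Weight is meant to achieve.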

New Feature: R2-GenOpt

The entirely new R2-GenOpt algorithm utilizes a Genetic Algorithm to find a discretization that maximizes the R2 between the discretized variable and its corresponding (hidden) continuous variable. As such, it is the optimal approach for achieving the first discretization objective, i.e. finding a precise representation of the continuous values of a variable.
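The objective that such a search maximizes can be sketched as follows. This is only an illustration of the scoring step, not the genetic algorithm itself and not BayesiaLab's code. It assumes that each bin is represented by the mean of the values falling into it, in which case R2 equals one minus the within-bin sum of squares divided by the total sum of squares. The function name `r2_of_discretization` and the edge-list representation are hypothetical.

```python
def r2_of_discretization(values, edges):
    """R2 between the values and their bin means: 1 - SSE / SST."""
    n_bins = len(edges) - 1
    groups = [[] for _ in range(n_bins)]
    for v in values:
        idx = 0
        # Find the bin; the maximum value stays in the last bin.
        while idx < n_bins - 1 and v >= edges[idx + 1]:
            idx += 1
        groups[idx].append(v)

    overall_mean = sum(values) / len(values)
    sst = sum((v - overall_mean) ** 2 for v in values)
    if sst == 0:
        raise ValueError("R2 is undefined for a constant variable")

    # Within-bin squared error, using each bin's mean as its representative.
    sse = 0.0
    for group in groups:
        if not group:
            continue
        bin_mean = sum(group) / len(group)
        sse += sum((v - bin_mean) ** 2 for v in group)
    return 1.0 - sse / sst


data = [1, 2, 3, 10, 11, 12]
good = r2_of_discretization(data, [1, 6, 12])    # split between the two clusters
poor = r2_of_discretization(data, [1, 2.5, 12])  # split inside the first cluster
```

A candidate whose threshold separates the two clusters scores higher than one that cuts through a cluster, so an optimizer comparing candidates by this score would prefer the former.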

This algorithm takes into account the Minimum Interval Weight and can also create a specific bin for representing zeros if the Isolate Zeros option is set.
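How these two settings might constrain a candidate discretization can be sketched as below. The functions `assign_bins` and `satisfies_min_weight`, the `"zero"` label, and the `isolate_zeros` and `min_weight` parameters are hypothetical names chosen for this illustration. They mirror the options described above but are not taken from BayesiaLab.

```python
def assign_bins(values, edges, isolate_zeros=False):
    """Return one bin label per value; exact zeros get the label "zero"
    when isolate_zeros is set, other values get their interval index."""
    labels = []
    for v in values:
        if isolate_zeros and v == 0:
            labels.append("zero")
            continue
        idx = 0
        # The bound keeps the maximum value inside the last interval.
        while idx < len(edges) - 2 and v >= edges[idx + 1]:
            idx += 1
        labels.append(idx)
    return labels


def satisfies_min_weight(labels, min_weight):
    """True if every non-empty bin holds at least min_weight of the observations."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return all(c / len(labels) >= min_weight for c in counts.values())


data = [0, 0, 0, 1, 2, 3, 4, 5, 6, 7]
labels = assign_bins(data, [0, 4, 7], isolate_zeros=True)
ok = satisfies_min_weight(labels, 0.2)         # bins hold 30%, 30%, 40%
too_strict = satisfies_min_weight(labels, 0.35)
```

In a search procedure, a check of this kind could be used to reject or penalize candidates that contain a bin lighter than the configured minimum.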

The R2 value between the discretized variable and its corresponding continuous variable can be retrieved in the Information Mode by hovering over the monitor.