Question

How does the BayesiaLab K-Means discretization algorithm work?

Answer

This is an unsupervised, univariate discretization algorithm that applies classical K-means clustering to one-dimensional continuous data.

K-means can be viewed as an Expectation-Maximization procedure with hard assignments. It works as follows:

  1. Initialization: random creation of K centers
  2. Expectation: each point is associated with the closest center
  3. Maximization: each center is moved to the barycenter (the mean) of its associated points

Steps 2 and 3 are repeated until convergence is reached.
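
The three steps above can be sketched in a few lines of Python. This is only an illustration of the classical procedure, not BayesiaLab's implementation: the function name `kmeans_1d`, the `seed` and `max_iter` parameters, the choice to draw the initial centers from the data points, and the handling of empty clusters are all assumptions made for the example.

```python
import random

def kmeans_1d(values, k, max_iter=100, seed=0):
    """Classical K-means on one-dimensional data; returns the sorted centers."""
    rng = random.Random(seed)
    # Step 1 (initialization): pick k distinct data points at random.
    # Assumption: the source only says the centers are created randomly.
    centers = rng.sample(sorted(set(values)), k)
    for _ in range(max_iter):
        # Step 2 (expectation): attach each point to its closest center.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        # Step 3 (maximization): move each center to the mean of its points.
        # Assumption: an empty cluster keeps its previous center.
        new_centers = [
            sum(c) / len(c) if c else centers[j]
            for j, c in enumerate(clusters)
        ]
        # Convergence: stop once no center moves any more.
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers)

centers = kmeans_1d([0.0, 1.0, 10.0, 11.0], k=2)
print(centers)
```

Like any K-means run, the result can depend on the random initialization, so real data may converge to different centers for different seeds.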

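The discretization thresholds used by BayesiaLab are the midpoints between consecutive centers, where $K_i$ denotes the position of the $i$-th center in increasing order: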

$$T_i = \frac{K_{i}+K_{i+1}}{2}$$
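
As a small worked illustration of the threshold formula, the sketch below computes the midpoints between sorted centers and uses them to assign a value to an interval. The center values are made up for the example, and the helper names `thresholds_from_centers` and `discretize` are hypothetical. How a value lying exactly on a threshold is binned is an assumption here (it goes to the upper interval); the source does not specify this.

```python
from bisect import bisect_right

def thresholds_from_centers(centers):
    """Midpoints between consecutive centers, taken in increasing order."""
    ordered = sorted(centers)
    return [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]

def discretize(value, thresholds):
    """Index of the interval containing value (0 = lowest interval).

    Assumption: a value equal to a threshold falls in the upper interval.
    """
    return bisect_right(thresholds, value)

# Illustrative centers for K=3 (not taken from BayesiaLab output).
example_centers = [1.0, 5.0, 9.0]
example_thresholds = thresholds_from_centers(example_centers)
print(example_thresholds)                   # [3.0, 7.0]
print(discretize(4.2, example_thresholds))  # 1
```

With K centers this yields K-1 thresholds and therefore K intervals.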




Example

The figure below illustrates how this algorithm works with K=3.