Learning | Clustering | Data Clustering

Data Clustering is a form of unsupervised learning used to segment the data. The algorithm outputs a new variable, [Factor_i], whose states correspond to the segments it created.
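To make the idea of a segment variable concrete, here is a minimal sketch in Python. It uses plain k-means as a stand-in and is not BayesiaLab's own algorithm. The toy data, the `kmeans` helper, the column name `Factor_0`, and the state names `C1`, `C2` are assumptions made for illustration only.

```python
from math import dist

def kmeans(points, k, iterations=20):
    """Plain k-means (illustrative stand-in); returns one segment index per point."""
    # Deterministic start: take k evenly spaced observations as initial centroids.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each observation joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

# Hypothetical toy data set with two obvious groups.
rows = [
    {"x": 1.0, "y": 1.1},
    {"x": 0.9, "y": 1.0},
    {"x": 1.2, "y": 0.8},
    {"x": 5.0, "y": 5.2},
    {"x": 5.1, "y": 4.9},
    {"x": 4.8, "y": 5.0},
]

points = [(r["x"], r["y"]) for r in rows]
labels = kmeans(points, k=2)

# The clustering result is stored as a new variable; its states name the segments.
for row, label in zip(rows, labels):
    row["Factor_0"] = f"C{label + 1}"
```

After the run, every row carries one extra discrete column, and each state of that column identifies one segment.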

There are various reasons to use Data Clustering:

  • For finding observations that look the same
  • For finding observations that behave the same
  • For representing an unobserved dimension
  • For compactly representing the joint probability distribution

Although some metrics are available to judge the technical quality of the created variable, its practical quality is usually subjective, since a good segmentation should be easily interpretable. Another important criterion is the stability of the solution.
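One possible way to quantify stability is to repeat the clustering and measure how well two resulting segmentations agree. The sketch below uses the Rand index, that is, the share of observation pairs that both segmentations treat the same way. This measure is an assumption chosen for illustration, not necessarily the one BayesiaLab reports, and the three label lists are made-up examples.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Share of observation pairs on which two segmentations agree.

    A pair counts as agreement when both segmentations put the two
    observations together, or both keep them apart. State names do
    not matter, only the grouping does.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

# Hypothetical results of three clustering runs on the same six observations.
run_1 = ["C1", "C1", "C1", "C2", "C2", "C2"]
run_2 = ["C2", "C2", "C2", "C1", "C1", "C1"]  # same segments, states renamed
run_3 = ["C1", "C1", "C2", "C2", "C2", "C2"]  # one observation changed segment

same = rand_index(run_1, run_2)     # identical grouping
shifted = rand_index(run_1, run_3)  # partially different grouping
```

A value of 1 means the two runs produced the same segments. Values that drop noticeably across repeated runs suggest an unstable solution.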


Data Clustering has been updated in versions 5.1 and 5.2.