This menu gives access to algorithms that cluster data in an unsupervised way in order to find partitions of homogeneous elements. These algorithms are based on a naive architecture in which the node CLUSTERS, which models the partitions, is the parent of all the other variables. Unlike in supervised learning, the values of the node CLUSTERS are never observed in the database. All these algorithms therefore rely on Expectation-Maximization methods to estimate these missing values.
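As an illustration of the principle only, here is a minimal Expectation-Maximization sketch for a naive structure with one hidden cluster node and discrete child variables. It is not the implementation used by the software: the function name `em_latent_class`, the random initialization, the smoothing constant and the fixed number of iterations are all assumptions made for this example.

```python
import math
import random


def em_latent_class(rows, n_states, n_clusters, n_iter=50, seed=0):
    """Fit a naive latent-class model with EM (illustrative sketch).

    rows: list of tuples of integer state indices, one entry per variable.
    n_states: number of states of each variable.
    Returns (prior, cpts, responsibilities), where cpts[c][v][s] is
    P(variable v = state s | cluster c).
    """
    rng = random.Random(seed)
    n_vars = len(n_states)

    def normalize(values):
        total = sum(values)
        return [v / total for v in values]

    # Random starting point (assumption): P(cluster) and P(variable | cluster).
    prior = normalize([rng.random() + 0.5 for _ in range(n_clusters)])
    cpts = [[normalize([rng.random() + 0.5 for _ in range(n_states[v])])
             for v in range(n_vars)] for _ in range(n_clusters)]

    resp = []
    for _ in range(n_iter):
        # E-step: posterior of the never-observed cluster value for each row.
        resp = []
        for row in rows:
            joint = [prior[c] * math.prod(cpts[c][v][row[v]] for v in range(n_vars))
                     for c in range(n_clusters)]
            resp.append(normalize(joint))

        # M-step: re-estimate the parameters from expected counts.
        smooth = 1e-3  # small smoothing term (assumption) to avoid zero probabilities
        prior = normalize([sum(r[c] for r in resp) + smooth for c in range(n_clusters)])
        for c in range(n_clusters):
            for v in range(n_vars):
                expected = [smooth] * n_states[v]
                for row, r in zip(rows, resp):
                    expected[row[v]] += r[c]
                cpts[c][v] = normalize(expected)
    return prior, cpts, resp


# Usage: two obviously different groups of rows over three binary variables.
rows = [(0, 0, 0)] * 20 + [(1, 1, 1)] * 20
prior, cpts, resp = em_latent_class(rows, [2, 2, 2], n_clusters=2)
```

Each row ends up with a probability distribution over the cluster states rather than a hard label, which is what allows the missing cluster values to be estimated iteratively.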

Output

The cluster node can be created with ordered numerical states. The value of each state is the mean of the scores of the connected nodes for that state of the cluster node. Each score is weighted by the binary mutual information of its node so that the value is more representative of the relationships. If two of these values are strictly identical, an epsilon is added to one of them so that the two values differ. Excluded nodes are not taken into account when computing the numerical values.
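The computation described above can be sketched as a weighted mean followed by a tie-breaking step. This is a hedged illustration, not the software's code: the function `cluster_state_values`, the input layout, and the way the epsilon is scaled are assumptions, and the per-node scores and mutual-information weights are assumed to be already available.

```python
def cluster_state_values(scores, weights, excluded=(), epsilon=1e-9):
    """Return one numerical value per cluster state (illustrative sketch).

    scores[node][k]: score of `node` for cluster state k (hypothetical input).
    weights[node]: weight of the node, e.g. its mutual information with
    the cluster node, assumed to be precomputed.
    excluded: nodes to ignore; at least one node must remain.
    """
    nodes = [n for n in scores if n not in excluded]
    n_states = len(scores[nodes[0]])
    total_weight = sum(weights[n] for n in nodes)

    # Weighted mean of the node scores for each cluster state.
    values = [sum(weights[n] * scores[n][k] for n in nodes) / total_weight
              for k in range(n_states)]

    # Strictly identical values are separated by a small epsilon.
    seen = set()
    for k, v in enumerate(values):
        while v in seen:
            v += epsilon * max(1.0, abs(v))
        seen.add(v)
        values[k] = v
    return values


scores = {"A": [1.0, 3.0, 1.0], "B": [2.0, 4.0, 2.0], "X": [100.0, 0.0, 0.0]}
weights = {"A": 1.0, "B": 3.0, "X": 5.0}
values = cluster_state_values(scores, weights, excluded={"X"})
```

In this example the first and third states have the same weighted mean, so the third one is shifted by the epsilon and the three values stay distinct. The excluded node `X` has no influence on the result.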

Clustering Settings

The assistant gives access to the different search methods:

Options

Edit Node Weights

A button displays a dialog box for editing the weight associated with each variable.

These weights, which default to 1, are associated with the variables and guide the clustering. A weight greater than 1 means that the variable is taken into account more heavily during the clustering. A weight of zero makes the variable purely illustrative.
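One common way to obtain this behavior is to use the weight as an exponent on each variable's likelihood term when computing the cluster posterior. The sketch below assumes that mechanism purely for illustration; the text does not state how the software applies the weights, and the function `cluster_posterior` and its parameter layout are hypothetical.

```python
import math


def cluster_posterior(row, prior, cpts, weights):
    """Posterior over cluster states with per-variable weights (sketch).

    row: observed state index of each variable.
    prior[c]: P(cluster c); cpts[c][v][s]: P(variable v = s | cluster c).
    weights[v]: weight of variable v, used here as an exponent (assumption).
    """
    log_scores = []
    for c, p_c in enumerate(prior):
        log_p = math.log(p_c)
        for v, state in enumerate(row):
            # A weight of 0 removes the variable's contribution entirely.
            log_p += weights[v] * math.log(cpts[c][v][state])
        log_scores.append(log_p)

    # Normalize in a numerically stable way.
    top = max(log_scores)
    unnorm = [math.exp(s - top) for s in log_scores]
    total = sum(unnorm)
    return [u / total for u in unnorm]


prior = [0.5, 0.5]
cpts = [[[0.9, 0.1], [0.2, 0.8]],
        [[0.1, 0.9], [0.8, 0.2]]]
default = cluster_posterior((0, 0), prior, cpts, weights=[1, 1])
illustrative = cluster_posterior((0, 0), prior, cpts, weights=[1, 0])
```

With the second weight set to zero, the posterior depends only on the first variable, so the second variable can still be described per cluster but does not influence the assignment.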

Result

At the end of the clustering, an algorithm automatically checks whether one of the states of the Clusters node is a filtered state. If so, that state is marked as filtered.

An automatic analysis of the obtained segmentation is then carried out and returns a textual report. This report is a Target Report Analysis, but contains some additional information. It is made of: 

The Mapping button of the report window allows displaying a graphical representation of the created clusters:

This graph displays three properties of the found clusters: 

The rotation buttons at the bottom right rotate the entire graph.

To make the obtained clusters easier to understand, the states of the node Cluster are automatically given long names if at least one variable used in the clustering has numerical values associated with its states. Each name contains the mean value of all the clustered variables obtained when observing the corresponding state of the Cluster.
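As a rough sketch of how such names could be derived, the code below computes the expected value of each numerical variable given a cluster state and assembles a label from those means. The function `cluster_long_names`, the one-mean-per-variable reading, and the `C1 (Age=15.00)` label format are assumptions for illustration; the exact format produced by the software is not described here.

```python
def cluster_long_names(cpts, state_values, var_names, digits=2):
    """Build a descriptive name for each cluster state (illustrative sketch).

    cpts[c][v][s]: P(variable v = state s | cluster c).
    state_values[v]: numerical value of each state of variable v,
    or None when the variable has no numerical values.
    """
    names = []
    for c, cluster_cpts in enumerate(cpts):
        parts = []
        for v, dist in enumerate(cluster_cpts):
            values = state_values[v]
            if values is None:
                continue  # variables without numerical values are skipped
            # Mean value of the variable when the cluster state is observed.
            mean = sum(p * x for p, x in zip(dist, values))
            parts.append(f"{var_names[v]}={mean:.{digits}f}")
        names.append(f"C{c + 1} (" + ", ".join(parts) + ")")
    return names


cpts = [[[0.5, 0.5], [1.0, 0.0]],
        [[0.0, 1.0], [0.25, 0.75]]]
names = cluster_long_names(cpts, [[10.0, 20.0], None], ["Age", "Color"])
# names == ["C1 (Age=15.00)", "C2 (Age=20.00)"]
```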