Modeling Mode: Learning | Clustering | Data Clustering

Updated: Data Clustering Tool

Four changes have been made to the Data Clustering tool:

  • The Clustering function now applies only to the selected nodes rather than to all nodes.
  • The resulting latent variable is no longer labeled Cluster, but instead [Factor_i], where i is the number of latent variables already induced in the network. For instance, if Clustering is performed in a network with two existing factors ([Factor_0] and [Factor_1]), the new factor will be [Factor_2].
  • The latent variable is automatically imputed by choosing, for each line of the data set, the state with the highest posterior probability given the values of its associated variables.
  • If a Target Variable is defined and is not among the selected nodes, the option "Weighting coefficient by mutual information with target" becomes available. This option can be used to steer the clustering toward clusters that are more strongly correlated with the Target Variable.
    The weight of each variable is defined by the associated parameter (yellow circle above) and the relative significance of the variable's Mutual Information with the Target Node, as follows: