Updated 16 May 2018
The Stata module "Clv"
Clv clusters variables around latent components. The variables are clustered stepwise: at each step, the merge that minimizes the decrease in the T criterion is chosen, where T is computed as the sum of the first eigenvalues of the data matrices of all the clusters. A hierarchical cluster analysis based on this criterion is performed. A consolidation procedure can then be run to assign each variable to the latent component with which it is most strongly correlated.
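To make the T criterion more concrete, the sketch below computes it by hand for one candidate merge. It is only an illustration, not the module's internal code: it assumes standardized variables, so that the "matrix of data" of a cluster can be summarized by the correlation matrix of its variables, and it uses the auto dataset shipped with Stata. The two clusters and all local, scalar and matrix names are hypothetical.

```stata
* Illustrative data shipped with Stata (not part of the clv documentation)
sysuse auto, clear

* Two hypothetical clusters, and the cluster obtained by merging them
local clusterA weight length
local clusterB turn displacement
local merged `clusterA' `clusterB'

* First (largest) eigenvalue of the correlation matrix of each set of variables
foreach c in clusterA clusterB merged {
    quietly correlate ``c''
    matrix R = r(C)
    * symeigen returns the eigenvalues sorted from largest to smallest
    matrix symeigen evecs evals = R
    scalar lambda_`c' = evals[1,1]
}

* T before the merge (two clusters) and after the merge (one cluster)
scalar T_before = lambda_clusterA + lambda_clusterB
scalar T_after  = lambda_merged
display "T before merging: " T_before
display "T after merging:  " T_after
display "Decrease in T:    " T_before - T_after
```

Under these assumptions, the hierarchical step compares this decrease across all candidate merges and keeps the smallest one.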
To install the module, type "findit clv" or "ssc install clv" directly in Stata.
Syntax (version 2.14)
clv [varlist] [if expr] [in range] [weight] [, nostandardized bar consolidation(#) nodendro cutnumber(#) deltaT horizontal showcount abbrev(#) title(string) caption(string) kernel(numlist) method(string) nobiplot addvar genlv(string) replace textsize(string) savedendro(filename[,replace]) std dim(string) ]
If no varlist is specified, the procedure reuses the varlist from the previous clv call but does not rerun the hierarchical cluster analysis.
Only fweights are allowed. The biplots are disabled if weights are used.
Individuals with one or more missing values are omitted.
With the polychoric and polychoricv2 methods, the nostandardized option is disabled.
This module uses the following modules downloadable on SSC: polychoric, biplotvlab and genscore.
The author thanks Ronan Conroy for his suggestions for improvement.
- nostandardized: uses centered variables instead of standardized variables
- bar: displays a chart of the decrease in the T criterion at each step
- consolidation(#): performs a consolidation procedure with the obtained partition into the specified number of clusters (by default, no consolidation procedure is performed)
- nodendro: suppresses the display of the dendrogram
- cutnumber(#): limits the dendrogram to the specified number of clusters
- deltaT: uses the variation of the T criterion as height variable for the dendrogram
- horizontal: displays a horizontal (instead of vertical) dendrogram
- showcount: displays the number of variables in each cluster (useful with the cutnumber option)
- abbrev(#): defines the length of the variable labels on the dendrogram (15 characters by default)
- title(string): defines the title of the dendrogram
- caption(string): defines the caption of the dendrogram axis that shows the variable names
- kernel(numlist): defines one or several kernels of variables (variables which are clustered together in an initial step). The first number indicates that the first variables are clustered together, the second number indicates that the following variables are clustered together...
- method(string): selects the clustering method: classical (the default) for the method described by Vigneau and Qannari; polychoric to use the matrix of polychoric correlation coefficients instead of Pearson correlation coefficients; v2 for a modified algorithm that seeks to minimize the maximum second eigenvalue among the clusters of two or more variables; polychoricv2, which corresponds to the v2 option applied to the matrix of polychoric correlation coefficients; and centroid, defined by Vigneau and Qannari as an adaptation of CLV for cases where the sign of the correlation coefficients between the variables is important.
- nobiplot: suppresses the biplot of the latent variables that is displayed with the consolidation option
- addvar: adds the variables to the biplot of the latent variables (only with the consolidation option)
- genlv(string): saves the latent variables in new variables with the string as prefix (followed by a number). This option must be used in conjunction with the consolidation option.
- replace: allows replacing the variables created by the genlv option if they already exist.
- textsize(string): defines the size of the labels of the variables on the dendrogram (see help textsizestyle).
- savedendro(filename[,replace]): saves the dendrogram in the file defined by this option. If this file already exists, it is possible to replace it with the replace option.
- std: allows standardizing the latent variables for the graphical representation on the biplot.
- dim(string): allows choosing the axes represented on the biplot.
clv var1-var15, cons(6) bar nodendro
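Beyond the call above, the following sketch combines several of the documented options. It is a hedged illustration only: the auto dataset, the choice of variables, the local macro name vars and the prefix lv are assumptions made for the example, and the clv module and the modules it depends on are assumed to be installed.

```stata
* Illustrative data shipped with Stata (not part of the clv documentation)
sysuse auto, clear
local vars weight length displacement price mpg turn headroom trunk gear_ratio

* kernel(3 2): the first three variables form one initial cluster and the
* next two form another; the horizontal dendrogram is cut at 4 clusters
* and shows the number of variables in each cluster
clv `vars', kernel(3 2) cutnumber(4) showcount horizontal

* No varlist: reuses the previous varlist without rerunning the hierarchical
* analysis, consolidates a 3-cluster partition, and saves the latent
* variables under the prefix lv (replaced if they already exist)
clv, consolidation(3) genlv(lv) replace nobiplot

* Same variables with the modified v2 algorithm; only the bar chart of the
* decrease in the T criterion is displayed
clv `vars', method(v2) bar nodendro
```

The second call relies on the documented behaviour of clv when no varlist is given, so it must follow a call that specified one.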