A Topological Clustering of Variables - Equipe de Recherche en Ingénierie des Connaissances
Book chapter Year: 2022

A Topological Clustering of Variables

Abstract

Clustering objects (individuals or variables) is one of the most widely used approaches to exploring multivariate data. The two most common unsupervised clustering strategies, hierarchical ascending clustering (HAC) and k-means partitioning, identify groups of similar objects in order to divide a dataset into homogeneous groups. The proposed topological clustering of variables, called TCV, studies a homogeneous set of variables defined on the same set of individuals, based on the notion of neighborhood graphs. Some of these variables are more or less correlated or linked, depending on whether they are quantitative or qualitative. This topological data analysis approach can therefore be useful for dimension reduction and variable selection. It is a topological hierarchical clustering analysis of a set of variables, which can be quantitative, qualitative or a mixture of both. It arranges variables into homogeneous groups according to their correlations or associations, studied in a topological context of principal component analysis (PCA) or multiple correspondence analysis (MCA). The proposed TCV is adapted to the type of data considered. Its principle is presented and illustrated using simple real datasets with quantitative, qualitative and mixed variables. The results of these illustrative examples are compared to those of other variable clustering approaches.
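
For orientation only, the idea of grouping variables by their correlations can be illustrated with a conventional, non-topological baseline: an average-linkage HAC of quantitative variables using 1 - |r| as the dissimilarity. This is not the TCV method, which relies on neighborhood graphs. The function names `pearson` and `cluster_variables` and the toy data are assumptions made up for this sketch.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cluster_variables(data, n_clusters):
    """Average-linkage HAC of variables with dissimilarity d = 1 - |r|.

    `data` maps a variable name to its values (one value per individual).
    Returns a list of sets of variable names.
    """
    names = list(data)
    dist = {
        frozenset((a, b)): 1.0 - abs(pearson(data[a], data[b]))
        for a, b in combinations(names, 2)
    }
    clusters = [{name} for name in names]

    def linkage(c1, c2):
        # Average pairwise dissimilarity between the members of two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find and merge the two closest clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Toy data (invented for illustration): 6 individuals, 4 quantitative variables.
toy = {
    "height": [150, 160, 170, 180, 190, 200],
    "weight": [52, 58, 69, 80, 88, 101],
    "score_a": [9, 3, 7, 1, 8, 4],
    "score_b": [18, 7, 13, 3, 17, 9],
}
groups = cluster_variables(toy, n_clusters=2)
```

On this toy data the two strongly correlated pairs end up in separate groups. Qualitative or mixed variables would need an association measure other than Pearson's r, which this sketch does not cover.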
Main file
TCV-Chapter.pdf (1.28 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03828789, version 1 (25-10-2022)

Identifiers

  • HAL Id: hal-03828789, version 1

Cite

Rafik Abdesselam. A Topological Clustering of Variables. Data Analysis and Related Applications 2, Multivariate, Health and Demographic Data Analysis, 10, 2022, Big Data, Artificial Intelligence and Data Analysis SET, 9781786307729. ⟨hal-03828789⟩
30 Views
104 Downloads