Many-view Clustering: An Illustration Using Multiple Dissimilarity Measures

Abstract

Multi-view problems generalize standard machine learning problems to situations in which data entities are described from multiple different perspectives, a situation that arises in many applications due to the consideration of multiple data sources or multiple metrics of dissimilarity between entities. Multi-view algorithms for data clustering offer the opportunity to fully consider and integrate this information during the clustering process, but current algorithms are often limited to the use of two views.

Here, we describe the design of an evolutionary algorithm for the problem of multi-view data clustering. The use of a many-objective evolutionary algorithm addresses limitations of previous work, as the resulting method should be capable of scaling to settings with four or more views. We evaluate the performance of our proposed algorithm on a set of traditional benchmark datasets, where multiple views are derived using distinct measures of dissimilarity. Our results demonstrate the ability of our method to deal effectively with a many-view setting, as well as the performance boost obtained from integrating complementary measures of dissimilarity, for both synthetic and real-world datasets.
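To make the idea of views derived from distinct dissimilarity measures concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes three illustrative measures (Euclidean, Manhattan, cosine) and a hypothetical per-view objective (the sum of within-cluster dissimilarities); the abstract specifies neither the measures nor the objectives actually used. All function names and the toy data are invented for illustration.

```python
import math

# Three illustrative dissimilarity measures (assumed here; the abstract does
# not say which measures the paper uses).
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_dissimilarity(a, b):
    # Assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def dissimilarity_matrix(points, measure):
    # One "view" of the data: all pairwise dissimilarities under one measure.
    n = len(points)
    return [[measure(points[i], points[j]) for j in range(n)] for i in range(n)]

def within_cluster_dissimilarity(matrix, labels):
    # Hypothetical per-view objective: sum of dissimilarities between pairs
    # of points assigned to the same cluster (lower is better).
    total = 0.0
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == labels[j]:
                total += matrix[i][j]
    return total

# Toy dataset with two visually obvious groups.
points = [(1.0, 1.0), (1.0, 2.0), (5.0, 5.0), (6.0, 5.0)]
measures = {
    "euclidean": euclidean,
    "manhattan": manhattan,
    "cosine": cosine_dissimilarity,
}
views = {name: dissimilarity_matrix(points, m) for name, m in measures.items()}

def objective_vector(labels):
    # One objective value per view; a many-objective optimizer would compare
    # candidate partitions on this whole vector rather than on a single score.
    return [within_cluster_dissimilarity(views[name], labels) for name in measures]

good = objective_vector([0, 0, 1, 1])  # groups nearby points together
bad = objective_vector([0, 1, 0, 1])   # mixes the two groups
```

In this toy case the first partition scores lower than the second in every view. When views disagree, no single partition is best on all objectives, which is the situation a many-objective evolutionary algorithm is meant to handle by searching for trade-off solutions.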

Publication
In GECCO ’19: Proceedings of the Genetic and Evolutionary Computation Conference Companion
Adán JOSÉ-GARCÍA
Research Fellow in Digital Health
Julia HANDL
Professor in Decision Sciences
Wilfrido GÓMEZ-FLORES
Associate Professor in Machine Learning
Mario GARZA-FABRE
Associate Professor in Computer Science
