Many-view Clustering: An Illustration Using Multiple Dissimilarity Measures

Adán JOSÉ-GARCÍA, Julia HANDL, Wilfrido GÓMEZ-FLORES, Mario GARZA-FABRE

July 2019

Abstract

Multi-view problems generalize standard machine learning problems to situations in which data entities are described from multiple different perspectives, a situation that arises in many applications due to the consideration of multiple data sources or multiple metrics of dissimilarity between entities. Multi-view algorithms for data clustering offer the opportunity to fully consider and integrate this information during the clustering process, but current algorithms are often limited to the use of two views.

Here, we describe the design of an evolutionary algorithm for the problem of multi-view data clustering. The use of a many-objective evolutionary algorithm addresses limitations of previous work, as the resulting method should be capable of scaling to settings with four or more views. We evaluate the performance of our proposed algorithm on a set of traditional benchmark datasets, where multiple views are derived using distinct measures of dissimilarity. Our results demonstrate that our method handles a many-view setting effectively, and that integrating complementary measures of dissimilarity boosts performance on both synthetic and real-world datasets.
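To make the idea of deriving views from distinct dissimilarity measures concrete, here is a minimal sketch. It is not the algorithm from the paper. The three measures (Euclidean, Manhattan, cosine), the within-cluster criterion, and the function names `dissimilarity_views`, `within_cluster_dissimilarity`, and `objective_vector` are all assumptions chosen for illustration. The sketch only shows how one candidate partition could be scored once per view, giving the kind of objective vector a many-objective optimizer might compare.

```python
import numpy as np

def dissimilarity_views(X):
    """Build one pairwise dissimilarity matrix per view (illustrative choice of measures)."""
    diff = X[:, None, :] - X[None, :, :]           # shape (n, n, d) via broadcasting
    euclidean = np.sqrt((diff ** 2).sum(axis=-1))
    manhattan = np.abs(diff).sum(axis=-1)
    norms = np.linalg.norm(X, axis=1)              # assumes no all-zero rows
    cosine = 1.0 - (X @ X.T) / np.outer(norms, norms)
    return {"euclidean": euclidean, "manhattan": manhattan, "cosine": cosine}

def within_cluster_dissimilarity(D, labels):
    """Mean dissimilarity over pairs of distinct points in the same cluster (lower is better)."""
    same = labels[:, None] == labels[None, :]
    np.fill_diagonal(same, False)                  # ignore self-pairs
    return D[same].mean()

def objective_vector(views, labels):
    """Score one partition under every view: one objective per view."""
    return np.array([within_cluster_dissimilarity(D, labels) for D in views.values()])

# Toy data: two compact groups (hypothetical, for demonstration only).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=(1.0, 1.0), scale=0.1, size=(20, 2)),
    rng.normal(loc=(5.0, 1.0), scale=0.1, size=(20, 2)),
])

views = dissimilarity_views(X)
good = np.repeat([0, 1], 20)   # partition that matches the two groups
bad = np.tile([0, 1], 20)      # partition that mixes the two groups

f_good = objective_vector(views, good)
f_bad = objective_vector(views, bad)
print(dict(zip(views, f_good)))
print(dict(zip(views, f_bad)))
```

On this toy data the partition that matches the groups scores lower than the mixed one in every view. When views disagree, no single partition is best under all of them, and that trade-off is what a many-objective search explores.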

Type

Conference paper

Publication

In GECCO '19: Proceedings of the Genetic and Evolutionary Computation Conference Companion

Adán JOSÉ-GARCÍA

Research Fellow in Digital Health

Julia HANDL

Professor in Decision Sciences

Wilfrido GÓMEZ-FLORES

Associate Professor in Machine Learning

Mario GARZA-FABRE

Associate Professor in Computer Science
