Automatic Clustering Algorithms: A Systematic Review and Bibliometric Analysis of Relevant Literature

Abstract

Cluster analysis is an essential tool in data mining. Several clustering algorithms have been proposed and implemented, most of which are able to find good-quality clustering results. However, the majority of traditional clustering algorithms, such as K-means, K-medoids, and Chameleon, still require the number of clusters to be provided a priori and may struggle with problems where that number is unknown. This lack of vital information may impose additional computational burdens or requirements on the relevant clustering algorithms. In real-world data clustering problems, the number of clusters cannot easily be identified in advance, so determining the optimal number of clusters for a dataset of high density and dimensionality is a difficult task. Therefore, sophisticated automatic clustering techniques are indispensable because of their flexibility and effectiveness.

This paper presents a systematic taxonomical overview and bibliometric analysis of the trends and progress in nature-inspired metaheuristic clustering approaches, from the early attempts in the 1990s to today’s novel solutions. It also covers key issues with the formulation of metaheuristic algorithms as a clustering problem, along with major application areas.

Publication
Neural Computing and Applications
Adán JOSÉ-GARCÍA
Research Fellow in Digital Health
