AN ENSEMBLE MODEL APPROACH FOR MANY-FEATURE DATA CLUSTERING

Authors

  • Pham Van Nha Hanoi

Keywords:

Big data, classification, clustering, ensemble model, feature reduction, many-feature

Abstract

Big data processing is attracting growing attention from researchers in the context of the Fourth Industrial Revolution. A fundamental property of big data is that each record has many features. Handling such data requires powerful tools for knowledge discovery. In this paper, we propose a clustering model for many-feature data that uses advanced machine learning techniques. We call it the ensemble feature-reduction clustering (EFRC) model. The EFRC model consists of three stages. First, the number of features is reduced using random projection. Then, the data are divided into subsets based on a quantification of their potential for noise and overlap, and different clustering techniques are used to cluster the subsets. Finally, the results of the clustering modules are combined into a consensus using a classification technique, which produces the final clustering result. Experiments were conducted on benchmark datasets. The experimental results show that the EFRC model outperforms previous models.
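
The three stages can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the abstract does not specify the splitting rule, the clustering techniques, or the classifier, so everything below is an assumption made for illustration. Stage 1 uses a Gaussian random projection. Stage 2 is replaced by a hypothetical distance-ratio rule (threshold `tau`) applied after a plain k-means run. Stage 3 is replaced by a nearest-centroid classifier trained only on the points judged "clear". The function name `efrc_sketch` and all parameter names are likewise hypothetical.

```python
import math
import random


def random_projection(X, k, rng):
    """Stage 1: map n x d rows to k dimensions with a Gaussian random matrix."""
    d = len(X[0])
    R = [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(k)] for _ in range(d)]
    return [[sum(x[i] * R[i][j] for i in range(d)) for j in range(k)] for x in X]


def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]


def nearest(x, centers):
    """Index of the centre closest to x."""
    return min(range(len(centers)), key=lambda j: sqdist(x, centers[j]))


def kmeans(X, c, iters=20):
    """Plain k-means with deterministic farthest-first initialisation."""
    centers = [X[0]]
    while len(centers) < c:
        centers.append(max(X, key=lambda x: min(sqdist(x, m) for m in centers)))
    for _ in range(iters):
        labels = [nearest(x, centers) for x in X]
        for j in range(c):
            members = [x for x, lab in zip(X, labels) if lab == j]
            if members:  # keep the old centre if a cluster becomes empty
                centers[j] = centroid(members)
    return [nearest(x, centers) for x in X], centers


def efrc_sketch(X, c, k, tau=0.5, seed=0):
    """Illustrative three-stage pipeline; requires c >= 2."""
    rng = random.Random(seed)
    Z = random_projection(X, k, rng)
    labels, centers = kmeans(Z, c)

    # Stage 2 (assumed rule): a point is "clear" when its nearest centre is
    # much closer than its second nearest; otherwise it is treated as overlap.
    clear = []
    for z in Z:
        d = sorted(math.sqrt(sqdist(z, m)) for m in centers)
        clear.append(d[0] <= tau * d[1])

    # Stage 3 (assumed stand-in): fit a nearest-centroid classifier on the
    # clear points only, then use it to label the overlapping points.
    trained = []
    for j in range(c):
        members = [z for z, lab, ok in zip(Z, labels, clear) if ok and lab == j]
        trained.append(centroid(members) if members else centers[j])
    return [lab if ok else nearest(z, trained)
            for z, lab, ok in zip(Z, labels, clear)]
```

Calling `efrc_sketch(X, c=2, k=5)` on a list of equal-length feature vectors returns one cluster label per row. In this sketch a smaller `tau` sends more points to the classifier stage, and a larger `k` preserves pairwise distances more faithfully at higher cost.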

Published

2021-09-30