Towards automated configuration of stream clustering algorithms

Rights

This is a post-peer-review, pre-copyedit version of an article published in Proceedings of ECML PKDD: Joint European Conference on Machine Learning and Knowledge Discovery in Databases. The final authenticated version is available online at: https://doi.org/10.1007/978-3-030-43823-4_12

Abstract

Clustering is an important technique in data analysis which can reveal hidden patterns and unknown relationships in the data. A common problem in clustering is the proper choice of parameter settings. Automated algorithm configuration addresses this problem by automatically searching for the best parameter settings. In practice, however, many of today’s data sources are data streams, due to the widespread deployment of sensors, the Internet of Things, and (social) media. Stream clustering aims to tackle this challenge by identifying, tracking and updating clusters over time. Unfortunately, none of the existing approaches for automated algorithm configuration are directly applicable to the streaming scenario. In this paper, we explore the possibility of automated algorithm configuration for stream clustering algorithms using an ensemble of different configurations. In initial experiments, we demonstrate that our approach is able to automatically find superior configurations and refine them over time.

Citation

Carnein, M., Trautmann, H., Bifet, A., & Pfahringer, B. (2019). Towards automated configuration of stream clustering algorithms. In P. Cellier & K. Driessens (Eds.), Machine Learning and Knowledge Discovery in Databases: Proc ECML PKDD 2019, Part 1 (Vol. CCIS 1167, pp. 137–143). Cham, Switzerland: Springer. https://doi.org/10.1007/978-3-030-43823-4_12

Publisher

Springer
