On dynamic feature weighting for feature drifting data streams
Barddal, J. P., Gomes, H. M., Enembreck, F., Pfahringer, B., & Bifet, A. (2016). On dynamic feature weighting for feature drifting data streams. In P. Frasconi, N. Landwehr, G. Manco, & J. Vreeken (Eds.), Proceedings of European Conference on Machine Learning and Knowledge Discovery in Databases (Vol. LNAI 9852, pp. 129–144). Cham, Switzerland: Springer. https://doi.org/10.1007/978-3-319-46227-1_9
Permanent Research Commons link: https://hdl.handle.net/10289/11028
The ubiquity of data streams has been encouraging the development of new incremental and adaptive learning algorithms. Data stream learners must be fast and memory-bounded but, above all, tailored to adapt to possible changes in the data distribution, a phenomenon named concept drift. Recently, several works have shown the impact of a so far nearly neglected type of drift: feature drifts. Feature drifts occur whenever a subset of features becomes, or ceases to be, relevant to the learning task. In this paper we (i) provide insights into how the relevance of features can be tracked as a stream progresses according to information-theoretic Symmetrical Uncertainty; and (ii) show how it can be used to boost two learning schemes: Naive Bayesian and k-Nearest Neighbor. Furthermore, we investigate the usage of these two new dynamically weighted learners as prediction models in the leaves of the Hoeffding Adaptive Tree classifier. Results show improvements in accuracy (an average of 10.69% for k-Nearest Neighbor, 6.23% for Naive Bayes and 4.42% for Hoeffding Adaptive Trees) in both synthetic and real-world datasets at the expense of a bounded increase in both memory consumption and processing time.
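As a rough illustration of the relevance measure named in the abstract, the sketch below computes Symmetrical Uncertainty between one discrete feature and the class label using its standard definition, SU(X, Y) = 2 · I(X; Y) / (H(X) + H(Y)). The function names and the batch-style computation over a list of recent instances are assumptions made for illustration; the abstract does not specify how the paper maintains these statistics incrementally or how the resulting weights enter each learner.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def symmetrical_uncertainty(feature, labels):
    """Standard Symmetrical Uncertainty between a discrete feature and labels.

    Returns a value in [0, 1]: 0 means the feature carries no information
    about the label, 1 means each fully determines the other.
    Hypothetical helper, not taken from the paper.
    """
    h_x = entropy(feature)
    h_y = entropy(labels)
    if h_x + h_y == 0.0:
        # Both variables are constant, so there is nothing to share.
        return 0.0
    h_xy = entropy(list(zip(feature, labels)))  # joint entropy H(X, Y)
    info_gain = h_x + h_y - h_xy                # mutual information I(X; Y)
    return 2.0 * info_gain / (h_x + h_y)


# Assumed usage: score each feature over the most recent instances and
# treat the scores as per-feature weights.
labels = [0, 0, 1, 1]
relevant = [0, 0, 1, 1]    # mirrors the label
irrelevant = [0, 1, 0, 1]  # independent of the label
weights = {
    "relevant": symmetrical_uncertainty(relevant, labels),
    "irrelevant": symmetrical_uncertainty(irrelevant, labels),
}
```

Recomputing such scores over a recent portion of the stream is one simple way a feature's weight could rise or fall as its relevance drifts; a weighted k-Nearest Neighbor distance or a weighted Naive Bayes likelihood could then scale each feature's contribution by its score.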
© 2016 Springer International Publishing Switzerland. This is the author's accepted version. The final publication is available at Springer via dx.doi.org/10.1007/978-3-319-46227-1_9