Holmes, G., Kirkby, R. & Bainbridge, D. (2004). Batch-Incremental Learning for Mining Data Streams. Working papers, University of Waikato, Department of Computer Science 2004, Hamilton, New Zealand: University of Waikato.
Permanent Research Commons link: http://hdl.handle.net/10289/1749
The data stream model for data mining places harsh restrictions on a learning algorithm. First, a model must be induced incrementally. Second, processing time for instances must keep up with their speed of arrival. Third, a model may only use a constant amount of memory, and must be ready for prediction at any point in time. We attempt to overcome these restrictions by presenting a data stream classification algorithm where the data is split into a stream of disjoint batches. Single batches of data can be processed one after the other by any standard non-incremental learning algorithm. Our approach uses ensembles of decision trees. These tree ensembles are iteratively merged into a single interpretable model of constant maximal size. Using benchmark datasets, the algorithm is evaluated for accuracy against state-of-the-art algorithms that make use of the entire dataset.
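The batch-incremental idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify how tree ensembles are merged into a single model, so this sketch substitutes a simpler technique, a fixed-size sliding ensemble with majority voting, to show how disjoint batches, a non-incremental base learner, and a constant memory bound fit together. The names `Stump`, `BatchIncrementalClassifier`, `batch_size`, and `max_models` are hypothetical, and the one-split `Stump` is only a stand-in for a real decision tree learner.

```python
from collections import Counter, deque


def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


class Stump:
    """One-split decision tree; a hypothetical stand-in for a real tree learner."""

    def fit(self, X, y):
        # Default: no useful split found, always predict the batch majority.
        self.feature, self.threshold = 0, float("inf")
        self.left = self.right = majority(y)
        best_errors = sum(label != self.left for label in y)
        for f in range(len(X[0])):
            for t in sorted({row[f] for row in X}):
                left = [label for row, label in zip(X, y) if row[f] <= t]
                right = [label for row, label in zip(X, y) if row[f] > t]
                if not left or not right:
                    continue
                l_label, r_label = majority(left), majority(right)
                errors = (sum(v != l_label for v in left)
                          + sum(v != r_label for v in right))
                if errors < best_errors:
                    best_errors = errors
                    self.feature, self.threshold = f, t
                    self.left, self.right = l_label, r_label
        return self

    def predict_one(self, row):
        return self.left if row[self.feature] <= self.threshold else self.right


class BatchIncrementalClassifier:
    """Assumed design: train one model per disjoint batch, keep the newest few."""

    def __init__(self, batch_size=4, max_models=3, learner=Stump):
        self.batch_size = batch_size
        self.learner = learner
        self.buffer_X, self.buffer_y = [], []
        # deque(maxlen=...) drops the oldest model, so memory stays bounded.
        self.models = deque(maxlen=max_models)

    def learn_one(self, row, label):
        self.buffer_X.append(row)
        self.buffer_y.append(label)
        if len(self.buffer_y) == self.batch_size:
            # The batch is complete: hand it to the non-incremental learner.
            self.models.append(self.learner().fit(self.buffer_X, self.buffer_y))
            self.buffer_X, self.buffer_y = [], []

    def predict_one(self, row):
        # Usable at any point in the stream; None until the first batch is done.
        if not self.models:
            return None
        return majority([m.predict_one(row) for m in self.models])


# Toy stream: one numeric feature, label is 1 when the value exceeds 5.
stream = [([x], int(x > 5)) for x in [1, 9, 3, 7, 2, 8, 4, 6, 0, 10, 5, 11]]
clf = BatchIncrementalClassifier(batch_size=4, max_models=3)
for row, label in stream:
    clf.learn_one(row, label)
```

In this sketch memory is bounded by `batch_size` buffered instances plus `max_models` trained models, which loosely mirrors the constant-memory requirement. The sliding ensemble discards old batches entirely, whereas the approach in the abstract merges ensembles into one interpretable model of constant maximal size.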
Department of Computer Science Working Papers
University of Waikato