Pfahringer, B., Holmes, G. & Wang, C. (2004). Millions of random rules. In Proceedings of the Workshop on Advances in Inductive Rule Learning, 15th European Conference on Machine Learning (ECML), Pisa, 2004.
Permanent Research Commons link: http://hdl.handle.net/10289/1490
In this paper we report on work in progress on the induction of vast numbers of almost random rules. This work tries to combine and explore ideas from both Random Forests and Stochastic Discrimination. We describe a fast algorithm for generating almost random rules and study its performance. Rules are generated in such a way that every training example is covered by roughly the same number of rules. The rules usually have a clear majority class among the examples they cover, but they are not constrained by either minimal coverage or minimal purity. A preliminary experimental evaluation indicates promising results for both predictive accuracy and speed of induction, at the expense of large memory consumption and slow prediction. Finally, we discuss various directions for our future research.
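The idea in the abstract can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose details the abstract does not give. It assumes numeric features, conjunctive threshold rules, a hypothetical cap `max_conditions` on rule length, and a simple balancing heuristic: each new rule is built around one of the currently least-covered training examples. All function names are hypothetical.

```python
import random
from collections import Counter


def matches(conditions, x):
    """Return True if example x satisfies every (feature, op, threshold) test."""
    return all((x[f] <= t) if op == "<=" else (x[f] > t) for f, op, t in conditions)


def random_rule(X, y, seed_idx, rng, max_conditions=3):
    """Build one almost-random conjunctive rule that covers example seed_idx."""
    n_features = len(X[0])
    k = rng.randint(1, min(max_conditions, n_features))
    conditions = []
    for f in rng.sample(range(n_features), k):
        # Assumption: thresholds are taken from a randomly chosen training example.
        threshold = rng.choice(X)[f]
        # Pick the direction of the test so that the seed example satisfies it.
        op = "<=" if X[seed_idx][f] <= threshold else ">"
        conditions.append((f, op, threshold))
    covered = [i for i, x in enumerate(X) if matches(conditions, x)]
    # The rule predicts the majority class of the examples it covers;
    # no minimal coverage or purity is enforced.
    majority = Counter(y[i] for i in covered).most_common(1)[0][0]
    return conditions, majority, covered


def generate_rules(X, y, n_rules, seed=0):
    """Generate n_rules rules, seeding each from a least-covered example."""
    rng = random.Random(seed)
    cover_count = [0] * len(X)
    rules = []
    for _ in range(n_rules):
        low = min(cover_count)
        seed_idx = rng.choice([i for i, c in enumerate(cover_count) if c == low])
        conditions, label, covered = random_rule(X, y, seed_idx, rng)
        for i in covered:
            cover_count[i] += 1
        rules.append((conditions, label))
    return rules, cover_count


def predict(rules, x):
    """Unweighted vote over all rules that cover x; None if no rule fires."""
    votes = Counter(label for conditions, label in rules if matches(conditions, x))
    return votes.most_common(1)[0][0] if votes else None
```

Prediction in this sketch scans every rule for each example, which is consistent with the trade-off the abstract reports: induction is cheap, while memory use and prediction time grow with the number of rules. The seeding heuristic only encourages balanced coverage; it does not guarantee it.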
Knowledge Engineering Group