An MDL estimate of the significance of rules
Citation
Export citationCleary, J. G., Legg, S. & Witten, I. H. (1996). An MDL estimate of the significance of rules. (Working paper 96/03). Hamilton, New Zealand: University of Waikato, Department of Computer Science.
Permanent Research Commons link: https://hdl.handle.net/10289/1156
Abstract
This paper proposes a new method for measuring the performance of models-whether decision trees or sets of rules-inferred by machine learning methods. Inspired by the minimum description length (MDL) philosophy and theoretically rooted in information theory, the new method measures the complexity of text data with respect to the model. It has been evaluated on rule sets produced by several different machine learning schemes on a large number of standard data sets. When compared with the usual percentage correct measure, it is shown to agree with it in restricted cases. However, in other more general cases taken from real data sets-for example, when rule sets make multiple or no predictions-it disagrees substantially. It is argued that the MDL measure is more reasonable in these cases and represents a better way of assessing the significance of a rule set's performance. The question of the complexity of the rule set itself is not addressed in the paper.
Date
1996-03Type
Report No.
96/03
Collections
- 1996 Working Papers [32]