Using a permutation test for attribute selection in decision trees

Abstract
Most techniques for attribute selection in decision trees are biased towards attributes with many values, and several ad hoc solutions to this problem have appeared in the machine learning literature. Statistical tests for the existence of an association with a prespecified significance level provide a well-founded basis for addressing the problem. However, many statistical tests are computed from a chi-squared distribution, which is only a valid approximation to the actual distribution in the large-sample case, and this patently does not hold near the leaves of a decision tree. An exception is the class of permutation tests. We describe how permutation tests can be applied to this problem. We choose one such test for further exploration, and give a novel two-stage method for applying it to select attributes in a decision tree. Results on practical datasets compare favourably with other methods that also adopt a pre-pruning strategy.
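As a rough illustration of the general idea only, and not the specific test or the two-stage method the paper proposes, the sketch below estimates a permutation p-value for the association between one nominal attribute and the class. It assumes the Pearson chi-squared statistic as the test statistic and uses random sampling of permutations; the function names, the default of 1000 permutations, and the add-one correction are all choices made for this example.

```python
import random
from collections import Counter


def chi_squared_statistic(attr_values, class_labels):
    """Pearson chi-squared statistic of the attribute-by-class table."""
    n = len(attr_values)
    attr_counts = Counter(attr_values)
    class_counts = Counter(class_labels)
    joint_counts = Counter(zip(attr_values, class_labels))
    stat = 0.0
    for a, n_a in attr_counts.items():
        for c, n_c in class_counts.items():
            expected = n_a * n_c / n
            observed = joint_counts.get((a, c), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def permutation_p_value(attr_values, class_labels, n_permutations=1000, seed=0):
    """Monte Carlo estimate of the permutation p-value (hypothetical helper).

    The class labels are shuffled repeatedly, which breaks any real
    association while keeping both marginal distributions fixed. The
    p-value is the share of shuffles whose statistic is at least as
    large as the one observed, so no large-sample approximation is used.
    """
    rng = random.Random(seed)
    observed = chi_squared_statistic(attr_values, class_labels)
    shuffled = list(class_labels)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point summation order.
        if chi_squared_statistic(attr_values, shuffled) >= observed - 1e-12:
            at_least_as_extreme += 1
    # Add-one correction (an assumption of this sketch) keeps the
    # estimate strictly positive.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Toy data: the attribute separates the two classes perfectly.
attribute = ["a"] * 8 + ["b"] * 8
labels = ["x"] * 8 + ["y"] * 8
p = permutation_p_value(attribute, labels)
```

In a pre-pruning setting, an attribute would be considered for a split only if its p-value falls below the chosen significance level; how the paper combines this with its second stage is not reproduced here.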
Type
Conference Contribution
Citation
Frank, E. & Witten, I. H. (1998). Using a permutation test for attribute selection in decision trees. In Proceedings of the 15th International Conference on Machine Learning, Madison, Wisconsin (pp. 152-160). San Francisco: Morgan Kaufmann Publishers.
Date
1998
Publisher
Morgan Kaufmann Publishers
Rights
This article has been published in Proceedings of the 15th International Conference on Machine Learning, Madison, Wisconsin. ©1998 Morgan Kaufmann.