Publication

Ensembles of balanced nested dichotomies for multi-class problems

Abstract
A system of nested dichotomies is a hierarchical decomposition of a multi-class problem with c classes into c−1 two-class problems and can be represented as a tree structure. Ensembles of randomly generated nested dichotomies have proven to be an effective approach to multi-class learning problems [1]. However, sampling trees by giving each tree equal probability means that the depth of a tree is limited only by the number of classes, and very unbalanced trees can negatively affect runtime. In this paper, we investigate two approaches to building balanced nested dichotomies—class-balanced nested dichotomies and data-balanced nested dichotomies—and evaluate them in the same ensemble setting. Using C4.5 decision trees as the base models, we show that both approaches can reduce runtime with little or no effect on accuracy, especially on problems with many classes. We also investigate the effect of caching models when building ensembles of nested dichotomies.
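The tree structure described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation; it assumes that a class-balanced nested dichotomy is built by shuffling the class labels and splitting them into two halves of (nearly) equal size at every node. The function names `build_class_balanced`, `count_internal`, and `depth` are hypothetical helpers introduced only for this illustration. Each internal node stands for one two-class problem, so a tree over c classes has c−1 internal nodes.

```python
import random


def build_class_balanced(classes, rng):
    """Build a random class-balanced nested dichotomy (illustrative assumption).

    Leaves are single class labels; internal nodes are (left, right) tuples,
    each corresponding to one two-class problem.
    """
    if len(classes) == 1:
        return classes[0]
    shuffled = list(classes)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2  # halves differ in size by at most one class
    left = build_class_balanced(shuffled[:mid], rng)
    right = build_class_balanced(shuffled[mid:], rng)
    return (left, right)


def count_internal(tree):
    """Count internal nodes, i.e. the number of two-class problems."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + count_internal(tree[0]) + count_internal(tree[1])


def depth(tree):
    """Length of the longest path from the root to a leaf."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(tree[0]), depth(tree[1]))


if __name__ == "__main__":
    rng = random.Random(0)
    tree = build_class_balanced(list(range(8)), rng)
    print(tree)
    print("two-class problems:", count_internal(tree))
    print("depth:", depth(tree))
```

Under this splitting rule the depth grows logarithmically with the number of classes, whereas a tree sampled uniformly from all possible nested dichotomies can be as deep as c−1 levels. An ensemble would repeat the construction with different random seeds and train one base model per internal node.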
Type
Conference Contribution
Date
2005-01-01
Publisher
Springer-Verlag Berlin
Rights
This is an author’s accepted version of a conference paper published in the Proceedings of the 9th European Conference on Principles and Practice of Knowledge Discovery in Databases. © 2005 Springer.