Tree-structured multiclass probability estimators

Leathart, Timothy Matthew

Authors

Leathart, Timothy Matthew

Files

thesis.pdf (2.43 MB)

Permanent Link

https://hdl.handle.net/10289/12926

Abstract

Nested dichotomies are used as a method of transforming a multiclass classification problem into a series of binary problems. A binary tree structure is constructed over the label space that recursively splits the set of classes into subsets, and a binary classification model learns to discriminate between the two subsets of classes at each node. Several distinct nested dichotomy structures can be built in an ensemble for superior performance. In this thesis, we introduce two new methods for constructing more accurate nested dichotomies. Random-pair selection is a subset selection method that aims to group similar classes together in a non-deterministic fashion to easily enable the construction of accurate ensembles. Multiple subset evaluation takes this, and other subset selection methods, further by evaluating several different splits and choosing the best performing one. Finally, we also discuss the calibration of the probability estimates produced by nested dichotomies. We observe that nested dichotomies systematically produce under-confident predictions, even if the binary classifiers are well calibrated, and especially when the number of classes is high. Furthermore, substantial performance gains can be made when probability calibration methods are also applied to the internal models.
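
The tree structure described in the abstract can be illustrated with a minimal sketch. It assumes a simple uniformly shuffled random split at each node, which is not the random-pair selection or multiple subset evaluation method of the thesis. The names build_random_dichotomy, leaves, class_probabilities and p_left are hypothetical, and p_left is a stand-in for a trained binary model at a node. The probability of a class is taken as the product of the branch probabilities on the path from the root to that class's leaf.

```python
import random


def build_random_dichotomy(classes, rng):
    """Recursively split a list of class labels into a random binary tree.

    Leaves are class labels (assumed not to be tuples); internal nodes
    are (left_subtree, right_subtree) tuples.
    """
    if len(classes) == 1:
        return classes[0]
    shuffled = list(classes)
    rng.shuffle(shuffled)
    # Choose a cut point so that both subsets are non-empty.
    cut = rng.randint(1, len(shuffled) - 1)
    return (
        build_random_dichotomy(shuffled[:cut], rng),
        build_random_dichotomy(shuffled[cut:], rng),
    )


def leaves(tree):
    """Return the class labels under a subtree."""
    if not isinstance(tree, tuple):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def class_probabilities(tree, p_left, prob=1.0):
    """Multiply branch probabilities along each root-to-leaf path.

    p_left(left_classes, right_classes) is a hypothetical stand-in for
    the binary model at a node; it returns the estimated probability
    that the instance belongs to the left subset, given that it
    belongs to one of the two subsets.
    """
    if not isinstance(tree, tuple):
        return {tree: prob}
    left, right = tree
    p = p_left(leaves(left), leaves(right))
    result = class_probabilities(left, p_left, prob * p)
    result.update(class_probabilities(right, p_left, prob * (1.0 - p)))
    return result


rng = random.Random(0)
tree = build_random_dichotomy(["a", "b", "c", "d"], rng)
# Stand-in model (assumption): every node sends 0.7 of its mass left.
probs = class_probabilities(tree, lambda left, right: 0.7)
```

Because each internal node divides its incoming probability mass between its two children, the resulting estimates sum to one over all classes. An ensemble would repeat this with several differently seeded trees and average the per-class estimates.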

Citation

Leathart, T. M. (2019). Tree-structured multiclass probability estimators (Thesis, Doctor of Philosophy (PhD)). The University of Waikato, Hamilton, New Zealand. Retrieved from https://hdl.handle.net/10289/12926

Type

Thesis

Date

2019

Publisher

The University of Waikato

Degree

Doctor of Philosophy (PhD)

Supervisor

Frank, Eibe
Pfahringer, Bernhard
Holmes, Geoffrey
