A novel two stage scheme utilizing the test set for model selection in text classification

Pfahringer, Bernhard; Reutemann, Peter; Mayo, Michael

Authors

Pfahringer, Bernhard

Reutemann, Peter

Mayo, Michael

Files

A novel two stage scheme utilizing the test set for model selection in text classification.pdf (103.18 KB)

Permanent Link

https://hdl.handle.net/10289/1487

Abstract

Text classification is a natural application domain for semi-supervised learning: labeling documents is expensive, while unlabeled documents are usually abundant. We describe a novel, simple two-stage scheme based on dagging that allows the test set to be used in model selection. The dagging ensemble can also be used by itself in place of the original classifier. We evaluate the performance of a meta classifier that chooses between various base learners and their respective dagging ensembles. The selection process seems to perform robustly, especially when only a small percentage of labels is available for training.

Citation

Pfahringer, B., Reutemann, P., Mayo, M. (2005). A novel two stage scheme utilizing the test set for model selection in text classification. Paper presented at the 18th Australian Joint Conference on Artificial Intelligence, University of Technology, Sydney, Australia, December 5-9, 2005.

Type

Conference Contribution

Date

2005

Publisher

University of Technology, Sydney
