dc.contributor.author: Frank, Eibe [en_US]
dc.contributor.author: Holmes, Geoffrey [en_US]
dc.contributor.author: Kirkby, Richard Brendon [en_US]
dc.contributor.author: Hall, Mark A. [en_US]
dc.date.accessioned: 2008-03-19T04:58:12Z
dc.date.available: 2007-07-22 [en_US]
dc.date.available: 2008-03-19T04:58:12Z
dc.date.issued: 2002-06-01 [en_US]
dc.identifier.citation: Frank, E., Holmes, G., Kirkby, R. & Hall, M. (2002). Racing committees for large datasets. (Working paper series. University of Waikato, Department of Computer Science. No. 03/02/2002). Hamilton, New Zealand: University of Waikato. [en_US]
dc.identifier.uri: https://hdl.handle.net/10289/39
dc.description.abstract: This paper proposes a method for generating classifiers from large datasets by building a committee of simple base classifiers using a standard boosting algorithm. It allows the processing of large datasets even if the underlying base learning algorithm cannot efficiently do so. The basic idea is to split incoming data into chunks and build a committee based on classifiers built from these individual chunks [3]. Our method extends earlier work in two ways: (a) the best chunk size is chosen automatically by racing committees corresponding to different chunk sizes, and (b) the committees are pruned adaptively to keep the size of each individual committee as small as possible without negatively affecting accuracy. This paper shows that choosing an appropriate chunk size automatically is important because the accuracy of the resulting committee can vary significantly with the chunk size. It also shows that pruning is crucial to make the method practical for large datasets in terms of running time and memory requirements. Surprisingly, the results demonstrate that pruning can also improve accuracy. [en_US]
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.publisher: University of Waikato, Department of Computer Science
dc.relation.ispartofseries: Computer Science Working Papers
dc.subject: Machine learning
dc.title: Racing committees for large datasets. [en_US]
dc.type: Working Paper [en_US]
uow.relation.series: 03/02
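
The chunk-and-race idea described in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the names `train_stump`, `Committee`, `race` and `accuracy` are hypothetical, the base learner is a one-feature threshold rule, committee members are combined by plain majority vote rather than by boosting, and the pruning rule (keep a new member only if accuracy on held-out validation data improves) is a simplified stand-in for the adaptive pruning the paper refers to.

```python
from collections import Counter


def train_stump(chunk):
    """Fit a one-feature threshold rule on (x, y) pairs with y in {0, 1}.

    Hypothetical base learner, chosen only to keep the sketch short.
    """
    best = None
    for t in sorted({x for x, _ in chunk}):
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= t else right) != y for x, y in chunk)
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right


def vote(members, x):
    """Majority vote of the committee (assumption: the paper uses boosting)."""
    if not members:
        return 0  # arbitrary default for an empty committee
    return Counter(m(x) for m in members).most_common(1)[0][0]


def accuracy(members, data):
    """Fraction of (x, y) pairs in data that the committee classifies correctly."""
    return sum(vote(members, x) == y for x, y in data) / len(data)


class Committee:
    """One committee per candidate chunk size, grown one chunk at a time."""

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size
        self.buffer = []
        self.members = []

    def update(self, example, validation):
        self.buffer.append(example)
        if len(self.buffer) < self.chunk_size:
            return
        candidate = self.members + [train_stump(self.buffer)]
        self.buffer = []
        # Simplified pruning: discard the new member unless it improves
        # validation accuracy (the first member is always kept).
        if not self.members or (
            accuracy(candidate, validation) > accuracy(self.members, validation)
        ):
            self.members = candidate


def race(stream, validation, chunk_sizes):
    """Feed the same stream to every committee and return the most accurate one."""
    committees = [Committee(size) for size in chunk_sizes]
    for example in stream:
        for committee in committees:
            committee.update(example, validation)
    return max(committees, key=lambda c: accuracy(c.members, validation))
```

Calling `race(stream, validation, [10, 50, 200])` on a list of `(x, y)` pairs returns the `Committee` whose members score best on the validation set; its `chunk_size` attribute is the chunk size selected by the race, and `len(committee.members)` shows how many classifiers survived pruning.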

