Improving on bagging with input smearing

dc.contributor.author: Frank, Eibe
dc.contributor.author: Pfahringer, Bernhard
dc.coverage.spatial: Conference held at Singapore [en_NZ]
dc.date.accessioned: 2008-11-20T22:26:49Z
dc.date.available: 2008-11-20T22:26:49Z
dc.date.issued: 2006
dc.description.abstract: Bagging is an ensemble learning method that has proved to be a useful tool in the arsenal of machine learning practitioners. Commonly applied in conjunction with decision tree learners to build an ensemble of decision trees, it often leads to reduced errors in the predictions when compared to using a single tree. A single tree is built from a training set of size N. Bagging is based on the idea that, ideally, we would like to eliminate the variance due to a particular training set by combining trees built from all training sets of size N. However, in practice, only one training set is available, and bagging simulates this platonic method by sampling with replacement from the original training data to form new training sets. In this paper we pursue the idea of sampling from a kernel density estimator of the underlying distribution to form new training sets, in addition to sampling from the data itself. This can be viewed as “smearing out” the resampled training data to generate new datasets, and the amount of “smear” is controlled by a parameter. We show that the resulting method, called “input smearing”, can lead to improved results when compared to bagging. We present results for both classification and regression problems. [en_US]
dc.identifier.citation: Frank, E. & Pfahringer, B. (2006). Improving on bagging with input smearing. In W.K. Ng, M. Kitsuregawa & J. Li (Eds.), Proceedings of 10th Pacific-Asia Conference, PAKDD, Singapore, April 9-12, 2006 (pp. 97-106). Berlin: Springer. [en_US]
dc.identifier.doi: 10.1007/11731139_14 [en_US]
dc.identifier.uri: https://hdl.handle.net/10289/1430
dc.language.iso: en
dc.publisher: Springer, Berlin [en_US]
dc.relation.isPartOf: Proc 10th Pacific-Asia Conference, Advances in Knowledge Discovery and Data Mining [en_NZ]
dc.relation.uri: http://www.springerlink.com/content/j715903g2t12q724/ [en_US]
dc.source: PAKDD 2006 [en_NZ]
dc.subject: computer science [en_US]
dc.subject: bagging [en_US]
dc.subject: Machine learning
dc.title: Improving on bagging with input smearing [en_US]
dc.type: Conference Contribution [en_US]
pubs.begin-page: 97 [en_NZ]
pubs.elements-id: 17040
pubs.end-page: 106 [en_NZ]
pubs.finish-date: 2006-04-12 [en_NZ]
pubs.place-of-publication: Berlin [en_NZ]
pubs.start-date: 2006-04-09 [en_NZ]
pubs.volume: LNCS 3918 [en_NZ]
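
Illustrative sketch (not taken from the paper): the abstract describes input smearing as sampling with replacement from the training data and then "smearing out" the resampled instances by drawing from a kernel density estimator, with one parameter controlling the amount of smear. The Python below is a minimal sketch of that idea under several assumptions that this record does not confirm: a Gaussian kernel, numeric attributes only, a kernel width equal to the smear parameter times each attribute's standard deviation, and labels that are copied unchanged. The names smeared_bootstrap and smear are hypothetical.

```python
import random
import statistics


def smeared_bootstrap(X, y, smear=0.5, rng=None):
    """Return one resampled, smeared training set (hypothetical helper).

    X is a list of numeric feature lists, y the matching labels or targets.
    Assumption: Gaussian kernel, width = smear * per-attribute std deviation.
    """
    rng = rng or random.Random()
    n_features = len(X[0])
    # Per-attribute standard deviation sets the scale of the kernel.
    stds = [statistics.pstdev([row[j] for row in X]) for j in range(n_features)]
    X_new, y_new = [], []
    for _ in range(len(X)):
        i = rng.randrange(len(X))  # bagging step: sample with replacement
        # Smearing step: perturb every attribute with zero-mean Gaussian noise.
        X_new.append([v + rng.gauss(0.0, smear * s) for v, s in zip(X[i], stds)])
        y_new.append(y[i])  # assumption: the label is carried over unchanged
    return X_new, y_new


if __name__ == "__main__":
    X = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
    y = ["a", "b", "a"]
    X_s, y_s = smeared_bootstrap(X, y, smear=0.5, rng=random.Random(0))
    for features, label in zip(X_s, y_s):
        print(features, label)
```

With smear set to 0 the noise vanishes and the function reduces to an ordinary bootstrap sample, as used in bagging. An ensemble would be built by calling it once per base learner and training each learner on its own smeared sample.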