Learning from batched data: model combination vs data combination

dc.contributor.author: Ting, Kai Ming
dc.contributor.author: Low, Boon Toh
dc.contributor.author: Witten, Ian H.
dc.date.accessioned: 2008-10-20T03:49:44Z
dc.date.available: 2008-10-20T03:49:44Z
dc.date.issued: 1997-05
dc.description.abstract: When presented with multiple batches of data, one can either combine them into a single batch before applying a machine learning procedure or learn from each batch independently and combine the resulting models. The former procedure, data combination, is straightforward; this paper investigates the latter, model combination. Given an appropriate combination method, one might expect model combination to prove superior when the data in each batch was obtained under somewhat different conditions or when different learning algorithms were used on the batches. Empirical results show that model combination often outperforms data combination even when the batches are drawn randomly from a single source of data and the same learning method is used on each. Moreover, this is not just an artifact of one particular method of combining models: it occurs with several different combination methods. We relate this phenomenon to the learning curve of the classifiers being used. Early in the learning process when the learning curve is steep there is much to gain from data combination, but later when it becomes shallow there is less to gain and model combination achieves a greater reduction in variance and hence a lower error rate. The practical implication of these results is that one should consider using model combination rather than data combination, especially when multiple batches of data for the same task are readily available. It is often superior even when the batches are drawn randomly from a single sample, and we expect its advantage to increase if genuine statistical differences between the batches exist. [en_US]
dc.format.mimetype: application/pdf
dc.identifier.citation: Ting, K.M., Low, B.T. & Witten, I.H. (1997). Learning from batched data: model combination vs data combination. (Working paper 97/14). Hamilton, New Zealand: University of Waikato, Department of Computer Science. [en_US]
dc.identifier.issn: 1170-487X
dc.identifier.uri: https://hdl.handle.net/10289/1077
dc.language.iso: en
dc.publisher: Department of Computer Science, University of Waikato [en_NZ]
dc.relation.ispartofseries: Computer Science Working Papers
dc.subject: computer science [en_US]
dc.subject: Machine learning
dc.title: Learning from batched data: model combination vs data combination [en_US]
dc.type: Working Paper [en_US]
pubs.begin-page: 83 [en_NZ]
pubs.elements-id: 54840
pubs.end-page: 106 [en_NZ]
pubs.place-of-publication: Hamilton [en_NZ]
uow.relation.series: 97/14
Files
Original bundle
Name: uow-cs-wp-1997-14.pdf
Size: 4.7 MB
Format: Adobe Portable Document Format