Hitting the target: Stopping active learning at the cost-based optimum

dc.contributor.author: Pullar-Strecker, Zac [en_NZ]
dc.contributor.author: Dost, Katharina [en_NZ]
dc.contributor.author: Frank, Eibe [en_NZ]
dc.contributor.author: Wicker, Jörg [en_NZ]
dc.date.accessioned: 2024-01-30T01:34:04Z
dc.date.available: 2024-01-30T01:34:04Z
dc.date.issued: 2022 [en_NZ]
dc.description.abstract: Active learning allows machine learning models to be trained using fewer labels while retaining similar performance to traditional supervised learning. An active learner selects the most informative data points, requests their labels, and retrains itself. While this approach is promising, it raises the question of how to determine when the model is ‘good enough’ without the additional labels required for traditional evaluation. Previously, different stopping criteria have been proposed aiming to identify the optimal stopping point. Yet, optimality can only be expressed as a domain-dependent trade-off between accuracy and the number of labels, and no criterion is superior in all applications. As a further complication, a comparison of criteria for a particular real-world application would require practitioners to collect additional labelled data they are aiming to avoid by using active learning in the first place. This work enables practitioners to employ active learning by providing actionable recommendations for which stopping criteria are best for a given real-world scenario. We contribute the first large-scale comparison of stopping criteria for pool-based active learning, using a cost measure to quantify the accuracy/label trade-off, public implementations of all stopping criteria we evaluate, and an open-source framework for evaluating stopping criteria. Our research enables practitioners to substantially reduce labelling costs by utilizing the stopping criterion which best suits their domain. [en_NZ]
dc.format.mimetype: application/pdf
dc.identifier.citation: Pullar-Strecker, Z., Dost, K., Frank, E., and Wicker, J. (2024) Hitting the target: Stopping active learning at the cost-based optimum. Machine Learning, 113(4), 1529-1547. https://doi.org/10.1007/s10994-022-06253-1
dc.identifier.doi: 10.1007/s10994-022-06253-1 [en_NZ]
dc.identifier.eissn: 1573-0565 [en_NZ]
dc.identifier.issn: 0885-6125 [en_NZ]
dc.identifier.uri: https://hdl.handle.net/10289/16419
dc.language.iso: en [en_NZ]
dc.publisher: Springer Science and Business Media LLC [en_NZ]
dc.relation.isPartOf: Machine Learning [en_NZ]
dc.relation.uri: https://rdcu.be/cXLPT [en_NZ]
dc.rights: © 2022 The Authors. This work is licensed under a CC BY 4.0 licence.
dc.subject: computer science [en_NZ]
dc.subject: active learning [en_NZ]
dc.subject: stopping criteria [en_NZ]
dc.subject: data labelling [en_NZ]
dc.subject: cost analysis [en_NZ]
dc.title: Hitting the target: Stopping active learning at the cost-based optimum [en_NZ]
dc.type: Journal Article
dspace.entity.type: Publication
pubs.publication-status: Published online [en_NZ]

Files

Original bundle

Name: s10994-022-06253-1.pdf
Size: 1.73 MB
Format: Adobe Portable Document Format
Description: Published version

License bundle

Name: Research Commons Deposit Agreement 2017.pdf
Size: 188.11 KB
Format: Adobe Portable Document Format
Description: