Permanent link to Research Commons version: https://hdl.handle.net/10289/14953
Arguably the most popular application task in artificial intelligence is image classification using transfer learning. Transfer learning enables models pre-trained on general classes of images, available in large numbers, to be refined for a specific application. This enables domain experts with their own—generally, substantially smaller—collections of images to build deep learning models. The good performance of such models poses the question of whether it is possible to further reduce the effort required to label training data by adopting a human-in-the-loop interface that presents the expert with the current predictions of the model on a new batch of data and only requires correction of these predictions—rather than de novo labelling by the expert—before retraining the model on the extended data. This paper looks at how to order the data in this iterative training scheme to achieve the highest model performance while minimising the effort needed to correct misclassified examples. Experiments are conducted with five ordering methods, four image classification datasets, and three popular pre-trained models. Two of the methods we consider order the examples a priori, whereas the other three employ an active learning approach in which the ordering is updated iteratively after each new batch of data and retraining of the model. The main finding is that it is important to consider the accuracy of the model in relation to the number of corrections that are required: using accuracy in relation to the number of labelled training examples—as is common practice in the literature—can be misleading. More specifically, active methods require more cumulative corrections than a priori methods for a given level of accuracy. Within their groups, active and a priori methods perform similarly. Preliminary evidence is provided that suggests that for “simple” problems, i.e., those involving fewer examples and classes, no method improves upon random selection of examples.
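The correction-counting scheme described above can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: the toy two-class Gaussian data, the nearest-centroid "model", the random a priori ordering, and the batch size of 20 are all stand-ins for the real image features, fine-tuned deep networks, and orderings studied in the paper. The point of the sketch is only the accounting: the expert labels a seed batch de novo, and thereafter pays only for predictions the current model gets wrong.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy pool: two Gaussian classes standing in for image features.
X = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
order = rng.permutation(len(X))          # random a priori ordering

def fit_centroids(Xl, yl):
    # Nearest-centroid classifier as a stand-in for the fine-tuned model.
    return {c: Xl[yl == c].mean(axis=0) for c in np.unique(yl)}

def predict(model, Xb):
    classes = sorted(model)
    d = np.stack([((Xb - model[c]) ** 2).sum(axis=1) for c in classes])
    return np.array(classes)[d.argmin(axis=0)]

batch = 20
labelled = order[:batch].tolist()        # seed batch is labelled de novo
model = fit_centroids(X[labelled], y[labelled])
corrections = 0
for start in range(batch, len(order), batch):
    idx = order[start:start + batch]
    preds = predict(model, X[idx])
    corrections += int((preds != y[idx]).sum())   # expert fixes only mistakes
    labelled.extend(idx.tolist())                 # corrected labels join the pool
    model = fit_centroids(X[labelled], y[labelled])

print(f"cumulative corrections: {corrections} of {len(order) - batch} shown")
```

Plotting accuracy against `corrections` rather than against `len(labelled)` is exactly the change of x-axis the abstract argues for.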
For more complex problems, an a priori strategy based on the greedy sample selection method known as “kernel herding” performs best.
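Kernel herding greedily picks points whose kernel mean embedding tracks that of the whole pool: at step t+1 it selects argmax over x of μ(x) − (1/(t+1)) Σᵢ k(x, xᵢ), where μ is the empirical mean map and the xᵢ are the points chosen so far. A compact sketch under assumed choices (RBF kernel, NumPy, a made-up `gamma`); the paper's feature space and kernel may differ.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise RBF kernel between rows of A and rows of B.
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kernel_herding(X, budget, gamma=1.0):
    """Greedily pick `budget` indices whose kernel mean best matches
    the empirical mean embedding of the full pool X."""
    K = rbf_kernel(X, X, gamma)      # full Gram matrix
    mean_map = K.mean(axis=1)        # mu(x_i) = (1/N) sum_j k(x_i, x_j)
    selected = []
    sum_k = np.zeros(len(X))         # running sum of k(., x) over chosen x
    for t in range(budget):
        score = mean_map - sum_k / (t + 1)
        score[selected] = -np.inf    # never pick the same point twice
        i = int(np.argmax(score))
        selected.append(i)
        sum_k += K[:, i]
    return selected
```

Because the scores depend only on the unlabelled pool, the entire ordering can be computed once before any labels exist, which is what makes this an a priori rather than an active method.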
© 2022 The Authors.