Hidden features: Experiments with feature transfer for fine-grained multi-class and one-class image categorization
Vetrova, V., Coup, S., Frank, E., & Cree, M. J. (2018). Hidden features: Experiments with feature transfer for fine-grained multi-class and one-class image categorization. In 2018 International conference on image and vision computing New Zealand (IVCNZ). Auckland, New Zealand: IEEE.
Permanent Research Commons link: https://hdl.handle.net/10289/12793
Can we apply out-of-the-box feature transfer using pre-trained convolutional neural networks in fine-grained multi-class image categorization tasks? What is the effect of (a) domain-specific fine-tuning and (b) a special-purpose network architecture designed and trained specifically for the target domain? How do these approaches perform in one-class classification? We investigate these questions by tackling two biological object recognition tasks: classification of “cryptic” plants of genus Coprosma and identification of New Zealand moth species. We compare results based on out-of-the-box features extracted using a pre-trained state-of-the-art network to those obtained by fine-tuning to the target domain, and also evaluate features learned using a simple Siamese network trained only on data from the target domain. For each extracted feature set, we test a number of classifiers, e.g., support vector machines. In addition to multi-class classification, we also consider one-class classification, a scenario that is particularly relevant to biosecurity applications. In the multi-class setting, we find that out-of-the-box low-level features extracted from the generic pre-trained network yield high accuracy (90.76%) when coupled with a simple LDA classifier. Fine-tuning improves accuracy only slightly (to 91.6%). Interestingly, features extracted from the much simpler Siamese network trained on data from the target domain lead to comparable results (90.8%). In the one-class classification setting, we note high variability in the area under the ROC curve across feature sets, opening up the possibility of considering an ensemble approach.
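The evaluation protocol described above can be illustrated with a minimal scikit-learn sketch. This is not the authors' code: the random vectors below merely stand in for features extracted from a pre-trained or Siamese network (one row per image), and the class counts and feature dimension are arbitrary placeholders. The sketch shows the two settings from the abstract: a multi-class LDA classifier scored by cross-validation, and a one-class model evaluated by area under the ROC curve.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.svm import OneClassSVM
from sklearn.metrics import roc_auc_score

# Placeholder features: in the paper these would be activations taken
# from a pre-trained CNN or a Siamese network, one row per image.
rng = np.random.default_rng(0)
n_classes, n_per_class, n_features = 5, 40, 64
X = np.vstack([
    rng.normal(loc=c, scale=1.0, size=(n_per_class, n_features))
    for c in range(n_classes)
])
y = np.repeat(np.arange(n_classes), n_per_class)

# Multi-class setting: a simple LDA classifier on the extracted
# features, evaluated by cross-validation.
lda_scores = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5)
print("multi-class LDA accuracy:", lda_scores.mean())

# One-class setting: fit only on images of one target class, score
# everything, and report the area under the ROC curve.
oc = OneClassSVM(gamma="scale").fit(X[y == 0])
auc = roc_auc_score((y == 0).astype(int), oc.decision_function(X))
print("one-class ROC AUC:", auc)
```

The same loop would be repeated per feature set (out-of-the-box, fine-tuned, Siamese) to compare accuracies and AUCs as the paper does.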
This is an author’s accepted version of a paper published in the Proceedings: 2018 International conference on image and vision computing New Zealand (IVCNZ). © 2018 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.