Large-scale automatic species identification

The crowd-sourced NatureWatch GBIF dataset is used to obtain a species classification dataset containing approximately 1.2 million photos of nearly 20 thousand different species of biological organisms observed in their natural habitat. We present a general hierarchical species identification system based on deep convolutional neural networks trained on the NatureWatch dataset. The dataset contains images taken under a wide variety of conditions and is heavily imbalanced, with most species associated with only a few images. We apply multi-view classification as a way to lend more influence to high-frequency details, hierarchical fine-tuning to help with class imbalance and provide regularisation, and automatic specificity control for optimising classification depth. Our system achieves 55.8% accuracy when identifying individual species and around 90% accuracy at an average taxonomy depth of 5.1 (equivalent to the taxonomic rank of “family”) when applying automatic specificity control.
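The abstract does not say how automatic specificity control is implemented. As a rough illustration of the general idea only, the sketch below assumes one plausible reading: species-level probabilities are summed up the taxonomy, and the prediction is the deepest taxon whose accumulated probability reaches a chosen threshold. The function name `predict_with_specificity`, the `threshold` parameter, the seven-level lineage, and the probabilities are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def predict_with_specificity(
    leaf_probs: Dict[str, float],
    lineage: Dict[str, List[str]],
    threshold: float = 0.9,
) -> Tuple[str, int, float]:
    """Return (taxon, depth, confidence) for the deepest taxon whose
    summed species probability is at least `threshold`.

    Assumption: `lineage[species]` lists taxa from the top rank down to
    the species itself, so depth 1 is the top rank.
    """
    # Accumulate each species probability into all of its ancestors.
    node_prob: Dict[Tuple[int, str], float] = {}
    for species, p in leaf_probs.items():
        for depth, taxon in enumerate(lineage[species], start=1):
            key = (depth, taxon)
            node_prob[key] = node_prob.get(key, 0.0) + p

    # Keep taxa that are confident enough, then take the deepest one
    # (ties at the same depth are broken by higher probability).
    candidates = [
        (depth, p, taxon)
        for (depth, taxon), p in node_prob.items()
        if p >= threshold
    ]
    if not candidates:
        # Nothing is confident enough: fall back to an unspecific answer.
        return ("root", 0, 1.0)
    depth, p, taxon = max(candidates)
    return (taxon, depth, p)


# Invented example: kingdom, phylum, class, order, family, genus, species.
_hymenoptera = ["Animalia", "Arthropoda", "Insecta", "Hymenoptera"]
lineage = {
    "Apis mellifera": _hymenoptera + ["Apidae", "Apis", "Apis mellifera"],
    "Bombus terrestris": _hymenoptera + ["Apidae", "Bombus", "Bombus terrestris"],
    "Vespula vulgaris": _hymenoptera + ["Vespidae", "Vespula", "Vespula vulgaris"],
}
# Hypothetical classifier output over the three species.
leaf_probs = {
    "Apis mellifera": 0.55,
    "Bombus terrestris": 0.38,
    "Vespula vulgaris": 0.07,
}

result = predict_with_specificity(leaf_probs, lineage, threshold=0.9)
print(result)
```

In this invented example no single species reaches the 0.9 threshold, so the prediction backs off to the family "Apidae" at depth 5. Lowering the threshold to 0.5 would instead return the species "Apis mellifera". Under this reading, the threshold trades specificity against accuracy, which is the kind of trade-off the abstract reports.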

Citation

Mo, J., Frank, E., & Vetrova, V. (2017). Large-scale automatic species identification. In W. Peng, D. Alahakoon, & X. Li (Eds.), Proceedings of 30th Australasian Joint Conference on Advances in Artificial Intelligence (Vol. LNCS 10400, pp. 301–312). Cham: Springer. https://doi.org/10.1007/978-3-319-63004-5_24

Type

Conference Contribution

Date

2017

Publisher

Springer
