Furthering deep learning in near-infrared spectroscopy for fruit quality assessment

dc.contributor.advisor: Holmes, Geoffrey
dc.contributor.advisor: Frank, Eibe
dc.contributor.advisor: McGlone, Andrew
dc.contributor.author: Wohlers, Mark
dc.date.accessioned: 2026-02-12T20:22:47Z
dc.date.available: 2026-02-12T20:22:47Z
dc.date.issued: 2026
dc.description.abstract: Near-infrared (NIR) spectroscopy is widely used to assess fruit quality in the horticulture industry. It enables non-destructive estimation of key fruit quality measures from spectra, including dry matter content (associated with taste) and soluble solids content (associated with ripeness). Traditionally, partial least squares regression (PLSR) has been the dominant modelling method. More recently, deep learning (DL) has shown promise due to its ability to learn features automatically and model non-linear patterns. However, DL faces several challenges when applied to NIR. Labelled datasets of the size required to fit these models are complicated, expensive, and time-consuming to obtain. Deciding on the appropriate architecture and hyperparameters can also be challenging when validation data is sparse. Additionally, a problem of great practical importance in NIR spectroscopy is the difficulty of generalising across different devices of the same model or across different conditions, such as temperature. This thesis addresses these challenges through three complementary methods. The first is a data augmentation technique that samples from a multivariate normal distribution with a covariance matrix designed to simulate the spectral differences observed across devices. The experiments investigate whether the augmentation improves generalisability and training with small sample sizes. The second method is a metric based on model stability to diffeomorphic transformations relative to uncorrelated perturbations of similar magnitude. The experiment evaluates the appropriateness of this metric for model selection tasks and compares its performance with standard validation methods. The third method adapts the Barlow Twins contrastive learning method to enable semi-supervised learning in the NIR setting. The Barlow Twins loss function allows unlabelled data to compensate when labelled data is scarce. This method also improves generalisability by encouraging multiple measurements of the same fruit to be similar in the encoded latent space. The methods are evaluated on two datasets: a new dataset containing 5418 kiwifruit sampled across five devices and three seasons, and a previously published dataset of 4675 mangoes measured across four seasons. The results show that the methods improve predictive performance, especially for small labelled datasets and calibration transfer problems. This makes deep learning easier to apply to NIR spectroscopy by reducing the requirements for labelled data, improving model generalisability across devices, and enabling model selection under data constraints.
dc.identifier.uri: https://hdl.handle.net/10289/17928
dc.language.iso: en
dc.publisher: The University of Waikato [en_NZ]
dc.relation.doi: https://doi.org/10.1016/j.chemolab.2023.104924
dc.relation.doi: https://doi.org/10.1016/j.chemolab.2025.105449
dc.rights: All items in Research Commons are provided for private study and research purposes and are protected by copyright with all rights reserved unless otherwise indicated. [en_NZ]
dc.title: Furthering deep learning in near-infrared spectroscopy for fruit quality assessment
dc.type: Thesis [en]
dspace.entity.type: Publication
pubs.place-of-publication: Hamilton, New Zealand [en_NZ]
thesis.degree.grantor: The University of Waikato [en_NZ]
thesis.degree.level: Doctoral [en]
thesis.degree.name: Doctor of Philosophy (PhD)
uow.thesis.type: Thesis with publication

Files

Original bundle

Name: thesis.pdf
Size: 18.95 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.58 KB
Format: Item-specific license agreed upon to submission