Holmes, Geoffrey; Frank, Eibe; McGlone, Andrew; Wohlers, Mark
2026-02-12
2026-02-12
2026
https://hdl.handle.net/10289/17928
Near-infrared (NIR) spectroscopy is widely used to assess fruit quality in the horticulture industry. It enables non-destructive estimation of key fruit quality measures from spectra, including dry matter content (associated with taste) and soluble solids content (associated with ripeness). Traditionally, partial least squares regression (PLSR) has been the dominant modelling method. More recently, deep learning (DL) has shown promise due to its ability to learn features automatically and model non-linear patterns. However, DL faces several challenges when applied to NIR spectroscopy. Labelled datasets of the size required to fit these models are complicated, expensive, and time-consuming to obtain. Deciding on the appropriate architecture and hyperparameters can also be challenging when validation data are sparse. Additionally, a problem of great practical importance in NIR spectroscopy is the difficulty of generalising across different devices of the same model or under different conditions, such as temperature. This thesis addresses these challenges through three complementary methods. The first uses a data augmentation technique that samples from a multivariate normal distribution with a covariance matrix designed to simulate spectral differences observed across devices. The experiments investigate whether the augmentation improves generalisability and training with small sample sizes. The second method is a metric based on model stability to diffeomorphic transformations relative to uncorrelated perturbations of similar magnitude. The experiment evaluates the appropriateness of this metric for model selection tasks and compares its performance with standard validation methods. The third method adapts the Barlow Twins contrastive learning method to enable semi-supervised learning in the NIR setting.
The Barlow Twins loss function allows unlabelled data to compensate when labelled data is scarce. This method also improves generalisability by encouraging multiple measurements on the same fruit to be similar in the encoded latent space. Evaluation of these methods is conducted on two datasets: a new dataset containing 5418 kiwifruit sampled across five devices and three seasons, and a previously published dataset of 4675 mangoes measured across four seasons. The results show that the methods improve predictive performance, especially for small labelled datasets and calibration transfer problems. This allows for the easier application of deep learning to NIR spectroscopy by reducing the requirements for labelled data, improving model generalisability across devices, and enabling model selection under data constraints.
en
All items in Research Commons are provided for private study and research purposes and are protected by copyright with all rights reserved unless otherwise indicated.
Furthering deep learning in near-infrared spectroscopy for fruit quality assessment
Thesis