Computing and Mathematical Sciences Papers
Permanent URI for this collection: https://researchcommons.waikato.ac.nz/handle/10289/6
This collection houses research from the School of Computing and Mathematical Sciences at the University of Waikato.
Recent Submissions
Item type: Item
Fostering community empowerment: A human-centered approach to designing clean water solutions in a Jakarta slum (SAGE, 2024)
Lubis, Pierre Yohanes; Shahri, Bahareh; Ramirez, Mariano
This article presents a case study of the application of human-centered design (HCD) as a codesign approach to address complex problems in slum communities in Jakarta, Indonesia. Through a review of relevant literature, we examine how the HCD methodology embraces a participatory framework but retains a certain degree of control not found in pure participatory approaches. We explain why HCD was selected for this study and describe the methods used, including sort cards, solution cards, in-depth interviews, focus group discussions, and product usability interviews. These methods were employed to generate a solution that addressed the issue of sourcing clean water in Jakarta’s slums, which was then prototyped, tested, and implemented. The study contributes to the development of a cohesive and applicable methodology by integrating codesign and HCD in designing solutions for people at the Base of the Pyramid.

Item type: Item
Exploring student perceptions of product-service systems (Routledge, 2025)
Lubis, Pierre Yohanes; Shahri, Bahareh
It is important that Product Design students understand the scope of sustainable design. The purpose of this study is to pilot the introduction of Product-Service System (PSS) Design to undergraduate students in Industrial Product Design. The study aims to test students' perceptions of how the Product Design field is evolving to embrace sustainable approaches and address the needs of less-privileged communities. This chapter highlights how the transformation from product design to PSS Design is understood by students and how it can count towards sustainability and be preferred over product design.
The advantages of a PSS over the design of products alone can be backed by the concept of sustainability when contextual issues such as cultural and economic factors are being considered. PSS was received as an appropriate solution because it provides job creation in communities, reduces consumerism, and, at the same time, is perceived to challenge the concept of ownership of products and services.

Item type: Item
Re-evaluating fatigue measurement: A comparative study of subjective and objective fatigue tests (Elsevier BV, 2025)
König, Jemma Lynette; Bowen, Judy; Hinze, Annika
Fatigue poses a significant risk in hazardous industries, with forestry being a particularly under-examined domain. Despite the availability of subjective and objective fatigue tests, inconsistencies in their application, selection rationale, and performance remain largely unaddressed in existing literature. This paper investigates the utility and challenges of subjective and objective fatigue assessments through a review of existing literature and two case studies: one intensive longitudinal study with a single participant and another broader study involving 31 participants. Our results reveal strong internal consistency across subjective tests but variable outcomes for objective tests, raising questions about test sensitivity and context-specific reliability. We argue for clearer guidance on fatigue test selection and propose criteria to inform future research in complex, real-world settings like forestry.

Item type: Item
HfG-Archiv Ulm online: From exclusive reality to inclusive virtuality (Mediathek HGK FHNW, 2021)
Short, Carolina; García Ferrari, Tomás; Quijano, Marcela
The Archive of the Ulm School of Design (HfG-Archiv Ulm) began operating in 1987. It was created as a joint effort between the city council and a group of alumni that saw the necessity of preserving the institution’s memory and legacy after its closure. The first version of a website for the Archive was envisioned in 1999.
Its goal was to present information on the HfG Ulm, display the Archive collection, and communicate related events to a massive audience. The HfG-Archiv Ulm website maintained the same structure and interface for almost 20 years. Over time, it became an archive in its own right. The virtual components acted as extensions of the tangible and intangible objects stored in the physical archive. Over the years of its existence, the website accomplished the mission of collecting and storing the Archive’s material and activities. At the same time, it was an instrument for research, education, and exposure for the Ulm School of Design. The project served as a communication tool for the Archive and became an archive of activities, events, publications and updates. The WWW was not conceived as a medium to preserve information, but it could work as such. In addition, the universal access of a website grants the possibility of reaching a physical place in Germany, achieving Winograd’s locomotion metaphor. We speak of navigating from one site to another, touching and following links: all metaphors of spatial locomotion that engage people, opening new ways of thinking, learning, and doing. As technology changes, future work could amplify the experience of visiting the Archive by creating a contemporary virtual model, enhancing the opportunity to expand knowledge and spaces of interaction.

Item type: Item
Automatic species identification from images for Aotearoa (Taylor and Francis Group, 2025)
Wang, Hongyu; Schlumbom, Paul; Frank, Eibe; Vetrova, Varvara; Holmes, Geoffrey; Pfahringer, Bernhard
Image classification for species identification has applications in areas such as conservation and education. Given New Zealand's geographic isolation and the relatively small number of species present on its islands, there is an opportunity to apply machine learning to enable accurate automatic species identification for Aotearoa, even on mobile devices without Internet access.
We present neural network-based image classification models trained to classify organisms present in New Zealand. The data for model development and evaluation, obtained from the crowd-sourcing website iNaturalist, comprises 14,991 species, including 6,216 Animalia, 6,173 Plantae, and 2,407 Fungi species, alongside a small set of observations of Bacteria, Chromista, Protozoa, and Viruses. It contains organisms observed in the natural environment as well as captive and cultivated organisms. The trained models achieve over 76% classification accuracy across all species and produce class probability estimates, calibrated using temperature scaling, that can be used to gauge confidence in their classifications. Input attribution methods can be used to interpret a model's inferences by highlighting its areas of focus on images. The models are available to the public as downloadable model files and as part of both web and mobile applications for species identification that are distributed as open-source software.

Item type: Item
The Google Translator Toolkit and minority languages case study: Translating Moodle 2.0 into te reo Māori (2011)
Manuirirangi, Hōri
This paper describes a case study where the Google Translator Toolkit (GTT) was used to undertake a large translation task involving a minority language. The task involved the translation of 50,000 interface terms of the Moodle learning management system into te reo Māori (the Māori language). The paper begins by describing how some minority languages are not in an environment where technology can be easily used for translations. It then suggests that te reo Māori, however, is in a position to use technology and identifies some technologies that are suitable. The paper then briefly describes the GTT and the translation task that this tool was used for.
The translators' feedback on the use of this technology in this environment is summarised, and it appears that the GTT is suitable for use by minority language translators.

Item type: Item
15 yr of interstellar neutral hydrogen observed with the interstellar boundary explorer (IOP, 2025-05-01)
Galli, A; Swaczyna, P; Bzowski, M; Kubiak, MA; Kowalska-Leszczynska, I; Wurz, P; Rahmanifard, F; Schwadron, NA; Möbius, E; Fuselier, SA; Sokół, JM; Gasser, J; Heerikhuisen, Jacob; McComas, DJ
The interactions of our heliosphere with the surrounding local interstellar medium (LISM) lead to a range of observable phenomena such as energetic neutral atoms (ENAs) from the boundary regions of the heliosphere and the influx of interstellar neutrals (ISNs) into the inner solar system. Hydrogen is the dominant neutral species in the LISM, but due to ionization and radiation pressure, only a fraction of the ISN H atoms reach the inner solar system close to Earth. Monitoring this signal therefore provides observational constraints on our assumptions of the LISM and the solar-activity-dependent loss processes inside the heliosphere. The IBEX-Lo instrument on board the Interstellar Boundary Explorer has been the only instrument so far to measure ISN H atoms directly, together with ISN D, He, Ne, O, and ENAs in the energy range from tens of eV to 2 keV. This study covers 15 yr of IBEX-Lo ISN H observations, i.e., more than one solar cycle, and includes two solar minima, when the ISN H signal in IBEX-Lo is strongest. Despite the very intense ISN He signal, the ISN H signal can be retrieved with appropriate knowledge of the instrument, choice of optimum observation season, and supporting modeling. The retrieved ISN H signal shows a clear anticorrelation with solar activity.
The resulting ISN H maps are available in orbit format and in ecliptic coordinates and will be the basis for future, more detailed comparisons with heliosphere models.

Item type: Item
Notional (Domus Argenia Publisher, 2023)
Soo, Chin-En Keith; Yu, Chunyang; Xu, Zhangqi; Soddu, Celestino; Colabella, Enrica
Everyone is unique, especially their face. Even twins have more or less different faces. Faces are the most critical way for people to remember each other in their daily lives, so a face can be regarded as a unique ID of a person. Notional aims to generate a unique pattern by collecting facial data of a person, such as face length and forehead width, to show that each person is unique. The final pattern can even be used as a form of identity in the context of the metaverse.

Item type: Item
Pruning feature extractor stacking for cross-domain few-shot learning (2025)
Wang, Hongyu; Frank, Eibe; Pfahringer, Bernhard; Holmes, Geoffrey
Combining knowledge from source domains to learn efficiently from a few labelled instances in a target domain is a transfer learning problem known as cross-domain few-shot learning (CDFSL). Feature extractor stacking (FES) is a state-of-the-art CDFSL method that maintains a collection of source domain feature extractors instead of a single universal extractor. FES uses stacked generalisation to build an ensemble from extractor snapshots saved during target domain fine-tuning. It outperforms several contemporary universal model-based CDFSL methods in the Meta-Dataset benchmark. However, it incurs higher storage cost because it saves a snapshot for every fine-tuning iteration for every extractor.
In this work, we propose a bidirectional snapshot selection strategy for FES, leveraging its cross-validation process and the ordered nature of its snapshots, and demonstrate that a 95% snapshot reduction can be achieved while retaining the same level of accuracy.

Item type: Item
Real world initiation of newly funded empagliflozin and dulaglutide under special authority for patients with type 2 diabetes in New Zealand (BMC, 2025)
Chepulis, Lynne Merran; Rodrigues, Mark William; Gan, Han; Keenan, Rawiri; Kenealy, T; Murphy, R; Karu, LT; Scott-Jones, J; Clark, P; Moffitt, A; Mustafa, S; Lawrenson, Ross; Paul, Ryan G.
Background: Type 2 diabetes (T2D) is sub-optimally managed for many in Aotearoa New Zealand, and disproportionately affects Māori and Pacific peoples. In February 2021, SGLT2i/GLP1RA agents were funded for use for the first time with prioritisation for Māori, Pacific and those with cardiovascular and/or renal disease or risk (CVRD). This study evaluates the impact of health system factors on initiation of SGLT2i/GLP1RA therapy. Methods: Primary care data was collected for patients with T2D aged 18–75 years from four primary care organisations (302 general practices) in the Auckland/Waikato region of New Zealand (Feb 2021 – July 2022). Initiation of SGLT2i/GLP1RA therapy was reviewed by patient (age, gender, ethnicity, CVRD status) and health system variables (funding, provider type, staffing, patient numbers, rurality, after-hours access). Logistic regression was used to estimate the odds ratio of a patient being dispensed SGLT2i/GLP1RA. Results: Of 57,743 patients with T2D, 22,331 were eligible for funded SGLT2i/GLP1RA access and 10,272 of those (46.0%) were prescribed. Initiation of therapy was highest in Māori (50.8%) and Pacific (48.8%) patients (vs. 36.2–40.7% of other ethnic groups; P < 0.001), but was comparable in those with and without CVRD (47.1% vs. 48.9%; P = 0.2).
Prescribing was highest in practices with higher doctor/patient numbers, low-cost fees, Māori health providers and clinics without after-hours access. Conclusion: Prioritised access for SGLT2i/GLP1RA appears to be associated with a reduced health equity gap for Māori and Pacific patients with T2D in NZ, but work is required to improve prescribing for patients with CVRD.

Item type: Item
Revisiting deep hybrid models for out-of-distribution detection (Journal of Machine Learning Research Inc., 2025)
Schlumbom, Paul-Ruben; Frank, Eibe
Deep hybrid models (DHMs) for out-of-distribution (OOD) detection, jointly training a deep feature extractor with a classification head and a density estimation head based on a normalising flow, provide a conceptually appealing approach to visual OOD detection. The paper that introduced this approach reported 100% AuROC in experiments on two standard benchmarks, including one based on the CIFAR-10 data. As there are no implementations available, we set out to reproduce the approach by carefully filling in gaps in the description of the algorithm. Although we were unable to attain 100% OOD detection rates, and our results indicate that such performance is impossible on the CIFAR-10 benchmark, we achieved good OOD performance. We provide a detailed analysis of when the architecture fails and argue that it introduces an adversarial relationship between the classification component and the density estimator, rendering it highly sensitive to the balance of these two components and yielding a collapsed feature space without careful fine-tuning. Our implementation of DHMs is publicly available.

Item type: Item
The WEKA data mining software: An update (ACM, 2009)
Hall, Mark A.; Frank, Eibe; Holmes, Geoffrey; Pfahringer, Bernhard; Reutemann, Peter; Witten, Ian H.
More than twelve years have elapsed since the first public release of WEKA.
In that time, the software has been rewritten entirely from scratch, evolved substantially and now accompanies a text on data mining. These days, WEKA enjoys widespread acceptance in both academia and business, has an active community, and has been downloaded more than 1.4 million times since being placed on SourceForge in April 2000. This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.

Item type: Item
Applying the perceived creepiness of technology scale to social robots (IEEE Press, 2025)
Turner, Jessica Dawn; Bowen, Judy; König, Jemma Lynette; Stawarz, Katarzyna; Vanderschantz, Nicholas
Designing positive robot experiences requires an understanding of users' perceptions and meeting their needs in an ethical manner. However, despite best intentions, users have strong positive or negative reactions to robots, either finding them "cute" or "creepy". The Perceived Creepiness of Technology Scale (PCTS) was designed for evaluating how creepy a technology appears to a user on first encounter. In this paper we applied the PCTS to a cross-section of social robots to measure their perceived creepiness and evaluate the strengths and weaknesses of PCTS when applied in a Human-Robot Interaction (HRI) context. We demonstrate that while a robot may not be perceived as creepy initially, it can have underlying unethical practices inherent in its design that are not well captured by the PCTS.
This emphasises the need for better HRI practices to ensure creepiness is appropriately assessed in the social robot domain.

Item type: Publication
Using model trees for classification (Kluwer Academic Publishers, 1998)
Frank, Eibe; Wang, Yong; Inglis, Stuart J.; Holmes, Geoffrey; Witten, Ian H.
Model trees, which are a type of decision tree with linear regression functions at the leaves, form the basis of a recent successful technique for predicting continuous numeric values. They can be applied to classification problems by employing a standard method of transforming a classification problem into a problem of function approximation. Surprisingly, using this simple transformation the model tree inducer M5′, based on Quinlan’s M5, generates more accurate classifiers than the state-of-the-art decision tree learner C5.0, particularly when most of the attributes are numeric.

Item type: Item
Epidemiology of giant cell arteritis in Waikato, Aotearoa New Zealand (Pasifika Medical Association Group, 2024-03-22)
van Dantzig, Philippa; Quincey, Vicki; Kurz, Jason A.; Ming, Caroline; Kamalaksha, Sujatha; White, Douglas
Giant cell arteritis (GCA) is the most common primary vasculitis in adults over 50 years of age. Our primary objective was to assess the incidence and prevalence of GCA in Waikato in a bid to deepen our understanding of the epidemiology of GCA in Aotearoa New Zealand. Methods: From January 2014 to December 2022, cases of GCA were identified prospectively and retrospectively through temporal artery ultrasound request lists and temporal artery biopsy histology reports. Using electronic health records, data were collected retrospectively on patient demographics and clinical features. These were used to calculate the incidence, prevalence and standardised mortality ratio (SMR) of GCA in Waikato. Results: There were 214 patients diagnosed with GCA over the 9-year period.
The majority of patients were European (93.9%, 201/214) with Māori patients being significantly younger than European patients. The mean annual incidence of clinical GCA was 14.7 per 100,000 people over 50 years (95% confidence interval [CI] 12.7–16.6). The SMR was 1.18 (95% CI 0.83–1.52). Conclusion: This is the largest study to date on the epidemiology of GCA in Aotearoa New Zealand. The incidence of GCA is comparable to other studies performed in Aotearoa New Zealand and appears to be stable over time. GCA is uncommon in Māori, Pacific Islander and Asian ethnic groups.

Item type: Item
Performance of a fast-track pathway for giant cell arteritis in Waikato, Aotearoa New Zealand (Pasifika Medical Association Group, 2024-03-22)
van Dantzig, Philippa; White, Douglas; Kurz, Jason A.; Ming, Caroline; Kamalaksha, Sujatha; Quincey, Vicki
Giant cell arteritis (GCA) is the most common primary vasculitis in adults over 50 years of age. To facilitate early diagnosis and reduce harms from corticosteroids and temporal artery biopsies, fast-track pathways have been established. We review the benefits of the fast-track pathway set up in Waikato, Aotearoa New Zealand. Methods: Patients were collected prospectively as part of the fast-track pathway from 2014 to 2022. Their records were then reviewed retrospectively to collect data on clinical features, investigations and treatment. Results: There were 648 individual patients over the study period who had a colour Doppler ultrasound (CDUS) of the temporal arteries. There were 17 true positive CDUS, giving a sensitivity of 10.3% (95% confidence interval [CI] 6.3–15.5%) and specificity of 99.8% (95% CI 99.1–100%). Patients with GCA and a positive scan had significantly fewer steroids than those with GCA and a negative scan (p=0.0037). There were 376 patients discharged after a CDUS who did not have a diagnosis of GCA, resulting in reduced corticosteroid and temporal artery biopsy exposure.
Conclusions: This is a real-life study that reflects the benefits of fast-track pathways in Aotearoa New Zealand to patients and healthcare systems. It also shows the effect of corticosteroids on positive CDUS, an important consideration when setting up a fast-track pathway.

Item type: Publication
Enhancing aerial imagery analysis: Leveraging explainability and segmentation (Institute of Electrical and Electronics Engineers, 2024-04-08)
Dwivedi, Anany; Lim, Nick Jin Sean; Bifet, Albert; Frank, Eibe; Pfahringer, Bernhard
In the field of aerial and satellite remote sensing, the widespread adoption of deep learning brings new possibilities. Current approaches, however, often overlook the unique characteristics of aerial data. This study introduces a methodology that capitalizes on distinctive features, leveraging additional annotations for enhanced neural network training. Despite modest gains in classification accuracy, the synergy of enhanced explainability, automated segmentation, and targeted classification demonstrates nuanced improvements. Preliminary results showcase potential applications in land cover mapping. This work can be extended towards reducing dependency on labor-intensive human annotations through an iterative annotation and training loop.

Item type: Publication
The WEKA workbench. Online appendix for "Data mining: Practical machine learning tools and techniques" (The University of Waikato, 2016)
Frank, Eibe; Hall, Mark A.; Witten, Ian H.
The WEKA workbench is a collection of machine learning algorithms and data preprocessing tools that includes virtually all the algorithms described in our book. It is designed so that you can quickly try out existing methods on new datasets in flexible ways. It provides extensive support for the whole process of experimental data mining, including preparing the input data, evaluating learning schemes statistically, and visualizing the input data and the result of learning.
As well as a wide variety of learning algorithms, it includes a wide range of preprocessing tools. This diverse and comprehensive toolkit is accessed through a common interface so that its users can compare different methods and identify those that are most appropriate for the problem at hand. WEKA was developed at the University of Waikato in New Zealand; the name stands for Waikato Environment for Knowledge Analysis. Outside the university, the WEKA, pronounced to rhyme with Mecca, is a flightless bird with an inquisitive nature found only on the islands of New Zealand. The system is written in Java and distributed under the terms of the GNU General Public License. It runs on almost any platform and has been tested under Linux, Windows, and Macintosh operating systems.

Item type: Publication
Generating rule sets from model trees (SPRINGER-VERLAG BERLIN, 1999)
Holmes, Geoffrey; Hall, Mark A.; Frank, Eibe; Foo, Norman
Model trees (decision trees with linear models at the leaf nodes) have recently emerged as an accurate method for numeric prediction that produces understandable models. However, it is known that decision lists (ordered sets of If-Then rules) have the potential to be more compact and therefore more understandable than their tree counterparts. We present an algorithm for inducing simple, accurate decision lists from model trees. Model trees are built repeatedly and the best rule is selected at each iteration. This method produces rule sets that are as accurate as, but smaller than, the model tree constructed from the entire dataset. Experimental results for various heuristics which attempt to find a compromise between rule accuracy and rule coverage are reported.
We show that our method produces comparably accurate and smaller rule sets than the commercial state-of-the-art rule learning system Cubist.

Item type: Item
Online estimation of discrete densities (IEEE, 2013)
Geilke, Michael; Frank, Eibe; Karwath, Andreas; Kramer, Stefan; Xiong, H; Karypis, G; Thuraisingham, B; Cook, D; Wu, X
We address the problem of estimating a discrete joint density online, that is, the algorithm is provided only with the current example and its current estimate. The proposed online estimator of discrete densities, EDDO (Estimation of Discrete Densities Online), uses classifier chains to model dependencies among features. Each classifier in the chain estimates the probability of one particular feature. Because a single chain may not provide a reliable estimate, we also consider ensembles of classifier chains and ensembles of weighted classifier chains. For all density estimators, we provide consistency proofs and propose algorithms to perform certain inference tasks. The empirical evaluation of the estimators is conducted in several experiments and on data sets of up to several million instances: we compare them to density estimates computed from Bayesian structure learners, evaluate them under the influence of noise, measure their ability to deal with concept drift, and measure the run-time performance. Our experiments demonstrate that, even though designed to work online, EDDO delivers estimators of competitive accuracy compared to batch Bayesian structure learners and batch variants of EDDO.
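
The chain idea described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `ChainDensityEstimator`, the smoothing constant `alpha`, and the use of smoothed count tables in place of real online classifiers are all assumptions made for illustration. Each link in the chain estimates the probability of one feature given the features before it, and the joint density is the product of the links, following the chain rule p(x1, ..., xn) = p(x1) p(x2 | x1) ... p(xn | x1, ..., xn-1). The estimator sees one example at a time and keeps only its current counts.

```python
from collections import defaultdict


class ChainDensityEstimator:
    """Online estimate of a discrete joint density via the chain rule.

    Hypothetical illustration only: each chain link is a smoothed count
    table standing in for an online classifier.
    """

    def __init__(self, cardinalities, alpha=1.0):
        self.cardinalities = list(cardinalities)  # number of values per feature
        self.alpha = alpha  # Laplace smoothing constant (an assumption)
        # One table per feature: context (earlier feature values) -> value counts
        self.counts = [
            defaultdict(lambda k=k: [0] * k) for k in self.cardinalities
        ]

    def update(self, x):
        """Incorporate a single example; no past examples are stored."""
        for i, value in enumerate(x):
            self.counts[i][tuple(x[:i])][value] += 1

    def conditional(self, i, context, value):
        """Smoothed estimate of p(x_i = value | x_1..x_{i-1} = context)."""
        k = self.cardinalities[i]
        table = self.counts[i].get(tuple(context))
        if table is None:
            return 1.0 / k  # unseen context: fall back to uniform
        return (table[value] + self.alpha) / (sum(table) + self.alpha * k)

    def density(self, x):
        """Joint density as the product of the chain's conditionals."""
        p = 1.0
        for i, value in enumerate(x):
            p *= self.conditional(i, x[:i], value)
        return p


if __name__ == "__main__":
    estimator = ChainDensityEstimator([2, 2])
    for example in [(0, 0), (0, 0), (1, 1)]:
        estimator.update(example)
    print(estimator.density((0, 0)))
```

Conditioning each table on the full prefix keeps the sketch short but grows quickly with the number of features; a practical chain would replace each table with a classifier that generalises across contexts.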