Reutemann, P., & Holmes, G. (2015). Scientific workflow management with ADAMS: building and data mining a database of crop protection and related data. In R. M. Beresford, K. J. Froud, J. M. Kean, & S. P. Worner (Eds.), The Plant Protection Data Toolbox (pp. 167–174). New Zealand Plant Protection Society (Inc).
Permanent Research Commons link: https://hdl.handle.net/10289/10675
Data mining is said to be a field that encourages data to speak for itself rather than “forcing” data to conform to a pre-specified model, but we have to acknowledge that what is spoken by the data may well be gibberish. To obtain meaning from data it is important to use techniques systematically, to follow sound experimental procedure and to examine results expertly. This paper presents a framework for scientific discovery from data with two examples from the biological sciences. The first case is a re-investigation of previously published work that uses aphid trap data to predict aphid phenology, and the second is a commercial application for identifying and counting insects captured on sticky plates in greenhouses. For the aphid trap data, support vector machines give better results than neural networks or linear regression. For both cases, we use the open-source machine learning workbench WEKA for predictive modelling and the open-source ADAMS workflow system for automating data collection, preparation, feature generation, application of predictive models and output generation.
© 2015 New Zealand Plant Protection Society. Used with permission.