Holmes, G., Fletcher, D. & Reutemann, P. (2010). Predicting polycyclic aromatic hydrocarbon concentrations in soil and water samples. In D.A. Swayne, W. Yang, A.A. Voinov, A. Rizzoli & T. Filatova (Eds.), Proceedings of International Environmental Modelling and Software Society (iEMSs) 2010 International Congress on Environmental Modelling and Software Modelling for Environment’s Sake, Fifth Biennial Meeting, July 5-8 2010, Ottawa, Canada.
Permanent Research Commons link: http://hdl.handle.net/10289/4374
Polycyclic Aromatic Hydrocarbons (PAHs) are compounds found in the environment that can be harmful to humans. They are typically formed due to incomplete combustion and as such remain after burning coal, oil, petrol, diesel, wood, household waste and so forth. Testing laboratories routinely screen soil and water samples taken from potentially contaminated sites for PAHs using Gas Chromatography Mass Spectrometry (GC-MS). A GC-MS device produces a chromatogram which is processed by an analyst to determine the concentrations of PAH compounds of interest. In this paper we investigate the application of data mining techniques to PAH chromatograms in order to provide reliable prediction of compound concentrations. A workflow engine with an easy-to-use graphical user interface is at the heart of processing the data. This engine allows a domain expert to set up workflows that can load the data, preprocess it in parallel in various ways and convert it into data suitable for data mining toolkits. The generated output can then be evaluated using different data mining techniques, to determine the impact of preprocessing steps on the performance of the generated models and for picking the best approach. Encouraging results for predicting PAH compound concentrations, in terms of correlation coefficients and root-mean-squared error are demonstrated.
International Environmental Modelling and Software Society
This article has been published in Proceedings of International Environmental Modelling and Software Society (iEMSs) 2010 International Congress on Environmental Modelling and Software Modelling for Environment’s Sake, Fifth Biennial Meeting, July 5-8 2010, Ottawa, Canada. Used with permission.