Research Commons
      • Browse 
        • Communities & Collections
        • Titles
        • Authors
        • By Issue Date
        • Subjects
        • Types
        • Series
      • Help 
        • About
        • Collection Policy
        • OA Mandate Guidelines
        • Guidelines FAQ
        • Contact Us
      • My Account 
        • Sign In
        • Register
      View Item 
      •   Research Commons
      • University of Waikato Theses
      • Masters Degree Theses
      • View Item
      •   Research Commons
      • University of Waikato Theses
      • Masters Degree Theses
      • View Item
      JavaScript is disabled for your browser. Some features of this site may not work without it.

      Clustering for Classification

      Evans, Reuben James Emmanuel
      Thumbnail
      Files
      thesis.pdf
      496.0Kb
      Citation
      Export citation
      Evans, R. J. E. (2007). Clustering for Classification (Thesis, Master of Science (MSc)). The University of Waikato, Hamilton, New Zealand. Retrieved from https://hdl.handle.net/10289/2403
      Permanent Research Commons link: https://hdl.handle.net/10289/2403
      Abstract
      Advances in technology have provided industry with an array of devices for collecting data. The frequency and scale of data collection means that

      there are now many large datasets being generated. To find patterns in these datasets it would be useful to be able to apply modern methods of

      classification such as support vector machines. Unfortunately these methods are computationally expensive, quadratic in the number of data points in fact, so cannot be applied directly.

      This thesis proposes a framework whereby a variety of clustering methods can be used to summarise datasets, that is, reduce them to a smaller but

      still representative dataset so that these advanced methods can be applied. It compares the results of using this framework against using random selection on a large number of classification and regression problems. Results show that the clustered datasets are on average fifty percent smaller than the original datasets without loss of classification accuracy which is significantly better than random selection. They also show that there is no free lunch, for each dataset it is important to choose a clustering method carefully.
      Date
      2007
      Type
      Thesis
      Degree Name
      Master of Science (MSc)
      Publisher
      The University of Waikato
      Rights
      All items in Research Commons are provided for private study and research purposes and are protected by copyright with all rights reserved unless otherwise indicated.
      Collections
      • Masters Degree Theses [2385]
      Show full item record  

      Usage

      Downloads, last 12 months
      78
       
       

      Usage Statistics

      For this itemFor all of Research Commons

      The University of Waikato - Te Whare Wānanga o WaikatoFeedback and RequestsCopyright and Legal Statement