      Clustering for classification

      Evans, Reuben James Emmanuel; Pfahringer, Bernhard; Holmes, Geoffrey
      DOI
       10.1109/CITA.2011.5998839
      Link
       ieeexplore.ieee.org
      Citation
      Evans, R., Pfahringer, B. & Holmes, G. (2011). Clustering for classification. In Proceedings of 2011 7th International Conference on Information Technology in Asia (CITA 11), 12-13 July 2011, Kuching, Sarawak (pp. 1-8).
      Permanent Research Commons link: https://hdl.handle.net/10289/6004
      Abstract
      Advances in technology have provided industry with an array of devices for collecting data. The frequency and scale of data collection mean that many large datasets are now being generated. To find patterns in these datasets, it would be useful to apply modern classification methods such as support vector machines. Unfortunately, these methods are computationally expensive (quadratic in the number of data points), so they cannot be applied directly. This paper proposes a framework in which a variety of clustering methods can be used to summarise a dataset, that is, to reduce it to a smaller but still representative dataset to which advanced methods can be applied. It compares the results of using this framework against random selection on a large number of classification problems. The results show that clustering prior to classification is beneficial when a sophisticated classifier is employed; when the classifier is simple, however, the benefits over random selection do not justify the added cost of clustering. The results also show that, for each dataset, it is important to choose the clustering method carefully.
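The idea in the abstract, summarising a dataset by clustering and comparing that with random selection, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the per-class k-means summariser, the 1-nearest-neighbour classifier standing in for a real learner, the synthetic two-class data, and all function names (`kmeans`, `summarise_by_clustering`, `summarise_by_random`, `make_blobs`) are hypothetical.

```python
import random
from math import dist


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (illustrative choice); returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            groups[nearest].append(p)
        # New centroid is the mean of its group; an empty group keeps
        # its previous centroid.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def summarise_by_clustering(data, k_per_class, seed=0):
    """Replace each class by the centroids of its clusters (assumed scheme)."""
    summary = []
    for label in sorted({y for _, y in data}):
        pts = [x for x, y in data if y == label]
        for c in kmeans(pts, min(k_per_class, len(pts)), seed=seed):
            summary.append((c, label))
    return summary


def summarise_by_random(data, k_per_class, seed=0):
    """Baseline: keep a random sample of the original rows from each class."""
    rng = random.Random(seed)
    summary = []
    for label in sorted({y for _, y in data}):
        rows = [(x, y) for x, y in data if y == label]
        summary.extend(rng.sample(rows, min(k_per_class, len(rows))))
    return summary


def predict_1nn(summary, point):
    """Label of the nearest representative (stand-in for a real classifier)."""
    return min(summary, key=lambda row: dist(row[0], point))[1]


def accuracy(summary, test):
    return sum(predict_1nn(summary, x) == y for x, y in test) / len(test)


def make_blobs(n, seed):
    """Synthetic two-class Gaussian data, purely for demonstration."""
    rng = random.Random(seed)
    centres = {0: (0.0, 0.0), 1: (4.0, 4.0)}
    return [((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label)
            for label, (cx, cy) in centres.items() for _ in range(n)]


train = make_blobs(200, seed=1)
test = make_blobs(100, seed=2)
clustered = summarise_by_clustering(train, k_per_class=5)
sampled = summarise_by_random(train, k_per_class=5)

print(len(train), "training points reduced to", len(clustered))
print("accuracy with clustered summary:", accuracy(clustered, test))
print("accuracy with random summary:   ", accuracy(sampled, test))
```

Both summaries shrink 400 training points to 10 representatives, so a learner whose cost is quadratic in the number of points would do on the order of 10² rather than 400² units of work. On data this easy the two summaries are likely to score similarly; the sketch shows the shape of the comparison and does not reproduce the paper's results.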
      Date
      2011
      Type
      Conference Contribution
      Publisher
      IEEE
      Collections
      • Computing and Mathematical Sciences Papers