Research Commons
      • Browse 
        • Communities & Collections
        • Titles
        • Authors
        • By Issue Date
        • Subjects
        • Types
        • Series
      • Help 
        • About
        • Collection Policy
        • OA Mandate Guidelines
        • Guidelines FAQ
        • Contact Us
      • My Account 
        • Sign In
        • Register
      View Item 
      •   Research Commons
      • University of Waikato Research
      • Computing and Mathematical Sciences
      • Computing and Mathematical Sciences Papers
      • View Item
      •   Research Commons
      • University of Waikato Research
      • Computing and Mathematical Sciences
      • Computing and Mathematical Sciences Papers
      • View Item
      JavaScript is disabled for your browser. Some features of this site may not work without it.

      Mining Arbitrarily Large Datasets Using Heuristic k-Nearest Neighbour Search

      Wu, Xing; Holmes, Geoffrey; Pfahringer, Bernhard
      DOI
       10.1007/978-3-540-89378-3_35
      Link
       springerlink.com
      Find in your library  
      Citation
      Export citation
      Wu, X. ,Pfahringer, B. & Holmes, G. (2008). Mining arbitrarily large datasets using heuristic k-nearest neighbour search. In W. Wobcke & M. Zhang, Proceedings of 21st Australasian Joint Conference on Artificial Intelligence Auckland, New Zealand, December 1-5, 2008(pp. 355-361). Berlin: Springer.
      Permanent Research Commons link: https://hdl.handle.net/10289/1764
      Abstract
      Nearest Neighbour Search (NNS) is one of the top ten data mining algorithms. It is simple and effective but has a time complexity that is the product of the number of instances and the number of dimensions. When the number of dimensions is greater than two there are no known solutions that can guarantee a sublinear retrieval time. This paper describes and evaluates two ways to make NNS efficient for datasets that are arbitrarily large in the number of instances and dimensions. The methods are best described as heuristic as they are neither exact nor approximate. Both stem from recent developments in the field of data stream classification. The first uses Hoeffding Trees, an extension of decision trees to streams and the second is a direct stream extension of NNS. The methods are evaluated in terms of their accuracy and the time taken to find the neighbours. Results show that the methods are competitive with NNS in terms of accuracy but significantly faster.
      Date
      2008
      Type
      Conference Contribution
      Publisher
      Springer
      Collections
      • Computing and Mathematical Sciences Papers [1455]
      Show full item record  

      Usage

       
       
       

      Usage Statistics

      For this itemFor all of Research Commons

      The University of Waikato - Te Whare Wānanga o WaikatoFeedback and RequestsCopyright and Legal Statement