Research Commons

      Extremely fast decision tree mining for evolving data streams

      Bifet, Albert; Zhang, Jiajin; Fan, Wei; He, Cheng; Zhang, Jianfeng; Qian, Jianfeng; Holmes, Geoffrey; Pfahringer, Bernhard
      Files
      p1733-bifet.pdf
      Published version, 1.100Mb
      DOI
       10.1145/3097983.3098139
      Citation
      Bifet, A., Zhang, J., Fan, W., He, C., Zhang, J., Qian, J., … Pfahringer, B. (2017). Extremely fast decision tree mining for evolving data streams. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1733–1742). New York, USA: ACM. https://doi.org/10.1145/3097983.3098139
      Permanent Research Commons link: https://hdl.handle.net/10289/11588
      Abstract
      Real-time industrial applications now generate huge amounts of data continuously. To process these large data streams, we need fast and efficient methodologies and systems. Data scientists and analysts also want machine learning models that are easy to visualize and understand. Decision trees are preferred in many real-time applications for this reason, and also because, when combined in an ensemble, they are among the most powerful methods in machine learning.

      In this paper, we present a new system called STREAMDM-C++ that implements decision trees for data streams in C++ and that has been used extensively at Huawei. Streaming decision trees adapt to changes in streams, a huge advantage since standard decision trees are built from a snapshot of the data and cannot evolve over time. STREAMDM-C++ is easy to extend, and contains more powerful ensemble methods and more efficient, easier-to-use adaptive decision trees. We compare our new implementation with VFML, the current state-of-the-art implementation in C, and show that our new system outperforms VFML in speed while using fewer resources.
      Date
      2017
      Type
      Conference Contribution
      Publisher
      ACM
      Rights
      © 2017 Copyright held by the author(s). Publication rights licensed to Association for Computing Machinery.
      Collections
      • Computing and Mathematical Sciences Papers [1455]