Holmes, G., Kirkby, R., Pfahringer, B. (2005). Tie-breaking in Hoeffding trees. In Proceedings of the Second International Workshop on Knowledge Discovery from Data Streams, Porto, Portugal.
Permanent Research Commons link: https://hdl.handle.net/10289/1488
A thorough examination of the performance of Hoeffding trees, a state-of-the-art method for classifying data streams, on a range of datasets reveals that tie breaking, an essential but supposedly rare procedure, is invoked far more often than expected. Testing with a lightweight method for handling continuous attributes, we find that excessive tie breaking significantly degrades performance on complex and noisy data. To reduce the number of tie breaks, we propose an adaptive method that overcomes the problem without significantly affecting performance on simpler datasets.
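
For readers unfamiliar with the procedure, the following is a minimal sketch of where tie breaking enters the split decision in a standard Hoeffding tree. It assumes the usual formulation of the Hoeffding bound, epsilon = sqrt(R^2 ln(1/delta) / (2n)), with information gain as the split criterion. The function names, the default values of delta and tau, and the example gains are illustrative assumptions and are not taken from the paper. The adaptive method proposed in the paper is not shown.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound for a quantity with range `value_range`,
    observed n times, holding with probability 1 - delta."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def split_decision(best_gain, second_gain, n, n_classes=2,
                   delta=1e-7, tau=0.05):
    """Decide what a leaf should do after seeing n examples.

    delta and tau are assumed defaults, chosen for illustration only.
    """
    # Information gain over c classes lies in [0, log2(c)].
    value_range = math.log2(n_classes)
    eps = hoeffding_bound(value_range, delta, n)
    if best_gain - second_gain > eps:
        return "split"      # the best attribute is a confident winner
    if eps < tau:
        return "tie-break"  # candidates too close to separate: split anyway
    return "wait"           # keep collecting examples


print(split_decision(0.40, 0.10, n=200))   # clear winner
print(split_decision(0.31, 0.30, n=1000))  # undecided, bound still wide
print(split_decision(0.31, 0.30, n=5000))  # bound below tau, tie is broken
```

In this sketch, a leaf whose two best attributes remain nearly equal will always end in a tie break once enough examples arrive, because the bound shrinks with n while the gap between the gains does not grow.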