Leveraging Plasticity in Incremental Decision Trees
Marco Heyden, Heitor Murilo Gomes, Edouard Fouché, Bernhard Pfahringer, Klemens Böhm
Published in ECML PKDD '24, 2024
Commonly used incremental decision trees for mining data streams include Hoeffding Trees (HT) and Extremely Fast Decision Trees (EFDT). EFDT exhibits faster learning than HT. However, due to its split revision procedure, EFDT suffers from sudden and unpredictable accuracy decreases caused by subtree pruning. To overcome this, we propose PLASTIC, an incremental decision tree that restructures the otherwise pruned subtree. This is possible due to decision tree plasticity: one can alter a tree’s structure without affecting its predictions. We conduct extensive evaluations comparing PLASTIC with state-of-the-art methods on synthetic and real-world data streams. Our results show that PLASTIC improves EFDT’s worst-case accuracy by up to 50 % and outperforms the current state of the art on real-world data. We provide an open-source implementation of PLASTIC within the MOA framework for mining high-speed data streams.