Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Pages

Posts

Parallelizing scientific experiments

Published:

6 minute read

In this blog post, I show my approach to parallelizing scientific experiments using Python multiprocessing. While there may be other (better) methods to do this, the procedure described here has proven effective for me. We will evaluate Decision Trees and Random Forests on the Arrhythmia and Wisconsin Breast Cancer datasets.

Read more

portfolio

publications

Evolution-Based Online Automated Machine Learning

Cedric Kulbach, Jacob Montiel, Maroua Bahri, Marco Heyden, Albert Bifet

Published in PAKDD'22, 2022

We introduce EvoAutoML, an evolution-based online learning framework consisting of heterogeneous and connectable models that supports large and diverse configuration spaces and adapts to the online learning scenario.

Read more

Scalable Online Change Detection for High-dimensional Data Streams

Florian Kalinke, Marco Heyden, Edouard Fouché, Klemens Böhm

Published in arXiv preprint, 2022

We propose an algorithm, Maximum Mean Discrepancy Adaptive Windowing (MMDAW), which leverages the well-known Maximum Mean Discrepancy (MMD) two-sample test, and facilitates its efficient online computation on windows whose size it flexibly adapts.

Read more

Tandem Outlier Detectors for Decentralized Data

Marco Heyden, Jürgen Wilwer, Edouard Fouché, Steffen Thoma, Sven Matthiesen, Thomas Gwosch

Published in SSDBM, 2022

We address the problem of distinguishing between different types of outliers in decentralized data. We present a “tandem” solution that combines local and federated outlier detectors to effectively identify those types.

Read more

Adaptive Bernstein Change Detector for High-Dimensional Data Streams

Marco Heyden, Edouard Fouché, Vadim Arzamasov, Tanja Fenn, Florian Kalinke, Klemens Böhm

Published in Data Mining and Knowledge Discovery, Springer Nature, 2024

Our paper presents ABCD, a novel approach for change detection in high-dimensional data streams. ABCD detects changes accurately and provides insights into the specific subspace where changes occur. By leveraging an encoder-decoder model and Bernstein’s inequality, ABCD quantifies the severity of changes and outperforms other methods in our experiments.

Read more

Budgeted Multi-Armed Bandits with Asymmetric Confidence Intervals

Marco Heyden, Vadim Arzamasov, Edouard Fouché, Klemens Böhm

Published in KDD '24, 2024

We present a UCB-sampling policy for the Budgeted MAB problem that uses asymmetric confidence intervals to overcome issues of existing policies; our policy achieves logarithmic regret and outperforms existing policies in synthetic and real settings.

Read more

Leveraging Plasticity in Incremental Decision Trees

Marco Heyden, Heitor Murilo Gomes, Edouard Fouché, Bernhard Pfahringer, Klemens Böhm

Published in ECML PKDD '24, 2024

Hoeffding Trees (HT) and Extremely Fast Decision Trees (EFDT) are popular for mining data streams, with EFDT offering faster learning but suffering from accuracy drops due to subtree pruning. To address this, we propose PLASTIC, an incremental decision tree that restructures pruned subtrees without impacting predictions, leveraging decision tree plasticity.

Read more

talks

teaching
