Dynamic Ensemble Size Adjustment for Memory-Constrained Mondrian Forest
Martin Khannouz and Tristan Glatard
Department of Computer Science and Software Engineering
Concordia University, Montreal, Quebec, Canada
Abstract—Supervised learning algorithms generally assume the
availability of enough memory to store data models during the
training and test phases. However, this assumption is unrealistic
when data comes in the form of infinite data streams, or
when learning algorithms are deployed on devices with reduced
amounts of memory. Such memory constraints impact the model
behavior and assumptions. In this paper, we show that under
memory constraints, increasing the size of a tree-based ensemble
classifier can worsen its performance. In particular, we experimentally
show the existence of an optimal ensemble size for a
memory-bounded Mondrian forest on data streams and we design
an algorithm to guide the forest toward that optimal number by
using an estimation of overfitting. We tested different variations
of this algorithm on a variety of real and simulated datasets,
and we conclude that our method can achieve up to 95% of the
performance of an optimally-sized Mondrian forest for stable
datasets, and can even outperform it for datasets with concept
drifts. All our methods are implemented in the OrpailleCC open-
source library and are ready to be used on embedded systems
and connected objects.
Index Terms—Mondrian Forest, Trimming, Concept Drift,
Data Stream, Memory Constraints.
I. INTRODUCTION
Supervised classification algorithms mostly assume the
availability of abundant memory to store data and models.
This is an issue when processing data streams — which are
infinite sequences by definition — or when using memory-
limited devices as is commonly the case in the Internet of
Things. We focus on the Mondrian forest, a popular online
classification method. Our ultimate goal is to optimize it for
data streams under memory constraints to make it compatible
with connected objects.
The Mondrian forest is a tree-based ensemble online learning method
with performance comparable to the offline Random Forest [1].
Previous experiments highlighted the Mondrian forest's sensitivity
to the ensemble size in a memory-constrained
environment [2]. Indeed, introducing a memory limit to the
ensemble also introduces a trade-off between underfitting and
overfitting: a fixed memory budget shared by more trees leaves fewer
nodes, and therefore less depth, to each tree. On the one hand, low
tree numbers make room for deeper trees and increase the risk of
overfitting. On the other hand, large tree numbers constrain tree
depth due to the memory limitation and increase the risk of underfitting.
Depending on the memory limit and the data distribution, a
given ensemble size may either overfit or underfit the dataset.
The goal of this paper is to address this trade-off by adapting
the ensemble size dynamically.
In summary, this paper makes the following contributions:
1) Highlight the existence of an optimal tree count in a
memory-constrained Mondrian forest;
2) Propose a dynamic method to optimize the tree count;
3) Compare this method to a Mondrian forest with an
optimally-sized tree count.
II. MATERIALS AND METHODS
All the methods presented in this section are implemented
in the OrpailleCC framework [3]. The scripts to reproduce our
experiments are available on GitHub at
https://github.com/big-data-lab-team/benchmark-har-data-stream.
In this section, we start by presenting background on
Mondrian forests (II-A and II-B), then present the main
contribution of the paper, namely dynamically adjusting the
ensemble size of a memory-constrained Mondrian forest (II-C,
II-D, II-E, II-F), and finally describe the experimental evaluation
framework (II-G, II-H).
A. Mondrian Forest
The Mondrian forest [1] is an ensemble method that aggregates
Mondrian trees. Each tree recursively splits the feature
space, similar to a regular decision tree. However, the feature
used in the split and the value of the split are picked randomly.
The probability of selecting a feature is proportional to its range,
and the value for the split is uniformly selected in the range of
the feature. In contrast with other decision trees, the Mondrian
tree does not split leaves to introduce new nodes. Instead,
it introduces a new parent and a sibling to the node where
the split occurs. The original node and its descendants are not
modified, and no data point is moved to the new sibling besides
the data points that initiated the split. This approach allows
the Mondrian tree to introduce new branches to internal nodes.
This training algorithm does not rely on labels to build the tree;
however, each node maintains counters for each label seen.
Therefore, labels can be delayed, but they are needed before
prediction. In addition to the counters, each node keeps track
of the range of its features, which represents a box containing
all data points. A data point can create a new branch only if
it is sorted to a node and falls outside of the node’s box.
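As an illustration, the following minimal, self-contained C++ sketch implements the split mechanism just described. It is a simplified sketch rather than the OrpailleCC implementation: the names (Node, make_leaf, insert) are ours, the split-time lifetime parameter of the original Mondrian tree [1] is omitted, and the split feature is drawn with probability proportional to how far the new point extends the node's box, following [1].

#include <algorithm>
#include <array>
#include <iostream>
#include <memory>
#include <random>

constexpr int NUM_FEATURES = 2;  // illustrative sizes
constexpr int NUM_LABELS = 2;
using Point = std::array<double, NUM_FEATURES>;

struct Node {
  Point box_min{}, box_max{};              // box containing all points routed here
  std::array<int, NUM_LABELS> counters{};  // per-label counts; labels may arrive late
  int split_feature = -1;                  // -1 marks a leaf
  double split_value = 0.0;
  std::unique_ptr<Node> left, right;
};

std::unique_ptr<Node> make_leaf(const Point& x, int label) {
  auto n = std::make_unique<Node>();
  n->box_min = n->box_max = x;
  ++n->counters[label];
  return n;
}

// Insert a labeled point. A new branch appears only when the point falls
// outside the node's box: a new parent and a sibling leaf are introduced,
// and the existing node and its descendants are left untouched.
std::unique_ptr<Node> insert(std::unique_ptr<Node> node, const Point& x,
                             int label, std::mt19937& rng) {
  std::array<double, NUM_FEATURES> ext{};
  double total = 0.0;
  for (int f = 0; f < NUM_FEATURES; ++f) {
    ext[f] = std::max(0.0, node->box_min[f] - x[f]) +
             std::max(0.0, x[f] - node->box_max[f]);
    total += ext[f];
  }
  if (total > 0.0) {  // point outside the box: split here
    // Split feature drawn with probability proportional to the box
    // extension; split value drawn uniformly inside that extension.
    std::discrete_distribution<int> pick(ext.begin(), ext.end());
    const int f = pick(rng);
    const double cut =
        (x[f] < node->box_min[f])
            ? std::uniform_real_distribution<>(x[f], node->box_min[f])(rng)
            : std::uniform_real_distribution<>(node->box_max[f], x[f])(rng);
    auto parent = std::make_unique<Node>();
    parent->split_feature = f;
    parent->split_value = cut;
    parent->counters = node->counters;
    ++parent->counters[label];
    for (int d = 0; d < NUM_FEATURES; ++d) {  // parent box also covers x
      parent->box_min[d] = std::min(node->box_min[d], x[d]);
      parent->box_max[d] = std::max(node->box_max[d], x[d]);
    }
    auto sibling = make_leaf(x, label);
    if (x[f] <= cut) {
      parent->left = std::move(sibling);
      parent->right = std::move(node);
    } else {
      parent->left = std::move(node);
      parent->right = std::move(sibling);
    }
    return parent;
  }
  // Inside the box: only update the label counters and route the point down.
  ++node->counters[label];
  if (node->split_feature >= 0) {
    auto& child = (x[node->split_feature] <= node->split_value) ? node->left
                                                                : node->right;
    child = insert(std::move(child), x, label, rng);
  }
  return node;
}

int main() {
  std::mt19937 rng(42);
  auto root = make_leaf({0.0, 0.0}, 0);
  root = insert(std::move(root), {1.0, 1.0}, 1, rng);  // outside: new parent + sibling
  root = insert(std::move(root), {0.5, 0.5}, 0, rng);  // inside the root box: routed down
  std::cout << "root splits on feature " << root->split_feature
            << " at value " << root->split_value << "\n";
}

A Mondrian forest aggregates the label counters of several such trees; under a shared memory bound, every tree added to the ensemble reduces the nodes available to each tree, which is the trade-off studied in this paper.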
B. Mondrian Forest for Data Stream Classification
The implementation of the Mondrian forest presented in [1], [4]
is online because trees rely on potentially all the previously
seen data points to grow new branches. To support data