
ROAD-R: The Autonomous Driving Dataset with Logical Requirements
Eleonora Giunchiglia1∗, Mihaela Cătălina Stoian1∗, Salman Khan2,
Fabio Cuzzolin2 and Thomas Lukasiewicz3,1
1Department of Computer Science, University of Oxford, UK
2School of Engineering, Computing and Mathematics, Oxford Brookes University, UK
3Institute of Logic and Computation, TU Wien, Austria
eleonora.giunchiglia@cs.ox.ac.uk, mihaela.stoian@cs.ox.ac.uk, 19052999@brookes.ac.uk,
fabio.cuzzolin@brookes.ac.uk, thomas.lukasiewicz@cs.ox.ac.uk
Abstract
Neural networks have proven to be very power-
ful at computer vision tasks. However, they of-
ten exhibit unexpected behaviours, violating known
requirements expressing background knowledge.
This calls for models (i) able to learn from the
requirements, and (ii) guaranteed to be compliant
with the requirements themselves. Unfortunately,
the development of such models is hampered by
the lack of datasets equipped with formally speci-
fied requirements. In this paper, we introduce the
ROad event Awareness Dataset with logical Re-
quirements (ROAD-R), the first publicly available
dataset for autonomous driving with requirements
expressed as logical constraints. Given ROAD-R,
we show that current state-of-the-art models often
violate its logical constraints, and that it is possi-
ble to exploit them to create models that (i) have
a better performance, and (ii) are guaranteed to be
compliant with the requirements themselves.
1 Introduction
Neural networks have proven to be incredibly powerful at
processing low-level inputs, and for this reason they have
been extensively applied to computer vision tasks, such as
image classification, object detection, and action detection
(see, e.g., [Krizhevsky et al., 2012; Redmon et al., 2016]).
However, they can exhibit unexpected behaviours, contradicting
known requirements expressing background knowledge.
This can have dramatic consequences, especially in safety-
critical scenarios such as autonomous driving. To address
the problem, models should (i) be able to learn from the re-
quirements, and (ii) be guaranteed to be compliant with the
requirements themselves. Unfortunately, the development of
such models is hampered by the lack of datasets equipped
with formally specified requirements. A notable exception is
given by hierarchical multi-label classification (HMC) problems
(see, e.g., [Vens et al., 2008]), in which datasets are provided
with binary constraints of the form (A → B), stating
that label B must be predicted whenever label A is predicted.
∗Contact authors.
In this paper, we introduce multi-label classification prob-
lems with propositional logic requirements, in which datasets
are provided with requirements ruling out non-admissible
predictions and expressed in propositional logic. In this new
formulation, given a multi-label classification problem with
labels A, B and C, we can, for example, write the
requirement:
(¬A∧B)∨C,
stating that for each datapoint in the dataset either the label
C is predicted, or B is predicted but A is not. Obviously, any
constraint written for HMC problems can be represented in
our framework, and thus, our problem formulation represents
a generalisation of HMC problems.
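
To make the example requirement concrete, the following minimal sketch enumerates all label assignments over A, B and C and keeps those admitted by (¬A ∧ B) ∨ C. It also shows an HMC-style constraint (A → B) rewritten as the propositional formula ¬A ∨ B. The function names `requirement` and `hmc_constraint` are illustrative assumptions and are not taken from the paper or its code.

```python
from itertools import product


def requirement(a: bool, b: bool, c: bool) -> bool:
    """The example requirement (not A and B) or C."""
    return ((not a) and b) or c


def hmc_constraint(a: bool, b: bool) -> bool:
    """An HMC constraint A -> B, written as the formula (not A) or B."""
    return (not a) or b


# Enumerate all 2^3 assignments of (A, B, C) and keep the admissible ones.
admissible = [
    labels
    for labels in product([False, True], repeat=3)
    if requirement(*labels)
]

for a, b, c in admissible:
    print(f"A={a!s:5} B={b!s:5} C={c!s:5}")
```

Of the eight possible predictions, five are admissible: the four in which C is predicted, plus the one in which B is predicted and neither A nor C is.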
Then, we present the ROad event Awareness Dataset with
logical Requirements (ROAD-R), the first publicly available
dataset for autonomous driving with requirements expressed
as logical constraints. ROAD-R extends the ROAD dataset
[Singh et al., 2021], which consists of 22 relatively long (∼8
minutes each) videos annotated with road events. A road
event corresponds to a tube, i.e., a sequence of frame-wise
bounding boxes linked in time. Each bounding box is labeled
with a subset of the 41 labels specified in Table 1. The goal is
to predict the set of labels associated with each bounding box.
We manually annotated ROAD-R with 243 constraints, each
verified to hold for each bounding box. A typical constraint is
thus “a traffic light cannot be red and green at the same time”,
while there are no constraints like “pedestrians should cross
at crossings”, which should always be satisfied in theory, but
which might not be in real-world scenarios.
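
As an illustration, the traffic-light constraint quoted above can be written as the disjunction ¬Red ∨ ¬Green and checked against per-box predictions. The sketch below is a hedged example and not the authors' evaluation code. It assumes that each constraint is given as a disjunction of literals, that per-label scores are thresholded at 0.5, and that the label names `red` and `green` are placeholders rather than the actual entries of Table 1.

```python
from typing import Dict, List, Tuple

Literal = Tuple[str, bool]  # (label name, True for a positive literal)
Clause = List[Literal]      # a disjunction of literals


def satisfies(pred: Dict[str, bool], clause: Clause) -> bool:
    """A clause holds if at least one of its literals is true."""
    return any(pred[label] == polarity for label, polarity in clause)


def violation_rate(
    scores: List[Dict[str, float]],
    constraints: List[Clause],
    threshold: float = 0.5,  # assumed decision threshold
) -> float:
    """Fraction of boxes whose thresholded labels break any constraint."""
    if not scores:
        return 0.0
    violated = 0
    for box in scores:
        pred = {label: score > threshold for label, score in box.items()}
        if not all(satisfies(pred, clause) for clause in constraints):
            violated += 1
    return violated / len(scores)


# "A traffic light cannot be red and green at the same time."
constraints = [[("red", False), ("green", False)]]

# Hypothetical per-box label scores, made up for this example.
boxes = [
    {"red": 0.9, "green": 0.1},   # admissible
    {"red": 0.8, "green": 0.7},   # violates: both above threshold
    {"red": 0.2, "green": 0.3},   # admissible
    {"red": 0.6, "green": 0.95},  # violates
]

rate = violation_rate(boxes, constraints)
print(f"violation rate: {rate:.2f}")
```

A box counts as violating as soon as one constraint fails, so the reported rate is the share of predictions that are non-admissible with respect to the whole constraint set.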
Given ROAD-R, we considered 6 current state-of-the-art
(SOTA) models, and we showed that they are not able to learn
the requirements just from the data points, as more than 90%
of the time they produce predictions that violate the constraints.
Then, we faced the problem of how to leverage the
additional knowledge provided by constraints with the goal of
(i) improving their performance (measured by the frame mean
average precision (f-mAP) at intersection over union (IoU)
thresholds 0.5 and 0.75; see, e.g., [Kalogeiton et al., 2017;
Li et al., 2018]), and (ii) guaranteeing that they are compliant
with the constraints. To achieve the above two goals, we
propose the following new models:
1. CL models, i.e., models with a constrained loss allowing
them to learn from the requirements,
arXiv:2210.01597v2 [cs.LG] 5 Oct 2022