Frame Augmentation for Imbalanced Object Detection Datasets
A major challenge in most object detection datasets is class imbal-
ance. It is especially apparent in uncurated datasets where frames
originate from a real-world setup such as a set of cameras col-
lecting data from fixed locations. In that case, the dataset class
distribution mirrors the real-world distribution, causing a bias to-
wards over-represented classes if used for model training. In this
paper we propose a synthesis technique for balancing the dataset,
which exploits having sets of frames from the same camera view.
The result is synthesized frames containing only rare objects, while
guaranteeing realistic object placement both in terms of scene con-
text and perspective. We train a deep learning object detection
model on the augmented dataset and compare its performance to
a model trained on the original, imbalanced dataset. Results show
that including the synthesized frames in the training results in a
significant performance boost for the rare classes.