Challenges of Deep Learning-based Text Detection in the Wild

Zobeir Raisi; Mohamed A. Naiel; Paul Fieguth; Steven Wardell; John Zelek

doi:10.15353/jcvis.v6i1.3543

Vol. 6 No. 1 (2020)
Special Issue: Proceedings of CVIS 2020

Published 2021-01-15

Zobeir Raisi
University of Waterloo

Mohamed A. Naiel
University of Waterloo

Paul Fieguth
University of Waterloo

Steven Wardell
ATS Automation Tooling Systems

John Zelek
University of Waterloo

How to Cite

Raisi, Z., Naiel, M. A., Fieguth, P., Wardell, S., & Zelek, J. (2021). Challenges of Deep Learning-based Text Detection in the Wild. Journal of Computational Vision and Imaging Systems, 6(1), 1–5. https://doi.org/10.15353/jcvis.v6i1.3543

Abstract

The reported accuracy of recent state-of-the-art text detection methods, mostly deep learning approaches, is on the order of 80% to 90% on standard benchmark datasets. These methods have relaxed some of the restrictions on text structure and environment that classical OCR usually requires to function properly, which allows them to operate on unconstrained (i.e., "in the wild") images. Even with this relaxation, there are still circumstances where these state-of-the-art methods fail. Several remaining challenges in wild images, such as in-plane rotation, illumination reflection, partial occlusion, complex font styles, and perspective distortion, cause existing methods to perform poorly. To evaluate current approaches formally, we standardize the datasets and metrics used for comparison, since inconsistent choices of both made comparing these methods difficult in the past. We use three benchmark datasets for our evaluations: ICDAR13, ICDAR15, and COCO-Text V2.0. The objective of the paper is to quantify the current shortcomings and to identify the challenges for future text detection research.