A review on deep learning approaches for optical character recognition with emphasis on Persian, Arabic and Urdu scripts

Document Type: Survey


1 Vaje Research Group, Kerman, Iran

2 Department of Electrical Engineering, Shahid Bahonar University of Kerman, Kerman, Iran

3 Department of Computer Engineering, Sirjan University of Technology, Sirjan, Iran


In recent years, the success of deep convolutional neural networks in object recognition has drawn the attention of many areas of machine learning, including optical character recognition (OCR), to this class of models. One of the major challenges in OCR is extracting distinctive and informative features. Most OCR methods proposed in recent years are based on hand-crafted features, which have limited generalizability. Today, convolutional networks allow feature extraction to be left to the machine, automatically and with high efficiency. In addition, architectures that combine convolutional and recurrent networks have been proposed, which can perform recognition without prior character segmentation. This approach has received a great deal of attention from machine vision researchers in recent years because, with these networks, recognition can be performed independently of the language, depending only on the training set. The purpose of this article is to review the work done with this new approach in OCR. To this end, after stating the problem and briefly reviewing earlier methods, methods based on deep learning algorithms and their characteristics are examined in more detail. Because this article emphasizes OCR research on cursive scripts such as Persian, Arabic and Urdu, the work done on these scripts is reviewed in a separate section. Finally, the article introduces well-known datasets for different applications, reviews the evaluation criteria for OCR methods, and presents the most important software and open-source packages used for OCR.