YOLOv3-Tesseract Model for Improved Intelligent Form Recognition

Zhang Yun-An; Pan Ziheng; Dui Hongyan; Bai Guanghan

doi:10.2174/2666255813666191204141610

Abstract

Background: YOLOv3-Tesseract is widely used for intelligent form recognition because it exhibits several attractive properties. Improving the accuracy and efficiency of optical character recognition (OCR) remains important.

Methods: YOLOv3 offers classification advantages for object detection, and Tesseract can effectively recognize regular characters in optical character recognition. In this study, an improved intelligent form recognition model based on YOLOv3 and Tesseract is proposed.

Results: First, YOLOv3 is trained to detect the position of text in the table and then to segment the text blocks. Second, Tesseract recognizes each text block individually. Combining YOLOv3 and Tesseract in this way achieves table character recognition.

Conclusion: The proposed method is demonstrated through experimental simulation on Tianchi big data. The YOLOv3-Tesseract model is trained and tested, and it effectively accomplishes the recognition task.

Keywords: Character recognition, form image, YOLOv3-Tesseract, deep learning, optical character recognition (OCR), Python 3.5.4.
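
The two-stage flow described in the abstract (detect text blocks, segment them, then recognize each block on its own) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `crop`, `reading_order`, and `recognize_form`, the `(x, y, width, height)` box format, and the `row_tolerance` heuristic are all hypothetical. The detector and recognizer are passed in as plain callables so the sketch runs without any model; in a real setup the detector would wrap a trained YOLOv3 network and the recognizer might be something like `pytesseract.image_to_string`.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x, y, width, height) in pixels; assumed format
Image = Sequence[Sequence[int]]   # row-major grayscale pixels; assumed format


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the rectangle described by `box` out of a row-major image."""
    x, y, w, h = box
    return [list(row[x:x + w]) for row in image[y:y + h]]


def reading_order(boxes: Sequence[Box], row_tolerance: int = 10) -> List[Box]:
    """Sort boxes top-to-bottom, then left-to-right within each table row.

    Boxes whose top edge lies within `row_tolerance` pixels of the first
    box in a row are grouped into that row. This grouping rule is an
    assumption made for the sketch; it is not taken from the paper.
    """
    rows: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[1]):
        if rows and abs(rows[-1][0][1] - box[1]) < row_tolerance:
            rows[-1].append(box)
        else:
            rows.append([box])
    return [b for row in rows for b in sorted(row, key=lambda b: b[0])]


def recognize_form(
    image: Image,
    detect_blocks: Callable[[Image], Sequence[Box]],
    recognize_text: Callable[[List[List[int]]], str],
) -> List[Tuple[Box, str]]:
    """Run detection once, then recognize every detected block separately."""
    results = []
    for box in reading_order(detect_blocks(image)):
        # Stage 2: each segmented block is passed to the recognizer alone.
        text = recognize_text(crop(image, box)).strip()
        results.append((box, text))
    return results
```

Keeping the two stages behind callables means either one can be swapped or tested in isolation, for example by feeding hand-labelled boxes to the recognizer to measure OCR accuracy separately from detection accuracy.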

DOI: https://dx.doi.org/10.2174/2666255813666191204141610
Print ISSN: 2666-2558
Online ISSN: 2666-2566
Publisher Name: Bentham Science Publisher

Recent Advances in Computer Science and Communications