Recent Advances in Computer Science and Communications

ISSN (Print): 2666-2558
ISSN (Online): 2666-2566

Research Article

YOLOv3-Tesseract Model for Improved Intelligent Form Recognition

Author(s): Zhang Yun-An, Pan Ziheng, Dui Hongyan* and Bai Guanghan

Volume 14, Issue 6, 2021

Published on: 04 December, 2019

Page: [1833 - 1842] Pages: 10

DOI: 10.2174/2666255813666191204141610

Abstract

Background: YOLOv3-Tesseract is widely used for intelligent form recognition because of several attractive properties. Improving the accuracy and efficiency of optical character recognition is important.

Methods: YOLOv3 offers classification advantages for object detection, and Tesseract effectively recognizes regular characters in optical character recognition. This study proposes an improved intelligent form recognition model based on YOLOv3 and Tesseract.

Results: First, YOLOv3 is trained to detect the position of text in the table and then to segment the text blocks. Second, Tesseract recognizes each text block individually; combining YOLOv3 and Tesseract in this way achieves table character recognition.
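
The two-stage flow described in the Results (locate text blocks, crop them, then recognize each block on its own) can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' implementation: the `detect` and `recognize` callables are hypothetical stand-ins for a trained YOLOv3 detector and a Tesseract wrapper, the box format `(x_min, y_min, x_max, y_max)` is assumed, and the reading-order step with its `row_tolerance` parameter is an added convenience that the abstract does not mention.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # assumed format: (x_min, y_min, x_max, y_max)
Image = List[List[int]]          # grayscale image as rows of pixel values


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by box (max edges are exclusive)."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def reading_order(boxes: Sequence[Box], row_tolerance: int = 10) -> List[Box]:
    """Sort boxes top-to-bottom, then left-to-right within each row.

    A box joins the current row when its top edge lies within
    row_tolerance pixels of the top edge of that row's first box.
    """
    rows: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[1]):
        if rows and abs(box[1] - rows[-1][0][1]) <= row_tolerance:
            rows[-1].append(box)
        else:
            rows.append([box])
    return [box for row in rows for box in sorted(row, key=lambda b: b[0])]


def recognize_form(
    image: Image,
    detect: Callable[[Image], Sequence[Box]],   # hypothetical detector stage
    recognize: Callable[[Image], str],          # hypothetical OCR stage
    row_tolerance: int = 10,
) -> List[Tuple[Box, str]]:
    """Stage 1: detect text blocks. Stage 2: recognize each block separately."""
    results = []
    for box in reading_order(detect(image), row_tolerance):
        text = recognize(crop(image, box)).strip()
        results.append((box, text))
    return results
```

In a real setup, `detect` would wrap the trained detector's inference call and `recognize` would pass each cropped block to Tesseract; keeping both as injected callables lets the cropping and ordering logic be checked without either model installed.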

Conclusion: The proposed method is demonstrated through experimental simulation on Tianchi big data. The YOLOv3-Tesseract model is trained and tested, and it effectively accomplishes the recognition task.

Keywords: Character recognition, form image, YOLOv3-Tesseract, deep learning, Optical Character Recognition (OCR), Python 3.5.4.

© 2024 Bentham Science Publishers | Privacy Policy