Chinese Character Captcha Sequential Selection System Based on Convolutional Neural Network

Title: Chinese Character Captcha Sequential Selection System Based on Convolutional Neural Network
Publication Type: Conference Paper
Year of Publication: 2020
Authors: Bi, X., Liu, X.
Conference Name: 2020 International Conference on Computer Vision, Image and Deep Learning (CVIDL)
Date Published: July 2020
ISBN Number: 978-1-7281-9481-3
Keywords: 10-layer convolutional neural network, affine transformation, CAPTCHA, captchas, character recognition, Chinese character, Chinese character captcha recognition, Chinese character captcha sequential selection system, Chinese character detection, Chinese character recognition sub-process, CNN, Completely Automated Public Turing Test to Tell Computers and Humans Apart, composability, convolutional neural nets, convolutional neural networks, detection, feature extraction, handwritten character recognition, Human Behavior, image segmentation, natural language processing, Neural networks, probability, pubcrawl, recognition, rotation transformation, security of data, single Chinese character data, Task Analysis, word order recovery, word order restoration, word processing

To ensure security, the Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) is widely used online. This paper presents a Chinese character captcha sequential selection system based on a convolutional neural network (CNN). Captchas composed of English letters and digits can already be identified with extremely high accuracy, but Chinese character captcha recognition remains challenging. The task is to identify Chinese characters in different colors and fonts that are not arranged on a straight line and have undergone rotation and affine transformation, on pictures with complex backgrounds, and then to restore the word order of the identified characters. We divide the task into three sub-processes: Chinese character detection based on Faster R-CNN, Chinese character recognition, and word order recovery based on an N-Gram model. Our main contribution lies in the Chinese character recognition sub-process, for which we constructed a data set of single Chinese characters and built a 10-layer convolutional neural network. We ultimately achieved an accuracy of 98.43%.
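
The abstract does not say how the N-Gram model is applied to word order recovery. One plausible reading, sketched below, scores every permutation of the recognized characters with a smoothed bigram model and keeps the highest-scoring order. The bigram counts, the add-one smoothing, the exhaustive search, and the names `BIGRAM_COUNTS`, `sequence_score`, and `recover_order` are all illustrative assumptions, not details taken from the paper.

```python
from itertools import permutations
from math import log

# Hypothetical toy bigram counts; a real system would estimate these
# from a large Chinese text corpus.
BIGRAM_COUNTS = {
    ("<s>", "中"): 5, ("中", "国"): 8, ("国", "人"): 6,
    ("人", "民"): 7, ("民", "</s>"): 5,
    ("<s>", "人"): 3, ("人", "中"): 1,
}
VOCAB_SIZE = 6  # four characters plus the two boundary markers


def bigram_log_prob(prev, cur):
    """Add-one smoothed log P(cur | prev) under the toy counts."""
    pair = BIGRAM_COUNTS.get((prev, cur), 0)
    total = sum(c for (p, _), c in BIGRAM_COUNTS.items() if p == prev)
    return log((pair + 1) / (total + VOCAB_SIZE))


def sequence_score(chars):
    """Log-probability of a character sequence, with boundary markers."""
    tokens = ["<s>", *chars, "</s>"]
    return sum(bigram_log_prob(a, b) for a, b in zip(tokens, tokens[1:]))


def recover_order(chars):
    """Return the permutation of `chars` with the highest bigram score.

    Exhaustive search is assumed to be affordable because a captcha
    contains only a handful of characters.
    """
    return max(permutations(chars), key=sequence_score)


if __name__ == "__main__":
    # Characters as they might come out of the recognition stage, unordered.
    print("".join(recover_order(["民", "国", "中", "人"])))
```

With these toy counts, the order 中国人民 is the only permutation in which every adjacent pair has been observed, so it receives the highest score.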

Citation Key: bi_chinese_2020