Evaluation of deep convolutional nets for document image classification and retrieval

1 August 2015

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

pp. 991-995
https://doi.org/10.1109/icdar.2015.7333910

Abstract

This paper presents a new state-of-the-art for document image classification and retrieval, using features learned by deep convolutional neural networks (CNNs). In object and scene analysis, deep neural nets are capable of learning a hierarchical chain of abstraction from pixel inputs to concise and descriptive representations. The current work explores this capacity in the realm of document analysis, and confirms that this representation strategy is superior to a variety of popular handcrafted alternatives. Extensive experiments show that (i) features extracted from CNNs are robust to compression, (ii) CNNs trained on non-document images transfer well to document analysis tasks, and (iii) enforcing region-specific feature-learning is unnecessary given sufficient training data. This work also makes available a new labelled subset of the IIT-CDIP collection, containing 400,000 document images across 16 categories.

Cited by 179 articles