Dissertation/Thesis Abstract

Statistical model-based binary document image coding, reconstruction, and analysis
by Guo, Yandong, Ph.D., Purdue University, 2014, 96; 3723139
Abstract (Summary)

Binary document images remain one of the most important information carriers in this era of data. In this dissertation, we present two novel technologies for learning and understanding low-level features of document images, and we apply these technologies to applications including compression, reconstruction, registration, and searching.

The first learning technology is entropy-based dictionary learning, a method for learning a strong prior for document images. The information in this prior is used to encode the image effectively. If there is more than one page to be encoded, we impose a hierarchical structure on the dictionary and update the dictionary dynamically. Compared with the best existing methods, we achieve a much higher compression ratio.

The proposed dictionary prior is also used to restore noisy document images. Our dictionary-based restoration simultaneously improves document image quality and encoding effectiveness.

The second learning technology is layout structure detection for document images. Our layout detection method is faster and more efficient than conventional methods. Using this technology, we construct a sparse feature set for document images, which is then used in our novel, efficient document image search system.

Indexing (document details)
Advisor: Bouman, Charles A.
Committee: Bell, Mark R., Comer, Mary L.
School: Purdue University
Department: Electrical and Computer Engineering
School Location: United States -- Indiana
Source: DAI-B 77/01(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Statistics, Electrical engineering, Computer science
Keywords: Binary document image, Compression, Image analysis, Reconstruction, Search, Sparse coding
Publication Number: 3723139
ISBN: 9781339057941
Copyright © 2019 ProQuest LLC. All rights reserved.