Bi-level document image compression using layout information

Inglis, Stuart J.; Witten, Ian H.

Abstract

Most bi-level images stored on computers today comprise scanned text, and are stored using generic bi-level image technology based either on classical run-length coding, such as the CCITT Group 4 method, or on modern schemes such as JBIG that predict pixels from their local image context. However, image compression methods that are tailored specifically for images known to contain printed text can provide noticeably superior performance because they effectively enlarge the context to the character level, at least for those predictions for which such a context is relevant. To deal effectively with general documents that contain text and pictures, it is necessary to detect layout and structural information from the image, and employ different compression techniques for different parts of the image. We extend previous work in document image compression in two ways. First, we include automatic discrimination between text and non-text zones in an image. Second, we test the system on a large real-world image corpus.

Type

Conference Contribution

Citation

Inglis, S. J., & Witten, I. H. (1996). Bi-level document image compression using layout information. In J. A. Storer & M. Cohn (Eds.), Proceedings of the DCC ’96, Data Compression Conference (pp. 442–450). Washington, DC, USA: IEEE. https://doi.org/10.1109/DCC.1996.488374

Date

1996

Publisher

IEEE

Rights

This is an author’s accepted version of an article published in Proceedings of the DCC '96, Data Compression Conference. © 1996 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
