Publisher
AMER LIBRARY ASSOC
Citation
Han, Y., & Wan, X. (2018). Digitization of Text Documents Using PDF/A. Information Technology and Libraries, 37(1), 52-64.
Rights
Copyright © 2018 Information Technology and Libraries.
Collection Information
This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu.
Abstract
The purpose of this article is to demonstrate a practical use case of PDF/A for digitization of text documents following FADGI's recommendation of using PDF/A as a preferred digitization file format. The authors demonstrate how to convert and combine TIFFs with associated metadata into a single PDF/A-2b file for a document. Using real-life examples and open source software, the authors show readers how to convert TIFF images, extract associated metadata and International Color Consortium (ICC) profiles, and validate against the newly released PDF/A validator. The generated PDF/A file is a self-contained and self-described container that accommodates all the data from digitization of textual materials, including page-level metadata and ICC profiles. Providing theoretical analysis and empirical examples, the authors show that PDF/A has many advantages over the traditionally preferred file format, TIFF/JPEG2000, for digitization of text documents.
Note
Open access journal
ISSN
2163-5226
0730-9295
Version
Final published version
Additional Links
http://ejournals.bc.edu/ojs/index.php/ital/article/view/9878
10.6017/ital.v37i1.9878