American Periodicals Series Online
About APS Online




Digitizing the American Periodicals Series (APS) Microform Collections

Database Development
American Periodicals Series (APS) Online is part of ProQuest Information and Learning's "Digital Vault" initiative: the digitization of the world's largest commercially available microfilm collection. Stored in three climate-controlled underground vaults, the ProQuest microfilm collection includes over 5 billion page images from over 20,000 periodicals, 7,000 newspapers, 400 research collections, and 1,000,000 dissertations.

Why Digitize APS?
Surveys of academics and librarians showed great enthusiasm for digitizing the APS microform collections as one of the first Digital Vault products. Respondents expected to see full page images (a replication of the microfilm) with searchable ASCII text, and to be able to search publications by date and issue.

The Digitization Process
Preparations for digitizing the 7,000,000-page APS collections included developing strict quality control and editorial guidelines. After successful test runs, digitization of APS in sequence (see the description of the microform collection) got underway. For each periodical, the process involved:

  • Pulling the master negative from the vault.
  • Preparing and cleaning the film, and building 100-foot rolls with two page images per frame of film.
  • Scanning the film at 300+ dpi resolution.
  • Cleaning and enhancing images with filters.
  • Performing quality checks on each of these procedures.

APS Online was released during the summer of 2000 as part of a 48-month production schedule. APS I and a portion of APS II were completed and made available in 2001. A customized APS interface was developed and made available in 2001, and in 2002 new digitization specifications were introduced. Both the interface and new digitization specifications were informed by feedback from users, customers, and the APS Advisory Board.

Digitization of APS I (1740-1800)
Because of the variety of archaic journals, typefaces, and images, the OCR rate for APS I (1740-1800) material fluctuated greatly. Magazine-like content was OCR'd primarily in the 70%-90% range. Newspaper-like content, with its larger image size and multi-column format, posed a greater challenge to OCR, and its rates generally did not reach the levels of magazine-like content. (This content is being evaluated for upgrading through the methods used for APS II and APS III, which are described below.)

Customers and users responded very positively to ProQuest having made a vast repository of historical, primary-source content accessible online. In response to suggestions for more focused search capabilities, additional information on the more than 1,100 periodicals, and better image quality, ProQuest initiated new manufacturing procedures for APS II and III and introduced a customized interface in mid-2001.

Digitization of APS II and APS III
After pages are scanned and captured electronically, each page of each journal is zoned: the various content elements (including cover, table of contents, article title and subtitle, article, images, and captions) are electronically outlined and separated from the page into a unique file. Each outlined area is deskewed and despeckled, and each outlined area is tagged according to editorial guidelines. The outlined areas are threaded, meaning they are returned to their original page location.
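The zoning and threading steps can be pictured with a small data model. The following is only an illustrative sketch, not ProQuest's actual file format or software: the `Zone` class, its field names, the tag strings, and the `thread` and `zones_with_tag` helpers are all hypothetical. The idea it shows is that each outlined area keeps its tag and its original page coordinates, so it can be processed separately and still be put back where it came from.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """One outlined content element on a page (hypothetical structure)."""
    tag: str        # editorial tag, e.g. "article_title" or "caption" (assumed names)
    x: int          # left edge of the outline on the original page, in pixels
    y: int          # top edge of the outline on the original page, in pixels
    width: int
    height: int
    text: str = ""  # OCR text for this zone, if any


def thread(zones):
    """Order zones by their original page location (top to bottom, then left to right)."""
    return sorted(zones, key=lambda z: (z.y, z.x))


def zones_with_tag(zones, tag):
    """Select zones of one type, e.g. to restrict a search to captions."""
    return [z for z in zones if z.tag == tag]


# Zones are stored in arbitrary order after being processed separately.
page_zones = [
    Zone("article", 100, 400, 900, 1200, "Body text of the article"),
    Zone("caption", 100, 1650, 900, 60, "A view of the harbor"),
    Zone("article_title", 100, 200, 900, 120, "On the State of Trade"),
]

threaded = thread(page_zones)
captions = zones_with_tag(page_zones, "caption")
```

Because every zone carries its own coordinates, the same records could in principle support both a full page display (all zones in place) and an individual article display (a subset of zones).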

This procedure greatly improved the OCR rate, which is especially important for "busy pages" (pages with multiple articles or images). Article titles, image captions, and article abstracts (generally the first paragraph of each article) are captured at an OCR rate of at least 99.95%, and the mid-90% range is routine for text, except in cases where the original source material had flaws. These OCR rates improve viewing and search capabilities. The zoning process also allows APS users to access either full page or individual article displays, and to focus searches on a variety of article types.
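To put these figures in perspective: if an OCR rate is read as character-level accuracy, 99.95% corresponds to at most 5 errors per 10,000 characters, while 95% allows up to 500. The page does not say how its OCR rates are measured, so the sketch below rests on an assumption: it treats the rate as one minus the character edit distance between a trusted transcription and the OCR output, divided by the length of the transcription. The function names are hypothetical.

```python
def edit_distance(reference, candidate):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(candidate) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, cand_char in enumerate(candidate, start=1):
            cost = 0 if ref_char == cand_char else 1
            current.append(min(
                previous[j] + 1,         # delete a character from reference
                current[j - 1] + 1,      # insert a character into reference
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def ocr_accuracy(reference, ocr_output):
    """Character accuracy in [0, 1] under the assumed definition above."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    errors = edit_distance(reference, ocr_output)
    return max(0.0, 1 - errors / len(reference))


# One wrong character out of four gives 75% accuracy.
sample_rate = ocr_accuracy("abcd", "abxd")
```

Other conventions exist, such as word-level accuracy, and would give different numbers for the same page.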

Content produced under the new manufacturing process first appeared in APS Online during summer 2002. Twenty top journals from the second half of the nineteenth century were selected as the first to be manufactured under the new process. Manufacturing of the original APS sequence (continuing APS II) will resume after the twenty journals are completed, and the previously manufactured content already available in APS Online will be evaluated for upgrading.

Progress reports on content loading, profiles of new features, and general APS product information are available in APS News.

