University Library, University of Illinois at Urbana-Champaign

Introduction to OCR and Searchable PDFs: ABBYY FineReader

Learn OCR best practices and how to begin an OCR project using ABBYY FineReader, Adobe Acrobat Pro, or Tesseract with this guide.

What is ABBYY FineReader

ABBYY FineReader is an optical character recognition (OCR) system. It converts scanned documents, PDF documents, and image files (including digital photos) into editable, searchable documents. ABBYY FineReader 12 can automatically recognize and process documents in any combination of 190 languages and provides full dictionary support for 48 of them. The software is available for Mac and Windows machines.

ABBYY FineReader can analyze documents from two sources:

  • Images or documents scanned directly into the program
  • Existing image or PDF files

Basic OCR Operations in ABBYY:

  • Before performing OCR, the program analyzes the structure of the entire document and detects the areas that contain text, images, tables, and/or barcodes
  • Recognition results are then displayed in the text window
  • Uncertain characters are highlighted in this window and the user can locate possible errors and quickly correct them within ABBYY FineReader
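The idea of highlighting uncertain characters can be illustrated with a minimal Python sketch. This is not ABBYY FineReader's actual method or API. The per-character confidence scores, the 0.80 threshold, and the function name mark_uncertain are all assumptions made up for illustration.

```python
from typing import List, Tuple

# Hypothetical recognition output: (character, confidence between 0 and 1).
# The scores below are invented for illustration only.
RecognizedChar = Tuple[str, float]


def mark_uncertain(chars: List[RecognizedChar], threshold: float = 0.80) -> str:
    """Return the text with each low-confidence character wrapped in [brackets]."""
    pieces = []
    for char, confidence in chars:
        if confidence < threshold:
            # Flag the character so a reviewer can find and correct it.
            pieces.append(f"[{char}]")
        else:
            pieces.append(char)
    return "".join(pieces)


sample = [("m", 0.99), ("o", 0.97), ("d", 0.95), ("e", 0.98), ("r", 0.62), ("n", 0.58)]
print(mark_uncertain(sample))  # prints: mode[r][n]
```

In this made-up sample, the final "rn" might really be an "m" in the scan, which is the kind of error a reviewer would check by hand.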

Because the resulting files are editable and searchable, researchers can:

  • Copy and paste passages of text
  • Search the text in PDF readers or word processing programs
  • Ingest the text into analysis programs like ATLAS.ti or NVivo
  • Make information easier to find via the web by creating searchable documents
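As a rough sketch of what searchable text makes possible, the Python example below looks up a term in OCR output and prints each match with some surrounding context. It assumes the recognized text has already been exported as plain text. The function name find_passages and the sample string are hypothetical, and the example uses only the Python standard library.

```python
import re
from typing import List


def find_passages(text: str, term: str, context: int = 30) -> List[str]:
    """Return each case-insensitive match of `term` with surrounding context."""
    snippets = []
    for match in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
        start = max(match.start() - context, 0)
        end = min(match.end() + context, len(text))
        # Collapse line breaks so each snippet reads as a single line.
        snippets.append(" ".join(text[start:end].split()))
    return snippets


# Made-up stand-in for text exported from an OCR program.
ocr_text = "The Library opened in 1926.\nLater, the library added a reading room."

for snippet in find_passages(ocr_text, "library"):
    print(snippet)
```

The same search would not work on an unprocessed scan, because an image of a page contains no text for the program to match.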

Scholarly Commons

Contact:
306 Main Library
Drop-ins welcome
Monday-Friday 8:30am-6:00pm
Phone: 217-244-1331