
University Library, University of Illinois at Urbana-Champaign

Introduction to OCR and Searchable PDFs

Learn OCR best practices and how to begin an OCR project using ABBYY FineReader, Adobe Acrobat Pro, or Tesseract with this guide.

Adobe Acrobat Pro

What is Adobe Acrobat Pro?

Adobe Acrobat Pro is PDF software that includes optical character recognition (OCR). It is used to convert scanned files, PDF files, and image files into editable, searchable documents. It comes in three versions: Acrobat X Pro, Acrobat XI Pro, and Acrobat Pro DC. The differences between these versions are outlined in the left column. Though it offers fewer language options than ABBYY FineReader, Adobe Acrobat Pro is more widely used, partly because it is aimed at business users rather than an academic audience. It is available for both Mac and Windows machines, and includes apps for iOS, Android, and Windows.

Adobe Acrobat Pro can analyze documents in multiple ways:

  • Acrobat can analyze images as they are scanned into the program
  • Acrobat can analyze already existing images, PDF files, or other file types after PDF conversion

Basic OCR Operations in Adobe Acrobat Pro:

  • Open the document in Acrobat as a PDF
  • Click 'Edit Text'
  • The program applies optical character recognition to the document
  • The document is now fully editable

Because the resulting files are editable and searchable, researchers can:

  • Copy, paste, and edit passages of text within the document
  • Search the text in PDF readers or word processing programs
  • Ingest the text into analysis programs like ATLAS.ti or NVivo
  • Make information easier to find via the Internet by creating searchable documents
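
As a rough illustration of searching OCR output outside a PDF reader, the Python sketch below looks for a keyword in text that has been copied or exported from an OCR'd document and returns each hit with some surrounding context. The function name `keyword_in_context`, the `width` parameter, and the sample text are hypothetical; they are not part of Acrobat or of any other tool covered in this guide.

```python
import re


def keyword_in_context(text, keyword, width=30):
    """Return each case-insensitive match of `keyword` in `text`,
    with up to `width` characters of context on either side."""
    hits = []
    for match in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        start = max(match.start() - width, 0)
        end = min(match.end() + width, len(text))
        # Collapse line breaks and repeated spaces so each hit fits on one line
        snippet = " ".join(text[start:end].split())
        hits.append(snippet)
    return hits


# Hypothetical sample standing in for text copied from an OCR'd PDF
sample = (
    "Optical character recognition turns scanned pages\n"
    "into searchable text. Recognition quality depends on the scan."
)

for hit in keyword_in_context(sample, "recognition", width=20):
    print(hit)
```

Matching is case-insensitive because OCR'd text often mixes headings and body text, and a researcher usually wants both. Results are only as good as the recognized text, so a misread word will not be found.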

DC and XI and Reader, oh my!

Discussing Adobe Acrobat Pro can get confusing because of the number of versions that have existed and still exist today. This section goes through the current iterations of Adobe Acrobat Pro and their availability on campus and for download.

Video Tutorials