This activity will help you familiarize yourself with the Adobe Acrobat Pro interface. The goal of this exercise is to convert a scanned image into a PDF file, run optical character recognition (OCR) on it, and then export the file as a Microsoft Word document.
Step 1: Import JPEG file
- Open Adobe Acrobat Pro (it should be on your desktop, or you can find it among the programs listed under the Windows button in the lower left-hand corner).
- Once Adobe Acrobat Pro is open, the next step is to locate the document you would like to work with. For this activity, we will use the document titled 'Activity #1- Pride & Prejudice', found in the Activity Document section of this page. Download the document.
- Return to Acrobat, and from the File menu select Create > PDF from File. Import the document you just downloaded.
- The image will be immediately converted into a non-editable PDF.
Step 2: Using OCR
- First, make two tools visible in the navigation panel at the left of the screen:
- Go to View > Show/Hide > Navigation Panes > Order
- Go to View > Show/Hide > Navigation Panes > Content
- These two tools will let you tag your PDF and set a reading order.
- On the right-hand toolbar, click ‘Edit PDF’.
- Wait while Acrobat runs OCR on the page; this may take a few moments.
- You should now be able to highlight text and use the edit tools. Play around with the 'Edit Text & Images' tools to familiarize yourself with them.
- If you want to create a reading order and tag structure on the page, be sure to click on at least one text box to ensure the text is registered.
- Click the “Close” button on the upper right to turn off editing.
- You should now be able to highlight the text, but the selection will include stray text from the marginal notes on the page.
- Click the Order icon in the left sidebar (four boxes connected by a “Z”).
- Click the Options menu (rectangle with two dots and dashes) and select Show reading order panel.
- The reading order panel will open.
- Click the “Clear Page Structure…” button at the bottom of the panel.
- Click and drag on the page to draw a box over the text.
- The selected text will then be surrounded by purple boxes.
- Select the type of text you have boxed (a book or chapter title may be Heading 1, while the main body text will be Text/Paragraph).
- You can check your work by using Read Out Loud (View > Read Out Loud > Activate Read Out Loud) or by exporting it as a Microsoft Word Document.
Step 3: Exporting as a Microsoft Word Document
- Once you have familiarized yourself with the Edit Text & Images and Reading Order tools, go to File > Export To.
- From the drop-down list of options, choose Microsoft Word Document.
- Give your document a name and save it.
- Give the program a few seconds to finish exporting, then go to your desktop and open the document. Notice what did and did not translate correctly, and how time-intensive it would be to fix every oddity or mistake.
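If you would like to count the mistakes rather than spot them by eye, one option outside Acrobat is a short Python script. The sketch below is only an illustration: it assumes you have pasted a passage from the exported Word document into the script alongside a trusted copy of the same passage. The function names and both sample strings are hypothetical, and the misreading "trnth" is invented for the example. It uses Python's standard difflib module to list the words that differ.

```python
import difflib


def changed_words(reference, ocr_text):
    """List the word-level differences between a trusted passage and OCR output."""
    ref_words = reference.split()
    ocr_words = ocr_text.split()
    matcher = difflib.SequenceMatcher(None, ref_words, ocr_words)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # tag is 'replace', 'delete', or 'insert'
            differences.append(
                (tag, " ".join(ref_words[i1:i2]), " ".join(ocr_words[j1:j2]))
            )
    return differences


def similarity(reference, ocr_text):
    """Return a score up to 1.0, where 1.0 means identical word sequences."""
    return difflib.SequenceMatcher(None, reference.split(), ocr_text.split()).ratio()


# Hypothetical sample: the second line contains an invented OCR misreading.
trusted = "It is a truth universally acknowledged"
from_word = "It is a trnth universally acknowledged"

for tag, expected, found in changed_words(trusted, from_word):
    print(f"{tag}: expected '{expected}', found '{found}'")
print(f"similarity: {similarity(trusted, from_word):.2f}")
```

For the sample above, the script reports a single replaced word. Comparing word by word rather than character by character keeps the report short, though it will not tell you which letters inside a word were misread.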