OCR Corrections

Pro

Standard

Infix can be used to adjust the hidden text associated with a scanned document. This text is generated by OCR (Optical Character Recognition) software from a scanned image of a printed page.

The OCR text is hidden in the PDF so that the document can be searched. The hidden text often contains errors, and these are difficult to fix precisely because the text is not visible.
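For readers curious how text can be "hidden" in a PDF at all, the sketch below illustrates one common mechanism. It assumes the scanning software used the PDF text rendering mode 3 (the `3 Tr` operator, meaning "neither fill nor stroke"); some OCR packages hide text in other ways, and nothing here describes how Infix works internally. The function names, font name and coordinates are hypothetical and chosen only for illustration.

```python
import re


def invisible_text_ops(text, x, y, font="F1", size=10):
    """Build PDF content-stream operators that place `text` invisibly.

    Assumption: the font resource name (F1) and coordinates are
    placeholders, not values taken from any real document.
    """
    # Backslashes and parentheses must be escaped inside a PDF literal string.
    escaped = text.replace("\\", r"\\").replace("(", r"\(").replace(")", r"\)")
    return (
        "BT\n"                    # begin text object
        f"/{font} {size} Tf\n"    # select font and size
        "3 Tr\n"                  # rendering mode 3: invisible text
        f"{x} {y} Td\n"           # move to the text position
        f"({escaped}) Tj\n"       # show the string
        "ET"                      # end text object
    )


def uses_invisible_text(content_stream):
    """Return True if the stream sets text rendering mode 3 anywhere."""
    return re.search(r"(?<![\d.])3\s+Tr\b", content_stream) is not None


ops = invisible_text_ops("Hello (world)", 72, 700)
print(ops)
print(uses_invisible_text(ops))
```

Text written this way can still be searched and selected, which is why OCR errors in it go unnoticed until the text is made visible.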

Open the PDF to be corrected, then choose Document > OCR Corrections > Start.

This example shows a scanned page whose OCR text is hidden.

Because OCR mode can substantially change a PDF, you will be asked to confirm your choice.

Choose the "Start OCR mode" option if you wish to begin.

The hidden text is made visible and the scanned image is faded and locked so that it isn't accidentally moved during editing.

You can now edit the text whilst making reference to the original content in the image.

This example shows some corrections (shown in red because Show Changed Text has been enabled in the Preferences dialogue box).

When all corrections have been made, choose Document > OCR Corrections > Finish. The OCR text, including any edits you made, will become invisible again and the scanned image will be restored to its normal density.

Notes

  • If your document contains any non-OCR text that was added after the scanning process, this too will be hidden at the end of the correction process.
  • Choose View > Show Text Frames to see the boundaries between different blocks of text.
  • Changing the colour of the OCR text can make it easier to distinguish from the background image. This will not affect the finished PDF.
  • Some OCR packages create many small text blocks that are difficult to edit. Use the Marshal Text facility in Infix Pro to merge disjoint blocks of text into a single, editable text block.