The Optical Character Recognition (OCR) module recognizes printed alphanumeric characters in scanned image documents and image-only PDFs and translates them into characters in a text document. OnBase OCR supports 18 languages, making it an ideal solution for international companies or businesses with document sets in multiple languages. The following output formats are available:
- ASCII Text (Standard)
- ASCII Text (Formatted)
- PDF (several varieties)
- Microsoft Word
- HTML 3.2
- HTML 4.0
- Rich Text Format
- Unicode Text (Standard)
- Unicode Text (Formatted)
OCR is performed after an image has been scanned, imported through the Document Import Processor, or swept into OnBase. OCR settings can be created and saved for multiple Document Types.
The OCR module is most often used to convert image documents so that they can be:
- Full Text Indexed
- Internally and externally searched
- Used with text-based cross-referencing
OnBase uses the Hyland OCR Engine as its character recognition software. It does not require separate licensing or registration.
If you are licensed for any OnBase modules that use OCR technology (that is, Advanced Capture, Intelligent Capture for AP, or Automated Redaction), you must obtain and install the latest version of the Hyland OCR Engine software (available from your service provider) on each server or workstation that is to perform processing for these modules.
The OCR module is compatible with a number of different file formats and image types. For a list of supported file types, see the section on supported image file formats in the Full-Page OCR module reference guide.