Select the format for the non-image rendition of the document that the OCR process creates. The following formats are available:

| Output Format | Additional Information |
|---|---|
| ASCII Text (Standard) | Non-formatted plain ASCII text |
| ASCII Text (Formatted) | Plain ASCII text with formatting preserved |
| PDF (Standard) | Normal Portable Document Format (PDF) document |
| PDF (Image with Searchable text) | Original image overlaid on text |
| PDF (Image Substitutes) | PDF that includes embedded image snippets of any text or image areas of the document that the OCR engine cannot read |
| HTML 3.2 | HTML 3.2-compliant output |
| HTML 4.0 | HTML 4.0-compliant output |
| Microsoft Word 2003 (DOC) | Microsoft Word 2007 format (.docx). Note: While the Microsoft Word 2003 (DOC) format is maintained for backward compatibility with previous versions of OnBase, it now renders the output in the Microsoft Word 2007 format (.docx). |
| Rich Text Format | Rich Text Format (RTF) |
| Microsoft Word 2007 (DOCX) | Microsoft Word 2007 format (.docx) |
| Unicode Text (Standard) | Non-formatted Unicode text (UTF-16 encoding) |
| Unicode Text (Formatted) | Unicode text with formatting preserved (UTF-16 encoding) |

Encrypted PDFs are not supported for OCR processing.
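
The difference between the ASCII and Unicode text formats matters to any script that reads the text rendition after it has been exported. The following Python sketch is illustrative only and is not part of OnBase: the `TEXT_OUTPUT_ENCODINGS` mapping and the `read_ocr_text` and `demo` functions are hypothetical names, and the sketch assumes the Unicode output starts with a byte order mark so that Python's `utf-16` codec can detect the byte order.

```python
import os
import tempfile

# Hypothetical mapping from the plain-text output formats in the table
# to the encoding a downstream script would use to read the file.
TEXT_OUTPUT_ENCODINGS = {
    "ASCII Text (Standard)": "ascii",
    "ASCII Text (Formatted)": "ascii",
    "Unicode Text (Standard)": "utf-16",
    "Unicode Text (Formatted)": "utf-16",
}


def read_ocr_text(path, output_format):
    """Read a text rendition using the encoding implied by its output format."""
    if output_format not in TEXT_OUTPUT_ENCODINGS:
        raise ValueError(f"not a plain-text output format: {output_format}")
    encoding = TEXT_OUTPUT_ENCODINGS[output_format]
    # newline="" keeps the line endings exactly as they were written.
    with open(path, "r", encoding=encoding, newline="") as handle:
        return handle.read()


def demo():
    """Write a sample UTF-16 file (a stand-in for real OCR output) and read it back."""
    sample = "Invoice total: 120 \u20ac"
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "sample.txt")
        # Python's "utf-16" codec writes a byte order mark when encoding.
        with open(path, "w", encoding="utf-16") as handle:
            handle.write(sample)
        return read_ocr_text(path, "Unicode Text (Standard)")
```

Reading a UTF-16 file with an ASCII or UTF-8 decoder typically fails or yields stray null characters, which is why the sketch selects the encoding from the configured output format instead of guessing.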