Select the format for the non-image rendition of the document that the OCR process creates. The following formats are available:
Output Format | Additional Information |
---|---|
ASCII Text (Standard) | Non-formatted plain ASCII text |
ASCII Text (Formatted) | Plain ASCII text with formatting preserved |
PDF (Standard) | Standard Portable Document Format (PDF) document |
PDF (Image with Searchable text) | Original image overlaid on text |
PDF (Image Substitutes) | PDF that includes embedded image snippets of any text or image areas of the document that the OCR engine cannot read |
HTML 3.2 | Output compliant with HTML 3.2 |
HTML 4.0 | Output compliant with HTML 4.0 |
Microsoft Word 2003 (DOC) | Microsoft Word 2007 format (.docx). Note: The Microsoft Word 2003 (DOC) option is maintained for backward compatibility with previous versions of OnBase, but it now renders the output in the Microsoft Word 2007 format (.docx). |
Rich Text Format | Rich Text Format (RTF) |
Microsoft Word 2007 (DOCX) | Microsoft Word 2007 format (.docx) |
Unicode Text (Standard) | Non-formatted Unicode text (UTF-16 encoding) |
Unicode Text (Formatted) | Unicode text with formatting preserved (UTF-16 encoding) |
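
Because the two Unicode Text formats are UTF-16 encoded, a downstream tool that expects UTF-8 may need to decode the output explicitly. The following is a minimal Python sketch of that step, not part of OnBase. The function names and file paths are hypothetical, and it assumes the exported file starts with a byte order mark (BOM); without one, Python's `utf-16` codec falls back to the platform's native byte order.

```python
def read_unicode_text(path: str) -> str:
    """Read a UTF-16 text file (hypothetical OCR export) into a str."""
    # The "utf-16" codec consumes a leading BOM and uses it to pick the
    # byte order. newline="" keeps the original line endings untouched.
    with open(path, "r", encoding="utf-16", newline="") as handle:
        return handle.read()


def convert_to_utf8(source_path: str, target_path: str) -> None:
    """Re-encode a UTF-16 text file as UTF-8 for tools that expect UTF-8."""
    text = read_unicode_text(source_path)
    with open(target_path, "w", encoding="utf-8", newline="") as out:
        out.write(text)
```

For example, `convert_to_utf8("rendition.txt", "rendition-utf8.txt")` would write a UTF-8 copy of a UTF-16 export; both file names are placeholders.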
Encrypted PDFs are not supported for OCR processing.
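
To catch encrypted PDFs before they are submitted for OCR, a rough pre-check can look for the `/Encrypt` entry that the PDF format places in the trailer of an encrypted file. The Python sketch below is a heuristic under that assumption, not how OnBase detects encryption. The function names are hypothetical, and a plain byte search can produce false positives if the same bytes happen to appear elsewhere in the file.

```python
def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Heuristic check: True if the raw PDF bytes contain an /Encrypt key."""
    # Encrypted PDFs reference an encryption dictionary via /Encrypt in the
    # trailer (or cross-reference stream dictionary). This does not parse
    # the file, so treat a True result as "probably encrypted".
    return b"/Encrypt" in pdf_bytes


def file_looks_encrypted(path: str) -> bool:
    """Apply the heuristic to a file on disk (path is a placeholder)."""
    with open(path, "rb") as handle:
        return looks_encrypted(handle.read())
```

A file flagged by this check could be set aside for manual review rather than sent to the OCR process.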