Recommended Standards for OCR Processing

Automated Redaction

Platform: OnBase
Product: Automated Redaction
Release: Foundation 22.1
License: Standard, Essential, Premier

OCR processing is an integral part of Automated Redaction, so all documents that undergo Automated Redaction should meet the recommended criteria for OCR processing. Following these recommendations helps the OCR engine return the most accurate results possible:

  • Scan documents at a resolution of at least 240 dpi (300 dpi recommended) when preparing them for OCR processing. Depending on the needs of your solution, a higher resolution may be required. Lower resolutions can cause the OCR engine to capture inaccurate or illegible text from the image.

    Always set the resolution to a square value, with equal horizontal and vertical resolution (such as 240x240 or 300x300 dpi). If OCR results are poor or inaccurate at 240 dpi, discard the scanned images and re-scan the documents at incrementally higher resolutions until the OCR results are acceptable.

  • Store bi-tonal (black and white) images using the TIFF-Group IV file format. Grayscale or color images should be saved using a lossless color-capable image file format.

  • Scan documents as bi-tonal images. Bi-tonal images require far less disk space and load faster than grayscale or color images.

    However, if documents were originally scanned as grayscale or color images, perform OCR processing on the original grayscale or color image first and then convert it to a bi-tonal format. The OCR engine produces better results with the original color or grayscale image than with a dithered one, especially if areas of the page are similar in contrast.

  • Always obtain the best possible image from the scanner before resorting to image cleanup.

  • Use a dedicated workstation for OCR processing.

  • When configuring an OCR format, use the Most accurate Recognizer setting.
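
The resolution and file format recommendations above can be checked before documents are submitted for OCR. The following is a minimal sketch in Python, not part of OnBase or Automated Redaction: the ScanInfo type, the check_scan function, the compression labels, and the set of lossless formats are all hypothetical names chosen for illustration. It assumes the scan properties have already been read from the image by some other tool.

```python
from dataclasses import dataclass

MIN_DPI = 240          # minimum resolution from the recommendations above
RECOMMENDED_DPI = 300  # recommended resolution from the recommendations above

# Assumed examples of lossless, color-capable formats; adjust for your solution.
LOSSLESS_COLOR = {"tiff_lzw", "tiff_uncompressed", "png"}


@dataclass
class ScanInfo:
    """Hypothetical description of one scanned page (not an OnBase type)."""
    x_dpi: int
    y_dpi: int
    color_mode: str   # "bitonal", "grayscale", or "color"
    compression: str  # illustrative labels, e.g. "tiff_group4", "png", "jpeg"


def check_scan(info: ScanInfo) -> list[str]:
    """Return a list of warnings; an empty list means no issues were found."""
    warnings = []
    lowest = min(info.x_dpi, info.y_dpi)
    if info.x_dpi != info.y_dpi:
        warnings.append(f"resolution {info.x_dpi}x{info.y_dpi} is not square")
    if lowest < MIN_DPI:
        warnings.append(f"resolution is below the {MIN_DPI} dpi minimum")
    elif lowest < RECOMMENDED_DPI:
        warnings.append(f"resolution is below the recommended {RECOMMENDED_DPI} dpi")
    if info.color_mode == "bitonal":
        if info.compression != "tiff_group4":
            warnings.append("bi-tonal image is not stored as TIFF Group IV")
    elif info.compression not in LOSSLESS_COLOR:
        warnings.append("grayscale or color image is not stored in a lossless format")
    return warnings


# A 300x300 dpi bi-tonal TIFF Group IV page meets every check.
print(check_scan(ScanInfo(300, 300, "bitonal", "tiff_group4")))  # prints []

# A non-square, low-resolution JPEG is flagged on several counts.
for warning in check_scan(ScanInfo(200, 240, "color", "jpeg")):
    print(warning)
```

A check like this only reports problems with scan properties. It does not measure OCR accuracy, so documents that pass may still need to be re-scanned at a higher resolution if the OCR results are poor.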