You can use the CloudOCR engine for document recognition, although we recommend using FineReader 11 for this task.
Consider the following information.
- CloudOCR returns a rotation value between 0 and 360 to correct the rotation of each page. This functionality cannot be deactivated.
- CloudOCR returns the location per word; there is no per-character position information.
- CloudOCR returns the confidence information per word.
To access the confidence value, use the method GetCharConfidence. For more information, see "GetCharConfidence" in the Brainware Intelligent Capture Scripting Help.
- The [base endpoint]/vision/v2.0/read/core/asyncBatchAnalyze call returns a confidence value of 100 for all characters of a recognized word and a confidence value of 50 for each character of a word flagged with low confidence.
- The [base endpoint]/vision/v3.[x]/read/analyze[?language] call returns a confidence value between 0 and 100 for all characters of a recognized word.
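The following Python sketch illustrates how a per-word confidence could translate into the per-character values described above. It is not the CloudOCR or Brainware Intelligent Capture implementation: the function names and parameters are hypothetical, and the v3.[x] variant assumes that every character of a word receives the same value and that the word confidence is already expressed on a 0 to 100 scale.

```python
from typing import List


def char_confidences_v2(word: str, low_confidence: bool) -> List[int]:
    """v2.0-style mapping: 50 per character if the word is flagged
    with low confidence, otherwise 100 per character."""
    value = 50 if low_confidence else 100
    return [value] * len(word)


def char_confidences_v3(word: str, word_confidence: int) -> List[int]:
    """v3.[x]-style mapping (assumption): every character receives the
    confidence of the word, a value between 0 and 100."""
    if not 0 <= word_confidence <= 100:
        raise ValueError("word_confidence must be between 0 and 100")
    return [word_confidence] * len(word)


print(char_confidences_v2("Total", low_confidence=True))  # [50, 50, 50, 50, 50]
print(char_confidences_v3("Total", 87))                   # [87, 87, 87, 87, 87]
```

In both cases the values do not vary within a word, so a per-character threshold effectively accepts or rejects whole words.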
Document Recognition with Word Segmentation Characters
We strongly recommend that you do not use word segmentation characters with the CloudOCR engine, as the results may not be fully compatible with other engines, such as the Format Analysis Engine.
When performing document recognition using word segmentation characters, CloudOCR splits any word that contains a segmentation character into three or more separate words: the part before the segmentation character, the character itself, and the part after the character.
Example
- One of the default word segmentation characters is "/".
- A date is formatted as follows: MM/DD/YYYY.
Result
The date is broken into five words:
- MM
- DD
- YYYY
- Two "/" characters
As all five words have the same position values, the Format Analysis Engine might order them randomly, for example as / DD YYYY MM /.
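The splitting behavior in the example above can be approximated with the following Python sketch. It is an illustration only, not the CloudOCR implementation: the function names and the bounding box values are hypothetical, and it assumes that every resulting part inherits the position of the original word.

```python
import re
from typing import List, Tuple

# Hypothetical position format: (left, top, width, height)
Box = Tuple[int, int, int, int]


def split_on_segmentation_chars(word: str, seg_chars: str = "/") -> List[str]:
    """Split a word at each segmentation character and keep the
    characters themselves as separate words."""
    if not seg_chars:
        return [word]
    # The capturing group makes re.split return the separators as well.
    pattern = "([" + re.escape(seg_chars) + "])"
    return [part for part in re.split(pattern, word) if part]


def segment_word(word: str, box: Box, seg_chars: str = "/") -> List[Tuple[str, Box]]:
    """Attach the position of the original word to every part."""
    return [(part, box) for part in split_on_segmentation_chars(word, seg_chars)]


parts = segment_word("MM/DD/YYYY", (120, 340, 96, 18))
print([text for text, _ in parts])     # ['MM', '/', 'DD', '/', 'YYYY']
print(len({box for _, box in parts}))  # 1
```

Because all five parts carry an identical position, a consumer that orders words by position alone has nothing to distinguish them, which is consistent with the arbitrary ordering described above.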