A text value form identifier is configured using the options on the Text Match tab of the Form Identification Zone dialog box.
Form Identification Zone Text Match Option | Description |
---|---|
Match Value |
This field displays the text value returned from the OCR engine’s evaluation of the Form Identification Zone.
Note: The Match Value is also displayed in the Result Verification panel below the enlarged image of the Form Identification Zone.
Select one of the match rule radio buttons to set the match rule for the Form Identification Zone.
Tip: If the position of the text being compared to the Match Value shifts from document to document so much that part of this text might fall outside the Form Identification Zone, or different text might fall inside the zone, select the Contains text option to increase the chances of a desired match.
|
Bar Coded Text |
Note: This option is only available if your solution is licensed for the Bar Code Recognition Server.
Select this check box if the Form Identification Zone contains a bar code. The bar code is converted into text and then identified according to the zone’s configured match rules.
Note: Depending on your configuration, the Advanced Capture engine searches for only certain bar code types in this zone. See Bar Code Types for more information.
|
Bar Code Types |
Note: This option is only available when the Bar Coded Text option is selected.
Click this button to access the Select Bar Code Types dialog box. Here you can select the types of bar codes for which the Advanced Capture engine searches when processing the zone. The engine ignores any deselected bar code types. By default, all bar code types are selected.
Tip: To expedite Advanced Capture processing, select only the desired bar code types.
To save changes to your bar code type selections, click Save. |
Regular expression match |
Select this check box if the Match Value should match the pattern of a regular expression. For example, the regular expression for a Social Security Number is \d{3}-\d{2}-\d{4} (i.e., 3 digits, dash, 2 digits, dash, 4 digits). If the OCR engine identifies text matching this pattern in the zone, that text is returned as the Match Value.
Note: All regular expressions must be ECMA-compliant.
|
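The pattern matching described for the Regular expression match option can be sketched as follows. This is only an illustration outside the product: it uses Python's `re` module rather than an ECMA-compliant engine (this particular pattern behaves the same way in both), and the function name `find_match_value` is hypothetical.

```python
import re
from typing import Optional

# Pattern from the text: 3 digits, dash, 2 digits, dash, 4 digits.
# re.ASCII limits \d to 0-9, which is how ECMAScript treats \d.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}", re.ASCII)


def find_match_value(zone_text: str) -> Optional[str]:
    """Return the first substring of the OCR text that fits the pattern, or None."""
    match = SSN_PATTERN.search(zone_text)
    return match.group(0) if match else None


print(find_match_value("SSN: 123-45-6789"))
print(find_match_value("12-345-6789"))
```

Testing a pattern this way against sample OCR output before entering it in the dialog box can help catch patterns that are too strict or too loose.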
Ignore all whitespace |
Select this check box to ignore all whitespace (e.g., spaces and paragraph returns) in the Match Value after it is read by the OCR engine. For example, if the Match Value read by the OCR engine is ENG 313 and the Ignore all whitespace check box is selected, the Match Value is read as ENG313. |
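The effect of the Ignore all whitespace option can be pictured with a short sketch. The helper name `ignore_whitespace` is hypothetical, and treating tabs and line breaks the same way as spaces is an assumption based on the "all whitespace" wording.

```python
def ignore_whitespace(match_value: str) -> str:
    """Remove every whitespace character (spaces, tabs, line breaks) from the value."""
    # str.split() with no arguments splits on any run of whitespace.
    return "".join(match_value.split())


print(ignore_whitespace("ENG 313"))
```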
Match DOES NOT EQUAL | Select this check box if the document should be matched to the Advanced Capture form only when the Match Value does not appear in the Form Identification Zone. |
Allowed character types |
These options are available to assist the OCR engine in determining if the value it reads is correct. Select the check box next to each character type that is allowed in the Match Value. The available options are:
When the OCR engine recognizes a character that is not part of an allowed character set, the character is replaced by a tilde (~) and the value is automatically marked as suspect.
Tip: Using the Allowed character types options can sometimes help the OCR engine determine the correct value by eliminating characters that are obviously incorrect (e.g., an I is correctly identified instead of a 1 because numeric characters are filtered out and cannot be recognized as part of the value).
Select the Disable Asian Text Support check box to instruct the OCR engine to skip Asian (i.e., double-byte) characters.
Note: The Disable Asian Text Support check box is only available if the OCR format assigned to the Document Type is configured for an Asian language (e.g., Japanese or Korean).
Tip: Selecting the Disable Asian Text Support check box allows you to identify numeric data (i.e., Date Keyword Values, Currency Keyword Values, and Document Dates) when performing OCR on documents configured to contain Asian (i.e., double-byte) characters.
|
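The tilde replacement described for the Allowed character types options can be sketched as follows. This is a simplified model, not the OCR engine's actual logic: the function name `apply_allowed_characters` is hypothetical, and the allowed set used here (uppercase letters only) is an assumed example rather than one of the product's character types.

```python
import string


def apply_allowed_characters(value, allowed):
    """Replace characters outside the allowed set with '~' and flag the value as suspect."""
    suspect = any(ch not in allowed for ch in value)
    filtered = "".join(ch if ch in allowed else "~" for ch in value)
    return filtered, suspect


# Assumed example: only uppercase letters are allowed, so the digit is replaced.
uppercase_only = set(string.ascii_uppercase)
print(apply_allowed_characters("INV1", uppercase_only))
```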
Suspect Level |
Enter the Suspect Level threshold, 1-99, in this field. By default, this value is set to the default Suspect Level set for the Advanced Capture form. The Suspect Level is the level of confidence placed in data values captured in this zone. The default Suspect Level set for the Advanced Capture form and the actual Suspect Level detected for the selected value are displayed below the Suspect Level field.
After a zone is processed, the OCR engine gives the resulting value a score between 1 and 99, depending on how confident it is in the result. The higher the score, the lower the OCR engine's confidence in the result. The value you enter in this field is the threshold at which a returned value is considered acceptable or suspect. A score higher than the Suspect Level threshold causes the value captured from the zone to be marked as suspect. A score lower than the threshold indicates that the OCR engine considers the captured value acceptable.
For example, setting the Suspect Level to 99 indicates that you completely trust the result returned by the OCR engine, because no higher score can be returned and no result can be marked as suspect. Setting the Suspect Level to 1 indicates that you have no trust in the result, because no lower score can be returned and no result can be considered acceptable. Setting the Suspect Level to 0 reverts to the default threshold of 75.
Tip: By default, the Suspect Level threshold is set to 75 and the average score given to a processed field is 70. It is considered a best practice to keep your Suspect Level at the default threshold of 75 to ensure that suspect Keyword Values are consistently identified.
Note: If the Form Identification Zone is marked suspect, it is not used to determine whether the form matches the document. If no other Form Identification Zones are configured for the form, or if all of the zones are marked suspect, the form cannot be matched to the document.
|
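The threshold comparison described for the Suspect Level can be sketched as follows. The function names are hypothetical, and the handling of a score exactly equal to the threshold is an assumption: the description only covers scores higher or lower than the threshold, so this sketch treats an equal score as acceptable.

```python
DEFAULT_SUSPECT_LEVEL = 75  # default threshold stated in the description


def effective_threshold(suspect_level: int) -> int:
    """Resolve the configured value: 0 reverts to the default, 1-99 is used as entered."""
    if suspect_level == 0:
        return DEFAULT_SUSPECT_LEVEL
    if not 1 <= suspect_level <= 99:
        raise ValueError("Suspect Level must be 0 or between 1 and 99")
    return suspect_level


def is_suspect(score: int, suspect_level: int) -> bool:
    """A higher score means lower OCR confidence; scores above the threshold are suspect."""
    # Assumption: a score equal to the threshold is treated as acceptable.
    return score > effective_threshold(suspect_level)


print(is_suspect(70, 75))  # average score against the default threshold
print(is_suspect(80, 75))
```

With the default threshold of 75, a value with the average score of 70 is accepted, while a value scored 80 is marked as suspect.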
Combined rule expressions |
When multiple Form Identification Zones are needed to match a document to a form, you can configure how these zones are processed using the Combined rule expressions option. Rules can be dragged and dropped into the proper order in this field and can be combined and evaluated using the AND or OR Boolean operators.
Note: Entire Identification Groups cannot be copied. However, individual rules can be copied to new or existing groups by right-clicking on the rule name and dragging it within the desired Identification Group.
A document is matched to a form when all of the rules within an Identification Group (i.e., an entire set of AND rules) are true. For example, suppose three Identification Groups are configured and each group consists of two rules. For the document to be matched to the form, both rules within any one of the three Identification Groups must be true. If any rule in Identification Group #1 is false, the rules in Identification Group #2 are evaluated, and so on. |
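The evaluation order described for Identification Groups amounts to an OR across groups of AND rules. A minimal sketch, assuming each rule has already been evaluated to true or false (the function name `form_matches` and the list-of-lists representation are hypothetical):

```python
def form_matches(identification_groups):
    """Return True if every rule in at least one Identification Group is true.

    Each group is a list of booleans (one per rule); groups are checked in order.
    """
    for rules in identification_groups:
        # An empty group is skipped rather than counted as a match.
        if rules and all(rules):
            return True
    return False


# Three groups of two rules each: group #1 fails, group #2 succeeds.
print(form_matches([[True, False], [True, True], [False, False]]))
```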
VB Script |
Use the VB Script drop-down list to select a VB script to associate with the identification of this Advanced Capture form. Click the ... button to open the VB Scripts dialog box, where the selected script can be reconfigured or edited. For more information on these options, contact your System Administrator. |
Activation groups |
When you have configured multiple Form Identification Zones or Page Registration Zones for a document, you can assign individual Data Field Zones to a specific Form Identification or Page Registration Zone using activation groups. Activation groups allow you to activate only the Data Field Zones assigned to the Form Identification or Page Registration Zone that is used to match the document to an Advanced Capture form. Data Field Zones assigned to Form Identification or Page Registration Zones that are not used to match the document to a form will not be processed. Also, Data Field Zones present on pages other than the pages containing their assigned Form Identification or Page Registration Zones will not be processed, unless otherwise specified through the Page Location(s) setting or by adding a + to the front of the activation group name on the Form Identification or Page Registration Zone. This selective activation saves processing time and reduces the number of forms that need to be created for a Document Type.
Use the Activation groups field to enter or select an activation group name. Add a + to the front of a group name (for example, +Group1) on a Form Identification or Page Registration Zone to set all Data Field Zones assigned to this group to be processed. Use commas to separate multiple group names.
|
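The syntax of the Activation groups field (comma-separated names, with an optional leading +) can be illustrated with a small parser. This is a sketch only: the function name `parse_activation_groups` is hypothetical, and trimming spaces around each name is an assumption about how the field is read.

```python
def parse_activation_groups(field_value):
    """Split the field into (group_name, process_all_zones) pairs.

    process_all_zones is True when the name was entered with a leading '+'.
    """
    groups = []
    for raw_name in field_value.split(","):
        name = raw_name.strip()  # assumption: surrounding spaces are ignored
        if not name:
            continue
        has_plus = name.startswith("+")
        groups.append((name[1:] if has_plus else name, has_plus))
    return groups


print(parse_activation_groups("+Group1, Group2"))
```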
Use registration point |
Select this check box to enable the Registration Point feature.
Note: The Registration Point is the starting point (upper-left corner) of the value detected in the Form Identification Zone.
Click Find Registration to automatically set the Registration Point at the starting point of the value in the Form Identification Zone, or double-click a location in the field below the check box to manually set the Registration Point at that position.
The position of the Registration Point on the Advanced Capture form is compared to the same position on the document being processed to determine the offset (i.e., the skew or rotation) of any imported documents. The position of the configured Form Identification Zone is adjusted on the document being processed to account for the detected offset to ensure that the Advanced Capture process is able to process the document properly.
If you selected the Regular expression match check box for the Match Value, the Register on capture group position check box will be enabled. Select this option and click Find Registration to set the Registration Point to the starting point of the first capture group contained within the Match Value’s regular expression. This option is useful when the Match Value’s first capture group is not the first value detected within the Form Identification Zone. |
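The difference between the start of the whole match and the start of the first capture group can be seen in a short sketch. It works on character offsets within the OCR text, whereas the product sets the Registration Point as a position on the image. The pattern, the sample text, and the function name `capture_group_start` are all hypothetical.

```python
import re


def capture_group_start(pattern, zone_text):
    """Return (start of whole match, start of first capture group) as character offsets."""
    match = re.search(pattern, zone_text)
    if match is None or match.re.groups < 1:
        return None
    return match.start(0), match.start(1)


# The label "Invoice No." is detected first; the capture group is the number after it.
print(capture_group_start(r"Invoice No\.\s*(\d+)", "Invoice No. 12345"))
```

Here the whole match starts at the label, while the first capture group starts at the number, which is the kind of case where registering on the capture group position is useful.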