Configuring a Text Value Form Identifier

Platform: OnBase
Product: Advanced Capture
Release: Foundation 24.1
License: Standard, Essential, Premier

A text value form identifier is configured using the options on the Text Match tab of the Form Identification Zone dialog box.

Form Identification Zone Text Match Options
Match Value

This field displays the text value returned from the OCR engine’s evaluation of the Form Identification Zone.

Note: The Match Value is also displayed in the Result Verification panel below the enlarged image of the Form Identification Zone.

You may select the match rules for the Form Identification Zone by selecting one of the match rule radio buttons.

Tip: If the position of the text that is being compared to the Match Value shifts from document to document to the extent that a portion of this text might fall outside the Form Identification Zone, or different text might fall inside the zone, select the Contains text option to increase the chances of a desired match.
  • Exact match. Select this radio button if the value identified by the OCR engine when a document is processed must exactly match the Match Value in order to match the document to the Advanced Capture form.
  • Contains text. Select this radio button if the Match Value can be part of a longer text string identified by the OCR engine for the document being processed in order for Advanced Capture to match the document to the Advanced Capture form.

    Note: When using this option, only the Match Value portion of the text string is taken into consideration to determine the Suspect Level of the value.
  • Fuzzy match. Select this radio button if the Match Value should match the value detected by the OCR engine for the document, but some margin of error is allowed between the two values.

    Enter a value in the Max Errors field to set the number of discrepancies allowed between the Match Value and the value read from the document being processed.

    For example, if your Match Value is HYLAND UNIVERSITY and the value detected by the OCR engine from the processed document is HYL4ND UNIVERS1TY, the document will still be matched to the Advanced Capture form if the Max Errors value is set to 2 or more.
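The three match rules can be illustrated with a minimal sketch. It assumes that a "discrepancy" is counted as one character insertion, deletion, or substitution (Levenshtein distance); the source does not state how Advanced Capture counts errors, and the function names `edit_distance` and `matches` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: each insertion, deletion, or substitution is one error."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def matches(ocr_text: str, match_value: str, rule: str, max_errors: int = 0) -> bool:
    """Apply one of the three match rules (hypothetical helper, not product code)."""
    if rule == "exact":
        return ocr_text == match_value
    if rule == "contains":
        return match_value in ocr_text
    if rule == "fuzzy":
        # Assumption: discrepancies are counted as edit distance.
        return edit_distance(ocr_text, match_value) <= max_errors
    raise ValueError(f"unknown rule: {rule}")
```

Under this assumption, HYL4ND UNIVERS1TY differs from HYLAND UNIVERSITY by two substitutions, so it matches with Max Errors set to 2 but not with Max Errors set to 1.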

Bar Coded Text
Note: This option is only available if your solution is licensed for the Bar Code Recognition Server.

Select this check box if the Form Identification Zone contains a bar code. The bar code will be converted into text and then identified according to the zone’s configured match rules.

Note: Depending on your configuration, the Advanced Capture engine will only search for certain bar code types in this zone. See Bar Code Types for more information.
Bar Code Types
Note: This option is only available when the Bar Coded Text option is selected.

Click this button to access the Select Bar Code Types dialog box.

Here you can select the types of bar codes for which the Advanced Capture engine will search when processing the Form Identification Zone. The engine will ignore any deselected bar code types when processing this zone.

By default, all bar code types are selected.

Tip: To expedite Advanced Capture processing, select only the desired bar code types.

To save changes to your bar code type selections, click Save.

Regular expression match

Select this check box if the Match Value should match the form of a regular expression.

For example, a regular expression for a U.S. Social Security Number is \d{3}-\d{2}-\d{4} (i.e., 3 digits, dash, 2 digits, dash, 4 digits). If the OCR engine identifies text matching this pattern in the zone, that text is returned as a Match Value.

Note: All regular expressions must be ECMA compliant.
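The Social Security Number pattern can be tried outside the product with a short sketch. It uses Python's `re` module, which is not an ECMA engine; this particular pattern behaves the same in both dialects once `re.ASCII` restricts `\d` to 0-9. The function name `find_match_value` is hypothetical.

```python
import re

# re.ASCII makes \d match only 0-9, as it does in ECMAScript.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}", re.ASCII)


def find_match_value(zone_text):
    """Return the first text in the zone that fits the pattern, or None."""
    found = SSN_PATTERN.search(zone_text)
    return found.group(0) if found else None
```

For instance, `find_match_value("SSN: 123-45-6789")` returns the matched text, while a zone containing no such pattern returns `None`.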
Ignore all whitespace

Select this check box to ignore all whitespace (e.g., spaces and paragraph returns) in the Match Value after it is read by the OCR engine.

For example, if the Match Value read by the OCR engine is ENG 313 and the Ignore all whitespace check box is selected, the Match Value would be read as ENG313.
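As a rough sketch of the effect, the following removes every whitespace character from a value before comparison. The helper name `ignore_whitespace` is hypothetical, and the product may define whitespace differently from Python's `\s`.

```python
import re


def ignore_whitespace(value: str) -> str:
    """Remove spaces, tabs, and line breaks from an OCR value."""
    return re.sub(r"\s+", "", value)
```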

Match DOES NOT EQUAL

Select this check box if the document should be matched to the Advanced Capture form only when the Match Value does not appear in the Form Identification Zone.
Allowed character types

These options are available to assist the OCR engine in determining if the value it reads is correct. Select the check box next to each character type that is allowed in the Match Value. The available options are:

  • Numerals. Numeric characters, 0-9.
  • Uppercase. Uppercase alphabetic characters.
  • Lowercase. Lowercase alphabetic characters.
  • Punctuation. Punctuation marks (e.g., . ! ?).
  • Miscellaneous. Other ASCII characters that do not fall into one of the above categories (e.g., # $ * @).

When a character is recognized by the OCR engine that is not part of an allowable character set, the character is replaced by a tilde (~) and the value is automatically marked as suspect.
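The tilde replacement can be pictured as a post-filter over the recognized text. This is only a sketch: the real engine uses the allowed types during recognition, the exact membership of the Punctuation and Miscellaneous sets is an assumption, and whitespace is assumed to pass through unchanged. All names are hypothetical.

```python
import string

# Assumption: which marks count as "punctuation"; the product's set may differ.
PUNCTUATION = set(".,!?;:'\"")


def classify(ch: str) -> str:
    """Map a character to one of the five character types."""
    if ch in string.digits:
        return "numerals"
    if ch in string.ascii_uppercase:
        return "uppercase"
    if ch in string.ascii_lowercase:
        return "lowercase"
    if ch in PUNCTUATION:
        return "punctuation"
    return "miscellaneous"


def apply_allowed_types(value: str, allowed: set):
    """Replace disallowed characters with '~' and report whether the value is suspect."""
    out = []
    suspect = False
    for ch in value:
        if ch.isspace() or classify(ch) in allowed:
            out.append(ch)
        else:
            out.append("~")
            suspect = True
    return "".join(out), suspect
```

With only Uppercase allowed, a value read as INV0ICE becomes INV~ICE and is flagged as suspect.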

Tip: Using the Allowed character types options can sometimes help the OCR engine more easily determine the correct value by eliminating characters that are obviously not correct (e.g., an I is correctly identified instead of a 1 because numeric characters are filtered and prevented from being recognized as part of the value).

Select the Disable Asian Text Support check box to instruct the OCR engine to skip Asian (i.e., double-byte) characters.

Note: The Disable Asian Text Support check box is only available if the OCR format assigned to the Document Type is configured for an Asian language (e.g., Japanese or Korean).
Tip: Selecting the Disable Asian Text Support check box allows you to identify numeric data (i.e., Date Keyword Values, Currency Keyword Values, and Document Dates) when performing OCR on documents configured to contain Asian (i.e., double-byte) characters.
Suspect Level

Enter the Suspect Level threshold, 1-99, in this field. By default, this value is set to the default Suspect Level set for the Advanced Capture form.

The Suspect Level is the level of confidence placed in data values captured in this zone. The default Suspect Level set for the Advanced Capture form and the actual Suspect Level detected for the selected value are displayed below the Suspect Level field.

After a zone is processed, the OCR engine gives the resulting value a score between 1 and 99, depending on how confident it is in the result that was returned. The higher the score, the lower the OCR engine's confidence in the result.

The value you enter in this field is the threshold at which the OCR engine determines if a returned value is acceptable or suspect. A score returned by the OCR engine higher than the Suspect Level threshold you set causes the value captured from the zone to be marked as suspect. All scores lower than the Suspect Level threshold indicate that the captured value is considered by the OCR engine to be acceptable.

For example, setting the Suspect Level to 99 would indicate you completely trust the result returned by the OCR engine because no higher score could be returned and no result could be marked as suspect.

Setting the Suspect Level to 1 would indicate you have no trust in the result, since no lower score could be returned and no result could be determined acceptable.

Setting the Suspect Level to 0 reverts it to the default threshold of 75.
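The threshold logic can be summarized in a few lines. This is a sketch with hypothetical names; the source does not say how a score exactly equal to the threshold is treated, so the code assumes it is acceptable.

```python
DEFAULT_SUSPECT_LEVEL = 75


def is_suspect(score: int, suspect_level: int) -> bool:
    """Return True when an OCR score (1-99, higher = less confident) exceeds the threshold."""
    # A configured value of 0 falls back to the default threshold.
    threshold = DEFAULT_SUSPECT_LEVEL if suspect_level == 0 else suspect_level
    # Assumption: a score equal to the threshold is treated as acceptable.
    return score > threshold
```

For example, a score of 80 is suspect at the default threshold of 75, a score of 70 is acceptable, and with the threshold at 99 nothing can be marked as suspect.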

Tip: By default, the Suspect Level threshold is set to 75 and the average score given to a processed field is 70. It is considered a best practice to set your Suspect Level to the default threshold of 75 to ensure that suspect Keyword Values are being consistently identified.

Note: If the Form Identification Zone is marked suspect, it will not be used to determine if the form matches the document. If no other Form Identification Zones are configured for the form, or if all of the zones are marked suspect, the form cannot be matched to the document.
Combined rule expressions

In the case where multiple Form Identification Zones are needed to match a document to a form, you can configure how these zones are processed using the Combined rule expressions option.

Rules can be dragged and dropped into the proper order in this field and can be combined and evaluated using the AND or OR Boolean operators.

  • Rules placed within the same Identification Group are combined using an AND operator.
  • Different Identification Groups are evaluated using an OR operator.
Note: Entire Identification Groups cannot be copied. However, individual rules can be copied to new or existing groups by right-clicking on the rule name and dragging it into the desired Identification Group.

A document is matched to a form when all of the rules within an Identification Group (i.e., an entire set of AND rules) are true.

For example, consider a configuration with three Identification Groups, each consisting of two rules. For the document to be matched to the form, both rules within any one of the three Identification Groups must be true. If any rule in Identification Group #1 is false, the rules in Identification Group #2 are evaluated, and so on.
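The AND-within-group, OR-across-groups evaluation can be sketched as follows. Each Identification Group is represented as a list of already-evaluated rule results; the function name and data layout are hypothetical.

```python
def first_matching_group(identification_groups):
    """Return the 1-based number of the first group whose rules are all true, else None.

    Rules inside a group are combined with AND; groups are tried in order (OR).
    """
    for number, rules in enumerate(identification_groups, start=1):
        # An empty group is not treated as a match in this sketch.
        if rules and all(rules):
            return number
    return None
```

With `[[True, False], [True, True], [False, False]]`, Identification Group #1 fails on its second rule, so Group #2 is evaluated and produces the match.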

VB Script

Use the VB Script drop-down list to select a VB script to associate with the identification of this Advanced Capture form.

Click the ... button to open the VB Scripts dialog box. Here, the selected script can be re-configured or edited. For more information on these options, contact your System Administrator.

Activation groups

When you have configured multiple Form Identification Zones or Page Registration Zones for a document, you can assign individual Data Field Zones to a specific Form Identification or Page Registration Zone using activation groups. Activation groups allow you to activate only the Data Field Zones assigned to the Form Identification or Page Registration Zone that is used to match the document to an Advanced Capture form.

Data Field Zones assigned to Form Identification or Page Registration Zones that are not used to match the document to a form will not be processed. Also, Data Field Zones present on pages other than the pages containing their assigned Form Identification or Page Registration Zones will not be processed, unless otherwise specified through the Page Location(s) setting or by adding a + to the front of the activation group name on the Form Identification or Page Registration Zone.

This selective activation saves processing time and reduces the number of forms that need to be created for a Document Type.

Use the Activation groups field to enter or select an activation group name. Add a + to the front of a group name (for example, +Group1) on a Form Identification or Page Registration Zone to set all Data Field Zones assigned to this group to be processed. Use commas to separate multiple group names.

  • When a Form Identification Zone or Page Registration Zone is matched to a form, all activation groups that have been configured for the zone will be activated.

  • Form Identification Zones are organized into Identification Groups (under Combined rule expressions), and only one Identification Group can be matched to a form. Once an Identification Group has been matched, any remaining Identification Groups on the document will be skipped.

  • Multiple Page Registration Zones can be matched to a form. Every Page Registration Zone on the document will be tested for a match.

  • If multiple activation groups have been configured for a Data Field Zone, the zone will be processed if any of these activation groups is activated.

  • If no activation groups have been configured for a Data Field Zone, the zone will be considered active and thus will be processed.

  • If a Data Field Zone is configured to only be searched for on certain pages (that is, through the Page Location(s) setting), the zone can only be considered active on these pages. This overrides any conflicting settings that would otherwise activate the Data Field Zone (for example, when a Data Field Zone is assigned to an activation group that is named with a + on the corresponding Form Identification or Page Registration Zone, or when a Data Field Zone is not assigned to any activation group).
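The activation rules above can be approximated in a short sketch. It is one reading of the rules rather than the product's implementation: it handles a single matched Form Identification or Page Registration Zone, and every name (`parse_activation_groups`, `is_zone_active`, and their parameters) is hypothetical.

```python
from typing import Dict, Optional, Set


def parse_activation_groups(field_text: str) -> Dict[str, bool]:
    """Parse 'A, +B' into {'A': False, 'B': True}; True marks a '+' prefix."""
    groups = {}
    for raw in field_text.split(","):
        name = raw.strip()
        if name:
            groups[name.lstrip("+").strip()] = name.startswith("+")
    return groups


def is_zone_active(zone_groups: Set[str],
                   page_locations: Optional[Set[int]],
                   activated: Dict[str, bool],
                   page: int,
                   matched_page: int) -> bool:
    """Decide whether a Data Field Zone is processed on a given page."""
    # Page Location(s) overrides any setting that would otherwise activate the zone.
    if page_locations is not None and page not in page_locations:
        return False
    # A zone with no activation groups is always considered active.
    if not zone_groups:
        return True
    for group in zone_groups:
        if group in activated:
            # Active on the matched zone's page, on every page when the group
            # was named with '+', or on pages listed in Page Location(s).
            if activated[group] or page == matched_page or page_locations is not None:
                return True
    return False
```

For example, if the matched zone on page 1 lists `+Group1, Group2`, a Data Field Zone in Group2 is active only on page 1, while a zone in Group1 is also active on page 2.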

Use registration point

Select this check box to enable the Registration Point feature.

Note: The Registration Point is the starting point (upper-left corner) of the value detected in the Form Identification Zone. Click Find Registration to automatically set the Registration Point at the starting point of the value in the Form Identification Zone, or double-click a location in the field below the check box to manually set the Registration Point at that position.

The position of the Registration Point on the Advanced Capture form is compared to the same position on the document being processed to determine the offset (i.e., the skew or rotation) of any imported documents. The position of the configured Form Identification Zone is adjusted on the document being processed to account for the detected offset to ensure that the Advanced Capture process is able to process the document properly.
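The adjustment can be pictured with a simplified sketch that handles only a horizontal and vertical shift; correcting rotation would need more than one reference point, and the source does not describe how the product does it. Coordinates are assumed to be pixels, zones are `(left, top, width, height)` tuples, and the function name is hypothetical.

```python
def adjust_zone(zone, configured_point, detected_point):
    """Shift a zone by the offset between the configured and detected Registration Points."""
    dx = detected_point[0] - configured_point[0]
    dy = detected_point[1] - configured_point[1]
    left, top, width, height = zone
    return (left + dx, top + dy, width, height)
```

If the Registration Point was configured at (110, 60) and is found at (118, 55) on the processed document, a zone at (100, 50) moves to (108, 45) and keeps its size.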

If you selected the Regular expression match check box for the Match Value, the Register on capture group position check box will be enabled. Select this option and click Find Registration to set the Registration Point to the starting point of the first capture group contained within the Match Value’s regular expression. This option is useful when the Match Value’s first capture group is not the first value detected within the Form Identification Zone.
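The difference between registering on the whole match and on the first capture group can be shown on plain text. This sketch returns a character index within the zone text, not an image position, and uses Python's `re` module rather than an ECMA engine; the function name and the sample pattern are hypothetical.

```python
import re


def registration_index(zone_text, pattern, use_capture_group):
    """Return the character index where registration would start, or None if no match."""
    found = re.search(pattern, zone_text)
    if found is None:
        return None
    # Use the first capture group when requested and when it took part in the match.
    if use_capture_group and found.re.groups >= 1 and found.start(1) != -1:
        return found.start(1)
    return found.start()
```

For the text `Invoice No. 48213` and the pattern `Invoice No\.\s*(\d+)`, the whole match starts at the label, while the first capture group starts at the number itself.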