Once you have configured the table decomposition mode for the Line Item Extraction Field Zone, you can set the remaining configuration options, which are not specific to any mode.
Data Field Zone Line Item Extraction Option |
Description |
---|---|
XML Node (Parent) |
Note:
This field is only displayed if the Advanced Capture form is configured to create an XML rendition of documents matched to it (i.e., the Create XML data rendition option is selected for the Advanced Capture form).
Enter the name of the XML parent node (i.e., the element) that contains the Keyword Values extracted from this zone when the XML rendition of the document is created.
Note:
While the XML Node (Parent) name may include alphanumeric characters, underscores (_), hyphens (-), or periods (.), it must begin with either a letter or an underscore. |
Begin table marker |
Use the Begin table marker field to specify the text that indicates the beginning of the information being extracted. Markers can be configured as literal text (i.e., the text specified in the Begin table marker field is the marker) or as regular expressions (i.e., the text in the Begin table marker field defines a regular expression, and text matching that pattern is the marker). Literal markers must be enclosed in quotation marks to differentiate them from regular expression markers. Multiple begin table markers (literal, regular expression, or both) can be configured by separating them with a space in the Begin table marker field.
Note:
To access the Regular Expression Library, click in the field and press F2. See The Regular Expression Library for more information.
By default, text used as a Begin table marker is not included in the Keyword Value(s) being extracted. To include this text in the Keyword Value, select the Inclusive check box beneath the Begin table marker field. |
End table marker |
Use the End table marker field to specify the text that indicates the end of the information being extracted. Markers can be configured as literal text (i.e., the text specified in the End table marker field is the marker) or as regular expressions (i.e., the text in the End table marker field defines a regular expression, and text matching that pattern is the marker). Literal markers must be enclosed in quotation marks to differentiate them from regular expression markers. Multiple end table markers (literal, regular expression, or both) can be configured by separating them with a space in the End table marker field.
Note:
To access the Regular Expression Library, click in the field and press F2. See The Regular Expression Library for more information.
By default, text used as an End table marker is not included in the Keyword Value(s) being extracted. To include this text in the Keyword Value, select the Inclusive check box beneath the End table marker field. |
Invalid Row |
Use the Invalid Row field to specify the text that indicates the beginning of a row that should be discarded. Markers can be configured as literal text (i.e., the text specified in the Invalid Row field is the marker) or as regular expressions (i.e., the text in the Invalid Row field defines a regular expression, and text matching that pattern is the marker). Literal markers must be enclosed in quotation marks to differentiate them from regular expression markers. Multiple invalid row markers (literal, regular expression, or both) can be configured by separating them with a space in the Invalid Row field.
Note:
To access the Regular Expression Library, click in the field and press F2. See The Regular Expression Library for more information. |
Page Location(s) |
The Page Location(s) options control the pages that the OCR engine searches for a particular Data Field Zone.
|
VB script |
Use the VB script drop-down list to select a VBScript to associate with the processing of this Data Field Zone. Click the ... button to open the VB Scripts dialog box, where the selected script can be reconfigured or edited. |
Suspect level |
Enter the Suspect Level threshold, from 1 to 99, in this field. By default, this value is the default Suspect Level set for the Advanced Capture form. The Suspect Level is the level of confidence placed in data values captured in this zone. The default Suspect Level set for the Advanced Capture form and the actual Suspect Level detected for the selected table are displayed below the Suspect level field.
After a zone is processed, the OCR engine gives the resulting value a score between 1 and 99, depending on how confident it is in the result. The higher the score, the lower the OCR engine's confidence in the result. The value you enter in this field is the threshold that separates acceptable values from suspect ones: a score higher than the Suspect Level threshold causes the value captured from the zone to be marked as suspect, and a score lower than the threshold indicates that the OCR engine considers the captured value acceptable.
For example, setting the Suspect Level to 99 indicates that you completely trust the result returned by the OCR engine, because no higher score can be returned and no result can be marked as suspect. Setting the Suspect Level to 1 indicates that you have no trust in the result, because no lower score can be returned and no result can be considered acceptable. Setting the Suspect Level to 0 reverts to the default threshold of 75.
Tip:
By default, the Suspect Level threshold is set to 75 and the average score given to a processed field is 70. It is considered a best practice to leave the Suspect Level at the default threshold of 75 to ensure that suspect Keyword Values are consistently identified. |
Advanced Recognition |
Note:
Options involving ICR processing below are only enabled if your solution is licensed for Intelligent Character Recognition (ICR).
Note:
The OCR engine does not support Asian characters when reading dot matrix-printed text.
Using this drop-down list, select the type of processing you would like to perform on this Data Field Zone.
Note:
The Auto-detect ICR/OCR options may not work properly if the Data Field Zone contains fewer than 25 characters.
Tip:
Of the two Auto-detect ICR/OCR options, Default to ICR is more likely to produce the best results when the type of text cannot be determined. This is because the OCR engine's intelligent character recognition generally reads machine-printed text more accurately than its optical character recognition reads hand-written text.
Note:
The Line Item Extraction Data Field Zone cannot contain a mixture of OCR and ICR values. All values within the zone must use one type of recognition only. |
Activation groups |
When you have configured multiple Form Identification Zones or Page Registration Zones for a document, you can assign individual Data Field Zones to a specific Form Identification or Page Registration Zone using activation groups. Activation groups allow you to activate only the Data Field Zones assigned to the Form Identification or Page Registration Zone that is used to match the document to an Advanced Capture form.
Data Field Zones assigned to Form Identification or Page Registration Zones that are not used to match the document to a form are not processed. Data Field Zones on pages other than the pages containing their assigned Form Identification or Page Registration Zones are also not processed, unless otherwise specified through the Page Location(s) setting or by adding a + to the front of the activation group name on the Form Identification or Page Registration Zone. This selective activation saves processing time and reduces the number of forms that need to be created for a Document Type.
Use the Activation groups field to enter or select an activation group name. Add a + to the front of a group name (e.g., +Group1) on a Form Identification or Page Registration Zone so that all Data Field Zones assigned to this group are processed. Use commas to separate multiple group names.
Alternatively, you can assign a form definition group as the Data Field Zone's activation group to activate the zone for processing. Form definition groups can be used to extract only specific types of information (e.g., header data vs. detail data) during processing. In the Activation groups drop-down list, form definition groups are enclosed in brackets (e.g., [Group1]). |
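
The naming rule given for the XML Node (Parent) option can be checked before a name is entered. The following Python sketch is illustrative only and is not part of the product; the function name `is_valid_xml_node_name` is hypothetical, and restricting "alphanumeric" to ASCII letters and digits is an assumption.

```python
import re

# First character: a letter or an underscore. Remaining characters: letters,
# digits, underscores, hyphens, or periods. ASCII-only matching is an
# assumption of this sketch.
XML_NODE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def is_valid_xml_node_name(name: str) -> bool:
    """Return True if `name` satisfies the XML Node (Parent) naming rule."""
    return XML_NODE_NAME.fullmatch(name) is not None
```

Under these assumptions, names such as `LineItems` or `_items.v2` pass, while `2ndTable` (starts with a digit) and `Line Item` (contains a space) are rejected.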
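
To make the Begin table marker and End table marker syntax concrete, the following Python sketch splits a marker field into quoted literal markers and unquoted regular expression markers, then returns the text between the first begin marker and the next end marker. It is not the product's implementation. The helper names are hypothetical, and several behaviors are assumptions: regular expression markers contain no spaces, markers are matched case-sensitively, and nothing is extracted when either marker is missing.

```python
import re
from typing import List, Optional

# A token is either a quoted literal ("...") or a run of non-space characters.
TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def parse_markers(field: str) -> List[re.Pattern]:
    """Turn a space-separated marker field into compiled patterns."""
    patterns = []
    for token in TOKEN.finditer(field):
        literal, regex = token.group(1), token.group(2)
        if literal is not None:
            if literal:  # ignore empty quotes
                # Quoted text is matched exactly as written.
                patterns.append(re.compile(re.escape(literal)))
        else:
            # Unquoted text is treated as a regular expression.
            patterns.append(re.compile(regex))
    return patterns


def first_match(patterns: List[re.Pattern], text: str, start: int = 0) -> Optional[re.Match]:
    """Return the earliest match of any pattern at or after `start`."""
    best = None
    for pattern in patterns:
        match = pattern.search(text, start)
        if match and (best is None or match.start() < best.start()):
            best = match
    return best


def extract_table_text(text: str, begin_field: str, end_field: str,
                       begin_inclusive: bool = False,
                       end_inclusive: bool = False) -> Optional[str]:
    """Return the text between the begin and end markers, or None."""
    begin = first_match(parse_markers(begin_field), text)
    if begin is None:
        return None
    end = first_match(parse_markers(end_field), text, begin.end())
    if end is None:
        return None  # assumption: no end marker means nothing is extracted
    # The inclusive flags mirror the Inclusive check boxes.
    start = begin.start() if begin_inclusive else begin.end()
    stop = end.end() if end_inclusive else end.start()
    return text[start:stop]
```

For example, a begin field of `"Item Qty" Desc\s+Qty` yields one literal marker and one regular expression marker, and either one can open the table.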
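
The Suspect Level comparison can be summarized in a few lines. The following Python sketch only illustrates the rule as described for the Suspect level option; the function names are hypothetical. The description does not say how a score exactly equal to the threshold is handled, so treating it as acceptable is an assumption.

```python
DEFAULT_SUSPECT_LEVEL = 75  # default threshold stated in the description


def effective_threshold(configured: int) -> int:
    """Resolve the configured Suspect Level; 0 reverts to the default."""
    if configured == 0:
        return DEFAULT_SUSPECT_LEVEL
    if not 1 <= configured <= 99:
        raise ValueError("Suspect Level must be 1-99, or 0 for the default")
    return configured


def is_suspect(score: int, configured: int = 0) -> bool:
    """Return True if a value with this OCR score is marked as suspect.

    Higher scores mean lower confidence. Assumption: a score equal to the
    threshold is treated as acceptable.
    """
    return score > effective_threshold(configured)
```

With the default threshold of 75, the stated average score of 70 is acceptable, while a score of 80 is marked as suspect.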
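
The activation group notation (comma-separated names, a leading + on a Form Identification or Page Registration Zone, and brackets around form definition groups) can be illustrated with a small parser. The following Python sketch is hypothetical and simplified: the names `ActivationGroup`, `parse_activation_groups`, and `zone_is_active` are not part of the product, and the sketch does not model page locations, which also affect whether a zone is processed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ActivationGroup:
    name: str
    process_all: bool      # written with a leading "+", e.g. +Group1
    form_definition: bool  # written in brackets, e.g. [Group1]


def parse_activation_groups(field: str) -> List[ActivationGroup]:
    """Split a comma-separated Activation groups value into its groups."""
    groups = []
    for raw in field.split(","):
        token = raw.strip()
        process_all = token.startswith("+")
        if process_all:
            token = token[1:].strip()
        form_definition = (len(token) > 2
                           and token.startswith("[") and token.endswith("]"))
        if form_definition:
            token = token[1:-1].strip()
        if token:  # skip empty entries such as a trailing comma
            groups.append(ActivationGroup(token, process_all, form_definition))
    return groups


def zone_is_active(data_zone_field: str, matched_zone_field: str) -> bool:
    """Return True if the Data Field Zone shares a group with the matched zone.

    Simplification: only group membership is checked here.
    """
    data_names = {g.name for g in parse_activation_groups(data_zone_field)}
    matched_names = {g.name for g in parse_activation_groups(matched_zone_field)}
    return bool(data_names & matched_names)
```

In this sketch, a Data Field Zone assigned to `Group1, Group2` is active when the document was matched by a zone carrying `+Group1`, and inactive when the matched zone carries only `Group3`.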