To improve the Intelligent Capture for AP (ICAP) engine's ability to accurately index documents without manual user intervention, you can train the engine to find document identifiers and Keyword Values by performing index verification on batches that have already undergone ICAP processing. By confirming or correcting the indexing results achieved by the engine, you provide the engine with feedback on how effective its indexing logic was for a given batch. The engine can then incorporate this feedback when processing batches for the same vendor going forward, refining its indexing logic and steadily improving over time.
When attempting to identify a Document Type, the Intelligent Capture for AP engine's logic allows for either of two classification methods: the Naive Bayes method and the Keyword Tag method.
The Naive Bayes method is the primary classification method. It operates on a per-vendor basis and uses probability to determine the likelihood that a given document matches a particular Document Type. Naive Bayes classification is used only if both of the following are true: the document enters the vendor identification stage of the Intelligent Capture process without a Document Type, and, once the vendor is identified, learning has already occurred for that vendor. In addition, only documents belonging to a Document Type assigned to the current scan queue are considered for Naive Bayes classification.
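To illustrate the general idea of per-vendor probabilistic classification, the following is a minimal sketch of a multinomial Naive Bayes classifier over OCR word tokens. It is not the engine's actual implementation, which is not described here. The class name `VendorNaiveBayes`, its methods, the example Document Type names, and the use of add-one smoothing are all assumptions made for illustration. The `allowed_types` parameter is one possible reading of the scan queue restriction.

```python
import math
from collections import Counter, defaultdict


class VendorNaiveBayes:
    """Toy multinomial Naive Bayes model; one instance per vendor (hypothetical)."""

    def __init__(self):
        self.doc_counts = Counter()              # Document Type -> training documents seen
        self.word_counts = defaultdict(Counter)  # Document Type -> word -> occurrences
        self.vocabulary = set()                  # every word seen during learning

    def learn(self, doc_type, words):
        """Record one verified document as training feedback."""
        self.doc_counts[doc_type] += 1
        self.word_counts[doc_type].update(words)
        self.vocabulary.update(words)

    def classify(self, words, allowed_types):
        """Return the most likely Document Type, or None if nothing was learned.

        Only types in allowed_types (assumed here to be the types assigned
        to the scan queue) are considered.
        """
        candidates = [t for t in allowed_types if t in self.doc_counts]
        if not candidates:
            return None
        total_docs = sum(self.doc_counts[t] for t in candidates)
        # +1 reserves probability mass for words never seen in training.
        vocab_size = len(self.vocabulary) + 1
        best_type, best_score = None, -math.inf
        for doc_type in candidates:
            # Work in log space to avoid underflow on long documents.
            score = math.log(self.doc_counts[doc_type] / total_docs)
            total_words = sum(self.word_counts[doc_type].values())
            for word in words:
                count = self.word_counts[doc_type][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_type, best_score = doc_type, score
        return best_type
```

In this sketch, calling `learn` corresponds to feedback gathered during index verification, and `classify` returns `None` when no learning has occurred for the vendor, mirroring the condition that Naive Bayes classification requires prior learning.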
The Keyword Tag method is the secondary classification method. It analyzes the Keyword tags found on a document to try to predict the Document Type. Like Naive Bayes classification, Keyword Tag classification is used only if a document enters the vendor identification stage of the Intelligent Capture process without a Document Type. However, Keyword Tag classification can only occur if Naive Bayes classification fails to determine a Document Type and if the Intelligent Capture for AP engine has not previously been exposed to the document.
Document Type classification can be bypassed altogether if only one Document Type has been assigned to the scan queue, if a default Document Type has been assigned to the scan queue, or if a Document Type is assigned in the Pre Index Scan Mode. For more information on scan queue configuration and pre-indexing, see the Document Imaging module reference guide or help files.
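The order in which these rules apply can be summarized in a short sketch. The function and parameter names below (`resolve_document_type`, `ScanQueue`, `vendor_has_learning`, `seen_before`) are hypothetical and do not correspond to any product API. The order of the three bypass checks is an assumption, as is the choice to fall through to Keyword Tag classification when Naive Bayes classification is skipped because no learning exists for the vendor.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class ScanQueue:
    """Hypothetical stand-in for a scan queue's Document Type settings."""
    document_types: Sequence[str]
    default_document_type: Optional[str] = None


def resolve_document_type(
    queue: ScanQueue,
    pre_indexed_type: Optional[str],
    vendor_has_learning: bool,
    seen_before: bool,
    naive_bayes: Callable[[], Optional[str]],
    keyword_tag: Callable[[], Optional[str]],
) -> Optional[str]:
    # Bypass: a Document Type assigned in the Pre Index Scan Mode.
    if pre_indexed_type is not None:
        return pre_indexed_type
    # Bypass: a default Document Type assigned to the scan queue.
    if queue.default_document_type is not None:
        return queue.default_document_type
    # Bypass: only one Document Type assigned to the scan queue.
    if len(queue.document_types) == 1:
        return queue.document_types[0]
    # Primary method: Naive Bayes, only when learning exists for the vendor.
    if vendor_has_learning:
        result = naive_bayes()
        if result is not None:
            return result
    # Secondary method: Keyword Tag, only for documents the engine
    # has not previously been exposed to.
    if not seen_before:
        return keyword_tag()
    return None
```

Passing the two classifiers as callables keeps the sketch independent of how either method is implemented; each is invoked only when its preconditions hold.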
Once a Document Type has been identified, the Intelligent Capture for AP engine can be trained to learn where to find Keyword Values on documents. It does so by analyzing the OCR results of the areas surrounding values that have been manually indexed using any of the Interactive Data Capture product's three indexing methods: Auto-Complete indexing, Point and Click indexing, or Swiping (see the Interactive Data Capture module reference guide for more information). Using these OCR results, the engine attempts to learn which Keyword tags or contexts identify a value, the approximate location of the value on a document, and the general format or pattern by which the value is captured. This information is then stored in the database and associated with the specific vendor that issued the document. When processing documents from the same vendor in the future, the engine recalls and uses this information to attempt to capture the desired data accurately without manual user intervention.
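As a rough illustration of the three kinds of learned information (a nearby Keyword tag, an approximate location, and a value pattern), the sketch below records them from one manually indexed value and reuses them on a later document. Every name here (`OcrWord`, `LearnedField`, `generalize`, `learn_field`, `extract`), the coordinate model, the tolerance value, and the sample text are assumptions for illustration only; the engine's real learning logic and storage format are not described in this guide.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OcrWord:
    text: str
    x: float  # horizontal position on the page (arbitrary units)
    y: float  # vertical position on the page


@dataclass
class LearnedField:
    tag: str      # Keyword tag text found near the value
    dx: float     # value position relative to the tag
    dy: float
    pattern: str  # coarse regular expression describing the value format


def generalize(value: str) -> str:
    """Build a coarse regex: ASCII digits and letters become character classes."""
    parts = []
    for ch in value:
        if ch.isascii() and ch.isdigit():
            parts.append(r"\d")
        elif ch.isascii() and ch.isalpha():
            parts.append("[A-Za-z]")
        else:
            parts.append(re.escape(ch))  # punctuation is kept literally
    return "".join(parts)


def learn_field(value_word: OcrWord, tag_word: OcrWord) -> LearnedField:
    """Record tag, relative location, and pattern from one indexed value."""
    return LearnedField(
        tag=tag_word.text,
        dx=value_word.x - tag_word.x,
        dy=value_word.y - tag_word.y,
        pattern=generalize(value_word.text),
    )


def extract(words: List[OcrWord], learned: LearnedField,
            tolerance: float = 20.0) -> Optional[str]:
    """Find a word near the expected offset from the tag that fits the pattern."""
    for tag in words:
        if tag.text != learned.tag:
            continue
        for word in words:
            near = (abs(word.x - (tag.x + learned.dx)) <= tolerance
                    and abs(word.y - (tag.y + learned.dy)) <= tolerance)
            if near and re.fullmatch(learned.pattern, word.text):
                return word.text
    return None
```

In practice, a `LearnedField` in this sketch would be stored per vendor and per Keyword Type, so that `extract` is only applied to documents from the vendor it was learned on.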