To create learnsets for all fields in a class, complete the following steps.
- Verify the following.
  - Ensure that the Execute Extraction option is active on the Runtime Mode tab in the Settings dialog box.
  - Ensure that the Add trained documents to the learnset option is active on the Train Mode tab in the Settings dialog box.
  - Ensure that you have homogeneous sets of sample documents, where all documents within a set belong to the same class, because learning for data extraction is performed class by class.
  - You can learn either all fields defined for a class at once or one field at a time. In the first case, your project settings must select the first field when switching to the next document. In the second case, your project settings must keep the field selection unchanged when switching to the next document.
- Switch to Document Input Selection - Batch.
- In the left pane, select the first document from the batch that contains the documents for your learnset.
- Switch to Normal Train Mode.
- In the lower left pane, on the Classes tab, double-click the class you want to train.
  Note: If you use field inheritance, select the parent class.
  The Fields tab displays the fields of the selected class. The fields are empty and the first field is selected. Candidate highlighting turns on automatically.
- In the toolbar, click the Classify / analyze current document button to highlight all candidates for the first field on the document.
- In the middle pane, click the correct candidate.
  The field contains the text and the field’s background color changes to green. The program automatically selects the next field and highlights its candidates. When all document fields have a value, the document is automatically added to the extraction learnset. The program continues automatically with the next document and the first field.
- Repeat the previous steps until you have enough samples.
- Optional. On the Classes tab, check the number of documents in the extraction learnset.
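
The behavior described in the steps above can be pictured with a small conceptual model. The following Python sketch is not an interface of the program. Every name in it (`Document`, `check_homogeneous`, `train_batch`, the `chosen` mapping) is hypothetical and exists only for illustration. It assumes the "all fields at once" mode: for each document, selection starts at the first field, and the document joins the learnset only when every field has a value.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical stand-in for one sample document in a batch."""
    name: str
    doc_class: str
    # Hypothetical: the candidate text the user clicks for each field.
    chosen: dict[str, str]


def check_homogeneous(batch):
    """Return the single class shared by the batch, or raise if classes are mixed."""
    classes = {doc.doc_class for doc in batch}
    if len(classes) != 1:
        raise ValueError(f"batch must contain exactly one class, found: {sorted(classes)}")
    return classes.pop()


def train_batch(batch, field_names):
    """Walk the batch document by document and field by field.

    Returns the names of the documents that would be added to the
    extraction learnset in this simplified model.
    """
    check_homogeneous(batch)
    learnset = []
    for doc in batch:
        values = {}
        # Selection starts at the first field for every new document.
        for name in field_names:
            if name in doc.chosen:
                # Models the user clicking the correct candidate.
                values[name] = doc.chosen[name]
        # A document is added only when all fields have a value.
        if len(values) == len(field_names):
            learnset.append(doc.name)
    return learnset
```

In this model, a batch of two documents of the same class, where only the first has a candidate chosen for every field, yields a learnset that contains only the first document. A batch that mixes classes is rejected before any training starts, which mirrors the requirement for homogeneous sample sets.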