Before creating a Report Mining process, ensure you understand the following terms.
- area
-
The area is the section of a page where a Report Mining process searches for information to extract. When you create a process, you must specify where the area begins and ends on each page. Each Document Type assigned to a process can have a different area defined.
- column
-
A column, also called an output column, refers to a column in the generated output file, or spreadsheet. Columns classify the type of information pulled from a document. For example, when you mine an aging report, you may pull the invoice number, customer name, and current balance. Each of these would have its own column in the output file.
- document
-
In this guide, documents are OnBase documents from which Report Mining extracts information to display in the output file. An aging report is used in most of the examples provided.
- mining
-
Mining refers to the process of extracting information from OnBase documents and consolidating that information in an output file.
- output file
-
An output file, also referred to as a spreadsheet, is the file that contains information extracted from mined documents. On workstations with a version of Microsoft Excel installed, this file is displayed as an Excel spreadsheet. When Excel is not available, the file is displayed in CSV (comma-delimited) format. CSV files can be imported into other spreadsheet applications for formatting and analysis.
- record
-
A record is information in the mined document that is used to generate a new row in the output file. A group of related records is called a record group. For example, in an aging report, each invoice line item would be a record. Therefore, each invoice would have its own row in the output file. Each collection of invoices for a specific customer would be a record group.
- record group
-
A record group is a collection of related records in a document. Record groups are used to find records in a document and to indicate whether records are related. For example, the following aging report shows invoices and balances for multiple customers. Each customer has multiple invoices, and each invoice is a record. The invoices pertaining to a single customer are called a record group.
- record group single data
-
Record group single data is information that occurs once within a record group. In an aging report, this may be the customer's name or contact information.
- record primary data
-
Record primary data is the key information that each row in the output file is based on. For example, in an aging report, this may be an invoice number.
- record secondary data
-
Record secondary data is additional information that exists on the same line as the record primary data in the mined document. In an aging report, this may be the invoice date or balance.
In this example, the invoice number is the primary data that generates a new row in the output file. The invoice date and current balance are secondary data because they occur on the same line as the primary data. The customer’s name is record group single data because it occurs only once.
- row
-
A row refers to a line of cells in the output file. Each row corresponds to a single record in the mined document.
- tag
-
A tag is a location on a page that Report Mining uses to find the area or information being mined. For example, the first page of a document may contain the text "Page 1". To obtain page numbers from a document, Report Mining may use "Page" as the tag string.
- tag string
-
A tag string is the string of characters that composes a tag. Report Mining uses the tag string to identify the tag. For example, the first page of a document may contain the text "Page 1". To obtain page numbers from a document, Report Mining may use "Page" as the tag string.
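
The terms above can be tied together with a small conceptual sketch. The Python below is not how Report Mining is configured or implemented in OnBase; it is only an illustration of the vocabulary, under stated assumptions. The report text, the `Customer:` group tag, the invoice line layout, and every function name are hypothetical. It shows a tag string ("Page") locating a page number, a record group starting at each customer line, record primary data (the invoice number) generating one row per record, record secondary data (date and balance) taken from the same line, and record group single data (the customer name) repeated on each row of the CSV output.

```python
import csv
import io
import re

# Hypothetical aging report text; the layout and values are invented for illustration.
REPORT = """\
Aging Report                Page 1
Customer: Acme Supply
INV-1001  01/05/2024   250.00
INV-1002  01/19/2024    75.50
Customer: Birch Tools
INV-2001  02/02/2024   410.00
"""

TAG_STRING = "Page"        # tag string used to locate the page number
GROUP_TAG = "Customer:"    # hypothetical tag marking the start of a record group
# Primary data (invoice number) followed by secondary data (date, balance) on one line.
RECORD_PATTERN = re.compile(r"^(INV-\d+)\s+(\S+)\s+([\d.]+)$")
COLUMNS = ["Invoice Number", "Invoice Date", "Current Balance", "Customer Name"]


def find_page_number(text):
    """Return the number that follows the tag string, or None if the tag is absent."""
    match = re.search(re.escape(TAG_STRING) + r"\s+(\d+)", text)
    return int(match.group(1)) if match else None


def mine(text):
    """Build one output row per record, carrying the group's single data onto each row."""
    rows = []
    customer = None  # record group single data for the current group
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith(GROUP_TAG):
            # A new record group begins; remember its single data.
            customer = line[len(GROUP_TAG):].strip()
            continue
        match = RECORD_PATTERN.match(line)
        if match:
            invoice, date, balance = match.groups()
            rows.append({
                "Invoice Number": invoice,   # record primary data
                "Invoice Date": date,        # record secondary data
                "Current Balance": balance,  # record secondary data
                "Customer Name": customer,   # record group single data
            })
    return rows


def to_csv(rows):
    """Render the rows as comma-delimited text with one column per output column."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print("Page number:", find_page_number(REPORT))
    print(to_csv(mine(REPORT)))
```

In this sketch, each matched invoice line becomes one row, and the customer name is copied onto every row in its group because it appears only once in the source text.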