You can define a set of format strings for each field. Before you begin, consider the following information.
- Review your documents and, for each field, create a list of the strings that you want the engine to extract. Doing so makes it easier to identify the patterns to search for.
- Do not try to find exactly one candidate. Instead, try to generate a set of candidates. Learning works better with both correct and false examples.
- Try to anticipate other likely formats. If none of the format strings can match the correct candidate, it is never identified, and the need for manual correction increases.
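
The points above can be illustrated with a minimal sketch. It assumes that format strings behave like regular expressions and uses Python's `re` module as a stand-in; the engine's own format-string syntax may differ. The field (an invoice date), the `DATE_FORMATS` list, the `find_candidates` helper, and the sample text are all hypothetical.

```python
import re

# Hypothetical format strings for an "invoice date" field, written as
# Python regular expressions (an assumption, not the engine's syntax).
# Several formats are listed so that likely variants are anticipated.
DATE_FORMATS = [
    r"\b\d{2}/\d{2}/\d{4}\b",            # e.g. 03/01/2024
    r"\b\d{4}-\d{2}-\d{2}\b",            # e.g. 2024-04-14
    r"\b\d{1,2} [A-Z][a-z]{2} \d{4}\b",  # e.g. 15 Mar 2024
]


def find_candidates(text, formats):
    """Return every match of every format, ordered by position in the text."""
    candidates = []
    for pattern in formats:
        for match in re.finditer(pattern, text):
            candidates.append((match.start(), match.group()))
    return [value for _, value in sorted(candidates)]


sample = "Invoice date: 15 Mar 2024. Due 2024-04-14. Order placed 03/01/2024."
print(find_candidates(sample, DATE_FORMATS))
# ['15 Mar 2024', '2024-04-14', '03/01/2024']
```

Here only the first candidate is the invoice date. The other two are false candidates, which is intended: the set gives a later learning step both correct and false examples to work with.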