The string you are looking for may consist of a single word or of several words. Whether the string is found in the document also depends on the rules for how it is constructed from words. Default rules are set for this and should work in most cases. If your format string looks fine but you still cannot find candidates, you may have to adjust these rules. For more information, see About the Rules for String Construction from Words.
The engine provides the following methods to find strings.
String compare
String Compare is a very simple search method that finds each literal occurrence of the specified format string.
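As a rough illustration of what a literal search does, the following Python sketch collects the start position of every occurrence of a format string in a text. The function name `find_literal` is hypothetical, and the sketch assumes a case-sensitive comparison; the engine's actual behavior may differ.

```python
def find_literal(text: str, format_string: str) -> list[int]:
    """Return the start index of every literal occurrence of format_string."""
    positions = []
    start = text.find(format_string)
    while start != -1:
        positions.append(start)
        # Continue one character further so overlapping occurrences are found too.
        start = text.find(format_string, start + 1)
    return positions
```

For example, `find_literal("invoice no. 1, invoice no. 2", "invoice")` reports two occurrences, while a misspelled "invoike" in the text would not be found at all.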
Levenshtein
Levenshtein search is an error-tolerant search method that finds not only each literal occurrence of the specified format string, but also strings that can be derived from it by inserting, interchanging or deleting single characters. The number of key operations required to derive the erroneous string determines whether there is still a match. Use this method to account for typical typographical errors such as character interchange.
Example
0 errors | 1 error | 2 errors |
---|---|---|
invoice | invoike | invoke |
 | involce | involve |
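The error counts in the example can be reproduced with the textbook Levenshtein distance. The Python sketch below is only an illustration, not the engine's implementation: the names `levenshtein` and `is_match` and the `max_errors` parameter are hypothetical, and it assumes that "interchanging" a character counts as one substitution.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb, or keep it
            ))
        previous = current
    return previous[-1]


def is_match(candidate: str, format_string: str, max_errors: int) -> bool:
    """Hypothetical match rule: accept up to max_errors edit operations."""
    return levenshtein(candidate, format_string) <= max_errors
```

With a tolerance of one error, "invoike" and "involce" would still match "invoice", whereas "invoke" and "involve" would need a tolerance of two.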
Trigram
Trigram search is another error-tolerant search method. To compare two words, they are fragmented into groups of three characters called trigrams. The number of identical groups determines whether there is still a match. Use this method to account for OCR errors in your document.
Example
Word | 1st trigram | 2nd trigram | 3rd trigram |
---|---|---|---|
---|---|---|---|
brain | bra | rai | ain |
train | tra | rai | ain |
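The fragmentation shown in the example can be sketched in a few lines of Python. The helper names `trigrams` and `shared_trigrams` are hypothetical, and the sketch assumes that words are split without any padding at the start or end; how the engine turns the number of shared trigrams into a match decision is not described here and is left out.

```python
from collections import Counter


def trigrams(word: str) -> list[str]:
    """Split a word into its overlapping groups of three characters."""
    return [word[i:i + 3] for i in range(len(word) - 2)]


def shared_trigrams(a: str, b: str) -> int:
    """Count the trigrams that two words have in common (as a multiset)."""
    common = Counter(trigrams(a)) & Counter(trigrams(b))
    return sum(common.values())
```

For "brain" and "train", two of the three trigrams ("rai" and "ain") are identical, so a single misread first character still leaves most of the word recognizable.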
Simple Expression
This is the default method that you can use to specify simple format patterns. In simple expressions, some characters have a special meaning. All other characters have no special meaning and represent themselves.
Character | Description |
---|---|
# | Represents one digit. Example: ### matches 123. |
@ | Represents one upper-case or lower-case letter. Examples: @@@@@@@ matches Capture and @# matches U2. |
? | Represents one alphanumeric character. Example: ?rain matches brain and train. |
`´ |
|
[integer[,integer]] | Indicates a number of, or a range of, repetitions of the previous character. Examples: |
[string] | Represents a single character, which can be any one character within the string. Literal instances of the above special characters (#, @, and ?) can be located by including them within the string. Note: The first character of the string must be non-numeric. Examples: organi[sz]e matches both organize and organise. [#][abcdef0123456789][6] matches HTML color codes such as #4A7023 and #FF5733. |
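To make the rules in the table more concrete, the following Python sketch translates a simple expression into an ordinary regular expression. It is an illustration only, not the engine's implementation: the names `simple_to_regex` and `matches` and the `ignore_case` parameter are hypothetical, "letter" and "alphanumeric" are assumed to mean ASCII characters, and only the characters #, @, ?, [string] and [integer[,integer]] are handled.

```python
import re


def simple_to_regex(expr: str) -> str:
    """Translate a simple expression into an equivalent Python regex (sketch)."""
    out = []
    i = 0
    while i < len(expr):
        ch = expr[i]
        if ch == "#":
            out.append("[0-9]")        # one digit
        elif ch == "@":
            out.append("[A-Za-z]")     # one upper-case or lower-case letter
        elif ch == "?":
            out.append("[A-Za-z0-9]")  # one alphanumeric character
        elif ch == "[":
            end = expr.index("]", i)   # raises ValueError if the bracket is not closed
            body = expr[i + 1:end]
            if re.fullmatch(r"\d+(,\d+)?", body):
                # [n] or [n,m]: repetitions of the previous character
                out.append("{" + body + "}")
            else:
                # [string]: any one character from the string, taken literally
                out.append("[" + "".join(re.escape(c) for c in body) + "]")
            i = end
        else:
            out.append(re.escape(ch))  # all other characters represent themselves
        i += 1
    return "".join(out)


def matches(expr: str, text: str, ignore_case: bool = False) -> bool:
    """Check whether the whole text matches the simple expression."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.fullmatch(simple_to_regex(expr), text, flags) is not None
```

In this sketch, `matches("organi[sz]e", "organise")` succeeds. The color-code example lists only lower-case letters in its character set, so the sketch needs `ignore_case=True` to accept #4A7023; whether the engine itself compares case-insensitively is an assumption, not something stated above.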
Regular expressions
Use regular expressions to precisely specify complex format patterns. Designer supports a subset of regular expressions. For more information, see What are Regular Expressions?.