Add a search index - Remote Matching Service 23.1 - Brainware - 2024-02-05

Remote Matching Service User Guide

Platform: Brainware
Product: Remote Matching Service
Release: 23.1

To add a new search index, complete the following steps.

  1. On the navigation pane, expand Search Indexes.
  2. Right-click Search Indexes, and then click Add.
  3. Do any of the following.
    • Select CSV File if your data is saved in a text file that contains a list of data values separated by a particular character.

    • Select Database if your data is located in a database.

    For information on specific fields, refer to the table below. After you have provided all mandatory information, RMS uses the current configuration to read the first rows of the file (for CSV File) or to select the first rows from the database (for Database), and then displays a confirmation message.

    Note: If, after you specify the CSV file in the Source File text field, an error message reads “The record source definition is invalid.”, verify the path of the CSV file.
    Field: Description

    Index Name: Type a unique name for the index.
      Note: This field is applicable for both CSV File and Database options.

    Include
      Note: The Source File, first row contains column names, Separator, and Quote Character fields are applicable for the CSV File option only. The Data Source, User Name, Password, and Query fields are applicable for the Database option only.

    Source File: Type the path of your CSV file location.
      Note: When working with multiple server instances, make sure to use a path that is accessible by all servers.
    first row contains column names: Select this check box if the first row of the CSV file contains column names.
    Separator: Select the character that separates fields in the CSV file. This is typically a semicolon or a comma.
    Quote Character: Select the character that is used to quote the content of a field. If your CSV file does not have the content of fields in quotes, select None.
    Data Source: Select the data source that contains the data you want to add to the index. The list contains all data sources that have been configured in Data Sources.
    User Name: Type the user name that is used to access the data source.
    Password: Type the password that is used to access the data source.
    Query: Type the SQL query that is used to select the records you want to add to the index. For example: select * from vendor

    Exclude
      Note: The fields in the Exclude area of the Add Search Index dialog box are applicable to both the CSV File and the Database options.

    Source File: Type the path to your CSV file of records to exclude.
      Note: When working with multiple server instances, make sure to use a path that is accessible by all servers.
    first row contains column names: Select this check box if the first row of the CSV file contains column names.
    Separator: Select the character that separates fields in the CSV file. This is typically a semicolon or a comma.
    Quote Character: Select the character that is used to quote the content of a field. If your CSV file does not have the content of fields in quotes, select None.
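
Before choosing the Separator and Quote Character values, it can help to preview how a file parses under those settings. The following is a minimal Python sketch of such a preview. It is not how RMS reads the file; the function name `preview_rows`, the generated column names, and the sample data are assumptions made for illustration.

```python
import csv
import io


def preview_rows(text, separator=";", quote_char='"', has_header=True, limit=5):
    """Parse the first rows of CSV text with the given settings.

    Returns (column_names, rows). When has_header is False, columns are
    named Column1, Column2, ... (a naming choice made for this sketch).
    """
    if quote_char is None:
        # Corresponds to selecting None for Quote Character.
        reader = csv.reader(io.StringIO(text), delimiter=separator,
                            quoting=csv.QUOTE_NONE)
    else:
        reader = csv.reader(io.StringIO(text), delimiter=separator,
                            quotechar=quote_char)
    rows = [row for row in reader if row]  # skip empty lines
    if not rows:
        return [], []
    if has_header:
        names, data = rows[0], rows[1:]
    else:
        names = [f"Column{i + 1}" for i in range(len(rows[0]))]
        data = rows
    return names, data[:limit]


# Hypothetical sample: semicolon-separated, with one quoted field.
sample = 'id;name;city\n1;"Smith; Sons Ltd";Boston\n2;Acme;Denver\n'
names, rows = preview_rows(sample)
print(names)
print(rows[0])
```

If the preview returns a single column whose name contains the separator character (for example `id;name;city`), the wrong Separator is probably selected.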
  4. Select the Columns tab, and verify or select the following information.
    This tab is only enabled if you have entered a valid record source configuration.
    • Verify that the displayed column names reflect the available data in your record source by checking that all of the columns are referenced in the SELECT statement or, for CSV files, that the separator and quote characters reflect the contents of the file.
    • Select the Search check box for all columns whose values should be part of the index. This should include any information you are expecting to find in the analyzed text.
    • Select the Filter check box for all columns that can be used as a filter to preselect records during a search.
    • Select the ID button for the column that contains the unique ID of the record. This is mandatory when working with Brainware Intelligent Capture. Custom client applications may not need this.
    • Select the Vendor Type button for the column that contains the vendor type information. This is only required when working with Brainware Intelligent Capture. Refer to the Brainware Intelligent Capture documentation for more information about vendor types.
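
Because the ID column is expected to hold a unique value for each record, you may want to check the source data for duplicates before selecting it. The following Python sketch shows one way to do that on records that are already loaded. The function name `duplicate_ids`, the column name `vendor_id`, and the sample records are hypothetical.

```python
from collections import Counter


def duplicate_ids(rows, id_column):
    """Return the ID values that occur more than once, in first-seen order."""
    counts = Counter(row[id_column] for row in rows)
    return [value for value, n in counts.items() if n > 1]


# Hypothetical records; in practice these would come from the CSV file
# or from the database query configured for the index.
records = [
    {"vendor_id": "1001", "name": "Acme"},
    {"vendor_id": "1002", "name": "Smith & Sons"},
    {"vendor_id": "1001", "name": "Acme Corp"},
]
print(duplicate_ids(records, "vendor_id"))
```

An empty list means that every value in the chosen column is unique.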
  5. Select the Search Index tab, and enter the following information.
    • If the ASSA engine is available at your installation, in the Engine list, select the search engine you want to use. Record Search is the best choice in most cases. Choose ASSA when migrating from an earlier version of Brainware Intelligent Capture or when working with Chinese, Japanese or Korean (CJK) characters.
    • When using the ASSA engine, in the Engine Instances field, select the number of instances that will be opened for searching. While a higher number of instances allows more concurrent searches, this value should always be less than or equal to the number of CPU cores available on your servers.
    • When using Record Search, check Merge Digit Blocks if spaces in consecutive blocks of numbers should be ignored. For example, 123 456 789 would be transformed to 123456789 in the index.
    • In the Rebuild Time field, select a date and a start time for the rebuild of the index. Use the date format year-month-day, such as 2019-11-01. To schedule automatic rebuilds to take place periodically, click Repeat and type a time interval. For example, to start a rebuild every night at 10 p.m., enter a start date and 22:00 as the time and select 1 Days as the period. The rebuild time is configured in your browser's current time zone, not the server's time zone. Internally, the rebuild time is stored in UTC (Coordinated Universal Time).
      Note: When a date and start time are selected for a scheduled automatic rebuild, the first rebuild runs only after the interval period has elapsed, not at the specified date and time.

      Example: If the Rebuild Time is configured as 2023-08-10 at 17:00 with a 1 Days interval, the first automatic rebuild starts only after the one-day interval has elapsed, that is, on 2023-08-11 at 17:00.

    • If you want the index to be rebuilt at the scheduled time only if its record source has changed, click Rebuild Only if Changed.

      For CSV record sources, the rebuild will only occur if the original file used to create the search index has been modified since the last rebuild.

      For database record sources, either select Stored Procedure from the list and enter the name of a stored procedure to run, or select File Modified from the list and enter the location of a file whose modification time should be checked. The stored procedure needs to check if something has been modified since a given time. It must accept two parameters. The first is an input parameter with the time to check, that is, the time the index was last built. The second is an output parameter that should be set to 0 if nothing has changed or to a non-zero value if something has changed. For example:

        create procedure VendorChanged(@ts datetime, @cnt int out) as
        begin
          -- count the rows updated since the index was last built
          select @cnt = count(*) from vendor_update where last_update > @ts
        end

    • To load the search index when the server starts up, select the Load check box. If you want the index to be unloaded automatically after a period of inactivity, check the Automatically unload when inactive for check box and select the period of time after which the index should be unloaded.
      Note:

      The Dashboard may not reflect the closed state immediately.

    • Select the Encrypt Index Files check box if the index files are to be stored in an encrypted format.

      Note:

      The index will not reflect this setting until it has been rebuilt.
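
The Rebuild Time behavior described in this step (the time is entered in the browser's time zone, stored in UTC, and the first automatic run happens one interval later) can be worked through with a short Python sketch. This is only an illustration of the arithmetic. The function name `first_automatic_rebuild` and the UTC-04:00 browser time zone are assumptions, not part of RMS.

```python
from datetime import datetime, timedelta, timezone


def first_automatic_rebuild(configured_local, interval):
    """Return (stored_utc, first_run_utc) for a repeating rebuild.

    configured_local must be timezone-aware (the browser's time zone).
    The first automatic run is assumed to happen one interval after the
    configured date and time, as the Rebuild Time note describes.
    """
    stored_utc = configured_local.astimezone(timezone.utc)
    return stored_utc, stored_utc + interval


# Assumed browser time zone for this example: UTC-04:00.
browser_tz = timezone(timedelta(hours=-4))
configured = datetime(2023, 8, 10, 17, 0, tzinfo=browser_tz)
stored, first_run = first_automatic_rebuild(configured, timedelta(days=1))
print(stored.isoformat())                            # value as stored in UTC
print(first_run.astimezone(browser_tz).isoformat())  # first run, browser time
```

Under these assumptions, a rebuild configured for 17:00 on 2023-08-10 is stored as 21:00 UTC on the same day and first runs at 17:00 browser time on 2023-08-11.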

  6. On the Scoring tab, enter the following information.
    Note: This tab is only visible if scoring is available at your installation.

    If ASSA scoring is available:

    1. In the Scoring Type list, select the scoring type that you want to use.

      For cases where you want fine control over the scoring mechanism, select Advanced Record.

    To configure ASSA-based scoring, complete the following steps.

    1. In the Word Order Weight field, type an integer number between 0 and 100 that reflects the influence that the order of search words in the record has on the resulting score. The higher the word order weight, the more important the word order is.
    2. Select the Merge Digit Blocks check box if spaces in consecutive blocks of numbers should be ignored. For example, 123 456 789 would be transformed to 123456789 in the index.
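
To make the effect of Merge Digit Blocks concrete, the following Python sketch removes spaces that sit between two digits. It only approximates the documented example. How RMS treats tabs, other separators, or mixed text is not stated here, so the regular expression and the function name `merge_digit_blocks` are assumptions.

```python
import re

# One or more spaces that sit directly between two digits.
_DIGIT_GAP = re.compile(r"(?<=\d) +(?=\d)")


def merge_digit_blocks(text):
    """Remove spaces between consecutive blocks of digits."""
    return _DIGIT_GAP.sub("", text)


print(merge_digit_blocks("Invoice 123 456 789 due"))
```

Spaces next to letters are left alone in this sketch, so a value such as "Suite 12 B" is unchanged.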

    To configure Advanced Record scoring, complete the following steps.

    Note:

    You need to use Advanced Record scoring if ASSA-based scoring is not available.

    1. In the Weighting Mode list, select the weighting mode that you want to use.

      In Relative mode, the weight of a column within the score is primarily determined by the length of the value: a long name has a higher influence than a short ZIP code. You can increase or decrease the weight of individual columns relative to other columns. For example, to give a customer number a higher weight, you can assign it a higher value. The default weight for each column is 1.

      In Absolute mode, column weights are defined as absolute percentages. You may, for example, give a customer number a weight of 30%. If that number is not found, the score cannot be higher than 70%, even if everything else matches.

    2. For Absolute weighting mode, either select the Distribute weight if value is missing check box to distribute the weight of a field with no value to the other fields, or clear it and, in the Missing Value Score field, type an integer between 0 and 100 that indicates the weight given to fields that have no value. This allows the score to be reduced for records that have several missing values. When the value is 0, a missing value adds nothing to the score and the value is considered as not being found at all. When it is 100, it adds the full weight of the column to the score and the value is considered as found.
    3. Select Merge Digit Blocks if spaces in consecutive blocks of numbers should be ignored. For example, 123 456 789 would be transformed to 123456789 in the index.

      The column scoring grid has a row for each column. To edit a value click on the relevant cell. For information on specific values, refer to the table below.

      Value: Description

      Weight: Enter a percentage value between 0 and 100 when using Absolute weighting. The total weight across all columns should add up to 100%. If the total weight is more than 100%, your search results will be normalized onto the range 0 – 100%. For Relative weighting mode, the value is relative to the values in the other columns.
      Min. Score: Enter a threshold value for column scores. If the score for a value is below that threshold, the value is considered as not found at all. For instance, on a numeric customer number, you could set a minimum score of 90% but on a customer name field, a value of 60% would be more tolerant of errors in the name. The aim is to avoid, for example, a numeric field with some numbers in common with a numeric search term from increasing the score when it is obviously a false match.
      Optional: This field is available for Relative weighting only. Select the check box if the field score should only go into the total record score if its score is beyond the Min. Score column’s threshold value, otherwise its value is ignored. For example, for a telephone number field, it’s significant if the value is present but does not reflect negatively if it’s not.
      Word Order: Select the check box if the order of the words is important and should be checked. For example, word order may not be significant for names – John Smith and Smith, John are equivalent – but street names might need to be ignored completely if the word order is wrong.
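
The interaction of Weight, Min. Score, and Missing Value Score in Absolute mode can be illustrated with a simplified model. The Python sketch below is an assumption-based approximation, not the scoring algorithm RMS uses. It treats each column score as a value from 0 to 100, counts a score below Min. Score as not found, and divides by the total weight so that the result stays within 0 to 100. It does not model the Distribute weight if value is missing, Optional, or Word Order settings.

```python
def absolute_score(columns, missing_value_score=0):
    """Combine per-column scores (0-100) using Absolute-style weights.

    columns: list of dicts with keys
      weight    - percentage weight of the column
      score     - match score 0-100, or None if the record has no value
      min_score - threshold below which the value counts as not found
    """
    total_weight = sum(c["weight"] for c in columns)
    if total_weight == 0:
        return 0.0
    weighted = 0.0
    for c in columns:
        score = c["score"]
        if score is None:
            # Missing value: use the configured Missing Value Score.
            score = missing_value_score
        elif score < c.get("min_score", 0):
            # Below the threshold: treat the value as not found at all.
            score = 0
        weighted += c["weight"] * score
    # Dividing by the total weight keeps the result within 0-100 even
    # if the weights add up to more than 100.
    return weighted / total_weight


columns = [
    {"weight": 30, "score": 0, "min_score": 90},    # customer number not found
    {"weight": 70, "score": 100, "min_score": 60},  # everything else matches
]
print(absolute_score(columns))
```

With a customer number weighted at 30% and not found, this model yields 70, which matches the cap described for Absolute mode.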

      To merge the values of other columns with a column in order to treat them as one value, complete the following steps.

    • To select the columns to merge, click the Merge cell and select the Available Columns whose text is to be merged with this one.

    • To change the order of the columns in the merged text, select one of the Merged Columns and use the up and down buttons to change the order of the text.

    • To save the configuration, click the save button.

      Note:

      If none of the Available Columns are selected, this column is not merged with any others. A searchable column can be merged with more than one column; however, you cannot merge a column with itself or with columns that are already merged.

  7. Click Save.
    When configuring a new index, you will be asked if you want to rebuild the index immediately. If you reject this, the index will not be available for searches until the first scheduled rebuild has succeeded.