Setting the Text Encoding for Processing

Unicode Considerations

Foundation 22.1

DIP processing can import files that use different text encodings into OnBase. To import files with Unicode encoding, select the appropriate option from the Text Encoding drop-down list in the Process Settings dialog box.



Disk Group

Select the Disk Group in which to save the documents imported in a batch. A Disk Group must be selected before the process format can be saved.

Language Conversion

Select the language associated with the ASCII code page that created the import file.


This setting is only used for legacy language conversions. The option <NO CONVERSION> should be selected when configuring process settings.

Index Extraction Format

Select the extraction format used to extract Keyword Values from the imported files. This setting is used in conjunction with the Extract Index Information setting in the Options tab.

The extracted index information can be imported into third-party programs or used as data for an AutoFill Keyword Set for related documents.


In order to extract index information, your system must use a properly configured Index Extraction Format. See the System Administration documentation for more information on configuring an Index Extraction Format.

To configure an index extraction format, see the Document Import Processor documentation.

Send to Scan Queue

Select the scan queue to which scanned batches are sent. This option applies only to image documents. Non-image documents in the batch can cause errors.


If you select a scan queue in the Send to Scan Queue drop-down list, neither the batch nor the verification report goes to the Awaiting Commit queue.

If you select a custom scan queue configured for custom capture processing in the Unity Client, you must define which Application Server and data source to connect to in one of the following ways:

  • Apply the -APPSRV_URL and -APPSRV_DSN command line switches to the OnBase Client performing the Directory Import process


    For information on how to apply these specific command line switches, see the Command Line Switches module reference guide.

  • Define the Application Server in the OnBase Configuration module


    For information on how to define the Application Server, see the System Administration module reference guide.

If the Application Server is not defined in one of these ways, batches cannot be sent to the selected custom scan queue.

Scan Queue Status

Select the status that batches display in the scan queue selected in the Send to Scan Queue option. Choose one of the following options from the drop-down list:

  • Awaiting Index

  • Index in Progress

  • Awaiting Commit


Selecting Awaiting Commit marks the batch sent to the scan queue as fully indexed. The batch is removed from the DIP queue and is handled exclusively in Imaging, like all other Document Imaging batches.

Text Encoding

Select a text encoding option to use during processing. Alternatively, leave this option set to <Default> to process using the database's default text encoding.


Although this option allows the process to read files whose encoding differs from that of the database, any Keyword Values stored by the process must be converted to the code page used by the database.
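
Because Keyword Values must fit the database code page, it can help to check an index file's values before import. The following is a minimal sketch in Python, not part of OnBase: the function name unrepresentable_chars is hypothetical, and the default code page cp1252 is only an assumption standing in for whatever code page your database actually uses.

```python
def unrepresentable_chars(value: str, code_page: str = "cp1252"):
    """Return the characters in value that cannot be encoded in code_page.

    code_page is a Python codec name; "cp1252" is an assumed example,
    not a statement about any particular OnBase database.
    """
    bad = []
    for ch in value:
        try:
            ch.encode(code_page)
        except UnicodeEncodeError:
            # This character has no mapping in the target code page.
            bad.append(ch)
    return bad


# A Western European value fits cp1252; a Greek value does not,
# but it does fit the Greek code page cp1253.
print(unrepresentable_chars("Zürich"))
print(unrepresentable_chars("Ελλάδα"))
print(unrepresentable_chars("Ελλάδα", "cp1253"))
```

An empty result means every character of the value has a mapping in the chosen code page. How a value that does not fit is handled during import is not covered by this sketch.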

The following text encodings are supported:

  • Arabic

  • Central European

  • Chinese, Simplified

  • Chinese, Traditional

  • Cyrillic

  • Greek

  • Hebrew

  • Japanese

  • Korean

  • Thai

  • Turkish

  • UTF-8

  • UTF-16 LE

  • UTF-16 BE

  • Western European


If the index file has a Unicode byte order mark (BOM) or an HTML character set, the encoding specified by the BOM or HTML character set is used regardless of the option selected in Text Encoding.
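
Since a BOM takes precedence over the Text Encoding selection, it can be useful to know whether an index file starts with one. The sketch below is an illustration in Python rather than OnBase's own detection logic: the function name detect_bom is hypothetical, and the labels simply reuse the names from the list of supported encodings. It checks only the UTF-8 and UTF-16 BOMs; UTF-32 BOMs and HTML character set declarations are not considered.

```python
import codecs

# Byte order marks for the Unicode options listed under Text Encoding.
# Pairing each BOM with a drop-down label is this sketch's own convention.
_BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),  # FF FE
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),  # FE FF
]


def detect_bom(data: bytes):
    """Return the encoding label implied by a leading BOM, or None if absent."""
    for bom, label in _BOMS:
        if data.startswith(bom):
            return label
    return None


# Usage: pass the first few bytes of the index file.
print(detect_bom("Invoice".encode("utf-8-sig")))  # utf-8-sig prepends a BOM
print(detect_bom("Invoice".encode("utf-8")))      # plain UTF-8 has no BOM
```

To check a file on disk, read its first bytes in binary mode, for example with open(path, "rb").read(4), and pass them to the function.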