OnBase supports ANSI, UTF-8, and UTF-16 text encodings for HTML files. When HTML files are imported into OnBase, the encoding used in each file is determined by one of the following factors:
- Byte Order Mark (BOM) specified in the HTML file
- Value assigned to the charset tag in the HTML file
- Default encoding for the HTML file as determined by the assigned file type:
  - If the file type is HTML, the default encoding is set to the local ANSI code page.
  - If the file type is HTML Unicode, the default encoding is set to UTF-8.
OnBase checks these factors in the order listed above. If the HTML file specifies neither a BOM nor a value for the charset tag, the default encoding for the assigned file type is used.
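The order of precedence described above can be illustrated with a minimal Python sketch. This is not OnBase's implementation. The function name `detect_html_encoding`, the `ansi_code_page` parameter (with `cp1252` standing in for the local ANSI code page), and the regular expression used to find the charset value are all assumptions made for illustration.

```python
import codecs
import re

# BOM signatures for the Unicode encodings named above.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Hypothetical pattern: matches charset=... with or without quotes.
_CHARSET_RE = re.compile(rb"charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE)


def detect_html_encoding(data, file_type, ansi_code_page="cp1252"):
    """Return an encoding name using the BOM, then charset, then file type."""
    # 1. A BOM at the start of the file takes precedence.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name

    # 2. Otherwise, look for a charset value near the top of the file.
    match = _CHARSET_RE.search(data[:4096])
    if match:
        return match.group(1).decode("ascii").lower()

    # 3. Otherwise, fall back to the default for the assigned file type.
    #    ansi_code_page is an assumed stand-in for the local ANSI code page.
    if file_type == "HTML Unicode":
        return "utf-8"
    return ansi_code_page
```

For example, a file with no BOM and no charset value resolves to `utf-8` when its file type is HTML Unicode, and to the assumed ANSI code page when its file type is HTML.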
When creating an HTML file with a Unicode encoding, it is a best practice to include a BOM. This helps ensure that OnBase correctly detects the encoding on import.
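As one way to follow this practice, the Python sketch below produces UTF-8 bytes that begin with a BOM. The function name `encode_html_with_bom` is hypothetical. The `utf-8-sig` codec is part of the Python standard library and prefixes its output with the UTF-8 BOM.

```python
import codecs


def encode_html_with_bom(html_text):
    """Encode HTML text as UTF-8 bytes that start with a BOM."""
    # "utf-8-sig" prepends the UTF-8 BOM (EF BB BF) to the encoded output.
    return html_text.encode("utf-8-sig")
```

The resulting bytes can be written to disk in binary mode, for example with `open(path, "wb")`, before the file is imported.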