Text Encoding for HTML Files - Unicode Considerations - English - Foundation 22.1 - OnBase - external

Unicode Considerations

OnBase supports ANSI, UTF-8, and UTF-16 text encodings for HTML files. When HTML files are imported into OnBase, the encoding used in each file is determined by one of the following factors:

  • Byte Order Mark (BOM) specified in the HTML file

  • Value assigned to the charset attribute of a meta tag in the HTML file

  • Default encoding for the HTML file as determined by the assigned file type:

    • If the file type is HTML, the default encoding is set to the local ANSI code page.

    • If the file type is HTML Unicode, the default encoding is set to UTF-8.

OnBase checks these factors in the order listed above. If the HTML file specifies neither a BOM nor a charset value, OnBase assigns the default encoding for the file type.
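
The order of precedence described above can be illustrated with a short Python sketch. This is not OnBase code and does not reflect how OnBase implements detection internally; the function name detect_html_encoding, the ansi_code_page parameter, and the cp1252 default are assumptions made for illustration only.

```python
import codecs
import re

# Byte Order Marks checked first (illustrative set: UTF-8 and UTF-16).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Matches charset=... in <meta charset="..."> or in a Content-Type meta tag.
_CHARSET_RE = re.compile(rb"""charset\s*=\s*["']?([A-Za-z0-9._:-]+)""", re.IGNORECASE)


def detect_html_encoding(data: bytes, file_type: str,
                         ansi_code_page: str = "cp1252") -> str:
    """Hypothetical helper mirroring the documented precedence.

    file_type is "HTML" or "HTML Unicode". ansi_code_page stands in for
    the local ANSI code page and is an assumption (cp1252 here).
    """
    # 1. A BOM at the start of the file wins.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name

    # 2. Otherwise, use a charset value declared near the top of the file.
    match = _CHARSET_RE.search(data[:4096])
    if match:
        return match.group(1).decode("ascii").lower()

    # 3. Otherwise, fall back to the default for the assigned file type.
    return "utf-8" if file_type == "HTML Unicode" else ansi_code_page
```

For a file with no BOM and no charset value, the result depends only on the assigned file type, which is why the same bytes can be read differently as HTML and as HTML Unicode.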

Tip:

When creating an HTML file with a Unicode encoding, it is a best practice to specify a BOM. This helps ensure that OnBase correctly detects the encoding on import.
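
As one way to follow this tip, the Python sketch below produces UTF-8 HTML content that begins with a BOM. The function names and the file path are hypothetical; the only library behavior relied on is Python's standard "utf-8-sig" codec, which prepends the UTF-8 BOM (bytes EF BB BF) when encoding.

```python
import codecs
from pathlib import Path


def encode_html_with_bom(html: str) -> bytes:
    """Encode HTML text as UTF-8 with a leading BOM (hypothetical helper)."""
    # The "utf-8-sig" codec writes the BOM followed by the UTF-8 bytes.
    return html.encode("utf-8-sig")


def write_html_with_bom(path: str, html: str) -> None:
    """Write the BOM-prefixed bytes to disk (hypothetical helper)."""
    Path(path).write_bytes(encode_html_with_bom(html))


# Example usage (the file name is illustrative):
# write_html_with_bom("invoice.html", "<html><body>Résumé</body></html>")
```

Many text editors offer an equivalent option when saving, often labeled "UTF-8 with BOM" or "UTF-8 with signature".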