Products affected: MapInfo Professional
Using Data Files in Any Language or Character Set
Pro users can work with characters from any language in data files, so that multi-language tables display properly in maps, browsers, the Info tool, and other locations. MapInfo Pro can open tables, files, or workspaces with Unicode characters in the file name or path name regardless of the locale of MapInfo Pro or which localized version of MapInfo Pro you are running. A system setting called Encode Workspaces and Tab Files enables this feature, which is off by default.
Note: Disable Encode Workspaces and Tab Files to share MapInfo tables with versions of MapInfo Pro older than version 15.2, to share data with applications that do not support the UTF-8 character set, or when the data comes from only one language. In this case, workspaces and tables are written with the current system character set (charset).
When enabled, this system setting writes workspaces using the UTF-8 charset. New TAB files, and TAB files that are rewritten by operations such as Save Copy As, Pack Table, updating the friendly name, or updating metadata, also use the UTF-8 encoding. The !charset value in the .tab file remains the same; it represents the charset of the data in the table, not the charset of the .tab file itself. MapInfo Pro writes a UTF-8 Byte Order Mark (BOM) at the beginning of the file so that other applications recognize the encoding.
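Because the file begins with a UTF-8 BOM (the bytes EF BB BF), another application can detect the encoding by inspecting the first three bytes. The following Python sketch illustrates such a check. It is not part of MapInfo Pro; the function names are hypothetical, and `cp1252` is only an assumed stand-in for the system charset.

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file at `path` begins with the UTF-8 BOM (EF BB BF)."""
    with open(path, "rb") as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


def read_text(path, fallback_charset="cp1252"):
    """Read a text file as UTF-8 when it has a BOM, else with a fallback charset.

    `fallback_charset` is an assumption standing in for the system charset.
    """
    # "utf-8-sig" decodes UTF-8 and strips the leading BOM from the result.
    encoding = "utf-8-sig" if has_utf8_bom(path) else fallback_charset
    with open(path, "r", encoding=encoding) as f:
        return f.read()
```

A reader built this way handles files written with the setting enabled and falls back to the assumed system charset for files written with it disabled.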
When Encode Workspaces and Tab Files is enabled (turned on) and you open an Excel or Access file for import into MapInfo native TAB format, the resulting tables (TAB files) are in UTF-8 format. When you open an Excel, ASCII, CSV, or Lotus 1-2-3 file and Create Copy in MapInfo Format is selected in the Open Table dialog box, the resulting table is in MapInfo Extended format with a default character set (charset) preference set to NativeX (MapInfo Extended). When reading from or writing to a .QRY file, the file opens using the UTF-8 character set.
To enable or disable the Encode Workspaces and Tab Files feature:
- On the PRO tab, click Options, and click System Settings in the System group, to open the System Settings Preferences dialog box.
- Select the Encode Workspaces and Tab Files check box to enable this feature or clear the check box to disable it.
- Click OK.
When saving data to the MapInfo Extended TAB format (NativeX format), MapInfo Pro interprets the width of character fields in tables with a UTF-16 character set (charset) as the number of characters, with two bytes (16 bits) per character. It interprets the width of character fields in tables with any character set other than UTF-16 (such as WindowsLatin1, Cyrillic, and UTF-8) as the number of bytes. In single-byte character sets such as WindowsLatin1, each character takes up one byte. In UTF-8, a character takes from one to four bytes, and because UTF-8 is used to store characters from any language, its text is more likely to require more than one byte per character. This means that Pro users need to allow for larger field widths in UTF-8 tables to avoid data truncation.
Using the UTF-16 character set is the best way to ensure that all data is preserved, but it results in larger file sizes. The UTF-8 character set can also encode all characters faithfully, but truncation can occur when a field is too narrow for the encoded bytes. When saving a copy of a table from a non-UTF-8 character set to UTF-8, increase the field width to avoid truncation.
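The width rules above can be illustrated with a short Python sketch that estimates how wide a character field must be before a table is copied to UTF-8. This is not a MapInfo Pro API: the helper names are hypothetical, the Python codec names `cp1252` and `cp1251` are assumed stand-ins for the WindowsLatin1 and Cyrillic charsets, and "characters" in UTF-16 are taken to mean 16-bit code units.

```python
def field_width_needed(value, charset):
    """Width needed to store `value` under the rule described above.

    UTF-16 widths count characters (assumed here to be 16-bit code units);
    widths for every other charset count encoded bytes.
    """
    if charset.lower() in ("utf-16", "utf16"):
        # Encode without a BOM, then count two-byte units.
        return len(value.encode("utf-16-le")) // 2
    return len(value.encode(charset))


def min_utf8_width(values):
    """Smallest byte width that stores every value in UTF-8 without truncation."""
    return max((len(v.encode("utf-8")) for v in values), default=0)


names = ["Zürich", "Москва", "東京"]
print(field_width_needed("Zürich", "cp1252"))  # 6
print(field_width_needed("Zürich", "utf-8"))   # 7
print(field_width_needed("Москва", "cp1251"))  # 6
print(field_width_needed("Москва", "utf-8"))   # 12
print(min_utf8_width(names))                   # 12
```

In this example, a field of width 6 that holds "Москва" in a Cyrillic table would need a width of 12 after conversion to UTF-8, while the same value in a UTF-16 table still fits in a width of 6.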
UPDATED: December 4, 2019