Breaking down barriers! Performance, file size and multi-lingual data in the 64 bit MapInfo Pro
Some significant barriers have come down with this release.
Have you ever had to break a MapInfo table into two because the size exceeded the 2 GB limit? Or have you had to work with data that included multiple character sets (Unicode data)? If so, good news is here!
The 64 bit versions of MapInfo Pro (v15.2 and later) support Unicode (UTF-8 and UTF-16 character sets) and offer the ability to create files larger than 2 GB in size. To go along with this, performance improvements have been made to make the use of large files more practical.
Creating large tables - the Extended TAB file format
The 64 bit versions of MapInfo Pro can now create a TAB file that is larger than 2 GB. We call this the Extended TAB file format. It is important to note that this format is not used by default. You can create Extended TAB files as needed and there is a preference you can set to have the software default to using this format.
Creating an extended format table is easy. You will find this option when saving a copy of a table or creating a new table. Choose MapInfo Extended (*.tab) in the Save as Type drop-down list.
When creating a new table, the Extended TAB file option is also available.
When using the Import capabilities, choose the MapInfo Extended TAB option when the dataset is larger than 2 GB.
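If you prepare data outside MapInfo Pro, a quick size check can tell you in advance whether the Extended TAB option is likely to be needed. The following is a minimal Python sketch, not part of MapInfo Pro. The function names are hypothetical, and it assumes the 2 GB limit means 2 x 1024^3 bytes. The size of a source file is only a rough proxy for the size of the resulting table.

```python
import os

# Assumption for illustration: "2 GB" is taken to mean 2 * 1024^3 bytes.
TWO_GB = 2 * 1024 ** 3


def needs_extended_tab(size_in_bytes: int, limit: int = TWO_GB) -> bool:
    """Return True if a dataset of this size exceeds the assumed 2 GB limit."""
    return size_in_bytes > limit


def source_needs_extended_tab(path: str) -> bool:
    """Check a source file on disk (hypothetical helper, rough proxy only)."""
    return needs_extended_tab(os.path.getsize(path))


print(needs_extended_tab(500 * 1024 ** 2))  # a 500 MB dataset
print(needs_extended_tab(3 * 1024 ** 3))    # a 3 GB dataset
```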
The 64 bit versions (v15.2 and later) also fully support Unicode. This allows you to correctly display data in multiple character sets at the same time. This can be mixed data in a single table or different data sets using different character sets, as in the screen shot below.
In the screenshot above, Russian (Cyrillic), Arabic and Japanese data are all being displayed in MapInfo Pro at the same time.
UTF-8 versus UTF-16
Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language. UTF-8 and UTF-16 are two of the established standards for encoding these numbers. They differ in how many bytes they use to encode each character. Both are variable-width encodings that can use up to four bytes per character, but the minimum differs: UTF-8 uses as little as 1 byte (8 bits), while UTF-16 uses at least 2 bytes (16 bits). This has a significant impact on the size of the encoded files. When a file contains only ASCII characters, the UTF-16 version is roughly twice as big as the same file encoded with UTF-8.
If most of the characters in the file are ASCII characters, it is advisable to use UTF-8 encoding. Otherwise it is better to use UTF-16 encoding.
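To see the size difference for yourself, here is a small Python sketch that counts the bytes each encoding needs. It is independent of MapInfo Pro, and the sample place names and the `encoded_sizes` helper are made up for illustration.

```python
def encoded_sizes(text: str) -> dict:
    """Return the number of bytes needed to store text in each encoding."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # "utf-16-le" leaves out the 2-byte byte order mark, so only the
        # character data itself is counted.
        "utf-16": len(text.encode("utf-16-le")),
    }


# Illustrative sample values only.
samples = {
    "ASCII": "London",
    "Cyrillic": "Москва",
    "Japanese": "東京都",
}

for label, text in samples.items():
    sizes = encoded_sizes(text)
    print(f"{label}: UTF-8 = {sizes['utf-8']} bytes, UTF-16 = {sizes['utf-16']} bytes")
```

For these samples, the ASCII name doubles in size under UTF-16 (6 bytes versus 12), the Cyrillic name is the same size in both (12 bytes), and the Japanese name is smaller in UTF-16 (6 bytes versus 9 in UTF-8).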
MapInfo Pro allows you to save an existing table into a new table with UTF-8 or UTF-16 encoding. You can encounter data corruption, due to truncation or conversion, when saving a copy of a table between Unicode and non-Unicode character sets. When saving non-UTF-8 (non-Unicode) to UTF-8 (Unicode), there is the potential for data truncation.
There is information on this in the MapInfo Help file, and we have placed a reminder in the Save Copy As... dialog box.
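One way truncation can arise is that a non-ASCII character which took a single byte in a legacy code page takes two or more bytes in UTF-8. The Python sketch below illustrates the idea. It assumes a character field whose width is counted in bytes and a Windows Latin 1 (cp1252) source table. Both are assumptions for illustration, not a description of how MapInfo Pro stores data, and the helper names are hypothetical.

```python
def utf8_growth(value: str, source_encoding: str = "cp1252") -> tuple:
    """Return (bytes in the source code page, bytes in UTF-8) for value."""
    return len(value.encode(source_encoding)), len(value.encode("utf-8"))


def fits_in_field(value: str, width_in_bytes: int) -> bool:
    """True if value, encoded as UTF-8, fits a field of the given byte width."""
    return len(value.encode("utf-8")) <= width_in_bytes


name = "Düsseldorf"  # 10 characters, one of them non-ASCII
before, after = utf8_growth(name)
print(before, after)            # 10 bytes in cp1252, 11 bytes in UTF-8
print(fits_in_field(name, 10))  # False: the value no longer fits in 10 bytes
```

Under these assumptions, a value that exactly filled a 10-byte field before conversion needs 11 bytes afterwards, so widening character fields before saving to UTF-8 is one way to reduce the risk.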
Setting the default behaviour for Extended TAB and Unicode data
The ability to create Unicode format datasets and tables that exceed the 2 GB file size limit is not turned on by default.
If desired, you can change the defaults in the System Settings dialog box. You will find the system preferences (now called Options) in the "Backstage" area (the Pro tab on the ribbon).
- This means you have to expressly choose MapInfo Extended (*.tab) when you want to create a file that will exceed 2 GB in size. The reason is that only version 15.2 and later of the software support the new Extended TAB file format. You cannot share data in the Extended TAB format with earlier versions of MapInfo Pro.
- Likewise, you will need to expressly choose the UTF-8 or UTF-16 character sets. In general we recommend UTF-8, especially if most of the characters in your data are ASCII characters.
SQL Select - filtering data:
Queries that filter data sets have increased performance, particularly when the result set (the number of rows returned by the query) is large. The larger the result set, the more dramatic the improvement. Performance varies with a number of factors, but for result sets of 20,000 to 25,000 rows or more the improvement starts to become significant.
The computer used for these comparisons was a Dell Latitude E6420, with an Intel Core i7 (4 dual core 2.2 GHz processors) and 8 GB RAM, running Windows 7.
Note that the two bottom examples were querying a data set of 21.5 GB in size. The size of the result set is a major factor in the performance and the performance improvement. The larger the result set, the longer it used to take earlier versions of MapInfo Pro to complete the query.
Improved redraw performance for point data:
MapInfo Pro has improved performance when displaying a large number of point objects. The same computer as used in the tests above was used for these comparisons.
The dataset used is the WorldPlaces table from the Pitney Bowes WorldInfo data product. This table has 675,000 point objects.
Opening a workspace and (on-screen) Map querying performance:
The time needed to complete selections with the map selecting tools (rectangle select, radius select, boundary select, polygon select, invert selection) is also improved. As with the query filtering performance mentioned above, the improvement is more dramatic for larger result sets than for simple selections of only a few objects.
Here is an example that puts together the improved point rendering performance with the map querying performance. Improvements have been made at two points along the workflow, saving significant time.
Opening the workspace:
Map based querying comparison
Video tutorial: Want to see some of this in action?
A video tutorial which covers the Fast Point Rendering capabilities in MapInfo Pro is available from our YouTube channel.
A separate article covers the Smart Indexing performance improvement. When your data is indexed, certain object editing operations (including updating a column, deleting data and combining data) finish in less time than before. In some cases, the improvement is very significant.
Smart Indexing has been added to both v15.0 (the latest 32 bit release) and v15.2.
What about raster data?
This article is all about the improvements to working with vector data in MapInfo Pro. For those of you who work with raster grid data, be sure to check out the "Get on the Grid" series of articles on MapInfo Pro Advanced. This is our next generation raster grid analysis tool for MapInfo Pro. It too can work with very large datasets! In fact, it can very quickly display rasters of virtually any size. We've used it with multiple terabyte datasets!
In conclusion, we hope you like what you see here. We have removed some longstanding limitations in the software to allow you to work with larger data sets.
Are you using the latest version of MapInfo Pro?
If you haven't tried the latest version of MapInfo Pro then you don't know what you are missing!
Download a 30 day free trial here: http://www.pitneybowes.com/us/mipro-free-trial.html
Article by Tom Probert, Editor of "The MapInfo Pro" journal
When not writing articles for "The MapInfo Pro" journal, Tom enjoys talking to MapInfo Pro users at conferences and events. When not working he likes to see movies with car chases, explosions and kung-fu fighting.