Resolve font substitution in PDF export in EngageOne Vault.

Issue

By default, Vault renders text in AFP and Metacode documents as bitmaps when exporting to PDF. This method is used because it works with all fonts and preserves the style of the font. However, it has drawbacks: the resulting text appears fuzzy, does not scale well when zoomed in, cannot be cut and pasted, and tends to produce larger PDF files.

Cause

This occurs if no font substitution has been set up. Use either AFPSUBSTITUTE or METASUBSTITUTE (both can be found in your vault_installation\server\tools folder) to create fonts.ini, which maps the AFP or Metacode font to an appropriate Windows font. For more information, see “Working with fonts” in the Vault Customizing Guide. For some fonts, such as bar code fonts, you may intentionally keep the bitmap conversion so that the appearance of the bar code is not altered and it can still be read successfully.

There are also instances where font substitution has been set up, but Vault still renders some or all of the text as bitmaps.

Resolution

UPDATED: July 13, 2017


Non-standard GCGIDs can be one of the reasons why font substitution is not taking place.

In AFP, a font consists of a code page, a font character set and sometimes a coded font. The code page determines how text in the print stream is encoded. It maps each code point to a character name called a Graphic Character Global ID (GCGID). The bitmaps or outlines in the font character set are linked by the GCGID. Vault uses a file called gcgiduni.map to map the GCGIDs to Unicode characters when exporting data to PDF.

Some output generators or font converters use non-standard character names. That means Vault will be unable to find the right code point to use with the default gcgiduni.map. When this happens, Vault will fall back to rendering the entire chunk of text involved as a bitmap.

If the character names are consistent, entries can be added to gcgiduni.map in the associated resource set. If you know that the text is always encoded as ASCII or EBCDIC, you can use the profile setting GuessUnknownCharacters to tell Vault to interpret the code points as ASCII (=2) or EBCDIC (=3).
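The following Python sketch illustrates why the choice between ASCII and EBCDIC matters: the same code points produce different text under each encoding. It is an illustration only, not Vault's logic. The cp037 codec is an assumed stand-in for EBCDIC, because the code page Vault assumes is not stated here.

```python
# Illustration only: the same code points read as EBCDIC versus ASCII.
# cp037 is an assumed EBCDIC code page chosen for this example.
raw = bytes([0xC8, 0x85, 0x93, 0x93, 0x96])

as_ebcdic = raw.decode("cp037")
# Bytes above 0x7F are not valid ASCII, so they become replacement characters.
as_ascii = raw.decode("ascii", errors="replace")

print(repr(as_ebcdic))
print(repr(as_ascii))
```

If the text comes out unreadable under one setting, the data is likely encoded the other way.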

Out of Range Translations

For some modes, such as basic font substitution, the code points in the output text are limited to the range 0-255. If you have a character that is mapped in the gcgiduni.map file to a code point outside of this range, Vault will also fall back to rendering the text as bitmap.

This commonly happens with characters such as the en-dash and “fancy” quotes. In these cases it may be possible to alter the gcgiduni.map file to map the character to a different character that is in the lower range. For example, the user could map fancy quotes to the basic quote:

gcgiduni.map:

SP190000 0027

SP200000 0027

SS680000 002D

In the example above, SP190000 is a left single quote that is mapped to code point U+0027, a standard single quote.
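To find candidate entries for this kind of remapping, a small script can list the mappings whose code point falls outside 0-255. The Python sketch below assumes each line holds a GCGID followed by a hexadecimal code point, as in the sample lines above; the real file may contain other content that this sketch does not handle. The function name and the SP190000 2018 entry are hypothetical.

```python
def out_of_range_entries(map_text, limit=0xFF):
    """Return (gcgid, code_point) pairs whose code point exceeds limit."""
    flagged = []
    for line in map_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        gcgid, hex_value = fields[0], fields[1]
        try:
            code_point = int(hex_value, 16)
        except ValueError:
            continue  # skip lines whose second field is not hexadecimal
        if code_point > limit:
            flagged.append((gcgid, code_point))
    return flagged


# Hypothetical sample: the first entry maps to U+2018, which is above 255.
sample = """SP190000 2018
SP200000 0027
SS680000 002D
"""

for gcgid, code_point in out_of_range_entries(sample):
    print(f"{gcgid} -> U+{code_point:04X} is outside the range 0-255")
```

Entries reported this way are the ones to consider remapping to a character in the lower range.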

Incorrect Spacing, Overlapping Text

When Vault is configured to use basic font substitution for multiple fonts, some of the text may use the wrong spacing. This happens when multiple substitutions use the same base font name, for example Helvetica-Bold. The way Vault generates the structures in the output PDF causes most readers to use the font metrics for one of the fonts for all of the substitutions. If there are many fonts with different metrics mapped into the same substitution, inappropriate character widths may be applied in some cases.

One way to work around this issue is to use unique substitution names. These can be generated automatically by using the -u switch in AFPSUBSTITUTE. This makes PDF readers use the width tables explicitly generated for each font, but it means there is less control over the font style, so the resulting text can lose bold and italic characteristics.

Another reason for poor spacing is that the user has configured an embedded replacement font that has different metrics than the original font. For example, if the character “O” is narrow in the original font but wide in the embedded font, the spacing will be off. Sometimes this is quite visible. If possible, try to use a font that closely matches the original. Alternatively, it may be possible to adjust the size of the font using the PDF width field in the fonts.ini settings. The size of this font setting will affect the width of chunks of text in the output. If they overlap, making the font smaller may tease apart the text chunks and remove the overlap.

Incorrect Style

The text output to PDF may end up with the wrong style (bold/italics) when using afpsubstitute -u. As mentioned above, this option generates unique base font names that cause the readers to use separate width tables for the fonts. However, it also generates base font names that are not well known, which causes the reader to select a font based on metrics. Vault has limited support for configuring the metrics that would trigger the reader to use a particular style, bold for example.

It may be possible to use an explicit substitute name such as Helvetica-Bold for a certain subset of fonts that have compatible widths and then use a unique name for fonts with distinct widths so that the two do not clash.

Wrong Characters Displayed

In some cases the characters Vault displays do not match the original print stream. This can happen for a couple of reasons.

Character Appears as a Box

If font embedding has been set up and the code point that a character translates to is not present in the font, the reader may display the character as a box in some cases.

If the code point simply does not exist in the target font, it may be necessary to use a different font or substitution mode or change the character mapping in gcgiduni.map to map the character to one that is defined.

There have been instances where some Type 1 fonts omit the space character. If this is the case, the spaces may all appear as boxes. With recent versions of Vault, AFPForceBreakOnSpace=1 is set by default. This positions text word by word so that no space characters are actually written to the PDF. If this setting has been turned off (set to 0 instead of 1), this issue may occur.

Another case where this happens is when CapExplicitEmbeddedTTF is set to reduce the size of explicitly embedded TrueType fonts. Code points over this setting do not work and typically cause the characters to appear as boxes.

Mac OS X

In some cases text will not be displayed when viewing exported PDF documents with Mac OS X Safari or Preview Viewer. In particular, this issue was noticed when using embedded Type 1 fonts. In these cases, alternate methods such as basic substitution or TrueType font embedding may help work around the problem.

Cut and Paste

Cannot Select Text

If the text being exported to PDF is rendered as bitmaps, the reader will not be able to support cut and paste, searching or screen reading. The best way to address this is to configure font substitution. See the Display Quality section for troubleshooting when font substitution is configured but is still rendering as bitmaps.

Pasted Text is Missing Spaces

In some cases, print stream generators will position text character by character to achieve effects such as full justification. When Vault converts this text to PDF, it will convert in the same one character chunks as the source stream. This presents a challenge for the PDF reader in that it needs to guess where the word boundaries are based on the font characteristics. This tends to be an error prone process in practice, resulting in missing or extra spaces appearing in pasted text. In some cases this can also affect searches.

Unable to Search Text

In some cases, automatically embedded Type 1 fonts contain non-standard character names. If this is the case, some readers will be unable to determine how to interpret the text code points for searching, cut and paste or screen reading.

By default Vault leaves the interpretation of the characters in Type 1 fonts up to the reader. However, you can specify the AutoType1ToUnicode profile setting to have Vault generate a ToUnicode map for the font which helps the reader interpret the text. When AutoType1ToUnicode=1 is set, it generates the map using AFP code page data and the translations in the gcgiduni.map file.

Performance

High Rendering Engine CPU utilization

Converting text to bitmaps in PDF is inefficient. If most or all of the text in a document is converted to bitmaps, the rendering engine will consume additional CPU resources. Under heavy load this could lead to situations where the rendering engine runs at very high CPU utilization, leading to slow response times, timeouts, application exceptions and thread pool exhaustion.

The best way to reduce this effect is to configure font substitution. The substitution and embedding options generally take less CPU time than bitmap conversion.

Slow Response Time Rendering PDF Documents

In some cases PDF exports take a long time to execute because the size of the resulting PDF exceeds the server component’s threshold for spilling temporary output data from memory to disk. When the generation process runs in memory it executes faster than if it has to use disk as a backing store. This means that the size of the output PDF can affect the rendering performance.

If you have not already done so, configure font substitution. The default bitmap conversion will produce much larger PDF files than substitution or embedding in most cases. This means that the resulting PDF files are less likely to hit the threshold.

You can also consider setting AutoFileThreshold to raise the point at which the rendering engine spills temporary data to disk above the normal range for this data and environment. By default, data is spilled to disk when it reaches 512KB. If you know that PDF files in a specific case are usually 1-2MB in size, you could set the following parameters in e2renderd.ini:

[server1]

AutoFileThreshold=2097152

Altering this setting should be done with care as increasing it does take up additional memory per concurrent rendering operation. Vault processes are 32-bit so they are limited to 2-4 GB of address space depending on the platform.
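The Python sketch below works through the arithmetic behind the value 2097152 and gives a rough upper bound on the extra memory involved. It assumes, for illustration only, that each concurrent rendering operation can buffer up to the full threshold in memory. The figure of 50 concurrent operations and the function names are hypothetical.

```python
MIB = 1024 * 1024  # bytes in one megabyte (binary)


def threshold_bytes(megabytes):
    """Convert a size in megabytes to the byte value used by AutoFileThreshold."""
    return megabytes * MIB


def worst_case_buffer_bytes(threshold, concurrent_renders):
    """Upper bound on buffered output if every render fills its buffer."""
    return threshold * concurrent_renders


threshold = threshold_bytes(2)
print(f"AutoFileThreshold={threshold}")

# Hypothetical load of 50 concurrent rendering operations.
worst_case = worst_case_buffer_bytes(threshold, 50)
print(f"Worst-case buffered output: {worst_case // MIB} MB")
```

Comparing the worst-case figure with the address space limit mentioned above gives a quick sanity check before raising the threshold.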

Exported PDF Files Are Large

In some cases PDF files exported from Vault are larger than is convenient for end users. This can happen when converting text to bitmaps. In that case, consider setting up font substitution using fonts.ini.

In some cases explicitly embedded TrueType fonts require very large internal tables. One way to reduce the size is to use the profile option CapExplicitEmbeddedTTF=255, which limits the code points that appear in these tables to 255 and below. However, this means that characters above the limit are considered invalid and may appear as boxes when rendered.
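Before applying a cap, it can help to check whether typical document text contains characters above it. The Python sketch below is a rough check that assumes the cap is compared against Unicode code points; how Vault applies the limit internally is not described here. The function name and the sample text are hypothetical.

```python
def chars_above_cap(text, cap=255):
    """Return the distinct characters in text whose code point exceeds cap."""
    return sorted({ch for ch in text if ord(ch) > cap})


# Hypothetical sample containing an en dash, curly quotes and an accented letter.
sample = "Price \u2013 \u201cspecial\u201d caf\u00e9"

above = chars_above_cap(sample)
for ch in above:
    print(f"U+{ord(ch):04X} would be above a cap of 255")
```

In this sample the en dash and the curly quotes are reported, while the accented letter (U+00E9) is within the cap.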

Managing Resource Sets

Resource Packs and Font Substitution

In environments using DOC1 generated resource packs, resource set directories can be created and updated frequently. This leads to complications managing font substitution settings inside fonts.ini.

One approach to creating a fonts.ini for each new resource set is to set up a fonts.ini file in the template resource set directory. When resource packs are extracted to create new resource set directories, the extraction process first copies files from another template resource set. This template directory can contain backgrounds, fonts and in this case, a fonts.ini file. If the jobs are using consistent font names the user can create a fonts.ini in the template directory that will work with newly created resource sets.

This method is not without issues. If the generating application does not use consistent resource names across jobs, the substitutions will not work correctly when copied to newly created resource sets. Usually DOC1 is consistent about font names but if there are multiple DOC1 repositories feeding jobs into Vault, they can produce colliding resource names. This could result in cross wired substitution settings.

Another approach is to use the AutoCreateFontsIni=1 profile setting. When resource pack expansion occurs, this setting causes Vault to generate font substitution settings similar to ones created by using afpsubstitute. It reads any existing fonts.ini files and preserves existing settings and then adds settings for new font resources.

Keep in mind that afpsubstitute generated settings are usually just a starting point for tuning font substitutions, so the automatically created settings are not always appropriate and may require adjustments.

Resource Name Collisions

If the fonts used to render a document do not match those the stream was intended to use, the resulting display will be incorrect or even unintelligible. This will typically affect both bitmap (GIF/PNG/TIFF) and PDF output. For example, this can happen when a font is missing and Vault falls back to using a default font.

A more serious issue occurs when the generating application reuses resource names between jobs. That is, F1 is Arial in one job and Times Roman in another. Vault requires that the resources in a resource set have unique names. If a name would be reused, the job must be assigned a new resource set with the alternate resource. When using resource packs this should happen automatically.  However, when using the ExtractResources setting it is possible to extract resources with conflicting names. If a font gets changed as a result, older documents may stop rendering correctly. One option to help detect this case is to use ExtractCollisions=2 to force the load process to check for collisions during extraction.
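A similar check can be made by hand before extracting into an existing resource set. The Python sketch below compares two directories and lists names that exist in both with different contents. It is a generic comparison under stated assumptions, not how Vault implements ExtractCollisions. The directory layout, file names and function names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_collisions(incoming_dir, resource_set_dir):
    """List names present in both directories whose contents differ."""
    collisions = []
    for incoming in sorted(Path(incoming_dir).iterdir()):
        if not incoming.is_file():
            continue
        existing = Path(resource_set_dir) / incoming.name
        if existing.is_file() and file_digest(incoming) != file_digest(existing):
            collisions.append(incoming.name)
    return collisions


# Hypothetical demonstration: F1 differs between the directories, F2 does not.
with tempfile.TemporaryDirectory() as tmp:
    incoming = Path(tmp, "incoming")
    existing = Path(tmp, "resource_set")
    incoming.mkdir()
    existing.mkdir()
    (incoming / "F1").write_bytes(b"Times Roman data")
    (existing / "F1").write_bytes(b"Arial data")
    (incoming / "F2").write_bytes(b"shared data")
    (existing / "F2").write_bytes(b"shared data")
    demo_collisions = find_collisions(incoming, existing)

print(demo_collisions)
```

Any name reported this way is a resource that would change for older documents if it were replaced.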

Error Messages

Unable to Compute Width table

ERROR 61050: unable to compute width table for font, resourceset [], font []

ERROR 71007: attempt to inject empty resource into resource cache, resource [], resource set []

These errors occur when the rendering code tries to compute a font’s width table, which is used in PDF export, but is unable to do so. One specific reason this error can occur is when you configure explicit font embedding in an older version that does not support it.

In particular, explicit font embedding was not available before Vault 6.1:

fonts.ini:

AR10BP ARIAL 32.0 0 1 *embed:ANY_FONT.ttf 32.0 262176

AR11BP ARIAL 18.0 0 2 *embed:z003034l.pfb 18.0 96

Settings

[profile] AutoType1ToUnicode Used to control how automatically embedded Type 1 fonts are interpreted by the PDF reader.

0 = the reader uses the font’s character name data to interpret text

1 = Vault generates a ToUnicode map for the font based on the AFP code page data and the gcgiduni.map file

3 = Vault generates a ToUnicode map based on the character names in the font. The names must be in the form /uniXXXX where XXXX is the Unicode code point. This is a fairly specific scenario that will not apply to most customers.

AFPForceBreakOnSpace When set to 1, this causes each word in the text to be positioned and exported explicitly rather than being written as a string containing spaces. This mode is enabled automatically when variable space increment is used (that is, where the font’s widths are being overridden). It is also helpful if the font does not have a defined space character. The default is 1.

CapExplicitEmbeddedTTF Used to reduce the set of supported characters in explicitly embedded TrueType fonts. This can reduce the size of the tables associated with the font and thus the PDF.

GuessUnknownCharacters Used to have Vault guess the encoding of characters whose GCGIDs do not appear in the gcgiduni.map file.

0 = off (default)

1 = guess ASCII or EBCDIC (avoid, tends to be inconsistent)

2 = assume ASCII

3 = assume EBCDIC

AutoCreateFontsIni When set to 1, font substitution settings are generated during resource pack extraction. These are added to fonts.ini for any fonts without existing settings.

ResourceSet When using XML journals, this sets the template resource set for resource pack extraction; otherwise this sets the final resource set name.

ExtractResources Used to extract inline resources to a specified target directory, usually a resource set under server/distrib.

ExtractCollisions Controls the behavior of ExtractResources when there are existing resources.

0 = skip

1 = replace (default)

2 = compare and fail if differences are detected

[server1] AutoFileThreshold Used to control the point at which temporary rendering output is spilled to disk, default is 524288 (512KB).

Tools

afpsubstitute reads AFP fonts in the current directory and produces a basic set of font substitution settings suitable for an initial fonts.ini

afpextract extracts inline resources from an AFP file to the current directory

afpdecode decodes AFP print streams and resources such as code pages and font character sets

metasubstitute reads Metacode fonts in the current directory and produces a basic set of font substitution settings suitable for an initial fonts.ini

metaextract extracts inline resources from a Metacode stream to the current directory

metaresource decodes Metacode resources such as fonts, images, logos and forms

metadecode decodes Metacode print streams
