
Understanding Vault Error 28447

UPDATED: October 18, 2017


What causes Vault error 28447?  I see this in the log:
 
07:39:46 building [work\lett_cwd_ltrprtf052420130420_05242013.drd] profile [Letters1] document build engine [xmljournal]
         0    10   20   30   40   50   60   70   80   90   100
         |    |    |    |    |    |    |    |    |    |    |
         XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
07:39:48 ERROR 28447: invalid byte '<' at position 2 of a 3-byte sequence [line 71199, column 9]
07:39:48 ERROR 10114: document build failed for file [work\lett_cwd_ltrprtf052420130420_05242013.drp]

Another example:
 
01:30:52 ERROR 28447: An exception occurred! Type:UTFDataFormatException, Message:invalid byte 1 (–) of a 1-byte sequence. [line 324352, column 35] 
01:30:52 ERROR 10114: document build failed for file [I:\Group1\e2\Vault\Server\work\m-10group3_cws_bill-automated-print-201320020190300.drp] 
 



These types of errors are almost always caused by one or more bad characters in the XML file.  The "invalid byte" part of the message is almost always accurate, but with XML the reported line and column numbers often are not.

If the bad character or characters in the journal can be identified and fixed, the file will usually ingest.

In the first example above, in
 
<Name>ARTHUR KRAUS        é</Name>
<Addr line="1">955 WINCHESTER RD</Addr>
<Addr line="2">L</Addr>
<Addr line="3"></Addr>
<Addr line="4"></Addr>
<City>YNDHURST</City>
<Region>H</Region>
<PostalCode>412437125521</PostalCode>
</CustData>
<NumberOfPages value="1"/>
<Skipped><SPages></SPages></Skipped>

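note the invalid character "é" right before the "<" of </Name>.  If that "é" was stored as a single ISO-8859-1 or Windows-1252 byte, it is 0xE9 (0b11101001), which UTF-8 reads as the first byte of a 3-byte sequence.  The parser therefore expects a continuation byte next and reports the "<" that follows as the invalid second byte.  Removing the "é" allowed the file to ingest.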
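
The mismatch between the bad character and the byte named in the error can be reproduced outside Vault. The following is a minimal Python sketch, assuming the "é" was stored as the single byte 0xE9 (the journal's actual bytes are not shown in this article). The sample bytes are reconstructed from the snippet above, and Python's UTF-8 decoder stands in for Vault's XML parser, so the wording of the message differs.

```python
# Assumed raw bytes: a lone 0xE9 ("é" in ISO-8859-1 / Windows-1252)
# immediately followed by the "<" that opens </Name>.
raw = b"<Name>ARTHUR KRAUS        \xe9</Name>"

try:
    raw.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# 0xE9 is 0b11101001. The 1110 prefix announces a 3-byte sequence, so the
# decoder expects a continuation byte (0b10xxxxxx) and finds "<" instead.
lead = raw[error.start]
following = raw[error.start + 1]
print(f"lead byte: {lead:#04x} = {lead:#010b}")
print(f"next byte: {chr(following)!r}")
print(f"reason:    {error.reason}")
```

The decoder blames the lead byte while Vault's message names the byte after it, which is why the character to remove is the one just before the position the error describes.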

In the second example above, the dash in address line 1 is the single byte 150 (0x96, or 0b10010110), which is the en dash in Windows-1252.
 
The XML document is declared to be encoded as UTF-8. 
 
No valid UTF-8 sequence starts with a byte of the form 0b10xxxxxx; such bytes are valid only as continuation bytes inside a multi-byte sequence.
(http://en.wikipedia.org/wiki/UTF-8) 
 
<document docID="14108" docMasterID="444DF5D75C6A3314F405656E305803D6" docInstanceID="C9B3738B36B40C4188C13D6452BDD74F"> 
<VendorId>cafdab460fb54b718fb482ce4a99ae83</VendorId> 
<DocTypeId>C9A201DD52094C3294FFD8AD03405D37</DocTypeId> 
<AccNo>8357366666</AccNo> 
<StmtDate>20130219</StmtDate> 
<DDSDocValue name="BillID" type="text" len="12">835736625967</DDSDocValue> 
<CustData> 
<Name>LLC. Tesoro West Coast Company</Name> 
<Addr line="1">c/o Ecova – MS 3249 </Addr> 
<Addr line="2">PO BOX 2440 </Addr> 
<Addr line="3"> </Addr> 
<Addr line="4"> </Addr> 
<City>Spokane</City> 
<Region>WA</Region> 
<PostalCode>!992102440405!</PostalCode> 
<Country>USA</Country> 
<Phone>00014108</Phone> 
</CustData> 
<NumberOfPages value="1"/> 
<Skipped><SPages></SPages></Skipped> 
</document>
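
Because the parser's line and column numbers cannot be trusted, it can help to scan the raw bytes of the journal directly. The following is a rough Python sketch of that idea, not a Vault tool: `find_invalid_utf8` and the error-handler name are hypothetical, and the sample bytes are reconstructed from the second example, assuming the dash was stored as 0x96.

```python
import codecs


def find_invalid_utf8(data: bytes):
    """Return (line, column, offset, bad_bytes) for each undecodable span.

    Line and column are 1-based and counted in bytes, not characters.
    """
    spans = []

    def record(exc):
        # The decoder calls this for every invalid sequence it meets.
        spans.append((exc.start, exc.end))
        return ("\ufffd", exc.end)  # substitute U+FFFD and resume after it

    codecs.register_error("record_invalid_utf8", record)
    data.decode("utf-8", errors="record_invalid_utf8")

    hits = []
    for start, end in spans:
        line = data.count(b"\n", 0, start) + 1
        column = start - (data.rfind(b"\n", 0, start) + 1) + 1
        hits.append((line, column, start, data[start:end]))
    return hits


# Sample input: the address line with the dash as a Windows-1252 byte.
sample = b'<CustData>\n<Addr line="1">c/o Ecova \x96 MS 3249 </Addr>\n'
for line, column, offset, bad in find_invalid_utf8(sample):
    print(f"line {line}, column {column}, offset {offset}: 0x{bad.hex()}")
```

For a real journal, read the file in binary mode (for example `open(path, "rb").read()`) and pass the bytes to the function. The byte offset is usually the most reliable way to jump to the spot in a hex-capable editor.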

One way to find the invalid characters is to load the journal and the DTD (doc1 output uses eGad.dtd) into an XML parser and see where it fails.

Unfortunately, different XML parsers count line and column numbers differently, so the line and column numbers in the error message are often inaccurate.
 
In the first example, the first part of the error message:
 
invalid byte '<' at position 2 of a 3-byte sequence
 
was misleading as well: the bad character was the one immediately BEFORE the '<', not the '<' itself.
 
The best way to find out the real cause is to load the journal into an XML parser (along with the DTD) and see where it fails.
 
I’m attaching the eGad.dtd file in case you don’t have it.
 
What I did was change the journal extension to .XML, put the DTD in the same directory, and then load the XML file in Internet Explorer.  IE told me which line was invalid.
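
Where Internet Explorer is not available, a scripted parser can play the same role. The following is a hedged Python sketch using the standard library's `xml.etree.ElementTree`. The function name `check_well_formed` is hypothetical. Note that this parser is non-validating: it does not read eGad.dtd, it only checks well-formedness and encoding, which is the class of problem behind error 28447. A journal that relies on entities declared in the DTD may fail here for unrelated reasons.

```python
import xml.etree.ElementTree as ET


def check_well_formed(path):
    """Return None if the file parses, else (line, column, message)."""
    try:
        # iterparse streams the file, so a large journal is not built
        # into one in-memory tree.
        for _event, element in ET.iterparse(path):
            element.clear()
    except ET.ParseError as exc:
        line, column = exc.position
        return line, column, str(exc)
    return None
```

Calling `check_well_formed` on a copy of the journal returns the first failure only. As with any parser, treat the reported position as a starting point and inspect the bytes just before it.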

Environment Details

5.3, 5.4, 5.5, 6.0, 6.1
