Various GUIDs present in a Doc1 Interchange Journal (DIJ) explained

A Doc1 Interchange Journal (DIJ) contains several identifier tags. What is the difference between DocMasterID, DocInstanceID, and JobGUID?

Example:
 
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE eGAD SYSTEM "eGAD.Dtd">
<eGAD pakUID="PID_DC005E01992811E3AEC544AC881EB40B">
<jobdata>
<datetime>20140218214412</datetime>
<platform>HP UNIX</platform>
<Version major="4" minor="3"/>
<JobGUID>DC005E02992811E3AEC544AC881EB40B</JobGUID>
<JobName>print/my_afp_file.afp</JobName>
<JobShortName>my_afp_file.afp</JobShortName>
<NativeFormat value="AFPDS"/>
<ResourceGUID p="1" value="33DC34D3A8474AD599203ABDF995BA63"/>
<ResourceGUID p="0" value="86BC1A67257548458F5EB0FAE579C0EB"/>
</jobdata>
<document docID="1" docMasterID="796ED2FB7E2B8F61B521F9DEFE2B9854" docInstanceID="DC005E00992811E3AEC544AC881EB40B">
 


DocMasterID:
(1) The DocMasterID from the XML journal is copied to the document property doc.guid, which is used by default to construct the guid.dri index entries.
(2) Duplicate values of docMasterID are legal. Multiple renditions of the same document content can be loaded into Vault; for example, both an HTML and an AFP version of the same invoice. When rendering, Vault chooses, among the renditions that share the guid, the first one that supports the requested output mode. Documents with the same docMasterID can therefore be thought of as having the same content.
 
In a couple of cases, however, the DocMasterID is assumed to be unique. Some versions of Vault have issues with a file that contains a DocMasterID matching that of a record that has already been processed. Even though the data within the DIJ entry is different, it is not different enough to produce a unique ID.
 
The DocMasterID is generated from a hash of:
  1. Offset to the start of the publication
  2. Offset to the end of the publication
  3. DIJ account number
  4. DIJ statement date
Given sensible data, it should therefore be unique. However, items 1) and 2) are byte offsets to the start and end of the publication input data, so duplicate DocMasterIDs are possible when the input data has a fixed record length (mainframe data, for example).
If a customer uses data with fixed-length records and fields AND performs more than one run containing records with the same statement date, it is highly likely that the first DocMasterID in a subsequent run will match a DocMasterID from a previous run.
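
The collision scenario can be sketched in a few lines of Python. The hash function DOC1 actually uses is not stated in this article, so MD5 appears here purely as a 128-bit stand-in, and the function name doc_master_id, the field separator, the record length, and the sample values are all hypothetical. The only point is that identical inputs give an identical ID, whatever the document content.

```python
import hashlib


def doc_master_id(start_offset: int, end_offset: int,
                  account_number: str, statement_date: str) -> str:
    """Hypothetical stand-in for the DocMasterID hash.

    MD5 is an assumption, chosen only because it yields 32 hex digits;
    the algorithm DOC1 really uses may differ.
    """
    key = f"{start_offset}|{end_offset}|{account_number}|{statement_date}"
    return hashlib.md5(key.encode("utf-8")).hexdigest().upper()


RECORD_LENGTH = 400  # assumed fixed record length, in bytes

# First publication of run A and of run B. With fixed-length records both
# occupy bytes 0..400 of their input file, even if the content differs.
run_a_first = doc_master_id(0, RECORD_LENGTH, "ACC0001", "20140218")
run_b_first = doc_master_id(0, RECORD_LENGTH, "ACC0001", "20140218")

# Same offsets and account number, but a different statement date.
run_c_first = doc_master_id(0, RECORD_LENGTH, "ACC0001", "20140318")

print(run_a_first == run_b_first)  # True: the IDs collide
print(run_a_first == run_c_first)  # False: one of the four inputs changed
```

Variable-length input data makes the two offsets differ from run to run, which is why the problem mainly shows up with fixed-length records.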
 
JobGUID:
The JobGUID from DOC1 is produced solely from the date plus the run time of the job. The run time is calculated down to the second (HH:MM:SS), so if a job is started within the same second as another, the two jobs can receive the same JobGUID even though their files are unique.

The recommendation is that jobs are started at least one second apart from each other. This ensures that the JobGUID is unique every time.

In the XML journal, you can check the <datetime> field (for example, <datetime>20140218214412</datetime>) to see when the job was run. If the <datetime> tag in two journals has the same value, the JobGUID will also be the same.
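
The <datetime> comparison can be automated. The following is a minimal Python sketch, assuming each journal is available as a complete, well-formed XML string with the <jobdata>/<datetime> layout shown in the example. The function names, the journal file names, and the trimmed sample journals are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List


def job_datetime(journal_xml: str) -> str:
    """Return the text of <jobdata>/<datetime> from one journal."""
    root = ET.fromstring(journal_xml)
    value = root.findtext("jobdata/datetime")
    if value is None:
        raise ValueError("journal has no <jobdata>/<datetime> element")
    return value


def find_datetime_collisions(journals: Dict[str, str]) -> Dict[str, List[str]]:
    """Group journal names by <datetime>; keep only values seen more than once."""
    by_datetime = defaultdict(list)
    for name, xml_text in journals.items():
        by_datetime[job_datetime(xml_text)].append(name)
    return {dt: names for dt, names in by_datetime.items() if len(names) > 1}


# Trimmed, hypothetical journals: only the elements this check needs.
SAMPLE = ("<eGAD><jobdata><datetime>{dt}</datetime>"
          "<JobName>{name}</JobName></jobdata></eGAD>")

journals = {
    "run1.jrn": SAMPLE.format(dt="20140218214412", name="a.afp"),
    "run2.jrn": SAMPLE.format(dt="20140218214412", name="b.afp"),
    "run3.jrn": SAMPLE.format(dt="20140218214413", name="c.afp"),
}

collisions = find_datetime_collisions(journals)
print(collisions)  # {'20140218214412': ['run1.jrn', 'run2.jrn']}
```

Any journal names reported together were started within the same second and, per the explanation above, would be expected to share a JobGUID.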
 
DocInstanceID:
The DocInstanceID in the printstream is about as unique as you can get: it is generated from the MAC address and the current time.
 
However, Generate can produce duplicate DocInstanceIDs if jobs run in parallel on the same physical machine.
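
An ID built from a MAC address and the current time has the same ingredients as an RFC 4122 version 1 UUID. The sample DocInstanceID from the example journal happens to parse as one, so the Python sketch below decodes it on that basis. Whether DOC1 follows RFC 4122 exactly is an assumption, and the function name decode_time_based_id is hypothetical.

```python
import uuid
from datetime import datetime, timezone

# Number of 100-nanosecond intervals between the UUID epoch (1582-10-15)
# and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def decode_time_based_id(hex_id: str):
    """Split a 32-digit hex ID into (UTC creation time, node as 12 hex digits).

    Assumes the ID is laid out like an RFC 4122 version 1 UUID.
    """
    parsed = uuid.UUID(hex=hex_id)
    if parsed.version != 1:
        raise ValueError("not a version 1 (time-based) UUID")
    unix_seconds = (parsed.time - UUID_EPOCH_OFFSET) // 10_000_000
    created = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    return created, f"{parsed.node:012X}"


# DocInstanceID taken from the example journal in this article.
created, node = decode_time_based_id("DC005E00992811E3AEC544AC881EB40B")
print(created.isoformat())
print(node)  # 44AC881EB40B (the last 12 hex digits of the ID)
```

For the sample ID the decoded time is 2014-02-19 05:44:12 UTC, which is consistent with the journal's <datetime> value of 20140218214412 if that value was recorded in a local time eight hours behind UTC. Under this reading, jobs on the same machine share the node portion, so uniqueness rests on the time portion alone.
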
UPDATED:  August 5, 2019