VERIFIED SOLUTION

Removing bad pointers from an EngageOne Vault index

6.0, 6.1
I can't access many documents in Vault.  I get these kinds of errors in the log:
 
00:08:22 ERROR 11826: attempt to read block of [-1409286024] bytes at [00000000006E53AC], extends past end of file [0000000000770152]
00:08:22 127.0.0.1:56126 <storage1> ERROR 70192: document at offset [0x006E53A40000E352] in document data file [\\isilonnas.corp\factdigitalnew\Vault\docdata\2013062924625-afp-tr-bsc1e21-280613-gui.drd] is not valid
00:08:22 127.0.0.1:56126 <storage1> storage.docdata failed, status [70192]
00:08:22 127.0.0.1:56126 <database1> search failed, status [70192]

I think that I may have incorrectly removed a file, and then tried to add it back in again.

If a customer incorrectly removes a job from the Vault and then attempts to reinsert the same job (slightly changed), invalid pointers can remain in the Vault index, corrupting it and making Vault unusable.
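
To see which document data files the log is complaining about, the ERROR 70192 lines can be scanned for .drd paths. The following is a minimal Python sketch, not part of Vault. The function name flagged_drd_files and the regular expression are assumptions based only on the log excerpt above, and a flagged file is only a candidate for double indexing, not proof of it.

```python
import re
import sys
from collections import Counter

# Matches the "document data file [<path>.drd]" part of an ERROR 70192 line.
# The pattern is inferred from the log excerpt in this article (assumption).
DRD_PATTERN = re.compile(
    r"ERROR 70192: .*?document data file \[([^\]]+\.drd)\]",
    re.IGNORECASE,
)


def flagged_drd_files(log_lines):
    """Count how often each .drd path appears in ERROR 70192 lines."""
    counts = Counter()
    for line in log_lines:
        match = DRD_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Usage (hypothetical log path): python flagged.py path/to/vault.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for path, hits in flagged_drd_files(log).most_common():
            print(hits, path)
```

Lines that only report the status code, such as "storage.docdata failed, status [70192]", carry no file name and are skipped.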

A full reindex will solve this problem, but sometimes you can get by with this much faster alternative:
 
1. Make a folder outside of the pagedata/docdata directory tree called "badfiles"
2. Move all files that are possibly double-indexed into badfiles
a. While the files are in badfiles, those documents cannot be viewed on the online system. Hopefully that is acceptable: the outage is short, roughly 1 hour.
3. Run tools\indexcheck index/invlink.dri -copy:index/invlink.new.dri -validfile
a. This takes about as long as copying the file would, which is not very long
b. The copy skips any document entry whose file is not present in the docdata folder, conveniently getting rid of the dubious references
4. Stop Vault temporarily, rename index/invlink.dri to invlink.bad, then rename invlink.new.dri to invlink.dri
*Repeat steps 3 and 4 for each document-linked index ('dh' flags), e.g. GUID, IGUID, etc.*
5. Start Vault
6. Move the files from badfiles back to pagedata/docdata
7. Index the files that were in badfiles, using a .index file for each DRD
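
The file handling in steps 1, 2, 4 and 6 can be scripted to reduce the chance of a mistyped rename. The following is a minimal Python sketch under stated assumptions: all function names are hypothetical, the paths are supplied by the operator, and indexcheck_command only rebuilds the argument list shown in step 3 for a given index file (it does not run anything). Stopping and starting Vault is left to the operator.

```python
import shutil
from pathlib import Path


def quarantine(files, badfiles_dir):
    """Steps 1-2: create the badfiles folder and move suspect files into it."""
    badfiles_dir = Path(badfiles_dir)
    badfiles_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in map(Path, files):
        dest = badfiles_dir / src.name
        shutil.move(str(src), str(dest))
        moved.append((src, dest))  # remember the original location for step 6
    return moved


def indexcheck_command(index_file):
    """Step 3: argument list mirroring the command in the article."""
    index_file = Path(index_file)
    new_file = index_file.with_name(index_file.stem + ".new.dri")
    return [
        r"tools\indexcheck",
        index_file.as_posix(),
        f"-copy:{new_file.as_posix()}",
        "-validfile",
    ]


def swap_index(index_file):
    """Step 4: keep the old index as .bad and promote the new copy.

    Call this only while Vault is stopped.
    """
    index_file = Path(index_file)
    new_file = index_file.with_name(index_file.stem + ".new.dri")
    bad_file = index_file.with_suffix(".bad")
    if not new_file.exists():
        raise FileNotFoundError(new_file)
    index_file.rename(bad_file)
    new_file.rename(index_file)


def restore(moved):
    """Step 6: move quarantined files back to where they came from."""
    for src, dest in moved:
        shutil.move(str(dest), str(src))
```

Keeping the list returned by quarantine means step 6 puts each file back in its original folder, which matters if the suspect files came from more than one directory.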
 
Total time: 1-2 hours, even for a very large customer, and even with the legacy index engine
 
UPDATED:  October 3, 2019