Using Vault indexcheck validfile

UPDATED: September 7, 2017


Using Indexcheck ValidFile

Vault includes a utility called indexcheck for manipulating index data. It is located in the server\tools directory. Indexcheck has a wide variety of options; the -copy option in particular copies index entries from one index to another.

The -copy option can be used to:

• compact an index

• recover space from mass purge operations

• convert between standard, Unicode and indexerd index types

• repair or change the sort order of a Unicode index

• filter out document index entries pointing to invalid jobs

This last use involves the -validfile option. On its own, -validfile reports document index entries that point to missing job files. Combined with -copy, it makes a copy of a document index with any such entries filtered out. Generally, -validfile is used with the -noprint option so that valid index entries are not logged.

What ValidFile Does

The -validfile option starts by reading the server.ini configuration file from the current directory. Specifically, it looks for the storage model used and any redirections for storage-related directories. It then scans the storage directories to make a list of available document data (drd) and document page (drp) files. It does not check for the existence of journal files. Because it reads server.ini from the current directory, indexcheck -validfile is normally run from the server directory. Note that older versions of indexcheck may not support storage model or path redirections.
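
The scan described above can be pictured with a short Python sketch. This is not how indexcheck is implemented; it only illustrates the idea of building a list of jobs from the drd and drp files present. It assumes, hypothetically, that each storage directory is flat and that a job name is the file name without its extension. The function name list_job_files is made up for this illustration.

```python
from pathlib import Path


def list_job_files(storage_dirs):
    """Return (drd_jobs, drp_jobs): the job names that have a .drd / .drp file.

    Assumption: a job name is the file name without its extension, and the
    storage directories are flat. Real storage models may differ.
    """
    drd_jobs, drp_jobs = set(), set()
    for directory in storage_dirs:
        for path in Path(directory).iterdir():
            if not path.is_file():
                continue
            suffix = path.suffix.lower()
            if suffix == ".drd":
                drd_jobs.add(path.stem)
            elif suffix == ".drp":
                drp_jobs.add(path.stem)
            # Journal (.jrn) files are ignored, matching the description above.
    return drd_jobs, drp_jobs
```

A job that appears in only one of the two sets is missing one of its files.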

When the index operation scans through the target index, the -validfile option extracts the job name from each document index entry and looks it up in the list. If the referenced job is missing the corresponding drd and/or drp file, the index entry is considered invalid. Combined with the -copy option, this causes the index entry to be skipped. Without the -copy option, the index entry is reported in the output.
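
As a rough illustration of that rule, the Python sketch below splits index entries into valid and invalid ones. The entry representation (a key paired with a job name) and the function names are hypothetical; the real index format is not described here.

```python
def missing_parts(job, drd_jobs, drp_jobs):
    """Return the file types ("drd", "drp") that are missing for a job."""
    missing = []
    if job not in drd_jobs:
        missing.append("drd")
    if job not in drp_jobs:
        missing.append("drp")
    return missing


def split_entries(entries, drd_jobs, drp_jobs):
    """Split (key, job) index entries into valid and invalid lists.

    'valid' corresponds to what a filtered copy would keep; 'invalid'
    corresponds to what a report-only scan would list, with the reason.
    """
    valid, invalid = [], []
    for key, job in entries:
        missing = missing_parts(job, drd_jobs, drp_jobs)
        if missing:
            invalid.append((key, job, missing))
        else:
            valid.append((key, job))
    return valid, invalid
```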

The -validfile option should only be used on document indexes. Customer index entries do not contain references to jobs, so if you use -validfile on a customer index, all entries will be treated as invalid. Document indexes have the 'h' flag in the profiles.ini/database.ini IndexN= settings, whereas customer indexes have the 'c' flag. The default invlink, guid and iguid indexes are examples of document indexes that you could use with -validfile.

Indexcheck reads raw index data during its scan. If another process is writing to the index at the same time, indexcheck may see inconsistent index data. Similarly, the -validfile option is sensitive to changes in the job list in the storage directories. As a precaution, stop or suspend the e2loaderd process responsible for these indexes while using indexcheck.

Finding Invalid Index Entries

If the server reports errors indicating that some job files are missing from the storage directories, indexcheck -validfile can be used to scan one or more document indexes for entries that refer to them. A good place to start is the invlink index, into which all documents in a given database would normally be indexed. To run the scan, change to the server directory and run indexcheck from the tools directory, giving the path to the index file and the -validfile option:

 

tools\indexcheck index\statements\invlink.dri -validfile -noprint

 

 

Index Diagnostic 7.2.1.12

 

Search [index\statements\invlink.dri] for:

 

[]

 

Found:

 

[00020347_2001/11/11_20011111-tryme-telco-statement_]

ERROR 14116: missing document information file

[00020347_2002/01/11_20150202-tryme-telco-statement_]

ERROR 14117: missing compressed data file

[32759454_2001/11/11_20011111-tryme-telco-statement_]

ERROR 14116: missing document information file

[32759454_2002/01/11_20150202-tryme-telco-statement_]

ERROR 14117: missing compressed data file

 

1500 matches

1000 errors

 

Here two files are missing, 20011111-tryme-telco-statement.drd and 20150202-tryme-telco-statement.drp.
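
For a long report it can help to reduce the output to a list of missing files. The Python sketch below does this for output shaped like the sample above. It relies on assumptions inferred only from that sample: that error 14116 corresponds to a missing drd file and 14117 to a missing drp file, and that the job name is the third underscore-separated field of the bracketed entry. Account numbers containing underscores, or job names ending in an underscore, would break this simple split.

```python
import re

# Assumption inferred from the sample output: 14116 -> drd, 14117 -> drp.
ERROR_TO_EXTENSION = {"14116": "drd", "14117": "drp"}

ENTRY_RE = re.compile(r"^\[(.+)\]$")
ERROR_RE = re.compile(r"^ERROR (\d+):")


def missing_files(report_lines):
    """Return a sorted list of missing job file names found in a report."""
    missing = set()
    current_entry = None
    for raw in report_lines:
        line = raw.strip()
        entry = ENTRY_RE.match(line)
        if entry:
            current_entry = entry.group(1)
            continue
        error = ERROR_RE.match(line)
        if error and current_entry:
            ext = ERROR_TO_EXTENSION.get(error.group(1))
            parts = current_entry.split("_", 2)
            if ext and len(parts) == 3:
                # Drop the trailing underscore that ends the entry.
                job = parts[2].rstrip("_")
                missing.add(f"{job}.{ext}")
    return sorted(missing)
```

Applied to the four entries shown above, this yields the same two file names as the manual reading.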

Scanning a large production index can take considerable time and, in some cases, produce a large number of errors. Similarly, a copy operation can take a long time. Consider redirecting the output to a file. On Unix, consider using nohup so the process is not interrupted if your session is disconnected.
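
One way to capture the output in a file is a small wrapper script. The Python sketch below is a minimal example, not a Vault tool; the function name run_to_log and the log file name are made up. It is meant to be run from the server directory, and it makes no assumption about what indexcheck's exit code means.

```python
import subprocess


def run_to_log(command, log_path):
    """Run a command, writing its standard output and error to log_path.

    Returns the process exit code; its meaning depends on the tool being run.
    """
    with open(log_path, "w") as log:
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


# Example (hypothetical log file name), using the scan command shown earlier:
# run_to_log(
#     [r"tools\indexcheck", r"index\statements\invlink.dri",
#      "-validfile", "-noprint"],
#     "invlink-validfile.log",
# )
```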

Filtering Out Invalid Index Entries

In some scenarios, the structure of the index will be valid but there are some index entries present that point to invalid document offsets or to corrupted jobs. If the index structure is suspect, do not rely on index copy methods and prefer rebuilding the index from scratch instead.

As an example, if a job is removed without unindexing and then reloaded with the same job name, document pointers to invalid locations will be present in the index. The structure of the index should be fine but the content is incorrect. At best, accessing these documents will produce errors that show up in the server log. At worst they could lead to crashes or information disclosure. It is important to remove these entries.

There are also cases where job files have been lost or damaged due to external causes. These could cause various errors on retrieval. However, without a valid drd file, document index entries pointing to these jobs are not removable using unindex.

If the set of such jobs is known, you can use -validfile to remove the related entries from the index by intentionally removing the job files from the storage directories:

1. identify the list of problem jobs

2. make a list of document indexes ('h') that these jobs might be indexed into

3. decide on the name of each index copy (e.g. add "new-" to the file name)

4. make sure the new index files do not exist (indexcheck -copy will use an existing target)

5. wait for the e2loaderd process to go idle and stop it

6. manually move the affected drd/drp/jrn files out of the storage directories

7. use indexcheck -copy -validfile -noprint on each document index to make a copy with all references to the damaged jobs removed (syntax below)

8. stop all remaining Vault processes

9. swap the new and old indexes

10. restart the Vault processes
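
Steps 4 and 6 are easy to get wrong by hand, so a small helper can be useful. The Python sketch below is only an illustration under stated assumptions: the storage directory is flat, each job's files are named after the job with a drd, drp or jrn extension, and the holding directory is a location of your choosing. The function names are hypothetical.

```python
import shutil
from pathlib import Path

JOB_EXTENSIONS = (".drd", ".drp", ".jrn")


def existing_targets(new_index_paths):
    """Step 4: return any planned index copies that already exist."""
    return [p for p in map(Path, new_index_paths) if p.exists()]


def move_job_files(storage_dir, jobs, holding_dir):
    """Step 6: move each job's drd/drp/jrn files into a holding directory.

    Returns the names of the files that were moved. Files that do not
    exist are skipped silently.
    """
    holding = Path(holding_dir)
    holding.mkdir(parents=True, exist_ok=True)
    moved = []
    for job in jobs:
        for ext in JOB_EXTENSIONS:
            source = Path(storage_dir) / f"{job}{ext}"
            if source.is_file():
                shutil.move(str(source), str(holding / source.name))
                moved.append(source.name)
    return moved
```

Moving the files rather than deleting them keeps them available if they need to be inspected or restored later.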

When using the filter process above, the customer table (account.drr or account.drt) and customer indexes should not be altered. This method is intended for use with the standard or Unicode indexes. This procedure requires additional steps when using indexerd indexes.

The following is an example of the syntax to make a filtered copy of a document index. The number of matches reported is the count from the source index, not the destination.

 

tools\indexcheck index\statements\invlink.dri -copy:index\statements\new-invlink.dri -validfile -noprint

 

 

Index Diagnostic 7.2.1.12

 

Search [index\statements\invlink.dri] for:

 

[]

 

Found:

 

 

1500 matches

 

index\statements\invlink.dri

file size [139264]

disk read [34]

stack depth [2]

cache read miss [33]

 

Working with Indexerd

The scanning and filtering processes can be applied to indexerd indexes with some modifications to account for differences in the way indexerd indexes work.

The equivalent command for finding invalid index entries is shown below. The index name syntax is slightly different, and the -mode:2 switch is added to indicate that it is an indexerd index:

 

tools\indexcheck -mode:2 index2/invlink.dri -validfile -noprint

 

 

Similarly, the equivalent command for creating a filtered index is:

 

tools\indexcheck -mode:2 index2/invlink.dri -copy:index2/new-invlink.dri -validfile -noprint

 

 

The tricky part with indexerd indexes is that there is no simple way of swapping in the new index. Instead, you need to drop the existing index, copy the new index into place, and then drop the temporary copy:

 

tools\indexcheck -mode:2 index2/invlink.dri -drop -confirm

 

tools\indexcheck -mode:2 index2/new-invlink.dri -copy:index2/invlink.dri -noprint

 

tools\indexcheck -mode:2 index2/new-invlink.dri -drop -confirm
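
When several indexerd indexes need the same treatment, the three commands above can be generated rather than typed. The Python sketch below only builds the command lines for review and does not run anything. It assumes the index2/ location and "new-" prefix used in the examples above; the function name is hypothetical.

```python
def indexerd_swap_commands(index_name, prefix="new-", index_dir="index2"):
    """Build the drop / copy / drop command lines for one indexerd index.

    The commands mirror the three shown above and are returned as strings
    so they can be reviewed before being run by hand.
    """
    live = f"{index_dir}/{index_name}.dri"
    filtered = f"{index_dir}/{prefix}{index_name}.dri"
    base = r"tools\indexcheck -mode:2"
    return [
        f"{base} {live} -drop -confirm",               # drop the existing index
        f"{base} {filtered} -copy:{live} -noprint",    # copy the filtered index into place
        f"{base} {filtered} -drop -confirm",           # drop the temporary copy
    ]


for command in indexerd_swap_commands("invlink"):
    print(command)
```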

