indexerd Space Reuse Notes

In what version does the new index (DR2; indexerd) start more efficiently reusing database space?

Space reuse of nodes that are no longer part of any valid database tree is available starting in 6.1M0p0130. To take advantage of it, you will have to configure indexerd to start dropping database versions. Once a database version is dropped, it can no longer be accessed by “rolling back”. Any tree nodes that have only dead references are reused in creating new trees. For a node to be reusable, all the documents in that node must have been purged and every tree version that references those documents must have been dropped. You are already set up for purging as part of your normal operation. To enable tree version dropping, add the following to indexerd.ini:
versionretentiondays – number of days to retain database versions.
versionretentioncount – minimum number of versions to retain. Note that versionretentiondays takes precedence over versionretentioncount.
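As a rough illustration only, the two settings might look like this in indexerd.ini. The value 7 corresponds to a one-week rollback window; the count of 3 is a placeholder, not a recommendation. This note does not say which section of the file the keys belong in or how comments are written there, so the lines starting with ";" are annotation and the placement should be confirmed with support.

```ini
; Hypothetical values; section placement within indexerd.ini is not stated in this note.
; Retain database versions for 7 days.
versionretentiondays=7
; Keep at least 3 versions (placeholder value).
versionretentioncount=3
```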
Unfortunately, the way nodes were retired prior to this version means that already retired nodes will not be reused automatically. To facilitate the recovery of these older retired nodes, there is a new indexcheck option, “-compact”. It trims all the tree versions except the current one, scans all the databases for these already retired nodes, and modifies them so that they can be reused. Based on where you are in your lifecycle, we expect that this would recover somewhat less than 5 percent of your space, but you can run indexcheck -compact on a full database on your test system to find out. The space recovered is reported in the indexerd log file. It may not be worth the effort, as -compact runs for a very long time on a database as large as your current one.
You can request 6.1M0p0130 through support; it is currently available to them. You will need to update both indexerd and indexcheck.

Thanks for the updated version. I'll get this version from support and start testing. Can you provide more details on the database versions?
Does the database version increment for each Vault job load/indexing? Is versionretentiondays related to the retention period set up for each profile? Within the same database we have some documents with a retention period of 7 years and some with 2 years.

Given that you load new jobs about once a day, usually within the same window, I would expect you to wind up with one database version per day. So if, for example, you only wanted to be able to roll back 7 days, you would set versionretentiondays=7. A new database version is created whenever a new job loads, but the commit is deferred by 30 minutes (the default), so that a new tree version gets committed only 30 minutes after the last job load. The number of database tree versions retained relates only to how far back you want to be able to roll back; it is not related to your document retention settings. The rollback feature allows you to reset the database "view" backwards in time. The expected production use is when a job load needs to be backed out quickly for whatever reason, or when some corruption of the index has occurred: the view can be reset without having to change the underlying index.dr2 file.
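To make the 30-minute deferral concrete, here is a small Python sketch of how job loads might collapse into committed versions. It is not indexerd code: the function name version_commit_times, the timestamps, and the assumption that every new load simply restarts the 30-minute timer are illustrative only.

```python
from datetime import datetime, timedelta


def version_commit_times(job_loads, defer=timedelta(minutes=30)):
    """Estimate when new tree versions would be committed.

    Assumption (not confirmed by this note): each job load restarts the
    deferral timer, and a version is committed once the timer expires
    with no further load arriving.
    """
    commits = []
    pending = None  # time of the most recent load not yet committed
    for load in sorted(job_loads):
        # The timer from the previous load expired before this load arrived,
        # so that earlier batch was committed as its own version.
        if pending is not None and load - pending >= defer:
            commits.append(pending + defer)
        pending = load
    if pending is not None:
        commits.append(pending + defer)
    return commits


# Hypothetical schedule: three loads in one nightly window, one the next night.
night1 = datetime(2017, 9, 18, 1, 0)
loads = [
    night1,
    night1 + timedelta(minutes=10),
    night1 + timedelta(minutes=35),
    night1 + timedelta(days=1, minutes=5),
]
for commit in version_commit_times(loads):
    print(commit.isoformat())
```

Under these assumptions the three loads in the first window produce a single version, which matches the expectation of roughly one database version per day when all jobs load in the same nightly window.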
UPDATED:  September 18, 2017