For this example I will use a sample history store that contains 1000 entries. This is how things appear to happen:

- On shutdown, Workspace.save(FULL_SAVE) is called.
- This calls HistoryStore.clean(), which does:

    Set uuids = new Set();
    for each HistoryStoreEntry do
      if entry is ok according to policy settings then
        uuids.add(entry.getUUID());
      endif
    endfor

  So we have a set of 1000 UUIDs, which represent the blobs on disk.
- clean() then calls removeGarbage(set), which does:

    Set blobs = new Set();
    for each blob on disk do
      blobs.add(blob);
    endfor

  So we have a set of 1000 blobs.
- The following algorithm is then run to clean up unreferenced blobs:

    for each blob in blobs do
      if not uuids.contains(blob) then
        delete blob from disk
      endif
    endfor

Some observations:
- This seems like it would be slow.
- This seems like it would create a lot of garbage.
- The current history store implementation has a 1-1 ratio between HistoryStoreEntries and blobs on disk, but this will soon change: entries will share blobs on disk (for the copy and move cases).
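The two-pass shape described above can be sketched roughly as follows. This is a minimal sketch, not the actual HistoryStore code: the Entry record and the in-memory set standing in for the disk are hypothetical, and only the algorithm's structure (collect kept UUIDs, snapshot all blobs, then delete the unreferenced ones) matches the description.

```java
import java.util.HashSet;
import java.util.Set;

public class HistoryCleanSketch {

    // Hypothetical stand-in for HistoryStoreEntry: a UUID plus the
    // policy decision of whether the entry should be kept.
    record Entry(String uuid, boolean keptByPolicy) {}

    // Pass 1: collect the UUIDs of all entries the policy retains.
    static Set<String> referencedUuids(Iterable<Entry> entries) {
        Set<String> uuids = new HashSet<>();
        for (Entry e : entries) {
            if (e.keptByPolicy()) {
                uuids.add(e.uuid());
            }
        }
        return uuids;
    }

    // Pass 2: snapshot every blob name on disk into a second set, then
    // delete each blob that no retained entry references. The extra
    // snapshot set is the "lot of garbage" the observations point at.
    static void removeGarbage(Set<String> uuids, Set<String> blobsOnDisk) {
        Set<String> blobs = new HashSet<>(blobsOnDisk); // full snapshot first
        for (String blob : blobs) {
            if (!uuids.contains(blob)) {
                blobsOnDisk.remove(blob); // stands in for deleting the file
            }
        }
    }
}
```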
Upon further review, it doesn't look like we can do a whole lot here, since the blobs are now shared between file states. I have, however, refactored the code so that it now deletes unreferenced blobs while iterating over the blobs on disk.
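The refactored shape might look something like the following single-pass sketch (again with hypothetical names): instead of first copying all blob names into a second set, it walks the on-disk blobs once and removes each blob whose name is not in the referenced-UUID set as it goes.

```java
import java.util.Iterator;
import java.util.Set;

public class SinglePassCleanSketch {

    // Delete unreferenced blobs in one pass over the blobs on disk.
    // Iterator.remove() allows deletion during iteration, so no extra
    // snapshot set of 1000 names needs to be built.
    static void removeGarbage(Set<String> referencedUuids, Set<String> blobsOnDisk) {
        for (Iterator<String> it = blobsOnDisk.iterator(); it.hasNext(); ) {
            if (!referencedUuids.contains(it.next())) {
                it.remove(); // stands in for deleting the blob file
            }
        }
    }
}
```

This avoids both the second set and the second full traversal, which addresses the speed and garbage observations above even though the shared-blob case limits further optimization.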
Released new code into HEAD. Closing.