Bug 510659 - BatchRefUpdate is slow with RefDirectories
Status: NEW
Alias: None
Product: JGit
Classification: Technology
Component: JGit
Version: 4.7
Hardware: All All
Importance: P3 normal
Target Milestone: ---
Assignee: Project Inbox CLA
Reported: 2017-01-18 17:56 EST by David Turner CLA
Modified: 2017-04-25 03:44 EDT

Description David Turner CLA 2017-01-18 17:56:21 EST
I have 10,000 refs, packed into a packed-refs file. I want to delete one hundred of them. What BatchRefUpdate should do is open the packed-refs file, read it into a Set, delete the hundred refs, and write it back out.

What it actually does, for each ref, is read the packed-refs file, do a linear-time operation to delete that one ref from the RefList (why is this not just a TreeMap?), write the result back out to the packed-refs file, and then repeat for the next ref.

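This is much slower: deleting a hundred refs reads and rewrites the entire 10,000-entry packed-refs file a hundred times instead of once.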
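
To make the difference concrete, here is a rough sketch of the two strategies. It is not JGit code: PackedRefsSketch and its methods are hypothetical names, and the "file" is just an in-memory list with counters standing in for reads and writes of packed-refs.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

/** Toy in-memory stand-in for a packed-refs file (hypothetical, not JGit). */
final class PackedRefsSketch {
    private List<String> file; // sorted ref names, standing in for the on-disk file
    int reads = 0;             // number of simulated full-file reads
    int writes = 0;            // number of simulated full-file writes

    PackedRefsSketch(Collection<String> refs) {
        this.file = new ArrayList<>(new TreeSet<>(refs));
    }

    private List<String> read() {
        reads++;
        return new ArrayList<>(file);
    }

    private void write(Collection<String> refs) {
        writes++;
        file = new ArrayList<>(refs);
    }

    /** Behavior described above: one read, one linear removal, one write per ref. */
    void deleteOneAtATime(Collection<String> toDelete) {
        for (String name : toDelete) {
            List<String> refs = read();
            refs.remove(name); // linear scan of the list
            write(refs);
        }
    }

    /** Desired behavior: read once into a sorted set, remove all, write once. */
    void deleteInBatch(Collection<String> toDelete) {
        TreeSet<String> refs = new TreeSet<>(read());
        for (String name : toDelete) {
            refs.remove(name); // logarithmic per ref
        }
        write(refs);
    }

    /** Copy of the current simulated file contents. */
    List<String> snapshot() {
        return new ArrayList<>(file);
    }
}
```

With 10,000 refs and 100 deletions, deleteOneAtATime performs 100 simulated reads and 100 writes, while deleteInBatch performs one of each and leaves the same contents behind.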

I wanted to fix this, but I wasn't sure whether it made sense to just add a bulk delete operation to RefDatabase, or whether I should subclass BatchRefUpdate, or whether I should do something else.
Comment 1 Dave Borowitz CLA 2017-04-24 08:59:51 EDT
I don't think we want to change the BatchRefUpdate or other interface to support batch deleting specifically. If you want to do a bunch of ref updates in a batch, then BatchRefUpdate is the one true way to do that.

I think the solution is to properly implement batch updates in RefDirectory, so I filed bug 515678. We can merge these bugs if you like.