Re: [jgit-dev] New packfiles created with every push

On Wed, Jun 21, 2017 at 10:25 PM, Will Saxon <saxonww@xxxxxxxxx> wrote:
Hello,

This is a re-send of an email I sent to jgit-dev on 6/19. I don't see my message in the archives, and I sent it immediately after subscribing, so I thought maybe it was rejected or otherwise lost.

Your first email wasn't lost; it arrived on the list and is in the archive.
 
We are using JGit primarily via Gerrit Code Review v2.14.1. I'm emailing this list as prior emails to the Gerrit list have pointed out that this is a JGit question.

Since upgrading to Gerrit 2.14.0, we've noticed poor push performance for tags and commits, especially later in the workday. Push times often exceed 1 minute - I've personally seen as high as 6m30s - which is a significant regression for us. The initial cause was thought to be the recently-introduced autogc capability, so we disabled it by setting receive.autogc false in all of our repos, but we're still seeing very bad push performance even with it disabled.

Did you try to disable autoGc completely by setting

[gc]
auto = 0
autoPackLimit = 0

in the ~/.gitconfig of the OS user running Gerrit?
 
One thing I have noticed is that every new patchset or tag pushed to Gerrit causes a new pack file to be created. In the case of a tag pushed after a patchset, a new pack file with the patchset content + the tag object is created, i.e. we have one pack with the commit, tree, and blob objects, plus another pack with all of that plus a tag object. I'm not sure if this is new or if this is how JGit/Gerrit have always worked.


I tried pushing some commits to a Gerrit 2.13.8 server, and a new pack appeared in the respective Git repository's objects/pack folder
for each commit that arrived, so this behavior predates 2.14.
 
Having the additional packfiles doesn't seem to affect small repos where there is not a lot of change during the day, however for one of our larger/busier repos we end up with hundreds of pack files by the end of the workday; we do a single GC run at about 10pm - GC on this repo takes ~45 minutes since Gerrit 2.14 - which reduces the number of pack files to 2, but then dozens of nightly builds (each of which are tagged) plus whatever developers push during the day brings the total back up into the hundreds by the next evening. This affects our developers when they try to push, and also slows down any automated push activity (e.g. build tagging). 

Regular git doesn't seem to create these additional pack files, instead there are new loose objects which are then gathered into a pack by gc if a loose object threshold is crossed.

Is JGit (or Gerrit's use of JGit) supposed to be creating all these additional pack files? Would a high number of small pack files, some with redundant data, contribute to push performance issues? If so, is there any information I can put together which would help narrow down what the problem is?

This would be a reason not to disable receive.autogc: with it enabled, autoGc combines the packs as soon as their
number exceeds the threshold (the default threshold is 50 packs).

Gerrit 2.14.0 uses JGit 4.7.0.201704051617-r, which unpacks garbage to loose objects in order to maintain
expiration per object (rather than per pack, as older JGit versions did, which always kept all garbage in a
separate garbage pack) [1]. If a lot of objects become garbage (more than the autoGc threshold of 6700 loose objects)
but their retention time hasn't expired yet, this JGit version runs autoGc on each receive-pack,
since the loose object threshold is exceeded. This performance problem was fixed in JGit 4.7.1.201706071930-r,
which is used in Gerrit 2.14.1 [2]. With that version, autoGc runs in a background job, and at most once per day, if the
autoGc threshold for loose garbage objects is exceeded. It also ensures that when several threads push to the same
repository concurrently, at most one of them runs autoGc.
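
As a rough illustration of the two thresholds mentioned above (50 packs, 6700 loose objects), the decision can be modelled as below. This is a simplified sketch, not JGit's actual code: the function name and parameters are hypothetical, each check is treated independently, and a limit of 0 is assumed to switch the corresponding check off.

```python
def needs_auto_gc(loose_objects, packs, auto=6700, auto_pack_limit=50):
    """Simplified model: gc is due when either count exceeds its limit.

    A limit of 0 (or less) disables that particular check.
    """
    too_many_loose = auto > 0 and loose_objects > auto
    too_many_packs = auto_pack_limit > 0 and packs > auto_pack_limit
    return too_many_loose or too_many_packs
```

In this model a repository with a few hundred packs would trip the pack check on every push as long as receive.autogc is enabled and the limits are left at their defaults, while more than 6700 unexpired loose garbage objects would keep the loose-object check true even directly after a gc run.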

So you may want to upgrade to 2.14.1 and check whether this fixes your problem.


-Matthias
