Re: [jgit-dev] GarbageCollectCommand does not prune unreferenced commit

Thanks for your clarification. Testing for objectDatabase().has(orphan) does the trick.

Is there a reason why resolve() behaves so inconsistently? It wouldn't return non-null for an abbreviated id that doesn't exist, a ref that doesn't exist, or something like HEAD^9 when there is no ninth parent, right?


From: Christian Halstrick <christian.halstrick@xxxxxxxxx>
To: Rüdiger Herrmann <ruediger.herrmann@xxxxxx>
Cc: Matthias Sohn <matthias.sohn@xxxxxxxxx>; "jgit-dev@xxxxxxxxxxx" <jgit-dev@xxxxxxxxxxx>
Sent: Friday, April 8, 2016 8:15 AM
Subject: Re: [jgit-dev] GarbageCollectCommand does not prune unreferenced commit

Hi,

I think everything works fine here. The commit really gets garbage
collected during the first gc. There is no need to modify the config or
the lastModified timestamps on disk.

The only problem is how you check at the end whether the object exists.
You call Repository.resolve(orphanedCommitId), but
Repository.resolve(...) returns something non-null for every
well-formed object id string. A Repository.resolve("0123456789....")
returns an AnyObjectId instance even if that object does not exist.
Instead, ask the ObjectDatabase whether it knows the object with that
id, like here:

ObjectId initial = git.getRepository().resolve("HEAD");
RevCommit orphan = git.commit().setMessage("orphan").call();
RefUpdate refUpdate = git.getRepository().updateRef("refs/heads/master");
refUpdate.setNewObjectId(initial);
refUpdate.forceUpdate();
FileUtils.delete(new File(git.getRepository().getDirectory(), "logs"),
        FileUtils.RECURSIVE | FileUtils.RETRY);
assertTrue(git.getRepository().getObjectDatabase().has(orphan.getId()));
git.gc().setExpire(new Date()).call();
assertFalse(git.getRepository().getObjectDatabase().has(orphan.getId()));
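
The difference between resolving an id and checking that the object exists can be illustrated without JGit. The following is only a minimal sketch of a toy model: `ToyRepository` and its `add`, `resolve`, and `has` methods are hypothetical stand-ins and not JGit's API. It assumes nothing more than what is described above, namely that a well-formed full id string resolves to an id regardless of whether the object is stored.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model only; this is NOT the JGit API. */
final class ToyRepository {
    // Names of the objects that are actually stored.
    private final Set<String> objects = new HashSet<>();

    void add(String hexId) {
        objects.add(hexId);
    }

    /** Returns the id for any well-formed 40-hex string, stored or not. */
    String resolve(String rev) {
        if (rev.matches("[0-9a-f]{40}")) {
            return rev; // purely syntactic: no existence check happens here
        }
        return null; // refs, abbreviations and HEAD^n are not modelled
    }

    /** Existence check against the object store. */
    boolean has(String hexId) {
        return objects.contains(hexId);
    }

    public static void main(String[] args) {
        ToyRepository repo = new ToyRepository();
        String missing = "0123456789012345678901234567890123456789";
        System.out.println(repo.resolve(missing) != null); // prints true
        System.out.println(repo.has(missing));             // prints false
    }
}
```

In this model, as in the test above, only the `has` check says anything about whether the object survived the gc.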

I'll see whether I can find a test which does not even require that
you manually delete the reflogs.

Ciao
  Chris


On Thu, Apr 7, 2016 at 6:31 PM, Rüdiger Herrmann
<ruediger.herrmann@xxxxxx> wrote:
> Alright, here is the change:
> https://git.eclipse.org/r/70155
>
>
>
> From: Matthias Sohn <matthias.sohn@xxxxxxxxx>
> To: Rüdiger Herrmann <ruediger.herrmann@xxxxxx>
> Cc: "jgit-dev@xxxxxxxxxxx" <jgit-dev@xxxxxxxxxxx>
> Sent: Thursday, April 7, 2016 5:27 PM
>
> Subject: Re: [jgit-dev] GarbageCollectCommand does not prune unreferenced
> commit
>
> I meant push it to Gerrit for review. If we don't want to keep it as a test,
> we can still abandon it.
>
> On Thu, Apr 7, 2016 at 5:16 PM, Rüdiger Herrmann <ruediger.herrmann@xxxxxx>
> wrote:
>
> N.B. Not sure what you mean by 'contribute'. While the test from the
> previous post is good enough to demonstrate the bug in JGit (if any), I
> don't think it would qualify as a _real_ test to be included in any test
> suite.
>
>
>
>
> ________________________________
> From: Rüdiger Herrmann <ruediger.herrmann@xxxxxx>
> To: Matthias Sohn <matthias.sohn@xxxxxxxxx>
> Cc: "jgit-dev@xxxxxxxxxxx" <jgit-dev@xxxxxxxxxxx>
> Sent: Thursday, April 7, 2016 4:59 PM
>
> Subject: Re: [jgit-dev] GarbageCollectCommand does not prune unreferenced
> commit
>
> Thanks for looking into this. The three methods shown below are all that is
> necessary to run in the GarbageCollectCommandTest.
>
> >>> BEGIN
>
> @Test
> public void testPruneOldOrphanCommit() throws Exception {
>     StoredConfig config = git.getRepository().getConfig();
>     config.setString("gc", null, "prunePackExpire", "1.second.ago");
>     config.setString("gc", null, "pruneExpire", "1.second.ago");
>     config.save();
>     ObjectId initial = git.getRepository().resolve("HEAD");
>     RevCommit orphan = git.commit().setMessage("orphan").call();
>     changeLastModified(orphan, subtractDays(new Date(), 365));
>     RefUpdate refUpdate = git.getRepository()
>             .updateRef("refs/heads/master");
>     refUpdate.setNewObjectId(initial);
>     refUpdate.forceUpdate();
>     FileUtils.delete(new File(git.getRepository().getDirectory(), "logs"),
>             FileUtils.RECURSIVE | FileUtils.RETRY);
>
>     git.gc().setExpire(new Date()).call();
>     Thread.sleep(4000);
>     git.gc().setExpire(new Date()).call();
>
>     assertNull(git.getRepository().resolve(orphan.name()));
> }
>
> private static Date subtractDays(Date date, int days) {
>     Calendar calendar = Calendar.getInstance();
>     calendar.setTime(date);
>     calendar.add(Calendar.DAY_OF_MONTH, days * (-1));
>     return calendar.getTime();
> }
>
> private void changeLastModified(ObjectId commitId, Date date) {
>     File objectsDirectory = new File(git.getRepository().getDirectory(),
>             "objects");
>     File commitObjectDirectory = new File(objectsDirectory,
>             commitId.name().substring(0, 2));
>     File commitObjectFile = new File(commitObjectDirectory,
>             commitId.name().substring(2));
>     commitObjectFile.setLastModified(date.getTime());
> }
>
> <<< END
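
The changeLastModified helper above relies on the loose object layout, in which an object named by a 40-character hex id is stored under objects/ in a directory named after the first two characters, in a file named after the remaining 38. A minimal sketch of that mapping using only the JDK follows; the class name `LooseObjectPath` and its `of` method are hypothetical and not part of JGit. It applies to loose objects only, since packed objects are not stored as individual files.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper, not part of JGit: locates a loose object file. */
final class LooseObjectPath {
    /** Maps a 40-hex object name to objects/<first 2 chars>/<remaining 38>. */
    static Path of(Path gitDir, String objectName) {
        if (!objectName.matches("[0-9a-f]{40}")) {
            throw new IllegalArgumentException(
                    "not a full 40-hex object name: " + objectName);
        }
        return gitDir.resolve("objects")
                .resolve(objectName.substring(0, 2))
                .resolve(objectName.substring(2));
    }

    public static void main(String[] args) {
        // The repository location is an arbitrary example path.
        Path gitDir = Paths.get("repo", ".git");
        System.out.println(
                of(gitDir, "0123456789abcdef0123456789abcdef01234567"));
    }
}
```

Note that File.setLastModified returns false when the timestamp could not be changed, so a helper built on this path may want to check that result instead of ignoring it.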
>
>
> ________________________________
> From: Matthias Sohn <matthias.sohn@xxxxxxxxx>
> To: Rüdiger Herrmann <ruediger.herrmann@xxxxxx>
> Cc: Shawn Pearce <spearce@xxxxxxxxxxx>; "jgit-dev@xxxxxxxxxxx"
> <jgit-dev@xxxxxxxxxxx>
> Sent: Thursday, April 7, 2016 4:32 PM
> Subject: Re: [jgit-dev] GarbageCollectCommand does not prune unreferenced
> commit
>
> Can you contribute this test as a new test in GarbageCollectCommandTest?
> I'll ask Christian to have a look and investigate why this still doesn't
> work.
>
> On Thu, Apr 7, 2016 at 4:21 PM, Rüdiger Herrmann <ruediger.herrmann@xxxxxx>
> wrote:
>
> Thanks for your insights. So setting gc.prunePackExpire = 1.second.ago (or
> pruneExpire for JGit < 4.3) and running GC twice with some seconds delay
> should prune the unreferenced object, right?
>
> I changed the code accordingly but the commit is still not garbage collected
> (note, I am running against HEAD)
> @Test
> public void testPruneOldOrphanCommit() throws Exception {
>     StoredConfig config = git.getRepository().getConfig();
>     config.setString("gc", null, "prunePackExpire", "1.second.ago");
>     config.setString("gc", null, "pruneExpire", "1.second.ago");
>     config.save();
>     ObjectId initial = git.getRepository().resolve("HEAD");
>     RevCommit orphan = git.commit().setMessage("orphan").call();
>     changeLastModified(orphan, subtractDays(new Date(), 365));
>     RefUpdate refUpdate = git.getRepository()
>             .updateRef("refs/heads/master");
>     refUpdate.setNewObjectId(initial);
>     refUpdate.forceUpdate();
>     FileUtils.delete(new File(git.getRepository().getDirectory(), "logs"),
>             FileUtils.RECURSIVE | FileUtils.RETRY);
>
>     git.gc().setExpire(new Date()).call();
>     Thread.sleep(4000);
>     git.gc().setExpire(new Date()).call();
>
>     assertNull(git.getRepository().resolve(orphan.name()));
> }
>
>
>
> ________________________________
> From: Matthias Sohn <matthias.sohn@xxxxxxxxx>
> To: Rüdiger Herrmann <ruediger.herrmann@xxxxxx>
> Cc: Shawn Pearce <spearce@xxxxxxxxxxx>; "jgit-dev@xxxxxxxxxxx"
> <jgit-dev@xxxxxxxxxxx>
> Sent: Thursday, April 7, 2016 1:26 PM
>
> Subject: Re: [jgit-dev] GarbageCollectCommand does not prune unreferenced
> commit
>
> In order to avoid race conditions when a pack arrives while a gc is running,
> JGit doesn't immediately prune garbage objects but puts them in a garbage
> pack, which is only pruned by the next gc after a grace period defined by
> gc.prunePackExpire [1] (1 hour by default). Before [1], this reused
> gc.pruneExpire (2 weeks by default), which led to pack file explosion; this
> was fixed recently by [1].
> [1] will be contained in 4.3, which I am going to release today.
>
> [1] https://git.eclipse.org/r/#/c/69628/
>
> -Matthias
>
> On Thu, Apr 7, 2016 at 10:23 AM, Rüdiger Herrmann <ruediger.herrmann@xxxxxx>
> wrote:
>
> Thanks for the hint. I wasn't aware that commits are reflogged. However, if
> I delete the logs directory before collecting garbage, the orphan commit is
> still not pruned.
>
> Here is the updated test that deletes the reflog:
>
> @Test
> public void testPruneOldOrphanCommit() throws Exception {
>     ObjectId initial = git.getRepository().resolve("HEAD");
>     RevCommit orphan = git.commit().setMessage("orphan").call();
>     changeLastModified(orphan, subtractDays(new Date(), 365));
>     RefUpdate refUpdate = git.getRepository()
>             .updateRef("refs/heads/master");
>     refUpdate.setNewObjectId(initial);
>     refUpdate.forceUpdate();
>     FileUtils.delete(new File(git.getRepository().getDirectory(), "logs"),
>             FileUtils.RECURSIVE | FileUtils.RETRY);
>
>     git.gc().setExpire(new Date()).call();
>
>     assertNull(git.getRepository().resolve(orphan.name()));
> }
>
>
>
> ________________________________
> From: Matthias Sohn <matthias.sohn@xxxxxxxxx>
> To: Shawn Pearce <spearce@xxxxxxxxxxx>
> Cc: Rüdiger Herrmann <ruediger.herrmann@xxxxxx>; "jgit-dev@xxxxxxxxxxx"
> <jgit-dev@xxxxxxxxxxx>
> Sent: Thursday, April 7, 2016 12:51 AM
> Subject: Re: [jgit-dev] GarbageCollectCommand does not prune unreferenced
> commit
>
> On Wed, Apr 6, 2016 at 11:19 PM, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
>
> On Wed, Apr 6, 2016 at 2:14 PM, Rüdiger Herrmann
> <ruediger.herrmann@xxxxxx> wrote:
>> You are right of course, the code shown isn't quite what I tried. Sorry
>> for the confusion. If I change it to 'unreference' the orphan commit, the
>> result is still the same.
>>
>> The correct version would reset HEAD to the previous commit so that
>> 'orphan' is actually unreferenced, like this:
>>
>> @Test
>> public void testPruneOldOrphanCommit() throws Exception {
>>     ObjectId initial = git.getRepository().resolve("HEAD");
>>     RevCommit orphan = git.commit().setMessage("orphan").call();
>>     changeLastModified(orphan, subtractDays(new Date(), 365));
>>     RefUpdate refUpdate = git.getRepository()
>>             .updateRef("refs/heads/master");
>>     refUpdate.setNewObjectId(initial);
>>     refUpdate.forceUpdate();
>
> Is orphan still in the reflog for master? or HEAD?
>
>
> I tried your test and the orphaned commit is still referenced from both the
> master's and HEAD's reflogs.
>
> -Matthias
>
>
>
>
>
>
>
>
>
> _______________________________________________
> jgit-dev mailing list
> jgit-dev@xxxxxxxxxxx
> To change your delivery options, retrieve your password, or unsubscribe from
> this list, visit
> https://dev.eclipse.org/mailman/listinfo/jgit-dev