Re: [jgit-dev] Implementing shallow clones

Matt Fischer <mattfischer84@xxxxxxxxx> wrote:
> 
> If I can get to the point of producing the set of commits at the
> edges, the rest is pretty easy (I can just feed the parents of those
> edge commits into the PackWriter as uninteresting commits to get it to
> exclude the proper things.)

That probably won't work.  If you tag the parents of the edges as
uninteresting we'll exclude any blobs that the parent contains.
If the commit we are actually sending in response to the deepen
request also uses that blob, but the client doesn't yet have that
blob, we'll send an incomplete pack.

You probably need to do something more sophisticated, like force
the edges to be parsed and then whack out their parent array so it's
an empty list.  Then the traversal won't follow into the parents,
and thus stops at the boundary.  You also would need to mark the
blobs we know the client already has (from its common have lines)
as uninteresting.
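
To make the difference concrete, here is a rough sketch using a toy
model rather than real JGit classes.  ShallowPackSketch, Commit,
naivePack and boundaryPack are all names made up for illustration,
and blobs are just strings.  It only compares which blobs each
approach would put in the pack.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy model, not JGit's RevCommit or PackWriter.
final class ShallowPackSketch {
    static final class Commit {
        final String id;
        final Set<String> blobs;      // blob ids referenced by this commit's tree
        final List<Commit> parents;

        Commit(String id, Set<String> blobs, Commit... parents) {
            this.id = id;
            this.blobs = blobs;
            this.parents = new ArrayList<>(Arrays.asList(parents));
        }
    }

    // Collect blobs from the tip down to the edges. Edges are treated as if
    // their parent list were empty, so the walk stops at the boundary.
    static Set<String> blobsDownToEdges(Commit tip, Set<Commit> edges) {
        Set<String> blobs = new TreeSet<>();
        Set<Commit> seen = new HashSet<>();
        Deque<Commit> todo = new ArrayDeque<>();
        todo.add(tip);
        while (!todo.isEmpty()) {
            Commit c = todo.poll();
            if (!seen.add(c))
                continue;
            blobs.addAll(c.blobs);
            if (!edges.contains(c))
                todo.addAll(c.parents);
        }
        return blobs;
    }

    // Naive plan: the edges' parents are uninteresting, so every blob they
    // reference is dropped, even if an edge commit still uses it.
    static Set<String> naivePack(Commit tip, Set<Commit> edges) {
        Set<String> pack = blobsDownToEdges(tip, edges);
        for (Commit edge : edges)
            for (Commit parent : edge.parents)
                pack.removeAll(parent.blobs);
        return pack;
    }

    // Suggested plan: stop at the edges and drop only the blobs the client
    // is known to have already.
    static Set<String> boundaryPack(Commit tip, Set<Commit> edges, Set<String> clientHas) {
        Set<String> pack = blobsDownToEdges(tip, edges);
        pack.removeAll(clientHas);
        return pack;
    }
}
```

Say edge commit E references blobs b1 and b2, its parent P references
b1, and the client has neither.  naivePack gives [b2], which is the
incomplete pack; boundaryPack gives [b1, b2].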

> Looking through the code, it looks to me
> like generating this set would take logic that looks sort of like a
> combination of the Topo sort and the Boundary generator, along with
> the ability to tag each commit with an integer for its depth as we go
> along.

Probably true.  Only we really don't want to fatten out the RevCommit
structure by default to add the depth tag.

But you could create a custom subclass of ObjectWalk that overrides
createCommit to return a RevCommit subclass which does have the
depth field.
Then use this new subclass only for shallow clone enumeration
support, and set the depth as the commits spool out of the next()
method.  Since you extended the ObjectWalk you might be able to wedge
something into the StartGenerator that sets up a new filter generator
to assign the depths as they spool out of the TopoSortGenerator.
It should be pretty easy: the depth is the shortest path to a commit,
so each commit just has to set its depth + 1 into its parents.
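
Roughly, the numbering step could look like the sketch below.  This
is a standalone toy, not JGit code: DepthNumberingSketch, DepthCommit
and assignDepths are hypothetical names, the topo-ordered list stands
in for whatever spools out of the TopoSortGenerator, and numbering
the tips as depth 1 is just a convention picked for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical toy model, not JGit's RevCommit or generators.
final class DepthNumberingSketch {
    static final class DepthCommit {
        final String id;
        final List<DepthCommit> parents;
        int depth = Integer.MAX_VALUE;   // MAX_VALUE means "not reached yet"

        DepthCommit(String id, DepthCommit... parents) {
            this.id = id;
            this.parents = new ArrayList<>(Arrays.asList(parents));
        }
    }

    // topoOrder must list every child before any of its parents. By the time
    // a commit comes up, all of its children have already pushed their
    // depth + 1 into it, so its own depth is final (the shortest path).
    static void assignDepths(List<DepthCommit> topoOrder, Collection<DepthCommit> tips) {
        for (DepthCommit tip : tips)
            tip.depth = 1;               // assumption: tips are depth 1
        for (DepthCommit c : topoOrder) {
            if (c.depth == Integer.MAX_VALUE)
                continue;                // not reachable from any tip
            for (DepthCommit p : c.parents)
                p.depth = Math.min(p.depth, c.depth + 1);
        }
    }
}
```

With a history T -> A -> B -> C where T also has C as a direct
parent, C ends up at depth 2 (via T) rather than 4 (via B), because
only the smaller value is kept.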

> If I were to add this logic, would it be correct to say that I
> should be making a new generator which only returns commits of a
> certain depth,

I think so.  The TopoSortGenerator is incremental, so if you put
the numbering generator into the pipeline you can also decide
when to abort.  Though aborting here is a bit tricky.  If you just
want to stop at a given depth you need to remove the parents from
the underlying PendingGenerator's queue once you realize that the
parents are impossibly deep.  You can't stop until the queue is
empty, because you might be visiting down one branch, need to cut
it off, then go visit another branch that is still in the queue.

> and somehow add parameters to the RevWalk to tell it
> that I want to instantiate this generator?  If so, what is the proper
> way to make this generator get the commits in Topo order, and do the
> necessary tagging to compute the depth on each node?

Maybe just do a "walker instanceof MySpecialWalker" inside of the
StartGenerator right after the TopoSortGenerator is created?

Like I was saying above, doing this probably requires having both the
TopoSortGenerator here and having access to the PendingGenerator's
queue so you can fully kill a branch when it's gone to the needed
depth.

Actually, you can kill it by marking it UNINTERESTING.  That'll make
it drop out of the queue because PendingGenerator skips over commits
that are uninteresting.
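
As a rough illustration of that idea, here is a toy queue loop.  It
is not the real PendingGenerator: PendingQueueSketch, Commit and walk
are hypothetical names, the queue is a plain FIFO (an assumption; it
means a commit is always first reached by its shortest path), and
tips count as depth 1.  Parents are always queued, the too-deep ones
are flagged, and the loop skips flagged commits until the queue is
empty.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical toy model, not JGit's PendingGenerator.
final class PendingQueueSketch {
    static final class Commit {
        final String id;
        final List<Commit> parents;
        int depth = Integer.MAX_VALUE;
        boolean uninteresting;           // stand-in for the UNINTERESTING flag
        boolean queued;

        Commit(String id, Commit... parents) {
            this.id = id;
            this.parents = new ArrayList<>(Arrays.asList(parents));
        }
    }

    // Returns the ids of the commits within maxDepth of the tip.
    static List<String> walk(Commit tip, int maxDepth) {
        List<String> out = new ArrayList<>();
        Deque<Commit> pending = new ArrayDeque<>();
        tip.depth = 1;
        tip.queued = true;
        pending.add(tip);
        // Keep going until the queue is empty: cutting one branch must not
        // stop us from visiting another branch that is still queued.
        while (!pending.isEmpty()) {
            Commit c = pending.poll();
            if (c.uninteresting)
                continue;                // flagged commits just drop out
            out.add(c.id);
            for (Commit p : c.parents) {
                if (!p.queued) {         // the queue always takes the parents
                    p.queued = true;
                    pending.add(p);
                }
                p.depth = Math.min(p.depth, c.depth + 1);
                if (p.depth > maxDepth)
                    p.uninteresting = true;   // too deep: kill this branch
            }
        }
        return out;
    }
}
```

For a tip T with parents A and X, where A -> B -> C and X -> Y,
walk(T, 2) returns [T, A, X]: B is cut off first, and X is still
visited afterwards because it was already sitting in the queue.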

-- 
Shawn.

