[jgit-dev] Implementing shallow clones

The company I work for uses jgit, as part of Gerrit, and occasionally
we run into situations where it would be nice to have support for
shallow clones.  Currently, jgit doesn't support this (in fact it
crashes the program if you try to ask for one), so I thought I would
look into how difficult it would be to patch this functionality in.

Overall, it looks like it would be pretty straightforward--there are a
couple of additional messages to handle in the protocol, but those are
easy.  The tougher bit is determining the set of commits to
send.  The protocol specifies a depth n, and the set of commits must
be all commits less than n commits away from a wanted object.  It
looks to me like the RevWalk infrastructure ought to be flexible
enough to support this action, but I wanted to see if anybody had any
advice on how best to implement it.  The mainline git implementation
does it by topologically sorting the commit graph, walking along it,
and assigning a number to each commit that says how many nodes away
this commit is from the nearest want-ref.  The protocol then requires
the server to report back the set of commits at the edges of this
search, i.e. the ones that have parents we won't be sending because
they're beyond the requested depth.  Then, obviously, we also
have to build the pack such that it only includes the requested
commits.
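
To make the depth rule concrete, here is a minimal sketch in plain Java
with no jgit types.  It is not how mainline git or jgit does it: instead
of a topological sort it uses a breadth-first walk from the wants, which
also yields the distance to the nearest want.  The class DepthWalk, the
method computeDepths, and the map-of-parents graph representation are
all hypothetical names made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DepthWalk {
    /**
     * Maps every commit that is fewer than maxDepth steps away from some
     * wanted commit to its distance from the nearest want (a want is 0).
     * The graph is given as commit id -> list of parent ids.
     */
    static Map<String, Integer> computeDepths(Map<String, List<String>> parents,
            Collection<String> wants, int maxDepth) {
        Map<String, Integer> depth = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        // Seed the walk with all wants at distance 0 (multi-source BFS).
        for (String w : wants) {
            if (maxDepth > 0 && depth.putIfAbsent(w, 0) == null) {
                queue.add(w);
            }
        }
        while (!queue.isEmpty()) {
            String commit = queue.remove();
            int next = depth.get(commit) + 1;
            if (next >= maxDepth) {
                continue; // parents would fall outside the requested depth
            }
            for (String p : parents.getOrDefault(commit, List.of())) {
                // The first visit in BFS order is the shortest distance.
                if (depth.putIfAbsent(p, next) == null) {
                    queue.add(p);
                }
            }
        }
        return depth;
    }
}
```

With a depth of 1 this returns only the wants themselves; with a depth
of 2 it adds their direct parents, and so on.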

If I can get to the point of producing the set of commits at the
edges, the rest is pretty easy (I can just feed the parents of those
edge commits into the PackWriter as uninteresting commits to get it to
exclude the proper things).  Looking through the code, it seems to me
that generating this set would take logic that is roughly a
combination of the Topo sort and the Boundary generator, along with
the ability to tag each commit with an integer for its depth as we go
along.  If I were to add this logic, would it be correct to say that I
should be making a new generator which only returns commits of a
certain depth, and somehow add parameters to the RevWalk to tell it
that I want to instantiate this generator?  If so, what is the proper
way to make this generator get the commits in Topo order, and do the
necessary tagging to compute the depth on each node?
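
For the edge step, a rough sketch of what I mean, again in plain Java
without any jgit types.  It assumes the set of commits within the
requested depth has already been computed, and it follows my description
above (an edge commit is one with at least one parent we won't send),
which may not match exactly how mainline git marks shallow commits.  The
names ShallowEdges, edgeCommits and uninterestingParents are
hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ShallowEdges {
    /** Included commits that have at least one parent outside the included set. */
    static Set<String> edgeCommits(Map<String, List<String>> parents,
            Set<String> included) {
        Set<String> edges = new TreeSet<>();
        for (String commit : included) {
            for (String p : parents.getOrDefault(commit, List.of())) {
                if (!included.contains(p)) {
                    edges.add(commit);
                    break;
                }
            }
        }
        return edges;
    }

    /**
     * Parents of edge commits that are not themselves included.  These are
     * the ids one would mark uninteresting before building the pack.
     */
    static Set<String> uninterestingParents(Map<String, List<String>> parents,
            Set<String> included, Set<String> edges) {
        Set<String> result = new TreeSet<>();
        for (String edge : edges) {
            for (String p : parents.getOrDefault(edge, List.of())) {
                // A parent can still be within depth via another path; keep it.
                if (!included.contains(p)) {
                    result.add(p);
                }
            }
        }
        return result;
    }
}
```

Filtering on the included set matters when an edge commit has one
parent that is out of range and another that is still reachable within
the depth through a different want.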

Thanks,
Matt

