
Re: [jgit-dev] Looking for an example of a custom RevWalk Sort

On 12 August 2014 21:09, Michael O'Cleirigh <michael.ocleirigh@xxxxxxxxxxx> wrote:

Hello,

 

I’m working on some history rewriting programs for a subversion to git conversion (>70,000 commits).   When we do the initial subversion to git conversion we translate svn:externals properties on the branch into a fusion.dat file in the root directory of the branch.

 

This is sort of like a submodule but within the same repository.  It maps a subdirectory name to a commit id.  We have a custom maven plugin that can do a special subtree like merge to materialize the subdirectories from the commit object given.


If I understand you correctly, this submodule stuff is not a standard part of SVN, but a custom approach used by your project?

So the contents of this fusion.dat file (in the root of the commit's file tree) are something like this?

src/main/java/libA -> d065748c3fcbc3d2449012ac75b02fba962fe735
src/main/java/libB -> ac2971e610a8f784ba74644aaff46276a24d6bc3

...and then the file tree of commit d065748c3fcbc3d2449012ac75b02fba962fe735 needs to be stuck into the file tree of the current commit at the location of src/main/java/libA - and this is done by your maven plugin?
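
To make the question concrete, here is a minimal Java sketch of reading such a file. It assumes the "path -> commit id" layout guessed at above, which has not been confirmed; the class and method names (FusionDatParser, parse) are hypothetical and not part of JGit or the maven plugin.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: parses lines of the ASSUMED form "<path> -> <40-hex commit id>".
final class FusionDatParser {

    // Returns subdirectory path -> commit id, preserving file order.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> mapping = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate blank lines
            }
            int sep = trimmed.indexOf("->");
            if (sep < 0) {
                throw new IllegalArgumentException("Malformed line: " + line);
            }
            String path = trimmed.substring(0, sep).trim();
            String commitId = trimmed.substring(sep + 2).trim();
            // A full SHA-1 commit id is 40 lowercase hex characters.
            if (!commitId.matches("[0-9a-f]{40}")) {
                throw new IllegalArgumentException("Not a commit id: " + commitId);
            }
            mapping.put(path, commitId);
        }
        return mapping;
    }
}
```

The resulting map gives, for each commit carrying a fusion.dat, the commit ids it depends on.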

I have several history-rewriting programs that use a reverse and topo sort, and they end up rewriting the original commit ids that were stored initially in these fusion.dat files.


a) So you're not planning to get rid of the fusion.dat files? You plan to continue using your maven plugin to materialize the subdirectories?

b) What rewrites are you doing, if they're not removing or replacing the .dat files?
 

I have a tool that should work to rewrite these fusion.dat files using the old-commit-to-new-commit records from the previous history rewriting; however, the sort order of the commits is not exactly what I need.

 

If I look at a commit with this fusion.dat file I know the ids of the commits that have to be processed before this one.  You can imagine that the aggregate branch contains the fusion.dat file but it is referencing lateral branches.  In Subversion these might have all been in a single commit, but in Git there is the top-level branch and each individual module branch.


The BFG Repo-Cleaner has a similar-ish problem when it tries to rewrite commit ids that are embedded in commit messages. It's not guaranteed to have encountered that commit id before (even though it uses a reverse and topo sort), as the id could be from a lateral branch. So it just attempts, in a very simple way, to recursively clean that commit id and the history behind it. That works pretty well most of the time because it memoizes all cleaning operations on git id, but it occasionally risks blowing up with a StackOverflowError if the commit id's unseen history is too deep.
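
For illustration only, the recursive-plus-memoized idea can be sketched in Java roughly as follows. This is not the BFG's actual code (the BFG is written in Scala); the MemoizedCleaner class, the in-memory parent map, and the fake "new id" computation are all assumptions made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: clean a commit by first cleaning its parents, caching every result.
final class MemoizedCleaner {
    private final Map<String, List<String>> parentsOf;          // assumed graph: commit id -> parent ids
    private final Map<String, String> cleaned = new HashMap<>(); // memo: old id -> new id
    private int computations = 0;

    MemoizedCleaner(Map<String, List<String>> parentsOf) {
        this.parentsOf = parentsOf;
    }

    String clean(String oldId) {
        String known = cleaned.get(oldId);
        if (known != null) {
            return known; // already cleaned, possibly via another branch
        }
        // Recursion depth equals the length of unseen history, hence the
        // StackOverflowError risk on very deep, not-yet-visited ancestry.
        List<String> newParents = new ArrayList<>();
        for (String parent : parentsOf.getOrDefault(oldId, List.of())) {
            newParents.add(clean(parent));
        }
        computations++;
        // Stand-in for real rewriting: derive a fake new id from the old id and new parents.
        String newId = Integer.toHexString(Objects.hash(oldId, newParents));
        // Plain get/put rather than computeIfAbsent, which must not be re-entered recursively.
        cleaned.put(oldId, newId);
        return newId;
    }

    int computations() {
        return computations;
    }
}
```

Because of the memo, asking for a commit from a lateral branch costs nothing if an earlier walk already reached it.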

Incidentally, depending on what clean-up operations you're doing, you might be able to make use of the BFG as your cleaner, adding your fusion.dat file-updater as a bfg.Cleaner[Seq[Tree.Entry]].

I’d like to add in my own sorter so that the RevWalk will consider this additional interdependence when ordering the results.

 

I can sort of see how I might subclass RevWalk and TopoSortGenerator to include my additional sort constraint data.

 

But those classes are all package scoped, so they don't seem designed to be extended directly by JGit users.


Just my 2 cents: if you want to do this custom sorting, and given this sounds like a one-off, you're probably best off just making a small JGit fork and writing a TopoSortGenerator that works exactly how you want. Changing org.eclipse.jgit.revwalk.RevSort and all that lot for a general solution might well require a breaking API change.
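
As a rough idea of the ordering such a generator would need to produce, here is a standalone Java sketch that does not touch JGit at all. It treats the fusion.dat references as extra "must come first" edges next to the ordinary parent edges and runs Kahn's algorithm over both. The class name and the plain string-keyed maps are hypothetical; a real version would work on RevCommit objects inside the fork.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: oldest-first topological order honouring parents AND extra dependencies.
final class DependencyAwareTopoSort {

    static List<String> sort(Map<String, List<String>> parents,
                             Map<String, List<String>> extraDeps) {
        // commit -> every commit that must be emitted before it
        Map<String, Set<String>> prereqs = new LinkedHashMap<>();
        addEdges(prereqs, parents);
        addEdges(prereqs, extraDeps); // e.g. ids read from fusion.dat

        Map<String, Integer> remaining = new HashMap<>();
        Map<String, List<String>> dependents = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : prereqs.entrySet()) {
            remaining.put(e.getKey(), e.getValue().size());
            for (String p : e.getValue()) {
                dependents.computeIfAbsent(p, k -> new ArrayList<>()).add(e.getKey());
            }
        }

        Deque<String> ready = new ArrayDeque<>();
        for (String commit : prereqs.keySet()) {
            if (remaining.get(commit) == 0) {
                ready.add(commit);
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String commit = ready.remove();
            order.add(commit);
            for (String d : dependents.getOrDefault(commit, List.of())) {
                if (remaining.merge(d, -1, Integer::sum) == 0) {
                    ready.add(d);
                }
            }
        }
        if (order.size() != prereqs.size()) {
            throw new IllegalStateException("Cycle between parent and extra dependencies");
        }
        return order;
    }

    private static void addEdges(Map<String, Set<String>> prereqs,
                                 Map<String, List<String>> edges) {
        for (Map.Entry<String, List<String>> e : edges.entrySet()) {
            Set<String> set = prereqs.computeIfAbsent(e.getKey(), k -> new LinkedHashSet<>());
            for (String p : e.getValue()) {
                set.add(p);
                prereqs.computeIfAbsent(p, k -> new LinkedHashSet<>());
            }
        }
    }
}
```

With this ordering, every commit referenced from a fusion.dat has already been rewritten by the time the referencing commit is processed, so its new id is available in the old-to-new map.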

Apologies if my questions appear dim, I have a bit of a cold and may be misunderstanding you (also there are brighter people on this mailing list than me, I just had my interest piqued because of my work on the BFG repo-cleaner).

best regards,
Roberto

