Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
RE: [egit-dev] Re: JGit merge algorithm (fwd)

Hi,

On Fri, 11 Dec 2009, Halstrick, Christian wrote:

> > would be better to document clearly that
> > 
> > 	4 * chunkNo + 0 = ConflictState.ordinal()
> > 	4 * chunkNo + 1 = begin index in the base document
> > 	4 * chunkNo + 2 = begin index in the document
> > 	4 * chunkNo + 3 = end index in the document
> 
> Aren't we loosing info here? Range in base and in the document don't 
> have to be of same length. Storing two starts and one end value is not 
> sufficient to compute the range in both docs.

Yes.  OTOH you might not want to repeat the original range in each and 
every conflict, right?

> >                            Then it would be better to have the 4*chunkNo+0
> > number refer to the document, i.e. 0 = base, 1 = ours, 2 = theirs, 3 =
> > theirs2...
> 
> Currently I have coded the following
>  	3 * chunkNo + 0 = which document (0=base, 1=firstDoc, 2=secondDoc, 
>                           ...) combined with state info
>  	3 * chunkNo + 1 = begin index in the document combined with other
>                           state info
>  	3 * chunkNo + 2 = end index in the document
> 
> Putting together all your comments and my ideas my proposal would be
> 
> 	4 * chunkNo + 0 = index of the doc (0=base, 1=first doc(ours), 
>                              2=second doc (theirs...)
>  	4 * chunkNo + 1 = ConflictState.ordinal()
>  	4 * chunkNo + 2 = begin index in the document
>  	4 * chunkNo + 3 = end index in the document
> 
> For each conflict I would expect number-of-docs chunks and one chunk
> (optionally) telling me something about the base (also to support diff3
> Format). The conflictstate for the chunk for the base could be special
>  (e.g. 3="base text for conflict") to distinguish it from really 
> conflicting chunks.

I am not so sure that you need the ConflictState at all.  If you pick +0 = 
0 (base document, common range) and +0 = -1 (base document, but 
conflicts), you could have something like this:

(0, 0, 10)	lines 0 - 10 is common
(-1, 10, 15)	lines 10 - 15 in the base document are conflicted
(1, 10, 12)	lines 10 - 12 in our document replace said lines
(2, 10, 18)	lines 10 - 18 in their document replace said lines
(0, 15, 20)	lines 15 - 20 in the base document are again common

> > Oh, and I am still not certain if adding a compareTo() method to Edit 
> > is worth it.  IMHO it would be clearer to do the check explicitely, on 
> > oursEdit and theirsEdit.
> 
> Hmm, this compareTo was needed more in earlier intermediate states of this
> Algorithm and could really be replaced by explicit calls now. I like the
> Idea to have all the merge-algorithm-relevant computations directly in one
> Class.

My hesitation is probably due to the impression that there is no 
well-defined order (even partial one) with edits, as they can be on 
different document pairs...

> > MergeFormatter: Hmm.  Somehow, assuming that the sequences are RawText 
> > instances does not feel right.  I'd be happier if there was a method 
> > in Sequence that takes an OutputStream and writes a range of items.
> 
> Here I have a different opinion. The Sequence interface is so small and 
> clean offers exactly what a merge-algorithm needed. No access to 
> specific element but just a comparison method. This would be spoiled (in 
> my eyes) if would introduce method to serialize elements to streams.
>
> Another argument I have is that a MergeFormatter is anyhow something 
> which is tied to the kind of documents which have been merged. For Java 
> Code in RawTexts it's required to add texts "<<<", ">>>", "===" to mark 
> conflicting ranges. These markers have the same type as the merged docs, 
> right? So, the formatter was in my eyes so much tied to the type of the 
> docs that is was ok to write a specific RawText version of it.
>
> But with one thing I agree: having only one formatter which is hardcoded 
> to RawText handling maybe not sufficient. We may need other formatters 
> for other document types and for other ways how to represent merges 
> (e.g. a diff3 MergeFormatter). An interface for the formatter is needed 
> in my eyes.

Fair enough.  Maybe an interface is not even needed, as you might do 
different things with the conflict list than outputting it.  IOW the 
MergeFormatter might be the only class needing said interface.

> > In spite of MergeChunk desiring to be able to reflect more than two 
> > non-base versions, the MergeFormatter handles them not really 
> > gracefully by repeating "=======" conflict separators, but only 
> > listing the name of the last one.
> 
> Fully agree, I also detected this bug. Are there any standards how to
> present n-way merges?

If there are, I do not know of them.  Which does not really make it 
unlikely, though ;-)

Ciao,
Dscho



Back to the top