RE: [egit-dev] Re: JGit merge algorithm (fwd)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]

RE: [egit-dev] Re: JGit merge algorithm (fwd)

From: "Halstrick, Christian" <christian.halstrick@xxxxxxx>
Date: Mon, 14 Dec 2009 18:05:02 +0100
Accept-language: en-US, de-DE
Acceptlanguage: en-US, de-DE
Delivered-to: egit-dev@xxxxxxxxxxx
List-archive: <https://dev.eclipse.org/mailman/private/egit-dev>
List-help: <mailto:egit-dev-request@eclipse.org?subject=help>
List-subscribe: <https://dev.eclipse.org/mailman/listinfo/egit-dev>, <mailto:egit-dev-request@eclipse.org?subject=subscribe>
List-unsubscribe: <https://dev.eclipse.org/mailman/listinfo/egit-dev>, <mailto:egit-dev-request@eclipse.org?subject=unsubscribe>
Thread-index: Acp6hg7L4kGJKeSTQo+maGuCQAPRtACVmbLA
Thread-topic: [egit-dev] Re: JGit merge algorithm (fwd)

Hi,

next round on the merge algorithm has started. I added a new patch set
at http://egit.eclipse.org/r/#change,140. Thanks to Jo for his comments,
I incorporated some of his work into that. 
This patchset contains in contrast to the previous

- removal of compareTo method in Edit
- addition of some sentinel value for Edit iterators.
  Jo, I guess you don't like that one but after playing with the
  Implementation a good sentinel value saves a lot of if's and
  Special handling.
- introduction of a nextEdit() method which takes
  care to insert the sentinel value.
  Jo, I decided against changing the iterator in EditList, because
  I don't like breaking the standard Iterator semantics which
  throws NoSuchElementException. 
- writing of the ConflictState explicitly into each chunk in the
  the MergeResult. Here are some optimizations suggested from
  Jo pending.
- .. some minor stuff

Ciao
  Chris

> -----Original Message-----
> From: Johannes Schindelin [mailto:Johannes.Schindelin@xxxxxx]
> Sent: Friday, December 11, 2009 6:23 PM
> To: Halstrick, Christian
> Cc: EGit developer discussion
> Subject: RE: [egit-dev] Re: JGit merge algorithm (fwd)
> 
> Hi,
> 
> On Fri, 11 Dec 2009, Halstrick, Christian wrote:
> 
> > > would be better to document clearly that
> > >
> > > 	4 * chunkNo + 0 = ConflictState.ordinal()
> > > 	4 * chunkNo + 1 = begin index in the base document
> > > 	4 * chunkNo + 2 = begin index in the document
> > > 	4 * chunkNo + 3 = end index in the document
> >
> > Aren't we loosing info here? Range in base and in the document don't
> > have to be of same length. Storing two starts and one end value is
> not
> > sufficient to compute the range in both docs.
> 
> Yes.  OTOH you might not want to repeat the original range in each and
> every conflict, right?
> 
> > >                            Then it would be better to have the
> 4*chunkNo+0
> > > number refer to the document, i.e. 0 = base, 1 = ours, 2 = theirs,
> 3 =
> > > theirs2...
> >
> > Currently I have coded the following
> >  	3 * chunkNo + 0 = which document (0=base, 1=firstDoc,
> 2=secondDoc,
> >                           ...) combined with state info
> >  	3 * chunkNo + 1 = begin index in the document combined with other
> >                           state info
> >  	3 * chunkNo + 2 = end index in the document
> >
> > Putting together all your comments and my ideas my proposal would be
> >
> > 	4 * chunkNo + 0 = index of the doc (0=base, 1=first doc(ours),
> >                              2=second doc (theirs...)
> >  	4 * chunkNo + 1 = ConflictState.ordinal()
> >  	4 * chunkNo + 2 = begin index in the document
> >  	4 * chunkNo + 3 = end index in the document
> >
> > For each conflict I would expect number-of-docs chunks and one chunk
> > (optionally) telling me something about the base (also to support
> diff3
> > Format). The conflictstate for the chunk for the base could be
> special
> >  (e.g. 3="base text for conflict") to distinguish it from really
> > conflicting chunks.
> 
> I am not so sure that you need the ConflictState at all.  If you pick
> +0 =
> 0 (base document, common range) and +0 = -1 (base document, but
> conflicts), you could have something like this:
> 
> (0, 0, 10)	lines 0 - 10 is common
> (-1, 10, 15)	lines 10 - 15 in the base document are conflicted
> (1, 10, 12)	lines 10 - 12 in our document replace said lines
> (2, 10, 18)	lines 10 - 18 in their document replace said lines
> (0, 15, 20)	lines 15 - 20 in the base document are again common
> 
> > > Oh, and I am still not certain if adding a compareTo() method to
> Edit
> > > is worth it.  IMHO it would be clearer to do the check explicitely,
> on
> > > oursEdit and theirsEdit.
> >
> > Hmm, this compareTo was needed more in earlier intermediate states of
> this
> > Algorithm and could really be replaced by explicit calls now. I like
> the
> > Idea to have all the merge-algorithm-relevant computations directly
> in one
> > Class.
> 
> My hesitation is probably due to the impression that there is no
> well-defined order (even partial one) with edits, as they can be on
> different document pairs...
> 
> > > MergeFormatter: Hmm.  Somehow, assuming that the sequences are
> RawText
> > > instances does not feel right.  I'd be happier if there was a
> method
> > > in Sequence that takes an OutputStream and writes a range of items.
> >
> > Here I have a different opinion. The Sequence interface is so small
> and
> > clean offers exactly what a merge-algorithm needed. No access to
> > specific element but just a comparison method. This would be spoiled
> (in
> > my eyes) if would introduce method to serialize elements to streams.
> >
> > Another argument I have is that a MergeFormatter is anyhow something
> > which is tied to the kind of documents which have been merged. For
> Java
> > Code in RawTexts it's required to add texts "<<<", ">>>", "===" to
> mark
> > conflicting ranges. These markers have the same type as the merged
> docs,
> > right? So, the formatter was in my eyes so much tied to the type of
> the
> > docs that is was ok to write a specific RawText version of it.
> >
> > But with one thing I agree: having only one formatter which is
> hardcoded
> > to RawText handling maybe not sufficient. We may need other
> formatters
> > for other document types and for other ways how to represent merges
> > (e.g. a diff3 MergeFormatter). An interface for the formatter is
> needed
> > in my eyes.
> 
> Fair enough.  Maybe an interface is not even needed, as you might do
> different things with the conflict list than outputting it.  IOW the
> MergeFormatter might be the only class needing said interface.
> 
> > > In spite of MergeChunk desiring to be able to reflect more than two
> > > non-base versions, the MergeFormatter handles them not really
> > > gracefully by repeating "=======" conflict separators, but only
> > > listing the name of the last one.
> >
> > Fully agree, I also detected this bug. Are there any standards how to
> > present n-way merges?
> 
> If there are, I do not know of them.  Which does not really make it
> unlikely, though ;-)
> 
> Ciao,
> Dscho

References:
- [egit-dev] Re: JGit merge algorithm (fwd)
  - From: Johannes Schindelin
- RE: [egit-dev] Re: JGit merge algorithm (fwd)
  - From: Halstrick, Christian
- RE: [egit-dev] Re: JGit merge algorithm (fwd)
  - From: Johannes Schindelin
- RE: [egit-dev] Re: JGit merge algorithm (fwd)
  - From: Halstrick, Christian
- RE: [egit-dev] Re: JGit merge algorithm (fwd)
  - From: Johannes Schindelin

Prev by Date: Re: [egit-dev] [RFC] Talk on EGit/JGit at EclipseCon
Next by Date: [egit-dev] Index and cache
Previous by thread: RE: [egit-dev] Re: JGit merge algorithm (fwd)
Next by thread: [egit-dev] http server for tests?
Index(es):
- Date
- Thread

Breadcrumbs