Bug 289697 - Wrong highlighting of individual changes in text compare
Summary: Wrong highlighting of individual changes in text compare
Status: REOPENED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Compare
Version: 3.5
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: ---
Assignee: Platform-Compare-Inbox CLA
QA Contact:
URL:
Whiteboard: stalebug
Keywords:
Depends on:
Blocks:
 
Reported: 2009-09-17 04:25 EDT by David Balažic CLA
Modified: 2020-11-03 03:56 EST
CC List: 1 user

See Also:


Attachments
Screenshot of problem (orange is my self-censorship) (8.59 KB, image/png)
2013-10-08 12:08 EDT, David Balažic CLA
Screenshot of compare (14.56 KB, image/png)
2018-11-13 05:45 EST, David Balažic CLA
file 1 for comparison (188 bytes, text/plain)
2020-11-03 03:56 EST, David Balažic CLA
file 2 for comparison test (183 bytes, text/plain)
2020-11-03 03:56 EST, David Balažic CLA

Description David Balažic CLA 2009-09-17 04:25:55 EDT
I have a project checked out from SVN.
I changed a line in a *.java file from:
    return zip.matches("^[a-zA-Z0-9- \\/]+$");
to:
    return zip.matches("^[a-zA-Z0-9- \\/]{8}$");

When I compared my new code to the version in SVN, the highlighting of individual changes was wrong.
The highlighted part in the old version is "\\/]{8}$" and in the new version it is "\\/]+$".
That is obviously wrong, since the "\\/]" part is unchanged, and the "$" is unchanged as well.

It should be: "{8}" and "+" highlighted.

This is the same if I switch from "Java Source Compare" to "Text Compare".

-- Configuration Details --
Product: Eclipse 1.2.0.20090618-0904 (org.eclipse.epp.package.jee.product)
Installed Features:
 org.eclipse.platform 3.5.0.v20090611a-9gEeG1HFtQcmRThO4O3aR_fqSMvJR2sJ
Comment 1 Martin Johansen CLA 2010-04-20 08:28:02 EDT
This bug is related to 271427.

The issue is with

org.eclipse.compare.contentmergeviewer.TokenComparator

It groups characters into these categories: unspecified, whitespace, digits, letters and quotes. The reported behavior is caused by consecutive unspecified characters being merged into a single token.

There are several possible solutions to this. A simple solution, which might not be the best one, is to treat each unspecified character as its own token; this solves bug 271427 and bug 289697. I would argue it is intuitive behavior for the tokenizer.

A quick fix is to change line 62 in TokenComparator.java as follows:

\begin{verbatim}
<			if (category != lastCategory) {
>			if (category != lastCategory || category == '?') {
\end{verbatim}

The deeper issue here is of course to what degree TokenComparator should take into account the semantics of the text it parses. I leave this question open.
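
As a rough illustration of the grouping described in this comment, the following self-contained Java sketch tokenizes a string by character category. It is not the Eclipse source: the class name TokenGroupingSketch, the method names and the splitUnspecified flag are hypothetical, and only the category test mirrors the one-line change proposed above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the tokenizer discussed above.
class TokenGroupingSketch {

    // Maps a character to one of the five categories named in the comment.
    static char categoryOf(char c) {
        if (Character.isWhitespace(c)) return ' ';  // whitespace
        if (Character.isDigit(c)) return '0';       // digits
        if (Character.isLetter(c)) return 'a';      // letters
        if (c == '"' || c == '\'') return '"';      // quotes
        return '?';                                 // unspecified
    }

    // Splits text into tokens. A new token starts whenever the category
    // changes; if splitUnspecified is true, every unspecified character
    // also starts its own token (the proposed quick fix).
    static List<String> tokenize(String text, boolean splitUnspecified) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char lastCategory = 0; // 0 means "no category yet"
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            char category = categoryOf(c);
            boolean startNew = category != lastCategory
                    || (splitUnspecified && category == '?');
            if (startNew && current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
            current.append(c);
            lastCategory = category;
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Tails of the two regex literals from the original report.
        String oldTail = "]+$";
        String newTail = "]{8}$";
        System.out.println(tokenize(oldTail, false));
        System.out.println(tokenize(newTail, false));
        System.out.println(tokenize(oldTail, true));
        System.out.println(tokenize(newTail, true));
    }
}
```

With splitUnspecified set to false, "]+$" stays a single token while "]{8}$" becomes "]{", "8" and "}$", so a token-level diff finds no common token to align on, which is consistent with the highlighting in the report. With it set to true, "]" and "$" are separate tokens present on both sides.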
Comment 2 David Balažic CLA 2011-05-23 13:39:14 EDT
Would it be acceptable to add an option of not grouping at all?

So a single changed letter in a word would be highlighted just by itself, while the rest is left as is?

For example:
- descendingOrder foo
+ descending0rder foo

In a change like this, currently the entire first word is highlighted. That does not really obscure the actual change, but it reveals less than it could: only the letter(s) that actually changed need to be highlighted.
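
To make the request concrete, here is a minimal Java sketch of character-level highlighting for a single contiguous change, found by trimming the common prefix and suffix of the two lines. The class and method names are hypothetical, and this is only one simple way to do it, not how the compare viewer works; lines with several separate changes would need a real character-level diff.

```java
// Hypothetical sketch: locate the changed characters between two lines
// by stripping their common prefix and common suffix.
class CharLevelChangeSketch {

    // Returns {start, endInA, endInB}: the change covers a[start, endInA)
    // and b[start, endInB), with exclusive end indices.
    static int[] changedRange(String a, String b) {
        int max = Math.min(a.length(), b.length());
        int prefix = 0;
        while (prefix < max && a.charAt(prefix) == b.charAt(prefix)) {
            prefix++;
        }
        int suffix = 0;
        // The suffix must not overlap the prefix in the shorter string.
        while (suffix < max - prefix
                && a.charAt(a.length() - 1 - suffix) == b.charAt(b.length() - 1 - suffix)) {
            suffix++;
        }
        return new int[] { prefix, a.length() - suffix, b.length() - suffix };
    }

    public static void main(String[] args) {
        String oldLine = "descendingOrder foo";
        String newLine = "descending0rder foo";
        int[] r = changedRange(oldLine, newLine);
        System.out.println("old: " + oldLine.substring(r[0], r[1]));
        System.out.println("new: " + newLine.substring(r[0], r[2]));
    }
}
```

For the example above, the changed range is the single character at index 10 ("O" on the old side, "0" on the new side), so only that character would be highlighted.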
Comment 3 David Balažic CLA 2013-10-08 12:08:02 EDT
Created attachment 236230
Screenshot of problem (orange is my self-censorship)

Another case with Eclipse 4.3.1 (Kepler SR-1)

Even though the only change is the removed '--' at the beginning of the line, the space characters are also marked.

It gets much worse if I put the '--' back in the first and third lines, like this:

--INSERT INTO C
    App...
--     Is...


The compare is then totally confused and shows the entire old block as replaced by a single line (the second one, containing "    App..."), and the last two lines of the old block as replaced by almost all of the new block. The actual changes are again just some '--' characters (SQL comment markers) removed from the beginning of lines.

(block = the roughly 15 lines that I uncommented in the original file)
Comment 4 David Balažic CLA 2018-11-13 05:45:18 EST
Created attachment 276559
Screenshot of compare

Another example.
Even though the only changes are the space character in line 1 and the "\r" at the end of line 2, the compare highlights a lot more, such as the space at the beginning of line 2, which did not change at all.

This is with a fresh install of:
Eclipse Java EE IDE for Web Developers.

Version: 2018-09 (4.9.0)
Build id: 20180917-1800
Comment 5 Eclipse Genie CLA 2020-11-03 00:39:42 EST
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're closing this bug.

If you have further information on the current state of the bug, please add it and reopen this bug. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.
Comment 6 David Balažic CLA 2020-11-03 03:54:47 EST
This still happens in:

Eclipse IDE for Enterprise Java Developers (includes Incubating components)

Version: 2020-09 (4.17.0)
Build id: 20200910-1200


I'll attach two test files that contain all the examples I mentioned before.
Just put them into a workspace and compare them.
Comment 7 David Balažic CLA 2020-11-03 03:56:18 EST
Created attachment 284639
file 1 for comparison
Comment 8 David Balažic CLA 2020-11-03 03:56:36 EST
Created attachment 284640
file 2 for comparison test