Bug 78653 - Eclipse compare support does not appear to use file inspection correctly
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Compare
Version: 3.1
Hardware: PC Windows XP
Importance: P2 major
Target Milestone: 3.1 M7
Assignee: Andre Weinand CLA
Reported: 2004-11-15 14:38 EST by Kim Letkeman CLA
Modified: 2005-05-08 09:01 EDT

Description Kim Letkeman CLA 2004-11-15 14:38:02 EST
Eclipse compare support determines the content type by passing in both the
filename and the stream for inspection. Unfortunately, it calls a version of
the interface that never performs inspection if a filename is present.
Instead, it simply fails if the filename is not bound to a content type
pattern. This means that content types defined by file inspection patterns do
not work with base Eclipse compare support. We can call the file
inspection interface directly from our own merge facade, but this does not
work for Eclipse and CVS, so there is no point in doing that before this is
fixed.
Comment 1 Rafael Chaves CLA 2005-01-12 13:24:17 EST
Content-type detection based on contents only tends to be more expensive since
the contents have to be checked against *all* content types available to the
platform (which may be few today but will tend to increase for
3.1). That API was intended for cases where the actual file name may not be
available (such as remote contents obtained through a URL).

The expected contributions to the content type infrastructure by plug-in
providers are:

a) a plug-in "officially" responsible for a file format/content type should
contribute a content type for it.
b) specialized formats of a more general content type can be adequately handled
if a sub-content type (having the more general one as base) is contributed.
c) plug-ins that want to extend an existing content type to cover file name
patterns it did not originally support can do so by contributing additional
file associations to that content type.

In all these cases, the lower in the stack the plug-in lives, the better, so
more plug-ins can benefit from its contributions.

Kim, why can't the file name of the file match its content type by taking either
the "b" or "c" approaches?
Comment 2 Kim Letkeman CLA 2005-01-12 14:09:45 EST
Eclipse compare support *does* call file inspection where there is no file 
name specified, as is the case when we call the compare support directly. 
But "compare with each other" always uses a file name, and when the file name 
is wrong Eclipse accepts the result as a failure without trying for a file-
inspection match. I realize that Eclipse does this for performance reasons, 
but that does not help us when we want to type files by token and not by 
pattern.

We've run into cases where a token method would be much better than a file 
pattern method. There are cases where the file's *location* and not its *name* 
determines the file's content-type. So far, we've always been able to find 
something unique in the base name (extensions always seem to be XML in these 
cases), but this is very brittle as well. 

Also note that there are at least three known cases where file names *will 
never* match the pattern as specified in the content-type rules.

1) ClearCase base or contributor during 3-way merge. The files are extracted 
with a temp file name like ABCDEF123 without an extension. 

2) The "compare with other version" command can compare two repository 
versions, which generates two of these untyped temp files.

3) CVS retains back-up versions of files in the workspace view and users often 
try to run "compare with each other" on them. So ... for example, we'll see 
inputs for a 2-way compare like "modelfile.emx" and "modelfile.emx.1.2". There 
is no way to get a content-type from Eclipse for the latter file using the 
usual calls. File inspection is the only way.

We currently handle these with a combination of hacks that uses the local 
file's content type (if available) or a content-type obtained from the label 
passed in by ClearCase. This is a brittle solution and would be unnecessary 
with content-type by inspection.

There are probably other cases, and since Eclipse is an open platform other 
cases are likely to crop up over time. A consistent method for typing files 
that is flexible enough so that it does not insist on filename patterns is the 
right way (in my opinion of course) to ensure that all files can be accurately 
typed. 

To summarize, the essence of the problem is not the ownership or hierarchy of 
the content-type definitions, but rather the unique physical characteristic by 
which we identify a specific content-type. Content inspection is a powerful 
tool that is currently much less useful than it should be because Eclipse 
itself does not use it correctly.
Comment 3 Rafael Chaves CLA 2005-02-24 17:29:52 EST
Re: content type resolution depending on location - see bug 69640. If you have
use cases that are not handled by that solution, please annotate that PR.

Re: content type determination based on content and name vs. content only - this
is not only about performance, it is about getting correct results as well.
Sometimes, for text-based content types (XML excepted), there is no "unique
physical characteristic" that would allow us to detect the content type. But the
final choice is up to clients of the content type API. Compare could, if regular
content/name based detection cannot find any eligible content types, try to do
content-based detection (there is API for that). The worst that could happen
would be to end up using a text comparison instead of a nice content type
specific comparison.
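
The fallback order suggested above can be sketched roughly as follows. This is only an illustration, not the Eclipse content type API: the class ContentTypeFallback, its two lookup tables, the content type id "example.model" and the "<model" token are all hypothetical, and the first step is simplified to a name-only lookup instead of the real name-plus-content check.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for a content type registry; NOT the Eclipse API. */
final class ContentTypeFallback {

    // Hypothetical registry: file extension -> content type id.
    private static final Map<String, String> BY_EXTENSION = new LinkedHashMap<>();
    // Hypothetical registry: leading token in the contents -> content type id.
    private static final Map<String, String> BY_TOKEN = new LinkedHashMap<>();

    static {
        BY_EXTENSION.put("emx", "example.model");
        BY_TOKEN.put("<model", "example.model");
    }

    /** Name-based lookup: matches only when the extension is registered. */
    static Optional<String> byName(String fileName) {
        if (fileName == null) {
            return Optional.empty();
        }
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return Optional.empty();
        }
        return Optional.ofNullable(BY_EXTENSION.get(fileName.substring(dot + 1)));
    }

    /** Content-only lookup: checks the contents against every registered token. */
    static Optional<String> byContents(byte[] contents) {
        String text = new String(contents, StandardCharsets.UTF_8).trim();
        for (Map.Entry<String, String> entry : BY_TOKEN.entrySet()) {
            if (text.startsWith(entry.getKey())) {
                return Optional.of(entry.getValue());
            }
        }
        return Optional.empty();
    }

    /** Try the cheap name-based lookup first; inspect contents only if it fails. */
    static Optional<String> detect(String fileName, byte[] contents) {
        Optional<String> type = byName(fileName);
        return type.isPresent() ? type : byContents(contents);
    }
}
```

With this ordering, a name such as "modelfile.emx.1.2" or an extensionless temp file misses the name lookup and is then typed by its contents. An empty result means the caller falls back to a plain text comparison.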
Comment 4 Andre Weinand CLA 2005-05-08 09:01:27 EDT
Changed CompareUIPlugin.getContentType to always use file inspection.