Bug 37668 - [plan item] Content-type-based editor lookup
Status: VERIFIED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Resources
Version: 3.1
Hardware: All All
Importance: P4 enhancement with 7 votes
Target Milestone: 3.1 M7
Assignee: Kim Horne CLA
Keywords: plan
Duplicates: 80237
Depends on: 60291
Blocks: 78340 78654 81710
Reported: 2003-05-15 10:46 EDT by Jim des Rivieres CLA
Modified: 2005-05-10 13:49 EDT
CC List: 27 users

Description Jim des Rivieres CLA 2003-05-15 10:46:12 EDT
Content-type-based editor lookup. The choice of editor is currently based on 
file name pattern. This is not very flexible, and breaks down when 
fundamentally different types of content are found in files with 
undistinguished file names or internal formats. For example, many different 
models with specialized editors get stored in XML format files named *.xml. 
Eclipse should support a notion of content type for files and resources, and 
use these to drive decisions like which editor to use. This feature would also 
be used by team providers when doing comparisons based on file type. The 
several existing file-type registries in Eclipse should be consolidated. 
[Platform Core, Platform UI] [Theme: User experience]
Comment 1 Rafael Chaves CLA 2003-06-12 15:57:42 EDT
The non-uniform encoding support work (bug 37933) would benefit from a central
file-type registry too. 
Comment 2 Morten Christensen CLA 2003-09-30 08:26:45 EDT
Very cool. Many new applications based on the "eclipse rich client platform" 
need this as well, so it should be designed for reuse.
Comment 3 David Williams CLA 2004-02-22 18:57:32 EST
Please see 
https://bugs.eclipse.org/bugs/show_bug.cgi?id=52784
for potential contribution which might be helpful/related to content-type based 
editor lookup. 
Comment 4 Douglas Pollock CLA 2004-04-06 12:37:01 EDT
Is anything being done about this bug by Platform-Core in the 3.0 time frame? 
Comment 5 Rafael Chaves CLA 2004-04-06 13:36:38 EDT
Yes. A new revision of the API and basic implementation were released in today's
integration build. A doc about the proposed mechanism was made available today
in the Platform/Core web page (Development Resources & Planning -> Component
planning -> Proposed items):

http://dev.eclipse.org/viewcvs/index.cgi/%7Echeckout%7E/platform-core-home/main.html

Here is direct link:

http://dev.eclipse.org/viewcvs/index.cgi/%7Echeckout%7E/platform-core-home/documents/content_types.html
Comment 6 Ed Burnette CLA 2004-04-06 14:26:03 EDT
Two comments:

1) The faq says "Do Eclipse's content types have anything to do with IANA's 
MIME Media Types? Not so far." Why not? MIME types are well understood and 
flexible. You can tell if it's text or binary by looking at the first part of 
the media type, which gives you a rudimentary hierarchy (for example a text 
editor could edit text/* if there were no better matches).

2) Windows has a built-in media type lookup mechanism available through the 
registry. On Windows, it would be really nice to have Eclipse and other RCP 
applications use this mechanism at least for filename patterns (the most 
common case) so that the user only has to make associations once. I don't know 
whether or not Linux or Mac has anything similar.
Comment 7 Rafael Chaves CLA 2004-04-06 14:59:13 EDT
Re: 1) The main reason is that not every file format with a content type in
Eclipse will have a MIME type, so we could not use it as the main association
mechanism between content types and applications. We considered keeping
MIME-types as an optional property for each content type, and provide a method
findContentTypesByMIMEType (or something like that), but decided to remove it
since there was no sound use case for that (and the idea in this initial
proposal was to keep only the essential stuff to avoid distractions).

Re: 2) That is a good point. I don't think we are going to do that now, but we
should at least open the opportunity for alternative implementations of the
content type registry to be plugged in (so others can improve this if required).
Comment 8 John Ruud CLA 2004-04-08 19:10:44 EDT
This is really cool, and would hopefully be able to replace our home-grown 
framework. I'm trying to test it out to make sure that we'll be able to use it 
by the time 3.0 is released. I've added 3 content types with their own content 
describers:
'generic' is based on 'org.eclipse.core.runtime.xml', and 'message' 
and 'translation' are both based on 'generic'. 

What is unclear to me is how editors are/will get tied to content types. Is 
this part implemented yet?

I'd like to be able to select an object (resource) and, in this simplified 
example "open with" the 'generic editor', and depending on the selected object 
also either with the 'message' or 'translation' editor. Also, the default 
editor should be the most specific one (message/translation). Does this sound 
doable based on the current design?

Also, the PDE editor is currently not particularly helpful when it comes to 
defining content types, but I'm confident this will get fixed in time (and is 
not essential). I was using build I20040407
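
One way to picture the behaviour asked for here is to walk from a file's content type up through its base types and collect the editors bound along the way, so that the most specific editor comes first and can serve as the default. The following is only a sketch under assumptions: the content type names come from this comment, but the class, the editor ids, and the lookup maps are hypothetical and are not the Platform API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model (not the Eclipse API): each content type names its base type.
class ContentTypeHierarchySketch {
    // content type id -> base content type id (names taken from the comment above)
    static final Map<String, String> BASE = new HashMap<>();
    // content type id -> editor bound to it (editor ids are invented for illustration)
    static final Map<String, String> EDITOR = new HashMap<>();

    static {
        BASE.put("generic", "org.eclipse.core.runtime.xml");
        BASE.put("message", "generic");
        BASE.put("translation", "generic");
        EDITOR.put("generic", "genericEditor");
        EDITOR.put("message", "messageEditor");
        EDITOR.put("translation", "translationEditor");
    }

    // Editors applicable to a content type, most specific first.
    static List<String> editorsFor(String contentType) {
        List<String> result = new ArrayList<>();
        // Walk up the base-type chain until there is no further base type.
        for (String type = contentType; type != null; type = BASE.get(type)) {
            String editor = EDITOR.get(type);
            if (editor != null) {
                result.add(editor);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "Open With" would list every entry; the first entry would be the default.
        System.out.println(editorsFor("message"));
        System.out.println(editorsFor("translation"));
    }
}
```

With this ordering, an "Open With" menu could list every returned editor while the default is simply the first entry.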
Comment 9 Douglas Pollock CLA 2004-04-09 11:27:45 EDT
The editor look-up piece is not done.  Some thoughts have been put forward 
here: "http://www.magma.ca/~pollockd/despumate/contentBasedEditorLookUp.html".  
Since the document was first posted, we've had some face to face conversations.  
I believe that Platform Core has agreed to provide support for the following 
three things: caching the name of the top-level element while checking multiple 
content types (i.e., to prevent excessive reparsing of XML files), providing 
support for dynamically adding content types at runtime, and providing support 
for content types as preferences. 
 
Platform UI is still trying to decide what items it will complete for M9.  We 
are very interested in minimizing risk as we are getting quite close to 3.0.  
It is possible that content-based editor look-up will not be integrated into 
Platform UI in the 3.0 timeline. 
Comment 10 John Ruud CLA 2004-04-13 18:42:48 EDT
I understand from your document, Douglas, that there are obvious risks in 
pushing this through too fast (not that I wouldn’t love having it in 3.0 if at 
all possible). 

In our current design we’re choosing an editor based on an IEditorInput (not 
an InputStream). The *name* of the input (e.g. file name for FileEditorInput) 
is used to look up a corresponding cached object, so in our case there is no 
need to reparse the content from the InputStream. This is probably not the way 
most plugins will/should work because of the cost of having to activate the 
plugin up-front etc. However, another advantage with using IEditorInput is to 
support launching content-based editors on objects that are not directly 
backed by files (we’re using an “object editor input” to represent these 
objects). 

I haven’t studied the problem enough to know what the various issues are, but 
for those who have: would it be of general value for IContentDescribers to 
also support something like IEditorInput, as (arguably) the most general type 
of editor input (the XMLContentDescriber would probably assume a 
FileEditorInput)?
Comment 11 DJ Houghton CLA 2004-04-15 10:20:51 EDT
Moving to Rafael for an update.
If the UI piece to this is not on the 3.0 plan then we should note that and
defer this report.
Comment 12 Jim des Rivieres CLA 2004-04-28 13:42:01 EDT
This plan item has been split into two. The infrastructure for content-type-
based editor lookup is now in plan item bug 60291.
Comment 13 Rafael Chaves CLA 2004-04-28 18:02:31 EDT
A content type registry is in place. Since it was done too late, adoption (in
the Eclipse SDK) will be limited, editor-file associations and a central UI for
manipulating content types being the most noticeable missed use cases.

Here are the scenarios taking advantage of the content type registry:
- non-uniform file encoding in the workspace (bug 37933 - [plan item] Improve
file encoding support)
- file associations in Compare extension points (bug 51791 - Allow binding
filenames to compare extensions)
- content sensitive object contributions, being retrofitted to work with the
content type registry (bug 33018 - [Contributions] plugin.xml context menu
should not have "Run Ant..." item)

Also, the Team team is considering adopting it as well, not sure about the exact
scenario/planned schedule.

Further comments regarding infrastructure from Core for content type management
should be posted to bug 60291.
Comment 14 Rafael Chaves CLA 2004-04-28 18:04:28 EDT
Deferred for post-3.0.
Comment 15 Dejan Glozic CLA 2004-04-29 11:31:09 EDT
Rafael, could you please take a look at bug 60369? We are trying to 
dynamically choose between two internal editors depending on the context 
(enclosing project). Any suggestion using the new content registry support?
Comment 16 John Arthorne CLA 2004-05-07 15:51:49 EDT
Removing milestone
Comment 17 Douglas Pollock CLA 2004-05-11 17:38:16 EDT
This is a copy of some mini-analysis that Platform UI did....

Eclipse has support for different types of editors. When a file is opened, an 
editor is selected based on the file type. Currently, the file type is 
determined by pattern matching on the name of the file. This is similar to 
systems used by some major desktop environments, such as Windows XP.

However, in recent years, this type of application-type definition has begun 
to fail. This could probably be traced back to the rise in the family of 
document types based on the XML family of standards. It is possible to store 
application-specific data in XML files. The de facto naming convention for 
these XML files has been to use the ".xml" extension regardless of the 
application they are associated with. For example, XML files ("*.xml") can be 
configuration files, web pages, Ant build files, etc.

In Eclipse, we would like to be able to support different types of editors 
based on the actual contents of the file. This is similar to the magic system 
used on Linux. This way, you can identify different types of files based on 
distinguishing characteristics of the content itself. Ant files, for example, 
contain a top-level <project> tag. This enhancement is generally referred to 
as content-based editor look-up, and is currently a plan item for 3.0 (not 
required).
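
As an illustration of identifying a file by its top-level tag, the sketch below reads only as far as the first start element using the standard StAX API (javax.xml.stream) and never builds a full document tree. The class and method names are hypothetical, and treating a root element named "project" as an Ant build file is a simplifying assumption, not the rule the Platform applies.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: peeks at the top-level element of an XML document.
class RootElementSniffer {
    // Returns the local name of the top-level element, or null if the text is not parseable XML.
    static String rootElement(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Do not process DTDs; only the first element is of interest here.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                // Stop at the first start element; the rest of the file is never read.
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getLocalName();
                }
            }
        } catch (XMLStreamException e) {
            return null; // malformed or non-XML input
        }
        return null;
    }

    // Simplifying assumption: a root element named "project" means an Ant build file.
    static boolean looksLikeAntBuildFile(String xml) {
        return "project".equals(rootElement(xml));
    }

    public static void main(String[] args) {
        System.out.println(looksLikeAntBuildFile("<project name=\"demo\"><target name=\"build\"/></project>"));
        System.out.println(looksLikeAntBuildFile("<plugin><extension/></plugin>"));
    }
}
```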

This page describes how this will be dealt with from the perspective of the 
User Interface component of the Eclipse Platform.

Bugzilla
There are several enhancement requests related to this problem. Here is a 
brief summary of all of the existing bugs that I've found referencing this 
problem.

Bug 37668 [plan item] Content-type-based editor lookup 
Content-type-based editor lookup. The choice of editor is currently based on 
file name pattern. This is not very flexible, and breaks down when 
fundamentally different types of content are found in files with 
undistinguished file names or internal formats. For example, many different 
models with specialized editors get stored in XML format files named *.xml. 
Eclipse should support a notion of content type for files and resources, and 
use these to drive decisions like which editor to use. This feature would also 
be used by team providers when doing comparisons based on file type. The 
several existing file-type registries in Eclipse should be consolidated. 
[Platform Core, Platform UI] [Theme: User experience] 

Bug 52784 Allow extensible content type identifier 
I know there's several "defects" addressing this issue, and I know 
there's "work in progress" going on, but thought I'd append (contribute) code 
as to how we've addressed this in past versions of Eclipse ("on top" of base --
 would be better of course to be further down). Hopefully the final solution 
will be at least compatible, if it doesn't use this code directly. 

Bug 21652 [EditorMgmt] Rework editor concept (allow IAdaptable) 
Currently it is not possible to open the editor on special XML files only. 

Bug 33018 [Contributions] plugin.xml context menu should not have "Run Ant..." 
item 
In a plugin project, both "plugin.xml" and "build.xml" items have "Run Ant..." 
in their context menu. However, if "Run Ant..." from the "plugin.xml" context 
menu is selected, the resulting dialog box complains that "Config file is not 
of expected XML type". Ant cannot be run. 

Bug 37929 [Plan Item] Improve UI scalability 
Improve UI scalability. Despite efforts to ensure UI scalability with a large 
base of available tools, the Eclipse workbench still intimidates many users 
with long menus, wide toolbars, and lengthy flat lists of preferences. This 
problem is acute in large Eclipse-based products. The Platform should provide 
additional ways for controlling workbench clutter, such as further menu and 
toolbar customizability, distinguishing between novice and advanced functions, 
supporting different developer roles, and more specific object contributions 
for particular file types. [Platform UI, Platform Debug, JDT UI] [Theme: User 
experience] 

Other Proposals
The Core team for the Eclipse Platform has also prepared a complete proposal 
for the underlying support for content-based editor look-up. This proposal has 
been implemented, and an initial implementation is provided as of 
I200404060927.

Existing Work
To date, Platform UI has done no work toward this feature. Late last year, 
Debbie Wilson implemented a specific extension to object contributions for XML 
files. This extension provided the ability to contribute pop-up menu items 
based on the contents of the currently selected XML file.

Platform Core has just recently released (I200404060800 or later) code that 
should provide a framework for defining content types. This mechanism should 
provide a lot of what is required to make this work from the user interface 
perspective.

What Needs to Be Done
For Platform UI to provide support for content-based editor look-up, we would 
need to extend the "editors.exsd" extension point definition to include some 
mechanism for specifying content types. Since it can be one or more content 
types, and content type identifiers could contain any character, this will 
likely need to be a new 0..* XML element. It might be used as follows:


	<editor
		name="Ant Build File Editor"
		extensions="xml"
		icon="icons/full/obj16/ant_buildfile.gif"
		class="org.eclipse.ant.internal.ui.editor.AntEditor"
		id="org.eclipse.ant.ui.AntEditor">
		<contentType id="org.eclipse.ant.ui.antBuildFile" />
	</editor>
			
This would allow application developers to define editors in terms of the 
content types that they can define using the new content type mechanism 
in "org.eclipse.core.runtime". This particular piece is somewhat obvious.

The less obvious piece is what to do with the Java code that performs editor 
look-up. In this area, we're in a somewhat unfortunate position of having a 
lot of existing code we'd like to use, as well as a lot of code that 
duplicates the functionality of the new content type mechanism. These two 
categories of code are tightly coupled. So, for example, IEditorRegistry and 
IFileEditorMapping have a lot of existing behaviour that we would like to keep 
around, but do not support look-ups based on content.

I believe there is really only one good way to approach this problem, and that 
is to re-implement Platform UI's editor look-up mechanism -- new interfaces 
and classes that simply use the underlying content type mechanism. 
Implementations of the old API (e.g., EditorRegistry) would be modified to 
call through to the new code.

For this to work, we'd need some more modifications on the part of the Core 
team. Some initial inspection of the code provided seems to indicate that 
content types can't come from preferences, and that it is not possible to 
dynamically add a content type at runtime. Users can define their own content 
types (i.e., file associations) in the File Associations preference page. 
There would need to be a way of persisting these user-defined content types. 
Also, if we wish to pass our old API through the new content-type mechanism, 
the editor extension point would need to be able to register content types 
declared in its editor element.

A trickier way of approaching this whole integration issue might be to 
implement a new set of classes, but leave the old classes intact. But then I 
believe there might be an unclear semantic on the conflict resolution between 
the old and the new.

Risks
There are two major risks at this late point in the development cycle. The 
first potential problem is one of performance degradation -- either through 
plug-in loading, or excessive manipulation of the file system or input 
streams. The second problem is one of regression.

Content-based editor look-up is fairly obviously a risky proposition for any 
application the size of Eclipse -- or more particularly its derivatives. The 
major use case for content-based look-up is the overloading of the XML 
extension to mean many diverse things. XML parsing is typically very 
expensive. We use the editor mechanism to decide which icon to display next to 
a file in file browsing widgets, and also to decide which editor to open. One 
can easily see where this could go wrong.

It is possible to use some default XML icon until the file is first opened. 
This means that file browsing through hierarchies of XML files would not 
require parsing every XML file in the tree. However, this perhaps provides a 
confusing user experience. The user is then not capable of knowing what type 
of file it is until it is opened.

Even if we defer icon determination, and only do content-based look-up on 
opening an editor, we could still run into problems. The problem is that there 
are so many derivatives of XML files now, that we could run into a situation 
where the XML file is parsed multiple times to determine which editor to open. 
The current content-type implementation does not appear to offer any facilities 
for parsing an XML file only once (i.e., generating a DOM tree, and making it 
available to instances of IContentDescriber).
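
The parse-once idea can be sketched by caching the top-level element name on the input object, so that any number of content type checks share a single parse. This is a hypothetical illustration, not the IContentDescriber API: the XmlInput class, the describe method, and the "example.pluginManifest" id are invented here, and matching on the root element alone is a simplification.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class SharedRootElementSketch {
    // Wraps one file's text and lazily caches its top-level element name.
    static final class XmlInput {
        private final String xml;
        private String root;     // cached result of the single parse
        private boolean parsed;  // true once the parse has been attempted
        int parseCount;          // exposed only to show how often parsing happened

        XmlInput(String xml) {
            this.xml = xml;
        }

        String rootElement() {
            if (!parsed) {
                parsed = true;
                parseCount++;
                root = readRoot(xml);
            }
            return root;
        }
    }

    // Reads up to the first start element and returns its name, or null on failure.
    static String readRoot(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getLocalName();
                }
            }
        } catch (XMLStreamException e) {
            return null;
        }
        return null;
    }

    // Checks each candidate content type against the shared, cached root element.
    static String describe(XmlInput input, Map<String, String> rootByContentType) {
        for (Map.Entry<String, String> entry : rootByContentType.entrySet()) {
            if (entry.getValue().equals(input.rootElement())) {
                return entry.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> candidates = new LinkedHashMap<>();
        candidates.put("example.pluginManifest", "plugin");           // hypothetical id
        candidates.put("org.eclipse.ant.ui.antBuildFile", "project"); // id from the example above
        XmlInput input = new XmlInput("<project><target/></project>");
        System.out.println(describe(input, candidates));
        System.out.println("parses: " + input.parseCount);
    }
}
```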

And, this also means that we have to potentially load multiple plug-ins (even 
though one is all that's eventually needed) to determine the content type when 
opening a file. This might be a somewhat undesired side effect.

The second problem is one of regression in the Platform UI functionality. Any 
time we're asked to drop old functionality for new functionality, there is an 
intrinsic risk that the new code won't work the exact same as the old code.

Time Estimations
These are just rough approximations based on some guesswork. They are really 
just numbers pulled out of the air. I'd estimate five uninterrupted working days 
to figure out everywhere we use file associations, completely understand the 
integration tasks, and to write the code. Then there'd be five more days to 
write test cases, sniff test, write documentation and peer code review.
Comment 18 John-Mason P. Shackelford CLA 2004-05-11 17:59:15 EDT
To recognize an XML file as belonging to a particular type of editor a look at 
the DOCTYPE and/or the root element should decide between many cases. Anything 
much more fancy could be a rather intensive recognition process which will have 
rapidly diminishing returns, i.e. much more work for the sake of a relatively 
few cases in which there would be confusion. In such cases, one is probably 
going to want to fallback on the user arbitration rule anyway.
Comment 19 David Williams CLA 2004-05-12 00:23:47 EDT
Thanks for the update. I agree that just peeking at "beginning" of file should 
suffice, when peeking was needed at all (certainly would want to avoid whole 
DOM!). But I also wanted to remind everyone there are more file types in the 
world than just XML. JSPs are the other case I've seen that move beyond 
predefined file extensions. In principle, someone might want to peek inside for 
contentType [sic] attribute on a page directive. But, I have to admit, most 
cases I've seen users/customers struggle with are the following sort of use 
case: Eclipse (actually add-on tools) might provide some "JSP capability", 
usually all associated with extension .jsp, but this user/customer may have 
some special files and web server setup where .jsv (voice) is also desired to 
be treated "just like" JSPs by the tools. In that case, the most important 
thing is to be able to add file extensions to existing content-type definition. 
(which would then make available editors associated with "JSP Content" -- as 
well as any other functionality associated with "JSP Content"). I think this 
case is covered in the above descriptions, but just thought I'd make sure. 
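
The .jsv scenario can be sketched as follows: editors are bound to a content type rather than to extensions, so adding one extension to the content type makes every editor bound to that type available for the new files. The registry class, the content type id "example.jsp", and the editor id are hypothetical and stand in for whatever the real registry provides.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical registry: extensions and editors are both attached to a content type.
class ExtensionAssociationSketch {
    private final Map<String, Set<String>> extensionsByType = new HashMap<>();
    private final Map<String, List<String>> editorsByType = new HashMap<>();

    // Associates one more file extension with an existing content type.
    void addExtension(String contentType, String extension) {
        extensionsByType.computeIfAbsent(contentType, k -> new LinkedHashSet<>())
                .add(extension.toLowerCase(Locale.ROOT));
    }

    void bindEditor(String contentType, String editorId) {
        editorsByType.computeIfAbsent(contentType, k -> new ArrayList<>()).add(editorId);
    }

    // Editors for a file, found through the content types that claim its extension.
    List<String> editorsForFile(String fileName) {
        List<String> result = new ArrayList<>();
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return result;
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        for (Map.Entry<String, Set<String>> entry : extensionsByType.entrySet()) {
            if (entry.getValue().contains(extension)) {
                result.addAll(editorsByType.getOrDefault(entry.getKey(), new ArrayList<>()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ExtensionAssociationSketch registry = new ExtensionAssociationSketch();
        registry.addExtension("example.jsp", "jsp");
        registry.bindEditor("example.jsp", "jspEditor");
        System.out.println(registry.editorsForFile("menu.jsv")); // no association yet
        registry.addExtension("example.jsp", "jsv");             // user adds .jsv
        System.out.println(registry.editorsForFile("menu.jsv")); // now resolves via the content type
    }
}
```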
Comment 20 Michael Scharf CLA 2004-05-12 09:34:34 EDT
Editor lookup should also be project-nature specific: depending on the project 
nature(s), particular (default) editors should be used.

One example is AJDT: AJDT has its own editor and this editor is used for all 
*.java files, because it adds some gutter and allows the aspectj syntax. 
However, there might be other replacements for the java editor and if the 
setting is global, there is no way to get two such extensions to work together. 
If the choice of the editor can depend on the project nature, then different 
projects can associate different editors with .java files. If a project has 
two natures which both define an editor for the same file type, then there is a 
problem.

However, nature dependent editor selection can also reduce the need for 
content lookup. If a project does not have the nature of a foo-xml, then there 
is no need to check for foo-xml content.

The icon of the file could then also differ depending on the nature of the 
project....
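
A nature-aware lookup along these lines could consult per-nature bindings first and fall back to the global binding only when no nature supplies an editor. The sketch below is hypothetical: the class, the nature ids, and the editor ids are invented, and it reports a conflict simply by returning more than one candidate rather than resolving it.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical lookup: nature-specific editor bindings override the global default.
class NatureEditorLookupSketch {
    private final Map<String, String> globalEditors = new HashMap<>();
    private final Map<String, Map<String, String>> natureEditors = new HashMap<>();

    void setGlobalEditor(String extension, String editorId) {
        globalEditors.put(extension, editorId);
    }

    void setNatureEditor(String natureId, String extension, String editorId) {
        natureEditors.computeIfAbsent(natureId, k -> new HashMap<>()).put(extension, editorId);
    }

    // Candidate default editors; more than one entry means two natures conflict.
    Set<String> candidates(Collection<String> projectNatures, String extension) {
        Set<String> result = new TreeSet<>();
        for (String nature : projectNatures) {
            Map<String, String> bindings =
                    natureEditors.getOrDefault(nature, Collections.emptyMap());
            String editor = bindings.get(extension);
            if (editor != null) {
                result.add(editor);
            }
        }
        // No nature-specific editor: fall back to the global association, if any.
        if (result.isEmpty() && globalEditors.containsKey(extension)) {
            result.add(globalEditors.get(extension));
        }
        return result;
    }

    public static void main(String[] args) {
        NatureEditorLookupSketch lookup = new NatureEditorLookupSketch();
        lookup.setGlobalEditor("java", "javaEditor");
        lookup.setNatureEditor("example.aspectjNature", "java", "aspectjEditor");
        System.out.println(lookup.candidates(Collections.singletonList("example.aspectjNature"), "java"));
        System.out.println(lookup.candidates(Collections.<String>emptyList(), "java"));
    }
}
```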
Comment 21 Rafael Chaves CLA 2004-08-27 14:39:04 EDT
Reopening...
Comment 22 Rafael Chaves CLA 2004-08-27 15:27:51 EDT
I opened bug 69640 to address the association between content types and natures,
which seems to me a very good idea.
Comment 23 Kai-Uwe Maetzel CLA 2004-12-06 04:20:49 EST
*** Bug 80237 has been marked as a duplicate of this bug. ***
Comment 24 David Williams CLA 2005-01-05 12:49:23 EST
May I request an update on this long-planned item, for content-type based 
editor lookup? We in the WTP project will soon be in dire need of this
functionality. I've always assumed it was being worked on, and "just around the
corner" since it has such obvious importance. But now I am not so sure (of
either, that it is being worked on, or that it's obvious to the platform team how
important this is). Any status, outlook or insights would be much appreciated. 
Comment 25 Rafael Chaves CLA 2005-01-06 12:49:42 EST
David, this is being looked at by the Platform Core and UI teams. Expect
progress/resolution on this issue during the M5 cycle.
Comment 26 John Ruud CLA 2005-02-24 15:20:12 EST
I'm eager to start playing with the content-type-based editors, in order to 
make sure it will work for us, and before they freeze the APIs again (in M6?). 
Is this already part of the M5 code, or will it be available soon?
Comment 27 Kim Horne CLA 2005-03-16 06:35:40 EST
I've started work on this piece and with any luck it should be available for M6.
 At the very least, the API should be in place (if non-functional).
Comment 28 Kim Horne CLA 2005-03-19 12:09:58 EST
Initial implementation in HEAD.  I would appreciate it if all interested parties could play with this week's 
integration build as soon as it's available.  In particular, I'm curious if the API I've added to the IDE class 
and to IEditorRegistry is sufficient.

To see the support in action, modify your favourite <editor> extension such that it has a child of the 
form: <contentTypeBinding contentTypeId="someID"/>.  This will bind the editor to files of the 
specified content type.
Comment 29 Michael Van Meekeren CLA 2005-03-21 11:18:33 EST
Moving this to Kim as she is now working on it.
Comment 30 Dani Megert CLA 2005-03-22 03:54:42 EST
Kim, does this work also add content-type support to the
"org.eclipse.ui.editors" extension point?
Comment 31 Kim Horne CLA 2005-03-22 06:35:24 EST
Erm, yes.  That's all it does, in fact.  :)  To your editor extensions you may
now add <contentTypeBinding /> children that advertise what content types the
editor may work against.  You can do this in lieu of filename and extension
attributes, although they do work together as well.
Comment 32 Kim Horne CLA 2005-03-28 16:25:21 EST
Pushing milestone to M7.
Comment 33 Michael Van Meekeren CLA 2005-03-29 10:28:36 EST
can you comment on what's left to do?
Comment 34 Kim Horne CLA 2005-03-29 10:48:42 EST
Basically we need to decide on the correct selection policy.   Currently any
editor bound by content type always takes priority over old file name/file
extension bindings.  We need to decide how the "default" attribute for editors
comes into consideration for content type bindings (if at all).
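
The interim policy described here (content type bindings ahead of file name or extension bindings) can be pictured as a simple ordered merge. This is a sketch of that stated ordering only; the class, method, and editor ids are hypothetical, and it is not the code in HEAD.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of "content type bindings win over file name bindings".
class EditorPrioritySketch {
    // Merges both lists: content type matches first, no editor listed twice.
    static List<String> orderedEditors(List<String> boundByContentType,
                                       List<String> boundByFileName) {
        Set<String> ordered = new LinkedHashSet<>(boundByContentType);
        ordered.addAll(boundByFileName); // duplicates keep their earlier position
        return new ArrayList<>(ordered);
    }

    // The default editor under this policy is simply the first entry, if any.
    static String defaultEditor(List<String> boundByContentType,
                                List<String> boundByFileName) {
        List<String> ordered = orderedEditors(boundByContentType, boundByFileName);
        return ordered.isEmpty() ? null : ordered.get(0);
    }

    public static void main(String[] args) {
        List<String> byContent = Collections.singletonList("antEditor");
        List<String> byName = Arrays.asList("xmlEditor", "antEditor", "textEditor");
        System.out.println(orderedEditors(byContent, byName));
        System.out.println(defaultEditor(byContent, byName));
    }
}
```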

We also need to see if we can bring some harmony into the File
Associations/Content Types preference page relationship.
Comment 35 Bob Foster CLA 2005-03-29 17:31:38 EST
(In reply to comment #34)
> Currently any
> editor bound by content type always takes priority over old file name/file
> extension bindings.

Why? One would suppose extensions mapped to content types and at that point were
peers.

> We need to decide how the "default" attribute for editors
> comes into consideration for content type bindings (if at all).

I'm kind of hoping this can be done without breaking existing plug-ins, losing
features, etc.
Comment 36 Kim Horne CLA 2005-03-29 18:10:06 EST
"Why" because it was easy.  Like I said, the policy needs to be addressed for M7. 
Comment 37 Thomas Hallgren CLA 2005-03-31 03:20:32 EST
Has anyone considered mapping of composite file extensions? I'd like to map
.jra.gz to my JRA editor. Normally that editor would use .jra but it is capable
of wrapping its input in a GZIPInputStream if the extension is .jra.gz.

The content-type of a .gz is too generic so I can't really use that. Apparently
I can't create a specific mapping for .jra.gz. A scheme selecting the "longest
match" would be preferable IMO.
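
A "longest match" scheme of the kind suggested here might look like the sketch below, which picks the editor registered for the longest suffix that the file name ends with. The class, the method, and the editor ids are hypothetical; only the .jra and .jra.gz extensions come from the comment.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical longest-suffix lookup for composite extensions such as "jra.gz".
class LongestExtensionMatchSketch {
    // Returns the editor registered for the longest matching suffix, or null if none matches.
    static String editorFor(String fileName, Map<String, String> editorsByExtension) {
        String name = fileName.toLowerCase(Locale.ROOT);
        String best = null;
        for (String extension : editorsByExtension.keySet()) {
            boolean matches = name.endsWith("." + extension);
            if (matches && (best == null || extension.length() > best.length())) {
                best = extension;
            }
        }
        return best == null ? null : editorsByExtension.get(best);
    }

    public static void main(String[] args) {
        Map<String, String> editors = new HashMap<>();
        editors.put("gz", "archiveEditor");
        editors.put("jra", "jraEditor");
        editors.put("jra.gz", "jraEditor");
        System.out.println(editorFor("trace.jra.gz", editors));
        System.out.println(editorFor("backup.gz", editors));
    }
}
```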
Comment 38 Gunnar Wagenknecht CLA 2005-03-31 04:38:41 EST
(In reply to comment #37)
> Has anyone considered mapping of composite file extensions? 

This is covered in bug 22905. 
Comment 39 Rafael Chaves CLA 2005-03-31 10:54:24 EST
Re: comment 37 - please open a PR against Platform/Runtime for that to be
handled at the content type level.
Comment 40 Gunnar Wagenknecht CLA 2005-03-31 13:00:51 EST
open bug 89859
Comment 41 John Ruud CLA 2005-04-04 13:04:56 EDT
With the latest integration build ('I20050401-1645'), the 'Open With' context 
menu is currently displaying *all* editors that are matching by file 
extension, as opposed to only those that are appropriate based on the content 
of the EditorInput. It looks like OpenWithMenu.fill(Menu menu, int index) is 
using getEditors(String fileName, IContentType contentType), which is not 
filtering out the editors that are matching by file name (and extension) but 
not by content.

In my case around 12 editors are displayed under the 'open with' menu, while 
only 2 are appropriate based on the content of the files (the other editors 
are *not* appropriate).

Comment 42 Kim Horne CLA 2005-04-04 13:10:04 EDT
Please enter a bug report along with steps to reproduce.  I can't duplicate the
problem you're seeing...
Comment 43 John Ruud CLA 2005-04-04 14:30:29 EDT
My mistake. The 'extensions' element of the editors was filled out in addition 
to the contentTypeBinding (so removing the extensions element fixed the 
problem)
Comment 44 Kim Horne CLA 2005-04-19 14:49:38 EDT
I would like to close this plan item.  The major outstanding issues are how the
file association preference page interacts with content types (Bug 91965) and
the resolution policy (Bug 91966).  Interested parties should contribute to
those reports.
Comment 45 Kim Horne CLA 2005-05-10 13:49:01 EDT
Marking as verified for the I20050509-2010 testpass.