Bug 37696 - [plan item] Remove dependency on Xerces
Summary: [plan item] Remove dependency on Xerces
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Resources
Version: 2.1
Hardware: All All
Importance: P4 enhancement with 2 votes
Target Milestone: 3.0
Assignee: DJ Houghton CLA
QA Contact:
URL:
Whiteboard:
Keywords: plan
Duplicates: 39187
Depends on: 21386 44609 44751 44868
Blocks:
Reported: 2003-05-15 11:21 EDT by Jim des Rivieres CLA
Modified: 2004-04-02 15:37 EST (History)
CC List: 23 users

Description Jim des Rivieres CLA 2003-05-15 11:21:47 EDT
Remove dependency on Xerces. The Xerces plug-in currently provides XML support 
for the Eclipse platform. XML support is now incorporated into J2SE 1.4, and 
the presence of the Xerces plug-in can create conflicts. Eclipse Platform 
should consistently use the built-in XML support that ships with JDK 1.4, or 
possibly an alternative XML parser such as XMLPull which has a much smaller 
footprint. [Platform Core]
Comment 1 Cagatay Kavukcuoglu CLA 2003-05-22 21:55:02 EDT
Also see bug 36643.
Comment 2 Martin Boel CLA 2003-05-26 09:02:59 EDT
A word of warning: the J2SE XML implementation has its own bugs and its own
performance profile. Another approach is to rename the packages in
"org.apache.xerces" to "org.apache.eclipsexerces" in the Apache sources. This
will remove name clashes and maintain the current level of bugs and
performance. If jdk 1.3 is to be supported by eclipse the xerces package has to
be provided anyway. Please observe that replacing xerces with XMLPull or another
equivalent will not remove the risk of namespace clashes.
Comment 3 Bob Foster CLA 2003-05-27 15:59:56 EDT
Any approach that keeps the platform's XML parsing solution out of other 
plugins' classloaders will do. Renaming the packages is one such, which would 
require that Eclipse maintain its own version of Xerces. Another approach 
would be to isolate the use of Xerces in its own classloader (what plugins 
have to do now if they want to use another version of Xerces), which would 
require refactoring how Xerces is used and make it somewhat less convenient.
Comment 4 Robb Wiedrich CLA 2003-06-09 14:54:15 EDT
Why not code to the JAXP specification, which removes the dependency on a 
particular version of xerces?  This would allow you to use both the version of 
xerces as provided by the JDK 1.4 and also any xerces that you might ship with 
Eclipse.
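
A minimal sketch of what coding to JAXP could look like, assuming a small in-memory document. The class name JaxpDomSketch and the sample XML are illustrative only. The code references only javax.xml.parsers, org.w3c.dom and org.xml.sax types, so whichever parser the JAXP factory lookup finds is the one that gets used.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class JaxpDomSketch {
    // Parses XML text through the JAXP factory; no org.apache.xerces import is needed.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Wrapped so callers in this sketch need no checked-exception handling.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not taken from any real plug-in.
        Document doc = parse("<plugin id=\"example\"/>");
        System.out.println(doc.getDocumentElement().getAttribute("id"));
    }
}
```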
Comment 5 Michael Valenta CLA 2003-06-19 10:19:58 EDT
FYI: Team/CVS uses XML for import/export of project sets and for storing 
repository state and commit comment history.
Comment 6 Dorian Birsan CLA 2003-06-19 10:20:47 EDT
We use Xerces in Help to parse help contributions (toc.xml and context.xml), as 
well as to implement our own proprietary working sets until the platform 
provides non-UI interfaces for working sets.

Also, in Update/Install, we parse site and feature descriptions.
Comment 7 Philipe Mulet CLA 2003-06-19 10:26:41 EDT
JDT/Core uses XML for persisting classpath settings (.classpath file and in 
metadata area).
Comment 8 Erich Gamma CLA 2003-06-19 10:28:29 EDT
JDT UI/JDT text uses xerces to parse template files.
Comment 9 Dejan Glozic CLA 2003-06-19 10:28:59 EDT
PDE uses XML for parsing just about everything (plugin.xml, fragment.xml, 
feature.xml, site.xml) in its multi-page editors.

Usage in the manifest editor (plugin.xml/fragment.xml) is particularly tricky 
because we use DOM support for model reconciler (to provide content outline 
support with the source page).

Update also persists site bookmark information using an XML file.
Comment 10 Jed Anderson CLA 2003-06-19 11:19:33 EDT
The Autorefresh uses xerces to read/write a list of excluded resources.
Comment 11 Sam Robb CLA 2003-06-19 11:52:31 EDT
Just a note - removing Xerces entirely will also have an impact on plugins 
outside of Eclipse Platform that currently make use of Xerces for XML parsing.

Whatever solution is decided upon, some information on how to migrate Xerces 
dependent code would be very welcome.
Comment 12 DJ Houghton CLA 2003-06-19 12:11:03 EDT
Just as a note, I believe that we still intend to ship the Xerces plug-in with 
Eclipse, but remove the dependencies from the base plug-ins (although the 
rules might change once we only support higher level JDKs which already 
include Xerces).

Removing the dependencies is especially important for the Rich Client Platform 
(RCP). Being able to ship a minimal Eclipse without the Xerces plug-in (which 
is 3M in size) would be a big win.
Comment 13 Nick Edgar CLA 2003-06-19 12:50:55 EDT
Platform UI has only the following uses of Xerces:

1. JFace dialog settings (kind of like a preference store): 
- class: org.eclipse.jface.dialogs.DialogSettings
- no refs to Xerces in API

2. XMLMemento, the default implementation of IMemento:
- class: org.eclipse.ui.XMLMemento
- one ref in API: constructor XMLMemento(Document, Element), however this 
currently has no callers in the SDK other than in XMLMemento itself.  Users of 
XMLMemento are expected to use the static factory methods createReadRoot / 
createWriteRoot.

3. Welcome page parser (uses SAX):
- class: org.eclipse.ui.internal.dialogs.WelcomeParser
- no refs to Xerces in API

Comment 14 Gary Gregory CLA 2003-06-19 17:18:56 EDT
For code that uses a specific XML vocabulary, why not create XSDs (XML Schema)
and generate code from the XSDs with JAXB (http://java.sun.com/xml/jaxb/)?
Comment 15 Chris McKillop CLA 2003-06-20 00:02:29 EDT
We make use of the Xerces plugin in our commercial product based on Eclipse
(Momentics).  Also note that j9 does not currently ship with any of the 1.4 XML
APIs.  I am not sure what future plans there are for adding this to j9's default
set of classlibs.
Comment 16 Dani Megert CLA 2003-06-23 03:33:07 EDT
In addition to comment 8 JDT UI also uses Xerces in the following area:
- JAR Packager: read and write .jardesc file
- Javadoc Export: read and write settings file
- Javadoc Location: read and write settings file

Besides JDT UI we also have an internal plug-in that depends on Xerces.
Comment 17 Darin Wright CLA 2003-06-24 22:38:23 EDT
The debugger uses XML for persistence of:
* launch configurations
* source locations
* runtime classpaths/source lookup paths
* launch history
* installed JREs
* launch variables
Comment 18 Philippe Krief CLA 2003-06-25 11:16:06 EDT
In WSDD, we also have several dependencies on the Xerces plugin: for example, 
the P3ML plugin...
Now, P3ML references only JAXP API so, it should be OK with any of your 
suggestions...
thanks
Philippe
Comment 19 Chris Riddick CLA 2003-07-25 10:07:47 EDT
You can override the default parser versions in J2SE1.4 by using the endorsed 
jars option.

For example, you could add the following to the startup command file...
-Djava.endorsed.dirs=.\plugins\xalan-j_2_5_1\bin\xml-apis.jar;.\plugins\xalan-j_2_5_1\bin\xercesImpl.jar;.\plugins\xalan-j_2_1_1\bin\xalan.jar
Comment 20 Kevin Duffey CLA 2003-09-06 02:17:06 EDT
XMLPull will solve all your problems. If I could have ever figured out how to 
modify the parsing code I probably could have helped add xml pull. I am not 
entirely sure if the various sub-nodes of extensions, the ones used to help 
build the ui, such as actionSets, and such, are "hard coded" or if a dynamic 
DOM tree is created such that any plugin could provide any type of dynamic node 
set and have it parsed in such a way that the extension point plugin requiring 
it could gain access to the xml->object tree nodes, or what. XMLPull using 
kxml2 at 20K or so in size is extremely fast, is very easy to code, and uses 
very little resources, not to mention you can easily break out of parsing the 
entire document if you don't need to. My futile attempts in adding it into the 
core and never being able to get the core to work has deterred me from working 
on it. I followed instructions, as well as help from some others and still 
could not get the second eclipse runtime to show up at all. If someone wishes 
to help me on that so that I can try to develop it, I would be happy to help 
again.
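
To illustrate the "break out of parsing early" point, here is a hedged sketch of pull-style parsing. XMLPull and kxml2 are not part of the JDK, so this uses StAX (javax.xml.stream), a different pull API that arrived in later Java releases, as a stand-in. The class name, the method name and the sample element and attribute names are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class PullParseSketch {
    // Returns the "point" attribute of the first <extension> element, or null if none.
    // The loop returns as soon as that element is seen; later events are never pulled.
    static String firstExtensionPoint(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "extension".equals(reader.getLocalName())) {
                        // A null namespace URI means the attribute namespace is not checked.
                        return reader.getAttributeValue(null, "point");
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical input shaped loosely like a plug-in manifest.
        String xml = "<plugin><extension point=\"org.example.views\"/>"
                + "<extension point=\"org.example.other\"/></plugin>";
        System.out.println(firstExtensionPoint(xml));
    }
}
```

The caller decides when to stop, which is the main contrast with a SAX handler that is driven through the whole document.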

Comment 21 Ed Merks CLA 2003-10-01 12:26:47 EDT
EMF and XSD both depend on the Xerces plugin.  We've already run into problems 
with folks using JDK 1.4 with Eclipse 3.0 because in that environment, the 
Xerces plugin is never loaded and the JDK's version takes over.  It would seem 
like a good idea to upgrade the Xerces plugin to use the same level of Xerces as 
is in JDK 1.4. If that were the case, whether the Xerces was picked up from the 
JDK or from the plugin, we'd get the same Xerces and we'd be very happy.

This issue is very high priority for EMF and XSD and we can't afford to wait 
months for this to be resolved. What can we do to ensure that this issue reaches 
resolution as soon as possible?
Comment 22 DJ Houghton CLA 2003-10-01 14:17:48 EDT
Currently everyone is unable to use an IBM 1.4 JRE with Eclipse. As you can
imagine, this is a high-priority item for us as well. We are working on a plan
and hope to resolve this situation asap.
Comment 23 Darin Swanson CLA 2003-10-09 16:29:42 EDT
Ant integration specific bug 44494
Comment 24 Michael Valenta CLA 2003-10-14 12:46:31 EDT
In the process of converting from Xerces to Java 1.4 XML APIs, I encountered a 
few strange things that I thought I would share (in case others encounter the 
problem or can offer insight).

1) There seem to be two ways to create a parser, the first is:

   SAXParserFactory factory = SAXParserFactory.newInstance();
   SAXParser parser = factory.newSAXParser();
   DefaultHandler handler = ???;
   InputSource source = ???;
   parser.parse(source, handler);

This worked for me with the change in behavior described next in point 2. The 
second method is:

   XMLReader parser = XMLReaderFactory.createXMLReader();
   ContentHandler contentHandler = ???;
   parser.setContentHandler(contentHandler);
   InputSource source = ???;
   parser.parse(source);

I saw this in an article and others I've talked to have used it but it doesn't 
work out of the box (at least with Sun 1.4.1_02) so I used the first approach. 

2) The startElement/endElement methods in the ContentHandler are provided with 
three naming parameters (namespaceURI, localName and qualifiedName). With 
Xerces, the localName was provided but with the parser in Sun 1.4.1_02 the 
qualified name is provided and the localName is an empty string. The API was 
not clear on when one or the other was used so I rewrote our handlers to 
handle both combinations (we have very simple XML so this is OK but I suspect 
this would not be OK if namespaces are involved).

Anyway, I hope this helps others that are converting.
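
A handler that copes with both combinations could be sketched roughly as follows. This is not the Team/CVS code; the class and method names are made up for illustration. It prefers localName and falls back to the qualified name when localName is empty, which is only reasonable for XML that does not use namespaces.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class NameFallbackSketch {
    // Collects element names, preferring localName and falling back to qName.
    static class NameCollector extends DefaultHandler {
        final List<String> names = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            boolean noLocalName = localName == null || localName.isEmpty();
            names.add(noLocalName ? qName : localName);
        }
    }

    // Parses the text with the requested namespace setting and returns the element names seen.
    static List<String> elementNames(String xml, boolean namespaceAware) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            NameCollector handler = new NameCollector();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The same names are collected whichever way the parser is configured.
        System.out.println(elementNames("<a><b/></a>", false));
        System.out.println(elementNames("<a><b/></a>", true));
    }
}
```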
Comment 25 Bob Foster CLA 2003-10-14 14:41:35 EDT
"2) The startElement/endElement methods in the ContentHandler are provided with 
three naming parameters (namespaceURI, localName and qualifiedName). With 
Xerces, the localName was provided but with the parser in Sun 1.4.1_02 the 
qualified name is provided and the localName is an empty string."

You need:

   SAXParserFactory factory = SAXParserFactory.newInstance();
   factory.setNamespaceAware(true);
   ...

Comment 26 DJ Houghton CLA 2003-10-14 14:54:22 EDT
*** Bug 39187 has been marked as a duplicate of this bug. ***
Comment 27 DJ Houghton CLA 2003-10-15 11:27:21 EDT
Just to make Michael and Bob's comments clear, this is (roughly) the code that
we are using in the plugin parser: 

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setNamespaceAware(true);
SAXParser parser = factory.newSAXParser();
InputSource input = getInput();
DefaultHandler handler = myHandler();
parser.parse(input, handler);
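
For anyone who wants to try this outside the plugin parser, the following is a self-contained sketch of the same sequence of calls. It is not the plugin parser itself: getInput() and myHandler() above are replaced by an in-memory string and an anonymous handler, and the class name and sample namespace URI are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class NamespaceAwareSketch {
    // Returns "{namespaceURI}localName" for the root element of the given XML text.
    static String describeRoot(String xml) {
        final String[] result = new String[1];
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // without this, localName may arrive empty
            SAXParser parser = factory.newSAXParser();
            DefaultHandler handler = new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    if (result[0] == null) { // record only the first (root) element
                        result[0] = "{" + uri + "}" + localName;
                    }
                }
            };
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        // Hypothetical namespaced input; "urn:example" is a placeholder URI.
        System.out.println(describeRoot("<p:plugin xmlns:p=\"urn:example\"/>"));
    }
}
```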
Comment 28 Tim Koss CLA 2003-10-16 16:28:50 EDT
Just to verify from comment #12  by DJ:
     
     So, we are still shipping Xerces with the full Eclipse 3.0?
Comment 29 DJ Houghton CLA 2003-10-16 17:06:10 EDT
Currently we are. 
Not shipping Xerces would be a breaking API change.
There is a request into the PMC for removal of the project, but we need to 
consider downstream effects.
Comment 30 Nitin Dahyabhai CLA 2003-11-06 18:06:06 EST
One thing our team has found is that the DOM Level 2 Traversal and Range
Specification Java bindings found in Xerces aren't present in the J2SE XML
implementation.  While they're not part of the DOM2 Core, we have classes that
implement those interfaces and other plugins that use our implementations
strictly through the interfaces.  Are there any recommendations for providing
org.w3c.dom.ranges and org.w3c.dom.traversal to plugins that used to get them
from the Xerces plugin but want to use the J2SE XML implementation?
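
One way a plugin might at least detect the situation at runtime is to ask the DOM implementation which modules it claims to support. This does not supply the missing interfaces. It is only a probe, sketched here with the DOM Level 2 Core hasFeature call so that it compiles without org.w3c.dom.traversal or org.w3c.dom.ranges on the classpath. The class name is hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;

public class DomFeatureProbe {
    // Asks the JAXP-provided DOM implementation whether it claims a DOM Level 2 module.
    static boolean supports(String feature) {
        try {
            DOMImplementation impl = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .getDOMImplementation();
            return impl.hasFeature(feature, "2.0");
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The answers depend on which parser the JAXP lookup finds at runtime.
        System.out.println("Traversal: " + supports("Traversal"));
        System.out.println("Range: " + supports("Range"));
    }
}
```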
Comment 31 DJ Houghton CLA 2004-03-30 12:00:41 EST
The removal of the Xerces plug-in from the SDK has been approved by the PMC and
the change will be present in builds later this week.
Comment 32 DJ Houghton CLA 2004-04-02 15:32:40 EST
The SDK has removed all dependencies on Xerces.
The org.apache.xerces plug-in is no longer shipped with the SDK.
Closing.