Bug 136901 - large schema leads to hang, huge memory bloat
Status: VERIFIED FIXED
Alias: None
Product: EMF
Classification: Modeling
Component: XSD
Version: unspecified
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: ---
Assignee: Ed Merks CLA
QA Contact:
URL: http://www.hl7.org/v3ballot/html/infr...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-04-15 01:54 EDT by Patrick McCormick CLA
Modified: 2023-01-12 11:53 EST (History)

Attachments
zip of schemas required to reproduce - with minor fixes to schema locations (86.50 KB, application/octet-stream)
2006-04-26 14:17 EDT, Craig Salter CLA

Description Patrick McCormick CLA 2006-04-15 01:54:42 EDT
org.eclipse.xsd version 2.1.1
Eclipse Version: 3.1.2
eclipse.buildId=M20050929-0840
java.version=1.5.0_06
java.vendor=Sun Microsystems Inc.
BootLoader constants: OS=win32, ARCH=x86, WS=win32, NL=en_US
Framework arguments:  -vm  c:\Program Files\Java\jre1.5.0_06\bin\java.exe
Command-line arguments:  -os win32 -ws win32 -arch x86 -vm  c:\Program Files\Java\jre1.5.0_06\bin\java.exe
eclipse.vmargs=-Xms256m -Xmx1024m -Xss32m

I have 2GB of RAM on my laptop, with usually 1.4GB "Available" in task manager after Eclipse is running.

When I load the XML file:
http://www.hl7.org/v3ballot/html/infrastructure/cda/SampleCDADocument.xml

with all the referenced XSD files present, Eclipse sits for some time loading CDA.xml in the background at 99% CPU.  It allocates memory for a few minutes until it reaches 1,067MB, then it slows down and basically hangs.

I have successfully validated this XML file with Xerces-J in about two seconds, so I know it's not a problem with the XSD files.
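For anyone who wants to repeat that kind of standalone check outside Eclipse, here is a minimal sketch using the standard JAXP validation API (`javax.xml.validation`, available since Java 5). It is not the exact procedure used for this report. The class name `SchemaCheck` and the method `isValid` are hypothetical, and which parser implementation backs `SchemaFactory` depends on the JDK and classpath.

```java
import java.io.File;
import java.io.IOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of org.eclipse.xsd or WTP.
class SchemaCheck {
    // Returns true if the XML file validates against the given top-level schema.
    static boolean isValid(File xsd, File xml) throws IOException {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            // Included and imported schemas are resolved relative to the XSD file.
            Schema schema = factory.newSchema(xsd);
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(xml));
            return true;
        } catch (SAXException e) {
            // With no ErrorHandler set, schema or validation errors surface here.
            System.err.println("Validation failed: " + e.getMessage());
            return false;
        }
    }
}
```

Calling `SchemaCheck.isValid(new File("CDA.xsd"), new File("SampleCDADocument.xml"))` from the directory holding the schemas would be one way to time the validation independently of the XSD model.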

The referenced schemas are listed below. I edited POCD_MT000040.xsd so that its xs:include elements point to the current directory instead of the parent directory.

http://www.hl7.org/v3ballot/html/infrastructure/cda/CDA.xsd
http://www.hl7.org/v3ballot/html/infrastructure/cda/POCD_MT000040.xsd
http://www.hl7.org/v3ballot/html/processable/coreschemas/datatypes.xsd
http://www.hl7.org/v3ballot/html/processable/coreschemas/voc.xsd
http://www.hl7.org/v3ballot/html/processable/coreschemas/NarrativeBlock.xsd
http://www.hl7.org/v3ballot/html/processable/coreschemas/datatypes-base.xsd
Comment 1 Patrick McCormick CLA 2006-04-15 02:00:18 EDT
One odd thing I noticed is that if "datatypes-base.xsd" is not present, validation with Xerces-J breaks, but Eclipse finishes validation in a few seconds.  I can tell that validation at least partly works because the "(realmCode*, typeId?, templateId*..." hints appear in the XML Design view.

Another note: this schema is part of a popular healthcare standard for clinical document exchange, so I expect the audience for this particular XSD fileset to be pretty large.  The XSD files were generated by a tool.
Comment 2 Patrick McCormick CLA 2006-04-22 23:13:54 EDT
Here is the standard and a graphic depicting the schema:
http://www.hl7.org/v3ballot/html/infrastructure/cda/cda.htm
http://www.hl7.org/v3ballot/html/infrastructure/cda/graphics/L-POCD_RM000040.gif
Comment 3 Arthur Ryman CLA 2006-04-25 13:19:03 EDT
Patrick, I agree with your suggestion to include this in the test suite for WTP.

Jeffrey, please create a JUnit.

Craig, please investigate.
Comment 4 Craig Salter CLA 2006-04-25 13:27:11 EDT
Thanks for reporting. I'll take a look.
Comment 5 Craig Salter CLA 2006-04-26 12:23:00 EDT
Although I haven't figured out the exact problem yet, I do have a theory.  The file datatypes-base.xsd has an include like this...

<xsd:include schemaLocation="voc.xsd"/>

Although this is perfectly legal, I think it may be confusing our tools (it seems to be the XSD model, but I'm not sure).  If I comment out this include, everything seems to work fine (the XML Schema editors come up and the XML editor works, providing content assist etc.).  I'm using a recent WTP 1.5 to test this workaround, but I'm guessing it should apply to WTP 1.0 too.

Hopefully this workaround can keep you going until we can diagnose the underlying problem.
Comment 6 Craig Salter CLA 2006-04-26 14:06:27 EDT
Ed, it looks like we have an endless loop happening in the XSD model.  I added a println to XSDSchemaImpl.java to help demonstrate as shown below...

 public XSDSchema included(XSDInclude xsdInclude)
 {
    System.out.println("included : " +  xsdInclude.getSchemaLocation() + " by " + xsdInclude.getSchema().getSchemaLocation());


... and here's a portion of what gets printed to the console...

included : datatypes.xsd by file:/D:/workspaces/test/foobar/voc.xsd
included : voc.xsd by file:/D:/workspaces/test/foobar/datatypes-base.xsd
included : datatypes-base.xsd by file:/D:/workspaces/test/foobar/datatypes.xsd
included : datatypes.xsd by file:/D:/workspaces/test/foobar/POCD_MT000040.xsd
included : datatypes-base.xsd by file:/D:/workspaces/test/foobar/datatypes.xsd
included : voc.xsd by file:/D:/workspaces/test/foobar/datatypes-base.xsd
included : datatypes.xsd by file:/D:/workspaces/test/foobar/voc.xsd
included : datatypes-base.xsd by file:/D:/workspaces/test/foobar/datatypes.xsd
included : voc.xsd by file:/D:/workspaces/test/foobar/datatypes-base.xsd
included : datatypes.xsd by file:/D:/workspaces/test/foobar/voc.xsd
included : datatypes-base.xsd by file:/D:/workspaces/test/foobar/datatypes.xsd
included : voc.xsd by file:/D:/workspaces/test/foobar/datatypes-base.xsd
included : datatypes.xsd by file:/D:/workspaces/test/foobar/voc.xsd
included : datatypes-base.xsd by file:/D:/workspaces/test/foobar/datatypes.xsd

Let me know if I can help further with this.  I must admit that I get confused when the model goes into the 'patch' related code :-)
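
The console output above shows the includes forming a cycle: datatypes.xsd includes datatypes-base.xsd, which includes voc.xsd, which includes datatypes.xsd again. As an illustration only, and not the XSDSchemaImpl code or the fix that was later committed, here is a minimal sketch of how an include traversal can terminate on such a cycle by remembering which schemas it has already visited. The class `IncludeWalker`, the `INCLUDES` map, and the method names are hypothetical. The edges are taken from the println output.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not part of org.eclipse.xsd.
class IncludeWalker {
    // Include edges as observed in the println output (includer -> included).
    static final Map<String, List<String>> INCLUDES = Map.of(
        "POCD_MT000040.xsd", List.of("datatypes.xsd"),
        "datatypes.xsd", List.of("datatypes-base.xsd"),
        "datatypes-base.xsd", List.of("voc.xsd"),
        "voc.xsd", List.of("datatypes.xsd"));

    // Returns every schema reachable from root, each exactly once, in visit order.
    static Set<String> resolve(String root, Map<String, List<String>> includes) {
        Set<String> visited = new LinkedHashSet<>();
        walk(root, includes, visited);
        return visited;
    }

    private static void walk(String schema,
                             Map<String, List<String>> includes,
                             Set<String> visited) {
        // add() returns false if the schema was already seen, which breaks the cycle.
        if (!visited.add(schema)) {
            return;
        }
        for (String included : includes.getOrDefault(schema, List.of())) {
            walk(included, includes, visited);
        }
    }
}
```

Without the visited check, `walk` would recurse through datatypes.xsd, datatypes-base.xsd and voc.xsd indefinitely, which matches the repeating pattern in the log.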
Comment 7 Craig Salter CLA 2006-04-26 14:12:45 EDT
A couple of other notes...

A few fixes are required to the schemaLocation attributes in the linked schemas so that the references resolve in a local workspace.  I've attached a zip of the 'fixed up' schemas.

I can also reproduce this by opening the file CDA.xsd in the 'sample xsd' editor. So that should help rule out any evil 'WTP-isms' that could be messing things up :-)
Comment 8 Craig Salter CLA 2006-04-26 14:17:19 EDT
Created attachment 39564
zip of schemas required to reproduce - with minor fixes to schema locations
Comment 9 Ed Merks CLA 2006-04-27 06:35:22 EDT
A fix has been committed to CVS.  It will be in the RC2 build.
Comment 10 Nick Boldt CLA 2006-05-02 10:39:51 EDT
Fixed in 2.2.0RC2 (S200605020900)
Comment 11 Nick Boldt CLA 2008-01-28 16:35:41 EST
Move to verified as per bug 206558.
Comment 12 Nick Boldt CLA 2008-01-28 16:41:35 EST
Move to verified as per bug 206558.