Bug 225165 - Import does not handle xs:any data in <model> element or extra user attributes
Status: NEW
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: Cosmos
Version: unspecified
Hardware: All All
Importance: P3 normal
Target Milestone: ---
Assignee: Jimmy Mohsin CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-04-01 13:41 EDT by David Whiteman CLA
Modified: 2012-01-03 13:48 EST
CC List: 5 users

See Also:


Attachments
Contains the file changes related to org.eclipse.cosmos.rm.repository project (31.73 KB, patch)
2008-11-04 07:07 EST, Ramesh CLA
no flags
Contains file changes related to org.eclipse.cosmos.rm.validation project (1.52 KB, patch)
2008-11-04 07:09 EST, Ramesh CLA
no flags
Contains the new SMLIF file that I prepared to check the fix (part of org.eclipse.cosmos.rm.validation.tests project) (6.26 KB, patch)
2008-11-04 07:13 EST, Ramesh CLA
no flags

Description David Whiteman CLA 2008-04-01 13:41:43 EDT
The SML-IF spec allows for optional user data following the <instances> element of a <model>, and also allows optional user attributes on the model element.  We need to modify the import operation to store this data in the metadata construct so that it's preserved for the export operation.
Comment 1 John Arwe CLA 2008-08-28 09:23:57 EDT
Note: this is not a spec compliance issue.  The SMLIF schema allows open content, both attributes and elements, in many places.  The SMLIF spec is silent on how that content is processed, if at all, by a consumer.

It would certainly be a laudable goal to preserve all open content.  Note, however, that doing so might introduce inconsistencies from the point of view of processors that do understand the open content and use it.

Example: I add open content on the model element that enumerates all the instance documents' cryptographic hashes.  If a new instance document is added, a later export would not add the hash of the newly added document (it's open content, so you don't understand it).  Removing and updating existing documents have similar issues, as does any case that involves dependencies between the content you do understand and the open content you do not understand.  

One can certainly argue that these issues are all a consequence of putting non-standard (open) content into your documents, i.e. that is the risk of using open content, so either the open content processors must cope with this or the "non-standard" is busted by definition.  But that gets us into the realm of design philosophies.
Comment 2 David Whiteman CLA 2008-08-28 09:28:50 EDT
Yes, this does show the import is not spec compliant.  However, any potential workgroup interop would involve the validator, and not the import utility, so this would be of lesser priority.
Comment 3 David Whiteman CLA 2008-10-08 11:24:14 EDT
Naveen wrote via email:
> Hi David,
>  
> As discussed in yesterday's RM call, I went through a few links about XML parsing, 
> SML-IF schema, <xs:any>, <xs:anyAttribute> and SAX parser.
>  
> I tried debugging (drilling down) the code to see what an import operation does.
>  
> I have a few queries regarding the open content mentioned in the above bug.
>  
> What is open content? Is it an element or plain text which can be added into the 
> SML-IF file?

It can be either.  Open content means anything that is not part of the schema.

> Where can the open content be added? In the <model> element or in the <instance> element? I
> guess that it can be added in either of the above elements, as both have the option of <xs:any>
> defined in the SML-IF schema.

Yes, the location of the open content depends on where <xs:any> is specified in the schema.

> I was not able to reproduce the exact problem. My understanding of the bug is:
> We import an existing SML-IF file into its constituent SML files using Import 
> Operation of SML-IF plugin. Suppose we add some open content to the existing SML-IF
> file and perform import, the open content is lost as it is not being stored 
> anywhere. Hence we need to modify the import operation to store the open content.
> Is my understanding of the bug correct? If yes, where and how do we have to store 
> the open content?

Look in the import code in the repository plugin, and you will see where we are writing the SML-IF content to a metadata file.  It is a "hidden file" that begins with a "." stored in the top level destination directory.  The format of it is not specified in the SML-IF spec and is entirely the invention of the COSMOS RM team.  So you would have to come up with a new element to store the open content so that it will be available when you do an export.  You could call it <xsAnyData>, <userData> or whatever you like.  Be sure to use constants for strings in the code like we do for other elements.
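
To illustrate the suggestion about a dedicated element and string constants, here is a minimal sketch. The class name, the <userData> element name, and the wrap/unwrap helpers are all hypothetical; the real metadata format is whatever the COSMOS RM import code writes, and this sketch does not reproduce it.

```java
// Hypothetical helper, not part of the COSMOS code base.
class MetadataOpenContent {
    // Assumed element name; <xsAnyData> would work just as well.
    static final String USER_DATA_ELEMENT = "userData";
    private static final String OPEN_TAG = "<" + USER_DATA_ELEMENT + ">";
    private static final String CLOSE_TAG = "</" + USER_DATA_ELEMENT + ">";

    // Import side: wrap the captured open content for the metadata file.
    static String wrap(String openContent) {
        return OPEN_TAG + openContent + CLOSE_TAG;
    }

    // Export side: recover the open content, or "" if none was stored.
    static String unwrap(String metadata) {
        int start = metadata.indexOf(OPEN_TAG);
        int end = metadata.lastIndexOf(CLOSE_TAG);
        if (start < 0 || end < start) {
            return "";
        }
        return metadata.substring(start + OPEN_TAG.length(), end);
    }
}
```

Keeping the element name in one constant means the import and export sides cannot drift apart on spelling.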

Does that answer your questions?
Comment 4 Naveen Tirupattur CLA 2008-10-13 01:11:16 EDT
Yes David. I have checked the .smlif_meta file for its structure. I will try implementing the same for the open content.
Comment 5 Naveen Tirupattur CLA 2008-10-13 07:42:09 EDT
Hi David,

I have checked the smlif_meta file structure and also the code of SMLIFFileHandler.java. This handler is segregating the data based on the element (tag) names. But if the content is open, what mechanism should be applied to filter the open content, as the element (tag) name could be anything?

I am stuck at this point as to how to identify the open content in the SML-IF file.

regards,
-Naveen.
Comment 6 David Whiteman CLA 2008-10-13 10:06:43 EDT
The open content will occur at certain points in the file.  If it follows a <model> tag, you can create an inModel flag that you set to true when startElement() is called for the element name "model", and set back to false in endElement().  If the content includes elements that you are not expecting, capture them using an else block in startElement() and endElement().  You could also set an inOpenContent flag when you get the endElement() for the last expected child element of <model>, and keep it set until you get the endElement() for model itself (I'm using this as an example; I don't recall offhand where the open content is specified in the schema).  Gathering all the open content involves writing the start tags, the content passed through the characters() method, and the end tags to a string writer or buffer.  When you reach the place where the open content must stop (likely the </model> tag), write the contents of the string writer to the metadata.  Does this make sense?
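
A rough sketch of that capture logic follows, under several assumptions: it is a standalone handler rather than the real SMLIFFileHandler, it uses a depth counter in place of the inModel/inOpenContent flags, it assumes <model> is the root element, and the list of expected child names is illustrative only. It also matches on qualified names with a parser that is not namespace-aware, which a real implementation would probably replace with namespace-aware matching.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: buffers every unexpected child of the root element
// (assumed to be <model>), including its whole subtree, as open content.
class OpenContentCapture extends DefaultHandler {
    // Assumed set of expected <model> children; adjust to the real schema.
    private static final Set<String> EXPECTED = new HashSet<>(Arrays.asList(
            "identity", "ruleBindings", "schemaBindings", "definitions", "instances"));

    private final StringBuilder openContent = new StringBuilder();
    private int depth = 0;     // nesting depth of the current element (root = 1)
    private int openDepth = 0; // depth where the captured subtree began, 0 if none

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        depth++;
        if (openDepth == 0 && depth == 2 && !EXPECTED.contains(qName)) {
            openDepth = depth; // the "else" branch: an element we do not expect
        }
        if (openDepth != 0) {
            openContent.append('<').append(qName);
            for (int i = 0; i < atts.getLength(); i++) {
                openContent.append(' ').append(atts.getQName(i)).append("=\"")
                        .append(escape(atts.getValue(i))).append('"');
            }
            openContent.append('>');
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (openDepth != 0) {
            openContent.append(escape(new String(ch, start, length)));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (openDepth != 0) {
            openContent.append("</").append(qName).append('>');
            if (depth == openDepth) {
                openDepth = 0; // the unexpected subtree is complete
            }
        }
        depth--;
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Parses the given SML-IF text and returns the serialized open content.
    static String capture(String xml) {
        try {
            OpenContentCapture handler = new OpenContentCapture();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.openContent.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The returned string is what would be written to the metadata file once the end of <model> is reached. Empty elements come back as start/end tag pairs, which is equivalent XML.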
Comment 7 John Arwe CLA 2008-10-31 17:05:26 EDT
Clarifications:

Comment 1: > Note: this is not a spec compliance issue.
Comment 2: > Yes, this does show the import is not spec compliant.
The SMLIF spec does not prescribe any particular behavior for open content, so I think it is inaccurate to have the 'not' in the excerpt above from comment 2.  The current behavior, as well as the requested behavior, demonstrate nothing with respect to compliance that I am aware of.

Comment 3: > It can be either.
I believe xs:any only matches element information items, i.e. it is defined as if it was an element declaration with mixed=false.  Any elements matching an xs:any could themselves contain text content, but at the outermost level it would need an element wrapping it.

Comment 3: > Open content means anything that is not part of the schema.
More accurately, open content is any instance-level content that would match a schema wildcard during schema validity assessment.  Schema wildcards include both xs:any and xs:anyAttribute in XML Schema 1.0, as correctly noted.

Comment 3: > depends on where <xs:any> is specified
xs:any and xs:anyAttribute

Comment 6: 
David describes the process for element content.  To handle attributes, you would need to look at every attribute on known elements (unknown elements, including their known and unknown attributes, having already been handled), apply the same known/unknown discrimination to attributes, and then stash the unknown attributes à la what you did for unknown elements more generally.  In both cases, you need to retain metadata that tells you where in the SMLIF file things go (elements: probably the previous sibling; attributes: the element's XPath, a hash of the element content, or similar).
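
The attribute side could look roughly like the sketch below. The class name, the method, and the "path/@name" key format are assumptions made for illustration; the set of known attribute names would have to come from the actual schema, and a simple path string stands in here for whichever location scheme (XPath, content hash) is chosen.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import org.xml.sax.Attributes;

// Hypothetical helper, not part of the COSMOS code base.
class OpenAttributeFilter {
    // Returns the attributes of a known element that the importer does not
    // recognize, keyed by an assumed "elementPath/@name" location string.
    static Map<String, String> unknownAttributes(
            String elementPath, Attributes atts, Set<String> known) {
        Map<String, String> unknown = new LinkedHashMap<>();
        for (int i = 0; i < atts.getLength(); i++) {
            String name = atts.getQName(i);
            // Namespace declarations are not open content; skip them.
            if (name.equals("xmlns") || name.startsWith("xmlns:")) {
                continue;
            }
            if (!known.contains(name)) {
                unknown.put(elementPath + "/@" + name, atts.getValue(i));
            }
        }
        return unknown;
    }
}
```

This would be called from startElement() for each known element, with the resulting map stored in the metadata alongside the captured open elements.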
Comment 8 Ramesh CLA 2008-11-04 07:07:41 EST
Created attachment 116918
Contains the file changes related to org.eclipse.cosmos.rm.repository project

The file SMLIFFileHandler.java also includes the changes that I made for issue #176187 (which I sent for review earlier; I believe they are yet to be reviewed). To avoid confusion while reviewing this file, I have marked the code changes with the related issue numbers.

Note: If issue #176187 gets reviewed before this one, there should be no confusion for reviewers :)
Comment 9 Ramesh CLA 2008-11-04 07:09:19 EST
Created attachment 116919
Contains file changes related to org.eclipse.cosmos.rm.validation project
Comment 10 Ramesh CLA 2008-11-04 07:13:44 EST
Created attachment 116921
Contains the new SMLIF file that I prepared to check the fix (part of org.eclipse.cosmos.rm.validation.tests project)
Comment 11 David Whiteman CLA 2008-11-04 07:22:51 EST
Hi Ramesh... if I check the SMLIF file in, will that automatically add a JUnit test, or do you need to add a new JUnit test method as well?
Comment 12 Ramesh CLA 2008-11-04 07:44:46 EST
(In reply to comment #11)
> Hi Ramesh... if I check the SMLIF file in, will that automatically add a JUnit
> test, or do you need to add a new JUnit test method as well?
> 
No David, I believe it will not add a JUnit test case automatically (I haven't made any changes in the validation.tests project to do so; is there any configuration to auto-generate a JUnit test?)

Actually, I prepared this opencontent.xml file as part of my sample project while testing the fix (i.e. by running the smlif plugin code through the 'Eclipse Application' option). 
While sending the patch for review, I thought it better to add the opencontent.xml file to CVS for verification and reference, so I included it under the validation.tests project.

Please let me know if I need to add a JUnit test case for this.
Thanks
Ramesh Pokala

Comment 13 David Whiteman CLA 2008-11-04 09:44:29 EST
Yes, it would be good to add a JUnit test for it if you can.  Whenever possible, it is good to add a JUnit test for every problem that is fixed, so we make sure we keep the desired behavior in the future and there is no quality regression.  There is already a JUnit test for the import utility, so you can open the TPTP test for that and add a new test method using that interface, and it will generate a stub for you to put in the JUnit logic.  Let me know if this does not make sense.
Comment 14 Jimmy Mohsin CLA 2008-11-04 11:54:03 EST
Candidate Whiteman,

Methinks this RM bug is thine property...

Jimmy
Comment 15 David Whiteman CLA 2008-11-04 16:43:03 EST
Will review this fix once a JUnit test is available.