Bug 323168 - Mixed content in XSD vs attribute in XMI
Status: NEW
Alias: None
Product: MDT.BPMN2
Classification: Modeling
Component: Core
Version: unspecified
Hardware: All All
Importance: P3 normal
Target Milestone: ---
Assignee: Project Inbox CLA
Blocks: 323167
Reported: 2010-08-19 11:54 EDT by Henning Heitkoetter CLA
Modified: 2010-11-25 10:34 EST

Attachments
Patch series for Documentation.text (17.15 KB, application/octet-stream)
2010-08-19 11:59 EDT, Henning Heitkoetter CLA
Rebased to current master (90.61 KB, patch)
2010-08-20 09:43 EDT, Henning Heitkoetter CLA

Description Henning Heitkoetter CLA 2010-08-19 11:54:31 EDT
In several places, the XSD of the specification designates the content of an element as the place to serialize an attribute value, whereas the XMI uses an attribute.

This occurs in different flavours:
1a) "mixed+any to string"
  in XMI: attribute Documentation.text (String)
  in XSD: <=> mixed content of Documentation element (incl. [0..1] any element)
1b) "element with mixed+any to string"
  ScriptTask.script (String)
  <=> child element "script" of type "tScript" with mixed content (and any[0..1]) 
  [analogous for TextAnnotation.text <=> child "text"]
2) "mixed to object"
  FormalExpression.body (Object)
  <=> mixed content of FormalExpression
?) "mixed to ?" 
  Expression allows mixed content, but I can't identify what it represents in the metamodel

1b is similar to 1a, except that the content is wrapped in another element

Current situation:
Currently, we consider only XMI attributes. This leads to an invalid XML serialization, because the attribute is serialized although it is not in the schema. Conversely, loading a schema-valid XML document (produced externally) results in an exception (FeatureNotFound).
Comment 1 Henning Heitkoetter CLA 2010-08-19 11:58:07 EDT
Problem concerning 1a and 1b:
How is arbitrary (text) content supposed to map to an attribute?
(Note: when I refer to "XML" (in contrast to XMI), I mean a serialization conforming to the BPMN2 XSD.)

In cases where the content is only text (or CDATA), the conceptual mapping is relatively straightforward (though its implementation is not necessarily so):
	XML: <bpmn2:documentation>Documentation text</bpmn2:documentation>
	XMI: <bpmn2:documentation text="Documentation text" />
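The text-only mapping above can be illustrated with a minimal sketch on a plain DOM tree. This is not the EMF implementation; the class and method names (TextOnlyMapping, contentToAttribute, attributeToContent) are hypothetical, and the bpmn2 prefix is omitted to keep the input self-contained.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper: moves text between element content and a "text" attribute.
final class TextOnlyMapping {

    /** Parses an XML string and returns its root element. */
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse XML", ex);
        }
    }

    /** XML form to XMI form: text and CDATA content becomes the "text" attribute. */
    static void contentToAttribute(Element element) {
        // getTextContent() concatenates text and CDATA nodes and skips comments.
        String text = element.getTextContent();
        while (element.hasChildNodes()) {
            element.removeChild(element.getFirstChild());
        }
        element.setAttribute("text", text);
    }

    /** XMI form to XML form: the "text" attribute becomes a single text node. */
    static void attributeToContent(Element element) {
        String text = element.getAttribute("text");
        element.removeAttribute("text");
        element.setTextContent(text);
    }
}
```

Applied to a child-free element, the two methods are inverses of each other; child elements would be lost by contentToAttribute, which is exactly the open question below.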

However, the mapping becomes less obvious when child elements are considered, like in the following XML:
	<bpmn2:documentation>
		<!-- this element contains a text and an element node -->
		HTML document follows
		<html xmlns="...">
			...
		</html>
	</bpmn2:documentation>

Simply escape it and stuff it into the text attribute? What about the other direction (XMI->XML)? Should a round trip be possible (XML saved as XMI saved as XML)?
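To make the "escape it" option concrete, here is a rough sketch of one possible answer, not a decision: the child nodes are serialized to markup, kept as a plain string (which an XML writer would escape when storing it in an attribute), and parsed again for the way back. It uses only the JDK DOM and transformer APIs; MixedContentEscaping, contentToString, stringToContent and the "wrapper" element are made-up names, and namespace prefixes declared on ancestor elements are not handled.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper for the "escape and store as string" idea.
final class MixedContentEscaping {
    private static final String OPEN = "<wrapper>";
    private static final String CLOSE = "</wrapper>";

    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse XML", ex);
        }
    }

    /** XML to string: serializes all child nodes (text and elements) as markup. */
    static String contentToString(Element source) {
        // Copy the children into an attribute-free wrapper so that only the
        // wrapper's own tags have to be stripped from the serialized form.
        Element wrapper = parseRoot(OPEN + CLOSE);
        Document doc = wrapper.getOwnerDocument();
        for (Node n = source.getFirstChild(); n != null; n = n.getNextSibling()) {
            wrapper.appendChild(doc.importNode(n, true));
        }
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(wrapper), new StreamResult(out));
            String s = out.toString();
            if (!s.startsWith(OPEN) || !s.endsWith(CLOSE)) {
                return ""; // an empty wrapper is written as a self-closing tag
            }
            return s.substring(OPEN.length(), s.length() - CLOSE.length());
        } catch (Exception ex) {
            throw new IllegalStateException("Could not serialize content", ex);
        }
    }

    /** String to XML: parses the stored markup and makes it the element's content. */
    static void stringToContent(Element target, String value) {
        Element wrapper = parseRoot(OPEN + value + CLOSE);
        while (target.hasChildNodes()) {
            target.removeChild(target.getFirstChild());
        }
        Document doc = target.getOwnerDocument();
        for (Node n = wrapper.getFirstChild(); n != null; n = n.getNextSibling()) {
            target.appendChild(doc.importNode(n, true));
        }
    }
}
```

With this scheme a round trip XML -> string -> XML keeps the child elements, but a string set by hand on the XMI side must itself be well-formed markup, otherwise stringToContent fails. That asymmetry is part of what would need to be decided.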
Comment 2 Henning Heitkoetter CLA 2010-08-19 11:59:59 EDT
Created attachment 177020
Patch series for Documentation.text

This patch series is a first implementation for handling Documentation.text correctly. So far, arbitrary child elements are not supported, only text and CDATA nodes. Once we decide how to represent these as a string in XMI, a corresponding implementation should be rather straightforward (add an "any" attribute and handle it in the same manner).

Patch 5: An attribute mixed:EFeatureMapEntry[0..*] (with ExtendedMetadata "kind->elementWildcard") is added to the Documentation class, as it would appear if one created the Ecore model directly from the XSD. The value of the existing text attribute is now derived by concatenating the text and CDATA nodes in "mixed"; its setter replaces the content of mixed with a text node containing the new value. The "text" attribute is not marked as transient, because it has to be serialized in XMI.
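The derived attribute described in patch 5 can be pictured with a plain-Java stand-in. This is only a sketch of the idea, not the patch itself: it does not use EMF's feature map API, and DocumentationSketch, Entry and Kind are hypothetical names. Returning an empty string when there is no text is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the Documentation class: "mixed" holds the content nodes,
// "text" is derived from it.
final class DocumentationSketch {

    enum Kind { TEXT, CDATA, COMMENT }

    /** One entry of the mixed content: a node kind plus its string value. */
    static final class Entry {
        final Kind kind;
        final String value;

        Entry(Kind kind, String value) {
            this.kind = kind;
            this.value = value;
        }
    }

    private final List<Entry> mixed = new ArrayList<>();

    List<Entry> getMixed() {
        return mixed;
    }

    /** Derived getter: concatenates the text and CDATA entries in "mixed". */
    String getText() {
        StringBuilder sb = new StringBuilder();
        for (Entry entry : mixed) {
            if (entry.kind == Kind.TEXT || entry.kind == Kind.CDATA) {
                sb.append(entry.value);
            }
        }
        return sb.toString();
    }

    /** Setter: replaces the content of "mixed" with one text entry. */
    void setText(String newText) {
        mixed.clear();
        if (newText != null) {
            mixed.add(new Entry(Kind.TEXT, newText));
        }
    }
}
```

Because the setter clears "mixed", any non-text entries that were loaded from XML are dropped as soon as the text is set, which matches the "replaces the content" wording above.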

Patch 6: To suppress the serialization of "text" in XML, the shouldSaveFeature method of XMLSave is overridden to return false for this specific feature.

Patch 1: In order to do the same for "mixed" in XMI, we need a BPMN specific XMIResource implementation (similar to Bpmn2ResourceImpl for XML). I chose bpmn2xmi as the file extension.
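The per-format suppression from patches 6 and 1 boils down to a save hook that is asked once per feature. The following is a simplified stand-in in plain Java, not EMF code: SaveFilterSketch, Save, XmlSave and XmiSave are hypothetical classes, features are reduced to string names, and only the hook name shouldSaveFeature is borrowed from the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in showing how one model can be saved in two formats by
// filtering features in an overridable hook.
final class SaveFilterSketch {

    /** Base serializer: saves every feature unless a subclass vetoes it. */
    static class Save {
        protected boolean shouldSaveFeature(String featureName) {
            return true;
        }

        /** Returns the features that would be written, in their original order. */
        final Map<String, String> save(Map<String, String> features) {
            Map<String, String> written = new LinkedHashMap<>();
            for (Map.Entry<String, String> feature : features.entrySet()) {
                if (shouldSaveFeature(feature.getKey())) {
                    written.put(feature.getKey(), feature.getValue());
                }
            }
            return written;
        }
    }

    /** XSD-conforming XML: the derived "text" attribute must not be written. */
    static final class XmlSave extends Save {
        @Override
        protected boolean shouldSaveFeature(String featureName) {
            return !"text".equals(featureName) && super.shouldSaveFeature(featureName);
        }
    }

    /** XMI: the schema-derived "mixed" feature must not be written. */
    static final class XmiSave extends Save {
        @Override
        protected boolean shouldSaveFeature(String featureName) {
            return !"mixed".equals(featureName) && super.shouldSaveFeature(featureName);
        }
    }
}
```

Both serializers see the same in-memory features; only the filter differs, which is why a BPMN-specific resource implementation is needed for each format.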

Patches 2-4 deal with testing.

With this implementation, the user can still use the convenient text attribute, while serialized models conform to XSD or XMI, respectively.

I'd like to hear your comments.
If you agree with this approach, I would push these changes and then port them to ScriptTask and TextAnnotation.
Comment 3 Henning Heitkoetter CLA 2010-08-20 09:43:37 EDT
Created attachment 177095
Rebased to current master

Everything in one patch.
Comment 4 Henning Heitkoetter CLA 2010-09-08 04:12:48 EDT
Committed.

As I found out, (Global)ScriptTask.script and TextAnnotation.text already work as intended through ExtendedMetadata, at least for text content.

"Any" content (at Documentation, ScriptTask and TextAnnotation) will have to wait for now. I welcome any comments on the problems mentioned above in comment 1.
Comment 5 Henning Heitkoetter CLA 2010-11-23 12:01:06 EST
Regarding case 2 - mixed to object @ FormalExpression.body:

In the XSD metamodel, the mixed content of FormalExpression stores the body of the formal expression. In contrast, the property FormalExpression.body in the CMOF metamodel has type "Element" and is not a composition, but a reference. I cannot see where this element should be stored (contained).

Therefore, and because it is better suited to model the intended content of the property, I propose that we change the equivalent of FormalExpression.body in Ecore from an EReference with type EObject to an EAttribute with type EString (as EString is an EDataType, not an EClass). This has, however, the following implication: when serializing to XMI, the value of this property will be stored as an attribute and not as an XML element or reference, as it would have to be according to the XMI production rules.

As a compromise, I experimented with different strategies to leave the type as EObject and force an XMI serialization as an XML element (instead of an attribute), but then deserialization didn't work.

If there are no objections, I will commit the EString solution and we can build on that implementation, which works correctly with regard to XML (de)serialization (it produces and reads valid XML documents). (De)serialization with XMI also works, with the slight deviation from the specification described above. I believe that this is the best support we can offer in this case, because I don't see how we can follow the specification with respect to XMI here.
Comment 6 Henning Heitkoetter CLA 2010-11-25 06:05:17 EST
(In reply to comment #5)
> Regarding case 2 - mixed to object @ FormalExpression.body:
> ...

See commit 6c0e1a8e508139a2c6b6b7529a4a1bb4d68ef7a5

----

Summary / status quo of this bug:
- 1a&/b are supported for XML and XMI, with the exception of arbitrary content (any[0..1]) - see comment #1
- 2 is completely supported for XML (XSD does not allow arbitrary content in that case), with the metamodel type changed to string. The XMI serialization is not strictly valid, as outlined in comment #5
Comment 7 Reiner Hille CLA 2010-11-25 07:15:54 EST
> Summary / status quo of this bug:
> - 1a&/b are supported for XML and XMI, with the exception of arbitrary content
> (any[0..1]) - see comment #1
> - 2 is completely supported for XML (XSD does not allow arbitrary content in
> that case), with the metamodel type changed to string. The XMI serialization is
> not strictly valid, as outlined in comment #5

Thanks Henning.
One question:
When I imported the CMOF, I changed all "untyped" references to become references to EObject.
Would it be an option to use EAnyObject instead, which can also be filled with an EString, similar to your proposal?
Comment 8 Henning Heitkoetter CLA 2010-11-25 09:40:04 EST
(In reply to comment #7)
> One question:
> When I imported the CMof, I have changed all "untyped" references to become a
> reference to EObject.
> Would it be an option to use EAnyObject instead - which can be also filled with
> an EString, similar to your proposal?
What do you mean by EAnyObject? I can't seem to find a type of that name.
Comment 9 Reiner Hille CLA 2010-11-25 09:45:18 EST
> > Would it be an option to use EAnyObject instead - which can be also filled with
> > an EString, similar to your proposal?
> What do you mean by EAnyObject? I can't seem to find a type of that name.
Indeed, I also can't find a common supertype for EObject and (E)String. I thought that it existed, as the CMOF importer created something with the name EAnyObject. Is there a feature type that corresponds to java.lang.Object?
Comment 10 Henning Heitkoetter CLA 2010-11-25 10:34:37 EST
(In reply to comment #9)
> Indeed I also don't find something like a common super type for EObject and
> (E)String. I thought that it exists, as the CMOF importer created something with
> the name EAnyObject. Is there a feature type the corresponds to
> java.lang.Object?
There is a data type called EJavaObject. I believe I've experimented with this one as well, but it led to strange serializations (~hash code or hex code) and, as a data type, was serialized (XMI) in attribute form as well.