Mark,
Thank you and Andrey for this. As I
understand it, the main difference between the two approaches is that in the Conventional
Approach, SBVR noun concepts are mapped to instances of EClass, and in the EMF
Extension Approach, SBVR noun concepts are mapped to subclasses of EClass. Thus,
in the EMF Extension Approach, an instance of a mapped subclass of EClass represents
an instance of the mapped SBVR concept. Is this correct?
I would like to offer comments on some of the
issues you raised, and a variation on your EMF Extension proposal.
* EMF does not support associations with more than 2 roles, so an
extension would be needed to support SBVR fact types with more than 2 roles
Have you considered objectifying fact
types and representing them as noun concepts? With objectification, an instance
of the noun concept is an instance of the fact type. I suggest objectifying all
fact types with more than two roles, and all binary fact types where the
opposite role does not range over an expression or a number. Characteristics
and binary fact types where the opposite role ranges over an expression (text,
icon, image, multi-media file) or a number would be modeled as eAttributes of
the EClass subclass that represents the noun concept. In this approach, “FactType” is
a subclass of EClass, along with “NounConcept,” and each fact type
becomes a subclass of FactType. (I think the name of your “Noun”
class should be “NounConcept” as in SBVR, since the class
represents the concept, not a word or phrase that designates the concept, which
word is a noun.) Each FactType has an attribute for each role. The eReferences
are used to link NounConcepts to their participating roles in FactTypes, and to
link each role of a FactType to the NounConcept over which it ranges. This
architecture is similar to that of Object Role Modeling (ORM), except that objects
in ORM have no attributes, only objects and roles, and value types of objects
for literals. We could do the same, using eAttributes only for “text”
and “number” values.
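To make the objectification idea concrete, here is a minimal sketch in plain Java with no EMF dependency. The class names (NounConcept, Role, FactType) and the ternary fact type "person rents car from branch" are only illustrations I made up for this note; in the actual proposal NounConcept and FactType would both be subclasses of EClass, and the role links would be eReferences.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a noun concept (a subclass of EClass in the proposal).
class NounConcept {
    final String name;
    NounConcept(String name) { this.name = name; }
}

// A role of a fact type, linked to the noun concept it ranges over.
class Role {
    final String name;
    final NounConcept rangesOver;
    Role(String name, NounConcept rangesOver) {
        this.name = name;
        this.rangesOver = rangesOver;
    }
}

// Hypothetical stand-in for an objectified fact type with any number of roles.
class FactType {
    final String name;
    final List<Role> roles = new ArrayList<>();
    FactType(String name) { this.name = name; }

    FactType addRole(String roleName, NounConcept rangesOver) {
        roles.add(new Role(roleName, rangesOver));
        return this;
    }

    // An instance of the objectified fact type binds one filler to each role, in order.
    Map<String, Object> instantiate(Object... fillers) {
        if (fillers.length != roles.size()) {
            throw new IllegalArgumentException("expected " + roles.size() + " role fillers");
        }
        Map<String, Object> fact = new LinkedHashMap<>();
        for (int i = 0; i < fillers.length; i++) {
            fact.put(roles.get(i).name, fillers[i]);
        }
        return fact;
    }

    public static void main(String[] args) {
        NounConcept person = new NounConcept("Person");
        NounConcept car = new NounConcept("Car");
        NounConcept branch = new NounConcept("Branch");
        FactType rental = new FactType("PersonRentsCarFromBranch")
                .addRole("renter", person)
                .addRole("car", car)
                .addRole("branch", branch);
        System.out.println(rental.instantiate("Bill", "Car_A", "Downtown"));
    }
}
```

Because the fact type is itself a class, a three-role fact type needs no special n-ary association support: it is just a class with three role links.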
* The SBVR concept "Role" is not distinguished from the ObjectType
that a role ranges over. This means you can't have a Role that is used in zero
fact types. Nor can Roles have Designations and other attributes, independent
from the ObjectType that a Role ranges over.
The approach outlined above solves this
problem by allowing roles to be NounConcepts. It is not necessary that a noun
concept participate in a fact type. The approach also appears to preserve the
semantic stability property of fact models, wherein adding additional fact
types does not change the structure of an existing model, as happens with
object models. For instance, you have Driver as a subclass of EClass in your
example. Suppose you were to extend your model so that driver is a role of a
person, and add a Person subclass of EClass. Some of the attributes of Driver
would need to be refactored to keep the model in third normal form*, and the
Driver class would have Person as an eSupertype. Some people object to roles
being subtypes, and would prefer an associative relationship to a
generic one: a person has the role of Scoutmaster, rather than a Scoutmaster is
a person. SBVR and natural language make a generic relationship between roles and
the object types that can fill those roles, by specifying a general concept for
the role, as in “driver: person that …”.
* The normal forms are most often
associated with relational database designs, but are applicable to conceptual
models as well, especially where reference schemes provide “keys”
and key roles for concepts. UML and Java, of course, rely on object references
(handles) as surrogate identifiers, rather than keys, to resolve references in
implementations. Keys and key attributes have to be patched onto UML, through
some form of annotation.
* The implementation of AtomicFormulations will be somewhat more complex,
since it will have to deal with FactTypes captured as EMF attributes, EMF
references, or n-way relationships
The implementation of atomic formulations
is trivial in the approach outlined above: each instance of a subclass of FactType
represents an atomic formulation.
* Concept names must not contain blanks or other characters that are not valid
in EMF element or attribute names. This is because such characters cause
problems when loading instance documents as described below. I think this can
be solved very easily by escaping these characters.
Since last spring, courtesy of Dave
Carlson and Kenn Hussey, EMF can
apply a camel case transformation to class and attribute names when importing a
UML model. I suggest we use that: “noun concept” becomes “NounConcept.”
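For illustration only, a camel case transformation of this kind can be sketched in a few lines of plain Java. This is my own minimal version, not the code the UML importer actually uses; the class and method names are hypothetical.

```java
// Hypothetical helper: turns a multi-word concept name into a camel case class name.
class CamelCaseNames {
    static String toCamelCase(String conceptName) {
        StringBuilder out = new StringBuilder();
        // Split on runs of whitespace and capitalize the first letter of each word.
        for (String word : conceptName.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only for an empty or all-blank name
            }
            out.append(Character.toUpperCase(word.charAt(0)));
            out.append(word.substring(1));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCamelCase("noun concept")); // prints NounConcept
    }
}
```

Note that the transformation is lossy: "noun concept" and "NounConcept" map to the same name, so the original designation would still have to be kept somewhere in the model.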
* IndividualConcepts are not supported, but I see no obstacle to implementing
them.
I noticed the caveat from Ed Merks on the penultimate
page of Andrey’s presentation, about the hazards of extending EMF vs. annotating
it. Can anyone say more about what we might lose, what pitfalls we might
encounter, by going with the EMF Extension approach?
Stan
From: mdt-sbvr.dev-bounces@xxxxxxxxxxx [mailto:mdt-sbvr.dev-bounces@xxxxxxxxxxx] On Behalf Of Mark H Linehan
Sent: Friday, November 21, 2008 1:55 PM
To: mdt-sbvr.dev@xxxxxxxxxxx
Subject: [mdt-sbvr.dev] Thoughts about two approaches to modeling the Meaning & Representation (MRV) part of SBVR
As I've
discussed in previous notes, over the last week I've experimented with the MRV
implementation in the mdt-sbvr libraries. My goal has been to compare this
"conventional approach" to what I have called the "EMF
extension" approach. See my notes of May 16 and August 23 to this mailing
list. The purpose of this email is to summarize what I see in the comparison. I
start by describing these two designs, and then I compare them.
Conventional Approach (as implemented in the checked-in code)
This is a direct implementation of the SBVR metamodel, created by converting
that metamodel to a corresponding EMF model. The classes and associations that
you see in the EMF model are (mostly) the concepts and fact types described in
the SBVR specification. Variances are due either to aspects that are not (yet)
implemented, or to additional methods introduced in the implementation to
simplify use of the metamodel.
The SBVR specification is fact-oriented, which means that modeled concepts and
fact types are treated as simply "facts about the model". So if you
look at the .sbvr file created in this approach, you see a bunch of
"PackagedElements" that represent those facts. For example, the
concept called "vehicle" is handled as three facts: (1) the fact that
an ObjectType exists; (2) the fact that a Text with the value
"vehicle" exists; and (3) the fact that a Designation exists with a
meaning that is the ObjectType and a representation that is the Text. Here's an
example:
<packagedElement xsi:type="mrv:ObjectType"/>
<packagedElement xsi:type="mrv:Designation" meaning="//@packagedElement.6" expression="//@packagedElement.8"/>
<packagedElement xsi:type="mrv:Text" value="Vehicle"/>
IndividualConcepts are modeled with facts of the existence of an
IndividualConcept (e.g. a car) or of an Actuality that uses an
AtomicFormulation (e.g. for "car has driver name Bill"). So, when
stored, they show up as more facts of the form:
<packagedElement xsi:type="mrv:IndividualConcept" general="//@packagedElement.9"/>
... where "//@packagedElement.9" is the ObjectType for
"car".
... and there are other lines that represent Actualities involving the
IndividualConcept. I can't show those yet because they depend upon
AtomicFormulation, which isn't implemented.
There is no provision for modeling instances (as opposed to
IndividualConcepts). That is, you can't directly store, load, or work with
instances of SBVR vocabulary concepts.
The EMF-generated implementation is very low-level. To work directly with it,
you have to code the relationships among all the elements of the SBVR
metamodel. For example, you have to explicitly write code to support the three
facts given in a paragraph above. Dave has created some utility library methods
that simplify this, and I created a bunch more which you can see at the end of
my "sbvrTest.java" experiment. I believe these utility functions will
be absolutely necessary to shield implementation users from having to
understand the low-level relationships.
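As a rough illustration of what such a utility method hides, here is a sketch in plain Java. It does not use the mdt-sbvr or EMF classes; ObjectTypeFact, TextFact, DesignationFact, and FactModel are hypothetical stand-ins that mirror the three facts in the "vehicle" example above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the three kinds of facts behind one named concept.
class ObjectTypeFact { }

class TextFact {
    final String value;
    TextFact(String value) { this.value = value; }
}

class DesignationFact {
    final Object meaning;      // the concept being designated
    final TextFact expression; // the text that designates it
    DesignationFact(Object meaning, TextFact expression) {
        this.meaning = meaning;
        this.expression = expression;
    }
}

class FactModel {
    final List<Object> packagedElements = new ArrayList<>();

    // Utility method: one call records the ObjectType, the Text, and the Designation.
    ObjectTypeFact defineObjectType(String name) {
        ObjectTypeFact objectType = new ObjectTypeFact();
        TextFact text = new TextFact(name);
        packagedElements.add(objectType);
        packagedElements.add(text);
        packagedElements.add(new DesignationFact(objectType, text));
        return objectType;
    }

    // Utility method: finds the text of the first Designation whose meaning is the given concept.
    String nameOf(Object meaning) {
        for (Object element : packagedElements) {
            if (element instanceof DesignationFact) {
                DesignationFact designation = (DesignationFact) element;
                if (designation.meaning == meaning) {
                    return designation.expression.value;
                }
            }
        }
        return null; // no Designation recorded for this meaning
    }

    public static void main(String[] args) {
        FactModel model = new FactModel();
        ObjectTypeFact vehicle = model.defineObjectType("vehicle");
        System.out.println(model.nameOf(vehicle)); // prints vehicle
    }
}
```

Without helpers like these, every caller would have to create and wire the three elements by hand and search the Designations to recover a concept's name.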
EMF Extension Approach (as implemented in the code by Andrey Soares that
I sent to this list on August 23)
This approach integrates the SBVR metamodel with the Eclipse Modeling Framework.
Many of the basic MRV concepts are implemented as subtypes of corresponding EMF
classes. For example, an SBVR Noun extends EMF's EClass. The SBVR fact type
"concept specializes concept" is implemented via EClass.eSuperTypes.
Modeled fact types are implemented via EClass.eStructuralFeatures. Aspects of
SBVR that have no equivalent in EMF are modeled the conventional way.
The .sbvr file created for a model is an extension of the .ecore format. For
example, the concept "vehicle" is persisted as:
<eClassifiers xsi:type="sbvr:Noun" UID="NOU_9YI_hrf0Ed2Dmc3NA60pqA" name="Vehicle"/>
Unfortunately, such models cannot be opened with the regular EMF .ecore editor
without more work. (I do believe this is achievable.)
In the EMF extension approach, you can dynamically create instances of the
corresponding EClass subtypes. That is, the code does "EObject _objectDriver1 = new
DynamicEObjectImpl(driver)" (where "driver" is a
subtype of EClass) to create an individual concept. Then a tool can do "_objectDriver1.eSet(Name, "Bill")" to
set attributes of the instance. When you store an instance model this way, you
get an XML file that uses element tags equivalent to the concepts in the SBVR
model. For example:
<Driver Name="Bill" Age="35" rents="Car_A Car_B" isOfAge="true">
As with
the conventional approach, I see a need for utility library methods that
simplify working with the metamodel. Such methods could insulate users from
some but probably not all differences between the two approaches. So I think we
have to commit to one design or the other.
Practically, the current implementation of the EMF extension approach is
incomplete because the following aspects of SBVR are not yet supported. I
believe these points could be addressed if we go down this path.
* EMF does not support associations with more than 2 roles, so an extension
would be needed to support SBVR fact types with more than 2 roles
* The SBVR concept "Role" is not distinguished from the ObjectType
that a role ranges over. This means you can't have a Role that is used in zero
fact types. Nor can Roles have Designations and other attributes, independent
from the ObjectType that a Role ranges over.
* The implementation of AtomicFormulations will be somewhat more complex,
since it will have to deal with FactTypes captured as EMF attributes, EMF
references, or n-way relationships
* Concept names must not contain blanks or other characters that are not valid
in EMF element or attribute names. This is because such characters cause
problems when loading instance documents as described below. I think this can
be solved very easily by escaping these characters.
* IndividualConcepts are not supported, but I see no obstacle to implementing
them.
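To show what escaping concept names might look like, here is one possible scheme sketched in plain Java. The "_xHHHH_" encoding and the NameEscaper class are assumptions chosen for illustration, not something the current code implements. The sketch is deliberately conservative: it keeps only letters, plus digits after the first position, and escapes everything else, including the underscore itself so that escaped names stay unambiguous.

```java
// Hypothetical escaping scheme: any character outside the safe set becomes _xHHHH_,
// where HHHH is the character's code in four hexadecimal digits.
class NameEscaper {
    static String escape(String name) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            // Letters are always kept; digits are kept except in the first position.
            boolean safe = Character.isLetter(c) || (i > 0 && Character.isDigit(c));
            if (safe) {
                out.append(c);
            } else {
                out.append(String.format("_x%04X_", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("driver name")); // prints driver_x0020_name
    }
}
```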
Comparison
The EMF extension approach unifies SBVR with EMF, similar to the way Java, UML,
and XML are already integrated with EMF. The relevant parts of SBVR models
become true EMF models, and thus can make use of the various EMF features. This
approach exploits EMF's ability to store and load XML documents that use
element and attribute names corresponding to EMF class and attribute names. I
believe that with some more work, one could use the EMF generator to create a
set of Java classes corresponding to the concepts in an SBVR model. Going
further, one could use the EMF infrastructure to do a lot more with SBVR
models. To put it another way, integrating MRV closely with EMF makes each SBVR
business model also a PIM-level model represented in EMF. And then the EMF
features can support some mappings to PSM models in Java, XML, etc.
Another advantage of the EMF extension approach is support for dynamically
instantiating instances as EMF EObjects. This advantage becomes important if a
tool implements reasoning on such instances -- what the OWL world calls
"Abox" reasoning (see http://en.wikipedia.org/wiki/TBox).
For example, imagine a tool that -- given "today is November 21,
2008" and "the accident happened 3 days ago" -- knows how to
reason that "the accident happened on November 18, 2008". The closest
equivalent with the conventional approach is the ability to model individual
concepts, but (a) individual concepts are defined at modeling time, whereas
instances can be loaded at runtime, e.g. from a database; (b) I expect that
such a tool might be able to reason using EObjects, but would have a very hard
time with individual concepts managed as a bunch of facts, as they are in the
conventional approach.
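The date example amounts to a very small computation once the facts are available as ordinary runtime objects. A minimal sketch in plain Java follows; the class and method names are hypothetical, and a real reasoner would of course read "today" and the offset from instance data rather than from literals.

```java
import java.time.LocalDate;

// Hypothetical helper: resolves "N days ago" against a known "today".
class RelativeDateReasoning {
    static LocalDate daysAgo(LocalDate today, int days) {
        return today.minusDays(days);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2008, 11, 21);
        System.out.println(daysAgo(today, 3)); // prints 2008-11-18
    }
}
```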
The principal downside I see with the EMF extension approach is the risk that
future changes in EMF could break the MRV implementation. This risk arises from
the fact that the implementation depends upon some aspects of the EMF design.
On the other hand, the EMF design is pretty open, so it would be hard to change
it in significant ways without breaking lots of other code.
--------------------------------
Mark H. Linehan
STSM, Model Driven Business Transformation
IBM Research
phone: (914) 945-1038 or IBM tieline 862-1038
internet: mlinehan@xxxxxxxxxx