From: mdt-sbvr.dev-bounces@xxxxxxxxxxx [mailto:mdt-sbvr.dev-bounces@xxxxxxxxxxx]
On Behalf Of Mark H Linehan
Sent: Tuesday, November 25, 2008 2:52 PM
To: SBVR developer list
Subject: RE: [mdt-sbvr.dev] Thoughts about two approaches to modeling
the Meaning & Representation (MRV) part of SBVR
Stan,
I have inserted responses like this.
--------------------------------
Mark H. Linehan
STSM, Model Driven Business Transformation
IBM Research
phone: (914) 945-1038 or IBM tieline 862-1038
internet: mlinehan@xxxxxxxxxx
"Stan Hendryx" <stan@xxxxxxxxxxxxxxxx>
Sent by: mdt-sbvr.dev-bounces@xxxxxxxxxxx
11/22/2008 01:14 PM
Please respond to: SBVR developer list <mdt-sbvr.dev@xxxxxxxxxxx>
To: "'SBVR developer list'" <mdt-sbvr.dev@xxxxxxxxxxx>
Subject: RE: [mdt-sbvr.dev] Thoughts about two approaches to modeling
the Meaning & Representation (MRV) part of SBVR
Mark,
Thank you and Andrey for this. As I understand it, the main difference between
the two approaches is that in the Conventional Approach, SBVR noun concepts are
mapped to instances of EClass, and in the EMF Extension Approach, SBVR noun
concepts are mapped to subclasses of EClass. Thus, in the EMF Extension
Approach, an instance of a mapped subclass of EClass represents an instance of
the mapped SBVR concept. Is this correct?
Mark: Yes, this is correct.
I would like to offer comments on some of the issues you raised, and a
variation on your EMF Extension proposal.
* EMF does not support associations with more than 2 roles, so an extension
would be needed to support SBVR fact types with more than 2 roles
Have you considered objectifying fact types and representing them as noun
concepts? With objectification, an instance of the noun concept is an instance
of the fact type. I suggest objectifying all fact types with more than two
roles, and all binary fact types where the opposite role does not range over an
expression or a number. Characteristics, and binary fact types where the
opposite role ranges over an expression (text, icon, image, multi-media file)
or a number, would be modeled as eAttributes of the EClass subclass that
represents the noun concept. In this approach, “FactType” is a subclass of
EClass, along with “NounConcept,” and each fact type becomes a subclass of
FactType. (I think the name of your “Noun” class should be “NounConcept” as in
SBVR, since the class represents the concept, not a word or phrase that
designates the concept, which word is a noun.) Each FactType has an attribute
for each role. The eReferences are used to link NounConcepts to their
participating roles in FactTypes, and to link each role of a FactType to the
NounConcept over which it ranges. This architecture is similar to that of
Object Role Modeling (ORM), except that ORM has no attributes: it has only
objects and roles, plus value types of objects for literals. We could do the
same, using eAttributes only for “text” and “number” values.
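The objectification idea can be illustrated without EMF. The following is only a minimal sketch in plain Java: NounConcept, FactType, and Fact here are hypothetical stand-ins (not the mdt-sbvr or EMF classes), and the three-role fact type "driver rents car from branch" is invented for illustration. It shows a fact type carrying one named role per participating noun concept, and a fact as an instance in which every role is filled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ObjectifiedFactTypeSketch {

    /** Hypothetical stand-in for a noun concept (a subclass of EClass in the proposal). */
    static class NounConcept {
        final String name;

        NounConcept(String name) {
            this.name = name;
        }
    }

    /** Hypothetical objectified fact type: one named role per participating concept. */
    static class FactType {
        final String name;
        // Role name -> noun concept the role ranges over, in declaration order.
        final Map<String, NounConcept> roles = new LinkedHashMap<>();

        FactType(String name) {
            this.name = name;
        }

        FactType addRole(String roleName, NounConcept rangesOver) {
            roles.put(roleName, rangesOver);
            return this;
        }
    }

    /** An instance of an objectified fact type, i.e. a fact with every role filled. */
    static class Fact {
        final FactType type;
        final Map<String, Object> fillers;

        Fact(FactType type, Map<String, Object> fillers) {
            // Every fact type role must be filled in a fact type instance.
            if (!fillers.keySet().equals(type.roles.keySet())) {
                throw new IllegalArgumentException("roles and fillers do not match");
            }
            this.type = type;
            this.fillers = new LinkedHashMap<>(fillers);
        }
    }

    /** Builds an illustrative three-role fact type; the names are invented for this sketch. */
    static FactType rentalFactType() {
        NounConcept person = new NounConcept("person");
        NounConcept car = new NounConcept("car");
        NounConcept branch = new NounConcept("branch");
        return new FactType("driver rents car from branch")
                .addRole("driver", person)
                .addRole("car", car)
                .addRole("branch", branch);
    }

    public static void main(String[] args) {
        FactType rental = rentalFactType();
        Map<String, Object> fillers = new LinkedHashMap<>();
        fillers.put("driver", "Bill");
        fillers.put("car", "Car_A");
        fillers.put("branch", "Downtown");
        Fact fact = new Fact(rental, fillers);
        System.out.println(fact.type.name + " " + fact.fillers);
    }
}
```

Because the fact type is an ordinary object with a role map, adding a fourth role does not require any n-ary association support from the underlying framework, which is the point of objectifying.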
Mark: Andrey didn't have time to consider fact types with more than 2 roles.
Objectifying them is certainly a possibility worth evaluating versus the main
alternative of introducing a new kind of "EFactType" concept in EMF.
Mark: The current implementation handles characteristics and binary fact types
that range over texts and numbers as "Characteristics" or "DataProperties",
which extend both EMF's EAttribute and SBVR's FactType. The implementation
handles other "has" binary fact types as "ObjectProperties", which extend both
EMF's EReference and SBVR's FactType. It handles non-has binary fact types as
two instances of "BinaryFactTypes" that extend both EReference and SBVR's
FactType. This mapping of one binary fact type to two instances is a limitation
that I meant to identify in my last email, but forgot. What you suggest seems
like an attractive alternative.
Stan: I see classes in Andrey’s
UML diagrams with names that begin with an underscore and are colored dark grey,
e.g. “_EClass”, stereotyped <<EClass>>. This is
evidently some kind of EMF-ism I’m not familiar with. Can you comment on
what this means?
Mark: I agree about the class named "NounConcept" versus "Noun".
* The SBVR concept "Role" is not distinguished from the ObjectType
that a role ranges over. This means you can't have a Role that is used in zero
fact types. Nor can Roles have Designations and other attributes, independent
from the ObjectType that a Role ranges over.
The approach outlined above solves this problem by allowing roles to be
NounConcepts. It is not necessary that a noun concept participate in a fact
type.
Mark: If I understand you correctly, a role would be represented by an
attribute of an objectified FactType. So how could a role exist independently
of a fact type?
Stan: Roles and fact type roles
are noun concepts. SBVR constrains fact type roles to always be filled in fact
type instances, so that is a constraint. There is no such constraint on
situational roles. For example, Bill might pre-register (be instantiated) as a
driver (situational role of a person), and be instantiated in that role each
time he rents a car, filling the fact type role “driver” in “rental
has driver.”
The approach also appears to preserve the semantic stability property of fact
models, wherein adding additional fact types does not change the structure of
an existing model, as happens with object models. For instance, you have Driver
as a subclass of EClass in your example. Suppose you were to extend your model
so that driver is a role of a person, and add a Person subclass of EClass. Some
of the attributes of Driver would need to be refactored to keep the model in
third normal form*, and the Driver class would have Person as an eSupertype.
Mark: I agree that the objectified binary fact types that do not range over
text and integers would remain "stable" in this situation. But any former
characteristics or text attributes or binary attributes of the old Driver would
presumably be moved to the new Person class.
Stan: Right. That movement is
what I meant by refactoring.
Some people object to roles being subtypes, and would prefer
an associative relationship to a generic one: a person has the
role of Scoutmaster, rather than a Scoutmaster is a person. SBVR and natural
language make a generic relationship between roles and the object types that
can fill those roles, by specifying a general concept for the role, as in
“driver: person that …”.
Mark: There is also the SBVR relationship "role ranges over object type".
So I read "driver: person that ..." (where "driver" is a
role) as saying that "driver" ranges over the concept
"person". I think SBVR is confusing in this area. I also think it is
better to distinguish the "ranges over" relationship from the
"specializes" relationship since they mean different things.
Stan: Yes, there is a certain
amount of controversy and confusion about this. I agree with your reading. I
see no reason to force a distinction in cases like this. Where the definition
is informal and does not give a general concept, it seems like you would have
to use “ranges over”. This is the explanation in the spec, p.23ff:
“Saying that a role ranges over
an object type is similar to saying the role specializes the object type in
that the role incorporates every characteristic incorporated by the object
type, and therefore, each instance of the role is necessarily an instance of
the object type. But “ranges over” is different in that it allows
that both the role and the object type incorporate the same characteristics -
the object type can incorporate a characteristic that its instances fill that
role.” This is important when specifying job descriptions, say, where you
list characteristics of the role, and look for matching characteristics in a
person who will fill that role. In your example, there may be an essential
characteristic that the driver is of age. This can be stated as part of the
definition of “driver” or as a necessity separate from the
definition. The “person” concept can also have a characteristic “is of age”. The presumption is that
the characteristic of the person matches that of the role.
* The normal forms are most often associated with relational
database designs, but are applicable to conceptual models as well, especially
where reference schemes provide “keys” and key roles for concepts. UML
and Java, of course, rely on object references (handles) as surrogate
identifiers, rather than keys, to resolve references in implementations. Keys
and key attributes have to be patched onto UML, through some form of
annotation.
* The implementation of AtomicFormulations will be somewhat more complex,
since it will have to deal with FactTypes captured as EMF attributes, EMF
references, or n-way relationships
The implementation of atomic formulations is trivial in the approach outlined
above: each instance of a subclass of FactType represents an atomic
formulation.
Mark: I think each instance of a subclass of a FactType represents a fact, not
an atomic formulation. Furthermore, the roles of an atomic formulation need to
be able to bind to variables as well as IndividualConstants and Expressions. It
is not clear to me how an instance of a subclass of FactType would bind to a
variable.
Stan: Variable binding involves
the mediation of a quantifier. An instance of a fact type is an atomic fact, a
fact in which the roles are bound to individuals. Facts that involve variables always
involve quantifiers. All variables in such a fact must be bound to a quantifier.
Such a fact is not an atomic fact, not an instance of a fact type, but may involve
many instances of one or more fact types.
* Concept names must not contain blanks or other characters that are not valid
in EMF element or attribute names. This is because such characters cause
problems when loading instance documents as described below. I think this can
be solved very easily by escaping these characters.
Since last spring, courtesy of Dave Carlson and Kenn Hussey,
EMF can apply a camel case transformation to class and attribute names when
importing a UML model. I suggest we use that: “noun concept”
becomes “NounConcept.”
Mark: Sounds good.
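To make the naming rule concrete, here is a minimal sketch of such a transformation in plain Java. It is not the EMF/UML2 importer's own code; the class and method names are hypothetical, and it only handles blank-separated words (other characters that are invalid in element names would still need escaping or removal).

```java
public class CamelCaseNames {

    /** Converts a blank-separated concept name to upper camel case. */
    static String toUpperCamelCase(String conceptName) {
        StringBuilder result = new StringBuilder();
        for (String word : conceptName.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // only happens for an empty or all-blank input
            }
            // Capitalize the first character and keep the rest of the word as written.
            result.append(Character.toUpperCase(word.charAt(0)));
            result.append(word.substring(1));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(toUpperCamelCase("noun concept"));
        System.out.println(toUpperCamelCase("fact type role"));
    }
}
```

Note that the mapping is one-way: two concept names that differ only in spacing or in the case of a word's first letter would collapse to the same class name, so a real implementation would presumably keep the original designation alongside the generated name.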
* IndividualConcepts are not supported, but I see no obstacle to implementing
them.
I noticed the caveat from Ed Merks on the penultimate page of Andrey’s
presentation, about the hazards of extending EMF vs. annotating it. Can anyone
say more about what we might lose, what pitfalls we might encounter, by going
with the EMF Extension approach?
Mark: I think it's a general warning. I don't know of specific issues.
Stan: I wonder what the EMF
generator does with subclasses of EClass? Might some of the generated code not
work, or be missing, or what? You mentioned something about needing to redo the
editor. How complicated is that?
Stan
From: mdt-sbvr.dev-bounces@xxxxxxxxxxx [mailto:mdt-sbvr.dev-bounces@xxxxxxxxxxx]
On Behalf Of Mark H Linehan
Sent: Friday, November 21, 2008 1:55 PM
To: mdt-sbvr.dev@xxxxxxxxxxx
Subject: [mdt-sbvr.dev] Thoughts about two approaches to modeling
the Meaning & Representation (MRV) part of SBVR
As I've discussed in previous notes, over the last week I've experimented with
the MRV implementation in the mdt-sbvr libraries. My goal has been to compare
this "conventional approach" to what I have called the "EMF extension"
approach. See my notes of May 16 and August 23 to this mailing list. The
purpose of this email is to summarize what I see in the comparison. I start by
describing these two designs, and then I compare them.
Conventional Approach (as implemented in the checked-in code)
This is a direct implementation of the SBVR metamodel, created by converting
that metamodel to a corresponding EMF model. The classes and associations that
you see in the EMF model are (mostly) the concepts and fact types described in
the SBVR specification. Variances are due either to aspects that are not (yet)
implemented, or to additional methods introduced in the implementation to
simplify use of the metamodel.
The SBVR specification is fact-oriented, which means that modeled concepts and
fact types are treated as simply "facts about the model". So if you
look at the .sbvr file created in this approach, you see a bunch of
"PackagedElements" that represent those facts. For example, the
concept called "vehicle" is handled as three facts: (1) the fact that
an ObjectType exists; (2) the fact that a Text with the value "vehicle"
exists; and (3) the fact that a Designation exists with a meaning that is the
ObjectType and a representation that is the Text. Here's an example:
<packagedElement xsi:type="mrv:ObjectType"/>
<packagedElement xsi:type="mrv:Designation"
    meaning="//@packagedElement.6" expression="//@packagedElement.8"/>
<packagedElement xsi:type="mrv:Text" value="Vehicle"/>
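For readers who think in code, the three-facts pattern can be sketched in plain Java as below. This is only an illustration under stated assumptions: ObjectType, Text, and Designation here are hypothetical stand-in classes, not the EMF-generated mdt-sbvr API, and defineConcept is an invented helper of the kind a utility library might provide.

```java
import java.util.ArrayList;
import java.util.List;

public class VehicleFactsSketch {

    /** Fact 1: an object type exists (it carries no name of its own). */
    static class ObjectType {
    }

    /** Fact 2: a text with a given value exists. */
    static class Text {
        final String value;

        Text(String value) {
            this.value = value;
        }
    }

    /** Fact 3: a designation ties a meaning (the object type) to a representation (the text). */
    static class Designation {
        final ObjectType meaning;
        final Text expression;

        Designation(ObjectType meaning, Text expression) {
            this.meaning = meaning;
            this.expression = expression;
        }
    }

    /** Hypothetical helper: creates the three packaged elements behind one named concept. */
    static List<Object> defineConcept(String name) {
        ObjectType objectType = new ObjectType();
        Text text = new Text(name);
        Designation designation = new Designation(objectType, text);

        // Same order as in the stored example: object type, designation, text.
        List<Object> packagedElements = new ArrayList<>();
        packagedElements.add(objectType);
        packagedElements.add(designation);
        packagedElements.add(text);
        return packagedElements;
    }

    public static void main(String[] args) {
        List<Object> elements = defineConcept("vehicle");
        Designation designation = (Designation) elements.get(1);
        System.out.println(elements.size() + " elements; designation text = "
                + designation.expression.value);
    }
}
```

A caller sees one call per concept, while the model still holds three separate elements linked by reference.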
IndividualConcepts are modeled with facts of the existence of an
IndividualConcept (e.g. a car) or of an Actuality that uses an
AtomicFormulation (e.g. for "car has driver name Bill"). So, when
stored, they show up as more facts of the form:
<packagedElement xsi:type="mrv:IndividualConcept"
general="//@packagedElement.9"/>
... where "//@packagedElement.9" is the ObjectType for
"car".
... and there are other lines that represent Actualities involving the
IndividualConcept. I can't show those yet because they depend upon
AtomicFormulation, which isn't implemented.
There is no provision for modeling instances (as opposed to
IndividualConcepts). That is, you can't directly store, load, or work with
instances of SBVR vocabulary concepts.
The EMF-generated implementation is very low-level. To work directly with it,
you have to code the relationships among all the elements of the SBVR
metamodel. For example, you have to explicitly write code to support the three
facts given in a paragraph above. Dave has created some utility library methods
that simplify this, and I created a bunch more which you can see at the end of
my "sbvrTest.java" experiment. I believe these utility functions will
be absolutely necessary to shield implementation users from having to
understand the low-level relationships.
EMF Extension Approach (as implemented in the code by Andrey Soares that I
sent to this list on August 23)
This approach integrates the SBVR metamodel with the Eclipse Modeling
Framework. Many of the basic MRV concepts are implemented as subtypes of
corresponding EMF classes. For example, an SBVR Noun extends EMF's EClass. The
SBVR fact type "concept specializes concept" is implemented via
EClass.eSuperTypes. Modeled fact types are implemented via
EClass.eStructuralFeatures. Aspects of SBVR that have no equivalent in EMF are
modeled the conventional way.
The .sbvr file created for a model is an extension of the .ecore format. For
example, the concept "vehicle" is persisted as:
<eClassifiers xsi:type="sbvr:Noun"
UID="NOU_9YI_hrf0Ed2Dmc3NA60pqA" name="Vehicle"/>
Unfortunately, such models cannot be opened with the regular EMF .ecore editor
without more work. (I do believe this is achievable.)
In the EMF extension approach, you can dynamically create instances of the
corresponding EClass subtypes. That is, the code does
"EObject _objectDriver1 = new DynamicEObjectImpl(driver)" (where "driver" is a
subtype of EClass) to create an individual concept. Then a tool can do
"_objectDriver1.eSet(Name, "Bill")" to set attributes of the instance.
When you store an instance model this way, you get an XML file that uses
element tags equivalent to the concepts in the SBVR model. For example:
<Driver Name="Bill" Age="35"
rents="Car_A Car_B" isOfAge="true">
As with the conventional approach, I see a need for utility library methods
that simplify working with the metamodel. Such methods could insulate users
from some but probably not all differences between the two approaches. So I
think we have to commit to one design or the other.
Practically, the current implementation of the EMF extension approach is
incomplete because the following aspects of SBVR are not yet supported. I
believe these points could be addressed if we go down this path.
* EMF does not support associations with more than 2 roles, so an extension
would be needed to support SBVR fact types with more than 2 roles
* The SBVR concept "Role" is not distinguished from the ObjectType
that a role ranges over. This means you can't have a Role that is used in zero
fact types. Nor can Roles have Designations and other attributes, independent
from the ObjectType that a Role ranges over.
* The implementation of AtomicFormulations will be somewhat more complex,
since it will have to deal with FactTypes captured as EMF attributes, EMF
references, or n-way relationships
* Concept names must not contain blanks or other characters that are not valid
in EMF element or attribute names. This is because such characters cause
problems when loading instance documents as described below. I think this can
be solved very easily by escaping these characters.
* IndividualConcepts are not supported, but I see no obstacle to implementing
them.
Comparison
The EMF extension approach unifies SBVR with EMF, similar to the way Java, UML,
and XML are already integrated with EMF. The relevant parts of SBVR models
become true EMF models, and thus can make use of the various EMF features. This
approach exploits EMF's ability to store and load XML documents that use
element and attribute names corresponding to EMF class and attribute names. I
believe that with some more work, one could use the EMF generator to create a
set of Java classes corresponding to the concepts in an SBVR model. Going
further, one could use the EMF infrastructure to do a lot more with SBVR
models. To put it another way, integrating MRV closely with EMF makes each SBVR
business model also a PIM-level model represented in EMF. And then the EMF
features can support some mappings to PSM models in Java, XML, etc.
Another advantage of the EMF extension approach is support for dynamically
creating instances as EMF EObjects. This advantage becomes important if a
tool implements reasoning on such instances -- what the OWL world calls
"Abox" reasoning (see http://en.wikipedia.org/wiki/TBox). For example, imagine a tool that -- given
"today is November 21, 2008" and "the accident happened 3 days
ago" -- knows how to reason that "the accident happened on November
18, 2008". The closest equivalent with the conventional approach is the
ability to model individual concepts, but (a) individual concepts are defined
at modeling time, whereas instances can be loaded at runtime, e.g. from a
database; (b) I expect that such a tool might be able to reason using EObjects,
but would have a very hard time with individual concepts managed as a bunch of
facts, as they are in the conventional approach.
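The date inference in that example is simple to state in code once the instance data is available as ordinary objects. The sketch below uses only java.time from the Java standard library; the class and method names are hypothetical, and it stands in for whatever reasoning a real tool would do over EObjects.

```java
import java.time.LocalDate;

public class RelativeDateReasoning {

    /** Resolves "N days ago" against a known value of "today". */
    static LocalDate daysAgo(LocalDate today, int days) {
        return today.minusDays(days);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2008, 11, 21);   // "today is November 21, 2008"
        LocalDate accident = daysAgo(today, 3);         // "the accident happened 3 days ago"
        System.out.println("The accident happened on " + accident);
    }
}
```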
The principal downside I see with the EMF extension approach is the risk that
future changes in EMF could break the MRV implementation. This risk arises from
the fact that the implementation depends upon some aspects of the EMF design.
On the other hand, the EMF design is pretty open, so it would be hard to change
it in significant ways without breaking lots of other code.
--------------------------------
Mark H. Linehan
STSM, Model Driven Business Transformation
IBM Research
phone: (914) 945-1038 or IBM tieline 862-1038
internet: mlinehan@xxxxxxxxxx