RE: [mdt-sbvr.dev] Thoughts about two approaches to modeling the Meaning


RE: [mdt-sbvr.dev] Thoughts about two approaches to modeling the Meaning & Representation (MRV) part of SBVR

From: "Stan Hendryx" <stan@xxxxxxxxxxxxxxxx>
Date: Sat, 22 Nov 2008 10:14:28 -0800
Delivered-to: mdt-sbvr.dev@xxxxxxxxxxx
Organization: Hendryx & Associates

Mark,

Thank you and Andrey for this. As I understand it, the main difference between the two approaches is that in the Conventional Approach, SBVR noun concepts are mapped to instances of EClass, and in the EMF Extension Approach, SBVR noun concepts are mapped to subclasses of EClass. Thus, in the EMF Extension Approach, an instance of a mapped subclass of EClass represents an instance of the mapped SBVR concept. Is this correct?

I would like to offer comments on some of the issues you raised, and a variation on your EMF Extension proposal.

* EMF does not support associations with more than 2 roles, so an extension would be needed to support SBVR fact types with more than 2 roles.

Have you considered objectifying fact types and representing them as noun concepts? With objectification, an instance of the noun concept is an instance of the fact type. I suggest objectifying all fact types with more than two roles, and all binary fact types where the opposite role does not range over an expression or a number. Characteristics, and binary fact types where the opposite role ranges over an expression (text, icon, image, multi-media file) or a number, would be modeled as eAttributes of the EClass subclass that represents the noun concept. In this approach, “FactType” is a subclass of EClass, along with “NounConcept,” and each fact type becomes a subclass of FactType. (I think the name of your “Noun” class should be “NounConcept” as in SBVR, since the class represents the concept, not a word or phrase that designates the concept, which word is a noun.) Each FactType has an attribute for each role. The eReferences are used to link NounConcepts to their participating roles in FactTypes, and to link each role of a FactType to the NounConcept over which it ranges. This architecture is similar to that of Object Role Modeling (ORM), except that objects in ORM have no attributes, only objects and roles, and value types of objects for literals. We could do the same, using eAttributes only for “text” and “number” values.
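To make the objectification idea concrete, here is a rough sketch in plain Java. It is only an illustration under stated assumptions: the classes SketchNounConcept and SketchFactType are hypothetical stand-ins for the proposed EClass subclasses, not EMF or mdt-sbvr API, and the ternary fact type "person rents car from branch" is an invented example. It shows one reading of the proposal, in which a fact type is itself a noun concept that carries one named role per participant.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the proposed "NounConcept" subclass of EClass.
class SketchNounConcept {
    final String name;

    SketchNounConcept(String name) {
        this.name = name;
    }
}

// An objectified fact type is itself a noun concept with one named role
// per participant. Hypothetical; not part of EMF or mdt-sbvr.
class SketchFactType extends SketchNounConcept {
    // Role name -> noun concept that the role ranges over, in declaration order.
    final Map<String, SketchNounConcept> roles = new LinkedHashMap<>();

    SketchFactType(String name) {
        super(name);
    }

    SketchFactType role(String roleName, SketchNounConcept rangesOver) {
        roles.put(roleName, rangesOver);
        return this;
    }

    int arity() {
        return roles.size();
    }
}

class ObjectificationSketch {
    // Builds the invented ternary fact type "person rents car from branch".
    static SketchFactType rental() {
        SketchNounConcept person = new SketchNounConcept("Person");
        SketchNounConcept car = new SketchNounConcept("Car");
        SketchNounConcept branch = new SketchNounConcept("Branch");
        return new SketchFactType("PersonRentsCarFromBranch")
                .role("renter", person)
                .role("rentedCar", car)
                .role("rentingBranch", branch);
    }
}
```

Because the fact type is just another concept with three binary links, nothing in this sketch needs an association with more than two ends.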

* The SBVR concept "Role" is not distinguished from the ObjectType that a role ranges over. This means you can't have a Role that is used in zero fact types. Nor can Roles have Designations and other attributes, independent from the ObjectType that a Role ranges over.

The approach outlined above solves this problem by allowing roles to be NounConcepts. It is not necessary that a noun concept participate in a fact type. The approach also appears to preserve the semantic stability property of fact models, wherein adding additional fact types does not change the structure of an existing model, as happens with object models. For instance, you have Driver as a subclass of EClass in your example. Suppose you were to extend your model so that driver is a role of a person, and add a Person subclass of EClass. Some of the attributes of Driver would need to be refactored to keep the model in third normal form*, and the Driver class would have Person as an eSupertype. Some people object to roles being subtypes, and would prefer an associative relationship to a generic one: a person has the role of Scoutmaster, rather than a Scoutmaster is a person. SBVR and natural language make a generic relationship between roles and the object types that can fill those roles, by specifying a general concept for the role, as in “driver: person that …”.

* The normal forms are most often associated with relational database designs, but are applicable to conceptual models as well, especially where reference schemes provide “keys” and key roles for concepts. UML and Java, of course, rely on object references (handles) as surrogate identifiers, rather than keys, to resolve references in implementations. Keys and key attributes have to be patched onto UML, through some form of annotation.

* The implementation of AtomicFormulations will be somewhat more complex, since it will have to deal with FactTypes captured as EMF attributes, EMF references, or n-way relationships.

The implementation of atomic formulations is trivial in the approach outlined above: each instance of a subclass of FactType represents an atomic formulation.

* Concept names must not contain blanks or other characters that are not valid in EMF element or attribute names. This is because such characters cause problems when loading instance documents as described below. I think this can be solved very easily by escaping these characters.

Since last spring, courtesy of Dave Carlson and Kenn Hussey, EMF can apply a camel case transformation to class and attribute names when importing a UML model. I suggest we use that: “noun concept” becomes “NounConcept.”
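For illustration only, the kind of transformation I have in mind looks like the sketch below. This is not the EMF or UML2 importer code; toUpperCamel is a hypothetical helper that handles blanks only and does nothing about other characters that are invalid in EMF names.

```java
class CamelCaseSketch {
    // Upper-cases the first letter of each whitespace-separated word and
    // joins the words without separators. Hypothetical helper, not EMF API.
    static String toUpperCamel(String designation) {
        StringBuilder out = new StringBuilder();
        for (String word : designation.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only when the input is blank
            }
            out.append(Character.toUpperCase(word.charAt(0)));
            out.append(word.substring(1));
        }
        return out.toString();
    }
}
```

With this helper, toUpperCamel("noun concept") gives "NounConcept". Note that the mapping is not reversible in general, so the original designation text would still need to be kept somewhere in the model.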

* IndividualConcepts are not supported, but I see no obstacle to implementing them.

I noticed the caveat from Ed Merks on the penultimate page of Andrey’s presentation, about the hazards of extending EMF vs. annotating it. Can anyone say more about what we might lose, what pitfalls we might encounter, by going with the EMF Extension approach?

Stan

From: mdt-sbvr.dev-bounces@xxxxxxxxxxx [mailto:mdt-sbvr.dev-bounces@xxxxxxxxxxx] On Behalf Of Mark H Linehan
Sent: Friday, November 21, 2008 1:55 PM
To: mdt-sbvr.dev@xxxxxxxxxxx
Subject: [mdt-sbvr.dev] Thoughts about two approaches to modeling the Meaning & Representation (MRV) part of SBVR

As I've discussed in previous notes, over the last week I've experimented with the MRV implementation in the mdt-sbvr libraries. My goal has been to compare this "conventional approach" to what I have called the "EMF extension" approach. See my notes of May 16 and August 23 to this mailing list. The purpose of this email is to summarize what I see in the comparison. I start by describing these two designs, and then I compare them.

Conventional Approach (as implemented in the checked-in code)

This is a direct implementation of the SBVR metamodel, created by converting that metamodel to a corresponding EMF model. The classes and associations that you see in the EMF model are (mostly) the concepts and fact types described in the SBVR specification. Variances are due either to aspects that are not (yet) implemented, or to additional methods introduced in the implementation to simplify use of the metamodel.

The SBVR specification is fact-oriented, which means that modeled concepts and fact types are treated as simply "facts about the model". So if you look at the .sbvr file created in this approach, you see a bunch of "PackagedElements" that represent those facts. For example, the concept called "vehicle" is handled as three facts: (1) the fact that an ObjectType exists; (2) the fact that a Text with the value "vehicle" exists; and (3) the fact that a Designation exists with a meaning that is the ObjectType and a representation that is the Text. Here's an example:
<packagedElement xsi:type="mrv:ObjectType"/>
<packagedElement xsi:type="mrv:Designation" meaning="//@packagedElement.6" expression="//@packagedElement.8"/>
<packagedElement xsi:type="mrv:Text" value="Vehicle"/>

IndividualConcepts are modeled with facts of the existence of an IndividualConcept (e.g. a car) or of an Actuality that uses an AtomicFormulation (e.g. for "car has driver name Bill"). So, when stored, they show up as more facts of the form:
<packagedElement xsi:type="mrv:IndividualConcept" general="//@packagedElement.9"/>

... where "//@packagedElement.9" is the ObjectType for "car".

... and there are other lines that represent Actualities involving the IndividualConcept. I can't show those yet because they depend upon AtomicFormulation, which isn't implemented.

There is no provision for modeling instances (as opposed to IndividualConcepts). That is, you can't directly store, load, or work with instances of SBVR vocabulary concepts.

The EMF-generated implementation is very low-level. To work directly with it, you have to code the relationships among all the elements of the SBVR metamodel. For example, you have to explicitly write code to support the three facts given in a paragraph above. Dave has created some utility library methods that simplify this, and I created a bunch more which you can see at the end of my "sbvrTest.java" experiment. I believe these utility functions will be absolutely necessary to shield implementation users from having to understand the low-level relationships.
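To give a feel for what such a utility method hides, here is a minimal sketch in plain Java. It is not the code from the mdt-sbvr libraries or from "sbvrTest.java"; FactStoreSketch, Element, and defineObjectType are hypothetical names, and the list simply stands in for the sequence of packagedElement entries in the .sbvr file.

```java
import java.util.ArrayList;
import java.util.List;

class FactStoreSketch {
    // One entry stands in for one <packagedElement>. The field names mirror
    // the XML attributes shown earlier; this is not a real mdt-sbvr class.
    static class Element {
        final String type;
        final String value;   // used by mrv:Text only, otherwise null
        final int meaning;    // index of the meaning element, or -1
        final int expression; // index of the expression element, or -1

        Element(String type, String value, int meaning, int expression) {
            this.type = type;
            this.value = value;
            this.meaning = meaning;
            this.expression = expression;
        }
    }

    final List<Element> packagedElements = new ArrayList<>();

    // Appends an element and returns its position in the list.
    private int add(Element e) {
        packagedElements.add(e);
        return packagedElements.size() - 1;
    }

    // Records the three facts behind one designated concept and returns
    // the index of the ObjectType.
    int defineObjectType(String designationText) {
        int objectType = add(new Element("mrv:ObjectType", null, -1, -1));
        int text = add(new Element("mrv:Text", designationText, -1, -1));
        add(new Element("mrv:Designation", null, objectType, text));
        return objectType;
    }
}
```

A caller then writes one line, defineObjectType("vehicle"), instead of creating and cross-linking the ObjectType, Text, and Designation by hand.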

EMF Extension Approach (as implemented in the code by Andrey Soares that I sent to this list on August 23)

This approach integrates the SBVR metamodel with the Eclipse Modeling Framework. Many of the basic MRV concepts are implemented as subtypes of corresponding EMF classes. For example, an SBVR Noun extends EMF's EClass. The SBVR fact type "concept specializes concept" is implemented via EClass.eSuperTypes. Modeled fact types are implemented via EClass.eStructuralFeatures. Aspects of SBVR that have no equivalent in EMF are modeled the conventional way.

The .sbvr file created for a model is an extension of the .ecore format. For example, the concept "vehicle" is persisted as:

<eClassifiers xsi:type="sbvr:Noun" UID="NOU_9YI_hrf0Ed2Dmc3NA60pqA" name="Vehicle"/>

Unfortunately, such models cannot be opened with the regular EMF .ecore editor without more work. (I do believe this is achievable.)

In the EMF extension approach, you can dynamically create instances of the corresponding EClass subtypes. That is, the code does "EObject _objectDriver1 = new DynamicEObjectImpl(driver)" (where "driver" is a subtype of EClass) to create an individual concept. Then a tool can do "_objectDriver1.eSet(Name, "Bill")" to set attributes of the instance. When you store an instance model this way, you get an XML file that uses element tags equivalent to the concepts in the SBVR model. For example:

As with the conventional approach, I see a need for utility library methods that simplify working with the metamodel. Such methods could insulate users from some but probably not all differences between the two approaches. So I think we have to commit to one design or the other.

Practically, the current implementation of the EMF extension approach is incomplete because the following aspects of SBVR are not yet supported. I believe these points could be addressed if we go down this path.

* EMF does not support associations with more than 2 roles, so an extension would be needed to support SBVR fact types with more than 2 roles.
* The SBVR concept "Role" is not distinguished from the ObjectType that a role ranges over. This means you can't have a Role that is used in zero fact types. Nor can Roles have Designations and other attributes, independent from the ObjectType that a Role ranges over.
* The implementation of AtomicFormulations will be somewhat more complex, since it will have to deal with FactTypes captured as EMF attributes, EMF references, or n-way relationships.
* Concept names must not contain blanks or other characters that are not valid in EMF element or attribute names. This is because such characters cause problems when loading instance documents as described below. I think this can be solved very easily by escaping these characters.
* IndividualConcepts are not supported, but I see no obstacle to implementing them.

Comparison

The EMF extension approach unifies SBVR with EMF, similar to the way Java, UML, and XML are already integrated with EMF. The relevant parts of SBVR models become true EMF models, and thus can make use of the various EMF features. This approach exploits EMF's ability to store and load XML documents that use element and attribute names corresponding to EMF class and attribute names. I believe that with some more work, one could use the EMF generator to create a set of Java classes corresponding to the concepts in an SBVR model. Going further, one could use the EMF infrastructure to do a lot more with SBVR models. To put it another way, integrating MRV closely with EMF makes each SBVR business model also a PIM-level model represented in EMF. And then the EMF features can support some mappings to PSM models in Java, XML, etc.

Another advantage of the EMF extension approach is support for dynamically creating instances as EMF EObjects. This advantage becomes important if a tool implements reasoning on such instances -- what the OWL world calls "ABox" reasoning (see http://en.wikipedia.org/wiki/TBox). For example, imagine a tool that -- given "today is November 21, 2008" and "the accident happened 3 days ago" -- knows how to reason that "the accident happened on November 18, 2008". The closest equivalent with the conventional approach is the ability to model individual concepts, but (a) individual concepts are defined at modeling time, whereas instances can be loaded at runtime, e.g. from a database; and (b) I expect that such a tool might be able to reason using EObjects, but would have a very hard time with individual concepts managed as a bunch of facts, as they are in the conventional approach.
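The date inference in that example is easy once the values are available as ordinary runtime objects. As a toy sketch (using java.time from the Java standard library rather than EObjects, with RelativeDateSketch and daysAgo as hypothetical names):

```java
import java.time.LocalDate;

class RelativeDateSketch {
    // Resolves "N days ago" against a known value for "today".
    static LocalDate daysAgo(LocalDate today, long days) {
        return today.minusDays(days);
    }
}
```

Calling daysAgo(LocalDate.of(2008, 11, 21), 3) yields the date 2008-11-18. A real reasoner would of course read "today" and "3 days ago" from instance data instead of taking them as arguments.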

The principal downside I see with the EMF extension approach is the risk that future changes in EMF could break the MRV implementation. This risk arises from the fact that the implementation depends upon some aspects of the EMF design. On the other hand, the EMF design is pretty open, so it would be hard to change it in significant ways without breaking lots of other code.
--------------------------------
Mark H. Linehan
STSM, Model Driven Business Transformation
IBM Research

phone: (914) 945-1038 or IBM tieline 862-1038
internet: mlinehan@xxxxxxxxxx
