RE: [mdt-sbvr.dev] Thoughts about two approaches to modeling the Meaning & Representation (MRV) part of SBVR

From: "Stan Hendryx" <stan@xxxxxxxxxxxxxxxx>
Date: Tue, 25 Nov 2008 21:05:56 -0800
Organization: Hendryx & Associates

Mark,

My response is like this, prefixed “Stan:”.

From: mdt-sbvr.dev-bounces@xxxxxxxxxxx [mailto:mdt-sbvr.dev-bounces@xxxxxxxxxxx] On Behalf Of Mark H Linehan
Sent: Tuesday, November 25, 2008 2:52 PM
To: SBVR developer list
Subject: RE: [mdt-sbvr.dev] Thoughts about two approaches to modeling the Meaning & Representation (MRV) part of SBVR

Stan,

I have inserted responses like this.
--------------------------------
Mark H. Linehan
STSM, Model Driven Business Transformation
IBM Research

phone: (914) 945-1038 or IBM tieline 862-1038
internet: mlinehan@xxxxxxxxxx

From: "Stan Hendryx" <stan@xxxxxxxxxxxxxxxx>
Sent by: mdt-sbvr.dev-bounces@xxxxxxxxxxx
Date: 11/22/2008 01:14 PM
Please respond to: SBVR developer list <mdt-sbvr.dev@xxxxxxxxxxx>
To: "'SBVR developer list'" <mdt-sbvr.dev@xxxxxxxxxxx>
Subject: RE: [mdt-sbvr.dev] Thoughts about two approaches to modeling the Meaning & Representation (MRV) part of SBVR

Mark,
Thank you and Andrey for this. As I understand it, the main difference between the two approaches is that in the Conventional Approach, SBVR noun concepts are mapped to instances of EClass, and in the EMF Extension Approach, SBVR noun concepts are mapped to subclasses of EClass. Thus, in the EMF Extension Approach, an instance of a mapped subclass of EClass represents an instance of the mapped SBVR concept. Is this correct?

Mark: yes, this is correct.

I would like to offer comments on some of the issues you raised, and a variation on your EMF Extension proposal.

* EMF does not support associations with more than 2 roles, so an extension would be needed to support SBVR fact types with more than 2 roles
Have you considered objectifying fact types and representing them as noun concepts? With objectification, an instance of the noun concept is an instance of the fact type. I suggest objectifying all fact types with more than two roles, and all binary fact types where the opposite role does not range over an expression or a number. Characteristics and binary fact types where the opposite role ranges over an expression (text, icon, image, multi-media file) or a number would be modeled as eAttributes of the EClass subclass that represents the concept. In this approach, “FactType” is a subclass of EClass, along with “NounConcept,” and each fact type becomes a subclass of FactType. (I think the name of your “Noun” class should be “NounConcept” as in SBVR, since the class represents the concept, not a word or phrase that designates the concept, which word is a noun.) Each FactType has an attribute for each role. The eReferences are used to link NounConcepts to their participating roles in FactTypes, and to link each role of a FactType to the NounConcept over which it ranges. This architecture is similar to that of Object Role Modeling (ORM), except that objects in ORM have no attributes, only objects and roles, and value types of objects for literals. We could do the same, using eAttributes only for “text” and “number” values.
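
The objectification idea above can be sketched in plain Java. This is only an illustration under stated assumptions: the classes below are stand-ins rather than EMF or mdt-sbvr types (in the proposal, NounConcept and FactType would be subclasses of EClass), and the ternary fact type "person rents car at branch" with its role names is a hypothetical example, not one taken from the thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ObjectificationSketch {

    // Stand-in for a noun concept (assumption: not an EMF class).
    static class NounConcept {
        final String name;

        NounConcept(String name) {
            this.name = name;
        }
    }

    // An objectified fact type: itself a noun concept, holding one entry per role.
    static class FactType extends NounConcept {
        // Maps each role name to the noun concept the role ranges over.
        final Map<String, NounConcept> roles = new LinkedHashMap<>();

        FactType(String name) {
            super(name);
        }

        FactType role(String roleName, NounConcept rangesOver) {
            roles.put(roleName, rangesOver);
            return this;
        }

        int arity() {
            return roles.size();
        }
    }

    // Builds the hypothetical three-role fact type "person rents car at branch".
    static FactType rentalFactType() {
        NounConcept person = new NounConcept("person");
        NounConcept car = new NounConcept("car");
        NounConcept branch = new NounConcept("branch");
        return new FactType("person rents car at branch")
                .role("renter", person)
                .role("rented car", car)
                .role("pick-up branch", branch);
    }

    public static void main(String[] args) {
        FactType rental = rentalFactType();
        System.out.println(rental.name + " has " + rental.arity() + " roles");
    }
}
```

Because the fact type is an ordinary object with a map of roles, a third role needs no special support, which is the point of objectifying fact types with more than two roles.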

Mark: Andrey didn't have time to consider fact types with more than 2 roles. Objectifying them is certainly a possibility worth evaluating versus the main alternative of introducing a new kind of "EFactType" concept in EMF.

Mark: The current implementation handles characteristics and binary fact types that range over texts and numbers as "Characteristics" or "DataProperties", which extend both EMF's EAttribute and SBVR's FactType. The implementation handles other "has" binary fact types as "ObjectProperties" that extend both EMF's EReference and SBVR's FactType. It handles non-has binary fact types as two instances of "BinaryFactTypes" that extend both EReference and SBVR's FactType. This mapping of one binary fact type to two instances is a limitation that I meant to identify in my last email, but forgot. What you suggest seems like an attractive alternative.

Stan: I see classes in Andrey’s UML diagrams with names that begin with an underscore and are colored dark grey, e.g. “_EClass”, stereotyped <<EClass>>. This is evidently some kind of EMF-ism I’m not familiar with. Can you comment on what this means?

Mark: I agree about the class named "NounConcept" versus "Noun".

* The SBVR concept "Role" is not distinguished from the ObjectType that a role ranges over. This means you can't have a Role that is used in zero fact types. Nor can Roles have Designations and other attributes, independent from the ObjectType that a Role ranges over.
The approach outlined above solves this problem by allowing roles to be NounConcepts. It is not necessary that a noun concept participate in a fact type.

Mark: if I understand you correctly, a role would be represented by an attribute of an objectified FactType. So how could a role exist independently of a fact type?

Stan: Roles and fact type roles are noun concepts. SBVR constrains fact type roles to always be filled in fact type instances, so that is a constraint. There is no such constraint on situational roles. For example, Bill might pre-register (be instantiated) as a driver (situational role of a person), and be instantiated in that role each time he rents a car, filling the fact type role “driver” in “rental has driver.”

The approach also appears to preserve the semantic stability property of fact models, wherein adding additional fact types does not change the structure of an existing model, as happens with object models. For instance, you have Driver as a subclass of EClass in your example. Suppose you were to extend your model so that driver is a role of a person, and add a Person subclass of EClass. Some of the attributes of Driver would need to be refactored to keep the model in third normal form*, and the Driver class would have Person as an eSupertype.
Mark: I agree that the objectified binary fact types that do not range over text and integers would remain "stable" in this situation. But any former characteristics or text attributes or binary attributes of the old Driver would presumably be moved to the new Person class.

Stan: Right. That movement is what I meant by refactoring.

Some people object to roles being subtypes, and would prefer an associative relationship to a generic one: a person has the role of Scoutmaster, rather than a Scoutmaster is a person. SBVR and natural language make a generic relationship between roles and the object types that can fill those roles, by specifying a general concept for the role, as in “driver: person that …”.
Mark: there is also the SBVR relationship "role ranges over object type". So I read "driver: person that ..." (where "driver" is a role) as saying that "driver" ranges over the concept "person". I think SBVR is confusing in this area. I also think it is better to distinguish the "ranges over" relationship from the "specializes" relationship since they mean different things.

Stan: Yes, there is a certain amount of controversy and confusion about this. I agree with your reading. I see no reason to force a distinction in cases like this. Where the definition is informal and does not give a general concept, it seems you would have to use “ranges over”. This is the explanation in the spec, p.23ff: “Saying that a role ranges over an object type is similar to saying the role specializes the object type in that the role incorporates every characteristic incorporated by the object type, and therefore, each instance of the role is necessarily an instance of the object type. But “ranges over” is different in that it allows that both the role and the object type incorporate the same characteristics - the object type can incorporate a characteristic that its instances fill that role.” This is important when specifying job descriptions, say, where you list characteristics of the role, and look for matching characteristics in a person who will fill that role. In your example, there may be an essential characteristic that the driver is of age. This can be stated as part of the definition of “driver” or as a necessity separate from the definition. The “person” concept can also have a characteristic “is of age”. The presumption is that the characteristic of the person matches that of the role.

* The normal forms are most often associated with relational database designs, but are applicable to conceptual models as well, especially where reference schemes provide “keys” and key roles for concepts. UML and Java, of course, rely on object references (handles) as surrogate identifiers, rather than keys, to resolve references in implementations. Keys and key attributes have to be patched onto UML, through some form of annotation.

* The implementation of AtomicFormulations will be somewhat more complex, since it will have to deal with FactTypes captured as EMF attributes, EMF references, or n-way relationships
The implementation of atomic formulations is trivial in the approach outlined above: each instance of a subclass of FactType represents an atomic formulation.
Mark: I think each instance of a subclass of a FactType represents a fact, not an atomic formulation. Furthermore, the roles of an atomic formulation need to be able to bind to variables as well as IndividualConstants and Expressions. It is not clear to me how an instance of a subclass of FactType would bind to a variable.

Stan: Variable binding involves the mediation of a quantifier. An instance of a fact type is an atomic fact, a fact in which the roles are bound to individuals. Facts that involve variables always involve quantifiers. All variables in such a fact must be bound to a quantifier. Such a fact is not an atomic fact, not an instance of a fact type, but may involve many instances of one or more fact types.

* Concept names must not contain blanks or other characters that are not valid in EMF element or attribute names. This is because such characters cause problems when loading instance documents as described below. I think this can be solved very easily by escaping these characters.
Since last spring, courtesy of Dave Carlson and Kenn Hussey, EMF can apply a camel case transformation to class and attribute names when importing a UML model. I suggest we use that: “noun concept” becomes “NounConcept.”
Mark: sounds good.
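
For illustration only, the name mapping mentioned above (“noun concept” becomes “NounConcept”) can be sketched as a small Java helper. This is not the transformation that EMF applies on UML import; the class and method names are assumptions, and the sketch handles only blank-separated words, not other characters that are invalid in EMF names.

```java
class CamelCaseSketch {

    // Converts a blank-separated concept name to an upper camel case identifier.
    static String toCamelCase(String conceptName) {
        StringBuilder out = new StringBuilder();
        for (String word : conceptName.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // an empty or all-blank name yields an empty identifier
            }
            out.append(Character.toUpperCase(word.charAt(0)));
            out.append(word.substring(1));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCamelCase("noun concept"));
    }
}
```

Note that the mapping is lossy: “noun concept” and “nounConcept” both map to “NounConcept”, so a tool that needs to recover the original designation would have to store it separately.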

* IndividualConcepts are not supported, but I see no obstacle to implementing them.

I noticed the caveat from Ed Merks on the penultimate page of Andrey’s presentation, about the hazards of extending EMF vs. annotating it. Can anyone say more about what we might lose, what pitfalls we might encounter, by going with the EMF Extension approach?

Mark: I think it's a general warning. I don't know of specific issues.

Stan: I wonder what the EMF generator does with subclasses of EClass. Might some of the generated code not work, or be missing? You mentioned something about needing to redo the editor. How complicated is that?

Stan

From: mdt-sbvr.dev-bounces@xxxxxxxxxxx [mailto:mdt-sbvr.dev-bounces@xxxxxxxxxxx] On Behalf Of Mark H Linehan
Sent: Friday, November 21, 2008 1:55 PM
To: mdt-sbvr.dev@xxxxxxxxxxx
Subject: [mdt-sbvr.dev] Thoughts about two approaches to modeling the Meaning & Representation (MRV) part of SBVR

As I've discussed in previous notes, over the last week I've experimented with the MRV implementation in the mdt-sbvr libraries. My goal has been to compare this "conventional approach" to what I have called the "EMF extension" approach. See my notes of May 16 and August 23 to this mailing list. The purpose of this email is to summarize what I see in the comparison. I start by describing these two designs, and then I compare them.

Conventional Approach (as implemented in the checked-in code)

This is a direct implementation of the SBVR metamodel, created by converting that metamodel to a corresponding EMF model. The classes and associations that you see in the EMF model are (mostly) the concepts and fact types described in the SBVR specification. Variances are due either to aspects that are not (yet) implemented, or to additional methods introduced in the implementation to simplify use of the metamodel.

The SBVR specification is fact-oriented, which means that modeled concepts and fact types are treated as simply "facts about the model". So if you look at the .sbvr file created in this approach, you see a bunch of "PackagedElements" that represent those facts. For example, the concept called "vehicle" is handled as three facts: (1) the fact that an ObjectType exists; (2) the fact that a Text with the value "vehicle" exists; and (3) the fact that a Designation exists with a meaning that is the ObjectType and a representation that is the Text. Here's an example:
<packagedElement xsi:type="mrv:ObjectType"/>
<packagedElement xsi:type="mrv:Designation" meaning="//@packagedElement.6" expression="//@packagedElement.8"/>
<packagedElement xsi:type="mrv:Text" value="Vehicle"/>

IndividualConcepts are modeled with facts of the existence of an IndividualConcept (e.g. a car) or of an Actuality that uses an AtomicFormulation (e.g. for "car has driver name Bill"). So, when stored, they show up as more facts of the form:
<packagedElement xsi:type="mrv:IndividualConcept" general="//@packagedElement.9"/>

... where "//@packagedElement.9" is the ObjectType for "car".

... and there are other lines that represent Actualities, involving the IndividualConcept. I can't show those yet because they depend upon AtomicFormulation, which isn't implemented.

There is no provision for modeling instances (as opposed to IndividualConcepts). That is, you can't directly store, load, or work with instances of SBVR vocabulary concepts.

The EMF-generated implementation is very low-level. To work directly with it, you have to code the relationships among all the elements of the SBVR metamodel. For example, you have to explicitly write code to support the three facts given in a paragraph above. Dave has created some utility library methods that simplify this, and I created a bunch more which you can see at the end of my "sbvrTest.java" experiment. I believe these utility functions will be absolutely necessary to shield implementation users from having to understand the low-level relationships.
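
To make the need for such utilities concrete, here is a minimal sketch of a helper that records the three facts behind one named concept. It uses stand-in Java classes, not the EMF-generated mdt-sbvr classes, and it is not one of the existing utility methods; the names ThreeFactsSketch and defineObjectType are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ThreeFactsSketch {

    // Stand-ins for the metamodel elements (assumption: not the generated classes).
    static class ObjectType {
    }

    static class Text {
        final String value;

        Text(String value) {
            this.value = value;
        }
    }

    static class Designation {
        final ObjectType meaning;
        final Text expression;

        Designation(ObjectType meaning, Text expression) {
            this.meaning = meaning;
            this.expression = expression;
        }
    }

    // Hypothetical helper: adds the ObjectType, the Designation, and the Text
    // for one concept, so callers do not wire the three facts by hand.
    static Designation defineObjectType(List<Object> packagedElements, String name) {
        ObjectType objectType = new ObjectType();
        Text text = new Text(name);
        Designation designation = new Designation(objectType, text);
        packagedElements.add(objectType);
        packagedElements.add(designation);
        packagedElements.add(text);
        return designation;
    }

    public static void main(String[] args) {
        List<Object> packagedElements = new ArrayList<>();
        Designation vehicle = defineObjectType(packagedElements, "vehicle");
        System.out.println(packagedElements.size() + " facts for " + vehicle.expression.value);
    }
}
```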

EMF Extension Approach (as implemented in the code by Andrey Soares that I sent to this list on August 23)

This approach integrates the SBVR metamodel with the Eclipse Modeling Framework. Many of the basic MRV concepts are implemented as subtypes of corresponding EMF classes. For example, an SBVR Noun extends EMF's EClass. The SBVR fact type "concept specializes concept" is implemented via EClass.eSuperTypes. Modeled fact types are implemented via EClass.eStructuralFeatures. Aspects of SBVR that have no equivalent in EMF are modeled the conventional way.

The .sbvr file created for a model is an extension of the .ecore format. For example, the concept "vehicle" is persisted as:

<eClassifiers xsi:type="sbvr:Noun" UID="NOU_9YI_hrf0Ed2Dmc3NA60pqA" name="Vehicle"/>

Unfortunately, such models cannot be opened with the regular EMF .ecore editor without more work. (I do believe this is achievable.)

In the EMF extension approach, you can dynamically create instances of the corresponding EClass subtypes. That is, the code does "EObject _objectDriver1 = new DynamicEObjectImpl(driver)" (where "driver" is a subtype of EClass) to create an individual concept. Then a tool can do "_objectDriver1.eSet(Name, "Bill")" to set attributes of the instance. When you store an instance model this way, you get an XML file that uses element tags equivalent to the concepts in the SBVR model. For example:

As with the conventional approach, I see a need for utility library methods that simplify working with the metamodel. Such methods could insulate users from some but probably not all differences between the two approaches. So I think we have to commit to one design or the other.

Practically, the current implementation of the EMF extension approach is incomplete because the following aspects of SBVR are not yet supported. I believe these points could be addressed if we go down this path.

* EMF does not support associations with more than 2 roles, so an extension would be needed to support SBVR fact types with more than 2 roles
* The SBVR concept "Role" is not distinguished from the ObjectType that a role ranges over. This means you can't have a Role that is used in zero fact types. Nor can Roles have Designations and other attributes, independent from the ObjectType that a Role ranges over.
* The implementation of AtomicFormulations will be somewhat more complex, since it will have to deal with FactTypes captured as EMF attributes, EMF references, or n-way relationships
* Concept names must not contain blanks or other characters that are not valid in EMF element or attribute names. This is because such characters cause problems when loading instance documents as described below. I think this can be solved very easily by escaping these characters.
* IndividualConcepts are not supported, but I see no obstacle to implementing them.

Comparison

The EMF extension approach unifies SBVR with EMF, similar to the way Java, UML, and XML are already integrated with EMF. The relevant parts of SBVR models become true EMF models, and thus can make use of the various EMF features. This approach exploits EMF's ability to store and load XML documents that use element and attribute names corresponding to EMF class and attribute names. I believe that with some more work, one could use the EMF generator to create a set of Java classes corresponding to the concepts in an SBVR model. Going further, one could use the EMF infrastructure to do a lot more with SBVR models. To put it another way, integrating MRV closely with EMF makes each SBVR business model also a PIM-level model represented in EMF. And then the EMF features can support some mappings to PSM models in Java, XML, etc.

Another advantage of the EMF extension approach is support for dynamically instantiating instances as EMF EObjects. This advantage becomes important if a tool implements reasoning on such instances -- what the OWL world calls "ABox" reasoning (see http://en.wikipedia.org/wiki/TBox). For example, imagine a tool that -- given "today is November 21, 2008" and "the accident happened 3 days ago" -- knows how to reason that "the accident happened on November 18, 2008". The closest equivalent with the conventional approach is the ability to model individual concepts, but (a) individual concepts are defined at modeling time, whereas instances can be loaded at runtime, e.g. from a database; (b) I expect that such a tool might be able to reason using EObjects, but would have a very hard time with individual concepts managed as a bunch of facts, as they are in the conventional approach.
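
The date inference in the example above reduces to simple arithmetic once the instance values are available as ordinary objects. The sketch below shows only that arithmetic with the JDK's java.time API; it is not an ABox reasoner and does not use EObjects, and the class and method names are assumptions made for illustration.

```java
import java.time.LocalDate;

class DateReasoningSketch {

    // Derives the date of an event stated as "N days ago" relative to today.
    static LocalDate daysAgo(LocalDate today, long days) {
        return today.minusDays(days);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2008, 11, 21);
        LocalDate accident = daysAgo(today, 3);
        System.out.println("the accident happened on " + accident);
    }
}
```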

The principal downside I see with the EMF extension approach is the risk that future changes in EMF could break the MRV implementation. This risk arises from the fact that the implementation depends upon some aspects of the EMF design. On the other hand, the EMF design is pretty open, so it would be hard to change it in significant ways without breaking lots of other code.
--------------------------------
Mark H. Linehan
STSM, Model Driven Business Transformation
IBM Research

phone: (914) 945-1038 or IBM tieline 862-1038
internet: mlinehan@xxxxxxxxxx