
Re: [cme-dev] question on the expected mutual consistency of project classpaths



A quick note on working sets, following this note of Stan's below:
   Even if the classpath-consistency condition is a safe assumption in many
   cases, we probably still need the capability to subdivide a workspace
   (or our views of the workspace) into some form of working sets.  Even
   people who habitually work with sets of Eclipse projects that have
   consistent classpaths may still work with several such sets in a single
   workspace.  Accommodating multiple working sets is likely to raise some
   of the issues that contextualizing relationships would raise, e.g., the
   need to accommodate contextualized views and queries.  However, it would
   probably not entail some others, such as the performance and storage
   costs for multiple alternative sets of relationships.
There are various ways to handle working sets in the concern explorer. One
is to build a concern space for a specific working set (rather than for the
entire workspace). If this is done, I believe there are no
contextualization issues. This is a simple, Eclipse-compatible way to
restrict the "current" concern model to a subset of the projects that have
CME nature. (It does have some disadvantages: it is less attractive than
having multiple working sets, and the relationships between them, in a
single space, and the concern space must be rebuilt whenever working sets
are switched. But I think it is very easy to do and provides immediate
value.)

- Harold



                                                                       
From: Stan Sutton/Watson/IBM@IBMUS
Sent by: cme-dev-admin@eclipse.org
Date: 04/30/2004 09:34 AM
To: cme-dev@xxxxxxxxxxx
Subject: [cme-dev] question on the expected mutual consistency of project classpaths
Please respond to: cme-dev
                                                                       
                                                                       





Hi All,

As most (if not all) of you know, we're in the middle of redesigning the
Conman loaders.  One issue affecting the redesign (and the design of an
appropriate Conman model) is the question of the mutual consistency of
classpaths in an Eclipse workspace.  (For purposes of building a concern
model I believe we can focus just on the classpaths that are associated
with particular projects, ignoring runtime classpaths, classpaths that may
be embedded in particular files, etc.)

The key consistency issue is whether references to classes will be resolved
in the same way in all projects, i.e., according to all classpaths in the
workspace.  There is a more formal and precise statement of what this means
below, but a basic manifestation of the issue is whether a reference from a
class A in one project to a class B in another project will be resolved to
the same declaration of class B according to the classpaths in two
different projects.
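
To make this manifestation concrete, here is a minimal sketch in Python. It assumes a classpath can be modeled as an ordered list of containers, each declaring a set of class names, with a reference resolving to the first container that declares the name. The container names (shared-src, lib-v1.jar, lib-v2.jar) and the function resolve are hypothetical and purely illustrative.

```python
# Hypothetical model: a classpath is an ordered list of (container, declared names)
# entries, and a reference resolves to the first container declaring the name.

def resolve(classpath, class_name):
    """Return the container that supplies class_name, or None if unresolved."""
    for container, declared in classpath:
        if class_name in declared:
            return container
    return None


# Two hypothetical projects that list the same containers in a different order.
shared = ("shared-src", {"A"})
lib_v1 = ("lib-v1.jar", {"B"})
lib_v2 = ("lib-v2.jar", {"B"})

classpath_project1 = [shared, lib_v1, lib_v2]
classpath_project2 = [shared, lib_v2, lib_v1]

# Class A (in shared-src) refers to B; each project resolves that reference differently.
print(resolve(classpath_project1, "B"))  # lib-v1.jar
print(resolve(classpath_project2, "B"))  # lib-v2.jar
```

In this sketch the two classpaths are not mutually consistent, because the same reference from A lands on a different declaration of B depending on which project's classpath is used.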

This is an important issue that we need to understand well, the sooner the
better.

The classpath-consistency condition has an effect on how we model
relationships in Conman.  If all references are resolved in a uniform way
across the workspace, then we can put all of the relationships for all
projects into one big pool without worrying about sorting them by project
(or classpath).  This is what we do now, by putting all relationships into
the concern space directly.  If we can do this, it will save time in
computing the relationships and space in storing them.  It also allows us
to present a relatively simplified view of the workspace to the user (the
view we present now).  This condition is assumed, in effect, by the current
loaders and query mechanisms, and it probably affords some simplification
(although perhaps minor) in the implementation of these components.

On the other hand, this condition is not required by Eclipse.  We don't
know whether typical users assume that the condition holds or take
advantage of the more general Eclipse semantics, and we have no idea
whether our current workspace (or typical workspaces) observes the
condition.  We would also have to create some means of verifying the
condition (which we expect would operate mainly as a stand-alone utility).

If we cannot safely assume that the condition holds (or if we just want to
model the more general Eclipse semantics), then we would have to compute a
set of relationships relative to the classpath for each project and keep
those separately (or be able to sort them out).  This has a higher cost in
terms of computation and storage.  It could complicate the views of the
workspace and the formulation of queries against it (which would have to
accommodate an element of project relativity).  The impact on the loaders
would be minor (less than the impact of changes that we've incorporated on
a regular basis); my understanding is that future work on query
implementations could accommodate this change readily but that existing
implementations would have to be re-engineered to some extent.  On the
other hand, if it is not safe to assume the condition, and we do not
contextualize the relationships, then the information and views we provide
for a workspace will be wrong in general and the utility of the environment
will be compromised.
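
The trade-off between the two storage strategies can be sketched as follows. This is only an illustration under stated assumptions: the class names PooledRelationships and ContextualizedRelationships, their methods, and the project and jar names are all hypothetical and do not correspond to actual Conman classes.

```python
from collections import defaultdict


class PooledRelationships:
    """Single workspace-wide pool; only correct if classpaths are consistent."""

    def __init__(self):
        self._targets = defaultdict(set)

    def add(self, source, target):
        self._targets[source].add(target)

    def targets(self, source):
        return set(self._targets.get(source, ()))

    def size(self):
        return sum(len(t) for t in self._targets.values())


class ContextualizedRelationships:
    """One pool per project; every query names the project whose classpath applies."""

    def __init__(self):
        self._pools = defaultdict(PooledRelationships)

    def add(self, project, source, target):
        self._pools[project].add(source, target)

    def targets(self, project, source):
        pool = self._pools.get(project)
        return pool.targets(source) if pool is not None else set()

    def size(self):
        return sum(pool.size() for pool in self._pools.values())


# Hypothetical workspace: both projects see class A but resolve its reference to B differently.
contextual = ContextualizedRelationships()
contextual.add("project1", "A", "lib-v1.jar:B")
contextual.add("project2", "A", "lib-v2.jar:B")

pooled = PooledRelationships()
pooled.add("A", "lib-v1.jar:B")
pooled.add("A", "lib-v2.jar:B")

print(contextual.targets("project1", "A"))  # {'lib-v1.jar:B'}
print(sorted(pooled.targets("A")))  # ['lib-v1.jar:B', 'lib-v2.jar:B']
```

The pooled form cannot say which target applies in which project, which is the sense in which uncontextualized views would be wrong when classpaths are inconsistent. The contextualized form answers correctly, but every query has to carry a project argument and relationships that agree across projects are stored once per project.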

Two additional notes:

   Even if the classpath-consistency condition is a safe assumption in many
   cases, we probably still need the capability to subdivide a workspace
   (or our views of the workspace) into some form of working sets.  Even
   people who habitually work with sets of Eclipse projects that have
   consistent classpaths may still work with several such sets in a single
   workspace.  Accommodating multiple working sets is likely to raise some
   of the issues that contextualizing relationships would raise, e.g., the
   need to accommodate contextualized views and queries.  However, it would
   probably not entail some others, such as the performance and storage
   costs for multiple alternative sets of relationships.

   Most of the changes we might make to the loaders now can probably
   accommodate contextualization later without too much additional work,
   e.g., it should not be difficult to change where the loaders store
   relationships.  However, we need to be careful that other work we may do
   does not depend too heavily on this assumption if the assumption is not
   expected to hold in the future.  Also, for the sake of correctness and
   usefulness of the environment, we need to have some idea of whether
   classpath consistency holds now.


So, we're wondering whether classpath consistency reflects reality in the
perception and use of Eclipse and whether it would be natural or onerous to
require it of our users.  We don't have the breadth of perspective to judge
this by ourselves.  Please let us know what you think!

Thanks,

Stan

P. S.  Here is a formulation of the classpath-consistency condition that
Harold Ossher worked up after talking with Bill Harrison:

   An artifact is uniquely identified by a name pair, (d, n), where d is a
   "disambiguator" and n is the artifact's selfIdentifyingName.
      There are many possibilities for the disambiguator, and the rest of
      this analysis is neutral to them. E.g.:
         a container, like an Eclipse project or special kind of concern,
         whose contents are guaranteed to have unique (non-duplicated)
         selfIdentifyingNames.
         location (disk address, canonical path, ...)
         It is likely true that disambiguators are, or have associated with
         them, mappings from names to locations; i.e., given a disambiguator
         and a name, a unique disk address or whatever can be found for an
         artifact with that selfIdentifyingName.
         A classpath P can be considered to be a sequence of (d, n) pairs.
            It is usually expressed at a higher level, such as a sequence
            of directories or jar files (perhaps a sequence of d's), but
            since names are unique within these elements, order within
            them doesn't matter, and it boils down to a sequence of (d, n)
            pairs without loss of generality.
            A classpath implies a function: P(n) = the first (d', n') in P
            such that n' = n
            An artifact (d, n) contains names of things it refers to. The
            set of these names is denoted ref(d, n).
               It is assumed that all relationships loaded by Conman
               loaders are determined by examining the ref sets for loaded
               artifacts (there might be complex computation involved, but
               this is the root source of the information).
               Given these definitions and assumptions, the following is a
               statement of the "compatible classpath" restriction we have
               talked about for some time:
            forall P1, P2, d, n:
                (d, n) in P1 and (d, n) in P2
                    // the same artifact on two classpaths
            implies
                forall n' in ref(d, n): P1(n') = P2(n')
                    // all names referenced in that artifact are resolved
                    // to the same artifact in both classpaths
               Checking the above condition requires detailed, expensive
               examination of all the artifacts. A conservative
               approximation is given by the following:
            forall P1, P2, d, n1, n2:
                (d, n1) in P1 and (d, n2) in P2
                    // d has some presence in both P1 and P2
            implies
                forall n occurring after the first occurrence of d
                in both P1 and P2: P1(n) = P2(n)
                    // any later-occurring name is resolved uniformly
            This check can be performed by examining the names of all
            classes on each classpath but not their contents
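
Both checks can be sketched in Python under the definitions above. This is a minimal illustration, not the Conman implementation: a classpath is a list of (d, n) tuples, ref is a dict from (d, n) to a set of names, and all function names and sample data are hypothetical. The conservative check reads "after the first occurrence of d in both P1 and P2" as "from the first occurrence of d onward in either classpath", which is one (slightly stricter) interpretation and is an assumption.

```python
from itertools import combinations


def resolve(P, n):
    """P(n): the first (d', n') in classpath P such that n' == n, else None."""
    for pair in P:
        if pair[1] == n:
            return pair
    return None


def exact_violations(classpaths, ref):
    """Exact condition: needs ref, i.e. the contents of every shared artifact."""
    found = []
    for (i, P1), (j, P2) in combinations(enumerate(classpaths), 2):
        for artifact in set(P1) & set(P2):
            for name in ref.get(artifact, ()):
                if resolve(P1, name) != resolve(P2, name):
                    found.append((i, j, artifact, name))
    return found


def first_index(P, d):
    """Index of the first pair in P whose disambiguator is d."""
    return next(i for i, (d2, _) in enumerate(P) if d2 == d)


def conservative_violations(classpaths):
    """Conservative approximation: looks only at names, never at ref sets."""
    found = []
    for (i, P1), (j, P2) in combinations(enumerate(classpaths), 2):
        common = {d for d, _ in P1} & {d for d, _ in P2}
        for d in sorted(common):
            # Assumed reading: names from the first occurrence of d onward, in either path.
            later = {n for _, n in P1[first_index(P1, d):]}
            later |= {n for _, n in P2[first_index(P2, d):]}
            for n in sorted(later):
                if resolve(P1, n) != resolve(P2, n):
                    found.append((i, j, d, n))
    return found


# Hypothetical data: two classpaths that order two providers of B differently.
P1 = [("shared", "A"), ("lib1", "B"), ("lib2", "B")]
P2 = [("shared", "A"), ("lib2", "B"), ("lib1", "B")]
ref = {("shared", "A"): {"B"}}

print(exact_violations([P1, P2], ref))  # [(0, 1, ('shared', 'A'), 'B')]
print(len(conservative_violations([P1, P2])))  # 3
```

With an empty ref (no artifact refers to B), the exact check reports nothing while the conservative check still reports the same three pairs, which is the sense in which the approximation is conservative: it may reject workspaces that the exact condition would accept.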




