Re: [imp-dev] Connecting PDB to IMP discussion

To add to what Jurgen said regarding some of the top-level questions Ed had about IMP, the PDB, and Rascal:

First: there is actually no coupling between IMP, PDB and Rascal.

The implementation of language-specific services (like reference resolution) *can* make use of the PDB and/or Rascal, but needn't.

That said, since program analysis underlies much of a full-featured IDE's functionality, we want IMP to provide good building blocks for implementing analyses. So IMP bundles the PDB (and, hopefully, some day Rascal) for the convenience of IDE developers. Note that the PDB is housed in a separate IMP feature (org.eclipse.imp.analysis), and the IMP runtime itself doesn't depend on the PDB (or Rascal). Nor does the PDB depend on the IMP runtime.

Rascal is a particularly nice language for building these kinds of analyses. The PDB can express them too (by writing ordinary Java code against the PDB API), but Rascal does so much more concisely, and with the added benefit of some extremely powerful features like pattern matching over concrete syntax.

A few specific comments on the PDB's design:

  - *Anything* can produce facts and add them to the PDB.

  - Fact producers are decoupled from fact consumers. Consumers simply ask for a particular kind of fact over a particular context (e.g. "a CHA-based call-graph for the Java project Foo"). Then, either (a) the fact already exists in the PDB, in which case it's returned directly, or (b) the PDB is aware of a producer of said type of facts, and schedules its execution. Either way, the consumer is unaware of how the fact gets produced, or for that matter when it gets produced.

  - The PDB provides a richer set of data structures than, for example, the java.util Collections library: relations, tuples, algebraic data types, file locations, etc.

  - The PDB provides a richer set of operators than, for example, java.util: projection, cross-product, relational composition, relational join, substitutions, closure operators, and so on. This makes it easier to express a broad range of analysis algorithms concisely.
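
To make the last point concrete, here is a minimal sketch of two of the operators named above, relational composition and transitive closure, written against plain java.util. It is NOT the PDB API: the class RelationSketch and its methods are hypothetical names invented for illustration. It only shows how much code these operators take when the library does not supply them.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical stand-in for a binary relation over strings; not the PDB API. */
final class RelationSketch {
    /** Adds the pair (from, to) to the relation. */
    static void add(Map<String, Set<String>> rel, String from, String to) {
        rel.computeIfAbsent(from, k -> new TreeSet<>()).add(to);
    }

    /** Composition: (a, c) is in the result iff (a, b) is in r and (b, c) is in s for some b. */
    static Map<String, Set<String>> compose(Map<String, Set<String>> r,
                                            Map<String, Set<String>> s) {
        Map<String, Set<String>> out = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : r.entrySet()) {
            for (String b : e.getValue()) {
                for (String c : s.getOrDefault(b, Collections.<String>emptySet())) {
                    add(out, e.getKey(), c);
                }
            }
        }
        return out;
    }

    /** Transitive closure: keep adding (result composed with r) until nothing new appears. */
    static Map<String, Set<String>> closure(Map<String, Set<String>> r) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : r.entrySet()) {
            result.put(e.getKey(), new TreeSet<>(e.getValue()));
        }
        boolean changed = true;
        while (changed) {
            changed = false;
            Map<String, Set<String>> step = compose(result, r);
            for (Map.Entry<String, Set<String>> e : step.entrySet()) {
                Set<String> targets = result.computeIfAbsent(e.getKey(), k -> new TreeSet<>());
                if (targets.addAll(e.getValue())) {
                    changed = true;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A tiny, made-up "calls" relation.
        Map<String, Set<String>> calls = new TreeMap<>();
        add(calls, "main", "parse");
        add(calls, "parse", "lex");
        add(calls, "lex", "readChar");
        System.out.println(compose(calls, calls)); // {main=[lex], parse=[readChar]}
        System.out.println(closure(calls));
    }
}
```

With operators like these provided by the library, an analysis such as "everything reachable from main" becomes a single call instead of a hand-written fixpoint loop.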


On Nov 25, 2009, at 4:16 AM, Jurgen Vinju wrote:

Hi Ed,

I have yet to write good technical documentation for PDB. Here's the
short story.

* "pdb.values" implements a symbolic representation for source code
artifacts (anything from parse trees to type hierarchies, metrics and
other relations).
* It includes a simple but expressive type system that helps
document complicated representations and prevents both simple and
complicated programming errors.
* Rascal (DSL for source code analysis and transformation) was
designed to have exactly the same type system as pdb.values (an
extension to be precise). It even shares the code for type checking.
* Rascal also uses the value representation of pdb.values directly.

One of the goals of PDB is to act as a source code fact "bus". Rascal
is one of the tools that plugs into this bus, and it does so very
directly for efficiency's and simplicity's sake.
The IMP run-time (mainly UniversalEditor) on the other hand is a
consumer of PDB, or {c,sh,w}ould be. Naturally, many visual and
interaction features of the IDE are based on source code analyses.

We intend to make it so that:
 - Rascal, or another fact producer, takes care of fact extraction
and source analysis
 - PDB schedules and caches fact extractions and further analyses,
 - and the IMP run-time takes care of interaction and visualization
of these facts.
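
The division of labour above can be sketched as follows. This is a hedged illustration only, assuming a much simpler model than the real PDB: FactStoreSketch, registerProducer, putFact and getFact are hypothetical names, facts are plain Objects, and the producer runs synchronously on first request rather than being scheduled. What it does show is the decoupling: the consumer asks for a kind of fact over a context and never learns how or when it was produced.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.function.Function;

/** Hypothetical on-demand fact store; names are illustrative, not the PDB API. */
final class FactStoreSketch {
    // Producers keyed by fact kind, e.g. "callGraph"; each maps a context to a fact.
    private final Map<String, Function<String, Object>> producers = new HashMap<>();
    // Facts already computed or added directly, keyed by "kind@context".
    private final Map<String, Object> facts = new HashMap<>();

    /** Registers a producer for one kind of fact. */
    void registerProducer(String kind, Function<String, Object> producer) {
        producers.put(kind, producer);
    }

    /** Anything may add a fact directly, without being a registered producer. */
    void putFact(String kind, String context, Object value) {
        facts.put(kind + "@" + context, value);
    }

    /** Returns the cached fact if present; otherwise runs the producer once and caches the result. */
    Object getFact(String kind, String context) {
        String key = kind + "@" + context;
        if (facts.containsKey(key)) {
            return facts.get(key);
        }
        Function<String, Object> producer = producers.get(kind);
        if (producer == null) {
            throw new NoSuchElementException("no fact or producer for " + key);
        }
        Object value = producer.apply(context);
        facts.put(key, value);
        return value;
    }

    public static void main(String[] args) {
        FactStoreSketch store = new FactStoreSketch();
        final int[] runs = {0};
        // Stand-in for an analysis written in Rascal or Java.
        store.registerProducer("callGraph", project -> {
            runs[0]++;
            return "call graph for " + project;
        });
        System.out.println(store.getFact("callGraph", "Foo")); // producer runs here
        System.out.println(store.getFact("callGraph", "Foo")); // served from the cache
        System.out.println("producer runs: " + runs[0]);
    }
}
```

An IDE service in this model would only ever call getFact; swapping the producer for a different implementation would not touch the service.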

We also want to do this without crafting a direct dependency between
the IMP run-time and PDB. The link will be made by implementing IMP's
standard services.

More questions are welcome!

Cheers,

Jurgen

On Tue, Nov 24, 2009 at 3:11 PM, Ed Willink <ed@xxxxxxxxxxxxx> wrote:
Hi

Is there any background reading that explains how IMP, PDB and Rascal join
up?

I vaguely understand IMP and Rascal but am baffled as to how they form such
a close relationship.

    Regards

       Ed Willink

Robert M. Fuhrer wrote:

Ok, for reference resolution, silly me, of course you're right - and the
schema "map[loc, loc]" could be considered a composition of the
IReferenceResolver API with the ISourcePositionLocator API.
That said, the schema "map[loc, loc]" is still so generic that it could mean
many things, so if the hyperlink controller simply looked for a fact of the
schema "map[loc, loc]" for the given language, it could pick up relations
that aren't really ref => def. Another example is "rel[loc, loc]", which is
basically the signature of "mark occurrences", which by design permits
multiple interpretations.
But perhaps we could define an extension point in pdb2imp that identifies
the specific fact ID to use for, e.g., hyperlinking (or perhaps reference
resolution?), for a given language. Then the developer only writes a tiny
bit of XML to hook things up.
On Nov 24, 2009, at 8:43 AM, Jurgen Vinju wrote:

Hi IMPs,

Here's a discussion we'd like to share with you all.

On Nov 23, 2009, at 2:42 AM, Jurgen Vinju wrote:

... I think it would be good (after the LDTA deadline has
passed on Dec 6th) to talk about "pdb2imp". I think there is an
opportunity and need to link PDB's analyses to the visual interaction
of IMP run-time. Examples: programmer provides an analysis that
produces reference information, and pdb2imp (maybe in pdb.ui?)
provides the reference resolver extension. etc. You dig?

Then Bob replied:

Naturally, I like the idea of using the PDB to provide the analysis services
that underlie various IDE services (like hyperlinking, navigating to program
entities by name, etc.). That was 1/2 of the PDB's purpose to begin with.

I'm not sure how to make a *language-independent* bridge between
language-specific facts in the PDB and IMP's language-independent runtime,
though. I.e., how can we have a single universal fact schema for the
reference resolvers of all languages? Or have I misunderstood what you mean
by "pdb2imp" ?

Perhaps we should move this part of the conversation to imp.dev?

Yes. I think we can have at least one universal schema for reference
resolving. Maybe more universal schemas. For example:

Any analysis that produces "rel[loc,loc]", a relation from location to
location, can be used to store reference information. In this case one
"hyperlink" may have multiple targets. Another "universal" schema is
"map[loc,loc]", in which case a "hyperlink" can only have one target.
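
The difference between the two schemas can be illustrated in plain Java. This is a sketch under stated assumptions, not pdb.values code: locations are encoded as made-up "file:offset" strings, and HyperlinkSketch with its two lookup methods is a hypothetical name.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical lookups over the two schemas; locations are "file:offset" strings. */
final class HyperlinkSketch {
    /** rel[loc,loc]: a reference may resolve to several target locations. */
    static Set<String> targetsFromRel(Map<String, Set<String>> rel, String ref) {
        return rel.getOrDefault(ref, Collections.<String>emptySet());
    }

    /** map[loc,loc]: a reference resolves to at most one target location. */
    static Optional<String> targetFromMap(Map<String, String> map, String ref) {
        return Optional.ofNullable(map.get(ref));
    }

    public static void main(String[] args) {
        // Relation: the call at Main.java:120 could dispatch to two overriding methods.
        Map<String, Set<String>> rel = new HashMap<>();
        rel.computeIfAbsent("Main.java:120", k -> new TreeSet<>()).add("Base.java:40");
        rel.computeIfAbsent("Main.java:120", k -> new TreeSet<>()).add("Derived.java:15");

        // Map: a variable use at Main.java:200 has exactly one declaration.
        Map<String, String> map = new HashMap<>();
        map.put("Main.java:200", "Main.java:12");

        System.out.println(targetsFromRel(rel, "Main.java:120")); // [Base.java:40, Derived.java:15]
        System.out.println(targetFromMap(map, "Main.java:200"));  // Optional[Main.java:12]
        System.out.println(targetFromMap(map, "Main.java:999"));  // Optional.empty
    }
}
```

A hyperlink controller built on the relation form has to decide what to do with multiple targets (for example, offer a chooser), while the map form can jump directly.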

Cheers,

Jurgen

--
Cheers,
  - Bob
-------------------------------------------------
Robert M. Fuhrer
Research Staff Member
Programming Technologies Dept.
IBM T.J. Watson Research Center

IMP Project Lead (http://www.eclipse.org/imp)
X10: Productivity for High-Performance Parallel Programming (http://x10-lang.org)

