[news.eclipse.technology.ldt] Re: Beyond textual representations...

I'm a bit puzzled why you use the term "AST". "AST" assumes that there is a syntax to parse and that there may be a concrete syntax tree. I suggest using the word "model" instead. This naturally moves the discussion to the EMF and GMF groups. EMF is already a ready-made "AST framework" and solves the problem for non-textual representations.

Basing LDT on EMF is a good idea and it has been discussed by Philipp Kutter.

Basing the JDT AST on EMF might also be a good idea, but that is better discussed in the JDT group.

BTW, I would like to see something like a ModelDOM component in LDT. This component would do the following:
- Support IDocument-related interfaces (possibly Webtools SSE based)
- Have a one-to-one mapping to the source (so it is a concrete syntax tree rather than an AST)
- Integrate parsers/reparsers
- Maintain a mapping to an EMF-based AST of the source code
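
A minimal Java sketch of the one-to-one mapping and the model mapping from the list above. All names here (CstNode, ModelElement, tokenize, leafAt) are hypothetical and chosen for illustration; the sketch deliberately uses no IDocument, SSE or EMF API, and the "parser" is only a whitespace splitter.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class ModelDomSketch {
    public static void main(String[] args) {
        String source = "x = 1 ;";
        CstNode root = CstNode.tokenize(source);

        // One-to-one mapping: printing the tree reproduces the source exactly.
        System.out.println(root.text().equals(source));

        // Map a concrete node to a (hypothetical) model element.
        Map<CstNode, ModelElement> modelOf = new IdentityHashMap<>();
        CstNode literal = root.leafAt(4);
        modelOf.put(literal, new ModelElement("IntegerLiteral"));
        System.out.println(literal.lexeme + " -> " + modelOf.get(literal).kind);
    }
}

// Stand-in for an element of the abstract model (hypothetical).
class ModelElement {
    final String kind;
    ModelElement(String kind) { this.kind = kind; }
}

// Concrete syntax node: every source character, whitespace included,
// belongs to exactly one leaf.
class CstNode {
    final String kind;
    final int offset;
    final String lexeme;
    final List<CstNode> children = new ArrayList<>();

    CstNode(String kind, int offset, String lexeme) {
        this.kind = kind;
        this.offset = offset;
        this.lexeme = lexeme;
    }

    // Concatenate the leaves back into the original text.
    String text() {
        if (children.isEmpty()) return lexeme;
        StringBuilder sb = new StringBuilder();
        for (CstNode c : children) sb.append(c.text());
        return sb.toString();
    }

    // Find the direct child covering a source offset, or null if none does.
    CstNode leafAt(int pos) {
        for (CstNode c : children) {
            if (pos >= c.offset && pos < c.offset + c.lexeme.length()) return c;
        }
        return null;
    }

    // Toy "parser": split the source into runs of whitespace and non-whitespace.
    static CstNode tokenize(String source) {
        CstNode root = new CstNode("root", 0, "");
        int i = 0;
        while (i < source.length()) {
            int start = i;
            boolean ws = Character.isWhitespace(source.charAt(i));
            while (i < source.length() && Character.isWhitespace(source.charAt(i)) == ws) i++;
            root.children.add(new CstNode(ws ? "whitespace" : "token", start, source.substring(start, i)));
        }
        return root;
    }
}
```

Because whitespace is kept as nodes, an editor could locate the node under the caret by offset and follow the map to the model element, which is roughly the role the ModelDOM would play between the text and the EMF-based model.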


Currently I'm in the process of inventing such a component myself, and I would rather have reused an existing one.

Constantine

Guillaume Pothier wrote:
Regarding 1, who cares? We already know how to parse source code into an AST. Moreover, if you ever want to read the source code, you would need to translate the AST back into source, where, presumably, someone would edit the text, at which point you would need to parse to build the AST again. So persisting an AST doesn't free you from the need to parse.


It is true that parsing would still be necessary in many cases for
editing. But it is more logical to parse when the user makes a change.


Regarding 2, I'll admit that dynamically redefining grammars in specific contexts sounds like an interesting idea. But what problem does it solve? How does Maya let one express something more elegantly or efficiently when compared to vanilla Java? Perhaps more to the point, it sounds like the new parsing techniques necessary to make Maya work were already figured out, while it isn't at all clear that the user interface for tools operating directly on an AST is well-defined. And if the user ever changes source code, you would need to parse to build your AST. So you still have to write a parser.


The paper about Maya gave a few interesting examples. One was the
foreach construct (which is now part of Java). Another was multimethods
and open classes, which are quite a big evolution from the base language:
multimethods select which implementation of a method is called
at runtime according to the actual types of all its parameters, in
contrast to Java, where only the first, implicit parameter (this) is used
for method selection. Open classes allow methods to be defined outside of
a class.
An interesting property of Maya is that it lets the programmer apply
language extensions to selected portions of the code. This can be used
as an aspect weaving technique, with the added benefit of being able to
extend your syntax if needed.
Maya itself is a powerful substrate for implementing new languages (that's also what LDT aims to be). But it lacks a very important part of language development: the associated tooling.
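
To make the multimethod point above concrete, here is a small sketch in plain Java (not Maya syntax, which I won't try to reproduce here). The Shape, Circle, Square and Collider names are hypothetical. It shows that Java picks an overload from the static argument types, and what a hand-written emulation of dispatch on the runtime types of all arguments looks like.

```java
class MultimethodDemo {
    public static void main(String[] args) {
        // Static type is Shape for both; runtime types are Circle and Square.
        Shape a = new Circle();
        Shape b = new Square();

        // Overload chosen at compile time from the static types.
        System.out.println(Collider.collide(a, b));        // shape/shape

        // Emulated multimethod: chosen from the runtime types.
        System.out.println(Collider.collideDynamic(a, b)); // circle/square
    }
}

// Hypothetical hierarchy, for illustration only.
abstract class Shape {}
class Circle extends Shape {}
class Square extends Shape {}

class Collider {
    static String collide(Shape a, Shape b) { return "shape/shape"; }
    static String collide(Circle a, Square b) { return "circle/square"; }

    // What a multimethod would do implicitly: inspect the runtime type
    // of every argument, not just the receiver.
    static String collideDynamic(Shape a, Shape b) {
        if (a instanceof Circle && b instanceof Square) {
            return collide((Circle) a, (Square) b);
        }
        return collide(a, b);
    }
}
```

With a language extension such as the one described in the Maya paper, the instanceof chain would presumably be generated rather than written by hand.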


As far as tools for operating directly on ASTs are concerned, I think there are several existing interaction techniques:
- a visual class diagram editor. It would be easier to implement if based on ASTs rather than text
- a syntax-aware text editor (like JDT's). Conceptually, it is such a tool.


Well, these are two straightforward examples, and they are already implemented in some products. Having a full AST manipulation framework would enable the creation of many more of these tools.
But right now in Eclipse, anyone who wants to implement an editor based on the AST instead of the text representation has to pay close attention to keeping the source code properly updated in parallel with AST modifications. You cannot forget the text.



Regarding 3, this common substrate is what the LDT is going to try to do. But the plan is to address it at an API level. Not a serialized on-disk format.



I think there is a compromise (see below)

Regarding 4, I think having a high-level language to express how tooling should be generated is a good and interesting idea. However, I'm not clear on how persisting ASTs forces this to happen.


Ok, maybe I exaggerated a bit saying that it would force us ;o) But it would greatly facilitate it.
By the way, the JetBrains article mentioned by Chris is a really excellent read on this matter.



Now, let's consider what you would lose if you were to persist ASTs. First, you would be forcing users to either use the Eclipse toolset, or to understand the serialized AST format. If someone wants to read the persisted format outside of Eclipse, they need to understand this format, and I don't think it is fair to ask people to do that. So think about how frustrated people will be when they want to use another IDE, or emacs/vi/notepad. Also, what about diff tools? Source control tools? Command-line text-processing tools such as perl, wc, grep,...? What about integrating projects into pre-existing build systems?


These are very strong and valid counter-arguments. I personally have no easy solution for them, apart from developing a new SCM system suitable for keeping the history of a source code database...

So maybe a good compromise would be to extend the AST API in Eclipse so that:
- it is feasible to create a tool that reasons *only* on ASTs
- a plugin could be developed that provides AST-level storage and SCM (meaning filesystem storage would be another plugin)
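
A rough Java sketch of what the two points above could look like. Everything here is hypothetical (AstStorage, TextStorage, NativeStorage, Tools, and the toy Ast, which is just a list of names); none of it is an existing Eclipse API. The point is only that the tool talks to a storage interface and never sees text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StorageSketch {
    public static void main(String[] args) {
        // Filesystem-like storage: text is parsed on load, printed on save.
        TextStorage text = new TextStorage();
        text.files.put("a", "foo bar foo");
        Tools.rename(text, "a", "foo", "baz");
        System.out.println(text.files.get("a")); // baz bar baz

        // AST-level storage: the same tool runs unchanged, no parsing involved.
        NativeStorage ast = new NativeStorage();
        ast.save("a", new Ast(new ArrayList<>(Arrays.asList("foo", "bar"))));
        Tools.rename(ast, "a", "foo", "baz");
        System.out.println(ast.load("a").names);
    }
}

// Toy AST: a flat list of identifiers (assumption for brevity).
class Ast {
    final List<String> names;
    Ast(List<String> names) { this.names = names; }
}

// The only thing a tool sees; each storage backend would be a plugin.
interface AstStorage {
    Ast load(String unit);
    void save(String unit, Ast ast);
}

class TextStorage implements AstStorage {
    final Map<String, String> files = new HashMap<>();
    public Ast load(String unit) {
        return new Ast(new ArrayList<>(Arrays.asList(files.get(unit).split(" "))));
    }
    public void save(String unit, Ast ast) {
        files.put(unit, String.join(" ", ast.names));
    }
}

class NativeStorage implements AstStorage {
    final Map<String, Ast> units = new HashMap<>();
    public Ast load(String unit) { return units.get(unit); }
    public void save(String unit, Ast ast) { units.put(unit, ast); }
}

class Tools {
    // A tool that reasons only on the AST.
    static void rename(AstStorage store, String unit, String from, String to) {
        Ast ast = store.load(unit);
        ast.names.replaceAll(n -> n.equals(from) ? to : n);
        store.save(unit, ast);
    }
}
```

With this split, people who need plain files for diff, grep or an existing build system keep the text-backed plugin, while an AST-level store and SCM can be tried out behind the same interface.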


Regards,
Guillaume