Re: [Dltk-dev] AST Discussion

Hi folks,

Let me try to summarize the thoughts on AST we have so far:

- Adopters (language implementors) want to be as flexible as possible with the node hierarchy.

- Core AST must add some value to a language implementation; without that value it is better to abandon Core AST in favour of language-specific ones.

There are at least 3 observable areas where we can try to find value in Core AST:

1) Core services. As Andrei S. mentioned, there are a few services built on top of Core AST; however, language implementors may provide their own implementations of those services on top of a custom AST with minimal effort.

2) AST rewrite. At this moment I can't say whether Core AST can add value to concrete AST rewriters; I hope we'll know more about this later. We are eagerly waiting to see the Zend folks' initial implementation of AST Rewrite for PHP.

3) AST persistence. This looks like a place where the Core AST framework can add great value for some languages/environments.

Problem: popular Ruby and Tcl frameworks may include hundreds of files. The nature of such languages may require the IDE to parse most of the source modules in those frameworks even for simple operations (e.g. code assist; remember that a Ruby class can be built from 70+ sources ;). An average parse time of 30-50 ms per module would not save us from long-running operations (parsing 1000 files may take up to 5 minutes).

So a persistent AST looks like a kind of universal solution to the performance problems. Of course the AST must be complete enough to fulfil the requirements of other services (e.g. the Source Element parser must be able to build the structural model from the persistent AST, and other services must be able to work without accessing the source code).

The current idea is to employ EMF for AST persistence. Language implementors will be able to build the AST from EMF objects of any kind and provide hierarchies that best reflect the target language. Services may enforce additional requirements on the hierarchy, but with EMF we can be much more flexible:

// tell the service which class represents a Statement in my language
// (assumes the constructor accepts an EClass describing statements in the language-specific AST)
new FoldingService(MyASTPackage.eINSTANCE.getMyStatementNode());
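
To make the idea concrete, here is a minimal plain-Java sketch under stated assumptions. It does not use EMF itself: NodeType is a hypothetical stand-in for EClass (its isSuperTypeOf mirrors the EClass method of the same name), Node stands in for an EObject-based AST node, and the shape of FoldingService, including foldableRegions, is assumed for illustration and is not DLTK API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an EMF EClass: a named node type with an optional supertype.
final class NodeType {
    final String name;
    final NodeType superType; // null for a root type

    NodeType(String name, NodeType superType) {
        this.name = name;
        this.superType = superType;
    }

    // True if 'other' is this type or one of its (transitive) subtypes.
    boolean isSuperTypeOf(NodeType other) {
        for (NodeType t = other; t != null; t = t.superType) {
            if (t == this) {
                return true;
            }
        }
        return false;
    }
}

// Hypothetical stand-in for an EObject-based AST node with a source range.
final class Node {
    final NodeType type;
    final int start;
    final int end;
    final List<Node> children = new ArrayList<>();

    Node(NodeType type, int start, int end) {
        this.type = type;
        this.start = start;
        this.end = end;
    }

    Node add(Node child) {
        children.add(child);
        return this;
    }
}

// Hypothetical service that is told which type plays the "statement" role.
final class FoldingService {
    private final NodeType statementType;

    FoldingService(NodeType statementType) {
        this.statementType = statementType;
    }

    // Collects {start, end} for every node whose type is a statement, in pre-order.
    List<int[]> foldableRegions(Node root) {
        List<int[]> regions = new ArrayList<>();
        collect(root, regions);
        return regions;
    }

    private void collect(Node node, List<int[]> regions) {
        if (statementType.isSuperTypeOf(node.type)) {
            regions.add(new int[] {node.start, node.end});
        }
        for (Node child : node.children) {
            collect(child, regions);
        }
    }
}
```

The service never names a concrete node class; it only asks whether a node's type descends from the type it was configured with, which is what leaves the hierarchy up to the language implementor.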

We'll also be able to persist ASTs of any kind, including ones annotated with language-specific information (virtually any EObject).

So the most value I see in Core AST right now is persistence services for ASTs of any kind. Please share your thoughts.

Kind Regards,
Andrey


----- Original Message -----
From: "Andrei Sobolev" <andrei.sobolev@xxxxxxxxx>
To: "DLTK Developer Discussions" <dltk-dev@xxxxxxxxxxx>
Sent: Tuesday, April 29, 2008 1:33:56 PM GMT +06:00 Almaty, Novosibirsk
Subject: Re: [Dltk-dev] AST Discussion

Hi all,

The current DLTK AST is mostly used as an API for some core DLTK
functionality such as search.
My opinion is to separate it from such places and create special
structures, as in ISourceElementParser, for structure model creation.
This gives us some extra room for AST modifications and allows us to make
separate sub-frameworks, like a search framework, an AST framework, etc.

For Remote functionality we need to implement a feature named "offline
indexing".
We need a utility to index source code and create special files on
remote systems (for interpreter libraries, etc). Then, if DLTK finds such
index information, it will be used to build the structure model, search,
etc. We also plan to make such indexes for interpreter libraries and
store them in metadata. This will give a great performance benefit for
remote projects and for some search and completion operations. We plan to
store ASTs in that index.

To meet this requirement we need functionality to save and load AST
trees.
In the current implementation this will be very difficult and will require
some work for each language, because we have different AST trees.

We are thinking about making an EMF-based AST tree. This will allow easy
persistence and bring some other benefits.
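
As a rough illustration of what "save and load AST trees" could look like once every language shares one generic node shape, here is a minimal sketch. It uses plain java.io streams rather than EMF, and PersistedNode and AstStore are hypothetical names that exist only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical language-neutral node: a kind label, a source range, and children.
final class PersistedNode {
    final String kind;
    final int start;
    final int end;
    final List<PersistedNode> children = new ArrayList<>();

    PersistedNode(String kind, int start, int end) {
        this.kind = kind;
        this.start = start;
        this.end = end;
    }

    PersistedNode add(PersistedNode child) {
        children.add(child);
        return this;
    }
}

// Hypothetical store that writes a tree in pre-order and reads it back.
final class AstStore {
    static byte[] save(PersistedNode root) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            write(root, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static PersistedNode load(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Layout per node: kind, start, end, child count, then each child recursively.
    private static void write(PersistedNode node, DataOutputStream out) throws IOException {
        out.writeUTF(node.kind);
        out.writeInt(node.start);
        out.writeInt(node.end);
        out.writeInt(node.children.size());
        for (PersistedNode child : node.children) {
            write(child, out);
        }
    }

    private static PersistedNode read(DataInputStream in) throws IOException {
        String kind = in.readUTF();
        int start = in.readInt();
        int end = in.readInt();
        PersistedNode node = new PersistedNode(kind, start, end);
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            node.children.add(read(in));
        }
        return node;
    }
}
```

Because the reader and writer know only the generic node shape, the same code serves every language; the per-language work mentioned above comes from each language having its own node classes.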

Best regards,
Andrei Sobolev.

> My current use of the existing DLTK structures is somewhat limited,
> coercing rules and attributes both as MethodDeclarations and the
> grammar statement as a TypeDeclaration. 
>
> Thanks,
> Gerald
>
> At 11:39 AM 4/28/2008, Mark Howe wrote:
>> That is the intent, hopefully it's possible. Do you use AST now?
>>  
>> Thanks
>> Mark
>>
>> ------------------------------------------------------------------------
>>     From: dltk-dev-bounces@xxxxxxxxxxx [
>>     mailto:dltk-dev-bounces@xxxxxxxxxxx] On Behalf Of Gerald Rosenberg
>>     Sent: Tuesday, April 22, 2008 5:24 PM
>>     To: DLTK Developer Discussions
>>     Subject: Re: [Dltk-dev] AST Discussion
>>
>>     Is the intent to generalize the AST structure enough to handle a
>>     'language' such as Antlr? 
>>
>>     Formally, an Antlr module is composed of a grammar statement,
>>     globally scoped attributes, rules, and rule scoped attributes. 
>>     While not exact, in general an attribute can be treated as an
>>     expression and a rule as a statement.  The requirements for
>>     rewriting (refactoring?) and formatting will be different from
>>     classical expressions and statements, but hopefully within the
>>     scope of the new DLTK abstractions. 
>>
>>     Happy to help flesh out the requirements.
>>
>>     Best,
>>     Gerald
>>
>>
>>
>>     At 04:10 PM 4/22/2008, Mark Howe wrote:
>>>         Andrey, Andrei and I have had some discussion about the need
>>>         for a rewriter for DLTK. The time frame is probably after
>>>         the release of 1.0 this summer. However, prior to 1.0 and
>>>         starting the rewriter we should discuss changes we may want
>>>         to make to the AST.
>>>
>>>         My reasons for suggesting changes to the AST are:
>>>
>>>         We should avoid having to work in multiple ASTs on DLTK.
>>>         With a careful design we should be able to use the
>>>         generic AST for the rewriter and formatting. This is
>>>         important to avoid duplication of work among different
>>>         languages. That won't preclude languages from using a
>>>         dedicated AST.
>>>
>>>         I have some suggestions to kick start the discussion.
>>>
>>>         Generalize the ASTNode hierarchy
>>>
>>>         Generalize the ASTNode hierarchy so it better fits all
>>>         dynamic languages. Various languages have different notions
>>>         of what an 'expression' and a 'statement' are. I suggest
>>>         removing Expression and Statement from the ASTNode hierarchy
>>>         (i.e. flattening the hierarchy). Instead, have a property on
>>>         ASTNode which returns whether it is a statement or an
>>>         expression. For instance, a field declaration is an
>>>         expression in Ruby (in fact a method declaration is an
>>>         expression too, although it returns null) but is currently a
>>>         Statement -> Declaration -> FieldDeclaration.
>>>
>>>         Modify the ASTVisitor to support the flattened hierarchy.
>>>         Currently it has
>>>
>>>         visit(Expression ...), visit(Statement ...),
>>>         visit(MethodDeclaration ...), visit(ModuleDeclaration ...) and
>>>         visit(TypeDeclaration ...)
>>>
>>>         Change it to something like
>>>
>>>         visitExpression(ASTNode ...), visitStatement(ASTNode ...), etc.
>>>
>>>         and each node would have to call the appropriate visit
>>>         method. ASTs would probably have to be created from
>>>         factories so they can be configured for each language (i.e.
>>>         whether a type of node is a statement or an expression).
>>>
>>>         Comments, other suggestions?   
>>>
>>>         Mark
>>>         _______________________________________________
>>>         dltk-dev mailing list
>>>         dltk-dev@xxxxxxxxxxx
>>>         https://dev.eclipse.org/mailman/listinfo/dltk-dev 
>>
>   
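
The flattened hierarchy proposed in Mark's quoted message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not DLTK code: NodeKind, NodeRole, FlatNode, FlatVisitor and FlatNodeFactory are hypothetical names, and the boolean return value of the visit methods (meaning "descend into children") is an assumed convention.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical role a node can play; replaces the Expression/Statement base classes.
enum NodeRole { STATEMENT, EXPRESSION }

// Hypothetical node kinds shared by all languages.
enum NodeKind { FIELD_DECLARATION, METHOD_DECLARATION, CALL }

// Visitor with one callback per role instead of one per class.
interface FlatVisitor {
    boolean visitStatement(FlatNode node);  // return false to skip the children
    boolean visitExpression(FlatNode node); // return false to skip the children
}

// Single node class; the role is a property, not a position in a class hierarchy.
final class FlatNode {
    final NodeKind kind;
    final NodeRole role;
    final List<FlatNode> children = new ArrayList<>();

    FlatNode(NodeKind kind, NodeRole role) {
        this.kind = kind;
        this.role = role;
    }

    boolean isStatement() { return role == NodeRole.STATEMENT; }

    boolean isExpression() { return role == NodeRole.EXPRESSION; }

    FlatNode add(FlatNode child) {
        children.add(child);
        return this;
    }

    // Each node calls the visit method that matches its own role.
    void traverse(FlatVisitor visitor) {
        boolean descend = isStatement()
                ? visitor.visitStatement(this)
                : visitor.visitExpression(this);
        if (descend) {
            for (FlatNode child : children) {
                child.traverse(visitor);
            }
        }
    }
}

// Per-language factory that decides which role each node kind gets.
final class FlatNodeFactory {
    private final Map<NodeKind, NodeRole> roles = new EnumMap<>(NodeKind.class);

    FlatNodeFactory role(NodeKind kind, NodeRole role) {
        roles.put(kind, role);
        return this;
    }

    FlatNode create(NodeKind kind) {
        NodeRole role = roles.get(kind);
        if (role == null) {
            throw new IllegalArgumentException("No role configured for " + kind);
        }
        return new FlatNode(kind, role);
    }
}
```

A Ruby factory would map FIELD_DECLARATION and METHOD_DECLARATION to EXPRESSION, while a factory for another language could map the same kinds to STATEMENT; visitors and services would be written once against FlatNode and FlatVisitor.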
