RE: [cdt-dev] decoupled preprocessor

Mike,
is there a chance that we can use your decoupled preprocessor for the 
current C- and C++-parsers? The DOM-Scanner really is a nightmare to
maintain.

Markus.

> -----Original Message-----
> From: cdt-dev-bounces@xxxxxxxxxxx 
> [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Mike Kucera
> Sent: Tuesday, June 19, 2007 23:40
> To: CDT General developers list.
> Subject: RE: [cdt-dev] decoupled preprocessor
> 
> It looks like you are planning to do preprocessing on the raw character
> stream and then feed the result to your ANTLR lexer.
> 
> The C99 preprocessor works differently: it processes a token stream, not a
> character stream. It creates a CodeReader for each include, passes it to
> the lexer, and expects a token stream as the result. It then adds that token
> stream to its own input and continues processing.
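
[Editor's sketch] The include handling described above can be illustrated roughly as follows. This is a minimal sketch, not the C99 preprocessor's actual code: IncludeSketch, lex, and preprocess are hypothetical names, tokens are plain strings, the lexer just splits on whitespace, and a Map stands in for the CodeReader lookup.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the CDT C99 preprocessor API.
final class IncludeSketch {

    // Toy lexer: splits the source on whitespace. A real lexer would also
    // recognize punctuation and drop comments.
    static List<String> lex(String source) {
        List<String> tokens = new ArrayList<>();
        for (String t : source.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Works on tokens only. On "#include NAME" it lexes the named source and
    // pushes the resulting tokens onto the front of its own pending input.
    static List<String> preprocess(String mainSource, Map<String, String> files) {
        Deque<String> input = new ArrayDeque<>(lex(mainSource));
        List<String> output = new ArrayList<>();
        while (!input.isEmpty()) {
            String tok = input.removeFirst();
            if (tok.equals("#include") && !input.isEmpty()) {
                String name = input.removeFirst();
                List<String> included = lex(files.getOrDefault(name, ""));
                // Push in reverse so the included tokens come out in order.
                for (int i = included.size() - 1; i >= 0; i--) {
                    input.addFirst(included.get(i));
                }
            } else {
                output.add(tok);
            }
        }
        return output;
    }
}
```

Because included tokens go back onto the input, nested includes are handled by the same loop. The sketch has no include guards, so a file that includes itself would loop forever.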
> 
> I don't know which approach makes more sense with ANTLR. With LPG I was
> able to separate the lexer and parser and stick the preprocessor
> in between.
> 
> I believe that doing lexing before preprocessing makes the preprocessing
> phase much easier to write and maintain. For example, the C99 preprocessor
> doesn't need to deal with comments; judging from bug reports, this is
> something that has caused many issues in the DOM scanner. Also, the code is
> cleaner because it processes a token stream instead of a raw character
> stream (for example, compare Macro.invoke() to
> BaseScanner.expandFunctionStyleMacro()).
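
[Editor's sketch] To show why token-level macro expansion tends to be simpler, here is a minimal sketch of a function-style macro that substitutes argument tokens for parameter tokens. MacroSketch is a hypothetical class and is not the Macro.invoke() mentioned above; it ignores rescanning, stringification (#), and token pasting (##).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the CDT Macro class.
final class MacroSketch {
    private final List<String> params; // parameter names, e.g. ["x"]
    private final List<String> body;   // replacement list as tokens

    MacroSketch(List<String> params, List<String> body) {
        this.params = params;
        this.body = body;
    }

    // Replaces each parameter token in the body with the tokens of the
    // matching argument. No character scanning or comment handling is
    // needed because the lexer has already produced clean tokens.
    List<String> invoke(List<List<String>> args) {
        if (args.size() != params.size()) {
            throw new IllegalArgumentException("wrong number of macro arguments");
        }
        List<String> result = new ArrayList<>();
        for (String tok : body) {
            int index = params.indexOf(tok);
            if (index >= 0) {
                result.addAll(args.get(index));
            } else {
                result.add(tok);
            }
        }
        return result;
    }
}
```

For a macro defined as SQ(x) with body ( x ) * ( x ), invoking it with the argument tokens a + 1 yields ( a + 1 ) * ( a + 1 ).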
> 
> Also, if you return raw characters from the preprocessor, then how will you
> calculate the offsets on the AST nodes? The offsets are normally
> contained in the tokens.
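
[Editor's sketch] The point about offsets can be illustrated with a toy lexer whose tokens remember where they came from in the original source, even though comments are dropped. OffsetSketch and its Token class are hypothetical names, not CDT types, and the lexer only knows identifiers, numbers, single-character punctuation, and block comments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not a CDT token or lexer class.
final class OffsetSketch {

    static final class Token {
        final String text;
        final int start; // offset of the first character in the source
        final int end;   // offset just past the last character

        Token(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    private static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    // Skips whitespace and /* ... */ comments, recording source offsets on
    // every token so later phases can map AST nodes back to the file.
    static List<Token> lex(String src) {
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }
            if (src.startsWith("/*", i)) {
                int close = src.indexOf("*/", i + 2);
                i = (close < 0) ? src.length() : close + 2;
                continue;
            }
            int start = i;
            if (isWordChar(c)) {
                while (i < src.length() && isWordChar(src.charAt(i))) {
                    i++;
                }
            } else {
                i++; // single-character punctuation
            }
            out.add(new Token(src.substring(start, i), start, i));
        }
        return out;
    }
}
```

A preprocessor fed these tokens never sees the comment, yet the offsets still point into the original text. A preprocessor that emits raw characters would need a separate mapping to recover them.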
> 
> > But if you already have everything we've done
> > there, then that might be the better approach.
> 
> Well, I hope so :) It's pretty new and I'm still working out the bugs. It
> does have a few features the DOM scanner doesn't, like support for
> trigraphs.
> 
> I hope you do decide to give it a try. I'll decouple it soon.
> 
> 
> Mike Kucera
> Software Developer
> IBM CDT Team, Toronto
> mkucera@xxxxxxxxxx
> 
> 
> 
> From: Doug Schaefer <DSchaefer@xxxxxxm>
> Sent by: cdt-dev-bounces@eclipse.org
> Date: 06/19/2007 03:53 PM
> To: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
> Subject: RE: [cdt-dev] decoupled preprocessor
> Please respond to: "CDT General developers list." <cdt-dev@eclipse.org>
> 
> 
> 
> 
> Yes, it is definitely something I'll need. I'll need to take a look at what
> you've done. ANTLR uses its own character stream interface to feed
> characters to the lexer. It provides implementations that can pull that out
> of Readers and InputStreams. I will likely want to create a new one that
> doesn't try to load it all into a char[] at startup like the built-in ones
> do. We can then hook that up to the preprocessor.
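
[Editor's sketch] A character source that reads on demand instead of loading everything into a char[] might look roughly like this. LazyCharStreamSketch is a hypothetical class that does not implement ANTLR's stream interface; a real ANTLR stream would likely need more, such as deeper lookahead and mark/rewind, which are not modeled here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical illustration only; does not implement any ANTLR interface.
final class LazyCharStreamSketch {
    static final int EOF = -1;

    private final Reader reader;
    private int lookahead;        // buffered character, valid if hasLookahead
    private boolean hasLookahead;
    private int offset;           // number of characters consumed so far

    LazyCharStreamSketch(Reader reader) {
        this.reader = reader;
    }

    // Returns the next character without consuming it, reading from the
    // underlying Reader only when needed.
    int peek() throws IOException {
        if (!hasLookahead) {
            lookahead = reader.read();
            hasLookahead = true;
        }
        return lookahead;
    }

    // Consumes and returns the next character, or EOF at end of input.
    int next() throws IOException {
        int c = peek();
        hasLookahead = false;
        if (c != EOF) {
            offset++;
        }
        return c;
    }

    int offset() {
        return offset;
    }

    // Convenience helper for demonstration: drains a string through the
    // stream one character at a time.
    static String drain(String text) {
        LazyCharStreamSketch in = new LazyCharStreamSketch(new StringReader(text));
        StringBuilder sb = new StringBuilder();
        try {
            for (int c = in.next(); c != EOF; c = in.next()) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

Only one character is buffered at a time, so memory use stays constant however large the input is. Wrapping the Reader in a BufferedReader would keep the per-character reads cheap.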
> 
> I'm not sure how you built yours, but the easiest path I can see is to take
> our current scanner, replace nextToken with getChar, and strip out
> anything that creates a token. But if you already have everything we've
> done there, then that might be the better approach.
> 
> Anyway, another shiny object flew by called CDT user docs, so I'll get back
> to ANTLR in a few days :).
> 
> Cheers,
> Doug Schaefer, QNX Software Systems
> Eclipse CDT Project Lead, http://cdtdoug.blogspot.com
> 
> 
> > -----Original Message-----
> > From: cdt-dev-bounces@xxxxxxxxxxx
> > [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Mike Kucera
> > Sent: Tuesday, June 19, 2007 3:43 PM
> > To: CDT General developers list.
> > Subject: [cdt-dev] decoupled preprocessor
> >
> >
> > Hi Doug,
> >
> > I take it from your latest blog post that you are going to be in need of a
> > preprocessor for your ANTLR C++ experiment. I was planning on decoupling
> > the preprocessor that I wrote for the C99 parser so that it can be used with
> > any parser. If you are interested in picking this up, when would you need
> > it?
> >
> > Mike Kucera
> > Software Developer
> > IBM CDT Team, Toronto
> > mkucera@xxxxxxxxxx
> >
> > _______________________________________________
> > cdt-dev mailing list
> > cdt-dev@xxxxxxxxxxx
> > https://dev.eclipse.org/mailman/listinfo/cdt-dev