Bug 173110 - C99 parser contribution
Summary: C99 parser contribution
Status: RESOLVED FIXED
Alias: None
Product: CDT
Classification: Tools
Component: cdt-core (show other bugs)
Version: 4.0   Edit
Hardware: PC Windows XP
: P3 normal (vote)
Target Milestone: 4.0 RC0   Edit
Assignee: Chris Recoskie CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on: 179393
Blocks:
  Show dependency tree
 
Reported: 2007-02-06 11:57 EST by Mike Kucera CLA
Modified: 2014-01-29 17:00 EST (History)
4 users (show)

See Also:


Attachments
preliminary plug-in (66.33 KB, application/zip)
2007-02-06 12:09 EST, Mike Kucera CLA
no flags Details
parser tests patch (26.68 KB, text/plain)
2007-02-06 12:10 EST, Mike Kucera CLA
no flags Details
preprocessor contribution (deleted)
2007-03-13 14:54 EDT, Mike Kucera CLA
no flags Details | Diff
preprocessor contribution (deleted)
2007-03-13 14:54 EDT, Mike Kucera CLA
no flags Details | Diff
preprocessor contribution (deleted)
2007-03-13 14:56 EDT, Mike Kucera CLA
no flags Details | Diff
preprocessor contribution (deleted)
2007-03-13 15:00 EDT, Mike Kucera CLA
no flags Details | Diff
preprocessor contribution (513.27 KB, patch)
2007-03-14 11:21 EDT, Mike Kucera CLA
no flags Details | Diff
updated preprocessor patch (513.54 KB, patch)
2007-03-14 15:02 EDT, Mike Kucera CLA
no flags Details | Diff
patch number 3 (380.70 KB, patch)
2007-04-03 11:35 EDT, Mike Kucera CLA
cdtdoug: iplog+
Details | Diff
Patch #4 (attached on behalf of Mike) (744.73 KB, patch)
2007-04-26 21:05 EDT, Chris Recoskie CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mike Kucera CLA 2007-02-06 11:57:45 EST
This work involves creating a new CDT parser for C99. The parser will be generated from grammar files using an LALR parser generator named LPG. 

LPG is a very easy-to-use and feature rich parser generator. It supports automatic recovery from syntax errors as well as a backtracking parser for handling ambiguous grammars. Automatic AST generation is also supported; however we do not plan to use this feature since CDT already contains a C99 compatible AST that can be easily reused. The grammar files will contain calls to actions that are used to generate the AST.

This work forms the basis for a framework that would allow CDT to be extended with different flavors of the C language. LPG has a very handy feature: it allows one grammar file to include another. The including file may add and redefine grammar rules as necessary. This feature can be used to add rules for language extensions and generate new parsers. 

LPG on sourceforge:  http://sourceforge.net/projects/lpg/
Comment 1 Mike Kucera CLA 2007-02-06 12:09:41 EST
Created attachment 58359 [details]
preliminary plug-in

This is a preliminary plug-in to give people an idea of how this work is progressing. Two grammar files are provided so far, one for the scanner and one for the parser. The preprocessor has not been implemented yet. A small test suite is provided that extends from AST2Tests.

The LPG java runtime is not included. It can be downloaded from sourceforge. The file lpgjavaruntime.jar should go in the root of the plug-in.

Any questions or comments are appreciated…
Comment 2 Mike Kucera CLA 2007-02-06 12:10:53 EST
Created attachment 58360 [details]
parser tests patch

This patch adds constructor methods to several of the parser test suites. This allows the current test suites to be reused with the C99 parser.
Comment 3 Chris Recoskie CLA 2007-02-07 11:11:43 EST
LPG 1.1 has been added to the orbit project.  We are refactoring the plugin to consume LPG from there.
Comment 4 Chris Recoskie CLA 2007-02-08 16:05:01 EST
I reworked the plugin to use the LPG from Orbit.  I've not included the JAR... we have to figure out how all this is going to fit into the CDT build process.  For now people that want to try the plugin out will have to get LPG from the Orbit download page.

I've refactored the plugin so that the tests are in their own plugin as well.

Everything lives in CVS under org.eclipse.cdt/c99.  It should all show up shortly.

I'll leave this Bugzilla open to facilitate further patch drops on the feature.

Thanks Mike, keep up the good work!
Comment 5 Doug Schaefer CLA 2007-02-08 16:09:37 EST
At the end of the day, we need to use the LPG plugin as it sits in Orbit. This will allow the CDT and other projects in Europa and beyond to use the same copy.

We already do that of sorts with our test plugins which use a couple of plugins out of the eclipse repository.
Comment 6 Chris Recoskie CLA 2007-02-09 09:51:45 EST
 (In reply to comment #5)
> At the end of the day, we need to use the LPG plugin as it sits in Orbit. This
> will allow the CDT and other projects in Europa and beyond to use the same copy.
> We already do that of sorts with our test plugins which use a couple of plugins
> out of the eclipse repository.

Agreed.  We just need to figure out the logistics.  Really all the Orbit stuff should just be on the Europa update site and we could reference a hypothetical feature for LPG that was on that site.  They don't currently have an update site though.
Comment 7 Doug Schaefer CLA 2007-02-09 10:15:10 EST
Agreed. I was just concerned that you were thinking of including the LPG jar in your feature. It should be in its own feature.
Comment 8 Chris Recoskie CLA 2007-02-09 11:50:44 EST
One other thing that I didn't mention that I wanted to get opinions on...

The current GNU parser isn't really maintained much... most of the original authors have moved on to other things, and it's hard to navigate the depths of that code.  I'm thinking at some point down the road that it would write a GNU extension on top of this C99 parser and then replace the old parser with it.  Then it would theoretically be easier to fix bugs as well as to cope with changes to the language over time.

Nothing imminent (definitely not for 4.0), but food for thought.
Comment 9 Doug Schaefer CLA 2007-02-09 12:43:58 EST
(In reply to comment #8)
> I'm thinking at some point down the road that it would write a GNU
> extension on top of this C99 parser and then replace the old parser with it. 

Of course, before doing that you'd make sure that all the JUnits pass and the indexing performance isn't severely degraded ;).
Comment 10 Chris Recoskie CLA 2007-02-14 13:59:17 EST
 (In reply to comment #7)
> Agreed. I was just concerned that you were thinking of including the LPG jar in
> your feature. It should be in its own feature.

Hmm...

From the Orbit Wiki:

"There are no features from Orbit, for consumers to use. The goal is that these bundles can be included in other project's features as needed. While this results in some duplication in projects zips or update sites, since all projects use exactly the same bundles from Orbit, there should be no duplication once installed."

Looks like we'll have to bundle it ourselves.
Comment 11 Doug Schaefer CLA 2007-02-14 14:11:46 EST
Did you check with Jeff McAffer to see if this is really the policy? Seems to me that if you have to do your own IP check and releng, what the hell would the point be then...
Comment 12 Chris Recoskie CLA 2007-02-14 14:18:10 EST
(In reply to comment #11)
> Did you check with Jeff McAffer to see if this is really the policy? Seems to
> me that if you have to do your own IP check and releng, what the hell would the
> point be then...

I guess consistent bundling?  I'll talk to him.
Comment 13 Mike Kucera CLA 2007-03-14 11:21:10 EDT
Created attachment 60810 [details]
preprocessor contribution

This patch adds a new preprocessor implementation to the C99 parser. This preprocessor is token-based, and sits between the lexer and the parser. Several test suites are included, with more to follow.

Most of the C99 parser so far is brand new. However the AST implementation and the LocationMap class from CDT are being reused.

I would like to further modularize the preprocessor in the future in order to make it reusable with other languages. I remember reading something about how the fortran people want a C preprocessor.

This patch includes a slight modification to the core tests to allow some of the test cases to be reused.

Moving on to implement content assist :o
Comment 14 Mike Kucera CLA 2007-03-14 11:25:04 EDT
I was trying to upload the patch yesterday but bugzilla kept giving me mysql errors, thats why there's four empty patches in the attachment list. I couldn't figure out how to get rid of them so I just marked them obsolete.
Comment 15 Mike Kucera CLA 2007-03-14 15:02:07 EDT
Created attachment 60842 [details]
updated preprocessor patch

Updated patch, marked failing test case.
Comment 16 Chris Recoskie CLA 2007-03-14 15:17:02 EDT
 (In reply to comment #15)
> Created an attachment (id=60842)
> updated preprocessor patch
> Updated patch, marked failing test case.

Applied to HEAD.

Thanks, keep 'em coming :-)
Comment 17 Mike Kucera CLA 2007-04-03 11:35:53 EDT
Created attachment 62793 [details]
patch number 3

This updated patch adds:
- content assist support
- much better error reporting in the preprocessor
- better integration with CDT
- more test cases
Comment 18 Chris Recoskie CLA 2007-04-03 14:49:30 EDT
I'm noticing some missing NON-NLS tags and some unused imports but other than that it looks good.  Once I get signoff on committing 179393 I'll commit this along with it.

I'll clean up the NLS tags and imports so don't bother submitting another patch to fix those.
Comment 19 Chris Recoskie CLA 2007-04-04 11:55:42 EDT
Applied the latest patches.  Thanks Mike.
Comment 20 Chris Recoskie CLA 2007-04-26 21:05:11 EDT
Created attachment 65133 [details]
Patch #4 (attached on behalf of Mike)

Mike has forwarded me patch #4.  For the sake of saving time I'm attaching this on his behalf.  I'll be committing it shortly
Comment 21 Chris Recoskie CLA 2007-04-26 21:08:31 EDT
Applying Mike's latest patch to HEAD.

This feature is essentially feature complete.  I'm going to mark this as FIXED.  Any outstanding bugs we'll track separately with new Bugzillas.
Comment 22 Chris Recoskie CLA 2007-04-26 21:10:52 EDT
BTW there was a slight problem in that there was a duplicate build path entry on the tests plugin.  I fixed that while I was at it.