Fw: [cdt-dev] LR parser and LPG version

Again posting on behalf of Bob Fuhrer.


...Beth

Beth Tibbitts
Eclipse Parallel Tools Platform http://eclipse.org/ptp
IBM STG Communications Protocols and Tools
Mailing Address: IBM Corp., Coldstream Research Campus, 745 West New Circle Road, Lexington, KY 40511
----- Forwarded by Beth Tibbitts/Watson/IBM on 03/03/2010 03:48 PM -----


From: "Robert M. Fuhrer" <rfuhrer@xxxxxxxxxxxxxx>
To: Beth Tibbitts/Watson/IBM@IBMUS
Date: 03/03/2010 03:45 PM
Subject: Re: [cdt-dev] LR parser and LPG version



      Well... the LR C++ parser passes something like 99% of the parser test suites, which shows that using a parser generator for C++ isn't necessarily a bad idea. Of course, I had to use some clever trickery in a few places to get it to work. The real star of the show, in my opinion, is CDT's ability to resolve ambiguity nodes in the AST; this is what makes using a parser generator for C++ possible. In my experience, trying to resolve ambiguities during the parse is a dead end, but deferring the resolution until after the parse works very well. The DOM parser uses the same approach, so this also helps when the parser is hand-written.

      In the end there's a trade-off: the DOM parser is more accurate and has a few extra features, but the LR parser is (in theory) easier for third parties to extend. Your mileage may vary.

Hi there,

You might be interested to know that the LPG runtime v2.0.17 has recently been added to Orbit.

There are quite a few significant bug fixes relative to v1.1. As for compatibility, the main part of the grammar specification is backward-compatible with v1.x; the most visible changes are to certain options, to the handling of EOF/EOL/error tokens, and to the grammar/parser template files, which have been revised. At present, it's being used by MDT/OCL (from the EMF project) and IMP, along with a variety of projects outside eclipse.org.

Also, FWIW, the prevailing consensus among parsing experts on dealing with tricky languages like C++, JavaScript, and so on, is not to try to disambiguate at parse time, but to do it in a later pass (as it seems you're doing). There are two camps there: (a) rewrite the tree after the fact when the parser guesses wrong, and (b) use GLR to get a tree of possible parses, and weed out the bad ones in a subsequent pass. They've been extremely successful with that approach. In short, there are others who believe you're on the right track. :-)
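
To make approach (b) a bit more concrete, here is a rough sketch in Java (16 or later, for records) of the general idea: the parser records every reading of an ambiguous construct in a placeholder node, and a later pass picks the reading that agrees with the known type names. It uses the classic C/C++ statement "a * b;", which is a declaration if "a" names a type and a multiplication otherwise. All of the class and method names below are made up for illustration; this is neither the CDT ambiguity-node API nor anything generated by LPG.

```java
import java.util.List;
import java.util.Set;

public class DeferredDisambiguation {
    // Hypothetical parse-tree node type (not CDT or LPG API).
    interface Node {}

    // "a * b;" read as: declare b as a pointer to type a.
    record Declaration(String typeName, String declaredName) implements Node {}

    // "a * b;" read as: multiply a by b and discard the result.
    record Multiplication(String left, String right) implements Node {}

    // Placeholder emitted when the parser cannot decide between readings.
    record Ambiguity(List<Node> alternatives) implements Node {}

    // Parse time: no symbol lookup at all, just keep both readings.
    static Node parseStarStatement(String left, String right) {
        return new Ambiguity(List.of(
                new Declaration(left, right),
                new Multiplication(left, right)));
    }

    // Later pass: return the first alternative consistent with the type names.
    static Node resolve(Node node, Set<String> typeNames) {
        if (!(node instanceof Ambiguity amb)) {
            return node; // nothing to resolve
        }
        for (Node alt : amb.alternatives()) {
            if (isConsistent(alt, typeNames)) {
                return alt;
            }
        }
        throw new IllegalStateException("no valid reading for " + node);
    }

    // A declaration needs a known type; a multiplication needs a non-type operand.
    static boolean isConsistent(Node alt, Set<String> typeNames) {
        if (alt instanceof Declaration d) {
            return typeNames.contains(d.typeName());
        }
        if (alt instanceof Multiplication m) {
            return !typeNames.contains(m.left());
        }
        return true;
    }

    public static void main(String[] args) {
        Node tree = parseStarStatement("a", "b");
        System.out.println(resolve(tree, Set.of("a"))); // "a" is a type
        System.out.println(resolve(tree, Set.of()));    // "a" is not a type
    }
}
```

The point of the sketch is only that the parse step never consults the symbol table; the same tree resolves to a declaration or a multiplication depending on what the later pass knows. A real implementation would of course have to deal with nested ambiguities and scoping, which this leaves out.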

--
Cheers,
- Bob
-------------------------------------------------
Robert M. Fuhrer
Research Staff Member
Programming Technologies Dept.
IBM T.J. Watson Research Center

IMP Project Lead (http://www.eclipse.org/imp)
X10: Productivity for High-Performance Parallel Programming (http://x10-lang.org)
