Bug 458705 - [Serializer] Sequencer init method/create gets too big for some grammars
Status: NEW
Alias: None
Product: TMF
Classification: Modeling
Component: Xtext
Version: 2.7.3
Hardware: PC Mac OS X
Importance: P3 normal
Target Milestone: ---
Assignee: Project Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-01-29 01:56 EST by Christian Dietrich CLA
Modified: 2015-11-11 04:15 EST
4 users

See Also:


Attachments
The sequencer.java class (768.63 KB, application/octet-stream)
2015-01-29 04:16 EST, Puneet Patwari CLA

Description Christian Dietrich CLA 2015-01-29 01:56:26 EST
The init method of the generated Sequencer gets too big for some grammars.
Maybe its pattern should be changed (e.g. by factoring out the or-ed conditions into their own methods).

see 
- https://www.eclipse.org/forums/index.php/t/457587/
- https://www.eclipse.org/forums/index.php/m/1564072/
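
For illustration, the suggested refactoring could look roughly like the sketch below. This is not the actual generated Xtext code: the class, the enum and the method names are hypothetical, and the sketch only assumes that the generated method contains long or-ed context checks. The background is that the JVM limits the bytecode of a single method to 64 KB, so moving each or-ed condition into a small method of its own keeps the large method shorter.

```java
// Hypothetical stand-in for a generated sequencer; not real Xtext output.
final class OredConditionDemo {

    // Hypothetical grammar contexts that the generated code would compare against.
    enum Context { EXPRESSION, ADDITION, MULTIPLICATION, STATEMENT }

    // Before: the or-ed condition is inlined, so every such check
    // adds to the bytecode size of this one method.
    static String dispatchInline(Context context) {
        if (context == Context.EXPRESSION
                || context == Context.ADDITION
                || context == Context.MULTIPLICATION) {
            return "sequence_Expression";
        }
        return "sequence_Statement";
    }

    // After: the or-ed condition lives in its own small method.
    static boolean isExpressionContext(Context context) {
        return context == Context.EXPRESSION
                || context == Context.ADDITION
                || context == Context.MULTIPLICATION;
    }

    // The large method now only contains one call per condition.
    static String dispatchFactored(Context context) {
        if (isExpressionContext(context)) {
            return "sequence_Expression";
        }
        return "sequence_Statement";
    }

    public static void main(String[] args) {
        for (Context c : Context.values()) {
            System.out.println(c + " -> " + dispatchFactored(c));
        }
    }
}
```

Both variants return the same result for every context; only the distribution of the code across methods changes.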
Comment 1 Puneet Patwari CLA 2015-01-29 04:16:01 EST
Created attachment 250335
The sequencer.java class

Please look into the init(IGrammarAccess) method.
Comment 2 Puneet Patwari CLA 2015-01-29 04:19:09 EST
Hi Christian

Thanks for filing the bug. I am using Windows 7 and the following Eclipse version:
Eclipse DSL Tools

Version: Kepler Service Release 2
Build id: 20140224-0627

I have tried the solution from bug https://bugs.eclipse.org/bugs/show_bug.cgi?id=349992, which says to modify the fieldPerClass in the Options block, but I still could not make it work.

Thanks in advance for your help.

Regards
Puneet
Comment 3 Moritz Eysholdt CLA 2015-01-29 04:20:04 EST
Thank you Puneet, but I'll need the grammar to do some useful analysis. Could you attach it to this bug? If you can't make the grammar public, would it be possible to send it to me via email? I can assure you that I will not pass it on to anybody else.
Comment 4 Moritz Eysholdt CLA 2015-01-29 04:22:27 EST
The options discussed in bug 349992 are for the parser and do not affect the serializer.
Comment 5 Moritz Eysholdt CLA 2015-01-29 04:32:36 EST
If the generated init() method gets too big, this means that for the serializer there are many syntactic ambiguities in your grammar.

Example:

Rule:
  "foo"? ("bar" | "baz") buz="buz";

For the serializer, there are two syntactic ambiguities here:
- It doesn't know whether to write out "foo"
- It doesn't know whether to write out "bar" or "baz"
- "buz" is *not* a syntactic ambiguity, because due to the assignment buz= it's stored in the model.

In your MyDslSyntacticSequencer, you'll find many emit_*() methods. Each method handles one syntactic ambiguity and the JavaDoc comment in front of the method contains the grammar snippet that is actually ambiguous for the serializer.

To work around the too-big init() method, you can modify your grammar to be less ambiguous to the serializer, for example by introducing more assignments.
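
To make the example rule above more concrete, here is a small Java sketch (a toy model, not Xtext code; the class and method names are made up for illustration). It enumerates every token sequence that parses to the same model for the rule `"foo"? ("bar" | "baz") buz="buz"`. The two unassigned elements each contribute a choice that the serializer has to make on its own, while the assigned `buz` is fixed by the model.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the rule: "foo"? ("bar" | "baz") buz="buz";
final class AmbiguityDemo {

    // Enumerates every token sequence that is valid for the same model.
    static List<String> validSerializations() {
        String[] optionalFoo = {"", "foo "};    // "foo"? is unassigned and optional
        String[] alternative = {"bar", "baz"};  // ("bar" | "baz") is unassigned
        String assigned = "buz";                // buz="buz" is stored in the model

        List<String> result = new ArrayList<>();
        for (String foo : optionalFoo) {
            for (String alt : alternative) {
                result.add(foo + alt + " " + assigned);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints one candidate serialization per line.
        validSerializations().forEach(System.out::println);
    }
}
```

With two independent binary choices there are four candidates for a single model. If the optional keyword and the alternative were assigned to features, the model would determine both and only one candidate would remain.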
Comment 6 Moritz Eysholdt CLA 2015-01-29 04:34:27 EST
Don't confuse this bug with bug 457579. That bug appeared for a different reason.
Comment 7 Puneet Patwari CLA 2015-01-29 04:48:43 EST
Hi Moritz

Thanks for your comments. Firstly, I would also like to point out that I have quite a few unordered groups in my grammar.

Secondly, yes, I see that bug 349992 concerns an ANTLR generation problem.

Thirdly, I am trying to understand the example and apply it.

Fourthly, I have sent you the grammar.

Your cooperation would be highly appreciated.

Regards
Puneet
Comment 8 Christian Dietrich CLA 2015-01-29 04:53:25 EST
Ahh, I missed bug 457579.
Comment 9 Puneet Patwari CLA 2015-01-29 05:21:01 EST
Hi Moritz

I understood the example and the concern in your comment. I also tried modifying my grammar to reduce the number of emit_*() methods. In fact, I could avoid almost all of them by introducing concrete assignments, but I cannot do that, because I am generating the Xtext grammar from an EMF Ecore model and not vice versa. Therefore I cannot create utility EClasses in my Ecore model just to compensate for the ambiguities that arise in the serializer. I hope you understand what I mean.

Regards
Puneet
Comment 10 Puneet Patwari CLA 2015-01-30 05:52:20 EST
Hi Moritz,

I would really like to thank you, because your encouragement to clean up the grammar worked out very well. I cleaned up the grammar compared to what I sent you yesterday. The init() method has shrunk from 250 lines to only 10! In the process, I also reduced the number of emit_*() methods. I think these simple tricks should be provided in the documentation itself, so that users benefit.

Anyway, the problem of the init() method exceeding the 640K size still exists, and I hope you find a way out.

Thanks & Regards
Puneet
Comment 11 Moritz Eysholdt CLA 2015-02-03 14:26:58 EST
The fix pushed to 
https://git.eclipse.org/r/#/c/41016/
reduces the size of Puneet's generated SyntacticSequencer from 2632 to 397 LOC. Java compiles the init() method just fine.
The patch needs more cleanup and testing, though.
Comment 12 Moritz Eysholdt CLA 2015-02-04 09:02:37 EST
I'm interrupting my work on this bug in favor of other, faster-to-fix, bugs.

The current state of work can be found here:
https://git.eclipse.org/r/#/c/41016/