458705 – [Serializer] Sequencer init method/create gets too big for some grammars

Status:	NEW

Alias:	None

Product:	TMF
Classification:	Modeling
Component:	Xtext
Version:	2.7.3
Hardware:	PC Mac OS X

Importance:	P3 normal
Target Milestone:	---
Assignee:	Project Inbox
QA Contact:

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2015-01-29 01:56 EST by Christian Dietrich
Modified:	2015-11-11 04:15 EST
CC List:	4 users

See Also:

Attachments
The sequencer.java class (768.63 KB, application/octet-stream) 2015-01-29 04:16 EST, Puneet Patwari

Description Christian Dietrich

2015-01-29 01:56:26 EST

The generated Sequencer for some grammars gets too big in its init method.
Maybe its pattern should be changed (e.g. by factoring out the or-ed conditions into their own methods).
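
The idea of factoring conditions out into their own methods might look roughly like the sketch below. It is not the actual generated code: the class `SplitInitSketch`, its helper methods and the string checks are hypothetical stand-ins for the generated or-ed conditions. The only point is that Java rejects a method whose bytecode exceeds 64 KB, so keeping each condition in its own small method stops any single method body from growing with the grammar.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for a generated sequencer; none of these names come from Xtext.
class SplitInitSketch {
    private final Map<String, Predicate<String>> matchers = new LinkedHashMap<>();

    // init() only wires things up; each or-ed condition lives in its own method,
    // so no single method body grows with the size of the grammar.
    void init() {
        matchers.put("optionalFoo", SplitInitSketch::matchesOptionalFoo);
        matchers.put("barOrBaz", SplitInitSketch::matchesBarOrBaz);
    }

    private static boolean matchesOptionalFoo(String token) {
        return token.isEmpty() || token.equals("foo");
    }

    private static boolean matchesBarOrBaz(String token) {
        return token.equals("bar") || token.equals("baz");
    }

    // Looks up a condition by name; unknown names never match.
    boolean matches(String name, String token) {
        Predicate<String> p = matchers.get(name);
        return p != null && p.test(token);
    }

    public static void main(String[] args) {
        SplitInitSketch s = new SplitInitSketch();
        s.init();
        System.out.println(s.matches("barOrBaz", "baz"));
        System.out.println(s.matches("barOrBaz", "foo"));
    }
}
```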

see 
- https://www.eclipse.org/forums/index.php/t/457587/
- https://www.eclipse.org/forums/index.php/m/1564072/

Comment 1 Puneet Patwari

2015-01-29 04:16:01 EST

Created attachment 250335
The sequencer.java class

Please look into the init(IGrammarAccess) method.

Comment 2 Puneet Patwari

2015-01-29 04:19:09 EST

Hi Christian

Thanks for filing a bug. I am using Windows 7 and eclipse version: 
Eclipse DSL Tools

Version: Kepler Service Release 2
Build id: 20140224-0627

I have tried the solution from bug 349992 (https://bugs.eclipse.org/bugs/show_bug.cgi?id=349992), which says to modify fieldPerClass in the options block, but I still could not make it work.

Thanks in advance for your help.

Regards
Puneet

Comment 3 Moritz Eysholdt

2015-01-29 04:20:04 EST

Thank you Puneet, but I'll need the grammar to do some useful analysis. Could you attach it to this bug? If you can't make the grammar public, would it be possible to send it to me via email? I can assure you that I will not pass it on to anybody else.

Comment 4 Moritz Eysholdt

2015-01-29 04:22:27 EST

The options discussed in bug 349992 are for the parser and do not affect the serializer.

Comment 5 Moritz Eysholdt

2015-01-29 04:32:36 EST

If the generated init() method gets too big, this means that your grammar contains many syntactic ambiguities from the serializer's point of view.

Example:

Rule:
  "foo"? ("bar" | "baz") buz="buz";

For the serializer, there are two syntactic ambiguities here:
- It doesn't know whether to write out "foo"
- It doesn't know whether to write out "bar" or "baz"
- "buz" is *not* a syntactic ambiguity, because due to the assignment buz= it's stored in the model.
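
To make the two ambiguities concrete, here is a toy sketch that enumerates every token sequence the rule above accepts for one and the same model. It is an illustration only and not how Xtext's serializer is implemented; the class `AmbiguitySketch` and its method are made up for this example. Since only buz is stored in the model, all enumerated sequences are equally valid outputs, and the serializer has to pick one.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only; this is not how Xtext's serializer is implemented.
class AmbiguitySketch {
    // Rule: "foo"? ("bar" | "baz") buz="buz";
    // Returns every token sequence that parses to a model with the given buz value.
    static List<String> candidates(String buz) {
        List<String> result = new ArrayList<>();
        for (String foo : new String[] {"", "foo "}) {        // ambiguity 1: optional keyword
            for (String alt : new String[] {"bar", "baz"}) {  // ambiguity 2: unassigned alternative
                result.add(foo + alt + " " + buz);            // buz comes from the model
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two independent binary choices give four candidate sequences.
        candidates("buz").forEach(System.out::println);
    }
}
```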

In your MyDslSyntacticSequencer, you'll find many emit_*() methods. Each method handles one syntactic ambiguity and the JavaDoc comment in front of the method contains the grammar snippet that is actually ambiguous for the serializer.

To work around the too-big init() method, you can modify your grammar to be less ambiguous to the serializer, for example by introducing more assignments.
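
As a rough illustration of why assignments help: if the rule were rewritten along the lines of foo?="foo"? kind=("bar" | "baz") buz="buz"; (an assumed rewrite, not taken from the attached grammar), every choice would be recorded in the model and there would be exactly one way to write it out. The sketch below uses a hypothetical `Model` class in place of the EMF object and is not Xtext code.

```java
// Toy illustration only; Model is a hypothetical stand-in for the EMF object.
class AssignedSketch {
    // Mirrors an assumed rewritten rule: foo?="foo"? kind=("bar" | "baz") buz="buz";
    static final class Model {
        final boolean foo;   // true if the optional keyword "foo" was present
        final String kind;   // "bar" or "baz"
        final String buz;

        Model(boolean foo, String kind, String buz) {
            this.foo = foo;
            this.kind = kind;
            this.buz = buz;
        }
    }

    // Every token is determined by the model, so there is nothing left to guess.
    static String serialize(Model m) {
        return (m.foo ? "foo " : "") + m.kind + " " + m.buz;
    }

    public static void main(String[] args) {
        System.out.println(serialize(new Model(true, "baz", "buz")));
        System.out.println(serialize(new Model(false, "bar", "buz")));
    }
}
```

The trade-off is that each new assignment adds a feature to the model, which is not always acceptable when the model is fixed in advance.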

Comment 6 Moritz Eysholdt

2015-01-29 04:34:27 EST

Don't confuse this bug with bug 457579. That bug appeared for a different reason.

Comment 7 Puneet Patwari

2015-01-29 04:48:43 EST

Hi Moritz

Thanks for your comments. Firstly, I would also like to point out that I have quite a few unordered groups in my grammar.

Secondly, yes, I see that bug 349992 is about an ANTLR generation problem.

Thirdly, I am trying to understand the example and implement it.

Fourthly, I have sent you the grammar.

Your co-operation is highly appreciated.

Regards
Puneet

Comment 8 Christian Dietrich

2015-01-29 04:53:25 EST

Ahh, I missed bug 457579.

Comment 9 Puneet Patwari

2015-01-29 05:21:01 EST

Hi Moritz

I understood the example and the concern in your comment. I also tried modifying my grammar to reduce the number of emit_*() methods. In fact, I could avoid almost all of them by introducing concrete assignments, but I cannot do that, because I am generating the Xtext grammar from an EMF Ecore model and not vice versa. Therefore I cannot create utility EClasses in my Ecore model just to compensate for the ambiguities that get created in the serializer. I hope you understand what I mean.

Regards
Puneet

Comment 10 Puneet Patwari

2015-01-30 05:52:20 EST

Hi Moritz,

I would really like to thank you, because your suggestion to clean up the grammar worked out very well. I cleaned up the grammar compared to what I sent you yesterday. The init() method has shrunk from 250 lines to only 10! In the process, I also reduced the number of emit_*() methods. I think simple tips like these should be included in the documentation itself, so that users benefit.

Anyway, the problem of the init() method exceeding the 64 KB size limit still exists, and I hope you find a way to fix it.

Thanks & Regards
Puneet

Comment 11 Moritz Eysholdt

2015-02-03 14:26:58 EST

The fix pushed to 
https://git.eclipse.org/r/#/c/41016/
reduces the size of Puneet's generated SyntacticSequencer from 2632 to 397 LOC. Java compiles the init() method just fine.
The patch needs more cleanup and testing, though.

Comment 12 Moritz Eysholdt

2015-02-04 09:02:37 EST

I'm interrupting my work on this bug in favor of other, faster-to-fix, bugs.

The current state of work can be found here:
https://git.eclipse.org/r/#/c/41016/