Bug 472328 - [parser] problem combining syntactic predicates and hidden()
Status: NEW
Alias: None
Product: TMF
Classification: Modeling
Component: Xtext
Version: 2.8.3
Hardware: PC Mac OS X
Importance: P3 normal
Target Milestone: ---
Assignee: Project Inbox CLA
 
Reported: 2015-07-10 01:40 EDT by Knut Wannheden CLA
Modified: 2015-07-10 03:16 EDT

Description Knut Wannheden CLA 2015-07-10 01:40:27 EDT
Given the following grammar:

grammar org.xtext.example.mydsl.MyDsl with org.eclipse.xtext.common.Terminals
generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"

Model:
	( 'FOO' foos+=Foo ';' )*
;

Foo hidden():
	( =>(';' Item) | Item )+
;

Item hidden():
	Elem ( '(' (INT|ID) ')' )?
;

Elem hidden():
	INT | ID | '/' | ',' | ':' | '+'
;

As you can see, the tricky part is that every Foo must be terminated with a semicolon, but it can itself also contain semicolons (although never a trailing one). A Foo also must not contain any whitespace, which is why hidden() is used. Given this grammar, I would expect the parser to be able to parse input like:

FOO foo;
FOO :foo(42)/baz ;
FOO foo(bar);baz
;

For the first line there is a parse error "no viable alternative at input '\n'" reported against the semicolon character.

For the second line there is a parse error "no viable alternative at input ' '" reported against the blank preceding the semicolon.

For the third line (basically the same case as the second line) there is a parse error "no viable alternative at input '\n'" reported against the line (no squiggly line).

It would seem that the syntactic predicate is unaware of the hidden() and that the parser always tries to match a semicolon inside the predicated alternative of the Foo rule.
Comment 1 Sven Efftinge CLA 2015-07-10 02:21:14 EDT
Hidden and syntactic predicates don't work well together.
At first glance it looks like a duplicate of bug #470632.
Comment 2 Sebastian Zarnekow CLA 2015-07-10 03:11:06 EDT
 (In reply to Sven Efftinge from comment #1)
> Hidden and syntactic predicates don't work well together.

That's true.
Hidden channel changes are implemented by means of ANTLR actions, which are not executed while the parser is predicting / doing lookahead.
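
To illustrate the effect described above, here is a toy model in Python. It is not Xtext or ANTLR code, and every name in it (TokenStream, la, predict_foo_loop) is hypothetical. It assumes one plausible reading of the problem: the set of hidden tokens only changes when a rule's entry or exit action runs, so lookahead that crosses the end of Foo still sees whitespace as a visible token.

```python
# Toy model only: not Xtext or ANTLR code. All names are hypothetical.

class TokenStream:
    def __init__(self, tokens, hidden):
        self.tokens = tokens        # list of token type names
        self.pos = 0                # index of the next token to consume
        self.hidden = set(hidden)   # token types that la() skips

    def la(self, k):
        """Return the k-th visible token type ahead, without consuming."""
        seen = 0
        for tok in self.tokens[self.pos:]:
            if tok in self.hidden:
                continue
            seen += 1
            if seen == k:
                return tok
        return "EOF"

# Token types that can start an Item in the grammar from the description.
ITEM_START = {"INT", "ID", "/", ",", ":", "+"}

def predict_foo_loop(stream):
    """Decide what to do at a ';' in Foo's loop using two tokens of lookahead."""
    if stream.la(1) != ";":
        return "other"
    following = stream.la(2)
    if following in ITEM_START:
        return "continue"            # ';' Item still belongs to this Foo
    if following in {"FOO", "EOF"}:
        return "exit"                # ';' is the terminator in Model
    return "no viable alternative"

# Token types for "FOO foo;\n", with the parser positioned at the ';'.
tokens = ["FOO", "WS", "ID", ";", "WS"]

# Inside Foo, hidden() is in effect: nothing is hidden, so the lookahead
# past the end of Foo sees the trailing WS.
inside_foo = TokenStream(tokens, hidden=[])
inside_foo.pos = 3
result_inside = predict_foo_loop(inside_foo)

# If the exit action had already restored the caller's hidden tokens,
# the same lookahead would skip the WS.
after_foo = TokenStream(tokens, hidden=["WS"])
after_foo.pos = 3
result_after = predict_foo_loop(after_foo)

print(result_inside)
print(result_after)
```

In this sketch the first prediction fails on the trailing whitespace, which resembles the error reported for the first input line, while the second prediction leaves the loop as intended. Whether the generated parser fails for exactly this reason is an assumption.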
Comment 3 Knut Wannheden CLA 2015-07-10 03:16:51 EDT
Yes, it is very similar to that bug, although I oversimplified my use case a bit too much. I cannot simply remove the hidden() in my case because the Foo rule is called from a rule with alternatives, which could cause the syntactic predicate in Foo to match in cases where it shouldn't.

In this simplified example I found a workaround by also applying hidden() to the calling rule:

Model hidden():
	( 'FOO' WS foos+=Foo WS? ';' WS? )*
;

But in more complex scenarios (rule called from many places and the ';' keyword further up in the hierarchy) this workaround might not work and would grow very ugly and unwieldy.

I would appreciate any other ideas on how to solve this.

Here is a more complicated example grammar (with hidden() removed):

Model:
	( ( foos+=Foo | bars+=Bar ) ';' )*
;
Foo:
	'FOO' ( =>(';' Item) | Item )+
;
Item:
	INT | ID | '/' | ',' | ':' | '+'
;
Bar:
	INT 'BAR'? ID
;

Given the input:

FOO foo;
42 bar;
FOO foo2;
42 BAR bar2;

The resulting Model will now have two Foos, "FOO foo; 42 bar" and "FOO foo2", and one Bar, "42 BAR bar2". This is because the syntactic predicate matches the Bar "42 bar", which isn't intended.