472328 – [parser] problem combining syntactic predicates and hidden()

Status:	NEW

Alias:	None

Product:	TMF
Classification:	Modeling
Component:	Xtext
Version:	2.8.3
Hardware:	PC Mac OS X

Importance:	P3 normal
Target Milestone:	---
Assignee:	Project Inbox

Reported:	2015-07-10 01:40 EDT by Knut Wannheden
Modified:	2015-07-10 03:16 EDT

Description Knut Wannheden

2015-07-10 01:40:27 EDT

Given the following grammar:

grammar org.xtext.example.mydsl.MyDsl with org.eclipse.xtext.common.Terminals
generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"

Model:
	( 'FOO' foos+=Foo ';' )*
;

Foo hidden():
	( =>(';' Item) | Item )+
;

Item hidden():
	Elem ( '(' (INT|ID) ')' )?
;

Elem hidden():
	INT | ID | '/' | ',' | ':' | '+'
;

As you can see, the tricky part is that every Foo must be terminated by a semicolon but may itself contain semicolons (though of course never a trailing one). A Foo also must not contain any whitespace, which is why hidden() is used. Given this grammar, I would expect the parser to accept input like:

FOO foo;
FOO :foo(42)/baz ;
FOO foo(bar);baz
;

For the first line there is a parse error "no viable alternative at input '\n'" reported against the semicolon character.

For the second line there is a parse error "no viable alternative at input ' '" reported against the blank preceding the semicolon.

For the third line (basically the same case as the second line), there is a parse error "no viable alternative at input '\n'" reported against the line (no squiggly line).

It would seem that the syntactic predicate is unaware of hidden() and that the parser always tries to match a semicolon inside the predicated alternative of the Foo rule.
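For reference, the language the grammar above is meant to accept can be approximated with a character-level regular expression. This is only a rough sketch in Python and not part of the original report. It assumes the ID and INT shapes from org.eclipse.xtext.common.Terminals, ignores the keyword/ID distinction made by the real lexer, and requires whitespace after 'FOO'. The name accepts is hypothetical.

```python
import re

# Assumed token shapes, modelled on org.eclipse.xtext.common.Terminals.
ID = r"\^?[A-Za-z_][A-Za-z_0-9]*"
INT = r"[0-9]+"

# Elem: INT | ID | '/' | ',' | ':' | '+'
ELEM = rf"(?:{INT}|{ID}|[/,:+])"
# Item: Elem ( '(' (INT|ID) ')' )?
ITEM = rf"(?:{ELEM}(?:\((?:{INT}|{ID})\))?)"
# Foo: ( ';' Item | Item )+ with no whitespace anywhere inside
FOO = rf"(?:;?{ITEM})+"
# Model: ( 'FOO' Foo ';' )* with whitespace allowed around Foo and ';'
MODEL = rf"\s*(?:FOO\s+{FOO}\s*;\s*)*"


def accepts(text: str) -> bool:
    """Return True if text is in the (approximated) intended language."""
    return re.fullmatch(MODEL, text) is not None
```

Under these assumptions, accepts returns True for the three example lines taken together and False for input with whitespace inside a Foo (such as "FOO foo bar;") or with a trailing semicolon inside a Foo (such as "FOO foo;;").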

Comment 1 Sven Efftinge

2015-07-10 02:21:14 EDT

Hidden and syntactic predicates don't work well together.
At a first glance it looks like a duplicate of bug #470632.

Comment 2 Sebastian Zarnekow

2015-07-10 03:11:06 EDT

(In reply to Sven Efftinge from comment #1)
> Hidden and syntactic predicates don't work well together.

That's true.
Hidden channel changes are implemented by means of ANTLR actions, which are not executed while the parser is predicting / doing lookahead.
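The effect can be illustrated with a toy model. The Python sketch below is not Xtext's or ANTLR's actual implementation. TokenStream, predicate_matches and demo are hypothetical names, and the token stream is reduced to a list with a set of hidden token types. It only shows how a lookahead that skips the "switch hidden tokens" action can disagree with what the rule sees when it is really parsed.

```python
from dataclasses import dataclass

WS = "WS"


@dataclass
class TokenStream:
    tokens: list                          # list of (type, text) tuples
    pos: int = 0
    hidden: frozenset = frozenset({WS})   # token types skipped by lookahead

    def la(self, k):
        """Type of the k-th visible token ahead (1-based), or 'EOF'."""
        seen = 0
        for tok_type, _text in self.tokens[self.pos:]:
            if tok_type in self.hidden:
                continue                  # hidden tokens are invisible
            seen += 1
            if seen == k:
                return tok_type
        return "EOF"


def predicate_matches(stream, run_actions):
    """Toy syntactic predicate =>(';' Item) inside a rule declared hidden().

    run_actions=False mimics prediction: the action that would switch to
    "nothing hidden" is skipped, so the caller's hidden set stays active.
    """
    saved = stream.hidden
    if run_actions:
        stream.hidden = frozenset()       # the effect of hidden()
    try:
        return stream.la(1) == ";" and stream.la(2) == "ID"
    finally:
        stream.hidden = saved             # restore the caller's setting


def demo():
    # Remaining input "; bar": note the blank after the semicolon.
    stream = TokenStream([(";", ";"), (WS, " "), ("ID", "bar")])
    return (predicate_matches(stream, run_actions=False),
            predicate_matches(stream, run_actions=True))
```

In this toy, demo() returns (True, False): during prediction the blank is still hidden, so ';' appears to be directly followed by an ID, whereas with the hidden() action applied the blank is visible and the predicate fails.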

Comment 3 Knut Wannheden

2015-07-10 03:16:51 EDT

Yes, it is very similar to that bug, although I oversimplified my use case a bit too much. I cannot simply remove the hidden() in my case, because the Foo rule is called from a rule with alternatives, which could cause the syntactic predicate in Foo to match in cases where it shouldn't.

In this simplified example I found a workaround by also applying hidden() to the calling rule:

Model hidden():
	( 'FOO' WS foos+=Foo WS? ';' WS? )*
;

But in more complex scenarios (rule called from many places and the ';' keyword further up in the hierarchy) this workaround might not work and would grow very ugly and unwieldy.

I would appreciate any other ideas on how to solve this.

Here is a more complicated example grammar (with hidden() removed):

Model:
	( ( foos+=Foo | bars+=Bar ) ';' )*
;
Foo:
	'FOO' ( =>(';' Item) | Item )+
;
Item:
	INT | ID | '/' | ',' | ':' | '+'
;
Bar:
	INT 'BAR'? ID
;

Given the input:

FOO foo;
42 bar;
FOO foo2;
42 BAR bar2;

The resulting Model will now have two Foos, "FOO foo; 42 bar" and "FOO foo2", and one Bar, "42 BAR bar2". This is because the syntactic predicate matches at the start of the Bar "42 bar" (an INT is a valid Item), which isn't intended.