Bug 482989 - [Highlighting] Syntactic Highlighting need pressing backspace
Status: NEW
Alias: None
Product: TMF
Classification: Modeling
Component: Xtext
Version: unspecified
Hardware: PC Linux
Importance: P3 normal
Target Milestone: ---
Assignee: Jan Koehnlein CLA
Duplicates: 495261
Reported: 2015-11-25 05:19 EST by Jia Poh Kow CLA
Modified: 2016-06-07 03:22 EDT

Description Jia Poh Kow CLA 2015-11-25 05:19:51 EST
Below is the Xtext grammar for the first version.
Expected outcome: the directive is colored from '#' through to its end.

Steps to reproduce:
(1) Type #pragma
Result: #pragma is colored.

(2) Type #else
Result: #else is not colored like #pragma.

(3) Press backspace
Result: #else is now colored like #pragma.

Below is the complete grammar needed to reproduce:

Model:
    elements+=ARMInstr*
;

ARMInstr:
    PREPROCESSOR
;

PREPROCESSOR:
    code?= ( '#if' | '#else' | '#elif' | '#error' | '#pragma'
    | '#define' | '#undef' | '#include' | '#ifdef' | '#ifndef'
    | '#endif' | '#line' | '#loop')
;

Second version of the grammar
Expected outcome: the directive is colored from '#' through to its end.
Result: '#' is not colored; only the keyword is colored.

Complete Xtext grammar:

Model:
    elements+=ARMInstr*
;

ARMInstr:
    PREPROCESSOR
;

PREPROCESSOR:
    hash?=('#') code?= ( 'if' | 'else' | 'elif' | 'error' | 'pragma'
    | 'define' | 'undef' | 'include' | 'ifdef' | 'ifndef'
    | 'endif' | 'line' | 'loop')
;
Comment 1 Sven Efftinge CLA 2015-11-26 02:37:48 EST
Seems like the token source is not updated correctly.
With the document in the state

#els

it is lexed as two tokens (error(0,1), id(1,3)).
After entering the 'e' it is still two tokens (error(0,1), id(1,4)).
This is where it goes wrong: DocumentTokenSource.getRepairEntryData(DocumentEvent)
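
The token states above can be reproduced with a small toy model. This is only a sketch and not the code of DocumentTokenSource: the class TokenRepairSketch, its lexer, and the method repairFromTouchedToken are hypothetical, and the repair strategy (keep every token before the edit, re-lex starting at the first token that reaches the edit offset) is an assumption made for illustration. Tokens are printed as kind(offset,length).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of repairing a token list after an edit; NOT Xtext's DocumentTokenSource. */
final class TokenRepairSketch {

    // Only the two keywords that share the '#el' prefix are modelled.
    static final String[] KEYWORDS = {"#else", "#elif"};

    /** A token described by kind, start offset and length, printed as kind(offset,length). */
    static final class Token {
        final String kind;
        final int offset;
        final int length;

        Token(String kind, int offset, int length) {
            this.kind = kind;
            this.offset = offset;
            this.length = length;
        }

        int end() {
            return offset + length;
        }

        @Override
        public String toString() {
            return kind + "(" + offset + "," + length + ")";
        }
    }

    /** Lexes text from offset 'from' to the end. Whitespace is not modelled. */
    static List<Token> lex(String text, int from) {
        List<Token> out = new ArrayList<>();
        int pos = from;
        while (pos < text.length()) {
            String kind = "error";
            int len = 0;
            for (String kw : KEYWORDS) {
                if (text.startsWith(kw, pos) && kw.length() > len) {
                    kind = "keyword";
                    len = kw.length();
                }
            }
            int idEnd = pos;
            while (idEnd < text.length() && Character.isLetter(text.charAt(idEnd))) {
                idEnd++;
            }
            if (idEnd - pos > len) {
                kind = "id";
                len = idEnd - pos;
            }
            if (len == 0) {
                len = 1; // an unmatched character becomes a one-character error token
            }
            out.add(new Token(kind, pos, len));
            pos += len;
        }
        return out;
    }

    /** Assumed repair strategy: keep tokens before the edit, re-lex from the touched token. */
    static List<Token> repairFromTouchedToken(List<Token> old, String newText, int editOffset) {
        List<Token> result = new ArrayList<>();
        int restart = editOffset;
        for (Token t : old) {
            if (t.end() >= editOffset) {
                restart = t.offset; // first token that reaches the edit position
                break;
            }
            result.add(t);
        }
        result.addAll(lex(newText, restart));
        return result;
    }

    public static void main(String[] args) {
        List<Token> before = lex("#els", 0);
        System.out.println(before);                                     // [error(0,1), id(1,3)]
        System.out.println(repairFromTouchedToken(before, "#else", 4)); // [error(0,1), id(1,4)]
        System.out.println(lex("#else", 0));                            // [keyword(0,5)]
    }
}
```

In this model the repaired list keeps the stale error token for '#', while lexing the whole text again yields a single keyword token.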
Comment 2 Jan Koehnlein CLA 2015-12-16 12:05:30 EST
Case 1) 
The reason for the error is that you have two keywords (#else and #elif) with a common prefix (#el) that is not consumed by a terminal rule. This means that lexing '#el' yields an error token for the '#' prefix, followed by an ID token

  error('#') id('el')

and after the next keystrokes

  error('#') id('else')

because the reconciliation considers the id token as the start token of the changed region. We usually don't encounter this error as the ID rule consumes illegal alphanumerical keywords. Easiest workaround for you: Add a terminal rule 

  terminal HASHID: '#' ID;

to your grammar, so that the lexer identifies '#el' as a HASHID token. The parser never expects this token, so you still get the expected syntax error, but the lexer no longer separates the '#' from the rest.
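
The effect of the extra terminal rule can be illustrated with a toy lexer. This is a minimal sketch, not the ANTLR lexer that Xtext generates: the class HashLexerSketch is hypothetical, it assumes plain longest-match lexing with keywords winning ties, and its ID pattern ([A-Za-z_][A-Za-z_0-9]*) only approximates the usual ID terminal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy longest-match lexer; NOT the ANTLR lexer that Xtext generates. */
final class HashLexerSketch {

    // Keywords of the first grammar version in the report.
    static final List<String> KEYWORDS = Arrays.asList(
            "#if", "#else", "#elif", "#error", "#pragma", "#define", "#undef",
            "#include", "#ifdef", "#ifndef", "#endif", "#line", "#loop");

    static boolean isIdStart(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
    }

    static boolean isIdPart(char c) {
        return isIdStart(c) || (c >= '0' && c <= '9');
    }

    /** Length of an ID-like run starting at pos, or 0 if there is none. */
    static int idLength(String s, int pos) {
        if (pos >= s.length() || !isIdStart(s.charAt(pos))) {
            return 0;
        }
        int end = pos + 1;
        while (end < s.length() && isIdPart(s.charAt(end))) {
            end++;
        }
        return end - pos;
    }

    /** Lexes the input; withHashId switches the assumed HASHID: '#' ID rule on. */
    static List<String> lex(String input, boolean withHashId) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (Character.isWhitespace(input.charAt(pos))) {
                pos++; // whitespace is skipped
                continue;
            }
            String kind = "error";
            int len = 0;
            for (String kw : KEYWORDS) {
                if (input.startsWith(kw, pos) && kw.length() > len) {
                    kind = "keyword";
                    len = kw.length();
                }
            }
            if (withHashId && input.charAt(pos) == '#') {
                int tail = idLength(input, pos + 1);
                if (tail > 0 && 1 + tail > len) { // strictly longer, so keywords win ties
                    kind = "HASHID";
                    len = 1 + tail;
                }
            }
            int plainId = idLength(input, pos);
            if (plainId > len) {
                kind = "ID";
                len = plainId;
            }
            if (len == 0) {
                len = 1; // an unmatched character becomes a one-character error token
            }
            tokens.add(kind + "('" + input.substring(pos, pos + len) + "')");
            pos += len;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(lex("#el", false));   // [error('#'), ID('el')]
        System.out.println(lex("#el", true));    // [HASHID('#el')]
        System.out.println(lex("#else", true));  // [keyword('#else')]
    }
}
```

With the rule enabled, the incomplete directive stays a single token while it is being typed, so there is no separate ID token for a repair to start from.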


Case 2)
Works as expected as non-alphanumerical keywords are never marked as keywords by default, so '#' is treated like '+', '(' or '"'.

Fix this for your case by adding a binding for AbstractAntlrTokenToAttributeIdMapper in the UI module of your language and overriding DefaultAntlrTokenToAttributeIdMapper.calculateId(String, int) for the token '#'.
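
The shape of such an override can be sketched without the Xtext classes. The two classes below are hypothetical stand-ins. DefaultMapperSketch only imitates the default behaviour described above (non-alphanumeric keywords are not marked as keywords). The highlighting id strings are placeholders. It is an assumption that keyword token names arrive quoted, for example "'#'". In a real project the subclass would extend DefaultAntlrTokenToAttributeIdMapper and be bound in the UI module.

```java
/** Stand-in for the default mapping described in Case 2; NOT the Xtext class itself. */
class DefaultMapperSketch {

    // Placeholder highlighting ids chosen for this sketch.
    static final String KEYWORD_ID = "keyword";
    static final String PUNCTUATION_ID = "punctuation";
    static final String DEFAULT_ID = "default";

    /** tokenName is assumed to be the quoted keyword text, e.g. "'#'" or "'pragma'". */
    String calculateId(String tokenName, int tokenType) {
        boolean quoted = tokenName.length() >= 2
                && tokenName.startsWith("'") && tokenName.endsWith("'");
        if (!quoted) {
            return DEFAULT_ID; // not a keyword token
        }
        String text = tokenName.substring(1, tokenName.length() - 1);
        for (char c : text.toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                return KEYWORD_ID; // keyword containing letters or digits
            }
        }
        return PUNCTUATION_ID; // purely non-alphanumeric keyword such as '#', '+' or '('
    }
}

/** Override sketch: highlight the '#' keyword token like an alphanumeric keyword. */
class HashKeywordMapperSketch extends DefaultMapperSketch {

    @Override
    String calculateId(String tokenName, int tokenType) {
        if ("'#'".equals(tokenName)) {
            return KEYWORD_ID;
        }
        return super.calculateId(tokenName, tokenType);
    }

    public static void main(String[] args) {
        System.out.println(new DefaultMapperSketch().calculateId("'#'", 0));     // punctuation
        System.out.println(new HashKeywordMapperSketch().calculateId("'#'", 0)); // keyword
        System.out.println(new HashKeywordMapperSketch().calculateId("'('", 0)); // punctuation
    }
}
```

Only the '#' token is special-cased; every other token falls through to the default mapping.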
Comment 3 Jan Koehnlein CLA 2015-12-16 12:10:06 EST
Would be great if we could detect the situation from Case 1). We could even add an appropriate terminal rule automatically when generating the Antlr grammar. My impression is that this may be error prone and could yield unexpected results, so I am leaving this open for discussion for a fix in later versions.
Comment 4 Jan Koehnlein CLA 2016-06-07 03:22:20 EDT
*** Bug 495261 has been marked as a duplicate of this bug. ***