Below is the Xtext code for the first version.

Expected outcome: the syntax is colored starting from the # until the end.

Steps to reproduce:
(1) Type #pragma. Result: #pragma is colored.
(2) Type #else. Result: #else is not colored like #pragma.
(3) Press backspace. Result: #else is colored like #pragma.

Below is the complete code needed to reproduce:

    Model:
        elements+=ARMInstr*
    ;

    ARMInstr:
        PREPROCESSOR
    ;

    PREPROCESSOR:
        code?=( '#if' | '#else' | '#elif' | '#error' | '#pragma' | '#define' | '#undef'
              | '#include' | '#ifdef' | '#ifndef' | '#endif' | '#line' | '#loop')
    ;

Second version of the code.

Expected outcome: the syntax is colored starting from the # until the end.

Result: the # is not colored; the keyword is colored.

Complete Xtext code:

    Model:
        elements+=ARMInstr*
    ;

    ARMInstr:
        PREPROCESSOR
    ;

    PREPROCESSOR:
        hash?=('#')
        code?=( 'if' | 'else' | 'elif' | 'error' | 'pragma' | 'define' | 'undef'
              | 'include' | 'ifdef' | 'ifndef' | 'endif' | 'line' | 'loop')
    ;
It seems the token source is not updated correctly. In the state #els there are two tokens (error(0,1), id(1,3)). After entering the 'e' there are still two tokens (error(0,1), id(1,4)).

This is where it goes wrong: DocumentTokenSource.getRepairEntryData(DocumentEvent)
Case 1) The reason for the error is that you have two keywords (#else and #elif) with a common prefix (#el) that is not consumed by a terminal rule. This means that lexing '#el' yields an error token for the '#' followed by an ID token, i.e. error('#') id('el'), and after the next keystrokes still error('#') id('else'), because the reconciliation considers the id token to be the start token of the changed region. We usually don't encounter this error, as the ID rule consumes illegal alphanumerical keywords.

Easiest workaround for you: add a terminal rule

    terminal HASHID: '#' ID;

to your grammar, so that the lexer identifies '#el' as a HASHID token. The parser never expects this token (yielding the expected syntax error), but the lexer no longer separates the '#' from the rest.

Case 2) Works as expected: non-alphanumerical keywords are never marked as keywords by default, so '#' is treated like '+', '(' or '"'. Fix this for your case by adding a binding for AbstractAntlrTokenToAttributeIdMapper in the UI module of your language and overriding DefaultAntlrTokenToAttributeIdMapper.calculateId(String, int) for the token '#'.
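Applied to the first version of the grammar, the Case 1 workaround might look roughly like the sketch below. The grammar name and the generate line are placeholders that do not appear in the report, and the sketch assumes the grammar inherits ID from org.eclipse.xtext.common.Terminals; only the HASHID rule is the actual change.

```xtext
// Placeholder grammar name and nsURI (assumptions, not from the report).
grammar org.example.arm.Arm with org.eclipse.xtext.common.Terminals

generate arm "http://www.example.org/arm/Arm"

Model:
    elements+=ARMInstr*
;

ARMInstr:
    PREPROCESSOR
;

PREPROCESSOR:
    code?=( '#if' | '#else' | '#elif' | '#error' | '#pragma' | '#define' | '#undef'
          | '#include' | '#ifdef' | '#ifndef' | '#endif' | '#line' | '#loop')
;

// Workaround: no parser rule refers to HASHID, so an incomplete directive
// such as '#el' is still a syntax error, but the lexer now emits it as a
// single token instead of an error token for '#' followed by an ID token.
terminal HASHID:
    '#' ID
;
```

Complete directives such as '#else' should still be lexed as keywords, in the same way that alphanumerical keywords take precedence over ID.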
It would be great if we could detect the situation from Case 1). We could even add an appropriate terminal rule automatically when generating the Antlr grammar. My impression is that this may be error-prone and could yield unexpected results, so I am leaving this open for discussion as a possible fix in a later version.
*** Bug 495261 has been marked as a duplicate of this bug. ***