[pdt-dev] Manipulating the PHPTokenizers grammar


After receiving the correct JFlex.jar, I managed to modify the PHPTokenizer.jflex
grammar and compile it into the PHPTokenizer Java class.

Tokenizing the template language's structures works by and large. The plugin implementing
it can be found here:


However, there are some minor bugs related to the way the JFlex grammar has been extended, and
hopefully someone on the list can help me out.

The tokens of the templating language reside inside the XML content, like this:

  {{ template code }}

My problem is that the grammar's rule for generic XML content always overrides my rule for opening template tags:

// initial rule to go to the template content state
// will always be overridden by the rule below, as that rule will always match
// more characters

<YYINITIAL>  "{{"{Whitespace}* {
  // switch to template content state
}

// initial rule to go to the XML content state

<YYINITIAL>  [^<&%]*|[&%]{S}+{Name}[^&%<]*|[&%]{Name}([^;&%<]*|{S}+;*) {
  // switch to xml content state
}

If the input to the tokenizer is only "{{", then the first rule matches. But as soon as any other characters follow (e.g. "{{ foo"), the second rule wins because its match is longer, as described in the JFlex documentation.
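
To make the longest-match behaviour concrete, here is a minimal Java sketch that imitates it with java.util.regex. It is not the generated scanner. It assumes {Whitespace} can be approximated by \s, it uses only the first alternative of the XML content rule ([^<&%]*), and the class and method names (LongestMatchDemo, prefixLength, winner) are made up for illustration. Ties are resolved in favour of the rule listed first, which is how JFlex handles matches of equal length.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LongestMatchDemo {
    // Assumption: {Whitespace} is approximated by \s here.
    static final Pattern TEMPLATE_OPEN = Pattern.compile("\\{\\{\\s*");
    // Assumption: only the first alternative of the XML content rule is modelled.
    static final Pattern XML_CONTENT = Pattern.compile("[^<&%]*");

    // Length of the match anchored at the start of the input, or -1 if none.
    static int prefixLength(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.lookingAt() ? m.end() : -1;
    }

    // Longest match wins; on a tie the rule that appears first wins.
    static String winner(String input) {
        int template = prefixLength(TEMPLATE_OPEN, input);
        int xml = prefixLength(XML_CONTENT, input);
        return template >= xml ? "TEMPLATE_OPEN" : "XML_CONTENT";
    }

    public static void main(String[] args) {
        System.out.println(winner("{{"));     // both rules match 2 characters
        System.out.println(winner("{{ foo")); // template rule: 3, XML rule: 6
    }
}
```

With "{{" both patterns match two characters, so the first rule is chosen. With "{{ foo" the template rule stops after "{{ " while the XML content pattern consumes the whole input, so it takes over.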

It seems the authors of the original Smarty plugin had the same problem, which is probably why they chose to
detect the opening template tags via a custom function (findTwigDelimiter).

Here's the code for it:


Does anyone have a hint on how I can extend the PHPTokenizer so that the opening template tags always match, instead of being matched
by the generic XML content rule?

Any hints would be greatly appreciated!