While trying to improve the performance of my LPG parser I think I found
a serious performance bug in LPG.
I can hardly believe that this hasn't been mentioned before, so if it
is already known or is intended behaviour, I apologize.
The problem arises if the same generated parser is reused to parse
several files. When a file is parsed with
lpg.runtime.DeterministicParser it first calls the method
lpg.runtime.Stacks.reallocateStacks(). Here is the code from this method:

    int old_stack_length = (stateStack == null ? 0 : stateStack.length),
        stack_length = old_stack_length + STACK_INCREMENT;
    if (stateStack == null)
    {
        stateStack = new int[stack_length];
        locationStack = new int[stack_length];
        parseStack = new Object[stack_length];
    }
    else
    {
        System.arraycopy(stateStack, 0,
                         stateStack = new int[stack_length], 0,
                         old_stack_length);
        System.arraycopy(locationStack, 0,
                         locationStack = new int[stack_length], 0,
                         old_stack_length);
        System.arraycopy(parseStack, 0,
                         parseStack = new Object[stack_length], 0,
                         old_stack_length);
    }
    return;

As you can see, if stateStack has already been created, the parser
copies its contents into a new array of size
old_stack_length + STACK_INCREMENT. This is done because
reallocateStacks() is also used to resize the stacks when needed.
But since the method is called at the start of every parse, a reused
parser enlarges its stacks by STACK_INCREMENT entries, and copies all
three of them, for every file it parses, and the stacks are never
shrunk again. The result is a memory leak and serious performance
problems when a parser instance is reused to compile more than one
file. After I changed my code to not reuse the parser, the compilation
time for about 200 files dropped from 10 seconds to under one second,
so it is now more than 10 times faster.
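
To make the growth easier to see, here is a minimal sketch of what I
think is happening. It is not the real LPG code: StackGrowthSketch,
simulateParse() and stackLength() are names I made up, the value of
STACK_INCREMENT is an assumption, and I use Arrays.copyOf instead of
the System.arraycopy idiom. Only the allocation logic follows the
method quoted above.

```java
import java.util.Arrays;

// Simplified stand-in for lpg.runtime.Stacks; NOT the real LPG class.
class StackGrowthSketch {
    // Assumed value, chosen only for illustration.
    static final int STACK_INCREMENT = 1024;

    private int[] stateStack;
    private int[] locationStack;
    private Object[] parseStack;

    // Same allocation logic as the quoted method: allocate on first use,
    // otherwise grow every stack by STACK_INCREMENT and copy the old contents.
    void reallocateStacks() {
        int oldStackLength = (stateStack == null ? 0 : stateStack.length);
        int stackLength = oldStackLength + STACK_INCREMENT;
        if (stateStack == null) {
            stateStack = new int[stackLength];
            locationStack = new int[stackLength];
            parseStack = new Object[stackLength];
        } else {
            stateStack = Arrays.copyOf(stateStack, stackLength);
            locationStack = Arrays.copyOf(locationStack, stackLength);
            parseStack = Arrays.copyOf(parseStack, stackLength);
        }
    }

    // Hypothetical stand-in for the start of a parse, which (as described
    // above) calls reallocateStacks() unconditionally.
    void simulateParse() {
        reallocateStacks();
    }

    // Current capacity of each of the three stacks.
    int stackLength() {
        return stateStack == null ? 0 : stateStack.length;
    }

    public static void main(String[] args) {
        // One instance reused for 200 files: the stacks grow on every parse.
        StackGrowthSketch reused = new StackGrowthSketch();
        for (int file = 0; file < 200; file++) {
            reused.simulateParse();
        }
        System.out.println("reused instance: " + reused.stackLength()
                + " entries per stack");

        // A fresh instance per file never grows beyond one increment here.
        StackGrowthSketch fresh = new StackGrowthSketch();
        fresh.simulateParse();
        System.out.println("fresh instance: " + fresh.stackLength()
                + " entries per stack");
    }
}
```

With the assumed increment of 1024, the reused instance ends up with
204800 entries in each stack after 200 simulated parses, while a fresh
instance stays at 1024. The copying also gets more expensive with every
file, which would fit the slowdown I measured.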
The ParseController that gets generated by IMP also lazily instantiates
the internal DeterministicParser. So perhaps IMP suffers from the same
performance problems (I'm not sure whether IMP reuses the ParseController).
If someone can confirm that this is a bug I'll create a bug report for it.
Greetings,
Dieter