While trying to improve the performance of my LPG parser, I think I found a
serious performance bug in LPG.
I cannot believe that this hasn't been mentioned before, so if it is already
known or is intended behaviour, I apologize.
The problem arises if the same generated parser instance is reused to parse
several files. When a file is parsed with lpg.runtime.DeterministicParser,
the parser first calls the method lpg.runtime.Stacks.reallocateStacks().
Here is the code from this method:
    if (stateStack == null)
    {
        stateStack = new int[stack_length];
        locationStack = new int[stack_length];
        parseStack = new Object[stack_length];
    }
    else
    {
        System.arraycopy(stateStack, 0, stateStack = new
            int[stack_length], 0, old_stack_length);
        System.arraycopy(locationStack, 0, locationStack = new
            int[stack_length], 0, old_stack_length);
        System.arraycopy(parseStack, 0, parseStack = new
            Object[stack_length], 0, old_stack_length);
    }
    return;
As you can see, if stateStack has already been created, the parser copies
its contents into a new array of size
old_stack_length+STACK_INCREMENT. This is done because the
reallocateStacks() method is also used to grow the stacks when needed.
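
To make the effect concrete, here is a minimal Java sketch. It is not LPG's
code: GrowingStacks, its increment field and ensureStacks() are hypothetical
names made up for illustration, and it assumes that reallocateStacks() is
called once at the start of every parse, as described above.

```java
// Hypothetical model of the allocation pattern described above; NOT LPG's class.
final class GrowingStacks {
    private final int increment; // plays the role of STACK_INCREMENT (value is an assumption)
    private int[] stateStack;    // null until the first parse

    GrowingStacks(int increment) {
        this.increment = increment;
    }

    // Models the reported behaviour: allocate on the first call,
    // otherwise grow by one increment and copy the old contents.
    void reallocateStacks() {
        int oldLength = (stateStack == null) ? 0 : stateStack.length;
        int[] grown = new int[oldLength + increment];
        if (stateStack != null) {
            System.arraycopy(stateStack, 0, grown, 0, oldLength);
        }
        stateStack = grown;
    }

    // Assumed alternative for the start of a parse: allocate only if
    // no stack exists yet, and leave growing to the overflow case.
    void ensureStacks() {
        if (stateStack == null) {
            stateStack = new int[increment];
        }
    }

    int length() {
        return (stateStack == null) ? 0 : stateStack.length;
    }
}
```

In this model, calling reallocateStacks() once per file leaves the stack at
200 times the increment after 200 files, while ensureStacks() keeps it at a
single increment.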
In effect this is a memory leak: the stacks grow with every file that is
parsed, and each parse has to allocate and copy ever larger arrays. This
causes serious performance problems when a parser instance is reused for
compiling more than one file.
After I changed my code to not reuse the parser, the compilation time for
about 200 files dropped from 10 seconds to under one second, so it is
now more than 10 times faster.
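
The workaround can be sketched roughly as follows. This is only an
illustration under assumptions: PerFileParsing, parseAll, newParser and
parseOne are hypothetical names, and P stands for whatever generated parser
class you use; none of this is part of the LPG runtime.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Supplier;

// Hypothetical helper: creates a fresh parser for every file instead of reusing one.
final class PerFileParsing {
    static <P, R> List<R> parseAll(List<String> files,
                                   Supplier<P> newParser,
                                   BiFunction<P, String, R> parseOne) {
        List<R> results = new ArrayList<>(files.size());
        for (String file : files) {
            P parser = newParser.get();               // new instance, so its stacks start out empty
            results.add(parseOne.apply(parser, file));
        }
        return results;
    }
}
```

The cost is one parser allocation per file, which in my case was far cheaper
than the repeated stack copying.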
The ParseController generated by IMP also lazily instantiates its internal
DeterministicParser, so IMP may suffer from the same performance problem
(I'm not sure whether IMP reuses the ParseController).
If someone can confirm that this is a bug, I'll create a bug report for it.