Community
Participate
Working Groups
I'm using JMC to profile Eclipse while I'm working with a project that has a lot of Javadoc (jOOQ). I've noticed that a significant amount of time is spent in a relatively unimportant place, in the Scanner, checking for a task tag: ------------------------------------------------------------ Stack Trace Sample Count Percentage(%) org.eclipse.jdt.internal.compiler.parser.ScannerHelper.isJavaIdentifierPart(long, char) 171 8.769 org.eclipse.jdt.internal.compiler.parser.Scanner.checkTaskTag(int, int) 171 8.769 org.eclipse.jdt.internal.compiler.parser.Scanner.getNextToken0() 171 8.769 org.eclipse.jdt.internal.compiler.parser.Scanner.getNextToken() 171 8.769 org.eclipse.jdt.internal.compiler.parser.Parser.parse() 171 8.769 org.eclipse.jdt.internal.compiler.parser.Parser.parse(ICompilationUnit, CompilationResult, int, int) 171 8.769 org.eclipse.jdt.internal.compiler.parser.Parser.parse(ICompilationUnit, CompilationResult) 171 8.769 org.eclipse.jdt.internal.compiler.parser.Parser.dietParse(ICompilationUnit, CompilationResult) 170 8.718 org.eclipse.jdt.internal.compiler.parser.SourceTypeConverter.convert(ISourceType[], CompilationResult) 1 0.051 ------------------------------------------------------------ And indeed, if I go to the preferences and remove all the pre-configured task tags from Java > Compiler > Task Tags (i.e. FIXME, TODO, XXX), then subjectively, working with Eclipse is a bit smoother. This is just an initial hint of a potential performance flaw. If you need more information, measurements, please let me know and I'll happily provide them here.
Thanks for the hint. I had a quick glance at the code to see if there's some low-hanging fruit, but couldn't find much: checkTaskTag should be invoked exactly once for each comment after we found start and end positions of the comment. Inside that method we do the necessary looping over all chars in the identified region. The algorithm is a little bit complex, but all is implemented in terms of direct access to chars in a char[], i.e., I don't see any expensive data structures involved. Actually the algorithm consists of two parts: - identifying task tags - finding the "message", i.e., the text between the tag and the end of line for display in the tasks view. On the caller side, all this only happens if !skipComments. But skipComments is a rarely used special-purpose flag, assigned true only ever in AbstractCommentParser.parseHref(). Do all other callers really need comments scanned? It seems that task tags are indeed relevant in a lot of situations, including the reconciler, so it would take a closer look, to see if we can save anything here. As you included the downstream method isJavaIdentifierPart() there's a little potential for micro-optimization by pulling some of these invocations out of their enclosing loops. I see this at least for this invocation: isJavaIdentifierPart(this.complianceLevel, previous) where previous is "owned" by the outer loop. Would you be in the position to make comparative measurements if we provide a patched jdt.core plugin?
> As you included the downstream method isJavaIdentifierPart() there's a > little potential for micro-optimization by pulling some of these invocations > out of their enclosing loops. I see this at least for this invocation: > isJavaIdentifierPart(this.complianceLevel, previous) > where previous is "owned" by the outer loop. Yes, I had noticed this as well. > Do all other callers really need comments scanned? It seems that task tags > are indeed relevant in a lot of situations, including the reconciler, so it > would take a closer look, to see if we can save anything here. In my JMC measurements, only Scanner.getNextToken0() -> Scanner.getNextToken() appeared as relevant. But perhaps other features weren't active during my work anyway. > Would you be in the position to make comparative measurements if we provide a patched jdt.core plugin? Sure!
@Stephan: do you think detection of task tags could be processed asynchronously?
(In reply to Mickael Istria from comment #3) > @Stephan: do you think detection of task tags could be processed > asynchronously? I don't think we have enough information for such a drastic change. One difficulty arises from the fact that the compiler/parser/scanner is invoked in a large number of different scenarios. We can't possibly make big changes to all callers. Moreover, I have difficulties believing the numbers. If I understand correctly, all the measured time that is spent in checkTaskTag() is actually spent in ScannerHelper.isJavaIdentifierPart() (all of 8.769%). That latter method looks super fast, if you ask me. I have no idea, how this could cause a bottle neck (unless measuring the very first invocation which still needs to initialize the unicode tables - those need to be read from file).
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. -- The automated Eclipse Genie.