474686 – [compile] Excessive amount of time spent in Scanner.checkTaskTag()

Bug 474686 - [compile] Excessive amount of time spent in Scanner.checkTaskTag()

Summary: [compile] Excessive amount of time spent in Scanner.checkTaskTag()

Status:	NEW

Alias:	None

Product:	JDT
Classification:	Eclipse Project
Component:	Core (show other bugs)
Version:	4.5
Hardware:	PC Windows 8

Importance:	P3 normal (vote)
Target Milestone:	---
Assignee:	JDT-Core-Inbox
QA Contact:

URL:
Whiteboard:	stalebug
Keywords:

Depends on:
Blocks:

Reported:	2015-08-11 05:37 EDT by Lukas Eder
Modified:	2023-02-03 16:14 EST (History)
CC List:	5 users (show)

See Also:

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Lukas Eder

2015-08-11 05:37:58 EDT

I'm using JMC to profile Eclipse while I'm working with a project that has a lot of Javadoc (jOOQ). I've noticed that a significant amount of time is spent in a relatively unimportant place, in the Scanner, checking for a task tag:

------------------------------------------------------------
Stack Trace	Sample Count	Percentage(%)
org.eclipse.jdt.internal.compiler.parser.ScannerHelper.isJavaIdentifierPart(long, char)	171	8.769
   org.eclipse.jdt.internal.compiler.parser.Scanner.checkTaskTag(int, int)	171	8.769
      org.eclipse.jdt.internal.compiler.parser.Scanner.getNextToken0()	171	8.769
         org.eclipse.jdt.internal.compiler.parser.Scanner.getNextToken()	171	8.769
            org.eclipse.jdt.internal.compiler.parser.Parser.parse()	171	8.769
               org.eclipse.jdt.internal.compiler.parser.Parser.parse(ICompilationUnit, CompilationResult, int, int)	171	8.769
                  org.eclipse.jdt.internal.compiler.parser.Parser.parse(ICompilationUnit, CompilationResult)	171	8.769
                     org.eclipse.jdt.internal.compiler.parser.Parser.dietParse(ICompilationUnit, CompilationResult)	170	8.718
                     org.eclipse.jdt.internal.compiler.parser.SourceTypeConverter.convert(ISourceType[], CompilationResult)	1	0.051
------------------------------------------------------------

And indeed, if I go to the preferences and remove all the pre-configured task tags from Java > Compiler > Task Tags (i.e. FIXME, TODO, XXX), then subjectively, working with Eclipse is a bit smoother.

This is just an initial hint of a potential performance flaw. If you need more information, measurements, please let me know and I'll happily provide them here.

Comment 1 Stephan Herrmann

2015-08-11 09:15:23 EDT

Thanks for the hint. I had a quick glance at the code to see if there's some low-hanging fruit, but couldn't find much:

checkTaskTag should be invoked exactly once for each comment after we found start and end positions of the comment.

Inside that method we do the necessary looping over all chars in the identified region. The algorithm is a little bit complex, but all is implemented in terms of direct access to chars in a char[], i.e., I don't see any expensive data structures involved. Actually the algorithm consists of two parts:
- identifying task tags
- finding the "message", i.e., the text between the tag and the end of line for display in the tasks view.

On the caller side, all this only happens if !skipComments. But skipComments is a rarely used special-purpose flag, assigned true only ever in AbstractCommentParser.parseHref(). Do all other callers really need comments scanned? It seems that task tags are indeed relevant in a lot of situations, including the reconciler, so it would take a closer look, to see if we can save anything here.

As you included the downstream method isJavaIdentifierPart() there's a little potential for micro-optimization by pulling some of these invocations out of their enclosing loops. I see this at least for this invocation:
  isJavaIdentifierPart(this.complianceLevel, previous)
where previous is "owned" by the outer loop.


Would you be in the position to make comparative measurements if we provide a patched jdt.core plugin?

Comment 2 Lukas Eder

2015-08-11 09:30:08 EDT

> As you included the downstream method isJavaIdentifierPart() there's a 
> little potential for micro-optimization by pulling some of these invocations 
> out of their enclosing loops. I see this at least for this invocation:
>   isJavaIdentifierPart(this.complianceLevel, previous)
> where previous is "owned" by the outer loop.

Yes, I had noticed this as well.

> Do all other callers really need comments scanned? It seems that task tags 
> are indeed relevant in a lot of situations, including the reconciler, so it 
> would take a closer look, to see if we can save anything here.

In my JMC measurements, only Scanner.getNextToken0() -> Scanner.getNextToken() appeared as relevant. But perhaps other features weren't active during my work anyway.

> Would you be in the position to make comparative measurements if we provide a patched jdt.core plugin?

Sure!

Comment 3 Mickael Istria

2017-05-15 07:08:56 EDT

@Stephan: do you think detection of task tags could be processed asynchronously?

Comment 4 Stephan Herrmann

2017-05-16 15:53:59 EDT

(In reply to Mickael Istria from comment #3)
> @Stephan: do you think detection of task tags could be processed
> asynchronously?

I don't think we have enough information for such a drastic change.

One difficulty arises from the fact that the compiler/parser/scanner is invoked in a large number of different scenarios. We can't possibly make big changes to all callers.

Moreover, I have difficulties believing the numbers. If I understand correctly, all the measured time that is spent in checkTaskTag() is actually spent in ScannerHelper.isJavaIdentifierPart() (all of 8.769%). 
That latter method looks super fast, if you ask me. I have no idea, how this could cause a bottle neck (unless measuring the very first invocation which still needs to initialize the unicode tables - those need to be read from file).

Comment 5 Eclipse Genie

2020-09-28 11:53:20 EDT

This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.

Comment 6 Eclipse Genie

2023-02-03 16:14:16 EST

This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.