Bug 528697 - [content assist] Lots of parser invocations for classes that should have been compiled already
Summary: [content assist] Lots of parser invocations for classes that should have been...
Status: REOPENED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core (show other bugs)
Version: 4.7.1a   Edit
Hardware: PC Windows 10
: P3 critical (vote)
Target Milestone: ---   Edit
Assignee: JDT-Core-Inbox CLA
QA Contact:
URL:
Whiteboard: stalebug
Keywords: helpwanted, performance
Depends on:
Blocks:
 
Reported: 2017-12-13 05:56 EST by Lukas Eder CLA
Modified: 2024-01-01 18:29 EST (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Lukas Eder CLA 2017-12-13 05:56:07 EST
This issue is related to my two previously created issues:
https://bugs.eclipse.org/bugs/show_bug.cgi?id=528695
https://bugs.eclipse.org/bugs/show_bug.cgi?id=528689

---------------------------------------------------------------

When working with larger classes that contain many imports and static imports, content assist becomes quite unusable as it is super slow to the extent that it doesn't work (error popup after some timeout).

JFR / JMC method profiling shows a hot spot in org.eclipse.jdt.internal.compiler.parser.Parser.parse():


---------------------------------------------------------------
int org.eclipse.jdt.internal.compiler.parser.Scanner.getNextToken0()	88
   int org.eclipse.jdt.internal.compiler.parser.Scanner.getNextToken()	88
   void org.eclipse.jdt.internal.compiler.parser.Parser.parse()	88
   CompilationUnitDeclaration org.eclipse.jdt.internal.compiler.parser.Parser.parse(ICompilationUnit, CompilationResult, int, int)	83
      CompilationUnitDeclaration org.eclipse.jdt.internal.compiler.parser.Parser.parse(ICompilationUnit, CompilationResult)	83
      CompilationUnitDeclaration org.eclipse.jdt.internal.compiler.parser.Parser.dietParse(ICompilationUnit, CompilationResult)	83
      CompilationUnitDeclaration org.eclipse.jdt.internal.compiler.parser.SourceTypeConverter.convert(ISourceType[], CompilationResult)	83
      CompilationUnitDeclaration org.eclipse.jdt.internal.compiler.parser.SourceTypeConverter.buildCompilationUnit(ISourceType[], int, ProblemReporter, CompilationResult)	83
      void org.eclipse.jdt.internal.codeassist.impl.Engine.accept(ISourceType[], PackageBinding, AccessRestriction)	83
      ReferenceBinding org.eclipse.jdt.internal.compiler.lookup.LookupEnvironment.askForType(PackageBinding, char[], ModuleBinding)	83
      Binding org.eclipse.jdt.internal.compiler.lookup.PackageBinding.getTypeOrPackage(char[], ModuleBinding)	83
      Binding org.eclipse.jdt.internal.compiler.lookup.CompilationUnitScope.findImport(char[][], int)	69
         Binding org.eclipse.jdt.internal.compiler.lookup.CompilationUnitScope.findSingleImport(char[][], int, boolean)	54
            void org.eclipse.jdt.internal.compiler.lookup.CompilationUnitScope.faultInImports()	32
            Binding org.eclipse.jdt.internal.compiler.lookup.CompilationUnitScope.resolveSingleImport(ImportBinding, int)	22
         Binding org.eclipse.jdt.internal.compiler.lookup.CompilationUnitScope.findSingleStaticImport(char[][], int)	15
            Binding org.eclipse.jdt.internal.compiler.lookup.CompilationUnitScope.findSingleImport(char[][], int, boolean)	15
            void org.eclipse.jdt.internal.compiler.lookup.CompilationUnitScope.faultInImports()	15
            void org.eclipse.jdt.internal.compiler.lookup.CompilationUnitScope.faultInTypes()	13
            Binding org.eclipse.jdt.internal.compiler.lookup.Scope.getBinding(char[], int, InvocationSite, boolean)	2
      Binding org.eclipse.jdt.internal.compiler.lookup.Scope.getTypeOrPackage(char[], int, boolean)	8
      Binding org.eclipse.jdt.internal.compiler.lookup.Scope.getPackage(char[][])	6
---------------------------------------------------------------

I have debugged through this logic and found that on the above call stack, there seems to be a problem in org.eclipse.jdt.internal.compiler.lookup.LookupEnvironment.askForType(), where the control flow is transferred to the part where a source model is used:

---------------------------------------------------------------
			// the type was found as a source model
			this.typeRequestor.accept(answer.getSourceTypes(), answerPackage, answer.getAccessRestriction());
---------------------------------------------------------------

When putting a breakpoint in 
org.eclipse.jdt.internal.compiler.parser.Parser.parse(ICompilationUnit sourceUnit, CompilationResult compilationResult, int start, int end), it can be seen that the sourceUnit are classes that should have already been compiled without errors. It appears that content assist has to recompile (or at least re-parse) those classes, instead of the upstack LookupEnvironment.askForType() having found a "resolved binding" or a "binary type". I don't know what the advantage is of working with the source model, but I'm not convinced that content assist should parse the sources again and again.

Note that I'm using m2e to build a set of projects, in case that is relevant here.
Comment 1 Lukas Eder CLA 2017-12-13 08:12:03 EST
After digging into this a bit further and trying to find a mvce, I've noticed that this may be a bit tricky to reduce to a minimal example. I'll be very happy to test any hypotheses tha tyou may have, though, to help reproduce this issue.
Comment 2 Stephan Herrmann CLA 2017-12-13 17:21:50 EST
(In reply to Lukas Eder from comment #0)

Thanks for the analysis so far!

> When putting a breakpoint in 
> org.eclipse.jdt.internal.compiler.parser.Parser.parse(ICompilationUnit
> sourceUnit, CompilationResult compilationResult, int start, int end), it can
> be seen that the sourceUnit are classes that should have already been
> compiled without errors. It appears that content assist has to recompile (or
> at least re-parse) those classes, instead of the upstack
> LookupEnvironment.askForType() having found a "resolved binding" or a
> "binary type".

Do you have any evidence that the type should already be known in the LookupEnvironment?

BTW: is any Java 9 involved in this? (in that case you'd have _many_ lookup environments - 1 / module).

Or are you saying the compiled .class file already exists and that should be used instead of sources?

As for the impact of m2e, that would primarily matter if you have build automatically unchecked and/or run a maven build without refreshing resources in Eclipse. I assume, however, you have your workflows to avoid these pitfalls :)
Comment 3 Lukas Eder CLA 2017-12-14 03:41:22 EST
(In reply to Stephan Herrmann from comment #2)
> Do you have any evidence that the type should already be known in the
> LookupEnvironment?

My understanding of LookupEnvironment might be incomplete as I've just discovered about its existence from this profiling / debugging session. But yes, the situation can be summarised as follows:

1. There are projects jOOQ and jOOQ-test. jOOQ-test depends on jOOQ. (there are more projects, but let's keep it simple)
2. I write code in a jOOQ-test class with tons of imports (about 320 statements) using content assist, which is slow in that class. It isn't slow in all classes, but mostly in those with many imports.
3. I put a breakpoint in org.eclipse.jdt.internal.compiler.parser.Parser.parse(ICompilationUnit sourceUnit, CompilationResult compilationResult, int start, int end) to see what's being parsed. There are many classes from the jOOQ project that keep being parsed despite them having been compiled correctly before.

> BTW: is any Java 9 involved in this? (in that case you'd have _many_ lookup
> environments - 1 / module).

The JDK *running* Eclipse is indeed JDK 9, but the code is compiled for Java 8 using JDK 8

> Or are you saying the compiled .class file already exists and that should be
> used instead of sources?

That is my assumption, yes. The jOOQ project / Maven module has been built successfully and should be available to the jOOQ-test module in binary form.

> As for the impact of m2e, that would primarily matter if you have build
> automatically unchecked 

Oh yes, I have done that indeed because of a variety of problems when building automatically, including slow compilation in general because of overall slow compilation. I have reported many issues related to that in the past and will soon report another one. Once a project has tons of generics, the cache in        org.eclipse.jdt.internal.compiler.lookup.TypeSystem$HashedParameterizedTypes.get(ReferenceBinding, TypeBinding[], ReferenceBinding, AnnotationBinding[]) becomes a major bottleneck. I will soon report a separate issue for this.

Turning on building automatically seems to actually worsen the performance of content assist...

> and/or run a maven build without refreshing
> resources in Eclipse. I assume, however, you have your workflows to avoid
> these pitfalls :)

Well, even if that would be a viable workaround, the jOOQ project is modified all the time from within Eclipse, and I'm expecting Eclipse/m2e to build the changes automatically without me having to debug through anything to reverse engineer it.

I would understand if something was a bit slow when doing it the first time (e.g. the first time after a change in jOOQ, the content assist inside of jOOQ-test could be slow), but that's not the case. It's slow every time, and the resources that should have been built (and which are built in fact) don't seem to be seen as built for content assist.
Comment 4 Lukas Eder CLA 2017-12-14 05:13:12 EST
For the re(In reply to Lukas Eder from comment #3)
> Oh yes, I have done that indeed because of a variety of problems when
> building automatically, including slow compilation in general because of
> overall slow compilation. I have reported many issues related to that in the
> past and will soon report another one. Once a project has tons of generics,
> the cache in       
> org.eclipse.jdt.internal.compiler.lookup.TypeSystem$HashedParameterizedTypes.
> get(ReferenceBinding, TypeBinding[], ReferenceBinding, AnnotationBinding[])
> becomes a major bottleneck. I will soon report a separate issue for this.

For the record, here's that other issue I've just created:
https://bugs.eclipse.org/bugs/show_bug.cgi?id=528761
Comment 5 Stephan Herrmann CLA 2017-12-14 07:34:18 EST
(In reply to Lukas Eder from comment #3)
> (In reply to Stephan Herrmann from comment #2)
> > Do you have any evidence that the type should already be known in the
> > LookupEnvironment?
> 3. I put a breakpoint in
> org.eclipse.jdt.internal.compiler.parser.Parser.parse(ICompilationUnit
> sourceUnit, CompilationResult compilationResult, int start, int end) to see
> what's being parsed. There are many classes from the jOOQ project that keep
> being parsed despite them having been compiled correctly before.

If types would have been compiled already *during the same invocation of the compiler*, then reparsing would be a straight bug, because then the LookupEnvironment should already know the type.
I understand that this is not what you are saying.
 
> > Or are you saying the compiled .class file already exists and that should be
> > used instead of sources?
> 
> That is my assumption, yes. The jOOQ project / Maven module has been built
> successfully and should be available to the jOOQ-test module in binary form.

This assumption does not match to what the implementation does.

To bother you with a few more implementation details: "the compiler" behaves quite differently, depending on how it is called. In particular lookup of additional classes does not happen in the the compiler itself, but in an INameEnvironment, of which several implementations exist. 

During "build project" o.e.j.i.c.builder.NameEnvironment indeed prefers output folders over source folders.

In the case of completion, a SearchableEnvironment is used. This guy basically operates in the universe of JDT's JavaModel. When analyzing the current project's classpath, each project dependency is treated by recursively analyzing the other project's classpath. At each of these steps source folders are preferred over output folders, which, among others, enables navigation with F3 across project boundaries.

I don't know the full reasoning about each detailed design choice, as all this happened before my time. My understanding is: SearchableEnvironment is optimized for interactive use cases in a way that we start with a single element and fetch just the minimal additional elements into compilation, for which the JavaModel provides a light-weight structure, with lots of caching involved.

The code seems to make the assumption that Parser.dietParse() is extremely fast (it skips all method bodies).

So much just for comparing assumptions against actual implementation.


If you have measurements where performance is not good, we certainly have to improve, but I'm not even sure that preferring .class over .java would completely resolve the issue, because also .class files need to be "parsed" and internal compiler bindings need to be created for completion to operate on. It may be faster than dietParse() but I'm unable to predict if speed up would be closer to 5% or closer to 100%.

In one universe, the "new index" might come to the rescue as this is the effort to keep sufficient information in a fast, cached database so neither .class nor .java needs to be read for certain use cases. Unfortunately, we can't promise the new index to be stable and fully functional in the near future.


For now I don't see a clear bug, and in order to optimize we will have to perform more measurements/experiments to really understand our options. From your measurements, can you say how much of the total time was spent below Parser.dietParse()? If that's a high percentage then we may want to perform experiments where indeed binary types are preferred and then compare results.


My gutt feelings says: there should be more potential in actually resolving less: it's unlikely that completion really needs details from all imported types. Perhaps we can invent some type proxies, which only have a fully qualified name, an address of the real thing (for ondemand parsing) and nothing else. Not a small project, but perhaps a useful one.
Comment 6 Stephan Herrmann CLA 2017-12-14 07:42:44 EST
(In reply to Lukas Eder from comment #3)
> Turning on building automatically seems to actually worsen the performance
> of content assist...

that sounds like bad news, I didn't expect you to say this ...
 
> > and/or run a maven build without refreshing
> > resources in Eclipse. I assume, however, you have your workflows to avoid
> > these pitfalls :)
> 
> Well, even if that would be a viable workaround, the jOOQ project is
> modified all the time from within Eclipse, and I'm expecting Eclipse/m2e to
> build the changes automatically without me having to debug through anything
> to reverse engineer it.

Valid expectations, and if you don't invoke "Run as ... Maven build" then co-operation between m2e and JDT indeed shouldn't require any manual refreshes or such. 

I was mainly asking because I want to be sure that the problem you observe is really within JDT, not in the coordination between various components (into which I don't have full insight either).
Comment 7 Eclipse Genie CLA 2020-01-19 04:44:37 EST
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're closing this bug.

If you have further information on the current state of the bug, please add it and reopen this bug. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.
Comment 8 Mickael Istria CLA 2020-01-19 08:07:39 EST
@Lukas: is this issue still present in latest release? If so,please reopen, it'd be a pity that such analysis about a potential bug performance win gets abandones.
Comment 9 Stephan Herrmann CLA 2020-01-19 09:01:32 EST
To summarize what we found out so far:

There's a hotspot in that Parser.parse() consumes a lot of time in a particular user's workspace.

We do not have a shared workspace / set of projects for joint exploration.

Parser.parse() is invoked for types whose details may not be relevant for the use case at hand.

We do not know if the alternative, reading and parsing .class files would result in a relevant speed up.

The compiler proper is a component that has to serve many different use cases. In the current architecture we have two hooks where use-case specific behavior can be plugged in: (1) the INameEnvironment which finds secondary types needed by the compiler - here we decide whether source of binary types are used. (2) the requestor, which is called using its accept() method (here: Engine) - here we trigger initial compile phases for the newly found type.

During (2) the parser can be asked to use dietParse(), which already avoids building AST for method bodies. We have no evidence that more text is really *parsed* than needed for the use case (members *are* relevant).


In consideration of the implementation, none of our observations can be considered as an obvious performance bug.

One fresh idea from just now: One could consider creating a variant of SourceTypeConverter, which, if no details are needed, works much more like BinaryTypeConverter: create AST from only that information readily available in the Java Model. We still need to find means to detect, in which situation that new strategy is sufficient.

There may be relevant potential for optimization, but from today's knowledge it is impossible to estimate the extent of such project. Is the required effort in the order of days? weeks? months? We do not know.

As of today nobody in the JDT team has concrete plans to work on this.  I'm changing the status of this bug to helpwanted with two requests:

- please provide a reference set of projects that can be used for profiling different experiments.

- obviously, contributions of experimental code changes would be most welcome.
Comment 10 Lukas Eder CLA 2020-01-20 03:49:42 EST
(In reply to Mickael Istria from comment #8)
> @Lukas: is this issue still present in latest release? If so,please reopen,
> it'd be a pity that such analysis about a potential bug performance win gets
> abandones.

Yes, it's still an issue, but I agree with @Stephan that there needs to be a way to reproduce this. I'll try again to find a more minimal example (no promises, of course). I do know that quite a few jOOQ customers suffer from this as well, though...
Comment 11 Eclipse Genie CLA 2022-01-10 05:29:40 EST
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.
Comment 12 Eclipse Genie CLA 2024-01-01 18:29:00 EST
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.