Community
Participate
Working Groups
In 3.1. Open FailureTrace (in the junit plug-in). Open Call Hierarchy on the method refresh. On my system, it takes about 5.5 seconds. Now, rename refresh to refresh2, and Open Call Hierarchy again. On my system, it takes about .5 seconds. refresh neither overrides nor is overridden, so the difference seems to come from the fact that my workspace has 188 declared methods named "refresh", and 0 named "refresh2". However, there is no semantic relationship between FailureTrace.refresh and the 187 other methods with the same name. What would it take to build the search indices to take advantage of this fact?
Markus, I think this belongs to core since there isn't much we can do here in UI.
Yes, moving to JDT/Core. This is a general problem with search for method references.
Search needs to verify that there's no relationship between your pattern and entries returned by indexer. Search needs to parse file containing these entries to know whether these possible matches may be accurate or not... Don't know if there's something to do in this peculiar case, but I'll have a look.
The index for method invocations could contain information about the target type and prune non-matching index entries without parsing those files.
Unfortunately indexing cannot store any other information for method callers than method name and number of arguments. Parser definitely needs resolution to have more information on them and indexing does not and won't use resolution for performances concerns... So, I'm afraid that there's nothing we can do to fix this problem with current design... Please reopen and increase severity if you think we definitely need to find a solution to it.
Methods should simply be indexed by declaring classes AND signature, not only signature. Having experienced the problem with 'add', 'get' and 'close', it is clear to me that having the owning class/interface as part of the primary key to the method is mandatory to improove the search performance. If the the search engine is text aware but not java aware, (and I foxus on thf 'if') then there are no reasons to avoid improoving the JDT. I feel such java search should be delegated or implemented in a fully java-aware layer, for opportunity reasons. The book keeping performance of such lookup would be simply a cloned variant of the current lookup. If the current indexing cost 1x, having an index that adds the declaring class to primary key would overall cost 2x. When a piece of code uses FileOutputStream.close(), it registers its location under the OutputStream.close() anyway, since FileOutputStream isn't declaring close(). Then if you search for calls to OutputStream.close() (not any other close like Socket.close()), the java-aware search would be launched with OutputStream.close() and it would return a narrower list of locations. Then, the filtering for FileOutputStream.close() can occur, if required. If you want immediate searching of FileOutputStream.close(), it requires the calling code to register with all pairs of possible superclass, superinterface, declaringclass, or declaring interface with the method. (Such book keeping is ugly I gree, that why I would only register under the 1 declaring class/interface + method). The only memory impact it the number of entries in such new lookup map, since, for example, the OutputStream.close() key is not equals to the Socket.close() key (factorization of all colliding methods names under their declaring classes/interfaces). This issue should be reopenned as an improovement for the search engine.
(In reply to comment #5) > So, I'm afraid that there's nothing we can do to fix this problem with current > design... Maybe we just didn't check the right places to improve the search speed. What I usually do now if I run into an often-used method name (see comment 1), is this: - rename the method to refresh2 - save and let the incremental Java builder run - all references show up in the the problems view as errors ("The method refresh2(..) is undefined for the type ..") This is much faster than letting the search engine parse all files that contain a call to any refresh(..) method. This optimization could be implemented right in the search engine: If there are a lot of index matches and the workspace is fully built, only parse the CUs that have a dependency on the method declaration's CU. The builder must know these dependencies, otherwise an incremental build would take much longer.
(In reply to comment #7) > (In reply to comment #5) > > So, I'm afraid that there's nothing we can do to fix this problem with current > > design... > > Maybe we just didn't check the right places to improve the search speed. What I > usually do now if I run into an often-used method name (see comment 1), is > this: > > - rename the method to refresh2 > - save and let the incremental Java builder run > - all references show up in the the problems view as errors ("The method > refresh2(..) is undefined for the type ..") > > This is much faster than letting the search engine parse all files that > contain a call to any refresh(..) method. > > This optimization could be implemented right in the search engine: If there are > a lot of index matches and the workspace is fully built, only parse the CUs > that have a dependency on the method declaration's CU. The builder must know > these dependencies, otherwise an incremental build would take much longer. > Why don't you simply use a search only in sources? I think if there's a strong need for this requirement, it would be a more simple way to investigate: turn off the parse of files in libraries if too many matches are found in jar indexes...
> Why don't you simply use a search only in sources? I just tried it again and measured the time to search for references to org.eclipse.ltk.core.refactoring.Change.getParent() in my workspace (all values warm, i.e. do the operation twice and only measure 2nd run): No matter whether I turn off libraries or not, search takes about 30s until the first result is shown and 50s until the search is done. Incremental build, however, is done in 7s, which is a factor of 4-7 better.
We already have a notion of compound searches, so is it possible to search for references to the type org.eclipse.ltk.core.refactoring.Change + senders of the method getParent() ? Then use the reduced set to parse/match correctly.
> We already have a notion of compound searches, so is it possible to search for > references to the type org.eclipse.ltk.core.refactoring.Change + senders of the > method getParent() ? Nope, the invocation c.getParent() could be a polymorphic call (type of c is a subtype of Change). In this case, the CU containing the invocation does not have to reference the type Change explicitly.
(In reply to comment #5) > Please reopen and increase severity if you think we definitely need to find a > solution to it. We should really see whether comment 7 could be implemented. I almost starved to death while searching for references to my own getShell() method. And this problem will inevitably become worse with bigger and bigger workspaces.
First need to optimize the PossibleMatch storage in PossibleMatchSet and see the new performance results after it has been implemented...
Frédéric, could you please investigate this one before you move to another project ?
(In reply to comment #9) > > Why don't you simply use a search only in sources? > > I just tried it again and measured the time to search for references to > org.eclipse.ltk.core.refactoring.Change.getParent() in my workspace > (all values warm, i.e. do the operation twice and only measure 2nd run): > > No matter whether I turn off libraries or not, search takes about 30s until the > first result is shown and 50s until the search is done. Incremental build, > however, is done in 7s, which is a factor of 4-7 better. I've also made some measures and get different results... My workspace have all JDT sources, i.e. all projects got from JDT/Core, JDT/UI and JDT/Debug releng map files. Searching for all references in the workspace of Change.getParent() takes around 7 seconds in this workspace when I precise only sources in the Java Search dialog... For comparison, the incremental build solution takes 5-6 seconds in the same workspace. So, I do not observe the same improvement factor... Do I miss something?
My dev workspace also contains Ant and PDE in source (without tests). With I20090929-0800, I get these numbers for a search to the IMethod Change#getParent() in warm state (i.e. not realistic in real life): - all "Search In" options enabled: 27s - "JRE libraries" disabled: 23s - only "Sources": 21s Add a "2" after name of Change#getParent() and save: 8s until incremental build is done. And the building includes all the overhead of writing files, adding problems, etc. For comparison, search for references to Change#setEnabledShallow(boolean) is not measurable by hand, i.e. < 1s. > So, I do not observe the same improvement factor... Do I miss something? Maybe getParent() is not frequent enough in your workspace to make it obvious. Furthermore, class Change has quite a few dependencies, so it's not that good as an example because it makes incremental building slow as well (but I guess most of the slowness is disk IO, which would be less of a problem if you only read dependencies). In UI code, "refresh" and "getShell" are even better examples, e.g. to find references to org.eclipse.jface.preference.ListEditor.getShell(), I need 65s, while incremental build is done in 2s.
This would be a good item to look at post current work on the formatter. Targetting M4 for now. Will reconsider once we access how much work is needed.
Frédéric, please investigate for M5.
(In reply to comment #16) With additional projects, I can get the same numbers... However, I'm still not convinced that this solution is feasible. At a first look, the builder status does not seem to be publicly accessible, hence it may be really difficult to get its content for the Search Engine. But of course this is not the difficulty which must stop us... However, even if we could access the builder status content, it might be out-of-date or even not exist if no full build has been done before the search... So, as Olivier remind us for another bug, the accuracy of the compiler and its tools is the first law... Hence, the S.E. would always need to search by itself when the builder status was not accurate (which is a common case when the 'Build automatically' preference is not checked). An other point is that when I looked at the incremental builder, I saw that it was of course listening to resource events to know which resources will need to be updated in the status. Currently the S.E. does not listen to resources events. If we decide to change it, this would imply big changes that unfortunately we could not afford for 3.6 version. So, my conclusion is that: 1) I strongly doubt that using the build status could be the solution here 2) I agree that if we really want to fix this, a possible idea would be that the S.E. implement a new strategy to have something similar to the builder and indexing one (i.e. status + listen event resources)
> Hence, the S.E. would always need to search by itself > when the builder status was not accurate (which is a common case when the > 'Build automatically' preference is not checked). Yes, this optimization can only work when the affected projects are fully built, so the old way has to stay. But for Java development, the common case is that auto-build is enabled and the workspace is fully built. > An other point is that when I looked at the incremental builder, I saw that it > was of course listening to resource events to know which resources will need to > be updated in the status. Currently the S.E. does not listen to resources > events. If we decide to change it, this would imply big changes that > unfortunately we could not afford for 3.6 version. Why would the S.E. have to listen to resource changes? The S.E. can just ask the builder whether it is ready. If the workspace changes while a search is running, then it's OK to miss matches that are only available in the new state, or to report matches that are out of date. Clients that can't cope with outdated matches have to lock the workspace anyway, since the matches could also become outdated 1ms after the search ended (or even during the requestor callback).
(In reply to comment #20) > Yes, this optimization can only work when the affected projects are fully > built, so the old way has to stay. But for Java development, the common case is > that auto-build is enabled and the workspace is fully built. > Not sure about this assumption, all Java developers here have disabled auto-build... ;-) > > Why would the S.E. have to listen to resource changes? The S.E. can just ask > the builder whether it is ready. If the workspace changes while a search is > running, then it's OK to miss matches that are only available in the new state, > or to report matches that are out of date. Clients that can't cope with > outdated matches have to lock the workspace anyway, since the matches could > also become outdated 1ms after the search ended (or even during the requestor > callback). Unfortunately I have looked at the Incremental JDT builder + asked Philippe about this and the conclusion is that there's currently no other way to have the idea the builder state than asking for resource deltas using the API method IncrementalProjectBuilder.getDeltas(IProject). Expect if we missed something, it seems that this currently can only be done through a builder, hence I cannot figure out how the S.E. could get them... As I said, I talked with Philippe about this issue and his advice was to try to reduce the scope, as in fact the builder do. The builder never works on the entire workspace (except for the full build). It uses a stored state to know what is the smallest scope to recompile when a file is changed. The simple idea is to do the same in this peculiar case. My first idea was to reduce the scope by restricting the search to the sources only. After having talked with Philippe, I realized that we could also reduced the scope to the Enclosing projects instead of the entire workspace... And here are the number for org.eclipse.jface.preference.ListEditor.getShell() on the workspace I use for this issue: - Search in 'Sources' only using 'Workspace' scope: ~28s (warm) - Search in 'Sources' only using 'Enclosing projects' scope: ~2s (cold) ~1s (warm) - Search in 'Sources' + 'Required projects' + 'Application libraries' using 'Enclosing projects' scope: ~3s (cold) ~1s (warm) - Build: ~5s (cold) ~1s (warm) Maybe we could think about to give the users access to this kind of search in an easier way than the Java Search dialog...?
Moving to M6.
Re getting the build state from the builder: I see that this can get tricky. But you could probably just hock into the existing DeltaProcessor. I saw that it already calls JavaBuilder.buildFinished(), so it could also set a flag there and reset it on resource change events. > As I said, I talked with Philippe about this issue and his advice was to try to > reduce the scope, as in fact the builder do. The builder never works on the > entire workspace (except for the full build). It uses a stored state to know > what is the smallest scope to recompile when a file is changed. > > The simple idea is to do the same in this peculiar case. That's what I'm proposing, but I thought it's not worth to duplicate this dependency information in search, so I suggested to take this state from the builder when it is available. > My first idea was to reduce the scope That's fine too, but AFAIK, the search engine already does this where it doesn't reduce the possible match count (e.g. strips away unrelated projects). > Maybe we could think about to give the users access to this kind of search in > an easier way than the Java Search dialog...? The user should not be bothered with the 'Search In' flags all the time. The options in the Search dialog are mainly a way to reduce the match count if the user knows e.g. that there are no relevant references from the JRE (although there could be references through polymorphism, but Search cannot know that they will never happen because it does not do any flow analysis). The validity of those settings also depend on the workspace setup (e.g. whether I have some projects from source or imported as binary), but the user's request is always the same: Give me all references in the workspace as fast as possible, doing whatever tricks are necessary to speed up the operation.
Created attachment 158208 [details] Draft patch Markus, if you can make a quick test with this patch and let me know if it goes in the right direction... TIA Note that I still need to discuss with Jerome about it to be sure of the DeltaProcessor changes (it seems to be a little naive so far). I should be back with an update tomorrow...
(In reply to comment #24) I quickly tried it, but I'm not sure what I'm supposed to test. The changes look promising, but the shortcut in the IndexSelector is too easy, since it only visits the enclosing projects of the generatedResources, but it misses to check references to these resources. The incremental builder must know about these references somehow, and that knowledge most be used here as well.
Created attachment 158321 [details] Proposed patch A new proposal to address this issue... The key point is that Search Engine can be improved in this area *only* when the Auto-build is activated in the workspace. Unfortunately, all my attempts to implement a needRebuild() using the DeltaProcessor failed mainly due to the fact that there's no order specified between POST_BUILD and POST_CHANGE resources events coming from the Platform... So, assuming that the user has activated the Auto-build preference in his/her workspace, with this patch, the IndexSelector will be able to reduce the number of project index files to look at by using the project builder state and see whether it includes a reference to the focus type or not... Using this patch, searching for the ListEditor.getShell() method references on the *entire* Eclipse full sources workspace takes 14s just after having opened the session (i.e. when it's cold) and only 2s if I repeat the search (i.e. when it's warm)... Comparatively, on the same workspace, renaming the method and trigger a build takes less than 2s when it's cold and less than 1s when it's warm... So, even if it's not as fast as the builder, I guess this is the simplest change we can have for an acceptable result.
(In reply to comment #25) > I quickly tried it, but I'm not sure what I'm supposed to test. Just play with the last posted patch and see if it definitely addresses your problem. That means, do you think that the response time for the search query you perform in your usual workspace is now acceptable... Also, with the last patch I posted, do you think it's acceptable that the S.E. speed-up only works when the auto-build is ON?
Speedup is great for some examples (ListEditor#getShell()), but not noticeable for others (Change#getParent()). I guess that's a consequence of the strategy to only use dependency information to include/exclude complete projects. I'm not sure how frequent the two patterns are in practice -- I'll have to test this a bit longer in practice to see if the current fix is already enough. > Also, with the last patch I posted, do you think it's acceptable that the S.E. > speed-up only works when the auto-build is ON? Yes, that was the whole idea (use the available dependency data if it is available for free, but don't spend more cycles for resolving dependencies while maintaining the index). I found that the new strategy reduces the amount of potential matches in some cases. E.g. when you import jface, swt, and all plug-ins that match "*.ui" as source, then search for references to ListEditor#getShell() used to find potential matches in org.eclipse.compare which is e.g. required by org.eclipse.equinox.p2.ui. The found potential matches were all wrong though, so I think it's no big deal.
(In reply to comment #26) > Created an attachment (id=158321) [details] > Proposed patch > Released for 3.6M6 in HEAD stream.
Frederic, The following code looks a bit suspicious to me: (reformatted to fit) while (javaElement != null && !(javaElement instanceof ICompilationUnit) && !(javaElement instanceof IClassFile)) { javaElement = javaElement.getParent(); } if (javaElement != null) { ICompilationUnit compilationUnit = (ICompilationUnit) javaElement; If we exit the while with javaElement being an instanceof IClassFile then the cast to ICompilationUnit could throw an class cast exception ? Let me know if I have misunderstood something here. Otherwise the fix looks ok, I verified it by code review since the performance angle has already been checked by Markus.
(In reply to comment #30) > Frederic, > > The following code looks a bit suspicious to me: (reformatted to fit) > > while (javaElement != null && > !(javaElement instanceof ICompilationUnit) && > !(javaElement instanceof IClassFile)) { > javaElement = javaElement.getParent(); > } > > if (javaElement != null) { > ICompilationUnit compilationUnit = (ICompilationUnit) javaElement; > > If we exit the while with javaElement being an instanceof IClassFile then > the cast to ICompilationUnit could throw an class cast exception ? > > Let me know if I have misunderstood something here. > > Otherwise the fix looks ok, I verified it by code review since > the performance angle has already been checked by Markus. You definitely right, you points out here the potential CCE I was talking about in bug 304841 comment 4... Hopefully, I also saw it and it's now fixed in HEAD :-)
Verified for 3.6M6 by code inspection.