Bug 436645 - [CommonNavigator] NavigatorContentDescriptorManager.EvaluationCache#getDescriptors does not work
Summary: [CommonNavigator] NavigatorContentDescriptorManager.EvaluationCache#getDescri...
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: UI (show other bugs)
Version: 4.3   Edit
Hardware: PC All
: P3 normal (vote)
Target Milestone: 4.6 M6   Edit
Assignee: Sean Young CLA
QA Contact:
URL:
Whiteboard: hasPatch
Keywords: helpwanted, performance
Depends on:
Blocks: 485019
  Show dependency tree
 
Reported: 2014-06-04 20:42 EDT by Sean Young CLA
Modified: 2016-02-19 21:00 EST (History)
9 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Sean Young CLA 2014-06-04 20:42:59 EDT
org.eclipse.ui.internal.navigator.extensions.NavigatorContentDescriptorManager.EvaluationCache#getDescriptors will always return null for a key that was inserted using setDescriptors.

This seems to be because NavigatorContentDescriptorManager.EvaluationCache attempts to prevent a memory leak from caching something too long by wrapping the key in an EvaluationReference when setDescriptor is set. The problem is that the getDescriptors method does not account for that, and still tries to find the key using the original object, which of course fails given that a standard HashMap is used (well, two of them, evaluations and evaluationsWithOverrides), and new EvaluationReference(o).equals(o) will always return false (as it should; it would be violate the symmetry requirement of equals if it did).

This is "exposed" to the NavigatorContentDescriptorManager itself through findDescriptorsForPossibleChild and findDescriptorsForTriggerPoint through findDescriptors (which calls the getDescriptors method of the cache without wrapping the key in any way)

Even though this is an internal API, this flaw may be having performance implications in several public APIs.


A simple code snippet to reproduce (pretend for a moment EvaluationCache isn't private):

NavigatorContentDescriptorManager.EvaluationCache cache;  // Some way to get a cache
// Hard references ensure that the GC isn't the one responsible for this.
String key = "Key";
NavigatorContentDescriptor[] descriptors = new NavigatorContentDescriptor[0];
cache.setDescriptors(key, descriptors);
NavigatorContentDescriptor[] fetchedDescriptors = cache.getDescriptors(key);
if (fetchedDescriptors == null) {
    throw new AssertionError("Can't find an entry I just put in");
}


You should get the AssertionError every time.


Suggested fix:
Short term: Have evaluations and evaluationsWithOverrides be WeakHashMap, and don't bother wrapping the key in a custom Reference implementation. WeakHashMap already uses equals of the underlying object instead of the weak reference. The fact that the values are stored in SoftReferences now should prevent bug 156395 from becoming an issue again.
This will have the unfortunate downside of making the keys weak instead of the intended soft (EvalutationReference extends SoftReference), but right now, well, you aren't getting your things anyways.

Long term: Use a custom Map implementation that properly "packs" and "unpacks" put AND get requests as well as ensuring the underlying object's equality is used, not the reference object itself. Unlike WeakHashMap, this could use a more "specialized" Reference implementation (though I am not sure what EvalutationReference is trying to accomplish) as well as allowing both the EvalutationReference (or Soft, or Weak, or whatever) keys and SoftReference values to have a ReferenceQueue registered with them, so that stale entries can get cleaned up more promptly (an EvaluationCache instance got a map of up to 24000 entries, most of the keys being collected, because of the current entry at a time lazy method stale entries are expunged)
There are likely some existing "cache maps" that already have this implemented nicely.


Risks:
There might be some clients out there that have "worked around" this by wrapping their key queries in an EvalutationReference already, and fixing this will cause them to break. I couldn't find any such usages. I don't think this should stop the bug from being fixed, as EvalutationReference is an internal API class, and thus anyone outside of CommonNavigator should not be directly using this class.

There might be untested, potentially buggy code that would start being run once this cache actually starts returning results.
Comment 1 Sean Young CLA 2014-06-04 20:45:42 EDT
Leaving at normal importance as I don't know how widely used this thing is, though the fact that I saw 20000 entries in my heap dump seems to suggest that this does get some decent use.


I can also take a stab at implementing my short term solution fix.
Comment 2 Sean Young CLA 2014-06-05 15:06:48 EDT
Oh, an additional short term solution idea:
Make EvaluationCache#getDescriptors wrap the lookup key in an EvaluationReference before querying for it in the Map.
EvaluationReference's equals method will need fixing though; right now, it will consider two EvaluationReference's equal if their underlying objects have equal hashcodes, which is not always true.
Comment 3 Sean Young CLA 2014-06-06 21:39:03 EDT
Actually, looking at the code, NavigatorContentDescriptorManager#findDescriptors wouldn't make much use of the cache even if it did work. Yea, it would add the entries of the cache if it found a cached value, but it would go ahead and do the full computation anyways, making the fact that there is a cache a moot point. Plus, it doesn't consider considerOverrides when querying and writing to the cache.

Thus, I think the best short term solution is actually just to ditch the cache entirely, and save a few MB in ram usage.

A long term solution can be to figure out how to get it using the cache properly.
Comment 4 Sean Young CLA 2014-06-09 17:33:28 EDT
For safety's sake, I am merely removing use of the cache (as it wasn't doing anything but wasting memory) for R4_3_maintenance. Adding cached result is sort of a big deal I don't want to risk for maintenance.

Change: 28238
https://git.eclipse.org/r/#/c/28238/

Will upload commit for fixing the cache usage properly soon for master.
Comment 5 Sean Young CLA 2014-06-09 17:55:13 EDT
Here is the commit for master that makes the cache work and be used properly:

Change: 28240
https://git.eclipse.org/r/#/c/28240/
Comment 6 Sean Young CLA 2014-06-09 18:04:46 EDT
The change https://git.eclipse.org/r/#/c/28238/ should NOT be merged into master, as it is completely supersceeded by https://git.eclipse.org/r/#/c/28240/.
Comment 7 Sean Young CLA 2014-06-13 14:17:30 EDT
Quick question. Is the "quick fix" (just disable it) targeted for 4.4, and the "real fix" for 4.5?

If so, which branch should I check the quick fix into?
Comment 8 Paul Webster CLA 2014-06-25 09:47:34 EDT
(In reply to Sean Young from comment #7)
> Quick question. Is the "quick fix" (just disable it) targeted for 4.4, and
> the "real fix" for 4.5?
> 
> If so, which branch should I check the quick fix into?

Hi Sean,

The quick fix will go into 4.4.1, R4_4_maintenance branch (which doesn't exist yet).

The real fix can go into master for 4.5

PW
Comment 9 Sean Young CLA 2014-06-25 16:02:22 EDT
Thanks. Will wait for R4_4_maintenance to come up and then cherry pick quick fix into that.

Will also update CLs with new version numbers.
Comment 10 Paul Webster CLA 2014-06-25 16:34:18 EDT
(In reply to Sean Young from comment #9)
> Thanks. Will wait for R4_4_maintenance to come up and then cherry pick quick
> fix into that.

I've pushed R4_4_maintenance to eclipse.platform.ui now.

PW
Comment 11 Sean Young CLA 2014-06-25 17:10:36 EDT
OK, cherry picked into R4_4_maintenance. https://git.eclipse.org/r/#/c/29003/
Comment 12 Sean Young CLA 2014-06-26 15:58:58 EDT
Another followup patch to the quick fix patch, removing a now unused class:
https://git.eclipse.org/r/#/c/29008/
Comment 13 Sean Young CLA 2014-07-14 17:19:36 EDT
Dang it. I will need to play around with this a bit more to be sure, but in my local testing, it seems that my patch may cause the outline view for files (especially the Java one) to not always update "live". I was hoping that the existing cache invalidation mechanism was called every times things changed, but apparently not. Unless I can find a cheap way to tell if something changed, we may just want to ditch the idea of caching this entirely if we can't find a way to invalidate properly cheaply (both in terms of cost to the computer and the developers)
Comment 14 Sean Young CLA 2014-07-14 17:21:27 EDT
Hmm, does the Java outline view actually use the navigator extensions? If not, then that is a different bug.
Comment 15 Sean Young CLA 2014-07-15 13:18:31 EDT
Ok, it's not this patch that is causing outlines not to be updated. Must be some other bug (like having the same file open multiple times, cause it lives in multiple projects, or something), but I will investigate and report in another bug.
Comment 16 Sean Young CLA 2014-08-04 16:28:49 EDT
If the "real fix" is not getting merged until 4.5 M2, can the "short term" fix (already in 4.4 maintenance) be merged into 4.5? If so, how is this done?
Comment 17 Dani Megert CLA 2014-08-05 04:35:39 EDT
(In reply to Sean Young from comment #16)
> If the "real fix" is not getting merged until 4.5 M2, can the "short term"
> fix (already in 4.4 maintenance) be merged into 4.5? If so, how is this done?

Can you point me to that fix and bug report?
Comment 18 Sean Young CLA 2014-08-05 12:08:54 EDT
The "short term" fixes for this bug are https://git.eclipse.org/r/#/c/29003/ and https://git.eclipse.org/r/#/c/29008/ (I would of submitted them as one changelog, but the previous changelog got submitted prematurely)

These two should be in any future branch where the "real fix" (https://git.eclipse.org/r/#/c/28240/) will NOT be merged too.


All of these are for this bug.
Comment 19 Dani Megert CLA 2014-08-26 11:28:17 EDT
(In reply to Sean Young from comment #18)
> The "short term" fixes for this bug are https://git.eclipse.org/r/#/c/29003/
> and https://git.eclipse.org/r/#/c/29008/ (I would of submitted them as one
> changelog, but the previous changelog got submitted prematurely)
> 
> These two should be in any future branch where the "real fix"
> (https://git.eclipse.org/r/#/c/28240/) will NOT be merged too.
> 
> 
> All of these are for this bug.

Thanks Sean. I'm currently too busy to review the patch, sorry. For now I've cherry-picked the workaround to 4.5 with

http://git.eclipse.org/c/platform/eclipse.platform.ui.git/commit/?id=922279707c30ae4591dd03a933264c4454b7fdc3

http://git.eclipse.org/c/platform/eclipse.platform.ui.git/commit/?id=4b423f6db26eaebede88479e2b3f7098ea2eb4fb

so that we have at least the same code in master as in R4_4_maintenance.
Comment 20 Lars Vogel CLA 2015-01-30 11:47:15 EST
(In reply to Sean Young from comment #18)
> The "short term" fixes for this bug are https://git.eclipse.org/r/#/c/29003/
> and https://git.eclipse.org/r/#/c/29008/ (I would of submitted them as one
> changelog, but the previous changelog got submitted prematurely)
> 
> These two should be in any future branch where the "real fix"
> (https://git.eclipse.org/r/#/c/28240/) will NOT be merged too.
> 
> 
> All of these are for this bug.

Can you summerize which Gerrit review is still open for master?
Comment 21 Sean Young CLA 2015-02-02 08:42:54 EST
Sorry for the confusion.

The CL that is currently open is https://git.eclipse.org/r/#/c/28240/
which is in master.

The changes to remove the broken version of NavigatorContentDescriptorManager.EvaluationCache were already submitted, the CL above adds back in a fixed version of the evaluation cache.
Comment 22 Sergey Prigogin CLA 2015-02-10 18:09:42 EST
(In reply to Sean Young from comment #21)

Do you have any data on whether the restored caching helps performance?
Comment 23 Sean Young CLA 2015-02-12 12:57:01 EST
In terms of time saved, no. However, I have been running this code on my install with some instrumentation added for hit rates and the like and there tends to be a very high hit rate (like 8 hits : 1 miss after a decent amount of use). Granted, not many plugins seem to use the navigator extension stuff (the Egit git plugin is pretty much the only major user I found).

Are there any benchmarks for this stuff already written?
Comment 24 Sean Young CLA 2015-02-12 12:57:55 EST
Sorry, just a clarification. No meaning I don't know if this saves any time. NOT no, this does not save any time. I haven't tested timings.
Comment 25 Dani Megert CLA 2015-04-24 07:25:00 EDT
Sorry, I won't have time to review this for 4.5.
Comment 26 Lars Vogel CLA 2015-05-22 13:02:35 EDT
Moving to 4.6, please move back to 4.5.1 if you think these bugs are important enough to be addressed in the service release.
Comment 27 Sean Young CLA 2015-05-22 13:07:48 EDT
I'll need to rebase the change; how soon do you need this done to get into 4.5.1?
Comment 28 Sergey Prigogin CLA 2015-05-22 13:12:50 EDT
(In reply to Sean Young from comment #27)
> I'll need to rebase the change; how soon do you need this done to get into
> 4.5.1?

Lets get it into 4.6 (master branch) first. If it behaves well in our testing, we can consider cherry-picking it to R4_5_maintenance for inclusion into 4.5.1.

It would be nice to have a rebased patch in Gerrit by June 24.
Comment 29 Dani Megert CLA 2015-05-22 13:26:37 EDT
(In reply to Sergey Prigogin from comment #28)
> (In reply to Sean Young from comment #27)
> > I'll need to rebase the change; how soon do you need this done to get into
> > 4.5.1?
> 
> Lets get it into 4.6 (master branch) first. If it behaves well in our
> testing, we can consider cherry-picking it to R4_5_maintenance for inclusion
> into 4.5.1.
> 
> It would be nice to have a rebased patch in Gerrit by June 24.

That's the way I like it ;-)
Comment 30 Sean Young CLA 2015-05-29 11:59:42 EDT
Ok, rebased.
Comment 31 Sean Young CLA 2015-06-22 14:17:28 EDT
Ping on a review for this?
Comment 33 Sergey Prigogin CLA 2015-07-13 15:14:21 EDT
Thank you, Sean.
Comment 34 Eclipse Genie CLA 2015-07-14 18:52:13 EDT
New Gerrit change created: https://git.eclipse.org/r/51941
Comment 35 Eclipse Genie CLA 2015-07-14 18:53:57 EDT
New Gerrit change created: https://git.eclipse.org/r/51942
Comment 38 Dani Megert CLA 2016-01-12 04:59:17 EST
Sean, this might have caused a regression for the Sirius project. Please take a look at bug 485019.
Comment 39 Maxime Porhel CLA 2016-01-19 03:11:23 EST
Hi Sean and Dani, 

Any news on the cache issue ? 


Regards, 

Maxime
Comment 40 Eclipse Genie CLA 2016-01-20 06:14:54 EST
New Gerrit change created: https://git.eclipse.org/r/64752
Comment 41 Maxime Porhel CLA 2016-01-20 06:17:36 EST
See https://git.eclipse.org/r/#/c/64752/ which allows us to retrieve the dynamic behavior for our navigator content extension (see Bug 485019). 

I reopen this issue as it causes our bug. The provided patch might need to be improved.

Regards
Comment 42 Sean Young CLA 2016-01-20 11:51:06 EST
Apologies for the late response.

I was afraid that there might of been code relying on the previously broken behavior causing things to never have a cache hit, and it looks like I was right.

There is a mechanism in place to invalidate caches (the cache is a VisibilityListener, and it invalidates all the entries upon onVisibilityOrActivationChange), but that may not be sufficient for all use cases.

An expanded API to mark entries (or the whole cache) for invalidation and/or some way to mark entries as uncacheable may be needed.
Comment 43 Maxime Porhel CLA 2016-01-21 04:40:21 EST
Hi Sean, 

I agree with you on the need of new API to invalidate the cache for other cases than the visibility/activation changes.

Do you plan to work on the new API soon ? Or to retrieve the old dynamic behavior first ?

Regards

Maxime
Comment 44 Sean Young CLA 2016-01-26 11:30:21 EST
I apologize, but I don't have the cycles to work on this at the moment. I should have some time in a month or so, but if you need more invalidation options before then, can someone else take a look?
Comment 45 Dani Megert CLA 2016-02-10 08:19:56 EST
(In reply to Sean Young from comment #44)
> I apologize, but I don't have the cycles to work on this at the moment. I
> should have some time in a month or so, but if you need more invalidation
> options before then, can someone else take a look?

Sergey, can you take a look? You gave Review+2 and merged the change. Thanks!
Comment 46 Sergey Prigogin CLA 2016-02-10 13:45:24 EST
(In reply to Dani Megert from comment #45)
Will do.
Comment 47 Eclipse Genie CLA 2016-02-10 20:54:17 EST
New Gerrit change created: https://git.eclipse.org/r/66358
Comment 48 Sergey Prigogin CLA 2016-02-10 20:55:06 EST
(In reply to Maxime Porhel from comment #43)
Hi Maxime,

Would it satisfy your needs if the caches were invalidated when INavigatorContentService.update() method is called? See https://git.eclipse.org/r/#/c/66358/ for the proposed change.
Comment 49 Maxime Porhel CLA 2016-02-11 03:25:17 EST
Hi Sergey, 

This is not enough: the INavigatorContentService.update() is currently called only on extension activation an in the UpdateActiveExtensionOperation (called from the common filter selection dialog).

Our simpler failing use case is a test expression checking a property tester which tests the nature of a IProject. This test expression is used in possibleChildren and triggerPoints of our content extension. 

The real expression is more complicated and do several checks on some IFile and some internal state of Sirius.

Regards, 

Maxime
Comment 50 Maxime Porhel CLA 2016-02-11 03:39:57 EST
Could you take a look at the following patch: https://git.eclipse.org/r/#/c/64752/

Our test suites are ok with it, it avoid to cache an expression as soon as there is a TestExpression in it.
Comment 51 Sergey Prigogin CLA 2016-02-11 13:30:58 EST
(In reply to Maxime Porhel from comment #49)

Sorry, for not explaining the proposed solution in detail. With https://git.eclipse.org/r/#/c/66358/ the Sirius code that knows what data may affect the Navigator content, may listen to changes in that data and call INavigatorContentService.update() whenever it detects a change. This approach is likely to result in better performance compared to https://git.eclipse.org/r/#/c/64752/, which would trigger repeated evaluation of property testers that are considered unsafe to cache.
Comment 52 Maxime Porhel CLA 2016-02-12 03:28:11 EST
(In reply to Sergey Prigogin from comment #51)

This means that all existing CNF extensions using some property testers will have to explicitly call INavigatorContentService.update()/NavigatorContentDescriptorManager.getInstance().clearCache() if they want to keep the dynamic behavior of their navigator content trigger points (and possible children) in Neon. 

This might impact other CNF users and not only Sirius.

Unless this change is documented, it seems safer to avoid to cache expressions known to be dynamic. 

Regards
Comment 53 Sergey Prigogin CLA 2016-02-12 17:20:01 EST
(In reply to Maxime Porhel from comment #52)

I agree that the caching behavior has to be documented. Not caching dynamic expressions may have negative performance implications.
Comment 55 Dani Megert CLA 2016-02-19 03:31:52 EST
(In reply to Eclipse Genie from comment #54)
> Gerrit change https://git.eclipse.org/r/66358 was merged to [master].
> Commit:
> http://git.eclipse.org/c/platform/eclipse.platform.ui.git/commit/?id=cb9032b4744f8d42f1b08ca87851a0f8b9a72e18
> 


Either there is a real problem with this change, or tests need to be adjusted. We have 4 failures (on all platforms):

http://download.eclipse.org/eclipse/downloads/drops4/N20160218-2000/testresults/html/org.eclipse.ui.tests.navigator_linux.gtk.x86_64_8.0.html
Comment 56 Eclipse Genie CLA 2016-02-19 20:24:20 EST
New Gerrit change created: https://git.eclipse.org/r/66976