Bug 331445 - performance regression caused by fix in 298835
Summary: performance regression caused by fix in 298835
Status: VERIFIED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Resources (show other bugs)
Version: 3.6   Edit
Hardware: PC Windows XP
: P3 major (vote)
Target Milestone: 3.7 M5   Edit
Assignee: Serge Beauchamp CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on:
Blocks: 331758
  Show dependency tree
 
Reported: 2010-11-30 10:23 EST by Randall Theobald CLA
Modified: 2011-01-25 11:40 EST (History)
6 users (show)

See Also:


Attachments
Proposed patch (1.06 KB, patch)
2010-11-30 13:43 EST, Randall Theobald CLA
no flags Details | Diff
Refined patch. (1019 bytes, patch)
2010-11-30 13:45 EST, Randall Theobald CLA
no flags Details | Diff
Alternate fix (1.33 KB, patch)
2010-12-02 14:26 EST, Serge Beauchamp CLA
no flags Details | Diff
New patch (1.27 KB, patch)
2010-12-15 06:27 EST, Serge Beauchamp CLA
no flags Details | Diff
Previous change (1.34 KB, patch)
2010-12-15 13:47 EST, Serge Beauchamp CLA
no flags Details | Diff
Fix & test (5.43 KB, patch)
2010-12-21 13:27 EST, Szymon Brandys CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Randall Theobald CLA 2010-11-30 10:23:23 EST
Build Identifier: 3.6.0.v20100519

I am a performance analyst for an adopting product. I am taking our initial
look at certain performance aspects of our upcoming release, which is based on
Eclipse 3.6 platform (our last release was on 3.4.2). I'm seeing a 37%
regression (from 86 seconds up to 118 sec) on a specific action (happens to be
a workspace clean and build). A profile is pointing to changes made in bug 298835 as the culprit, specifically in:

    org.eclipse.core.internal.localstore.FileStoreRoot.localLocation

where URIUtil is used to convert to and from a URI, as well as getting the variable path manager everytime, are the costly additions.

Profiling (JProf callstack sampling) shows that the cost of this method (localLocation) compared to our previous release has increased by about 21x, with the majority of that being in URI conversion. But even the call to 'getManager' is over 3x the cost of the what the entire method used to cost.



Reproducible: Always

Steps to Reproduce:
1. Benchmark/profile a series of thousands of calls to WorkspaceRoot.getFileForLocation(IPath) with long IPaths, both with and without the fixes from 298835.
2. Notice the large disparity.
Comment 1 Randall Theobald CLA 2010-11-30 13:43:21 EST
Created attachment 184164 [details]
Proposed patch

Here is a proposed patch that completely short-cuts the regression for my workload.

If at all possible, please provide a build including this fix that can be picked up by adopting products ASAP.
Comment 2 Randall Theobald CLA 2010-11-30 13:45:47 EST
Created attachment 184165 [details]
Refined patch. 

Previous patch had an inadvertent change.
Comment 3 Serge Beauchamp CLA 2010-12-02 14:14:32 EST
Hi, 

I'm concerned that this might be an excessive optimization, since it assumes that paths starting with "/" on unix platform or "\\" and "c:\" can't contain variables that might be resolved by the path variable manager.

While this is currently true given the current implementation of the path variable manager, this might not be true in the future.

The performance penalty seems significant as reported, but I think this is a very low level micro benchmark, and its impact on the overall workspace performance should be very modest.

Basically, the only thing the optimization saves is the conversion to the URI and back, since the time passed in the PVM is negligible.

The optimization could be more rewritten by simply replacing :

location = URIUtil.toPath(resource.getPathVariableManager().resolveURI(URIUtil.toURI(location)));

by:

location = resource.getPathVariableManager().resolvePath(location);

What do you think?
Comment 4 Randall Theobald CLA 2010-12-02 14:22:13 EST
The reproduction steps I reported are, agreeably, a microbenchmark. But that is only because I can't tell you to use my product and the customer-confidential workload that exposes the problem. Please be aware, like I said, that the problem surfaced in a workspace build (common operation), and it was very real (37%), not negligible.

The URI conversion IS the expensive part.

Is there a quick test to see if there are any variables in the path (like just checking for the % character or something (I don't remember how variables look) ?? I would be much more comfortable with that.

If this fix is good considering how the variable manager works today, then if the variable manager gets enhanced to where this doesn't work, test cases should start failing.

These low-level APIs can be hit very hard by established customers, so we should always be wary about adding significant processing to them.
Comment 5 Serge Beauchamp CLA 2010-12-02 14:26:23 EST
Created attachment 184384 [details]
Alternate fix

my suggestion is to use this fix instead, since it doesn't make additional assumption about the format of the path, and uses the path variable manager encapsulation of the logic of resolving paths.

Can you confirm that this fix addresses your performance issue as well?

Thanks,
Comment 6 Randall Theobald CLA 2010-12-02 14:44:42 EST
I will check this.

But I'm curious, why was the additional logic to use URIs added if it wasn't needed?
Comment 7 Randall Theobald CLA 2010-12-02 16:07:42 EST
The alternate fix should be ok, though, like I said, getting the variable path manager every time is not free. However, I noticed that the .resolvePath(IPath) call is deprecated.
Comment 8 Randall Theobald CLA 2010-12-02 16:08:28 EST
Please also note that we will need a fix for this on the 3.6.2 stream.
Comment 9 John Arthorne CLA 2010-12-03 09:47:29 EST
I like your fix Serge. This probably happened during a general transition from IPath to URI to support remote file systems. Generally speaking the URI-based resolve is better because it can handle the case where the variable resolves to a remote file system location. However in cases such as this where we know the location is only a local file system path, we can safely use the path-based resolve method. 

In fact in this case (FileStoreRoot#localLocation), I don't see how it is possible for there to be a variable in that path to begin with. That path is computed by a call to IFileStore#toLocalFile, which doesn't know anything about variables.
Comment 10 Szymon Brandys CLA 2010-12-14 09:37:38 EST
It is something we want to backport to 3.6.2, so the fix is needed ASAP.
Comment 11 Randall Theobald CLA 2010-12-14 09:39:36 EST
So if there is no chance to have a variable in these paths, can we remove the use of the variable manager altogether?
Comment 12 Szymon Brandys CLA 2010-12-14 09:44:04 EST
(In reply to comment #11)
> So if there is no chance to have a variable in these paths, can we remove the
> use of the variable manager altogether?

It looks like a reasonable thing to do. Serge?
Comment 13 Serge Beauchamp CLA 2010-12-14 11:25:00 EST
(In reply to comment #12)
> (In reply to comment #11)
> > So if there is no chance to have a variable in these paths, can we remove the
> > use of the variable manager altogether?
> 
> It looks like a reasonable thing to do. Serge?

To be honest, I don't think it would be an appropriate fix.

The method is called from 

FileSystemResourceManager.locationFor(IResource)

Which is API, and we can reasonably expect a resource to be passed that contains path variables.

Using the alternate fix I put has the performance penalty of a method call and two comparisons, since its implementation for the location.isAbsolute() is very small, basically the cost of calling 'new ProjectPathVariableManager(IResource) plus a couple of comparisons in ProjectPathVariableManager.resolvePath(IPath):


	public IPath resolvePath(IPath path) {
		if (path == null || path.segmentCount() == 0 || path.isAbsolute() || path.getDevice() != null)
			return path;


(note that there's no URI conversion)

If calling 'new ProjectPathVariableManager(IResource)' is still a too big of footprint, ProjectVariableProviderManager.getDescriptors() could be optimized a bit to avoid calling map.values().toArray().

But I think it's micro optimizing code in excess of what is to be expected for such an API. 

What do you think, should I commit the alternate fix?
Comment 14 John Arthorne CLA 2010-12-14 16:34:48 EST
(In reply to comment #13)
> To be honest, I don't think it would be an appropriate fix.
> 
> The method is called from 
> 
> FileSystemResourceManager.locationFor(IResource)
> 
> Which is API, and we can reasonably expect a resource to be passed that
> contains path variables.

Yes, it is called from that API method. However, the actual path in question is the "localRoot" local variable, which represents a bare local file system location. That field is initialized by FileStoreRoot#toLocalPath, which obtains the local path directly from EFS. Since EFS knows nothing of path variables it can't contain a path variable...

I don't actually care about the micro-optimization in this case, but from looking at the code I don't think the variable lookup is needed.
Comment 15 Serge Beauchamp CLA 2010-12-15 06:27:30 EST
Created attachment 185215 [details]
New patch

Oh, I see your point John.

So removing the PVM resolution looks fine to me.

Here's a new patch that does just that.

Can either John or Szymon review it please?  Thanks
Comment 16 Szymon Brandys CLA 2010-12-15 08:48:41 EST
Looks fine, thanks.
Comment 17 Szymon Brandys CLA 2010-12-15 12:39:37 EST
Two core.resources tests fail. FileStoreRoot#toLocalVariable may return a Path with variables, if the given URI contains them already.  For instance when we create a linked project using a location with variables. The fix is rolled back.
Comment 18 Serge Beauchamp CLA 2010-12-15 13:47:46 EST
Created attachment 185253 [details]
Previous change

Looks like the previous patch should be used instead (using the IPath API instead of the URI conversion).

I verified that it passes the junit tests fine.
Comment 19 Szymon Brandys CLA 2010-12-21 13:27:25 EST
Created attachment 185658 [details]
Fix & test
Comment 20 Szymon Brandys CLA 2010-12-21 13:31:00 EST
I added an extra check for null in FSR#localLocation and a test. The code is committed to HEAD.
Comment 21 Szymon Brandys CLA 2011-01-25 11:40:35 EST
Verified by code inspection.