I am seeing very slow project creation because it seems that all files must be statted and identified prior to interacting with the project. Similarly, starting up Eclipse is slowed down as all files must be statted during startup. This problem is made even worse over NFS or MVFS. I propose 'lazy' resource discovery. When a project is created or Eclipse is started, just put a placeholder in place for the project. Put the process of getting directory listings in a background thread, queued by directory depth (i.e., list and stat the top-level directories of projects before beginning the descent into subdirs). A user opening a project or a directory in a project raises the priority of that project's or directory's listings. In the event that a user has drilled down into a listing not yet filled in, use the same "Pending..." that the CVS folks use to let the user know things are coming. The goal is to give the illusion that a project (or Eclipse as a whole) is immediately available, and then conspire to make sure that whenever a developer is looking for information it is already present.
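The proposal above amounts to a priority queue of directory-listing requests, ordered by depth, with a way to bump a directory the user has just opened. The following is a minimal sketch in plain Java and assumes nothing about how Eclipse implements (or would implement) this; `DiscoveryQueue`, `enqueue`, `prioritize`, `next` and `isPending` are hypothetical names.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch: directories waiting to be listed, shallowest first.
// A directory the user opens jumps ahead of everything else.
class DiscoveryQueue {
    private static final class Request {
        final String path;
        final int priority; // lower value = listed sooner
        final long seq;     // tie-breaker: FIFO among equal priorities

        Request(String path, int priority, long seq) {
            this.path = path;
            this.priority = priority;
            this.seq = seq;
        }
    }

    private final PriorityQueue<Request> queue = new PriorityQueue<>(
            Comparator.<Request>comparingInt(r -> r.priority)
                      .thenComparingLong(r -> r.seq));
    private final Map<String, Request> pending = new HashMap<>();
    private long counter = 0;

    // Depth is the number of '/' separators: "/proj" -> 1, "/proj/src" -> 2.
    static int depthOf(String path) {
        int depth = 0;
        for (char c : path.toCharArray()) {
            if (c == '/') depth++;
        }
        return depth;
    }

    // Background discovery: queue a directory at its natural depth.
    void enqueue(String path) {
        put(path, depthOf(path));
    }

    // User opened this directory: list it before anything queued by depth.
    void prioritize(String path) {
        put(path, -1);
    }

    private void put(String path, int priority) {
        Request old = pending.remove(path);
        if (old != null) queue.remove(old); // replace any earlier request
        Request r = new Request(path, priority, counter++);
        pending.put(path, r);
        queue.add(r);
    }

    // Next directory to list, or null when nothing is waiting.
    String next() {
        Request r = queue.poll();
        if (r == null) return null;
        pending.remove(r.path);
        return r.path;
    }

    // True while a directory is still waiting; a UI could show "Pending...".
    boolean isPending(String path) {
        return pending.containsKey(path);
    }
}
```

A background worker would call `next()` in a loop, list that directory, and `enqueue` each subdirectory it finds; a view would call `prioritize` when the user expands a node for which `isPending` is still true.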
Ed, I'd like to collect some data before we explore solutions in more detail. I am going to attach a benchmarking plugin that is used to test the performance of various operations in different end-user configurations. I want to get an idea of the performance cost of stat in your configuration. The plugin is installed by unzipping into a new directory, and then linking to your product using Help > Software Updates > Manage Configuration > Add an Extension Location. This allows you to install the new plugin without having to restart. Once installed, it adds a "Benchmark" sub-menu to the resource navigator's context menu. The two benchmarks I want you to run are not actually context sensitive (i.e., it doesn't matter what the current selection is in the navigator view). Load up your large workspace, and run both the "Standard Visitor" and "Visit and Stat" benchmarks. Pick a number of iterations that takes a reasonably long time (at least a few seconds) in order to minimize noise in the results. The standard visitor is just a baseline to discover the cost of iterating over all resources in your workspace (no I/O or expensive operations involved). The visit and stat benchmark performs exactly one stat call on each resource in the workspace.
Created attachment 14951 [details] Performance benchmark plugin
I ran the benchmarks requested over a workspace that contained a single large project:
Benchmark Standard Visitor (100 iterations): 393.48 ms average per iteration
Benchmark Visit and Stat (1 iteration): 501951.0 ms average per iteration
I'll visit and stat over more iterations later... but as you can see for the one iteration result... generating them may take a while :)
It should be mentioned that the project I ran benchmarks for in bug 74392 comment 3 is a ClearCase view over MVFS... and the ClearCase client I'm working on is 0.2 ms or so roundtrip from the vob server.
Wow! That definitely shows the bottleneck. 8 minutes to stat the project is incredible. I think a single iteration is enough to show a reliable result on a sample that size. How many resources in the project? The "Find Markers" benchmark prints the number of resources visited, so you can run that on the project to see the size. For reference, I just ran on my local workspace (24,378 resources). The basic visitor took 256 ms, and Visit and Stat took 4909 ms.
Find Markers shows 74176 markers in the single project that is the only project in the current workspace.
Correction to bug 74392 comment 6. It found 74176 resources, not markers.
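Dividing the Visit and Stat timings quoted above by the resource counts gives a rough per-stat cost for the two setups. The sketch below is only arithmetic on numbers already reported in this bug (501951.0 ms over 74176 resources on MVFS; 4909 ms over 24,378 resources on a local workspace); the class and method names are made up for illustration, and the baseline visitor time is not subtracted because it is small by comparison.

```java
// Hypothetical helper: average cost of one stat call, in milliseconds.
class StatCost {
    static double perResourceMillis(double totalMillis, int resources) {
        return totalMillis / resources;
    }

    public static void main(String[] args) {
        // Figures taken from the comments above.
        double mvfs = perResourceMillis(501951.0, 74176);
        double local = perResourceMillis(4909.0, 24378);

        System.out.printf("MVFS:  %.2f ms per stat%n", mvfs);
        System.out.printf("local: %.2f ms per stat%n", local);
        System.out.printf("ratio: %.1fx%n", mvfs / local);
    }
}
```

This works out to roughly 6.8 ms per stat over MVFS against roughly 0.2 ms locally, a difference of more than thirty times per resource.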
I just noticed that in Eclipse 3.0.1 startup does not seem to be slowed down by what I hypothesised was the 'stating' problem. I seem to remember it being in Eclipse 3.0. Am I crazy or did something change?
You actually shouldn't be seeing extensive stating on startup under normal circumstances. There is a preference (Workbench > Startup and Shutdown > Refresh Workspace on startup). If this is checked (or the -refresh command line argument is used), it will stat the entire workspace on startup. Also ensure Preferences > Workbench > Refresh workspace automatically is turned off. When this second preference is turned on, a background thread will periodically stat the files in the file system to see if they have changed. Finally, when Eclipse detects that the previous session crashed (didn't shut down using File > Exit), it will do a "free" refresh (stat) of the entire workspace on startup.
I just wall clock timed the process of creating a project over my single large code base: 9:15 +/- 5 seconds. So it looks like other than the 8:20 for stating we have about 55 seconds being used for other activities related to building the project. What can we do to find out what else is taking so long?
Given the speed of your file system, I suspect another big cost is that we call java.io.File.list() once on each folder to gather the list of child names. In other words, simply discovering the complete tree of file and directory names will take some time. If all stating was eliminated (or moved to the background), would 55 seconds be an acceptable range or is it still too slow? How often are you needing to create such projects? I have another general question about your work flow that would help us scope a solution. Out of the 75,000 files in the project, what proportion of them will eventually be read locally? Are builds run locally or remotely? Do you only ever need the bytes from a small set of those files? We currently have the notion in our API of "non-local" files and folders. Such resources have a bit set indicating they do not reside on a local disk. They are never stated, but trying to obtain their contents without first making them local is not allowed. Would it make sense in your situation for us to create a project of non-local resources, and then mark them as local lazily (and obtain their stat info) only as the contents are requested? I.e., I buy your original argument for lazy discovery, but how eagerly should we "conspire" to make them available?
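To make the listing cost mentioned above concrete, here is a small sketch that walks a directory tree with `java.io.File.list()` and counts how many listing calls and how many stat-like calls the walk needs. It is not the code Eclipse uses; `TreeWalkCost` and its fields are hypothetical, and `isDirectory()` stands in for a single stat per entry.

```java
import java.io.File;

// Hypothetical sketch: count the file system calls needed to discover a tree.
class TreeWalkCost {
    int listCalls; // one per directory visited
    int statCalls; // one per entry found (to learn whether it is a directory)

    void walk(File dir) {
        String[] names = dir.list(); // names only, no attributes
        listCalls++;
        if (names == null) {
            return; // not a directory, or an I/O error occurred
        }
        for (String name : names) {
            File child = new File(dir, name);
            statCalls++;
            // Note: symbolic link cycles are not handled in this sketch.
            if (child.isDirectory()) {
                walk(child);
            }
        }
    }

    public static void main(String[] args) {
        File root = new File(args.length > 0 ? args[0] : ".");
        TreeWalkCost cost = new TreeWalkCost();
        long start = System.currentTimeMillis();
        cost.walk(root);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(cost.listCalls + " list calls, "
                + cost.statCalls + " stat calls, " + elapsed + " ms");
    }
}
```

Even with every other stat removed, discovering the names alone costs one round trip per folder, plus whatever is needed to tell files from folders, which is why a slow file system still shows a noticeable floor.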
Created attachment 15049 [details] Patch to org.eclipse.core.resources plugin Attached is a replacement for resources.jar contained in eclipse/plugins/org.eclipse.core.resources for Eclipse 3.0.1. It moves the bulk of the file system work for opening a project into a background job. This is not intended as a workable solution, since the background job will prevent you from doing any real work while it runs. However, it will demonstrate the fastest possible response time for creating very large projects. To turn this into a workable solution, the background job would have to politely interrupt its work whenever the user wants to make changes, and then resume its activity during idle time.
I've tried your replacement resources.jar, and it seems to be a big move forward. The project is immediately created (although not populated) and I can move about and do other things in Eclipse while the file activity is taking place. I did have an issue while doing independent CVS activity (trying to get the HEAD listing for a repository, for example): those background CVS jobs did not finish until after the background job discovering all of the files was finished. Does that fit your 'no real work till completed' expectation, or is it some other effect? Shall we move forward to the next step with this?
Expanding HEAD shouldn't conflict with the refresh going on, but it turns out it does. The CVS team has entered a separate bug for this. As for next steps, I'm currently in the middle of several other things - I was just hoping to gather some early data that could influence the shape of the solution. This bug, and related issues with large/remote workspaces, is certainly on the radar though.
Could you either attach the code for the changes you made in the resources.jar you posted, or provide a reference to where it is in CVS? Also, could you provide the bug number for the CVS bug you mentioned in bug 74392 comment 14?
Created attachment 15071 [details] Patch to org.eclipse.core.resources Attached is a patch to the org.eclipse.core.resources plugin version 3.0.1, corresponding to the JAR file in attachment 15049 [details]. The bug number for the CVS defect is bug 75918.
Created attachment 15370 [details] Bug fix requirements Required features for the bug fix.
Thanks for the requirements, Andrew, this is useful information. Some comments:
R4: This seems to be an implementation detail rather than a useful requirement. I.e., whether there are several concurrent discovery threads or one thread processing a single queue of discovery requests shouldn't matter from a user's point of view. It is possible that multiple discovery threads might speed things up since one could proceed while another is blocked on the expensive I/O, but one thread per project seems a bit arbitrary...
R6: Depth-based priority isn't interesting. Any discovery of the file system tree must happen top-down regardless (can't discover a file until its parent is known).
An important detail not included here is what happens when an "undiscovered" tree is queried programmatically. For example, if a builder or search engine asks for the children of a directory that has not yet been fully discovered, does it raise the priority, or just return what is already known? Picture a tool that is attempting to "visit" the entire project tree - should it just see the discovered subset or should it block until the entire tree is known?
John, I agree with your comments in bug 74392 comment 18. Are you working on any code for this now? If not, do you mind if I take a whack at it?
I'm actually working on it right now. My first task was to move the discovery into the background (see patch). That was the easy part. I am now working on breaking up the discovery into small chunks so that the user can perform other work concurrently with the discovery going on in the background (complicated by our somewhat monolithic locking model). That is going well - a bit difficult to come up with an approach that doesn't hurt performance for those on normal local file systems, but making progress. My next step after that is to mark "undiscovered" resources so that we know to eagerly discover them on demand (for example when the user expands an undiscovered folder in the navigator view). I'll keep this bug report updated with my progress.
I uncovered bug 77071 last night while testing lazy population of the Navigator view. The underlying TreeViewer will need to handle requests to add elements that have already been expanded for this solution to work.
Created attachment 15482 [details] Patch to org.eclipse.core.resources project The attached patch implements the requirements laid out above. It performs population of new projects in the background, in small chunks of work so that the user can continue browsing and modifying the workspace in the meantime. When an attempt is made to access a directory that has not yet been populated (for example by expanding in the Navigator), it moves that discovery request to the front of the refresh queue. It does not implement the "Pending..." and busy cursor behaviour in the CVS repository view, but since local refresh of a single directory (just discovering one level of children) is never likely to take a long time I'm not sure if this is necessary. I have been experimenting with this implementation on a simulated "slow" file system (I added sleeps to file system calls). It feels pretty good, but it would be nice to get feedback from real users on these systems to see how it behaves. We are approaching a milestone build next week (3.1 M3), so I won't be releasing this until after that point to minimize instability.
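The behaviour described for this patch (refresh in small chunks, with on-demand requests jumping to the front of the queue) can be sketched roughly as follows. This is not the code from the attachment: `ChunkedRefreshQueue` and its methods are hypothetical, and the file system is simulated by an in-memory map from each directory to its child directories.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of background population in small chunks.
class ChunkedRefreshQueue {
    private final Deque<String> requests = new ArrayDeque<>();
    private final Map<String, List<String>> fakeFileSystem; // dir -> child dirs
    private final List<String> discovered = new ArrayList<>();

    ChunkedRefreshQueue(Map<String, List<String>> fakeFileSystem, String root) {
        this.fakeFileSystem = fakeFileSystem;
        requests.addLast(root);
    }

    // Called when something touches a directory that is not yet populated,
    // for example the user expanding it in a tree view.
    void moveToFront(String dir) {
        if (discovered.contains(dir)) {
            return; // already populated, nothing to do
        }
        requests.remove(dir); // no-op if it was not queued yet
        requests.addFirst(dir);
    }

    // Populate at most chunkSize directories, then return so that other
    // work can run. Returns true while requests remain.
    boolean runChunk(int chunkSize) {
        for (int i = 0; i < chunkSize && !requests.isEmpty(); i++) {
            String dir = requests.removeFirst();
            discovered.add(dir);
            List<String> children =
                    fakeFileSystem.getOrDefault(dir, Collections.<String>emptyList());
            for (String child : children) {
                requests.addLast(child);
            }
        }
        return !requests.isEmpty();
    }

    // Directories populated so far, in the order they were processed.
    List<String> discovered() {
        return discovered;
    }
}
```

A background job would call `runChunk` repeatedly, yielding between calls; the chunk size is the knob that trades refresh throughput against how quickly other operations get a turn.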
Note that the latest patch from bug 77071 will also be required to test the patch described in comment #22.
More data:
Client machine: Eclipse V3.0.1, Build id: 200409161125, Solaris 5.8, ClearCase v2002.05.00-13 using a ClearCase dynamic view
Server machine: Solaris 5.6, ClearCase v2003.06.10+
44,031 files, 3,127 directories
Visit and Stat: 387,897.0 ms (1 iteration), 383,402.0 ms (1 iteration), 379,049.0 ms (1 iteration)
Standard Visitor: 1443.8 ms (10 iterations), 1438.8 ms (10 iterations)
Platform core changes have been released to HEAD. This still depends on two UI changes - I have pinged the UI team to consider these for next week. I will leave this bug open until everything is released and we get feedback on its effectiveness.
All UI changes for this feature have been released. The I20041123 build should be suitable for testing.
Marking fixed.
Running Eclipse 3.1M4 with CDT build 3.0.0-I200502011142 on Linux I am still getting very slow project creation times (over 10 minutes). Is there an option that needs to be enabled? Or might there be a CDT interaction?
There is almost certainly a CDT interaction. Background project loading doesn't happen "for free" because it would have caused breaking changes to existing clients. Code that creates projects (such as the CDT project creation wizards) needs to set a flag to indicate background refresh is needed. Can you enter a bug against CDT for this? You can refer them to the following document for details: http://dev.eclipse.org/viewcvs/index.cgi/%7Echeckout%7E/platform-core-home/documents/3.1/background_refresh.html
This doesn't seem to be quite right... when I attempt to create a project of type 'Simple' (CDT is still not working) the Navigator view does not seem to be updating as the Refresh is progressing. The result is a little better than before, because the whole UI isn't locked, but the project still cannot be interacted with prior to a full refresh being performed...
Actually, this seems to be an issue with getting the top level dir populated... that takes some time... but once the top level is populated browsing works as expected (i.e., populate subdirs on demand).
An additional issue playing with Refresh in a Simple project: you cannot open a file to edit it while a refresh is taking place. Refresh blocks file open. Since the idea was to allow users to navigate to a file and open it while the rest of refresh took place in the background, this is problematic.
Are you seeing a regression from the behaviour in previous milestones, or is this your first time trying it?
This is my first time trying it (I just noticed project type 'Simple' to try it with).
More data... Attempting to create a 'Simple' Project using Eclipse 3.1M7 (Build id: I20050513-1415). Background refresh kicked in, and so the UI remained usable, however:
1) On one attempt it took 4-6 minutes for the 'Navigator' view to reflect anything in the project except the .project file. After 4-6 minutes the top level files and directories appeared.
2) Attempts to open top level directories in the Navigator view locked the UI for several seconds (presumably as those directory trees were explored).
3) On a second attempt to create a project (the old project and its .project file having been deleted) the .project file and two directories became visible immediately in the 'Navigator' view. Neither directory could be opened or expanded at first. Subsequently the 'Navigator' view did update with more directories and the original two displayed could be opened, but still subject to the issue mentioned in 2).
It would seem that updating resources as we go is not working as desired. Please note, pulling the top level listing (ls -l) at the command line takes about 14 seconds. Should I open new bugs for resolving 1, 2, and 3 above or continue to work out of this one?
Poking around RefreshJob.runInWorkspace(IProgressMonitor), it seems the problem is that it starts by pulling 2 levels down for each resource, and doesn't throttle back until it's made at least 100 such attempts. This is fine for projects like Java that are deep, but for projects which are wide (or both deep and wide) it leads to very slow behavior in high I/O latency situations. Changing the default depth to 1 seems to greatly improve behavior (while not noticeably degrading behavior for normal cases).
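A back-of-the-envelope model shows why the initial depth matters more for wide trees than for deep ones. The sketch assumes, purely for illustration, a uniform tree in which every directory has `fanOut` children, and that refreshing a directory to depth d touches every entry down to that depth. Neither assumption comes from RefreshJob itself, and `RefreshDepthCost` is a hypothetical name.

```java
// Hypothetical model: entries touched by one refresh request in a uniform tree.
class RefreshDepthCost {
    // fanOut + fanOut^2 + ... + fanOut^depth
    static long entriesTouched(int fanOut, int depth) {
        long total = 0;
        long level = 1;
        for (int d = 1; d <= depth; d++) {
            level *= fanOut; // number of entries at this level
            total += level;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] fanOuts = {5, 50, 500};
        for (int fanOut : fanOuts) {
            System.out.println("fan-out " + fanOut
                    + ": depth 1 touches " + entriesTouched(fanOut, 1)
                    + ", depth 2 touches " + entriesTouched(fanOut, 2));
        }
    }
}
```

Under this model a narrow directory barely notices the extra level, while a directory with hundreds of children goes from hundreds of entries to hundreds of thousands, which at several milliseconds per entry is the difference between seconds and many minutes before the first results appear.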
This new information is a bit late to do much for 3.1 - our first release candidate build is tomorrow. I agree the problem seems to be that I was testing on a deep file system hierarchy that is typically found in a Java project. If you have large number of files near the root then it won't quickly adapt by decreasing the refresh depth. Even with an initial depth of 2 this feature caused regression of refresh performance for the "normal" use with a local file system, so I think the solution isn't as simple as making the initial depth 1. I'm also curious about why this has changed since you tried it in comment #13 - I suspect there is more complex interaction with other things going on in the background, such as the encoding change job (bug 90927).
OK... any objection to reopening this bug with the understanding it will be addressed post 3.1? Also, as I expect the 3.1 release to be... distracting ;)... when should I come back to work on trying to improve the fix for 3.2?
I suggest entering a new enhancement request for further improvements in this area in the next release. I suggest this because bugs with old numbers tend to get lost when we query for active bugs, and also because some work was done for 3.1 M4 as this bug indicates and it's useful for us to have a record of that. 3.1 will be released in June, and work on the next release will start soon after that.