74392 – Slow project creation due to stating many files

Status:	RESOLVED FIXED

Alias:	None

Product:	Platform
Classification:	Eclipse Project
Component:	Resources
Version:	3.0
Hardware:	All All

Importance:	P2 enhancement
Target Milestone:	3.1 M4
Assignee:	John Arthorne
QA Contact:

URL:
Whiteboard:
Keywords:	performance

Depends on:	77071 78532
Blocks:

Reported:	2004-09-20 22:14 EDT by Ed Warnicke
Modified:	2022-02-04 04:19 EST
CC List:	8 users

See Also:	544975 578487

Attachments
Performance benchmark plugin (36.67 KB, application/octet-stream) 2004-10-01 12:37 EDT, John Arthorne
Patch to org.eclipse.core.resources plugin (592.50 KB, application/octet-stream) 2004-10-07 15:19 EDT, John Arthorne
Patch to org.eclipse.core.resources (3.08 KB, patch) 2004-10-08 13:04 EDT, John Arthorne
Bug fix requirements (1.16 KB, text/plain) 2004-10-25 15:19 EDT, Andrew Kinard
Patch to org.eclipse.core.resources project (18.50 KB, patch) 2004-10-29 11:42 EDT, John Arthorne

Description Ed Warnicke

2004-09-20 22:14:31 EDT

I am seeing very slow project creation because it seems that all files must 
be statted and identified prior to interacting with the project.  Similarly 
starting up Eclipse is slowed down as all files must be statted during startup. 
 
This problem is made even worse over NFS or MVFS. 
 
I propose 'lazy' resource discovery.  When a project is created or Eclipse is
started, just put a placeholder in place for the project.  Put the process of
getting directory listings in a background thread, queued by directory
depth (i.e., list and stat the top-level directories of projects before
beginning the descent into subdirectories).  When the user opens a project or a
directory in a project, that project's or directory's listing gains priority.
If the user has drilled down into a listing that is not yet filled in, show the
same "Pending..." placeholder that the CVS folks use to let the user know
things are coming.

The goal is to give the illusion that a project (or Eclipse as a whole) is
immediately available, and then conspire to make sure that whenever a
developer is looking for information it is already present.
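
The queueing idea in this proposal can be pictured with a small sketch. It is only an illustration of the description above, not Eclipse code: the class DiscoveryQueue, its methods, and the use of -1 as the "user asked for this" priority are all hypothetical.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Hypothetical sketch of a depth-ordered discovery queue; not Eclipse code. */
final class DiscoveryQueue {
    /** One pending directory listing. */
    static final class Request {
        final String path;
        final int priority; // lower values are served first

        Request(String path, int priority) {
            this.path = path;
            this.priority = priority;
        }
    }

    private final PriorityQueue<Request> queue =
            new PriorityQueue<>(Comparator.comparingInt((Request r) -> r.priority));

    /** Depth of a '/'-separated project-relative path, e.g. "a/b" has depth 2. */
    static int depthOf(String path) {
        return path.isEmpty() ? 0 : path.split("/").length;
    }

    /** Background discovery: shallow directories are listed before deep ones. */
    void enqueue(String path) {
        queue.add(new Request(path, depthOf(path)));
    }

    /** The user opened this directory, so it jumps ahead of all background work. */
    void enqueueUrgent(String path) {
        queue.add(new Request(path, -1)); // assumption: -1 sorts before any real depth
    }

    /** Next directory to list, or null when nothing is pending. */
    String next() {
        Request r = queue.poll();
        return r == null ? null : r.path;
    }
}
```

A background thread would call next() in a loop and list each returned directory. Requests of equal priority come out in no particular order, and a directory that is both queued and later opened by the user would be listed twice unless duplicates are filtered.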

Comment 1 John Arthorne

2004-10-01 12:36:35 EDT

Ed, I'd like to collect some data before we explore solutions in more detail. I
am going to attach a benchmarking plugin that is used to test the performance of
various operations in different end-user configurations. I want to get an idea
of the performance cost of stat in your configuration. 

The plugin is installed by unzipping into a new directory, and then linking to
your product using Help > Software Updates > Manage Configuration > Add an
Extension Location. This allows you to install the new plugin without having to
restart.

Once installed, it adds a "Benchmark" sub-menu to the resource navigator's
context menu. The two benchmarks I want you to run are not actually context
sensitive (i.e., it doesn't matter what the current selection is in the
navigator view). Load up your large workspace, and run both the "Standard
Visitor" and "Visit and Stat" benchmarks. Pick a number of iterations that takes
a reasonably long time (at least a few seconds) in order to minimize noise in
the results. The standard visitor is just a baseline to discover the cost of
iterating over all resources in your workspace (no I/O or expensive operations
involved). The visit and stat performs exactly one stat call on each resource in
the workspace.
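
For readers without the attached plugin, the difference between the two benchmarks can be approximated outside Eclipse. The sketch below is a stand-in, not the plugin's code: it uses plain java.io.File instead of the workspace resource tree, and the class name StatBenchmark and its methods are hypothetical. The tree is collected into memory once, then iterated with and without one stat (File.lastModified()) per entry.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Rough stand-in for the two benchmarks; hypothetical, not the attached plugin. */
final class StatBenchmark {
    /** Collects every file and directory under root into memory (done once, untimed). */
    static List<File> collect(File root) {
        List<File> out = new ArrayList<>();
        File[] children = root.listFiles();
        if (children != null) {
            for (File child : children) {
                out.add(child);
                if (child.isDirectory()) {
                    out.addAll(collect(child));
                }
            }
        }
        return out;
    }

    /** Baseline: iterate the in-memory list without touching the file system. */
    static long visitOnly(List<File> resources) {
        long nameChars = 0;
        for (File f : resources) {
            nameChars += f.getName().length();
        }
        return nameChars;
    }

    /** One stat per resource: lastModified() has to ask the file system. */
    static long visitAndStat(List<File> resources) {
        long newest = 0;
        for (File f : resources) {
            newest = Math.max(newest, f.lastModified());
        }
        return newest;
    }

    /** Average milliseconds per iteration of the given task. */
    static double averageMillis(Runnable task, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1e6 / iterations;
    }

    public static void main(String[] args) {
        List<File> resources = collect(new File(args.length > 0 ? args[0] : "."));
        System.out.println(resources.size() + " resources");
        System.out.println("visit only:     "
                + averageMillis(() -> visitOnly(resources), 100) + " ms");
        System.out.println("visit and stat: "
                + averageMillis(() -> visitAndStat(resources), 1) + " ms");
    }
}
```

Pass the root of the large tree as the first argument. Operating system caching can make repeated stat runs on a local disk look cheaper than the first one, so the numbers are only indicative.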

Comment 2 John Arthorne

2004-10-01 12:37:16 EDT

Created attachment 14951
Performance benchmark plugin

Comment 3 Ed Warnicke

2004-10-01 16:21:51 EDT

I ran the benchmarks requested over a workspace that contained a single large 
project: 
 
Benchmark Standard Visitor (100 iterations): 393.48ms average per iteration 
 
Benchmark Visit and Stat (1 iterations): 501951.0 ms average per iteration 
 
I'll visit and stat over more iterations later... but as you can see for the 
one iteration result... generating them may take a while :)

Comment 4 Ed Warnicke

2004-10-01 16:26:57 EDT

It should be mentioned that the project I ran benchmarks for in bug 74392 
comment 3 is a ClearCase view over MVFS... and the ClearCase client I'm working 
on is 0.2 ms or so roundtrip from the vob server.

Comment 5 John Arthorne

2004-10-01 16:39:52 EDT

Wow! That definitely shows the bottleneck. 8 minutes to stat the project is
incredible. I think a single iteration is enough to show a reliable result on a
sample that size. How many resources in the project?  The "Find Markers"
benchmark prints the number of resources visited, so you can run that on the
project to see the size.

For reference, I just ran on my local workspace (24,378 resources). The basic
visitor took 256 ms, and Visit and Stat took 4909 ms.

Comment 6 Ed Warnicke

2004-10-01 17:14:01 EDT

Find Markers shows 74176 markers in the single project that is the only project 
in the current workspace.

Comment 7 Ed Warnicke

2004-10-01 17:15:18 EDT

Correction to bug 74392 comment 6.  It found 74176 resources, not markers.
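
With the resource count known, the per-stat cost follows from figures already posted in this bug. A small sketch (the class and method names are made up for illustration): 501951 ms over 74176 resources is roughly 6.8 ms per stat on MVFS, against roughly 0.2 ms per stat for the 24,378-resource local workspace in comment 5.

```java
/** Worked arithmetic on the benchmark figures quoted in comments 3, 5 and 7. */
final class StatCost {
    /** Average cost of one stat call in milliseconds. */
    static double perStatMillis(double totalMillis, int resources) {
        return totalMillis / resources;
    }

    public static void main(String[] args) {
        double remote = perStatMillis(501951.0, 74176); // MVFS: Visit and Stat, one iteration
        double local = perStatMillis(4909.0, 24378);    // local disk, from comment 5
        System.out.printf("remote: %.2f ms/stat, local: %.2f ms/stat, ratio: %.1fx%n",
                remote, local, remote / local);
    }
}
```

The totals include the cost of the visit itself, but the Standard Visitor baselines (393.48 ms and 256 ms) are small enough that they do not change the picture.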

Comment 8 Ed Warnicke

2004-10-01 17:24:39 EDT

I just noticed that in Eclipse 3.0.1 startup does not seem to be slowed down by 
what I hypothesised was the 'stating' problem.  I seem to remember it being in 
Eclipse 3.0.  Am I crazy or did something change?

Comment 9 John Arthorne

2004-10-04 09:27:40 EDT

You actually shouldn't be seeing extensive stating on startup under normal
circumstances. There is a preference (Workbench > Startup and Shutdown > Refresh
Workspace on startup). If this is checked (or the -refresh command line argument
is used), it will stat the entire workspace on startup. Also ensure Preferences
> Workbench > Refresh workspace automatically is turned off. When this second
preference is turned on, a background thread will periodically stat the files in
the file system to see if they have changed.  Finally, when Eclipse detects that
the previous session crashed (didn't shut down using File > Exit), it will do a
"free" refresh (stat) of the entire workspace on startup.

Comment 10 Ed Warnicke

2004-10-05 18:35:10 EDT

I just wall clock timed the process of creating a project over my single large
code base: 9:15 +/- 5 seconds.  So it looks like other than the 8:20 for 
stating we have about 55 seconds being used for other activities related to
building the project.  What can we do to find out what else is taking so long?

Comment 11 John Arthorne

2004-10-06 11:21:34 EDT

Given the speed of your file system, I suspect another big cost is that we call
java.io.File.list() once on each folder to gather the list of child names. In
other words, simply discovering the complete tree of file and directory names
will take some time.  If all stating was eliminated (or moved to the
background), would 55 seconds be an acceptable range or is it still too slow? 
How often are you needing to create such projects?

I have another general question about your work flow that would help us scope a
solution.  Out of the 75,000 files in the project, what proportion of them will
eventually be read locally?  Are builds run locally or remotely?  Do you only
ever need the bytes from a small set of those files?  We currently have the
notion in our API of "non-local" files and folders.  Such resources have a bit
set indicating they do not reside on a local disk. They are never stated, but
trying to obtain their contents without first making them local is not allowed.
Would it make sense in your situation for us to create a project of non-local
resources, and then mark them as local lazily (and obtain their stat info) only
as the contents are requested?  I.e., I buy your original argument for lazy
discovery, but how eagerly should we "conspire" to make them available?
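
The "mark as local lazily" idea can be modelled in a few lines. This is a hypothetical model only: LazyResource and its methods are invented for illustration and are not the resources API. The point is that no stat happens until something actually asks for the resource, and the stat happens at most once.

```java
import java.io.File;

/** Hypothetical model of a resource that is only statted on first real use. */
final class LazyResource {
    private final File file;
    private boolean local;     // the "local" bit: false until stat info has been fetched
    private long lastModified; // cached stat info, valid only when local is true
    private int statCalls;     // counts file system round trips, for illustration

    LazyResource(File file) {
        this.file = file;
    }

    boolean isLocal() {
        return local;
    }

    int statCalls() {
        return statCalls;
    }

    /** Fetches stat info on demand; repeated calls cost nothing. */
    void makeLocal() {
        if (!local) {
            lastModified = file.lastModified(); // the one expensive call
            statCalls++;
            local = true;
        }
    }

    /** Mirrors the rule that a non-local resource must be made local before use. */
    long lastModified() {
        if (!local) {
            throw new IllegalStateException(file + " is not local yet");
        }
        return lastModified;
    }
}
```

Creating 75,000 such objects costs no file system traffic at all; only the ones a build or an editor touches ever pay for a stat.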

Comment 12 John Arthorne

2004-10-07 15:19:53 EDT

Created attachment 15049
Patch to org.eclipse.core.resources plugin

Attached is a replacement for resources.jar contained in
eclipse/plugins/org.eclipse.core.resources for Eclipse 3.0.1.  It moves the
bulk of the file system work for opening a project into a background job.  This
is not intended as a workable solution, since the background job will prevent
you from doing any real work while it runs. However, it will demonstrate the
fastest possible response time for creating very large projects.  To turn this
into a workable solution, the background job would have to politely interrupt
its work whenever the user wants to make changes, and then resume its activity
during idle time.
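
What "politely interrupt and resume" could look like is sketched below. This is not the code in the attached JAR; ChunkedRefresh, runChunk, and the userWaiting callback are hypothetical names, and adding a string to a list stands in for the real list and stat work.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of a background job that yields between small chunks. */
final class ChunkedRefresh {
    private final Deque<String> pending = new ArrayDeque<>();
    final List<String> refreshed = new ArrayList<>();

    void add(String directory) {
        pending.addLast(directory);
    }

    /**
     * Refreshes at most chunkSize directories, stopping early if the user is
     * waiting. Returns true when work remains and the job should be
     * rescheduled for idle time.
     */
    boolean runChunk(int chunkSize, BooleanSupplier userWaiting) {
        for (int i = 0; i < chunkSize && !pending.isEmpty(); i++) {
            if (userWaiting.getAsBoolean()) {
                break; // step aside; nothing is lost, the queue keeps its place
            }
            refreshed.add(pending.pollFirst()); // stands in for the real list/stat work
        }
        return !pending.isEmpty();
    }
}
```

The smaller the chunk, the sooner a waiting user gets through, at the price of more scheduling overhead per directory.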

Comment 13 Ed Warnicke

2004-10-07 18:14:25 EDT

I've tried your replacement resources.jar, and it seems to be a big move forward.
 The project is immediately created (although not populated) and I can move
about and do other things in Eclipse while the file activity is taking place.  

I did have an issue while doing independent CVS activity (trying to get the HEAD
listing for a repository for example) with those background CVS jobs not
finishing until after the background job discovering all of the files was
finished.  Does that fit your 'no real work till completed' expectation, or is
it some other effect?

Shall we move forward to the next step with this?

Comment 14 John Arthorne

2004-10-08 12:13:10 EDT

Expanding HEAD shouldn't conflict with the refresh going on, but it turns out it
does. The CVS team has entered a separate bug for this.

As for next steps, I'm currently in the middle of several other things - I was
just hoping to gather some early data that could influence the shape of the
solution. This bug, and related issues with large/remote workspaces, is
certainly on the radar though.

Comment 15 Ed Warnicke

2004-10-08 12:43:23 EDT

Could you either attach the code for the changes you made in the resources.jar
you posted, or provide reference to where it is in CVS?

Also, could you provide the bug number for the CVS bug you mentioned in bug
74392 comment 14 ?

Comment 16 John Arthorne

2004-10-08 13:04:00 EDT

Created attachment 15071
Patch to org.eclipse.core.resources

Attached is a patch to the org.eclipse.core.resources plugin version 3.0.1,
corresponding to the JAR file in attachment 15049.  The bug number for the CVS
defect is bug 75918.

Comment 17 Andrew Kinard

2004-10-25 15:19:06 EDT

Created attachment 15370
Bug fix requirements

Required features for the bug fix.

Comment 18 John Arthorne

2004-10-25 17:22:27 EDT

Thanks for the requirements, Andrew, this is useful information. Some comments:

R4: This seems to be an implementation detail rather than a useful requirement.
I.e., whether there are several concurrent discovery threads or one thread
processing a single queue of discovery requests shouldn't matter from a user's
point of view. It is possible that multiple discovery threads might speed things
up since one could proceed while another is blocked on the expensive I/O, but
one thread per project seems a bit arbitrary...

R6: Depth-based priority isn't interesting.  Any discovery of the file system
tree must happen top-down regardless (can't discover a file until its parent is
known).

An important detail not included here is what happens when an "undiscovered"
tree is queried programmatically. For example, if a builder or search engine
asks for the children of a directory that has not yet been fully discovered,
does it raise the priority, or just return what is already known? Picture a tool
that is attempting to "visit" the entire project tree - should it just see the
discovered subset or should it block until the entire tree is known?

Comment 19 Andrew Kinard

2004-10-26 11:52:37 EDT

John, I agree with your comments in bug 74392 comment 18. Are you working on any
code for this now?  If not, do you mind if I take a whack at it?

Comment 20 John Arthorne

2004-10-26 13:53:42 EDT

I'm actually working on it right now. My first task was to move the discovery
into the background (see patch). That was the easy part. I am now working on
breaking up the discovery into small chunks so that the user can perform other
work concurrently with the discovery going on in the background (complicated by
our somewhat monolithic locking model). That is going well - a bit difficult to
come up with an approach that doesn't hurt performance for those on normal local
file systems, but making progress. My next step after that is to mark
"undiscovered" resources so that we know to eagerly discover them on demand (for
example, when the user expands an undiscovered folder in the navigator view).
I'll keep this bug report updated with my progress.

Comment 21 John Arthorne

2004-10-27 10:32:38 EDT

I uncovered bug 77071 last night while testing lazy population of the Navigator
view. The underlying TreeViewer will need to handle requests to add elements
that have already been expanded for this solution to work.

Comment 22 John Arthorne

2004-10-29 11:42:08 EDT

Created attachment 15482
Patch to org.eclipse.core.resources project

The attached patch implements the requirements laid out above. It performs
population of new projects in the background, in small chunks of work so that
the user can continue browsing and modifying the workspace in the meantime. 
When an attempt is made to access a directory that has not yet been populated
(for example by expanding in the Navigator), it moves that discovery request to
the front of the refresh queue. It does not implement the "Pending..." and busy
cursor behaviour in the CVS repository view, but since local refresh of a
single directory (just discovering one level of children) is never likely to
take a long  time I'm not sure if this is necessary.
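
The "move to the front of the refresh queue" behaviour can be pictured as follows. This is a sketch of the idea only, not the code in the patch; RefreshQueue and its method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical refresh queue in which an accessed directory jumps the line. */
final class RefreshQueue {
    private final Deque<String> requests = new ArrayDeque<>();

    /** Normal background request: wait at the back, ignoring duplicates. */
    synchronized void addLast(String directory) {
        if (!requests.contains(directory)) {
            requests.addLast(directory);
        }
    }

    /** An unpopulated directory was accessed: move (or add) it to the front. */
    synchronized void promote(String directory) {
        requests.remove(directory); // no-op if it was not queued yet
        requests.addFirst(directory);
    }

    /** Next directory to refresh, or null when the queue is empty. */
    synchronized String next() {
        return requests.pollFirst();
    }
}
```

The linear scans in contains and remove are fine for a sketch; a queue holding tens of thousands of directories would want a set alongside the deque.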

I have been experimenting with this implementation on a simulated "slow" file
system (I added sleeps to file system calls). It feels pretty good, but it
would be nice to get feedback from real users on these systems to see how it
behaves.
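
Anyone who wants to reproduce this kind of experiment without a real MVFS or NFS mount could wrap file system calls in an artificial delay. The sketch below is an assumption about how such a simulation might look, not the code used here; SlowFileSystem and its methods are hypothetical.

```java
import java.io.File;

/** Hypothetical wrapper that makes every file system call artificially slow. */
final class SlowFileSystem {
    private final long delayMillis;
    private int calls; // number of simulated round trips so far

    SlowFileSystem(long delayMillis) {
        this.delayMillis = delayMillis;
    }

    int calls() {
        return calls;
    }

    /** Simulates one network round trip. */
    private void pause() {
        calls++;
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
    }

    /** Child names of a directory, or an empty array if it cannot be listed. */
    String[] list(File dir) {
        pause();
        String[] names = dir.list();
        return names == null ? new String[0] : names;
    }

    /** Slow equivalent of a single stat. */
    long lastModified(File file) {
        pause();
        return file.lastModified();
    }
}
```

A delay of about 7 ms per call would roughly match the per-stat cost implied by the 501951 ms result over 74176 resources reported earlier in this bug.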

We are approaching a milestone build next week (3.1 M3), so I won't be
releasing this until after that point to minimize instability.

Comment 23 John Arthorne

2004-10-29 11:42:54 EDT

Note that the latest patch from bug 77071 will also be required to test the
patch described in comment #22.

Comment 24 John Arthorne

2004-11-12 10:36:05 EST

More data:

Client machine:
Eclipse V3.0.1
Build id: 200409161125
Solaris 5.8
ClearCase v2002.05.00-13
using a ClearCase dynamic view

Server machine:
Solaris 5.6
ClearCase v2003.06.10+

44,031 files
3,127 directories

Visit and Stat:
387,897.0ms (1 iteration)
383,402.0ms (1 iteration)
379,049.0ms (1 iteration)

Standard Visitor:
1443.8ms (10 iterations)
1438.8ms (10 iterations)

Comment 25 John Arthorne

2004-11-12 15:25:19 EST

Platform core changes have been released to HEAD.  This still depends on two UI
changes - I have pinged the UI team to consider these for next week. I will
leave this bug open until everything is released and we get feedback on its
effectiveness.

Comment 26 John Arthorne

2004-11-19 16:53:25 EST

All UI changes for this feature have been released.  The I20041123 build should
be suitable for testing.

Comment 27 John Arthorne

2004-12-02 16:59:57 EST

Marking fixed.

Comment 28 Ed Warnicke

2005-02-07 11:07:09 EST

Running Eclipse 3.1M4 with CDT build 3.0.0-I200502011142 on Linux I am still 
getting very slow project creation times (over 10 minutes).  Is there an option
that needs to be enabled?   Or might there be a CDT interaction?

Comment 29 John Arthorne

2005-02-07 11:45:55 EST

There is almost certainly a CDT interaction.  Background project loading doesn't
happen "for free" because it would have caused breaking changes to existing
clients.  Code that creates projects (such as the CDT project creation wizards)
needs to set a flag to indicate background refresh is needed.  Can you enter a
bug against CDT for this?  You can refer them to the following document for details:

http://dev.eclipse.org/viewcvs/index.cgi/%7Echeckout%7E/platform-core-home/documents/3.1/background_refresh.html

Comment 30 Ed Warnicke

2005-05-06 08:51:02 EDT

This doesn't seem to be quite right... when I attempt to create a project of
type 'Simple' (CDT is still not working) the Navigator view does not seem to be
updating as the Refresh is progressing.  The result is a little better than
before, because the whole UI isn't locked, but the project still cannot be
interacted with prior to a full refresh being performed...

Comment 31 Ed Warnicke

2005-05-06 08:59:15 EDT

Actually, this seems to be an issue with getting the top level dir populated...
that takes some time... but once the top level is populated browsing works as
expected (ie, populate subdirs on demand).

Comment 32 Ed Warnicke

2005-05-06 10:03:45 EDT

An additional issue playing with Refresh in a Simple project:

you cannot open a file to edit it while a refresh is taking place.  Refresh
blocks file open.

Since the idea was to allow users to navigate to a file and open it while the
rest of refresh took place in the background, this is problematic.

Comment 33 John Arthorne

2005-05-06 10:28:34 EDT

Are you seeing a regression from the behaviour in previous milestones, or is
this your first time trying it?

Comment 34 Ed Warnicke

2005-05-06 15:32:02 EDT

This is my first time trying it (I just noticed project type 'Simple' to try it
with).

Comment 35 Ed Warnicke

2005-05-23 15:06:17 EDT

More data...

Attempting to create a 'Simple' Project using Eclipse 3.1M7 (Build id:
I20050513-1415).

Background refresh kicked in, and so the UI remained usable, however: 

1) On one attempt it took 4-6 minutes for the 'Navigator' view to reflect
anything in the project except the .project file.  After 4-6 minutes the top
level files and directories appeared.

2)  Attempts to open top level directories in the Navigator view locked the UI
for several seconds (presumably as those directory trees were explored).

3)  On a second attempt to create a project (the old project and its .project
file having been deleted) the .project file and two directories became visible
immediately in the 'Navigator' view.  Neither directory could be opened or
expanded at first. Subsequently the 'Navigator' view did update with more
directories and the original two displayed could be opened, but still subject to
the issue mentioned in 2).

It would seem that the updating of resources as we go is in some fashion not
working in a desirable way.  Please note, the equivalent of pulling the top
level listing (ls -l) at the command line is taking about 14 seconds.

Should I open new bugs for resolving 1, 2, and 3 above or continue to work out
of this one?

Comment 36 Ed Warnicke

2005-05-23 19:24:01 EDT

Poking around RefreshJob.runInWorkspace(IProgressMonitor) it seems the 
problem is that it starts by pulling 2 levels down for each resource,
and doesn't throttle back until it's made at least 100 such attempts.
This is fine for projects like Java that are deep, but for projects
which are wide (or both deep and wide) it leads to hugely slow 
behavior in high I/O latency situations.  Changing the default depth to 1 seems
to greatly improve behavior (while not noticeably degrading behavior for normal
cases).
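
To make the trade-off concrete, here is a sketch of a depth policy that reacts to every measurement instead of waiting for many attempts. It is not the RefreshJob logic; AdaptiveDepth, its threshold, and the factor of 4 are all hypothetical choices for illustration.

```java
/** Hypothetical depth policy; names and thresholds are illustrative only. */
final class AdaptiveDepth {
    private final long slowMillis; // a refresh slower than this is too expensive
    private int depth;             // depth to use for the next refresh request

    AdaptiveDepth(int initialDepth, long slowMillis) {
        this.depth = initialDepth;
        this.slowMillis = slowMillis;
    }

    int depth() {
        return depth;
    }

    /** Feeds back how long the last refresh took and adjusts the depth at once. */
    void record(long elapsedMillis) {
        if (elapsedMillis > slowMillis && depth > 1) {
            depth--; // slow file system or wide directory: go shallower
        } else if (elapsedMillis < slowMillis / 4) {
            depth++; // cheap refresh: do more work per request
        }
    }
}
```

On a wide, high-latency project the first slow refresh would drop the depth from 2 to 1, while a fast local file system would drift toward deeper refreshes. Whether that avoids a slowdown in the normal local case would need measuring.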

Comment 37 John Arthorne

2005-05-26 13:53:19 EDT

This new information is a bit late to do much for 3.1 - our first release
candidate build is tomorrow.  I agree the problem seems to be that I was testing
on a deep file system hierarchy that is typically found in a Java project.  If
you have a large number of files near the root then it won't quickly adapt by
decreasing the refresh depth.  Even with an initial depth of 2 this feature
caused regression of refresh performance for the "normal" use with a local file
system, so I think the solution isn't as simple as making the initial depth 1.
I'm also curious about why this has changed since you tried it in comment #13 -
I suspect there is more complex interaction with other things going on in the
background, such as the encoding change job (bug 90927).

Comment 38 Ed Warnicke

2005-05-26 17:00:08 EDT

OK... any objection to reopening this bug with the understanding it will be
addressed post 3.1?  Also, as I expect the 3.1 release to be... distracting
;)... when should I come back to work on trying to improve the fix for 3.2?

Comment 39 John Arthorne

2005-05-26 17:22:12 EDT

I suggest entering a new enhancement request for further improvements in this
area in the next release.  I suggest this because bugs with old numbers tend to
get lost when we query for active bugs, and also because some work was done for
3.1 M4 as this bug indicates and it's useful for us to have a record of that. 
3.1 will be released in June, and work on the next release will start soon after
that.