55182 – Shutdown takes more than a minute

Bug 55182 - Shutdown takes more than a minute

Summary: Shutdown takes more than a minute

Status:	RESOLVED FIXED

Alias:	None

Product:	Platform
Classification:	Eclipse Project
Component:	Resources (show other bugs)
Version:	3.0
Hardware:	PC Windows XP

Importance:	P3 major (vote)
Target Milestone:	3.1
Assignee:	Rafael Chaves
QA Contact:

URL:
Whiteboard:
Keywords:	performance

Depends on:	66371 66369 66370
Blocks:
	Show dependency tree

Reported:	2004-03-17 17:54 EST by Dirk Baeumer
Modified:	2009-10-06 10:06 EDT (History)
CC List:	8 users (show)

See Also:

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Dirk Baeumer

2004-03-17 17:54:53 EST

I20040316

This is not a new behavior I am seeing, but it seems to get worse. Shutting 
down Eclipse on mz laptop (P4 2.4 GHZ, 1GB RAM) usually takes more than a 
minute. A typical stack trace during shutdown is:

"main" prio=5 tid=0x00035c70 nid=0xd98 runnable [7f000..7fc3c]
        at 
org.eclipse.core.internal.localstore.CoreFileSystemLibrary.internalSetReadOnlyW
(Native Method)
        at 
org.eclipse.core.internal.localstore.CoreFileSystemLibrary.setReadOnly
(CoreFileSystemLibrary.java:144)
        at org.eclipse.core.internal.localstore.BlobStore.delete
(BlobStore.java:89)
        at org.eclipse.core.internal.localstore.BlobStore.deleteAllExcept
(BlobStore.java:170)
        at org.eclipse.core.internal.localstore.HistoryStore.removeGarbage
(HistoryStore.java:435)
        at org.eclipse.core.internal.localstore.HistoryStore.clean
(HistoryStore.java:199)
        at org.eclipse.core.internal.resources.SaveManager.save
(SaveManager.java:1292)
        at org.eclipse.core.internal.resources.Workspace.save
(Workspace.java:1592)
        at org.eclipse.ui.internal.ide.IDEWorkbenchAdvisor$7.run
(IDEWorkbenchAdvisor.java:543)
        at org.eclipse.jface.operation.ModalContext.runInCurrentThread
(ModalContext.java:302)
        at org.eclipse.jface.operation.ModalContext.run(ModalContext.java:252)
        at org.eclipse.jface.dialogs.ProgressMonitorDialog.run
(ProgressMonitorDialog.java:400)
        at 
org.eclipse.ui.internal.ide.IDEWorkbenchAdvisor.disconnectFromWorkspace
(IDEWorkbenchAdvisor.java:550)
        at org.eclipse.ui.internal.ide.IDEWorkbenchAdvisor.postShutdown
(IDEWorkbenchAdvisor.java:256)
        at org.eclipse.ui.internal.Workbench.shutdown(Workbench.java:1716)
        at org.eclipse.ui.internal.Workbench.runUI(Workbench.java:1423)
        at org.eclipse.ui.internal.Workbench.createAndRunWorkbench
(Workbench.java:257)
        at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:139)
        at org.eclipse.ui.internal.ide.IDEApplication.run
(IDEApplication.java:90)
        at org.eclipse.core.internal.runtime.PlatformActivator$1.run
(PlatformActivator.java:260)
        at org.eclipse.core.runtime.adaptor.EclipseStarter.run
(EclipseStarter.java:173)
        at org.eclipse.core.runtime.adaptor.EclipseStarter.run
(EclipseStarter.java:106)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke
(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke
(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:324)
        at org.eclipse.core.launcher.Main.basicRun(Main.java:305)
        at org.eclipse.core.launcher.Main.run(Main.java:729)
        at org.eclipse.core.launcher.Main.main(Main.java:713)

Comment 1 Dirk Baeumer

2004-03-17 17:56:53 EST

This time it took 4 minutes !!

Comment 2 DJ Houghton

2004-03-17 18:01:27 EST

Hmm...not sure what we can do here. We enforce some history store policies on
shutdown rather than taking the hit during the actual API calls.

For instance, when calling IFile.setContents, we add the old contents to the
history store and then set the new contents. We don't "trim the number of files
to the newest 50" (or whatever the policy may be) until we shut down the
resources plug-in.

How confident are we in the new Jobs API and the thread safety of the history
store? Perhaps we could create a job to enforce the policy every X min or X
operations or every time we do a snapshot or...

Comment 3 John Arthorne

2004-03-18 10:00:10 EST

Confident in Jobs API: yes
Confident in thread safety of history store: NO!

The amount of time taken here should be proportional to the number of changes
that occurred in that workspace.  I suspect you're doing something like running
large test suites or leaving the workspace open for long periods of time.

Yes, we could certainly improve this with a background job.  However, since this
behaviour hasn't changed since 1.0 I'm not excited about doing anything at this
stage of 3.0 development.

Comment 4 Dirk Baeumer

2004-03-19 06:31:18 EST

But I have the impression that this got worse the last couple of months. Or 
may be its because of my huge local history.

Comment 5 Erich Gamma

2004-05-13 09:19:00 EDT

I'm seeing the same problem.
I shutdown my workspace once per day and wait on the order of 1 minute.

Comment 6 Mike Wilson

2004-05-14 10:46:11 EDT

We need a benchmark that tests the performance of the history store (versus R2.1).

Comment 7 DJ Houghton

2004-05-21 15:48:27 EDT

Dirk, I've added a debug option to time how long it takes to apply the history
store polices and remove garbage on shutdown. It will be in the M9 build. It
will also show you other history store things like a message whenever a file
state gets added.
  org.eclipse.core.resources/debug=true
  org.eclipse.core.resources/history=true

If you could run with this option and then annotate this report with the lines
printed out at the end that would be great. For an initial reading it would be
interesting to see the numbers after you have been working for just a few minutes.

Comment 8 Dirk Baeumer

2004-05-25 07:33:11 EDT

Here are the number of starting the workbench and shutting it down again 
immediatelly

Tue May 25 13:16:26 CEST 2004 - [ModalContext] Time to apply history store 
policies: 4500ms.
Tue May 25 13:16:26 CEST 2004 - [ModalContext] Time to remove history store 
garbage (saving 22497 blobs): 250ms.
Time spent in registry parsing: null
Time spent in package admin resolve: null
Time spent resolving the dependency system: null

Comment 9 DJ Houghton

2004-05-25 09:36:56 EDT

And did it still seem to take close to a minute to shutdown completely?

Comment 10 Dirk Baeumer

2004-05-25 13:37:52 EDT

No, it was a little bit faster. 

Shutting down the workspace after working for 4 hours printed the following 
result:

Time to apply history store policies: 1832ms.
Time to remove history store garbage: 36646ms.

Shutdown took ~ 1 minute.

Comment 11 DJ Houghton

2004-05-25 13:54:42 EDT

Do you remember what the rest of the debug message said? (how many blobs it was
saving?)

Comment 12 Dirk Baeumer

2004-05-25 13:59:03 EDT

Unfortunatelly not. I wrote down the numbers but forget to remember the blob 
number. Will do the next time.

Comment 13 John Arthorne

2004-05-25 14:23:16 EDT

Dirk, out of curiosity, what are your local history policy settings (from the
local history preference page)?

Comment 14 Dirk Baeumer

2004-05-26 05:16:00 EDT

Days to keep: 999999
Entries per file: 1000
Maximum file size: 100

Comment 15 Dirk Baeumer

2004-05-26 18:36:35 EDT

New numbers from last shutdown:

Thu May 27 00:18:51 CEST 2004 - [ModalContext] Time to apply history store 
policies: 37030ms.
Thu May 27 00:19:25 CEST 2004 - [ModalContext] Time to remove history store 
garbage (saving 22738 blobs): 34008ms.
Time spent in registry parsing: null
Time spent in package admin resolve: null
Time spent resolving the dependency system: null

Comment 16 Dirk Baeumer

2004-05-28 06:35:54 EDT

New record ;-)

Fri May 28 12:16:57 CEST 2004 - [ModalContext] Time to apply history store 
policies: 22116ms.
Fri May 28 12:19:10 CEST 2004 - [ModalContext] Time to remove history store 
garbage (saving 22838 blobs): 133112ms.

Comment 17 DJ Houghton

2004-05-28 09:37:17 EDT

Hey, that *is* a record! I'll take a look around and see if we can find an
Eclipse hat to send you... ;-)

Comment 18 John Arthorne

2004-05-28 11:51:45 EDT

To add some more data, I've had the same workspace since May 2002, and have the
following history store settings:

Days to keep files: 100 (default is 7 days)
Entires per file: 50 (this is the default)
Maximum file size: 1 MB (default)

My history store is 100 MB on disk, and has about 5,300 blobs. Here are my
shutdown results:

Fri May 28 11:28:26 EDT 2004 - [ModalContext] Time to apply history store polici
es: 1953ms.
Fri May 28 11:28:26 EDT 2004 - [ModalContext] Time to remove history store garba
ge (saving 5301 blobs): 79ms.

So, it looks like the performance is still reasonable with more conservative
history store settings.

Comment 19 DJ Houghton

2004-06-02 11:25:23 EDT

Ok, here's the summary of what's going on. After a little investigation and some
memory jogging I believe I have everything straight. I'll try and provide a
little background too. 

In the #shutdown method we: 
- iterate over all the entries in the history store
- check to see if the entry is too old (apply the "days to keep" policy)
- delete if too old
- add the uuid of the associated blob to the "to keep" set

then we:
- iterate over all of the blobs in the blob store
- if the blob is in the "to keep" set then ignore, otherwise delete it

Yes, this is ugly and non-performant in cases where we have large amounts of
data (like Dirk's case).

Some background:

In the old days we never used to copy local history when we copied resources.
What this meant was we always had a 1 to 1 association between history store
entries and blobs. (when we deleted an entry, then we just went ahead and
deleted the blob because it wasn't possible for anyone else to be referencing it) 

Then in bug 24787 we decided that local history should be copied with the resource.

In order for this ammendment to be peformant, we decided that the blobs would
not be copied, but rather just the references would be copied. This now means
that it is possible to have many history store entries pointing to the same blob
in the blob store. It was recognized at this time that this would allow the
#copy operation to behave in a performant manner but we would be compromising
the timing of the #shutdown method in order to perform the garbage collection
for unreferenced blobs.

Possible resolutions:

I'll break the problems down into 2 sections: 1) applying the "date" policy and
2) gabage collection of unreferenced blobs.

1). All other history store policies are applied when the file is added to the
store. (policies based on file size, number of file states, etc) 

a). We need to decide whether or not we can trim the store based on the "days to
keep" policy when we add the file state. Trade-off is that it would make all
calls to IFile#setContents, IFile#copy, IFile#move, etc slower.

b). Can we apply this policy in a separate thread? We are not confident in the
thread safety of the indexed store.

2). This algorithm is ugly and in cases where the user has a large amount of
local history, it is prominent. Here are some possible ideas. Note that all of
them mean that we wouldn't be doing the garbage collection during shutdown anymore:

a). Copy blobs rather than just increasing the reference count. 
  Pros: don't need to keep track of who references what, can just delete the
blob when we delete the entry, safest change
  Cons: Slower performance of the actual IResource API calls, increases the size
of the local history, 

b). Storing a reference count with the blobs. When we delete an entry then
reduce the reference count and when it hits zero delete the blob.
  Pros: don't have to copy the blobs (size of store and speed of API calls
remain relatively the same)
  Cons: having to store extra information in the blob rather than just the data
itself, how to convert the blobs that already exist?, index store code is
complicated...do we feel comfortable making this change this late in the game?

c). Do the garbage collection and policy application in a separate thread. I
don't like this idea but I have to add it as a possible solution.
  Pros: No added time in the #shutdown method (dirk's 2 minute shutdown would
disappear)
  Cons: We are not confident at all in the thread safety of the indexed store.



The first step would be to create some performance tests to see exactly how much
of a degregation the API calls would be if we copied the blobs rather than just
adding a reference to them.

Anyone else have any thoughts?

Comment 20 DJ Houghton

2004-06-02 16:26:28 EDT

More thoughts from discussions between JA/JW and JA/DJ/McQ:

Concerning the policy checking: If the user has their "expiry date" pref set to
be a really high number, (before Eclipse 1.0?) then don't bother checking their
local history. Taking this one step further, we could store the timestamp of the
oldest history store entry with the workspace info. On shutdown, if the
timestamp is less than the policy setting, then skip the expiration date checking.

The one problem with this is that the garbage collection stage is piggy-backing
off the policy checking stage to collect the set of referenced blobs to keep in
the store.

Some thoughts about keeping a reference count for the blobs:
1). don't share the blobs - copy them when we do an IResource#copy
2). reference count could be in the filename of the blob
3). reference count could be stored in a separate file. To reduce the size of
the file we would only include blobs whose reference count was greater than 1.

Comment 21 John Wiegand

2004-06-07 14:57:55 EDT

DJ: Can you reproduce Dirk's performance problem?  We have two data points: 
Dirk has a long shutdown, John A does not.  If we can reproduce the long 
shutdown case, then you can more easily confirm possible solutions.

I don't understand the "no references" part of the gc: we want to be able to 
recover files that are deleted.  (I can propose optimizations/approaches, but 
the right approach depends on what is really taking the time.)

Comment 22 DJ Houghton

2004-06-08 11:52:43 EDT

I am trying to reproduce Dirk's problem. It is a combination of many states per
file as well as lots of files. 

In my first attempt, I created 25000 file states (50 per file, 500 files, all in
the same project). Shutdown and cleanup time were as follows:

1st start/shutdown:
Tue Jun 08 11:45:30 EDT 2004 - Time to apply history store policies: 10214ms.
Tue Jun 08 11:45:30 EDT 2004 - Total number of history store entries: 25000
Tue Jun 08 11:46:22 EDT 2004 - Time to remove history store garbage (saving
25000 blobs): 52175ms.

2nd start/shutdown: 
Tue Jun 08 11:47:05 EDT 2004 - Time to apply history store policies: 4516ms.
Tue Jun 08 11:47:05 EDT 2004 - Total number of history store entries: 25000
Tue Jun 08 11:47:05 EDT 2004 - Time to remove history store garbage (saving
25000 blobs): 301ms.

I will try different combinations of projects/files/states and see what I can
come up with. I am in the process of writing a test plug-in for browsing local
history, as well as a view (like the Jobs example view) which allows me to
create local history based on user-entered settings.

re: no references
There are 2 parts to local history: the entries and the associated blobs. The
blobs are just the raw data from IFile.getContents(). The history store is
implemented such that many entries might point to the same blob. (for instance
if a resource is copied)

Comment 23 DJ Houghton

2004-06-09 09:56:32 EDT

Ok, I've made a couple of changes which so far look really promising. 

1). Keep track of the oldest entry in the history store and don't try to apply
the "delete old entries" policy if we don't have to.

2). Whenever we remove an entry from the history store, keep track of its
associated blob. Add it to the set of "potentially unreferenced blobs". On
shutdown we go over this set rather than the previously used "blobs to
preserve". (this new set is a lot smaller)

Comment 24 DJ Houghton

2004-06-09 15:47:26 EDT

Created bugs for separate issues.

Comment 25 DJ Houghton

2004-06-09 17:50:29 EDT

Dirk, please post debug info after running with tonight's 8pm build. 

The second problem (garbage collection of unreferenced blobs) has been fixed and
you should see significant improvements. 

Thanks.

Comment 26 DJ Houghton

2004-06-11 10:30:45 EDT

Dirk, were you able to get timings after running for a few hours with the new
build? How are your and Erich's shutdown times with the RC2 candidate?

Comment 27 Dirk Baeumer

2004-06-11 10:40:23 EDT

yes, here are the timing for two shutdowns:

Fri Jun 11 00:38:19 CEST 2004 - [ModalContext] Time to apply history store 
policies: 13958ms.
Fri Jun 11 00:38:19 CEST 2004 - [ModalContext] Total number of history store 
entries: 30323
Fri Jun 11 00:38:19 CEST 2004 - [ModalContext] Time to remove 0 unreferenced 
blobs: 0ms.


Fri Jun 11 11:16:08 CEST 2004 - [ModalContext] Time to apply history store 
policies: 19517ms.
Fri Jun 11 11:16:08 CEST 2004 - [ModalContext] Total number of history store 
entries: 30339
Fri Jun 11 11:16:17 CEST 2004 - [ModalContext] Time to remove 122 unreferenced 
blobs: 8905ms.

Comment 28 DJ Houghton

2004-06-11 10:47:57 EDT

What are you doing with your workspace? You've added 5,000 files to your local
history in the past 2 weeks? I think that you win the award for the most
productive developer on the Eclipse team. :-)

Comment 29 DJ Houghton

2004-06-18 16:52:41 EDT

We've made some significant improvements for 3.0. We will address remaining
issues post-3.0.

Comment 30 DJ Houghton

2004-11-26 09:43:37 EST

Rafael is making changes to the history store. Re-assigning.

(Rafael: please see referenced bug reports)

Comment 31 Rafael Chaves

2005-02-24 12:44:56 EST

Dirk, the performance when applying the local history clean-up policy during
shutdown should have improved considerably, at least our tests show that, for
instance: 

http://download.eclipse.org/eclipse/downloads/drops/N20050222-0010/performance/win/org.eclipse.core.tests.resources.perf.LocalHistoryPerformanceTest.testClearHistory100x4().html

Do you still see shutdown taking a long time?

Comment 32 Dirk Baeumer

2005-02-24 13:19:43 EST

It still takes some time but has improved noticable over 3.0.

Comment 33 Tod Creasey

2005-03-07 11:57:15 EST

Adding my name to the cc list as we are now tracking performance issues more
closely. Please remove the performance keyword if this is not a performance bug.

Comment 34 John Arthorne

2005-04-11 17:38:23 EDT

Marking this as fixed.  Cleaning the history store is 93% faster with our new
implementation (as shown by 3.1 M6 core performance test results).

Comment 35 Markus Keller

2009-04-22 06:02:35 EDT

Bug 257167 is investigating further solutions to this problem.