Bug 149014 - Eclipse takes extremely long time to recover after workspace crash
Summary: Eclipse takes extremely long time to recover after workspace crash
Status: REOPENED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Resources (show other bugs)
Version: 3.2   Edit
Hardware: PC Linux
: P4 major with 1 vote (vote)
Target Milestone: ---   Edit
Assignee: Platform-Resources-Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
: 160014 173870 397269 (view as bug list)
Depends on:
Blocks:
 
Reported: 2006-06-28 10:54 EDT by James Shaw CLA
Modified: 2021-03-02 02:47 EST (History)
10 users (show)

See Also:


Attachments
sample backtrace of main thread (3.59 KB, text/plain)
2006-06-29 04:09 EDT, James Shaw CLA
no flags Details
debug log (1.29 KB, text/plain)
2006-06-30 04:32 EDT, James Shaw CLA
no flags Details
Another debug log (5.34 KB, text/plain)
2006-06-30 10:29 EDT, John Arthorne CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description James Shaw CLA 2006-06-28 10:54:42 EDT
I'm using eclipse to develop a large project (around 80000 files and 18000 directories including .svn files and .class files, 460MB in total.)  The project is stored in a single eclipse project, and I generally have one project per development branch (I only ever have one project open at any one time.)

Sometimes, eclipse will run out of heap space, and takes a very long time to recover when I relaunch it (quickest is ~20 mins, sometimes I have to leave eclipse running overnight.)  Eclipse will take under a minute under normal conditions, and a full build of the project takes around 20 minutes.

I've used jconsole to look at what's happening in the main thread, and eclipse is in a loop doing System.arraycopies, and spending almost 100% of its time doing "Copy" collections.

Some other stats on my machine: many versions of eclipse from 3.1.1 to 3.2R5, CentOS 4.3, P4, 1.5RAM, 650MB max heap for eclipse, Sun 1.5 JVM.
Comment 1 John Arthorne CLA 2006-06-28 11:48:29 EDT
Can you attach a stack trace where the time is going?  Just knowing that it's doing "System.arraycopies" doesn't tell us much.
Comment 2 James Shaw CLA 2006-06-29 04:09:04 EDT
Created attachment 45521 [details]
sample backtrace of main thread
Comment 3 John Arthorne CLA 2006-06-29 11:25:24 EDT
I don't see any System.arraycopies or collection copying in this stack... is this just a random stack during startup or does it look like this consistently? 

Adding the following to debug options yields much more detailed startup time information:

org.eclipse.core.resources/debug=true
org.eclipse.core.resources/restore=true
org.eclipse.core.resources/restore/markers=true
org.eclipse.core.resources/restore/syncinfo=true
org.eclipse.core.resources/restore/tree=true
org.eclipse.core.resources/restore/metainfo=true
org.eclipse.core.resources/restore/snapshots=true
org.eclipse.core.resources/restore/mastertable=true

Instructions for using .options: 
http://wiki.eclipse.org/index.php/FAQ_How_do_I_use_the_platform_debug_tracing_facility%3F
Comment 4 James Shaw CLA 2006-06-29 12:33:05 EDT
It consistently spends its time in this loop and the stack traces look very similar to this.  I believe the System.arraycopy() is invoked by one of the methods in SafeChunkyInputStream (I did see this in the eclipse source when I checked a little while back.)

I'll post another comment when I've run it with debug turned on.
Comment 5 James Shaw CLA 2006-06-30 04:32:19 EDT
Created attachment 45565 [details]
debug log

This is all that the log shows after several minutes after startup
Comment 6 John Arthorne CLA 2006-06-30 10:29:11 EDT
Created attachment 45586 [details]
Another debug log

Thanks for the details.  For comparison, here is a log from crash recovery on my workspace (35,000 files in 6,600 folders, 286 MB total size).  It seems in your case it is restoring marker info (breakpoints, problems, tasks, etc) at the time of the long delay.
Comment 7 John Arthorne CLA 2006-06-30 10:47:20 EDT
If you still have this workspace, can you search for files called ".markers.snap"?  I would be curious to know how many there are, and how large they are.
Comment 8 James Shaw CLA 2006-06-30 11:10:13 EDT
I have 7 .markers.snap files (one per workspace, it would seem).  Six of them are <1KB, but the large project's markers file is 137MB.  It would seem another large project's markers file is missing (I had two development branches checked out).  I presume this is because one project/branch was closed at the time.
Comment 9 John Arthorne CLA 2006-06-30 11:22:02 EDT
Ok, I think that's the source of the problem.  A typical marker snapshot file would be in the 1-5KB range.  137MB is absolutely massive! To give some perspective, this file just records the changed markers since the last save.  Each marker is typically about 400 bytes (assuming you don't have some special kind of markers with extra attributes). This suggests that on the order of 350,000 markers were changed in that workspace since the last time it was saved.

What other plugins do you have isntalled that operate on that large project?  Either somebody is "leaking" markers and not cleaning them up, or they are generating markers with very large attribute strings.
Comment 10 James Shaw CLA 2006-06-30 11:27:20 EDT
Deary me :-)  I have checkstyle, spring and WTP plugins installed.  I have
around 90,000 warnings in the project (which contribute to the markers file, I
guess)

I wonder if the checkstyle plugin might be the culprit, since it will generate
a large proportion of these warnings.
Comment 11 John Arthorne CLA 2006-10-06 10:22:16 EDT
*** Bug 160014 has been marked as a duplicate of this bug. ***
Comment 12 Chris Laffra CLA 2006-10-06 11:26:41 EDT
Can you just erase snap files larger than 1MB, or whatever arbitrary requirement. We are recovering from a crash anyway, so we can live with being told that markers were not recovered, and that we need to rebuild the workspace.

I would like to make a request for inclusion in 3.2.2 maintenance stream.

Large workspaces crash all the time - because people kill the underlying JVM. Eclipse should recover nicely from this.
Comment 13 Raj Mandayam CLA 2006-10-18 11:32:50 EDT
I saw a crash like this happen too. It was in WTP, a large project (3000 jsp and html) so I ran validation on them after it was done I was working on something else and I lost power and so my workbench was terminated, 

Now when I relaunch the product,

in the console it said, a crash has been detected.....


 its been at least 10 min, and eclipse is not launching, if I were to suspend threads this is where it was

Thread [main] (Suspended) 
SafeChunkyInputStream.accumulate(byte[], int, int) line: 40 
SafeChunkyInputStream.buildChunk() line: 62 
SafeChunkyInputStream.read() line: 141 
DataInputStream.readInt() line: 353 
MarkerSnapshotReader.readVersionNumber(DataInputStream) line: 47 
MarkerSnapshotReader.read(DataInputStream) line: 41 
MarkerManager.restoreFromSnap(IResource) line: 468 
MarkerManager.restore(IResource, boolean, IProgressMonitor) line: 434 
SaveManager.restoreMarkers(IResource, boolean, IProgressMonitor) line: 702 
SaveManager.restore(IProgressMonitor) line: 627 
SaveManager.startup(IProgressMonitor) line: 1299 
Workspace.startup(IProgressMonitor) line: 1883 
Workspace.open(IProgressMonitor) line: 1653 
ResourcesPlugin.startup() line: 367 
PluginActivator.start(BundleContext) line: 31 
BundleContextImpl$2.run() line: 991 
AccessController.doPrivileged(PrivilegedExceptionAction<T>) line: not available [native method] 
BundleContextImpl.startActivator(BundleActivator) line: 985 
BundleContextImpl.start() line: 966 
BundleHost.startWorker(boolean) line: 317 
BundleHost(AbstractBundle).start() line: 256 
EclipseLazyStarter.preFindLocalClass(String, ClasspathManager) line: 86 
ClasspathManager.findLocalClass(String) line: 409 
DefaultClassLoader.findLocalClass(String) line: 188 
BundleLoader.findLocalClass(String) line: 334 
SingleSourcePackage.loadClass(String) line: 37 
BundleLoader.findClass(String, boolean) line: 383 
BundleLoader.findClass(String) line: 347 
DefaultClassLoader.loadClass(String, boolean) line: 83 
DefaultClassLoader(ClassLoader).loadClass(String) line: 251 
DefaultClassLoader(ClassLoader).loadClassInternal(String) line: 319 
IDEApplication.run(Object) line: 96 
PlatformActivator$1.run(Object) line: 78 
EclipseAppLauncher.runApplication(Object) line: 92 
EclipseAppLauncher.start(Object) line: 68 
EclipseStarter.run(Object) line: 400 
EclipseStarter.run(String[], Runnable) line: 177 
NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method] 
NativeMethodAccessorImpl.invoke(Object, Object[]) line: 39 
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 25 
Method.invoke(Object, Object...) line: 585 
Main.invokeFramework(String[], URL[]) line: 336 
Main.basicRun(String[]) line: 280 
Main.run(String[]) line: 977 
Main.main(String[]) line: 952 

Yes, I had a .markers.snap file and its size was 37mb
Comment 14 Raj Mandayam CLA 2006-10-18 11:37:03 EDT
This seems important enough to request fix for 3.2.2.
I do not know if advising users of the product, that you should delete a file called markers..snap if you see this problem is a good solution.

Can we not have a fix based on the snap file size like, we know a crash has occurred and if the file size is greater than Xmb (or depending on how much time is acceptable in reading this file, say 2 minutes), not read that file and bail out.
Comment 15 John Arthorne CLA 2006-10-18 12:22:37 EDT
I'm not really keen on throwing away potentially valuable user data to speed up crash recovery.  Some markers, such as problems, can be regenerated without much difficulty, but that is not a general statement about other markers. Tasks, bookmarks, breakpoints, etc, cannot be trivially recreated. However, I'll take a look at the code to see if there are other ways to improve performance.

Another avenue to explore is limiting the number of markers created by WTP validators.  The java compiler by default will not generate more than 100 problems for a given compilation unit (which is still a bit high). If there was a similar user-configurable limit on the number of markers created by validators, it would also be a big performance boost - to startup, shutdown, and to the memory profile since markers are always kept in memory. I have seen large WTP workspaces recently with 100,000+ validation warnings, which is clearly too much information to be useful to the end user.
Comment 16 Chris Laffra CLA 2006-10-18 12:32:52 EDT
Can we read the snap file in a separate job and not block startup?
Comment 17 Szymon Brandys CLA 2008-06-18 14:26:32 EDT
We could put the workspace restore in a job. However the workspace would have to be still locked.

I agree with John (comment 15) about limiting the number of markers. Maybe this should be added to plug-ins developer guide?
Comment 18 John Arthorne CLA 2008-06-27 12:57:23 EDT
*** Bug 173870 has been marked as a duplicate of this bug. ***
Comment 19 Szymon Ptaszkiewicz CLA 2013-12-19 08:41:30 EST
*** Bug 397269 has been marked as a duplicate of this bug. ***
Comment 20 Eclipse Genie CLA 2020-07-02 13:04:27 EDT
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're closing this bug.

If you have further information on the current state of the bug, please add it and reopen this bug. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.
Comment 21 Andrey Loskutov CLA 2021-03-02 02:47:38 EST
Still there in 4.15 (and 4.19 too).

We've got several log dumps where users killed application while waiting on Eclipse to startup. In all cases we see this stack below.

Unfortunately we don't have access to .metadata/.plugins/org.eclipse.core.resources/projects/<project>/.markers.snap files, but the projects our users are working on are huge (~200.000+ files).

"main" #1 prio=6 os_prio=0 tid=0x00007efc10050000 nid=0x228a runnable [0x00007efc16ad0000]
   java.lang.Thread.State: RUNNABLE
        at org.eclipse.core.internal.localstore.SafeChunkyInputStream.buildChunk(SafeChunkyInputStream.java:65)
        at org.eclipse.core.internal.localstore.SafeChunkyInputStream.read(SafeChunkyInputStream.java:146)
        at java.io.DataInputStream.readInt(DataInputStream.java:387)
        at org.eclipse.core.internal.resources.MarkerSnapshotReader.readVersionNumber(MarkerSnapshotReader.java:52)
        at org.eclipse.core.internal.resources.MarkerSnapshotReader.read(MarkerSnapshotReader.java:46)
        at org.eclipse.core.internal.resources.MarkerManager.restoreFromSnap(MarkerManager.java:561)
        at org.eclipse.core.internal.resources.MarkerManager.restore(MarkerManager.java:528)
        at org.eclipse.core.internal.resources.SaveManager.restoreMarkers(SaveManager.java:829)
        at org.eclipse.core.internal.resources.SaveManager.restore(SaveManager.java:714)
        at org.eclipse.core.internal.resources.SaveManager.startup(SaveManager.java:1565)
        at org.eclipse.core.internal.resources.Workspace.startup(Workspace.java:2454)
        at org.eclipse.core.internal.resources.Workspace.open(Workspace.java:2211)
        at org.eclipse.core.resources.ResourcesPlugin.start(ResourcesPlugin.java:489)
        at org.eclipse.osgi.internal.framework.BundleContextImpl$3.run(BundleContextImpl.java:842)
        at org.eclipse.osgi.internal.framework.BundleContextImpl$3.run(BundleContextImpl.java:1)
        at java.security.AccessController.doPrivileged(Native Method)
        at org.eclipse.osgi.internal.framework.BundleContextImpl.startActivator(BundleContextImpl.java:834)
        at org.eclipse.osgi.internal.framework.BundleContextImpl.start(BundleContextImpl.java:791)
        at org.eclipse.osgi.internal.framework.EquinoxBundle.startWorker0(EquinoxBundle.java:1013)
        at org.eclipse.osgi.internal.framework.EquinoxBundle$EquinoxModule.startWorker(EquinoxBundle.java:365)
        at org.eclipse.osgi.container.Module.doStart(Module.java:598)