Community
Participate
Working Groups
Build ID: 3.7.1 M20110909-1335 When launching Eclipse with an existing workspace, I got an error dialog An internal error occurred during "Initialize Java Tooling" and following backtrace was listed in the error log: java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$KeyIterator.next(HashMap.java:828) at org.eclipse.jdt.internal.core.ExternalFoldersManager.createPendingFolders(ExternalFoldersManager.java:171) at org.eclipse.jdt.core.JavaCore.initializeAfterLoad(JavaCore.java:3614) at org.eclipse.jdt.internal.ui.InitializeAfterLoadJob$RealJob.run(InitializeAfterLoadJob.java:36) at org.eclipse.core.internal.jobs.Worker.run(Worker.java:54)
Looks like it's unlikely to be the same thread that is modifying while iterating. Access to pendingFolders might have to be synchronized. I will work on it.
Created attachment 209211 [details] Proposed fix One of the possible scenario that could have caused this is, classpath getting recomputed, thus forcing a folder being removed or added, while initializAfterLoad is running. Martin, can you tell me if something changed between eclipse sessions that may have kicked off classpath recomputation, such as like workspace being moved etc.? The patch attached should be good enough for the concurrent modifications issues.
Satyam, could you spare some time to review this. Please note that I have not added any tests.
(In reply to comment #2) > Martin, can you tell me if something changed between eclipse sessions that may > have kicked off classpath recomputation, such as like workspace being moved > etc.? Hm... I don't think so. I definitely - didn't move my workspace, - didn't update any files in my workspace since the last launch and I'm pretty sure I also didn't change any configuration (like installed host or target platform or JRE).
(In reply to comment #3) > Satyam, could you spare some time to review this. Please note that I have not > added any tests. The changes look good.
Thanks, Satyam, I have released the fix in master. http://git.eclipse.org/c/jdt/eclipse.jdt.core.git/commit/?id=de08ae3ac42ace8070739ebb7715adc6c4ed1304
This caused a deadlock, see bug 369251. Please revert for M5.
From locking at the stack dump in bug 369251 and this bug's patch I suppose instead of synchronizing the entire loop we should just fetch a copy of pendingFolders for iteration, wipe the original set and release the lock before starting the actual loop, no? Alternatively, we would have to acquire all necessary scheduling rules before entering the synchronized block, but that would be more difficult to implement (if it's possible) because the resources are created only inside the loop AFAICS. Unfortunately, I wouldn't immediately know how to test such a change.
Illustration of comment 8: Iterator iterator; synchronized (this.pendingFolders) { iterator = new ArrayList(this.pendingFolders).iterator(); this.pendingFolders.clear(); } while (iterator.hasNext()) { Object folderPath = iterator.next(); try { createLinkFolder((IPath) folderPath, false, externalFoldersProject, monitor); } catch (CoreException e) { Util.log(e, "Error while creating a link for external folder :" + folderPath); //$NON-NLS-1$ } }
Created attachment 209869 [details] Proposed patch This patch is to revert the previous change and add a new schedule rule while creating the pending folders from JavaCore. The result of this is as follows: 1. Obviously, bug 369251 and the bug reported here get fixed. 2. There is one more scenario where the race condition (similar to the one reported here) might occur - from the JavaProject#addToResult(). However, it happens only in when the user tries to change the classpath when the pending folders get created. Keeping the pendingFolders as a synchronizedSet or introducing a schedule rule in the thread executing addToResult() could have handled it (2), but there are two reasons for not doing that: a) It's a rare scenario and is different than what is reported here. We can create a new bug for that. b) The JavaProject@addToResult() is often associated with a UI thread, which means it can't be blocked, which can happen with either of the above mentioned approaches. Stephan, can you please review this patch? Please also note that I have moved the addFolder and createPendingFolders calls together in JavaCore for simplicity sake. This shouldn't affect the earlier fix made in that area.
(In reply to comment #10) > Created attachment 209869 [details] > Proposed patch Another important side effect that I forgot to mention is this: if there is any job that obtains a lock on workspace root, then initializeAfterLoadJob must wait for that job and vice versa. However, the lock itself is obtained for a very short time and not during the entire job. This would address the many concurrent thread issues we faced in this area in the past in a better way.
(In reply to comment #10) > Created attachment 209869 [details] > Proposed patch > > This patch is to revert the previous change and add a new schedule rule while > creating the pending folders from JavaCore. Thanks for putting together an alternate patch quickly. Just so this is an informed/considered decision, a few questions regarding the problem outlined in comment#0: - Is this a ship stopper for M5 ? Or is it something that can wait ? - When did the root cause for the faulty behavior get introduced ? Is it a regression, if so which fix caused it ? when ? What is the earliest version that shows the faulty behavior ? - Is there a crisp scenario description for triggering the bug ? - Are there any workarounds (to prevent triggering) ? How reasonable are they ? - How likely is the bug's manifestation (per educated guess) ? If the comment#0 problem is not due to recent behavior change it looks like a no-brainer decision to me to opt for a roll back of the problematic fix and consider the alternate patch for inclusion in early M6.
(In reply to comment #12) > If the comment#0 problem is not due to recent behavior change > it looks like a no-brainer decision to me to opt for a roll back > of the problematic fix and consider the alternate patch for inclusion > in early M6. In any case, if we don't have clear agreement on the new fix by the time of the Sunday warm up build, we should revert the problematic fix so as not to block partner project's testing.
(In reply to comment #12) > - Is this a ship stopper for M5 ? Or is it something > that can wait ? The bug itself is not a show-stopper > - When did the root cause for the faulty behavior get introduced ? > Is it a regression, if so which fix caused it ? when ? > What is the earliest version that shows the faulty behavior ? The code causing the bug was introduced as part of the fix to bug 320618, which was released in 3.7. > - Is there a crisp scenario description for triggering the bug ? > - Are there any workarounds (to prevent triggering) ? How reasonable > are they ? > - How likely is the bug's manifestation (per educated guess) ? It's a timing issue and not always reproducible. To reproduce one may have to suspend thread and such. One of the workarounds I can think of is closing and opening the projects again. The bug is likely to manifest when working with large workspace, esp. when something has changed in the workspace between sessions, something like bug 369251, comment #0.
Created attachment 209871 [details] Updated patch I have split the previous patch into two parts: 1. Reverting the changes made in the first page; I have already released this under bug 369251 just so that HEAD is clean. 2. This patch, an alternate fix to the bug reported here. Stephan, please review this patch. TIA.
(In reply to comment #14) > The code causing the bug was introduced as part of the fix to bug 320618, > which was released in 3.7. For the record, that was for 3.7 M2. So this bug has been lying dormant for 16+ months. That makes me think if there is any question at all around the new fix as it is being reviewed, it can safely be moved out to early M6.
(In reply to comment #10) > Created attachment 209869 [details] > Proposed patch > ... > Stephan, can you please review this patch? Honestly, *if this is for M5* I don't see how I could quickly establish sufficient confidence that the latest patch solves the original problem without causing new issues. Nothing I can judge by a quick look at the code. I'd definitely want to test the patch back to back with the previous state, but for this I would need to reproduce the bug at least *sometimes* in the debugger. If, however, reverting the previous patch is good enough for M5 and the latest patch shall go into M6 I'll do my best to collaborate with Jay in understanding how to trigger the problem and based on that understanding: how to fix. As an aside: should we succeed in deterministically forcing the bug using breakpoints in the debugger, I should be able to create a regression test: I recently made experiments how OT/J can force a given execution order of concurrent executions. Even if that's not directly deployable to the SDK build, we could run such regression tests locally at certain points in time.
Let's not hurry here.
(In reply to comment #17) > (In reply to comment #10) > > Created attachment 209869 [details] > > Proposed patch > > ... > > Stephan, can you please review this patch? > > Honestly, *if this is for M5* I don't see how I could quickly establish > sufficient confidence that the latest patch solves the original problem without > causing new issues. Stephan, could you look at this now for inclusion in early M6 ? Thanks.
Satyam, could you also review this please ? TIA.
Satyam and Stephan, I know you are guys are very busy, sorry for nagging you on this: We should try and get this in early in the mile stone period, we have already seen two I builds go by. Seeking your indulgence, TIA.
(In reply to comment #10) > b) The JavaProject@addToResult() is often associated with a UI thread, which > means it can't be blocked, which can happen with either of the above mentioned > approaches. I see that JavaProject@addToResult() is called from JavaProject#resolveClasspath(). Looking at the call stack, I believe this can be also called from IJavaProject#getResolvedClasspath(...). I think the logic could be improved to lock only if they are some folders that really need to be created. IMO, this will the most prevalent case.
Created attachment 210971 [details] new fix (In reply to comment #22) > I see that JavaProject@addToResult() is called from > JavaProject#resolveClasspath(). Looking at the call stack, I believe this can > be also called from IJavaProject#getResolvedClasspath(...). Satyam, I can't confirm that plugins cannot request for classpath resolution before JavaCore#initializeAfterLoad(), even though I haven't found such a scenario. So, for hypothetical scenarios, I admit we have to consider the other scenario also. The new fix does the following: 1. Obviously the new schedule rule introduced in the previous patch is not mandatory. It could be useful, but not in the context of this bug esp. with the following two changes. 2. The elements of the pendingFolders map are copied to a temporary array and processed and the map is cleared right after copying the elements into the array. This is what Stephan had suggested in his earlier comment. 3. The createLinkFolder calls have been moved out of the synchronized block to keep away the deadlock issue.
(In reply to comment #23) > Created attachment 210971 [details] > new fix This patch looks good for me. It will fix the current scenario and will also not run into a deadlock.
(In reply to comment #24) > This patch looks good for me. It will fix the current scenario and will also > not run into a deadlock. Stephan, if you have time, take a look at this patch. I know you are busy and Satyam has already reviewed the patch. Besides the patch is a straight forward one. So, if you don't have time, that's okay.
Thanks, Satyam. Released the fix here: http://git.eclipse.org/c/jdt/eclipse.jdt.core.git/commit/?id=e4d4ff5013f9df9fd57cdef7bafc039c1cd3f062
(In reply to comment #25) > (In reply to comment #24) > > This patch looks good for me. It will fix the current scenario and will also > > not run into a deadlock. > > Stephan, if you have time, take a look at this patch. Stephan, despite the resolution you are welcome to look the patch if you manage to find time for it.
Verified for 3.8M6 using build I20120312-1300
Hello, is this also fixed for 3.7.2? Currently, I get this error on every startup and have no idea to workaround it. Does it do any harm?
(In reply to comment #29) > Hello, is this also fixed for 3.7.2? Take a look at the 'target milestone' to get the answer.
Setting a requested +1 flag that was more than overdue :/