Bug 395645 - Progress view can show finished job as unfinished (race condition introduced by bug 258352)
Summary: Progress view can show finished job as unfinished (race condition introduced ...
Status: VERIFIED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: IDE (show other bugs)
Version: 4.2   Edit
Hardware: PC Windows 7
: P3 normal (vote)
Target Milestone: 4.3 M6   Edit
Assignee: John Arthorne CLA
QA Contact:
URL:
Whiteboard:
Keywords: greatbug
: 197258 (view as bug list)
Depends on:
Blocks:
 
Reported: 2012-12-03 18:15 EST by John Cortell CLA
Modified: 2013-03-12 10:50 EDT (History)
4 users (show)

See Also:


Attachments
example for reproducing problem (5.61 KB, application/x-zip-compressed)
2012-12-03 18:16 EST, John Cortell CLA
no flags Details
proposed fix (5.85 KB, patch)
2012-12-03 18:19 EST, John Cortell CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description John Cortell CLA 2012-12-03 18:15:13 EST
The fix to bug 258352 introduced a race condition in the Progress viewer.

The code that tries to ensure the update job runs no more than once every 100 ms is prone to a race condition. The update job is scheduled with a 100ms delay, and subsequent calls to schedule another update has an overly simplistic test for determining if a job is already scheduled. This can lead to an update being erroneously skipped. 

This reproduces easily for me with a simple example (attached). Open the Progress view then invoke the RunJobs command (CTRL+6). On my machine, about 30% of time the job ends up shown as incomplete when it in fact has completed.

I'm attaching a git patch created with git diff. Much of the delta is due to a scope addition indenting code).
Comment 1 John Cortell CLA 2012-12-03 18:16:12 EST
Created attachment 224245 [details]
example for reproducing problem
Comment 2 John Cortell CLA 2012-12-03 18:19:16 EST
Created attachment 224247 [details]
proposed fix
Comment 3 Paul Webster CLA 2013-02-05 15:39:26 EST
John, when you get a chance could you have a look at this patch?

PW
Comment 4 John Arthorne CLA 2013-02-12 17:27:46 EST
*** Bug 197258 has been marked as a duplicate of this bug. ***
Comment 5 John Arthorne CLA 2013-02-12 17:48:08 EST
This was a great find. Many people have seen this "phantom job" problem with the progress view but it was never tracked down. At first I thought we should pull out all external synchronization and just rely on the job manager - repeated schedule attempts are ignored after all. However I see the purpose of bug 258352 was to avoid this thrashing on the very contentious job manager lock. 

So your additional synchronization fix looks good. However it makes me nervous because it introduces nested locks in two places. This is always a risk of introducing deadlocks and I have reworked the patch to avoid it. This does mean if a change happens to a job while the update is running, we will always schedule another job. In some cases this will cause the update job to run one more time, but this is not much overhead and better than introducing a lock that waits for the update to complete before seeing if we need to run it again.
Comment 6 John Arthorne CLA 2013-02-12 17:51:46 EST
Fix released:

http://git.eclipse.org/c/platform/eclipse.platform.ui.git/commit/?id=f092d6459601a6ff565dbc6c46b1f2e96bcf43bf

Thank you for this fantastic bug report. You did all the hard work here of isolating the problem and providing a reproducible test case which was 95% of the work.
Comment 7 John Cortell CLA 2013-02-12 18:03:15 EST
(In reply to comment #5)
> This was a great find. Many people have seen this "phantom job" problem with
> the progress view but it was never tracked down. At first I thought we
> should pull out all external synchronization and just rely on the job
> manager - repeated schedule attempts are ignored after all. However I see
> the purpose of bug 258352 was to avoid this thrashing on the very
> contentious job manager lock. 
> 
> So your additional synchronization fix looks good. However it makes me
> nervous because it introduces nested locks in two places. This is always a
> risk of introducing deadlocks and I have reworked the patch to avoid it.
> This does mean if a change happens to a job while the update is running, we
> will always schedule another job. In some cases this will cause the update
> job to run one more time, but this is not much overhead and better than
> introducing a lock that waits for the update to complete before seeing if we
> need to run it again.

I wasn't feeling great about the nested locks either, but ultimately I felt it was OK given the short and simple lock scopes. Also, blocking on the update of a view as simple as this one didn't seem like a big deal to me, particularly since it would happen pretty infrequently. Since the point of the original code was to avoid refresh overload, I felt better avoiding any unnecessary refreshes. That said, I agree that an additional, redundant job is a reasonable way to mitigate both these concerns.

Thanks for acting on this patch.
Comment 8 John Cortell CLA 2013-02-12 18:06:49 EST
(In reply to comment #6)
> Fix released:
> 
> http://git.eclipse.org/c/platform/eclipse.platform.ui.git/commit/
> ?id=f092d6459601a6ff565dbc6c46b1f2e96bcf43bf
> 
> Thank you for this fantastic bug report. You did all the hard work here of
> isolating the problem and providing a reproducible test case which was 95%
> of the work.

You bet. Glad I could help.
Comment 9 John Arthorne CLA 2013-02-13 09:29:17 EST
I'm still a bit worried there could be a case where updateScheduled.value is true, but there is no job scheduled. Since this updater is a singleton it would mean progress updating would be permanently broken. I'm trying some things to make sure that can never happen.
Comment 10 Paul Elder CLA 2013-03-12 10:50:09 EDT
Verified in 4.3.0.I20130311-2000