I200406221600 Sun JDK 1.4.2_04

Not sure what effect this had on overall stability, since I immediately restarted Eclipse. Log follows:

!ENTRY org.eclipse.core.runtime 4 2 Jun 24, 2004 08:45:13.820
!MESSAGE An internal error occurred during: "Decoration Calculation".
!STACK 0
java.lang.NullPointerException
	at org.eclipse.ui.internal.decorators.DecorationScheduler$1.run(DecorationScheduler.java:222)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:66)
Did you have any more in your log?
Line 222 is:

	monitor.beginTask(WorkbenchMessages.getString("DecorationScheduler.CalculatingTask"), 100); //$NON-NLS-1$

This is a monitor passed to me by the JobManager. Moving to Core.
This is suspicious code in JobManager:

	/**
	 * Attempts to immediately start a given job. Returns true if the job was
	 * successfully started, and false if it could not be started immediately
	 * due to a currently running job with a conflicting rule. Listeners will never
	 * be notified of jobs that are run in this way.
	 */
	protected boolean runNow(InternalJob job) {
		synchronized (lock) {
			//cannot start if there is a conflicting job
			if (findBlockingJob(job) != null)
				return false;
			changeState(job, Job.RUNNING);
			job.setProgressMonitor(new NullProgressMonitor());
			job.run(null);
		}
		return true;
	}
Looks like it was introduced as a fix to Bug 62368 last month.
Nothing else in the log. It happened to me again and this time I kept running with no adverse effects.
The spec for Job#run says that the progress monitor is never null. I will check with John this morning, but I believe a good change would be:

	...
	job.setProgressMonitor(new NullProgressMonitor());
	job.run(job.getProgressMonitor());
	...
We have reviewed the code and that was a red herring. The #runNow method is only called on ThreadJob instances which have no-ops for their #run methods. We will investigate further.
I can reproduce this in a test case. If a job change listener reschedules the same job from within the scheduled() callback, we can get into a state where the same job is being executed in two worker threads at once. When the first thread finishes, it nulls the progress monitor, thus causing the NPE in the second thread. The NPE is therefore a symptom of an extremely severe problem: having the same job running in multiple threads at once can potentially have many adverse side effects.
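To illustrate the hazard described above, here is a minimal, self-contained sketch (not the real JobManager; all class and method names are invented for illustration) of a scheduler that notifies listeners before committing the state change. A listener that reschedules from within the callback gets the same job into the queue twice:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of the race: listeners are notified in step 4,
// before the job is added to the wait queue in step 6, so a reentrant
// schedule() can enqueue the same job a second time.
public class DoubleScheduleSketch {
    interface Listener { void scheduled(Job job); }

    static class Job { final String name; Job(String n) { name = n; } }

    static class NaiveManager {
        final Deque<Job> waitQueue = new ArrayDeque<>();
        Listener listener;

        void schedule(Job job) {
            // (precondition check against the wait queue would pass here,
            // because the job has not been enqueued yet)
            if (listener != null)
                listener.scheduled(job);   // callback may call schedule() again
            waitQueue.add(job);            // state change happens last
        }
    }

    public static void main(String[] args) {
        NaiveManager manager = new NaiveManager();
        final boolean[] rescheduled = { false };
        manager.listener = job -> {
            if (!rescheduled[0]) {         // reschedule exactly once from the callback
                rescheduled[0] = true;
                manager.schedule(job);
            }
        };
        manager.schedule(new Job("decoration"));
        // The same job now sits in the queue twice:
        System.out.println("queue size = " + manager.waitQueue.size());
    }
}
```

In the real JobManager, two copies in the wait queue mean two worker threads can run the job concurrently; the first to finish nulls the progress monitor out from under the second, producing the NPE above.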
Created attachment 12785 [details]
Patch to JobManager.java

This is the proposed fix. The old code did this:

1) enter sync block
2) assert scheduling preconditions
3) exit sync block
4) call job change listeners
5) enter sync block
6) change job state and add to wait queue
7) exit sync block

Step 2) (asserting the preconditions) is the guard code that ensures the same job is not scheduled twice. If that occurs, the second schedule() attempt is ignored (this is the specified behaviour). During 4), another thread (or the same thread, in a recursive schedule() call) could reschedule the identical job. Step 6) would then occur twice on the same job, causing the NPE, or much worse problems.

The patch adds a new step to re-assert the preconditions after entering the sync block a second time:

1) enter sync block
2) assert scheduling preconditions
3) exit sync block
4) call job change listeners
5) enter sync block
5a) NEW: assert scheduling preconditions
6) change job state and add to wait queue
7) exit sync block

Why check the preconditions twice? We want to avoid calling job change listeners and telling them the job is scheduled if in fact it will be discarded. This reduces notifications that would cause unnecessary flicker. With the patched code, it is still possible for a listener to get the scheduled() notification twice for the same job. This is not great, but much better than actually running the job twice simultaneously.
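The patched control flow can be sketched as follows (again a self-contained illustration with invented names, not the actual patch; the numbered comments follow the steps above). The re-check at 5a) discards the reentrant schedule, so the job is enqueued only once:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of the patched flow: the scheduling preconditions
// are asserted again (step 5a) after re-entering the lock, so a job that
// was enqueued by a reentrant schedule() call is not enqueued twice.
public class RecheckedScheduleSketch {
    interface Listener { void scheduled(Job job); }

    static class Job { final String name; Job(String n) { name = n; } }

    final Object lock = new Object();
    final Deque<Job> waitQueue = new ArrayDeque<>();
    Listener listener;

    void schedule(Job job) {
        synchronized (lock) {                        // 1) enter sync block
            if (waitQueue.contains(job)) return;     // 2) assert preconditions
        }                                            // 3) exit sync block
        if (listener != null)
            listener.scheduled(job);                 // 4) call listeners
        synchronized (lock) {                        // 5) enter sync block
            if (waitQueue.contains(job)) return;     // 5a) NEW: re-assert
            waitQueue.add(job);                      // 6) add to wait queue
        }                                            // 7) exit sync block
    }

    public static void main(String[] args) {
        RecheckedScheduleSketch manager = new RecheckedScheduleSketch();
        final boolean[] once = { false };
        manager.listener = job -> {
            if (!once[0]) { once[0] = true; manager.schedule(job); }
        };
        manager.schedule(new Job("decoration"));
        System.out.println("queue size = " + manager.waitQueue.size());
    }
}
```

Note that the listener still fires twice in this trace, which matches the residual limitation described above: duplicate notifications remain possible, but duplicate execution does not.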
We need to get the fix approval process started.
Created attachment 12795 [details]
Improved patch

This patch avoids the possibility of multiple scheduling notifications. It adds a new internal state to indicate that a job has been scheduled but not yet added to the wait queue. Thus a second call to schedule() can distinguish this case properly and immediately bail out.
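The idea behind the improved patch can be sketched like this (a self-contained illustration; the state names and structure are invented, not the real JobManager internals). Marking the job with an intermediate "about to schedule" state before the notification lets a reentrant schedule() bail out immediately, so the listener fires only once:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a new internal ABOUT_TO_SCHEDULE state marks a job
// that has been scheduled but not yet enqueued, so a reentrant schedule()
// bails out before any duplicate notification or enqueue can happen.
public class InternalStateSketch {
    interface Listener { void scheduled(Job job); }

    static class Job { final String name; Job(String n) { name = n; } }

    static final int NONE = 0, ABOUT_TO_SCHEDULE = 1, WAITING = 2;

    final Object lock = new Object();
    final Deque<Job> waitQueue = new ArrayDeque<>();
    final Map<Job, Integer> state = new HashMap<>();
    Listener listener;
    int notifications = 0;

    void schedule(Job job) {
        synchronized (lock) {
            // bail out if the job is already queued OR mid-schedule
            if (state.getOrDefault(job, NONE) != NONE) return;
            state.put(job, ABOUT_TO_SCHEDULE);
        }
        notifications++;
        if (listener != null)
            listener.scheduled(job);     // reentrant schedule() sees ABOUT_TO_SCHEDULE
        synchronized (lock) {
            state.put(job, WAITING);
            waitQueue.add(job);
        }
    }

    public static void main(String[] args) {
        InternalStateSketch manager = new InternalStateSketch();
        manager.listener = job -> manager.schedule(job); // always tries to reschedule
        manager.schedule(new Job("decoration"));
        System.out.println("queue size = " + manager.waitQueue.size()
                + ", notifications = " + manager.notifications);
    }
}
```

Unlike the first patch, the second call is rejected before step 4, which is what eliminates the duplicate scheduled() notification.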
Fix released. Fix reviewed by Jean-Michel, DJ, McQ, MVM, and Kevin H.
It is quite likely that the following bugs are duplicates of this one: bug 57656 bug 63635 bug 65415 bug 68240