Bug 537903 - Running evaluation on debug context activation can cause wrong selection in Debug View
Summary: Running evaluation on debug context activation can cause wrong selection in D...
Status: NEW
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Debug (show other bugs)
Version: 4.6   Edit
Hardware: All All
: P3 normal (vote)
Target Milestone: ---   Edit
Assignee: JDT-Debug-Inbox CLA
QA Contact:
URL:
Whiteboard: stalebug
Keywords:
Depends on:
Blocks:
 
Reported: 2018-08-13 09:35 EDT by Simeon Andreev CLA
Modified: 2023-01-03 08:03 EST (History)
1 user (show)

See Also:


Attachments
Video showing how the race condition can be reproduced during a test. (4.50 MB, video/mp4)
2018-08-13 09:35 EDT, Simeon Andreev CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Simeon Andreev CLA 2018-08-13 09:35:26 EDT
Created attachment 275380 [details]
Video showing how the race condition can be reproduced during a test.

The race condition which causes the bug is unfortunately complicated, see attached video to better understand how to reproduce. Reproduction steps are:

Our product does some JDI evaluations, whenever a suspended thread is selected in the Debug View. This sometimes leads to wrong selection in the Debug View. See attached JUnit test for example code.


1. Set breakponts at the following locations (see below for details, since line numbers change):

JDIThread [line: 817] - runEvaluation(IEvaluationRunnable, IProgressMonitor, int, boolean)	
JDIThread [line: 1024] - invokeMethod(ClassType, ObjectReference, Method, List<? extends Value>, boolean)	
JDIThread [line: 1054] - invokeMethod(ClassType, ObjectReference, Method, List<? extends Value>, boolean)	
JDIThread [line: 1493] - handleSuspendForBreakpoint(JavaBreakpoint, boolean)	
ThreadEventHandler [line: 255] - fireDeltaUpdatingSelectedFrame(IThread, int, DebugEvent)	
ThreadEventHandler [line: 265] - fireDeltaUpdatingSelectedFrame(IThread, int, DebugEvent)	

2. Debug attached test "BugXXXXXXTests.java" as Eclipse JUnit plug-in test.

3. Two breakpoints are simultaneously hit, JDIThread.handleSuspend and JDIThread.runEvaluation. The thread which hit the breakpoint in the test snippet (in the debugee Eclipse) is selected and suspended.
4. Resume breakpoint in JDIThread.handleSuspend.
5. Resume breakpoint in JDIThread.runEvaluation.
6. Two breakpoints are simultaneously hit, JDIThread.invokeMethod and ThreadEventHandler.fireDeltaUpdatingSelectedFrame. The one in JDIThread wants to set the state of the thread as "running". The one one in ThreadEventHandler wants to query the top stack frame.
7. Resume first breakpoint in JDIThread.invokeMethod. The state of the suspended thread is now set to "running".
8. Second breakpoint in JDIThread.invokeMethod is hit, at the call to invokeComplete. The thread state is still running.
9. Resume first breakpoint in ThreadEventHandler. This doesn't compute any stack frames due to the running thread state. Resulting delta therefore selects thread.
10. Second breakpoint in ThreadEventHandler.fireDeltaUpdatingSelectedFrame is hit, at the check of having no stack frame and a suspended thread.
11. Resume second breakpoint in JDIThread. The state of the thread is now "suspended" again.
12. Resume second breakpoint in ThreadEventHandler. Due to the now suspended state, this results in delta:

Model Delta Start
  Element: org.eclipse.debug.internal.core.LaunchManager@23aaa756
      Flags: NO_CHANGE
      Index: -1 Child Count: -1
    Element: org.eclipse.debug.core.Launch@3442f8f4
        Flags: NO_CHANGE
        Index: 0 Child Count: 2
      Element: org.eclipse.jdt.internal.debug.core.model.JDIDebugTarget@8eb9935
          Flags: NO_CHANGE
          Index: 0 Child Count: 5
        Element: org.eclipse.jdt.internal.debug.core.model.JDIThread@3b6037da
            Flags: EXPAND | SELECT | STATE | REVEAL | 
            Index: 4 Child Count: 3
Model Delta End

This delta results in selecting and expanding the thread, as opposed to its top stack frame. As a result, e.g. its not possible to use resume and stepping. Client plug-in code which depends on correct selection is also broken.


There are two race conditions here. The first occurs from step 1. to step 5.. The latter occurs from step 6. onward, and can only occur after the first race condition did.

For the first race condition, the following occurs:

    a) Debug target proxy installation is triggered "late" and detects a suspended thread.
    b) A selection is fired. Evaluation from example plug-in kicks in due to this.
    c) JDIThread.handleSuspendForBreakpoint(JavaBreakpoint, boolean) is not yet done with resetting the fSuspendVoteInProgress flag to false.
    d) The evaluation from the example plug-in sees fSuspendVoteInProgress==true and fires "quiet" deltas.
    e) The quiet deltas lead to collapsing the expanded and selected thread from step b), setting up for the second race condition.

In the second race condition, ThreadEventHandler.fireDeltaUpdatingSelectedFrame will create a bad delta due to the following:

    i) A JDI evaluation is running in paralle to fireDeltaUpdatingSelectedFrame. The evaluation sets the thread state to "running".
   ii) fireDeltaUpdatingSelectedFrame sees this and decides that there is no top stack frame in the thread.
  iii) The JDI evaluation finishes, setting the thread state to "suspended" again.
   iv) fireDeltaUpdatingSelectedFrame continues, sees that there is no top stack frame but that the thread is suspended. It creates the bad delta from above, breaking any code which depends on correct stack frame selection.


Here are the breakpoint locations (see also attached video), that can force the race conditions to occur:


Two breakpoints in JDIThread.involeMethod, to control the running state of the suspended thread for the second race condition:

{code}
	protected Value invokeMethod(ClassType receiverClass,
			ObjectReference receiverObject, Method method, List<? extends Value> args,
			boolean invokeNonvirtual) throws DebugException {
		...
			synchronized (this) {
				...
				setRequestTimeout(Integer.MAX_VALUE);
				setRunning(true);  // <------------ BREAKPOINT HERE
				setInvokingMethod(true);
			}
                ...
		invokeComplete(timeout);  // <---------- BREAKPOINT HERE
		return result;
	}
{code}

One breakpoint in JDIThread.handleSuspendForBreakpoint, to control the first race condition (due to fSuspendVoteInProgress):

{code}
	public boolean handleSuspendForBreakpoint(JavaBreakpoint breakpoint,
			boolean suspendVote) {
		...
		} finally {
			synchronized (this) {         // <------------- BREAKPOINT HERE
				fSuspendVoteInProgress = false;
				...
			}
		}
		return suspend;
	}
{code}

One breakpoint in JDIThread.runEvaluation:

        @Override
	public void runEvaluation(IEvaluationRunnable evaluation,
			IProgressMonitor monitor, int evaluationDetail,
			boolean hitBreakpoints) throws DebugException {
		...
		synchronized (fEvaluationLock) {
			fEvaluationRunnable = evaluation;
			fHonorBreakpoints = hitBreakpoints;
		}
		boolean quiet = isSuspendVoteInProgress();     // <--------------- BREAKPOINT HERE
		if (quiet) {
			// evaluations are quiet when a suspend vote is in progress
			// (conditional breakpoints, etc.).
			fireEvent(new DebugEvent(this, DebugEvent.MODEL_SPECIFIC,
					RESUME_QUIET));
		} ...
                ...
	}


Two breakpoints breakpoints in ThreadEventHandler.fireDeltaUpdatingSelectedFrame, again to control the second race condition:

	private void fireDeltaUpdatingSelectedFrame(IThread thread, int flags, DebugEvent event) {
		...
		try {
			Object frameToSelect = event.getData();
			if (frameToSelect == null || !(frameToSelect instanceof IStackFrame)) {
				frame = thread.getTopStackFrame();       // <------------------------------ BREAKPOINT HERE
			}
                        ...
    	        if (isEqual(frame, prev)) {
    		    if (frame == null) {
    			if (thread.isSuspended()) {                      // <------------------------------ BREAKPOINT HERE
	    			// no frames, but suspended - update & select
	    			node = node.addNode(thread, threadIndex, flags | IModelDelta.STATE | IModelDelta.SELECT, childCount);
    			}
    		        ...
    	        fireDelta(delta);
	}
Comment 1 Eclipse Genie CLA 2018-08-13 09:44:16 EDT
New Gerrit change created: https://git.eclipse.org/r/127347
Comment 2 Simeon Andreev CLA 2018-08-13 09:53:02 EDT
Please find JUnit test in https://git.eclipse.org/r/#/c/127347/ (patch set 1). I decided against attaching the test file as the test requires some refactoring to avoid a lot of code duplication.
Comment 3 Simeon Andreev CLA 2018-08-14 09:06:39 EDT
So far the best solution that comes to mind is changing the client code; the job which runs an evaluation can wait for DebugPlugin event handling to finish.

An alternative would maybe be:

diff --git a/org.eclipse.debug.ui/ui/org/eclipse/debug/internal/ui/viewers/update/ThreadEventHandler.java b/org.eclipse.debug.ui/ui/org/eclipse/debug/internal/ui/viewers/update/ThreadEventHandler.java
index c30102aac..e5bba8f9c 100644
--- a/org.eclipse.debug.ui/ui/org/eclipse/debug/internal/ui/viewers/update/ThreadEventHandler.java
+++ b/org.eclipse.debug.ui/ui/org/eclipse/debug/internal/ui/viewers/update/ThreadEventHandler.java
@@ -44,6 +44,9 @@ public class ThreadEventHandler extends DebugEventHandler {
         * Map of previous TOS per thread
         */
        private Map<IThread, IStackFrame> fLastTopFrame = new HashMap<>();
+
+       private IThread fEvaluatingThread;
+
        /**
         * Constructs and event handler for a threads in the given viewer.
         *
@@ -66,6 +69,7 @@ public class ThreadEventHandler extends DebugEventHandler {
        @Override
        protected void handleSuspend(DebugEvent event) {
         IThread thread = (IThread) event.getSource();
+               boolean hasPendingSelect = clearPendingSelect(thread);
                if (event.isEvaluation()) {
                ModelDelta delta = buildRootDelta();
                ModelDelta node = addPathToThread(delta, thread);
@@ -74,6 +78,9 @@ public class ThreadEventHandler extends DebugEventHandler {
                                IStackFrame frame = thread.getTopStackFrame();
                 if (frame != null) {
                        int flag = IModelDelta.NO_CHANGE;
+                                       if (hasPendingSelect) {
+                                               flag = flag | IModelDelta.SELECT;
+                                       }
                        if (event.getDetail() == DebugEvent.EVALUATION) {
                                // explicit evaluations can change content
                                flag = flag | IModelDelta.CONTENT;
@@ -173,6 +180,7 @@ public class ThreadEventHandler extends DebugEventHandler {
        @Override
        protected void handleLateSuspend(DebugEvent suspend, DebugEvent resume) {
                IThread thread = queueSuspendedThread(suspend);
+               boolean hasPendingSelect = clearPendingSelect(thread);
                if (suspend.isEvaluation() && suspend.getDetail() == DebugEvent.EVALUATION_IMPLICIT) {
                        // late implicit evaluation - update thread and frame
                ModelDelta delta = buildRootDelta();
@@ -181,8 +189,12 @@ public class ThreadEventHandler extends DebugEventHandler {
                        try {
                                IStackFrame frame = thread.getTopStackFrame();
                 if (frame != null) {
-                    node.addNode(frame, IModelDelta.STATE);
-                    fireDelta(delta);
+                                       int flag = IModelDelta.STATE;
+                                       if (hasPendingSelect) {
+                                               flag = flag | IModelDelta.SELECT;
+                                       }
+                                       node.addNode(frame, flag);
+                                       fireDelta(delta);
                 }
                        } catch (DebugException e) {
                        }
@@ -283,6 +295,8 @@ public class ThreadEventHandler extends DebugEventHandler {
        }
        if (frame != null) {
             node.addNode(frame, indexOf(frame), IModelDelta.STATE | IModelDelta.SELECT, childCount(frame));
+               } else {
+                       fEvaluatingThread = thread;
         }
        synchronized (this) {
                if (!isDisposed()) {
@@ -388,4 +402,9 @@ public class ThreadEventHandler extends DebugEventHandler {
                return null;
        }
 
+       private boolean clearPendingSelect(IThread thread) {
+               boolean hasPendingSelect = fEvaluatingThread == thread;
+               fEvaluatingThread = null; // TODO: this should be done only when a non-evaluation suspend is coming?
+               return hasPendingSelect;
+       }
 }

This doesn't deal with callers of IJavaObject.sendMessage though, which don't run in JDIThread.runEvaluation. I can e.g. ask IJavaStackFrame after some variable, down-cast it to IJavaObject, and then call sendMessage on the result. This is all public API.
Comment 4 Simeon Andreev CLA 2018-08-15 03:50:55 EDT
I've updated the test in review https://git.eclipse.org/r/#/c/127347/. The race condition between ThreadEventHandler.handleResume and JDIThread.invokeMethod is able to cause a bad selection, without the first race condition listed in the description.

Unfortunately, when running this test overnight I ran into only 1 failure in 1000 test iterations (about 2 hours).

I'll try adding similar handling as in the fix for bug 515206, when computing thread stack frames. If this works in our product, we can consider adding this to the platform as well. Potentially we can try merging the fix for bug 515206 with a fix for this bug, since they appear very similar.

I see no other viable solutions so far.
Comment 5 Eclipse Genie CLA 2018-08-31 10:46:58 EDT
New Gerrit change created: https://git.eclipse.org/r/128452
Comment 6 Simeon Andreev CLA 2018-09-14 03:57:47 EDT
With https://git.eclipse.org/r/#/c/128452/, one case of the bug is fixed. In particular, it should no longer be possible to receive a delta which selects a thread and not its top stack frame.

However if an evaluation runs during the entire thread suspend handling, the behaviour is unchanged. So the bug still exists, depending on how client code triggers JDI evaluations.
Comment 7 Andrey Loskutov CLA 2018-09-14 04:01:47 EDT
(In reply to Simeon Andreev from comment #6)
> However if an evaluation runs during the entire thread suspend handling, the
> behaviour is unchanged. So the bug still exists, depending on how client
> code triggers JDI evaluations.

OK, let keep the bug open. I don't see right now a good solution.
Comment 9 Eclipse Genie CLA 2020-09-21 00:11:15 EDT
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.
Comment 10 Eclipse Genie CLA 2023-01-03 08:03:50 EST
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.