Bug 146115 - [breakpoints] Deadlock on launch due to CBreakpointManager
Summary: [breakpoints] Deadlock on launch due to CBreakpointManager
Status: NEW
Alias: None
Product: CDT
Classification: Tools
Component: cdt-debug (show other bugs)
Version: 3.1   Edit
Hardware: PC Windows XP
: P3 normal with 4 votes (vote)
Target Milestone: ---   Edit
Assignee: cdt-debug-inbox@eclipse.org CLA
QA Contact: Jonah Graham CLA
URL:
Whiteboard:
Keywords:
Depends on: 249951
Blocks:
  Show dependency tree
 
Reported: 2006-06-08 16:57 EDT by Ryan Hapgood CLA
Modified: 2020-09-04 15:20 EDT (History)
6 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Ryan Hapgood CLA 2006-06-08 16:57:07 EDT
To reproduce try:

- clear workspace
- new managed make project
- create a new file in project, add code and save
- add breakpoint
- create launch with stop on main
- run launch

Thread 1:

AutoBuildJob (owns workspace job queue) -> ... -> BreakpointManager.resourceChanged(...) -> CBreakpointManager.breakpointsChanged(...)

	public void breakpointsChanged( IBreakpoint[] breakpoints, IMarkerDelta[] deltas ) {
		ArrayList removeList = new ArrayList( breakpoints.length );
		ArrayList installList = new ArrayList( breakpoints.length );
		synchronized ( getBreakpointMap() ) {

###### Waiting for map to become free ####


Thread 2:

EventManager -> CBreakpointManager.handleDebugEvents(...) -> handleBreakpointCreatedEvent(...) -> doHandleLocationBreakpointCreatedEvent(...)

	private void doHandleLocationBreakpointCreatedEvent( ICDILocationBreakpoint cdiBreakpoint ) {
		if ( cdiBreakpoint.isTemporary() )
			return;
		ICBreakpoint breakpoint = null;
		synchronized( getBreakpointMap() ) {

####### Has lock on map #######

			breakpoint = getBreakpointMap().getCBreakpoint( cdiBreakpoint );
			if ( breakpoint == null ) {
				breakpoint = createLocationBreakpoint( cdiBreakpoint );

####### Requires job to execute on workspace job queue #####

			}
			if ( breakpoint != null )
				getBreakpointMap().put( breakpoint, cdiBreakpoint );
		}

	public CBreakpoint( final IResource resource, final String markerType, final Map attributes, final boolean add ) throws CoreException {
		this();
		IWorkspaceRunnable wr = new IWorkspaceRunnable() {

			public void run( IProgressMonitor monitor ) throws CoreException {
				// create the marker
				setMarker( resource.createMarker( markerType ) );
				// set attributes
				ensureMarker().setAttributes( attributes );
				//set the marker message
				setAttribute( IMarker.MESSAGE, getMarkerMessage() );
				// add to breakpoint manager if requested
				register( add );
			}
		};

		run( wr ); 							
##### Cannot execute due to AutoBuildJob waiting to complete ##########
	}
Comment 1 Pawel Piech CLA 2006-06-08 19:19:17 EDT
I wonder if fixing the bug 138473, which is what I ran into a while ago, might also address this problem.  Autobuild job is actually run after every resource change, whether autobuild is turned on or not.  I suggested changing the platform breakpoint manager to issue breakpoint changed events immediately after each change, rather than waiting for the auto-build job to run.
Unfortunatley, this would only take the build job out of the picture, if other threads happen to be modifying breakpoint markers and locking the C breakpoint map, the dead-lock will still be a problem.
Comment 2 Ryan Hapgood CLA 2006-06-08 20:42:52 EDT
> Autobuild job is actually run after every resource
> change, whether autobuild is turned on or not.  

I can confirm this is indeed the case, and although the build is not actually performed the PRE and POST build events are fired anyway (a bit misleading I feel), so the problem still happens.

> I wonder if fixing the bug 138473, which is what I 
> ran into a while ago, might also address this problem

It may, but as you say, does not address the heart of the issue. [ie. Please do mark this as a duplicate]

I'm currently looking for a possible workaround that may prevent the deadlock so we (HI-TECH) can do a release (it happens quite regularly on a first run from a fresh install) but feel the only real way to fix the problem is to avoid queuing a job inside the synchronise. Mikhail seemed to agree on the cdt-dev mailing list, however it is not as simple as it sounds and the changes may take a while.

Comment 3 Nobody - feel free to take it CLA 2006-06-09 13:55:53 EDT
Ryan, as I mentioned in the mailing list, I agree that it is not a good idea to create a job from a synchronized block. And this needs to be fixed. But I can't reproduce the deadlock on my machine. Maybe it's just a timing issue. But there is one thing that puzzles me: the CBreakpoint object should be created as soon as you add a breakpoint. When doHandleCreatedEvent is called the corresponding CBreakpoint has already been created and there is no need to create it again. Why the CBreakpoint object doesn't exist in your case?
Comment 4 Ryan Hapgood CLA 2006-06-12 17:57:10 EDT
Mikhail, I've been wondering that myself and after looking into it I've found that the problem manifests itself when the launch itself adds breakpoints. A colleague (who just so happens to be on holidays) allowed for further embedded specific options to our launch (eg. breakpoint at reset vector) and it seems adding these breakpoints cause the problem. This means it is probably not going to happen to most CDT users. I have found a way, in our code, to avoid the problem, so I guess it can be ignored?

Comment 5 Nobody - feel free to take it CLA 2006-06-13 10:50:45 EDT
(In reply to comment #4)
Yes, that's what I was suspecting. This can only happen when a breakpoint is set internally during the launching. It's still a legitimate issue, but it is better to address it in another release, probably, 4.0.
Comment 6 James Blackburn CLA 2006-09-26 06:04:39 EDT
Ah.  I'm seeing exactly this issue - though without (explicitly) setting breakpoints in the launch configuration.

Running multiple GDB sessions in a s single launch (connecting to multiple cores), if any breakpoints are enabled Eclipse deadlocks repeatably for the very same reason.

Can anyone think of a solution?
Comment 7 Nobody - feel free to take it CLA 2007-06-18 10:39:29 EDT
Returning to the pool.
Comment 8 Rich Wagner CLA 2007-09-24 13:53:06 EDT
I just voted for this, since a developer where I work (Tilera Corp; www.tilera.com) just ran into this.

Can anyone suggest a work-around which I can either pass on to the Eclipse/CDT user, or a work-around that I can somehow include in the code?  (We're extending Eclipse 3.2 and CDT 3.1.2, but haven't changed the CDT code base in our extensions, so a code-level workaround that could be accomplished using some extension mechanism - XML extension, new adapter, etc. - would be preferable.  Thanks; Rich).
Comment 9 Rich Wagner CLA 2007-09-24 15:01:53 EDT
P.S. As James Blackburn writes in comment #6, we at Tilera are also "running multiple GDB sessions in a single launch (connecting to multiple cores)".

To elaborate: our applications typically consist of a single "starter" process which "exec"s other worker processes on multiple cores.  Our Eclipse/CDT extensions detect such "exec"s and provide ways for GDB instances to attach to some or all of those child processes dynamically: all as part of a single launch.  So I suspect this is why we're seeing this...

Comment 10 James Blackburn CLA 2007-09-24 18:01:11 EDT
Hi Rich,

I did have similar difficulties.  By changing the code a bit I got it into a state where it didn't crash -- I did this a year or so ago, and to be perfectly honest it's probably not the way it should be done/correct, however it doesn't deadlock on launch, so I haven't looked at it since...

What we're doing is something similar to this:

for (cores) {
  <create process>
  <set configuration>
  CDIDebugModel.newDebugTarget(..., resumeTarget = false, ...)

  <post customGDB CLI commands>
  
  //Create attach message.
  CLITargetAttach ta = target.getMISession().getCommandFactory().createCLITargetAttach(0);

  try {
    // Set inferior to suspended (so we are allowed to send the attach message
    target.getMISession().getMIInferior().setSuspended();
    target.getMISession().postCommand(ta, -1);
    target.getMISession().getMIInferior().setRunning();
    //This tells the world that the target has started
    //Put it here to prevent Deadlock... Very very fragile...
    MIEvent event = new MIRunningEvent(target.getMISession(),0,MIRunningEvent.CONTINUE);
    target.getMISession().fireEvent(event);				 
  } catch (MIException e) {}
}
					
The comments are as they were left when I last touched it.  Your problem sounds similar -- i.e. attaching to an already running remote target.  (The nonsense of sending an event isn't required when starting the target from scratch target.restart()).  I found many other permutation of connecting to already running targets caused the deadlock.

Hope this is half helpful.

Cheers,

James
Comment 11 Rich Wagner CLA 2007-09-25 10:51:06 EDT
Hi James,

Thanks for the info.  Alas, my code is sufficiently different that I can't really try to use your method.  In particular, when we dynamically attach a new GDB instance to a process, our lower-level (Eclipse-external) target management stuff has arranged for the process to be suspended while Eclipse performs the attach.  It looks like your code attaches GDB to a process in the running state (?).
Comment 12 Rich Wagner CLA 2007-09-25 11:28:57 EDT
I'm going to post the tracebacks of the two threads I'm seeing deadlock because of this bug, in the hope that someone might be able to suggest a work-around for me.  Or maybe (?) this will provide enough information for Mikhail K, given he hasn't been able to reproduce this.

Looking at these tracebacks myself, the main thing that puzzles and concerns me is: Why does the following method use "asyncExec"?  It's true that "fireBreakpointChanged" MAY notify a listener whose code needs to run in the UI thread.  But isn't it the responsibility of any PARTICULAR listener to wrap its code in "[a]syncExec" if that's true?  It seems like having "asyncExec" at this level locks the UI thread for longer than necessary:

   public void breakpointInstalled( final IDebugTarget target, IBreakpoint breakpoint ) {
      if ( breakpoint instanceof ICBreakpoint && target instanceof ICDebugTarget ) {
         final ICBreakpoint b = (ICBreakpoint)breakpoint;
         asyncExec( new Runnable() {
   
            public void run() {
               try {
                  if ( b.incrementInstallCount() == 1 )
                     DebugPlugin.getDefault().getBreakpointManager().fireBreakpointChanged( b );
               }
               catch( CoreException e ) {
                  CDebugUIPlugin.log( e.getStatus() );
               }
            }
         } );
      }
   }

(BTW, a really helpful trick is: sending a QUIT signal to the JVM for an Eclipse instance causes the JVM to dump stack traces for all its threads, *INCLUDING* information about who owns what locks, who's waiting for locks, etc.).

First, an "MI Event Thread", which owns "0xf0a110a8" (a CBreakpointManager$BreakpointMap), and is waiting for the UI thread to so a "syncExec":

"MI Event Thread" prio=1 tid=0xc8dfc940 nid=0x7176 in Object.wait() [0xc67fc000..0xc67fd130]
	at java.lang.Object.wait(Native Method)
	- waiting on <0xef92fe88> (a org.eclipse.ui.internal.Semaphore)
	at org.eclipse.ui.internal.Semaphore.acquire(Semaphore.java:43)
	- locked <0xef92fe88> (a org.eclipse.ui.internal.Semaphore)
	at org.eclipse.ui.internal.UISynchronizer.syncExec(UISynchronizer.java:46)
	at org.eclipse.swt.widgets.Display.syncExec(Display.java:3720)
	at org.eclipse.cdt.make.ui.MakeContentProvider.resourceChanged(MakeContentProvider.java:239)
	at org.eclipse.core.internal.events.NotificationManager$2.run(NotificationManager.java:280)
	at org.eclipse.core.runtime.SafeRunner.run(SafeRunner.java:37)
	at org.eclipse.core.internal.events.NotificationManager.notify(NotificationManager.java:274)
	at org.eclipse.core.internal.events.NotificationManager.broadcastChanges(NotificationManager.java:148)
	at org.eclipse.core.internal.resources.Workspace.broadcastPostChange(Workspace.java:256)
	at org.eclipse.core.internal.resources.Workspace.endOperation(Workspace.java:958)
	at org.eclipse.core.internal.resources.Workspace.run(Workspace.java:1746)
	at org.eclipse.core.internal.resources.Workspace.run(Workspace.java:1719)
	at org.eclipse.cdt.debug.internal.core.breakpoints.CBreakpoint.run(CBreakpoint.java:188)
	at org.eclipse.cdt.debug.internal.core.breakpoints.CBreakpoint.<init>(CBreakpoint.java:68)
	at org.eclipse.cdt.debug.internal.core.breakpoints.AbstractLineBreakpoint.<init>(AbstractLineBreakpoint.java:45)
	at org.eclipse.cdt.debug.internal.core.breakpoints.CLineBreakpoint.<init>(CLineBreakpoint.java:38)
	at org.eclipse.cdt.debug.core.CDIDebugModel.createLineBreakpoint(CDIDebugModel.java:198)
	at org.eclipse.cdt.debug.internal.core.CBreakpointManager.createLineBreakpoint(CBreakpointManager.java:706)
	at org.eclipse.cdt.debug.internal.core.CBreakpointManager.createLocationBreakpoint(CBreakpointManager.java:683)
	at org.eclipse.cdt.debug.internal.core.CBreakpointManager.doHandleLocationBreakpointCreatedEvent(CBreakpointManager.java:423)
	- locked <0xf0a110a8> (a org.eclipse.cdt.debug.internal.core.CBreakpointManager$BreakpointMap)
	at org.eclipse.cdt.debug.internal.core.CBreakpointManager.handleBreakpointCreatedEvent(CBreakpointManager.java:410)
	at org.eclipse.cdt.debug.internal.core.CBreakpointManager.handleDebugEvents(CBreakpointManager.java:340)
	at org.eclipse.cdt.debug.mi.core.cdi.EventManager.fireEvents(EventManager.java:249)
	at org.eclipse.cdt.debug.mi.core.cdi.EventManager.update(EventManager.java:217)
	at java.util.Observable.notifyObservers(Unknown Source)
	at org.eclipse.cdt.debug.mi.core.MISession.notifyObservers(MISession.java:762)
	at org.eclipse.cdt.debug.mi.core.EventThread.run(EventThread.java:46)


Then there's the "main" thread.  It looks to me like it "owns" the UI thread, given it's in "Synchronizer.runAsyncMessages".  But in trying to execute one of those async runnables, it's waiting to lock the 0xf0a110a8/BreakpointMap object owned by the previous thread.

"main" prio=1 tid=0x0805d660 nid=0x7126 waiting for monitor entry [0xffffa000..0xffffb2c8]
	at org.eclipse.cdt.debug.internal.core.CBreakpointManager.breakpointsChanged(CBreakpointManager.java:300)
	- waiting to lock <0xf0a110a8> (a org.eclipse.cdt.debug.internal.core.CBreakpointManager$BreakpointMap)
	at org.eclipse.debug.internal.core.BreakpointManager$BreakpointsNotifier.run(BreakpointManager.java:865)
	at org.eclipse.core.runtime.SafeRunner.run(SafeRunner.java:37)
	at org.eclipse.debug.internal.core.BreakpointManager$BreakpointsNotifier.notify(BreakpointManager.java:884)
	at org.eclipse.debug.internal.core.BreakpointManager.fireUpdate(BreakpointManager.java:738)
	at org.eclipse.debug.internal.core.BreakpointManager.fireBreakpointChanged(BreakpointManager.java:510)
	at org.eclipse.cdt.debug.internal.ui.CBreakpointUpdater$1.run(CBreakpointUpdater.java:62)
	at org.eclipse.swt.widgets.RunnableLock.run(RunnableLock.java:35)
	at org.eclipse.swt.widgets.Synchronizer.runAsyncMessages(Synchronizer.java:123)
	- locked <0xf0c9e2a0> (a org.eclipse.swt.widgets.RunnableLock)
	at org.eclipse.swt.widgets.Display.runAsyncMessages(Display.java:3141)
	at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:2843)
	at org.eclipse.ui.internal.Workbench.runEventLoop(Workbench.java:1914)
	at org.eclipse.ui.internal.Workbench.runUI(Workbench.java:1878)
	at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:419)
	at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:149)
	at org.eclipse.ui.internal.ide.IDEApplication.run(IDEApplication.java:95)
	at org.eclipse.core.internal.runtime.PlatformActivator$1.run(PlatformActivator.java:78)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:92)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:68)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:400)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:177)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.lang.reflect.Method.invoke(Unknown Source)
	at org.eclipse.core.launcher.Main.invokeFramework(Main.java:336)
	at org.eclipse.core.launcher.Main.basicRun(Main.java:280)
	at org.eclipse.core.launcher.Main.run(Main.java:977)
	at org.eclipse.core.launcher.Main.main(Main.java:952)
Comment 13 Andrew Camm CLA 2009-03-12 05:22:55 EDT
I have a similar problem still with CDT 5.0.2

In my case I have to check the "search for duplicate files" in the launch configuration; this causes a UI dialog box to pop up during debug launch asking me which of the files I meant the breakpoint to be on. This is done from the MI Event Thread, which while synchronized on getBreakpointMap() needs to gain access to the UI, which is owned by the main thread. However the main thread also gets a breakpointsChanged event, which synchronizes on getBreakpointMap() - depending on timing this can result in deadlock.

(Note that I dont actually have any duplicate files in the workspace, but I have two projects in the workspace, one to build a library of the main project sources, one to build a unit test executable. The unit test code exists in a subdirectory of the main project. The CDT debug code sees the unit test source files as existing in both projects even though the unit test sources are excluded from the main project build. The "search for duplicate files" checkbox lets me tell it which workspace-relative path to use) 


Thread [MI Event Thread] (Suspended)	
	Object.wait(long) line: not available [native method]	
	RunnableLock(Object).wait() line: 485 [local variables unavailable]	
	UISynchronizer(Synchronizer).syncExec(Runnable) line: 185	
	UISynchronizer.syncExec(Runnable) line: 150	
	Display.syncExec(Runnable) line: 4251	
	Prompter.handleStatus(IStatus, Object) line: 79	
	CSourceLookupDirector(AbstractSourceLookupDirector).resolveSourceElement(Object, List) line: 513	
	CSourceLookupDirector(AbstractSourceLookupDirector).getSourceElement(Object) line: 741	
	CBreakpointManager.getSourceElement(String) line: 1373	
	CBreakpointManager$BreakpointMap.isSameBreakpoint(ICBreakpoint, ICDIBreakpoint) line: 220	
	CBreakpointManager$BreakpointMap.getCBreakpoint(ICDIBreakpoint) line: 158	
	CBreakpointManager.doHandleLocationBreakpointCreatedEvent(ICDILocationBreakpoint) line: 523	
	CBreakpointManager.handleBreakpointCreatedEvent(ICDIBreakpoint) line: 481	
	CBreakpointManager.handleDebugEvents(ICDIEvent[]) line: 385	
	EventManager.fireEvents(ICDIEvent[]) line: 249	
	EventManager.update(Observable, Object) line: 217	
	MISession(Observable).notifyObservers(Object) line: not available	
	MISession.notifyObservers(Object) line: 768	
	EventThread.run() line: 46


Thread [main] (Suspended)	
	CBreakpointManager.breakpointsChanged(IBreakpoint[], IMarkerDelta[]) line: 345	
	BreakpointManager$BreakpointsNotifier.run() line: 946	
	SafeRunner.run(ISafeRunnable) line: 37	
	BreakpointManager$BreakpointsNotifier.notify(IBreakpoint[], IMarkerDelta[], int) line: 965	
	BreakpointManager.fireUpdate(List, List, int) line: 819	
	BreakpointManager.fireBreakpointChanged(IBreakpoint) line: 578	
	CBreakpointUpdater$1.run() line: 62	
	RunnableLock.run() line: 35	
	UISynchronizer(Synchronizer).runAsyncMessages(boolean) line: 133	
	Display.runAsyncMessages(boolean) line: 3800	
	Display.readAndDispatch() line: 3425	
	Workbench.runEventLoop(Window$IExceptionHandler, Display) line: 2382	
	Workbench.runUI() line: 2346	
	Workbench.access$4(Workbench) line: 2198	
	Workbench$5.run() line: 493	
	Realm.runWithDefault(Realm, Runnable) line: 288	
	Workbench.createAndRunWorkbench(Display, WorkbenchAdvisor) line: 488	
	PlatformUI.createAndRunWorkbench(Display, WorkbenchAdvisor) line: 149	
	IDEApplication.start(IApplicationContext) line: 113	
	EclipseAppHandle.run(Object) line: 193	
	EclipseAppLauncher.runApplication(Object) line: 110	
	EclipseAppLauncher.start(Object) line: 79	
	EclipseStarter.run(Object) line: 386	
	EclipseStarter.run(String[], Runnable) line: 179	
	NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]	
	NativeMethodAccessorImpl.invoke(Object, Object[]) line: not available	
	DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: not available	
	Method.invoke(Object, Object...) line: not available	
	Main.invokeFramework(String[], URL[]) line: 549	
	Main.basicRun(String[]) line: 504	
	Main.run(String[]) line: 1236	
	Main.main(String[]) line: 1212		
    
    
   Thread [Worker-6] (Suspended)	
	CBreakpointManager.breakpointsChanged(IBreakpoint[], IMarkerDelta[]) line: 345	
	BreakpointManager$BreakpointsNotifier.run() line: 946	
	SafeRunner.run(ISafeRunnable) line: 37	
	BreakpointManager$BreakpointsNotifier.notify(IBreakpoint[], IMarkerDelta[], int) line: 965	
	BreakpointManager.fireUpdate(List, List, int) line: 819	
	BreakpointManager.access$3(BreakpointManager, List, List, int) line: 806	
	BreakpointManager$BreakpointManagerVisitor.update() line: 675	
	BreakpointManager.resourceChanged(IResourceChangeEvent) line: 611	
	NotificationManager$2.run() line: 288	
	SafeRunner.run(ISafeRunnable) line: 37	
	NotificationManager.notify(ResourceChangeListenerList$ListenerEntry[], IResourceChangeEvent, boolean) line: 282	
	NotificationManager.broadcastChanges(ElementTree, ResourceChangeEvent, boolean) line: 148	
	Workspace.broadcastBuildEvent(Object, int, int) line: 297	
	AutoBuildJob.doBuild(IProgressMonitor) line: 143	
	AutoBuildJob.run(IProgressMonitor) line: 238	
	Worker.run() line: 55