Bug 190277 - Stopping a debug session causes all user processes to terminate
Summary: Stopping a debug session causes all user processes to terminate
Status: RESOLVED FIXED
Alias: None
Product: CDT
Classification: Tools
Component: cdt-debug
Version: 3.1.2
Hardware: Sun Solaris-GTK
Importance: P3 major with 1 vote
Target Milestone: 4.0.1
Assignee: Anton Leherbauer CLA
QA Contact:
URL:
Whiteboard:
Keywords: contributed
Depends on:
Blocks:
 
Reported: 2007-05-31 12:54 EDT by Ali Ghorashi CLA
Modified: 2008-06-22 02:18 EDT

See Also:


Attachments
Workspace and instructions used to recreate this bug. (651.32 KB, application/octet-stream)
2007-05-31 13:03 EDT, Ali Ghorashi CLA
This patch solves the bug (9.98 KB, patch)
2007-08-24 12:25 EDT, Piotr Kundu CLA
bjorn.freeman-benson: iplog+

Description Ali Ghorashi CLA 2007-05-31 12:54:34 EDT
Build ID: 3.2.2 CDT 3.1.2

Steps To Reproduce:
1. Load the attached workspace on Solaris (9 or 10).
2. Build.
3. Run Debug.
4. Click on Run->Terminate.

All user programs will be terminated.


More information:
Originally, I thought this was a problem with Sun's X implementation because my X session kept getting killed while I was in the middle of debugging (quite annoying). So after going back and forth with Sun for about two months, they pinned it down to the following:


Engineering was able to reproduce your problem with the Eclipse debugger by following your instructions.  This is the response that I got:

"I noted that all user processes get killed when I click Run -> Terminate.  A truss of the java process running eclipse shows:

1468/17:        kill(-1, SIGINT)      = 0

This means that all user processes are going to be killed.  I believe that the problem is being caused by the Eclipse debugger and not by any desktop products or libraries.

Please raise this ticket to the owners of the Eclipse debugger for further analysis."
Comment 1 Ali Ghorashi CLA 2007-05-31 13:03:38 EDT
Created attachment 69559
Workspace and instructions used to recreate this bug. 

This is just a simple client server application. Instructions on how to recreate this problem are also included in the zip file.
Comment 2 Sven Lundblad CLA 2007-06-27 04:14:24 EDT
We ran into the same issue, and it looks like a CDT/GDB interaction problem.

When launching, CDT issues an "info program" GDB command, parses the output to get the pid of the debuggee, and stores the pid in the org.eclipse.cdt.debug.mi.core.MIInferior object (this is done from the update() method of MIInferior). The output of "info program" is not what CDT expects; in fact, the output from GDB varies between GDB versions and hosts. On Solaris the main LWP id is given, typically 1.

When terminating the debug session, CDT first tries to interrupt GDB in MIProcessAdapter.interrupt(). If the MIInferior thinks the debuggee is still running, it then tries to send a SIGINT directly to the debuggee through org.eclipse.cdt.utils.spawner.Spawner.raise(), using the pid stored in the MIInferior object. This is where it goes badly wrong, because the pid is "1". Somehow, through native code, the result is a kill on pid -1, which kills all of the user's processes.

If the call to raise in MIProcessAdapter.interrupt() is removed then the bad kill will not happen.
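As an illustration of the failure mode described above, here is a minimal sketch of a defensive check that could sit in front of any direct signal delivery. The class and method names (SignalGuard, isSafeToSignal) are hypothetical and are not part of CDT. How a stored pid of 1 ends up as kill(-1, SIGINT) in the native code is not established in this thread, so the sketch only rejects the values that kill(2) does not treat as a single ordinary process.

```java
/** Hypothetical guard for illustration only; this class does not exist in CDT. */
final class SignalGuard {
    private SignalGuard() {}

    /**
     * Returns true only for pids that can plausibly identify a single debuggee.
     * In POSIX kill(2), pid 0 addresses the caller's process group, pid -1
     * addresses every process the caller may signal, and other negative values
     * address a process group. Pid 1 is init, or, as in this bug, an LWP id
     * that was mistaken for a pid.
     */
    static boolean isSafeToSignal(int pid) {
        return pid > 1;
    }

    public static void main(String[] args) {
        // 19127 is the pid shown elsewhere in this report; the others are the dangerous cases.
        for (int pid : new int[] {-1, 0, 1, 19127}) {
            System.out.println(pid + " -> " + isSafeToSignal(pid));
        }
    }
}
```

A caller would skip the direct signal and rely on the debugger whenever the check returns false.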
Comment 3 Sven Lundblad CLA 2007-06-27 09:28:42 EDT
A somewhat cleaner fix than removing the call to raise might be to make the parseLine() method in org.eclipse.cdt.debug.mi.core.output.CLIInfoProgramInfo stricter. We ended up adding these lines in the while loop:
/* Not a process id if LWP is reported */
if (s.equals("LWP")) break;

This will cause the pid to be marked as illegal in MIInferior and raise will not be called.
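For readers without the CDT source at hand, the effect of that check can be sketched as a standalone parser. This is an assumption-laden illustration, not the actual CLIInfoProgramInfo code: the class name, the NO_PID constant, and the surrounding loop are hypothetical, and only the "LWP" check is taken from this comment.

```java
import java.util.StringTokenizer;

/** Hypothetical standalone sketch; not the real CLIInfoProgramInfo. */
final class InfoProgramPidSketch {
    /** Marker for "no usable pid found" (assumed convention). */
    static final int NO_PID = -1;

    private InfoProgramPidSketch() {}

    /** Extracts a pid from one line of "info program" output, or returns NO_PID. */
    static int parsePid(String line) {
        if (line == null) {
            return NO_PID;
        }
        // Drop the trailing period so it does not stick to the last token.
        String str = line.replace('.', ' ').trim();
        if (!str.startsWith("Using")) {
            return NO_PID;
        }
        StringTokenizer st = new StringTokenizer(str);
        while (st.hasMoreTokens()) {
            String s = st.nextToken();
            // Not a process id if LWP is reported: stop before reading the number.
            if (s.equals("LWP")) {
                break;
            }
            if (Character.isDigit(s.charAt(0))) {
                try {
                    return Integer.parseInt(s);
                } catch (NumberFormatException e) {
                    // Token only starts with a digit; keep scanning.
                }
            }
        }
        return NO_PID;
    }

    public static void main(String[] args) {
        // Solaris-style line as reported for this bug.
        System.out.println(parsePid("\tUsing the running image of child LWP 1 via /proc."));
        // Assumed example of a line that does name a process id.
        System.out.println(parsePid("\tUsing the running image of child process 12345."));
    }
}
```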

A few questions:
 - Should CDT really use the "info program" command, since the output varies so much between different GDBs?
 - Is it wise for CDT to operate directly on the debugged process without going through the debugger? Assume that the debugged process is not local.
Comment 4 Piotr Kundu CLA 2007-08-17 09:17:43 EDT
With the patch suggested by Sven in comment #3 applied, I am no longer logged out all the time on Solaris. I found that CDT, with the patch, is able to kill the GDB process after several attempts to terminate, but the debuggee continues to run even though CDT indicates that it is not running.

If the developer then launches a new debug session and the first, still-running debuggee is using resources that are needed by the second, there will be issues.

GDB (on Solaris), while running and debugging a local application, responds to SIGSTOP, SIGTERM or SIGKILL, but NOT to the attempted SIGINT. In any case, terminating GDB does not terminate the debuggee. Realizing that barking up the wrong tree (GDB) would not get me anywhere, my first step was to find another way to get the PID in order to kill the debuggee. The CLI command "info proc" looks like a good starting point; it returns a string:

(gdb) info proc
process 19127 flags:
PR_STOPPED Process (LWP) is stopped
PR_ISTOP Stopped on an event of interest
PR_RLC Run-on-last-close is in effect
PR_FAULTED : Incurred a traced hardware fault FLTBPT: Breakpoint trap

I've tested this on Solaris 9 using GDB 5.3, 6.3, 6.4 and 6.6, and all of them produce output similar to the above, always starting with the string "process", which can be used for parsing out the PID that follows "process ".

"info proc" is NOT supported on Windows, so "info program" will be used if "info proc" fails. 

Implementation will be described later.
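Until then, the parsing idea can be sketched roughly as follows. This is not the code from the eventual patch: the class name and the NO_PID constant are hypothetical, and the only assumption about the output is the one stated above, namely that a line starts with "process" followed by the PID.

```java
/** Hypothetical sketch of extracting the PID from "info proc" output. */
final class InfoProcPidSketch {
    /** Marker for "no usable pid found" (assumed convention). */
    static final int NO_PID = -1;

    private InfoProcPidSketch() {}

    /** Returns the PID following the word "process", or NO_PID if none is found. */
    static int parsePid(String output) {
        if (output == null) {
            return NO_PID;
        }
        for (String line : output.split("\\r?\\n")) {
            String[] tokens = line.trim().split("\\s+");
            if (tokens.length >= 2 && tokens[0].equals("process")) {
                try {
                    int pid = Integer.parseInt(tokens[1]);
                    if (pid > 0) {
                        return pid;
                    }
                } catch (NumberFormatException e) {
                    // "process" was not followed by a number; try the next line.
                }
            }
        }
        return NO_PID;
    }

    public static void main(String[] args) {
        String sample = "process 19127 flags:\n"
                + "PR_STOPPED Process (LWP) is stopped\n"
                + "PR_ISTOP Stopped on an event of interest\n";
        System.out.println(parsePid(sample));
    }
}
```

Returning NO_PID when nothing matches gives the caller a clear signal to fall back to "info program", for example on Windows.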
Comment 5 Piotr Kundu CLA 2007-08-24 12:25:14 EDT
Created attachment 76922
This patch solves the bug

Here is the suggested implementation, with 2 new classes and changes in 3 other classes, all in the org.eclipse.cdt.debug.mi.core component.

1. The update() method of org.eclipse.cdt.debug.mi.core.MIInferior has been changed to use "info proc" first and to fall back to "info program" only if that fails.

2. The parseLine() method of org.eclipse.cdt.debug.mi.core.output.CLIInfoProgramInfo has been changed to stop parsing if "LWP" is found in the string.

3. A new method createCLIInfoProc() has been added in org.eclipse.cdt.debug.mi.core.command.CommandFactory just prior to createCLIInfoProgram(). 

4. A new class org.eclipse.cdt.debug.mi.core.command.CLIInfoProc has been added.
5. A new class org.eclipse.cdt.debug.mi.core.output.CLIInfoProcInfo has been added.
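
The lookup order described in this comment can be sketched as below. This is only an illustration of the "try info proc, then info program" order, not the code in the attached patch: PidLookupSketch, resolvePid and NO_PID are hypothetical names, and the two suppliers stand in for whatever issues the GDB commands and parses their output.

```java
import java.util.function.IntSupplier;

/** Hypothetical sketch of the lookup order; not the patched MIInferior.update(). */
final class PidLookupSketch {
    /** Marker for "no usable pid found" (assumed convention). */
    static final int NO_PID = -1;

    private PidLookupSketch() {}

    /**
     * Asks the "info proc" source first and consults the "info program"
     * source only if the first one did not produce a positive pid.
     */
    static int resolvePid(IntSupplier fromInfoProc, IntSupplier fromInfoProgram) {
        int pid = fromInfoProc.getAsInt();
        if (pid > 0) {
            return pid;
        }
        pid = fromInfoProgram.getAsInt();
        return pid > 0 ? pid : NO_PID;
    }

    public static void main(String[] args) {
        // "info proc" succeeded, so the second source is never consulted.
        System.out.println(resolvePid(() -> 19127, () -> NO_PID));
        // "info proc" unsupported (for example on Windows): fall back.
        System.out.println(resolvePid(() -> NO_PID, () -> 4242));
    }
}
```

If both sources fail, the pid stays invalid, which by the earlier analysis in this bug means raise will not be called.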
Comment 6 Anton Leherbauer CLA 2007-09-06 04:49:10 EDT
Your patch looks good.
I verified that on Solaris 'info program' does not yield the pid:

(gdb) info program
        Using the running image of child LWP 1 via /proc.
Program stopped at 0x15074.
It stopped at breakpoint 1.

I am going to apply the patch.
Comment 7 Anton Leherbauer CLA 2007-09-06 07:55:59 EDT
I applied the patch with minor modifications ($NON-NLS$ tags).
The fix will be available in 4.0.1 > I200709060101. Thanks!
Comment 8 Piotr Kundu CLA 2007-09-06 08:54:31 EDT
Cheers Anton, we really needed this one.