Community
Participate
Working Groups
(The following was found in CDT 2.0.1) There are complementary bugs in gdb and CDT that lead to gdb getting into a state where it no longer reponds to MI commands. I'll describe what leads to the problem, the source of the problem, and how it can be fixed or worked around in CDT. We use and extend CDT to debug applications that are actually implemented as shared libraries within a host application. The shared libraries are loaded when needed, and are unloaded when the user exits the "application" (which leaves the host application running and the debug session still running). After exiting and re-entering the "application" in the shared library a few times during the same debug session, gdb suddenly stops repsonding to MI commands. What is happening inside of gdb is that it is crashing because of an invalid memory access. It appears that gdb catches this hardware exception (the gdb process does not exit), but is no longer in a state to receive MI commands. gdb is crashing because some data structures contain stale pointers to de- allocated data. In particular, the "variable" data structures created in response to -var-create MI commands contain stale pointers to symbol information. The pointers are stale because of the library unloading and loading: the variable is created (resulting in incorporating pointers to symbol information in the loaded shared library file), the library is unloaded (resulting in the pointers to the symbol information becoming stale), the library is reloaded (resulting in new symbol information being loaded, but information that is NOT hooked up to the previously-created variables), and CDT tries to use the previously-created variables, leading gdb to cause a hardware exception. Naturally, it would be nice for gdb to clean up data structures that point to symbol information when the symbol table goes away. However, there is also probably some responsibility for CDT to clean up the variables it created when the shared library is unloaded. Adding support for this in CDT will also have the effect of removing any references to removed symbol information. There are three places where this cleanup appears to be needed: ExpressionManager, VariableManager, and RegisterManager. In all three cases, the added code is similar: register with the EventManager for CDI notifications, and respond to shared-library-unloaded events by deleting the appropriate MI objects: public ExpressionManager(Session session) { super(session, true); expList = Collections.synchronizedList(new ArrayList()); + session.getEventManager().addEventListener(this); } + public void handleDebugEvents(ICDIEvent[] events) { + + /* + * Iterate over all of the events, looking for one indicating that a shared + * library was unloaded. + */ + for (int ii = 0; ii < events.length; ++ii) { + ICDIEvent event = events[ii]; + if (event instanceof DestroyedEvent) { + ICDIObject source = event.getSource(); + if (source instanceof ICDISharedLibrary) { + + /* + * Iterate over all of the expressions, looking for ones associated + * with the shared library that was unloaded. + */ + try { + ICDITarget target = source.getTarget(); + ICDIExpression[] expressions = this.getExpressions(); + for (int jj = 0; jj < expressions.length; jj++) { + Expression expression = (Expression) expressions[jj]; + + /* + * If we find one, then remove it. + */ + if (expression.getTarget() == target) { + // this.removeExpression(expression.getMIVar().getVarName()); + expList.remove (expression); + removeMIVar (expression.getMIVar()); + } + } + } catch (CDIException e) { + } + } + } + } + }
Fantastic analysis. But timing is no good, I'm gone on vacation. Cc'ing Dave and Mikhail. But I'll get to it when back.
1- the code base in the head have radically change. Expressions are now evaluated when needed and do not keep the variable across suspended state. 2- Even if the code base is change, as you pointed it would be a good idea to clean. The problem we do not know what to clean ... unless we know the address of the variable and it is in the address range of the unloaded library ... lot of work. So for now we follow your approach and destroy/delete all the variable, The UI when receiving the destroy event should probably recreate the variable(Note to mikhailk). 3- Instead of making the ExpressionManager be a listener, in the head we make this in the EventManager when the unloaded event arrives. We can not rely on the order of the listener being call. 4- This PR had no milestone, so I'll assume it if for 2.1 Commited to the head.
Original submitter no longer with PalmSource. Please direct any questions about origin of bug to Brad.Jarvinen@palmsource.com (but Keith may still be in the know if you want to try him too).
Fix in the head
Note to keith .. or Brad .. they palmSource folks. The Event manager when receiving a libary Unloaded event will look at the variables(in the variableManager) and delete any variables that addresses was in the range of the sharedLibrary being unloaded. For this to work, we are assuming that "info shared" is filling the "From"(Start address) and "To"(End address) in GDB. See EventManager.processSharedLibaryUnloadedEvent()
Doesn't the shared library range simply cover the range of the the machine code itself? If so, that wouldn't cover the stack or global variables areas, which is where the variable addresses lie. Also, should the ExpressionManager and RegisterManager also be handled? I no longer have Eclipse installed, so I can't check the actual code, and could easily be overlooking some aspect of what you've just done that makes it work.
> Doesn't the shared library range simply cover the range of the the machine > code itself? If so, that > wouldn't cover the stack or global variables areas, which is where the > variable addresses lie. Sigh .... you are right, not sure what I was thinking. Then we still have a problem(Reopening library). The only way to fix this would be to destroy all variables. But for the UI(user) that would be a problem. Say you add global variable to watch, when a library is unloaded all globals will be remove including the ones that are not affected. In our enviroment that may not be a big issue since, folks do not constently dlopen/dlclose libraries. Is this true for palm environment ? > Also, should the ExpressionManager and RegisterManager also be handled? We've change the behaviour of the ExpressionManager, at every suspended state all the variables are destroyed. An expression is a code snippet that has to be evaluated at every context(stackframe) i.e. "x", may have different value in different context. For the RegisterManager, how in this PR scenario will it be affected ? Let me know asap, we need to decide if we move this PR to cdt-3.0 or strive to find a fix for 2.1
retargetting this PR for 3.0. The conditions describe below will not occur in 2.1 Expression are always deleted and globals are only globals from the executable not shared libraries since CDT-2.1 CDT-3.x we are looking at ways to add globals from specific libraries.
The code has change and this race condition is no longer a problem.
Returning to the pool.
Hi, It seems that the orignal bug reported is still valid with Eclipse 3.1.1 with CDT 3.0 (with gdb 6.3) I am also running in a Linux environment. The original bug noted that gdb was crashing when CDT was using the MI interface to update variables from shared libraries, where the shared library was loaded, unloaded and loaded again. It seems that CDT uses stale variable objects when the shared library is loaded again which causes a crash from gdb. I will file a separate bug against gdb (since it should never crash) but I don't think CDT is acting properly either. I've been able to create a pretty simple testcase. The testcase opens a library, uses a function within it and then closes the library. It does this three times. To reproduce: 1. Import the attached projects into Eclipse/CDT. There are two projects: libraryC and testcase. 2. Build libraryC. 3. Modify the testcase.c file within the testcase project to open libraryC shared library. This is necessary because that file uses an absolute pathname. 4. Build the testcase project and start a gdb debug session. 5. At this point you should be stopped in the first line of main. 6. Please make sure that the "Variables View" is visible and that the debug connection updates variables. 7. Set a breakpoint at bla.c:7. This file is from the shared library project. 8. Set a breakpoint at the end of the main function. 9. Run until the first time the breakpoint at bla.c:7 is hit. Focus on the variable view window now. Continue. Do this TWICE. 10. GDB should crash after the second time. Continue running the program and never reach the end of main to verify. Thank You, Ewa Matejska PalmSource
Created attachment 33020 [details] Testcase to reproduce the problem.
(In reply to comment #11) Ewa, Thanks for the test case. The problem is that we rely on gdb in order to decide whether a variable needs to be deleted or not. And the command is "var-update". Unfortunately, this particular command crashes gdb in your example. I disagree with the original reporter that CDT should delete all local variables when the shared library is unloaded - that's the gdb's responsibility to notify the front end. And it definitely shoudn't crash. Because we can not identify which local variable is created by the library the only possible workaround I can see now is to delete ALL local variables when any library is unloaded. But I am not sure about consequences and I think we need to investigate other possibilities, so I am setting the target milestone to 3.1.
Mikhail, I think that the solution I included in my original post works, and does so without deleting ALL local variables when any library is unloaded. Why not just use it? I'm not sure I see in your response any specific objection to the approach I took. -- Keith
Keith, I meant all local variables created for the same target, not the local variables associated with the library. This should be done in the gdb/mi implementation, not at the CDI level. Alain tried to fix it for expressions, but it doesn't work. The order of gdb commands is simply wrong.
Bugs still assigned to the inbox can not have the target milestone set. If someone is really going to fix this in 3.1, please assign it to yourself.