Bug 566861 - Deadlock canceling indexer while closing project
Status: NEW
Alias: None
Product: CDT
Classification: Tools
Component: cdt-indexer
Version: 9.4.3
Hardware: PC Windows 10
Importance: P3 normal
Target Milestone: ---
Assignee: Project Inbox CLA
QA Contact: Jonah Graham CLA
 
Reported: 2020-09-10 10:30 EDT by Christian Walther CLA
Modified: 2020-09-10 10:30 EDT

Attachments
stack dump (18.69 KB, text/plain)
2020-09-10 10:30 EDT, Christian Walther CLA

Description Christian Walther CLA 2020-09-10 10:30:45 EDT
Created attachment 284105
stack dump

I have a report of a deadlock from a user where they were closing a project while the indexer was working. (Similar situation to https://git.eclipse.org/r/c/cdt/org.eclipse.cdt/+/90883/ and to bug 424466 and bug 327126 but not exactly the same.)

The stack dump (attached) indicates that Worker-11, while processing the project closing, was canceling the indexer and waiting for its job to exit, while running in a job using some workspace scheduling rule. At the same time, the indexer job itself in Worker-4 was waiting to acquire the workspace root scheduling rule and therefore never exited.
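
To make the lock ordering concrete, here is a minimal sketch of the situation as I understand it from the dump. It is a simplified model and not CDT or Eclipse code: a ReentrantLock stands in for the workspace root scheduling rule, plain threads stand in for the jobs, and the class and method names are made up for illustration. The wait is bounded by a timeout only so that the demo terminates; as far as I can tell the real wait has no such bound.

```java
import java.util.concurrent.locks.ReentrantLock;

// Simplified model only: a ReentrantLock stands in for the workspace root
// scheduling rule, and plain threads stand in for Eclipse jobs.
class IndexerDeadlockModel {
    static final ReentrantLock workspaceRule = new ReentrantLock();

    /**
     * Plays the role of the project-close worker: holds the rule and waits for
     * the indexer thread to exit. Returns true if the indexer exited within
     * timeoutMillis (must be > 0, since 0 would mean "wait forever").
     */
    static boolean indexerExitsWhileRuleHeld(long timeoutMillis) {
        Thread indexer = new Thread(() -> {
            // The indexer needs the rule for a workspace operation before it can exit.
            workspaceRule.lock();
            try {
                // the workspace operation would go here
            } finally {
                workspaceRule.unlock();
            }
        }, "indexer");

        boolean exited;
        workspaceRule.lock(); // the project close runs while holding the rule
        try {
            indexer.start();
            // Bounded wait so the demo does not hang; an unbounded wait here
            // would never return, because the indexer is blocked on the rule.
            indexer.join(timeoutMillis);
            exited = !indexer.isAlive();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            workspaceRule.unlock(); // only now can the indexer proceed
        }
        joinQuietly(indexer);
        return exited;
    }

    // Waits for the thread to finish once the rule has been released.
    static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("indexer exited while rule was held: "
                + indexerExitsWhileRuleHeld(200));
    }
}
```

In this model the indexer can only finish after the closing side gives up waiting and releases the rule, which is the cycle the stack dump suggests.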

It probably can’t be avoided that the indexer will occasionally do workspace operations (as the commit linked above says, “Because so much code can be hooked up to the C model (extensions, listeners, etc), it is difficult to guarantee that this will not happen.”), so can the deadlock be broken on the other side?

I am wondering whether in this case, when called via PDOMManager.preRemoveProject -> PDOMManager.stopIndexer -> PDOMManager.cancelIndexerJobs -> PDOMIndexerJob.cancelJobs, PDOMIndexerJob.cancelJobs() should be called with argument waitUntilCancelled = false rather than true, so that the PRE_CLOSE notification does not have to wait for the indexer to exit?
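
A rough sketch of what I mean, again as a simplified model rather than the actual PDOMIndexerJob code: the waitUntilCancelled parameter mirrors the argument named above, but ModelIndexerJob, closeProject and the timeout are hypothetical stand-ins, and a ReentrantLock again plays the workspace root scheduling rule.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Simplified model only; none of these types are CDT classes.
class NonBlockingCancelModel {
    static final ReentrantLock workspaceRule = new ReentrantLock();

    /** Hypothetical stand-in for an indexer job. */
    static final class ModelIndexerJob extends Thread {
        final AtomicBoolean cancelled = new AtomicBoolean();

        @Override
        public void run() {
            while (!cancelled.get()) {
                Thread.yield(); // pretend to index until cancellation is requested
            }
            // Before exiting, the job still performs one workspace operation.
            workspaceRule.lock();
            try {
                // the workspace operation would go here
            } finally {
                workspaceRule.unlock();
            }
        }
    }

    /**
     * Models the pre-close notification. Returns true if it completed without
     * stalling on the indexer, false if the bounded wait timed out.
     */
    static boolean closeProject(boolean waitUntilCancelled, long timeoutMillis) {
        ModelIndexerJob job = new ModelIndexerJob();
        boolean completedWithoutStall = true;
        try {
            workspaceRule.lock(); // the notification runs while holding the rule
            try {
                job.start();
                job.cancelled.set(true); // request cancellation
                if (waitUntilCancelled) {
                    // Bounded only for the demo; an unbounded wait would hang here.
                    job.join(timeoutMillis);
                    completedWithoutStall = !job.isAlive();
                }
            } finally {
                workspaceRule.unlock();
            }
            job.join(); // with the rule released, the job can finish on its own
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completedWithoutStall;
    }

    public static void main(String[] args) {
        System.out.println("waiting:     " + closeProject(true, 200));
        System.out.println("not waiting: " + closeProject(false, 200));
    }
}
```

The sketch only shows that the cycle disappears when the notification does not wait. It says nothing about whether later code in preRemoveProject() tolerates an indexer job that is still winding down.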

I am not familiar enough with the indexer, however, to judge whether anything in PDOMManager.preRemoveProject() that comes after the call to stopIndexer() relies on the indexer job being gone synchronously. I also don’t know which other methods in the call hierarchy of cancelJobs() really need the synchronous behavior, in other words, how far up the hierarchy the waitUntilCancelled argument would need to be pulled.

Bug 327126 comment #1 says “The indexer needs to be cancelled during the pre-delete notification. Otherwise, the indexer runs into exceptions working on resources that no longer exist.”, but I’m not sure that is still true. The indexer running into exceptions on resources that no longer exist is normal and happens all the time, e.g. when resource filters cause resources to disappear or when files are deleted externally, and as of the fix to bug 352952 is handled gracefully.
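
For illustration, the kind of tolerance I have in mind looks roughly like the sketch below. It uses plain java.nio.file rather than the Eclipse resource API, and TolerantIndexerModel, indexAll and demo are hypothetical names; it is not how the fix to bug 352952 is implemented, only the general pattern of skipping a resource that disappeared instead of failing the whole run.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Simplified model only; not the CDT indexer.
class TolerantIndexerModel {
    /** Reads every file that still exists and returns how many were processed. */
    static int indexAll(List<Path> files) {
        int indexed = 0;
        for (Path file : files) {
            try {
                Files.readAllBytes(file); // stand-in for parsing the file
                indexed++;
            } catch (NoSuchFileException e) {
                // The resource vanished (project closed, file deleted externally): skip it.
            } catch (IOException e) {
                // Other I/O problems are also skipped in this sketch.
            }
        }
        return indexed;
    }

    /** Runs indexAll on one existing temporary file and one missing path. */
    static int demo() {
        try {
            Path existing = Files.createTempFile("indexer-model", ".c");
            Path missing = existing.resolveSibling("missing-" + System.nanoTime() + ".c");
            try {
                return indexAll(Arrays.asList(existing, missing));
            } finally {
                Files.deleteIfExists(existing);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("files indexed: " + demo());
    }
}
```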