Bug 572324 - AdapterManager threadsafety issue
Summary: AdapterManager threadsafety issue
Status: RESOLVED FIXED
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: Components (show other bugs)
Version: 4.15   Edit
Hardware: All All
: P3 normal (vote)
Target Milestone: 4.22 M2   Edit
Assignee: Andrey Loskutov CLA
QA Contact:
URL:
Whiteboard:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2021-03-26 04:46 EDT by Olaf Titz CLA
Modified: 2021-10-13 07:21 EDT (History)
6 users (show)

See Also:


Attachments
Patch to add synchronization (1.30 KB, patch)
2021-03-26 08:53 EDT, Olaf Titz CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Olaf Titz CLA 2021-03-26 04:46:11 EDT
This applies to org.eclipse.equinox.common 3.11.0.

In a big RCP based business application, we see frequent errors of this kind:

java.util.ConcurrentModificationException
	at java.base/java.util.ArrayList$Itr.checkForComodification(Unknown Source)
	at java.base/java.util.ArrayList$Itr.next(Unknown Source)
	at org.eclipse.core.internal.runtime.AdapterManager.addFactoriesFor(AdapterManager.java:106)
	at org.eclipse.core.internal.runtime.AdapterManager.getFactories(AdapterManager.java:212)
	at org.eclipse.core.internal.runtime.AdapterManager.getAdapter(AdapterManager.java:296)

called by a class implementing IAdaptable simply as

    public <T> T getAdapter(Class<T> adapterType) {
        return Platform.getAdapterManager().getAdapter(this, adapterType);
    }

This happens on application startup, where the application does a lot of programmatic adapter factory registrations and starts background threads, also doing adapter factory registrations, mixed with getAdapter calls.

The cause of the bug is missing synchronization in certain critical sections in AdapterManager.
AdapterManager has the field

private final HashMap<String, List<IAdapterFactory>> factories;

containing mappings with List<IAdapterFactory> values. This HashMap as well as the contained ArrayList values (filled in by the public void registerFactory(IAdapterFactory factory, String adaptableType) method) are not threadsafe by themselves, so synchronization is already needed for correctness. What is happening here however is a race of the loop in the method

	private void addFactoriesFor(String typeName, Map<String, IAdapterFactory> table) {
		List<IAdapterFactory> factoryList = getFactories().get(typeName);
		if (factoryList == null)
			return;
		for (IAdapterFactory factory : factoryList) {
			// (omitted)
		}
	}

against a concurrent add operation on the same List inside of

	public void registerFactory(IAdapterFactory factory, String adaptableType) {
		List<IAdapterFactory> list = factories.get(adaptableType);
		if (list == null) {
			list = new ArrayList<>(5);
			factories.put(adaptableType, list);
		}
		list.add(factory);
	}

Not all call paths to these methods employ proper synchronization. In particular, the call path

public <T> T getAdapter(Object adaptable, Class<T> adapterType)
-> private Map<String, IAdapterFactory> getFactories(Class<? extends Object> adaptable)
-> private void addFactoriesFor(String typeName, Map<String, IAdapterFactory> table) 

is completely unsynchronized. Any registerFactory call happening while addFactoriesFor is running on the same adaptable type will break the loop in addFactoriesFor.

This (apparently very old) bug in the Eclipse Equinox framework has been severely aggravated by a recent change, which changes the loop from
		for (int i = 0; i < factoryList.size(); ++i) {
			IAdapterFactory factory = factoryList.get(i);
			// (omitted)
		}
to the current
		for (IAdapterFactory factory : factoryList) {
			// (omitted)
		}

While this is seemingly functionally equivalent, the older form has the advantage that an intervening add to the end of the list by a second thread, despite being thread-unsafe on ArrayList, with very high probability will just work. This is because the repeated calls to size will then just give the new size and the repeated calls to get will just get the new elements (and size and get in ArrayList are in fact atomic). The newer form employs a deliberately fail-fast Iterator instead.

To fix this, all methods in AdapterManager accessing the internal data structures have to be properly synchronized. Especially the methods

private Map<String, IAdapterFactory> getFactories(Class<? extends Object> adaptable) and
public void registerFactory(IAdapterFactory factory, String adaptableType)

have to be declared synchronized, but check also all other unsynchronized methods for possible conflicts.

Note that it is not sufficient to wrap the internal data structures in Collections.synchronized... forms.
Comment 1 Andrey Loskutov CLA 2021-03-26 05:06:35 EDT
Thanks Olaf for the analysis! Regression in 4.15, from https://git.eclipse.org/r/c/equinox/rt.equinox.bundles/+/152997

Yeah, seemingly "safe" code cleanups can lead to changed behavior and regressions.

@Carsten: please provide a fix or revert the affected code change.
Comment 2 Alexander Kurtakov CLA 2021-03-26 06:04:00 EDT
Olaf, do you want to try providing a patch? After all the investigations you might be able to even add a test case for the issue.
Comment 3 Olaf Titz CLA 2021-03-26 08:53:03 EDT
Created attachment 285956 [details]
Patch to add synchronization

Patch to add synchronization to the two critical methods
Comment 4 Olaf Titz CLA 2021-03-26 08:56:11 EDT
Unfortunately there is no easy testcase, as this bug is a race condition dependent on precise timing of threads. The easiest way to reproduce would be an application with two threads, one of which repeatedly calls getAdapter on an object and the other one repeatedly calls registerAdapters for the class of that object (which is roughly what our app does in production).
Comment 5 Carsten Hammer CLA 2021-03-26 14:38:38 EDT
(In reply to Olaf Titz from comment #4)
> Unfortunately there is no easy testcase, as this bug is a race condition
> dependent on precise timing of threads. The easiest way to reproduce would
> be an application with two threads, one of which repeatedly calls getAdapter
> on an object and the other one repeatedly calls registerAdapters for the
> class of that object (which is roughly what our app does in production).

Thanks Olaf for your investigation. I put your suggestion into a gerrit:
https://git.eclipse.org/r/c/equinox/rt.equinox.bundles/+/178445

I had a short look on creating a test in this case but agree that it is not so easy while of course possible. Did you test this change against your use case already?
Comment 6 Niraj Modi CLA 2021-04-08 07:43:40 EDT
Please re-target this bug.
Comment 7 Niraj Modi CLA 2021-05-21 02:05:05 EDT
Resetting milestone.
Comment 8 Eclipse Genie CLA 2021-10-11 02:15:14 EDT
New Gerrit change created: https://git.eclipse.org/r/c/equinox/rt.equinox.bundles/+/186344
Comment 10 Andrey Loskutov CLA 2021-10-13 07:21:48 EDT
(In reply to Eclipse Genie from comment #9)
> Gerrit change
> https://git.eclipse.org/r/c/equinox/rt.equinox.bundles/+/186344 was merged
> to [master].
> Commit:
> http://git.eclipse.org/c/equinox/rt.equinox.bundles.git/commit/
> ?id=9340606dbea17265a1d70d23b616a8e172ca6898

@Olaf: could you please verify if the new SDK build fixes your issue?

This or later SDK build contain the patch above:

https://download.eclipse.org/eclipse/downloads/drops4/I20211012-1800/

Would be also interesting to see if any of new changes via bug 576024 added even more MT-issues.