Hi,
I had a deeper look at the Facebook issue, and this is what I found.
I rewrote the JMH benchmark to test a simple synchronized block against ReentrantLock against SpinLock. The code inside the lock is BlackHole.consumeCPU(128).
Results for 1 thread (uncontended):
JMHLockTest.lock 28591618.486 ± 670795.875 ops/s
JMHLockTest.spin 27887960.738 ± 386684.551 ops/s
JMHLockTest.sync 28373431.662 ± 45088.485 ops/s
Results for 4 threads:
JMHLockTest.lock 19102607.983 ± 537321.988 ops/s
JMHLockTest.spin 4615534.795 ± 761143.689 ops/s
JMHLockTest.sync 12640333.967 ± 612131.150 ops/s
Results for 8 threads:
JMHLockTest.lock 15642880.428 ± 391710.628 ops/s
JMHLockTest.spin 3151759.476 ± 52040.178 ops/s
JMHLockTest.sync 10591495.276 ± 673044.945 ops/s
So if SpinLock is contended, it performs really badly, and as expected, ReentrantLock outperforms synchronized.
The worst thing is that SpinLock really spins consuming all CPU cores, so if it's contended it shows almost 800/800 CPU in the latest case.
Contrast this with the other two that show something like 100/800.
This is important for anyone that has some alerting tool that checks for high CPU usage.
I went on and wrote a JMH benchmark for ConnectionPool, with 4 threads creating connections, 3 releasing them, and 1 removing them, and a capacity of 250 connections, like Facebook has.
Results for ReentrantLock:
JMHConnectionPoolTest.pool 2093794.641 ± 216581.343 ops/s
JMHConnectionPoolTest.pool:close 319128.145 ± 35888.616 ops/s
JMHConnectionPoolTest.pool:create 1230826.046 ± 254282.232 ops/s
JMHConnectionPoolTest.pool:release 543840.450 ± 10255.645 ops/s
Results for SpinLock:
JMHConnectionPoolTest.pool 1666248.025 ± 45640.942 ops/s
JMHConnectionPoolTest.pool:close 253181.188 ± 30190.708 ops/s
JMHConnectionPoolTest.pool:create 867763.796 ± 20266.778 ops/s
JMHConnectionPoolTest.pool:release 545303.042 ± 24274.735 ops/s
Results for synchronized:
JMHConnectionPoolTest.pool 1855606.213 ± 48375.993 ops/s
JMHConnectionPoolTest.pool:close 161502.597 ± 17047.142 ops/s
JMHConnectionPoolTest.pool:create 1014543.654 ± 46278.642 ops/s
JMHConnectionPoolTest.pool:release 679559.962 ± 30492.268 ops/s
Again, SpinLock is the loser, and it consumes 800/800 CPU whereas the other 2 consume about 240/800.
Based on these results, for 9.3.0 I decided to change the implementation of SpinLock. I renamed the class Locker, and added an implementation based on ReentrantLock, which is now the default.
The old behavior (spin) can be enabled using -Dorg.eclipse.jetty.util.thread.Locker.spin=true.
For 9.2.12 and Facebook I am thinking to just ditch entirely the SpinLock class, and replace its usage (only in jetty-client) with direct usage of ReentrantLock.
Another things that popped out from the benchmark is the use of the "test-and-test-and-set" optimization. It makes a huge difference (for the better) in the behavior of SpinLock. Jetty 9.2 does not have this optimization (it only used "test-and-set"), while Jetty 9.3 has it.
However, I am thinking that we should re-analyze our usage of SpinLock in favor of ReentrantLock for 3 reasons: A) customer alerting tools won't be happy; B) we get free performance for each JDK major release; C) JMH showed it's a loser.
Then again, I may be totally wrong, or wrote the benchmarks wrong, so review is definitely needed. But I have the gut feeling that's the last time I will write a lock implementation :)