Community
Participate
Working Groups
I was trying to work out why many CCRC operations are so slow, and this looks like one major culprit: FileInputStream.readBytes(byte[], int, int) line: not available [native method] FileInputStream.read(byte[], int, int) line: 199 BufferedInputStream.read1(byte[], int, int) line: 256 BufferedInputStream.read(byte[], int, int) line: 317 BufferedInputStream(FilterInputStream).read(byte[]) line: 90 Properties$LineReader.readLine() line: 418 Properties.load0(Properties$LineReader) line: 337 Properties.load(InputStream) line: 325 ProxyProvider(UnixProxyProvider).getEnv(String) line: 196 ProxyProvider(UnixProxyProvider).getSystemProxyInfo(String) line: 135 ProxyProvider(UnixProxyProvider).getProxyData(URI) line: 58 ProxyProvider(AbstractProxyProvider).select(URI) line: 43 ProxyManager.getProxyDataForHost(String, String) line: 366 EclipseProxySelector.select(URI) line: 46 SocksSocketImpl.connect(SocketAddress, int) line: 358 Socket.connect(SocketAddress, int) line: 529 For each Socket.connect, UnixProxyProvider#getEnv exec's "/bin/sh", "-c", "env | grep -i proxy" This is pretty bad as forking a process and waiting for it to finish takes many 100s of milliseconds. It seems for many operations CCRC does java.net.Socket connect for each resource.
Created attachment 185312 [details] potential patch 1 I added a try finally timing around getSystemProxyInfo: getSystemProxyInfo(socks) took 171 getSystemProxyInfo(socks) took 131 getSystemProxyInfo(socks) took 161 getSystemProxyInfo(socks) took 154 getSystemProxyInfo(socks) took 130 getSystemProxyInfo(socks) took 204 getSystemProxyInfo(socks) took 165 getSystemProxyInfo(socks) took 164 getSystemProxyInfo(socks) took 145 getSystemProxyInfo(socks) took 285 getSystemProxyInfo(socks) took 209 Which makes this API call very expensive. System.getEnv was replaced as part of bug 257172. We should likely only fall-back to execing when the System.getEnv is known not to work... If we fall-back to running env in the Exception handler when System.getenv fails, we can remain performant while still supporting non-compliant JVMs. The attached patch reduces time to <1ms.
Pawel you made the original change in bug 257172. Can you review this?
Note that #getEnv went from being deprecated in 1.4.2 to not deprecated in 1.5.0+ (due to developer complaints).
Created attachment 185357 [details] patch 2 Looks like they used to throw Error, so catch all Throwable. Also the synchronized on #getEnv looks wrong. There's no state accessed concurrently. Likely the I/O errors received were due to not correctly destroying the process and closing the file descriptors.
Szymon this is a simple patch, can you review? As for the impact of this performance problem, CCRC spent 50 mins. synchronizing my view. Approximately 9mins of this was spent in ProxyProvider(UnixProxyProvider).getSystemProxyInfo (3500 calls * 150ms).
The method System.getenv(String) is not defined in the bundle's required execution environment: CDC-1.1/Foundation-1.1. We should try to call it using reflection and if it fails, fall back on the old code.
Created attachment 186477 [details] patch2 Updated the patch to use reflection. Installed the "PDE/API Tools Environment Descriptions" to provide the baselines which caused the API tooling to flag the error. Interestingly before bug 257172 System.getenv was called directly without reflection.
*** Bug 333962 has been marked as a duplicate of this bug. ***
In Robert's bug 333962 waitFor is hanging indefinitely. This usually happens if one of the pipe buffers is full. Perhaps something has gone wrong and the input stream hasn't been emptied, or the error stream has become full. This causes the thread to block forever... Robert, what does: /bin/sh -c env | grep -i proxy do on your system? And what goes to stdout and to stderr?
James, I get absolutely no output for that command. For reference robert@raptor ~$ /bin/sh -c env | wc -c 3884
(In reply to comment #10) > James, I get absolutely no output for that command. > > For reference > > robert@raptor ~$ /bin/sh -c env | wc -c > 3884 It's difficult to know what's going wrong without being able to reproduce. In the fallback catch (Throwable) case, I think the best solution is to remove the Process#waitFor() call. It seems to add no value as the input stream has already been processed in-line by the time waitFor is called. All it will do is increase the time needed for the #getEnv call, and potentially cause it to block forever.
(In reply to comment #11) > (In reply to comment #10) > > James, I get absolutely no output for that command. > > > > For reference > > > > robert@raptor ~$ /bin/sh -c env | wc -c > > 3884 > > It's difficult to know what's going wrong without being able to reproduce. > > In the fallback catch (Throwable) case, I think the best solution is to remove > the Process#waitFor() call. It seems to add no value as the input stream has > already been processed in-line by the time waitFor is called. All it will do is > increase the time needed for the #getEnv call, and potentially cause it to block > forever. If I do find a wait to reproduce the issue I will let you know. But it's not even on a daily basis.
Created attachment 186485 [details] patch 4 Looking at the CVS history, this code has been changed again and again incorrectly to correct the issue. waitFor() was added in 1.19 to fix bug 258941. The reason the fix 'worked' was because getEnv was made synchronized serializing accesses and minimizing the number of in-progress Process execs. The waitFor between each maximised the chance that the garbage collector cleans up the file-descriports. The real fix for hitting the fd limits was by adding proc.destroy() in 1.21. As such I don't believe the waitFor, nor the synchronized are required.
To sum up: It seems that the initial problem was caused by not cleaning up processes and OS resources. Adding synchronization and waitFor just added extra time to gc resources what somehow worked on IBM vm, but not on Sun/Oracle one. There are some changes in UnixProcess implementation in IBM which clean some resources on process exit. I can't see it in Sun vm impl. Anyway, just adding proc.destroy as introduced by the fix for bug 293085 seems to be a good fix. I committed it to HEAD. BTW interesting investigation.
Fixed.
*** Bug 313597 has been marked as a duplicate of this bug. ***
*** Bug 326029 has been marked as a duplicate of this bug. ***
Verified in I20110124-1800