Bug 487915 - download.eclipse.org needs more connections
Status: RESOLVED FIXED
Alias: None
Product: Community
Classification: Eclipse Foundation
Component: Servers
Version: unspecified
Hardware: PC Linux
Importance: P1 blocker
Target Milestone: ---
Assignee: Eclipse Webmaster CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 487945 492104
Depends on:
Blocks: 486207
Reported: 2016-02-16 22:25 EST by David Williams CLA
Modified: 2017-09-06 03:51 EDT
CC List: 7 users

See Also:


Attachments
stack trace at end of log (6.37 KB, text/plain)
2016-04-26 23:20 EDT, David Williams CLA
404/50x error distribution (116.38 KB, image/jpeg)
2016-04-27 11:39 EDT, Denis Roy CLA
Interruption (114.36 KB, image/jpeg)
2016-05-04 10:11 EDT, Denis Roy CLA

Description David Williams CLA 2016-02-16 22:25:44 EST
We (the Platform) have been getting fairly frequent "connection timed out" errors during our builds. They occur when fetching something from the 'downloads' server to the 'build' server. It has been happening once every one or two weeks.

[ERROR] Failed to execute goal org.eclipse.tycho.extras:tycho-p2-extras-plugin:0.23.1:mirror (mirror-build-emf) on project eclipse.platform.repository: Error during mirroring: Mirroring failed: Messages while mirroring artifact descriptors.: [Unable to read repository at http://download.eclipse.org/modeling/emf/emf/updates/2.12milestones/base/S201601280808/plugins/org.eclipse.emf.ecore_2.12.0.v20160128-0808.jar.]: connect timed out

I am not sure whether I can collect or provide more data to get at the root problem. If nothing else, I thought I should "keep track" of the failures in this bug.

An exception was printed, which I have pasted below.

= = = = = = = = = 

Caused by: java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.eclipse.ecf.internal.provider.filetransfer.httpclient4.ECFHttpClientProtocolSocketFactory.connectSocket(ECFHttpClientProtocolSocketFactory.java:84)
        at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:177)
        at org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:144)
        at org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:131)
        at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:611)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:446)
        at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
Comment 1 David Williams CLA 2016-02-17 22:05:52 EST
Perhaps related to bug 487945?
Comment 2 David Williams CLA 2016-02-23 23:20:58 EST
This happened again, on 2/23, in our Nightly build: 

http://download.eclipse.org/eclipse/downloads/drops4/N20160223-2000/


[ERROR] Failed to execute goal org.eclipse.tycho.extras:tycho-p2-extras-plugin:0.23.1:mirror (mirror-build-ecf) on project eclipse.platform.repository: Error during mirroring: 
Mirroring failed: 
Messages while mirroring artifact descriptors.: 
[Unable to read repository at http://download.eclipse.org/rt/ecf/3.12.0/site.p2/features/org.eclipse.ecf.filetransfer.httpclient4.feature_3.12.0.v20151130-0157.jar.]: connect timed out 

The jar in question does exist, and I could "fetch" it a few hours later (when I noticed the build failure).
Comment 3 Denis Roy CLA 2016-02-24 09:29:50 EST
*** Bug 488383 has been marked as a duplicate of this bug. ***
Comment 4 Denis Roy CLA 2016-02-24 09:31:01 EST
We might need to increase the worker limit on download.e.o or advance its replacement machine.  At some periods during the day it's reaching its 20,000 connection limit.
Comment 5 Denis Roy CLA 2016-02-24 10:30:51 EST
Connection limit was 15000, increased to 24000. I've recently upped bandwidth limit from 210 Mbps to 250 Mbps. That should carry us a while. Reopen if you still see timeouts.
Comment 6 Denis Roy CLA 2016-02-24 10:31:16 EST
Fixed means fixed.
Comment 7 Eclipse Webmaster CLA 2016-02-24 10:38:46 EST
*** Bug 487945 has been marked as a duplicate of this bug. ***
Comment 8 David Williams CLA 2016-02-24 15:15:44 EST
Would this same problem also cause DNS errors? 

I ask for two reasons: 

1. I just noticed an error in one of our logs (from Tuesday morning) that said

[WARNING] [Tue Feb 23 08:24:51 EST 2016] HTTP request failed.
HTTP Error 500 (reason: Codesign tool(running on: build,456) exit status: 1.)
 updating: META-INF/MANIFEST.MF
jarsigner: unable to sign jar: java.net.UnknownHostException: timestamp.geotrust.com


Error 500: Codesign tool(running on: build,456) exit status: 1.
Server response has been saved to '/opt/public/eclipse/builds/4I/gitCache/eclipse.platform.releng.aggregator/eclipse.platform.swt.binaries/bundles/org.eclipse.swt.gtk.linux.s390/target/org.eclipse.swt.gtk.linux.s390-3.105.0-SNAPSHOT.jar-1797467009475557455-RemoteJarSigner.log'

2. bug 488350. A case of our debugger not being able to "connect to the VM", which it does through "localhost". That may not sound like "localhost" needs a DNS, but I have a vague memory of hearing, long ago, that on eclipse.org infrastructure even "localhost" is "looked up" on the intranet's DNS. Is that still true? (Was it ever? :)
Comment 9 David Williams CLA 2016-03-02 08:21:42 EST
Last night we saw this again: builds could not "find" things on downloads (or archives, in one case), and there were "Socket timed out" messages in the log.
Comment 10 Denis Roy CLA 2016-03-02 08:50:20 EST
I suspect part of the problem is with the way we're serving the Oomph setup files. To allocate more bandwidth we're serving them via www.eclipse.org but using a ProxyPass to download.eclipse.org.

We need to change that. It's a waste of connections.
Comment 11 David Williams CLA 2016-03-09 09:27:04 EST
This happened twice more yesterday, once in the evening during our build, and once in the afternoon during some tests.

The build error "stack trace" (part of it) is below, showing the "socket timed out" error. 

= = = = = = 

     [exec]     at org.eclipse.equinox.internal.p2.artifact.repository.ArtifactRepositoryManager.loadRepository(ArtifactRepositoryManager.java:100)
     [exec]     at org.eclipse.equinox.internal.p2.director.app.DirectorApplication.initializeRepositories(DirectorApplication.java:532)
     [exec]     at org.eclipse.equinox.internal.p2.director.app.DirectorApplication.run(DirectorApplication.java:1106)
     [exec]     at org.eclipse.equinox.internal.p2.director.app.DirectorApplication.start(DirectorApplication.java:1293)
     [exec]     at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:196)
     [exec]     at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:134)
     [exec]     at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:104)
     [exec]     at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:388)
     [exec]     at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:243)
     [exec]     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     [exec]     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
     [exec]     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     [exec]     at java.lang.reflect.Method.invoke(Method.java:497)
     [exec]     at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:670)
     [exec]     at org.eclipse.equinox.launcher.Main.basicRun(Main.java:609)
     [exec]     at org.eclipse.equinox.launcher.Main.run(Main.java:1516)
     [exec]     at org.eclipse.equinox.launcher.Main.main(Main.java:1489)
     [exec] Caused by: java.net.SocketTimeoutException: Read timed out
     [exec]     at java.net.SocketInputStream.socketRead0(Native Method)
     [exec]     at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
     [exec]     at java.net.SocketInputStream.read(SocketInputStream.java:170)
     [exec]     at java.net.SocketInputStream.read(SocketInputStream.java:141)
     [exec
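
A note on the two timeout flavours quoted so far: `connect timed out` is raised from `Socket.connect` (the TCP connection was never established), while `Read timed out` is raised from `SocketInputStream.read` (the connection was made, but no data arrived in time). Below is a minimal Python sketch for telling them apart when scanning a build log. The function name and category labels are hypothetical; the markers are simply the literal messages from the traces above.

```python
# Hypothetical labels mapped to the literal messages seen in the stack traces.
MARKERS = {
    "connect-timeout": "connect timed out",  # failed inside Socket.connect
    "read-timeout": "Read timed out",        # failed inside SocketInputStream.read
}


def classify(log_text):
    """Return a sorted list of the timeout categories whose message appears in log_text."""
    return sorted(name for name, marker in MARKERS.items() if marker in log_text)
```

Keeping the two apart may help when reporting, since a failed connect and a stalled read can point at different bottlenecks.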
Comment 12 David Williams CLA 2016-03-21 21:34:32 EDT
Tonight 3/21, we had a build fail due to a more specific sort of "connection time out" (from the little I have looked at it). 

See 
http://download.eclipse.org/eclipse/downloads/drops4/N20160321-2000/buildFailed-pom-version-updater

But a small sample is 

[ERROR] [ERROR] Some problems were encountered while processing the POMs:
[ERROR] Unresolveable build extension: Plugin org.eclipse.tycho:tycho-maven-plugin:0.23.1 or one of its dependencies could not be resolved: Could not transfer artifact org.apache.maven:maven-core:jar:3.0 from/to central (https://repo.maven.apache.org/maven2): Connect to repo.maven.apache.org:443 [repo.maven.apache.org/23.235.46.215] failed: Connection timed out @ 
[ERROR] Unresolveable build extension: Plugin org.eclipse.tycho:tycho-maven-plugin:0.23.1 or one of its dependencies could not be resolved: Could not transfer artifact org.apache.maven:maven-core:jar:3.0 from/to central (https://repo.maven.apache.org/maven2): Connect to repo.maven.apache.org:443 [repo.maven.apache.org/23.235.46.215] failed: Connection timed out @ 
[ERROR] Unknown packaging: eclipse-target-definition @ line 11, column 14
[ERROR] Unresolveable build extension: Plugin org.eclipse.tycho:tycho-maven-plugin:0.23.1 or one of its dependencies could not be resolved: Could not transfer artifact org.apache.maven:maven-core:jar:3.0 from/to central (https://repo.maven.apache.org/maven2): Connect to repo.maven.apache.org:443 [repo.maven.apache.org/23.235.46.215] failed: Connection timed out @ 
[ERROR] Unresolveable build extension: Plugin org.eclipse.tycho:tycho-maven-plugin:0.23.1 or one of its dependencies could not be resolved: Could not transfer artifact org.apache.maven:maven-core:jar:3.0 from/to central (https://repo.maven.apache.org/maven2): Connect to repo.maven.apache.org:443 [repo.maven.apache.org/23.235.46.215] failed: Connection timed out @ 
[ERROR] Unresolveable build extension: Plugin org.eclipse.tycho:tycho-maven-plugin:0.23.1 or one of its dependencies could not be resolved: Could not transfer artifact org.apache.maven:maven-core:jar:3.0 from/to central (https://repo.maven.apache.org/maven2): Connect to repo.maven.apache.org:443 [repo.maven.apache.org/23.235.46.215] failed: Connection timed out @
Comment 13 David Williams CLA 2016-04-24 23:38:21 EDT
Tonight, 4/24, had network issues, apparently, getting stuff from 
https://repo.maven.apache.org/maven2
from build.eclipse.org. 

Note: no trouble at all getting this dependency on my own local test machine. 
Also note that we do not specify that URL, https://repo.maven.apache.org/maven2. I am assuming this is some automatic thing, set up by the Eclipse Foundation, or Maven itself?

This problem was worse than usual since it wasn't a "temporary" glitch, but lasted about two or three hours, at least -- I tried just to repeat the build, but got the exact same error.

First failure log: 
http://download.eclipse.org/eclipse/downloads/drops4/I20160424-2000/buildlogs/mb060_run-maven-build_output.txt

Second failure log:
http://download.eclipse.org/eclipse/downloads/drops4/I20160424-2245/buildlogs/mb060_run-maven-build_output.txt

= = = = = = 

[ERROR] Failed to execute goal on project org.eclipse.equinox.p2.metadata.repository: Could not resolve dependencies for project org.eclipse.equinox:org.eclipse.equinox.p2.metadata.repository:eclipse-plugin:1.2.300-SNAPSHOT: Failed to collect dependencies at org.apache.ant:ant:jar:1.7.1: Failed to read artifact descriptor for org.apache.ant:ant:jar:1.7.1: Could not transfer artifact org.apache.ant:ant:pom:1.7.1 from/to central (https://repo.maven.apache.org/maven2): Connection reset 

org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal on project org.eclipse.equinox.p2.metadata.repository: Could not resolve dependencies for project org.eclipse.equinox:org.eclipse.equinox.p2.metadata.repository:eclipse-plugin:1.2.300-SNAPSHOT: Failed to collect dependencies at org.apache.ant:ant:jar:1.7.1
        at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies(LifecycleDependencyResolver.java:221)
        at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies(LifecycleDependencyResolver.java:127)
        at org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved(MojoExecutor.java:257)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:200)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
        at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
        at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
Comment 14 Markus Keller CLA 2016-04-25 09:27:10 EDT
(In reply to David Williams from comment #13)
> Tonight, 4/24, had network issues, apparently, getting stuff from 
> https://repo.maven.apache.org/maven2
> from build.eclipse.org. 
...

Looks like an illegal (and unnecessary) external dependency in p2. Filed bug 492367.
Comment 15 David Williams CLA 2016-04-25 12:07:03 EDT
Received a similar failure from this morning's build: 

[ERROR] Failed to execute goal org.eclipse.cbi.maven.plugins:eclipse-cbi-plugin:1.1.3:generate-api-build-xml (default) on project eclipse-platform-parent: Execution default of goal org.eclipse.cbi.maven.plugins:eclipse-cbi-plugin:1.1.3:generate-api-build-xml failed: Plugin org.eclipse.cbi.maven.plugins:eclipse-cbi-plugin:1.1.3 or one of its dependencies could not be resolved: Failed to collect dependencies at org.eclipse.cbi.maven.plugins:eclipse-cbi-plugin:jar:1.1.3 -> org.apache.maven:maven-plugin-api:jar:3.1.1 -> org.eclipse.sisu:org.eclipse.sisu.plexus:jar:0.0.0.M5 -> org.sonatype.sisu:sisu-guice:jar:no_aop:3.1.0 -> aopalliance:aopalliance:jar:1.0: Failed to read artifact descriptor for aopalliance:aopalliance:jar:1.0: Could not transfer artifact aopalliance:aopalliance:pom:1.0 from/to central (https://repo.maven.apache.org/maven2): Connection reset
Comment 16 David Williams CLA 2016-04-25 12:10:42 EDT
(In reply to David Williams from comment #15)
> Received a similar failure from this morning's build: 
> 


Earlier in that same build there was another, git-related error message that says "fatal", although the build seemed to continue:

To file:///gitroot/platform/eclipse.platform.releng.aggregator.git
   0ad5f09..bcea47e  HEAD -> master
fatal: read error: Connection reset by peer
Cloning into '/shared/eclipse/builds/4I/siteDir/eclipse/downloads/drops4/I20160425-0800/eclipse.platform.releng.aggregator'...
remote: warning: ignoring extra bitmap file: objects/pack/pack-6602b1a9f54a0512dfbcf26e1fa11a8c6460649e.pack
Note: checking out '0ad5f09cd6ccc8217c3a6c6e5425b59bcf1b3fec'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at 0ad5f09... Build input for build I20160424-2245
Comment 17 Dani Megert CLA 2016-04-25 13:48:22 EDT
There seems to be some issue with the infrastructure. I had many timeouts when accessing Git today via EGit and our Hudson jobs also run "forever", see e.g.
https://hudson.eclipse.org/platform/job/eclipse.platform.ui-Gerrit/
Some are running for more than 5 hours now!
Comment 18 Denis Roy CLA 2016-04-25 14:13:46 EDT
There seems to be a mix of issues in this bug, many of which are vague and/or have no reproducible use case.  Plowing through the massive build log is not easy for us sysadmins.

I'll reaffect this bug for performance to download.e.o specifically, as I've seen it run out of connections at peak times. The number one culprit is the "smart" 404 system since, at peak times, we must deliver in excess of 200 404 Not Found errors per second.
Comment 19 David Williams CLA 2016-04-25 21:20:11 EDT
(In reply to Denis Roy from comment #18)
> There seems to be a mix of issues in this bug, many of which are vague
> and/or have no reproducible use case.  Plowing through the massive build log
> is not easy for us sysadmins.
> 
> I'll reaffect this bug for performance to download.e.o specifically, as I've
> seen it run out of connections at peak times. The number one culprit is the
> "smart" 404 system since, at peak times, we must deliver in excess of 200
> 404 Not Found errors per second.

Thanks, Denis. 

I have opened bug 492412 because it describes a failure case we see fairly often, but is admittedly intermittent. (And was perhaps unfairly mixed in here since to us a "network problem" is just a network problem :).

The git problem we've seen today was pretty specific to today, and (from our build's point of view) is no longer a problem. Not sure if those in other countries are still seeing issues or not.
Comment 20 Denis Roy CLA 2016-04-26 09:51:46 EDT
> The git problem we've seen today

Which was odd, since you are using the 'file://' protocol. There are no connections to be reset by any peer.


> To file:///gitroot/platform/eclipse.platform.releng.aggregator.git
>    0ad5f09..bcea47e  HEAD -> master
> fatal: read error: Connection reset by peer
Comment 21 David Williams CLA 2016-04-26 23:20:52 EDT
Created attachment 261275 [details]
stack trace at end of log

When it rains it pours! 

The latest Platform "BUILD FAILED" seems to fall into this bug, instead of the "external connections" one. 

[ERROR] Failed to execute goal org.eclipse.tycho.extras:tycho-p2-extras-plugin:0.25.0:mirror (mirror-build-ecf) on project eclipse.platform.repository: Error during mirroring: Mirroring failed: HTTP Server 'Service Unavailable': http://download.eclipse.org/rt/ecf/3.13.1/site.p2/artifacts.xml: HttpComponents connection error response code 503. -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.eclipse.tycho.extras:tycho-p2-extras-plugin:0.25.0:mirror (mirror-build-ecf) on project eclipse.platform.repository: Error during mirroring

Full stack trace is attached.
Comment 22 Denis Roy CLA 2016-04-27 09:15:00 EDT
(In reply to David Williams from comment #21)
> mirroring: Mirroring failed: HTTP Server 'Service Unavailable':
> http://download.eclipse.org/rt/ecf/3.13.1/site.p2/artifacts.xml:
> HttpComponents connection error response code 503. -> [Help 1]

Thanks, that was helpful. As I suspected in comment 18, our "smart" 404 handler is problematic, since the above URL is a 404 Not Found but caused a 503 Service Unavailable.

I'll see if I can tweak the handler some more, but new hardware will be incoming soon enough.
Comment 23 Denis Roy CLA 2016-04-27 11:39:14 EDT
Created attachment 261308 [details]
404/50x error distribution

Here's a chart that maps the 404 vs. 50x error distribution from Tuesday (yesterday) and some random Tuesday in November.

Some observations:

- We definitely have very pronounced 404 spikes throughout the day
- A 404 spike often leads to a 50x error spike
- A 404 won't break a build, but a 50x will
- We have a steady stream of 50x errors all the time, which is odd
- The amount of 404s we serve today is much greater than that of November

This doesn't really change our conclusion to redesign download.e.o but I thought it would be an interesting data point.
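
For anyone wanting to reproduce this kind of comparison, here is a rough Python sketch that tallies 404 and 50x responses per hour from an access log. It assumes Apache "common"/"combined" style lines; the actual log format on download.e.o is not stated in this bug, and the function name and bucket labels are made up for illustration.

```python
import re
from collections import Counter, defaultdict

# Assumption: lines look like
#   1.2.3.4 - - [26/Apr/2016:10:15:32 -0400] "GET /x HTTP/1.1" 404 512
LINE = re.compile(
    r'\[(?P<day>[^:]+):(?P<hour>\d{2}):\d{2}:\d{2} [^\]]+\] "[^"]*" (?P<status>\d{3}) '
)


def tally(lines):
    """Map 'dd/Mon/yyyy HH' to a Counter with '404' and '50x' counts."""
    buckets = defaultdict(Counter)
    for line in lines:
        m = LINE.search(line)
        if not m:
            continue  # skip lines that do not match the assumed format
        key = f'{m.group("day")} {m.group("hour")}'
        status = m.group("status")
        if status == "404":
            buckets[key]["404"] += 1
        elif status.startswith("50"):
            buckets[key]["50x"] += 1
    return dict(buckets)
```

Plotting the two counts side by side per hour would show whether a 50x spike follows a 404 spike, as observed in the chart.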
Comment 24 Denis Roy CLA 2016-05-03 09:38:44 EDT
*** Bug 492104 has been marked as a duplicate of this bug. ***
Comment 25 David Williams CLA 2016-05-03 23:51:38 EDT
Yet another failed build tonight, this time due to p2 going to 

http://archive.eclipse.org/webtools/downloads/drops/T3.2.0/I-3.2.0-20100521140232/repository/

from build.eclipse.org. 


Received a stack trace like this (partial one) : 


     [exec] Caused by: java.net.SocketTimeoutException: Read timed out
     [exec]     at java.net.SocketInputStream.socketRead0(Native Method) 
     [exec]     at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
     [exec]     at java.net.SocketInputStream.read(SocketInputStream.java:170)
     [exec]     at java.net.SocketInputStream.read(SocketInputStream.java:141)
     [exec]     at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160)
Comment 26 Denis Roy CLA 2016-05-04 10:05:44 EDT
David, do you know at what time that happened?
Comment 27 Denis Roy CLA 2016-05-04 10:11:01 EDT
Created attachment 261464 [details]
Interruption

download.e.o did suffer an interruption (gray line).

Not much we can do about it right now until the hardware I've ordered comes in.
Comment 28 David Williams CLA 2016-05-04 19:20:34 EDT
(In reply to Denis Roy from comment #26)
> David, do you know at what time that happened?

It was around 9 or 10 PM, Tuesday, 5/3. 

I can not quite tell if that is "in the grey line" of your graph, but is close enough to say "that was it". 



(In reply to Denis Roy from comment #27)
> Created attachment 261464 [details]
> Interruption
> 
> download.e.o did suffer an interruption (gray line).
> 
> Not much we can do about it right now until the hardware I've ordered comes
> in.

I'm not looking for a commitment but am curious as to when that is expected. 
Just ballpark ... a few weeks? a few months? The next fiscal year? :) 
[I am partially just curious, but am also loosely considering some work to improve working around the limitation, such as auto-restart (once or twice) if we get a "BUILD FAILED" and there is a "SocketException" in the log.]
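
The auto-restart idea could look roughly like the Python sketch below. It is only an illustration under stated assumptions: `run_build` is a hypothetical callable that runs the build and returns its log as a string, and the pattern is widened to `Socket\w*Exception` so that it also matches the `SocketTimeoutException` seen in the traces in this bug.

```python
import re


def should_restart(log_text):
    """True when the log shows a failed build together with a socket-level exception."""
    # Matches SocketException as well as SocketTimeoutException.
    return "BUILD FAILED" in log_text and bool(re.search(r"Socket\w*Exception", log_text))


def run_with_restarts(run_build, max_restarts=2):
    """Run the build, restarting at most max_restarts times on socket failures.

    run_build is a hypothetical callable returning the build log as text.
    Returns (log_text, attempts).
    """
    attempts = 0
    while True:
        attempts += 1
        log_text = run_build()
        if attempts > max_restarts or not should_restart(log_text):
            return log_text, attempts
```

A build that fails for any other reason is not restarted, so real breakage still surfaces on the first attempt.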
Comment 29 Denis Roy CLA 2016-05-04 20:33:31 EDT
About 4 weeks to get a few servers and set them up. But I think we have some old hardware in the lab that we can cobble together to help us gain some short-term reliability in the meanwhile.

I'll also see if I can tweak our 404 handler.
Comment 30 Eclipse Webmaster CLA 2016-05-06 14:36:31 EDT
I've set up some connection count logging on the proxy servers to allow us to see if that's a factor.

-M.
Comment 31 David Williams CLA 2016-05-17 23:59:36 EDT
In an attempt to help by providing data, I saw several "socket timeout" exceptions again in our builds "tonight". It started about 10:30 PM 5/17.

A typical exception was 

     [exec] !ENTRY org.eclipse.equinox.p2.repository 4 1002 2016-05-17 22:35:01.230
     [exec] !MESSAGE Communication with repository at http://download.eclipse.org/webtools/releng/repository/ failed. 
     [exec] !STACK 0
     [exec] java.net.SocketTimeoutException: Read timed out
     [exec]     at java.net.SocketInputStream.socketRead0(Native Method)  

In addition, "build.eclipse.org" seemed overloaded, with "load" numbers being reported from 30 to 50 from about 10:30 to 11:30. It appears to be settling down to the usual 4 and 5, now. 

= = = = = = = = 

BUT, surprisingly, it appears p2 was able to retry and eventually was satisfied! That is actually hard to see in my logs, but the build did finally succeed. I wonder if the response code, or something, changed that allowed p2 to retry? I also did not see a "connection reset" in the log. So, good news wrapped in a mystery.  

= = = = = = = = 

If there is more good news about "tonight's failure", I also witnessed similar socket exceptions on my home system during the same time, which I normally never see. (It was attempting to connect to a different p2 repository). The best news of that is that my home system was supposed to be using a "local mirror", but was not, due to a "typo" in my settings, which is now fixed due to me seeing the socket timeout exceptions. So, maybe that is sort of half good news -- good news for me, but the same server problem. (My home build did fail, not sure why it did, but the build.eclipse.org build did not -- perhaps because my system was not overloaded so the "retries" happened very quickly :)
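
How p2 decides to retry is not shown in this bug, so the following is not its actual logic. It is a generic Python sketch of the retry-with-backoff pattern that would explain a transfer eventually succeeding after a few timeouts. `fetch` is a hypothetical callable that raises an `OSError` subclass (which includes socket timeouts and connection resets in Python) on failure; the parameter names and defaults are assumptions.

```python
import random
import time


def fetch_with_backoff(fetch, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or attempts run out, doubling the wait each time."""
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the last error propagate
            delay = base_delay * (2 ** attempt)
            # A little random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter exists only so the waiting can be stubbed out when trying the function without real delays.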
Comment 32 Denis Roy CLA 2016-05-24 10:50:36 EDT
Thanks for the info, David.
Comment 33 Denis Roy CLA 2016-05-25 11:22:35 EDT
Another thing that is impacting download connections is that, at peak times, the Oomph setup files take up to 30% of our high-priority bandwidth, leaving download.eclipse.org with very little room.

I'll look to add some bandwidth temporarily so that download.e.o does not always have a queue 10 miles long.
Comment 34 Denis Roy CLA 2016-11-23 11:37:01 EST
We've since added 100 Mbps of bandwidth, which has brought down our connection counts dramatically.
Comment 35 Denis Roy CLA 2017-01-17 11:21:11 EST
We've added four new download servers to the pool.  Each new server has the ability to handle 2x the connections the old download.e.o could handle.

Closing as fixed.