Created attachment 267913 [details]
Access from OSU OSL

download.eclipse.org will occasionally time out while requesting files. It seems to happen sporadically.

[ERROR] Unable to connect to repository http://download.eclipse.org/eclipse/updates/4.7-I-builds/I20170419-0430/plugins/org.eclipse.ui.intro_3.5.100.v20170418-0710.jar.pack.gz
[ERROR] Internal error: org.eclipse.tycho.repository.local.MirroringArtifactProvider$MirroringFailedException: Could not mirror artifact osgi.bundle,org.eclipse.ui.intro,3.5.100.v20170418-0710 into the local Maven repository. See log output for details. Connection timed out: connect -> [Help 1]

As per the attached graph, we can see that download.e.o (gray line) has some timeouts, specifically from 2:00am to ~5:00am local time.
My initial suspects are the php-fpm configuration and the load balancer.
I'm seeing timeouts again right now:

!ENTRY org.eclipse.equinox.p2.transport.ecf 2 0 2017-05-11 17:02:06.505
!MESSAGE Connection to http://download.eclipse.org/releases/neon/201610111000/content.jar failed on Connection timed out: connect. Retry attempt 0 started
!STACK 0
java.net.ConnectException: Connection timed out: connect
	at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
	at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:589)
	at org.eclipse.ecf.internal.provider.filetransfer.httpclient4.ECFHttpClientProtocolSocketFactory.connectSocket(ECFHttpClientProtocolSocketFactory.java:86)
	at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:179)
	at org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:144)
	at org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:134)
	at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:612)
	at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:447)
	at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:884)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
	at org.eclipse.ecf.provider.filetransfer.httpclient4.HttpClientRetrieveFileTransfer.performConnect(HttpClientRetrieveFileTransfer.java:1084)
	at org.eclipse.ecf.provider.filetransfer.httpclient4.HttpClientRetrieveFileTransfer.access$0(HttpClientRetrieveFileTransfer.java:1075)
	at org.eclipse.ecf.provider.filetransfer.httpclient4.HttpClientRetrieveFileTransfer$1.performFileTransfer(HttpClientRetrieveFileTransfer.java:1071)
	at org.eclipse.ecf.filetransfer.FileTransferJob.run(FileTransferJob.java:74)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:56)

Users are complaining on the forum about other sites being broken.
*** Bug 516473 has been marked as a duplicate of this bug. ***
I was finally able to regenerate Oomph's product catalog after half a day of trying...
Here, since this morning I haven't succeeded in loading a TP with 16 distinct locations. Each time the error is on a different location:

* Connection to http://download.eclipse.org/modeling/emft/eef/updates/releases/1.5/R201601141612/features/org.eclipse.emf.eef.collab.runtime-feature_1.5.1.201601141612.jar failed on Connection timed out: connect. Retry attempt 0 started
* Connection to http://download.eclipse.org/releases/neon/201703231000/p2.index failed on Connection timed out: connect. Retry attempt 0 started
* Connection to http://download.eclipse.org/diffmerge/releases/0.7.1/edm-patterns-site/p2.index failed on Connection timed out: connect. Retry attempt 0 started
* ...
It's OK now...after a long day with several tries ...
The problem is recurring again right now.
yes it is bad again
I get the feeling that transparent mirroring of p2 metadata is just a bad idea. I see repeated strange problem reports in the forum. Yesterday I could not load Neon's latest update site. Today I get repeated timeouts (from where I'm traveling in Italy).

May 24, 2017 10:01:23 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFO: I/O exception (java.net.SocketException) caught when processing request to {}->http://download.eclipse.org:80: Connection reset
May 24, 2017 10:01:23 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFO: I/O exception (java.net.SocketException) caught when processing request to {}->http://download.eclipse.org:80: Connection reset
May 24, 2017 10:01:23 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFO: Retrying request to {}->http://download.eclipse.org:80
May 24, 2017 10:01:23 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFO: Retrying request to {}->http://download.eclipse.org:80
May 24, 2017 10:01:44 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFO: I/O exception (java.net.SocketException) caught when processing request to {}->http://download.eclipse.org:80: Connection reset
May 24, 2017 10:01:44 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFO: Retrying request to {}->http://download.eclipse.org:80

It's stuck like this for many minutes (9 minutes expire) just accessing the metadata:

May 24, 2017 10:10:59 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFO: Retrying request to {}->http://download.eclipse.org:80

Please reconsider transparent mirroring of p2 metadata because there's something just not working well in this regard.
Yes, we had failing mwe/xtext builds as well:
http://services.typefox.io/open-source/jenkins/job/mwe/job/master/8/console
http://services.typefox.io/open-source/jenkins/job/xtext-eclipse/job/cd_umbrella_issue28/1/console
I've located the issue, and the transparent mirroring is not the cause, as I was getting timeouts on files that are not even handled by that system. This morning I observed our load balancer randomly marking some of the download servers as dead, then re-enabling them after a few seconds. It shouldn't. Looking at the download servers themselves, their connection tables were full. I've tweaked the TCP stack on the download servers to increase the connection pool, terminate connections faster, and reduce timeouts for idle connections. I'll continue to monitor, but we should be good here.
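For context, the exact settings applied aren't recorded in this bug; a hypothetical sketch of this kind of TCP tuning on a Linux node (larger connection-tracking table, faster teardown of closing and idle sockets) could look like:

```shell
# Hypothetical example values only -- not the actual settings applied to the
# download nodes. Run as root; persist via /etc/sysctl.conf if they help.
sysctl -w net.netfilter.nf_conntrack_max=262144  # grow the connection-tracking table
sysctl -w net.ipv4.tcp_fin_timeout=15            # release sockets stuck in FIN-WAIT-2 faster
sysctl -w net.ipv4.tcp_keepalive_time=300        # probe idle connections sooner
sysctl -w net.ipv4.tcp_tw_reuse=1                # reuse TIME-WAIT sockets for new outbound connections
```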
Instances of "table full" on one of the download.e.o nodes for the last few days:

zgrep -c "table full" /var/log/messages.*.gz
/var/log/messages.1.gz:0      = thursday the 25th
/var/log/messages.2.gz:25706  = wednesday, 24
/var/log/messages.3.gz:29139  = tuesday, 23
/var/log/messages.4.gz:0
/var/log/messages.5.gz:0
/var/log/messages.6.gz:0
/var/log/messages.7.gz:0
/var/log/messages.8.gz:0
/var/log/messages.9.gz:11482  = wednesday
/var/log/messages.10.gz:15714 = tuesday 16th
/var/log/messages.11.gz:0
/var/log/messages.12.gz:0
/var/log/messages.13.gz:0
/var/log/messages.14.gz:58138 = friday
/var/log/messages.15.gz:85056 = thursday
/var/log/messages.16.gz:73319 = wed
/var/log/messages.17.gz:32746 = tues
/var/log/messages.18.gz:0
/var/log/messages.19.gz:0
/var/log/messages.20.gz:0
/var/log/messages.21.gz:0
/var/log/messages.22.gz:0
/var/log/messages.23.gz:2839  = wed
/var/log/messages.24.gz:6706  = tues
/var/log/messages.25.gz:0
/var/log/messages.26.gz:0
/var/log/messages.27.gz:0
/var/log/messages.28.gz:0
/var/log/messages.29.gz:0
/var/log/messages.30.gz:9262  = wed
/var/log/messages.31.gz:13985 = tues
/var/log/messages.32.gz:0
/var/log/messages.33.gz:0

The pattern seems to revolve around Black Tuesday. I'll re-check the TCP settings on the download nodes. My goal is "0" for next week's deluge.
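The per-day annotations above were added by hand; a small hypothetical helper could pair each rotated log's "table full" count with the file's modification date instead (LOGDIR is a parameter introduced here for illustration):

```shell
# Print "date  count  file" for each rotated syslog; assumes GNU date and zgrep.
LOGDIR=${LOGDIR:-/var/log}
for f in "$LOGDIR"/messages.*.gz; do
  [ -e "$f" ] || continue                       # skip if the glob matched nothing
  printf '%s  %6s  %s\n' "$(date -r "$f" +%F)" "$(zgrep -c 'table full' "$f")" "$f"
done
```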
TCP tuning seems to be effective, as we're no longer seeing any "table full" messages.
(In reply to Denis Roy from comment #12)
> My goal is "0" for next week's deluge.

Goal achieved. We are done here. Thanks for your patience, everyone.
I think there are still problems. This user is reporting that he's getting a 502: https://www.eclipse.org/forums/index.php/mv/msg/1086664/1765151/#msg_1765151

I.e., he reports this problem:

ERROR: org.eclipse.equinox.p2.transport.ecf code=1002 HTTP Server 'Bad Gateway' : http://download.eclipse.org/technology/epp/packages/neon/content.xml.xz
ERROR: org.eclipse.ecf.identity code=0 HttpComponents connection error response code 502.
	at org.eclipse.ecf.provider.filetransfer.httpclient4.HttpClientRetrieveFileTransfer.openStreams(HttpClientRetrieveFileTransfer.java:667)
	at org.eclipse.ecf.provider.filetransfer.retrieve.AbstractRetrieveFileTransfer.sendRetrieveRequest(AbstractRetrieveFileTransfer.java:885)
	at org.eclipse.ecf.provider.filetransfer.retrieve.AbstractRetrieveFileTransfer.sendRetrieveRequest(AbstractRetrieveFileTransfer.java:576)
	at org.eclipse.ecf.provider.filetransfer.retrieve.MultiProtocolRetrieveAdapter.sendRetrieveRequest(MultiProtocolRetrieveAdapter.java:106)
	at org.eclipse.equinox.internal.p2.transport.ecf.FileReader.sendRetrieveRequest(FileReader.java:428)
	at org.eclipse.equinox.internal.p2.transport.ecf.FileReader.readInto(FileReader.java:360)
	at org.eclipse.equinox.internal.p2.transport.ecf.RepositoryTransport.download(RepositoryTransport.java:101)
	at org.eclipse.oomph.p2.internal.core.CachingTransport.download(CachingTransport.java:192)
	at org.eclipse.oomph.p2.internal.core.CachingTransport.download(CachingTransport.java:255)
	at org.eclipse.equinox.internal.p2.repository.CacheManager.updateCache(CacheManager.java:402)
	at org.eclipse.equinox.internal.p2.repository.CacheManager.createCacheFromFile(CacheManager.java:132)
	at org.eclipse.equinox.internal.p2.metadata.repository.XZedSimpleMetadataRepositoryFactory.getLocalFile(XZedSimpleMetadataRepositoryFactory.java:56)
	at org.eclipse.equinox.internal.p2.metadata.repository.XZedSimpleMetadataRepositoryFactory.load(XZedSimpleMetadataRepositoryFactory.java:78)
	at org.eclipse.equinox.internal.p2.metadata.repository.MetadataRepositoryManager.factoryLoad(MetadataRepositoryManager.java:57)
	at org.eclipse.equinox.internal.p2.repository.helpers.AbstractRepositoryManager.loadRepository(AbstractRepositoryManager.java:768)
	at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.lang.reflect.Method.invoke(Unknown Source)
	at org.eclipse.oomph.util.ReflectUtil.invokeMethod(ReflectUtil.java:117)
	at org.eclipse.oomph.p2.internal.core.CachingRepositoryManager.loadRepository(CachingRepositoryManager.java:396)
	at org.eclipse.oomph.p2.internal.core.CachingRepositoryManager.loadRepository(CachingRepositoryManager.java:199)
	at org.eclipse.oomph.p2.internal.core.CachingRepositoryManager$Metadata.loadRepository(CachingRepositoryManager.java:463)
	at org.eclipse.equinox.internal.p2.metadata.repository.MetadataRepositoryManager.loadRepository(MetadataRepositoryManager.java:96)
	at org.eclipse.equinox.internal.p2.metadata.repository.MetadataRepositoryManager.loadRepository(MetadataRepositoryManager.java:92)
	at org.eclipse.oomph.p2.internal.core.ProfileTransactionImpl$RepositoryLoader$Worker.perform(ProfileTransactionImpl.java:1613)
	at org.eclipse.oomph.util.WorkerPool$Worker.run(WorkerPool.java:428)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:55)

Perhaps a 502 is a server problem that might be caused by redirecting to some other bad mirror server that's not responding with a proper header?
Ed, I think a single 502 from one user is not enough to justify reopening this.
There are others complaining as well:
https://www.eclipse.org/forums/index.php?t=msg&th=1086814&goto=1765869&#msg_1765869
I also had a problem with artifact downloads Tuesday morning installing all of Oxygen from the Oracle office in Ottawa. It worked the second time though...
I will run some stress-tests this afternoon.
I believe this transparent mirroring of p2 metadata continues to be a problem. While doing the work for https://bugs.eclipse.org/bugs/show_bug.cgi?id=509929 I repeatedly get failures loading repositories at download.eclipse.org. For example, it downloads http://download.eclipse.org/releases/luna/201409261001/content.jar, saving it in p2's cache, but it can't process the entry in the jar. When I try to unzip from it, I can see that the jar is somewhat corrupted:

merks@ecore MINGW64 /d/stuff
$ unzip -l "C:\Users\merks\AppData\Local\Temp\fake-5200449190190453117-user-home\.eclipse\org.eclipse.oomph.setup\.p2\org.eclipse.equinox.p2.repository\cache\content1330234655.jar"
Archive:  C:\Users\merks\AppData\Local\Temp\fake-5200449190190453117-user-home\.eclipse\org.eclipse.oomph.setup\.p2\org.eclipse.equinox.p2.repository\cache\content1330234655.jar
warning [C:\Users\merks\AppData\Local\Temp\fake-5200449190190453117-user-home\.eclipse\org.eclipse.oomph.setup\.p2\org.eclipse.equinox.p2.repository\cache\content1330234655.jar]: 61 extra bytes at beginning or within zipfile (attempting to process anyway)
  Length      Date    Time    Name
---------  ---------- -----   ----
 41555829  2014-09-25 08:41   content.xml
---------                     -------
 41555829                     1 file

But if I download it from the browser, I can see they are indeed different in size by 61 bytes. The bad thing is, once this file is in p2's cache, it will just keep using it. That explains why people have problems, and why those problems aren't fixed except by clearing p2's cache. So I must reiterate how much of a concern this transparent mirroring of p2 metadata is to me. It definitely causes problems, and there's no way to avoid them (unlike p2's own use of a mirror URL, which I can disable if some mirrors have problems).
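A quick way to spot this class of corruption without unzipping (a hypothetical sketch; the cache path is the one from the report above): a well-formed zip/jar starts with the local-file-header magic PK\x03\x04, so leading garbage such as those 61 extra bytes shifts it away from offset 0.

```shell
# Print the first four bytes of a file as hex; "504b0304" means a clean
# zip/jar header, anything else suggests leading garbage in the cached copy.
zip_magic() {
  head -c 4 "$1" | od -An -tx1 | tr -d ' \n'
}

# Hypothetical local path mirroring the cached jar from the report above.
f="$HOME/.eclipse/org.eclipse.oomph.setup/.p2/org.eclipse.equinox.p2.repository/cache/content1330234655.jar"
if [ "$(zip_magic "$f" 2>/dev/null)" != "504b0304" ]; then
  echo "no clean zip header -- clear the p2 cache and re-download"
fi
```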
And again:

[WARNING] An error occurred while transferring artifact packed: osgi.bundle,com.google.guava,15.0.0.v201403281430 from repository http://download.eclipse.org/releases/luna/201502271000:
[WARNING] Retry another mirror:
[WARNING] HTTP Server 'Service Unavailable': http://download.eclipse.org/releases/luna/201502271000/plugins/com.google.guava_15.0.0.v201403281430.jar.pack.gz
Christian, when did you get that error, just now or earlier?
I get errors like these about 2-3 times a week
(on different machines in different companies with different internet lines).
We're dealing with 503's via bug 519877. We'll nail this down.
here is another one http://services.typefox.io/open-source/jenkins/job/xtext-eclipse/job/cd_issue312/1/console
or here http://services.typefox.io/open-source/jenkins/job/xtext-eclipse/job/cd_issue312/3/console
Also, locally from Grenoble/France over Numericable ISP, I see timeouts with any client (Firefox, Eclipse IDE, Tycho...) when trying to reach download.eclipse.org. And http://downorisitjustme.com/res.php?url=download.eclipse.org also reports that the server is not accessible.
*** Bug 520193 has been marked as a duplicate of this bug. ***
I'm seeing this too right now; I'm unable to build Sirius locally:

[INFO] Adding repository http://download.eclipse.org/tools/orbit/downloads/drops/R20170307180635/repository
juil. 26, 2017 10:59:28 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFOS: I/O exception (java.net.SocketException) caught when processing request to {}->http://download.eclipse.org: Connection reset
juil. 26, 2017 10:59:28 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFOS: Retrying request to {}->http://download.eclipse.org
[ERROR] Failed to resolve target definition /home/pcdavid/src/sirius/5.1.x/git/org.eclipse.sirius/packaging/org.eclipse.sirius.parent/../../releng/org.eclipse.sirius.targets/./sirius_oxygen.target: Failed to load p2 metadata repository from location http://download.eclipse.org/tools/orbit/downloads/drops/R20170307180635/repository: Unable to read repository at http://download.eclipse.org/tools/orbit/downloads/drops/R20170307180635/repository/content.xml. Connect to download.eclipse.org:80 timed out -> [Help 1]
I also have an error with the Eclipse Installer:

[2017-07-26 12:17:08] Executing bootstrap tasks
[2017-07-26 12:17:08] Java(TM) SE Runtime Environment 1.8.0_121-b13
[2017-07-26 12:17:08] Product org.eclipse.products.epp.package.jee.oxygen
[2017-07-26 12:17:08] Bundle org.eclipse.oomph.setup 1.8.0.v20170408-0745, build=3059, branch=2161405b80cf99ed791602ba56cdf44084f5ca43
[2017-07-26 12:17:08] Bundle org.eclipse.oomph.setup.core 1.8.0.v20170531-0903, build=3059, branch=2161405b80cf99ed791602ba56cdf44084f5ca43
[2017-07-26 12:17:08] Bundle org.eclipse.oomph.setup.p2 1.8.0.v20170318-0419, build=3059, branch=2161405b80cf99ed791602ba56cdf44084f5ca43
[2017-07-26 12:17:08] Performing P2 Director (Eclipse IDE for Java EE Developers (Oxygen))
[2017-07-26 12:17:08] Offline = false
[2017-07-26 12:17:08] Mirrors = true
[2017-07-26 12:17:08] Resolving 65 requirements from 3 repositories to C:\Users\stalinski\eclipse\jee-oxygen3\eclipse
[2017-07-26 12:17:08] Requirement epp.package.jee [4.7.0,4.8.0)
[2017-07-26 12:17:08] Requirement org.eclipse.platform.feature.group [4.7.0,4.8.0)
[2017-07-26 12:17:08] Requirement org.eclipse.rcp.feature.group [4.7.0,4.8.0)
[2017-07-26 12:17:08] Requirement org.eclipse.buildship.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.cft.server.core.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.cft.server.ui.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.datatools.common.doc.user.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.datatools.connectivity.doc.user.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.datatools.connectivity.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.datatools.doc.user.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.datatools.enablement.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.datatools.intro.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.datatools.modelbase.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.datatools.sqldevtools.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.datatools.sqltools.doc.user.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.eclemma.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.egit.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.egit.mylyn.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jdt.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jpt.common.eclipselink.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jpt.common.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jpt.dbws.eclipselink.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jpt.jaxb.eclipselink.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jpt.jaxb.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jpt.jpa.eclipselink.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jpt.jpa.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jsf.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jst.common.fproj.enablement.jdt.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jst.enterprise_ui.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jst.jsf.apache.trinidad.tagsupport.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jst.server_adapters.ext.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jst.server_adapters.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jst.server_ui.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jst.web_ui.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jst.webpageeditor.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jst.ws.axis2tools.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jst.ws.cxf.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jst.ws.jaxws.dom.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.jst.ws.jaxws.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.m2e.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.m2e.logback.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.m2e.wtp.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.m2e.wtp.jaxrs.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.m2e.wtp.jpa.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.m2e.wtp.jsf.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.mylyn.bugzilla_feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.mylyn.context_feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.mylyn.ide_feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.mylyn.java_feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.mylyn.wikitext_feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.mylyn_feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.pde.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.recommenders.mylyn.rcp.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.recommenders.rcp.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.rse.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.rse.useractions.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.tm.terminal.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.wst.common.fproj.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.wst.jsdt.chromium.debug.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.wst.jsdt.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.wst.server_adapters.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.wst.web_ui.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.wst.xml_ui.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.wst.xsl.feature.feature.group
[2017-07-26 12:17:08] Requirement org.eclipse.oomph.setup.feature.group
[2017-07-26 12:17:08] Repository http://download.eclipse.org/technology/epp/packages/oxygen
[2017-07-26 12:17:08] Repository http://download.eclipse.org/releases/oxygen/201706281000
[2017-07-26 12:17:08] Repository http://download.eclipse.org/oomph/updates/milestone/latest
[2017-07-26 12:18:32] ERROR: org.eclipse.equinox.p2.transport.ecf code=1002 Unable to connect to repository http://download.eclipse.org/releases/oxygen/201706281000/content.xml
java.net.ConnectException: Connection timed out: connect
	at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
	at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
	at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
	at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
	at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
	at java.net.PlainSocketImpl.connect(Unknown Source)
	at java.net.SocksSocketImpl.connect(Unknown Source)
	at java.net.Socket.connect(Unknown Source)
	at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:120)
	at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:179)
	at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:328)
	at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:612)
	at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:447)
	at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:884)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
	at org.eclipse.ecf.provider.filetransfer.httpclient4.HttpClientFileSystemBrowser.runRequest(HttpClientFileSystemBrowser.java:263)
	at org.eclipse.ecf.provider.filetransfer.browse.AbstractFileSystemBrowser$DirectoryJob.run(AbstractFileSystemBrowser.java:69)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:56)
[2017-07-26 12:18:32]
I've made several adjustments via bug 519877. Please let me know if the timeouts persist.
Just now, I have these messages in a job:

...
[INFO] Adding repository http://download.eclipse.org/technology/swtbot/releases/2.1.1
[WARNING] Failed to access p2 repository http://download.eclipse.org/technology/swtbot/releases/2.1.1, use local cache. Unable to connect to repository http://download.eclipse.org/technology/swtbot/releases/2.1.1/content.xml
[INFO] Fetching p2.index from http://download.eclipse.org/cbi/updates/license/
[INFO] Adding repository http://download.eclipse.org/cbi/updates/license
août 02, 2017 9:07:55 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFOS: I/O exception (java.net.SocketException) caught when processing request to {}->http://download.eclipse.org: Connection reset
août 02, 2017 9:07:55 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFOS: Retrying request to {}->http://download.eclipse.org
août 02, 2017 9:11:33 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFOS: I/O exception (java.net.SocketException) caught when processing request to {}->http://download.eclipse.org: Connection reset
août 02, 2017 9:11:33 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFOS: Retrying request to {}->http://download.eclipse.org
[INFO] Fetching p2.index from http://download.eclipse.org/acceleo/updates/nightly/latest/
[INFO] Fetching p2.index from http://download.eclipse.org/acceleo/updates/nightly/latest/
[INFO] Adding repository http://download.eclipse.org/acceleo/updates/nightly/latest
août 02, 2017 9:12:27 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFOS: I/O exception (java.net.SocketException) caught when processing request to {}->http://download.eclipse.org: Connection reset
août 02, 2017 9:12:27 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFOS: Retrying request to {}->http://download.eclipse.org
[INFO] Adding repository http://download.eclipse.org/modeling/emft/eef/updates/nightly/latest/neon
...
Yes, I have problems right now as well. And even this page doesn't work right now: https://dev.eclipse.org/committers/help/status.php
(In reply to Ed Merks from comment #34)
> Yes, I have problems right now as well. And even this page doesn't work
> right now:
>
> https://dev.eclipse.org/committers/help/status.php

That page doesn't work anymore; it was replaced by this one:

https://accounts.eclipse.org/committertools/infra-status

I'll add a redirect and continue investigating. Connection reset is a different animal.
I have just got a target resolution problem: "Error occured during resolution of 'http://download.eclipse.org/tools/orbit/downloads/drops/R20170307180635/repository'. The IU 'org.junit' with range constraint '[4.11.0,5.0.0)' can not be found." Everything was OK just a few minutes ago.
It's happening again: https://www.eclipse.org/forums/index.php/mv/msg/1088108/1770198/#msg_1770198
Same here:
https://github.com/xtext/maven-xtext-example/issues/48
and here:
https://github.com/xtext/maven-xtext-example/issues/47
and with some builds on the Xtext Jenkins.
Same here tonight with our internal Sirius test jobs:

[WARNING] Failed to access p2 repository http://download.eclipse.org/tools/orbit/downloads/drops/R20160520211859/repository, use local cache. Neither http://download.eclipse.org/tools/orbit/downloads/drops/R20160520211859/repository/content.jar nor http://download.eclipse.org/tools/orbit/downloads/drops/R20160520211859/repository/content.xml found.
[WARNING] Failed to access p2 repository http://download.eclipse.org/cbi/updates/license, use local cache. Neither http://download.eclipse.org/cbi/updates/license/compositeContent.jar nor http://download.eclipse.org/cbi/updates/license/compositeContent.xml found.
août 09, 2017 5:29:49 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFOS: I/O exception (java.net.SocketException) caught when processing request to {}->http://download.eclipse.org: Connection reset
août 09, 2017 5:29:49 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFOS: Retrying request to {}->http://download.eclipse.org
[WARNING] Failed to access p2 repository http://download.eclipse.org/cbi/updates/license/1.0.0.v20131003-1638, use local cache. Neither http://download.eclipse.org/cbi/updates/license/1.0.0.v20131003-1638/content.jar nor http://download.eclipse.org/cbi/updates/license/1.0.0.v20131003-1638/content.xml found.
[INFO] Fetching p2.index from http://download.eclipse.org/acceleo/updates/releases/3.7/R201705121344/
[INFO] Fetching p2.index from http://download.eclipse.org/acceleo/updates/releases/3.7/R201705121344/
[INFO] Adding repository http://download.eclipse.org/acceleo/updates/releases/3.7/R201705121344
août 09, 2017 5:36:58 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFOS: I/O exception (java.net.SocketException) caught when processing request to {}->http://download.eclipse.org: Connection reset
août 09, 2017 5:36:58 AM org.apache.http.impl.client.DefaultRequestDirector tryExecute
INFOS: Retrying request to {}->http://download.eclipse.org
[WARNING] Failed to access p2 repository http://download.eclipse.org/acceleo/updates/releases/3.7/R201705121344, use local cache. Unable to read repository at http://download.eclipse.org/acceleo/updates/releases/3.7/R201705121344/content.xml.
[INFO] Adding repository http://download.eclipse.org/modeling/emft/eef/updates/milestones/2.0/S20170531111419
[WARNING] Failed to access p2 repository http://download.eclipse.org/modeling/emft/eef/updates/milestones/2.0/S20170531111419, use local cache. Unable to read repository at http://download.eclipse.org/modeling/emft/eef/updates/milestones/2.0/S20170531111419/content.xml.
[...]
[ERROR] Failed to resolve target definition /home/integration/workspace/sirius--tests-master-oxygen/PLATFORM/oxygen/SUITE/swtbot-sequence/jdk/JDK8/label/GTK3/packaging/org.eclipse.sirius.parent/../../releng/org.eclipse.sirius.targets/./sirius_oxygen.target: Failed to load p2 metadata repository from location http://download.eclipse.org/sirius/updates/nightly/latest/oxygen/: Unable to read repository at http://download.eclipse.org/sirius/updates/nightly/latest/oxygen/content.xml. Connect to download.eclipse.org:80 timed out
Probably related: often, when the comments in this bug reach a high frequency, I also see, alongside the download.eclipse.org timeouts, problems with oss.sonatype.org (Maven Central) uploads. The final closing acknowledgement from the Nexus repository never reaches our Hudson, eventually making the build fail.
http://services.typefox.io/open-source/jenkins/job/xtext-extras/job/cd_extras_issue166/1/console

[ERROR] Failed to resolve target definition /var/jenkins_home/workspace/t-extras_cd_extras_issue166-572ZK4AEQDYS6TCKIKCK47E75DQAT5PIZNTWLRJ6WIGUM5KOFECA/releng/releng-target/xtext-extras.target.target: Failed to load p2 metadata repository from location http://download.eclipse.org/modeling/tmf/xtext/updates/releases/2.10.0/: Unable to read repository at http://download.eclipse.org/modeling/tmf/xtext/updates/releases/2.10.0/content.jar. connect timed out -> [Help 1]
We're on it. Thanks for your patience!
We've restarted the http server on two of our four download nodes. For some reason, they were keeping a high number of connections in WAIT state. It should be back to normal now. Thank you for your patience.
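For anyone wanting to check a node themselves, counting connections per TCP state makes this kind of pile-up visible. A hypothetical sketch, run here against canned `ss -tan`-style output so the pipeline is clear (on a live node, feed it `ss -tan` directly):

```shell
# Tally connections by state; a large TIME-WAIT/CLOSE-WAIT count matches the
# symptom described above. The sample stands in for real `ss -tan` output.
sample='State Recv-Q Send-Q Local-Address:Port Peer-Address:Port
ESTAB 0 0 198.51.100.10:80 192.0.2.1:50000
TIME-WAIT 0 0 198.51.100.10:80 192.0.2.2:50001
TIME-WAIT 0 0 198.51.100.10:80 192.0.2.3:50002'

printf '%s\n' "$sample" \
  | awk 'NR > 1 { count[$1]++ } END { for (s in count) print s, count[s] }' \
  | sort
# On a live node: ss -tan | awk 'NR > 1 { count[$1]++ } END { for (s in count) print s, count[s] }' | sort
```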
Hi, same problems again resolving a TP locally:

Connection to http://download.eclipse.org/releases/neon/201612211000/content.xml.xz failed on Connection timed out: connect. Retry attempt 0 started
Connection to http://download.eclipse.org/releases/neon/201606221000/p2.index failed on Connection timed out: connect. Retry attempt 0 started
Connection to http://download.eclipse.org/releases/neon/p2.index failed on Connection timed out: connect. Retry attempt 0 started
Connection to http://download.eclipse.org/sirius/updates/nightly/5.1.0-N20170811-090020/neon/content.jar failed on Connection timed out: connect. Retry attempt 0 started
Connection to http://download.eclipse.org/tools/orbit/downloads/drops/R20170307180635/repository/p2.index failed on Connection timed out: connect. Retry attempt 0 started

And as a result: "The current target platform contains errors, open Window > Preferences > Plug-in Development > Target Platform for details."
Download server 1 had a high number of connections. After a restart it's now back to normal.
Thanks, it resolves my timeout problem.
Created attachment 269873 [details] Access from OSU OSL I noticed a blip in overall access time about 40 minutes before midnight (Ottawa time), and even a complete failure of download.e.o (gray line). I'm tracking this but the leads are light.
Repeated connection resets are observed now at download.eclipse.org: [ERROR] Internal error: java.lang.RuntimeException: Failed to load p2 repository with ID 'neon' from location http://download.eclipse.org/releases/neon: Unable to connect to repository http://download.eclipse.org/releases/neon/content.xml: Connection timed out: connect
*** Bug 521284 has been marked as a duplicate of this bug. ***
2 out of 4 of our download nodes had a high number of connections. I've restarted the http daemon and everything seems to run smoothly now.
Created attachment 269954 [details] Screenshot The timing seems to be consistent. ~10:30 pm local time, (and again at around 2:00am local time). The gray line in the attached chart shows that access from OSU OSL was interrupted for download.e.o for a noticeable period.
and yet another case: http://services.typefox.io/open-source/jenkins/job/xtext-eclipse/job/cd_issue337/1/console
Again, 2 out of 4 of our download nodes had a high number of connections. I've restarted the http daemon and everything seems to run smoothly now. For the record: download1 had 29,056 connections, download2 had 25,152.
nope: http://services.typefox.io/open-source/jenkins/job/xtext-eclipse/job/cd_issue337/4/console
(In reply to Christian Dietrich from comment #54)
> nope:
> http://services.typefox.io/open-source/jenkins/job/xtext-eclipse/job/cd_issue337/4/console

Please give it another try; your build ran just after the restart and the nodes were probably in an intermediate state.
grr, now it's download3 which has a high number of connections... restart in progress
Should be better now. For the record: download3 had 15,552 connections.
Are the download servers merely underpowered or is it a different problem? With maven central uploads we have the problem that final acknowledgments never get from the remote server to our HIPPs. Might this be a similar problem where the downloading client closes the connection but the download server never gets that final message and keeps a lot of unnecessary connections open?
looks better now
Do you guys know the source of the connections? - Are they active or stale? - All from the same source or from multiple? I wonder why so many connections exist.
We seem to have a pattern of late Tuesday/early Wednesday local time. I'm still looking into this. download.e.o gets so much traffic that it's not always easy to spot.
*** Bug 521565 has been marked as a duplicate of this bug. ***
> Are the download servers merely underpowered or is it a different problem?

There are four servers behind download.e.o. Each server can handle >25,000 concurrent connections. At our peak, each server averages less than 1,000. In this case, we're seeing a pattern of many connections being established and then stalled, perhaps intentionally, tying up the servers. We're considering this as possibly an unintentional denial-of-service attack; we just need to pinpoint where it comes from. But there does seem to be a pattern.
Created attachment 270024 [details] Screenshot of 2017-08-30 Access from OSUOSL showing the period in which download.e.o struggles (gray line).
Do the servers support HTTP/2? Could these be idle connections that are kept open?
Created attachment 270026 [details] Correlation

I superimposed a few charts:
1) Page load times
2) Firewall CPU usage
3) Router/Load balancer CPU usage
4) TCP Attack stats

As you can see, when page load times start to suffer, CPU usage spikes on the firewall (2) and on the Load Balancer (3). (4) shows a definite SYN attack, with almost 1,500 attack-flagged SYN packets per second at the peak, sustaining ~500/second for a few hours, then dying off. I'll try to trace this back to the offending IPs and block at the edge as necessary.
CPU usage increases before SYN floods are detected. Are we losing ACKs somewhere under high load?
(In reply to Arthur van Dorp from comment #67)
> CPU usage increases before SYN floods are detected. We're not loosing ACKs
> on high load somewhere?

CPU usage spikes to 40% regularly. These devices can operate at high CPU loads before packets are dropped. Regardless, TCP/IP will retry a number of times before a timeout, so even if an occasional SYN/ACK/SYN+ACK is discarded because of CPU load, you won't necessarily get a timeout, just a slow connection.
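As background on why a dropped SYN shows up as slowness rather than an instant failure: Linux retransmits the initial SYN with exponential backoff, governed by net.ipv4.tcp_syn_retries (default 6), so connect() only gives up after roughly two minutes. A back-of-envelope sketch, assuming the conventional 1-second initial retransmission timeout that doubles on each retry:

```shell
# Back-of-envelope: cumulative time before a Linux connect() attempt to a
# silent host gives up, assuming a 1s initial retransmission timeout that
# doubles on each SYN retransmission (default net.ipv4.tcp_syn_retries=6).
total=0
rto=1
for attempt in 0 1 2 3 4 5 6; do
  total=$((total + rto))   # wait the current RTO before giving up / retrying
  rto=$((rto * 2))
done
echo "${total}s"   # why a dropped SYN looks like a long hang, not an error
```

With those assumptions the total comes to about 127 seconds, which matches the long "Connection timed out" waits reported throughout this bug.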
As an aside, while our networking devices are plenty capable of dealing with this silliness, I'll engage with our upstream provider to see if they can (and want to) filter upstream.
Created attachment 270053 [details] Pattern The gray line on this chart definitely shows the Tuesday late night (localtime) pattern.
(In reply to Denis Roy from comment #70)
> The gray line on this chart definitely shows the Tuesday late night
> (localtime) pattern.

Could it still be the automatic-update schedule from earlier releases hitting the download server? Is there a way to know which HTTP requests/responses were made by the WAITing connections, to find out whether it's something implemented in the IDE?
> Can it still be the automatic updates schedule from earlier releases still
> hitting the download server?

I don't want to speak too soon, but based on our firewall reports, the traffic sent is pure garbage -- and lots of it. Malformed tcp packets, port scans, syn attacks, etc.

> Is there a way to know which HTTP requests/response were done by the WAITing
> connections to find out whether it's something implemented in the IDE?

Again, I don't want to speak too soon on this, but I think there's something else at play. download.e.o is hitting what appears to be this bug (based on the errors and comment 53): https://bz.apache.org/bugzilla/show_bug.cgi?id=53555 Other hosts, including www, are also the targets of this "garbage" traffic, but only download.e.o ends up being filled with connections it cannot get rid of. It could be a red herring, though. We're not using the "Event" MPM, which the bug targets.
At the beginning of the DoS attack (22:01 localtime), download.e.o was receiving over 4,200 requests per second (typically download.e.o serves between 200 and 1,000 req/sec). The top-20 IP addresses map to these host organizations:
1. China Mobile Guangdong
2. Red Hat (really?)
3. China Mobile Shandong
4. Singapore Telecommunications
5. China Mobile Guangdong
6. Shenzhen branch, China Netcom Corp
7. China Telecom Shanghai
8. China Telecom Guangdong
9. Cyber Express Communication (Shanghai)
10. Beijing Guanghuan Xinwang Digital
11. China Telecom Jiangsu
12. Hong Kong Broadband Network Ltd
13. China Unicom Heilongjiang
14. China Mobile Guangdong
15. China Mobile Guangdong
16. China Unicom Jiangsu
17. China Telecom Guangdong
18. Hewlett-Packard Company
19. China Telecom Jiangsu
20. China Telecom Guangdong

(In reply to Mickael Istria from comment #71)
> Can it still be the automatic updates schedule from earlier releases still
> hitting the download server?

Based on the timing (10:01pm our time), it would appear that auto-updates could be a solid contributor.
Created attachment 270089 [details] SYN attack We were able to weather the storm tonight. The pattern is repeatable and predictable: around 10:00pm localtime on Tuesday night, we get absolutely clobbered with SYN packets for download.e.o from China. The main cause of the slowdown is the Load Balancer -- at the peak, its CPU is depleted. I need to filter these SYN packets at the border. But since it's widely distributed, it's quite challenging.
Created attachment 270090 [details] CSS CPU usage 100% CPU usage on the CSS during the SYN flood
stuff is down again
http://download.eclipse.org/ is down. I get this error while trying to load the page: A communication error occurred: "Connection refused" The Web Server may be down, too busy, or experiencing other problems preventing it from responding to requests. You may wish to try again at a later time.
It seems like a lot of Eclipse IDE instances in China are still facing/producing bug 498116. Should we reopen that other bug? In terms of hardware, would it be possible and worthwhile to buy/rent a dedicated load balancer in China that download.eclipse.org would resolve to from China? This issue is currently costing both the IT staff and the community a lot of time...
(In reply to Mickael Istria from comment #78)
> It seems like a lot of Eclipse IDE instances in China are still
> facing/producing this bug 498116. Should we reopen this other bug?
> In term of hardware, would it be possible and worth it to buy/rent a
> specific load balancer in China that would be the one download.eclipse.org
> would resolve to from China? This issue is currently costing both IT staff
> and community a lot of time...

I wonder, though, what time and what day it is in China at that point in Ottawa local time? I thought it would be Wednesday in China then...
(In reply to Ed Merks from comment #79)
> I wonder though what time is it in China at that point in Ottawa local time
> and what day? I thought it would be Wednesday in China then...

You're right, it would be 10:00AM *Wednesday* in China, not Tuesday as the initial auto-update schedule was set... But the coincidence on 10:00AM is strong enough to keep digging in the direction of auto-update scheduling. Could it be that the fuzzy schedule isn't that fuzzy, and resolves on most Chinese machines to the same day and hour?
We are investigating the current outage.
download.eclipse.org is ramping back up. We had a misconfiguration that spread across our apache nodes.
(In reply to Lakshmi Shanmugam from comment #77) > http://download.eclipse.org/ is down. > I get this error while trying to load the page: > A communication error occurred: "Connection refused" Unfortunately, that one was self-inflicted, as Mikael mentioned in comment 82. *sigh* (In reply to Mickael Istria from comment #80) > You're right, it would be 10:00AM *Wednesday* Yes, on Wednesday 10:00am something happens in China. I'm not 100% convinced it's auto updates but my investigation continues on that front, and from preventing it from killing us. This week, however, it did not kill us. We killed ourselves hours later :(
> I need to filter these SYN packets at the border. But since
> it's widely distributed, it's quite challenging.

I've reconfigured our border firewall to be more aggressive with monitoring and blocking threats. The Load Balancer should see much less garbage traffic next week, but I'll keep a close eye on it.

Just to recap the scope of this bug since it was opened:
- misconfigurations due to the transparent mirroring system and the new download.e.o server cluster (php-fpm)
- Apache httpd bug https://bz.apache.org/bugzilla/show_bug.cgi?id=53555 (inconclusive, but we've not needed to restart Apache httpd since applying the latest patches)
- weekly DoS attack since the beginning of August from China (work in progress)
- puppet configuration error (our mistake)

If our systems can handle next Tuesday, I'll close this as fixed.
> If our systems can handle next Tuesday, I'll close this as fixed. We are not yet done here, we were still affected by massive traffic at 10:30pm last night. A solid contributor was that the tcp tweaks put in place in May were no longer applied, so I need to find out why. (In reply to Denis Roy from comment #11) > I've tweaked the tcp stack on the download server to increase the connection > pool, terminate connections faster and reduce timeouts for idle connections. > I'll continue to monitor but we should be good here.
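As background on what "tweaking the tcp stack" typically involves, below is a sketch of the kind of settings alluded to in comment 11. The values are illustrative assumptions, not the actual settings used on download.e.o (those are not recorded in this bug):

```
# /etc/sysctl.d/90-tcp-tuning.conf -- illustrative values only; the real
# settings applied to the download.e.o cluster are not documented here.
net.ipv4.tcp_syncookies = 1          # keep accepting connections during SYN floods
net.ipv4.tcp_max_syn_backlog = 8192  # larger half-open connection queue
net.core.somaxconn = 4096            # larger accept queue
net.ipv4.tcp_fin_timeout = 15        # tear down FIN-WAIT-2 connections faster
net.ipv4.tcp_tw_reuse = 1            # reuse TIME_WAIT sockets for outgoing connections
```

Settings placed in /etc/sysctl.d/ are applied with `sysctl --system` and survive reboots, unlike one-off `sysctl -w` commands -- relevant here, given that the May tweaks were found to be no longer applied.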
Created attachment 270702 [details] Network status

We handled DoS Tuesday very well last night. There was some flakiness accessing some of our services, but a retry typically worked. Some strategies that seemed to make a difference:
- more aggressive filtering on the firewall
- opened up the bandwidth cap from 10:00pm to 10:30 to handle the incoming storm
- paused rsync to mirrors to save bandwidth
- download.e.o connection limits stayed in place (from comment 85)

The attachment shows, in order:
- bandwidth spike at 10:00pm as we opened up the pipe
- page load times at 10:00pm did suffer somewhat, but there was no outage
- Load Balancer CPU usage didn't go above 75%
- Firewall CPU usage peaked at 86%
- Load Balancer SYN attacks stayed reasonable, 50/sec (down from 3,000+/sec)

One thing I did notice is the odd provenance of some of the "bad" traffic. There are likely more than these:
-> DDoSDeflect.org
-> SECURITYTEAMVPN.COM
-> nuclearvpn.me
-> Skapy-vpn-ny.com
(A VPN can be used to obfuscate the source IP)

My investigation will now go in two directions:
1. Why 10:00pm Tuesdays (i.e., what are these connections hitting)
2. What are all those domains used for.
Lowering severity, as this is no longer blocking.
> 1. Why 10:00pm Tuesdays (ie, what are these connections hitting) From 22:00-22:10 localtime (10pm) here were the most requested files from one download server: download1:~ # egrep "2017:22:0" access_log.1 | awk {'print $7'} | sort | uniq -c | sort -nr | head -30 11270 /e4/snapshots/org.eclipse.e4.ui/content.xml.xz 10201 /releases/mars/compositeContent.jar 9859 /eclipse/updates/4.5/compositeContent.jar 9469 /mylyn/releases/mars/compositeContent.xml 8875 /mylyn/releases/mars/compositeContent.jar 7994 /technology/epp/packages/mars/content.xml.xz 7701 /releases/mars/201506241002/content.xml.xz 7230 /eclipse/updates/4.5/categoriesMars/content.jar 7162 /tools/orbit/downloads/drops/R20150821153341/repository/content.jar 6362 /releases/mars/201510021000/content.xml.xz 6187 /webtools/repository/mars/compositeContent.jar 5869 /releases/mars/201602261000/content.xml.xz 5691 /eclipse/updates/4.5/R-4.5-201506032000/content.xml.xz 4936 /eclipse/updates/4.5/R-4.5.1-201509040015/content.xml.xz 4792 /eclipse/updates/4.5/R-4.5.2-201602121500/content.xml.xz 4663 /mylyn/drops/3.18.0/v20151215-0126/content.jar 4541 /webtools/downloads/drops/R3.7.0/R-3.7.0-20150609111814/repository/content.jar 12 hours earlier (10:00am - 10:10am local time) this is what was being requested: download1:~ # egrep "2017:10:0" access_log.1 | awk {'print $7'} | sort | uniq -c | sort -nr | head -30 1138 /technology/epp/packages/oxygen/content.xml.xz 1129 /releases/oxygen/compositeContent.jar 1105 /releases/oxygen/201706281000/content.xml.xz 950 /releases/neon/compositeContent.jar 850 /oomph/updates/milestone/latest/compositeContent.jar 839 /recommenders/models/neon/jre/jre/1.0.0-SNAPSHOT/maven-metadata.xml 838 /technology/epp/packages/neon/content.xml.xz 832 /releases/neon/201705151400/content.xml.xz 828 /recommenders/models/neon/jre/jre/1.0.0-SNAPSHOT/maven-metadata.xml.sha1 801 /eclipse/updates/4.7/compositeContent.jar 794 /oomph/drops/milestone/S20170714-043548-1.9.0-M1/content.jar 785 
/releases/neon/201703141400/content.xml.xz 775 /releases/neon/201612211000/content.xml.xz 766 /releases/neon/201606221000/content.xml.xz 764 /webtools/repository/neon/compositeContent.jar Two things I note: 1. 10:00pm accesses are for Mars files. Oxygen doesn't even chart. 2. 10:00pm volume is roughly 10x higher than 10:00am From 22:20 - 22:30 (10:20pm localtime) we're still being asked for files >2y old, Oxygen content doesn't even register. download1:~ # egrep "2017:22:2" access_log.1 | awk {'print $7'} | sort | uniq -c | sort -nr | head -30 4115 /tools/orbit/downloads/drops/R20150821153341/repository/content.jar 3677 /webtools/downloads/drops/R3.7.2/R-3.7.2-20160217020110/repository/content.jar 3510 /webtools/patches/drops/R3.7.1/P-3.7.1-20151211112237/repository/content.jar 3451 /webtools/downloads/drops/R3.7.1/R-3.7.1-20150915020029/repository/content.jar 3279 /webtools/repository/mars/compositeContent.jar 3225 /webtools/downloads/drops/R3.7.0/R-3.7.0-20150609111814/repository/content.jar 3095 /e4/snapshots/org.eclipse.e4.ui/content.xml.xz 3071 /eclipse/updates/4.5/compositeContent.jar 2962 /jetty/updates/jetty-wtp/compositeContent.xml 2890 /eclipse/updates/4.5/categoriesMars/content.jar 2833 /eclipse/updates/4.5/R-4.5.2-201602121500/content.xml.xz 2808 /eclipse/updates/4.5/R-4.5.1-201509040015/content.xml.xz 2749 /eclipse/updates/4.5/R-4.5-201506032000/content.xml.xz Earlier, 10:20-10:30am localtime. 
download1:~ # egrep "2017:10:2" access_log.1 | awk {'print $7'} | sort | uniq -c | sort -nr | head -30 1065 /releases/oxygen/compositeContent.jar 1043 /technology/epp/packages/oxygen/content.xml.xz 978 /releases/neon/201705151400/content.xml.xz 934 /releases/neon/compositeContent.jar 922 /releases/oxygen/201706281000/content.xml.xz 817 /technology/epp/packages/neon/content.xml.xz 792 /oomph/drops/milestone/S20170714-043548-1.9.0-M1/content.jar 785 /releases/neon/201703231000/content.xml.xz 778 /oomph/updates/milestone/latest/compositeContent.jar 770 /recommenders/models/neon/jre/jre/1.0.0-SNAPSHOT/maven-metadata.xml.sha1 769 /recommenders/models/neon/jre/jre/1.0.0-SNAPSHOT/maven-metadata.xml 761 /releases/neon/201610111000/content.xml.xz 749 /releases/neon/201609281000/content.xml.xz 739 /eclipse/updates/4.7/compositeContent.jar 738 /releases/neon/201606221000/content.xml.xz 734 /releases/neon/201612211000/content.xml.xz 734 /eclipse/updates/4.7/categoriesOxygen/content.jar User-agent strings from 22:00-22:10, compared to 10:00-10:10: download1:~ # egrep "2017:22:0" access_log.1 | awk {'print $12'} | sort | uniq -c | sort -nr | head -30 312883 "Apache-HttpClient/4.3.6 53753 "p2/mars-sr0 11252 "Apache-HttpClient/4.5.2 4676 "Aether" 2193 "p2/1.1.201.v20161115-1927 2051 "p2/1.1.300.v20161004-0244 1250 "Nexus/2.13.0-01 762 "Jakarta 570 "Mozilla/5.0 483 "Mozilla/4.0 418 "Apache-HttpClient/4.2.6 349 "Java/1.8.0_144" 303 "Apache-HttpClient/4.1.3 226 "Java/1.8.0_131" 191 "Java/1.8.0_111" download1:~ # egrep "2017:10:0" access_log.1 | awk {'print $12'} | sort | uniq -c | sort -nr | head -30 32153 "Apache-HttpClient/4.3.6 25836 "Apache-HttpClient/4.5.2 6804 "Aether" 5867 "p2/1.1.300.v20161004-0244 4863 "p2/mars-sr0 4597 "Nexus/2.13.0-01 4153 "p2/1.1.201.v20161115-1927 1611 "Jakarta 1179 "Apache-HttpClient/4.1.3 857 "Apache-HttpClient/4.2.6 819 "Mozilla/5.0
We are done here. Thanks for everyone's patience. We'll never be immune to DDoS attacks, and we've done everything we could to tune/tweak our firewalls, servers and networking gear. To summarize what has been done: - tuning tcp stack on download.e.o cluster - switch php-fpm to unix sockets instead of tcp - lowering Apache httpd timeouts - Apache httpd bug fixed - 53555 - border firewall more aggressive threat detection and mitigation - open bandwidth limit during attacks to minimize retransmissions and retries - per-file mirror tracking and pre-redirect checks (bug 517294)
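One item in the list above, switching php-fpm from TCP to Unix sockets, removes the loopback TCP connections (and their TIME_WAIT churn) between the web server and PHP. A minimal sketch of the change, with a hypothetical pool name and socket path; the actual Eclipse configuration is not shown in this bug:

```
; /etc/php-fpm.d/www.conf -- hypothetical pool config, illustrative only
; before: listen = 127.0.0.1:9000
listen = /run/php-fpm/www.sock
listen.owner = nginx
listen.group = nginx
listen.mode = 0660

; the web-server side changes accordingly, e.g. in nginx:
;   fastcgi_pass unix:/run/php-fpm/www.sock;   (was fastcgi_pass 127.0.0.1:9000;)
```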
Thanks a lot Denis and others! That was an epic issue, going in so many directions, including mysterious ones, with a lot of pressure on your shoulders from the community. That was hard work, and it's now complete, so thank you and congrats!
Created attachment 271099 [details] CircularRedirectException for Eclipse Mars Repository I was advised here https://www.eclipse.org/forums/index.php/t/1089501/ to attach my log file to this bug. My fresh eclipse mars installation is unable to contact its repository via proxy, though the internal browser and other repos are working fine.
See bug 526280 Thank you!
Over the last few weeks we keep getting sporadic failures accessing http://download.eclipse.org/jetty/updates/jetty-bundles-9.x/content.xml. This is obviously killing builds that try to access this URL for a target platform. Today we keep getting "502 Bad Gateway" / nginx.
Yep, I also get unexplained errors trying to install plugins into the SDK:
HTTP Server 'Bad Gateway' : http://download.eclipse.org/tools/orbit/downloads/drops/R20180829150157/repository/artifacts.jar org.eclipse.ecf.filetransfer.IncomingFileTransferException: HttpComponents connection error response code 502.
HTTP Server 'Bad Gateway' : http://download.eclipse.org/tools/orbit/downloads/drops/R20180525155205/repository/content.jar org.eclipse.ecf.filetransfer.IncomingFileTransferException: HttpComponents connection error response code 502.
*** Bug 541179 has been marked as a duplicate of this bug. ***
The Bad Gateway was handled via Bug 541158 download.e.o is still vulnerable to DDoS. See Bug 541179. Last Wednesday around 8:30-10:30pm Local we got hit by a massive influx of connections -- 13,000 new connections per second, sustained for extended periods. Filtering/blocking only takes us so far. Our core router needs to be upgraded, which I'm in the process of doing.
We've been good here for the last several weeks. We'll keep monitoring into January before closing FIXED.
Our build is getting a "Connection timed out" every time it tries to download this file: http://download.eclipse.org/releases/neon/201703231000/content.xml.xz In my browser, that URL resolves to... http://mirror.rise.ph/eclipse//releases/neon/201703231000/content.xml.xz Started about Dec 22, 2018.
(In reply to Laurent Redor from comment #5)
> Here, since this morning I haven't succeeded to load a TP, with 16 distinct
> locations. Each time the error is not on the same location:
> * Connection to
> http://download.eclipse.org/modeling/emft/eef/updates/releases/1.5/
> R201601141612/features/org.eclipse.emf.eef.collab.runtime-feature_1.5.1.
> 201601141612.jar failed on Connection timed out: connect. Retry attempt 0
> started
> * Connection to
> http://download.eclipse.org/diffmerge/releases/0.7.1/edm-patterns-site/p2.
> index failed on Connection timed out: connect. Retry attempt 0 started
> * ...

I get the feeling that transparent mirroring of p2 metadata is just a bad idea. I see repeated strange problem reports in the forum. Yesterday I could not load Neon's latest update site. Today I get repeated timeouts (from where I'm traveling in Italy).
> I get the feeling that transparent mirroring of p2 metadata is just a bad
> idea.

Yeah, I'm coming to agree with you, but that is a different issue. We could perhaps talk about that in bug 539316.
Unfortunately, almost every day someone has problems that I can only attribute to server problems: https://www.eclipse.org/forums/index.php/mv/msg/1097082/1801211/#msg_1801211 Is the Eclipse server returning bad content, or is it some mirror doing that? And even with timeouts, is it the Eclipse server or the mirror it redirects to that is timing out? There's just no way to know. :-(
I'm baking a patch that will disable transparent mirroring for JAR and XZ files, but keep it active for bin|zip|gz. Indeed, when a client is redirected to a mirror, there is no way of knowing what the remote server is doing.
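A hypothetical sketch of what such a rule could look like in nginx (the actual patch is not attached to this bug): serve p2 metadata directly from local storage, and let other artifact types fall through to the existing mirror-redirect handling. Note that a naive extension match like this still treats *.jar.pack.gz as a .gz file and redirects it:

```
# Illustrative nginx fragment -- paths and behavior are assumptions,
# not the real download.eclipse.org configuration.
location ~ \.(jar|xz)$ {
    # metadata (content.jar, content.xml.xz, ...) served locally, no redirect
    root /home/data/httpd/download.eclipse.org;
    try_files $uri =404;
}
# *.bin, *.zip, *.gz (including *.jar.pack.gz!) continue to hit the
# transparent mirror-redirect location, not shown here.
```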
wget -S --no-hsts http://download.eclipse.org/releases/2018-12/201812191000/content.xml.xz Connecting to download.eclipse.org (download.eclipse.org)|198.41.30.199|:80... connected. HTTP request sent, awaiting response... HTTP/1.1 200 OK Server: nginx Date: Tue, 15 Jan 2019 20:54:39 GMT Content-Type: text/xml Content-Length: 871356 Connection: keep-alive Vary: Accept-Encoding Last-Modified: Thu, 13 Dec 2018 11:39:34 GMT ETag: "d4bbc-57ce5c4c2a980" X-NodeID: download2 X-Proxy-Cache: HIT Accept-Ranges: bytes Length: 871356 (851K) [text/xml] No redirection.
(In reply to Denis Roy from comment #102) > I'm baking a patch that will disable transparent mirroring for JAR and XZ > files, but keep it active for bin|zip|gz Denis, this patch needs a bit more tweaking: > .jar.pack.gz. GET http://download.eclipse.org/releases/2018-12/201812191000/plugins/org.eclipse.dltk.core_5.11.0.201811220629.jar.pack.gz HTTP/1.1 [DEBUG] headers - >> Host: download.eclipse.org [DEBUG] headers - >> Proxy-Connection: Keep-Alive [DEBUG] DefaultClientConnection - Receiving response: HTTP/1.0 307 Temporary Redirect [DEBUG] headers - << HTTP/1.0 307 Temporary Redirect [DEBUG] headers - << Server: nginx
> but keep it active for bin|zip|gz

It's working as expected. I don't want to disable it entirely, only on the repos.
(In reply to Denis Roy from comment #105)
> It's working as expected. I don't want to disable it entirely, only on the
> repos.

Understood. The URL is from a repo. There are plain jars as well as compressed jars in repos, and compressed jars end in *.jar.pack.gz. Thus we still have issues mirroring from the Eclipse.org repos because of this and the redirects.
I have a similar problem:

curl -vvv http://download.eclipse.org/releases/2018-12/201812191000/content.xml.xz
* About to connect() to download.eclipse.org port 80 (#0)
* Trying 198.41.30.199...
* Connected to download.eclipse.org (198.41.30.199) port 80 (#0)
> GET /releases/2018-12/201812191000/content.xml.xz HTTP/1.1
> User-Agent: curl/7.29.0
> Host: download.eclipse.org
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx
< Date: Tue, 26 Feb 2019 13:11:51 GMT
< Content-Type: text/xml
< Content-Length: 871356
< Connection: keep-alive
< Vary: Accept-Encoding
< Last-Modified: Thu, 13 Dec 2018 11:39:34 GMT
< ETag: "d4bbc-57ce5c4c2a980"
< X-NodeID: download2
< X-Proxy-Cache: HIT
< Accept-Ranges: bytes
<
* transfer closed with 871355 bytes remaining to read
* Closing connection 0
curl: (18) transfer closed with 871355 bytes remaining to read

And via https it is correct:

curl -o content.xml.xz -vvv https://download.eclipse.org/releases/2018-12/201812191000/content.xml.xz
* About to connect() to download.eclipse.org port 443 (#0)
* Trying 198.41.30.199...
* Connected to download.eclipse.org (198.41.30.199) port 443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* CAfile: /etc/pki/tls/certs/ca-bundle.crt CApath: none
* SSL connection using TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
* Server certificate:
* subject: CN=*.eclipse.org,OU=IT,O="Eclipse.org Foundation, Inc.",L=Ottawa,ST=Ontario,C=CA
* start date: Jan 17 00:00:00 2017 GMT
* expire date: Mar 02 12:00:00 2020 GMT
* common name: *.eclipse.org
* issuer: CN=DigiCert SHA2 High Assurance Server CA,OU=www.digicert.com,O=DigiCert Inc,C=US
> GET /releases/2018-12/201812191000/content.xml.xz HTTP/1.1
> User-Agent: curl/7.29.0
> Host: download.eclipse.org
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx
< Date: Tue, 26 Feb 2019 13:13:32 GMT
< Content-Type: text/xml
< Content-Length: 871356
< Connection: keep-alive
< Vary: Accept-Encoding
< Last-Modified: Thu, 13 Dec 2018 11:39:34 GMT
< ETag: "d4bbc-57ce5c4c2a980"
< X-NodeID: download2
< Strict-Transport-Security: max-age=15552000; includeSubDomains; preload
< X-Frame-Options: SAMEORIGIN
< X-Content-Type-Options: nosniff
< X-XSS-Protection: 1; mode=block
< X-Proxy-Cache: HIT
< Accept-Ranges: bytes
<
{ [data not shown]
100 850k 100 850k 0 0 120k 0 0:00:07 0:00:07 --:--:-- 149k

Question: as a quick fix, how can I change Eclipse to use https instead of http everywhere, or can this be fixed in the nginx config?
> curl -vvv > http://download.eclipse.org/releases/2018-12/201812191000/content.xml.xz > And via https is correct > curl -o content.xml.xz -vvv > https://download.eclipse.org/releases/2018-12/201812191000/content.xml.xz Um, you're not even calling the same command for http and https. curl will abort transfer when binary data is to be sent to the terminal. This works just fine: curl -vvv -o content.xml.xz http://download.eclipse.org/releases/2018-12/201812191000/content.xml.xz
OK, with -o writing the output to a file, I still get the same response:

curl -vvv -o content.xml.xz http://download.eclipse.org/releases/2018-12/201812191000/content.xml.xz
* About to connect() to download.eclipse.org port 80 (#0)
* Trying 198.41.30.199...
* Connected to download.eclipse.org (198.41.30.199) port 80 (#0)
> GET /releases/2018-12/201812191000/content.xml.xz HTTP/1.1
> User-Agent: curl/7.29.0
> Host: download.eclipse.org
> Accept: */*
>
* Empty reply from server
* Connection #0 to host download.eclipse.org left intact
curl: (52) Empty reply from server
Can you email me your public-facing IP address to webmaster@eclipse.org ?
OK, I found the reason: I am on a new company network where HTTP antivirus scanning is enabled in FortiGate, and that blocks the file from being served over plain HTTP. When I switched to another proxy server not on the default port 80, everything started working as expected.
Interesting, thanks for posting the reason. I'm glad you've resolved your issue. (In reply to Ed Merks from comment #101) > Unfortunately almost every day someone has problems that I can only > attribute to server problems: I wonder if other users are behind corporate networks where http man-in-the-middle scanning and validation can affect our ability to deliver bits.
Closing fixed.