Bug 498116 - Automatic Update Checking hoses the Download Server every Tuesday
Status: RESOLVED FIXED
Alias: None
Product: Community
Classification: Eclipse Foundation
Component: Cross-Project
Version: unspecified
Hardware: All All
Importance: P1 blocker
Target Milestone: ---
Assignee: Cross-Project issues CLA
Depends on: 421779
Blocks: 515596
Reported: 2016-07-19 04:59 EDT by Ed Merks CLA
Modified: 2017-09-06 01:39 EDT

tjwatson: pmc_approved+


Attachments
Screenshot (191.53 KB, image/png)
2016-11-22 14:19 EST, Denis Roy CLA
Screenshot - UpdatePrefsOrig.png (170.20 KB, image/png)
2016-11-23 03:40 EST, Martin Oberhuber CLA
Screenshot - UpdatePrefsMikael.png (185.45 KB, image/png)
2016-11-23 03:41 EST, Martin Oberhuber CLA

Description Ed Merks CLA 2016-07-19 04:59:19 EDT
I noticed that right after 10:00AM this morning (here in Berlin), the download.eclipse.org became unresponsive.  I have a feeling that's related to the automatic update checking in the IDE that by default is set to kick in every Tuesday at 10:00AM.  Surely it's a bad thing to have all running Eclipse installations in any given time zone kick in at the same time and hammer download.eclipse.org with requests for p2 metadata?
Comment 1 Mickael Istria CLA 2016-07-19 05:03:45 EDT
I also found download.eclipse.org unresponsive or less responsive in this same time frame.
Is it technically possible to check whether Ed's feeling is right? Can webmaster check whether so many requests arrived at the same time and caused the latency?
Comment 2 Mikaël Barbero CLA 2016-07-19 05:11:56 EDT
If you look at https://dev.eclipse.org/committers/webstats/download.eclipse.org/usage_201605.php
(I chose May to avoid the release month and to have a full month). Tuesdays were the 3, 10, 17, 24 and 31. And every time we have some huge peaks. So yes, I think it's definitely related. 

(Note that the peaks usually span over 2 days. I suspect this is because of timezones).
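
As a quick sanity check of the dates and of the timezone theory, here is a minimal Python sketch. It only assumes that clients fire at 10:00 local time and that civil UTC offsets range from UTC-12 to UTC+14; it says nothing about how the real traffic is distributed across those zones.

```python
import calendar
from datetime import datetime, timedelta, timezone

# Days of May 2016 that fall on a Tuesday.
tuesdays = [d for d in range(1, 32)
            if calendar.weekday(2016, 5, d) == calendar.TUESDAY]

def utc_instant(day, utc_offset_hours):
    """UTC time at which 10:00 local time occurs on the given day of May 2016."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime(2016, 5, day, 10, 0, tzinfo=tz).astimezone(timezone.utc)

# Extreme civil offsets: UTC+14 reaches Tuesday 10:00 first, UTC-12 last.
earliest = utc_instant(tuesdays[0], 14)
latest = utc_instant(tuesdays[0], -12)
span_hours = (latest - earliest).total_seconds() / 3600
```

Under these assumptions the "Tuesday 10:00" wave starts on Monday evening UTC and ends late on Tuesday UTC, which would be consistent with peaks that touch two calendar days in the server statistics.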
Comment 3 Mickael Istria CLA 2016-07-19 05:27:42 EDT
An alternative strategy, rather than scheduling, would be to use the uniform distribution of a good random. For example, run a "maybe check for updates" every hour, which would randomly decide whether to actually check or not. Assuming we want people to check for updates about once a week and that they have Eclipse IDE open 35 hours a week, we just check for update if the hourly random number is < 1/35.
Benefits:
* uniform distribution, resulting in fewer peaks and better performance of the download server (hopefully)
Drawbacks:
1. No guarantee everyone is notified within a given timeframe
2. Someone using Eclipse rarely is less likely to receive a notification
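
To make the trade-off concrete, here is a minimal Python sketch of the hourly "maybe check" idea. The function names are hypothetical and the 35 hours per week figure is the assumption stated above; this is not p2 code.

```python
import random

HOURS_OPEN_PER_WEEK = 35              # assumption taken from the comment above
CHECK_PROBABILITY = 1 / HOURS_OPEN_PER_WEEK

def should_check_now(rng):
    """Hypothetical hourly hook: decide at random whether to check for updates."""
    return rng.random() < CHECK_PROBABILITY

def prob_no_check(hours_open):
    """Probability that none of `hours_open` hourly draws triggers a check."""
    return (1 - CHECK_PROBABILITY) ** hours_open

def average_checks_per_week(weeks, rng):
    """Simulate `weeks` weeks of hourly draws and return the mean checks per week."""
    draws = weeks * HOURS_OPEN_PER_WEEK
    total = sum(should_check_now(rng) for _ in range(draws))
    return total / weeks
```

On average this yields one check per week, but `prob_no_check(35)` is about 0.36, so roughly a third of users would see no check at all in any given week. That is drawback 1 in numbers.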
Comment 4 Ed Willink CLA 2016-07-19 05:40:21 EDT
A bit simpler: Rather than 10:00, let it be 10:00 plus a uniformly distributed delay in the range 0 to 3 hours. (3 hours could be an obscure preference so that those who really care can adjust it to 0.)
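
A minimal Python sketch of that idea, with hypothetical names; `max_delay_hours` stands in for the "obscure preference":

```python
import random
from datetime import datetime, timedelta

MAX_DELAY_HOURS = 3.0  # hypothetical preference; 0 restores the fixed time

def jittered_check_time(scheduled, max_delay_hours=MAX_DELAY_HOURS, rng=random):
    """Return the scheduled time plus a uniformly distributed delay."""
    delay = timedelta(hours=rng.uniform(0.0, max_delay_hours))
    return scheduled + delay
```

For example, `jittered_check_time(datetime(2016, 7, 19, 10, 0))` returns some time between 10:00 and 13:00 on that Tuesday.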
Comment 5 Markus Knauer CLA 2016-07-19 06:00:39 EDT
Many different options were discussed in bug 421779 when the automatic check was enabled (in Mars), but the pressure to implement a new search-for-update timing strategy wasn't high enough. Maybe the pressure to revisit this is now high enough?
Comment 6 David Williams CLA 2016-07-19 08:03:20 EDT
(In reply to Mikaël Barbero from comment #2)
> If you look at
> https://dev.eclipse.org/committers/webstats/download.eclipse.org/
> usage_201605.php
> (I chose May to avoid the release month and to have a full month). Tuesdays
> were the 3, 10, 17, 24 and 31. And every time we have some huge peaks. So
> yes, I think it's definitely related. 
> 
> (Note that the peaks usually span over 2 days. I suspect this is because of
> timezones).

Cool, data!
Comment 7 Mikaël Barbero CLA 2016-09-22 06:47:16 EDT
It's time to tackle this issue. I think it would be too late for Neon.1 (already in quiet week), but handling it for Neon.2 in Dec. should be our target.

I see three solutions, two easy:

- we remove this automatic check for update. 
- we change the schedule to something so that Mars.all, Neon.0 and Neon.1 users check for updates on a different day. 

and one which requires more work:

- add a new update schedule to the UI "check every week", the implementation being something close to the algorithm proposed in bug 421779 comment 14:

if(date_last_check > 1 month) {
    check_now(ask_questions_later);
}
else if(date_last_check > 1 week) {
    sleep(random(86400) seconds);
    check_now();
}
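
For illustration, the pseudocode above could be written as the following runnable Python sketch. The function name is hypothetical, "1 month" is approximated as 30 days, and the function returns how long to wait rather than sleeping itself:

```python
import random
from datetime import datetime, timedelta

ONE_WEEK = timedelta(weeks=1)
ONE_MONTH = timedelta(days=30)   # assumption: "1 month" approximated as 30 days
MAX_JITTER_SECONDS = 86400       # one day, as in the pseudocode

def delay_before_check(last_check, now, rng=random):
    """Return how long to wait before checking, or None if no check is due."""
    elapsed = now - last_check
    if elapsed > ONE_MONTH:
        return timedelta(0)      # long overdue: check right away
    if elapsed > ONE_WEEK:
        # overdue: spread the check over the next 24 hours
        return timedelta(seconds=rng.randrange(MAX_JITTER_SECONDS))
    return None                  # checked less than a week ago
```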

Thoughts?
Comment 8 Ed Merks CLA 2016-09-22 08:16:24 EDT
I'd be happy to remove it.  I don't want automatic updates...  But I'm not sure that's a reasonable alternative or a good reason for choosing it.

We can't change Mars.* because there are no updates anymore.  We're stuck with the behavior that it already has. We can only hope that most users switch to Neon and move up to Neon.* when it's available.  In general, I'd be afraid that if we just changed the days, we'd have more days with outages rather than one big outage per week.

Another alternative to consider is to have it off by default because we all know users are generally incapable of changing a preference. :-P

Randomly spreading the time seems a reasonable approach for when it is enabled.  But what happens if I restart Eclipse?  What happens if my machine sleeps/hibernates (and is it likely to start as soon as it wakes up in that case)?
Comment 9 Mikaël Barbero CLA 2016-09-23 09:53:49 EDT
(In reply to Ed Merks from comment #8)
> I'd be happy to remove it.  I don't want automatic updates...  But I'm not
> sure that's a reasonable alternative or a good reason for choosing it.
> 
> We can't change Mars.* because there are no updates anymore.  We're stuck
> with the behavior that it already has. We can only hope that most users
> switch to Neon and moving up to Neon.* when it's available.  

Yup, that's my hope too.

> In general, I'd
> be afraid if we just changed the days, we have more days with outages rather
> than one big outage per week.

Agreed. 

> Another alternative to consider is to have it off by default because we all
> know users are generally incapable of changing a preference. :-P

That was actually my initial proposition... by removing it, I meant turning it off by default (like it was before Mars.0)

> Randomly spreading the time seems a reasonable approach for when it is
> enabled.  But what happens if I restart Eclipse?  What happens if my machine
> sleeps/hibernates (and is it likely to start as soon as it wakes up in that
> case)?

These are all fair issues that we will need to solve if we choose this path. Thanks for the insights.

I would love to hear Markus's opinions as he filed bug 421779, which led to activating the weekly check for updates ;)
Comment 10 Mikaël Barbero CLA 2016-11-15 06:52:40 EST
FYI, I should be able to provide a patch by the end of the day that implements the third solution from comment 7.
Comment 11 Eclipse Genie CLA 2016-11-15 07:34:27 EST
New Gerrit change created: https://git.eclipse.org/r/85046
Comment 12 Eclipse Genie CLA 2016-11-15 07:35:17 EST
New Gerrit change created: https://git.eclipse.org/r/85047
Comment 13 Eclipse Genie CLA 2016-11-15 07:35:18 EST
New Gerrit change created: https://git.eclipse.org/r/85047
Comment 14 Eclipse Genie CLA 2016-11-15 07:35:19 EST
New Gerrit change created: https://git.eclipse.org/r/85047
Comment 15 Michael Vorburger CLA 2016-11-15 07:52:00 EST
Whoa.. folks this is a real problem! I just found this bug following complaints from end-users of https://github.com/vorburger/opendaylight-eclipse-setup hitting timeouts from eclipse.org; and I'm just seeing that it's virtually impossible for me to e.g. locally rebuild Oomph from sources right now; I'm like on my 20th re-re-re-mvn due to lots of:

Retry another mirror:
[ERROR] HTTP Server 'Service Unavailable': http://mirror.switch.ch/eclipse/eclipse/updates/4.5/R-4.5-201506032000/plugins/org.eclipse.jface.databinding_1.7.0.v20150406-2148.jar.pack.gz
[ERROR] Internal error: org.eclipse.tycho.repository.local.MirroringArtifactProvider$MirroringFailedException: Could not mirror artifact osgi.bundle,org.eclipse.jface.databinding,1.7.0.v20150406-2148 into the local Maven repository.See log output for details. HttpComponents connection error response code 503

Ed, when you told me at EclipseCon that "eclipse.org p2 are down every Tuesday" I had just assumed that you meant it figuratively and as a joke! ;-) Reading this made me ROTFL and realize you quite literally meant every Tuesday morning we're down.. I've thus taken the liberty to adjust the summary of this issue to remove the question mark, hope that's OK with everyone. 

Now re. the options for a way forward, personally I would suggest 3 steps:

1. Short term: Disable by default the auto-update check introduced in bug 421779 (BUT don't actually remove the code and preference, IMHO).  Let's do this NOW, in time for Neon.2 in Dec! I've taken the liberty to push two Gerrits suggesting this... as much to stimulate discussion as expecting that they'll actually be merged... ;-)
  FTR: Personally I actually think that having to do this is a shame, and I'm in favour of automated updates in principle, to push out the latest fixes on stable maintenance branches automatically.  But... as-is, it's de facto breaking our ecosystem, and we should fix it yesterday, not tomorrow.

2. Medium term: Come up with some sort of smart algo. for evenly distributing update requests..  IMHO this perhaps needs a bit more thought.. E.g. IMHO ideally it should somehow work well even for users who only open their Eclipse IDE for 15 minutes (which the proposals above wouldn't, right?).  How do other automatic update checkers do this?  This problem must have already been solved elsewhere... I've added Pascal Rapicault from p2 to this bug, perhaps he has some thoughts.

3. Longer term: Perhaps the Foundation could be motivated to look at using a real content delivery network (CDN, à la Akamai, CloudFlare, or Google's own etc.) instead of the current network of manual p2 mirrors, which bottom line is just unreliable IMHO - or has this already been seriously looked at in the past? Was it already rejected as a no-go, way too expensive? (Or "Not open source??" Come on, we got real work to do here.. ;-) BTW, just FYI: I've internally been told that http://downloads.jboss.org/ is backed by Akamai. I just did a quick Bugzilla search but couldn't find anything about this yet - would it be crazy to open a new bug, separate from this one, to suggest this? (Only found Bug 507519, seems vaguely related, but is specifically about non-p2 consumers; which is a different problem?)
Comment 16 Eclipse Genie CLA 2016-11-15 09:47:34 EST
New Gerrit change created: https://git.eclipse.org/r/85060
Comment 17 Eclipse Genie CLA 2016-11-15 09:47:34 EST
New Gerrit change created: https://git.eclipse.org/r/85060
Comment 18 Mikaël Barbero CLA 2016-11-15 09:50:16 EST
Patch https://git.eclipse.org/r/85060 provides a new way to schedule check for updates. Users can choose to check "once a day", "once a week" or "once a month". The initial time used for next checks is the time when the user activates the option. However, it never checks for update exactly after a day, a week or a month. It introduces some randomness:

- If the delay for checking for updates has passed, it schedules a check sometime in the next 8 hours.
- If the delay is well overpassed (see definition of "well overpassed" below), it schedules a check in the next hour.

The delay is considered "well overpassed" depending on the recurrence of the check:
- If the recurrence is "once a day", the delay is considered well overpassed after a day and 6 hours.
- If the recurrence is "once a week", the delay is considered well overpassed after a week and 2 days.
- If the recurrence is "once a month", the delay is considered well overpassed after a week and 6 days.

Of course, I'm open to suggestions for these values ;)
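
To make the rules above easier to discuss, here is a minimal Python sketch of them. It is not the code of the patch: the names are hypothetical, a month is approximated as 30 days, and the thresholds are copied from this comment as written. Note that the monthly threshold (a week and 6 days) is shorter than the monthly recurrence itself, so in this sketch every overdue monthly check lands in the one-hour window.

```python
import random
from datetime import datetime, timedelta

# Recurrence periods; 30 days for a month is an assumption of this sketch.
RECURRENCE = {
    "once a day": timedelta(days=1),
    "once a week": timedelta(weeks=1),
    "once a month": timedelta(days=30),
}
# "Well overpassed" thresholds, copied from the comment as written.
WELL_OVERPASSED = {
    "once a day": timedelta(days=1, hours=6),
    "once a week": timedelta(weeks=1, days=2),
    "once a month": timedelta(weeks=1, days=6),
}

def next_check(last_check, now, recurrence, rng=random):
    """Return when to run the next check, or None if the delay has not passed."""
    elapsed = now - last_check
    if elapsed < RECURRENCE[recurrence]:
        return None
    if elapsed >= WELL_OVERPASSED[recurrence]:
        window = timedelta(hours=1)   # well overpassed: within the next hour
    else:
        window = timedelta(hours=8)   # just passed: within the next 8 hours
    return now + rng.random() * window
```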
Comment 19 Ed Merks CLA 2016-11-22 05:00:40 EST
It's Tuesday again, the server is hosed again, and so am I. :-(

What's the outlook on getting the patch committed for Neon.2?  

More importantly, if a user did update from Neon.1 to Neon.2 with these patches available, will this change the behavior for them, or will they continue with the old pattern because the preferences are already set to use the old pattern?  

If the latter, it seems a bit pointless to push these changes into Neon.2 because most users will update to Neon.2 rather than install it fresh.
Comment 20 Mikaël Barbero CLA 2016-11-22 05:53:36 EST
I think it's already too late for Neon.2 (my bad, I should have provided this patch much earlier). 

Regarding the update thing, you're right, I don't think it will change anything for users who don't install a fresh Eclipse. The only solution for that would be to add some code (during p2.ui startup?) that checks if the check for update schedule is Tuesday/10 a.m., then changes it to once a week. But it's a bit brutal IMO.
Comment 21 Michael Vorburger CLA 2016-11-22 06:08:00 EST
> add some code (during p2.ui startup?) that checks if the check for update schedule is Tuesday/10 a.m., then changes it to once a week. But it's a bit brutal IMO.

I actually think that this is a Good Idea, and would encourage doing this for Neon.2, and not brutal at all - think of it like "data (preference) migration code". This is a very common thing to do to evolve something that's not code but that's stored somewhere between versions of products (e.g. for RDBMS there are tools like https://flywaydb.org or http://www.liquibase.org to manage a similar kind of requirement).
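
To illustrate what I mean, a preference migration of this kind can be very small. Here is a minimal Python sketch over a plain dictionary; all keys and values are hypothetical placeholders, not the real p2 preference names:

```python
# Hypothetical representation of the old default schedule.
OLD_DEFAULT = {"schedule": "on-schedule", "day": "Tuesday", "hour": "10:00 AM"}

def migrate(prefs):
    """Return a copy of prefs with the old default moved to a fuzzy weekly check.

    Any other schedule, i.e. one the user picked deliberately, is left untouched.
    """
    migrated = dict(prefs)
    if all(prefs.get(key) == value for key, value in OLD_DEFAULT.items()):
        migrated["schedule"] = "on-fuzzy-schedule"
        migrated["fuzzy_recurrence"] = "once a week"
        # The old day/hour entries are deliberately left in place.
    return migrated
```

Only the exact old default is rewritten, so somebody who explicitly chose, say, Friday at 4 PM keeps that setting.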
Comment 22 Mikaël Barbero CLA 2016-11-22 06:12:51 EST
(In reply to Michael Vorburger from comment #21)
Ok, but my patch has not been reviewed nor tested. Neon.2 RC3 is very close and the patch should be rock solid to be integrated. 

I'm willing to add this "preference migration" code quickly if I see enough interest to review/test/integrate this patch in Neon.2/RC.
Comment 23 Martin Oberhuber CLA 2016-11-22 07:11:30 EST
(In reply to Mikaël Barbero from comment #22)
> I'm willing to add this "preference migration" code quickly if I see enough
> interest to review/test/integrate this patch in Neon.2/RC.

I would support this as an Eclipse PMC member, since the issue is really severe. I can't guarantee the PMC to not veto the change if the pref migration code ends up looking too complex and risky, but I definitely support the approach.
Comment 24 Mikaël Barbero CLA 2016-11-22 11:01:16 EST
I've updated the patchset and implemented the preference migration. Note that I've also taken the opportunity to remove the fixed weekday/time of the day schedule option as it was quite confusing to have both this one and the "fuzzy" one.

Please test and review quickly if you want it to have a chance to be included in Neon.2  and thus help the Tuesdays to be less annoying :)
Comment 25 Pascal Rapicault CLA 2016-11-22 11:07:44 EST
A couple days ago, I quickly looked at the code and it seemed good. I don't see any issue in fast tracking it if the PMC agreed.

The other thing I would change is the packages since they are setting the preference in the first place.
Comment 26 Pascal Rapicault CLA 2016-11-22 11:08:44 EST
(In reply to Ed Merks from comment #19)
> It's Tuesday again, the server is hosed again, and so am I. :-(

  Could you please describe the symptoms? Is it that it is slow for you to get the artifacts? Are you failing to load the repos, etc.
Comment 27 Martin Oberhuber CLA 2016-11-22 11:23:02 EST
(In reply to Mikaël Barbero from comment #24)
In today's meeting, the Eclipse PMC agreed that we should try to fix this for Neon.2:
https://wiki.eclipse.org/Eclipse/PMC#Meeting_Minutes

So @Mikael we really appreciate your efforts :)

> I've also taken the opportunity to remove the fixed weekday/time of the day schedule
> option as it was quite confusing to have both this one and the "fuzzy" one.

Could that cause issues if a Neon.1 Eclipse opens a workspace that was opened with Neon.2 before (thus removing that particular Preference) ?
Comment 28 Mikaël Barbero CLA 2016-11-22 11:28:15 EST
(In reply to Pascal Rapicault from comment #25)
> The other thing I would change is the packages since they are setting the
> preference in the first place.

Agreed. Will submit a change to EPP.
Comment 29 Eclipse Genie CLA 2016-11-22 11:55:30 EST
New Gerrit change created: https://git.eclipse.org/r/85513
Comment 30 Mikaël Barbero CLA 2016-11-22 11:56:30 EST
https://git.eclipse.org/r/85513 changes the configuration of all EPP packages to use the new fuzzy scheduler, set to the "Once a week" value.
Comment 31 Mikaël Barbero CLA 2016-11-22 12:18:51 EST
(In reply to Martin Oberhuber from comment #27)
> > I've also taken the opportunity to remove the fixed weekday/time of the day schedule
> > option as it was quite confusing to have both this one and the "fuzzy" one.
> 
> Could that cause issues if a Neon.1 Eclipse opens a workspace that was
> opened with Neon.2 before (thus removing that particular Preference) ?

I've just modified the code so that it gracefully handles this use case. If a user rolls back to Neon.1, the scheduler will be the old one, with the previous settings.
Comment 32 Markus Knauer CLA 2016-11-22 12:36:30 EST
(In reply to Mikaël Barbero from comment #30)
> https://git.eclipse.org/r/85513 change configuration of all EPP packages to
> use the new fuzzy scheduler, set at "Once a week" value.

+1 for this change.

As soon as the required changes are available in Platform, I'll turn that into a +2. Let me know when it can be merged.
Comment 33 Ed Merks CLA 2016-11-22 14:10:04 EST
(In reply to Pascal Rapicault from comment #26)
> (In reply to Ed Merks from comment #19)
> > It's Tuesday again, the server is hosed again, and so am I. :-(
> 
>   Could you please describe the symptoms? Is it that it is slow for you to
> get the artifacts? Are you failing to load the repos, etc.

Even a HEAD request fails, after a long time, with a socket timeout.  So you can't access any metadata, let alone get as far as downloading artifacts.  The server may as well have crashed.  Tuesdays are a write-off.
Comment 34 Denis Roy CLA 2016-11-22 14:19:58 EST
Created attachment 265518 [details]
Screenshot

(In reply to Ed Merks from comment #33)
> Even a HEAD request fails, after a long time, a socket timeout.  So you
> can't access any metadata, let alone get as far as downloading artifacts. 
> The server may as well have crashed.  Tuesdays are write-off.

Oddly, I don't share your observations. From Portland, OR we monitor a GET request to download.e.o every 10 minutes (gray line in chart) and although I can see a few spikes, for the most part the request succeeds.

Regardless, thanks Mikael for the patch. Spreading out the load more uniformly over the day will be beneficial (and will likely be easier to scale).
Comment 35 Ed Merks CLA 2016-11-22 14:56:33 EST
(In reply to Denis Roy from comment #34)
> Created attachment 265518 [details]
> Screenshot
> 
> (In reply to Ed Merks from comment #33)
> > Even a HEAD request fails, after a long time, a socket timeout.  So you
> > can't access any metadata, let alone get as far as downloading artifacts. 
> > The server may as well have crashed.  Tuesdays are write-off.
> 
> Oddly, I don't share your observations. From Portland, OR we monitor a GET
> request to download.e.o every 10 minutes (gray line in chart) and although I
> can see a few spikes, for the most part the request succeeds.
> 
> Regardless, thanks Mikael for patch. Spreading out the load more uniformly
> over the day will be beneficial (and will likely be easier to scale).

Denis,

I would suggest you come to Berlin and measure what we see here at 10:00AM every Tuesday.  I'll take screen captures of the numbers shown by https://dev.eclipse.org/committers/help/status.php at that time.  In the left column we see numbers like 12.00+ and apparently that means:

10.00 and up: general server overload. It could be just a spike due to replication or disk activity, because this isn't normal.

In the right column, download1 goes red.  But according to your graph, all is good.  Unfortunately nothing works here Denis, so I *strongly* resent your "Oddly" prefixed comments that suggest that perhaps there really is no problem at all, just some personal melodrama. I simply cannot work on Tuesday until late in the afternoon on anything that involves access to update sites.  I can't do a local Tycho build so I cannot do any end-to-end testing of Oomph.  It's all completely hosed.  There's no big drama involved here.  The logs show it all:


!ENTRY org.eclipse.equinox.p2.transport.ecf 2 1002 2016-11-22 10:29:00.858
!MESSAGE Unable to connect to repository http://download.eclipse.org/technology/nebula/snapshot/content.jar
!STACK 0
java.net.ConnectException: Connection timed out: connect
	at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
	at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
	at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
	at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
	at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
	at java.net.PlainSocketImpl.connect(Unknown Source)
	at java.net.SocksSocketImpl.connect(Unknown Source)
	at java.net.Socket.connect(Unknown Source)
	at org.eclipse.ecf.internal.provider.filetransfer.httpclient4.ECFHttpClientProtocolSocketFactory.connectSocket(ECFHttpClientProtocolSocketFactory.java:86)
	at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:177)
	at org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:144)
	at org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:131)
	at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:674)
	at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:487)
	at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
	at org.eclipse.ecf.provider.filetransfer.httpclient4.HttpClientRetrieveFileTransfer.performConnect(HttpClientRetrieveFileTransfer.java:1084)
	at org.eclipse.ecf.provider.filetransfer.httpclient4.HttpClientRetrieveFileTransfer.access$0(HttpClientRetrieveFileTransfer.java:1075)
	at org.eclipse.ecf.provider.filetransfer.httpclient4.HttpClientRetrieveFileTransfer$1.performFileTransfer(HttpClientRetrieveFileTransfer.java:1071)
	at org.eclipse.ecf.filetransfer.FileTransferJob.run(FileTransferJob.java:74)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:55)


!ENTRY org.eclipse.equinox.p2.core 4 0 2016-11-22 10:29:01.951
!MESSAGE Provisioning exception
!STACK 1
org.eclipse.equinox.p2.core.ProvisionException: Unable to read repository at http://download.eclipse.org/modeling/emf/emf/updates/2.12milestones/core/.
	at org.eclipse.equinox.internal.p2.metadata.repository.CompositeMetadataRepository.addChild(CompositeMetadataRepository.java:185)
	at org.eclipse.equinox.internal.p2.metadata.repository.CompositeMetadataRepository.<init>(CompositeMetadataRepository.java:106)
	at org.eclipse.equinox.internal.p2.metadata.repository.CompositeMetadataRepositoryFactory.load(CompositeMetadataRepositoryFactory.java:122)
	at org.eclipse.equinox.internal.p2.metadata.repository.MetadataRepositoryManager.factoryLoad(MetadataRepositoryManager.java:57)
	at org.eclipse.equinox.internal.p2.repository.helpers.AbstractRepositoryManager.loadRepository(AbstractRepositoryManager.java:768)
	at sun.reflect.GeneratedMethodAccessor57.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.lang.reflect.Method.invoke(Unknown Source)
	at org.eclipse.oomph.util.ReflectUtil.invokeMethod(ReflectUtil.java:117) 

!SUBENTRY 2 org.eclipse.equinox.p2.transport.ecf 4 1002 2016-11-22 10:36:34.551
!MESSAGE Unable to read repository at http://download.eclipse.org/modeling/emf/emf/updates/2.12milestones/core/S201603210508/content.xml.
!STACK 0
java.net.SocketTimeoutException: Read timed out
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.socketRead(Unknown Source)
	at java.net.SocketInputStream.read(Unknown Source)
	at java.net.SocketInputStream.read(Unknown Source)
	at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160)
	at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84)
	at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273)
	at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
	at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
	at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260)
	at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
	at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251)
	at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197)
	at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271)
	at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
	at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:685)
	at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:487)
	at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
	at org.eclipse.ecf.provider.filetransfer.httpclient4.HttpClientFileSystemBrowser.runRequest(HttpClientFileSystemBrowser.java:263)
	at org.eclipse.ecf.provider.filetransfer.browse.AbstractFileSystemBrowser$DirectoryJob.run(AbstractFileSystemBrowser.java:69)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:55)

What theory do you have for why your graph looks so lovely in Oregon at exactly the same time that my logs fill up with socket timeouts?  That's definitely odd.
Comment 36 Denis Roy CLA 2016-11-22 15:58:05 EST
> Denis,
> 
> I would suggest you come to Berlin and measure what we see here at 10:00AM
> every Tuesday.  

> your "Oddly" prefixed comments that suggest that perhaps there really is no
> problem at all, just some personal melodrama.


Ed,

I apologize for the miscommunication. I witnessed it first hand at ECE on Tuesday morning at 10:00am local time, and it was brutal.

My "oddly" comment applies to the fact my graphs don't capture that, and the implication that Tuesdays as a whole are a write-off.  But I'm not running builds.
Comment 37 Pascal Rapicault CLA 2016-11-22 16:05:00 EST
(In reply to Mikaël Barbero from comment #24)
> I've updated the patchset and implemented the preference migration. Note
> that I've also taken the opportunity to remove the fixed weekday/time of the
> day schedule option as it was quite confusing to have both this one and the
> "fuzzy" one.
   As mentioned in the patch I'm not at ease with this change since it removes a feature in a .2 release. If the PMC is ok with this then I'm fine.
Comment 38 Martin Oberhuber CLA 2016-11-23 03:40:49 EST
Created attachment 265528 [details]
Screenshot - UpdatePrefsOrig.png
Comment 39 Martin Oberhuber CLA 2016-11-23 03:41:47 EST
Created attachment 265529 [details]
Screenshot - UpdatePrefsMikael.png
Comment 40 Mikaël Barbero CLA 2016-11-23 03:51:53 EST
(In reply to Pascal Rapicault from comment #37)
>    As mentioned in the patch I'm not at ease with this change since it
> removes a feature in a .2 release. If the PMC is ok with this then I'm fine.

As I've implemented automatic migration from "on-schedule" to "on-fuzzy-schedule" preference, it means that if we don't remove the "on-schedule" UI, then the user will be able to select it, but it will be migrated every time he restarts Eclipse; it's quite an undesired behavior. 

An alternative is to implement a one-shot-only migration by storing yet another preference that knows if we've already migrated to the "on-fuzzy-schedule" pref. We won't redo the migration if this "migrated" pref is set. I'm not really sure it's better. 
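
For comparison, the one-shot variant would look roughly like the following Python sketch. The "schedule" and "migrated" keys are hypothetical placeholders, not the actual preference names in the patch:

```python
def migrate_once(prefs):
    """Migrate "on-schedule" to "on-fuzzy-schedule" at most once per preference store."""
    if prefs.get("migrated"):
        return prefs                      # already migrated: respect the user's choice
    if prefs.get("schedule") == "on-schedule":
        prefs["schedule"] = "on-fuzzy-schedule"
    prefs["migrated"] = True              # remember that the migration has run
    return prefs
```

With the flag in place, a user who switches back to "on-schedule" after the first start keeps that choice on the next restart. The cost is one more preference to maintain.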

Anyway, I will do whatever the PMC decides for Neon.2/Neon.3. But for Oxygen, I think we should really get rid of this "on-schedule" option.
Comment 41 Martin Oberhuber CLA 2016-11-23 03:59:43 EST
(In reply to Pascal Rapicault from comment #37)
>    As mentioned in the patch I'm not at ease with this change since it
> removes a feature in a .2 release. If the PMC is ok with this then I'm fine.

Dani pointed out that Equinox/P2 is actually governed by the RT PMC:
https://www.eclipse.org/rt/team-leaders.php

So I'll express my own opinion here, but a decision needs to be made by the RT PMC:

- I agree with Pascal that a .2 release shouldn't remove UI
- I would propose just keeping the existing UI, and adding the 3 new values to the
  left-hand "schedule" combo: ("Once a Day, Once a Week, Once a Month"). If any of 
  those is chosen, the right-hand "time" combo would be disabled.

I think the key value of this change is the new "fuzzy" functionality, and we really want to have it selected by default in the Neon.2 packages. So having the new option somewhat hidden is not a very big concern - I don't see the value of renaming the
combo with the new radiobutton. It's more important IMHO not breaking existing users.
I've attached screenshots to aid in this discussion.

Regarding the Preference Migration, I suggest *only* migrating the offending previous default value "Tuesday 10:00 am" and not migrating any other values. Or perhaps not even migrate anything, since statistics have shown that way more people "install fresh" anyways than updating from Neon.1 to Neon.2.

For those who haven't followed along, Mikael's patch that we are discussing is here:
https://git.eclipse.org/r/#/c/85060/
Comment 42 Martin Oberhuber CLA 2016-11-23 04:06:39 EST
(In reply to Mikaël Barbero from comment #40)
> Anyway, I will do whatever the PMC decide for Neon.2/Neon.3. But for Oxygen,
> I think we should really get rid of this "on-schedule" option.

+1 for simplifying updates for Oxygen. For end-users, most software just offers two options, "update manually" or "update automatically", and doesn't actually expose any schedule at all. So maybe just adding a single additional option to the combo is sufficient, keeping the "fuzzy schedule" as a non-UI plugin_customization.ini Preference for product builders only.

For Neon, my own opinion is that we need to keep existing functionality, and have the migration as non-intrusive as possible: one-shot migration of the offending "Tuesday 10:00 am" only, or not migrate at all (and trust in fresh downloads of Neon.2 instead).
Comment 43 Mikaël Barbero CLA 2016-11-23 04:11:38 EST
Thanks Martin. I'm adding Thomas and Christian to have their opinions. I will do the changes they request to integrate such a patch into Neon.2
Comment 44 Christian Campo CLA 2016-11-23 05:23:22 EST
I think we should wait what Tom has to say, since he is very much more involved in Equinox and p2 than myself.

HOWEVER I believe that we are too religious here on "a service release should not remove UI". I believe in KISS (keep it simple). If we believe the right direction (and the best approach) in the long run (Oxygen) is to only offer "manual" and "automatic" (with a fuzzy schedule), then I would also go for that in a Neon.2 release.

After all many other software distributions (as already pointed out) follow the same approach.

I feel uncomfortable about introducing a new option (hourly, weekly, monthly) that we then drop 6 months later.

But then this is not the PMC position but just my take. Let's see what Tom has to say.
Comment 45 Mikaël Barbero CLA 2016-11-23 06:14:48 EST
(In reply to Christian Campo from comment #44)
> I feel uncomfortable about introducing a new option (hourly, weekly,
> monthly) that we then drop 6 months later.

This new option is here to stay. The question is whether we delete the old option (fixed weekday / time) in Neon.2 or Oxygen, and whether we migrate the old preferences to the new option.
Comment 46 Eclipse Genie CLA 2016-11-23 10:51:56 EST
New Gerrit change created: https://git.eclipse.org/r/85604
Comment 47 Denis Roy CLA 2016-11-23 10:56:44 EST
Having the option to check once a day may be a bit excessive, no?  Even if fuzzy?  What's the default?
Comment 48 Mikaël Barbero CLA 2016-11-23 11:02:12 EST
(In reply to Denis Roy from comment #47)
> Having the option to check once a day may be a bit excessive, no?  Even if
> fuzzy?  What's the default?

It is already an option, at a fixed time of the day.
Comment 49 Thomas Watson CLA 2016-11-23 11:43:11 EST
Pascal and I discussed this.  We agree that it makes the most sense to go with the new UI for automatic updates at this time.

What needs to be tested is what happens when you have an existing installation of Neon.1 updating every Tuesday at 10:00 AM and you update this to Neon.2.  My assumption is that it becomes a random once-a-week check.
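
To illustrate what a "random once a week check" could mean, here is a minimal sketch that picks a uniformly random instant within a seven-day window. It is an assumption about the general technique, not the actual p2 scheduler code; the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Random;

/** Illustrative only: hypothetical names, not the real p2 scheduler. */
final class FuzzyWeeklySchedule {
    static final Duration PERIOD = Duration.ofDays(7);

    /**
     * Returns an instant in [windowStart, windowStart + 7 days), chosen uniformly,
     * so that installations do not all contact the server at the same moment.
     */
    static Instant nextCheck(Instant windowStart, Random random) {
        // nextDouble() is in [0.0, 1.0), so the offset is always below the period.
        long offsetSeconds = (long) (random.nextDouble() * PERIOD.getSeconds());
        return windowStart.plusSeconds(offsetSeconds);
    }
}
```

With each installation drawing its own offset, the requests that used to arrive together on Tuesday at 10:00 AM would be spread over the whole week.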
Comment 50 Pascal Rapicault CLA 2016-11-23 13:32:02 EST
I've pushed another round of changes to complete the removal of the old preference:
- Remove externalized strings that are no longer necessary;
- Remove the initialization of constants for the UI code that got removed, and tweak the migration logic because of this;
- Use the previous wording in the preference UI (a suggestion made by Tom)

Mikael, please review again and let me know what you think.
Comment 51 Mikaël Barbero CLA 2016-11-24 01:28:52 EST
LGTM. Thanks for the changes.
Comment 53 Eclipse Genie CLA 2016-11-24 10:11:24 EST
New Gerrit change created: https://git.eclipse.org/r/85697
Comment 54 Eclipse Genie CLA 2016-11-24 11:16:49 EST
Gerrit change https://git.eclipse.org/r/85697 was merged to [R4_6_maintenance].
Commit: http://git.eclipse.org/c/equinox/rt.equinox.p2.git/commit/?id=f7ad9acaf7f9be698569972c3cb396cf351102ef
Comment 55 Denis Roy CLA 2016-11-28 10:41:01 EST
I've added a rule to our bandwidth throttler to simply remove our bandwidth limit on Tuesday mornings from 10:00am to 10:20am Central European Time.

That should help cope with the load.
Comment 57 Eclipse Genie CLA 2016-11-29 14:58:06 EST
New Gerrit change created: https://git.eclipse.org/r/85978
Comment 59 Denis Roy CLA 2016-11-30 10:22:49 EST
(In reply to Denis Roy from comment #55)
> I've added a rule to our bandwidth throttler to simply remove our bandwidth
> limit on Tuesday mornings from 10:00am to 10:20am Central European Time.
> 
> That should help cope with the load.

Ed, and others, I'm curious to know if the above change made a difference yesterday morning?
Comment 60 Ed Merks CLA 2016-12-01 02:47:26 EST
Denis,

So sorry, I should have checked on Tuesday at that time.  I know things went haywire again for a short time on Wednesday at 10:00AM, but within 20 minutes or so it was all good again for the rest of the day...
Comment 61 Abel Hegedus CLA 2016-12-01 03:32:03 EST
(In reply to Denis Roy from comment #59)
> Ed, and others, I'm curious to know if the above change made a difference
> yesterday morning?

I was trying to run a local Maven build at 10:03 AM and it failed with connection problems. I was also following the stats on https://dev.eclipse.org/committers/help/status.php

The change might have made a difference, as after about 30 minutes, things got back to normal as far as I could see.
Comment 62 Denis Roy CLA 2016-12-01 11:27:07 EST
Thanks for the follow-up. Opening the bandwidth isn't the solution but will help until the patch is out there.
Comment 63 Mickael Istria CLA 2017-03-21 08:21:06 EDT
The issue is still here and download.eclipse.org is slow and unusable on this Tuesday morning. That makes me wonder: are we sure that people who update from Neon.0/1 to Neon.2 have the property updated, and that their update checks are now spread fuzzily across the week? Can we already perceive an impact of those fixes?
If the fix works but adoption is too slow, shouldn't we consider mailing all community members to ask them to update to Neon.2/3 ASAP (but not on Tuesday morning), to definitively tackle the botnet we created and the near-DDoS it causes weekly?
Comment 64 Denis Roy CLA 2017-03-21 08:38:51 EDT
Mickael, that was a coincidence. We were having a DNS issue overnight, and I've just resolved that. download.e.o should be much faster now.
Comment 66 Mickael Istria CLA 2017-03-21 10:26:00 EDT
(In reply to Marc-André Laperle from comment #65)
> At this very moment, if I try
> http://download.eclipse.org/tracecompass/releases/2.2.0/rcp/trace-compass-2.2.0-20161221-1532-macosx.cocoa.x86_64.tar.gz
> It downloads at around 30KB/s
> whereas
> http://mirror.csclub.uwaterloo.ca/eclipse/tracecompass/releases/2.2.0/rcp/trace-compass-2.2.0-20161221-1532-macosx.cocoa.x86_64.tar.gz
> downloads at around 8MB/s.

Same here. As a result, more p2 sites are too slow to be perceived as working.
Comment 67 Denis Roy CLA 2017-03-21 11:32:59 EDT
> It downloads at around 30KB/s

> downloads at around 8MB/s.

To keep costs low, we don't have the same bandwidth allocation as our mirrors, which is why we try to use our mirrors as much as we can.

I am working on a transparent mirror system that will make download.e.o fast and efficient again.  Or, for the first time ever.
Comment 68 David Williams CLA 2017-07-18 11:40:56 EDT
I believe this issue could be marked fixed, right? 
I think EPP packages changed (or reverted) the default update time and I have not heard any complaints about it for a while. 

Please re-open if my skim reading missed something important that is yet to be done.