Re: [p2-dev] Brittle downloads

The retry count is set to one extra attempt in M7, but in RC1 the extra attempt on timeout is removed because it only prolongs the wait before another, possibly better, mirror is selected.
The retry delay is 200 ms. (Both values come from org.eclipse.equinox.internal.p2.repository.RepositoryPreferences. There is a "todo" there to allow these values to be set via properties, which I think would be very useful in some situations.)
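
To make the meaning of "one extra attempt" with a 200 ms delay concrete, here is a minimal sketch of a retry loop. It is not the actual FileReader code; the class and method names (RetryLoopSketch, callWithRetry) are made up for illustration, and only the values 1 and 200 come from the description above.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of p2: runs an attempt once, then up to
// retryCount extra times, sleeping retryDelayMs before each extra attempt.
final class RetryLoopSketch {

    static <T> T callWithRetry(Callable<T> attempt, int retryCount, long retryDelayMs)
            throws Exception {
        // A negative retry count is treated as "no extra attempts".
        int maxTries = Math.max(0, retryCount) + 1;
        IOException last = null;
        for (int tries = 0; tries < maxTries; tries++) {
            if (tries > 0) {
                Thread.sleep(retryDelayMs); // wait before each extra attempt
            }
            try {
                return attempt.call();
            } catch (IOException e) {
                last = e; // remember the failure; retry if attempts remain
            }
        }
        throw last; // all attempts failed with an IOException
    }
}
```

With the M7 values this would be called as callWithRetry(download, 1, 200L), so a transfer that fails twice in a row is reported as failed after roughly 200 ms of added delay. Only IOException is retried in this sketch; any other exception propagates immediately.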

I don't know the best solution here, but I think the retry logic should be moved to a higher layer: there is an overall workload to get done, and too many retries with delays block other things from being downloaded. I also think the download should be given constraints on how long it is allowed to keep trying. You could hit a very slow connection and keep reading at, say, one block every minute; with a time limit, the download could give up and try a different mirror.
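
As a rough sketch of what such a constraint could look like (nothing like this exists in p2 as far as this thread shows; the class name, fields, and thresholds are all assumptions), a higher layer could hand each download a budget and poll it while reading:

```java
// Hypothetical budget for a single download: a hard time limit plus a
// minimum average transfer rate that is enforced after a grace period.
final class TransferBudgetSketch {
    private final long maxTotalMs;        // give up after this much time, whatever the rate
    private final long minBytesPerSecond; // give up if the average rate stays below this
    private final long graceMs;           // do not judge the rate before this much time

    TransferBudgetSketch(long maxTotalMs, long minBytesPerSecond, long graceMs) {
        this.maxTotalMs = maxTotalMs;
        this.minBytesPerSecond = minBytesPerSecond;
        this.graceMs = graceMs;
    }

    // Returns true when the caller should abandon this transfer,
    // for example to try a different mirror.
    boolean shouldGiveUp(long elapsedMs, long bytesRead) {
        if (elapsedMs >= maxTotalMs) {
            return true; // overall time limit exceeded
        }
        if (elapsedMs <= 0 || elapsedMs < graceMs) {
            return false; // too early to judge the transfer rate
        }
        long averageBytesPerSecond = bytesRead * 1000L / elapsedMs;
        return averageBytesPerSecond < minBytesPerSecond;
    }
}
```

With a budget of new TransferBudgetSketch(600000, 1024, 5000), a connection that has delivered one 8 KB block after a minute averages about 136 bytes per second and would be abandoned, while a headless build could simply be given a much larger budget.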

The problem as I see it is that the file transfer does not know enough about what it should do. In some cases you want it to keep retrying much longer (say in a headless build), sometimes there are better mirrors, sometimes there is only one.

However, the general improvement is too big to make it into 3.5 IMO.

So, suggested fixes are:
- make retry and retry delay be configurable values
- retry also on timeout
- perhaps handle connect timeout differently than read timeout (i.e. connect=select another mirror, read=try a bit longer)
- just change the default to 3, and also retry on all timeouts :) (which is not unreasonable when cancel on hung socket actually works).
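
To illustrate how the first three suggestions might fit together, here is a small sketch. The property names (example.p2.retry.count, example.p2.retry.delayMs), the FailureKind and Action enums, and the class itself are hypothetical and not part of RepositoryPreferences; the defaults of 3 and 200 ms are taken from this thread.

```java
// Hypothetical policy: configurable retry count and delay, with connect
// timeouts and read timeouts handled differently.
final class RetryPolicySketch {
    enum FailureKind { CONNECT_TIMEOUT, READ_TIMEOUT, OTHER }
    enum Action { RETRY, SWITCH_MIRROR, FAIL }

    final int retryCount;
    final long retryDelayMs;

    RetryPolicySketch(int retryCount, long retryDelayMs) {
        this.retryCount = retryCount;
        this.retryDelayMs = retryDelayMs;
    }

    // Reads the values from system properties (assumed names), falling back
    // to a retry count of 3 and a delay of 200 ms.
    static RetryPolicySketch fromSystemProperties() {
        int count = Integer.getInteger("example.p2.retry.count", 3);
        long delay = Long.getLong("example.p2.retry.delayMs", 200L);
        return new RetryPolicySketch(count, delay);
    }

    // retriesUsed is the number of extra attempts already made.
    Action decide(FailureKind kind, int retriesUsed) {
        boolean retriesLeft = retriesUsed < retryCount;
        switch (kind) {
            case CONNECT_TIMEOUT:
                return Action.SWITCH_MIRROR; // connect = select another mirror
            case READ_TIMEOUT:
                // read = try a bit longer, then move on to another mirror
                return retriesLeft ? Action.RETRY : Action.SWITCH_MIRROR;
            default:
                return retriesLeft ? Action.RETRY : Action.FAIL;
        }
    }
}
```

A caller with only one mirror available would have to treat SWITCH_MIRROR as a failure or as another retry, which is part of why the decision probably belongs in a layer above the file transfer.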

I think we should discuss this in a bugzilla.

Henrik Lindberg

On May 14, 2009, at 1:16 PM, Thomas Hallgren wrote:

Henrik Lindberg wrote:
I worked on timeout for connect recently, and I know that connect has a timeout of 120 seconds. Without actually testing read timeout, I know that it is set up to be 120 seconds as well. This was done because of the discussion in https://bugs.eclipse.org/bugs/show_bug.cgi?id=266246 .
(This is when using httpClient).

I don't think there are any retries.
There is a retry mechanism in the FileReader. It's initialized in the constructor.

connectionRetryCount = RepositoryPreferences.getConnectionRetryCount();
connectionRetryDelay = RepositoryPreferences.getConnectionMsRetryDelay();

What are those values set to by default? The Buckminster predecessor of the FileReader uses a retry count of 3 and it works reasonably well.

Don't know if the problem is because you are hitting a bad mirror, but it does not look like it because the exception message is showing the full URL to the artifact (as opposed to just the repository URL) - but I am not 100% sure.

No, I don't think it's a bad mirror either. It's just good old build.eclipse.org being very busy or something.

- thomas