Re: [ecf-dev] E-intro [Was Efficient downloads]


From: Scott Lewis <slewis@xxxxxxxxxxxxx>
Date: Wed, 30 May 2007 10:27:51 -0700
Delivered-to: ecf-dev@xxxxxxxxxxx

Hi Filip,

Filip Hrbek wrote:

Hi Scott, comments inside.
- resume from a different location (e.g. different mirror)
Hmm. Don't know how you are going to accomplish that without something quite different from normal http, but sounds interesting.
Not sure for what protocols we are able to implement. To do this, we must be able to start downloading at a particular offset and finally check the file consistency, e.g. using a digest file if available. We also have to have a list of mirrors containing the same artifact (let's assume we've obtained it somewhere). This should be possible with http.
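
As an illustration of the two building blocks mentioned above (starting at an offset, then checking consistency against a digest), here is a minimal sketch using plain java.net rather than the ECF filetransfer API. The class and method names (ResumeSketch, rangeFrom, openAt, hexDigest, matches) are hypothetical, and it assumes the mirror honors HTTP Range requests and that the expected digest is available as a hex string.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of ECF: resume over HTTP and verify the result.
final class ResumeSketch {

    /** Range header value asking for everything from offset to the end of the file. */
    static String rangeFrom(long offset) {
        if (offset < 0) {
            throw new IllegalArgumentException("offset must be >= 0");
        }
        return "bytes=" + offset + "-";
    }

    /** Opens the artifact on the given mirror, starting at the given byte offset. */
    static InputStream openAt(URL mirror, long offset) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) mirror.openConnection();
        conn.setRequestProperty("Range", rangeFrom(offset));
        int status = conn.getResponseCode();
        // 206 Partial Content means the server accepted the offset.
        if (status != HttpURLConnection.HTTP_PARTIAL) {
            conn.disconnect();
            throw new IOException("Mirror did not honor the range request, status " + status);
        }
        return conn.getInputStream();
    }

    /** Lower-case hex digest of the content, e.g. with algorithm "SHA-1" or "MD5". */
    static String hexDigest(byte[] content, String algorithm) {
        try {
            byte[] digest = MessageDigest.getInstance(algorithm).digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown digest algorithm: " + algorithm, e);
        }
    }

    /** True if the assembled content matches the digest published next to the artifact. */
    static boolean matches(byte[] content, String algorithm, String expectedHex) {
        return hexDigest(content, algorithm).equalsIgnoreCase(expectedHex.trim());
    }
}
```

A caller would append the bytes read from openAt to the partial file fetched from the previous mirror, then run matches over the complete content before accepting it. A protocol without offset support would have to restart from zero instead.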

There could be API supporting this feature.

This is what I would like to understand, as if additional API is *required* I would like to get that API (probably implemented as an adapter) into the ECF filetransfer API prior to the implementation.

Protocols which wouldn't support this would either make a workaround, or throw an exception.

The approach we've generally been using to allow runtime access to optional/additional features is IAdaptable:

ISomeInterface adapter = (ISomeInterface) someAdaptable.getAdapter(ISomeInterface.class);

if (adapter == null) {
    // optional feature not supported
} else {
    // optional feature is supported...use it!
    adapter.<whatever>
}

This makes it possible to introduce new API (ISomeInterface) in a plugin separate from the filetransfer API, or in the same plugin. It's also quite handy in combination with the IAdapterManager OSGi service/extension point, which lets new plugins set themselves up as implementers of a given interface declaratively. In any event, we don't have to use this mechanism to introduce new API, but we can if necessary/desired, and it will have minimal impact on existing API.
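
To make the provider side of this pattern concrete, here is a small self-contained sketch. Everything in it is illustrative: Adaptable is a local stand-in for org.eclipse.core.runtime.IAdaptable so the example compiles without Eclipse on the classpath, and IResumableTransfer, HttpTransfer, and PlainTransfer are hypothetical names, not existing ECF types.

```java
// Illustrative only: a stand-in for the IAdaptable pattern, not ECF code.
final class AdapterSketch {

    /** Local stand-in for org.eclipse.core.runtime.IAdaptable. */
    interface Adaptable {
        Object getAdapter(Class<?> adapter);
    }

    /** Hypothetical optional feature: resuming a transfer at a byte offset. */
    interface IResumableTransfer {
        void resumeFrom(long offset);
    }

    /** A transfer that supports resuming and advertises it through getAdapter. */
    static final class HttpTransfer implements Adaptable, IResumableTransfer {
        long startOffset;

        public void resumeFrom(long offset) {
            startOffset = offset;
        }

        public Object getAdapter(Class<?> adapter) {
            // Hand out this object for any interface it actually implements.
            return adapter.isInstance(this) ? this : null;
        }
    }

    /** A transfer for a protocol without offsets: it offers no adapter at all. */
    static final class PlainTransfer implements Adaptable {
        public Object getAdapter(Class<?> adapter) {
            return null;
        }
    }

    /** Client side: use the optional feature if present, report false otherwise. */
    static boolean tryResume(Adaptable transfer, long offset) {
        IResumableTransfer resumable =
                (IResumableTransfer) transfer.getAdapter(IResumableTransfer.class);
        if (resumable == null) {
            return false; // optional feature not supported
        }
        resumable.resumeFrom(offset);
        return true;
    }
}
```

The point of the design is that PlainTransfer needs no knowledge of IResumableTransfer at all, so the new interface can be added without touching existing providers.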

- retrieving information from special headers (like Content-Disposition)
- detecting URL redirections to final mirrors
I'm not sure what you are going to use to implement this, but would be curious to find out.
If you download a file from a URL, you have to discover the filename if the user doesn't specify it explicitly. The most precise solution is parsing the Content-Disposition header if it's available (browsers use it for determining the name of the file to save). Unlike other http headers, Content-Disposition has a very complex syntax. We should be able to parse it properly.
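
As a rough idea of what the simplest case of such parsing looks like, here is a sketch that extracts the filename parameter from a Content-Disposition value. It is deliberately incomplete: it handles only the quoted-string and bare-token forms of filename, and ignores the extended filename*= form and other parameters. The class name ContentDispositionSketch and its method are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: a partial Content-Disposition parser, not a full implementation.
final class ContentDispositionSketch {

    // Matches filename="quoted value" (group 1) or filename=token (group 2).
    private static final Pattern FILENAME = Pattern.compile(
            "(?i)(?:^|;)\\s*filename\\s*=\\s*(?:\"((?:[^\"\\\\]|\\\\.)*)\"|([^;\\s]+))");

    /** Returns the suggested file name, or empty if the header carries none. */
    static Optional<String> filenameFrom(String headerValue) {
        if (headerValue == null) {
            return Optional.empty();
        }
        Matcher m = FILENAME.matcher(headerValue);
        if (!m.find()) {
            return Optional.empty();
        }
        // Undo backslash escapes inside a quoted string; tokens are taken as-is.
        String raw = m.group(1) != null
                ? m.group(1).replaceAll("\\\\(.)", "$1")
                : m.group(2);
        // Keep only the last path segment so a server cannot suggest a directory.
        int cut = Math.max(raw.lastIndexOf('/'), raw.lastIndexOf('\\'));
        String name = raw.substring(cut + 1);
        return name.isEmpty() ? Optional.empty() : Optional.of(name);
    }
}
```

When the result is empty, a downloader would typically fall back to the last path segment of the (final) URL.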

OK. Do all http x.y servers support Content-Disposition? Could you also point to the spec for it (w3c?) just for my information? And do you know if Apache httpclient 3.0.1 implements the parsing of Content-Disposition? If so, then perhaps the existing org.eclipse.ecf.provider.filetransfer.httpclient could simply be modified.

Detecting URL redirections would help us in statistics collection. It would be wrong to assign statistics belonging to different mirrors to one URL covering all the mirrors. This is why we should detect that reading from the covering URL points to different mirrors on different retrieval attempts. Finally we could automatically deprecate using some of the black-listed mirrors to avoid speed or timeout problems.
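
One way to observe which mirror a covering URL resolves to is to switch off automatic redirect handling and follow the Location headers by hand. The sketch below does this with plain java.net; RedirectTraceSketch, httpLocation, and trace are hypothetical names, and the lookup is passed in as a function so the hop-following logic can be exercised without a network.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper: records every hop from a covering URL to the final mirror.
final class RedirectTraceSketch {

    /** Returns the Location header if the URI answers with a 3xx status, else null. */
    static String httpLocation(URI uri) {
        try {
            HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
            conn.setInstanceFollowRedirects(false); // we want to see each hop ourselves
            conn.setRequestMethod("HEAD");
            try {
                int status = conn.getResponseCode();
                return (status >= 300 && status < 400) ? conn.getHeaderField("Location") : null;
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Follows redirects from start, returning all visited URIs in order. */
    static List<URI> trace(URI start, Function<URI, String> locationOf, int maxHops) {
        List<URI> hops = new ArrayList<>();
        URI current = start;
        hops.add(current);
        for (int i = 0; i < maxHops; i++) {
            String location = locationOf.apply(current);
            if (location == null) {
                break; // no further redirect: current is the final mirror
            }
            current = current.resolve(location); // Location may be relative
            hops.add(current);
        }
        return hops;
    }
}
```

In real use the lookup would be RedirectTraceSketch::httpLocation, and statistics would be recorded against the host of the last URI in the list rather than against the covering URL.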

OK, this does sound like new API/interfaces for collecting these statistics.

I think you would need to describe what statistics are desired here. We can easily add adapter interfaces for collecting statistics associated with a given file retrieval/all to ecf or individual providers, but would need to know what stats are of interest.
The most interesting statistics:
- average download speed (related to concrete mirrors, geographical provider/consumer location, time of day etc.)
- amount of bytes downloaded from particular location / during particular time period
- frequency of timeouts including timeout values
- etc.
We could share the statistics among users in an application by storing them on a server (the downloader would send the statistics to the server automatically). This would prevent users from attempting to access corrupted/slow repositories.
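
To give the statistics listed above a concrete shape, here is a minimal in-memory sketch that tracks bytes, elapsed time, and timeouts per mirror. It is only an assumption about what such a collector could look like: MirrorStatsSketch and its methods are hypothetical, it keys on the mirror host name only, and it leaves out geographic location, time-of-day buckets, and the upload to a shared server.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical collector: per-mirror download statistics kept in memory.
final class MirrorStatsSketch {

    /** Counters for one mirror. */
    static final class Stats {
        long bytes;     // total bytes downloaded from this mirror
        long millis;    // total time spent in successful transfers
        int attempts;   // successful transfers plus timeouts
        int timeouts;   // attempts that ended in a timeout

        double averageBytesPerSecond() {
            return millis == 0 ? 0.0 : bytes * 1000.0 / millis;
        }

        double timeoutRate() {
            return attempts == 0 ? 0.0 : (double) timeouts / attempts;
        }
    }

    private final Map<String, Stats> byMirror = new HashMap<>();

    /** Records a completed transfer (or transfer segment) from the given mirror. */
    synchronized void recordTransfer(String mirrorHost, long bytes, long elapsedMillis) {
        Stats s = byMirror.computeIfAbsent(mirrorHost, h -> new Stats());
        s.bytes += bytes;
        s.millis += elapsedMillis;
        s.attempts++;
    }

    /** Records an attempt against the given mirror that timed out. */
    synchronized void recordTimeout(String mirrorHost) {
        Stats s = byMirror.computeIfAbsent(mirrorHost, h -> new Stats());
        s.attempts++;
        s.timeouts++;
    }

    /** Returns the counters for a mirror, or empty counters if it was never used. */
    synchronized Stats statsFor(String mirrorHost) {
        return byMirror.getOrDefault(mirrorHost, new Stats());
    }
}
```

A mirror selection policy could then skip hosts whose timeoutRate exceeds some threshold or whose average speed falls well below that of the other mirrors.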

OK. Remy may want to comment on the overlap of these statistics with bittorrent (have you looked at bt as a possible approach? as it's pretty ubiquitous) and whether or not a common stats api could/should be created for both. Remy is the committer that did the bittorrent impl. We won't be able to do that immediately, given Europa finishing work, as I'm sure you understand.


Scott
