Re: [p2-dev] Improving download speed

> One thing that I discovered when working on "Mirror ranking needs to be
> improved" [1] (I'd like some input on the patch) was that the actual
> meta-data is never downloaded from a mirror. I can understand why since
> the actual mirrorsURL is in the repository, but it got me thinking.
This is a known situation that we have never bothered to improve because it has never turned out to be a big issue. On the largest repos we have at this point (Galileo / Helios), the content is fairly static once things have shipped, so people download it once and after that only check the timestamp.

> One increasingly common scenario is to use composites to include
> history. All references in these composites are relative. So why don't
> we allow a mirrorsURL in the composite? If we did, all the referenced
> meta-data could be fetched from the mirrors. That would mean a serious
> improvement for those of us that rely heavily on the mirrors.

At this point, I believe the biggest pain point with composites is not the speed at which we can get the metadata (which, all in all, is relatively small), but the fact that we fetch the child repositories sequentially rather than in parallel. I think fixing just that could improve the situation drastically (https://bugs.eclipse.org/bugs/show_bug.cgi?id=300251).
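
To make the sequential-versus-parallel point concrete, here is a rough sketch of loading the children of a composite concurrently. It is not p2 code: the class name, the `loadAll` helper, the loader function and the example.org URIs are all made up for illustration, and a real loader would of course do the network fetch and parsing.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical illustration only; none of these names exist in p2.
public class ParallelChildLoad {

    // Loads every child concurrently and returns the results in the
    // same order as the children list.
    public static <T> List<T> loadAll(List<URI> children, Function<URI, T> loader, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<T>> tasks = new ArrayList<>();
            for (URI child : children) {
                tasks.add(() -> loader.apply(child));
            }
            List<T> results = new ArrayList<>();
            // invokeAll blocks until all tasks are done and preserves task order.
            for (Future<T> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Child locations resolved relative to the composite, as in a composite repo.
        URI base = URI.create("http://example.org/updates/");
        List<URI> children = List.of(
                base.resolve("1.0/"), base.resolve("1.1/"), base.resolve("1.2/"));
        // Stand-in loader; a real one would download and parse the child metadata.
        List<String> loaded = loadAll(children, uri -> "metadata from " + uri, 3);
        loaded.forEach(System.out::println);
    }
}
```

With a sequential loop the total time is roughly the sum of the per-child latencies; with something like the above it is closer to the slowest child, which is where I would expect the gain to come from.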

But if on top of that we can do mirroring, why not. However, I am wondering how mirroring plays with our caching story. It feels like the timestamp check would suddenly become a much more expensive process but, even more important, are we sure that the timestamp of content.jar is the same on every mirror? If we can't guarantee this, then I think we are exposing the user to a much worse situation, since we would end up downloading the indexes far too frequently.
Basically I say why not, but a mistake there could be fatal. Historically we have had problems with transports, and making the retrieval of the indexes more complex makes me nervous.
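
A small sketch of the timestamp worry, assuming the cache check is a plain comparison of last-modified values (an assumption made for illustration, not a description of the actual p2 code). The class, the method names, the mirror names and the timestamp values are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in p2.
public class MirrorTimestampCheck {

    // Naive cache check: any difference from the cached value counts as stale.
    public static boolean isStale(long cachedLastModified, long remoteLastModified) {
        return cachedLastModified != remoteLastModified;
    }

    // Counts the mirrors whose reported timestamp would trigger a re-download
    // even though the content itself has not changed.
    public static int countSpuriousDownloads(long cached, Map<String, Long> mirrorTimestamps) {
        int count = 0;
        for (long timestamp : mirrorTimestamps.values()) {
            if (isStale(cached, timestamp)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        long cached = 1000L; // timestamp recorded when content.jar was first fetched
        Map<String, Long> mirrors = new LinkedHashMap<>();
        mirrors.put("mirror-a", 1000L); // preserved the original timestamp
        mirrors.put("mirror-b", 1600L); // stamped the file with its own sync time
        mirrors.put("mirror-c", 1000L);
        System.out.println("mirrors forcing a re-download: "
                + countSpuriousDownloads(cached, mirrors));
    }
}
```

If mirrors do not preserve the original timestamp, every switch to such a mirror looks like a change and the index gets downloaded again, which is exactly the situation I would like to avoid.
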
> Another thought is that perhaps we should have some kind manifest file?
> I think this has been discussed before but I was not able to find the
> bugzilla. The repository manager could make an attempt to load a
> specific file, p2.mf or similar, that would contain information about
> how to read the update site content. This manifest would typically be
> very small and contain information such as:
> 1. The format of the repositories (so that no scan is needed for
> compositeContent.jar, compositeContent.xml, content.jar, content.xml, etc.)

This is captured in bug https://bugs.eclipse.org/bugs/show_bug.cgi?id=177231

> 2. Name of the repositories (for quick browsing)
> 3. Location.
> 4. MirrorsURL.
> 5. (optional) Meta-requirements for reading the repository and a pointer
> to another repository that provides the needed artifacts. Done right,
> this could be the mechanism to use to ensure forward and backward
> compatibility.