[cross-project-issues-dev] Transparent P2 mirrors (Was: Why (not) a "Final Daze" in SimRel?)

> I must echo something that Ed touched on:
> 
> >> * Doesn't the Foundation plan to enable a transparent mirroring
> >> system very soon that would make all this p2.mirrorsUrl useless?
> > No!!!  With mirror URLs, the mirrors are directly accessed with no
> > further access to download.eclipse.org.  With transparent mirroring,
> > the download server remains a bottleneck because it must be consulted
> > in order to redirect "transparently" to some other site.
> 
> Agreed -- transparent mirroring is nice for zip files and such, where a
> short URL can be convenient. But for p2 repos which are accessed by
> machines, NO!!!!  :)  Let the client gather bits from the local source
> directly.

This goes waaaay off-topic from the original discussion, but:

Depending on how the transparent mirroring works, I could actually see it helping a lot for P2 repos as well, at least with a little support from P2.

A) The situation right now is, for some fictional composite release update site on eclipse.org, and a very simplified P2 install:

1. Get http://download.eclipse.org/foo/updates/latest/p2.index
2. Get http://download.eclipse.org/foo/updates/latest/compositeContent.jar
3. Read child URL "../../drops/1.2.3", resolve it to http://download.eclipse.org/foo/drops/1.2.3
4. Get http://download.eclipse.org/foo/drops/1.2.3/p2.index
5. Get http://download.eclipse.org/foo/drops/1.2.3/content.jar
6. Resolve the P2 install plan
7. Get http://download.eclipse.org/foo/drops/1.2.3/artifacts.jar
8. Read p2.mirrorsUrl from artifacts.jar
9. Get mirror list from p2.mirrorsUrl (e.g. http://www.eclipse.org/downloads/download.php?file=/foo/drops/1.2.3&format=xml)
10. Select a mirror from the list and download individual artifacts from the mirror
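
To make the request count in sequence A concrete, here is a minimal Java sketch, not actual P2 code. The class and method names (SequenceA, resolveChild, requestsBeforeMirror, hitsOnEclipseOrg) are made up for illustration; the URLs are the fictional ones from the list above. It resolves the child location the way step 3 does and counts the requests that go to eclipse.org before any mirror is contacted.

```java
import java.net.URI;
import java.util.List;

class SequenceA {
    // Step 3: resolve a composite child location against the composite root.
    // The root needs a trailing slash; without it "latest" would be treated
    // as a file name and the result would be one directory too high.
    static URI resolveChild(String repoRoot, String child) {
        String root = repoRoot.endsWith("/") ? repoRoot : repoRoot + "/";
        return URI.create(root).resolve(child);
    }

    // Steps 1, 2, 4, 5, 7 and 9: everything fetched before a mirror is used.
    static List<URI> requestsBeforeMirror() {
        String composite = "http://download.eclipse.org/foo/updates/latest/";
        String child = resolveChild(composite, "../../drops/1.2.3") + "/";
        return List.of(
            URI.create(composite + "p2.index"),
            URI.create(composite + "compositeContent.jar"),
            URI.create(child + "p2.index"),
            URI.create(child + "content.jar"),
            URI.create(child + "artifacts.jar"),
            // Mirror list taken from p2.mirrorsUrl (example value from step 9)
            URI.create("http://www.eclipse.org/downloads/download.php"
                + "?file=/foo/drops/1.2.3&format=xml"));
    }

    // Count how many of those requests are served by an eclipse.org host.
    static long hitsOnEclipseOrg() {
        return requestsBeforeMirror().stream()
            .filter(u -> u.getHost().endsWith("eclipse.org"))
            .count();
    }
}
```

Under these assumptions every one of the listed requests is answered by eclipse.org, and the mirror only comes into play for the artifact downloads in step 10.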

While step 10 is typically the one that transfers the most actual bits over the pipes, that's six requests with non-trivial content (steps 1, 2, 4, 5, 7 and 9) that hit eclipse.org before we ever reach the mirror. In my experience, this already adds up to a significant delay if download.eclipse.org has a slow day or your pipe to it is thin in general.

Also, having to edit the artifacts.xml (inside artifacts.jar), e.g. to provide this repository completely from your own mirror, is quite messy. And it can be quite a surprise when your Eclipse still contacts download.eclipse.org because of the p2.mirrorsUrl, even though you've put up a mirror e.g. on a local server.

B) Now imagine a transparent mirroring solution where step 1 immediately redirects us to a suitable mirror, and P2 cooperates:

1. Request http://download.eclipse.org/foo/updates/latest/p2.index
   a. Receive HTTP 307 to http://mirror.example.org/foo/updates/latest/p2.index
   b. Update P2 repo object so further repository loading uses the redirected URL
   c. Follow redirect and get http://mirror.example.org/foo/updates/latest/p2.index
2. Get http://mirror.example.org/foo/updates/latest/compositeContent.jar
3. Read child URL "../../drops/1.2.3", resolve it to http://mirror.example.org/foo/drops/1.2.3
...

That's a single request to eclipse.org (maybe two or three if there is no p2.index and P2 has to guess) and we're off to a faster mirror. And all that without messing with p2.mirrorsUrl. Also, it's much closer to how HTTP redirects are handled on the web and feels a lot more natural, to me at least.
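
Step 1b of sequence B could look roughly like the following sketch. This is not how P2 or the ECF transport work today; RepoRebase and rebase are hypothetical names, and the approach (strip the requested file's relative path from the redirect target to get the new repository root) is just one assumption about how the "location change" could be derived.

```java
import java.net.URI;

class RepoRebase {
    // Given the repository root, the URL that was requested and the URL the
    // server redirected to, derive the root to use for all further requests.
    // Falls back to the old root if the redirect does not preserve the
    // file's path relative to the repository.
    static URI rebase(URI repoRoot, URI requested, URI redirected) {
        // e.g. "p2.index" for a request directly below the root
        URI relative = repoRoot.relativize(requested);
        if (relative.isAbsolute()) {
            return repoRoot; // requested URL was not inside this repository
        }
        String suffix = relative.toString();
        String target = redirected.toString();
        if (!target.endsWith(suffix)) {
            return repoRoot; // redirect points somewhere unrelated
        }
        return URI.create(target.substring(0, target.length() - suffix.length()));
    }
}
```

With the fictional URLs from above, rebasing http://download.eclipse.org/foo/updates/latest/ after the redirect of p2.index yields http://mirror.example.org/foo/updates/latest/, and resolving "../../drops/1.2.3" against that new root gives http://mirror.example.org/foo/drops/1.2.3, as in step 3 of sequence B.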

Right now, this doesn't work because P2 transparently fetches the redirected content for each request, one at a time, but does not interpret it as a location change for the whole repo (which means you still get the first sequence of requests, just with a redirect after each step). I never quite understood the rationale behind this behavior. Was this a conscious decision, and for what use-case? Or is it just coincidentally the ECF transport's redirect handling "just doing its job"?
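
The difference between the two redirect policies can be shown with a toy model. This is a simulation only, under the assumption that the origin server redirects every repository request to a mirror; RedirectHits and originHits are invented names and nothing here touches the network or the real transport.

```java
import java.util.List;

class RedirectHits {
    // Count how many requests reach the origin server while loading the
    // given repository files, assuming the origin answers each of them
    // with a redirect to a mirror.
    static int originHits(List<String> files, boolean rebaseOnRedirect) {
        boolean onMirror = false;
        int hits = 0;
        for (String file : files) {
            if (!onMirror) {
                hits++; // request goes to the origin, which redirects it
                if (rebaseOnRedirect) {
                    onMirror = true; // later files are requested from the mirror
                }
            }
            // In both modes the content itself is delivered by the mirror.
        }
        return hits;
    }
}
```

For the five repository files of the example (p2.index, compositeContent.jar, the child's p2.index, content.jar and artifacts.jar), following each redirect individually still sends five requests to the origin, while treating the first redirect as a location change sends one.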

WDYT?
Carsten

