RE: [p2-dev] Mirror ranking


Yes, we just noticed this yesterday and Andrew has raised a bug for it: https://bugs.eclipse.org/bugs/show_bug.cgi?id=297408

John




From: "Kirill Balod" <Kirill.Balod@xxxxxxxxxxx>
Sent by: p2-dev-bounces@xxxxxxxxxxx
Date: 12/10/2009 10:51 AM
Please respond to: P2 developer discussions <p2-dev@xxxxxxxxxxx>
To: "P2 developer discussions" <p2-dev@xxxxxxxxxxx>
Subject: RE: [p2-dev] Mirror ranking





One more note: the CompositeRepository used by the "mirror task" marks a repo as bad if the download of a single IU fails.
I guess this is an optimisation for remote sites (though I'm not sure it is always correct), but it is a strange choice for a local site.
For example, some zipped archive sites now contain only "jar.pack.gz" entries and no "jar" entries. The "jar" entries were removed manually before zipping, but "artifacts.xml" still references two artifacts for each IU. Even though every IU can still be fetched from such a site, the mirror task cannot mirror it.
 

Kirill A. Balod
Borland (Micro Focus)

kirill.balod@xxxxxxxxxxx


From: p2-dev-bounces@xxxxxxxxxxx [mailto:p2-dev-bounces@xxxxxxxxxxx] On Behalf Of John Arthorne
Sent: Thursday, December 10, 2009 5:43 AM
To: P2 developer discussions
Subject: Re: [p2-dev] Mirror ranking



I'm sure this algorithm can be improved. Please enter a bug report for it. I like the idea of resetting or lowering the failure count if we later have a successful transfer. How it behaved on the most recent transfer is generally going to be much more interesting than historical behaviour.


John



From: Thomas Hallgren <thomas@xxxxxxx>
Sent by: p2-dev-bounces@xxxxxxxxxxx
Date: 12/09/2009 01:40 PM
Please respond to: P2 developer discussions <p2-dev@xxxxxxxxxxx>
To: P2 developer discussions <p2-dev@xxxxxxxxxxx>
Subject: [p2-dev] Mirror ranking







I mirrored Helios today and it basically took forever. After a few
hours, I was beginning to wonder what was going on and luckily, the
process ran in a debugger. I found that the top ranked mirror was the
one at eclipse.org. That surprised me since I know that I have a fast
mirror in Sweden that serves up a copy of Helios.

First I checked if this mirror was included in the list served up by the
mirror request to Eclipse.org. It was. Next, I stopped the debugger and
patched the URL for entry number zero in my mirrors list with the URL of
that mirror. I resumed and now the processing went very much faster. So
the mirror was actually OK.

So why did download.eclipse.org move to the top of the list? It's
supposed to be right at the bottom. The algorithm for sorting the
mirrors looks like this:

       public int compareTo(Object o) {
           if (!(o instanceof MirrorInfo))
               return 0;
           MirrorInfo that = (MirrorInfo) o;
           //less failures is better
           if (this.failureCount != that.failureCount)
               return this.failureCount - that.failureCount;
           //faster is better
           if (this.bytesPerSecond != that.bytesPerSecond)
               return (int) (that.bytesPerSecond - this.bytesPerSecond);
           //trust that initial rank indicates geographical proximity
           return this.initialRank - that.initialRank;
       }

A failure count of one will deem the mirror forever worse than a failure
count of zero, no matter if that mirror is a hundred times faster. I
think that was what caused my problem. All mirrors in the list except
two have a failureCount of 1 and a byte-count of -1 because, after some
initial failure, they were never given a second chance. The two
exceptions are download.eclipse.org (initialRank = 55) and one other
(initialRank = 10).
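
To make the effect concrete, here is a minimal, self-contained sketch
that reuses the comparison logic quoted above. MirrorInfoSketch and
MirrorRankingDemo are hypothetical stand-ins, not the real p2 classes,
and the host names (other than download.eclipse.org) and transfer rates
are made-up values chosen only to mirror the situation I saw.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for p2's MirrorInfo; it carries only the fields
// that the quoted compareTo reads.
class MirrorInfoSketch implements Comparable<MirrorInfoSketch> {
    final String url;
    final int initialRank;
    int failureCount;
    long bytesPerSecond = -1; // -1 means "no transfer measured yet"

    MirrorInfoSketch(String url, int initialRank) {
        this.url = url;
        this.initialRank = initialRank;
    }

    // Same ordering rules as the quoted method, typed with generics.
    @Override
    public int compareTo(MirrorInfoSketch that) {
        if (this.failureCount != that.failureCount)
            return this.failureCount - that.failureCount;
        if (this.bytesPerSecond != that.bytesPerSecond)
            return (int) (that.bytesPerSecond - this.bytesPerSecond);
        return this.initialRank - that.initialRank;
    }
}

class MirrorRankingDemo {
    static List<MirrorInfoSketch> rank() {
        // Nearby mirror: best initial rank, but one early failure and no
        // measured rate (assumed values).
        MirrorInfoSketch nearby = new MirrorInfoSketch("http://nearby.example.org/", 0);
        nearby.failureCount = 1;

        // The two mirrors that did not fail (rates are invented).
        MirrorInfoSketch node10 = new MirrorInfoSketch("http://node10.example.org/", 10);
        node10.bytesPerSecond = 40_000;
        MirrorInfoSketch origin = new MirrorInfoSketch("http://download.eclipse.org/", 55);
        origin.bytesPerSecond = 50_000;

        List<MirrorInfoSketch> mirrors = new ArrayList<>();
        mirrors.add(nearby);
        mirrors.add(node10);
        mirrors.add(origin);
        Collections.sort(mirrors);
        return mirrors;
    }

    public static void main(String[] args) {
        for (MirrorInfoSketch m : rank())
            System.out.println(m.url + " failures=" + m.failureCount
                    + " bytesPerSecond=" + m.bytesPerSecond);
    }
}
```

With these inputs the failure count is compared first, so the nearby
mirror sorts last regardless of how fast it might be, and nothing in
the comparison ever lets it recover.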

My guess is that something went wrong at the very beginning that caused
all mirrors except download.eclipse.org and node number 10 to fail. Not
sure what that was. That however, moved download.eclipse.org to the top
and node number 10 to second place. And although I have mirrors 100
times faster close by, they are never consulted again. I'm downloading
about 3,800 artifacts.

Mirrors may have temporary and fairly short outages. They may be
incomplete in some respect, or just be under very heavy load for a short
period of time. I think the algorithm could be improved by adding a
periodic retry on mirrors with an initialRank value that indicates that
it is geographically close. I also think that we should have a ratio
between high transfer rate and failure count. Let's say that 5 times
higher transfer rate is worth one failure. Perhaps a successful transfer
should reset the failure count, or at least cut it in half so that
failures are forgiven by subsequent good behavior.
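
As a rough illustration of the ratio and the forgiveness idea, here is
one possible sketch. It is not a patch against p2: ForgivingMirror,
score(), recordSuccess() and recordFailure() are hypothetical names, the
factor 5 is just the example figure from above, and treating an
unmeasured rate as 1 byte per second is an assumption made only to keep
the arithmetic defined. The periodic retry is not modelled here.

```java
import java.util.Comparator;

// Hypothetical sketch of a ranking where speed can outweigh failures
// and a later success forgives part of the failure history.
class ForgivingMirror {
    static final double RATE_FACTOR_PER_FAILURE = 5.0; // example value

    final String url;
    final int initialRank;
    int failureCount;
    long bytesPerSecond = -1; // -1 means "no transfer measured yet"

    ForgivingMirror(String url, int initialRank) {
        this.url = url;
        this.initialRank = initialRank;
    }

    void recordFailure() {
        failureCount++;
    }

    void recordSuccess(long measuredBytesPerSecond) {
        bytesPerSecond = measuredBytesPerSecond;
        failureCount /= 2; // integer halving: 1 -> 0, 3 -> 1
    }

    // Each failure divides the effective rate by the factor, so a mirror
    // needs to be 5 times faster to make up for one failure.
    double score() {
        double rate = Math.max(bytesPerSecond, 1); // assumption for unmeasured mirrors
        return rate / Math.pow(RATE_FACTOR_PER_FAILURE, failureCount);
    }

    // Higher score first; ties fall back to the initial rank.
    static final Comparator<ForgivingMirror> BEST_FIRST = (a, b) -> {
        int byScore = Double.compare(b.score(), a.score());
        return byScore != 0 ? byScore : Integer.compare(a.initialRank, b.initialRank);
    };

    public static void main(String[] args) {
        ForgivingMirror origin = new ForgivingMirror("http://download.eclipse.org/", 55);
        origin.recordSuccess(100_000);
        ForgivingMirror nearby = new ForgivingMirror("http://nearby.example.org/", 0);
        nearby.recordSuccess(1_000_000);
        nearby.recordFailure();
        System.out.println("nearby ranks first: "
                + (BEST_FIRST.compare(nearby, origin) < 0));
    }
}
```

Under this scheme a mirror with one failure still wins if it is more
than 5 times faster than a mirror with none, and each success halves
whatever failure count remains.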

One question that I don't know the answer to at this point is what
happens when an artifact is missing although it should be there
according to the artifact repository. Will the mirror get punished for
that? If so, that's not good: the artifact will be missing on every
mirror, but it is the best mirror that ends up being punished.

What do you think?

- thomas



_______________________________________________
p2-dev mailing list
p2-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/p2-dev


