<html>
<head>
<meta content="Microsoft FrontPage 6.0" name="GENERATOR">
</head>
<body>





<p></p>
<h1>Improved Mirroring Support in Eclipse Install/Update </h1>
<address>By Branko Tripkovic and Dejan Glozic<br>
11/07/2005</address>
<h2>Background</h2>
<p>The current mirroring support in Eclipse Install/Update was added at the 
request of the Eclipse Foundation just before Eclipse 3.1 went live, to help 
absorb the traffic spikes around release times. However, the current 
implementation leaves a lot to be desired. It requires too much user 
interaction, and there is currently 
no way around this (<a href="https://bugs.eclipse.org/bugs/show_bug.cgi?id=97806">bug 
97806</a>). In addition, 
there is no standard way for the user to select the correct mirror.</p>


<p>At the moment we support mirrors through the optional <code>mirrorsURL</code>
attribute of the <code>site</code> tag in site.xml, which points to an XML file that contains the
update site mirror definitions. In this file each mirror is
defined by a <code>mirror</code> tag whose attributes give the mirror's name and URL. When
Update detects the <code>mirrorsURL</code> attribute in the <code>site</code> tag of site.xml, it downloads
the file and pops up a dialog with a list of
mirrors for the user to select from.</p>
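<p>For illustration, a mirrors file of this kind can be read with the standard JAXP DOM parser. This is a minimal sketch, not the Update implementation: the <code>mirrors</code> root element, the <code>url</code> and <code>label</code> attribute names, the example hosts, and the <code>MirrorListReader</code> class are all assumptions made for the example.</p>

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper: reads label/URL pairs from a mirrors definition file. */
class MirrorListReader {

    /** Returns one {label, url} pair per mirror element, in document order. */
    static List<String[]> read(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("mirror");
            List<String[]> mirrors = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element mirror = (Element) nodes.item(i);
                // Attribute names are assumed; the text only says "name and URL".
                mirrors.add(new String[] {
                        mirror.getAttribute("label"), mirror.getAttribute("url") });
            }
            return mirrors;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse mirrors file", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<mirrors>"
                + "<mirror url=\"http://mirror1.example.org/updates/\" label=\"Example mirror 1\"/>"
                + "<mirror url=\"http://mirror2.example.org/updates/\" label=\"Example mirror 2\"/>"
                + "</mirrors>";
        for (String[] m : read(xml)) {
            System.out.println(m[0] + " -> " + m[1]);
        }
    }
}
```

<p>The list returned here is what the current implementation would show to the user in the mirror selection dialog.</p>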


<h2>Goals</h2>
<p>We would like to improve the mirroring support in Eclipse Install/Update and 
have the following goals:</p>
<ol>
	<li>Automate the choice of mirror, or at least help the user select the most 
	appropriate one. Ideally no user input would be needed; however, the system 
	does not always have enough information to avoid it. We want to 
	make this process as simple as possible for both the end user and the service 
	providers. We also want an algorithm for determining the best available 
	mirror that is not too CPU- or memory-intensive.<br>
&nbsp;</li>
	<li>Provide ISVs building applications on top of the Eclipse platform 
	(e.g. RCP applications) with a way to contribute alternate mirror sorting 
	algorithms. This will allow companies to address specific concerns such as national rules, business 
	policies, 
	network performance, etc.<br>
&nbsp;</li>
	<li>Propose a server-side alternative to client-side mirror selection.</li>
</ol>


<h2>Options</h2>
<p>There are several possible options to improve the mirror handling. They can 
be first classified based on the place where mirror determination takes place 
(client or server). Client-side choices can be further classified based on the 
algorithm used to determine the best mirror.</p>
<h3>Client-side options</h3>
<h4>Geographic proximity algorithm</h4>
<p>As its name suggests, this algorithm ranks mirrors by their geographic proximity 
to the client. Countries are usually used to determine proximity. </p>
<p>Issues:</p>
<ul>
	<li>Large countries are a problem, since they may have several 
	mirrors at considerably different distances from the client. This 
	could be solved by using postal/ZIP codes.</li>
	<li>Centralized organizations and ISPs are also a problem. Because of 
	cost and security concerns, many ISPs have only one gateway to the internet, which 
	may not be in the same location as the client and in some cases not even in the 
	same country.</li>
	<li>Country borders will prevent us from exploiting real client/mirror 
	proximity. For instance, imagine a site located in Seattle with a mirror in 
	Halifax. A client connecting from Vancouver will be forwarded to 
	the Halifax mirror, possibly with a considerable degradation of the user experience.</li>
</ul>
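<p>The Seattle/Halifax/Vancouver case can be made concrete with a great-circle distance calculation. The sketch below is only an illustration: the <code>ProximityExample</code> class and the approximate city coordinates are assumptions, and geographic distance is itself only a rough stand-in for network cost.</p>

```java
/** Hypothetical illustration of the Seattle/Halifax/Vancouver example. */
class ProximityExample {
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle distance in kilometres between two points (haversine formula). */
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    public static void main(String[] args) {
        // Approximate coordinates (latitude, longitude), assumed for the example.
        double[] vancouver = { 49.28, -123.12 };
        double[] seattle = { 47.61, -122.33 };
        double[] halifax = { 44.65, -63.57 };

        System.out.printf("Vancouver to Seattle: %.0f km%n",
                distanceKm(vancouver[0], vancouver[1], seattle[0], seattle[1]));
        System.out.printf("Vancouver to Halifax: %.0f km%n",
                distanceKm(vancouver[0], vancouver[1], halifax[0], halifax[1]));
    }
}
```

<p>A country-based match sends the Vancouver client to Halifax, thousands of kilometres away, although the Seattle site is only a few hundred kilometres from the client.</p>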
<h4>Link cost algorithm</h4>
<p>This algorithm works by measuring the cost (number of hops) of connecting to 
each mirror. In principle it is the better algorithm, since it measures the real cost 
rather than estimating it.</p>
<p>Issues: </p>
<ul>
	<li>Lots of additional network traffic </li>
	<li>Routes can change </li>
	<li>Can be a very lengthy process </li>
	<li>Very likely to produce problems when working behind corporate and other 
	firewalls</li>
	<li>Counting ASes (Autonomous Systems) is useful, but it is not an accurate 
	measurement, since the cost of traversing different ASes can vary considerably </li>
</ul>
<h3>Server-side options</h3>
<p>Mirroring support is usually put on the server. There are several reasons for 
this. It is much easier to change the mirror selection logic. It is also
much easier to update the data that the mirroring logic uses to select the correct mirrors. 
Finally, servers usually know more about networks and other servers than clients do. The work
needed in Update to support server-side mirror selection is very limited, since Update
only has to understand one of the two standard ways in which 'redirect' instructions
are issued by the server. These two standard ways are:</p>



<ol style="margin-top: 0in;" start="1" type="1">
     <li>one of the 300-series HTTP status codes (probably 300), issued by the server to redirect traffic to another site</li>
     <li>a meta tag in the HTML header that defines the redirect (mirror) site</li>
</ol>

<h2>Proposed Solutions</h2>
<h3>Client-side</h3>

<p>If we decide to go with the client-side solution, we will have to
define an extension point that lets algorithms from other sources be plugged in, and
extend the definitions of the site tag in site.xml and the mirror tag in the mirrors
definition file to support this. It would be done as follows: </p>

<h4>Extension Point</h4>

<p>A new extension point will be added to the Install/Update Core plug-in:</p>

<blockquote>
	<pre>&lt;<font color="#000080">extension</font> point=<font color="#008000">"org.eclipse.update.core.mirrorSorter"</font><font color="#000080">&gt;</font><br>   &lt;<font color="#000080">sorter</font> <br>         id=<font color="#008000">"com.example.xyz.MirrorSorter"</font><br>         class=<font color="#008000">"com.example.xyz.update.XYZMirrorSorter"</font><br>   <font color="#000080">/&gt;</font><br><font color="#000080">&lt;/extension&gt;</font>
</pre>
</blockquote>

<p>The extension point provides for registering a mirror sorter with a unique 
identifier referenced from within the site.xml using the newly added <code>
sorter-id</code> attribute.</p>

<p>All extender classes must implement the following interface:</p>

<blockquote>
	<pre>public interface IMirrorSorter {
    /**
     * Accepts a list of mirrors as defined in the definition file and sorts them based on
     * preferred use, with the best mirrors at the head of the list.
     * @param candidates mirrors to sort
     * @return the ordered list of mirrors (the best mirrors first)
     */
    public IMirror[] sort(IMirror[] candidates);
}

public interface IMirror {
    /**
     * Returns the URL as a string (as defined in the mirror definition file)
     * @return mirror URL as a string
     */
    public String getAddress();

    /**
     * Returns the mirror label (as defined in the mirror definition file)
     * @return mirror label
     */
    public String getLabel();

    /**
     * Returns mirror property value given the property name
     * @param name the property name
     * @return value of the named mirror property or &lt;code&gt;null&lt;/code&gt; if not defined
     */
    public String getProperty(String name);
}
</pre></blockquote>
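<p>As an illustration of what a contributed sorter might look like, the sketch below orders mirrors by a numeric <code>rating</code> property. It is a hypothetical example, not part of the proposal: the class names <code>SimpleMirror</code> and <code>RatingMirrorSorter</code> and the property name <code>rating</code> are assumptions, and the two interfaces are repeated only so that the example compiles on its own.</p>

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Interfaces repeated from the proposal so the sketch is self-contained.
interface IMirror {
    String getAddress();
    String getLabel();
    String getProperty(String name);
}

interface IMirrorSorter {
    IMirror[] sort(IMirror[] candidates);
}

/** Hypothetical in-memory IMirror used only for this example. */
class SimpleMirror implements IMirror {
    private final String address;
    private final String label;
    private final Map<String, String> properties = new HashMap<>();

    SimpleMirror(String address, String label, String rating) {
        this.address = address;
        this.label = label;
        if (rating != null) {
            properties.put("rating", rating); // assumed property name
        }
    }

    public String getAddress() { return address; }
    public String getLabel() { return label; }
    public String getProperty(String name) { return properties.get(name); }
}

/** Hypothetical sorter: higher rating first; a missing or invalid rating counts as 0. */
class RatingMirrorSorter implements IMirrorSorter {

    private static int rating(IMirror mirror) {
        String value = mirror.getProperty("rating");
        if (value == null) {
            return 0;
        }
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    public IMirror[] sort(IMirror[] candidates) {
        IMirror[] sorted = candidates.clone(); // leave the caller's array untouched
        // Arrays.sort on objects is stable, so equally rated mirrors keep their file order.
        Arrays.sort(sorted, (a, b) -> Integer.compare(rating(b), rating(a)));
        return sorted;
    }
}
```

<p>Such a class would be registered through the extension point shown earlier and selected by a site through its identifier.</p>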
<h4>Augmented Site Tag Definition in site.xml:</h4>


<blockquote>
	
  <pre><tt>&lt;!ATTLIST site</tt> <br><tt>&nbsp;&nbsp;&nbsp; type&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CDATA #IMPLIED</tt> <br><tt>&nbsp;&nbsp;&nbsp; url&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CDATA #IMPLIED</tt> <br><tt>&nbsp;&nbsp;&nbsp; mirrorsURL&nbsp;&nbsp;&nbsp; CDATA #IMPLIED</tt><br>    <big><b style="font-weight: bold;"><small>sorter-id     CDATA #</small></b><small><tt style="font-weight: bold;">IMPLIED</tt></small></big><br><tt>&gt;</tt></pre>

</blockquote>


<p>where <code>sorter-id</code> is the identifier of a mirror sorter 
registered through the <code>mirrorSorter</code> extension point. If <code>sorter-id</code> is not present, 
the default sorter provided by Install/Update will be used (the exact identifier 
is to be defined).<br>
</p>

<h4>Augmented Mirror Tag Definition in mirrors definition file:
	<br>
</h4>

<blockquote><pre>&lt;!ELEMENT mirror (property*)&gt; <br>&lt;!ELEMENT property EMPTY&gt; <br></pre>
  <blockquote>
	
  </blockquote>
  <pre><tt>&lt;!ATTLIST </tt>property<br><tt>&nbsp;&nbsp;&nbsp; name&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CDATA #</tt><tt>REQUIRED</tt><br><tt>&nbsp;&nbsp;&nbsp; value&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CDATA #</tt><tt>REQUIRED</tt><big><small><tt style="font-weight: bold; font-style: italic;"></tt></small></big><br><tt>&gt;</tt></pre>
  <blockquote>

  </blockquote>
</blockquote>
<p>where the <code>property</code> element carries sorter-specific information that can be used to determine a suitable mirror.</p>

<h4>Install/Update default mirror sorter implementation<br>
</h4>

<p>An issue with both of the client-side options above is that they assume 
all mirrors are created equal, i.e. that they have the same performance. 
In our case, this is not true. We can try to alleviate this issue by assigning 
ratings to mirrors and using them in our calculations as follows:</p>
<ol>
	<li>Divide the world into large regions (East Coast USA, Central USA, West 
	Coast USA, Europe, East Asia), assign countries and states/provinces to 
	these regions, and assign a time zone to each region.</li>
	<li>Put this information in a comma-separated values (CSV) file. The format would be:</li><pre>Region, Country, State/Province, Time Zone</pre>
	<li>Make a weighted graph of the world's adjacent regions. This graph 
	would be put in a CSV file. 
	The format would be: </li>
<pre>Region, Region, Weight</pre>
	<li>The weight is an approximate network round-trip time (closely related to 
geographic distance, but not the same), expressed as an integer from 1 to 5.</li>
	<li>The user will be able 
to set their location, country, and geographic region in the preferences. If the user 
does not provide this information, the system-provided country and time zone will be 
used in the calculations.</li>
	<li>Offer the user two mutually exclusive choices in the preferences: "Select 
	the mirror automatically" and "Always prompt".</li>
	<li>The service
provider will be required to supply a mirror definition file with
	the following additional properties defined for each mirror: region,
country, state/province, time zone, and rating (actual property names to be defined 
	later).<br>
	<br>
	Based on this information, we will sort mirrors in the following order:<br>
&nbsp;<ul>
		<li>mirrors in the end user's country/state/province</li>
		<li>mirrors in the same region</li>
		<li>the 
rest of the mirrors, ordered using the weighted graph of regions</li>
		<li>within the same physical proximity to the end user, the mirror rating decides the order</li>
	</ul>
	</li>
</ol>
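<p>The ordering rules above can be sketched as follows. This is one possible reading of the steps, not the default sorter itself: the <code>RegionMirrorRanker</code> class, its field names, and the choice of the Floyd-Warshall algorithm for computing costs between non-adjacent regions are assumptions, and "same country/state/province" is interpreted here as matching both country and state.</p>

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the default sorter's ordering rules. */
class RegionMirrorRanker {

    /** Mirror data as it might be read from the definition file properties (all non-null). */
    static class Mirror {
        final String label, country, state, region;
        final int rating;

        Mirror(String label, String country, String state, String region, int rating) {
            this.label = label;
            this.country = country;
            this.state = state;
            this.region = region;
            this.rating = rating;
        }
    }

    private static final int UNREACHABLE = Integer.MAX_VALUE / 2;
    private final Map<String, Map<String, Integer>> cost = new HashMap<>();

    /** Adds one "Region, Region, Weight" row of the adjacency file (treated as undirected). */
    void addEdge(String a, String b, int weight) {
        row(a).merge(b, weight, Integer::min);
        row(b).merge(a, weight, Integer::min);
    }

    private Map<String, Integer> row(String region) {
        Map<String, Integer> r = cost.computeIfAbsent(region, k -> new HashMap<>());
        r.put(region, 0);
        return r;
    }

    /** Floyd-Warshall: turns direct link weights into cheapest path costs between all regions. */
    void computeShortestPaths() {
        List<String> regions = new ArrayList<>(cost.keySet());
        for (String via : regions) {
            for (String from : regions) {
                for (String to : regions) {
                    int candidate = distance(from, via) + distance(via, to);
                    if (candidate < distance(from, to)) {
                        cost.get(from).put(to, candidate);
                    }
                }
            }
        }
    }

    int distance(String from, String to) {
        if (from.equals(to)) {
            return 0;
        }
        Map<String, Integer> r = cost.get(from);
        Integer d = (r == null) ? null : r.get(to);
        return d == null ? UNREACHABLE : d;
    }

    /** Same country and state first, then by region cost (0 = same region), then by rating. */
    List<Mirror> sort(List<Mirror> mirrors, String country, String state, String region) {
        Comparator<Mirror> localFirst = Comparator.comparingInt(
                m -> (m.country.equals(country) && m.state.equals(state)) ? 0 : 1);
        Comparator<Mirror> byRegionCost = Comparator.comparingInt(m -> distance(region, m.region));
        Comparator<Mirror> byRatingDesc = Comparator.comparingInt(m -> -m.rating);
        List<Mirror> sorted = new ArrayList<>(mirrors);
        sorted.sort(localFirst.thenComparing(byRegionCost).thenComparing(byRatingDesc));
        return sorted;
    }
}
```

<p>A caller would load the region graph with <code>addEdge</code>, call <code>computeShortestPaths</code> once, and then call <code>sort</code> with the user's country, state/province, and region taken from the preferences or the system defaults.</p>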
<p>Install/Update will use the list of mirrors obtained from the sorter by 
looping through it until one of the mirrors responds. 
This allows unresponsive mirrors to be skipped.</p>
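<p>A minimal sketch of this loop is shown below. The <code>MirrorFailover</code> class, the HTTP HEAD probe, and the three-second timeouts are assumptions chosen for the example; the proposal does not specify how responsiveness is tested.</p>

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.function.Predicate;

/** Hypothetical helper: walks a sorted mirror list and returns the first responsive mirror. */
class MirrorFailover {

    /** Returns the first URL accepted by the probe, or null if no mirror responds. */
    static String firstResponsive(String[] sortedMirrorUrls, Predicate<String> responds) {
        for (String url : sortedMirrorUrls) {
            if (responds.test(url)) {
                return url;
            }
        }
        return null;
    }

    /** One possible probe: an HTTP HEAD request with short timeouts. */
    static boolean respondsToHead(String url) {
        try {
            HttpURLConnection connection =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            connection.setRequestMethod("HEAD");
            connection.setConnectTimeout(3000);
            connection.setReadTimeout(3000);
            int status = connection.getResponseCode();
            connection.disconnect();
            return status >= 200 && status < 400;
        } catch (IOException | RuntimeException e) {
            return false; // unreachable, malformed, or not an HTTP URL
        }
    }
}
```

<p>A caller could write <code>MirrorFailover.firstResponsive(urls, MirrorFailover::respondsToHead)</code>, where <code>urls</code> holds the addresses in the order returned by the sorter.</p>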
<p class="MsoNormal">This solution does not address the case where the user and the user's
gateway are not in physical proximity to each other.</p>


<h3>Server side</h3>

<p>As previously mentioned, the beauty of the server-side solution is
that we do not have to define the mirror selection algorithm in advance. It
can be changed on the fly and on each site independently. However, we
should decide in advance which method of redirecting we will use
(although more than one can be supported at the same time). Several
methods are in use on the web:<br>
</p>
<ol>
  <li>Using HTTP status codes and new location interpreted by clients</li>
  <li>Using meta tag in HTML</li>
  <li>Using JavaScript</li>
</ol>
<p>Since the second solution is HTML-specific and the third requires a
JavaScript interpreter, we suggest using the HTTP 300 status code with a
sorted list of the provided mirrors. Sorting can be done using the
client's IP address and an IP-to-geographic-location database,
using a client-provided location (sent in a GET request issued by the
client with 'country' and either 'time zone' or 'state/province' as
parameters), or in some other way devised by the site
implementing mirroring support. When client-provided
location information is available, a server-side algorithm similar to the one
presented for the client-side solution can be used. Our proposal is for
Update to pass the location information at all times and let the
server decide whether to use it. This way we will not dictate the
choice of algorithm in advance. <br>
</p>
<p>Provided here is a <a href="new_mirror_suport_script.zip">PHP script</a> with configuration files that
sorts mirrors based on the client's country and time zone. Country and
time zone are expected to be provided as GET parameters. The script follows
the default mirror sorter described above for the
client-side solution, with two differences: the mirrors are listed in the CSV
file, and mirror rating is not implemented. As an added bonus, this script
can be used to generate sorted mirror lists for the current
implementation of mirror support. We also provide patches for the <a href="org.eclipse.update.core.patch">org.eclipse.update.core</a> and <a href="org.eclipse.update.ui.patch">org.eclipse.update.ui</a> plug-ins of the
Update component that enable the script to be used in this manner.
These patches add an "Automatically select mirror" checkbox to the
Update preferences page. When it is checked, Update will automatically select
the mirror from the mirror list. In addition, the timeZone and countryCode
parameters are added to every mirror request.</p>
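<p>The following sketch shows how a client might append these parameters to a mirrors request. Only the parameter names <code>countryCode</code> and <code>timeZone</code> come from the text above; the <code>MirrorRequestUrl</code> class, the example URL, and the value formats (the default locale's country code and the default time zone ID) are assumptions and may differ from what the patches actually send.</p>

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical helper: appends the client's location to a mirrors request URL. */
class MirrorRequestUrl {

    /** Adds countryCode and timeZone as query parameters, keeping any existing query. */
    static String withLocation(String mirrorsUrl, String countryCode, String timeZone) {
        String separator = mirrorsUrl.indexOf('?') >= 0 ? "&" : "?";
        return mirrorsUrl + separator
                + "countryCode=" + encode(countryCode)
                + "&timeZone=" + encode(timeZone);
    }

    /** Uses the JVM defaults; the value formats are an assumption of this sketch. */
    static String withSystemLocation(String mirrorsUrl) {
        return withLocation(mirrorsUrl,
                Locale.getDefault().getCountry(),
                TimeZone.getDefault().getID());
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(withSystemLocation("http://example.org/mirrors.php"));
    }
}
```

<p>Because the server is free to ignore these parameters, sending them on every request does not commit a site to any particular selection algorithm.</p>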
<p>Server side solutions can also resolve issues of a user's location
differing from their gateway location. For example, it is well known
what IP address ranges are used by certain large organizations and
where their internet gateways are located and this can be included in
the decision making process.</p>
<h2>Conclusion<br>
</h2>
<p>We recommend that the server-side solution be selected for handling mirrors 
in the Update component.</p>
<p><i>Note: All files contained herein are provided as is. If they are to
be used in production environments, error checking and other
miscellaneous improvements should be added.</i></p>


</body>
</html>