platform-update-home/doc/mirror_support.html



Revision 1.2

1 : btripkovic 1.1 <html>
2 :     <head>
3 :     <meta content="Microsoft FrontPage 6.0" name="GENERATOR">
4 :     </head>
5 :     <body>
6 :    
7 :    
8 :    
9 :    
10 :    
11 :     <p></p>
12 :     <h1>Improved Mirroring Support in Eclipse Install/Update </h1>
13 :     <address>By Branko Tripkovic and Dejan Glozic<br>
14 :     <!--webbot bot="Timestamp" S-Type="EDITED" S-Format="%m/%d/%Y" startspan -->11/07/2005<!--webbot bot="Timestamp" endspan i-checksum="12601" --></address>
15 :     <h2>Background</h2>
16 :     <p>The current mirroring support in Eclipse Install/Update was added at the
17 :     request of the Eclipse Foundation just before Eclipse 3.1 went live, in order to help
18 :     with the traffic spikes around release times. However, the current
19 :     implementation leaves a lot to be desired. It requires too much user
20 :     interaction, and currently there is
21 :     no way around this (<a href="https://bugs.eclipse.org/bugs/show_bug.cgi?id=97806">bug
22 :     97806</a>). In addition,
23 :     there is no standard way for the user to select the correct mirror.</p>
24 :    
25 :    
26 :     <p>At the moment we support mirrors through the optional <code>mirrorsURL</code>
27 :     attribute of the <code>site</code> tag in site.xml, which points to an XML file that contains the
28 :     update site mirror definitions. In this file, each mirror is
29 :     defined by a <code>mirror</code> tag with attributes that give the mirror's name and URL. Once
30 :     Update detects the <code>mirrorsURL</code> attribute in the <code>site</code> tag of site.xml, it downloads
31 :     the file and pops up a dialog with a list of
32 :     mirrors for the user to select from.</p>
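<p>As a rough illustration of the client's first step, the sketch below reads a mirrors file into a list of label/URL pairs. The class <code>MirrorListParser</code>, the root element name <code>mirrors</code> and the attribute names <code>label</code> and <code>url</code> are assumptions made for this example; the text above only says that each <code>mirror</code> tag carries a name and a URL.</p>

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper for illustration; not part of Install/Update. */
final class MirrorListParser {

    /** One entry of the mirrors file: display label plus mirror URL. */
    static final class Entry {
        final String label;
        final String url;

        Entry(String label, String url) {
            this.label = label;
            this.url = url;
        }
    }

    /** Parses the XML text of a mirrors file into a list of entries, in file order. */
    static List<Entry> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("mirror");
            List<Entry> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                // Attribute names "label" and "url" are assumed for this sketch.
                result.add(new Entry(e.getAttribute("label"), e.getAttribute("url")));
            }
            return result;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Cannot parse mirrors file", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<mirrors>"
                + "<mirror label=\"Mirror A\" url=\"http://a.example.com/updates/\"/>"
                + "<mirror label=\"Mirror B\" url=\"http://b.example.com/updates/\"/>"
                + "</mirrors>";
        for (Entry m : parse(xml)) {
            System.out.println(m.label + " -> " + m.url);
        }
    }
}
```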
33 :    
34 :    
35 :     <h2>Goals</h2>
36 :     <p>We would like to improve the mirroring support in Eclipse Install/Update and
37 :     have the following goals:</p>
38 :     <ol>
39 :     <li>Automate the choice of mirror, or at least help the user select the most
40 :     appropriate one. Ideally we would like no user input; however, there is not
41 :     always enough information on the system to avoid it. We want to
42 :     make this process as simple as possible for both the end user and the service
43 :     providers. We also want an algorithm for determining the best available
44 :     mirror that is not too CPU- and/or memory-intensive.<br>
45 :     &nbsp;</li>
46 :     <li>Provide ISVs building applications on top of the Eclipse platform
47 :     (e.g. RCP applications) with a way to contribute alternate mirror sorting
48 :     algorithms. This will allow companies to address specific concerns such as national rules, business
49 :     policies,
50 :     network performance, etc.<br>
51 :     &nbsp;</li>
52 :     <li>Propose a server-side alternative to the client side mirror selection.</li>
53 :     </ol>
54 :    
55 :    
56 :     <h2>Options</h2>
57 :     <p>There are several possible options to improve the mirror handling. They can
58 :     be first classified based on the place where mirror determination takes place
59 :     (client or server). Client-side choices can be further classified based on the
60 :     algorithm used to determine the best mirror.</p>
61 :     <h3>Client-side options</h3>
62 :     <h4>Geographic proximity algorithm</h4>
63 :     <p>As its name says, this algorithm works on the geographic proximity of the
64 :     mirror to the client; countries are usually used to determine proximity. </p>
65 :     <p>Issues:</p>
66 :     <ul>
67 :     <li>Large countries may have several
68 :     mirrors at considerably different distances from the client. This
69 :     could be solved using postal/ZIP codes </li>
70 :     <li>Centralized organizations and ISPs pose a further problem. Due to
71 :     cost and security issues, many ISPs have only one gateway to the internet, which
72 :     may not be in the same location as the client and in some cases not even in the
73 :     same country</li>
74 :     <li>Country borders will prevent us from exploiting real client/mirror
75 :     proximity. For instance, imagine a site located in Seattle with a mirror in
76 :     Halifax. A client connecting from Vancouver will be forwarded to the
77 :     Halifax mirror, with possibly considerable degradation of the user experience.</li>
78 :     </ul>
79 :     <h4>Link cost algorithm</h4>
80 :     <p>This algorithm works by measuring the cost (number of hops) of connecting to
81 :     different mirrors. It is a better algorithm since it measures the real cost
82 :     rather than an estimate.</p>
83 :     <p>Issues: </p>
84 :     <ul>
85 :     <li>Lots of additional network traffic </li>
86 :     <li>Routes can change </li>
87 :     <li>Can be a very lengthy process </li>
88 :     <li>Very likely to produce problems when working behind corporate and other
89 :     firewalls</li>
90 :     <li>Counting the Autonomous Systems (ASs) traversed is useful, but it is not an accurate
91 :     measurement, since the cost of traversing different ASs can vary considerably </li>
92 :     </ul>
93 :     <h3>Server-side options</h3>
94 :     <p>Mirroring support is usually put on the server. There are several reasons for
95 :     this. It is much easier to change the mirror selection logic. It is also
96 :     much easier to update the data that the mirroring logic uses to select correct mirrors.
97 :     Finally, servers usually know more about networks and other servers than clients do. The work
98 :     needed in Update to support server-side mirror selection is very limited, since Update
99 :     only has to understand one of the two standard ways in which 'redirect' instructions
100 :     from the server are issued. These two standard ways are:</p>
101 :    
102 :    
103 :    
104 :     <ol style="margin-top: 0in;" start="1" type="1">
105 :     <li>one of the 300-series HTTP status codes (probably 300) issued by the server to redirect traffic to another site</li>
106 :     <li>a meta tag in the HTML header that defines the redirect (mirror) site</li>
107 :     </ol>
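<p>To give a feel for how little the client needs to do for the first of these, here is a minimal sketch that turns a status code and a <code>Location</code> header into the address to fetch next. The class and method names are hypothetical, and the sketch assumes the server supplies the mirror address in the <code>Location</code> header; a server answering with a list of choices in the response body would need extra parsing that is not shown.</p>

```java
import java.net.URI;
import java.util.Optional;

/** Hypothetical helper for illustration; not part of Install/Update. */
final class RedirectResolver {

    /**
     * Returns the address the client should fetch next, or empty when the
     * response is not a usable redirect (non-3xx status or no Location header).
     */
    static Optional<URI> resolve(URI requested, int status, String location) {
        if (status < 300 || status > 399 || location == null || location.isEmpty()) {
            return Optional.empty();
        }
        // Location may be absolute or relative; resolve() handles both cases.
        return Optional.of(requested.resolve(location));
    }

    public static void main(String[] args) {
        URI site = URI.create("http://update.example.org/site.xml");
        System.out.println(resolve(site, 302, "http://mirror.example.net/site.xml"));
        System.out.println(resolve(site, 200, null));
    }
}
```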
108 :    
109 :     <h2>Proposed Solutions</h2>
110 :     <h3>Client-side</h3>
111 :    
112 :     <p>If we decide to go with a client-side solution, we will have to
113 :     define an extension point to support plugging in algorithms from other sources, and
114 :     extend the definitions of the site tag in site.xml and the mirror tag in the mirrors
115 :     definition file to support this. It would be done as follows: </p>
116 :    
117 :     <h4>Extension Point</h4>
118 :    
119 :     <p>A new extension point will be added to the Install/Update Core plug-in:</p>
120 :    
121 :     <blockquote>
122 :     <pre>&lt;<font color="#000080">extension</font> point=<font color="#008000">"org.eclipse.update.core.mirrorSorter"</font><font color="#000080">&gt;</font><br> &lt;<font color="#000080">sorter</font> <br> id=<font color="#008000">"com.example.xyz.MirrorSorter"</font><br> class=<font color="#008000">"com.example.xyz.update.XYZMirrorSorter"</font><br> <font color="#000080">/&gt;</font><br><font color="#000080">&lt;/extension&gt;</font>
123 :     </pre>
124 :     </blockquote>
125 :    
126 :     <p>The extension point provides for registering a mirror sorter with a unique
127 :     identifier referenced from within the site.xml using the newly added <code>
128 :     sorter-id</code> attribute.</p>
129 :    
130 :     <p>All of the extender classes must implement the following interface:</p>
131 :    
132 :     <blockquote>
133 :     <pre>public interface IMirrorSorter {<br>/**<br> * Accepts a list of mirrors as defined in the definition file and sorts them based on<br> * preferred use, with the best mirrors at the head of the list.<br> * @param candidates mirrors to sort<br> * @return the ordered list of mirrors (the best mirrors first)<br> */<br> public IMirror[] sort(IMirror[] candidates);<br>}<br><br>public interface IMirror {<br> /**<br> * Returns the URL as a string (as defined in the mirror definition file)<br> * @return mirror URL as a string<br> */<br> public String getAddress();<br> /**<br> * Returns the mirror label (as defined in the mirror definition file)<br> * @return mirror label<br> */ <br> public String getLabel();<br> /**<br> * Returns mirror property value given the property name<br> * @return value of the named mirror property or &lt;code&gt;null&lt;/code&gt; if not defined<br> */<br> public String getProperty(String name);<br>}<br><br></pre></blockquote>
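<p>As an illustration of what a contributed sorter might look like, the sketch below orders mirrors by a numeric property, best first. The two interfaces are repeated (without comments) only to keep the example self-contained. <code>SimpleMirror</code>, <code>RatingMirrorSorter</code> and the property name <code>rating</code> are hypothetical; the actual property names are not defined in this proposal.</p>

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

interface IMirror {
    String getAddress();
    String getLabel();
    String getProperty(String name);
}

interface IMirrorSorter {
    IMirror[] sort(IMirror[] candidates);
}

/** Hypothetical in-memory mirror used only for this illustration. */
final class SimpleMirror implements IMirror {
    private final String address;
    private final String label;
    private final Map<String, String> properties = new HashMap<>();

    SimpleMirror(String address, String label) {
        this.address = address;
        this.label = label;
    }

    /** Adds a property and returns this mirror, to allow chaining. */
    SimpleMirror with(String name, String value) {
        properties.put(name, value);
        return this;
    }

    public String getAddress() { return address; }
    public String getLabel() { return label; }
    public String getProperty(String name) { return properties.get(name); }
}

/** Example sorter: highest "rating" first; a missing or invalid rating counts as 0. */
final class RatingMirrorSorter implements IMirrorSorter {
    static final String RATING = "rating"; // assumed property name

    private static int rating(IMirror mirror) {
        String value = mirror.getProperty(RATING);
        if (value == null) {
            return 0;
        }
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    public IMirror[] sort(IMirror[] candidates) {
        IMirror[] sorted = candidates.clone(); // leave the caller's array untouched
        Comparator<IMirror> byRating = Comparator.comparingInt(RatingMirrorSorter::rating);
        Arrays.sort(sorted, byRating.reversed()); // stable sort keeps file order on ties
        return sorted;
    }

    public static void main(String[] args) {
        IMirror[] mirrors = {
            new SimpleMirror("http://a.example.com/", "A").with(RATING, "2"),
            new SimpleMirror("http://b.example.com/", "B").with(RATING, "5")
        };
        for (IMirror m : new RatingMirrorSorter().sort(mirrors)) {
            System.out.println(m.getLabel());
        }
    }
}
```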
134 :     <h4>Augmented Site Tag Definition in site.xml:</h4>
135 :    
136 :    
137 :     <blockquote>
138 :    
139 :     <pre><tt>&lt;!ATTLIST site</tt> <br><tt>&nbsp;&nbsp;&nbsp; type&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CDATA #IMPLIED</tt> <br><tt>&nbsp;&nbsp;&nbsp; url&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CDATA #IMPLIED</tt> <br><tt>&nbsp;&nbsp;&nbsp; mirrorsURL&nbsp;&nbsp;&nbsp; CDATA #IMPLIED</tt><br> <big><b style="font-weight: bold;"><small>sorter-id CDATA #</small></b><small><tt style="font-weight: bold;">IMPLIED</tt></small></big><br><tt>&gt;</tt></pre>
140 :    
141 :     </blockquote>
142 :    
143 :    
144 :     <p>where <code>sorter-id</code> is the identifier of a mirror sorting class that is
145 :     registered with the extension point <code>mirrorSorter</code>. If <code>sorter-id</code> is not present,
146 :     the default sorter provided by Install/Update will be used (the exact identifier
147 :     is to be defined).<br>
148 :     </p>
149 :    
150 :     <h4>Augmented Mirror Tag Definition in mirrors definition file:
151 :     <br>
152 :     </h4>
153 :    
154 :     <blockquote><pre>&lt;!ELEMENT mirror (property*)&gt; <br>&lt;!ELEMENT property EMPTY&gt; <br></pre>
155 :     <blockquote>
156 :    
157 :     </blockquote>
158 :     <pre><tt>&lt;!ATTLIST </tt>property<br><tt>&nbsp;&nbsp;&nbsp; name&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CDATA #</tt><tt>REQUIRED</tt><br><tt>&nbsp;&nbsp;&nbsp; value&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CDATA #</tt><tt>REQUIRED</tt><big><small><tt style="font-weight: bold; font-style: italic;"></tt></small></big><br><tt>&gt;</tt></pre>
159 :     <blockquote>
160 :    
161 :     </blockquote>
162 :     </blockquote>
163 :     <p>where the <code>property</code> element carries sorter-specific information that can be used to determine a suitable mirror.</p>
164 :    
165 :     <h4>Install/Update default mirror sorter implementation<br>
166 :     </h4>
167 :    
168 :     <p>An issue with both of the client-side options above is that they are based on the
169 :     assumption that all mirrors are created equal, i.e. that they have the same performance.
170 :     In our case, this is not true. We can try to alleviate this issue by assigning
171 :     mirror ratings and using them in our calculations as follows:</p>
172 :     <ol>
173 :     <li>Divide the world into large regions (East Coast USA, Central USA, West
174 :     Coast USA, Europe, East Asia), assign countries and states/provinces to
175 :     these regions and assign a time zone to each region.</li>
176 :     <li>Put this information in a comma-separated value (CSV) file. Format would be:</li><pre>Region, Country, State/Province, Time Zone</pre>
177 :     <li>Make a weighted graph of the world's adjacent regions. This graph
178 :     would be put in a CSV file.
179 :     The format would be: </li>
180 :     <pre>Region, Region, Weight</pre>
181 :     <li>The weight is an approximate network round-trip time (closely related to
182 :     geographic distance, but not the same) expressed as an integer from 1 to 5 </li>
183 :     <li>The user will be allowed
184 :     to set their location, country and geographic region in the preferences. If the user
185 :     does not provide this information, the system-provided country and time zone will be
186 :     used in our calculations </li>
187 :     <li>Offer the user two mutually exclusive choices in the preferences: "Select
188 :     the mirror automatically" and "Always prompt".</li>
189 :     <li>The service
190 :     provider will be required to supply a mirror definition file with
191 :     the following additional properties defined for each mirror: region,
192 :     country, state/province, time zone, rating (actual property names to be defined
193 :     later)<br>
194 :     <br>
195 :     Based on this information we will sort mirrors in the following manner:<br>
196 :     &nbsp;<ul>
197 :     <li>mirrors in the end user's country/state/province</li>
198 :     <li>mirrors from the same region</li>
199 :     <li>the
200 :     rest of the mirrors, ordered using the weighted graph of regions </li>
201 :     <li>use the mirror rating to decide on the order of mirrors in the same physical proximity to the end
202 :     user</li>
203 :     </ul>
204 :     </li>
205 :     </ol>
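<p>The ordering described in the list above could be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the Install/Update implementation: <code>RegionMirrorRanker</code> and its <code>Mirror</code> record are hypothetical, state/province and time zone matching are left out, and the region graph is filled in by hand instead of being read from the CSV files.</p>

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the proposed default ordering; names are illustrative. */
final class RegionMirrorRanker {

    /** Simplified mirror record: label, region, country code and rating. */
    static final class Mirror {
        final String label;
        final String region;
        final String country;
        final int rating;

        Mirror(String label, String region, String country, int rating) {
            this.label = label;
            this.region = region;
            this.country = country;
            this.rating = rating;
        }
    }

    /** Undirected weighted edge between two adjacent regions (weight 1 to 5). */
    private static final class Edge {
        final String a;
        final String b;
        final int weight;

        Edge(String a, String b, int weight) {
            this.a = a;
            this.b = b;
            this.weight = weight;
        }
    }

    private final List<Edge> edges = new ArrayList<>();

    void addEdge(String regionA, String regionB, int weight) {
        edges.add(new Edge(regionA, regionB, weight));
    }

    /** Cheapest path cost from the start region to every reachable region. */
    Map<String, Integer> distancesFrom(String start) {
        Map<String, Integer> dist = new HashMap<>();
        dist.put(start, 0);
        boolean changed = true;
        while (changed) { // repeated relaxation; terminates because weights are positive
            changed = false;
            for (Edge e : edges) {
                changed |= relax(dist, e.a, e.b, e.weight);
                changed |= relax(dist, e.b, e.a, e.weight);
            }
        }
        return dist;
    }

    private static boolean relax(Map<String, Integer> dist, String from, String to, int weight) {
        Integer base = dist.get(from);
        if (base == null) {
            return false;
        }
        Integer old = dist.get(to);
        if (old != null && old <= base + weight) {
            return false;
        }
        dist.put(to, base + weight);
        return true;
    }

    /** Same country first, then same region, then by graph cost; rating breaks ties. */
    List<Mirror> rank(List<Mirror> mirrors, String userRegion, String userCountry) {
        Map<String, Integer> dist = distancesFrom(userRegion);
        List<Mirror> sorted = new ArrayList<>(mirrors);
        sorted.sort(Comparator
                .comparingInt((Mirror m) -> proximity(m, userRegion, userCountry, dist))
                .thenComparing(Comparator.comparingInt((Mirror m) -> m.rating).reversed()));
        return sorted;
    }

    private static int proximity(Mirror m, String userRegion, String userCountry,
            Map<String, Integer> dist) {
        if (m.country.equals(userCountry)) {
            return 0;
        }
        if (m.region.equals(userRegion)) {
            return 1;
        }
        Integer d = dist.get(m.region);
        return d == null ? Integer.MAX_VALUE : 1 + d; // unreachable regions go last
    }
}
```

<p>Calling <code>rank(mirrors, "Europe", "DE")</code> on such a ranker would place German mirrors first, other European mirrors next, and the remaining mirrors in order of increasing graph cost from Europe.</p>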
206 :     <p>Install/Update will use the list of mirrors obtained from the sorter by
207 :     looping through it until it gets a response from one of them.
208 :     This would allow us to skip unresponsive mirrors.</p>
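<p>A minimal sketch of that loop is shown below. The class <code>MirrorFailover</code> and the way responsiveness is tested (an HTTP HEAD request with a three-second timeout) are assumptions made for the example; the proposal does not say how a response should be detected.</p>

```java
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical helper for illustration; not part of Install/Update. */
final class MirrorFailover {

    /** Returns the first mirror URL, in sorter order, for which the probe succeeds. */
    static Optional<String> firstResponsive(List<String> sortedMirrorUrls, Predicate<String> probe) {
        for (String url : sortedMirrorUrls) {
            if (probe.test(url)) {
                return Optional.of(url);
            }
        }
        return Optional.empty();
    }

    /** One possible probe: a HEAD request; any failure counts as unresponsive. */
    static boolean respondsToHead(String url) {
        try {
            HttpURLConnection connection =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            connection.setRequestMethod("HEAD");
            connection.setConnectTimeout(3000); // assumed timeout, in milliseconds
            connection.setReadTimeout(3000);
            int code = connection.getResponseCode();
            connection.disconnect();
            return code >= 200 && code < 400;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> mirrors = Arrays.asList("http://a.example.com/", "http://b.example.com/");
        // A fake probe keeps the demo offline; pass MirrorFailover::respondsToHead for real checks.
        System.out.println(firstResponsive(mirrors, url -> url.contains("b.")));
    }
}
```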
209 :     <p>
210 :    
211 :     </p>
212 :     <p class="MsoNormal">This solution does not address the case where the user and their
213 :     gateway are not in physical proximity to each other.</p>
214 :    
215 :    
216 :     <h3>Server side</h3>
217 :     <p>
218 :    
219 :     </p>
220 :    
221 :     <p>As previously mentioned, the beauty of the server-side solution is
222 :     that we do not have to define the mirror selection algorithm in advance. It
223 :     can be changed on the fly and on each site independently. However, we
224 :     should decide in advance which method of redirecting we will use
225 :     (although both can be supported at the same time). There are several
226 :     ways to do this that are in use on the web:<br>
227 :     </p>
228 :     <ol>
229 :     <li>Using HTTP status codes and new location interpreted by clients</li>
230 :     <li>Using meta tag in HTML</li>
231 :     <li>Using JavaScript</li>
232 :     </ol>
233 :     <p>Since the second solution is HTML specific and the third requires a
234 :     JavaScript interpreter we suggest using the HTTP 300 status code with a
235 :     sorted list of provided mirrors. Sorting can be done using either the
236 : btripkovic 1.2 client's IP address and an IP to Geographical location database,
237 : btripkovic 1.1 using a client provided location (using a GET request issued by the
238 :     client with 'country' and either 'time zone' or 'state/province' as
239 :     parameters), or some other way that can be devised by the site
240 :     implementing mirroring support. In any case, given client-provided
241 :     location information, a server-side algorithm similar to the one
242 :     presented for the client-side solution can be used. Our proposal is for
243 :     Update to pass the location information at all times and let the
244 :     server decide whether it wants to use it. This way we will not dictate the
245 :     choice of algorithms in advance. <br>
246 :     </p>
247 :     <p>Provided here is a <a href="new_mirror_suport_script.zip">PHP script</a> with configuration files that will
248 :     sort mirrors based on the client's country and time zone. Country and
249 :     time zone are expected to be provided as GET parameters. We have
250 :     adapted the above-mentioned default mirror sorter from the
251 :     client-side solution (one difference is that mirrors are in the CSV
252 :     file and we did not implement mirror rating). As an added bonus, this script
253 :     can be used to generate sorted mirror lists for current
254 :     implementations of mirror support. We also provide patches for the <a href="org.eclipse.update.core.patch">org.eclipse.update.core</a> and <a href="org.eclipse.update.ui.patch">org.eclipse.update.ui</a> plug-ins of the
255 :     Update component that enable the script to be used in this manner.
256 :     These patches add the "Automatically select mirror" checkbox to the
257 :     Update preferences page. When checked, Update will automatically select
258 :     the mirror from the mirror list. Also, the timeZone and countryCode
259 :     parameters are added to every mirror request.</p>
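<p>For illustration, the client side of such a request could be assembled along these lines. The parameter names <code>countryCode</code> and <code>timeZone</code> are the ones mentioned above; the class <code>MirrorRequestUrl</code> and the value formats (the country code of the default locale and a time zone ID such as <code>Europe/Berlin</code>) are assumptions and may differ from what the attached patches actually send.</p>

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical helper for illustration; not taken from the attached patches. */
final class MirrorRequestUrl {

    /** Appends countryCode and timeZone as GET parameters to the mirrors URL. */
    static String withLocation(String mirrorsUrl, String countryCode, String timeZone) {
        char separator = mirrorsUrl.indexOf('?') >= 0 ? '&' : '?';
        return mirrorsUrl + separator
                + "countryCode=" + encode(countryCode)
                + "&timeZone=" + encode(timeZone);
    }

    /** Same as above, using the country and time zone reported by the system. */
    static String withSystemLocation(String mirrorsUrl) {
        return withLocation(mirrorsUrl,
                Locale.getDefault().getCountry(),
                TimeZone.getDefault().getID());
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(withSystemLocation("http://update.example.org/mirrors.php"));
    }
}
```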
260 :     <p>Server side solutions can also resolve issues of a user's location
261 :     differing from their gateway location. For example, it is well known
262 :     what IP address ranges are used by certain large organizations and
263 :     where their internet gateways are located and this can be included in
264 :     the decision making process.</p>
265 :     <h2>Conclusion<br>
266 :     </h2>
267 :     <p>We recommend that the server-side solution be selected for handling mirrors
268 :     in the Update component.</p>
269 :     <p><i>Note: All files contained herein are provided as is. If they are to
270 :     be used in production environments, error checking and other
271 :     miscellaneous improvements should be added.</i></p>
272 :    
273 :    
274 :     </body>
275 :     </html>