platform-update-home/doc/working/site-performance-enhancement.html



Revision 1.2

<h1>Update Site Performance Enhancement Utility Proposal</h1>
<h2>The problem</h2>
<p>The Eclipse Install/Update design groups artifacts into units called
<b>features</b>, which are published on a remote server at an <b>update site</b>.
A feature consists of the feature manifest file, NL property files, images,
licenses, copyrights and other resources placed in a single JAR archive. When
directed at the update site, Eclipse Update Manager must download each of these
JARs and parse its manifest in order to perform activities such as site browsing,
searching and dependency checking.</p>
<p>This approach works reasonably well for moderately sized update sites, but we
are facing scalability problems with the upcoming Callisto update site, which
contains hundreds if not thousands of features. Each of the feature JARs is
small, but opening a connection and downloading each small JAR is costly, and the
cost adds up. Even worse, users must pay this price BEFORE they even decide
whether they want to install anything from the site. A solution is needed to
reduce the number of connections required simply to browse or search the update
site.</p>
<p>Once the features to install have been selected, Update needs to physically
download plug-in JARs onto the user's machine. At this point, payload size ceases
to be trivial: a full Callisto download is several hundred megabytes. A technique
to reduce the payload size would benefit users who are downloading the full
Callisto set.</p>
<h2>The solution</h2>
<p>We propose a <b>site enhancement utility</b> that can run on any update site
and produce artifacts that address the problems mentioned above. In addition to
the utility, the Install/Update code will be enhanced so that Update can consume
these artifacts. The performance enhancements are optional, and Install/Update
should continue to work as it does today when the artifacts are absent.</p>
<p>The core of the solution is the use of <b>update site digests</b> to merge all
the information needed for browsing and searching into one file (digest.xml)
that is compressed to reduce its size and downloaded over one connection instead
of many separate connections for individual features. Once the install decision
has been made, applying the <b>Pack200</b> utility (a part of J2SE 5.0) to plug-in
and fragment archives will make the payload smaller and faster to transport.</p>
<h3>The performance enhancement utility</h3>
<p>The utility will be a command-line tool driven entirely by the content of the
site.xml file:</p>
<blockquote>
<pre>&lt;utility-name&gt; site.xml</pre>
</blockquote>
<p>The site.xml file will contain additional attributes that enable the
performance enhancements. The role of the utility is to generate the required
artifacts according to the specification in site.xml. This specification is
expressed using additional attributes of the 'site' element:</p>
<ul>
<li><b>digest</b> - an optional attribute that points at the site digest
file. When the utility finds this attribute, it will generate the update
site digests based on all the features referenced in the file. In addition
to the default digest (e.g. 'digests/digest.xml') there may be
locale-specific versions (e.g. 'digests/digest_en_US.xml').</li>
<li><b>digestLocales</b> - a comma-separated list of locales for which a
digest file exists. The existence of this list prevents Update Manager from
making multiple trips to the server and opening connections to missing
files. The list must be exhaustive, i.e. it must match the existing digest
files. If a particular locale version of the digest is on the server but the
locale is not in this list, that digest will not be used. The value of this
attribute is generated by the utility from the set of locales for which it
produced a digest.</li>
<li><b>pack200</b> - a boolean attribute indicating that the site contains
archives packed using pack200.exe.</li>
</ul>
<p>Example:</p>
<blockquote>
<pre>&lt;site digest=&quot;digests/digest.xml&quot; digestLocales=&quot;en_US,ja_JP,de_DE,fr_FR&quot; pack200=&quot;true&quot;&gt;
....</pre>
<pre>&lt;/site&gt;</pre>
</blockquote>
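<p>As an illustration only, the following Java sketch shows one way the three
attributes could be read from the 'site' element. The class name
<code>SiteEnhancementOptions</code> and its fields are hypothetical and are not
part of the proposal; the parsing uses the standard JAXP DOM API, and a missing
<b>pack200</b> attribute is assumed to mean "false".</p>

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

/** Hypothetical holder for the three proposed attributes of the 'site' element. */
class SiteEnhancementOptions {
    final String digest;              // null when the attribute is absent
    final List<String> digestLocales; // empty when the attribute is absent
    final boolean pack200;            // assumption: absent means false

    SiteEnhancementOptions(String digest, List<String> digestLocales, boolean pack200) {
        this.digest = digest;
        this.digestLocales = digestLocales;
        this.pack200 = pack200;
    }

    /** Reads the attributes from the text of a site.xml file. */
    static SiteEnhancementOptions parse(String siteXml) {
        Element site;
        try {
            site = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(siteXml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("site.xml could not be parsed", e);
        }
        String digest = site.hasAttribute("digest") ? site.getAttribute("digest") : null;
        // getAttribute returns "" for a missing attribute, so blank entries are skipped.
        List<String> locales = new ArrayList<>();
        for (String locale : site.getAttribute("digestLocales").split(",")) {
            if (!locale.trim().isEmpty()) {
                locales.add(locale.trim());
            }
        }
        boolean pack200 = Boolean.parseBoolean(site.getAttribute("pack200"));
        return new SiteEnhancementOptions(digest, locales, pack200);
    }
}
```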
<p>The utility will use the <b>digest</b> and <b>pack200</b> attributes as input.
If <b>digest</b> is present, the utility will generate the digest files (the
default as well as one for each supported locale) and add or update the
<b>digestLocales</b> attribute so that it lists all the locales for which a
digest has just been generated. If <b>pack200</b> is present, the utility will
call pack200.exe on each plug-in or fragment archive (say, com.example.xyz.jar)
and generate a packed version (com.example.xyz.jar.pack.gz).</p>
<p>It is the responsibility of the utility's caller to ensure that it is run
regularly in order to keep the generated content in sync with the source.
Failure to do so will result in stale content and browsing or installation
errors. The digest generator portion of the utility can be simple and always
regenerate all the files, provided that this does not take too much time. The
pack200 portion, on the other hand, must be incremental in order to avoid
repacking JARs that have not changed.</p>
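<p>The proposal does not say how the utility would detect unchanged JARs. As a
minimal sketch under one possible assumption (comparing last-modified
timestamps), the incremental check could look like the following. The class and
method names are hypothetical; only the ".pack.gz" naming comes from the text
above.</p>

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper deciding which archives the utility still has to pack. */
class IncrementalPackCheck {

    /** com.example.xyz.jar -> com.example.xyz.jar.pack.gz, in the same directory. */
    static Path packedPathFor(Path jar) {
        return jar.resolveSibling(jar.getFileName() + ".pack.gz");
    }

    /**
     * Assumption: a JAR needs packing when no packed version exists yet,
     * or when the JAR was modified after its packed version was written.
     */
    static boolean needsPacking(Path jar) {
        Path packed = packedPathFor(jar);
        try {
            if (!Files.exists(packed)) {
                return true;
            }
            return Files.getLastModifiedTime(jar)
                    .compareTo(Files.getLastModifiedTime(packed)) > 0;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

<p>A content hash stored next to the packed file would be a sturdier signal than
timestamps if the site is published by copying files between machines.</p>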
<h3>Update site digests</h3>
<p>The goal of the update site digests is to minimize the number of files that
need to be downloaded in order to browse or search the update site. Digests are
made by parsing all the features referenced in site.xml and generating the
merged content in one XML file. The DTD of the file is an implementation detail.
Suffice it to say that support will be added to Update Manager to download the
file, expand it, parse it and use it to create light features sufficient to
populate the UI or perform dependency checks. The digest will have a processing
instruction stating the digest version, to allow future enhancements while
retaining backward compatibility.</p>
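<p>To make the role of the <b>digestLocales</b> list concrete, here is a hedged
Java sketch of how a client might choose which digest file to request without
probing the server. The class <code>DigestLocator</code> is hypothetical, and the
exact-match rule is an assumption: the proposal does not define any fallback
such as from "en_US" to "en".</p>

```java
import java.util.List;

/** Hypothetical client-side helper that picks a digest file to download. */
class DigestLocator {

    /**
     * Returns the locale-specific digest path when the locale is listed in
     * digestLocales, otherwise the default digest path. Unlisted locales are
     * never requested, which avoids a wasted connection to a missing file.
     */
    static String digestPathFor(String digest, List<String> digestLocales, String locale) {
        if (locale == null || !digestLocales.contains(locale)) {
            return digest; // assumption: exact match only, no language-only fallback
        }
        int dot = digest.lastIndexOf('.');
        if (dot > digest.lastIndexOf('/')) {
            // digests/digest.xml + en_US -> digests/digest_en_US.xml
            return digest.substring(0, dot) + "_" + locale + digest.substring(dot);
        }
        return digest + "_" + locale; // file name without an extension
    }
}
```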
<p>Once the choice has been made, the features selected for installation will be
fully downloaded into regular feature objects. The gain lies in the ratio of
features that are downloaded for searching to those actually needed for the
installation process. In update sites such as those for Callisto, this ratio is
typically 100:1 or higher, hence the gain. Conversely, digests are not needed
for small and simple sites.</p>
<h3>Support for Pack200</h3>
<p>If Pack200 is indicated in site.xml, Eclipse Update will first try to
download &quot;com.example.xyz.jar.pack.gz&quot; when downloading the &quot;com.example.xyz.jar&quot;
archive. If found, it will be downloaded and unpacked on the client machine. The
rest of the process will be as usual. If the file is not present, &quot;com.example.xyz.jar&quot;
will be downloaded instead. A missing packed file may therefore slow down the
installation because of the redundant connection attempt. For this reason, the
utility offers no option to pack selectively: if the pack200 attribute is
present, the tool will traverse all the plug-ins and fragments present at the
site and generate a packed version of each.</p>
<p>Since the packed version of the archive needs to be unpacked on the client,
the client must have unpack200.exe available (for example, in the system path).
Update Manager will check whether it can unpack Pack200 archives before
downloading them. Because some clients will not be able to unpack, update sites
must provide normal versions of the archives in addition to the packed
versions.</p>
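<p>The download decision described in this section can be sketched as follows.
This is an illustration under stated assumptions, not Update Manager code: the
class <code>ArchiveSelector</code> is hypothetical, and the server lookup is
passed in as a predicate so that the sketch stays independent of any particular
connection API.</p>

```java
import java.util.function.Predicate;

/** Hypothetical helper choosing between the packed and the normal archive. */
class ArchiveSelector {

    /**
     * @param jarName         archive to install, e.g. "com.example.xyz.jar"
     * @param sitePack200     value of the pack200 attribute in site.xml
     * @param clientCanUnpack result of the client's unpack200 availability check
     * @param existsOnServer  stand-in for a request that probes the update site
     * @return the name of the file that should actually be downloaded
     */
    static String archiveToDownload(String jarName, boolean sitePack200,
                                    boolean clientCanUnpack,
                                    Predicate<String> existsOnServer) {
        // The server is probed only when the packed file could actually be used.
        if (sitePack200 && clientCanUnpack) {
            String packed = jarName + ".pack.gz";
            if (existsOnServer.test(packed)) {
                return packed;
            }
            // This probe was the redundant connection attempt mentioned above.
        }
        return jarName;
    }
}
```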
<p>The location of the unpack200 executable can be specified to Update using the
system property &quot;org.eclipse.update.jarprocessor.pack200&quot;. The value can be one
of:</p>
<ul>
<li>&quot;@jre&quot; - find unpack200 in java.home/bin</li>
<li>&quot;@path&quot; - find unpack200 on the search path</li>
<li>&quot;@none&quot; - don't use unpack200, just download the .jar</li>
<li>&quot;path/to/dir&quot; - path to the directory containing the unpack200
executable</li>
</ul>
<p>If the property is not set, Update will look for unpack200.exe first in
java.home/bin, then on the system path. If both lookups fail, Pack200 archives
will not be used.</p>
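<p>The lookup rules for the property can be summarized in a short Java sketch.
The class <code>Unpack200Locator</code> and its method are hypothetical; the
property values and the default search order are taken from the description
above. The sketch only computes the directories to search, in order, and leaves
the check for the executable itself to the caller.</p>

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper resolving where to look for the unpack200 executable. */
class Unpack200Locator {

    /**
     * @param property value of org.eclipse.update.jarprocessor.pack200, or null if unset
     * @param javaHome value of the java.home system property
     * @param pathEnv  value of the PATH environment variable, may be null
     * @return directories to search in order; an empty list means "do not unpack"
     */
    static List<Path> searchDirs(String property, String javaHome, String pathEnv) {
        List<Path> dirs = new ArrayList<>();
        if (property == null) {
            // Default: java.home/bin first, then the system path.
            dirs.add(Paths.get(javaHome, "bin"));
            addPathEntries(dirs, pathEnv);
        } else if (property.equals("@jre")) {
            dirs.add(Paths.get(javaHome, "bin"));
        } else if (property.equals("@path")) {
            addPathEntries(dirs, pathEnv);
        } else if (!property.equals("@none")) {
            // Any other value is treated as the directory holding the executable.
            dirs.add(Paths.get(property));
        }
        return dirs;
    }

    private static void addPathEntries(List<Path> dirs, String pathEnv) {
        if (pathEnv == null) {
            return;
        }
        for (String entry : pathEnv.split(Pattern.quote(File.pathSeparator))) {
            if (!entry.isEmpty()) {
                dirs.add(Paths.get(entry));
            }
        }
    }
}
```

<p>A caller would pass <code>System.getProperty("org.eclipse.update.jarprocessor.pack200")</code>,
<code>System.getProperty("java.home")</code> and <code>System.getenv("PATH")</code>,
then test each returned directory for an unpack200 executable.</p>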