
Re: [provisioning-dev] Improved update generation/distribution algorithm

What if the deltas between V(n) and V(n+1), V(n+2) ... V(n+latest) were precomputed and stored on download.eclipse.org for mirroring, rather than only in memory? In this case, n would be a Milestone, and each I build that followed (+1, +2, ...) would ONLY consist of incremental deltas from its predecessor.

Milestones, which tend to get picked up more than I builds, could be treated differently. I can see three possible ways to publish a milestone:

a) Delta from last milestone (M7 vs. M6)
b) Delta from last I build (M7-S20080428 vs. I-20080421)
c) full build (because people often download a full milestone build in order to have a clean start)

The same should probably be true for R builds:

a) Delta from last R (3.4.0 vs. 3.3.0)
b) Delta from last I/S build (R3.4.0-R20080528 vs. RC7-S20080521)
c) full build

For maintenance branches, I'd suggest that the first GA build on the branch (say, 3.4.0) would be the baseline, and all maintenance builds (M, S) would be delta-only updates. The final R build in a branch (3.4.1, 3.4.2) ought to be available either in both formats or just as a full build, so that people coming to Eclipse for the first time don't have to download 3.4.0 and then wait while it updates to 3.4.2; instead, they can start with 3.4.2 directly.

a) Delta from last R (3.4.1 vs. 3.4.0)
b) Delta from last M/S build (R3.3.2-R20090228 vs. M-20090221)
c) full build

This approach takes all the benefits of the delta computation for space/bandwidth savings, but also allows the mirrors to participate, and lets us do the computation-heavy/memory-heavy work up front when first publishing the build from build server to production server @ eclipse.org.
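
To make this concrete, the mirrorable delta artifacts might be laid out something like the following (a purely hypothetical naming scheme for illustration, not anything that exists on download.eclipse.org today):

deltas/
  I20080414_to_I20080421.zip      (I build -> next I build)
  I20080421_to_I20080428.zip
  M6_to_M7.zip                    (milestone -> milestone, option a)
  I20080421_to_M7.zip             (last I build -> milestone, option b)
  M7_full.zip                     (full milestone build, option c)

Each of these would be a static file that dumb http mirrors can pick up like any other download.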

On the other hand, offering three ways to update (for stable and release builds) might be overkill. Would people be OK with downloading 3.4.0 and waiting for it to update via delta-only updates to 3.4.2, if it were never offered as a full download? I suppose it would be no worse than installing an Ubuntu- or Debian-based Linux distro and immediately running `apt-get update; apt-get upgrade` to get the updates not included in the .iso, or installing Windows and then running Windows Update & Microsoft Update to get the umpteen patches, fixes and drivers that weren't on the install CD. But is that such a great thing to emulate? ;-)

On the issue of storing terabytes of old builds, delta precomputation would get around that problem -- at any time you'd only ever need:

this week's I (20070421, 20070414, ...)
last week's I (20070428, 20070421, ...)
last month's milestone S (M6, M5, ...)
this month's milestone S (M7, M6, ...)
last year's release R (x.y.0, x.y.1)
this year's release R (x.y.1, x.y.2)

With precomputed deltas, you'd only need to store those deltas, not the builds for which they were calculated, and could purge old builds cyclically so you'd only ever have 2 I builds, 2 milestone/RC builds, and all the releases. AFAIK, that's not so different from the current policy.

Thoughts?

--
Nick Boldt :: Release Engineer, IBM Toronto Lab
Eclipse Modeling :: http://www.eclipse.org/modeling
http://wiki.eclipse.org/index.php/User:Nickb

On 4/26/07, Denis Roy <denis.roy@xxxxxxxxxxx> wrote:
A few thoughts...

David J. Orme wrote:
- Is the size of the update site ever an issue?

For us, no.  We use a separate physical build server and update site server.  For a project like Eclipse, this is a good question, since an algorithm like this would eventually require keeping terabytes of information around in order to generate diffs.
For those of us who maintain large update sites and can't afford terabytes of disk space, the answer is yes, the size of the update site is an issue to consider  :)

- With a server that could run code, have you considered the generation of
the deltas on the fly given what the client has?

Two considerations:
a) big update sites with lots of traffic can potentially require massive server resources to generate deltas on-the-fly.
b) deltas on-the-fly means we can't use the free bandwidth provided by mirror sites. Although sending deltas likely translates to a much smaller transfer than what we send out now, static "dumb http" mirrors also allow users to pick a server geographically close to them instead of one set of servers in one physical location.

Thanks,

Denis

- How do you see that fit with the concept of artifact repository developed
in equinox and our attempt to use a mirroring metaphor instead of download?

This is a way to mirror new versions of plugins between repositories using minimal bandwidth with reasonable assurances of correctness.


From: "David J. Orme" <djo@coconut-palm-software.com>
Sent by: provisioning-dev-bounces@eclipse.org
To: provisioning-dev@xxxxxxxxxxx
Subject: [provisioning-dev] Improved update generation/distribution algorithm
Date: 04/25/2007 06:22 PM
Please respond to: "Developer discussions for provisioning (Update Manager) initiatives" <provisioning-dev@eclipse.org>

The current feature-based update sites can get huge quickly when small
numbers of changes are scattered across a large number of features.  This
kind of situation happens easily when releasing a dot-level or patch-level
upgrade--i.e., 1.0.0 to 1.0.1--where lots of small bugs are fixed.

The purpose of this email is to open a dialog and to outline our solution.
Apologies in advance to those who are less comfortable with expressing
algorithms in formal math.  Time constraints have dictated communication
using the most compact and efficient means possible.

The algorithm we developed worked as follows:

The build server maintains copies of all production versions of the
application.  In the general case, then, we need an algorithm to upgrade a
given user from a version V(n) to a version V(n+delta) where delta is the
number of versions that have elapsed since the user last upgraded.  We want
this algorithm to have the following properties:
   1. Send a minimal number of bits across the wire.
   2. Be as simple as possible.
   3. Be verifiably correct.  We couldn't afford 40,000-70,000 broken
      clients if the algorithm malfunctioned.
   4. Related to the previous: have safeguards built in so that if a bad
      update site is ever generated, the whole thing fails fast--before
      *any* bits are ever sent to clients.
You will notice some conflicts in the requirements above.  In particular,
the requirement to send the fewest number of bits over the wire conflicts
with the requirements to be as simple as possible and to be verifiably
correct.  The result was that our algorithm sacrifices some efficiency in
the number of bits it sends over the wire in order to be simpler and more
verifiably correct.

The result: In our application, going from version 1.0.0 to version 1.0.1
using update site features resulted in about a 10 meg download.  After
applying our algorithm, the download size shrank to about 3 megs.  This is
still large for some of our users on dial-up connections, but it
is doable, whereas the 10 meg download would simply be prohibitive.

Here is the algorithm:

First, we simplify the problem.  Instead of upgrading users from an
arbitrary version V(n) to V(n+delta), we only upgrade them from V(n) to
V(n+1).  Going to V(n+delta) is simply an iteration of the algorithm over
delta steps.  This simplification works fine for our user population.
Whether it would be acceptable for the general Eclipse population is an
open question.
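
For concreteness, here is a minimal sketch of that iteration (the
Installation and Delta types and their methods are invented for
illustration and are not the actual client API):

import java.util.List;

interface Installation {
    String currentVersion();
}

interface Delta {
    String fromVersion();                 // the version this delta upgrades from, V(k)
    void applyTo(Installation install);   // performs the single V(k) -> V(k+1) step
}

public class DeltaUpgrader {
    // Upgrading from V(n) to V(n+delta) is just the single-step upgrade applied delta times.
    public static void upgrade(Installation install, List<Delta> deltasInOrder) {
        for (Delta step : deltasInOrder) {
            if (!step.fromVersion().equals(install.currentVersion())) {
                throw new IllegalStateException("delta does not match the installed version");
            }
            step.applyTo(install);        // the installed version advances by one
        }
    }
}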

Second, we notice that sending JAR'd plugins is very expensive.  Suppose
only two classes in a large plugin change: the whole plugin must still be
shipped.

Our algorithm then is simple.  We dynamically generate a function F where
F(P(V(n))) = P(V(n+1)).  F is really just a zipped collection of files that
encodes only the files that changed between the two plugin versions.

This is accomplished using the following algorithm.  For a given plugin P
in two versions V(n) and V(n+1):
   1. Unpack P(V(n)) and P(V(n+1)) into temporary directories.
   2. For files that correspond to each other in the two plugins, MD5 hash
      them, and if the hashes are different, put the file from P(V(n+1))
      into F.
   3. Include new files from P(V(n+1)) in F.
   4. Create a text file manifest listing files in P(V(n)) that are no
      longer included in P(V(n+1)) and include that text file in F.
   5. Package all of the files in F in a zip file.
I believe that the algorithm for applying this function F to P(V(n)) to get
a P(V(n+1)) is self-evident.
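
For illustration, here is a rough Java sketch of steps 2 through 4 (the
names are invented, it is not the code we actually run, and the final
packaging of F into a zip file is omitted):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

public class DeltaBuilder {

    // MD5 hash of one file, as a hex string.
    static String md5(Path file) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5").digest(Files.readAllBytes(file));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b));
        return hex.toString();
    }

    // Relative paths of every regular file under root (an unpacked plugin).
    static Set<Path> listFiles(Path root) throws IOException {
        Set<Path> files = new TreeSet<>();
        try (Stream<Path> walk = Files.walk(root)) {
            walk.filter(Files::isRegularFile).forEach(p -> files.add(root.relativize(p)));
        }
        return files;
    }

    // Steps 2-4: decide which files go into F and which belong in the deletion manifest.
    public static void computeDelta(Path oldDir, Path newDir,
                                    List<Path> filesForF, List<Path> deletedFiles) throws Exception {
        Set<Path> oldFiles = listFiles(oldDir);
        Set<Path> newFiles = listFiles(newDir);

        for (Path rel : newFiles) {
            if (!oldFiles.contains(rel)) {
                filesForF.add(rel);                                   // step 3: new file
            } else if (!md5(oldDir.resolve(rel)).equals(md5(newDir.resolve(rel)))) {
                filesForF.add(rel);                                   // step 2: changed file
            }
        }
        for (Path rel : oldFiles) {
            if (!newFiles.contains(rel)) {
                deletedFiles.add(rel);                                // step 4: goes into the manifest
            }
        }
    }
}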

This approach has advantages over binary-diffing the two plugin jars
P(V(n)) and P(V(n+1)): If the order of the files contained in the plugin
changes between versions, a naive binary diff algorithm will not achieve
nearly the same amount of "compression" that we achieve.

So we have described a simple algorithm that is able to dramatically reduce
download bandwidth requirements over using update sites.  However, we have
not described how we achieved the "fail-fast" requirement.  If a flawed
update site is ever generated, we want to detect this automatically at
build time rather than after 50,000 clients have already downloaded it and
broken their installs.

Here is the algorithm we use:

After a build has completed, we have the following artifacts on our build
server:

P(V(n)) - the old version
P(V(n+1)) - the original new version generated by PDEBuild
F - the function (basically an intelligent binary diff) we generated that
we can apply to P(V(n)) to get P(V(n+1))

The approach we take to validate F is the following:
   1. Apply F to P(V(n)) on the build server using the same code that is
      running on the clients.
   2. Let's call the result of applying F to P(V(n)), P(V(n+1))'  (notice
      the prime there).  So formally, F(P(V(n))) = P(V(n+1))'
   3. We need to prove that P(V(n+1))' is the same as P(V(n+1)), the
      version originally generated by PDEBuild.
   4. To do this, we simply unpack P(V(n+1))' and P(V(n+1)) into temporary
      directories and perform an MD5 hash on all files, making sure that
      all files present in one are also present in the other and that their
      hashes are equal.
   5. If applying F to P(V(n)) on the build server using the same code as
      is running on the clients produces a P(V(n+1))' identical to
      P(V(n+1)) on the server, then it should work on the clients as well.
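
In code, the comparison in step 4 boils down to something like this (again
an illustrative Java sketch rather than our production code; it assumes
P(V(n+1))' and P(V(n+1)) have already been unpacked into two temporary
directories):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DeltaValidator {

    // MD5 hash of one file, as a hex string.
    static String md5(Path file) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5").digest(Files.readAllBytes(file));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b));
        return hex.toString();
    }

    // Map of relative path -> MD5 hash for every regular file under root.
    static Map<Path, String> hashTree(Path root) throws Exception {
        Map<Path, String> hashes = new TreeMap<>();
        try (Stream<Path> walk = Files.walk(root)) {
            for (Path p : walk.filter(Files::isRegularFile).collect(Collectors.toList())) {
                hashes.put(root.relativize(p), md5(p));
            }
        }
        return hashes;
    }

    // True only if both trees contain exactly the same files and every hash matches.
    public static boolean identical(Path reconstructed, Path original) throws Exception {
        return hashTree(reconstructed).equals(hashTree(original));
    }
}

If the check over P(V(n+1))' and P(V(n+1)) fails, the build fails fast and
nothing is published to clients.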
In our implementation, all update site building on the build server was
done in Ruby.  The effort to implement the server side of this algorithm in
Ruby was about three days.  The effort to implement the client-side (fully
unit-tested, pair-programmed, one pair programming team working on it
full-time) in Java was about a month.  This was due to the hassle of
working with Java input and output streams.  However, the result was that
the Java code worked entirely in memory, piping input streams into output
streams and so on until only the final results were written to disk.  Thus,
the Java solution would be immune to breakage resulting from something on
the file system being changed in a way we did not expect.
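
To give a flavor of that in-memory style (a toy example, not the actual
client code), zip entries can be piped from an input stream straight into
an output stream over byte arrays without ever touching the file system:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class InMemoryRepack {
    // Reads every entry from an in-memory zip and rewrites it into a new in-memory zip.
    public static byte[] copyEntries(byte[] zipBytes) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes));
             ZipOutputStream out = new ZipOutputStream(result)) {
            byte[] buffer = new byte[8192];
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                out.putNextEntry(new ZipEntry(entry.getName()));
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
                out.closeEntry();
            }
        }
        return result.toByteArray();
    }
}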

Let me know if you have any questions.


Regards,

Dave Orme
_______________________________________________
provisioning-dev mailing list
provisioning-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/provisioning-dev


-- 
Denis Roy
Manager, IT Infrastructure
Eclipse Foundation, Inc. -- http://www.eclipse.org/
Office: 613.224.9461 x224 (Eastern time) Cell: 819.210.6481 denis.roy@xxxxxxxxxxx

_______________________________________________
provisioning-dev mailing list
provisioning-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/provisioning-dev


