Re: [provisioning-dev] Improved update generation/distribution algorithm

Hi Pascal,

Excellent questions!

- From the description, it seems that the server can be a directory or a
plain http server (or any other form of "dumb" server), am I correct?

We use a plain HTTP server and Apache HTTPClient on the client side.

Anything that can copy files will work.

- Is the size of the update site ever an issue?

For us, no.  We use separate physical build and update site servers.  For a project like Eclipse, this is a good question, since an algorithm like this would eventually require keeping terabytes of information around in order to generate diffs.  This is one reason we decided to only support upgrading between adjacent versions.  Adding 3 megs of overhead per version isn't bad, but that number grows quickly--you need a delta for every supported (source, target) version pair--if you want to support upgrading to an arbitrary V(n+delta).

- With a server that could run code, have you considered the generation of
the deltas on the fly given what the client has?

Well, the number of possible deltas grows quickly as delta gets big--there is one for every (source, target) version pair--so you can't test every possibility of delta in advance.  I guess if you run the test at build time and build on the fly--when a user requests an update for a specific value of delta--maybe this doesn't matter.  You just test the set of deltas that you've actually generated.

- Have you identified the threshold in the size of the delta where it would
be more effective to send the full file?

We didn't.  Our system is a CRM system that is used every day by most of our users.  We figured it wasn't bad to penalize the few folks that didn't stay up to date. :-)

- The application of F on the client seems to be pretty memory intensive.
Could you characterize the complexity of F in terms of memory?

We only process a single file at a time out of the ZipInputStream.  So the upper bound on RAM is whatever is required to decompress one file and write it to disk, plus the size of a HashSet of all file paths in P(V(n)).  The algorithm on the client is (a rough sketch in Java follows the list):
  1. Populate a Set with the file paths in P(V(n)).
  2. For each file in the ZipInputStream: if the file is the deletion manifest, remove all entries it lists from the file path Set; otherwise copy the file from the ZipInputStream into the JarOutputStream of the new plugin, then remove its path from the Set.  (If the file is not in the Set, it's a new file and no extra processing is required.)
  3. After this loop, any paths remaining in the Set are unchanged from the previous version and are copied from the old plugin JAR P(V(n)) to the new plugin JAR P(V(n+1)).
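
Here is a rough sketch of that loop in Java.  It is illustrative only: the manifest file name (DELETION_MANIFEST) and the helper methods are assumptions for the example, not our actual code.

    import java.io.*;
    import java.util.*;
    import java.util.jar.*;
    import java.util.zip.*;

    public class ApplySketch {
        // Assumed name for the deletion manifest entry inside F.
        private static final String DELETION_MANIFEST = "DELETED_FILES.txt";

        public static void apply(File oldPlugin, File deltaZip, File newPlugin) throws IOException {
            try (JarFile oldJar = new JarFile(oldPlugin);
                 ZipInputStream delta = new ZipInputStream(new FileInputStream(deltaZip));
                 JarOutputStream out = new JarOutputStream(new FileOutputStream(newPlugin))) {

                // 1. Collect every entry path in P(V(n)).
                Set<String> remaining = new HashSet<>();
                for (Enumeration<JarEntry> e = oldJar.entries(); e.hasMoreElements();) {
                    remaining.add(e.nextElement().getName());
                }

                // 2. Stream changed and new files from F straight into the new JAR.
                ZipEntry entry;
                while ((entry = delta.getNextEntry()) != null) {
                    if (DELETION_MANIFEST.equals(entry.getName())) {
                        for (String deleted : readLines(delta)) {
                            remaining.remove(deleted);          // file was removed in V(n+1)
                        }
                    } else {
                        out.putNextEntry(new JarEntry(entry.getName()));
                        copy(delta, out);
                        remaining.remove(entry.getName());      // changed file: old copy superseded
                    }
                }

                // 3. Anything left over is unchanged; copy it from the old plugin JAR.
                for (String unchanged : remaining) {
                    out.putNextEntry(new JarEntry(unchanged));
                    try (InputStream in = oldJar.getInputStream(oldJar.getEntry(unchanged))) {
                        copy(in, out);
                    }
                }
            }
        }

        // Reads the deletion manifest from the current ZIP entry without closing the stream.
        private static List<String> readLines(InputStream in) throws IOException {
            BufferedReader r = new BufferedReader(new InputStreamReader(in, "UTF-8"));
            List<String> lines = new ArrayList<>();
            for (String line = r.readLine(); line != null; line = r.readLine()) {
                if (!line.trim().isEmpty()) lines.add(line.trim());
            }
            return lines;
        }

        private static void copy(InputStream in, OutputStream out) throws IOException {
            byte[] buf = new byte[8192];
            for (int n = in.read(buf); n > 0; n = in.read(buf)) out.write(buf, 0, n);
        }
    }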
- Even though you verify the content on the server, there are always many
things that can go wrong.  So on the client, how do you ensure that
F(P(V(n))) is equal to P(V(n+1))?  What about shipping a checksum of the
expected result after the merge?

One property of this algorithm is that it doesn't guarantee that the order of the elements inside the JAR is the same after the merge.  So you can't just MD5 hash the two JARs and compare them.

As long as the update apply code is identical on client and server, the only things that can go wrong are things that are otherwise detectable or are out of our control anyway.  If the disk fills up, we can detect that.  If solar radiation or an EMP bomb hits the computer, we've got much worse problems than we can detect and deal with anyway.

The problem I foresee will occur when we want to upgrade the code that applies updates.  It would be advantageous to change the implementation so that it guarantees the order of files within the JAR, so that a checksum could be sent and used to verify the update on the client as well.  But then you've got an O(n) lookup in a list-based set rather than an O(1) lookup in our existing HashSet.  Maybe we don't care.  Maybe O(n) times the number of plugins is fast enough and we optimized prematurely.  It's an interesting open question, at the least.
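
Just to make the trade-off concrete, one order-insensitive alternative--not something we implemented--would be to hash each entry's contents, sort the (path, digest) pairs, and checksum that sorted list.  Entry order inside the JAR then no longer matters, and the apply code can keep its HashSet.  A sketch, with all names being illustrative:

    import java.io.InputStream;
    import java.security.MessageDigest;
    import java.util.*;
    import java.util.jar.*;

    public class OrderInsensitiveChecksumSketch {

        // Digest of the sorted (path, per-entry MD5) pairs; independent of entry order.
        public static String checksum(JarFile jar) throws Exception {
            TreeMap<String, String> entryDigests = new TreeMap<>();   // sorted by path
            for (Enumeration<JarEntry> e = jar.entries(); e.hasMoreElements();) {
                JarEntry entry = e.nextElement();
                if (entry.isDirectory()) continue;
                try (InputStream in = jar.getInputStream(entry)) {
                    entryDigests.put(entry.getName(), hex(digest(in)));
                }
            }
            MessageDigest md = MessageDigest.getInstance("MD5");
            for (Map.Entry<String, String> e : entryDigests.entrySet()) {
                md.update((e.getKey() + "=" + e.getValue() + "\n").getBytes("UTF-8"));
            }
            return hex(md.digest());
        }

        private static byte[] digest(InputStream in) throws Exception {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] buf = new byte[8192];
            for (int n = in.read(buf); n > 0; n = in.read(buf)) md.update(buf, 0, n);
            return md.digest();
        }

        private static String hex(byte[] bytes) {
            StringBuilder sb = new StringBuilder();
            for (byte b : bytes) sb.append(String.format("%02x", b));
            return sb.toString();
        }
    }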

- Do you handle signed jars?

Not in our implementation.

- In your example, you say that 3M is still large.  Why do you think so--are
the actual changes smaller?  Why don't you ship the delta of each file
instead of each complete file?

3M is still a pretty long download for a modem, but it's not horrible.

We could ship a delta of the individual classes (binary-diff the actual class files too), but we chose not to because we decided that 3M was small enough and the extra complexity (and possibility of bugs) was not worth the potential gain for our use case.

- How do you see that fit with the concept of artifact repository developed
in equinox and our attempt to use a mirroring metaphor instead of download?

This is a way to mirror new versions of plugins between repositories using minimal bandwidth with reasonable assurances of correctness.


                                                                          
             "David J. Orme"                                              
             <djo@coconut-palm                                            
             -software.com>                                             To
             Sent by:                  provisioning-dev@xxxxxxxxxxx        
             provisioning-dev-                                          cc
             bounces@eclipse.o                                            
             rg                                                    Subject
                                       [provisioning-dev] Improved update  
                                       generation/distribution algorithm  
             04/25/2007 06:22                                              
             PM                                                            
                                                                          
                                                                          
             Please respond to                                            
                "Developer                                                
              discussions for                                              
               provisioning                                                
                 \(Update                                                  
                 Manager\)                                                
               initiatives"                                                
             <provisioning-dev                                            
               @eclipse.org>                                              
                                                                          
                                                                          




The current feature-based update sites can get huge quickly when a small
number of changes is scattered across a large number of features.  This
kind of situation happens easily when releasing a dot-level or patch-level
upgrade--i.e., 1.0.0 to 1.0.1--where lots of small bugs are fixed.

The purpose of this email is to open a dialog and to outline our solution.
Apologies in advance to those who are less comfortable with algorithms
expressed in formal math; time constraints have dictated using the most
compact and efficient means of communication possible.

The algorithm we developed worked as follows:

The build server maintains copies of all production versions of the
application.  In the general case, then, we need an algorithm to upgrade a
given user from a version V(n) to a version V(n+delta) where delta is the
number of versions that have elapsed since the user last upgraded.  We want
this algorithm to have the following properties:
      - Send a minimal number of bits across the wire.
      - Be as simple as possible.
      - Be verifiably correct.  We couldn't afford to have 40k-70k of broken
        clients if the algorithm malfunctioned.
      - Related to the previous: have safeguards built in so that if a bad
        update site is ever generated, the whole thing fails fast--before
        *any* bits are ever sent to clients.
You will notice some conflicts in the requirements above.  In particular,
the requirement to send the fewest number of bits over the wire conflicts
with the requirements to be as simple as possible and to be verifiably
correct.  The result was that our algorithm sacrifices some efficiency in
the number of bits it sends over the wire in order to be simpler and more
verifiably correct.

The result: In our application, going from version 1.0.0 to version 1.0.1
using update site features resulted in about a 10 meg download.  After
applying our algorithm, the download size shrunk to about 3 megs.  This is
still large for some of our users who utilize a dial-up connection, but it
is doable, whereas the 10 meg download would simply be prohibitive.

Here is the algorithm:

First, we simplify the problem.  Instead of upgrading users from an
arbitrary version V(n) to V(n+delta), we only upgrade them from V(n) to
V(n+1).  Going to V(n+delta) is simply an iteration of the algorithm over
delta steps.  This simplification works fine for our user population.
Whether it would be acceptable for the general Eclipse population is an
open question.

Second, we notice that sending JAR'd plugins is very expensive.  Suppose
only two classes in a large plugin change: the whole plugin must still be
shipped.

Our algorithm, then, is simple.  We dynamically generate a function F where
F(P(V(n))) = P(V(n+1)).  F is really just a ZIP archive containing only the
files that changed between the two plugin versions.

This is accomplished using the following algorithm (a Java sketch follows
the list).  For a given plugin P in two versions V(n) and V(n+1):
   1. Unpack P(V(n)) and P(V(n+1)) into temporary directories.
   2. For files that correspond to each other in the two plugins, MD5 hash
      them, and if the hashes are different, put the file from P(V(n+1))
      into F.
   3. Include new files from P(V(n+1)) in F.
   4. Create a text file manifest listing files in P(V(n)) that are no
      longer included in P(V(n+1)) and include that text file in F.
   5. Package all of the files in F in a zip file.
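
Our build-side implementation of this was in Ruby; purely as an illustration, here is the same generation logic sketched in Java.  The manifest file name and the method names are assumptions for the example, not our actual code.

    import java.io.*;
    import java.nio.file.*;
    import java.security.MessageDigest;
    import java.util.*;
    import java.util.stream.Collectors;
    import java.util.stream.Stream;
    import java.util.zip.*;

    public class GenerateDeltaSketch {
        // Assumed name for the deletion manifest entry inside F.
        private static final String DELETION_MANIFEST = "DELETED_FILES.txt";

        // Builds F from the two unpacked plugin directories (steps 2-5 above).
        public static void generate(Path oldDir, Path newDir, File deltaZip) throws Exception {
            Set<String> oldFiles = relativePaths(oldDir);
            Set<String> newFiles = relativePaths(newDir);

            try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(deltaZip))) {
                // Changed or brand-new files from P(V(n+1)) go into F.
                for (String path : newFiles) {
                    boolean changed = !oldFiles.contains(path)
                            || !md5(oldDir.resolve(path)).equals(md5(newDir.resolve(path)));
                    if (changed) {
                        out.putNextEntry(new ZipEntry(path));
                        Files.copy(newDir.resolve(path), out);
                    }
                }
                // Files present in P(V(n)) but missing from P(V(n+1)) are listed for deletion.
                StringBuilder deleted = new StringBuilder();
                for (String path : oldFiles) {
                    if (!newFiles.contains(path)) deleted.append(path).append('\n');
                }
                out.putNextEntry(new ZipEntry(DELETION_MANIFEST));
                out.write(deleted.toString().getBytes("UTF-8"));
            }
        }

        private static Set<String> relativePaths(Path root) throws IOException {
            try (Stream<Path> files = Files.walk(root)) {
                return files.filter(Files::isRegularFile)
                            .map(p -> root.relativize(p).toString().replace('\\', '/'))
                            .collect(Collectors.toSet());
            }
        }

        private static String md5(Path file) throws Exception {
            byte[] digest = MessageDigest.getInstance("MD5").digest(Files.readAllBytes(file));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString();
        }
    }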
I believe that the algorithm for applying this function F to P(V(n)) to get
P(V(n+1)) is self-evident.

This approach has advantages over binary-diffing the two plugin jars
P(V(n)) and P(V(n+1)): If the order of the files contained in the plugin
changes between versions, a naive binary diff algorithm will not achieve
nearly the same amount of "compression" that we achieve.

So we have described a simple algorithm that is able to dramatically reduce
download bandwidth requirements over using update sites.  However, we have
not described how we achieved the "fail-fast" requirement.  If a flawed
update site is ever generated, we want to detect this automatically at
build time rather than after 50,000 clients have already downloaded it and
broken their installs.

Here is the algorithm we use:

After a build has completed, we have the following artifacts on our build
server:

P(V(n)) - the old version
P(V(n+1)) - the original new version generated by PDEBuild
F - the function (basically an intelligent binary diff) we generated that
we can apply to P(V(n)) to get P(V(n+1))

The approach we take to validate F is the following (a sketch of the
comparison follows the list):
   1. Apply F to P(V(n)) on the build server using the same code that is
      running on the clients.
   2. Let's call the result of applying F to P(V(n)), P(V(n+1))'  (notice
      the prime there).  So formally, F(P(V(n))) = P(V(n+1))'
   3. We need to prove that P(V(n+1))' is the same as P(V(n+1)), the
      version originally generated by PDEBuild.
   4. To do this, we simply unpack P(V(n+1))' and P(V(n+1)) into temporary
      directories and perform an MD5 hash on all files, making sure that
      all files present in one are also present in the other and that their
      hashes are equal.
   5. If applying F to P(V(n)) on the build server using the same code as
      is running on the clients produces a P(V(n+1))' identical to
      P(V(n+1)) on the server, then it should work on the clients as well.
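
For illustration only, the comparison in steps 4 and 5 amounts to something like the following Java sketch; our real check was part of the Ruby build scripts, and the names here are assumptions.

    import java.nio.file.*;
    import java.security.MessageDigest;
    import java.util.*;
    import java.util.stream.Collectors;
    import java.util.stream.Stream;

    public class VerifyDeltaSketch {

        // True only if both unpacked trees contain exactly the same paths with equal MD5 hashes.
        public static boolean sameContents(Path rebuiltDir, Path originalDir) throws Exception {
            return hashTree(rebuiltDir).equals(hashTree(originalDir));
        }

        private static Map<String, String> hashTree(Path root) throws Exception {
            List<Path> files;
            try (Stream<Path> walk = Files.walk(root)) {
                files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
            }
            Map<String, String> hashes = new TreeMap<>();   // relative path -> MD5
            for (Path p : files) {
                hashes.put(root.relativize(p).toString().replace('\\', '/'), md5(p));
            }
            return hashes;
        }

        private static String md5(Path file) throws Exception {
            byte[] digest = MessageDigest.getInstance("MD5").digest(Files.readAllBytes(file));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString();
        }
    }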
In our implementation, all update site building on the build server was
done in Ruby.  The effort to implement the server side of this algorithm in
Ruby was about three days.  The effort to implement the client-side (fully
unit-tested, pair-programmed, one pair programming team working on it
full-time) in Java was about a month.  This was due to the hassle of
working with Java input and output streams.  However, the result was that
the Java code worked entirely in memory, piping input streams into output
streams and so on until only the final results were written to disk.  Thus,
the Java solution would be immune to breakage resulting from something on
the file system being changed in a way we did not expect.

Let me know if you have any questions.


Regards,

Dave Orme
_______________________________________________
provisioning-dev mailing list
provisioning-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/provisioning-dev


