[equinox-dev] [prov] processing steps, restartable downloads and ECF


Thanks to Stefan, we have been introducing the notion of ProcessingSteps to munge content as it is downloaded from an artifact repository.  This allows for things like inline MD5 digest checking, unpack200 processing, delta merging, signature checking, ...  All great stuff.  Pascal just raised a very interesting question: how do we handle restarting?

Some background.  In the current prototype (in my workspace, not yet in CVS) there is a chain of ProcessingStep objects.  Each step is actually an OutputStream that knows about the next step (output stream) in the chain.  When a byte is written to step/stream N, it is processed (counted, transformed, ...) and the result is then passed on to step N+1.  This repeats until the content reaches the last stream in the chain, which is usually a FileOutputStream of some sort, and the content is written to disk.  All is well.
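To make the chain idea concrete, here is a minimal sketch of what such steps might look like.  It is not the prototype's code: the class names (ProcessingStep, CountingStep, Md5Step, ChainDemo) and their methods are assumptions made up for illustration, and a ByteArrayOutputStream stands in for the FileOutputStream at the tail.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical shape of a step (not the prototype's actual class):
// an OutputStream that knows the next stream in the chain.
abstract class ProcessingStep extends OutputStream {
    protected final OutputStream next;

    ProcessingStep(OutputStream next) { this.next = next; }

    @Override public void flush() throws IOException { next.flush(); }
    @Override public void close() throws IOException { next.close(); }
}

// Counts each byte, then passes it on unchanged.
class CountingStep extends ProcessingStep {
    private long count;

    CountingStep(OutputStream next) { super(next); }

    @Override public void write(int b) throws IOException {
        count++;
        next.write(b);
    }

    long getCount() { return count; }
}

// Feeds each byte into an MD5 digest, then passes it on unchanged.
class Md5Step extends ProcessingStep {
    private final MessageDigest digest;

    Md5Step(OutputStream next) {
        super(next);
        try {
            digest = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    @Override public void write(int b) throws IOException {
        digest.update((byte) b);
        next.write(b);
    }

    // Finishes the digest and returns it as lowercase hex.
    String hexDigest() {
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }
}

class ChainDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for the file
        Md5Step md5 = new Md5Step(sink);
        CountingStep head = new CountingStep(md5);                // head -> md5 -> sink
        head.write("hello".getBytes(StandardCharsets.UTF_8));
        head.close();
        System.out.println(head.getCount() + " bytes, md5 " + md5.hexDigest());
    }
}
```

Note that each step here holds state (a running count, a partial digest) that lives only in memory, which is exactly what makes restarting awkward.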

Now, what happens if we crash or the user somehow pauses the download?  The content is partially processed/transformed, but it would likely be too costly for each step to persist its intermediate results.  More likely, the raw content coming in to the head of the chain is cached; when the download is restarted after a crash/exit, the chain is recreated and the cached content is replayed through it.  Once the replay is done, the remaining content from the source is pushed through the chain.
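One way the cache-and-replay idea might look, purely as a sketch: a head stream tees the raw bytes into a cache file before forwarding them, and on restart the cache is replayed through a freshly built chain.  CachingHead, RestartableDownload, replay and resume are hypothetical names, not existing APIs, and the sketch assumes the source can be asked to continue from the returned byte offset.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical head of the chain: records raw bytes in the cache,
// then hands them to the first processing step.
class CachingHead extends OutputStream {
    private final OutputStream cache;
    private final OutputStream chain;

    CachingHead(OutputStream cache, OutputStream chain) {
        this.cache = cache;
        this.chain = chain;
    }

    @Override public void write(int b) throws IOException {
        cache.write(b);   // persist the raw byte first
        chain.write(b);   // then let the steps process it
    }

    @Override public void flush() throws IOException {
        cache.flush();
        chain.flush();
    }

    @Override public void close() throws IOException {
        try {
            cache.close();
        } finally {
            chain.close();
        }
    }
}

class RestartableDownload {
    // Replays the cached raw bytes through a freshly created chain.
    // Returns the number of bytes replayed, i.e. the offset at which
    // the source would need to resume (assumed to be supported).
    static long replay(Path cacheFile, OutputStream chain) throws IOException {
        if (!Files.exists(cacheFile)) {
            return 0;
        }
        long total = 0;
        try (InputStream in = Files.newInputStream(cacheFile)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                chain.write(buf, 0, n);
                total += n;
            }
        }
        return total;
    }

    // Opens a head stream that appends new raw bytes to the cache
    // and forwards them to the chain.
    static OutputStream resume(Path cacheFile, OutputStream chain) throws IOException {
        OutputStream cache = Files.newOutputStream(
                cacheFile, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return new CachingHead(cache, chain);
    }
}
```

On restart the caller would build a new chain, call replay to bring every step back to its pre-crash state, and then write the remaining bytes from the source to the stream returned by resume.  The cost is storing the raw content a second time, but none of the steps needs to know how to persist itself.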

So, two questions.  Does this make sense?  And if so, how should we implement it?  I wonder if ECF has some technology/support/designs in this area, since it seems to support restartable downloads.  Scott?

Jeff