Hi Jeff,
Jeff McAffer wrote:
Thanks to Stefan we have been introducing the notion of ProcessingSteps
to munge the content as it is downloaded from an artifact repository.
This allows for things like inline MD5 digest checking, unpack200
processing, delta merging, signature checking, ... All great stuff.
Pascal just raised a very interesting question. How do we handle
restarting? Some background. In the current prototype (in my workspace,
not yet in CVS) there is a chain of ProcessingStep objects. Each step is
actually an OutputStream that knows about the next step (output stream)
in the chain. When a byte is written to step/stream N, it is processed
(counted, transformed, ...) and then the result passed on to step N+1.
This repeats until finally the content gets to the last stream in the
chain which is usually a FileOutputStream of some sort and so the
content is then written to disk. All is well.
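For readers without access to the prototype, the chain described above could look roughly like the following. This is only a sketch: CountingStep, UpperCaseStep and ChainDemo are hypothetical names invented for illustration, not the actual ProcessingStep classes, and a ByteArrayOutputStream stands in for the final FileOutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical step: counts the bytes passing through, then forwards them unchanged.
class CountingStep extends FilterOutputStream {
    private long count;

    CountingStep(OutputStream next) {
        super(next);
    }

    @Override
    public void write(int b) throws IOException {
        count++;
        out.write(b); // hand the byte to step N+1
    }

    long getCount() {
        return count;
    }
}

// Hypothetical step: transforms each byte (upper-cases ASCII letters) before forwarding.
class UpperCaseStep extends FilterOutputStream {
    UpperCaseStep(OutputStream next) {
        super(next);
    }

    @Override
    public void write(int b) throws IOException {
        if (b >= 'a' && b <= 'z') {
            b -= 32; // ASCII lower case to upper case
        }
        out.write(b);
    }
}

class ChainDemo {
    // Builds the chain head -> transform -> sink and pushes the input through it.
    static String run(String input) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for a FileOutputStream
            CountingStep head = new CountingStep(new UpperCaseStep(sink));
            head.write(input.getBytes(StandardCharsets.US_ASCII));
            head.close(); // closing the head closes every step down the chain
            return head.getCount() + ":" + sink.toString("US-ASCII");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("abc"));
    }
}
```

Each step only knows its successor, so the writer at the head of the chain never needs to know how many steps follow or what they do.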
Now, what happens if we crash or the user somehow pauses the download?
The content is partially processed/transformed but it would likely be
too costly for each step to persist its intermediate results. It would
be more likely that somehow the raw content coming in to the head of
the chain of steps is cached and then when the download is restarted
after a crash/exit, the chain is recreated and the download is
effectively replayed through the chain from the cache. When that is
done, the further content from the source would then be pushed through
the chain.
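A rough sketch of the replay idea, under some loud assumptions: ReplayDemo and resumeMatches are hypothetical names, the "download" is just a byte array, the cache is a temp file, and the only step in the chain is the JDK's java.security.DigestOutputStream standing in for an inline MD5 step. It is meant to show that a recreated chain fed first from the cache and then from the rest of the source ends up in the same state as an uninterrupted run, not how the prototype or ECF would actually do it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class ReplayDemo {
    static MessageDigest md5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Simulates a download interrupted after 'cut' bytes and then resumed.
    // Returns true if the resumed chain produces the same digest and output
    // as processing the whole source in one go.
    static boolean resumeMatches(byte[] source, int cut) {
        try {
            Path cache = Files.createTempFile("raw-download", ".part");
            try {
                // First attempt: the raw bytes are cached and also fed to a chain.
                DigestOutputStream lost = new DigestOutputStream(new ByteArrayOutputStream(), md5());
                Files.write(cache, Arrays.copyOfRange(source, 0, cut));
                lost.write(source, 0, cut);
                // Simulated crash: 'lost' and its partial digest state are discarded.

                // Restart: recreate the chain, replay the cache, then continue from the source.
                ByteArrayOutputStream sink = new ByteArrayOutputStream();
                DigestOutputStream chain = new DigestOutputStream(sink, md5());
                byte[] cached = Files.readAllBytes(cache);
                chain.write(cached);
                chain.write(source, cached.length, source.length - cached.length);
                chain.close();

                byte[] expected = md5().digest(source);
                return Arrays.equals(expected, chain.getMessageDigest().digest())
                        && Arrays.equals(source, sink.toByteArray());
            } finally {
                Files.deleteIfExists(cache);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "hello world".getBytes(StandardCharsets.US_ASCII);
        System.out.println(resumeMatches(data, 5));
    }
}
```

The point of caching the raw input rather than each step's output is that no step has to know how to persist or restore its own intermediate state; the cost is re-running the steps over the cached bytes on restart.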
So, two questions. Does this make sense? and if so, how should we
implement this? I wonder if ECF has some technology/support/designs in
this area since it seems they support restartable downloads. Scott?
Unfortunately not as much as we would like. We do have API support for
pausing/resuming downloads (IFileTransferPausable), and the existing
impls do naively support this interface, but we need/want to add
further/better implementation support (e.g. direct protocol support for
protocols that have pause/resume, partial file caching, etc).
Actually, I'm a little surprised that you have so far passed the
ProcessingSteps as output streams directly to the ECF OutputStream, as
I was expecting that you would have a temporary file to receive the
file contents, and then when the file reception is done *then* apply
the ProcessingSteps.
But in any event, we can add impl support for pause/resume/caching etc
to the ECF receive implementations w/o changing API to support required
use cases. I would appreciate a little better understanding of the
existing ProcessingSteps and their function...so could someone point me
at the relevant packages/classes and I'll take more of a look?
Seems like this would also be a good topic for the upcoming Equinox
Summit: what enhancements are needed for file transfer at both the API
and impl level, e.g. pause/resume enhancements, file caching,
monitoring/transfer statistics collection (?), support for more/other
providers, etc.
Scott
Jeff