Re: [eclipse-incubator-e4-dev] [resources] EFS, ECF and asynchronous

Hi Martin,

Oberhuber, Martin wrote:
<stuff deleted>

Could you cite a use case where async access is necessary?

When blocking the calling thread (e.g. any synchronous reads/writes) results in system-wide 'problems' (e.g. UI is stopped, other server processing is blocked, etc).

I think that (assuming all synchronous methods have progress monitors for cancellation, which is the case in EFS), the only difference between sync and async access is:
  (1) the number of Threads in "wait" state,
  (2) locking of resources while Threads synchronously wait,
  (3) potential for coalescing multiple requests to the
      same item in the case of asynchronous queries.

I would also say that the sync access is not 'predictable' in a way that can result in problems...particularly when the network is involved. For example, when calling a blocking file.read, if that read is on the local filesystem with modern OSes it's quite likely to eventually complete....and in a timely manner. But when that same read is done over the network it's much more likely to block for a very long time (e.g. due to variable network performance), or simply never return (block forever).
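One common way to keep a possibly-never-returning network read from hanging the caller is to run the blocking call on a worker thread and bound the wait. A minimal sketch (the helper and its time budget are my own illustration, not EFS API):

```java
import java.util.concurrent.*;

public class BoundedRead {
    // A blocking call over the network may never return; running it on a
    // worker thread and bounding the wait turns "block forever" into a
    // failure the caller can handle. The timeout budget is arbitrary.
    static <T> T callWithTimeout(Callable<T> blockingCall, long millis)
            throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<T> pending = pool.submit(blockingCall);
            return pending.get(millis, TimeUnit.MILLISECONDS); // TimeoutException if stuck
        } finally {
            pool.shutdownNow(); // interrupt the call if it is still blocked
        }
    }
}
```

Of course this only converts an unbounded block into an error; the caller still has to decide what a timed-out read means for the application.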

In the asynchronous case, no Threads are waiting and resources
*may* be unlocked until the callback returns, but this unlocking
of resources needs to be carefully considered in each case. Does the system always remain in a consistent state? RESTful systems ensure this by placing all state info right into the request, which is a great idea but likely not always possible. It's not only a matter of the API being complex or not. The fact is that the concept of being asynchronous as such is more flexible,
but also requires adopters to be more careful, or at least think
along different lines.

I also think that we should look at the need for asynchronous access separately for each kind of request:
  (A) Directory retrieval (aka childNames())
  (B) Full file tree retrieval
  (C) Status/Attribute retrieval for an individual file
  (D) File contents retrieval

For (D) we already use Streams in EFS, which can be implemented in an asynchronous manner. What's currently missing in EFS is the ability to perform random access, like the JSR 203 SeekableByteChannel [1]. Interestingly, nio2 has both a synchronous FileChannel [2] and AsynchronousFileChannel [3].
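To make the nio2 comparison concrete, here is a minimal sketch of a random-access asynchronous read via AsynchronousFileChannel (the class name, file name, and buffer size are arbitrary choices of mine):

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Future;

public class AsyncReadSketch {
    // Random-access asynchronous read: the position argument makes this a
    // seekable read, and the returned Future means the calling thread is
    // free to do other work until it actually needs the result.
    static String readAsync(Path path) throws Exception {
        try (AsynchronousFileChannel ch =
                 AsynchronousFileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(1024);
            Future<Integer> pending = ch.read(buf, 0); // read from offset 0
            int n = pending.get(); // rendezvous; real code could poll or use a CompletionHandler
            return new String(buf.array(), 0, n, "UTF-8");
        }
    }

    public static void main(String[] args) throws Exception {
        Path p = Files.createTempFile("efs-async", ".txt");
        Files.write(p, "hello, e4".getBytes("UTF-8"));
        System.out.println(readAsync(p)); // prints: hello, e4
        Files.delete(p);
    }
}
```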

For (A), (B), (C) I'm not sure how much we would win from
an asynchronous variant, since I'd assume that not much
work could be done (and not much resources freed) while
asynchronously waiting for their result anyways. But perhaps
I'm wrong?

I do think this assumption is wrong (that not much can be done while waiting for results). That is, A-C all access file 'meta-data' (directories, trees, file attributes, etc.). This meta-data *can* be very large (e.g. for large directory trees, lots of data for each file, etc.). So although the file content is often much larger than the meta-data, that's not always the case, and blocking on these operations (meta-data access...particularly over the net) can be an issue as well...again depending upon the application/system performance requirements.

So I think the best evidence for the need for asynchronous access to file systems is simply the persistent availability of asynchronous approaches...even to local file systems. I agree completely that in general synchronous i/o (local and/or network) is easier to program...and in the environments where performance and reliability are high (e.g. accessing a local physical disk) then for the most part synchronous i/o can be used. But there are app-level and/or system-level requirements that are more amenable to asynchronous i/o approaches, and so those APIs are available...even in the case of local (non-network) i/o.

3) Using (e.g.) adapters it's not necessary to force such an API on anyone (rather it can be available when needed)

Hm... so, let's assume that client X wants to do something asynchronous. So it does
   myFileStore.getAdapter(IAsyncFileStore.class);
some file systems would provide that adapter, others not.
What's the client's fallback strategy in case the async adapter is not available?

Well, if such an adapter is not available then they could do it synchronously rather than asynchronously.
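To make that fallback concrete, here is a sketch of what the client-side strategy could look like. IFileStoreLike, IAsyncFileStore, and childNamesAsync are all hypothetical stand-ins (with no Eclipse dependencies), not existing EFS API; the getAdapter call mirrors the IAdaptable pattern above.

```java
import java.util.concurrent.*;

// Hypothetical interfaces standing in for EFS types, just to make the
// fallback pattern concrete; IAsyncFileStore does not exist in EFS today.
interface IFileStoreLike {
    String[] childNames();                   // synchronous EFS-style call
    Object getAdapter(Class<?> adapterType); // IAdaptable-style lookup
}

interface IAsyncFileStore {
    Future<String[]> childNamesAsync();
}

public class AdapterFallback {
    // Client-side strategy: prefer the async adapter when the file system
    // provides one, otherwise fall back to running the synchronous call
    // on a worker thread so the caller is never blocked either way.
    static Future<String[]> childNames(IFileStoreLike store, ExecutorService pool) {
        Object adapter = store.getAdapter(IAsyncFileStore.class);
        if (adapter instanceof IAsyncFileStore) {
            return ((IAsyncFileStore) adapter).childNamesAsync();
        }
        return pool.submit(store::childNames);
    }
}
```

Note that the last line is itself the generic sync-to-async bridge; the question is only whether that line lives in every client or once in the provider.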

I'm afraid that if we use such adapters, we end up with the
same code in clients again and again, because they need some
fallback strategy. It seems wiser to place the fallback strategy right into the EFS provider, since it is always possible to write a bridge between a synchronous and an
asynchronous API in a single, generic way.

If the client uses synchronous i/o (perhaps in a new thread) as a fallback strategy, then isn't that synchronous support already built into the EFS provider?

Therefore, I'm more in favor of determining what APIs we want
to be asynchronous, and just adding them to EFS.

Huh? It seems to me that for simplicity it would make more sense to provide new interfaces (possibly as adapters, but not necessarily) rather than put a bunch more methods on (e.g.) IFileStore.

The adapter
idea could be used for adding provisional API, but the final
API should not need that.

Seems to me that this would result in a very large and complicated API, that would include both synchronous and asynchronous calls...meaning that

1) clients would have to tease apart which methods are which, making it harder to use the API(s)
2) it would require providers to implement both synchronous and asynchronous operations...thus also making it harder to implement even simple EFS providers.

Why not provide some separation of concerns through whatever means (e.g. different packages...e.g. java.io, java.nio, adapters, others)?

To that extent, let's start assuming that files are quick
and local. And
let's investigate how we could leverage ECF to support remote file
systems. If that doesn't meet our needs, we can always add
async later.

I'm not sure if this is a good strategy. It seems to lead
towards more and more separation of local vs. remote -- which, I think, leads to either duplication of code in the end, or non-uniform workflows for end users.

I disagree. I think the problem is with trying to make local and remote access look exactly the same (network transparency) to all applications. Sometimes/many times applications care that resources are remote, take longer to load, are more likely to go away and not return, etc., etc. I think the workflows that are most problematic are those that assume that remote resource access is exactly the same as local disk access. As much as I would like to have network and local be exactly alike in terms of performance and reliability, they are not.

Let me sketch a scenario of what the world could look like in 10 years: with the Internet getting more and more into our lives, you'd want to use an Eclipse based product to dive into some code base that you just found on the net.
Without downloading everything in advance. Or you browse into
some mp3 music store. Add some remotely hosted Open Source Library to your UML drawing just by drag and drop.

Sounds nice...but I'll guarantee you that in 10 years the Internet will still be much slower and much less reliable than local disk access...and that applications (and users) will notice.

I think that users will more and more want to operate on
remote networked resources just the same as on local resources.

I don't deny that people (particularly programmers) would like to operate on remote networked resources in the same way as local resources...I certainly would. But I don't think that the differences between local system and network systems can be made to entirely disappear. Many have tried...all have failed as far as I'm concerned :). I don't have any problems with a virtual file system, but I think it's a big step to assume that such an abstraction alone can deal with all the issues of networked file systems. There's a good reason why NFS (e.g.) isn't used over the WAN...and probably never will be. Rather, we get applications like the web browser to deal with remote resources.


E4 gives us the chance to try and come up with
models that support such workflows in a uniform way. Let's not throw away that chance prematurely.

It doesn't seem to me that E4 is likely to deal with issues that affect all distributed systems...e.g. differences in performance, reliability, partial failure. If it is trying to do that, it seems to me that it's biting off more than it should chew.
I agree that we need to start on concrete work items
rather than endlessly discussing concepts. But as we
start on these work items, let's keep the concept that
things may be remote in our minds.

There's an interesting discussion of strategies for dealing with network issues in a 'uniform' way, in terms of API, in the Note on Distributed Computing paper:

http://research.sun.com/techrep/1994/abstract-29.html

They sort of ask the question: should API be designed with local systems in mind or remote systems in mind? There are major difficulties with both extremes...because assuming local (the natural first assumption) essentially ignores the differences in the net WRT performance, partial failure, etc and results in API that doesn't work when moved to the net. On the other hand, designing APIs as if everything were remote is also problematic because then the programmer has to deal with lots of failure cases, asynchrony, etc that makes things much harder for even the simple cases (local, fast, reliable access).

Sounds reasonable. Just as an aside: I think there's a lot of potential to use asynchronous file transfer + replication
to do caching of remote resources.

That's a great approach, especially if it works on the file block level (such that random access to huge remote files can be cached). Again, one thing that's missing from EFS
today is random access to files. Does ECF have it?

We do have support for replication in ECF's APIs. Replication is used, for example, for the real-time shared editing that ECF introduced in 3.4. Basically, for a shared editor the IDocument model is replicated to the shared editing participants, and then read accesses are very fast (reading the local copy, much as in Hadoop/GFS). Changes/writes to the replicated IDocument are applied locally and then asynchronously distributed to the replicas. Once a change has been received, a synchronization algorithm (cola) is used to resolve conflicts and prevent divergence of the replicas. We've just introduced a replicated model synchronization API (see bug https://bugs.eclipse.org/bugs/show_bug.cgi?id=234142) that exposes an API for synchronization strategies to be added/created, assuming a replicated model approach.
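ECF's cola algorithm is far more involved than this, but the basic convergence idea behind apply-locally-then-distribute can be shown with a toy sketch (everything here is made up for illustration; conflicts are resolved by a simple timestamp order, not by cola's operational transformation):

```java
import java.util.TreeMap;

// Toy sketch of "apply locally, distribute asynchronously": each replica
// applies its own edits immediately (fast local reads/writes) and merges
// remote edits as they arrive. Because the merge rule is deterministic
// and order-independent (a timestamp-keyed map), all replicas converge
// to the same state regardless of delivery order.
public class ToyReplica {
    private final TreeMap<Long, String> ops = new TreeMap<>(); // timestamp -> op

    public void applyLocal(long ts, String op)  { ops.put(ts, op); }
    public void applyRemote(long ts, String op) { ops.put(ts, op); } // same merge rule

    public String state() { return String.join(",", ops.values()); } // deterministic view
}
```

Real systems need conflict resolution that preserves user intent (which is exactly what cola's strategy is for); the point here is only why deterministic merging keeps replicas from diverging.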

Although we could do something similar to what Hadoop is doing WRT replicating file blocks, we have not as yet as we've been concentrating on other use cases/applications (real-time shared editing).

Note in case it's not clear from my statements above...I'm not arguing for an asynchronous-only API...I think EFS and synchronous i/o approaches are completely appropriate for many use cases. I'm also not arguing that we should discuss this into the ground and not do any implementation. But I do think there is a place for support for asynchronous i/o, asynchronous messaging, replication + synchronization, etc...particularly since it seems that E4 is intended to be a platform for more distributed applications.

Scott



