Community
Participate
Working Groups
This bug is created from the discussion of bug 194204. When doing a refresh, opening a file, etc unnecessary remote list commands are performed. Console Log from opening a file from /public_html/B2--aa/test55555: NOOP 200 Zzz... CWD /public_html/B2--aa/test55555 250 OK. Current directory is /public_html/B2--aa/test55555 PORT 9,26,148,193,6,177 200 PORT command successful LIST 150 Connecting to port 8226 226-ASCII 226-Options: -a -l 226 4 matches total NOOP 200 Zzz... CWD /public_html/B2--aa/test55555 250 OK. Current directory is /public_html/B2--aa/test55555 PORT 9,26,148,193,6,178 200 PORT command successful RETR test25.php 150 Connecting to port 8246 226-File successfully transferred 226 0.007 seconds (measured here), 1.47 Kbytes per second NOOP 200 Zzz... CWD /public_html/B2--aa/test55555 250 OK. Current directory is /public_html/B2--aa/test55555 PORT 9,26,148,193,6,179 200 PORT command successful LIST 150 Connecting to port 8275 226-ASCII 226-Options: -a -l 226 4 matches total NOOP 200 Zzz... CWD /public_html/B2--aa 250 OK. Current directory is /public_html/B2--aa PORT 9,26,148,193,6,180 200 PORT command successful LIST 150 Connecting to port 8309 226-ASCII 226-Options: -a -l 226 4 matches total NOOP 200 Zzz... CWD /public_html/B2--aa/test55555 250 OK. Current directory is /public_html/B2--aa/test55555 PORT 9,26,148,193,6,181 200 PORT command successful LIST 150 Connecting to port 8335 226-ASCII 226-Options: -a -l 226 4 matches total -----------Enter bugs above this line----------- TM 2.0 Testing installation : eclipse-SDK-3.3 RSE install : RSE 2.0 java.runtime : Sun 1.5.0_11-b03 os.name: : Windows XP, Service Pack 2 ------------------------------------------------
Improving performance is very important especially with respect to EFS. It looks like FTP is a good test candidate since all remote operations are visible in the log. I'd like to see this addressed with high priority though I'm not assigning a targetmilestone yet since it's unclear whether the necessary changes can be fully API Contract compatible.
My first observations on this are as follows: The listFiles() method of FTPService is used frequently and not just for doing a query. It's used whenever getFile() is called - within the FTPService, getFile() is called by createFile(), createFolder(), delete(), download and setReadOnly(). During the open scenario the following happens: 1) in FTPService.download(), we call FTPService.getFile() which calls list. 2) In SystemEditableRemoteFile.doDownload(), we call getRemoteFileObject(), which calls FTPService.getFile() which calls list. 3) SystemEditableRemoteFile.openEditor() which call getRemoteFileObject() which calls FTPService.getFile(). The operation should only need to query the remote file once before the download to make sure we're dealing with a file that hasn't changed on the host. IRemoteFiles have been initialized to be marked stale so that we don't used cached info when querying them. However, this should only apply to folders and archives since files are leaf nodes. I've changed the code to mark a IRemoteFile as non-stale if it is a file but isn't an archive. This prevents the 2) query here so for FTP it reduces the number of list calls to two.
I thought that the Service could use a "Stat cache" in order to avoid unnecessary lists. This is a common technique which is also used in the Eclipse Resource system, and we also had it in our earlier products. Basically, the list() command would store its result in a local cache along with a timestamp. When list() is requested again with the same properties within a very short time frame (typically 1 second or less), use the cached information instead. Any operation that modifies the remote needs to invalidate the stat cache. I think that the stat cache could probably (at least partially) be implemented in an abstract superclass so all the services could use it, but that would be an API change. If we want to use a stat cache for performance improvements in 2.0.1, each service needs to do it itself (it should be simple). Keep in mind, though, that a stat cache is kind of a workaround for avoiding unnecessary queries that occur because the higher levels don't re-use information properly. So analyzing the higher levels as to why the extra queries are there is always a good idea.
I believe the extra list (via getFile()) from the service's download() code is unique to the FTPService. Other services don't have this extra query so I'm thinking this particular issue doesn't need to be handled generically.
Javier can you investigate why there is an extra getFile() call?
The getFile() in FTP service dowload() is used to get the file size, so the progress bar and percentage can be shown during the download. This have to be done as FTP doens't provide the size of the file before the download.
Hm, I see. Perhaps we should change the IFileSerivce.download() API in the future to call with an IHostFile instead of a plain String for the remote file. Then, the IHostFile should already hold the remote file size. Thoughts?
As a termporary workaround, FTP could do a statCache to remember the Mapping of remote-path (String) to IHostFile, at least for a short time. Since RSE must have got the IHostFile before, then the FTPService knows the IHostFile from its internal map and can deduce the file size from there. But anyways, I think that download is slower than getting the file size so the extra list command is perhaps not much of an issue.
Removed unnecessary LIST commands by adding a map between absolute file names and their IHostFile object. This should be a temporary solution to avoid accessing the remote server when not necessary. The ideal solution would be passing the IHostFile to the FTP service.
I also found that during the session of FTP, it performs many NOOP operation. Can we get rid of it? It causes the FTP connection with high latency connection to be very slow especially for file browsing.
(In reply to comment #10) > I also found that during the session of FTP, it performs many NOOP operation. I have filed bug 269171 to track this.