Bug 196662 - [refresh] Refresh performs queries in the Dispatch (main) thread
Summary: [refresh] Refresh performs queries in the Dispatch (main) thread
Status: RESOLVED WORKSFORME
Alias: None
Product: Target Management
Classification: Tools
Component: RSE (show other bugs)
Version: 2.0   Edit
Hardware: PC Windows XP
: P2 normal (vote)
Target Milestone: 3.1 M7   Edit
Assignee: David McKnight CLA
QA Contact: Martin Oberhuber CLA
URL:
Whiteboard:
Keywords: investigate
Depends on:
Blocks: 191374 198650 198651 203306 204684
  Show dependency tree
 
Reported: 2007-07-16 10:54 EDT by Martin Oberhuber CLA
Modified: 2010-05-27 08:14 EDT (History)
4 users (show)

See Also:


Attachments
patch to avoid query on display thread (3.29 KB, patch)
2007-07-24 15:58 EDT, David McKnight CLA
no flags Details | Diff
avoiding use of ss.getTargetForFilter() on main thread (3.22 KB, patch)
2007-11-30 14:55 EST, David McKnight CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Oberhuber CLA 2007-07-16 10:54:58 EDT
RSE CVS HEAD of 16-Jul-2007.
With an SSH connection, expand 
    My Home/RSETest/1.0.1/eclipse/plugins
with the "plugins" folder selected, right-click > Refresh
--> Tracing shows that right after the SftpService.internalFecth() call,
the SftpFileService.getFile() method is called on the "main" thread.

This is very problematic since any call on the main thread may block the entire UI. In the Sftp case, because there is only a single DirChannel used for remote list queries, if the directory is very large the following getFile() call can block the UI for long (bug #191374 comment 4).

A breakpoint on SftpFileService.getFile() shows the following traceback, which indicates that the reason for calling getFile() on the UI thread is SystemView$ResourceChangedJob:

SftpFileService.getFile(String, String, IProgressMonitor) line: 250	
FileServiceSubSystem.getFile(String, String, IProgressMonitor) line: 298	
FileServiceSubSystem.getRemoteFileObject(String, IProgressMonitor) line: 220	
FileServiceSubSystem(RemoteFileSubSystem).getObjectWithAbsoluteName(String) line: 1059	
SystemView$ResourceChangedJob.runInUIThread(IProgressMonitor) line: 2164	

This should be re-written in order to avoid calling into the Service on the UI thread.

-----------Enter bugs above this line-----------
TM 2.0
installation : eclipse-SDK-3.3 (I20070625-1500), cdt-4.0.0, emf-2.3.0
RSE install  : workspace HEAD
java.runtime : Sun 1.6.0_01-b06
os.name:     : Windows XP 5.1, Service Pack 1
------------------------------------------------
systemtype   : Windows-local, Dstore-win, Dstore-linux
targetos     : Red Hat Enterprise Linux WS release 4 (Nahant Update 3)
targetuname  : Linux parser 2.6.9-34.EL #1 i686 athlon i386 GNU/Linux
targetvm     : Sun Java HotSpot(TM) Client VM (build 1.4.2_12-b03, mixed mode)
------------------------------------------------
Comment 1 David McKnight CLA 2007-07-24 15:56:43 EDT
The reason the getObjectWithAbsoluteName() is called is for a workaround in not having an adapter method to check if the container really exists.   The idea is that the source of the remote refresh event may no longer exist (such as the case where someone deletes a folder from the shell).  In such a case, the remote object will not be stale and a query will be made on an object that no longer exists.  

Without the getObjectwithAbsoluteName() code, here's are a couple scenarios we face:

Scenario #1
-----------
1) create a folder "abc" in RSE
2) in a shell, delete the folder
3) select "abc" in RSE and refresh (F5)

Problem:
--------
The refresh indicates we want to query it's contents but "abc" has never been expanded in the RSE tree so we only mark the folder stale so that when an expansion is actually done, a query will be made on it's contents.  This doesn't account for the fact that "abc" does not exist anymore.

Solution:
---------
Since the folder has not been expanded before, we can assume that a query will not be made on it, and instead refresh it's parent.  This avoids the need to requery the object to determine whether it exists.


Scenario #2
-----------
1) create a folder "abc" in RSE
2) create a file "123.c" under "abc"
3) in a shell, use "rm -Rf abc" to remove the folder and it's contents
4) select "abc" in RSE and refresh (F5)

Problem:
--------
The refresh indicates we want to query it's contents, so we query the non-existent "abc" to find no results.  As a result of the deferred query, the SystemView.add() callback method is called with the empty array of results to add so we end up removing "123" from the contents of "abc", however the folder "abc" itself is still shown.  

Solution:
---------
Ideally, we would use this opportunity in add() to remove "abc" from the tree but we don't have any way to know whether the object exists or not.  There's no ISystemViewElementAdapter.exists() method.  Since we can't add API for 2.0, an alternative is, from the add() method, to fire a remote refresh event on the parent when there are no children or the children is a SysteMessageObject. 
Comment 2 David McKnight CLA 2007-07-24 15:58:19 EDT
Created attachment 74502 [details]
patch to avoid query on display thread

This patch deals with scenarios 1 and 2 although it's not very pretty.  If this doesn't look clean enough, feel free to offer other suggestions.
Comment 3 Martin Oberhuber CLA 2007-07-25 13:08:19 EDT
The solution seems to make sense for files: refreshing a collapsed container only marks it stale, so I cannot see why we should not refresh the parent instead.

For arbitrary contributed subsystems, however, it may be problematic to refresh the parent when actually a child refresh was requested. On our embedded targets, we need to be very careful not to refresh too much.

Couldn't we just refresh the item instead and react to the exception thrown from the remote in case the item does not exist any more?
Comment 4 David McKnight CLA 2007-07-25 13:57:15 EDT
> Couldn't we just refresh the item instead and react to the exception thrown
> from the remote in case the item does not exist any more?
> 

If we refresh the item, we do react to the exception thrown in SystemView.add() by using either the empty list of results of the SystemMessageObject as a queue to refreshing the parent.

Comment 5 David McKnight CLA 2007-07-25 14:08:26 EDT
I've committed this change to cvs.  There may still be main thread calls to getFile() however due to other things (i.e. not from the getObjectWithAbsoluteName() call in the remote refresh).
Comment 6 David McKnight CLA 2007-07-25 14:47:19 EDT
I'm going to mark this as fixed since it addressed the issue listed in this defect.  For other main thread queries, we should address in new defects.
Comment 7 Martin Oberhuber CLA 2007-07-27 12:05:35 EDT
(In reply to comment #4)
> If we refresh the item, we do react to the exception thrown in SystemView.add()
> by using either the empty list of results of the SystemMessageObject as a queue
> to refreshing the parent.

So why again do we need to refresh the Parent of X if a refresh on X was requested and it happens to be collapsed? It seems your code change is specific to the File subsystem which does not perform any query but just marks X as stale. 

Other subsystems can work differently. I'm worried about adding extra burden on them, refreshing stuff they do not really want refreshed.

Remember the SystemView is generic for all kinds of subsystems. If the Files subsystem is doing things specially we should not have special handling for it in the SystemView.

Reopening until this is clarified.

Comment 8 David McKnight CLA 2007-07-27 12:13:47 EDT
First of all, the marking stale is not specific to file subsystem stuff.  Anyone can implement a subsystem to make use of ISystemContainer which deals with cached contents and marking them as stale.  The reason we're refreshing the parent is because we don't know whether the object we just queried exists anymore or not.  We have no exists() method in our adapters so we use the query on the object's parent to determine whether or not it's still there - and consequently the view is updated.


Comment 9 Martin Oberhuber CLA 2007-07-27 13:00:42 EDT
Sorry if I'm just too dumb to understand this.

So you're asked to refresh X. 
We don't know if X still exists, but we know that IF it exists it's a container.
So the query to refresh X must be a query to get the children of X.

My naive approach would be to just go and get the children of X since I've explicitly been asked to, and if an exception is thrown check whether that's an exception of a "item does not exist" kind.

Why doesn't that work? Is it because exceptions when getting children are translated into SystemMessageObjects? Or is it because the FileServiceSubsystem doesn't even make the getChildren() query if X is already stale, so you also don't get the exception?

If that's the case, my feeling would be that the logic for getting the parent instead of the actual item (in case X is stale) should be in the FileServiceSubsystem and not in the SystemView.
Comment 10 David McKnight CLA 2007-07-27 15:37:28 EDT
> My naive approach would be to just go and get the children of X since I've
> explicitly been asked to, and if an exception is thrown check whether that's an
> exception of a "item does not exist" kind.
> 
> Why doesn't that work? Is it because exceptions when getting children are
> translated into SystemMessageObjects?

Not all services behave the same way when the object to query does not exist.  For SSH, a SystemMessageObject is returned while, for Dstore, no contents come back and instead the original remote file is updated to indicate that it doesn't exist.  I'm not sure if we have a clearly prescribed way to deal with this particular situation at the subsystem/service layer.




Comment 11 David McKnight CLA 2007-07-30 15:33:28 EDT
I'm guessing the expectation you have is that if we did a query and it failed because the subject of the query no longer exists, we would explicitly remove that item from the view.  This makes sense, however, we currently have no way of determining for sure whether the subject of the query really exists without querying it's parent.  There is no ISystemViewElementAdapter.exists(Object) method and if we are to add one, that's new API (so it would have to wait).   Instead, that leaves us having to guess that the subject may no longer exist based on the children that come back (either nothing or a SystemMessageObject).  In that case, rather than incorrectly assuming the subject is no longer around, we refresh the parent.  Does this make sense?    
Comment 12 Martin Oberhuber CLA 2007-08-02 10:47:38 EDT
Notes from Telecon today:

Ideal Solution:
---------------
* ISystemViewElementAdapter.exists()
* Martin: reduce turnarounds, combine list() query with exists() 
  in a single call --> have an API with a clearly defined method for 
  throwing a particular exception when the item did not exist

Current Solution:
-----------------
* User performs REFERSH on child.
  - child is a file --> really refresh the parent.
  - child is an expanded folder
  - child is a collapsed folder but not stale
  - child is a collapsed stale folder

* child throws error --> LIST the parent.
  - parent throws no error
    - child exists --> (e.g. permission denied) --> show SystemMessageObject
         PROBLEM (ssh) messageObject shown quickly but then vanishes
            because parent is being refreshed
         PROBLEM (dstore) does not create a messageObject but updates
            the original internal object to mark it nonExistent
    - child does not exist --> remove child
         PROBLEM (dstore) does not create a messageObject but updates
            the original internal object to mark it nonExistent
  - parent throws an error --> WHAT TO DO NOW?
         Go all way up refreshing the parents (in the view)?
         In case of communication error, subsystem marks itself disconnected
     --> We should probably go up a single level of parent only.

Action Items:
-------------
DaveM investigate bug 198651, bug 198650    
Martin investigate how EFS-backed resource viewer handles the case
Comment 13 David McKnight CLA 2007-09-13 10:31:26 EDT
Since the original bug was dealt with and changes suggested in the last comments require API changes, I'm lowering the severity and priority and moving the target milestone to 3.0.
Comment 14 Martin Oberhuber CLA 2007-09-13 14:36:32 EDT
Even though this needs to be deferred to 3.0, I still think it needs to be addressed with high priority (as the recent bug #203306 showed, for instance).
Comment 15 David Dykstal CLA 2007-09-19 11:15:19 EDT
Need a bug to instrument the services so that we can determine when these services are called on the UI thread. 

Then, file individual bugs to fix the services so that they are ensured of running on a background thread.

API documentation should be changed so that clients know that certain calls should not be made on the UI thread.

The SystemView may be the class most affected by this.
Comment 16 David McKnight CLA 2007-11-30 14:55:32 EST
Created attachment 84228 [details]
avoiding use of ss.getTargetForFilter() on main thread

The remaining problem of querying on the main thread is as follows: 

When refreshing a filter (or folder), we attempt to refresh other filters that point to the same place.  The first problem in doing that is we ignore the fact that a matching filter could be itself.  We should ignore the filter in such a case.  When comparing the filters to the object we're refreshing, we make the call SubSystem.getTargetForFilter().  For files, this does a query via getRemoteFileObject() to get the corresponding IRemoteFile folder for the first filter string of a filter.  If we want to avoid that call, we may be able to use the filter string directly to compare (although I've only tried this out for file filters).

The patch here has the changes to avoid checking a filter if it's the object of the refresh and also uses the filter strings for comparision instead of the absolute name of a queried remote object.

Any thoughts on this?
Comment 17 Martin Oberhuber CLA 2008-01-24 14:17:58 EST
Regarding your question about filter comparison - My feeling is that just taking the first object that a filter resolves to, and comparing that against the first object that another filter resolves to, is not really a good indicator for whether filters are the same...

In the long run, I think I'd like to get moving away from filter strings to more flexible filter objects that have the ability to compare themselves against other filter objects, based on their filtering expression. I do think it should be possible to see whether two filters resolve to an overlapping set of objects, just by looking at the filter expressions.

While we still have filter strings, though, this seems risky. Consider the processes subsystem -- an "All processes" filter and a "My processes" filter would resolve to overlapping set of objects but have different filter strings. 

So for me, the question is, do we really need to also refresh other filters? Wouldn't that result in duplicate queries to the remote? Shouldn't we just refresh those items that the user has pointed to with his selection; and once the elements have been refreshed, send an "Elements have been refreshed" event that other filters would honor to update their contents?

The "check for self" part of your patch I did not understand. Why would a filter resolve to a resource that's the filter reference?

On a completely different note, I think here is another scenario where we do queries (and the connect operation) on the main thread:
  * On a disconnected connection, expand until "My Home" filter is visible
  * Select the "My Home" filter and right-click > refresh
    --> All connecting and one query happen on the main thread, thus stalling
        all of Eclipse until the connection is made... which can be long in 
        my case... 37 seconds according to the timestamps below:

2008-01-24 20:01:08.105 | main | SshConnectorService.connecting...
2008-01-24 20:01:43.843 | main | SshConnectorService.connected
2008-01-24 20:01:43.858 | main | SftpFileService.initService
2008-01-24 20:01:43.858 | main | SftpFileService.connecting...
2008-01-24 20:01:45.047 | main | SftpFileService.connected
2008-01-24 20:01:45.062 | main | SftpFileService.getFile: /home/data/users/moberhuber
2008-01-24 20:01:45.234 | main | SftpFileService.getFile <--
2008-01-24 20:01:45.360 | Worker-2 | SftpFileService.internalFetch: /home/data/users/moberhuber


Comment 18 David McKnight CLA 2009-03-18 15:55:35 EDT
(In reply to comment #17)

> On a completely different note, I think here is another scenario where we do
> queries (and the connect operation) on the main thread:
>   * On a disconnected connection, expand until "My Home" filter is visible
>   * Select the "My Home" filter and right-click > refresh
>     --> All connecting and one query happen on the main thread, thus stalling
>         all of Eclipse until the connection is made... which can be long in 
>         my case... 37 seconds according to the timestamps below:
> 
> 2008-01-24 20:01:08.105 | main | SshConnectorService.connecting...
> 2008-01-24 20:01:43.843 | main | SshConnectorService.connected
> 2008-01-24 20:01:43.858 | main | SftpFileService.initService
> 2008-01-24 20:01:43.858 | main | SftpFileService.connecting...
> 2008-01-24 20:01:45.047 | main | SftpFileService.connected
> 2008-01-24 20:01:45.062 | main | SftpFileService.getFile:
> /home/data/users/moberhuber
> 2008-01-24 20:01:45.234 | main | SftpFileService.getFile <--
> 2008-01-24 20:01:45.360 | Worker-2 | SftpFileService.internalFetch:
> /home/data/users/moberhuber
> 

I tried to reproduce this but I find that refreshing the "My Home" filter isn't causing a connect to happen (and therefore no getFile() call).  Can you still reproduce this?
Comment 19 Martin Oberhuber CLA 2010-05-27 08:14:25 EDT
I'm closing this since there is doesn't seem to be an immediate problem.