Bug 249544 - [usability] Save conflict dialog appears when saving files in the editor
Summary: [usability] Save conflict dialog appears when saving files in the editor
Status: RESOLVED FIXED
Alias: None
Product: Target Management
Classification: Tools
Component: RSE (show other bugs)
Version: 3.0.1   Edit
Hardware: PC Windows XP
: P1 critical (vote)
Target Milestone: 3.1 M3   Edit
Assignee: David McKnight CLA
QA Contact: Martin Oberhuber CLA
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 213363 250140
  Show dependency tree
 
Reported: 2008-10-02 16:44 EDT by Ankit Pasricha CLA
Modified: 2008-11-07 12:24 EST (History)
2 users (show)

See Also:


Attachments
patch to keep waiting while timestamp gets updated (2.90 KB, patch)
2008-10-03 10:32 EDT, David McKnight CLA
no flags Details | Diff
patch with update for conflict and dstore file service change (10.03 KB, patch)
2008-10-03 16:49 EDT, David McKnight CLA
no flags Details | Diff
updated patch (10.51 KB, patch)
2008-10-06 11:29 EDT, David McKnight CLA
no flags Details | Diff
patch to ensure remote file is current cached version before requery (9.22 KB, patch)
2008-10-20 18:52 EDT, David McKnight CLA
no flags Details | Diff
patch with javadoc for upload() and marking the file stale before calling upload (5.71 KB, patch)
2008-11-07 12:02 EST, David McKnight CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ankit Pasricha CLA 2008-10-02 16:44:49 EDT
Steps to reproduce:

1. Edit a file on a remote system. 
2. Click Save to save the new contents. DO NOT close the editor.
3. Restart the workbench with the file open.
4. When the workbench comes up, the file is still open.
5. Edit the file again and click Save.
6. RSE will prompt you to connect. Connect and allow the save.
7. Edit the file again and click Save.

A save conflict dialog pops up. I did some investigation...It seems that AFTER the upload, when RSE queries for the last modified date of the remote file, it still gets the old timestamp!!!
RSE should wait for the timestamp to be updated before saving the timestamp for future use. At the moment, there is an arbitrary sleep(1000) after which RSE queries for the new timestamp and assumes that this is the final value.
Comment 1 David McKnight CLA 2008-10-03 09:58:18 EDT
Is this really "critical"?  How frequently do you run into this situation?  Is there loss of data?
Comment 2 Ankit Pasricha CLA 2008-10-03 10:12:28 EDT
Dave, we run into this all the time...the above situation is only one of the scenarios in which we see it...The reason I used this scenario in the bug is because it can be consistently reproduced with these steps.
Our testers have been running into this so frequently that it has made the workbench almost unusable since the dialog keeps popping up.

loss of data question - when users see this conflict dialog:

        a) do not know the file that it happened on
        b) are not sure whether the content is really different.

If they choose to pick up changes from the remote file (thinking that the remote file has been updated), they would lose the changes that they were about to save. That makes this a critical bug.
Comment 3 David McKnight CLA 2008-10-03 10:32:41 EDT
Created attachment 114197 [details]
patch to keep waiting while timestamp gets updated

Could you see if this patch fixes the problem?
Comment 4 Ankit Pasricha CLA 2008-10-03 14:58:50 EDT
Comments about the patch:

1. If the timestamp does not update in 500ms (after the first sleep), then the loop will exit. We should make sure that the "new' timestamp is different from the timestamp before the upload....If I am performing the upload then this MUST be the case.

2. Dont you also need to update the SystemUploadConflictAction.BackgroundUploadJob  job with this fix?
Comment 5 David McKnight CLA 2008-10-03 16:49:38 EDT
Created attachment 114226 [details]
patch with update for conflict and dstore file service change

I've added the same wait method in the conflict dialog code.  Also, I made a change to the dstore file service upload code to ensure the file upload has completed on the host side.  The dstore part of this might actually be the only thing needed although the temp file listener code attempts ensure the updated timestamp for all.
Comment 6 Ankit Pasricha CLA 2008-10-04 00:52:52 EDT
I can still reproduce the conflict dialog with the latest patch...I still think we need the changes outline in comment 4 step 1.
Comment 7 Martin Oberhuber CLA 2008-10-06 03:35:08 EDT
From what I read here, the underlying problem may be the same as in bug 213363, we might want to make that one a duplicate of this one eventually when it is fixed.

Comment #2 a) seems to be the request from bug 242389, and to some respect bug 224960. Knowing the file that's in conflict is very important. Having the "diff files" option from bug 199438 would also help.

I agree that this is a severe usability problem.
Comment 8 David McKnight CLA 2008-10-06 11:29:32 EDT
Created attachment 114319 [details]
updated patch

Changed the patch to ensure the original timestamp changed before allowing the wait to complete.  Ankit, does this help?
Comment 9 Ankit Pasricha CLA 2008-10-06 16:32:29 EDT
Works better...It worked for the scenario mentioned in the bug.
Comment 10 David McKnight CLA 2008-10-06 16:42:37 EDT
I've committed the patch to cvs.  I'll mark this as fixed.  If you run into other problems with this please reopen.
Comment 11 Ankit Pasricha CLA 2008-10-16 22:18:02 EDT
Noticed a couple of problems while testing:

1. what happens if the timestamp is updated almost instantaneously. That is, what happens when the originalTimestamp is the final timestamp? This loop will never end...that is what we are seeing.

I think we should use the "remoteModifiedStamp" before the fs.upload is called as the original timestamp.

2. Since this is a while loop which could run forever, it might be useful to have a timeout in case something happens that causes the infinite loop. 
Comment 12 David McKnight CLA 2008-10-16 23:08:29 EDT
(In reply to comment #11)
> Noticed a couple of problems while testing:
> 
> 1. what happens if the timestamp is updated almost instantaneously. That is,
> what happens when the originalTimestamp is the final timestamp? This loop will
> never end...that is what we are seeing.
> 
> I think we should use the "remoteModifiedStamp" before the fs.upload is called
> as the original timestamp.
> 

The original timestamp that we pick up at the beginning of the wait method should be coming from the remote file that we queried before the upload so I'm not sure I understand why it would not have changed.  In a debugger do you see a difference between the remoteModifiedStamp of line 259 of SystemUniversalTempFileListener compared to the original timestamp in the wait method?   


> 2. Since this is a while loop which could run forever, it might be useful to
> have a timeout in case something happens that causes the infinite loop. 
> 

That's probably a good idea.
Comment 13 Ankit Pasricha CLA 2008-10-17 01:05:37 EDT
What happens when the file is local? The originalTimestamp will be the value after the upload, right?
Comment 14 David McKnight CLA 2008-10-17 10:27:04 EDT
(In reply to comment #13)
> What happens when the file is local? The originalTimestamp will be the value
> after the upload, right?
> 

Hmm, yes, local would be a problem since it's getting at the timestamp directly rather than via the cached object.  I guess I'll have to pass the original timestamp into the wait method.
Comment 15 David McKnight CLA 2008-10-17 11:09:07 EDT
I've made changes to pass in the original timestamp into the wait method since it's possible we might have a more direct handle to the current timestamp via the IRemoteFile, as in the case of local.  I've also added a timeout to ensure we never hang on the wait.
Comment 16 Martin Oberhuber CLA 2008-10-20 09:39:59 EDT
I'm concerned about this fix from an API point of view.

Looking at our IRemoteFileSubSystem#upload() method, one should think that this method is fully synchronous i.e. when the method returns, it has finished all its work (including updating the modified time).

With this fix, we're changing two particular clients to perform some additional busy waiting (with additional server roundtrips) after the upload is complete. What about other clients of this API, written by adopters that we do not know? Should we change the upload() API Docs to inform all clients that they might need to wait until some timestamp is changed? This seems wrong to me.

Moreover, this seems to be a dstore specific issue. I'd be very surprised if the SSH, FTP and Local implementations were not synchronous i.e. wouldn't provide updated timestamps after the upload is complete.

I think that we should move this fix into the DStoreFileService implementation in order to ensure that when its upload() method returns, the upload is in fact complete including updating of the timestamp. 

We might also want to look at the API Docs and clarify the fact that upload etc are assumed to be fully synchronous. Finally, we might also want to write Unit Test(s) to exhibit the bug with the current dstore implementation (timestamp not updated immediately after upload) and see the fix work (timestamp updated after upload complete).
Comment 17 David McKnight CLA 2008-10-20 10:59:10 EDT
(In reply to comment #16)
> I'm concerned about this fix from an API point of view.
> 
> Looking at our IRemoteFileSubSystem#upload() method, one should think that this
> method is fully synchronous i.e. when the method returns, it has finished all
> its work (including updating the modified time).
> 
> With this fix, we're changing two particular clients to perform some additional
> busy waiting (with additional server roundtrips) after the upload is complete.
> What about other clients of this API, written by adopters that we do not know?
> Should we change the upload() API Docs to inform all clients that they might
> need to wait until some timestamp is changed? This seems wrong to me.
> 
> Moreover, this seems to be a dstore specific issue. I'd be very surprised if
> the SSH, FTP and Local implementations were not synchronous i.e. wouldn't
> provide updated timestamps after the upload is complete.
> 
> I think that we should move this fix into the DStoreFileService implementation
> in order to ensure that when its upload() method returns, the upload is in fact
> complete including updating of the timestamp. 
> 
> We might also want to look at the API Docs and clarify the fact that upload etc
> are assumed to be fully synchronous. Finally, we might also want to write Unit
> Test(s) to exhibit the bug with the current dstore implementation (timestamp
> not updated immediately after upload) and see the fix work (timestamp updated
> after upload complete).
> 

I've seen the same problem with FTP as well as dstore.   In the dstore case, even after the final write call on the server the timestamp may still change.  It looks like this is the case for FTP too and I imagine the implementer of that service wouldn't have expected that.  I guess that means the only way to ensure the timestamp is the final timestamp is to poll.  But do we want to do that for each service, or have the framework take care of it as the current fix does?


Comment 18 Martin Oberhuber CLA 2008-10-20 11:18:32 EDT
If we don't do it in the Service, then we need to add a note to the API docs like "Note that after the upload completes, the timestamp may still change..."

Which would make the API confusing and hard to use. In my opinion, the right place for doing this is in the Service. This also puts the code where the expert is -- Services will know best how to efficiently implement the requirement.

Currently, the code is in 2 places. I'm very sure that there is no need for doing it with Local and SSH (though I'd like to see a unit test to prove this). Doing it for DStore and FTP leaves again 2 places for this.

Putting the Unittest into FileServiceTest allows us running the same test against all our connection types. If I'm not mistaken, the test would simply create a remote file, query it's timestamp, wait a little, upload a file into the existing file, query the timestamp again and fail if the timestamp has not been modified. Right?

What I'm not sure about is whether this issue depends on the size of files being uploaded. Ankit, does the file need to be large to exhibit the problem? - We could easily parameterize the testcase, and write a unittest that tests it repeatedly. For instance 4 times with sizes of 1K, 10K, 100K, 1M.
Comment 19 David McKnight CLA 2008-10-20 11:44:23 EDT
(In reply to comment #18)
> If we don't do it in the Service, then we need to add a note to the API docs
> like "Note that after the upload completes, the timestamp may still change..."
> 
> Which would make the API confusing and hard to use. In my opinion, the right
> place for doing this is in the Service. This also puts the code where the
> expert is -- Services will know best how to efficiently implement the
> requirement.
> 
> Currently, the code is in 2 places. I'm very sure that there is no need for
> doing it with Local and SSH (though I'd like to see a unit test to prove this).
> Doing it for DStore and FTP leaves again 2 places for this.
> 
> Putting the Unittest into FileServiceTest allows us running the same test
> against all our connection types. If I'm not mistaken, the test would simply
> create a remote file, query it's timestamp, wait a little, upload a file into
> the existing file, query the timestamp again and fail if the timestamp has not
> been modified. Right?
> 
> What I'm not sure about is whether this issue depends on the size of files
> being uploaded. Ankit, does the file need to be large to exhibit the problem? -
> We could easily parameterize the testcase, and write a unittest that tests it
> repeatedly. For instance 4 times with sizes of 1K, 10K, 100K, 1M.
> 

I just saw this with SSH as well.  It's pretty easy to reproduce with a small file.  Sometimes you can reproduce it by repeatedly saving the file.  Restarting and saving right away is also an easy way to reproduce it.


Comment 20 Martin Oberhuber CLA 2008-10-20 14:15:20 EDT
(In reply to comment #19)
> I just saw this with SSH as well.  It's pretty easy to reproduce with a small

Hm. I can't believe that this is the SSH FileService's fault unless you can prove to me with a unittest. It seems to me that there is some logical problem inside the subsystem layer.

Doe we have an IRemoteFile#markStale() call after the upload is complete? I cannot see it in FileServiceSubSystem#upload(), but I believe that we need this. Perhaps we don't need the busy waiting at all after all.
Comment 21 Martin Oberhuber CLA 2008-10-20 14:18:00 EDT
I think we need this in FileServiceSubSystem#upload():

try {
    getFileService().upload(new File(source), ...);
} finally {
    destination.markStale();
}
Comment 22 David McKnight CLA 2008-10-20 18:36:15 EDT
I investigated this a bit further.  I added a junit to the FileServiceTest class and found that we don't hit this problem in the services directly (although for dstore, I do need my earlier change).  Then I tried comparing the timestamps of the IRemoteFile vs. an IHostFile that I queried directly via the service in SystemUniversalTempFileListener.  In some cases, the IHostFile had a different timestamp.  I checked and we were correctly marking the remote file stale but we ending up picking up an unexpected cached remote file when the workspace is restarted and then the file is saved.

On the first save of an open editor after restart, the temp file listener uses Display.asynchExec() to call SystemEditableRemoteFile.openEditor() in order to reassociate the editor with the corresponding remote file.  Inside openEditor() we call remoteFile.markStale() and getRemoteFileObject(), which replaces the cached remote file with a new one that isn't stale.  

In the SystemUniversalTempFileListener.upload() call, we now are using a handle to a remote file that isn't the cached one.  As a result, after the upload, when we call remoteFile.markStale() and getRemoteFileObject(), we end up getting the cached file we retrieved via openEditor(), which was before the upload() call.  So the new timestamp is the same as the timestamp before the upload.



 
Comment 23 David McKnight CLA 2008-10-20 18:52:30 EDT
Created attachment 115642 [details]
patch to ensure remote file is current cached version before requery

This patch ensures that the remote file we use before the upload is the latest one from the cache so that, when we mark it stale and do a requery after the upload, we get the new version, not an older cached version.  As a result, we should get the correct timestamp after an upload.

Ankit, could you try this patch out in place of the previous?
Comment 24 David McKnight CLA 2008-10-21 11:41:38 EDT
I haven't seen any problems with save conflicts using the patch so I've committed the code to cvs.    
Comment 25 Martin Oberhuber CLA 2008-11-06 11:15:25 EST
Dave - when reviewing the changes, I got the feeling that in 
  SystemUniversalTempFileListener.upload() line 266
we should be doing 
  remoteFile.markStale()
before getting the remote file object.

We are deciding here whether the upload will result in a conflict or not. Upload will be slow anyways, so spending an extra round trip to ensure we really get the latest status of the remote seems to be wise. 

Or am I missing something, do we know that the remoteFile is current before this is called? Then we should probably add a corresponding comment.

Reopening to clarify.
Comment 26 David McKnight CLA 2008-11-06 11:32:57 EST
(In reply to comment #25)
> Dave - when reviewing the changes, I got the feeling that in 
>   SystemUniversalTempFileListener.upload() line 266
> we should be doing 
>   remoteFile.markStale()
> before getting the remote file object.
> 
> We are deciding here whether the upload will result in a conflict or not.
> Upload will be slow anyways, so spending an extra round trip to ensure we
> really get the latest status of the remote seems to be wise. 
> 
> Or am I missing something, do we know that the remoteFile is current before
> this is called? Then we should probably add a corresponding comment.
> 
> Reopening to clarify.
> 


On line 181 of SystemUniversalTempFileListener, we call remoteFile.markStale() before the upload() call.  Originally we didn't do the line 266 call of fs.getRemoteFileObject() at the beginning of upload().  What was found was that it was possible that a call to SystemEditableRemoteFile.openEditor() from doResourceSynchronization() would change the current cached remote file before we got into upload().  However, now that the code that calls openEditor() is moved to after the upload(), it's probably no longer necessary to make the line 266 call.
    
Comment 27 Martin Oberhuber CLA 2008-11-07 09:18:31 EST
Fine, but SystemUniversalTempFileListener#upload() is a public API method. And actually, it is also called from SystemEditableRemoteFile line 709. It looks like the markStale() is missing there! Which means, if the remote has changed "from the outside" it will be overwritten without conflict warning, right?

I'd like to either see another markStale in upload() which will perform an unnecessary extra round trip in the context of the SystemUniversalTempFileListener, but makes the public API cleaner and easier to understand; or, add Javadoc to the upload() method explaining that the remote file's status will be taken from the cache and not necessarily re-queried so overwrite might happen.

Actually, the public API upload() method should have Javadoc anyways.



Comment 28 David McKnight CLA 2008-11-07 12:02:51 EST
Created attachment 117337 [details]
patch with javadoc for upload() and marking the file stale before calling upload

I've added javadoc for SystemUniversalTempFileListener.upload().  Martin, is this enough to address the issues?
Comment 29 Martin Oberhuber CLA 2008-11-07 12:05:43 EST
Whow! Very nice and exhaustive. I like it, and it's certainly enough :-)
Comment 30 David McKnight CLA 2008-11-07 12:24:22 EST
I've committed the latest patch to cvs.