Bug 227135 - [ssh][nls] Cryptic exception when connecting SSH host which has no sftp-server
Summary: [ssh][nls] Cryptic exception when connecting SSH host which has no sftp-server
Status: RESOLVED FIXED
Alias: None
Product: Target Management
Classification: Tools
Component: RSE
Version: 2.0.3
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.1 M6
Assignee: Martin Oberhuber CLA
QA Contact: Martin Oberhuber CLA
URL:
Whiteboard:
Keywords: PII
Duplicates: 265881
Depends on: 203490 204710 272882
Blocks:
Reported: 2008-04-15 10:26 EDT by Yaron Mazor CLA
Modified: 2009-04-20 08:49 EDT

See Also:


Attachments
screenshot (13.91 KB, image/png)
2008-04-15 10:29 EDT, Yaron Mazor CLA
Screenshot: proposed solution (6.27 KB, image/gif)
2009-03-05 11:21 EST, Martin Oberhuber CLA

Description Yaron Mazor CLA 2008-04-15 10:26:09 EDT
An exception is thrown when trying to browse a remote resource using SFTP or Unix connections. "My Home" has no data and Root shows "/", but browsing for its children causes the exception to be thrown. A screenshot of what happens is attached.

The account I was using was disabled for SSH, so an error was being returned, but RSE simply threw an exception without any information instead of giving the reason for the failure.
Comment 1 Yaron Mazor CLA 2008-04-15 10:29:57 EDT
Created attachment 96080
screenshot
Comment 2 Martin Oberhuber CLA 2008-04-15 11:37:47 EDT
CQ:WIND00108151

We've had this report before, see bug 203502 comment 1 -- it happens on small embedded targets with Dropbear SSH Service for instance, since Dropbear does not provide sftp by default. See bug 213438 comment 1 for enabling sftp on dropbear.

Anyway, we should improve the error message when sftp is not available. This might require introducing a new NLS String so I'm marking it M7.

We'll also need to check if or how we can disable the Sftp subsystem but still keep the SSH Shell Subsystem active. They share the same connector service, so "disconnect" on only one of them seems impossible, making the problem nontrivial.
Comment 3 Martin Oberhuber CLA 2008-05-05 17:30:03 EDT
Not for M7
Comment 4 Martin Oberhuber CLA 2008-05-20 17:45:00 EDT
Bulk update of target milestone
Comment 5 Martin Oberhuber CLA 2008-06-09 19:54:40 EDT
Will need to look at this in 3.0.1
Comment 6 Martin Oberhuber CLA 2008-09-30 09:04:38 EDT
bulk update of target milestone
Comment 7 Martin Oberhuber CLA 2009-03-05 09:55:43 EST
*** Bug 265881 has been marked as a duplicate of this bug. ***
Comment 8 Martin Oberhuber CLA 2009-03-05 11:21:56 EST
Created attachment 127669
Screenshot: proposed solution

I've got a potential solution coded, where the error message is shown as a child of the node that the user tries to expand in RSE.

That way, we do not disconnect the Sftp subsystem but show the exception; if users continue to fiddle with the Sftp subsystem, more connect attempts will be made resulting in the same exception again. 

Still, the advantage of this approach is that the Shell and Terminal subsystems can remain connected (which was nontrivial since all SSH services share the same connectorService).
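
To illustrate the approach described above, here is a minimal sketch of a children provider that converts a failed listing into a single error child instead of letting the exception escape. All names (RemoteLister, TreeNode, ErrorAsChildProvider) are hypothetical stand-ins and not RSE classes; the actual fix may be structured quite differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a service that lists remote children; not an RSE interface. */
interface RemoteLister {
    List<String> list(String parentPath) throws Exception;
}

/** Hypothetical tree node: either a normal remote entry or an inline error message. */
final class TreeNode {
    final String label;
    final boolean isError;

    TreeNode(String label, boolean isError) {
        this.label = label;
        this.isError = isError;
    }
}

/** Turns a failed listing into one error child instead of propagating the exception. */
final class ErrorAsChildProvider {
    private final RemoteLister lister;

    ErrorAsChildProvider(RemoteLister lister) {
        this.lister = lister;
    }

    List<TreeNode> getChildren(String parentPath) {
        List<TreeNode> children = new ArrayList<>();
        try {
            for (String name : lister.list(parentPath)) {
                children.add(new TreeNode(name, false));
            }
        } catch (Exception e) {
            // Drop partial results and show the failure where the user clicked.
            // Nothing is disconnected here, so other services sharing the
            // connection are unaffected.
            children.clear();
            String reason = e.getMessage() != null
                    ? e.getMessage()
                    : e.getClass().getSimpleName();
            children.add(new TreeNode("Operation failed: " + reason, true));
        }
        return children;
    }
}
```

Each further expand calls the lister again, which matches the behavior noted above: repeated attempts produce the same error child.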

I still need to think about if or how to log the exception. Usually, we'd say that this is a message shown to the user (with a localized String), so we'd not want it logged. On the other hand, the actual exception backtrace is lost that way, making further investigation hard. I think that whether the exception is logged should depend on a global logging level, but that may be hard to do in a generic way.
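
One way the level-dependent logging could look is sketched below with plain java.util.logging. This is an assumption for illustration only: RSE has its own logging facilities, and the UserErrorReporter class and the choice of Level.FINE as the threshold are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper; the class and method names are illustrative, not RSE API. */
final class UserErrorReporter {
    private final Logger logger;

    UserErrorReporter(Logger logger) {
        this.logger = logger;
    }

    /**
     * Returns the text to show in the UI. The backtrace goes to the log only
     * when the logger is configured at FINE or a more verbose level.
     */
    String report(String localizedMessage, Throwable cause) {
        if (logger.isLoggable(Level.FINE)) {
            logger.log(Level.FINE, localizedMessage, cause);
        }
        return localizedMessage;
    }
}
```

With the logger at its usual level the user only sees the localized message; raising verbosity to FINE keeps the backtrace available for investigation.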

Feedback about the proposed solution would be appreciated.
Comment 9 Alex Pitigoi CLA 2009-03-11 13:58:16 EDT
Martin, your proposal looks good from my perspective: as long as the end-user gets informed about the original cause of the communication error it would be consistent with any other clients of the (SSH) protocol. That means, the end-user is not required to try other means to connect to the same server to understand what's failing.

Logging may still be required for cases where this handling does not show up in the UI the way your screen capture does, and yes, it would likely depend on a global logging level.

Thanks for your timely fix,
Alex P.
Comment 10 Martin Oberhuber CLA 2009-03-11 18:54:59 EDT
One problem I've seen is that shell connections (e.g. when you do "ls") try contacting the sftp server a zillion times, thus performing slower than necessary. I think that we'll want to cache the fact that there won't ever be an sftp-server, in order to avoid trying again and again. That's why the fix isn't committed yet...
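
The caching idea can be sketched as follows, assuming a hypothetical SftpOpener callback that stands in for whatever opens the sftp channel. The first failure is remembered and rethrown without contacting the server again until reset() is called, for instance on disconnect.

```java
import java.io.IOException;

/** Hypothetical stand-in for the code that opens the sftp channel. */
interface SftpOpener {
    void open() throws IOException;
}

/** Remembers a failed sftp start so later calls fail fast until reset() is called. */
final class CachedSftpAvailability {
    private final SftpOpener opener;
    private boolean open;
    private IOException cachedFailure; // null until an attempt has failed

    CachedSftpAvailability(SftpOpener opener) {
        this.opener = opener;
    }

    synchronized void ensureOpen() throws IOException {
        if (open) {
            return;
        }
        if (cachedFailure != null) {
            // Same exception instance as the first failure: no new round trip.
            throw cachedFailure;
        }
        try {
            opener.open();
            open = true;
        } catch (IOException e) {
            cachedFailure = e;
            throw e;
        }
    }

    /** Forget the cached state, since the server setup may have changed. */
    synchronized void reset() {
        open = false;
        cachedFailure = null;
    }
}
```

The trade-off is that a server which gains an sftp-server during the session is not noticed until the cache is reset.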

For sure, a cleaner solution would be for the SFTP Files Subsystem simply not to get connected. But as it stands today, it re-uses the generic FileServiceSubSystem, which only asks the ConnectorService about "connected" status. I don't see an easy way to change this such that SSH subsystems can have independent "connected / available" status even if they share the same ConnectorService. Perhaps somebody has an idea?
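
Purely as an illustration of what independent "connected / available" status could mean, the sketch below keeps one shared connected flag plus a per-service unavailable set. SharedSshConnection and SshService are hypothetical types, not existing RSE classes, and the sketch does not address how FileServiceSubSystem would be changed to consult such a status.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical list of services that share one SSH session. */
enum SshService { SHELL, TERMINAL, SFTP_FILES }

/** Hypothetical shared connection state with per-service availability. */
final class SharedSshConnection {
    private boolean connected;
    private final Set<SshService> unavailable = EnumSet.noneOf(SshService.class);

    synchronized void connect() {
        connected = true;
        unavailable.clear(); // a fresh session may behave differently
    }

    synchronized void disconnect() {
        connected = false;
        unavailable.clear();
    }

    /** Record that one service cannot be used on this session. */
    synchronized void markUnavailable(SshService service) {
        unavailable.add(service);
    }

    synchronized boolean isConnected() {
        return connected;
    }

    /** A service is usable only if the session is up and it was not ruled out. */
    synchronized boolean isUsable(SshService service) {
        return connected && !unavailable.contains(service);
    }
}
```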
Comment 11 Alex Pitigoi CLA 2009-03-12 09:17:00 EDT
I am only beginning to understand the inner layers of the subsystems, and I wonder if the issues I'm currently seeing are not related to your last comment. I have at least 2 cases where FileServiceSubSystem.list(...) fails during runtime (with another cryptic exception: RSEF1002: Operation failed. File system input or output error: Message reported from file system: <empty>). This never happens when I run through debug.

I noticed a similar debug vs. runtime behavior divergence with RemoteFileUtility.getFileSubSystem(), but I found a work-around using RSECorePlugin.waitForInitCompletion(). This again seemed to be a timing issue, which makes me believe there are a few asynchronous aspects that are not well behaved or documented, and the exception handling gaps make it a rough experience.

Should I open a different defect, or is there something obvious I'm missing?
Comment 12 Martin Oberhuber CLA 2009-03-19 19:19:29 EDT
Ok, I think I've committed something that's good enough:

  [227135] Cryptic exception when sftp-server is missing

Along the way, I cleaned up SystemOperationFailedException to provide better and more consistent reporting, and added Javadocs. The most important part of the fix is in

  SftpFileService.connect()

where the error is now properly reported. One problem with the current behavior is that "My Home" behaves differently than any other path: when the user simply expands the "My Home" folder, an empty list is returned instead of the expected exception. 

That's partially an API problem (because getUserHome() doesn't throw any exceptions), and partially a problem in the FileServiceSubSystem, since an exception on connect() is not properly reported but silently swallowed. I'll leave it to Alex whether he wants to follow up on this with a separate bug -- some background information about the odd and IMO unsatisfying behavior of IFileService#getUserHome() is in bug 203490 and bug 204710.
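
The shape of the "My Home" problem can be sketched with a simplified, hypothetical file service. SketchFileService, its error text and the "/home/user" path are made up for illustration and are not the real SftpFileService; the point is only that a method which declares no exception has no way to pass on a connect() failure.

```java
import java.io.IOException;

/** Hypothetical, simplified file service; it only mirrors the shape of the problem. */
final class SketchFileService {
    private final boolean sftpAvailable;

    SketchFileService(boolean sftpAvailable) {
        this.sftpAvailable = sftpAvailable;
    }

    /** Reports the failure with a message the user can act on. */
    void connect() throws IOException {
        if (!sftpAvailable) {
            throw new IOException(
                "Cannot start sftp: the remote host may not provide an sftp-server");
        }
    }

    /** Declares no exception, so a failed connect() can only be swallowed here. */
    String getUserHome() {
        try {
            connect();
            return "/home/user";
        } catch (IOException e) {
            // The cause is lost: callers just see "no home" and show nothing.
            return null;
        }
    }

    /** Alternative signature that lets the caller report the cause. */
    String getUserHomeOrThrow() throws IOException {
        connect();
        return "/home/user";
    }
}
```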