Burak Kulakli found while testing 2.0.1: Folder names allow Turkish characters (e.g. ğ), but when you try to save a file in such a folder, it gives an error.

I tried connecting with SSH-Only, then under My Home/aatmp: New > Folder > Name = "ağb". It keeps telling me "Folder already exists". Under dstore, this works correctly. It seems that SSH converts "ağb" into "a?b" and finds that resource already existing.

FTP has the same issue as SSH. The FTP console shows:
MKD a?b
521 "/folk/mober/aatmp/a?b" directory exists

-----------Enter bugs above this line-----------
TM 2.0.1 Testing
installation : eclipse-SDK-3.3 (I20070625-1500), cdt-4.0.0, emf-2.3.0
RSE install : Download RSE-2.0.1RC1: RSE-SDK,examples,tests,discovery,terminal
java.runtime : Sun 1.6.0_01-b06
os.name: : Windows XP 5.1, Service Pack 2
------------------------------------------------
systemtype : Linux SSH-Only
------------------------------------------------
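For illustration, the "ağb" to "a?b" conversion described above can be reproduced outside RSE with plain Java string encoding. This is only a sketch: the report does not say which encoding was in effect on the SSH or FTP connection, so ISO-8859-1 is an assumption, picked because it has no mapping for ğ (U+011F). The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LossyPathDemo {
    // Encode a name to bytes and decode it again with the same charset.
    // String.getBytes(Charset) silently substitutes the charset's
    // replacement byte ('?' for ISO-8859-1) for unmappable characters.
    static String roundTrip(String name, Charset cs) {
        return new String(name.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        String folder = "a\u011Fb"; // "ağb", written with an escape to avoid source-encoding issues

        // ISO-8859-1 (assumed here) cannot represent U+011F.
        System.out.println(roundTrip(folder, StandardCharsets.ISO_8859_1)); // prints "a?b"

        // UTF-8 can represent every Unicode character, so the name survives.
        System.out.println(roundTrip(folder, StandardCharsets.UTF_8).equals(folder)); // prints "true"
    }
}
```

Because the substitution is silent, a request to create "ağb" would reach the server as "a?b", which is consistent with the "directory exists" reply in the FTP console once a folder of that name has been created.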
Burak also found: File names allow Turkish characters (e.g. ğ), but when you save the file, the character becomes "?". I suppose this is the same underlying problem; updating the Summary.
Created attachment 78634 [details]
Patch to support encodings in FTP files and paths

The attached patch supports encodings in FTP. Note that due to a limitation in Commons Net, FTP commands will be encoded with the same encoding, so this will not work for encodings that are not compatible with 8-bit ASCII (specifically UTF-16 and other wide encodings).
Created attachment 78636 [details]
Patch to support encodings in SSH Sftp files and paths

The attached patch fixes the issue for SSH Sftp. In the future, recoding should be done inside Jsch.
Note that for SSH and FTP, we cannot find out the remote default encoding. Thus, when no encoding has been specified by the user, we fall back to the local client default encoding. This should ensure that the characters which users typically use on the client can actually be encoded into some byte sequence. But that encoding may not be appropriate for the actual target platform.

I wonder if it might be better to throw an exception when a path cannot be encoded (resulting in a question mark "?" in the file name, or in a 16-bit wide encoding on FTP), so that users can review and update their encoding settings.
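The idea of throwing an exception instead of silently producing "?" could look roughly like the following. This is a minimal sketch using the standard `java.nio.charset` API, not the code from the attached patches; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictPathEncoder {
    // Encode a path, failing loudly if any character has no mapping
    // in the requested charset (instead of substituting '?').
    static byte[] encodeOrFail(String path, Charset cs) throws CharacterCodingException {
        ByteBuffer buf = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(CharBuffer.wrap(path));
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    public static void main(String[] args) {
        String folder = "a\u011Fb"; // "ağb"
        try {
            encodeOrFail(folder, StandardCharsets.ISO_8859_1);
            System.out.println("encoded without loss");
        } catch (CharacterCodingException e) {
            // The caller could surface this so the user reviews the encoding setting.
            System.out.println("cannot encode path with ISO-8859-1");
        }
    }
}
```

A cheaper pre-check with `Charset.newEncoder().canEncode(path)` would also work if only a yes/no answer is needed and the encoded bytes are not.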
Created attachment 79249 [details]
Updated patch warning about encoding problems

The attached updated patch fixes both FTP and Sftp to honor the specified encoding. As discussed during our F2F meeting, they now warn when the user tries to modify the remote file system (create, rename, copy, delete, upload) with a local Unicode file name that cannot be properly expressed in the given encoding (an exception is thrown; the text of the exception is not yet externalized).

For Sftp, a bug in Jsch always encodes with the local platform default encoding; therefore, if the requested remote encoding is different, we need to emulate and recode. Unfortunately, there are combinations related to the local default encoding (particularly the normal Windows cp1252) where some bytes cannot be properly expressed. As a result, some Unicode characters (particularly "č") cannot be used with a local cp1252 / remote utf8 combination. A Jsch bug has been filed for this.

The patch is large, but in the default case (remote encoding == local encoding, which was always the case before the patch), recode() does nothing, so the patch should be safe.
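To make the recoding emulation concrete, here is a rough sketch of the technique under stated assumptions; it is not the recode() from the patch, and the class and helper names are hypothetical. It assumes the transport library always calls `getBytes(localDefault)` on the path it is given, so the caller hands it a string whose local-default bytes equal the desired remote-encoding bytes. The `survives` check shows why a byte with no assignment in the local charset breaks the trick.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RecodeSketch {
    // Build a string that, when encoded with 'local', yields the bytes
    // of 'path' in 'remote'. A no-op when both encodings are the same.
    static String recode(String path, Charset remote, Charset local) {
        if (remote.equals(local)) {
            return path;
        }
        return new String(path.getBytes(remote), local);
    }

    // True if the bytes that a local-default-encoding library would send
    // match the bytes the remote side expects.
    static boolean survives(String path, Charset remote, Charset local) {
        byte[] wanted = path.getBytes(remote);
        byte[] sent = recode(path, remote, local).getBytes(local);
        return Arrays.equals(wanted, sent);
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");
        Charset utf8 = StandardCharsets.UTF_8;

        // "ğ" is C4 9F in UTF-8; both bytes are assigned in cp1252.
        System.out.println(survives("a\u011Fb", utf8, cp1252));

        // "č" is C4 8D in UTF-8; 0x8D is unassigned in cp1252.
        System.out.println(survives("\u010D", utf8, cp1252));

        // ISO-8859-1 assigns every byte value, so nothing is lost.
        System.out.println(survives("\u010D", utf8, StandardCharsets.ISO_8859_1));
    }
}
```

The check could be run before any modifying operation, so that a path which does not survive the emulation is reported to the user rather than sent in damaged form.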
I'm committing the patch since Kushal orally agreed to review the patches:

[203500] Support encodings for SSH Sftp and FTP paths
FTPService
SftpFileService
SshConnectorService
FTPConnectorService
ISshSessionProvider