Bug 199773 - [ftp][regression] FTP always uploads remote files as binary
Summary: [ftp][regression] FTP always uploads remote files as binary
Status: RESOLVED FIXED
Alias: None
Product: Target Management
Classification: Tools
Component: RSE (show other bugs)
Version: 2.0   Edit
Hardware: PC Windows XP
: P2 major (vote)
Target Milestone: 2.0.1   Edit
Assignee: Javier Montalvo Orús CLA
QA Contact: Martin Oberhuber CLA
URL:
Whiteboard:
Keywords: investigate
Depends on:
Blocks: 203114 203357
  Show dependency tree
 
Reported: 2007-08-13 11:47 EDT by Stephen Davison CLA
Modified: 2012-05-23 16:48 EDT (History)
1 user (show)

See Also:
dmcknigh: review+
mober.at+eclipse: review+


Attachments
FileServiceSubSystem passing transfer mode from RSE Preferences (2.70 KB, patch)
2007-09-12 09:20 EDT, Javier Montalvo Orús CLA
no flags Details | Diff
patch not checking remote encoding to force binary transfer (1.12 KB, patch)
2007-09-13 07:20 EDT, Javier Montalvo Orús CLA
no flags Details | Diff
patch with simplified logic (1.76 KB, patch)
2007-09-13 09:45 EDT, Javier Montalvo Orús CLA
mober.at+eclipse: iplog-
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Stephen Davison CLA 2007-08-13 11:47:26 EDT
FTP Only connection stores files to remote system as binary, even with file transfer mode set to Text in Preferences > Remote Systems > Files.  Files are retrieved from remote system as text or binary according to the preference settings.

(RSE 1.0.1 FTP service retrieves and stores as text or binary according to the preference settings.)

Platform Specifics:
Windows XP SP 2
Eclipse Platform: version 3.3.0, build id I20070621-1340
RSE Core: 2.0.0.v20070609-7P--EB7sQRxRjbc
RSE FTP: 2.0.0.v20070609-77378_kE77Y7U9K5G7A
Remote host: SunOS 5.9
Comment 1 Martin Oberhuber CLA 2007-08-14 05:19:21 EDT
This may be related to bug 189352, see also the newsgroup entry that this bug came from:
http://dev.eclipse.org/newslists/news.eclipse.dsdp.tm/msg00227.html
Comment 2 Javier Montalvo Orús CLA 2007-09-12 09:20:18 EDT
Created attachment 78177 [details]
FileServiceSubSystem passing transfer mode from RSE Preferences

After some investigation, it looks like when uploading or downloading a file in FTP, the transfer mode depends on the file type rather than on the file transfer mode selected by the user in the RSE Preferences.

The solution, instead of modifying the FTP Service to access the Preferences, that would have some UI/non-UI issues, could be updating org.eclipse.rse.subsystems.files.core.servicesubsystem.FileServiceSubSystem to provide the transfer mode from the RSE Preferences when calling the FTP Service.

I'm attaching a patch for FileServiceSubSystem providing the user preferred transfer mode in the upload and download methods, but it should be reviewed as other file subsystems will be also affected.
Comment 3 David McKnight CLA 2007-09-12 09:48:48 EDT
So, let me see if I understand this right.   The reason this wasn't working for you before was because the type of the file you wanted to transfer already had a specific setting for it.  You expected that the default setting would be used rather than the specific setting?  If that's the case, I would think the behavior is as expected since the "default" implies the mode that is used when no specific setting is made, while in your case there is a specific setting.  In other words, this is designed such that a specific file type can have specific settings, but the fallback is to use the default.  Or am I misunderstanding what you're saying?
Comment 4 Javier Montalvo Orús CLA 2007-09-12 11:39:59 EDT
Changing the File Transfer Mode in the RSE Preferences has no effect in the isBinary attribute passed to the FTP service.
This attribute just informs if the file attempted to be uploaded or downloaded has been previously categorized as binary (executable) or not, not using the information from the Preferences set by the user.

Shouldn't it use the user preferences instead ?
Comment 5 David McKnight CLA 2007-09-12 11:46:13 EDT
What type of file is this?  Doesn't the file type show up in the Files preference page table?
Comment 6 Martin Oberhuber CLA 2007-09-12 14:02:35 EDT
I agree that the patch just taking the default mode is not acceptable.

I'm not sure what exactly the problem is, but we should check that

  - FileTransferModeRegistry is re-initialized when user changes Preferences;
    it looks like a cache for file name mappings needs to be cleared in that
    case

  - I do not understand why FileTransferModeRegistry has some default-defaults
    of name and extension mappings built in. This is not user-visible. Why
    not expose that default-defaults to the user and let him choose / change?
    Why are *.xml files transferred in Binary by default, but that decision
    is made somewhere else?
    There should be ONE user-configurable place where the decision of Binary
    vs. Text mode is made, and that's the FileTransferModeRegistry. The code
    in getDefaultMapping() should be removed IMHO since only the user-specified
    mappings as visible in the Preferences should be considered.

What I also find interesting is, that the bug description states that download does take the transfer mode into account as desired, but upload does not.
    
Comment 7 Martin Oberhuber CLA 2007-09-12 14:08:49 EDT
BTW, the UI for specifying transfer mode in the Preference page is very unintuitive. I filed bug 203114 for this.
Comment 8 Javier Montalvo Orús CLA 2007-09-12 14:09:16 EDT
Some more investigation showed the following:

org.eclipse.rse.subsystems.files.core.model.ISystemFileTransferModeRegistry

contains some arrays of String containing files and extensions that override the default file transfer.

Those arrays are:

DEFAULT_TEXT_FILE_NAMES
DEFAULT_TEXT_FILE_EXTENSIONS
DEFAULT_LPEX_TEXT_FILE_EXTENSIONS
DEFAULT_ISERIES_LPEX_TEXT_FILE_EXTENSIONS
DEFAULT_UNIX_LPEX_TEXT_FILE_EXTENSIONS
DEFAULT_BINARY_FILE_EXTENSIONS  

So, if I have a "test.txt" file, it will we transferred as text because "txt" is listed in DEFAULT_TEXT_FILE_EXTENSIONS, and if I have a "test.exe" it will be treated as binary because "exe" is part of DEFAULT_BINARY_FILE_EXTENSIONS, independently of the default preferences.

This behavior is a bit confusing, as the user will see the files being transferred not using the default settings. 
Comment 9 Martin Oberhuber CLA 2007-09-12 14:17:35 EDT
Agree. Looks like the code works properly for any file types listed explicitly in the listbox on the Preference page.

The code currently fails (from a user's perspective) for any file extension 
1.) that is NOT shown explicitly on the listbox on the Preference page,
2.) that is listed in the built-in default-defaults,
3.) where the built-in default-default differs from the user-visible
    "default" radiobutton.

For instance, 
"*.txt" files will always be transferred in ASCII mode even if not listed on the Prefs page and the "Default File Transfer Mode" says Binary.

"*.xml" files will always be tranferred in Binary mode.

"*.exe" files will always be tranferred in Binary mode even if not listed on the Prefs page and the "Default File Transfer Mode" says ASCII.

I think the only correct fix is to get rid of the built-in default-defaults, and convert them to user-visible default-Preferences instead. This might, however, create a backward compatibility issue because Preferences are saved in the Workspace.

The right solution for 2.0.1 seems to be to only add the builtin default-defaults to the visual preference page. Then, users will see what's going on and they can change the filetype-level default. It will fail again, though, when they manually delete such a mapping. And the bugfix will not help on old, pre-existing workspaces. 

Thinking again, I guess an even better solution is to 
1.) mark it as a known limitation
2.) document what builting default-defaults exist, and
3.) document the workaround for fixing the issue (i.e. manually adding a
    filetype level mapping for overriding the builtin default behavior).
Comment 10 Martin Oberhuber CLA 2007-09-12 14:24:54 EDT
Changed the summary to better describe the issue (previous value was:
   [ftp][regression] FTP always uploads remote files as binary)

Decreased Severity since there is a Workaround (Adding the file type in question to the Preferences).

The workaround does not work for *.xml files due to the code in FileServiceSubSystem, but it does work for *.exe files. I believe that this should be fixed, I see no reason for unconditionally forcing *.xml files to binary in FileServiceSubsystem.
Comment 11 David McKnight CLA 2007-09-12 15:25:58 EDT
I think the reason we explicitly treat xml files as binary is because the encoding for the xml files is typically embedded in the xml code.  So the reader/editor of the xml file would normally use that to determine the encoding.  For example, the following is a typical start for an xml file:

<?xml version="1.0" encoding="UTF-8"?>

If we can't assume that the file is readable in binary form, then what encoding would need to be used to read this header?


   
Comment 12 Stephen Davison CLA 2007-09-13 00:48:25 EDT
(In reply to comment #3)
> So, let me see if I understand this right.   The reason this wasn't working for
> you before was because the type of the file you wanted to transfer already had
> a specific setting for it.  You expected that the default setting would be used
> rather than the specific setting?  If that's the case, I would think the
> behavior is as expected since the "default" implies the mode that is used when
> no specific setting is made, while in your case there is a specific setting. 
> In other words, this is designed such that a specific file type can have
> specific settings, but the fallback is to use the default.  Or am I
> misunderstanding what you're saying?
> 

My original summary didn't specify "Default transfer mode" because I had the same problem for specific file types and the default.  

Here's an example using a specific file type:
Starting with a clean install of Eclipse 3.3 and RSE 2.0.0.1, the Remote Systems -> Files preferences page shows Text as the file transfer mode for *.sql files.  A remote sql file will be retrieved with an ASCII transfer, but when the file is saved back to the remote system the STOR operation uses a binary transfer.  

I get the exact same behavior for other file types listed (and that I've added), as well as for file types that are not listed.  In the latter case I would expect the Default File Transfer Mode to apply.  In reality the file is retrieved as text or binary as specified in the preference setting, but it always gets saved (stored) with a binary transfer.

It looks to me like you're ignoring the user's preferences, opting to always upload files as binary instead.
Comment 13 Stephen Davison CLA 2007-09-13 01:17:51 EDT
(In reply to comment #10)
> Changed the summary to better describe the issue (previous value was:
>    [ftp][regression] FTP always uploads remote files as binary)
> 
> Decreased Severity since there is a Workaround (Adding the file type in
> question to the Preferences).
> 
> The workaround does not work for *.xml files due to the code in
> FileServiceSubSystem, but it does work for *.exe files. I believe that this
> should be fixed, I see no reason for unconditionally forcing *.xml files to
> binary in FileServiceSubsystem.
> 

I'm just getting caught up with the comments.  (I'm on vacation right now.)  Please don't reduce the severity.  File types specifically listed in the preferences for Text transfer mode are uploaded with a binary transfer.  I've tried this with file type already there in a fresh install and workspace (e.g. *.sql) and for types I've added (e.g. *.sh, *.csh, *.cfm, *.cfc).
Comment 14 Stephen Davison CLA 2007-09-13 01:44:18 EDT
I'm not sure what the right answer is for XML files, but I definitely don't agree with forcing binary transfers for two reasons:
1. the Remote Systems -> Files preferences page includes *.xml in the list of file types with the transfer mode set to Text.  Consequently, I would expect my XML files to be FTP'ed with the type set to ASCII.

2. I'm using RSE to edit files (including xml files) on my Windows PC and UNIX (SunOS) hosts, and to copy them between Windows and Unix.  Life would be simpler if Microsoft and Apple would break lines the Unix way, but until then I'd like text files to be FTP'ed as ASCII.  And even though Eclipse has some convenient features to deal with line breaks, users shouldn't have to overtly convert them before FTP'ing files.
Comment 15 Martin Oberhuber CLA 2007-09-13 02:28:38 EDT
Back to major until this is clarified for XML.

@DaveM: I think the question is, how "Text mode" transfer is defined. Does
 it imply that encodings are changed? If yes, then your argument about the
 encoding being embedded in the file is correct.
 But the FTP notion of Text mode vs. Binary mode just is about the line endings,
 which are a different topic than the encodings. 

Will need to think about this a bit more. But my personal feeling would be that at any rate, the user-entered visible Preference (*.xml in the Prefs page) must have precedence and honored under all circumstances. There are users who do not care about encodings at all, but who do care about the line endings.

Other users care about the encodings but not the line endings. So again, what is the definition of "Text Mode" as visible in the Preferences? - That definition should perhaps go into the User docs.

Any ideas how to address this?
Comment 16 Martin Oberhuber CLA 2007-09-13 02:42:49 EDT
(In reply to comment #13)
> Please don't reduce the severity.  File types specifically listed in the
> preferences for Text transfer mode are uploaded with a binary transfer.  I've

Not for me with RSE HEAD. I definitely tried *.exe and it was first uploaded binary; after adding an entry for *.exe files it was uploaded in text mode.

Javier can you please test the scenario indicated by Stephen (download as text but upload binary) with the file types indicated by him (*.sql, *.sh) on a fresh empty workspace with RSE 2.0.1RC1 (today's test candidate). Once we have agreement on these, we'll think about xml separately.
Comment 17 Martin Oberhuber CLA 2007-09-13 03:44:04 EDT
I reproduced the issue with test.csh on a fresh empty workspace: drag&drop from FTP to local, downloads as ASCII; drag back up to FTP uploads as Binary. This is a must-fix for 2.0.1 (but still please concentrate on testing 2.0.1RC1 today).
Comment 18 Javier Montalvo Orús CLA 2007-09-13 07:20:44 EDT
Created attachment 78297 [details]
patch not checking remote encoding to force binary transfer

The upload method in FileServiceSubSystem uses a method to check if the file has to be treated as binary different from the one used in download.
This isBinary method currently flags as binary if the local and remote encondings are the same. 

The attached patch removes checking it (same behavior as the isBinary used for download) and, after applying it files are correctly uploaded as specified in the File Preferences.
Comment 19 Martin Oberhuber CLA 2007-09-13 08:31:32 EDT
The new patch is much better and I vote for applying it.
Here is the original logic:

   isText := !hostEncoding.equals(localEncoding) && 
             TransferModeRegistry.isText(remote) &&
             !isXML(remote);

Through boolean conversions, this is the same as the following:

   isBinary := hostEncoding.equals(localEncoding) ||
               TransferModeRegistry.isBinary(remote) ||
               isXML(remote);

Now it is much clearer to see that it makes no sense to treat a file as binary just because the encodings happen to be the same on the local and on the remote (since there still might be line ending differences; the transfer mode registry has to decide; recoding won't happen anyways). Thus here is what Javier did:

   isBinary := TransferModeRegistry.isBinary(remote) ||
               isXML(remote);

We still need to decide what to do with XML files, but we can discuss this at the F2F meeting. For now I'd suggest to change the patch once again so the simplified logic is used, and then apply the patch since this is quite clearly a bug.

Comment 20 Martin Oberhuber CLA 2007-09-13 08:45:10 EDT
(In reply to comment #11)
Thinking about XML again, I'd propose the following:

* Use ONLY the FileTransferModeRegistry to decide about ascii vs. binary 
  transfer. In the Registry, avoid all built-in or default-default settings
  such that the settings are always explicitly visible.

* Only if the FileTransferModeRegistry asks for ASCII transfer:
  -) Check if the file has an encoding embedded (e.g. is an XML file).
     We might want to use Eclipse ContentTypes for that at some point,
     but can also use the dumb XML heuristics for now.
     If an encoding is embedded, then set the target encoding == source 
     encoding such that no recoding happens even though we do ASCII upload.

This may not be the perfect solution, and may need more discussions, but it could be a start.
Comment 21 Javier Montalvo Orús CLA 2007-09-13 09:45:41 EDT
Created attachment 78307 [details]
patch with simplified logic
Comment 22 Martin Oberhuber CLA 2007-09-13 16:23:56 EDT
Since Javier committed the patch, I propose marking this bug fixed since the original obvious issue is fixed.

I filed bug 203357 to track the issue of *.xml and *.xmi files.

Bug 203114 will be used to track the issue of having invisible default-defaults in the SystemFileTransferModeRegistry, and improving the UI.
Comment 23 Martin Oberhuber CLA 2007-09-13 16:34:22 EDT
Restoring original Summary.

The issue is now fixed such that if an explicit transfer mode is set for a file type in the Preference Page, and the file is not "*.xml" or "*.xmi", it will transfer in the specified mode.

Remaining issues:
- *.xml files always transfer binary (bug #203114)
- On the Preference page, the "default transfer mode" setting is not honored
  for some file types - there are default-defaults (bug #203357).

Fixing these issues is deferred for now.
Comment 24 Martin Oberhuber CLA 2012-05-23 16:48:38 EDT
Comment on attachment 78307 [details]
patch with simplified logic

iplog- since Javier was committer since Aug 2006 already:
http://dev.eclipse.org/mhonarc/lists/dsdp-tm-dev/msg00375.html