Bug 209704 - [api][dstore] Ability to override default encoding conversion needed.
Summary: [api][dstore] Ability to override default encoding conversion needed.
Status: RESOLVED FIXED
Alias: None
Product: Target Management
Classification: Tools
Component: RSE (show other bugs)
Version: 2.0.2   Edit
Hardware: PC Windows XP
: P3 enhancement (vote)
Target Milestone: 3.0 M4   Edit
Assignee: David McKnight CLA
QA Contact: Martin Oberhuber CLA
URL:
Whiteboard:
Keywords: api
Depends on: 210812
Blocks:
  Show dependency tree
 
Reported: 2007-11-13 16:08 EST by Violaine Batthish CLA
Modified: 2007-12-04 12:57 EST (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Violaine Batthish CLA 2007-11-13 16:08:53 EST
When files are transferred from host to client, RSE does the file encoding conversion automatically.  A way is need to provide subsystems and connections with the ability to override that conversion and provide a different one.
Comment 1 David McKnight CLA 2007-11-13 17:50:14 EST
This may need to be dstore-specific since I don't think the other file service implementations (ftp or ssh) do any conversion.  

Martin, suppose we have an extension for this that is used in the IFileService layer.  Do you think it would make sense for it to be picked up by all the services or just dstore?


Comment 2 Martin Oberhuber CLA 2007-11-14 04:24:06 EST
I never quite liked the idea of encoding conversions at all.

Can you explain
  - why these are needed (Use cases)
  - when these are done in today's RSE (what UI controls are involved)

My understanding of Eclipse is that it can edit files in any kind of encoding, so I'd prefer having files transferred binary by default and edited in their native encoding.
Comment 3 Violaine Batthish CLA 2007-11-14 08:09:13 EST
(in the internal version of) RSE, files were stored locally in the temp folder as UTF-8 (with no option to change this).  When content of files are transfered, the String java class is used to handle this.
In the case of bi-directional text, this is not practical as text converted this way does not perform the "round-trip" and appears corrupted upon download from hosts where data is encoded in visual format due to limitations of the logical representation of data on Windows and requires additional characters (LTR and RTL) to be added.
In the case of RDz, the bidi competency centre developed a transformation which is optimized for the source type being transfered (ie COBOL, C/C++ etc) and provides the users with a number of configurable options to transform the file.
Comment 4 Martin Oberhuber CLA 2007-11-14 10:39:06 EST
So, when I read you right, not transforming the file might also be an option? And leaving the actual transformation to the consumer (i.e. the editor)?
Comment 5 Ankit Pasricha CLA 2007-11-14 10:51:35 EST
I think RSE should do the transformation otherwise EVERY editor will need to do their own transformation. I think it will be cleaner if the file transport provides this facility (or allows this facility to be plugged in).
Comment 6 Martin Oberhuber CLA 2007-11-14 10:56:52 EST
But shouldn't the transformations be part of the standard editor framework, such that any editor can easily re-use them? I did not check in detail, but I could imagine that something like EditableInputStream or IResourceInputStream would do all the conversions as specified by the Eclipse Workspace settings.
Comment 7 Martin Oberhuber CLA 2007-11-14 12:35:54 EST
We discussed this in the Committer meeting today:
http://wiki.eclipse.org/DSDP/TM/Committer_Phone_Meeting_14-Nov-2007

we're wondering what the UI and/or extension point could look like for the custom contributed conversions for BIDI, or whatever else you need? Can you list some requirements?
Comment 8 Violaine Batthish CLA 2007-11-20 09:33:30 EST
RE Comment #4-#6 
Not transforming the file (ie leaving it in binary) would not be acceptable.  Editors (those that are not RSE aware) would assume the default workspace encoding - which would confuse users who would have already set their encoding once in RSE.

As for requirements, no UI would be needed by RSE.
The extension point should define a converter (and optionally the subsystem id that it applies to)
The interface for the converter should then include:
1.  A check for whether or not the converter applies to the current subsystem and codepage combo (isConverterValid)
2.  A method to convert from host bytes to client String, given the host bytes and the host and client codepages (convertHostBytesToClientString)
3.  A method to convert from client String to host bytes, given the client String and the host and client codepages (convertClientStringToHostBytes) 
Comment 9 David McKnight CLA 2007-11-23 18:01:53 EST
Thinking about this again, I think it makes sense for this code page converter to be applicable for all the services.  When we use the text transfer mode, as specified in the file preferences, we need to honour that and convert from the remote encoding to the local encoding.  In that case, the editor will not actually use the remote encoding as the charset - instead it will use the local encoding.


Comment 10 David McKnight CLA 2007-11-23 19:43:31 EST
I've contributed the following for this:

IFileServiceCodePageConverter

	public byte [] convertClientStringToRemoteBytes(String clientString, String remoteEncoding, IFileService fs);

	public void convertFileFromRemoteEncoding(File file, String remoteEncoding, String localEncoding, IFileService fs);
	
	public boolean isServerEncodingSupported(String remoteEncoding, IFileService fs);

	public int getPriority(String remoteEncoding, IFileService fs);


Also, I've provided a default implementation, DefaultFileServiceCodePageConverter that is used for text transfer when no other code page converter is available.  The DStoreFileService uses this right now but other file services could make use of this by getting the appropriate converter via calls to:

	Activator.getDefaultCodePageConverter()
	Activator.getCodePageConverter(String serverEncoding, IFileService fileService)

I've also contributed the extension point, codePageConverters.  Here is an example of a contribution:

   <extension point="org.eclipse.rse.services.codePageConverters">
      <codePageConverter
         id="org.eclipse.rse.services.files.defaultCodePageConverter"
         name="Default Code Page Converter"
         class="org.eclipse.rse.files.DefaultFileServiceCodePageConverter">
      </codePageConverter>
   </extension>




Comment 11 David McKnight CLA 2007-11-23 19:44:30 EST
Violaine, could you tell me whether this meets your requirements?
Comment 12 Martin Oberhuber CLA 2007-11-26 08:14:23 EST
Looking at the contribution, I have a couple of questions and comments:

1. In comment #9 you say "makes sense for this code page converter to be
   applicable for all the services". But why is it then
   "IFileServiceCodePageConverter" and not just "ICodePageConverter" ?

2. Currently, every Service is responsible for doing the conversion. I've 
   questioned that before and asked whether we wouldn't want to have the 
   conversion in the subsystem, to avoid duplication of code; I think we then
   agreed to only put this into dstore for now, so it's on the Service layer.
   But anyways, since Activator.getCodePageConverter() is "internal" non-API,
   getting the code page converter can only be done by breaking the API boundary
   right now. Implementors of new Services must be able to get the code page
   converter without breaking API.
   I believe that getCodePageConverter() must be API, and I believe 
   that the logic for falling back to the default must be insider the get...()
   method and not to be implemented by each and every service.

3. Please update rse.doc.isv to add the new extension point. In the end, it 
   should be visible on dsdp.eclipse.org similar to this one (and, also please
   add the NLS string for %extPoint.remoteFileTypes):
   http://dsdp.eclipse.org/help/latest/topic/org.eclipse.rse.doc.isv/reference/extension-points/org_eclipse_rse_subsystems_files_core_remoteFileTypes.html

Comment 13 David McKnight CLA 2007-11-26 09:40:00 EST
(In reply to comment #12)
> Looking at the contribution, I have a couple of questions and comments:
> 1. In comment #9 you say "makes sense for this code page converter to be
>    applicable for all the services". But why is it then
>    "IFileServiceCodePageConverter" and not just "ICodePageConverter" ?

I meant all the file services.  I suppose other types of services could make use of something like this, but for now I'd prefer to keep this with files, since we have a good idea how we want to use it there.

> 2. Currently, every Service is responsible for doing the conversion. I've 
>    questioned that before and asked whether we wouldn't want to have the 
>    conversion in the subsystem, to avoid duplication of code; 

The way I've implemented it right now, is that the service is still responsible for making use of this mechanism.  Doing this at the subsystem layer is okay for download, however, for upload, since we're sending bytes at the service layer, we need to be converting there as we read the bytes to send.  If we were to do that in the subsystem layer, we'd have to convert the entire file to send to a separate temporary file prior to passing the file down to the service layer - and that is problemmatic for various reasons.  


> I think we then
>    agreed to only put this into dstore for now, so it's on the Service layer.

Currently, the only service using this is dstore.

>    But anyways, since Activator.getCodePageConverter() is "internal" non-API,
>    getting the code page converter can only be done by breaking the API
> boundary
>    right now.
> Implementors of new Services must be able to get the code page
>    converter without breaking API.
>    I believe that getCodePageConverter() must be API, and I believe 
>    that the logic for falling back to the default must be insider the get...()
>    method and not to be implemented by each and every service.

I wasn't sure where the best place to put the code for getting at the code page converter should be.  Perhaps a new utility class is required.  Any suggestions?



Comment 14 Martin Oberhuber CLA 2007-11-27 12:40:10 EST
As per http://wiki.eclipse.org/DSDP/TM/Committer_Phone_Meeting_27-Nov-2007 :

What happens if user selects a conversion but service does not provide/support it? - Perhaps have a supportsEncodingConversion() in IFileService with a default returning false in AbstractFileService; and subsystem to check .
Comment 15 David McKnight CLA 2007-12-03 11:34:01 EST
(In reply to comment #14)
> As per http://wiki.eclipse.org/DSDP/TM/Committer_Phone_Meeting_27-Nov-2007 :
> What happens if user selects a conversion but service does not provide/support
> it? - Perhaps have a supportsEncodingConversion() in IFileService with a
> default returning false in AbstractFileService; and subsystem to check .

When you say if a "user selects a conversion" do you mean if a user specifies an attribute of the conversion on a property page (i.e. a BIDI conversion table page)?  Currently, it's up to the service whether the conversion extension point actually gets used.  Whether the user settings (on a property page) are appropriate is a separate issue from whether the extension point is used.  I'm wondering what would be querying IFileService.supportsEncodingConversion().  Are you suggesting that the property page should query the service (so that a conversion page like the BIDI one could be disabled if false is returned)?
  
Comment 16 Martin Oberhuber CLA 2007-12-04 07:43:09 EST
(In reply to comment #15)
> Are you suggesting that the property page should query the service (so that a
> conversion page like the BIDI one could be disabled if false is returned)?

Yes, exactly. We should not present UI for settings that cannot be made or have no effect with the currently selected service.

The point is, that the Property Page extension for BIDI would get activated whenever dstore is *installed*, regardless of whether dstore is currently *active* for the given connection. We need to make sure that property pages for extensions are only presented when the requirements for that extension are all met.

There might be other options to achieve this instead of an isEncodingSupported() in the service -- the property page could perhaps be bound to the dstore subsystem only, and we could forbid changing service at runtime (a feature which I have always found questionable).
Comment 17 David McKnight CLA 2007-12-04 12:57:10 EST
I've moved the calls to get the code page converter out of the Activator and into a new class - CodePageConverterManager.  I've also added a new API method to IRemoteFileSubSystem and IFileService: supportsEncodingConversion().