Bug 229610 - [api] File transfers should use workspace text file encoding
Summary: [api] File transfers should use workspace text file encoding
Status: RESOLVED FIXED
Alias: None
Product: Target Management
Classification: Tools
Component: RSE (show other bugs)
Version: 3.0   Edit
Hardware: PC Windows XP
: P3 normal (vote)
Target Milestone: 3.0 M7   Edit
Assignee: David McKnight CLA
QA Contact: Martin Oberhuber CLA
URL:
Whiteboard:
Keywords: api
Depends on:
Blocks: 224799 230001
  Show dependency tree
 
Reported: 2008-04-30 09:13 EDT by Don Yantzi CLA
Modified: 2008-05-26 11:26 EDT (History)
5 users (show)

See Also:
mober.at+eclipse: review? (ddykstal.eclipse)


Attachments
patch that changes use of System.getProperty("file.encoding") to use the workspace encoding preference (38.35 KB, patch)
2008-05-01 15:23 EDT, David McKnight CLA
no flags Details | Diff
patch to use DefaultEncodingProvider (22.70 KB, patch)
2008-05-05 11:16 EDT, David McKnight CLA
no flags Details | Diff
updated patch as per comment #17 (32.53 KB, patch)
2008-05-05 13:30 EDT, David McKnight CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Don Yantzi CLA 2008-04-30 09:13:35 EDT
Build ID: I20080207-1530

Steps To Reproduce:
The UniversalFileTransferUtility.downloadFileToWorkspace method uses:

String localEncoding = System.getProperty("file.encoding"); 

For the encoding of the local file. This should be changed to use the default workspace file encoding (General > Workspace, Text file encoding preference) to be consistent with other workspace files.

More information:
Comment 1 David McKnight CLA 2008-04-30 12:18:31 EDT
We can use the following to get at the preference:

String localEncoding = file.getParent().getDefaultCharset();

The problem is that there is some code that can't depend on the Eclipse resource plugin that uses System.getProperty("file.encoding").  The services layer doesn't have this dependency (in this case DStoreFileService).

Martin, what would you suggest in such a scenario?  Do you think the "preferences listener" should also be setting the workspace encoding preference in the dstore file service like we do with other preferences?



Comment 2 David McKnight CLA 2008-05-01 15:23:58 EDT
Created attachment 98356 [details]
patch that changes use of System.getProperty("file.encoding") to use the workspace encoding preference

I've tried replacing System.getProperty("file.encoding" with the workspace encoding preference.   In the case of the DStoreFileService, I needed to add an API (setLocalEncoding()) in order for the preference to get passed over to the service.  I haven't done a lot of testing and there are a fair number of changes here so I'd feel more comfortable if someone could review this.
Comment 3 Martin Oberhuber CLA 2008-05-02 10:39:16 EDT
We have had the request for using the Workspace default Encoding before, see bug 224799 comment 4.

I think it makes a lot of sense, but the problem I have with this approach is that reading the workspace default encoding creates a dependency into org.eclipse.core.resources which we want to eventually avoid, in order to be compatible with RCP (see bug 182363).

I think the right solution is this:

  * We add new API for getLocalDefaultEncoding() very low down in RSE, that
    is in the Services Layer or in the Core Layer.

  * Wherever we currently do System.getProperty("file.encoding") we use the
    new API method instead.

  * The default-default for getLocalDefaultEncoding() is System.getProperty
    ("file.encoding")

  * We add "internal" non-API for setLocalDefaultEncoding() -- that is, only
    RSE itself is allowed to set it.

  * At a proper place where a dependency into core.resources is acceptable,
    we read the Workspace default encoding and make the call into
    setLocalDefaultEncoding -- this must happen during early RSE startup,
    e.g. the plugin can use the new RSE startup extension point (with "session"
    startup).
    At the same place, we also install Listener to listen for changes of the
    workspace default encoding -- not sure if we need a Preference Change 
    Listener here, or a Workspace / Resource change listener. We call
    setLocalDefaultEncoding() when needed.

Ability to change the local default encoding might be a risk in some cases, e.g. where we keep an internal (Unicode) cache of a resource in memory, if the encoding changes in this case then the cache will be out-of-sync with the corresponding resource on the file system.

If we think that this might be an issue for some clients, we also need to provide ability for clients to register as listeners for default encoding changes. Personally, I do not think that it is an issue because the "default
encoding" will always be a fallback only for "real" encoding.
Comment 4 Martin Oberhuber CLA 2008-05-02 10:50:21 EDT
In fact, it turns out that for the encoding getter, we already have
   
   SystemEncodingUtil#getEnvironmentEncoding()

so it looks like we only need to care for a setter. One other thing to consider might be that encoding getter must be in sync with the RemoteTempFile project encoding, in order to ensure that opening an editor on a tempfile uses the right encoding. 

This issue is probably along the lines of the caching problem I mentioned earlier; but it seems to be inconsistent keeping encoding of the tempfiles project the same, when the user changes the workspace default encoding? I'm really not sure what the expected behavior is, when user has a remote file (from the tempfiles project) already open in the editor, and then chanes the workspace default encoding. I *assume* that expected behavior is 

  (1) tempfiles project picks up the workspace default encoding, because it 
      does not have a different encoding set on its own
  (2) Any open remote files from the tempfiles project get re-read in order to
      pick up the new encoding

Because of this issue, dependency injection might be a better solution than what I outlined above. We do the getLocalDefaultEncoding() as mentioned before, but instead of the setter, we do this:

public class SystemEncodingUtil {

   public interface DefaultEncodingProvider {
       public String getLocalDefaultEncoding();
   }

   private DefaultEncodingProvider fDefaultEncodingProvider = null;

   public void setDefaultEncodingProvider(DefaultEncodingProvider p) {
       fDefaultEncodingProvider = p;
   }

   public String getLocalDefaultEncoding() {
      if(fDefaultEncodingProvider != null) {
         return fDefaultEncodingProvider.getLocalDefaultEncoding();
      }
      return System.getProperty("file.encoding"); //$NON-NLS-1$
   }
};
Comment 5 Martin Oberhuber CLA 2008-05-02 10:50:48 EDT
And, of course, the RemoteEditManager registers itself as default encoding provider.
Comment 6 David McKnight CLA 2008-05-02 10:57:46 EDT
It is possible that a user may have changed the temp files project, or a folder within that to use an encoding other than the workspcae default.  What would we do in such a case?
Comment 7 Martin Oberhuber CLA 2008-05-02 11:05:38 EDT
Good point. In order to be consistent with what the editor shows, we need to pick up the modified (overridden) encoding from the folder for the download.

But I wonder how users could change the encoding if they never see the remote edit project or any of its folders?

In the future, we'll likely have to move away from the remote edit project anyways, and towards a different caching strategy that does not involve Ecilpse Resources (but works in the metadata). At that point, we'll be able to get rid of any unwanted dependencies into core.resources.

But for the sake of bug 224799 comment 4, I still think that doing dependency injection into SystemEncodingUtil is the right thing to do. We'd use the getLocalDefaultEncoding() in the right places; in those places that reference files from the remote edit project directly, we'd use the getParentFolder().getEncoding() instead.

I'm marking this bug as API because we'll need the following API additions to SystemEncodingUtil:

public class SystemEncodingUtil {

   public interface DefaultEncodingProvider {
      public String getLocalDefaultEncoding();
   }

   public void setDefaultEncodingProvider(DefaultEncodingProvider p);
}

These changes are guaranteed to be non-breaking because SystemEncodingUtil is a singleton, so it cannot be instantiated or extended by clients. Unless we also want to rename getEnvironmentEncoding() into getLocalDefaultEncoding() -- which I'd think is a good idea -- but would make the change a breaking one.

Marking M7 to get the API in. Dave D, Dave M, Xuan, are you OK with this API change?
Comment 8 Martin Oberhuber CLA 2008-05-02 11:34:52 EDT
Little change: in order to be able to change the DefaultEncodingProvider interface in the future without breaking API, we probably better declare a class DefaultEncodingProvider, which clients may extend. This allows us to add methods such as getRemoteDefaultEncoding() in the future (well, perhaps we should do that right away).

In this case, RemoteEditManager will not implement DefaultEncodingProvider direcly, but create a local anonymous instance of a subclass of DefaultEncodingProvider, where he overrides the getLocalDefaultEncoding() method. We could also enforce that the encoding provider can be set only once, but I'm not sure if this is actually beneficial.

public class SystemEncodingUtil {

   private DefaultEncodingProvider fDefaultEncodingProvider = 
          new DefaultEncodingProvider();
   
   /**
    * Provider for the default encodings that RSE uses.
    * Clients may subclass this class, and override methods.
    * @since org.eclipse.rse.services 3.0
    */
   public class DefaultEncodingProvider {
      
      /**
       * Return the default encoding for local workspace resources.
       * Clients may override.
       * @return String the local default encoding.
       */
      public String getLocalDefaultEncoding() {
          return System.getProperty("file.encoding"); //$NON-NLS-1$
      }
   }

   /**
    * Change the default encoding provider.
    *
    * This is a system-wide change, and clients will not be notified
    * of changed default encodings due to changing the provider. Therefore,
    * changing the provider should be done only once during early system 
    * startup.
    * 
    * @param p the new encoding provider.
    */
   public void setDefaultEncodingProvider(DefaultEncodingProvider p) {
       fDefaultEncodingProvider = p;
   }

   public String getLocalDefaultEncoding() {
      return fDefaultEncodingProvider.getLocalDefaultEncoding();
   }
}
Comment 9 David McKnight CLA 2008-05-02 11:37:18 EDT
(In reply to comment #7)
> Good point. In order to be consistent with what the editor shows, we need to
> pick up the modified (overridden) encoding from the folder for the download.

So in some cases we would use file.getParent().getDefaultCharset() while in others we'd use SystemEncodingUtil.getLocalDefaultEncoding().  We would need to be clear when each should be used. 

> But I wonder how users could change the encoding if they never see the remote
> edit project or any of its folders?

Users could remove the view filter which hide the temp files project (or there could be views that don't have those filters).  Also it's possible that some tool might programmatically set the project or folder encoding.

> In the future, we'll likely have to move away from the remote edit project
> anyways, and towards a different caching strategy that does not involve Ecilpse
> Resources (but works in the metadata). At that point, we'll be able to get rid
> of any unwanted dependencies into core.resources.
> But for the sake of bug 224799 comment 4, I still think that doing dependency
> injection into SystemEncodingUtil is the right thing to do. We'd use the
> getLocalDefaultEncoding() in the right places; in those places that reference
> files from the remote edit project directly, we'd use the
> getParentFolder().getEncoding() instead.
> I'm marking this bug as API because we'll need the following API additions to
> SystemEncodingUtil:
> public class SystemEncodingUtil {
>    public interface DefaultEncodingProvider {
>       public String getLocalDefaultEncoding();
>    }
>    public void setDefaultEncodingProvider(DefaultEncodingProvider p);
> }
> These changes are guaranteed to be non-breaking because SystemEncodingUtil is a
> singleton, so it cannot be instantiated or extended by clients. Unless we also
> want to rename getEnvironmentEncoding() into getLocalDefaultEncoding() -- which
> I'd think is a good idea -- but would make the change a breaking one.
> Marking M7 to get the API in. Dave D, Dave M, Xuan, are you OK with this API
> change?

I'm okay with this API change.  Where do we want to call setDefaultEncodingProvider()?  Will the encoding provider listen to preference changes?

This is what I had tried before in my patch:
	IPropertyChangeListener preferenceListener = new IPropertyChangeListener()      {
		public void propertyChange(PropertyChangeEvent event){
		       String propertyName = event.getProperty();
			if (propertyName.equals(ResourcesPlugin.PREF_ENCODING)){
				String value = (String)event.getNewValue();
			        service.setLocalEncoding(value);
			}
		}
	};
		
        // for the local encoding preference
	Preferences rstore = ResourcesPlugin.getPlugin().getPluginPreferences();
	String encoding = rstore.getString(ResourcesPlugin.PREF_ENCODING);
	service.setLocalEncoding(encoding);

Comment 10 David McKnight CLA 2008-05-02 11:39:18 EDT
...but I guess if the default encoding provider is passed down from something that already has access to the resource, we don't need any listening.
Comment 11 Martin Oberhuber CLA 2008-05-02 11:45:53 EDT
(In reply to comment #9)
> So in some cases we would use file.getParent().getDefaultCharset() while in
> others we'd use SystemEncodingUtil.getLocalDefaultEncoding().  We would need
> to be clear when each should be used. 

Yes. Get the parent folder in actions related to tempfiles and the edit project. Do getLocalDefaultEncoding() -- which gets the workspace default -- in all other cases.

> Also it's possible that some tool might programmatically set the project
> or folder encoding.

Ah, good point. that makes sense.

> I'm okay with this API change.  Where do we want to call
> setDefaultEncodingProvider()?  Will the encoding provider listen to
> preference changes?

When the RemoteEditProject is the encoding provider, it doesn't need to listen to preference changes because it can always return the current value that it reads from the workspace. That's the advantage of dependency injection.

We can call setDefaultEncodingProvider in any plugin that gets loaded early, and has a dependency into core.resources. Eventually, it will likely be a separate new plugin, along the lines of bug 182363. While we do not have that separate plugin, PFWorkspaceAnchor might be a good place, because this gets loaded as soon as the workspace is first used in some way.

Perhaps we should also add Javadoc saying that getting the local default encoding is only allowed after RSE startup has completed (so clients had a chance to set the encoding provider). And mention that in the default case, the Workspace Default Encoding will be used.

If reading the local default encoding is only allowed after completed startup, this is an argument in favor of keeping the existing getEnvironmentEncoding() unchanged and adding getLocalDefaultEncoding() in addition to it, since their meaning is slightly different (environment is available from the very beginning and will never change; local default is valid only later and might change).
Comment 12 David McKnight CLA 2008-05-02 12:09:51 EDT
Could we add the workspace encoding provider from files.core startup?  Prior to that, we could use the default (i.e. System.getProperty("file.encoding")).
Comment 13 Martin Oberhuber CLA 2008-05-02 12:43:38 EDT
I don't think this is a good idea.

When a product built on top of RSE does not include RSE files support for any reason, then the observed local encoding default would not be linked to the Workspace. 

So the fix for what bug 224799 comment 4 requests would not work in that case.

No, I think the default encoding provider must be in a "core" plugin.
Comment 14 David McKnight CLA 2008-05-02 14:29:27 EDT
(In reply to comment #13)
> I don't think this is a good idea.
> When a product built on top of RSE does not include RSE files support for any
> reason, then the observed local encoding default would not be linked to the
> Workspace. 
> So the fix for what bug 224799 comment 4 requests would not work in that case.
> No, I think the default encoding provider must be in a "core" plugin.

By "core" plugin, do you mean rse.core?  If that's the case, then I guess we need to add the core.resources plugin to it's dependencies.
Comment 15 Martin Oberhuber CLA 2008-05-02 18:31:24 EDT
rse.core already depends on eclipse.core.resources, because the PFWorkspaceAnchor needs it (and some other stuff too, when I'm not mistaken).
Comment 16 David McKnight CLA 2008-05-05 11:16:08 EDT
Created attachment 98644 [details]
patch to use DefaultEncodingProvider

With this patch, we use SystemEncodingUtil.getLocalDefaultEncoding() to determine the workspace encoding, rather than System.getProperty("file.encoding").

Martin, is this what you had in mind?
Comment 17 Martin Oberhuber CLA 2008-05-05 13:06:21 EDT
Patch looks good, I just have 3 comments:

(1) tempFile.getParent().getDefaultCharset()
    why not directly use tempFile.getDefaultCharset() ?

(2) encodingUtil.new DefaultEncodingProvider()
    -- this can be made simpler when the DefaultEncodingProvider is a static
       nested class, rather than an inner class:
    public static class DefaultEncodingProvider

(3) ResourcesPlugin.getPlugin().getPluginPreferences().getString(ResourcesPlugin.PREF_ENCODING);
    -- I'm not sure but I think it's more correct to do
    ResourcesPlugin.getEncoding();
    -- Or, you do
    ResourcesPlugin.getWorkspace().getRoot().getDefaultCharset();
Comment 18 David McKnight CLA 2008-05-05 13:30:37 EDT
Created attachment 98679 [details]
updated patch as per comment #17

I've updated the patch as per comment #17.
Comment 19 Martin Oberhuber CLA 2008-05-05 13:42:19 EDT
(1) Just one question about this comment:
    // otherwise, the default charset is inherited so no need to set
    Could it happen that a "wrong" encoding is inherited from the parent
    folder, rather than doing fallback on the workspace default encoding?

(2) MANIFEST.MF
    Bundle-RequiredExecutionEnvironment=J2SE-1.6
    Ouch! That hurts. Please REVIEW in the synchronization perspective
    before uploading a patch. Of course this cannot get committed.

(3) The changes in DStoreFileService are intermixed with another fix. Please
    commit those with the other fix.

Comment 20 David McKnight CLA 2008-05-05 14:39:54 EDT
(In reply to comment #19)
> (1) Just one question about this comment:
>     // otherwise, the default charset is inherited so no need to set
>     Could it happen that a "wrong" encoding is inherited from the parent
>     folder, rather than doing fallback on the workspace default encoding?

Well, then the question is whether the parent resource encoding is more or less correct then the workspace default.

> (2) MANIFEST.MF
>     Bundle-RequiredExecutionEnvironment=J2SE-1.6
>     Ouch! That hurts. Please REVIEW in the synchronization perspective
>     before uploading a patch. Of course this cannot get committed.

Oh...that's a mistake with the patch...not planning on committing that (just I lost a 1.4 java from my current environment so I was getting errors).

> (3) The changes in DStoreFileService are intermixed with another fix. Please
>     commit those with the other fix.


So I should leave DStoreFileService completely out of this commit?
Comment 21 Martin Oberhuber CLA 2008-05-05 14:54:23 EDT
(In reply to comment #20)
> Well, then the question is whether the parent resource encoding is more or less
> correct then the workspace default.

I don't know, do you?

> So I should leave DStoreFileService completely out of this commit?

Yes, unless you want to fiddle the two patches apart. In DStoreFileService, the stuff related to this bug is far less than the stuff related to the other bug. Though fiddling them apart is even better if you have the time.

Comment 22 David McKnight CLA 2008-05-05 15:38:51 EDT
(In reply to comment #21)
> (In reply to comment #20)
> > Well, then the question is whether the parent resource encoding is more or less
> > correct then the workspace default.
> I don't know, do you?

I would think the preferred default would be from the parent.

> > So I should leave DStoreFileService completely out of this commit?
> Yes, unless you want to fiddle the two patches apart. In DStoreFileService, the
> stuff related to this bug is far less than the stuff related to the other bug.
> Though fiddling them apart is even better if you have the time.

I committed DStoreFileService separately via Bug 221211.
Comment 23 Martin Oberhuber CLA 2008-05-05 15:52:56 EDT
Ok, let's mark this fixed ... DaveD, you still have a "review?" flag please set this to review+