Bug 145096 - FileLocator.resolve(URL) returns invalid file URL (spaces not escaped)
Summary: FileLocator.resolve(URL) returns invalid file URL (spaces not escaped)
Status: CLOSED WONTFIX
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: Framework (show other bugs)
Version: 3.2   Edit
Hardware: PC Windows XP
: P3 normal with 9 votes (vote)
Target Milestone: ---   Edit
Assignee: equinox.framework-inbox CLA
QA Contact:
URL:
Whiteboard: stalebug
Keywords:
: 212243 214114 246647 (view as bug list)
Depends on:
Blocks:
 
Reported: 2006-06-02 12:43 EDT by Petr Lindovsky CLA
Modified: 2019-10-31 15:36 EDT (History)
21 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Petr Lindovsky CLA 2006-06-02 12:43:19 EDT
Eclipse 3.2RC6:

org.eclipse.core.runtime.FileLocator.resolve(URL) returns file URLs where spaces are not escaped as specified in RFC2396, e.g. file:/C:/Program Files/ instead of file:/C:/Program%20Files/ (see http://java.sun.com/j2se/1.5.0/docs/api/java/net/URL.html).

The problem seems to lie in the implementation of the org.eclipse.osgi.baseadaptor.bundlefile.FileBundleEntry.getFileURL() method which calls file.getURL() instead of file.getURI().toURL().

This e.g. causes compatibility problems with EMF where e.g. org.eclipse.emf.ecore.resource.Resource.getURI() returns a correctly escaped URI.
Comment 1 Jeff McAffer CLA 2006-06-02 15:53:22 EDT
are you seeing a change in this behaviour from previous versions or noticing that it has alwasy returned invalid results?
Comment 2 Petr Lindovsky CLA 2006-06-03 06:33:03 EDT
It was there already in 3.2RC2.

Anyway, the migration to 3.2 broke the path resolution in our project - I'm not sure what, I simply quickly replaced FileLocator.resolve() with a wrapper that replaces spaces with %20 in the return value and it works (an ugly hack). It seems to me that if methods like FileLocator.resolve() guaranteed to return valid URLs, these problems wouldn't exist. And probably these methods should also reject an invalid URL if passed as an argument.

On the other hand, I can imagine that if some client relied on that fact that FileLocator.resolve() returns a non-escaped URL (if of course it isn't passed an escaped one which produces a mixture), it would break their code.
Comment 3 Remy Suen CLA 2007-09-15 11:54:45 EDT
Any updates on this?
Comment 4 Thomas Watson CLA 2007-09-17 09:43:31 EDT
Unfortunately I'm not sure how to tackle this one.  We have had a form of this method (first on Platform and now on FileLocator) since 3.0 and it has always behaved this way WRT spaces. Albeit this is incorrect behavior according to the URI specification but there is a large number of clients that depend on the spaces to not be escaped from these methods.  All of the existing clients will be broken.

The only safe way to do this is to deprecate the existing methods on FileLocator and replace them with new methods which clearly spec that special characters will be escaped according to URI specification.
Comment 5 Jeff Ramsdale CLA 2007-09-17 19:28:31 EDT
(In reply to comment #4)
> Unfortunately I'm not sure how to tackle this one.  We have had a form of this
> method (first on Platform and now on FileLocator) since 3.0 and it has always
> behaved this way WRT spaces. Albeit this is incorrect behavior according to the
> URI specification but there is a large number of clients that depend on the
> spaces to not be escaped from these methods.  All of the existing clients will
> be broken.

"All" would seem to be an overstatement. Some users have undoubtedly wrapped the output of FileLocator.resolve with the proper encoding. Their code may not be effected by fixing this method. Others have likely shipped broken code without realizing it (I suspect this is the largest group). Either of these two groups might benefit from shipping a fix.

The remaining users would seem to be those who know the method is flawed and depend on the flawed behavior. These are the users most likely to know what to do if the method is fixed. And they should feel chagrined for not posting the bug or voting on it in the first place.

-j
Comment 6 Oleg Besedin CLA 2008-01-03 13:23:55 EST
*** Bug 214114 has been marked as a duplicate of this bug. ***
Comment 7 Oleg Besedin CLA 2008-01-03 13:49:17 EST
As Tom said the comment 4, problem is that fixing it will break the clients.

This is further aggravated by Java recommending using File.toURI().toURL() which produces encoded strings. (As opposing to File.toURL() which produces non-encoded strings.)

I think lots of places in Eclipse expect (or produce) non-encoded URLs. Probably, the real solution at this point should be to start specifying in Javadocs if URLs produced or consumed are encoded?
Comment 8 Remy Suen CLA 2008-04-13 12:10:29 EDT
Any actions plans for this for 3.4 or is this going to get deferred?
Comment 9 Oleg Besedin CLA 2008-04-14 10:48:09 EDT
As far as I know there are no plans to address it in 3.4. 

There are several potential ways to handle this:

- Deprecate current APIs that consume or produce non-encoded URLs and create new alternatives (likely there are a lot of such APIs in runtime/OSGi);

- Don't change APIs but add Javadocs that would say if URLs are encided or not;

- Say that all Runtime/OSGi APIs expect non-encoded URLs. It is my understanding that most (all?) URLs in the Runtime/OSGi are expected to be non-encoded.

From a practical side, as long as everybody agrees that we are always dealing with non-encoded URLs, the last solution should not introduce any limitations.

Tom, do you know if OSGi spec says anything about URLs being encoded or not?
Comment 10 Thomas Watson CLA 2008-04-14 10:59:15 EDT
Oleg is correct, there is no plans for 3.4.  This is one area where changing the behavior will have an explosion of unintended consequences.  For the 3.x stream likely would have to go with the last option mentioned in comment 9.  For e4 we should consider deprecating the API dealing with URLs and introducing new API that does proper encoding.

OSGi is mute on URL encoding.  They just refer to the URI/URL specs that define the proper format of a URI/URL.  With this assumption the URLs likely should be encoded.  No one is arguing that the current behavior is correct, but unfortunately many clients of eclipse APIs have come to depend on file: URLs that are not encoded.
Comment 11 David Carver CLA 2008-05-06 20:55:19 EDT
(In reply to comment #10)
> Oleg is correct, there is no plans for 3.4.  This is one area where changing
> the behavior will have an explosion of unintended consequences.  For the 3.x
> stream likely would have to go with the last option mentioned in comment 9. 
> For e4 we should consider deprecating the API dealing with URLs and introducing
> new API that does proper encoding.
> 

URI's should be escaped.  It causes interoperability problems with standard JAVA api's that expect the URI to be encoded correctly.   The implementation is wrong, and should be fixed regardless if it breaks the existing users.  In most cases it shouldn't matter if they are using it correctly.  This is very noticeable when passing the URI to another application like Xerces or Saxon which expect correctly encoded URIs.
Comment 12 Abel Muiño CLA 2008-08-10 14:33:33 EDT
(In reply to comment #10)
> No one is arguing that the current behavior is correct, but
> unfortunately many clients of eclipse APIs have come to depend on file: URLs
> that are not encoded.

On the other hand, many other clients break under this situation, like PDE (Bug 243689).

Comment 13 Abel Muiño CLA 2008-08-10 14:43:02 EDT
Apparently, PDE problem was caused by Ant.
Sorry for the noise.
Comment 14 Oleg Besedin CLA 2008-08-11 10:06:30 EDT
*** Bug 212243 has been marked as a duplicate of this bug. ***
Comment 15 Oleg Besedin CLA 2008-09-09 10:05:28 EDT
*** Bug 246647 has been marked as a duplicate of this bug. ***
Comment 16 Chris McGee CLA 2008-09-09 14:14:35 EDT
I was wondering about one particular side of this issue.

What happens if I have an encoded platform:/plugin URL and I want to open up an input stream/output stream? It seems with the current implementation (3.4) that it is not able to give me a stream if there happens to be "%20" characters in the URL.

I think that the platform should be able to handle opening input streams on encoded and non-encoded URL's just like the way that file and jar:file URL's are handled. Being more robust in this way should not break anybody and would certainly solve my problem. I get these URL's from somewhere I cannot be guaranteed they are always unencoded.
Comment 17 Volker Stolz CLA 2012-04-13 05:13:11 EDT
I just got bitten by this as well on a Hudson continuous integration instance where the workspace name is derived from the job name---which happened to contain a whitespace:

Caused by: java.net.URISyntaxException: Illegal character in path at index 28: file:/rpool/hudson/jobs/rCOS Modeler/workspace/edu.unu.iist.rcos.xslt/xslt/ooSDtransCSD/CheckOOSDtoCSD.xslt
	at java.net.URI$Parser.fail(URI.java:2810)
	at java.net.URI$Parser.checkChars(URI.java:2983)
	at java.net.URI$Parser.parseHierarchical(URI.java:3067)
	at java.net.URI$Parser.parse(URI.java:3015)
	at java.net.URI.<init>(URI.java:577)
	at java.net.URL.toURI(URL.java:918)

The URL was obtained through FileLocator.resolve() in Helios SR 2.
Comment 18 Sebastian Fuchs CLA 2012-07-27 05:43:48 EDT
Any progress on that bug ?

URL bundleUrlCircle = FileLocator.find(Activator.getDefault().getBundle(), new Path("graphic/kreis.svg"), null);
URL fileURLCircle = FileLocator.toFileURL(bundleUrlCircle);
uriCircle = fileURLCircle.toURI();

java.net.URISyntaxException: Illegal character in path at index 28: file:/D:/temp/TWSolution2-64 Bit/TWSolution2-win32.win32.x86_64-I-2012-07-26_14-59-03/configuration/org.eclipse.osgi/bundles/32/1/.cp/graphic/kreis.svg
Comment 19 Carsten Otto CLA 2013-02-03 15:21:10 EST
I, too, would like to see some progress on this bug. For me a way to manually escape the result of resolve() would help, too.
Comment 20 Carsten Otto CLA 2013-02-03 15:53:35 EST
For a pragmatic solution: http://stackoverflow.com/questions/14676966/escape-result-of-filelocator-resolveurl
Comment 21 Kamil Fejfar CLA 2013-07-09 10:42:33 EDT
This bug has same cause as 3109 (I moslty agree with comment https://bugs.eclipse.org/bugs/show_bug.cgi?id=3109#c28)

"Unencoded" URL is not right approach. Arguments?
1. File.toURL() is deprecated, use File.toURI().toURL()
2. As written in URL class - It is the responsibility of the caller to encode any fields, which need to be escaped prior to calling URL, and also to decode any escaped fields, that are returned from URL.

I met this issue while solving XUL runner problem using http://www.eclipse.org/swt/faq.php#specifyxulrunner solution. If Eclipse is installed in directory with space in path, this solution will not work.

I would let current API that returns encoded URLs, but it is up to you.

Also method that returns File from URL (and hides encoding problems) could be usefull. File FileLocator.toFile(URL url).

Thank you.
Comment 22 Kamil Fejfar CLA 2013-07-10 03:30:50 EDT
When I said that URLs should be encoded, I was talking about all APIs (runtime/OSGi) not just FileLocator, that is why I do not preffer to make current APIs deprecated and create a new ones. It should be done together with mentioned bug 3109.
Comment 23 Markus Keller CLA 2016-03-12 05:45:05 EST
(In reply to Oleg Besedin from comment #9)
> From a practical side, as long as everybody agrees that we are always
> dealing with non-encoded URLs, the last solution should not introduce any
> limitations.

Wrong. Non-encoded URLs are lossy. See explanations in bug 3109 comment 28.

The only right solution is to stop using non-encoded URLs, even if that breaks buggy API clients that tried to "encode" wrong URLs (that "encoding" is not a well-defined process and hence cannot be supported). Code that tries to use org.eclipse.core.runtime.URIUtil#toURI(URL) is suspicious and shows that the author either didn't know how URLs work, or was forced to cope with broken APIs.
Comment 24 Eclipse Genie CLA 2019-10-31 15:36:02 EDT
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.