Summary: | Allow to specify encoding for source attachments | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | [Eclipse Project] JDT | Reporter: | Victor Homyakov <vkhomyackov> | ||||||
Component: | Core | Assignee: | Jay Arthanareeswaran <jarthana> | ||||||
Status: | VERIFIED FIXED | QA Contact: | |||||||
Severity: | enhancement | ||||||||
Priority: | P3 | CC: | amj87.iitr, daniel_megert, deepakazad, satyam.kandula, srikanth_sankaran | ||||||
Version: | 3.8 | ||||||||
Target Milestone: | 3.8 M5 | ||||||||
Hardware: | All | ||||||||
OS: | All | ||||||||
Whiteboard: | |||||||||
Bug Depends on: | |||||||||
Bug Blocks: | 368015 | ||||||||
Attachments: |
|
Description
Victor Homyakov
2011-10-19 04:59:47 EDT
The library provider has two choices: 1. it offers Javadoc for each platform 2. it generates the Javadoc so that the <META http-equiv="Content-Type" content="text/html; charset=UTF-8"> directive is inside the Javadoc. Eclipse does nothing but display the given HTML in the browser widget i.e. I would assume that the same Javadoc will also not render correctly when you open it with your browser outside Eclipse. I was not talking about HTML. I was talking about Javadoc in source codes. (In reply to comment #2) > I was not talking about HTML. I was talking about Javadoc in source codes. I see. We've added the support to specify the encoding in the 3.7 release. *** This bug has been marked as a duplicate of bug 303511 *** > I see. We've added the support to specify the encoding in the 3.7 release. I am using the 3.7: Eclipse Java EE IDE for Web Developers. Version: Indigo Service Release 1 Build id: 20110916-0149 But I still don't see the way to specify the encoding for one particular library and make sources readable. > *** This bug has been marked as a duplicate of bug 303511 *** I've read bug 303511 and see that my case is falling under scenario 3 (The source attachment is an external folder or archive), and this case was not fixed in patch - Eclipse uses project encoding, exactly as I described. > I've read bug 303511 and see that my case is falling under scenario 3 (The
> source attachment is an external folder or archive), and this case was not
> fixed in patch - Eclipse uses project encoding, exactly as I described.
It should use the workspace encoding for external JARs, hence as a workaround you can set the workspace encoding to Cp1251 or put the attachments into a folder in your workspace which as the Cp1251 encoding.
(In reply to comment #5) > It should use the workspace encoding for external JARs, hence as a workaround > you can set the workspace encoding to Cp1251 I cannot set the workspace encoding to more than one value (UTF-8 and Cp1251) simultaneously. So either JAR with UTF-8 source or JAR with Cp1251 source will be unreadable. As I wrote, the preferred way (for me) is to assign encoding to each used source manually. > or put the attachments into a > folder in your workspace which as the Cp1251 encoding. This works great for manually attached libraries. And how to deal with Maven dependencies (sources are attached by Maven plugin from ~/.m2/repository/)? I do see your point. We can reopen this bug as a feature request if you want. (In reply to comment #7) > We can reopen this bug as a feature request if you want. Reopen, please. Thanks for your help and patience! Jay, please take a look. Thanks! This is a follow up of enhancement 303511. We'll need to see whether we have time to do this. Will investigate/consider for M4. Since Jay had to go on unavoidable unplanned time off for several days, this is not ready: retargetting to 3.8 M5. Created attachment 209089 [details]
Proposed patch
I have introduced a new classpath attribute for storing the source attachment encoding in the .classpath file. Now the order of encoding selection will be in the following order:
1. Encoding set on the resource representing the source file (.java or .zip)
2. Encoding specified for the corresponding classpath entry
3. Encoding of the project enclosing the source, if a member
4. Workspace encoding
I will release this fix after some more testing.
Is there a bug to handle the UI part already?
> I will release this fix after some more testing. Please wait until we have verified it through the UI part. > Is there a bug to handle the UI part already? Not that I'm aware of. Raised bug 368015 for the UI part. (In reply to comment #13) > Now the order of encoding selection will be in > the following order: > > 1. Encoding set on the resource representing the source file (.java or .zip) > 2. Encoding specified for the corresponding classpath entry > 3. Encoding of the project enclosing the source, if a member > 4. Workspace encoding This looks quite wrong to me. Take the case when the source attachment is a source folder in the workspace - You will almost never have the encoding set on individual .java files, hence you move to step 2 - If an encoding is specified on the classpath it will be used. - However, the canonical encoding in this case is the encoding on the container (project or workspace), but it is never used. In short, for source attachments from the workspace the new attribute should not have any meaning. Ideally, we should not even allow to set the attribute for attachments from workspace, but I guess it is also OK if we silently ignore it. (In reply to comment #16) > This looks quite wrong to me. Take the case when the source attachment is a > source folder in the workspace > - You will almost never have the encoding set on individual .java files, hence > you move to step 2 Though I agree that it's unlikely, we can't not support that case. > - If an encoding is specified on the classpath it will be used. > - However, the canonical encoding in this case is the encoding on the container > (project or workspace), but it is never used. > > In short, for source attachments from the workspace the new attribute should > not have any meaning. Ideally, we should not even allow to set the attribute > for attachments from workspace, but I guess it is also OK if we silently ignore > it. If you look at comment #4, one of the requirements is to be able to support encoding at classpath entry (or library) level. If the user knows that the source attachment is inside the project/workspace and fine with the project/workspace encoding being used, he can very well do that by not setting the encoding for the library. (In reply to comment #17) > If you look at comment #4, one of the requirements is to be able to support > encoding at classpath entry (or library) level. If the user knows that the > source attachment is inside the project/workspace and fine with the > project/workspace encoding being used, he can very well do that by not setting > the encoding for the library. By that logic the algorithm should be 1. Encoding specified for the corresponding classpath entry 2 a. Encoding set on the resource representing the source file (.java or .zip) b. Encoding of the project enclosing the source, if a member c. Workspace encoding 2 a,b,c constitute a well defined algorithm to compute the encoding for an artifact in the workspace. I don't think we should break that. In any case I still think that it would be better to not use the new classpath attribute for source attachments from workspace. (In reply to comment #18) > By that logic the algorithm should be > 1. Encoding specified for the corresponding classpath entry > 2 a. Encoding set on the resource representing the source file (.java or .zip) > b. Encoding of the project enclosing the source, if a member > c. Workspace encoding > > 2 a,b,c constitute a well defined algorithm to compute the encoding for an > artifact in the workspace. I don't think we should break that. > > In any case I still think that it would be better to not use the new classpath > attribute for source attachments from workspace. (b) and (c) are not explicitly set by the user or the author of the file. It's an algorithm used by the tool and we shouldn't really let it dictate terms to the user. If there is an encoding set explicitly, be it a resource, or a classpath entry, it must be given a higher priority than a default. Think about this scenario: User has a source attachment, for which he wants to use a particular encoding. And let's say a particular file X.java in the source attachment has a different encoding than other files. How else can we support this? (In reply to comment #19) > (b) and (c) are not explicitly set by the user or the author of the file. It's > an algorithm used by the tool and we shouldn't really let it dictate terms to > the user. If there is an encoding set explicitly, be it a resource, or a > classpath entry, it must be given a higher priority than a default. > > Think about this scenario: User has a source attachment, for which he wants to > use a particular encoding. And let's say a particular file X.java in the source > attachment has a different encoding than other files. How else can we support > this? ... so you are saying that the encoding for the source attachment will be different from the encoding for the same resource accessed from its original location in the workspace? (In reply to comment #20) > ... so you are saying that the encoding for the source attachment will be > different from the encoding for the same resource accessed from its original > location in the workspace? Yes, only if the user has opted for a different encoding than the project's encoding and the individual resource doesn't have any encoding set on it. And I would assume that the user knows what he is doing. When the resource is directly accessed, the content may not be displayed correctly, which is the case even today. (In reply to comment #21) > And I > would assume that the user knows what he is doing. When the resource is > directly accessed, the content may not be displayed correctly, which is the > case even today. The idea should be to specify one property in only once place, and for source attachments from workspace I think the encoding is best defined in the original location, something we should not change. As part of bug 368015 we can also prevent users from setting the encoding for attachments from workspace. Also, the algorithm to compute the encoding is not trivial it needs to be documented along with the new API org.eclipse.jdt.core.IClasspathAttribute.SOURCE_ATTACHMENT_ENCODING. (In reply to comment #22) > The idea should be to specify one property in only once place, and for source > attachments from workspace I think the encoding is best defined in the original > location, something we should not change. > > As part of bug 368015 we can also prevent users from setting the encoding for > attachments from workspace. Are you saying that the user will have to set the encoding on the library's source attachment for external folders and on the resource if the source attachment is part of the workspace? The user will be confused if he doesn't see the encoding for some attachments, won't he? For consistency sake, I would prefer it to be provided for all libraries in one place - where we attach the source. Perhaps, we can warn the user if it's an internal resource and if there is already an encoding set on the resource. > Also, the algorithm to compute the encoding is not trivial it needs to be > documented along with the new API > org.eclipse.jdt.core.IClasspathAttribute.SOURCE_ATTACHMENT_ENCODING. I agree, the document should be updated. The encoding property must only be defined and editable in the UI for case 3 from bug 303511 comment 1. If the property is not set, the workspace encoding is used. If possible JDT Core should also define and implement the property like that. However, I'm OK, if only the UI imposes the restriction. (In reply to comment #24) > the UI imposes the restriction. I have imposed the restriction in bug 368015. Jay, your patch is OK for me once you update the doc. (In reply to comment #23) > > Also, the algorithm to compute the encoding is not trivial it needs to be > > documented along with the new API > > org.eclipse.jdt.core.IClasspathAttribute.SOURCE_ATTACHMENT_ENCODING. > > I agree, the document should be updated. (In reply to comment #25) > (In reply to comment #24) > > the UI imposes the restriction. > I have imposed the restriction in bug 368015. > > Jay, your patch is OK for me once you update the doc. Thanks, Deepak. I have a patch with the updated doc. I will also test this fix with the UI and after that I will post/release the patch. Will do that after completing the 3.7.2 testing. (In reply to comment #14) > > I will release this fix after some more testing. > Please wait until we have verified it through the UI part. You're good to go now. Created attachment 209785 [details]
Updated patch
Same fix, with update documentation for the new classpath attribute.
(In reply to comment #28) > Created attachment 209785 [details] > Updated patch > > Same fix, with update documentation for the new classpath attribute. Tested the fix with the UI portion (bug 368015) and things look good, though I could test only the external libraries through the UI. Released with few cosmetic changes on the doc + additional tests for external libraries. http://git.eclipse.org/c/jdt/eclipse.jdt.core.git/commit/?id=a74cc2e7acd1e0aec09bded5c981fd784a55b04c I got a mid-air collision while trying to say... "Hold on with releasing the patch. Looks like some more work needs to be done to support jars in a container e.g. rt.jar. I am currently investigating this." (In reply to comment #30) > I got a mid-air collision while trying to say... > > "Hold on with releasing the patch. Looks like some more work needs to be done > to > support jars in a container e.g. rt.jar. I am currently investigating this." I did play around with containers and they looked to be alright. Let me know your findings. We still have time for the warm-up build. (In reply to comment #31) > I did play around with containers and they looked to be alright. Really? Can you view and edit the source attachment encoding for rt.jar ? (In reply to comment #32) > Really? Can you view and edit the source attachment encoding for rt.jar ? Arrg, I see the value entered doesn't get stored if I come back again! But it's okay for individual jars. (In reply to comment #33) > (In reply to comment #32) > > Really? Can you view and edit the source attachment encoding for rt.jar ? > > Arrg, I see the value entered doesn't get stored if I come back again! But it's > okay for individual jars. Reopening then. (In reply to comment #32) > (In reply to comment #31) > > I did play around with containers and they looked to be alright. > Really? Can you view and edit the source attachment encoding for rt.jar ? Do we have a consensus as to whether the JDT/Core side of things is incorrect or merely incomplete (or is complete and is not really contributing to the problem scenario) ? This bug can be closed now. (See also bug 368015 comment 17) Verified for 3.8M5 using build I20120123-1800 Does not work in 3.8 (Version: 3.8.0 Build id: M20120626-2030) for sources downloaded by Maven plugin (m2e) - encoding is always "Default (UTF-8)". Steps to reproduce: 1. Install M2E - Maven Integration for Eclipse (version 1.1.0.20120530-0009 for Eclipse 3.8). 2. Window -> Preferences -> Maven -> check "Download Artifact Sources" option. 3. Create Maven project or enable Maven dependency management on existing project (right-click on project -> Configure -> Convert to Maven Project). 4. Add to project dependency with sources in non-UTF-8 encoding. 5. Open Project Explorer, go to project -> Maven Dependencies, right click on dependency, change encoding, apply changes. After reopening encoding will be again "Default (UTF-8)". (In reply to comment #38) > Does not work in 3.8 (Version: 3.8.0 Build id: M20120626-2030) for sources > downloaded by Maven plugin (m2e) - encoding is always "Default (UTF-8)". > > Steps to reproduce: > 1. Install M2E - Maven Integration for Eclipse (version 1.1.0.20120530-0009 for > Eclipse 3.8). > 2. Window -> Preferences -> Maven -> check "Download Artifact Sources" option. > 3. Create Maven project or enable Maven dependency management on existing > project (right-click on project -> Configure -> Convert to Maven Project). > 4. Add to project dependency with sources in non-UTF-8 encoding. > 5. Open Project Explorer, go to project -> Maven Dependencies, right click on > dependency, change encoding, apply changes. After reopening encoding will be > again "Default (UTF-8)". Please file a new bug report with steps to reproduce the problem using a plain Eclipse SDK: http://download.eclipse.org/eclipse/downloads/drops4/R-4.2-201206081400/ to rule out Maven specific issue. (In reply to comment #39) > Please file a new bug report with steps to reproduce the problem using a plain > Eclipse SDK: > http://download.eclipse.org/eclipse/downloads/drops4/R-4.2-201206081400/ > to rule out Maven specific issue. OK |