Bug 240034 - [buildpath] Eclipse ignores .classpath file if it is encoded in UTF8 with BOM
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core
Version: 3.4
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.5 M2
Assignee: Jerome Lanneluc CLA
 
Reported: 2008-07-08 13:45 EDT by Michael Schierl CLA
Modified: 2008-09-15 11:19 EDT


Attachments
Proposed fix and regression test (6.29 KB, patch)
2008-09-08 07:18 EDT, Jerome Lanneluc CLA

Description Michael Schierl CLA 2008-07-08 13:45:04 EDT
Build ID: I20080617-2000

Steps To Reproduce:
1. Create a new project
2. Add something nontrivial to the classpath (e.g. another project)
3. Open the .classpath file in an external editor.
4. Save the file in the external editor (NB: the external editor adds a UTF-8 BOM, i.e. 0xEF 0xBB 0xBF, to the beginning of the file). You can also use a hex editor to add that BOM for testing purposes.
5. Refresh the project in Eclipse - classpath is empty
6. Remove the BOM manually from the .classpath file: Project is working again.
7. Do the same with the .project file: the project will still work as before. Adding anything else to the beginning of the .project file will make the project stop working. So the XML parser itself can parse a file with a BOM if it is .project, but not if it is .classpath.

Expected result: BOMs work in .classpath files as well.
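
For step 4, when no BOM-writing editor is at hand, a few lines of Java can stand in for the hex editor. This is only a minimal sketch using the JDK; the class name BomPrepender and its methods are made up for illustration and are not part of Eclipse.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BomPrepender {
    /** The UTF-8 byte order mark: 0xEF 0xBB 0xBF. */
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /** Returns a copy of data with the BOM prepended (no check for an existing BOM). */
    static byte[] prependBom(byte[] data) {
        byte[] out = new byte[UTF8_BOM.length + data.length];
        System.arraycopy(UTF8_BOM, 0, out, 0, UTF8_BOM.length);
        System.arraycopy(data, 0, out, UTF8_BOM.length, data.length);
        return out;
    }

    /** Rewrites the given file in place with a leading BOM. */
    static void prependBom(Path file) throws IOException {
        Files.write(file, prependBom(Files.readAllBytes(file)));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BomPrepender <path-to-.classpath>");
            return;
        }
        prependBom(Paths.get(args[0]));
    }
}
```

Run it against the .classpath file of a scratch project, then refresh the project in Eclipse as in step 5.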
Comment 1 Dani Megert CLA 2008-07-09 04:40:16 EDT
Can reproduce. Looks like JDT Core isn't using the resource APIs to read the file.
Comment 2 Jerome Lanneluc CLA 2008-09-02 05:32:58 EDT
Actually we are using IFile.getContents(boolean) to read the file. Dani, did you have something else in mind?

Also, steps 3 and 4 are not very clear. What external editor should I use? Tampering with bytes in the file doesn't seem to be a valid use case (i.e. it is not supported).
Comment 3 Dani Megert CLA 2008-09-02 12:26:56 EDT
IFile.getContents(*) gives you a stream. Do you use the correct encoding when reading from the stream? The easiest way to read the file would be to use file buffers. If this is not an option you can take a look at the code in org.eclipse.core.internal.filebuffers.ResourceTextFileBuffer.setDocumentContent(IDocument, IFile, String).
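
To make the encoding question concrete: decoding the bytes as UTF-8 is not enough on its own, because Java's UTF-8 decoder keeps a leading BOM as the character U+FEFF instead of dropping it. Below is a hedged sketch of a BOM-aware read using only the JDK. BomAwareDecoder, decodeUtf8 and readUtf8 are illustrative names; this is not the code from the file buffers plug-in or from any patch on this bug.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class BomAwareDecoder {
    /** Decodes UTF-8 bytes, skipping a leading 0xEF 0xBB 0xBF if present. */
    static String decodeUtf8(byte[] bytes) {
        boolean hasBom = bytes.length >= 3
                && bytes[0] == (byte) 0xEF
                && bytes[1] == (byte) 0xBB
                && bytes[2] == (byte) 0xBF;
        int offset = hasBom ? 3 : 0;
        return new String(bytes, offset, bytes.length - offset, StandardCharsets.UTF_8);
    }

    /** Drains the stream (e.g. one obtained from IFile.getContents) and decodes it. */
    static String readUtf8(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return decodeUtf8(buffer.toByteArray());
    }
}
```

The sketch assumes the file really is UTF-8; a file in another encoding would need its own detection.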
Comment 4 Michael Schierl CLA 2008-09-02 13:01:11 EDT
@Jerome: In my case it was an older version of UltraEdit (a commercial text editor available from www.UltraEdit.com), when using the "Replace in multiple files" feature. The latest version seems to have an option to enable/disable generation of BOMs for UTF-8 files, but unfortunately upgrades are not free. I once also had a similar problem with Visual Studio.NET (but in that case not with an Eclipse file). Visual Studio 2005/2008, however, preserves an existing BOM but does not add a new one. If you have VS2005/2008, you can reproduce the issue by selecting File->Save Advanced and choosing "UTF-8 with signature" as the file format. Eclipse does not create a BOM in any case, so you cannot reproduce it from Eclipse itself :(

If you know the issue, it is easy to fix, but since both Notepad and Wordpad silently ignore the BOM, there is no way to "see" the problem without a hex editor.
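
If no hex editor is installed, printing the first few bytes of the file is enough to "see" the BOM. A small JDK-only sketch follows; the class LeadingBytes and its hex method are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class LeadingBytes {
    /** Formats the first count bytes of data as upper-case hex, separated by spaces. */
    static String hex(byte[] data, int count) {
        StringBuilder sb = new StringBuilder();
        int limit = Math.min(count, data.length);
        for (int i = 0; i < limit; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LeadingBytes <file>");
            return;
        }
        // A file that starts with a UTF-8 BOM shows "EF BB BF" here.
        System.out.println(hex(Files.readAllBytes(Paths.get(args[0])), 3));
    }
}
```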
Comment 5 Jerome Lanneluc CLA 2008-09-08 07:14:43 EDT
Thanks Dani and Michael. I will make sure that we support this scenario.
Comment 6 Jerome Lanneluc CLA 2008-09-08 07:18:29 EDT
Created attachment 111947
Proposed fix and regression test

Note: to verify that the fix is in, one can either use an external editor as described by Michael, or run the regression test without the fix and confirm that it fails.
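
The attached test is not reproduced in this report. As a rough, JDK-only illustration of what such a check might exercise, the sketch below parses the same XML once from raw bytes and once from an already-decoded string. It also shows one plausible reason why two files could behave differently with the same parser, which this thread does not confirm as the actual cause. XmlBomCheck and its methods are hypothetical names, and the sample XML is invented.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class XmlBomCheck {
    private static DocumentBuilder newBuilder() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder();
    }

    /** True if the parser accepts the XML when given the raw byte stream. */
    static boolean parsesFromBytes(byte[] xml) {
        try {
            newBuilder().parse(new ByteArrayInputStream(xml));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    /** True if the parser accepts the XML when given already-decoded characters. */
    static boolean parsesFromString(String xml) {
        try {
            newBuilder().parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String body = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<classpath><classpathentry kind=\"src\" path=\"src\"/></classpath>";
        // "\uFEFF" encodes to the three BOM bytes 0xEF 0xBB 0xBF in UTF-8.
        byte[] withBom = ("\uFEFF" + body).getBytes(StandardCharsets.UTF_8);
        String decoded = new String(withBom, StandardCharsets.UTF_8); // still starts with U+FEFF

        System.out.println("from bytes:           " + parsesFromBytes(withBom));
        System.out.println("from decoded string:  " + parsesFromString(decoded));
        System.out.println("from stripped string: " + parsesFromString(decoded.substring(1)));
    }
}
```

With the parser bundled in the JDK, I would expect the byte-stream and stripped-string cases to succeed and the decoded-string case to fail, since a parser handed a character stream no longer gets the chance to skip the BOM itself.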
Comment 7 Jerome Lanneluc CLA 2008-09-08 09:23:50 EDT
Fix and test released for 3.5M2
Comment 8 Olivier Thomann CLA 2008-09-15 11:19:30 EDT
Verified for 3.5M2 with build I20080914-2000, using an external editor to add the BOM header.