Summary: | Batch compiler writes log using default encoding instead of UTF-8
---|---
Product: | [Eclipse Project] JDT
Component: | Core
Reporter: | Nick Edgar <n.a.edgar>
Assignee: | Olivier Thomann <Olivier_Thomann>
Status: | VERIFIED FIXED
QA Contact: |
Severity: | minor
Priority: | P3
CC: | david_audel, Olivier_Thomann
Version: | 3.3
Flags: | david_audel: review+
Target Milestone: | 3.5 RC1
Hardware: | PC
OS: | Windows XP
Whiteboard: |
Attachments: |

Description
Nick Edgar
2009-05-08 10:39:57 EDT
Note: I'd expect the date line to be in the locale-specific format, which would likely use double-byte characters in the Chinese ('zh') locale.

I also noticed that the line that writes the date:

    this.log.println("<!-- " + new String(dateFormat.format(date).getBytes(), Util.UTF_8) + " -->");//$NON-NLS-1$//$NON-NLS-2$

converts between encodings incorrectly: it converts to bytes using the default encoding, then back to a string using UTF-8. There's no need for this conversion. It should just do:

    this.log.println("<!-- " + dateFormat.format(date) + " -->");//$NON-NLS-1$//$NON-NLS-2$

Hm, the use of the default encoding for the PrintWriter might not be the problem. The name of the log file we're using (in the Ant script) is declared as:

    <property name="compileLog" value="${java.io.tmpdir}/compilelog.xml"/>

The Logger code tries to handle XML files differently:

    int index = logFileName.lastIndexOf('.');
    if (index != -1) {
        if (logFileName.substring(index).toLowerCase().equals(".xml")) { //$NON-NLS-1$
            this.log = new GenericXMLWriter(new OutputStreamWriter(new FileOutputStream(logFileName, false), Util.UTF_8), Util.LINE_SEPARATOR, true);

which looks good to me. We're invoking the Ant javac task with:

    <javac destdir="${build.output}" failonerror="false" debug="on" debuglevel="2" includes="**/*.java, *.java" srcdir="${workingDir}">
        <compilerarg line="-log ${compileLog}"/>
    </javac>

It may be that the expansion of ${java.io.tmpdir} is confusing things (though it works OK for me on WinXP in English Canada locale). I'll dig further.

Turns out we were running an older version of the compiler (from 3.3). Looks like the main issue with the encoding was fixed in 3.4. Earlier versions (3.2 and 3.3) use the default encoding:

    this.log = new GenericXMLWriter(new FileOutputStream(logFileName, false), Util.LINE_SEPARATOR, true);

You might still want to consider the minor issue in comment 1.

Reduce to minor, as the problem is only with the date encoding.

Created attachment 134997: Proposed fix

Patch fixes the problem mentioned in comment 1. David, please review.

Patch looks good.

Released for 3.5RC1.

Code verification is required in order to verify this fix.

Verified using I20090513-2000
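
The round trip described in comment 1 can be reproduced outside the compiler. The following is a minimal sketch, not the JDT code: the class name `DateEncodingRoundTrip`, the method `roundTrip`, and the sample date string are all hypothetical, and ISO-8859-1 is used only as a stand-in for a non-UTF-8 platform default encoding (the actual default on the reporter's machines is not stated in this report).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DateEncodingRoundTrip {

    // Mimics the pattern new String(text.getBytes(), Util.UTF_8):
    // encode with one charset, then decode the bytes as UTF-8.
    // 'encodeWith' is an assumption standing in for the platform default.
    static String roundTrip(String text, Charset encodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical locale-formatted date containing a non-ASCII character.
        String date = "8 f\u00E9vr. 2009";

        // When the encoding charset is UTF-8, the round trip is a no-op.
        String viaUtf8 = roundTrip(date, StandardCharsets.UTF_8);

        // When it is not, non-ASCII bytes are misread as UTF-8 and damaged.
        String viaLatin1 = roundTrip(date, StandardCharsets.ISO_8859_1);

        // Pure ASCII text survives either way, which can hide the problem.
        String ascii = roundTrip("2009-05-08", StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 round trip intact:   " + date.equals(viaUtf8));
        System.out.println("Latin-1 round trip intact: " + date.equals(viaLatin1));
        System.out.println("ASCII round trip intact:   " + "2009-05-08".equals(ascii));
    }
}
```

Since a Java `String` is already encoding-independent, passing `dateFormat.format(date)` straight to a writer that was constructed with an explicit UTF-8 charset avoids the problem entirely, which is what the suggested one-line replacement does.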