Summary: | [compiler] Provide XML output option for Eclipse compiler | |
---|---|---|---
Product: | [Eclipse Project] JDT | Reporter: | Nick Crossley <ndjc>
Component: | Core | Assignee: | Olivier Thomann <Olivier_Thomann>
Status: | VERIFIED FIXED | QA Contact: |
Severity: | enhancement | |
Priority: | P3 | CC: | philippe_mulet
Version: | 3.0 | |
Target Milestone: | 3.1 M5 | |
Hardware: | All | |
OS: | All | |
Whiteboard: | | |
Attachments: | | |
Description
Nick Crossley
2004-09-20 23:35:19 EDT
It would be trivial to add this into the batch compiler. Don't hesitate to provide a patch.

OK, I'll work on a patch/suggested implementation. It probably won't be until December that I'll get a chance to do this. Would you prefer my patch to be based on the CVS HEAD at that time, or against the latest 3.1 milestone or integration build? Could you also please point me at some existing code in Eclipse that shows your preferred techniques for writing XML.

A patch against HEAD would be better. You can see an example of writing XML in org.eclipse.jdt.internal.core.JavaProject#encodeClasspath(...)

Any news on that front?

I expect to work on this during my Christmas break over the next two weeks. Look for an update in the first week in January when I return.

ok, if you haven't started yet, I'd like to provide a first implementation that you could review.

That would be great!

Do you think such output would be sufficient?

<?xml version="1.0" encoding="UTF-8"?>
<compiler name="Eclipse Java Compiler" version="0.529, pre-3.1.0 milestone-4">
  <problems>
    <problem start="78" end="81" severity="ERROR" line="3" source="C:\tests_sources\Test.java" id="IncompatibleReturnType">
      <message value="The return type is incompatible with Writer.append(char), PrintWriter.append(char)"/>
      <arguments>
        <argument value="java.io.Writer.append(char), java.io.PrintWriter.append(char)"/>
      </arguments>
    </problem>
    <problem start="78" end="81" severity="ERROR" line="3" source="C:\tests_sources\Test.java" id="IncompatibleReturnType">
      <message value="The return type is incompatible with Writer.append(CharSequence, int, int), PrintWriter.append(CharSequence, int, int)"/>
      <arguments>
        <argument value="java.io.Writer.append(CharSequence, int, int), java.io.PrintWriter.append(CharSequence, int, int)"/>
      </arguments>
    </problem>
    <problem start="78" end="81" severity="ERROR" line="3" source="C:\tests_sources\Test.java" id="IncompatibleReturnType">
      <message value="The return type is incompatible with Writer.append(CharSequence), PrintWriter.append(CharSequence)"/>
      <arguments>
        <argument value="java.io.Writer.append(CharSequence), java.io.PrintWriter.append(CharSequence)"/>
      </arguments>
    </problem>
    <problem start="287" end="297" severity="WARNING" line="11" source="C:\tests_sources\Test.java" id="UnnecessaryCast">
      <message value="Unnecessary cast from String to String"/>
      <arguments>
        <argument value="java.lang.String"/>
        <argument value="java.lang.String"/>
      </arguments>
    </problem>
  </problems>
  <problem_summary problems="4" errors="3" warnings="1"/>
  <command_line>
    <argument value="C:\tests_sources\Test.java"/>
    <argument value="-1.5"/>
    <argument value="-source"/>
    <argument value="1.4"/>
    <argument value="-g"/>
    <argument value="-d"/>
    <argument value="c:\tests_sources"/>
    <argument value="-verbose"/>
    <argument value="-classpath"/>
    <argument value="C:\tests_sources"/>
    <argument value="-log"/>
    <argument value="c:\log.xml"/>
    <argument value="-warn:+uselessTypeCheck"/>
  </command_line>
</compiler>

Created attachment 16918 [details]
Apply on HEAD
Here is the corresponding implementation. We can still discuss if we want more
information. The log is generated only in case of errors or warnings.
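As an illustration of how a script might consume the draft format shown above, here is a minimal Java sketch that tallies problem elements by their severity attribute. The class and method names are hypothetical and not part of JDT; the element and attribute names are taken from the sample output in this comment and may differ in later patches.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of JDT: counts <problem> elements per severity.
final class LogSeverityCount {

    // Parses the log text and returns a map such as {ERROR=3, WARNING=1}.
    static Map<String, Integer> countBySeverity(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Map<String, Integer> counts = new TreeMap<>();
            // Assumes the draft layout: every problem is a <problem> element
            // carrying a "severity" attribute.
            NodeList problems = doc.getElementsByTagName("problem");
            for (int i = 0; i < problems.getLength(); i++) {
                Element problem = (Element) problems.item(i);
                counts.merge(problem.getAttribute("severity"), 1, Integer::sum);
            }
            return counts;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse compiler log", e);
        }
    }
}
```

The counts obtained this way can be compared with the errors and warnings attributes of the problem_summary element as a consistency check.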
This is very much what I was looking for - thanks! A couple of questions:
1. I presume the 'start' and 'end' attributes are character or byte locations of the error in the given source line? What if the error extends over more than one line? (The current messages produced by the compiler sometimes include more than one line of source.)
2. You say the log is only produced in case of errors or warnings - wouldn't that make it a little harder to script? And wouldn't the compiler and command line information be useful in the event of a successful compiler run? I would suggest producing an XML log if a command line flag was given to request it, regardless of the number of errors or warnings.

I suggest the following format:

<compiler ....>
  <sources>
    <source path="......">
      <problem_summary problems="4" errors="3" warnings="1">
        <problem ...>
      </problem_summary>
      <classfile path="...."/>
      <tasks>
        <task message="...."/>
      </tasks>
    </source>
  </sources>
  <command-line> ... </command-line>
  <stats ... />
</compiler>

This would make it possible to list all compiled source files, all errors for each source file, and all class files generated for each source file. Would this be good enough?

Shouldn't the classpath and options also be surfaced? On the command line, they need to be decoded.

The command-line part would include the whole command line argument with this format:

<command_line>
  <argument value="C:\tests_sources\Test.java"/>
  <argument value="-1.5"/>
  <argument value="-source"/>
  <argument value="1.4"/>
  <argument value="-g"/>
  <argument value="-d"/>
  <argument value="c:\tests_sources"/>
  <argument value="-verbose"/>
  <argument value="-classpath"/>
  <argument value="C:\tests_sources"/>
  <argument value="-log"/>
  <argument value="c:\log.xml"/>
  <argument value="-warn:+uselessTypeCheck"/>
</command_line>

Sorry if this was unclear. Is this enough? If not, let me know what you expect.

Created attachment 16996 [details]
New patch to apply on HEAD
Created attachment 16997 [details]
Corresponding xml file
The xml log is generated as soon as the log file name ends with ".xml". The
source start/source end values include the characters on multiple lines if the
error spans more than one line. The line value is the line number where the
problem starts.
Hope this is close to what you want. If yes, I will release that first draft
shortly.
It all looks very good, and is just what I was looking for - but I still do not fully understand the 'start' and 'end' attributes on an error. Suppose I see one of your examples:

  <problem start="78" end="81" severity="ERROR" line="3" ...>

From reading the XML, how do I distinguish between an error that starts at position 78 of line 3 and ends at position 81 of line 3, vs. an error that starts at position 78 of line 3 and ends at position 81 of line 4 or 5? Do I do so by counting the number of lines in the detailed_message element - that is, do I assume the first line shown in the detailed message is the start line, and the last line before the ^^^^ indicators is the last line of the error? That's possible, but seems a little fragile. Perhaps the start and end attributes should contain the corresponding line numbers:

  <problem start="3.78" end="4.81" severity="ERROR" ...>

That's a little harder to parse in the most common single-line error case, but better than parsing the detailed_message in the more general multi-line case.

sourceStart and sourceEnd are character positions in the source code. They are not relative to the corresponding line. They are absolute positions in the source code. The first character of the source file is 0 and the last one is file.length - 1. So you don't know if there is a new line in the middle, but if you extract the characters between sourceStart (inclusive) and sourceEnd (exclusive) from the source code, you get the piece of code that is causing the problem. Isn't this enough?

Thanks - now I understand. Yes, that is perfectly acceptable! As far as I am concerned, you can resolve the bug - will it get into M5?

Hopefully yes. I will write a DTD for the corresponding format. I will close this PR when everything is released in HEAD.

Maybe rename start/end into charStart/charEnd to be more obvious.

Created attachment 17043 [details]
Apply on HEAD
Latest patch. I changed the name. I also include an internal DTD in each log
file. I could use an external DTD, but I didn't have a URL to specify. Maybe we
can provide the DTD on the JDT/Core web page in the development section?
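To make the absolute character positions (renamed charStart/charEnd in this patch) concrete, here is a small Java sketch of how a log reader might recover the offending text and check whether a problem spans more than one line. The class and method names are hypothetical. Note that the thread describes the end position as exclusive, whereas the public IProblem.getSourceEnd() Javadoc describes it as inclusive; since the thread does not settle which applies to the log, the sketch leaves the choice to the caller as an explicit flag.

```java
// Hypothetical helper, not part of JDT: works on absolute 0-based offsets.
final class ProblemRegion {

    // Returns the source text covered by the problem.
    // endInclusive = true treats charEnd as the last offending character;
    // false treats it as one past the last offending character.
    static String extract(String source, int charStart, int charEnd, boolean endInclusive) {
        int end = endInclusive ? charEnd + 1 : charEnd;
        return source.substring(charStart, end);
    }

    // 1-based line number of an absolute offset, counting '\n' as the
    // line terminator (an assumption; a bare '\r' is not handled).
    static int lineOf(String source, int offset) {
        int line = 1;
        for (int i = 0; i < offset && i < source.length(); i++) {
            if (source.charAt(i) == '\n') {
                line++;
            }
        }
        return line;
    }

    // True when the start and end offsets fall on different lines.
    static boolean spansMultipleLines(String source, int charStart, int charEnd) {
        return lineOf(source, charStart) != lineOf(source, charEnd);
    }
}
```

With the source file loaded into a string, lineOf applied to charStart should agree with the line attribute of the problem, which gives a cheap sanity check on the offsets.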
Created attachment 17044 [details]
Example of log files that can be successfully validated
Created attachment 17046 [details]
Apply on HEAD
This patch makes the log file point to an external DTD file called
compiler.dtd that is located in the same folder as the log file. It makes the
log file a bit smaller.
Created attachment 17047 [details]
Example of log file
Created attachment 17048 [details]
DTD file
Created attachment 17056 [details]
New DTD file
Created attachment 17057 [details]
New patch to apply on HEAD
This should be the final one. Let me know if this fits your expectations. If
yes, it will be released shortly, once I have run some benchmarks and if the
performance is acceptable.
The sample output all looks good to me. I have not actually tried to build and run the patched compiler myself, but feel no specific need to do so.

First draft has been released.

Fixed in HEAD. I will reopen if major problems are found.

Created attachment 17405 [details]
New DTD file
While working on converting this xml to html, I realized that I introduced
unnecessary complexity in the problem element.
The problem element should contain the problem_source and the message as
attributes rather than nested elements.
This is a proposal and it has not been released yet.
What do you think?
I also don't know what to do with the problem_source. The idea of this entry was to provide the source that is causing the problem, but if I don't provide any context I don't find this useful. In the batch compiler, we do provide some context by underlining the corresponding part of the line. For example, we provide this:

(at line 14)
    return (String) "";
           ^^^^^^^^^^^
Unnecessary cast from String to String

In order to let the user render the context the way he wants, I'd like to provide the following information inside the problem_source:

    return #(String) ""#;

So I don't preserve the underlines. The relevant part of the line is between '#'. This also allows converters to parse that string and extract what they want. Then an HTML converter could underline the guilty part of the line using HTML tags, whereas a TXT converter would underline the part of the source code using '^'. Do you have a better idea?

I thought there was an issue with white space normalization in XML attributes - that is, white space was always normalized in attributes, whereas that was controllable in elements? Since you might not want white space normalization in source and messages, do you really want those as attributes?

As for marking the source, I agree that some context is useful. I have no objection to your proposal in principle, though if you use '#' as the marker, how would an actual # character be shown? One alternative to marking the source would be to provide an attribute indicating the start position in the file of the source string. By subtracting this from the start position of the error, a reader could find the relative position of the error in the given string.

Ok, I didn't know that. So I leave them as is, meaning they will be nested elements and not attributes. I'll try to find a better solution for positions in the context.

Adding two attributes to the problem_source element could be a solution. This would not pollute the problem itself, and since these values would be meaningful only in the context of the source, I think they are good candidates for attributes in the problem_source element. The positions would be relative to the source specified in the element.

I updated the DTD in HEAD to include the number of tasks in the problem summary. It is now:

<!ATTLIST problem_summary
    problems CDATA #REQUIRED
    errors CDATA #REQUIRED
    warnings CDATA #REQUIRED
    tasks CDATA #REQUIRED
>

instead of:

<!ATTLIST problem_summary
    problems CDATA #REQUIRED
    errors CDATA #REQUIRED
    warnings CDATA #REQUIRED
>

Before, the number of warnings included the number of tasks. This was not consistent with the file format, which makes the distinction between tasks and warnings.

Verified with 3.1 M5 candidate (I20040215-2300)

Is there any plan to update the file jdt-core-home/howto/batch compile/batchCompile.html with information on the XML reports, accessibility rules, and other options added to the batch compiler in 3.1?

Yes, we will work on the doc before 3.1 is out.
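For the problem_source discussion earlier in the thread, here is a rough Java sketch of what a TXT converter could do if the element carried two relative positions, as proposed. The class name, the method name, and the convention that both positions are inclusive and 0-based are assumptions made for illustration; the thread does not specify them.

```java
// Hypothetical TXT converter step, not part of JDT.
final class CaretUnderline {

    // Returns the context line followed by a line of '^' under the
    // characters from relStart to relEnd (both inclusive, 0-based,
    // relative to the start of the context string).
    static String render(String context, int relStart, int relEnd) {
        StringBuilder sb = new StringBuilder(context).append('\n');
        for (int i = 0; i < relStart; i++) {
            // Copy tabs so the carets stay aligned in tab-indented code.
            sb.append(context.charAt(i) == '\t' ? '\t' : ' ');
        }
        for (int i = relStart; i <= relEnd; i++) {
            sb.append('^');
        }
        return sb.toString();
    }
}
```

Because the positions travel separately from the text, a literal '#' in the source needs no escaping, which addresses the concern raised about using '#' as a marker.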