this is in v_853: When compiling more than one compilation unit, the batch compiler may report errors in all units or only in the first. This non-determinism is caused by a race between Main.new ICompilerRequester().acceptResult (line 2689) and the Compiler. The "normal" case seems to be reporting errors only for the first CUD, so here is how to see errors in more than one unit. Given two simple classes

public class C1 { void bar(C2 other) { other.missing(); } }

and

public class C2 { void foo() { this.wrong(); } }

debug the batch compiler with "-classpath . C1.java". I have two breakpoints:
a) ProcessTaskManager line 144 (the last statement in run())
b) Main line 2705 (in acceptResult of the anonymous requester, the line where queuedUnits is defined and assigned)
1. Launch.
2. See a hit of a) and in this thread do one F6.
3. See the other thread hit b) and do six F6's to see that queuedUnits now contains C2.java.
4. One F8 in the TaskManager's thread compiles C2.
5. F8 on the main thread reports 2 errors.
This may look contrived, but it actually occurred as the difference between running the same code on two different machines. Which one is the intended behavior?
What's your command line ? And you're sure that C2.class is deleted prior to each test ?
(In reply to comment #1)
> What's your command line ?
From my .launch:
<stringAttribute key="org.eclipse.jdt.launching.MAIN_TYPE" value="org.eclipse.jdt.internal.compiler.batch.Main"/>
<stringAttribute key="org.eclipse.jdt.launching.PROGRAM_ARGUMENTS" value="-classpath . C1.java"/>
> And you're sure that C2.class is deleted prior to each test ?
Yes, actually no class files were ever generated (no -proceedOnError). If it still existed, seeing only one error would of course be correct. The second error can actually be missed in two ways:
1. The ProcessTaskManager thread is quicker than main and empties the list of units to process before the logging picks them up.
2. After picking up the queuedUnits, the main thread is too quick and finds the compilationResult still empty when reporting.
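The two ways the second error can be missed are both timing races. A minimal, deterministic Java sketch of the first one (all names here are hypothetical stand-ins, not actual JDT code): a worker thread standing in for ProcessTaskManager queues a unit, and what "main" sees depends entirely on whether it looks before or after the worker ran.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class QueueRaceSketch {
    // Shared list standing in for Main's queuedUnits (simplified).
    static final List<String> queuedUnits =
            Collections.synchronizedList(new ArrayList<>());

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch go = new CountDownLatch(1);

        // Worker standing in for the ProcessTaskManager thread: it queues
        // C2.java, but only after the latch is released.
        Thread worker = new Thread(() -> {
            try {
                go.await();
            } catch (InterruptedException e) {
                return;
            }
            queuedUnits.add("C2.java");
        });
        worker.start();

        // Failure mode 1: "main" inspects the queue before the worker ran,
        // so it sees no second unit and would report only C1's error.
        System.out.println("before worker: " + queuedUnits.size() + " queued");

        go.countDown();
        worker.join();

        // Had main been slower, the unit (and its error) would be visible.
        System.out.println("after worker: " + queuedUnits.size() + " queued");
    }
}
```

The latch makes the interleaving explicit; in the real compiler the same two orderings occur nondeterministically depending on thread scheduling.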
Errors for multiple units can be reported because of the change for bug 123476. Even using our single threaded batch compiler, which removes the race condition, I still only get the error for the first unit. Need to look into it more.
Ok - this is the trace from the batch compiler before we added the multi-threaded support:

[parsing C1.java - #1/1]
[reading java/lang/Object.class]
[analyzing C1.java - #1/1]
[parsing D:\temp6\.\C2.java - #2/2]
----------
1. ERROR in C1.java (at line 3)
	other.missing();
	^^^^^^^
The method missing() is undefined for the type C2
----------
1 problem (1 error)

Since C1 only needs the type C2 to exist, C2.java is not completely processed before C1 reports its error. Now with the latest batch compiler, I get:

[parsing C1.java - #1/1]
[reading java/lang/Object.class]
[analyzing C1.java - #1/1]
[parsing D:\temp6\.\C2.java - #2/2]
[analyzing D:\temp6\.\C2.java - #2/2]
----------
1. ERROR in C1.java (at line 3)
	other.missing();
	^^^^^^^
The method missing() is undefined for the type C2
----------
1 problem (1 error)

And if my machine were quicker, it's reasonable to expect that C2.java would have finished processing, so it would have errors to report before C1.java reported its errors. I'll look into protecting unitsToProcess during accept, but with the change for bug 123476, I cannot see how we can guarantee that errors for other processed units will not be reported with the errors for the first unit. If it matters to you that the errors are ALWAYS repeatable in this case, then you can always disable the multi-threaded compiler by setting the system property jdt.compiler.useSingleThreaded=true
So with the multi-threaded compiler, we cannot guarantee that the reported errors will always be the same. Different machines & hard drives will process units at different speeds, so as long as proceedOnError is false, the number of errors reported may not be the same. If it's important that you get the same errors each time, you can:
- enable proceedOnError so the compiler deals with every source file
- or, disable the multi-threaded compiler
Is that ok ?
(In reply to comment #5)
> [...] If its important that you get the same errors each time, you can :
> - enable proceedOnError so the compiler deals with every source file
> - or, disable the multi-threaded compiler
> Is that ok ?
Both tricks work and either one is OK for me. Thanks! Will the provisional system property still be supported in 3.4 final?
BTW.: it's -Djdt.compiler.useSingleThread=true, not -Djdt.compiler.useSingleThreaded=true ;-)
Since non-deterministic behavior of a compiler may not be the most normal thing to expect, where would one search for documentation of this behavior (i.e., reporting secondary errors that happen to be detected before the compiler shuts down)?
Yes the system property (-Djdt.compiler.useSingleThread=true) will be supported in 3.4. Philippe - do we have any doc on the error reporting behaviour of the batch compiler?
(In reply to comment #5) > So with the multi-threaded compiler, we cannot guarantee that the reported > errors will always be the same. Personally, I'd love to get the same errors, at least if we decided to run our regression tests on various hardware with multi-threaded operations, expecting the same results as we got before?
The problem only arises when not using -proceedOnError... I believe our tests do use the resilient mode.
I admit that they tend to. However, this is a manual option for batch tests (some may have escaped), and I have seen quite a few policy classes that return false for this (though I haven't looked at how much they are used). Question: will the errors be guaranteed to be in the same order if we use proceedOnError (but not useSingleThread)?
Yes, normally they should, since units are always processed in the same order. If you let the compiler consume the entire queue of units, then you should get errors in exactly the same order. If the compiler aborts because it is not proceeding on errors, then it will look for queued units and dump the errors for these as well (as they may contain important primary error info). The problem is that, because of background processing, you may have more or fewer units in the queue when the abort decision is made. When proceed on error is enabled, no abort decision is made, and the queue is exhausted entirely. Once we start processing units in parallel, we may see ordering issues again, but this is beyond 3.4.
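The draining behaviour described here can be sketched as follows. This is a simplified model, not the actual Compiler code: the hypothetical `alreadyQueued` parameter stands in for how many additional units the background thread happened to queue before the abort decision.

```java
import java.util.ArrayList;
import java.util.List;

public class ErrorOrderSketch {
    // Hypothetical unit: a name plus whether analyzing it finds an error.
    record Unit(String name, boolean hasError) {}

    // Returns the names of units whose errors get reported, mimicking the
    // behaviour described above (an assumed simplification of the real logic).
    static List<String> report(List<Unit> toProcess, int alreadyQueued,
                               boolean proceedOnError) {
        List<String> reported = new ArrayList<>();
        for (int i = 0; i < toProcess.size(); i++) {
            Unit u = toProcess.get(i);
            if (u.hasError()) {
                reported.add(u.name());
                if (!proceedOnError) {
                    // Abort: also dump errors for units the background thread
                    // already queued -- a timing-dependent number.
                    int end = Math.min(toProcess.size(), i + 1 + alreadyQueued);
                    for (int j = i + 1; j < end; j++) {
                        if (toProcess.get(j).hasError()) {
                            reported.add(toProcess.get(j).name());
                        }
                    }
                    return reported;
                }
            }
        }
        return reported;
    }

    public static void main(String[] args) {
        List<Unit> units = List.of(
                new Unit("C1.java", true), new Unit("C2.java", true));
        // proceedOnError drains the whole queue: deterministic, ordered output.
        System.out.println(report(units, 0, true));
        // Without it, the result depends on how many units were queued in time.
        System.out.println(report(units, 0, false));
        System.out.println(report(units, 1, false));
    }
}
```

The last two calls differ only in the simulated queue depth, which is exactly the machine-dependent variable the comments above describe.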
If I understand you correctly, using proceedOnError is expected to buy us reproducibility in the 3.4 timeframe, but the option is mandated, and we may change this in 3.5. Last but not least, while we have a workaround, we admit that in at least one known scenario (which is the default, by the way) we will emit a random subset of the errors that we would have got with proceedOnError. Considering the situation, I would contend that we'd better keep the bug open for 3.5 (preferred), or else resolve it as WONTFIX.
I would keep it open for later investigation. Even if our behavior is tolerable for 3.4 given the workaround, I believe we have a real issue which we need to tackle at some point. Reopening but not for 3.4
2 issues:
1. Our default behaviour (-proceedOnError is not enabled) should potentially change to report all errors for all source files that were compiled. For example, if the command line has only X.java, but we request Y.java & Z.java to find types Y & Z, then we should report all errors found in X, Y & Z. This would remove the race condition we have now, where we check which units have problems to report while accepting the first problem unit.
2. When the multi-threaded compiler starts to process units in several threads and, as a result, cannot predict the order in which units are processed, we could keep all problems associated with each unit & dump them in a sorted order by default (say, sorted by the full path of the unit). We can no longer predict the order in which the additional source units (those not included on the command line) will be requested/found. On some machines, Y.java may be the 2nd unit found/processed and on another machine it may be the 3rd, so using the request id to sort is not a solution.
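For point 2, a minimal sketch of the idea of dumping problems sorted by the unit's full path (hypothetical data structures, not actual JDT types): whatever order the units were processed in, the output order is fixed.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SortedDumpSketch {
    public static void main(String[] args) {
        // Hypothetical map from a unit's full path to its collected problems;
        // insertion order mimics a nondeterministic processing order.
        Map<String, List<String>> problemsByPath = new LinkedHashMap<>();
        problemsByPath.put("/src/Z.java", List.of("type mismatch"));
        problemsByPath.put("/src/X.java", List.of("missing method"));
        problemsByPath.put("/src/Y.java", List.of("unresolved type"));

        // Dump in an order independent of processing order: sort by full
        // path, as suggested above, rather than by the machine-dependent
        // request id.
        new TreeMap<>(problemsByPath).forEach((path, problems) ->
                problems.forEach(p -> System.out.println(path + ": " + p)));
    }
}
```

TreeMap iterates its keys in natural (lexicographic) order, so the dump is stable across machines regardless of which unit was requested first.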
Re: 1. The problem is located in Main#getBatchRequestor(...), which treats proceedOnError = false as a condition to abort on the first file with errors. This is stricter than it should be. It should only record the fact that there were some errors, and let the compiler continue. I also checked that the IErrorHandlingPolicy used by the batch compiler allows accumulating all errors as well.
Created attachment 106243 [details]
Proposed patch for (1)
This allows the batch compiler to process all files in the queue and only exit at the very end, instead of in the middle with more or fewer errors detected up to that point.
Stephan - could you apply the patch to 3.4.0 and assess whether this does you some good ? I believe (2) would be a separate request if we were to support parallel unit processing, which isn't the case right now (i.e. this bug could be closed independently).
(In reply to comment #17)
> Stephan - could you apply the patch to 3.4.0 and assess whether this does you
> some good ?
Yes, judging from one run of our test-suite without the single-thread switch, all expected errors are actually reported.
Wondering if the simple patch couldn't be released for 3.4.1. Any client to the batch compiler may be seeing this issue as a regression over 3.3.
Candidating for 3.4.1
Talking with Kent, we may not want to inject it in 3.4 maintenance at this point.
Olivier - Could you please release this patch into HEAD (3.5) with regression tests ?
I cannot write a regression test as is, since the modified code is not run inside the test harness, because we don't want to exit once the compiler is done. So I'll simply release the patch. Verification should be done by looking at the source code.
Verified for 3.5M1 using I20080805-1307