Bug 162054 - [build] Got a failure on MultiProjectTests.testCycle5 on my speedy test box...
Summary: [build] Got a failure on MultiProjectTests.testCycle5 on my speedy test box...
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core (show other bugs)
Version: 3.2   Edit
Hardware: PC Windows XP
: P3 normal (vote)
Target Milestone: 3.3 RC2   Edit
Assignee: Kent Johnson CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-10-24 07:01 EDT by Frederic Fusier CLA
Modified: 2007-05-29 04:31 EDT (History)
0 users

See Also:
philippe_mulet: review+
maxime_daniel: review+


Attachments
A trace showing the root cause of the problem (3.75 KB, text/plain)
2007-05-16 09:57 EDT, Maxime Daniel CLA
no flags Details
Will this do the trick ? (1.58 KB, patch)
2007-05-16 11:34 EDT, Kent Johnson CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Frederic Fusier CLA 2006-10-24 07:01:44 EDT
Since a while (ie. at least since 3.2...), I often get following failure stack trace while running all JDT/Core tests on my test box:

junit.framework.ComparisonFailure: unexpected compiling order.
----------- Expected ------------
{p1.X,p2.Y,p1.X,p2.Y}
------------ but was ------------
{p1.X,p2.Y,p1.X}
--------- Difference is ----------
 expected:<{p1.X,p2.Y,p1.X[,p2.Y]}> but was:<{p1.X,p2.Y,p1.X[]}>
	at org.eclipse.jdt.core.tests.junit.extension.TestCase.assertStringEquals(TestCase.java:211)
	at org.eclipse.jdt.core.tests.junit.extension.TestCase.assertEquals(TestCase.java:192)
	at org.eclipse.jdt.core.tests.builder.BuilderTests.expectingCompiling(BuilderTests.java:169)
	at org.eclipse.jdt.core.tests.builder.BuilderTests.expectingCompilingOrder(BuilderTests.java:147)
	at org.eclipse.jdt.core.tests.builder.MultiProjectTests.testCycle5(MultiProjectTests.java:744)
Comment 1 Olivier Thomann CLA 2006-10-24 09:39:48 EDT
At home on my AMD box, I also get the same failure plus another one, but I don't remember it.
It is a AMD 3500+.
Comment 2 Maxime Daniel CLA 2007-03-29 06:48:35 EDT
According to the tests made by Frédéric on 3.3 and 3.2.2 today on his fast machine (even using a RAM drive for his files), this no more happens.
Resolving as WORKSFORME.
Comment 3 Maxime Daniel CLA 2007-04-23 00:38:19 EDT
I got it today on my T60p machine. Will investigate.
Comment 4 Frederic Fusier CLA 2007-04-23 11:36:12 EDT
I just got testCycle4 also failed once today on my T60p:

RunAllJDTCoreTests (BDF)
org.eclipse.jdt.core.tests.RunAllJDTCoreTests
org.eclipse.jdt.core.tests.RunAllBuilderTests
org.eclipse.jdt.core.tests.RunBuilderTests
org.eclipse.jdt.core.tests.builder.BuilderTests
org.eclipse.jdt.core.tests.builder.MultiProjectTests
testCycle4(org.eclipse.jdt.core.tests.builder.MultiProjectTests)

                
                junit.framework.ComparisonFailure: unexpected compiling order.
----------- Expected ------------
{p1.X,p2.Y,p3.Z,p1.X,p2.Y}
------------ but was ------------
{p1.X,p3.Z,p1.X,p2.Y,p1.X}
--------- Difference is ----------
 expected:<{p1.X,p[2.Y,p3.Z,p1.X,p2.Y]}> but was:<{p1.X,p[3.Z,p1.X,p2.Y,p1.X]}>
	at org.eclipse.jdt.core.tests.junit.extension.TestCase.assertStringEquals(TestCase.java:225)
	at org.eclipse.jdt.core.tests.junit.extension.TestCase.assertEquals(TestCase.java:201)
	at org.eclipse.jdt.core.tests.builder.BuilderTests.expectingCompiling(BuilderTests.java:180)
	at org.eclipse.jdt.core.tests.builder.BuilderTests.expectingCompilingOrder(BuilderTests.java:158)
	at org.eclipse.jdt.core.tests.builder.MultiProjectTests.testCycle4(MultiProjectTests.java:679)
Comment 5 Maxime Daniel CLA 2007-05-02 04:39:55 EDT
Too late for 3.3 M7.
Comment 6 Maxime Daniel CLA 2007-05-16 09:57:12 EDT
Created attachment 67402 [details]
A trace showing the root cause of the problem

This trace shows that:
- on a fast machine, we manage to construct two successive but structurally different instances of State for project P1 within the same millisecond;
- we rely on no two states for the same project to have the same lastStructuralBuildTime unless they denote the same structural state.
In other words, the millisecond precision is not enough on fast machine to identify projects states.

The test case is a bit contrived since the projects are very small indeed. But we could still expect small projects to happen (especially in the learn by doing phases).

Using a smaller precision (nanoseconds) is not an option with J2SE 1.4.

This leaves us with two options:
- refine the use of State#buildNumber in case of full builds of projects that participate in a cycle; policy might be to increment the build number if a project is involved into a full build but has already undergone a build within the same millisecond; this might collide with other uses of buildNumber, hence should be examined carefully;
- introduce a build ticker that, using the timer or not, would provide unique build numbers on a system-wide basis; would probably be a function best located at the base builders level (not Java builder).

In both cases, we would now have a reliable sequencing of project states, but the use of the full sequence identifier (say the combination of lastStructuralBuildTime and buildNumber if option 1 above is retained) would have to be leveraged throughout the code in general, and within State#wasStructurallyChanged in particular.

Kent, would you please let me know what you think?
Comment 7 Kent Johnson CLA 2007-05-16 10:06:59 EDT
Can we not just change State.tagAsStructurallyChanged() to be :

	this.previousStructuralBuildTime = this.lastStructuralBuildTime;
	this.structurallyChangedTypes = new StringSet(7);
	this.lastStructuralBuildTime = System.currentTimeMillis();
	if (this.lastStructuralBuildTime == this.previousStructuralBuildTime)
		this.lastStructuralBuildTime++;
Comment 8 Kent Johnson CLA 2007-05-16 10:08:33 EDT
Actually probably should be :

if (this.lastStructuralBuildTime <= this.previousStructuralBuildTime)
    this.lastStructuralBuildTime = this.previousStructuralBuildTime + 1;
Comment 9 Maxime Daniel CLA 2007-05-16 10:48:27 EDT
Discussed the matter with Philippe who suggested that we merely increment the timestamp when it conflicts. Seems practical enough. I'll prototype something along these lines and I will report my results here.
Comment 10 Kent Johnson CLA 2007-05-16 11:04:04 EDT
Did you not see the comment #7 or 8?
Comment 11 Maxime Daniel CLA 2007-05-16 11:06:36 EDT
The issue is that we do not call tagAsStructurallyChanged at all. We build a
brand new State in that scenario. Hence the suggestion of Philippe, which I
have prototyped with success, works under the covers without changing too much
of our code. My prototype uses a static member in State (a singleton) that is
an instance of the following class: 
class DiscriminatingPseudoClock {
        private long tick;
        synchronized long currentTimeMillis() {
                long systemTime;
                if ((systemTime  = System.currentTimeMillis()) <= this.tick) {
                        this.tick++;
                } else {
                        this.tick = systemTime;
                }
                return this.tick;
        }
}

Still need to document the said class (especially, warn that it should not be
used in too tight loops).
Comment 12 Maxime Daniel CLA 2007-05-16 11:08:07 EDT
Calls to System.currentTimeMillis within State are replaced with clock.currentTimeMillis (would have to check if we do not introduce timestamps elsewhere in our code, that should sync up with clock as well).
Comment 13 Kent Johnson CLA 2007-05-16 11:34:39 EDT
Created attachment 67428 [details]
Will this do the trick ?
Comment 14 Maxime Daniel CLA 2007-05-19 12:05:33 EDT
From what I have understood, we create a brand new State (no relationship to any prior State instance for the same project). Hence the approach I suggested, that merely 'twists the system clock' to achieve some proper order between otherwise unrelated instances.
I agree though that if we established a relationship between the new State instance and the prior State instance for the same project, we could go for a more local (and less tricky) increment on a per-project basis.
I'll investigate further on Monday why we use State#State() in this scenario and see whether it gives us a fair chance to adjust the clock as needed only, and I'll get back to you. 
The system wide clock change is more tricky in some sense, but it has no chance to break the logic behind the State and project instances and relationships. The other path is cleaner, but slightly more risky I fear.
Comment 15 Maxime Daniel CLA 2007-05-21 05:09:10 EDT
According to the tests, your patch does the trick indeed. If I understand it well, it provides the chaining I called for - I missed that part when I read it for the first time. There might be corner cases where we miss the lastState, but I could not bring any of these up.
I also made a separate test replacing the System.currentTimeMillis() call by a constant (kind of simulating an 'everything occurs during the same millisecond' scenario) and it worked as well.
I'll run the full tests suite and report the results here.
Comment 16 Maxime Daniel CLA 2007-05-21 06:56:36 EDT
All tests pass. Need to decide whether we want this in RC2.
Comment 17 Maxime Daniel CLA 2007-05-22 08:18:38 EDT
Discussed with Philippe. Kent, since we'd go with your patch, I reassign the bug to you. Requesting a review by Philippe and adding mine.
Comment 18 Philipe Mulet CLA 2007-05-22 08:22:47 EDT
Looks good, minor point, when no lastBuildState, could have used the currentTimeMillis() instead of 0 to be consistent.
Comment 19 Philipe Mulet CLA 2007-05-22 08:23:59 EDT
Ignore my previous comment about 0. It does the trick already.
Comment 20 Philipe Mulet CLA 2007-05-22 08:37:56 EDT
Once done for 3.3, I would also want the fix to be backported to 3.2.x.
Comment 21 Kent Johnson CLA 2007-05-22 14:25:16 EDT
Released into HEAD for 3.3RC2
Comment 22 Frederic Fusier CLA 2007-05-25 05:22:55 EDT
Verified for 3.3 RC2 using I20070525-0010.
Comment 23 Maxime Daniel CLA 2007-05-29 04:31:15 EDT
Released in R3_2_maintenance.