162054 – [build] Got a failure on MultiProjectTests.testCycle5 on my speedy test box...

Bug 162054 - [build] Got a failure on MultiProjectTests.testCycle5 on my speedy test box...

Summary: [build] Got a failure on MultiProjectTests.testCycle5 on my speedy test box...

Status:	VERIFIED FIXED

Alias:	None

Product:	JDT
Classification:	Eclipse Project
Component:	Core (show other bugs)
Version:	3.2
Hardware:	PC Windows XP

Importance:	P3 normal (vote)
Target Milestone:	3.3 RC2
Assignee:	Kent Johnson
QA Contact:

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2006-10-24 07:01 EDT by Frederic Fusier
Modified:	2007-05-29 04:31 EDT (History)
CC List:	0 users

See Also:

Flags:	philippe_mulet: review+ maxime_daniel: review+

Attachments
A trace showing the root cause of the problem (3.75 KB, text/plain) 2007-05-16 09:57 EDT, Maxime Daniel	no flags	Details
Will this do the trick ? (1.58 KB, patch) 2007-05-16 11:34 EDT, Kent Johnson	no flags	Details \| Diff
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Frederic Fusier

2006-10-24 07:01:44 EDT

Since a while (ie. at least since 3.2...), I often get following failure stack trace while running all JDT/Core tests on my test box:

junit.framework.ComparisonFailure: unexpected compiling order.
----------- Expected ------------
{p1.X,p2.Y,p1.X,p2.Y}
------------ but was ------------
{p1.X,p2.Y,p1.X}
--------- Difference is ----------
 expected:<{p1.X,p2.Y,p1.X[,p2.Y]}> but was:<{p1.X,p2.Y,p1.X[]}>
	at org.eclipse.jdt.core.tests.junit.extension.TestCase.assertStringEquals(TestCase.java:211)
	at org.eclipse.jdt.core.tests.junit.extension.TestCase.assertEquals(TestCase.java:192)
	at org.eclipse.jdt.core.tests.builder.BuilderTests.expectingCompiling(BuilderTests.java:169)
	at org.eclipse.jdt.core.tests.builder.BuilderTests.expectingCompilingOrder(BuilderTests.java:147)
	at org.eclipse.jdt.core.tests.builder.MultiProjectTests.testCycle5(MultiProjectTests.java:744)

Comment 1 Olivier Thomann

2006-10-24 09:39:48 EDT

At home on my AMD box, I also get the same failure plus another one, but I don't remember it.
It is a AMD 3500+.

Comment 2 Maxime Daniel

2007-03-29 06:48:35 EDT

According to the tests made by Frédéric on 3.3 and 3.2.2 today on his fast machine (even using a RAM drive for his files), this no more happens.
Resolving as WORKSFORME.

Comment 3 Maxime Daniel

2007-04-23 00:38:19 EDT

I got it today on my T60p machine. Will investigate.

Comment 4 Frederic Fusier

2007-04-23 11:36:12 EDT

I just got testCycle4 also failed once today on my T60p:

RunAllJDTCoreTests (BDF)
org.eclipse.jdt.core.tests.RunAllJDTCoreTests
org.eclipse.jdt.core.tests.RunAllBuilderTests
org.eclipse.jdt.core.tests.RunBuilderTests
org.eclipse.jdt.core.tests.builder.BuilderTests
org.eclipse.jdt.core.tests.builder.MultiProjectTests
testCycle4(org.eclipse.jdt.core.tests.builder.MultiProjectTests)

                
                junit.framework.ComparisonFailure: unexpected compiling order.
----------- Expected ------------
{p1.X,p2.Y,p3.Z,p1.X,p2.Y}
------------ but was ------------
{p1.X,p3.Z,p1.X,p2.Y,p1.X}
--------- Difference is ----------
 expected:<{p1.X,p[2.Y,p3.Z,p1.X,p2.Y]}> but was:<{p1.X,p[3.Z,p1.X,p2.Y,p1.X]}>
	at org.eclipse.jdt.core.tests.junit.extension.TestCase.assertStringEquals(TestCase.java:225)
	at org.eclipse.jdt.core.tests.junit.extension.TestCase.assertEquals(TestCase.java:201)
	at org.eclipse.jdt.core.tests.builder.BuilderTests.expectingCompiling(BuilderTests.java:180)
	at org.eclipse.jdt.core.tests.builder.BuilderTests.expectingCompilingOrder(BuilderTests.java:158)
	at org.eclipse.jdt.core.tests.builder.MultiProjectTests.testCycle4(MultiProjectTests.java:679)

Comment 5 Maxime Daniel

2007-05-02 04:39:55 EDT

Too late for 3.3 M7.

Comment 6 Maxime Daniel

2007-05-16 09:57:12 EDT

Created attachment 67402 [details]
A trace showing the root cause of the problem

This trace shows that:
- on a fast machine, we manage to construct two successive but structurally different instances of State for project P1 within the same millisecond;
- we rely on no two states for the same project to have the same lastStructuralBuildTime unless they denote the same structural state.
In other words, the millisecond precision is not enough on fast machine to identify projects states.

The test case is a bit contrived since the projects are very small indeed. But we could still expect small projects to happen (especially in the learn by doing phases).

Using a smaller precision (nanoseconds) is not an option with J2SE 1.4.

This leaves us with two options:
- refine the use of State#buildNumber in case of full builds of projects that participate in a cycle; policy might be to increment the build number if a project is involved into a full build but has already undergone a build within the same millisecond; this might collide with other uses of buildNumber, hence should be examined carefully;
- introduce a build ticker that, using the timer or not, would provide unique build numbers on a system-wide basis; would probably be a function best located at the base builders level (not Java builder).

In both cases, we would now have a reliable sequencing of project states, but the use of the full sequence identifier (say the combination of lastStructuralBuildTime and buildNumber if option 1 above is retained) would have to be leveraged throughout the code in general, and within State#wasStructurallyChanged in particular.

Kent, would you please let me know what you think?

Comment 7 Kent Johnson

2007-05-16 10:06:59 EDT

Can we not just change State.tagAsStructurallyChanged() to be :

	this.previousStructuralBuildTime = this.lastStructuralBuildTime;
	this.structurallyChangedTypes = new StringSet(7);
	this.lastStructuralBuildTime = System.currentTimeMillis();
	if (this.lastStructuralBuildTime == this.previousStructuralBuildTime)
		this.lastStructuralBuildTime++;

Comment 8 Kent Johnson

2007-05-16 10:08:33 EDT

Actually probably should be :

if (this.lastStructuralBuildTime <= this.previousStructuralBuildTime)
    this.lastStructuralBuildTime = this.previousStructuralBuildTime + 1;

Comment 9 Maxime Daniel

2007-05-16 10:48:27 EDT

Discussed the matter with Philippe who suggested that we merely increment the timestamp when it conflicts. Seems practical enough. I'll prototype something along these lines and I will report my results here.

Comment 10 Kent Johnson

2007-05-16 11:04:04 EDT

Did you not see the comment #7 or 8?

Comment 11 Maxime Daniel

2007-05-16 11:06:36 EDT

The issue is that we do not call tagAsStructurallyChanged at all. We build a
brand new State in that scenario. Hence the suggestion of Philippe, which I
have prototyped with success, works under the covers without changing too much
of our code. My prototype uses a static member in State (a singleton) that is
an instance of the following class: 
class DiscriminatingPseudoClock {
        private long tick;
        synchronized long currentTimeMillis() {
                long systemTime;
                if ((systemTime  = System.currentTimeMillis()) <= this.tick) {
                        this.tick++;
                } else {
                        this.tick = systemTime;
                }
                return this.tick;
        }
}

Still need to document the said class (especially, warn that it should not be
used in too tight loops).

Comment 12 Maxime Daniel

2007-05-16 11:08:07 EDT

Calls to System.currentTimeMillis within State are replaced with clock.currentTimeMillis (would have to check if we do not introduce timestamps elsewhere in our code, that should sync up with clock as well).

Comment 13 Kent Johnson

2007-05-16 11:34:39 EDT

Created attachment 67428 [details]
Will this do the trick ?

Comment 14 Maxime Daniel

2007-05-19 12:05:33 EDT

From what I have understood, we create a brand new State (no relationship to any prior State instance for the same project). Hence the approach I suggested, that merely 'twists the system clock' to achieve some proper order between otherwise unrelated instances.
I agree though that if we established a relationship between the new State instance and the prior State instance for the same project, we could go for a more local (and less tricky) increment on a per-project basis.
I'll investigate further on Monday why we use State#State() in this scenario and see whether it gives us a fair chance to adjust the clock as needed only, and I'll get back to you. 
The system wide clock change is more tricky in some sense, but it has no chance to break the logic behind the State and project instances and relationships. The other path is cleaner, but slightly more risky I fear.

Comment 15 Maxime Daniel

2007-05-21 05:09:10 EDT

According to the tests, your patch does the trick indeed. If I understand it well, it provides the chaining I called for - I missed that part when I read it for the first time. There might be corner cases where we miss the lastState, but I could not bring any of these up.
I also made a separate test replacing the System.currentTimeMillis() call by a constant (kind of simulating an 'everything occurs during the same millisecond' scenario) and it worked as well.
I'll run the full tests suite and report the results here.

Comment 16 Maxime Daniel

2007-05-21 06:56:36 EDT

All tests pass. Need to decide whether we want this in RC2.

Comment 17 Maxime Daniel

2007-05-22 08:18:38 EDT

Discussed with Philippe. Kent, since we'd go with your patch, I reassign the bug to you. Requesting a review by Philippe and adding mine.

Comment 18 Philipe Mulet

2007-05-22 08:22:47 EDT

Looks good, minor point, when no lastBuildState, could have used the currentTimeMillis() instead of 0 to be consistent.

Comment 19 Philipe Mulet

2007-05-22 08:23:59 EDT

Ignore my previous comment about 0. It does the trick already.

Comment 20 Philipe Mulet

2007-05-22 08:37:56 EDT

Once done for 3.3, I would also want the fix to be backported to 3.2.x.

Comment 21 Kent Johnson

2007-05-22 14:25:16 EDT

Released into HEAD for 3.3RC2

Comment 22 Frederic Fusier

2007-05-25 05:22:55 EDT

Verified for 3.3 RC2 using I20070525-0010.

Comment 23 Maxime Daniel

2007-05-29 04:31:15 EDT

Released in R3_2_maintenance.