Bug 566379 - [JUnit] wrong durations of test methods and test classes
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: UI
Version: 4.16
Hardware: All  OS: All
Importance: P3 normal
Target Milestone: 4.19 M1
Assignee: Alex Blewitt CLA
 
Reported: 2020-08-25 09:39 EDT by Michael Gutfleisch CLA
Modified: 2021-01-06 04:52 EST
CC List: 4 users


Attachments
JUnit test durations (86.80 KB, image/png)
2020-08-25 09:39 EDT, Michael Gutfleisch CLA
Test case (1.30 KB, application/x-gzip)
2020-12-09 05:41 EST, Alex Blewitt CLA
JUnit 4 equivalent of the prior attachment (1.30 KB, application/x-gzip)
2020-12-09 06:07 EST, Alex Blewitt CLA
Michael's Example after the patch (100.41 KB, image/png)
2020-12-14 06:19 EST, Matthias Becker CLA

Description Michael Gutfleisch CLA 2020-08-25 09:39:34 EDT
Created attachment 283963 [details]
JUnit test durations

I'm using JUnit 5 and made some odd observations concerning the durations of test classes and test methods.
When I use @BeforeAll, the duration of the corresponding method is not added to the duration of the test class. This might be intentional, but in my opinion it is wrong, especially considering that the duration of @BeforeEach is added to each test method. As a result, the duration of the test class is not always identical to the sum of the durations of its test methods.

It gets even worse if I execute the tests of two test classes (see attached screenshot). In this case, the duration of the second test class's @BeforeAll gets added to the duration of the last test method of the previous test class.

I suppose @AfterAll behaves similarly... but I didn't test it.
Comment 1 Andrey Loskutov CLA 2020-08-25 10:58:24 EDT
Is this a regression in 4.17? Which Eclipse version was used?
Comment 2 Michael Gutfleisch CLA 2020-08-25 11:01:24 EDT
Version: 2020-06 (4.16.0)
Build id: 20200615-1200
Comment 3 Michael Gutfleisch CLA 2020-10-06 05:17:49 EDT
Somehow I was not aware that I was meant to answer the question about the regression. Actually, I don't know; I was just wondering about the current behavior.

Basically, I am really interested in what others think about the current behavior. If you think the current behavior is correct, I'd like to hear the reasons for that; and if you think it should be changed, I would like to read your proposals.
Comment 4 Michael Gutfleisch CLA 2020-12-09 02:32:05 EST
Hm, seems like everyone is happy with the wrong numbers...
Comment 5 Mickael Istria CLA 2020-12-09 03:03:17 EST
(In reply to Michael Gutfleisch from comment #4)
> Hm, seems like everyone is happy with the wrong numbers...

The backlog is full of higher priority issues or tasks to work on. It's not a matter of "being happy with wrong" but more a matter of "being happier with wrong this and good that than with good this and wrong that".
Have you considered looking at the code and attempting to provide a fix for this issue?
Comment 6 Noopur Gupta CLA 2020-12-09 04:04:34 EST
Please compare the runtime numbers in the JUnit 4 and JUnit 5 libraries.

If it looks like a bug/enhancement for the Eclipse IDE then a good starting point will be to check the code around org.eclipse.jdt.internal.junit.runner.MessageIds.TEST_RUN_END : "%RUNTIME".
Comment 7 Alex Blewitt CLA 2020-12-09 05:17:36 EST
I'm not sure what the issue is here.

The time spent in a @BeforeAll method is not specific to a particular test case; it's a one-off setup for the class.

If you run this outside of Eclipse, you get the same effect as seen here: the time reported for each individual case is the time of the individual test + beforeEach, but the total time for the class as a whole is the time for all the individual tests + beforeEach + beforeAll.

> [INFO] Running gudi.unit.Test_12
>  [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.043 s - in gudi.unit.Test_12
> grep testcase TEST-gudi.unit.Test_12.xml
> <testcase name="should_be_3_seconds" classname="gudi.unit.Test_12" time="3.021"/>
> <testcase name="should_be_4_seconds" classname="gudi.unit.Test_12" time="4.005"/>

The JVM's setup cost will always be added at the first point that it gets triggered by the classes, if you're not forking the VM between test cases.

Consider two tests (in different classes) which refer to a create-on-demand data structure:

class BigData {
  static {
    try {
      Thread.sleep(5000); // simulate an expensive one-time class initialisation
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
  static int VERSION = 1;
}

Now let's use that in two tests:

@Test
void testVersionPositive() {
  assertTrue(BigData.VERSION > 0);
}

@Test
void testVersionOne() {
  assertEquals(1, BigData.VERSION);
}

In this case, whichever test runs first (whichever class it's in) will pay the cost of the class initialisation of the BigData class. The second one will get it for free.

The only way you're going to get correct, repeatable numbers is to be able to fork the VM after each test is run; in that case, you'll get the cost of beforeEach + test + beforeAll. But in that case, it's not that different from having the setup work in beforeAll moved into the beforeEach, which is another thing you can do to get the same time.

Even if you fix this particular issue in this particular way and special-case the @BeforeAll annotation, you're still not going to get consistent test result numbers reported. Anything which has an initialise-on-demand cache is going to be attributed to the first test.

Note that a non-trivial amount of code is going to pay the costs of class loading and initialisation. Much of the work involved in speeding up the startup of Eclipse hasn't come from any huge algorithmic or clever coding tricks, but rather from moving the point at which a class is loaded from during startup to as needed later on.

What seems to be happening here is that the code is working as expected, but that the @BeforeAll of the first test class run is executed before the test is displayed in the counter.

So the time of the last method of the first test class is inflated by counting the @BeforeAll of the next class in the sequence.
Comment 8 Michael Gutfleisch CLA 2020-12-09 05:27:26 EST
> So the last method of the first test case's time is taken up with counting the beforeAll of the next class in the sequence.

And that's obviously wrong, isn't it?

Your answer implies that it can't be fixed unless the VM is forked for each test. So in such cases the times are wrong and will stay wrong...

Is this the conclusion?
Comment 9 Alex Blewitt CLA 2020-12-09 05:41:34 EST
Created attachment 285000 [details]
Test case

The attached project demonstrates a minimal problem.

Running from the command line, Test_A is approximately 0s, and Test_B is approximately 5s.

Running from within Eclipse, Test_A runs but the time taken for the setup of Test_B is counted towards Test_A's time, so Test_A appears to take 5s and Test_B is almost instantaneous.

The problem is a fencepost one; we're counting the setup time of the next class in the cost of running this class for some reason. So we're counting:

(Test A runtime + Test B setup time), (Test B runtime)

If we had a 3rd test in here,  then it would look like:

(Test A runtime + Test B setup time), (Test B runtime + Test C setup time), (Test C runtime)

The cost of Test A's setup time is missed in all cases, because it's executed before the test reports view is populated.
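
The fencepost accounting can be reproduced in plain Java, without JUnit (a sketch: the names are made up, and the 5 s setup from the attached project is scaled down to 50 ms to keep the demo quick):

```java
public class FencepostDemo {
    static void sleep(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public static void main(String[] args) {
        Runnable testA = () -> {};          // Test_A: effectively instant
        Runnable setupB = () -> sleep(50);  // Test_B's @BeforeAll (5 s in the attachment, 50 ms here)
        Runnable testB = () -> {};          // Test_B: effectively instant

        long aStart = System.nanoTime();
        testA.run();
        long aEnd = System.nanoTime();      // where Test_A's clock should stop
        setupB.run();                       // Test_B's setup runs between the two tests
        long bStart = System.nanoTime();    // where the buggy accounting stops Test_A's clock
        testB.run();

        long correctA = (aEnd - aStart) / 1_000_000;
        long buggyA = (bStart - aStart) / 1_000_000;
        System.out.println("Test_A correct=" + correctA + "ms buggy=" + buggyA + "ms");
        System.out.println("setup billed to Test_A: " + (buggyA >= 40 && correctA < 40));
    }
}
```

Measuring Test_A from its own start to the next test's start, rather than to its own end, bills Test_B's setup to Test_A, exactly as in the screenshots.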
Comment 10 Alex Blewitt CLA 2020-12-09 05:45:18 EST
> And that's obviously wrong, isn't it?

Yes, I believe so. Took a while to minimise to the attached test case though :)

I'm not familiar with how JDT runs its tests or sets up the @BeforeAll calls, but the fact that there's a difference between the times reported by a Maven run and by Eclipse indicates that there is an issue, though I didn't compare it with the JUnit 4 runner in Eclipse.
Comment 11 Alex Blewitt CLA 2020-12-09 06:07:49 EST
Created attachment 285001 [details]
JUnit 4 equivalent of the prior attachment

The same issue is seen in JUnit 4, so it's not a JUnit 4/5 thing.
Comment 12 Michael Gutfleisch CLA 2020-12-10 02:08:58 EST
If it is not possible to display the correct times, does it really make sense to display them at all?
Or should there be at least a warning connected to the last test method of each class?
Comment 13 Alex Blewitt CLA 2020-12-10 04:37:21 EST
(In reply to Michael Gutfleisch from comment #12)
> If it is not possible to display the correct times, does it really make
> sense to display them at all?
>
> Or should there be at least a warning connected to the last test method of
> each class?

The timing is only incorrect if you have a @BeforeClass or @BeforeAll annotation, in which case the cost of that method is incorrectly attributed to the immediately prior test. It is possible that it could be corrected, but if you do not wish to see the numbers then you can hide them manually by unselecting the "Show Execution Time" filter in the view options.

However, my point from earlier still stands; since you're running all the tests in the same JVM, you're likely to see effects caused by earlier tests initialising or setting up some shared content that is then relied on in later tests, so all timings/statistics should be interpreted appropriately.

If you run a single test, the time is correct, and therefore it wouldn't make sense to turn off showing the timing for all cases.
Comment 14 Matthias Becker CLA 2020-12-10 04:40:12 EST
(In reply to Alex Blewitt from comment #13)
> The timing is only incorrect if you have a @BeforeClass or @BeforeAll
> annotation, in which the cost of that method is incorrectly attributed to
> the immediate prior test. It is possible that it could be corrected, but if
> you do not wish to see the numbers then you can hide them manually by
> unselecting the "Show Execution Time" filter in the view options.

Wouldn't it be better to show dedicated nodes for the @BeforeClass / @BeforeAll methods in the result and show their execution time there?
Comment 15 Alex Blewitt CLA 2020-12-10 04:43:11 EST
(In reply to Matthias Becker from comment #14)
> 
> Wouldn't it be better to show dedicated nodes for the @BeforeClass /
> @BeforeAll methods in the result and show their execution time there?

I'm sure the JDT project would welcome patches that provided that behaviour.
Comment 16 Matthias Becker CLA 2020-12-10 04:50:28 EST
(In reply to Alex Blewitt from comment #15)
> (In reply to Matthias Becker from comment #14)
> > 
> > Wouldn't it be better to show dedicated nodes for the @BeforeClass /
> > @BeforeAll methods in the result and show their execution time there?
> 
> I'm sure the JDT project would welcome patches that provided that behaviour.

Are these numbers available from JUnit, or is the wrong summing already happening inside JUnit?
Comment 17 Alex Blewitt CLA 2020-12-10 06:18:48 EST
From looking through the code, it looks like the timing is calculated on the client (i.e. IDE) side rather than on the remote side.

Eclipse creates a TestRunner, which then spawns a new JVM with the components in place and starts up a server that the IDE connects to. This reports the status of tests as they run.

For the attached project, it looks like this:

%TESTC  2 v2
%TSTTREE1,bug566379.Test_A,true,1,false,-1,bug566379.Test_A,,
%TSTTREE2,a(bug566379.Test_A),false,1,false,-1,a(bug566379.Test_A),,
%TSTTREE3,bug566379.Test_B,true,1,false,-1,bug566379.Test_B,,
%TSTTREE4,b(bug566379.Test_B),false,1,false,-1,b(bug566379.Test_B),,
%TESTS  2,a(bug566379.Test_A)
%TESTE  2,a(bug566379.Test_A)
%TESTS  4,b(bug566379.Test_B)
%TESTE  4,b(bug566379.Test_B)
%RUNTIME561644

We have the TESTS (test start) and TESTE (test end) messages, which are propagated back to the client, which updates the UI.

%TESTS  2,a(bug566379.Test_A)
%TESTE  2,a(bug566379.Test_A) // 0.0 s later
// This is when B's @BeforeAll happens, 5s
%TESTS  4,b(bug566379.Test_B) 
%TESTE  4,b(bug566379.Test_B) // 0.0s later

The issue seems to be that the 'time for Test A' is calculated from when the next test starts, rather than from when the test ended.
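
A sketch of the two accounting strategies, using synthetic client-side arrival times for the messages above (hypothetical values: the real protocol carries no timestamps; the IDE timestamps messages as it receives them):

```java
public class ProtocolTiming {
    // A protocol message plus the time (ms) at which the client received it.
    record Msg(String line, long receivedAt) {}

    public static void main(String[] args) {
        Msg[] stream = {
            new Msg("%TESTS  2,a(bug566379.Test_A)", 0),
            new Msg("%TESTE  2,a(bug566379.Test_A)", 5),     // Test_A really ended ~5 ms in
            // ... Test_B's @BeforeAll runs here for ~5000 ms ...
            new Msg("%TESTS  4,b(bug566379.Test_B)", 5010),
            new Msg("%TESTE  4,b(bug566379.Test_B)", 5015),
        };

        long aStart = stream[0].receivedAt();
        long correct = stream[1].receivedAt() - aStart; // stop the clock at Test_A's own TESTE
        long buggy = stream[2].receivedAt() - aStart;   // stop it only at the next TESTS

        System.out.println("Test_A correct=" + correct + "ms buggy=" + buggy + "ms");
    }
}
```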
Comment 18 Alex Blewitt CLA 2020-12-10 07:02:04 EST
OK, it turns out this is a lack of flushing in the message queue. We flush on test failed and on next test started, but we don't flush when the test ends, so the UI never gets the message that the test has ended, even though we've sent it through. It's a fix of a couple of lines. Will push a Gerrit for it shortly.
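
The effect of the missing flush can be demonstrated with a BufferedWriter standing in for the connection to the IDE (a simulation, not the actual JDT code):

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;

public class FlushDemo {
    public static void main(String[] args) throws IOException {
        StringWriter wire = new StringWriter();      // stands in for the socket to the IDE
        BufferedWriter out = new BufferedWriter(wire);

        out.write("%TESTS  2,a(Test_A)\n");
        out.flush();                                 // flushed on test start: the IDE sees it

        out.write("%TESTE  2,a(Test_A)\n");          // not flushed on test end (the bug)
        System.out.println("TESTE visible after test end: "
                + wire.toString().contains("%TESTE"));

        // Only when the next test starts (and flushes) does the end message go out --
        // by then Test_B's @BeforeAll has already run, so the IDE timestamps it too late.
        out.write("%TESTS  4,b(Test_B)\n");
        out.flush();
        System.out.println("TESTE visible after next test start: "
                + wire.toString().contains("%TESTE"));
    }
}
```

Because the end message sits in the buffer until the next start message forces a flush, the IDE's arrival-time-based clock attributes the intervening setup time to the previous test.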
Comment 19 Matthias Becker CLA 2020-12-10 07:11:33 EST
(In reply to Alex Blewitt from comment #18)
>Will push a Gerrit for it shortly.
Comment 20 Eclipse Genie CLA 2020-12-10 07:12:13 EST
New Gerrit change created: https://git.eclipse.org/r/c/jdt/eclipse.jdt.ui/+/173651
Comment 21 Alex Blewitt CLA 2020-12-10 07:18:02 EST
https://git.eclipse.org/r/c/jdt/eclipse.jdt.ui/+/173651 fixes the issue in that the time for the individual test cases is correctly accounted for.

However, there is no support in the protocol or connections to measure the time for the BeforeAll step. As a result, the total time shown for Class B in the example above will report a 0.01s delay instead of a 5.01s delay, but it will take 5s for the class to be run. The protocol translates the testStarted call of JUnit's run listener to the time the test starts running, which is after the BeforeAll step has occurred.

If measuring the BeforeAll step is required, a new bug/implementation should be provided.
Comment 22 Michael Gutfleisch CLA 2020-12-10 07:39:32 EST
Sounds great, happy to hear that it could be fixed after all.
Comment 23 Alex Blewitt CLA 2020-12-11 03:55:49 EST
From the Gerrit comments:

> Thanks Alex for the investigation and patch. And thanks Matthias for reviewing it. 
>
> Matthias, please also verify the patch on your existing tests code base and ensure that you are getting the expected results with this.

Matthias, are you able to verify the proposed change and comment on the review?
Comment 24 Matthias Becker CLA 2020-12-14 06:19:06 EST
Created attachment 285029 [details]
Michael's Example after the patch

(In reply to Alex Blewitt from comment #23)
> Matthias, are you able to verify the proposed change and comment on the
> review?

For me this looks good. (See the screenshot).
@Michael: Do you agree?
Comment 25 Noopur Gupta CLA 2020-12-14 06:52:12 EST
(In reply to Matthias Becker from comment #24)
> Created attachment 285029 [details]
> Michael's Example after the patch
> 
> (In reply to Alex Blewitt from comment #23)
> > Matthias, are you able to verify the proposed change and comment on the
> > review?
> 
> For me this looks good. (See the screenshot).
> @Michael: Do you agree?

Thanks, Matthias. Also, please verify this with your test codebase. This is to ensure that everything looks good on real tests also with no regressions.
Comment 26 Michael Gutfleisch CLA 2020-12-14 08:29:51 EST
Yes, I agree. The numbers look much better than before!
Comment 27 Matthias Becker CLA 2020-12-14 08:45:29 EST
(In reply to Noopur Gupta from comment #25)
> (In reply to Matthias Becker from comment #24)
> > Created attachment 285029 [details]
> > Michael's Example after the patch
> > 
> > (In reply to Alex Blewitt from comment #23)
> > > Matthias, are you able to verify the proposed change and comment on the
> > > review?
> > 
> > For me this looks good. (See the screenshot).
> > @Michael: Do you agree?
> 
> Thanks, Matthias. Also, please verify this with your test codebase. This is
> to ensure that everything looks good on real tests also with no regressions.

I did some spot-checks, both with plain JUnit tests as well as with JUnit Plug-in tests. For both types of tests the numbers look good.
Comment 28 Noopur Gupta CLA 2020-12-14 08:58:17 EST
(In reply to Matthias Becker from comment #27)
> I did some spot-checks both with plain JUnit tests as well as with JUnit
> Plug-in tests. For both types of tests the numbers look good.

Thanks for confirming. I will review and release the patch.
Comment 30 Noopur Gupta CLA 2021-01-06 04:52:51 EST
Verified based on comment #27.