Bug 380171 - Support running tests in separate VMs in tycho-surefire-plugin
Summary: Support running tests in separate VMs in tycho-surefire-plugin
Status: NEW
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: Tycho (show other bugs)
Version: unspecified   Edit
Hardware: All All
: P3 enhancement with 6 votes (vote)
Target Milestone: ---   Edit
Assignee: Pascal Rapicault CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on: 386481
Blocks:
  Show dependency tree
 
Reported: 2012-05-21 14:42 EDT by Pascal Rapicault CLA
Modified: 2021-04-28 16:55 EDT (History)
9 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Pascal Rapicault CLA 2012-05-21 14:42:56 EDT
maven-surefire plugin has an option called forkMode (http://maven.apache.org/plugins/maven-surefire-plugin/test-mojo.html#forkMode) which allow to control how the tests should be run. As far as I can tell this option is not supported on tycho-surefire-plugin. It would be a great addition :)
Comment 1 Jan Sievers CLA 2012-05-22 03:23:29 EDT
> Can be "never", "once", "always" or "perthread".

- "never" will not work as we require a forked OSGi runtime
- "once" is what we have now
- "always" will require test class scanning to be moved from OsgiSureBooter (forked process) to TestMojo (main process)
- "perhtread" will require parallel mode to be implemented first
Comment 2 Igor Fedorenko CLA 2012-06-07 22:56:38 EDT
@pascal what is the problem you are trying to solve? can you provide scenarios when each of these options is either required for reliable test execution or desired because it makes test run faster?
Comment 3 Pascal Rapicault CLA 2012-06-08 08:56:45 EDT
I'm involved in migrating the build of large app to Tycho. Currently the tests for this app are run using the perTest mode of the Junit ant task (http://ant.apache.org/manual/Tasks/junit.html), which is equivalent to the "always" mode in surefire. Note that currently the tests are not run in an osgi container which is why the usage of this ant task work.

The main reason for running the tests this way is isolation since everything gets reinitialized.
Comment 4 Igor Fedorenko CLA 2012-06-08 09:34:07 EDT
If you don't need OSGi container, you should be able to run the tests with maven-surefire-plugin and this is something I am interested to see working.
Comment 5 Pascal Rapicault CLA 2012-06-08 10:10:30 EDT
Let me try again, I tried to hide too much :). 
With our existing build infrastructure our automated build process can only run plain java tests (where the osgi container is not initialized). However we also have tests that require an OSGi container to be up and running, and these are not run in our automated test suite and this is the issue. To solve this isuse we are looking into using tycho, and given the current setup of the existing plain java tests, we know that we will need the same abilities than those used by the plain Java tests.
Hope this helps. In short, we will need it and we could provide a patch :)

As for the particular modes, I don't think they are all needed. "Always" and "once" are probably sufficient for the everybody.
Comment 6 Pascal Rapicault CLA 2012-11-02 21:11:00 EDT
I have started work on this. At this point I'm limiting the implementation to the "always" option with the goal of finishing the implementation for mid-November.

Based on the working prototype I have, I have a couple questions on how to proceed:
- OSGiSurefireBooter - This class contains a lot of changes. The main goals of the changes are:
  - Support to gather the list of classes to run. This change is currently done by simply adding a parameter to the existing application. I chose this approach over adding a completely new application. Any issue with this?
  - Handling of errors. The runTest method will raise an exception when the test is failing. However in the case of the forked mode, we want to return the message to the main process but not throw a mojo. I would need to make some refactorings in this area, any objection?

- TestMojo - this class is mostly modified to gather the tests to run, execute them and then collect the results. I'm reusing the approach used by surefire itself using the ForkClient stream consumer.

- EquinoxLauncher - The equinox launcher needs to be modified to allow for a stream consumer to be passed to the execution of java. Currently I have done a hack to allow this but I would like to know what would be your prefered way to allow for a parameter to be passed. There are three ways: add a new method on the EquinoxLauncher interface, change the signature of the execute method, or add a method to the LaunchConfiguration class. What do you prefer? (I will open a new bug for this once we have agreed on the path to take)
Comment 7 Pascal Rapicault CLA 2012-11-04 14:21:47 EST
Minor correction from the previous comment. The handling of error that I describe as being part of OSGiSurefireBooter is instead in TestMojo.

Also, tycho-surefire-plugin now has a copy of the following surefire classes: DeserializedStacktraceWriter, ForkClient and ThreadedStreamConsumer. I did this because depending on the maven-surefire-common brought in a lot of unecessary jars. Let me know which way you prefer to go on this, copy or dependency. Thx.
Comment 8 Jan Sievers CLA 2012-11-05 03:20:01 EST
(In reply to comment #6)
> I have started work on this. At this point I'm limiting the implementation
> to the "always" option with the goal of finishing the implementation for
> mid-November.
> 
> Based on the working prototype I have, I have a couple questions on how to
> proceed:

This discussion should take place in gerrit, on the code. Even if it's a prototype, push it to gerrit and we will know what you are talking about.
Comment 9 Jan Sievers CLA 2012-11-05 03:21:13 EST
(In reply to comment #7)
> Minor correction from the previous comment. The handling of error that I
> describe as being part of OSGiSurefireBooter is instead in TestMojo.
> 
> Also, tycho-surefire-plugin now has a copy of the following surefire
> classes: DeserializedStacktraceWriter, ForkClient and
> ThreadedStreamConsumer. I did this because depending on the
> maven-surefire-common brought in a lot of unecessary jars. Let me know which
> way you prefer to go on this, copy or dependency. Thx.

I am uncomfortable with copying classes. We worked hard to get rid of some of the copies. Copies require CQs with questions like "did you provide a patch to the original project?"
Comment 10 Jan Sievers CLA 2012-11-05 03:31:19 EST
It's probably worth reading our current problems with consuming a new surefire version (which would enable us to consume includes our own fixes in surefire), see bug 386481
Comment 11 Pascal Rapicault CLA 2012-11-05 07:39:58 EST
I'll put the contribution into gerrit in the evening.

If in the meantime you had answers for the EquinoxLauncher (right now the solution I have in place is a total hack), it would help.
Comment 12 Tobias Oberlies CLA 2012-11-06 07:21:23 EST
Pascal, the way that Tycho currently integrates with Tycho is a dead end (see bug 386481). We need to revise fundamentally how this is done so that we don't have to port every new Surefire version to OSGi.

Your change seems to make the Tycho/Surefire integration more complicated, and hence will make it even harder to get this right. Therefore I am reluctant to accept your proposed change. If you are willing to help cleaning up the technical debt in the Surefire integration first, this would appreciated, but I know that this is much asked of a contributor. In any case, I need to propose that you put the work on this feature to a halt until the architectural problem is resolved.
Comment 13 Pascal Rapicault CLA 2012-11-06 20:30:34 EST
Tobias, I went through bug 386481 and the details are sparse. It just says that a new approach is needed but I fail to see if this is just because there is a problem moving to a new version of surefire, or if there is something more involved. From having looked at the code I assumed that you would want to stop having your own Mojo and just hook in surefire but I'm not sure if this really is what you imply.
So two questions:
- do you already have a detailed idea on how to address the issue
- how much time it would take to implement it

Also I would like to conclude in adding that the scope of my changes are rather small and are not making our tie to surefire worst than it is today as it just relies on some of infrastructure clss that are still present in versions of surefire more recent than 2.10.
Comment 14 Pascal Rapicault CLA 2012-11-25 22:58:06 EST
I finally pushed my change to gerrit at https://git.eclipse.org/r/#/c/8859/
Sorry for the delay.
Comment 15 Pascal Rapicault CLA 2014-03-26 10:57:03 EDT
After some quiet time, I would like to resume work on this.
At this point I was thinking of picking up where I left off, which means continuing with the addition of the forkMode flag. 

I understand that this is not what the maven-surefire plugin promotes since they now support reuseFork and forkCount, but implementing forkMode is simpler given the current code structure of tycho-surefire plugin.

What are you thoughts? Assuming that the code works, that all the requirements are met, do you foresee any issue that would prevent support for forkMode to be merged in Tycho?
Comment 16 Jan Sievers CLA 2014-03-27 04:28:54 EDT
some thoughts:

- it's only fair that we should treat our inability/lack of motivation to move to a newer surefire version and this enhancement separately and not let it block this enhancement
- that said, if we do this change it should at least not worsen the tight coupling of tycho to a particular surefire version (e.g. classes copied from surefire instead of binaries referenced)
- the tycho code base has evolved since the last patch so I would propose to make a fresh start
- I assume we only want forkMode=none (as today) and forkMode=always (=one VM forked per test class)

We should break up the implementation work into smaller steps.

Things I could think of off the top of my head:

- right now we let surefire do a filesystem scan for test classes inside the forked VM. This should probably be moved into the maven VM as a prerequisite
- Multiple OSGi executions can still have side effects/accumulate state in between VM launches; need to make sure each launch is "clean" (by launching in dedicated separate subfolder per launch?)
- need to solve the problem of how to merge the surefire results from multiple launches into one combined result
- right now we fail the build based on the process exit code. Need to change this logic to fail/succeed only after all launches are done
- right now TestMojo is hardcoded to use one forked VM by using one EquinoxLauncher. we probably need a facade on the client VM side that takes the forkMode parameter and hides the different launch handling based on forkMode


I don't know how many of these problems are solved already by surefire. Ideally we can reuse their solution but it may not be possible if the "fork plain classpath JVM" and "fork OSGi framework in JVM" usecases are too different.
Comment 17 justin georgeson CLA 2014-05-05 16:01:35 EDT
Hi all, my organization paid Pascal for his initial forkMode work, and our internal build is currently based off of 0.17.0. We asked him to estimate updating our build to 0.20.0 as a team using the forkMode support he added would see much benefit from the work on bug 427556. But due to bug 433522 we can't adopt 0.20.0. We've recently also setup Java 8 builds for QA before moving our production environment to Java 8, and are using 0.19.0 with -Dmaven.compiler.{source,target}=1.8 and leaving the BREE at JavaSE-1.7 I know that 0.21.0 will be the first release to officially support the Java 8 compliant JDT, so it'd be really nice if Pascal could deliver these changes to tycho-surefire back to you so we don't have to maintain an internal fork anymore. That would also potentially help us get approval to pay [Pascal] for the work.

And to confirm Jan's assumption in Comment #16, the teams using forkMode are indeed only concerned about forkMode=always.
Comment 18 Jan Sievers CLA 2014-08-18 09:48:47 EDT
(In reply to Jan Sievers from comment #16)
> - right now we let surefire do a filesystem scan for test classes inside the
> forked VM. This should probably be moved into the maven VM as a prerequisite

this is done with bug 386481
Comment 19 Tobias Oberlies CLA 2014-12-12 07:38:31 EST
Apparently forkMode has already been deprecated again [1]. The new properties to be used are called forkCount and reuseForks. Changing the title to be independent of the potential implementation.

[1] http://maven.apache.org/surefire/maven-surefire-plugin/test-mojo.html#forkMode
Comment 20 Sebastien Arod CLA 2020-03-31 05:59:40 EDT
This limitation is very problematic for projects incrementally migrating to junit 5 because in such projects the JunitPlatformProvider is used to run both Junit5 and Junit4 with jupiter and vintage engine respectively.

However the "parallel" option is only supported for Junit47Provider and cannot be used with JunitPlatformProvider.
Comment 21 Sebastien Arod CLA 2020-04-01 08:48:05 EDT
By the way can someone with appropriate privileges edit the title to reflect that this is about supporting "forkCount" and "reuseForks". This would make it much more discoverable.
Comment 22 Mickael Istria CLA 2021-04-08 18:04:44 EDT
Eclipse Tycho is moving away from this bugs.eclipse.org issue tracker to https://github.com/eclipse/tycho/issues/ instead. If this issue is relevant to you, your action is required.
0. Verify this issue is still happening with latest Tycho 2.4.0-SNAPSHOT
  if issue has disappeared, please change status of this issue to "CLOSED WORKFORME" with some details about your testing environment and how you did verify the issue; and you're done
  if issue is still present when latest release:
* Create a new issue at https://github.com/eclipse/tycho/issues/
  ** Use as title in GitHub the title of this Bugzilla ticket (may include the bug number or not, at your own convenience)
  ** In the GitHub description, start with a link to this bugzilla ticket
  ** Optionally add new content to the description if it can helps towards resolution
  ** Submit GitHub issue
* Update bugzilla ticket
  ** Add to "See also" property (up right column) the link to the newly created GitHub issue
  ** Add a comment "Migrated to <link-to-newly-created-GitHub-issue>"
  ** Set status as CLOSED MOVED
  ** Submit

All issues that remain open will be automatically closed next week or so. Then the Bugzilla component for Tycho will be archived and made read-only.