Bug 470272 - Tycho-Surefire may create an invalid JUnit xml file
Status: NEW
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: Tycho
Version: unspecified
Hardware: All All
Importance: P3 normal
Target Milestone: ---
Assignee: Project Inbox CLA
Reported: 2015-06-16 09:04 EDT by Camille Letavernier CLA
Modified: 2021-04-28 16:52 EDT
Attachments
Invalid report (2.43 MB, application/xml)
2015-06-16 13:19 EDT, Camille Letavernier CLA
Invalid Report - Plain text (2.43 MB, text/plain)
2015-06-17 03:46 EDT, Camille Letavernier CLA
Valid report (0.21.0) (2.88 MB, application/xml)
2015-06-17 04:48 EDT, Camille Letavernier CLA

Description Camille Letavernier CLA 2015-06-16 09:04:58 EDT
Starting with version 0.22, Tycho-Surefire sometimes produces an invalid XML file for the test results

It works for most builds, but for some reason fails on others. The failure is deterministic, but I can't tell under which conditions it happens

Reverting back to Tycho 0.21 always solves the issue. Using 0.22 or 0.23 always produces the same invalid XML file for the same build

Hudson then fails when trying to publish this invalid XML file:

> Caused by: org.xml.sax.SAXParseException; systemId: file:///home/hudson/genie.papyrus/.hudson/jobs/Papyrus-Master-Tests-Failures/workspace/tests/junit/plugins/core/org.eclipse.papyrus.tests/target/surefire-reports/TEST-org.eclipse.papyrus.tests.AllTests.xml; lineNumber: 33023; columnNumber: 1; XML document structures must start and end within the same entity.

I couldn't find the root cause of the issue, so if you have any idea of what could have changed between 0.21 and 0.22 to cause this kind of issue, I might dig a little bit more

We have 3 test jobs; 2 of them always work, and the last one fails with Tycho >= 0.22
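
For reference, the publisher failure can be reproduced outside Hudson by running the report through any XML parser. The following is a minimal sketch, not part of Tycho or Surefire: the class name ReportWellFormedCheck and its firstParseError methods are hypothetical, and it only uses the JDK's SAX parser to report the first fatal error with its line and column.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.file.Paths;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of Tycho or Surefire): checks whether a
// report is well-formed XML and describes the first fatal parse error.
public class ReportWellFormedCheck {

    /** Returns null if the source is well-formed, otherwise a description of the first error. */
    public static String firstParseError(InputSource source) {
        try {
            // DefaultHandler ignores all content events and rethrows fatal errors.
            SAXParserFactory.newInstance().newSAXParser().parse(source, new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return e.toString();
        }
    }

    /** Convenience overload for XML held in memory. */
    public static String firstParseError(String xml) {
        return firstParseError(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) {
        for (String arg : args) {
            // Passing a file URI as system ID lets the parser open the file
            // itself and honor the encoding declared in the XML prolog.
            String error = firstParseError(new InputSource(Paths.get(arg).toUri().toString()));
            System.out.println(arg + ": " + (error == null ? "well-formed" : error));
        }
    }
}
```

After compiling, it can be run as `java ReportWellFormedCheck path/to/report.xml`. A truncated report should be reported with a line and column, much like the Hudson stack trace above.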
Comment 2 Jan Sievers CLA 2015-06-16 11:42:45 EDT
please attach the invalid XML and describe steps to reproduce (if possible)
Comment 3 Camille Letavernier CLA 2015-06-16 12:06:53 EDT
Thanks for the quick feedback. I'll run the job with Tycho 0.23 and send you the resulting file. However we have lots of tests in this build (~10k) and I couldn't identify the minimal subset
Comment 4 Camille Letavernier CLA 2015-06-16 13:19:18 EDT
Created attachment 254480
Invalid report

Invalid report produced by Surefire
Comment 5 Jan Sievers CLA 2015-06-17 03:40:31 EDT
please attach the broken XML as plain text or paste it in a comment; Bugzilla gives me an XML parsing error and I can't see the plain text
Comment 6 Camille Letavernier CLA 2015-06-17 03:46:14 EDT
Created attachment 254498
Invalid Report - Plain text

The same report, in plain text
Comment 7 Martin Schreiber CLA 2015-06-17 03:52:41 EDT
The closing tag </testsuite> is missing. Is that just a copy/paste issue in the attachment(s), or is that the reason the file is invalid?
Comment 8 Camille Letavernier CLA 2015-06-17 04:03:31 EDT
> The close tag </testsuite> is missing. Is that just a copy/paste issue here for the attachment(s) or is that the issue why it is invalid?

That is the XML produced by surefire

I'm currently running the same test suite with Tycho 0.21 to get the full report, for comparison. The </testsuite> is indeed missing, but I suspect that some <testcase> elements are missing, too (not sure about that)
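
One way to check whether test cases are missing, and not just the closing tag, is to count the test case elements in both reports. A strict DOM parse would reject the truncated file outright, so this sketch uses the JDK's StAX reader and keeps whatever it counted before the parser gave up. The class TestCaseCounter and its Result type are hypothetical names for illustration, and the sketch assumes that the report uses the lowercase element name testcase and is encoded as UTF-8.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: counts <testcase> elements in a report, tolerating
// a report that ends before its closing </testsuite> tag.
public class TestCaseCounter {

    public static final class Result {
        public final int testCases;    // number of <testcase> start tags seen
        public final boolean complete; // false if the parser failed before the end

        Result(int testCases, boolean complete) {
            this.testCases = testCases;
            this.complete = complete;
        }
    }

    public static Result count(Reader in) {
        int testCases = 0;
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Reports do not need DTDs; disabling them avoids external lookups.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "testcase".equals(reader.getLocalName())) {
                    testCases++;
                }
            }
            return new Result(testCases, true);
        } catch (XMLStreamException e) {
            // Premature end of file or other malformed content: keep the partial count.
            return new Result(testCases, false);
        }
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            // Assumption: the report is UTF-8 encoded.
            try (Reader in = Files.newBufferedReader(Paths.get(arg), StandardCharsets.UTF_8)) {
                Result r = count(in);
                System.out.println(arg + ": " + r.testCases + " test cases, "
                        + (r.complete ? "complete" : "truncated or malformed"));
            }
        }
    }
}
```

Running it on the 0.21 report and on the invalid report gives two counts to compare. Equal counts would suggest that only the end of the file is missing.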
Comment 9 Camille Letavernier CLA 2015-06-17 04:48:03 EDT
Created attachment 254500
Valid report (0.21.0)

Report produced by Tycho 0.21.0
Comment 10 Martin Schreiber CLA 2015-06-22 14:11:22 EDT
Do you have a test timeout setting [1] specified in your test plugin pom or in one of your parent pom files?

[1] https://eclipse.org/tycho/sitedocs/tycho-surefire/tycho-surefire-plugin/test-mojo.html#forkedProcessTimeoutInSeconds
Comment 11 Camille Letavernier CLA 2015-06-26 04:31:37 EDT
> Do you have a test timeout setting [1] specified in your test plugin pom or in one of your parent pom files?

No; we use a single TestSuite without any Timeout
Comment 12 Christian Damus CLA 2015-10-12 10:06:55 EDT
This is happening again in our latest Mars maintenance branch tests:

https://hudson.eclipse.org/papyrus/job/Papyrus-Mars-Tests/236/

The closing </testsuite> tag is missing:

https://hudson.eclipse.org/papyrus/job/Papyrus-Mars-Tests/ws/tests/junit/plugins/core/org.eclipse.papyrus.tests/target/surefire-reports/

All of the test cases are properly recorded (the ALF tests are the last group in the suite).  What could be causing the truncation of this file?  The test execution does not time out.  The Eclipse instance running the tests shuts down and the Hudson build script continues, apparently normally, until it fails because the test publisher cannot parse the XML.
Comment 13 Jan Sievers CLA 2015-10-12 11:55:23 EDT
(In reply to Christian W. Damus from comment #12)
creation of this file is entirely delegated to surefire (we use surefire 2.17 as of now)

possible reasons the file is truncated:

* a bug in surefire
* the test process exits before all tests are finished (for whatever reason). Any chance you configured a timeout (https://eclipse.org/tycho/sitedocs/tycho-surefire/tycho-surefire-plugin/test-mojo.html#forkedProcessTimeoutInSeconds)?

for more details of what happened in the forked test process, you should check the eclipse log file in the workspace created (or run the whole build with maven debug option -X)

no one apart from Papyrus has reported the problem so far, so I tend to assume it's something special about your test setup
Comment 14 Mickael Istria CLA 2021-04-08 18:08:33 EDT
Eclipse Tycho is moving away from this bugs.eclipse.org issue tracker to https://github.com/eclipse/tycho/issues/ instead. If this issue is relevant to you, your action is required.
0. Verify this issue is still happening with latest Tycho 2.4.0-SNAPSHOT
  if the issue has disappeared, please change the status of this issue to "CLOSED WORKSFORME" with some details about your testing environment and how you verified the issue; and you're done
  if the issue is still present with the latest release:
* Create a new issue at https://github.com/eclipse/tycho/issues/
  ** Use as title in GitHub the title of this Bugzilla ticket (may include the bug number or not, at your own convenience)
  ** In the GitHub description, start with a link to this bugzilla ticket
  ** Optionally add new content to the description if it helps towards resolution
  ** Submit GitHub issue
* Update bugzilla ticket
  ** Add to "See also" property (up right column) the link to the newly created GitHub issue
  ** Add a comment "Migrated to <link-to-newly-created-GitHub-issue>"
  ** Set status as CLOSED MOVED
  ** Submit

All issues that remain open will be automatically closed next week or so. Then the Bugzilla component for Tycho will be archived and made read-only.