Bug 492580 - deltapack can not be created by test, on Linux.
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Releng
Version: 4.6
Hardware: PC Linux
Importance: P1 major
Target Milestone: 4.6 RC1
Assignee: David Williams CLA
QA Contact:
URL:
Whiteboard: routine releng
Keywords:
Duplicates: 492551
Depends on: 488667 492601 492763
Blocks:
Reported: 2016-04-27 13:17 EDT by Markus Keller CLA
Modified: 2016-05-01 12:36 EDT (History)
CC List: 5 users

See Also:


Attachments
move crashing test to the end (1.36 KB, patch)
2016-04-27 13:28 EDT, Markus Keller CLA
workspace .log from failed I20160427-1200 delta pack creation (53.31 KB, text/x-log)
2016-04-27 19:06 EDT, David Williams CLA

Description Markus Keller CLA 2016-04-27 13:17:46 EDT
These 5 test suites consistently DNF'd on ep46I-unit-lin64 and ep46I-unit-cen64 since http://download.eclipse.org/eclipse/downloads/drops4/I20160425-1300/testResults.php :

jdt.core.tests.compiler
jdt.core.tests.model
jdt.ui.tests
jdt.ui.tests.refactoring
pde.build.tests

At the end of http://download.eclipse.org/eclipse/downloads/drops4/I20160427-0400/testresults/consolelogs/ep46I-unit-lin64_linux.gtk.x86_64_8.0_consolelog.txt , there are many [p2.mirror] lines, and then:

An error has occurred. See the log file
/opt/users/genie.shared/workspace/ep46I-unit-lin64/workarea/I20160427-0400/eclipse-testing/workspace/.metadata/.log.

https://hudson.eclipse.org/shared/view/Eclipse%20and%20Equinox/job/ep46I-unit-lin64/ws/workarea/I20160427-0400/eclipse-testing/workspace/.metadata/.log/*view*/ is hard to read, but I think the essence is here:

/opt/users/genie.shared/workspace/ep46I-unit-lin64/workarea/I20160427-0400/eclipse-testing/test-eclipse/eclipse/plugins/org.eclipse.pde.build.tests_1.1.700.v20160421-0815/test.xml:192: Error occurred while transforming repository: An error occurred while collecting items to be installed.
	at org.eclipse.equinox.p2.internal.repository.tools.tasks.Repo2RunnableTask.execute(Repo2RunnableTask.java:62)

/org.eclipse.pde.build.tests/test.xml:192 contains this task:

    <p2.repo2runnable
      destination="file://${installDeltapack}/eclipse"
      failonerror="true">
      <source>
        <repository location="file://${featureTemp}" />
      </source>
    </p2.repo2runnable>
Comment 1 Markus Keller CLA 2016-04-27 13:28:30 EDT
Created attachment 261312 [details]
move crashing test to the end

If this cannot be fixed or reverted promptly, then I suggest we at least move the crashing pdebuild test to the end.
Comment 2 David Williams CLA 2016-04-27 15:36:52 EDT
*** Bug 492551 has been marked as a duplicate of this bug. ***
Comment 3 David Williams CLA 2016-04-27 15:55:19 EDT
Just to document it, bug 488667 is where this change was made, and committed on 2016-04-21 04:15:28. 

There have been several successful N-build tests since then. So ... 

I wonder if the "equinox.executable.feature" comparator error is more significant than I thought? (bug 492013). Even if it is true that the problem is due to the "equinox feature" error, the crashing test should probably still come last. Tom 'touched' the feature before the noon I-build, so it might tell us something if that build succeeds now. 

We should probably change failOnError=true to failOnError=false. 
I am not sure what would cause that task to fail, but that setting may be what causes the "whole thing" to fail ... instead of just some delta pack specific tests. (I am guessing).
Comment 4 David Williams CLA 2016-04-27 16:08:48 EDT
(In reply to Markus Keller from comment #1)
> Created attachment 261312 [details]
> move crashing test to the end
> 
> If this cannot be fixed or reverted promptly, then I suggest we at least
> move the crashing pdebuild test to the end.

I have applied this patch, and I have also made a similar change to the test.xml that we actually use during our "production tests", at 
/eclipse.platform.releng.aggregator/production/testScripts/configuration/sdk.tests/testScripts/test.xml
Comment 5 David Williams CLA 2016-04-27 16:31:57 EDT
I've opened bug 492601 to get "failOnError" set to false.
Comment 6 Markus Keller CLA 2016-04-27 17:52:15 EDT
I should have grabbed and attached the workspace log. Next chance will be here once the linux tests have finished:
https://hudson.eclipse.org/shared/view/Eclipse%20and%20Equinox/job/ep46I-unit-lin64/ws/workarea/I20160427-1200/eclipse-testing/workspace/.metadata/.log/*view*/

The exception location Repo2RunnableTask.java:62 is very strange: There's no exception on that line. The ProvisionException is only on line 63. I also checked the history, but AFAICS, line 62 never threw an exception.
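
To compare the logged frames against a particular revision of the source, the file name and line number of each frame can be pulled out of the .log mechanically. The following is only a minimal sketch in Python and is not part of the releng scripts; the function name `frame_locations` is made up for illustration, and the regular expression assumes ordinary `at pkg.Class.method(File.java:NN)` frames like the one quoted in the description.

```python
import re

# Matches a Java stack frame such as
#   at pkg.Class.method(Class.java:62)
FRAME_RE = re.compile(r"^\s*at\s+([\w.$<>]+)\(([\w$]+\.java):(\d+)\)")


def frame_locations(log_text):
    """Return (qualified method, source file, line number) for each stack frame."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append((match.group(1), match.group(2), int(match.group(3))))
    return frames


# The frame quoted in the description of this bug.
sample = (
    "\tat org.eclipse.equinox.p2.internal.repository.tools.tasks."
    "Repo2RunnableTask.execute(Repo2RunnableTask.java:62)"
)
print(frame_locations(sample))
```

Each resulting (file, line) pair can then be checked against the candidate revisions of that file to see where the line numbers match a throwing statement.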
Comment 7 David Williams CLA 2016-04-27 19:06:37 EDT
Created attachment 261320 [details]
workspace .log from failed I20160427-1200 delta pack creation

Seems especially odd that it fails on Linux, but not on Windows or Mac! 

Encoding? URLs?
Comment 8 David Williams CLA 2016-04-27 19:08:00 EDT
Adding Tom, just in case anything stands out in those "required capabilities" or "unable to resolve jetty" errors.
Comment 9 David Williams CLA 2016-04-27 21:05:43 EDT
For the record, locally, using the current I-build binary platform, I can run things in a console window such as 

./runAntRunner.sh test.xml createDeltaPack -Dinstall=${PWD} -DexecutionDir=${PWD} -DcurrentUpdateSite=http://download.eclipse.org/eclipse/updates/4.6-I-builds/I20160427-1200/

And, all works fine. 

'runAntRunner.sh' is just where Java and Eclipse executable locations are defined.
I used the one at 
/eclipse.platform.releng.aggregator/production/testScripts/configuration/sdk.tests/testScripts/runAntRunner.sh

and test.xml was from the PDE Build Tests.
Comment 10 Markus Keller CLA 2016-04-28 06:44:27 EDT
The Repo2RunnableTask that's used in the build is very outdated. The exceptions are thrown at Repo2RunnableTask.java:59 and Repo2RunnableTask.java:62, and the latest version where those line numbers match is from 4.3:
http://git.eclipse.org/c/equinox/rt.equinox.p2.git/tree/bundles/org.eclipse.equinox.p2.repository.tools/src_ant/org/eclipse/equinox/p2/internal/repository/tools/tasks/Repo2RunnableTask.java?id=2333d81a3f405da591be1b9c61999221e89b6ba0#n49

The content of the org.eclipse.equinox.p2.repository.tools_2.1.300.v20160421-0324.jar/lib/repository-tools-ant.jar in I20160427-0400 looks up-to-date, so the problem is that the test machine doesn't use the current version of that inner JAR.

AntCorePlugin#extractExtensions(String) sounds like it should extract the JAR, but I don't understand yet how this should work. But if this Ant task is outdated, it's well possible that other tasks are outdated as well (including the p2.mirror task that seems to fail).
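
One way to see what an inner JAR really contains is to read it straight out of the bundle JAR and list its entries. The following is a minimal sketch in Python, assuming the bundle is a plain zip archive with the inner JAR stored at lib/repository-tools-ant.jar as described above; `nested_jar_entries` is a hypothetical helper, not an Eclipse or p2 API.

```python
import io
import zipfile


def nested_jar_entries(bundle_path, inner_name="lib/repository-tools-ant.jar"):
    """Return (entry name, modification timestamp, CRC) for each entry of the inner JAR."""
    # Read the inner JAR's bytes out of the outer bundle JAR.
    with zipfile.ZipFile(bundle_path) as bundle:
        inner_bytes = bundle.read(inner_name)
    # Open the inner JAR from memory and list its entries.
    with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
        return [(info.filename, info.date_time, info.CRC) for info in inner.infolist()]


# Example call (the path is a placeholder, not a real location):
# for name, stamp, crc in nested_jar_entries("/path/to/org.eclipse.equinox.p2.repository.tools.jar"):
#     print(name, stamp, crc)
```

Running this against the bundle from the build drop and against whatever copy the test machine loads would show whether the two differ, which might help confirm or rule out the outdated-JAR theory.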
Comment 11 David Williams CLA 2016-04-28 08:57:08 EDT
While it may not matter, I have switched the test to run on "hudson-slave4". It has been running on "hudson-slave2" most recently. The N-builds, where I am pretty sure this test worked, have been running on 'hudson-slave4'. 

If it does make a difference, we should still strive to understand the difference between the machines, but it would at least give some comfort that our code was ok.
Comment 12 Markus Keller CLA 2016-04-28 09:35:35 EDT
(In reply to David Williams from comment #11)
> While it may not matter, I have switched the test to run on "hudson-slave4".
> It has been running on "hudson-slave2" most recently. The N-builds, where I
> am pretty sure this test worked, has been running on 'hudson-slave4'. 

In N20160423-1500, the consolelogs showed these values for env.NODE_NAME:
lin64: hudson-slave4
cen64: hippcentos

In I20160427-2000, we had:
lin64: hudson-slave2
cen64: hippcentos

=> Since the tests passed in N20160423-1500 on both linux platforms, the hudson-slave* difference doesn't explain why the CentOS test run fails in I-builds.
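
The reasoning above can be written down as a small check: a node can explain the failure by itself only if no run on that node passed. This is just a sketch in Python. The node names are the ones listed above, the pass/fail flags are an assumption drawn from this thread (N20160423-1500 passed on both Linux platforms, the I-build runs did not), and `nodes_with_mixed_results` is a hypothetical helper.

```python
from collections import defaultdict

# (build, platform, node, passed)
# The pass/fail values are assumptions based on the statements in this thread.
RUNS = [
    ("N20160423-1500", "lin64", "hudson-slave4", True),
    ("N20160423-1500", "cen64", "hippcentos", True),
    ("I20160427-2000", "lin64", "hudson-slave2", False),
    ("I20160427-2000", "cen64", "hippcentos", False),
]


def nodes_with_mixed_results(runs):
    """Return the nodes that have at least one passing and one failing run."""
    outcomes = defaultdict(set)
    for _build, _platform, node, passed in runs:
        outcomes[node].add(passed)
    return sorted(node for node, seen in outcomes.items() if seen == {True, False})


print(nodes_with_mixed_results(RUNS))  # prints ['hippcentos']
```

A node that shows up in the result both passed and failed, so the choice of node alone cannot be what breaks the tests there.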
Comment 13 David Williams CLA 2016-04-28 10:25:46 EDT
(In reply to Markus Keller from comment #10)
> The Repo2RunnableTask that's used in the build is very outdated. 

There is one place where (don't laugh) we download the very, very old "basebuilder". 

You can see "it" in the logs page. 

org.eclipse.releng.basebuilder (only used to start unit tests): R38M6PlusRC3G

In the back of my mind, I was thinking it was not even used to "start unit tests" any longer, except maybe to download stuff, but ... perhaps it is "in charge" as the "tests are found" (which is where this "createDeltaPack" is executed, not literally "as the test is running"). 

That is just a small matter of programming to use a more modern version (per platform) so could fix for RC1, I suspect -- if that is the issue. I will slog through that part of the code to A. confirm it is being used (have you ever wished the launcher had a --version command? :) and B. write some code to install a recent "binary platform" (which we already do, just at a later point). 

There are already two old bugs that (sort of) cover this, bug 324682 and bug 404612. 

Thanks to you and Tom (via IM) who pointed out that "old code is being used".
Comment 14 David Williams CLA 2016-04-28 10:48:06 EDT
(In reply to David Williams from comment #13)
> (In reply to Markus Keller from comment #10)
> > The Repo2RunnableTask that's used in the build is very outdated. 
> 
> There is one place where (don't laugh) we download the very, very old
> "basebuilder". 
> 

BTW, this doesn't automatically explain why it "worked a few times" and "still works on Windows and the Mac". My vague guess is it might be related to the launcher changing due to the removal of the 32-bit SPARC and Solaris IUs. Perhaps that was "just enough" of a "Linux change" that it somehow confuses the old code? 

Better guesses welcome.
Comment 15 David Williams CLA 2016-04-28 11:35:10 EDT
This is probably not related to this bug, and could have always been in the logs as far as I know, but I do see "warnings" in the log while creating test framework: 


[WARNING] Mirror tool: Problems resolving provisioning plan.: [Unable to satisfy dependency from org.eclipse.ant.optional.junit 3.3.200.v20160315-2119 to bundle org.apache.ant [1.6.5,2.0.0).; Unable to satisfy dependency from org.eclipse.test 3.3.200.v20151106-1314 to bundle org.apache.ant 0.0.0.; Unable to satisfy dependency from org.eclipse.test 3.3.200.v20151106-1314 to bundle org.eclipse.ui 0.0.0.; Unable to satisfy dependency from org.eclipse.test 3.3.200.v20151106-1314 to bundle org.eclipse.core.runtime 0.0.0.; Unable to satisfy dependency from org.eclipse.test 3.3.200.v20151106-1314 to bundle org.eclipse.ui.ide.application 0.0.0.; Unable to satisfy dependency from org.eclipse.test 3.3.200.v20151106-1314 to bundle org.eclipse.equinox.app 0.0.0.; Unable to satisfy dependency from org.eclipse.test.performance 3.12.0.v20160111-1759 to bundle org.eclipse.core.runtime 0.0.0.]

I think usually these "don't hurt", and frequently the dependencies appear to be added in subsequent steps, but since I do not understand it very well, I thought I would make a note of it here.
Comment 16 David Williams CLA 2016-04-28 15:48:54 EDT
The good news is that with bug 492601 fixed, the pde build tests did run.  
The bad news is that the 22 unit tests associated with the delta pack still failed, but at least that is how it should be: they fail on their own and do not "fail" the whole test sequence. 

I will be looking to fix (i.e. eliminate) "basebuilder" for RC1.
Comment 17 David Williams CLA 2016-04-28 15:52:16 EDT
Adjusting the title to reflect the current state of the problem. 

Fixing bug 492601 resolved the "DNF" issue.
Comment 18 David Williams CLA 2016-04-29 15:53:33 EDT
(In reply to David Williams from comment #13)
> (In reply to Markus Keller from comment #10)
> > The Repo2RunnableTask that's used in the build is very outdated. 
> 
> There is one place where (don't laugh) we download the very, very old
> "basebuilder". 
> 
> ...
>
> In the back of my mind, I was thinking it was not even used to "start unit
> tests" any longer, except maybe to download stuff, but ... perhaps it is "in
> charge" as the "tests are found" (which is where this "createDeltaPack" is
> executed, not literally "as the test is running". 
> 

From what I can see, the "back of my mind" was almost correct. We only used the old base builder on Linux. My guess is that it was the perennial programming problem of "fixing what breaks", and Windows and Mac OS must have broken previously. 
(The Mac probably would have broken last release, when we moved to the "Mac App" form. I have no memory of Windows "changing").

The main fix is where we launch the "runTests". 
http://git.eclipse.org/c/platform/eclipse.platform.releng.aggregator.git/tree/production/testScripts/configuration/sdk.tests/testScripts/runtests.sh#n32

We do want an "old, stable version" to do the launch, so that "running the tests" does not break based on something in "current build", but "4.5.2" should be plenty old and stable enough, yet new enough that p2 and antRunner work as expected. 

The actual commit 
http://git.eclipse.org/c/platform/eclipse.platform.releng.aggregator.git/commit/production?id=1823e4acb9abb4b673bead2a06a260f25a5b546d

contains lots of changes, in part due to "clean up", but also due to changing the results page where we used to list the version used (I now just say "base binary platform: 452") and also removed the file that retrieved the basebuilder, etc. 

The pde build tests once again work on my local machine. 


= = = = = 

For full disclosure: 

There is a tiny chance this fix may break the tests run on Windows or something else (I am not set up to test them locally) but I could find no sign of it in the code. Therefore, I'll be optimistic and declare this fixed ... and we'll find out for sure in tonight's I-build. 

Another possible "breakage" is that this "4.5.2" version of the "previousRelease" is used in some capacity by the p2 tests. I am not sure if they actually "run" that version, but even if so, I see no reason that would not work. I do not really know what those tests use it for, so we'll see. I think it works on the Mac and Windows but, for all I know, perhaps we have two copies there, or something similar. (There's a lot of code that sets up and runs these tests on multiple platforms!)
Comment 19 David Williams CLA 2016-04-30 11:52:01 EDT
I think it is mostly good news. The pde.build tests ran fine on the 
SUSE instance of Linux running on the shared instance of Hudson. 

http://download.eclipse.org/eclipse/downloads/drops4/I20160429-2300/testResults.php

But there is some bad news: all the tests, as a whole, fail to run on the CentOS machine. I have opened bug 492763 for that issue. It appears it might be related to this change, but the CentOS machine required some "special casing" that I might have gotten wrong.