[cross-project-issues-dev] IOExceptions on hudson-slave1 & -4 but not on -2

About half of the Virgo Hudson jobs have started failing in the same way, apparently since the server move at the weekend. The symptom is an IOException on some (undetermined) file.

For example, Hudson » Virgo » virgo.apps.snapshot » #1264 failed with:

/opt/users/hudsonbuild/workspace/virgo.apps.snapshot/virgo-build/common/common.xml:198: Execute failed: java.io.IOException: Cannot run program "ant" (in directory "/opt/users/hudsonbuild/workspace/virgo.apps.snapshot/org.eclipse.virgo.apps.splash"): java.io.IOException: error=2, No such file or directory

I cleaned the workspace, but the problem persisted. The git revision checked out is correct. When I download a zip of the workspace, it builds locally. Building with the -v (verbose) and -d (debug) Ant switches produces additional diagnostics, but no hint of which file is causing the IOException.

However, the failures seem to correlate with particular Hudson slaves.

hudson-slave2 doesn't hit the problem, whereas hudson-slave1 and hudson-slave4 hit it predictably. Since the Virgo jobs were configured against the build2 group, which consists of slaves 1, 2, and 4, they were prone to running on the "problem" slaves.

I'm guessing this is associated with the configuration of those slaves and the fact that the Virgo jobs locate their "shared" Ivy cache in the home directory of the slave.

I have reconfigured most of the jobs to run on hudson-slave2 and they are now passing. The virgo.medic.snapshot job is tied to hudson-slave4 and may be used to reproduce and hopefully diagnose the problem.
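
As a possible starting point for that diagnosis, here is a minimal sketch of a check that could be run as the build user on an affected slave. It rests on an assumption that has not been verified on these machines: Java commonly reports "error=2, No such file or directory" from a failed process launch either when the program (here "ant") cannot be resolved on the PATH or when the working directory does not exist, so the script tests both. The function name diagnose_launch is hypothetical; the directory is copied from the error message above.

```python
import os
import shutil


def diagnose_launch(program, workdir, path=None):
    """Check two usual suspects behind error=2 when launching a program.

    This is an illustrative helper, not part of Hudson or Ant.
    """
    # shutil.which searches the given path string, or PATH if path is None.
    resolved = shutil.which(program, path=path)
    return {
        "program_on_path": resolved is not None,
        "resolved_program": resolved,
        "workdir_exists": os.path.isdir(workdir),
    }


if __name__ == "__main__":
    # Working directory taken from the failing job's error message.
    workdir = ("/opt/users/hudsonbuild/workspace/virgo.apps.snapshot/"
               "org.eclipse.virgo.apps.splash")
    for key, value in diagnose_launch("ant", workdir).items():
        print(f"{key}: {value}")
    # Printing PATH makes it easy to compare a good slave with a bad one.
    print("PATH:", os.environ.get("PATH", ""))
```

Running the same script on hudson-slave2 and on hudson-slave4 and comparing the output would show whether the slaves differ in PATH or in the presence of the working directory. If both checks pass on a failing slave, the cause lies elsewhere and this sketch does not help.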

I wonder if slaves 1 and 4 were somehow reconfigured during the server move?

Regards,
Glyn