Re: [cross-project-issues-dev] Hudson job keeps rebuilding

The webmaster (Matt) removed the label "hudson-slave2". Since then, my jobs that were building repeatedly have stopped doing so. Are others seeing the same?

I have filed a bug to keep track of the issue:

372755 - Hudson confuses node name and label name which cause git poll to trigger build repeatedly

- Winston

On 2/27/12 2:17 PM, Winston Prakash wrote:
David,

I already announced the solution (pasted here again). I'm waiting for the webmaster to fix it.

-------->

We have two options

- Change the label of the node "hudson-slave1" to "hudson-slave1" (currently it is "hudson-slave2"). Then change those jobs
which are tied to the label "hudson-slave2" to "hudson-slave1"

- Leave the label of the node "hudson-slave1" as "hudson-slave2", but change all those jobs which are tied to the label "hudson-slave1" to
"hudson-slave2"

I think option 1 is easier because only three jobs are tied to the label "hudson-slave2", but more than 35 jobs are tied to the label
"hudson-slave1".

<---------


Also, since this requires a reboot, it would help if the job owners fixed their jobs using the other two suggestions I made earlier.

- Specify in the job to retain only about 15-20 old builds
- Reduce the amount of standard output (debug trace etc) in JUnit tests

With that done, along with the NFS change, I expect a better-performing Hudson.
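
For the first suggestion, job owners could check whether a job already limits its old builds. The sketch below assumes the limit is stored in the job's config.xml as <numToKeep> inside <logRotator>, with a non-positive value meaning "no limit"; please verify that against your own job configuration before relying on it. The sample XML and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical job configuration (assumed element names).
SAMPLE_CONFIG = """
<project>
  <logRotator>
    <daysToKeep>-1</daysToKeep>
    <numToKeep>20</numToKeep>
  </logRotator>
</project>
""".strip()


def builds_to_keep(config_xml):
    """Return the configured number of builds to keep, or None if unlimited."""
    text = ET.fromstring(config_xml).findtext(".//logRotator/numToKeep")
    if text is None:
        return None  # no build retention configured at all
    value = int(text)
    return value if value > 0 else None


limit = builds_to_keep(SAMPLE_CONFIG)
if limit is None or limit > 20:
    print("consider keeping only about 15-20 old builds")
else:
    print("job keeps %d builds" % limit)
```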

- Winston


On 2/27/12 9:27 AM, David M Williams wrote:
If you are interested to know the reason (lengthy read though) here is
the note I shared with Matt and Denis.

Thanks for sharing the information. It seems very significant (from what
little I understand of it). I would encourage you to open/track such issues in
bugzilla, as it makes it easier to track progress and provides a better
long-term record of what's happened, what was changed, etc.

Though communication here is good too, and sounds like at some point you'll
need to announce "the solution" and who has to change what.

Greatest thanks,





From:    Winston Prakash<winston.prakash@xxxxxxxxx>
To:    Cross project issues<cross-project-issues-dev@xxxxxxxxxxx>,
Date:    02/26/2012 03:06 PM
Subject:    Re: [cross-project-issues-dev] Hudson job keeps rebuilding
Sent by:    cross-project-issues-dev-bounces@xxxxxxxxxxx



I was also baffled by this repeated building. Initially we thought it was
because of the Gerrit plugin. After a couple of days of poring over
everything, I finally found the reason. I discussed the reason and a
possible solution with Matt and Denis. Hopefully we can fix this next
week.

If you are interested to know the reason (lengthy read though) here is the
note I shared with Matt and Denis.

When you tie a job to a slave configuration, the following is written in
the configuration file:

<assignedNode>hudson-slave1</assignedNode>

When a build finishes, a file is created at <job>/builds/<build-no>/build.xml.
This file contains:

<builtOn>hudson-slave1</builtOn>

When the Git plugin polls, it first checks what node the job is tied to
(in this case "hudson-slave1") and then checks on which node the last build
happened (in this case it is again "hudson-slave1"). If the "builtOn" node
is not the same as "assignedNode", then the Git poll triggers a build.
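
The check described above can be sketched in a few lines of Python. This is only an illustration of the comparison as described in this note, not the Git plugin's actual code; the function names, the root elements and the trimmed XML samples are made up for the example.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical samples of the two files mentioned above.
CONFIG_XML = "<project><assignedNode>hudson-slave1</assignedNode></project>"
BUILD_XML = "<build><builtOn>hudson-slave1</builtOn></build>"


def read_tag(xml_text, tag):
    """Return the text of the first <tag> element found, or None."""
    return ET.fromstring(xml_text).findtext(".//" + tag)


def poll_would_force_build(config_xml, build_xml):
    """Mimic the described rule: force a build when the node of the last
    build differs from the node the job is tied to."""
    tied = read_tag(config_xml, "assignedNode")
    if not tied:
        return False  # job is not tied to any node or label
    return read_tag(build_xml, "builtOn") != tied


print(poll_would_force_build(CONFIG_XML, BUILD_XML))
```

With matching values, as in our case, a plain comparison like this says no build is needed, which is why the repeated builds were so puzzling.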

Strangely, the Git plugin still triggers the build even though both are the
same. I wrote a small Groovy script to verify that both are the same. See
below: the object (@5301ca0c) and the display name are the same.




I noticed one odd thing. The label I printed out above is "hudson-slave1",
and it belongs to the node "hudson-slave1". Strangely, the node
"hudson-slave1" has two labels, "build2" and "hudson-slave2". There is
no such label as "hudson-slave1" (see the first picture).




However, if I look at the nodes tied to the label "hudson-slave1", it shows
the corresponding node as "hudson-slave1". This is really odd, because there
is no such label.




Coincidentally, all the jobs that are going crazy and doing repeated
builds belong to the mysterious label "hudson-slave1", which in fact
doesn't exist.

I changed the tied label of one of the jobs to "hudson-slave2", which is the
label for the node "hudson-slave1", and it stopped building repeatedly.

It appears the labels of the nodes are messed up. If we clean up this mess,
all those jobs going crazy will get back their sanity. We have two options to
clean up the mess:

- Change the label of the node "hudson-slave1" to
"hudson-slave1" (currently it is "hudson-slave2"). Then change those jobs
which are tied to the label "hudson-slave2" to "hudson-slave1"

- Leave the label of the node "hudson-slave1" as "hudson-slave2", but
change all those jobs which are tied to the label "hudson-slave1" to
"hudson-slave2"

I think option 1 is easier because only three jobs are tied to the label
"hudson-slave2", but more than 35 jobs are tied to the label
"hudson-slave1".

BTW, this also requires a Hudson restart, and the good news is we can put
back the Gerrit plugin, because it seems it has nothing to do with the
repeated builds.
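
To compare the two cleanup options, one could count how many jobs are tied to each label and edit the smaller group. A rough Python sketch, assuming each job's configuration is available as XML text; the job names and the count of 36 are placeholders made up for the example (the real numbers are three and "more than 35").

```python
from collections import Counter
import xml.etree.ElementTree as ET


def config_for(label):
    """Build a trimmed, hypothetical job configuration tied to `label`."""
    return "<project><assignedNode>%s</assignedNode></project>" % label


# Placeholder data: 3 jobs on one label, 36 on the other.
JOB_CONFIGS = {"job-a%d" % i: config_for("hudson-slave2") for i in range(3)}
JOB_CONFIGS.update({"job-b%d" % i: config_for("hudson-slave1") for i in range(36)})


def jobs_per_label(job_configs):
    """Count jobs by the label or node named in <assignedNode>."""
    counts = Counter()
    for xml_text in job_configs.values():
        label = ET.fromstring(xml_text).findtext(".//assignedNode")
        if label:
            counts[label] += 1
    return counts


def label_with_fewest_jobs(counts, *labels):
    """Return the label whose jobs are cheapest to re-tie."""
    return min(labels, key=lambda label: counts[label])


counts = jobs_per_label(JOB_CONFIGS)
print(label_with_fewest_jobs(counts, "hudson-slave1", "hudson-slave2"))
```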

- Winston

On 2/26/12 9:35 AM, Doug Schaefer wrote:
       I'm seeing the same thing. The build page says there were "No
       changes." Checking the polling log, I see:

       Started on Feb 26, 2012 3:01:15 AM
       Using strategy: Default
       [poll] Last Build : #255
       [poll] Last Built Revision: Revision
       d39f64adfb1570f6be9f07a636d4d2a1bf781562 (origin/cdt_8_0)
       Last build was not on tied node, forcing rebuild.
       Done. Took 3.1 sec
       Changes found

       "forcing rebuild" is probably the key. Is everyone seeing the same?
       What does "not on tied node" mean?
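
To check whether a job is affected in the same way, its polling log can be scanned for that line. A minimal Python sketch; the function name is made up, and the sample text is simply the log quoted above.

```python
# The message seen in the polling log quoted above.
FORCED_REBUILD_MARKER = "Last build was not on tied node, forcing rebuild."

SAMPLE_LOG = """\
Started on Feb 26, 2012 3:01:15 AM
Using strategy: Default
[poll] Last Build : #255
Last build was not on tied node, forcing rebuild.
Done. Took 3.1 sec
Changes found
"""


def forced_by_tied_node(polling_log):
    """Return True if the poll forced a build because of the tied-node check."""
    return any(line.strip() == FORCED_REBUILD_MARKER
               for line in polling_log.splitlines())


print(forced_by_tied_node(SAMPLE_LOG))
```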

       Doug.

       On Sun, Feb 26, 2012 at 12:21 PM, David M Williams
       <david_williams@xxxxxxxxxx>  wrote:


             I know there was a problem where a Gerrit Plugin was triggering
             jobs (unrelated to Gerrit) but as far as I know, that was
             disabled/removed from the production Hudson system. (Sorry,
             don't know bug number). Plus, I had a case where "URL Content
             Change" trigger stopped working right, and those jobs were
             building over and over again, even though no content change
             (bug 363891 [1]).

             In the past, I've sometimes found it helps "odd Hudson
             behavior" to "wipe out workspace", essentially resetting
             whatever is there. To get to that option, click on the job,
             click on "Workspace" (on left, upper part of page), which
             displays the workspace, but also reveals a "Wipe out
             workspace" option. I'm sure it is mostly "superstitious
             behavior" ... but has seemed to help in some cases.

             In any case, I'd encourage you to see if any existing bugs are
             related to your observations [2] and if not, to open one [3],
             being sure to make note "it is on Eclipse Infrastructure".

             Good luck,

             [1] https://bugs.eclipse.org/bugs/show_bug.cgi?id=363891

             [2]
             https://bugs.eclipse.org/bugs/query.cgi?classification=Technology&product=Hudson&query_format=advanced


             [3] https://bugs.eclipse.org/bugs/enter_bug.cgi?product=Hudson







             From:   Eike Stepper<stepper@xxxxxxxxxx>
             To:     cross-project-issues-dev@xxxxxxxxxxx,
             Date:   02/25/2012 10:58 PM
             Subject:        [cross-project-issues-dev] Hudson job keeps
             rebuilding
             Sent by:        cross-project-issues-dev-bounces@xxxxxxxxxxx



             Hi,

             One of my Hudson jobs,
             https://hudson.eclipse.org/hudson/job/emf-cdo-maintenance ,
             repeatedly sees Git changes (SCM trigger) that do not exist. It
             always reports (a) Started by SCM change and (b) No changes.
             But then it triggers a build rather than just exiting. I'm
             polling for SCM changes every 2 hours. That results in 12
             unnecessary builds per day ;-(

             This strange behaviour started yesterday. Does anybody see the
             same behaviour?

             Cheers
             /Eike

             ----
             http://www.esc-net.de
             http://thegordian.blogspot.com
             http://twitter.com/eikestepper


             _______________________________________________
             cross-project-issues-dev mailing list
             cross-project-issues-dev@xxxxxxxxxxx
             https://dev.eclipse.org/mailman/listinfo/cross-project-issues-dev



