Re: [cross-project-issues-dev] Hudson job keeps rebuilding

> If you are interested to know the reason (lengthy read though) here is
the note I shared with Matt and Denis.

Thanks for sharing the information. It seems very significant (as little as
I understand of it). I would encourage you to open and track such issues in
Bugzilla, as that makes it easier to follow progress and provides a better
long-term record of what happened, what was changed, etc.

Communication here is good too, though, and it sounds like at some point
you'll need to announce "the solution" and who has to change what.

Greatest thanks,





From:	Winston Prakash <winston.prakash@xxxxxxxxx>
To:	Cross project issues <cross-project-issues-dev@xxxxxxxxxxx>,
Date:	02/26/2012 03:06 PM
Subject:	Re: [cross-project-issues-dev] Hudson job keeps rebuilding
Sent by:	cross-project-issues-dev-bounces@xxxxxxxxxxx



I was also baffled by this repeated building. Initially we thought it was
caused by the Gerrit plugin. After a couple of days of poring over
everything, I finally found the reason. I discussed the reason and a
possible solution with Matt and Denis. Hopefully we can fix this next
week.

If you are interested to know the reason (lengthy read though) here is the
note I shared with Matt and Denis.

When you tie a job to a slave configuration, the following is written to
the job's configuration file:

<assignedNode>hudson-slave1</assignedNode>

When a build finishes, the file <job>/builds/<build-no>/build.xml is
created. This file contains:

<builtOn>hudson-slave1</builtOn>

When the Git plugin polls, it first checks which node the job is tied to
(in this case "hudson-slave1") and then checks which node the last build
ran on (in this case again "hudson-slave1"). If the "builtOn" node is not
the same as the "assignedNode", the Git poll triggers a build.
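
The check described above can be sketched roughly as follows. This is a minimal Python illustration, not the Git plugin's actual code; the function name poll_forces_rebuild and the <project> and <build> wrapper elements are assumptions made for the example. Only the assignedNode and builtOn elements come from the files quoted above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down file contents; only the two elements
# quoted above are taken from the real files.
JOB_CONFIG = "<project><assignedNode>hudson-slave1</assignedNode></project>"
BUILD_RECORD = "<build><builtOn>hudson-slave1</builtOn></build>"


def poll_forces_rebuild(job_config_xml, build_xml):
    """Return True when the last build ran on a node other than the tied one."""
    assigned = ET.fromstring(job_config_xml).findtext("assignedNode")
    built_on = ET.fromstring(build_xml).findtext("builtOn")
    if assigned is None:
        # Job is not tied to any node, so there is nothing to compare.
        return False
    return assigned != built_on


print(poll_forces_rebuild(JOB_CONFIG, BUILD_RECORD))
```

With the values above the two names match, so a plain name comparison like this one would not force a rebuild.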

Strangely, the Git plugin still triggers the build even though both are the
same. I wrote a small Groovy script to verify that they are the same. See
below: the object (@5301ca0c) and the display name are the same.




I noticed one odd thing. The label I printed out above says
"hudson-slave1", and it belongs to the node "hudson-slave1". Strangely, the
node "hudson-slave1" has two labels, "build2" and "hudson-slave2". There is
no such label as "hudson-slave1" (see the first picture).




However, if I look at the nodes tied to the label "hudson-slave1", it shows
the corresponding node as "hudson-slave1". This is really odd, because
there is no such label.




Coincidentally, all the jobs that are going crazy and building repeatedly
are tied to the mysterious label "hudson-slave1", which in fact doesn't
exist.

I changed the tied label of one of the jobs to "hudson-slave2", which is
the label for the node "hudson-slave1", and it stopped building repeatedly.

It appears the labels of the nodes are messed up. If we clean up this mess,
all those jobs going crazy will regain their sanity. We have two options to
clean up the mess:

- Change the label of the node "hudson-slave1" to
"hudson-slave1" (currently it is "hudson-slave2"). Then change those jobs
which are tied to the label "hudson-slave2" to "hudson-slave1"

- Leave the label of the node "hudson-slave1" as "hudson-slave2", but
change all those jobs which are tied to the label "hudson-slave1" to
"hudson-slave2"

I think option 1 is easier because only three jobs are tied to the label
"hudson-slave2", but more than 35 jobs are tied to the label
"hudson-slave1".
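
To make the comparison between the two options concrete, here is a small Python sketch that finds the jobs tied to a given label and rewrites their <assignedNode> value. The helper names (jobs_tied_to, retarget), the job names, and the <project> wrapper are all made up for illustration; a real cleanup would go through each job's configuration, and the node's own label would still have to be changed separately.

```python
import xml.etree.ElementTree as ET


def jobs_tied_to(configs, label):
    """Return names of jobs whose <assignedNode> equals the given label."""
    return sorted(
        name for name, text in configs.items()
        if ET.fromstring(text).findtext("assignedNode") == label
    )


def retarget(config_xml, old_label, new_label):
    """Return config XML with <assignedNode> changed from old_label to new_label."""
    root = ET.fromstring(config_xml)
    node = root.find("assignedNode")
    if node is not None and node.text == old_label:
        node.text = new_label
    return ET.tostring(root, encoding="unicode")


# Made-up job names and a minimal <project> wrapper, for illustration only.
configs = {
    "job-a": "<project><assignedNode>hudson-slave2</assignedNode></project>",
    "job-b": "<project><assignedNode>hudson-slave1</assignedNode></project>",
    "job-c": "<project><assignedNode>hudson-slave1</assignedNode></project>",
}

# Option 1: after relabelling the node, move the "hudson-slave2" jobs over.
to_change = jobs_tied_to(configs, "hudson-slave2")
for name in to_change:
    configs[name] = retarget(configs[name], "hudson-slave2", "hudson-slave1")

print(to_change)
```

Whichever option touches the shorter list of jobs is the cheaper one to apply.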

BTW, this also requires a Hudson restart. The good news is that we can put
the Gerrit plugin back, because it seems to have nothing to do with the
repeated builds.

- Winston

On 2/26/12 9:35 AM, Doug Schaefer wrote:
      I'm seeing the same thing. The build page says there were "No
      changes." Checking the polling log, I see:

      Started on Feb 26, 2012 3:01:15 AM
      Using strategy: Default
      [poll] Last Build : #255
      [poll] Last Built Revision: Revision
      d39f64adfb1570f6be9f07a636d4d2a1bf781562 (origin/cdt_8_0)
      Last build was not on tied node, forcing rebuild.
      Done. Took 3.1 sec
      Changes found

      "forcing rebuild" is probably the key. Is everyone seeing the same?
      What does "not on tied node" mean?

      Doug.

      On Sun, Feb 26, 2012 at 12:21 PM, David M Williams
      <david_williams@xxxxxxxxxx> wrote:


            I know there was a problem where a Gerrit Plugin was triggering jobs
            (unrelated to Gerrit) but as far as I know, that was disabled/removed
            from the production Hudson system. (Sorry, don't know bug number).
            Plus, I had a case where "URL Content Change" trigger stopped working
            right, and those jobs were building over and over again, even though
            no content change (bug 363891 [1]).

            In the past, I've sometimes found it helps "odd Hudson behavior" to
            "wipe out workspace", essentially resetting whatever is there. To get
            to that option, click on the job, then click on "Workspace" (on the
            left, upper part of the page), which displays the workspace but also
            reveals a "Wipe out workspace" option. I'm sure it is mostly
            "superstitious behavior" ... but it has seemed to help in some cases.

            In any case, I'd encourage you to see if any existing bugs are related
            to your observations [2] and, if not, to open one [3], being sure to
            note that "it is on Eclipse Infrastructure".

            Good luck,

            [1] https://bugs.eclipse.org/bugs/show_bug.cgi?id=363891

            [2]
            https://bugs.eclipse.org/bugs/query.cgi?classification=Technology&product=Hudson&query_format=advanced


            [3] https://bugs.eclipse.org/bugs/enter_bug.cgi?product=Hudson







            From:   Eike Stepper <stepper@xxxxxxxxxx>
            To:     cross-project-issues-dev@xxxxxxxxxxx,
            Date:   02/25/2012 10:58 PM
            Subject:        [cross-project-issues-dev] Hudson job keeps rebuilding
            Sent by:        cross-project-issues-dev-bounces@xxxxxxxxxxx



            Hi,

            One of my Hudson jobs,
            https://hudson.eclipse.org/hudson/job/emf-cdo-maintenance , repeatedly
            sees Git changes (SCM trigger) that do not exist. It always reports
            (a) Started by SCM change and (b) No changes. But then it triggers a
            build rather than just exit. I'm polling for SCM changes every 2 hours.
            That results in 12 unnecessary builds per day ;-(

            This strange behaviour started yesterday. Does anybody see the same
            behaviour?

            Cheers
            /Eike

            ----
            http://www.esc-net.de
            http://thegordian.blogspot.com
            http://twitter.com/eikestepper


            _______________________________________________
            cross-project-issues-dev mailing list
            cross-project-issues-dev@xxxxxxxxxxx
            https://dev.eclipse.org/mailman/listinfo/cross-project-issues-dev







