[gmt-dev] [modeling-dev] Modeling projects fail to publish update site bits due to nfs problems (was Re: teneo not getting published on emf update sites)

I can wait, but it'll certainly need a look-see next week. No point ruining Denis' weekend w/ something that could take days (or minutes?) to solve.

Copying modeling-dev@ so as to minimize the freakouts in the meantime. :)

N

Webmaster (Karl Matthias) wrote:
I'm seeing some NFS errors on build like "build kernel: nfsacl: RPC call returned error 22". It looks like RPC calls for ACLs are occasionally failing, which might be what's going on. At least it makes some sense given what's happening. I don't recall seeing these before, but I need to bounce this off Denis, who has spent more time on this stuff than I have, to see what he thinks. The errors look like they've only been happening in the last couple of weeks. He's in on Monday, so unless you think this is urgent I'll wait until then to ping him.
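
To get a feel for how often these errors occur, something like the sketch below could tally them per error code. The function name and the log file argument are assumptions for illustration; the actual log location on build.eclipse.org is not stated in this thread. For what it's worth, 22 is the Linux errno value for EINVAL, whose message text is "Invalid argument", which is the same wording cp reported in the failing copy.

```shell
# Hypothetical helper: tally "nfsacl: RPC call returned error N" lines
# by error code. The log file path is supplied by the caller.
count_nfsacl_errors() {
    log_file="$1"
    grep -o 'nfsacl: RPC call returned error [0-9][0-9]*' "$log_file" \
        | awk '{ count[$NF]++ } END { for (code in count) print code, count[code] }' \
        | sort -n
}
```

Each output line is an error code followed by the number of times it was logged; no output means no matching lines were found.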

Cheers,
Karl

Martin Taal wrote:
The cp was from my home dir to the update site dir. The unzip, however, I ran from within this directory into the directory itself. I had not yet mentioned that, before trying the unzip, I tried to copy the whole interim directory to an interim_orig directory (as a backup) in /home/data/httpd/download.eclipse.org/modeling/emf/updates/
and got the same Invalid argument exception.


So my guess is that there is something wrong with /home/data/httpd/download.eclipse.org/modeling/emf/updates and not with the home directories necessarily.

gr. Martin

Nick Boldt wrote:
That's seriously foo'd up. I've never seen Linux behave so that the same command fails on the first try, then works on the second.

I expect that behaviour from Windows, or p2, but Linux? Yikes. What's the world coming to? :P

p2 has this on-again-off-again problem because of out-of-date or hard-to-ping mirrors, so sometimes an install fails and sometimes it works (using the same input). I just had this happen to me today while trying to install something that depends on a feature that was added to the BIRT 2.5 update site less than a day ago.

So... could this filesystem problem be related to network access snafus between mtaal's home dir (or mine, we tried building as nickb too) and /home/data/httpd/download.eclipse.org, both ostensibly on build.eclipse.org but in reality on different disks in different physical places?

N

Martin Taal wrote:
Hi,
I tried some things directly on build.eclipse.org in the /home/data/httpd/download.eclipse.org/modeling/emf/updates/interim directory.


First I tried to copy a zip file (with the update site) to the above update site directory. The first try fails with the Invalid argument exception; when I try it again with the exact same arguments it works fine:

mtaal@build:/home/data/httpd/download.eclipse.org/modeling/emf/updates/interim> cp ~/emf-teneo-1.1-M-M200907090216.zip .
cp: cannot create regular file `./emf-teneo-1.1-M-M200907090216.zip': Invalid argument
mtaal@build:/home/data/httpd/download.eclipse.org/modeling/emf/updates/interim> cp ~/emf-teneo-1.1-M-M200907090216.zip .
mtaal@build:/home/data/httpd/download.eclipse.org/modeling/emf/updates/interim>
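
Since the copy succeeds on a second attempt, a small retry wrapper could at least record how many attempts are needed each time. This is only a sketch for gathering evidence, not a fix; the function name, the default of three tries, and the one-second delay are assumptions.

```shell
# Hypothetical helper: try a copy up to max_tries times and report which
# attempt succeeded. Failed attempts are reported on stderr.
copy_with_retry() {
    src="$1"; dest="$2"; max_tries="${3:-3}"; delay="${4:-1}"
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        if cp "$src" "$dest"; then
            echo "copied on attempt $attempt"
            return 0
        fi
        echo "attempt $attempt failed" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}
```

For the case above it might be called as `copy_with_retry ~/emf-teneo-1.1-M-M200907090216.zip .` from inside the interim directory.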



Then I unzipped this zip file in the same directory and got these errors (truncated for this email, see the attachment for the full list):
mtaal@build:/home/data/httpd/download.eclipse.org/modeling/emf/updates/interim> unzip emf-teneo-1.1-M-M200907090216.zip
Archive: emf-teneo-1.1-M-M200907090216.zip
replace header.interim.site.xml? [y]es, [n]o, [A]ll, [N]one, [r]ename: A
error: cannot create header.interim.site.xml
error: cannot create buildUpdateSiteXML.sh
error: cannot create buildUpdateSiteDigest.sh
error: cannot create buildUpdateSiteMetadata.sh
error: cannot create jarlist.clean.modeling.eclipse.org.txt


I am not sure what's wrong, but it does not feel right, especially since I needed two cp attempts before the zip file was copied.

gr. Martin

Webmaster (Karl Matthias) wrote:
Hi Guys,

That POSIX ACL looks fine to me... are you sure you're experiencing a file permissions error?

Cheers,
Karl

Martin Taal wrote:
Hi Nick,
I mean build.eclipse.org.

Yes maybe... but to be honest I don't understand what these facl things mean...
Also Patrick has the same problems and promoting under your user did not work either.


gr. Martin

Nick Boldt wrote:
The folder /home/data/httpd/download.eclipse.org/modeling/emf/updates/interim on dev.eclipse.org (or build.eclipse.org, same thing?) says that the files are mostly owned by group modeling.emf.website; you're in that group. The only files I see which aren't in that group (and group-writable) are owned by mtaal:common.

So maybe it's an ACL problem?

$ getfacl /home/data/httpd/download.eclipse.org/modeling/emf/updates/interim

getfacl: Removing leading '/' from absolute path names
# file: home/data/httpd/download.eclipse.org/modeling/emf/updates/interim
# owner: nickb
# group: modeling.emf.website
user::rwx
user:khussey:rwx
user:nickb:rwx
group::rwx
mask::rwx
other::r-x
default:user::rwx
default:user:khussey:rwx
default:user:nickb:rwx
default:group::rwx
default:mask::rwx
default:other::r-x
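
To read output like this at a glance, the sketch below filters getfacl-style text down to the access entries that include write permission. The function name is made up for illustration, and it only parses text; it does not evaluate effective permissions. In the listing above, mtaal is not a named user, so his write access would have to come through the group entry.

```shell
# Hypothetical helper: read getfacl-style text on stdin and print the
# access entries whose permission field includes "w". Comment lines ("#"),
# "default:" entries, and lines without three fields are skipped.
writable_acl_entries() {
    awk -F: '
        /^#/ || /^default:/ || NF < 3 { next }
        $3 ~ /w/ { print $1 ":" $2 }
    '
}
```

It could be used as `getfacl /path/to/dir | writable_acl_entries`; an entry printed as `user:` or `group:` with nothing after the colon refers to the owning user or owning group.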



Martin Taal wrote:
Hi Nick
I did that, see the buildUpdateSite2.sh and buildUpdateSite3.sh

My conclusion is that it fails in this step:
               unzip -uo -qq ~/${siteLabel}-${buildID}.zip;

In the script, see the pushd statement just before it; afaik that causes the unzip to take place in the update site directory. And there it fails! The permissions on the update site directory also look strange.
I managed to get the zip file to my home directory on build.eclipse.org.
Then I have this script (see the modeling/script directory on modeling.eclipse.org and then the buildUpdateSite3.sh):
ssh mtaal@xxxxxxxxxxxxxxxxx "
echo -e 'Unzipping';
pushd /home/data/httpd/download.eclipse.org/modeling/emf/updates/interim >/dev/null;
unzip -uo -qq ~/emf-teneo-1.1-M-M200907090216.zip
";
this one fails.
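
One thing worth ruling out with a script like this: the commands are joined with ";", so if the pushd itself fails, the unzip still runs, just in whatever directory the session started in. A sketch of a guard is below; the function name and the exit code 2 are assumptions, not part of the existing scripts.

```shell
# Hypothetical helper: run a command inside target_dir, but only if
# entering the directory succeeds. The subshell leaves the caller's
# working directory unchanged.
run_in_dir() {
    target_dir="$1"; shift
    (
        cd "$target_dir" || { echo "cannot enter $target_dir" >&2; exit 2; }
        "$@"
    )
}
```

For example: `run_in_dir /home/data/httpd/download.eclipse.org/modeling/emf/updates/interim unzip -uo ~/emf-teneo-1.1-M-M200907090216.zip`. An exit status of 2 would then point at the directory rather than at unzip.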


Can you check/see what is wrong with the update site directory permissions on build.eclipse.org?

gr. Martin

Nick Boldt wrote:
check buildUpdateSite.sh - match console output to echo "..." statements.

Martin Taal wrote:
We are in the same boat :-), with you as a user it also fails:
http://modeling.eclipse.org/promo_logs/promo_log_teneo_1.1.1.M200907090216_2009-07-10-05.37.52.txt



As the error shows the complete contents of the zip I assume that the zip file is there and correct...
So I would think the error is with the target location. Does the unpack use a specific target directory? Or does it unpack in the home directory of the user?


Where is this unpack command? I could not find it in promoteToEclipse.sh

gr. Martin

Nick Boldt wrote:
sure, you can switch "mtaal" for "nickb" in your _common.php file.

Martin Taal wrote:
Hi Nick,
And is it possible to promote under another user (you! for example)? Just for this and maybe one other build before moving on to the new build system.


gr. Martin

Martin Taal wrote:
damn same result... I am at a total loss. Well going to sleep now, tomorrow is another day...

Thanks for your help Nick, of course if you get an idea while I am sleeping and you are awake then let me know!

gr. Martin

Nick Boldt wrote:
cd /var/www/html/modeling/emf; sudo su; chown -R apache:www *; chmod -R g+w *

Done. Try again?

Martin Taal wrote:
I tried but there is no time to press ctrl-c between the upload and the remove of the temp files....
In addition it seems that ctrl-c does not work (it looks like a forked process or something).


One thing I find strange is that in the directory:
/var/www/html/modeling/emf/updates/interim

I see permissions for apache www and mtaal users. Everyone has apache www except the teneo ones I have been trying out. Can this be a symptom of a problem?

gr. Martin


Nick Boldt wrote:
yeah, you can run promoteToEclipse.sh via commandline and hit CTRL-C
while it's running. :)


On Thu, Jul 9, 2009 at 3:11 PM, Martin Taal<mtaal@xxxxxxxxx> wrote:
I saw in the log that there is a remove step of temporary zips etc. Can this
remove be disabled? To see which files are really there.


gr. Martin

Nick Boldt wrote:
So... you scp a file to mtaal@xxxxxxxxxxxxxxxxx's home dir, then ssh
in to unpack it and the unpacking fails to create files.


I tried ssh'ing in by hand as you and touching files in your home dir
and the target dir and it worked fine.
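
Because a single touch can succeed even when the problem is intermittent, repeating the probe a number of times may be more telling. A rough sketch follows; the function name, the probe file naming, and the default of ten tries are assumptions.

```shell
# Hypothetical helper: create and remove a probe file several times in a
# directory and report how many creations failed.
probe_dir_writes() {
    dir="$1"; tries="${2:-10}"
    failures=0
    i=1
    while [ "$i" -le "$tries" ]; do
        probe="$dir/.write_probe.$$.$i"
        if touch "$probe" 2>/dev/null; then
            rm -f "$probe"
        else
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures/$tries creates failed in $dir"
    [ "$failures" -eq 0 ]
}
```

Running it against both the home dir and the update site dir would show whether only one of them misbehaves.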


Weirder is that this has worked for months untouched... until now.

Has anything changed in your .bashrc on build.eclipse.org? Did the
reboot a couple weeks ago break anything, like umask settings, default
group, or has the filesystem changed so that "~" isn't properly
resolving to your home dir when pushed up as
mtaal@xxxxxxxxxxxxxxxxx:~/emf-teneo-1.1-M-M200907090216.zip ?


Do we know for sure that that temporary zip is even being produced on
modeling.eclipse? Could there be a failure higher up in the build
itself?
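
A quick way to answer the environment questions above is to print the relevant settings from the same kind of session the promote script uses. The function name is made up for illustration, and the output labels are arbitrary.

```shell
# Hypothetical helper: print the session settings that affect where files
# land and with which ownership and permissions they are created.
report_session_env() {
    echo "user=$(id -un)"
    echo "group=$(id -gn)"
    echo "umask=$(umask)"
    echo "home=$HOME"
}
```

Comparing the output from an interactive login with the output from a non-interactive ssh command would show whether the two kinds of session see different settings.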


N

On Thu, Jul 9, 2009 at 1:14 PM, Martin Taal<mtaal@xxxxxxxxx> wrote:

Here it is:

http://modeling.eclipse.org/promo_logs/promo_log_teneo_1.1.1.M200907090216_2009-07-09-13.08.14.txt


My guess is that moving to Athena will be a few weeks work for me. The last
time I did major things regarding builds was in February (getting a separate
eclipselink feature). That took me about a month throughput time. So I am
not at all optimistic about the time it takes to accomplish that....


So as these builds are done and solve some issues it would be nice to get
them out of the way and then move to the new build environment.


gr. Martin

Nick Boldt wrote:

Try it again.

The line in the log with "Promote zip to build.eclipse.org..." has been
changed to "Promote zip to $eclipseSSHUser:~/${siteLabel}-${buildID}.zip",
which will give us a clue as to whose account is being used and where the
zip is created.


I wonder if the problem is nfs write lag? In theory using your home dir on
build.eclipse should be fine, but maybe there's a write delay between
uploading the zip (scp or rsync) and it being available when we ssh in a few
seconds later to unpack it & run the script to generate the metadata.
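
If write lag is the suspect, the unpack step could wait until the uploaded zip is visible and has stopped changing size before touching it. The sketch below is one way to do that; the function name, the default of 30 polls, and the one-second interval are assumptions, and a stable size is only a heuristic, not proof that the file is complete.

```shell
# Hypothetical helper: wait until a file exists and reports the same size
# on two consecutive polls. Returns 0 once stable, 1 if max_polls runs out.
wait_for_stable_file() {
    path="$1"; max_polls="${2:-30}"; interval="${3:-1}"
    last_size=-1
    polls=0
    while [ "$polls" -lt "$max_polls" ]; do
        if [ -f "$path" ]; then
            size=$(wc -c < "$path" | tr -d ' ')
            if [ "$size" -eq "$last_size" ]; then
                return 0
            fi
            last_size=$size
        fi
        polls=$((polls + 1))
        sleep "$interval"
    done
    return 1
}
```

It would need to run on the machine that does the unpacking, e.g. `wait_for_stable_file ~/emf-teneo-1.1-M-M200907090216.zip && unzip ...`, since that is where the lag would show up.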


We used to have this problem when we used download1.eclipse.org because
you never knew which node you'd get with each connection and it could take
up to 30 mins for the uploaded file to be available on all nodes. But with
the increased space on build.eclipse, I thought we were past this
problem...


Of course if you switch to the Athena build then you run on build.eclipse
and publishing is a simple copy (no ssh connection required)... if problems
persist, that'd be the way to go IMHO.


N

Martin Taal wrote:

Hi Nick,
Thanks, the promote gets further but I still see errors around the update
of the interim site:



http://modeling.eclipse.org/promo_logs/promo_log_teneo_1.1.1.M200907090216_2009-07-09-12.11.47.txt



gr. Martin

Nick Boldt wrote:

File and dir permissions were incorrect.

You can now

$ W bash
$ ssh mtaal@xxxxxxxxxxxxxxxxxxxx

without a password prompt; try your promote again.

Martin Taal wrote:

Hi Nick,
The build on modeling.eclipse.org went fine. However with promote I get
a permission denied exception:



http://modeling.eclipse.org/promo_logs/promo_log_teneo_1.1.1.M200907090216_2009-07-09-05.06.23.txt



I can ssh from modeling to build, to dev and to download1 without
logging in. I can also ssh as the webuser to build:
W ssh mtaal@xxxxxxxxxxxxxxxxx


I updated the promote properties from cvs also.

So I am not sure what the remaining permission issue can be.

gr. Martin

Nick Boldt wrote:

Fwiw, build.eclipse.org is a real 4-core ppc box with lots of ram and
nfs-mounted disc space.


If you want to move there, Athena does everything the old Modeling
system does except galileo .build file generation (coming soon) and
javadoc generation in the doc plugin (waiting for someone to
rearchitect the crap we have now). You'd also get your own update site
instead of being merged w/ EMF.


Something to consider.

N


On 7/8/09, Nick Boldt <nickboldt@xxxxxxxxx> wrote:


It's a virtual server, so I'm not sure. 'cat /proc/meminfo' should
tell you, iirc.


On 7/8/09, Martin Taal <mtaal@xxxxxxxxx> wrote:


Hi Nick,
Thanks I am building right now on that server. One question, it seems
that this server has again (only) 512mb. Is that really true?


gr. Martin

Nick Boldt wrote:


ssh login for modeling.eclipse: use u: mtaal, p: mtaal1 (please
change
it once you're in using `passwd`)


for build page, use u: emf-build, p: emf$YAll

migration steps are here:

https://bugs.eclipse.org/bugs/show_bug.cgi?id=273485#c19


Martin Taal wrote:


Yes I did try:
mtaal@xxxxxxxxxxxxxxxxx
and that worked fine (no password needed)


Can you forward your note again?

what is the login/password for modeling.eclipse.org?

I don't remember the pwd for emft.eclipse.org anymore.....
(login was emft-build)


gr. Martin

Nick Boldt wrote:


Close... But you should be connecting as mtaal@xxxxxxxxxxxxx, I think.


Have you looked at building on modeling.eclipse instead of emft.eclipse?


Emft goes off the air next week... Did you see my note about that?



On 7/8/09, Martin Taal <mtaal@xxxxxxxxx> wrote:




ssh mtaal@xxxxxxxxxxxxxxxxx works fine
when doing:
W ssh apache@xxxxxxxxxxxxxxxxx or
ssh www-data@xxxxxxxxxxxxxxxxx

then I need to enter a password for both.

Is this what you mean?

gr. Martin

Nick Boldt wrote:



Check that you and www-data@ or apache@ can ssh to build.eclipse.org,
and that you have write permission once there.


On 7/8/09, Martin Taal <mtaal@xxxxxxxxx> wrote:




Hi Nick,
It seems that Teneo maintenance builds are not published on EMF update
sites. Can you see what is wrong?
Here is the promolog (which contains several errors):



http://emft.eclipse.org/promo_logs/promo_log_teneo_1.1.1.M200907080536_2009-07-08-06.23.36.txt





Thanks!

--

With Regards, Martin Taal

Springsite/Elver.org
Office: Hardwareweg 4, 3821 BV Amersfoort
Postal: Nassaulaan 7, 3941 EC Doorn
The Netherlands
Cell: +31 (0)6 288 48 943
Tel: +31 (0)84 420 2397
Fax: +31 (0)84 225 9307
Mail: mtaal@xxxxxxxxxxxxxx - mtaal@xxxxxxxxx
Web: www.springsite.com - www.elver.org
















--
Sent from my mobile device

Nick Boldt :: JBoss by Red Hat
Productization Lead :: JBoss Tools & Dev Studio
Release Engineer :: Dash Athena
http://nick.divbyzero.com








--
Nick Boldt :: http://nick.divbyzero.com
Release Engineer :: Eclipse Modeling & Dash Athena
_______________________________________________
modeling-dev mailing list
modeling-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/modeling-dev