Bug 271064 - gLite job submit fails
Summary: gLite job submit fails
Status: NEW
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: Geclipse
Version: unspecified
Hardware: PC Linux
Importance: P3 normal
Target Milestone: ---
Assignee: Pawel Wolniewicz CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-04-02 17:55 EDT by Kestas CLA
Modified: 2014-01-09 16:01 EST

See Also:


Attachments
gEclipse error log. (5.06 KB, text/plain)
2009-04-02 17:55 EDT, Kestas CLA

Description Kestas CLA 2009-04-02 17:55:36 EDT
Created attachment 130773
gEclipse error log.

Build ID: I20080617-2000

Steps To Reproduce:
1. install http://www.geclipse.org/download/gEclipse_1.0RC2-linux.gtk.x86_64.tar.gz
2. register in balticgrid VO (https://voms.balticgrid.org:8443/voms/balticgrid/webui/request/user/create)
3. submit a job to the "balticgrid" VO via any of its WMSes (submit fails with an SSL error)


More information:
- data management works ok (gridftp, srm)
- if I take workspace/.metadata/.plugins/eu.geclipse.core/.tokens/VOMS\ Proxy\@balticgrid#01 and place it in X509_USER_PROXY=/tmp/x509up_u5000, I can successfully run jobs using glite-wms-* tools.
- 3 more balticgrid VO users were able to reproduce this bug on Windows and Linux.
- 1 balticgrid user is able to submit jobs successfully
-- if I replace my .tokens/VOMS\ Proxy with his .tokens/VOMS\ Proxy - I can submit jobs;
-- if this user replaces his .tokens/VOMS\ Proxy with mine - he can't submit jobs.
-- when we replaced the user certificates with mine on the same machine/account from which that user had been submitting jobs successfully, we couldn't submit jobs.
- user certificates we are using are issued by BalticGrid CA.
- I was using gEclipse at GridKa School 2008 on Gilda and it worked ok.
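
The command-line cross-check described above (reusing the gEclipse token as the proxy for the glite-wms-* tools) could be scripted roughly as follows. This is only a sketch: the helper names `proxy_env` and `submit_with_proxy` are hypothetical, the default proxy path is the one quoted above, and it assumes `glite-wms-job-submit` is on PATH and accepts `-a` (automatic delegation) followed by a JDL file.

```python
import os
import shutil
import subprocess


def proxy_env(proxy_path, base_env=None):
    """Return a copy of the environment with X509_USER_PROXY set to proxy_path."""
    env = dict(os.environ if base_env is None else base_env)
    env["X509_USER_PROXY"] = proxy_path
    return env


def submit_with_proxy(token_path, jdl_path, proxy_path="/tmp/x509up_u5000"):
    """Copy a gEclipse token to proxy_path and submit jdl_path from the command line.

    Hypothetical helper: assumes glite-wms-job-submit is installed.
    """
    shutil.copyfile(token_path, proxy_path)
    # Proxy files are normally expected to be private to their owner.
    os.chmod(proxy_path, 0o600)
    if shutil.which("glite-wms-job-submit") is None:
        raise RuntimeError("glite-wms-job-submit not found on PATH")
    return subprocess.run(
        ["glite-wms-job-submit", "-a", jdl_path],
        env=proxy_env(proxy_path),
        capture_output=True,
        text=True,
        check=False,
    )
```

Running the same token through both gEclipse and the command-line tools keeps the credential constant, so any difference in behaviour points at the submission path rather than at the proxy itself.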
Comment 1 Kestas CLA 2009-04-06 08:58:03 EDT
We have also tried to set up a new CA and VO. We issued new user certificates, tried gEclipse and got the same error. Standard glite-wms-job-* commands work ok in this VO.
Comment 2 Mathias Stümpert CLA 2009-04-06 10:08:01 EDT
Hi Kestas,

This indeed seems to be a weird thing. Two suggestions I can give you to sort this out:

1) Make sure that the machines' clocks are in sync with the WMS's clocks. We already had a few weird auth problems in the past due to non-synced clocks. Also, http://www.globus.org/mail_archive/discuss/2001/Archive/msg00109.html suggests syncing the machine clocks in case of an SSLv3 alert number 42.

2) Make sure that the imported CA certificates are fine, i.e. the necessary certs are really in place and they are not expired or invalid or something else.

As easy as 1) sounds, there are a few pitfalls to avoid (AM or PM, wrong date, etc.).

Cheers, Mathias
Comment 3 Kestas CLA 2009-04-06 16:56:56 EDT
Thanks for the response :)

I've retested it on a gLite UI machine (x86, not x86_64), so that gEclipse and the glite-wms-* commands run on the same host:
- the machine (pupa.elen.ktu.lt) has ntpd running and the time is correct (the same machine works as CE and SE, so there would be problems otherwise)
- the only CA I imported is BalticGrid CA (ca_BalticGrid-1.28.tar.gz) and it seems valid to me (diff shows no differences between .metadata/.plugins/eu.geclipse.core/.security/2a237f16.0 and /etc/grid-security/certificates/2a237f16.0).
- I also imported balticgrid VO (had to add ldap://runkelis.elen.ktu.lt:2170 by hand, also added wms.bg.ktu.lt as WMS)
- data management works fine
- created a simple job and tried to submit it to 'balticgrid VO' via two WMSes; got the same error.
- copied .metadata/.plugins/eu.geclipse.core/.tokens/VOMS\ Proxy\@balticgrid#01 to /tmp/x509up_??? and tried to submit a simple job via glite-wms-job-submit to the same WMSes. It worked ok.

NTP on UI and WMS:

[kestas@pupa gridtest]$ /usr/sbin/ntpdate -q europe.pool.ntp.org
server 193.2.111.3, stratum 2, offset -0.029147, delay 0.07211
 6 Apr 23:32:18 ntpdate[8041]: adjust time server 193.2.111.3 offset -0.029147 sec

[root@wms ~]# /usr/sbin/ntpdate -q 193.2.111.3
server 193.2.111.3, stratum 2, offset 0.034261, delay 0.06856
 6 Apr 23:32:52 ntpdate[15699]: adjust time server 193.2.111.3 offset 0.034261 sec

Kestas
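
The two ntpdate readings above were taken against the same server (193.2.111.3), so subtracting the offsets gives an approximate UI-to-WMS clock difference. A minimal sketch of that arithmetic, assuming the `ntpdate -q` summary line format shown above; the function names are illustrative only:

```python
import re

# Matches the summary line printed by `ntpdate -q`, for example:
# "server 193.2.111.3, stratum 2, offset -0.029147, delay 0.07211"
_OFFSET_RE = re.compile(
    r"^server\s+(\S+),\s+stratum\s+\d+,\s+offset\s+(-?\d+(?:\.\d+)?),"
)


def parse_offsets(ntpdate_output):
    """Return {server: offset_in_seconds} parsed from `ntpdate -q` output."""
    offsets = {}
    for line in ntpdate_output.splitlines():
        match = _OFFSET_RE.match(line.strip())
        if match:
            offsets[match.group(1)] = float(match.group(2))
    return offsets


def relative_skew(offset_a, offset_b):
    """Approximate clock difference of two hosts measured against one server."""
    return abs(offset_a - offset_b)


ui = "server 193.2.111.3, stratum 2, offset -0.029147, delay 0.07211"
wms = "server 193.2.111.3, stratum 2, offset 0.034261, delay 0.06856"
skew = relative_skew(
    parse_offsets(ui)["193.2.111.3"], parse_offsets(wms)["193.2.111.3"]
)
print(f"UI vs WMS skew: {skew:.6f} s")  # UI vs WMS skew: 0.063408 s
```

A difference of roughly 63 ms is far below anything that would normally affect certificate validity checks, which is consistent with ruling out clock skew here.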
Comment 4 Mathias Stümpert CLA 2009-04-06 18:31:14 EDT
Just to rule the CA thing out try to import all CA certificates from the EUGridPMA repository. The fact that one of your user certificates seems to be ok and the other is not could point to a PKI related problem. If you're using your user cert from the command line you normally also have all CA certs available there, so you should do the same within g-Eclipse.
Comment 5 Kestas CLA 2009-04-07 07:38:28 EDT
I've imported all EuGridPMA certificates to gEclipse. Job submit fails
the same as before.

Tried to compare
workspace3/.metadata/.plugins/eu.geclipse.core/.security/ to
/etc/grid-security/certificates. Differences:
1149214e.0 - has different line endings (unix and dos)
16da7552.0 - has decoded text
34f8e29c.0 - has decoded text and dos/unix line endings
5e5501f3.0 - has decoded text
9b59ecad.0 - has decoded info
b93d6240.0 - different formatting
c4435d12.0 - has decoded text
cc800af0.0 - has empty line
e1fce4e9.0 - only in /etc/grid-security/certificates
fe102e03.0 - has decoded text and dos/unix line endings

Afterwards, I removed all certificates from gEclipse and added all certs from
/etc/grid-security/certificates. Regenerated VOMS proxy. Tried to submit
job. Submit failed.

After that, I removed all certificates and imported all certificates via
gEclipse (accredited, experimental, worthless) except GILDA, which is
missing. Regenerated VOMS proxy, retried submit. Submit failed.


Kestas
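
Most of the differences listed above (line endings, extra decoded text, blank lines) are cosmetic and do not change the certificate itself. One way to compare two copies while ignoring such noise is to extract only the base64 body between the PEM markers and hash the decoded bytes. The sketch below assumes PEM-encoded files; the function names are illustrative and the sample payload is a placeholder, not a real certificate.

```python
import base64
import hashlib
import re

# Captures the base64 body of each PEM certificate block in a file.
_PEM_RE = re.compile(
    r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----", re.DOTALL
)


def cert_digests(file_text):
    """Return SHA-256 hex digests of the decoded bytes of every PEM block."""
    digests = []
    for body in _PEM_RE.findall(file_text):
        # Dropping all whitespace makes unix/dos line endings irrelevant.
        der = base64.b64decode("".join(body.split()))
        digests.append(hashlib.sha256(der).hexdigest())
    return digests


def same_certificates(text_a, text_b):
    """True if both texts contain the same certificates in the same order."""
    return cert_digests(text_a) == cert_digests(text_b)


# Placeholder payload standing in for real certificate bytes.
payload = base64.b64encode(b"placeholder-bytes" * 10).decode()
wrapped = "\n".join(payload[i:i + 64] for i in range(0, len(payload), 64))
unix_copy = (
    "-----BEGIN CERTIFICATE-----\n" + wrapped + "\n-----END CERTIFICATE-----\n"
)
dos_copy = unix_copy.replace("\n", "\r\n")
with_text = "Certificate:\n    Data:\n        Version: 3 (0x2)\n" + unix_copy

print(same_certificates(unix_copy, dos_copy))   # True
print(same_certificates(unix_copy, with_text))  # True
```

To apply it to the two directories, read each pair of files as text and pass the contents to `same_certificates`; files that differ only in formatting compare equal, so any remaining mismatch is a real difference in certificate content.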
Comment 6 Harald Kornmayer CLA 2009-04-07 08:48:21 EDT
I assigned the bug to Pawel. He is involved in job submission and can perhaps check on BalticGrid resources.

Pawel, can you take care of this?

Comment 7 Pawel Wolniewicz CLA 2009-06-09 08:33:50 EDT
The error comes from the gLite server and suggests an internal problem with the connection between WMS and LB.
Also, the command line submission uses a different server frontend on another port, so it is hard to compare g-Eclipse submission and command line submission. It can happen that one of the WMS server frontends is working and the other is not.

But, is the error still valid?

I checked submission and I can submit jobs to balticgrid WMSes with no problem.
I see 6 WMSes and I can submit jobs to all of them.
Job IDs:

https://wms.grid.vgtu.lt:9000/mtXf9V4AcqFjqoHe7wy6Qw
https://wms.grid.etf.rtu.lv:9000/_Jlu8GlWDmr53X7d8Oowvg
https://lxb067.mif.vu.lt:9000/n4Q0sTI9xbihph37Al-EQg
https://broker.eenet.ee:9000/TJan3Twsj7M48yIqFf3wsA
https://lb.grid.cyf-kr.edu.pl:9000/ZEV9YCoKckafxsRYMn4yNg
https://wms.basnet.by:9000/n7MjLMUXPmOutI-fDYBaaA

If you still experience problems, please contact the WMS admin to check if there are any errors in the logs.
Comment 8 Kestas CLA 2009-06-09 09:12:43 EDT
Yes, the error is still valid (just tried the basnet.by, mif.vu.lt and cyf-kr.edu.pl brokers).

The certificate which was working previously for one user has expired, and the new one doesn't work with gEclipse. So currently no one is able to use gEclipse on BalticGrid, as far as I know.

Previously we already tried to debug this problem on the WMS/LB but found nothing in the logs. With strace we found the server was trying to verify some certificate (only the hash was visible, and it doesn't match any installed CA).

Kestas
Comment 9 Pawel Wolniewicz CLA 2009-06-09 09:30:40 EDT
I am able to use gEclipse on Balticgrid :)
I can create BalticGrid VOMS proxy and successfully submit jobs to basnet.by, mif.vu.lt and cyf-kr.edu.pl. 

g-Eclipse is using the standard user certificate and VOMS extension. No additional CA should be involved. I am quite convinced that the problem is on the server side...

I am using a certificate signed by Polish Grid CA.
Which certificate are you using?
What is the problematic hash?
Are you able to access gridftp on any Balticgrid SE? (please try also Poznan SE)





Comment 10 Kestas CLA 2009-06-09 10:35:17 EDT
Yes, I meant no one with BalticGrid CA issued certificates. I'm using a BalticGrid CA certificate.

While tracing glite_wms_wmproxy_server:

[pid 26106] stat64("/etc/grid-security/certificates//778a17ab.0", 0xbfef5b14) = -1 ENOENT (No such file or directory)

[pid 26113] stat64("/etc/grid-security/certificates//778a17ab.0", 0xbfe9aeb4) = -1 ENOENT (No such file or directory)
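
The file name in these failed lookups follows the OpenSSL hashed-directory convention (`<subject-hash>.<n>`), so each ENOENT names a CA certificate the server looked for and did not find. A small sketch for pulling those names out of a longer strace capture, assuming output lines shaped like the two above; `missing_ca_hashes` is a hypothetical helper name:

```python
import re

# Matches failed stat/stat64 calls on hashed certificate files, e.g.
# stat64("/etc/grid-security/certificates//778a17ab.0", 0x...) = -1 ENOENT
_MISSING_RE = re.compile(
    r'stat(?:64)?\("[^"]*/certificates/+(?P<name>[0-9a-f]{8}\.\d+)"'
    r"[^)]*\)\s*=\s*-1 ENOENT"
)


def missing_ca_hashes(strace_text):
    """Return the sorted, de-duplicated hashed file names that were not found."""
    return sorted({m.group("name") for m in _MISSING_RE.finditer(strace_text)})


sample = (
    '[pid 26106] stat64("/etc/grid-security/certificates//778a17ab.0", '
    "0xbfef5b14) = -1 ENOENT (No such file or directory)\n"
    '[pid 26113] stat64("/etc/grid-security/certificates//778a17ab.0", '
    "0xbfe9aeb4) = -1 ENOENT (No such file or directory)\n"
)
print(missing_ca_hashes(sample))  # ['778a17ab.0']
```

Comparing the resulting names with the hash of the issuer of each certificate in the submitted chain would show which link in the chain the WMS cannot resolve.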

Gridftp works ok:
$ uberftp se.reef.man.poznan.pl
220 se.reef.man.poznan.pl GridFTP Server 2.3 (gcc64dbg, 1144436882-63) ready.
230 User balticgridsgm001 logged in.
uberftp> ls
drwxr-xr-x    7  root  root   4096 May 30 04:04  lib64
drwxr-xr-x    2  root  root   4096 Mar 14 22:56  srv
drwx------    2  root  root  16384 Dec 20 16:02  lost+found
drwxr-xr-x    2  root  root   4096 Jan 27 04:02  bin
drwxr-xr-x   12  root  root   4096 May 20 13:02  mnt
drwxr-xr-x    3  root  root   4096 Dec 20 15:33  tftpboot
drwxr-xr-x    2  root  root   4096 Dec 20 15:32  afs
-rw-r--r--    1  root  root      0 Apr 15 09:53  .autofsck
drwxr-xr-x   88  root  root  12288 Jun  8 14:24  etc
drwxr-xr-x   12  root  root   4096 Mar 26 21:17  opt
drwxr-xr-x    4  root  root   1024 Dec 20 16:09  boot
drwxr-xr-x   23  root  root   4096 Apr 20 01:44  usr
drwxr-xr-x   24  root  root   4096 Apr 20 01:44  var
drwxr-xr-x    2  root  root   4096 Nov 19 14:54  misc
dr-xr-xr-x  319  root  root      0 Apr 15 11:52  proc
drwxr-xr-x   11  root  root   4096 Jul 24 04:02  lib
-rw-------    1  root  root     32 Nov 16 16:16  .bash_history
drwxr-xr-x    2  root  root   4096 Jan 17 11:55  media
drwxr-xr-x   10  root  root   6160 May 29 17:26  dev
drwxr-xr-x    9  root  root      0 Apr 15 11:52  sys
drwxr-xr-x    2  root  root  12288 Apr 20 01:44  sbin
drwxr-xr-x   27  root  root   4096 Apr 15 09:53  .
drwxrwxrwx    6  root  root   4096 Jun  9 12:07  tmp
drwxr-xr-x   16  root  root   4096 Jun  9 15:55  root
drwxr-xr-x    2  root  root   4096 Mar 14 22:56  initrd
drwxr-xr-x    2  root  root   4096 Mar 12 12:13  data
drwxr-xr-x   27  root  root   4096 Apr 15 09:53  ..
drwxr-xr-x    2  root  root   4096 Dec 20 15:03  selinux
drwxr-xr-x   15  root  root   4096 May 21 14:19  home


Kestas