Re: [ptp-dev] Error on ProxyPacket

I tried updating my proxy to take a lock around the call that sends the 
event messages, and it made no difference. Looking at the list.c code in 
org.eclipse.ptp.utils, it should not have made a difference: the list is 
already protected with a lock, and my function is just a call to 
proxy_svr_queue_msg, which in turn is nothing more than a call to 
AddToList. So it looks like something else is going wrong.
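
To illustrate why the extra lock changes nothing, here is a minimal sketch 
in Java (not the proxy's actual C code). MsgQueue and runTwoProducers are 
hypothetical names standing in for the lock-protected list and the two 
proxy threads. It only assumes that each message is added to the list 
under the list's own lock, as described above.

```java
import java.util.ArrayList;
import java.util.List;

public class LockedQueueSketch {
    // Hypothetical stand-in for a list whose add operation takes its own lock.
    static final class MsgQueue {
        private final List<String> items = new ArrayList<>();
        private final Object lock = new Object();

        void add(String msg) {
            synchronized (lock) {
                items.add(msg);
            }
        }

        List<String> snapshot() {
            synchronized (lock) {
                return new ArrayList<>(items);
            }
        }
    }

    // Two threads enqueue concurrently; returns everything that was queued.
    static List<String> runTwoProducers(int perThread) {
        MsgQueue queue = new MsgQueue();
        Thread mainLike = new Thread(() -> {
            for (int i = 0; i < perThread; i++) queue.add("main:" + i);
        });
        Thread monitorLike = new Thread(() -> {
            for (int i = 0; i < perThread; i++) queue.add("monitor:" + i);
        });
        mainLike.start();
        monitorLike.start();
        try {
            mainLike.join();
            monitorLike.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return queue.snapshot();
    }

    public static void main(String[] args) {
        List<String> all = runTwoProducers(10000);
        System.out.println("queued " + all.size() + " messages");
    }
}
```

Because every add already runs under the queue's lock, no entry is lost or 
mangled, and wrapping the caller in a second lock adds nothing. The order 
in which the two threads' messages land is still not fixed.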

Dave



Dave Wootton/Poughkeepsie/IBM@IBMUS 
Sent by: ptp-dev-bounces@xxxxxxxxxxx
11/30/2007 09:25 AM
Please respond to
Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>


To
Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
cc
Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>, 
ptp-dev-bounces@xxxxxxxxxxx
Subject
Re: [ptp-dev] Error on ProxyPacket






Greg
I am getting the failure intermittently. I ran my application a couple of 
times with my proxy, shut the proxy down, restarted, and got the error. 
For some reason, I'm seeing this fairly consistently this morning, so I've 
attached 5 logs. log5 is the simplest case, where I just started a remote 
proxy, ran my program, and it failed immediately.

Looking at the logs, it appears something is going wrong just after the 
event containing the list of processes in the job is processed. I looked 
at my code, and the sequence at this point, where I've fork/exec'd the poe 
app, is that I should be sending a new job event, then an ok event, from 
my main thread. In the meantime, a second thread I created is watching for 
the attach.cfg file poe creates so I can get the task/node/pid mapping for 
my program. Once I have that, I create and send the event with the task 
list. Then that thread exits, and at that point the next message I get 
should be a process change event with stdout text. I don't seem to get 
that. It's possible I'm doing something wrong, but this code has been 
running for a few months now without problems until recently. The only 
possible problem I can see is that I don't properly synchronize between my 
main thread, which issues the new job event, and the monitor thread, which 
sends the task list, so in theory I could send the task list before the 
new job message. In reality, I think that's almost impossible, since it 
would mean that the poe process is fork/exec'd, the application tasks are 
created, and the attach.cfg file is created before my main thread issues a 
new job event and an ok event.

What might be happening is that I have a second potential race condition 
between this same monitoring thread and the main thread, where the main 
thread is generating the process change event with stdout text. I don't 
have a lock on the function that enqueues the event message, so it's 
possible that both threads are trying to create events and one thread's 
message gets trashed. I tried running with my proxy redirecting stdout to 
a file and I didn't see the problem. As soon as I ran with the proxy 
generating process change events again, I got the problem back.
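
If a message were trashed, or the reader slipped out of step with the 
stream, the client side could fail in the way reported at the start of 
this thread. The sketch below is only an illustration under stated 
assumptions: it assumes each message starts with an 8 hex digit length, as 
ProxyPacket.read appears to expect, and the body layout and the names 
frame and readLength are made up for the example. It is not the real PTP 
wire format or code.

```java
public class FramingSketch {
    // Assumed width of the length header, in hex digits.
    static final int LEN_DIGITS = 8;

    // Builds a frame: 8 hex digits giving the body length, then the body.
    static String frame(String body) {
        return String.format("%08x", body.length()) + body;
    }

    // Parses the length header found at the given offset.
    // Throws NumberFormatException if those 8 characters are not hex digits.
    static int readLength(String stream, int offset) {
        String header = stream.substring(offset, offset + LEN_DIGITS);
        return Integer.parseInt(header, 16);
    }

    public static void main(String[] args) {
        // Invented bodies; the leading " 00df:00" mirrors the failing input.
        String stream = frame(" 00df:00 first") + frame(" 00e0:00 second");

        // In sync: the reader starts at a frame boundary and gets a length.
        System.out.println("length = " + readLength(stream, 0));

        // Out of sync: the reader starts 8 characters late, inside the body,
        // and tries to parse " 00df:00" as a length.
        try {
            readLength(stream, LEN_DIGITS);
        } catch (NumberFormatException e) {
            System.out.println("out of sync: " + e.getMessage());
        }
    }
}
```

The point is only that a reader which starts anywhere other than a frame 
boundary will hand non-length text to Integer.parseInt. It does not say 
which side lost sync.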

I'm not sure what proxy Clement is using. If that proxy also uses threads, 
that might explain what's going on.

Dave




Greg Watson <g.watson@xxxxxxxxxxxx> 
Sent by: ptp-dev-bounces@xxxxxxxxxxx
11/28/2007 09:14 PM
Please respond to
Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>


To
Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
cc

Subject
Re: [ptp-dev] Error on ProxyPacket






I just committed a check for NumberFormatException. Can you send the 
output?

Greg

On Nov 28, 2007, at 1:35 PM, Dave Wootton wrote:

> Is there a switch I can turn on of some sort, such as a compile flag,
> that will print out the actual message data? Otherwise, the closest I
> think I can get is to paste the last event message that was logged in
> the console (before the failing message) and hope that gets close
> enough to where the problem is.
> Dave
>
>
>
> Greg Watson <g.watson@xxxxxxxxxxxx>
> Sent by: ptp-dev-bounces@xxxxxxxxxxx
> 11/27/2007 01:41 PM
> Please respond to
> Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
>
>
> To
> Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
> cc
>
> Subject
> Re: [ptp-dev] Error on ProxyPacket
>
>
>
>
>
>
> Can you get the whole proxy message that caused the error? That way
> we'd at least know where it was coming from.
>
> Greg
>
> On Nov 26, 2007, at 3:43 PM, Dave Wootton wrote:
>
>> I can't recreate this consistently. Sometimes I run my application
>> without problems, but then the next time I get the exception. My
>> traceback differs slightly at the stack entries for Integer.parseInt
>> and whatever it calls, probably due to a different Java runtime, but
>> is identical before that. In looking at the code, it looks like the
>> ProxyPacket.read method is trying to parse what it thinks is an 8 hex
>> digit integer and failing when it sees the ' ' at the start of the
>> string. Since I can't reliably recreate this, I'm not sure what's
>> happening. A couple of possibilities are that whatever is generating
>> the packet is sometimes generating garbage for length strings, or that
>> the communications sequence is out of sync and the read method is
>> reading something which is not really a length string.
>> Dave
>>
>>
>>
>> Clement Kam Man Chu <clement.chu@xxxxxxxxxxxxxxxxxxxxxx>
>> Sent by: ptp-dev-bounces@xxxxxxxxxxx
>> 11/26/2007 03:06 PM
>> Please respond to
>> Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
>>
>>
>> To
>> Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
>> cc
>>
>> Subject
>> Re: [ptp-dev] Error on ProxyPacket
>>
>>
>>
>>
>>
>>
>> Dave Wootton wrote:
>>> I've also been seeing this intermittently for a while. I updated from
>>> head today and just saw this again, using a remote proxy.
>>> Dave
>>>
>>>
>>>
>> Hi Dave,
>>
>>   Do you know how to reproduce this error?  I am not sure because this
>> error does not occur frequently.  It sometimes occurred after I
>> launched a debug job with a large number of processes.
>>
>> Clement
>>> Clement Kam Man Chu <clement.chu@xxxxxxxxxxxxxxxxxxxxxx>
>>> Sent by: ptp-dev-bounces@xxxxxxxxxxx
>>> 11/21/2007 10:38 PM
>>> Please respond to
>>> Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
>>>
>>>
>>> To
>>> Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
>>> cc
>>>
>>> Subject
>>> [ptp-dev] Error on ProxyPacket
>>>
>>>
>>>
>>>
>>>
>>>
>>> Hi,
>>>
>>> I got the following error from the latest version of head.
>>>
>>> java.lang.NumberFormatException: For input string: " 00df:00"
>>>   at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
>>>   at java.lang.Integer.parseInt(Integer.java:447)
>>>   at org.eclipse.ptp.proxy.packet.ProxyPacket.read(ProxyPacket.java:157)
>>>   at org.eclipse.ptp.proxy.client.AbstractProxyClient.sessionProgress(AbstractProxyClient.java:354)
>>>   at org.eclipse.ptp.proxy.client.AbstractProxyClient.access$8(AbstractProxyClient.java:352)
>>>   at org.eclipse.ptp.proxy.client.AbstractProxyClient$2.run(AbstractProxyClient.java:297)
>>>
>>> Clement
>>>
>>>
>>
>>
>> -- 
>> Clement Kam Man Chu
>> Research Assistant
>> Faculty of Information Technology
>> Monash University, Caulfield Campus
>> Ph: 61 3 9903 2355
>>
>> _______________________________________________
>> ptp-dev mailing list
>> ptp-dev@xxxxxxxxxxx
>> https://dev.eclipse.org/mailman/listinfo/ptp-dev

Attachment: log
Description: Binary data

Attachment: log5
Description: Binary data

Attachment: log2
Description: Binary data

Attachment: log3
Description: Binary data

Attachment: log4
Description: Binary data

