[ptp-user] Problem with LSF monitoring

I’m having a problem monitoring a system that uses LSF. Jobs show up correctly in the Active/Inactive lists, but I’m not seeing any activity in the system monitoring view. Judging from the log files, the node names are not being interpreted correctly. Would someone (Carsten?) be able to remind me what I need to do to get this working?

Thanks,
Greg


LML_da.errlog:

LML Data Access Workflow Manager 1.0, starting at (Tue Aug 12 21:35:58 IST 2014)
LML_file_obj: read  XML in 0.0006 sec
LML_file_obj: parse XML in 0.0170 sec
LML_file_obj: read  XML in 0.0001 sec
LML_file_obj: parse XML in 0.0016 sec
Use of uninitialized value $nodenum in numeric lt (<) at /home/ibm/.eclipsesettings/LML2LML//LML_gen_nodedisplay_insert_job.pm line 344.
Use of uninitialized value $nodenum in numeric gt (>) at /home/ibm/.eclipsesettings/LML2LML//LML_gen_nodedisplay_insert_job.pm line 345.
Use of uninitialized value $nodenum in array element at /home/ibm/.eclipsesettings/LML2LML//LML_gen_nodedisplay_insert_job.pm line 351.
Use of uninitialized value $nodenum in array element at /home/ibm/.eclipsesettings/LML2LML//LML_gen_nodedisplay_insert_job.pm line 351.
Use of uninitialized value $nodenum in array element at /home/ibm/.eclipsesettings/LML2LML//LML_gen_nodedisplay_insert_job.pm line 355.

LML_da.log:

execute_step: input file for step not found ./tmp_iitmlogin3_24610/datastep___init__.xml ...
execute_step: --> generating empty ./tmp_iitmlogin3_24610/datastep___init__.xml ...
"Specified Hosts"                        => "",
execute_step: output file not generated by step, renaming input file to ./tmp_iitmlogin3_24610/datastep_getdata.xml ...
reading file: ./tmp_iitmlogin3_24610/sysinfo_LML.xml  ...
LML_file_obj: read  XML in 0.0001 sec
LML_file_obj: parse XML in 0.0006 sec
reading file: ./tmp_iitmlogin3_24610/jobs_LML.xml  ...
LML_file_obj: read  XML in 0.0003 sec
LML_file_obj: parse XML in 0.0088 sec
reading file: ./tmp_iitmlogin3_24610/nodes_LML.xml  ...
LML_file_obj: read  XML in 0.0001 sec
LML_file_obj: parse XML in 0.0056 sec
scan system: type is Cluster
system_type=Cluster
check_jobs: WARNING: unset attribute 'detailedstatus' 6 occurrences
objects: total #42
        |--          6 (job)
        |--         35 (node)
        |--          1 (system)
/home/ibm/.eclipsesettings/LML_color/LML_color_obj.pl
reading file: ./tmp_iitmlogin3_24610/datastep_addcolor.xml  ...
objects: total #42
        |--          6 (job)
        |--         35 (node)
        |--          1 (system)
reading file: ./tmp_iitmlogin3_24610/layout.xml  ...
objects: total #0
tablelayout: total #2
        |--        1x15 (tl_WAIT)
        |--        1x14 (tl_Run)
nodedisplaylayout: total #1
scan system: type is Cluster
LML_gen_table::process: gid=org.eclipse.ptp.rm.lml.ui.InactiveJobsView contenttype=jobs objtype_pattern=job
Table Layout: tl_WAIT processed (0 objects found)
Table Layout: objects           of tl_WAIT copied (0 new objects)
Table Layout: info objects      of tl_WAIT copied (0 new objects)
Table Layout: info data objects of tl_WAIT copied (0 new objects)
LML_gen_table::process: gid=org.eclipse.ptp.rm.lml.ui.ActiveJobsView contenttype=jobs objtype_pattern=job
Table Layout: tl_Run processed (6 objects found)
Table Layout: objects           of tl_Run copied (6 new objects)
Table Layout: info objects      of tl_Run copied (6 new objects)
Table Layout: info data objects of tl_Run copied (6 new objects)
_get_system_type: type is 'Cluster'
_get_system_type: name is 'iitmlogin3'
_get_system_size_cluster: found    2 nodes of size: 1
_get_system_size_cluster: found   29 nodes of size: 1152
_get_system_size_cluster: found    1 nodes of size: 1169
_get_system_size_cluster: found    3 nodes of size: 1184
_get_system_size_cluster: Cluster found of size: 35
LML_gen_nodedisplay::process: gid=nd_1
get_numbers_from_name: not found >iitmc11n01-ib0-c00<
get_numbers_from_name: not found >iitmc11n01-ib0-c01<
get_numbers_from_name: not found >iitmc11n01-ib0-c02<
...
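
Since the log excerpt above is truncated, a quick way to list every node name that could not be parsed is to scan LML_da.log for the "not found" lines. The following is only a minimal sketch: the function name `unparsed_node_names` and the regular expression are illustrative assumptions based on the line format shown above, not part of LML or PTP.

```python
import re

# Assumed line format, taken from the log excerpt above:
#   get_numbers_from_name: not found >iitmc11n01-ib0-c00<
NOT_FOUND = re.compile(r"get_numbers_from_name: not found >(.*?)<")


def unparsed_node_names(log_text):
    """Return the distinct node names reported as 'not found', in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        match = NOT_FOUND.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Sample input copied from the log excerpt in this message.
sample = """\
LML_gen_nodedisplay::process: gid=nd_1
get_numbers_from_name: not found >iitmc11n01-ib0-c00<
get_numbers_from_name: not found >iitmc11n01-ib0-c01<
get_numbers_from_name: not found >iitmc11n01-ib0-c02<
"""

print(unparsed_node_names(sample))
```

To run it against the real file, read the contents of LML_da.log into a string and pass that to the function instead of `sample`.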
