Re: [smila-user] Metadata in Pipelets not associated with Records

Hi Stefan,

sorry for the late reply, we were a bit busy yesterday (;

You wrote: 
> ... The initial idea is to put the uploading part and the triggering 
> part into different pipelets. The triggering pipelet however is 
> completely independent of the record ids that were passed into it 
> (as it would simply call a webservice method and be done). I guess 
> this could easily work by simply ignoring the records.

Yes, no problem. Of course, you should return the input IDs as the result
so that the next pipelet (if any) can work on the records again.
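
As a rough, language-neutral sketch of that pass-through idea (this is not the
SMILA Java pipelet API; "make_trigger_pipelet" and the "trigger" callable are
made-up names for illustration), a pipelet that ignores its records could look
like this:

```python
from typing import Callable, List


def make_trigger_pipelet(trigger: Callable[[], None]) -> Callable[[List[str]], List[str]]:
    """Build a pipelet that calls `trigger` once and ignores the records.

    `trigger` stands in for the webservice call (hypothetical).
    """

    def process(record_ids: List[str]) -> List[str]:
        # The records are not inspected at all; we only fire the side effect.
        trigger()
        # Return the input IDs unchanged so a following pipelet can use them.
        return list(record_ids)

    return process
```

The only point is the last line of "process": whatever came in goes out again.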

> Another method of the webservice lists all uploaded data and another 
> method can be used to remove specific data items. I'd like to have the 
> listing and the deletion methods be in separate pipelets. For a workflow 
> like "Delete all old data, upload new data, trigger processing", I would 
> need to create a list of "old data" and pass it to the deletion pipelet. 
> Both of them could be more or less ignorant of the records passed to 
> them, but I would need to pass on the list from the first to the second 
> pipelet. Is there a distinguished way to pass on these general kind of 
> metadata which are not really associated with a record? Or is SMILA on 
> the pipelet level just not designed for this?

If both pipelets are in the same pipeline, you could use the "setGlobalNotes"
/"getGlobalNotes" methods of the blackboard to pass something extra from one
pipelet to the next that is not associated with a record. These notes are not
persisted anywhere and are simply discarded when the pipeline is done. 
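
To illustrate the pattern only, here is a small model of it. The "Blackboard"
class below is a toy stand-in, not the real SMILA blackboard, and the note name
"oldData", the item names, and all function names are assumptions:

```python
from typing import Any, Callable, Dict, List


class Blackboard:
    """Toy stand-in for a blackboard that holds pipeline-wide notes."""

    def __init__(self) -> None:
        self._global_notes: Dict[str, Any] = {}

    def set_global_note(self, name: str, value: Any) -> None:
        self._global_notes[name] = value

    def get_global_note(self, name: str, default: Any = None) -> Any:
        return self._global_notes.get(name, default)

    def clear(self) -> None:
        self._global_notes.clear()


Pipelet = Callable[[Blackboard, List[str]], List[str]]


def list_old_data(bb: Blackboard, record_ids: List[str]) -> List[str]:
    # In reality this list would come from the webservice's listing method.
    bb.set_global_note("oldData", ["item-1", "item-2"])
    return record_ids


def make_delete_pipelet(deleted: List[str]) -> Pipelet:
    def delete_old_data(bb: Blackboard, record_ids: List[str]) -> List[str]:
        # Read the list left by the previous pipelet; "delete" by recording it.
        deleted.extend(bb.get_global_note("oldData", []))
        return record_ids

    return delete_old_data


def run_pipeline(pipelets: List[Pipelet], record_ids: List[str]) -> List[str]:
    bb = Blackboard()
    try:
        for pipelet in pipelets:
            record_ids = pipelet(bb, record_ids)
        return record_ids
    finally:
        # Notes live only for one pipeline run and are dropped afterwards.
        bb.clear()
```

Both pipelets stay ignorant of the records and only share the note.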

> Another question that popped to my mind is whether it is possible to 
> just trigger a workflow without need of the crawler to find new files? So 
> say if I want to trigger the processing every night at 2:00, could I 
> write a little cronscript which would somehow access SMILA (through the 
> REST API?) and kick off the workflow with no records at all? How do I 
> trigger this? A link to appropriate documentation would be sufficient.

Yes, easily. Just use "curl" to do HTTP requests from shell scripts. For
more convenience you can use "resty", a shell script that provides shell
HTTP commands like "POST" and "GET" that hide the details of the curl stuff.
See http://wiki.eclipse.org/SMILA/Documentation/Using_The_ReST_API#Shell_scripting
for a quick guide and links to the download sites.
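
If a script is preferred over plain curl, the same request can be sketched in
Python. The URL layout ("<base>/pipeline/<name>/process"), the base URL, the
pipeline name, and the empty JSON record are assumptions here; please check
them against the ReST API documentation linked above:

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_trigger_request(base_url: str, pipeline: str,
                          record: Optional[Dict[str, Any]] = None) -> urllib.request.Request:
    """Build a POST request that would start `pipeline` with one JSON record."""
    # Assumed URL layout; adjust to the actual ReST API of your installation.
    url = f"{base_url.rstrip('/')}/pipeline/{pipeline}/process"
    body = json.dumps(record or {}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger(base_url: str, pipeline: str,
            record: Optional[Dict[str, Any]] = None, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code."""
    request = build_trigger_request(base_url, pipeline, record)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A crontab entry such as "0 2 * * * python3 trigger_smila.py" (script name made
up) would then run a script calling "trigger" every night at 2:00.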

> Finally, I have code which can monitor the status of the processing job 
> given an id which is returned when triggering this processing (e.g. from 
> the above mentioned pipelet). Again, the pipelet triggering the 
> processing would need to store this id somewhere from which the 
> monitoring code could access it. Is it possible at all to build 
> something like: "Monitor status and if status is 'finished', trigger 
> workflow X with record set Y"? Or is SMILA simply not designed for this?

No, there is nothing yet in SMILA that could do this monitoring for you and 
trigger the workflow, so you will have to do it from outside. Is the trigger 
pipelet invoked in a pipeline called from outside? Then it could store the 
id to watch in a result record so that the caller could read it and use it 
for monitoring. If it's invoked inside an asynchronous workflow, it would 
have to store it somewhere else (a small database?) that the monitoring
code can access, too.
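
For the "small database" variant, a minimal sketch of such external monitoring
code could use SQLite. The table layout, the function names, and the
"get_status" / "trigger_workflow" callables are all hypothetical; they stand
for your existing status check and for whatever call starts workflow X:

```python
import sqlite3
from typing import Callable, List


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (and create if needed) the table of job ids to watch."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "job_id TEXT PRIMARY KEY, done INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn


def store_job_id(conn: sqlite3.Connection, job_id: str) -> None:
    """Called by the trigger pipelet (or a wrapper around it)."""
    conn.execute("INSERT OR IGNORE INTO jobs (job_id) VALUES (?)", (job_id,))
    conn.commit()


def poll_once(conn: sqlite3.Connection,
              get_status: Callable[[str], str],
              trigger_workflow: Callable[[str], None]) -> List[str]:
    """Check all open jobs once; start the follow-up for finished ones."""
    rows = conn.execute(
        "SELECT job_id FROM jobs WHERE done = 0 ORDER BY job_id"
    ).fetchall()
    finished = []
    for (job_id,) in rows:
        if get_status(job_id) == "finished":
            trigger_workflow(job_id)
            # Mark the job so the follow-up workflow is started only once.
            conn.execute("UPDATE jobs SET done = 1 WHERE job_id = ?", (job_id,))
            finished.append(job_id)
    conn.commit()
    return finished
```

Running "poll_once" from a cron job or a simple loop with a sleep would be
enough; with a file path instead of ":memory:" the pipelet and the monitor can
share the same store.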

Cheers,
Juergen.
