[smila-user] Setting bulkLimitSize in jobmanager json file

Hi Andreas,
this is exactly what I need.
I did not find it in the documentation for the jobManager or any of the other "managers", only a mention in http://wiki.eclipse.org/SMILA/Documentation/Importing/CrawlingMultipleStartURLs

However, what about executing BPEL pipelines directly through the REST interface (calling the BPEL pipeline directly, without using a standard workflow)?

I see no way to set a specific default bulkLimitSize for "pipelineProcessor", or am I wrong?

Thank you

On 17/01/2014 8.45, smila-user-request@xxxxxxxxxxx wrote:
Message: 2
Date: Fri, 17 Jan 2014 08:45:12 +0100
From: Andreas Schank <andreas.schank@xxxxxxxxxxx>
To: Smila project user mailing list <smila-user@xxxxxxxxxxx>
Subject: Re: [smila-user] SMILA in a Cluster (Andreas Weber)

Hi Lorenzo,

You cannot set the parameters for a worker in the "worker.json" file. That file only declares which parameters can be used in a job (or in a workflow, if you need to ensure that all jobs for that workflow use the same parameter) to configure the worker's behaviour for that job.

You set the actual values in "jobs.json", in "workflow.json", or in any job or workflow definition that you push to the SMILA jobmanager API.

For example, define these parameters in your job like this:
{
       "name" : "myIndexUpdate",
       "workflow" : "indexUpdate",
       "parameters" : {
             "tempStore" : "temp",
             "bulkLimitSize" : "1k"
       }
}
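
A minimal sketch of how such a job definition could be built and pushed from a script, in case that is easier than editing "jobs.json". The endpoint URL below and the helper names build_job and build_request are assumptions for illustration and are not taken from this thread; check them against your own installation before use.

```python
import json
import urllib.request

# Assumption: REST endpoint for job definitions on a local default setup.
# Adjust host, port and path to match your installation.
JOBS_URL = "http://localhost:8080/smila/jobmanager/jobs/"


def build_job(name, workflow, **parameters):
    """Return a job definition dict carrying the given job-level parameters."""
    return {"name": name, "workflow": workflow, "parameters": dict(parameters)}


def build_request(job, url=JOBS_URL):
    """Prepare (but do not send) a POST request with the job as a JSON body."""
    body = json.dumps(job).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same values as in the job definition above.
job = build_job("myIndexUpdate", "indexUpdate", tempStore="temp", bulkLimitSize="1k")
request = build_request(job)

# To actually submit the job to a running instance, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

The request is only prepared, not sent, so the sketch can be run without a SMILA instance to inspect the JSON body first.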

Bye,
Andreas

From: smila-user-bounces@xxxxxxxxxxx [mailto:smila-user-bounces@xxxxxxxxxxx] On behalf of Lorenzo Eccher
Sent: Thursday, 16 January 2014 18:35
To: smila-user@xxxxxxxxxxx
Subject: Re: [smila-user] SMILA in a Cluster (Andreas Weber)

Hi Andreas,
thanks for your quick response.

I modified the parameter in bulkbuilder.properties as suggested, and it works as you said.

On the page you suggested, I read that it is also possible to set these parameters in workers.json, but it is not clear to me how to set the parameter there (I used 1 kibibyte = 1k in bulkbuilder.properties).

{
      "name":"bulkbuilder",
      "modes":[
        "bulkSource",
        "autoCommit"
      ],
      "parameters":[
        {
          "name":"bulkLimitTime",
          "optional":true,
          "type":"long"
        },
        {
          "name":"bulkLimitSize",
          "optional":true
        }
      ],
      "output":[
        {
          "name":"insertedRecords",
          "type":"recordBulks",
          "modes":[
            "optional",
            "maybeEmpty"
          ]
        },
        {
          "name":"deletedRecords",
          "type":"recordBulks",
          "modes":[
            "optional",
            "maybeEmpty"
          ]
        }
      ]
    },


Is it also possible to set the parameter in the job definition? And what about the execution of a pipelineProcess?

I understand that it may seem trivial, but it could be useful when more than one "process" or "pipeline" runs on my SMILA instance.

Thanks


Date: Fri, 10 Jan 2014 19:11:25 +0100
From: Andreas Weber <Andreas.Weber@xxxxxxxxxxx>
To: Smila project user mailing list <smila-user@xxxxxxxxxxx>
Subject: Re: [smila-user] SMILA in a Cluster



Hi Lorenzo,

there is nothing special to define in a job to have its work shared, provided you set up your cluster as described in the documentation.

However, the units that get shared are tasks, and a task corresponds to a bulk (of records).
So, if you have only a small amount of data, you have to limit the size of the bulks running through the workflow.

Have a look at the bulkbuilder and its configuration:
http://wiki.eclipse.org/SMILA/Documentation/Bulkbuilder#Configuration



You could define very low limits if your amount of data is small, e.g. bulkLimitSize=1
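
To illustrate why a low limit helps, here is a simplified sketch of size-based bulk splitting. It is not the actual bulkbuilder logic: the function split_into_bulks is hypothetical, and counting the limit in bytes of UTF-8 text is an assumption made only for this example.

```python
def split_into_bulks(records, bulk_limit_size):
    """Group records into bulks; a bulk is closed once its size reaches the limit."""
    bulks = []
    current = []
    current_size = 0
    for record in records:
        current.append(record)
        current_size += len(record.encode("utf-8"))
        if current_size >= bulk_limit_size:
            bulks.append(current)
            current = []
            current_size = 0
    if current:
        # Flush the last, partially filled bulk.
        bulks.append(current)
    return bulks


# Six small records of 8 bytes each.
records = ["record-%d" % i for i in range(6)]

# High limit: everything fits into one bulk, so there is one task and nothing to share.
print(len(split_into_bulks(records, 10_000)))

# Very low limit: every record closes its own bulk, so six tasks can be distributed.
print(len(split_into_bulks(records, 1)))
```

With a small data set and a high limit, all records end up in a single bulk and therefore in a single task, which only one node can process.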



Regards,
Andreas



From: smila-user-bounces@xxxxxxxxxxx [mailto:smila-user-bounces@xxxxxxxxxxx] On behalf of Lorenzo Eccher
Sent: Friday, 10 January 2014 17:58
To: smila-user@xxxxxxxxxxx
Subject: [smila-user] SMILA in a Cluster



Hello SMILAers,

As you may have read in my earlier emails, I am testing the new features of SMILA, working in a cluster.
After testing the provided processes such as file system crawling and web crawling, I tried to run a trivial workflow that reads records from a text file and indexes them into Solr.

As expected, the job does not directly involve every machine (the job is created, but the records are not shared). I suppose the process must be designed in a specific way to be shared across the cluster. Is that right?

Is it possible to get a description of the right design for such a process?

How should I use the objectstore to implement precisely this feature?

Thank you.

--
________________________________

Lorenzo Eccher
lorenzo.eccher@xxxxxx
   (+39) 0461 312 306
Engineering Ingegneria informatica s.p.a
www.eng.it

ENGINEERING Society and Territory Trento Research Office
EIT-ITC Labs <http://eit.ictlabs.eu>, Trento node

ESTRO Lab at FBK building
via Sommarive, 18
Povo - 38123 Trento

  Le informazioni trasmesse sono destinate esclusivamente alla persona o alla società in indirizzo e sono da intendersi confidenziali e riservate. Ogni trasmissione, inoltro, diffusione o altro uso di queste informazioni a persone o società differenti dal destinatario è proibita. Se ricevete questa comunicazione per errore, contattate il mittente e cancellate le informazioni da ogni computer.
  The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and delete the material from any computer.
  Die Informationen in dieser E-Mail-Mitteilung sind vertraulich und deren Verbreitung in jeglicher Art oder Form ist untersagt. Sollten Sie diese Nachricht irrtümlich erhalten haben, ersuchen wir Sie, sofort den Absender darüber zu informieren und die Mail zu löschen.

_______________________________________________
smila-user mailing list
smila-user@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/smila-user

End of smila-user Digest, Vol 51, Issue 6
*****************************************



End of smila-user Digest, Vol 51, Issue 7
*****************************************


