
Re: [smila-user] performance degredation with the new processing

Hi,

So I replaced the object store with an in-memory implementation. This improved the crawler's finishing time (it was done with 45k files in a minute, just as it was when using AMQ), but it did little to improve the overall processing time, which is still at 42 min. That is consistent with the TODO list still being short.

You wrote:
> But if the tasks are created too slowly, scaleUp cannot help anyway.
So what factors control this? And how can I speed it up?


Thomas Menzel @ brox IT-Solutions GmbH


-----Original Message-----
From: smila-user-bounces@xxxxxxxxxxx [mailto:smila-user-bounces@xxxxxxxxxxx] On Behalf Of Jürgen Schumacher
Sent: Thursday, 29 September 2011 15:32
To: Smila project user mailing list
Subject: Re: [smila-user] performance degredation with the new processing

Hi,

Ad 1: Yes, it's quite possible that the Bulkbuilder/ObjectStore combination has some ... uhm ... potential for optimization. For example, I think that increasing the crawler's buffer size will not change much, because the Bulkbuilder appends the records one by one. Maybe I can have a look at this tomorrow.

Yes, an in-memory solution would quite certainly improve performance. And the whole purpose of separating SMILA into independent (OSGi) services is to make it easy to exchange service implementations. Just write a new service implementation and put it in config.ini instead of o.e.s.objectstore.filesystem and you should be done.
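
As a rough illustration of what such an in-memory replacement could look like, here is a minimal sketch. Note that SimpleObjectStore and all its method names are hypothetical and are NOT the actual SMILA ObjectStore service interface; the real interface has to be taken from the SMILA sources. OSGi service registration and the config.ini entry are not shown either. The sketch only shows the idea of keeping appended record data in a map instead of on the file system.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, simplified store interface (an assumption, not the SMILA API). */
interface SimpleObjectStore {
    void append(String storeName, String objectId, byte[] data);
    byte[] get(String storeName, String objectId);
    void remove(String storeName, String objectId);
}

/** Keeps all objects on the heap; contents are lost when the JVM exits. */
final class InMemoryObjectStore implements SimpleObjectStore {
    // store name -> (object id -> object content)
    private final Map<String, Map<String, byte[]>> stores = new ConcurrentHashMap<>();

    @Override
    public void append(String storeName, String objectId, byte[] data) {
        // merge() is atomic per key, so concurrent appends to one object do not get lost
        stores.computeIfAbsent(storeName, k -> new ConcurrentHashMap<>())
              .merge(objectId, data.clone(), InMemoryObjectStore::concat);
    }

    @Override
    public byte[] get(String storeName, String objectId) {
        Map<String, byte[]> store = stores.get(storeName);
        if (store == null) {
            return null;
        }
        byte[] data = store.get(objectId);
        // return a copy so callers cannot modify the stored content
        return data == null ? null : data.clone();
    }

    @Override
    public void remove(String storeName, String objectId) {
        Map<String, byte[]> store = stores.get(storeName);
        if (store != null) {
            store.remove(objectId);
        }
    }

    private static byte[] concat(byte[] existing, byte[] added) {
        byte[] out = Arrays.copyOf(existing, existing.length + added.length);
        System.arraycopy(added, 0, out, existing.length, added.length);
        return out;
    }
}
```

Each append in this sketch copies the existing content, so appending records one by one to a large object is still quadratic in its size; a real implementation would probably want a growable buffer per object.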

Ad 2: The scaleUpLimits for pipelineProcessor and pipelineProcessor are OK. The first one for the "_finishingTasks" is not a global one, but one for a "system worker" and it's OK that it's 1. You'll find the global scaleUp limit for the node at the end of /smila/tasks. The clusterconfig.json looks OK to me, too. But if the tasks are created too slowly, scaleUp cannot help anyway.
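
The last point can be illustrated with a deliberately simplified bottleneck model. This is an assumption made for illustration only, not how SMILA schedules tasks: the number of tasks finished per second is bounded both by how fast tasks are created and by how fast the workers can consume them. The class name, method name, and all numbers below are hypothetical.

```java
/** Simplified bottleneck model; an illustrative assumption, not SMILA code. */
final class ThroughputBound {
    private ThroughputBound() {
    }

    /**
     * @param createdPerSecond   rate at which new tasks become available
     * @param workers            number of tasks processed in parallel (the scaleUp limit)
     * @param perWorkerPerSecond rate at which a single worker finishes tasks
     * @return upper bound on finished tasks per second
     */
    static double effectiveTasksPerSecond(double createdPerSecond, int workers,
                                          double perWorkerPerSecond) {
        // workers cannot finish tasks faster than tasks are created
        return Math.min(createdPerSecond, workers * perWorkerPerSecond);
    }
}
```

With a creation rate of 2 tasks/s and workers that each finish 1 task/s, the bound in this model is 2 tasks/s for 4 workers and stays at 2 tasks/s for 8 workers. Raising the scaleUp limit only helps while the workers are the slower side.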

/smila/debug is currently not documented; it's kind of an "experimental and unofficial sandbox API" anyway and may change often. If parts of it prove to be very important for monitoring, we should rather move them to an "official" URL instead of documenting /smila/debug (;

Cheers,
Juergen
_______________________________________________
smila-user mailing list
smila-user@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/smila-user


