RE: [smila-dev] SMILA IP Overview (workflow view)

Hi Ivan,

thanks for your comments.

One installation scenario will be to install the connectivity module on an external computer (as a cluster).

Therefore, displaying this module as external shows a logical view.

On point II.

The pipeline really is a bit simple, and it does a lot of work... The queue is a scalability option: it lets us scale the processing across multiple threads and multiple computers. Therefore I don't think the queue is "not needed". It's simply our choice for distributing the work.
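
To illustrate the scalability argument, here is a minimal sketch of a queue feeding several worker threads. It is not SMILA code: `process_record`, `run_with_queue`, and the `num_workers` parameter are hypothetical names, and the "pipeline" is a trivial stand-in. The same pattern would extend to multiple computers if the in-process queue were replaced by a networked one.

```python
import queue
import threading


def process_record(record):
    # Hypothetical stand-in for the real pipeline work (parsing, indexing).
    return record.upper()


def run_with_queue(records, num_workers=4):
    """Distribute records to worker threads through a shared queue."""
    work = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            record = work.get()
            if record is None:  # sentinel: no more work for this worker
                break
            out = process_record(record)
            with lock:
                results.append(out)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for r in records:
        work.put(r)
    for _ in threads:
        work.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

The producer (here the loop calling `work.put`) does not need to know how many consumers exist, which is what makes the queue a natural place to scale out.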

What is the benefit of splitting the current pipeline into "ParsePipeline" and "AddToIndexPipeline"? The pipeline is still only an example. Splitting it would add communication overhead through the queue, but what would we gain?
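
The trade-off in question can be sketched as follows. This is only an illustration under assumptions: `parse`, `add_to_index`, and the record layout are hypothetical, and only the pipeline names come from this thread. Both variants build the same index; the split variant adds one extra hand-off through a queue per record.

```python
import queue


def parse(record):
    # Hypothetical "ParsePipeline" step: tokenize the text.
    return {"id": record["id"], "tokens": record["text"].split()}


def add_to_index(index, parsed):
    # Hypothetical "AddToIndexPipeline" step: store tokens by record id.
    index[parsed["id"]] = parsed["tokens"]


def single_pipeline(records):
    """One pipeline: parse and index in the same synchronous call."""
    index = {}
    for r in records:
        add_to_index(index, parse(r))
    return index


def split_pipelines(records):
    """Two pipelines: parsed records cross a queue before indexing."""
    handoff = queue.Queue()
    for r in records:
        handoff.put(parse(r))  # extra enqueue per record
    index = {}
    while not handoff.empty():
        add_to_index(index, handoff.get())  # extra dequeue per record
    return index
```

In this sketch the split only pays off if the two stages can run on different workers or at different rates; otherwise the queue hop is pure overhead.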

Kind regards,

Georg


-----Original Message-----
From: smila-dev-bounces@xxxxxxxxxxx [mailto:smila-dev-bounces@xxxxxxxxxxx] On Behalf Of Ivan Churkin
Sent: Wednesday, 15 October 2008 09:17
To: Smila project developer mailing list
Subject: Re: [smila-dev] SMILA IP Overview (workflow view)

Also, the user communicates with the system via the Management module.

Ivan Churkin wrote:
> Hi,
>
> I)
> I want to suggest a few amendments to diagram:
>
> 1. Filter is now a part of blackboard (BB); every BB service user is 
> able to draw filtered records from BB.
> 2. Crawler controller works directly with the DI service and, finally, 
> puts it into Router. So, there is no separate connectivity module ( or 
> it contains only Router? ).
> 3. Router and Listener are also able to communicate with BB ( by task 
> "Synchronize" in "Rule" configuration )
>
> II)
> In my opinion AddPipeline does too much work (synchronously). As a 
> result, with the current pipelines the queue is not needed. We may 
> directly call AddPipeline after crawling ( for example by Router ). It's 
> better to split it into "ParsePipeline" and "AddToIndexPipeline" at least...
>
> III) only FMY:
> What is the issue to use following components?
>
> 1) "net.sf.joost" -  STX language processor (similar to XSLT 1.0 but 
> not W3C standard)
> 2) "org.w3c.tidy"  - HTML clean-up tool
>
>
> -- 
> Regards, Ivan
>
>
>
>
>
> HTML Parser.
>
> August Georg Schmidt wrote:
>>
>> Hi Folks,
>>
>>  
>>
>> in answer to some questions from our PMC, Sofya added a workflow 
>> overview for the indexing process.
>>
>>  
>>
>> Within this process you can find additional information regarding 
>> 3rd-party components that are used in SMILA.
>>
>>  
>>
>> http://wiki.eclipse.org/SMILA/Workflow_Overview
>>
>>  
>>
>> Kind Regards,
>>
>>  
>>
>> Georg
>>
>>  
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> smila-dev mailing list
>> smila-dev@xxxxxxxxxxx
>> https://dev.eclipse.org/mailman/listinfo/smila-dev
>>   
>


