Re: [smila-dev] SMILA IP Overview (workflow view)

Also, the user communicates with the system via the Management module.

Ivan Churkin wrote:
Hi,

I)
I want to suggest a few amendments to the diagram:

1. The Filter is now part of the blackboard (BB); every BB service user is able to draw a filtered record from the BB.
2. The Crawler controller works directly with the DI service and, finally, puts it into the Router. So there is no separate connectivity module (or does it contain only the Router?).
3. The Router and Listener are also able to communicate with the BB (via the "Synchronize" task in the "Rule" configuration).
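
To illustrate point 1, here is a minimal sketch of a blackboard that owns its filters, so any service can ask for a filtered view of a record. It assumes a filter is simply a named set of attribute names to keep; the `Blackboard` class and its method names are hypothetical and are not the SMILA API.

```python
class Blackboard:
    """Hypothetical blackboard that stores records and named filters."""

    def __init__(self):
        self._records = {}   # record id -> attribute dict
        self._filters = {}   # filter name -> set of attribute names to keep

    def add_filter(self, name, attribute_names):
        # Assumption: a filter is just a whitelist of attribute names.
        self._filters[name] = set(attribute_names)

    def put(self, record_id, record):
        # Store a copy so later changes by the caller do not leak in.
        self._records[record_id] = dict(record)

    def get_filtered(self, record_id, filter_name):
        # Return only the attributes the named filter lets through.
        keep = self._filters[filter_name]
        record = self._records[record_id]
        return {key: value for key, value in record.items() if key in keep}
```

With this shape, no separate filter service is needed between a BB user and the record it wants to read.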

II)
In my opinion, AddPipeline does too much work (synchronously). As a result, with the current pipelines a queue is not needed: we may call AddPipeline directly after crawling (for example, from the Router). It would be better to split it into at least "ParsePipeline" and "AddToIndexPipeline".
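
A rough sketch of the split suggested in II), assuming the two stages are decoupled by an in-memory queue so that parsing does not wait for indexing. The function names mirror the proposed pipeline names, but their bodies, the record layout, and the dict used as an "index" are placeholders for illustration only.

```python
import queue
import threading


def parse_pipeline(raw_record):
    # Placeholder parse step: normalise the crawled content.
    return {"id": raw_record["id"], "text": raw_record["content"].strip().lower()}


def add_to_index_pipeline(parsed_record, index):
    # Placeholder index step: a dict stands in for the real index.
    index[parsed_record["id"]] = parsed_record["text"]


def run_split_pipelines(raw_records):
    parsed_queue = queue.Queue()
    index = {}
    sentinel = None  # marks the end of the stream

    def indexer():
        # Consume parsed records until the sentinel arrives.
        while True:
            item = parsed_queue.get()
            if item is sentinel:
                break
            add_to_index_pipeline(item, index)

    worker = threading.Thread(target=indexer)
    worker.start()
    for raw in raw_records:
        # Parsing runs here; indexing runs on the worker thread.
        parsed_queue.put(parse_pipeline(raw))
    parsed_queue.put(sentinel)
    worker.join()
    return index
```

Here the queue does real work: the caller only pays for parsing, while indexing proceeds asynchronously behind it.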

III) only FMY:
What is the issue with using the following components?

1) "net.sf.joost" - STX language processor (similar to XSLT 1.0 but not W3C standard)
2) "org.w3c.tidy" - HTML clean-up tool


--
Regards, Ivan





HTML Parser.

August Georg Schmidt wrote:

Hi Folks,

In answer to some questions from our PMC, Sofya added a workflow overview for the indexing process.

Within this process you can find additional information regarding the third-party components that are used in SMILA.

http://wiki.eclipse.org/SMILA/Workflow_Overview

Kind Regards,

Georg

------------------------------------------------------------------------

_______________________________________________
smila-dev mailing list
smila-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/smila-dev
