
RE: [smila-dev] RE: FYI :: new feature :: Message Resequencer

Hi Igor,

> illegal parallel pipelining
I'm currently updating the wiki and splitting the section "clustering, complex processing chain" into two.

So bear with me a little longer; I will post here when it is done.


> the other ideas
Bear with me a little longer here as well. Unfortunately I don't have parallel workflows in my brain yet... ;)
But I think I have a good idea.

Kind regards
Thomas Menzel @ brox IT-Solutions GmbH


> -----Original Message-----
> From: smila-dev-bounces@xxxxxxxxxxx [mailto:smila-dev-
> bounces@xxxxxxxxxxx] On Behalf Of Igor.Novakovic@xxxxxxxxxxx
> Sent: Friday, 9 October 2009 11:19
> To: smila-dev@xxxxxxxxxxx
> Subject: AW: [smila-dev] RE: FYI :: new feature :: Message Resequencer
> 
> Hi Tom,
> 
> 
> > a) Parallel workflows should be totally legitimate and not illegal.
> >
> > Imagine that you want to process a record in two completely different
> > ways ending up in different indexes. Why should we force it to run in
> > serial fashion if the customer provides the computing power?
> There are two "flavors" of this use case:
> 1) The "illegal" one:
> The user defines two pipelines that contain several preprocessing
> pipelets and an indexing service at the end of each pipeline.
> This case is "illegal" because
> 	a) We must assume that each preprocessing pipelet updates the record.
> 	b) If those two pipelines run simultaneously, the record would be
> arbitrarily updated by _all_ the pipelets and no predictable
> result/workflow can be guaranteed.
> 
> 2) The "legal" one:
> If the user really wants to store the record in two or more different
> indexes, then all he has to do is construct one pipeline that does the
> (complex) preprocessing (by merging the two pipelines) and, at the end
> of the new pipeline, simply fork it with two index writer pipelets.
> 
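
To illustrate the "legal" variant described above, here is a minimal sketch in Python rather than SMILA's actual pipelet API. The pipelet functions, the `run_pipeline` helper and the two in-memory "indexes" are hypothetical names made up for this example. It assumes that index writers only read the record: the preprocessing steps run strictly one after another, and only the final write is forked.

```python
from concurrent.futures import ThreadPoolExecutor
from types import MappingProxyType


def normalize(record):
    # Hypothetical preprocessing pipelet: updates the record in place.
    record["title"] = record["title"].strip().lower()
    return record


def add_language(record):
    # Hypothetical preprocessing pipelet: adds an attribute.
    record["lang"] = "en"
    return record


def run_pipeline(record, pipelets, index_writers):
    # Preprocessing is serial, so every pipelet sees the result of the
    # previous one and the outcome is predictable.
    for pipelet in pipelets:
        record = pipelet(record)
    # Fork only at the end: the writers get a read-only view, so running
    # them in parallel cannot change the record.
    frozen = MappingProxyType(dict(record))
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(writer, frozen) for writer in index_writers]
        for future in futures:
            future.result()  # re-raise any writer error
    return record


# Two stand-in "indexes" (plain lists, purely for illustration).
index_a, index_b = [], []


def write_index_a(record):
    index_a.append(record["title"])


def write_index_b(record):
    index_b.append((record["title"], record["lang"]))


result = run_pipeline(
    {"title": "  Hello World "},
    [normalize, add_language],
    [write_index_a, write_index_b],
)
```

Because the record is frozen before the fork, adding a third index writer would not affect the other two.
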
> 
> 
> > Also the router explicitly allows several Send tasks in its config,
> > which we would have to take out.
> IMO we should take it out. It just causes problems by "seducing" the
> user into this pitfall.
> 
> 
> 
> > b) following your discussion it seems to me that you slowly approach
> > the idea where you need to register first and unregister at the end,
> > albeit you use the terms (un)lock and move the functionality into
> > existing components.
> Yes, the proposed changes would affect already existing components like
> blackboard, connectivity and listener.
> But there is one important difference between locking and registering:
> By locking the record we also prevent having it changed simultaneously
> in the preprocessing part of the process.
> By registering operations you would only preserve their order; you
> would still be unable to prevent their parallel execution and
> therefore arbitrary updates of the record.
> 
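
A minimal sketch of what locking (as opposed to registering) could look like, assuming a blackboard that tracks locked record ids. The `Blackboard` class and its `try_lock` / `unlock` / `is_locked` methods are hypothetical and are not the existing SMILA blackboard interface.

```python
import threading


class Blackboard:
    """Hypothetical blackboard that hands out one lock per record id."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the set below
        self._locked = set()            # ids of records being processed

    def try_lock(self, record_id):
        # Returns True if the caller now owns the record, False if some
        # other workflow is already processing it.
        with self._guard:
            if record_id in self._locked:
                return False
            self._locked.add(record_id)
            return True

    def unlock(self, record_id):
        with self._guard:
            self._locked.discard(record_id)

    def is_locked(self, record_id):
        with self._guard:
            return record_id in self._locked


bb = Blackboard()
first = bb.try_lock("doc-1")    # first workflow gets the record
second = bb.try_lock("doc-1")   # a parallel workflow is refused
bb.unlock("doc-1")              # processing finished
third = bb.try_lock("doc-1")    # now the record can be taken again
```

The point of the sketch is the refused second call: a registration list would have accepted both operations and only remembered their order.
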
> 
> 
> > To solve this, I have to agree with you that we need a buffer that
> > allows us to queue and consolidate subsequent PRs as long as the item
> > in question is being processed.
> Exactly!
> 
> 
> 
> > New idea:
> >
> > An idea that I had (but have not thought through yet) was to have such
> > a buffer in connectivity myself, but I don't want to delay all PRs by a
> > fixed amount of time. Instead I want to have pipelets just before
> > calling the PT to signal to connectivity that processing has completed.
> I like your idea of not having the buffer operate at constant intervals.
> I would only like to suggest another implementation:
> Instead of extending the buffer with callbacks and bothering it with a
> bunch of information that it is not interested in (remember: only a very
> small portion of records would be changed within short time periods), we
> could use the "record locking concept" so that the buffer proactively
> queries the blackboard to check whether some record is "ready" for
> reprocessing.
> 
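
The polling variant above could be sketched roughly as follows. This is not an existing SMILA component: `ResequencingBuffer`, `submit` and `poll` are hypothetical names, the lock check is passed in as a plain callable, and "consolidate" is assumed here to mean that the latest request for a record replaces any earlier pending one.

```python
class ResequencingBuffer:
    """Hypothetical buffer that holds requests for records in processing."""

    def __init__(self, is_locked):
        # is_locked: callable(record_id) -> bool, e.g. a blackboard query.
        self._is_locked = is_locked
        self._pending = {}  # record_id -> latest pending request

    def submit(self, record_id, request):
        # Assumption: a later request supersedes an earlier pending one.
        self._pending[record_id] = request

    def poll(self):
        # Ask about every pending record and release those that are no
        # longer locked, i.e. "ready" for reprocessing.
        ready = [rid for rid in self._pending if not self._is_locked(rid)]
        return [(rid, self._pending.pop(rid)) for rid in ready]


locked = {"doc-1"}  # stand-in for the blackboard's lock state
buffer = ResequencingBuffer(lambda rid: rid in locked)

buffer.submit("doc-1", "update v2")
buffer.submit("doc-1", "update v3")  # consolidates with the one above
buffer.submit("doc-2", "add")

released_first = buffer.poll()   # doc-1 is still locked, doc-2 is not
locked.clear()                   # processing of doc-1 has finished
released_second = buffer.poll()
```

The buffer only ever asks about records it actually holds, so records that are not changed again cause no extra traffic.
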
> 
> > Thought: since we use an MQ anyhow we just could open up another Q to
> > send such messages back.
> That is in principle the same idea as the one I've just proposed, except
> that the buffer would query the queue rather than the blackboard. If
> this is easier to implement, I'll support it!
> 
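
For comparison, a rough sketch of the queue-based variant, using Python's standard `queue.Queue` as a stand-in for the real MQ. `QueueDrivenBuffer`, `submit` and `drain` are hypothetical names, and, as an assumption, only the newest held-back request per record is kept.

```python
import queue


class QueueDrivenBuffer:
    """Hypothetical buffer fed by "processing completed" messages."""

    def __init__(self, done_queue):
        self._done = done_queue   # carries ids of finished records
        self._in_flight = set()   # records currently being processed
        self._pending = {}        # record_id -> newest held-back request

    def submit(self, record_id, request):
        if record_id in self._in_flight:
            self._pending[record_id] = request  # hold back and consolidate
            return None
        self._in_flight.add(record_id)
        return request  # forwarded for processing right away

    def drain(self):
        # Read all completion messages without blocking and release the
        # held-back request (if any) for each finished record.
        released = []
        while True:
            try:
                record_id = self._done.get_nowait()
            except queue.Empty:
                break
            self._in_flight.discard(record_id)
            if record_id in self._pending:
                self._in_flight.add(record_id)
                released.append((record_id, self._pending.pop(record_id)))
        return released


done = queue.Queue()
buf = QueueDrivenBuffer(done)

forwarded = buf.submit("doc-1", "add")     # goes straight through
held_1 = buf.submit("doc-1", "update")     # doc-1 is still in flight
held_2 = buf.submit("doc-1", "delete")     # replaces the held "update"
nothing_yet = buf.drain()                  # no completion message so far
done.put("doc-1")                          # a pipelet signals completion
released = buf.drain()
```
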
> 
> > New idea2:
> > Take the core of Juergen's idea and instead of opening up a buffer,
> > map or a Q in addition to the recordstore, place additional
> > information associated with the record not as part of it, so that it
> > is not shared.
> Sorry, but I do not understand what you mean.
> Can you please rephrase your statement?
> 
> Regards
> Igor
> _______________________________________________
> smila-dev mailing list
> smila-dev@xxxxxxxxxxx
> https://dev.eclipse.org/mailman/listinfo/smila-dev

