RE: [smila-dev] Message Resequencer :: concept bug detected and general SMILA concurrency problem

From: Thomas Menzel <tmenzel@xxxxxxx>
Date: Tue, 29 Sep 2009 23:42:42 +0200

hi folks,

i'm sad to report that i just discovered a conceptual bug in my solution AND it is a general concurrency bug for whatever processing is going on!!

initially I had planned to add the sequence number (SN) as a JMS property, but we have no means to get hold of JMS properties in services or pipelets AFAIK, so I switched to adding it as an annotation, the same way it works for the Lucene index service.

however, since the blackboard always holds the version of a record that got saved last, we have a concurrency problem, like so (general description, not specific to resequencer):

I will illustrate this with 2 ADD operations for the same resource/record with id R:

- let there be a pipeline P1 that processes all records and at the end puts them back into the Q for further processing by pipeline P2.
    - P1 execution takes long: it gets the record content it needs via the BB at the beginning (T1) and saves its state at the very end (Tn).
    - the listeners for P1 and P2 react to a certain property that is added by the respective Send tasks, e.g. "process by P1/P2".
- P1 processes record R,sn1,add (means: record with ID R, having SN==1 and operation ADD) during time T1..Tn (T<discrete time index> indicates the chronological order).
- R,sn2,add is added to the Q
    - case: this happens @ Tm, i.e. right after P1 has finished but before P2 picks up
        -> we have 2 messages in the Q, both pointing to R but with diff. JMS properties!!! one is ready to be processed in P1 and the other in P2.
        -> the result of P1 for R,sn1 is overwritten
        -> P2 will want to process R in its pre-P1 state of SN2 and will probably fail or, even worse, produce wrong output
        …
    - case: this happens @ Tk, i.e. while P1 is still processing
        -> we have 2 messages in the Q, both pointing to R but with diff. JMS properties!!! one is ready to be processed in P1 and the other in P2.
        -> R in the BB now has the state SN2.
        -> this is overwritten @ Tn when P1 finishes, with the state of P1.process(R,sn1), i.e. the outcome of P1 working on R,sn1
        …

these examples can be driven further but it is easy to tell: all very messy!!!!
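The two cases above can be replayed step by step in a small sketch. This is not SMILA code: `BlackboardSketch`, the state strings and the method names are made up for illustration, and the only assumption taken from the description is that the blackboard keeps exactly one state per record ID (the one saved last). The steps run sequentially so that the outcome is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the shared blackboard: one stored state per record ID.
class BlackboardSketch {
    private final Map<String, String> records = new HashMap<>();

    String load(String id) {
        return records.get(id);
    }

    void save(String id, String state) {
        records.put(id, state); // whoever saves last wins
    }
}

class LostUpdateDemo {
    // Case "@ Tk": R,sn2,add arrives while P1 is still working on R,sn1.
    static String replayCaseTk() {
        BlackboardSketch bb = new BlackboardSketch();
        bb.save("R", "sn1:raw");           // R,sn1,add arrives
        String p1Input = bb.load("R");     // T1: P1 reads its input
        bb.save("R", "sn2:raw");           // Tk: R,sn2,add arrives
        bb.save("R", p1Input + "+P1");     // Tn: P1 saves its result over the sn2 state
        return bb.load("R");
    }

    // Case "@ Tm": R,sn2,add arrives after P1 has saved but before P2 picks up.
    static String replayCaseTm() {
        BlackboardSketch bb = new BlackboardSketch();
        bb.save("R", "sn1:raw");           // R,sn1,add arrives
        String p1Input = bb.load("R");     // T1: P1 reads its input
        bb.save("R", p1Input + "+P1");     // Tn: P1 saves its result
        bb.save("R", "sn2:raw");           // Tm: R,sn2,add overwrites the P1 result
        return bb.load("R");               // this is what P2 gets to see
    }

    public static void main(String[] args) {
        System.out.println("case Tk, BB holds: " + replayCaseTk());
        System.out.println("case Tm, P2 reads: " + replayCaseTm());
    }
}
```

In the Tk replay the blackboard ends up with the P1 result for sn1 although sn2 is the newer operation; in the Tm replay P2 reads the raw sn2 state that never went through P1.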

It just popped into my mind: maybe the issue from my mail on "potential concurrency bug with listeners" from last week is caused by this!

so, what to do?

alternative a)

the resequencer won't work if I need to put the SN into the record as an annotation or such, as this is just a special case of the above ADD/ADD problem. however, if we can also access JMS properties in a service it should work.

this leads me to the idea: why change the ProcessingService interface if I could implement the resequencer as a true-blue messaging-system element, i.e. as an MQ listener/sender that has access to JMS properties?
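A minimal sketch of that idea, under stated assumptions: the SN travels as a message property, so it is never stored in (or overwritten with) the record on the blackboard. `QueueMessage` is a hypothetical stand-in for a JMS message, the property name `sequenceNumber` is invented, and the policy shown (forward a message only if its SN is higher than the last one forwarded for the same record) is just one possible reading of what the resequencer should do.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical message: a record ID plus string properties, loosely modelled on JMS message properties.
final class QueueMessage {
    final String recordId;
    final Map<String, String> properties;

    QueueMessage(String recordId, Map<String, String> properties) {
        this.recordId = recordId;
        this.properties = properties;
    }
}

class ResequencerListenerSketch {
    // Property name is an assumption made for this sketch.
    static final String SN_PROPERTY = "sequenceNumber";

    // Highest SN forwarded so far, per record ID.
    private final Map<String, Long> lastForwarded = new HashMap<>();

    // Returns true if the message should be passed on, false if it is stale.
    boolean accept(QueueMessage message) {
        long sn = Long.parseLong(message.properties.get(SN_PROPERTY));
        Long last = lastForwarded.get(message.recordId);
        if (last != null && sn <= last) {
            return false; // an equal or newer operation for this record was already forwarded
        }
        lastForwarded.put(message.recordId, sn);
        return true;
    }

    public static void main(String[] args) {
        ResequencerListenerSketch listener = new ResequencerListenerSketch();
        QueueMessage sn2 = new QueueMessage("R", Collections.singletonMap(SN_PROPERTY, "2"));
        QueueMessage sn1 = new QueueMessage("R", Collections.singletonMap(SN_PROPERTY, "1"));
        System.out.println("R,sn2 forwarded: " + listener.accept(sn2));
        System.out.println("R,sn1 forwarded: " + listener.accept(sn1));
    }
}
```

The point of the sketch is only where the SN lives: the decision is made from the message alone, without loading the record from the blackboard.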

FYI: i'm reading a book on messaging systems and design patterns in that realm, and I find the SMILA way of things (mixing MQ with a shared BB and BPEL) not very intuitive. or rather: it feels like an odd hybrid where you really have to think hard about how to accomplish what you need with the SMILA standard procedures, because you not only need to know MQ design but also have to take into consideration what the shared BB will do to your records. but I guess that is another sort of discussion…

alternative b)

we need to implement the partition concept to be able to access the diff. versions of the record
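What the partition concept would look like in SMILA is not spelled out here, so the following is only a sketch under one assumption: each (record ID, SN) pair gets its own stored state, so two operations on the same record no longer share a slot. The class name and the key layout are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical blackboard variant that keeps one state per (record ID, SN) pair.
class VersionedBlackboardSketch {
    private final Map<String, String> states = new HashMap<>();

    // Key layout "<recordId>#<sn>" is an assumption made for this sketch.
    private static String key(String id, long sn) {
        return id + "#" + sn;
    }

    void save(String id, long sn, String state) {
        states.put(key(id, sn), state);
    }

    String load(String id, long sn) {
        return states.get(key(id, sn));
    }

    // Replays the "@ Tk" case; returns the final states of R for sn1 and sn2.
    static String[] replayCaseTk() {
        VersionedBlackboardSketch bb = new VersionedBlackboardSketch();
        bb.save("R", 1, "sn1:raw");          // R,sn1,add arrives
        String p1Input = bb.load("R", 1);    // T1: P1 reads its input
        bb.save("R", 2, "sn2:raw");          // Tk: R,sn2,add arrives, separate slot
        bb.save("R", 1, p1Input + "+P1");    // Tn: P1 saves into the sn1 slot only
        return new String[] { bb.load("R", 1), bb.load("R", 2) };
    }

    public static void main(String[] args) {
        String[] result = replayCaseTk();
        System.out.println("R,sn1: " + result[0]);
        System.out.println("R,sn2: " + result[1]);
    }
}
```

Both versions survive side by side, but something still has to decide which one is current, so this does not replace the resequencing itself.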

I vote for a).

at least I'd like to investigate this a little further, but not tonight anymore.

Kind regards

Thomas Menzel @ brox IT-Solutions GmbH

From: smila-dev-bounces@xxxxxxxxxxx [mailto:smila-dev-bounces@xxxxxxxxxxx] On Behalf Of Igor.Novakovic@xxxxxxxxxxx
Sent: Thursday, 24 September 2009 19:14
To: smila-dev@xxxxxxxxxxx
Subject: AW: [smila-dev] Message Resequencer :: change to Agent Interface

Hi Tom,

I share Daniel’s opinion on both issues.

Before you start programming (I see that you’ve already opened a dedicated branch in the repository for the resequencer), let’s please discuss the problem and do some conceptual work.

BTW: What is the use case that you’re trying to cover with your resequencer?

Regards

Igor

From: smila-dev-bounces@xxxxxxxxxxx [mailto:smila-dev-bounces@xxxxxxxxxxx] On Behalf Of Daniel.Stucky@xxxxxxxxxxx
Sent: Thursday, 24 September 2009 17:31
To: smila-dev@xxxxxxxxxxx
Subject: AW: [smila-dev] Message Resequencer :: change to Agent Interface

Hi Tom,

I see two drawbacks of your proposed solution(s):

1) It will only work on one machine. In a distributed environment it is not guaranteed that a Resequencer will get all the relevant messages concerning one Record. And as each Resequencer has its own map of IDs and sequence numbers, two competing operations would not be recognized as such. The map has to be shared across all Resequencer instances (e.g. by using another Queue, or a database).

The initial idea of the Buffer component in Connectivity was to filter out and resolve competing operations before they enter the “system”, that is, before they are processed. Of course this Buffer would also have to share its internal state across all instances (at the moment an Agent/Crawler is bound to one instance of Connectivity, so this distribution is not relevant yet). In either case, the processing is left totally untouched by introducing the Buffer component, which leads me to my second issue:

2) The workflow has to be adapted to architecture changes. So in order to benefit from the Resequencer business logic, workflows have to be designed in special ways (first do some processing, second store the processed data). I think this is hard for users to grasp. BTW: would the actual storing be configurable? I mean, will the Resequencer execute a BPEL pipeline, or is the LuceneIndexing hardcoded? The latter is of course not a valid scenario; we have to be flexible in this regard, as users may want to store their data in arbitrary stores/indexes/whatsoever.
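The first drawback can be made concrete with a small sketch. All names are hypothetical, and an in-process `ConcurrentHashMap` merely stands in for whatever shared store (another queue, a database) would be used across machines. Each instance checks an incoming SN against the highest SN it knows for that record.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical Resequencer instance; the map of record ID -> highest SN is injected.
class ResequencerInstance {
    private final Map<String, Long> highestSeen;

    ResequencerInstance(Map<String, Long> highestSeen) {
        this.highestSeen = highestSeen;
    }

    // Returns true if no higher SN is known for this record (an equal SN also counts as newest).
    boolean isNewest(String recordId, long sn) {
        // merge stores the larger of the known and the incoming SN; atomic on a ConcurrentHashMap
        long highest = highestSeen.merge(recordId, sn, Long::max);
        return highest == sn;
    }
}

class SharedMapDemo {
    // Each instance has its own map: B never learns that A already saw R,sn2.
    static boolean staleAcceptedWithPrivateMaps() {
        ResequencerInstance a = new ResequencerInstance(new ConcurrentHashMap<>());
        ResequencerInstance b = new ResequencerInstance(new ConcurrentHashMap<>());
        a.isNewest("R", 2);
        return b.isNewest("R", 1);
    }

    // Both instances use the same map: B recognizes R,sn1 as stale.
    static boolean staleAcceptedWithSharedMap() {
        Map<String, Long> shared = new ConcurrentHashMap<>();
        ResequencerInstance a = new ResequencerInstance(shared);
        ResequencerInstance b = new ResequencerInstance(shared);
        a.isNewest("R", 2);
        return b.isNewest("R", 1);
    }

    public static void main(String[] args) {
        System.out.println("private maps, stale sn1 accepted: " + staleAcceptedWithPrivateMaps());
        System.out.println("shared map, stale sn1 accepted: " + staleAcceptedWithSharedMap());
    }
}
```

With private maps the stale operation slips through; with the shared map it is detected, which is the sharing requirement from point 1.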

Perhaps you could elaborate on your concerns about our initial Buffer idea?

Bye,

Daniel

From: smila-dev-bounces@xxxxxxxxxxx [mailto:smila-dev-bounces@xxxxxxxxxxx] On Behalf Of Thomas Menzel
Sent: Tuesday, 22 September 2009 07:28
To: Smila project developer mailing list
Subject: RE: [smila-dev] Message Resequencer :: change to Agent Interface

oops,

http://wiki.eclipse.org/SMILA/Specifications/ProcessingMessageResequencer

PS: all the while I was writing this draft I had this mail open, but I still managed to forget to add the link…

Kind regards

Thomas Menzel @ brox IT-Solutions GmbH

From: smila-dev-bounces@xxxxxxxxxxx [mailto:smila-dev-bounces@xxxxxxxxxxx] On Behalf Of Igor.Novakovic@xxxxxxxxxxx
Sent: Monday, 21 September 2009 19:04
To: smila-dev@xxxxxxxxxxx
Subject: AW: [smila-dev] Message Resequencer :: change to Agent Interface

Hi Thomas,

Could you please provide us the link to your specification draft?

Cheers

Igor

From: smila-dev-bounces@xxxxxxxxxxx [mailto:smila-dev-bounces@xxxxxxxxxxx] On Behalf Of Thomas Menzel
Sent: Monday, 21 September 2009 18:28
To: Smila project developer mailing list
Subject: [smila-dev] Message Resequencer :: change to Agent Interface

Hi,

I wrote a specification draft for this change. please feel free to comment.

in order to implement this I will need to change the interface of the agent.

Kind regards

Thomas Menzel @ brox IT-Solutions GmbH

From: smila-dev-bounces@xxxxxxxxxxx [mailto:smila-dev-bounces@xxxxxxxxxxx] On Behalf Of Thomas Menzel
Sent: Monday, 21 September 2009 14:00
To: Smila project developer mailing list
Subject: [smila-dev] FYI :: new feature :: Message Resequencer

hi folks,

just wanted to let you know that I will be working on the problem of messages getting out of sync when there are changes in close succession.

this change will be tracked through the bug https://bugs.eclipse.org/bugs/show_bug.cgi?id=289995

Kind regards

Thomas Menzel @ brox IT-Solutions GmbH
