AW: [smila-dev] RE: FYI :: new feature :: Message Resequencer

Hi,

> > This issue _does not_ occur when crawling some data source. 
> there might be rare cases where it could occur there 
> (links to the same resource, e.g. the same document referenced from 2 websites)
Please explain this.
My assumption is that the referenced resource _does not_ change that fast, so in that case (the same document referenced from 2 websites) the issue does not occur.


> > (Crawling is the most common use case.)
> not sure if crawling really is the most common case. 
> In the past I usually integrated our former product more in the agent style
Ok. I did not know that.


> > This issue only occurs if the data source has been monitored by 
> > an agent _and_ the user is doing “ADD” (or “UPDATE”) and subsequently almost 
> > instantly an “DELETE” operation on the document (data set that is represented later as a record in SMILA). 
> a) doesn’t only affect ADD/DEL ops but also ADD/ADD as pointed out
Yes. But in the ADD/ADD case the _delta_ is very small. In the worst case you process a slightly older version of the document. This is something the user can live with.
If ADD/DEL is executed in reverse order, the result is totally unacceptable: instead of being removed from the index, storages etc., the document will still be there.
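
To make the difference between the two cases concrete, here is a minimal sketch. It is not SMILA code: the dict stands in for the index and storages, and the names `apply` and `replay` are made up for illustration.

```python
def apply(index, op, doc_id, content=None):
    """Apply one operation to a dict standing in for the index/storages."""
    if op in ("ADD", "UPDATE"):
        index[doc_id] = content
    elif op == "DELETE":
        index.pop(doc_id, None)
    return index


def replay(ops):
    """Process operations in the given order and return the final state."""
    index = {}
    for op in ops:
        apply(index, *op)
    return index


add_v1 = ("ADD", "doc1", "version 1")
add_v2 = ("ADD", "doc1", "version 2")
delete = ("DELETE", "doc1")

# ADD/ADD swapped: a slightly older version ends up in the index.
add_add_in_order = replay([add_v1, add_v2])
add_add_swapped = replay([add_v2, add_v1])

# ADD/DELETE swapped: the deleted document is still in the index.
add_del_in_order = replay([add_v1, delete])
add_del_swapped = replay([delete, add_v1])
```

In the swapped ADD/ADD run the index holds "version 1" instead of "version 2"; in the swapped ADD/DELETE run it still holds a document that should be gone.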

> > b) > Instantly
> a. Highly depends on the setup
Please explain this.


> b. generalized: as long as the change on the same resource occurs 
> within the time period a previous change event is being processed.
Yes. But how long does the processing take?
For me, anything that takes more than 0.5s is too long.
Any event processing that takes less than 0.5s is, for me, "instant".
Or do you assume that the user can make _significant_ changes to the document in less than half a second?
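
The generalized condition quoted above (a second change to the same resource arrives while the previous one is still being processed) can be sketched as a simple check. The function name `at_risk`, the event tuples and the timestamps are invented for illustration; only the 0.5s threshold comes from this mail.

```python
from collections import defaultdict


def at_risk(events, processing_seconds):
    """Return ids of resources that changed again while a previous
    change could still have been in processing."""
    by_resource = defaultdict(list)
    for resource_id, timestamp in events:
        by_resource[resource_id].append(timestamp)

    risky = set()
    for resource_id, stamps in by_resource.items():
        stamps.sort()
        # Compare each change with the next change on the same resource.
        for earlier, later in zip(stamps, stamps[1:]):
            if later - earlier < processing_seconds:
                risky.add(resource_id)
    return risky


# (resource id, time of change in seconds); made-up sample data
events = [("page-a", 0.0), ("page-a", 0.3), ("page-b", 0.0), ("page-b", 5.0)]

fast = at_risk(events, processing_seconds=0.5)
slow = at_risk(events, processing_seconds=10.0)
```

With 0.5s processing only "page-a" is affected; if processing took 10s, both resources would be. So the answer does depend on how long the processing really takes.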


> > The chances that this happens are very low.
> Highly depends on the setup and where u get the data from and the frequency that this data changes.
As said above: please give me a realistic example.


> > agreed that it should be addressed in the connectivity module by a component called “Buffer”.
> As I pointed out, this solution is
> a) not safe
> b) might not meet the application needs/requirements
As I stated in my mail, let's first define the needs/requirements and then discuss the technical solutions and their pros and cons.


> > Since this issue occurs very rare, it can be generally rated as “low”.
> Well, with the assumptions you have made, yes. If those assumptions fail: it is not low IMO. 
> As I said: It depends on the use case. Here are some (think agent):
Please do not overlook the fact that we want to buffer user operations in order to:
a) avoid executing superfluous operations and thereby lower the load on our application
b) make sure that the _order of consuming messages from the queue does not matter_ and therefore
	* make high scalability possible
	* completely avoid the case that we are discussing now
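
A rough sketch of what I mean by both points, under the assumption that the buffer simply keeps the most recent operation per record until it is flushed. The class `Buffer`, its methods and `consume` are hypothetical names, not an existing SMILA API.

```python
from itertools import permutations


class Buffer:
    """Hypothetical buffer: keeps only the most recent operation per record."""

    def __init__(self):
        self._latest = {}  # record id -> (operation, content)

    def put(self, op, record_id, content=None):
        # A later operation on the same record replaces the earlier one.
        self._latest[record_id] = (op, content)

    def flush(self):
        """Return one message per record and empty the buffer."""
        messages = [(op, rid, content) for rid, (op, content) in self._latest.items()]
        self._latest.clear()
        return messages


def consume(messages):
    """Stand-in for the queue consumers: apply messages to a dict 'index'."""
    index = {}
    for op, record_id, content in messages:
        if op == "DELETE":
            index.pop(record_id, None)
        else:
            index[record_id] = content
    return index


buffer = Buffer()
buffer.put("ADD", "doc1", "version 1")
buffer.put("DELETE", "doc1")
buffer.put("ADD", "doc2", "version 1")
buffer.put("UPDATE", "doc2", "version 2")

messages = buffer.flush()
# Consume the flushed messages in every possible order.
results = [consume(order) for order in permutations(messages)]
```

Four user operations become two messages, and because each record appears at most once, every consumption order gives the same result. Note that this only covers operations that end up in the same flush; operations on one record that are split across two flushes are not handled by this sketch.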


> Use cases:
> • a wiki that is used by many users concurrently: 
>   o Here it can happen fairly frequently that the same page is saved twice in fast succession. 
>   At least it happens to me that after saving I notice a typo or add a quick note, resulting in another "save".
Sure. That is exactly what I meant by a non-significant change!
This is also one of the reasons why we should not do instant processing after _every_ user action.


> • web application
>  o in order to ease the DB load,  the search is the primary means to access the data, 
>    especially those where a complex SQL query would be crafted.
>  o To have always an accurate result, a minimum time diff. between 
>   resource change and index update is required.
Are you sure?
AFAIK the DB load produced by a web app is usually reduced with some caching technique, not with the search.


Regards
Igor
