Bug 216586 - Provide optional pooling of store accessors
Summary: Provide optional pooling of store accessors
Status: CLOSED FIXED
Alias: None
Product: EMF
Classification: Modeling
Component: cdo.core (show other bugs)
Version: 2.0   Edit
Hardware: PC Windows Vista
: P3 enhancement (vote)
Target Milestone: M3   Edit
Assignee: Eike Stepper CLA
QA Contact:
URL:
Whiteboard: Lighter, Faster and Better
Keywords: readme
Depends on:
Blocks:
 
Reported: 2008-01-25 09:37 EST by Eike Stepper CLA
Modified: 2010-06-29 09:23 EDT (History)
3 users (show)

See Also:
stepper: galileo+


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Eike Stepper CLA 2008-01-25 09:37:46 EST
... just in case the data source doesn't support this
Comment 1 Eike Stepper CLA 2008-06-10 02:30:06 EDT
Reversioned due to graduation
Comment 2 Martin Taal CLA 2008-06-28 05:37:45 EDT
Hi Eike,
Just for the record, if accessors are pooled it would be great if the accessor would get notified when it starts to be used (to create a transaction) and once when it finishes (to commit a transaction). In the finish event it would help if the accessor would get to know if an exception occured (to rollback). The finish call should be guaranteed to be called (also in case of exceptions etc.). 

It is just a quick idea and proposal so a different method (accomplishing the same) would also be fine for me.

gr. Martin
Comment 3 Eike Stepper CLA 2008-06-28 05:50:51 EDT
Yes, I planned something like the lifecycle protocol between repository and store/accessor you described. In any circumstance I'll discuss important protocol changes with all IStore implementors (before I change something) ;-)
Comment 4 Jasper CLA 2008-08-20 05:03:09 EDT
From our recent discussion on the newsgroup:

>>> Basically I thought that in production a pooling data source is used and 
>>> in that case I'm not sure how double pooling affects the overall performance.

>> I'm not sure what you mean by 'in production'. Unless running within an 
>> app server that pools connections, CDO would have to do its own pooling,
>> right? Correct me if I'm wrong about this.

> I'm not a particular expert in this area, but I think you don't necessarily
> need an application server just to have JDBC connection pooling. AFAIK there
> are several open source solutions for this purpose which basically just proxy
> underlying data sources and provide sort of transparent pooling. What about
> this one: http://commons.apache.org/dbcp

To this I add:

Yes, using a standalone pool is possible and this could be transparent to CDO. But I noticed that the current MySQLAdapter is tied to MySQL's JDBC driver directly. Actually I'm not sure why.. it seems #createJDBCDataSource() only gets called from a unit test and #getJDBCDriver() doesn't get called at all.

Still, I think it'd be simpler if CDO did the pooling. It would give much better out-of-the-box performance with a DBStore, and would be much easier for someone interested in CDO but not familiar with setting up his own dbase connection pool.

To give you an idea: I added a few lines of crude code to DBStoreAccessor, to hold on to a single connection rather than drop it. Bulk read performance jumped by a factor of 4..
Comment 5 Eike Stepper CLA 2008-08-20 12:09:28 EDT
(In reply to comment #4)
> Yes, using a standalone pool is possible and this could be transparent to CDO.
> But I noticed that the current MySQLAdapter is tied to MySQL's JDBC driver
> directly. Actually I'm not sure why.. it seems #createJDBCDataSource() only
> gets called from a unit test and #getJDBCDriver() doesn't get called at all.

The db.mysql plugin no longer *contains* the mysql driver jar but AFAIK it imports the needed mysql packages.

And yes, there might be a bit of a historical mess regarding the API of the IDBAdapter. There were several changes from using a Driver to using a DataSource. I'll clean this up eventually.


> Still, I think it'd be simpler if CDO did the pooling. It would give much
> better out-of-the-box performance with a DBStore, and would be much easier for
> someone interested in CDO but not familiar with setting up his own dbase
> connection pool.

Perfectly right! If only I wouldn't have to prioritize among the flood of cool feature requests or bug reports ;-) Usually those with the easiest workarounds go to the bottom of the list.

But if you're willing to contribute in the areas where you're interested we can get this to work sooner.


> To give you an idea: I added a few lines of crude code to DBStoreAccessor, to
> hold on to a single connection rather than drop it. Bulk read performance
> jumped by a factor of 4..

Yes, I remember that once I already prepared a few places to better support pooled accessors. Have you also paid attention to properly release the pooled resources at some time? In case you're planning to attach a patch, you'd also need to write some tests to ensure that the feature works and does not affect the rest of the functionality.

Of course I'd be glad to review your code and commit it (after going throught the IP approval) ;-)

BTW. another possibility would be a completely independent (of CDO) DataSource implementation in the net4j.db plugin. It would be a replication of the effort of DBCP or similar. It would require to implement the DataSource interface by delegation and pooling. I think the needed pooling infrastructure is present in net4j.util...

Comment 6 Eike Stepper CLA 2008-08-28 04:46:56 EDT
Restarting discussion.
Comment 7 Eike Stepper CLA 2008-08-28 09:01:56 EDT
I've added pooling infrastructure for the store accessors:

1) StoreAccessor now extends Lifecycle
2) StoreAccessor now offers additional doPassivate() and doUnpassivate() callbacks
3) Store now requires to implement getReaderPool(ISession) and getWriterPool(IView)
4) DBStore now provides a single reader pool and a single writer pool
5) All other stores currently do not support pooling

Martin, when you have time please look at this new SPI so that we can continue our disscussion for the HibernateStore.
Simon, please look if the ObjectivityStore can benefit from this.
JavaDoc is provided ;-)

Btw. currently the pools can only grow!
Comment 8 Martin Taal CLA 2008-08-29 09:05:44 EDT
Hi Eike,
Hmmm, I have looked at the code. But afaics the only reason mentioned for pooling is connection pooling to the database. However imo for hibernate it is better to use an open source connection pooling mechanism than building your own (for other stores not using jdbc this can be different ofcourse). Also if someone uses hibernate then the person must have stumbled over the connection pooling topics in the hibernate manual. 

Other than the connection there is no other information in a HibernateStoreAccessor/Hibernate Session which can be pooled (afaics).

Let me know if there is another reason than connection pooling.

Another approach (followed by Seam) is to have one session per conversation. This could be interesting but then there needs to be something similar to a conversation in CDO (maybe the cdo session?). A few things to keep in mind for such an approach:
- a hibernate session is single-threaded and it needs to be connected to something at the client which has the same state and only talks single-threaded with the hibernate session at the server
- a hibernate session also caches things (first level cache) and this could clash with the cached objects maintained by cdo. In addition hibernate has a second level cache.

Just some thoughts, let me know what you think.

gr. Martin

Comment 9 Eike Stepper CLA 2008-09-01 02:31:03 EDT
(In reply to comment #8)
> Another approach (followed by Seam) is to have one session per conversation.
> This could be interesting but then there needs to be something similar to a
> conversation in CDO (maybe the cdo session?). A few things to keep in mind for
> such an approach:
> - a hibernate session is single-threaded and it needs to be connected to
> something at the client which has the same state and only talks single-threaded
> with the hibernate session at the server
> - a hibernate session also caches things (first level cache) and this could
> clash with the cached objects maintained by cdo. In addition hibernate has a
> second level cache.

Yes, I think a CDOSession (ISession) or CDOView (IView) is similar to a conversation and with the new design it would easily be possible to provide each request with all the things that the previous requests opf the same session/view has already used. It's also possible to detect that the session/view has been closed so that a pool cleanup can happen. Do you think that would make sense?
Comment 10 Martin Taal CLA 2008-09-01 03:21:16 EDT
Hi Eike,
Yes, however after some more thought. The hibernate session cache can interfere with the rest of cdo. When a session is re-used in different requests, then the objects read in a previous request (ak. hibernate session) are re-used. This eventhough they may have changed in the database. So the objects in the hibernate session are not refreshed because of updates done by other clients.
Also I am not sure if hibernate session caching gives so much benefit as cdo also caches revisions.

Is a view/session always used singlethreaded?

gr. Martin
Comment 11 Eike Stepper CLA 2008-09-01 03:44:51 EDT
Hard for me to judge about these Hibernate internals ;-(

Your question applies only to the server side, right?
I'd say no, because each request (i.e. the indication) is executed in an own thread.
This is configurable but the default is concurrent execution of the indications.

But I'm not sure if this really is the important question.
With the new pooling design we could easily ensure that one store accessor is only used by one thread.
And we could ensure that it is only reused within the same session/view.
Comment 12 Martin Taal CLA 2008-09-01 04:17:44 EDT
Hi Eike,
Okay, do you think that there is any interference between the cdo cache and the hibernate caching? This is important because the only reason to re-use a hibernate session is that it caches information. 
What won't work currently with a hibernate cache (the first-level) is that if another client changes an object then the first-level hibernate cache is not refreshed. 

Regarding single threading, the relation should go further, one cdosession/view should always use the same accessor and only in a single-threaded mode. Afaiu this means that a cdosession/view should also only be used in single-threaded mode.

gr. Martin
Comment 13 Eike Stepper CLA 2008-09-01 04:32:44 EDT
(In reply to comment #12)
> Hi Eike,
> Okay, do you think that there is any interference between the cdo cache and the
> hibernate caching? This is important because the only reason to re-use a
> hibernate session is that it caches information. 
Hmm, hard to say. Wouldn't hibernate only be called if the cdo cache (revision manager) doesn't have the revision?

> What won't work currently with a hibernate cache (the first-level) is that if
> another client changes an object then the first-level hibernate cache is not
> refreshed. 
Does it make a difference that in the server we talk about revisions which are immutable? Well, the revised attribute can change...


> Regarding single threading, the relation should go further, one cdosession/view
> should always use the same accessor and only in a single-threaded mode. Afaiu
> this means that a cdosession/view should also only be used in single-threaded
> mode.

I think for the view this is true.
Comment 14 Martin Taal CLA 2008-09-01 04:55:46 EDT
Hi Eike,
Yes the hibernate cache would only be called if the cdo cache does not have it. But if the cdo cache does not have the revision then the hibernate cache won't have it either probably. 

Hmmm, the scenario I mentioned (how Seam works) with one hibernate session per cdosession/view makes sense if there is no other similar framework between hibernate and the client. In this case cdo is in between hibernate and the client (which is good otherwise we can't support client notification). So cdo already covers what hibernate can do... (afaics, imho) 

So one of the two reasons to pool/re-use a hibernate session is because then the client and hibernate are nicely in sync. However as cdo already takes care of this, then this is not a reason to pool/reuse a hibernate session.

For the other reason to pool/re-use a hibernate session: re-use the database connection, it makes much more sense to use a proven connection pooling mechanism such as c3p0 (imho).

So afaics it does not make sense to support pooling for hibernate.

Let me know what you think.

gr. Martin

Comment 15 Eike Stepper CLA 2008-09-06 06:03:25 EDT
Ok, we should wait until users come back on this issue ;-)

Committed to HEAD.
Comment 16 Eike Stepper CLA 2008-09-16 14:04:44 EDT
Fix available in HEAD: 2.0.0.I200809110620.
Comment 17 Eike Stepper CLA 2008-09-16 14:06:28 EDT
Fix available in HEAD: 2.0.0.I200809110653.
Comment 18 Eike Stepper CLA 2009-06-27 11:51:50 EDT
Generally available.