[eclipselink-users] Using partitioning for data source-per-tenant multitenancy

Hi,
I'm new to EclipseLink (until now I've used Hibernate).
I'm currently setting up a web application with EclipseLink (under Tomcat 8+Spring Framework 4) and I want it to support a data source-per-tenant multitenant environment.
EclipseLink currently supports the single-table and table-per-tenant multitenancy strategies (the latter can also be used as schema-per-tenant, as I have just discovered), and it also offers support for partitioning.
The idea I'm working on is to use partitioning with an appropriate partitioning policy to split data across different data sources, one for each tenant.
I ended up with a solution that dynamically injects new connection pools, backed by externally managed data source instances, into EclipseLink as soon as the application needs to access data for a new tenant. In my case tenant = customer, and each user that logs into the application belongs to a specific customer, hence each HTTP session is "bound" to a specific connection pool/data source.
All tables are partitioned, so I didn't specify any partitioning policy via annotations, but rather defined my own PartitioningPolicy which is added to the entity manager factory via serverSession.getProject().addPartitioningPolicy(PartitioningPolicy) and then set as the default partitioning policy via serverSession.setPartitioningPolicy(PartitioningPolicy).
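To make the routing idea concrete, here is a minimal plain-Java sketch of the tenant-to-pool lookup that such a policy relies on. It deliberately does not call any EclipseLink API: TenantContext, TenantPoolRegistry, and the "tenant-" pool naming scheme are hypothetical names invented for illustration, and the pool type is left generic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Holds the tenant bound to the current request thread (hypothetical helper). */
final class TenantContext {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private TenantContext() {}

    static void set(String tenantId) { CURRENT.set(tenantId); }
    static String get() { return CURRENT.get(); }
    static void clear() { CURRENT.remove(); }
}

/** Maps pool names to pools, creating each pool the first time it is needed. */
final class TenantPoolRegistry<P> {
    /** Name used when no tenant is bound (assumed to be the central pool). */
    static final String DEFAULT_POOL = "default";

    private final Map<String, P> pools = new ConcurrentHashMap<>();
    private final Function<String, P> poolFactory;

    TenantPoolRegistry(Function<String, P> poolFactory) {
        this.poolFactory = poolFactory;
    }

    /** Name of the pool the current thread should use. */
    String currentPoolName() {
        String tenant = TenantContext.get();
        return tenant == null ? DEFAULT_POOL : "tenant-" + tenant;
    }

    /** Returns the pool for the current tenant, creating it once if needed. */
    P currentPool() {
        return pools.computeIfAbsent(currentPoolName(), poolFactory);
    }

    /** Number of pools created so far. */
    int size() { return pools.size(); }
}
```

In a web application, a servlet filter would typically call TenantContext.set(...) at the start of each request and TenantContext.clear() in a finally block, so that a pooled thread never leaks one tenant's binding into another request.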
As long as I disable the shared cache by passing the property eclipselink.cache.shared.default=false when the entity manager factory is created, and use an id generation strategy that does not rely on pre-allocation, everything seems to work fine (at least in my preliminary tests). But, as I expected, the concept of "partition" in EclipseLink means that an entity instance of type T with id X is meant to be the same in all of the partitions where it is located (= replicated). In my case, instead, data in one partition is completely independent from data in another, and hence an entity instance of type T with id X for tenant/partition A is actually different from an entity instance of the same type T with the same id X for tenant/partition B.
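The collision can be illustrated with a minimal plain-Java sketch. This is only a model of a cache keyed on type and id, not EclipseLink's actual cache implementation; CacheKey and IdentityCollisionDemo are hypothetical names made up for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Identity of a cached entity: type plus primary key, optionally scoped by tenant. */
final class CacheKey {
    private final String tenantId; // null models a cache keyed on type and id only
    private final Class<?> type;
    private final Object id;

    CacheKey(String tenantId, Class<?> type, Object id) {
        this.tenantId = tenantId;
        this.type = type;
        this.id = id;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof CacheKey)) {
            return false;
        }
        CacheKey that = (CacheKey) other;
        return Objects.equals(tenantId, that.tenantId)
                && type.equals(that.type)
                && id.equals(that.id);
    }

    @Override
    public int hashCode() {
        return Objects.hash(tenantId, type, id);
    }
}

final class IdentityCollisionDemo {
    private IdentityCollisionDemo() {}

    /** Stores id 1 for tenants A and B, then reads back what tenant A sees. */
    static String lookupAfterTwoTenantsStore(boolean tenantScoped) {
        Map<CacheKey, String> cache = new HashMap<>();
        cache.put(new CacheKey(tenantScoped ? "A" : null, String.class, 1L), "row of tenant A");
        cache.put(new CacheKey(tenantScoped ? "B" : null, String.class, 1L), "row of tenant B");
        return cache.get(new CacheKey(tenantScoped ? "A" : null, String.class, 1L));
    }
}
```

With tenantScoped set to false, the second put overwrites the first, so tenant A reads tenant B's row; with the tenant id included in the key, each tenant gets its own entry.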

I thought of two possible solutions to this:
  1. I could try to instruct EclipseLink to use the combination of "entity id" and "partition id" to determine the identity of an object (with regard to caching, but not only that). I read that EclipseLink should do something like this for multitenant entities, but I don't know exactly how it works or whether this strategy can be changed in some (relatively easy) way. In particular, I was wondering whether setting a tenant id with eclipselink.tenant-id would have any effect, considering that I'm not using @Multitenant at all. I suspect it would be completely ignored.
  2. I could try to make EclipseLink use a centralized sequence to generate ids, so that there would never be two entity instances of the same type T from different tenants/partitions with the same id X, and hence there shouldn't be any problem with caching or anything else.
In any case, if I want to use preallocation of generated ids I must find a way to centralize sequence generation. With the default strategy, which uses a sequence table, this would just mean putting that table in a central data source: the one I use to store users and their customer/tenant ids, and which I set on the entity manager factory as the "default" connection pool. Reading the docs, I see that it should also be possible to specify a dedicated connection pool for sequence generation, with the default one being used if none is specified. This would be perfect for me, but....
... the problem I'm experiencing is that once I set a default partitioning policy with ServerSession.setPartitioningPolicy(PartitioningPolicy), so that every entity is partitioned, the "default" (or even the "sequence") connection pool is completely ignored by EclipseLink and the partitioning policy is consulted even for queries related to sequence preallocation. The net effect is that EclipseLink always looks for the sequence table in each tenant data source instead of in the central one.
Can this be considered a bug?
Could the cause of this problem be at line 468 (in the EclipseLink 2.5.2 source) of org.eclipse.persistence.internal.sequencing.SequencingManager (method org.eclipse.persistence.internal.sequencing.SequencingManager.Preallocation_Transaction_NoAccessor_State.getNextValue(Sequence, AbstractSession)), where a null accessor (instead of the newly created "accessor") is passed to org.eclipse.persistence.sequencing.Sequence.getGeneratedVector(Accessor, AbstractSession) to allocate ids?
Otherwise I would have to make my partitioning policy able to recognize a request related to sequence generation, for which it should return the "default" (or "sequence") connection pool identifier, but at first glance I can't find an obvious way to do this.
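For reference, the behaviour I am after with centralized preallocation can be modelled in plain Java as follows. This is a sketch of the intended semantics only, not of EclipseLink's sequencing code: CentralSequence stands in for the sequence table in the central data source, and both class names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Stands in for the sequence table kept in the central data source. */
final class CentralSequence {
    private final AtomicLong next = new AtomicLong(1);
    private final int blockSize;

    CentralSequence(int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.blockSize = blockSize;
    }

    /** Reserves a block of blockSize ids and returns the first id of the block. */
    long reserveBlock() {
        return next.getAndAdd(blockSize);
    }

    int blockSize() { return blockSize; }
}

/** Per-tenant allocator that hands out ids from blocks reserved centrally. */
final class TenantIdAllocator {
    private final CentralSequence central;
    private long nextId;   // next id to hand out
    private long blockEnd; // exclusive end of the current block

    TenantIdAllocator(CentralSequence central) {
        this.central = central;
    }

    /** Returns the next id, reserving a new block when the current one is used up. */
    synchronized long nextId() {
        if (nextId == blockEnd) {
            nextId = central.reserveBlock();
            blockEnd = nextId + central.blockSize();
        }
        return nextId++;
    }
}
```

Because every block is reserved from the same central counter, two tenants can never receive the same id, while each tenant still touches the central source only once per block.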

Thanks in advance for any help.

Mauro
