Version 3

    Collected resources and discussions about multi-tenancy in Hibernate.




    (answers to questions asked during webinar)


    Q: What benefits does Hibernate multi-tenancy have over database-based multi-tenancy?

    A: To be honest, I am not sure what you mean by "database based multi-tenancy".  Do you mean vendor-specific features like Oracle VPD?  If so, it's not a question of one *over* the other; they would need to work in unison.  Even if you use Oracle VPD to achieve multi-tenancy, you would still need Hibernate to be aware of that multi-tenancy so that it can (1) pass the tenant id along to the database on the JDBC Connection and (2) properly account for second level caching.




    Q: Is there a way to use Filters and ConnectionProvider in JPA 2 / @PersistenceContext injection? How about JPA 2.1?

    A: Using a custom ConnectionProvider in JPA is much, much cleaner.  You would simply specify the Hibernate ConnectionProvider to use (the hibernate.connection.provider_class setting) in the properties section of your persistence.xml.  Using Filters is going to require access to the Session behind the EntityManager, possibly via unwrap.  The difficulty is when/where to enable the filters.




    Q: With the separate schema approach, can you use just JPA, instead of using Hibernate specific sessions?

    A: Yes.  Because the ConnectionProvider to use is specified in the config for the SessionFactory/EntityManagerFactory, there are no Hibernate-specific calls in your application code using this approach.




    Q: Given that this is currently Hibernate specific, is there a roadmap to feed this back into the JPA/EJB standard?

    A: There is a pending question posted to the JPA expert group regarding "multi-tenancy" support in JPA 2.1.  However, what is meant by that is still to be determined, and as yet no conclusion has been reached.




    Q: In separate schema design, if I add tenant Id to the entity as a sql, would I still be able to use second level cache?

    A: Currently in Hibernate, no.  That is something that would be covered by the proposed new feature for Hibernate 4.




    Q: Do you know of a cache that will support tenant isolation or cache federation to implement cache by tenant?

    A: Most (every?) cache represents keyed access to state.  The concern with second level cache and multi-tenancy in Hibernate is strictly that Hibernate itself does not include the tenant id value in the cache key, as would be needed, because Hibernate currently has no notion of a tenant id.  The proposal for Hibernate 4 is therefore essentially to make it aware of multi-tenancy and the notion of a tenant id.
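
    To illustrate why the tenant id has to be part of the key, here is a minimal plain-Java sketch.  TenantCacheKey is a hypothetical name invented for this example; it is not a Hibernate class and says nothing about how Hibernate 4 will actually build its keys.

```java
import java.util.Objects;

// Hypothetical key type for illustration only (not part of Hibernate).
// Two tenants may both own an "Account" row with id 1, so the tenant id
// must take part in equals/hashCode to keep their cached state apart.
final class TenantCacheKey {
    private final String tenantId;
    private final String entityName;
    private final Object id;

    TenantCacheKey(String tenantId, String entityName, Object id) {
        this.tenantId = Objects.requireNonNull(tenantId, "tenantId");
        this.entityName = Objects.requireNonNull(entityName, "entityName");
        this.id = Objects.requireNonNull(id, "id");
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof TenantCacheKey)) {
            return false;
        }
        TenantCacheKey other = (TenantCacheKey) o;
        return tenantId.equals(other.tenantId)
                && entityName.equals(other.entityName)
                && id.equals(other.id);
    }

    @Override
    public int hashCode() {
        return Objects.hash(tenantId, entityName, id);
    }
}
```

    Used as the key of any keyed cache (a plain HashMap is enough to try it), entries for tenant "acme" and tenant "globex" with the same entity name and PK no longer overwrite each other.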




    Q: Regarding the discriminator approach, is the tenant ID regarded implicitly when querying other entities related to the entity with the tenant Id? So if I query, say, all accounts, will it return me all accounts belonging to the customer with my tenant ID? Or must the tenant ID be added to multiple entities?

    A: There is a decision you must make in terms of your application and data.  It is possible to have the tenant id be isolated to just a "root entity" if you so desire, provided you adhere to a few guidelines:

    1. The entities should not share PK values across tenants.  This would be required in order to ensure that your "related entities" could refer back to the root entity simply by its PK.
    2. You always, always, always access these dependent entities through the root entity.  This is required specifically so that the tenant id "filtering" can be applied.

    Because these requirements are generally not feasible in most applications it is probably best to simply consider the answer here to be "no, you will always need tenant ids on all entities".  The one exception to this is "reference data" which conceivably could be shareable across tenants.  Such "reference data" is safe provided the FK points *to* the reference data.




    Q: With the discriminator approach, is it possible to have separate sequences per tenant?

    A: Currently Hibernate will not automatically manage this for you.  You could, however, implement a custom IdentifierGenerator (the interface used by Hibernate for entity PK generation) to apply such logic yourself.  Whether the new Hibernate 4 features would manage this transparently is to be determined.
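
    The per-tenant logic that such a custom generator might delegate to can be sketched in plain Java.  PerTenantSequence is a hypothetical helper invented for this example and does not implement the IdentifierGenerator interface itself; it also keeps its counters in memory, whereas a real setup would most likely back each tenant with a database sequence or table.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper for illustration only (not part of Hibernate).
// Each tenant gets its own independent counter, so PK values restart
// at 1 for every tenant and are therefore shared across tenants.
final class PerTenantSequence {
    private final ConcurrentMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    // Returns the next value for the given tenant; safe to call concurrently.
    long next(String tenantId) {
        Objects.requireNonNull(tenantId, "tenantId");
        return counters.computeIfAbsent(tenantId, t -> new AtomicLong()).incrementAndGet();
    }
}
```

    Note that values generated this way are unique only within a tenant, which is exactly the "shared PK values" situation that the caching and composite-identifier answers on this page warn about.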




    Q: What are pros and cons of two approaches in terms of performance?

    A: This has to be evaluated in 2 parts.  First, what is the impact on the database server?  Really this is going to be vendor specific, but in general you usually have more options if you physically create multiple schemas or multiple database instances, especially in terms of isolating processes and resources between tenants.  The second part concerns the impact on the application side, specifically the JVM in which the application runs.  From the Hibernate perspective these are going to be essentially the same.  You would have more resources associated with pooling of JDBC Connections and other JDBC resources with the separate schema approach, but generally speaking these should be pretty minimal.  Another consideration in Hibernate currently is the fact that only the discriminator approach supports use of second level caching; that might be a pro or con either way depending on how much data gets cached and how well the connection with the database server performs, as well as other influences.




    Q: What about administrators that should be privileged to get access to data independently of the tenant?

    A: Generally this is limited to agents of the cloud/SaaS provider.  In those cases you would need to determine whether access to entities across tenant boundaries within the same unit of work is really a requirement.  If so, then an approach similar to the discriminator approach discussed is required.  Further, if the entities across tenants share PK values then you really need to define the identifiers of those entities as composite, including the tenant id, so that they can be properly identified within a Hibernate Session or JPA EntityManager.  Certainly in this scenario the data needs to live, or at least be available, within the same database schema.  If, on the other hand, the determination is that they do not need to be accessed within the same unit of work, then the considerations are really no different than for normal access.




    Q: If we are using compound/composite primary keys, can we use the tenant id, and if so can/should it be put in the PK object?

    A: This approach is certainly valid and actually correct from the data modeling perspective, since the tenant id is part of what makes the PK unique, provided that PK values (without the tenant id) are shared across tenants.  The question of whether to encode this into a composite PK class for the entity is a matter of choice.  However, realize it means that Hibernate would not be able to transparently handle the tenant id values as is being proposed for Hibernate 4; if you go this route your application is responsible for managing the tenant id.  That does not make it an invalid approach, it's just something to consider.
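
    As a rough illustration, a composite identifier carrying the tenant id could look like the following plain-Java sketch.  AccountId and its field names are hypothetical.  The sketch shows only the value semantics (Serializable, equals, hashCode); a class actually mapped as a JPA primary key class would additionally need to be public, have a public no-arg constructor, and carry the appropriate mapping annotations, all of which are omitted here.

```java
import java.io.Serializable;
import java.util.Objects;

// Hypothetical composite identifier for illustration only: the tenant id
// plus an account number that is unique only within that tenant.
final class AccountId implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String tenantId;
    private final long accountNumber;

    AccountId(String tenantId, long accountNumber) {
        this.tenantId = Objects.requireNonNull(tenantId, "tenantId");
        this.accountNumber = accountNumber;
    }

    String getTenantId() {
        return tenantId;
    }

    long getAccountNumber() {
        return accountNumber;
    }

    // Both parts take part in equality, so account 42 of one tenant is
    // never confused with account 42 of another tenant.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof AccountId)) {
            return false;
        }
        AccountId other = (AccountId) o;
        return accountNumber == other.accountNumber && tenantId.equals(other.tenantId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(tenantId, accountNumber);
    }
}
```

    With an identifier like this the application, not Hibernate, has to supply the tenant id every time it builds a key for a lookup.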




    Q: Can the tenant_id be made available as a session variable in the database so the tenant_id value can be included in a database view WHERE-clause?

    A: Sure!  You want a custom ConnectionProvider, as discussed in the webinar, which pushes the tenant id to the JDBC Connection by whatever specific means are defined by your database vendor.  Usually this takes the form of executing an "ALTER SESSION" SQL command, but really it is vendor-specific.  As an example, what you describe is exactly the Oracle VPD approach and is exactly how you would access a VPD from Hibernate.
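
    The core of such a provider can be sketched with plain JDBC.  TenantAwareConnections is a hypothetical class written for this example and does not implement Hibernate's ConnectionProvider contract; the SQL used to tag the session is passed in by the caller because it is vendor-specific, and the sketch assumes that statement accepts the tenant id as a single bind parameter, which not every vendor's session command does.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Objects;
import javax.sql.DataSource;

// Hypothetical helper for illustration only (not a Hibernate class).
// It hands out connections that have already been tagged with a tenant id.
final class TenantAwareConnections {
    private final DataSource dataSource;
    private final String setTenantSql; // vendor-specific, one "?" for the tenant id

    TenantAwareConnections(DataSource dataSource, String setTenantSql) {
        this.dataSource = Objects.requireNonNull(dataSource, "dataSource");
        this.setTenantSql = Objects.requireNonNull(setTenantSql, "setTenantSql");
    }

    Connection getConnection(String tenantId) throws SQLException {
        Objects.requireNonNull(tenantId, "tenantId");
        Connection connection = dataSource.getConnection();
        try (PreparedStatement statement = connection.prepareStatement(setTenantSql)) {
            statement.setString(1, tenantId);
            statement.execute();
        } catch (SQLException | RuntimeException e) {
            // Never hand out (or leak) a connection that was not tagged.
            connection.close();
            throw e;
        }
        return connection;
    }
}
```

    If the underlying DataSource pools connections, the tenant setting would presumably also have to be reset when a connection is released, so that it cannot leak to the next tenant that borrows the same physical connection.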




    Q: Is there any way to apply the tenant ID when choosing the datasource to use when using the discriminator approach (like you do with the non-discriminator approach)?  Our custom solution does this at the moment.  We have a custom approach that does both. We take a tenant ID from the request context to determine the datasource to use and the datasource uses a DB user which can only see rows with that tenant ID...

    A: Yep, see the previous response.




    Q: Considering the multi-schema approach, does Hibernate allow creating new schemas at runtime?

    A: This really is a matter of how you define the custom ConnectionProvider and where it gets its Connections from.  For example, if you use DataSources, does your application server support definition and deployment of DataSources at runtime?
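
    One way a custom provider could look up its Connections per tenant, while still allowing tenants to be added at runtime, is a small registry like the sketch below.  TenantDataSourceRegistry is a hypothetical class invented for this example; how the DataSource for a new tenant is actually created or deployed remains up to your environment.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import javax.sql.DataSource;

// Hypothetical registry for illustration only (not a Hibernate class).
// Maps each tenant id to the DataSource serving that tenant's schema.
final class TenantDataSourceRegistry {
    private final ConcurrentMap<String, DataSource> byTenant = new ConcurrentHashMap<>();

    // Can be called at any time, e.g. right after a new tenant's schema is created.
    void register(String tenantId, DataSource dataSource) {
        Objects.requireNonNull(tenantId, "tenantId");
        Objects.requireNonNull(dataSource, "dataSource");
        if (byTenant.putIfAbsent(tenantId, dataSource) != null) {
            throw new IllegalStateException("Tenant already registered: " + tenantId);
        }
    }

    // Fails loudly for unknown tenants instead of falling back to a default.
    DataSource resolve(String tenantId) {
        DataSource dataSource = byTenant.get(tenantId);
        if (dataSource == null) {
            throw new IllegalArgumentException("Unknown tenant: " + tenantId);
        }
        return dataSource;
    }
}
```

    Refusing to fall back to a default DataSource is a deliberate choice in this sketch: a missing registration then surfaces as an error rather than as one tenant silently reading another tenant's schema.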




    Q: When using database vendor specific solutions like Oracle VPD, do I have to turn off Hibernate caching to avoid wrong cache hits?

    A: The answer here is really just the same as in the other cases.  If the tenants can share PK values for the same entity, then caching needs to account for that as part of the cache key, which the Hibernate code does not currently do.  If the PK values cannot be the same across tenants, then there is no issue.