17 Replies · Latest reply on Aug 9, 2008 12:00 AM by sanne

    Cache @ bootcamp

    belaban

      Hi Bela,

      I'm still around to bug and confuse you!

      Nice to hear that cache will become an XAResource, that'll give me a chance to learn the TX workings while following the Cache project.

      I've read your update and maybe have a constructive suggestion.

      Since all cached values are wrapped by a CacheValue object, you could also use this object to trace the tx/version information of reads/writes/commits. The CacheValue would then store version and tx information in a map and could apply all kinds of thread-specific heuristics on reading/writing/committing of a cached value.

      Basically it would work like this. On begin, register the XID with the cache. On each change, don't write the actual update to a list, but record the CacheID of each 'touched' CacheValue in a map. On prepare or commit you could iterate through this map to gather the information to be committed; for instance, the CacheValue could emit an Update object when its commit is called, or throw an exception which could then be propagated.

      This has one immediate advantage: if an object is put twice within the same transaction, it will only end up once in the map.
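
      As a rough sketch of that bookkeeping (all names are hypothetical; this is only the per-transaction map of touched keys, not the cache itself):

      import java.util.Collections;
      import java.util.HashMap;
      import java.util.HashSet;
      import java.util.Map;
      import java.util.Set;
      import javax.transaction.xa.Xid;

      // Hypothetical sketch: per Xid the cache only records the keys of the
      // 'touched' CacheValues, so an object put twice in one tx registers once.
      public class TouchedRegistry {

          // Xid -> set of cache keys touched by that transaction
          private final Map<Xid, Set<Object>> touched = new HashMap<>();

          // on begin: register the XID with the cache
          public synchronized void register(Xid xid) {
              touched.putIfAbsent(xid, new HashSet<>());
          }

          // on every change: remember the key, not the actual update
          public synchronized void markTouched(Xid xid, Object cacheKey) {
              touched.computeIfAbsent(xid, k -> new HashSet<>()).add(cacheKey);
          }

          // on prepare/commit: iterate the touched keys once to gather the updates
          public synchronized Set<Object> touchedBy(Xid xid) {
              Set<Object> keys = touched.get(xid);
              return keys == null ? Collections.emptySet() : Collections.unmodifiableSet(keys);
          }

          // after commit/rollback: forget the transaction
          public synchronized void forget(Xid xid) {
              touched.remove(xid);
          }
      }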

      It also has further advantages, I think, when you would like an instance-per-transaction/optimistic locking strategy. The CacheValue object could do several things on the basis of the XID of the current transaction when it reads/writes a value, in cooperation with the MVCC PostgreSQL pattern I posted earlier:

      http://www.linuxgazette.com/issue68/mitchell.html

      and a thread/version tracking list.

      One could track, for instance, which 'read version' a thread has when it wants to put to the cache, on the basis of a CacheValue-internal access map, or on commit. All kinds of heuristics could be applied on reading/writing/committing.

      A second advantage is that changes to the cache values are not only known thread-locally, but also CacheValue-locally. A rollback could be issued when, for instance, a thread tries to put an object in a multiple-instance scenario, instead of waiting for prepare.
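
      And a small sketch of that read-version idea inside a CacheValue (again hypothetical names; it only shows how a stale put could be rejected at write time instead of at prepare):

      import java.util.HashMap;
      import java.util.Map;
      import javax.transaction.xa.Xid;

      // Hypothetical sketch: remember which committed version each tx first read,
      // so a conflicting write can be refused immediately, not only at prepare.
      public class ReadVersionTracker {

          private long committedVersion = 0;                         // bumped on every commit
          private final Map<Xid, Long> firstRead = new HashMap<>();  // Xid -> version first seen

          public synchronized void noteRead(Xid xid) {
              firstRead.putIfAbsent(xid, committedVersion);
          }

          public synchronized void noteWrite(Xid xid) {
              noteRead(xid);
              if (firstRead.get(xid) != committedVersion) {
                  // another tx committed a newer version since this one first read it
                  throw new IllegalStateException("stale write by " + xid + ": read v"
                          + firstRead.get(xid) + ", committed is v" + committedVersion);
              }
          }

          public synchronized void noteCommit(Xid xid) {
              committedVersion++;                                    // this tx's write becomes the new version
              firstRead.remove(xid);
          }

          public synchronized void noteRollback(Xid xid) {
              firstRead.remove(xid);
          }
      }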

      I've tried to think through all the possibilities in conjunction with distributed caches/commit options etc., but it soon gets very complex.

      Still, I think this approach could accommodate a lot of scenarios.

      Regards,

      Sanne

        • 1. Re: Cache @ bootcamp

          'I've been messing where I shouldn't have been messing ...
          These ...'

          I'll take a look at the AOP forum, and at the JDO spec for the 'CMP-like semantics'.

          Damn, JBoss is moving so fast.

          • 2. Re: Cache @ bootcamp
            marc.fleury

            I will tell you what I really want from the JDO spec in relation to the caches... a query engine on the cache with CMP-like semantics and a simple definition of the primary key.

            • 3. Re: Cache @ bootcamp
              marc.fleury

              sanne,

              seriously, take a look at "ACID on AOP" in the AOP forum. It is a simple idea by Julien Viet where we version the state of a POJO for ACIDity with transactions.

              meaning that we instrument the POJO with a state-change interceptor that deep-copies native fields and versions them in a map or history queue (configurable, see that thread).

              If there is a transaction we enroll some Synchronization. Are we sure we want the full XAResource for the state? ... let's keep it simple: for "intra VM" it means that a simple "beforeCommit" and "afterCommit(boolean)" get called (see state management in the EJB code today in CVS)

              period.

              It is a POJO aspect, not a property of the cache. The cache is there to maintain references to the objects and make them available local/remote.
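
              A rough sketch of what such a Synchronization could look like (hypothetical names, not the actual JBoss code; standard JTA exposes beforeCompletion()/afterCompletion(int) rather than beforeCommit()/afterCommit(boolean), and the deep-copy helpers are left as placeholders):

              import java.util.HashMap;
              import java.util.Map;
              import javax.transaction.Status;
              import javax.transaction.Synchronization;
              import javax.transaction.Transaction;
              import javax.transaction.TransactionManager;

              // Rough sketch, not the real code: snapshot the POJO's fields when it
              // first joins the tx and restore them if the tx does not commit.
              public class StateVersionSynchronization implements Synchronization {

                  private final Object pojo;
                  private final Map<String, Object> preTxSnapshot;   // deep copy of the fields before the tx

                  private StateVersionSynchronization(Object pojo) {
                      this.pojo = pojo;
                      this.preTxSnapshot = deepCopyFields(pojo);
                  }

                  // hypothetical hook the state-change interceptor would call on the first write
                  public static void enroll(TransactionManager tm, Object pojo) throws Exception {
                      Transaction tx = tm.getTransaction();
                      if (tx != null) {
                          tx.registerSynchronization(new StateVersionSynchronization(pojo));
                      }
                  }

                  @Override
                  public void beforeCompletion() {
                      // last chance to validate the in-tx state before the outcome is decided
                  }

                  @Override
                  public void afterCompletion(int status) {
                      if (status != Status.STATUS_COMMITTED) {
                          restoreFields(pojo, preTxSnapshot);        // undo the in-tx changes on abort
                      }
                  }

                  // placeholders: a real interceptor would reflect over the instrumented fields
                  private static Map<String, Object> deepCopyFields(Object o) { return new HashMap<>(); }
                  private static void restoreFields(Object o, Map<String, Object> snapshot) { /* no-op */ }
              }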

              You owe me one when I get to Amsterdam :)

              • 4. Re: Cache @ bootcamp
                belaban

                > In Castor there is a concept of long transactions
                > where cache objects can be timestamped (implementing
                > a timestampable interface) and then disassociated
                > from the cache and subsequently reassociated. The
                > timestamp is checked on the way back in to see if the
                > object has become dirty since being first accessed.
                > Is there any thought to such functionality.

                For optimistic transactions we will have to associate a version number with a cache entry. On a local modification we increment it. When we PREPARE, every local entry checks whether that version number is higher than the one given in the PREPARE mcast. If yes, this means the cache entry was modified in the meantime, and we'll abort the transaction.
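
                A minimal sketch of that check (hypothetical names; the replication/mcast plumbing is left out):

                // Hypothetical names; each entry carries a version, local changes bump it,
                // and a PREPARE carrying an older version means a concurrent modification.
                public class VersionedEntry {

                    private long version = 0;
                    private Object value;

                    public synchronized void modifyLocally(Object newValue) {
                        value = newValue;
                        version++;                      // bumped on every local modification
                    }

                    // invoked when the PREPARE mcast arrives, carrying the sender's version
                    public synchronized void prepare(long versionInPrepare) {
                        if (version > versionInPrepare) {
                            // entry was modified in the meantime -> vote to abort the transaction
                            throw new IllegalStateException("local v" + version
                                    + " is newer than prepared v" + versionInPrepare);
                        }
                    }
                }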

                So, yes, what you describe above will be implemented. I don't know though whether this will be the same as long transactions in Castor. I don't think we will disassociate objects and then re-associate them.

                I'm currently reading up on JDO, let's see whether we can use some of their ideas.
                Bela

                • 5. Re: Cache @ bootcamp
                  ronr

                  In Castor there is a concept of long transactions where cache objects can be timestamped (implementing a timestampable interface) and then disassociated from the cache and subsequently reassociated. The timestamp is checked on the way back in to see if the object has become dirty since being first accessed. Is there any thought to such functionality?

                  Thanks.

                  • 6. Re: Cache @ bootcamp

                    #$%#$^&#$@#$^#$^% AOP -> XP !

                    • 7. Re: Cache @ bootcamp

                      Hi Bela,

                      First of all, thanks for your appreciation. I don't know if there are other places besides the AOP forum where cache development is being discussed; in that case I would definitely be repeating a few discussions here. But since everything appears quiet, I feel free to rant: making mistakes is only surpassed by doing nothing.

                      I'll ponder over your response, but I can't restrain myself from making some quick judgements.

                      I can catch your drift in the locking discussion, since optimistic locking is done first. In other words: first lock some entities locally, make them dirty, and then on prepare try to lock the peer beans on other servers.
                      Basically you would end up with two points in the interceptor chain where locking can occur.
                      One is the locking manager. It could state that there is no lock on a bean, so the call can go ahead. But when the thread arrives at the cache, wait .... there is a lock (from another server). That's two. I know that Marc and ... (forgot the name) had quite some trouble writing, among other things, the wake-up code for the other threads waiting for a lock to be released. It would be nicer if you didn't have to go there IMHO, since the code is already in the locking manager ... I don't know shit about locking, but if you're short on locking input, I could take a shot at understanding it. No guarantees ...

                      About multiple threads accessing EJB's.

                      Stateless session beans have no state, so multiple-thread access wouldn't pose that much of a problem. In fact I wouldn't be surprised if SLSBs aren't pooled at all.
                      Stateful session beans were designed to be used by one client; there's not much use for multiple-thread access here, it defeats the purpose.

                      Generally for entity beans you would like a serializable isolation level. In this case threading is handled by the lock manager: it won't allow two threads to call functions on an EJB with the same ID, be it multiple instances or the same instance. In case of optimistic locking you can have two threads accessing the same bean, but it would be two separate instances of the same bean. This would come down to no dirty/non-repeatable reads, i.e. repeatable reads.
                      One step lower is read committed. An EJB read by a process can be changed and committed by a shorter-running tx, but you would still need multiple instances for this.

                      Only on the lowest level, read uncommitted, would you have a scenario where two threads can actually read each other's changes before committing. Then you basically would have your quick-and-dirty SQL-coded webshop with no locking (for update), and the CMP layer would end up being a blunt O/R tool. Model-wise I would oppose this, but there would be a point in having all the options. We don't need it now, I think. Anyway, with the value-object-with-heuristics solution there's no objection to returning a reference to the same object to two threads.

                      Maybe I could help out coding, but being Dutch I always try to reach a consensus before acting, so everything can be planned. I'm afraid it's hereditary over here, and not very AOP-like.

                      Till next week,

                      Sanne

                      • 8. Re: Cache @ bootcamp
                        belaban

                        Hey Sanne et al,

                        sorry for not replying sooner, but I'm quite busy in my day job these days. I will have almost a week off starting Feb 17, and I intend to get the XAResource part done, or at least a large part of it.

                        Sanne: your comments are very much appreciated, I will go over all of them and take them into consideration.

                        One thing that is a bit unclear at this time is whether we can really use the locks given to us by the LockingInterceptor, because we will need to acquire distributed locks. These are handled exclusively by the CacheInterceptor (or whatever we end up calling it, maybe it will end up in the CMP interceptor). So, when I mcast a PREPARE, this is communication between the CacheInterceptors (horizontal communication) rather than vertical, so at that point the incoming mcast will be received directly by the CacheInterceptor, and not even go through the LockingInterceptor.

                        This probably deserves some more thought, and I'll definitely go through your comments and suggestions.

                        I believe one possible way would be to make Locking a policy: anyone can replace the Locking policy. Maybe the default Locking policy would make use of locks provided by the LockingInterceptor; others may maintain their own locking tables.

                        With respect to multiple threads accessing the same EJB: Yes, this is forbidden by the spec. But, then again, does this make sense? I believe J2EE does *not* mandate caches in the first place, despite the fact that they are very useful building blocks for any distributed application. So the question is whether we should follow the spec here, or not.

                        Cheers,
                        Bela

                        • 9. Re: Cache @ bootcamp

                          No.

                          The cache should first say it is OK to commit (certainly in a multiple-instance scenario), and then the JDBC part of the CMP should commit its list of changes to the DB for the appropriate tx (be it on a per-object or per-field basis), to persist the decision the cache has made on the basis of the heuristics/txID/locking policies.

                          Anyway that would be how I would see it all working!
                          Somebody is bound to protest!?

                          • 10. Re: Cache @ bootcamp

                            Right, I'd left out the DB persistence step....

                            • 11. Re: Cache @ bootcamp

                              ehhh. Small correction.

                              That would be:

                              ...

                              Thread two:

                              prepare
                              cache.prepare(): eh wait, your initial version is stale, because the committed version is a different one, exception!!!!
                              rollback

                              etc....

                              This is all from a 3.x perspective, but it should be comparable to the 4.x: the interceptor concept is enhanced, but I think the principle remains the same.

                              ...


                              The locking issue should really be sorted out:

                              a. is the distributed lock emitted when the call is in the locking interceptor
                              b. or is the lock obtained when the value is being called from cache

                              My point would be that a. is what happens in the non-distributed case, and it should be the scenario for the distributed one too.

                              At the moment the order of interceptors (3.x) is:
                              1. security SecurityInterceptor
                              2. tx TxInterceptorCMT
                              3. EntityCreationInterceptor something with ejbCreate/postCreate
                              4. lock EntityLockInterceptor
                              5. cache EntityInstanceInterceptor
                              6. (connection/synch)-> if there's no valid context CMP ??

                              The good thing if this were true is that the complexity of the cache decreases immensely (divide and conquer!).
                              The cache would only have to know:

                              a. which thread are you
                              b. what is the locking policy of the EJB
                              c. apply heuristics in CacheValue to determine which version to return

                              And on a prepare call:

                              a. lockCache
                              b. what thread (txID) are you
                              c. which objects (wrapped by CacheValue) has this txID touched?
                              d. call prepare on the CacheValues and have them emit the relevant Update objects (or throw an exception). CacheValue can do heuristics for each object, remember!

                              And on commit

                              a. execute all the previously gathered Update objects
                              b. release cache lock

                              Et voila. The locking should be in the locking interceptor then.

                              Mmmmm, sounds too simple .....
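
                              A sketch of roughly how that prepare/commit flow could look (all names and interfaces are made up for illustration; it assumes the same thread runs prepare and then commit or rollback):

                              import java.util.ArrayList;
                              import java.util.Collections;
                              import java.util.HashMap;
                              import java.util.HashSet;
                              import java.util.List;
                              import java.util.Map;
                              import java.util.Set;
                              import java.util.concurrent.locks.ReentrantLock;

                              // Hypothetical sketch: prepare locks the cache, asks every touched
                              // CacheValue for an Update (or an exception), commit executes the
                              // gathered Updates and releases the lock again.
                              public class TxCache {

                                  interface Update { void apply(); }                    // emitted by CacheValue.prepare
                                  interface CacheValue { Update prepare(Object txId); } // may throw -> caller rolls back

                                  private final ReentrantLock cacheLock = new ReentrantLock();
                                  private final Map<Object, Set<CacheValue>> touchedByTx = new HashMap<>();
                                  private final Map<Object, List<Update>> prepared = new HashMap<>();

                                  // on every write: remember which CacheValues this txID touched
                                  public synchronized void touched(Object txId, CacheValue value) {
                                      touchedByTx.computeIfAbsent(txId, id -> new HashSet<>()).add(value);
                                  }

                                  public void prepare(Object txId) {
                                      cacheLock.lock();                                 // a. lockCache
                                      List<Update> updates = new ArrayList<>();
                                      Set<CacheValue> touched;
                                      synchronized (this) {                             // b./c. what did this txID touch?
                                          touched = touchedByTx.getOrDefault(txId, Collections.emptySet());
                                      }
                                      for (CacheValue v : touched) {
                                          Update u = v.prepare(txId);                   // d. CacheValue applies its heuristics
                                          if (u != null) {
                                              updates.add(u);
                                          }
                                      }
                                      prepared.put(txId, updates);
                                  }

                                  public void commit(Object txId) {
                                      try {
                                          for (Update u : prepared.getOrDefault(txId, Collections.emptyList())) {
                                              u.apply();                                // a. execute the gathered Updates
                                          }
                                      } finally {
                                          prepared.remove(txId);
                                          synchronized (this) { touchedByTx.remove(txId); }
                                          cacheLock.unlock();                           // b. release the cache lock
                                      }
                                  }

                                  public void rollback(Object txId) {
                                      prepared.remove(txId);
                                      synchronized (this) { touchedByTx.remove(txId); }
                                      if (cacheLock.isHeldByCurrentThread()) {
                                          cacheLock.unlock();
                                      }
                                  }
                              }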

                              • 12. Re: Cache @ bootcamp

                                I guess one of the things I'm trying to say is that a distributed cache may involve making two things distributed:

                                1. the locking manager (broadcast all the locks)
                                2. the cache (broadcast all the state changes)

                                I don't think the two will be mixed in one apparatus.

                                • 13. Re: Cache @ bootcamp

                                  ...

                                  Yes. However, this would not work for multiple threads accessing the same EJB concurrently. Although this is forbidden by the EJB spec (I think !), we should be extensible for future requirements.

                                  ...

                                  Concurrent access is forbidden. This is a major stance in the spec; it is why EJB's are a good thing. At most a multiple-instance strategy is possible. In that case, say, two separate instances of the same EJB (same id) are available, but each thread will have its own instance. This could easily be accomplished when each CacheValue stores a cached object mapped to a thread. I thought that the CMP crew had some problems implementing this earlier: this is definitely a solution IMHO that could make the cache shine. I think this should be discussed with the CMP crew, as multiple instance is already a 3.x feature, and it will have to make it into 4.x.
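
                                  A small sketch of that multiple-instance idea (hypothetical names; the deep copy of the bean state is delegated to a Copier callback):

                                  import java.util.Map;
                                  import java.util.concurrent.ConcurrentHashMap;

                                  // Hypothetical sketch: one CacheValue per EJB id, handing each
                                  // transaction (or thread) its own copy of the committed bean state.
                                  public class MultiInstanceCacheValue<T> {

                                      public interface Copier<U> { U copy(U original); }  // deep-copies the bean state

                                      private volatile T committed;                                     // the committed instance
                                      private final Map<Object, T> perTx = new ConcurrentHashMap<>();  // txId -> private copy
                                      private final Copier<T> copier;

                                      public MultiInstanceCacheValue(T committed, Copier<T> copier) {
                                          this.committed = committed;
                                          this.copier = copier;
                                      }

                                      // each transaction gets its own instance of the same bean (same id)
                                      public T instanceFor(Object txId) {
                                          return perTx.computeIfAbsent(txId, id -> copier.copy(committed));
                                      }

                                      // on commit the private copy becomes the committed instance
                                      public void commit(Object txId) {
                                          T mine = perTx.remove(txId);
                                          if (mine != null) {
                                              committed = mine;
                                          }
                                      }

                                      // on rollback the private copy is simply dropped
                                      public void rollback(Object txId) {
                                          perTx.remove(txId);
                                      }
                                  }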

                                  Something close to concurrent access is reentrancy: Thread x is in EJB A, which uses EJB B, which calls EJB A again: thread x is in two places in EJB A.

                                  ...

                                  Yes. Also, acquire a r/w lock on the entry *before* attempting to update.

                                  ...

                                  (Disclaimer: talking only EJB's here.) Locking is done in the EJB interceptor chain before the CMP interceptor. In case of a read/write lock on a bean, the thread won't even make it to the CMP interceptor, which might use the cache. So there's no explicit need to check locking in a r/w lock scenario for the cache. The cache, however, should be informed about the locking policy (isolation level) of the bean when it gives a value to a thread. Say the policy is read-uncommitted (not likely in the case of EJB); then all threads share one cache object. If there is optimistic locking, then each thread would get its own object out of the cache, and whichever thread commits first walks away with the prize. When the second thread wants to prepare and the object state has changed, it gets an exception because its state is inconsistent with the committed one. Sacha would know about this.


                                  Basically a scenario would be like this

                                  R/W lock scenario

                                  Thread 1:

                                  Locking Interceptor get lock on EJB 1;
                                  CMP Interceptor EJB in cache? Get from cache.
                                  Change state EJB 1

                                  Thread 2:

                                  Locking Interceptor get lock on EJB: WAIT!
                                  ......
                                  ......
                                  Nothing happens

                                  Thread one:

                                  prepare
                                  cache.prepare()
                                  (get cache lock ???)
                                  get all objects touched by thread 1
                                  object.prepare(): return no exceptions

                                  commit
                                  cache.commit()
                                  Locking manager.commit():release all locks

                                  Thread two:

                                  WAKE UP!
                                  CMP Interceptor EJB in cache? Get from cache.

                                  etc....

                                  Multiple instance scenario


                                  Thread 1:

                                  Locking Interceptor get lock on EJB 1;
                                  CMP Interceptor EJB in cache? Get from cache.
                                  Change state EJB 1
                                  CMP Interceptor on the way back Store EJB 1 state in cache under reach of thread 1

                                  Thread 2:

                                  Locking Interceptor get lock on EJB? (hey lock is not exclusive!)
                                  Change state EJB 1
                                  CMP Interceptor on the way back Store EJB 1 state in cache under reach of thread 2
                                  (Do it twice ....)
                                  Locking Interceptor get lock on EJB? (hey lock is not exclusive!)
                                  Change state EJB 1
                                  CMP Interceptor on the way back Store EJB 1 state in cache under reach of thread 2
                                  (Do it three times ....)
                                  Locking Interceptor get lock on EJB? (hey lock is not exclusive!)
                                  Change state EJB 1
                                  CMP Interceptor on the way back Store EJB 1 state in cache under reach of thread 2


                                  Thread one:

                                  prepare
                                  cache.prepare()
                                  (get cache lock ???)
                                  get all objects touched by thread 1
                                  object.prepare(): return no exceptions

                                  commit
                                  cache.commit()
                                  Locking manager.commit():release all locks

                                  Thread two:

                                  commit
                                  cache.commit(): eh wait, your initial version is stale, because the committed version is a different one, exception!!!!
                                  rollback

                                  etc....

                                  This is all from a 3.x perspective, but it should be comparable to the 4.x: the interceptor concept is enhanced, but I think the principle remains the same.

                                  Anyway, this stuff can get very complex; it'll be good to combine several points of view (CMP: Dain, clustering etc.: Sacha, JavaGroups: eh, that's you).
                                  All the above is based on the JBoss advanced course I've taken recently. If you've been there too, then chances are I'm telling you nothing new.

                                  Regards,

                                  Sanne

