5 Replies Latest reply on Dec 16, 2002 8:31 PM by belaban

    cache inspiration: prevaylor, persistent object cache/store

      Hi Bela,

      Just to make you feel less alone on this forum...

      A long time ago I came across this link.

      http://www.prevayler.org/index.html

      It's a rather 'flashy' sourceforge project.
      It is a persistent in memory object store.

      Basically it works like this. You keep all your objects in memory (quote: '....ram costs nothing...'). All 'queries' are executed using a command pattern.
      All these commands are serialized to disk before being executed, and once in a while you make a snapshot, i.e. dump the whole memory store to disk.

      So what happens when the server crashes? It takes the last snapshot from disk and 'plays' all the commands previously written to disk, and voila: everything is back to normal. This somewhat resembles the write-ahead logging that bigger RDBMSs use for more advanced memory paging.
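
To make the log/snapshot/replay cycle concrete, here is a minimal sketch in plain Java. It is not the Prevayler API: the class `PrevalenceSketch`, the `Command` interface and the `Put` command are names made up for illustration, and byte arrays in memory stand in for the snapshot file and the command log on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical names throughout; this is not the Prevayler API. */
final class PrevalenceSketch {
    /** A deterministic, serializable change to the store. */
    interface Command extends Serializable {
        void applyTo(Map<String, String> store);
    }

    static final class Put implements Command {
        private final String key;
        private final String value;
        Put(String key, String value) { this.key = key; this.value = value; }
        public void applyTo(Map<String, String> store) { store.put(key, value); }
    }

    private HashMap<String, String> store = new HashMap<>();
    // Stand-ins for the snapshot file and the command log file on disk.
    private byte[] snapshot = serialize(new HashMap<String, String>());
    private final List<byte[]> log = new ArrayList<>();

    /** Log the serialized command first, then execute it. */
    void execute(Command command) {
        log.add(serialize(command));
        command.applyTo(store);
    }

    /** Dump the whole store; the commands logged so far are no longer needed. */
    void takeSnapshot() {
        snapshot = serialize(store);
        log.clear();
    }

    /** Simulate a crash and restart: load the snapshot, then replay the log in order. */
    @SuppressWarnings("unchecked")
    void crashAndRecover() {
        store = (HashMap<String, String>) deserialize(snapshot);
        for (byte[] entry : log) {
            ((Command) deserialize(entry)).applyTo(store);
        }
    }

    String get(String key) { return store.get(key); }

    private static byte[] serialize(Object o) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Replay only gives back the same state because each command is deterministic, which is why the commands must not depend on things like the current time or random numbers.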

      There's work going on on replication too.

      The prerequisites are: '..Your Business Classes must be Serializable and Deterministic ..'.

      It could be possible to keep a time/transaction-ordered in-memory array of serialized changed objects; these could just be persisted to an RDBMS. Then the 'replay' list could be shortened, making it more like a rolling log.

      My 2cts.,

      Sanne

        • 1. ps: cache inspiration: prevaylor, persistent object cache/st

          P.S. the fun part is that

          Jakarta Commons' Collections/ Xpath (http://jakarta.apache.org/commons/collections.html)

          can be plugged in; this provides an XPath query interface to the store. In this way you can query objects in collections, sub-collections, sub-sub-collections etc. in a transparent way. It would make it trivial to plug in cache policies for different collections in the store.

          Mmmm I must be overlooking a lot,
          anyway, my 2cts,

          Sanne


          • 2. Re: cache inspiration: prevaylor, persistent object cache/st
            belaban

            I feel less alone already ... :-)

            Thanks for the hint ! Looks like Prevayler is essentially an in-memory DB, much like HSQLDB.

            I guess our caching policy will determine if, and if yes, when to persist cache entries. Some policies might be snapshotting, write-through etc. Haven't gotten to that point yet.

            Bela

            • 3. Re: cache inspiration: prevaylor, persistent object cache/st

              Hi Bela,

              Since I don't know that much about caching, shut me up if I'm talking nonsense.

              To me a cache is a big map with objects in it, i.e. a collection. This is what Prevayler is: no SQL or any such thing.

              So what does prevayler add? Well if I have a collection, and the power goes PANG!, I can restart the system and the store is restored (... no more writes lost in cache state).

              Since Prevayler, to my knowledge, extends the collection APIs, the Jakarta JXPath implementation can be applied on top of the system, though it doesn't have to be. What does this add?

              Say I have a map called global cache, containing sub-maps with caches for every deployed WAR in the system/network, each with sub-maps for caches for, say, SFSB and CMP EJB; then I could retrieve cached value objects using JXPath by:

              Object cached_Object = context.getValue("globalcache/my_web_app/sfsb/cache_specific_wrapper_object[cacheObjectPKfield='id']/cached_object");

              The context object is set to point to my global collections object. The cache_specific_wrapper_object could contain time of caching, timeout rules, etc. The lookup is done using introspection, so things could use speeding up, but you have to admit: this is spiffy.
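
For readers without JXPath at hand, the nested "map of maps" layout can be illustrated with a hand-rolled path walker. To be clear, this is a stand-in and not JXPath: it only follows '/'-separated keys and supports no predicates, and the class name `NestedCacheLookup` is made up for this sketch.

```java
import java.util.Map;

/** Hand-rolled path walker over nested maps; a stand-in, not JXPath. */
final class NestedCacheLookup {
    /** Follow each '/'-separated segment into the next nested map. */
    static Object lookup(Map<String, Object> root, String path) {
        Object current = root;
        for (String segment : path.split("/")) {
            if (!(current instanceof Map)) {
                return null; // the path goes deeper than the structure does
            }
            current = ((Map<?, ?>) current).get(segment);
        }
        return current;
    }
}
```

With a root map holding one sub-map per web app, and each of those holding one sub-map per cache type, `lookup(root, "my_web_app/sfsb/42")` would return whatever is stored under key "42" in that app's SFSB cache, or null if any segment is missing.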

              The point is that all the commands that modify the global collection object are submitted using a command pattern and spooled to disk, combined with a scheduled dump of the in-memory collection to clear this command spool.

              These command objects could also contain an XPath expression as above to select the objects on which to act:

              1. //SFSB -> gimme all sfsb's (where // is short for /?????/?????/?????? etc)
              2. //my_deployed_war -> shut this one down
              3. / -> delete everything

              Furthermore, here is an interesting quote from the site about obtaining consistent cache snapshots:

              ...

              How can you expect to produce a consistent snapshot of a system that is constantly being modified?

              This is the fundamental problem with Ambitious Transparent Persistence projects. With prevalence, the problem is solved simply by using the command log.

              The command log enables the system to have a replica of the business logic on another virtual machine. All commands applied to the "hot" system are also read by the replica and applied in the exact same order. At backup time, the replica stops reading the commands and its snapshot is safely taken. Then, the replica resumes reading the command queue and gets back in sync with the "hot" system.

              ...
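
The replica idea from the quote can be sketched in a few lines. This is only an illustration under simplifying assumptions: the command log is a plain in-memory list, commands are `Consumer` lambdas, and `ReplicaSketch` is a made-up name, not something from Prevayler.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** A replica that follows a shared command log at its own pace (hypothetical names). */
final class ReplicaSketch {
    private final Map<String, Integer> state = new HashMap<>();
    private int nextIndex = 0; // position of the next unread command in the log

    /** Apply every command the replica has not seen yet, in log order. */
    void catchUp(List<Consumer<Map<String, Integer>>> log) {
        while (nextIndex < log.size()) {
            log.get(nextIndex++).accept(state);
        }
    }

    /** While the replica is not reading, its state is stable and can be copied. */
    Map<String, Integer> snapshot() {
        return new HashMap<>(state);
    }
}
```

The "hot" system keeps appending to the log while the snapshot is taken; the replica simply calls `catchUp` again afterwards and is back in sync.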

              This command log could, for instance, be distributed using JavaGroups. There would be (MBean) command routers to decide the topology of the distributed caches (which cache replicates which?), and the (MBean) caches themselves feeding on these routers. As long as caches are only read, this could occur concurrently; the writing commands would have to pass through a router.

              The best thing is that since all commands are fed into the system using an object queue, the transaction characteristics are very favorable: no concurrency problems.
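
A small sketch of why a single command queue sidesteps concurrency problems: any number of threads may submit commands, but only one worker thread ever touches the store. The class `SerialCommandQueue` is hypothetical, and for simplicity this sketch sends reads through the same queue as well, which is stricter than the concurrent reads suggested above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** All commands run one at a time on a single worker thread (hypothetical names). */
final class SerialCommandQueue {
    // Only the single worker thread ever touches this map, so it needs no locks.
    private final Map<String, Integer> store = new HashMap<>();
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Queue a command that adds one to the counter under the given key. */
    void increment(String key) {
        worker.execute(() -> store.merge(key, 1, Integer::sum));
    }

    /** Reads go through the same queue here, so they see all earlier commands. */
    int read(String key) {
        Callable<Integer> task = () -> store.getOrDefault(key, 0);
        try {
            return worker.submit(task).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Let queued commands finish, then release the worker thread. */
    void shutdown() {
        worker.shutdown();
    }
}
```

The price is that writes are serialized: throughput is bounded by what one thread can execute, which is fine as long as each command is short.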

              To avoid confusion: I'm neither a cache expert nor a Prevayler user/expert. But if you might be interested in using this technique, or maybe another one that's appealing, let me know (sanne@newfoundland.nl).
              But since I've learned some XML, and don't like RDBMS that much, this setup really appeals to me.

              Regards,

              Sanne

                • 5. Re: cache inspiration: prevaylor, persistent object cache/st
                  belaban



                  > So what does prevayler add? Well if I have a
                  > collection, and the power goes PANG!, I can restart
                  > the system and the store is restored (... no more
                  > writes lost in cache state).


                  So the cache itself is persistent ? I thought the whole point about having caches was that we did not persist anything, therefore not slowing down the app. These guys must put the update onto a work queue and then have a thread to write the cached entry to storage.
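
The work-queue guess above is what is usually called write-behind. A minimal sketch of that idea follows; note that it illustrates the guess, not how Prevayler works (per the first post, Prevayler serializes each command to disk before executing it). `WriteBehindCache` and its methods are made-up names, and a second map stands in for the slow storage.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Write-behind sketch; class and method names are made up for illustration. */
final class WriteBehindCache {
    private final Map<String, String> memory = new ConcurrentHashMap<>();
    // Stand-in for slow persistent storage (disk or database).
    private final Map<String, String> storage = new ConcurrentHashMap<>();
    // One writer thread with an unbounded FIFO work queue behind it.
    private final ExecutorService writer = Executors.newSingleThreadExecutor();

    /** Update memory right away and queue the storage write for later. */
    void put(String key, String value) {
        memory.put(key, value);
        writer.execute(() -> storage.put(key, value));
    }

    String get(String key) { return memory.get(key); }

    String stored(String key) { return storage.get(key); }

    /** Stop accepting writes and wait until queued writes reach storage. */
    boolean close() {
        writer.shutdown();
        try {
            return writer.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The caller never waits for storage, but whatever is still sitting in the queue when the process dies is lost, which is the trade-off against logging each change before applying it.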



                  > These command objects could also contain an XPath
                  > expression as above to select the objects on which to
                  > act:
                  >
                  > 1. //SFSB -> gimme all sfsb's (where // is short for
                  > /?????/?????/?????? etc)
                  > 2. //my_deployed_war -> shut this one down
                  > 3. / -> delete everything


                  Sounds cool. But I think we should focus on basic functionality first. KISS, KISS, KISS. Let's not overengineer this baby. Some people think that even JCS is too big (in terms of functionality)...



                  > The command log enables the system to have a replica
                  > of the business logic on another virtual machine. All
                  > commands applied to the "hot" system are also read by
                  > the replica and applied in the exact same order. At
                  > backup time, the replica stops reading the commands
                  > and its snapshot is safely taken. Then, the replica
                  > resumes reading the command queue and gets back in
                  > sync with the "hot" system.

                  This is certainly nice. I assume they provide FIFO order of commands. In our implementation (that's the part I'm currently focusing on), I want to go further and provide transactional semantics for cache updates: any item is updated in all caches, or in none. This is one of the properties, the others being simple async update and sync update without locking. Maybe the Prevayler stuff can be used on top of our cache ?...
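
The "all caches or none" property can be sketched as a prepare/commit round over in-memory caches. This is an illustration only and not the actual implementation or the JavaGroups API: `AllOrNothingUpdate`, `Cache` and the `available` flag are invented for the sketch, everything runs in one JVM, and failures between the two phases are not handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Prepare/commit sketch over in-memory caches (hypothetical names). */
final class AllOrNothingUpdate {
    static final class Cache {
        private final Map<String, String> data = new HashMap<>();
        private final Map<String, String> staged = new HashMap<>();
        private final boolean available; // false simulates a member that refuses updates

        Cache(boolean available) { this.available = available; }

        /** Phase 1: stage the new value without making it visible. */
        boolean prepare(String key, String value) {
            if (!available) return false;
            staged.put(key, value);
            return true;
        }

        /** Phase 2: make the staged value visible. */
        void commit(String key) {
            String value = staged.remove(key);
            if (value != null) data.put(key, value);
        }

        void rollback(String key) { staged.remove(key); }

        String get(String key) { return data.get(key); }
    }

    /** Returns true if every cache took the update, false if none did. */
    static boolean update(List<Cache> caches, String key, String value) {
        List<Cache> prepared = new ArrayList<>();
        for (Cache cache : caches) {
            if (!cache.prepare(key, value)) {
                for (Cache p : prepared) p.rollback(key); // undo the staging done so far
                return false;
            }
            prepared.add(cache);
        }
        for (Cache cache : caches) cache.commit(key);
        return true;
    }
}
```

In a real distributed setting the hard part is what happens when a member crashes after voting yes but before committing; this sketch deliberately leaves that out.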


                  Bela