5 Replies Latest reply on Oct 29, 2007 12:17 PM by manik

    Transactions

    aditsu

      Let's say I have a machine running a jboss cache (C) and another one running an application (A). A connects to C through some kind of custom API. How can I make it so that A can call multiple API methods within a single transaction in C?
      E.g. this could be a scenario:
      - A starts a transaction
      - A calls a method that modifies a cache node in C
      - A calls a method that modifies another node in C
      - A tries to commit the transaction
      - the first node can be modified successfully, but the second one throws an exception
      - C rolls back the whole transaction because the second node change failed

      If I only rely on transactions in C, then either the first method has to start a transaction, or A needs to call another API method first that starts a transaction. How would the second method "know" it is part of the same transaction, and how can A then commit the same transaction? Are transactions associated with threads in machine C? Does that mean the whole sequence of API calls has to be executed on the same dedicated thread in C? Or is it possible to send the transaction reference across the API? Will there be conflicts if multiple threads in A are doing this at the same time?

      Or, is it possible to use a distributed transaction manager to solve this problem? How would that work, and how would the code in A and the code in C know they're part of the same transaction?
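
      For context on the thread question: JTA-style transaction managers do associate the active transaction with the calling thread. A minimal self-contained sketch of that model (plain Java for illustration only, not JBoss Cache or JTA API):

```java
// Sketch of the thread-association model JTA transaction managers use:
// each thread carries its own "current transaction". Plain Java for
// illustration; class and method names here are invented.
class ThreadTxManager {
    private static final ThreadLocal<String> currentTx = new ThreadLocal<>();

    static void begin(String txId) { currentTx.set(txId); }   // associate tx with this thread
    static String current()       { return currentTx.get(); } // visible only on this thread
    static void commit()          { currentTx.remove(); }     // disassociate on completion
}

public class TxThreadDemo {
    public static void main(String[] args) throws InterruptedException {
        ThreadTxManager.begin("tx-1");
        System.out.println("caller thread sees: " + ThreadTxManager.current());

        // A different thread sees no transaction at all:
        Thread other = new Thread(
            () -> System.out.println("other thread sees: " + ThreadTxManager.current()));
        other.start();
        other.join();

        ThreadTxManager.commit();
    }
}
```

      This is why a sequence of transactional calls either has to run on one thread, or the transaction context has to be explicitly suspended and resumed (or propagated) between threads.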

        • 1. Re: Transactions
          manik

          Using a distributed TM may work and is worth trying out, as I'd imagine this is the "correct" approach, provided C is configured with a TransactionManagerLookup that knows how to get a handle on the distributed TM.
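
          For reference, the lookup class is set via the TransactionManagerLookupClass attribute in C's configuration. A sketch, assuming the JBoss Cache 2.x class name; a custom lookup class that returns a handle on the distributed TM could be substituted here:

```
<!-- sketch: in the cache's XML config. GenericTransactionManagerLookup
     ships with JBoss Cache 2.x and probes several well-known JNDI
     locations; replace it with a custom TransactionManagerLookup
     implementation to hand the cache your distributed TM. -->
<attribute name="TransactionManagerLookupClass">
    org.jboss.cache.transaction.GenericTransactionManagerLookup
</attribute>
```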

          A simpler approach may be not to use a custom API to communicate between A and C at all, but instead to do something like this:

          Let B be another cache instance, which runs in the same JVM as A. A always talks to B, never directly to C. So this way transactional scope is maintained regardless of which TM you use.

          Now B can be tuned with an aggressive eviction policy so it does not maintain much state at all in memory so it doesn't impact the machine very much. B is also configured with a TcpCacheLoader pointing at C. C runs with a TcpCacheServer, which acts as a backing cache to B. So all the cache state is really held in C, but B acts as the API front end for interacting with the cache.
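
          In config terms, B's cache loader section might look roughly like this (a sketch assuming JBoss Cache 2.x class names and element layout; the host and port are placeholders for wherever C's TcpCacheServer listens):

```
<!-- sketch of B's cache-loader configuration; class name and
     properties assume JBoss Cache 2.x, adjust for your version -->
<attribute name="CacheLoaderConfiguration">
    <config>
        <passivation>false</passivation>
        <cacheloader>
            <class>org.jboss.cache.loader.tcp.TcpDelegatingCacheLoader</class>
            <properties>
                host=192.168.0.10
                port=7500
            </properties>
            <fetchPersistentState>false</fetchPersistentState>
        </cacheloader>
    </config>
</attribute>
```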



          • 2. Re: Transactions
            aditsu

            About distributed TM, I know almost nothing about that, and have no idea where to start.

            With the local cache ("B"), if a node is locked for writing in B, will it be also locked for writing in C, at the same time?

            • 3. Re: Transactions
              manik


              "aditsu" wrote:
              About distributed TM, I know almost nothing about that, and have no idea where to start.


              JBoss TS should have some literature around this, although I'd say you'd be breaking new ground here as I don't know of any users using JBoss Cache with a distributed TM.

              "aditsu" wrote:

              With the local cache ("B"), if a node is locked for writing in B, will it be also locked for writing in C, at the same time?


              Not until B commits.

              • 4. Re: Transactions
                aditsu

                Hi, I'm finally coming back to this problem, and I'd like to use a local cache (B) to keep things simple. So far I found 2 ways to do that:
                1. What you said - a TcpCacheLoader in B pointing at C, and (I guess) no replication for B
                2. B and C synchronously replicated in the same cluster, with no initial state transfer, and a ClusteredCacheLoader in B
                 In both cases, eviction would be used to discard unneeded nodes from B, based on either an eviction policy or application logic.
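
                 (For concreteness, a rough sketch of what approach 2 might look like in B's configuration, assuming JBoss Cache 2.x attribute and class names:)

```
<!-- sketch: synchronous replication, no initial state transfer, and a
     ClusteredCacheLoader so missing nodes are fetched from peers -->
<attribute name="CacheMode">REPL_SYNC</attribute>
<attribute name="FetchInMemoryState">false</attribute>
<attribute name="CacheLoaderConfiguration">
    <config>
        <cacheloader>
            <class>org.jboss.cache.loader.ClusteredCacheLoader</class>
            <properties>timeout=10000</properties>
        </cacheloader>
    </config>
</attribute>
```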

                 Now, I'm not sure what the exact differences between the 2 approaches are (or whether there is yet another way), but here is what I'd like to achieve:
                - Data changes in C should update the B cache too (or somehow notify A). In the first approach I'm afraid it won't be notified at all; in the second approach I'm afraid it will receive too much (including needed and unneeded data)
                - Suppose I have this local cache setup on 2 machines, B1 and B2. If I start a transaction on each one, and try to read or write a record on B1 and write the same record on B2, whichever happens first should block the other one, until the transaction is committed. I'm afraid that won't happen with a TcpCacheLoader

                Could you please confirm my understanding, complete the missing details and advise me what approach is better for my needs?

                Thanks
                Adrian

                • 5. Re: Transactions
                  manik

                  Hi

                  The differences are that approach 1 treats C as a store of data. Approach 2 treats C as a peer.

                  Even with approach 1, data changes in C will be picked up by B if the node didn't already exist in B or if the node was evicted. So if your requirement isn't for changes in C to be seen in B *in real time*, then an eviction policy on B will ensure that changes are seen within a certain delay.

                  With approach 2, you shouldn't receive too much information, provided all the state in C is relevant to B.

                  Re: your question on txs, this will still work, since changes via the TcpCacheLoader are written in two phases as well (a prepare phase and a commit phase), and only when B1 or B2 successfully commits will the TcpCacheLoader be instructed to commit the changes.
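
                  A minimal self-contained sketch of that prepare/commit pattern (plain Java for illustration, not JBoss Cache source; the store stands in for C, and the buffered writes stand in for an in-flight transaction on B1 or B2):

```java
import java.util.HashMap;
import java.util.Map;

// Conceptual sketch of a two-phase cache loader: prepare() buffers
// modifications per transaction, and only commit() applies them to
// the backing store. Until then, concurrent readers of the store
// never see uncommitted changes.
class TwoPhaseLoader {
    private final Map<String, String> store = new HashMap<>();                 // backing store (plays C)
    private final Map<Object, Map<String, String>> pending = new HashMap<>();  // per-tx buffered writes

    // Phase 1: record the modifications, but do not touch the store yet.
    void prepare(Object tx, Map<String, String> modifications) {
        pending.put(tx, new HashMap<>(modifications));
    }

    // Phase 2: apply the buffered writes only on commit.
    void commit(Object tx) {
        Map<String, String> mods = pending.remove(tx);
        if (mods != null) store.putAll(mods);
    }

    // Rollback simply discards the buffered writes.
    void rollback(Object tx) {
        pending.remove(tx);
    }

    String get(String key) { return store.get(key); }
}

public class TwoPhaseDemo {
    public static void main(String[] args) {
        TwoPhaseLoader loader = new TwoPhaseLoader();
        Object tx1 = new Object();

        loader.prepare(tx1, Map.of("/a", "1"));
        System.out.println(loader.get("/a")); // prints "null" - not committed yet

        loader.commit(tx1);
        System.out.println(loader.get("/a")); // prints "1"
    }
}
```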