13 Replies Latest reply on Sep 14, 2006 12:42 PM by Brian Stansberry

    Exception in startService with PojoCache

    Yudivian Almeida Cruz Newbie

      I'm trying to set up replication with PojoCache. It works fine when I run only one instance of the application I'm developing (not entirely fine, because I get an OutOfMemory exception, but that's not what I'm trying to solve now). But when I start more than one instance on different PCs (or even in different JVMs on the same PC), I get this exception:

      org.jboss.cache.CacheException: Initial state transfer failed: Channel.getState() returned false
       at org.jboss.cache.TreeCache.fetchStateOnStartup(TreeCache.java:3190)
       at org.jboss.cache.TreeCache.startService(TreeCache.java:1429)
       at org.jboss.cache.aop.PojoCache.startService(PojoCache.java:94)
      


      Then, when I increase the InitialStateRetrievalTimeout and start the applications again, I get a different exception:

      org.jboss.cache.lock.TimeoutException: failure acquiring lock: fqn=/d105619f-7bdc-4061-b463-a1c9526f8644/0, caller=Thread[Thread-10,5,main], lock=write owner=GlobalTransaction:<10.6.100.35:47413>:9877 (activeReaders=0, activeWriter=Thread[Thread-10,5,main], waitingReaders=0, waitingWriters=0, waitingUpgrader=0)
       at org.jboss.cache.Node.acquire(Node.java:407)
       at org.jboss.cache.Node.acquireAll(Node.java:446)
       at org.jboss.cache.Node.acquireAll(Node.java:453)
       at org.jboss.cache.Node.acquireAll(Node.java:453)
       at org.jboss.cache.TreeCache.acquireLocksForStateTransfer(TreeCache.java:2730)
       at org.jboss.cache.TreeCache._setState(TreeCache.java:2627)
       at org.jboss.cache.TreeCache.access$000(TreeCache.java:86)
       at org.jboss.cache.TreeCache$MessageListenerAdaptor.setState(TreeCache.java:5303)
       at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.passUp(MessageDispatcher.java:626)
       at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:331)
       at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.handleUp(MessageDispatcher.java:734)
       at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.access$300(MessageDispatcher.java:566)
       at org.jgroups.blocks.MessageDispatcher$1.run(MessageDispatcher.java:703)
       at java.lang.Thread.run(Thread.java:595)
      Caused by: org.jboss.cache.lock.TimeoutException: read lock for /d105619f-7bdc-4061-b463-a1c9526f8644/0 could not be acquired by Thread[Thread-10,5,main] after 150000 ms. Locks: Read lock owners: []
      Write lock owner: GlobalTransaction:<10.6.100.35:47413>:9877
      , lock info: write owner=GlobalTransaction:<10.6.100.35:47413>:9877 (activeReaders=0, activeWriter=Thread[Thread-10,5,main], waitingReaders=0, waitingWriters=0, waitingUpgrader=0)
       at org.jboss.cache.lock.IdentityLock.acquireReadLock(IdentityLock.java:257)
       at org.jboss.cache.Node.acquireReadLock(Node.java:417)
       at org.jboss.cache.Node.acquire(Node.java:384)
       ... 13 more
      


      I'd appreciate any suggestions, because I've been dealing with this problem for two weeks and have run out of ideas. Thanks in advance.

        • 1. Re: Exception in startService with PojoCache
          Ben Wang Master

          How do you start up your cache instances? This has something to do with the initial state transfer. Also, make sure you don't have any orphaned processes running.

          • 2. Re: Exception in startService with PojoCache
            Brian Stansberry Master

            You have a transaction on the existing cache instance that's holding a write lock on a node for 150 seconds. That's preventing the acquisition of the read lock that's necessary to prepare the state transfer. So a question is why a tx is holding the lock for that long.
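
            To show the mechanism in isolation, here is a minimal sketch using the JDK's `ReentrantReadWriteLock`, not JBoss Cache's own lock classes. It only assumes the situation is analogous: one thread holds a write lock while another tries to take the read lock with a timeout. The names `LockTimeoutSketch` and `readerGetsLock` are made up for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockTimeoutSketch {

    /**
     * Returns true if the calling thread obtains the read lock within
     * timeoutMs while another thread holds the write lock for holdMs.
     * Hypothetical helper; an analogy for the state-transfer read lock.
     */
    static boolean readerGetsLock(long holdMs, long timeoutMs) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch writerHasLock = new CountDownLatch(1);

        // The "writer" stands in for the transaction that owns the write lock.
        Thread writer = new Thread(() -> {
            lock.writeLock().lock();
            try {
                writerHasLock.countDown();
                Thread.sleep(holdMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.writeLock().unlock();
            }
        });
        writer.start();

        try {
            // Make sure the writer really owns the lock before we try.
            writerHasLock.await();
            boolean acquired = lock.readLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.readLock().unlock();
            }
            writer.join();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        // Writer holds the lock longer than the reader is willing to wait.
        System.out.println(readerGetsLock(500, 100));
        // Writer releases the lock well before the reader's timeout.
        System.out.println(readerGetsLock(100, 2000));
    }
}
```

            In this sketch a longer timeout only helps if the writer eventually releases the lock; if the lock is never released, the reader times out no matter how long it waits.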

            • 3. Re: Exception in startService with PojoCache
              Yudivian Almeida Cruz Newbie

              Hi, thanks for your replies. Let me describe my problem in a little more detail; maybe you can help me. I'm developing an application that needs to replicate some data. To do this I'm using two instances of TreeCache for simple data and PojoCache to replicate an ArrayList. In the first version of the application I was using only the two TreeCache instances and it worked fine. Since I started using PojoCache, it works if I run the application on only one PC (or in only one JVM), but when I run it on more than one PC I get the exceptions I posted above, and I don't know why. I really don't know why a transaction is holding a lock for that long (this must be inside JBCache, because I'm not using transactions). Do you have any suggestions for how I can solve this problem, or how to release the lock on the node?

              This is the code of the class that uses PojoCache; maybe you can spot an error in how I use PojoCache. (I call the method getMessageBusy() very often and from many different threads.)

              public class JBCacheDMS {

                  private static final Logger log = Logger.getLogger(JBCacheDMS.class);

                  protected boolean inited;

                  protected PojoCache queues;

                  protected HashMap<String, List> proxyQueues;

                  public void init(String DMSUrl, String[] DMSnodesUrls) {
                      try {
                          queues = new PojoCache();
                          PropertyConfigurator config = new PropertyConfigurator(); // configure tree cache.
                          config.configure(queues, "config/replSync-service.xml");
                          String platformName = System.getProperty("platformName");
                          queues.setClusterName("DMS-" + platformName);
                          queues.setCacheMode(PojoCache.REPL_SYNC);
                          // queues.createService();
                          queues.startService();
                          // addTreeCacheListener();
                          proxyQueues = new HashMap<String, List>();
                          inited = true;
                      } catch (Exception e) {
                          // TODO Auto-generated catch block
                          e.printStackTrace();
                      }

                  }

                  /**
                   * To add a TreeCacheListener
                   *
                   * @param listener
                   *            The listener to add
                   */
                  public void addTreeCacheListener(TreeCacheListener listener) {
                      queues.addTreeCacheListener(listener);
                  }

                  /**
                   * To remove a TreeCacheListener
                   *
                   * @param listener
                   *            The listener to remove
                   */
                  public void removeTreeCacheListener(TreeCacheListener listener) {
                      queues.removeTreeCacheListener(listener);
                  }

                  /**
                   * Return an instance of DMStorage(the class itself) if the class was
                   * inited, if not returns null
                   *
                   * @return an instance of DMS
                   */
                  public DMS getDMS() {
                      if (inited)
                          return this;
                      return null;
                  }

                  public void stop() {
                      // TODO Auto-generated method stub

                  }

                  public void createQueue(String name) {
                      try {
                          queues.putObject(name, new ArrayList<AbstractMessage>());
                          List<AbstractMessage> proxyList = (List) queues.getObject(name);
                          proxyQueues.put(name, proxyList);
                          // System.out.println(getAgent(name).getName().getName());
                      } catch (CacheException e) {
                          // TODO Auto-generated catch block
                          e.printStackTrace();
                      }

                  }

                  public void deleteQueue(String name) {
                      try {
                          proxyQueues.remove(name);
                          queues.removeObject(name);
                          queues.remove(name);
                      } catch (CacheException e) {
                          // TODO Auto-generated catch block
                          e.printStackTrace();
                      }

                  }

                  public void putMessage(String name, AbstractMessage message) {
                      if (!queues.exists(name)){
                          createQueue(name);
                      }
                      List<AbstractMessage> proxyList = proxyQueues.get(name);
                      proxyList.add(message);
                  }

                  public AbstractMessage getMessageBusy(String name) {
                      if (!queues.exists(name)){
                          createQueue(name);
                      }
                      List<AbstractMessage> proxyList = proxyQueues.get(name);
                      AbstractMessage message = proxyList.remove(0);
                      while(message == null){
                          message = proxyList.remove(0);
                          try {
                              Thread.sleep(10);
                          } catch (InterruptedException e) {
                              // TODO Auto-generated catch block
                              e.printStackTrace();
                          }
                      }
                      return message;
                  }
              }
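
              A side note on getMessageBusy(): with a plain `java.util.ArrayList`, `remove(0)` on an empty list throws `IndexOutOfBoundsException` instead of returning null, so the `while (message == null)` loop would not act as a wait. Whether the PojoCache list proxy behaves the same way is an assumption that has not been checked here. Below is a minimal sketch of a polling loop that tests for emptiness before removing, written against a plain `List`; `PollingQueueSketch` and `pollBusy` are hypothetical names, and the `synchronized` block only guards threads in the same JVM, not other cluster members.

```java
import java.util.ArrayList;
import java.util.List;

public class PollingQueueSketch {

    /**
     * Polls the list every 10 ms until it has an element, then removes and
     * returns the head. Returns null if nothing arrives within maxWaitMs
     * or if the waiting thread is interrupted.
     */
    static <T> T pollBusy(List<T> queue, long maxWaitMs) {
        long deadline = System.currentTimeMillis() + maxWaitMs;
        while (true) {
            // Check and remove under one lock so two local threads
            // cannot both see a single remaining element.
            synchronized (queue) {
                if (!queue.isEmpty()) {
                    return queue.remove(0);
                }
            }
            if (System.currentTimeMillis() >= deadline) {
                return null;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
    }

    public static void main(String[] args) {
        List<String> queue = new ArrayList<>();

        // A plain ArrayList does not return null for a missing element.
        try {
            queue.remove(0);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("remove(0) on an empty list throws " + e.getClass().getSimpleName());
        }

        System.out.println(pollBusy(queue, 50)); // nothing queued yet
        queue.add("hello");
        System.out.println(pollBusy(queue, 50)); // head element is removed and returned
    }
}
```

              Checking `isEmpty()` first also means an idle poller only reads the list, so in this sketch no remove is attempted until there is something to take.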
              

              > You have a transaction on the existing cache instance that's holding a write lock on a node for 150 seconds. That's preventing the acquisition of the read lock that's necessary to prepare the state transfer. So a question is why a tx is holding the lock for that long.

              • 4. Re: Exception in startService with PojoCache
                Brian Stansberry Master

                1) What version of JBoss Cache are you using?

                2) Which server is 10.6.100.35? The one that's starting or the other one? From which server is the log snippet with the TimeoutException you originally posted?

                • 5. Re: Exception in startService with PojoCache
                  Yudivian Almeida Cruz Newbie

                  - I'm using JBossCache-1.4.0.GA.
                  - The server 10.6.100.35 is the one that is starting, and the log snippet with the TimeoutException is from the server that starts second.

                  • 6. Re: Exception in startService with PojoCache
                    Brian Stansberry Master

                    Sorry, I'm confused. What do you mean "starts second"?

                    As I understand it there are two servers. #1 is running (and will therefore provide state to #2 when #2 starts) and #2 is starting.

                    So, 10.6.100.35 is #2, and the log snippet is from #what ?

                    • 7. Re: Exception in startService with PojoCache
                      Ben Wang Master

                      Are you saying you are using both TreeCache and PojoCache at the same time (e.g., within the same cluster group)?

                      • 8. Re: Exception in startService with PojoCache
                        Yudivian Almeida Cruz Newbie

                        Sorry if I was a little confusing.

                        Server #1 (the first to be running) is 10.6.100.35.
                        Server #2 (which starts when #1 is already running) is 10.6.100.37, and this is where I get the exceptions.

                        Also, I use a different cluster name for every TreeCache or PojoCache instance.

                        • 9. Re: Exception in startService with PojoCache
                          Ben Wang Master

                          Can you try out TreeCache only first so we can isolate the problem further?

                          • 10. Re: Exception in startService with PojoCache
                            Yudivian Almeida Cruz Newbie

                            I'm using the TreeCache for a different purpose than the PojoCache. When I use the TreeCache everything works fine, and when I use the PojoCache on only one PC it works fine. The problem appears when I try to use the PojoCache on more than one PC.

                            • 11. Re: Exception in startService with PojoCache
                              Brian Stansberry Master

                              I suspect that the usage pattern of the PojoCache is different from that of the TreeCaches; i.e. the PojoCache on 10.6.100.35 is under heavy load when it's asked to provide state.

                              This is a tricky issue that we're working to resolve; see http://www.jboss.com/index.html?module=bb&op=viewtopic&t=78112 for a *long* discussion.

                              Are you using READ_COMMITTED or a weaker IsolationLevel? Perhaps moving to REPEATABLE_READ with a long state transfer timeout will help.

                              • 12. Re: Exception in startService with PojoCache
                                Yudivian Almeida Cruz Newbie

                                Yes, you are right: the usage of the PojoCache is different from the TreeCache, and there are many threads trying to access data from the PojoCache all the time.

                                I'm already using REPEATABLE_READ as the isolation level and 150000 as the InitialStateRetrievalTimeout in my configuration, and it still doesn't work.

                                • 13. Re: Exception in startService with PojoCache
                                  Brian Stansberry Master

                                  OK, unfortunately it sounds like you've hit the problem discussed in the above referenced thread, which we're still working to resolve. :(