5 Replies Latest reply on Oct 13, 2005 5:31 PM by brian.stansberry

    New - TreeCache Replication On Server Startup

    jedlloyd

      Hello,

      I'm having a problem with replication on server startup. I have written a wrapper class, using JBoss TreeCacheAOP, for a CacheManager class which is loaded via the struts DispatcherServlet on application startup. I now have the application deployed in a clustered Weblogic environment with two nodes running on the same box. The TreeCache is configured to run in REPL_ASYNC mode, with FetchStateOnStartup set to false and the InitialStateRetrievalTimeout set to 1 hour.

      What I have noticed is that after the CacheManager loads all the cache data to in memory Hashmaps and then begins adding these maps to the TreeCacheAOP through the wrapper class, it looks like the TreeCache begins making attempts to replicate the changes to each nodes cache. I have looked into the JGroups configuration specs and am not sure if I have missed something, however cannot seem to figure out why the replication attempts are started so soon after application startup.

      Here is a copy of the JBossCacheConfig.xml file followed by the application server startup logs:

      <?xml version="1.0" encoding="UTF-8"?>
      
      <!-- ===================================================================== -->
      <!-- -->
      <!-- TreeCacheAop Service Configuration -->
      <!-- -->
      <!-- ===================================================================== -->
      
      <server>
      
       <classpath codebase="./lib" archives="jboss-cache.jar, jgroups.jar"/>
      
      
       <!-- ==================================================================== -->
       <!-- Defines TreeCache configuration -->
       <!-- ==================================================================== -->
      
       <mbean code="org.jboss.cache.aop.TreeCacheAop" name="jboss.cache:service=TreeCacheAop">
      
       <depends>jboss:service=Naming</depends>
       <depends>jboss:service=TransactionManager</depends>
      
       <!--
       Configure the TransactionManager
       -->
       <!-- <attribute name="TransactionManagerLookupClass">org.jboss.cache.JBossTransactionManagerLookup</attribute> -->
      
      
       <!--
       Node locking level : SERIALIZABLE
       REPEATABLE_READ (default)
       READ_COMMITTED
       READ_UNCOMMITTED
       NONE
       -->
       <attribute name="IsolationLevel">REPEATABLE_READ</attribute>
      
       <!--
       Valid modes are LOCAL
       REPL_ASYNC
       REPL_SYNC
       -->
       <attribute name="CacheMode">REPL_ASYNC</attribute>
      
      
       <!-- Just used for async repl: use a repliation queue -->
       <attribute name="UseReplQueue">false</attribute>
      
       <!-- Replication interval for replication queue (in ms) -->
       <attribute name="ReplQueueInterval">0</attribute>
      
       <!-- Max number of elements which trigger replication -->
       <attribute name="ReplQueueMaxElements">0</attribute>
      
       <!-- Name of cluster. Needs to be the same for all clusters, in order
       to find each other
       -->
       <attribute name="ClusterName">Remedy-TreeCache-Cluster</attribute>
      
       <!-- JGroups protocol stack properties. Can also be a URL,
       e.g. file:/home/bela/default.xml
       <attribute name="ClusterProperties"></attribute>
       -->
      
       <attribute name="ClusterConfig">
       <config>
       <!-- UDP: if you have a multihomed machine,
       set the bind_addr attribute to the appropriate NIC IP address, e.g bind_addr="192.168.0.2"
       -->
       <!-- UDP: On Windows machines, because of the media sense feature
       being broken with multicast (even after disabling media sense)
       set the loopback attribute to true -->
       <UDP mcast_addr="228.1.2.3" mcast_port="48866" ip_ttl="32" ip_mcast="true" loopback="false"
       mcast_send_buf_size="200000" mcast_recv_buf_size="1000000"
       ucast_send_buf_size="200000" ucast_recv_buf_size="1000000"/>
       <PING timeout="2000" num_initial_members="2"/>
       <FD_SOCK/>
       <VERIFY_SUSPECT timeout="1500"/>
       <pbcast.NAKACK gc_lag="50" retransmit_timeout="600,1200,2400,4800" max_xmit_size="8192"/>
       <UNICAST timeout="600,1200,2400" window_size="100" min_threshold="10"/>
       <pbcast.STABLE desired_avg_gossip="20000" max_bytes="500000"/>
       <FRAG frag_size="8192"/>
       <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="false" print_local_addr="true"/>
       </config>
       </attribute>
      
      
       <!--
       Whether or not to fetch state on joining a cluster
       -->
       <attribute name="FetchStateOnStartup">false</attribute>
      
       <!--
       The max amount of time (in milliseconds) we wait until the
       initial state (ie. the contents of the cache) are retrieved from
       existing members in a clustered environment
       -->
       <attribute name="InitialStateRetrievalTimeout">3600000</attribute>
      
       <!--
       Number of milliseconds to wait until all responses for a
       synchronous call have been received.
       -->
       <attribute name="SyncReplTimeout">10000</attribute>
      
       <!-- Max number of milliseconds to wait for a lock acquisition -->
       <attribute name="LockAcquisitionTimeout">15000</attribute>
      
      
       <!-- Name of the eviction policy class. Not supported now. -->
       <attribute name="EvictionPolicyClass"></attribute>
       </mbean>
      
       </server>


      Application Node 1 Log:

      INFO [2005-10-11 15:31:11,985] [main] remedy - Loading Alerts cache data...
      INFO [2005-10-11 15:31:12,009] [main] remedy - 8 total Alert entries have been loaded from database.
      INFO [2005-10-11 15:31:12,010] [main] remedy - Loading ActionItem cache data...
      INFO [2005-10-11 15:31:12,065] [main] remedy - 21 total ActionItem entries have been loaded from database.
      INFO [2005-10-11 15:31:12,066] [main] remedy - Loading Action Logs cache data...
      INFO [2005-10-11 15:31:12,073] [main] remedy - 0 total ActionLog entries have been loaded from database.
      INFO [2005-10-11 15:31:12,074] [main] remedy - Loading SecPrincial data into cache...
      INFO [2005-10-11 15:31:12,231] [main] remedy - 2142 total SecPrincipal entries have been loaded from database.
      DEBUG [2005-10-11 15:31:12,256] [main] org.jboss.cache.Node - createChild: fqn=/Client, child_name=Client
      DEBUG [2005-10-11 15:31:12,257] [main] org.jboss.cache.interceptors.LockInterceptor - created child Client
      DEBUG [2005-10-11 15:31:12,257] [main] org.jboss.cache.Node - acquiring WL: fqn=/Client, caller=Thread[main,5,main], lock=
      DEBUG [2005-10-11 15:31:12,258] [main] org.jboss.cache.Node - acquired WL: fqn=/Client, caller=Thread[main,5,main], lock=write owner=Thread[main,5,main]
      DEBUG [2005-10-11 15:31:12,258] [main] org.jboss.cache.TreeCache - _put(null, "/Client", dummy, dummy)
      DEBUG [2005-10-11 15:31:12,259] [main] org.jboss.cache.interceptors.UnlockInterceptor - releasing lock for /Client: write owner=Thread[main,5,main]
      DEBUG [2005-10-11 15:31:12,260] [main] org.jboss.cache.interceptors.ReplicationInterceptor - invoking method _put([Ljava.lang.Object;@3e25d3), members=[10.3.75.38:34728, 10.3.75.38:34731], mode=REPL_ASYNC, exclude_self=true, timeout=10000
      DEBUG [2005-10-11 15:31:12,260] [main] org.jboss.cache.TreeCache - callRemoteMethods(): valid members are [10.3.75.38:34728]
      DEBUG [2005-10-11 15:31:12,261] [main] org.jgroups.blocks.RpcDispatcher - dests=[10.3.75.38:34728], method_call=_replicate(_put(null, /Client, dummy, dummy, true)), mode=6, timeout=10000
      DEBUG [2005-10-11 15:31:12,269] [main] org.jgroups.blocks.RpcDispatcher - real_dests=[10.3.75.38:34728]
      DEBUG [2005-10-11 15:31:12,310] [main] org.jgroups.blocks.GroupRequest - sending request (id=1129059072309)
      DEBUG [2005-10-11 15:31:12,311] [DownHandler (NAKACK)] org.jgroups.protocols.pbcast.NAKACK - sending msg #23
      DEBUG [2005-10-11 15:31:12,312] [main] org.jgroups.blocks.GroupRequest - received all responses: [GroupRequest:
      req_id=1129059072309
      caller=10.3.75.38:34731
      10.3.75.38:34728: sender=10.3.75.38:34728, retval=null, received=false, suspected=false

      request_msg: [dst: , src: 10.3.75.38:34731 (3 headers), size = 523 bytes]
      rsp_mode: GET_NONE
      done: false
      timeout: 10000
      expected_mbrs: 0
      ]
      DEBUG [2005-10-11 15:31:12,312] [DownHandler (UDP)] org.jgroups.protocols.UDP - sending msg to null (src=10.3.75.38:34731), headers are {NAKACK=[MSG, seqno=23], MessageDispatcher=[Header: name=MessageDispatcher, type=REQ, id=1129059072309, rsp_expected=false], dest_mbrs=[10.3.75.38:34728], UDP=[channel_name=Remedy-TreeCache-Cluster]}
      DEBUG [2005-10-11 15:31:12,313] [UDP mcast receiver] org.jgroups.protocols.UDP - received (mcast)674 bytes from 10.3.75.38:34732
      DEBUG [2005-10-11 15:31:12,314] [UDP mcast receiver] org.jgroups.protocols.UDP - message is [dst: 228.1.2.3:48866, src: 10.3.75.38:34731 (3 headers), size = 523 bytes], headers are {NAKACK=[MSG, seqno=23], MessageDispatcher=[Header: name=MessageDispatcher, type=REQ, id=1129059072309, rsp_expected=false], dest_mbrs=[10.3.75.38:34728], UDP=[channel_name=Remedy-TreeCache-Cluster]}
      DEBUG [2005-10-11 15:31:12,314] [UpHandler (NAKACK)] org.jgroups.protocols.pbcast.NAKACK - 10.3.75.38:34731] received 10.3.75.38:34731#23
      DEBUG [2005-10-11 15:31:12,316] [UpHandler (GMS)] org.jgroups.blocks.RequestCorrelator - discarded request from 10.3.75.38:34731 as we are not part of destination list (local_addr=10.3.75.38:34731, hdr=[Header: name=MessageDispatcher, type=REQ, id=1129059072309, rsp_expected=false], dest_mbrs=[10.3.75.38:34728])
      DEBUG [2005-10-11 15:31:12,318] [main] org.jgroups.blocks.RpcDispatcher - responses: [sender=10.3.75.38:34728, retval=null, received=false, suspected=false]

      DEBUG [2005-10-11 15:31:12,318] [main] org.jboss.cache.TreeCache - (10.3.75.38:34731): responses for method _replicate:
      [sender=10.3.75.38:34728, retval=null, received=false, suspected=false]

      DEBUG [2005-10-11 15:31:12,321] [main] org.jboss.cache.Node - acquiring RL: fqn=/Client, caller=Thread[main,5,main], lock=
      DEBUG [2005-10-11 15:31:12,321] [main] org.jboss.cache.Node - acquired RL: fqn=/Client, caller=Thread[main,5,main], lock=read owners=[Thread[main,5,main]]
      DEBUG [2005-10-11 15:31:12,322] [main] org.jboss.cache.TreeCache - _get("/Client", __AOPINSTANCE_POJO__, "false")
      DEBUG [2005-10-11 15:31:12,322] [main] org.jboss.cache.interceptors.UnlockInterceptor - releasing lock for /Client: read owners=[Thread[main,5,main]]
      DEBUG [2005-10-11 15:31:12,322] [main] org.jboss.cache.Node - acquiring RL: fqn=/Client, caller=Thread[main,5,main], lock=
      DEBUG [2005-10-11 15:31:12,322] [main] org.jboss.cache.Node - acquired RL: fqn=/Client, caller=Thread[main,5,main], lock=read owners=[Thread[main,5,main]]
      DEBUG [2005-10-11 15:31:12,323] [main] org.jboss.cache.TreeCache - _get("/Client", __jboss:internal:class__, "true")
      DEBUG [2005-10-11 15:31:12,323] [main] org.jboss.cache.interceptors.UnlockInterceptor - releasing lock for /Client: read owners=[Thread[main,5,main]]
      DEBUG [2005-10-11 15:31:12,323] [main] org.jboss.cache.aop.TreeCacheAopDelegate - putObject(): fqn: /Client
      DEBUG [2005-10-11 15:31:12,323] [main] org.jboss.cache.aop.TreeCacheAopDelegate - putObject(): aspectized obj is a Map type of size: 94
      DEBUG [2005-10-11 15:31:12,323] [main] org.jboss.cache.Node - acquiring RL: fqn=/Client, caller=Thread[main,5,main], lock=
      DEBUG [2005-10-11 15:31:12,323] [main] org.jboss.cache.Node - acquired RL: fqn=/Client, caller=Thread[main,5,main], lock=read owners=[Thread[main,5,main]]
      DEBUG [2005-10-11 15:31:12,324] [main] org.jboss.cache.TreeCache - _get("/Client", __jboss:internal:class__, "true")
      DEBUG [2005-10-11 15:31:12,324] [main] org.jboss.cache.interceptors.UnlockInterceptor - releasing lock for /Client: read owners=[Thread[main,5,main]]
      DEBUG [2005-10-11 15:31:12,324] [main] org.jboss.cache.aop.TreeCacheAopDelegate - removeObject(): clasz is null. fqn: /Client
      DEBUG [2005-10-11 15:31:12,324] [main] org.jboss.cache.Node - acquiring WL: fqn=/Client, caller=Thread[main,5,main], lock=
      DEBUG [2005-10-11 15:31:12,324] [main] org.jboss.cache.Node - acquired WL: fqn=/Client, caller=Thread[main,5,main], lock=write owner=Thread[main,5,main]
      DEBUG [2005-10-11 15:31:12,324] [main] org.jboss.cache.TreeCache - _put(null, "/Client", __jboss:internal:class__, class java.util.HashMap)
      DEBUG [2005-10-11 15:31:12,325] [main] org.jboss.cache.interceptors.UnlockInterceptor - releasing lock for /Client: write owner=Thread[main,5,main]
      DEBUG [2005-10-11 15:31:12,325] [main] org.jboss.cache.interceptors.ReplicationInterceptor - invoking method _put([Ljava.lang.Object;@1e3f34c), members=[10.3.75.38:34728, 10.3.75.38:34731], mode=REPL_ASYNC, exclude_self=true, timeout=10000
      DEBUG [2005-10-11 15:31:12,325] [main] org.jboss.cache.TreeCache - callRemoteMethods(): valid members are [10.3.75.38:34728]
      DEBUG [2005-10-11 15:31:12,325] [main] org.jgroups.blocks.RpcDispatcher - dests=[10.3.75.38:34728], method_call=_replicate(_put(null, /Client, __jboss:internal:class__, class java.util.HashMap, true)), mode=6, timeout=10000
      DEBUG [2005-10-11 15:31:12,326] [main] org.jgroups.blocks.RpcDispatcher - real_dests=[10.3.75.38:34728]
      DEBUG [2005-10-11 15:31:12,326] [main] org.jgroups.blocks.GroupRequest - sending request (id=1129059072326)
      DEBUG [2005-10-11 15:31:12,326] [main] org.jgroups.blocks.GroupRequest - received all responses: [GroupRequest:
      req_id=1129059072326
      caller=10.3.75.38:34731
      10.3.75.38:34728: sender=10.3.75.38:34728, retval=null, received=false, suspected=false

      request_msg: [dst: , src: (1 headers), size = 596 bytes]
      rsp_mode: GET_NONE
      done: false
      timeout: 10000
      expected_mbrs: 0
      ]

      Application Node 2 Log:

      INFO [2005-10-11 15:31:12,305] [main] remedy - Loading Remedy Entitlement Group cache data...
      DEBUG [2005-10-11 15:31:12,313] [UDP mcast receiver] org.jgroups.protocols.UDP - received (mcast)674 bytes from 10.3.75.38:34732
      DEBUG [2005-10-11 15:31:12,314] [UDP mcast receiver] org.jgroups.protocols.UDP - message is [dst: 228.1.2.3:48866, src: 10.3.75.38:34731 (3 headers), size = 523 bytes], headers are {NAKACK=[MSG, seqno=23], MessageDispatcher=[Header: name=MessageDispatcher, type=REQ, id=1129059072309, rsp_expected=false], dest_mbrs=[10.3.75.38:34728], UDP=[channel_name=Remedy-TreeCache-Cluster]}
      DEBUG [2005-10-11 15:31:12,315] [UpHandler (NAKACK)] org.jgroups.protocols.pbcast.NAKACK - 10.3.75.38:34728] received 10.3.75.38:34731#23
      DEBUG [2005-10-11 15:31:12,316] [UpHandler (GMS)] org.jgroups.blocks.RequestCorrelator - calling (org.jgroups.blocks.RpcDispatcher) with request 1129059072309
      DEBUG [2005-10-11 15:31:12,328] [UDP mcast receiver] org.jgroups.protocols.UDP - received (mcast)747 bytes from 10.3.75.38:34732
      DEBUG [2005-10-11 15:31:12,329] [UDP mcast receiver] org.jgroups.protocols.UDP - message is [dst: 228.1.2.3:48866, src: 10.3.75.38:34731 (3 headers), size = 596 bytes], headers are {NAKACK=[MSG, seqno=24], MessageDispatcher=[Header: name=MessageDispatcher, type=REQ, id=1129059072326, rsp_expected=false], dest_mbrs=[10.3.75.38:34728], UDP=[channel_name=Remedy-TreeCache-Cluster]}
      DEBUG [2005-10-11 15:31:12,330] [UpHandler (NAKACK)] org.jgroups.protocols.pbcast.NAKACK - 10.3.75.38:34728] received 10.3.75.38:34731#24
      DEBUG [2005-10-11 15:31:12,349] [UpHandler (GMS)] org.jgroups.blocks.RpcDispatcher - [sender=10.3.75.38:34731], method_call: _replicate(_put(null, /Client, dummy, dummy, true))
      DEBUG [2005-10-11 15:31:12,349] [UpHandler (GMS)] org.jboss.cache.interceptors.ReplicationInterceptor - replicate(): received _put(null, /Client, dummy, dummy, true)
      INFO [2005-10-11 15:31:12,349] [main] remedy - 1 total Remedy Entitlement Group entries loaded from database.
      INFO [2005-10-11 15:31:12,356] [main] remedy - Loading Remedy Entitlement GrpToFunct cache data...
      DEBUG [2005-10-11 15:31:12,359] [UpHandler (GMS)] org.jboss.cache.Node - createChild: fqn=/Client, child_name=Client
      DEBUG [2005-10-11 15:31:12,359] [UpHandler (GMS)] org.jboss.cache.interceptors.LockInterceptor - created child Client
      DEBUG [2005-10-11 15:31:12,360] [UpHandler (GMS)] org.jboss.cache.Node - acquiring WL: fqn=/Client, caller=Thread[UpHandler (GMS),5,main], lock=
      DEBUG [2005-10-11 15:31:12,360] [UpHandler (GMS)] org.jboss.cache.Node - acquired WL: fqn=/Client, caller=Thread[UpHandler (GMS),5,main], lock=write owner=Thread[UpHandler (GMS),5,main]
      DEBUG [2005-10-11 15:31:12,361] [UpHandler (GMS)] org.jboss.cache.TreeCache - _put(null, "/Client", dummy, dummy)
      DEBUG [2005-10-11 15:31:12,361] [UpHandler (GMS)] org.jboss.cache.interceptors.UnlockInterceptor - releasing lock for /Client: write owner=Thread[UpHandler (GMS),5,main]
      DEBUG [2005-10-11 15:31:12,361] [UpHandler (GMS)] org.jgroups.blocks.RequestCorrelator - calling (org.jgroups.blocks.RpcDispatcher) with request 1129059072326
      DEBUG [2005-10-11 15:31:12,363] [UpHandler (GMS)] org.jgroups.blocks.RpcDispatcher - [sender=10.3.75.38:34731], method_call: _replicate(_put(null, /Client, __jboss:internal:class__, class java.util.HashMap, true))
      DEBUG [2005-10-11 15:31:12,364] [UpHandler (GMS)] org.jboss.cache.interceptors.ReplicationInterceptor - replicate(): received _put(null, /Client, __jboss:internal:class__, class java.util.HashMap, true)
      DEBUG [2005-10-11 15:31:12,364] [UpHandler (GMS)] org.jboss.cache.Node - acquiring WL: fqn=/Client, caller=Thread[UpHandler (GMS),5,main], lock=
      DEBUG [2005-10-11 15:31:12,364] [UpHandler (GMS)] org.jboss.cache.Node - acquired WL: fqn=/Client, caller=Thread[UpHandler (GMS),5,main], lock=write owner=Thread[UpHandler (GMS),5,main]
      DEBUG [2005-10-11 15:31:12,364] [UpHandler (GMS)] org.jboss.cache.TreeCache - _put(null, "/Client", __jboss:internal:class__, class java.util.HashMap)
      DEBUG [2005-10-11 15:31:12,364] [UpHandler (GMS)] org.jboss.cache.interceptors.UnlockInterceptor - releasing lock for /Client: write owner=Thread[UpHandler (GMS),5,main]
      INFO [2005-10-11 15:31:12,384] [main] remedy - 17 total Remedy Entitlement GrpToFunct entries loaded from database.
      INFO [2005-10-11 15:31:12,384] [main] remedy - Loading RefLocation to cache data...
      INFO [2005-10-11 15:31:12,388] [main] remedy - 4 total Location entries loaded from database.
      INFO [2005-10-11 15:31:12,389] [main] remedy - Loading Alerts cache data...
      INFO [2005-10-11 15:31:12,424] [main] remedy - 8 total Alert entries have been loaded from database.
      INFO [2005-10-11 15:31:12,425] [main] remedy - Loading ActionItem cache data...
      INFO [2005-10-11 15:31:12,477] [main] remedy - 21 total ActionItem entries have been loaded from database.
      INFO [2005-10-11 15:31:12,478] [main] remedy - Loading Action Logs cache data...
      INFO [2005-10-11 15:31:12,483] [main] remedy - 0 total ActionLog entries have been loaded from database.
      INFO [2005-10-11 15:31:12,483] [main] remedy - Loading SecPrincial data into cache...
      INFO [2005-10-11 15:31:12,632] [main] remedy - 2142 total SecPrincipal entries have been loaded from database.
      DEBUG [2005-10-11 15:31:12,637] [main] org.jboss.cache.Node - acquiring WL: fqn=/Client, caller=Thread[main,5,main], lock=
      DEBUG [2005-10-11 15:31:12,637] [main] org.jboss.cache.Node - acquired WL: fqn=/Client, caller=Thread[main,5,main], lock=write owner=Thread[main,5,main]
      DEBUG [2005-10-11 15:31:12,637] [main] org.jboss.cache.TreeCache - _put(null, "/Client", dummy, dummy)
      DEBUG [2005-10-11 15:31:12,637] [main] org.jboss.cache.interceptors.UnlockInterceptor - releasing lock for /Client: write owner=Thread[main,5,main]
      DEBUG [2005-10-11 15:31:12,638] [main] org.jboss.cache.interceptors.ReplicationInterceptor - invoking method _put([Ljava.lang.Object;@1efe574), members=[10.3.75.38:34728, 10.3.75.38:34731], mode=REPL_ASYNC, exclude_self=true, timeout=10000
      DEBUG [2005-10-11 15:31:12,638] [main] org.jboss.cache.TreeCache - callRemoteMethods(): valid members are [10.3.75.38:34731]
      DEBUG [2005-10-11 15:31:12,639] [main] org.jgroups.blocks.RpcDispatcher - dests=[10.3.75.38:34731], method_call=_replicate(_put(null, /Client, dummy, dummy, true)), mode=6, timeout=10000
      DEBUG [2005-10-11 15:31:12,640] [main] org.jgroups.blocks.RpcDispatcher - real_dests=[10.3.75.38:34731]
      DEBUG [2005-10-11 15:31:12,648] [main] org.jgroups.blocks.GroupRequest - sending request (id=1129059072648)

        • 1. Re: New - TreeCache Replication On Server Startup
          brian.stansberry

          The FetchStateOnStartup and InitialStateRetrievalTimeout properties control what happens when a new cache instance joins an already existing cluster, i.e. how it acquires any existing cache state from another node in the cluster. They have no effect on how quickly objects put into the cache are replicated to other nodes.

          The only thing that effects when replication happens it to use a replication queue:

           <!-- Just used for async repl: use a repliation queue -->
           <attribute name="UseReplQueue">true</attribute>
          
           <!-- Replication interval for replication queue (in ms) -->
           <attribute name="ReplQueueInterval">3600000</attribute>
          


          which would cause a thread to start that would replicate traffic once an hour. But that once-per-hour replication would continue as long as the cache is running; it's not just an initial delay at startup.

          • 2. Re: New - TreeCache Replication On Server Startup
            jedlloyd

            Thanks for the quick reply.

            Is there a way to control the replication programmatically? Can you set the TreeCache to non-replication mode on the initial configuration and then after loading all cache data to the TreeCache set it to replication mode?

            Both or all nodes will be loading the same cache data set, which are very large and I?m trying to minimize the cache load time on server startup. The initial load time for the CacheManager, prior to loading to the TreeCache, is around 6 ? 9 minutes. The added time to put the cache data onto the TreeCacheAOP cache is in the time frame of 30 minutes plus.

            Using the replication queue configurations would help solve the issue on startup, however once the cache is loaded the application will require faster replication then what would be set for the application startup.

            Thanks,

            Jed

            • 3. Re: New - TreeCache Replication On Server Startup
              brian.stansberry

               

              "jedlloyd" wrote:
              Is there a way to control the replication programmatically? Can you set the TreeCache to non-replication mode on the initial configuration and then after loading all cache data to the TreeCache set it to replication mode?


              No, this can't be done. When createService() is called, the cache sets up an internal interceptor stack it uses to handle requests. Once set up this can't be changed, and if you try to put objects in the cache before a call to createService(), the put will fail because the interceptor stack isn't there.

              "jedlloyd" wrote:
              Both or all nodes will be loading the same cache data set


              Where does the cache data come from? Is there some possibility to use a shared CacheLoader configuration? I'm just throwing out a wild thought here, in case it sparks something in your thinking.

              See http://docs.jboss.org/jbcache/current/TreeCache/html/index.html#d0e334 for more on CacheLoader.

              • 4. Re: New - TreeCache Replication On Server Startup
                jedlloyd

                Thanks again for the reply Brian.

                The cache data is loaded via a CacheManager class, which queries for the data in several databases. This class has been added to and updated for a long period of time and is why intially I have been looking into just using a wrapper class and adding the in memory data to a TreeCache. There is a lot of code and application dependency built into this CacheManager.

                I will look into the shared CacheLoader solution that you mentioned.

                Also, I have been looking into starting the TreeCache in an inactive state and with the idea of then registering it after it is loaded.

                TreeCacheAop() cache = new TreeCacheAop();
                PropertyConfigurator config = new PropertyConfigurator();
                config.configure(cache, JBOSS_TREE_CACHE_CONFIG_FILE);
                
                cache.setInactiveOnStartup(true);
                
                cache.createService();
                cache.startService();
                
                // load cache data to tree cache
                
                cache.activateRegion("/");
                



                From what I've seen it still looks like the replication methods are being called by each node, however it looks like they are being ignored. What are you thoughts on this approach as I'm not exactly sure if this will work?

                Again thanks for you assistance.

                Jed

                • 5. Re: New - TreeCache Replication On Server Startup
                  brian.stansberry

                  Unfortunately, the activate/inactivateRegion API is designed with the intention that the inactive region is not being used locally either, prior to activation. When you activate a region, its going to request the state for that region from the other node, throw away anything it already has (the design intent is it shouldn't have anything), and put the state received from the other node in the local tree.

                  I just wrote the docs for this a couple days ago, and your idea makes me realize I wasn't specific about that point (no help to you, but thanks!).

                  I'm wondering if it might be possible for you to subclass TreeCacheAop to get what you want. You would add this method

                   public void activateRegion(String subtreeFqn, boolean fetchState)
                   throws RegionNotEmptyException, RegionNameConflictException, CacheException
                   {
                   if (fetchState) {
                   super.activateRegion(subtreeFqn);
                   }
                   else if (useMarshalling_) {
                   marshaller_.activate(subtreeFqn);
                   }
                   else {
                   throw new IllegalStateException("TreeCache.activateRegion().
                  useMarshalling flag is not set!");
                   }
                   }
                  


                  This bypasses the state fetch and thus the discard of the existing state. Please consider the above another wild suggestion, not something I've thought through carefully enough to call a recommendation :)

                  A couple problems:

                  1) With the above code, once you activate you may get a commit message from the other server for a tx for which you didn't receive a prepare message. This will throw an exception. The exception won't cause the tx to fail on the other end, but you'll get errors in your logs on the end that threw the exception.
                  2) How are you going to keep these caches properly in sync? If they're both loading state from some initial shared source, but then one node goes live and begins changing the state, while the other is still loading, how will they ever synchronize? This is a general problem that exists no matter how you solve the problem of not replicating doing initial load.