7 Replies Latest reply on Mar 3, 2016 1:23 PM by haiaw

Jboss/Infinispan clustered Treecache - blocks server to start

haiaw Mar 1, 2016 2:05 PM

Hi,

I see this Cache example: https://github.com/infinispan/infinispan-quickstart/tree/master/clustered-cache

And I have a very similar one based on JBossCache 1.4.

I want to migrate to Infinispan, but first I want to know whether the concept is the same.

In JBossCache 1.4 I have 4 nodes. While they connect to the cache cluster through JGroups, using the so-called JGroups "Promises" (which acknowledge the readiness of the different protocols), they suspend the whole deployment process on the WebLogic Server. Since I have a problem forming the cluster between node A and node B on a different machine, one of the machines keeps deploying endlessly...

I would like a mechanism that lets the deployment continue and joins the cache cluster in the background...

Is it possible in Infinispan?

Best regards

1. Re: Jboss/Infinispan clustered Treecache - blocks server to start

nadirx Mar 2, 2016 5:06 AM (in response to haiaw)

Not sure I understand what you mean.
Infinispan uses jgroups to handle all things related to cluster communications. The default configurations don't wait for the entire cluster to form, so you start one node at a time. There is an initial discovery process which blocks for a short while (GMS join timeout) but after that it does essentially "start in the background".
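For setups where the cache start itself does block (as described for JBossCache above), the "join in the background" idea can be sketched at application level. This is only a minimal Java sketch under assumptions, not Infinispan or JGroups API: `BackgroundCacheStarter` and the `blockingStart` supplier are hypothetical names, and the supplier stands in for whatever call creates and starts the cache.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/** Hypothetical helper: runs a blocking cache start on a background thread. */
final class BackgroundCacheStarter<C> {
    private final CompletableFuture<C> started;

    /** blockingStart is an assumed stand-in for "create the cache and join the cluster". */
    BackgroundCacheStarter(Supplier<C> blockingStart) {
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "cache-join");
            thread.setDaemon(true); // a stuck join must not keep the JVM alive
            return thread;
        });
        this.started = CompletableFuture.supplyAsync(blockingStart, executor);
        executor.shutdown(); // the already submitted join still runs to completion
    }

    /** Returns the cache once the join has succeeded, otherwise an empty Optional. */
    Optional<C> getIfReady() {
        if (started.isDone() && !started.isCompletedExceptionally()) {
            return Optional.ofNullable(started.join());
        }
        return Optional.empty();
    }

    /** Exposes the pending join so callers can attach callbacks or wait explicitly. */
    CompletableFuture<C> future() {
        return started;
    }
}
```

The deployment code would construct the starter and return immediately; request handlers call `getIfReady()` and treat an empty result as "no cache yet", going to the database instead.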
2. Re: Jboss/Infinispan clustered Treecache - blocks server to start

haiaw Mar 2, 2016 12:52 PM (in response to nadirx)

Are you sure it does that?

I am using JBossCache 1.4.1, which also uses JGroups (2.4.1), and GMS keeps retrying and timing out endlessly while the deployment stays blocked the whole time. I am debugging the JGroups sources and cannot find the reason. GMS is one of the last protocols on the stack and, what's strange, it succeeds 2 times in 10. I wonder whether Infinispan could do better; this information is very important to me.
3. Re: Jboss/Infinispan clustered Treecache - blocks server to start

nadirx Mar 2, 2016 1:50 PM (in response to haiaw)

Pretty sure
Also we are using JGroups 3.6. Things have changed. But belaban can probably provide more insight
4. Re: Jboss/Infinispan clustered Treecache - blocks server to start

belaban Mar 2, 2016 3:15 PM (in response to nadirx)

I have no idea what Colin is talking about, Colin, can you rephrase?
5. Re: Jboss/Infinispan clustered Treecache - blocks server to start

haiaw Mar 2, 2016 3:38 PM (in response to belaban)

1) I have one WAR which is deployed on two virtual machines (two separate IP addresses). Each virtual machine has one WebLogic installation. On each WebLogic I have two servers, front and backend. So my WAR file is deployed 4 times: front and back on virtual machine 1, and front and back on virtual machine 2. I just want each WAR to deploy seamlessly, independently of cache problems. Each node should attach to the cluster in the background so as not to block the WAR deployment. Front and back on virtual machine 1 start OK, but the deployment of front or back on virtual machine 2 never ends because the GMS protocol keeps retrying the connection event endlessly. It succeeds about 1 time in 6, and I still don't know why.

2) Moreover, I am wondering whether we have a good cache architecture. I have 4 deployments of the same WAR file on two machines, so there are two IP addresses. Each WAR has a cluster config file based on TreeCache. The cache can be replicated sync/async or invalidated sync/async, depending on the settings. But if something very important is cached and must be replicated or invalidated ASAP, for example a snapshot of user permissions, then problems may occur whatever happens. If a timeout occurs during replication/invalidation, that is bad: other nodes have the wrong user permissions in their cache. If, on the other hand, it works but too slowly, the GUI and the end user wait for the server to respond whenever the cache is replicated/invalidated in sync mode. Worse, if a timeout occurs in a JBoss Cache invalidation/replication transaction, it rolls back or blocks the whole use case started by the user in the GUI. It is not acceptable that cache problems roll back or block user actions. That's why I am even thinking of changing the cache architecture in our project.
Maybe the best solution would be to have a single cache source, not clustered, which every node would query and update. We have a database to which every node is connected, and that is a good solution: just think what would happen if each node had its own database that required some synchronization process. Nobody does such things.
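To make the requirement "cache problems must not block or roll back user actions" concrete, here is a rough Java sketch of a best-effort read: the cache lookup gets a short time budget and any failure falls through to the database. All names (`BestEffortCache`, `cacheRead`, `databaseRead`, `timeoutMillis`) are hypothetical and not part of JBoss Cache or Infinispan; the asynchronous cache lookup is an assumption about how the cache would be wrapped.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

/** Hypothetical wrapper: the cache is an optimization, the database stays authoritative. */
final class BestEffortCache<K, V> {
    private final Function<K, CompletableFuture<V>> cacheRead; // assumed async cache lookup
    private final Function<K, V> databaseRead;                 // authoritative source
    private final long timeoutMillis;                          // time budget for the cache

    BestEffortCache(Function<K, CompletableFuture<V>> cacheRead,
                    Function<K, V> databaseRead,
                    long timeoutMillis) {
        this.cacheRead = cacheRead;
        this.databaseRead = databaseRead;
        this.timeoutMillis = timeoutMillis;
    }

    /** Returns the cached value if it arrives in time, otherwise reads from the database. */
    V get(K key) {
        try {
            V cached = cacheRead.apply(key).get(timeoutMillis, TimeUnit.MILLISECONDS);
            if (cached != null) {
                return cached;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        } catch (ExecutionException | TimeoutException | RuntimeException e) {
            // A slow or failing cache is ignored on purpose; the user action continues.
        }
        return databaseRead.apply(key);
    }
}
```

With this shape, a replication timeout costs at most `timeoutMillis` per lookup instead of failing the whole use case. It does not solve stale permissions on other nodes; that still depends on the invalidation settings.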

What does Infinispan offer in terms of architecture for this scenario?
6. Re: Jboss/Infinispan clustered Treecache - blocks server to start

belaban Mar 3, 2016 12:47 AM (in response to haiaw)

I suggest you try out the latest stable Infinispan/JGroups combo, to see if your GMS connection problems disappear. Alternatively, you could run a JGroups standalone demo on all 4 servers (e.g. Chat) with your config, to look at networking issues separately.
7. Re: Jboss/Infinispan clustered Treecache - blocks server to start

haiaw Mar 3, 2016 1:23 PM (in response to belaban)

Thanks for your reply. I am doing two things now: a new git branch for migrating to Infinispan, and the old one for repairing the old cache.
As for the old cache, it seems I succeeded in joining all servers to the cluster. I used the probe.sh script from the JBossCache sources; I didn't know something like this even existed. This script showed me that I had 10 strange clusters with old settings on the servers. To restart the server I was using the kill -9 command, and these clusters stayed in the JVM with the same multicast port and different binding ports, or maybe the same ones in some cases. TreeCache.stop wasn't invoked because of kill -9.
It is strange that these zombie clusters stay in the JVM and, what is worse, I don't know how to kill them.
After changing the multicast port I managed to join all servers to the cluster, even twice, so it seems to work.
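On the stop problem: kill -9 sends SIGKILL, which the JVM cannot intercept, so no cleanup code runs at all. A plain kill (SIGTERM) does run JVM shutdown hooks, so a stop call could be registered there. The following is only a sketch under assumptions: `CacheShutdown` and `Stoppable` are hypothetical names, with `Stoppable` standing in for whatever object exposes the cache's stop method.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper that stops the cache when the JVM shuts down normally. */
final class CacheShutdown {

    /** Assumed stand-in for the cache object, e.g. a wrapper that calls its stop method. */
    interface Stoppable {
        void stop();
    }

    /**
     * Registers a shutdown hook that calls stop() at most once.
     * The hook runs on normal exit and on SIGTERM, but never on SIGKILL (kill -9).
     */
    static Thread register(Stoppable cache) {
        AtomicBoolean stopped = new AtomicBoolean(false);
        Thread hook = new Thread(() -> {
            if (stopped.compareAndSet(false, true)) {
                cache.stop(); // leave the cluster cleanly so members see a proper leave
            }
        }, "cache-stop-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

Inside an application server, the container's own undeploy callback is probably the better place for the stop call; the hook is just the simplest way to show the difference between kill and kill -9.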

Below is the type of configuration I have (not identical but very similar, just copied from an example), so that you know what I am writing about.

    <?xml version="1.0" encoding="UTF-8" ?>
    <server>
      <classpath codebase="./lib" archives="jboss-cache.jar, jgroups.jar" />

      
      
      
      <mbean code="org.jboss.cache.TreeCache" name="jboss.cache:service=TreeCache">
        <depends>jboss:service=Naming</depends>
        <depends>jboss:service=TransactionManager</depends>

        
        <attribute name="TransactionManagerLookupClass">org.jboss.cache.DummyTransactionManagerLookup</attribute>

        
        <attribute name="NodeLockingScheme">PESSIMISTIC</attribute>

        
        <attribute name="IsolationLevel">REPEATABLE_READ</attribute>

        
        <attribute name="CacheMode">LOCAL</attribute>

        
        <attribute name="UseInterceptorMbeans">true</attribute>

        
        <attribute name="ClusterName">JBoss-Cache-Cluster</attribute>

        <attribute name="ClusterConfig">
          <config>
            
            
            <UDP mcast_addr="228.1.2.3" mcast_port="45566" ip_ttl="64" ip_mcast="true"
               mcast_send_buf_size="150000" mcast_recv_buf_size="80000" ucast_send_buf_size="150000"
               ucast_recv_buf_size="80000" loopback="false" />
            <PING timeout="2000" num_initial_members="3" up_thread="false" down_thread="false" />
            <MERGE2 min_interval="10000" max_interval="20000" />
            <FD shun="true" up_thread="true" down_thread="true" />
            <VERIFY_SUSPECT timeout="1500" up_thread="false" down_thread="false" />
            <pbcast.NAKACK gc_lag="50" max_xmit_size="8192" retransmit_timeout="600,1200,2400,4800" up_thread="false"
               down_thread="false" />
            <UNICAST timeout="600,1200,2400" window_size="100" min_threshold="10" down_thread="false" />
            <pbcast.STABLE desired_avg_gossip="20000" up_thread="false" down_thread="false" />
            <FRAG frag_size="8192" down_thread="false" up_thread="false" />
            <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="true" print_local_addr="true" />
            <pbcast.STATE_TRANSFER up_thread="false" down_thread="false" />
          </config>
        </attribute>

        
        <attribute name="InitialStateRetrievalTimeout">5000</attribute>

        
        <attribute name="SyncReplTimeout">10000</attribute>

        
        <attribute name="LockAcquisitionTimeout">15000</attribute>

        
        <attribute name="EvictionPolicyClass">org.jboss.cache.eviction.LRUPolicy</attribute>

        
        <attribute name="EvictionPolicyConfig">
          <config>
            <attribute name="wakeUpIntervalSeconds">5</attribute>
            
            <region name="/_default_">
             <attribute name="maxNodes">5000</attribute>
             <attribute name="timeToLiveSeconds">1000</attribute>
             
             <attribute name="maxAgeSeconds">120</attribute>
           </region>

           <region name="/org/jboss/data">
             <attribute name="maxNodes">5000</attribute>
             <attribute name="timeToLiveSeconds">1000</attribute>
           </region>

           <region name="/org/jboss/test/data">
             <attribute name="maxNodes">5</attribute>
             <attribute name="timeToLiveSeconds">4</attribute>
           </region>
          </config>
        </attribute>


      </mbean>
    </server>