2 Replies Latest reply on Jul 8, 2011 12:51 PM by monty-temboo

    Timeouts

    monty-temboo

      I have a simple test that fires up 4 nodes, each of which reads 4,000 items from a cache with a few-ms delay between reads, then sleeps 5 seconds and does it again.  Another node puts 10,000 items into the cache.  I start the reader nodes first because I'm using distributed mode and want the items distributed as they are put in.
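
      Roughly, each reader node does something like this (a simplified sketch, not the actual Node4TimeGetGuids code; the key format is a placeholder):

      import org.infinispan.Cache;
      import org.infinispan.manager.DefaultCacheManager;
      import org.infinispan.manager.EmbeddedCacheManager;

      // Simplified reader node: read 4,000 items with a small delay
      // between gets, sleep 5 seconds, then repeat.
      public class ReaderNode {
         public static void main(String[] args) throws Exception {
            EmbeddedCacheManager manager = new DefaultCacheManager("infinispan.xml");
            Cache<String, String> cache = manager.getCache();
            while (true) {
               for (int i = 0; i < 4000; i++) {
                  cache.get("guid-" + i);  // placeholder key format
                  Thread.sleep(2);         // a few ms between reads
               }
               Thread.sleep(5000);         // sleep 5 sec, then do it again
            }
         }
      }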

       

      I'm running in distributed mode with L1 caching enabled, using JGroups over UDP, with config files taken from the RadarGun project.  I'm using Infinispan 5 CR5.

       

      When I run the reader nodes, the first two start up fine, but the next two get errors like the one below.  Those nodes continue starting up and seem to work fine afterwards.

       

      I imagine I have some configuration parameter set incorrectly.  The config files are at the end.

       

      16:26:46,725  INFO JGroupsTransport -- ISPN00094: Received new cluster view: [temboooo-30527|2] [temboooo-30527, temboooo-56379, temboooo-24544]
      16:26:46,730 ERROR InvocationContextInterceptor -- ISPN00136: Execution error
      org.infinispan.remoting.RpcException: No more valid responses.  Received invalid responses from all of [Sender{address=temboooo-56379, responded=true}]
         at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$FutureCollator.getResponseList(CommandAwareRpcDispatcher.java:380)
         at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$ReplicationTask.call(CommandAwareRpcDispatcher.java:253)
         at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommands(CommandAwareRpcDispatcher.java:116)
         at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:436)
         at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:131)
         at org.infinispan.distribution.DistributionManagerImpl.retrieveFromRemoteSource(DistributionManagerImpl.java:283)
         at org.infinispan.interceptors.DistributionInterceptor.realRemoteGet(DistributionInterceptor.java:177)
         at org.infinispan.interceptors.DistributionInterceptor.remoteGetAndStoreInL1(DistributionInterceptor.java:165)
         at org.infinispan.interceptors.DistributionInterceptor.visitGetKeyValueCommand(DistributionInterceptor.java:131)
         at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:61)
         at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:119)
         at org.infinispan.interceptors.LockingInterceptor.visitGetKeyValueCommand(LockingInterceptor.java:147)
         at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:61)
         at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:119)
         at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:133)
         at org.infinispan.commands.AbstractVisitor.visitGetKeyValueCommand(AbstractVisitor.java:90)
         at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:61)
         at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:119)
         at org.infinispan.interceptors.TxInterceptor.enlistReadAndInvokeNext(TxInterceptor.java:202)
         at org.infinispan.interceptors.TxInterceptor.visitGetKeyValueCommand(TxInterceptor.java:193)
         at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:61)
         at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:119)
         at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:99)
         at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:64)
         at org.infinispan.commands.AbstractVisitor.visitGetKeyValueCommand(AbstractVisitor.java:90)
         at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:61)
         at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:274)
         at org.infinispan.CacheDelegate.get(CacheDelegate.java:240)
         at org.infinispan.examples.tutorial.clustered.Node4TimeGetGuids.run(Node4TimeGetGuids.java:79)
         at org.infinispan.examples.tutorial.clustered.Node4TimeGetGuids.main(Node4TimeGetGuids.java:25)

       

       

      <infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:infinispan:config:4.0">
         <global>
            <transport clusterName="x">
               <properties>
                  <property name="configurationFile" value="jgroups.xml"/>
               </properties>
            </transport>
         </global>

         <default>
            <transaction
                  transactionManagerLookupClass="org.infinispan.transaction.lookup.GenericTransactionManagerLookup"/>
            <locking concurrencyLevel="1000" useLockStriping="false" />

            <unsafe unreliableReturnValues="false" />

            <clustering mode="d">
               <sync replTimeout="10000"/>
               <l1 enabled="true" lifespan="480000"/>
               <hash numOwners="2" rehashEnabled="true"/>
            </clustering>
         </default>
      </infinispan>
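
      For completeness, the nodes pick this file up through the standard API, roughly like so ("infinispan.xml" is just what I call the file locally):

      import org.infinispan.Cache;
      import org.infinispan.manager.DefaultCacheManager;
      import org.infinispan.manager.EmbeddedCacheManager;

      // Bootstrap sketch: start a cache manager from the XML above and
      // grab the default (distributed) cache. With <sync replTimeout="10000"/>,
      // a remote get may block up to 10 s before the RPC times out.
      public class Bootstrap {
         public static void main(String[] args) throws Exception {
            EmbeddedCacheManager manager = new DefaultCacheManager("infinispan.xml");
            Cache<String, String> cache = manager.getCache();
            System.out.println("Cluster members: " + manager.getMembers());
         }
      }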

       

       

      <config xmlns="urn:org:jgroups"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation="urn:org:jgroups file:schema/JGroups-2.8.xsd
      urn:org:jgroups ">
         <UDP
               mcast_addr="${jgroups.udp.mcast_addr:232.10.10.10}"
               mcast_port="${jgroups.udp.mcast_port:45588}"
               tos="8"
               ucast_recv_buf_size="20000000"
               ucast_send_buf_size="640000"
               mcast_recv_buf_size="25000000"
               mcast_send_buf_size="640000"
               loopback="false"
               discard_incompatible_packets="true"
               max_bundle_size="64000"
               max_bundle_timeout="30"
               ip_ttl="${jgroups.udp.ip_ttl:2}"
               enable_bundling="true"
               enable_diagnostics="false"
               thread_naming_pattern="cl"

               thread_pool.enabled="true"
               thread_pool.min_threads="4"
               thread_pool.max_threads="8"
               thread_pool.keep_alive_time="5000"
               thread_pool.queue_enabled="true"
               thread_pool.queue_max_size="10000"
               thread_pool.rejection_policy="discard"

               oob_thread_pool.enabled="true"
               oob_thread_pool.min_threads="8"
               oob_thread_pool.max_threads="300"
               oob_thread_pool.keep_alive_time="5000"
               oob_thread_pool.queue_enabled="false"
               oob_thread_pool.queue_max_size="100"
               oob_thread_pool.rejection_policy="discard"
               />

         <PING timeout="5000" num_initial_members="1000"/>
         <MERGE2 max_interval="30000" min_interval="10000"/>
         <FD_SOCK/>
         <FD_ALL timeout="15000" interval="5000"/>
         <VERIFY_SUSPECT timeout="15000"/>
         <BARRIER/>
         <pbcast.NAKACK use_stats_for_retransmission="false"
                        exponential_backoff="0"
                        use_mcast_xmit="true" gc_lag="0"
                        retransmit_timeout="600,1200"
                        discard_delivered_msgs="true"/>
         <UNICAST timeout="300,600,1200"/>
         <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
                        max_bytes="1000000"/>
         <pbcast.GMS print_local_addr="true" join_timeout="3000"
                     max_bundling_time="500"
                     view_bundling="true"/>
         <FC max_credits="500000"
             min_threshold="0.20"/>
         <FRAG2 frag_size="60000"/>
         <pbcast.STATE_TRANSFER/>
         <pbcast.FLUSH/>
      </config>
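
      For what it's worth, this stack can be exercised outside Infinispan with a bare JChannel, roughly like this (a rough sketch, not part of my test; the cluster name matches the Infinispan config):

      import org.jgroups.JChannel;
      import org.jgroups.Message;
      import org.jgroups.ReceiverAdapter;

      // Smoke-test the JGroups stack on its own: join the same cluster,
      // multicast one message, and print anything received.
      public class JGroupsCheck {
         public static void main(String[] args) throws Exception {
            JChannel ch = new JChannel("jgroups.xml");
            ch.setReceiver(new ReceiverAdapter() {
               @Override
               public void receive(Message msg) {
                  System.out.println("received: " + msg.getObject());
               }
            });
            ch.connect("x");                           // same clusterName as above
            ch.send(new Message(null, null, "hello")); // null dest = multicast
            Thread.sleep(2000);
            ch.close();
         }
      }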

       

      Thanks,

       

      Monty

        • 1. Re: Timeouts
          galder.zamarreno

          It's hard to say what those invalid responses are, but I'd suggest enabling TRACE logging on the org.infinispan package to find out more. If you enable TRACE, consider lowering the number of entries loaded to avoid getting a massive log file; DEBUG might even do. Also check the logs on the other nodes to see if there are any errors (in this case, check the logs from temboooo-56379, the node the exception points to).
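
          If Log4J is your backend, something like this at the start of the test would turn it on programmatically (a sketch; a category in log4j.xml works just as well):

          import org.apache.log4j.Level;
          import org.apache.log4j.Logger;

          // Enable TRACE for Infinispan classes only, so other packages
          // don't flood the log (assumes Log4J 1.x is on the classpath).
          public final class TraceLogging {
             public static void enable() {
                Logger.getLogger("org.infinispan").setLevel(Level.TRACE);
             }
          }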

           

          Just in case, also try the latest release, 5.0.0.CR7.

          • 2. Re: Timeouts
            monty-temboo

            I've shelved my Infinispan experiments for the time being.  When I get back to it, I'll try TRACE logging and also update to whatever the latest version is.

             

            Thanks,

             

            Monty