2 Replies Latest reply on Jul 9, 2012 7:08 AM by senakafdo

    Infinispan EC2 Error - this.merge_id (null) is different from merge_id

    joereger

      Hi All!

       

      I'm running Infinispan 4.2.1 with Hibernate 3.6.4 on Amazon EC2.  The app server is standalone Tomcat 7, and the OS is Amazon Linux/CentOS.  I keep getting this error:

       

      2011-05-28 22:30:46,540 [OOB-512,InfinispanCache,domU-12-51-39-00-86-57-34887] ERROR org.jgroups.protocols.pbcast.GMS - domU-12-51-39-00-86-57-34887: this.merge_id (null) is different from merge_id (domU-12-51-39-00-86-57-34887::78)

       

      Each cache seems to work locally, but I've verified that the caches are not updating across the cluster.  The EC2 instances are in the same security group, so I don't believe there is any network block.  I'm using S3_PING and have verified that it's properly talking to S3, storing status files in buckets, etc.  Members of the cluster see one another and adjust the cluster membership roster.

       

      Config files are below; they're based on the EC2 sample file in the distro.  Any ideas?

       

      Thanks!

       

      Joe

       

      My config file:

       

      <?xml version="1.0" encoding="UTF-8"?>
      <infinispan
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="urn:infinispan:config:4.2 http://www.infinispan.org/schemas/infinispan-config-4.2.xsd"
            xmlns="urn:infinispan:config:4.2">
         <global>
            <transport clusterName="wct-infinispan-cache">
               <properties>
                  <property name="configurationFile" value="infinispan-jgroups-config-amazon-s3.xml" />
               </properties>
            </transport>
         </global>
         <default>
            <locking
               isolationLevel="READ_COMMITTED"
               lockAcquisitionTimeout="20000"
               writeSkewCheck="false"
               concurrencyLevel="5000"
               useLockStriping="false"
            />
            <jmxStatistics enabled="false"/>
            <invocationBatching enabled="true"/>
            <clustering mode="replication">
               <stateRetrieval
                  timeout="20000"
                  fetchInMemoryState="false"
                  alwaysProvideInMemoryState="false"
               />
               <async
                  useReplQueue="true"
                  replQueueInterval="2500"
                  replQueueMaxElements="100"
               />
            </clustering>
         </default>
      </infinispan>

       

       

      And my protocol file:

       

      <config xmlns="urn:org:jgroups"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation="urn:org:jgroups file:schema/JGroups-2.8.xsd">
         <TCP
              bind_addr="${jgroups.tcp.address:127.0.0.1}"
              bind_port="${jgroups.tcp.port:7800}"
              loopback="true"
              port_range="30"
              recv_buf_size="20000000"
              send_buf_size="640000"
              discard_incompatible_packets="true"
              max_bundle_size="64000"
              max_bundle_timeout="30"
              enable_bundling="true"
              use_send_queues="true"
              sock_conn_timeout="300"
              enable_diagnostics="false"

              thread_pool.enabled="true"
              thread_pool.min_threads="2"
              thread_pool.max_threads="30"
              thread_pool.keep_alive_time="5000"
              thread_pool.queue_enabled="false"
              thread_pool.queue_max_size="100"
              thread_pool.rejection_policy="Discard"

              oob_thread_pool.enabled="true"
              oob_thread_pool.min_threads="2"
              oob_thread_pool.max_threads="30"
              oob_thread_pool.keep_alive_time="5000"
              oob_thread_pool.queue_enabled="false"
              oob_thread_pool.queue_max_size="100"
              oob_thread_pool.rejection_policy="Discard"
              />

         <S3_PING secret_access_key="${jgroups.s3.secret_access_key}" access_key="${jgroups.s3.access_key}" location="${jgroups.s3.bucket:jgroups}" />

         <MERGE2 max_interval="30000"
                 min_interval="10000"/>
         <FD_SOCK/>
         <FD timeout="3000" max_tries="3"/>
         <VERIFY_SUSPECT timeout="1500"/>
         <pbcast.NAKACK
              use_mcast_xmit="false" gc_lag="0"
              retransmit_timeout="300,600,1200,2400,4800"
              discard_delivered_msgs="false"/>
         <UNICAST timeout="300,600,1200"/>
         <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
                        max_bytes="400000"/>
         <pbcast.GMS print_local_addr="false" join_timeout="7000" view_bundling="true"/>
         <UFC max_credits="2000000" min_threshold="0.10"/>
         <MFC max_credits="2000000" min_threshold="0.10"/>
         <FRAG2 frag_size="60000"/>
         <pbcast.STREAMING_STATE_TRANSFER/>
         <pbcast.FLUSH timeout="0"/>
      </config>

        • 1. Re: Infinispan EC2 Error - this.merge_id (null) is different from merge_id
          belaban

          The error message means that a merge response was received by a member which didn't take part in the merge. This can be safely ignored. It can happen when a merge was started but then cancelled because the correct view had been received, and we still get that spurious merge response afterwards.

          The question is: does your cluster form at startup, or does every member start as a singleton cluster, only to be merged into one cluster by MERGE2 later on? The latter case is something that should *not* happen...
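
          One way to tell the two cases apart is to raise the log level for the JGroups membership and merge protocols and watch which views are installed at startup. The fragment below is only a sketch: it assumes the application logs through log4j 1.x with an XML configuration (the log line in the original post looks like log4j output, but the post does not say so), and the exact messages logged at DEBUG vary between JGroups versions.

          ```xml
          <!-- Assumption: log4j 1.x XML config. Place these inside the existing
               <log4j:configuration> element, after the appenders and before <root>. -->

          <!-- Group membership: view installation and merge handling -->
          <logger name="org.jgroups.protocols.pbcast.GMS">
            <level value="DEBUG"/>
          </logger>

          <!-- Merge detection: shows when MERGE2 decides a merge is needed -->
          <logger name="org.jgroups.protocols.MERGE2">
            <level value="DEBUG"/>
          </logger>
          ```

          If each node first reports a view containing only itself and the multi-member view appears only after MERGE2 activity, the members are starting as singletons, which would point to discovery not finding the other nodes at startup.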

          • 2. Re: Infinispan EC2 Error - this.merge_id (null) is different from merge_id
            senakafdo

            Hi Bela,

             

            If the error can be ignored, can we log it at WARN level? It's a little confusing for an end user to see an ERROR appear in a situation where, from my understanding, it can be ignored. The issue I ran into is probably very different from Joe's, but the errors were the same and, from what I understood, had no impact on functionality.
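
            Until the level is changed in JGroups itself, a possible user-side workaround is to raise the threshold for that one logger. This is a sketch that assumes log4j 1.x with an XML configuration, which this thread does not confirm. Note that it hides every ERROR from pbcast.GMS, not only the merge_id message, so it could also mask real membership problems.

            ```xml
            <!-- Assumption: log4j 1.x XML config; place inside <log4j:configuration>. -->
            <!-- Suppresses ERROR and below from GMS, including the merge_id message. -->
            <logger name="org.jgroups.protocols.pbcast.GMS">
              <level value="FATAL"/>
            </logger>
            ```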

             

            BR,
            Senaka.