2 Replies Latest reply on Mar 20, 2012 4:39 AM by galder.zamarreno

    Infinispan cluster error in RadarGun 1.0.0 Final

    kavandesai

      Hi

       

      Since last couple of day  I was trying to form cluster of infinispan nodes in Radargun 1.0.

      I was able to start both master and slave nodes (2 slave) however the log message on master appear that cluster is not formed

       

      2012-03-13 11:20:18,401 [main] INFO  [org.radargun.state.MasterState] Started benchmarking product 'infinispan4'

      2012-03-13 11:20:18,418 [main] INFO  [org.radargun.Master] Master started and listening for connection on: /10.83.14.234:2103

      2012-03-13 11:20:18,418 [main] INFO  [org.radargun.Master] Waiting 5 seconds for server socket to open completely

      2012-03-13 11:20:23,421 [main] INFO  [org.radargun.Master] Waiting for 2 slaves to register to the master.

      2012-03-13 11:20:23,472 [main] INFO  [org.radargun.Master] Waiting for 1 slaves to register to the master.

      2012-03-13 11:20:25,058 [main] INFO  [org.radargun.Master] Connection established from 2 slaves.

      2012-03-13 11:20:25,058 [main] INFO  [org.radargun.config.ScalingBenchmarkConfig] Initializing: 2,2,1

      2012-03-13 11:20:25,058 [main] INFO  [org.radargun.config.ScalingBenchmarkConfig] Initializing config 2

      2012-03-13 11:20:25,058 [main] INFO  [org.radargun.config.ScalingBenchmarkConfig] fixedBenchmarkIt=0

      2012-03-13 11:20:25,058 [main] INFO  [org.radargun.config.ScalingBenchmarkConfig] fixedBenchmarks size=1

      2012-03-13 11:20:25,058 [main] INFO  [org.radargun.Master] Starting dist stage 'DestroyWrapperStage' on 2 Slaves: DestroyWrapperStage {slaveIndex=0, activeSlavesCount=2, totalSlavesCount=2}

      2012-03-13 11:20:25,236 [main] INFO  [org.radargun.stages.DestroyWrapperStage] DestroyWrapperStage received ack from all (2) slaves. Durations [0:61 milliseconds, 1:47 milliseconds]

      2012-03-13 11:20:25,237 [main] INFO  [org.radargun.stages.DestroyWrapperStage] Node 1 has an initial free memory of: 1002447kb

      2012-03-13 11:20:25,237 [main] INFO  [org.radargun.stages.DestroyWrapperStage] Node 0 has an initial free memory of: 1002447kb

      2012-03-13 11:20:25,237 [main] INFO  [org.radargun.config.ScalingBenchmarkConfig] fixedBenchmarkIt=0

      2012-03-13 11:20:25,237 [main] INFO  [org.radargun.config.ScalingBenchmarkConfig] fixedBenchmarks size=1

      2012-03-13 11:20:25,237 [main] INFO  [org.radargun.Master] Starting dist stage 'StartClusterStage' on 2 Slaves: StartClusterStage {productName='infinispan4', useSmartClassLoading=true, config=dist-sync.xml, slaveIndex=0, activeSlavesCount=2, totalSlavesCount=2}

      2012-03-13 11:23:38,018 [main] INFO  [org.radargun.stages.StartClusterStage] StartClusterStage received ack from all (2) slaves. Durations [0:187.28 seconds, 1:192.78 seconds]

      2012-03-13 11:23:38,019 [main] INFO  [org.radargun.config.ScalingBenchmarkConfig] fixedBenchmarkIt=0

      2012-03-13 11:23:38,019 [main] INFO  [org.radargun.config.ScalingBenchmarkConfig] fixedBenchmarks size=1

      2012-03-13 11:23:38,019 [main] INFO  [org.radargun.Master] Starting dist stage 'ClusterValidationStage' on 2 Slaves: ClusterValidationStage {isPartialReplication=false, replicationTryCount=60, replicationTimeSleep=2000, wrapper=null, slaveIndex=0, activeSlavesCount=2, totalSlavesCount=2}

      2012-03-13 11:25:38,229 [main] INFO  [org.radargun.stages.ClusterValidationStage] ClusterValidationStage received ack from all (2) slaves. Durations [0:120.12 seconds, 1:120.2 seconds]

      2012-03-13 11:25:38,229 [main] WARN  [org.radargun.stages.ClusterValidationStage] On slave DefaultDistStageAck{slaveIndex=0, slaveAddress=/10.83.14.227, isError=false, errorMessage='null', payload=-1, remoteExceptionString=null} total replication hasn't occurred. Expected 1 and received -1

      2012-03-13 11:25:38,230 [main] WARN  [org.radargun.stages.ClusterValidationStage] On slave DefaultDistStageAck{slaveIndex=1, slaveAddress=/10.83.14.226, isError=false, errorMessage='null', payload=-1, remoteExceptionString=null} total replication hasn't occurred. Expected 1 and received -1

      2012-03-13 11:25:38,230 [main] WARN  [org.radargun.stages.ClusterValidationStage] Cluster hasn't formed!

       

      On slave the message appears like

       

      java -server -Xmx1024M -Xms1024M -Dresolve.dns=false -Djgroups.timer.num_threads=4 -Djava.net.preferIPv4Stack=true -Dlog4j.file.prefix=node1 -Dbind.address= -classpath /opt/RadarGun-1.0.0-FINAL/lib/activation-1.1.jar:/opt/RadarGun-1.0.0-FINAL/lib/commons-beanutils-1.8.0.jar:/opt/RadarGun-1.0.0-FINAL/lib/commons-digester-2.0.jar:/opt/RadarGun-1.0.0-FINAL/lib/commons-logging-1.1.1.jar:/opt/RadarGun-1.0.0-FINAL/lib/commons-math-1.2.jar:/opt/RadarGun-1.0.0-FINAL/lib/jboss-transaction-api-1.0.1.GA.jar:/opt/RadarGun-1.0.0-FINAL/lib/jcommon-1.0.15.jar:/opt/RadarGun-1.0.0-FINAL/lib/jfreechart-1.0.12.jar:/opt/RadarGun-1.0.0-FINAL/lib/log4j-1.2.14.jar:/opt/RadarGun-1.0.0-FINAL/lib/mail-1.4.2.jar:/opt/RadarGun-1.0.0-FINAL/lib/radargun-framework-1.0.0-FINAL.jar:/opt/RadarGun-1.0.0-FINAL/conf org.radargun.Slave -master 10.83.14.234:2103

      --------------------------------------------------------------------------------

      PerNodeRollingFileAppender::Using file prefix:node1

      11:20:12,021 INFO  [Slave] Attempting to connect to master 10.83.14.234:2103

      11:20:12,039 INFO  [Slave] Successfully established connection with master at: 10.83.14.234:2103

      11:20:15,281 INFO  [Slave] Executing stage: DestroyWrapperStage {slaveIndex=0, activeSlavesCount=2, totalSlavesCount=2}

      11:20:15,281 INFO  [DestroyWrapperStage] Received destroy cache wrapper request from master...

      11:20:15,287 INFO  [DestroyWrapperStage] No cache wrapper deployed on this slave, nothing to do.

      11:20:15,306 INFO  [DestroyWrapperStage] Before executing clear, memory looks like this: Memory - free: 1002447 - max:1,013,632 kb- total:1,013,632 kb

      11:20:15,342 INFO  [DestroyWrapperStage] After executing cleanup, memory looks like this: Memory - free: 1007531 - max:1,013,632 kb- total:1,013,632 kb

      11:20:15,356 INFO  [Slave] Finished stage: DestroyWrapperStage {slaveIndex=0, activeSlavesCount=2, totalSlavesCount=2}

      11:20:15,359 INFO  [Slave] Ack successfully sent to the master

      11:20:15,368 INFO  [Slave] Executing stage: StartClusterStage {productName='infinispan4', useSmartClassLoading=true, config=dist-sync.xml, slaveIndex=0, activeSlavesCount=2, totalSlavesCount=2}

      11:20:15,368 INFO  [StartClusterStage] Startup staggering, cluster size is 2 This is the slave with index 0, not sleeping

      11:20:15,368 INFO  [StartClusterStage] Ack master's StartCluster stage. Local address is: /10.83.14.227. This slave's index is: 0

      11:20:15,373 INFO  [StartClusterStage] Creating newInstance org.radargun.cachewrappers.InfinispanWrapper with classloader java.net.URLClassLoader@6fc5f743

      11:20:16,279 WARN  [GenericTransactionManagerLookup] Falling back to DummyTransactionManager from Infinispan

      11:20:16,716 INFO  [JGroupsTransport] Starting JGroups Channel

       

      -------------------------------------------------------------------

      GMS: address=node1-37250, cluster=x, physical address=127.0.0.1:35211

      -------------------------------------------------------------------

      11:20:22,389 INFO  [JGroupsTransport] Received new cluster view: [node1-37250|0] [node1-37250]

      11:20:22,391 INFO  [JGroupsTransport] Cache local address is node1-37250, physical addresses are [127.0.0.1:35211]

      11:20:22,391 INFO  [GlobalComponentRegistry] Infinispan version: Infinispan 'Ursus' 4.2.0-SNAPSHOT

      11:20:22,453 INFO  [ComponentRegistry] Infinispan version: Infinispan 'Ursus' 4.2.0-SNAPSHOT

      11:20:22,453 INFO  [InfinispanWrapper] Loading JGroups form: file:/opt/RadarGun-1.0.0-FINAL/plugins/infinispan4/lib/jgroups-2.11.0.GA.jar

      11:20:22,453 INFO  [InfinispanWrapper] JGroups version: JGroups 2.11.0.GA [$Id: Version.java,v 1.110 2010/10/29 12:02:06 belaban Exp $]

      11:20:22,453 INFO  [StartClusterStage] Number of members=1 is not the one expected: 2

      11:20:23,455 INFO  [StartClusterStage] Number of members=1 is not the one expected: 2

       

      When I reduced the number of salve from 2 to 1 it works.