24 Replies. Latest reply on Aug 17, 2016 7:45 AM by Wolf-Dieter Fink

    Infinispan server: Cache entries are not replicated although the cache is configured as replicated async

    Eli Z Newbie

      Hi,

       

      We are using Infinispan Server version 8.2.1.Final.

       

      Our setup is:

      1. We have two cluster nodes, master (server one) and slave (server two), running in domain mode.

      2. The cache is configured as a replicated cache in async mode.

      3. Our application uses the Hot Rod client with round-robin balancing, writing to and reading from each node.

       

      Sometimes replication stops working.

      That is, the master node holds Record X and the slave node holds Record Y.

      I would expect both nodes to hold both Record X and Record Y.

       

      This has happened several times, and we have not found a specific scenario that triggers it. It may be caused by many concurrent writes to the cache on both nodes, but we do not know.

       

      In both console logs we see the following exception. We do not know whether it is related, but it looks like it is:

      ========================================================================================

      [Server:server-one] 19:33:35,304 WARN  [org.jgroups.protocols.TCP] (TcpServer.Acceptor [7600],null) JGRP000006: failed accepting connection from peer: java.net.SocketException: BaseServer.TcpConnection.readPeerAddress(): cookie read by 192.168.118.51:7600 does not match own cookie; terminating connection
      [Server:server-one]     at org.jgroups.blocks.cs.TcpConnection.readPeerAddress(TcpConnection.java:256)
      [Server:server-one]     at org.jgroups.blocks.cs.TcpConnection.<init>(TcpConnection.java:54)
      [Server:server-one]     at org.jgroups.blocks.cs.TcpServer$Acceptor.handleAccept(TcpServer.java:132)
      [Server:server-one]     at org.jgroups.blocks.cs.TcpServer$Acceptor.run(TcpServer.java:117)
      [Server:server-one]     at java.lang.Thread.run(Thread.java:745)
      [Server:server-one] 09:05:35,687 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 52) DGISPN0001: Started ___defaultcache cache from clustered container

       

       

      [Server:server-two] 19:31:40,350 WARN  [org.jgroups.protocols.TCP] (TcpServer.Acceptor [7600],null) JGRP000006: failed accepting connection from peer: java.net.SocketException: BaseServer.TcpConnection.readPeerAddress(): cookie read by 192.168.118.52:7600 does not match own cookie; terminating connection
      [Server:server-two]     at org.jgroups.blocks.cs.TcpConnection.readPeerAddress(TcpConnection.java:256)
      [Server:server-two]     at org.jgroups.blocks.cs.TcpConnection.<init>(TcpConnection.java:54)
      [Server:server-two]     at org.jgroups.blocks.cs.TcpServer$Acceptor.handleAccept(TcpServer.java:132)
      [Server:server-two]     at org.jgroups.blocks.cs.TcpServer$Acceptor.run(TcpServer.java:117)
      [Server:server-two]     at java.lang.Thread.run(Thread.java:745)

      ========================================================================================
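
      To check how often this warning appears on each node, and whether it lines up in time with the moments replication stops, the console logs can be scanned for the JGRP000006 code. The following is only a minimal sketch using the standard Java library; the class name CookieWarningScan and the method countCookieWarnings are hypothetical, and it assumes each line carries the "[Server:<name>]" prefix seen in the excerpts above.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Infinispan or JGroups.
public class CookieWarningScan {

    // Matches the "[Server:<name>]" prefix that the domain console puts on each line.
    private static final Pattern SERVER = Pattern.compile("\\[Server:([^\\]]+)\\]");

    /** Counts lines containing the JGRP000006 warning code, grouped by server name. */
    public static Map<String, Integer> countCookieWarnings(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            // Stack trace lines do not repeat the code, so each warning is counted once.
            if (!line.contains("JGRP000006")) {
                continue;
            }
            Matcher m = SERVER.matcher(line);
            String server = m.find() ? m.group(1) : "unknown";
            counts.merge(server, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java CookieWarningScan <console-log-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        countCookieWarnings(lines)
                .forEach((server, n) -> System.out.println(server + ": " + n));
    }
}
```

      Running it against each console log prints one line per server with the number of warnings found. A count alone does not show that the warning causes the replication failure; it only helps to correlate the two.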

      Once this state occurs, only a restart of the whole domain (master and slave) fixes the issue, after which replication works again.

       

       

      Can you please advise on this issue? It is critical for us: if replication stops working, the high availability of our application stops working as well.

       

      Attached are the console logs of both the master and slave servers,
      as well as the domain and host configuration files.

       

       

      Thanks,

      Eli
