14 Replies Latest reply on Nov 1, 2018 9:00 AM by william.burns

    Listener (Clustered) on other node(s) not being called when item added to cache

    jadiyo1971

      Hello,

       

      I am struggling to get Listeners to fire on nodes within a cluster.

       

      I am using Infinispan 9.1.7.Final in Embedded Cache Manager.

       

      The application will run on multiple nodes/OpenShift pods, but for now this is just 2 instances of JBoss running on my local machine.

      Based on the logs for each node, the nodes seem to be able to see each other.

       

      Node A Logs:

      2018-09-08 13:55:43,324 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ServerService Thread Pool -- 64) ISPN000078: Starting JGroups channel GAP_App_Cache_Cluster_REPL_SYNC

      2018-09-08 13:55:43,324 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ServerService Thread Pool -- 64) ISPN000088: Unable to use any JGroups configuration mechanisms provided in properties {}. Using default JGroups configuration!

      2018-09-08 13:55:43,592 WARNING [org.jgroups.stack.Configurator] (ServerService Thread Pool -- 64) JGRP000026: unable to find an address other than loopback for IP version IPv6

      2018-09-08 13:55:48,760 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ServerService Thread Pool -- 64) ISPN000094: Received new cluster view for channel GAP_App_Cache_Cluster_REPL_SYNC: [LG013868-50835|0] (1) [LG013868-50835]

      2018-09-08 13:55:48,763 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ServerService Thread Pool -- 64) ISPN000079: Channel GAP_App_Cache_Cluster_REPL_SYNC local address is LG013868-50835, physical addresses are [0:0:0:0:0:0:0:1:62020]

       

      Also, when Node B starts up, the Node A logs show this:

      2018-09-08 13:57:23,847 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-2,LG013868-50835) ISPN000094: Received new cluster view for channel GAP_App_Cache_Cluster_REPL_SYNC: [LG013868-50835|1] (2) [LG013868-50835, LG013868-50409]

      2018-09-08 13:57:23,852 INFO  [org.infinispan.CLUSTER] (Incoming-2,LG013868-50835) ISPN100000: Node LG013868-50409 joined the cluster

       

      Node B Logs:

      2018-09-08 13:57:23,242 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ServerService Thread Pool -- 84) ISPN000078: Starting JGroups channel GAP_App_Cache_Cluster_REPL_SYNC

      2018-09-08 13:57:23,242 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ServerService Thread Pool -- 84) ISPN000088: Unable to use any JGroups configuration mechanisms provided in properties {}. Using default JGroups configuration!

      2018-09-08 13:57:23,506 WARNING [org.jgroups.stack.Configurator] (ServerService Thread Pool -- 84) JGRP000026: unable to find an address other than loopback for IP version IPv6

      2018-09-08 13:57:23,872 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ServerService Thread Pool -- 84) ISPN000094: Received new cluster view for channel GAP_App_Cache_Cluster_REPL_SYNC: [LG013868-50835|1] (2) [LG013868-50835, LG013868-50409]

      2018-09-08 13:57:23,875 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ServerService Thread Pool -- 84) ISPN000079: Channel GAP_App_Cache_Cluster_REPL_SYNC local address is LG013868-50409, physical addresses are [0:0:0:0:0:0:0:1:54556]

       

      So it looks like the nodes can see each other.
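A minimal sketch, using only the JDK, of how the ISPN000094 view lines above could be checked programmatically. The class and method names (`ClusterViewParser`, `members`, `hasFormedCluster`) are hypothetical, and the regular expression assumes the message layout seen in these logs: `[coordinator|viewId] (count) [member, member, ...]`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: reads the member list out of an ISPN000094 log line. */
final class ClusterViewParser {

    // Matches "[coordinator|viewId] (count) [members]" after the ISPN000094 code
    // and captures the comma-separated member list.
    private static final Pattern VIEW = Pattern.compile(
            "ISPN000094:.*\\[[^\\]|]+\\|\\d+\\]\\s+\\(\\d+\\)\\s+\\[([^\\]]*)\\]");

    /** Returns the members named in the view, or an empty list for any other line. */
    static List<String> members(String logLine) {
        List<String> result = new ArrayList<>();
        Matcher matcher = VIEW.matcher(logLine);
        if (!matcher.find()) {
            return result;
        }
        for (String member : matcher.group(1).split(",")) {
            String trimmed = member.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    /** True when the view lists more than one member. */
    static boolean hasFormedCluster(String logLine) {
        return members(logLine).size() > 1;
    }
}
```

Note that a multi-member view only shows that the JGroups transport has formed a cluster; it does not by itself prove that a given cache is replicating between the members.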

       

      When Node A adds an entry to the cache, the Listener I have attached to the cache fires on Node A only.

      Nothing happens on Node B.

       

      The Listener is annotated as follows:

      @Listener(clustered=true)

       

       

       

      I am only listening for @CacheEntryCreated, which I believe is supported by Clustered Listeners.


       

      The following is the Cache Manager configuration:

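      import org.infinispan.configuration.cache.CacheMode;
      import org.infinispan.configuration.cache.ConfigurationBuilder;
      import org.infinispan.configuration.global.GlobalConfigurationBuilder;
      import org.infinispan.manager.DefaultCacheManager;
      import org.infinispan.manager.EmbeddedCacheManager;

      private EmbeddedCacheManager cacheManager = null;
      ..
      ..
      ..
      public void init() {
        // Default cache configuration: replicated, synchronous mode.
        ConfigurationBuilder cfg = new ConfigurationBuilder();
        CacheMode selectedCacheMode = CacheMode.REPL_SYNC;
        cfg.clustering().cacheMode(selectedCacheMode);
        cfg.indexing();
        cfg.clustering().create();

        // Clustered global configuration; the cluster name resolves to GAP_App_Cache_Cluster_REPL_SYNC.
        GlobalConfigurationBuilder globalConfigurationBuilder = GlobalConfigurationBuilder.defaultClusteredBuilder();
        globalConfigurationBuilder.globalJmxStatistics().allowDuplicateDomains(true);
        globalConfigurationBuilder.transport().clusterName("GAP_App_Cache_Cluster_"+selectedCacheMode);

        cacheManager = new DefaultCacheManager(globalConfigurationBuilder.build(), cfg.build());
        cacheManager.start();
      }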

       

      I have been trying for some time now to get this part working, so any help would be greatly appreciated.

       

      Many thanks

        • 1. Re: Listener (Clustered) on other node(s) not being called when item added to cache
          galder.zamarreno

          When Node A adds an entry to the cache, the Listener I have attached to the cache fires on Node A only.

          Nothing happens on Node B.

          Have you registered the cluster listener only in Node A? If so, that's how a clustered listener is supposed to work. A clustered listener gets a single invocation for each update on either Node A or Node B.

          • 2. Re: Listener (Clustered) on other node(s) not being called when item added to cache
            jadiyo1971

            Thanks for the reply.  Yes, both nodes register a clustered listener on the same cache.

            It is actually the same application running on 2 nodes, so they both do the same thing.

             

            However, when you say :

            "A clustered listener gets a single invocation for each update on either Node A or Node B."

             

            Does this suggest that only one node (A or B) in the cluster will receive the notification?

            • 3. Re: Listener (Clustered) on other node(s) not being called when item added to cache
              william.burns

              When you register a listener that is clustered it will receive all events from any node. So in this case your listener on A and listener on B would both be notified of the event.

               

              So I am not sure why you didn't get it on both. Is the data that was written accessible from both nodes after the put completes?
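To make the expected behaviour concrete, here is a toy model in plain Java. It is not Infinispan code: `ToyCluster`, `join`, `addClusteredListener` and `put` are hypothetical names chosen for illustration. It assumes only what is stated above: a clustered listener is notified of a write from any node in its cluster, whichever node it was registered on. If the nodes have not actually formed a cluster, only the listener on the writing node fires, which matches the reported symptom.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model (not Infinispan): delivery of create events to clustered listeners. */
final class ToyCluster {

    /** Stands in for a method annotated with @CacheEntryCreated. */
    interface CreatedListener {
        void onCreated(String key, String value);
    }

    private final Set<String> members = new HashSet<>();
    private final Map<String, List<CreatedListener>> listenersByNode = new HashMap<>();

    /** Marks a node as having joined the cluster. */
    void join(String node) {
        members.add(node);
    }

    /** Registers a clustered listener on the given node. */
    void addClusteredListener(String node, CreatedListener listener) {
        listenersByNode.computeIfAbsent(node, n -> new ArrayList<>()).add(listener);
    }

    /** Simulates a put on originNode and returns how many listeners were notified. */
    int put(String originNode, String key, String value) {
        int notified = 0;
        for (Map.Entry<String, List<CreatedListener>> entry : listenersByNode.entrySet()) {
            String node = entry.getKey();
            boolean sameCluster = members.contains(originNode) && members.contains(node);
            boolean local = node.equals(originNode);
            // A node outside the origin's cluster never hears about the write.
            if (sameCluster || local) {
                for (CreatedListener listener : entry.getValue()) {
                    listener.onCreated(key, value);
                    notified++;
                }
            }
        }
        return notified;
    }
}
```

With both nodes joined, a put on node A notifies the listeners registered on A and on B. If B never joined, the same put notifies only A's listener.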

              • 4. Re: Listener (Clustered) on other node(s) not being called when item added to cache
                jadiyo1971

                Thanks for responding William.

                Yes, the data is available on both nodes. I tested this by periodically dumping the data in the cache to the logs. It may help to know that the data was only available on both nodes when the cache had write-through to a database configured. When it was a purely in-memory cache, data was not synced between the nodes.

                • 5. Re: Listener (Clustered) on other node(s) not being called when item added to cache
                  william.burns

                  So this sounds like the cache is not clustered properly. I can't tell from what you posted here why that is. Do you have more than one cache definition?

                   

                  But the reason both nodes can see it seems to be because you have a shared cache store so they can see the data the other node wrote by accessing it from the database.
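The effect described here can be reproduced with a toy model in plain Java. It is not Infinispan code, and `ToyNodeCache` is a hypothetical name. It assumes two nodes whose in-memory caches do not replicate to each other but which write through to, and read through from, one shared store. Data written on one node then appears on the other even though nothing was sent between them.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not Infinispan): an unreplicated in-memory cache with an optional shared store. */
final class ToyNodeCache {

    private final Map<String, String> memory = new HashMap<>();
    private final Map<String, String> sharedStore; // stands in for the database; null means no store

    ToyNodeCache(Map<String, String> sharedStore) {
        this.sharedStore = sharedStore;
    }

    /** Writes locally and through to the store; nothing is sent to other nodes. */
    void put(String key, String value) {
        memory.put(key, value);
        if (sharedStore != null) {
            sharedStore.put(key, value);
        }
    }

    /** Reads memory first, then falls back to the store and caches what it finds. */
    String get(String key) {
        String value = memory.get(key);
        if (value == null && sharedStore != null) {
            value = sharedStore.get(key);
            if (value != null) {
                memory.put(key, value);
            }
        }
        return value;
    }
}
```

With a shared map passed to both instances, a put on one is readable from the other. With `null` passed as the store, the second instance sees nothing, which mirrors the purely in-memory behaviour reported above.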

                  • 6. Re: Listener (Clustered) on other node(s) not being called when item added to cache
                    jadiyo1971

                    Only 1 cache definition exists. What you mention about the database makes perfect sense, but clearly the clustering is not working correctly. What would be useful to post in addition to what is already here?

                     

                    Thanks again

                     

                    Nirmal

                    • 7. Re: Listener (Clustered) on other node(s) not being called when item added to cache
                      william.burns

                      If you enable TRACE level and capture the output, it should tell us pretty well.

                      • 8. Re: Listener (Clustered) on other node(s) not being called when item added to cache
                        jadiyo1971

                        Yes..I'll try that

                        • 9. Re: Listener (Clustered) on other node(s) not being called when item added to cache
                          silvaran

                          I'll +1 the issue with clustered listeners.  I need to get a sample project together that demonstrates the problem.  I had a distributed cache that would never, ever, ever, send @CacheEntryCreated events.  In fact it would send @CacheEntryModified events in their place, so I simply listened to that (in my use case I didn't need to distinguish between creates and updates, so that was fine).

                           

                          In another case, it would only occasionally send @CacheEntryCreated events, and would sometimes not send @CacheEntryModified events.  I removed that listener entirely and fired a JMS message over a topic to notify of cache entry puts.

                           

                          This is a project several years in development now with a few dozen caches and I can't imagine a clustering issue going unnoticed by everything but these cache entry event listeners.  Hopefully I'm able to reproduce it in a sample project.

                           

                          What would be helpful, and possibly faster, is if I had a set of log categories I could set to TRACE that might (hopefully) relate to events so I could do some diagnosing myself.  Any thoughts?

                          • 10. Re: Listener (Clustered) on other node(s) not being called when item added to cache
                            william.burns

                            Scott, I have to admit I am surprised to hear that; cluster listeners have been around for over 4 years now and this is the first I have really heard of such things. The main reason I find for events not being fired is that the cluster has not formed, as in the above post. There is one thing to note, though: a CREATE event can turn into a MODIFY if the primary owner crashes in the middle of processing a write with a non-transactional cache, but that should be extremely infrequent (when this occurs the CacheEntryModifiedEvent (Infinispan JavaDoc All 9.4.0.Final API) will be true btw).

                             

                            If you are able to get something to demonstrate the issue, we can definitely try to figure out what is going on!

                            • 11. Re: Listener (Clustered) on other node(s) not being called when item added to cache
                              silvaran

                              Huh... I'm also getting 2 POST-@CacheEntryExpired events on a perfectly healthy cluster.  Definitely going to look into this more before truly suggesting it's related. It happens on 2 different threads within the same millisecond, too.  The listener itself is declared @Listener( primaryOnly = true ).  I remember reading something about multiple expiry events in the docs but I thought that was during a failover or something.  Will keep you posted on the main issue.

                              • 12. Re: Listener (Clustered) on other node(s) not being called when item added to cache
                                william.burns

                                Unfortunately expired events are a bit special. They are initiated outside of locking and thus multiple concurrent reads can cause more than one event. And we err on the side of giving possibly an extra one just in case. This can be especially apparent when doing a read from a non owner node. In this case this could cause a read from a primary and backup node for example. If both of these see the expired entry, they may both cause an expired event. One thing to note is that these events are ordered internally, so you don't have to worry about an expired, create and expired event, for example.

                                 

                                There were some more improvements in 9.4.0 as well around expiration, but unfortunately, under concurrent reads of an expired entry, there is still a chance of getting more than one event.
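Since duplicate expiration events are possible, one option is to make the handler tolerant of them. The following is a minimal sketch in plain Java, not an Infinispan facility: `ExpiredEventDeduplicator`, `shouldProcess` and the time window are hypothetical. It assumes the duplicates arrive in the same JVM within a short interval, as in the case reported above (two threads within the same millisecond). It does not deduplicate across nodes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Suppresses repeated expiration notifications for one key that arrive within a short window. */
final class ExpiredEventDeduplicator<K> {

    private final ConcurrentMap<K, Long> lastAcceptedMillis = new ConcurrentHashMap<>();
    private final long windowMillis;

    ExpiredEventDeduplicator(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Returns true if the event should be processed, false if it looks like a duplicate. */
    boolean shouldProcess(K key, long nowMillis) {
        boolean[] accepted = {false};
        // compute() runs atomically per key, so two threads racing on the same key
        // cannot both be accepted inside one window.
        lastAcceptedMillis.compute(key, (k, previous) -> {
            if (previous == null || nowMillis - previous >= windowMillis) {
                accepted[0] = true;
                return nowMillis;
            }
            return previous;
        });
        return accepted[0];
    }
}
```

A listener method would call `shouldProcess(event.getKey(), System.currentTimeMillis())` and return early on `false`. The map in this sketch grows with the number of distinct keys, so a real handler would also need to evict old entries.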

                                • 13. Re: Listener (Clustered) on other node(s) not being called when item added to cache
                                  silvaran

                                  It's definitely a difficult problem to solve.  Now in my case the listener isn't clustered; would these duplicates disappear if I just set clustered=true instead of primaryOnly=true? Funny I'm also seeing CacheEntryExpiredEvent.getValue() return null...

                                   

                                  Wait a minute.... if I have primaryOnly=false and clustered=true for a listener, how could @CacheEntryExpired get fired on TWO different nodes for the exact same key (one getValue()==null).  Is that because it's a distributed cache? Wouldn't there still be exactly one primary owner that doesn't change for a given key assuming a healthy cluster?

                                   

                                  Is there any way my cluster configuration could be messed up enough to break listener functionality but work just fine on absolutely everything else?

                                   

                                  Edit: Looks like the null itself is expected: https://blog.infinispan.org/2015/10/expiration-enhancements.html

                                       "[...] the first event will contain the value (if applicable) and any others will show a null value."

                                  • 14. Re: Listener (Clustered) on other node(s) not being called when item added to cache
                                    william.burns

                                    silvaran  wrote:

                                     

                                    Wait a minute.... if I have primaryOnly=false and clustered=true for a listener, how could @CacheEntryExpired get fired on TWO different nodes for the exact same key (one getValue()==null).  Is that because it's a distributed cache? Wouldn't there still be exactly one primary owner that doesn't change for a given key assuming a healthy cluster?

                                    A cluster listener is special in that it receives all notifications you are registered for and that pass the provided filter if present, irrespective of what node it is registered on. If you had the event fired on two different nodes, it sounds like you have two different clustered listeners. The event received by cluster listeners will always be the same unless there is a topology change and then one or more nodes may or may not get a duplicate, with the retry flag set (not counting expiration).

                                     

                                    silvaran  wrote:

                                     

                                    Edit: Looks like the null itself is expected: https://blog.infinispan.org/2015/10/expiration-enhancements.html

                                         "[...] the first event will contain the value (if applicable) and any others will show a null value."

                                    Yes, this is what I was referring to about concurrent access. Technically you can also get a null value if an entry expires from a cache store and isn't present in memory. This is a legacy thing that was done for performance a long while back, and we haven't had a chance to fix it yet.