7 Replies Latest reply on May 30, 2017 8:50 AM by jaikiran

    Is there a way to refresh/update org.jboss.ejb.client.ClusterContext view ?

    djeezus

      Hi all,

       

      we're running WildFly 8.2, jgroups-3.4.5 and jboss-ejb-client 2.1.3 (also tried with the latest, 2.1.8). We have an issue where the RandomClusterNodeSelector does not seem to find all the nodes in the cluster.  Or at least not all the time; or it's just really bad at randomizing, so that all requests end up on the same node...

       

      All clustering seems to work as expected: nodes added get recognized by the cluster (MERGE successful), nodeViews are updated properly (subsystem=jgroups/channel=ejb shows all nodes), and if the EJB-client was already running, it notices this new view, and "org.jboss.ejb.client.RandomClusterNodeSelector" starts using the newly added node.  When a clusterNode is stopped, it's taken out of the View, and all works fine ...

       

      The problem now is that when a clusterNode is restarted and an ejb-client (which is also a WildFly server) had just selected this node for invocation of a remote EJB, that node is taken out of the ClusterContext entirely, and even when this same clusterNode comes back up, rejoins the cluster properly and becomes available again, it never gets selected again.

       

      So, is there a way to refresh this ClusterContext manually and/or programmatically (or even just to view it at runtime, via debug/trace log4j or jboss-cli)?  Or maybe a way to figure out why the ejb-client does not pick the other nodes ... does it even know about them?

       

      grtz,

      laerg

        • 1. Re: Is there a way to refresh/update org.jboss.ejb.client.ClusterContext view ?
          dmlloyd

          The RandomClusterNodeSelector generally only selects among connected nodes when there is at least one node connected, rather than choosing among all nodes.  You can provide an alternative implementation which selects from among all available nodes instead though.
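
To make that behaviour concrete, here is a minimal sketch of a connected-first random selection. The `ClusterNodeSelector` interface below is a local stand-in declared only so the snippet compiles on its own; it mirrors the shape of `org.jboss.ejb.client.ClusterNodeSelector` as I understand it from the 2.x sources, so check it against your version. `ConnectedFirstRandomSelector` is a hypothetical name, not the library's class.

```java
import java.util.concurrent.ThreadLocalRandom;

// Local stand-in (assumption) for org.jboss.ejb.client.ClusterNodeSelector,
// declared here only so the sketch compiles without the library.
interface ClusterNodeSelector {
    String selectNode(String clusterName, String[] connectedNodes, String[] availableNodes);
}

// Hypothetical class: picks randomly among connected nodes when there are any,
// and only falls back to the full list of available nodes otherwise.
class ConnectedFirstRandomSelector implements ClusterNodeSelector {
    @Override
    public String selectNode(String clusterName, String[] connectedNodes, String[] availableNodes) {
        // Assumes at least one of the two arrays is non-empty.
        String[] pool = connectedNodes.length > 0 ? connectedNodes : availableNodes;
        return pool[ThreadLocalRandom.current().nextInt(pool.length)];
    }
}
```

With one connected node and two available nodes, this sketch always returns the connected one, which from the outside looks like all requests ending up on the same node.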

          • 2. Re: Is there a way to refresh/update org.jboss.ejb.client.ClusterContext view ?
            djeezus

            Hi David,

             

            thank you for responding; in my experience, the RandomClusterNodeSelector actually does randomly select connected nodes ... it's just that "sometimes" it doesn't.  I'd like to investigate this "sometimes".  I've put org.jboss.ejb.client in DEBUG, so I can see the selection itself, but I cannot see the selection process.  I can also see the exclusion of a node if it effectively goes down ...
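
One way to see what goes into each selection could be a small wrapper that logs the arrays the client hands to the selector before delegating. This is only a sketch: `ClusterNodeSelector` is a local stand-in for the library interface (signature assumed from the 2.x sources), and `LoggingSelector` is a hypothetical name. If the client instantiates selectors by class name, a real version would presumably need a public no-arg constructor that builds its own delegate; that part is left out here.

```java
import java.util.Arrays;
import java.util.logging.Logger;

// Local stand-in (assumption) for org.jboss.ejb.client.ClusterNodeSelector.
interface ClusterNodeSelector {
    String selectNode(String clusterName, String[] connectedNodes, String[] availableNodes);
}

// Hypothetical wrapper: delegates the decision, then logs the inputs and the result.
class LoggingSelector implements ClusterNodeSelector {
    private static final Logger LOG = Logger.getLogger(LoggingSelector.class.getName());
    private final ClusterNodeSelector delegate;

    LoggingSelector(ClusterNodeSelector delegate) {
        this.delegate = delegate;
    }

    @Override
    public String selectNode(String clusterName, String[] connectedNodes, String[] availableNodes) {
        String chosen = delegate.selectNode(clusterName, connectedNodes, availableNodes);
        LOG.info(String.format("cluster=%s connected=%s available=%s chosen=%s",
                clusterName, Arrays.toString(connectedNodes),
                Arrays.toString(availableNodes), chosen));
        return chosen;
    }
}
```

Comparing the logged "connected" and "available" arrays over time should show whether a restarted node ever reappears in either of them.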

             

            I'll follow your suggestion and try to implement a custom selection implementation; I see that in jboss-ejb-client-4.0.x there are other implementations added by default (RoundRobinConnectedNode would be ideal).

            I guess I cannot just add the jboss-ejb-client-4.0.x in this wildfly-8.2 because of dependencies... any other way of implementing these on wildfly-8.2 ?
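
For reference, a round-robin selector can be sketched against the 2.x-style interface without pulling in 4.0.x. This is not the 4.0.x implementation, just an illustration: `ClusterNodeSelector` is a local stand-in for the library interface (signature assumed), `RoundRobinSelector` is a hypothetical name, and the sketch assumes the node arrays arrive in a stable order between calls.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Local stand-in (assumption) for org.jboss.ejb.client.ClusterNodeSelector.
interface ClusterNodeSelector {
    String selectNode(String clusterName, String[] connectedNodes, String[] availableNodes);
}

// Hypothetical class: cycles through the connected nodes, or through the
// available nodes when nothing is connected yet.
class RoundRobinSelector implements ClusterNodeSelector {
    private final AtomicInteger counter = new AtomicInteger();

    @Override
    public String selectNode(String clusterName, String[] connectedNodes, String[] availableNodes) {
        // Assumes at least one of the two arrays is non-empty.
        String[] pool = connectedNodes.length > 0 ? connectedNodes : availableNodes;
        // Mask the sign bit so the index stays non-negative after the counter overflows.
        int index = (counter.getAndIncrement() & Integer.MAX_VALUE) % pool.length;
        return pool[index];
    }
}
```

The counter is shared across calls, so the sketch relies on the client reusing one selector instance per cluster; whether it does is worth verifying.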

             

            thnx for your time,

            Gert

            • 3. Re: Is there a way to refresh/update org.jboss.ejb.client.ClusterContext view ?
              jaikiran

              In 2.x of jboss-ejb-client the RandomClusterNodeSelector did indeed take into account unconnected (but available) cluster nodes during selection: jboss-ejb-client/RandomClusterNodeSelector.java at 2.x · jbossas/jboss-ejb-client · GitHub

               

              The information about nodes that belong to a cluster is transmitted by the server to the client and is handled here: jboss-ejb-client/ClusterTopologyMessageHandler.java at 2.x · jbossas/jboss-ejb-client · GitHub

               

              Ideally enabling TRACE level logging of org.jboss.ejb.client should give some details on what's going on.

              • 4. Re: Is there a way to refresh/update org.jboss.ejb.client.ClusterContext view ?
                djeezus

                Hi,

                 

                indeed, it does, but only sometimes ... I don't seem to find any logic in when or why it does randomly select nodes, and when or why it stays with the one it's connected to initially.  Maybe there is some timeout/refresh/recheck option I'm not aware of in ejb-client clusterContext ?

                 

                Anyway, I've implemented a custom selector with only the availableNodes part, so I left out the connectedNodes check...

                 

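                // body of selectNode(clusterName, connectedNodes, availableNodes)
                // only one candidate: nothing to randomize
                if (availableNodes.length == 1) {
                    return availableNodes[0];
                }
                // otherwise pick uniformly at random among all available nodes
                final Random random = new Random();
                final int randomSelection = random.nextInt(availableNodes.length);
                return availableNodes[randomSelection];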

                 

                When all clusterNodes are up and running, the ejb-client does randomly pick nodes, but when I shut down a node, it keeps randomly selecting this node for requests (which obviously fail), even though it's not part of the cluster any more (I checked in jgroups via cli to verify).  It looks as though there is some sort of "cache" inside the ejb-client's clusterContext that keeps this node in the availableList.  Surely it's not supposed to be like this ... or am I missing something ?

                 

                grtz & thnx,

                Gert

                • 5. Re: Is there a way to refresh/update org.jboss.ejb.client.ClusterContext view ?
                  jaikiran

                  To me, at this point, it looks like whichever WildFly server node(s) went down aren't communicating to this client that they have gone down and can no longer serve that application (EJBs). The communication protocol is an EJB-client-specific protocol and does _not_ rely on jgroups. So the things to look into are:

                   

                  1. How is that node going down? Cleanly? Or kill -9 kind of thing?

                  2. Each server instance is supposed to have a unique jboss.node.name. Make sure your server instances do indeed have a unique value for this system property(?). It's been a while since I was involved in this code, so I don't fully remember how you ensure that.

                  3. Enable TRACE level logs on org.jboss.ejb.client package on the client side and TRACE level logs of org.jboss.as.ejb3 on (each of the) server nodes and have those logs uploaded here in this thread (you can click on "Use advanced editor" on top right side of the message editor window, which then allows you to upload attachments)

                  • 6. Re: Is there a way to refresh/update org.jboss.ejb.client.ClusterContext view ?
                    djeezus

                    Hi,

                     

                    sorry for taking this long to reply, but I was knee-deep in mcast / router / igmp and such ...  As we don't have a "real" router in between our cluster nodes (the load balancer also acts as gateway), there was no IGMP routing between nodes.  That was why the cluster sometimes formed and sometimes didn't, depending on the switch's multicast port timeout.  After enabling mcast routing, everything was working fine and stable.

                     

                    1. Nodes are always going down cleanly, and now also clusterView is updated appropriately

                    2. unique jboss.node.name : ${hostname}

                    3. TRACE wasn't much help, except that I noticed that even now when all clustering is 100% working, sometimes the "connected" list is not updated when a cluster node returns (MERGE3).  That's why I've opted to write a custom selector that only picks from "connected" list if there's more than 1, otherwise pick from "available". 
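
The selector described in point 3 could look roughly like the sketch below. It is an illustration, not the code actually deployed: `ClusterNodeSelector` is a local stand-in for the library interface (signature assumed from the 2.x sources) and `PreferMultipleConnectedSelector` is a hypothetical name.

```java
import java.util.Random;

// Local stand-in (assumption) for org.jboss.ejb.client.ClusterNodeSelector.
interface ClusterNodeSelector {
    String selectNode(String clusterName, String[] connectedNodes, String[] availableNodes);
}

// Hypothetical class: uses the "connected" list only when it offers a real
// choice (more than one node); otherwise picks from the "available" list.
class PreferMultipleConnectedSelector implements ClusterNodeSelector {
    private final Random random = new Random();

    @Override
    public String selectNode(String clusterName, String[] connectedNodes, String[] availableNodes) {
        String[] pool = connectedNodes.length > 1 ? connectedNodes : availableNodes;
        if (pool.length == 0) {
            // Nothing advertised as available: fall back to whatever is connected.
            pool = connectedNodes;
        }
        if (pool.length == 0) {
            throw new IllegalStateException("no cluster nodes to select from");
        }
        return pool[random.nextInt(pool.length)];
    }
}
```

Picking from "available" when only one node is connected should make the client open connections to the other nodes over time, at the cost of selecting a stale node if the "available" list itself is out of date.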

                     

                    I'm not quite sure what the difference is between "connected" and "available" nodes for the EJB-client.  When I bring down a cluster node, and it was in "connected" & "available" list, it is effectively removed from "connected" & "available" list; when it comes back up and joins the cluster, it's put back in "available" list, but not in the "connected" (or at least not always).  If I restart the ejb-client however, both (or all) clustered nodes are "connected" & "available" ...

                     

                    What is the mechanism for ending up in "connected" and/or "available" list ?

                     

                    grtz,

                    Gert

                     

                    PS : the initial EJB-client call, is towards a VIP, so it could be that an initial request ends up multiple times on the same host. But imo that shouldn't matter, because ejb-client will only put clustered nodes in its context, right ?

                    • 7. Re: Is there a way to refresh/update org.jboss.ejb.client.ClusterContext view ?
                      jaikiran

                      Gert Vandelaer wrote:

                       

                       

                      What is the mechanism for ending up in "connected" and/or "available" list ?

                       

                      The "connected" and "available" lists are really an implementation detail of the EJB client library. It essentially comes down to this: does the EJB client library already have an established connection (== connected) to one of those nodes, so that it can reuse that connection and node to handle the EJB invocation, instead of creating a new connection to one of the "available" nodes in the cluster?