8 Replies Latest reply on Aug 6, 2010 12:54 PM by vblagojevic

    JGroups - Server vs. Client

    shane_dev

      I am wondering when should a node become a JGroups server, and when should it become a JGroups client?

       

      I had thought that the default behavior was for every node to become a server. However, we are seeing that by default they are becoming clients.

       

      There is some inconsistency though. We have 5 clusters set up. During one of our test runs, one cluster (OHS) became server-server while the remaining clusters became server-client.

       

      I did notice that a merge took place when the OHS cluster started up. I'm not sure why though. I didn't see any timeouts or suspect messages. As a matter of fact, the merge takes place right in the middle of a state transfer that begins right after one node has successfully joined the other.

       

      My next question is...

      Why did a merge take place in the OHS cluster?

       

      Other than the merge, the OHS cluster seemed to start up just like the other 4. I didn't see any other differences in the logs. However, I have attached them along with some of my own comments. Perhaps you can see something that I missed.

       

      It seems to me that when the nodes start up as clients they do not run the MERGE2 protocol, so they never send 'get members' requests. They only respond to them.

       

      ohs-node-a-clean.txt

      There are two nodes (A and B) in the OHS cluster.

       

      trs-node-a-clean.txt

      There are two nodes (A and B) in the TRS cluster. There are actually 4 separate TRS clusters. However, since they all started up identically to each other, I've attached an example from just one of them.

        • 1. Re: JGroups - Server vs. Client
          galder.zamarreno

          I'll let the JGroups guys talk about the specifics of server/client, since it's linked to their internals.

           

          In the past I've seen merges happen when nodes in the cluster were started at exactly the same time. Staggering startups solved the issue. Other times, I've seen merges when the underlying multicast communication was unreliable. For example, nodes would not join initially, but after a few minutes they'd find each other via a merge and join then. I don't think the latter is your case, but it's worth keeping in mind.
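          As a rough illustration of staggering, here is a minimal sketch that assigns each node a start delay based on its position in a list. The class name, node names and the 30 second gap are assumptions made up for the example, not anything taken from JGroups or from the attached configuration.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of JGroups: computes when each node should be started.
final class StaggeredStartup {

    /** The first node starts immediately; every later node waits one more gap. */
    static Map<String, Duration> startDelays(List<String> nodes, Duration gap) {
        Map<String, Duration> delays = new LinkedHashMap<>();
        for (int i = 0; i < nodes.size(); i++) {
            delays.put(nodes.get(i), gap.multipliedBy(i));
        }
        return delays;
    }

    public static void main(String[] args) {
        // Assumed values: two nodes and a 30 second gap between starts.
        Map<String, Duration> delays =
                startDelays(Arrays.asList("node-a", "node-b"), Duration.ofSeconds(30));
        delays.forEach((node, delay) ->
                System.out.println(node + " starts after " + delay.getSeconds() + "s"));
    }
}
```

          The gap only needs to be long enough for the first node to finish forming the cluster before the next one begins discovery; the right value depends on your discovery timeouts.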

           

          What does your JGroups config look like?

          • 2. Re: JGroups - Server vs. Client
            shane_dev

            Here are the JGroups files for the OHS and TRS clusters. The only difference is that we've been adjusting the MPING timeout/retry values because of some multicasting issues we came across.

            • 3. Re: JGroups - Server vs. Client
              vblagojevic

              Shane,

               

              So the main question is: why did that merge request happen at all? Was this a one-time event, or can you reproduce it frequently? How big is the state you are transferring?

               

              Here is my theory. State transfer took a while (the logs indicate at least 4 sec). In the meantime, the merge protocol kicked in, discovered that there were two members who were both coordinators, and initiated a merge. That led us into nondeterministic territory.
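              To make the theory concrete, here is a simplified model of the check described above: collect the discovery replies, and if they name more than one distinct coordinator, a merge is called for. This is a sketch under that assumption only. The `Reply` type and the method names are hypothetical and are not JGroups classes or the actual MERGE2 code.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified model of merge detection; not the JGroups implementation.
final class MergeCheck {

    /** One discovery reply: who answered, and whom they consider the coordinator. */
    static final class Reply {
        final String sender;
        final String coordinator; // null if the sender does not know a coordinator yet

        Reply(String sender, String coordinator) {
            this.sender = sender;
            this.coordinator = coordinator;
        }
    }

    /** Distinct coordinators named across all replies, ignoring unknown (null) ones. */
    static Set<String> coordinators(List<Reply> replies) {
        Set<String> coords = new LinkedHashSet<>();
        for (Reply r : replies) {
            if (r.coordinator != null) {
                coords.add(r.coordinator);
            }
        }
        return coords;
    }

    /** A merge is needed when the replies disagree about who the coordinator is. */
    static boolean mergeNeeded(List<Reply> replies) {
        return coordinators(replies).size() > 1;
    }

    public static void main(String[] args) {
        // Assumed situation during a slow join: B still reports itself as coordinator.
        List<Reply> duringJoin = Arrays.asList(new Reply("A", "A"), new Reply("B", "B"));
        // Once the join has completed, both nodes name A.
        List<Reply> settled = Arrays.asList(new Reply("A", "A"), new Reply("B", "A"));

        System.out.println("during join: merge needed = " + mergeNeeded(duringJoin));
        System.out.println("settled:     merge needed = " + mergeNeeded(settled));
    }
}
```

              In this model, a discovery round that happens to land inside the join window sees two coordinators and asks for a merge, while a round before or after it does not. That would fit a merge that shows up only in some test runs.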

               

              I am still thinking about this use case. Can you please confirm my assumptions and we'll take it from there.

               

              Regards,

              Vladimir

              • 4. Re: JGroups - Server vs. Client
                shane_dev

                Hi Vladimir,

                 

                That may be it. To be honest, I didn't see a merge take place in the last couple of test runs. That, and it didn't cause the caches to be out of sync since it happened during the state transfer. I suppose I was just curious as to what may have happened. If we see it again, I'll dig deeper in the logs. Perhaps I did miss a suspect message somewhere.

                 

                The other issue is that we are curious as to when a node should stay a client or become a server. Again, it doesn't directly affect the cache consistency but we are curious. We'd like to get it running to a point where the behavior is predictable and nothing 'out of the ordinary' is happening.

                 

                Thanks,
                Shane

                • 5. Re: JGroups - Server vs. Client
                  vblagojevic

                  Hey Shane,

                   

                  Which particular part of the server-client transition are you interested in? I'll have to look up the details myself, but off the top of my head: a node becomes a server as soon as it joins a cluster and becomes a client after it leaves the cluster.

                   

                  Vladimir

                  • 6. Re: JGroups - Server vs. Client
                    shane_dev

                    That was my understanding as well. None of the nodes should really be operating as a client in an active cluster.

                     

                    What bothered us was that there was some output on the nodes that was similar to '1 servers (1 coord), 1 clients', while in the other cluster we saw '2 servers (1 coord), 0 clients'.
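                    For anyone who wants to scan logs for this, here is a minimal sketch that pulls the three counts out of a line shaped like the output quoted above and flags any line reporting clients. The pattern is based only on the two examples in this post; the real log line may carry extra text or differ in detail, and the class and method names are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical log helper; the pattern is inferred from the two quoted examples only.
final class DiscoverySummary {

    private static final Pattern SUMMARY =
            Pattern.compile("(\\d+) servers? \\((\\d+) coord\\), (\\d+) clients?");

    final int servers;
    final int coords;
    final int clients;

    private DiscoverySummary(int servers, int coords, int clients) {
        this.servers = servers;
        this.coords = coords;
        this.clients = clients;
    }

    /** Returns the parsed counts, or null if the line contains no summary. */
    static DiscoverySummary parse(String logLine) {
        Matcher m = SUMMARY.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        return new DiscoverySummary(
                Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3)));
    }

    /** True when at least one responder was counted as a client. */
    boolean hasClients() {
        return clients > 0;
    }

    public static void main(String[] args) {
        String[] lines = {
            "1 servers (1 coord), 1 clients",
            "2 servers (1 coord), 0 clients"
        };
        for (String line : lines) {
            DiscoverySummary s = parse(line);
            System.out.println(line + " -> hasClients=" + s.hasClients());
        }
    }
}
```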

                     

                    In addition, in the cluster where we saw the client messages ( > 0) we noticed that those nodes never send out the 'get members' requests to detect a merge scenario. They only respond.

                     

                    Just something we are curious about. Nothing is broken per se.

                    • 7. Re: JGroups - Server vs. Client
                      vblagojevic

                      Hey Shane,

                       

                      "In addition, in the cluster where we saw the client messages ( > 0) we noticed that those nodes never send out the 'get members' requests to detect a merge scenario. They only respond."

                       

                      Please capture this in a log if you can, on both nodes. It would be great to analyze such a trace, as I cannot envision how this can happen. It could possibly be a scenario where the outbound thread (a sender thread) died and no outbound traffic is being sent at all. We took special care in reviewing the code for 2.10.GA to make sure this never happens.

                       

                      Best regards,

                      Vladimir

                      • 8. Re: JGroups - Server vs. Client
                        vblagojevic

                        Thought of this just as I clicked send. Is it that all outbound traffic is stopped, or only the 'get members' requests? If it is only 'get members', it could possibly be a bug in the discovery protocol.

                         

                        Thanks,

                        Vladimir