17 Replies Latest reply on Jan 25, 2007 12:24 PM by Manik Surtani

    Buddy replication behavior

    Gianluca Puggelli Newbie

      Hi all,
      I'm testing the buddy replication with JBossCache version 1.4.0-SP1.
      I found a strange behavior, so I'm wondering if it is correct.

      Suppose we have three caches A, B and C, each configured with:

       <attribute name="BuddyReplicationConfig">
         <config>
           <buddyReplicationEnabled>true</buddyReplicationEnabled>
           <buddyLocatorClass>org.jboss.cache.buddyreplication.NextMemberBuddyLocator</buddyLocatorClass>
           <buddyCommunicationTimeout>50000</buddyCommunicationTimeout>

           <buddyLocatorProperties>
             numBuddies = 1
             ignoreColocatedBuddies = true
           </buddyLocatorProperties>

           <dataGravitationRemoveOnFind>false</dataGravitationRemoveOnFind>
           <dataGravitationSearchBackupTrees>true</dataGravitationSearchBackupTrees>
           <autoDataGravitation>false</autoDataGravitation>
         </config>
       </attribute>
      


      If the caches are started in the order A, B, C, then the replication chain
      is: A replicates to B, B replicates to C and C replicates to A:

      A -> B -> C -> A
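
The ring above can be sketched in plain Java. This is only a simplified model of "next member" selection with numBuddies = 1, not the NextMemberBuddyLocator source; the class name BuddyRing and the use of plain strings as addresses are assumptions made for illustration, and ignoreColocatedBuddies is not modelled.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of "next member" buddy selection with numBuddies = 1.
// Not JBoss Cache code: BuddyRing and the String addresses are hypothetical.
class BuddyRing {
    // Returns the member that follows `member` in the view, wrapping around.
    static String buddyOf(List<String> view, String member) {
        int i = view.indexOf(member);
        if (i < 0) {
            throw new IllegalArgumentException(member + " is not in the view");
        }
        if (view.size() < 2) {
            return null; // a lone member has nobody to replicate to
        }
        return view.get((i + 1) % view.size());
    }

    public static void main(String[] args) {
        List<String> view = Arrays.asList("A", "B", "C");
        for (String m : view) {
            System.out.println(m + " -> " + buddyOf(view, m));
        }
    }
}
```

With the view [A, B, C] this prints A -> B, B -> C and C -> A. With the view [B, C] (after A is gone) the same function returns B as the buddy of C.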

      Suppose we store node 1 on A, 2 on B and 3 on C with the call:
       cache.put(new Fqn(new Integer(x)), key, value);
      

      where x is 1, 2 or 3 and key/value are some data.

      The caches then contain the following (all three caches are running on
      the 192.168.0.4 host):

      A (192.168.0.4:33510)
       /
         _BUDDY_BACKUP_
           192.168.0.4_33517
             3
         1


      B (192.168.0.4:33512)
       /
         _BUDDY_BACKUP_
           192.168.0.4_33510
             1
         2


      C (192.168.0.4:33517)
       /
         _BUDDY_BACKUP_
           192.168.0.4_33512
             2
         3
      

      If the process hosting cache A is killed, then the data contained in
      the other two is the following:

      B (192.168.0.4:33512)
       /
         _BUDDY_BACKUP_
           192.168.0.4_33510
             1
           192.168.0.4_33517
             3
         2


      C (192.168.0.4:33517)
       /
         _BUDDY_BACKUP_
           192.168.0.4_33512
             2
         3
      

      I think that this is not correct; it should be:

      B (192.168.0.4:33512)
       /
         _BUDDY_BACKUP_
           192.168.0.4_33517
             3
         2
         1


      C (192.168.0.4:33517)
       /
         _BUDDY_BACKUP_
           192.168.0.4_33512
             2
             1
         3
      


      Thanks in advance for your time.

      best regards
      gianluca
      --
      Gianluca Puggelli
      skype:pugg1138

        • 1. Re: Buddy replication behavior
          Manik Surtani Master

          You are correct in what you expect; that is what should happen. This is proven by the unit test BuddyReplicationFailoverTest.testGravitationKillOwner().

          Just tried this and the test works fine.

          Do you have a unit test that shows the problem?

          • 2. Re: Buddy replication behavior
            Gianluca Puggelli Newbie

            Hi Manik,
            unfortunately I don't have a unit test, but I will try to write one.

            I had a look at the testGravitationKillOwner() method. In it, after the kill (stopService), the get method is explicitly called and only then is the data gravitated to another node.
            Maybe I'm wrong, but the behavior that I expect should happen right after the owner is killed, without invoking any methods. And it should happen even if gravitation is completely disabled.

            thanks and regards
            gianluca
            --
            Gianluca Puggelli
            skype:pugg1138

            • 3. Re: Buddy replication behavior
              Gianluca Puggelli Newbie

              Manik, this is a test case that covers this use case:

              public class BuddyReplicationFailoverTest extends BuddyReplicationTestsBase
              {
              ...
              
                 public void testReplication() throws Exception
                 {
                    caches = createCaches(3, false, false, false);

                    final String[] fqns = { "/one", "/two", "/three" };
                    // Backup location of each node, under the group of the cache that originally owns it
                    final String[] backupFqns = {
                       "/" + BuddyManager.BUDDY_BACKUP_SUBTREE + "/"
                          + BuddyManager.getGroupNameFromAddress(caches[0].getLocalAddress()) + fqns[0],

                       "/" + BuddyManager.BUDDY_BACKUP_SUBTREE + "/"
                          + BuddyManager.getGroupNameFromAddress(caches[1].getLocalAddress()) + fqns[1],

                       "/" + BuddyManager.BUDDY_BACKUP_SUBTREE + "/"
                          + BuddyManager.getGroupNameFromAddress(caches[2].getLocalAddress()) + fqns[2],
                    };

                    caches[0].put(fqns[0], key, value);
                    caches[1].put(fqns[1], key, value);
                    caches[2].put(fqns[2], key, value);

                    dumpCacheContents(caches);

                    // Kill the first cache and give the cluster time to notice
                    caches[0].stopService();
                    caches[0] = null;
                    TestingUtil.sleepThread(500);

                    dumpCacheContents(caches);

                    // Check "/one" and "/two", matching the assertion message
                    assertTrue("caches[1] should contain \"one\" and \"two\"",
                       caches[1].exists(fqns[0]) && caches[1].exists(fqns[1]));
                    assertTrue("caches[1] should contain the \"three\" backup", caches[1].exists(backupFqns[2]));

                    assertTrue("caches[2] should contain \"three\"", caches[2].exists(fqns[2]));
                    assertTrue("caches[2] should contain the \"one\" and \"two\" backups",
                       caches[2].exists(backupFqns[0]) && caches[2].exists(backupFqns[1]));
                 }
              ...
              }
              


              The first assertion fails. This is what is printed before the first cache kill:

              **** START: Cache Contents ****
               ** Cache 0 is 192.168.0.4:33266

              /one
              /_BUDDY_BACKUP_
                /192.168.0.4_33270
                  /three

               ** Cache 1 is 192.168.0.4:33268

              /_BUDDY_BACKUP_
                /192.168.0.4_33266
                  /one
              /two

               ** Cache 2 is 192.168.0.4:33270

              /three
              /_BUDDY_BACKUP_
                /192.168.0.4_33268
                  /two

              **** END: Cache Contents ****
              


              While this is what is printed after the kill:

              **** START: Cache Contents ****
               ** Cache 0 is null!
               ** Cache 1 is 192.168.0.4:33268

              /_BUDDY_BACKUP_
                /192.168.0.4_33270
                  /three
                /192.168.0.4_33266
                  /one
              /two

               ** Cache 2 is 192.168.0.4:33270

              /three
              /_BUDDY_BACKUP_
                /192.168.0.4_33268
                  /two

              **** END: Cache Contents ****
              


              The same as my original post.

              Thanks and regards
              gianluca

              --
              Gianluca Puggelli
              skype:pugg1138

              • 4. Re: Buddy replication behavior
                Manik Surtani Master

                Correct, this is what is expected.

                You need to

                1) enable gravitation explicitly if you want to pull data out of a potential backup scenario. This is necessary as a separate option to prevent calls from trying to gravitate data that does not exist. E.g., if I try

                String[] s = {"/one", "/two", "/three", "/four", "/five",
                 "/six", "/seven", "/eight", "/nine", "/ten"};
                
                for (String fqn : s) cache[1].get(fqn);
                


                I don't want expensive network calls (esp. if the cluster is big) to go out when looking for nodes four to ten.

                This is why, when you learn of a view-change event (perhaps by using a listener), you can execute gravitation calls yourself.

                2) Gravitation should not happen automatically - only when a gravitation call occurs, and even then, only for the node being called. This is to prevent a "network storm" when a node dies. Let's assume each node has 1GB of data. If a node dies, I don't want 1GB of network traffic of data being gravitated, since this may then kill other nodes or cause the network to be unresponsive.

                This is why this happens lazily, when a node is requested.
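
The cost argument in points 1) and 2) can be illustrated with a toy model in plain Java. It is not JBoss Cache code: the class LazyGravitationModel, its fields and the remoteLookups counter are all made up for this sketch, and it assumes remove-on-find semantics (a gravitated node leaves the backup and becomes local).

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of lazy, per-node gravitation. Hypothetical names throughout.
class LazyGravitationModel {
    final Set<String> local = new HashSet<>();         // nodes owned by this cache
    final Set<String> remoteBackups = new HashSet<>(); // nodes that only exist in backup subtrees elsewhere
    int remoteLookups = 0;                             // stands in for cluster-wide network calls

    // Returns true if the node is available locally after the call.
    boolean get(String fqn, boolean forceGravitation) {
        if (local.contains(fqn)) {
            return true;            // local hit, no network traffic
        }
        if (!forceGravitation) {
            return false;           // plain miss, still no network traffic
        }
        remoteLookups++;            // only a forced lookup goes out to the cluster
        if (remoteBackups.remove(fqn)) {
            local.add(fqn);         // gravitate just this one node
            return true;
        }
        return false;
    }
}
```

In this model, looking up ten nodes without forcing gravitation costs zero remote lookups, while forcing it costs one lookup per miss, and only the node actually requested is moved.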

                Hope this helps,
                Manik

                • 5. Re: Buddy replication behavior
                  Gianluca Puggelli Newbie

                  Hello again,

                  Point 1) is clear, but I have some doubts about point 2).

                  Let me divide the data stored in a cache into primary and backup. When a node dies, the primary data is not automatically gravitated, to avoid a network storm. If so, why is automatic gravitation done for the backup?
                  In fact, in the example that I posted, the primary data contained in node C, or cache 2 (the node /three), is automatically copied into the backup data of node B (or cache 1).

                  Also, when a node dies, the cluster is not 'homogeneous' anymore: some data has a backup and some doesn't. In this situation, without specific application code that forces the gravitation, another fault can cause information loss.

                  thanks and regards
                  gianluca
                  --
                  Gianluca Puggelli
                  skype:pugg1138

                  • 6. Re: Buddy replication behavior
                    Manik Surtani Master

                    There is no automatic gravitation done for the backup. In your example, cache[1] always had /_BUDDY_BACKUP_/192.168.0.4_33266/one. No gravitation necessary.

                    cache[1] also now sees /_BUDDY_BACKUP_/192.168.0.4_33270/three since cache[2] realises that it doesn't have a backup anywhere anymore, and hence assigns cache[1] as its new backup node and sends it its state.

                    • 7. Re: Buddy replication behavior
                      Gianluca Puggelli Newbie

                      I used the word 'gravitation' for backup data as well, maybe improperly.


                      cache[1] also now sees /_BUDDY_BACKUP_/192.168.0.4_33270/three since cache[2] realises that it doesn't have a backup anywhere anymore, and hence assigns cache[1] as its new backup node and sends it its state.


                      This movement of primary data that has no backup can cause a "network storm", but this is inevitable.
                      Why isn't there an equivalent movement for backup data that has no primary (e.g. /_BUDDY_BACKUP_/192.168.0.4_33266/one on cache[1])?

                      My major concern is not the "network storm" but the fact that, in case of multiple faults, the cluster loses information. For example, suppose that cache[0] dies first and then, one minute later, cache[1] dies as well. In this case the data stored in the node /one is lost forever.
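
The double-failure scenario can be traced with a small plain-Java model. It is a sketch of the behaviour described in this thread (numBuddies = 1, no gravitation, a surviving node pushes its state to its new buddy), not JBoss Cache code; the class BuddyFailureModel and all of its members are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model: each member keeps its own data plus backups held on behalf of other members.
class BuddyFailureModel {
    final List<String> view = new ArrayList<>();
    final Map<String, Set<String>> primary = new LinkedHashMap<>();
    // backup.get(holder).get(owner) = nodes the holder keeps for that owner
    final Map<String, Map<String, Set<String>>> backup = new LinkedHashMap<>();

    void join(String member) {
        view.add(member);
        primary.put(member, new HashSet<>());
        backup.put(member, new LinkedHashMap<>());
    }

    // Next member in the view, or null when the member is alone.
    String buddyOf(String member) {
        return view.size() < 2 ? null : view.get((view.indexOf(member) + 1) % view.size());
    }

    void put(String member, String fqn) {
        primary.get(member).add(fqn);
        String buddy = buddyOf(member);
        if (buddy != null) {
            backup.get(buddy).computeIfAbsent(member, k -> new HashSet<>()).add(fqn);
        }
    }

    // The dead member's memory is lost; survivors push their primary data to their
    // current buddy. Backups held for the dead member stay where they are.
    void kill(String dead) {
        view.remove(dead);
        primary.remove(dead);
        backup.remove(dead);
        for (String m : view) {
            String buddy = buddyOf(m);
            if (buddy != null) {
                backup.get(buddy).computeIfAbsent(m, k -> new HashSet<>()).addAll(primary.get(m));
            }
        }
    }

    boolean existsAnywhere(String fqn) {
        for (Set<String> data : primary.values()) {
            if (data.contains(fqn)) return true;
        }
        for (Map<String, Set<String>> perOwner : backup.values()) {
            for (Set<String> data : perOwner.values()) {
                if (data.contains(fqn)) return true;
            }
        }
        return false;
    }
}
```

Joining A, B and C, storing /one on A, /two on B and /three on C, and then killing A leaves /one only in B's backup subtree. Killing B afterwards makes existsAnywhere("/one") return false, while /two and /three survive on C.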

                      Thanks and regards
                      gianluca
                      --
                      Gianluca Puggelli
                      skype:pugg1138

                      • 8. Re: Buddy replication behavior
                        Brian Stansberry Master

                        IMO, this is an area where configurable alternatives would be useful, since the "network storm" problem is a real issue, as is the lower QOS that comes if the primary buddy doesn't automatically take over the data.

                        A couple of config options come to mind (option names I just made up with very little thought):

                        1) "primary-backup-take-ownership" boolean flag -- if true, the behavior Gianluca is looking for occurs.

                        2) minBuddies -- indicates the minimum number of buddies a node has to have; if a node fails, the affected nodes check this to decide whether they have to elect a new buddy and/or take ownership (if they are the primary).

                        • 9. Re: Buddy replication behavior
                          Gianluca Puggelli Newbie

                          Yes, this will solve the problem.

                          Any idea about when this feature will be available ?

                          thanks and regards
                          gianluca

                          --
                          Gianluca Puggelli
                          skype:pugg1138

                          • 10. Re: Buddy replication behavior
                            Gianluca Puggelli Newbie

                            Hi,
                            I wrote a workaround in the form of a TreeCacheListener. Please let me know what you think about it.

                            import java.util.Iterator;
                            import java.util.Set;
                            import java.util.Vector;

                            import org.jboss.cache.CacheException;
                            import org.jboss.cache.Fqn;
                            import org.jboss.cache.TreeCache;
                            import org.jboss.cache.TreeCacheListener;
                            import org.jboss.cache.buddyreplication.BuddyManager;
                            import org.jboss.cache.config.Option;
                            import org.jgroups.Address;
                            import org.jgroups.View;
                            import org.jgroups.ViewId;

                            class Listener implements TreeCacheListener
                            {
                               private TreeCache cache;
                               private View view;   // last view seen, used to work out who left

                               private static final Fqn backupFqn = new Fqn(BuddyManager.BUDDY_BACKUP_SUBTREE);
                               private static final Option option = new Option();

                               static
                               {
                                  // Force gravitation on the get() calls issued in check()
                                  option.setForceDataGravitation(true);
                               }

                               // Returns the members present in old_view but missing from new_view
                               private Vector getMembersLeft(View old_view, View new_view)
                               {
                                  final Vector result = new Vector();
                                  final Vector members = old_view.getMembers();
                                  final Vector new_members = new_view.getMembers();

                                  for(int i=0; i < members.size(); i++)
                                  {
                                     final Object mbr = members.elementAt(i);

                                     if(!new_members.contains(mbr))
                                     {
                                        result.addElement(mbr);
                                     }
                                  }

                                  return(result);
                               }

                               // For each member that left, gravitates the nodes found in its backup subtree on this cache
                               private void check(Vector membersLeft)
                               {
                                  for(int i=0, n=membersLeft.size(); i<n ; i++)
                                  {
                                     final Address addr = (Address)membersLeft.get(i);
                                     final Fqn fqn = new Fqn(backupFqn,
                                        new Fqn(BuddyManager.getGroupNameFromAddress(addr)));

                                     if(cache.exists(fqn))
                                     {
                                        try
                                        {
                                           final Set children = cache.getChildrenNames(fqn);

                                           for(final Iterator it=children.iterator() ; it.hasNext() ;)
                                           {
                                              // Child names are not necessarily Strings (e.g. Integer), so do not cast
                                              cache.get(new Fqn(it.next()), option);
                                           }
                                        }
                                        catch(CacheException ex)
                                        {
                                           ex.printStackTrace();
                                        }
                                     }
                                  }
                               }


                               public void cacheStarted(TreeCache cache)
                               {
                                  this.cache = cache;
                                  this.view = new View(new ViewId(cache.getCoordinator()), cache.getMembers());
                               }

                               public void cacheStopped(TreeCache cache) {}
                               public void nodeCreated(Fqn fqn) {}
                               public void nodeEvicted(Fqn fqn) {}
                               public void nodeLoaded(Fqn fqn) {}
                               public void nodeModified(Fqn fqn) {}
                               public void nodeRemoved(Fqn fqn) {}
                               public void nodeVisited(Fqn fqn) {}

                               public void viewChange(View new_view)
                               {
                                  if(view!=null)
                                  {
                                     final Vector membersLeft = getMembersLeft(view, new_view);

                                     if(!membersLeft.isEmpty())
                                     {
                                        check(membersLeft);
                                     }
                                  }

                                  // Remember the new view for the next change
                                  view = new_view;
                               }
                            }
                            
                            


                            thanks and regards
                            gianluca
                            --
                            Gianluca Puggelli
                            skype:pugg1138

                            • 11. Re: Buddy replication behavior
                              Brian Stansberry Master

                              I read this real quickly, so forgive me if I'm wrong, but it looks like *each* buddy of the node that left will try to do the gravitation, rather than a "primary" buddy. That will very likely lead to problems.

                              • 12. Re: Buddy replication behavior
                                Gianluca Puggelli Newbie

                                You are right Brian, I forgot to post the configuration.

                                 <attribute name="BuddyReplicationConfig">
                                   <config>
                                     <buddyReplicationEnabled>true</buddyReplicationEnabled>
                                     <buddyLocatorClass>org.jboss.cache.buddyreplication.NextMemberBuddyLocator</buddyLocatorClass>
                                     <buddyCommunicationTimeout>50000</buddyCommunicationTimeout>
                                     <buddyLocatorProperties>
                                       numBuddies = 1
                                       ignoreColocatedBuddies = true
                                     </buddyLocatorProperties>

                                     <dataGravitationRemoveOnFind>true</dataGravitationRemoveOnFind>
                                     <dataGravitationSearchBackupTrees>true</dataGravitationSearchBackupTrees>
                                     <autoDataGravitation>false</autoDataGravitation>
                                   </config>
                                 </attribute>
                                


                                This workaround should work correctly only when numBuddies=1.

                                I have no idea how to identify the first buddy so that gravitation is forced only on it.
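
One possible way to pick a single node, sketched in plain Java: treat the first surviving member that followed the departed node in the old view as its "primary" buddy, and let only that member gravitate. This assumes buddies are assigned in view order (as a next-member locator would suggest) and ignores ignoreColocatedBuddies; whether that matches what BuddyManager actually selects is not confirmed in this thread. The class PrimaryBuddy and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: decides which survivor should gravitate a departed member's data.
class PrimaryBuddy {
    // First member after `departed` in the old view that is still in the new view,
    // or null if the departed member is unknown or nobody survived.
    static String primaryBuddyOf(List<String> oldView, List<String> newView, String departed) {
        int start = oldView.indexOf(departed);
        if (start < 0) {
            return null;
        }
        for (int step = 1; step < oldView.size(); step++) {
            String candidate = oldView.get((start + step) % oldView.size());
            if (newView.contains(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    // True only on the one member that should force gravitation for `departed`.
    static boolean shouldGravitate(String localAddress, List<String> oldView,
                                   List<String> newView, String departed) {
        return localAddress.equals(primaryBuddyOf(oldView, newView, departed));
    }

    public static void main(String[] args) {
        List<String> oldView = Arrays.asList("A", "B", "C");
        List<String> newView = Arrays.asList("B", "C");
        System.out.println(primaryBuddyOf(oldView, newView, "A"));
    }
}
```

In a listener, a check like shouldGravitate could guard the forced get() calls inside viewChange, so that with numBuddies greater than 1 only one of the buddies pulls the data.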

                                regards
                                gianluca
                                --
                                Gianluca Puggelli
                                skype:pugg1138



                                • 13. Re: Buddy replication behavior
                                  Brian Stansberry Master

                                  Hmm, I don't see anything in the BuddyManager class that allows a caller to find out this kind of information either. That's a flaw.
