7 Replies Latest reply on May 30, 2005 4:51 AM by garu

    org.jboss.ha.framework.server.FarmMemberService again

    garu

      Sorry to bother you again, but I think I found another problem that may become an issue in a large environment.

      Suppose you have a cluster of N nodes and each node has A applications in the farm directory.
      Now you join node (N+1) to the cluster: it receives N farmDeployments responses, and pullNewDeployments() is called for each response.
      This means that each of the A applications is pulled from each of the N nodes.

      I see two problems. First, there are N*A transfers, which may become a problem for large values of N and A; if the A files are large, startup latency may be huge.

      Second, given that by definition the farming service keeps all the members of the cluster in sync, once I have pulled the A files from the first member in the list, the remaining (N-1)*A transfers are useless.

      In my opinion the service startup actions should be:
      1- ask for farmDeployments from cluster
      2- pull the A applications from the first farmDeployments response
      3- call scannerThread.doScan(); the status is still STARTING, so no deployment takes place, but parentDUMap is filled with each pulled file name
      4- now pull the applications from the remaining N-1 nodes; this time parentDUMap is filled, so the lastModified time can be checked to avoid useless transfers (in theory this step should be unnecessary, but I don't know the clustering code deeply enough to tell whether it can be safely skipped)
      5- scannerThread.setEnabled(scanEnabled.get()) and return.
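
Steps 2 to 4 could be sketched roughly as follows. This is only an illustration, not the FarmMemberService code: the class FarmPullSketch, the RemoteDeployment type and the pull() method are hypothetical names, and a plain map stands in for parentDUMap. The point is that once the map is filled, a response from another node transfers nothing unless its copy is newer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FarmPullSketch {
    /** Hypothetical stand-in for one entry of a farmDeployments response. */
    static final class RemoteDeployment {
        final String name;
        final long lastModified;

        RemoteDeployment(String name, long lastModified) {
            this.name = name;
            this.lastModified = lastModified;
        }
    }

    /** Stand-in for parentDUMap: file name -> lastModified of the local copy. */
    private final Map<String, Long> known = new HashMap<>();

    /** Returns the names that actually had to be transferred for this response. */
    List<String> pull(List<RemoteDeployment> response) {
        List<String> transferred = new ArrayList<>();
        for (RemoteDeployment d : response) {
            Long local = known.get(d.name);
            // Transfer only if we have no copy yet or the remote copy is newer.
            if (local == null || local < d.lastModified) {
                known.put(d.name, d.lastModified);
                transferred.add(d.name);
            }
        }
        return transferred;
    }

    public static void main(String[] args) {
        List<RemoteDeployment> response = Arrays.asList(
                new RemoteDeployment("BTV.ear", 100L),
                new RemoteDeployment("BHG.ear", 200L));
        FarmPullSketch node = new FarmPullSketch();
        System.out.println(node.pull(response)); // first node: [BTV.ear, BHG.ear]
        System.out.println(node.pull(response)); // any in-sync node afterwards: []
    }
}
```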

      Please tell me what you think.
      Gabriele

        • 1. Re: org.jboss.ha.framework.server.FarmMemberService again
          garu

          Well, some more tests proved I was wrong.
          Things are even worse: for an N-node cluster the number of transfers is (N**2)*A!
          If you don't believe me, here is the log of a fourth server joining a three-member cluster.

          BTW, step 2 should read: pull the A applications from the first cluster node.

          2005-05-23 15:00:31,568 INFO [org.jboss.ha.framework.server.FarmMemberService] **** pullNewDeployments ****
          2005-05-23 15:00:31,740 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BTV.ear
          2005-05-23 15:00:31,740 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BTV.ear
          2005-05-23 15:00:31,755 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BTV.ear
          2005-05-23 15:00:31,958 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BHG.ear
          2005-05-23 15:00:31,974 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BHG.ear
          2005-05-23 15:00:31,990 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BHG.ear
          2005-05-23 15:00:31,990 INFO [org.jboss.ha.framework.server.FarmMemberService] **** pullNewDeployments ****
          2005-05-23 15:00:32,724 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BTV.ear
          2005-05-23 15:00:32,740 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BTV.ear
          2005-05-23 15:00:32,755 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BTV.ear
          2005-05-23 15:00:33,427 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BHG.ear
          2005-05-23 15:00:33,443 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BHG.ear
          2005-05-23 15:00:33,458 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BHG.ear
          2005-05-23 15:00:33,458 INFO [org.jboss.ha.framework.server.FarmMemberService] **** pullNewDeployments ****
          2005-05-23 15:00:33,599 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BTV.ear
          2005-05-23 15:00:33,615 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BTV.ear
          2005-05-23 15:00:33,630 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BTV.ear
          2005-05-23 15:00:34,333 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BHG.ear
          2005-05-23 15:00:34,349 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BHG.ear
          2005-05-23 15:00:34,365 INFO [org.jboss.ha.framework.server.FarmMemberService] farmDeployment(), deploy locally: C:\home\jboss\server\srv1\tmp\BHG.ear
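
As a quick check of the counts against the log above (three pullNewDeployments blocks, six "deploy locally" lines each), the three figures discussed so far can be computed side by side. The class and method names here are made up for the illustration.

```java
class TransferCount {
    /** Transfers seen in the log: (N**2)*A. */
    static long observed(long n, long a) {
        return n * n * a;
    }

    /** Transfers estimated in the first post: N*A. */
    static long estimated(long n, long a) {
        return n * a;
    }

    /** Transfers actually needed: each application once. */
    static long needed(long a) {
        return a;
    }

    public static void main(String[] args) {
        long n = 3; // existing cluster members
        long a = 2; // applications in the farm directory (BTV.ear, BHG.ear)
        System.out.println(observed(n, a));  // 18, matching the 18 "deploy locally" lines
        System.out.println(estimated(n, a)); // 6
        System.out.println(needed(a));       // 2
    }
}
```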
          


          • 2. Re: org.jboss.ha.framework.server.FarmMemberService again
            garu

            Opened issue http://jira.jboss.com/jira/browse/JBCLUSTER-44 for this problem.
            Attached a patch that should reduce the number of transfers to the number of applications in the farm directory.

            • 3. Re: org.jboss.ha.framework.server.FarmMemberService again
              smarlow

              Hello,

              This is very timely as I am looking into a related issue (http://jira.jboss.com/jira/browse/JBCLUSTER-33) that has to do with farm deploying large applications. I'm trying to improve the way we handle large application farm deployments.

              We currently have two ways to farm deploy. At server startup time, we "pull" new deployments by doing a cluster RPC call requesting all deployed applications to be returned to us. This is the case whose issues you point out.

              We also have "push" which sends applications to all other nodes in the cluster as needed. This seems fine.

              I agree that we should at least apply your patch but perhaps do more (as part of the large file handling).

              Thank you,
              Scott

              • 4. Re: org.jboss.ha.framework.server.FarmMemberService again
                garu

                Hi Scott,
                I'm experimenting with clustering because we plan to set up a production environment where it plays a main role. I found some issues that prevent me from relying on it until they are solved, so I tried to speed things up by helping you find a solution. Given that I don't pay (yet), I feel it is my duty to help.
                (BTW, I'm negotiating with your new European office for a support package.)
                To focus on your last reply: I had the same problem, and I have something that already works.
                I cannot say it is the solution, because it would take me too long to understand all the possible drawbacks of what I implemented. Above all, the real fix should be more structural and involve at least HAPartitionImpl too. But it works, and I hope it gives you some ideas to speed up the solution.
                I posted a zip file with the code in http://jira.jboss.com/jira/browse/JBCLUSTER-33.
                Cheers, Gabriele

                • 5. Re: org.jboss.ha.framework.server.FarmMemberService again
                  smarlow

                  Hi Gabriele,

                  Thank you for helping out :-)

                  I tried an alternative change to avoid pulling applications from every node and it also solves the problem that you raised here.

                  I started a separate thread to discuss the approach (http://www.jboss.org/index.html?module=bb&op=viewtopic&t=64489), as I didn't want to hijack this discussion. My changes involved HAPartition + HAPartitionImpl + FarmMemberService.

                  Your proposed change would also solve the "(N-1)*A useless transfers" problem.

                  I took a quick look at your proposal for fragmenting the file transfer and that should help as I now have one possible solution to start with.

                  Cheers,
                  Scott

                  • 6. Re: org.jboss.ha.framework.server.FarmMemberService again
                    garu

                    You're welcome.
                    Also be aware that with the following code

                    byte[] lBuffer = new byte[ 1024 ];
                    lInput = new FileInputStream( pFile );
                    ByteArrayOutputStream lOutput = new ByteArrayOutputStream();
                    int j = 0;
                    while( ( j = lInput.read( lBuffer ) ) > 0 ) {
                        lOutput.write( lBuffer, 0, j );
                    }
                    return new FileContent( pFile, lOutput.toByteArray() );
                    


                    • 7. Re: org.jboss.ha.framework.server.FarmMemberService again
                      garu

                      Well, sorry for the double update. As the old saying goes, "never let the cat on the keyboard when writing an update", or in other words, "know what you are doing before doing it"...

                      I just wanted to point out that with the following code:

                       byte[] lBuffer = new byte[ 1024 ];
                       lInput = new FileInputStream( pFile );
                       ByteArrayOutputStream lOutput = new ByteArrayOutputStream(); // #1
                       int j = 0;
                       while( ( j = lInput.read( lBuffer ) ) > 0 ) { // #2
                           lOutput.write( lBuffer, 0, j );
                       }
                       return new FileContent( pFile, lOutput.toByteArray() ); // #3
                      


                      you are allocating the file content byte[] twice.

                      At #1 the output buffer is allocated, at #2 it grows up to the file size, and at #3 toByteArray() creates a copy. For large files this may be a problem.
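
One possible way around the second allocation, as a sketch only: size the array from File.length() and read straight into it. The class and method names below are invented for the example, and it returns the raw byte[] rather than building a FileContent. It assumes the file does not change while being read, and it still holds one full copy in memory, so it does not replace fragmenting the transfer for really large files.

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class SingleAllocationRead {
    /** Reads the whole file into one byte[] sized up front from File.length(). */
    static byte[] readFully(File file) throws IOException {
        long length = file.length();
        if (length > Integer.MAX_VALUE - 8) {
            throw new IOException("File too large for a single byte[]: " + file);
        }
        byte[] content = new byte[(int) length]; // the only large allocation
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            // Fills the array completely; throws EOFException if the file is shorter than expected.
            in.readFully(content);
        }
        return content;
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("farm", ".bin");
        tmp.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(tmp)) {
            out.write(new byte[] {1, 2, 3, 4, 5});
        }
        System.out.println(readFully(tmp).length); // prints 5
    }
}
```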
                      Gabriele