11 Replies Latest reply on Sep 18, 2006 10:17 AM by smarlow

    Farm deployment problem

    gregsting

      Hi,

      I am trying to configure a cluster of two servers. They seem to find each other and form a cluster; the history of my partition MBean shows:

      9/13/06 2:46 PM : New view: [10.255.XX.YY:1099, 10.255.XX.ZZ:1099] with viewId: 1 (old view: [10.255.XX.YY:1099] )

      The problem is that when I try to deploy a file across my farm (that is, I put a test file "test.war" in the "farm" directory of a node), it deploys on the current node but not on the whole farm. On the other nodes of the cluster, the file is neither deployed nor copied.

      However, the console shows:

      15:08:26,543 INFO [ClusterFileTransfer] Start push of file test.war to cluster.
      15:10:26,564 INFO [ClusterFileTransfer] Finished push of file test.war to cluster.
      


      Note that there are always exactly 2 minutes between the start and finish of the transfer, so I suppose this is a timeout. However, the log files don't show any errors.
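
      As a quick check on the "exactly 2 minutes" observation, the gap between the two console timestamps can be computed directly. This is only a sketch: the class and method names are made up for illustration, and it assumes both lines were logged on the same day.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Illustrative helper, not part of JBoss: measures the gap between two console timestamps.
final class PushDuration {
    // Matches the time-of-day prefix of the console lines, e.g. "15:08:26,543".
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("HH:mm:ss,SSS");

    /** Milliseconds between two log timestamps (assumes both fall on the same day). */
    static long elapsedMillis(String start, String finish) {
        LocalTime t0 = LocalTime.parse(start, LOG_TIME);
        LocalTime t1 = LocalTime.parse(finish, LOG_TIME);
        return Duration.between(t0, t1).toMillis();
    }

    public static void main(String[] args) {
        // Timestamps copied from the "Start push" and "Finished push" lines.
        long ms = elapsedMillis("15:08:26,543", "15:10:26,564");
        System.out.println(ms + " ms");
    }
}
```

      For the two lines quoted above this gives 120021 ms, i.e. 2 minutes plus 21 ms, which looks more like a 120-second timeout expiring than like the real duration of a transfer.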

        • 1. Re: Farm deployment problem
          brian.stansberry

          What AS version?

          • 2. Re: Farm deployment problem
            gregsting

            4.0.4 (Latest) on Solaris.
            I use the standard "all" config. Nothing is changed from the default install; I just copied the files to both servers from a Windows workstation and launched them using "./run.sh -c all".

            Maybe it has something to do with permissions (i.e. maybe JBoss doesn't have the right to write the file on the other nodes of the cluster)? What bothers me is that I get no errors.

            • 3. Re: Farm deployment problem
              gregsting

              I just noticed the message "Missing file: /lib/tools.jar
              run.sh: Unexpected results may occur. Make sure JAVA_HOME points to a JDK and not a JRE." when I launch the server... Maybe that could be the reason...

              • 4. Re: Farm deployment problem
                gregsting

                Actually, this is a dev environment used by many developers, so maybe it is a port problem? Does anybody know which port is used for farming?

                • 5. Re: Farm deployment problem
                  brian.stansberry

                  Farming uses the JGroups channel configured in the ClusterPartition mbean, found in deploy/cluster-service.xml. Ports are whatever is configured there. By default it listens for multicast on port 45566.

                  • 6. Re: Farm deployment problem
                    gregsting

                    It seems that using TCP instead of UDP (in /deploy/cluster-service.xml) resolves the problem. All the communications are much faster, so I think I'll keep this config.
                    Thanks for your help.

                    • 7. Re: Farm deployment problem
                      brian.stansberry

                      Glad that helped. :)

                      But, when I read your response a potential cause of your problem occurred to me. So here's something else to try for those who have issues sending files using farming but still want to use UDP:

                      In cluster-service.xml, bump up the buffer sizes in the UDP protocol:

                      <UDP mcast_addr="${jboss.partition.udpGroup:228.1.2.3}"
                       mcast_port="45566"
                       ip_ttl="8" ip_mcast="true"
                       mcast_send_buf_size="25000000" mcast_recv_buf_size="640000"
                       ucast_send_buf_size="20000000" ucast_recv_buf_size="640000"
                       loopback="false"/>
                      


                      Now that's a pretty big increase in the resources JGroups is going to use (the old defaults for the send buffers were 800000). If you're concerned about that, don't increase the send buffers so much, but leave the receive buffers at 640K.

                      Farming sends files around the cluster in 512K chunks. With the default config, the receive buffer sizes are not even large enough to hold one chunk. This can lead to problems where the packets that make up part of a chunk get dropped and have to be retransmitted. With very large files, this can slow things down to the point where timeouts start getting tripped and the transfer fails.
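
                      To put rough numbers on this, here is a small sketch of the chunk arithmetic. It assumes "512K" means 512 * 1024 bytes and uses a 30 MB file purely as an example; the class and method names are illustrative and are not part of JBoss or JGroups.

```java
// Illustrative arithmetic only; sizes other than the 512K chunk are example inputs.
final class FarmChunkMath {
    // Chunk size quoted in the thread ("512K"), ASSUMING 1K = 1024 bytes.
    static final long CHUNK_BYTES = 512L * 1024L;

    /** True if a socket buffer of the given size can hold one whole chunk. */
    static boolean holdsOneChunk(long bufferBytes) {
        return bufferBytes >= CHUNK_BYTES;
    }

    /** Number of chunks needed for a file, counting a final partial chunk as one. */
    static long chunksFor(long fileBytes) {
        return (fileBytes + CHUNK_BYTES - 1) / CHUNK_BYTES;
    }

    public static void main(String[] args) {
        // Buffer sizes that come up in this thread, used here as example inputs.
        long[] buffers = {640_000L, 2_000_000L};
        for (long b : buffers) {
            System.out.println(b + " bytes: holds one chunk = " + holdsOneChunk(b)
                    + ", whole chunks that fit = " + (b / CHUNK_BYTES));
        }
        long exampleFile = 30L * 1024L * 1024L; // hypothetical 30 MB file
        System.out.println("30 MB file = " + chunksFor(exampleFile) + " chunks");
    }
}
```

                      Under that assumption a chunk is 524288 bytes, so any receive buffer smaller than that cannot hold even one chunk, a 640000-byte buffer holds exactly one, and a 30 MB file needs 60 chunks.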

                      I'll bump the default size of the buffers for 4.0.5.GA:

                      http://jira.jboss.com/jira/browse/JBAS-3659

                      • 8. Re: Farm deployment problem
                        smarlow

                        Brian,

                        I like your idea of increasing the buffer sizes. I didn't think about syncing the ClusterFileTransfer buffer sizes with the send/receive buffers. Do you think we should consider decreasing the ClusterFileTransfer buffer sizes instead?

                        Greg,

                        Regarding the missing tools.jar, that will bite you hard if you have any JSPs in the application, as they won't compile without a Java compiler (tools.jar contains the Java compiler class).

                        Scott

                        • 9. Re: Farm deployment problem
                          brian.stansberry

                          Scott,

                          For 5.0 we're using a 640K receive buffer in the shared multiplexer channel. Our perf testing also uses that size receive buffer, and even in 4.0.x we'd like to make our default config more performant. So I don't have a big issue with the receive buffer.

                          For the send buffers, I do think the fc-fast-minimalthreads 25,000,000 is too big for cluster-service.xml, which shouldn't usually be pushing as much data as the tc5-cluster-service.xml channel. I'm thinking 2,000,000 is more reasonable.

                          What do you think?

                          (Oh, just realized this info isn't in the thread -- a couple weeks back we had a support case where transfer of large files (> 30MB) was failing midway. Increasing the buffers resolved the problem and greatly increased the speed.)

                          • 10. Re: Farm deployment problem
                            brian.stansberry

                            Whoa, sorry guys -- the example UDP config I posted before had the send and receive buffer sizes reversed!!! Something like this is more appropriate:

                            <UDP mcast_addr="${jboss.partition.udpGroup:228.1.2.3}"
                             mcast_port="45566"
                             ip_ttl="${jgroups.mcast.ip_ttl:8}" ip_mcast="true"
                             mcast_recv_buf_size="2000000" mcast_send_buf_size="640000"
                             ucast_recv_buf_size="2000000" ucast_send_buf_size="640000"
                             loopback="false"/>


                            Here I'm using ~ 2MB for receive buffers, not 25MB and definitely not 25MB for send!

                            Scott, maybe that will make my comment on the sizes more coherent. Or maybe less coherent. ;)
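
                            One thing that may be worth checking with sizes like these: a socket buffer size is only a request to the operating system, which can cap it. The sketch below uses the standard java.net.DatagramSocket API to ask for a receive buffer and report what was actually granted. The class name is made up for illustration, and it only probes a plain UDP socket; it does not inspect a running JGroups channel.

```java
import java.io.UncheckedIOException;
import java.net.DatagramSocket;
import java.net.SocketException;

// Illustrative probe, not part of JBoss or JGroups.
final class RecvBufferProbe {
    /** Requests a UDP receive buffer of the given size and returns the size the OS reports. */
    static int grantedReceiveBuffer(int requestedBytes) {
        try (DatagramSocket socket = new DatagramSocket()) { // binds to any free local port
            socket.setReceiveBufferSize(requestedBytes);      // a hint; the OS may cap it
            return socket.getReceiveBufferSize();
        } catch (SocketException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int requested = 2_000_000; // receive buffer size from the config above
        int granted = grantedReceiveBuffer(requested);
        System.out.println("requested " + requested + " bytes, granted " + granted + " bytes");
    }
}
```

                            If the granted value comes back well below the requested one, the OS-level UDP buffer limit on that machine is probably the thing to look at before changing the XML any further.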

                            • 11. Re: Farm deployment problem
                              smarlow

                              Brian,

                              The buffer size changes seem fine to me.

                              Scott