11 Replies Latest reply on Sep 18, 2006 10:17 AM by smarlow

Farm deployment problem

gregsting Sep 13, 2006 9:27 AM

Hi,

I am trying to configure a cluster of two servers. They seem to find each other and form a cluster. The history of my partition MBean shows:

9/13/06 2:46 PM : New view: [10.255.XX.YY:1099, 10.255.XX.ZZ:1099] with viewId: 1 (old view: [10.255.XX.YY:1099] )

The problem is that when I try to deploy a file across my farm (that is, I put a test file "test.war" in the "farm" directory of a node), it deploys on the current node but not on the rest of the farm. On the other nodes of the cluster, the file is neither deployed nor copied.

However, the console shows:

15:08:26,543 INFO [ClusterFileTransfer] Start push of file test.war to cluster.
15:10:26,564 INFO [ClusterFileTransfer] Finished push of file test.war to cluster.

Note that there are always exactly 2 minutes between the start and the finish of the transfer, so I suppose this is a timeout. However, the log files don't show any error.
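
The gap can be checked directly from the two console lines. A minimal sketch in Python, assuming the timestamps use the HH:MM:SS,mmm layout shown above (the helper name elapsed_seconds is made up for illustration):

```python
from datetime import datetime

# Layout of the console timestamps, e.g. "15:08:26,543" (assumed from the log lines above)
LOG_TIME_FORMAT = "%H:%M:%S,%f"

def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two same-day console timestamps."""
    started = datetime.strptime(start, LOG_TIME_FORMAT)
    finished = datetime.strptime(end, LOG_TIME_FORMAT)
    return (finished - started).total_seconds()

# The "Start push" and "Finished push" timestamps from the console output
print(elapsed_seconds("15:08:26,543", "15:10:26,564"))
```

For the two lines above this comes to roughly 120.02 seconds, which fits a fixed two-minute timeout better than a transfer that just happens to take that long.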

1. Re: Farm deployment problem

brian.stansberry Sep 13, 2006 12:32 PM (in response to gregsting)

What AS version?
2. Re: Farm deployment problem

gregsting Sep 14, 2006 2:39 AM (in response to gregsting)

4.0.4 (Latest) on Solaris.
I use the standard "all" config. Nothing was changed from the default install; I just copied the files to both servers from a Windows workstation and launched them using "./run.sh -c all".

Maybe it has something to do with permissions? (That is, maybe JBoss doesn't have the right to write the file on the other nodes of the cluster?) But what bothers me is that I get no errors.
3. Re: Farm deployment problem

gregsting Sep 14, 2006 2:51 AM (in response to gregsting)

Just noticed the message "Missing file: /lib/tools.jar
run.sh: Unexpected results may occur. Make sure JAVA_HOME points to a JDK and not a JRE." when I launch the server. Maybe that could be the reason.
4. Re: Farm deployment problem

gregsting Sep 14, 2006 5:01 AM (in response to gregsting)

Actually, this is a dev environment used by many developers, so maybe it is a port problem? Does anybody know which port is used for farming?
5. Re: Farm deployment problem

brian.stansberry Sep 14, 2006 12:41 PM (in response to gregsting)

Farming uses the JGroups channel configured in the ClusterPartition mbean, found in deploy/cluster-service.xml. Ports are whatever is configured there. By default it listens for multicast on port 45566.
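
To check whether multicast traffic for the partition actually reaches a given node, something like the following could be run on that node. This is only a rough sketch in Python using the standard socket module. The group 228.1.2.3 and port 45566 are the defaults that appear in the UDP config quoted later in this thread, and the function names are hypothetical. It is not how JGroups itself joins the group.

```python
import socket

def make_membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq value expected by IP_ADD_MEMBERSHIP."""
    return socket.inet_aton(group) + socket.inet_aton(interface)

def listen_once(group: str = "228.1.2.3", port: int = 45566, timeout: float = 10.0):
    """Join the multicast group and wait for one datagram.

    Returns (sender_address, payload_length), or None if nothing arrives
    before the timeout.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        # Allow binding while another process is already using the port
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        make_membership_request(group))
        sock.settimeout(timeout)
        try:
            data, sender = sock.recvfrom(65535)
        except socket.timeout:
            return None
        return sender, len(data)
    finally:
        sock.close()

# Example (run on a cluster node while the other node is up):
#   print(listen_once())
```

If listen_once() returns None while the other node is running, multicast on that group and port is probably not getting through. On some platforms a second listener on the same port also needs SO_REUSEPORT, which is left out here.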
6. Re: Farm deployment problem

gregsting Sep 15, 2006 5:03 AM (in response to gregsting)

It seems that using TCP instead of UDP (in /deploy/cluster-service.xml) resolves the problem. All communication is much faster, so I think I'll keep this config.
Thanks for your help.
7. Re: Farm deployment problem

brian.stansberry Sep 15, 2006 10:36 AM (in response to gregsting)
Glad that helped. :)

But, when I read your response a potential cause of your problem occurred to me. So here's something else to try for those who have issues sending files using farming but still want to use UDP:

In cluster-service.xml, bump up the buffer sizes in the UDP protocol:

<UDP mcast_addr="${jboss.partition.udpGroup:228.1.2.3}" mcast_port="45566" ip_ttl="8" ip_mcast="true" mcast_send_buf_size="25000000" mcast_recv_buf_size="640000" ucast_send_buf_size="20000000" ucast_recv_buf_size="640000" loopback="false"/>

Now that's a pretty big increase in the resources JGroups is going to use (the old defaults for the send buffers were 800000). If you're concerned about that, don't increase the send buffers so much, but leave the receive buffers at 640K.

Farming sends files around the cluster in 512K chunks. With the default config, the receive buffer sizes are not even large enough to hold one chunk. This can lead to problems where the packets that make up part of a chunk get dropped and have to be retransmitted. With very large files, this can slow things down to the point where timeouts start getting tripped and the transfer fails.
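
To put rough numbers on that, here is a sketch in Python of how many 512K chunks fit into a receive buffer of a given size and how many datagrams one chunk turns into. The 512K chunk size is from the paragraph above. The 64000-byte buffer and the 8192-byte fragment size are assumed figures for illustration only, not values taken from this thread, and the function names are hypothetical.

```python
CHUNK_BYTES = 512 * 1024  # farming chunk size mentioned above (524288 bytes)

def chunks_that_fit(recv_buf_bytes: int, chunk_bytes: int = CHUNK_BYTES) -> int:
    """Number of whole chunks a receive buffer of this size could hold."""
    return recv_buf_bytes // chunk_bytes

def datagrams_per_chunk(frag_bytes: int, chunk_bytes: int = CHUNK_BYTES) -> int:
    """Number of fragments needed to carry one chunk (ceiling division)."""
    return -(-chunk_bytes // frag_bytes)

# 64000 is a hypothetical small buffer; 640000 and 2000000 appear in this thread
for size in (64000, 640000, 2000000):
    print(size, chunks_that_fit(size))

# Assumed fragment size of 8192 bytes, purely for illustration
print(datagrams_per_chunk(8192))
```

A buffer smaller than 524288 bytes cannot hold even one chunk, so a burst has to be drained while it is still arriving. A 640000-byte buffer holds one chunk and a 2000000-byte buffer holds three. Real socket buffers also carry per-packet overhead and may be capped by the operating system, so these counts are upper bounds.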

I'll bump the default size of the buffers for 4.0.5.GA:

http://jira.jboss.com/jira/browse/JBAS-3659
8. Re: Farm deployment problem

smarlow Sep 15, 2006 5:13 PM (in response to gregsting)

Brian,

I like your idea of increasing the buffer sizes. I didn't think about syncing the ClusterFileTransfer buffer sizes with the send/receive buffers. Do you think we should consider decreasing the ClusterFileTransfer buffer sizes instead?

Greg,

Regarding the missing tools.jar, that will bite you hard if you have any JSPs in the application, as they won't compile without a Java compiler (tools.jar contains the Java compiler class).

Scott
9. Re: Farm deployment problem

brian.stansberry Sep 15, 2006 5:58 PM (in response to gregsting)

Scott,

For 5.0 we're using a 640K receive buffer in the shared multiplexer channel. Our perf testing also uses that size receive buffer, and even in 4.0.x we'd like to make our default config more performant. So I don't have a big issue with the receive buffer.

For the send buffers, I do think the fc-fast-minimalthreads 25,000,000 is too big for cluster-service.xml, which shouldn't usually be pushing as much data as the tc5-cluster-service.xml channel. I'm thinking 2,000,000 is more reasonable.

What do you think?

(Oh, just realized this info isn't in the thread -- a couple weeks back we had a support case where transfer of large files (> 30MB) was failing midway. Increasing the buffers resolved the problem and greatly increased the speed.)
10. Re: Farm deployment problem

brian.stansberry Sep 17, 2006 4:34 PM (in response to gregsting)
Whoa, sorry guys -- the example UDP config I posted before had the send and receive buffer sizes reversed!!! Something like this is more appropriate:

<UDP mcast_addr="${jboss.partition.udpGroup:228.1.2.3}" mcast_port="45566" ip_ttl="${jgroups.mcast.ip_ttl:8}" ip_mcast="true" mcast_recv_buf_size="2000000" mcast_send_buf_size="640000" ucast_recv_buf_size="2000000" ucast_send_buf_size="640000" loopback="false"/>

Here I'm using ~2MB for the receive buffers, not 25MB, and definitely not 25MB for the send buffers!
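
For anyone who wants to sanity-check their own cluster-service.xml, a small sketch in Python that reads a UDP element like the one above and reports any receive buffer smaller than one 512K farming chunk. It uses only the standard xml.etree.ElementTree module. The function name and the idea of checking against the chunk size are illustrative assumptions, not part of JBoss or JGroups.

```python
import xml.etree.ElementTree as ET

CHUNK_BYTES = 512 * 1024  # farming chunk size mentioned earlier in the thread

# The corrected UDP element from this post
UDP_XML = (
    '<UDP mcast_addr="${jboss.partition.udpGroup:228.1.2.3}" mcast_port="45566" '
    'ip_ttl="${jgroups.mcast.ip_ttl:8}" ip_mcast="true" '
    'mcast_recv_buf_size="2000000" mcast_send_buf_size="640000" '
    'ucast_recv_buf_size="2000000" ucast_send_buf_size="640000" loopback="false"/>'
)

def undersized_recv_buffers(udp_xml: str, chunk_bytes: int = CHUNK_BYTES) -> list:
    """Return the receive-buffer attributes that cannot hold one chunk.

    A missing attribute is treated as size 0, so it is reported as well.
    """
    element = ET.fromstring(udp_xml)
    names = ("mcast_recv_buf_size", "ucast_recv_buf_size")
    return [name for name in names if int(element.get(name, "0")) < chunk_bytes]

print(undersized_recv_buffers(UDP_XML))
```

With the element above, both receive buffers are 2000000 bytes, so nothing is reported. A config with smaller receive buffers would list the offending attribute names.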

Scott, maybe that will make my comment on the sizes more coherent. Or maybe less coherent. ;)
11. Re: Farm deployment problem

smarlow Sep 18, 2006 10:17 AM (in response to gregsting)

Brian,

The buffer size changes seem fine to me.

Scott