JBoss clustering gives errors on starting up new server
sarkar Aug 24, 2005 7:55 PM
Hope some of you have had this issue earlier and can respond to it.
I have two scenarios and they are related:
Scenario 1
========
Deployment description
----------------------------
I have a JBoss installation on two Linux boxes, say lin-1 and lin-2, both on the same network.
On each of lin-1 and lin-2 I have set up four JBoss servers/domains (of type "all", i.e. cluster-aware). Four different applications, app-A, app-B, app-C and app-D, have been installed on lin-1, one on each server. The same deployment is repeated on lin-2.
The idea is that app-A will be clustered between its instances on lin-1 and lin-2, and likewise app-B, app-C and app-D. So ideally I should have 4 JBoss clusters of two members each [(lin-1/app-A + lin-2/app-A), (lin-1/app-B + lin-2/app-B) and so on] deployed across the two boxes lin-1 and lin-2.
The clustering document on the JBoss website says that by default every server that comes up on the network becomes part of the cluster called "DefaultPartition". So instead of 4 clusters of 2 members each, I am getting a single cluster with 8 members, all belonging to "DefaultPartition".
Question: how do I split them to form four JBoss clusters of two members each? If you are suggesting changing the mcast_addr in the cluster-service.xml file, then please give details on the following:
1. What values can the mcast_addr element take?
2. How do I ensure multicast is enabled on my network?
3. Do I leave the PartitionName element's value as "DefaultPartition"?
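On question 1, as far as I understand it the value has to be an IP multicast address (224.0.0.0 to 239.255.255.255 for IPv4). Below is a minimal sketch for sanity-checking candidate values before putting them into the config. The class name and the example addresses are made up for illustration, and it does not tell me whether the network actually forwards multicast traffic (question 2).

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class McastAddrCheck {

    // Returns true if the given literal IP address lies in the multicast range
    // (224.0.0.0 to 239.255.255.255 for IPv4).
    static boolean isMulticast(String addr) {
        try {
            return InetAddress.getByName(addr).isMulticastAddress();
        } catch (UnknownHostException e) {
            // Not a parseable address or resolvable name.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical candidate values, one per application cluster,
        // plus one ordinary unicast address for comparison.
        String[] candidates = {"228.1.2.3", "228.1.2.4", "10.120.102.91"};
        for (String c : candidates) {
            System.out.println(c + " multicast? " + isMulticast(c));
        }
    }
}
```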
Scenario 2
========
Deployment description
----------------------------
The same as Scenario 1. In addition, I have a new Linux box on the network, lin-3, running a JBoss installation. On this installation I have set up four new apps, app-P, app-Q, app-R and app-S, again each running on a server of the "all" configuration (cluster-aware).
So this makes the total number of JBoss servers 12. Here is a strange occurrence. Up to the 11th server, all servers come up fine. However, when I start up the 12th server, the startup log goes only as far as the MBean config of the cluster, shows the number of members as "11" and then hangs. The application EAR files on this server do not get deployed, so I assume the server does not come up fully. The process is running, since I can see it when I do a ps -ef |grep
The result is the same for any new server started after the 11th, as shown in the log below, which is from when I brought up the 13th one.
-------------------------------------------------------
GMS: address is mpapp1-d:33766 (additional data: 18 bytes)
-------------------------------------------------------
16:28:22,234 INFO [DefaultPartition] Number of cluster members: 13
16:28:22,234 INFO [DefaultPartition] Other members: 12
16:28:22,234 INFO [DefaultPartition] Fetching state (will wait for 30000 milliseconds):
16:28:22,234 INFO [DefaultPartition] New cluster view for partition DefaultPartition: 265 ([10.120.102.91:1099, 10.120.102.91:1199, 10.120.102.94:1099, 10.120.102.94:1199, 10.120.102.97:1399, 10.120.102.97:1099, 10.120.102.103:1099, 10.120.102.103:1199, 10.120.102.106:1099, 10.120.102.106:1199, 10.120.102.91:1299, 10.120.102.97:1899, 10.120.102.91:1399] delta: 0)
16:28:22,241 INFO [DefaultPartition] I am (null) received membershipChanged event:
16:28:22,241 INFO [DefaultPartition] Dead members: 0 ([])
16:28:22,241 INFO [DefaultPartition] New Members : 0 ([])
16:28:22,241 INFO [DefaultPartition] All Members : 13 ([10.120.102.91:1099, 10.120.102.91:1199, 10.120.102.94:1099, 10.120.102.94:1199, 10.120.102.97:1399, 10.120.102.97:1099, 10.120.102.103:1099, 10.120.102.103:1199, 10.120.102.106:1099, 10.120.102.106:1199, 10.120.102.91:1299, 10.120.102.97:1899, 10.120.102.91:1399])
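To see at a glance how the members in a view like the one above are spread over the hosts, a small helper along these lines could be used. This is only a sketch: the class and method names are made up, and it assumes the bracketed "host:port, host:port" format shown in the log.

```java
import java.util.Map;
import java.util.TreeMap;

public class ViewMembers {

    // Counts members per host from a bracketed "host:port, host:port" list
    // as printed in the "All Members" log line.
    static Map<String, Integer> countPerHost(String memberList) {
        Map<String, Integer> counts = new TreeMap<>();
        String inner = memberList.replace("[", "").replace("]", "").trim();
        if (inner.isEmpty()) {
            return counts;
        }
        for (String member : inner.split(",")) {
            String m = member.trim();
            int colon = m.lastIndexOf(':');
            // Everything before the last colon is treated as the host part.
            String host = colon >= 0 ? m.substring(0, colon) : m;
            counts.merge(host, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Member list copied from the log above.
        String members = "[10.120.102.91:1099, 10.120.102.91:1199, 10.120.102.94:1099, "
                + "10.120.102.94:1199, 10.120.102.97:1399, 10.120.102.97:1099, "
                + "10.120.102.103:1099, 10.120.102.103:1199, 10.120.102.106:1099, "
                + "10.120.102.106:1199, 10.120.102.91:1299, 10.120.102.97:1899, "
                + "10.120.102.91:1399]";
        countPerHost(members).forEach((host, n) -> System.out.println(host + " -> " + n));
    }
}
```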
Now, as a workaround I tried changing the PartitionName element's value in cluster-service.xml from "${jboss.partition.name:DefaultPartition}" to some other string and restarting the server. This time the server comes up, deploys my EAR file and I can launch the application through the browser. However, towards the end of the server log I can see messages like:
15:12:47,924 ERROR [HANamingService] Could not start on port 1100
java.net.BindException: Address already in use
at java.net.PlainSocketImpl.socketBind(Native Method)
at java.net.PlainSocketImpl.bind(PlainSocketImpl.java:331)
at java.net.ServerSocket.bind(ServerSocket.java:318)
at java.net.ServerSocket.<init>(ServerSocket.java:185)
at org.jboss.ha.jndi.DetachedHANamingService.startService(DetachedHANamingService.java:223)
at org.jboss.system.ServiceMBeanSupport.jbossInternalStart(ServiceMBeanSupport.java:272)
at org.jboss.system.ServiceMBeanSupport.jbossInternalLifecycle(ServiceMBeanSupport.java:222)
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:324)
.....
.....
.....
followed by
I Depend On: jboss:service=DefaultPartition
jboss.cache:service=InvalidationManager
Depends On Me: javax.naming.NameNotFoundException: DefaultPartition not bound
ObjectName: jboss:service=HAJNDI
state: FAILED
I Depend On: jboss:service=DefaultPartition
Depends On Me: java.lang.NullPointerException
ObjectName: jboss:service=HASessionState
state: FAILED
I Depend On: jboss:service=DefaultPartition
Depends On Me: javax.naming.NameNotFoundException: DefaultPartition not bound
ObjectName: jboss.j2ee:jndiName=clustering/HTTPSession,plugin=pool,service=EJB
state: CREATED
I Depend On:
Depends On Me:
ObjectName: jboss.j2ee:jndiName=clustering/HTTPSession,service=EJB
state: FAILED
I Depend On: jboss:service=DefaultPartition
jboss:service=invoker,type=jrmp
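The "Address already in use" error for port 1100 means something on that box already holds the port (presumably another server instance, but that is my assumption). A check like the following could be run on the box before starting the server to confirm it. It is only a sketch: the class name is made up, and it binds on all interfaces, whereas JBoss may bind to one specific address.

```java
import java.io.IOException;
import java.net.ServerSocket;

public class PortCheck {

    // Returns true if a server socket can currently be bound to the port
    // on this host (all interfaces); the socket is closed again right away.
    static boolean isPortFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            // BindException is a subclass of IOException and lands here
            // when the port is already taken.
            return false;
        }
    }

    public static void main(String[] args) {
        // 1100 is the port from the HANamingService error above;
        // another port can be passed as the first argument.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 1100;
        System.out.println("port " + port + " free? " + isPortFree(port));
    }
}
```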
At the end of this procedure my server is up, as I mentioned, but I guess there should be a more graceful way of doing this.
Questions
-------------
1. Is there a limit on the number of nodes that can form a cluster? If yes, is it 11?
2. Some of my applications do not need clustering. How can I disable it in the config file?
3. Once my clusters are all deployed, is there a stub I can use to check the cluster info, or is it available on the management console?
Thanks in advance,
goks
P.S.: I can provide the logs if needed.