-
1. Re: Unix kill partition registerMembershipListener still reg
brian.stansberry May 5, 2006 8:46 AM (in response to misge)
I don't understand what you mean. Why does killing one instance mean another instance should stop listening for membership changes?
-
2. Re: Unix kill partition registerMembershipListener still reg
misge May 5, 2006 9:04 AM (in response to misge)
Because when I restart JBoss on Solaris, it should rejoin the group and act as a secondary node. In my case, JBoss acts alone as the master node. I also tried the TopologyMonitorService configured from its XML, but with no luck.
Is it possible to get an example of how to register an HAMBean in order to receive Topology notifications?
This is what I did so far. It works, but not every time.

public class ClusterNodeListenerTim
    extends TopologyMonitorService // org.jboss.system.ServiceMBeanSupport
    implements ClusterNodeListenerTimMBean,
               org.jboss.ha.framework.interfaces.HAPartition.AsynchHAMembershipListener
               // org.jboss.ha.framework.interfaces.HAPartition.HAMembershipListener
{
    // attributes ------------------------------------------------
    final static Logger log = Logger.getLogger(ClusterNodeListenerTim.class);
    private String partitionName = ServerConfigUtil.getDefaultPartitionName();
    private HAPartition partition;
    private String hostname;

    // Constructors -------------------------------------------------
    /**
     * @jmx.managed-constructor
     **/
    public ClusterNodeListenerTim() {
        super();
    }

    // ServiceMBeanSupport overrides --------------------------------
    public void createService() throws Exception {
        InitialContext ctx = new InitialContext();
        String partitionJndiName = "/HAPartition/" + partitionName;
        partition = (HAPartition) ctx.lookup(partitionJndiName);
        // Register as a listener of cluster membership changes
        partition.registerMembershipListener(this);
        log.info("Registered as MembershipListener");
        try {
            hostname = InetAddress.getLocalHost().getHostName();
        } catch (IOException e) {
            log.warn("Failed to lookup local hostname", e);
            hostname = "<unknown>";
        }
        log.info("SNMP trap.... create " + hostname + ", partitionName:" + partitionName);
    }

    public void destroyService() {
        log.info("SNMP trap.... destroying" + hostname);
        partition.unregisterMembershipListener(this);
    }

    public void membershipChanged(Vector deadMembers, Vector newMembers, Vector allMembers) {
        if (deadMembers.size() != 0) {
            log.info("======================================================================");
            log.fatal("*******Dead members: " + deadMembers.size() + " (" + deadMembers + ")");
            log.info("======================================================================");
            //log.info("*******New Members : " + newMembers.size() + " (" + newMembers + ")");
            //log.info("*******All Members : " + allMembers.size() + " (" + allMembers + ")");
        }
        super.membershipChanged(deadMembers, newMembers, allMembers);
    }
}
-
3. Re: Unix kill partition registerMembershipListener still reg
brian.stansberry May 6, 2006 2:06 PM (in response to misge)
What you showed looks fine, but you need to tell me exactly what your problem is. Is it just that when you restart a server it doesn't always join the group properly, i.e. I should ignore the stuff about zombies? If so, see http://wiki.jboss.org/wiki/Wiki.jsp?page=TestingJBoss for info on troubleshooting this kind of problem.
-
4. Re: Unix kill partition registerMembershipListener still reg
misge May 8, 2006 2:55 AM (in response to misge)
This is my config file.
<Config>
    <TCP bind_addr="146.124.115.149" start_port="7800" loopback="true"/>
    <TCPPING initial_hosts="146.124.115.149[7800],146.124.111.104[7800],146.124.115.37[7800]"
             port_range="3" timeout="3500" num_initial_members="3"
             up_thread="true" down_thread="true"/>
    <MERGE2 min_interval="5000" max_interval="10000"/>
    <FD shun="true" timeout="2500" max_tries="5" up_thread="true" down_thread="true"/>
    <VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false"/>
    <pbcast.NAKACK down_thread="true" up_thread="true" gc_lag="100" retransmit_timeout="3000"/>
    <pbcast.STABLE desired_avg_gossip="20000" down_thread="false" up_thread="false"/>
    <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="false"
                print_local_addr="true" down_thread="true" up_thread="true"/>
    <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
</Config>
</attribute>
<depends>jboss:service=Naming</depends>
</mbean>
I use the same config on both WinXP and Solaris; only bind_addr differs. I think the solution will come step by step.
Yes, you are right, forget the zombie. It was a problem caused by an unclean build.
The problem remains when one JBoss instance restarts, whether it is the master or a secondary node. It seems that it cannot join the group afterwards. I use the TCP config (above) due to firewall restrictions. I cannot run the test you proposed because of our firewall. Any other ideas?
Cheers
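
Since the firewall rules out the suggested test, one way to narrow this down is to check from each node whether the TCP ports listed in initial_hosts can be reached at all. The following is only a minimal sketch in plain Java, not part of JBoss or JGroups. The class name InitialHostsProbe and its methods are hypothetical, and it assumes that each host[port] entry covers the start port through start port + port_range inclusive; the exact range semantics may differ between JGroups versions.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a JBoss/JGroups class: probes the TCP ports that
// a TCPPING initial_hosts attribute refers to.
class InitialHostsProbe {

    /**
     * Expands "host[port],host[port]" into one unresolved address per candidate port.
     * Assumption: ports startPort..startPort+portRange (inclusive) are candidates.
     */
    static List<InetSocketAddress> parseInitialHosts(String initialHosts, int portRange) {
        List<InetSocketAddress> targets = new ArrayList<>();
        for (String entry : initialHosts.split(",")) {
            String trimmed = entry.trim();
            int open = trimmed.indexOf('[');
            int close = trimmed.indexOf(']');
            if (open < 1 || close < open) {
                throw new IllegalArgumentException("Expected host[port], got: " + trimmed);
            }
            String host = trimmed.substring(0, open);
            int startPort = Integer.parseInt(trimmed.substring(open + 1, close));
            for (int port = startPort; port <= startPort + portRange; port++) {
                targets.add(InetSocketAddress.createUnresolved(host, port));
            }
        }
        return targets;
    }

    /** Returns true if a TCP connection to the target succeeds within the timeout. */
    static boolean canConnect(InetSocketAddress target, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(
                new InetSocketAddress(target.getHostString(), target.getPort()),
                timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // refused, filtered by a firewall, or timed out
        }
    }

    public static void main(String[] args) {
        // Values copied from the TCPPING element in the config above.
        String initialHosts =
            "146.124.115.149[7800],146.124.111.104[7800],146.124.115.37[7800]";
        for (InetSocketAddress target : parseInitialHosts(initialHosts, 3)) {
            String status = canConnect(target, 2000) ? "reachable" : "NOT reachable";
            System.out.println(target.getHostString() + ":" + target.getPort() + " " + status);
        }
    }
}
```

Run it on each machine while the other JBoss instances are up. A port only counts as reachable if something is listening on it, so a "NOT reachable" result for a port that no node has bound is expected; what matters is whether the port each running node actually bound can be reached from the others.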