4 Replies Latest reply on Oct 30, 2008 2:22 AM by sonali majumdar

    FamilyClusterInfo loses all targets

    Ernest Mishkin Newbie

      Hello:

      I'm experiencing a problem where a FamilyClusterInfo instance loses all of its targets even though one of the nodes in the cluster is up. The scenario is described below; I'd like the forum community's advice on whether this looks like a bug that should be filed as a JIRA issue.

      The setup:

      JBoss 4.2.2.GA installed on two Linux servers, running on a 1.6.0_04 JVM

      A jboss config (aka server/instance) with several EJB3 SLSBs deployed and clustering enabled (HA JNDI and all). This will be referred to as "server instance"

      A jboss config (aka server/instance) which looks up and makes use of those SLSBs. This will be referred to as "client instance"

      The client uses a custom LoadBalancePolicy implementation which, for the sake of debugging the problem, has been stripped down to printing an error line if familyClusterInfo.getTargets().isEmpty() is true
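
      For reference, a stripped-down policy of this kind might look roughly like the sketch below. To keep it self-contained it does not compile against the JBoss jars: ClusterInfo is a hypothetical stand-in for the one FamilyClusterInfo method the policy touches (getTargets()), and the class name and the "first target" choice are illustrative assumptions, not the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

class DebugPolicySketch {

    /** Hypothetical stand-in for the single FamilyClusterInfo method used here. */
    interface ClusterInfo {
        List<Object> getTargets();
    }

    /**
     * Prints an error line and returns null when the target list is empty;
     * otherwise returns the first target (an arbitrary, illustrative choice).
     */
    static Object chooseTarget(ClusterInfo info) {
        List<Object> targets = info.getTargets();
        if (targets.isEmpty()) {
            System.err.println("ERROR: cluster info has no targets");
            return null;
        }
        return targets.get(0);
    }

    public static void main(String[] args) {
        final List<Object> targets = new ArrayList<Object>();
        ClusterInfo info = new ClusterInfo() {
            public List<Object> getTargets() {
                return targets;
            }
        };
        System.out.println(chooseTarget(info)); // empty list: error line on stderr
        targets.add("nodeA");
        System.out.println(chooseTarget(info)); // one target available
    }
}
```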

      Both configs are running on both nodes by default; the clusters appear to be set up properly, and everything functions just fine

      The problem scenario:
      Leave a single client instance up. It connects to either of the server instances as expected. Bring a server instance (say, "node A") down. FamilyClusterInfo is properly updated, all calls are served by "node B".
      Now bring "node A" up and bring "node B" down. At this point FamilyClusterInfo apparently loses all targets (B is removed, but A is not added). This is confirmed by the error output from the custom LoadBalancePolicy impl as well as by the exception on the client side:

      java.lang.RuntimeException: Unreachable?: Service unavailable.
       at org.jboss.aspects.remoting.ClusterChooserInterceptor.invoke(ClusterChooserInterceptor.java:176)


      It seems that FamilyClusterInfo is only updated when dead targets are removed, never when newly alive targets are added.
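
      To make the suspected behavior concrete, here is a toy model of the scenario. It is not JBoss code: TargetListSketch, removeDead and refresh are hypothetical names invented for illustration. It contrasts a client-side target list that only ever has dead targets removed with one that is also refreshed from the current cluster view.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TargetListSketch {
    private final List<String> targets;

    TargetListSketch(List<String> initial) {
        targets = new ArrayList<String>(initial);
    }

    /** Drops a node that failed to respond. */
    void removeDead(String node) {
        targets.remove(node);
    }

    /** Replaces the whole list with the current cluster view. */
    void refresh(List<String> view) {
        targets.clear();
        targets.addAll(view);
    }

    /** Returns a copy so callers cannot mutate the internal list. */
    List<String> getTargets() {
        return new ArrayList<String>(targets);
    }

    public static void main(String[] args) {
        // Suspected behavior: removals only, no refresh when a node returns.
        TargetListSketch removalOnly = new TargetListSketch(Arrays.asList("A", "B"));
        removalOnly.removeDead("A"); // node A goes down
        // node A comes back up, but nothing re-adds it
        removalOnly.removeDead("B"); // node B goes down
        System.out.println(removalOnly.getTargets().isEmpty()); // true

        // Expected behavior: the list is refreshed when node A rejoins.
        TargetListSketch refreshed = new TargetListSketch(Arrays.asList("A", "B"));
        refreshed.removeDead("A");
        refreshed.refresh(Arrays.asList("B", "A")); // node A rejoins the view
        refreshed.removeDead("B");
        System.out.println(refreshed.getTargets()); // [A]
    }
}
```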

      So... does this look like a solid candidate for posting as an issue? Or am I missing something obvious? Or perhaps this should've been posted to the developer forum instead?


      --Ernest