6 Replies Latest reply on Mar 9, 2006 7:12 AM by wzzzrd

    HAJNDI/ HAJMS Discovery Problem

      Hi there,

      I searched the forums, the wiki and the web and found no solution, so I am posting my question here.

      Configuration: JBoss 4.0.3SP1, OS: Red Hat Enterprise (AMD64), single machine, one JBoss installation. The cluster has two nodes (represented by two servers), each bound to a different IP address via the -b parameter.

      Both servers have the HANaming, HAJNDI and HASingleton services installed, and both contain a deploy-hasingleton directory holding our JMS deployments (destinations), which was taken from the ALL configuration and modified.

      The aim: We are using HAJMS Topics to send replication messages to all cluster members.

      Configuration steps (same on both nodes):

      deploy/jms/hajndi-jms-ds.xml:

      <connection-factories>
      
       <!-- ==================================================================== -->
       <!-- JMS Stuff -->
       <!-- ==================================================================== -->
      
       <!-- The JMS provider loader -->
       <mbean code="org.jboss.jms.jndi.JMSProviderLoader"
       name="jboss.mq:service=JMSProviderLoader,name=HAJNDIJMSProvider">
       <attribute name="ProviderName">DefaultJMSProvider</attribute>
       <attribute name="ProviderAdapterClass">
       org.jboss.jms.jndi.JNDIProviderAdapter
       </attribute>
       <!-- The combined connection factory -->
       <attribute name="FactoryRef">XAConnectionFactory</attribute>
       <!-- The queue connection factory -->
       <attribute name="QueueFactoryRef">XAConnectionFactory</attribute>
       <!-- The topic factory -->
       <attribute name="TopicFactoryRef">XAConnectionFactory</attribute>
       <!-- Access JMS via HAJNDI -->
       <attribute name="Properties">
       java.naming.factory.initial=org.jnp.interfaces.NamingContextFactory
       java.naming.factory.url.pkgs=org.jboss.naming:org.jnp.interfaces
       java.naming.provider.url=${jboss.bind.address}:1100
       jnp.disableDiscovery=false
       jnp.partitionName=${jboss.partition.name:StagePartition}
       jnp.discoveryGroup=${jboss.partition.udpGroup:230.0.0.4}
       jnp.discoveryPort=1102
       jnp.discoveryTTL=16
       jnp.discoveryTimeout=5000
       jnp.maxRetries=1
       </attribute>
       </mbean>
      
       <!-- The server session pool for Message Driven Beans -->
       <mbean code="org.jboss.jms.asf.ServerSessionPoolLoader"
       name="jboss.mq:service=ServerSessionPoolMBean,name=StdJMSPool">
       <depends optional-attribute-name="XidFactory">jboss:service=XidFactory</depends>
       <attribute name="PoolName">StdJMSPool</attribute>
       <attribute name="PoolFactoryClass">
       org.jboss.jms.asf.StdServerSessionPoolFactory
       </attribute>
       </mbean>
      
       <!-- JMS XA Resource adapter, use this to get transacted JMS in beans -->
       <tx-connection-factory>
       <jndi-name>JmsXA</jndi-name>
       <xa-transaction/>
       <rar-name>jms-ra.rar</rar-name>
       <connection-definition>org.jboss.resource.adapter.jms.JmsConnectionFactory</connection-definition>
       <config-property name="SessionDefaultType" type="java.lang.String">javax.jms.Topic</config-property>
       <config-property name="JmsProviderAdapterJNDI" type="java.lang.String">java:/DefaultJMSProvider</config-property>
       <max-pool-size>20</max-pool-size>
       <security-domain-and-application>JmsXARealm</security-domain-and-application>
       </tx-connection-factory>
      
      </connection-factories>
      

      deploy/cluster-service.xml:
      <server>
      
       <!-- ==================================================================== -->
       <!-- Cluster Partition: defines cluster -->
       <!-- ==================================================================== -->
      
       <mbean code="org.jboss.ha.framework.server.ClusterPartition"
       name="jboss:service=${jboss.partition.name:StagePartition}">
      
       <!-- Name of the partition being built -->
       <attribute name="PartitionName">${jboss.partition.name:StagePartition}</attribute>
      
       <!-- The address used to determine the node name -->
       <attribute name="NodeAddress">${jboss.bind.address}</attribute>
      
       <!-- Determine if deadlock detection is enabled -->
       <attribute name="DeadlockDetection">False</attribute>
      
       <!-- Max time (in ms) to wait for state transfer to complete. Increase for large states -->
       <attribute name="StateTransferTimeout">30000</attribute>
      
       <!-- The JGroups protocol configuration -->
       <attribute name="PartitionConfig">
       <!--
       The default UDP stack:
       - If you have a multihomed machine, set the UDP protocol's bind_addr attribute to the
       appropriate NIC IP address, e.g bind_addr="192.168.0.2".
       - On Windows machines, because of the media sense feature being broken with multicast
       (even after disabling media sense) set the UDP protocol's loopback attribute to true
       -->
       <Config>
       <UDP mcast_addr="${jboss.partition.udpGroup:230.0.0.4}" bind_addr="${jboss.bind.address}" mcast_port="45566"
       ip_ttl="8" ip_mcast="true"
       mcast_send_buf_size="800000" mcast_recv_buf_size="150000"
       ucast_send_buf_size="800000" ucast_recv_buf_size="150000"
       loopback="false"/>
       <PING timeout="2000" num_initial_members="3"
       up_thread="true" down_thread="true"/>
       <MERGE2 min_interval="10000" max_interval="20000"/>
       <FD shun="true" up_thread="true" down_thread="true"
       timeout="2500" max_tries="5"/>
       <VERIFY_SUSPECT timeout="3000" num_msgs="3"
       up_thread="true" down_thread="true"/>
       <pbcast.NAKACK gc_lag="50" retransmit_timeout="300,600,1200,2400,4800"
       max_xmit_size="8192"
       up_thread="true" down_thread="true"/>
       <UNICAST timeout="300,600,1200,2400,4800" window_size="100" min_threshold="10"
       down_thread="true"/>
       <pbcast.STABLE desired_avg_gossip="20000"
       up_thread="true" down_thread="true"/>
       <FRAG frag_size="8192"
       down_thread="true" up_thread="true"/>
       <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
       shun="true" print_local_addr="true"/>
       <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
       </Config>
      
       </attribute>
       <depends>jboss:service=Naming</depends>
       </mbean>
      
       <!-- ==================================================================== -->
       <!-- HA Session State Service for SFSB -->
       <!-- ==================================================================== -->
      
       <mbean code="org.jboss.ha.hasessionstate.server.HASessionStateService"
       name="jboss:service=HASessionState">
       <!-- Name of the partition to which the service is linked -->
       <attribute name="PartitionName">${jboss.partition.name:StagePartition}</attribute>
       <!-- JNDI name under which the service is bound -->
       <attribute name="JndiName">/HASessionState/Default</attribute>
       <!-- Max delay before cleaning unreclaimed state.
       Defaults to 30*60*1000 => 30 minutes -->
       <attribute name="BeanCleaningDelay">0</attribute>
       <depends>jboss:service=Naming</depends>
       <depends>jboss:service=${jboss.partition.name:StagePartition}</depends>
       </mbean>
      
       <!-- ==================================================================== -->
       <!-- HA JNDI -->
       <!-- ==================================================================== -->
      
       <mbean code="org.jboss.ha.jndi.HANamingService"
       name="jboss:service=HAJNDI">
       <depends>jboss:service=${jboss.partition.name:StagePartition}</depends>
       <!-- Name of the partition to which the service is linked -->
       <attribute name="PartitionName">${jboss.partition.name:StagePartition}</attribute>
       <!-- Bind address of bootstrap and HA-JNDI RMI endpoints -->
       <attribute name="BindAddress">${jboss.bind.address}</attribute>
       <!-- Port on which the HA-JNDI stub is made available -->
       <attribute name="Port">1100</attribute>
       <!-- RmiPort to be used by the HA-JNDI service once bound. 0 => auto. -->
       <attribute name="RmiPort">1101</attribute>
       <!-- Accept backlog of the bootstrap socket -->
       <attribute name="Backlog">50</attribute>
       <!-- The thread pool service used to control the bootstrap and
       auto discovery lookups -->
       <depends optional-attribute-name="LookupPool"
       proxy-type="attribute">jboss.system:service=ThreadPool</depends>
      
       <!-- A flag to disable the auto discovery via multicast -->
       <attribute name="DiscoveryDisabled">false</attribute>
       <!-- Set the auto-discovery bootstrap multicast bind address. If not
       specified and a BindAddress is specified, the BindAddress will be used. -->
       <attribute name="AutoDiscoveryBindAddress">${jboss.bind.address}</attribute>
       <!-- Multicast Address and group port used for auto-discovery -->
       <attribute name="AutoDiscoveryAddress">${jboss.partition.udpGroup:230.0.0.4}</attribute>
       <attribute name="AutoDiscoveryGroup">1102</attribute>
       <!-- The TTL (time-to-live) for autodiscovery IP multicast packets -->
       <attribute name="AutoDiscoveryTTL">16</attribute>
      
       <!-- Client socket factory to be used for client-server
       RMI invocations during JNDI queries
       <attribute name="ClientSocketFactory">custom</attribute>
       -->
       <!-- Server socket factory to be used for client-server
       RMI invocations during JNDI queries
       <attribute name="ServerSocketFactory">custom</attribute>
       -->
       </mbean>
      
       <mbean code="org.jboss.invocation.jrmp.server.JRMPInvokerHA"
       name="jboss:service=invoker,type=jrmpha">
       <attribute name="ServerAddress">${jboss.bind.address}</attribute>
       <attribute name="RMIObjectPort">4447</attribute>
       <!--
       <attribute name="RMIClientSocketFactory">custom</attribute>
       <attribute name="RMIServerSocketFactory">custom</attribute>
       -->
       <depends>jboss:service=Naming</depends>
       </mbean>
      
       <!-- the JRMPInvokerHA creates a thread per request. This implementation uses a pool of threads -->
       <mbean code="org.jboss.invocation.pooled.server.PooledInvokerHA"
       name="jboss:service=invoker,type=pooledha">
       <attribute name="NumAcceptThreads">1</attribute>
       <attribute name="MaxPoolSize">300</attribute>
       <attribute name="ClientMaxPoolSize">300</attribute>
       <attribute name="SocketTimeout">60000</attribute>
       <attribute name="ServerBindAddress">${jboss.bind.address}</attribute>
       <attribute name="ServerBindPort">4446</attribute>
       <attribute name="ClientConnectAddress">${jboss.bind.address}</attribute>
       <attribute name="ClientConnectPort">0</attribute>
       <attribute name="EnableTcpNoDelay">false</attribute>
       <depends optional-attribute-name="TransactionManagerService">jboss:service=TransactionManager</depends>
       <depends>jboss:service=Naming</depends>
       </mbean>
      
       <!-- ==================================================================== -->
      
       <!-- ==================================================================== -->
       <!-- Distributed cache invalidation -->
       <!-- ==================================================================== -->
      
       <mbean code="org.jboss.cache.invalidation.bridges.JGCacheInvalidationBridge"
       name="jboss.cache:service=InvalidationBridge,type=JavaGroups">
       <attribute name="InvalidationManager">jboss.cache:service=InvalidationManager</attribute>
       <attribute name="PartitionName">${jboss.partition.name:StagePartition}</attribute>
       <attribute name="BridgeName">DefaultJGBridge</attribute>
       <depends>jboss:service=${jboss.partition.name:StagePartition}</depends>
       <depends>jboss.cache:service=InvalidationManager</depends>
       </mbean>
      </server>
      

      deploy/deploy-hasingleton-service.xml:
      <server>
      
       <mbean code="org.jboss.ha.singleton.HASingletonController"
       name="jboss.ha:service=HASingletonDeployer">
       <depends>jboss:service=${jboss.partition.name:StagePartition}</depends>
       <depends optional-attribute-name="TargetName">jboss.system:service=MainDeployer</depends>
       <attribute name="PartitionName">${jboss.partition.name:StagePartition}</attribute>
       <attribute name="TargetStartMethod">deploy</attribute>
       <attribute name="TargetStartMethodArgument">${jboss.server.home.url}/deploy-hasingleton</attribute>
       <attribute name="TargetStopMethod">undeploy</attribute>
       <attribute name="TargetStopMethodArgument">${jboss.server.home.url}/deploy-hasingleton</attribute>
       </mbean>
      </server>
      

      The problem: When starting one node, everything seems fine: the ConnectionFactories get bound and the Global Namespace can be used to look up Topics and ConnectionFactories. However, when a second node is started, the Global Namespace does not get replicated.

      The log says (Node 1):
      23:02:15,071 INFO [StagePartition] Initializing
      23:02:17,121 INFO [StagePartition] Number of cluster members: 1
      23:02:17,121 INFO [StagePartition] Other members: 0
      23:02:17,121 INFO [StagePartition] Fetching state (will wait for 30000 milliseconds):
      23:02:17,123 INFO [StagePartition] New cluster view for partition StagePartition (id: 0, delta: 0) : [192.168.100.211:1099]
      23:02:17,128 INFO [StagePartition] I am (192.168.100.211:1099) received membershipChanged event:
      23:02:17,128 INFO [StagePartition] Dead members: 0 ([])
      23:02:17,128 INFO [StagePartition] New Members : 0 ([])
      23:02:17,128 INFO [StagePartition] All Members : 1 ([192.168.100.211:1099])
      23:02:17,157 INFO [HANamingService] Started ha-jndi bootstrap jnpPort=1100, backlog=50, bindAddress=/192.168.100.211
      23:02:17,163 INFO [DetachedHANamingService$AutomaticDiscovery] Listening on /192.168.100.211:1102, group=230.0.0.4, HA-JNDI address=192.168.100.211:1100
      

      The log says (Node 2):
      22:33:20,340 INFO [StagePartition] Initializing
      22:33:22,417 INFO [StagePartition] Number of cluster members: 2
      22:33:22,417 INFO [StagePartition] Other members: 1
      22:33:22,417 INFO [StagePartition] Fetching state (will wait for 30000 milliseconds):
      22:33:22,417 INFO [StagePartition] New cluster view for partition StagePartition: 1 ([192.168.100.211:1099, 192.168.100.212:1099] delta: 0)
      22:33:22,420 INFO [StagePartition] I am (null) received membershipChanged event:
      22:33:22,420 INFO [StagePartition] Dead members: 0 ([])
      22:33:22,420 INFO [StagePartition] New Members : 0 ([])
      22:33:22,421 INFO [StagePartition] All Members : 2 ([192.168.100.211:1099, 192.168.100.212:1099])
      22:33:22,518 INFO [HANamingService] Started ha-jndi bootstrap jnpPort=1100, backlog=50, bindAddress=/192.168.100.212
      22:33:22,524 INFO [DetachedHANamingService$AutomaticDiscovery] Listening on /192.168.100.212:1102, group=230.0.0.4, HA-JNDI address=192.168.100.212:1100
      


      But only the node started first can obtain ConnectionFactories or Topics. The other node always says:
      javax.naming.NameNotFoundException: XAConnectionFactory not bound
       at org.jnp.server.NamingServer.getBinding(NamingServer.java:514)
       at org.jnp.server.NamingServer.getBinding(NamingServer.java:522)
       at org.jnp.server.NamingServer.getObject(NamingServer.java:528)
       at org.jnp.server.NamingServer.lookup(NamingServer.java:281)
       at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:610)
       at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:572)
       at javax.naming.InitialContext.lookup(InitialContext.java:351)
      

      This happens no matter whether we use port 1099 or the Global Namespace at port 1100, and even if we use "ConnectionFactory" as the lookup name. Also, the "Global Namespace" listed by the JNDIView MBean on the second node does not contain anything from the deploy-hasingleton directory.

      I'm quite sure it's an HAJNDI problem, because we only run into trouble when trying to get things from HAJNDI; our EJB3 MDBs seem to have no problem resolving their Topics (configured via annotations).

      I have no idea what else to try, the config seems fine to me.

      Any help would be appreciated.

      Thanks in advance,
      Martin

        • 1. Re: HAJNDI/ HAJMS Discovery Problem

          I don't have any information on your problem, just an observation about the Global Namespace in the JNDIView mbean.

          The Global Namespace is unrelated to HA-JNDI; it's a store for local JNDI bindings. So if you bind something in HA-JNDI, it won't appear there. Of course, HA-JNDI should locate bindings in local namespaces on other nodes so if something is bound there on one node, it should be found via an HA-JNDI lookup on any node.

          JBossAS 4.0.4 includes HA-JNDI bindings in the JNDIView mbean's list() output so that you can also view HA-JNDI bindings. They're included under the heading HA-JNDI Namespace.

          • 2. Re: HAJNDI/ HAJMS Discovery Problem

             

            "JerryGauth" wrote:
            I don't have any information on your problem, just an observation about the Global Namespace in the JNDIView mbean.

            The Global Namespace is unrelated to HA-JNDI; it's a store for local JNDI bindings. So if you bind something in HA-JNDI, it won't appear there. Of course, HA-JNDI should locate bindings in local namespaces on other nodes so if something is bound there on one node, it should be found via an HA-JNDI lookup on any node.

            JBossAS 4.0.4 includes HA-JNDI bindings in the JNDIView mbean's list() output so that you can also view HA-JNDI bindings. They're included under the heading HA-JNDI Namespace.

            Sorry, I didn't know that. However, the problem remains. We wrote our own HAJNDIView MBean, which gets its InitialContext via:
             Properties properties = new Properties();
             properties.setProperty(Context.PROVIDER_URL, bindAddress);
             properties.setProperty(Context.INITIAL_CONTEXT_FACTORY, "org.jnp.interfaces.NamingContextFactory");
             properties.setProperty(Context.URL_PKG_PREFIXES, "org.jboss.naming:org.jnp.interfaces");
             InitialContext context = new InitialContext(properties);
            


            When using exactly the same properties with a standalone client, we get the right context back. When using this inside the server, we get something else. Is that intended?

            Regards,
            Martin

            • 3. Re: HAJNDI/ HAJMS Discovery Problem

              Does your PROVIDER_URL property specify the HA-JNDI port? For example,
              jnp://localhost:1100

              • 4. Re: HAJNDI/ HAJMS Discovery Problem

                 

                "JerryGauth" wrote:
                Does your PROVIDER_URL property specify the HA-JNDI port? For example,
                jnp://localhost:1100

                Yes, it does: bindAddress contains "jnp://IPADDRESS:1100" (PROVIDER_URL is just the key taken from the javax.naming.Context interface). As I said, it works perfectly for a remote client, but not from within any node other than the one started first.

                What might also be interesting: if the first node is shut down, all the HA deployments are correctly deployed to the other node. When the first node is restarted, the lookup fails on THAT node but works on the other. In short: the lookup succeeds only on the HA master node. Strange.

                Regards,
                Martin

                • 5. Re: HAJNDI/ HAJMS Discovery Problem

                  HA-JNDI bindings are replicated, so any HA-JNDI bindings on one node would be copied to the other node. Global JNDI bindings would not be replicated, as they're only bound locally. In either case, using HA-JNDI to perform a lookup should locate all local and HA bindings on nodes in the cluster.

                  If your HA lookup is failing from a restarted node, it seems like one of the following is the problem.

                  1) The lookup is being performed locally, not via HA.
                  2) The new node isn't recognized as part of the cluster.
                  3) There's a problem in the HA code and the lookup isn't being properly propagated to other nodes in the cluster. Remote lookups of locally bound objects are included in the unit tests for HA-JNDI so in general this should work properly.
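
One way to narrow down case 1 is to run the same lookup against the local naming port (1099) and the HA-JNDI port (1100) and compare the outcomes. The following is only a rough sketch: the class and method names (JndiLookupProbe, envFor, probe) are made up for illustration, the ports and the name "XAConnectionFactory" are taken from the configuration posted above, and the JBoss naming client classes are assumed to be on the classpath when it runs against a real server.

```java
import java.util.Properties;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

/** Hypothetical helper: compares a lookup on the local and the HA-JNDI port. */
final class JndiLookupProbe {

    /** Builds a jnp environment that points at one explicit host and port. */
    static Properties envFor(String host, int port) {
        Properties env = new Properties();
        env.setProperty(Context.INITIAL_CONTEXT_FACTORY,
                "org.jnp.interfaces.NamingContextFactory");
        env.setProperty(Context.URL_PKG_PREFIXES,
                "org.jboss.naming:org.jnp.interfaces");
        env.setProperty(Context.PROVIDER_URL, "jnp://" + host + ":" + port);
        return env;
    }

    /** Tries one lookup and describes the outcome instead of throwing. */
    static String probe(String host, int port, String name) {
        try {
            Context ctx = new InitialContext(envFor(host, port));
            try {
                Object found = ctx.lookup(name);
                String type = (found == null) ? "null" : found.getClass().getName();
                return port + ": found " + type;
            } finally {
                ctx.close();
            }
        } catch (NamingException e) {
            // Covers NameNotFoundException as well as a missing context factory.
            return port + ": failed with " + e;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        // 1099 = local JNDI, 1100 = HA-JNDI (ports as configured in this thread).
        System.out.println(probe(host, 1099, "XAConnectionFactory"));
        System.out.println(probe(host, 1100, "XAConnectionFactory"));
    }
}
```

If both ports give the same failure when the probe runs inside the second node, but the HA-JNDI port succeeds from a standalone client, that would point towards the lookup being resolved locally rather than through HA-JNDI.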

                  Sorry I don't have any more to offer on this issue as I'm not familiar with the inner workings of the server or naming service.

                  • 6. Re: solved

                    Hi there,

                    I solved this problem and tested the fix with different configurations. The trick is to leave out the PROVIDER_URL property and specify the "jnp.partitionName" property instead.

                    I added this to the wiki: http://wiki.jboss.org/wiki/Wiki.jsp?page=JBossHAJNDIUseCluster
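
A minimal sketch of what that change might look like in code, assuming the partition is called StagePartition as in the cluster-service.xml above. The class and method names (HaJndiDiscoveryEnv, forPartition) are invented for this example; only the property keys come from this thread.

```java
import java.util.Properties;
import javax.naming.Context;

/** Hypothetical helper: JNDI environment that names the partition, not a URL. */
final class HaJndiDiscoveryEnv {

    /** partitionName should match the PartitionName attribute of the cluster. */
    static Properties forPartition(String partitionName) {
        Properties env = new Properties();
        env.setProperty(Context.INITIAL_CONTEXT_FACTORY,
                "org.jnp.interfaces.NamingContextFactory");
        env.setProperty(Context.URL_PKG_PREFIXES,
                "org.jboss.naming:org.jnp.interfaces");
        // Deliberately no Context.PROVIDER_URL here; the partition name is
        // given instead, as described in the post above.
        env.setProperty("jnp.partitionName", partitionName);
        return env;
    }

    public static void main(String[] args) {
        Properties env = forPartition("StagePartition");
        env.list(System.out);
        // Inside the server one would then create the context from it, e.g.:
        //   Context ctx = new javax.naming.InitialContext(env);
        //   Object factory = ctx.lookup("XAConnectionFactory");
    }
}
```

The main method only prints the environment, so it runs without a server; the commented lines show where the actual lookup would go.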

                    Regards,
                    Martin