8 Replies Latest reply on Oct 14, 2005 2:23 AM by rolfrolf

    Cannot assign requested address: Datagram  ....

    rolfrolf

      Hi

      I'm using WebLogic 8.1 SP3, Hibernate 3.0 and JBoss TreeCache. When I start the first application server everything seems fine, but as soon as I start a second app server, the second one gets stuck and reports the following error over and over again (see below).

      Any help is very much appreciated.

      Error:
      17:11:44,375 ERROR UDP:660 - exception=java.net.BindException: Cannot assign requested address: Datagram send failed, msg=[dst: 0.0.0.0:3330, src: 10.6.2.206:62085 (3 headers), size = 0 bytes], mcast_addr=224.10.10.10:45563




      Here is my treecache.xml file

      <server>
       <classpath codebase="./lib" archives="jboss-cache.jar, jgroups.jar"/>
       <mbean code="org.jboss.cache.TreeCache" name="jboss.cache:service=TreeCache">
       <depends>jboss:service=Naming</depends>
       <depends>jboss:service=TransactionManager</depends>
       <attribute name="TransactionManagerLookupClass">org.jboss.cache.GenericTransactionManagerLookup</attribute>
       <attribute name="IsolationLevel">REPEATABLE_READ</attribute>
       <attribute name="CacheMode">REPL_SYNC</attribute>
       <attribute name="UseReplQueue">false</attribute>
       <attribute name="ReplQueueInterval">0</attribute>
       <attribute name="ReplQueueMaxElements">0</attribute>
       <attribute name="ClusterName">e-LMS-Cache-Cluster</attribute>
       <attribute name="ClusterConfig">
       <config>
       <UDP bind_addr="0.0.0.0" mcast_addr="224.10.10.10" mcast_port="45566" ip_ttl="64" ip_mcast="true" mcast_send_buf_size="150000" mcast_recv_buf_size="80000" ucast_send_buf_size="150000" ucast_recv_buf_size="80000" loopback="false"/>
       <PING timeout="2000" num_initial_members="3" up_thread="false" down_thread="false"/>
       <MERGE2 min_interval="10000" max_interval="20000"/>
       <FD shun="true" up_thread="true" down_thread="true" />
       <FD_SOCK/>
       <VERIFY_SUSPECT timeout="1500" up_thread="false" down_thread="false"/>
       <pbcast.NAKACK gc_lag="50" retransmit_timeout="600,1200,2400,4800" max_xmit_size="8192" up_thread="false" down_thread="false"/>
       <UNICAST timeout="600,1200,2400" window_size="100" min_threshold="10" down_thread="false"/>
       <pbcast.STABLE desired_avg_gossip="20000" up_thread="false" down_thread="false"/>
       <FRAG frag_size="8192" down_thread="false" up_thread="false"/>
       <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="true" print_local_addr="true"/>
       <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
       </config>
       </attribute>
       <attribute name="FetchStateOnStartup">true</attribute>
       <attribute name="InitialStateRetrievalTimeout">5000</attribute>
       <attribute name="SyncReplTimeout">10000</attribute>
       <attribute name="LockAcquisitionTimeout">15000</attribute>
       <attribute name="MaxCapacity">10000</attribute>
       <attribute name="SyncReplTimeout">10000</attribute>
       <attribute name="LockLeaseTimeout">60000</attribute>
       <attribute name="EvictionPolicyConfig">
       <config>
       <attribute name="wakeUpIntervalInSeconds">5</attribute>
       <region name="/_default_">
       <attribute name="maxNodes">9750</attribute>
       <attribute name="timeToIdleSeconds">120</attribute>
       </region>
       <region name="au.com.auspost.elms.model.ElmsConfig">
       <attribute name="maxNodes">250</attribute>
       <attribute name="timeToIdleSeconds">300</attribute>
       </region>
       </config>
       </attribute>
       </mbean>
      </server>
      


        • 1. Re: Cannot assign requested address: Datagram  ....
          brian.stansberry

          In the UDP protocol config, bind_addr="0.0.0.0" won't work. It needs to point to a valid IP address used by the machine.

          You have 4 options:

          1) Leave out the bind_addr attribute; let the OS pick the interface on which it will send and receive multicast and send/receive UDP unicast.
          2) Leave out the bind_addr, but add bind_to_all_interfaces="true". Now you will receive multicast on all interfaces, but the OS will pick the one you send on.
          3) Change the bind_addr attribute to a valid IP address on your machine. You will send and receive multicast and UDP unicast via that interface.
          4) Change the bind_addr attribute to a valid IP address on your machine. You will send multicast via that interface. Also add the bind_to_all_interfaces="true" attribute. You will receive multicast on all interfaces.

          Options 1 and 3 are the more common choices; which one you pick depends on whether you care which interface JGroups uses. If your machine only has one IP address, Option 1 is the way to go.

          To complicate things a bit ;), you can also set a system property bind.address (e.g. -Dbind.address=192.168.0.5). If you do this, JGroups will ignore the bind_addr attribute and use the system property. This approach is typically used when you want to use the same TreeCache config file in all nodes in the cluster -- i.e. don't want to have a different file in each node because you need to embed an IP address.
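
          As a rough sketch of options 1 and 3, here is the UDP element from the original post with only bind_addr changed. The address 10.6.2.206 is simply the source address that appears in the posted error message; whether that is the interface you actually want is an assumption, so substitute an address that belongs to your machine. Use one variant or the other, not both.

          ```xml
          <!-- Option 1: no bind_addr, the OS picks the interface -->
          <UDP mcast_addr="224.10.10.10" mcast_port="45566" ip_ttl="64" ip_mcast="true" mcast_send_buf_size="150000" mcast_recv_buf_size="80000" ucast_send_buf_size="150000" ucast_recv_buf_size="80000" loopback="false"/>

          <!-- Option 3: bind_addr set to a real address of this machine (assumed value) -->
          <UDP bind_addr="10.6.2.206" mcast_addr="224.10.10.10" mcast_port="45566" ip_ttl="64" ip_mcast="true" mcast_send_buf_size="150000" mcast_recv_buf_size="80000" ucast_send_buf_size="150000" ucast_recv_buf_size="80000" loopback="false"/>
          ```

          With the system property approach, the Option 1 form can stay identical on every node and the address is passed at startup instead, e.g. -Dbind.address=10.6.2.206.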

          • 2. Re: Cannot assign requested address: Datagram  ....
            rolfrolf

            Thanks for the quick response. I changed the treecache.xml file as you suggested and the errors stopped appearing, but in the log files I can't find any message from TreeCache saying that it has added the second node to the cluster. Is there a way to test whether the cluster works correctly?




            <UDP mcast_addr="224.10.10.10" mcast_port="45566" ip_ttl="64" ip_mcast="true" mcast_send_buf_size="150000"
            mcast_recv_buf_size="80000" ucast_send_buf_size="150000" ucast_recv_buf_size="80000" loopback="true"/>
            


            • 3. Re: Cannot assign requested address: Datagram  ....
              brian.stansberry

              You should have in your log something like the following:

              11:46:51,708 INFO [org.jboss.cache.TreeCache] viewAccepted(): new members: [192.168.1.2:32785, 192.168.1.3:32775]


              On the 1st node to come up you'll get a message like that, but only listing itself. Then when the 2nd node comes up, each node should have a log entry like that, but now listing 2 members.
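
              If that line never shows up, it may be worth ruling out the logging setup before suspecting the cluster. A minimal sketch, assuming log4j 1.x with an XML configuration file (the format of the error line in the first post looks like log4j output, but that is a guess on my part): the fragment below lets INFO messages from the TreeCache class through to whatever appenders the root logger already has.

              ```xml
              <!-- log4j.xml fragment (assumes log4j 1.x XML configuration) -->
              <!-- viewAccepted() is logged at INFO by org.jboss.cache.TreeCache -->
              <category name="org.jboss.cache.TreeCache">
                <priority value="INFO"/>
              </category>
              ```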

              • 4. Re: Cannot assign requested address: Datagram  ....
                rolfrolf

                I do receive the message: Discard message from non-member !

                What is that telling me? Both nodes are in the same cluster according to TreeCache.xml.



                <?xml version="1.0" encoding="UTF-8"?>
                
                <!-- ===================================================================== -->
                <!-- e-LMS TreeCache Service Configuration -->
                <!-- ===================================================================== -->
                
                <server>
                 <classpath codebase="./lib" archives="jboss-cache.jar, jgroups.jar"/>
                 <mbean code="org.jboss.cache.TreeCache" name="jboss.cache:service=TreeCache">
                 <depends>jboss:service=Naming</depends>
                 <depends>jboss:service=TransactionManager</depends>
                 <attribute name="TransactionManagerLookupClass">org.jboss.cache.GenericTransactionManagerLookup</attribute>
                 <!-- Isolation level : SERIALIZABLE, REPEATABLE_READ (default), READ_COMMITTED, READ_UNCOMMITTED, NONE -->
                 <attribute name="IsolationLevel">REPEATABLE_READ</attribute>
                 <!-- Valid modes are LOCAL, REPL_ASYNC and REPL_SYNC -->
                 <attribute name="CacheMode">REPL_SYNC</attribute>
                 <!-- Just used for async repl: use a replication queue -->
                 <attribute name="UseReplQueue">false</attribute>
                 <!-- Replication interval for replication queue (in ms) -->
                 <attribute name="ReplQueueInterval">0</attribute>
                 <!-- Max number of elements which trigger replication -->
                 <attribute name="ReplQueueMaxElements">0</attribute>
                 <!-- Name of cluster. Needs to be the same for all clusters, in order
                 to find each other -->
                 <attribute name="ClusterName">e-LMS-Cache-Cluster</attribute>
                 <!-- JGroups protocol stack properties. Can also be a URL,
                 e.g. file:/home/bela/default.xml
                 <attribute name="ClusterProperties"></attribute>
                 -->
                 <attribute name="ClusterConfig">
                 <config>
                 <!-- UDP: if you have a multihomed machine, set the bind_addr attribute
                 to the appropriate NIC IP address, e.g bind_addr="192.168.0.2" -->
                 <!-- UDP: On Windows machines, because of the media sense feature being
                 broken with multicast (even after disabling media sense) set the loopback
                 attribute to true -->
                 <UDP mcast_addr="239.1.2.3" mcast_port="45566" ip_ttl="64" ip_mcast="true" mcast_send_buf_size="150000" mcast_recv_buf_size="80000" ucast_send_buf_size="150000" ucast_recv_buf_size="80000" loopback="true"/>
                 <PING timeout="2000" num_initial_members="3" up_thread="false" down_thread="false"/>
                 <MERGE2 min_interval="10000" max_interval="20000"/>
                 <FD shun="true" up_thread="true" down_thread="true" />
                 <FD_SOCK/>
                 <VERIFY_SUSPECT timeout="1500" up_thread="false" down_thread="false"/>
                 <pbcast.NAKACK gc_lag="50" retransmit_timeout="600,1200,2400,4800" max_xmit_size="8192" up_thread="false" down_thread="false"/>
                 <UNICAST timeout="600,1200,2400" window_size="100" min_threshold="10" down_thread="false"/>
                 <pbcast.STABLE desired_avg_gossip="20000" up_thread="false" down_thread="false"/>
                 <FRAG frag_size="8192" down_thread="false" up_thread="false"/>
                 <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="true" print_local_addr="true"/>
                 <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
                 </config>
                 </attribute>
                 <!-- Whether or not to fetch state on joining a cluster -->
                 <attribute name="FetchStateOnStartup">true</attribute>
                 <!--
                 The max amount of time (in milliseconds) we wait until the
                 initial state (ie. the contents of the cache) are retrieved from
                 existing members in a clustered environment
                 -->
                 <attribute name="InitialStateRetrievalTimeout">5000</attribute>
                 <!--
                 Number of milliseconds to wait until all responses for a
                 synchronous call have been received.
                 -->
                 <attribute name="SyncReplTimeout">10000</attribute>
                 <!-- Max number of milliseconds to wait for a lock acquisition -->
                 <attribute name="LockAcquisitionTimeout">15000</attribute>
                 <attribute name="MaxCapacity">10000</attribute>
                 <attribute name="SyncReplTimeout">10000</attribute>
                 <attribute name="LockLeaseTimeout">60000</attribute>
                 <!--attribute name="EvictionPolicyClass">org.jboss.cache.eviction.LRUPolicy</attribute-->
                 <attribute name="EvictionPolicyConfig">
                 <config>
                 <attribute name="wakeUpIntervalInSeconds">5</attribute>
                 <region name="/_default_">
                 <attribute name="maxNodes">9750</attribute>
                 <attribute name="timeToIdleSeconds">120</attribute>
                 </region>
                 <region name="au.com.auspost.elms.model.ElmsConfig">
                 <attribute name="maxNodes">250</attribute>
                 <attribute name="timeToIdleSeconds">300</attribute>
                 </region>
                 </config>
                 </attribute>
                 <!--
                 <attribute name="CacheLoaderClass">org.jboss.cache.loader.bdbje.BdbjeCacheLoader</attribute>
                 <attribute name="CacheLoaderConfig">c:\temp\bdbje</attribute>
                 <attribute name="CacheLoaderShared">true</attribute>
                 <attribute name="CacheLoaderPreload">/</attribute>
                 -->
                 <!--
                 <attribute name="CacheLoaderClass">org.jboss.cache.loader.FileCacheLoader</attribute>
                 <attribute name="CacheLoaderConfig">/tmp</attribute>
                 <attribute name="CacheLoaderShared">true</attribute>
                 <attribute name="CacheLoaderPreload">/</attribute>
                 -->
                 </mbean>
                </server>
                


                • 5. Re: Cannot assign requested address: Datagram  ....
                  brian.stansberry

                  1) Do you get the viewAccepted() messages?

                  2) Is there another JGroups channel somewhere using the same mcast_addr and mcast_port? The discard messages often appear when the JGroups channel is receiving packets intended for a different cluster. Check cluster-service.xml and tc5-cluster-service.xml if you're using the JBoss all config. In any case, try changing the mcast_port to something else.
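
                  For that last suggestion, a sketch of the UDP element from your posted config with only the multicast port changed. The value 45599 is an arbitrary pick for illustration, nothing special; any port that no other JGroups channel on your network uses should do, and every node of this cache cluster has to use the same one.

                  ```xml
                  <!-- Same UDP element as posted, only mcast_port differs (45599 is a hypothetical value) -->
                  <UDP mcast_addr="239.1.2.3" mcast_port="45599" ip_ttl="64" ip_mcast="true" mcast_send_buf_size="150000" mcast_recv_buf_size="80000" ucast_send_buf_size="150000" ucast_recv_buf_size="80000" loopback="true"/>
                  ```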

                  • 6. Re: Cannot assign requested address: Datagram  ....
                    rolfrolf

                    1) Do you get the viewAccepted() messages?

                    No, I don't get the viewAccepted() messages.

                    2) Is there another JGroups channel somewhere using the same mcast_addr and mcast_port? The discard messages often appear when the JGroups channel is receiving packets intended for a different cluster. Check cluster-service.xml and tc5-cluster-service.xml if you're using the JBoss all config. In any case, try changing the mcast_port to something else.

                    I changed the port, but I still have the same problem.

                    • 7. Re: Cannot assign requested address: Datagram  ....
                      brian.stansberry

                      I notice in your config you have both FD and FD_SOCK. You should pick one or the other.
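
                      A sketch of what that part of the stack could look like with a single failure detector. Keeping FD_SOCK and dropping FD here is only an illustration; which of the two suits your network is your call, and the rest of the stack stays as posted.

                      ```xml
                      <MERGE2 min_interval="10000" max_interval="20000"/>
                      <!-- Only one failure detection protocol: the FD line is removed, FD_SOCK is kept -->
                      <FD_SOCK/>
                      <VERIFY_SUSPECT timeout="1500" up_thread="false" down_thread="false"/>
                      ```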

                      • 8. Re: Cannot assign requested address: Datagram  ....
                        rolfrolf

                        I think it's working now. I received the cache stats below, which show that the second-level cache is working. I still don't get the messages you mentioned, but remember that I'm running TreeCache on WebLogic; maybe that is why I'm not seeing them.




                        ==============================================
                        CACHE STATISTICS
                        ==============================================
                        session.open.count : 41
                        session.close.count : 41
                        second.lvl.cache.put : 0
                        second.lvl.cache.hits : 50
                        second.lvl.cache.miss : 0
                        transaction.count : 41
                        transaction.count.success: 41
                        ==============================================