5 Replies Latest reply on Sep 29, 2004 4:45 PM by belaban

    java.net.SocketException: The message is larger than the max

    arnold

      Hi all,

      I am running two machines in the partition: JBoss 3.2.4 on Windows XP SP2, Java SDK 1.4.2_05. I am using the unmodified cluster-service.xml.

      I left it running over the weekend and saw this error when I came back. (The log is from MACHINE_B, edited for easier viewing.)

      2004-09-29 10:06:16,203 DEBUG [org.jgroups.protocols.UNICAST] [MACHINE_B:1057 (additional data: 19 bytes)] --> XMIT(MACHINE_B:4460 (additional data: 19 bytes): #126)
      
      2004-09-29 10:06:16,203 DEBUG [org.jgroups.Message] header for "UDP" is already present: old header=[UDP:group_addr=DefaultPartition], new header=[UDP:group_addr=DefaultPartition]
      
      2004-09-29 10:06:16,218 DEBUG [org.jgroups.protocols.UDP] sending message to MACHINE_B:4460 (additional data: 19 bytes) (src=MACHINE_B:1057 (additional data: 19 bytes)), headers are {UNICAST=[UNICAST: DATA, seqno=126], GMS=GmsHeader[JOIN_RSP]: join_rsp=view: [MACHINE_B:1057 (additional data: 19 bytes)|1022] [MACHINE_B:1057 (additional data: 19 bytes), MACHINE_A:2920 (additional data: 19 bytes), MACHINE_B:2429 (additional data: 19 bytes), MACHINE_B:2431 (additional data: 19 bytes),
       ###SIMILAR PATTERN 500 TIMES###
      MACHINE_B:4458 (additional data: 19 bytes), MACHINE_B:4460 (additional data: 19 bytes)], digest: [MACHINE_B:1057 (additional data: 19 bytes): [0 : 1054, MACHINE_A:2920 (additional data: 19 bytes): [0 : 35, MACHINE_B:2429 (additional data: 19 bytes): [0 : 0, MACHINE_B:2431 (additional data: 19 bytes): [0 : 0, MACHINE_B:2433 (additional data: 19 bytes): [0 : 0, MACHINE_B:2435 (additional data: 19 bytes): [0 : 0, MACHINE_B:2437 (additional data: 19 bytes):
       ###SIMILAR PATTERN 500 TIMES###
      MACHINE_B:4456 (additional data: 19 bytes): [0 : 0, MACHINE_B:4458 (additional data: 19 bytes): [0 : 0, MACHINE_B:4460 (additional data: 19 bytes): [0 : 0, MACHINE_B:4460 (additional data: 19 bytes): [0 : 0]
      , UDP=[UDP:group_addr=DefaultPartition]}
      
      2004-09-29 10:06:16,234 ERROR [org.jgroups.protocols.UDP] exception=java.net.SocketException: The message is larger than the maximum supported by the underlying transport: Datagram send failed, msg=[dst: MACHINE_B:4460 (additional data: 19 bytes), src: MACHINE_B:1057 (additional data: 19 bytes) (3 headers), size = 0 bytes], mcast_addr=228.1.2.3:45566
      


      Is this a known issue? What's going on with the 3rd message?

      Thanks.
      Arnold

        • 1. Re: java.net.SocketException: The message is larger than the
          belaban

          Post your cluster-service.xml. You probably need to define a frag_size in FRAG.
          Use JGroups/bin/frag_size.sh to determine the optimal fragmentation size.

          Bela

          P.S.: you'll also need to specify the frag size in NAKACK
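
          To make the suggestion concrete, here is a minimal sketch of where the two settings live. It uses the attribute names that appear in the stock cluster-service.xml quoted later in this thread (frag_size on FRAG, max_xmit_size on pbcast.NAKACK), and it assumes that "the frag size in NAKACK" refers to max_xmit_size. The value 8192 is simply the stock value, not a recommendation.

          ```xml
          <!-- Illustrative fragment only: both protocols already exist in the stock
               PartitionConfig; the values shown are the stock ones, not tuned ones. -->
          <pbcast.NAKACK gc_lag="50" retransmit_timeout="300,600,1200,2400,4800"
                         max_xmit_size="8192"
                         up_thread="true" down_thread="true" />
          <FRAG frag_size="8192"
                down_thread="true" up_thread="true" />
          ```

          If frag_size.sh reports a different optimum for your network, both attributes would presumably be changed together.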

          • 2. Re: java.net.SocketException: The message is larger than the
            arnold

            My cluster-service.xml is exactly the same as the one from the JBoss distribution:

            <?xml version="1.0" encoding="UTF-8"?>
            
            <!-- ===================================================================== -->
            <!-- -->
            <!-- Sample Clustering Service Configuration -->
            <!-- -->
            <!-- ===================================================================== -->
            
            <server>
            
             <classpath codebase="lib" archives="jbossha.jar"/>
            
             <!-- ==================================================================== -->
             <!-- Cluster Partition: defines cluster -->
             <!-- ==================================================================== -->
            
             <mbean code="org.jboss.ha.framework.server.ClusterPartition"
             name="jboss:service=DefaultPartition">
            
             <!-- Name of the partition being built -->
             <attribute name="PartitionName">DefaultPartition</attribute>
             <!-- The address used to determine the node name -->
             <attribute name="NodeAddress">${jboss.bind.address}</attribute>
             <!-- Determine if deadlock detection is enabled -->
             <attribute name="DeadlockDetection">False</attribute>
            
             <!-- Time in milliseconds to wait for state to be transferred -->
             <attribute name="StateTransferTimeout">60000</attribute>
            
             <!-- The JGroups protocol configuration -->
             <attribute name="PartitionConfig">
               <Config>
                 <!-- UDP: if you have a multihomed machine,
                      set the bind_addr attribute to the appropriate NIC IP address -->
                 <!-- UDP: On Windows machines, because of the media sense feature
                      being broken with multicast (even after disabling media sense)
                      set the loopback attribute to true -->
                 <UDP mcast_addr="228.1.2.3" mcast_port="45566"
                      ip_ttl="32" ip_mcast="true"
                      mcast_send_buf_size="800000" mcast_recv_buf_size="150000"
                      ucast_send_buf_size="800000" ucast_recv_buf_size="150000"
                      loopback="false" />
                 <PING timeout="2000" num_initial_members="3"
                       up_thread="true" down_thread="true" />
                 <MERGE2 min_interval="10000" max_interval="20000" />
                 <FD shun="true" up_thread="true" down_thread="true"
                     timeout="2500" max_tries="5" />
                 <VERIFY_SUSPECT timeout="3000" num_msgs="3"
                                 up_thread="true" down_thread="true" />
                 <pbcast.NAKACK gc_lag="50" retransmit_timeout="300,600,1200,2400,4800"
                                max_xmit_size="8192"
                                up_thread="true" down_thread="true" />
                 <UNICAST timeout="300,600,1200,2400,4800" window_size="100" min_threshold="10"
                          down_thread="true" />
                 <pbcast.STABLE desired_avg_gossip="20000"
                                up_thread="true" down_thread="true" />
                 <FRAG frag_size="8192"
                       down_thread="true" up_thread="true" />
                 <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
                             shun="true" print_local_addr="true" />
                 <pbcast.STATE_TRANSFER up_thread="true" down_thread="true" />
               </Config>
             </attribute>
            
             </mbean>
            
             <!-- ==================================================================== -->
             <!-- HA Session State Service for SFSB -->
             <!-- ==================================================================== -->
            
             <mbean code="org.jboss.ha.hasessionstate.server.HASessionStateService"
             name="jboss:service=HASessionState">
             <depends>jboss:service=DefaultPartition</depends>
             <!-- Name of the partition to which the service is linked -->
             <attribute name="PartitionName">DefaultPartition</attribute>
             <!-- JNDI name under which the service is bound -->
             <attribute name="JndiName">/HASessionState/Default</attribute>
             <!-- Max delay before cleaning unreclaimed state.
             Defaults to 30*60*1000 => 30 minutes -->
             <attribute name="BeanCleaningDelay">0</attribute>
             </mbean>
            
             <!-- ==================================================================== -->
             <!-- HA JNDI -->
             <!-- ==================================================================== -->
            
             <mbean code="org.jboss.ha.jndi.HANamingService"
             name="jboss:service=HAJNDI">
             <depends>jboss:service=DefaultPartition</depends>
             <!-- Name of the partition to which the service is linked -->
             <attribute name="PartitionName">DefaultPartition</attribute>
             <!-- bind address of HA JNDI RMI endpoint -->
             <attribute name="BindAddress">${jboss.bind.address}</attribute>
             <!-- RmiPort to be used by the HA-JNDI service
             once bound. 0 => auto. -->
             <attribute name="RmiPort">0</attribute>
             <!-- Port on which the HA-JNDI stub is made available -->
             <attribute name="Port">1100</attribute>
             <!-- Backlog to be used for client-server RMI
             invocations during JNDI queries -->
             <attribute name="Backlog">50</attribute>
            
             <!-- Multicast Address and Group used for auto-discovery -->
             <attribute name="AutoDiscoveryAddress">230.0.0.4</attribute>
             <attribute name="AutoDiscoveryGroup">1102</attribute>
            
             <!-- IP Address to which should be bound: the Port, the RmiPort and
             the AutoDiscovery multicast socket. -->
             <!-- Client socket factory to be used for client-server
             RMI invocations during JNDI queries -->
             <!--attribute name="ClientSocketFactory">custom</attribute-->
             <!-- Server socket factory to be used for client-server
             RMI invocations during JNDI queries -->
             <!--attribute name="ServerSocketFactory">custom</attribute-->
             </mbean>
            
             <mbean code="org.jboss.invocation.jrmp.server.JRMPInvokerHA"
             name="jboss:service=invoker,type=jrmpha">
             <attribute name="ServerAddress">${jboss.bind.address}</attribute>
             <!--
             <attribute name="RMIObjectPort">0</attribute>
             <attribute name="RMIClientSocketFactory">custom</attribute>
             <attribute name="RMIServerSocketFactory">custom</attribute>
             -->
             </mbean>
            
             <!-- ==================================================================== -->
             <!-- Distributed cache invalidation -->
             <!-- ==================================================================== -->
            
             <mbean code="org.jboss.cache.invalidation.bridges.JGCacheInvalidationBridge"
             name="jboss.cache:service=InvalidationBridge,type=JavaGroups">
             <depends>jboss:service=DefaultPartition</depends>
             <depends>jboss.cache:service=InvalidationManager</depends>
             <attribute name="InvalidationManager">jboss.cache:service=InvalidationManager</attribute>
             <attribute name="PartitionName">DefaultPartition</attribute>
             <attribute name="BridgeName">DefaultJGBridge</attribute>
             </mbean>
            
            </server>
            


            The partition is set up with a clustered Entity Bean, but there wasn't any activity at the application level, so all traffic is from JBoss/JGroups. What worries me is the HUGE message that it attempted to send between the nodes (see the edited log). Do you know why such a large message is being sent?

            Thanks.

            • 3. Re: java.net.SocketException: The message is larger than the
              belaban

              Hmm, this is strange because the message size is *0* ! So this is probably caused by another exception.
              You could enable logging: add a category for org.jgroups.protocols.UDP at the TRACE level.

              Which version of JGroups are you running? You can check with:
              java -jar server/all/lib/jgroups.jar org.jgroups.Version

              Bela
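
              A sketch of the logging category described above. It assumes the stock log4j configuration file of a JBoss 3.2.x server (conf/log4j.xml) and the org.jboss.logging.XLevel class that JBoss releases of that era used to provide a TRACE level. Neither is mentioned in the thread, so check both against your installation.

              ```xml
              <!-- Assumed location: server/<config>/conf/log4j.xml, alongside the other
                   <category> elements. XLevel is an assumption about this JBoss version. -->
              <category name="org.jgroups.protocols.UDP">
                <priority value="TRACE" class="org.jboss.logging.XLevel"/>
              </category>
              ```

              If an appender in the same file sets a Threshold above TRACE, the extra output will likely be filtered out there, so the threshold may need lowering as well.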

              • 4. Re: java.net.SocketException: The message is larger than the
                arnold

                Thanks for the prompt response.

                java -cp jgroups.jar org.jgroups.Version
                Version: 2.2.4
                CVS: $Id: Version.java,v 1.5 2004/04/28 18:44:58 belaban Exp $
                History: (see doc/history.txt for details)

                BTW, I have a silly newbie question. In cluster-service.xml, what other values can I use for

                <UDP mcast_addr="228.1.2.3" mcast_port="45566"
                ? I am a bit mystified by this IP address, 228.1.2.3. What does it represent?

                Reading the JBoss Clustering doc didn't help, as it referred to the JavaGroups doc, in which I can't find the relevant section.

                Thanks.
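
                For context on the question above: 228.1.2.3 lies in the IPv4 multicast range (224.0.0.0 to 239.255.255.255), so it names a multicast group rather than a particular machine. Below is a hedged sketch of pointing the partition at a different group, assuming that nodes only find each other when they share the same mcast_addr and mcast_port. The address 239.1.2.4 and port 45577 are arbitrary illustrative values, not taken from the thread.

                ```xml
                <!-- Hypothetical values: pick any multicast address and port that your
                     network permits and that no other cluster on the LAN is using. -->
                <UDP mcast_addr="239.1.2.4" mcast_port="45577"
                     ip_ttl="32" ip_mcast="true"
                     mcast_send_buf_size="800000" mcast_recv_buf_size="150000"
                     ucast_send_buf_size="800000" ucast_recv_buf_size="150000"
                     loopback="false" />
                ```

                Every node that should join the same partition would need the same pair of values.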

                • 5. Re: java.net.SocketException: The message is larger than the
                  belaban

                  Hmm, please enable logging (as described). I cannot find the "larger message" error in any of the newer JGroups versions (starting from 2.2.4).


                  Bela