1. Re: discarded message from non-member
brian.stansberry Jan 19, 2009 12:38 PM (in response to bmelloni)
First, if you are a support customer (you're using EAP), please open a case via the Customer Support Portal. There's no SLA via the forums.
Otherwise,
1) Are you actually using UDP multicast? The ports shown in your logs seem more like what would be used by a TCP-based JGroups config. (Could be multicast though; depends on your config).
2) I need to understand what channels are using 192.168.11.102:1147 and 192.168.11.103:2600. Please find the logging that looks like this on the two nodes and post the area around it:
-------------------------------------------------------
GMS: address is 192.168.11.102:1147
-------------------------------------------------------
or
-------------------------------------------------------
GMS: address is 192.168.11.103:2600
-------------------------------------------------------
3) The following will likely cause problems, although AFAIR not the NAKACK issue you are reporting:
15:50:45,995 INFO [DefaultPartition] All Members : 2 ([127.0.0.1:1099, 127.0.0.1:1099])
That tells me you have JBoss bound to 127.0.0.1 on both nodes. That would occur either by starting JBoss with -b 127.0.0.1 on both nodes, or by not setting -b and leaving the 127.0.0.1 default. The AS clustering code uses the bind address and JNDI port to form a unique cluster-wide id for each node. Works fine, except when you bind JBoss to 127.0.0.1 or 0.0.0.0 on more than one machine. If you *want* to use 127.0.0.1 or 0.0.0.0 as the -b value on more than one node, you should edit the server/all/deploy/cluster-service.xml's ClusterPartition mbean and either change
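<attribute name="NodeAddress">${jboss.bind.address}</attribute>
to something unique per server, like
<attribute name="NodeAddress">192.168.11.102</attribute>
or, explicitly configure a String "NodeName" attribute with a unique value per node:
<attribute name="NodeName">node1</attribute>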
Bottom line, you don't want duplicates in the "[DefaultPartition] All Members" logging.
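To spot this condition in a server log, a small script can flag repeated ids in the "All Members" line. This is only a sketch in Python, assuming the line format quoted above; `MEMBERS_RE` and `duplicate_members` are hypothetical names used for illustration, not part of JBoss or JGroups.

```python
import re
from collections import Counter

# Hypothetical helper, not a JBoss API: it only parses the
# "[DefaultPartition] All Members" log line format quoted in this post.
MEMBERS_RE = re.compile(r"All Members : (\d+) \(\[(.*?)\]\)")


def duplicate_members(log_line):
    """Return the member ids that appear more than once in an 'All Members' line."""
    match = MEMBERS_RE.search(log_line)
    if match is None:
        return []  # not an "All Members" line
    members = [m.strip() for m in match.group(2).split(",") if m.strip()]
    counts = Counter(members)
    return sorted(m for m, n in counts.items() if n > 1)


line = ("15:50:45,995 INFO [DefaultPartition] All Members : 2 "
        "([127.0.0.1:1099, 127.0.0.1:1099])")
print(duplicate_members(line))
```

An empty result means every node id in that line is unique; any entry returned is an id shared by two or more nodes.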
2. Re: discarded message from non-member
bmelloni Jan 20, 2009 11:17 AM (in response to bmelloni)
Yes, we are a support customer. I am a new employee for the company and I just requested from my boss the info needed to open a ticket.
Thank you for helping until I am able to open the formal ticket.
Your suggestion (3) to start with -b took care of the discarded message. But I still get some errors. After starting .103 first and .102 second, the following is still happening:
A) I see these errors on .102 roughly every 2 minutes:
09:02:22,093 WARN [ConnectionTable] peer closed connection, trying to re-send msg
09:02:22,093 ERROR [ConnectionTable] 2nd attempt to send data failed too
B) Deployment after placing a WAR in the farm folder is horrendously slow (as if it were failing repeatedly, timing out, and recovering). I see the WAR file being placed in the all/tmp folder, but the byte count goes up at a crawl. In both servers' logs I see quite a few debug statements for TORecoveryModule and XARecoveryModule. Once the push finally finished (after 30-60 min!), the application worked on both servers.
Here are the details you requested in your previous email:
1) I am using the default clustering configuration, since the instructions say you should get default clustering by just starting in the 'all' configuration. If that is UDP multicast, then yes.
I believe the only changes I made to the defaults are:
a) What is indicated in the post-installation instructions (i.e., enable the admin accounts so that I can get to the web pages).
b) Start with '-c all' to get default clustering.
c) Since I noticed that with the defaults I couldn't access the server by IP, after I captured the logs I posted, I changed the start to include '-b '.
2)
=====================
Log snippet from .103:
=====================
08:51:58,433 INFO [ServerInfo] Java version: 1.6.0_11,Sun Microsystems Inc.
08:51:58,433 INFO [ServerInfo] Java VM: Java HotSpot(TM) Server VM 11.0-b16,Sun Microsystems Inc.
08:51:58,433 INFO [ServerInfo] OS-System: Windows XP 5.1,x86
08:51:58,824 INFO [Server] Core system initialized
08:52:02,621 INFO [WebService] Using RMI server codebase: http://192.168.11.103:8083/
08:52:02,621 INFO [Log4jService$URLWatchTimerTask] Configuring from URL: resource:jboss-log4j.xml
08:52:03,058 INFO [TransactionManagerService] JBossTS Transaction Service (JTA version) - JBoss Inc.
08:52:03,058 INFO [TransactionManagerService] Setting up property manager MBean and JMX layer
08:52:03,168 INFO [TransactionManagerService] Starting recovery manager
08:52:03,215 INFO [TransactionManagerService] Recovery manager started
08:52:03,215 INFO [TransactionManagerService] Binding TransactionManager JNDI Reference
08:52:07,996 INFO [EJB3Deployer] Starting java:comp multiplexer
08:52:09,840 INFO [STDOUT]
-------------------------------------------------------
GMS: address is 192.168.11.103:1733
-------------------------------------------------------
08:52:11,871 INFO [TreeCache] viewAccepted(): [192.168.11.103:1733|0] [192.168.11.103:1733]
08:52:11,918 INFO [TreeCache] TreeCache local address is 192.168.11.103:1733
08:52:11,918 INFO [TreeCache] State could not be retrieved (we are the first member in group)
08:52:11,918 INFO [TreeCache] parseConfig(): PojoCacheConfig is empty
08:52:12,074 INFO [STDOUT] no object for null
08:52:12,074 INFO [STDOUT] no object for null
08:52:12,121 INFO [STDOUT] no object for null
08:52:12,137 INFO [STDOUT] no object for {urn:jboss:bean-deployer}supplyType
08:52:12,137 INFO [STDOUT] no object for {urn:jboss:bean-deployer}dependsType
08:52:16,480 INFO [NativeServerConfig] JBoss Web Services - Native
08:52:16,496 INFO [NativeServerConfig] jbossws-native-2.0.1.SP2 (build=200710210837)
08:52:18,090 INFO [SnmpAgentService] SNMP agent going active
08:52:18,433 INFO [DefaultPartition] Initializing
08:52:18,465 INFO [STDOUT]
-------------------------------------------------------
GMS: address is 192.168.11.103:1738
-------------------------------------------------------
08:52:20,480 INFO [DefaultPartition] Number of cluster members: 1
08:52:20,480 INFO [DefaultPartition] Other members: 0
08:52:20,480 INFO [DefaultPartition] Fetching state (will wait for 30000 milliseconds):
08:52:20,480 INFO [DefaultPartition] State could not be retrieved (we are the first member in group)
08:52:20,543 INFO [HANamingService] Started ha-jndi bootstrap jnpPort=1100, backlog=50, bindAddress=/192.168.11.103
08:52:20,558 INFO [DetachedHANamingService$AutomaticDiscovery] Listening on /192.168.11.103:1102, group=230.0.0.4, HA-JNDI address=192.168.11.103:1100
08:52:20,933 INFO [TreeCache] No transaction manager lookup class has been defined. Transactions cannot be used
08:52:21,027 INFO [STDOUT]
-------------------------------------------------------
GMS: address is 192.168.11.103:1742
-------------------------------------------------------
08:52:23,043 INFO [TreeCache] viewAccepted(): [192.168.11.103:1742|0] [192.168.11.103:1742]
08:52:23,043 INFO [TreeCache] TreeCache local address is 192.168.11.103:1742
08:52:23,324 INFO [STDOUT]
-------------------------------------------------------
GMS: address is 192.168.11.103:1746
-------------------------------------------------------
08:52:25,324 INFO [TreeCache] viewAccepted(): [192.168.11.103:1746|0] [192.168.11.103:1746]
08:52:25,324 INFO [TreeCache] TreeCache local address is 192.168.11.103:1746
================================
Snippet from .102:
================================
09:01:43,031 INFO [ServerInfo] Java version: 1.6.0_10,Sun Microsystems Inc.
09:01:43,031 INFO [ServerInfo] Java VM: Java HotSpot(TM) Server VM 11.0-b15,Sun Microsystems Inc.
09:01:43,031 INFO [ServerInfo] OS-System: Windows XP 5.1,x86
09:01:43,531 INFO [Server] Core system initialized
09:01:45,359 INFO [WebService] Using RMI server codebase: http://192.168.11.102:8083/
09:01:45,359 INFO [Log4jService$URLWatchTimerTask] Configuring from URL: resource:jboss-log4j.xml
09:01:45,750 INFO [TransactionManagerService] JBossTS Transaction Service (JTA version) - JBoss Inc.
09:01:45,750 INFO [TransactionManagerService] Setting up property manager MBean and JMX layer
09:01:45,921 INFO [TransactionManagerService] Starting recovery manager
09:01:45,968 INFO [TransactionManagerService] Recovery manager started
09:01:45,968 INFO [TransactionManagerService] Binding TransactionManager JNDI Reference
09:01:47,781 INFO [EJB3Deployer] Starting java:comp multiplexer
09:01:49,296 INFO [STDOUT]
-------------------------------------------------------
GMS: address is 192.168.11.102:1577
-------------------------------------------------------
09:01:51,515 INFO [TreeCache] viewAccepted(): [192.168.11.103:1733|1] [192.168.11.103:1733, 192.168.11.102:1577]
09:01:51,578 INFO [TreeCache] TreeCache local address is 192.168.11.102:1577
09:01:51,640 INFO [TreeCache] received the state (size=1024 bytes)
09:01:51,656 INFO [TreeCache] state was retrieved successfully (in 78 milliseconds)
09:01:51,656 INFO [TreeCache] parseConfig(): PojoCacheConfig is empty
09:01:51,703 INFO [STDOUT] no object for null
09:01:51,703 INFO [STDOUT] no object for null
09:01:51,718 INFO [STDOUT] no object for null
09:01:51,750 INFO [STDOUT] no object for {urn:jboss:bean-deployer}supplyType
09:01:51,765 INFO [STDOUT] no object for {urn:jboss:bean-deployer}dependsType
09:01:53,000 INFO [NativeServerConfig] JBoss Web Services - Native
09:01:53,000 INFO [NativeServerConfig] jbossws-native-2.0.1.SP2 (build=200710210837)
09:01:53,453 INFO [SnmpAgentService] SNMP agent going active
09:01:53,687 INFO [DefaultPartition] Initializing
09:01:53,718 INFO [STDOUT]
-------------------------------------------------------
GMS: address is 192.168.11.102:1583
-------------------------------------------------------
09:02:00,562 INFO [DefaultPartition] Number of cluster members: 2
09:02:00,562 INFO [DefaultPartition] Other members: 1
09:02:00,562 INFO [DefaultPartition] Fetching state (will wait for 30000 milliseconds):
09:02:00,750 INFO [DefaultPartition] state was retrieved successfully (in 188 milliseconds)
09:02:00,953 INFO [HANamingService] Started ha-jndi bootstrap jnpPort=1100, backlog=50, bindAddress=/192.168.11.102
09:02:00,953 INFO [DetachedHANamingService$AutomaticDiscovery] Listening on /192.168.11.102:1102, group=230.0.0.4, HA-JNDI address=192.168.11.102:1100
09:02:01,218 INFO [TreeCache] No transaction manager lookup class has been defined. Transactions cannot be used
09:02:01,312 INFO [STDOUT]
-------------------------------------------------------
GMS: address is 192.168.11.102:1589
-------------------------------------------------------
09:02:03,578 INFO [TreeCache] viewAccepted(): [192.168.11.103:1742|1] [192.168.11.103:1742, 192.168.11.102:1589]
09:02:03,640 INFO [TreeCache] TreeCache local address is 192.168.11.102:1589
09:02:03,734 INFO [STDOUT]
-------------------------------------------------------
GMS: address is 192.168.11.102:1594
-------------------------------------------------------
09:02:06,031 INFO [TreeCache] viewAccepted(): [192.168.11.103:1746|1] [192.168.11.103:1746, 192.168.11.102:1594]
09:02:06,093 INFO [TreeCache] TreeCache local address is 192.168.11.102:1594
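Each "GMS: address is ..." banner in the snippets above marks one JGroups channel starting, so listing the banners in order is a quick way to match an address from a warning to a channel. The following is a rough Python sketch that assumes the banner format shown in these logs; `GMS_RE` and `gms_addresses` are hypothetical names, not anything provided by JBoss or JGroups.

```python
import re

# Hypothetical helper: matches the "GMS: address is host:port" banner
# exactly as it appears in the log snippets in this thread.
GMS_RE = re.compile(r"GMS: address is (\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def gms_addresses(log_text):
    """Return (host, port) tuples for each GMS banner, in log order."""
    return [(host, int(port)) for host, port in GMS_RE.findall(log_text)]


sample = """\
-------------------------------------------------------
GMS: address is 192.168.11.103:1733
-------------------------------------------------------
08:52:11,871 INFO [TreeCache] viewAccepted(): [192.168.11.103:1733|0] [192.168.11.103:1733]
-------------------------------------------------------
GMS: address is 192.168.11.103:1738
-------------------------------------------------------
"""
print(gms_addresses(sample))
```

The log lines that follow each banner (TreeCache, DefaultPartition, and so on) indicate which service owns that channel; the sketch only collects the addresses and leaves that mapping to the reader.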
3. Re: discarded message from non-member
brian.stansberry Jan 20, 2009 1:09 PM (in response to bmelloni)
Make sure when you open a support ticket that you reference this thread so the support team can see the background.
Please post the contents of your deploy/cluster-service.xml file.
Your farming issue for sure sounds like a communication issue; i.e. lost messages, lots of retries. Not bad enough that the cluster falls apart, but bad enough that RPCs around the cluster take forever.
4. Re: discarded message from non-member
bmelloni Jan 20, 2009 2:10 PM (in response to bmelloni)
Here is cluster-service.xml for both servers.
It should be 'untouched' from the original install (although I remember having to change 'somewhere' - maybe in this file or another file - a value from 0 to 1 to avoid the nodes fighting each other for the same identity).
.103 (the first server I start):
==================
<?xml version="1.0" encoding="UTF-8"?>
<!-- ===================================================================== -->
<!-- -->
<!-- Sample Clustering Service Configuration -->
<!-- -->
<!-- ===================================================================== -->
<!-- ==================================================================== -->
<!-- Cluster Partition: defines cluster -->
<!-- ==================================================================== -->
<!-- Name of the partition being built -->
${jboss.partition.name:DefaultPartition}
<!-- The address used to determine the node name -->
${jboss.bind.address}
<!-- Determine if deadlock detection is enabled -->
False
<!-- Max time (in ms) to wait for state transfer to complete. Increase for large states -->
30000
<!-- The JGroups protocol configuration -->
<!--
The default UDP stack:
- If you have a multihomed machine, set the UDP protocol's bind_addr attribute to the
appropriate NIC IP address, e.g bind_addr="192.168.0.2".
- On Windows machines, because of the media sense feature being broken with multicast
(even after disabling media sense) set the UDP protocol's loopback attribute to true
-->
<UDP mcast_addr="${jboss.partition.udpGroup:228.1.2.3}"
mcast_port="${jboss.hapartition.mcast_port:45566}"
tos="8"
ucast_recv_buf_size="20000000"
ucast_send_buf_size="640000"
mcast_recv_buf_size="25000000"
mcast_send_buf_size="640000"
loopback="false"
discard_incompatible_packets="true"
enable_bundling="false"
max_bundle_size="64000"
max_bundle_timeout="30"
use_incoming_packet_handler="true"
use_outgoing_packet_handler="false"
ip_ttl="${jgroups.udp.ip_ttl:2}"
down_thread="false" up_thread="false"/>
<PING timeout="2000"
down_thread="false" up_thread="false" num_initial_members="3"/>
<MERGE2 max_interval="100000"
down_thread="false" up_thread="false" min_interval="20000"/>
<FD_SOCK down_thread="false" up_thread="false"/>
<FD timeout="10000" max_tries="5" down_thread="false" up_thread="false" shun="true"/>
<VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false"/>
<pbcast.NAKACK max_xmit_size="60000"
use_mcast_xmit="false" gc_lag="0"
retransmit_timeout="300,600,1200,2400,4800"
down_thread="false" up_thread="false"
discard_delivered_msgs="true"/>
<UNICAST timeout="300,600,1200,2400,3600"
down_thread="false" up_thread="false"/>
<pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
down_thread="false" up_thread="false"
max_bytes="400000"/>
<pbcast.GMS print_local_addr="true" join_timeout="3000"
down_thread="false" up_thread="false"
join_retry_timeout="2000" shun="true"
view_bundling="true"/>
<FRAG2 frag_size="60000" down_thread="false" up_thread="false"/>
<pbcast.STATE_TRANSFER down_thread="false" up_thread="false" use_flush="false"/>
<!-- Alternate TCP stack: customize it for your environment, change bind_addr and initial_hosts -->
<!--
<TCP bind_addr="thishost" start_port="7800" loopback="true"
tcp_nodelay="true"
recv_buf_size="20000000"
send_buf_size="640000"
discard_incompatible_packets="true"
enable_bundling="false"
max_bundle_size="64000"
max_bundle_timeout="30"
use_incoming_packet_handler="true"
use_outgoing_packet_handler="false"
down_thread="false" up_thread="false"
use_send_queues="false"
sock_conn_timeout="300"
skip_suspected_members="true"/>
<TCPPING initial_hosts="thishost[7800],otherhost[7800]" port_range="3"
timeout="3000"
down_thread="false" up_thread="false"
num_initial_members="3"/>
<MERGE2 max_interval="100000"
down_thread="false" up_thread="false" min_interval="20000"/>
<FD_SOCK down_thread="false" up_thread="false"/>
<FD timeout="10000" max_tries="5" down_thread="false" up_thread="false" shun="true"/>
<VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false"/>
<pbcast.NAKACK max_xmit_size="60000"
use_mcast_xmit="false" gc_lag="0"
retransmit_timeout="300,600,1200,2400,4800"
down_thread="false" up_thread="false"
discard_delivered_msgs="true"/>
<pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
down_thread="false" up_thread="false"
max_bytes="400000"/>
<pbcast.GMS print_local_addr="true" join_timeout="3000"
down_thread="false" up_thread="false"
join_retry_timeout="2000" shun="true"
view_bundling="true"/>
<pbcast.STATE_TRANSFER down_thread="false" up_thread="false" use_flush="false"/>
-->
jboss:service=Naming
<!-- ==================================================================== -->
<!-- HA Session State Service for SFSB -->
<!-- ==================================================================== -->
jboss:service=Naming
<!-- We now inject the partition into the HAJNDI service instead
of requiring that the partition name be passed -->
<depends optional-attribute-name="ClusterPartition"
proxy-type="attribute">jboss:service=${jboss.partition.name:DefaultPartition}
<!-- JNDI name under which the service is bound -->
/HASessionState/Default
<!-- Max delay before cleaning unreclaimed state.
Defaults to 30*60*1000 => 30 minutes -->
0
<!-- ==================================================================== -->
<!-- HA JNDI -->
<!-- ==================================================================== -->
<!-- We now inject the partition into the HAJNDI service instead
of requiring that the partition name be passed -->
<depends optional-attribute-name="ClusterPartition"
proxy-type="attribute">jboss:service=${jboss.partition.name:DefaultPartition}
<!-- Bind address of bootstrap and HA-JNDI RMI endpoints -->
${jboss.bind.address}
<!-- Port on which the HA-JNDI stub is made available -->
1100
<!-- RmiPort to be used by the HA-JNDI service once bound. 0 => auto. -->
1101
<!-- Accept backlog of the bootstrap socket -->
50
<!-- The thread pool service used to control the bootstrap and
auto discovery lookups -->
<depends optional-attribute-name="LookupPool"
proxy-type="attribute">jboss.system:service=ThreadPool
<!-- A flag to disable the auto discovery via multicast -->
false
<!-- Set the auto-discovery bootstrap multicast bind address. If not
specified and a BindAddress is specified, the BindAddress will be used. -->
${jboss.bind.address}
<!-- Multicast Address and group port used for auto-discovery -->
${jboss.partition.udpGroup:230.0.0.4}
1102
<!-- The TTL (time-to-live) for autodiscovery IP multicast packets -->
16
<!-- The load balancing policy for HA-JNDI -->
org.jboss.ha.framework.interfaces.RoundRobin
<!-- Client socket factory to be used for client-server
RMI invocations during JNDI queries
custom
-->
<!-- Server socket factory to be used for client-server
RMI invocations during JNDI queries
custom
-->
<!-- ==================================================================== -->
<!-- HA Invokers -->
<!-- ==================================================================== -->
jboss:service=TransactionManager
<depends optional-attribute-name="Connector"
proxy-type="attribute">jboss.remoting:service=Connector,transport=socket
jboss:service=${jboss.partition.name:DefaultPartition}
${jboss.bind.address}
4447
<!--
custom
custom
-->
jboss:service=Naming
<!-- the JRMPInvokerHA creates a thread per request. This implementation uses a pool of threads -->
1
300
300
60000
${jboss.bind.address}
4448
${jboss.bind.address}
0
false
<depends optional-attribute-name="TransactionManagerService">jboss:service=TransactionManager
jboss:service=Naming
<!-- ==================================================================== -->
<!-- ==================================================================== -->
<!-- Distributed cache invalidation -->
<!-- ==================================================================== -->
<!-- We now inject the partition into the HAJNDI service instead
of requiring that the partition name be passed -->
<depends optional-attribute-name="ClusterPartition"
proxy-type="attribute">jboss:service=${jboss.partition.name:DefaultPartition}
jboss.cache:service=InvalidationManager
jboss.cache:service=InvalidationManager
DefaultJGBridge
.102 (the second server):
===============
<?xml version="1.0" encoding="UTF-8"?>
<!-- ===================================================================== -->
<!-- -->
<!-- Sample Clustering Service Configuration -->
<!-- -->
<!-- ===================================================================== -->
<!-- ==================================================================== -->
<!-- Cluster Partition: defines cluster -->
<!-- ==================================================================== -->
<!-- Name of the partition being built -->
${jboss.partition.name:DefaultPartition}
<!-- The address used to determine the node name -->
${jboss.bind.address}
<!-- Determine if deadlock detection is enabled -->
False
<!-- Max time (in ms) to wait for state transfer to complete. Increase for large states -->
30000
<!-- The JGroups protocol configuration -->
<!--
The default UDP stack:
- If you have a multihomed machine, set the UDP protocol's bind_addr attribute to the
appropriate NIC IP address, e.g bind_addr="192.168.0.2".
- On Windows machines, because of the media sense feature being broken with multicast
(even after disabling media sense) set the UDP protocol's loopback attribute to true
-->
<UDP mcast_addr="${jboss.partition.udpGroup:228.1.2.3}"
mcast_port="${jboss.hapartition.mcast_port:45566}"
tos="8"
ucast_recv_buf_size="20000000"
ucast_send_buf_size="640000"
mcast_recv_buf_size="25000000"
mcast_send_buf_size="640000"
loopback="false"
discard_incompatible_packets="true"
enable_bundling="false"
max_bundle_size="64000"
max_bundle_timeout="30"
use_incoming_packet_handler="true"
use_outgoing_packet_handler="false"
ip_ttl="${jgroups.udp.ip_ttl:2}"
down_thread="false" up_thread="false"/>
<PING timeout="2000"
down_thread="false" up_thread="false" num_initial_members="3"/>
<MERGE2 max_interval="100000"
down_thread="false" up_thread="false" min_interval="20000"/>
<FD_SOCK down_thread="false" up_thread="false"/>
<FD timeout="10000" max_tries="5" down_thread="false" up_thread="false" shun="true"/>
<VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false"/>
<pbcast.NAKACK max_xmit_size="60000"
use_mcast_xmit="false" gc_lag="0"
retransmit_timeout="300,600,1200,2400,4800"
down_thread="false" up_thread="false"
discard_delivered_msgs="true"/>
<UNICAST timeout="300,600,1200,2400,3600"
down_thread="false" up_thread="false"/>
<pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
down_thread="false" up_thread="false"
max_bytes="400000"/>
<pbcast.GMS print_local_addr="true" join_timeout="3000"
down_thread="false" up_thread="false"
join_retry_timeout="2000" shun="true"
view_bundling="true"/>
<FRAG2 frag_size="60000" down_thread="false" up_thread="false"/>
<pbcast.STATE_TRANSFER down_thread="false" up_thread="false" use_flush="false"/>
<!-- Alternate TCP stack: customize it for your environment, change bind_addr and initial_hosts -->
<!--
<TCP bind_addr="thishost" start_port="7800" loopback="true"
tcp_nodelay="true"
recv_buf_size="20000000"
send_buf_size="640000"
discard_incompatible_packets="true"
enable_bundling="false"
max_bundle_size="64000"
max_bundle_timeout="30"
use_incoming_packet_handler="true"
use_outgoing_packet_handler="false"
down_thread="false" up_thread="false"
use_send_queues="false"
sock_conn_timeout="300"
skip_suspected_members="true"/>
<TCPPING initial_hosts="thishost[7800],otherhost[7800]" port_range="3"
timeout="3000"
down_thread="false" up_thread="false"
num_initial_members="3"/>
<MERGE2 max_interval="100000"
down_thread="false" up_thread="false" min_interval="20000"/>
<FD_SOCK down_thread="false" up_thread="false"/>
<FD timeout="10000" max_tries="5" down_thread="false" up_thread="false" shun="true"/>
<VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false"/>
<pbcast.NAKACK max_xmit_size="60000"
use_mcast_xmit="false" gc_lag="0"
retransmit_timeout="300,600,1200,2400,4800"
down_thread="false" up_thread="false"
discard_delivered_msgs="true"/>
<pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
down_thread="false" up_thread="false"
max_bytes="400000"/>
<pbcast.GMS print_local_addr="true" join_timeout="3000"
down_thread="false" up_thread="false"
join_retry_timeout="2000" shun="true"
view_bundling="true"/>
<pbcast.STATE_TRANSFER down_thread="false" up_thread="false" use_flush="false"/>
-->
jboss:service=Naming
<!-- ==================================================================== -->
<!-- HA Session State Service for SFSB -->
<!-- ==================================================================== -->
jboss:service=Naming
<!-- We now inject the partition into the HAJNDI service instead
of requiring that the partition name be passed -->
<depends optional-attribute-name="ClusterPartition"
proxy-type="attribute">jboss:service=${jboss.partition.name:DefaultPartition}
<!-- JNDI name under which the service is bound -->
/HASessionState/Default
<!-- Max delay before cleaning unreclaimed state.
Defaults to 30*60*1000 => 30 minutes -->
0
<!-- ==================================================================== -->
<!-- HA JNDI -->
<!-- ==================================================================== -->
<!-- We now inject the partition into the HAJNDI service instead
of requiring that the partition name be passed -->
<depends optional-attribute-name="ClusterPartition"
proxy-type="attribute">jboss:service=${jboss.partition.name:DefaultPartition}
<!-- Bind address of bootstrap and HA-JNDI RMI endpoints -->
${jboss.bind.address}
<!-- Port on which the HA-JNDI stub is made available -->
1100
<!-- RmiPort to be used by the HA-JNDI service once bound. 0 => auto. -->
1101
<!-- Accept backlog of the bootstrap socket -->
50
<!-- The thread pool service used to control the bootstrap and
auto discovery lookups -->
<depends optional-attribute-name="LookupPool"
proxy-type="attribute">jboss.system:service=ThreadPool
<!-- A flag to disable the auto discovery via multicast -->
false
<!-- Set the auto-discovery bootstrap multicast bind address. If not
specified and a BindAddress is specified, the BindAddress will be used. -->
${jboss.bind.address}
<!-- Multicast Address and group port used for auto-discovery -->
${jboss.partition.udpGroup:230.0.0.4}
1102
<!-- The TTL (time-to-live) for autodiscovery IP multicast packets -->
16
<!-- The load balancing policy for HA-JNDI -->
org.jboss.ha.framework.interfaces.RoundRobin
<!-- Client socket factory to be used for client-server
RMI invocations during JNDI queries
custom
-->
<!-- Server socket factory to be used for client-server
RMI invocations during JNDI queries
custom
-->
<!-- ==================================================================== -->
<!-- HA Invokers -->
<!-- ==================================================================== -->
jboss:service=TransactionManager
<depends optional-attribute-name="Connector"
proxy-type="attribute">jboss.remoting:service=Connector,transport=socket
jboss:service=${jboss.partition.name:DefaultPartition}
${jboss.bind.address}
4447
<!--
custom
custom
-->
jboss:service=Naming
<!-- the JRMPInvokerHA creates a thread per request. This implementation uses a pool of threads -->
1
300
300
60000
${jboss.bind.address}
4448
${jboss.bind.address}
0
false
<depends optional-attribute-name="TransactionManagerService">jboss:service=TransactionManager
jboss:service=Naming
<!-- ==================================================================== -->
<!-- ==================================================================== -->
<!-- Distributed cache invalidation -->
<!-- ==================================================================== -->
<!-- We now inject the partition into the HAJNDI service instead
of requiring that the partition name be passed -->
<depends optional-attribute-name="ClusterPartition"
proxy-type="attribute">jboss:service=${jboss.partition.name:DefaultPartition}
jboss.cache:service=InvalidationManager
jboss.cache:service=InvalidationManager
DefaultJGBridge
5. Re: discarded message from non-member
brian.stansberry Jan 21, 2009 11:10 AM (in response to bmelloni)
Your "ConnectionTable" logging:
09:02:22,093 WARN [ConnectionTable] peer closed connection, trying to re-send msg
09:02:22,093 ERROR [ConnectionTable] 2nd attempt to send data failed too
is coming from the JBoss Messaging Data Channel. That channel uses TCP unicast for sending messages, unlike the other channels that use UDP multicast.
Farming doesn't use that channel; it uses a different one, the UDP multicast-based one from cluster-service.xml.
So, two separate channels using different underlying protocols are experiencing problems, which sounds to me like a network or host configuration problem. Hard to say what; if resolving the firewall issues you raise in a separate thread makes it go away, there's your answer.
See also http://www.jboss.org/community/docs/DOC-12375
6. Re: discarded message from non-member
bmelloni Jan 21, 2009 12:41 PM (in response to bmelloni)
This Connection issue does not go away with the firewall turned off. The two posts are independent problems.
Let's table this problem. I should get my commercial license credentials today or tomorrow and will use phone support to start over from scratch and reinstall both servers in the cluster according to their instructions instead of what the documentation seems to say.
A suggestion:
- The server config guide is a good reference, but it is worthless as a cluster installation guide unless you are already a jBoss configuration expert.
- There is a need for a simple, step by step guide for installing a basic cluster.
- I might even write it myself and contribute it back after talking to support. Nobody else should suffer through this installation nightmare.
Thanks for trying to help.
7. Re: discarded message from non-member
brian.stansberry Jan 21, 2009 2:12 PM (in response to bmelloni)
"bmelloni" wrote:
This Connection issue does not go away with the firewall turned off. The two posts are independent problems.
Let's table this problem. I should get my commercial license credentials today or tomorrow and will use phone support to start over from scratch and reinstall both servers in the cluster according to their instructions instead of what the documentation seems to say.
OK. The support team is much better equipped to handle issues that are specific to a particular environment.
A suggestion:
- The server config guide is a good reference, but it is worthless as a cluster installation guide unless you are already a jBoss configuration expert.
- There is a need for a simple, step by step guide for installing a basic cluster.
- I might even write it myself and contribute it back after talking to support. Nobody else should suffer through this installation nightmare.
Thanks for the input. I've heard similar things before, and basically agree. I'd certainly welcome any contributions, particularly on AS 4.x. I'm rewriting the Clustering Guide for AS 5 and have added some of what you are talking about. A draft of that can be found attached to http://www.jboss.org/community/docs/DOC-12928; comments are welcome. (Note: it's the attached document at the bottom of the page; not the links at the top. I won't bore you with the details as to why).