12 Replies Latest reply on Mar 20, 2009 2:21 AM by krishnan366

Cluster setup

krishnan366 Mar 3, 2009 3:31 AM

Hi,
I have two instances of JBoss 5.0.0 GA up and running on two different machines.

I am using EJB 3.0 stateless session beans to access JBoss Cache. My bean has the @Clustered annotation, and I am using HA-JNDI to invoke the remote EJB from the client.

With both server instances in use, the cache is replicated to both instances, but on server startup the two nodes do not seem to recognise each other. In the server logs I see:

07:21:15,536 INFO [DEPartition] Number of cluster members: 1
07:21:15,536 INFO [DEPartition] Other members: 0

The entire setup behaves like a cluster (for example, if one node is down, requests are routed to the other server), but the message seems to be incorrect.

I ran the tests mentioned in the community docs (JGroups Probe and Draw). They do display both nodes.

How do I verify that this setup is correct and that the cluster is correctly configured?

1. Re: Cluster setup

brian.stansberry Mar 3, 2009 11:48 AM (in response to krishnan366)

Look for "New cluster view for partition DEPartition" in your logs; that's what's logged when the view changes as another node joins the group.

You can also look in the jmx-console, mbean jboss:partitionName=DEPartition,service=HAPartition, attribute CurrentView.

Getting the message you reported on both nodes is a bit odd, unless you started both nodes at the same time (within, say, 5 seconds of each other). If you start both nodes at the same time, you can get that because neither node has finished starting, so neither can be discovered by the other, leading each to form a one-node group. The JGroups MERGE2 protocol (http://www.jboss.org/community/docs/DOC-10896) eventually detects this and combines the two one-node groups into a single group.
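
As a rough aid for the log check described above, here is a minimal Python sketch that scans server log lines for the two kinds of messages quoted in this thread. The function name `summarize_cluster_log` is hypothetical, and only the quoted phrases are matched, since the full format of the view-change line is not shown here.

```python
import re

# Phrase quoted in this thread; nothing else about that line's format is assumed.
VIEW_CHANGE = "New cluster view for partition"

# Matches lines such as:
#   07:21:15,536 INFO [DEPartition] Number of cluster members: 1
MEMBERS = re.compile(
    r"\[(?P<partition>[^\]]+)\]\s+Number of cluster members:\s*(?P<count>\d+)"
)


def summarize_cluster_log(lines, partition="DEPartition"):
    """Return (last reported member count or None, list of view-change lines)."""
    last_count = None
    view_changes = []
    for line in lines:
        match = MEMBERS.search(line)
        if match and match.group("partition") == partition:
            last_count = int(match.group("count"))
        if VIEW_CHANGE in line and partition in line:
            view_changes.append(line.strip())
    return last_count, view_changes


if __name__ == "__main__":
    # The two lines reported in the original post.
    sample = [
        "07:21:15,536 INFO [DEPartition] Number of cluster members: 1",
        "07:21:15,536 INFO [DEPartition] Other members: 0",
    ]
    count, changes = summarize_cluster_log(sample)
    print("members:", count, "view changes seen:", len(changes))
```

A member count that stays at 1 with no view-change lines on either node would suggest the two nodes never joined the same group.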
2. Re: Cluster setup

krishnan366 Mar 3, 2009 11:16 PM (in response to krishnan366)
Hi,
Thanks for the response.

I don't see the message
"New cluster view for partition DEPartition"
in either server's log. Do I need to change jboss-log4j.xml to be able to see this message?

I do see the message
[org.jboss.cache.RPCManagerImpl] (main) Received new cluster view
but it lists only the same IP address and does not reflect node 2.

I did wait a while, about a minute, before starting the second server.

Also, the CurrentView attribute in the jmx-console lists just one IP address. Should it list both IP addresses?

Do I need to change any other files for the cluster setup?
3. Re: Cluster setup

brian.stansberry Mar 5, 2009 10:11 AM (in response to krishnan366)

If you only have one item in the view, then the nodes didn't form a cluster.

You said you already tried some of this stuff, but not sure if you tried it all, so:

http://www.jboss.org/community/docs/DOC-12375

http://www.jgroups.org/manual/html/ch02.html, section 2.8 and on.
4. Re: Cluster setup

krishnan366 Mar 6, 2009 1:17 AM (in response to krishnan366)

Hi Brian,
I haven't gone through all the tests yet. I'll do that and get back to you.

In the meantime I tried setting up a cluster as follows:

a) Two instances of JBoss 5.0.0 GA running on my local machine (both instances on the same machine).
b) I created two different IP addresses using the Microsoft Loopback Adapter.
c) I started the servers with the -c, -b and ServerPeerID options.
d) In the application, I have EJB 3.0 stateless session beans trying to connect to JBoss Cache.
e) The cluster forms across the two instances; the message is displayed as expected.

1) However, JBoss Cache logs the message

WARN [TxInterceptor] Commit failed. Clearing stale locks.

and the cache data is not available in the database. Is this error caused by the loopback adapter?

2) How is setting up a loopback adapter different from using two IP addresses of the local machine (one LAN IP and the other for the wireless interface)? Without the loopback adapter, the cluster does not form.
5. Re: Cluster setup

brian.stansberry Mar 6, 2009 9:53 AM (in response to krishnan366)

"krishnan366" wrote:

1) However, JBoss Cache logs the message

WARN [TxInterceptor] Commit failed. Clearing stale locks.

and the cache data is not available in the database. Is this error caused by the loopback adapter?

If the two nodes are properly forming a cluster, then your loopback adaptor setup looks to be working. Which reduces the likelihood that it's the cause of your problem.
Beyond that, you'd need to give a lot more information on the problem than the above to expect any kind of diagnosis.

2) How is setting up a loopback adapter different from using two IP addresses of the local machine (one LAN IP and the other for the wireless interface)? Without the loopback adapter, the cluster does not form.

If you use two different IPs associated with different NICs, you need to add a multicast route to your routing table.

Also, as an FYI to anyone reading this, using localhost as one node's address and an IP on another adapter for the other address will not work on Windows.
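
To check whether multicast traffic actually passes between two specific interfaces, a small probe can help. The following is a minimal Python sketch using only the standard `socket` module; it does not add a route, it only shows whether a datagram sent out of one interface reaches a receiver joined on another. The group address, port, and function names are hypothetical placeholders, not values taken from any JBoss or JGroups configuration.

```python
import socket


def membership_request(group, interface_ip):
    """Pack an ip_mreq: the multicast group followed by the local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(interface_ip)


def open_receiver(group, port, interface_ip):
    """Open a UDP socket that joins `group` on the interface owning `interface_ip`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        membership_request(group, interface_ip),
    )
    return sock


def send_probe(group, port, interface_ip, payload=b"ping", ttl=1):
    """Send one multicast datagram out of the interface owning `interface_ip`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(
            socket.IPPROTO_IP, socket.IP_MULTICAST_IF, socket.inet_aton(interface_ip)
        )
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
        sock.sendto(payload, (group, port))
    finally:
        sock.close()


if __name__ == "__main__":
    import sys

    # Hypothetical group and port chosen for illustration only.
    GROUP, PORT = "239.255.10.10", 45999
    mode, iface = sys.argv[1], sys.argv[2]  # e.g. "recv 192.168.0.5"
    if mode == "recv":
        receiver = open_receiver(GROUP, PORT, iface)
        data, sender = receiver.recvfrom(1024)
        print("received", data, "from", sender)
    else:
        send_probe(GROUP, PORT, iface)
```

Run the script in `recv` mode with one node's bind address, then in `send` mode with the other's. If the receiver never prints anything, multicast is probably not being routed between those interfaces.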
6. Re: Cluster setup

krishnan366 Mar 9, 2009 1:41 AM (in response to krishnan366)

Hi Brian,
I am using a JDBCCacheLoader. On eviction of any of the nodes, I get the message mentioned in my previous post, and the nodes are not stored in the database.

I just removed the following configuration from my cache-config.xml:

<clustering mode="replication">
  <stateRetrieval timeout="20000" fetchInMemoryState="false"/>
  <sync replTimeout="20000"/>
</clustering>

Without this, the nodes are persisted and everything works without any issue. Is something wrong in the above configuration that causes the problem?
7. Re: Cluster setup

krishnan366 Mar 9, 2009 1:43 AM (in response to krishnan366)
Reposting the XML:

<clustering mode="replication" clusterName="DataCacheCluster">
  <stateRetrieval timeout="20000" fetchInMemoryState="false"/>
  <sync replTimeout="20000"/>
</clustering>
8. Re: Cluster setup

brian.stansberry Mar 17, 2009 12:23 PM (in response to krishnan366)

Please post your full JBoss Cache config file.

Are these caches writing to a single shared database?

9. Re: Cluster setup

krishnan366 Mar 17, 2009 11:15 PM (in response to krishnan366)

Please find the complete configuration file below.

Yes, they are writing to a single shared DB.

<?xml version="1.0" encoding="UTF-8"?>
<jbosscache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:jboss:jbosscache-core:config:3.0">

  <locking isolationLevel="READ_COMMITTED"/>

  <transaction
    transactionManagerLookupClass="org.jboss.cache.transaction.GenericTransactionManagerLookup"
    syncRollbackPhase="false"
    syncCommitPhase="false"/>

  <clustering mode="replication">
    <stateRetrieval timeout="20000" fetchInMemoryState="false"/>
    <sync replTimeout="20000"/>
  </clustering>

  <eviction wakeUpInterval="500">
    <default algorithmClass="org.jboss.cache.eviction.FIFOAlgorithm" eventQueueSize="100000">
      <property name="maxNodes" value="100"/>
      <property name="minTimeToLive" value="36000"/>
    </default>
    <region name="/regionA">
      <property name="maxNodes" value="3"/>
      <property name="minTimeToLive" value="2000"/>
    </region>
  </eviction>

  <loaders passivation="false" shared="true">
    <preload><node fqn="/"></node></preload>
    <loader class="org.jboss.cache.loader.JDBCCacheLoader" async="false" fetchPersistentState="false"
            ignoreModifications="false" purgeOnStartup="true">
      <properties>
        cache.jdbc.datasource=java:/ProjectOracleDS
        location=./
        cache.jdbc.table.drop=false
        cache.jdbc.table.name=cache_loader
        cache.jdbc.table.primarykey=FQN
        cache.jdbc.fqn.column=FQN
        cache.jdbc.node.column=cache_data
        cache.jdbc.parent.column=parent_fqn
        cache.jdbc.sql-concat=1||2
      </properties>
    </loader>
  </loaders>

</jbosscache>

I get this exception:

org.jboss.cache.lock.TimeoutException: Unable to acquire lock on Fqn
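
When comparing the config on two nodes, it can help to pull out just the settings this thread keeps coming back to: the clustering mode, whether replication is synchronous, and whether the cache loader is marked as shared. The following is a minimal Python sketch using the standard library XML parser. The function name `summarize_cache_config` and the `SAMPLE` string (a trimmed copy of the config above) are illustrative assumptions, and the sketch only reads attributes; it does not validate the file against the JBoss Cache schema.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the posted config file.
NS = {"jbc": "urn:jboss:jbosscache-core:config:3.0"}

# Trimmed copy of the posted configuration, for illustration only.
SAMPLE = """<jbosscache xmlns="urn:jboss:jbosscache-core:config:3.0">
  <clustering mode="replication">
    <stateRetrieval timeout="20000" fetchInMemoryState="false"/>
    <sync replTimeout="20000"/>
  </clustering>
  <loaders passivation="false" shared="true">
    <loader class="org.jboss.cache.loader.JDBCCacheLoader" async="false"/>
  </loaders>
</jbosscache>"""


def summarize_cache_config(xml_text):
    """Extract the clustering and cache-loader settings discussed in the thread."""
    root = ET.fromstring(xml_text)
    clustering = root.find("jbc:clustering", NS)
    loaders = root.find("jbc:loaders", NS)

    summary = {
        "clustering_mode": None,
        "sync_replication": False,
        "loaders_shared": None,
        "passivation": None,
        "loader_classes": [],
    }
    if clustering is not None:
        summary["clustering_mode"] = clustering.get("mode")
        summary["sync_replication"] = clustering.find("jbc:sync", NS) is not None
    if loaders is not None:
        summary["loaders_shared"] = loaders.get("shared")
        summary["passivation"] = loaders.get("passivation")
        summary["loader_classes"] = [
            loader.get("class") for loader in loaders.findall("jbc:loader", NS)
        ]
    return summary


if __name__ == "__main__":
    for key, value in summarize_cache_config(SAMPLE).items():
        print(key, "=", value)
```

Running the same function over each node's file and comparing the resulting dictionaries is a quick way to spot a mismatch between nodes.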

10. Re: Cluster setup

brian.stansberry Mar 18, 2009 10:24 AM (in response to krishnan366)

Your config looks fine.

You need to provide a lot more context about what is going on. A one line snippet from an exception stack trace with no real information about what your app is doing at the time is useless.
11. Re: Cluster setup

krishnan366 Mar 20, 2009 2:19 AM (in response to krishnan366)

I am trying to put data into the cache. Since the JDBCCacheLoader is configured and passivation is false, I expect the data to be available in the database as well.

But I don't see the data in the database. I assume the database write fails, hence the warning:
WARN [TxInterceptor] Commit failed. Clearing stale locks.

I don't face any problem when I comment out the lines

<clustering mode="replication">
  <stateRetrieval timeout="20000" fetchInMemoryState="false"/>
  <sync replTimeout="20000"/>
</clustering>

in my cache config.
12. Re: Cluster setup

krishnan366 Mar 20, 2009 2:21 AM (in response to krishnan366)
I commented out the lines below and the cache works as expected:

<clustering mode="replication">
  <stateRetrieval timeout="20000" fetchInMemoryState="false"/>
  <sync replTimeout="20000"/>
</clustering>

So shouldn't I have the above in my cache config for a cluster setup?