1. Re: HA-JNDI (Fault tolerance)
slaboure Jul 22, 2002 11:39 AM (in response to fred_soulier)
First the easy answer: could you please try with the latest CVS HEAD if possible? Some bug fixes are not in the official release.
If you experience the same problem with HEAD, I will check.
Cheers,
sacha
2. Re: HA-JNDI (Fault tolerance)
fred_soulier Jul 23, 2002 6:30 AM (in response to fred_soulier)
Hi Sacha,
By CVS HEAD, do you mean the module "jboss-all" that builds JBoss3.1.0alpha?
Fred
3. Re: HA-JNDI (Fault tolerance)
fred_soulier Jul 23, 2002 10:01 AM (in response to fred_soulier)
OK. Got "jboss-all" updated to CVS HEAD. Built JBoss3.1.0alpha and I have been fighting with it for the last few hours... either it does not start, or it hangs, or it does not shut down properly...
So for the time being I will install a fresh JBoss3.0.0 and see whether the problem is still there...
I really, really need to find a solution to this problem, though.
Fred
4. Re: HA-JNDI (Fault tolerance)
fred_soulier Jul 23, 2002 11:28 AM (in response to fred_soulier)
I have tried with a fresh JBoss3.0.0_Tomcat4.0.3 install on both nodes. I only changed the cluster-service.xml, deployed my ear file, ran my client... Same thing: only 1 node processes the requests... kill this node... no switching to the other node... the client throws an exception and stops...
Fred
5. Re: HA-JNDI (Fault tolerance)
belaban Jul 23, 2002 12:20 PM (in response to fred_soulier)
Fred,
it could be a problem with JavaGroups. Are you sure the 2 nodes 'see' each other? You can check this by starting 2 Draw instances and verifying that they form a group (both should show '2 instances' in their title):
java org.javagroups.demos.Draw
If this works, then it is not a JavaGroups problem. If they don't find each other, try the following:
1. Get the latest JavaGroups from javagroups.sf.net. If you know how to do it, get it from CVS, build javagroups-all.jar and drop the JAR into the correct location in JBoss.
2. If this still doesn't work, modify the JavaGroups properties (described in the Cluster documentation): add a bind_addr property to the UDP spec, e.g.:
"UDP(bind_addr=192.168.20.210;...):"
This tells the instance to bind to the correct interface on a multi-homed system. You need to set the corresponding address on the other node.
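To find out which address to put in bind_addr on a multi-homed box, one option is to list the local interfaces from Java. This is only a minimal sketch using the standard java.net.NetworkInterface API. The class name ListBindAddresses and the helper candidates() are made up for illustration and are not part of JavaGroups or JBoss.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, not part of JavaGroups or JBoss.
public class ListBindAddresses {

    // Returns "interfaceName address" for every non-loopback address on this host.
    public static List<String> candidates() {
        List<String> out = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> nifs = NetworkInterface.getNetworkInterfaces();
            if (nifs == null) {
                return out; // the platform reported no interfaces
            }
            for (NetworkInterface nif : Collections.list(nifs)) {
                for (InetAddress addr : Collections.list(nif.getInetAddresses())) {
                    if (!addr.isLoopbackAddress()) {
                        out.add(nif.getName() + " " + addr.getHostAddress());
                    }
                }
            }
        } catch (SocketException e) {
            // Wrap the checked exception so callers do not have to declare it.
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // Print one candidate per line; pick the one on the cluster subnet.
        for (String line : candidates()) {
            System.out.println(line);
        }
    }
}
```

Whichever address sits on the subnet shared by both nodes is the likely candidate for bind_addr on that machine.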
Hope this helps,
Bela
6. Re: HA-JNDI (Fault tolerance)
fred_soulier Jul 23, 2002 12:44 PM (in response to fred_soulier)
Well, they do see each other, because the ReplicantManager displays messages when a node dies (deadMembers: 1) or when a node is started (There are new members. Spawning MergeMembers thread.)
Also, when a node is started and a node with the same partition is already running, I can see in the log: [CLUSTER] Number of cluster members: 2
I have just tried JBoss3.0.1RC1_Tomcat4.0.4 and it's exactly the same problem...
Fred
7. Re: HA-JNDI (Fault tolerance)
fred_soulier Jul 24, 2002 1:32 PM (in response to fred_soulier)
I got javagroups from CVS and rebuilt it. Replaced javagroups-2.0.jar in JBoss /server/all/lib with javagroups-all.jar.
When I restarted JBoss, it complained about UNICAST.setProperties() for min_wait_time=2000, which I had in my cluster-service.xml. Removed min_wait_time=2000 and restarted... Good, no error.
So now I need to re-run my test, and I will try the Draw example from javagroups as well just to be on the safe side.
Stay tuned :)
8. Re: HA-JNDI (Fault tolerance)
fred_soulier Jul 24, 2002 2:13 PM (in response to fred_soulier)
OK. Just tried the Draw demo from Javagroups (rebuilt from CVS, v2.0.2) and it's working fine.
1st scenario
------------
2 instances of Draw running on the same box can see each other. Drawings on one appear on the other, etc...
2nd scenario
------------
2 instances running on 2 different boxes (by the way, these are the exact same 2 boxes I use for my 2-node cluster), and again they can see each other. Drawings on one appear on the other, etc...
So Javagroups runs fine on these boxes (v2.0.2 built from CVS).
/Fred
9. Re: HA-JNDI (Fault tolerance)
fred_soulier Jul 25, 2002 4:46 AM (in response to fred_soulier)
Finally, ran my client again with the following config:
JBoss3.0.0_Tomcat4.0.3
Javagroups in JBoss3.0.0 replaced by Javagroups 2.0.2 from CVS
Same problem. Only 1 node serves the requests, and if it dies there is no failover to the 2nd node...
Can someone look at this problem? I'm ready to try whatever hacks/fixes may work, but I need directions.
Thanks.
/Fred
10. Re: HA-JNDI (Fault tolerance)
fred_soulier Jul 25, 2002 2:49 PM (in response to fred_soulier)
OK. It seems that the name of the EJB hasn't been bound through HA-JNDI...
Logging CONSOLE output in debug mode, I get:
18:39:09,634 DEBUG [HAJNDI] lookupLocally
Looking at the source of org.jboss.ha.jndi.HAJNDI, in the lookup(Name name) method, I most likely get this message because super.lookup(name) (in NamingServer) failed...
It then calls the lookupLocally(name) method, which returns the name.
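The fallback described above can be pictured with a toy model: try a cluster-wide tree first and, on a miss, look in the local tree. This is only a sketch of that control flow under stated assumptions. The class LookupFallbackSketch, its two maps, and its method names are invented for illustration and do not reproduce the actual org.jboss.ha.jndi.HAJNDI source.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: it does NOT reproduce org.jboss.ha.jndi.HAJNDI.
public class LookupFallbackSketch {
    // Stand-ins for the cluster-wide (HA) tree and the node-local JNDI tree.
    private final Map<String, Object> clusterWide = new HashMap<>();
    private final Map<String, Object> local = new HashMap<>();

    public void bindClusterWide(String name, Object value) {
        clusterWide.put(name, value);
    }

    public void bindLocal(String name, Object value) {
        local.put(name, value);
    }

    // Try the cluster-wide tree first; on a miss, fall back to the local tree.
    public Object lookup(String name) {
        Object hit = clusterWide.get(name);
        if (hit != null) {
            return hit;
        }
        System.out.println("lookupLocally " + name);
        hit = local.get(name);
        if (hit == null) {
            throw new IllegalArgumentException(name + " not bound");
        }
        return hit;
    }

    public static void main(String[] args) {
        LookupFallbackSketch jndi = new LookupFallbackSketch();
        // Bound only in the local tree, as an EJB home would be in this model.
        jndi.bindLocal("ejb/GUIDGenerator", "home-stub");
        System.out.println(jndi.lookup("ejb/GUIDGenerator"));
        // prints "lookupLocally ejb/GUIDGenerator" and then "home-stub"
    }
}
```

In this model a name bound only locally is still found, but only after the cluster-wide miss, which would match the lookupLocally line in the debug log.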
I've attached the client I use and the cluster-service.xml
In jboss.xml I have for my ejb/GUIDGenerator:
<!-- =================== -->
<!-- GUID Generator EJB -->
<!-- =================== -->
<ejb-name>GUIDGeneratorEJB</ejb-name>
<jndi-name>ejb/GUIDGenerator</jndi-name>
<clustered>True</clustered>
<cluster-config>
<partition-name>CLUSTER</partition-name>
<home-load-balance-policy>org.jboss.ha.framework.interfaces.RoundRobin</home-load-balance-policy>
<bean-load-balance-policy>org.jboss.ha.framework.interfaces.RoundRobin</bean-load-balance-policy>
</cluster-config>
If I use PROVIDER_URL="192.168.20.210:1099" the name is found.
If I use PROVIDER_URL="" the name is found because HAJNDI looked it up locally.
Why is my EJB name not bound through HA-JNDI?
Am I missing something in a config file?
/Fred
11. Re: HA-JNDI (Fault tolerance)
slaboure Jul 29, 2002 3:47 AM (in response to fred_soulier)
Hello Fred,
I made a few bug fixes this weekend. Could you please get a fresh version from HEAD (jboss-all from HEAD) and try it? But don't forget to set your test client code to use HA-JNDI and not simply JNDI!! (Use the correct port number *on the client side*!)
Cheers,
Sacha
12. Re: HA-JNDI (Fault tolerance)
fred_soulier Jul 29, 2002 7:24 AM (in response to fred_soulier)
Hi Sacha,
Thanks.
Please see the attached files.
Basically, it yielded some positive results (a joy to my eyes) and maybe some not so good ones.
/Fred
13. Re: HA-JNDI (Fault tolerance)
slaboure Jul 29, 2002 11:07 AM (in response to fred_soulier)
> Scenario #2
> -----------
> The 3 nodes are running.
>
> From my client trying to lookup my EJB with:
> PROVIDER_URL = ""
> fails.
Don't set PROVIDER_URL = ""; simply leave PROVIDER_URL null (i.e. don't set it!).
If you still get exceptions, a stack trace would be appreciated.
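As a rough sketch of the difference between an empty PROVIDER_URL and an unset one, the client environment could be built as follows. The class HaJndiEnv and its build() helper are hypothetical names. Treating an empty string the same as null is a choice made here for illustration. The factory class and URL package prefixes are the usual JBoss client values, but check them against your own jndi.properties.

```java
import java.util.Properties;
import javax.naming.Context;

// Hypothetical helper for building a JBoss client JNDI environment.
public class HaJndiEnv {

    // providerUrl may be null or empty, meaning "do not set PROVIDER_URL at all".
    public static Properties build(String providerUrl) {
        Properties env = new Properties();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "org.jnp.interfaces.NamingContextFactory");
        env.put(Context.URL_PKG_PREFIXES, "org.jboss.naming:org.jnp.interfaces");
        // Only add the key when there is a real value; never store "".
        if (providerUrl != null && providerUrl.length() > 0) {
            env.put(Context.PROVIDER_URL, providerUrl);
        }
        return env;
    }

    public static void main(String[] args) {
        // With no argument the PROVIDER_URL key is simply absent.
        Properties env = build(args.length > 0 ? args[0] : null);
        System.out.println(env);
        // A real client would then call:
        //   new javax.naming.InitialContext(env).lookup("ejb/GUIDGenerator");
        // which needs the JBoss client JARs and a running server.
    }
}
```

Passing an explicit list such as "192.168.20.210:1100,192.168.20.104:1100" to build() would set the key. Passing nothing leaves it out entirely.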
> Scenario #3
> -----------
> The 3 nodes are running.
>
> From my client trying to lookup my EJB with:
> PROVIDER_URL = "192.168.20.210:1100,192.168.20.104:1100,192.168.20.118:1100"
> returns the correct lookup but ...
> The client looks up the same EJB 10 times.
That is logical: you need to make subsequent calls *with the same object* to get round-robin behaviour. You always get a *new* object (stub), so it is logical that it doesn't work.
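A toy model may make the point concrete: if the round-robin cursor lives inside the stub, only repeated calls on the same stub rotate across nodes. This is a sketch, not the real org.jboss.ha.framework.interfaces.RoundRobin policy. ToyStub and its fixed starting index are assumptions for illustration, and the real policy may choose its first target differently.

```java
import java.util.Arrays;
import java.util.List;

public class RoundRobinSketch {

    // Toy stand-in for a clustered stub: each instance keeps its own cursor.
    public static final class ToyStub {
        private final List<String> targets;
        private int cursor;

        public ToyStub(List<String> targets) {
            this.targets = targets;
            this.cursor = 0; // assumption: always start at the first target
        }

        // Returns the node that would serve this call, then advances the cursor.
        public String invoke() {
            String target = targets.get(cursor);
            cursor = (cursor + 1) % targets.size();
            return target;
        }
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node1", "node2", "node3");

        // Same stub reused: calls rotate node1, node2, node3.
        ToyStub stub = new ToyStub(nodes);
        for (int i = 0; i < 3; i++) {
            System.out.println("reused: " + stub.invoke());
        }

        // Fresh stub per call: the cursor restarts each time, so node1 every time.
        for (int i = 0; i < 3; i++) {
            System.out.println("fresh: " + new ToyStub(nodes).invoke());
        }
    }
}
```

In this model, looking up a new stub for every call throws away the rotation state, which is why the calls have to go through one stub to be spread across the nodes.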
> So according to the source code, the binding is not found in the HA-JNDI and it
> looks for it in the local JNDI tree which is not what I expected...
> Why is the name not bound through the HA-JNDI?
See the documentation; this is normal.
> Scenario #4 (fail-over)
> -----------------------
> The 3 nodes are running.
>
> The client looks up the same EJB in an infinite loop.
> PROVIDER_URL = "192.168.20.210:1100,192.168.20.104:1100,192.168.20.118:1100"
> (PROVIDER_URL="" does not work as mention earlier)
>
> The client runs happily and requests are dispatched to all nodes.
> I switch off the W2K node #3
> ...
> in the local JNDI tree).
> - once application is deployed and local JNDI tree is setup, the exceptions stop.
Stack trace appreciated. And try it on HEAD please (I don't want to hunt old bugs).
Cheers,
Sacha
14. Re: HA-JNDI (Fault tolerance)
fred_soulier Jul 30, 2002 10:35 AM (in response to fred_soulier)
Hi Sacha,
Thanks for your reply.
Today's CVS (HEAD) does not build:
...
generate-parsers:
[mkdir] Created dir: /home/fsoulier/development/JBoss_Head/jboss-all/server/output/parsers/org/jboss/ejb/plugins/cmp/ejbql
BUILD FAILED
file:/home/fsoulier/development/JBoss_Head/jboss-all/server/build.xml:382: Failed to launch JJTree
>>Don't set PROVIDER_URL = ""; simply leave PROVIDER_URL null (i.e. don't set it!).
>>If you still get exceptions, a stack trace would be appreciated.
Yep, not setting the PROVIDER_URL works.
>>That is logical: you need to make subsequent calls *with the same object* to get round-robin behaviour. You always get a *new* object (stub), so it is logical that it doesn't work.
OK, I made some changes to my client to call the same business method getGUID() 6 times on the same stub.
The results were:
Node #1: Linux (192.168.20.104)
Node #2: W2K (192.168.20.210)
Node #3: W2K (192.168.20.211)
1st batch
---------
Node #1: 1
Node #2: 3
Node #3: 2
2nd batch
---------
Node #1: 3
Node #2: 1
Node #3: 2
3rd batch
---------
Node #1: 1
Node #2: 3
Node #3: 2
4th batch
---------
Node #1: 1
Node #2: 3
Node #3: 2
5th batch
---------
Node #1: 3
Node #2: 1
Node #3: 2
So yes, there is some load balancing done. (Note: maybe I should do the test with more nodes and more calls using the same stub?)
>>See the documentation; this is normal.
Yes, a fine manual indeed :)
Page 17: "So, a EJB home lookup through HA-JNDI, will always be delegated to the local JNDI instance."
>>Stack trace appreciated. And try it on HEAD please (I don't want to hunt old bugs).
Client changed to call the getGUID() method 10000 times using the same stub, to test the failover capability of the stub.
3 nodes were running and all nodes were serving requests. I then decided to be mean and shut down 2 of them, and got this exception.
...
[java] (#0_6792_<1>) Got_GUID_Generator_Reference: ejb/GUIDGenerator:Stateless / Got_GUID: 5BA35C2B4058146800284ED632C81693
[java] (#0_6793_<1>) Got_GUID_Generator_Reference: ejb/GUIDGenerator:Stateless / Got_GUID: 5B9DC4EC4058142D000297FE0618AD26
[java] (#0_6794_<1>) Got_GUID_Generator_Reference: ejb/GUIDGenerator:Stateless / Got_GUID: 5BA35C344058146800284ED63E7FA0C0
[java] (#0_6795_<1>) Got_GUID_Generator_Reference: ejb/GUIDGenerator:Stateless / Got_GUID: 5B9DC4F64058142D000297FE5A5DB81F
[java] java.lang.IllegalStateException: container is not started, you cannot invoke ejb methods on it
[java] at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:240)
[java] at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:215)
[java] at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:117)
[java] at org.jboss.invocation.jrmp.server.JRMPInvoker_Stub.invoke(Unknown Source)
[java] at org.jboss.invocation.jrmp.interfaces.JRMPInvokerProxyHA.invoke(JRMPInvokerProxyHA.java:164)
[java] at org.jboss.invocation.InvokerInterceptor.invoke(InvokerInterceptor.java:92)
[java] at org.jboss.proxy.TransactionInterceptor.invoke(TransactionInterceptor.java:51)
[java] at org.jboss.proxy.SecurityInterceptor.invoke(SecurityInterceptor.java:48)
[java] at org.jboss.proxy.ejb.StatelessSessionInterceptor.invoke(StatelessSessionInterceptor.java:109)
[java] at org.jboss.proxy.ClientContainer.invoke(ClientContainer.java:82)
[java] at $Proxy2.getGUID(Unknown Source)
[java] at com.lastminute.ebasket.RMISSLTest.main(RMISSLTest.java:90)
/Fred