1 Reply Latest reply on Dec 8, 2010 10:24 PM by eduardo_thp

JBoss 4.2.3.GA - Issue when a node is removed from the LAN

eduardo_thp Dec 8, 2010 12:00 PM

Hello,

I'm using JBoss 4.2.3.GA in a TCP clustered environment (the issue I'm describing has been seen on a 2-node cluster and on a 4-node cluster).

I have the following scenario:

- no loadbalancers

- a cluster of two or four jboss servers (shouldn't matter, issue was seen on both envs)

- WebApplication that has a pushlet connection for pushing events from the server to the browser

- On the web server side of the application I have a thread running for monitoring the state of the pushlet connection

- On the client side I'm also monitoring the state of the pushlet connection (JavaScript)

* If the client side (browser) detects that the pushlet connection has been lost, it tries to reconnect to the same server; if that fails, it connects to another server in the cluster (same domain, so no browser security issues).

* If the server side (thread) detects that the pushlet connection has been lost, it starts a timeout; if the connection hasn't been re-established when the timeout expires, the thread invalidates the user session.

Notice:

(my code also stores a session attribute that identifies the server the pushlet is currently connected to)

(Application failover occurs without problem when the failover is caused by a server or jboss shutdown/restart)
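
The server-side monitoring described above can be sketched roughly as follows. This is only an illustration under assumptions: `PushletWatchdog`, `SessionHandle`, and the attribute name `pushlet.server` are hypothetical names made up for the example, and `SessionHandle` stands in for the two `HttpSession` operations the monitor needs. It is not the actual application code.

```java
// Hypothetical sketch: class, interface, and attribute names are illustrative only.
final class PushletWatchdog {

    /** Minimal stand-in for the two HttpSession operations the monitor needs. */
    interface SessionHandle {
        void setAttribute(String name, Object value);
        void invalidate();
    }

    static final String SERVER_ATTR = "pushlet.server"; // assumed attribute name

    private final SessionHandle session;
    private final long timeoutMillis;
    private long lostSinceMillis = -1L; // -1 means the pushlet is connected
    private boolean invalidated;

    PushletWatchdog(SessionHandle session, long timeoutMillis) {
        this.session = session;
        this.timeoutMillis = timeoutMillis;
    }

    /** Called when the pushlet (re)connects to this node: record the node in the session. */
    synchronized void onConnected(String serverAddress) {
        if (invalidated) {
            return; // nothing to record on a dead session
        }
        lostSinceMillis = -1L;
        session.setAttribute(SERVER_ATTR, serverAddress);
    }

    /** Called when the pushlet connection is detected as lost: start the timeout once. */
    synchronized void onLost(long nowMillis) {
        if (lostSinceMillis < 0) {
            lostSinceMillis = nowMillis;
        }
    }

    /** Periodic check; returns true if this call invalidated the session. */
    synchronized boolean check(long nowMillis) {
        if (invalidated || lostSinceMillis < 0) {
            return false;
        }
        if (nowMillis - lostSinceMillis >= timeoutMillis) {
            session.invalidate();
            invalidated = true;
            return true;
        }
        return false;
    }
}
```

In this sketch the monitoring thread would call `check(System.currentTimeMillis())` periodically; passing the time in as a parameter just makes the timeout logic easy to exercise without sleeping.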

--------------------

The issue:

- Server A is up and Server B is up

- The user has opened the browser and connected to Server A (pushlet is up, my monitoring thread starts running)

- Server A has its LAN cable disconnected from the network

- Browser code detects the failure, starts trying a reconnection, reconnects to Server B (Failover Successful)

- User uses the app without problems

- After about 5 minutes, Server A has its LAN cable reconnected to the network

***** now the problem starts *****

There seem to be no merge issues:

ServerALog:

2010-12-07 21:54:17,792 INFO [org.jboss.cache.TreeCache] viewAccepted(): MergeView::[161.134.28.20:7810|3] [161.134.28.20:7810, 161.134.28.21:7810], subgroups=[[161.134.28.20:7810|2] [161.134.28.20:7810], [161.134.28.21:7810|2] [161.134.28.21:7810]]

ServerBLog:

2010-12-07 21:54:17,819 INFO [org.jboss.cache.TreeCache] viewAccepted(): MergeView::[161.134.28.20:7810|3] [161.134.28.20:7810, 161.134.28.21:7810], subgroups=[[161.134.28.20:7810|2] [161.134.28.20:7810], [161.134.28.21:7810|2] [161.134.28.21:7810]]

FD_SOCK SUSPECT warnings on both servers:

ServerALog:

2010-12-07 21:54:38,178 WARN [org.jgroups.protocols.FD_SOCK] I was suspected by 161.134.28.21:7810; ignoring the SUSPECT message

ServerBLog:

2010-12-07 21:54:38,031 WARN [org.jgroups.protocols.FD_SOCK] I was suspected by 161.134.28.20:7810; ignoring the SUSPECT message

My monitor retrieves a different value on each server for the attribute that is stored in the session:

ServerA:

... SessionMinder] [] [] **** SESSION SERVER: 161.134.28.20

ServerB:

... SessionMinder] [] [] **** SESSION SERVER: 161.134.28.21

The monitoring thread on Server A invalidates the session after the timeout, and on Server B I see the following messages:

2010-12-07 21:56:08,167 INFO [org.jboss.web.tomcat.service.session.CacheListener] Possible concurrency problem: Replicated version id 50 matches in-memory version for session 8K3xQotPH-OjVp91acqZRw**
2010-12-07 21:56:08,167 DEBUG [org.jboss.web.tomcat.service.session.ClusteredSession] The session has expired with id: 8K3xQotPH-OjVp91acqZRw** -- is it local? true
2010-12-07 21:56:08,167 DEBUG [org.jboss.cache.TreeCache] Performing a real remove for node /JSESSION/localhost/HISWebUI/8K3xQotPH-OjVp91acqZRw**, marked for removal.

User is redirected to the logon page

My cluster configuration is the default that ships with JBoss; the only change I made was to use TCP instead of UDP:

...

<attribute name="CacheMode">REPL_ASYNC</attribute>

<attribute name="UseRegionBasedMarshalling">false</attribute>

...

....

Any idea what could be happening, or how I can obtain more information on what's going on?

I didn't want to switch to REPL_SYNC. I also saw something about configuring the cache to do invalidation instead of replication; how do I configure that?

Thanks,

Eddie

Additional Info:

* Tried modifying the configuration to use REPL_SYNC, and that didn't resolve the problem

* We are using AIX

* Having a look at the logs, I also noticed that our pushlet only has its outputStream closed when the server gets its LAN cable reconnected to the network.

It seems that on a LAN failure the streams aren't properly closed and stay open for quite some time... not sure if that could be causing problems for the replication code as well.

Is it possible that changing an OS setting would give a different result when a LAN disconnection happens?

1. Re: JBoss 4.2.3.GA - Issue when a node is removed from the LAN

eduardo_thp Dec 8, 2010 10:24 PM (in response to eduardo_thp)

I noticed an issue in my code where an InterruptedException was being swallowed.

After handling it properly ( Thread.currentThread().interrupt() ), I re-ran the tests, and now JBossCacheService on Server A throws an exception when the nodes try to merge.

Not sure if that could be related to the AIX JVM implementation or ....

2010-12-08 20:53:42,276 DEBUG [org.jboss.web.tomcat.service.session.JBossCacheManager] processSessionRepl(): failed with exception
java.lang.RuntimeException: JBossCacheService: exception occurred in cache put ...
        at org.jboss.web.tomcat.service.session.JBossCacheWrapper.put(JBossCacheWrapper.java:147)
        at org.jboss.web.tomcat.service.session.JBossCacheService.putSession(JBossCacheService.java:325)
        at org.jboss.web.tomcat.service.session.JBossCacheClusteredSession.processSessionRepl(JBossCacheClusteredSession.java:123)
        at org.jboss.web.tomcat.service.session.JBossCacheManager.processSessionRepl(JBossCacheManager.java:1127)
        at org.jboss.web.tomcat.service.session.JBossCacheManager.storeSession(JBossCacheManager.java:682)
        at org.jboss.web.tomcat.service.session.InstantSnapshotManager.snapshot(InstantSnapshotManager.java:49)
        at org.jboss.web.tomcat.service.session.ClusteredSessionValve.invoke(ClusteredSessionValve.java:108)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:432)
        at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:84)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:157)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:262)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:446)
        at java.lang.Thread.run(Thread.java:736)
Caused by:
java.lang.RuntimeException: java.lang.InterruptedException
        at org.jboss.cache.TreeCache.invokeMethod(TreeCache.java:5931)
        at org.jboss.cache.TreeCache.put(TreeCache.java:3784)
        at sun.reflect.GeneratedMethodAccessor137.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
        at java.lang.reflect.Method.invoke(Method.java:600)
        at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
        at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
        at org.jboss.mx.server.Invocation.invoke(Invocation.java:86)
        at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:193)
        at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
        at org.jboss.mx.util.MBeanProxyExt.invoke(MBeanProxyExt.java:210)
        at $Proxy58.put(Unknown Source)
        at org.jboss.web.tomcat.service.session.JBossCacheWrapper.put(JBossCacheWrapper.java:138)
        ... 17 more
Caused by:
java.lang.InterruptedException
        at org.jboss.cache.lock.ReadWriteLockWithUpgrade$ReaderLock.attempt(ReadWriteLockWithUpgrade.java:303)
        at org.jboss.cache.lock.IdentityLock.acquireReadLock(IdentityLock.java:252)
        at org.jboss.cache.Node.acquireReadLock(Node.java:545)
        at org.jboss.cache.Node.acquire(Node.java:507)
        at org.jboss.cache.interceptors.PessimisticLockInterceptor.acquireNodeLock(PessimisticLockInterceptor.java:410)
        at org.jboss.cache.interceptors.PessimisticLockInterceptor.lock(PessimisticLockInterceptor.java:322)
        at org.jboss.cache.interceptors.PessimisticLockInterceptor.invoke(PessimisticLockInterceptor.java:189)
        at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
        at org.jboss.cache.interceptors.UnlockInterceptor.invoke(UnlockInterceptor.java:32)
        at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
        at org.jboss.cache.interceptors.ReplicationInterceptor.invoke(ReplicationInterceptor.java:39)
        at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
        at org.jboss.cache.interceptors.TxInterceptor.handleNonTxMethod(TxInterceptor.java:379)
        at org.jboss.cache.interceptors.TxInterceptor.invoke(TxInterceptor.java:174)
        at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
        at org.jboss.cache.interceptors.CacheMgmtInterceptor.invoke(CacheMgmtInterceptor.java:167)
        at org.jboss.cache.TreeCache.invokeMethod(TreeCache.java:5919)
        ... 29 more
2010-12-08 20:53:42,278 WARN [org.jboss.web.tomcat.service.session.InstantSnapshotManager./HISWebUI] Failed to replicate session LqAw8zAQHwaQH04WQvb2HQ**
java.lang.RuntimeException: JBossCacheService: exception occurred in cache put ...
        at org.jboss.web.tomcat.service.session.JBossCacheWrapper.put(JBossCacheWrapper.java:147)
        at org.jboss.web.tomcat.service.session.JBossCacheService.putSession(JBossCacheService.java:325)
        at org.jboss.web.tomcat.service.session.JBossCacheClusteredSession.processSessionRepl(JBossCacheClusteredSession.java:123)
        at org.jboss.web.tomcat.service.session.JBossCacheManager.processSessionRepl(JBossCacheManager.java:1127)
        at org.jboss.web.tomcat.service.session.JBossCacheManager.storeSession(JBossCacheManager.java:682)
        at org.jboss.web.tomcat.service.session.InstantSnapshotManager.snapshot(InstantSnapshotManager.java:49)
        at org.jboss.web.tomcat.service.session.ClusteredSessionValve.invoke(ClusteredSessionValve.java:108)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:432)
        at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:84)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:157)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:262)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:446)
        at java.lang.Thread.run(Thread.java:736)
Caused by:
java.lang.RuntimeException: java.lang.InterruptedException
        at org.jboss.cache.TreeCache.invokeMethod(TreeCache.java:5931)
        at org.jboss.cache.TreeCache.put(TreeCache.java:3784)
        at sun.reflect.GeneratedMethodAccessor137.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
        at java.lang.reflect.Method.invoke(Method.java:600)
        at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
        at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
        at org.jboss.mx.server.Invocation.invoke(Invocation.java:86)
        at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:193)
        at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
        at org.jboss.mx.util.MBeanProxyExt.invoke(MBeanProxyExt.java:210)
        at $Proxy58.put(Unknown Source)
        at org.jboss.web.tomcat.service.session.JBossCacheWrapper.put(JBossCacheWrapper.java:138)
        ... 17 more
Caused by:
java.lang.InterruptedException
        at org.jboss.cache.lock.ReadWriteLockWithUpgrade$ReaderLock.attempt(ReadWriteLockWithUpgrade.java:303)
        at org.jboss.cache.lock.IdentityLock.acquireReadLock(IdentityLock.java:252)
        at org.jboss.cache.Node.acquireReadLock(Node.java:545)
        at org.jboss.cache.Node.acquire(Node.java:507)
        at org.jboss.cache.interceptors.PessimisticLockInterceptor.acquireNodeLock(PessimisticLockInterceptor.java:410)
        at org.jboss.cache.interceptors.PessimisticLockInterceptor.lock(PessimisticLockInterceptor.java:322)
        at org.jboss.cache.interceptors.PessimisticLockInterceptor.invoke(PessimisticLockInterceptor.java:189)
        at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
        at org.jboss.cache.interceptors.UnlockInterceptor.invoke(UnlockInterceptor.java:32)
        at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
        at org.jboss.cache.interceptors.ReplicationInterceptor.invoke(ReplicationInterceptor.java:39)
        at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
        at org.jboss.cache.interceptors.TxInterceptor.handleNonTxMethod(TxInterceptor.java:379)
        at org.jboss.cache.interceptors.TxInterceptor.invoke(TxInterceptor.java:174)
        at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
        at org.jboss.cache.interceptors.CacheMgmtInterceptor.invoke(CacheMgmtInterceptor.java:167)
        at org.jboss.cache.TreeCache.invokeMethod(TreeCache.java:5919)
        ... 29 more
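
For reference, the change mentioned at the top of this reply (restoring the interrupt flag instead of swallowing the InterruptedException) looks roughly like the sketch below. `InterruptAwareWorker` and `pause` are hypothetical names used only for illustration; the real monitoring code is not shown in this thread.

```java
// Hypothetical helper illustrating the pattern; not the actual application code.
final class InterruptAwareWorker {

    /**
     * Sleeps for the given time. Returns true if the sleep completed,
     * false if the thread was interrupted. On interruption the interrupt
     * flag is set again so that callers further up can still observe it.
     */
    static boolean pause(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            // Catching InterruptedException clears the flag; restore it
            // rather than swallowing the interruption.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

One consequence worth keeping in mind: once the flag has been restored, the next blocking call on that same thread that checks the interrupt status will throw InterruptedException right away, so it matters which thread the flag ends up set on.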