12 Replies Latest reply on Aug 8, 2005 1:47 AM by bstansberry

ClusteredSingleSignOn - TreeCacheSSOClusterManager fails to

drzewo Jul 28, 2005 11:16 AM

I'am constantly getting this on both server startup as well as on each SSO replication event.

16:45:53,212 INFO [TreeCacheSSOClusterManager] Cannot find TreeCache using jboss.cache:service=TomcatClusteringCache --
 treeCacheName must be set to point to a running TreeCache before ClusteredSingleSignOn can handle requests

TreeCache *is* however configured beforehand as the log entries state. This happens to me when using my customized (stripped down a bit) server configuration. Then I gave a spin to the "all" server and observed similar behaviour. The only changes in original "all" configuration files were:
* enabling the org.jboss.web.tomcat.tc5.sso.ClusteredSingleSignOn Valve in tomcat server.xml
* fix the dependency of Tomcat service on the jboss.cache:service=TomcatClusteringCache in tomcat's jboss-service.xml

There is a part of the log (greped by TreeCache, showing that the apropriate tree cache had already been started)

2005-07-28 16:53:22,027 DEBUG [org.jboss.system.ServiceCreator] About to create bean: jboss.cache:service=TomcatClusteringCache with code: org.jboss.cache.TreeCache
2005-07-28 16:53:22,199 INFO [org.jboss.cache.TreeCache] setting cluster properties from xml to: UDP(ip_mcast=true;ip_ttl=64;loopback=false;mcast_addr=230.1.2.3;mcast_port=45577;mcast_recv_buf_size=80000;mcast_send_buf_size=150000;ucast_recv_buf_size=80000;ucast_send_buf_size=150000):PING(down_thread=false;num_initial_members=3;timeout=2000;up_thread=false):MERGE2(max_interval=20000;min_interval=10000):FD_SOCK:VERIFY_SUSPECT(down_thread=false;timeout=1500;up_thread=false):pbcast.NAKACK(down_thread=false;gc_lag=50;max_xmit_size=8192;retransmit_timeout=600,1200,2400,4800;up_thread=false):UNICAST(down_thread=false;min_threshold=10;timeout=600,1200,2400;window_size=100):pbcast.STABLE(desired_avg_gossip=20000;down_thread=false;up_thread=false):FRAG(down_thread=false;frag_size=8192;up_thread=false):pbcast.GMS(join_retry_timeout=2000;join_timeout=5000;print_local_addr=true;shun=true):pbcast.STATE_TRANSFER(down_thread=true;up_thread=true)
2005-07-28 16:53:22,215 DEBUG [org.jboss.cache.TreeCache] Creating jboss.cache:service=TomcatClusteringCache
2005-07-28 16:53:22,215 DEBUG [org.jboss.cache.TreeCache] Created jboss.cache:service=TomcatClusteringCache
2005-07-28 16:53:22,230 DEBUG [org.jboss.cache.TreeCache] Starting jboss.cache:service=TomcatClusteringCache
2005-07-28 16:53:22,293 INFO [org.jboss.cache.TreeCache] interceptor chain is:
2005-07-28 16:53:22,293 INFO [org.jboss.cache.TreeCache] cache mode is REPL_SYNC
2005-07-28 16:53:24,387 INFO [org.jboss.cache.TreeCache] viewAccepted(): new members: [W20609:3701]
2005-07-28 16:53:24,387 INFO [org.jboss.cache.TreeCache] new cache is null (maybe first member in cluster)
2005-07-28 16:53:24,402 INFO [org.jboss.cache.TreeCache] state could not be retrieved (must be first member in group)
2005-07-28 16:53:24,402 DEBUG [org.jboss.cache.TreeCache] Started jboss.cache:service=TomcatClusteringCache
2005-07-28 16:53:26,496 INFO [org.jboss.web.tomcat.tc5.sso.TreeCacheSSOClusterManager] Cannot find TreeCache using jboss.cache:service=TomcatClusteringCache -- treeCacheName must be set to point to a running TreeCache before ClusteredSingleSignOn can handle requests

Need I mention that the SSO data do not get replicated across the cluster while the plain http session data does?

I'am running jboss 4.0.1sp1 on the latest Sun Java 1.5.0_04 on XP sp2.
Now, there comes the weird part. The *very same* configuration when gzipped and moved over to another machine (yep, as you might have expected running jboss 4.0.1sp1 on the latest Sun Java 1.5.0_04 on XP sp2, too) runs like a charm?

Has anybody else run into similar problem?

1. Re: ClusteredSingleSignOn - TreeCacheSSOClusterManager fails

bstansberry Jul 30, 2005 6:27 PM (in response to drzewo)

This is definitely quite odd if the same config works on one machine and not another. As you state, the log clearly shows the TreeCache being started and registered in JMX, yet TreeCacheSSOClusterManager cannot find it when it queries the JMX server.

Couple things:

1) Can you go into the jmx-console and confirm the jboss.cache:service=TomcatClusteringCache MBean is indeed registered?

2) You noted that you get a message in the log for each SSO replication event. Can you confirm that the first one is logged at INFO level (the one you included in your report), followed by an ERROR level, followed thereafter by WARN levels. This is the expected behavior when it can't find the cache; if it's not what you're getting maybe that will give me a clue.

3) If the MBean is visible in the jmx-console, can you e-mail me your tc5-cluster-service.xml file and your server.xml file? Address is b_stansberry AT hotmail DOTcom
Actions
2. Re: ClusteredSingleSignOn - TreeCacheSSOClusterManager fails

drzewo Jul 31, 2005 5:23 PM (in response to drzewo)

Again, here are the datailed steps I took to get this "odd behaviour" surface.
1. Take the standard "all" configuration. It has all in store (session replication via TreeCache defined in tc5-cluster-service.xml)
2. Add the dependency on the jboss.cache:service=TomcatClusteringCache to the jbossweb-tomcat50.sar/META-INF/jboss-service.xml
3. Enable the ClusteredSingleSignOn valve in tomcat's server.xml

That's all. I hereby confirm that jboss.cache:service=TomcatClusteringCache MBean is indeed registered, moreover standard sessions do get replicated across the members of the cluster (I first observed this behaviour when starting two instances of jboss on one machine binding them to two different IPs, but this also happens when I try to start jboss without specifying the IP address to bind).

As for the questions you have asked - let me double check tomorrow at work where oddity happens (particularly Q2 as I seem to have answered Q1 already, and my response to Q3 does not seem to be necessary - no particular changes to the files you had requested). On second thought, I'll email the modifed files to you.

BTW, this also happens to jboss 4.0.2.
Actions

3. Re: ClusteredSingleSignOn - TreeCacheSSOClusterManager fails

drzewo Aug 1, 2005 6:57 AM (in response to drzewo)

I checked what you wanted me to and confirm that the TreeCache is up and
running.

Here is what I get on the stdout (Sorry for the huge log)

===============================================================================
.
 JBoss Bootstrap Environment
.
 JBOSS_HOME: D:\java\jboss-4.0.1sp1\bin\\..
.
 JAVA: c:\java\jdk1.5.0\bin\java
.
 JAVA_OPTS: -Dprogram.name=run.bat -Dcom.sun.management.jmxremote -Xms32m -Xmx128m
.
 CLASSPATH: c:\java\jdk1.5.0\lib\tools.jar;D:\java\jboss-4.0.1sp1\bin\\run.jar
.
 CMD PARAMS: -c all1 -b node1
.
===============================================================================
.
12:11:27,171 INFO [Server] Starting JBoss (MX MicroKernel)...
12:11:27,233 INFO [Server] Release ID: JBoss [Zion] 4.0.1sp1 (build: CVSTag=JBoss_4_0_1_SP1 date=200502160314)
12:11:27,296 INFO [Server] Home Dir: D:\java\jboss-4.0.1sp1
12:11:27,374 INFO [Server] Home URL: file:/D:/java/jboss-4.0.1sp1/
12:11:27,389 INFO [Server] Library URL: file:/D:/java/jboss-4.0.1sp1/lib/
12:11:27,405 INFO [Server] Patch URL: null
12:11:27,421 INFO [Server] Server Name: all1
12:11:27,436 INFO [Server] Server Home Dir: D:\java\jboss-4.0.1sp1\server\all1
12:11:27,436 INFO [Server] Server Home URL: file:/D:/java/jboss-4.0.1sp1/server/all1/
12:11:27,468 INFO [Server] Server Data Dir: D:\java\jboss-4.0.1sp1\server\all1\data
12:11:27,483 INFO [Server] Server Temp Dir: D:\java\jboss-4.0.1sp1\server\all1\tmp
12:11:27,499 INFO [Server] Server Config URL: file:/D:/java/jboss-4.0.1sp1/server/all1/conf/
12:11:27,499 INFO [Server] Server Library URL: file:/D:/java/jboss-4.0.1sp1/server/all1/lib/
12:11:27,514 INFO [Server] Root Deployment Filename: jboss-service.xml
12:11:27,530 INFO [Server] Starting General Purpose Architecture (GPA)...
12:11:28,155 INFO [ServerInfo] Java version: 1.5.0_04,Sun Microsystems Inc.
12:11:28,171 INFO [ServerInfo] Java VM: Java HotSpot(TM) Client VM 1.5.0_04-b05,Sun Microsystems Inc.
12:11:28,186 INFO [ServerInfo] OS-System: Windows XP 5.1,x86
12:11:28,936 INFO [Server] Core system initialized
12:11:33,030 INFO [Log4jService$URLWatchTimerTask] Configuring from URL: resource:log4j.xml
12:11:33,202 INFO [WebService] Using RMI server codebase: http://node1:8083/
12:11:33,499 INFO [NamingService] Started jndi bootstrap jnpPort=1099,rmiPort=1098, backlog=50, bindAddress=node1/10.0.0.1, Client SocketFactory=null, Server SocketFactory=org.jboss.net.sockets.DefaultSocketFactory@ad093076
12:11:43,529 INFO [DefaultPartition] Initializing
12:11:43,701 INFO [STDOUT]
-------------------------------------------------------
GMS: address is node1:2403 (additional data: 13 bytes)
-------------------------------------------------------
12:11:45,764 INFO [DefaultPartition] Number of cluster members: 1
12:11:45,779 INFO [DefaultPartition] Other members: 0
12:11:45,795 INFO [DefaultPartition] New cluster view for partition DefaultPartition (id: 0, delta: 0) : [10.0.0.1:1099]
12:11:45,811 INFO [DefaultPartition] Fetching state (will wait for 30000 milliseconds):
12:11:45,842 INFO [DefaultPartition] I am (10.0.0.1:1099) received membershipChanged event:
12:11:45,858 INFO [DefaultPartition] Dead members: 0 ([])
12:11:45,873 INFO [DefaultPartition] New Members : 0 ([])
12:11:45,889 INFO [DefaultPartition] All Members : 1 ([10.0.0.1:1099])
12:11:46,154 INFO [HANamingService] Listening on node1/10.0.0.1:1100
12:11:46,170 INFO [DetachedHANamingService$AutomaticDiscovery] Listening on /10.0.0.1:1102, group=230.0.0.4, HA-JNDI ad
dress=10.0.0.1:1100
12:11:49,873 INFO [MailService] Mail Service bound to java:/Mail
12:11:50,514 INFO [BlockingServer] Remoting 'async' protocol clients will connect to: InvokerLocator [async://10.160.32.251:8085/?compression=-1&tcp.nodelay=true]
12:11:51,029 INFO [TreeCache] setting cluster properties from xml to: UDP(ip_mcast=true;ip_ttl=64;loopback=false;mcast_addr=230.1.2.3;mcast_port=45577;mcast_recv_buf_size=80000;mcast_send_buf_size=150000;ucast_recv_buf_size=80000;ucast_send_buf_size=150000):PING(down_thread=false;num_initial_members=3;timeout=2000;up_thread=false):MERGE2(max_interval=20000;min_interval=10000):FD_SOCK:VERIFY_SUSPECT(down_thread=false;timeout=1500;up_thread=false):pbcast.NAKACK(down_thread=false;gc_lag=50;max_xmit_size=8192;retransmit_timeout=600,1200,2400,4800;up_thread=false):UNICAST(down_thread=false;min_threshold=10;timeout=600,1200,2400;window_size=100):pbcast.STABLE(desired_avg_gossip=20000;down_thread=false;up_thread=false):FRAG(down_thread=false;frag_size=8192;up_thread=false):pbcast.GMS(join_retry_timeout=2000;join_timeout=5000;print_local_addr=true;shun=true):pbcast.STATE_TRANSFER(down_thread=true;up_thread=true)
12:11:51,170 INFO [TreeCache] interceptor chain is:
class org.jboss.cache.interceptors.CallInterceptor
class org.jboss.cache.interceptors.LockInterceptor
class org.jboss.cache.interceptors.CreateIfNotExistsInterceptor
class org.jboss.cache.interceptors.ReplicationInterceptor
12:11:51,217 INFO [TreeCache] cache mode is REPL_SYNC
12:11:51,310 INFO [STDOUT]
-------------------------------------------------------
GMS: address is node1:2408
-------------------------------------------------------
12:11:53,467 INFO [TreeCache] viewAccepted(): new members: [node1:2408]
12:11:53,576 INFO [TreeCache] state could not be retrieved (must be first member in group)
12:11:53,576 INFO [TreeCache] new cache is null (maybe first member in cluster)
12:11:56,310 INFO [Embedded] Catalina naming disabled
12:11:59,779 INFO [Http11Protocol] Initializing Coyote HTTP/1.1 on http-node1%2F10.0.0.1-8080
12:11:59,888 INFO [Catalina] Initialization processed in 2390 ms
12:11:59,904 INFO [StandardService] Starting service jboss.web
12:11:59,951 INFO [StandardEngine] Starting Servlet Engine: Apache Tomcat/5.0.28
12:12:00,029 INFO [StandardHost] XML validation disabled
12:12:00,044 INFO [TreeCacheSSOClusterManager] Cannot find TreeCache using jboss.cache:service=TomcatClusteringCache --
 treeCacheName must be set to point to a running TreeCache before ClusteredSingleSignOn can handle requests
12:12:00,076 INFO [Engine] SingleSignOn[localhost]: Started
12:12:00,201 INFO [Catalina] Server startup in 297 ms
12:12:00,216 INFO [Tomcat5] Setting the cache name to jboss.cache:service=TomcatClusteringCache on jboss.web:type=Valve
,name=ClusteredSingleSignOn,host=localhost
12:12:00,701 INFO [TomcatDeployer] deploy, ctxPath=/invoker, warUrl=file:/D:/java/jboss-4.0.1sp1/server/all1/deploy/httpha-invoker.sar/invoker.war/
12:12:03,091 INFO [TomcatDeployer] deploy, ctxPath=/ws4ee, warUrl=file:/D:/java/jboss-4.0.1sp1/server/all1/tmp/deploy/tmp36577jboss-ws4ee-exp.war/
12:12:03,982 INFO [TomcatDeployer] deploy, ctxPath=/, warUrl=file:/D:/java/jboss-4.0.1sp1/server/all1/deploy/jbossweb-tomcat50.sar/ROOT.war/
12:12:04,435 INFO [TomcatDeployer] deploy, ctxPath=/c1, warUrl=file:/D:/java/jboss-4.0.1sp1/server/all1/deploy/jbossweb-tomcat50.sar/c1.war/
12:12:04,732 INFO [JBossCacheManager] init(): replicationGranularity_ is 0 and invaldateSessionPolicy is 2
12:12:04,779 INFO [JBossCacheManager] Starting JBossManager
12:12:04,966 INFO [TomcatDeployer] deploy, ctxPath=/c2, warUrl=file:/D:/java/jboss-4.0.1sp1/server/all1/deploy/jbossweb
-tomcat50.sar/c2.war/
12:12:05,216 INFO [JBossCacheManager] init(): replicationGranularity_ is 0 and invaldateSessionPolicy is 2
12:12:05,294 INFO [JBossCacheManager] Starting JBossManager
12:12:05,419 INFO [TomcatDeployer] deploy, ctxPath=/jbossmq-httpil, warUrl=file:/D:/java/jboss-4.0.1sp1/server/all1/deploy-hasingleton/jms/jbossmq-httpil.sar/jbossmq-httpil.war/
12:12:06,419 INFO [RARDeployment] Required license terms exist view the META-INF/ra.xml: file:/D:/java/jboss-4.0.1sp1/server/all1/deploy/jboss-local-jdbc.rar
12:12:06,716 INFO [RARDeployment] Required license terms exist view the META-INF/ra.xml: file:/D:/java/jboss-4.0.1sp1/server/all1/deploy/jboss-xa-jdbc.rar
12:12:07,013 INFO [RARDeployment] Required license terms exist view the META-INF/ra.xml: file:/D:/java/jboss-4.0.1sp1/server/all1/deploy/jms/jms-ra.rar
12:12:07,263 INFO [RARDeployment] Required license terms exist view the META-INF/ra.xml: file:/D:/java/jboss-4.0.1sp1/server/all1/deploy/mail-ra.rar
12:12:09,810 INFO [WrapperDataSourceService] Bound connection factory for resource adapter for ConnectionManager 'jboss.jca:service=DataSourceBinding,name=DefaultDS to JNDI name 'java:DefaultDS'
12:12:09,888 WARN [ConfiguredIdentityLoginModule] Creating LoginModule with no configured password!
12:12:10,435 INFO [A] Bound to JNDI name: queue/A
12:12:10,450 INFO [B] Bound to JNDI name: queue/B
12:12:10,466 INFO [C] Bound to JNDI name: queue/C
12:12:10,482 INFO [D] Bound to JNDI name: queue/D
12:12:10,497 INFO [ex] Bound to JNDI name: queue/ex
12:12:10,560 INFO [testTopic] Bound to JNDI name: topic/testTopic
12:12:10,575 INFO [securedTopic] Bound to JNDI name: topic/securedTopic
12:12:10,591 INFO [testDurableTopic] Bound to JNDI name: topic/testDurableTopic
12:12:10,607 INFO [testQueue] Bound to JNDI name: queue/testQueue
12:12:10,716 INFO [OILServerILService] JBossMQ OIL service available at : node1/10.0.0.1:8090
12:12:10,825 INFO [UILServerILService] JBossMQ UIL service available at : node1/10.0.0.1:8093
12:12:10,903 INFO [DLQ] Bound to JNDI name: queue/DLQ
12:12:11,325 INFO [ConnectionFactoryBindingService] Bound connection factory for resource adapter for ConnectionManager
 'jboss.jca:service=ConnectionFactoryBinding,name=JmsXA to JNDI name 'java:JmsXA'
12:12:11,622 INFO [TomcatDeployer] deploy, ctxPath=/jmx-console, warUrl=file:/D:/java/jboss-4.0.1sp1/server/all1/deploy/jmx-console.war/
12:12:12,700 INFO [TomcatDeployer] deploy, ctxPath=/web-console, warUrl=file:/D:/java/jboss-4.0.1sp1/server/all1/deploy/management/web-console.war/
12:12:17,028 INFO [Http11Protocol] Starting Coyote HTTP/1.1 on http-node1%2F10.0.0.1-8080
12:12:17,966 INFO [ChannelSocket] JK2: ajp13 listening on node1/10.0.0.1:8009
12:12:17,981 INFO [JkMain] Jk running ID=0 time=0/78 config=null
12:12:18,028 INFO [Server] JBoss (MX MicroKernel) [4.0.1sp1 (build: CVSTag=JBoss_4_0_1_SP1 date=200502160314)] Started
in 47s:780ms
12:12:26,278 INFO [DefaultPartition] New cluster view for partition DefaultPartition (id: 1, delta: 1) : [10.0.0.1:1099, 10.0.0.2:1099]
12:12:26,293 INFO [DefaultPartition] I am (10.0.0.1:1099) received membershipChanged event:
12:12:26,309 INFO [DefaultPartition] Dead members: 0 ([])
12:12:26,325 INFO [DefaultPartition] New Members : 1 ([10.0.0.2:1099])
12:12:26,325 INFO [DefaultPartition] All Members : 2 ([10.0.0.1:1099, 10.0.0.2:1099])
12:12:32,293 INFO [TreeCache] locking the tree to obtain transient state
12:12:32,309 INFO [TreeCache] returning the transient state (217 bytes)
12:12:32,340 INFO [TreeCache] viewAccepted(): new members: [node1:2408, node2:2419]
12:13:06,104 INFO [Engine] SingleSignOn[localhost]: Process request for '/c1/tsr.jsp'
12:13:06,120 INFO [Engine] SingleSignOn[localhost]: Checking for SSO cookie
12:13:06,136 INFO [Engine] SingleSignOn[localhost]: SSO cookie is not present
12:13:11,339 INFO [Engine] SingleSignOn[localhost]: Process request for '/c1/tsr.jsp'
12:13:11,354 INFO [Engine] SingleSignOn[localhost]: Checking for SSO cookie
12:13:11,370 INFO [Engine] SingleSignOn[localhost]: SSO cookie is not present
12:13:11,526 INFO [Engine] SingleSignOn[localhost]: Registering sso id '1339792CA19F19753196F3794A22CC84' for user 'use
r1' with auth type 'BASIC'
12:13:11,542 ERROR [TreeCacheSSOClusterManager] Cannot find TreeCache using jboss.cache:service=TomcatClusteringCache --
 TreeCache must be started before ClusteredSingleSignOn can handle requests
12:13:11,635 INFO [STDOUT] Can't find resource org.jboss.web.tomcat.session.LocalStrings org.jboss.web.tomcat.tc5.WebCt
xLoader$ENCLoader@2c5444
12:13:11,667 INFO [STDOUT] [Ljava.net.URL;@157402b
12:13:11,870 INFO [Engine] SingleSignOn[localhost]: Associate sso id 1339792CA19F19753196F3794A22CC84 with session SessionBasedClusteredSession[qM-ENRL0Lg21qb4ka3n4VA**.node1]
12:13:11,885 WARN [TreeCacheSSOClusterManager] Cannot find TreeCache using jboss.cache:service=TomcatClusteringCache --
 TreeCache must be started before ClusteredSingleSignOn can handle requests
12:13:34,900 INFO [Engine] SingleSignOn[localhost]: Process request for '/c2/tsr.jsp'
12:13:34,916 INFO [Engine] SingleSignOn[localhost]: Checking for SSO cookie
12:13:34,931 INFO [Engine] SingleSignOn[localhost]: Checking for cached principal for 1339792CA19F19753196F3794A22CC84
12:13:34,947 INFO [Engine] SingleSignOn[localhost]: Found cached principal 'user1' with auth type 'BASIC'
12:13:35,041 INFO [Engine] SingleSignOn[localhost]: Associate sso id 1339792CA19F19753196F3794A22CC84 with session SessionBasedClusteredSession[DwQHoSanZlEVYjCsRjQp4w**.node1]
12:13:35,072 WARN [TreeCacheSSOClusterManager] Cannot find TreeCache using jboss.cache:service=TomcatClusteringCache --
 TreeCache must be started before ClusteredSingleSignOn can handle requests

There are two simple web applications requiring authentication bound to contexts c1 and c1. They consist of the single jsp displaying current session id. In the log above one can observe the result of accessing /c1/tsr.jsp and /c2/tsr.jsp. The first request requires auhentication, the second doesn't, as the JSESSIONIDSSO cokie is set on the first response and sent along with the second request by the browser.

There *is* an MBean jboss.cache:service=TomcatClusteringCache which when invoking printDetails() on it returns standard session entries only (no SSO entries)

/JSESSION

/c2

/DwQHoSanZlEVYjCsRjQp4w**
VERSION: 1
DwQHoSanZlEVYjCsRjQp4w**: org.jboss.invocation.MarshalledValue@adccaac3

/c1

/qM-ENRL0Lg21qb4ka3n4VA**
VERSION: 1
qM-ENRL0Lg21qb4ka3n4VA**: org.jboss.invocation.MarshalledValue@ddfb3d0a

As a mater of fact I took several further steps to get the config running, e.g. I had to set the jvmRoute="node1|node2" and set the useJK attribute to "true". I also anabled java:/jaas/other security domain and made the c1 and c2 use this domain. Later on I copied the whole all config to all2 (changing the jvmRoute to node2 where appropriate), renamed all to all1 and launched them both.

run.bat -c all1 -b node1
run.bat -c all2 -b node2

, where node1 and node2 are resolved to 10.0.0.1 and 10.0.0.2 respectively.

As a result I observed replication of the JSESSION entries, SSO works across the applications on the same node but do not get replicated across the members of the cluster.

Btw, there is this nasty little bug in jgroups/jbosscache in 4.0.1sp1 that causes the following exception on each replicaction event (on the stdout of the node the sessions get replicated to). AFAIK it has been reported in JIRA and can be worked around by substituting jgroups.jar and jboss-cache.jar with more recent versions.

12:13:11,979 ERROR [RpcDispatcher] failed invoking method
java.lang.UnsupportedOperationException: commit() should not be called on TreeCache directly
 at org.jboss.cache.TreeCache.commit(TreeCache.java:2783)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:585)
 at org.jgroups.blocks.MethodCall.invoke(MethodCall.java:236)
 at org.jboss.cache.interceptors.CallInterceptor.invoke(CallInterceptor.java:37)
 at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:40)
 at org.jboss.cache.interceptors.LockInterceptor.invoke(LockInterceptor.java:126)
 at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:40)
 at org.jboss.cache.interceptors.CreateIfNotExistsInterceptor.invoke(CreateIfNotExistsInterceptor.java:47)
 at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:40)
 at org.jboss.cache.interceptors.ReplicationInterceptor.replicate(ReplicationInterceptor.java:210)
 at org.jboss.cache.TreeCache._replicate(TreeCache.java:2733)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:585)
 at org.jgroups.blocks.MethodCall.invoke(MethodCall.java:236)
 at org.jgroups.blocks.RpcDispatcher.handle(RpcDispatcher.java:220)
 at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:615)
 at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:512)
 at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:326)
 at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.handleUp(MessageDispatcher.java:722)
 at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.access$300(MessageDispatcher.java:554)
 at org.jgroups.blocks.MessageDispatcher$1.run(MessageDispatcher.java:691)
 at java.lang.Thread.run(Thread.java:595)

Whew, that was long, I hope this will be helpfull.
Brian, should you need more details, just let me know.

/dd

4. Re: ClusteredSingleSignOn - TreeCacheSSOClusterManager fails

drzewo Aug 1, 2005 9:00 AM (in response to drzewo)

I kept investigating the problem and found the difference betweeen the machine where Clustered SSO works for me and the one where it doesn't. Moreover, I found the root of described the problem!

Apparently some time ago I wanted to be able to reach jboss JVM via sun's jconsole, therefore I modified the startup script run.bat, and appended -Dcom.sun.management.jmxremote system property. You can observe this in the provided log. The addition of this parameter rendered Clustered SSO unusable. The question is: why the heck this happens! The quick&simple resolution is the removal of -Dcom.sun.management.jmxremote. Again, is there any sensible reason why this affects ONLY Clustered SSO??

Now, finally, back to the drawing board...

cheers,
/dd
Actions
5. Re: ClusteredSingleSignOn - TreeCacheSSOClusterManager fails

bstansberry Aug 1, 2005 2:04 PM (in response to drzewo)

I think there was a problem with how TreeCacheSSOClusterManager was finding the JMX server -- I'm expecting your command line switch caused another JMX server to start and TreeCacheSSOClusterManager was finding that (sorry -- at work now and can't confirm).

Last night I committed a patch to Branch_4_0 CVS that should fix this.

Earlier today I replied to the e-mail you sent me with the config files. Email included a zip with the latest version and compiled classes, which if you want you can integrate to see if it solves your problem.

Tonight I'll try using the -Dcom.sun.management.jmxremote switch and see if my patch resolves the problem.

BTW, thanks for the very clear report.
Actions
6. Re: ClusteredSingleSignOn - TreeCacheSSOClusterManager fails

starksm64 Aug 1, 2005 2:11 PM (in response to drzewo)

The -Dcom.sun.management.jmxremote does cause a second MBeanServer to be created so unless you are querying the for the jboss specific one you may be getting the jdk platform MBeanServer which will have no jboss mbeans registered.
Actions
7. Re: ClusteredSingleSignOn - TreeCacheSSOClusterManager fails

bstansberry Aug 2, 2005 1:05 AM (in response to drzewo)

I was able to reproduce the problem by setting the -Dcom.sun.management.jmxremote parameter, and confirmed that the patch I committed last night fixes the problem (by finding the JMX server by calling MBeanServerLocator.locateJBoss() instead of just locate()). So, this issue is fixed in CVS.
Actions
8. Re: ClusteredSingleSignOn - TreeCacheSSOClusterManager fails

drzewo Aug 5, 2005 10:38 AM (in response to drzewo)
Now, there is a continuation of my story. Here is what I want to achieve:

I want to run a cluster (say, hostnames node1 and node2) with apache2 + mod_jk loadballancer (hostname apache) with failover and loadballancing capabilities and SSO across multiple applications. There are two aplications, of which one is used much more often than the other, here is the detailed description I submitted to tomcat-dev:

Let's assume there are two applications /app1 and /app2, and there is a SSO
setup on them both. Now, user logs into the /app1 (which requires
authentication) and /app2 (which uses SSO Cookie, no authentication then),
and later on makes use of only one of them (say: /app1) for quite a long
time, so that this period outlives the session expiry date of the unused
application (/app2). Provided that both applications establish their own
sessions the one created in the scope of constantly used application
(/app1) wouldn't expire, while the second one definitely would.

Now the question: As both sessions are gathered into a higher-level SSO
session, would it be against the specification if *all* standard sessions
were aged (eg. by calling session.access()) on each request containing
valid SSO cookie? I suggest that there should be a flag on the SSO Valve,
that is org.apache.catalina.authenticator.SingleSignOn allowing users to
specify the behaviour.

You can find the original discussion here:
http://www.mail-archive.com/tomcat-dev@jakarta.apache.org/msg74023.html

To achieve this I added a simple aging logic to the ClusteredSingleSignOn. The logic behind this is to age (call session.access()) all of the sessions that are federated under the incoming SSO, provided that the requested session ID is among them. Here is how my invoke() method looks like now (I have stripped comments for brevity):

public void invoke(Request request, Response response, ValveContext context) throws IOException, ServletException { if (!(request instanceof HttpRequest) || !(response instanceof HttpResponse)) { context.invokeNext(request, response); return; } HttpServletRequest hreq = (HttpServletRequest) request.getRequest(); HttpServletResponse hres = (HttpServletResponse) response.getResponse(); request.removeNote(Constants.REQ_SSOID_NOTE); if (debug >= 1) log("Process request for '" + hreq.getRequestURI() + "'"); if (hreq.getUserPrincipal() != null) { if (debug >= 1) log(" Principal '" + hreq.getUserPrincipal().getName() + "' has already been authenticated"); context.invokeNext(request, response); return; } if (debug >= 1) log(" Checking for SSO cookie"); Cookie cookie = null; Cookie cookies[] = hreq.getCookies(); if (cookies == null) cookies = new Cookie[0]; for (int i = 0; i < cookies.length; i++) { if (Constants.SINGLE_SIGN_ON_COOKIE.equals(cookies.getName())) { cookie = cookies; break; } } if (cookie == null) { if (debug >= 1) log(" SSO cookie is not present"); context.invokeNext(request, response); return; } if (debug >= 1) log(" Checking for cached principal for " + cookie.getValue()); SingleSignOnEntry entry = getSingleSignOnEntry(cookie.getValue()); if (entry != null) { Principal ssoPrinc = entry.getPrincipal(); if (debug >= 1) { log(" Found cached principal '" + (ssoPrinc == null ? "NULL" : ssoPrinc.getName()) + "' with auth type '" + entry.getAuthType() + "'"); } request.setNote(Constants.REQ_SSOID_NOTE, cookie.getValue()); if (!getRequireReauthentication() && ssoPrinc != null) { ((HttpRequest) request).setAuthType(entry.getAuthType()); ((HttpRequest) request).setUserPrincipal(ssoPrinc); } if (keepaliveSessions) { Session[] sessions = entry.findSessions(); String reqSessId = hreq.getRequestedSessionId(); if (debug >= 90) { log("requested session id " + reqSessId); } boolean amongFlag = false; if (reqSessId != null) { for (int i = 0; i < sessions.length; i++) { Session session = sessions; if (session.getId().equals(reqSessId)) { amongFlag = true; if (debug >= 90) { log("SSO contains requested sessid"); } break; } } } if (amongFlag) { if (debug >= 1) { log("Keepalive " + sessions.length + " sessions"); } for (int i = 0; i < sessions.length; i++) { Session session = sessions; session.access(); if (debug >= 90) { log("Aged " + session.getId()); } } } } } else { if (debug >= 1) log(" No cached principal found, erasing SSO cookie"); cookie.setMaxAge(0); hres.addCookie(cookie); } // Invoke the next Valve in our pipeline context.invokeNext(request, response); }

Now what my setup looks like: there are two jbosses (config as described in earlier posts), each bound to one ip, and apache 2.0.54 + latest mod_jk with the following workers.properties - nuthin' unusual I suppose. LB works like a charm, but there are issues with replication

worker.node1.port=8009 worker.node1.host=node1 worker.node1.type=ajp13 worker.node1.socket_timeout=3 worker.node1.retries=3 worker.node1.socket_keepalive=120 worker.node1.recycle_timeout=240 worker.node1.lbfactor=1 worker.node1.cachesize=1 worker.node1.cache_timeout=240 worker.node2.port=8009 worker.node2.host=node2 worker.node2.type=ajp13 worker.node2.socket_timeout=3 worker.node2.retries=3 worker.node2.socket_keepalive=120 worker.node2.recycle_timeout=240 worker.node2.lbfactor=1 worker.node2.cachesize=1 worker.node2.cache_timeout=240 worker.loadbalancer.type=lb worker.loadbalancer.balance_workers=node1,node2 worker.loadbalancer.sticky_session=true worker.loadbalancer.sticky_session_force=false worker.status.type=status worker.list=loadbalancer,status

There are two applications /c1 and /c2. Here are the steps performed with the description of what happens:
1. /c1 is accessed for the first time, request is directed to node2
2. HTTP 401 is returned, user authenticates herself, another request is sent, new session is created, two Set-Cooke headers are sent in the response containing SSO session id an standard session id
3. So far, so good, using jmx-console I observe the appearance od SSO and JSESSIONID in the node2 TreeCache, the same applies to node1 cache, so the sessions (both standard and SSO) seem to replicate

/JSESSION /c1 /rPa-0to9ouutQSCVeosCQA** VERSION: 1 rPa-0to9ouutQSCVeosCQA**: org.jboss.invocation.MarshalledValue@7f29934 /SSO /87FFE0E5ACABC19049C5075B90364F2E /credentials key: org.jboss.web.tomcat.tc5.sso.TreeCacheSSOClusterManager$SSOCredentials@12833ff /sessions key: [rPa-0to9ouutQSCVeosCQA**.node2]

4. /c1 is accessed for the second time, request is paseed to node2 (as expected, we have a sticky session after all), my added logic kicks off and ages all the sessions under passed SSO, namely the one and only session there is, that is the session that was passed in along the second request. So far, everything is fine.
5. /c2 is accessed for the first. SSO cookie gets sent, but no session has been started yet. request gets passed to the node1 (we have a loadballancer after all). No Authentication required, as the appropriate SSO session is in place (in the replicated TreeCache on node1, that is)

/JSESSION /c2 /XAcHQMEFfm8cb1mFUWF9Aw** VERSION: 1 XAcHQMEFfm8cb1mFUWF9Aw**: org.jboss.invocation.MarshalledValue@f5f32b57 /KNRsop2ihZVcuouXQ46IGw** VERSION: 1 KNRsop2ihZVcuouXQ46IGw**: org.jboss.invocation.MarshalledValue@177084c6 /c1 /rPa-0to9ouutQSCVeosCQA** VERSION: 1 rPa-0to9ouutQSCVeosCQA**: org.jboss.invocation.MarshalledValue@7f29934 /SSO /87FFE0E5ACABC19049C5075B90364F2E /credentials key: org.jboss.web.tomcat.tc5.sso.TreeCacheSSOClusterManager$SSOCredentials@12833ff /sessions key: [rPa-0to9ouutQSCVeosCQA**.node2, KNRsop2ihZVcuouXQ46IGw**.node1, XAcHQMEFfm8cb1mFUWF9Aw**.node1]

6. another peek at the jmx-console on both nodes reveals that everything works as expected: the most recently created session gets replicated from node1 to node2, and SSO now is comprised of both sessions. Both nodes contain identical entries in the TreeCache.
7. another request to /c2 or /c2 reveals something strange - my aging logic should age all sessions in the particular SSO entry, but there seem to be only one session inserted in the SSO session (which is not consistent with what the jmx-console shows). when accessing /c1 only session originaly created on the first access to /c1 is aged. The same applies to /c2.

This behaviour doesn't seem to have much in common with my change, it must have already worked like that. My change just helped me reveal this.

Now, when one node goes down (I killed the proccess on node2) and I access the app wchich was originally served by node2, that is /c1, SSO automagically starts seeing all federated sessions and my aging algorithm starts working as expected, that is it has access to all sessions under the same SSO Session.

Is it designed this way to reduce the replication overhead?
Is it possible to achieve aging this way?
What do you thing about my proposal of aging federated sessions (for the sake of the consistency, as operations on one session, namely invalidate(), *do* affect the rest of the sessions in the scope of the same SSO session)?

cheers,
/dd
Actions
9. Re: ClusteredSingleSignOn - TreeCacheSSOClusterManager fails

bstansberry Aug 7, 2005 1:20 AM (in response to drzewo)

My initial thought on this is that what you're suggesting seems like a reasonable option, but having read the tomcat-dev thread, I have to agree that doing this expands the scope of the SingleSignOn valve from handling authentication issues to managing session lifecycle, which is a pretty big change. (I consider invalidating associated sessions when one session in the SSO is invalidated for reasons other than timeout to be authentication related, although I can understand your point of view :)

In any case, for maintainability reasons I want to keep the logic of ClusteredSingleSignOn the same as the Tomcat SingleSignOn valve on which it is based (except, of course, for communicating with the clustering layer). So, I wouldn't want to add this to ClusteredSingleSignOn unless the Tomcat folks were willing to add it to their version. (Besides Remy is the lead on JBoss/Tomcat integration, so I wouldn't add something he doesn't agree with).

All that being said, I'm more than happy to help you get a custom version working. I suspect your problem is that your "keepaliveSessions" block is being executed too early. A session is not actually bound to the SingleSignOnEntry until the associate() method is called. This happens later in the pipeline, when the request gets to the authenticate() method of one of the Tomcat authenticators (e.g. org.apache.catalina.authenticator.BasicAuthenticator).

If you move your "keepaliveSessions" block to ClusteredSingleSignOn.associate(), I believe this will fix your problem (haven't tried it myself).

BTW, needing to move this block to associate() slightly increases my concern about adding this option. Now we're doing somethat that affects session lifecycle in an obscure location, rather than in a call from the invoke() method.

Best regards,
Actions
10. Re: ClusteredSingleSignOn - TreeCacheSSOClusterManager fails

drzewo Aug 7, 2005 6:38 AM (in response to drzewo)

I'll try to convince Remy (that might be tough though and might take some time, AFAIK Remy is off for vacation till August 16th) to enhance the SingleSignOn valve with optional logic, as a the described by me use case, I believe is a common scenario for those who use SSO. I'd be more than happy if anyone for whom my arguments seem reasonable could support my point of view. Remy pointed out, that the applications should be as isolated as possible while the SSO Valve in it current implementaion denied this. To be precise, to me, it would be much more appropriate and in the spirit of Remy's point of view if the logout action (session.invalidate() on one of the federeated sessions) removed an entry in the SSO cache only, leaving non-invalidated sessions untouched. Now, if another request arrived with the SSO id, the valve would do nothing as there's no such entry in the SSO cache, therefore user would heave to reauthenticate. Of course I might miss something.

I'll dig into Authenicator stuff and try your suggestion.

cheers,
/dd
Actions
11. Re: ClusteredSingleSignOn - TreeCacheSSOClusterManager fails

bstansberry Aug 8, 2005 1:04 AM (in response to drzewo)

I've thought about your problem more and realize my suggestion will not be reliable with the clustered version of SSO.

As you've seen, there are two areas where ClusteredSingleSignOn maintains information about sessions associated with a sign-on. The first is the set of session id's maintained in the SSO's node in the TreeCache. This is used to provide global information to all nodes in the cluster about the complete set of sessions associated with the SSO. If a node removes the last entry from the set (due to session timeout), it invalidates the SSO. Really this set is just used for this purpose. The SSO node in the TreeCache only maintains session id's, not the sessions themselves because 1) the session id is all that's needed; the session is too heavy for the purpose and 2) session replication is a complex business, far beyond the scope of the SSO valve.

Each node that has been accessed by a user also maintains a SingleSignOnEntry cached in the valve itself. This object maintains an array of sessions that have been accessed on that node. This array is used so the node can invalidate any sessions active on that node if a logout occurs. If other sessions exist on other nodes, it is the responsibility of those other nodes to invalidate those sessions.

So, within the context of ClusteredSingleSignOn, on an individual node there is no place where all sessions associated with the SSO are guaranteed to be accessible. If all the apps that use SSO are distributable, then all the session data will be in the tree cache, but to access the sessions you would need to be able to find the Manager instance associated with each app.
Actions
12. Re: ClusteredSingleSignOn - TreeCacheSSOClusterManager fails

bstansberry Aug 8, 2005 1:47 AM (in response to drzewo)

"drzewo" wrote:
I'll try to convince Remy (that might be tough though and might take some time, AFAIK Remy is off for vacation till August 16th) to enhance the SingleSignOn valve with optional logic, as a the described by me use case, I believe is a common scenario for those who use SSO.

I'm curious about your use case (always like to know the different ways things are used). Typically when I've written webapps, if a user has not accessed the app for a while, I want their session to invalidate so I can free resources. Obviously I don't want the timeout of one session to affect others, but the SSO valve is designed to prevent that. I can imagine scenarios where keeping all sessions alive would be useful, but would be interested to know a real one.

"drzewo" wrote:
To be precise, to me, it would be much more appropriate and in the spirit of Remy's point of view if the logout action (session.invalidate() on one of the federeated sessions) removed an entry in the SSO cache only, leaving non-invalidated sessions untouched. Now, if another request arrived with the SSO id, the valve would do nothing as there's no such entry in the SSO cache, therefore user would heave to reauthenticate. Of course I might miss something.

If the user accessed an app for which the session was not invalidated, they would not have to authenticate (and so you'd have single sign on but not single sign out). The Authenticator for each web app caches the credentials for the user in the session itself and uses them to reauthenticate users with valid sessions. The SSO valve caches credentials presented to one web app and makes them available to other webapps, but its support is not needed once the user has authenticated to an app.

Best regards,
Actions

Go to original post