3 Replies Latest reply on Nov 8, 2013 8:03 AM by jfclere

How to analyse sticky-session problem

a_schulle Nov 7, 2013 9:52 AM

Hi,

We are using Linux systems with Apache 2.2, mod_cluster 1.2.1 and three JBoss AS 7.1 servers.

Sometimes, I think when the load is a little higher, sticky sessions are getting lost. On the JBoss side I can see that, for example, a request belonging to a session that is located on JBoss A is forwarded to JBoss B. That only happens occasionally (less than 1% of the requests). I'm not even sure whether it depends on the system load.

We only enabled sticky-session and left all other parameters like sticky-session-force, sticky-session-remove, worker-timeout, max-attempts, node-timeout and so on at their defaults.

Now the question is what I can do to analyse this problem. When mod_cluster is not able to route a sticky session request to the right node, does it log the reason? If so, how can I enable this logging?

Thanks in advance.

Alex

1. Re: How to analyse sticky-session problem

mbabacek Nov 7, 2013 10:24 AM (in response to a_schulle)
Dear Alexander,
first and foremost: please, do update to mod_cluster 1.2.6. It's a good, stable version, much better than 1.2.1.

Regarding the sticky-session question
What does it mean that "sticky sessions are getting lost"? Do you mean they lose their stickiness, or are they actually getting lost (the client's session data is gone)? If the former, it is O.K. under circumstances I will explain below; if the latter, it might be a serious bug.

Sticky sessions
There are these settings:
sticky-session: if true, the balancer will try to keep the session on the worker it originated from. If that worker is unresponsive/overloaded/down and sticky-session-force is false, the balancer will route the request to a different worker.
sticky-session-remove: should the balancer remove the stickiness to the former worker after a failover has occurred (true), or should the balancer return the session to the former node when it becomes available again (false)?
sticky-session-force: if true, there is no failover under any circumstances. The session sticks to the worker it originated from, whether that worker is available or not.
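
To make the interplay of these three settings concrete, here is a minimal Python sketch of the decision as described above. It is a simplified model for illustration only, not mod_cluster's actual code: the function names are hypothetical, and `min()` merely stands in for the real load-balancing choice.

```python
from typing import Optional, Set


def choose_worker(session_worker: Optional[str],
                  available: Set[str],
                  sticky_session: bool = True,
                  sticky_session_force: bool = False) -> Optional[str]:
    """Return the worker that gets the request, or None if the request fails."""
    if sticky_session and session_worker is not None:
        if session_worker in available:
            return session_worker      # normal sticky case
        if sticky_session_force:
            return None                # failover is forbidden, so the request fails
    # No stickiness applies, or failover is allowed: pick any available worker.
    # min() is only a deterministic placeholder for the balancer's choice.
    return min(available) if available else None


def route_after_failover(former_worker: str,
                         new_worker: str,
                         sticky_session_remove: bool = False) -> str:
    """Return the worker the session is bound to once a failover has happened."""
    # True: stickiness to the former worker is dropped.
    # False: the session goes back to the former worker when it is available again.
    return new_worker if sticky_session_remove else former_worker
```

With the defaults assumed here (sticky-session on, sticky-session-force off), a request for a session on worker "A" goes to another worker whenever "A" is not in the available set, which matches the symptom described in the question.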

Now, was there any reason why the balancer decided to address a different worker?
The Apache HTTP Server log at LogLevel debug will tell you more...

HTH
2. Re: How to analyse sticky-session problem

a_schulle Nov 7, 2013 10:49 AM (in response to mbabacek)

Dear Michael,

thanks for the quick answer!

I just received another error log file from the provider that operates the Apache server. It contains these entries several times a day:
...
[Wed Oct 16 17:16:28 2013] [error] (70007)The timeout specified has expired: ajp_ilink_receive() can't receive header
[Wed Oct 16 17:16:34 2013] [error] proxy: ajp: disabled connection for (xxxxxx)
...

I just learned from the discussion "Worker in error state after a request timeout" that when a request takes longer than nodeTimeout, the node is disabled until the next STATUS message arrives from the node saying that it is OK. This should happen every 30 seconds.
While the node is disabled, all requests for sessions on that node are forwarded directly to another node.
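
One way to check this theory is to pull the timeout and "disabled connection" events out of the Apache error log and compare their timestamps with the misrouted requests seen on the JBoss side. The following Python sketch assumes the Apache 2.2 error log format shown in the excerpt above. The function names are hypothetical, and the 30-second window is only the STATUS interval mentioned in this post, not a verified value.

```python
import re
from collections import Counter
from datetime import datetime

# Matches lines like: [Wed Oct 16 17:16:28 2013] [error] message...
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[(?P<level>\w+)\] (?P<msg>.*)$")


def classify(msg):
    """Map a log message to an event kind, or None if it is not of interest."""
    if "ajp_ilink_receive() can't receive header" in msg:
        return "ajp_timeout"
    if "proxy: ajp: disabled connection" in msg:
        return "connection_disabled"
    return None


def scan_error_log(lines):
    """Return a list of (timestamp, kind) tuples for the relevant log lines."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        kind = classify(match.group("msg"))
        if kind is None:
            continue
        ts = datetime.strptime(match.group("ts"), "%a %b %d %H:%M:%S %Y")
        events.append((ts, kind))
    return events


def near_disable_event(request_time, events, window_seconds=30):
    """True if request_time falls within window_seconds after a disable event."""
    return any(
        kind == "connection_disabled"
        and 0 <= (request_time - ts).total_seconds() <= window_seconds
        for ts, kind in events
    )


sample = [
    "[Wed Oct 16 17:16:28 2013] [error] (70007)The timeout specified has "
    "expired: ajp_ilink_receive() can't receive header",
    "[Wed Oct 16 17:16:34 2013] [error] proxy: ajp: disabled connection for (xxxxxx)",
]
events = scan_error_log(sample)
counts = Counter(kind for _, kind in events)
```

For a real log, `scan_error_log(open(path))` would replace the sample list. If most misrouted requests fall inside such a window, the node timeout is the likely cause.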

Is this right so far? The other discussion is about mod_cluster 1.0.0 so I'm not sure about that.

From time to time we have long-running requests which exceed the node timeout. That's not good, but that's the way it is. What would be the right solution? Using a larger nodeTimeout, or is there another way to tell mod_cluster not to disable the node after such a timeout?

Thx
Alex
3. Re: How to analyse sticky-session problem

jfclere Nov 8, 2013 8:03 AM (in response to a_schulle)

You need to use a large enough node-timeout.