5 Replies Latest reply on Dec 16, 2011 12:19 PM by rhusar

Too many sending are-you-alive msg

cadmus Dec 12, 2011 11:00 AM

Hi folks! I have an issue.

The application runs on JBoss AS 5.1.0.

There are two physical servers with 4 nodes on each.

When the server is up, the logs show that too many "are-you-alive" messages are being sent.

As I understand it, this is wrong.

I have attached an archive with the logs from all cluster nodes.

Log fragment from node 1:

12.12 17:19:17,113 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:52819 (own address=10.44.0.177:48938)

12.12 17:19:17,114 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:52819 (own address=10.44.0.177:48938)

12.12 17:19:17,113 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:52044 (own address=10.44.0.177:48938)

12.12 17:19:17,125 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:7901 (own address=10.44.0.177:7900)

12.12 17:19:27,116 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:52819 (own address=10.44.0.177:48938)

12.12 17:19:27,117 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:52044 (own address=10.44.0.177:48938)

12.12 17:19:27,116 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:52819 (own address=10.44.0.177:48938)

12.12 17:19:27,128 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:7901 (own address=10.44.0.177:7900)

12.12 17:19:38,078 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:7901 (own address=10.44.0.177:7900)

12.12 17:19:38,079 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:52819 (own address=10.44.0.177:48938)

12.12 17:19:38,079 DEBUG org.jgroups.protocols.FD$Monitor heartbeat missing from 10.44.0.177:7901 (number=0)

12.12 17:19:38,079 DEBUG org.jgroups.protocols.FD$Monitor heartbeat missing from 10.44.0.177:52819 (number=0)

12.12 17:19:38,079 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:52044 (own address=10.44.0.177:48938)

12.12 17:19:38,080 DEBUG org.jgroups.protocols.FD$Monitor heartbeat missing from 10.44.0.177:52044 (number=0)

12.12 17:19:38,080 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:52819 (own address=10.44.0.177:48938)

12.12 17:19:38,080 DEBUG org.jgroups.protocols.FD$Monitor heartbeat missing from 10.44.0.177:52819 (number=0)

12.12 17:19:48,082 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:52819 (own address=10.44.0.177:48938)

12.12 17:19:48,082 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:52044 (own address=10.44.0.177:48938)

12.12 17:19:48,082 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:52819 (own address=10.44.0.177:48938)

12.12 17:19:48,083 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:7901 (own address=10.44.0.177:7900)

12.12 17:19:58,084 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:52819 (own address=10.44.0.177:48938)

12.12 17:19:58,084 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:52044 (own address=10.44.0.177:48938)

12.12 17:19:58,086 DEBUG org.jgroups.protocols.FD$Monitor sending are-you-alive msg to 10.44.0.177:7901 (own address=10.44.0.177:7900)

Thanks!

Cluster_log.rar.zip 174.5 KB

1. Re: Too many sending are-you-alive msg

rhusar Dec 14, 2011 7:59 PM (in response to cadmus)

LOL, with debug logging turned on, expect a lot of everything :-)

The "FD" you are seeing is failure detection. It works by sending and receiving are-you-alive messages.

Nothing to worry about here.

HTH
Rado
2. Re: Too many sending are-you-alive msg

cadmus Dec 15, 2011 2:50 AM (in response to rhusar)

Hi Radoslav!
I asked for a reason. The official guide says:
"Regular traffic from a node counts as if it is a heartbeat response. So, the are-you-alive messages are only sent when there is no regular traffic to the node for some time."

So I concluded that this might be incorrect behavior.
3. Re: Too many sending are-you-alive msg

rhusar Dec 15, 2011 4:43 AM (in response to cadmus)

Hi Maxim,

okay, I like your approach -- don't trust anything ;-) In that case it would be best to test it yourself.

What the docs say makes perfect sense. If you are already receiving other messages from a node, there is no reason to send additional heartbeat messages, because you know that the node is okay. To test this, deploy an app that communicates with all other nodes on every request, for example a distributable web app that modifies the session each time it is accessed and replicates to all members. Then access it at short intervals and you should see few or no are-you-alive messages.
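The idea that regular traffic suppresses heartbeats can be sketched in a few lines of Java. This is only an illustration and not the JGroups FD code: the class HeartbeatSketch, its methods and the 10-second timeout are all made up for the example (the timeout merely matches the spacing of the log entries above).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of traffic-suppressed heartbeats; NOT the JGroups FD implementation.
class HeartbeatSketch {
    private final long timeoutMs;                         // idle time after which the neighbor is probed
    private long lastHeardMs;                             // last time any message arrived from the neighbor
    private final List<Long> probes = new ArrayList<>();  // times at which a probe was sent

    HeartbeatSketch(long timeoutMs, long startMs) {
        this.timeoutMs = timeoutMs;
        this.lastHeardMs = startMs;
    }

    // Any message from the neighbor (e.g. session replication) counts as proof of life.
    void onMessageReceived(long nowMs) {
        lastHeardMs = nowMs;
    }

    // Called once per timeout interval; returns true if an are-you-alive probe was sent.
    boolean tick(long nowMs) {
        if (nowMs - lastHeardMs < timeoutMs) {
            return false; // recent traffic seen, no probe needed
        }
        probes.add(nowMs);
        return true;
    }

    int probesSent() {
        return probes.size();
    }

    public static void main(String[] args) {
        HeartbeatSketch idle = new HeartbeatSketch(10_000, 0);
        HeartbeatSketch busy = new HeartbeatSketch(10_000, 0);
        for (long now = 10_000; now <= 50_000; now += 10_000) {
            busy.onMessageReceived(now - 1_000); // traffic arrived 1 s before each tick
            idle.tick(now);
            busy.tick(now);
        }
        System.out.println("idle link probes: " + idle.probesSent());
        System.out.println("busy link probes: " + busy.probesSent());
    }
}
```

In this sketch the idle link is probed on every tick, while the link that carries traffic between ticks is never probed. That is the pattern to look for in the logs when running the test described above.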

Rado
4. Re: Too many sending are-you-alive msg

cadmus Dec 15, 2011 7:40 AM (in response to rhusar)

I do not quite understand what you mean.
But if you meant that these messages are sent because the application is not actively used,
that is not the case: the application is actively in use.
5. Re: Too many sending are-you-alive msg

rhusar Dec 16, 2011 12:19 PM (in response to cadmus)

"Actively in use" does not necessarily mean that messages have been exchanged between the servers. If you are just reading the session, there is nothing to replicate. Thus it's necessary to send are-you-alive messages.

Also note that the intervals need to be quite short: if something goes wrong, the cluster needs to react to it ASAP, while still tolerating some network instability.
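To put rough numbers on that trade-off, here is a small back-of-the-envelope sketch. It assumes that a member is suspected after roughly one timeout interval per unanswered are-you-alive message (FD's timeout and max_tries settings). The 10-second timeout is read off the spacing of the log entries above, and the max_tries value of 5 is an assumption for illustration, not taken from the actual configuration.

```java
// Hypothetical arithmetic sketch; the values are assumptions, not read from the poster's config.
class DetectionTimeSketch {
    // Approximate time until an unresponsive member is suspected:
    // one timeout interval per unanswered are-you-alive message.
    static long approxDetectionMs(long timeoutMs, int maxTries) {
        return timeoutMs * maxTries;
    }

    public static void main(String[] args) {
        // 10 s matches the log spacing above; maxTries = 5 is an assumed value.
        System.out.println(approxDetectionMs(10_000, 5) + " ms with a 10 s interval");
        // A shorter interval reacts faster but sends more probes on idle links.
        System.out.println(approxDetectionMs(3_000, 5) + " ms with a 3 s interval");
    }
}
```

A shorter interval means faster reaction but more are-you-alive lines in a debug log, and the number of tries is what gives tolerance for a flaky network.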

HTH,
Rado