Intermittent CLI failure due to connection timeout
smpatanka Mar 11, 2018 10:02 AM
Problem Statement:
A connection timeout exception occurs sporadically when executing JBoss CLI commands or reloading from the CLI. It occurs only on an HA setup with 3 nodes; it is not seen on a single-node installation.
WildFly version: 10.1
Description:
Connection timeout is observed in 2 scenarios:
- CLI command execution
- Reload from CLI
Our setup uses a domain mode deployment. Please find attached domain.xml and host.xml.
- Reload fails quite frequently in the HA setup (3 nodes) with a connection timeout exception.
Command : ({{ jboss_home }}/bin/jboss-cli.sh --connect commands='reload --host={{ ansible_hostname }}')
Error: See attached exception_in_server_log.txt
- JBoss CLI command execution fails with the same error. However, no exception or error is logged in server.log in this case.
Error: See attached ansible_task_execution_log.txt
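Since both failures are intermittent, one possible stopgap while the root cause is investigated is to retry the CLI call. The following is only a minimal sketch, not a fix: `retry_cmd` is a hypothetical helper, the attempt count and delay are arbitrary, and the example invocation assumes `JBOSS_HOME` is set and uses the documented `--commands=` option spelling.

```shell
#!/bin/sh
# retry_cmd MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry_cmd() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        # Stop as soon as the command exits with status 0.
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example invocation (assumed values, adjust for your environment):
# retry_cmd 3 10 "$JBOSS_HOME/bin/jboss-cli.sh" --connect \
#     --commands="reload --host=$(hostname)"
```

A retry only hides the timeout; it does not explain why the connection attempt stalls in the first place.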
TCP Stream:
I am unable to attach the tcpdump because the file is larger than 15 MB. I can upload it to an FTP server if required.
Observations:
The tcpdump shows several TCP frames whose timestamps jump back to 1970. Every one of them is a (SYN,ACK) frame; no other frame type is affected.
The same pattern appears in a couple of packet captures collected for the timeout exception encountered during the reload operation.
The issue is observed only on HA setups (3 nodes).
Had this been an NTP sync issue, it should have affected other frames as well. Why only the (SYN,ACK) frames?
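For anyone who wants to check a capture for the same pattern, here is a rough sketch. `flag_old_frames` is a hypothetical helper, `capture.pcap` is a placeholder file name, and the default cutoff (1 Jan 1971 UTC) is an arbitrary choice. It assumes tshark is installed and prints `frame.time_epoch` as decimal seconds.

```shell
#!/bin/sh
# flag_old_frames [CUTOFF_EPOCH]
# Hypothetical helper: reads "frame_number epoch_seconds" lines on stdin and
# prints only those whose timestamp is earlier than CUTOFF_EPOCH.
# The default cutoff is 31536000, i.e. 1 Jan 1971 00:00 UTC.
flag_old_frames() {
    cutoff=${1:-31536000}
    awk -v cutoff="$cutoff" \
        'NF >= 2 && ($2 + 0) < (cutoff + 0) { print $1, $2 }'
}

# Example pipeline (capture.pcap is a placeholder):
# list every (SYN,ACK) frame with its epoch timestamp, then keep the 1970 ones.
# tshark -r capture.pcap \
#     -Y 'tcp.flags.syn == 1 && tcp.flags.ack == 1' \
#     -T fields -e frame.number -e frame.time_epoch | flag_old_frames
```

Running the same pipeline without the SYN/ACK display filter would show whether any other frame type carries a 1970 timestamp.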
Possible cause:
- Could this be an issue with the XNIO library or the JBoss CLI?
- The issue reported in the below ticket looks similar to the issue we are facing.
- Is this a known bug in WildFly 10.1?
Regards,
Smitha
Attachments:
- ansible_task_execution_log.txt.zip (780 bytes)
- domain.xml.zip (7.5 KB)
- host.xml.zip (1.7 KB)