Replication between nodes is usually "immediate", but occasionally it takes 70 ms, and we have seen it take up to 500 ms.
The behavior we expect is that when an EJB request to server 111 completes, all of the cache data that was changed has been replicated to server 222 before the client EJB request returns.
We have a monitoring application that constantly checks the JBoss instances.
So here is the process:
||Step||Server 111||Server 222||
|1|Monitor invokes EJB to set a NEW cache value for key ABC123.| |
|2|Monitor invokes EJB to read the value for key ABC123 and verifies it is the same value it set.| |
|3| |Monitor invokes EJB to read the value for key ABC123 and verifies it is the same value that was sent to server 111.|
|4| |If the value is not the same, retry for a maximum of 5 seconds or until it gets the correct value.|
|5|Monitor invokes EJB to remove key ABC123.| |
|6|Monitor invokes EJB to make sure key ABC123 has been removed.| |
|7| |Monitor invokes EJB to make sure key ABC123 has been removed from this server as well.|
|8| |If the value still exists, retry for a maximum of 5 seconds or until key ABC123 is found to have been removed.|
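To make the retry logic in steps 4 and 8 concrete, here is a rough sketch of the kind of polling check the monitor performs. It is not our actual monitor code: the class name `ReplicationProbe`, the method `awaitValue`, and the poll interval parameter are made up for illustration, and the `Supplier` stands in for the EJB call that reads key ABC123 from server 222.

```java
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical helper, for illustration only (not part of JBoss or our monitor).
final class ReplicationProbe {

    /**
     * Polls read until it returns expected, or until maxWaitMillis has elapsed.
     * Pass expected == null to wait for a removal (the read returns null).
     * Returns the observed delay in milliseconds, or -1 on timeout or interruption.
     */
    static long awaitValue(Supplier<String> read, String expected,
                           long maxWaitMillis, long pollMillis) {
        final long start = System.nanoTime();
        final long maxWaitNanos = maxWaitMillis * 1_000_000L;
        while (true) {
            // In the real monitor this would be the EJB read against server 222.
            String actual = read.get();
            if (Objects.equals(expected, actual)) {
                return (System.nanoTime() - start) / 1_000_000L;
            }
            if (System.nanoTime() - start >= maxWaitNanos) {
                return -1; // value never showed up within the allowed window
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return -1;
            }
        }
    }
}
```

For step 4 the call would look like `ReplicationProbe.awaitValue(() -> readFromServer222("ABC123"), newValue, 5000, 10)`, where `readFromServer222` is again a placeholder for the EJB invocation. For step 8 the expected value is `null`. The returned delay is what we see as roughly 0 ms most of the time, but sometimes 70 ms or more.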
Most of the time this works correctly.
However, we see delays at steps 3-4 and 7-8, where the value is not replicated until some time afterwards.
Are there any settings that would guarantee that the cache data has been replicated to the other nodes (barring an error) before the client transaction completes and control returns to the client?
I have attached an example of our cache configuration.
We have many caches; they all use different cluster names and port numbers.
examplecache-service.xml 1.9 KB