I'm running PojoCache 2.0.0 CR3 in a standalone two-node active/hot-standby cluster, and I'm running into problems when bringing up the second node (the hot standby). I'm using the replySync-service.xml configuration from the example code, with TCP instead of multicast UDP.
The active node updates the cache frequently (about once per second). When the hot-standby node comes online, the active node almost always throws a TimeoutException during the state transfer, because it keeps updating the cache while the transfer is in progress. The interleaving of the cache update and the state transfer appears to cause a deadlock, which results in the TimeoutException. The replicated state is fairly small (a Map<String, String> with fewer than 100 entries).
As a workaround, I registered a custom @CacheListener class that listens for ViewChanged events and stops the active node from writing to the PojoCache for the next five seconds. With that in place, the problem seems to have gone away.
Is there a better way to do this? Is there any way to hook into the underlying replication code to prevent cache operations during replication? Are there any configuration settings that can be used to prevent cache operations during replication?
You should not have to prevent cache operations at the application level. Internally, the cache is locked for the duration of the state transfer, so cache operations issued during the transfer should simply block until it finishes.
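You could try increasing your LockAcquisitionTimeout to a value greater than your StateRetrievalTimeout. That way, a write that is blocked by the state transfer should be able to wait for the transfer to finish instead of timing out first.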
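As a rough sketch of that suggestion, the two timeouts might look like this in the service XML. This assumes the `<attribute name="...">` element form used by the example configuration files and that both values are in milliseconds. The numbers are illustrative only, not recommended settings.

```xml
<!-- Fragment of the cache MBean configuration; values are illustrative (assumed milliseconds) -->

<!-- Maximum time to wait for the state transfer to complete -->
<attribute name="StateRetrievalTimeout">20000</attribute>

<!-- Set higher than StateRetrievalTimeout, so that a write blocked by the
     state transfer can outlast it instead of timing out first -->
<attribute name="LockAcquisitionTimeout">30000</attribute>
```

If the timeouts still fire with these settings, the relationship between the two values is the thing to check first, rather than the absolute numbers.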