
    Scaling Infinispan Server 8.2.2 with Remote Event Listeners

    whethsmith

      Greetings community,

       

      We currently have a 5-node Infinispan server cluster running in production, and it is able to handle up to 2,000 requests per second.  Our goal is to scale the cache linearly, up to 10x or more.  In our stress tests, our app servers start getting SocketTimeoutExceptions from Infinispan after 30 minutes under the planned future load.
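
      For context, since the SocketTimeoutExceptions show up on the Hot Rod client side, here is roughly how our client is built (a simplified sketch; the host name, port, and timeout values are placeholders, not our exact settings):

          import org.infinispan.client.hotrod.RemoteCacheManager;
          import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;

          public class ClientSetup {
              public static RemoteCacheManager build() {
                  ConfigurationBuilder builder = new ConfigurationBuilder();
                  builder.addServer().host("ispn-node1").port(11222);  // placeholder server
                  builder.socketTimeout(60000);      // ms; reads beyond this surface as SocketTimeoutException
                  builder.connectionTimeout(60000);  // ms
                  return new RemoteCacheManager(builder.build());
              }
          }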

      Our most active cache runs in distributed mode with 2 owners and 20 segments.  One of the bottlenecks appears to be our pub/sub system built on remote event listeners: whenever a cache entry is modified, our remote event listeners are notified and in turn complete long-polling requests.  The listeners run inside Java servlets on Tomcat; a simplified sketch of the listener follows below.
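
      The listener itself is roughly the following (simplified; GameStateListener and completeLongPolls are placeholder names, and the servlet async plumbing is omitted):

          import org.infinispan.client.hotrod.RemoteCache;
          import org.infinispan.client.hotrod.annotation.ClientCacheEntryModified;
          import org.infinispan.client.hotrod.annotation.ClientListener;
          import org.infinispan.client.hotrod.event.ClientCacheEntryModifiedEvent;

          @ClientListener
          public class GameStateListener {

              // Fired on the client whenever an entry is modified on the server
              @ClientCacheEntryModified
              public void onModified(ClientCacheEntryModifiedEvent<String> event) {
                  completeLongPolls(event.getKey());  // wake up long-poll requests parked on this key
              }

              private void completeLongPolls(String key) {
                  // completes the waiting servlet AsyncContexts (omitted)
              }
          }

          // Registered once against the cache:
          //   RemoteCache<String, byte[]> cache = cacheManager.getCache("gameStateCache");
          //   cache.addClientListener(new GameStateListener());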

       

      About a month ago, we ran into the issue found here:  Show stopper: Infinispan hot rod server gets stuck / dead lock in high load with registered client listener in hot rod client - infinispan-server-8.2.1.Final

      and after patching the Infinispan server with an increased event queue size (now at 1 million), we were able to scale up quite a bit further, just not as far as we'd like.


      In terms of actual errors on the server, we see things like:

      ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (pool-6-thread-1) ISPN000136: Error executing command RemoveExpiredCommand, writing keys [[B0x033e183537626430..[27]]: org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 30 seconds for key [B0x033e183537626430..[27] and requestor CommandUUID{address=XYZ, id=851936}. Lock is held by CommandUUID{address=XYZ, id=851819}

       

      And below is the cache config:

       

            <subsystem xmlns="urn:infinispan:server:core:8.2" default-cache-container="clustered">
                <cache-container name="clustered" default-cache="default" statistics="true">
                    <transport lock-timeout="60000"/>
                    <distributed-cache name="default" mode="SYNC" segments="20" owners="2" remote-timeout="30000" start="EAGER">
                        <locking acquire-timeout="30000" concurrency-level="10000" striping="false"/>
                        <transaction mode="NONE"/>
                        <expiration lifespan="86400000" max-idle="900000" interval="60000"/>
                        <eviction strategy="LIRS" size="1000000"/>
                    </distributed-cache>
                    <distributed-cache name="gameStateCache" mode="SYNC" remote-timeout="30000" start="EAGER">
                        <locking acquire-timeout="30000" concurrency-level="10000" striping="false"/>
                        <expiration lifespan="86400000" max-idle="900000" interval="60000"/>
                        <eviction strategy="LIRS" size="1000000"/>
                    </distributed-cache>
                </cache-container>
            </subsystem>


      Is anyone aware of high-load issues with the remote event listener system, or can anyone suggest an alternative configuration?

       

      Cheers,

      Brian