1 Reply Latest reply on Sep 2, 2019 10:06 AM by galder.zamarreno

    Please Help on Infinispan Cluster Timeout/Not Responding Issue Which Viewed As Unavailable by Keycloak

    mingjliu

      Questions:

      In my configuration, I didn't see any 15000ms setup for Infinispan or keycloak, but from the HotRod access logs it seems a lot error on 150001ms timeout from server infinispan-0 to infinispan-2 and also from server infinispan-2 to infinispan-0

       

      How to trouble shoot issues like Infinispan server not responding cases? Any tools/logs to look at or any metrics that can help analyze what the cluster currently doing?

       

      Anyone have any experience on how to handle the issue of Infinispan not responding?

       

      Thanks in advance!

       

      Environment

      3 infinipspan servers in OpenShift StatefulSet:  infinispan-0,infinispan-1,infinispan-2

      Infinispan version: 9.1.4

      Keycloak version: 3.4.3.Final

      Both running in kubernetes based on Openshift container environment.

       

      Background

      The existing configuration is working well for a long time, but after a deployment of keycloak several times today I start to see Infinispan timeout, which is causing keycloak not able to run into ready status.

       

      Symptoms

      The keycloak could successfully get restarted, but immediately the server start failing with warn logs then error logs of timeout from Infinispan servers

       

      Keycloak - the client of Infinispan cluster which maintaining an embeded infinispan cluster replication mode seeing following issues.

      10:11:10,770 WARN  [org.infinispan.client.hotrod.impl.protocol.Codec21] (default task-12) ISPN004005: Error received from the server: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 14876334 from infinispan-0,infinispan-2
      2019-07-25 10:11:10,771 DEBG 'Keycloak' stdout output:

      10:11:10,770 WARN  [org.infinispan.client.hotrod.impl.protocol.Codec21] (default task-12) ISPN004005: Error received from the server: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 14876334 from infinispan-0,infinispan-2

      10:11:10,771 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (default task-12)

      ISPN000136: Error executing command RemoveCommand, writing keys [d8823b1c-e1e2-436b-9d53-dc3bd55f2308]:

      org.infinispan.client.hotrod.exceptions.HotRodClientException:Request for messageId=33
      returned server error (status=0x86): org.infinispan.util.concurrent.TimeoutException:

      ISPN000476: Timed out waiting for responses for request 14876334 from infinispan-0,infinispan-2
        at org.infinispan.client.hotrod.impl.protocol.Codec20.checkForErrorsInResponseStatus(Codec20.java:350)
        at org.infinispan.client.hotrod.impl.protocol.Codec20.readPartialHeader(Codec20.java:139)
        at org.infinispan.client.hotrod.impl.protocol.Codec20.readHeader(Codec20.java:125)
        at org.infinispan.client.hotrod.impl.operations.HotRodOperation.readHeaderAndValidate(HotRodOperation.java:56

       

      Infinispan HotRod Access log

      could not find any clue from HotRod logs on pod infinispan-0

       

      (HotRod-ServerWorker-3-30) /10.0.41.20:56342 [2019-07-25T10:47:22.752] "PING actionTokens" "OK" - 24 1125 1 ms                                                                                   

      (HotRod-ServerWorker-3-24) /10.0.41.20:56314 [2019-07-25T10:47:22.752] "PING offlineSessions" "OK" - 27 1125 1 ms                                                                                

      (HotRod-ServerWorker-3-28) /10.0.41.20:56338 [2019-07-25T10:47:22.752] "PING clientSessions" "OK" - 26 1125 1 ms                                                                                 

      (HotRod-ServerWorker-3-26) /10.0.41.20:56326 [2019-07-25T10:47:22.838] "GET_WITH_METADATA authenticationSessions" "OK" [B0x033E243861393563..[39] 71 4198 0 ms                                   

      (HotRod-ServerWorker-3-29) /10.0.41.20:56340 [2019-07-25T10:47:29.987] "PING ___script_cache" "OK" - 27 1125 1 ms                                                                                

      (HotRod-ServerWorker-3-29) /10.0.41.20:56340 [2019-07-25T10:47:29.990] "CONTAINS_KEY ___script_cache" "OK" [B0x033E106C6F61642D..[19] 44 5 2 ms                                                  

      (HotRod-ServerWorker-3-39) /10.0.35.14:55858 [2019-07-25T10:48:27.904] "PING offlineClientSessions" "OK" - 33 1125 1 ms                                                                          

      (HotRod-ServerWorker-3-34) /10.0.35.14:55836 [2019-07-25T10:48:27.904] "PING offlineSessions" "OK" - 27 1125 1 ms                                                                                

      (HotRod-ServerWorker-3-35) /10.0.35.14:55840 [2019-07-25T10:48:27.904] "PING work" "OK" - 16 1125 1 ms                                                                                           

      (HotRod-ServerWorker-3-37) /10.0.35.14:55850 [2019-07-25T10:48:27.904] "PING authenticationSessions" "OK" - 34 1125 1 ms                                                                         

      (HotRod-ServerWorker-3-36) /10.0.35.14:55844 [2019-07-25T10:48:27.905] "PING actionTokens" "OK" - 24 1125 0 ms                                                                                   

      (HotRod-ServerWorker-3-40) /10.0.35.14:55860 [2019-07-25T10:48:27.905] "PING loginFailures" "OK" - 25 1125 0 ms                                                                                  

      (HotRod-ServerWorker-3-38) /10.0.35.14:55856 [2019-07-25T10:48:27.905] "PING clientSessions" "OK" - 26 1125 0 ms                                                                                 

      (HotRod-ServerWorker-3-41) /10.0.35.14:55862 [2019-07-25T10:48:27.907] "PING sessions" "OK" - 20 1125 0 ms                                                                                       

      (HotRod-ServerWorker-3-37) /10.0.35.14:55850 [2019-07-25T10:48:28.036] "GET_WITH_METADATA authenticationSessions" "OK" [B0x033E243861393563..[39] 71 4198 1 ms                                   

      (HotRod-ServerWorker-3-39) /10.0.35.14:55858 [2019-07-25T10:48:34.952] "PING ___script_cache" "OK" - 27 1125 1 ms                                                                                

      (HotRod-ServerWorker-3-39) /10.0.35.14:55858 [2019-07-25T10:48:34.958] "CONTAINS_KEY ___script_cache" "OK" [B0x033E106C6F61642D..[19] 44 5 2 ms                                                  

      (HotRod-ServerWorker-3-46) /10.0.40.149:34076 [2019-07-25T10:49:23.001] "PING actionTokens" "OK" - 24 1125 0 ms                                                                                  

      (HotRod-ServerWorker-3-48) /10.0.40.149:34086 [2019-07-25T10:49:23.001] "PING work" "OK" - 16 1125 1 ms                                                                                          

      (HotRod-ServerWorker-3-45) /10.0.40.149:34062 [2019-07-25T10:49:23.001] "PING offlineSessions" "OK" - 27 1125 1 ms                                                                               

      (HotRod-ServerWorker-3-47) /10.0.40.149:34082 [2019-07-25T10:49:23.001] "PING loginFailures" "OK" - 25 1125 1 ms                                                                                 

      (HotRod-ServerWorker-3-49) /10.0.40.149:34092 [2019-07-25T10:49:23.001] "PING offlineClientSessions" "OK" - 33 1125 1 ms                                                                         

      (HotRod-ServerWorker-3-44) /10.0.40.149:34068 [2019-07-25T10:49:23.001] "PING sessions" "OK" - 20 1125 1 ms                                                                                      

      (HotRod-ServerWorker-3-50) /10.0.40.149:34094 [2019-07-25T10:49:23.003] "PING authenticationSessions" "OK" - 34 1125 0 ms                                                                        

      (HotRod-ServerWorker-3-51) /10.0.40.149:34096 [2019-07-25T10:49:23.008] "PING clientSessions" "OK" - 26 1125 1 ms                                                                                

      (HotRod-ServerWorker-3-50) /10.0.40.149:34094 [2019-07-25T10:49:23.145] "GET_WITH_METADATA authenticationSessions" "OK" [B0x033E243861393563..[39] 71 4198 0 ms                                  

      (HotRod-ServerWorker-3-49) /10.0.40.149:34092 [2019-07-25T10:49:29.964] "PING ___script_cache" "OK" - 27 1125 2 ms                                                                               

      (HotRod-ServerWorker-3-49) /10.0.40.149:34092 [2019-07-25T10:49:29.967] "CONTAINS_KEY ___script_cache" "OK" [B0x033E106C6F61642D..[19] 44 5 3 ms                                                 

      (HotRod-ServerWorker-3-54) /10.0.39.126:39586 [2019-07-25T10:50:19.367] "PING -" "OK" - 12 869 1 ms                                                                                              

      (HotRod-ServerWorker-3-56) /10.0.39.126:39588 [2019-07-25T10:50:19.367] "PING -" "OK" - 12 869 1 ms                                                                                              

      (HotRod-ServerWorker-3-55) /10.0.39.126:39582 [2019-07-25T10:50:19.367] "PING -" "OK" - 12 869 1 ms                                                                                              

      (HotRod-ServerWorker-3-57) /10.0.39.126:39608 [2019-07-25T10:50:19.400] "PING sessions" "OK" - 20 1125 0 ms                                                                                      

      (HotRod-ServerWorker-3-60) /10.0.39.126:39628 [2019-07-25T10:50:19.400] "PING actionTokens" "OK" - 24 1125 0 ms                                                                                  

      (HotRod-ServerWorker-3-58) /10.0.39.126:39618 [2019-07-25T10:50:19.400] "PING offlineClientSessions" "OK" - 33 1125 0 ms                                                                         

      (HotRod-ServerWorker-3-61) /10.0.39.126:39634 [2019-07-25T10:50:19.400] "PING authenticationSessions" "OK" - 34 1125 0 ms                                                                        

      (HotRod-ServerWorker-3-59) /10.0.39.126:39626 [2019-07-25T10:50:19.400] "PING clientSessions" "OK" - 26 1125 0 ms                                                                                

      (HotRod-ServerWorker-3-62) /10.0.39.126:39638 [2019-07-25T10:50:19.401] "PING work" "OK" - 16 1125 0 ms                                                                                          

      (HotRod-ServerWorker-3-63) /10.0.39.126:39640 [2019-07-25T10:50:19.402] "PING loginFailures" "OK" - 25 1125 0 ms                                                                                 

      (HotRod-ServerWorker-3-64) /10.0.39.126:39642 [2019-07-25T10:50:19.406] "PING offlineSessions" "OK" - 27 1125 1 ms                                               

       

      there are timeout error log from pod infinispan-2

      (HotRod-ServerWorker-3-40) /10.0.88.132:52344 [2019-07-25T10:15:22.673] "GET_WITH_METADATA clientSessions" "OK" [B0x0304090000000E6A..[85] 110 6 1 ms

      (HotRod-ServerWorker-3-36) /10.0.41.6:55432 [2019-07-25T10:15:07.869] "ERROR sessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 1
      3885562 from infinispan-0" - 1143 153 15001 ms

      (HotRod-ServerWorker-3-28) /10.0.40.135:34462 [2019-07-25T10:15:22.890] "GET_WITH_VERSION sessions" "OK" [B0x033E243861646135..[39] 57 5 0 ms

      (HotRod-ServerWorker-3-118) /10.0.88.132:47988 [2019-07-25T10:15:08.349] "ERROR sessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for reques

      t 13885563 from infinispan-0" - 1144 154 15001 ms

      (HotRod-ServerWorker-3-138) /10.0.39.117:47856 [2019-07-25T10:15:09.078] "ERROR sessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for reques

      t 13885564 from infinispan-0" - 1144 154 15002 ms

      (HotRod-ServerWorker-3-71) /10.0.35.251:55910 [2019-07-25T10:15:24.118] "GET_WITH_VERSION sessions" "OK" [B0x033E246630623932..[39] 58 6 0 ms

      (HotRod-ServerWorker-3-135) /10.0.88.132:48514 [2019-07-25T10:15:09.242] "ERROR clientSessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for

      request 13885565 from infinispan-0" - 1319 154 15001 ms

      (HotRod-ServerWorker-3-58) /10.0.88.132:46856 [2019-07-25T10:15:09.351] "ERROR clientSessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for r

      equest 13885566 from infinispan-0" - 1319 154 15002 ms

      (HotRod-ServerWorker-3-76) /10.0.35.251:57242 [2019-07-25T10:15:09.363] "ERROR sessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request

      13885567 from infinispan-0" - 1144 154 15002 ms

      (HotRod-ServerWorker-3-125) /10.0.88.132:48092 [2019-07-25T10:15:09.433] "ERROR sessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for reques

      t 13885568 from infinispan-0" - 1144 154 15002 ms

      (HotRod-ServerWorker-3-39) /10.0.35.251:50234 [2019-07-25T10:15:09.544] "ERROR authenticationSessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for respons

      es for request 13885569 from infinispan-0" - 72 154 15001 ms

      (HotRod-ServerWorker-3-139) /10.0.39.117:47858 [2019-07-25T10:15:24.554] "GET_WITH_VERSION clientSessions" "OK" [B0x0304090000000E6A..[85] 110 6 0 ms

      (HotRod-ServerWorker-3-14) /10.0.35.251:48360 [2019-07-25T10:15:09.857] "ERROR sessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request

      13885570 from infinispan-0" - 1144 154 15001 ms

      (HotRod-ServerWorker-3-132) /10.0.35.251:37444 [2019-07-25T10:15:10.041] "ERROR sessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for reques

      t 13885571 from infinispan-0" - 1144 154 15001 ms

      (HotRod-ServerWorker-3-25) /10.0.88.132:48996 [2019-07-25T10:15:10.187] "ERROR clientSessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for r

      equest 13885572 from infinispan-0" - 1319 154 15001 ms

      (HotRod-ServerWorker-3-139) /10.0.39.117:47858 [2019-07-25T10:15:25.212] "GET_WITH_VERSION clientSessions" "OK" [B0x0304090000000E6A..[85] 110 6 0 ms

      (HotRod-ServerWorker-3-28) /10.0.40.135:34462 [2019-07-25T10:15:25.239] "GET_WITH_METADATA sessions" "OK" [B0x033E243039383131..[39] 57 5 0 ms

      (HotRod-ServerWorker-3-28) /10.0.40.135:34462 [2019-07-25T10:15:25.239] "GET_WITH_VERSION sessions" "OK" [B0x033E243039383131..[39] 57 5 0 ms

      (HotRod-ServerWorker-3-28) /10.0.40.135:34462 [2019-07-25T10:15:25.623] "GET_WITH_VERSION sessions" "OK" [B0x033E243165353565..[39] 57 5 0 ms

      (HotRod-ServerWorker-3-28) /10.0.40.135:34462 [2019-07-25T10:15:25.644] "GET_WITH_VERSION sessions" "OK" [B0x033E243635353963..[39] 57 5 0 ms

      (HotRod-ServerWorker-3-34) /10.0.88.132:50402 [2019-07-25T10:15:10.828] "ERROR clientSessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for r

      equest 13885573 from infinispan-0" - 1319 154 15001 ms

      (HotRod-ServerWorker-3-100) /10.0.35.251:60668 [2019-07-25T10:15:10.845] "ERROR sessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for reques

      t 13885574 from infinispan-0" - 1144 154 15001 ms

      (HotRod-ServerWorker-3-24) /10.0.41.6:54028 [2019-07-25T10:15:12.690] "ERROR sessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 1
      3885575 from infinispan-0" - 1143 153 15001 ms

      (HotRod-ServerWorker-3-26) /10.0.39.117:49490 [2019-07-25T10:15:13.220] "ERROR sessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request

      13885576 from infinispan-0" - 1144 154 15001 ms

      (HotRod-ServerWorker-3-23) /10.0.88.132:46056 [2019-07-25T10:15:13.339] "ERROR sessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request

      13885577 from infinispan-0" - 1144 154 15002 ms

      (HotRod-ServerWorker-3-124) /10.0.35.251:35182 [2019-07-25T10:15:28.483] "GET_WITH_VERSION sessions" "OK" [B0x033E246536306162..[39] 58 6 0 ms

      (HotRod-ServerWorker-3-136) /10.0.35.251:38458 [2019-07-25T10:15:13.733] "ERROR clientSessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for

      request 13885578 from infinispan-0" - 1319 154 15002 ms

      (HotRod-ServerWorker-3-41) /10.0.35.251:50928 [2019-07-25T10:15:13.794] "ERROR sessions" "org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request

      13885579 from infinispan-0" - 1144 154 15001 ms

       

       

      Infinispan-0 & infinispan-2 status via ispn-cli.sh both are not responding just saying it's rolled back.

       

      $  /bin/ispn-cli.sh -c                                                                                                                                                   

      [standalone@localhost:9990 /] connect                                                                                                                                                            

      [standalone@localhost:9990 /] container cluster-a                                                                                                                                                

      [standalone@localhost:9990 cache-container=cluster-a] ls                                                                                                                                         

      WFLYCTL0063: Composite operation was rolled back