
    Infinispan error as "org.infinispan.commons.CacheException"

    rituraj

      We are running 6 nodes in domain mode: 2 nodes run on the master and 4 are distributed across 2 slaves.

      master: node 1 and node 2

      slave 1: node 3 and node 4

      slave 2: node 5 and node 6

       

      There are 2 clusters:

      cluster 1: (node 1, node 3 and node 5)

      cluster 2: (node 2, node 4 and node 6)

      Profile used: HA

      We are using mod_cluster as well.

       

      Both clusters have the same WAR deployed on them, and everything was working fine. During my load testing I observed that one of the nodes stopped responding (it was not able to serve any incoming requests).

       

       

      Hence I decided to restart it using the web console; during the restart I observed the exception below:

       

      06:11:05,670 ERROR [org.jboss.msc.service.fail] (ServerService Thread Pool -- 64) MSC000001: Failed to start service jboss.infinispan.web.default-host/testapp: org.jboss.msc.service.StartException in service jboss.infinispan.web.default-host/testapp: org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.InterruptedException on object of type StateTransferManagerImpl

      [Server:node1]     at org.jboss.as.clustering.msc.AsynchronousService$1.run(AsynchronousService.java:91)

      [Server:node1]     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_45]

      [Server:node1]     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_45]

      [Server:node1]     at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_45]

      [Server:node1]     at org.jboss.threads.JBossThread.run(JBossThread.java:122)

      [Server:node1] Caused by: org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.InterruptedException on object of type StateTransferManagerImpl

      [Server:node1]     at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:185)

      [Server:node1]     at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:869)

      [Server:node1]     at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:638)

      [Server:node1]     at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:627)

      [Server:node1]     at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:530)

      [Server:node1]     at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:216)

      [Server:node1]     at org.infinispan.CacheImpl.start(CacheImpl.java:675)

      [Server:node1]     at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:553)

      [Server:node1]     at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:516)

      [Server:node1]     at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:398)

      [Server:node1]     at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:412)

      [Server:node1]     at org.jboss.as.clustering.infinispan.DefaultCacheContainer.getCache(DefaultCacheContainer.java:103)

      [Server:node1]     at org.jboss.as.clustering.infinispan.DefaultCacheContainer.getCache(DefaultCacheContainer.java:94)

      [Server:node1]     at org.jboss.as.clustering.infinispan.subsystem.CacheService.start(CacheService.java:78)

      [Server:node1]     at org.jboss.as.clustering.msc.AsynchronousService$1.run(AsynchronousService.java:86)

      [Server:node1]     ... 4 more

      [Server:node1] Caused by: org.infinispan.commons.CacheException: Initial state transfer timed out for cache default-host/testapp on wildfly-app301p_dc:node1/wildfly-consumer1-cluster

      [Server:node1]     at org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:202)

      [Server:node1]     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.7.0_45]

      [Server:node1]     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) [rt.jar:1.7.0_45]

      [Server:node1]     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_45]

      [Server:node1]     at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_45]

      [Server:node1]     at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:183)

      [Server:node1]     ... 18 more

      [Server:node1]

      [Server:node1] 06:11:05,695 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) JBAS014613: Operation ("add") failed - address: ([("deployment" => "test.war")]) - failure description: {"JBAS014671: Failed services" => {"jboss.infinispan.web.default-host/testapp" => "org.jboss.msc.service.StartException in service jboss.infinispan.web.default-host/testapp: org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.InterruptedException on object of type StateTransferManagerImpl

      [Server:node1]     Caused by: org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.InterruptedException on object of type StateTransferManagerImpl

      [Server:node1]     Caused by: org.infinispan.commons.CacheException: Initial state transfer timed out for cache default-host/testapp on wildfly-app301p_dc:node1/wildfly-consumer1-cluster"}}

      [Server:node1] 06:11:05,771 INFO  [org.jboss.as.server] (Controller Boot Thread) JBAS018559: Deployed "test.war" (runtime-name : "test.war")

      [Server:node1] 06:11:05,772 INFO  [org.jboss.as.server] (Controller Boot Thread) JBAS018559: Deployed "property-adapter-module.rar" (runtime-name : "property-adapter-module.rar")

      [Server:node1] 06:11:05,772 INFO  [org.jboss.as.server] (Controller Boot Thread) JBAS018559: Deployed "mysql-connector-java-5.1.30-bin.jar" (runtime-name : "mysql-connector-java-5.1.30-bin.jar")

      [Server:node1] 06:11:05,777 INFO  [org.jboss.as.controller] (Controller Boot Thread) JBAS014774: Service status report

      [Server:node1] JBAS014777:   Services which failed to start:      service jboss.infinispan.web.default-host/testapp: org.jboss.msc.service.StartException in service jboss.infinispan.web.default-host/testapp: org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.InterruptedException on object of type StateTransferManagerImpl

      [Server:node1]

      [Server:node1] 06:11:05,785 ERROR [org.jboss.as] (Controller Boot Thread) JBAS015875: WildFly 8.0.0.Final "WildFly" started (with errors) in 84750ms - Started 912 of 1044 services (8 services failed or missing dependencies, 192 services are lazy, passive or on-demand)

       

      After this error we were unable to start the server; it continued to fail with the same error.

       

      Can someone please let us know how we can solve this Infinispan issue?

        • 1. Re: Infinispan error as "org.infinispan.commons.CacheException"
          pferraro

          Which version of WildFly are you using?  I suspect that whatever prevented the node from responding also prevented it from shutting down properly.  This would certainly cause the state transfer to time out.  In the future, I would recommend ensuring that the unresponsive node is completely shut down (i.e. the other nodes register a cluster view change) before restarting it.

          • 2. Re: Infinispan error as "org.infinispan.commons.CacheException"
            rituraj

            Thanks for replying Paul,

             

            I am using WildFly 8.0.0.Final. We have tried to reproduce the same thing during a stress test, and this time the errors we are getting are all related to Infinispan.

            Linking this with another discussion we have raised: Getting lots of exception related to infinispan while load testing

            The errors given there are the same ones we are getting now.

            The 1st one:

            2014-05-23 03:49:37,056 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (default task-15) ISPN000136: Execution error: org.infinispan.util.concurrent.TimeoutException: Replication timeout for partswildfly-pdapp301p_dc:partswildfly-consumer2-pdapp301pw1/partswildfly-consumer2-cluster

                    at org.infinispan.remoting.transport.AbstractTransport.parseResponseAndAddToResponseList(AbstractTransport.java:77)

             

             

            The 2nd:

            Caused by: org.infinispan.commons.CacheException: Initial state transfer timed out for cache default-host/testapp t on wildfly-app303p_slv:wildfly-consumer2-app303pw1/wildfly-consumer2-cluster

            at org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:202)

            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.7.0_45]

            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) [rt.jar:1.7.0_45]

            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_45]

            at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_45]

            at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:183)

            ... 18 more

             

            Both of the above Infinispan errors continued, and the app stopped responding after some time.

             

            Is there any solution to this? Also, please find below the Infinispan configuration we are using:

             

                        <cache-container name="web" default-cache="dist" module="org.wildfly.clustering.web.infinispan">
                            <transport stack="tcp" cluster="${jboss.cluster.group.name}"  lock-timeout="60000"/>
                            <distributed-cache name="dist" batching="true" mode="ASYNC" owners="4" l1-lifespan="0">
                                <locking isolation="REPEATABLE_READ"/>
                                <file-store/>
                            </distributed-cache>
                       

            </cache-container>

             

            Let us know if we are missing something here, or what the recommended Infinispan configuration would be.

             

            Rituraj

            • 3. Re: Infinispan error as "org.infinispan.commons.CacheException"
              rituraj

              hey Paul,

               

              I have a question related to the configuration settings. What do these 2 values control? Can you explain "owners" and "l1-lifespan" a bit for us?

               

              Thanks

              Rituraj

              • 4. Re: Infinispan error as "org.infinispan.commons.CacheException"
                pferraro

                owners refers to the number of nodes (including the originating node itself) to which a given web session will be replicated.  The default value in WF 8.0 was 4, which was deemed too conservative.  We decreased this to 2 for 8.1.  The value should be set according to the level of availability you require.  The general trade-off is between availability and capacity/performance.  A cluster can survive the simultaneous loss of up to N-1 nodes without losing sessions, where N is the configured value for "owners".  However, the smaller N is, the greater the in-memory session capacity of the cluster as a whole.  There are also marginal replication costs to higher values of N.

                http://infinispan.org/docs/6.0.x/user_guide/user_guide.html#_distribution_mode
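
                For reference, a minimal sketch of where this attribute lives in the web cache-container (the surrounding attribute values here are illustrative, not a recommendation):

                <cache-container name="web" default-cache="dist" module="org.wildfly.clustering.web.infinispan">
                    <!-- owners="2": each session is stored on 2 nodes, so the cluster survives the loss of any 1 node -->
                    <distributed-cache name="dist" mode="SYNC" owners="2" l1-lifespan="0">
                        <file-store/>
                    </distributed-cache>
                </cache-container>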

                 

                l1-lifespan refers to Infinispan's level-1 cache when using distribution mode.

                http://infinispan.org/docs/6.0.x/user_guide/user_guide.html#_l1_caching

                Since WF in concert with mod_cluster already takes measures to always direct requests to the primary owner of a given session, I recommend keeping the L1 cache disabled (i.e. l1-lifespan=0) to avoid unnecessary remote invalidation RPCs.

                • 5. Re: Re: Infinispan error as "org.infinispan.commons.CacheException"
                  pferraro

                  I replied to the other thread, but it also occurs to me, given that you're performing load tests, that you may need to increase the state transfer timeout.  The default value is 60000 ms (i.e. 1 minute).

                  e.g.

                  <distributed-cache name="dist" batching="true" mode="SYNC" owners="2" l1-lifespan="0">
                     <state-transfer timeout="120000"/>
                     <!-- ... -->
                  </distributed-cache>
                  
                  • 6. Re: Infinispan error as "org.infinispan.commons.CacheException"
                    rituraj

                    Thanks a lot for the explanation, Paul. Just wanted to make sure of one thing. I have:

                     

                    <transport stack="tcp" cluster="${jboss.cluster.group.name}" lock-timeout="60000"/>
                    <distributed-cache name="dist" batching="true" mode="ASYNC" owners="1" l1-lifespan="0">

                     

                     

                    Should lock-timeout="60000" stay, or should I remove it when I am adding <state-transfer timeout="120000"/>?


                    Thanks

                    • 7. Re: Infinispan error as "org.infinispan.commons.CacheException"
                      pferraro

                      The <transport lock-timeout="..."/> attribute controls the acquisition timeout for the lock required to initiate the state transfer, since only one node is allowed to perform state transfer at a given time.  The state transfer timeout controls how long the node that initiated the state transfer should wait for the state transfer to finish.  Go ahead and set these both to something large, say 240000 (i.e. 4 minutes).
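
                      A minimal sketch combining the two settings at the suggested values (the rest of the cache definition is carried over from your earlier posts, not prescribed here):

                      <transport stack="tcp" cluster="${jboss.cluster.group.name}" lock-timeout="240000"/>
                      <distributed-cache name="dist" batching="true" mode="SYNC" owners="2" l1-lifespan="0">
                          <!-- how long the initiating node waits for the state transfer to finish -->
                          <state-transfer timeout="240000"/>
                          <file-store/>
                      </distributed-cache>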

                      • 8. Re: Infinispan error as "org.infinispan.commons.CacheException"
                        rituraj

                        Hi Paul,

                         

                        We did some rounds of testing today, and it went quite well after those changes. Right now we have:

                        master --> node 1, and 2 slaves with one application server each

                        slave 1 --> node 2

                        slave 3 --> node 3

                        We started the stress test and brought down node 3 and node 1 (on the master), one by one, after which node 2 of slave 2 was handling the requests well.
                        But once we started the master again, we began receiving a lot of 500s, the errors below appeared in the log on master node 1, and at the same time slave 2, which was up, stopped responding.

                        The errors below came on master node 1:


                        05:28:39,164 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (remote-thread-7) ISPN000136: Execution error: org.infinispan.util.concurrent.TimeoutException: Could not acquire lock on BwFRvwU0GP1cA7o4hKBHxQRa on behalf of transaction GlobalTransaction:<wildfly-app302p_slv:wildfly-consumer-app302pw1/wildfly-consumer-cluster>:31783:remote. Waiting to complete tx: RemoteTransaction{modifications=[ReplaceCommand{key=We-iVzjYANyY1ura5bGWeAbq, oldValue=null, newValue=[B@3a1d3fb5, metadata=EmbeddedMetadata{version=null}, flags=[SKIP_LOCKING, IGNORE_RETURN_VALUES], successful=true, valueMatcher=MATCH_ALWAYS}, ReplaceCommand{key=00cpVwyjZ9DXIPdGsxr5D409, oldValue=null, newValue=[B@5641a1e1, metadata=EmbeddedMetadata{version=null}, flags=[SKIP_LOCKING, IGNORE_RETURN_VALUES], successful=true, valueMatcher=MATCH_ALWAYS}, ReplaceCommand{key=00cpVwyjZ9DXIPdGsxr5D409, oldValue=null, newValue=org.wildfly.clustering.web.infinispan.session.coarse.CoarseSessionCacheEntry@6eeae704, metadata=EmbeddedMetadata{version=null}, flags=[IGNORE_RETURN_VALUES], successful=true, valueMatcher=MATCH_ALWAYS}, ReplaceCommand{key=MIafKu-_NUDDXhMWuPzVcbYv, oldValue=null, newValue=[B@7fe4f07f, metadata=EmbeddedMetadata{version=null}, flags=[SKIP_LOCKING, IGNORE_RETURN_VALUES], successful=true, valueMatcher=MATCH_ALWAYS}, ReplaceCommand{key=MIafKu-_NUDDXhMWuPzVcbYv, oldValue=null, newValue=org.wildfly.clustering.web.infinispan.session.coarse.CoarseSessionCacheEntry@f696372, metadata=EmbeddedMetadata{version=null}, flags=[IGNORE_RETURN_VALUES], successful=true, valueMatcher=MATCH_ALWAYS}, ReplaceCommand{key=YnlKwyJig2-USYC1b7T6-KFv, oldValue=null, newValue=[B@4fffd945, metadata=EmbeddedMetadata{version=null}, flags=[SKIP_LOCKING, IGNORE_RETURN_VALUES], successful=true, valueMatcher=MATCH_ALWAYS}, ReplaceCommand{key=YnlKwyJig2-USYC1b7T6-KFv, oldValue=null, newValue=org.wildfly.clustering.web.infinispan.session.coarse.CoarseSessionCacheEntry@26116a0b, metadata=EmbeddedMetadata{version=null}, flags=[IGNORE_RETURN_VALUES], successful=true, valueMatcher=MATCH_ALWAYS}, ReplaceCommand{key=ljztt_d7bNz4Dndj7mHi8ksX, oldValue=null, newValue=[B@41490bb3, metadata=EmbeddedMetadata{version=null}, flags=[SKIP_LOCKING, IGNORE_RETURN_VALUES], successful=true, valueMatcher=MATCH_ALWAYS}, ReplaceCommand{key=ljztt_d7bNz4Dndj7mHi8ksX, oldValue=null, newValue=org.wildfly.clustering.web.infinispan.session.coarse.CoarseSessionCacheEntry@229f3641, metadata=EmbeddedMetadata{version=null}, flags=[IGNORE_RETURN_VALUES], successful=true, valueMatcher=MATCH_ALWAYS}, ReplaceCommand{key=vIgxd_HdcPJwBFE7_g5U4IMs, oldValue=null, newValue=[B@2a2f729, metadata=EmbeddedMetadata{version=null}, flags=[SKIP_LOCKING, IGNORE_RETURN_VALUES], successful=true, valueMatcher=MATCH_ALWAYS}, ReplaceCommand{key=vIgxd_HdcPJwBFE7_g5U4IMs, oldValue=null, newValue=org.wildfly.clustering.web.infinispan.session.coarse.CoarseSessionCacheEntry@1910a619, metadata=EmbeddedMetadata{version=null}, flags=[IGNORE_RETURN_VALUES], successful=true, valueMatcher=MATCH_ALWAYS}, ReplaceCommand{key=BwFRvwU0GP1cA7o4hKBHxQRa, oldValue=null, newValue=[B@1408c02a, metadata=EmbeddedMetadata{version=null}, flags=[SKIP_LOCKING, IGNORE_RETURN_VALUES], successful=true, valueMatcher=MATCH_ALWAYS}, ReplaceCommand{key=BwFRvwU0GP1cA7o4hKBHxQRa, oldValue=null, newValue=org.wildfly.clustering.web.infinispan.session.coarse.CoarseSessionCacheEntry@4a800fac, metadata=EmbeddedMetadata{version=null}, flags=[IGNORE_RETURN_VALUES], successful=true, valueMatcher=MATCH_ALWAYS}, 
ReplaceCommand{key=Rjm48FSuHZnoAKE2PkLIdaOz, oldValue=null, newValue=[B@1d850b6a, metadata=EmbeddedMetadata{version=null}, flags=[SKIP_LOCKING, IGNORE_RETURN_VALUES], successful=true, valueMatcher=MATCH_ALWAYS}, ReplaceCommand{key=Rjm48FSuHZnoAKE2PkLIdaOz, oldValue=null, newValue=org.wildfly.clustering.web.infinispan.session.coarse.CoarseSessionCacheEntry@9b06b4a, metadata=EmbeddedMetadata{version=null}, flags=[IGNORE_RETURN_VALUES], successful=true, valueMatcher=MATCH_ALWAYS}], lookedUpEntries={}, lockedKeys=null, backupKeyLocks=[BwFRvwU0GP1cA7o4hKBHxQRa, YnlKwyJig2-USYC1b7T6-KFv, Rjm48FSuHZnoAKE2PkLIdaOz, ljztt_d7bNz4Dndj7mHi8ksX, 00cpVwyjZ9DXIPdGsxr5D409, vIgxd_HdcPJwBFE7_g5U4IMs, MIafKu-_NUDDXhMWuPzVcbYv], lookedUpEntriesTopology=10, isMarkedForRollback=false, tx=GlobalTransaction:<wildfly-app302p_slv:wildfly-consumer-app302pw1/wildfly-consumer-cluster>:28422:remote, state=null}.

                         

                         

                        Can you tell us why this is happening? It happens when we try to bring up the master node.

                         

                        Thanks

                        RIturaj

                        • 9. Re: Infinispan error as "org.infinispan.commons.CacheException"
                          pferraro

                          You probably need to increase the locking timeout.  When you bring a node up, a rebalancing of sessions occurs.  Any sessions that need to be moved will be locked temporarily - which can cause timeouts if the state transfer is moving a lot of data.

                          e.g.

                          <locking acquire-timeout="60000"/>

                          • 10. Re: Infinispan error as "org.infinispan.commons.CacheException"
                            rituraj

                            I have 2 more questions related to this:

                            1) We are facing a .dat file issue: it occupies a lot of space, grows continuously, and ends up eating all the available disk space. Its path is as follows:

                            /domain/servers/node1/data/infinispan/web/default-host/appname.dat

                            This can be controlled once we use expiration and set max-idle to around 2 minutes, but once we use expiration we start getting the Infinispan errors.

                            Can you tell us how we can deal with this?

                            2) If we don't want to use caching at all, how can we achieve that in the "ha" profile, or maybe use local caching only?

                             

                             

                            Again, thanks for all your help, Paul.

                            • 11. Re: Infinispan error as "org.infinispan.commons.CacheException"
                              pferraro

                              1) We are facing a .dat file issue: it occupies a lot of space, grows continuously, and ends up eating all the available disk space. Its path is as follows:

                              /domain/servers/node1/data/infinispan/web/default-host/appname.dat

                              This can be controlled once we use expiration and set max-idle to around 2 minutes, but once we use expiration we start getting the Infinispan errors.

                              Can you tell us how we can deal with this?

                              What does your Infinispan cache configuration look like?  Using the default web session configuration, this file only stores passivated web sessions.  However, if configured using <file-store passivation="false"/>, it will store a copy of *all* web sessions (including those in memory).  There is no reason to disable passivation if using a clustered cache (i.e. distributed-cache or replicated-cache), since web sessions in memory are backed up on remote nodes, if not on disk.  If you are already configured to store passivated entries only (i.e. using <file-store passivation="true"/>, or just <file-store/>, since passivation="true" by default), then make sure that your passivation settings (i.e. configured via <max-active-sessions/> in jboss-web.xml - or via <eviction max-entries="..."/> in the infinispan subsystem configuration) are not overly aggressive.  Using <max-active-sessions/> ensures that only the configured number of web sessions are ever held in memory - excess ones are passivated to disk.
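
                              For illustration, a minimal jboss-web.xml sketch capping in-memory sessions (the value 1000 is arbitrary, not a recommendation):

                              <?xml version="1.0" encoding="UTF-8"?>
                              <jboss-web>
                                  <!-- sessions beyond this count are passivated to the file store instead of staying in memory -->
                                  <max-active-sessions>1000</max-active-sessions>
                              </jboss-web>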

                              The other possibility is that you are simply overloading your nodes with more web sessions than they can handle.  In that case, you will need to add more nodes to your cluster to handle the expected load.

                               

                              I should also note that web session expiration should not be configured via the cache configuration, but rather via <session-config><session-timeout>...</session-timeout></session-config> in your web.xml.
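
                              For example, a minimal web.xml fragment (the 60-minute value is illustrative):

                              <session-config>
                                  <!-- session expiration in minutes, handled by the servlet container rather than the cache -->
                                  <session-timeout>60</session-timeout>
                              </session-config>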

                              2) If we don't want to use caching at all, how can we achieve that in the "ha" profile, or maybe use local caching only?

                              If your web application does not require high availability of web session state, then simply remove the <distributable/> element from your web.xml.  That way web sessions will only be stored in memory - and the number of sessions your node can handle will be limited by the available heap space - or rigidly restricted via a configured value for <max-active-sessions/>.
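
                              For reference, the element in question - removing this single line from web.xml keeps sessions local to the node (the web-app version shown is illustrative):

                              <web-app xmlns="http://java.sun.com/xml/ns/javaee" version="3.0">
                                  <!-- remove this element to disable session replication -->
                                  <distributable/>
                              </web-app>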

                              • 12. Re: Infinispan error as "org.infinispan.commons.CacheException"
                                rituraj

                                Thanks a lot, Paul. All of the things you said make sense. Here is the cache configuration right now:

                                 

                                                <cache-container name="web" default-cache="dist" module="org.wildfly.clustering.web.infinispan">

                                                    <transport stack="tcp" cluster="${jboss.cluster.group.name}" lock-timeout="300000"/>

                                                    <distributed-cache name="dist" start="EAGER" batching="true" mode="SYNC" remote-timeout="300000" owners="2" l1-lifespan="0">

                                                        <locking striping="false" acquire-timeout="60000" concurrency-level="3000"/>

                                                        <eviction strategy="LIRS" max-entries="1000"/>

                                                            <state-transfer timeout="300000"/>

                                                        <file-store shared="true" preload="true"/>

                                                    </distributed-cache>

                                                </cache-container>

                                 

                                We are also specifying the session-timeout in our web.xml as:

                                <session-config>
                                    <session-timeout>60</session-timeout>
                                    <cookie-config>
                                        <path>/</path>
                                    </cookie-config>
                                </session-config>

                                 

                                Right now we have 3 nodes running in a cluster; we will try to increase them to 6. Since we are not using jboss-web.xml, we are still relying on eviction to help us out. Maybe the value is too high right now, so we will try testing with something around 500 entries.

                                • 13. Re: Infinispan error as "org.infinispan.commons.CacheException"
                                  rituraj

                                  Which one is true? I am still confused about owners.

                                  onwers is "The number of copies that should be maintained in the cluster for each cache entry"  --> if this is true then owner=2 will be sufficient and should not be effected with adding or removing nodes from a cluster as we will have always 2 copies of each session across the cluster ..

                                  now

                                  owner is "owners refers to the number of nodes (including itself) to which the web session will replicate a given session" --> if this is true and we have again owner=2 suppose we have  6 nodes in a cluster so  each session will be replicated to 2 nodes "only" (which nobody knows) irrespective of the others nodes and if somehow both of them are down then the session will be lost ?

                                  in the above case then we should be having owner=4

                                   

                                  wanted to have a logical value of owner wrt to session cache copies and number of nodes in the cluster...

                                  • 14. Re: Infinispan error as "org.infinispan.commons.CacheException"
                                    pferraro

                                    Strictly speaking, owners="2" means that a cache entry will be stored on 2 nodes of the cluster.  The determination of ownership is deterministic and uses a hashing algorithm that considers the hash code of the cache key (i.e. the session ID) and the topology of the cluster.  WildFly generates its session IDs such that the node that created the session will be one of the owners (in fact, the primary owner - on which lock acquisition takes place).  When the cluster topology changes (i.e. a node leaves or joins), ownership is reassessed to ensure that session ownership is distributed evenly, and any missing owners (due to a node leaving) are reassigned.

                                     

                                    In general, the configured value for owners is a trade-off between availability of cache data and efficiency/performance.  The lower the value, the better the performance, but the more vulnerable the cluster is to data loss.  If owners="2" and 2 nodes in the cluster go down at the same time, any sessions that were only owned by those nodes will be lost.  Any sessions that were owned by 1 of those nodes will gain another owner, such that redundancy is restored.
