Your configuration files aren't far from the standard configuration included in the AS 7.1 kit, so nothing there looks out of the ordinary. The attributes on the Infinispan cache "store" element seem appropriate. Interestingly, the NPE in the ConcurrentHashMap matches a line in my source that implies the value being inserted was null. Moving up the stack trace, the calling method appears to be line 308 of BdbjeCacheStore.java, which would imply there was no current transaction. Could the transaction have timed out? Surely there'd be something in the log. Perhaps try bumping the default timeout to something quite a bit larger. In your AS7 configuration, this is set with the following:
<subsystem xmlns="urn:jboss:domain:transactions:1.1">
    ...
    <coordinator-environment default-timeout="300"/>
    ...
</subsystem>
If changing that doesn't work, perhaps try testing with another cache loader. I understand that wouldn't help your configuration, but it might help diagnose what's going wrong.
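To illustrate the point about the NPE above: unlike HashMap, ConcurrentHashMap rejects null keys and null values outright, so a null value reaching the store's put path fails immediately with an NPE. A minimal standalone sketch (the names here are mine, not from the cache store code):

```java
import java.util.concurrent.ConcurrentHashMap;

public class NullValueDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Object> map = new ConcurrentHashMap<>();
        map.put("ok", new Object()); // fine: non-null value

        try {
            map.put("bad", null); // ConcurrentHashMap forbids null values
        } catch (NullPointerException e) {
            System.out.println("NPE on null value, as in the stack trace");
        }
    }
}
// prints: NPE on null value, as in the stack trace
```

So if something upstream (a timed-out transaction, an OOM, etc.) hands the store a null value, this is exactly the failure you'd see.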
Thanks, I'll give that a try.
I doubled the transaction timeout to 600 seconds and eventually got the same exception. I've changed my test to a generic program instead of one that uses confidential data. I also set max-entries on the cache to be sure I wasn't running out of memory. I'll keep experimenting with it, and I'll set the transaction timeout to some really huge value. If I knew it was just a transaction timeout (where you get the usual timeout message from Arjuna), I'd accept that that's the problem, but that's not what I'm seeing. Also, the transactions are all the same - saving around 10K nodes that are almost identical - so why does it take hours of running before hitting the first failure?
This is just a stab in the dark, but is there any background Lucene processing going on at any point, like index optimization, that might be interfering?
Quick update on this. After setting max-entries in the Infinispan cache config to limit Infinispan memory usage, setting the tx timeout to 36000, and regularly calling System.gc(), I was able to add a much larger number of nodes without this exception - and without the JVM memory growing to the maximum.
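For anyone following along, bounding the cache in the AS7 Infinispan subsystem looks roughly like this (the container/cache names, strategy, and value are illustrative, not the exact settings I used):

<cache-container name="mycontainer" default-cache="mycache">
    <local-cache name="mycache">
        <eviction strategy="LRU" max-entries="10000"/>
    </local-cache>
</cache-container>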
At first, after just setting max-entries, it didn't make sense that the JVM memory still continued to grow and grow. Finally it occurred to me that it might not be Infinispan at all, but regular Java objects not being GC'd because they were being created so quickly. After adding the periodic System.gc() call, everything behaved much more predictably.
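The periodic GC call was nothing fancy - just a counter in the insert loop. A rough sketch of the pattern (the interval and node count are made up for illustration, and the real cache puts are elided):

```java
public class BatchLoader {
    // hypothetical interval: hint a GC every 1000 inserts
    private static final int GC_INTERVAL = 1000;

    public static void main(String[] args) {
        int totalNodes = 10_000; // roughly one transaction's worth of nodes
        for (int i = 1; i <= totalNodes; i++) {
            // cache.put(key(i), node(i)) would go here in the real test program
            if (i % GC_INTERVAL == 0) {
                System.gc(); // suggest collecting short-lived objects between batches
            }
        }
        System.out.println("inserted " + totalNodes + " nodes");
    }
}
// prints: inserted 10000 nodes
```

Note that System.gc() is only a hint to the JVM; it worked here, but tuning the collector would be the more principled fix.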
So I can't say this with any facts to back it up, but it may be that the NPE was caused by an OutOfMemoryError that wasn't handled gracefully, brought on by a misconfiguration (unlimited cache size) and the GC issue.
Correction: tx timeout = 3600