OutOfMemoryException
matabo Aug 24, 2006 2:55 AM
We are using JBossCache in production. The environment is:
- two WebLogic 7 instances on two different machines (Sun and HP)
- JDK 1.3.1
- JBossCache 1.2.3
- JGroups 2.2.8-jdk-1.3
We use neither transactions nor an eviction policy; object lifecycle is managed by our application. Replication is synchronous. We use only TreeCache objects, with no AOP. The JBossCache configuration file is:
<?xml version="1.0" encoding="UTF-8"?>

<!-- ===================================================================== -->
<!--                                                                       -->
<!--  Sample TreeCache Service Configuration                               -->
<!--                                                                       -->
<!-- ===================================================================== -->

<server>

    <!-- <classpath codebase="./lib" archives="jboss-cache.jar, jgroups.jar"/> -->

    <!-- ==================================================================== -->
    <!-- Defines TreeCache configuration                                      -->
    <!-- ==================================================================== -->

    <mbean code="org.jboss.cache.TreeCache" name="jboss.cache:service=TreeCache">

        <depends>jboss:service=Naming</depends>
        <depends>jboss:service=TransactionManager</depends>

        <!-- Configure the TransactionManager -->
        <attribute name="TransactionManagerLookupClass">org.jboss.cache.GenericTransactionManagerLookup</attribute>

        <!-- Isolation level : SERIALIZABLE
                               REPEATABLE_READ (default)
                               READ_COMMITTED
                               READ_UNCOMMITTED
                               NONE -->
        <attribute name="IsolationLevel">REPEATABLE_READ</attribute>

        <!-- Valid modes are LOCAL, REPL_ASYNC and REPL_SYNC -->
        <attribute name="CacheMode">REPL_SYNC</attribute>

        <!-- Just used for async repl: use a replication queue -->
        <attribute name="UseReplQueue">true</attribute>

        <!-- Replication interval for replication queue (in ms) -->
        <attribute name="ReplQueueInterval">100</attribute>

        <!-- Max number of elements which trigger replication -->
        <attribute name="ReplQueueMaxElements">10</attribute>

        <!-- Name of cluster. Needs to be the same for all clusters, in order to find each other -->
        <!-- <attribute name="ClusterName">TreeCache-Cluster</attribute> -->

        <!-- JGroups protocol stack properties. Can also be a URL, e.g. file:/home/bela/default.xml
        <attribute name="ClusterProperties"></attribute>
        -->
        <attribute name="ClusterConfig">
            <config>
                <!-- UDP: if you have a multihomed machine, set the bind_addr attribute to the
                     appropriate NIC IP address, e.g bind_addr="192.168.0.2" -->
                <!-- UDP: On Windows machines, because of the media sense feature being broken
                     with multicast (even after disabling media sense) set the loopback attribute
                     to true; porta mcast originale:48866 indirizzo mcast originale: 228.1.2.3 -->
                <UDP mcast_addr="228.1.2.3" mcast_port="48866" bind_addr="1.5.28.121"
                     ip_ttl="64" ip_mcast="true"
                     mcast_send_buf_size="150000" mcast_recv_buf_size="80000"
                     ucast_send_buf_size="150000" ucast_recv_buf_size="80000"
                     loopback="false"/>
                <PING timeout="2000" num_initial_members="3" up_thread="false" down_thread="false"/>
                <MERGE2 min_interval="10000" max_interval="20000"/>
                <!-- <FD shun="true" up_thread="true" down_thread="true" />-->
                <FD_SOCK/>
                <VERIFY_SUSPECT timeout="1500" up_thread="false" down_thread="false"/>
                <pbcast.NAKACK gc_lag="50" retransmit_timeout="600,1200,2400,4800"
                               max_xmit_size="8192" up_thread="false" down_thread="false"/>
                <UNICAST timeout="600,1200,2400" window_size="100" min_threshold="10"
                         down_thread="false"/>
                <pbcast.STABLE desired_avg_gossip="20000" up_thread="false" down_thread="false"/>
                <FRAG frag_size="8192" down_thread="false" up_thread="false"/>
                <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="true"
                            print_local_addr="true"/>
                <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
            </config>
        </attribute>

        <!-- Whether or not to fetch state on joining a cluster -->
        <attribute name="FetchStateOnStartup">true</attribute>

        <!-- The max amount of time (in milliseconds) we wait until the initial state
             (ie. the contents of the cache) are retrieved from existing members in a
             clustered environment -->
        <attribute name="InitialStateRetrievalTimeout">30000</attribute>

        <!-- Number of milliseconds to wait until all responses for a synchronous call
             have been received. -->
        <attribute name="SyncReplTimeout">15000</attribute>

        <!-- Max number of milliseconds to wait for a lock acquisition -->
        <attribute name="LockAcquisitionTimeout">10000</attribute>

        <!-- Name of the eviction policy class. -->
        <attribute name="EvictionPolicyClass"></attribute>

        <!-- Indicate whether to use marshalling or not. Set this to true if you are
             running under a scoped class loader, e.g., inside an application server.
             Default is "false". -->
        <attribute name="UseMarshalling">true</attribute>

        <!--
        <attribute name="CacheLoaderClass">org.jboss.cache.loader.bdbje.BdbjeCacheLoader</attribute>
        <attribute name="CacheLoaderConfig"> location=c:\\tmp\\bdbje </attribute>
        <attribute name="CacheLoaderShared">true</attribute>
        <attribute name="CacheLoaderPreload">/</attribute>
        <attribute name="CacheLoaderPassivation">false</attribute>
        -->

        <!--
        <attribute name="CacheLoaderClass">org.jboss.cache.loader.FileCacheLoader</attribute>
        <attribute name="CacheLoaderConfig"> location=c:\\tmp </attribute>
        <attribute name="CacheLoaderShared">true</attribute>
        <attribute name="CacheLoaderPreload">/</attribute>
        <attribute name="CacheLoaderPassivation">false</attribute>
        -->

    </mbean>

    <!-- Uncomment to get a graphical view of the TreeCache MBean above -->
    <!-- <mbean code="org.jboss.cache.TreeCacheView" name="jboss.cache:service=TreeCacheView">-->
    <!-- <depends>jboss.cache:service=TreeCache</depends>-->
    <!-- <attribute name="CacheService">jboss.cache:service=TreeCache</attribute>-->
    <!-- </mbean>-->

</server>
We are facing the following problems:
1) From time to time WebLogic "freezes" (in BEA terms: a t3 call gets no answer, no RemoteException, nothing at all).
We hope this problem will disappear when we migrate to JBossCache 1.4.0 / WebLogic 9.1 / JDK 1.5.
2) OutOfMemoryError. Everything works fine for a few days, then we see many full garbage collections, and then memory usage explodes.
Our application has a sort of "periodic load": many users connect in the morning, a few disconnect during lunch time, they reconnect in the afternoon, and then they disconnect in the evening. During the night we clean everything out of the cache (and from what we see, we are almost sure the cleanup works).
We ran a lot of stress tests, but we never saw this kind of problem during them.
We are trying to investigate the OOM problem, but I would like to know whether our JBossCache configuration file could be responsible for the memory trouble (I read something about FC / fast messaging being related to memory trouble...).
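For the investigation itself, here is a minimal sketch of the kind of periodic heap logging that could show whether usage creeps up across the daily load cycle or jumps suddenly. It is an assumption about how one might instrument this, not something taken from JBossCache or WebLogic: the class `HeapSampler` and its method names are hypothetical, and it uses only `java.lang.Runtime` and `java.util.Timer`, both of which should be available on JDK 1.3.

```java
import java.util.Timer;
import java.util.TimerTask;

/** Hypothetical helper for logging heap usage; not part of JBossCache or WebLogic. */
public class HeapSampler {

    /** Heap currently in use, in bytes, as reported by the JVM. */
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Formats one log line; kept separate so it can be checked in isolation. */
    public static String formatSample(long timestampMillis, long usedBytes, long totalBytes) {
        return timestampMillis + " usedKB=" + (usedBytes / 1024)
                + " totalKB=" + (totalBytes / 1024);
    }

    /** Starts a daemon timer that prints one sample per interval. */
    public static Timer start(long intervalMillis) {
        Timer timer = new Timer(true); // daemon thread, so it never blocks shutdown
        timer.schedule(new TimerTask() {
            public void run() {
                Runtime rt = Runtime.getRuntime();
                long total = rt.totalMemory();
                long used = total - rt.freeMemory();
                System.out.println(formatSample(System.currentTimeMillis(), used, total));
            }
        }, 0L, intervalMillis);
        return timer;
    }

    /** Small demo: sample once per second for a few seconds, then stop. */
    public static void main(String[] args) throws InterruptedException {
        Timer timer = start(1000L);
        Thread.sleep(3500L);
        timer.cancel();
    }
}
```

Calling `HeapSampler.start(60000L)` at application startup on both instances and comparing the output with the login/logout pattern and the nightly cleanup would show whether the used heap really returns to its baseline every night.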
TIA
matabo