2 Replies Latest reply on Aug 24, 2006 4:31 AM by matabo

OutOfMemoryException

matabo Aug 24, 2006 2:55 AM

We are using JBossCache in production. The environment is:

- Two WebLogic 7 instances on two different machines (Sun and HP)

- JDK 1.3.1

- JBossCache 1.2.3

- JGroups 2.2.8-jdk-1.3

We use neither transactions nor an eviction policy; object lifecycle is managed by our application. Replication is synchronous. We use only TreeCache objects, with no AOP. The JBossCache configuration file is:

<?xml version="1.0" encoding="UTF-8"?>

<!-- ===================================================================== -->
<!-- -->
<!-- Sample TreeCache Service Configuration -->
<!-- -->
<!-- ===================================================================== -->

<server>

 <!--
 <classpath codebase="./lib" archives="jboss-cache.jar, jgroups.jar"/>
 -->


 <!-- ==================================================================== -->
 <!-- Defines TreeCache configuration -->
 <!-- ==================================================================== -->

 <mbean code="org.jboss.cache.TreeCache"
 name="jboss.cache:service=TreeCache">

 <depends>jboss:service=Naming</depends>
 <depends>jboss:service=TransactionManager</depends>

 <!--
 Configure the TransactionManager
 -->
 <attribute name="TransactionManagerLookupClass">org.jboss.cache.GenericTransactionManagerLookup</attribute>

 <!--
 Isolation level : SERIALIZABLE
 REPEATABLE_READ (default)
 READ_COMMITTED
 READ_UNCOMMITTED
 NONE
 -->
 <attribute name="IsolationLevel">REPEATABLE_READ</attribute>

 <!--
 Valid modes are LOCAL, REPL_ASYNC and REPL_SYNC
 -->
 <attribute name="CacheMode">REPL_SYNC</attribute>

 <!--
 Just used for async repl: use a replication queue
 -->
 <attribute name="UseReplQueue">true</attribute>

 <!--
 Replication interval for replication queue (in ms)
 -->
 <attribute name="ReplQueueInterval">100</attribute>

 <!--
 Max number of elements which trigger replication
 -->
 <attribute name="ReplQueueMaxElements">10</attribute>

 <!-- Name of cluster. Needs to be the same for all clusters, in order
 to find each other
 -->
 <!-- <attribute name="ClusterName">TreeCache-Cluster</attribute> -->

 <!-- JGroups protocol stack properties. Can also be a URL,
 e.g. file:/home/bela/default.xml
 <attribute name="ClusterProperties"></attribute>
 -->

 <attribute name="ClusterConfig">
 <config>
 <!-- UDP: if you have a multihomed machine,
 set the bind_addr attribute to the appropriate NIC IP address, e.g bind_addr="192.168.0.2"
 -->
 <!-- UDP: On Windows machines, because of the media sense feature
 being broken with multicast (even after disabling media sense)
 set the loopback attribute to true; original mcast port: 48866
 original mcast address: 228.1.2.3 -->
 <UDP mcast_addr="228.1.2.3" mcast_port="48866" bind_addr="1.5.28.121"
 ip_ttl="64" ip_mcast="true"
 mcast_send_buf_size="150000" mcast_recv_buf_size="80000"
 ucast_send_buf_size="150000" ucast_recv_buf_size="80000"
 loopback="false"/>
 <PING timeout="2000" num_initial_members="3"
 up_thread="false" down_thread="false"/>
 <MERGE2 min_interval="10000" max_interval="20000"/>
 <!-- <FD shun="true" up_thread="true" down_thread="true" />-->
 <FD_SOCK/>
 <VERIFY_SUSPECT timeout="1500"
 up_thread="false" down_thread="false"/>
 <pbcast.NAKACK gc_lag="50" retransmit_timeout="600,1200,2400,4800"
 max_xmit_size="8192" up_thread="false" down_thread="false"/>
 <UNICAST timeout="600,1200,2400" window_size="100" min_threshold="10"
 down_thread="false"/>
 <pbcast.STABLE desired_avg_gossip="20000"
 up_thread="false" down_thread="false"/>
 <FRAG frag_size="8192"
 down_thread="false" up_thread="false"/>
 <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
 shun="true" print_local_addr="true"/>
 <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
 </config>
 </attribute>


 <!--
 Whether or not to fetch state on joining a cluster
 -->
 <attribute name="FetchStateOnStartup">true</attribute>

 <!--
 The max amount of time (in milliseconds) we wait until the
 initial state (ie. the contents of the cache) are retrieved from
 existing members in a clustered environment
 -->
 <attribute name="InitialStateRetrievalTimeout">30000</attribute>

 <!--
 Number of milliseconds to wait until all responses for a
 synchronous call have been received.
 -->
 <attribute name="SyncReplTimeout">15000</attribute>

 <!-- Max number of milliseconds to wait for a lock acquisition -->
 <attribute name="LockAcquisitionTimeout">10000</attribute>

 <!-- Name of the eviction policy class. -->
 <attribute name="EvictionPolicyClass"></attribute>

 <!--
 Indicate whether to use marshalling or not. Set this to true if you are running under a scoped
 class loader, e.g., inside an application server. Default is "false".
 -->
 <attribute name="UseMarshalling">true</attribute>

 <!--
 <attribute name="CacheLoaderClass">org.jboss.cache.loader.bdbje.BdbjeCacheLoader</attribute>
 <attribute name="CacheLoaderConfig">
 location=c:\\tmp\\bdbje
 </attribute>
 <attribute name="CacheLoaderShared">true</attribute>
 <attribute name="CacheLoaderPreload">/</attribute>
 <attribute name="CacheLoaderPassivation">false</attribute>
 -->

 <!--
 <attribute name="CacheLoaderClass">org.jboss.cache.loader.FileCacheLoader</attribute>
 <attribute name="CacheLoaderConfig">
 location=c:\\tmp
 </attribute>
 <attribute name="CacheLoaderShared">true</attribute>
 <attribute name="CacheLoaderPreload">/</attribute>
 <attribute name="CacheLoaderPassivation">false</attribute>
 -->




 </mbean>


 <!-- Uncomment to get a graphical view of the TreeCache MBean above -->
 <!-- <mbean code="org.jboss.cache.TreeCacheView" name="jboss.cache:service=TreeCacheView">-->
 <!-- <depends>jboss.cache:service=TreeCache</depends>-->
 <!-- <attribute name="CacheService">jboss.cache:service=TreeCache</attribute>-->
 <!-- </mbean>-->


</server>

We are facing the following problems:

1) From time to time WebLogic "freezes" (in BEA terms: a t3 call gets no answer at all, not even a remote exception).
We hope this problem will disappear once we migrate to JBossCache 1.4.0 / WebLogic 9.1 / JDK 1.5.

2) OutOfMemory exception. Everything works fine for a few days, then we see many full garbage collections, then memory usage explodes.
Our application has a sort of "periodic load": many users connect in the morning, a few disconnect during lunch time, they reconnect in the afternoon, and they disconnect in the evening. During the night we clean everything up (and from what we see, we are almost sure that the cleanup works).

We ran a lot of stress tests, but we never saw this kind of problem during them.

We are trying to investigate the OOM problem, but I would like to know whether our JBossCache configuration file could be responsible for the memory trouble (I read something relating the "FC fast" messaging configuration to memory trouble...).

TIA
matabo

1. Re: OutOfMemoryException

belaban Aug 24, 2006 3:22 AM (in response to matabo)

If you make a lot of updates to the cache from a lot of clients, the JGroups config you posted will not be able to keep up with distributed message garbage collection. I therefore suggest you replace your config with the contents of fc-fast-minimalthreads.xml (between the <config> and </config> elements) from the JGroups source distribution.
I also suggest upgrading to a newer version of JGroups, if you can move to JDK 1.4 or 5.
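
As a rough illustration of the kind of change this advice points at (not the actual contents of fc-fast-minimalthreads.xml, which should be copied from the JGroups source distribution), the sketch below shows the two elements most relevant to message garbage collection: a size-based trigger on pbcast.STABLE and an FC (flow control) protocol, which the posted stack does not contain. The attribute values and the position of FC in the stack are assumptions chosen for illustration only; check them against the reference file for your JGroups version.

```xml
<config>
  <!-- UDP, PING, MERGE2, FD_SOCK, VERIFY_SUSPECT, pbcast.NAKACK and UNICAST
       stay as in the posted configuration. -->

  <!-- Assumed value: max_bytes starts a stability round (and so lets members
       purge messages that everyone has received) after roughly this many
       bytes, instead of relying only on the 20-second gossip interval. -->
  <pbcast.STABLE desired_avg_gossip="20000" max_bytes="400000"
                 up_thread="false" down_thread="false"/>

  <!-- FRAG and pbcast.GMS stay as in the posted configuration. -->

  <!-- Assumed values: FC gives each sender a budget of credits (in bytes) and
       blocks it once the budget is used up, until receivers send more credits.
       This keeps fast senders from flooding slower receivers. -->
  <FC max_credits="2000000" min_threshold="0.10"
      down_thread="false" up_thread="false"/>

  <!-- pbcast.STATE_TRANSFER stays as in the posted configuration. -->
</config>
```

Both nodes need the same protocol stack, so a change like this would have to be rolled out to the two WebLogic instances together.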
2. Re: OutOfMemoryException

matabo Aug 24, 2006 4:31 AM (in response to matabo)

Thank you for your info, Bela.

We'll give fc-fast-minimalthreads a try. I'll post here if I have any news.

regards
matabo