- 
        1. Re: About the UseParallelOldGC | peterj | Jun 19, 2009 1:29 PM (in response to cytchiu)
 I have found the same strange behavior: running parallel GC threads for minor collections reduces the GC pause time, but running parallel GC threads for major collections increases it. It makes no sense, but there you have it.
 You really have to try various heap and GC settings and find the best settings for your app. Have you considered the CMS collector? With the number of CPUs you have, it might be a good option.
 See this presentation:
 http://www.cecmg.de/doc/tagung_2007/agenda07/24-mai/2b3-peter-johnson/index.html
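 One rough way to compare settings is to run the same workload under each one and read the collector totals the JVM reports about itself. The sketch below is only an illustration: the class name `GcTimes` and the allocation loop are hypothetical stand-ins for a real application, and the reported time is the accumulated collection time per collector, not the length of individual pauses.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of any library.
class GcTimes {
    // Maps each collector name to {collection count, accumulated collection time in ms}.
    // Either value may be -1 if the JVM does not report it for that collector.
    static Map<String, long[]> snapshot() {
        Map<String, long[]> result = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.put(gc.getName(), new long[] { gc.getCollectionCount(), gc.getCollectionTime() });
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in workload (an assumption): mostly short-lived garbage,
        // with a small fraction retained so some objects can be promoted.
        List<byte[]> retained = new ArrayList<>();
        for (int i = 0; i < 200_000; i++) {
            byte[] chunk = new byte[1024];
            if (i % 100 == 0) {
                retained.add(chunk);
            }
        }
        System.out.println("retained chunks: " + retained.size());

        for (Map.Entry<String, long[]> e : snapshot().entrySet()) {
            System.out.printf("%s: %d collections, %d ms total%n",
                    e.getKey(), e.getValue()[0], e.getValue()[1]);
        }
    }
}
```

 Run it once per candidate setting, for example with `-XX:+UseParallelOldGC -XX:ParallelGCThreads=4` and then with `-XX:+UseConcMarkSweepGC`, and compare the totals. Newer JVMs may no longer accept those flags.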
- 
        2. Re: About the UseParallelOldGC | cytchiu | Jun 23, 2009 10:13 PM (in response to cytchiu)
 Is there any possible reason why the multi-threaded ParallelOldGC performs worse than the single-threaded one?
- 
        3. Re: About the UseParallelOldGC | peterj | Jun 24, 2009 10:08 AM (in response to cytchiu)
 Lock contention over a common object, such as the free memory list? I really have not had time to investigate it.
- 
        4. Re: About the UseParallelOldGC | cytchiu | Jun 24, 2009 10:37 PM (in response to cytchiu)
 Thank you, Peter. Could I put it this way?
 The problem typically happens when there are too many ParallelOldGC
 threads in the process and the old generation is too small. This
 results in excessive work stealing between the GC threads and this
 work stealing bangs on a lock. Too many ParallelOldGC threads without
 enough old space to carve up between them result in this work stealing
 pathology.
- 
        5. Re: About the UseParallelOldGC | peterj | Jun 25, 2009 11:57 AM (in response to cytchiu)
 I have a quad-core, and for my testing I used a 1GB heap (I did not specify a young gen size, but I believe the JVM never set it to more than 100M). When using multiple tenured GC threads, the JVM splits the tenured generation into sections and lets each thread clean its own section to minimize contention. So I had 4 threads cleaning about 200MB each. You, of course, had 8 threads, so your lock contention is higher. But I read a very interesting paper the other day about cache coherency between the L2 caches in the CPUs causing a significant performance drop when running a multi-threaded app, so I'm wondering if that could be a reason. Of course, I'd need VTune to track that down.
- 
        6. Re: About the UseParallelOldGC | cytchiu | Jun 25, 2009 10:40 PM (in response to cytchiu)
 I want to understand more about the point about 'lock contention' on the free memory list.
 In your case, you have a 1GB heap with 4 cores. So, assuming your young gen size is 100MB, the old gen is around 900MB, and each core's share is 900MB / 4 = around 225MB.
 If I use 8 cores, each core's share is about 113MB.
 Does lock contention occur because each thread is working on too small a slice of the old gen?
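 The arithmetic above can be checked with a small sketch. It assumes, as the post does, that the old generation is split evenly across the GC threads; whether the collector really divides its work that way is not confirmed anywhere in this thread. The class and method names are hypothetical.

```java
// Hypothetical helper for the back-of-the-envelope numbers in the post.
class OldGenShare {
    // Even split of the old generation across GC threads.
    // The even split is the post's assumption, not confirmed JVM behavior.
    static double perThreadMb(double oldGenMb, int gcThreads) {
        if (gcThreads <= 0) {
            throw new IllegalArgumentException("gcThreads must be positive");
        }
        return oldGenMb / gcThreads;
    }

    public static void main(String[] args) {
        // 1GB heap minus roughly 100MB of young gen, rounded to 900MB as in the post.
        double oldGenMb = 900.0;
        for (int threads : new int[] { 4, 8 }) {
            System.out.printf("%d threads: %.1f MB each%n",
                    threads, perThreadMb(oldGenMb, threads));
        }
    }
}
```

 With 900MB of old gen this gives 225MB per thread for 4 threads and 112.5MB for 8, which matches the rounded 113MB figure.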
- 
        7. Re: About the UseParallelOldGC | peterj | Jun 26, 2009 12:02 PM (in response to cytchiu)
 You'll notice the question mark after my statement about the free memory list. That means I don't know; I am just guessing, and my guess could be completely off. I also stated that I had not had time to look into why the parallel old GC runs slowly. So asking me to explain it is futile, because I have no answers. As I stated earlier, the best thing you can do is try several different GC mechanisms and use the one that works best for you. If you are really concerned about the parallel old GC performance, you should take that up with Sun; after all, it's their JVM and their code.
 
    