13 Replies Latest reply on Oct 25, 2013 11:54 AM by vasilievip

    Hang in ValueFactory.createBinary()

    jonathandfields

      Hi,

       

      I'm invoking ValueFactory.createBinary() and am encountering a hang. This is occurring in ModeShape 3.3.0 and EAP 6.

       

      An SLSB (transaction/session) is invoked repeatedly every few seconds, and within it the following code is called. (The test case polls certain properties on a node to determine when they change.)

       

      if (node.hasProperty("returnValue")) {
           Binary binary = node.getProperty("returnValue").getBinary();
           ObjectInputStream ois = new ObjectInputStream(binary.getStream());
           Object returnValue = ois.readObject();
           // ... do something with returnValue
      }
      

       

      At some point while this is happening, the following code is called from another SLSB (transaction/session):

       

                              
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      ObjectOutputStream oos = new ObjectOutputStream(buffer);
      oos.writeObject(mySerializableObject);
      oos.close();
      Binary binary = session.getValueFactory().createBinary(new ByteArrayInputStream(buffer.toByteArray())); // <--- HANG
      node.setProperty("returnValue", binary);
      

       

      This latter call hangs and here is the stack trace from jstack:

      "EJB default - 9" prio=3 tid=0x0000000002d36000 nid=0xca waiting on condition [0xfffffd7d9e4f6000]
         java.lang.Thread.State: WAITING (parking)
              at sun.misc.Unsafe.park(Native Method)
              - parking to wait for   (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
              at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
              at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
              at org.modeshape.jcr.value.binary.NamedLocks$NamedLock.lock(NamedLocks.java:196)
              at org.modeshape.jcr.value.binary.NamedLocks.lock(NamedLocks.java:91)
              at org.modeshape.jcr.value.binary.NamedLocks.writeLock(NamedLocks.java:54)
              at org.modeshape.jcr.value.binary.FileSystemBinaryStore.saveTempFileToStore(FileSystemBinaryStore.java:161)
              at org.modeshape.jcr.value.binary.FileSystemBinaryStore.storeValue(FileSystemBinaryStore.java:128)
              at org.modeshape.jcr.value.binary.BinaryStoreValueFactory.create(BinaryStoreValueFactory.java:253)
              at org.modeshape.jcr.value.binary.BinaryStoreValueFactory.create(BinaryStoreValueFactory.java:57)
              at org.modeshape.jcr.JcrValueFactory.createBinary(JcrValueFactory.java:147)
              at org.modeshape.jcr.JcrValueFactory.createBinary(JcrValueFactory.java:47)

       

      FWIW, I have ISPN locking set to PESSIMISTIC.

       

      Any suggestions? I'll probably try this with 3.5.0 but thought I'd also ask here in case this is a known issue.

       

      Thanks! Jon

       


        • 1. Re: Hang in ValueFactory.createBinary()
          vasilievip

          What is your environment? Linux? What is JDK version?

          There was an issue in older JDKs related to this: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370

          • 2. Re: Hang in ValueFactory.createBinary()
            jonathandfields

            Solaris 10

            JDK 1.7.0_25

             

            From the bug report it appears that this was fixed in Java 7...

            • 3. Re: Re: Hang in ValueFactory.createBinary()
              vasilievip

              Jonathan Fields wrote:

               

              From the bug report it appears that this was fixed in Java 7...

              Yep, the fix should be in place. You can check by trying the workaround from the bug report; if it helps, then the bug is still present:

               

              Ignore the other suggested workarounds. The true workaround here is to specify -XX:+UseMembar as indicated in the public comments.

              This has the side-effect of placing a memory barrier into the path on which it is currently missing.

              • 4. Re: Re: Hang in ValueFactory.createBinary()
                jonathandfields

                Added -XX:+UseMembar. Failed (hung) on the second try.

                 

                If I remove the concurrent access by eliminating the thread that is repeatedly getting the Binary property, the hang goes away. I also lowered the frequency with which that thread was polling so it was much less likely that the two threads were concurrently accessing the node/property and experienced a hang. It is almost as if any attempt to get the Binary property before it has been created sometimes causes the creating thread to hang.

                 

                I'm going to upgrade to ModeShape 3.6.0 to make sure I am running the latest and greatest and then proceed from there.

                • 5. Re: Re: Hang in ValueFactory.createBinary()
                  rhauch

                  Are you using the same session in both threads? Also, are the thread(s) reading the binary value(s) and the threads writing the binary value(s) ever working with the same binary content (e.g., the same SHA-1 hash)?

                  • 6. Re: Re: Hang in ValueFactory.createBinary()
                    jonathandfields

                    Separate threads, transactions, and sessions: separate SLSBs with CMT in EAP6.

                     

                    Here is the high-level sequence of events:

                    1. A node is created, it does not have the Binary property named "returnValue" at this point, but has other (non-binary) properties. This is done in a separate tx/session.

                    2. A loop is started, invoking an SLSB that in turn calls node.hasProperty("returnValue") to see if the property exists. If it does, it then reads the InputStream. Each loop invocation is a new thread/tx/session (SLSB invocation). Exact code is above.

                    3. A few seconds later, another SLSB updates the node, adding the Binary property. It is in the call to ValueFactory.createBinary() that this SLSB (thread) experiences the hang. Code and stack trace are above.

                     

                    It appears that if I eliminate (2), there are no problems.

                     

                    I'm in the process of upgrading to 3.6.0 so I'll report back if this makes any difference. I could not find anything in JIRA that appears to be similar....

                     

                    Thanks, Jon.

                    • 7. Re: Re: Hang in ValueFactory.createBinary()
                      hchiorean

                      Unfortunately I don't think 3.6.0 will make a difference. Also:

                       

                      • ISPN locking is unrelated to this problem, as reading/writing binary values in the binary store is "orthogonal" to the ISPN cache
                      • based on your code sample, it appears the writer is hanging and the only case when that can happen is if there's another writer holding an open named lock. You should really look at the entire thread dump to make sure there aren't any other writers
                      • the lock (named lock) uses the SHA1 of the content of the file as the name of the lock, so it would seem like multiple writers are trying to write the same file from different threads.
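
                      To illustrate the last point, here is a rough sketch of content-keyed locking. It is not ModeShape's NamedLocks code: ContentKeyedLocks, sha1Hex and lockFor are hypothetical names made up for this example. It only shows why identical content maps to one shared lock.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; not the actual ModeShape implementation.
class ContentKeyedLocks {
    private final ConcurrentHashMap<String, ReentrantReadWriteLock> locks =
            new ConcurrentHashMap<>();

    /** Returns the lowercase hex SHA-1 digest of the given content. */
    static String sha1Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(content)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a standard algorithm on every Java platform
            throw new IllegalStateException(e);
        }
    }

    /** Same bytes give the same key, and therefore the same lock instance. */
    ReentrantReadWriteLock lockFor(byte[] content) {
        return locks.computeIfAbsent(sha1Hex(content), k -> new ReentrantReadWriteLock());
    }

    public static void main(String[] args) {
        ContentKeyedLocks named = new ContentKeyedLocks();
        byte[] first = "same payload".getBytes(StandardCharsets.UTF_8);
        byte[] second = "same payload".getBytes(StandardCharsets.UTF_8);
        byte[] other = "different payload".getBytes(StandardCharsets.UTF_8);

        System.out.println("identical content shares a lock: "
                + (named.lockFor(first) == named.lockFor(second)));
        System.out.println("different content shares a lock: "
                + (named.lockFor(first) == named.lockFor(other)));
    }
}
```

                      With a scheme like this, anything still holding the lock for a given hash (reader or writer) will block a later writer that stores the same bytes, even if it comes from a different session.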
                      • 8. Re: Re: Hang in ValueFactory.createBinary()
                        jonathandfields

                        You are correct, I upgraded to 3.6.0 and am still experiencing the hang. I have also reproduced it on both Linux and Solaris.

                         

                        Yes, the writer is hanging. There is only one place in the code where this particular property is written to. There is only one writer in the entire thread dump.

                         

                        If I eliminate any calls that access the node before the property is written, the hang goes away.

                         

                        I will keep working to see if I can narrow it down and possibly provide a reproducible test case.

                        • 9. Re: Re: Hang in ValueFactory.createBinary()
                          jonathandfields

                          I believe I have solved the problem. It was a combination of 1) storing the same data every time (same SHA1 and file), and 2) not calling InputStream.close() on the stream obtained from Binary.getStream().

                           

                          By making sure I call InputStream.close(), the hang goes away.

                           

                          Looking at FileSystemBinaryStore and SharedLockingInputStream, this seems to make sense. Not calling InputStream.close() results in the read lock never being released by SharedLockingInputStream. Also, in my case, I was reading serialized data, so it knew exactly how many bytes to read; EOF was never reached, and so SharedLockingInputStream.close() was never called from SharedLockingInputStream.read() (e.g. line 194, etc.).

                           

                          So, beware, not closing the InputStream can hang your entire application requiring a kill -9 of the application server to recover :-)

                          • 10. Re: Re: Hang in ValueFactory.createBinary()
                            rhauch

                            So, beware, not closing the InputStream can hang your entire application requiring a kill -9 of the application server to recover :-)

                            +1000. This is absolutely true. If your code processes the stream until the end of the available data (e.g., until the stream's "read(...)" method returns -1), then the stream will automatically close. However, if you read less than the total content OR you read exactly the right number of bytes (such that the "read(...)" method never returns -1), then the stream will NOT automatically close.

                             

                            Therefore, ALWAYS CLOSE EVERY INPUT STREAM.
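
                            A minimal, self-contained sketch of this behaviour, without ModeShape. ReadLockedStream is a hypothetical stand-in for a stream that holds a read lock until it observes end-of-stream or close() is called; the real SharedLockingInputStream may differ in detail. writerCanLock is likewise a made-up helper, and reader and "writer" share one thread here only to keep the demo short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class StreamCloseDemo {

    /** Hypothetical stand-in: holds a read lock until EOF is seen or close() is called. */
    static final class ReadLockedStream extends FilterInputStream {
        private final ReentrantReadWriteLock lock;
        private boolean released;

        ReadLockedStream(InputStream in, ReentrantReadWriteLock lock) {
            super(in);
            this.lock = lock;
            lock.readLock().lock();
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b == -1) close(); // auto-close only when EOF is actually observed
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n == -1) close();
            return n;
        }

        @Override
        public void close() throws IOException {
            try {
                super.close();
            } finally {
                if (!released) { // release the read lock at most once
                    released = true;
                    lock.readLock().unlock();
                }
            }
        }
    }

    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(value);
        }
        return buffer.toByteArray();
    }

    /** The pattern from the original post: reads one object and never closes the stream. */
    static Object readWithoutClosing(byte[] data, ReentrantReadWriteLock lock)
            throws IOException, ClassNotFoundException {
        ObjectInputStream ois = new ObjectInputStream(
                new ReadLockedStream(new ByteArrayInputStream(data), lock));
        return ois.readObject(); // consumes exactly one object, so read() never returns -1
    }

    /** Same read, but try-with-resources guarantees close() and so the lock release. */
    static Object readAndClose(byte[] data, ReentrantReadWriteLock lock)
            throws IOException, ClassNotFoundException {
        try (InputStream raw = new ReadLockedStream(new ByteArrayInputStream(data), lock);
             ObjectInputStream ois = new ObjectInputStream(raw)) {
            return ois.readObject();
        }
    }

    /** True if a writer could take the write lock right now. */
    static boolean writerCanLock(ReentrantReadWriteLock lock) {
        boolean acquired = lock.writeLock().tryLock();
        if (acquired) lock.writeLock().unlock();
        return acquired;
    }

    public static void main(String[] args) throws Exception {
        byte[] data = serialize("hello");

        ReentrantReadWriteLock leaked = new ReentrantReadWriteLock();
        readWithoutClosing(data, leaked);
        System.out.println("writer can lock after unclosed read: " + writerCanLock(leaked));

        ReentrantReadWriteLock clean = new ReentrantReadWriteLock();
        readAndClose(data, clean);
        System.out.println("writer can lock after closed read: " + writerCanLock(clean));
    }
}
```

                            Applied to the code in the original post, the same idea would mean opening binary.getStream() in a try-with-resources block (or closing it in a finally block) around the ObjectInputStream.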

                            • 11. Re: Re: Hang in ValueFactory.createBinary()
                              vasilievip

                              Randall Hauch wrote:

                              Therefore, ALWAYS CLOSE EVERY INPUT STREAM.

                              Yeah, this is a good one

                              BTW, JDK 7 already has something useful for this, and Sonar can catch such problems during the build process:

                              http://www.mkyong.com/java/try-with-resources-example-in-jdk-7/

                              http://www.sonarqube.org/sonar-2-12-in-screenshots/

                              Highly recommended!

                              • 12. Re: Hang in ValueFactory.createBinary()
                                jonathandfields

                                Would it make sense to also have SharedLockingInputStream implement finalize() to call close()?

                                 

                                For example, http://docs.oracle.com/javase/7/docs/api/java/io/FileInputStream.html#finalize() "Ensures that the close method of this file input stream is called when there are no more references to it.".

                                 

                                I totally agree that it is good practice to close the InputStream, but the penalty in this case for an oversight is pretty severe. The binary property is locked forever, and I cannot even shut down JBoss and have to "kill -9" to terminate the process because the thread is waiting forever for that lock to be released.... If finalize() were implemented then things would eventually free up.....

                                 

                                Or perhaps a timeout on the lock so if it is not acquired in a "reasonable" (yes, I know, define reasonable) amount of time it gives up?
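
                                A rough sketch of what I mean by a timeout, using only java.util.concurrent. This is not a proposed patch to ModeShape: TimedWriteLockDemo and runWithWriteLock are hypothetical names, and the 100 ms value is arbitrary.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration of a bounded wait for the write lock.
class TimedWriteLockDemo {

    /**
     * Runs the action under the write lock if the lock can be acquired within
     * the timeout. Returns false (without running the action) if it cannot.
     */
    static boolean runWithWriteLock(ReentrantReadWriteLock lock, long timeout,
                                    TimeUnit unit, Runnable action)
            throws InterruptedException {
        if (!lock.writeLock().tryLock(timeout, unit)) {
            return false; // give up instead of parking forever
        }
        try {
            action.run();
            return true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        lock.readLock().lock(); // simulate a reader that never released its lock
        boolean stored = runWithWriteLock(lock, 100, TimeUnit.MILLISECONDS,
                () -> System.out.println("storing binary"));
        System.out.println("stored while read lock is held: " + stored);

        lock.readLock().unlock(); // reader finally closes its stream
        stored = runWithWriteLock(lock, 100, TimeUnit.MILLISECONDS,
                () -> System.out.println("storing binary"));
        System.out.println("stored after read lock released: " + stored);
    }
}
```

                                The caller would still have to decide what to do on a timeout (fail the save, retry, log the lock name), but at least the thread would not wait forever.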

                                • 13. Re: Hang in ValueFactory.createBinary()
                                  vasilievip

                                  Jonathan Fields wrote:

                                  Would it make sense to also have SharedLockingInputStream implement finalize() to call close()?

                                   

                                  I do not think that this will help. Close is about freeing OS resources; finalize is about garbage collection, which you do not control. I also doubt it would be performed on shutdown, since there is no need to "gracefully collect"; one can just clean up everything.