9 Replies Latest reply on Jan 29, 2006 12:33 AM by jason.greene

    Thread Safety Issues With UnifiedMetaDataModel

    jason.greene

      We currently share our UMD in multiple threads. This works fine because it is constructed during the deployment process which happens in a single thread. However, problems arrise when we add lazy loading into the UMD.

      Currently OperationMetaData.getJavaMethod() and ParameterMetaData.getJaveType() lazy load class types. There is the potential for a race here, although i believe it is most likely a safe race. Since classloader is threadsafe the only potential race is a duplicate assign of the class type cache value.

      There is another lazy loading condition in ParameterWrapping.getWrappedVariables() that is definately not thread safe.

      I also need to add lazy initialization for wrapper types, which would be unsafe as well.

      I can think of six possible solutions to the thread safety issues of the UMD.

      1. We add a second initialization phase that is called from start() of our endpoint. This will solve our classloader problem with Annotation deployments. All lazily loaded data must be loaded here
      2. We make a copy of the UMD for every invocation
      3. We put these lazy values in thread locals
      4. We synchronize access to the lazy initialized data
      5. We mark the lazy initialized data as volatile (although this will only work properly on JDK5, as in earlier versions races were still possible)
      6. We use the threadlocal version of DLC (double lock checking), since regular DLC is broken.

      I vote for #1.

      I compiled a list of resources for those of you that arent familiar with the common Java lazy loading problems:

      http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
      http://www.javaworld.com/jw-02-2001/jw-0209-double.html
      http://www.javaworld.com/javaworld/jw-11-2001/jw-1116-dcl.html
      http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html

      -Jason

        • 1. Re: Thread Safety Issues With UnifiedMetaDataModel
          thomas.diesler

          Here is another one

          Question of the month: What does volatile do?
          http://www.javaperformancetuning.com/news/qotm030.shtml

          We only target jdk1.5, so I think volatile will do. If I am not mistaken.

          • 2. Re: Thread Safety Issues With UnifiedMetaDataModel
            jason.greene

            Yes volatile will work on JDK 1.5, however there are disadvantages with this solution:

            1. If we retroweave jbossws to run on JDK1.4, the 1.5 semantics of volatile can't be converted since this is a low level behavior of the VM (JMM).
            2. volatile requires a memory barrier before every read and write operation. This means that everytime you touch a volatile value, it triggers a cpu cache flush. So basically the cost of volatile on jdk 5 is very close to the cost of using synchronize.

            IMO we should try and avoid synchronization in the UMD, especially since these values are read on every request.

            The advantage of option 1 is that we take the penalty of initialization only on service startup, and we no longer have to worry about thread safety. Also, the classloader problems are no longer an issue.

            -Jason

            • 3. Re: Thread Safety Issues With UnifiedMetaDataModel
              thomas.diesler

              Lets not bee too academic about this. If I understand correctly, we are talking about the situation when an endpoint is started and more than one client make the initial request. Then and only then we can have a possible race condition that both client request initialize some of the UMDM properties lazily.

              We should not ignore this unlikely case, however data integrity should be our only concern - not that the work is possibly be done twice.

              i.e.

              A class property is not critical, a collection property should not get initialized with more items than it should.

              In that light is it realy necessary to synchronize anything?

              • 4. Re: Thread Safety Issues With UnifiedMetaDataModel
                jason.greene

                Yes without synchronization the class properties are safe, and the performance cost of a collision is minimal. The collection properties are not thread safe, and have a strong potential for corruption. The reason this can happen has to do with instruction ordering optimizations that the jvm can/will perform in a thread. The scenario of concern is when the collection is assigned to the instance field *before* it is initialized. Even though the code has this happening as the last step, the optimizer is free to do this first. So a null check performed by another thread would fail in this scenario.

                Practically this means if two requests hit an endpoint at the same time, there is a chance that one of the threads will receive an incomplete collection.

                The main reason I brought this up is to address our contract of the UMD as it pertains to thread safety. Besides fixing the existing collection access from Paramater.getWrappedParameters(), I have an instance where I would like to make use of the wrapper type generation that I put in place for tools in the JSR-181 deployer, and this code needs to happen in a thread safe manner (cglib can not be accessed in multiple threads). My plan is to place this initialization step in endpoint start() since the correct classloader is available and it's thread safe. This got me to thinking maybee we should be eageraly initializing everything else at this point (instead of having some things initialized lazily within the UMD, and some external).

                IMO, No matter the solution we do need to establish and document the contract for the UMD since it is shared between threads, and it will help future contributors be aware of its correct use.

                It may seem like these kinds of conditions are unlikely but we did have a similar scenario to this occur in the 4.0.1 where a DOM element was placed inside a metadata object, and it lead to corruption that was reported by a user.

                -Jason

                • 5. Re: Thread Safety Issues With UnifiedMetaDataModel
                  thomas.diesler

                   


                  Even though the code has this happening as the last step, the optimizer is free to do this first.


                  I did not know that. In that case the optimizer changes the behaviour intended by the developer.

                  Yes, lifecycle start would then be the right place to do eager initialization. How about having an init cascade on the UMDM, so start only needs to call a single method?



                  • 6. Re: Thread Safety Issues With UnifiedMetaDataModel
                    jason.greene

                     

                    "thomas.diesler@jboss.com" wrote:

                    Even though the code has this happening as the last step, the optimizer is free to do this first.


                    I did not know that. In that case the optimizer changes the behaviour intended by the developer.


                    Yes, the only requirement is that the optimizer must gaurantee the outcome is the same, but only for a signle thread. The big reason why this is there is to allow for efficent processing in an SMP system. So, when a thread is bound to a CPU it will stick values in registers or L1 cache to avoid slower access to main memory which must be synchronized across all cpus. Instruction reordering can be used to achive the most optimal use of registers. Also some archetectures can bundle certain instructions (Itanium), and reordering allows for the most efficient use of this feature.

                    One of the most scary scenarios that can happen in Java (or C++) is your constructor can be called *after* the object reference is set. So one thread gets an uninitialized java object! This is why the double check lock algorithm fails.


                    "thomas.diesler@jboss.com" wrote:

                    Yes, lifecycle start would then be the right place to do eager initialization. How about having an init cascade on the UMDM, so start only needs to call a single method?


                    Yes this sounds good. Should I go ahead with that?

                    -Jason



                    • 7. Re: Thread Safety Issues With UnifiedMetaDataModel
                      thomas.diesler

                      Yes, please do.

                      • 8. Re: Thread Safety Issues With UnifiedMetaDataModel
                        jason.greene

                        I just noticed that the UMD contains a reference to wsdl4j Definition objects. These are not thread safe (even with read-only access) because they contain DOM objects.

                        Only deployment, and http wsdl request handling accesses these objects at the moment. The former is fine since this happens in a single thread, the latter is a problem.

                        -Jason

                        • 9. Re: Thread Safety Issues With UnifiedMetaDataModel
                          jason.greene

                          I added a JIRA entry for fixing this:

                          http://jira.jboss.com/jira/browse/JBWS-674