5 Replies Latest reply on Nov 28, 2006 10:32 AM by bill.burke

    Zero turnaround Java/JBoss

    bill.burke

      Had an IM conversation with Gavin on "Zero turnaround Java". Basically the idea centers around a development turnaround time closer to that of PHP. I thought about a few ways we could get as close to Zero-Turnaround as possible. Here is a breakdown of possible tasks/features we could tackle step by step:

      * A real "redeploy" event that reuses the DeploymentUnit/Context. In this case deployment metadata could be reused if metadata sources (.class or .xml files) have not changed on disk within the deployment. To make this happen, the deployment unit is going to have to remember all sources of metadata within the deployment. EJB for example would be a set of ejb-jar.xml, jboss.xml, bean class (and superclasses) as well as business interfaces as all these things are sources of metadata.

      * This point makes me wonder if the *redeploy* event should pass a list of changed files within the deployment unit path.

      * Hot Swap is useless. Yes it is neat, but since schema changes aren't allowed, it's not going to cover 100% of cases. IMO, any Zero Turnaround solution needs to cover 100% of use cases, otherwise there will be a lot of user confusion when things go bad.

      * Since Hot Swap is not viable, we will need to create a new classloader on redeployment to load changed classes. I'm guessing we can greatly speed up classloading on redeployment if we cache resources. The idea is that a classloader has a resource cache, and on redeployment the ClassLoader of the DeploymentUnit hands off its resource cache to a newly created ClassLoader. This way the new ClassLoader does not have to go to disk to load the .class file and can just get things from memory. Not sure how much this would speed things up, but this idea can be used for my next point. BTW, I think running in a debug session of most IDEs would allow for non-schema based code changes at runtime anyways.

      * When I've used PHP or another scripting language, the coolest thing for me was that there was no build/compilation step and you had WYSIWYG development as the source files were directly in your deployment path. I think we can do the same for Java. The idea would be to have a ".java" based ClassLoader. When a class needs to be resolved, the ClassLoader would look in its classpath for .java files or look in a memory cache for compiled bytecode. I took a look at Eclipse's JDT, the compiler that Tomcat uses to compile JSP source. It's pretty cool. You have total control over where to resolve source and bytecode when compiling something.

      A scanner would watch all ".java" files within a deployment. If one changed it would cause a redeployment event to happen. The resource-cache-based ClassLoader I talked about above could be used to cache compiled .java source. This would make the JDT compilation step much faster as it would only require a recompiling of modified .java files.

      * A blog Gavin mentioned to me talked about being able to cache an HTTP Session (or SFSB Session) across a redeploy. This enables Zero-turnaround even further, as at runtime you can make a modification to your web application or EJB right in the middle of testing a web app. The way it would work is that the HTTP Session/SFSB session would be serialized and reloaded after a new classloader was created to pick up the class changes. I think this is a cool idea, but it suffers from the same problems as Hot Swap:
      - I'm pretty sure that a schema change to a .class file will generate a different serialVersionUID for default serialization. So, we would need to use JBoss Serialization to ignore version id mismatches.
      - I don't know if a schema change (an additional field added) would break the serialized HTTP Session/SFSB Session on deserialization.
      - Because of both of these problems, I think just using the IDE to do non-schema code changes during and within a test run is good enough.
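
      The resource-cache handoff from the classloader point above could look roughly like the following. This is a minimal sketch, not JBoss code: the class name ResourceCachingClassLoader, the diskReads counter, and the handOff method are all assumptions made for illustration. It also assumes the redeploy event supplies the set of changed class names, so that stale bytecode is dropped from the cache before it is handed over.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a class loader that keeps raw .class bytes in memory
// and can hand that cache to its successor on redeployment.
class ResourceCachingClassLoader extends ClassLoader {
    private final Path classesDir;            // root of the deployment's .class files
    private final Map<String, byte[]> cache;  // class name -> bytecode
    int diskReads = 0;                        // only here to make the effect observable

    ResourceCachingClassLoader(Path classesDir, ClassLoader parent) {
        this(classesDir, parent, new ConcurrentHashMap<>());
    }

    private ResourceCachingClassLoader(Path classesDir, ClassLoader parent,
                                       Map<String, byte[]> cache) {
        super(parent);
        this.classesDir = classesDir;
        this.cache = cache;
    }

    // Returns bytecode from memory if present, otherwise reads it from disk once.
    byte[] loadBytes(String className) throws IOException {
        byte[] bytes = cache.get(className);
        if (bytes == null) {
            Path file = classesDir.resolve(className.replace('.', '/') + ".class");
            bytes = Files.readAllBytes(file);
            diskReads++;
            cache.put(className, bytes);
        }
        return bytes;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        try {
            byte[] bytes = loadBytes(name);
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    // Redeploy: create a fresh loader that inherits the cache, minus changed classes.
    ResourceCachingClassLoader handOff(Set<String> changedClassNames) {
        Map<String, byte[]> next = new ConcurrentHashMap<>(cache);
        next.keySet().removeAll(changedClassNames);
        return new ResourceCachingClassLoader(classesDir, getParent(), next);
    }
}
```

      The old loader is simply discarded after handOff; only the byte arrays survive, so every class is still re-defined by the new loader. Whether skipping the disk read saves a meaningful amount of time would have to be measured.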

        • 1. Re: Zero turnaround Java/JBoss
          gavin.king

          What is most urgently needed is:

          (1) the ability to efficiently "redeploy", under the assumption that the metamodel has not changed. So the internal implementation of the components might have changed, but there is no reason to actually restart the EJB container and the webapp.

          (2) the ability to do this without losing the state of existing SFSB instances and the HttpSession. The slowest part of the compile-deploy-test cycle today is logging back into the application and getting to the page you want to test.

          I'm pretty sure that a schema change to a .class file will generate a different serialVersionUID for default serialization. So, we would need to use JBoss Serialization to ignore version id mismatches.


          Or just tell the user to add a serialVersionUID to their classes. Easy, I don't see them minding that. For SFSBs, we could probably do it for them.
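
          For reference, declaring the id explicitly looks like this. A minimal sketch: ShoppingCart and SerialVersionDemo are made-up names used only for illustration.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical session-scoped class; the name and fields are illustrative only.
class ShoppingCart implements Serializable {
    // Declared explicitly, so the stream version id no longer depends on the
    // class's fields and methods the way the computed default does.
    private static final long serialVersionUID = 1L;

    private int itemCount;
    private String customerName;
}

class SerialVersionDemo {
    // Reports the version id that Java serialization will use for a serializable class.
    // ObjectStreamClass.lookup returns null for non-serializable classes; not handled here.
    static long versionOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }
}
```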

          I don't know if a schema change (an additional field added) would break the serialized HTTP Session/SFSB Session on deserialization.


          Changing (adding/removing) a non-transient field definitely would break standard Java serialization, but there is no reason for it to need to break JBoss serialization, since there is an incredibly natural and trivial way you can define this:

          (1) if a field has been added, set it to Java language default
          (2) if a field has been removed, throw away its state
          (3) if the field type has changed, try to typecast

          This is of course what dynamic languages do in the same situation.
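
          The three rules can be sketched with plain reflection. This is only an illustration under stated assumptions, not how JBoss Serialization works internally: it assumes the old state is available as a map of field name to value, that the class has a no-arg constructor, and that only a handful of type conversions are attempted. LenientRestorer and UserSession are hypothetical names.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;

// Hypothetical "new" version of a session class. The old version is assumed to
// have had: String name, int visits, String legacyFlag.
class UserSession {
    String name;
    long visits;     // type changed from int
    String theme;    // newly added
}

// Hypothetical sketch of the three rules; not JBoss Serialization's actual API.
class LenientRestorer {
    // Copies saved field values (field name -> value) onto a fresh instance.
    static <T> T restore(Class<T> type, Map<String, Object> saved)
            throws ReflectiveOperationException {
        Constructor<T> ctor = type.getDeclaredConstructor();
        ctor.setAccessible(true);
        T target = ctor.newInstance();
        for (Field field : type.getDeclaredFields()) {
            int mods = field.getModifiers();
            if (Modifier.isStatic(mods) || Modifier.isTransient(mods)) continue;
            // Rule 1: a field with no saved value keeps its Java default.
            if (!saved.containsKey(field.getName())) continue;
            Object value = convert(saved.get(field.getName()), field.getType());
            // A failed conversion on a primitive field also falls back to the default.
            if (value == null && field.getType().isPrimitive()) continue;
            field.setAccessible(true);
            field.set(target, value);
        }
        // Rule 2: saved entries with no matching field are never read, so they are dropped.
        return target;
    }

    // Rule 3: best-effort conversion for a few common cases; returns null if it gives up.
    static Object convert(Object value, Class<?> wanted) {
        if (value == null) return null;
        if (wanted.isInstance(value)) return value;
        if (wanted == boolean.class && value instanceof Boolean) return value;
        if (value instanceof Number) {
            Number n = (Number) value;
            if (wanted == int.class || wanted == Integer.class) return n.intValue();
            if (wanted == long.class || wanted == Long.class) return n.longValue();
            if (wanted == double.class || wanted == Double.class) return n.doubleValue();
        }
        if (wanted == String.class) return value.toString();
        return null;
    }
}
```

          Superclass fields and nested object graphs are left out to keep the sketch short; a real implementation would have to walk both.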

          Because of both of these problems, I think just using the IDE to do non-schema code changes during and within a test run is good enough.


          This is definitely not true. The problem with using IDEs and HotSwap to debug JBoss is that as soon as a single schema change occurs, HotSwap stops working until you restart the whole JBoss process. Redeploying the webapp is NOT enough to get HotSwap working again, because the IDE does not see that as a class reload.

          Try this in Eclipse. As soon as you make one single schema change to any class, HotSwap is hosed until you restart JBoss. Of course, this is totally crap, but that tells you how crap HotSwap is.

          • 2. Re: Zero turnaround Java/JBoss
            bill.burke

             

            Try this in Eclipse. As soon as you make one single schema change to any class, HotSwap is hosed until you restart JBoss. Of course, this is totally crap, but that tells you how crap HotSwap is.


            Again, I hope you realize that if HotSwap is not used, this requires a recycling of classloaders of the component as well as any classloaders of dependent components. For instance, change an @Entity bean, and you have to cycle the classloader of EJBs that reference it, then the classloaders of the WARs that reference the EJBs (if you're using @Local).
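
            To make the cascade concrete, here is a small sketch that computes which classloaders need recycling when one deployment changes. The deployment names and the shape of the dependency map are assumptions for illustration; nothing here reflects an existing JBoss API.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: deployment names and the map shape are illustrative.
class RedeployPlanner {
    // 'dependents' maps a deployment to the deployments that reference its classes.
    // Returns the changed deployment plus everything that transitively depends on it,
    // in breadth-first order.
    static Set<String> loadersToRecycle(String changed, Map<String, List<String>> dependents) {
        Set<String> affected = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(changed);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (!affected.add(current)) continue;   // already scheduled
            List<String> next = dependents.getOrDefault(current, Collections.<String>emptyList());
            queue.addAll(next);
        }
        return affected;
    }
}
```

            With a map where an entity archive is referenced by an EJB jar, and that jar by a WAR, changing the entity archive yields all three, which is the chain described above.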

            (1) the ability to efficiently "redeploy", under the assumption that the metamodel has not changed. So the internal implementation of the components might have changed, but there is no reason to actually restart the EJB container and the webapp.


            I hope you realize that this isn't some cross-cutting functionality that you can just snap on. Since classloaders have to be recycled, this means all affected containers (web, ejb, and hibernate) have to do some re-initialization. For instance, the EJB container references tons of Class, Method and Field objects that will need to be recycled. Maybe what we could do is write a parallel reflection hierarchy that just delegates to real reflection objects. This reflection model could register itself with the classloader of the real reflection object. The classloader could ask registered reflection objects to recycle themselves upon classloader recycling. (Say that 5 times fast).
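
            A rough sketch of that parallel reflection idea: the wrapper stores names rather than Class/Method instances, so it can be re-resolved when the classloader is replaced. RecyclableMethod and ReflectionRegistry are hypothetical names, the registry stands in for whatever hook the real classloader would offer, and parameter types are limited to class names (no primitives) to keep it short.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical "parallel reflection" object: delegates to a real Method, but can
// be pointed at a new class loader without the container replacing its reference.
class RecyclableMethod {
    private final String className;
    private final String methodName;
    private final String[] parameterTypeNames;  // class names only; primitives not handled
    private Method resolved;

    RecyclableMethod(String className, String methodName, String... parameterTypeNames) {
        this.className = className;
        this.methodName = methodName;
        this.parameterTypeNames = parameterTypeNames;
    }

    // Re-resolves the underlying Method against the given loader.
    void recycle(ClassLoader loader) throws ReflectiveOperationException {
        Class<?> owner = Class.forName(className, false, loader);
        Class<?>[] params = new Class<?>[parameterTypeNames.length];
        for (int i = 0; i < params.length; i++) {
            params[i] = Class.forName(parameterTypeNames[i], false, loader);
        }
        resolved = owner.getMethod(methodName, params);
    }

    Object invoke(Object target, Object... args) throws ReflectiveOperationException {
        if (resolved == null) throw new IllegalStateException("recycle() has not been called");
        return resolved.invoke(target, args);
    }
}

// Stand-in for the class loader side: remembers registered reflection objects
// and asks each of them to recycle when a new loader takes over.
class ReflectionRegistry {
    private final List<RecyclableMethod> registered = new ArrayList<>();

    void register(RecyclableMethod method) {
        registered.add(method);
    }

    void recycleAll(ClassLoader newLoader) throws ReflectiveOperationException {
        for (RecyclableMethod method : registered) {
            method.recycle(newLoader);
        }
    }
}
```

            The container keeps its RecyclableMethod references across redeployments; only the Method inside changes. Existing instances created from the old classes would still need to be replaced separately.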

            BTW, Each component type is going to need to be seriously refactored to support optimized redeployment. (BTW, with my measurements with E-EJB3, Hibernate is the biggest hog at boottime).

            What it boils down to is that there's TONS and TONS of work to do before we can even think about adding HTTP Session/ SFSB serialization (and I think this idea is just a toy anyways...)

            • 3. Re: Zero turnaround Java/JBoss
              slaboure

              Concerning the "clean redeploy" of classes, instead of relying on hotswap behaviour or similar, why not use a simpler approach where we would load the NEW classes in a new classloading tree and start routing new requests to that class tree once it has been properly loaded? The trick is to make sure that the SAME "containers" and services are able to work on multiple class trees (and, for example, keep a single version of a cache, or a lock table, or a transaction table, etc. => services and containers would rely on specific ID, not on specific class instances)

              That way, if you have a long-running request being processed, this request can continue in parallel with new requests. The old request would keep using the old class definitions while new requests could use the new class definitions. That could also speed up app re-deployment A LOT: there wouldn't be this huge gap anymore between the undeploy of version 1 and the total redeploy of version 2: instead, any requests would be able to keep using version 1 while we fully load version 2 and at that time, we can do a micro-switch that would route any incoming request to the new class tree. This could certainly be easily handled by using the per-container/per-request METADATA i.e. the class tree to use would be specified in the METADATA and used dynamically at runtime by the containers/services.

              That means we could also support, in a more "production" kind of environment, stable "switch of releases" i.e. JBoss ON could distribute a new version of an application to 50 nodes of a cluster, PREPARE the application by loading it in a new class tree and, once all nodes have successfully loaded the new tree, you atomically re-route all new requests to this new app => standby mode for pre-loaded apps. This scenario is probably slightly more complex as it would probably require a change in metadata, hence is possibly a different scenario.
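
              On a single node, the "micro-switch" could be as small as an atomic reference swap. A minimal sketch, assuming each request captures the active version once at the start and uses it until it completes; AppVersion and VersionRouter are hypothetical names, and the cluster-wide coordination is not covered.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: one fully loaded version of the application.
final class AppVersion {
    final String label;
    final ClassLoader classTree;   // root of this version's class tree

    AppVersion(String label, ClassLoader classTree) {
        this.label = label;
        this.classTree = classTree;
    }
}

// Routes new requests to whichever version is currently active.
class VersionRouter {
    private final AtomicReference<AppVersion> active;

    VersionRouter(AppVersion initial) {
        active = new AtomicReference<>(initial);
    }

    // Each request captures the active version once and keeps using it until it finishes.
    AppVersion beginRequest() {
        return active.get();
    }

    // The switch itself: call only after the new class tree is fully loaded.
    // Returns the previous version so it can be retired once its requests drain.
    AppVersion switchTo(AppVersion prepared) {
        return active.getAndSet(prepared);
    }
}
```

              In-flight requests keep the AppVersion they captured, so they finish on the old class tree while new requests pick up the new one. Deciding when the old version can be released (no requests still holding it) is left out here.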

              • 4. Re: Zero turnaround Java/JBoss
                starksm64

                Provided that components are properly defined with their dependencies, one should be able to start with a component such as a servlet and transition back to a state where the class loader is defined and then roll forward with the same metadata. We are a long way from having this level of integration between the deployers and mc such that we have the complete object graph described to the mc along with dependencies and install/uninstall actions.

                There is also a disconnect with how the class loader is defined at the deployer layer rather than via a factory that can be re-run to recreate the class loader (I believe), not to mention tracking dependencies based on type usage associated with the class loader.

                I believe looking at copying a deployment object graph along the lines of what Sacha is indicating is where we want to get to first.

                • 5. Re: Zero turnaround Java/JBoss
                  bill.burke

                   

                  "sacha.labourey@jboss.com" wrote:
                  Concerning the "clean redeploy" of classes, instead of relying on hotswap behaviour or similar, why not use a simpler approach where we would load the NEW classes in a new classloading tree and start routing new requests to that class tree once it has been properly loaded?


                  This idea only addresses the use case where zero "metadata-sources" have changed on disk. By a metadata-source I mean an ejb-jar.xml file, a bean class, or a business interface. If a metadata-source changes, you're gonna have to be real careful on how changes are merged with the old metadata.



                  The trick is to make sure that the SAME "containers" and services are able to work on multiple class trees (and, for example, keep a single version of a cache, or a lock table, or a transaction table, etc. => services and containers would rely on specific ID, not on specific class instances)

                  That way, if you have a long-running request being processed, this request can continue in parallel with new requests. The old request would keep using the old class definitions while new requests could use the new class definitions.

                  That could also speed up app re-deployment A LOT: there wouldn't be this huge gap anymore between the undeploy of version 1 and the total redeploy of version 2: instead, any requests would be able to keep using version 1 while we fully load version 2 and at that time, we can do a micro-switch that would route any incoming request to the new class tree. This could certainly be easily handled by using the per-container/per-request METADATA i.e. the class tree to use would be specified in the METADATA and used dynamically at runtime by the containers/services.

                  That means we could also support, in a more "production" kind of environment, stable "switch of releases" i.e. JBoss ON could distribute a new version of an application to 50 nodes of a cluster, PREPARE the application by loading it in a new class tree and, once all nodes have successfully loaded the new tree, you atomically re-route all new requests to this new app => standby mode for pre-loaded apps. This scenario is probably slightly more complex as it would probably require a change in metadata, hence is possibly a different scenario.


                  You're talking more about applications in production, while this thread is focused more on zero turnaround in development. For this case, it doesn't matter much how fast the creation of the new metadata (or container) is, just that the swap from the old version to the new is superfast and seamless to the application.

                  What this thread is discussing is getting closer to the zero turnaround time of Rails in development so that Java EE/JBoss can be used in RAD environments more viably and closer to the productivity that Rails brags about (and so that people don't turn to a scripting language to do development).