5 Replies Latest reply on Sep 1, 2005 9:28 PM by ben.wang

    Refactoring of object graph model in TreeCacheAop

      Recently I have been busy building a couple more "real life" use cases for JBossCacheAop. The goal is to write an article promoting it! There are many possibilities after talking to people who are using it, but I have settled on two use cases, both involving network management. Both have complex circular and multiple-reference properties. In addition, they make heavy use of Collection classes (like List).

      That gave me a chance to fix a couple of problems in the Collection class proxy support. But now I have run into a fundamental problem with the underlying object graph model. Previously, we used the JBossInternal node to relocate pojos that are referenced multiple times. Here are the pros and cons:

      Pros-
      1. It is more "canonical", meaning all the parent pojos of that shared object see the same tree under JBossInternal.
      2. Locking can be applied in a more consistent manner.

      Cons-
      1. The interaction between circular and multiple references is complicated because of relocation.
      2. Relocation of pojos is expensive.
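
      To make the trade-off concrete, here is a minimal sketch of the relocation idea using plain maps. It is not the TreeCacheAop implementation: the class InternalAreaStore, its methods, and the "/JBossInternal" path string are all hypothetical stand-ins. The point it illustrates is that the shared pojo ends up in the same place regardless of which parent was stored first, at the cost of one relocation.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the relocation approach; all names here are hypothetical. */
class InternalAreaStore {
    static final String INTERNAL = "/JBossInternal"; // assumed path, for illustration only

    private final Map<String, Object> pojos = new HashMap<>();    // fqn -> pojo physically stored there
    private final Map<String, String> pointers = new HashMap<>(); // parent fqn -> internal fqn
    private int nextId = 0;
    int relocations = 0;

    /** Finds where this exact instance is stored, or null if it is not in the store. */
    private String locate(Object pojo) {
        for (Map.Entry<String, Object> e : pojos.entrySet()) {
            if (e.getValue() == pojo) return e.getKey();
        }
        return null;
    }

    void put(String fqn, Object pojo) {
        String at = locate(pojo);
        if (at == null) {            // first time we see this instance: store it in place
            pojos.put(fqn, pojo);
            return;
        }
        if (at.equals(fqn)) return;  // same location, nothing to do
        if (!at.startsWith(INTERNAL + "/")) {
            // Second reference: move the pojo to the internal area (the expensive step).
            String internalFqn = INTERNAL + "/" + nextId++;
            pojos.remove(at);
            pojos.put(internalFqn, pojo);
            pointers.put(at, internalFqn);
            relocations++;
            at = internalFqn;
        }
        pointers.put(fqn, at);
    }

    /** Where the data for this fqn really lives. */
    String resolve(String fqn) {
        return pointers.getOrDefault(fqn, fqn);
    }
}
```

      With this sketch, putting the same address under "/joe/address" and then "/mary/address" (or in the opposite order) makes both resolve to the same internal location.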

      On the other hand, if we don't use JBossInternal, we simply keep a reference count on the original pojo. Here are the pros and cons.

      Pros -
      1. No need for relocation, so it is efficient.
      2. No need to distinguish between circular and multiple references. Their interaction is vastly simplified.

      Cons -
      1. It is not canonical, i.e., the location of a shared pojo depends on the order in which the parent pojos are put into the cache.
      2. Locking behavior can be inconsistent depending on where the referenced object resides.
      3. Overwriting an existing pojo that is still referenced from elsewhere will need to throw an exception.
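
      A rough sketch of the reference-counting alternative, again with plain maps rather than the real cache: the class RefCountingStore and its methods are hypothetical. The pojo stays wherever it was first stored, and later parents only record a pointer and bump a counter. This also shows Con #1: the owning location depends on insertion order.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of reference counting on the original pojo; all names are hypothetical. */
class RefCountingStore {
    private final Map<String, Object> owned = new HashMap<>();      // fqn -> pojo stored there
    private final Map<String, String> references = new HashMap<>(); // referring fqn -> owner fqn
    private final Map<String, Integer> refCounts = new HashMap<>(); // owner fqn -> number of referrers

    /** Finds the fqn that owns this exact instance, or null. */
    private String locate(Object pojo) {
        for (Map.Entry<String, Object> e : owned.entrySet()) {
            if (e.getValue() == pojo) return e.getKey();
        }
        return null;
    }

    void put(String fqn, Object pojo) {
        String owner = locate(pojo);
        if (owner == null) {
            owned.put(fqn, pojo); // first parent to arrive becomes the owner; no relocation later
        } else if (!owner.equals(fqn) && references.putIfAbsent(fqn, owner) == null) {
            refCounts.merge(owner, 1, Integer::sum); // new referrer: just count it
        }
    }

    String ownerOf(String fqn) {
        return owned.containsKey(fqn) ? fqn : references.get(fqn);
    }

    int refCount(String ownerFqn) {
        return refCounts.getOrDefault(ownerFqn, 0);
    }
}
```

      If "/joe/address" is put first, it owns the address and "/mary/address" refers to it; reverse the order and the ownership flips.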

      Anyway, because the obstacle is Con #1 of the first (JBossInternal) model, I am now leaning toward the second solution of keeping the reference counter on the original pojo. To address its problems:

      1. To address the canonical problem (and thus the resulting locking issue), a user is advised to put the referenced pojo into the cache first, e.g., cache.putObject("/shared", sharedPojo), although this is still not totally desirable.

      2. To address overwriting of an existing shared pojo that holds the reference counter, we will throw an exception if it is attempted, so the user is warned. Again, this is not a complete solution because the user may need to know the alternative, but we shall treat it as a limitation for now.
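
      The overwrite guard in point 2 could look roughly like the sketch below. GuardedStore and its methods are hypothetical, and IllegalStateException is only a placeholder for whatever exception the cache would actually throw.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the overwrite guard; names and exception type are assumptions. */
class GuardedStore {
    private final Map<String, Object> pojos = new HashMap<>();
    private final Map<String, Integer> refCounts = new HashMap<>();

    void put(String fqn, Object pojo) {
        // Refuse to overwrite a pojo that other locations still refer to.
        if (pojos.containsKey(fqn) && refCount(fqn) > 0) {
            throw new IllegalStateException(
                fqn + " is still referenced " + refCount(fqn) + " time(s)");
        }
        pojos.put(fqn, pojo);
    }

    void addReference(String ownerFqn) {
        if (!pojos.containsKey(ownerFqn)) {
            throw new IllegalArgumentException("nothing stored at " + ownerFqn);
        }
        refCounts.merge(ownerFqn, 1, Integer::sum);
    }

    void removeReference(String ownerFqn) {
        // Drop the entry entirely once the last reference goes away.
        refCounts.computeIfPresent(ownerFqn, (k, n) -> n > 1 ? n - 1 : null);
    }

    int refCount(String ownerFqn) {
        return refCounts.getOrDefault(ownerFqn, 0);
    }

    Object get(String fqn) {
        return pojos.get(fqn);
    }
}
```

      Storing the shared pojo at "/shared" first, as in point 1, makes "/shared" the owner, so the guard is always applied at one predictable location.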

      Any suggestions?

      -Ben

        • 1. Re: Refactoring of object graph model in TreeCacheAop
          twundke

          Ben, having thought through this I think I'm going to do a 180 degree turn here and suggest that the first model is actually better, although perhaps what we really need is a way to choose between different models depending on user requirements.

          My reasoning is mostly born out of the application that I'm writing. I need the caching via AOP to be as transparent as possible. Third party developers will be writing objects that get put into the cache, and I really don't want to force too many restrictions on what they can/can't do. This goes equally for the objects that I'm writing to go into the cache.

          So, this means that locking must be correct and exceptions should not be thrown for reasonable actions, which basically rules out option #2 for me. I'd also like as much speed as possible, but I just don't think I can get past the canonical object issue, which probably means moving nodes.

          To be honest, I think that maybe we're trying to fit an oval peg into a round hole. We're trying to convert the Java heap-based memory structure into a tree-based structure, with mixed results. I guess this is why we perhaps need a pluggable AOP system. If the user can control the object graph then option #2 is probably a good option. In the more generic case, option #1 can be used, albeit with a loss in efficiency. A developer could also potentially implement their own custom solution based on individual needs.

          Having said that, there are still issues with the current implementation of option #1 that need to be resolved. I'm currently thinking through these, and will get back to you with my thoughts.

          Tim

          • 2. Re: Refactoring of object graph model in TreeCacheAop

            Tim,

            With option #2, I don't think it is that much less transparent, though. The new restrictions are:

            1. You can't overwrite an existing Pojo. It will throw a PojoExistException. I can later relax the restriction by throwing the exception only when the Pojo is still being referenced.

            Meanwhile, I keep track of the referrers so a user can potentially query who is referring to it.

            2. Locking won't be incorrect, per se. It is just more restrictive. The original owner of the Address object may lock it while I am modifying my name attribute, for example. But if both Persons are updating the Address fields, the locking semantics have no problem.

            But the advantage of this second approach is that it enables flexible and complicated object graph relationships to be built! And that is the main problem I encountered in the first (and original) option.

            It is true that the restrictions of the TreeCache internal structure hinder the development of object relationships (if there is no relationship, it fits perfectly!). There are ways to get around it. One possible solution is to refactor TreeCache so I can detach a node and attach it under another parent easily. That way, an "aspectized" object can be re-attached easily, which solves numerous problems.
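
            As an illustration of what cheap detach/attach would mean, here is a minimal tree sketch. CacheNode is a hypothetical class, not the TreeCache node type; the idea is that moving a subtree only rewires one parent link, and the fqn of every descendant follows automatically.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy tree node with cheap re-parenting; not the TreeCache node class. */
class CacheNode {
    final String name;
    private CacheNode parent;
    private final Map<String, CacheNode> children = new LinkedHashMap<>();

    CacheNode(String name) {
        this.name = name;
    }

    CacheNode addChild(String childName) {
        CacheNode child = new CacheNode(childName);
        child.attachTo(this);
        return child;
    }

    /** Unlinks this node (and its whole subtree) from its current parent. */
    void detach() {
        if (parent != null) {
            parent.children.remove(name);
            parent = null;
        }
    }

    /** Moves this node under newParent without copying any descendants. */
    void attachTo(CacheNode newParent) {
        for (CacheNode n = newParent; n != null; n = n.parent) {
            if (n == this) {
                throw new IllegalArgumentException("cannot attach a node under itself");
            }
        }
        detach();
        newParent.children.put(name, this);
        parent = newParent;
    }

    /** Path from the root; the root is named "" by convention in this sketch. */
    String fqn() {
        return parent == null ? name : parent.fqn() + "/" + name;
    }

    boolean hasChild(String childName) {
        return children.containsKey(childName);
    }
}
```

            For example, calling attachTo on the "address" node under "/joe" with "/mary" as the new parent turns "/joe/address/city" into "/mary/address/city" in one step.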

            Finally, I have refactored the code such that it won't be that much work to switch back and forth between the two models. So if you can give me use cases that warrant this, I can plan to put effort in there.

            -Ben

            • 3. Re: Refactoring of object graph model in TreeCacheAop
              twundke

               

              "ben.wang@jboss.com" wrote:

              With option #2, I don't think it is that much less transparent, though. The new restrictions are:

              1. You can't overwrite an existing Pojo. It will throw a PojoExistException. I can later relax the restriction by throwing the exception only when the Pojo is still being referenced.

              This is the deal-breaker for me. For example, the following code will throw the exception.

              public class DataObject {
                  private String name;

                  DataObject( String aName ) {
                      name = aName;
                  }
              }

              public class Storage {
                  private DataObject firstRef;
                  private DataObject secondRef;

                  // Stores the same object in two fields, i.e. a multiple reference.
                  public void storeRef( DataObject aObject ) {
                      firstRef = aObject;
                      secondRef = aObject;
                  }
              }

              public class Runner {
                  public static void main( String[] aArgs ) throws Exception {
                      // "cache" is assumed to be an already-started TreeCacheAop instance.
                      Storage s = new Storage();
                      DataObject o1 = new DataObject( "one" );
                      DataObject o2 = new DataObject( "two" );

                      cache.putObject( "/storage", s );

                      s.storeRef( o1 ); // o1 is now referenced by both fields
                      s.storeRef( o2 ); // <- exception thrown here: o1 is overwritten while still referenced
                  }
              }
              

              This isn't a particularly contrived example, as this type of code happens quite frequently. You often need to store multiple references to an object and then overwrite one or both.

              Maybe my particular use-case is different from most. Third party developers simply implement one of my container interfaces, and their object and all sub-objects are automatically placed into the cache. They should effectively get replication of their data for free. The whole point is that the developer need not know that the cache even exists (well, that's not quite true, but I'm trying to limit the restrictions placed on their code, and currently there are very few thanks to AOP).

              To me this is the absolute beauty of the AOP cache solution. I can place a single top-level object into the cache and everything else is automatically taken care of. I never have to deal with the tree nodes except in a few rare cases where I store some application-internal objects in the cache.

              "ben.wang@jboss.com" wrote:

              2. Locking won't be incorrect, per se. It is just more restrictive. The original owner of the Address object may lock it while I am modifying my name attribute, for example. But if both Persons are updating the Address fields, the locking semantics have no problem.

              Yeah, sorry, I didn't mean to imply that it wouldn't be semantically correct. There's just more locking going on than is strictly necessary. Once again, my application is storing all runtime data in the cache, with lots of interactions, so extraneous locking can cause performance problems. There's possibly also more chance of deadlock, although I haven't thought through this enough to know whether that's actually true or not.

              "ben.wang@jboss.com" wrote:

              But the advantage of this second approach is that it enables flexible and complicated object graph relationships to be built! And that is the main problem I encountered in the first (and original) option.

              Well, I think that can be overcome. This is what I'm currently investigating.

              "ben.wang@jboss.com" wrote:

              It is true that the restrictions of the TreeCache internal structure hinder the development of object relationships (if there is no relationship, it fits perfectly!). There are ways to get around it. One possible solution is to refactor TreeCache so I can detach a node and attach it under another parent easily. That way, an "aspectized" object can be re-attached easily, which solves numerous problems.

              A cheap way of relocating nodes would be very handy, and would certainly solve most of these issues that we're facing. I certainly vote for that!

              "ben.wang@jboss.com" wrote:

              Finally, I have refactored the code such that it won't be that much work to switch back and forth between the two models. So if you can give me use cases that warrant this, I can plan to put effort in there.

              Yeah, I've seen the changes, which should help. Hopefully I've given you a better overview of my use-case. I must admit that I'm being a bit cagey as I'm not sure that my boss would appreciate all of the details being placed on a public forum :) My main concern though is that the cache is as transparent as possible. My application currently hides a lot of the details from the user, but exceptions are something I can't do too much about, hence my wanting to minimise them. I also don't want to have to write contrived code just because I know it could be placed into the cache.

              Ultimately perhaps I'm using the cache for a purpose that it wasn't originally designed for, but it's such a great technology that I just couldn't pass it up!

              Tim

              • 4. Re: Refactoring of object graph model in TreeCacheAop

                Tim,

                Your example does present a use case for overwriting an existing sub-pojo reference. I will open a Jira issue to address this.

                -Ben

                • 5. Re: Refactoring of object graph model in TreeCacheAop

                  So I have fine-tuned the new object graph model once more. Previously, it had the restriction that an exception was thrown when one tried to overwrite a pojo that is referenced by others, e.g.,

                  joe.setAddress(newAddress);

                  Now I can handle this case without throwing any exception. Here are the additional enhancements:

                  1. When overwriting, I will do a removeObject(fqn) first.

                  2. During the removeObject, if I find that the pojo is still referenced by others, I look for the first referencing fqn in the list. Then I simply relocate the current pojo to that referencing fqn (the new fqn). So this is like a transfer of ownership to the next of kin, so to speak.
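
                  The two steps above can be sketched roughly as follows. This is a simplified model with plain maps, not the actual code: OwnershipStore and its methods are hypothetical, and remove here only stands in for removeObject(fqn).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of ownership transfer on removal; all names are hypothetical. */
class OwnershipStore {
    private final Map<String, Object> owned = new HashMap<>();             // owner fqn -> pojo
    private final Map<String, List<String>> referrers = new HashMap<>();   // owner fqn -> referring fqns, in order

    void put(String fqn, Object pojo) {
        owned.put(fqn, pojo);
    }

    void addReference(String ownerFqn, String referrerFqn) {
        referrers.computeIfAbsent(ownerFqn, k -> new ArrayList<>()).add(referrerFqn);
    }

    /** Stand-in for removeObject(fqn): hands the pojo to the first referrer, if any. */
    void remove(String fqn) {
        // If fqn merely refers to a pojo owned elsewhere, drop that reference.
        for (List<String> list : referrers.values()) {
            list.remove(fqn);
        }
        Object pojo = owned.remove(fqn);
        List<String> remaining = referrers.remove(fqn);
        if (pojo == null || remaining == null || remaining.isEmpty()) {
            return; // nothing stored here, or nobody else refers to it
        }
        String newOwner = remaining.remove(0); // first referencing fqn becomes the owner
        owned.put(newOwner, pojo);             // relocate the pojo there
        if (!remaining.isEmpty()) {
            referrers.put(newOwner, remaining); // other referrers now point at the new owner
        }
    }

    /** Step 1 of the overwrite: remove first, then store the new pojo. */
    void overwrite(String fqn, Object newPojo) {
        remove(fqn);
        put(fqn, newPojo);
    }

    Object get(String fqn) {
        if (owned.containsKey(fqn)) return owned.get(fqn);
        for (Map.Entry<String, List<String>> e : referrers.entrySet()) {
            if (e.getValue().contains(fqn)) return owned.get(e.getKey());
        }
        return null;
    }

    boolean isOwner(String fqn) {
        return owned.containsKey(fqn);
    }
}
```

                  In terms of the joe.setAddress(newAddress) example: if "/joe/address" owns the old address and "/mary/address" refers to it, the overwrite leaves the old address owned by "/mary/address" and the new one stored at "/joe/address".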

                  I have added numerous test cases to test the object graph model, btw.