4 Replies Latest reply on Nov 5, 2006 9:28 PM by Ben Wang

    Incorrect serialization of PojoCache collection class proxy

    Ben Wang Master

      This was originally reported at
      http://www.jboss.com/index.html?module=bb&op=viewtopic&t=93750, where an accidental serialization of CachedMapImpl occurred.

      Brian has also created http://jira.jboss.com/jira/browse/JBCACHE-830.

      I have raised it here for further discussion.

      I need to document this more clearly. However, the Collection proxies are not meant to be serialized, since a proxy won't come out properly once it is deserialized (the interceptor reference is local only). The proper way to serialize a Collection (outside of the PojoCache) is to call detach first, e.g.,

      List list = new ArrayList();
      ...

      cache.attach("list", list);
      list = (List) cache.find("list"); // This is a proxy
      ...

      list = (List) cache.detach("list"); // Now this returns a normal List that can be serialized properly.

      So it looks like we have two choices:

      1. Throw an exception when the VM tries to serialize the proxy?

      2. Make all the proxy's fields transient, if there really is a use case for serializing it.
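For illustration, the first choice could look roughly like the sketch below. FailFastListProxy is a hypothetical stand-in, not the real CachedListImpl, and its transient backing list only imitates the node-local interceptor reference. It relies on the standard serialization hook, a private writeObject method, to reject the attempt.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a PojoCache collection proxy (not the real CachedListImpl).
class FailFastListProxy extends AbstractList<Object> implements Serializable {
    private static final long serialVersionUID = 1L;

    // Imitates the node-local interceptor reference: meaningless outside this VM.
    private final transient List<Object> backing = new ArrayList<Object>();

    @Override public Object get(int index) { return backing.get(index); }
    @Override public int size() { return backing.size(); }
    @Override public boolean add(Object o) { return backing.add(o); }

    // ObjectOutputStream calls this instead of default serialization; we refuse.
    private void writeObject(ObjectOutputStream out) throws IOException {
        throw new NotSerializableException(
            getClass().getName() + " is a cache proxy; detach the collection before serializing it");
    }

    // Helper for the demo: returns true if serializing the object was rejected.
    static boolean serializationRejected(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return false;
        } catch (NotSerializableException expected) {
            return true;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FailFastListProxy proxy = new FailFastListProxy();
        proxy.add("a");
        System.out.println("proxy rejected: " + serializationRejected(proxy));
        System.out.println("plain copy rejected: "
            + serializationRejected(new ArrayList<Object>(proxy)));
    }
}
```

With this approach the user finds out immediately, at the cost of breaking any code that serialized the proxy by accident.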

      Of course, if you make every POJO "aspectized", this won't be an issue, since replication then involves only the fields and not the proxy itself. The problem occurs when a serialized POJO's object graph contains a reference to the proxy.

      As I said, I will document this further. But do we really see a use case where we inevitably need to serialize the proxy?

        • 1. Re: Incorrect serialization of PojoCache collection class pr
          Brian Stansberry Master

          This can easily happen -- a pojo with a collection field gets put in the cache, and then some user code hands a ref to the collection to some other object, that object gets serialized (say as part of a response to a remote call); boom!

          The case on the other forum post is a bit odd, as the serialization there is occurring due to cache replication itself. That's kind of a bad practice; they should aspectize all their classes. But I think we need to support the generic case where a Collection obtained from the cache gets serialized.

          But, I don't think there's any need at all to try to maintain any object relationship or anything when the object is deserialized. So, this kind of thing should work fine:

          Example is CachedListImpl:

          private Object writeReplace() throws ObjectStreamException {
              // Serialize a plain copy of the contents instead of the proxy itself.
              ArrayList toSerialize = new ArrayList();
              toSerialize.addAll(this);
              return toSerialize;
          }
          

          toSerialize is what gets serialized. When deserialized, it's a List, as expected. If the user thought the deserialized list would somehow be the same object as something stored in a PojoCache somewhere (where???), they're not thinking right.
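A self-contained sketch of the writeReplace idea, with a round trip to show what comes out on the other side. ListProxySketch and its roundTrip helper are hypothetical names for this illustration; the real CachedListImpl delegates to the cache rather than to a local list.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a PojoCache list proxy (not the real CachedListImpl).
class ListProxySketch extends AbstractList<Object> implements Serializable {
    private static final long serialVersionUID = 1L;

    // Imitates state that only makes sense inside the local VM.
    private final transient List<Object> backing = new ArrayList<Object>();

    @Override public Object get(int index) { return backing.get(index); }
    @Override public int size() { return backing.size(); }
    @Override public boolean add(Object o) { return backing.add(o); }

    // Serialization hook: the returned object is written in place of the proxy.
    private Object writeReplace() throws ObjectStreamException {
        return new ArrayList<Object>(this);
    }

    // Helper for the demo: serialize to a byte array and read the object back.
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ListProxySketch proxy = new ListProxySketch();
        proxy.add("a");
        proxy.add("b");
        Object copy = roundTrip(proxy);
        System.out.println(copy.getClass().getName() + " " + copy);
    }
}
```

The deserialized object is an ordinary ArrayList with the same elements and no tie to any cache.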




          • 2. Re: Incorrect serialization of PojoCache collection class pr
            Ben Wang Master

            OK, when I coded the proxy class, I had the entity model in mind. I.e., when doing a remote call, you should always detach the object from its entity manager first and re-attach it later.

            I can see your use case as well, where a user accidentally passes the reference along (that's what I meant by it not being transparent). And what you proposed makes good sense.

            But by the same token, the user may expect that the POJO relationship is still intact on the remote node after serialization! Again, this is unintentional, but they have no way of knowing it unless, of course, we explicitly throw an exception.

            • 3. Re: Incorrect serialization of PojoCache collection class pr
              Brian Stansberry Master

              It's too bad there's no obvious way to distinguish the 2 use cases when doing the serialization. 2 use cases being 1) serializing as part of a JBC replication (like in the other forum thread that led to this issue) and 2) other types of serialization like I described above. With the first, I agree, it's easy for the user to get confused and expect some relationship will survive the serialization. Giving them help is not bad. For the second case, if they think that somehow a relationship will survive by magic when a remote cache isn't even involved, well, they're just not thinking at all; I don't feel any strong need to "help" in such a case.

              Hmm -- instead of throwing an exception, why not log a WARN and then do the writeReplace? The category logging the WARN is the proxy class, so if people write code where they're doing this on purpose and understand what they're doing, they can just limit the proxy class category to ERROR.
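A rough sketch of the WARN-then-writeReplace variant. It uses java.util.logging only to stay self-contained, which is an assumption; the cache's own logging framework would be used in practice. WarningListProxy is a hypothetical class name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for a PojoCache list proxy (not the real CachedListImpl).
class WarningListProxy extends AbstractList<Object> implements Serializable {
    private static final long serialVersionUID = 1L;

    // The log category is the proxy class itself, so it can be tuned on its own.
    static final Logger LOG = Logger.getLogger(WarningListProxy.class.getName());

    private final transient List<Object> backing = new ArrayList<Object>();

    @Override public Object get(int index) { return backing.get(index); }
    @Override public int size() { return backing.size(); }
    @Override public boolean add(Object o) { return backing.add(o); }

    // Warn first, then substitute a plain copy for the proxy.
    private Object writeReplace() throws ObjectStreamException {
        LOG.warning("Serializing a cache collection proxy; writing a detached ArrayList copy instead");
        return new ArrayList<Object>(this);
    }

    // Helper for the demo: serialize to a byte array and read the object back.
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        WarningListProxy proxy = new WarningListProxy();
        proxy.add("x");
        roundTrip(proxy);            // logs a warning
        LOG.setLevel(Level.SEVERE);  // users who know what they are doing can silence it
        roundTrip(proxy);            // no warning this time
    }
}
```

Raising the level on that one category silences the warning without affecting other cache logging.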