Version 2

    Page for discussion related to HHH-2762

     

    Following discussion occurred on November 12, 2009:

     

    [12:44pm] gbadner: bstansberry, I've checked in an initial implementation of the get/applyNonFlushedChanges stuff
    [12:44pm] bstansberry: gbadner: ah, interesting
    [12:44pm] gbadner: bstansberry, into hibernate trunk
    [12:45pm] gbadner: bstansberry, the NonFlushedChanges object contains a lot more than just the changes, but it should be better than serializing the entire session
    [12:46pm] gbadner: bstansberry, currently it contains the StatefulPersistenceContext and ActionQueue
    [12:46pm] bstansberry: gbadner: that makes sense
    [12:46pm] gbadner: bstansberry, later I'll cut things down more
    [12:47pm] bstansberry: gbadner: can you ping Paul Ferraro?
    [12:48pm] gbadner: bstansberry, I've been testing it by copying and adapting the hibernate "ops" tests; CreateTest, DeleteTest, MergeTest, etc
    [12:48pm] bstansberry: bstansberry: he's been doing work on optimizing replication of SFSBs/etc that contain entities
    [12:48pm] bstansberry: bstansberry: this ties in with that
    [[12:50pm] gbadner: bstansberry, ok, I'll try him tomorrow; do you want to be in on the discussion?
    [12:52pm] bstansberry: gbadner: sure; if on email copy me. on irc i won't likely be online on same hours as you both until next week
    [12:55pm] bstansberry: gbadner: any easy access point for me to look at your changes? i.e. class/method to start from. now i'm curious
    [12:56pm] gbadner: bstansberry, yup, also, can I ask you a couple of questions while you're here?
    [12:56pm] bstansberry: gbadner: absolutely
    [12:57pm] gbadner: bstansberry, here's the changeset: http://fisheye.jboss.org/changelog/Hibernate?cs=17948
    [12:59pm] gbadner: bstansberry, all you really need to look at is the change in SessionImpl and the implementation of NonFlushedChanges in NonFlushChangesImpl
    [1:00pm] gbadner: bstansberry, the other changes were just things that were needed to be able to serialize the StatefulPersistenceContext and ActionQueue with a null Session
    [1:03pm] gbadner: bstansberry, sigh, the diffs for SessionImpl were truncated; take a look at http://fisheye.jboss.org/browse/Hibernate/core/trunk/core/src/main/java/org/hibernate/impl/SessionImpl.java?r=17948 and search for "17948"
    [1:05pm] bstansberry: gbadner: hmm, the diff looked non-truncated. but the annotated link is better anyway
    [1:07pm] gbadner: bstansberry, the trick is that serializing/deserializing the NonFlushedChanges changes Session references in the stored StatefulPersistenceContext and ActionQueue to null
    [1:08pm] bstansberry: gbadner: ic
    [1:10pm] gbadner: bstansberry, when the NonFlushedChanges are applied, the StatefulPersistenceContext and ActionQueue are re-serialized/deserialized, but this time the session references in those objects point to the new session
    [1:11pm] gbadner: bstansberry, it's done that way only because it was an easy way to do it
    [1:12pm] bstansberry: gbadner: sounds reasonable. i'll have to look more carefully. is there a unit test?
    [1:12pm] gbadner: bstansberry, easier than going through all the objects in the StatefulPersistenceContext and ActionQueue and manually setting the correct Session referenc
    [1:12pm] gbadner: bstansberry, yeah, but I haven't checked them in yet
    [1:12pm] gbadner: bstansberry, I wanted to check with you to make sure my testing strategy makes sense
    [1:13pm] bstansberry: gbadner: k. i just ask because unit tests are a good way to understand how stuff is meant to be used
    [1:13pm] gbadner: bstansberry, well, the unit tests I'm setting up don't reflect how it will really be used
    [1:14pm] bstansberry: gbadner: what's your testing strategy?
    [1:14pm] gbadner: bstansberry, I'm just working w/ hibernate unit tests
    [1:15pm] gbadner: bstansberry, basically I'm copying the hibernate "ops" tests and adapting them
    [1:15pm] gbadner: bstansberry, I'll give you an example
    [1:18pm] gbadner: bstansberry, http://pastebin.com/m7be7f8de
    [1:19pm] gbadner: bstansberry, if you look at the end, you'll see applyNonFlushedChangesToClearedSession()
    [1:20pm] bstansberry: gbadner: yep, was looking at it
    [1:21pm] bstansberry: gbadner: so you're using the same session object but clearing it
    [1:22pm] gbadner: bstansberry, yeah,  the NonFlushedChanges object is serialized/deserialized to pretend that it is being sent to a session in a different ClassLoader
    [1:22pm] gbadner: bstansberry, since I don't have a clustered transaction, I don't think I can use a different session (or can I???)
    [1:24pm] bstansberry: gbadner: good question
    [1:24pm] gbadner: bstansberry, I was going to try rollingback the transaction in the original session then starting a new transaction with the non-flushed changes, but I didn't think it would work
    [1:25pm] gbadner: bstansberry, because then any changes already flushed would be lost when the transaction is rolled back
    [1:26pm] gbadner: bstansberry, I'd have to actually copy the original transaction from the original session into the new session, wouldn't I?
    [1:27pm] bstansberry: gbadner: if you created a new session and started using it inside the scope of the ongoing tx, wouldn't it pick up the tx?
    [1:28pm] gbadner: bstansberry, the transaction is bound to the session
    [1:29pm] gbadner: bstansberry, in testGetLoad(), you'll see Session.beginTransaction()
    [1:30pm] bstansberry: gbadner, oh, i didn't read carefully and just assumed a JTA transaction
    [1:30pm] bstansberry: gbadner, not org.hibernate.Transaction
    [1:31pm] gbadner: bstansberry, hmmm, maybe I shouldn't be using a session?
    [1:34pm] gbadner: bstansberry, ok, I see some unit tests that use JTA
    [1:34pm] bstansberry: gbadner, yeah. hibernate has a mock JTA impl in the testing module
    [1:35pm] gbadner: bstansberry, ok, I'll adapt my tests to use that
    [1:35pm] gbadner: bstansberry, there's something else that's I don't get though
    [1:35pm] gbadner: bstansberry, take a look at testGetLoad()
    [1:36pm] bstansberry: gbadner: k
    [1:36pm] gbadner: bstansberry, notice that after calling applyNonFlushedChangesToClearedSession(s), s.merge(...) is called
    [1:36pm] gbadner: ?
    [1:37pm] gbadner: bstansberry, this has to be done because after the StatefulPersistenceContext in the NonFlushedChanges is ser/deser, the entity references have changed; they are no longer the same as the references that the client has
    [1:38pm] bstansberry: gbadner: right. that's where it ties in with paul's stuff
    [1:38pm] gbadner: bstansberry, so in order for the client to be able to continue with the new session, the client has to update its references so it matches what the new session has
    [1:39pm] gbadner: bstansberry, ok, so paul's stuff deals with that somehow?
    [1:39pm] bstansberry: gbadner: yes. assuming all the client references are in HttpSession or a SLSB
    [1:40pm] bstansberry: gbadner: although we need to talk about how to replicate the session. can we go through that?
    [1:41pm] gbadner: bstansberry, before we get to replication, do you know of any way the hibernate unit tests can get around calling Session.merge(entity) to update the entity references?
    [1:42pm] bstansberry: gbadner: no, but i'm no expert
    [1:43pm] gbadner: bstansberry, ok, I'll check w/ steve ebersole; I'm glad that's dealt with elsewhere; it would be ugly if clients had to go through that
    [1:44pm] gbadner: bstansberry, I'm not very familiar with how hibernate does replication; are there changes for non-flushed changes that are relevant to replication?
    [1:44pm] bstansberry: gbadner: well, hibernate doesn't replicate, it just serializes
    [1:45pm] gbadner: bstansberry, you mean serializes the session?
    [1:46pm] bstansberry: gbadner: yeah. but we want that to be lightweight; i.e. just enough to instantiate the session on the other side + apply non-flushed changes
    [1:46pm] gbadner: bstansberry, oh, ok; well, I think what I've done is the first step towards that
    [1:47pm] bstansberry: gbadner: right. the thing i'm thinking about is when serializing the session we don't want to call clear() in order to make it lightweight
    [1:47pm] bstansberry: gbadner: since that would force the client to merge() all objects
    [1:48pm] bstansberry: gbadner: i.e. we are just serializing the session to provide an HA backup; the expectation is the client will normally not access it, will keep using the original session
    [1:49pm] gbadner: bstansberry, I call clear in applyNonFlushedChangesToClearedSession(s) simply to pretend that we're using a new "clean" session
    [1:49pm] gbadner: bstansberry, I'll change my tests to actually use a new session
    [1:50pm] bstansberry: gbadner: ok, good. if you just call s.getNonFlushedChanges() that doesn't alter the state of the session?
    [1:50pm] gbadner: bstansberry, then it won't need to clear the session
    [1:50pm] gbadner: bstansberry, no, it doesn't change the state at all
    [1:51pm] bstansberry: gbadner: your distributed transaction point was a good one
    [1:52pm] bstansberry: gbadner: the backup copy of the session would only be useful with a distributed transaction
    [1:52pm] gbadner: bstansberry, I'm glad we chatted about this. so far I've only used transactions that were bound to sessions
    [1:52pm] bstansberry: gbadner: yeah; good for me too; to think about it
    [1:54pm] bstansberry: gbadner: if the session was a JPA extended persistence context, then there may be no unflushed changes. so then the serialized session is very lightweight
    [1:54pm] gbadner: bstansberry, I am kind of concerned about putting live references to the persistence context and action queue in NonFlushedChanges
    [1:55pm] bstansberry: gbadner: why? (i can guess but it's better to hear your reasons)
    [1:55pm] gbadner: bstansberry, you mentioned a while back that Session.getNonFlushedChanges() would be called after each request; what happens to all those NonFlushedChanges objects?
    [1:56pm] bstansberry: gbadner: let's assume the session is part of the state of an SFSB. the SFSB would get placed in JBoss Cache
    [1:57pm] bstansberry: gbadner: think of JBC as a map, where the SFSB id is the key, value is the SFSB state
    [1:57pm] bstansberry: gbadner: so each request would do a map put, replacing the old value
    [1:58pm] bstansberry: gbadner: technically, what get's put is a byte[], not the SFSB object. but that could change
    [1:58pm] gbadner: bstansberry, so would the NonFlushedChanges object be ser/deser immediately before or when it's put in the JBC Map?
    [1:59pm] gbadner: bstansberry, the reason I ask is because, at least for now, ser/deser will be necessary to remove the session references
    [2:00pm] bstansberry: gbadner: the way it is now, it's serialized before it's put in JBC
    [2:00pm] gbadner: bstansberry, OK, that's good, because the next request could also affect the ActionQueue and/or PersistenceContext
    [2:00pm] bstansberry: gbadner: ok, that's good to know.
    [2:01pm] bstansberry: gbadner: hmm, let's talk web sessions, which work a bit differently
    [2:02pm] gbadner: bstansberry, um, ok; (really don't know much about this...)
    [2:02pm] bstansberry: gbadner: there, the unserialized data is put in the cache. JBC then serializes in order to create a replication message to send to the other cluster nodes
    [2:03pm] bstansberry: gbadner: but the unserialized state remains in the local cache
    [2:03pm] bstansberry: gbadner: no! i'm wrong!
    [2:03pm] gbadner: bstansberry, lol; did you slap yourself?
    [2:04pm] bstansberry: gbadner: ow. yes.
    [2:04pm] bstansberry: gbadner: the NonFlushedChanges object only exists for the time it takes to serialize
    [2:05pm] gbadner: bstansberry, you mean it immediately is used to replicate?
    [2:06pm] bstansberry: gbadner: what's stored in the cache locally is either a 1) byte[] or 2)the original HttpSession+Session, i.e. the object graph the user uses
    [2:08pm] gbadner: bstansberry, hmmm, is the idea that the Session part would be replaced by the NonFlushedChanges object?
    [2:08pm] bstansberry: gbadner: yes. plus whatever other data is needed to allow instantiation of a new session on the remote node
    [2:10pm] gbadner: bstansberry, in that case, would the NonFlushedChanges have already been serialized/deserialized?
    [2:11pm] bstansberry: gbadner: well, it would all be part of an overall serialization process. the order in which stuff occurs, i don't know. we would do it in whatever order was necessary
    [2:13pm] gbadner: bstansberry, we need to be sure that any changes to the original session's persistence context or action queue do not affect what's in the NonFlushedChanges; before the NonFlushedChanges is serialized, it and the Session refer to the same persistence context and action queue
    [2:15pm] bstansberry: gbadner: if there's only a single thread, it should be ok. and there should only be a single thread. let's assume a web request
    [2:15pm] bstansberry: gbadner: the web clustering layer will ensure only one thread can access session during the replication phase at the end of the request
    [2:16pm] bstansberry: gbadner: then replication occurs, meaning web session state is serialized; part of that is Hibernate Session
    [2:16pm] bstansberry: gbadner: during that, NonFlushedChanges is created and immediately serialized, then ref to NonFlushedChanges goes out of scope
    [2:17pm] gbadner: bstansberry, ok, so all that will happen before the next request on the original session happens?
    [2:17pm] bstansberry: gbadner: yes
    [2:18pm] gbadner: bstansberry, excellent! I'll be able to sleep tonight
    [2:18pm] bstansberry: gbadner: there's actually another legacy replication mode where it can happen later, but still, a lock will be acquired and serialization will only occur when the session isn't being used by user code
    [2:19pm] gbadner: bstansberry, ok, that's good
    [2:19pm] bstansberry: gbadner: if people use the legacy mode, they are accepting that what gets replicated could be a later version of the session. which is ok; the purpose is HA; replicating the most current state is better
    [2:21pm] gbadner: bstansberry, interesting, so the non-legacy mode could actually lose more data if a session crashes
    [2:21pm] gbadner: bstansberry, and I suppose using legacy mode could result in having crashed NonFlushedChanges
    [2:22pm] bstansberry: gbadner: yep. it uses a separate thread to replicate, so the request thread returns faster, not having to serialize, push a message on the network
    [2:22pm] gbadner: bstansberry, ok, if that's acceptable
    [2:23pm] bstansberry: gbadner: TBH we shouldn't support this with the legacy mode. TBH we shouldn't support the legacy mode
    [2:23pm] gbadner: bstansberry, yeah, it seems kind of dangerous
    [2:24pm] bstansberry: gbadner: it's a tradeoff of application responsiveness versus safety. for a lot of webapps it's fine. for an app that would have transactions spanning requests or using an XPC, it seems like a bad choice
    [2:26pm] gbadner: bstansberry, ok, well, I think you have a good picture of where I am so far
    [2:27pm] gbadner: bstansberry, I'll update my tests and get them checked in
    [2:28pm] bstansberry: gbadner: sounds good.