Page for discussion related to HHH-2762
Following discussion occurred on November 12, 2009:
[12:44pm] gbadner: bstansberry, I've checked in an initial implementation of the get/applyNonFlushedChanges stuff
[12:44pm] bstansberry: gbadner: ah, interesting
[12:44pm] gbadner: bstansberry, into hibernate trunk
[12:45pm] gbadner: bstansberry, the NonFlushedChanges object contains a lot more than just the changes, but it should be better than serializing the entire session
[12:46pm] gbadner: bstansberry, currently it contains the StatefulPersistenceContext and ActionQueue
[12:46pm] bstansberry: gbadner: that makes sense
[12:46pm] gbadner: bstansberry, later I'll cut things down more
[12:47pm] bstansberry: gbadner: can you ping Paul Ferraro?
[12:48pm] gbadner: bstansberry, I've been testing it by copying and adapting the hibernate "ops" tests; CreateTest, DeleteTest, MergeTest, etc
[12:48pm] bstansberry: bstansberry: he's been doing work on optimizing replication of SFSBs/etc that contain entities
[12:48pm] bstansberry: bstansberry: this ties in with that
[[12:50pm] gbadner: bstansberry, ok, I'll try him tomorrow; do you want to be in on the discussion?
[12:52pm] bstansberry: gbadner: sure; if on email copy me. on irc i won't likely be online on same hours as you both until next week
[12:55pm] bstansberry: gbadner: any easy access point for me to look at your changes? i.e. class/method to start from. now i'm curious
[12:56pm] gbadner: bstansberry, yup, also, can I ask you a couple of questions while you're here?
[12:56pm] bstansberry: gbadner: absolutely
[12:57pm] gbadner: bstansberry, here's the changeset: http://fisheye.jboss.org/changelog/Hibernate?cs=17948
[12:59pm] gbadner: bstansberry, all you really need to look at is the change in SessionImpl and the implementation of NonFlushedChanges in NonFlushChangesImpl
[1:00pm] gbadner: bstansberry, the other changes were just things that were needed to be able to serialize the StatefulPersistenceContext and ActionQueue with a null Session
[1:03pm] gbadner: bstansberry, sigh, the diffs for SessionImpl were truncated; take a look at http://fisheye.jboss.org/browse/Hibernate/core/trunk/core/src/main/java/org/hibernate/impl/SessionImpl.java?r=17948 and search for "17948"
[1:05pm] bstansberry: gbadner: hmm, the diff looked non-truncated. but the annotated link is better anyway
[1:07pm] gbadner: bstansberry, the trick is that serializing/deserializing the NonFlushedChanges changes Session references in the stored StatefulPersistenceContext and ActionQueue to null
[1:08pm] bstansberry: gbadner: ic
[1:10pm] gbadner: bstansberry, when the NonFlushedChanges are applied, the StatefulPersistenceContext and ActionQueue are re-serialized/deserialized, but this time the session references in those objects point to the new session
[1:11pm] gbadner: bstansberry, it's done that way only because it was an easy way to do it
[1:12pm] bstansberry: gbadner: sounds reasonable. i'll have to look more carefully. is there a unit test?
[1:12pm] gbadner: bstansberry, easier than going through all the objects in the StatefulPersistenceContext and ActionQueue and manually setting the correct Session referenc
[1:12pm] gbadner: bstansberry, yeah, but I haven't checked them in yet
[1:12pm] gbadner: bstansberry, I wanted to check with you to make sure my testing strategy makes sense
[1:13pm] bstansberry: gbadner: k. i just ask because unit tests are a good way to understand how stuff is meant to be used
[1:13pm] gbadner: bstansberry, well, the unit tests I'm setting up don't reflect how it will really be used
[1:14pm] bstansberry: gbadner: what's your testing strategy?
[1:14pm] gbadner: bstansberry, I'm just working w/ hibernate unit tests
[1:15pm] gbadner: bstansberry, basically I'm copying the hibernate "ops" tests and adapting them
[1:15pm] gbadner: bstansberry, I'll give you an example
[1:18pm] gbadner: bstansberry, http://pastebin.com/m7be7f8de
[1:19pm] gbadner: bstansberry, if you look at the end, you'll see applyNonFlushedChangesToClearedSession()
[1:20pm] bstansberry: gbadner: yep, was looking at it
[1:21pm] bstansberry: gbadner: so you're using the same session object but clearing it
[1:22pm] gbadner: bstansberry, yeah, the NonFlushedChanges object is serialized/deserialized to pretend that it is being sent to a session in a different ClassLoader
[1:22pm] gbadner: bstansberry, since I don't have a clustered transaction, I don't think I can use a different session (or can I???)
[1:24pm] bstansberry: gbadner: good question
[1:24pm] gbadner: bstansberry, I was going to try rollingback the transaction in the original session then starting a new transaction with the non-flushed changes, but I didn't think it would work
[1:25pm] gbadner: bstansberry, because then any changes already flushed would be lost when the transaction is rolled back
[1:26pm] gbadner: bstansberry, I'd have to actually copy the original transaction from the original session into the new session, wouldn't I?
[1:27pm] bstansberry: gbadner: if you created a new session and started using it inside the scope of the ongoing tx, wouldn't it pick up the tx?
[1:28pm] gbadner: bstansberry, the transaction is bound to the session
[1:29pm] gbadner: bstansberry, in testGetLoad(), you'll see Session.beginTransaction()
[1:30pm] bstansberry: gbadner, oh, i didn't read carefully and just assumed a JTA transaction
[1:30pm] bstansberry: gbadner, not org.hibernate.Transaction
[1:31pm] gbadner: bstansberry, hmmm, maybe I shouldn't be using a session?
[1:34pm] gbadner: bstansberry, ok, I see some unit tests that use JTA
[1:34pm] bstansberry: gbadner, yeah. hibernate has a mock JTA impl in the testing module
[1:35pm] gbadner: bstansberry, ok, I'll adapt my tests to use that
[1:35pm] gbadner: bstansberry, there's something else that's I don't get though
[1:35pm] gbadner: bstansberry, take a look at testGetLoad()
[1:36pm] bstansberry: gbadner: k
[1:36pm] gbadner: bstansberry, notice that after calling applyNonFlushedChangesToClearedSession(s), s.merge(...) is called
[1:36pm] gbadner: ?
[1:37pm] gbadner: bstansberry, this has to be done because after the StatefulPersistenceContext in the NonFlushedChanges is ser/deser, the entity references have changed; they are no longer the same as the references that the client has
[1:38pm] bstansberry: gbadner: right. that's where it ties in with paul's stuff
[1:38pm] gbadner: bstansberry, so in order for the client to be able to continue with the new session, the client has to update its references so it matches what the new session has
[1:39pm] gbadner: bstansberry, ok, so paul's stuff deals with that somehow?
[1:39pm] bstansberry: gbadner: yes. assuming all the client references are in HttpSession or a SLSB
[1:40pm] bstansberry: gbadner: although we need to talk about how to replicate the session. can we go through that?
[1:41pm] gbadner: bstansberry, before we get to replication, do you know of any way the hibernate unit tests can get around calling Session.merge(entity) to update the entity references?
[1:42pm] bstansberry: gbadner: no, but i'm no expert
[1:43pm] gbadner: bstansberry, ok, I'll check w/ steve ebersole; I'm glad that's dealt with elsewhere; it would be ugly if clients had to go through that
[1:44pm] gbadner: bstansberry, I'm not very familiar with how hibernate does replication; are there changes for non-flushed changes that are relevant to replication?
[1:44pm] bstansberry: gbadner: well, hibernate doesn't replicate, it just serializes
[1:45pm] gbadner: bstansberry, you mean serializes the session?
[1:46pm] bstansberry: gbadner: yeah. but we want that to be lightweight; i.e. just enough to instantiate the session on the other side + apply non-flushed changes
[1:46pm] gbadner: bstansberry, oh, ok; well, I think what I've done is the first step towards that
[1:47pm] bstansberry: gbadner: right. the thing i'm thinking about is when serializing the session we don't want to call clear() in order to make it lightweight
[1:47pm] bstansberry: gbadner: since that would force the client to merge() all objects
[1:48pm] bstansberry: gbadner: i.e. we are just serializing the session to provide an HA backup; the expectation is the client will normally not access it, will keep using the original session
[1:49pm] gbadner: bstansberry, I call clear in applyNonFlushedChangesToClearedSession(s) simply to pretend that we're using a new "clean" session
[1:49pm] gbadner: bstansberry, I'll change my tests to actually use a new session
[1:50pm] bstansberry: gbadner: ok, good. if you just call s.getNonFlushedChanges() that doesn't alter the state of the session?
[1:50pm] gbadner: bstansberry, then it won't need to clear the session
[1:50pm] gbadner: bstansberry, no, it doesn't change the state at all
[1:51pm] bstansberry: gbadner: your distributed transaction point was a good one
[1:52pm] bstansberry: gbadner: the backup copy of the session would only be useful with a distributed transaction
[1:52pm] gbadner: bstansberry, I'm glad we chatted about this. so far I've only used transactions that were bound to sessions
[1:52pm] bstansberry: gbadner: yeah; good for me too; to think about it
[1:54pm] bstansberry: gbadner: if the session was a JPA extended persistence context, then there may be no unflushed changes. so then the serialized session is very lightweight
[1:54pm] gbadner: bstansberry, I am kind of concerned about putting live references to the persistence context and action queue in NonFlushedChanges
[1:55pm] bstansberry: gbadner: why? (i can guess but it's better to hear your reasons)
[1:55pm] gbadner: bstansberry, you mentioned a while back that Session.getNonFlushedChanges() would be called after each request; what happens to all those NonFlushedChanges objects?
[1:56pm] bstansberry: gbadner: let's assume the session is part of the state of an SFSB. the SFSB would get placed in JBoss Cache
[1:57pm] bstansberry: gbadner: think of JBC as a map, where the SFSB id is the key, value is the SFSB state
[1:57pm] bstansberry: gbadner: so each request would do a map put, replacing the old value
[1:58pm] bstansberry: gbadner: technically, what get's put is a byte[], not the SFSB object. but that could change
[1:58pm] gbadner: bstansberry, so would the NonFlushedChanges object be ser/deser immediately before or when it's put in the JBC Map?
[1:59pm] gbadner: bstansberry, the reason I ask is because, at least for now, ser/deser will be necessary to remove the session references
[2:00pm] bstansberry: gbadner: the way it is now, it's serialized before it's put in JBC
[2:00pm] gbadner: bstansberry, OK, that's good, because the next request could also affect the ActionQueue and/or PersistenceContext
[2:00pm] bstansberry: gbadner: ok, that's good to know.
[2:01pm] bstansberry: gbadner: hmm, let's talk web sessions, which work a bit differently
[2:02pm] gbadner: bstansberry, um, ok; (really don't know much about this...)
[2:02pm] bstansberry: gbadner: there, the unserialized data is put in the cache. JBC then serializes in order to create a replication message to send to the other cluster nodes
[2:03pm] bstansberry: gbadner: but the unserialized state remains in the local cache
[2:03pm] bstansberry: gbadner: no! i'm wrong!
[2:03pm] gbadner: bstansberry, lol; did you slap yourself?
[2:04pm] bstansberry: gbadner: ow. yes.
[2:04pm] bstansberry: gbadner: the NonFlushedChanges object only exists for the time it takes to serialize
[2:05pm] gbadner: bstansberry, you mean it immediately is used to replicate?
[2:06pm] bstansberry: gbadner: what's stored in the cache locally is either a 1) byte[] or 2)the original HttpSession+Session, i.e. the object graph the user uses
[2:08pm] gbadner: bstansberry, hmmm, is the idea that the Session part would be replaced by the NonFlushedChanges object?
[2:08pm] bstansberry: gbadner: yes. plus whatever other data is needed to allow instantiation of a new session on the remote node
[2:10pm] gbadner: bstansberry, in that case, would the NonFlushedChanges have already been serialized/deserialized?
[2:11pm] bstansberry: gbadner: well, it would all be part of an overall serialization process. the order in which stuff occurs, i don't know. we would do it in whatever order was necessary
[2:13pm] gbadner: bstansberry, we need to be sure that any changes to the original session's persistence context or action queue do not affect what's in the NonFlushedChanges; before the NonFlushedChanges is serialized, it and the Session refer to the same persistence context and action queue
[2:15pm] bstansberry: gbadner: if there's only a single thread, it should be ok. and there should only be a single thread. let's assume a web request
[2:15pm] bstansberry: gbadner: the web clustering layer will ensure only one thread can access session during the replication phase at the end of the request
[2:16pm] bstansberry: gbadner: then replication occurs, meaning web session state is serialized; part of that is Hibernate Session
[2:16pm] bstansberry: gbadner: during that, NonFlushedChanges is created and immediately serialized, then ref to NonFlushedChanges goes out of scope
[2:17pm] gbadner: bstansberry, ok, so all that will happen before the next request on the original session happens?
[2:17pm] bstansberry: gbadner: yes
[2:18pm] gbadner: bstansberry, excellent! I'll be able to sleep tonight
[2:18pm] bstansberry: gbadner: there's actually another legacy replication mode where it can happen later, but still, a lock will be acquired and serialization will only occur when the session isn't being used by user code
[2:19pm] gbadner: bstansberry, ok, that's good
[2:19pm] bstansberry: gbadner: if people use the legacy mode, they are accepting that what gets replicated could be a later version of the session. which is ok; the purpose is HA; replicating the most current state is better
[2:21pm] gbadner: bstansberry, interesting, so the non-legacy mode could actually lose more data if a session crashes
[2:21pm] gbadner: bstansberry, and I suppose using legacy mode could result in having crashed NonFlushedChanges
[2:22pm] bstansberry: gbadner: yep. it uses a separate thread to replicate, so the request thread returns faster, not having to serialize, push a message on the network
[2:22pm] gbadner: bstansberry, ok, if that's acceptable
[2:23pm] bstansberry: gbadner: TBH we shouldn't support this with the legacy mode. TBH we shouldn't support the legacy mode
[2:23pm] gbadner: bstansberry, yeah, it seems kind of dangerous
[2:24pm] bstansberry: gbadner: it's a tradeoff of application responsiveness versus safety. for a lot of webapps it's fine. for an app that would have transactions spanning requests or using an XPC, it seems like a bad choice
[2:26pm] gbadner: bstansberry, ok, well, I think you have a good picture of where I am so far
[2:27pm] gbadner: bstansberry, I'll update my tests and get them checked in
[2:28pm] bstansberry: gbadner: sounds good.
Comments