I actually did see the problem yesterday but it was past 1am and I figured I'd just leave it till I could look at it again with a fresh mind. :-) Looks like you beat me to it though, thanks for spotting this!
I'd veer towards solution 1 above.
1. It seems most logical that we don't block the JG thread as view changes come in.
2. I'm not too happy with this, simply because, as you said, there is no chance of spotting exceptions and dealing with them.
3. I'm not happy with switching to a push-based state transfer either, simply because of the rework needed in the state transfer codebase. Logically, a push-based state transfer makes the most sense, since the data owner controls the whole process. Regarding regions, I agree that a List in a single call is best here. Do you think a push-based state transfer can easily be retrofitted on top of the existing state transfer codebase? If so, I'd prefer it anyway, since it means one fewer RPC message when assigning a buddy.
I'm still going ahead and implementing 1 though; I don't like blocking the listener.
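To make 1 concrete, here is a minimal sketch of the idea, not the actual fix: the listener callback only queues the work and returns, and a single worker thread does the processing in arrival order. Everything in it (AsyncViewChangeHandler, reassignBuddies, the String member names) is a hypothetical stand-in and not taken from the JGroups or JBoss Cache code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: hands view changes to a worker so the listener thread is never blocked. */
final class AsyncViewChangeHandler {
    // A single worker thread keeps view changes in the order they arrived.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final List<String> processed = Collections.synchronizedList(new ArrayList<String>());

    /** Called on the listener thread: copies the member list, queues the work, returns at once. */
    void viewAccepted(List<String> members) {
        final List<String> snapshot = new ArrayList<String>(members);
        worker.execute(() -> reassignBuddies(snapshot));
    }

    /** Placeholder for the slow part (buddy assignment, RPCs); runs on the worker thread. */
    private void reassignBuddies(List<String> members) {
        processed.add(String.join(",", members));
    }

    /** Waits for queued view changes to finish and returns what was processed, in order. */
    List<String> shutdownAndGetProcessed() {
        worker.shutdown();
        try {
            worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<String>(processed);
    }
}
```

The single-threaded executor is a deliberate choice in this sketch: with more than one worker, two view changes could be handled out of order.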
#1 is necessary but insufficient to fix the problem, as a set of cyclical calls still remains. I found this out when I hacked in a quick fix for #1 (not a proper one at all, just the fastest I could do to move past the problem). That's when I finally applied my brain and thought through the whole sequence of calls that I wrote about.
So, either #2 or #3 is necessary. I too prefer #3.
I think converting to push-based state transfer should be pretty easy (i.e. doable today, tomorrow at latest). There are 3 main areas involved in state transfer:
a) _getState, which prepares a state transfer byte array for a given region.
b) _setState, which integrates the state transfer byte array into a given region.
c) Various methods which orchestrate the remote call and the processing of the return, either through a call to JChannel.getState() or through an RPC call.
We're just talking about adding a slightly different flavor of c); a) and b) are the harder parts and should remain unaffected.
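As a rough illustration of why only c) changes, here is a toy sketch under stated assumptions. Node, pullFrom, pushTo and receivePushedState are all hypothetical names, the "remote call" is a plain method call, and region contents are just strings. The getState and setState methods stand in for a) and b) and are shared untouched by both flavors.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PushStateTransferSketch {
    /** Toy stand-in for a cache node: region name -> region contents. */
    static final class Node {
        final Map<String, String> regions = new HashMap<String, String>();

        // a) prepare the state for one region as a byte array
        byte[] getState(String region) {
            String data = regions.get(region);
            return data == null ? new byte[0] : data.getBytes(StandardCharsets.UTF_8);
        }

        // b) integrate a received byte array into one region
        void setState(String region, byte[] state) {
            regions.put(region, new String(state, StandardCharsets.UTF_8));
        }

        // c) pull flavor: the receiver asks the owner for a region's state
        void pullFrom(Node owner, String region) {
            setState(region, owner.getState(region));
        }

        // c) push flavor: the owner gathers a list of regions and sends them in a single call
        void pushTo(Node buddy, List<String> regionNames) {
            Map<String, byte[]> payload = new LinkedHashMap<String, byte[]>();
            for (String region : regionNames) {
                payload.put(region, getState(region));
            }
            buddy.receivePushedState(payload); // an RPC in a real system
        }

        // Receiving side of the push: reuses b) for each region
        void receivePushedState(Map<String, byte[]> payload) {
            for (Map.Entry<String, byte[]> e : payload.entrySet()) {
                setState(e.getKey(), e.getValue());
            }
        }
    }
}
```

In this sketch the push flavor only adds pushTo and receivePushedState; the serialization and integration code is reused as-is.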