Regarding your OOMs: apart from restricting how much state gets transferred (aggressive eviction plus a cache loader), and perhaps using a shared database cache loader for persistent state, all I can recommend is allocating more memory until you can use streaming state transfers.
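To illustrate the idea behind "aggressive eviction plus a cache loader", here is a rough sketch in plain Java. It is not the JBoss Cache API: `BoundedCache`, `maxInMemory`, `store` and `inMemoryState()` are hypothetical names made up for this example, and a plain `Map` stands in for the persistent store a cache loader would write to. The point is only that if everything is written through to the store and just a bounded number of entries stay in memory, the in-memory state (the part that would have to be transferred) stays small.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; this is not the JBoss Cache API. */
class BoundedCache<K, V> {
    private final int maxInMemory;
    // Stand-in for the persistent store behind a cache loader (assumption).
    private final Map<K, V> store;
    private final LinkedHashMap<K, V> memory;

    BoundedCache(int maxInMemory, Map<K, V> store) {
        this.maxInMemory = maxInMemory;
        this.store = store;
        // accessOrder = true keeps entries in least-recently-used order.
        this.memory = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict the least recently used entry once the bound is exceeded.
                // Nothing is lost, because every write already went to the store.
                return size() > BoundedCache.this.maxInMemory;
            }
        };
    }

    void put(K key, V value) {
        store.put(key, value);   // write through to the persistent store
        memory.put(key, value);  // may trigger eviction of the eldest entry
    }

    V get(K key) {
        V value = memory.get(key);
        if (value == null) {
            // Miss in memory: load from the store and keep it in memory again.
            value = store.get(key);
            if (value != null) {
                memory.put(key, value);
            }
        }
        return value;
    }

    /** Snapshot of what is held in memory, i.e. what a state transfer would have to ship. */
    Map<K, V> inMemoryState() {
        return new HashMap<>(memory);
    }
}
```

With a bound of 2 and three entries written, only two entries remain in memory, while the evicted one can still be read back through the store. If the store is shared between nodes, the persistent part need not be transferred at all.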
Regarding the performance of 2.0.0: in CR1 we focused on features, not performance. CR2 will be more about the stability of the CR1 features, and after that we will look at performance. I expect 2.0.0 to be (marginally) slower than 1.4.1 when we go GA (1.4.1 is our fastest release to date).
By the time 2.1.0 comes around (a much shorter cycle than 1.4.x -> 2.0.0), I hope to beat 1.4.1 in performance as well, since I have some internal architectural changes in 2.1.0 that will take advantage of, and optimise for, the 2.x APIs.