[From Gray Watson]
Just to close out this issue about large packets being dropped between hosts running JGroups in Amazon's EC2 cluster. The problem was that large packets, up to the 60k FRAG2 fragment size in the default stack configuration, were sometimes being dropped between some hosts. The cluster would work fine until a large amount of data was sent between certain pairs of servers. Very confusing.
Amazon support got back to us with the following response.
This is an update for case 85983221. We are currently limited to packet sizes of 32k and below on Amazon
EC2 and can confirm the issues you are facing for larger packet sizes. We are investigating a solution
to this limitation. Please let us know if you can keep your packet sizes below this level, or if this
is severe problem blocking your ability to operate.
We are actively looking into increasing the packet size along with other platform improvements, and
apologize for this inconvenience.
So it looks like folks should use FRAG sizes of <= 32k if they are running in UDP mode under EC2.
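
For anyone wanting to apply this, here is a rough sketch of what the change might look like in the XML stack configuration. It only shows the FRAG2 element; the rest of the stack is assumed to stay as it is. The value 32000 is my own pick to stay under the 32k limit Amazon quoted, not something from their response. The comment about max_bundle_size is also an assumption on my part, so check it against your JGroups version.

```xml
<!-- Fragment of a JGroups UDP stack; all other protocols left unchanged. -->
<!-- frag_size is in bytes. 32000 is an illustrative value chosen to stay -->
<!-- below the 32k packet limit reported by Amazon support (default: 60000). -->
<FRAG2 frag_size="32000"/>
<!-- Assumption: if message bundling is enabled on the transport, its -->
<!-- max_bundle_size may also need lowering so that bundled packets stay -->
<!-- under the same limit. Verify against your version's documentation. -->
```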