This is by design: when the target is available in the same JVM as the client, we bypass the remoting layers and make a direct Java call, which means the load-balancing layers are bypassed as well.
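To make the behaviour concrete, here is a toy model in plain Java. It is only a sketch of the idea, not the actual proxy code: the class `LocalFirstProxy`, the method `pickTarget`, and the node names are all made up for illustration, and round-robin is assumed as the load-balancing policy.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Toy model of a clustered client proxy (hypothetical, not real JBoss code).
 * It only decides WHICH node would serve a call; it performs no remoting.
 */
class LocalFirstProxy {
    private final String localNode;           // node hosting the caller's JVM
    private final List<String> clusterNodes;  // all nodes that deploy the bean
    private final AtomicInteger next = new AtomicInteger();

    LocalFirstProxy(String localNode, List<String> clusterNodes) {
        this.localNode = localNode;
        this.clusterNodes = List.copyOf(clusterNodes);
    }

    /** Returns the name of the node that would handle the invocation. */
    String pickTarget(boolean targetDeployedLocally) {
        if (targetDeployedLocally) {
            // Same-JVM shortcut: direct Java call, so the load-balancing
            // policy below is never consulted.
            return localNode;
        }
        // Assumed policy for this sketch: simple round-robin over the cluster.
        int i = Math.floorMod(next.getAndIncrement(), clusterNodes.size());
        return clusterNodes.get(i);
    }

    public static void main(String[] args) {
        LocalFirstProxy proxy =
                new LocalFirstProxy("node1", List.of("node1", "node2", "node3"));
        for (int call = 0; call < 3; call++) {
            System.out.println("local bean   -> " + proxy.pickTarget(true));
        }
        for (int call = 0; call < 3; call++) {
            System.out.println("remote only  -> " + proxy.pickTarget(false));
        }
    }
}
```

In this model, every call made while the bean is deployed in the caller's JVM lands on the local node, and the calls are only spread across the cluster when the bean is not available locally.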
Sasha, thanks for the reply.
Hopefully, there is a workaround for this. My process will generate work units to enable concurrent processing. For each work unit, I will send a JMS message, which will be processed by a Message-Driven Bean. The Message-Driven Bean will then invoke the clustered Session Bean to do the work. I would like this work to be distributed among the nodes for performance reasons. Based on your reply, it sounds like this is not possible. Please let me know if there is a workaround or a better approach.