This setting is based on performance tests, which assume that the cache is heavily accessed. We're trying to keep the default as performant as possible, and you don't have to tune the kernel: if the buffer cannot be allocated, only a warning is produced and the cluster still works.
2) The question of whether to use client-server mode should be decided by the general use case rather than by fine-tuning parameters. If the node is expected to connect/disconnect often, it's definitely a use case for the HotRod client. If you need any of the embedded-mode-only features (e.g. transactions), or you really want to store data on the node, use embedded mode.
3) The major latency source is the network. A read needs 1 roundtrip if the entry is not located on the local node. In client mode that is always the case unless you're using near cache; in embedded mode you can be lucky and serve the request locally. A non-tx write needs a roundtrip to the primary owner (which might be the local node in embedded mode), and then max(roundtrips) to all backup owners.
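To make the arithmetic in 3) concrete, here is a rough back-of-the-envelope sketch in Python. It is not Infinispan code: the function names and the roundtrip figures are made up for illustration, and it only counts network roundtrips, ignoring processing time on the nodes.

```python
# Hypothetical latency model for point 3); all names and numbers are
# illustrative assumptions, not part of any Infinispan API.

def read_latency_us(rtt_to_owner_us, served_locally=False):
    """A read costs one roundtrip unless the entry is served locally
    (local owner in embedded mode, or a near-cache hit in client mode)."""
    return 0 if served_locally else rtt_to_owner_us


def write_latency_us(rtt_to_primary_us, backup_rtts_us, local_is_primary=False):
    """A non-tx write costs a roundtrip to the primary owner (skipped when
    the local node is the primary), plus the slowest backup roundtrip."""
    primary_leg = 0 if local_is_primary else rtt_to_primary_us
    backup_leg = max(backup_rtts_us, default=0)  # no backups means no extra leg
    return primary_leg + backup_leg


# Assumed roundtrip times in microseconds.
remote_write = write_latency_us(500, [400, 700])
local_primary_write = write_latency_us(500, [400, 700], local_is_primary=True)
print(remote_write, local_primary_write)
```

With these assumed numbers, the write that has to reach a remote primary costs 500 + 700 = 1200 us, while the same write issued on the primary itself costs only the slowest backup roundtrip, 700 us.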
Many thanks for answering my questions, Radim, much appreciated!