TreeCache replicates only the modified key/value pairs, not the whole node and certainly not any other nodes. So if you call put("/a/b/c", "x", "y"), an object encapsulating that command is what gets replicated.
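To make the "command, not node" point concrete, here is a minimal toy sketch in plain Java. It is not TreeCache code: ToyCache, PutCommand, and the sent list are hypothetical names invented for illustration, and the "wire" is just an in-memory list. The only thing it is meant to show is that replaying the small command on another instance is enough to bring it up to date.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the replicated command: it carries only
// the node path, the key, and the value of a single put.
class PutCommand {
    final String fqn;
    final String key;
    final Object value;

    PutCommand(String fqn, String key, Object value) {
        this.fqn = fqn;
        this.key = key;
        this.value = value;
    }
}

// Toy model of a tree cache (NOT the real TreeCache): node path -> key/value data.
class ToyCache {
    final Map<String, Map<String, Object>> nodes = new HashMap<>();
    // Assumption: this list stands in for whatever is sent to the other cluster members.
    final List<PutCommand> sent = new ArrayList<>();

    // Local put: apply the change, then record the command for replication.
    void put(String fqn, String key, Object value) {
        PutCommand cmd = new PutCommand(fqn, key, value);
        apply(cmd);
        sent.add(cmd);
    }

    // Remote side: replaying the command touches only that one key in that one node.
    void apply(PutCommand cmd) {
        nodes.computeIfAbsent(cmd.fqn, f -> new HashMap<>()).put(cmd.key, cmd.value);
    }
}

class ReplicationSketch {
    public static void main(String[] args) {
        ToyCache local = new ToyCache();
        ToyCache replica = new ToyCache();

        local.put("/a/b/c", "x", "y");
        for (PutCommand cmd : local.sent) {
            replica.apply(cmd);
        }
        System.out.println(replica.nodes.get("/a/b/c").get("x"));
    }
}
```

In this model, a node holding a thousand keys still produces one small command per put, which is the behaviour described above.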
Putting everything in one node could be faster in a simple test, because the shallower your tree, the faster the navigation to the needed node. But there could be a heavy price to pay in a multi-threaded system. With READ_COMMITTED or stronger isolation, if one thread is modifying that node, all other threads are blocked waiting to read it. With finer-grained nodes, any lock contention is also finer-grained, since a thread modifying one node does not block readers of the others.
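The contention argument can be illustrated with a toy model, again in plain Java rather than TreeCache itself. It assumes one read/write lock per node path and ignores any locks a real cache might also take on ancestor nodes; NodeLockSketch, lockFor, and readerCanProceed are hypothetical names. The main thread plays the writer, and a second thread checks whether it could read a given node without waiting.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Toy model: one read/write lock per node path (an assumption, not TreeCache internals).
class NodeLockSketch {
    private final Map<String, ReentrantReadWriteLock> locks = new ConcurrentHashMap<>();

    ReentrantReadWriteLock lockFor(String fqn) {
        return locks.computeIfAbsent(fqn, f -> new ReentrantReadWriteLock());
    }

    // Returns true if another thread could read the node right now without blocking.
    boolean readerCanProceed(String fqn) {
        AtomicBoolean acquired = new AtomicBoolean(false);
        Thread reader = new Thread(() -> {
            ReentrantReadWriteLock.ReadLock read = lockFor(fqn).readLock();
            // tryLock fails immediately if another thread holds the write lock.
            if (read.tryLock()) {
                acquired.set(true);
                read.unlock();
            }
        });
        reader.start();
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acquired.get();
    }

    public static void main(String[] args) {
        // Coarse layout: everything lives in one node, so one writer stalls every reader.
        NodeLockSketch coarse = new NodeLockSketch();
        coarse.lockFor("/data").writeLock().lock();
        System.out.println("coarse: " + coarse.readerCanProceed("/data"));

        // Fine-grained layout: the writer holds /a/b/c, a reader of /a/b/d is unaffected.
        NodeLockSketch fine = new NodeLockSketch();
        fine.lockFor("/a/b/c").writeLock().lock();
        System.out.println("fine: " + fine.readerCanProceed("/a/b/d"));
    }
}
```

In this model the coarse case reports that the reader would have to wait, while the fine-grained case lets it through; how closely a real deployment follows that depends on the isolation level and locking scheme configured.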