It looks like the original code did not close the ObjectOutputStream; instead it closed the underlying file stream, which isn't correct.
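To illustrate the point about closing the right stream, here is a minimal sketch, not the actual FileCacheLoader code. The class and method names (`StreamCloseSketch`, `store`, `load`) are made up for the example, and it uses try-with-resources, which needs Java 7 or later. The idea is simply that the outermost stream is closed first, so it gets a chance to flush before the file stream goes away.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
class StreamCloseSketch {

    // Resources are closed in reverse order of declaration:
    // the ObjectOutputStream first (flushing it), then the file stream.
    static void store(File file, Map<String, String> attrs) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file);
             ObjectOutputStream output = new ObjectOutputStream(out)) {
            output.writeObject(attrs);
        }
    }

    @SuppressWarnings("unchecked")
    static Map<String, String> load(File file) throws IOException, ClassNotFoundException {
        try (FileInputStream in = new FileInputStream(file);
             ObjectInputStream input = new ObjectInputStream(in)) {
            return (Map<String, String>) input.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("attrs", ".dat");
        file.deleteOnExit();
        Map<String, String> attrs = new HashMap<>();
        attrs.put("key", "value");
        store(file, attrs);
        System.out.println(load(file));
    }
}
```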
Could you file a JIRA issue and link to this post?
By the way, I think the FileCacheLoader is documented as not being production-worthy.
Yes, please create a JIRA for this and target it for 2.2.0.
The fix above reduced the EOF exceptions but did not completely eliminate them. I have a shared cache with 2 nodes and a file cache loader on NFS (yes, I know that you guys don't recommend it; I'm just building a basic infrastructure to test with). Further debugging revealed that both nodes are trying to write to the data file at the same time. One of them runs into the EOF exception when it tries to do a readObject on a 0-byte file that the other node is in the middle of writing. Now I am confused: isn't the lock supposed to be held across the cluster at the tree cache level, avoiding this issue? Or am I missing something?
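For anyone trying to reproduce the 0-byte window, here is a small sketch of what I believe is going on. It is not JBoss Cache code, and the names (`TruncationWindowSketch`, `readerSeesEof`) are made up. It relies on two facts about the JDK: opening a `FileOutputStream` on an existing file truncates it to 0 bytes right away, and `ObjectInputStream` reads the stream header in its constructor, so an empty file fails with `EOFException` before `readObject` is even reached. Both sides run in one process here; in the cluster they would be two nodes sharing the directory.

```java
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical reproduction, for illustration only.
class TruncationWindowSketch {

    // Returns true if a reader hits EOFException while a writer has
    // opened the data file but has not written anything yet.
    static boolean readerSeesEof(File dataFile) throws IOException {
        // "Writer" side: opening the stream truncates the file to 0 bytes.
        try (FileOutputStream writer = new FileOutputStream(dataFile)) {
            // "Reader" side: tries to load the data in that window.
            try (FileInputStream in = new FileInputStream(dataFile);
                 ObjectInputStream reader = new ObjectInputStream(in)) {
                reader.readObject();
                return false;
            } catch (EOFException expected) {
                return true;
            } catch (ClassNotFoundException e) {
                throw new IOException(e);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        File dataFile = File.createTempFile("data", ".dat");
        dataFile.deleteOnExit();
        // Start from a valid, non-empty data file.
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(dataFile))) {
            out.writeObject("previous contents");
        }
        System.out.println("reader saw EOF: " + readerSeesEof(dataFile));
    }
}
```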
No, it is still possible that one instance performs a read, another instance performs a write, both involving a cache loader.
There is no cluster-wide lock at the start. Cluster-wide locks are only acquired during the prepare phase of a 2-phase commit, and the transaction fails if they cannot be obtained.
I changed the code to write to a dot file and then move it into place, to avoid the EOF on read (on Linux). The problem I see now is that more than one write is happening at the same time (from different nodes in the cluster). How is that possible if there is a write lock on the node?
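protected void storeAttributes(Fqn fqn, Map attrs) throws Exception
{
   File f = getDirectory(fqn, true);
   File child = new File(f, DATA);
   File dotChild = new File(f, DOT_DATA);
   if (dotChild.exists()) // condition reconstructed; lost from the post
      System.out.println("Found dot file : " + dotChild);
   FileOutputStream out = new FileOutputStream(dotChild);
   try {
      // write to the dot file first (body reconstructed; lost from the post)
      MarshalledValueOutputStream output = new MarshalledValueOutputStream(out);
      output.writeObject(attrs);
      output.close();
      out = null;
   } finally {
      if (out != null)
         out.close();
   }
   // then move the dot file over the real data file
   if (!dotChild.renameTo(child))
      throw new Exception("Failed to rename '" + dotChild + "' to '" +
         child + "' : " + f.exists() + " : " + dotChild.exists() +
         " : " + child.exists());
}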
If you trace through, you'll notice that the write to disk doesn't happen when the write lock is obtained, but when the transaction commits.
What probably should happen is the writes happen during the prepare phase to "dot files", perhaps named with the JGroups address, and during the commit phase the files are renamed.
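Roughly, the prepare/commit idea could look like the sketch below. This is only an illustration under several assumptions, not a patch for FileCacheLoader: the class `TwoPhaseFileStoreSketch`, its methods, the file names, and the `nodeId` string (standing in for a JGroups address) are all hypothetical, and it uses `java.nio.file`, which needs Java 7 or later. Each node writes to its own dot file during prepare, so concurrent writers never touch the same file, and readers only ever see a complete data file after the rename.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "write dot file in prepare, rename in commit".
class TwoPhaseFileStoreSketch {
    private final Path dir;
    private final String nodeId; // assumption: stand-in for the JGroups address

    TwoPhaseFileStoreSketch(Path dir, String nodeId) {
        this.dir = dir;
        this.nodeId = nodeId;
    }

    private Path dotFile()  { return dir.resolve(".data." + nodeId); }
    private Path dataFile() { return dir.resolve("data.dat"); }

    // Prepare phase: write the full contents to this node's private dot file.
    void prepare(Map<String, String> attrs) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(dotFile()))) {
            out.writeObject(new HashMap<>(attrs));
        }
    }

    // Commit phase: move the dot file into place. With ATOMIC_MOVE, whether an
    // existing target is replaced or an IOException is thrown is platform specific.
    void commit() throws IOException {
        Files.move(dotFile(), dataFile(), StandardCopyOption.ATOMIC_MOVE);
    }

    // Rollback: discard the prepared contents, leaving the data file untouched.
    void rollback() throws IOException {
        Files.deleteIfExists(dotFile());
    }

    @SuppressWarnings("unchecked")
    Map<String, String> load() throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(dataFile()))) {
            return (Map<String, String>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("store");
        TwoPhaseFileStoreSketch nodeA = new TwoPhaseFileStoreSketch(dir, "nodeA");
        Map<String, String> attrs = new HashMap<>();
        attrs.put("key", "value");
        nodeA.prepare(attrs); // data.dat does not exist yet
        nodeA.commit();       // data.dat appears in one step
        System.out.println(nodeA.load());
    }
}
```

This does not replace the cluster-wide lock from the prepare phase; it only keeps half-written files out of the readers' way. How atomic the rename really is on NFS is a separate question that the sketch does not settle.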
BTW, what version of JBC are you referring to in your original post?