7 Replies Latest reply on Mar 31, 2008 12:17 PM by manik

    FileCacheLoader failing with EOFException with fix

    e2open

      Under load, the file cache loader in a cluster was failing with an EOFException while restoring an object. It looks like the original hash map was not being written properly. The original code is given below:

      protected void storeAttributes(Fqn fqn, Map attrs) throws Exception
      {
         File f = getDirectory(fqn, true);
         File child = new File(f, DATA);
         if (!child.exists())
            if (!child.createNewFile())
               throw new IOException("Unable to create file: " + child);
         FileOutputStream out = new FileOutputStream(child);
         ObjectOutputStream output = new ObjectOutputStream(out);
         output.writeObject(attrs);
         out.close();
      }

      Changing the code as given below seems to fix the issue.

      protected void storeAttributes(Fqn fqn, Map attrs) throws Exception
      {
         File f = getDirectory(fqn, true);
         File child = new File(f, DATA);
         if (!child.exists())
            if (!child.createNewFile())
               throw new IOException("Unable to create file: " + child);
         FileOutputStream out = new FileOutputStream(child);
         try
         {
            MarshalledValueOutputStream output =
               new MarshalledValueOutputStream(out);
            output.writeObject(attrs);
            output.close();
            out = null;
         }
         finally
         {
            if (out != null)
               out.close();
         }
      }

        • 1. Re: FileCacheLoader failing with EOFException with fix
          genman

          It looks like the original code did not close the ObjectOutputStream; instead it closed the underlying file stream, which isn't correct.

          Could you file a JIRA issue and link to this post?

          By the way, I think the FileCacheLoader is documented as not so production worthy.
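The point above about closing the ObjectOutputStream rather than the underlying file stream can be illustrated with a minimal sketch. It is not the JBoss Cache code: the class `AttributeFileSketch` and its methods are hypothetical, it uses plain `java.io` serialization instead of MarshalledValueOutputStream, and it assumes a Java version with try-with-resources (Java 7 or later), which is newer than this thread.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of JBoss Cache.
class AttributeFileSketch {

    // Writes the map to the file. Resources are closed in reverse order, so
    // the ObjectOutputStream is closed first (flushing anything it buffered),
    // and only then the FileOutputStream underneath it.
    static void write(File file, Map<String, String> attrs) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file);
             ObjectOutputStream output = new ObjectOutputStream(out)) {
            output.writeObject(attrs);
        }
    }

    // Reads the map back from the file.
    @SuppressWarnings("unchecked")
    static Map<String, String> read(File file) throws IOException, ClassNotFoundException {
        try (FileInputStream in = new FileInputStream(file);
             ObjectInputStream input = new ObjectInputStream(in)) {
            return (Map<String, String>) input.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("attrs", ".dat");
        file.deleteOnExit();
        Map<String, String> attrs = new HashMap<>();
        attrs.put("key", "value");
        write(file, attrs);
        System.out.println("Round trip equal: " + read(file).equals(attrs));
    }
}
```

Closing the outermost stream is the general rule for stacked streams, since each wrapper may hold data that has not yet reached the layer below it.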

          • 2. Re: FileCacheLoader failing with EOFException with fix
            manik

            Yes, please create a JIRA for this, target for 2.2.0.

            Thanks
            Manik

            • 3. Re: FileCacheLoader failing with EOFException with fix
              e2open

              The fix above reduced the EOF exceptions but did not completely eliminate the issue. I have a shared cache with 2 nodes and a file cache loader on NFS (yes, I know that you guys don't recommend it; I am just building a basic infrastructure to test with). Further debugging revealed that both nodes are trying to write to the data file at the same time. One of them runs into the EOF exception as it tries to do a readObject on a 0-byte file that is being written by the other node. Now I am confused: isn't the lock supposed to be held across the cluster at the tree cache level, avoiding this issue? Or am I missing something?
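The failure described here, a read hitting a 0-byte file, can be reproduced in isolation. Opening a `FileOutputStream` on an existing file truncates it to zero length, so a reader that arrives before the writer has written anything sees an empty file, and `ObjectInputStream` throws `EOFException` while trying to read the stream header. The sketch below uses plain `java.io` serialization; the class name `EmptyFileReadSketch` is hypothetical.

```java
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;

// Hypothetical demonstration class; not part of JBoss Cache.
class EmptyFileReadSketch {

    // Returns true if opening an ObjectInputStream on the file fails with
    // EOFException, which is what happens when the file holds no bytes yet.
    static boolean failsWithEof(File file) throws IOException {
        try (FileInputStream in = new FileInputStream(file);
             ObjectInputStream input = new ObjectInputStream(in)) {
            return false; // the stream header was present and readable
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // createTempFile produces an empty file, standing in for a data file
        // that another writer has just truncated.
        File empty = File.createTempFile("data", ".dat");
        empty.deleteOnExit();
        System.out.println("EOF on empty file: " + failsWithEof(empty));
    }
}
```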

              • 4. Re: FileCacheLoader failing with EOFException with fix
                manik

                No, it is still possible that one instance performs a read, another instance performs a write, both involving a cache loader.

                There is no cluster-wide lock at the start. Locking attempts to gain cluster-wide locks during the prepare phase of a 2-phase commit, and the transaction fails if these cannot be obtained.

                • 5. Re: FileCacheLoader failing with EOFException with fix
                  e2open

                  I changed the code to write to a dot file and then move it into place, to avoid the EOF on read (on Linux). The problem I see now is that more than one write is happening at the same time (from different nodes in the cluster). How is that possible if there is a write lock on the node?


                  protected void storeAttributes(Fqn fqn, Map attrs) throws Exception
                  {
                     File f = getDirectory(fqn, true);
                     File child = new File(f, DATA);
                     File dotChild = new File(f, DOT_DATA);
                     if (dotChild.exists())
                        System.out.println("Found dot file : " + dotChild);
                     FileOutputStream out = new FileOutputStream(dotChild);
                     try
                     {
                        MarshalledValueOutputStream output =
                           new MarshalledValueOutputStream(out);
                        output.writeObject(attrs);
                        output.close();
                        out = null;
                     }
                     finally
                     {
                        if (out != null)
                           out.close();
                     }
                     if (!dotChild.renameTo(child))
                     {
                        throw new Exception("Failed to rename '" + dotChild + "' to '" +
                           child + "' : " + f.exists() + " : " + dotChild.exists() +
                           " : " + child.exists());
                     }
                  }
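The write-then-rename approach above can also be sketched with `java.nio.file`, which postdates this thread. This is an illustration under assumptions, not the FileCacheLoader code: the class `AtomicStoreSketch` and its `store` method are hypothetical, plain `ObjectOutputStream` stands in for MarshalledValueOutputStream, and a uniquely named temporary file replaces the fixed DOT_DATA name so that two writers do not share one dot file. Whether `ATOMIC_MOVE` replaces an existing target is implementation specific according to the JDK documentation; on Linux it maps to a rename that does replace it. How well this holds up on NFS is not something the sketch verifies.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Map;

// Hypothetical helper for illustration only; not part of JBoss Cache.
class AtomicStoreSketch {

    // Serializes attrs into a uniquely named temporary file in the same
    // directory, then moves it over the final name in a single step, so a
    // reader sees either the old complete file or the new complete file.
    static void store(Path dir, String name, Map<String, String> attrs) throws IOException {
        Path target = dir.resolve(name);
        Path temp = Files.createTempFile(dir, "." + name, ".tmp");
        try {
            try (OutputStream out = Files.newOutputStream(temp);
                 ObjectOutputStream output = new ObjectOutputStream(out)) {
                output.writeObject(attrs);
            }
            Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up if writing or moving failed.
            Files.deleteIfExists(temp);
        }
    }
}
```

This only protects readers from partially written files. It does not decide which of two concurrent writers wins; the last rename simply replaces the earlier one.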

                  • 6. Re: FileCacheLoader failing with EOFException with fix
                    genman

                    If you trace through, you'll notice that the write to disk doesn't happen when the write lock is obtained, but when the transaction commits.

                    What probably should happen is that the writes go to "dot files" during the prepare phase, perhaps named with the JGroups address, and the files are renamed during the commit phase.
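The suggestion above (write to per-node dot files during prepare, rename during commit) might look roughly like the following. This is a sketch of the idea only, not an existing JBoss Cache API: the class `TwoPhaseFileStoreSketch`, its methods, and the `writerId` string (a stand-in for the JGroups address, assumed to be safe to use in a file name) are all hypothetical, and plain Java serialization is used.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Map;

// Hypothetical sketch of prepare/commit file handling; not part of JBoss Cache.
class TwoPhaseFileStoreSketch {
    private final Path dir;
    private final String writerId; // stand-in for the JGroups address

    TwoPhaseFileStoreSketch(Path dir, String writerId) {
        this.dir = dir;
        this.writerId = writerId;
    }

    // Each writer gets its own dot file, e.g. ".data.dat.nodeA".
    private Path pending(String name) {
        return dir.resolve("." + name + "." + writerId);
    }

    // Prepare phase: serialize to the writer-specific dot file. The file
    // under the final name is not touched yet.
    void prepare(String name, Map<String, String> attrs) throws IOException {
        try (OutputStream out = Files.newOutputStream(pending(name));
             ObjectOutputStream output = new ObjectOutputStream(out)) {
            output.writeObject(attrs);
        }
    }

    // Commit phase: publish the prepared file under its final name.
    void commit(String name) throws IOException {
        Files.move(pending(name), dir.resolve(name), StandardCopyOption.ATOMIC_MOVE);
    }

    // Rollback: discard the prepared file, if any.
    void rollback(String name) throws IOException {
        Files.deleteIfExists(pending(name));
    }
}
```

Naming the dot file after the writer keeps two nodes from overwriting each other's prepared data; only the node whose transaction commits performs the rename.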

                    • 7. Re: FileCacheLoader failing with EOFException with fix
                      manik

                      BTW, what version of JBC are you referring to in your original post?