6 Replies Latest reply on Aug 25, 2004 1:25 PM by starksm64

    How to clear HAJNDI's HARMI stub being cached in NamingConte

    budworth

      Hi all,

      We currently have a somewhat major issue with JBoss clusters and NamingContexts.

      Currently, if a client connects to HAJNDI and the server(s) are restarted in between client requests, the client handle gets invalidated.

      as in:
      1) Client A creates an initialcontext for DefaultPartition
      2) Server B answers back, handing him the HARMI stub
      3) Client A waits for user input
      4) Server B gets stopped and restarted (the JVM)
      5) Client A attempts to do JNDI lookup

      At this point, the HARMI stub is invalid as it only knows of a connection to the previous RMI server.

      Looking at NamingContext, it retains a weak-reference cache that holds all server stubs. This means that creating another InitialContext does not help, since it reuses the Naming object (the RMI stub) from the cache, and that stub is invalid.

      There is IOException handling that is meant to purge that server's entry from the cache, but it does not seem to be working properly.

      If you wait long enough, the weak ref will get purged and you can then create a new InitialContext and all works normally.

      That delay is about a minute or so, which doesn't work for us (our servers handle several hundred requests per second).

      Anyone have an idea on how to get around this?

      My solution is to make a dynamic proxy for NamingContext in org.jnp.interfaces that blows away the entire cache every time a CommunicationException is received. (My proxy is in that package so I can access the package-protected static cache.)

      This works, but I was hoping there was a more normal way.
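
      A minimal sketch of the proxy workaround described above, not the poster's actual code. It wraps any javax.naming.Context and runs a caller-supplied hook whenever a CommunicationException surfaces, then rethrows it. The class name CacheClearingContext and the onCommFailure hook are hypothetical; the hook is where code living in org.jnp.interfaces would clear the static stub cache, which is not shown here.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import javax.naming.CommunicationException;
import javax.naming.Context;

public final class CacheClearingContext {
    private CacheClearingContext() {}

    /**
     * Returns a Context proxy that delegates every call to target.
     * If the call fails with a CommunicationException, onCommFailure
     * runs first (assumed to clear the stale stub cache), and the
     * original exception is then rethrown to the caller.
     */
    public static Context wrap(Context target, Runnable onCommFailure) {
        InvocationHandler handler = (proxy, method, args) -> {
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                // Reflection wraps the real failure; unwrap it.
                Throwable cause = e.getCause();
                if (cause instanceof CommunicationException) {
                    onCommFailure.run();
                }
                throw cause;
            }
        };
        return (Context) Proxy.newProxyInstance(
                CacheClearingContext.class.getClassLoader(),
                new Class<?>[] { Context.class },
                handler);
    }
}
```

      A caller would wrap the context once, for example `Context ctx = CacheClearingContext.wrap(new InitialContext(env), hook);`, and on failure create and wrap a fresh InitialContext before retrying.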

        • 1. Re: How to clear HAJNDI's HARMI stub being cached in NamingC
          starksm64

          I'm not following the failure scenario you expect. Given what you describe, this sample HAJNDI client is doing the same steps, and it's expected that the second list operation will fail, as there is no server to fail over to. It is able to immediately reconnect by creating a new InitialContext, so what differs in your scenario?

          package naming;

          import java.util.Properties;
          import javax.naming.Context;
          import javax.naming.InitialContext;
          import javax.naming.NamingException;

          /**
           * @author Scott.Stark@jboss.org
           * @version $Revision:$
           */
          public class TestHANaming
          {
             public static void main(String[] args) throws Exception
             {
                Properties env = new Properties();
                env.setProperty(Context.INITIAL_CONTEXT_FACTORY,
                   "org.jnp.interfaces.NamingContextFactory");
                env.setProperty(Context.PROVIDER_URL, "localhost:1100");
                InitialContext ic = new InitialContext(env);
                ic.list("");
                System.out.println("Listed localhost:1100 root, kill the server");
                Thread.sleep(30*1000);

                System.out.println("Trying to list again...");
                try
                {
                   // Should fail because the only server has been restarted
                   ic.list("");
                }
                catch(NamingException e)
                {
                   System.out.println("Lookup failed, trying new InitialContext");
                   ic = new InitialContext(env);
                   ic.list("");
                   System.out.println("Listed localhost:1100 root again");
                }
             }

          }
          


          Listed localhost:1100 root, kill the server
          Trying to list again...
          Lookup failed, trying new InitialContext
          Listed localhost:1100 root again
          



          • 2. Re: How to clear HAJNDI's HARMI stub being cached in NamingC
            budworth

            Hi Scott, thank you for the response. (Sorry for my delay, I got sidetracked for quite a bit.)

            Your example works fine, but if you auto-discover the host, it won't.

            Simply change your

            env.setProperty(Context.PROVIDER_URL, "localhost:1100");
            


            to:

             env.setProperty(Context.PROVIDER_URL, "");
             env.setProperty("jnp.partitionName","DefaultPartition");
            


            And you'll find the second lookup fails.

            I tracked it down to the URL -> NamingContext cache.

            Upon getting an RMI failure, it removes the 'url' from the cache and throws a CommunicationException. The problem is that the 'URL' key in the cache doesn't seem to match, because on the next lookup it just gets the cached version back.

            If you wait long enough (set the sleep to 5 minutes), you'll get the weak ref to expire and the lookup to occur again.
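
            To illustrate the failure mode described above (an eviction that uses a key different from the one the stub was cached under), here is a small standalone sketch. It is not the JBoss NamingContext code; the class StubCacheKeyDemo and the key strings in the usage note are hypothetical and only show why a mismatched key leaves the stale entry in place.

```java
import java.util.HashMap;
import java.util.Map;

public final class StubCacheKeyDemo {
    // Hypothetical stand-in for a URL -> stub cache.
    private final Map<String, Object> cache = new HashMap<>();

    /** Caches a stub under the given key. */
    public void put(String key, Object stub) {
        cache.put(key, stub);
    }

    /** Returns the cached stub for the key, or null if none. */
    public Object get(String key) {
        return cache.get(key);
    }

    /** Removes the entry for the key; true only if something was evicted. */
    public boolean evict(String key) {
        return cache.remove(key) != null;
    }
}
```

            If a stub is stored with `put("hostA:1100", stub)` and the failure handler later calls `evict("")`, the call returns false and `get("hostA:1100")` still returns the stale stub. Only `evict("hostA:1100")` removes it.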

            I know it's not specifically an issue with the multicast locator code, as in my project I've implemented my own mcast server/locator/IC factory.

            And my IC factory composes a targeted "HOST:1100" URL matching the discovered host, sets up some props, and passes it to the JBoss NamingContext.

            So it's somewhere in the NamingContext class.

            -David

            • 3. Re: How to clear HAJNDI's HARMI stub being cached in NamingC
              budworth

              Forgot to mention, it's happening in 3.2.3, as well as 3.2.4

              • 4. Re: How to clear HAJNDI's HARMI stub being cached in NamingC
                starksm64

                I added a change for 3.2.5 that clears the discovered proxy on failure.

                • 5. Re: How to clear HAJNDI's HARMI stub being cached in NamingC
                  drew.farris

                   

                  "scott.stark@jboss.org" wrote:
                  I added a change for 3.2.5 that clears the discovered proxy on failure.

                  I am encountering this issue as well -- in 3.2.5, where can I find the patch that fixes it?
                  Will this fix be rolled into the next release of 3.2.x?

                  • 6. Re: How to clear HAJNDI's HARMI stub being cached in NamingC
                    starksm64

                    It's in 3.2.5, so post a bug report to SourceForge with an example that illustrates the problem:
                    http://sourceforge.net/tracker/?group_id=22866&atid=376685