11 Replies Latest reply on Feb 20, 2008 9:03 AM by martin.wickus

    Detected failure on control connection

    bob_walker99

      Can someone offer an explanation of what exactly is happening when I get this error:
      org.jboss.remoting.transport.bisocket.BisocketServerInvoker$ControlMonitorTimerTask@20dbb1: detected failure on control connection Thread[control: Socket[addr=/127.0.0.1,port=3160,localport=2999],5,main]: requesting new control connection


      I'm seeing it a lot, but the only references I can find on the forums are these:

      http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4055249#4055249
      http://jira.jboss.org/jira/browse/JBREM-725

      These don't seem to be the same case - I'm not seeing the NPE, and it isn't a server/client restart issue for me.

      I'm using AS 4.2.0 with JBM 1.4.0 CR1 (but it also happened in 1.3.0 GA), and I'm currently running my tests on an XP box with JDK 1.6.0_02. I replaced JBoss Remoting with the 2.2.0 SP1 version, to no avail.

      I can't recreate it consistently in a test case (but it does happen intermittently in the tests I've written) but it happens a lot in (what would be) my production code (if it worked..). If it is something I am doing incorrectly, I need to understand what this is, but I'm at a loss where to look.

      Help?

      Sorry if this belongs on the remoting forums, let me know and I'll repost if necessary.

        • 1. Re: Detected failure on control connection
          bob_walker99

          Interestingly, the times it does happen in my test receiving code are when it pages in from disk:

          i.e.
          1) The queue has a FullSize of 500, Page Size of 250 and DownCacheSize of 250.
          2) my test sending client sends 10,000 messages
          3) I fire up my test receiving client, and on a couple of occasions when the message received count is divisible by 250, I get a "control connection failure".
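
For anyone wanting to check the same pattern against their own runs, the arithmetic is simple enough to script. This is only a sketch: the class and method names are hypothetical (not part of JBoss Messaging), and the counts in `main` are placeholders rather than measured values.

```java
// Hypothetical helper, not part of JBoss Messaging: checks whether a failure
// coincided with the receiver reaching a multiple of the queue's page size.
final class PageBoundaryCheck {
    static final int PAGE_SIZE = 250; // PageSize from the queue settings above

    /** True when the received-message count sits exactly on a page boundary. */
    static boolean isPageBoundary(int receivedCount, int pageSize) {
        return receivedCount > 0 && receivedCount % pageSize == 0;
    }

    public static void main(String[] args) {
        // Placeholder counts; substitute the counts logged when the warning appeared.
        int[] countsAtFailure = {250, 1000, 1337};
        for (int count : countsAtFailure) {
            System.out.println(count + " on page boundary: "
                    + isPageBoundary(count, PAGE_SIZE));
        }
    }
}
```

If most recorded failures land on a boundary, that supports the paging theory; a handful of samples is not conclusive either way.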

          Again it's not conclusive, but it's the first pattern I've seen, so I thought I'd better post it.

          BTW, I've mentioned elsewhere, but these are relatively large text messages of 1Mb apiece.

          Could it be a timeout issue? There is a significant pause as it pages in a batch of messages (which is perfectly acceptable, and to be expected). Does this stop the Bisocket conversation working correctly?
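
To make the timeout hypothesis concrete, here is a generic heartbeat-window model: a monitor presumes a connection dead once no ping has arrived within some window. This is not JBoss Remoting's code, and I am only assuming the control-connection monitor works along these lines; the class, field names, and numbers are all hypothetical.

```java
// Generic heartbeat-window model. This is NOT JBoss Remoting's code; the class,
// field names, and numbers are hypothetical and chosen only for illustration.
final class HeartbeatWindowSketch {
    private final long pingIntervalMs; // how often the peer is expected to ping
    private final int windowFactor;    // how many ping intervals may pass in silence
    private long lastPingMs;           // time the most recent ping was seen

    HeartbeatWindowSketch(long pingIntervalMs, int windowFactor, long startMs) {
        this.pingIntervalMs = pingIntervalMs;
        this.windowFactor = windowFactor;
        this.lastPingMs = startMs;
    }

    /** Records that a ping arrived at the given time. */
    void onPing(long nowMs) {
        lastPingMs = nowMs;
    }

    /** The connection is presumed dead once no ping has arrived within the window. */
    boolean isPresumedDead(long nowMs) {
        return nowMs - lastPingMs > pingIntervalMs * windowFactor;
    }

    public static void main(String[] args) {
        HeartbeatWindowSketch monitor = new HeartbeatWindowSketch(5_000, 2, 0);
        monitor.onPing(5_000);
        // 4 s of silence: still inside the 10 s window.
        System.out.println(monitor.isPresumedDead(9_000));
        // 11 s of silence, e.g. a long stall on either side: outside the window.
        System.out.println(monitor.isPresumedDead(16_000));
    }
}
```

In a model like this, a pause longer than the window looks identical to a dead peer, which is why a slow page-in could plausibly trigger the warning even though nothing has actually disconnected.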

          I'll try to package my test code up with the queue specifications and instructions to reproduce a.s.a.p. and post it on here.

          Regards,

          Bob

          • 2. Re: Detected failure on control connection
            bob_walker99

            Can anyone help with this? I'm really struggling here, and I'm no nearer to even understanding what the error means.

            • 3. Re: Detected failure on control connection
              clebert.suconic

              This was the same failure I saw while testing http://jira.jboss.com/jira/browse/JBMESSAGING-1012

              I bet this was fixed as part of CR2

              • 4. Re: Detected failure on control connection
                bob_walker99

                Sounds promising. I mailed Tim a test which reproduces the behaviour, but was intermittent at the time.

                I've just done 2 clean installs. The first was 4.2.0/1.3.0GA, where I consistently got that error when creating a connection.

                I then did a clean 4.2.0/1.4.0CR2 install, and didn't see the error.

                However, I have seen the code work sometimes on 1.3.0GA, and sometimes fail, so you'll forgive me if I don't hold my breath?

                I'll mail you the test, I believe Tim is on holiday for a while?


                Can you expand on what the problem/resolution was? I've been tearing my hair out on this.

                Thanks Clebert,

                Best regards,

                Bob

                • 5. Re: Detected failure on control connection
                  clebert.suconic

                  We found a few bugs in the Bisocket implementation with the StressTest shown at JBMESSAGING-1012.

                  Ron had fixed lots of bugs, and JBossRemoting 2.2.2.GA fixed the problem.

                  So, you will also need to replace jboss-remoting with the newer version to make sure the error won't happen.

                  • 6. Re: Detected failure on control connection
                    bob_walker99

                    Thanks Clebert,

                    Just to confirm, this is the correct jar to grab, yes?:

                    http://repository.jboss.com/jboss/remoting/2.2.2.GA-brew/lib/

                    Best regards,

                    Bob

                    • 7. Re: Detected failure on control connection
                      clebert.suconic

                      Yep!

                      Hopefully you won't see that bug again!

                      Please let me know if you have any problems

                      • 8. Re: Detected failure on control connection
                        bob_walker99

                        Hi Clebert,

                        I'm going round in circles :-)

                        With the upgrade to 1.4.0CR2, this has now surfaced again:

                        http://www.jboss.com/index.html?module=bb&op=viewtopic&t=106928

                        Any ideas?

                        Thanks,

                        Bob

                        • 9. Re: Detected failure on control connection
                          martin.wickus

                          I am seeing what I believe is the same.

                          Environment in this case using:
                          JBoss Messaging 1.4.0.GA
                          JBoss Remoting 2.2.2.SP1

                          Seeing the following warning:

                          2008-02-18 13:53:43,993 WARN [org.jboss.remoting.transport.bisocket.BisocketServerInvoker] org.jboss.remoting.transport.bisocket.BisocketServerInvoker$ControlMonitorTimerTask@72bd77: detected failure on control connection Thread[control: Socket[addr=/10.140.177.50,port=58029,localport=1654],5,main] (a3w4x10-9m4crq-fcsan8o9-1-fcsan9pl-9: requesting new control connection
                          2008-02-18 13:53:43,993 DEBUG [org.jboss.remoting.transport.bisocket.BisocketClientInvoker] getting secondary locator
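
Since the warning carries the peer address and both ports, it is easy to pull those out and tally failures per client. A minimal sketch, assuming the log format shown above; the class and method names are hypothetical and not part of JBoss Remoting.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical log helper, not part of JBoss Remoting.
final class ControlFailureLogParser {
    // Matches the socket details inside the control-connection warning.
    // An optional host name before the '/' in addr= is tolerated and ignored.
    private static final Pattern WARNING = Pattern.compile(
            "detected failure on control connection Thread\\[control: "
            + "Socket\\[addr=[^/,]*/([^,]+),port=(\\d+),localport=(\\d+)\\]");

    /** Remote address, remote port and local port taken from one warning line. */
    static final class ControlFailure {
        final String address;
        final int port;
        final int localPort;

        ControlFailure(String address, int port, int localPort) {
            this.address = address;
            this.port = port;
            this.localPort = localPort;
        }
    }

    /** Returns the socket details if the line is a control-connection warning. */
    static Optional<ControlFailure> parse(String line) {
        Matcher m = WARNING.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new ControlFailure(
                m.group(1), Integer.parseInt(m.group(2)), Integer.parseInt(m.group(3))));
    }

    public static void main(String[] args) {
        // Tail of the warning line quoted in this post.
        String line = "detected failure on control connection Thread[control: "
                + "Socket[addr=/10.140.177.50,port=58029,localport=1654],5,main]";
        parse(line).ifPresent(f ->
                System.out.println(f.address + ":" + f.port + " (local port " + f.localPort + ")"));
    }
}
```

Feeding every line of the server log through `parse` and grouping by address would show whether one client or many are losing their control connections.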
                          


                          This continues for a while, then eventually I get an exception saying it could not connect. I guess at that point I've exhausted the connections in the client pool.

                          However, before I get the could not connect exception, there is a long period where I can still send JMS messages, yet they never end up on my queue. No JMSException reported either.

                          • 10. Re: Detected failure on control connection
                            timfox

                            Hello Wickus-

                            Any chance you could try with 1.4.0.SP3 and remoting 2.2.2.SP4?

                            Several things have been fixed in remoting since the version you are using.

                            If you're using the supported EAP 4.3 configuration, you'll already be using that.

                            Cheers

                            • 11. Re: Detected failure on control connection
                              martin.wickus

                              Hi Tim

                              This specific environment is not yet using EAP 4.3 (although I am using it elsewhere). I will push for an upgrade into that environment as well.

                              Is there a Jira I can reference as evidence that this issue has been addressed in EAP 4.3 (or in the component JBM/JBR libraries)? It would make it easier to argue for a release (as we are running in a couple of different countries :-)