12 Replies Latest reply on Oct 9, 2012 9:48 AM by rhusar

    What is my error in Cluster AS7.1 between two machines

    mgordon

      I'm setting up a cluster with AS 7.1 on two machines, following the notes at: http://middlewaremagic.com/jboss/?p=1969

       

      When I follow the steps for configuring standalone instances on the same machine it works perfectly, but when I follow the steps for two machines it doesn't work.

       

      On startup, both instances log this message:

       

      10:53:56,106 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (pool-14-thread-1) ISPN000094: Received new cluster view: [node1/web|0] [node1/web]

      10:53:56,108 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (pool-14-thread-1) ISPN000079: Cache local address is node1/web, physical addresses are [192.168.106.50:55200]

      10:53:56,112 INFO  [org.infinispan.factories.GlobalComponentRegistry] (pool-14-thread-1) ISPN000128: Infinispan version: Infinispan 'Brahma' 5.1.2.FINAL

      10:53:56,113 INFO  [org.infinispan.config.ConfigurationValidatingVisitor] (pool-14-thread-1) ISPN000152: Passivation configured without an eviction policy being selected. Only manually evicted entities will be pasivated.

      10:53:56,617 INFO  [org.infinispan.jmx.CacheJmxRegistration] (pool-14-thread-1) ISPN000031: MBeans were successfully registered to the platform mbean server.

      10:53:57,239 INFO  [org.jboss.as.clustering.infinispan] (pool-14-thread-1) JBAS010281: Started repl cache from web container

      10:53:57,257 INFO  [org.jboss.as.clustering.impl.CoreGroupCommunicationService.web] (MSC service thread 1-1) JBAS010206: Number of cluster members: 1

       

      and I don't know what my problem is.

       

      I don't know if it's a network problem between the two machines. Can someone tell me which ports must be opened for communication between the two standalone instances?

      Is Apache necessary to maintain the session across nodes? If so, why does it work perfectly when I have the two standalone instances on the same machine without Apache?

       

       

      Thanks in advance

        • 1. Re: What is my error in Cluster AS7.1 between two machines
          rhusar

          When I follow the steps for configuring standalone instances on the same machine it works perfectly, but when I follow the steps for two machines it doesn't work.

          So clearly, the problem is in communication between these 2 machines. By default, clustering uses UDP multicast for discovery, and firewalls are typically set up to drop this traffic.

           

          For debugging, you can try turning off the firewall to confirm that it is the culprit. Then set it up correctly and start it again.

           

          I don't know if it's a network problem between the two machines. Can someone tell me which ports must be opened for communication between the two standalone instances?

          You can use TCPPING discovery and TCP for the data channels, and avoid UDP altogether. In that case you will just have to open the few ports specified in the configuration.

          Is Apache necessary to maintain the session across nodes? If so, why does it work perfectly when I have the two standalone instances on the same machine without Apache?

          Well, in theory, you don't need Apache. However, you need something in front of the servers; how else would you guarantee HA, i.e. that if one server is down another one picks up?

           

          Rado

          • 2. Re: What is my error in Cluster AS7.1 between two machines
            mgordon

            Thanks,

             

            What do I have to change to avoid using UDP?

             

             

            Yes, I know that I need either Apache or a hardware load balancer in order to guarantee HA, but in some article I read that I need Apache when the cluster spans instances on more than one machine, and I don't understand why.

             

            I'm going to review the firewall settings.

            • 3. Re: What is my error in Cluster AS7.1 between two machines
              wdfink

              You might switch to TCP. Change the following subsystem:

              <subsystem xmlns="urn:jboss:domain:jgroups:1.1" default-stack="udp">

              to "tcp"

               

              You might also list the hosts for initial discovery:

                <stack name="tcp">

                  <protocol type="TCPPING">

                    <property name="initial_hosts">x.x.x.x[7600],x.x.x.y[7600]</property>

                    <property name="num_initial_members">2</property>

                    <property name="port_range">0</property>

                    <property name="timeout">5000</property>

                 ...

              • 4. Re: What is my error in Cluster AS7.1 between two machines
                rhusar

                ...and just remove MPING, which uses multicast, so delete this line:

                            <protocol type="MPING" socket-binding="jgroups-mping"/>
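To confirm that a node's standalone-ha.xml ended up in the state described in the replies above (default-stack set to "tcp", TCPPING present with initial_hosts, MPING removed), a small check script can help. This is only a sketch: the function name check_tcp_stack is made up for illustration, the transport line in the sample is assumed from the stock AS 7.1 configuration, and the IP addresses are the ones that appear in the logs in this thread.

```python
import xml.etree.ElementTree as ET

# Namespace as quoted in the thread; other AS7 versions may use a different one.
JGROUPS_NS = "urn:jboss:domain:jgroups:1.1"


def check_tcp_stack(xml_text):
    """Return a list of problems found in the JGroups subsystem (empty if none)."""
    root = ET.fromstring(xml_text.strip())
    ns = {"jg": JGROUPS_NS}
    # Accept either a bare <subsystem> fragment or a full standalone-ha.xml.
    if root.tag == "{%s}subsystem" % JGROUPS_NS:
        subsystem = root
    else:
        subsystem = root.find(".//jg:subsystem", ns)
    if subsystem is None:
        return ["no jgroups subsystem found"]

    problems = []
    if subsystem.get("default-stack") != "tcp":
        problems.append("default-stack is not 'tcp'")

    stack = subsystem.find("jg:stack[@name='tcp']", ns)
    if stack is None:
        return problems + ["no stack named 'tcp'"]

    protocol_types = [p.get("type") for p in stack.findall("jg:protocol", ns)]
    if "MPING" in protocol_types:
        problems.append("MPING (multicast discovery) is still present")

    tcpping = stack.find("jg:protocol[@type='TCPPING']", ns)
    if tcpping is None:
        problems.append("TCPPING is missing")
    else:
        hosts = tcpping.find("jg:property[@name='initial_hosts']", ns)
        if hosts is None or not (hosts.text or "").strip():
            problems.append("TCPPING has no initial_hosts")
    return problems


# Illustrative fragment only; a real stack contains more protocols than this.
SAMPLE = """
<subsystem xmlns="urn:jboss:domain:jgroups:1.1" default-stack="tcp">
  <stack name="tcp">
    <transport type="TCP" socket-binding="jgroups-tcp"/>
    <protocol type="TCPPING">
      <property name="initial_hosts">192.168.106.50[7600],192.168.106.51[7600]</property>
      <property name="num_initial_members">2</property>
      <property name="port_range">0</property>
      <property name="timeout">5000</property>
    </protocol>
  </stack>
</subsystem>
"""

print(check_tcp_stack(SAMPLE))  # an empty list means no problems were found
```

Running the same check against the file on each machine is a cheap way to catch the case where only one node was edited.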
                • 5. Re: What is my error in Cluster AS7.1 between two machines
                  mgordon

                  Thanks to both of you.

                   

                  I have made progress, and I am now including

                   

                       <stack name="tcp">

                      <protocol type="TCPPING">

                        <property name="initial_hosts">x.x.x.x[7600],x.x.x.y[7600]</property>

                        <property name="num_initial_members">2</property>

                        <property name="port_range">0</property>

                        <property name="timeout">5000</property>

                  and deleting

                  <protocol type="MPING" socket-binding="jgroups-mping"/>

                   

                  in standalone-ha.xml on both nodes (both machines).

                   

                  When I start node 2 I get these messages:

                   

                  15:29:24,787 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (pool-14-thread-1) ISPN000094: Received new cluster view: [node1/web|1] [node1/web, node2/web]

                  15:29:24,788 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (pool-14-thread-1) ISPN000079: Cache local address is node2/web, physical addresses are [192.168.106.51:7600]

                  15:29:24,805 INFO  [org.infinispan.factories.GlobalComponentRegistry] (pool-14-thread-1) ISPN000128: Infinispan version: Infinispan 'Brahma' 5.1.2.FINAL

                  15:29:24,806 INFO  [org.infinispan.config.ConfigurationValidatingVisitor] (pool-14-thread-1) ISPN000152: Passivation configured without an eviction policy being selected. Only manually evicted entities will be pasivated.

                  15:29:24,947 INFO  [org.infinispan.jmx.CacheJmxRegistration] (pool-14-thread-1) ISPN000031: MBeans were successfully registered to the platform mbean server.

                  15:29:25,144 INFO  [org.jboss.as.clustering.infinispan] (pool-14-thread-1) JBAS010281: Started repl cache from web container

                  15:29:25,168 INFO  [org.jboss.as.clustering.impl.CoreGroupCommunicationService.web] (MSC service thread 1-2) JBAS010206: Number of cluster members: 2

                   

                  but on node 1 the messages are:

                  15:21:22,004 INFO [org.infinispan.factories.GlobalComponentRegistry] (pool-12-thread-1) ISPN000128: Infinispan version: Infinispan 'Brahma' 5.1.2.FINAL

                  15:21:22,007 INFO [org.infinispan.config.ConfigurationValidatingVisitor] (pool-12-thread-1) ISPN000152: Passivation configured without an eviction policy being selected. Only manually evicted entities will be pasivated.

                  15:21:22,702 INFO [org.infinispan.jmx.CacheJmxRegistration] (pool-12-thread-1) ISPN000031: MBeans were successfully registered to the platform mbean server.

                  15:21:22,724 INFO [org.jboss.as.clustering.infinispan] (pool-12-thread-1) JBAS010281: Started repl cache from web container

                  15:21:22,752 INFO [org.jboss.as.clustering.impl.CoreGroupCommunicationService.web] (MSC service thread 1-1) JBAS010206: Number of cluster members: 1

                   

                  and I'm testing with the cluster-demo application, and it doesn't work as a cluster.

                   

                  Any more ideas?


                  • 6. Re: What is my error in Cluster AS7.1 between two machines
                    mgordon

                    One more thing:

                     

                    when I run

                     

                    telnet CAMPIJBOS01.yelldes.intrayell.com 7600

                    Trying 192.168.106.51...

                    Connected to CAMPIJBOS01.yelldes.intrayell.com (192.168.106.51).

                    Escape character is '^]'.

                    15:50:04,309 ADVERTENCIA [org.jgroups.blocks.TCPConnectionMap] (ConnectionMap.Acceptor,null) Could not read accept connection from peer java.net.SocketTimeoutException: Read timed out

                    Connection closed by foreign host.
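The telnet check above can be scripted so it is easy to repeat from each machine against the other. A minimal sketch, assuming only that the JGroups TCP port is 7600 as configured in this thread; the function name tcp_port_open is made up for illustration.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False


# Example use (run on each node, pointing at the other one):
#   tcp_port_open("192.168.106.51", 7600)
```

A True result only shows that the port is reachable through the firewall. It does not speak the JGroups protocol, which is presumably why the server logs a read timeout when a plain telnet client connects and sends nothing.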

                     

                     

                    ....

                    • 7. Re: What is my error in Cluster AS7.1 between two machines
                      rhusar

                      When the 2nd node joins, that logging is not repeated on the node that is already running. I created https://issues.jboss.org/browse/AS7-5407 a while ago to make this clearer. Look at the received cluster view and you should see that it lists 2 different node names. Also note that both servers must have TCPPING configured with all nodes.
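The view line in the log (ISPN000094) can be read mechanically to see which members a node currently knows about. A small sketch, assuming the line format shown in the logs in this thread; parse_cluster_view is a made-up helper name.

```python
import re

# Matches e.g. "ISPN000094: Received new cluster view: [node1/web|1] [node1/web, node2/web]"
VIEW_RE = re.compile(
    r"ISPN000094: Received new cluster view: "
    r"\[(?P<creator>[^|\]]+)\|(?P<view_id>\d+)\] \[(?P<members>[^\]]*)\]"
)


def parse_cluster_view(line):
    """Return (creator, view_id, members) for a view log line, or None."""
    match = VIEW_RE.search(line)
    if match is None:
        return None
    members = [m.strip() for m in match.group("members").split(",") if m.strip()]
    return match.group("creator"), int(match.group("view_id")), members


line = ("15:29:24,787 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] "
        "(pool-14-thread-1) ISPN000094: Received new cluster view: "
        "[node1/web|1] [node1/web, node2/web]")
print(parse_cluster_view(line))
```

Two distinct member names in the list mean the nodes have found each other; a single name means the node is still alone in its view.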

                      • 8. Re: What is my error in Cluster AS7.1 between two machines
                        mgordon

                        After I restarted node 1, I got the correct cluster configuration.

                         

                        16:23:50,591 INFO  [org.infinispan.jmx.CacheJmxRegistration] (pool-15-thread-1) ISPN000031: MBeans were successfully registered to the platform mbean server.

                        16:23:50,702 INFO  [org.jboss.as.clustering.infinispan] (pool-15-thread-1) JBAS010281: Started repl cache from web container

                        16:23:50,716 INFO  [org.jboss.as.clustering.impl.CoreGroupCommunicationService.web] (MSC service thread 1-1) JBAS010206: Number of cluster members: 2

                        16:23:50,779 INFO  [org.infinispan.configuration.cache.EvictionConfigurationBuilder] (MSC service thread 1-2) ISPN000152: Passivation configured without an eviction policy being selected. Only manually evicted entities will be pasivated.

                        16:23:50,781 INFO  [org.infinispan.config.ConfigurationValidatingVisitor] (MSC service thread 1-2) ISPN000152: Passivation configured without an eviction policy being selected. Only manually evicted entities will be pasivated.

                         

                        And when I stop or start one of the nodes, the other one reflects the change.

                         

                        BUT

                         

                        My application, with the distributable option in web.xml, doesn't work properly: the session is not transmitted from one node to the other.

                         

                         

                         

                        Do I need to install Apache? Do I need to configure the cache with any specific parameter? What's the matter?

                         

                         

                        TIP: I previously forgot to say that both my machines are in Amazon EC2. I suppose that is why communication through UDP doesn't work for me!

                        • 9. Re: What is my error in Cluster AS7.1 between two machines
                          mgordon

                          Hello,

                           

                          After a week in which I haven't been working on this issue, I'm back.

                           

                          My cluster and application work properly if the two servers are on the same machine, and I have configured Apache with mod_cluster and it works properly.

                           

                          BUT I still have the same problem when I configure my two servers on two different nodes. My two nodes are in the Amazon cloud, but I am not able to get my application to work.

                           

                          Do I need to configure the cache with any specific parameter? What's the matter?

                           

                           

                          My log looks similar to this:  ...

                           

                          13:09:58,759 INFO  [org.infinispan.configuration.cache.EvictionConfigurationBuilder] (ServerService Thread Pool -- 33) ISPN000152: Passivation configured without an eviction policy being selected. Only manually evicted entities will be pasivated.

                          13:09:58,764 INFO  [org.infinispan.configuration.cache.EvictionConfigurationBuilder] (ServerService Thread Pool -- 33) ISPN000152: Passivation configured without an eviction policy being selected. Only manually evicted entities will be pasivated.

                          13:09:58,952 INFO  [org.apache.coyote.http11.Http11Protocol] (MSC service thread 1-1) Arrancando Coyote HTTP/1.1 en puerto http--192.168.106.51-8080

                          13:09:59,227 INFO  [org.apache.coyote.ajp.AjpProtocol] (MSC service thread 1-1) Arrancando Coyote AJP/1.3 en ajp--192.168.106.51-8009

                          13:09:59,728 INFO  [org.jboss.as.mail.extension] (MSC service thread 1-1) JBAS015400: Bound mail session [java:jboss/mail/Default]

                          13:10:00,022 INFO  [org.jboss.ws.common.management.AbstractServerConfig] (MSC service thread 1-2) JBoss Web Services - Stack CXF Server 4.0.2.GA

                          13:10:01,921 INFO  [org.jboss.as.connector.subsystems.datasources] (MSC service thread 1-2) JBAS010400: Bound data source [java:jboss/datasources/ExampleDS]

                          13:10:02,325 INFO  [org.jboss.as.remoting] (MSC service thread 1-1) JBAS017100: Listening on /192.168.106.51:9999

                          13:10:02,335 INFO  [org.jboss.as.remoting] (MSC service thread 1-1) JBAS017100: Listening on /192.168.106.51:4447

                          13:10:02,390 INFO  [org.jboss.as.server.deployment] (MSC service thread 1-1) JBAS015876: Starting deployment of "ClusterWebApp.war"

                          13:10:02,410 INFO  [org.jboss.as.server.deployment.scanner] (MSC service thread 1-1) JBAS015012: Started FileSystemDeploymentService for directory /software/jboss-7.1.1/bin/../standalone-node2/deployments

                          13:10:02,411 INFO  [org.jboss.as.server.deployment] (MSC service thread 1-1) JBAS015876: Starting deployment of "cluster-demo.war"

                          13:10:06,478 INFO  [stdout] (pool-11-thread-1)

                          13:10:06,479 INFO  [stdout] (pool-11-thread-1) -------------------------------------------------------------------

                          13:10:06,480 INFO  [stdout] (pool-11-thread-1) GMS: address=node2/web, cluster=web, physical address=192.168.106.51:7600

                          13:10:06,480 INFO  [stdout] (pool-11-thread-1) -------------------------------------------------------------------

                          13:10:07,107 INFO  [org.infinispan.configuration.cache.EvictionConfigurationBuilder] (MSC service thread 1-2) ISPN000152: Passivation configured without an eviction policy being selected. Only manually evicted entities will be pasivated.

                          13:10:07,126 INFO  [org.infinispan.configuration.cache.EvictionConfigurationBuilder] (MSC service thread 1-1) ISPN000152: Passivation configured without an eviction policy being selected. Only manually evicted entities will be pasivated.

                          13:10:07,271 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (pool-12-thread-1) ISPN000078: Starting JGroups Channel

                          13:10:07,277 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (pool-12-thread-1) ISPN000094: Received new cluster view: [node1/web|1] [node1/web, node2/web]

                          13:10:07,282 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (pool-12-thread-1) ISPN000079: Cache local address is node2/web, physical addresses are [192.168.106.51:7600]

                          13:10:07,302 INFO  [org.infinispan.factories.GlobalComponentRegistry] (pool-12-thread-1) ISPN000128: Infinispan version: Infinispan 'Brahma' 5.1.2.FINAL

                          13:10:07,303 INFO  [org.infinispan.config.ConfigurationValidatingVisitor] (pool-12-thread-1) ISPN000152: Passivation configured without an eviction policy being selected. Only manually evicted entities will be pasivated.

                          13:10:07,823 INFO  [org.infinispan.jmx.CacheJmxRegistration] (pool-12-thread-1) ISPN000031: MBeans were successfully registered to the platform mbean server.

                          13:10:07,931 INFO  [org.jboss.as.clustering.infinispan] (pool-12-thread-1) JBAS010281: Started repl cache from web container

                          13:10:07,942 INFO  [org.jboss.as.clustering.impl.CoreGroupCommunicationService.web] (MSC service thread 1-1) JBAS010206: Number of cluster members: 2

                           

                           

                          Thanks in advance...

                          • 10. Re: What is my error in Cluster AS7.1 between two machines
                            rhusar

                            Do I need to install Apache? Do I need to configure the cache with any specific parameter? What's the matter?

                            Yes, you need an LB in most cases.

                             

                            TIP: I previously forgot to say that both my machines are in Amazon EC2. I suppose that is why communication through UDP doesn't work for me!

                            Yes, you could have said that at the beginning. We know that UDP multicast, which is the default in AS7, doesn't work in EC2. (AFAIR UDP unicast does work, but that's not any better than TCP.)

                             

                            The server log looks healthy, so what is the problem? Is the session not replicating? How do you check that?

                            • 11. Re: What is my error in Cluster AS7.1 between two machines
                              mgordon

                              Thanks, Radoslav.

                               

                              About your answer:

                              Do I need to install Apache? Do I need to configure the cache with any specific parameter? What's the matter?

                              Yes, you need an LB in most cases.

                               

                              Why don't I need an LB when my servers are on the same machine, but I do need one when my servers are on two different machines? I don't understand why.

                               

                              Thanks.

                              • 12. Re: What is my error in Cluster AS7.1 between two machines
                                rhusar

                                OK, I see your problem now: your test is wrong.

                                 

                                Why don't I need an LB when my servers are on the same machine, but I do need one when my servers are on two different machines? I don't understand why.

                                 

                                How is a session identified? By a cookie, set through the HTTP Set-Cookie header. Cookies are saved for a HOST and NOT for a port (otherwise, e.g., switching from an unsecured connection on port 80 to SSL would lose your session). Thus when you are testing locally on one node, the hostname is the same, the cookie is sent, and the session is found. When you are testing with 2 different nodes, you need to send that cookie as well, but you don't, because the hostname is different. (It would be a major security issue in your browser if it worked that way.) Thus you either need an LB, so that the cookie is remembered for the LB address, or you need to spoof that cookie, e.g. using a telnet client.
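Spoofing the cookie as described above does not have to be done by hand in telnet. The sketch below shows one way to carry the session ID from one node to the other. JSESSIONID is the standard servlet session cookie name; the helper names, the /cluster-demo/ context path and port 8080 on node 1 are assumptions for illustration, not taken from a verified setup.

```python
import urllib.request
from http.cookies import SimpleCookie


def extract_session_id(set_cookie_header):
    """Return the JSESSIONID value from a Set-Cookie header value, or None."""
    cookie = SimpleCookie()
    cookie.load(set_cookie_header)
    morsel = cookie.get("JSESSIONID")
    return morsel.value if morsel is not None else None


def request_with_session(url, session_id):
    """Build a request that presents an existing session ID to any host."""
    request = urllib.request.Request(url)
    # A browser would refuse to send node 1's cookie to node 2; we set it explicitly.
    request.add_header("Cookie", "JSESSIONID=" + session_id)
    return request


# Example flow (hosts from the thread's logs; paths and node 1's port are assumed):
#   first = urllib.request.urlopen("http://192.168.106.50:8080/cluster-demo/")
#   sid = extract_session_id(first.headers["Set-Cookie"])
#   second = urllib.request.urlopen(
#       request_with_session("http://192.168.106.51:8080/cluster-demo/", sid))
```

If the second response shows the session data written through the first node, replication is working, and the earlier failure was only the browser declining to send the cookie to a different host.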