-
15. Re: Co-located replication failover configuration in standalone-ha.xml EAP 7
mnovak May 24, 2018 8:39 AM (in response to vamshi1413)
In the CLI you can check whether an Artemis server is active with a command like this (change the server name to match yours):
/subsystem=messaging-activemq/server=master:read-attribute(name=active)
/subsystem=messaging-activemq/server=slave:read-attribute(name=active)
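If the attribute is available, the CLI typically returns DMR output along these lines (a sketch; the exact result depends on your configuration):

```
{
    "outcome" => "success",
    "result" => true
}
```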
Those WARNs are OK. I can see the following there:
2018-05-23 11:57:51,955 INFO [org.apache.activemq.artemis.core.server] (AMQ119000: Activation for server ActiveMQServerImpl::serverUUID=null) AMQ221007: Server is now live
which means that the backup is activated. Then there are two other logs from the cluster connection ending with "...is connected", which means that the master/live server and the newly activated backup/slave formed a cluster. So it works. You can try to start the killed server again and see whether failback succeeds.
-
16. Re: Co-located replication failover configuration in standalone-ha.xml EAP 7
vamshi1413 May 24, 2018 12:55 PM (in response to mnovak)
mnovak wrote:
In the CLI you can check whether an Artemis server is active with a command like this (change the server name to match yours):
/subsystem=messaging-activemq/server=master:read-attribute(name=active)
/subsystem=messaging-activemq/server=slave:read-attribute(name=active)
I don't see such an attribute in the CLI. Are those only available in the replication-master / replication-slave configuration? My JBoss is running with the replication-colocated configuration, and I don't see those attributes.
Those WARNs are OK. I can see the following there:
2018-05-23 11:57:51,955 INFO [org.apache.activemq.artemis.core.server] (AMQ119000: Activation for server ActiveMQServerImpl::serverUUID=null) AMQ221007: Server is now live
which means that the backup is activated. Then there are two other logs from the cluster connection ending with "...is connected", which means that the master/live server and the newly activated backup/slave formed a cluster. So it works.
Are these comments still valid for my replication-colocated configuration?
You can try to start killed server again and see if failback succeeds.
I didn't configure failback, because I don't want the newly active server (server2) to stop processing messages and switch back to server1 after it is brought up.
That said, here is what I see when I start server1 after stopping it:
Server1
2018-05-24 11:47:53,289 INFO [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 68) AMQ221007: Server is now live
2018-05-24 11:47:53,289 INFO [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 68) AMQ221001: Apache ActiveMQ Artemis Message Broker version 1.1.0.SP24-redhat-1 [nodeID=c869c816-5df0-11e8-bd9d-357bd70a9cb9]
2018-05-24 11:47:54,039 INFO [org.apache.activemq.artemis.core.server] (Thread-1 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$2@69e6b1d0-1390278372)) AMQ221027: Bridge ClusterConnectionBridge@1c9020e8 [name=sf.my-cluster.6ea64fd5-57aa-11e8-9ff9-7d30e8dfbf51, queue=QueueImpl[name=sf.my-cluster.6ea64fd5-57aa-11e8-9ff9-7d30e8dfbf51, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=c869c816-5df0-11e8-bd9d-357bd70a9cb9]]@48b0623a targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@1c9020e8 [name=sf.my-cluster.6ea64fd5-57aa-11e8-9ff9-7d30e8dfbf51, queue=QueueImpl[name=sf.my-cluster.6ea64fd5-57aa-11e8-9ff9-7d30e8dfbf51, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=c869c816-5df0-11e8-bd9d-357bd70a9cb9]]@48b0623a targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=http-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEnabled=true&httpPpgradeEndpoint=http-acceptor&port=8180&host=stg-dmz-app25-wernerds-net], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@818909400[nodeUUID=c869c816-5df0-11e8-bd9d-357bd70a9cb9, connector=TransportConfiguration(name=http-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEnabled=true&httpPpgradeEndpoint=http-acceptor&port=8180&host=stg-dmz-app24-wernerds-net, address=jms, server=ActiveMQServerImpl::serverUUID=c869c816-5df0-11e8-bd9d-357bd70a9cb9])) [initialConnectors=[TransportConfiguration(name=http-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEnabled=true&httpPpgradeEndpoint=http-acceptor&port=8180&host=stg-dmz-app25-wernerds-net], discoveryGroupConfiguration=null]] is connected
Server2
2018-05-24 11:47:52,053 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-18,ee,stg-dmz-app25:stg-app24.Member2) ISPN000094: Received new cluster view for channel server: [stg-dmz-app25:stg-app24.Member2|3] (2) [stg-dmz-app25:stg-app24.Member2, stg-dmz-app24:stg-app24.Member1]
2018-05-24 11:47:52,054 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-18,ee,stg-dmz-app25:stg-app24.Member2) ISPN000094: Received new cluster view for channel web: [stg-dmz-app25:stg-app24.Member2|3] (2) [stg-dmz-app25:stg-app24.Member2, stg-dmz-app24:stg-app24.Member1]
2018-05-24 11:47:52,056 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-18,ee,stg-dmz-app25:stg-app24.Member2) ISPN000094: Received new cluster view for channel ejb: [stg-dmz-app25:stg-app24.Member2|3] (2) [stg-dmz-app25:stg-app24.Member2, stg-dmz-app24:stg-app24.Member1]
2018-05-24 11:47:52,058 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-18,ee,stg-dmz-app25:stg-app24.Member2) ISPN000094: Received new cluster view for channel hibernate: [stg-dmz-app25:stg-app24.Member2|3] (2) [stg-dmz-app25:stg-app24.Member2, stg-dmz-app24:stg-app24.Member1]
2018-05-24 11:47:52,851 INFO [org.infinispan.CLUSTER] (remote-thread--p8-t1) ISPN000310: Starting cluster-wide rebalance for cache routing, topology CacheTopology{id=5, rebalanceId=3, currentCH=DefaultConsistentHash{ns=80, owners = (1)[stg-dmz-app25:stg-app24.Member2: 80+0]}, pendingCH=DefaultConsistentHash{ns=80, owners = (2)[stg-dmz-app25:stg-app24.Member2: 40+40, stg-dmz-app24:stg-app24.Member1: 40+40]}, unionCH=null, actualMembers=[stg-dmz-app25:stg-app24.Member2, stg-dmz-app24:stg-app24.Member1]}
2018-05-24 11:47:52,872 INFO [org.infinispan.CLUSTER] (remote-thread--p8-t1) ISPN000310: Starting cluster-wide rebalance for cache cluster-demo.war, topology CacheTopology{id=5, rebalanceId=3, currentCH=DefaultConsistentHash{ns=80, owners = (1)[stg-dmz-app25:stg-app24.Member2: 80+0]}, pendingCH=DefaultConsistentHash{ns=80, owners = (2)[stg-dmz-app25:stg-app24.Member2: 40+40, stg-dmz-app24:stg-app24.Member1: 40+40]}, unionCH=null, actualMembers=[stg-dmz-app25:stg-app24.Member2, stg-dmz-app24:stg-app24.Member1]}
2018-05-24 11:47:53,161 INFO [org.infinispan.CLUSTER] (remote-thread--p8-t3) ISPN000336: Finished cluster-wide rebalance for cache cluster-demo.war, topology id = 5
2018-05-24 11:47:53,168 INFO [org.infinispan.CLUSTER] (remote-thread--p8-t3) ISPN000336: Finished cluster-wide rebalance for cache routing, topology id = 5
2018-05-24 11:47:53,257 WARN [org.apache.activemq.artemis.core.client] (activemq-discovery-group-thread-dg-group1) AMQ212034: There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=c869c816-5df0-11e8-bd9d-357bd70a9cb9
2018-05-24 11:47:53,257 WARN [org.apache.activemq.artemis.core.client] (activemq-discovery-group-thread-dg-group1) AMQ212034: There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=c869c816-5df0-11e8-bd9d-357bd70a9cb9
2018-05-24 11:47:53,258 WARN [org.apache.activemq.artemis.core.client] (activemq-discovery-group-thread-dg-group1) AMQ212034: There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=c869c816-5df0-11e8-bd9d-357bd70a9cb9
2018-05-24 11:47:54,272 INFO [org.apache.activemq.artemis.core.server] (Thread-28 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$2@50362333-1385211039)) AMQ221027: Bridge ClusterConnectionBridge@a2b3f11 [name=sf.my-cluster.c869c816-5df0-11e8-bd9d-357bd70a9cb9, queue=QueueImpl[name=sf.my-cluster.c869c816-5df0-11e8-bd9d-357bd70a9cb9, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=6ea64fd5-57aa-11e8-9ff9-7d30e8dfbf51]]@2cfaef88 targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@a2b3f11 [name=sf.my-cluster.c869c816-5df0-11e8-bd9d-357bd70a9cb9, queue=QueueImpl[name=sf.my-cluster.c869c816-5df0-11e8-bd9d-357bd70a9cb9, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=6ea64fd5-57aa-11e8-9ff9-7d30e8dfbf51]]@2cfaef88 targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=http-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEnabled=true&httpPpgradeEndpoint=http-acceptor&port=8680&host=stg-dmz-app25-wernerds-net], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@360168733[nodeUUID=6ea64fd5-57aa-11e8-9ff9-7d30e8dfbf51, connector=TransportConfiguration(name=http-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEnabled=true&httpPpgradeEndpoint=http-acceptor&port=8180&host=stg-dmz-app25-wernerds-net, address=jms, server=ActiveMQServerImpl::serverUUID=6ea64fd5-57aa-11e8-9ff9-7d30e8dfbf51])) [initialConnectors=[TransportConfiguration(name=http-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEnabled=true&httpPpgradeEndpoint=http-acceptor&port=8680&host=stg-dmz-app25-wernerds-net], discoveryGroupConfiguration=null]] is connected
-
17. Re: Co-located replication failover configuration in standalone-ha.xml EAP 7
mnovak May 25, 2018 5:01 AM (in response to vamshi1413)
I thought you were trying it with the replication-master and replication-slave config, where there are two servers in the messaging-activemq subsystem. I have not played with replication-colocated, so in this case I'm not sure whether it's possible to check in the CLI that the backup is active. However, you can check that the backup has opened the ports for its acceptors (which have the configured port offset).
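As a rough way to check this (a sketch only; the host and port below are hypothetical placeholders for your backup acceptor's address), a few lines of Java can test whether the backup's acceptor port is accepting TCP connections:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {

    // Attempts a plain TCP connect; returns true if something is listening.
    static boolean portOpen(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical values: replace with your host and the backup
        // acceptor's port (socket-binding port + backup-port-offset).
        String host = "localhost";
        int backupAcceptorPort = 8180;
        System.out.println(host + ":" + backupAcceptorPort
                + (portOpen(host, backupAcceptorPort, 2000) ? " is open" : " is closed"));
    }
}
```

The same check can of course be done with netstat or ss on the server host.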
Are log messages like this one:
2018-05-24 11:47:53,257 WARN [org.apache.activemq.artemis.core.client] (activemq-discovery-group-thread-dg-group1) AMQ212034: There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=c869c816-5df0-11e8-bd9d-357bd70a9cb9
logged periodically, or do you see just a few of them?
I think it's better to configure allow-failback=true, because otherwise server 2 will have two active Artemis servers and will handle all the load alone.
Do you have a client or application which you could use to try whether failover works? For example, you could try an MDB deployed to a 3rd server, configured to connect to the remote server 1. The MDB would resend messages from InQueue to OutQueue. You would kill server 1 and check that the MDB failed over to the slave/backup on server 2. In the end, the number of messages sent to InQueue must be the same as the number of messages in OutQueue.
-
18. Re: Co-located replication failover configuration in standalone-ha.xml EAP 7
vamshi1413 May 30, 2018 5:09 PM (in response to mnovak)
Sorry, I was O.O.O. the past few days and couldn't get back to you.
mnovak wrote:
I thought you were trying it with the replication-master and replication-slave config, where there are two servers in the messaging-activemq subsystem. I have not played with replication-colocated, so in this case I'm not sure whether it's possible to check in the CLI that the backup is active. However, you can check that the backup has opened the ports for its acceptors (which have the configured port offset).
Like I said in Re: Co-located replication failover configuration in standalone-ha.xml EAP 7, I was using replication-colocated; sorry if I didn't convey it clearly. The backup port offset will be based on the connectors/acceptors, but I don't know exactly what port I am looking for. Can you help me find port + backup-port-offset, i.e. how to find the actual port?
Are log messages like this one:
2018-05-24 11:47:53,257 WARN [org.apache.activemq.artemis.core.client] (activemq-discovery-group-thread-dg-group1) AMQ212034: There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=c869c816-5df0-11e8-bd9d-357bd70a9cb9
logged periodically, or do you see just a few of them?
No, they appear only once per node, as the log message itself says.
OK, I configured allow-failback=true as per your suggestion.
I have a sample application which I downloaded online (JMS Queue Clustering on JBoss EAP Server), but I am not quite sure whether it has the logic to test the scenario you mentioned. Is there an application you can suggest to test the functionality?
-
19. Re: Co-located replication failover configuration in standalone-ha.xml EAP 7
mnovak Jun 1, 2018 2:40 AM (in response to vamshi1413)
Like I said in Re: Co-located replication failover configuration in standalone-ha.xml EAP 7, I was using replication-colocated; sorry if I didn't convey it clearly. The backup port offset will be based on the connectors/acceptors, but I don't know exactly what port I am looking for. Can you help me find port + backup-port-offset, i.e. how to find the actual port?
Take a look at which socket binding your remote-acceptor is using. From the definition of the socket binding you get the port. The backup port offset is then defined in the <replication-colocated ...> config. So when the backup activates, you can check whether its ports are open.
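For illustration (hypothetical values; the attribute names follow the EAP 7 messaging-activemq schema, but check your own standalone-ha.xml), suppose the acceptor uses the http socket binding and the colocated policy defines a backup port offset:

```xml
<!-- hypothetical excerpt; verify against your standalone-ha.xml -->
<socket-binding name="http" port="${jboss.http.port:8080}"/>

<!-- inside the messaging-activemq server configuration -->
<replication-colocated request-backup="true" backup-port-offset="100"/>
```

With these values the activated backup's acceptor would listen on 8080 + 100 = 8180, plus any server-wide offset the instance was started with (e.g. -Djboss.socket.binding.port-offset).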
No, they appear only once per node as it says in the log.
Sounds good. That is OK.
I have a sample application which I downloaded online (JMS Queue Clustering on JBoss EAP Server), but I am not quite sure whether it has the logic to test the scenario you mentioned. Is there an application you can suggest to test the functionality?
There are just compiled classes in the project, so it's hard to tell. However, you can try the following MDB. It looks long, but it's easy once you read the code. It consumes messages from the queue InQueue, and in the onMessage() method it sends one message to OutQueue (the default java:/JmsXA connection factory is used for sending the message to OutQueue). All of this happens in an XA transaction, so when receiving or sending fails, the whole transaction is rolled back, which means the message "returns" to InQueue and is processed later, for example after failover ;-)
You will need to deploy the queues InQueue and OutQueue to both servers using the CLI:
/subsystem=messaging-activemq/server=default/jms-queue=InQueue:add(entries=[java:jboss/exported/jms/queue/InQueue, java:/jms/queue/InQueue])
/subsystem=messaging-activemq/server=default/jms-queue=OutQueue:add(entries=[java:jboss/exported/jms/queue/OutQueue, java:/jms/queue/OutQueue])
The MDB will be deployed on the 3rd server, where you will configure the connector in the pooled-connection-factory to point to the 1st server. You will send messages to InQueue on the 1st server and check that the MDB is processing messages (there are messages in OutQueue). Then you will kill the 1st server and see whether the MDB failed over => it is still consuming messages and there are new messages in OutQueue. Don't be scared when you see tens of exceptions in the server.log of the 3rd server after you kill the 1st server; it's normal.
import java.util.ServiceLoader;
import java.util.concurrent.atomic.AtomicInteger;

import javax.annotation.Resource;
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.ejb.MessageDrivenContext;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;
import javax.ejb.TransactionManagement;
import javax.ejb.TransactionManagementType;
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;

import org.jboss.logging.Logger;

// Note: JMSImplementation comes from the test harness this MDB was taken from;
// adjust its import (or the duplicate-detection header lookup) to your project.

@MessageDriven(name = "mdb1", activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
        @ActivationConfigProperty(propertyName = "rebalanceConnections", propertyValue = "true"),
        @ActivationConfigProperty(propertyName = "hA", propertyValue = "true"),
        @ActivationConfigProperty(propertyName = "destination", propertyValue = "jms/queue/InQueue")})
@TransactionManagement(value = TransactionManagementType.CONTAINER)
@TransactionAttribute(value = TransactionAttributeType.REQUIRED)
public class MdbWithRemoteOutQueueToContaniner1 implements MessageListener {

    private static final long serialVersionUID = 2770941392406343837L;
    private static final Logger log = Logger.getLogger(MdbWithRemoteOutQueueToContaniner1.class.getName());
    private static final JMSImplementation jmsImplementation = ServiceLoader.load(JMSImplementation.class).iterator().next();
    private Queue queue = null;
    public static AtomicInteger numberOfProcessedMessages = new AtomicInteger();

    @Resource(mappedName = "java:/JmsXA")
    private ConnectionFactory cf;

    @Resource
    private MessageDrivenContext context;

    @Override
    public void onMessage(Message message) {
        Connection con = null;
        Session session = null;
        try {
            long time = System.currentTimeMillis();
            int counter = 0;
            try {
                counter = message.getIntProperty("count");
            } catch (Exception e) {
                log.error(e.getMessage(), e);
            }
            String messageInfo = message.getJMSMessageID() + ", count:" + counter;
            log.debug(" Start of message:" + messageInfo);

            // Simulate some processing time.
            for (int i = 0; i < (5 + 5 * Math.random()); i++) {
                try {
                    Thread.sleep((int) (10 + 10 * Math.random()));
                } catch (InterruptedException ex) {
                }
            }

            // Send one message to OutQueue via the XA-enabled java:/JmsXA factory.
            con = cf.createConnection();
            session = con.createSession(false, Session.AUTO_ACKNOWLEDGE);
            if (queue == null) {
                queue = session.createQueue("OutQueue");
            }
            con.start();
            String text = message.getJMSMessageID() + " processed by: " + hashCode();
            MessageProducer sender = session.createProducer(queue);
            TextMessage newMessage = session.createTextMessage(text);
            newMessage.setStringProperty("inMessageId", message.getJMSMessageID());
            // Propagate the duplicate-detection header so redeliveries are deduplicated.
            newMessage.setStringProperty(jmsImplementation.getDuplicatedHeader(),
                    message.getStringProperty(jmsImplementation.getDuplicatedHeader()));
            sender.send(newMessage);
            messageInfo = messageInfo + ". Sending new message with inMessageId: "
                    + newMessage.getStringProperty("inMessageId")
                    + " and messageId: " + newMessage.getJMSMessageID();
            log.debug("End of " + messageInfo + " in " + (System.currentTimeMillis() - time) + " ms");
            if (numberOfProcessedMessages.incrementAndGet() % 100 == 0)
                log.info(messageInfo + " in " + (System.currentTimeMillis() - time) + " ms");
        } catch (Exception t) {
            // Roll back the XA transaction so the message returns to InQueue.
            log.error(t.getMessage(), t);
            this.context.setRollbackOnly();
        } finally {
            if (con != null) {
                try {
                    con.close();
                } catch (JMSException e) {
                    log.fatal(e.getMessage(), e);
                }
            }
        }
    }
}