
    Wildfly 10.1 fails to deploy app during Artemis failback

    asimmahmood

      Hi all,

       

      I’m attempting to run two instances of Wildfly 10.1 in a cluster. Both instances are running 2 Artemis activemq servers, one which is live and one which backs up the live server on the other Wildfly instance. The live backup pairs are configured to use network replication to synchronise their data.

       

      The issue I’m having is that my application will fail to deploy if the live Artemis instance needs to synchronise with the backup Artemis instance at startup (during failback). The errors I’m getting are:

       

      "WFLYCTL0412: Required services that are not installed:" => ["jboss.naming.context.java.jboss.DefaultJMSConnectionFactory

       

      And

       

      "Services that may be the cause:" => [

      "jboss.naming.context.java.JmsXA",

      "jboss.naming.context.java.jboss.DefaultJMSConnectionFactory",

      "jboss.ra.activemq-ra"

              ]

       

      Shortly after getting these errors I can see the following messages in the logs:

      WFLYJCA0002: Bound JCA ConnectionFactory [java:/JmsXA]

      WFLYMSGAMQ0002: Bound messaging object to jndi name java:jboss/DefaultJMSConnectionFactory

       

      I believe Wildfly is attempting to deploy my application before Artemis is ready. In the case of a failback the backup server (which became live) needs to replicate its state back to the live server before the live server can start. This will obviously take a little longer than a clean startup, and I believe it is this extra time which causes Artemis to start up later; as a result the connection factories aren’t available when the application is deployed.

       

      Is there a way to ensure my application is only deployed once Artemis has been completely initialised? My configuration for the messaging-activemq subsystem is shown below. Is there anything in it that could be causing this issue? I start the first node with the jboss.node.name system property set to node-1 and artemis.backup.node.name set to node-2. The second node has jboss.node.name set to node-2 and artemis.backup.node.name set to node-1.

       

       <subsystem xmlns="urn:jboss:domain:messaging-activemq:1.0">
                  <server name="default">
                      <cluster password="${jboss.messaging.cluster.password:CHANGE ME!!}"/>
                      <replication-master check-for-live-server="true" group-name="${jboss.node.name:node-1}" cluster-name="my-cluster"/>
                      <security-setting name="#">
                          <role name="guest" send="true" consume="true" create-non-durable-queue="true" delete-non-durable-queue="true"/>
                      </security-setting>
                      <address-setting name="#" dead-letter-address="jms.queue.DLQ" expiry-address="jms.queue.ExpiryQueue" max-size-bytes="10485760" page-size-bytes="2097152" message-counter-history-day-limit="10" redistribution-delay="1000"/>
                      ... (address-settings)
                      <remote-connector name="netty" socket-binding="messaging"/>
                      <remote-connector name="netty-throughput" socket-binding="messaging-throughput">
                          <param name="batch-delay" value="50"/>
                      </remote-connector>
                      <in-vm-connector name="in-vm" server-id="0"/>
                      <remote-acceptor name="netty" socket-binding="messaging"/>
                      <remote-acceptor name="netty-throughput" socket-binding="messaging-throughput">
                          <param name="batch-delay" value="50"/>
                          <param name="direct-deliver" value="false"/>
                      </remote-acceptor>
                      <in-vm-acceptor name="in-vm" server-id="0"/>
                      <broadcast-group name="bg-group1" jgroups-channel="activemq-cluster" connectors="netty"/>
                      <discovery-group name="dg-group1" jgroups-channel="activemq-cluster"/>
                      <cluster-connection name="my-cluster" address="jms" connector-name="netty" discovery-group="dg-group1"/>
                      <jms-queue name="ExpiryQueue" entries="java:/jms/queue/ExpiryQueue"/>
                      <jms-queue name="DLQ" entries="java:/jms/queue/DLQ"/>
                      ... (jms-queues)
                      <connection-factory name="InVmConnectionFactory" entries="java:/ConnectionFactory" connectors="in-vm"/>
                      <connection-factory name="RemoteConnectionFactory" entries="java:jboss/exported/jms/RemoteConnectionFactory" connectors="netty" ha="true" block-on-acknowledge="true" reconnect-attempts="-1"/>
                      <pooled-connection-factory name="activemq-ra" entries="java:/JmsXA java:jboss/DefaultJMSConnectionFactory" connectors="in-vm" transaction="xa"/>
                  </server>
                  <server name="backup">
                      <cluster password="${jboss.messaging.cluster.password:CHANGE ME!!}"/>
                      <replication-slave max-saved-replicated-journal-size="-1" group-name="${artemis.backup.node.name:node-2}" cluster-name="my-cluster"/>
                      <bindings-directory path="activemq/bindings-backup"/>
                      <journal-directory path="activemq/journal-backup"/>
                      <large-messages-directory path="activemq/largemessages-backup"/>
                      <paging-directory path="activemq/paging-backup"/>
                      <address-setting name="#" redistribution-delay="0" page-size-bytes="524288" max-size-bytes="1048576" max-delivery-attempts="200"/>
                      <remote-connector name="netty-backup" socket-binding="messaging-backup"/>
                      <in-vm-connector name="in-vm" server-id="1"/>
                      <remote-acceptor name="netty-backup" socket-binding="messaging-backup"/>
                      <broadcast-group name="bg-group-backup" connectors="netty-backup" jgroups-channel="activemq-cluster"/>
                      <discovery-group name="dg-group-backup" jgroups-channel="activemq-cluster"/>
                      <cluster-connection name="my-cluster" discovery-group="dg-group-backup" connector-name="netty-backup" address="jms"/>
                  </server>
              </subsystem>
              ...
              <socket-binding-group name="standard-sockets" default-interface="public" port-offset="${jboss.socket.binding.port-offset:0}">
              ...
                     <socket-binding name="messaging" port="5445"/>
                     <socket-binding name="messaging-backup" port="5446"/>
                     <socket-binding name="messaging-throughput" port="5447"/>
              ...
              </socket-binding-group>
      

       

      Any help would be greatly appreciated.

       

      Thanks,

      Asim

        • 1. Re: Wildfly 10.1 fails to deploy app during Artemis failback
          jbertram

          How is your application acquiring its JMS resources (e.g. injection, manual JNDI lookup, etc.)?

          • 2. Re: Wildfly 10.1 fails to deploy app during Artemis failback
            asimmahmood

            Hi Justin,

             

            Thanks for your response. My application injects the connection factory using @Resource(mappedName = "java:/JmsXA"). The queues are acquired by doing a JNDI lookup. As a quick test I tried changing one of my connection factories to

             

            @Inject

            JMSContext context;

             

            However it has made little difference. Do you have any suggestions?
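
            For completeness, the original acquisition code looks roughly like this (only a rough sketch; the class and queue names are placeholders rather than my actual code):

            import javax.annotation.Resource;
            import javax.ejb.Stateless;
            import javax.jms.ConnectionFactory;
            import javax.jms.JMSContext;
            import javax.jms.Queue;
            import javax.naming.InitialContext;
            import javax.naming.NamingException;

            @Stateless
            public class MessageSender {

                // Pooled connection factory injected by its JNDI name
                @Resource(mappedName = "java:/JmsXA")
                private ConnectionFactory connectionFactory;

                public void send(String text) throws NamingException {
                    // Queues are acquired with a manual JNDI lookup (queue name is a placeholder)
                    Queue queue = (Queue) new InitialContext().lookup("java:/jms/queue/testQueue");
                    try (JMSContext context = connectionFactory.createContext()) {
                        context.createProducer().send(queue, text);
                    }
                }
            }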

             

            Thanks,

            Asim

            • 3. Re: Wildfly 10.1 fails to deploy app during Artemis failback
              jbertram

              Since the application server doesn't appear to support this particular use-case I'd recommend you split your application from your messaging infrastructure.  In other words, run a separate messaging-specific group of servers which you connect to over the network from your application(s).

              • 4. Re: Wildfly 10.1 fails to deploy app during Artemis failback
                asimmahmood

                Thanks Justin, I’m not really keen on running separate messaging servers if I can avoid it, so I have attempted to do the same thing within a single Wildfly node. I tried the configuration below and it does seem to have solved the initial issue.

                 

                <subsystem xmlns="urn:jboss:domain:messaging-activemq:1.0">
                            <server name="connect">
                                <bindings-directory path="activemq/bindings-connect"/>
                                <journal-directory path="activemq/journal-connect"/>
                                <large-messages-directory path="activemq/largemessages-connect"/>
                                <paging-directory path="activemq/paging-connect"/>
                                <in-vm-connector name="in-vm" server-id="0"/>
                                <pooled-connection-factory name="activemq-ra" entries="java:/JmsXA java:jboss/DefaultJMSConnectionFactory" connectors="in-vm" transaction="xa"/>
                            </server>
                            <server name="default">
                                <cluster password="${jboss.messaging.cluster.password:CHANGE ME!!}"/>
                                <replication-master check-for-live-server="true" group-name="${jboss.node.name:node-1}" cluster-name="my-cluster"/>
                                <security-setting name="#">
                                    <role name="guest" send="true" consume="true" create-non-durable-queue="true" delete-non-durable-queue="true"/>
                                </security-setting>
                                <address-setting name="#" dead-letter-address="jms.queue.DLQ" expiry-address="jms.queue.ExpiryQueue" max-size-bytes="10485760" page-size-bytes="2097152" message-counter-history-day-limit="10" redistribution-delay="1000"/>
                                ... (address-settings)  
                                <remote-connector name="netty" socket-binding="messaging"/>
                                <remote-connector name="netty-throughput" socket-binding="messaging-throughput">
                                    <param name="batch-delay" value="50"/>
                                </remote-connector>
                                <in-vm-connector name="in-vm" server-id="0"/>
                                <remote-acceptor name="netty" socket-binding="messaging"/>
                                <remote-acceptor name="netty-throughput" socket-binding="messaging-throughput">
                                    <param name="batch-delay" value="50"/>
                                    <param name="direct-deliver" value="false"/>
                                </remote-acceptor>
                                <in-vm-acceptor name="in-vm" server-id="0"/>
                                <broadcast-group name="bg-group1" jgroups-channel="activemq-cluster" connectors="netty"/>
                                <discovery-group name="dg-group1" jgroups-channel="activemq-cluster"/>
                                <cluster-connection name="my-cluster" address="jms" connector-name="netty" discovery-group="dg-group1"/>
                                <jms-queue name="ExpiryQueue" entries="java:/jms/queue/ExpiryQueue"/>
                                <jms-queue name="DLQ" entries="java:/jms/queue/DLQ"/>
                                ... (jms-queues)
                                <connection-factory name="InVmConnectionFactory" entries="java:/ConnectionFactory" connectors="in-vm"/>
                                <connection-factory name="RemoteConnectionFactory" entries="java:jboss/exported/jms/RemoteConnectionFactory" connectors="netty" ha="true" block-on-acknowledge="true" reconnect-attempts="-1"/>
                            </server>
                            <server name="backup">
                                <cluster password="${jboss.messaging.cluster.password:CHANGE ME!!}"/>
                                <replication-slave max-saved-replicated-journal-size="-1" group-name="${hornetq.backup.node.name:node-2}" cluster-name="my-cluster"/>
                                <bindings-directory path="activemq/bindings-backup"/>
                                <journal-directory path="activemq/journal-backup"/>
                                <large-messages-directory path="activemq/largemessages-backup"/>
                                <paging-directory path="activemq/paging-backup"/>
                                <address-setting name="#" redistribution-delay="0" page-size-bytes="524288" max-size-bytes="1048576" max-delivery-attempts="200"/>
                                <remote-connector name="netty-backup" socket-binding="messaging-backup"/>
                                <in-vm-connector name="in-vm" server-id="1"/>
                                <remote-acceptor name="netty-backup" socket-binding="messaging-backup"/>
                                <broadcast-group name="bg-group-backup" connectors="netty-backup" jgroups-channel="activemq-cluster"/>
                                <discovery-group name="dg-group-backup" jgroups-channel="activemq-cluster"/>
                                <cluster-connection name="my-cluster" discovery-group="dg-group-backup" connector-name="netty-backup" address="jms"/>
                            </server>
                        </subsystem>
                

                 

                 

                However I now find that when I gracefully shut down Wildfly there are occasions when all the messages in my queues fail and are moved to my DLQ. After my last shutdown I found that all 149 messages had been sent to the DLQ. The logs were also full of these messages:

                 

                AMQ222149: Message Reference[2147486172]:RELIABLE:ServerMessage[messageID=2147486172,durable=true,userID=06cf727e-8956-11e6-b864-430bd8ce4b60,priority=4, bodySize=1024, timestamp=Mon Oct 03 11:42:05 BST 2016,expiration=0, durable=true, address=jms.queue.testQueue,properties=TypedProperties[__AMQ_CID=06b77d72-8956-11e6-b864-430bd8ce4b60]]@305973320 has reached maximum delivery attempts, sending it to Dead Letter Address jms.queue.testDLQ from jms.queue.testQueue

                 

                 

                I did attempt to run a completely separate messaging server and found that if you shut down the messaging server while the application server is still running, the same thing happens. So I suspect that this is what is happening in the single Wildfly configuration. Do you know if anything can be done to stop Artemis failing all these messages?

                • 5. Re: Wildfly 10.1 fails to deploy app during Artemis failback
                  jbertram

                  I don't understand what you've done here or what problem you've hit now.  Let's focus on the highest priority issue.  Please provide a complete explanation of the issue ideally with steps I might use to reproduce it myself.

                  • 6. Re: Wildfly 10.1 fails to deploy app during Artemis failback
                    asimmahmood

                    Sorry for the confusion. You previously suggested splitting my application from my messaging infrastructure, and this is what I attempted to do with my latest configuration. The “connect” server is simply there to connect to the local “default” server. This is similar to how you would configure Wildfly to connect to a remote messaging server over the network: the “connect” configuration is like the local side and “default” is like the remote side. With this configuration java:/JmsXA is bound early in the server startup process, even if the “default” server first needs to synchronise with its backup during failback.

                     

                    The issue I was having was that when I gracefully shut down the server all the messages on my queues would be sent to the configured dead letter queue, effectively emptying my queues. I attempted to write something to demonstrate this behaviour but in the process found the solution: injecting my queues and JMSContexts seems to have stopped this from happening.
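
                    For anyone else who hits this, the change amounted to roughly the following (a sketch with placeholder names rather than my real code):

                    import javax.annotation.Resource;
                    import javax.ejb.Stateless;
                    import javax.inject.Inject;
                    import javax.jms.JMSContext;
                    import javax.jms.Queue;

                    @Stateless
                    public class QueueSender {

                        // Queue injected by the container instead of being looked up manually
                        @Resource(lookup = "java:/jms/queue/testQueue")
                        private Queue queue;

                        // Container-managed JMSContext (backed by the default JMS connection factory)
                        @Inject
                        private JMSContext context;

                        public void send(String text) {
                            context.createProducer().send(queue, text);
                        }
                    }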

                     

                    Thanks again for your help.

                    • 7. Re: Wildfly 10.1 fails to deploy app during Artemis failback
                      mnovak

                      Hi Asim,

                       

                      this is a disadvantage of the replicated journal: it takes some time to sync the live server before failback. I'm not aware of any way to tell the MDB "do not deploy now and wait for the live server to activate". Currently the MDB just retries its deployment, which will eventually succeed.

                       

                      Your 2nd configuration with the "connect" server is not a good approach for this, as the "connect" server does not have a backup. It breaks what you're trying to achieve.

                       

                      Thanks,

                      Mirek

                      • 8. Re: Wildfly 10.1 fails to deploy app during Artemis failback
                        asimmahmood

                        Hi Mirek,

                         

                        Thanks for the response. I’m not particularly concerned about the longer startup time, but I do need it to start up reliably. When using the first configuration I found that if the MDB failed to deploy it would not reattempt deployment; my EAR would be left as a failed deployment. When trying the 2nd configuration the connection factory would be bound to JNDI very quickly, which would let the MDB deploy. If the “default” configuration hasn’t become active yet then the connection is reattempted, meaning the MDB eventually starts taking messages off the queues.

                         

                        Although the “connect” configuration doesn’t have a backup, I don’t think it needs one. All the messages are held within the “default” configuration, which does have a backup. So in the event of an outage all messages will be received by the MDB on the other Wildfly node. The testing I have done so far has confirmed this. Is there something else I should be considering?

                         

                        Thanks,

                        Asim

                        • 9. Re: Wildfly 10.1 fails to deploy app during Artemis failback
                          jbertram

                          The configuration you're using (i.e. with 3 brokers running in a single node) is pretty bizarre.  I've never seen such a configuration before, and it seems like a work-around for a problem that might have a simpler solution.  I'm still not sure why all your messages would get sent to the DLQ when you shut down a server.  A message is sent to a DLQ when the max delivery attempts has been exceeded which indicates some kind of problem with the consumer or possibly with the message itself.  If you can work-up a test-case I can use to reproduce this behavior I could investigate further.

                          • 10. Re: Wildfly 10.1 fails to deploy app during Artemis failback
                            asimmahmood

                            I agree that the 3 broker configuration is quite bizarre, so if there is a simpler solution that would be preferred. I have put together a test that demonstrates the original failback issue. This is the main problem, so a solution to this would be the best outcome. To set it up, do the following:

                             

                            1. Extract a fresh copy of Wildfly 10.1.
                            2. Copy the contents of the appserver folder in cluster test.zip to the wildfly-10.1.0.Final directory.
                            3. Open bin.test/standalone.bat in a text editor and add your machine's IP address to the last line after the -b flag (see the example after these steps).
                            4. Open bin.test2/standalone.bat in a text editor and add your machine's IP address to the last line after the -b flag.
                            5. Run bin.test/standalone.bat and wait for Wildfly to start up.
                            6. Run bin.test2/standalone.bat and wait for Wildfly to start up.
                            7. Gracefully shut down the second Wildfly instance.
                            8. Run bin.test2/standalone.bat. On this occasion Artemis will fail back and the deployment of the application will fail.
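
                            For illustration only, the edited last lines end up looking something like this (192.168.1.10 is just an example address, and the other arguments in the zip may differ from what is shown here):

                                rem bin.test/standalone.bat - last line, with the IP added after -b (illustrative values only)
                                ... -b 192.168.1.10 -Djboss.node.name=node-1 -Dartemis.backup.node.name=node-2

                                rem bin.test2/standalone.bat - same edit for the second instance
                                ... -b 192.168.1.10 -Djboss.node.name=node-2 -Dartemis.backup.node.name=node-1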

                             

                            Thanks,

                            Asim

                            • 11. Re: Wildfly 10.1 fails to deploy app during Artemis failback
                              jbertram

                              The application does indeed fail to deploy on fail-back which isn't surprising given that the local broker isn't yet active.  However, my question was why the messages would go to the DLQ.  That's the part I don't understand.  Can you shed any light on that?

                              • 12. Re: Wildfly 10.1 fails to deploy app during Artemis failback
                                asimmahmood

                                Sorry, I may have confused things slightly. Messages only ever went to the DLQ when using the 3 broker configuration or when I connected to a remote broker over the network. These were both used as workarounds to the problem demonstrated in my last post. However I have since decided that neither of these configurations is suitable and I’m now investigating other options. If this particular issue is of interest to you I can put together a test to demonstrate it, but it’s not something I need an answer to.

                                 

                                It seems like Wildfly doesn’t support high availability of messages in a collocated replicated configuration. As my application is injecting the JMSContext and has an MDB that uses the default resource adaptor, I would have thought that this would be enough to tell Wildfly there is a dependency on Artemis?
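
                                For context, the MDB is declared roughly like this (a sketch; the destination and class names are placeholders, not my real application):

                                import javax.ejb.ActivationConfigProperty;
                                import javax.ejb.MessageDriven;
                                import javax.jms.JMSException;
                                import javax.jms.Message;
                                import javax.jms.MessageListener;
                                import javax.jms.TextMessage;

                                // Uses the default resource adapter (activemq-ra), which is why I expected
                                // Wildfly to see a dependency on the Artemis subsystem.
                                @MessageDriven(activationConfig = {
                                        @ActivationConfigProperty(propertyName = "destinationLookup", propertyValue = "java:/jms/queue/testQueue"),
                                        @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue")
                                })
                                public class TestQueueListener implements MessageListener {

                                    @Override
                                    public void onMessage(Message message) {
                                        try {
                                            if (message instanceof TextMessage) {
                                                System.out.println("Received: " + ((TextMessage) message).getText());
                                            }
                                        } catch (JMSException e) {
                                            throw new RuntimeException(e);
                                        }
                                    }
                                }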

                                 

                                I would appreciate any advice on getting this configuration working. I would also be willing to provide a fix if someone could point me to the part of Wildfly that needs to be changed.

                                • 13. Re: Wildfly 10.1 fails to deploy app during Artemis failback
                                  mnovak

                                  Hi Asim,

                                   

                                  you can use the configuration from the correct answer in Re: Wildfly 10 - ActiveMQ Artemis Failover with Standalone client. It contains a valid configuration for Artemis in a replicated collocated topology. The configuration is almost the same for both of the WF10 servers. You'll need to add the socket-bindings "messaging" and "messaging-backup" to each server, for example like:

                                   

                                      <socket-binding name="messaging" interface="public" port="5445"/>
                                      <socket-binding name="messaging-backup" interface="public" port="5446"/>

                                   

                                  And provide the same group-name for a master and its slave, so the topology will look like WF1(master1/slave2) <-> WF2(master2/slave1): master1 and slave1 have group-name="group1" and master2 and slave2 have group-name="group2".
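
                                  For example, the relevant parts would look roughly like this on the two servers (only the replication elements are shown; everything else stays as in your existing configuration):

                                     <!-- WF1: live server in group1, backup for group2 -->
                                     <server name="default">
                                         <replication-master check-for-live-server="true" group-name="group1" cluster-name="my-cluster"/>
                                         ...
                                     </server>
                                     <server name="backup">
                                         <replication-slave group-name="group2" cluster-name="my-cluster"/>
                                         ...
                                     </server>

                                     <!-- WF2: live server in group2, backup for group1 -->
                                     <server name="default">
                                         <replication-master check-for-live-server="true" group-name="group2" cluster-name="my-cluster"/>
                                         ...
                                     </server>
                                     <server name="backup">
                                         <replication-slave group-name="group1" cluster-name="my-cluster"/>
                                         ...
                                     </server>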

                                   

                                  In this topology the MDB will always be connected to the Artemis master. As the <pooled-connection-factory> uses the in-vm connector, this will always be the case with the above configuration. You will still see initial reconnect attempts from the MDB after failback, because it takes some time for the master to sync with the slave before it can activate. It cannot happen that a message ends up in the DLQ.

                                   

                                  Thanks,

                                  Mirek

                                  • 14. Re: Wildfly 10.1 fails to deploy app during Artemis failback
                                    asimmahmood

                                    Hi Mirek,

                                     

                                    I followed your recommendations and found that I still encounter the same issue. The problem is that I don’t see reconnect attempts from the MDB after failback. The MDB does not deploy because the pooled-connection-factory is not available until failback is complete. The test application I provided demonstrates this issue. If you try it you should see what I mean.

                                     

                                    Thanks,

                                    Asim
