3 Replies Latest reply on Oct 23, 2013 4:12 PM by ehle

    First Deploy or Redeploy of bundle frequently fails - repeats are OK ("Failed to schedule... may be down")

    ehle

      Hello,

       

      I was wondering if anyone else had seen this or might have some insight.

       

      Basic scenario:

       

      Several RH JBoss EAP6 servers running - configured as per "best practices"

      2 test RHQ (JBoss-ON 3.1.2) servers (independent, with different client sets)

      All servers are using RPM client:

      Release     : 1.el6_3

      Size        : 8.0 M

       

      1. Upload a bundle.

      2. Set up a destination for the deployment (the deployments/<app_name> directory of one or more EAP6 servers)

      3. Deploy

       

      The first attempt frequently fails. The RHQ/JON GUI gives a message like this when you dig into the red exclamation points:

      Deployment Requested - Success

      Deployment - "Failed to schedule, agent on [Resource[id=10404, uuid=4e27d296-10ed-4920-8033-3f89699d8bd6, type={JBossAS7}JBossAS7 Standalone Server, key=/usr/share/jbossas/standalone, name=EAP (192.168.1.92:9990), parent=myusername-dev03.private.com, version=EAP 6.1.0.GA]] may be down: java.lang.reflect.UndeclaredThrowableException"

      (IP addresses and DNS names obfuscated for privacy)

       

      Server Log:

      2013-09-13 11:39:20,697 ERROR [org.jboss.remoting.transport.socket.SocketClientInvoker] Got marshalling exception, exiting
      javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe
              at sun.security.ssl.SSLSocketImpl.checkEOF(SSLSocketImpl.java:1476)
      <SNIP>
              at java.lang.Thread.run(Thread.java:724)
      Caused by: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe
              at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
              at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1886)
      <SNIP>
              at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
              ... 22 more
      Caused by: java.net.SocketException: Broken pipe
              at java.net.SocketOutputStream.socketWrite0(Native Method)
      <SNIP>
              at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:122)
              ... 105 more
      2013-09-13 11:39:20,706 ERROR [org.rhq.enterprise.communications.command.client.ClientCommandSenderTask] {ClientCommandSenderTask.send-failed}Failed to send command [Command: type=[remotepojo]; cmd-in-response=[false]; config=[{rhq.send-throttle=true}]; params=[{invocation=NameBasedInvocation[schedule], targetInterfaceName=org.rhq.core.clientapi.agent.bundle.BundleAgentService}]]. Cause: java.rmi.MarshalException:Failed to communicate.  Problem during marshalling/unmarshalling; nested exception is:
              javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe
      2013-09-13 11:39:21,278 ERROR [org.jboss.remoting.transport.socket.SocketClientInvoker] Got marshalling exception, exiting
      javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe
              at sun.security.ssl.SSLSocketImpl.checkEOF(SSLSocketImpl.java:1476)
      <SNIP>
              at java.lang.Thread.run(Thread.java:724)
      Caused by: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe
              at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
      <SNIP>
              at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
              ... 22 more
      Caused by: java.net.SocketException: Broken pipe
              at java.net.SocketOutputStream.socketWrite0(Native Method)
      <SNIP>
              at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:122)
              ... 105 more
      2013-09-13 11:39:21,283 ERROR [org.rhq.enterprise.communications.command.client.ClientCommandSenderTask] {ClientCommandSenderTask.send-failed}Failed to send command [Command: type=[remotepojo]; cmd-in-response=[false]; config=[{rhq.send-throttle=true}]; params=[{invocation=NameBasedInvocation[schedule], targetInterfaceName=org.rhq.core.clientapi.agent.bundle.BundleAgentService}]]. Cause: java.rmi.MarshalException:Failed to communicate.  Problem during marshalling/unmarshalling; nested exception is:
              javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe -> javax.net.ssl.SSLException:Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe -> javax.net.ssl.SSLException:java.net.SocketException: Broken pipe -> java.net.SocketException:Broken pipe. Cause: java.rmi.MarshalException: Failed to communicate.  Problem during marshalling/unmarshalling; nested exception is:
              javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe

       

      The agent log (with debug logging enabled) doesn't mention the deploy attempt, only a closed socket:

      2013-09-13 11:38:57,999 DEBUG [InventoryManager.availability-1] (rhq.core.pc.inventory.AvailabilityExecutor)- Built availability report for [0] resources with a size of [261] bytes in [256]ms
      2013-09-13 11:39:05,919 DEBUG [WorkerThread#0[<RHQ.SERVER.IP.#>:50822]] (jboss.remoting.transport.socket.ServerThread)- WorkerThread#0[146.139.115.113:50822] closing socketWrapper: ServerSocketWrapper[5f8aff2d[TLS_DHE_DSS_WITH_AES_256_CBC_SHA: Socket[addr=/146.139.115.113,port=50822,localport=16163]].5f8aff2d]
      2013-09-13 11:39:05,919 DEBUG [WorkerThread#0[<RHQ.SERVER.IP.#>:50822]] (jboss.remoting.transport.socket.ServerSocketWrapper)- wrote CLOSING
      2013-09-13 11:39:05,920 DEBUG [WorkerThread#0[<RHQ.SERVER.IP.#>:50822]] (jboss.remoting.transport.socket.SocketWrapper)- ServerSocketWrapper[5f8aff2d[TLS_DHE_DSS_WITH_AES_256_CBC_SHA: Socket[addr=/146.139.115.113,port=50822,localport=16163]].5f8aff2d] closing
      2013-09-13 11:39:26,537 DEBUG [MeasurementManager.sender-1] (rhq.core.pc.measurement.MeasurementSenderRunner)- Measurement report contains no data - not sending to Server.
      2013-09-13 11:39:26,599 DEBUG [EventManager.poller-2] (core.pluginapi.event.log.LogFileEventPoller)- /usr/share/jbossas/standalone/log/server.log: 36708 new bytes
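
      Lining up the two logs may help narrow this down. The sketch below (Python, standard library only) computes how long after the agent's "closing socketWrapper" line each server-side "Broken pipe" was logged. It assumes the server and agent clocks are synchronized, which the logs alone don't confirm; the function and variable names are made up for illustration.

```python
from datetime import datetime

# Timestamp layout used by both logs above, e.g. "2013-09-13 11:39:20,697".
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def seconds_between(earlier, later):
    """Return the number of seconds from one log timestamp to another."""
    start = datetime.strptime(earlier, LOG_TS_FORMAT)
    end = datetime.strptime(later, LOG_TS_FORMAT)
    return (end - start).total_seconds()


# Timestamps copied from the agent and server log excerpts above.
agent_socket_closed = "2013-09-13 11:39:05,919"
server_first_failure = "2013-09-13 11:39:20,697"
server_second_failure = "2013-09-13 11:39:21,278"

# Assumption: both hosts have synchronized clocks.
gap_first = seconds_between(agent_socket_closed, server_first_failure)
gap_second = seconds_between(agent_socket_closed, server_second_failure)

print(f"first failure:  {gap_first:.3f} s after the agent closed the socket")
print(f"second failure: {gap_second:.3f} s after the agent closed the socket")
```

      If the clocks do agree, the agent closed that socket roughly 15 seconds before the server tried to write to a connection, which would fit the server reusing a connection the agent had already dropped. That is only one possible reading of the timestamps.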

       

      If you try again immediately, everything works.

       

      The problem is most common after the client and server have been sitting idle for a while. Once a deployment succeeds, you can redeploy dozens of times without hitting the issue.

       

      When deploying to a group of EAP6 systems, I first observed that the deployment to the first system JON/RHQ tried would fail, but the second would succeed.

      On a hunch I let the JVM use more memory. The problem went away for a while (after the server restart...), but when it came back both deployments would usually fail.

       

      Whether or not you choose a "Clean" deploy doesn't seem to have an impact.

       

      I have a feeling this might be a timeout of some sort - that the client isn't responding fast enough, or the server side is not waiting long enough for the connection to be established, but I can't prove it.
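
      One rough way to test the idle-timeout idea would be to open a connection to the agent's listener port, leave it idle, and see whether the other end closes it. The sketch below does that with a plain TCP socket (Python, standard library only). The function name is made up, and it comes with caveats: the agent port in the logs above (16163) speaks TLS, so a plain TCP connection that never handshakes may be closed by a different timeout than an established session would be, and a firewall that silently drops idle connections will not show up as a close at all.

```python
import select
import socket
import time


def peer_closed_after_idle(host, port, idle_seconds):
    """Connect, stay idle, then report whether the peer closed the connection.

    Returns True if the peer has closed (or reset) the connection by the end
    of the idle period, False if it still looks open from this side.
    """
    with socket.create_connection((host, port), timeout=10) as sock:
        # Send nothing: we only want to observe what the peer does.
        time.sleep(idle_seconds)

        # A closed connection shows up as "readable" with zero bytes pending.
        readable, _, _ = select.select([sock], [], [], 0)
        if not readable:
            return False
        try:
            # MSG_PEEK looks at pending data without consuming it.
            return sock.recv(1, socket.MSG_PEEK) == b""
        except ConnectionResetError:
            return True
```

      For example, calling peer_closed_after_idle with the agent's address, port 16163, and increasing idle times (10, 30, 60 seconds) from the RHQ server host would show roughly when, if ever, the connection gets closed.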

       

      If you could shed some light on the matter, it would be much appreciated - or even suggestions for tests to run.

       

      Thanks much!

      David.