4 Replies Latest reply on May 1, 2009 12:30 PM by mazz

    agent purges

      I'm using Jopr 1.2 beta 1 on Amazon EC2 instances. We had an issue where an instance had to be shut down and a new one launched. Someone accidentally re-registered the agent under a different user name, and did so before the new Elastic IP address was assigned. I've been trying to run the agent with the --cleanconfig option under the correct user ID and with the correct connection information (what it should have been), but it does not allow me to reconnect; the Jopr server logs say I'm not authenticated. My question: how can I purge agents from the Jopr server that have become corrupted, or whose connection configuration needs to be changed, when --cleanconfig does not help?


        • 1. Re: agent purges
          mazz

          I'll explain some things.

          First, if you run the agent as user A, all the agent's preferences get stored in Java Preferences for user A. If you then run the agent as user B, it will start with a fresh, clean config because user B doesn't have persisted preferences yet. If an agent doesn't have persisted preferences yet (or you've run it with --cleanconfig), agent-configuration.xml will be read in and used as the config. agent-configuration.xml is not used thereafter (unless, of course, you --cleanconfig again). Read the comments at the top of agent-configuration.xml; they explain some of this: http://svn.rhq-project.org/repos/rhq/trunk/modules/enterprise/agent/src/main/resources/agent-configuration.xml
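          To make the lookup order concrete, here is a toy Python model of it. This is only a sketch of the behavior described above, not the agent's real code: the dictionary stands in for each user's Java Preferences store, and the key names and the load_config function are made up for illustration.

```python
# Toy model of the config lookup described above; NOT the agent's real code.
# Hypothetical stand-in for the defaults read from agent-configuration.xml.
DEFAULTS_FROM_XML = {"agent-name": "foo"}


def load_config(prefs_by_user, user, clean_config=False):
    """Return the config the agent would start with when run as `user`."""
    if clean_config:
        # --cleanconfig throws away that user's persisted preferences.
        prefs_by_user.pop(user, None)
    if user not in prefs_by_user:
        # No persisted preferences yet: seed them from the XML defaults.
        prefs_by_user[user] = dict(DEFAULTS_FROM_XML)
    return prefs_by_user[user]


# User A has already registered, so a security token is persisted for A only.
store = {"userA": {"agent-name": "foo", "security-token": "abc123"}}

config_a = load_config(store, "userA")  # persisted prefs, token intact
config_b = load_config(store, "userB")  # fresh config, no token
config_a_clean = load_config(store, "userA", clean_config=True)  # token gone
```

          The point of the sketch is the last line: after --cleanconfig, user A's config no longer carries the security token, which matters for re-registration.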

          As for the agent name: once you register an agent under a name, you cannot change that name. This is to prevent someone from stealing your agent registration. For example, if I register as agent name "foo" with agent IP 1.2.3.4 and port 16163, then someone else should not be able to register an agent with name "bar" with the same IP/port combination. Someone is trying to hijack your agent's traffic if they do that. In fact, once you register, you cannot change the agent's IP and/or port UNLESS your agent has its internal security token in its persisted preferences (once registered, the agent gets a security token assigned, and the agent passes it up to the server as a kind of pseudo-secure UUID, but don't consider it hackproof). So if you try to register the agent with a different IP/port (but with the same name "foo"), the server will allow the change, but only if you still have your config intact (because the persisted preferences are where the security token is stored). If you purge your config or somehow no longer have persisted preferences, you can't register with name "foo" under a different IP/port combination.
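          The rules in the paragraph above can be summarized as a small decision function. This is a Python sketch of my reading of those rules, not the server's actual registration code; AgentRecord and may_register are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class AgentRecord:
    """One already-registered agent, as the server would remember it."""
    name: str
    address: str
    port: int
    token: str


def may_register(registry, name, address, port, token=None):
    """Model of the registration rules described above (illustrative only)."""
    for rec in registry:
        same_endpoint = (rec.address, rec.port) == (address, port)
        if rec.name != name and same_endpoint:
            # A different name on an already-registered IP/port is rejected.
            return False
        if rec.name == name and not same_endpoint:
            # Same name on a new IP/port needs the persisted security token.
            return token == rec.token
    # Same name with the same IP/port, or a completely new agent.
    return True


registry = [AgentRecord("foo", "1.2.3.4", 16163, "tok")]

hijack_attempt = may_register(registry, "bar", "1.2.3.4", 16163)
move_with_token = may_register(registry, "foo", "5.6.7.8", 16163, "tok")
move_without_token = may_register(registry, "foo", "5.6.7.8", 16163)
same_endpoint_no_token = may_register(registry, "foo", "1.2.3.4", 16163)
```

          In this model the hijack attempt and the move without a token are refused, while the move with the token and the re-registration on the original IP/port are accepted.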

          What you need to do is --cleanconfig under the user you want (presumably the old user that you still want to run the agent as) and register the agent with the same IP/port combination as before.

          If you cannot use the same IP/port and you've lost your security token (because you've purged your config, for example), then you have three options: purge that agent record, manually add the security token back to your agent as a hack to get around it, or manually change the IP/port in your database. The RHQ_AGENT database table has the IP, port and security token string. If you have authorization to get to your database, run "select * from rhq_agent" and you'll see them. You can use http://:7080/admin/test/sql.jsp to run this query; you must log in as the rhqadmin user to do so.

          Also, read:

          http://jira.rhq-project.org/browse/RHQ-914

          It may be of help.

          • 2. Re: agent purges

            That was an excellent explanation; I really appreciate the time taken to explain this situation. Based on what you said, I think I fall into the "purge the agent record" case. I have uninventoried the agent I want to remove and ran the queries in RHQ-914 without success: I get an error executing the statement. I'm using Postgres as the database and executing from the /test/sql.jsp page.

            Is there a way to explicitly specify the agent I want removed?

            this is from the RHQ-914 instructions.

            Error: org.postgresql.util.PSQLException: ERROR: syntax error at or near "inner"
            SQL was:delete from RHQ_FAILOVER_DETAILS failoverli0_ inner join RHQ_FAILOVER_LIST failoverli1_ on failoverli0_.FAILOVER_LIST_ID=failoverli1_.ID where failoverli1_.AGENT_ID not in (select resource2_.AGENT_ID from RHQ_RESOURCE resource2_)
            StackTrace: org.postgresql.util.PSQLException: ERROR: syntax error at or near "inner" at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1608) at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1343) at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:194) at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:451) at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:350) at


            ------This is the statement-----
            delete from RHQ_FAILOVER_DETAILS failoverli0_ inner join RHQ_FAILOVER_LIST failoverli1_ on failoverli0_.FAILOVER_LIST_ID=failoverli1_.ID where failoverli1_.AGENT_ID not in (select resource2_.AGENT_ID from RHQ_RESOURCE resource2_)




            • 3. Re: agent purges
              mazz

              I have no idea what that first SQL is all about - it's wrong. I updated that JIRA with the proper SQL. Here it is:

              delete from RHQ_FAILOVER_DETAILS where id in (select failoverli0_.id from RHQ_FAILOVER_DETAILS failoverli0_ inner join RHQ_FAILOVER_LIST failoverli1_ on failoverli0_.FAILOVER_LIST_ID=failoverli1_.ID where failoverli1_.AGENT_ID not in (select resource2_.AGENT_ID from RHQ_RESOURCE resource2_));
              delete from RHQ_FAILOVER_LIST failoverli0_ where failoverli0_.AGENT_ID not in (select resource1_.AGENT_ID from RHQ_RESOURCE resource1_);
              delete from RHQ_AGENT a where a.id not in (select res.AGENT_ID from RHQ_RESOURCE res)


              I just ran it myself on Postgres 8.3 and it worked. Remember, you must uninventory your platform (which removes all resources from that platform on down). Once you do that, then you execute the above SQL. This will purge all remnants of that agent.
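              If you want to see the order of operations on throwaway data first, here is a small Python sketch that runs the same three deletes against an in-memory SQLite database. The tables are a cut-down, assumed schema (only the columns the SQL above touches, plus a made-up NAME column), and the aliases on the delete targets are dropped to keep the statements portable, so treat it as an illustration of the pattern rather than something to run against a real RHQ database.

```python
import sqlite3

# In-memory toy database; the schema is an assumption, not the real RHQ schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table RHQ_AGENT (ID integer primary key, NAME text);
    create table RHQ_RESOURCE (ID integer primary key, AGENT_ID integer);
    create table RHQ_FAILOVER_LIST (ID integer primary key, AGENT_ID integer);
    create table RHQ_FAILOVER_DETAILS (ID integer primary key,
                                       FAILOVER_LIST_ID integer);

    -- Agent 1 still has a resource; agent 2 was uninventoried (no resources).
    insert into RHQ_AGENT values (1, 'kept'), (2, 'orphan');
    insert into RHQ_RESOURCE values (10, 1);
    insert into RHQ_FAILOVER_LIST values (100, 1), (200, 2);
    insert into RHQ_FAILOVER_DETAILS values (1000, 100), (2000, 200);
""")

# Children first, then parents: details, failover lists, and finally agents.
purge_statements = [
    """delete from RHQ_FAILOVER_DETAILS where ID in (
           select d.ID from RHQ_FAILOVER_DETAILS d
           inner join RHQ_FAILOVER_LIST l on d.FAILOVER_LIST_ID = l.ID
           where l.AGENT_ID not in (select r.AGENT_ID from RHQ_RESOURCE r))""",
    """delete from RHQ_FAILOVER_LIST
       where AGENT_ID not in (select r.AGENT_ID from RHQ_RESOURCE r)""",
    """delete from RHQ_AGENT
       where ID not in (select r.AGENT_ID from RHQ_RESOURCE r)""",
]
for stmt in purge_statements:
    conn.execute(stmt)

remaining_agents = [row[0] for row in conn.execute("select NAME from RHQ_AGENT")]
remaining_lists = [row[0] for row in conn.execute("select ID from RHQ_FAILOVER_LIST")]
remaining_details = [row[0] for row in conn.execute("select ID from RHQ_FAILOVER_DETAILS")]
```

              Only the agent that no longer owns any resource is removed, along with its failover list and details, which is why the platform has to be uninventoried before the SQL is run.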

              We really need to implement this feature so people can uninventory a platform and have the agent go away too. In fact, I'm gonna target this for our next release; this has been open for too long.

              • 4. Re: agent purges
                mazz

                FWIW, I checked a fix for http://jira.rhq-project.org/browse/RHQ-914 into trunk. Now when you uninventory a top-level platform, the agent record itself is deleted too (rhq_agent and related tables).