Please sign into the #jopr channel on freenode. Communicating back and forth through the forums is too slow. ; )
FYI: RHQ-2150 has been resolved. If you run with rev4175 or higher, recovery alerts at the template level now work naturally.
It seems like when my servers restart overnight, everything resets and I have to re-run the query every morning. Plus it still won't work because the cache is not restarted.
The ONLY way an alert definition can have its recoveryIds messed up is if the template is edited/saved. Could there be other admins/operators using the system that you're unaware of? Maybe they are resetting things on you?
"Plus it still won't work because the cache is not restarted"
As I mentioned to you in the #jopr room, this was one reason why you should upgrade and use the 2.2 release. It includes a way to reload all of the caches for each of the agents in the system. This operation was written with the specific intention of being able to correct the cache data in situations like this one (even though I didn't know about RHQ-2150 when I wrote/exposed the cache reloading operation).
However, maybe I can convince our buildmeister that we should put out a new community version (because this issue is now fixed in trunk). There are a few items that are in progress right now, so it may have to wait a few weeks until things settle down and get stable.
I posted my "rhq_alert_definition" content on the jopr pastebin, as you requested on JIRA.
When I configured the recovery alerts for the first time, I was confused about the "to-be-recovered/recovery" concepts and swapped their definitions. Then, when I tried to correct the alert definitions (reverting these templates), I guess JON duplicated some definitions in the DB. I'm not sure!