Younes, when you say it increases, how much does it increase by? The BINDING_TEMPLATE represents endpoints for your services, so as the server starts up, it should increase as services and endpoints are registered.
We are using 5.1 GA and ESB 4.6.
We had the same problem until we started shutting down the server properly. By properly I mean using either shutdown.bat/sh or running the server as a Windows service (stop does a shutdown). I think the ESB removes the endpoints it added during startup by itself if you do a proper shutdown.
To see the effect of a shutdown you need to have a reasonably clean juddi registry. If the same service is registered 10 times in juddi (from previous failures or bad stops) and you shut down the server, it will remove only 1 endpoint from juddi. So if the server crashes for some reason, you need to manually remove from the juddi registry the entries that were added during startup or, in your case, the entries left over from the previous attempts.
We check that the registry is clean and perform the cleaning tasks using ub (not directly in the database).
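The bookkeeping described above can be sketched with a toy model. This is only an illustration in plain Java, not the jUDDI or ESB API: `RegistryModel`, `startup`, `cleanShutdown` and the `service@node` key format are made-up names, and the "one removal per clean shutdown" behaviour is simply taken from the description in this post.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory stand-in for a persistent registry (hypothetical, not jUDDI).
class RegistryModel {
    // One entry per registered binding, keyed as "service@node".
    private final List<String> bindings = new ArrayList<>();

    // Every startup adds a binding, whether or not an old one is still there.
    void startup(String service, String node) {
        bindings.add(service + "@" + node);
    }

    // A clean shutdown removes exactly one matching binding.
    void cleanShutdown(String service, String node) {
        bindings.remove(service + "@" + node);
    }

    // Number of bindings currently registered for this service on this node.
    int count(String service, String node) {
        int n = 0;
        for (String b : bindings) {
            if (b.equals(service + "@" + node)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        RegistryModel registry = new RegistryModel();
        // Three crashes: each startup registers, nothing ever unregisters.
        for (int i = 0; i < 3; i++) {
            registry.startup("OrderService", "node1");
        }
        // One proper start/stop cycle removes only its own binding.
        registry.startup("OrderService", "node1");
        registry.cleanShutdown("OrderService", "node1");
        System.out.println(registry.count("OrderService", "node1")); // prints 3
    }
}
```

In this model the three stale bindings stay in place until something removes them explicitly, which is why the registry has to be cleaned once before proper shutdowns show any effect.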
Having a clean juddi registry is also essential for clustering to work properly.
BTW: The reason why you did not have this problem with hsqldb is that hsqldb is a memory db and the memory is clean at every startup.
We are using oracle and we had the same problem.
This is correct: the services remove their bindings when they stop. If you use a persistent database, any bindings that were not removed will survive a crash.
There are options to remove invalid stale entries when discovered, but the biggest impact is likely to be performance if you are not using a version of ESB which addresses the juddi/scout performance issues.
A new registry interceptor has also been added to the codebase recently. It removes the impact of juddi on any service which is deployed in the same instance, but has the downside of hiding remote EPRs for those services.
Can someone describe the correct procedure to clean the JUDDI registry, or point to documentation? We have a similar problem to the one described here (slow server startup), and a further impact is that if a lot of bindings remain in the registry, service lookups also become quite slow and sometimes time out (when using the Service Invoker, where the timeout is set to 300 seconds). We have found that recreating the JUDDI DB helps (drop the schema and create an empty one from scratch using the JUDDI DB scripts). But we are not sure whether it can lead to other problems, e.g. with surviving server failures, as mentioned here.
Nikos mentioned they are using "ub", but I am sorry I do not know what it is.
Have a look at this link about ub. It is also mentioned in the ESB manuals.
The uddi browser (ub) application has an action called something like "remove all services" that has the same effect as truncating tables.
About best practices:(?) - the following depends a bit on the deployment I guess.
Each service should have one endpoint per esb node in the cluster. When a node shuts down, its endpoint entry gets removed. If a node crashes, the service ends up registered twice after the next startup. You get a debug warning when the esb is starting up if this is the case. I think ideally, when a server crashes, an administrator should use a tool like ub to remove the entries of the node that is down, and restart the node only after this is done. In your case, if you have 10s or 100s of endpoints for the same service, the best thing would probably be to clean the registry and start again, but make sure you shut down the servers properly this time and follow the administration routine.
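As part of such an administration routine, a check for "more than one endpoint per service per node" could look roughly like the sketch below. It is plain Java over a list of strings and makes several assumptions: the `service@node` key format and the `StaleBindingReport` class are made up for illustration, and how the list of bindings is obtained from the registry (ub, a UDDI inquiry, or something else) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: flags service/node pairs that are registered more than once.
class StaleBindingReport {

    // Counts bindings per "service@node" key, keeping first-seen order.
    static Map<String, Integer> countPerKey(List<String> bindings) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String key : bindings) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    // Returns the keys that appear more than once, i.e. likely leftovers from a crash.
    static List<String> duplicates(List<String> bindings) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : countPerKey(bindings).entrySet()) {
            if (entry.getValue() > 1) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data: node1 crashed once, node2 was always stopped cleanly.
        List<String> bindings = Arrays.asList(
                "OrderService@node1",
                "OrderService@node1",
                "OrderService@node2");
        System.out.println(duplicates(bindings)); // prints [OrderService@node1]
    }
}
```

An empty result would correspond to the "reasonably clean" state: at most one endpoint per service per node.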
About the default balancer
Another thing we have noticed is that if a service fails, the default RoundRobin balancer of the service invoker will keep routing requests to the node that is down until its entries are removed from the registry. This is not very good because it degrades cluster performance. There is an option called org.jboss.soa.esb.failure.detect.removeDeadEpr (see http://community.jboss.org/thread/157342), but as we found out it is a bit dangerous, because services will get unregistered if a timeout occurs for whatever reason (the manual warns you about this danger).
We have written our own balancer for the service invoker that prefers endpoints local to the server.
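For readers wondering what "prefers endpoints local to the server" might mean in practice, here is a minimal standalone sketch of the idea. It is not the balancer described in this post and it does not use the ESB classes: `Endpoint`, `LocalPreferenceSketch` and `choose` are hypothetical names, and the sketch assumes the host of each endpoint can be read directly, which may not hold for real EPRs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an endpoint reference; the real ESB EPR type is not used.
class Endpoint {
    final String host;
    final String destination;

    Endpoint(String host, String destination) {
        this.host = host;
        this.destination = destination;
    }
}

// Picks endpoints on the local host when there are any, otherwise round-robins over all.
class LocalPreferenceSketch {
    private final String localHost;
    private final AtomicInteger next = new AtomicInteger();

    LocalPreferenceSketch(String localHost) {
        this.localHost = localHost;
    }

    Endpoint choose(List<Endpoint> candidates) {
        if (candidates.isEmpty()) {
            return null;
        }
        // Collect the endpoints that live on this server.
        List<Endpoint> local = new ArrayList<>();
        for (Endpoint e : candidates) {
            if (localHost.equalsIgnoreCase(e.host)) {
                local.add(e);
            }
        }
        // Fall back to the full list when nothing is local.
        List<Endpoint> pool = local.isEmpty() ? candidates : local;
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), pool.size());
        return pool.get(index);
    }

    public static void main(String[] args) {
        List<Endpoint> eprs = Arrays.asList(
                new Endpoint("node2", "queue/orders"),
                new Endpoint("node1", "queue/orders"));
        LocalPreferenceSketch balancer = new LocalPreferenceSketch("node1");
        System.out.println(balancer.choose(eprs).host); // prints node1
    }
}
```

A policy like this avoids sending local traffic to a remote node that may be down, but it does nothing about stale local entries, so it complements a clean registry rather than replacing it.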
A better solution is to use a registry interceptor as the load balance policy does not have sufficient information given that the EPRs are opaque.
See the commits for https://jira.jboss.org/browse/JBESB-3449
Thank you for the explanation! We will try this uddi browser.
Maybe there is a small misunderstanding: I didn't mention that we do not use the ESB in a cluster (we use a heartbeat failover cluster), and that we use the community versions that make up the SOA platform:
- JBoss AS 4.2.3. GA
- JBoss ESB 4.5. GA
- JUDDI 4.2.3. GA
- JBPM 3.2.2
Anyway, you are right that we have to define an administration routine which describes when and how to use ub to clean the registry.
Just a question about something we were considering: cleaning the JUDDI DB at JBoss startup (because it can be automated). Is this approach correct, given that the ub function "remove all services" does the same? As I understand it, all ESB services are registered during startup, so the question is: if the server crashes and there is an unprocessed message, does it get lost, or will it be processed by the newly registered service (same name, but probably a different binding)?
First question was
"(we use heartbeat failover cluster) ... Just a question to what we were thinking about - to clean JUDDI DB at JBoss start up (because it can be automated). Is this approach correct, since the ub function "remove all services" do the same?"
I am not sure what you mean by "we use heartbeat failover cluster". If you have only one ESB instance/node up at any given time, then I guess you are fine if you just automate cleaning the DB at startup.
Second question was
"As I understand it, all ESB services are registered during startup, so the question is if the server crashes and there is some unprocessed message, if it get lost or will be processed with the newly registered service (same name, but probably different binding)."
I am not sure what the correct answer is, but I think that if you want that kind of reliability you have to follow the instructions about clustering the esb, messaging, transports, HAJNDI, HTTP etc.
We are doing a lot of tests at the moment to check whether these kinds of scenarios work well the way we have configured things. I will know in a few months, but perhaps Kevin knows.
Wow, I see https://jira.jboss.org/browse/JBESB-3449 is very new. I guess it hasn't made it into 4.9 yet. We would like to upgrade to 4.9 but we can't do it just now. I can't read the SVN commits for JBESB-3449 for some reason.
We only use JMS endpoints. I attach what we did. It works in our case but it can probably be done better.
LocalPreferenceBalancer.java.zip 839 bytes