Using regex on Fuse ESB
rogelio_sevilla1 Jul 19, 2011 1:56 PM
Hello everyone:
I don't know if this is the right place to ask this question, but here it goes.
I'm building a Camel route that receives some HTML and then applies a regular expression match to extract certain data.
I know the regex works correctly when deployed as a simple Java app. But I know that Fuse ESB is a multi-threaded environment, so there are more difficulties when using non-thread-safe code.
To be honest, I'm not an advanced multi-threading coder, so I was wondering if anyone here has some advice on using regex on Fuse ESB. This is the method I'm using:
private static final Pattern regex = Pattern.compile("link href='(.*)'");

public String getUrls(@Body String htmlcode) {
    Matcher matcher = regex.matcher(htmlcode);
    StringBuffer urls = new StringBuffer("");
    synchronized(this){
        matcher.matches();
        matcher.find();
        while (matcher.find()) {
            urls.append(matcher.group(1));
            urls.append("\n");
        }
        matcher.reset();
    }
    return urls.toString();
}
What I want to get here is a list of URLs. Curiously, when I extract the URL list, around 8 links out of 150 are totally wrong. Something like:
htp:/wrong_url.com?a=asdsa
h:/wrong_url.com?asdsa
I know my regex is fine because I'm executing this code every 5 minutes using a Quartz component, and the next time the match is executed (on the exact same HTML), the previously wrong URLs are extracted correctly, while some others are now extracted incorrectly :-S. I thought the synchronized block would fix this, but it did not.
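For comparison, here is a minimal sketch of the same extraction written without the synchronized block. It rests on two facts about java.util.regex: a compiled Pattern is immutable and safe to share between threads, while a Matcher is not thread-safe but is created fresh inside each call here, so nothing in the method is shared. The class name UrlExtractor is hypothetical, the Camel @Body annotation is left out so the snippet compiles on its own, and the capture group is changed from (.*) to ([^']*) so that it stops at the first closing quote. These are assumptions for illustration, not a confirmed fix for the broken URLs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this sketch.
final class UrlExtractor {

    // A compiled Pattern is immutable, so one shared instance is safe across threads.
    // Assumption: [^']* replaces the original greedy (.*) so the group ends at the first quote.
    private static final Pattern LINK_HREF = Pattern.compile("link href='([^']*)'");

    // In a Camel bean the parameter would carry @Body; omitted to keep the sketch self-contained.
    static String getUrls(String htmlcode) {
        // The Matcher is local to this call, so no other thread can touch it.
        Matcher matcher = LINK_HREF.matcher(htmlcode);
        StringBuilder urls = new StringBuilder();
        // Each find() advances to the next match; no call to find() precedes the loop.
        while (matcher.find()) {
            urls.append(matcher.group(1)).append("\n");
        }
        return urls.toString();
    }
}
```

In the method shown earlier, the extra matcher.find() call before the while loop already advances past the first match, so the first URL never reaches the output; the sketch drops that call along with matches() and reset().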
I've been dealing with this problem for a couple of days without success; does anyone have any experience with regex usage on Fuse ESB that could shed some light on this?
Thanks a lot in advance.
Edited by: rogelio_sevilla1 on Jul 19, 2011 5:56 PM