Using multi-threading in a splitter
xdevroey Apr 19, 2011 7:23 AM
Hello,
I am currently trying to write a route that splits a big XML file into smaller files according to a groupId in the XML.
When I don't use the .parallelProcessing() option of the splitter, everything is fine. But when I try to use multi-threading, the route blocks and I get Java heap space exceptions.
Here is the route (the attached snippet contains the same code, formatted):
-
ThreadPoolExecutor splittingPool = new ThreadPoolExecutor(
        1, 5, 30L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(10));
splittingPool.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());

from("file:D:/test/in?noop=true")
    .log("Begin splitting")
    .split(body().tokenize("</elem>"))
        .parallelProcessing()
        .executorService(splittingPool)
        .streaming()
        .convertBodyTo(String.class)
        .process(trxSplittingProc)
        .choice()
            .when(header("groupId").isNotEqualTo("EOF"))
                .to("file:D:/test/out?fileName=${header.groupId}-D-${header.startProcessingTime}.txt&fileExist=Append")
            .otherwise()
                .log("End of file reached")
        .end()
    .end()
    .bean(trxSplittingProc, "reinitilize");
-
The header.groupId and header.startProcessingTime headers are set by the trxSplittingProc Processor. Apart from that, the processor only adds back to the body the closing </elem> tag that the splitting removed.
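To illustrate what happens to each token, here is a minimal standalone sketch. It does not use Camel and is not the real trxSplittingProc: the class ElemTokenSketch and its methods are hypothetical names, and the whole input is held in memory, which the real route must of course avoid. It only shows that splitting on the closing tag consumes it, so it has to be appended again.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the actual trxSplittingProc.
public class ElemTokenSketch {
    static final String CLOSING_TAG = "</elem>";

    // Splits on the closing tag; the delimiter itself is dropped from every token.
    static List<String> tokenize(String xml) {
        List<String> tokens = new ArrayList<String>();
        for (String part : xml.split(Pattern.quote(CLOSING_TAG))) {
            if (part.trim().length() > 0) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Appends the closing tag that the split removed.
    static String restore(String token) {
        return token + CLOSING_TAG;
    }

    public static void main(String[] args) {
        String xml = "<elem><groupId>A</groupId></elem><elem><groupId>B</groupId></elem>";
        for (String token : tokenize(xml)) {
            System.out.println(restore(token));
        }
    }
}
```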
The file I have to process is pretty big (3.5 GB) and contains 1,000,000 <elem> elements (one element is approximately 3,500 characters long).
I have tried configuring the splittingPool with only one thread and one element in the LinkedBlockingQueue, but the problem remains. That is weird, in my opinion, since splitting without multi-threading works just fine.
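For reference, this is how a pool configured like splittingPool is expected to behave on its own, outside of Camel. The sketch below is only an illustration under that assumption: CallerRunsSketch and its run method are hypothetical names, and the tasks block on a latch purely to keep the pool saturated. With 1 core thread, a queue of 10 and a maximum of 5 threads, at most 15 tasks are held by the pool; any further task is executed by the submitting thread because of CallerRunsPolicy.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of the pool settings used in the route.
public class CallerRunsSketch {

    // Returns { tasks run on the submitting thread, tasks completed in total }.
    static int[] run(int taskCount) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 5, 30L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(10));
        pool.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());

        final Thread caller = Thread.currentThread();
        final CountDownLatch release = new CountDownLatch(1);
        final AtomicInteger onCaller = new AtomicInteger();
        final AtomicInteger completed = new AtomicInteger();

        for (int i = 0; i < taskCount; i++) {
            pool.execute(new Runnable() {
                public void run() {
                    if (Thread.currentThread() == caller) {
                        // Rejected by the pool, so the submitting thread runs it.
                        onCaller.incrementAndGet();
                    } else {
                        try {
                            // Pool threads wait here so the pool stays full.
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                    completed.incrementAndGet();
                }
            });
        }

        release.countDown(); // let the pool threads finish
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { onCaller.get(), completed.get() };
    }

    public static void main(String[] args) {
        int[] result = run(20);
        System.out.println("run by caller: " + result[0] + ", completed: " + result[1]);
    }
}
```

Submitting 20 tasks this way leaves 5 of them to the submitting thread, so the pool by itself should throttle the producer rather than accumulate work without bound.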
Does anyone have an idea how to solve the problem? Many thanks.
Xavier.
Edit:
I use Java 1.6 and Camel 2.2.0-fuse-02-00
Edited by: xdevroey on Apr 19, 2011 11:23 AM
-
snippet.txt 846 bytes