Thanks Tom! It's very kind of you! I will look at it.
I suspect the slow part is the regex parsing, so running those steps in parallel should indicate whether splitting up the work will help. It should be a quick test that gives you a rough idea of whether re-architecting the parsing is worthwhile. Something along the following lines:
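import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class NTA {
    // Compiled once; Pattern is thread-safe, so parallel workers can share it.
    private static final Pattern ingWords = Pattern.compile(".*ing");

    public static void main(String[] args) throws Exception {
        String words = "/usr/share/dict/words";
        try (Stream<String> stream = Files.lines(Paths.get(words))) {
            stream.parallel()
                  .filter(NTA::wordMatch)
                  .collect(Collectors.toList())
                  .forEach(k -> System.out.printf("%s%n", k));
        }
    }

    // Each call creates its own Matcher, because Matcher is not thread-safe.
    private static boolean wordMatch(String s) {
        return ingWords.matcher(s).matches();
    }
}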
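To judge whether the parallel version actually helps, one option is to time the same filter sequentially and in parallel. This is only a rough sketch: the class name `ParallelRegexTiming`, the helper `sampleWords`, and the generated in-memory input are made up for illustration so it does not depend on a dictionary file, and a single wall-clock measurement like this is noisy (JIT warm-up, core count), so treat the numbers as an indication rather than a benchmark.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelRegexTiming {
    private static final Pattern ING_WORDS = Pattern.compile(".*ing");

    // Hypothetical input: every third generated word ends in "ing".
    static List<String> sampleWords(int n) {
        return IntStream.range(0, n)
                .mapToObj(i -> (i % 3 == 0) ? "word" + i + "ing" : "word" + i)
                .collect(Collectors.toList());
    }

    // Runs the same regex filter either sequentially or in parallel.
    static long countMatches(List<String> words, boolean parallel) {
        return (parallel ? words.parallelStream() : words.stream())
                .filter(w -> ING_WORDS.matcher(w).matches())
                .count();
    }

    public static void main(String[] args) {
        List<String> words = sampleWords(1_000_000);
        for (boolean parallel : new boolean[] {false, true}) {
            long start = System.nanoTime();
            long matches = countMatches(words, parallel);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.printf("parallel=%b matches=%d time=%d ms%n",
                    parallel, matches, millis);
        }
    }
}
```

If the parallel run is not clearly faster on your real log lines, the regex is probably not the bottleneck and re-architecting the parsing may not pay off.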
Thanks Mike, that is a nice example! Yuan, can you try this in the filter handlers?
Yes, I will.
Currently we are working on [NTA-89] Add a keyword filter - JBoss Issue Tracker, and we want to ignore the lines that contain the keywords listed in filter.properties. That could improve performance, because we would not need to parse those lines at all.
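A minimal sketch of that idea, under some loud assumptions: the thread does not show the layout of filter.properties, so the `keywords` key holding a comma-separated list is hypothetical, as are the class and method names. It is not the NTA-89 implementation, just an illustration of dropping lines with a cheap substring check before any regex parsing happens.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;

class KeywordFilterSketch {
    // Hypothetical property key; the real filter.properties layout may differ.
    static final String KEY = "keywords";

    // Reads a comma-separated keyword list from properties-formatted text.
    static List<String> loadKeywords(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Arrays.stream(props.getProperty(KEY, "").split(","))
                .map(String::trim)
                .filter(k -> !k.isEmpty())
                .collect(Collectors.toList());
    }

    // True when the line contains any keyword and should not be parsed.
    static boolean shouldSkip(String line, List<String> keywords) {
        return keywords.stream().anyMatch(line::contains);
    }

    // Keeps only the lines that still need the (expensive) regex parsing.
    static List<String> linesToParse(List<String> lines, List<String> keywords) {
        return lines.stream()
                .filter(line -> !shouldSkip(line, keywords))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> keywords = loadKeywords("keywords=DEBUG, heartbeat");
        List<String> lines = Arrays.asList(
                "INFO transaction started",
                "DEBUG internal state dump",
                "INFO heartbeat ok");
        linesToParse(lines, keywords).forEach(System.out::println);
    }
}
```

The filter could be applied as an extra `.filter(...)` step ahead of the regex match in the stream pipeline, so skipped lines never reach the matcher.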