19 Replies. Latest reply on Jun 3, 2016 4:44 AM by zhfeng

The work to improve the NTA parsing

ichigo517 Apr 20, 2016 5:16 AM

Hi all, here's my work so far to improve NTA parsing.

1. Find the bottleneck

Following NTA-53 ("Do a quick profile of the system to find the bottleneck of the log parsing", JBoss Issue Tracker), I used JProfiler to look at CPU usage, and the result shows that about half of the time is spent in the regex Matcher.find() call.

2. Possible solutions

1) I added a keyword filter, and Amos suggested putting the keywords in a filter.properties file so that new keywords are easier to add in the future.

2) Use indexOf so that some log lines can skip regex matching altogether.

3) Change the order of the handlers so that the most common handler is tried first.
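
A minimal sketch of how ideas 2) and 3) could fit together, assuming each handler can be described by one literal keyword plus its regex. The class name, the handler list and the sample `tx-begin` / `tx-commit` messages are made up for illustration and are not taken from the NTA code base.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: none of these names come from the NTA code base.
final class KeywordPrefilter {

    // A handler pairs a cheap literal keyword with the more expensive regex.
    static final class Handler {
        final String keyword;
        final Pattern pattern;

        Handler(String keyword, String regex) {
            this.keyword = keyword;
            this.pattern = Pattern.compile(regex); // compiled once, reused for every line
        }
    }

    // Handlers are listed with the (assumed) most frequent message first.
    static final List<Handler> HANDLERS = Arrays.asList(
            new Handler("tx-begin", "tx-begin id=(\\S+)"),
            new Handler("tx-commit", "tx-commit id=(\\S+)"));

    // Returns the first captured group of the first matching handler, or null.
    static String parse(String line) {
        for (Handler h : HANDLERS) {
            if (line.indexOf(h.keyword) < 0) {
                continue; // literal check failed, so the regex is never run
            }
            Matcher m = h.pattern.matcher(line);
            if (m.find()) {
                return m.group(1);
            }
        }
        return null;
    }
}
```

Lines that contain none of the keywords never reach Matcher.find(), which is where the profile showed the time going. Whether the saving is noticeable depends on how many lines in a real log are irrelevant.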

3. Future work

1) Compare NTA with ShiViz and Zipkin and find the features that NTA is missing.

Thanks,

Yuan Hu

1. Re: The work to improve the NTA parsing

tomjenkinson Apr 21, 2016 9:35 AM (in response to ichigo517)

Great - you should consider using a benchmarking framework to check if your performance improvements make an impact. We use something called JMH already. You could take a look at: https://github.com/jbosstm/performance/blob/master/narayana/ArjunaJTA/jta/tests/classes/com/arjuna/ats/jta/xa/performance/JTAStoreTests.java

If you aren't doing so already, it would definitely be worth pre-processing the log so that only the lines NTA actually understands are passed to the parser.

It might be worth changing the logging in the lines that are read so that the output is more machine readable and so each line is not as complex to parse.

E.g. change:
@Message(id = 16036, value = "commit on {0} ({1}) failed with exception ${2}")
To:
@Message(id = 16036, value = "{0} {1} - {2}")

You could try that with the most common log messages that NTA uses to see if it provides a benefit.
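
To illustrate why the simpler format could be cheaper, here is a rough sketch that splits a message of the shape "{0} {1} - {2}" using only indexOf, with no regex at all. It assumes the first two fields contain no spaces, which may not hold for real resource names; the class name and the sample text are hypothetical.

```java
// Hypothetical sketch: parses "<first> <second> - <rest>" without any regex.
final class SimpleMessageParser {

    // Returns the three fields, or null if the text does not have that shape.
    // Assumption: the first two fields contain no spaces.
    static String[] parse(String text) {
        int firstSpace = text.indexOf(' ');
        if (firstSpace < 0) {
            return null;
        }
        int dash = text.indexOf(" - ", firstSpace + 1);
        if (dash < 0) {
            return null;
        }
        return new String[] {
            text.substring(0, firstSpace),        // {0}
            text.substring(firstSpace + 1, dash), // {1}
            text.substring(dash + 3)              // {2}, may itself contain spaces
        };
    }
}
```

For example, parse("xid1 res1 - XAER_RMERR") yields the fields "xid1", "res1" and "XAER_RMERR". A benchmark would still be needed to confirm that this beats the regex on real logs.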

Also, we talked at some point about changing the logging so that it writes directly to the NTA database rather than to a file; that may turn out to be faster. You could send the log message over JMS to an MDB that does the write, so it doesn't hold up the main writer (or maybe JBoss Logging already runs in a background thread).

You could look at what the IronJacamar tracer does and ask questions on their forum to see whether it has some benefit for NTA.
2. Re: The work to improve the NTA parsing

tomjenkinson Apr 21, 2016 9:37 AM (in response to ichigo517)

Maybe you could try running ShiViz with one of the logs that you generate using the NTA log generator and see if it can visualize it for you? I imagine Zipkin needs specific coding for it?

There are other log visualization tools too that might provide some ideas, e.g. I just found this one: "Trace Log to Sequence Diagram Generation" by EventHelix.
3. Re: The work to improve the NTA parsing

ichigo517 Apr 21, 2016 10:11 PM (in response to tomjenkinson)

Hi Tom,

Thanks for your advice! I think sending the log messages to an MDB while also writing the log file may be a good choice. I'll think it over.

Yours,
Yuan Hu
4. Re: The work to improve the NTA parsing

zhfeng Apr 21, 2016 10:19 PM (in response to tomjenkinson)

The most important thing in ShiViz is the vector clocks, which are used to represent the happened-before relation between events. I don't think we have them in our logs.
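
To make the happened-before point concrete, here is a minimal sketch of the standard vector clock comparison. It is not ShiViz code; the class name and the use of a map from node name to counter are assumptions for illustration.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the textbook vector clock ordering, not ShiViz code.
final class VectorClock {

    // a happened before b if every counter in a is <= the one in b
    // and at least one counter is strictly smaller. Missing nodes count as 0.
    static boolean happenedBefore(Map<String, Integer> a, Map<String, Integer> b) {
        Set<String> nodes = new HashSet<>(a.keySet());
        nodes.addAll(b.keySet());
        boolean strictlyLess = false;
        for (String node : nodes) {
            int x = a.getOrDefault(node, 0);
            int y = b.getOrDefault(node, 0);
            if (x > y) {
                return false;
            }
            if (x < y) {
                strictlyLess = true;
            }
        }
        return strictlyLess;
    }
}
```

If happenedBefore returns false in both directions, the two events are concurrent. That case is the one a single wall-clock timestamp per line cannot express.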
5. Re: The work to improve the NTA parsing

mmusgrov Apr 22, 2016 4:53 AM (in response to zhfeng)

But we do have timestamps.
6. Re: The work to improve the NTA parsing

mmusgrov Apr 22, 2016 4:55 AM (in response to ichigo517)

1. Find the bottleneck

A quick check would be to use Java 8 parallel streams to do the regex parsing.
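
A rough sketch of what that quick check could look like, assuming the log lines are already in memory and that the order of the results should match the order of the lines. The class name and the `tx-begin id=` pattern are made up for illustration.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch: regex matching spread over the common fork-join pool.
final class ParallelRegexParse {

    // Pattern is immutable and safe to share between threads;
    // Matcher is not, so every line gets its own Matcher instance.
    static final Pattern TX_ID = Pattern.compile("tx-begin id=(\\S+)");

    static List<String> extractIds(List<String> lines) {
        return lines.parallelStream()
                .map(line -> TX_ID.matcher(line))
                .filter(m -> m.find())
                .map(m -> m.group(1))
                .collect(Collectors.toList()); // keeps the encounter order of the list
    }
}
```

Swapping parallelStream() for stream() gives the sequential baseline to compare against. Any benefit depends on the number of cores and on how much of the total time is regex work rather than I/O or database writes.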
7. Re: The work to improve the NTA parsing

tomjenkinson Apr 22, 2016 5:19 AM (in response to zhfeng)

It looks like you can instrument the System.out code (I guess JBoss Logging/log4j) with ShiVector? See the ShiVector wiki home page on Bitbucket (bestchai / ShiVector / wiki / Home).
8. Re: The work to improve the NTA parsing

tomjenkinson Apr 22, 2016 5:21 AM (in response to tomjenkinson)

This is the class that deals with log4j log and warn: BasicAspect.java in the ShiVector repository on Bitbucket (bestchai / ShiVector / source / java / shivectorasp / src / shivector / aspects / BasicAspect.java). We would probably need to do something special for trace.
9. Re: The work to improve the NTA parsing

tomjenkinson Apr 22, 2016 5:30 AM (in response to ichigo517)

I have been thinking a bit more about this. The most common use case would be someone providing a log file to get assistance, so maybe the MDB was a bit too much of an "engineering" solution rather than a user-focussed one. I think we should maybe concentrate on improving the parsing of the log file once it is produced. I still think there would be value in seeing whether reducing the complexity of the log messages would help improve performance here.

I assume that the log parsing routines already discard any line whose log message ID won't be used? E.g. if NTA only parsed IDs ID1 and ID3, something should quickly strip out the lines starting with ID2 before more complex processing:
ID2
ID1
ID2
ID3
ID2
ID2
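
The stripping step described above could be sketched like this, assuming every line starts with its message ID followed by a space (or consists of the ID alone). The class and method names are hypothetical.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: drop lines whose leading message ID is not of interest.
final class MessageIdFilter {

    // Assumption: the message ID is the first space-delimited token of the line.
    static List<String> keepWanted(List<String> lines, Set<String> wantedIds) {
        return lines.stream()
                .filter(line -> {
                    int end = line.indexOf(' ');
                    String id = end < 0 ? line : line.substring(0, end);
                    return wantedIds.contains(id); // hash lookup, no regex involved
                })
                .collect(Collectors.toList());
    }
}
```

With the six lines above and a wanted set of ID1 and ID3, only the ID1 and ID3 lines would go on to the regex handlers.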

After that, we need to make the regexp find the parts of the message it needs as quickly as possible. For most use cases I guess that is the UID of the transaction or the UID of the resource, so I would propose making the log messages start with those components so that the regexp can complete as quickly as possible.

If possible though I would definitely be interested to see what we can do to work with existing log visualization tools (including customizing them).
10. Re: The work to improve the NTA parsing

zhfeng Apr 25, 2016 1:28 AM (in response to mmusgrov)

Michael Musgrove wrote:

1. Find the bottleneck

A quick check would be to use Java 8 parallel streams to do the regex parsing.

Yeah, it makes sense.
11. Re: The work to improve the NTA parsing

tomjenkinson Apr 27, 2016 4:16 AM (in response to ichigo517)

Hi Yuan,

I wondered how your investigations were going? In particular, has the parallel streams idea demonstrated any performance benefits?

Thanks,
Tom
12. Re: The work to improve the NTA parsing

ichigo517 Apr 27, 2016 4:28 AM (in response to tomjenkinson)

Sorry, I haven't implemented the parallel streams yet. I'll let you know when I do.
13. Re: The work to improve the NTA parsing

zhfeng Apr 27, 2016 4:37 AM (in response to tomjenkinson)

I think the parallel stream regex parsing looks a little bit hard for her. But yeah, it is a good idea and I will investigate it.
14. Re: The work to improve the NTA parsing

tomjenkinson Apr 27, 2016 5:06 AM (in response to ichigo517)

Thanks. I wondered if it might be personally interesting to you to look at the idea and understand how it would be beneficial, even if you do not implement it. As Amos says, it is likely fairly complex, but it might help inform any write-up you produce for your studies.

Thanks again,
Tom