Some possible measures to improve quality for JGroups (roughly in order of priority). Some of these are just things which have been raised in the past; some are new.
1. migrate the testsuite from JUnit to TestNG - how should we apply the new features of TestNG (dependencies, groups, parallel execution, reporting) to the organization of the testsuite? (for JGroups 2.7)
(in progress - Bela)
2. concurrency testing using MTC - how to apply MTC and UncontentedLock to existing and/or new test cases (how can we better test the concurrent parts of JGroups?). Which tests are these tools best suited to? Only tests involving blocking, and checks for concurrent access to non-thread-safe objects? (JGRP-693)
In general, use byte code instrumentation to inject various faults and other things affecting timing (e.g. thread interleavings) into a running program.
3. getting an analysis tool like JCarder (which instruments byte code and analyses lock acquisition at run time) to work, to detect potential races and deadlocks on a regular basis (how can we better test the concurrent parts of JGroups?). The JCarder group suggested that they could use manpower.
4. hammering of protocols in isolation using Simulator (with failures modelled; see the wiki on Simulator) against state machine descriptions of protocol correctness. Do we do this enough, considering the key role protocols play in JGroups?
This is somewhat related to the previous point: if we suspect a potential bug (e.g. one flagged by JCarder or MTC), then we can try to model it and run it through the Simulator.
5. documentation (poor documentation can cause users to get turned off) (http://jira.jboss.com/jira/browse/JGRP-695)
6. code coverage tool to see if there are parts of JGroups our tests are not exercising (the issue of units/components not being tested)
7. incorporate FindBugs runs into the testsuite, use filters to report only the bug classes we are interested in, and checkpoint new bugs introduced between releases. Are there bug patterns particular to JGroups/Cache/Clustering which we could check for (i.e. common mistakes)?
8. systematic testing across more platforms using Hudson - fixes are in progress for outstanding issues on reporting.
9. Wireshark dissectors for JGroups protocols. Our aim should be to have dissectors which handle the key protocols and to get them incorporated into the Wireshark codebase.
10. design level verification of complex and important protocols (which testing is not well suited to), e.g. using Promela and Spin
11. the "shipping features which aren't tested" issue, raised on the forum concerning DistributedLockManager. We need to identify which parts of JGroups are being tested and which are not (cf. my javadoc-based tool). Should we separate the distro into core features (thoroughly tested protocols, data structures) and contrib features (protocols, data structures, etc. which are not thoroughly tested)? This could allow keeping old stuff like UNIFORM around.
List of supported APIs: http://jira.jboss.com/jira/browse/JGRP-253
12. add JIRA components to bug reporting and use the JIRA pie charting tool to measure and reflect on bug distributions over components between each release (to determine which components account for the most bugs). For example, from reading the mailing list over the past few weeks, many users are experiencing problems with the combined effects of failure detection, merging and shunning/auto-reconnect. I don't know how many of these resulted in bugs and how many were just configuration errors, but it seems to be a source of frustration. Doesn't this indicate that we need to enhance testing in those areas?
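To make item 4 a little more concrete, here is a minimal sketch of what checking a recorded event trace against a state machine description could look like. Everything in it is an assumption made for illustration: the states, the events, the class name ProtocolSpecCheck and the transition table are hypothetical and do not describe any actual JGroups protocol or the Simulator API. The idea is only that a trace captured from a Simulator run (with failures injected) is replayed against a table of allowed transitions, and the first forbidden step is reported.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a table-driven state machine used as a correctness oracle.
// States and events are invented for illustration, not taken from JGroups.
final class ProtocolSpecCheck {
    enum State { DISCONNECTED, CONNECTING, CONNECTED }
    enum Event { CONNECT, VIEW_INSTALLED, TIMEOUT, DISCONNECT }

    // Allowed transitions: state -> (event -> next state).
    private final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);

    // Registers one allowed transition and returns this spec for chaining.
    ProtocolSpecCheck allow(State from, Event on, State to) {
        table.computeIfAbsent(from, s -> new EnumMap<>(Event.class)).put(on, to);
        return this;
    }

    // Replays the trace from the start state.
    // Returns the index of the first event the spec forbids, or -1 if the trace conforms.
    int firstViolation(State start, List<Event> trace) {
        State current = start;
        for (int i = 0; i < trace.size(); i++) {
            Map<Event, State> row = table.get(current);
            State next = (row == null) ? null : row.get(trace.get(i));
            if (next == null) {
                return i;
            }
            current = next;
        }
        return -1;
    }

    // An assumed, deliberately tiny connection lifecycle used as the example spec.
    static ProtocolSpecCheck example() {
        return new ProtocolSpecCheck()
            .allow(State.DISCONNECTED, Event.CONNECT, State.CONNECTING)
            .allow(State.CONNECTING, Event.VIEW_INSTALLED, State.CONNECTED)
            .allow(State.CONNECTING, Event.TIMEOUT, State.DISCONNECTED)
            .allow(State.CONNECTED, Event.DISCONNECT, State.DISCONNECTED);
    }
}
```

With this example spec, a trace of CONNECT, VIEW_INSTALLED, DISCONNECT starting from DISCONNECTED conforms, whereas CONNECT followed directly by DISCONNECT is rejected at index 1. A real check would need a spec derived from the protocol's documented behaviour and traces produced by the Simulator rather than hand-written lists.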
Future tasks
distributed testing of JGroups (writing testcases which execute on multiple nodes). The need for this is partly mitigated by executing the testsuite in parallel under TestNG.