[fpc-pascal] FPCUnit - parallel test suite execution

Thu Mar 19 11:56:31 CET 2015

Graeme Geldenhuys wrote on Thu, 19 Mar 2015:

> On 2015-03-19 08:10, Mark Morgan Lloyd wrote:
>> In any event the output would need to be presented in a reproducible
>> sequence.
>
> A good point, but thinking about it I don't think it would be that hard.
> The text output might be a bit more tricky, but not impossible.

The easiest, most maintainable, least invasive (and in case you don't  
need/want the threads to interact, often also the best performing) way  
to introduce parallelism is to do it as coarse-grained as possible.  
I.e., simply run multiple self-contained unit-test programs in  
parallel, rather than trying to parallelise the execution of unit  
tests within a single program. Since the tests by definition are  
supposed to be independent, introducing dependencies by adding  
threading is a step backwards in several ways (performance,
maintainability, debuggability, ...).

This, of course, requires the availability of a sufficient number of
test programs to actually execute in parallel (which definitely can be  
a problem if you're used to putting all unit tests into a single  
program). But if you do have that amount of independent work  
available, you'll generally get much better performance because you  
won't need any locking anywhere (even the overhead of the OS starting
a program can easily be lower than that of work queue maintenance
and contention over a bunch of locks within a program, especially if
you have very fine-grained work units such as a single function call
to evaluate and test). You'll also have much better
productivity because you'll have to spend much less time on writing,  
optimising and debugging the actual testing framework (and on taking  
into account all possible threading hazards when extending it).
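
As a rough illustration of the coarse-grained approach, here is a
minimal sketch in Python (chosen only for brevity; the same could be
written in Pascal). It assumes each test program is a self-contained
executable that reports failure through its exit code. The names
run_one and run_all are hypothetical, and the demo commands are
stand-ins for real test binaries.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(command):
    """Run one self-contained test program; return (exit code, output)."""
    done = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return done.returncode, done.stdout


def run_all(commands, jobs=4):
    """Run every command as its own OS process, at most `jobs` at a time."""
    # The worker threads only wait on child processes; the test programs
    # themselves share no memory, so no locking is needed between them.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # map() returns results in input order, whatever order they finish in.
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Hypothetical stand-ins for real test programs.
    demo = [[sys.executable, "-c", f"print('suite {i} ok')"] for i in range(3)]
    for code, output in run_all(demo):
        print(code, output, end="")
```

The only shared state here is the list of results, which the pool
assembles; everything else is isolated by the OS process boundary.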

An extreme example of this is the FPC testsuite: on an IBM Power8  
system with 10 cores that each can run 16 threads, the entire  
testsuite of 4000-5000 test programs finishes in about 30 seconds  
(including running fpmake install on the rtl and packages  
directories). The parallelism, in this case, is handled via make -j  
160, but it could just as well be a Pascal program that fires up  
threads and launches programs from within those threads. Every  
parallel invocation of the test runner (dotest) executes 10 to 20  
tests sequentially and the results are all written to separate files.  
At the very end, all of those text files are concatenated in the right  
order, so the output is deterministic.
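
The batching and ordered concatenation described above can be
sketched as follows. This is not how dotest or the testsuite Makefile
is actually implemented; it is a Python illustration under the
assumption that each test is a command whose exit code signals
success. The names run_batch and run_suite, the log file naming and
the "ok"/"FAILED" wording are all made up for the example.

```python
import subprocess
import sys
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_batch(index, batch, out_dir):
    """Run one batch of tests sequentially; write results to its own file."""
    path = Path(out_dir) / f"batch_{index:05d}.log"
    with open(path, "w") as log:
        for name, command in batch:
            code = subprocess.run(
                command,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            ).returncode
            log.write(f"{name}: {'ok' if code == 0 else 'FAILED'}\n")
    return path


def run_suite(tests, batch_size, jobs, out_dir):
    """Run batches in parallel, then join the per-batch files in batch order."""
    batches = [tests[i:i + batch_size] for i in range(0, len(tests), batch_size)]
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        paths = list(
            pool.map(run_batch, range(len(batches)), batches,
                     [out_dir] * len(batches))
        )
    # Concatenating in batch order makes the report independent of
    # the order in which the batches happened to finish.
    return "".join(path.read_text() for path in paths)


if __name__ == "__main__":
    # Hypothetical tests: odd-numbered ones exit with a non-zero code.
    tests = [
        (f"t{i}", [sys.executable, "-c", f"import sys; sys.exit({i % 2})"])
        for i in range(5)
    ]
    with tempfile.TemporaryDirectory() as out_dir:
        print(run_suite(tests, batch_size=2, jobs=3, out_dir=out_dir), end="")
```

Because every batch writes only to its own file, the workers never
contend for a shared output stream, and the final report is the same
on every run.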

Most OS kernels have a lot of SMP optimisations, probably way better
than any of us could ever write (or even would want to bother to
write and maintain for something like a testing harness). I very much
doubt a multi-threaded implementation of the dotest program, apart  
from the work this would entail, could outperform the current approach  
without several weeks or even months of work.

Jonas