Parallel thoughts
Andy Armstrong
n at rciss.us
Mon Oct 1 22:41:34 BST 2007
We currently have three process allocation models:
1) Single process. Actually two processes - Test::Harness in one and the
   test script (sorry Andy, program) in the other. This is the default.
2) Multiplexed parsers. Both Eric's TAP::Harness::Parallel and the code I
   committed today work like this. Multiple parsers are created in the main
   process and then we poll or select them and process input from each of
   them as it becomes available.
3) Forked parsers. The experimental branches/paa works like this. The main
   process forks before it enters the inner TAP processing loop. Each parser
   gets its own process.
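To make model 2 concrete, here is a minimal sketch of the multiplexing idea. It is written in Python purely for illustration and is not the TAP::Harness::Parallel code or the code committed today. The names run_multiplexed and classify are made up for this example, the "parser" only counts ok / not ok lines, and it assumes a POSIX system where select works on pipes.

```python
import os
import selectors
import subprocess
import sys


def classify(line, counts):
    """Toy stand-in for a TAP parser: count 'ok' and 'not ok' lines only."""
    if line.startswith(b"not ok"):
        counts["not ok"] += 1
    elif line.startswith(b"ok"):
        counts["ok"] += 1


def run_multiplexed(commands):
    """Start every test program, then parse all of their output in THIS process."""
    sel = selectors.DefaultSelector()
    counts = {}
    procs = []
    for name, argv in commands.items():
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
        procs.append(proc)
        counts[name] = {"ok": 0, "not ok": 0}
        # Per-stream state: the test name plus any partial line read so far.
        sel.register(proc.stdout.fileno(), selectors.EVENT_READ, [name, b""])

    # Loop until every stream has reached end of file.
    while sel.get_map():
        for key, _ in sel.select():
            state = key.data
            chunk = os.read(key.fd, 4096)
            if not chunk:
                # End of file: handle a final line with no trailing newline.
                if state[1]:
                    classify(state[1], counts[state[0]])
                sel.unregister(key.fd)
                continue
            state[1] += chunk
            # Keep the last (possibly incomplete) piece for the next read.
            *lines, state[1] = state[1].split(b"\n")
            for line in lines:
                classify(line, counts[state[0]])

    for proc in procs:
        proc.stdout.close()
        proc.wait()
    return counts


if __name__ == "__main__":
    demo = {
        "a.t": [sys.executable, "-c", "print('1..2'); print('ok 1'); print('not ok 2')"],
        "b.t": [sys.executable, "-c", "print('1..1'); print('ok 1')"],
    }
    print(run_multiplexed(demo))
```

Note that every call to classify happens in the one parent process, whatever the number of test programs running.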
The problem with 2 is that all the parsing is still happening in a
single process. This bottleneck gets worse as you throw more cores at
the problem. It's a win for I/O bound tests and highly CPU intensive
tests if you have lots of cores. For everything else it sucks a bit.
It's not hard to see why. You don't have to write a particularly
speedy test for it to be able to generate TAP faster than we can
parse it. As long as all the TAP parsing takes place in a single
process we're frequently going to be CPU bound in that process.
The problem with 3 is that the inner TAP parsing loop runs in some
other process - so it's hard to feed the result for each line of TAP
back to the mothership. The branches/paa version ignores the problem
- and because it's using the console formatter it doesn't matter too
much. But it'd suck if, for example, you had a formatter trying to
create an HTML document that unified all the test results.
But ultimately 3 - or something rather like it - is the way we have
to go if we're ever going to spread our CPU load around.
So we could do something like this:
* The default is single process
* There's a safe parallel option which is handy for I/O bound tests
* There's a speedy forking option which doesn't capture test output
  line by line - but it still gets the summaries right (and correctly
  lists which test numbers failed) and it scales well to any number
  of cores.
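A rough sketch of what the forking option could look like, again in Python for illustration only and not the branches/paa code. Each child parses its own test's output and sends just a summary back through a pipe. pickle stands in for Storable here, and run_forked and parse_tap are hypothetical names. The toy parser numbers tests in order of appearance and ignores plans, explicit test numbers and directives. It assumes a platform with fork.

```python
import os
import pickle
import subprocess
import sys


def parse_tap(lines):
    """Toy parser: count passes and record failed test numbers by position."""
    summary = {"passed": 0, "failed": []}
    number = 0
    for line in lines:
        if line.startswith("not ok"):
            number += 1
            summary["failed"].append(number)
        elif line.startswith("ok"):
            number += 1
            summary["passed"] += 1
    return summary


def run_forked(commands):
    """Fork one child per test; each child parses and reports a summary only."""
    children = {}
    for name, argv in commands.items():
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: run the test, parse its output here, send the summary home.
            os.close(read_fd)
            status = 1
            try:
                with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True) as proc:
                    summary = parse_tap(proc.stdout)
                with os.fdopen(write_fd, "wb") as out:
                    pickle.dump(summary, out)
                status = 0
            finally:
                # Leave without running the parent's cleanup handlers.
                os._exit(status)
        # Parent: keep only the read end and move on to the next test.
        os.close(write_fd)
        children[name] = (pid, read_fd)

    results = {}
    for name, (pid, read_fd) in children.items():
        with os.fdopen(read_fd, "rb") as src:
            results[name] = pickle.load(src)
        os.waitpid(pid, 0)
    return results


if __name__ == "__main__":
    demo = {
        "a.t": [sys.executable, "-c", "print('1..2'); print('ok 1'); print('not ok 2')"],
        "b.t": [sys.executable, "-c", "print('1..1'); print('ok 1')"],
    }
    print(run_forked(demo))
```

The parent never sees individual TAP lines, only the finished summaries, which is exactly the trade-off described above: fine for a console formatter, not enough for anything that wants per-line results.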
As it stands option 3 also has dependencies on fork and Storable. The
Storable habit would be hard to kick. I guess we could just say it
doesn't work if you don't have Storable. There might be a way of
expressing the forking idiom using Win32 threads.
--
Andy Armstrong, Hexten