Results of profiling parallel tests (was Re: [1194] grep in scalar context doesn't short-circuit (it can' t - side effects) so the)
Nicholas Clark
nick at ccl4.org
Sat Aug 30 21:48:13 UTC 2008
On Sat, Aug 30, 2008 at 09:03:56PM +0000, nicholas at hexten.net wrote:
> Revision: 1194
> Author: nicholas
> Date: 2008-08-30 21:03:56 +0000 (Sat, 30 Aug 2008)
>
> Log Message:
> -----------
> grep in scalar context doesn't short-circuit (it can't - side effects) so the
> implicit loop with a recursive call in the grep in _not_empty() was expensive.
> Replace it with a foreach, that short-circuits by returning early. This takes
> the time in this method down from 52 seconds to 8. Win!
Which basically gets us from
http://www.wu-wien.ac.at/usr/ma/nclark/blead_harness_34237_r1189/
Calls P F Inclusive Exclusive Subroutine
9515804 2 1 621.50612 175.03842 TAP::Parser::Scheduler::_gather
7525708 2 1 316.94195 129.37304 TAP::Parser::Scheduler::_not_empty
70049 2 2 27.81537 20.28561 IO::Select::can_read
92717 2 1 195.24093 17.34687 TAP::Parser::Scheduler::_find_next_job
1549 1 1 16.98888 16.98888 IPC::Open3::xfork
209137 1 1 54.29413 16.58686 TAP::Parser::__ANON__[../lib/TAP/Parser.pm:1234]
961494 42 6 15.23568 15.23568 TAP::Parser::__ANON__[../lib/TAP/Parser.pm:72]
235565 1 1 91.79267 15.19162 TAP::Parser::Grammar::tokenize
to
http://www.wu-wien.ac.at/usr/ma/nclark/blead_harness_34237_r1195/
Calls P F Inclusive Exclusive Subroutine
77782 2 2 39.29334 30.89907 IO::Select::can_read
1549 1 1 21.23763 21.23763 IPC::Open3::xfork
209137 1 1 55.51446 16.78572 TAP::Parser::__ANON__[../lib/TAP/Parser.pm:1234]
962506 42 6 15.91717 15.91717 TAP::Parser::__ANON__[../lib/TAP/Parser.pm:72]
235556 1 1 95.39925 15.67889 TAP::Parser::Grammar::tokenize
ie time spent in the gubbins of the parallel test job scheduler has dropped
from "lots" to "bugger all".
I'm not entirely convinced that the timing slowdown of Devel::NYTProf is
linear, because I don't see anywhere near as large a CPU second drop when
running on an optimised build without profiling.
1: Does NYTProf slow down subroutine calls way more than most other statements?
2: Is NYTProf getting confused about inclusive times in the first profile,
with the inefficient recursive methods _gather() and _not_empty()?
Anyone else can play along with these simple steps:
$ rsync -za rsync://ftp.linux.activestate.com/perl-current/ bleadperl
$ cd bleadperl
$ rm -rf ext/Test/Harness
$ svn co https://svn.hexten.net/tapx/trunk ext/Test/Harness
$ svn co http://perl-devel-nytprof.googlecode.com/svn/trunk ext/Devel/NYTProf
$ ./Configure -des -Dusedevel && make -j3 test_prep
$ cd t
$ export TEST_JOBS=3
$ ./perl -I../lib -d:NYTProf harness
and then generate your HTML with something like
$ ./perl -I../lib ../ext/Devel/NYTProf/bin/nytprofhtml --out ~/public_html/blead_harness_34237_r1195
Nicholas Clark
More information about the tapx-dev
mailing list