TAP::Parser should be *easier* to subclass [PATCH]

Steve Purkis steve at purkis.ca
Mon Jun 9 21:31:45 UTC 2008


Hi Andy,

On Jun 8, 2008, at 20:32, Andy Armstrong wrote:

> I'm currently working on TAP::Filter which allows filters that can
> modify the TAP stream to be placed in front of the parser. The use  
> case
> is that, at work, we want to be able to fault tests that have no
> description. With TAP::Filter you can synthesise an additional failed
> test for each real test that has no description.

Sounds useful.

Ok - so filters are basically subclasses of Source & Grammar, right?


> That exposed the following issue: between creating a parser and  
> using it
> you might want to replace the source, the iterator or the grammar. To
> replace the source you'd want to do something like:
>
>      $parser->_source( Some::Filter->new( $parser->_source ) );
>
> Similarly to replace the grammar you'd want to do
>
>      $parser->_grammar( Some::Filter->new( $parser->_grammar ) );
>
> The problem is that if you create the source /and/ grammar in the
> constructor it's too late to replace the source (on which the grammar
> depends) after the constructor returns. If you delay creation of the
> grammar until the first call to next() you never get the chance to  
> add a
> filter into the grammar chain - because the grammar doesn't come into
> existence until too late.

This is something I've noticed throughout the TAP::* codebase - a lot  
of work is done in the constructor / init methods.  While this can be  
good for performance, it does leave you with little wiggle room later  
on (another example: in the testsuite, I've had to subclass  
TAP::Parser as 'EmptyParser' to avoid all the magic in _initialize -  
all I needed was a fake parser).  In general, I prefer to do less in  
constructors because of this.


> My solution was to make $parser->_grammar lazily create a grammar on
> demand - so if you do
>
>      my $parser = TAP::Parser->new( ... ) # source created
>      my $result = $parser->next; # next calls $self->_grammar which
>                                  # causes a grammar to be created
>
> but if you do
>
>      my $parser = TAP::Parser->new( ... ) # source created
>      $parser->_grammar( Some::Filter->new( $parser->_grammar ) );
>      my $result = $parser->next;
>
> the first call to _grammar creates a grammar before entering next  
> so you
> can get the grammar and hook one of your own in front of it.
>
> This works OK too
>
>      my $parser = TAP::Parser->new( ... ) # source created
>      $parser->_source( Some::Filter->new( $parser->_source ) );
>      my $result = $parser->next;
>
> because it's OK to replace _source before the grammar is created  
> and, in
> this case, that doesn't happen until next needs it.
>
> I'd be interested in comments on that approach.

Sounds like that'll work well.

I'm assuming the Filters would be subclasses of Source & Grammar?  If  
that's the case, you mightn't need to worry about the chicken & egg  
stuff with the patch I've just submitted - the parser would  
instantiate the right sub-classes.  Of course, if you needed to do  
custom initialization on them, then the patch won't help - in that  
case, the above is quite complimentary.

Either way, I still think lazy init is the way to go.


> It seems to me to do the
> right thing - the only caveat is that if you change the grammar before
> changing the source
>
>      my $parser = TAP::Parser->new( ... ) # source created
>      $parser->_grammar( Some::Filter->new( $parser->_grammar ) );
>      $parser->_source( Some::Filter->new( $parser->_source ) );
>      my $result = $parser->next;
>
> the change to the source is ineffectual because the grammar has  
> already
> been created using the original source. I can probably make it  
> error in
> that case.

Well, one option is to make Grammar get its source (I'm assuming you  
mean its 'stream' attr?) from the parser all the time...  Then it  
only needs to know about the parser, and there's no problem changing  
it.  Unless you've already called 'next', in which case you should be  
slapped (eg: an exception should be raised).

Also, another idea popped into my head - in case you've not  
considered this approach (bearing in mind it's late, crack-fueled,  
and I've not fully thought this through):  Why not move to an event- 
based parser?  Then you don't have to worry about the user doing  
silly things between calls to 'next'.  Do little in new().  Give  
users access to the mutators, & validate there.  User registers  
callbacks for events they're interested in.  Introduce a parse()  
method...

Ok, so here are some reasons why not:
1. *completely* breaks backwards compat :-/
2. users could still modify the parser while parsing (albeit, with  
more effort).
3. it'd be a lot of work.  for what gain?

Looks like I'm swiftly talking myself out of that idea...  left in  
for completeness.

Anyways, I still think you've got a good solution above: it's not  
much work, and an improvement on what's there.  Hopefully it means t/ 
lib/EmptyParser.pm can be retired before it's even released ;-)

Cheers,
-Steve



More information about the tapx-dev mailing list