Perl module Parse::RecDescent
use strict;
use warnings;
use Parse::RecDescent;
my $grammar = q {
operation : ident operator ident
{printf "Parsed operation: 1st ident: %s %s %s\n", $item[1], $item[2], $item[3];}
operator : '+' | '-' | '*' | '/'
ident : char char_or_num(s?) # s: one or more, s? zero or more
{ $item[1] . join "", @{$item[2]} }
char_or_num: char | num
num : /\d/
char : /[a-zA-Z]/
};
my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar!";
defined $parser->operation('Foo + Bar') or print "didn't match\n";
defined $parser->operation('x10 / y15') or print "didn't match\n";
defined $parser->operation('100 / 42') or print "didn't match\n"; # Doesn't match because operation requires idents, and an ident must start with a letter.
defined $parser->operation('a - b ') or print "didn't match\n";
print "\nUniversal token prefix pattern: >$Parse::RecDescent::skip<\n"; # \s*
The script prints:
Parsed operation: 1st ident: Foo + Bar
Parsed operation: 1st ident: x10 / y15
didn't match
Parsed operation: 1st ident: a - b
Universal token prefix pattern: >\s*<
Rules and productions
A rule consists of one or more alternative productions, separated by |. In the grammar, rules are defined like so:
rule : production_1 | production_2 |
production_3 | production_4
A production consists of zero or more of the following items:
subrule: name of another rule
token: pattern or string
action: block of Perl code
directive: instruction to the parser
comment
A rule matches text if any of its productions matches. Productions are tried in the order written, and the first one that matches wins.
A production matches if all of its items match, in the order stated.
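The item kinds listed above can be seen together in one small grammar. The following is only a sketch: the rule names (statement, name, value) and the input strings are made up for illustration. It uses a comment, two string tokens, the <commit> directive, two subrules, pattern tokens and actions. After <commit>, a failure of the first production makes the whole rule fail instead of falling through to the second production.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical grammar, for illustration only.
my $grammar = q {
    # A comment: ignored by the grammar compiler.
    statement : 'let' <commit> name '=' value
                { "let $item{name} = $item{value}" }
              | name '=' value
                { "set $item{name} = $item{value}" }
    name      : /[a-zA-Z_]\w*/
    value     : /\d+/
};

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar!";

for my $text ('let x = 1', 'y = 22', 'let = 3') {
    # The value of a rule is the value of the last item of the matching
    # production, here the string built by the action.
    my $result = $parser->statement($text);
    print defined $result ? "$result\n" : "didn't match\n";
}
```

The third input is rejected because the first production has already committed after 'let'. Without <commit>, the second production would accept it, treating let as a name.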
First production, not longest
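Unlike yacc, which as a bottom-up parser effectively follows all alternatives in parallel, Parse::RecDescent tries the productions of a rule in the order written and commits to the first one that matches, even if a later production would match more text.
This is a fundamental difference between bottom-up and recursive descent parsing. In the first grammar below, the second production of seq_1 is therefore never reached.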
#!/usr/bin/perl
use warnings;
use strict;
use Parse::RecDescent;
my $grammar_1 = q { #_{
start : seq_1 seq_2
seq_1 : 'A' 'B' 'C' 'D' { print "seq_1: " . join (" ", @item[1..$#item]) . "\n" }
| 'A' 'B' 'C' 'D' 'E' 'F' { print "seq_1: " . join (" ", @item[1..$#item]) . "\n" }
seq_2 : character(s)
character: /\w/ { print "character: $item[1]\n" }
#_} };
my $grammar_2 = q { #_{
start: seq_1 seq_2
seq_1 : 'A' 'B' 'C' 'D' 'E' 'F'
{ print "seq_1: " . join (" ", @item[1..$#item]) . "\n" }
| 'A' 'B' 'C' 'D'
{ print "seq_1: " . join (" ", @item[1..$#item]) . "\n" }
seq_2 : character(s)
character: /\w/
{ print "character: $item[1]\n" }
#_} };
print "Parse with first grammar:\n";
parse($grammar_1);
print "\nParse with second grammar:\n";
parse($grammar_2);
sub parse { #_{
my $grammar = shift;
my $parser=Parse::RecDescent->new($grammar);
$parser->start("A B C D E F G");
} #_}
The script prints:
Parse with first grammar:
seq_1: A B C D
character: E
character: F
character: G
Parse with second grammar:
seq_1: A B C D E F
character: G
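Besides putting the longer production first, as the second grammar does, the dependence on ordering can be avoided by factoring out the common prefix. The following sketch assumes a helper rule named tail, which is not part of the grammars above. The optional subrule tail(?) yields a reference to an array with zero or one elements.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical variant of seq_1: a single production with an optional tail.
my $grammar = q {
    seq_1 : 'A' 'B' 'C' 'D' tail(?)
            { join ' ', @item[1 .. 4], @{ $item[5] } }
    tail  : 'E' 'F'
            { "$item[1] $item[2]" }
};

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar!";

for my $text ('A B C D E F G', 'A B C D G') {
    my $matched = $parser->seq_1($text);
    print defined $matched ? "seq_1: $matched\n" : "didn't match\n";
}
# Prints:
# seq_1: A B C D E F
# seq_1: A B C D
```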
@item and %item
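$item[0] and $item{__RULE__} both hold the name of the current rule.
$item[n] holds the value of the nth item of the current production.
%item holds the same values, keyed by item name (for a subrule, the name of the subrule).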
#!/usr/bin/perl
use warnings;
use strict;
use Parse::RecDescent;
my $grammar = q {
seq : one_char three_chars two_chars
{ main::print_item(\@item, \%item); }
one_char : character
{ main::print_item(\@item, \%item); }
three_chars : character character character
{ main::print_item(\@item, \%item); }
two_chars : character character
{ main::print_item(\@item, \%item); }
character : /\w/
{ main::print_item(\@item, \%item);}
};
my $parser=Parse::RecDescent->new($grammar) or die;
$parser->seq("A B C D E F");
sub print_item {
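my @item = @{ $_[0] };
my %item = %{ $_[1] };
print join " -- ", @item;
print "\n";
print map {$_ . "=" . $item{$_} . "; "} keys %item;
print "\n\n";
# The sub's return value is the value of the action that called it. Because the action
# is the last item of its production, that value also becomes the value of the rule.
# (Inside an action, $return is a lexical variable of the generated code;
# assigning to $Parse::RecDescent::return from here would not set it.)
return join "", @item[1..$#item];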
}
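Instead of printing, an action can also collect the items into a data structure and hand it back through $return. The following is a minimal sketch. The rule names (pair, key, value) and the hash keys are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical grammar: the action gathers positional and named items.
my $grammar = q {
    pair  : key '=' value
            { $return = { rule      => $item[0],
                          rule_name => $item{__RULE__},
                          by_index  => [ @item[1 .. $#item] ],
                          key       => $item{key},
                          value     => $item{value} } }
    key   : /[a-z]+/
    value : /\d+/
};

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar!";
my $pair   = $parser->pair('size = 7') or die "didn't match";

# $item[0] and $item{__RULE__} both name the rule.
print "$pair->{rule} / $pair->{rule_name}: $pair->{key} -> $pair->{value}\n";
# The positional items seen by the action: both subrules and the '=' token.
print "items: @{ $pair->{by_index} }\n";
# Prints:
# pair / pair: size -> 7
# items: size = 7
```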
Source code
Apparently, the module is hosted on GitHub.
See also
Perl module Marpa::R2