Perl module Parse::RecDescent

Rules and productions

A rule consists of one or more alternative productions, separated by |. In the grammar, a rule is defined like so:

rule : production_1 | production_2 |
  production_3 | production_4

A production consists of zero or more of the following items:

subrule: name of another rule
token: pattern or string
action: block of Perl code
directive: instruction to the parser
comment

A rule matches text if any of its productions matches. The productions are tried in the order written, and the first one that matches is used.

A production matches if all of its items match, one after another, in the order stated.
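
The item types listed above can be seen together in one small grammar. The following is a minimal sketch, not taken from the module's documentation: the rule names greeting, salutation and name and the input string are made up for illustration. salutation and name are subrules, 'Hello', 'Hi', '!' and /\w+/ are tokens, <commit> is a directive (it has no visible effect here and is only included to show where a directive sits), and the block in braces is an action.

```perl
#!/usr/bin/perl
use warnings;
use strict;

use Parse::RecDescent;

# Hypothetical grammar, for illustration only.
my $grammar = q {

  # a comment runs from the hash sign to the end of the line

  greeting   : salutation name <commit> '!'
               { $return = "$item{salutation}, $item{name}" }

  salutation : 'Hello'
             | 'Hi'

  name       : /\w+/

};

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar";

# salutation tries 'Hello' first, fails, and then matches 'Hi'.
my $result = $parser->greeting("Hi World!");
print defined $result ? "$result\n" : "no match\n";
```

With the input shown, the second production of salutation is the first one that matches, so the script should print Hi, World.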

First production, not longest

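Unlike yacc, which in effect pursues all alternatives in parallel (breadth-first), Parse::RecDescent tries the productions in the order written (depth-first) and uses the first one that matches, even if a later production would match more text.

This is a fundamental difference between bottom-up and recursive descent parsing. In the script below, both grammars contain the same two productions for seq_1, only in a different order, and they parse the same input differently.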

#!/usr/bin/perl
use warnings;
use strict;

use Parse::RecDescent;

my $grammar_1 = q { #_{

  start    :  seq_1 seq_2

  seq_1    :   'A' 'B' 'C' 'D'         { print "seq_1: " . join (" ", @item[1..$#item]) . "\n" }
           |   'A' 'B' 'C' 'D' 'E' 'F' { print "seq_1: " . join (" ", @item[1..$#item]) . "\n" }

  seq_2    : character(s)

  character: /\w/                      { print "character: $item[1]\n" }

#_} };
my $grammar_2 = q { #_{

  start:  seq_1 seq_2

  seq_1    :  'A' 'B' 'C' 'D' 'E' 'F'
               { print "seq_1: " . join (" ", @item[1..$#item]) . "\n" }
            | 'A' 'B' 'C' 'D'
               { print "seq_1: " . join (" ", @item[1..$#item]) . "\n" }


  seq_2    : character(s)

  character: /\w/
               { print "character: $item[1]\n" }

#_} };

print "Parse with first grammar:\n";
parse($grammar_1);
print "\nParse with second grammar:\n";
parse($grammar_2);

sub parse { #_{
  my $grammar = shift;

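  my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar";

  # Only the actions in the grammar print anything; the return value is not used.
  $parser->start("A B C D E F G");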

} #_}

GitHub repository PerlModules, path: /Parse/RecDescent/first-production_not-longest.pl

The script prints:

Parse with first grammar:
seq_1: A B C D
character: E
character: F
character: G

Parse with second grammar:
seq_1: A B C D E F
character: G

@item and %item

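$item[0] and $item{__RULE__} both contain the name of the current rule.

$item[n] is the value of the nth item in the current production, counting from 1. %item holds the values keyed by item name, for example $item{one_char} in the rule seq below.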

#!/usr/bin/perl
use warnings;
use strict;

use Parse::RecDescent;

my $grammar = q {

  seq         : one_char three_chars two_chars
              { main::print_item(\@item, \%item); }

  one_char    : character
              { main::print_item(\@item, \%item); }

  three_chars : character character character
              { main::print_item(\@item, \%item); }
  
  two_chars   : character character
              { main::print_item(\@item, \%item); }

  character   : /\w/
              { main::print_item(\@item, \%item);}
};

my $parser=Parse::RecDescent->new($grammar) or die;

$parser->seq("A B C D E F");


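sub print_item {
  no warnings 'once';   # $Parse::RecDescent::return is mentioned only once
  my @item = @{ $_[0] };
  my %item = %{ $_[1] };
  # Positional items: $item[0] is the rule name, the rest are the matched items.
  print join " -- ", @item;
  print "\n";
  # Named items, in hash order (that is, in no particular order).
  print map {$_ . "=" . $item{$_} . "; "} keys %item;
  print "\n\n";
  # The assignment is the last statement, so the joined string is also the
  # value that print_item returns to the calling action.
  $Parse::RecDescent::return = join "", @item[1..$#item];
}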

GitHub repository PerlModules, path: /Parse/RecDescent/item.pl
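
Access by position and access by name can also be compared side by side in a smaller grammar. This is a minimal sketch: the rule names pair, key and value, the hash keys in the action and the input string are all hypothetical and only serve the illustration.

```perl
#!/usr/bin/perl
use warnings;
use strict;

use Parse::RecDescent;

# Hypothetical grammar, for illustration only.
my $grammar = q {

  pair  : key '=' value
          { $return = {
              rule_by_index  => $item[0],
              rule_by_name   => $item{__RULE__},
              key_by_index   => $item[1],
              key_by_name    => $item{key},
              value_by_index => $item[3],
              value_by_name  => $item{value},
            } }

  key   : /\w+/

  value : /\w+/

};

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar";

my $pair = $parser->pair("color = red") or die "No match";
print "$_ => $pair->{$_}\n" for sort keys %$pair;
```

Both rule_* entries should contain pair, both key_* entries color and both value_* entries red. Note that the '=' token occupies $item[2], so value is $item[3].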
