Search notes:

Perl module: Tree::Create::DepthFirst

Tree::Create::DepthFirst allows to create a tree (Tree::Simple) by adding its nodes in the same sequence (or order) which they would/will be returned if the tree is traversed depth first.

Simple demonstration

The following simple script demonstrates how Tree::Create::DepthFirst is supposed to work.
The array @nodes is a list of elements, each with a depth and a value.
The elements of this list are added ($tree_creator->addNode()) to the tree being created.
Then, the final tree is stored in $tree ($tree = $tree_creator->getTree()).
Now, the $tree's method traverse will call the supplied callback function for each node (and leaf) found in the tree. The callback function verifis if in fact the nodes stored in the tree are returned in the same order as they were previously added
#!/usr/bin/perl
use warnings;
use strict;
use Tree::Create::DepthFirst;

my $tree_creator = Tree::Create::DepthFirst->new();

my @nodes = (
  {depth => 0    , val=>'0'          },
  {depth =>  1   , val=>'0.0'        },
  {depth =>  1   , val=>'0.1'        },
  {depth => 0    , val=>'1'          },
  {depth =>  1   , val=>'1.0'        },
  {depth =>   2  , val=>'1.0.0'      },
  {depth =>   2  , val=>'1.0.1'      },
  {depth =>  1   , val=>'1.1'        },
  {depth =>   2  , val=>'1.1.0'      },
  {depth =>    3 , val=>'1.1.0.1'    },
  {depth =>     4, val=>'1.1.0.1.0'  },
  {depth =>     4, val=>'1.1.0.1.1'  },
  {depth =>     4, val=>'1.1.0.1.2'  },
  {depth =>   2  , val=>'1.1.1'      },
  {depth =>   2  , val=>'1.1.2'      },
  {depth =>  1   , val=>'1.2'        },
  {depth => 0    , val=>'2'          },
  {depth => 0    , val=>'3'          },
  {depth =>  1   , val=>'3.0'        },
  {depth =>   2  , val=>'3.0.0'      },
  {depth =>    3 , val=>'3.0.0.0'    },
  {depth =>     4, val=>'3.0.0.0.0'  },
  {depth => 0    , val=>'4'          },

);

# First add the nodes in the sequence that we want them
# returned if the final tree is traversed depth first:
#
for my $node (@nodes) {
  $tree_creator->addNode($node->{depth}, $node->{val});
}

my $tree = $tree_creator->getTree();

# Then verify if they are actually returned in the
# same order:
#
$tree->traverse (sub {
  my $node = shift;

  my $next_expected = shift @nodes;

  die unless $next_expected->{depth} == $node->getDepth();
  die unless $next_expected->{val}   eq $node->getNodeValue();

});
Github repository PerlModules, path: /Tree/Create/DepthFirst/script.pl

Creating a tree from HTML ul and li tags/elements

Here's a piece of a html docoument (parse-html-ul-li.html):
  <ul><li>0<ul><li>0.0<ul><li>0.0.0<ul><li>0.0.0.0<li>0.0.0.1</ul>
  </ul></ul><ul><li>0.1</li></ul></li><li>1<ul><li>1.0</li><li>1.1
  </li><ul><li>1.1.0<li>1.1.1</ul></ul></li><li>2</li><li>3<ul><li>3.0
  <ul><li>3.0.0</li><li>3.0.1</li><ul><li>3.0.1.0<li>3.0.1.1</ul>
  <li>3.0.2</li></ul></li></ul><ul><li>3.1</li><li>3.2</li></ul></li></ul>
When rendered in a browser, this gibberish looks like this:
Now, I want to reconstruct this tree with a Perl script.
So, I combine the HTML::Parser Perl module with Tree::Create::DepthFirst:
#!/usr/bin/perl
use warnings;
use strict;

use Tree::Create::DepthFirst;
use HTML::Parser;

my $depth = -1;
my $tree_creator = Tree::Create::DepthFirst->new();

my $html_parser = HTML::Parser->new(
  start_h       => [\&start_tag  , 'tag'  ],
  end_h         => [\&end_tag    , 'tag'  ],
  text_h        => [\&text       , 'text' ],
);

$html_parser->parse_file('parse-html-ul-li.html') or die;

my $tree_simple = $tree_creator->getTree();


$tree_simple->traverse(sub {
  my $node = shift;
  print "  " x $node->getDepth() . $node->getNodeValue() . "\n";
});

sub start_tag {
  my $tag = shift;
  
  $depth++ if ($tag eq 'ul');
}
sub end_tag {
  my $tag = shift;
  $depth-- if ($tag eq '/ul');
}
sub text {
  my $text = shift;
  $text =~ s/\s//g;

  $tree_creator->addNode($depth, $text) if $depth >= 0 and $text =~ /\w/;
}
Github repository PerlModules, path: /Tree/Create/DepthFirst/parse-html-ul-li.pl
When the script is executed, it will print
0
  0.0
    0.0.0
      0.0.0.0
      0.0.0.1
  0.1
1
  1.0
  1.1
    1.1.0
    1.1.1
2
3
  3.0
    3.0.0
    3.0.1
      3.0.1.0
      3.0.1.1
    3.0.2
  3.1
  3.2

See also

Perl modules
Source code in the GitHub repository.

Index