Study Guide: Trees

Instructions

This is the companion guide to Quiz 4 with links to past lectures, assignments, and handouts, as well as isomorphic quiz problems and additional practice problems to assist you in learning the concepts.

Assignments

Handouts

Lectures

Readings

Guides

Trees

A tree is an abstract model of a hierarchical structure. It consists of nodes with a parent-child relationship. Its name comes from the fact that when drawn, it resembles an upside-down tree: the root of the tree is at the top and the leaves are at the bottom.

A tree is a recursive data structure; each node of the tree contains a label value and a list of branches, each of which are also trees. The label can be any data type, while the branches are represented as a list of trees. The lecture slides provide a good overview of tree terminology and a visual to help demonstrate.

Hierarchical Data

We learned in lecture about lists, containers which hold sequences of data. These sequences can also be arbitrarily nested to create nested lists.

>>> l = [1, 2, 3]
>>> [l, [l]]
[[1, 2, 3], [[1, 2, 3]]]

This results in a hierarchy that lets us model, organize, and represent complex systems.

Trees build upon this idea of hierarchy by constraining the structure of the tree. We are only allowed to build trees in a certain way, and they take on a certain visual representation, but underlying all of the nodes and their labels and branches is the same story of a nested, list hierarchy. What the tree data abstraction allows us to do is think of trees as larger units and not worry about manipulating individual lists.

In this class, we have two different views of trees which we often switch between depending on the situation.

  1. We can view trees as individual nodes. In Python, our tree data abstraction is compound type. This means that we reference it with a pointer or arrow, just like a list. But this means that, in our list of branches, each tree is pointed to with a pointer too! You can imagine this can get complicated quickly, so when we view trees as individual nodes, we consider what we can do with an individual node of a tree. Python sees trees this same way: not as a whole, but as an individual node in a larger structure that happens to be connected.
  2. We can also view trees as larger, connected structures. Just like functional abstraction allows us to reason about the behavior of a program in terms of its domain, range, and behavior, thinking abstractly about tree structures allow us to reason about the behavior of programs at a larger scale. We can say, for example, that calling replace_leaf on a branch of a tree will return a branch replacing all of its leaves' label values.

Tree Traversal

Hierarchy and structure helps us organize data, but at the cost of making that very same data harder to find. Now that we've organized data into a tree, we'd like to be able to use and manipulate that data.

Tree traversal refers to the process of visiting each node in a tree data abstraction, exactly once. Just like how we saw that we needed to come up with an organized way to explore the space of possible solutions in tree recursion, we also need an organized way to traverse trees to avoid repeating work or getting stuck in an infinite loop.

Each node of a tree has a predefined structure: a label value and a list of branches. This structure lets us explore tree structures in a very specific pattern where we visit each node of the tree one-by-one until we've completed the traversal and solved the problem.

  1. Visit the current node and check or otherwise process the node. Our base case often falls under this category: we process the node by checking if the node is a leaf, or some other stopping condition.
  2. Iterate across the branches of the tree and recurse down. We often write a for loop or list comprehension to iterate across the branches of the tree and recurse on the branch to explore down the tree.
  3. Combine the result obtained from the recursive call to determine the final result. To find the sum of all the label values in a tree, for example, it's not sufficient to stop at making a recursive call: we also need to determine what to do with the number returned from the sum of the values in the branch.

Tree problems tend toward recursive solutions. This also makes it convenient for understanding the structure of tree problems by first examining the expected behavior by coming up with small example cases. How should our program behave when we call it on a leaf, for example? How about if we scale up to a larger tree of 2 or 3 nodes? And so forth.

Practice Problems

Easy

Q1: Acorn Finder

The squirrels on campus need your help! There are a lot of trees on campus and the squirrels would like to know which ones contain acorns. Define the function acorn_finder, which takes in a tree and returns True if the tree contains a node with the value 'acorn' and False otherwise.

def acorn_finder(t):
    """Returns True if t contains a node with the value 'acorn' and
    False otherwise.

    >>> scrat = tree('acorn')
    >>> acorn_finder(scrat)
    True
    >>> sproul = tree('roots', [tree('branch1', [tree('leaf'), tree('acorn')]), tree('branch2')])
    >>> acorn_finder(sproul)
    True
    >>> numbers = tree(1, [tree(2), tree(3, [tree(4), tree(5)]), tree(6, [tree(7)])])
    >>> acorn_finder(numbers)
    False
    """
"*** YOUR CODE HERE ***"
if label(t) == 'acorn': return True for b in branches(t): if acorn_finder(b): return True return False # Alternative solution def acorn_finder(t): if label(t) == 'acorn': return True return True in [acorn_finder(b) for b in branches(t)]

Use Ok to test your code:

python3 ok -q acorn_finder

Q2: Tree Height

We have now defined height in the context of trees. Consider the following function that computes the height of any tree.

def height(t):
    if is_leaf(t):
        return 0
    return 1 + max([height(c) for c in children(t)])

Test your understanding of how this function works for any tree constructed in Python:

python3 ok -q height -u
>>> from lab05 import *
>>> def height(t):
...     if is_leaf(t):
...         return 0
...     return 1 + max([height(c) for c in children(t)])
>>> t = tree(1, [tree(2, [tree(3)])])
>>> height(t)
______
2
>>> t = tree(1, [tree(2), tree(3, [tree(5)]), tree(4)]) >>> height(t)
______
2
>>> t = tree(1, [tree(2), tree(3, [tree(5)])]) >>> height(t)
______
2
>>> t = tree(1) >>> height(t)
______
0
>>> from lab05 import *
>>> def height(t):
...     if is_leaf(t):
...         current_height = 0
...     else:
...         current_height = 1 + max([height(child) for child in children(t)])
...     print(current_height)
...     return current_height
>>> t = label(1, [label(2, [label(3)])])
>>> height(t)
______
0 1 2 2
>>> t = label(1, [label(2), label(3, [label(5)]), label(4)]) >>> height(t)
______
0 0 1 0 2 2
>>> t = label(1, [label(2), label(3, [label(5)])]) >>> height(t)
______
0 0 1 2 2

Q3: Tree Map

Define the function tree_map, which takes in a tree and a one-argument function as arguments and returns a new tree which is the result of mapping the function over the entries of the input tree.

def tree_map(fn, t):
    """Maps the function fn over the entries of tree and returns the
    result in a new tree.

    >>> numbers = tree(1,
    ...                [tree(2,
    ...                      [tree(3),
    ...                       tree(4)]),
    ...                 tree(5,
    ...                      [tree(6,
    ...                            [tree(7)]),
    ...                       tree(8)])])
    >>> print_tree(tree_map(lambda x: 2**x, numbers))
    2
      4
        8
        16
      32
        64
          128
        256
    """
"*** YOUR CODE HERE ***"
if children(t) == []: return tree(fn(label(t)), []) mapped_subtrees = [] for subtree in children(t): mapped_subtrees += [ tree_map(fn, subtree) ] return tree(fn(label(t)), mapped_subtrees) # Alternate solution def tree_map(fn, t): return tree(fn(label(t)), [tree_map(fn, t) for t in children(t)])

Q4: Same shape

Define a function same_shape that, given two trees, t1 and t2, returns True if the two trees have the same shape (but not necessarily the same data in each node) and False otherwise.

def same_shape(t1, t2):
    """Return True if t1 is indentical in shape to t2.

    >>> test_tree1 = tree(1, [tree(2), tree(3)])
    >>> test_tree2 = tree(4, [tree(5), tree(6)])
    >>> test_tree3 = tree(1,
    ...                   [tree(2,
    ...                         [tree(3)])])
    >>> test_tree4 = tree(4,
    ...                   [tree(5,
    ...                         [tree(6)])])
    >>> same_shape(test_tree1, test_tree2)
    True
    >>> same_shape(test_tree3, test_tree4)
    True
    >>> same_shape(test_tree2, test_tree4)
    False
    """
"*** YOUR CODE HERE ***"
s1, s2 = branches(t1), branches(t2) if len(s1) != len(s2): return False for index in range(len(s1)): if not same_shape(s1[index], s2[index]): return False return True # Alternate solution def same_shape(t1, t2): children_same = all(map(same_shape, branches(t1), branches(t2))) return len(branches(t1)) == len(branches(t2)) and children_same

Medium

Q5: Sprout leaves

Define a function sprout_leaves that, given a tree, t, and a list of values, vals, and produces a tree with every leaf having new branches that contain each of the items in vals. Do not add new branches to non-leaf nodes.

def sprout_leaves(t, vals):
    """Sprout new leaves containing the data in vals at each leaf in
    the original tree t and return the resulting tree.

    >>> t1 = tree(1, [tree(2), tree(3)])
    >>> print_tree(t1)
    1
      2
      3
    >>> new1 = sprout_leaves(t1, [4, 5])
    >>> print_tree(new1)
    1
      2
        4
        5
      3
        4
        5

    >>> t2 = tree(1, [tree(2, [tree(3)])])
    >>> print_tree(t2)
    1
      2
        3
    >>> new2 = sprout_leaves(t2, [6, 1, 2])
    >>> print_tree(new2)
    1
      2
        3
          6
          1
          2
    """
"*** YOUR CODE HERE ***"
if is_leaf(t): return tree(label(t), [tree(v) for v in vals]) return tree(label(t), [sprout_leaves(s, vals) for s in branches(t)])

Use Ok to test your code:

python3 ok -q sprout_leaves

Q6: Has Path

Write a function has_path that takes in a tree t and a string word. It returns True if there is a path that starts from the root where the entries along the path spell out the word, and False otherwise.

def has_path(t, word):
    """Return whether there is a path in a tree where the entries along the path
    spell out a particular word.

    >>> greetings = tree('h', [tree('i'),
    ...                        tree('e', [tree('l', [tree('l', [tree('o')])]),
    ...                                   tree('y')])])
    >>> print_tree(greetings)
    h
      i
      e
        l
          l
            o
        y
    >>> has_path(greetings, 'h')
    True
    >>> has_path(greetings, 'i')
    False
    >>> has_path(greetings, 'hi')
    True
    >>> has_path(greetings, 'hello')
    True
    >>> has_path(greetings, 'hey')
    True
    >>> has_path(greetings, 'bye')
    False

>>> c = tree('c', [tree('a', [tree('t')]), ... tree('o', [tree('l', [tree('o', [tree('r')])]), ... tree('y')])]) >>> has_path(c, 'c') True >>> has_path(c, 'at') False >>> has_path(c, 'caturday') False >>> has_path(c, 'color') True >>> has_path(c, 'cobol') False >>> has_path(c, 'col') True >>> has_path(c, 'coy') True >>> has_path(c, 'olo') False >>> has_path(c, 'oy') False
""" assert len(word) > 0, 'no path for empty words.'
"*** YOUR CODE HERE ***"
if label(t) != word[0]: return False elif len(word) == 1: return True for b in branches(t): if has_path(b, word[1:]): return True return False