# Study Guide: Interpreters

## Instructions

This is a study guide with links to past lectures, assignments, and handouts, as well as additional practice problems to assist you in learning the concepts.

**Assignments**

Important: For solutions to these assignments once they have been released, see the main website

**Handouts**

**Lectures**

**Readings**

# Guides

How does the computer make sense of the questions we ask in the interpreter?

```
scm> (+ 1 2)
3
```

When we type the characters, `(+ 1 2)`

, how does a computer understand the
meaning of each character: the open parenthesis `(`

, the plus symbol, `+`

, the
space character `, and so forth?`

## Interpreters

An **interpreter** is a program which can directly execute instructions in a
programming language like Python or Scheme. To do this, the interpreter needs
to understand how the rules of a language work, and to do that, the interpreter
needs to be designed (by a human, usually!) to follow those rules. In CS 61A,
we'd like to learn how the interpreter *program* works and understand the ideas
that go into an interpreter to reveal the magic behind how a computer makes
sense of `(+ 1 2)`

in Scheme and `1 + 2`

in Python.

### REPL

The interpreters we study in this course are designed around a
**Read-Eval-Print-Loop**.

The interpreter waits for our input, looping until we type in a string of
characters, `(+ 1 2)`

.

In the

**Read**stage, we take the string of characters and convert those characters into data structures that can be understood by the interpreter. In the Scheme interpreter, we'd like to take the string of characters`(+ 1 2)`

and convert it into a`Pair`

structure,`Pair('+', Pair(1, Pair(2, nil)))`

.- The
**Parser**takes the string`(+ 1 2)`

and splits into meaningful*tokens*wherever there is whitespace or a syntax character like the open parenthesis`(`

. The parser returns a structure that kind of resembles a list of individual characters like`['(', '+', 1, 2, ')']`

. - The
**Lexer**takes the list of smaller strings and converts them into a data structure that understands the nesting in the language. It's not too tricky to convert the example above from`['(', '+', '1', '2', ')']`

to`Pair('+', Pair(1, Pair(2, nil)))`

. But the pair structure is much more useful when we want to work on a more nested example like`(= (+ 1 2) 3)`

which parses to the flat list`['(', '=', '(', '+', 1, 2, ')', 3, ')']`

. As the expressions get more and more complicated, using a nested data structure helps the computer make sense of the expression in chunks, much like how we humans make sense of a Scheme expression as well.

- The
In the

**Eval**stage, we evaluate the nested data structure to a value.- If the expression is simple, like a number or a boolean, we can just
return the number or boolean itself without any further evaluation. Evaluating
1 in the global frame returns the number
`1`

. If we pass in a name like 'x', we'll try to evaluate the name 'x' in the current environment, looking up through the parent frames as necessary. - If we pass a call expression like
`Pair('+', Pair(1, Pair(2, nil)))`

into the evaluator, we'll follow the call expression evaluation rules to determine the value of the value of the expression. We'll evaluate the operator,`'+'`

, to the built-in procedure which adds values, and each operand, 1 and 2, before applying the procedure to the arguments. - if the expression is a special form like
`Pair('define', Pair('x', Pair(1, nil)))`

(after parsing`(define x 1)`

), then we'll follow the special form evaluation rules defined in the Scheme interpreter. In this example, we don't need to evaluate the name 'x' (it's just a name!) but we do need to evaluate the value, the number`1`

.

- If the expression is simple, like a number or a boolean, we can just
return the number or boolean itself without any further evaluation. Evaluating
1 in the global frame returns the number
- In the
**Print**stage, we take the value we determined in the evaluation stage and figure out how to display the value to the user. Even if an expression evaluates to a number, the computer actually stores that number in its memory as a bunch of 0 and 1 binary digits! We take that number and convert it back to a more readable representation like '3'! - Once we've displayed the value, we can wait in a
**Loop**until the user asks another question to the interpreter.

There are some special rules in the parser to handle scenarios like the quote
operator `'`

. Run `python3 scheme_reader.py`

to explore its behavior.

```
read> '1
str : (quote 1)
repr: Pair('quote', Pair(1, nil))
read> '(1 2 3)
str : (quote (1 2 3))
repr: Pair('quote', Pair(Pair(1, Pair(2, Pair(3, nil))), nil))
```

Likewise, the evaluator also needs to handle the behavior of special forms differently from standard call expressions. The Scheme Specification contains more information, and the special form evaluation rules will also be briefly described by example here.

### Counting Calls

One way of understanding what occurs in the Scheme interpreter is by counting
the number of calls to `scheme_eval`

that occur when evaluating an expression.

Atomic data types like numbers, booleans, strings, symbols, etc. only take a single step to evaluate. Primitive expressions like names evaluate to a value in one step as well.

```
Expression: 1
Eval: - (1 scheme_eval)
Expression: #t
Eval: -- (1 scheme_eval)
Expression: 'cs61a
Eval: ------ (1 scheme_eval)
```

#### Call Expressions

Call expressions follow the call expression evaluation procedure.

```
Expression: (+ 1 2)
Eval: -------
- - - (4 scheme_eval)
Apply: ~~~~~~~ (1 scheme_apply)
Expression: (= (quotient 5 1) (+ 1 2))
Eval: --------------------------
- --------------
-------- - -
Apply: ~~~~~~~~~~~~~~
Eval: -------
- - - (10 scheme_eval)
Apply: ~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~ (3 scheme_apply)
```

#### Special Forms

Special forms follow different evaluation rules. A more complete description of their behaviors is described in the Scheme Project and the Scheme Specification.

Unlike call expressions where we need to evaluate the operator, special forms
do not need to evaluate the first term. In `scheme_eval`

, the logic looks like
this.

```
SPECIAL_FORMS = {
'and': do_and_form,
'begin': do_begin_form,
'cond': do_cond_form,
'define': do_define_form,
'if': do_if_form,
'lambda': do_lambda_form,
'let': do_let_form,
'or': do_or_form,
'quote': do_quote_form,
}
first, rest = expr.first, expr.second
if scheme_symbolp(first) and first in SPECIAL_FORMS:
return SPECIAL_FORMS[first](rest, env)
```

Instead of continuing with the call expression evaluation procedure, we handle the special form with its own set of rules.

```
Expression: (if (= 1 1) #t #f)
Eval: ------------------
------- (note that if is not evaluated)
- - -
Apply: ~~~~~~~ (1 scheme_apply)
Eval: -- (6 scheme_eval)
(note that #f is not evaluated)
```

We can also trace through how our Scheme interpreter handles another special
form, `cond`

.

```
Expression: (cond ((= 1 2) #f) (else #t))
Eval: -----------------------------
-------
- - -
Apply: ~~~~~~~ (1 scheme_apply)
Eval: -- (6 scheme_eval)
```

# Practice Problems

### Q1: Eval/Apply

This question is from the CS 61A Summer 2017 Final Exam.

With your Project 4 implementation, how many total calls to `scheme_eval`

and
`scheme_apply`

would be made when evaluating the following two expressions?
Assume that you are not using the tail call optimized `scheme_eval_optimized`

function.

Suppose we have already evaluated the following definition in the current environment.

`(define lst (list 1 2 3))`

How many calls are made to `scheme_eval`

and `scheme_apply`

?

`(+ (car lst) (- 5 (car (cdr lst))))`

`13 eval calls, 5 apply calls`

How many calls are made to `scheme_eval`

and `scheme_apply`

?

```
((if (or (null? lst) (null? (cdr lst)))
(lambda (s) 0)
(lambda (s) (car (cdr s))))
lst)
```

`18 eval calls, 6 apply calls`