Calculator

From CS 61A Wiki
Jump to: navigation, search

Calculator is a Scheme-like language that supports basic arithmetic operations.

Syntax

Calculator has primitive expressions and call expressions:

  • a primitive expression is number: 2 or 3.14
  • a call expression is a Scheme list that begins with an operator, followed by 0 or more expressions: (+), (* 1 2 3), (/ 3 (+ 4 5))

The legal operators are:

  • +: sum of the arguments
  • *: product of the arguments
  • -: if one argument, negate it; else subtract the rest from the first
  • /: if one argument, invert it; else divide the rest from the first

Here are some examples of Calculator in action:

calc> (+ 2 2)
4
calc> (- 5)
-5
calc> (* (+ 1 2) (+ 2 3))
15

Building an interpreter

Given a Calculator expression, an interpreter evaluates its meaning. An interpreter consists of a parser and an evaluator.

Parser

A parser analyzes an input string and describes the structure. It performs lexical analysis and syntactic analysis:

  • lexical analysis (tokenization) divides the input string into meaningful tokens, such as punctuation marks and numbers
  • syntactic analysis interprets a token sequence as an expression tree

Tokenization

In Calculator, the tokens are parentheses, numbers, and operators. The following tokenize function surrounds parentheses with spaces and invokes split (which splits a string into words using spaces as delimiters):[1]

def tokenize(s):
    """Splits string S into a list of tokens.
    >>> tokenize('(* (+ 12 3) 5)')
    ['(', '*', '(', '+', '12', '3', ')', '5', ')']
    """
    return s.replace('(', ' ( ').replace(')', ' ) ').split()

Syntactic analysis

In our Calculator interpreter, we will represent expression trees as deep linked lists. We organize this stage into two mutually recursive functions: one for expressions (read) and one for operand lists (read_tail). Here they are:[1]

def read(tokens):
    """Read an expression from a sequence of tokens.
    >>> exp = read(['(', '+', '2', '3', '6', ')'])
    >>> print(exp)
    < '+' 2 3 6 >
    >>> exp = read(['(','*','4','(','-','12','8',')',')','3','2'])
    >>> print(exp)
    < '*' 4 < '-' 12 8 > >
    """
    if tokens == []:
        raise SyntaxError('unexpected end of input')
    token = tokens.pop(0)
    if token == '(':
        return read_tail(tokens)
    elif token == ')':
        raise SyntaxError('unexpected )')
    else:
        return atom(token)
 
def read_tail(tokens):
    """Read the rest of a list."""
    if tokens == []:
        raise SyntaxError('unexpected end of input')
    elif tokens[0] == ')':
        tokens.pop(0) # pop off ')'
        return Link.empty
    else:
        return Link(read(tokens), read_tail(tokens)) # read the next expression, whether it be a number or a combination, and link it with the rest of the list
 
def atom(token):
    """Try to coerce TOKEN into an int or a float; otherwise return TOKEN."""
    try:
        return int(token)
    except ValueError:
        try:
            return float(token)
        except ValueError:
            return token

read checks for various base cases, including empty input (error) and primitive expressions (which are converted from strings to their correct type through atom). A recursive call to read_tail is invoked whenever a ( token indicates the beginning of a list.[2]

read_tail continues to read from the same tokens, but expects to be called after a list has begun. Its base cases are an empty input (error) or a ) token that terminates the list. Its recursive call reads the first element of the list with read, reads the rest of the list with read_tail, and then returns a list represented as a linked list.[2]

Now we can write a function that represents the parsing stage:

def parse(s):
    return read(tokenize(s))

Evaluator

An evaluator discovers the form of an expression and executes the corresponding evaluation rule.

Primitive expressions evaluate to themselves. Call expressions are evaluated recursively:

  • evaluate each operand expression
  • collect their values as a list of arguments
  • apply the operator to the argument list

calc_eval computes the value of an expression exp, which is either a number or a call expression:

def calc_eval(exp):
    """Evaluate EXP."""
    if type(exp) in (int, float): # number
        return exp
    elif isinstance(exp, Link): # call expression
        args = exp.rest.map(calc_eval).to_list() # evaluate each argument and collect them in a list
        return calc_apply(exp.first, args) # apply the operator
    else:
        raise SyntaxError('unknown expression')

Assume that the linked list class supports map and to_list.

calc_apply applies an operator to a list of arguments:

def calc_apply(operator, args):
    if operator == '+':
        return sum(args)
    elif operator == '-':
        if len(args) == 1:
            return -args[0]
        else:
            return args[0] + sum([-arg for arg in args[1:]]) # args[0] - args[1] - args[2] - ...
    elif operator == '*':
        return reduce(lambda x, y: x * y, args, 1)
    elif operator == '/':
        if len(args) == 1:
            return 1 / args[0]
        else:
            return reduce(lambda x, y: x * y, [1 / arg for arg in args[1:]], args[0]) # args[0] / args[1] / args[2] / ...

Depending on what operator is, we decide how to treat the arguments.

Read-eval-print loop

An interactive session is supported by a read-eval-print loop (REPL): read an expression, evaluate it, print the result, and repeat.

For Calculator, the REPL looks like this:[3]

def repl():
    while True:
        try:
            print(calc_eval(parse(input('calc> '))))
        except (SyntaxError, ZeroDivisionError) as e:
            print(type(e).__name__ + ':', e)

Final product

The entire source code can be viewed at Calculator/Source code.

Sources

  1. 1.0 1.1 http://norvig.com/lispy.html
  2. 2.0 2.1 http://composingprograms.appspot.com/pages/34-interpreters-for-languages-with-combination.html#parsing-expressions
  3. http://inst.eecs.berkeley.edu/~cs61a/su14/hw/hw6/hw6.html