Guides

Higher-order functions

Environment diagrams

Environment diagram rules

Source: Spring 2014 Piazza (131)

Environment Diagrams are very important in our understanding of how the computer interprets our code.

We will test you on this in every exam.

It will never go away.

Given that, master it as quickly as you can! :)

Below are the rules I follow when drawing environment diagrams. If you understand and faithfully follow these rules when drawing them, you'll never get them wrong.

One thing you haven't learned yet is nonlocal. You can skip that particular step for now (step 2 of Assignment).

Post here if you have any questions!

You can also take a look at this link for some examples of environment diagrams: http://albertwu.org/cs61a/notes/environments

For a different perspective on the rules, check out: http://markmiyashita.com/cs61a/sp14/environment_diagrams/rules_of_environment_diagrams/

A handout with detailed instructions on drawing environment diagrams is also available here (linked on the bottom of the course homepage): http://inst.eecs.berkeley.edu/~cs61a/sp14/pdfs/environment-diagrams.pdf

Environment Diagram Rules
=========================

Creating a Function
--------------------
1. Draw the func <name>(<arg1>, <arg2>, ...)
2. The parent of the function is wherever the function was defined
   (the frame we're currently in, since we're creating the function).
3. If we used def, make a binding of the name to the value in the current frame.

Calling User Defined Functions
------------------------------
1. Evaluate the operator and operands.
2. Create a new frame; the parent is whatever the operator's parent is.
   Now this is the current frame.
3. Bind the formal parameters to the argument values (the evaluated operands).
4. Evaluate the body of the operator in the context of this new frame.
5. After evaluating the body, go back to the frame that called the function.

Assignment
----------
1. Evaluate the expression to the right of the assignment operator (=).
2. If nonlocal, find the frame that has the variable you're looking for,
   starting in the parent frame and ending just before the global frame (via
   Lookup rules). Otherwise, use the current frame. Note: If there are multiple
   frames that have the same variable, pick the frame closest to the current
   frame.
3. Bind the variable name to the value of the expression in the identified
   frame. Be sure to overwrite the variable's previous binding if it had one.

Lookup
------
1. Start at the current frame. Is the variable in this frame?
   If yes, that's the answer.
2. If it isn't, go to the parent frame and repeat 1.
3. If you run out of frames (reach the Global frame and it's not there), complain.

Tips
----
1. You can only bind names to values.
   No expressions (like 3+4) allowed on environment diagrams!
2. Frames and Functions both have parents.
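
As a quick illustration of the rules in action, here is a tiny worked example (the annotations follow the rule names above):

def double(x):
    return 2 * x

y = double(5)

# Creating a function: draw func double(x); its parent is Global,
#   and since we used def, bind the name double in the Global frame.
# Assignment: first evaluate the expression double(5).
#   Calling: evaluate the operator (double) and operand (5); create a
#   new frame f1 whose parent is Global (double's parent); bind x to 5;
#   evaluate the body 2 * x in f1 (Lookup finds x in f1), giving 10.
# Back in the Global frame, bind y to 10.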

Sequences

Reversing tuples

Source: Spring 2014 Piazza (639)

Student Question

Why does the tuple slice [::-1] work while [0:3:-1] doesn't?

I thought the -1 after the second colon meant that the interpreter is going to read the indexes "backwards".

Student Answer

The syntax of slicing is tup[start:end:step]:

  • start from index start and end just before index end, incrementing the index by step each time
  • if no step is provided, step = 1
  • if step is positive, default values if not provided: start = 0, end = len(tup)
  • if step is negative, default values if not provided: start = -1, end = one position before the start of the sequence
>>> (1, 2, 3)[::-1] # start at index -1, end one position before the start of the sequence
(3, 2, 1)
>>> (1, 2, 3)[0:3:-1] # start at 0 and go to 3, but step is negative, so this doesn't make sense and an empty tuple is returned
()
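
One way to see exactly which defaults Python fills in is the indices method of slice objects:

>>> slice(None, None, -1).indices(3)  # (start, end, step) for a length-3 tuple
(2, -1, -1)
>>> list(range(2, -1, -1))            # the index order actually visited
[2, 1, 0]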

This is a helpful visualization from http://en.wikibooks.org/wiki/Python_Programming/Strings#Indexing_and_Slicing:

To understand slices, it's easiest not to count the elements themselves. It is a bit like counting not on your fingers, but in the spaces between them. The list is indexed like this:

Element:     1     2     3     4
Index:    0     1     2     3     4
         -4    -3    -2    -1

More info about slicing at http://stackoverflow.com/a/13005464/2460890.

Slicing with negative step

Source: Spring 2014 Piazza (702)

Student Question

If the third example returns an empty tuple because you can't take negative steps from 0 to 4, shouldn't the second example also return an empty tuple?

Can someone explain why each example returns the respective answers?

Thanks

>>> x= (1,2,3,4)
>>> x[0::-1]
(1,)
>>> x[::-1]
(4, 3, 2, 1)
>>> x[0:4:-1]
()
>>> x[1::-1]
(2, 1)

Instructor Answer

(For reference, the notation is x[start:end:step])

Python does something very strange when the step is negative: if you omit the arguments to start and end, Python will fill them in with what makes sense for a negative step. In the simple case of x[::-1], Python fills in the start with len(x)-1 and the end with -(len(x)+1). The end term is strange, but remember that the end term isn't included. We therefore can't use 0, but we can't use -1 either, since that clearly refers to the last element of the tuple. We need to fully wrap the negative index around, to refer to the element "before" the 0th index. This way, Python will start at the end of the tuple and proceed to the beginning of the tuple.

That's why x[0:4:-1] doesn't make sense: how can we start at 0 and end at 4, if we're proceeding backwards?

And that's why x[0::-1] makes sense (albeit, in a strange way): Python is proceeding from the 0 index to the beginning of the list. It includes the start index, which is why you see a 1 pop up.
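
To make those filled-in defaults concrete:

>>> x = (1, 2, 3, 4)
>>> x[len(x)-1 : -(len(x)+1) : -1]  # exactly what x[::-1] fills in
(4, 3, 2, 1)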

Let me know if that was confusing!

Recursion

Data abstraction

Time complexity

Andrew Huang's guide to order of growth and function runtime

Source: Guide to Order of Growth and Function Runtime (Retrieved June 16th, 2014)

Introduction

Confused by $O$, $\Omega$, and $\Theta$?

Want to figure out the runtime of that tricky function?

Read this.

NOTE THAT THIS GUIDE STARTS WITH BIG O, WHICH IS DIFFERENT FROM THETA. IF YOU UNDERSTAND BIG O, THETA IS EASY (IN FACT, IT DEFINES THETA IN TERMS OF BIG O BELOW).

First some math.

Formal definition of O(Big O):

Let $f(n)$ and $g(n)$ be functions from positive integers to positive reals. We say $f \in O(g)$ (“f grows no faster than g”) if there is a constant $0 < c < \infty$ such that $f(n) \leq c \cdot g(n)$ for all $n$.

(Paraphrased from Dasgupta, Papadimitriou, & Vazirani)

(You'll see this again in CS 170)

What the heck does that mean?

Let’s look at math functions for a second (just a second).

Say $f(n)=5n$ and $g(n)=n^{2}$

What does that look like on a graph?

http://www.wolframalpha.com/input/?i=plot+5n+and+n%5E2+from+0+to+10

There’s a section where $5n$ dominates $n^{2}$, from 0 to 5, but we don’t really care, because after that point, $n^{2}$ is larger, all the way to infinity! By the definition, we could scale $n^{2}$ by 5 and we would span that initial gap.

Thus we can say $5n \in O(n^{2})$ or $f \in O(g)$.

Can we say the converse? That is, is $n^{2} \in O(5n)$?

Not at all! From the graph we see that $n^{2}$ grows too quickly for $5n$ to catch up, no matter what constant we scale $5n$ by.

So what if $f(n)=n+1000$ and $g(n)=n^{2}$?

It turns out $n+1000 \in O(n^{2})$ still, because according to the definition, as long as we can multiply $n^{2}$ by some $c$ such that the gap of 1000 is spanned, we’re good. In this case, $c=1001$ works.

What about $\Omega$ and $\Theta$?

If you digested all of the above, the rest isn’t scary! (Note, $a \equiv b$ means $a$ is equivalent to $b$)

$f \in \Omega(g) \equiv g \in O(f)$ (You'll see this again briefly in CS 170)

$f \in \Theta(g) \equiv (f \in O(g) \text{ and } g \in O(f))$

This means that if $f$ is $\Theta$ of $g$, then there exist some $c_{1}$ and $c_{2}$ such that $c_{1} \cdot g(n) \geq f(n)$ and $c_{2} \cdot g(n) \leq f(n)$ for all positive integers $n$.

What does that mean for Python functions?

Given a function $f$, we want to find out how fast that function runs. One way of doing this is to take out a stopwatch, and clock the amount of time it takes for $f$ to run on some input. However, there are tons of problems with that (different computers => different speeds; only one fixed input? Maybe $f$ is really fast for that input but slow for everything else; next year, all the measurements need to be redone on new computers; etc.) Instead, we'll count the steps that a function needs to perform as a function of its input. For example, here are some of the functions that take one step regardless of their input:

mul

add

sub

print

return

...

So for example, (3 + 3 * 8) % 3 would be 3 steps--one for the multiply, one for the add, and one for the mod.

Let's take a simple example:

def square(x):
  return x * x

square is a function that, for any input, always takes two steps: one for the multiplication, and one for the return. Using the notation, we can say $\mathtt{square} \in \Theta(1)$.

Functions with iteration (for loops, recursion, etc.), usually multiply the steps by some factor. For example, consider factorial:

def factorial(n):
  if n == 0:
    return 1
  else:
    return n * factorial(n-1)

$\mathtt{factorial} \in \Theta(n)$. Why? Well, given some input n, we do n recursive calls. At each recursive call, we carry out 4 steps: one for if n == 0, one for the subtraction, one for the multiplication, and one for the return. Plus, we have the base case, which is another 2 steps: one for if and one for return. So factorial(n) takes $4n+2$ steps, which means $\mathtt{factorial} \in \Theta(n)$.

As mentioned, we care about how the running time (how long the function takes to run) of the function changes, as we increase the size of the argument. So if we imagine a graph, then the x-axis represents the size of our input, and the y-axis represents how long the function took to run for each x. As the size of the input increases, the function’s runtime does something on the graph. So when we say something like “$O(n^{2})$ where $n$ is the length of the list”, we are saying as we double the size of the list, the function is expected to run at most four times as long. NOTE ALSO THAT I SAID WHAT $n$ IS! ALWAYS GIVE YOUR UNITS.

This means that when we compare two functions A and B, A may be overall slower than B as we increase the size of their arguments. However, it’s possible that for some specific arguments, A may run faster (like the $f(n)=5n$ and $g(n)=n^{2}$ example above).

This also means we do not care about the time taken on any particular input! In particular, all those constant-time base cases in those functions don’t really matter, because they don’t scale: only specific inputs reach the base case immediately, and their constant cost stops mattering as the size of the argument grows.

Brief “What runs faster than what”

Sorted from fastest to slowest. This is by no means comprehensive.

  • $\Theta(1)$
  • $\Theta(\log(n))$
  • $\Theta(n)$
  • $\Theta(n \log(n))$
  • $\Theta(n^{2})$
  • $\Theta(n^{3})$
  • $\Theta(2^{n})$
  • (Anything past this point is kind of ridiculous)
  • $\Theta(n!)$
  • $\Theta(n^{n})$

So we know about the math and the motivation, now how do we actually assign runtimes to real Python functions?

What you must understand is that there is no one method for finding the runtime. You MUST look at a function holistically or you won’t get the right answer. What does this mean? In order to get the correct runtime, you first must understand what the function is doing! You cannot pattern-match your way to becoming good at this.

This cannot be stressed enough: UNITS MATTER. If you say $O(n)$, you must tell us what $n$ is.

General tips

  1. UNDERSTAND WHAT THE FUNCTION IS DOING!!!
  2. Try some sample input. That is, pretend you’re the interpreter and execute the code with some small inputs. What is the function doing with the input? Having concrete examples lets you do tip 1 better. You can also graph how the runtime increases as the argument size increases.
  3. If applicable, draw a picture of the tree of function calls. This shows you the "growth" of the function or how the function is getting "bigger", which will help you do tip 1 better.
  4. If applicable, draw a picture of how the input is being modified through the function calls. For example, if your input is a list and your function recursively does something to that list, draw out a list, then draw out parts of the list underneath it that are called during the recursion. Helps with tip 1.
  5. See tip 1.

Anyways, let's examine some common runtimes (keep scrolling). Remember, this is in no way a comprehensive list, NOR IS IT TRYING TO TEACH YOU HOW TO FIND THEM. This post is just to give you a starting point into orders of growth by showing you some examples and basic details about each runtime.

Constant $\Theta(1)$

What it looks like:

http://www.wolframalpha.com/input/?i=plot+5

Example:

def add(x, y):
   return x + y

$\mathtt{add} \in \Theta(1)$, where 1 is... well, a constant...

Approach:

The key behind constant time functions is that regardless of the size of the input, they always run the same number of instructions.

Don’t fall for this trap:

def bar(n):
   if n % 7 == 0:
       return "Bzzst"
   else:
       return bar(n -1)

$\mathtt{bar} \in \Theta(1)$. Why?

Logarithmic $\Theta(\log(n))$

What it looks like:

http://www.wolframalpha.com/input/?i=plot+4log3n+from+0+to+10

Example:

def binary_search(sorted_L, n):
   """ sorted_L is a list of numbers sorted from
        smallest to largest
   """
   if sorted_L == []:
       return False
   mid = len(sorted_L) // 2
   if n == sorted_L[mid]:
       return True
   elif n < sorted_L[mid]:
       return binary_search(sorted_L[:mid], n)    # search the left half
   else:
       return binary_search(sorted_L[mid+1:], n)  # search the right half

$\mathtt{binary\_search} \in \Theta(\log(n))$, where $n$ is the number of elements in sorted_L.

Approach:

Logarithmic functions scale down the size of the problem by some constant every iteration (either with a recursive loop, a for loop, or a while loop). Also, logarithmic functions do not branch out--they generally do not make more than one call to themselves per recursion.
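
For instance, here is the same idea with a while loop (a minimal sketch):

def num_halvings(n):
   count = 0
   while n > 1:
       n = n // 2   # the problem size shrinks by a constant factor each pass
       count += 1
   return count

$\mathtt{num\_halvings} \in \Theta(\log(n))$, where $n$ is n.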

Linear $\Theta(n)$

What it looks like:

http://www.wolframalpha.com/input/?i=plot+8n+from+0+to+10

Examples:

def sum_list(L):
   sum = 0
   for e in L:
       sum += e
   return sum

$\mathtt{sum\_list} \in \Theta(n)$, where $n$ is the number of elements in $L$.

def countdown(n):
   if n > 0:
       print(n)
       countdown(n - 1)
   else:
       print("Blast off!")

$\mathtt{countdown} \in \Theta(n)$, where $n$ is n.

Approach:

Linear functions usually act on sequences or other collections of data. In that case, the function will go through the elements once or twice or $k$ times, where $k \ll n$. If the function acts on a number, the number usually gets smaller by a constant each iteration.

Don't fall for this trap:

def two_for_loops(n):
   for a in range(n):
       if n == 4:
           for y in range(n):
               print("Admiral Ackbar")
       else:
           print("It's a trap!")

$\mathtt{two\_for\_loops} \in \Theta(n)$, where $n$ is n. Why?

Loglinear/Linearithmic $\Theta(n \log(n))$

What it looks like:

http://www.wolframalpha.com/input/?i=plot+nlog%28n%29+from+0+to+10

Example:

def merge(s1, s2):
   if len(s1) == 0:
       return s2
   elif len(s2) == 0:
       return s1
   elif s1[0] < s2[0]:
       return [s1[0]] + merge(s1[1:], s2)
   else:
       return [s2[0]] + merge(s1, s2[1:])

def mergesort(lst):
   if len(lst) <= 1:
       return lst
   else:
       middle = len(lst) // 2
       return merge(mergesort(lst[:middle]), \
                    mergesort(lst[middle:]))

$\mathtt{mergesort} \in \Theta(n \log(n))$, where $n$ is the number of elements in lst.

Approach: These functions tend to make two recursive calls, each making the problem smaller by a half. There's a neat way to see this. For example in mergesort, start with an entire line, which represents mergesort called on the initial list. From there, the list gets split in half by the two recursive calls to mergesort in the code, so draw another line right below the first, of the same length, but with a small gap in the middle to represent the split. Repeat until you're tired. At the end, you get a rectangle that's $n$ wide and $\log(n)$ tall!

---------------
------- -------
--- --- --- ---
- - - - - - - -

The total area is the runtime, $\Theta(n \log(n))$
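
In recurrence form, the picture says $T(n) = 2T(n/2) + \Theta(n)$: each of the $\log_2(n)$ levels of the rectangle does $\Theta(n)$ total merging work, so the total is $\Theta(n \log(n))$.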

Don’t fall for this trap:

Don’t confuse functions that have an average running time of $n \log(n)$ (like quicksort, which is $\Theta(n^{2})$ in the worst case) with functions that are in $\Theta(n \log(n))$ on every input.

Polynomial $\Theta(n^{2})$, $\Theta(n^{3})$, etc.

What it looks like:

http://www.wolframalpha.com/input/?i=plot+n%5E2%2B3+from+0+to+10

Example:

def print_a_grid(n):
   for _ in range(n):
       for _ in range(n):
           print("+", end="")
       print("")

$\mathtt{print\_a\_grid} \in \Theta(n^{2})$, where $n$ is n.

Approach:

Polynomial functions will examine each element of an input many, many times, as opposed to linear functions, which examine each element some constant number of times.

Don’t fall into this trap:

Don’t get polynomial confused with exponential (below).

Exponential $\Theta(2^{n})$

What it looks like:

http://www.wolframalpha.com/input/?i=plot+2%5En+from+0+to+10

Example (first in Scheme, then the same function in Python):

(define (strange-add x)
 (if (zero? x)
     1
     (+ (strange-add (- x 1))
        (strange-add (- x 1)) )))

def strange_add(x):
  if x == 0:
    return 1
  else:
    return strange_add(x - 1) + strange_add(x - 1)

$\mathtt{strange\_add} \in \Theta(2^{n})$, where $n$ is x.

Approach:

Exponential functions tend to branch out as you get deeper and deeper into their call tree, and each call only makes the work smaller by a little bit. For example, (strange-add 8) calls (strange-add 7) and (strange-add 7). Those two calls each make two calls, (strange-add 6), (strange-add 6), (strange-add 6), and (strange-add 6) respectively, and so on.
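
To see the blow-up concretely, here is a small sketch that counts how many calls strange_add makes:

def count_calls(x):
   """Return (strange_add(x), the number of calls made to compute it)."""
   if x == 0:
       return 1, 1
   result, calls = count_calls(x - 1)
   # strange_add(x) would make two recursive calls, each costing `calls`,
   # plus the current call itself
   return result + result, 2 * calls + 1

>>> count_calls(8)[1]  # 2**9 - 1 calls just for x = 8
511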

Mutability

Mutable data structures

Object-oriented programming

Inheritance and class vs instance attributes

Source: Spring 2014 Piazza (1413)

Student Question

I'm confused on how Classes and Inheritance work.

If there's a Parent class and a Child class, when coding in the Child class, when do you write Parent.attribute, when do you write Child.attribute, and when do you write self.attribute?

Also, I'm also confused as to when to put self into the parentheses as well.

Instructor Answer

Parent.attribute and Child.attribute would both be ways of accessing a class variable. These are variables that can be accessed without creating new instances of that class.

self.attribute would be used in methods to access an instance variable (an attribute specific to an instance).

So for example, Insect.watersafe is False, but Bee.watersafe is True. These are class attributes because you don't have to create an Insect object or a Bee object in order to say Insect.watersafe or Bee.watersafe.

However, it wouldn't make any sense to say Bee.armor, since armor is an instance variable. You have to first create a new Bee before you can ask it for its armor. If you created a second Bee after that, the second Bee would have its own armor.

There's a lot of vocab (in bold) that might trip you up. Try reading Discussion 6 and posting a followup if you're still unsure!
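
For reference, here is a minimal sketch of classes like the ones mentioned above (the names come from the Ants project, but these simplified definitions are illustrative assumptions, not the project's actual code):

class Insect:
    watersafe = False         # class attribute: shared by every Insect

    def __init__(self, armor):
        self.armor = armor    # instance attribute: one per object

class Bee(Insect):
    watersafe = True          # overrides the inherited class attribute

>>> Bee.watersafe             # no instance needed for a class attribute
True
>>> buzz = Bee(2)             # an instance attribute needs an instance
>>> buzz.armor
2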

Iterables, iterators and generators

Scheme

Tail recursion


Tail recursion in Python

Source: http://kylem.net/programming/tailcall.html (Retrieved June 16th, 2014)

In this page, we’re going to look at tail call recursion and see how to force Python to let us eliminate tail calls by using a trampoline. We will go through two iterations of the design: first to get it to work, and second to try to make the syntax seem reasonable. I would not consider this a useful technique in itself, but I do think it’s a good example which shows off some of the power of decorators.

The first thing we should be clear about is the definition of a tail call. The “call” part means that we are considering function calls, and the “tail” part means that, of those, we are considering calls which are the last thing a function does before it returns. In the following example, the recursive call to f is a tail call (the use of the variable ret is immaterial because it just connects the result of the call to f to the return statement), and the call to g is not a tail call because the operation of adding one is done after g returns (so it’s not in “tail position”).

def f(n) :
    if n > 0 :
        n -= 1
        ret = f(n)
        return ret
    else :
        ret = g(n)
        return ret + 1

1. Why tail calls matter

Recursive tail calls can be replaced by jumps. This is called “tail call elimination,” and is a transformation that can help limit the maximum stack depth used by a recursive function, with the benefit of reducing memory traffic by not having to allocate stack frames. Sometimes, recursive functions which wouldn’t ordinarily be able to run due to stack overflow are transformed into functions which can.

Because of the benefits, some compilers (like gcc) perform tail call elimination[1], replacing recursive tail calls with jumps (and, depending on the language and circumstances, tail calls to other functions can sometimes be replaced with stack massaging and a jump). In the following example, we will eliminate the tail calls in a piece of code which does a binary search. It has two recursive tail calls.

def binary_search(x, lst, low=None, high=None) :
    if low == None : low = 0
    if high == None : high = len(lst)-1
    mid = low + (high - low) // 2
    if low > high :
        return None
    elif lst[mid] == x :
        return mid
    elif lst[mid] > x :
        return binary_search(x, lst, low, mid-1)
    else :
        return binary_search(x, lst, mid+1, high)

Supposing Python had a goto statement, we could replace the tail calls with a jump to the beginning of the function, modifying the arguments at the call sites appropriately:

def binary_search(x, lst, low=None, high=None) :
  start:
    if low == None : low = 0
    if high == None : high = len(lst)-1
    mid = low + (high - low) // 2
    if low > high :
        return None
    elif lst[mid] == x :
        return mid
    elif lst[mid] > x :
        (x, lst, low, high) = (x, lst, low, mid-1)
        goto start
    else :
        (x, lst, low, high) = (x, lst, mid+1, high)
        goto start

which, one can observe, can be written in actual Python as

def binary_search(x, lst, low=None, high=None) :
    if low == None : low = 0
    if high == None : high = len(lst)-1
    while True :
        mid = low + (high - low) // 2
        if low > high :
            return None
        elif lst[mid] == x :
            return mid
        elif lst[mid] > x :
            high = mid - 1
        else :
            low = mid + 1

I haven’t tested the speed difference between this iterative version and the original recursive version, but I would expect it to be quite a bit faster because there is much, much less memory traffic.

Unfortunately, the transformation makes it harder to prove that the resulting binary search is correct. With the original recursive algorithm, the proof is almost trivial by induction.

Programming languages like Scheme depend on tail calls being eliminated for control flow, and it’s also necessary for continuation passing style.[2]

2. A first attempt

Our running example is going to be the factorial function (a classic), written with an accumulator argument so that its recursive call is a tail call:

def fact(n, r=1) :
    if n <= 1 :
        return r
    else :
        return fact(n-1, n*r)

If n is too large, then this recursive function will overflow the stack, despite the fact that Python can deal with really big integers. On my machine, it can compute fact(999), but fact(1000) results in a sad RuntimeError: Maximum recursion depth exceeded.

One solution is to modify fact to return objects which represent tail calls and then to build a trampoline underneath fact which executes these tail calls after fact returns. This way, the stack will only ever be two frames deep: one for the trampoline and one for the current call to fact.

First, we define a tail call object which reifies the concept of a tail call:

class TailCall(object) :
    def __init__(self, call, *args, **kwargs) :
        self.call = call
        self.args = args
        self.kwargs = kwargs
    def handle(self) :
        return self.call(*self.args, **self.kwargs)

This is basically just the thunk lambda : call(*args, **kwargs), but we don’t use a thunk because we would like to be able to differentiate between a tail call and returning a function as a value.

The next ingredient is a function which wraps a trampoline around an arbitrary function:

def t(f) :
    def _f(*args, **kwargs) :
        ret = f(*args, **kwargs)
        while type(ret) is TailCall :
            ret = ret.handle()
        return ret
    return _f

Then, we modify fact to be

def fact(n, r=1) :
    if n <= 1 :
        return r
    else :
        return TailCall(fact, n-1, n*r)

Now, instead of calling fact(n), we must instead invoke t(fact)(n) (otherwise we’d just get a TailCall object).
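
A quick check (5000 is well past the default recursion limit, and the result is easy to verify because 5000! ends in zeros):

>>> t(fact)(5000) % 10
0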

This isn’t that bad: we can get tail calls of arbitrary depth, and it’s Pythonic in the sense that the user must explicitly label the tail calls, limiting the amount of unexpected magic. But, can we eliminate the need to wrap t around the initial call? I myself find it unclean to have to write that t because it makes calling fact different from calling a normal function (which is how it was before the transformation).

3. A second attempt

The basic idea is that we will redefine fact to roughly be t(fact). It’s tempting to just use t as a decorator:

@t
def fact(n, r=1) :
    if n <= 1 :
        return r
    else :
        return TailCall(fact, n-1, n*r)

(which, if you aren’t familiar with decorator syntax, is equivalent to writing fact = t(fact) right after the function definition). However, there is a problem with this in that the fact in the returned tail call is bound to t(fact), so the trampoline will recursively call the trampoline, completely defeating the purpose of our work. In fact, the situation is now worse than before: on my machine, fact(333) causes a RuntimeError!

For this solution, the first ingredient is the following class, which defines the trampoline as before, but wraps it in a new type so we can distinguish a trampolined function from a plain old function:

class TailCaller(object) :
    def __init__(self, f) :
        self.f = f
    def __call__(self, *args, **kwargs) :
        ret = self.f(*args, **kwargs)
        while type(ret) is TailCall :
            ret = ret.handle()
        return ret

and then we modify TailCall to be aware of TailCallers:

class TailCall(object) :
    def __init__(self, call, *args, **kwargs) :
        self.call = call
        self.args = args
        self.kwargs = kwargs
    def handle(self) :
        if type(self.call) is TailCaller :
            return self.call.f(*self.args, **self.kwargs)
        else :
            return self.call(*self.args, **self.kwargs)

Since classes are function-like and return their constructed object, we can just decorate our factorial function with TailCaller:

@TailCaller
def fact(n, r=1) :
    if n <= 1 :
        return r
    else :
        return TailCall(fact, n-1, n*r)

And then we can call fact directly with large numbers!
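
For example (fact(5000) would have overflowed the stack before):

>>> fact(5)           # small inputs still behave normally
120
>>> fact(5000) % 10   # no RuntimeError now; 5000! ends in zeros
0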

Also, unlike in the first attempt, we can now have mutually recursive functions which all perform tail calls. The first-called TailCaller object will handle all the trampolining.

If we wanted, we could also define the following function to make the argument lists for tail calls be more consistent with those for normal function calls:[3]

def tailcall(f) :
    def _f(*args, **kwargs) :
        return TailCall(f, *args, **kwargs)
    return _f

and then fact could be rewritten as

@TailCaller
def fact(n, r=1) :
    if n <= 1 :
        return r
    else :
        return tailcall(fact)(n-1, n*r)

One would hope that marking the tail calls manually could just be done away with, but I can’t think of any way to detect whether a call is a tail call without inspecting the source code. Perhaps an idea for further work is to convince Guido van Rossum that Python should support tail recursion (which is quite unlikely to happen).

[1] This is compiler-writer speak. For some reason, “elimination” is what you do when you replace a computation with something equivalent. In this case, it’s true that the call is being eliminated, but in its place there’s a jump. The same is true for “common subexpression elimination” (known as CSE), which takes, for instance,

a = b + c
d = (b + c) + e

and replaces it with

a = b + c
d = a + e

Sure, the b+c is eliminated from the second statement, but it’s not really gone... The optimization known as “dead code elimination” actually eliminates something, but that’s because dead code has no effect, and so it can be removed (that is, be replaced with nothing).

[2] In Scheme, all loops are written as recursive functions since tail calls are the pure way of redefining variables (this is the same technique Haskell uses). For instance, to print the numbers from 1 to 100, you’d write

(let next ((n 1))
  (if (<= n 100)
    (begin
      (display n)
      (newline)
      (next (+ n 1)))))

where next is bound to a function which takes one argument, n, and which has the body of the let statement as its body. If that 100 were some arbitrarily large number, the tail call to next had better be handled as a jump, otherwise the stack would overflow! And there’s no other reasonable way to write such a loop!

Continuation passing style is commonly used to handle exceptions and backtracking. You write functions of the form

(define (f cont)
   (let ((cont2 (lambda ... (cont ...) ...)))
      (g cont2)))

along with functions which take multiple such f’s and combine them into another function which also takes a single cont argument. I’ll probably talk about this more in another page, but for now notice how the call to g is in the tail position.

[3] This is basically a curried[4] version of TailCall.

[4] That is, Schönfinkelized.

Streams

Logic

Python syntax and semantics

print vs return


Andrew's tips

Source: Spring 2014 Piazza (779)

Remember the differences between return and print.

  • return can only be used in a def statement. It returns a value from a function. Once Python evaluates a return statement, it immediately exits the function.
  • print is a function that displays its argument on the screen. It always returns None.

Examples:

def foo1(x):
  return x

def foo2(x):
  print(x)

>>> foo2(1) # In foo2, we print 1 ourselves using the print function
1
>>> foo1(1) # HERE, THE PYTHON INTERPRETER PRINTS THE RETURN VALUE OF FOO1. CANNOT STRESS HOW IMPORTANT TO UNDERSTAND THIS
1
>>> foo1(1) + 1
2
>>> foo2(1) + 1
1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

Function decorators


How function decorators work

Source: Spring 2014 Piazza (638)

Student Question

I'm having difficulties understanding what exactly a function decorator is. Can someone elaborate and potentially provide me with an example other than the one in the readings?

Instructor Answer

So imagine you wanted your functions to print their arguments before they executed them. Here's one way to do this.

def loud(fn):
    def new_fn(*args):
        print(args)
        return fn(*args)
    return new_fn 

Here's a function loud that takes in a function and returns a new function that when called, prints out its arguments, and then does what the old function does.

For example:

def sq(x):
    return x * x
>>> sq(4)
16
>>> sq = loud(sq) # replace the old square with our loud one.
>>> sq(4)
(4,)
16

A function decorator does the same thing as the above. Assuming loud is defined, we can do this:

@loud
def sq(x):
    return x * x

>>> sq(4)
(4,)
16

Student guides

How to learn computer science

Source: Spring 2014 Piazza (241)

If you've never programmed before, or if you've never taken a class quite like 61A before, things right now might be scary. Everything is strange and new and there's quite a lot to take in all at once. So if you're having a hard time so far, here are a few articles that might help.

Note: these articles are pretty long, so feel free to read them in multiple sittings.

At the beginning, everything seems a bit scary in CS. Michelle Bu, a Berkeley alum and a crazy good hacker, shares one of her experiences when she was a wee n00b in 21 Nested Callbacks.

Start here! "A Beginner's Guide to Computer Science", written by Berkeley's own James Maa. James is known for his killer walkthroughs (check out his Productivity guide). This article gives you some background on learning CS and then provides a practical guide on how to learn effectively.

How do we learn? Mark Eichenlaub explains in this Introduction to Learning Theory. This is quite possibly the best introduction to Learning Theory.

Sometimes, you're stuck and you end up really, really frustrated. Marcus Geduld explains Why do we get frustrated when learning something?

Quick guide on getting unstuck

Source: Quick Guide on Getting Unstuck (Retrieved June 16th, 2014)

A major frustration you might encounter in 61A is when you stare at a homework problem and have no idea where to start. Or you write some code and it doesn't pass the doctests, but now what? You work at it for a while, but next thing you know, you've been stuck for hours on the same problem and have little to show for it.

So here's a checklist of things you can do when you're stuck. Experienced programmers do these things almost naturally (because of how much practice they've had being stuck), and so while they get stuck just as much as you or I, they always know what to do next.

  1. Do I understand what the problem is asking?
    1. If not, which part of the problem is confusing me?
      1. Identify the exact sentences/phrases/words/etc.
    2. Check the given examples. Do they make sense to me?
    3. Can I come up with my own examples? A good indicator that you understand the question is that you can come up with some nontrivial examples of how the function works.
  2. What concepts should I use here?
    1. Do I understand the concepts? Can I explain the concept in English to one of my friends such that they get it?
      1. If not, go back and relearn the specific concepts that are unclear (through discussion, lab, lecture, etc.). Don't read the entire book in order to solve one problem.
    2. How do I apply the concept to the given problem?
  3. Write your code and test it.
    1. Use doctests, BUT ALSO LOAD IT INTERACTIVELY (python3 -i ...)
      1. Saying "my function works because the doctests pass" is a lot like saying "this airplane will fly because it has wings."
    2. If your code breaks, ask yourself:
      1. Does it error? Is it a....
        1. Syntax error? If so, find the syntax bug and fix it.
        2. Logic error? Is it something weird that you don't understand? (E.g. cannot add integer and tuple)
      2. Why did it do that? Why didn't it do what I expected? Trace through the code by hand with an example (sample values) you came up with in step 1. Add calls to print in order to figure out how your function is handling the arguments.
  4. Am I missing a trick?
    1. Oftentimes you've never seen this type of problem before. This is expected on homework (and this is why homework can take a long time) because if you see it on the homework, then you will be familiar with it on the exam and when you program for fun and profit.
    2. The key here is just to learn the trick however you need to.
      1. Stare at it yourself
      2. Stare at it with others (peers in the class)
      3. Ask on Piazza what the approach is.
      4. Stare at it with the TAs/lab Assistants
    3. Once you figure it out, remember the trick so that you can use it next time.
  5. At any point you identify what you're stuck on, you can begin to resolve it.
    1. Use the tips above. Try things out on the interpreter. Review the lecture/discussion/labs/etc. Do whatever helps you get a better understanding of the problem.
    2. Once you have something specific that you're stuck on, you can ask other people in the class.
      1. Don't be afraid to ask. Everyone gets stuck and feels stupid sometimes. However, you get to choose how you react to it.
      2. At the same time, it really helps to work with people who are on about the same level in the course.
    3. Look on Piazza. Ask questions if yours hasn't come up yet. Be that awesome guy/girl who helps answer questions.
    4. You can ask the TA if all else fails. We are here to help you learn!

Here is an old algorithm for studying for tests (the final in this case), salvaged from the sands of time:

For each topic on the final, find problems on them and do them.
  If you can solve them on your own, move on.
  Else if you are stuck, look at the solution and figure out if you
  are missing a trick or if you do not understand the concepts.
    If the problem is that you are stuck on some random trick,
      just learn the trick.
    Stare at the solutions, ask Piazza, your TA, etc.
    Questions you should ask at this stage:
      What is the problem asking me to do?
      How was I supposed to follow the instructions
        to solve the problem?
      What part of the problem do I not understand?
      What is the fastest way to clear up that misunderstanding?
   Then if you think you are still stuck conceptually, review
   and learn the concept, however you learn best.
   Suggestions for picking up concepts quickly (~1-2 hours):
     Discussion notes typically have a very concise recap of the
       thing they are going over.
     There are guides for particularly tricky things on Piazza,
       like Logic, Pairs and Lists in Scheme, etc.
       Find them and go over them.
     Ask a TA: "what is the best way to learn X?"
     If these do not work and you are still shaky after an hour
     or two, it might be worth watching a lecture or reading
     the notes.

Composition

General style guidelines from 61A website

Source: Spring 2014 Piazza (149)

Student Question

Are we required to add any comments to our code to say what a function does, etc.? And does clarity of code count for this project, in which case should we write comments at the end of not-so-clear statements? Thanks.

Student Answer

Docstrings of each function are already provided. If you add a helper function, you should write a docstring for it.

The style guide on the course website advises: "Your actual code should be self-documenting -- try to make it as obvious as possible what you are doing without resorting to comments. Only use comments if something is not obvious or needs to be explicitly emphasized"

Instructor Answer

You should always aim to make your code "self-documenting," meaning it is clear what your code is doing without the aid of comments. You should try to keep the number of comments to a minimum, but if there are lines which you think are unclear/ambiguous, feel free to add a comment.

All projects in this class contain a 3 point component that is judged solely on your code "composition" -- i.e. whether your code is clear, concise, and easy to read.

Debugging

Miscellaneous

Andrew Huang's tips

Source: Spring 2014 Piazza (779)

Order of evaluation matters. The rules for evaluating call expressions are

  1. Evaluate the operator
  2. Evaluate the operands
  3. Call the operator on the operands (and draw a new frame...)

For example:

def baz():
  print("this was first")
  def bar(x):
    print(x)
    return lambda x: x * x
  return bar # baz is a function that when called, returns a function named bar

>>> baz() # the operator is baz, there are no operands
this was first
<function bar at 0x2797e20>
>>> baz()("this was second") # the operator is baz(), the operand is "this was second"
this was first
this was second
<function <lambda> at 0x2120e20>
>>> baz()("this was second")(3) # the operator is baz()("this was second"), the operand is 3
this was first
this was second
9
>>> def bar(x):
...   print(x)
...   return 3
... 
>>> baz()("this was second")(bar("this was third")) # the operator is baz()("this was second"), the operand is bar("this was third")
this was first
this was second
this was third
9

In order to solve any problem, you must first understand what the problem is asking. Oftentimes it helps to try to explain it concisely in English. It also helps to come up with small examples. For example:

def mouse(n):
  if n >= 10:
    squeak = n // 100
    n = frog(squeak) + n % 10
  return n

def frog(croak):
  if croak == 0:
    return 1
  else:
    return 10 * mouse(croak+1)

mouse(21023508479)

So the goal is to figure out what mouse(21023508479) evaluates to.

One way is to just step-by-step evaluate this, as an interpreter would.

Another way, is to understand what the functions are doing.

Looking at mouse, we see that it takes in a number and outputs that same number if it is smaller than 10. Otherwise, it'll return something weird. In order to understand that weird thing, we have to understand what frog is doing. frog takes in a number and, if that number is 0, returns 1. Otherwise, it returns ten times mouse(croak+1). Well, this is still confusing. Let's try a small example.

>>> mouse(357)
47
>>> mouse(123)
23
>>> mouse(1234)
44
>>> mouse(12345)
245

There is a pattern. We notice that the resulting number is composed of every other digit of the original, plus one (except for the last one). So 21023508479 is [2+1][0+1][3+1][0+1][4+1][9] = 314159. Can you see how the code reflects that? In this particular example, though, the pattern is definitely tricky to find, so it might make more sense to brute-force it.


Remember for recursion, you always need to find three things:

  • One or more base cases
  • One or more ways to reduce the problem
  • A way to solve the problem given solutions to smaller problems

For example, in the discussion notes, we asked you to write count_stairs. This function takes in n, the number of steps, and returns the number of ways you can climb them if at each step, you can take either one or two steps.

  • Base cases: if we consider n to be the number of steps left to climb, then it makes sense that if there is 1 step left, then there is exactly one way. If there are two steps left, then there are exactly 2 ways (1 step, 1 step, or two steps). Why do we need two base cases here?
  • We can make the problem smaller by reducing the n. At each step, we can take one step (resulting in count_stairs(n-1)) or two steps (count_stairs(n-2)).
  • Assuming we get the solutions to the two recursive calls, we should add them together to get all the ways we can climb the stairs.

Thus we end up with

def count_stairs(n):
  if n <= 2:
    return n
  else:
    return count_stairs(n-1) + count_stairs(n-2)

Notice that at each stair step, we either take one step or two steps. This is a common pattern in tree recursion. Look through Discussion 3 for more info.