Concurrency

From CS 61A Wiki
Revision as of 22:47, 8 August 2014 by Stevencheng (Talk | contribs)



Concurrency, or parallel computing, allows multiple operations to be performed simultaneously. In the past, the speed of individual processor cores grew at an exponential rate, but power and thermal constraints brought this growth to an abrupt end. Since then, CPU manufacturers have placed multiple cores in a single processor, allowing more operations to be performed concurrently.

Parallelism in Python

Threading

In threading, multiple "threads" of execution exist within a single interpreter. Each thread executes code independently from the others, though they share the same data.

>>> import threading
>>> def thread_hello():
        other = threading.Thread(target=thread_say_hello, args=())
        other.start()
        thread_say_hello()
 
>>> def thread_say_hello():
        print('hello from', threading.current_thread().name)
 
>>> thread_hello()
hello from Thread-1
hello from MainThread
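
The point that threads share the same data can be illustrated with a small sketch. The names results and record_square are hypothetical and chosen only for this example. Each thread writes to its own key of a shared dictionary, so no locking is needed in this particular case.

```python
import threading

results = {}  # one dictionary, visible to every thread in this interpreter

def record_square(n):
    # Each thread writes to a distinct key, so the writes cannot conflict.
    results[n] = n * n

threads = [threading.Thread(target=record_square, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for every thread to finish before reading the results

print(sorted(results.items()))  # [(0, 0), (1, 1), (2, 4), (3, 9)]
```

The main thread sees the values written by the other threads because all of them operate on the same object in memory.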

Multiprocessing

Multiprocessing allows a program to spawn multiple interpreters, or processes, each of which runs code independently. Processes do not share data, so any shared state must be communicated between them explicitly.

>>> import multiprocessing
>>> def process_hello():
        other = multiprocessing.Process(target=process_say_hello, args=())
        other.start()
        process_say_hello()
 
>>> def process_say_hello():
        print('hello from', multiprocessing.current_process().name)
 
>>> process_hello()
hello from MainProcess
>>> hello from Process-1
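
A rough sketch of what "no shared data" means in practice follows. The names counter, child_work and run_demo are hypothetical, and the example assumes a Unix-like system where the "fork" start method is available. The child process changes its own copy of counter and sends the result back through a multiprocessing queue, while the parent's copy stays unchanged.

```python
import multiprocessing

counter = {"value": 0}  # each process gets its own copy of this dictionary

def child_work(results):
    counter["value"] += 1          # modifies only the child's copy
    results.put(counter["value"])  # state must be sent back explicitly

def run_demo():
    # Assumption: a Unix-like OS, where the "fork" start method exists.
    ctx = multiprocessing.get_context("fork")
    results = ctx.Queue()
    child = ctx.Process(target=child_work, args=(results,))
    child.start()
    from_child = results.get()  # receive the value the child computed
    child.join()
    return from_child, counter["value"]

if __name__ == "__main__":
    print(run_demo())  # (1, 0): the child saw 1, the parent still sees 0
```

Contrast this with the threading case, where a change made by one thread is immediately visible to all the others.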

Synchronized Data Structures

A Queue is an example of a data structure that provides synchronized operations. The queue module contains a Queue class that provides synchronized first-in, first-out access to data. The put method adds an item to the back of the Queue, and the get method removes and returns the item at the front, blocking until one is available. The class ensures these methods are synchronized, so items are not lost no matter how thread operations are interleaved.

An example of a producer/consumer Queue:

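import threading
from queue import Queue

queue = Queue()
 
def synchronized_consume():
    while True:
        print('got an item:', queue.get())  # blocks until an item is available
        queue.task_done()                   # mark the item as fully processed
 
def synchronized_produce():
    consumer = threading.Thread(target=synchronized_consume, args=())
    consumer.daemon = True  # let the program exit while the consumer is blocked
    consumer.start()
    for i in range(10):
        queue.put(i)
    queue.join()  # wait until every item has been marked done
 
synchronized_produce()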

Another use of a Queue is a parallel web crawler that searches for dead links on a website. The crawler follows every link hosted by the same site: each worker thread removes a URL from the Queue, checks the page, and adds any newly discovered URLs back to the Queue. Because the Queue is synchronized, multiple threads can safely add to and remove from it concurrently.
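
The crawler idea can be sketched without any network access. Everything here is an assumption made for illustration: SITE is a hypothetical in-memory stand-in for a website, a URL missing from SITE is treated as a dead link, and crawl is not taken from any real crawler. A real implementation would fetch pages over HTTP and parse their links.

```python
import threading
from queue import Queue

# Hypothetical site: each page maps to the links it contains.
# A URL that is linked to but absent from this table counts as dead.
SITE = {
    "/": ["/about", "/blog", "/missing"],
    "/about": ["/", "/team"],
    "/blog": ["/blog/post-1", "/gone"],
    "/blog/post-1": ["/"],
    "/team": [],
}

def crawl(start, num_workers=4):
    frontier = Queue()       # synchronized queue of URLs still to check
    seen = {start}           # URLs already queued
    dead = set()             # URLs that could not be found
    lock = threading.Lock()  # guards seen and dead, which are plain sets

    def worker():
        while True:
            url = frontier.get()
            try:
                links = SITE.get(url)
                if links is None:
                    with lock:
                        dead.add(url)
                else:
                    for link in links:
                        with lock:
                            is_new = link not in seen
                            seen.add(link)
                        if is_new:
                            frontier.put(link)
            finally:
                frontier.task_done()  # this URL has been fully processed

    for _ in range(num_workers):
        threading.Thread(target=worker, daemon=True).start()
    frontier.put(start)
    frontier.join()  # returns once every queued URL has been processed
    return sorted(dead)

print(crawl("/"))  # ['/gone', '/missing']
```

Only the Queue is synchronized on its own. The seen and dead sets are ordinary Python sets, so this sketch protects them with a lock.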