Concurrency, or parallel computing, allows multiple operations to be performed simultaneously. In the past, the speed of individual processor cores grew at an exponential rate, but power and thermal constraints brought that growth to an abrupt end. Since then, CPU manufacturers have placed multiple cores in a single processor, allowing more operations to be performed concurrently.
Parallelism in Python
In threading, multiple "threads" of execution exist within a single interpreter. Each thread executes code independently from the others, though they share the same data.
>>> import threading
>>> def thread_hello():
        other = threading.Thread(target=thread_say_hello, args=())
        other.start()
        thread_say_hello()
>>> def thread_say_hello():
        print('hello from', threading.current_thread().name)
>>> thread_hello()
hello from Thread-1
hello from MainThread
Multiprocessing allows a program to spawn multiple interpreters, or processes, each of which can run code independently. These processes do not generally share data, so any shared state must be communicated between processes.
>>> import multiprocessing
>>> def process_hello():
        other = multiprocessing.Process(target=process_say_hello, args=())
        other.start()
        process_say_hello()
>>> def process_say_hello():
        print('hello from', multiprocessing.current_process().name)
>>> process_hello()
hello from MainProcess
>>> hello from Process-1
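Because processes do not share memory, shared state must be passed through an explicit channel. A minimal sketch of one such channel, using a multiprocessing.Queue (the names compute_squares and result_queue are illustrative, not part of the example above):

```python
import multiprocessing

def compute_squares(numbers, result_queue):
    # Runs in the child process: send each result back through the queue
    for n in numbers:
        result_queue.put(n * n)

if __name__ == '__main__':
    # A multiprocessing.Queue can be shared safely between processes
    result_queue = multiprocessing.Queue()
    worker = multiprocessing.Process(target=compute_squares,
                                     args=([1, 2, 3], result_queue))
    worker.start()
    worker.join()
    # The parent process retrieves the values the child put in the queue
    results = [result_queue.get() for _ in range(3)]
    print(results)  # [1, 4, 9]
```

The queue serializes each value and transmits it between the two interpreters, which is why no data needs to be shared directly.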
Synchronized Data Structures
Queue is an example of a data structure that provides synchronized operations. The queue module contains a Queue class that provides synchronized first in, first out access to data. The put method adds an item to the Queue, and the get method retrieves the item that has been in the Queue the longest. The class ensures that its methods are synchronized, so items are not lost no matter how thread operations are interleaved.
An example of a producer/consumer relationship built on a Queue:
import threading
from queue import Queue

queue = Queue()

def synchronized_consume():
    while True:
        print('got an item:', queue.get())
        queue.task_done()

def synchronized_produce():
    consumer = threading.Thread(target=synchronized_consume, args=())
    consumer.daemon = True
    consumer.start()
    for i in range(10):
        queue.put(i)
    queue.join()

synchronized_produce()
Another use of a Queue is a parallel web crawler that searches for dead links on a website. By following every link hosted by the same site, it continually adds new URLs to a Queue, processes each URL it retrieves, and marks it as done afterwards. Because the Queue is synchronized, multiple threads can safely add to and remove from the data structure concurrently.
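One way such a crawler might be structured, as a sketch only: a hard-coded site map stands in for real HTTP requests, and SITE_MAP, find_dead_links, and the URL names are all illustrative rather than part of any real crawler:

```python
import threading
from queue import Queue

# A fixed site map stands in for fetching pages over the network; a real
# crawler would download each page and extract its links instead.
SITE_MAP = {
    '/index': ['/about', '/contact'],
    '/about': ['/index', '/missing'],
    '/contact': ['/index'],
}

url_queue = Queue()
seen = set()
seen_lock = threading.Lock()
dead_links = []

def crawl():
    while True:
        url = url_queue.get()          # blocks until an item is available
        links = SITE_MAP.get(url)
        if links is None:
            dead_links.append(url)     # page does not exist: a dead link
        else:
            for link in links:
                with seen_lock:        # protect the shared 'seen' set
                    if link not in seen:
                        seen.add(link)
                        url_queue.put(link)
        url_queue.task_done()          # mark this URL as processed

def find_dead_links(start):
    seen.add(start)
    url_queue.put(start)
    for _ in range(4):                 # several crawler threads share the queue
        worker = threading.Thread(target=crawl, daemon=True)
        worker.start()
    url_queue.join()                   # wait until every queued URL is handled
    return dead_links

print(find_dead_links('/index'))       # ['/missing']
```

The call to url_queue.join() is what lets the main thread know the crawl is complete: it returns only once task_done has been called for every item that was ever put in the Queue.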