Merged
87 changes: 87 additions & 0 deletions Cargo.lock


6 changes: 6 additions & 0 deletions Cargo.toml
@@ -1,11 +1,17 @@
[workspace]
members = ["exercises/*/*", "verifier"]
exclude = [
"exercises/03_concurrency/00_introduction",
"exercises/03_concurrency/01_python_threads",
]
resolver = "2"

[workspace.dependencies]
anyhow = "1"
duct = "0.13"
primes = "0.4"
pyo3 = "0.23.3"
rayon = "1"
semver = "1.0.23"
serde = "1.0.204"
serde_json = "1.0.120"
67 changes: 67 additions & 0 deletions book/src/03_concurrency/00_introduction.md
@@ -0,0 +1,67 @@
# Concurrency

All our code so far has been designed for sequential execution, on both the Python and Rust side.
It's time to spice things up a bit and explore concurrency[^scope]!

We won't dive straight into Rust this time.\
We'll start by solving a few parallel processing problems in Python, to get a feel for Python's capabilities and limitations.
Once we have a good grasp of what's possible there, we'll port our solutions over to Rust.

## Multiprocessing

If you've ever tried to write parallel code in Python, you've probably come across the `multiprocessing` module.
Before we dive into the details, let's take a step back and review the terminology we'll be using.

### Processes

A **process** is an instance of a running program.\
The precise anatomy of a process depends on the underlying **operating system** (e.g. Windows or Linux).
Some characteristics are common across most operating systems, though. In particular, a process typically consists of:

- The program's code
- Its memory space, allocated by the operating system
- A set of resources (file handles, sockets, etc.)

```ascii
+------------------------+
|         Memory         |
|                        |
| +--------------------+ |
| |  Process A Space   | | <-- Each process has a separate memory space.
| +--------------------+ |
|                        |
| +--------------------+ |
| |  Process B Space   | |
| |                    | |
| +--------------------+ |
|                        |
| +--------------------+ |
| |  Process C Space   | |
| +--------------------+ |
+------------------------+
```

There can be multiple processes running the same program, each with its own memory space and resources, fully
isolated from one another.\
The **operating system's scheduler** is in charge of deciding which process to run at any given time, partitioning CPU time
among them to maximize throughput and/or responsiveness.

### The `multiprocessing` module

Python's `multiprocessing` module allows us to spawn new processes, each running its own Python interpreter.

A process is created by invoking the `Process` constructor with a target function to execute as well as
any arguments that function might need.
The process is launched by calling its `start` method, and we can wait for it to finish by calling `join`.

If we want to communicate between processes, we can use `Queue` objects, which are shared between processes.
These queues try to abstract away the complexities of inter-process communication, allowing us to pass messages
between our processes in a relatively straightforward manner.
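A minimal sketch of this API (the worker function and its input string are illustrative, not part of the exercise):

```python
from multiprocessing import Process, Queue

# Illustrative worker: counts the words in its chunk and reports the result.
def word_count_task(chunk: str, result_queue: Queue) -> None:
    result_queue.put(len(chunk.split()))

if __name__ == "__main__":
    result_queue = Queue()
    p = Process(target=word_count_task, args=("hello from a child process", result_queue))
    p.start()  # launch the child process
    p.join()   # wait for it to finish
    print(result_queue.get())  # → 5
```

The `if __name__ == "__main__":` guard matters here: on platforms where new processes are spawned rather than forked, the child re-imports the module, and the guard prevents it from recursively spawning more processes.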

## References:

- [`multiprocessing` module](https://docs.python.org/3/library/multiprocessing.html)
- [`Process` class](https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Process)
- [`Queue` class](https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Queue)

[^scope]: We'll limit our exploration to threads and processes, without venturing into the realm of `async`/`await`.
96 changes: 96 additions & 0 deletions book/src/03_concurrency/01_python_threads.md
@@ -0,0 +1,96 @@
# Threads

## The overhead of multiprocessing

Let's have a look at the solution for the previous exercise:

```python
from multiprocessing import Process, Queue

def word_count(text: str, n_processes: int) -> int:
    result_queue = Queue()
    processes = []
    for chunk in split_into_chunks(text, n_processes):
        p = Process(target=word_count_task, args=(chunk, result_queue))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
    results = [result_queue.get() for _ in range(len(processes))]
    return sum(results)
```

Let's focus, in particular, on process creation:

```python
p = Process(target=word_count_task, args=(chunk, result_queue))
```

The parent process (the one executing `word_count`) doesn't share memory with the child process (the one
spawned via `p.start()`). As a result, the child process can't access `chunk` or `result_queue` directly.
Instead, it needs to be provided a **deep copy** of these objects[^pickle].\
That's not a major issue if the data is small, but it can become a problem on larger datasets.\
For example, if we're working with 8 GB of text, we'll end up with at least 16 GB of memory usage: 8 GB for the
parent process and 8 GB split among the child processes. Not ideal!

We could try to circumvent this issue[^mmap], but that's not always possible or easy to do.\
A more straightforward solution is to use **threads** instead of processes.

## Threads

A **thread** is an execution context **within a process**.\
Threads share the same memory space and resources as the process that spawned them, thus allowing them to communicate
and share data with one another more easily than processes can.

```ascii
+------------------------+
|         Memory         |
|                        |
| +--------------------+ |
| |  Process A Space   | | <-- Each process has its own memory space.
| | +-------------+    | |     Threads share the memory space
| | |  Thread 1   |    | |     of the process that spawned them.
| | |  Thread 2   |    | |
| | |  Thread 3   |    | |
| | +-------------+    | |
| +--------------------+ |
|                        |
| +--------------------+ |
| |  Process B Space   | |
| | +-------------+    | |
| | |  Thread 1   |    | |
| | |  Thread 2   |    | |
| | +-------------+    | |
| +--------------------+ |
+------------------------+
```

Threads, just like processes, are operating system constructs.\
The operating system's scheduler is in charge of deciding which thread to run at any given time, partitioning CPU time
among them.

## The `threading` module

Python's `threading` module provides a high-level interface for working with threads.\
The API of the `Thread` class, in particular, mirrors what you already know from the `Process` class:

- A thread is created by calling the `Thread` constructor and passing it a target function to execute as well as
any arguments that function might need.
- The thread is launched by calling its `start` method, and we can wait for it to finish by calling `join`.
- If we want to communicate between threads, we can use `Queue` objects, from the `queue` module, which are shared between threads.
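Putting those three points together, a minimal sketch (the worker function and its input strings are illustrative):

```python
from queue import Queue
from threading import Thread

# Illustrative worker: counts the words in its chunk and reports the result.
def word_count_task(chunk: str, result_queue: Queue) -> None:
    result_queue.put(len(chunk.split()))

result_queue: Queue = Queue()
threads = []
for chunk in ("the quick brown fox", "jumps over the lazy dog"):
    t = Thread(target=word_count_task, args=(chunk, result_queue))
    t.start()  # launch the thread
    threads.append(t)
for t in threads:
    t.join()  # wait for each thread to finish

total = sum(result_queue.get() for _ in threads)
print(total)  # → 9
```

Unlike the `multiprocessing` version, no serialization takes place: each thread reads its `chunk` and writes to `result_queue` directly, in shared memory.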

## References:

- [`threading` module](https://docs.python.org/3/library/threading.html)
- [`Thread` class](https://docs.python.org/3/library/threading.html#threading.Thread)
- [`Queue` class](https://docs.python.org/3/library/queue.html)

[^pickle]: To be more precise, the `multiprocessing` module uses the `pickle` module to serialize the objects
that must be passed as arguments to the child process.
The serialized data is then sent to the child process, as a byte stream, over an operating system pipe.
On the other side of the pipe, the child process deserializes the byte stream back into Python objects using `pickle`
and passes them to the target function.\
This whole system has higher overhead than a "simple" deep copy.

[^mmap]: Common workarounds include memory-mapped files and shared-memory objects, but these can be quite
difficult to work with. They also suffer from portability issues, as they rely on OS-specific features.
66 changes: 66 additions & 0 deletions book/src/03_concurrency/02_gil.md
@@ -0,0 +1,66 @@
# The GIL problem

## Concurrent, yes, but not parallel

On the surface, our thread-based solution addresses all the issues we identified in the `multiprocessing` module:

```python
from queue import Queue
from threading import Thread

def word_count(text: str, n_threads: int) -> int:
    result_queue = Queue()
    threads = []

    for chunk in split_into_chunks(text, n_threads):
        t = Thread(target=word_count_task, args=(chunk, result_queue))
        t.start()
        threads.append(t)

    for t in threads:
        t.join()

    results = [result_queue.get() for _ in range(len(threads))]
    return sum(results)
```

When a thread is created, we no longer clone the text chunk, nor do we incur the overhead of inter-process communication:

```python
t = Thread(target=word_count_task, args=(chunk, result_queue))
```

Since the spawned threads share the same memory space as the parent thread, they can access the `chunk` and `result_queue` directly.

Nonetheless, there's a major issue with this code: **it won't actually use multiple CPU cores**.\
It will run sequentially, even if we pass `n_threads > 1` and multiple CPU cores are available.
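You can check this claim empirically. The sketch below (sizes and names are illustrative) splits a CPU-bound sum across threads: the result is always correct, but on a GIL-enabled interpreter the single-threaded and four-threaded runs typically take about the same time.

```python
import time
from queue import Queue
from threading import Thread

# CPU-bound worker: sums the squares in [lo, hi).
def sum_squares(lo: int, hi: int, out: Queue) -> None:
    out.put(sum(i * i for i in range(lo, hi)))

def run(n: int, n_threads: int) -> int:
    out: Queue = Queue()
    step = n // n_threads
    threads = [
        Thread(target=sum_squares, args=(i * step, (i + 1) * step, out))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(out.get() for _ in threads)

n = 2_000_000
start = time.perf_counter()
single = run(n, 1)
t_single = time.perf_counter() - start

start = time.perf_counter()
multi = run(n, 4)
t_multi = time.perf_counter() - start

assert single == multi  # same answer either way
# On a standard (GIL-enabled) interpreter, t_multi is usually no smaller
# than t_single, despite the four threads.
```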

## Python concurrency

You guessed it: the infamous Global Interpreter Lock (GIL) is to blame.
As we discussed in the [GIL chapter](../01_intro/05_gil.md),
Python's GIL prevents multiple threads from executing Python code simultaneously[^free-threading].

As a result, thread-based parallelism has historically
seen limited use in Python, as it doesn't provide the performance benefits one might expect from a
multithreaded application.

That's why the `multiprocessing` module is so popular: it allows Python developers to bypass the GIL.
Each process has its own Python interpreter, and thus its own GIL. The operating system schedules these processes
independently, allowing them to run in parallel on multicore CPUs.

But, as we've seen, multiprocessing comes with its own set of challenges.

## Native extensions

There's a third way to achieve parallelism in Python: **native extensions**.\
We must [be holding the GIL](../01_intro/05_gil.md#pythonpy) when we invoke a Rust function from Python, but
pure Rust threads are not affected by the GIL, as long as they don't need to interact with Python objects.

Let's rewrite our `word_count` function once more, this time in Rust!

[^free-threading]: This is the current state of Python's concurrency model. There are some exciting changes on the horizon, though!
[`CPython`'s free-threading mode](https://docs.python.org/3/howto/free-threading-python.html) is an experimental feature
that aims to remove the GIL entirely.
It would allow multiple threads to execute Python code simultaneously, without forcing developers to rely on multiprocessing.
We won't cover the new free-threading mode in this course, but it's worth keeping an eye on it as it matures out of the experimental phase.