The plugin API is unsound due to multi-threading #49

Safe buffer creation in the host was just the warm-up. Now the real fun begins! :wink:

**TL;DR:** To adhere to the requirements of Safe Rust, the `Plugin` API of the `rust-vst` crate needs to be restructured in a way that will require substantial changes to all existing plugins.


## The problem

The VST API is multi-threaded. A host may call multiple methods concurrently on a plugin instance.

The way the `rust-vst` crate is structured, all methods have access to the same data: an instance of a type implementing the `Plugin` trait. In the presence of concurrency, this sharing causes data races, which are undefined behavior and violate the assumptions of Safe Rust. In practice, this can lead to crashes and other erratic behavior, because the Rust compiler assumes that access through a mutable reference is exclusive and performs optimizations and other transformations based on this assumption.

**If the `rust-vst` crate is to be a safe wrapper around the VST API, it must be restructured in such a way that data races cannot occur.** In Rust terms, this means that:

- If two methods can be called concurrently, they cannot have mutable access to the same data.
- If data is (immutably) shared between methods that can be called concurrently, it must be `Sync`.
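
As a minimal sketch of these two rules (the `SharedState` type and `count_concurrent_calls` function are made up for illustration and are not part of `rust-vst`), two threads can call a `&self` method on the same value as long as that value is `Sync`. If `record_call` took `&mut self` instead, the compiler would reject the second `spawn`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Hypothetical stand-in for state that two host threads touch at once.
struct SharedState {
    // `AtomicUsize` is `Sync`, so `&SharedState` may be shared across threads.
    calls: AtomicUsize,
}

impl SharedState {
    // Takes `&self`: no exclusive access is claimed, so concurrent calls are fine.
    fn record_call(&self) {
        self.calls.fetch_add(1, Ordering::Relaxed);
    }
}

// Calls `record_call` from two threads at once and returns the total count.
fn count_concurrent_calls(per_thread: usize) -> usize {
    let state = SharedState { calls: AtomicUsize::new(0) };
    // Scoped threads may borrow `state`; both are joined when the scope ends.
    thread::scope(|s| {
        for _ in 0..2 {
            s.spawn(|| {
                for _ in 0..per_thread {
                    state.record_call();
                }
            });
        }
    });
    state.calls.load(Ordering::Relaxed)
}
```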

On the host side, we have the opposite problem: a host would potentially like to call plugin methods from multiple threads (as outlined in the next section), but it is currently not possible to do so in Safe Rust, because many of the methods require an exclusive reference to the plugin instance.


## VST concurrency

The concurrency between VST plugin methods is, fortunately, not arbitrary. The multi-threading characteristics of VST plugins are described in the [VST SDK 2.4 documentation](http://www.dith.it/listing/vst_stuff/vstsdk2.4/doc/html/sequences.html). I found this description somewhat vague and incomplete, so to uncover the details of how a host might call into a plugin, I wrote a [test plugin](https://github.com/askeksa/vst_concurrency_tester) that detects and reports which methods are called concurrently. I then stress-tested it in Renoise by running a few notes in a loop with parameter automation while tweaking everything in the GUI I could think of.

My understanding based on both of these sources is the following:

- Methods in a plugin are called from two threads: the GUI thread and the processing thread. There can be more than one processing thread, but calls will only come from one of them at a time, so for the purposes of concurrency, we can assume there is just one processing thread.

- The VST plugin methods fall into four categories:

  1. The *setup* methods (`can_do`, `get_info`, `get_input_info`, `get_output_info`, `init`, `resume`, `suspend`, `set_block_size`, `set_sample_rate` and (presumably; not seen) `get_tail_size`). These methods are never called concurrently with anything. Furthermore, all of these methods (except `suspend`) are only ever called when the plugin is in the suspended state. All other methods are only called when the plugin is in the resumed state.

  2. The *processing* methods (`process`, `process_f64` and `process_events`) are always called from the processing thread. Thus, they are never called concurrently with each other, but can be called concurrently with other methods (except the *setup* methods).

  3. The *automation* methods (`set_parameter`, `change_preset`) can be called either from the processing thread (for automation) or from the GUI thread (when parameters are manually changed in the host GUI). Thus, these can be called concurrently with themselves and each other, and with other methods (except the *setup* methods).

  4. The *remaining* methods (mostly parameter queries, preset handling and editor interaction) are always called from the GUI thread. Thus they are never called concurrently with each other, but can be called concurrently with the *processing* and *automation* methods.
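
The four categories can be summarized as a small compatibility table. The sketch below encodes my reading of the rules above; the `Category` enum and `may_overlap` function are illustrative names, not part of any existing API.

```rust
// The four method categories described above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Category {
    Setup,
    Processing,
    Automation,
    Remaining,
}

// Returns true if a method from category `a` may run at the same time
// as a method from category `b`, according to the rules listed above.
fn may_overlap(a: Category, b: Category) -> bool {
    use Category::*;
    match (a, b) {
        // Setup methods are never concurrent with anything.
        (Setup, _) | (_, Setup) => false,
        // Automation methods can overlap with everything else, including themselves.
        (Automation, _) | (_, Automation) => true,
        // Processing thread versus GUI thread.
        (Processing, Remaining) | (Remaining, Processing) => true,
        // Each of these categories is confined to a single thread.
        (Processing, Processing) | (Remaining, Remaining) => false,
    }
}
```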


## Requirements

A solution to this issue should ideally fulfill the following requirements:

- A plugin written in Safe Rust never encounters data races (or other undefined behavior) when run in a host that follows the VST concurrency rules described in the previous section.

- A host written in Safe Rust is able to call plugin methods from multiple threads, subject to the VST concurrency rules. It is a bonus if the rules are enforced by the API (statically, dynamically, or some combination thereof) such that the host is not able to violate them.

- Calling a Rust plugin directly from a Rust host without going through the VST API is still possible and is still safe (i.e. safety should not depend on the VST API bridging code).

- The API is not too opinionated about how the plugin implements communication between the threads. In particular, it should be possible, within the API constraints, for the processing thread to be completely free of allocation and blocking synchronization.
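
To illustrate the last requirement, here is one possible way for the GUI thread to hand a preset change to the processing thread without allocation or blocking. This is only a sketch: `PresetMailbox`, `post`, `take` and the `NO_PRESET` sentinel are hypothetical names, and it assumes that valid preset indices are non-negative.

```rust
use std::sync::atomic::{AtomicI32, Ordering};

// Sentinel meaning "nothing pending" (assumes real preset indices are >= 0).
const NO_PRESET: i32 = -1;

// A one-slot mailbox: the latest posted preset wins.
struct PresetMailbox {
    pending: AtomicI32,
}

impl PresetMailbox {
    fn new() -> Self {
        Self { pending: AtomicI32::new(NO_PRESET) }
    }

    // Called from the GUI thread (or from automation): a single atomic store.
    fn post(&self, preset: i32) {
        self.pending.store(preset, Ordering::Release);
    }

    // Called from the processing thread: a single atomic swap, no locks.
    fn take(&self) -> Option<i32> {
        match self.pending.swap(NO_PRESET, Ordering::AcqRel) {
            NO_PRESET => None,
            preset => Some(preset),
        }
    }
}
```

A plugin with different communication needs (ring buffers, triple buffering, and so on) should be equally expressible, which is why the API should not prescribe one mechanism.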


## Implementation

To achieve safety, the plugin state needs to be split into separate chunks of state such that methods that can be called concurrently do not have mutable access to the same chunk.

Note that since the *automation* methods can be called concurrently with themselves, these methods can't have mutable access to anything. All mutation performed by these methods must thus take place via thread-safe interior mutability (i.e. `Mutex`, `RwLock`, spinlocks, atomics and the like).
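
For example, parameter values could be stored as the bit patterns of `f32` values in atomics, so that `set_parameter` works through a shared reference. The `Parameters` struct below is a hypothetical illustration with an assumed fixed parameter count, not existing crate code.

```rust
use std::convert::TryFrom;
use std::sync::atomic::{AtomicU32, Ordering};

// Assumed number of parameters for this illustration.
const PARAM_COUNT: usize = 2;

// Each parameter is an `f32` stored as its bit pattern in an `AtomicU32`.
struct Parameters {
    values: [AtomicU32; PARAM_COUNT],
}

impl Parameters {
    fn new() -> Self {
        Self { values: std::array::from_fn(|_| AtomicU32::new(0.0f32.to_bits())) }
    }

    // Maps a VST-style `i32` index to a slot; negative or too-large indices give `None`.
    fn slot(&self, index: i32) -> Option<&AtomicU32> {
        let i = usize::try_from(index).ok()?;
        self.values.get(i)
    }

    // Note `&self`: safe to call from several threads at once.
    fn set_parameter(&self, index: i32, value: f32) {
        if let Some(slot) = self.slot(index) {
            slot.store(value.to_bits(), Ordering::Relaxed);
        }
    }

    // Returns 0.0 for an invalid index.
    fn get_parameter(&self, index: i32) -> f32 {
        self.slot(index)
            .map_or(0.0, |slot| f32::from_bits(slot.load(Ordering::Relaxed)))
    }
}
```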

One way to split the state could be something like this:

```Rust
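use std::sync::Arc;
// Import paths as in the `vst` crate; adjust them if your version differs.
use vst::api::Events;
use vst::buffer::AudioBuffer;

// Exclusive to the processing thread.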
trait PluginProcessing {
    fn process(&mut self, buffer: &mut AudioBuffer<f32>);
    fn process_f64(&mut self, buffer: &mut AudioBuffer<f64>);
    fn process_events(&mut self, events: &Events);
}

// Shared between threads and the main vessel for communication
// between the threads. This communication happens through
// thread-safe interior mutability.
// Note that all references to self are immutable.
trait PluginAutomation {
    fn set_parameter(&self, index: i32, value: f32);
    fn change_preset(&self, preset: i32);

    // The other parameter/preset methods can be placed here.
    // This will force these other methods to also work through
    // interior mutability, but it will reduce the amount of
    // communication necessary between separate state chunks.
}

// Main plugin trait.
trait Plugin {
    type Processing: PluginProcessing;
    type Automation: PluginAutomation + Sync;

    // Get a shared handle to the automation state.
    fn get_automation_handle(&mut self) -> Arc<Self::Automation>;

    // When a plugin is resumed, it relinquishes its access to the
    // processing state so that it can be passed to the processing
    // thread for exclusive access.
    fn resume(&mut self) -> Box<Self::Processing>;

    // To suspend a plugin, the host must pass the processing state
    // back in to prove that no other thread is accessing it.
    fn suspend(&mut self, processing: Box<Self::Processing>);

    // The setup and remaining methods go here.
}
```
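
To get a feel for how a plugin and a host might use such a split, here is a self-contained toy version. It is a sketch under several assumptions: the traits are simplified stand-ins (plain `&mut [f32]` slices instead of `AudioBuffer`, and `Send` bounds added so the state can move to another thread), and `GainPlugin`, `GainParams`, `GainProcessing` and `run_host` are invented for the example.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

// Simplified stand-ins for the proposed traits.
trait PluginProcessing: Send {
    fn process(&mut self, buffer: &mut [f32]);
}
trait PluginAutomation: Send + Sync {
    fn set_parameter(&self, index: i32, value: f32);
}
trait Plugin {
    type Processing: PluginProcessing;
    type Automation: PluginAutomation;
    fn get_automation_handle(&mut self) -> Arc<Self::Automation>;
    fn resume(&mut self) -> Box<Self::Processing>;
    fn suspend(&mut self, processing: Box<Self::Processing>);
}

// Hypothetical gain plugin: one parameter, stored as `f32` bits.
struct GainParams {
    gain_bits: AtomicU32,
}
impl PluginAutomation for GainParams {
    fn set_parameter(&self, _index: i32, value: f32) {
        self.gain_bits.store(value.to_bits(), Ordering::Relaxed);
    }
}

struct GainProcessing {
    params: Arc<GainParams>,
}
impl PluginProcessing for GainProcessing {
    fn process(&mut self, buffer: &mut [f32]) {
        let gain = f32::from_bits(self.params.gain_bits.load(Ordering::Relaxed));
        for sample in buffer.iter_mut() {
            *sample *= gain;
        }
    }
}

struct GainPlugin {
    params: Arc<GainParams>,
    // `None` while the plugin is resumed and the host owns the processing state.
    processing: Option<Box<GainProcessing>>,
}
impl GainPlugin {
    fn new() -> Self {
        let params = Arc::new(GainParams { gain_bits: AtomicU32::new(1.0f32.to_bits()) });
        let processing = Box::new(GainProcessing { params: Arc::clone(&params) });
        Self { params, processing: Some(processing) }
    }
}
impl Plugin for GainPlugin {
    type Processing = GainProcessing;
    type Automation = GainParams;
    fn get_automation_handle(&mut self) -> Arc<GainParams> {
        Arc::clone(&self.params)
    }
    fn resume(&mut self) -> Box<GainProcessing> {
        self.processing.take().expect("plugin is already resumed")
    }
    fn suspend(&mut self, processing: Box<GainProcessing>) {
        self.processing = Some(processing);
    }
}

// Host side: move the processing state to a worker thread and get it back.
fn run_host() -> Vec<f32> {
    let mut plugin = GainPlugin::new();
    let automation = plugin.get_automation_handle();
    let mut processing = plugin.resume();
    automation.set_parameter(0, 0.5); // e.g. from the GUI thread
    let worker = thread::spawn(move || {
        let mut buffer = vec![1.0f32, 2.0];
        processing.process(&mut buffer);
        (processing, buffer)
    });
    let (processing, buffer) = worker.join().unwrap();
    plugin.suspend(processing); // Proves the worker no longer uses the state.
    buffer
}
```

Because `resume` hands out the only `Box` and `suspend` takes it back, the host cannot keep processing on another thread after suspending.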

This design achieves thread safety in plugins, and it prevents hosts from calling processing methods while the plugin is suspended. It does have a few drawbacks, though:

- It does not prevent the host from calling *setup* methods while the plugin is resumed. Such a restriction could be implemented by giving the host a "setup token" that it needs to pass to the setup methods (or maybe the methods are on the token itself). This token would be consumed by the `resume` method and given back by the `suspend` method. This could become somewhat unwieldy for both the plugin and the host, however.

- Using the standard `Arc` and `Box` types to control access to the `Automation` and `Processing` state chunks means that these chunks must be allocated on the heap. This could be avoided by introducing specialized wrapper types with similar semantics but which will allow the chunks to be embedded into the main plugin struct. But again, this would make the API more cumbersome to use.

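- Adding associated types to the `Plugin` trait means that a trait object has to name them (`dyn Plugin<Processing = P, Automation = A>`), so plugins with different state types can no longer be handled uniformly through a plain `dyn Plugin`. This can be a problem for the pure Rust use case (bypassing the bridge).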

- Some of the new methods can't have sensible default implementations unless we require the `Processing` and `Automation` types to implement `Default`. Such a requirement can be inconvenient, as the `Processing` chunk would usually want to contain a (non-optional) reference to the `Automation` chunk.
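
The "setup token" idea from the first drawback could look roughly like the following. Every name here (`SetupToken`, `TokenPlugin`, `Processing`, `token_round_trip`) is hypothetical, and the sketch has a known gap: nothing ties a token to a particular plugin instance, so a token from one plugin would be accepted by another.

```rust
// Proof that the plugin is currently suspended. In a real crate the private
// field would keep code outside the defining module from forging a token.
struct SetupToken {
    _private: (),
}

// Stand-in for the processing state handed to the processing thread.
struct Processing {
    block_size: usize,
}

struct TokenPlugin {
    block_size: usize,
}

impl TokenPlugin {
    // A new plugin starts out suspended, so it comes with a token.
    fn new() -> (TokenPlugin, SetupToken) {
        (TokenPlugin { block_size: 0 }, SetupToken { _private: () })
    }

    // A setup method: only callable while the host holds the token.
    fn set_block_size(&mut self, _token: &SetupToken, size: usize) {
        self.block_size = size;
    }

    // Resuming consumes the token, so setup methods become uncallable.
    fn resume(&mut self, _token: SetupToken) -> Box<Processing> {
        Box::new(Processing { block_size: self.block_size })
    }

    // Suspending takes the processing state back and returns a fresh token.
    fn suspend(&mut self, _processing: Box<Processing>) -> SetupToken {
        SetupToken { _private: () }
    }
}

// Walks through one suspend/resume cycle and reports the observed block sizes.
fn token_round_trip() -> (usize, usize) {
    let (mut plugin, token) = TokenPlugin::new();
    plugin.set_block_size(&token, 256);
    let processing = plugin.resume(token);
    // `plugin.set_block_size(&token, 512)` would not compile here: `token` was moved.
    let seen_by_processing = processing.block_size;
    let token = plugin.suspend(processing);
    plugin.set_block_size(&token, 512);
    (seen_by_processing, plugin.block_size)
}
```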


## What to do now

Discuss. :grinning:

Then, we should make some prototypes to see how these ideas (and others we come up with) work in practice.

This change is a substantial undertaking (not least in fixing all existing plugins), but I think it is necessary before we can call our crate complete and stable. And if we manage to pull this off in a good way, it could make for quite a good case study (about wrapping an unsafe API safely) for [This Week in Rust](https://this-week-in-rust.org/). :relieved:

