This post introduces self-referential structs, pinning, futures, and executors, as groundwork for a follow-up post.
Self-referential structs
Normally, if you move a struct that you own in Rust, everything is fine; all of its fields are moved together with it:
struct MyStruct {
field: String,
}
fn main() {
let my_struct = MyStruct {
field: "hei".to_string(),
};
let my_struct_addr = &raw const my_struct;
let field_addr = &raw const my_struct.field;
// move the struct to the heap, then back to the stack
let my_struct = *Box::new(my_struct);
// in a typical build, the address of the struct and its field
// before and after the move will have changed
assert_ne!(my_struct_addr, &raw const my_struct);
assert_ne!(field_addr, &raw const my_struct.field);
}
Rust must also not let you violate memory safety without unsafe, even when a struct holds a pointer to one of its own fields:
struct MyStruct {
field: String,
pointer_field: *const String,
}
fn main() {
let mut my_struct = MyStruct {
field: "hei".to_string(),
pointer_field: std::ptr::null(),
};
// my_struct.pointer_field is pointing to my_struct.field
// so the struct is self-referential
my_struct.pointer_field = &raw const my_struct.field;
let field_addr = &raw const my_struct.field;
let pointer_before_move = my_struct.pointer_field;
// move the struct to the heap, then back to the stack
let my_struct = *Box::new(my_struct);
// in a typical build, the address of the struct field
// before and after the move will have changed
assert_ne!(field_addr, &raw const my_struct.field);
// confirm that pointer didn't change even though the field
// address changed, so this pointer is now dangling!
assert_eq!(pointer_before_move, my_struct.pointer_field);
}
The code above is still okay because a raw pointer is an inert thing as long as it's not dereferenced, and dereferencing it requires unsafe. If you added code that dereferenced this pointer, then that code would need to uphold the invariant that the pointer still points to a valid String.
If you try to do something similar with safe Rust, you will find yourself unable to move the struct:
struct MyStruct<'a> {
field: String,
pointer_field: &'a String,
}
fn main() {
let x = "".to_string();
let mut my_struct = MyStruct {
field: "hei".to_string(),
pointer_field: &x,
};
my_struct.pointer_field = &my_struct.field;
// this won't work: compilation error
// cannot move the struct because it is borrowed
// let my_struct = *Box::new(my_struct);
}
Pinning
So what is Pin? It's a wrapper type around a pointer that helps us create a safe interface to interact with values whose validity depends on their address staying stable, such as self-referential objects.
First, consider the Unpin trait. Most types in Rust are not self-referential and implement Unpin automatically. However, a type will not automatically implement Unpin if it has a field that doesn't implement it. So, let's consider the two cases.
- Implements Unpin: the first case is easy; if a type T implements Unpin, then pinning doesn't impose extra restrictions, and you can, with safe Rust, convert &mut T to or from Pin<&mut T> with Pin::new and Pin::get_mut/Pin::into_inner. This is useful if some method expects a Pin<&mut T> but we have a &mut T.
- Does not implement Unpin: then it's impossible, with safe Rust, to convert &mut T to Pin<&mut T> directly using Pin::new, but there is a macro std::pin::pin! that allows us (with safe Rust) to take ownership of a value T and create a local pinned binding, giving us a Pin<&mut T>. While that pinned binding is alive, safe Rust cannot move the value out again. In this case, it's also impossible, with safe Rust, to convert Pin<&mut T> to &mut T, so assuming you don't have ownership of T anymore (i.e. you didn't use unsafe Rust to get Pin<&mut T> to begin with), it's not possible to move the value out of its location, so that place in memory is pinned. Another common option is Box::pin, which pins a value on the heap.
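The Box::pin option from the second case can be sketched as follows. This is a minimal illustration, not a recipe: the type NotUnpin and the function field_addresses are names made up for this example. It shows that moving the Pin<Box<T>> handle moves only a pointer, so the pinned value keeps its heap address.

```rust
use std::{marker::PhantomPinned, pin::Pin};

// Hypothetical type for illustration; PhantomPinned makes it !Unpin.
struct NotUnpin {
    field: String,
    _pin: PhantomPinned,
}

// Hypothetical helper: returns the address of `field` before and
// after the pinned box handle is moved to a new binding.
fn field_addresses() -> (*const String, *const String) {
    let boxed: Pin<Box<NotUnpin>> = Box::pin(NotUnpin {
        field: "hei".to_string(),
        _pin: PhantomPinned,
    });
    let before: *const String = &boxed.field;
    // This moves only the Pin<Box<_>> handle (a pointer);
    // the NotUnpin value itself stays where it is on the heap.
    let moved = boxed;
    let after: *const String = &moved.field;
    (before, after)
}

fn main() {
    let (before, after) = field_addresses();
    // Unlike the earlier struct moves, the field address is unchanged.
    assert_eq!(before, after);
}
```

Compared with std::pin::pin!, the pinned box is an owned value, so it can be returned from a function or stored in a collection, at the cost of a heap allocation.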
Let's illustrate both cases with an example:
use std::{marker::PhantomPinned, pin::Pin};
#[derive(Debug)]
struct MyStruct {
field: String,
}
#[derive(Debug)]
struct MyStructPP {
field: String,
pin: PhantomPinned,
}
fn main() {
let mut my_struct = MyStruct {
field: "hei".to_string(),
};
let pinned: Pin<&mut MyStruct> = Pin::new(&mut my_struct);
let mut_ref: &mut MyStruct = Pin::into_inner(pinned);
dbg!(mut_ref);
let my_struct = MyStructPP {
field: "hei".to_string(),
pin: PhantomPinned,
};
// won't work: compilation error
// let pinned: Pin<&mut MyStructPP> = Pin::new(&mut my_struct);
let pinned: Pin<&mut MyStructPP> = std::pin::pin!(my_struct);
// won't work: compilation error
// the macro took away our owned value of my_struct
// dbg!(my_struct);
// won't work: compilation error
// let mut_ref: &mut MyStructPP = Pin::into_inner(pinned);
// we can only get back to a &mut with unsafe
unsafe {
let pointer_back: &mut MyStructPP = Pin::into_inner_unchecked(pinned);
dbg!(pointer_back);
}
}
Futures, executors
To begin, here's an ultra-quick intro on futures and executors, assuming you already have some experience with async Rust.
A future in Rust can be created by an async block, or by calling an async function or an async closure:
async fn give_me_some_future() {}
fn main() {
let _some_future = give_me_some_future();
let _another_future = async {};
let an_async_closure = async || {};
let _another_future = an_async_closure();
}
Or manually by implementing the Future trait:
use std::{
future::Future,
pin::Pin,
task::{Context, Poll},
};
fn main() {
let _some_future = SomeFuture(3);
}
struct SomeFuture(u8);
impl Future for SomeFuture {
type Output = u8;
fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u8> {
Poll::Ready(self.0)
}
}
The manually implemented future makes it quite clear that futures are lazy: just creating a future, in general, doesn't do anything. We need to .await it, and for that we need an executor. For example, add tokio = { version = "1", features = ["rt", "macros"] } to your Cargo.toml, or run cargo add tokio --features=rt,macros, and then you can run this:
use std::{
future::Future,
pin::Pin,
task::{Context, Poll},
};
#[tokio::main(flavor = "current_thread")]
async fn main() {
let some_future = SomeFuture(3);
assert_eq!(some_future.await, 3);
}
struct SomeFuture(u8);
impl Future for SomeFuture {
type Output = u8;
fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u8> {
Poll::Ready(self.0)
}
}
What an async executor does, in simple terms, is poll futures until they return Poll::Ready; when a future returns Poll::Pending, the executor normally waits until the future's Waker signals that it may make progress.
Note: in practice, it's not a busy loop calling poll until it gets a Poll::Ready. Instead, the executor gives the future a Waker inside a Context, and the future uses that Waker to tell the executor that it's ready to be polled again and can make progress. The details are mostly beyond the scope of this post.
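To make the note above a little more concrete, here is a minimal sketch of a waker-driven executor built only on the standard library. The names ThreadWaker and block_on are made up for this illustration, and real executors such as tokio are far more elaborate; the sketch assumes a single future that is driven on the current thread.

```rust
use std::{
    future::Future,
    pin::pin,
    sync::Arc,
    task::{Context, Poll, Wake, Waker},
    thread::{self, Thread},
};

// Hypothetical waker: waking it unparks the thread running block_on.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

// Hypothetical single-future executor.
fn block_on<F: Future>(fut: F) -> F::Output {
    // Pin the future on the stack so that it can be polled.
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(val) => return val,
            // Sleep until a waker unparks this thread, then poll again.
            // A spurious wake-up only causes one extra poll.
            Poll::Pending => thread::park(),
        }
    }
}

fn main() {
    assert_eq!(block_on(async { 1 + 2 }), 3);
}
```

The point of the design is that the thread sleeps in thread::park between polls instead of spinning, and only the future's call to wake gets it polled again.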
We can simulate a single poll by hand, using a dummy waker:
use std::{
future::Future,
pin::Pin,
task::{Context, Poll},
};
// run
// cargo add futures
// or add
// futures = "0.3"
// to your Cargo.toml
use futures::task::noop_waker;
fn main() {
// create a dummy waker and context
// just to make the type system work
let waker = noop_waker();
let mut cx = Context::from_waker(&waker);
let mut fut = SomeFuture(3);
let pfut = Pin::new(&mut fut);
assert_eq!(pfut.poll(&mut cx), Poll::Ready(3));
}
struct SomeFuture(u8);
impl Future for SomeFuture {
type Output = u8;
fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u8> {
Poll::Ready(self.0)
}
}
Or to visualize with something a little more complex:
use std::{
future::Future,
pin::Pin,
task::{Context, Poll},
time::SystemTime,
};
use futures::task::noop_waker;
fn main() {
let waker = noop_waker();
let mut cx = Context::from_waker(&waker);
let mut fut = SomeFuture(0);
let mut pfut = Pin::new(&mut fut);
loop {
match pfut.as_mut().poll(&mut cx) {
Poll::Pending => {}
Poll::Ready(val) => {
println!("\nFinal result: {val}");
break;
}
}
}
}
struct SomeFuture(u64);
impl Future for SomeFuture {
type Output = u64;
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u64> {
// Unix epoch timestamp in nanoseconds
let timestamp = SystemTime::now()
.duration_since(SystemTime::UNIX_EPOCH)
.unwrap()
.as_nanos();
// A rudimentary random number generator
let remainder_of_timestamp_divided_by_200 = timestamp % 200;
if remainder_of_timestamp_divided_by_200 == 0 {
Poll::Ready(self.0)
} else {
self.get_mut().0 += 1;
// Tells the async executor that
// we are ready to be polled again
cx.waker().wake_by_ref();
Poll::Pending
}
}
}
Self-referential futures
Not every future is self-referential per se; in fact, all the manually defined futures in the previous sections are non-self-referential.
Compiler-generated futures (from async functions, async blocks, or async closures) are generally marked as !Unpin, even when they technically are not self-referential, so even this won't compile:
let mut fut = async {};
let x = Pin::new(&mut fut); // compile error!
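Such a future can still be pinned with safe Rust, using the two options from the Pinning section: std::pin::pin! or Box::pin. The sketch below shows both. NoopWake and poll_pinned_async_blocks are names invented for this example, and the do-nothing waker is only there to satisfy the poll signature.

```rust
use std::{
    future::Future,
    pin::{pin, Pin},
    sync::Arc,
    task::{Context, Poll, Wake, Waker},
};

// Hypothetical waker that does nothing when woken.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical helper: pins two async blocks and polls each once.
fn poll_pinned_async_blocks() -> (Poll<u8>, Poll<u8>) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    // Option 1: pin on the stack with the pin! macro.
    let mut on_stack = pin!(async { 1u8 });
    let first = on_stack.as_mut().poll(&mut cx);

    // Option 2: pin on the heap with Box::pin.
    let mut on_heap: Pin<Box<_>> = Box::pin(async { 2u8 });
    let second = on_heap.as_mut().poll(&mut cx);

    (first, second)
}

fn main() {
    // Neither block awaits anything, so both finish on the first poll.
    assert_eq!(
        poll_pinned_async_blocks(),
        (Poll::Ready(1), Poll::Ready(2))
    );
}
```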
To conclude, here is an example of a truly self-referential future, which we will explore in a follow-up post:
use std::{
future::Future,
pin::Pin,
task::{Context, Poll},
};
// This future is self-referential
async fn some_future() -> String {
let mut mstr = "Hello".to_string();
mstr.push('1');
// let's call this the child future
async {
// On the first poll, execution reaches pending_once().await
// and returns Poll::Pending. At that point, the parent future's
// state contains mstr and also contains the child future, which
// has borrowed mstr. Moving the parent future after this point
// could invalidate that internal reference.
pending_once().await;
mstr.push('2');
}
.await;
mstr.push('3');
mstr
}
// This future is not self-referential
// So this compiles and runs:
// fn main() {
// let mut fut = pending_once();
// let _x = Pin::new(&mut fut);
// }
fn pending_once() -> impl Future<Output = ()> {
struct PendingOnce {
polled_once: bool,
completed: bool,
}
impl Future for PendingOnce {
type Output = ();
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
let this = self.get_mut();
if this.completed {
panic!("future resumed after completion")
} else if this.polled_once {
this.completed = true;
Poll::Ready(())
} else {
cx.waker().wake_by_ref();
this.polled_once = true;
Poll::Pending
}
}
}
PendingOnce {
polled_once: false,
completed: false,
}
}
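To see the two polls described in the comments above, the future can be pinned and driven by hand. The following is a sketch under a few assumptions: NoopWake and drive are hypothetical names, and pending_once is rewritten here as a compact stand-in based on std::future::poll_fn (without the panic on resumption), only to keep the example short and self-contained.

```rust
use std::{
    future::{poll_fn, Future},
    pin::pin,
    sync::Arc,
    task::{Context, Poll, Wake, Waker},
};

// Hypothetical waker that does nothing when woken.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Compact stand-in for pending_once: Pending on the first poll,
// Ready on the second.
fn pending_once() -> impl Future<Output = ()> {
    let mut polled_once = false;
    poll_fn(move |cx| {
        if polled_once {
            Poll::Ready(())
        } else {
            polled_once = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    })
}

// Same body as the self-referential future above.
async fn some_future() -> String {
    let mut mstr = "Hello".to_string();
    mstr.push('1');
    async {
        pending_once().await;
        mstr.push('2');
    }
    .await;
    mstr.push('3');
    mstr
}

// Hypothetical driver: reports whether the first poll was Pending,
// and the output produced by the second poll.
fn drive() -> (bool, String) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    // Pin::new would not compile here; pin! fixes the future in place.
    let mut fut = pin!(some_future());
    let first_pending = fut.as_mut().poll(&mut cx).is_pending();
    let result = match fut.as_mut().poll(&mut cx) {
        Poll::Ready(s) => s,
        Poll::Pending => unreachable!("pending_once is ready on its second poll"),
    };
    (first_pending, result)
}

fn main() {
    let (first_pending, result) = drive();
    assert!(first_pending);
    assert_eq!(result, "Hello123");
}
```

Between the two polls the future holds both mstr and the child future borrowing it, which is exactly the state that must not be moved; pinning it before the first poll is what rules that out.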