In this post, we'll introduce self-referential structs, pinning, futures, and executors, which will be used in a follow-up post.
Self-referential structs
Normally, when you move a struct that you own in Rust, nothing goes wrong, because all of its fields are moved along with it:
struct MyStruct {
field: String,
}
fn main() {
let my_struct = MyStruct {
field: "hei".to_string(),
};
let my_struct_addr = &raw const my_struct;
let field_addr = &raw const my_struct.field;
// move the struct to the heap, then back to the stack
let my_struct = *Box::new(my_struct);
// confirm that the addresses of the struct and its field
// changed after the move
assert_ne!(my_struct_addr, &raw const my_struct);
assert_ne!(field_addr, &raw const my_struct.field);
}
Rust must also prevent you from violating memory safety unless you use unsafe, even if moving a struct leaves a pointer dangling:
struct MyStruct {
field: String,
pointer_field: *const String,
}
fn main() {
let mut my_struct = MyStruct {
field: "hei".to_string(),
pointer_field: std::ptr::null(),
};
// my_struct.pointer_field now points to my_struct.field,
// so the struct is self-referential
my_struct.pointer_field = &raw const my_struct.field;
let field_addr = &raw const my_struct.field;
let pointer_before_move = my_struct.pointer_field;
// move the struct to the heap, then back to the stack
let my_struct = *Box::new(my_struct);
// confirm that the address of the struct field
// before and after the move has changed
assert_ne!(field_addr, &raw const my_struct.field);
// confirm that the pointer didn't change even though the field
// address changed, so this pointer is now dangling!
assert_eq!(pointer_before_move, my_struct.pointer_field);
}
The code above is still okay because a raw pointer is inert as long as it's not dereferenced, and dereferencing it requires unsafe. If you added code that dereferenced this pointer, then either that code should be in an unsafe function (so the caller has to take care of the unsafety) or you need Pin.
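As a minimal sketch of the point about dereferencing, the snippet below reads through pointer_field while the struct has not been moved, which is the only situation in which the pointer is still valid. The helper name read_through_pointer is hypothetical and exists only for this illustration.

```rust
struct MyStruct {
    field: String,
    pointer_field: *const String,
}

// Hypothetical helper for this sketch: builds the struct, makes it
// self-referential, and reads the field back through the raw pointer.
fn read_through_pointer() -> String {
    let mut my_struct = MyStruct {
        field: "hei".to_string(),
        pointer_field: std::ptr::null(),
    };
    my_struct.pointer_field = &raw const my_struct.field;

    // SAFETY: my_struct has not been moved since pointer_field was set,
    // so the pointer still points at the live my_struct.field.
    let via_pointer: &String = unsafe { &*my_struct.pointer_field };
    via_pointer.clone()
}

fn main() {
    assert_eq!(read_through_pointer(), "hei");
}
```

The same dereference after the Box round trip from the previous example would read from the old location, which is exactly the kind of bug the unsafe block makes the programmer responsible for.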
If you try to do something similar with safe Rust, you'll find yourself unable to move the struct:
struct MyStruct<'a> {
field: String,
pointer_field: &'a String,
}
fn main() {
let x = "".to_string();
let mut my_struct = MyStruct {
field: "hei".to_string(),
pointer_field: &x,
};
my_struct.pointer_field = &my_struct.field;
// this won't work: compilation error
// cannot move the struct because it is borrowed
// let my_struct = *Box::new(my_struct);
}
Pinning
So what is Pin? It's basically a struct that helps us create a safe interface for interacting with self-referential objects.
Let's start with the Unpin trait. Most types in Rust are not self-referential and implement Unpin automatically. However, a type will not automatically implement Unpin if it has a field that doesn't implement it. So, let's consider the two cases.
- Implements Unpin: the first case is easy. If a type T implements Unpin, then you can, with safe Rust, convert &mut T to/from Pin<&mut T> with Pin::new and Pin::get_mut/Pin::into_inner. This is useful when a method expects a Pin<&mut T> but we have a &mut T.
- Does not implement Unpin: then it's impossible, with safe Rust, to convert &mut T to Pin<&mut T> directly, but there is a macro, std::pin::pin!, that allows us (with safe Rust) to convert an owned value T to Pin<&mut T>, losing access to ownership permanently (except through unsafe). In this case, it's also impossible, with safe Rust, to convert Pin<&mut T> to &mut T. So, assuming you don't have ownership of T anymore (i.e., you didn't use unsafe Rust to get Pin<&mut T> to begin with), it's not possible to move the value out of its location, so that place in memory is pinned.
So let's illustrate this with an example:
use std::{marker::PhantomPinned, pin::Pin};
#[derive(Debug)]
struct MyStruct {
field: String,
}
#[derive(Debug)]
struct MyStructPP {
field: String,
pin: PhantomPinned,
}
fn main() {
let mut my_struct = MyStruct {
field: "hei".to_string(),
};
let pinned: Pin<&mut MyStruct> = Pin::new(&mut my_struct);
let mut_ref: &mut MyStruct = Pin::into_inner(pinned);
dbg!(mut_ref);
let mut my_struct = MyStructPP {
field: "hei".to_string(),
pin: PhantomPinned,
};
// won't work: compilation error
// let pinned: Pin<&mut MyStructPP> = Pin::new(&mut my_struct);
let pinned: Pin<&mut MyStructPP> = std::pin::pin!(my_struct);
// won't work: compilation error
// the macro took away our owned value of my_struct
// dbg!(my_struct);
// won't work: compilation error
// let mut_ref: &mut MyStructPP = Pin::into_inner(pinned);
// we can only get back to a &mut with unsafe
unsafe {
let pointer_back: &mut MyStructPP = Pin::into_inner_unchecked(pinned);
}
}
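Which of the two cases a type falls into can also be checked at compile time. The sketch below uses a hypothetical helper, assert_unpin, whose only job is to require the Unpin bound. It is an illustration under that assumption, not something the standard library provides.

```rust
use std::marker::PhantomPinned;

#[allow(dead_code)]
struct MyStruct {
    field: String,
}

#[allow(dead_code)]
struct MyStructPP {
    field: String,
    pin: PhantomPinned,
}

// Hypothetical helper: a call only compiles when T implements Unpin.
fn assert_unpin<T: Unpin>() {}

fn main() {
    // Ordinary types implement Unpin automatically
    assert_unpin::<String>();
    assert_unpin::<MyStruct>();

    // won't work: compilation error
    // the PhantomPinned field opts the whole struct out of Unpin
    // assert_unpin::<MyStructPP>();

    // PhantomPinned is a zero-sized marker, so it adds no bytes
    assert_eq!(std::mem::size_of::<PhantomPinned>(), 0);
}
```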
Futures, executors
To begin, here's an ultra-quick intro to futures and executors, assuming you already have some experience with async Rust.
A future in Rust can be created by an async block/closure or an async function:
async fn give_me_some_future() {}
fn main() {
let _some_future = give_me_some_future();
let _another_future = async {};
let an_async_closure = async || {};
let _another_future = an_async_closure();
}
Or manually by implementing the Future trait:
use std::{
pin::Pin,
task::{Context, Poll},
};
fn main() {
let _some_future = SomeFuture(3);
}
struct SomeFuture(u8);
impl Future for SomeFuture {
type Output = u8;
fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u8> {
Poll::Ready(self.0)
}
}
The manually implemented future makes it quite clear that futures are lazy: just creating a future doesn't, in general, do anything by itself. We need to await it, and for that we need an executor. For example, add tokio = { version = "1", features = ["rt", "macros"] } to your Cargo.toml or run cargo add tokio --features=rt,macros, and then you can run this:
use std::{
pin::Pin,
task::{Context, Poll},
};
#[tokio::main(flavor = "current_thread")]
async fn main() {
let some_future = SomeFuture(3);
assert_eq!(some_future.await, 3);
}
struct SomeFuture(u8);
impl Future for SomeFuture {
type Output = u8;
fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u8> {
Poll::Ready(self.0)
}
}
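The laziness of futures can also be observed without any executor. As a small sketch, the hypothetical helper body_ran_without_await below creates an async block with a visible side effect and drops it without ever awaiting it. It needs only the standard library.

```rust
use std::cell::Cell;

// Hypothetical helper: reports whether the body of an async block ran
// when the future was created and then dropped without being awaited.
fn body_ran_without_await() -> bool {
    let ran = Cell::new(false);
    let fut = async {
        ran.set(true);
    };
    // The future is dropped here without ever being polled
    drop(fut);
    ran.get()
}

fn main() {
    // The side effect never happened: creating the future did nothing
    assert!(!body_ran_without_await());
}
```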
What an async executor does, in simple terms, is keep calling the future's poll function until it gets a Poll::Ready.
Note: in practice it's not a busy loop that keeps calling poll until it gets a Poll::Ready. Rather, the executor gives the future a Waker inside a Context, which the future uses to tell the executor that it's ready to be polled again and make progress, but this is (almost) beyond the scope of this post.
So we can simulate this with:
use std::{
pin::Pin,
task::{Context, Poll},
};
// run
// cargo add futures
// or add
// futures = "0.3"
// to your Cargo.toml
use futures::task::noop_waker;
fn main() {
// create a dummy waker and context
// just to make the type system work
let waker = noop_waker();
let mut cx = Context::from_waker(&waker);
let mut fut = SomeFuture(3);
let pfut = Pin::new(&mut fut);
assert_eq!(pfut.poll(&mut cx), Poll::Ready(3))
}
struct SomeFuture(u8);
impl Future for SomeFuture {
type Output = u8;
fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u8> {
Poll::Ready(self.0)
}
}
Or to visualize with something a little more complex:
use std::{
pin::Pin,
task::{Context, Poll},
time::SystemTime,
};
use futures::task::noop_waker;
fn main() {
let waker = noop_waker();
let mut cx = Context::from_waker(&waker);
let mut fut = SomeFuture(0);
let mut pfut = Pin::new(&mut fut);
loop {
match pfut.as_mut().poll(&mut cx) {
Poll::Pending => {}
Poll::Ready(val) => {
println!("\nFinal result: {val}");
break;
}
}
}
}
struct SomeFuture(u64);
impl Future for SomeFuture {
type Output = u64;
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u64> {
// Unix epoch timestamp in nanoseconds
let timestamp = SystemTime::now()
.duration_since(SystemTime::UNIX_EPOCH)
.unwrap()
.as_nanos();
// A rudimentary random number generator
let remainder_of_timestamp_divided_by_200 = timestamp % 200;
if remainder_of_timestamp_divided_by_200 == 0 {
Poll::Ready(self.0)
} else {
self.get_mut().0 += 1;
// Tells the async executor that
// we are ready to be polled again
cx.waker().wake_by_ref();
Poll::Pending
}
}
}
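The loop above polls as fast as it can and ignores the waker. As a rough sketch of the non-busy approach mentioned in the earlier note, the code below parks the thread after a Poll::Pending and only polls again once the waker unparks it. The names ThreadWaker, block_on and CountDown are hypothetical and made up for this illustration. Real executors such as tokio are far more elaborate.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

// Waker that unparks the thread driving the future
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

// Hypothetical single-future executor: polls, then sleeps until woken
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(val) => return val,
            // Returns at once if wake was already called during poll
            Poll::Pending => thread::park(),
        }
    }
}

// Hypothetical future: asks to be polled again `remaining` times and
// resolves to the total number of times it was polled
struct CountDown {
    remaining: u32,
    polls: u32,
}

impl Future for CountDown {
    type Output = u32;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let this = self.get_mut();
        this.polls += 1;
        if this.remaining == 0 {
            Poll::Ready(this.polls)
        } else {
            this.remaining -= 1;
            // Tell the executor we want to be polled again
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

fn main() {
    // Three Pending results followed by one Ready: four polls in total
    assert_eq!(block_on(CountDown { remaining: 3, polls: 0 }), 4);
}
```

If a future returned Poll::Pending without ever calling the waker, this block_on would sleep until a spurious wakeup, which is why the waker is part of the contract between futures and executors.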
Self-referential futures
Not every future is self-referential per se. In fact, all the manually defined futures in the previous sections are not self-referential.
Compiler-generated futures (from async functions and async blocks/closures), however, are always treated as self-referential (they don't implement Unpin), even when they technically are not, so even this won't compile:
let mut fut = async {};
let x = Pin::new(&mut fut); // compile error!
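A compiler-generated future can still be pinned with the std::pin::pin! macro from the pinning section. The sketch below assumes Rust 1.85 or newer for std::task::Waker::noop, and the helper name poll_async_block_once is hypothetical.

```rust
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, Waker};

// Hypothetical helper: pins an async block on the stack and polls it once
fn poll_async_block_once() -> Poll<u8> {
    let fut = async { 1u8 + 2 };
    // Pin::new(&mut fut) would not compile here, but pin! takes
    // ownership of the future and hands back a Pin<&mut _>
    let mut pinned = pin!(fut);
    // A waker that does nothing, just to build a Context
    let mut cx = Context::from_waker(Waker::noop());
    pinned.as_mut().poll(&mut cx)
}

fn main() {
    assert_eq!(poll_async_block_once(), Poll::Ready(3));
}
```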
To conclude, here's an example of a truly self-referential future that we will explore in a follow-up post:
use std::{
pin::Pin,
task::{Context, Poll},
};
// This future is self-referential
async fn some_future() -> String {
let mut mstr = "Hello".to_string();
mstr.push('1');
// let's call this the child future
async {
// when first polling the future, we will stop
// at this point. Here the child future has borrowed mstr
// from the parent future so if we move the parent future
// it will move mstr too, but the child future will have
// the old address of mstr
pending_once().await;
mstr.push('2');
}
.await;
mstr.push('3');
mstr
}
// This future is not self-referential,
// so this compiles and runs:
// fn main() {
// let mut fut = pending_once();
// let _x = Pin::new(&mut fut);
// }
fn pending_once() -> impl Future<Output = ()> {
struct PendingOnce {
polled_once: bool,
completed: bool,
}
impl Future for PendingOnce {
type Output = ();
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
let this = self.get_mut();
if this.completed {
panic!("future resumed after completion")
} else if this.polled_once {
this.completed = true;
Poll::Ready(())
} else {
cx.waker().wake_by_ref();
this.polled_once = true;
Poll::Pending
}
}
}
PendingOnce {
polled_once: false,
completed: false,
}
}
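To see this future run, it can be pinned and polled by hand. The sketch below is self-contained, so it repeats some_future and replaces pending_once with a compact stand-in, PendingOnce, that drops the completed flag. The drive helper is hypothetical, and Waker::noop assumes Rust 1.85 or newer.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::task::{Context, Poll, Waker};

// Compact stand-in for pending_once: Pending on the first poll,
// Ready on the second
struct PendingOnce {
    polled_once: bool,
}

impl Future for PendingOnce {
    type Output = ();
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let this = self.get_mut();
        if this.polled_once {
            Poll::Ready(())
        } else {
            this.polled_once = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

async fn some_future() -> String {
    let mut mstr = "Hello".to_string();
    mstr.push('1');
    async {
        // The first poll stops here, with mstr borrowed by this block
        PendingOnce { polled_once: false }.await;
        mstr.push('2');
    }
    .await;
    mstr.push('3');
    mstr
}

// Hypothetical driver: returns whether the first poll was Pending,
// plus the final output
fn drive() -> (bool, String) {
    // Once pinned, the future can no longer be moved in safe code
    let mut fut = pin!(some_future());
    let mut cx = Context::from_waker(Waker::noop());
    let first_was_pending = fut.as_mut().poll(&mut cx).is_pending();
    let result = loop {
        if let Poll::Ready(s) = fut.as_mut().poll(&mut cx) {
            break s;
        }
    };
    (first_was_pending, result)
}

fn main() {
    let (first_was_pending, result) = drive();
    assert!(first_was_pending);
    assert_eq!(result, "Hello123");
}
```

Between the first and the second poll the future holds a borrow of its own mstr, so the pinning is what keeps that borrow valid.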