Smart Pointers

The most common kind of pointer in Rust is a reference. Smart Pointers not only act like pointers but also have additional metadata and capabilities, like reference counting. There are a few types of smart pointers in Rust.

References borrow data, while smart pointers often own the data.

There are some built-in smart pointers and there are many in crates. We can write our smart pointers as well.

Traits

Smart pointers are usually structs. They implement Deref and Drop traits.

Deref

Deref allows an instance of a struct to behave like a reference, so consuming code can work either with references or smart pointers. It does it by allowing customizing the behavior of the dereference operator (*).

Here’s an example of how dereferencing may be used with smart pointers, same as with references:

fn main() {
  let x = 5;
  let y = Box::new(x);

  assert_eq!(5, x);
  assert_eq!(5, *y); // compiler calls *(y.deref()) behind the scenes
}

Deref trait has one method - deref() - it should return a & (immutable) reference. By using *, the compiler calls deref() behind the scenes to get a reference, and then it knows how to get value behind that reference.

Deref Coercion

Rust can automatically dereference values to proper types if possible. For example, if a function expects &str we can pass it &String. The compiler will call defer() on the &String to get &str. The compiler can call deref() as many times as needed until the proper type is found. E.g., if we had Box<String> it would have to call deref() twice:

&Box<String> -> &String -> &str

Drop

Drop allows writing code that will be run when the type goes out of scope. E.g. it could release some resources like files or connections. It is similar to IDisposable interface in the .NET world. In the case of smart pointers, Drop will deallocate the memory on the heap.

The Drop trait has one method - drop(). The compiler will call it automatically when the value goes out of scope.

Box

It’s the most straightforward smart pointer. It allows storing data on the heap instead of the stack. Only the pointer stays on the stack.

Storing Data on the Heap

let b = Box::new(5); // 5 is on the heap
println!("b = {}", b);

When b goes out of scope, both the Box and i32 it points to get deallocated.

Recursive Types

Data on the stack needs to be of a known size. Not all values are. Recursive types can infinitely contain other values of the same type - they are of unknown size. They should be stored on the heap.

Example:

enum List {
  Cons(i32, List),
  Nil,
}

let list = Cons(1, Cons(2, Cons(3, Nil)));

This enum does not compile; its size is unknown.

Here’s a version with a Box<T>:

enum List {
  Cons(i32, Box<List>),
  Nil,
}

let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));

Only the i32 1 and the first Box<T> are stored on the stack. The rest is on the heap:

Rc

The Rc<T> smart pointer enables multiple ownership - owning a value by multiple bindings. Rc stands for reference counting. Rc<T> keeps track of the number of references to a value. If there are none, the value can be cleaned up safely. Rc<T> is useful when multiple actors in our program will read the data, but we don’t know which one will be the last to do that. Otherwise, we could use the “normal” ownership concepts. Rc allows to have multiple immutable references. Mutable references would bring chaos.

enum List {
  Cons(i32, Rc<List>),
  Nil,
}

use crate::List::{Cons, Nil};
use std::rc::Rc;

fn main() {
  let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil)))));
  let b = Cons(3, Rc::clone(&a));
  let c = Cons(4, Rc::clone(&a));
}

There are 3 references to a, all of them encapsulated within Rc. The value will “live” as long as any Rc instance still points to it.

Weak

Rc allows to create two types of smart pointers:

Rc::clone(&a) - strong reference (Rc<T>)
Rc::downgrade(&a) - weak reference (Weak<T>)

Weak references do not increment the strong_count, they increment weak_count. The weak_count does not need to be 0 for the value being referred to to be dropped.

Since the value behind Weak<T> is uncertain it needs to be retrieved using the upgrade() method that returns Option<T>.

RefCel

Here are the borrowing rules in Rust:

At any given time, you can have either (but not both of) one mutable reference or any number of immutable references.
References must always be valid.

With references and Box<T>, the borrowing rules are enforced at compile time. With RefCell<T>, these invariants are enforced at runtime. With references, if you break these rules, you’ll get a compiler error. With RefCell<T>, if you break these rules, your program will panic and exit.

Because RefCell<T> allows mutable borrows checked at runtime, you can mutate the value inside the RefCell<T> even when the RefCell<T> is immutable. This is interior mutability pattern.

Interior mutability is a design pattern in Rust that allows you to mutate data even when there are immutable references to that data. Normally, it’s disallowed by borrowing rules.

Here’s an example of practical usage of RefCell:

struct MockMessenger {
  sent_messages: RefCell<Vec<String>>,
}

impl MockMessenger {
  fn new() -> MockMessenger {
    MockMessenger {
      sent_messages: RefCell::new(vec![]),
    }
  }
}

impl Messenger for MockMessenger {
  fn send(&self, message: &str) {
    self.sent_messages.borrow_mut().push(String::from(message));
  }
}

The send() message has an immutable reference to self, because it makes sense from the client perspective of that method. Normally, sending data should not modify the state of the sender object. However, our implementation is some mock that is supposed to keep every call to send() for verification later on. RefCell comes into the picture. We can use it as a pointer to a Vec that stores the invocations. We can get a mutable reference to that vector with borrow_mut().

RefCell keeps track of how many mutable and immutable references were taken out of it. RefCell, following the borrowing rules still allows multiple immutable references or only one mutable reference at a time! Breaking these rules results in a panic (at runtime).

← Raw Pointers

Concurrency →