# Smart Pointers

The most common kind of pointer in Rust is a reference. Smart Pointers not only act like pointers but also have additional metadata and capabilities, like reference counting. There are a few types of smart pointers in Rust.

References borrow data, while smart pointers often own the data.

TIP

String and Vec<T> are smart pointers! They own some memory and can manipulate it. They have metadata (like capacity).

There are some built-in smart pointers add there are many in crates. We can write our smart pointers as well.

# Traits

Smart pointers are usually structs. They implement Deref and Drop traits.

# Deref

Deref allows an instance of a struct to behave like a reference, so consuming code can work either with references or smart pointers. It does it by allowing customizing the behavior of the dereference operator (*).

Here's an example of how dereferencing may be used with smart pointers, same as with references:

fn main() {
  let x = 5;
  let y = Box::new(x);

  assert_eq!(5, x);
  assert_eq!(5, *y); // compiler calls *(y.deref()) behind the scenes
}

Deref trait has one method - deref() - it should return a & (immutable) reference. By using *, compiler calls deref() behind the scenes to get a reference, and then it knows how to get value behind that reference.

# Deref Coersion

Rust can automatically dereference values to proper types if possible. For example, if a function expects &str we can pass it &String. The compiler will call defer() on the &String to get &str. Compiler can call deref() as many times as needed until the proper type is found. E.g., if we had Box<String> it would have to call deref() twice:

&Box<String> -> &String -> &str

DerefMut

Similar to Deref is the DerefMut trait. It enables resolving of mutable references mut&.

# Drop

Drop allows writing code that will be run when the a type goes out of scope. E.g. it could release some resources like files or connections. It is similar to IDisposable interface in the .NET world. In the case of smart pointers, Drop will deallocate the memory on the heap.

The Drop trait has one method - drop(). Compiler will call it automatically when the value goes out of scope.

Dropping manually

If we need to destruct an instance before its end of scope, we can call drop(instance). It's not the same drop() as the one implemented via the Drop trait! Calling that one explicitly is not allowed.

# Box

It's the most straightforward smart pointer. It allows storing data on the heap instead of the stack. Only the pointer stays on the stack.

# Storing Data on the Heap

let b = Box::new(5); // 5 is on the heap
println!("b = {}", b);

When b goes out of scope, both the Box and i32 it points to get deallocated.

# Recursive Types

Data on the stack needs to be of a known size. Not all values are. Recursive types can infinitely contain other values of the same type - they are of unknown size. They should be stored on the heap.

Example:

enum List {
  Cons(i32, List),
  Nil,
}

let list = Cons(1, Cons(2, Cons(3, Nil)));

This enum does not compile; its size is unknown.

Here's a version with a Box<T>:

enum List {
  Cons(i32, Box<List>),
  Nil,
}

let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));

Only the i32 1 and the first Box<T> are stored on the stack. The rest is on the heap:

# Rc

The Rc<T> smart pointer enables multiple ownership - owning a value by multiple bindings. Rc stands for reference counting. Rc<T> keeps track of the number of references to a value. If there are none, the value can be cleaned up safely. Rc<T> is useful when multiple actors in our program will read the data, but we don't know which one will be the last to do that. Otherwise, we could use the "normal" ownership concepts. Rc allows to have multiple immutable references. Mutable references would bring chaos.

Multi-threading

Rc<T> is only for single-threaded scenarios!

Graphs

Rc<T> is useful in graph-like structures where nodes may be pointed at by multiple other nodes.

enum List {
  Cons(i32, Rc<List>),
  Nil,
}

use crate::List::{Cons, Nil};
use std::rc::Rc;

fn main() {
  let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil)))));
  let b = Cons(3, Rc::clone(&a));
  let c = Cons(4, Rc::clone(&a));
}

]

There are 3 references to a, all of them encapsulated within Rc. The value will "live" as long as any Rc instance still points to it.

Performance

clone() on Rc is cheap. It just increments reference count. It's cheaper to clone Rc than to clone the actual value that Rc points to.

Check count

We can check the current reference count of Rc with Rc::strong_count(&a), where &a is a reference to an actual instance of Rc.

# Weak

Rc allows to create two types of smart pointers:

  • Rc::clone(&a) - strong reference (Rc<T>)
  • Rc::downgrade(&a) - weak reference (Weak<T>)

Weak references do not increment the strong_count, they increment weak_count. The weak_count does not need to be 0 for the value being referred to to be dropped.

Since the value begind Weak<T> is uncertain it needs to be retrieved using the upgrade() method that returns Option<T>.

Reference Cycles

Weak is useful for protection agains reference cycles. If some node referes to another via Rc, it's a strong connection. If the other node needs to have a reference to the first one as well, it would use Weak. That way, the strong_count stays at 1 and the nodes can be dropped when they go out of scope.

Weak is useful in graphs as an alternative of Rc.

# RefCel

Here are the borrowing rules in Rust:

  • At any given time, you can have either (but not both of) one mutable reference or any number of immutable references.
  • References must always be valid.

With references and Box<T>, the borrowing rules are enforced at compile time. With RefCell<T>, these invariants are enforced at runtime. With references, if you break these rules, you’ll get a compiler error. With RefCell<T>, if you break these rules, your program will panic and exit.

Why?

The RefCell<T> type is useful when you’re sure your code follows the borrowing rules but the compiler is unable to understand and guarantee that.

Multi-threading

Rc<T> is only for single-threaded scenarios!

Because RefCell<T> allows mutable borrows checked at runtime, you can mutate the value inside the RefCell<T> even when the RefCell<T> is immutable. This is interior mutability pattern.

Interior mutability is a design pattern in Rust that allows you to mutate data even when there are immutable references to that data. Normally, it's disallowed by borrowing rules.

Here's an example of practical usage of RefCell:

struct MockMessenger {
  sent_messages: RefCell<Vec<String>>,
}

impl MockMessenger {
  fn new() -> MockMessenger {
    MockMessenger {
      sent_messages: RefCell::new(vec![]),
    }
  }
}

impl Messenger for MockMessenger {
  fn send(&self, message: &str) {
    self.sent_messages.borrow_mut().push(String::from(message));
  }
}

The send() message has an immutable reference to self, because it makes sense from the client perspective of that method. Normally, sending data should not modify the state of the sender object. However, our implementation is some mock that is supposed to keep every call to send() for verification later on. RefCell comes into the picture. We can use it as a pointer to a Vec that stores the invocations. We can get a mutable reference to that vector with borrow_mut().

TIP

RefCell also allows to get an immutable reference using borrow().

RefCell keeps track of how many mutable and immutable references were taken out of it. RefCell, following the borrowing rules still allows multiple immutable references or only one mutable reference at a time! Breaking these rules results in a panic (at runtime).

TIP

We can combine the capabilities of Rc and RefCell to be able to get multiple owners of mutable data. An example is here (opens new window).

Last Updated: 11/28/2022, 9:27:43 AM