Smart Pointers
The most common kind of pointer in Rust is a reference. Smart pointers act like pointers but also carry additional metadata and capabilities, such as reference counting. There are a few types of smart pointers in Rust.
References borrow data, while smart pointers often own the data.
There are some built-in smart pointers, and there are many more in crates. We can write our own smart pointers as well.
Traits
Smart pointers are usually structs. They implement the Deref and Drop traits.
Deref
Deref allows an instance of a struct to behave like a reference, so consuming code can work with either references or smart pointers. It does so by customizing the behavior of the dereference operator (*).
Here’s an example of how dereferencing may be used with smart pointers, same as with references:
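A minimal sketch of such a comparison, assuming a plain i32 value and the standard Box<T> (the function name deref_demo is hypothetical and used only for illustration):

```rust
// Reads the same value through a plain reference and through a Box.
fn deref_demo() -> (i32, i32) {
    let x = 5;
    let y = &x;          // plain reference to x
    let z = Box::new(x); // Box owning a copy of x on the heap
    (*y, *z)             // * is used the same way on both
}

fn main() {
    let (via_reference, via_box) = deref_demo();
    assert_eq!(via_reference, 5);
    assert_eq!(via_box, 5);
}
```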
The Deref trait has one method, deref(), which should return an immutable (&) reference. When * is used, the compiler calls deref() behind the scenes to get a reference, and then it knows how to get the value behind that reference.
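To make the mechanism concrete, here is a small sketch of a custom wrapper implementing Deref. MyBox is a hypothetical type invented for this illustration; it is not a standard library type and it does not allocate on the heap.

```rust
use std::ops::Deref;

// Hypothetical wrapper type, used only for illustration.
struct MyBox<T>(T);

impl<T> Deref for MyBox<T> {
    type Target = T;

    // Returns an immutable reference to the wrapped value.
    fn deref(&self) -> &T {
        &self.0
    }
}

fn main() {
    let boxed = MyBox(5);
    // `*boxed` is treated by the compiler as `*(boxed.deref())`.
    assert_eq!(*boxed, 5);
    assert_eq!(*(boxed.deref()), 5);
}
```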
Deref Coercion
Rust can automatically dereference values to the proper type where possible. For example, if a function expects &str, we can pass it &String. The compiler will call deref() on the &String to get &str. The compiler can call deref() as many times as needed until the proper type is found. E.g., if we had Box<String>, it would have to call deref() twice:
&Box<String> -> &String -> &str
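The coercion chain above can be sketched as follows. The function shout is a hypothetical helper that accepts only &str, so both calls rely on deref coercion.

```rust
// Hypothetical function that only accepts a string slice.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

fn main() {
    let owned = String::from("hello");
    let boxed = Box::new(String::from("world"));

    // &String -> &str (one deref() call)
    assert_eq!(shout(&owned), "HELLO");
    // &Box<String> -> &String -> &str (two deref() calls)
    assert_eq!(shout(&boxed), "WORLD");
}
```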
Drop
Drop allows writing code that runs when a value goes out of scope. E.g., it could release resources such as files or connections. It is similar to the IDisposable interface in the .NET world. In the case of smart pointers, Drop will deallocate the memory on the heap.
The Drop trait has one method, drop(). The compiler calls it automatically when the value goes out of scope.
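A small sketch of Drop in action. Connection is a hypothetical type, and the shared log (a RefCell, covered later in this article) exists only so the calls to drop() can be observed.

```rust
use std::cell::RefCell;

// Hypothetical resource that records its own clean-up in a shared log.
struct Connection<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Connection<'_> {
    // Called automatically when a Connection goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("closed {}", self.name));
    }
}

// Creates two connections in an inner scope and returns the clean-up log.
fn drop_order() -> Vec<String> {
    let log: RefCell<Vec<String>> = RefCell::new(Vec::new());
    {
        let _first = Connection { name: "first", log: &log };
        let _second = Connection { name: "second", log: &log };
        // Both values go out of scope here; drop() runs for each one.
    }
    log.into_inner()
}

fn main() {
    // Local variables are dropped in reverse order of declaration.
    assert_eq!(drop_order(), vec!["closed second", "closed first"]);
}
```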
Box
It’s the most straightforward smart pointer. It allows storing data on the heap instead of the stack. Only the pointer stays on the stack.
Storing Data on the Heap
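A minimal sketch of a boxed value, assuming a binding named b as in the text that follows (the function name boxed_value is hypothetical):

```rust
// Stores an i32 on the heap and reads it back through the Box.
fn boxed_value() -> i32 {
    // The i32 lives on the heap; only the pointer b lives on the stack.
    let b = Box::new(5);
    let value = *b;
    value
    // b goes out of scope at the end of this function.
}

fn main() {
    assert_eq!(boxed_value(), 5);
}
```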
When b goes out of scope, both the Box and the i32 it points to get deallocated.
Recursive Types
Data on the stack needs to be of a known size. Not all values are. Recursive types can infinitely contain other values of the same type - they are of unknown size. They should be stored on the heap.
For example, an enum with a variant that directly contains another value of the same enum does not compile; its size is unknown.
Here’s a version with a Box<T>:
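A sketch of such a type, assuming a cons-list style enum named List. The names List, Cons, Nil and the helper sum are illustrative choices. The rejected, non-boxed definition is kept as a comment for comparison.

```rust
// Without indirection the compiler rejects the type, because its size
// would be infinite:
// enum List { Cons(i32, List), Nil }

// A Box has a known, pointer-sized layout, so this version compiles.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

// Hypothetical helper that walks the list and adds up its values.
fn sum(list: &List) -> i32 {
    match list {
        List::Cons(value, rest) => value + sum(rest),
        List::Nil => 0,
    }
}

fn main() {
    let list = List::Cons(
        1,
        Box::new(List::Cons(2, Box::new(List::Cons(3, Box::new(List::Nil))))),
    );
    assert_eq!(sum(&list), 6);
}
```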
Only the i32 value 1 and the first Box<T> are stored on the stack. The rest is on the heap.
Rc
The Rc<T> smart pointer enables multiple ownership: a value owned by multiple bindings. Rc stands for reference counting. Rc<T> keeps track of the number of references to a value. If there are none, the value can be cleaned up safely. Rc<T> is useful when multiple parts of our program will read the data, but we don’t know which one will be the last to do so. Otherwise, we could use the “normal” ownership concepts. Rc allows multiple immutable references only. Mutable references would bring chaos.
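A minimal sketch of shared ownership, assuming a String value and bindings named a, b and c (the function name strong_counts is hypothetical). Rc::strong_count is used to observe the counter.

```rust
use std::rc::Rc;

// Returns the strong count with three owners and after one of them is dropped.
fn strong_counts() -> (usize, usize) {
    let a = Rc::new(String::from("shared"));
    let b = Rc::clone(&a); // increments the counter; the String is not copied
    let c = Rc::clone(&a);
    let with_three = Rc::strong_count(&a);

    drop(c); // one owner less, the value stays alive
    let with_two = Rc::strong_count(&a);

    // The remaining owners still read the same value.
    assert_eq!(b.as_str(), "shared");
    (with_three, with_two)
}

fn main() {
    assert_eq!(strong_counts(), (3, 2));
}
```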
There are 3 references to a, all of them encapsulated within Rc.
The value will “live” as long as any Rc instance still points to it.
Weak
Rc allows creating two types of smart pointers:
- Rc::clone(&a) - strong reference (Rc<T>)
- Rc::downgrade(&a) - weak reference (Weak<T>)
Weak references do not increment the strong_count; they increment the weak_count. The weak_count does not need to be 0 for the referenced value to be dropped.
Since the value behind a Weak<T> may already have been dropped, it needs to be retrieved using the upgrade() method, which returns Option<Rc<T>>.
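A small sketch of the behavior described above, assuming an Rc<i32> named a (the function name weak_demo is hypothetical). It records both counters and what upgrade() yields before and after the last strong reference is dropped.

```rust
use std::rc::{Rc, Weak};

// Returns (strong_count, weak_count, value while the owner is alive,
// value after the owner is dropped).
fn weak_demo() -> (usize, usize, Option<i32>, Option<i32>) {
    let a = Rc::new(5);
    // downgrade() increments weak_count and leaves strong_count untouched.
    let weak: Weak<i32> = Rc::downgrade(&a);

    let strong = Rc::strong_count(&a);
    let weak_refs = Rc::weak_count(&a);

    // upgrade() yields Some(Rc<i32>) while the value is still alive.
    let while_alive = weak.upgrade().map(|rc| *rc);

    // The last strong reference goes away, so the value is dropped
    // even though weak_count is still 1.
    drop(a);

    // upgrade() now yields None.
    let after_drop = weak.upgrade().map(|rc| *rc);

    (strong, weak_refs, while_alive, after_drop)
}

fn main() {
    assert_eq!(weak_demo(), (1, 1, Some(5), None));
}
```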
RefCell
Here are the borrowing rules in Rust:
- At any given time, you can have either (but not both of) one mutable reference or any number of immutable references.
- References must always be valid.
With references and Box<T>, the borrowing rules are enforced at compile time. With RefCell<T>, these invariants are enforced at runtime. With references, if you break these rules, you’ll get a compiler error. With RefCell<T>, if you break these rules, your program will panic and exit.
Because RefCell<T> allows mutable borrows checked at runtime, you can mutate the value inside the RefCell<T> even when the RefCell<T> is immutable. This is the interior mutability pattern.
Interior mutability is a design pattern in Rust that allows you to mutate data even when there are immutable references to that data. Normally, it’s disallowed by borrowing rules.
Here’s an example of practical usage of RefCell:
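A sketch of such a mock, assuming a trait named Messenger with a send(&self, ...) method. The names Messenger, MockMessenger and sent_messages are illustrative and may differ from the original listing.

```rust
use std::cell::RefCell;

// Hypothetical trait: from the caller's point of view, sending does not
// modify the sender, so send() takes &self.
trait Messenger {
    fn send(&self, message: &str);
}

// Mock that records every message passed to send().
struct MockMessenger {
    sent_messages: RefCell<Vec<String>>,
}

impl MockMessenger {
    fn new() -> Self {
        MockMessenger {
            sent_messages: RefCell::new(Vec::new()),
        }
    }
}

impl Messenger for MockMessenger {
    fn send(&self, message: &str) {
        // borrow_mut() gives mutable access to the Vec even though self is
        // only borrowed immutably; the borrow is checked at runtime.
        self.sent_messages.borrow_mut().push(message.to_string());
    }
}

fn main() {
    let mock = MockMessenger::new();
    mock.send("quota at 75%");
    mock.send("quota at 90%");
    assert_eq!(mock.sent_messages.borrow().len(), 2);
}
```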
The send() method takes an immutable reference to self, because that makes sense from the perspective of the method's caller. Normally, sending data should not modify the state of the sender object. However, our implementation is a mock that is supposed to record every call to send() for later verification.
This is where RefCell comes into the picture. We can use it as a pointer to a Vec that stores the invocations. We can get mutable access to that vector with borrow_mut().
RefCell keeps track of how many mutable and immutable references have been taken out of it. Following the borrowing rules, RefCell still allows either multiple immutable references or only one mutable reference at a time! Breaking these rules results in a panic (at runtime).
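The runtime check can be observed without crashing the program by using try_borrow_mut(), which returns an Err where borrow_mut() would panic. A minimal sketch, with a hypothetical function name borrow_checks:

```rust
use std::cell::RefCell;

// Returns whether a mutable borrow was refused while two immutable borrows
// were alive, plus the contents after a later, permitted mutation.
fn borrow_checks() -> (bool, Vec<i32>) {
    let cell = RefCell::new(vec![1, 2, 3]);

    let refused = {
        // Any number of immutable borrows may coexist.
        let first = cell.borrow();
        let second = cell.borrow();
        assert_eq!(first.len(), second.len());

        // A mutable borrow is refused while immutable borrows are alive.
        // borrow_mut() would panic here; try_borrow_mut() reports an Err.
        let refused = cell.try_borrow_mut().is_err();
        refused
    };

    // All borrows have been released, so mutation is allowed again.
    cell.borrow_mut().push(4);
    (refused, cell.into_inner())
}

fn main() {
    assert_eq!(borrow_checks(), (true, vec![1, 2, 3, 4]));
}
```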