Rust is quite particular when it comes to pointers: it keeps a high-performance scheme while enforcing safe usage patterns (no dangling pointers, etc.).
We will also talk about ownership and lifetimes in this chapter.
First, let’s look for a second at how things are laid out in memory. There are two important memory pools, the stack and the heap.
Functions are individual, reusable pieces of code. If we want to directly edit a variable from the originating (caller) namespace, we need to pass its memory address, i.e. a pointer to that variable, so that the function knows where to find and access it.
Also, when passing huge data structures, you might want to pass a pointer to save on memory copies, but that is a fairly rare use case, as we will see.
Each application uses a fixed-size pool of memory called the stack, allocated when a task starts up. Various mechanisms operate inside the stack, but it mostly hosts the local variables and parameters used during execution. When a function is called, the current registers are saved to memory, the program jumps to the function, and a new stack frame is created for it. The stack also tracks the order in which functions are called so that function returns occur correctly. The default stack size of a Rust task is 2 MiB.
So, we need to pass a memory address in order to be able to edit a variable from the caller’s namespace – that is the main reason why we use pointers.
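As a minimal sketch of that idea, written in current Rust syntax (with `i32` standing in for the `int` type used in this chapter), the hypothetical helper `double_in_place` below receives the address of the caller's variable and edits it through that address:

```rust
// Hypothetical helper: doubles the caller's variable in place.
// `value` holds the address of a variable owned by the caller.
fn double_in_place(value: &mut i32) {
    *value *= 2; // write through the pointer
}

fn main() {
    let mut counter = 21;
    double_in_place(&mut counter); // pass the address, not a copy
    println!("{}", counter);
}
```

Without the pointer, the function would only receive a copy of `counter`, and the caller's variable would stay unchanged.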
A program can also dynamically request memory from the unused memory in the computer, managed by the OS: this is called the heap. All dynamically-sized types (DST) are stored on the heap with an OS allocation.
The heap is not the default memory pool, so it can only be accessed via a pointer located on the stack; such a pointer, together with the heap data it points to, forms a box.
So, an `int` has a fixed size in memory and is stack-allocated. A dynamic array `~[]`, a string `~str` or anything `~` is allocated on the heap (that is where you would use `malloc` in C or `new` in C++).
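For readers on a current toolchain, here is a rough sketch of the same split. It assumes the present-day spellings of these types: `Box<T>` for `~T`, `Vec<T>` for `~[T]`, `String` for `~str`, and `i32` in place of `int`. The function name `make_values` is made up for the illustration.

```rust
// Hypothetical helper that builds one stack value and three heap-backed values.
fn make_values() -> (i32, Box<i32>, String, Vec<i32>) {
    let on_stack: i32 = 3;                     // fixed size, stored in the stack frame
    let boxed: Box<i32> = Box::new(3);         // pointer on the stack, value on the heap
    let text: String = String::from("Hello!"); // heap-allocated string
    let numbers: Vec<i32> = vec![1, 2, 3];     // heap-allocated dynamic array
    (on_stack, boxed, text, numbers)
}

fn main() {
    let (on_stack, boxed, text, numbers) = make_values();
    // `*boxed` follows the stack pointer to the value stored on the heap.
    println!("{} {} {} {:?}", on_stack, *boxed, text, numbers);
}
```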
Rust has two primary pointer types: the borrowed reference (using the `&` symbol) and the owned pointer (indicated by `~`).
Referencing is also called borrowing because creating a reference freezes the target variable, i.e. it can be neither freed nor transferred (so no use-after-free). When the reference goes out of scope, the variable becomes available again to its owner. References use the `&` operator, just like C pointers, or `&mut` for mutability, in which case they must point to uniquely referenced, mutable data. Everything is statically checked.
This would be the most basic example:
let x = 3;
{
let y = &x; // `x` is now frozen and cannot be modified
// ...but it can be read since the borrow is immutable
}
// `x` can be accessed again (`y` is out of scope)
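The snippet above can be turned into a small runnable sketch in current Rust syntax. The function name `read_then_modify` is made up for the illustration; it reads `x` through a borrow in an inner scope, then modifies `x` once the borrow has ended:

```rust
// Hypothetical helper: returns the value seen through the borrow,
// and the value of `x` after it was modified.
fn read_then_modify() -> (i32, i32) {
    let mut x = 3;
    let seen;
    {
        let y = &x; // immutable borrow: `x` may be read, not modified
        seen = *y;
    }
    x += 1; // the borrow is over, so `x` can be modified again
    (seen, x)
}

fn main() {
    let (seen, after) = read_then_modify();
    println!("{} {}", seen, after);
}
```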
A borrow cannot outlive the variable it originates from: it has a lifetime, meaning that you will have to change your allocation pattern if a reference would outlive the variable it points to, like here:
let mut x = &3;
{
let mut y = 4;
x = &y;
} // `y` is freed here, but `x` still lives...
This pattern will be rejected, since `y` has a shorter lifetime than `x`.
The compiler enforces valid references and yields:
error: borrowed value does not live long enough
Note: there are a few cases, though, such as returning a reference that was passed to a function, where you will need to add a lifetime annotation so that the lifetime is taken from the caller; more on that a bit later.
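One way to change the allocation pattern of the rejected example, sketched here in current Rust syntax, is to declare the target in the outer scope so that it lives at least as long as the reference. The function name `read_through_reference` is hypothetical:

```rust
// Hypothetical helper: re-points `x` at a variable that outlives it.
fn read_through_reference() -> i32 {
    let y = 4;      // declared first, in the outer scope: it outlives `x`
    let mut x = &3;
    let first = *x; // read the initial target
    {
        x = &y;     // accepted: `y` is still alive whenever `x` is used
    }
    first + *x
}

fn main() {
    println!("{}", read_through_reference());
}
```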
Referencing is the default choice of pointer: all checks are performed at compile time (by the borrow checker), so its footprint is that of a C pointer (which is also available in Rust as the unsafe `*` pointer).
The unary star operator `*` also serves for dereferencing, like in C.
An owned pointer owns a certain (dynamically allocated) part of the heap, i.e. the owner is the only one who can access the data, unless it transfers ownership (the pointer is copied, but not the value). The compiler frees the memory automatically once the current owner goes out of scope.
fn take<T>(ptr: ~T) { // works for any type, `T`
...
}
let m = ~"Hello!";
take(m);
take_again(m); // ERROR: `m` is moved
Note that owned pointers can, like most objects, be borrowed. You can also copy an owned pointer using `.clone()`, but this is an expensive operation, since the heap value is copied as well.
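The sketch below replays borrowing, cloning and moving on a current toolchain, assuming `Box<T>` as today's spelling of the owned pointer `~T`. The functions `take` and `peek` are made-up helpers that return the string length so the result can be checked:

```rust
// Takes ownership of the box; the heap value is freed when `ptr` goes out of scope.
fn take(ptr: Box<String>) -> usize {
    ptr.len()
}

// Only borrows the box: the caller keeps ownership.
fn peek(ptr: &Box<String>) -> usize {
    ptr.len()
}

fn main() {
    let m = Box::new(String::from("Hello!"));
    let borrowed = peek(&m);      // `m` is still usable afterwards
    let cloned = take(m.clone()); // deep copy: a second heap allocation
    let moved = take(m);          // ownership of `m` is transferred here
    // take(m);                   // would be rejected: `m` has been moved
    println!("{} {} {}", borrowed, cloned, moved);
}
```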
Okay, let’s use a silly example involving a function return:
fn take(x: &int) -> &int {
x
}
fn main() {
let x = 4;
println!("{}", *take(&x));
}
You might expect this to be fine, since `x` is alive for the whole call, but the compiler cannot tell how long the returned reference is supposed to live:
error: cannot infer an appropriate lifetime due to conflicting requirements
It doesn’t end there, though, since we can pass a lifetime parameter from the caller to the function:
fn take<'a>(ptr: &'a int) -> &'a int {
ptr
}
fn main() {
let x = 4;
println!("{}", *take(&x));
}
As you can see here, we define `'a` as a parameter (the single quotation mark prefix denoting a lifetime), and attach it to both the value being passed and the return value. In short, the return value will inherit the lifetime of the parameter.
Since `x` is still alive until the end of `main()`, the caller function, this pattern is valid and typechecks.
So generally speaking, if you want to return a borrowed value (possibly chosen by evaluating a condition, for instance), you will need such a lifetime annotation.
This is particularly useful if you want to modify a variable in place (that is, without having to pass it as a heap pointer), in which case you can take a mutable borrow `&mut` with a lifetime annotation.
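A minimal sketch of such a mutable borrow, in current Rust syntax with `i32` standing in for `int`, could look like this. The helper name `bump` is hypothetical; it increments the caller's variable and hands the same borrow back:

```rust
// Hypothetical helper: increments the target in place and returns the
// same mutable borrow, tied to the lifetime `'a` of the argument.
fn bump<'a>(x: &'a mut i32) -> &'a mut i32 {
    *x += 1;
    x
}

fn main() {
    let mut n = 1;
    *bump(&mut n) += 10; // keep editing `n` through the returned borrow
    println!("{}", n);
}
```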
You can also attach the same lifetime parameter to several variables, in which case the compiler will pick the shortest of their lifetimes. This is useful when your output depends on a few variables:
fn max<'a>(x: &'a int, y: &'a int) -> &'a int {
if *x >= *y {
x
} else {
y
}
}
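Here is a usage sketch of that function in current Rust syntax (`i32` instead of `int`). The two arguments live in different scopes, so the returned reference is only usable within the shorter one; the value is copied out before the inner scope ends:

```rust
fn max<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if *x >= *y {
        x
    } else {
        y
    }
}

fn main() {
    let a = 10;
    let result;
    {
        let b = 20;
        let m = max(&a, &b); // `'a` is limited to the scope of `b`
        result = *m;         // copy the value out while `b` is still alive
    }
    println!("{}", result);
}
```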
Lastly, there is a `'static` lifetime, which applies to values declared outside of any brace scope and never expires. As an example, here is how Rust defines its bug report URL string:
static BUG_REPORT_URL: &'static str = "...url...";
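As a small sketch of how such a value can be used, the example below declares a made-up `GREETING` static and a hypothetical `greeting` function. Because the string lives for the whole program, the function can return a reference to it without taking any lifetime from its caller:

```rust
// Hypothetical static: declared outside of any brace scope.
static GREETING: &'static str = "hello";

// The returned reference never expires, so no input lifetime is needed.
fn greeting() -> &'static str {
    GREETING
}

fn main() {
    println!("{}", greeting());
}
```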