A conspiracy!
As I was reading the Rust API documentation for std::vec::Vec, something interesting caught my eye in the Vec struct definition.
pub struct Vec<T, A = Global>
where
    A: Allocator,
{ /* private fields */ }
I am looking at you, { /* private fields */ }! “What are you trying to hide from me?” I thought. Was I being sucked into a grand conspiracy? The documentation gives no hint as to what these fields are, other than a visual representation of the data structure:
            ptr      len  capacity
       +--------+--------+--------+
       | 0x0123 |      2 |      4 |
       +--------+--------+--------+
            |
            v
Heap   +--------+--------+--------+--------+
       |    'a' |    'b' | uninit | uninit |
       +--------+--------+--------+--------+
Surely we will find that our Vec has three fields: ptr, len, and capacity. But to be sure, we have to go straight to the source. Are you ready to come with me down the rabbit hole and see if we can uncover an age-old mystery?
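Before heading down, the three values from the diagram can already be observed through Vec's public API. The sketch below is only an illustration: the helper name diagram_vec is made up for this example, and the size check assumes the default Global allocator, which takes up no space of its own.

```rust
use std::mem::size_of;

/// Builds the vector from the diagram: room for four chars, two initialized.
/// (`diagram_vec` is a hypothetical helper, not part of the standard library.)
fn diagram_vec() -> Vec<char> {
    let mut v = Vec::with_capacity(4);
    v.push('a');
    v.push('b');
    v
}

fn main() {
    let v = diagram_vec();

    // The address differs from run to run, so it is only printed.
    println!("ptr      = {:p}", v.as_ptr());
    println!("len      = {}", v.len());
    // `with_capacity(4)` guarantees at least 4, not exactly 4.
    println!("capacity = {}", v.capacity());

    // Three pointer-sized values: pointer, capacity, length.
    println!(
        "Vec<char> is {} bytes, usize is {} bytes",
        size_of::<Vec<char>>(),
        size_of::<usize>()
    );
}
```

Whatever the private fields turn out to be called, the handle itself is three machine words wide, which matches the picture in the documentation.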
The many faces of Vec<T>
Diving into the struct definition of Vec in std::vec, this is what we find:
pub struct Vec<T, A: Allocator = Global> {
    buf: RawVec<T, A>,
    len: usize,
}
NOTE
We will be ignoring the Allocator type entirely. This is a topic worthy of its own article.
Yay, we have len! Okay... that was easy. Now we only need ptr and capacity. We might be home very early, right?
No, not really!
“What is this mysterious RawVec<T, A>?” you rightly ask yourself, “and where the hell are ptr and capacity?” Well, let’s follow the breadcrumbs!
If we type RawVec into the search field of the Rust API documentation, we find... nothing!?
I knew it! They really are trying to hide something from us!
Okay, okay... stay calm, don’t stress it! Let’s take a deep breath and look at the source code:
pub(crate) struct RawVec<T, A: Allocator = Global> {
    inner: RawVecInner<A>,
    _marker: PhantomData<T>,
}
Ah, so that’s why we can’t find it in the documentation: it is declared pub(crate), so it is only visible within its own crate and not accessible from outside. Good, one mystery solved, but what in the world is the RawVecInner<A> type now, and what is PhantomData<T>[1]!? How deep does this rabbit hole go?
Looking at RawVecInner<A>, we get a clearer picture:
struct RawVecInner<A: Allocator = Global> {
    ptr: Unique<u8>,
    cap: Cap,
    alloc: A,
}
Haha, no, we don’t... Well, at least somewhat. We finally found our lost ptr and cap (short for capacity)! But both of them are defined by new types. We’re three layers deep now, with no end in sight. But we’ve come this far; we’re not stopping now, are we?
NOTE
Cap is just a type that manages its own min and max bounds, so we won’t go deep into this one. But you can check it out here and here.
So what is Unique<u8>?
pub struct Unique<T: PointeeSized> {
    pointer: NonNull<T>,
    _marker: PhantomData<T>,
}
No surprises here, just another wrapper type: NonNull<T>.
pub struct NonNull<T: PointeeSized> {
    pointer: *const T,
}
Wait?! Are we done? I think we are! Hallelujah, we now have a broad overview of the whole Vec stack! Let’s try to unravel its secrets, shall we?
Understanding Vec’s layers
Our journey looked something like this:
Vec<T> holds a...
RawVec<T>, which holds a...
RawVecInner, which holds a...
Unique<u8>, which holds a...
NonNull<u8>, which holds a...
*const u8 (a raw pointer)
Phew, a lot of abstractions. But what does this tell us? To understand why the engineering team behind the standard library chose to go this route, we first need to learn each layer’s purpose. We will start at the bottom and climb our way up until we reach the top of our Vec<Mountain>!
At the camp
We start at the very bottom. Our foundation is our constant base camp, which defines our entire structure: *const u8.
This is the most primitive way to refer to a location in memory and is also called a raw pointer. It is just a plain memory address, a number. The official documentation tells us it’s a risky tool since it can be null, dangle, or be unaligned. It doesn’t have any lifetime information, so the compiler can’t know if the data it points to is still valid.
Creating or copying a raw pointer is safe, but dereferencing it requires stepping into an unsafe block, telling the compiler, “I know what I’m doing.” It’s the necessary starting point because, to manage memory, you must first be able to talk about memory addresses directly.
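To make the split between safe and unsafe concrete, here is a minimal sketch. The function name read_through_raw is made up for this example; everything else is the standard library.

```rust
/// Turns a reference into a raw pointer and reads the byte back through it.
/// (`read_through_raw` is a hypothetical helper for illustration only.)
fn read_through_raw(value: &u8) -> u8 {
    // Creating the raw pointer is safe: it is just an address.
    let ptr: *const u8 = value;

    // Reading through it is not. The compiler cannot check validity here.
    // SAFETY: `ptr` comes from a live reference, so it is non-null,
    // aligned, and points to an initialized `u8`.
    unsafe { *ptr }
}

fn main() {
    let byte = read_through_raw(&42);
    println!("read back: {byte}");

    // A null raw pointer can be created in safe code as well.
    // Nothing stops us from holding it, only from dereferencing it safely.
    let null: *const u8 = std::ptr::null();
    println!("is null: {}", null.is_null());
}
```

The type *const u8 accepts both pointers equally, which is exactly the gap the next layers close.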
Time to check our tools
The first step in adding safety to our ascent is to make sure we have all the necessary tools at hand! So we check our tool belt and find NonNull<u8>.
A simple but incredibly important wrapper that now secures our *const u8 and provides one crucial guarantee: the pointer is never null. Null pointers are the source of countless bugs and crashes in other languages (the infamous “billion-dollar mistake”). By encoding the non-null guarantee directly into the type system, Rust can eliminate this entire class of errors.
This also unlocks a fantastic compiler optimization. Since a NonNull<T> can never be null, the compiler knows it can use the 0 address to represent the None variant when it sees an Option<NonNull<T>>. This means Option<NonNull<T>> takes up the exact same amount of space as a regular raw pointer! It’s the first step in building our safe abstraction, and a zero-cost one at that.
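Both claims can be checked with a few assertions. This is a small sketch rather than anything taken from the standard library source; the helper name checked is made up for this example.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Wraps a raw pointer, returning `None` if it is null.
/// (`checked` is a hypothetical helper around `NonNull::new`.)
fn checked(ptr: *mut u8) -> Option<NonNull<u8>> {
    NonNull::new(ptr)
}

fn main() {
    // The null address doubles as `None`, so the Option costs nothing extra.
    assert_eq!(size_of::<Option<NonNull<u8>>>(), size_of::<*const u8>());

    // A plain raw pointer has no spare value, so its Option needs extra room.
    assert!(size_of::<Option<*const u8>>() > size_of::<*const u8>());

    // The null check happens once, at construction.
    let mut byte = 7u8;
    assert!(checked(&mut byte).is_some());
    assert!(checked(std::ptr::null_mut()).is_none());

    println!("all checks passed");
}
```

After construction, any code holding a NonNull<u8> can skip the null check, because the type itself carries that fact.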
A unique path uncovers
Continuing our way up we come across a Unique<u8> path... Seems as if we are the only ones that ever climbed here. Time to engrave our names into the stone to make it our own.
This layer builds on NonNull by adding the concept of ownership. It’s a promise to the compiler that not only is the pointer valid and non-null, but that we have exclusive (unique) access to the memory it points to.
The source code documentation for Unique states that it “owns its content,” which is Rust-speak for being responsible for cleaning up (dropping) the data when it’s no longer needed.
As you might know, this is Rust’s way of being memory safe without a garbage collector. By knowing that a Unique pointer is the sole owner, the compiler can enforce strict rules about borrowing and prevent data races. It’s a zero-cost abstraction that exists purely to give the compiler the information it needs to enforce memory safety rules at compile time.
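Unique itself is internal to the standard library, so it cannot be used directly in ordinary code. The sketch below imitates the idea with a made-up type, OwnedPtr, that has the same two fields. As a simplifying assumption it also frees the memory in its own Drop implementation, just to show what "owning" the pointee means in practice; this is not how the standard library type is written.

```rust
use std::marker::PhantomData;
use std::ptr::NonNull;
use std::rc::Rc;

/// Hypothetical stand-in for the internal `Unique<T>` (not the std type).
struct OwnedPtr<T> {
    pointer: NonNull<T>,
    // Zero-sized marker: tells the compiler we logically own a `T`.
    _marker: PhantomData<T>,
}

impl<T> OwnedPtr<T> {
    fn new(value: T) -> Self {
        // Move the value to the heap and take over the raw allocation.
        let raw = Box::into_raw(Box::new(value));
        // SAFETY: `Box::into_raw` never returns a null pointer.
        let pointer = unsafe { NonNull::new_unchecked(raw) };
        OwnedPtr { pointer, _marker: PhantomData }
    }

    fn get(&self) -> &T {
        // SAFETY: we are the sole owner, and the allocation lives until drop.
        unsafe { self.pointer.as_ref() }
    }
}

impl<T> Drop for OwnedPtr<T> {
    fn drop(&mut self) {
        // SAFETY: the pointer came from `Box::into_raw` in `new`
        // and is released exactly once, here.
        unsafe { drop(Box::from_raw(self.pointer.as_ptr())) };
    }
}

fn main() {
    let witness = Rc::new(());
    let owned = OwnedPtr::new(Rc::clone(&witness));
    assert_eq!(Rc::strong_count(owned.get()), 2);

    // Dropping the owner drops the value it points to.
    drop(owned);
    assert_eq!(Rc::strong_count(&witness), 1);
}
```

The Rc is only there as a witness: its reference count falls back to one once the owner goes away, which shows the pointee was cleaned up along with it.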
Eh, can I have some help here?
After the engraving (why would we do this in the middle of a climb?!) we are out of breath and need help from the top. In order to send us down a rope, the helpful people at the top need to first ask for one.
RawVec<T> is a “low-level utility for managing a buffer of memory.” It’s the component that actually talks to the memory allocator: it asks for a block of memory on the heap, grows it when needed (usually by doubling the capacity), and, crucially, deallocates it when the Vec is dropped.
As we have learned already, this one only knows about the capacity (the total allocated space). It does not track how many elements are actually initialized and in use. This deliberate separation of concerns makes RawVec a perfect, reusable building block. Other collection types in the standard library that also need a growable buffer of memory, like VecDeque<T>, can reuse this component instead of reinventing the wheel.
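RawVec is not reachable from outside the standard library, but its growth behaviour shows through Vec::capacity. The sketch below records every capacity a vector passes through while elements are pushed. The function name capacity_steps is made up for this example, and the exact numbers are an implementation detail, so none are promised here.

```rust
/// Pushes `n` elements and records each distinct capacity along the way.
/// (`capacity_steps` is a hypothetical helper for illustration only.)
fn capacity_steps(n: usize) -> Vec<usize> {
    let mut v: Vec<u32> = Vec::new();
    // A fresh vector has not allocated anything yet.
    let mut steps = vec![v.capacity()];

    for i in 0..n {
        v.push(i as u32);
        // Record the capacity only when the buffer was regrown.
        if v.capacity() != *steps.last().unwrap() {
            steps.push(v.capacity());
        }
    }
    steps
}

fn main() {
    let steps = capacity_steps(100);
    println!("capacities seen while pushing 100 elements: {steps:?}");
    println!("reallocations: {}", steps.len() - 1);
}
```

The list of capacities is far shorter than the number of pushes: the buffer grows in large jumps so that most pushes never touch the allocator.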
INFO
Remember that RawVecInner type we saw? That’s a clever compile-time optimization. By splitting the logic, the part that isn’t generic over T (RawVecInner) doesn’t get duplicated for every element type T you use in a Vec<T>, which helps speed up compilation.
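The same pattern can be sketched in a few lines. The types ByteBufInner and TypedBuf below are invented for this illustration and only do capacity bookkeeping; they never store a T, and they assume T is not zero-sized. The point is the shape: the element size travels as a runtime value, so the inner logic is compiled once.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Non-generic core (hypothetical): knows only about bytes, compiled once.
struct ByteBufInner {
    bytes: Vec<u8>,
}

impl ByteBufInner {
    // The element size is a plain argument instead of a type parameter.
    fn reserve_elems(&mut self, count: usize, elem_size: usize) {
        self.bytes.reserve(count * elem_size);
    }

    fn capacity_in_elems(&self, elem_size: usize) -> usize {
        self.bytes.capacity() / elem_size
    }
}

/// Thin generic shell (hypothetical): only forwards `size_of::<T>()`.
struct TypedBuf<T> {
    inner: ByteBufInner,
    _marker: PhantomData<T>,
}

impl<T> TypedBuf<T> {
    fn new() -> Self {
        // This sketch divides by the element size, so rule out size zero.
        assert!(size_of::<T>() != 0, "zero-sized types are not supported");
        TypedBuf { inner: ByteBufInner { bytes: Vec::new() }, _marker: PhantomData }
    }

    fn reserve(&mut self, count: usize) {
        self.inner.reserve_elems(count, size_of::<T>());
    }

    fn capacity(&self) -> usize {
        self.inner.capacity_in_elems(size_of::<T>())
    }
}

fn main() {
    let mut a: TypedBuf<u64> = TypedBuf::new();
    let mut b: TypedBuf<[u8; 3]> = TypedBuf::new();
    a.reserve(10);
    b.reserve(10);
    // Two element types, but both calls run the same `ByteBufInner` code.
    println!("u64 slots: {}, [u8; 3] slots: {}", a.capacity(), b.capacity());
}
```

Only the tiny forwarding methods on TypedBuf are instantiated per element type; the methods on ByteBufInner exist once in the compiled program.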
Reaching the top
At last, we arrive at the Vec<Top>, time to plant some flags!
This is the one that brings everything together. It holds the RawVec (which manages the memory) and adds the final piece of information: len.
Vec acts as the public API that hides all the unsafe complexity we’ve just learned about. It knows how many elements are initialized (len) within the allocated block (capacity). Its responsibility is to ensure that you can only access initialized elements. This gives us the safe methods we love to use every day: push, pop, insert, indexing, and so on.
Each level builds upon the last, adding new guarantees and responsibilities, until we have a completely safe, efficient, and powerful data structure.
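The difference between len and capacity is easy to see from safe code. In the short sketch below (the helper name build is made up for this example), the vector has room for a fourth element, yet the safe API refuses to hand out that uninitialized slot.

```rust
/// Builds ['a', 'b', 'c'] inside a buffer with room for at least four chars.
/// (`build` is a hypothetical helper for illustration only.)
fn build() -> Vec<char> {
    let mut v = Vec::with_capacity(4);
    v.push('a');
    v.push('c');
    v.insert(1, 'b'); // shifts 'c' one slot to the right
    v
}

fn main() {
    let mut v = build();
    assert_eq!(v, ['a', 'b', 'c']);

    // Memory for index 3 is allocated, but nothing was written there.
    assert!(v.capacity() > v.len());
    // Safe access is bounded by `len`, never by `capacity`.
    assert_eq!(v.get(3), None);

    // `pop` hands the last element back and shrinks `len` by one.
    assert_eq!(v.pop(), Some('c'));
    assert_eq!(v.len(), 2);

    println!("len = {}, capacity = {}", v.len(), v.capacity());
}
```

No unsafe block appears anywhere in this code; the bookkeeping that makes it sound lives in the layers underneath.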
An ordinary Vec<Life>
Seems like my life remains somewhat uneventful. There is no big conspiracy and no hidden truth to be found... just very good engineering.
While we were digging deep to learn what Vec<T>’s private fields are, we uncovered something very interesting: good API design. We saw how Rust’s engineers built one of its most common types from the ground up, starting with an unsafe pointer and carefully wrapping it, layer by layer, until it’s perfectly safe and ergonomic.
Every layer is crucial for Vec’s contract to be fulfilled. It’s a testament to the power of abstraction and separation of concerns. The “conspiracy” was about managing complexity, allowing each piece of the puzzle to do one thing and do it well. And in doing so, it provides reusable types that power other parts of the standard library.
Maybe next time, when you push an element to a vector, you’ll know about the stack of abstractions working under the hood to make it all happen safely and efficiently. And that, in itself, is a pretty cool secret to have uncovered.
Conclusion
Uncovering Vec’s inner structure helped me understand a little more of what happens behind the scenes in the standard library.
Unfortunately, we only touched on a fraction of the complex structure that forms Vec (but enough, I hope, to give a better understanding). Explaining every little bit in detail would be too much for this post, so maybe there will be more Under the hood articles that explain Rust’s inner life even better, since obvious questions already arise while reading this article: “How does NonNull guarantee its promise?” or “How does RawVec manage memory?”
Each type we learned about in this article is worth its own post, including the ones we only briefly mentioned (Allocator, PhantomData), so I encourage everyone to check these types out for themselves. Rust’s documentation is phenomenal, and with modern IDE features like go to definition you can easily jump back and forth until you get a clearer picture of the parts that form the public Rust API.
Footnotes
[1] Explaining PhantomData<T> is out of scope for this article, but we might look at it in another one.