In the first part of this series, we saw that objects occupy storage, that storage has duration, and that the type system imposes structure on raw bytes. But we have sidestepped a question that dominates systems programming in practice: when heap-allocated memory must be released, who bears responsibility for releasing it?
Ownership: Who Calls Free?
The stack is self-managing. When a function returns, the stack pointer moves and automatic storage vanishes. There is no decision to make, no function to call, no possibility of error. The heap is different. Memory obtained through malloc persists until someone calls free. The allocator cannot know when we are finished with an allocation; only our program logic knows that. And so the burden falls on us.
This burden carries severe consequences, often exploitable ones. If we free too early, subsequent accesses read garbage or, worse, data that a new allocation has placed there (a use-after-free, the vulnerability class behind a substantial fraction of remote code execution exploits). If we free twice, we corrupt the allocator’s metadata; an attacker who controls the timing can often leverage this into arbitrary write primitives. If we never free, we leak, and the process grows until the operating system intervenes.
The question is how programming languages help us manage this responsibility, or whether they help at all.
C: Manual Management
C offers two primitives: malloc to acquire and free to release. Everything else is convention.
Consider a function that opens a file and allocates a read buffer:
#include <fcntl.h>   /* open, O_RDONLY */
#include <stdlib.h>  /* malloc, free, NULL, size_t */

typedef struct {
    int fd;
    char *buffer;
    size_t capacity;
} FileReader;

FileReader *filereader_open(const char *path, size_t buffer_size) {
    FileReader *reader = malloc(sizeof(FileReader));
    if (!reader) return NULL;
    reader->buffer = malloc(buffer_size);
    if (!reader->buffer) {
        free(reader);
        return NULL;
    }
    reader->fd = open(path, O_RDONLY);
    if (reader->fd < 0) {
        free(reader->buffer);
        free(reader);
        return NULL;
    }
    reader->capacity = buffer_size;
    return reader;
}
We acquire three resources: the struct itself, the buffer, and the file descriptor. Each acquisition can fail, and each failure path must release everything acquired before it. The code above handles this correctly. But observe the shape: the cleanup logic mirrors the acquisition logic in reverse, and we must write it by hand for every function that acquires resources.
The corresponding cleanup function:
void filereader_close(FileReader *reader) {
    if (!reader) return;
    if (reader->fd >= 0) close(reader->fd);
    free(reader->buffer);
    free(reader);
}
What happens if we call filereader_close twice on the same pointer? The first call closes the descriptor and frees the memory. The second call passes the same address to free, corrupting the allocator’s free list. The compiler does not warn us.
What happens if, between filereader_open and filereader_close, we reassign reader->buffer without freeing the old buffer? We leak. The original allocation becomes unreachable.
The fundamental problem is that C pointers carry no ownership semantics. When a function declares void process(FileReader *reader), nothing in that signature tells us whether process will free the reader, expects us to free it afterward, or assumes it will remain valid for some longer duration. The type FileReader * means only “address of a FileReader”; it says nothing about responsibility.
Large C codebases develop conventions to manage this. The Linux kernel uses reference counting for shared structures, with _get and _put suffixes indicating acquisition and release. GLib uses _new for allocation, _free for deallocation, and _ref/_unref for reference-counted objects. These conventions work, but they are conventions, patterns enforced by code review rather than by the compiler. Every violation is a latent bug.
C++: Binding Cleanup to Scope
C++ exploits a property that C has but does not leverage: local variables have a well-defined scope, and when that scope ends, the variable ceases to exist. The language allows us to attach custom cleanup logic to that moment through destructors.
A destructor is a special member function, denoted ~ClassName(), that the compiler calls automatically when an object’s lifetime ends. The call is not optional. It happens regardless of how control flow exits the scope: normal return, early return, exception propagation.
class FileReader {
public:
    explicit FileReader(const char *path, size_t buffer_size)
        : buffer_(new char[buffer_size])
        , capacity_(buffer_size)
        , fd_(::open(path, O_RDONLY))
    {
        if (fd_ < 0) {
            delete[] buffer_;
            throw std::system_error(errno, std::generic_category());
        }
    }

    ~FileReader() {
        if (fd_ >= 0) ::close(fd_);
        delete[] buffer_;
    }

    FileReader(const FileReader &) = delete;
    FileReader &operator=(const FileReader &) = delete;

private:
    char *buffer_;
    size_t capacity_;
    int fd_;
};
When we write:
void process_file(const char *path) {
    FileReader reader(path, 4096);
    do_something(reader);
}
the compiler generates an implicit call to ~FileReader() at the closing brace, regardless of whether do_something returns normally or throws. We did not write this call; the language guarantees it.
The destruction sequence is precise. After executing the destructor body and destroying any automatic objects declared within it, the compiler destroys all non-static data members in reverse order of their declaration, then all direct base classes in reverse order of construction. This reverse ordering matters: later-declared members may depend on earlier ones, so we tear down in the opposite order we built up.
Consider what this means for exception safety. If any operation between resource acquisition and release throws, the destructor still runs during stack unwinding. We do not need explicit cleanup code on every exit path. The resource management logic is written once, in the destructor, and the compiler inserts the call at every point where it is needed.
The standard library provides RAII wrappers for common resources: std::unique_ptr for exclusive ownership of heap memory, std::shared_ptr for reference-counted shared ownership, std::lock_guard for mutexes, std::fstream for files. Using these types, we rarely write new or delete directly.
But RAII in C++ is opt-in. Nothing prevents us from writing:
void leaky() {
    int *p = new int[1000];
    // forgot delete[]
}
Nothing prevents extracting a raw pointer and misusing it:
void dangling() {
    int *raw;
    {
        auto owner = std::make_unique<int>(42);
        raw = owner.get();
    } // owner destroyed here
    *raw = 10; // use-after-free
}
The compiler does not track which pointers own and which merely observe. We can bypass RAII entirely with raw new and delete. We can hold raw pointers past their owners’ lifetimes. We can delete through a base class pointer when the destructor is not virtual, which is undefined behavior even if no resources would leak, because the derived destructor never runs.
C++ gives us the machinery for safe resource management. Using that machinery is a choice the language cannot enforce. A codebase mixing raw pointers, unique_ptr, shared_ptr, and manual new/delete requires reasoning about ownership at every function boundary. The answer lives in programmers’ heads, in comments, in coding standards. This is better than C, where even the machinery does not exist. But it remains insufficient to eliminate memory safety bugs from large codebases.
Rust: Ownership in the Type System
Rust takes the RAII pattern and embeds it into the type system as a non-negotiable rule: every value has exactly one owner, and when that owner goes out of scope, the value is dropped. The compiler verifies this property statically, and no amount of programmer intent can bypass it.
Consider what happens when we allocate a vector:
fn example() {
    let v = vec![1, 2, 3];
}
The binding v owns the heap-allocated buffer. When v goes out of scope at the closing brace, Rust calls drop on the Vec, which deallocates the buffer. We cannot accidentally forget to free the memory.
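To make the timing concrete, here is a minimal sketch with an inner scope; the comments mark the exact points where each buffer is released:
fn example() {
    let outer = vec![1, 2, 3];
    {
        let inner = vec![4, 5, 6];
        println!("{:?} {:?}", outer, inner);
    } // inner's buffer is freed here, at the closing brace of its scope
    println!("{:?}", outer);
} // outer's buffer is freed here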
The critical mechanism is the move. When we assign a value to another binding, ownership transfers:
let v1 = vec![1, 2, 3];
let v2 = v1;
After this assignment, v1 is no longer valid. Any attempt to use it is a compile-time error. This is not a shallow copy followed by invalidation of the source, as in C++ move semantics where the moved-from object remains in a “valid but unspecified state” that we can still inspect and call methods on. In Rust, ownership transfer means the source binding ceases to exist as far as the type system is concerned. The compiler marks it as uninitialized. There is no moved-from state.
Why does this matter? Consider what would happen without this rule. If both v1 and v2 were valid after the assignment, both would attempt to free the same buffer when they went out of scope. The move rule makes double-free impossible: exactly one binding owns the allocation at any time, and exactly one drop occurs.
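The rejection is visible directly in the compiler output. A minimal sketch:
fn main() {
    let v1 = vec![1, 2, 3];
    let v2 = v1; // ownership moves to v2
    // println!("{:?}", v1); // error[E0382]: borrow of moved value: `v1`
    println!("{:?}", v2); // exactly one owner remains, so exactly one drop occurs
}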
The same logic applies to function calls. When we pass a value to a function, ownership transfers to the function’s parameter:
fn consume(v: Vec<i32>) {
    // v is owned here; dropped at end of function
}

fn main() {
    let data = vec![1, 2, 3];
    consume(data);
    // data is no longer valid here
}
The function signature fn consume(v: Vec<i32>) declares that consume takes ownership. The caller cannot use data after the call because ownership has moved. Compare this to fn borrow(v: &Vec<i32>), which borrows without taking ownership, or fn mutate(v: &mut Vec<i32>), which borrows mutably. The type encodes the ownership relationship.
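A short usage sketch contrasting the three signatures; only the consuming call invalidates the caller's binding:
fn consume(v: Vec<i32>) {
    println!("consuming {} elements", v.len());
} // v dropped here

fn borrow(v: &Vec<i32>) -> usize {
    v.len()
}

fn mutate(v: &mut Vec<i32>) {
    v.push(4);
}

fn main() {
    let mut data = vec![1, 2, 3];
    let n = borrow(&data); // data remains valid
    mutate(&mut data);     // data remains valid
    println!("{} elements before the move", n);
    consume(data);         // ownership moves; data is unusable from here on
}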
The Drop Trait and Recursive Destruction
When a value goes out of scope, Rust runs its destructor. For types that implement the Drop trait, this means calling the drop method:
struct FileHandle {
    fd: i32,
}

impl Drop for FileHandle {
    fn drop(&mut self) {
        unsafe { libc::close(self.fd); }
    }
}
The drop method receives &mut self, not self. Because drop only has a mutable reference, we cannot move ownership of the value or its non-Copy fields out of it to dodge the cleanup. When drop returns, the value is gone.
After drop executes, Rust recursively drops all fields of the struct. This is automatic and unavoidable. If we have:
struct Connection {
    socket: TcpStream,
    buffer: Vec<u8>,
}
and Connection does not implement Drop, Rust still drops socket and buffer when a Connection goes out of scope. The compiler generates this “drop glue” for every type that needs it. We do not write boilerplate to drop children; the language handles it. If Connection does implement Drop, Rust first calls our drop method, then drops the fields. We cannot prevent the recursive field drops. After our drop returns, the fields will be dropped regardless of what we did.
The destruction order is deterministic and specified by the language. For local variables, drop order is the reverse of declaration order: later declarations are dropped first. The rationale is that later variables may hold references to earlier ones, so we must destroy the borrowers before the borrowed. For struct fields, drop order is declaration order (not reversed). For tuples, elements drop in order. For arrays, elements drop from index 0 to the end. For enums, only the active variant’s fields are dropped. Closure captures by move are dropped in an unspecified order, so if destruction order among captured values matters, do not rely on closure drop order.
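A small sketch that makes the local-variable and struct-field orderings observable (the Noisy type here is just an illustration):
struct Noisy(&'static str);

impl Drop for Noisy {
    fn drop(&mut self) {
        println!("dropping {}", self.0);
    }
}

struct Pair {
    first: Noisy,
    second: Noisy,
}

fn main() {
    let a = Noisy("local a");
    let b = Noisy("local b");
    let pair = Pair {
        first: Noisy("field first"),
        second: Noisy("field second"),
    };
    let _ = (&a, &b, &pair); // keep the bindings "used"
    // Output:
    //   dropping field first    `pair` drops first (declared last);
    //   dropping field second   its fields drop in declaration order
    //   dropping local b        then the remaining locals, latest first
    //   dropping local a
}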
ManuallyDrop and Suppressing Automatic Destruction
The recursive drop behavior creates a problem when we need fine-grained control over destruction. Consider a type that wraps a Box and wants to deallocate the contents in a custom way:
struct SuperBox<T> {
    my_box: Box<T>,
}

impl<T> Drop for SuperBox<T> {
    fn drop(&mut self) {
        unsafe {
            // Deallocate the box's contents ourselves
            let ptr = Box::into_raw(self.my_box); // ERROR: cannot move out of &mut self
            std::alloc::dealloc(ptr as *mut u8, Layout::new::<T>());
        }
    }
}
This does not compile. We cannot move self.my_box out of &mut self. And even if we could, after our drop returns, Rust would try to drop my_box again, causing a double-free.
One solution is Option:
struct SuperBox<T> {
    my_box: Option<Box<T>>,
}

impl<T> Drop for SuperBox<T> {
    fn drop(&mut self) {
        if let Some(b) = self.my_box.take() {
            // Handle b ourselves; self.my_box is now None
            // When Rust drops self.my_box, it drops None, which does nothing
        }
    }
}
This works, but it pollutes our type with Option semantics. A field that should always be Some must be declared as Option solely because of what happens in the destructor. Every access to my_box elsewhere in the code must handle the None case that should never occur outside of drop.
ManuallyDrop<T> provides a cleaner solution. It is a wrapper that suppresses automatic drop for its contents:
use std::mem::ManuallyDrop;

struct SuperBox<T> {
    my_box: ManuallyDrop<Box<T>>,
}

impl<T> Drop for SuperBox<T> {
    fn drop(&mut self) {
        unsafe {
            // Take ownership of the inner Box
            let b = ManuallyDrop::take(&mut self.my_box);
            // Now we own b and can do whatever we want
            // When our drop returns, Rust will "drop" self.my_box,
            // but ManuallyDrop's drop is a no-op
        }
    }
}
ManuallyDrop<T> has the same size and alignment as T. It implements Deref and DerefMut, so we can use the wrapped value normally. But when Rust drops a ManuallyDrop<T>, nothing happens. The inner T is not dropped. We are responsible for dropping it manually via ManuallyDrop::drop(&mut x) or taking ownership via ManuallyDrop::take(&mut x).
This is useful beyond custom destructors. When we need to move a value out of a context where Rust would normally drop it, ManuallyDrop lets us suppress that drop and handle the value ourselves. The unsafe is required because we are taking responsibility for ensuring the value is eventually dropped (or intentionally leaked).
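For instance, here is a sketch of handing a Vec's buffer over to code that will manage it by hand (the function name is hypothetical; this mirrors what the standard library's into_raw_parts does):
use std::mem::ManuallyDrop;

fn into_raw_parts(v: Vec<u8>) -> (*mut u8, usize, usize) {
    let mut v = ManuallyDrop::new(v); // suppress the Vec's destructor
    (v.as_mut_ptr(), v.len(), v.capacity())
    // The buffer is now our responsibility: rebuild it later with
    // Vec::from_raw_parts and drop it, or it leaks.
}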
Drop Flags and Conditional Initialization
Rust tracks initialization state at compile time when possible. But consider:
let x;
if condition {
    x = Box::new(0);
}
At the end of scope, should Rust drop x? It depends on whether condition was true. When the compiler cannot determine initialization state statically, it inserts a runtime drop flag: a hidden boolean that tracks whether the value was initialized. At scope exit, Rust checks the flag before calling drop.
For straight-line code and branches that initialize consistently, the compiler can eliminate these flags through static analysis. The flags exist only when genuinely necessary, and the generated code checks them only at points where the initialization state is ambiguous.
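A sketch that makes the flag's effect observable: the destructor runs only on the path where the value was actually initialized.
struct Flagged;

impl Drop for Flagged {
    fn drop(&mut self) {
        println!("dropped");
    }
}

fn maybe_init(condition: bool) {
    let x;
    if condition {
        x = Flagged;
        println!("initialized");
        let _ = &x; // read x so the assignment is not flagged as unused
    }
} // drop flag checked here: "dropped" prints only when condition was true

fn main() {
    maybe_init(true);  // initialized, dropped
    maybe_init(false); // prints nothing
}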
You Cannot Call Drop Directly
Rust prevents explicit calls to the Drop::drop method:
let v = vec![1, 2, 3];
v.drop(); // error: explicit use of destructor method
If we could call drop directly, the value would still be in scope afterward. When the scope ends, Rust would call drop again. Instead, we use std::mem::drop for early cleanup:
let v = vec![1, 2, 3];
drop(v); // v is moved into drop() and dropped there
The drop function takes ownership by value: fn drop<T>(_: T) {}. The value moves into the function and is dropped when the function returns. Since ownership moved, the original binding is invalid and no second drop occurs.
Drop Check: When Lifetimes and Destructors Collide
We now arrive at a subtle interaction between Rust’s lifetime system and its destructor semantics. Consider this seemingly innocuous code:
struct Inspector<'a>(&'a u8);

struct World<'a> {
    inspector: Option<Inspector<'a>>,
    days: Box<u8>,
}

fn main() {
    let mut world = World {
        inspector: None,
        days: Box::new(1),
    };
    world.inspector = Some(Inspector(&world.days));
}
This compiles. The Inspector holds a reference to days, both are fields of World, and when world goes out of scope, both are dropped. The fact that days does not strictly outlive inspector does not matter here because neither has a destructor that could observe the other.
But watch what happens when we add a Drop impl to Inspector:
struct Inspector<'a>(&'a u8);

impl<'a> Drop for Inspector<'a> {
    fn drop(&mut self) {
        println!("I was only {} days from retirement!", self.0);
    }
}

struct World<'a> {
    inspector: Option<Inspector<'a>>,
    days: Box<u8>,
}

fn main() {
    let mut world = World {
        inspector: None,
        days: Box::new(1),
    };
    world.inspector = Some(Inspector(&world.days));
}
This does not compile:
error[E0597]: `world.days` does not live long enough
What changed? The Drop implementation. When Inspector has a destructor, that destructor might access the reference it holds. If days is dropped before inspector, the destructor would dereference freed memory. The borrow checker must now enforce that any data borrowed by Inspector outlives the Inspector itself, because the destructor could observe that data.
This is the drop checker (dropck). It enforces a stricter rule when types have destructors: for a generic type to soundly implement drop, its generic arguments must strictly outlive it. Not live at least as long as, but strictly outlive. The difference is subtle. Without a destructor, two values can go out of scope simultaneously because neither observes the other during destruction. With a destructor, the type with the destructor might observe its borrowed data, so that data must still be valid when the destructor runs.
Only generic types need to worry about this. If a type is not generic, the only lifetimes it can contain are 'static, which truly lives forever. The problem arises when a type is generic over a lifetime or a type parameter, and it implements Drop, and that Drop could potentially access the borrowed data.
The #[may_dangle] Escape Hatch
The drop checker is conservative. It assumes that any Drop impl for a generic type might access data of the generic parameter. But this is often not true. Consider Vec<T>:
impl<T> Drop for Vec<T> {
    fn drop(&mut self) {
        // Deallocate the buffer
        // We DO drop each T element, but we don't "use" T in the sense
        // of accessing references that T might contain
    }
}
When we drop a Vec<&'a str>, we deallocate the buffer. We do call drop on each &'a str element, but &'a str has no destructor; its drop is a no-op. The Vec’s drop does not actually dereference those &'a str values. It does not read the strings. It just deallocates the backing memory.
Yet the drop checker does not know this. It sees impl<'a> Drop for Vec<&'a str> and concludes that 'a must strictly outlive the Vec. This prevents code like:
fn main() {
    let mut v: Vec<&str> = Vec::new();
    let s: String = "Short-lived".into();
    v.push(&s);
    drop(s); // s dropped while v still holds a reference
}
This is correctly rejected. But it also rejects:
fn main() {
    let mut v: Vec<&str> = Vec::new();
    let s: String = "Short-lived".into();
    v.push(&s);
} // s and v dropped here, but in what order?
The second example should be fine. Both v and s are dropped at the end of main. Variables are dropped in reverse declaration order, so s drops first, then v. When v is finally dropped, the references it holds dangle, but v’s destruction never dereferences them: dropping a &str is a no-op, and deallocating the backing buffer does not read the strings.
To allow this pattern, Rust provides an unstable, unsafe attribute: #[may_dangle]. It tells the drop checker “I promise not to access this generic parameter in my destructor”:
unsafe impl<#[may_dangle] T> Drop for Vec<T> {
    fn drop(&mut self) {
        // ...
    }
}
The #[may_dangle] on T is a promise that the Drop impl does not access T values in a way that requires them to be valid. The drop checker relaxes its requirements: T may now dangle (be a reference to freed memory) when the Vec is dropped.
But this is a lie, or at least an incomplete truth. Vec<T>’s destructor does drop each T element. If T has a destructor that accesses borrowed data, that data must still be valid. The promise is more precisely: “I do not access T myself, but I may trigger T’s destructor.” This distinction matters for the soundness of the overall system.
PhantomData and Drop Check Interaction
When we use #[may_dangle], we are opting out of the drop checker’s conservative assumptions. But we must opt back in for cases where we do transitively drop T. This is where PhantomData enters the picture.
Consider a simplified version of Vec’s layout (the standard library wraps the pointer in more layers, but the shape is the same):
struct Vec<T> {
    ptr: *const T, // raw pointer, no ownership semantics
    len: usize,
    cap: usize,
}
A raw pointer *const T does not imply ownership. The drop checker does not assume that Vec owns T values just because it contains a *const T. If we write:
unsafe impl<#[may_dangle] T> Drop for Vec<T> {
    fn drop(&mut self) {
        // drop elements, deallocate buffer
    }
}
we have told the drop checker that T may dangle. But Vec does own T values and does drop them. If T is PrintOnDrop<'a> (a type with a destructor that dereferences 'a), then 'a must be valid when Vec drops, because Vec will invoke T’s destructor.
We need to tell the drop checker: “T may dangle, except if T itself has drop glue that would observe borrowed data.” The mechanism for this is PhantomData<T>:
use std::marker::PhantomData;

struct Vec<T> {
    ptr: *const T,
    len: usize,
    cap: usize,
    _marker: PhantomData<T>,
}
PhantomData<T> is a zero-sized type that tells the compiler “act as if this struct owns a T.” It affects variance, auto-trait inference, and critically, drop check. When the drop checker sees PhantomData<T> in a struct, it knows that dropping the struct may involve dropping T values.
The interaction is:
- #[may_dangle] T on the Drop impl says “I promise not to access T directly in my destructor.”
- PhantomData<T> in the struct says “but I do own T values and will drop them.”
- The drop checker combines these: T itself may dangle (its references may be invalid), but if T has drop glue, that glue will run, and whatever T’s glue accesses must still be valid.
This is subtle. Suppose T is &'a str. The type &'a str has no destructor (dropping a reference is a no-op). So PhantomData<&'a str> does not introduce additional constraints. The #[may_dangle] applies fully, and 'a may dangle.
Now suppose T is PrintOnDrop<'a>, a type with a destructor that dereferences 'a. The PhantomData<PrintOnDrop<'a>> tells the drop checker that Vec will drop PrintOnDrop<'a> values. The #[may_dangle] says Vec itself won’t access 'a. But dropping PrintOnDrop<'a> will access 'a. So 'a must still be valid when Vec is dropped, despite the #[may_dangle].
The rules compose correctly. The #[may_dangle] is not a blanket permission to let everything dangle. It is permission for the specific Drop impl to not access the parameter, combined with the PhantomData indicating that transitive drops may still occur.
Without #[may_dangle], implementing a collection like Vec would be unnecessarily restrictive. Every Vec<&'a T> would require 'a to strictly outlive the Vec, even when the Vec is dropped before the borrowed data goes out of scope. The combination of #[may_dangle] and PhantomData allows the standard library to express the precise ownership semantics: “we will drop T values, but we do not otherwise observe them in our destructor.”
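A sketch of the two cases side by side, using a hypothetical PrintOnDrop type whose destructor reads its borrowed data:
#[allow(dead_code)] // only referenced in the commented-out rejected case
struct PrintOnDrop<'a>(&'a str);

impl<'a> Drop for PrintOnDrop<'a> {
    fn drop(&mut self) {
        println!("dropping: {}", self.0); // dereferences the borrow
    }
}

fn main() {
    // Accepted: &str has no drop glue, so #[may_dangle] applies fully.
    // `s1` drops before `v1`, and nothing ever reads the dangling references.
    let mut v1: Vec<&str> = Vec::new();
    let s1 = String::from("no destructor");
    v1.push(&s1);

    // Rejected: dropping the Vec runs PrintOnDrop's destructor, which reads
    // the borrow, so the borrowed String must strictly outlive the Vec.
    // let mut v2: Vec<PrintOnDrop> = Vec::new();
    // let s2 = String::from("has destructor");
    // v2.push(PrintOnDrop(&s2)); // error[E0597]: `s2` does not live long enough
}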
Leaks Are Safe
Here we encounter a design decision that surprises many: leaking memory is safe in Rust. The function std::mem::forget takes ownership of a value and does not run its destructor:
let v = vec![1, 2, 3];
std::mem::forget(v);
// v's destructor never runs; the heap allocation is leaked
This is a safe function. It does not require unsafe. The reasoning is that safe in Rust means cannot cause undefined behavior. Leaking memory is wasteful, but it does not corrupt memory, create dangling pointers, or cause data races. A program that leaks is buggy but not undefined.
Moreover, leaks can occur in safe code without calling forget. The simplest example is a reference cycle with Rc:
use std::cell::RefCell;
use std::rc::Rc;

struct Node {
    next: Option<Rc<RefCell<Node>>>,
}

fn create_cycle() {
    let a = Rc::new(RefCell::new(Node { next: None }));
    let b = Rc::new(RefCell::new(Node { next: Some(a.clone()) }));
    a.borrow_mut().next = Some(b.clone());
}
When create_cycle returns, the Rcs go out of scope, but each has a reference count of 2 due to the cycle. The count decrements to 1, not 0. The nodes are never freed. This is safe code with no unsafe blocks, and it leaks.
Since leaks are possible in safe code, the language cannot assume destructors always run. This has profound implications for unsafe code. Any type whose safety invariants depend on the destructor running is unsound in the presence of mem::forget or cycles.
The standard library learned this lesson with thread::scoped, an API that allowed spawning threads referencing stack data. It relied on a guard whose destructor joined the thread:
pub fn scoped<'a, F>(f: F) -> JoinGuard<'a>
    where F: FnOnce() + Send + 'a
The guard’s lifetime tied it to the borrowed data. When the guard dropped, it joined the thread, ensuring the thread finished before the data went out of scope. But mem::forget(guard) prevented the destructor from running. The thread would continue executing while the stack data it referenced was freed. Use-after-free from safe code. The API was unsound and had to be removed.
The correct design principle is that unsafe code cannot rely on destructors running to maintain safety invariants. Safe abstractions must account for the possibility that destructors are skipped. The standard library’s Vec::drain is a good example. Draining a vector moves elements out one at a time. If the drain iterator is forgotten mid-iteration, some elements have been moved out and the vector’s length is wrong. Rather than leaving the vector in an inconsistent state, Drain sets the vector’s length to zero at the start of iteration. If Drain is forgotten, the remaining elements leak (their destructors do not run, their memory is not reused), but the vector is in a valid state (empty). Leaks amplify leaks, but undefined behavior does not occur.
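The behavior is easy to observe; a minimal sketch:
fn main() {
    let mut v = vec![String::from("a"), String::from("b"), String::from("c")];
    let drain = v.drain(..);   // the vector's length is set to zero up front
    std::mem::forget(drain);   // the Drain's destructor never runs
    assert!(v.is_empty());     // the three strings leak, but v is valid
    v.push(String::from("d")); // and still perfectly usable
    println!("{:?}", v);
}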
Beyond RAII
We have built a comprehensive picture of ownership. Rust makes ownership tracking mandatory and statically verified. Yet even Rust’s model has limits.
The Single-Destructor Problem
The RAII model, whether in C++ or Rust, shares a structural limitation: the destructor is a single, parameterless function that returns nothing. When an object goes out of scope, exactly one action occurs. We cannot choose between alternatives. We cannot pass runtime information to the cleanup logic. We cannot receive a result from it.
For many resources this constraint is invisible. A file handle has one sensible cleanup action: close the descriptor. A heap allocation has one cleanup action: free the memory. A mutex guard has one cleanup action: unlock the mutex. The destructor does the obvious thing, and the single-destructor model works well.
But consider a database transaction. A transaction must eventually either commit (make changes permanent) or rollback (discard changes). These are fundamentally different operations with different semantics, different failure modes, and often different parameters. A commit might require a priority level. A rollback might need to log the reason for abandonment. The destructor cannot accommodate this. It must pick one.
The standard workaround in C++ and Rust is to default to rollback and provide an explicit commit method:
class Transaction {
public:
    explicit Transaction(Database& db) : db_(db), committed_(false) {
        db_.begin();
    }

    void commit() {
        db_.commit();
        committed_ = true;
    }

    ~Transaction() {
        if (!committed_) {
            db_.rollback();
        }
    }

private:
    Database& db_;
    bool committed_;
};
We call commit() when the transaction succeeds and let the destructor rollback on error paths or early returns. Exception safety falls out naturally; if an exception propagates, the destructor runs, and uncommitted transactions rollback.
The problem is that forgetting to call commit() is not a compile-time error. If we write a function that successfully completes its work but neglects to call commit(), the destructor happily rolls back. The program is wrong, but the compiler cannot tell us. We have traded one category of bug (forgetting to cleanup) for another (forgetting to finalize). The second category is arguably more insidious because the cleanup happens, silently doing the wrong thing.
Rust’s ownership system does not help here:
struct Transaction<'a> {
    db: &'a mut Database,
    committed: bool,
}

impl<'a> Transaction<'a> {
    fn commit(mut self) {
        self.db.commit();
        self.committed = true;
    }
}

impl<'a> Drop for Transaction<'a> {
    fn drop(&mut self) {
        if !self.committed {
            self.db.rollback();
        }
    }
}
The commit method takes self by value, consuming the transaction. After calling commit, the binding is gone. But if we never call commit, the transaction goes out of scope, drop runs, and we rollback. No compiler error. The type system tracked ownership but not obligation.
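A sketch of the failure mode, reusing the Transaction type above (Database is the same stand-in the examples assume):
fn update_balance(db: &mut Database) {
    let txn = Transaction { db, committed: false };
    let _ = &txn; // ... writes would go through txn here ...
    // We meant to call txn.commit() and forgot; this still compiles.
} // drop runs at this brace and silently rolls back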
Defer: Explicit Scope-Bound Cleanup
Some languages take a different approach. Rather than binding cleanup to object destruction, they provide explicit defer statements that execute at scope exit.
Zig’s defer runs an expression unconditionally when control leaves the enclosing block:
fn processFile(path: []const u8) !void {
    const file = try std.fs.cwd().openFile(path, .{});
    defer file.close();

    const buffer = try allocator.alloc(u8, 4096);
    defer allocator.free(buffer);

    // work with file and buffer
    // both are cleaned up when we exit, regardless of how
}
The cleanup code sits immediately after the acquisition code. We see allocation and deallocation together, which aids comprehension. Zig extends this with errdefer, which executes only if the function returns an error, separating error-path cleanup from success-path transfer.
The defer model has a structural limitation that RAII does not: the cleanup is scope-bound. We cannot return a resource to our caller the way we return an RAII object. The defer runs when the current scope exits, period. RAII is more flexible; we can return an RAII object, store it in a data structure, transfer ownership to another thread. The cleanup travels with the object. Defer is local; RAII is transferable.
But defer has an advantage in simplicity. We do not need to define a type, implement a trait, worry about drop order among struct fields. We write the cleanup code inline, at the point of acquisition. For resources that do not leave the current function, defer is often cleaner.
Linear Types: Enforcing Explicit Consumption
The transaction example shows a gap in RAII’s guarantees. RAII ensures that cleanup happens, but not that we made an explicit decision about which cleanup to perform. The destructor chooses for us, silently.
Linear types close this gap. A linear type must be explicitly consumed; it cannot simply go out of scope. If we try to let a linear value fall out of scope without passing it to a consuming function, the compiler rejects the program.
Consider a hypothetical extension to Rust:
// Hypothetical syntax
#[linear]
struct Transaction { db: Database }

fn commit(txn: Transaction) -> Result<(), Error> {
    txn.db.commit()?;
    destruct txn; // explicitly consume
    Ok(())
}

fn rollback(txn: Transaction, reason: &str) {
    txn.db.rollback();
    destruct txn; // explicitly consume
}

fn do_work(db: Database) {
    let txn = Transaction { db };
    // ERROR: `txn` goes out of scope without being consumed
    // must call either commit(txn) or rollback(txn, ...)
}
The transaction cannot be ignored. Forgetting to decide is a compile-time error, not a runtime silent rollback. Languages with linear types (Vale, Austral, and to some extent Haskell with LinearTypes) can express patterns that RAII cannot.
Why Rust Does Not Have Linear Types
Rust’s types are affine, not linear. An affine type can be used at most once; a linear type must be used exactly once. The difference matters when panics unwind the stack.
Rust permits silent dropping because Drop::drop must always be callable. When a scope exits, when a panic unwinds, when we reassign a variable, Rust calls drop. The entire language assumes that any value can be dropped at any time.
Linear types would break this assumption. If a value must be explicitly consumed, what happens when a panic unwinds through a function holding that value? The unwinding code cannot know which consuming function to call, what parameters to pass, what to do with the return value. Every generic container, every iterator adapter, every function that might discard a value would need to be reconsidered.
Rust chose affine types, accepting that the compiler cannot enforce explicit consumption but gaining a simpler model where any value can be dropped. The #[must_use] attribute provides a weaker guarantee: a warning (not an error) if a value is unused. It catches some mistakes but does not provide the hard guarantee that linear types would.
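A sketch of that weaker guarantee: the attribute produces a warning, not an error, when the value is discarded.
#[must_use = "a Transaction does nothing unless committed or rolled back"]
struct Transaction;

fn begin() -> Transaction {
    Transaction
}

fn main() {
    begin(); // warning: unused `Transaction` that must be used; still compiles
}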
When Things Go Wrong
RAII ensures cleanup happens when scope ends. But it says nothing about how we signal failures. When a function cannot complete its task, it must communicate this to the caller, and the caller must have a chance to respond. The three languages take fundamentally different approaches to this problem, and the differences reveal deep assumptions about what error handling should look like.
C: Return Codes and errno
C library functions indicate failure through their return values, but the convention varies by function and must be looked up individually. The C standard defines several patterns. Functions like fopen return a null pointer on failure; the null serves as an out-of-band sentinel that cannot be confused with a valid result. Functions like puts and fclose return EOF, a special error code, on failure. Functions like fgetpos and fsetpos return a nonzero value on failure, where zero indicates success and the actual return value is otherwise unneeded. Functions like thrd_create return a special success code, and any other value indicates a specific failure condition. Functions like printf return a negative value on failure, where success produces a nonnegative count.
if (puts("hello world") == EOF) {
    perror("can't output to terminal:");
    exit(EXIT_FAILURE);
}
The perror function prints a diagnostic message based on the current value of errno, a thread-local variable that library functions set when they fail. The combination of checking the return value and consulting errno provides both detection and diagnosis. But errno is fragile. It must be checked immediately after the failing call. Any intervening function call, even a successful one, may modify errno. If we need to preserve error information across other operations, we must copy it:
FILE *f = fopen(path, "r");
if (!f) {
    int saved = errno;
    log_message("fopen failed"); // might modify errno
    errno = saved;
    return -1;
}
The more serious problem is propagation. Consider a function that opens a file, allocates a buffer, reads data, and processes it. Each step can fail. Each failure requires releasing resources acquired in previous steps:
int process_file(const char *path) {
    FILE *f = fopen(path, "r");
    if (!f) return -1;
    char *buf = malloc(4096);
    if (!buf) {
        fclose(f);
        return -1;
    }
    if (fread(buf, 1, 4096, f) == 0 && ferror(f)) {
        free(buf);
        fclose(f);
        return -1;
    }
    // process buf...
    free(buf);
    fclose(f);
    return 0;
}
The cleanup code is duplicated at each error site. If we add a fourth resource, we must update three error paths. The duplication invites mistakes. Miss one cleanup and we leak.
The goto statement consolidates cleanup into a single location:
int process_file(const char *path) {
    int result = -1;
    FILE *f = NULL;
    char *buf = NULL;

    f = fopen(path, "r");
    if (!f) goto cleanup;
    buf = malloc(4096);
    if (!buf) goto cleanup;
    if (fread(buf, 1, 4096, f) == 0 && ferror(f)) goto cleanup;
    // process buf...
    result = 0;

cleanup:
    free(buf);
    if (f) fclose(f);
    return result;
}
All resources must be initialized to null at the top. The cleanup block must handle partially-initialized state; free(NULL) is defined to do nothing, but fclose(NULL) is undefined behavior, hence the explicit check. The pattern requires discipline. Reviewers must verify that every resource acquired before a goto is released in the cleanup block, and that the cleanup block handles every possible partial state.
For errors that cannot be handled locally, C provides setjmp and longjmp. The setjmp macro marks a location in the code; longjmp transfers control directly to that location, unwinding the call stack without executing intervening code. This is C’s mechanism for non-local jumps, used when an error deep in a call chain must abort the entire operation:
#include <setjmp.h>

jmp_buf error_handler;

void deep_function(void) {
    if (catastrophic_failure()) {
        longjmp(error_handler, 1); // never returns
    }
}

int main(void) {
    if (setjmp(error_handler) != 0) {
        // longjmp landed here
        fprintf(stderr, "operation aborted\n");
        return EXIT_FAILURE;
    }
    deep_function();
    return EXIT_SUCCESS;
}
The longjmp function never returns to its caller. Control transfers directly to the setjmp site as if setjmp had just returned, but with a nonzero value indicating which longjmp triggered. This bypasses all intermediate stack frames. Local variables in those frames are not cleaned up; their destructors (in the C sense of cleanup code we might have written) do not run. Resources acquired between setjmp and longjmp leak unless we manually track and release them. The jmp_buf must remain valid when longjmp is called; if the function containing setjmp has returned, the behavior is undefined.
An int return code provides limited information. We can distinguish success from failure, and errno codes like ENOENT or EACCES give some context, but rich error information requires either out-parameters, custom error structs, or global state.
C++: Exceptions and Hidden Control Flow
C++ exceptions address the propagation problem directly. When a function cannot fulfill its contract, it executes throw with an exception object. Control transfers immediately to the nearest enclosing catch block whose type matches the thrown object, skipping all intermediate code but calling destructors along the way.
std::string read_file(const std::string& path) {
    std::ifstream f(path);
    if (!f) {
        throw std::runtime_error("cannot open: " + path);
    }
    std::stringstream buf;
    buf << f.rdbuf();
    return buf.str();
}

void process() {
    try {
        auto content = read_file("data.txt");
        // use content...
    } catch (const std::runtime_error& e) {
        std::cerr << e.what() << '\n';
    }
}
The try block delimits code where exceptions may be caught. The catch clause specifies a type; if the thrown object’s type matches (including derived classes), that handler executes. Multiple catch clauses can handle different exception types; the runtime tries them in order and executes the first match. Catching by const reference avoids copying and preserves polymorphism for exception hierarchies.
When an exception propagates through a scope, the compiler ensures that destructors run for all local objects in reverse order of construction. This is stack unwinding. RAII objects release their resources automatically on error paths:
void safe_operation() {
    auto resource = std::make_unique<Widget>();
    might_throw();
} // resource destroyed whether or not might_throw() throws
The destructor runs regardless of how we exit the scope. We wrote no cleanup code; the compiler inserted it.
Here lies the problem with exceptions: every function call is a potential exit point. In the code above, if might_throw() throws, control leaves safe_operation immediately. This is convenient when we want automatic cleanup. It is treacherous when we are maintaining invariants across multiple operations.
void transfer(Account& from, Account& to, int amount) {
    from.withdraw(amount); // might throw
    to.deposit(amount);    // might throw
}
If deposit throws after withdraw succeeds, the money has left from but never arrived at to. The program is in an inconsistent state. The strong exception guarantee (if an operation fails, state is unchanged) requires explicit effort:
void transfer(Account& from, Account& to, int amount) {
    int withdrawn = from.withdraw(amount); // might throw
    try {
        to.deposit(amount);
    } catch (...) {
        from.deposit(withdrawn); // rollback
        throw;
    }
}
Now we are back to writing manual error handling, but with a twist: the error paths are invisible. Looking at a function that calls other functions, we cannot tell which calls might throw without examining their declarations (or their transitive callees). The C code with goto cleanup was verbose, but every exit point was visible in the source.
Destructors must not throw. Since C++11, destructors are implicitly noexcept. If a destructor throws during stack unwinding from another exception, the program has two active exceptions with no way to handle both; std::terminate is called. This constraint shapes class design: destructors perform only operations that cannot fail, or they catch and suppress exceptions internally.
The noexcept specifier declares that a function does not throw. If it does throw, std::terminate is called rather than unwinding. Move constructors and move assignment operators should be noexcept when possible; this enables std::vector to move elements during reallocation rather than copying them. If a move constructor might throw, the vector must copy to preserve the strong exception guarantee. Copying is dramatically slower for types with expensive copy operations. The performance of your vector depends on whether your move constructor is marked noexcept.
The exception safety guarantees classify behavior when exceptions occur. The nothrow guarantee means the function never throws; destructors and swap functions provide this. The strong guarantee means if an exception is thrown, the program state is unchanged; std::vector::push_back provides this. The basic guarantee means if an exception is thrown, the program is in a valid state with no leaks, but the state may have changed. Every function in a C++ codebase implicitly has one of these guarantees, whether the author thought about it or not.
C++23: std::expected and the Missing Operator
C++23 introduced std::expected<T, E>, a vocabulary type that holds either a value of type T or an error of type E. The design is explicitly modeled on Rust’s Result type and Haskell’s Either.
#include <charconv>      // std::from_chars
#include <expected>
#include <string_view>
#include <system_error>  // std::errc

std::expected<int, std::errc> parse_int(std::string_view s) {
    int value;
    auto [ptr, ec] = std::from_chars(s.data(), s.data() + s.size(), value);
    if (ec != std::errc{}) {
        return std::unexpected(ec);
    }
    return value;
}
The std::unexpected wrapper distinguishes error values from success values. Callers inspect the result:
auto result = parse_int("42");
if (result) {
    use(*result);
} else {
    handle(result.error());
}
C++23 added monadic operations for chaining. The and_then method applies a function if a value is present, where the function returns another std::expected. The transform method applies a function that returns a plain value, wrapping the result. The or_else method handles the error case:
auto result = parse(input)
    .and_then(validate)
    .transform([](int n) { return n * 2; });
The type exists. The monadic operations exist. What C++ lacks is the operator that makes the pattern ergonomic.
Consider a function that calls two fallible operations and combines their results. With exceptions, C++ is concise:
auto strcat(int i) -> std::string {
    return std::format("{}{}", foo(i), bar(i));
}
We do not even need to know that foo and bar might throw. Errors propagate invisibly.
With std::expected, the same logic becomes:
auto strcat(int i) -> std::expected<std::string, E> {
    auto f = foo(i);
    if (!f) {
        return std::unexpected(f.error());
    }
    auto b = bar(i);
    if (!b) {
        return std::unexpected(b.error());
    }
    return std::format("{}{}", *f, *b);
}
The manual error propagation dominates the function. We give names to the expected objects (f, b) rather than to the values we care about. We access values through *f and *b. The ceremony obscures the logic.
Rust solves this with the ? operator:
fn strcat(i: i32) -> Result<String, E> {
    Ok(format!("{}{}", foo(i)?, bar(i)?))
}
One character per fallible operation. The happy path reads as naturally as the exception version. The error path is explicit but minimal.
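The operator is sugar for an early return. Roughly, as a sketch using the same hypothetical foo, bar, and E from the example above (the real desugaring goes through the Try trait and converts errors with From):
fn strcat_desugared(i: i32) -> Result<String, E> {
    let f = match foo(i) {
        Ok(value) => value,
        Err(e) => return Err(E::from(e)),
    };
    let b = match bar(i) {
        Ok(value) => value,
        Err(e) => return Err(E::from(e)),
    };
    Ok(format!("{}{}", f, b))
}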
C++ cannot adopt Rust’s ? syntax. The conditional operator ?: creates ambiguity: in a ? *b ? *c : d, the parser cannot determine whether ? begins a conditional or is a postfix propagation operator. P2561R0 proposes ?? as an alternative, but as of C++26, no propagation operator has been adopted.
The monadic operations provide a workaround:
auto strcat(int i) -> std::expected<std::string, E> {
    return foo(i).and_then([&](int f) {
        return bar(i).transform([&](int b) {
            return std::format("{}{}", f, b);
        });
    });
}
This works. It is also nested: every additional fallible call adds another lambda and another level of indentation, which is exactly the ceremony a propagation operator would eliminate.