My attempt at talking about one of the most underrated features of C#.
29 Sep 2025 — 9 min read
C# is a versatile language. You can write mobile apps, desktop apps, games, websites, services and APIs with it. You can write it like Java with all the abstractions and
AbstractionFactoryClassProvider
s. But differently from Java, you can write low-level and unsafe code too. When I say low-level, I mean without the GC, with raw pointers.
Low-level code is usually required for performance or interoperability with C libraries or the operating system. The reason low-level code helps with performance is that it can be used to eliminate runtime checks on memory accesses…
My attempt at talking about one of the most underrated features of C#.
29 Sep 2025 — 9 min read
C# is a versatile language. You can write mobile apps, desktop apps, games, websites, services and APIs with it. You can write it like Java with all the abstractions and
AbstractionFactoryClassProvider
s. But differently from Java, you can write low-level and unsafe code too. When I say low-level, I mean without the GC, with raw pointers.
Low-level code is usually required for performance or interoperability with C libraries or the operating system. The reason low-level code helps with performance is that it can be used to eliminate runtime checks on memory accesses.
Array element accesses are bounds-checked in C# for safety. But, that means that there’s performance impact unless the compiler can eliminate a bounds-checking operation. The bounds-checking elimination logic needs to ensure that the array index was already bounds-checked before, or can be assured to be inside bounds during the compile-time. For example, take this simple function:
int sum(int[] array)
{
int sum = 0;
for (int i = 0; i < array.Length; i++)
{
sum += array[i];
}
return sum;
}
That’s an ideal situation for bounds-checking elimination because the index variable i
is created with known boundaries, and it depends on the array’s length. The index variable’s lifetime is shorter than the array’s lifetime and it’s guaranteed to be contained valid values throughout the function. The native code produced for sum
has no bounds-checking:
L0000 xor eax, eax
L0002 xor edx, edx
L0004 mov r8d, [rcx+8] ; read length
L0008 test r8d, r8d ; is empty?
L000b jle short L001c ; skip the loop
L000d mov r10d, edx
L0010 add eax, [rcx+r10*4+0x10] ; sum += array[i];
L0015 inc edx ; i++
L0017 cmp r8d, edx ; compare length with i
L001a jg short L000d ; loop if still greater
L001c ret
But, what if the function signature was slightly different?
int sum(int[] array, int startIndex, int endIndex)
{
int sum = 0;
for (int i = startIndex; i <= endIndex; i++)
{
sum += array[i];
}
return sum;
}
Now, the C# compiler doesn’t have a way to know if the passed startIndex
and endIndex
values are inside the boundaries of array
because their lifetimes are distinct. So the native assembly produced becomes way more involved with bounds-checking operations:
L0000 sub rsp, 0x28
L0004 xor eax, eax ; sum = 0
L0006 cmp edx, r8d ; startIndex > endIndex?
L0009 jg short L0045 ; then skip the entire function
L000b test rcx, rcx ; array is null?
L000e je short L0031 ; then cause NullReferenceException
L0010 mov r10d, edx
L0013 or r10d, r8d
L0016 jl short L0031
L0018 cmp [rcx+8], r8d ; array.Length <= endIndex ?
L001c jle short L0031 ; then do bounds-checking
L001e xchg ax, ax ; alignment NOP
L0020 mov r10d, edx
L0023 add eax, [rcx+r10*4+0x10] ; sum += array[i]
L0028 inc edx ; consider i + 1
L002a cmp edx, r8d ; i > endIndex?
L002d jle short L0020 ; no, go on
L002f jmp short L0045 ; return
L0031 cmp edx, [rcx+8] ; i > array.Length?
L0034 jae short L004a ; bounds-checking failed. go to ----+
L0036 mov r10d, edx |
L0039 add eax, [rcx+r10*4+0x10] ; sum += array[i] |
L003e inc edx ; i++ |
L0040 cmp edx, r8d ; i <= endIndex ? |
L0043 jle short L0031 ; continue for loop |
L0045 add rsp, 0x28 |
L0049 ret ; return sum |
L004a call 0x00007ff857ec6200 ; throw IndexOutOfRangeException <--+
We could use low-level unsafe functions and pointers in C# (yes, C# supports raw pointers!) to avoid bounds-checking altogether, like this:
unsafe int sum(int* ptr, int length)
{
int* end = ptr + length;
int sum = 0;
for (; ptr < end; ptr++)
{
sum += *ptr;
}
return sum;
}
That also creates a veryoptimized code that supports passing along a sub-portion of an array:
L0000 movsxd rax, edx
L0003 lea rax, [rcx+rax*4] ; end = ptr + length
L0007 xor edx, edx ; sum = 0
L0009 cmp rcx, rax ; ptr >= end ?
L000c jae short L0019 ; then return
L000e add edx, [rcx] ; sum += *ptr
L0010 add rcx, 4 ; ptr += sizeof(int)
L0014 cmp rcx, rax ; ptr < end?
L0017 jb short L000e ; then keep looping
L0019 mov eax, edx
L001b ret ; return sum
Unsafe code and pointer-arithmetic can be very performant as you can see. The problem is that it’s too dangerous. With incorrect values of length, you don’t simply get an IndexOutOfRangeException
but instead your app either crashes, or returns incorrect results. If your code happened to modify the memory region instead of just reading it, then you could have a nice entry point for a buffer overflow security vulnerability in your app too. Not to mention that all the callers of that function will have to have unsafe blocks too.
But it’s possible to handle this safe and fast in C# without resorting to esoteric rituals like that. First, how do you solve this problem of indexes to describe a portion of an array and actual boundaries of the array being disconnected from each other? You create a new immutable type that holds these values together. And that type is called a span in C#. Other programming languages may call it a slice. Declaration of Span
type resembles something like this. Well, it’s not exactly this, but I want you to understand the concept first:
readonly struct Span<T>
{
readonly T* _ptr;
readonly int _len;
}
It’s basically an immutable pointer with length. The great thing about a type like this is that the compiler can assure that once an immutable Span is initialized with correct bounds, it will always be safe to access without any bounds-checking. That means, you can pass around sub-views of arrays or even other spans safely and quickly without the performance overhead.
But, how can it be safe? What if the GC decides to throw away the structure that ptr
points to? Well, that’s where “ref types” come into play in C#.
A ref type is a type that can’t leave the stack and escape to the heap, so it’s always guaranteed that a T
type will outlive a Span<T>
instance. That’s why the actual Span<T>
declaration looks like this:
readonly ref struct Span<T> // notice "ref"
{
readonly ref T _ptr; // notice "ref"
readonly int _len;
}
Since a ref type can only live in stack, it can’t be a member of a class, nor can it be assigned to a non-ref variable, like, it can’t be boxed either. A ref type can only be contained inside another ref type. It’s ref types all the way.
Span-based version of our sum
function can eliminate bounds-checking despite that it can now have several super powers. The first one is that it can receive a sub-view of an array too with specific indices:
int sum(Span<int> span)
{
int sum = 0;
for (int i = 0; i < span.Length; i++)
{
sum += span[i];
}
return sum;
}
For instance you can call this function with sum(array)
or you can call it with a sub-view of an array like sum(array[startIndex..endIndex])
. That wouldn’t incur new bounds-checking operations other than when you’re trying to slice the array using the range operator. See how the generated assembly code for sum
becomes optimized again:
L0000 mov rax, [rcx]
L0003 mov ecx, [rcx+8]
L0006 xor edx, edx ; sum = 0
L0008 xor r8d, r8d ; i = 0
L000b test ecx, ecx ; span.Length == 0?
L000d jle short L001e
L000f mov r10d, r8d
L0012 add edx, [rax+r10*4] ; sum += span[i]
L0016 inc r8d ; i++
L0019 cmp r8d, ecx ; i < Length?
L001c jl short L000f ; then keep looping
L001e mov eax, edx
L0020 ret ; return sum
Another superpower you get is that the ability to declare the data structure you receive immutable in your function signature, so your function is guaranteed not to touch it, and you can find bugs instantly. All you need to do is to replace Span<T>
with ReadOnlySpan<T>
. Then your attempts to modify the span contents will immediately cause a compiler error. Something impossible with regular arrays, even if you declare them readonly
. The readonly
directive only protects the reference from modification not the contents of the data structure.
Passing along a smaller portion of a larger data structure to relevant APIs used to involve either copying or passing the relevant part’s offset and length values along with that data structure. It required the API to support calls with larger structures with offsets. It was impossible to guarantee the safety of such APIs as the relationship between parameters couldn’t be established by the compiler or the runtime.
It’s now both easy and expressive to implement zero-copy operations safely using spans. Consider a Quicksort implementation for instance; it usually has a function like this that works with portions of a given array:
int partition(int[] array, int low, int high)
{
int midpoint = (high + low) / 2; // I know, we'll get there!
int mid = array[midpoint];
// tuple swaps in C#! ("..^1" means "Length - 1")
(array[midpoint], array[^1]) = (array[^1], array[midpoint]);
int pivotIndex = 0;
for (int i = low; i < high - 1; i++)
{
if (array[i] < mid)
{
swap(array, i, pivotIndex);
pivotIndex += 1;
}
}
(array[midpoint], array[endpoint]) = (array[endpoint], array[midpoint]);
return pivotIndex;
}
This function receives an array and offsets that designate a part of it, and rearranges the items based on a picked value in it. Values smaller than it move to the left, larger than it move to the right.
The mid = array[midpoint]
has to be bounds-checked because the compiler can’t know if the index is inside the bounds of this array. The for
loop also performs bounds-check for array accesses in the loop, which some of them can be eliminated, but not fully guaranteed.
There is also an overflow error because we pass array ranges using high
and low
values: (high+low)
can overflow for very large arrays, and the results would be catastrophic, can even cause buffer overflow exceptions.
The partition
function gets recursively called many times by Quicksort
function below, so that means bounds-checking can be a performance issue.
void Quicksort(int[] array, int low, int high)
{
if (array.Length <= 1)
{
return;
}
int pivot = partition(array, low, high);
Quicksort(span, low, pivot - 1);
Quicksort(span, pivot + 1, high);
}
With spans, the same Quicksort function looks like this:
void Quicksort(Span<int> span)
{
if (span.Length <= 1)
{
return;
}
int pivot = partition(span);
Quicksort(span[..pivot]);
Quicksort(span[(pivot + 1)..]);
}
See how expressive using spans are especially with the range syntax? It lets you get a new span out of an existing span or an array using double dots (..
)? And the partition function even looks much better too:
int partition(Span<int> span)
{
int midpoint = span.Length / 2; // look ma, no buffer overflows!
int mid = span[midpoint];
(span[midpoint], span[^1]) = (span[^1], span[midpoint]);
int pivotIndex = 0;
for (int i = 0; i < span.Length - 1; i++)
{
if (span[i] < mid)
{
swap(array, i, pivotIndex);
pivotIndex += 1;
}
}
(span[midpoint], span[^1]) = (span[^1], span[midpoint]);
return pivotIndex;
}
Because C# spans are also zero-based, it’s harder to have buffer overflow problems caused by formulae like (low + high) / 2
. Now, the implementation is as fast as an unsafe implementation with raw pointers, but still extremely safe.
New zero-copy operations in .NET runtime
I used a recursive example here to show how sub-portions of a larger data structure can be passed to another function, but spans can be used almost everywhere, and now .NET runtime supports zero-copy alternatives of popular functions too.
Take String.Split
for example. You can now split a string without creating new copies of every split portion of the string. You can split a CSV line into its parts like this:
string csvLine = // .. read CSV line
string[] parts = csvLine.Split(',');
foreach (string part in parts)
{
Console.WriteLine(part);
}
The problem with that is, now you’re dealing with newly created 5 buffers with varying lengths. .NET allocates new memory for them, GC keeps track of them. It’s slow, it hogs memory. It’s problematic especially in loops, and can create GC pressure, slowing your app even more.
You can instead cast your CSV line into a ReadOnlySpan<char>
and iterate over its components to write it to the output:
string csvLine = // .. read CSV line
var span = csvLine.AsSpan();
var parts = span.Split(',');
foreach (var range in parts)
{
Console.Out.WriteLine(span[range]);
}
Note that we use a small detour to useConsole.Out.WriteLine
instead ofConsole.WriteLine
becauseConsole
class lacks an overload to output aReadOnlySpan<char>
like a string.
The great ROI of that experiment is making zero memory allocations after reading CSV line into memory. A similar logic that improves performance and memory efficiency can be applied everywhere that can receive a Span<T>
/ReadOnlySpan<T>
instead of an array.
Embrace the future
Spans and slice-like structures in are the future of safe memory operations in modern programming languages. Embrace them. The quick takeaways are:
- Use spans over arrays in your function declarations unless you explicitly require a standalone array in your function for some reason. Such a change opens your API into zero-copy optimization scenarios, and calling code will be more expressive.
- Don’t bother with unsafe/pointers if you can code the same logic with spans. You can still perform low-level memory operations without wandering into the dangerous parts of the forest. Your code will still be fast, and yet safer. Use spans, wherever possible, mostly readonly.