MCUboot: Bootup Journey

STM32F411RE Nucleo board

The Cortex-M4 is the CPU core (licensed from ARM).
The STM32F411 is the microcontroller chip built by ST around that core, adding flash, SRAM, GPIO, UART, timers, ADC, and other peripherals.
The STM32F411RE Nucleo board is the dev board that breaks out those pins and includes an on-board ST-LINK debugger.

STM32F411RE MCU details

Flash memory: 512KB
SRAM: 128KB

The First Microseconds: Reset and Vector Table Fetch

When the Cortex-M4 core comes out of reset, it doesn’t start executing instructions from an arbitrary location. The ARM architecture mandates a specific, hardware-defined sequence. The processor core immediately performs two critical 32-bit reads, starting at address 0x00000000:

First, it reads the word at 0x00000000 and loads it into the Main Stack Pointer (MSP). This initializes the stack before any code runs, ensuring push and pop operations have a valid memory region to work with.

Second, it reads the word at 0x00000004 and loads it into the Program Counter (PC). This word holds the address of the reset handler, the very first function that will execute. The processor then begins fetching and executing instructions from that address.

These two reads happen automatically in hardware. No software has run yet. The core simply expects these values to exist at the start of the address space.
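The two-step fetch can be modeled in plain C to make the order explicit. This is only a host-side sketch: struct core_regs and simulate_reset are hypothetical names, and the real sequence is performed by the core hardware, not by software. The sketch also clears bit 0 of the reset vector, because Cortex-M vector entries carry the Thumb bit there, a detail not covered in the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two registers that reset initializes. */
struct core_regs {
    uint32_t msp; /* Main Stack Pointer */
    uint32_t pc;  /* Program Counter */
};

/*
 * Model of the hardware reset fetch. On real silicon the core does this
 * itself before any instruction executes; here it is ordinary C so the
 * sequence can be inspected on a host machine.
 */
static struct core_regs simulate_reset(const uint32_t *vector_table)
{
    struct core_regs regs;

    assert(vector_table != NULL);
    regs.msp = vector_table[0];               /* word at offset 0x0 */
    regs.pc = vector_table[1] & ~UINT32_C(1); /* word at offset 0x4, Thumb bit cleared */
    return regs;
}
```

Passing a two-word table such as { 0x20001000, 0x08000101 } yields an MSP of 0x20001000 and a PC of 0x08000100.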

On the STM32F411RE, the internal Flash memory is physically mapped starting at 0x08000000, not 0x00000000. The STM32F4 family bridges this gap with a hardware memory aliasing mechanism: when the chip is configured to boot from main Flash (BOOT0 pin low), the Flash region at 0x08000000 is also visible at 0x00000000. When the core reads from 0x00000000, it’s actually reading from 0x08000000. When it reads from 0x00000004, it’s reading from 0x08000004.
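The aliasing is a fixed offset, which a small helper can express. This is a sketch under assumptions: the macro and function names are hypothetical, and the alias window is assumed to cover the full 512 KB of Flash.

```c
#include <assert.h>
#include <stdint.h>

#define FLASH_BASE_ADDR  UINT32_C(0x08000000)   /* physical start of main Flash */
#define FLASH_SIZE_BYTES (UINT32_C(512) * 1024) /* 512 KB on the STM32F411RE */

/*
 * Translate an address inside the boot alias window (which starts at
 * 0x00000000) to the physical Flash address it mirrors.
 * The caller must pass an address inside the window.
 */
static uint32_t alias_to_flash(uint32_t alias_addr)
{
    assert(alias_addr < FLASH_SIZE_BYTES);
    return FLASH_BASE_ADDR + alias_addr;
}
```

With this mapping, the initial MSP word at alias address 0x0 resolves to 0x08000000 and the reset vector at 0x4 resolves to 0x08000004.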

The vector table is a contiguous array of 32-bit addresses placed at the very beginning of Flash. It’s the first region the Cortex-M4 core reads immediately after reset. The format is standardized by ARM:

arch/arm/core/cortex_m/vector_table.S

GDATA(z_main_stack)

SECTION_SUBSEC_FUNC(exc_vector_table,_vector_table_section,_vector_table)

/*
* setting the _very_ early boot on the main stack allows to use memset
* on the interrupt stack when CONFIG_INIT_STACKS is enabled before
* switching to the interrupt stack for the rest of the early boot
*/
.word z_main_stack + CONFIG_MAIN_STACK_SIZE

.word z_arm_reset
.word z_arm_nmi

.word z_arm_hard_fault
....

Word 0 (initial MSP): z_main_stack is the base of the main thread’s stack buffer; adding CONFIG_MAIN_STACK_SIZE points to the top of that stack (Cortex-M stacks grow downward).

Word 1 (Reset Handler): z_arm_reset is the entry point the CPU jumps to after loading MSP.

SECTION_SUBSEC_FUNC(exc_vector_table,_vector_table_section,_vector_table): places the _vector_table symbol in the .exc_vector_table._vector_table_section input section.

The linker script later collects .exc_vector_table.* into the rom_start output section, which gets placed at the start of Flash so the CPU can find it at reset.

Linker script

The C compiler produces object files containing machine code and data, but these fragments don’t know where they’ll live in the MCU’s memory. The linker script is the architectural blueprint that tells the linker exactly where to place every byte.

The linker script starts by declaring the physical memory regions available:

MEMORY
{
FLASH (rx) : ORIGIN = (0x8000000 + 0x0), LENGTH = (512 * 1024 - 0x0 - 0x0)
RAM (wx) : ORIGIN = 0x20000000, LENGTH = (96 * 1K)
}

This declares 512 KB of Flash starting at 0x08000000 with read and execute attributes (rx), and 96 KB of RAM starting at 0x20000000 with write and execute attributes (wx).
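The ORIGIN and LENGTH values translate directly into address ranges. The following sketch uses a hypothetical struct mem_region and helper functions (not part of any linker or SDK) to compute the last valid address of each region and to check which region an address belongs to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of one MEMORY entry: ORIGIN and LENGTH. */
struct mem_region {
    uint32_t origin;
    uint32_t length;
};

/* Values copied from the MEMORY block above. */
static const struct mem_region flash_region = { UINT32_C(0x08000000), UINT32_C(512) * 1024 };
static const struct mem_region ram_region = { UINT32_C(0x20000000), UINT32_C(96) * 1024 };

/* Last byte address that still belongs to the region. */
static uint32_t region_last(const struct mem_region *r)
{
    assert(r != NULL);
    return r->origin + r->length - 1u;
}

/* True when addr falls inside [origin, origin + length). */
static bool region_contains(const struct mem_region *r, uint32_t addr)
{
    assert(r != NULL);
    return addr >= r->origin && (addr - r->origin) < r->length;
}
```

For these values, Flash ends at 0x0807FFFF and RAM ends at 0x20017FFF.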

Then the linker script defines how the various sections of the compiled code map into these memory regions (simplified):

SECTIONS
{
rom_start :
{
_vector_start = .;
KEEP(*(.exc_vector_table))
KEEP(*(".exc_vector_table.*"))
KEEP(*(.vectors))
_vector_end = .;
} > FLASH
text :
{
__text_region_start = .;
*(.text)
*(".text.*")
*(".TEXT.*")
__text_region_end = .;
} > FLASH
rodata :
{
*(.rodata)
*(".rodata.*")
} > FLASH


datas :
{
__data_start = .;
*(.data)
*(".data.*")
__data_end = .;
} > RAM AT > FLASH

__data_size = __data_end - __data_start;
__data_load_start = LOADADDR(datas);
__data_region_load_start = LOADADDR(datas);

bss :
{
__bss_start = .;
*(.bss)
*(".bss.*")
__bss_end = .;
} > RAM AT > RAM
}

The Interrupt Vector Table: The rom_start section must appear at the very start of Flash. It’s the first region the Cortex-M4 core reads immediately after reset.

Executable Code (.text): The .text section contains all the compiled functions. The wildcard patterns *(.text) and *(".text.*") capture the text sections from every object file. This goes in Flash because it’s read-only and needs to persist across power cycles.

Read-Only Data (.rodata): String literals, const arrays, and other immutable data live in .rodata. Since this data never changes, it stays in Flash rather than wasting precious RAM. When the code accesses const char* str = "Hello";, it’s reading the string bytes directly from Flash at an address in this section.

Initialized Data (.data): The .data section contains global and static variables with initial values:

int counter = 42;
static char buffer[16] = {0xAA, 0xBB, 0xCC};

These variables must reside in RAM at runtime because the program needs to mutate them. But if they’re in RAM, their initial values disappear at power-off. The solution is: store the initial values in Flash, then copy them to RAM during startup.

The linker script notation > RAM AT> FLASH means “the section will run from RAM, but load from Flash”. The linker creates two addresses for this section:

VMA (Virtual Memory Address): Where the section will be accessed at runtime (in RAM, starting at __data_start)
LMA (Load Memory Address): Where the section’s initial data is stored (in Flash, captured by __data_load_start)

The symbols defined in this section are:

__data_load_start = LOADADDR(datas): The source address in Flash
__data_start: The destination start in RAM
__data_end: The destination end in RAM

During startup, the initialization code uses these symbols to copy the .data section from Flash into RAM.
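A minimal sketch of that copy step follows. The function name copy_data_section is hypothetical and the arguments are passed in explicitly so the logic can run on a host; this is not the actual startup code, which is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy the initial values of .data from their load address (LMA, in
 * Flash) to their run address (VMA, in RAM), one byte at a time.
 */
static void copy_data_section(uint8_t *run_start, const uint8_t *run_end,
                              const uint8_t *load_start)
{
    uint8_t *dst = run_start;
    const uint8_t *src = load_start;

    assert(run_start != NULL && run_end != NULL && load_start != NULL);
    while (dst < run_end) {
        *dst++ = *src++;
    }
}
```

On the target, the three arguments would be derived from the linker symbols __data_start, __data_end, and __data_load_start.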

Uninitialized Data (.bss): The .bss section contains global and static variables that should start at zero:

int uninitialized_var;
static uint8_t rx_buffer[256];

Rather than storing 256 zero bytes in Flash for that buffer, the linker script simply reserves space in RAM, and the startup code zeroes it.

The symbols __bss_start and __bss_end mark the RAM region that needs zeroing.
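The zeroing step can be sketched the same way. The function name zero_bss_section is hypothetical, and the bounds are passed in as parameters so the loop can be exercised on a host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clear every byte in [bss_start, bss_end) so .bss variables read as zero. */
static void zero_bss_section(uint8_t *bss_start, const uint8_t *bss_end)
{
    assert(bss_start != NULL && bss_end != NULL);
    for (uint8_t *p = bss_start; p < bss_end; ++p) {
        *p = 0u;
    }
}
```

On the target, the two arguments would be derived from __bss_start and __bss_end.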

Next: Analyze Startup Assembly (reset.S)
