10 min readJust now
–
This post does not reflect the views of current, past, or future employers. The opinions in this article are my own.
When I was a sophomore in high school my parents bought our first computer, an Atari 800! It was an awesome system and I spent many an hour hacking on it. One of the most interesting features was PEEK and POKE. PEEK reads a memory location, and POKE writes a memory location. The memory locations didn’t have to be actual memory; they could be registers. PEEKing a register read some state, and POKEing a register produced side effects. POKE was the interface to change graphics modes, invoke the sound generator, or change the character set (e.g. highlight or invert characters on the screen). These were the days before everything was documented, so POKEing random locations to see what happened was a fun pastime of the geek crowd! :-)
**My favorite book in high school** (much more fun than having to read The Brothers Karamazov!). The book documented many memory mapped I/O registers in the Atari 800 that can be PEEKed or POKEd.
By today’s standards, the Atari 800 is quaint and simplistic. Nevertheless, I believe it was a great learning platform for the basics of computer function and how a CPU interacts with memory and hardware. With that in mind, let’s take a look at how CPUs interact with memory and I/O devices using memory addresses and memory mapped I/O in a modern system.
Memory addressing
Memory addresses are fundamental. For instance, pointers in C or C++ are memory addresses from the system’s point of view. A memory address is a reference to a specific memory location, used by both software and hardware. These addresses are fixed-length sequences of bits, typically displayed and handled as unsigned integers. Programming language constructs such as pointers treat memory like an array indexed by address.
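To make the "memory is an array indexed by address" idea concrete, here is a minimal Python sketch. It is a toy model, not how any real system is implemented: the 64 KiB size and the `peek`/`poke` helper names are assumptions chosen to echo the Atari-era BASIC commands.

```python
# Toy model: memory as a flat array of bytes indexed by unsigned integer addresses.
# The 64 KiB size is an assumption for illustration (a 16-bit address space).
MEMORY_SIZE = 1 << 16
memory = bytearray(MEMORY_SIZE)


def peek(address: int) -> int:
    """Return the byte stored at `address` (in the spirit of PEEK)."""
    return memory[address]


def poke(address: int, value: int) -> None:
    """Store the low 8 bits of `value` at `address` (in the spirit of POKE)."""
    memory[address] = value & 0xFF


poke(0x0600, 0x2A)
pointer = 0x0600                # a "pointer" is just an integer address
value = peek(pointer)           # dereferencing reads the byte at that address
next_value = peek(pointer + 1)  # pointer arithmetic is integer arithmetic
```

Nothing here distinguishes RAM from a device register; that distinction only appears once some addresses are wired to hardware other than memory.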
Physical addresses
A computer’s main memory, or RAM for Random Access Memory, consists of many memory locations, each identified by a unique physical address. CPUs and other devices use these identifiers to access the corresponding memory locations. Usually, only system software, like BIOS and operating systems, directly addresses physical memory using machine code instructions. The instructions tell the CPU to interact with a hardware component called the memory controller.
Memory controller and memory bus
The memory controller manages access to memory over the memory bus, or a system bus, to carry out the program’s loads and stores. A bus is a communication system used to transfer data between the CPUs, memory, and devices (like I/O devices) in a computer. A bus consists of physical connections like wires, circuits, or cables. The memory bus itself may consist of three buses: address bus, data bus, and control bus.
The address bus is a collection of wires used to identify particular locations in main memory. The address bus transports the memory addresses that a processor wants to access in order to read or write data. The address bus is unidirectional, and the size of the bus determines how many unique memory locations can be addressed. Each device connected to the address bus listens on it and checks whether an address signaled on the bus is one of the addresses it “owns”.
The data bus is a collection of wires through which data is transmitted from one part of a computer to another. It can be thought of as a highway on which data travels within a computer. The data bus transfers the data between the CPU and I/O devices or memory. The data bus is bidirectional because the data can flow in either direction: from the CPU to memory (or an I/O device) or from memory to the CPU. The size (width) of the bus determines how much data can be transmitted at one time.
The control bus is a collection of wires that carries control information between the CPU and other devices within the computer. It transports orders and synchronization signals coming from the control unit and travelling to the other hardware components. The control bus is bidirectional because control signals can flow in either direction: commands such as read and write go from the CPU to memory (or an I/O device), and status signals come back to the CPU.
Memory bus. CPUs, memory devices, and I/O devices share memory over a system bus consisting of Address, Data, and Control buses.
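The relationship between bus width and capacity can be worked out with a little arithmetic. The sketch below is illustrative only; the function names are made up for this example. It uses the Atari 800's 6502 CPU (16 address lines, 8 data lines) as one data point.

```python
def addressable_locations(address_bus_width: int) -> int:
    """Number of unique addresses an address bus of this width can signal."""
    return 1 << address_bus_width


def transfers_needed(num_bytes: int, data_bus_width_bits: int) -> int:
    """Bus transfers needed to move `num_bytes` over a data bus of this width."""
    bytes_per_transfer = data_bus_width_bits // 8
    return -(-num_bytes // bytes_per_transfer)  # ceiling division


atari_locations = addressable_locations(16)   # 65536 addresses (64 KiB)
locations_32bit = addressable_locations(32)   # 4294967296 addresses (4 GiB)
cache_line_8bit = transfers_needed(64, 8)     # 64 transfers on an 8-bit data bus
cache_line_64bit = transfers_needed(64, 64)   # 8 transfers on a 64-bit data bus
```

Each extra address line doubles the number of addressable locations, while a wider data bus reduces the number of transfers needed to move a block of a given size.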
Logical addresses
A computer program uses memory addresses to execute machine code, and to store and retrieve data. In early computers, like the Atari 800, logical addresses (used by programs) and physical addresses (actual locations in hardware memory) were the same. However, with the introduction of virtual memory, most application programs do not deal directly with physical addresses. Instead, they use logical or virtual addresses which are translated to physical addresses by the computer’s Memory Management Unit (MMU) and the operating system’s memory mapping mechanisms.
Logical to physical address mapping table. On the left is the logical address space, where the colored blocks (4K pages in this example) map to blocks in the physical address space on the right. The addresses on the left are virtual addresses, and the addresses on the right are physical addresses.
Memory Management Unit (MMU)
The Memory Management Unit, or **MMU**, is a hardware component whose main purpose is to convert the virtual addresses issued by the CPU into physical addresses in the computer’s memory. It acts as a bridge between the CPU and the RAM (memory), ensuring that programs can run smoothly and access the required data without clashes or unauthorized access. It is usually integrated in the CPU, but in some cases it is built as a separate Integrated Circuit (IC).
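The translation an MMU performs can be sketched in a few lines of Python. This is a deliberately simplified, single-level model with 4K pages; real MMUs use multi-level page tables and cache translations in a TLB. The page table contents and the `PageFault` name are hypothetical.

```python
PAGE_SHIFT = 12                  # 4K pages: the low 12 bits are the page offset
PAGE_SIZE = 1 << PAGE_SHIFT
OFFSET_MASK = PAGE_SIZE - 1

# Hypothetical page table: virtual page number -> physical frame number.
page_table = {
    0x00400: 0x1A2B3,
    0x00401: 0x00042,
}


class PageFault(Exception):
    """Raised when a virtual address has no mapping."""


def translate(virtual_address: int) -> int:
    """Translate a virtual address to a physical address using the page table."""
    virtual_page = virtual_address >> PAGE_SHIFT
    offset = virtual_address & OFFSET_MASK
    if virtual_page not in page_table:
        raise PageFault(hex(virtual_address))
    # The offset within the page is unchanged; only the page number is replaced.
    return (page_table[virtual_page] << PAGE_SHIFT) | offset


physical = translate(0x00400ABC)   # 0x1A2B3ABC
```

Two adjacent virtual pages (0x00400 and 0x00401) land in unrelated physical frames, which is what lets the OS give each process a contiguous-looking address space.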
Reading a memory location
A CPU reads a memory location with a load instruction. A CPU ISA may have instructions for explicitly loading from a memory location to a register, or the load might be implicit, as when a memory location is used as an operand of an arithmetic operation. Below are examples of load instructions for x86.
// Load a 32-bit word from the address in register RCX into EDX
mov (%rcx),%edx
// Load a byte from the address in register RAX, add it to AL, and store the result in AL
add (%rax),%al
The processing steps for a plain load instruction are:
- Decode the instruction. The instruction is decoded to determine it is a load instruction, the memory address for the load, and the register into which the loaded data is placed.
- Virtual to Physical Address Translation: The Memory Management Unit translates the virtual address in the instruction to a physical address.
- The address is placed on the address bus: The CPU puts the physical address of the desired word onto the address bus (voltages are set accordingly for the individual wires of the address bus).
- Send a read command: The CPU activates the “read” line on the control bus to indicate a read operation is requested.
- Memory retrieves data: The memory controller uses the address to find the correct physical location and places the contents of that location onto the data bus.
- CPU reads the data: The CPU reads the data from the data bus, and sets the data in the destination register of the load instruction.
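The steps above can be traced in a small Python simulation. This is a sketch only: the bus is a plain dictionary, the register file is a dictionary keyed by register name, and translation defaults to an identity mapping. All of these are assumptions made for illustration, not a model of real hardware timing.

```python
def identity_translate(virtual_address: int) -> int:
    """Stand-in for the MMU: assumes virtual == physical for this sketch."""
    return virtual_address


def load_word(ram, registers, address_register, dest_register,
              translate=identity_translate):
    """Simulate `mov (address_register), dest_register` for a 32-bit word."""
    bus = {"address": None, "data": None, "read": False}
    virtual_address = registers[address_register]   # decode: find the address
    physical_address = translate(virtual_address)   # virtual -> physical
    bus["address"] = physical_address               # drive the address bus
    bus["read"] = True                              # assert the read line
    # Memory responds by placing the word on the data bus (little-endian).
    bus["data"] = int.from_bytes(
        ram[physical_address:physical_address + 4], "little")
    registers[dest_register] = bus["data"]          # CPU latches the data
    bus["read"] = False                             # end of the bus cycle
    return bus["data"]


ram = bytearray(64)
ram[16:20] = (0xDEADBEEF).to_bytes(4, "little")
registers = {"rcx": 16, "edx": 0}
load_word(ram, registers, "rcx", "edx")   # registers["edx"] is now 0xDEADBEEF
```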
Writing a memory location
A CPU writes a memory location with a store instruction. A CPU ISA may have instructions for explicitly storing to a memory location, or the address might be the destination operand of an arithmetic operation. Below are examples of store instructions for x86.
// Store the value in RAX to the address RIP (the Instruction Pointer) plus 0x2e00
mov %rax,0x2e00(%rip)
// Load a 32-bit word from the address in RAX, add the value in EAX, and
// then store the result to the address in RAX
add %eax,(%rax)
The processing steps for a plain store instruction are:
- Decode the instruction. The instruction is decoded to determine it is a store instruction, the memory address for the store, and the source register or immediate value for the data to be stored.
- Virtual to Physical Address Translation: The Memory Management Unit translates the virtual address in the instruction to a physical address.
- Place Address on Address Bus: The CPU places the specific memory address where the data should be stored onto the address bus.
- Place Data on Data Bus: The CPU places the actual data (value) to be stored onto the data bus.
- Activate Write Signal: The CPU sets the read/write (R/W) wire on the control bus to “low” (write mode) and activates the memory enable signal.
- Chip Selection: The address valid signal, along with the address on the bus, activates the chip select (CS) wire on the target memory module.
- Data Transfer and Storage: The data on the data bus is transferred and stored into the memory location designated by the address.
- Terminate Cycle: The CPU sets the read/write wire back to “high” (inactive) to finish the write operation.
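The write cycle, including chip selection, can be sketched the same way. In this hypothetical model two memory modules each own a range of physical addresses, the R/W line is represented as 0 for write and 1 for read, and address translation is assumed to have already happened. The class and function names are invented for the example.

```python
WRITE, READ = 0, 1   # R/W line levels: low means write, high means read/idle


class MemoryModule:
    """A memory module that owns the addresses [base, base + size)."""

    def __init__(self, base: int, size: int):
        self.base = base
        self.size = size
        self.cells = bytearray(size)

    def chip_selected(self, address: int) -> bool:
        """Chip select is active when the bus address falls in our range."""
        return self.base <= address < self.base + self.size


def store_byte(modules, physical_address: int, value: int):
    """Simulate one write cycle; return the module that accepted the write."""
    # Place the address and data on their buses and pull R/W low.
    bus = {"address": physical_address, "data": value & 0xFF, "rw": WRITE}
    target = None
    for module in modules:
        if module.chip_selected(bus["address"]):
            # The selected module latches the data into the addressed cell.
            module.cells[bus["address"] - module.base] = bus["data"]
            target = module
            break
    bus["rw"] = READ   # terminate the cycle
    return target


low = MemoryModule(0x0000, 0x1000)
high = MemoryModule(0x1000, 0x1000)
selected = store_byte([low, high], 0x1004, 0x5A)   # lands in `high` at offset 4
```

If no module claims the address, nothing is stored and the function returns `None`; what real hardware does in that case depends on the platform.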
Memory Mapped I/O
PEEK and POKE are really a form of Memory Mapped I/O, or MMIO: a method where a computer’s CPU accesses peripheral devices (such as graphics cards and network adapters) by treating their control registers and on-board memory as part of the system’s memory address space. This technique allows the same standard memory instructions (load/store) to be used for both RAM and I/O. This approach unifies the address space, which simplifies programming by avoiding special I/O instructions and enables efficient data transfer, as the hardware routes each request to the correct device.
In a Memory Mapped I/O system, there are no special input or output instructions. Instead, the CPU uses the same instructions it uses for memory (like LOAD and STORE) to access I/O devices. The properties are:
- Each I/O device is assigned a specific address in the regular memory address space
- Devices are connected through interface registers, which act like memory locations
- When the CPU wants to read from or write to an I/O device, it accesses the corresponding address, just like it would access a memory word
- These interface registers respond to normal read/write operations as if they were plain memory cells
This design allows I/O and memory to be treated uniformly, simplifying programming and hardware design. There are many applications of Memory Mapped I/O including:
- Graphics Processing: Memory Mapped I/O is widely used in graphics cards to provide fast access to frame buffers and control registers. Graphics data is mapped directly to memory, allowing the CPU to interact with the graphics hardware as if it were accessing normal memory. This enables efficient rendering and display operations.
- Network Communication: Network Interface Cards (NICs) often use Memory Mapped I/O to manage data transfer between the system memory and the network. The NIC’s control and status registers are mapped to specific memory addresses, allowing the CPU to efficiently control and monitor network operations.
- Direct Memory Access (DMA): DMA controllers use Memory Mapped I/O to enable high-speed data transfers between I/O devices and system memory without involving the CPU. By mapping DMA control registers to memory, devices can transfer data directly, improving system performance and reducing CPU load.
- Accelerator Communications: Accelerators and CPUs can convey requests and replies via Memory Mapped I/O. We introduced the idea in Accelerator FIFOs and next time we’ll do a deep dive into this.
Memory mapped I/O. I/O devices can expose their registers and on-board memory as addresses in the main address space. In this example, an accelerator has mapped a register, perhaps a Memory Mapped FIFO, to physical address 0x89341. A CPU uses load and store instructions to read and write the register.
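Here is a small Python sketch of the idea: one address space, with loads and stores routed to whichever device owns the address. The register address 0x89341 comes from the figure; the RAM size, the class names, and the FIFO behavior (stores push, loads pop) are assumptions made for illustration.

```python
from collections import deque


class Ram:
    """Plain memory: a load returns whatever was last stored."""

    def __init__(self, size: int):
        self.cells = bytearray(size)

    def read(self, offset: int) -> int:
        return self.cells[offset]

    def write(self, offset: int, value: int) -> None:
        self.cells[offset] = value & 0xFF


class FifoRegister:
    """Hypothetical accelerator register: stores push, loads pop."""

    def __init__(self):
        self.queue = deque()

    def read(self, offset: int) -> int:
        return self.queue.popleft() if self.queue else 0

    def write(self, offset: int, value: int) -> None:
        self.queue.append(value & 0xFF)


class AddressSpace:
    """Routes each load/store to the device that owns the address."""

    def __init__(self):
        self.regions = []   # list of (base, size, device)

    def map(self, base: int, size: int, device) -> None:
        self.regions.append((base, size, device))

    def _decode(self, address: int):
        for base, size, device in self.regions:
            if base <= address < base + size:
                return device, address - base
        raise ValueError(f"no device at {address:#x}")

    def load(self, address: int) -> int:
        device, offset = self._decode(address)
        return device.read(offset)

    def store(self, address: int, value: int) -> None:
        device, offset = self._decode(address)
        device.write(offset, value)


space = AddressSpace()
space.map(0x00000, 0x80000, Ram(0x80000))   # RAM size is an assumption
space.map(0x89341, 1, FifoRegister())       # register address from the figure

space.store(0x100, 5)
ram_reads = [space.load(0x100), space.load(0x100)]        # [5, 5]
space.store(0x89341, 7)
space.store(0x89341, 9)
fifo_reads = [space.load(0x89341), space.load(0x89341)]   # [7, 9]
```

The CPU-side code is identical for both addresses, but the register has side effects: reading it twice returns two different values, which is why MMIO regions are typically accessed uncached and in program order.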
System Initialization
When a computer first boots there is quite a bit of work needed to figure out what memory a system has, what devices are connected to the system bus, and what addresses are used for Memory Mapped I/O. Initialization is performed by BIOS (Basic Input/Output System), a firmware program that initializes the computer’s hardware and loads the operating system. It is the crucial link between the hardware and software. BIOS runs the Power-On Self-Test (POST) to check components like RAM and hard drives, then finds and loads the OS (like Linux, Windows or macOS) from storage into memory, managing basic communication for the system to function.
Physical memory map
BIOS detects physical memory by communicating with the motherboard’s chipset and RAM modules during startup. It creates a physical memory map, and then hands this map to the operating system (OS). The OS manages these physical addresses and sets up the mappings that the Memory Management Unit (MMU) uses to translate virtual addresses into physical addresses. The CPU ultimately uses the physical addresses to access RAM via the memory controller.
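A physical memory map is essentially a list of address ranges, each tagged with a type. The sketch below shows one way an OS might consume such a map. The type codes follow the x86 E820 convention (1 for usable RAM, 2 for reserved), but the specific ranges are hypothetical values chosen for illustration, as are the helper names.

```python
from typing import NamedTuple, Optional

USABLE = 1     # E820 type 1: RAM available to the OS
RESERVED = 2   # E820 type 2: reserved, not available to the OS


class Region(NamedTuple):
    base: int
    length: int
    kind: int


# Hypothetical map for a machine with roughly 2 GiB of RAM.
memory_map = [
    Region(0x00000000, 0x0009FC00, USABLE),
    Region(0x0009FC00, 0x00000400, RESERVED),
    Region(0x000F0000, 0x00010000, RESERVED),
    Region(0x00100000, 0x7FF00000, USABLE),
]


def usable_bytes(regions) -> int:
    """Total bytes of RAM the OS is free to manage."""
    return sum(r.length for r in regions if r.kind == USABLE)


def region_for(regions, address: int) -> Optional[Region]:
    """Return the region containing `address`, or None if it is in a hole."""
    for r in regions:
        if r.base <= address < r.base + r.length:
            return r
    return None


total_usable = usable_bytes(memory_map)   # 0x7FF9FC00 bytes
```

Addresses that fall in no region, such as 0xA0000 in this example, are holes in the map; on real systems such ranges are often where device memory appears.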
Initializing Memory Mapped I/O
I/O devices are not typically hardwired on a computer motherboard, but are pluggable devices like interface cards that can be put in PCIe slots in a computer. If a device wants to use Memory Mapped I/O, it needs addresses within the computer’s main memory address space. There’s no concept of a universal address space, so a device is assigned its addresses for MMIO at boot time. The key to this is the Base Address Register (BAR).
Base Address Registers (BARs)
A Base Address Register (BAR) is a special hardware register in computer systems, especially for Peripheral Component Interconnect Express (PCIe) devices, that tells the system where to map a device’s memory in the system’s physical address space. When a system boots, the BIOS performs procedures to query devices that might be using Memory Mapped I/O. The BIOS determines the requested size of mapped address blocks for a device, allocates addresses in its physical map, and then writes the base allocated addresses to BARs in the device. The device in turn uses the value in the BAR as the base address to identify its assigned addresses.
BARs in the PCIe Configuration Space. This shows BAR registers in a Type 0 Configuration Space Header for PCIe. When a system boots, attached devices are enumerated and their configuration space is mapped into memory to make it accessible. BIOS writes addresses to the BAR Registers to instantiate Memory Mapped I/O.
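One way firmware learns how much address space a device requests is the conventional PCI sizing probe: write all ones to the BAR, read it back, and see which address bits the device refused to set. The sketch below simulates that for a 32-bit memory BAR. The `Bar32` class is a hypothetical stand-in for device hardware, and 64-bit and I/O BARs are left out.

```python
FLAG_BITS = 0xF   # low 4 bits of a memory BAR hold type flags, not address bits


class Bar32:
    """Simulated 32-bit memory BAR on a device requesting `size` bytes."""

    def __init__(self, size: int, flags: int = 0):
        if size < 16 or size & (size - 1):
            raise ValueError("size must be a power of two of at least 16")
        # Address bits below the requested size are hardwired to zero.
        self.address_mask = ~(size - 1) & 0xFFFFFFF0
        self.flags = flags & FLAG_BITS
        self.value = self.flags

    def write(self, value: int) -> None:
        self.value = (value & self.address_mask) | self.flags

    def read(self) -> int:
        return self.value


def probe_size(bar: Bar32) -> int:
    """Return the size in bytes of the region the BAR requests."""
    original = bar.read()
    bar.write(0xFFFFFFFF)            # try to set every bit
    readback = bar.read()            # hardwired-zero bits reveal the size
    bar.write(original)              # restore the previous contents
    return (~(readback & ~FLAG_BITS) + 1) & 0xFFFFFFFF


bar = Bar32(size=0x4000)
requested = probe_size(bar)          # 0x4000 (16 KiB)
bar.write(0xFE000000)                # firmware assigns a base address
assigned_base = bar.read() & ~FLAG_BITS
```

Because the low address bits are hardwired to zero, the base the device ends up with is always aligned to the size it requested.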
BIOS operations
The procedures of BIOS at system initialization are:
- Power-On & Initial Code: When the computer starts, the CPU begins executing code from a fixed, high address in the BIOS firmware, which is stored on a ROM (Read Only Memory) chip.
- Hardware Discovery: The firmware identifies the motherboard chipset and communicates with memory controllers (often part of the chipset, like the Platform Controller Hub).
- RAM Detection: The BIOS queries the RAM modules (DIMMs) to determine their size, type, and speed, often using SPD (Serial Presence Detect) data from small ROMs on the modules.
- Memory Map Generation: BIOS creates a map of all available physical RAM (and reserved areas) and makes this map available to the OS (e.g., via the INT 15h E820 interrupt in x86).
- Assign addresses to Memory Mapped I/O devices: The BIOS assigns physical memory addresses to Memory Mapped I/O devices using Base Address Registers (BARs) during boot. It scans PCI devices, reads their requested address space from BARs, maps them into the system’s overall physical memory map, and then writes the allocated addresses to the BARs.
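The address assignment step can be sketched as a simple allocator. This is not how any particular BIOS does it; it is a minimal illustration assuming each request is a power-of-two size that must be placed at a base aligned to that size, inside a hypothetical MMIO window. The device names, sizes, and window addresses are made up.

```python
def align_up(value: int, alignment: int) -> int:
    """Round `value` up to the next multiple of a power-of-two `alignment`."""
    return (value + alignment - 1) & ~(alignment - 1)


def assign_bars(requests: dict, window_base: int, window_end: int) -> dict:
    """Assign a naturally aligned base address to each requested region.

    `requests` maps a device name to its requested size in bytes.
    Largest requests are placed first to limit alignment padding.
    """
    assignments = {}
    cursor = window_base
    for name, size in sorted(requests.items(), key=lambda item: -item[1]):
        base = align_up(cursor, size)
        if base + size > window_end:
            raise MemoryError(f"MMIO window exhausted while placing {name}")
        assignments[name] = base   # firmware would write this value to the BAR
        cursor = base + size
    return assignments


# Hypothetical devices and MMIO window.
requests = {"nic": 0x20000, "gpu": 0x1000000, "accelerator": 0x1000}
bases = assign_bars(requests, window_base=0xE0000000, window_end=0xF0000000)
# gpu at 0xE0000000, nic at 0xE1000000, accelerator at 0xE1020000
```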
After the BIOS completes its initialization, the Operating System is booted. The OS reads the memory map from BIOS and creates its own tables for physical memory and virtual address mappings. Processes can then allocate memory for their purposes. Device drivers communicate with their respective devices and can read or write memory mapped I/O in the devices.