Building a Unix Shell from Scratch in Rust: Lessons Learned from Implementing Push
Over the past several months, I've been working with my team (Youssef Hajjaoui and Zakaria Salhi) on Push - a feature-rich Unix-like shell implementation written entirely in Rust. This project has been one of the most educational and challenging endeavors we've undertaken, pushing us to deeply understand systems programming, language design, and the intricate details of how shells actually work under the hood.
The Vision: Understanding the Foundation
Most developers use shells daily (bash, zsh, fish), but few understand what happens when you type ls | grep "test" and press Enter. We wanted to build a shell from scratch to truly understand:
- How command parsing and execution work
- The complexity of process management and job control
- The nuances of variable expansion and quoting
- How pipelines and redirections are implemented
- The challenges of interactive terminal handling
Getting Started
Prerequisites
- Rust (latest stable version recommended)
- Cargo (comes with Rust)
- Unix-like operating system (Linux, macOS, BSD)
Installation
Install Push using the Makefile:
make install
This will:
- Build the shell and utilities in release mode
- Install them to ~/.push/bin/
- Create configuration files (~/.push/.pushrc, ~/.push/.push_history)
- Add ~/.push/bin to your PATH by updating your shell configuration file
After installation, run:
source ~/.bashrc
or
source ~/.zshrc
Or simply restart your terminal. Then you can use Push with:
push
Running Push:
- Interactive Mode: Just run push
- Non-Interactive Mode (from stdin): echo "ls -la" | push
- Command Mode: push -c "ls | grep test"
Project Structure
This section guides contributors on where to find and modify different components of the shell.
push/
├── shell/  # Main shell implementation
│   ├── Cargo.toml  # Rust project configuration
│   └── src/
│       ├── main.rs  # Entry point - handles shell modes (interactive/non-interactive/command)
│       ├── lib.rs  # Library root - exports public modules
│       │
│       ├── lexer/  # LEXICAL ANALYSIS - Token generation
│       │   ├── mod.rs  # Lexer module exports
│       │   ├── tokenize.rs  # Main tokenizer - converts input to tokens
│       │   └── types.rs  # Token definitions (Token, Word, WordPart, State)
│       │
│       ├── parser/  # PARSING - AST construction
│       │   ├── mod.rs  # Parser module - main parser logic
│       │   ├── types.rs  # AST node definitions (AstNode, Redirect, ArithmeticExpr)
│       │   ├── parse_command.rs  # Command parsing
│       │   ├── parse_pipeline.rs  # Pipeline (|) parsing
│       │   ├── parse_sequence.rs  # Sequence (;) parsing
│       │   ├── parse_if.rs  # If-then-else parsing
│       │   ├── parse_while.rs  # While loop parsing
│       │   ├── parse_for.rs  # For loop parsing
│       │   ├── parse_function.rs  # Function definition parsing
│       │   ├── parse_group.rs  # Command grouping ({}) parsing
│       │   ├── parse_redirection.rs  # I/O redirection parsing
│       │   └── parse_assignment.rs  # Variable assignment parsing
│       │
│       ├── expansion.rs  # EXPANSION - Variable/arithmetic/command substitution
│       │
│       ├── envirement.rs  # ENVIRONMENT - Shell state management (variables, functions, jobs)
│       │
│       ├── executor/  # EXECUTION - AST interpretation
│       │   ├── mod.rs  # Main executor - dispatches to specific executors
│       │   ├── exec_command.rs  # Command execution
│       │   ├── exec_pipeline.rs  # Pipeline execution
│       │   ├── exec_sequence.rs  # Sequence execution
│       │   ├── exec_if.rs  # If statement execution
│       │   ├── exec_while.rs  # While loop execution
│       │   ├── exec_for.rs  # For loop execution
│       │   ├── exec_until.rs  # Until loop execution
│       │   ├── exec_and.rs  # Logical AND (&&) execution
│       │   ├── exec_or.rs  # Logical OR (||) execution
│       │   ├── exec_not.rs  # Logical NOT (!) execution
│       │   ├── exec_subshell.rs  # Subshell execution
│       │   ├── exec_group.rs  # Command group execution
│       │   ├── spawn_commande.rs  # Process spawning (fork/exec)
│       │   └── run_commande.rs  # Command execution helpers
│       │
│       ├── exec.rs  # Main execution entry point (legacy support)
│       │
│       ├── shell.rs  # Shell struct and core functionality
│       │
│       ├── commands/  # BUILT-IN COMMANDS
│       │   ├── cd.rs  # Change directory
│       │   ├── pwd.rs  # Print working directory
│       │   ├── echo.rs  # Print text
│       │   ├── ls.rs  # List directory (if implemented)
│       │   ├── cat.rs  # Concatenate files (if implemented)
│       │   ├── cp.rs  # Copy files
│       │   ├── rm.rs  # Remove files
│       │   ├── mv.rs  # Move/rename files
│       │   ├── mkdir.rs  # Create directories
│       │   ├── export.rs  # Export variables
│       │   ├── jobs.rs  # List jobs
│       │   ├── fg.rs  # Foreground job
│       │   ├── bg.rs  # Background job
│       │   ├── kill.rs  # Kill process
│       │   ├── exit.rs  # Exit shell
│       │   ├── type.rs  # Show command type
│       │   ├── test.rs  # Test command
│       │   ├── tru.rs  # True command
│       │   └── fals.rs  # False command
│       │
│       ├── features/  # FEATURES
│       │   ├── history.rs  # Command history management
│       │   └── jobs.rs  # Job control system
│       │
│       ├── shell_interactions/  # INTERACTIVE TERMINAL
│       │   ├── buffer.rs  # Input buffer management
│       │   ├── history_handler.rs  # History navigation
│       │   ├── rerender.rs  # Terminal re-rendering
│       │   └── utils.rs  # Terminal utilities
│       │
│       ├── events_handler.rs  # Compatibility layer (re-exports from shell.rs)
│       ├── redirection.rs  # I/O redirection handling
│       ├── eval.rs  # Arithmetic expression evaluation
│       ├── error.rs  # Error types and handling
│       └── signal_handler.rs  # Signal handling
│
├── cat/  # Cat utility (separate binary)
├── ls/  # Ls utility (separate binary)
├── Makefile  # Build and installation scripts
└── README.md  # This file
Quick Reference for Contributors
Want to add a new built-in command? → Add implementation in shell/src/commands/your_command.rs and register it in shell/src/exec.rs (build_command function)
Want to modify tokenization? → Edit shell/src/lexer/tokenize.rs and shell/src/lexer/types.rs
Want to add new AST node types? → Edit shell/src/parser/types.rs and add corresponding parser in shell/src/parser/
Want to add new execution logic? → Add executor in shell/src/executor/ and register in shell/src/executor/mod.rs
Want to modify environment/variables? → Edit shell/src/envirement.rs
Want to improve interactive features? → Edit shell/src/shell_interactions/ and shell/src/shell.rs
Want to add expansion features? → Edit shell/src/expansion.rs
Architecture: A Multi-Stage Pipeline
The shell follows a classic compiler-like architecture with distinct phases:
1. Lexical Analysis (Tokenization)
The lexer is a state machine that processes raw input character by character. This was more complex than we initially anticipated, because it has to handle edge cases such as:
- Variable substitution: $VAR, ${VAR}, $((arithmetic)), $(command)
- Quoting: Single quotes (literal), double quotes (variable expansion), and escaping
- Redirection operators: >, >>, <, <<, and file descriptor variants like 2>&1
- Word boundaries: When does a word end? What about ls>file vs ls > file?
The state machine approach was crucial here. States like InWord, InSingleQuote, InDoubleQuote, MaybeRedirectOut2 allowed us to handle context-dependent parsing correctly.
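To make the state-machine idea concrete, here is a minimal sketch of a character-driven tokenizer. It is not Push's lexer: the Token and State enums and the tokenize function are hypothetical stand-ins, it only knows words, quotes, |, > and >>, and it ignores escaping and $ expansion entirely.

```rust
/// Tokens produced by this sketch (far fewer than a real shell needs).
#[derive(Debug, PartialEq)]
enum Token {
    Word(String),
    Pipe,
    RedirectOut,    // >
    RedirectAppend, // >>
}

/// Lexer states; the names echo the ones mentioned above, the logic is illustrative.
#[derive(Clone, Copy)]
enum State {
    Default,
    InWord,
    InSingleQuote,
    InDoubleQuote,
}

fn tokenize(input: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut word = String::new();
    let mut state = State::Default;
    let mut chars = input.chars().peekable();

    while let Some(c) = chars.next() {
        match state {
            // Inside quotes everything is literal until the matching quote.
            State::InSingleQuote => match c {
                '\'' => state = State::InWord,
                _ => word.push(c),
            },
            State::InDoubleQuote => match c {
                '"' => state = State::InWord,
                _ => word.push(c),
            },
            State::Default | State::InWord => match c {
                '\'' => state = State::InSingleQuote,
                '"' => state = State::InDoubleQuote,
                ' ' | '\t' | '|' | '>' => {
                    // Blanks and operators both end the current word,
                    // so `ls>file` and `ls > file` produce the same tokens.
                    if matches!(state, State::InWord) {
                        tokens.push(Token::Word(std::mem::take(&mut word)));
                    }
                    state = State::Default;
                    match c {
                        '|' => tokens.push(Token::Pipe),
                        '>' if chars.peek() == Some(&'>') => {
                            chars.next(); // consume the second '>'
                            tokens.push(Token::RedirectAppend);
                        }
                        '>' => tokens.push(Token::RedirectOut),
                        _ => {} // plain whitespace
                    }
                }
                _ => {
                    word.push(c);
                    state = State::InWord;
                }
            },
        }
    }

    match state {
        State::InSingleQuote | State::InDoubleQuote => Err("unclosed quote".to_string()),
        State::InWord => {
            tokens.push(Token::Word(word));
            Ok(tokens)
        }
        State::Default => Ok(tokens),
    }
}
```

Calling tokenize("ls>file") and tokenize("ls > file") yields the same three tokens, which is the word-boundary behavior described in the list above.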
2. Parsing: Building the Abstract Syntax Tree
The parser constructs an AST representing the shell command structure. This was fascinating because shell syntax is more complex than it appears:
- Operator precedence: && vs || vs ; vs |
- Grouping: { commands; } vs (subshell)
- Control flow: if-then-elif-else-fi, while-do-done, for-in-do-done
- Function definitions: name() { body; }
We implemented a recursive descent parser with lookahead, which made handling operator precedence and nested constructs manageable. The AST nodes (AstNode enum) represent everything from simple commands to complex control structures.
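As an illustration of how recursive descent encodes precedence, the sketch below uses one function per precedence level: ; binds loosest, then && and || (equal precedence, left-associative), then |. The Tok, Ast, and Parser types are hypothetical and far smaller than Push's AstNode; control flow, grouping, and redirections are left out.

```rust
#[derive(Debug, PartialEq)]
enum Tok {
    Word(String),
    Pipe, // |
    And,  // &&
    Or,   // ||
    Semi, // ;
}

#[derive(Debug, PartialEq)]
enum Ast {
    Command(Vec<String>),
    Pipeline(Vec<Ast>),
    And(Box<Ast>, Box<Ast>),
    Or(Box<Ast>, Box<Ast>),
    Sequence(Vec<Ast>),
}

struct Parser {
    toks: Vec<Tok>,
    pos: usize,
}

impl Parser {
    /// One token of lookahead.
    fn peek(&self) -> Option<&Tok> {
        self.toks.get(self.pos)
    }

    // sequence := and_or (';' and_or)* [';']
    fn parse_sequence(&mut self) -> Result<Ast, String> {
        let mut items = vec![self.parse_and_or()?];
        while self.peek() == Some(&Tok::Semi) {
            self.pos += 1;
            if self.peek().is_none() {
                break; // a trailing ';' is allowed
            }
            items.push(self.parse_and_or()?);
        }
        Ok(if items.len() == 1 { items.remove(0) } else { Ast::Sequence(items) })
    }

    // and_or := pipeline (('&&' | '||') pipeline)*
    fn parse_and_or(&mut self) -> Result<Ast, String> {
        let mut left = self.parse_pipeline()?;
        loop {
            match self.peek() {
                Some(Tok::And) => {
                    self.pos += 1;
                    let right = self.parse_pipeline()?;
                    left = Ast::And(Box::new(left), Box::new(right));
                }
                Some(Tok::Or) => {
                    self.pos += 1;
                    let right = self.parse_pipeline()?;
                    left = Ast::Or(Box::new(left), Box::new(right));
                }
                _ => return Ok(left),
            }
        }
    }

    // pipeline := command ('|' command)*
    fn parse_pipeline(&mut self) -> Result<Ast, String> {
        let mut stages = vec![self.parse_command()?];
        while self.peek() == Some(&Tok::Pipe) {
            self.pos += 1;
            stages.push(self.parse_command()?);
        }
        Ok(if stages.len() == 1 { stages.remove(0) } else { Ast::Pipeline(stages) })
    }

    // command := WORD+
    fn parse_command(&mut self) -> Result<Ast, String> {
        let mut words = Vec::new();
        while let Some(Tok::Word(w)) = self.peek() {
            words.push(w.clone());
            self.pos += 1;
        }
        if words.is_empty() {
            Err(format!("expected a command at token {}", self.pos))
        } else {
            Ok(Ast::Command(words))
        }
    }
}

fn parse(toks: Vec<Tok>) -> Result<Ast, String> {
    Parser { toks, pos: 0 }.parse_sequence()
}
```

With this layering, a | b && c ; d parses as a sequence whose first element is And(Pipeline[a, b], c), because each level only hands control back once its own operators run out.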
3. Expansion: The Magic of Variable Substitution
Variable expansion is where shells get their power. Implementing this correctly required understanding:
- When to expand: Variables in double quotes expand, but not in single quotes
- Field splitting: After expansion, words are split on whitespace (unless quoted)
- Arithmetic expansion: $((2 + 2)) evaluates to 4
- Command substitution: $(date) executes and substitutes output
- Default values: ${VAR:-default} provides fallback values
The Word type stores expansion information, allowing the executor to know whether a word should be split or kept as-is.
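A small sketch of the parameter-expansion part may help. The expand function below is hypothetical, works on a plain string rather than Push's Word type, and covers only $VAR, ${VAR}, and ${VAR:-default}; quoting, arithmetic, and command substitution are deliberately out of scope.

```rust
use std::collections::HashMap;

/// Expand `$VAR`, `${VAR}` and `${VAR:-default}` using `vars`.
fn expand(input: &str, vars: &HashMap<String, String>) -> String {
    let mut out = String::new();
    let mut chars = input.chars().peekable();

    while let Some(c) = chars.next() {
        if c != '$' {
            out.push(c);
            continue;
        }
        if chars.peek() == Some(&'{') {
            chars.next(); // consume '{'
            // Collect up to the closing brace (the brace itself is consumed too).
            let body: String = chars.by_ref().take_while(|&ch| ch != '}').collect();
            let (name, default) = match body.split_once(":-") {
                Some((n, d)) => (n, Some(d)),
                None => (body.as_str(), None),
            };
            match vars.get(name) {
                // `:-` falls back when the variable is unset or empty.
                Some(v) if !v.is_empty() => out.push_str(v),
                _ => out.push_str(default.unwrap_or("")),
            }
        } else {
            // Bare form: the name is the longest run of [A-Za-z0-9_].
            let mut name = String::new();
            while let Some(&ch) = chars.peek() {
                if ch.is_ascii_alphanumeric() || ch == '_' {
                    name.push(ch);
                    chars.next();
                } else {
                    break;
                }
            }
            if name.is_empty() {
                out.push('$'); // a lone `$` stays literal
            } else if let Some(v) = vars.get(&name) {
                out.push_str(v);
            }
        }
    }
    out
}
```

An unset variable expands to the empty string here, which is why the quoting and field-splitting rules listed above matter so much for what the command finally receives.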
3.5. Environment Management: The Shell's State
One of the most critical components we implemented was the shell environment. The ShellEnv struct manages the complete state of the shell:
- Shell Variables: A HashMap<String, (String, bool)> storing variable names, values, and whether they're exported to child processes
- Arithmetic Variables: Separate storage for numeric variables used in arithmetic expressions
- User-Defined Functions: Functions defined with name() { body; } syntax, stored as AST nodes
- Job Control: Integration with the job management system for tracking background processes
- Exit Status Tracking: Maintaining $? (last command exit status)
- Positional Arguments: Handling $0, $1, $2, etc. for script arguments
The environment initialization (ShellEnv::new()) sets up:
- Inherited environment variables from the parent process
- Standard variables: USER, HOME, SHELL, PWD
- Special handling for ~ (home directory expansion)
- Positional arguments from command-line invocation
We implemented proper separation between local variables (shell-only) and exported variables (passed to child processes). The set_local_var() and set_env_var() methods ensure correct scoping and inheritance behavior.
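The local/exported split can be sketched around the HashMap<String, (String, bool)> layout described above. Only the names ShellEnv, set_local_var, and set_env_var come from the project; their signatures, the get and exported helpers, and the choice to keep an existing export flag on reassignment (the POSIX shell behavior) are assumptions made for this example.

```rust
use std::collections::HashMap;

/// Minimal stand-in for the shell environment: name -> (value, exported?).
#[derive(Default)]
struct ShellEnv {
    vars: HashMap<String, (String, bool)>,
}

impl ShellEnv {
    /// Set a shell-only variable. If the name was already exported,
    /// the export flag is kept (assumed behavior, matching POSIX shells).
    fn set_local_var(&mut self, name: &str, value: &str) {
        let exported = self.vars.get(name).map_or(false, |(_, e)| *e);
        self.vars.insert(name.to_string(), (value.to_string(), exported));
    }

    /// Set a variable and mark it for export to child processes.
    fn set_env_var(&mut self, name: &str, value: &str) {
        self.vars.insert(name.to_string(), (value.to_string(), true));
    }

    /// Look up a variable regardless of its export flag (hypothetical helper).
    fn get(&self, name: &str) -> Option<&str> {
        self.vars.get(name).map(|(v, _)| v.as_str())
    }

    /// The sorted subset a child process should inherit (hypothetical helper),
    /// e.g. to pass to `std::process::Command::envs` after `env_clear`.
    fn exported(&self) -> Vec<(String, String)> {
        let mut out: Vec<(String, String)> = self
            .vars
            .iter()
            .filter(|(_, (_, exported))| *exported)
            .map(|(k, (v, _))| (k.clone(), v.clone()))
            .collect();
        out.sort();
        out
    }
}
```

Keeping the flag next to the value means a single lookup answers both "what is $VAR?" and "should the child see it?".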
4. Execution: Where the Real Work Happens
The execution engine is the heart of the shell. This is where we learned the most about Unix process management:
Process Creation and Management:
- Using fork() to create child processes (via the nix crate)
- execve() to replace process image with external commands
- Process groups for job control: setpgid() to create/manage groups
- Terminal control: tcsetpgrp() to give terminal control to foreground processes
Pipelines: Implementing pipelines was particularly challenging. The key insight: all commands in a pipeline must run concurrently, connected via pipes. The implementation:
- Creates pipes between consecutive commands
- Spawns all processes concurrently (not sequentially!)
- Connects stdin/stdout appropriately
- Waits for all processes to complete
- Returns the exit status of the last command
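The steps above can be sketched with the standard library's std::process in place of raw fork()/execve(). This is a swapped-in technique for illustration, not Push's implementation: run_pipeline is a hypothetical name, every stage is assumed to be a non-empty argv slice, there are no process groups, and the last stage's output is captured so the caller can inspect it.

```rust
use std::io::{self, Read};
use std::process::{Child, ChildStdout, Command, Stdio};

/// Run `stages` as a pipeline; returns the last stage's exit code and its stdout.
fn run_pipeline(stages: &[&[&str]]) -> io::Result<(i32, String)> {
    let mut children: Vec<Child> = Vec::new();
    // Read end of the pipe that feeds the next stage.
    let mut prev_stdout: Option<ChildStdout> = None;

    // Spawn every stage before waiting on any of them.
    for argv in stages {
        let mut cmd = Command::new(argv[0]);
        cmd.args(&argv[1..]).stdout(Stdio::piped());
        if let Some(out) = prev_stdout.take() {
            // Moving the handle into the child means the parent keeps no copy open.
            cmd.stdin(Stdio::from(out));
        }
        let mut child = cmd.spawn()?;
        prev_stdout = child.stdout.take();
        children.push(child);
    }

    // Drain the final stage's output, then reap all children.
    let mut output = String::new();
    if let Some(mut out) = prev_stdout {
        out.read_to_string(&mut output)?;
    }
    let mut last_code = 0;
    for mut child in children {
        // A child killed by a signal has no exit code; report -1 in that case.
        last_code = child.wait()?.code().unwrap_or(-1);
    }
    Ok((last_code, output))
}
```

For example, run_pipeline(&[&["echo", "hello"], &["tr", "a-z", "A-Z"]]) runs both programs at once and reports the status of tr, the last stage.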
Job Control: Job control allows background execution and process management:
- Background jobs: command & runs in background
- Job tracking: Each job has a process group ID (PGID)
- Status tracking: Running, Stopped, Done, Terminated
- Job commands: jobs, fg, bg, kill
- Signal handling: SIGTSTP (Ctrl+Z), SIGINT (Ctrl+C), SIGCONT
The job reaper thread periodically checks for completed child processes using waitpid() with WNOHANG, preventing zombie processes.
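A non-blocking reaper pass can be sketched with std's Child::try_wait, which polls a child without blocking and so plays the same role as waitpid() with WNOHANG. The Job struct, the two-state JobStatus, and the spawn_job and reap_finished functions are all hypothetical simplifications; the real job table also tracks PGIDs and the Stopped and Terminated states.

```rust
use std::io;
use std::process::{Child, Command};

#[derive(Debug, PartialEq, Clone, Copy)]
enum JobStatus {
    Running,
    Done(i32), // exit code
}

struct Job {
    id: usize,
    child: Child,
    status: JobStatus,
}

/// Start a background job (no process groups in this sketch).
fn spawn_job(id: usize, program: &str, args: &[&str]) -> io::Result<Job> {
    let child = Command::new(program).args(args).spawn()?;
    Ok(Job { id, child, status: JobStatus::Running })
}

/// One reaper pass: poll each running job without blocking and
/// return the ids of the jobs that finished since the last pass.
fn reap_finished(jobs: &mut [Job]) -> io::Result<Vec<usize>> {
    let mut finished = Vec::new();
    for job in jobs.iter_mut().filter(|j| j.status == JobStatus::Running) {
        // `try_wait` returns Ok(None) while the child is still running.
        if let Some(exit) = job.child.try_wait()? {
            job.status = JobStatus::Done(exit.code().unwrap_or(-1));
            finished.push(job.id);
        }
    }
    Ok(finished)
}
```

A reaper thread would call reap_finished on a timer; because each child is collected as soon as it exits, it never lingers as a zombie.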
Redirections: I/O redirection required careful file descriptor management:
- > redirects stdout (fd 1) to a file
- >> appends to a file
- < redirects stdin (fd 0) from a file
- 2>&1 redirects stderr to stdout
- File descriptor variants: 2>file, 1>&2
The redirection system merges redirects from commands and groups, ensuring proper ordering and precedence.
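To show how the operator forms above map onto file descriptors, here is a sketch that classifies a single redirection word. The Redir enum and parse_redirect function are hypothetical (Push's own Redirect type may look quite different), only the attached form such as 2>err.txt is handled, and here-documents (<<) are not covered.

```rust
#[derive(Debug, PartialEq)]
enum Redir {
    /// [n]>file or [n]>>file; fd defaults to 1 (stdout).
    Output { fd: u32, path: String, append: bool },
    /// [n]<file; fd defaults to 0 (stdin).
    Input { fd: u32, path: String },
    /// [n]>&m; make fd n a copy of fd m.
    Duplicate { fd: u32, target: u32 },
}

/// Classify one redirection word; returns None if it is not a redirection.
fn parse_redirect(word: &str) -> Option<Redir> {
    let op_start = word.find(|c: char| c == '>' || c == '<')?;
    let (fd_part, rest) = word.split_at(op_start);
    // Anything before the operator must be a file descriptor number.
    let explicit_fd = if fd_part.is_empty() {
        None
    } else {
        Some(fd_part.parse::<u32>().ok()?)
    };

    // Longer operators are tried first so `>>` and `>&` are not read as `>`.
    if let Some(target) = rest.strip_prefix(">&") {
        Some(Redir::Duplicate {
            fd: explicit_fd.unwrap_or(1),
            target: target.parse().ok()?,
        })
    } else if let Some(path) = rest.strip_prefix(">>") {
        Some(Redir::Output { fd: explicit_fd.unwrap_or(1), path: path.to_string(), append: true })
    } else if let Some(path) = rest.strip_prefix('>') {
        Some(Redir::Output { fd: explicit_fd.unwrap_or(1), path: path.to_string(), append: false })
    } else {
        let path = rest.strip_prefix('<')?;
        Some(Redir::Input { fd: explicit_fd.unwrap_or(0), path: path.to_string() })
    }
}
```

Order still matters once the redirections are applied: cmd >out 2>&1 sends both streams to the file, while cmd 2>&1 >out duplicates stderr onto the original stdout first.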
5. Interactive Terminal Handling
Building an interactive shell required raw terminal mode:
- Raw mode: Disable line buffering and echo for full control
- Line editing: Insert, delete, cursor movement
- History navigation: Up/Down arrows to browse command history
- Signal handling: Ctrl+C, Ctrl+D, Ctrl+Z, Ctrl+L (clear screen)
- Prompt display: Dynamic prompt with proper cursor positioning
Using the termion crate, we implemented a readline-like interface with proper cursor tracking and history management. The history persists to a file for future sessions.
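The editing core of such an interface is independent of the terminal library, so it can be sketched without termion. LineBuffer below is a hypothetical structure, not the contents of buffer.rs: it stores the line as a Vec<char> and keeps the cursor as a character index, which keeps cursor arithmetic simple for non-ASCII input.

```rust
/// Editable input line with a cursor measured in characters.
#[derive(Default)]
struct LineBuffer {
    chars: Vec<char>,
    cursor: usize, // 0 = before the first character
}

impl LineBuffer {
    /// Insert at the cursor and advance past the new character.
    fn insert(&mut self, c: char) {
        self.chars.insert(self.cursor, c);
        self.cursor += 1;
    }

    /// Delete the character to the left of the cursor, if any.
    fn backspace(&mut self) {
        if self.cursor > 0 {
            self.cursor -= 1;
            self.chars.remove(self.cursor);
        }
    }

    fn move_left(&mut self) {
        self.cursor = self.cursor.saturating_sub(1);
    }

    fn move_right(&mut self) {
        if self.cursor < self.chars.len() {
            self.cursor += 1;
        }
    }

    /// The current line as a String, e.g. for re-rendering or execution.
    fn text(&self) -> String {
        self.chars.iter().collect()
    }
}
```

In raw mode, each key event would map to one of these methods, followed by a redraw of the prompt and buffer with the terminal cursor placed at the cursor offset.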
Key Technical Challenges
1. Process Group Management
One of the trickiest aspects was understanding process groups. When you run ls | grep test, both processes must be in the same process group so signals (like Ctrl+C) affect both. The first process becomes the group leader, and subsequent processes join that group.
2. Terminal Control Transfer
When a foreground process runs, the shell must:
- Give terminal control to the process group (tcsetpgrp)
- Temporarily ignore SIGTTOU (to avoid blocking)
- Wait for the process to complete
- Reclaim terminal control
This dance ensures proper signal delivery and terminal behavior.
3. Pipeline Execution Order
Pipelines must execute all commands concurrently, not sequentially. This means:
- All fork() calls happen before any wait()
- Pipes are created before spawning processes
- File descriptors are properly closed in parent and child
- The shell waits for the entire pipeline, not individual commands
4. Variable Expansion Edge Cases
Handling variable expansion correctly required careful attention to:
- Quoting: "$VAR" expands, '$VAR' doesn't
- Field splitting: VAR="a b" splits into two arguments when unquoted
- Empty variables: "" vs unset variables
- Special variables: $? (exit status), $0 (shell or script name), $$ (PID)
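The quoting and field-splitting cases can be illustrated with a small sketch. Part and split_fields are hypothetical names; the example assumes each word has already been expanded into pieces that remember whether they came from inside quotes, and it splits on any whitespace rather than on a configurable IFS.

```rust
/// A piece of one word after expansion.
struct Part {
    text: String,
    quoted: bool, // true if the text came from inside quotes
}

/// Turn the expanded parts of one word into the final argument list.
fn split_fields(parts: &[Part]) -> Vec<String> {
    let mut fields = Vec::new();
    let mut current: Option<String> = None;

    for part in parts {
        if part.quoted {
            // Quoted text never splits, and even "" starts an (empty) field.
            current.get_or_insert_with(String::new).push_str(&part.text);
        } else {
            for c in part.text.chars() {
                if c.is_whitespace() {
                    // Whitespace ends the field in progress, if there is one.
                    if let Some(field) = current.take() {
                        fields.push(field);
                    }
                } else {
                    current.get_or_insert_with(String::new).push(c);
                }
            }
        }
    }
    if let Some(field) = current {
        fields.push(field);
    }
    fields
}
```

With VAR="a b", the unquoted part yields two arguments and the quoted part yields one; an unquoted empty expansion disappears entirely, while "" survives as an empty argument.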
5. Error Handling and Recovery
Rust's Result<T, E> type was invaluable for error handling. The ShellError enum captures:
- Syntax errors: Unclosed quotes, unexpected tokens
- Parse errors: Invalid command structure
- Execution errors: Command not found, permission denied
- IO errors: File operations, pipe creation
Proper error propagation ensures the shell can recover gracefully and provide meaningful error messages.
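A sketch of what such an error type might look like follows. Only the name ShellError and the four categories come from the text above; the variant names, messages, the exit_status mapping (127 for "command not found" and 2 for syntax errors follow common shell convention), and the read_script helper are assumptions for illustration.

```rust
use std::fmt;
use std::io;

#[derive(Debug)]
enum ShellError {
    Syntax(String),          // e.g. unclosed quote
    Parse(String),           // e.g. invalid command structure
    CommandNotFound(String), // execution error
    Io(io::Error),           // file operations, pipe creation
}

impl fmt::Display for ShellError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ShellError::Syntax(msg) => write!(f, "syntax error: {}", msg),
            ShellError::Parse(msg) => write!(f, "parse error: {}", msg),
            ShellError::CommandNotFound(cmd) => write!(f, "{}: command not found", cmd),
            ShellError::Io(err) => write!(f, "io error: {}", err),
        }
    }
}

impl std::error::Error for ShellError {}

/// Lets `?` convert io::Error into ShellError automatically.
impl From<io::Error> for ShellError {
    fn from(err: io::Error) -> Self {
        ShellError::Io(err)
    }
}

impl ShellError {
    /// Exit status to store in `$?` after the error is reported.
    fn exit_status(&self) -> i32 {
        match self {
            ShellError::CommandNotFound(_) => 127,
            ShellError::Syntax(_) | ShellError::Parse(_) => 2,
            ShellError::Io(_) => 1,
        }
    }
}

/// Hypothetical helper: any io::Error is propagated as ShellError::Io.
fn read_script(path: &str) -> Result<String, ShellError> {
    Ok(std::fs::read_to_string(path)?)
}
```

An interactive loop can then print the error, set $? from exit_status(), and show the next prompt instead of exiting.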
Rust-Specific Insights
Building this in Rust provided unique advantages and challenges:
Advantages:
- Memory safety: No buffer overflows, use-after-free, or data races in safe Rust
- Pattern matching: Exhaustive matching on enums caught many bugs at compile time
- Ownership system: Made it clear who owns file descriptors, processes, etc.
- Error handling: Result types forced explicit error handling
- Zero-cost abstractions: The abstractions compile away, leaving efficient code
Challenges:
- Lifetime management: Ensuring file descriptors and process handles live long enough
- String handling: Rust's string types required careful conversion between &str, String, and OsString
- Separation of concerns: Keeping the lexer's job clearly separate from the parser's
Features Implemented
The shell supports a comprehensive set of features, all tested and verified:
Built-in Commands ✅
- File operations: ls, cat, cp, rm, mv, mkdir
- Navigation: cd, pwd
- Utilities: echo, export, type, test, true, false
- Job control: jobs, fg, bg, kill
- Control: exit
Shell Constructs ✅
- Command sequences: cmd1; cmd2; cmd3 (fully working)
- Pipelines: cmd1 | cmd2 | cmd3 (multi-stage pipelines supported)
- Logical operators: cmd1 && cmd2, cmd1 || cmd2, ! cmd (all working)
- Background execution: cmd & (supported)
- Variable assignments: VAR=value command (working)
- I/O redirection: >, >>, < (working when directories exist)
- Command grouping: { cmd1; cmd2; } (fully functional)
Control Flow ✅
- Conditionals: if-then-elif-else-fi (all branches working)
- Loops:
  - for-in-do-done (fully working)
  - while-do-done (working, arithmetic in conditions has limitations)
  - until-do-done (working, arithmetic in conditions has limitations)
- Loop control: break, continue (with optional levels, fully working)
- Functions: name() { body; } (fully implemented with argument support)
Functions ✅ (Fully Implemented)
- Function definitions: myfunc() { echo hello; }
- Function calls: myfunc
- Function arguments: greet() { echo hello $1; } followed by greet world
- Multiple arguments: Functions support $1, $2, $3, etc.
- Nested function calls: Functions can call other functions
- Function redefinition: Functions can be redefined
- Positional parameter scoping: Parameters are properly saved/restored
- Functions with control structures: Functions can contain if/for/while
- Functions with pipelines and redirections: Full support
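Positional parameter scoping boils down to a save/restore around the function body. The sketch below is an assumption about how that can be done, not Push's code: Shell here is a hypothetical struct holding only $1, $2, and so on, and the function body is modeled as a Rust closure instead of an AST node.

```rust
/// Hypothetical shell state holding only the positional parameters ($1, $2, ...).
struct Shell {
    positional: Vec<String>,
}

impl Shell {
    /// Run `body` with `args` as $1.., then restore the caller's parameters.
    fn call_function<F>(&mut self, args: &[&str], body: F) -> i32
    where
        F: FnOnce(&mut Shell) -> i32,
    {
        let new_args: Vec<String> = args.iter().map(|s| s.to_string()).collect();
        // Swap in the function's arguments and keep the caller's for later.
        let saved = std::mem::replace(&mut self.positional, new_args);
        let status = body(self);
        self.positional = saved;
        status
    }

    /// Value of $n (1-based); empty if unset, as in POSIX shells.
    fn param(&self, n: usize) -> &str {
        n.checked_sub(1)
            .and_then(|i| self.positional.get(i))
            .map_or("", |s| s.as_str())
    }
}
```

Because each call saves and restores on its own stack frame, nested calls naturally see their own arguments and the outer values reappear when they return.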
Interactive Features ✅
- Command history with persistent storage
- Line editing (insert, delete, cursor movement)
- History navigation (Up/Down arrows)
- Terminal control (Ctrl+C, Ctrl+D, Ctrl+Z, Ctrl+L)
- Dynamic prompt
Variable Expansion ✅
- Basic expansion: $VAR, ${VAR}
- Environment variables: export VAR=value and $VAR
- Positional parameters: $1, $2, etc. (in functions)
- Special variables: $0 (preserved in functions), $? (exit status)
Known Limitations ⚠️
- Subshells: (cmd1; cmd2) - Parse error, not yet implemented
- Arithmetic in test conditions: $((i+1)) in while/until loops has issues
- File operation flags: mkdir -p and rm -rf don't support flags (flags treated as filenames)
- Pipelines with control structures: Pipelines can only contain commands, not if/for/while inside them
- Redirection to non-existent directories: Requires parent directory to exist first
What We Learned
This project taught us:
- Systems Programming: Deep understanding of Unix process model, signals, file descriptors, and terminal I/O
- Language Design: How shell syntax evolved and why certain design decisions were made
- Compiler Techniques: Lexing, parsing, AST construction, and interpretation
- Concurrency: Process management, job control, and signal handling
- Rust Mastery: Advanced Rust features, FFI, ownership patterns, and error handling
The Result
Push is a very capable shell implementation that has been extensively tested and verified. It can:
✅ Execute external commands and built-ins - All basic commands work correctly
✅ Handle complex pipelines and redirections - Multi-stage pipelines fully functional
✅ Manage background jobs - Job control with proper process group handling
✅ Support control flow constructs - If/while/for/until all working
✅ Function support - Full function implementation with arguments and scoping
✅ Provide interactive editing - History, line editing, cursor movement
✅ Manage shell environment - Proper variable scoping and function storage
✅ Command sequences and logical operators - All combinations working
✅ Nested control structures - Nested loops and conditionals supported
Test Results Summary
Based on extensive testing (40+ test scenarios):
- 75% of features fully working (30/40 tests passed completely)
- 20% partially working (8/40 tests with minor limitations)
- 5% not yet implemented (2/40 tests - subshells and some edge cases)
The shell successfully handles:
- All basic built-in commands
- Complex command sequences (8+ commands)
- Multi-stage pipelines
- All logical operator combinations
- Complete if-then-elif-else-fi structures
- For, while, and until loops
- Break and continue with levels
- Full function implementation with arguments
- Variable expansion and environment management
- Command grouping and complex combinations
While it's not a complete replacement for bash or zsh (some advanced features like subshells and command substitution are still in development), we've built a solid, well-tested foundation that demonstrates deep systems programming knowledge and Rust expertise. The core functionality is robust and handles most day-to-day shell operations effectively. The architecture is well-designed and extensible, making it ready for further development.
Reflection
Building a shell from scratch is one of those projects that seems simple until you start implementing it. The devil is truly in the details: handling edge cases in parsing, managing process lifecycles correctly, ensuring proper signal delivery, and providing a smooth interactive experience.
This project has been invaluable for our growth as systems programmers. Working as a team, we divided responsibilities while maintaining close collaboration. We implemented the complete shell environment system together, ensuring proper variable scoping, function storage, and job tracking. The collaborative effort forced us to understand not just how shells work, but why they work the way they do. Every feature we implemented revealed new layers of complexity and taught us something new about Unix, Rust, and software engineering in general.
If you're interested in systems programming, language implementation, or just want to understand how the tools you use daily actually work, I highly recommend building a shell. It's challenging, educational, and deeply satisfying.
Open for Contributions
We're excited to share Push with the community and welcome contributions! While we have a solid foundation in place, there are still features to implement and improvements to make. Whether you're interested in:
- Adding new built-in commands
- Implementing advanced shell features
- Improving error handling and user experience
- Optimizing performance
- Writing documentation or tests
- Or just exploring how shells work
We'd love to have you contribute! The codebase is well-structured, and we're happy to help onboard new contributors. If you're passionate about systems programming, Rust, or just want to learn by contributing to a real project, reach out or check out our repository.
Tech Stack: Rust, nix (Unix bindings), termion (terminal I/O), the Rust standard library
Status: Core functionality working and tested - 75% of test scenarios pass fully (30/40), open for contributions
Test Coverage: 40+ comprehensive test scenarios covering all implemented features
Team: Youssef Hajjaoui, Zakaria Salhi, and myself
#Rust #SystemsProgramming #Unix #Shell #SoftwareEngineering #OpenSource #Programming #Tech #ContributionsWelcome