What is Git and How Does Version Control Work?

If you write code, you need to use Git. Full stop. It’s become so fundamental to software development that it’s hard to imagine working without it. But here’s the thing: a lot of developers use Git without really understanding what’s happening under the hood. Let me fix that.

I’m going to walk you through what Git actually does, why it’s designed the way it is, and how to use it effectively. By the end, you’ll understand not just the commands, but the underlying model that makes Git so powerful.

What Problem Does Version Control Solve?

Before we dive into Git specifically, let’s talk about why version control exists at all. Imagine you’re working on a project and you want to:

Save your progress at different points in time
Try an experimental feature without breaking your working code
Collaborate with others without overwriting each other’s work
See what changed between yesterday and today
Roll back when something breaks (and trust me, something will break)

Without version control, developers used to save files like this:

project_final.py
project_final_v2.py
project_final_v2_actually_final.py
project_final_v2_actually_final_for_real_this_time.py

Yeah… not great. Version control systems formalize this chaos into a structured system.

Git vs Other Version Control Systems

Git isn’t the only version control system out there. Before Git became dominant, systems like SVN (Subversion) and CVS were common. Here’s what makes Git different:

Distributed vs Centralized:

Centralized (SVN): One central server holds the repository. Developers check out files, make changes, commit back to the server.
Distributed (Git): Every developer has a complete copy of the repository with full history. You can work offline and sync later.

This might seem like a small difference, but it’s huge in practice. With Git:

You can commit changes locally without network access
The repository is backed up on every developer’s machine
Branching and merging are lightning fast (they don’t require server communication)
You can work on multiple features in parallel on separate branches, and they stay isolated from each other until you merge

How Git Actually Works: The Mental Model

Here’s where it gets interesting. Git doesn’t store your files the way you might think. Understanding this model is key to mastering Git.

The Three States

Every file in your Git repository can be in one of three states:

Modified: You’ve changed the file but haven’t committed it yet
Staged: You’ve marked a modified file to go into your next commit
Committed: The file is safely stored in your local repository

These states correspond to three areas:

Working Directory  →  Staging Area  →  Git Repository
(modified)         (staged)        (committed)

When you make changes, they start in your working directory. You explicitly add them to the staging area (git add), then commit them to the repository (git commit).
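To make the three areas concrete, here is a toy model in Python. Treat it as a sketch only: the `TinyRepo` class and its method names are made up for illustration, and real Git tracks far more than a few dictionaries. It also shows a fourth label, "untracked", for a file Git has never been told about.

```python
class TinyRepo:
    """Toy model of Git's three areas; not how Git is implemented."""

    def __init__(self):
        self.working = {}    # working directory: path -> current content
        self.index = {}      # staging area: path -> content staged for the next commit
        self.committed = {}  # snapshot recorded by the most recent commit

    def edit(self, path, content):
        self.working[path] = content

    def add(self, path):
        # like `git add`: copy the working version into the staging area
        self.index[path] = self.working[path]

    def commit(self):
        # like `git commit`: the staging area becomes the new snapshot
        self.committed = dict(self.index)

    def state(self, path):
        if path not in self.index:
            return "untracked"
        if self.working.get(path) != self.index.get(path):
            return "modified"
        if self.index.get(path) != self.committed.get(path):
            return "staged"
        return "committed"


repo = TinyRepo()
repo.edit("hello.py", "print('Hello, Git!')")
print(repo.state("hello.py"))  # untracked
repo.add("hello.py")
print(repo.state("hello.py"))  # staged
repo.commit()
print(repo.state("hello.py"))  # committed
repo.edit("hello.py", "print('Hello again!')")
print(repo.state("hello.py"))  # modified
```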

How Git Stores Data

Here’s what blew my mind when I first learned it: Git doesn’t store file changes (deltas), it stores snapshots.

When you commit, Git takes a snapshot of all your files and stores a reference to that snapshot. If files haven’t changed, Git doesn’t store them again—it just links to the previous version.

# Commit history visualization
Commit A: [file1_v1, file2_v1, file3_v1]
↓
Commit B: [file1_v2, file2_v1, file3_v1]  # Only file1 changed
↓
Commit C: [file1_v2, file2_v2, file3_v2]  # file2 and file3 changed

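This might seem inefficient (storing whole files instead of just changes), but Git uses clever compression and deduplication. Every object is identified by a hash of its contents, so files that are identical across commits are stored only once. Git also compresses objects and, when it packs them for storage or transfer, may delta-compress similar files behind the scenes. The model you work with, though, is always whole snapshots.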
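Here is a small Python sketch of why identical files cost nothing extra. The `git_blob_id` function follows Git's blob hashing scheme for SHA-1 repositories (a `blob <size>` header, a NUL byte, then the content). The `store` dictionary and `save` helper are stand-ins I made up for the object database, and the file contents are invented.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the ID Git gives a blob with this content (SHA-1 repositories)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


store = {}  # toy object database: object ID -> content


def save(content: bytes) -> str:
    oid = git_blob_id(content)
    store[oid] = content  # saving the same content twice changes nothing
    return oid


# Two snapshots: file1 changes between them, file2 does not
commit_a = {"file1": save(b"v1 of file1\n"), "file2": save(b"v1 of file2\n")}
commit_b = {"file1": save(b"v2 of file1\n"), "file2": save(b"v1 of file2\n")}

print(commit_a["file2"] == commit_b["file2"])  # True: same content, same blob
print(len(store))                              # 3 blobs for 4 file versions
```

Real Git also zlib-compresses each object before writing it to disk, which this sketch skips.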

Essential Git Operations: A Practical Guide

Let me walk you through the core Git operations you’ll use daily. I’ll show you what’s happening behind the scenes.

Initializing a Repository

# Create a new Git repository
git init my-project
cd my-project

# Or initialize in existing directory
git init

This creates a .git directory that contains all of Git’s metadata and object database. Everything Git needs to track your project is in this folder.

Making Your First Commit

# Create a file
echo "print('Hello, Git!')" > hello.py

# Check status (you'll use this constantly)
git status
# Shows: hello.py is untracked

# Stage the file
git add hello.py

# Check status again
git status
# Shows: hello.py is staged for commit

# Commit with a message
git commit -m "Add hello world script"

What just happened?

git add created a blob object in .git/objects containing the file contents
git commit created a tree object (directory structure) and a commit object (metadata)
The commit object points to the tree, which points to the blob

Each commit has:

A unique SHA-1 hash (like a3f5b7c2...)
Author information
Timestamp
Commit message
Pointer to parent commit(s)
Pointer to the tree (file snapshot)
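If you want to see how those pieces chain together, the sketch below builds the three object types in Python and hashes them the way Git does for SHA-1 repositories. It only handles a flat directory of regular files, and the author line (name, email, timestamp) is an invented example value, so the printed hashes will not match any real repository.

```python
import hashlib


def object_id(kind: str, body: bytes) -> str:
    """Hash an object Git-style: SHA-1 over '<kind> <size>', a NUL byte, then the body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries: dict) -> bytes:
    """entries maps file name -> blob ID (flat directory, regular files only)."""
    body = b""
    for name in sorted(entries):
        # each entry: mode, space, name, NUL byte, then the raw 20-byte blob ID
        body += b"100644 " + name.encode("utf-8") + b"\0" + bytes.fromhex(entries[name])
    return body


def commit_body(tree: str, parents: list, author: str, message: str) -> bytes:
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {author}", "", message]
    return ("\n".join(lines) + "\n").encode("utf-8")


blob = object_id("blob", b"print('Hello, Git!')\n")
tree = object_id("tree", tree_body({"hello.py": blob}))
author = "Ada Example <ada@example.com> 1700000000 +0000"  # made-up identity
first = object_id("commit", commit_body(tree, [], author, "Add hello world script"))
second = object_id("commit", commit_body(tree, [first], author, "Second commit"))

print(first)
print(second)  # differs from first: the parent pointer and message are hashed too
```

Because the parent ID is part of what gets hashed, changing any commit changes the ID of every commit that comes after it.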

Understanding the Commit Graph

Commits form a directed acyclic graph (DAG). Each commit points to its parent(s):

A ← B ← C ← D (main)
↖ E ← F (feature-branch)

Commit D’s parent is C, C’s parent is B, etc. This is how Git knows the history of your project.
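The same idea in a few lines of Python: each commit stores only its parents, and history is whatever you reach by walking backwards. I'm assuming for illustration that the feature branch forks from B, and the `history` helper is my own name, not a Git API.

```python
# commit -> list of parent commits
parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["C"],
    "E": ["B"],  # assumption: the feature branch forks from B
    "F": ["E"],
}
branches = {"main": "D", "feature-branch": "F"}


def history(commit):
    """Return every commit reachable from `commit`, starting with itself."""
    seen, order, stack = set(), [], [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        order.append(current)
        stack.extend(parents[current])
    return order


print(history(branches["main"]))            # ['D', 'C', 'B', 'A']
print(history(branches["feature-branch"]))  # ['F', 'E', 'B', 'A']
```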

Branching: Git’s Killer Feature

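Branches in Git are incredibly lightweight: each one is just a pointer to a commit. Creating a branch writes a tiny reference instead of copying any files, which is why branching is so fast in Git (in SVN, by contrast, a branch is a copy of a directory tree made on the central server).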

# Create a new branch
git branch feature-login

# Switch to the branch
git checkout feature-login

# Or do both at once
git checkout -b feature-login

# Git 2.23+ also provides 'switch', a dedicated command for changing branches (checkout still works)
git switch -c feature-login

What’s happening? Git creates a new pointer called feature-login that points to your current commit. When you make new commits, this pointer moves forward while main stays put:

Before:
main → A ← B ← C
feature-login → (same as main)

After commits on feature-login:
main → A ← B ← C
feature-login → A ← B ← C ← D ← E
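A toy Python model makes the "branches are just pointers" idea easy to see. The `Repo` class below is hypothetical and ignores file contents entirely; it only tracks which commit each branch name points to.

```python
class Repo:
    def __init__(self):
        self.parent = {}                 # commit id -> parent commit id (None for the root)
        self.branches = {"main": None}   # branch name -> commit id
        self.head = "main"               # name of the checked-out branch

    def commit(self, commit_id):
        self.parent[commit_id] = self.branches[self.head]
        self.branches[self.head] = commit_id  # only the current branch moves

    def branch(self, name):
        # a new pointer to the current commit; nothing is copied
        self.branches[name] = self.branches[self.head]

    def switch(self, name):
        self.head = name


repo = Repo()
for commit_id in "ABC":
    repo.commit(commit_id)

repo.branch("feature-login")
repo.switch("feature-login")
repo.commit("D")
repo.commit("E")

print(repo.branches)  # {'main': 'C', 'feature-login': 'E'}
```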

Merging: Bringing Changes Together

Once you’ve finished work on a branch, you merge it back:

# Switch to the branch you want to merge into
git checkout main

# Merge the feature branch
git merge feature-login

Git tries to automatically merge changes. There are two types of merges:

1. Fast-Forward Merge (simple case):

Before:
main → A ← B ← C
feature → A ← B ← C ← D ← E

After merge:
main → A ← B ← C ← D ← E (main caught up)

Git just moves the main pointer forward. No merge commit needed.

2. Three-Way Merge (when both branches have changes):

Before:
main → A ← B ← C ← D
feature → A ← B ← E ← F

After merge:
main → A ← B ← C ← D ← M
feature → A ← B ← E ← F ↗

M = merge commit with two parents (D and F)

Git creates a new commit that combines changes from both branches.
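How does Git decide which kind of merge applies? Roughly: if your branch is an ancestor of the one being merged, a fast-forward is enough. The Python sketch below captures that check on the two graphs from the diagrams. The function names are mine and this is a simplification of what Git does.

```python
def ancestors(parents, commit):
    """All commits reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def merge_kind(parents, ours, theirs):
    if theirs in ancestors(parents, ours):
        return "already up to date"
    if ours in ancestors(parents, theirs):
        return "fast-forward"
    return "three-way"


# Case 1: main is at C, feature is at E
linear = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"]}
print(merge_kind(linear, "C", "E"))  # fast-forward

# Case 2: main is at D, feature is at F, both forked from B
forked = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["B"], "F": ["E"]}
print(merge_kind(forked, "D", "F"))  # three-way

forked["M"] = ["D", "F"]  # the merge commit records both parents
print(sorted(ancestors(forked, "M")))  # ['A', 'B', 'C', 'D', 'E', 'F', 'M']
```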

Handling Merge Conflicts

Sometimes Git can’t automatically merge changes (e.g., both branches modified the same line). You get a conflict:

git merge feature-login
# Auto-merging app.py
# CONFLICT (content): Merge conflict in app.py
# Automatic merge failed; fix conflicts and then commit the result.

Open the conflicted file:

<<<<<<< HEAD
def login(username, password):
    # Version in main branch
    return authenticate_user(username, password)
=======
def login(email, password):
    # Version in feature branch
    return auth_service.verify(email, password)
>>>>>>> feature-login

You need to manually resolve this:

def login(email, password):
    # Resolved: use email parameter with new auth service
    return auth_service.verify(email, password)

Then:

# Stage the resolved file
git add app.py

# Complete the merge
git commit
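Why did Git give up on this one? A three-way merge compares each side against the common ancestor: if only one side changed a region, that side wins; if both changed it differently, it's a conflict. Here's a simplified Python sketch of that rule. Real Git works on line hunks, while this version treats each named region as a unit, and the `logout` function and the base contents are invented for the example.

```python
def three_way_merge(base, ours, theirs):
    """Merge dicts of region -> text. Returns (merged, conflicts)."""
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:        # both sides agree (or neither touched it)
            result = o
        elif o == b:      # only theirs changed it
            result = t
        elif t == b:      # only ours changed it
            result = o
        else:             # both changed it, differently
            conflicts.append(key)
            continue
        if result is not None:  # None means the region was deleted
            merged[key] = result
    return merged, conflicts


base = {
    "login": "def login(username, password): ...",
    "logout": "def logout(): ...",
}
ours = {
    "login": "def login(username, password): return authenticate_user(username, password)",
    "logout": "def logout(): ...",
}
theirs = {
    "login": "def login(email, password): return auth_service.verify(email, password)",
    "logout": "def logout(): session.clear()",
}

merged, conflicts = three_way_merge(base, ours, theirs)
print(conflicts)         # ['login']
print(merged["logout"])  # def logout(): session.clear()
```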

Working with Remote Repositories

Git is distributed, so you can push and pull from remote repositories (like GitHub, GitLab, or your own server).

Cloning a Repository

# Clone from remote
git clone https://github.com/user/repo.git
cd repo

# See configured remotes
git remote -v
# origin  https://github.com/user/repo.git (fetch)
# origin  https://github.com/user/repo.git (push)

origin is the conventional name for your primary remote repository.

Pushing Changes

# Push commits to remote
git push origin main

# Push a new branch
git push -u origin feature-login
# The -u flag sets up tracking so future pushes can just use: git push

What’s happening? Git sends your commit objects, trees, and blobs to the remote server. The remote updates its main branch to point to your latest commit.

Pulling Changes

# Fetch changes from remote
git fetch origin

# Merge them into your current branch
git merge origin/main

# Or do both at once
git pull origin main

Pro tip: I prefer git fetch + git merge over git pull because it lets me review changes before merging them into my local branch.
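The difference is easier to see in a toy model. In the Python sketch below (a hypothetical `Clone` class, fast-forward merges only), `fetch` updates the remote-tracking pointer that Git shows as `origin/main` but leaves your local branch alone, and `pull` is simply the two steps back to back.

```python
class Clone:
    def __init__(self, remote):
        self.remote = remote          # the server's branches: name -> commit id
        self.local = dict(remote)     # your local branches
        self.tracking = dict(remote)  # remote-tracking refs such as origin/main

    def fetch(self):
        # updates origin/* only; local branches are untouched
        self.tracking = dict(self.remote)

    def merge_tracking(self, branch):
        # fast-forward case only, to keep the sketch short
        self.local[branch] = self.tracking[branch]

    def pull(self, branch):
        self.fetch()
        self.merge_tracking(branch)


server = {"main": "C"}
me = Clone(server)
server["main"] = "E"  # a teammate pushes two new commits

me.fetch()
print(me.local["main"], me.tracking["main"])  # C E  (time to review the difference)
me.merge_tracking("main")
print(me.local["main"])                       # E
```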

Essential Git Commands Cheat Sheet

Here are the commands you’ll use most often:

# Status and Information
git status                    # Show working directory status
git log                       # Show commit history
git log --oneline --graph     # Compact, visual history
git diff                      # Show unstaged changes
git diff --staged             # Show staged changes

# Basic Operations
git add <file>                # Stage changes
git add .                     # Stage all changes
git commit -m "message"       # Commit staged changes
git commit -am "message"      # Stage and commit (tracked files only)

# Branching
git branch                    # List branches
git branch <name>             # Create branch
git checkout <branch>         # Switch branch
git switch <branch>           # Modern way to switch branch
git switch -c <branch>        # Create and switch to new branch
git branch -d <branch>        # Delete branch (safe)
git branch -D <branch>        # Force delete branch

# Merging
git merge <branch>            # Merge branch into current
git merge --abort             # Abort a problematic merge

# Remote Operations
git clone <url>               # Clone repository
git push origin <branch>      # Push to remote
git pull origin <branch>      # Fetch and merge from remote
git fetch                     # Download objects from remote (no merge)

# Undoing Changes
git checkout -- <file>        # Discard changes in working directory
git restore <file>            # Modern way to discard changes
git reset HEAD <file>         # Unstage file
git reset --hard HEAD         # Discard all uncommitted changes to tracked files (dangerous!)
git revert <commit>           # Create new commit that undoes a previous commit

Advanced Git Concepts

Once you’re comfortable with the basics, these concepts will level up your Git skills.

Rebasing

Rebasing rewrites commit history by moving commits to a new base:

# Instead of merging, rebase your feature branch onto main
git checkout feature-login
git rebase main

Before rebase:

main → A ← B ← C
feature → A ← D ← E

After rebase:

main → A ← B ← C
feature → A ← B ← C ← D' ← E'

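The commits D and E are rewritten (hence D' and E') to apply on top of C. They carry the same changes as before, but because a commit's hash covers its parent, the rewritten commits get new hashes.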
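Here is a Python sketch of why rebased commits are new commits. The `make_commit` and `rebase` helpers are invented for this example, and the ID scheme is a stand-in for Git's real hashing, but the key property is the same: the ID depends on the parent.

```python
import hashlib


def make_commit(parent, change):
    """A commit's ID depends on its parent, so a new parent means a new ID."""
    commit_id = hashlib.sha1(f"{parent}:{change}".encode()).hexdigest()[:7]
    return {"id": commit_id, "parent": parent, "change": change}


def rebase(commits, new_base):
    """Replay each commit's change on top of new_base, oldest first."""
    replayed, parent = [], new_base
    for old in commits:
        new = make_commit(parent, old["change"])
        replayed.append(new)
        parent = new["id"]
    return replayed


a = make_commit(None, "initial")
b = make_commit(a["id"], "main work 1")
c = make_commit(b["id"], "main work 2")
d = make_commit(a["id"], "feature work 1")
e = make_commit(d["id"], "feature work 2")

d2, e2 = rebase([d, e], new_base=c["id"])
print(d2["change"] == d["change"], d2["id"] != d["id"])   # True True
print(d2["parent"] == c["id"], e2["parent"] == d2["id"])  # True True
```

The old D and E still exist until Git garbage-collects them; the branch pointer just no longer leads to them.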

When to use rebase:

To keep a linear project history
Before merging a feature branch (clean up your commits)
To incorporate upstream changes without merge commits

When NOT to use rebase:

On commits you’ve already pushed to a shared remote (it rewrites history)
If you’re not comfortable with resolving conflicts

Cherry-Picking

Apply a specific commit from one branch to another:

# Apply commit abc123 to current branch
git cherry-pick abc123

This is super useful when you need just one fix from a feature branch without merging everything.
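Conceptually, a cherry-pick takes the change a commit introduced (the difference from its parent) and replays it on your current branch. The Python sketch below models snapshots as dictionaries with invented contents. Real Git does this with a three-way merge, so treat the `diff` and `apply` helpers as an approximation.

```python
def diff(old, new):
    """Changes that turn snapshot `old` into `new` (None marks a deletion)."""
    changes = {path: content for path, content in new.items() if old.get(path) != content}
    changes.update({path: None for path in old if path not in new})
    return changes


def apply(snapshot, changes):
    result = dict(snapshot)
    for path, content in changes.items():
        if content is None:
            result.pop(path, None)
        else:
            result[path] = content
    return result


# Hypothetical snapshots: path -> file content
feature_parent = {"app.py": "v1", "auth.py": "v1"}
feature_commit = {"app.py": "v1", "auth.py": "v1 + fix"}  # the commit we want
main_tip = {"app.py": "v2", "auth.py": "v1"}

picked = apply(main_tip, diff(feature_parent, feature_commit))
print(picked)  # {'app.py': 'v2', 'auth.py': 'v1 + fix'}
```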

Interactive Rebase

Clean up your commit history before pushing:

# Rebase last 3 commits interactively
git rebase -i HEAD~3

This opens an editor where you can:

Reorder commits
Squash multiple commits into one
Edit commit messages
Drop commits entirely

Stashing

Temporarily save changes without committing:

# Stash current changes
git stash

# Do other work, switch branches, etc.

# Reapply stashed changes
git stash pop

# List all stashes
git stash list

# Apply a specific stash
git stash apply stash@{2}

I use this all the time when I need to quickly switch contexts.
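The stash behaves like a stack, which explains the `stash@{n}` numbering: `stash@{0}` is always the most recent entry. A minimal Python sketch, with a made-up `StashStack` class and made-up entry descriptions:

```python
class StashStack:
    def __init__(self):
        self._entries = []  # oldest first; the newest entry is stash@{0}

    def push(self, changes):
        self._entries.append(changes)

    def pop(self):
        # like `git stash pop`: return the newest entry and remove it
        return self._entries.pop()

    def apply(self, n=0):
        # like `git stash apply stash@{n}`: the entry stays on the stack
        return self._entries[-1 - n]

    def entries(self):
        return [f"stash@{{{i}}}: {c}" for i, c in enumerate(reversed(self._entries))]


stash = StashStack()
stash.push("wip: login form")
stash.push("wip: css tweak")
stash.push("wip: debug logging")

print(stash.entries()[0])    # stash@{0}: wip: debug logging
print(stash.apply(2))        # wip: login form
print(stash.pop())           # wip: debug logging
print(len(stash.entries()))  # 2
```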

Best Practices and Workflow

After years of using Git on teams of all sizes, here are the practices that make a difference:

1. Write Good Commit Messages

# Bad
git commit -m "fix"
git commit -m "updates"

# Good
git commit -m "Fix null pointer exception in user login

- Add null check before accessing user.email
- Add test case for null user scenario
- Resolves issue #123"

A good commit message:

Starts with a concise summary (50 chars or less)
Explains what changed and why (not how—the code shows how)
References related issues/tickets
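If you want to enforce some of this mechanically, a small check like the Python sketch below could run in a commit hook. The function name and the "fewer than two words is too vague" rule are my own choices, not anything Git defines.

```python
def check_commit_message(message, max_subject=50):
    """Return a list of problems; an empty list means the message passes."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["missing summary line"]

    problems = []
    subject = lines[0]
    if len(subject) > max_subject:
        problems.append(f"summary is {len(subject)} chars (limit {max_subject})")
    if len(subject.split()) < 2:
        problems.append("summary is too vague")  # heuristic chosen for this sketch
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    return problems


good = (
    "Fix null pointer exception in user login\n"
    "\n"
    "- Add null check before accessing user.email\n"
    "- Resolves issue #123"
)

print(check_commit_message("fix"))  # ['summary is too vague']
print(check_commit_message(good))   # []
```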

2. Commit Often, Push Deliberately

Make small, focused commits as you work. Each commit should be a logical unit of change that you could revert independently if needed.

# Good: Atomic commits
git commit -m "Add user model"
git commit -m "Add user validation"
git commit -m "Add user controller"

# Bad: One giant commit
git commit -m "Add entire user system"

3. Use Branches for Everything

Never commit directly to main. Create a branch for each feature, bug fix, or experiment:

git checkout -b feature/add-user-auth
git checkout -b bugfix/fix-login-timeout
git checkout -b experiment/try-new-caching

4. Keep Your Local Main Up to Date

# Regularly update your local main
git checkout main
git pull origin main

# Then rebase your feature branch
git checkout feature-branch
git rebase main

This prevents massive merge conflicts later.

5. Use .gitignore

Don’t commit generated files, dependencies, or sensitive data:

# .gitignore file
__pycache__/
*.pyc
node_modules/
.env
.DS_Store
dist/
build/
*.log
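To get a feel for what those patterns match, here is a rough approximation in Python. It covers only the two pattern shapes used above (a name ending in `/` for directories, and a plain name or glob matched at any depth). Git's real matching rules include negation, anchoring, and `**`, none of which this sketch attempts.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

PATTERNS = ["__pycache__/", "*.pyc", "node_modules/", ".env",
            ".DS_Store", "dist/", "build/", "*.log"]


def is_ignored(path, patterns=PATTERNS):
    """Approximate check for a file path relative to the repository root."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # directory pattern: matches a directory of that name at any depth
            if any(fnmatchcase(part, pattern[:-1]) for part in parts[:-1]):
                return True
        elif any(fnmatchcase(part, pattern) for part in parts):
            # plain pattern: matches a file or directory name at any depth
            return True
    return False


print(is_ignored("src/__pycache__/mod.cpython-311.pyc"))  # True
print(is_ignored("logs/app.log"))                         # True
print(is_ignored("build.py"))                             # False
print(is_ignored("src/app.py"))                           # False
```

In a real repository, `git check-ignore -v <path>` tells you which rule, if any, matches a path.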

Conclusion

Git is one of those tools that seems simple on the surface but has incredible depth. The key to mastering it is understanding the underlying model: snapshots, not deltas; branches as pointers; commits as nodes in a graph.

Once you internalize these concepts, Git’s commands start to make sense. You stop memorizing recipes and start understanding what each command actually does to the repository.

Start with the basics: clone, add, commit, push, pull. Get comfortable with branches and merging. Then gradually add more advanced techniques like rebasing, cherry-picking, and stashing to your toolkit.

And remember: Git’s complexity is there for a reason. It’s solving hard problems around distributed collaboration, and it’s doing it remarkably well. The learning curve is steep, but it’s absolutely worth it.

For deeper dives into Git internals, check out the Pro Git book (free online) and Git’s official documentation. The Git Reference Manual is also invaluable when you need to understand exactly what a command does.

Thank you for reading!
