How Git Works

Peek inside the .git folder. Understand blobs, trees, commits, and branches — the building blocks of version control.

git init

Creating the .git folder

Running `git init` creates a hidden `.git` directory. This folder is your repository: it holds the object database, the refs, the HEAD file, and the repository configuration.

$ git init
Initialized empty Git repository in /project/.git/

Key Takeaways

📦 Objects

Blobs (files), Trees (directories), Commits (snapshots). All stored by SHA-1 hash.

🌿 Branches

Branches are just files containing commit SHAs. HEAD tells Git where you are.

📸 Staging

The index lets you craft commits. Changes move: Working → Staging → Commit.

The Engineering of Git: A Directed Acyclic Graph of Hashes

Many developers treat Git as a black box that uploads code to the cloud. At its core, Git is a content-addressable filesystem with a version control interface built on top: every object is stored and retrieved by the cryptographic hash of its content. To truly master Git, and to recover from most mistakes, you must look inside the hidden .git folder.

Part 1: The Three Trees

Git manages three distinct areas. Understanding how files move between them is the key to mastering Git:

The Working Directory: This is the physical folder you see in your code editor. Git can detect changes here, but nothing is saved until you stage and commit it.
The Staging Area (Index): A binary file (.git/index) that records exactly which file versions will go into the next permanent snapshot. When you run git add process.js, Git hashes the current content of process.js, writes it to .git/objects as a Blob, and records that hash in the Index.
The .git Directory (Repository): The actual database. If you delete your working directory but keep the .git folder, you still possess the project's entire committed history; only work that was never added or committed is lost.
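
The movement between the three areas can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Git's implementation: `working_dir`, `index`, `objects`, `git_add`, and `git_commit` are hypothetical names, and a "commit" is reduced to a frozen copy of the index.

```python
import hashlib

working_dir = {"process.js": b"console.log('v1');\n"}  # path -> file content
index = {}      # path -> blob hash (the staging area)
objects = {}    # blob hash -> content (stand-in for .git/objects)
commits = []    # each toy commit is a frozen copy of the index


def blob_hash(content: bytes) -> str:
    """Hash content the way Git hashes a blob: header, NUL byte, content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def git_add(path: str) -> None:
    """Store the file's current content as a blob and record it in the index."""
    content = working_dir[path]
    digest = blob_hash(content)
    objects[digest] = content
    index[path] = digest


def git_commit() -> dict:
    """Snapshot whatever the index holds right now."""
    snapshot = dict(index)
    commits.append(snapshot)
    return snapshot


git_add("process.js")
first = git_commit()

# Editing the working copy changes nothing in the index or the commit.
working_dir["process.js"] = b"console.log('v2');\n"
unstaged = blob_hash(working_dir["process.js"]) != index["process.js"]
```

After the edit, `unstaged` is true: the working copy has moved on, while the index and the first commit still refer to the old blob until `git_add` runs again.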

Part 2: Content-Addressable Storage (Blobs)

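Unlike Subversion (SVN), whose model is built around changes to files, Git's object model stores snapshots, not differences (deltas). When you run git commit, Git records a complete snapshot of every tracked file as it was staged. (Git later packs objects with delta compression to save disk space, but that is a storage optimization beneath the object model.)

Suppose a 1,000-page novel is stored as one file per page and you change one letter on page 50. Does Git duplicate the other 999 pages? No, thanks to Content Addressing: the 999 unchanged files hash to the same values as before, so the new snapshot points at content that is already stored. Only the edited page produces a new object.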

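Every file placed into Git is hashed with SHA-1 (the default; newer versions of Git can also create SHA-256 repositories). The hash is computed over a short header (the word blob, the content size, and a null byte) followed by the file's content. The resulting 40-character hexadecimal hash (e.g., a1b2c3d4e5f6...) becomes the object's name inside .git/objects: the first two characters name a subdirectory and the remaining 38 name the file. The object is compressed with zlib and, once written, is never modified. This is called a Blob.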
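
The hashing and storage steps can be reproduced with Python's standard library. This is a minimal sketch, assuming a SHA-1 repository and a loose (unpacked) object; the helper names `blob_object`, `blob_hash`, and `loose_object_path` are hypothetical.

```python
import hashlib
import zlib


def blob_object(content: bytes) -> bytes:
    """Return the uncompressed object: 'blob <size>', a NUL byte, then content."""
    return b"blob %d\x00" % len(content) + content


def blob_hash(content: bytes) -> str:
    """SHA-1 of the full object (header included), as 40 hex characters."""
    return hashlib.sha1(blob_object(content)).hexdigest()


def loose_object_path(digest: str) -> str:
    """First two hex characters name the directory, the other 38 the file."""
    return f".git/objects/{digest[:2]}/{digest[2:]}"


content = b"test content\n"
digest = blob_hash(content)
path = loose_object_path(digest)

# What lands on disk is the zlib-compressed object; decompressing restores it.
stored = zlib.compress(blob_object(content))
restored = zlib.decompress(stored)
```

If you have Git installed, `digest` should match what `echo 'test content' | git hash-object --stdin` prints, which is a quick way to check the sketch against the real tool.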

Crucially, Blobs do not store filenames. If you rename math.js to calc.js without changing the content, the SHA-1 hash remains identical. Git sees that it already has an object with that hash and stores nothing new for the file's content; only the Tree that records the filename changes.
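
The deduplication effect falls out of keying storage by hash. Below is a toy in-memory store, not Git's on-disk format; `ObjectStore` and the snapshot dictionaries are hypothetical, and the JavaScript one-liner is made-up file content.

```python
import hashlib


class ObjectStore:
    """Toy content-addressable store keyed by Git-style blob hashes."""

    def __init__(self):
        self.blobs = {}  # hex digest -> content

    def put(self, content: bytes) -> str:
        header = b"blob %d\x00" % len(content)
        digest = hashlib.sha1(header + content).hexdigest()
        # Writing the same content twice lands on the same key.
        self.blobs[digest] = content
        return digest


store = ObjectStore()
snapshot_1 = {"math.js": store.put(b"export const add = (a, b) => a + b;\n")}

# Rename only: the new snapshot uses a new name but the same content.
snapshot_2 = {"calc.js": store.put(b"export const add = (a, b) => a + b;\n")}

# Any edit, even a single byte, yields a different blob.
snapshot_3 = {"calc.js": store.put(b"export const add = (a, b) => a - b;\n")}
```

Three snapshots were taken, but the store ends up holding only two blobs: the rename added nothing, and the one-character edit added exactly one new object.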

Part 3: Directories (Trees)

If Blobs don't store filenames or folder structures, how does Git know your project layout? Through Tree Objects.

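A Tree is a small binary object that maps human-readable filenames to SHA-1 hashes, along with a file mode for each entry. Pretty-printed (for example with git cat-file -p), it reads as a plain listing of mode, type, hash, and name.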

// Snapshot of a Tree Object (e.g., the 'src/' folder)

100644 blob a1b2c3... utils.js
100644 blob 9f8a7d... main.py
040000 tree 81e74a... images/

Notice that a Tree can point downward to other Trees (subdirectories). When you commit, Git writes a "Root Tree" representing the top level of your project. Because each Tree's hash covers the hashes of everything beneath it, that single 40-character SHA-1 identifies the entire directory structure.
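
To make the nesting concrete, here is a sketch of how a tree's hash can be computed from its entries, assuming SHA-1 and Git's binary tree layout (mode, space, name, NUL byte, then the 20 raw hash bytes). The function `tree_hash` and the file logo.png are hypothetical, and every file is given the empty blob as content to keep the example short. Note that the stored mode for a directory is 40000; the leading zero appears only in pretty-printed output.

```python
import hashlib


def tree_hash(entries):
    """Hash a tree given (mode, name, hex_object_id) tuples."""

    def sort_key(entry):
        mode, name, _ = entry
        # Git orders directories as if their name ended with '/'.
        return (name + "/" if mode == "40000" else name).encode()

    body = b"".join(
        f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(object_id)
        for mode, name, object_id in sorted(entries, key=sort_key)
    )
    return hashlib.sha1(b"tree %d\x00" % len(body) + body).hexdigest()


EMPTY_BLOB = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"  # blob with no content

# Hash the subdirectory first, then reference it from its parent.
images = tree_hash([("100644", "logo.png", EMPTY_BLOB)])
src = tree_hash([
    ("100644", "utils.js", EMPTY_BLOB),
    ("100644", "main.py", EMPTY_BLOB),
    ("40000", "images", images),
])
```

Because `src` is computed from the hash of `images`, changing any file inside images/ changes `images`, which in turn changes `src`, all the way up to the root.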

Part 4: The Commit Object and The DAG

A Commit Object is tiny. It is a short text object, typically a few hundred bytes, containing four kinds of information:

The SHA-1 hash of the Root Tree (the exact snapshot of the filesystem).
The SHA-1 hash of each Parent Commit: none for the first commit, one for an ordinary commit, two or more for a merge.
The Author and the Committer, each with a name, email, and timestamp.
The human-readable Commit Message.

The Commit Object itself is then hashed. Because a commit's hash depends on its parents' hashes, a commit can only point backward to commits that already exist, so Git history forms a Directed Acyclic Graph (DAG). If you altered a past commit, its SHA-1 hash would change, and every later commit would have to be rewritten with new parent pointers and therefore new hashes. History is tamper-evident: any change to the past shows up as a different hash at the tip.
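
The cascade can be demonstrated by serializing commits by hand. This is a sketch assuming SHA-1 and the plain-text commit layout (tree, parent, author, committer, blank line, message); `commit_hash` is a hypothetical helper, and the identity string and messages are invented for illustration.

```python
import hashlib


def commit_hash(tree, parents, author, message):
    """Serialize a commit object and return its SHA-1 hex digest."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {author}", "", message]
    body = ("\n".join(lines) + "\n").encode()
    return hashlib.sha1(b"commit %d\x00" % len(body) + body).hexdigest()


EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # tree with no entries
WHO = "A. Dev <dev@example.com> 1700000000 +0000"        # hypothetical identity

root = commit_hash(EMPTY_TREE, [], WHO, "Initial commit")
child = commit_hash(EMPTY_TREE, [root], WHO, "Second commit")

# Rewording the root yields a new root hash, and a child built on the
# reworded root gets a new hash too, even though its own fields are unchanged.
reworded_root = commit_hash(EMPTY_TREE, [], WHO, "Initial commit (reworded)")
rewritten_child = commit_hash(EMPTY_TREE, [reworded_root], WHO, "Second commit")
```

`child` and `rewritten_child` have the same tree, author, and message, yet different hashes, because the parent line differs. This is why commands that edit history produce new commits rather than modifying old ones.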

Part 5: Branches are 41-Byte Files

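In older centralized systems, branching was a heavier operation. In Subversion, for example, a branch is a copy of a directory tree made on the server: the copy itself is cheap, but creating or switching to it involves the network and a new path in the repository.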

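In Git, a Branch is literally a text file, for example .git/refs/heads/main. Open it in a text editor, and you will see exactly 41 bytes: a 40-character Commit SHA-1 and a newline. (Git may also consolidate refs into a single .git/packed-refs file, in which case the loose file can be absent.)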

When you run git branch feature-x, Git creates a new file named feature-x in .git/refs/heads containing the hash of the commit you are currently on. It takes milliseconds because no file content is copied.

When you create a new commit while on feature-x, Git computes the new Commit Hash and overwrites the feature-x file with it. The pointer moves forward through the graph; no other branch is touched.
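
The file-level mechanics can be imitated in a scratch directory. This sketch only mimics loose ref files; it does not create a real repository, and the two commit hashes are placeholders rather than real objects.

```python
import tempfile
from pathlib import Path

git_dir = Path(tempfile.mkdtemp()) / ".git"
heads = git_dir / "refs" / "heads"
heads.mkdir(parents=True)

first_commit = "a" * 40    # placeholder hashes, not real commit objects
second_commit = "b" * 40

# A branch is a file holding one commit hash and a newline.
(heads / "main").write_bytes(first_commit.encode() + b"\n")

# "git branch feature-x": write the same hash under a new name.
(heads / "feature-x").write_bytes((heads / "main").read_bytes())

# Committing on feature-x: overwrite that one file with the new commit's hash.
(heads / "feature-x").write_bytes(second_commit.encode() + b"\n")

main_size = (heads / "main").stat().st_size
```

`main_size` comes out as 41, and main still holds the first hash after feature-x has moved on, which is all that "diverging branches" means at the file level.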

Part 6: HEAD and Detached States

How does Git know which branch you are currently on? It checks a special file located at .git/HEAD.

Normally, the HEAD file contains a symbolic reference: ref: refs/heads/main. It names a branch rather than a commit. If you switch branches via git checkout feature-x, Git rewrites the HEAD file to read ref: refs/heads/feature-x and updates your index and working directory to match that branch's Commit.

Detached HEAD State

If you run git checkout [specific-commit-hash], you bypass branches entirely. Git writes that raw 40-character hash into the .git/HEAD file in place of the symbolic reference.
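
The two forms of HEAD can be told apart with a few lines of parsing. The function below is a simplified sketch: `resolve_head` is a hypothetical helper that assumes loose ref files only (it ignores .git/packed-refs and branches with no commits yet), and the directory it reads is a scratch folder with placeholder hashes.

```python
import tempfile
from pathlib import Path


def resolve_head(git_dir: Path):
    """Return (branch_name_or_None, commit_hash) for a loose-ref layout."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        ref = head[len("ref: "):]                    # e.g. refs/heads/main
        commit = (git_dir / ref).read_text().strip()
        return ref.split("/", 2)[2], commit          # drop the refs/heads/ prefix
    return None, head                                # detached: HEAD is a raw hash


git_dir = Path(tempfile.mkdtemp())
(git_dir / "refs" / "heads").mkdir(parents=True)
(git_dir / "refs" / "heads" / "main").write_text("a" * 40 + "\n")

# Attached: HEAD names a branch, and the branch names a commit.
(git_dir / "HEAD").write_text("ref: refs/heads/main\n")
attached = resolve_head(git_dir)

# Detached: HEAD holds the commit hash directly.
(git_dir / "HEAD").write_text("b" * 40 + "\n")
detached = resolve_head(git_dir)
```

In the attached case there are two hops (HEAD to branch, branch to commit); in the detached case there is one, and no branch name is available to move when a commit is made.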

Why is this risky? If you create new commits in this state, they form a new path in the Graph. But because no branch pointer moves forward to track them, once you switch back to main those commits are no longer reachable from any branch or tag. They are not lost immediately: the reflog still records them for a grace period (30 days by default for unreachable commits), and you can rescue them with git branch [new-name] [commit-hash]. After that, the Git Garbage Collector may delete them permanently.
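
"Reachable" has a precise meaning: a commit survives if some ref leads to it by following parent pointers. The walk below is a toy illustration of that idea, not Git's garbage collector; the commit names c1, c2, d1, d2 and the branch name rescue are invented, and the reflog is left out of the model.

```python
def reachable(parents, tips):
    """Return every commit reachable from the given tips by following parents."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


# Toy history: c1 <- c2 on main, then d1 <- d2 made while HEAD was detached at c1.
parents = {"c1": [], "c2": ["c1"], "d1": ["c1"], "d2": ["d1"]}
refs = {"main": "c2"}

orphaned = set(parents) - reachable(parents, refs.values())

# Rescue: point a new branch at the last detached commit.
refs["rescue"] = "d2"
orphaned_after_rescue = set(parents) - reachable(parents, refs.values())
```

Before the rescue, d1 and d2 are orphaned; pointing a single ref at d2 makes both reachable again, since d2 leads back to d1 through its parent pointer.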

Glossary & Concepts

🌲 Tree Object

Git's way of storing directory listings. It maps file names to Blob SHA-1 hashes and sub-directory names to other Tree objects.

📦 Blob Object

Stands for Binary Large Object. It stores only the file content, completely detached from file names or metadata.

🏷️ Reference (Ref)

A simple text file pointing to a commit hash. Branches are just refs stored in `.git/refs/heads/`.

🪢 Directed Acyclic Graph (DAG)

The mathematical structure of Git's commit history. Each commit points in one direction only (to its parents), and the pointers can never form a cycle.

Related Resources

Pro Git Book

Learn Git Branching