Last modified: June 06, 2026
Git stores your project as a graph of immutable objects. Instead of storing changes as a sequence of file diffs, Git stores snapshots of your project. Each snapshot is built from content-addressed objects, meaning each object is identified by a hash of its contents.
At the bottom are blobs. A blob stores raw file contents only. It does not store the file name, path, permissions, or history.
Above blobs are trees. A tree is like a directory. It maps names to objects. Each tree entry contains a file mode, object type, object hash, and name. A tree can point to blobs, which represent files, or to other trees, which represent subdirectories.
Above trees are commits. A commit points to one top-level tree, which represents the full project snapshot at that moment. A commit also stores parent commit IDs, author and committer information, timestamps, and the commit message.
Refs such as main, feature/login, or v1.2.0 are human-friendly names that point to object IDs, usually commit IDs. HEAD tells Git what you currently have checked out. Most of the time, HEAD points to a branch ref such as refs/heads/main.
The important idea is this:
file bytes → blob
directory listing → tree
snapshot + history → commit
human name → ref
current checkout → HEAD
Git objects are immutable. Git does not edit an existing blob, tree, or commit in place. When something changes, Git writes new objects and then moves a pointer, such as a branch ref, to the new commit.
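This is one reason Git is efficient: because objects never change, a new commit can reuse every blob and tree that is unchanged from its parent and only needs new objects for what actually changed.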
A simple mental model:
Blobs are file contents.
Trees are directories.
Commits are snapshots with history.
Refs are names for commits.
HEAD is where you are now.
.git/ Directory
The .git/ directory is the database and control center of a Git repository. Your working directory contains the checked-out files, but .git/ contains the history, references, staging area, and recovery logs.
Common important files and directories:
.git/
  HEAD
  index
  objects/
  refs/
  logs/
  config
| Path | Purpose |
| --- | --- |
| .git/objects/ | Stores Git objects: blobs, trees, commits, and tags |
| .git/refs/ | Stores branch and tag references |
| .git/HEAD | Points to the current branch or directly to a commit |
| .git/index | The staging area |
| .git/logs/ | Reflogs that remember where refs used to point |
| .git/config | Repository-specific configuration |
New objects usually start as loose objects under .git/objects/. Later, Git may compress many objects into packfiles for better storage efficiency and faster transfer.
Example layout:
.git/
  HEAD
  index
  objects/
    12/34abcd...
    pack/
      pack-xxxx.pack
      pack-xxxx.idx
  refs/
    heads/main
    tags/v1.2.0
  logs/
    HEAD
    refs/heads/main
The design is simple: Git stores immutable objects, then moves small pointers around.
That is why many Git operations feel atomic. Creating a commit writes new objects, then updates a ref. The old commit is still there. If a branch moves unexpectedly, the reflog often lets you recover it.
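The write-then-move-a-pointer pattern can be sketched as a toy in-memory model. This is only an illustration, not how Git is implemented: `ToyRepo`, `write_object`, and `commit` are hypothetical names, and the commit payload is a simplified string rather than Git's real commit format.

```python
import hashlib


class ToyRepo:
    """Toy model: an append-only object store plus a table of movable refs."""

    def __init__(self):
        self.objects = {}  # object ID -> bytes; entries are never modified
        self.refs = {}     # ref name -> object ID; small movable pointers

    def write_object(self, data: bytes) -> str:
        # The ID is derived from the content, so identical content maps
        # to the same ID and is stored only once.
        oid = hashlib.sha1(data).hexdigest()
        self.objects.setdefault(oid, data)
        return oid

    def commit(self, ref: str, snapshot: bytes) -> str:
        parent = self.refs.get(ref)
        # Simplified payload (an assumption made for illustration only).
        payload = b"parent " + (parent or "none").encode() + b"\n" + snapshot
        oid = self.write_object(payload)  # step 1: write a new object
        self.refs[ref] = oid              # step 2: move the pointer
        return oid


repo = ToyRepo()
c1 = repo.commit("refs/heads/main", b"hello.txt=hello world\n")
c2 = repo.commit("refs/heads/main", b"hello.txt=hello world\nbye.txt=bye\n")

print(repo.refs["refs/heads/main"] == c2)  # does the branch name the new commit?
print(c1 in repo.objects)                  # is the old commit still stored?
```

Because step 2 is a single small update, a reader of the ref sees either the old commit or the new one, never a half-written state.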
Git has four main object types:
blob
tree
commit
tag
A blob stores file contents.
It does not know its file name, path, permissions, or history.
Example:
blob = "hello world\n"
If two files have exactly the same bytes, they point to the same blob.
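To make the content addressing concrete, here is a minimal sketch of how a blob ID can be computed, assuming a repository that uses the default SHA-1 object format (repositories created with the SHA-256 format produce different IDs). The function name `git_blob_id` is made up for this example; the result should match what `git hash-object` reports for the same bytes.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with these bytes."""
    # Git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


hello = git_blob_id(b"hello world\n")
copy = git_blob_id(b"hello world\n")    # same bytes under a different file name
other = git_blob_id(b"hello world!\n")  # one extra byte

print(hello)
print(hello == copy)   # identical bytes give an identical ID
print(hello == other)  # different bytes give a different ID
```

Nothing about the file name enters the hash, which is why two files with the same bytes end up sharing one blob.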
A tree represents a directory. It stores entries that map names to object IDs.
Each tree entry contains:
mode type hash name
Example:
100644 blob <hash> hello.txt
040000 tree <hash> src
The tree is where file names and modes live. That is why a blob can be reused under multiple names.
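As a rough sketch of what a tree object holds, the following code serializes a list of entries and hashes the result, assuming the SHA-1 object format. The helper names `tree_body` and `git_object_id` are hypothetical. In the stored form each entry carries only the mode, the name, and the raw object ID; the type column printed by `git cat-file -p` is derived from the mode, and subtrees use the mode `40000` (displayed as `040000`).

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    header = kind.encode("ascii") + b" " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """Serialize (mode, name, hex_id) tuples into a tree object's body."""

    def sort_key(entry):
        mode, name, _ = entry
        # Git orders entries by name, comparing subtrees as if their
        # name ended with "/".
        return (name + "/" if mode == "40000" else name).encode("utf-8")

    body = b""
    for mode, name, hex_id in sorted(entries, key=sort_key):
        # Each entry is "<mode> <name>\0" followed by the raw 20-byte ID.
        body += mode.encode("ascii") + b" " + name.encode("utf-8") + b"\0"
        body += bytes.fromhex(hex_id)
    return body


blob_id = git_object_id("blob", b"hello world\n")
# The same blob appears under two names; only the tree knows the names.
body = tree_body([("100644", "hello.txt", blob_id), ("100644", "copy.txt", blob_id)])
print(git_object_id("tree", body))
```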
A commit represents a project snapshot plus history metadata.
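A commit stores a pointer to one top-level tree, the IDs of its parent commits, author and committer information with timestamps, and the commit message.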
A normal commit has one parent. The first commit has no parent. A merge commit usually has two or more parents.
Example commit content:
tree <tree-hash>
parent <parent-commit-hash>
author You <you@example.com> 1699999999 +0000
committer You <you@example.com> 1699999999 +0000

Add hello.txt
The commit does not store a full copy of every file directly. It points to a tree, and that tree points to blobs and subtrees.
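The commit layout above can be reproduced with a short sketch. The helper names `commit_body` and `git_object_id` are invented for this example, the identity string is the placeholder used in this article, and the SHA-1 object format is assumed.

```python
import hashlib


def commit_body(tree, parents, author, committer, message):
    """Build the text of a commit object from its fields."""
    lines = ["tree " + tree]
    lines += ["parent " + p for p in parents]  # zero, one, or several parents
    lines.append("author " + author)
    lines.append("committer " + committer)
    # A blank line separates the headers from the commit message.
    return ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")


def git_object_id(kind, body):
    header = kind.encode("ascii") + b" " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # ID of the empty tree
who = "You <you@example.com> 1699999999 +0000"

root = commit_body(EMPTY_TREE, [], who, who, "initial snapshot")
root_id = git_object_id("commit", root)
child = commit_body(EMPTY_TREE, [root_id], who, who, "second commit")

print(child.decode("utf-8"))
```

Since the parent ID is part of the hashed text, changing any ancestor would change the ID of every commit that descends from it.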
Git has two common types of tags: lightweight and annotated.
A lightweight tag is just a ref that points directly to an object, usually a commit.
An annotated tag is a real Git object. It has its own metadata, message, tagger, and a pointer to another object.
Annotated tags are useful for releases because they can be signed, inspected, and treated as first-class objects.
Objects
            ┌────────────┐
  file  →   │    blob    │   raw file bytes only
            └─────┬──────┘
                  │ referenced by name and mode
                  ▼
            ┌────────────┐
  dir   →   │    tree    │   entries: mode, type, hash, name
            └─────┬──────┘
                  │ root tree of snapshot
                  ▼
            ┌────────────┐
  hist  →   │   commit   │   tree, parent(s), author, message
            └────────────┘
Names
  HEAD → refs/heads/main → commit
Another view:
refs/heads/main
      │
      ▼
  commit C2
      │
      ├── root tree T2
      │     ├── blob B1  hello.txt
      │     └── blob B2  bye.txt
      │
      └── parent commit C1
            │
            └── root tree T1
                  └── blob B1  hello.txt
This lab builds a tiny repository and inspects the objects Git creates.
mkdir toy
cd toy
git init
Example output:
Initialized empty Git repository in .../toy/.git/
At this point, Git has created a .git/ directory, but there are no commits yet.
echo "hello world" > hello.txt
git status
Example output:
Untracked files:
hello.txt
The file exists in your working directory, but Git has not stored it as part of a snapshot yet.
git add hello.txt
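Staging does two important things: it writes the file contents into the object database as a blob (or reuses an existing blob with the same contents), and it records the path, mode, and blob ID in the index.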
Now inspect the index:
git ls-files -s
Example output:
100644 <blob-hash> 0 hello.txt
Meaning:
100644 file mode
<blob-hash> blob object ID
0 stage number
hello.txt path
Stage 0 means the normal, resolved version of the file. Other stage numbers appear during merge conflicts.
Use git cat-file to inspect the object.
git cat-file -t <blob-hash>
Output:
blob
Print the blob contents:
git cat-file -p <blob-hash>
Output:
hello world
Notice that the blob contains only the file content. It does not contain the name hello.txt.
The index can be turned into a tree object.
git write-tree
Example output:
<tree-hash>
Inspect the tree:
git cat-file -p <tree-hash>
Example output:
100644 blob <blob-hash> hello.txt
Now the name hello.txt appears. That name is stored in the tree, not in the blob.
Use git commit-tree to create a commit object that points to the tree.
git commit-tree <tree-hash> -m "initial snapshot"
Example output:
<commit-hash>
At this point, the commit object exists, but no branch necessarily points to it yet. A commit without a ref can become hard to find later.
Move main to the commit:
git update-ref refs/heads/main <commit-hash>
Make HEAD point to main. Prefer git symbolic-ref over writing to .git/HEAD by hand:
git symbolic-ref HEAD refs/heads/main
Now inspect the commit:
git cat-file -p <commit-hash>
Example output:
tree <tree-hash>
author You <you@example.com> 1699999999 +0000
committer You <you@example.com> 1699999999 +0000

initial snapshot
The first commit has no parent.
Create another file:
echo "bye" > bye.txt
git add bye.txt
git commit -m "add bye"
Example output:
[main 9f3a1c2] add bye
1 file changed, 1 insertion(+)
create mode 100644 bye.txt
Now inspect the tree for HEAD:
git ls-tree -r HEAD
Example output:
100644 blob <hash1> bye.txt
100644 blob <hash2> hello.txt
You can also inspect the root tree directly:
git cat-file -p HEAD^{tree}
Example output:
100644 blob <hash1> bye.txt
100644 blob <hash2> hello.txt
git log --oneline --graph --decorate
Example output:
* 9f3a1c2 (HEAD -> main) add bye
* a1b2c3d initial snapshot
The second commit points to the first commit as its parent.
Diagram:
refs/heads/main ──► commit C2
                      │
                      ├── tree T2
                      │     ├── blob B1  bye.txt
                      │     └── blob B2  hello.txt
                      │
                      └── parent commit C1
                            │
                            └── tree T1
                                  └── blob B2  hello.txt
Look at HEAD:
cat .git/HEAD
Output:
ref: refs/heads/main
Look at the branch ref:
cat .git/refs/heads/main
Example output:
9f3a1c2...
This file contains the commit ID that main currently points to.
In some repositories, refs may be packed into .git/packed-refs, so you may not always see every ref as a loose file under .git/refs/.
The index is Git’s staging area. It sits between the working directory and the next commit.
A useful model:
working directory  →  index   →  commit
      files           staged     snapshot
The index stores a compact table of paths and blob IDs, plus file mode and metadata. It does not store file contents directly. The contents are stored as blob objects.
Inspect it with:
git ls-files --stage
Example output:
100644 <hash1> 0 bye.txt
100644 <hash2> 0 hello.txt
The index is why Git can quickly answer questions such as:
git status
git diff
git diff --staged
git commit
Git compares:
working directory vs index → unstaged changes
index vs HEAD → staged changes
HEAD vs another commit/tree → committed differences
During a merge conflict, the index can store multiple versions of the same path:
stage 1 = common ancestor
stage 2 = ours
stage 3 = theirs
stage 0 = resolved version
That is how Git keeps track of conflict information before you resolve it.
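As an illustration of how the stage numbers can be read, the sketch below parses text in the `git ls-files -s` format and reports which paths are still in conflict. The sample listing and its blob IDs are made up, and the function names `parse_index_listing` and `conflicted_paths` are hypothetical.

```python
from collections import defaultdict

# Made-up listing in `git ls-files -s` format: "<mode> <id> <stage>\t<path>".
SAMPLE = (
    "100644 1111111111111111111111111111111111111111 0\tbye.txt\n"
    "100644 2222222222222222222222222222222222222222 1\thello.txt\n"
    "100644 3333333333333333333333333333333333333333 2\thello.txt\n"
    "100644 4444444444444444444444444444444444444444 3\thello.txt\n"
)

STAGE_NAMES = {0: "resolved", 1: "common ancestor", 2: "ours", 3: "theirs"}


def parse_index_listing(text):
    """Group listing lines by path: path -> {stage: blob ID}."""
    stages = defaultdict(dict)
    for line in text.splitlines():
        meta, path = line.split("\t", 1)  # metadata and path are tab-separated
        _mode, blob_id, stage = meta.split()
        stages[path][int(stage)] = blob_id
    return dict(stages)


def conflicted_paths(stages):
    # A path is unresolved while it has any entry at a stage other than 0.
    return sorted(p for p, by_stage in stages.items() if set(by_stage) - {0})


index = parse_index_listing(SAMPLE)
for path in conflicted_paths(index):
    for stage, blob_id in sorted(index[path].items()):
        print(path, STAGE_NAMES[stage], blob_id[:7])
```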
New objects usually begin as loose objects.
A loose object is stored under .git/objects/ using the first two hex characters of its object ID as a directory name.
Example:
.git/objects/ab/cdef1234...
Here:
ab first two hex characters
cdef... rest of the object ID
Loose objects are individually compressed. This is simple and fast for creating new objects.
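A minimal sketch of the loose-object encoding, assuming the SHA-1 object format: the header and body are hashed to get the ID, the ID determines the path, and the same bytes are stored zlib-compressed. `encode_loose_object` and `decode_loose_object` are names invented for this example. The compressed bytes may differ from what Git writes (the compression level can vary), but they decompress to the same content.

```python
import hashlib
import zlib


def encode_loose_object(kind: str, body: bytes):
    """Return (object ID, path, compressed bytes) for a loose object."""
    store = kind.encode("ascii") + b" " + str(len(body)).encode("ascii") + b"\0" + body
    oid = hashlib.sha1(store).hexdigest()
    # First two hex characters form the directory, the remaining 38 the file name.
    path = ".git/objects/" + oid[:2] + "/" + oid[2:]
    return oid, path, zlib.compress(store)


def decode_loose_object(compressed: bytes):
    """Inverse of encode_loose_object: return (kind, body)."""
    store = zlib.decompress(compressed)
    header, body = store.split(b"\0", 1)
    kind, size = header.split(b" ")
    assert int(size) == len(body)  # the header records the body length
    return kind.decode("ascii"), body


oid, path, data = encode_loose_object("blob", b"hello world\n")
print(path)
print(decode_loose_object(data))
```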
Over time, Git may pack many loose objects into a packfile. A packfile stores many objects together and can use delta compression to reduce space.
Packfiles live here:
.git/objects/pack/
Example:
pack-1234abcd.pack
pack-1234abcd.idx
The .pack file stores the objects. The .idx file lets Git quickly find objects inside the pack.
Commands:
find .git/objects -type f | wc -l
git gc
ls .git/objects/pack
Example progression:
Loose objects:
.git/objects/
ab/cdef...
12/3456...
Packed objects:
.git/objects/pack/
pack-xxxx.pack
pack-xxxx.idx
git gc means garbage collection. It cleans up and optimizes the repository by packing objects, pruning unreachable objects when safe, and improving storage efficiency.
A blob only stores bytes. File names live in tree objects.
This explains Git’s deduplication behavior.
If two files have identical contents, they use the same blob:
cp hello.txt copy.txt
git add copy.txt
git ls-files -s
Example output:
100644 <same-blob-hash> 0 copy.txt
100644 <same-blob-hash> 0 hello.txt
After committing, the tree has two names pointing to the same blob.
tree
├── hello.txt → blob B1
└── copy.txt  → blob B1
This is also why a rename does not rewrite the file contents. Git can represent the new name in a new tree while reusing the same blob.
Important detail: Git does not store “rename objects.” Rename detection is usually computed later by commands like git diff or git log --follow, based on similarity between deleted and added paths.
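The simplest case of that comparison, an exact rename, can be sketched by pairing deleted and added paths that share a blob ID. This is a simplified illustration, not Git's actual algorithm, which also scores partially similar contents; `exact_renames` is a hypothetical helper and the blob IDs are shortened placeholders.

```python
def exact_renames(old_tree, new_tree):
    """Pair deleted and added paths that share a blob ID.

    Both arguments map path -> blob ID, as in `git ls-tree -r` output.
    """
    deleted = {p: oid for p, oid in old_tree.items() if p not in new_tree}
    added = {p: oid for p, oid in new_tree.items() if p not in old_tree}

    renames = []
    for new_path, oid in sorted(added.items()):
        for old_path in sorted(deleted):
            if deleted[old_path] == oid:
                renames.append((old_path, new_path))
                del deleted[old_path]  # each deleted path is matched at most once
                break
    return renames


# Blob IDs below are shortened placeholders, not real hashes.
before = {"app.py": "b1", "README.md": "b2"}
after = {"src/app.py": "b1", "README.md": "b2"}
print(exact_renames(before, after))
```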
A ref is a name that points to an object ID.
Common refs:
refs/heads/main
refs/heads/feature/login
refs/tags/v1.0
refs/remotes/origin/main
A branch is simply a movable ref that points to a commit.
Example:
refs/heads/main → commit C3
When you create a new commit on main, Git writes the commit object and then moves refs/heads/main to the new commit.
before:
main → C2
after commit:
main → C3 → C2
HEAD usually points to the current branch:
HEAD → refs/heads/main → commit C3
If you check out a specific commit instead of a branch, Git enters a detached HEAD state:
HEAD → commit C2
Detached HEAD is not dangerous by itself, but new commits made there are not attached to a branch unless you create one.
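The two shapes of HEAD can be told apart with a few lines of code. The sketch below works on the text of HEAD and a plain dictionary of refs rather than on a real repository; `resolve_head` is a hypothetical helper and the commit IDs are placeholders.

```python
def resolve_head(head: str, refs: dict):
    """Return (commit ID, branch ref or None) for the contents of HEAD."""
    head = head.strip()
    if head.startswith("ref: "):
        # Symbolic HEAD: follow the named branch ref to its commit.
        ref_name = head[len("ref: "):]
        # A branch with no commits yet has no entry, so this may be None.
        return refs.get(ref_name), ref_name
    # Detached HEAD: the file holds a commit ID directly.
    return head, None


refs = {"refs/heads/main": "c3c3c3c"}  # placeholder commit IDs

print(resolve_head("ref: refs/heads/main\n", refs))  # attached to main
print(resolve_head("c2c2c2c\n", refs))               # detached
```

In the detached case no branch name is returned, which mirrors why new commits made there are not reachable from any branch until you create one.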
The reflog records where refs used to point. It is local to your repository and is extremely useful for recovery.
For example, when a branch moves due to commit, reset, rebase, or merge, Git records the previous position.
Inspect the reflog:
git reflog
Example output:
9f3a1c2 HEAD@{0}: commit: add bye
a1b2c3d HEAD@{1}: commit: initial snapshot
You can use reflog entries to recover lost commits:
git checkout -b recovered HEAD@{1}
or:
git reset --hard HEAD@{1}
Use reset --hard carefully: it overwrites the index and working directory to match the target commit, so uncommitted changes are discarded.
The reflog is local. It is not normally pushed to remotes.
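The idea behind the reflog can be modeled as a pointer that appends to a log every time it moves. This is a toy model with an invented class name (`RefWithLog`) and the short sample IDs used in this article; Git's real reflog files also record the identity and time of each update.

```python
class RefWithLog:
    """Toy branch pointer that remembers every position it has held."""

    def __init__(self):
        self.target = None
        self.log = []  # newest entry first, like `git reflog` output

    def move(self, new_target, reason):
        # Record the move before forgetting the old position.
        self.log.insert(0, (self.target, new_target, reason))
        self.target = new_target

    def at(self, n):
        """Position named by @{n}: where the ref pointed n moves ago."""
        return self.log[n][1]


main = RefWithLog()
main.move("a1b2c3d", "commit: initial snapshot")
main.move("9f3a1c2", "commit: add bye")
main.move("a1b2c3d", "reset: moving to HEAD~1")  # the branch moved backwards

# "9f3a1c2" is no longer the tip, but the log still names it.
main.move(main.at(1), "reset: moving to main@{1}")
print(main.target)
```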
Tags give names to important points in history, often releases.
There are two main tag types.
A lightweight tag is just a ref pointing directly to a commit.
git tag v1.0
Conceptually:
refs/tags/v1.0 → commit C3
An annotated tag creates a tag object with metadata and a message.
git tag -a v1.0 -m "first release"
Inspect it:
git cat-file -p refs/tags/v1.0
Example output:
object <commit-hash>
type commit
tag v1.0
tagger You <you@example.com> ...

first release
Annotated tags are generally better for releases because they preserve tagger information, can be signed, and have their own object ID.
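To show why an annotated tag has its own object ID, here is a small sketch that builds the tag text and hashes it, assuming the SHA-1 object format. `tag_body` and `git_object_id` are hypothetical helper names, and the commit ID is one of the sample hashes used in this article.

```python
import hashlib


def tag_body(target, target_type, name, tagger, message):
    """Build the text of an annotated tag object."""
    headers = [
        "object " + target,     # the object being tagged
        "type " + target_type,  # usually "commit"
        "tag " + name,
        "tagger " + tagger,
    ]
    return ("\n".join(headers) + "\n\n" + message + "\n").encode("utf-8")


def git_object_id(kind, body):
    header = kind.encode("ascii") + b" " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


commit_id = "9fceb02d0ae598e95dc970b74767f19372d61af8"  # sample ID from this article
body = tag_body(commit_id, "commit", "v1.0",
                "You <you@example.com> 1699999999 +0000", "first release")
tag_id = git_object_id("tag", body)

# A lightweight tag would store commit_id in refs/tags/v1.0;
# an annotated tag stores tag_id there instead.
print(tag_id != commit_id)
```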
Use these commands when you want to look beneath Git’s normal user-facing commands and inspect the objects Git actually stores: mainly commits, trees, blobs, and tags. They help you identify object types, inspect commit metadata, map filenames to blob hashes, compare stored snapshots, and check how much object storage the repository is using.
git cat-file -t <id> # object type
git cat-file -s <id> # object size
git cat-file -p <id> # pretty-print object
The git cat-file command lets you inspect Git objects directly by object ID, branch name, tag name, or other revision expressions. It is useful when you already have a hash and want to know what it represents.
Example output:
$ git cat-file -t HEAD
commit
$ git cat-file -s HEAD
245
$ git cat-file -p HEAD
tree 7b3f9a1c5d3e8f6a2b0c9d4e1f8a6b7c9d0e1f2a
parent 2a4c6e8f1b3d5a7c9e0f2a4b6d8e1c3f5a7b9d0
author Alex Example <alex@example.com> 1777808400 +0200
committer Alex Example <alex@example.com> 1777808400 +0200

Add hello.txt
Explanation:
git cat-file -t <id> shows the object type, such as commit, tree, blob, or tag.
git cat-file -s <id> shows the object size in bytes.
git cat-file -p <id> pretty-prints the object in a readable form.
<id> can be a full hash, short hash, branch name, tag name, or expression like HEAD.
git show --no-patch --pretty=raw HEAD
This command displays the raw metadata for the commit pointed to by HEAD (its tree, parent commits, author, committer, and message) without showing the file diff. It is useful when you want to inspect exactly what commit object Git has stored and which tree snapshot that commit points to.
Example output:
commit 9fceb02d0ae598e95dc970b74767f19372d61af8
tree 7b3f9a1c5d3e8f6a2b0c9d4e1f8a6b7c9d0e1f2a
parent 2a4c6e8f1b3d5a7c9e0f2a4b6d8e1c3f5a7b9d0
author Alex Example <alex@example.com> 1777808400 +0200
committer Alex Example <alex@example.com> 1777808400 +0200

    Add hello.txt
Explanation:
commit is the hash of the commit object itself.
tree is the root tree object for the project snapshot.
parent points to the previous commit.
author records who originally wrote the change.
committer records who created or applied the commit.
--no-patch hides the diff so only commit metadata is shown.
--pretty=raw shows Git’s raw commit fields with minimal formatting.
git ls-tree HEAD
git ls-tree -r HEAD
git ls-tree -r --long HEAD
Useful for mapping file paths to blob hashes.
A tree object represents a directory snapshot. It stores filenames, file modes, object types, and object IDs. The git ls-tree command shows the contents of a tree, commit, branch, or tag without checking anything out into the working directory.
Example output:
$ git ls-tree HEAD
100644 blob e965047ad7c57865823c7d992b1d046ea66edf78 hello.txt
040000 tree 3b18e512dba79e4c8300dd08aeb37f8e728b8dad src
$ git ls-tree -r HEAD
100644 blob e965047ad7c57865823c7d992b1d046ea66edf78 hello.txt
100644 blob a3f5c7d9e1b2a4c6e8f0d3b5a7c9e1f2d4b6a8c0 src/main.py
$ git ls-tree -r --long HEAD
100644 blob e965047ad7c57865823c7d992b1d046ea66edf78 14 hello.txt
100644 blob a3f5c7d9e1b2a4c6e8f0d3b5a7c9e1f2d4b6a8c0 128 src/main.py
Explanation:
git ls-tree HEAD shows the top-level files and directories in HEAD.
git ls-tree -r HEAD recursively lists files inside subdirectories.
git ls-tree -r --long HEAD also shows blob sizes.
100644 means a normal file.
040000 means a directory tree.
blob means file content.
tree means directory content.
git diff --name-status <commitA>^{tree} <commitB>^{tree}
This compares snapshots without needing to check them out.
This command compares the tree snapshots from two commits. It focuses on stored project content rather than the working directory. It is useful when you want to compare two repository states directly at the object level.
Example output:
M README.md
A hello.txt
D old-config.yml
R100 app.py src/app.py
Explanation:
M means the file was modified.
A means the file was added.
D means the file was deleted.
R100 means the file was renamed with 100% similarity.
<commitA>^{tree} resolves commit A to its tree object.
<commitB>^{tree} resolves commit B to its tree object.
git rev-parse HEAD:hello.txt
This prints the blob ID for hello.txt at HEAD.
This command asks Git to resolve a specific path inside a specific commit. It is useful when you want to know exactly which blob object stores the file content for a path at a particular commit.
Example output:
e965047ad7c57865823c7d992b1d046ea66edf78
Explanation:
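HEAD:hello.txt means the file hello.txt as stored in HEAD.
The output is the ID of the blob that holds that file’s contents; the blob itself is stored under .git/objects or in packfiles.
git show HEAD:hello.txt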
This prints the version of hello.txt stored in HEAD.
This command reads a file directly from a commit without checking out that commit. It is useful when you want to inspect a historical version of a file, compare versions, or recover a file’s contents from another revision.
Example output:
Hello, Git!
Explanation:
HEAD:hello.txt selects hello.txt from the HEAD commit.
You can replace HEAD with another commit, branch, or tag.
git show main:hello.txt reads hello.txt from the main branch.
git count-objects -v
This shows loose object count, pack count, and storage size.
This command displays statistics about Git’s object database. It helps you understand how many loose objects exist, how many packfiles are present, and how much disk space Git objects are using.
Example output:
count: 24
size: 96
in-pack: 1520
packs: 2
size-pack: 384
prune-packable: 0
garbage: 0
size-garbage: 0
Explanation:
count is the number of loose objects.
size is the disk space used by loose objects, usually in KiB.
in-pack is the number of objects stored inside packfiles.
packs is the number of packfiles.
size-pack is the disk space used by packfiles.
prune-packable shows loose objects that also exist in packs and may be removable.
garbage shows invalid or leftover object files.
Loose objects can be packed into packfiles with git gc.
Git is made of three main layers:
1. Object database
blobs, trees, commits, tags
2. References
branches, tags, remote-tracking refs, HEAD
3. Working state
working directory, index, current checkout
The working directory is what you edit.
The index is what you plan to commit.
The commit is the saved snapshot.
working directory
      │ git add
      ▼
    index
      │ git commit
      ▼
    commit
      │ branch ref moves
      ▼
   history
When you run:
git add hello.txt
Git writes or reuses a blob and updates the index.
When you run:
git commit
Git writes a tree from the index, writes a commit pointing to that tree, and moves the current branch ref.
When you run:
git checkout main
Git updates HEAD, updates the index, and writes files into the working directory to match the commit pointed to by main.