What is the Git index?
This article describes how Git manages the index, aka staging area, aka cache.
According to the git documentation:
The “index” holds a snapshot of the content of the working tree, and it is this snapshot that is taken as the contents of the next commit. So, after making any changes to the working tree, and before running the commit command, you must use the add command to add any new or modified files to the index.
The index is a binary file named index in the .git directory. It lists every file that will be part of the next commit, along with its mode and object ID.
The index has multiple names:
- index
- staging area
- cache
Fun (?) fact
The original name for the index was “directory cache”.
If you do a hexadecimal dump of the .git/index file, you will see something like this:
alina git:( main ) xxd .git/index
00000000: 4449 5243 0000 0002 0000 0002 6478 ab96 DIRC........dx..
00000010: 1585 01d8 6478 ab96 1585 01d8 0100 0004 ....dx..........
...omitted for brevity...
alina git:( main ) █
The first 4 bytes, DIRC, are the signature of the file. It stands for directory cache.
You can see that in the very first git commit.
As Linus Torvalds says in a comment there: "It’s just a cache, after all."
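To make the header concrete, here is a minimal sketch that decodes the first 12 bytes of the dump above with standard tools. It assumes the layout visible in that dump: a 4-byte signature followed by two big-endian 32-bit integers (format version, then number of entries). parse_index_header is a hypothetical helper name, not a Git command, and the sample bytes are recreated with printf rather than read from a real repository.

```shell
# parse_index_header FILE: print signature, version and entry count.
# Hypothetical helper; the body runs in a subshell so `set --` stays local.
parse_index_header() (
  file=$1
  sig=$(head -c 4 "$file")
  # Bytes 4..11 as unsigned decimals, one positional parameter per byte.
  set -- $(od -An -tu1 -j4 -N8 "$file")
  version=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
  entries=$(( ($5 << 24) | ($6 << 16) | ($7 << 8) | $8 ))
  printf 'signature=%s version=%s entries=%s\n' "$sig" "$version" "$entries"
)

# Recreate "4449 5243 0000 0002 0000 0002" from the hex dump.
sample=$(mktemp)
printf 'DIRC\0\0\0\2\0\0\0\2' > "$sample"
parse_index_header "$sample"   # prints signature=DIRC version=2 entries=2
rm -f "$sample"
```

In a real repository the same function could be pointed at .git/index. The two entries reported here match the two files that git ls-files --stage lists later in this article.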
How does it work?
Let’s create and add a file to our repository.
alina git:( main ) date > file1.txt
alina git:( main ) git add file1.txt
alina git:( main ) git commit -m 'file1'
[main e01fae6] file1
1 file changed, 1 insertion(+)
create mode 100644 file1.txt
alina git:( main ) █
Git provides a command to view the content of the index: git ls-files --stage.
alina git:( main ) git ls-files --stage
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0 README.md
100644 bfd06f4ff13d91caf5ace4c88b764c229b129b84 0 file1.txt
alina git:( main ) █
Here’s a quick description of the output:
- The first column is the file mode. It records the type of the entry and whether the file is executable, and Git uses it to set the permissions of the file when it is checked out.
- The second column is the SHA-1 of the blob object that stores the file content. It serves as a signature for that content.
- The third column is the stage number. In our case, it will be 0, but it can be different during a merge operation.
- The fourth column is the file path.
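The four columns can be pulled apart with a few lines of awk. This is only a sketch: describe_entries is a hypothetical helper name, it assumes the default output format in which a tab separates the path from the other three fields, and the mode table covers only the common values (100644, 100755, 120000, 160000).

```shell
# describe_entries: read `git ls-files --stage` output on stdin and
# print path, entry kind, stage number and abbreviated object ID.
describe_entries() {
  awk -F '\t' '{
    split($1, meta, " ")   # meta[1]=mode, meta[2]=object ID, meta[3]=stage
    kind = "other"
    if (meta[1] == "100644")      { kind = "regular file" }
    else if (meta[1] == "100755") { kind = "executable file" }
    else if (meta[1] == "120000") { kind = "symbolic link" }
    else if (meta[1] == "160000") { kind = "submodule (gitlink)" }
    printf "%s\t%s\tstage %s\t%s\n", $2, kind, meta[3], substr(meta[2], 1, 7)
  }'
}

# Sample line copied from the listing above, so no repository is needed.
printf '100644 bfd06f4ff13d91caf5ace4c88b764c229b129b84 0\tfile1.txt\n' | describe_entries
```

Inside a repository, the same function can be fed directly: git ls-files --stage | describe_entries.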
You can obtain the Git SHA-1 of a file using the git hash-object command.
alina git:( main ) git hash-object file1.txt
bfd06f4ff13d91caf5ace4c88b764c229b129b84
alina git:( main ) █
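The value printed by git hash-object is not the plain SHA-1 of the file. Git hashes a short header ("blob", a space, the size in bytes, and a NUL byte) followed by the content. The sketch below reproduces that computation without Git. It assumes a repository using the default SHA-1 object format, no content filters or line-ending conversion, and the GNU sha1sum tool; blob_sha1 is a hypothetical helper name.

```shell
# blob_sha1: read content on stdin, print the blob object ID Git would assign.
# Hypothetical helper; the body runs in a subshell so its variables stay local.
blob_sha1() (
  tmp=$(mktemp)
  cat > "$tmp"                                  # buffer stdin to learn its size
  size=$(wc -c < "$tmp" | tr -d ' ')
  { printf 'blob %s\0' "$size"; cat "$tmp"; } | sha1sum | cut -d ' ' -f 1
  rm -f "$tmp"
)

printf 'hello\n' | blob_sha1   # prints ce013625030ba8dba906f756967f9e9ca394464a
```

Running blob_sha1 < file1.txt in the example repository should give the same ID as git hash-object file1.txt, under the assumptions listed above.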
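By comparing a file on disk with what is recorded in the index, Git can determine whether the file has been modified in the working directory. In practice, Git first checks the file metadata cached in the index, such as size and modification time, and only hashes the content when that check is inconclusive.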
Let’s update the file:
alina git:( main ) date >> file1.txt
alina git:( main ) █
As long as the file has not been added with git add, the index will not be changed.
alina git:( main ) git ls-files --stage
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0 README.md
100644 bfd06f4ff13d91caf5ace4c88b764c229b129b84 0 file1.txt
alina git:( main ) █
The SHA1 of the file on disk has changed:
alina git:( main ) git hash-object file1.txt
2f5479226eab5908ed27083f858f51e22d5188c7
alina git:( main ) █
And Git knows that this file has been modified:
alina git:( main ) git status
On branch main
Your branch is ahead of 'origin/main' by 1 commit.
(use "git push" to publish your local commits)
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: file1.txt
no changes added to commit (use "git add" and/or "git commit -a")
alina git:( main ) █
If we add it to the index, the index will be updated with the new SHA1:
alina git:( main ) git add file1.txt
alina git:( main ) git ls-files --stage
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0 README.md
100644 2f5479226eab5908ed27083f858f51e22d5188c7 0 file1.txt
alina git:( main ) █
Now, we can commit our changes:
alina git:( main ) git commit -m 'update file1'
[main a94683e] update file1
1 file changed, 1 insertion(+)
alina git:( main ) █
But this action does not change the entries in the index:
alina git:( main ) git ls-files --stage
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0 README.md
100644 2f5479226eab5908ed27083f858f51e22d5188c7 0 file1.txt
alina git:( main ) █
Let’s create a new branch called b1 and switch to it. This action does not change the entries in the index either.
alina git:( main ) git checkout -b b1
Switched to a new branch 'b1'
alina git:( b1 ) git ls-files --stage
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0 README.md
100644 2f5479226eab5908ed27083f858f51e22d5188c7 0 file1.txt
alina git:( b1 ) █
Next, let’s create a new file called file2.txt and add it to the index:
alina git:( b1 ) date >> file2.txt
alina git:( b1 ) git add file2.txt
alina git:( b1 ) █
This will change the index:
alina git:( b1 ) git ls-files --stage
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0 README.md
100644 2f5479226eab5908ed27083f858f51e22d5188c7 0 file1.txt
100644 07a39f9930bfdb6235a6af5c5da4b8371427f4cd 0 file2.txt
alina git:( b1 ) █
Now, let’s commit our changes:
alina git:( b1 ) git commit -m 'add file2'
[b1 121d150] add file2
1 file changed, 1 insertion(+)
create mode 100644 file2.txt
alina git:( b1 ) █
And now, let’s make a change to file1.txt:
alina git:( b1 ) date >> file1.txt
alina git:( b1 ) git add file1.txt
alina git:( b1 ) █
This change to file1.txt will be reflected in the index:
alina git:( b1 ) git ls-files --stage
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0 README.md
100644 fb6b07ec0e4765950d93c5b26ea8a642087bc26a 0 file1.txt
100644 07a39f9930bfdb6235a6af5c5da4b8371427f4cd 0 file2.txt
alina git:( b1 ) █
Finally, let’s commit our changes:
alina git:( b1 ) git commit -m 'change file1'
[b1 d06ada7] change file1
1 file changed, 1 insertion(+)
alina git:( b1 ) █
Now, let’s switch back to the main branch:
alina git:( b1 ) git checkout main
Switched to branch 'main'
Your branch is ahead of 'origin/main' by 2 commits.
(use "git push" to publish your local commits)
alina git:( main ) █
And now, Git will restore the index to the state of the main branch:
- file1.txt is restored to the state of the main branch
- file2.txt is removed from the index
alina git:( main ) git ls-files --stage
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0 README.md
100644 2f5479226eab5908ed27083f858f51e22d5188c7 0 file1.txt
alina git:( main ) █
At the same time, Git restores the working directory to match the content of the index.
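To see exactly what the checkout changed, the two index listings shown above (the last one on b1 and the one on main) can be compared entry by entry. The following is a rough sketch, not something Git provides: compare_listings is a hypothetical helper, the listings are pasted in as sample data, and the awk script assumes paths without spaces.

```shell
# compare_listings BEFORE AFTER: both files hold `git ls-files --stage` output.
# Prints which paths were removed, changed or added between the two.
compare_listings() {
  awk '
    NR == FNR { before[$4] = $2; next }   # first file: object ID per path
    { after[$4] = $2 }                    # second file
    END {
      for (p in before) {
        if (!(p in after)) { print "removed " p }
        else if (before[p] != after[p]) { print "changed " p }
      }
      for (p in after) { if (!(p in before)) { print "added " p } }
    }
  ' "$1" "$2" | sort
}

on_b1=$(mktemp); on_main=$(mktemp)
cat > "$on_b1" <<'EOF'
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0 README.md
100644 fb6b07ec0e4765950d93c5b26ea8a642087bc26a 0 file1.txt
100644 07a39f9930bfdb6235a6af5c5da4b8371427f4cd 0 file2.txt
EOF
cat > "$on_main" <<'EOF'
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0 README.md
100644 2f5479226eab5908ed27083f858f51e22d5188c7 0 file1.txt
EOF
compare_listings "$on_b1" "$on_main"
rm -f "$on_b1" "$on_main"
```

With this sample data the output reports file1.txt as changed and file2.txt as removed, while README.md, whose object ID is identical on both branches, is not mentioned.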