What is the Git index?

This article describes how Git manages the index, also known as the staging area or the cache.

According to the git documentation:

The “index” holds a snapshot of the content of the working tree, and it is this snapshot that is taken as the contents of the next commit. So, after making any changes to the working tree, and before running the commit command, you must use the add command to add any new or modified files to the index.

The index is a binary file named index in the .git directory. It lists the files that make up the next commit, along with the object ID of each file’s staged content.

The index has multiple names:

  • index
  • staging area
  • cache

fun (?) fact

The original name for the index was “directory cache”.

If you do a hexadecimal dump of the .git/index file, you will see something like this:

alina git:( main ) xxd .git/index
00000000: 4449 5243 0000 0002 0000 0002 6478 ab96  DIRC........dx..
00000010: 1585 01d8 6478 ab96 1585 01d8 0100 0004  ....dx..........
...omitted for brevity...

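The first 4 bytes, DIRC, are the file’s signature, short for “directory cache”. The two 4-byte fields that follow are the format version (2 here) and the number of index entries (also 2 here). The signature goes back to the very first Git commit.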

As Linus Torvalds says in the comment: “It’s just a cache, after all.”
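
To make the header layout concrete, here is a minimal Python sketch that decodes the first 12 bytes of the dump above. It assumes the header layout of index format version 2: a 4-byte signature, a 4-byte version number, and a 4-byte entry count, all big-endian. The function name parse_index_header is made up for this illustration and is not part of Git.

```python
import struct


def parse_index_header(data: bytes):
    """Decode the 12-byte header at the start of a Git index file.

    Hypothetical helper for illustration only.
    Returns a (signature, version, entry_count) tuple.
    """
    if len(data) < 12:
        raise ValueError("an index header is 12 bytes long")
    # ">4sII": big-endian, 4-byte signature, two unsigned 32-bit integers
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError(f"unexpected signature: {signature!r}")
    return signature, version, entry_count


# The first 12 bytes of the hex dump shown above
header = bytes.fromhex("444952430000000200000002")
print(parse_index_header(header))  # prints (b'DIRC', 2, 2)
```

To try it on a real repository, you could pass it the first 12 bytes read from .git/index opened in binary mode.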

How does it work?

Let’s create and add a file to our repository.

alina git:( main ) date > file1.txt
alina git:( main ) git add file1.txt
alina git:( main ) git commit -m 'file1'
[main e01fae6] file1
 1 file changed, 1 insertion(+)
 create mode 100644 file1.txt

Git provides a command to view the content of the index: git ls-files --stage.

alina git:( main ) git ls-files --stage
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0       README.md
100644 bfd06f4ff13d91caf5ace4c88b764c229b129b84 0       file1.txt

Here’s a quick description of the output:

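  • The first column is the file mode. Git records only the file type and whether the file is executable (100644 for a regular file, 100755 for an executable one), and applies this when the file is checked out.
  • The second column is the SHA-1 object ID of the file’s staged content (a blob). It serves as a signature for the file content.
  • The third column is the stage number. It is 0 in our case; during a conflicted merge, stages 1, 2, and 3 hold the common ancestor, “ours”, and “theirs” versions of a file.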
  • The fourth column is the file path.
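
The four columns can be pulled apart with a few lines of Python. This is a sketch, not a complete parser: the names StageEntry and parse_stage_line are invented for the example, and it assumes plain paths (Git may quote paths that contain unusual characters, which is not handled here).

```python
from typing import NamedTuple


class StageEntry(NamedTuple):
    mode: str       # file mode as an octal string, e.g. "100644"
    object_id: str  # hex object ID of the staged blob
    stage: int      # 0 normally, 1 to 3 during a conflicted merge
    path: str       # path relative to the repository root


def parse_stage_line(line: str) -> StageEntry:
    """Split one line of `git ls-files --stage` output into its columns."""
    # Split on whitespace at most 3 times so that spaces inside the
    # path stay part of the last field.
    mode, object_id, stage, path = line.rstrip("\n").split(None, 3)
    return StageEntry(mode, object_id, int(stage), path)


entry = parse_stage_line(
    "100644 bfd06f4ff13d91caf5ace4c88b764c229b129b84 0\tfile1.txt"
)
print(entry.path, entry.stage, entry.mode)  # prints file1.txt 0 100644
```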

You can compute the SHA-1 that Git assigns to a file’s content with the git hash-object command.

alina git:( main ) git hash-object file1.txt
bfd06f4ff13d91caf5ace4c88b764c229b129b84
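
This value is not a plain SHA-1 of the file’s bytes: Git hashes a small header followed by the content. The Python sketch below reproduces the calculation for a blob, assuming a repository that uses SHA-1 object names and no content filters such as line-ending conversion. The function name git_blob_sha1 is hypothetical.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the object ID Git gives to a blob with this content.

    Git hashes the header "blob <size>\\0" followed by the raw bytes.
    """
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_sha1(b"hello world\n"))
# prints 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Under those assumptions, running git hash-object on a file that contains the line “hello world” followed by a newline gives the same ID.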

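By comparing the SHA-1 of a file on disk with the one recorded in the index, Git can determine whether the file has been modified in the working directory. (In practice, Git first checks the file metadata cached in the index, such as size and modification time, and only hashes the content when that check is inconclusive.)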

Let’s update the file:

alina git:( main ) date >> file1.txt

Until the file is added with git add, its entry in the index does not change.

alina git:( main ) git ls-files --stage
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0       README.md
100644 bfd06f4ff13d91caf5ace4c88b764c229b129b84 0       file1.txt

The SHA1 of the file on disk has changed:

alina git:( main ) git hash-object file1.txt
2f5479226eab5908ed27083f858f51e22d5188c7

And Git knows that this file has been modified:

alina git:( main ) git status
On branch main
Your branch is ahead of 'origin/main' by 1 commit.
  (use "git push" to publish your local commits)

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   file1.txt

no changes added to commit (use "git add" and/or "git commit -a")
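
The comparison behind “Changes not staged for commit” can be sketched in Python as follows. This is a simplified model rather than Git’s actual implementation: the index is represented as a dictionary from path to staged object ID, the working directory as a dictionary from path to file content, and every file is hashed. Real Git avoids most of that hashing by first consulting the file metadata cached in the index. The names git_blob_sha1 and worktree_status are hypothetical.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Object ID of a blob: SHA-1 over "blob <size>\\0" plus the content."""
    return hashlib.sha1(b"blob %d\x00" % len(content) + content).hexdigest()


def worktree_status(index: dict, worktree: dict) -> dict:
    """Classify each indexed path by comparing it with the working directory.

    index:    path -> staged hex object ID
    worktree: path -> current file content (a missing key means the file
              no longer exists on disk)
    """
    status = {}
    for path, staged_id in index.items():
        if path not in worktree:
            status[path] = "deleted"
        elif git_blob_sha1(worktree[path]) != staged_id:
            status[path] = "modified"
        else:
            status[path] = "unchanged"
    return status


# Illustrative content only; the article's file holds the output of `date`.
index = {"file1.txt": git_blob_sha1(b"first line\n")}
worktree = {"file1.txt": b"first line\nsecond line\n"}
print(worktree_status(index, worktree))  # prints {'file1.txt': 'modified'}
```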

If we add it to the index, the index will be updated with the new SHA1:

alina git:( main ) git add file1.txt
alina git:( main ) git ls-files --stage
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0       README.md
100644 2f5479226eab5908ed27083f858f51e22d5188c7 0       file1.txt

Now, we can commit our changes:

alina git:( main ) git commit -m 'update file1'
[main a94683e] update file1
 1 file changed, 1 insertion(+)

Committing does not change the entries in the index, because the commit records exactly the snapshot that the index already holds:

alina git:( main ) git ls-files --stage
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0       README.md
100644 2f5479226eab5908ed27083f858f51e22d5188c7 0       file1.txt

Let’s create a new branch called b1 and switch to it. Since the new branch points at the same commit, this does not change the index entries either.

alina git:( main ) git checkout -b b1
Switched to a new branch 'b1'
alina git:( b1 ) git ls-files --stage
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0       README.md
100644 2f5479226eab5908ed27083f858f51e22d5188c7 0       file1.txt

Next, let’s create a new file called file2.txt and add it to the index:

alina git:( b1 ) date >> file2.txt
alina git:( b1 ) git add file2.txt

This will change the index:

alina git:( b1 ) git ls-files --stage
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0       README.md
100644 2f5479226eab5908ed27083f858f51e22d5188c7 0       file1.txt
100644 07a39f9930bfdb6235a6af5c5da4b8371427f4cd 0       file2.txt

Now, let’s commit our changes:

alina git:( b1 ) git commit -m 'add file2'
[b1 121d150] add file2
 1 file changed, 1 insertion(+)
 create mode 100644 file2.txt

And now, let’s make a change to file1.txt:

alina git:( b1 ) date >> file1.txt
alina git:( b1 ) git add file1.txt

This change to file1.txt will be reflected in the index:

alina git:( b1 ) git ls-files --stage
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0       README.md
100644 fb6b07ec0e4765950d93c5b26ea8a642087bc26a 0       file1.txt
100644 07a39f9930bfdb6235a6af5c5da4b8371427f4cd 0       file2.txt

Finally, let’s commit our changes:

alina git:( b1 ) git commit -m 'change file1'
[b1 d06ada7] change file1
 1 file changed, 1 insertion(+)

Now, let’s switch back to the main branch:

alina git:( b1 ) git checkout main
Switched to branch 'main'
Your branch is ahead of 'origin/main' by 2 commits.
  (use "git push" to publish your local commits)

As part of the checkout, Git has restored the index to the state of the main branch:

  • file1.txt is restored to the state of the main branch
  • file2.txt is removed from the index

alina git:( main ) git ls-files --stage
100644 0c50cb1098bda2e5c32fe2667cb64453434edc8f 0       README.md
100644 2f5479226eab5908ed27083f858f51e22d5188c7 0       file1.txt

At the same time, Git restores the working directory to match the content of the index.
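
To see what the checkout changed, the two index listings from this walkthrough can be compared entry by entry. The Python sketch below models each index as a dictionary from path to object ID; the function name diff_index is made up for the example, and this is only a model of the outcome, not of how Git performs a checkout.

```python
def diff_index(before: dict, after: dict) -> dict:
    """Compare two index snapshots (path -> object ID)."""
    return {
        "added": sorted(p for p in after if p not in before),
        "removed": sorted(p for p in before if p not in after),
        "changed": sorted(
            p for p in before if p in after and before[p] != after[p]
        ),
    }


# Index entries on b1 and on main, copied from the listings above
index_b1 = {
    "README.md": "0c50cb1098bda2e5c32fe2667cb64453434edc8f",
    "file1.txt": "fb6b07ec0e4765950d93c5b26ea8a642087bc26a",
    "file2.txt": "07a39f9930bfdb6235a6af5c5da4b8371427f4cd",
}
index_main = {
    "README.md": "0c50cb1098bda2e5c32fe2667cb64453434edc8f",
    "file1.txt": "2f5479226eab5908ed27083f858f51e22d5188c7",
}

print(diff_index(index_b1, index_main))
# prints {'added': [], 'removed': ['file2.txt'], 'changed': ['file1.txt']}
```

The result matches the two bullet points above: file1.txt goes back to its main version and file2.txt leaves the index, while README.md is untouched.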