The git staging area
So far, we have mostly been interested in git commit trees, and not so much in how those commits are created.
Time to fill that gap.
Different areas in git
When working with git, there are three different areas to consider.
the git repository
This is where git stores all the commits and their associated metadata.
All of this data is stored in the .git directory.
the working directory
This directory is the developer’s workspace on her computer’s filesystem. The files in that directory are regular files which constitute the content of the project.
the staging area
Git’s primary responsibility is to allow the transition of data between the working directory and the git repository:
- when a user clones a repository or checks out a branch, git must convert a set of commits into plain old files in the working directory
- when a user wants to commit her work, git must convert data from the working directory into commits in the git repository
The staging area is the buffer between the git repository and the working directory.
In terms of shape, the staging area is very much like a commit. It contains information about files and directories and, like a commit, it refers to blobs and trees from the git database, not to files in the working directory.
The staging area is the mechanism which allows the developer to build her next commit.
The content of the staging area is stored in a single file: .git/index
Exploration time
Let’s start from an empty repository and create our first file.
At this stage, the staging area doesn’t even exist: no commit has been created, and there is a single file, A.txt, in your working directory.
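This starting point can be reproduced with a short sketch. It assumes a throwaway directory created with mktemp and an arbitrary file content; neither comes from the original walkthrough.

```shell
# Work in a throwaway directory so that no existing repository is touched.
cd "$(mktemp -d)"
git init -q

# Create the first file; its content is arbitrary.
echo "first file" > A.txt

# A freshly initialised repository has no .git/index yet.
if [ -e .git/index ]; then index_state="present"; else index_state="absent"; fi
echo "staging area: $index_state"

# Git only knows A.txt as an untracked file.
git status --short    # prints "?? A.txt"
```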
The first step in preparing a commit is to run git add on the file.
This creates the staging area (the .git/index file), and you can see its content with the git ls-files command:
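A minimal sketch of that step, again in a throwaway repository (the directory and the file content are assumptions made for the example):

```shell
cd "$(mktemp -d)" && git init -q
echo "first file" > A.txt

# Stage the file: git stores its content as a blob
# and records the entry in .git/index.
git add A.txt
ls -l .git/index

# Paths recorded in the staging area.
git ls-files

# Same listing with the mode, blob ID and stage number of each entry.
git ls-files --stage
```

The --stage form is the one that exposes the blob IDs, which is useful when comparing the staging area with a commit.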
But no commit has been created yet:
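One way to check this, sketched in a throwaway repository: git log fails as long as the current branch has no commit, so the fallback message below is printed instead (the message text is ours, not git’s).

```shell
cd "$(mktemp -d)" && git init -q
echo "first file" > A.txt
git add A.txt

# The file is staged, but the history is still empty:
# git log exits with an error on a branch without commits.
git log --oneline 2>/dev/null || echo "no commits yet"
```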
Time to create our first commit and check that it now appears in the history:
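A sketch of the commit step. The identity and the commit message are placeholders chosen for this example, and the identity is configured only inside the throwaway repository.

```shell
cd "$(mktemp -d)" && git init -q
# Placeholder identity, local to this throwaway repository.
git config user.name "Demo" && git config user.email "demo@example.com"
echo "first file" > A.txt
git add A.txt

# Record the staged content as the first commit.
git commit -q -m "Add A.txt"

# The history now contains exactly one commit.
git log --oneline
```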
Committing does not change the content of our staging area.
In a clean situation, the staging area is a copy of our current commit (HEAD).
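Both claims can be checked with a small sketch: snapshot the staging area before and after the commit, then ask git whether the staging area differs from HEAD. The variable names, identity and commit message are placeholders.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
echo "first file" > A.txt
git add A.txt

# Snapshot the staging area before and after the commit.
before=$(git ls-files --stage)
git commit -q -m "Add A.txt"
after=$(git ls-files --stage)
[ "$before" = "$after" ] && echo "staging area unchanged by the commit"

# git diff --cached compares the staging area with HEAD;
# with --quiet it exits with status 0 when there is no difference.
git diff --cached --quiet && echo "staging area matches HEAD"
```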
Let’s create another file and git add it, which adds that file to our staging area.
The staging area allows git to report the developer’s intent through git status:
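A sketch of this step, with placeholder file contents and commit message. The short status format is used in addition to the full one because its two-column code is easy to read: the first column describes the staging area, the second the working directory.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
echo "first file" > A.txt && git add A.txt && git commit -q -m "Add A.txt"

# Create a second file and stage it.
echo "second file" > B.txt
git add B.txt

# The staging area now holds both files.
git ls-files

# Full report: B.txt is listed as a new file to be committed.
git status

# Short form: "A" in the first column means "added to the staging area".
git status --short    # prints "A  B.txt"
```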
At this stage, we can still modify B.txt, right?
The git status result is interesting after that change:
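The situation can be reproduced with the sketch below (file contents and commit message are placeholders). B.txt is staged first and modified afterwards, without being added again.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
echo "first file" > A.txt && git add A.txt && git commit -q -m "Add A.txt"
echo "second file" > B.txt && git add B.txt

# Change the file after it has been staged.
echo "more content" >> B.txt

# Full report: B.txt shows up twice.
git status

# Short form: "A" = added in the staging area,
# "M" = modified in the working directory since it was staged.
git status --short    # prints "AM B.txt"
```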
The file B.txt appears in 2 sections of the status description:
- because this file is in the staging area, it is listed as a new file under the changes to be committed
- but its content in the working directory differs from what is in the staging area, so it is also listed under the changes not staged for commit
If we git add that file again, the staging area is updated. It contains the same 2 files as before, but the astute reader will notice that the SHA-1 for B.txt is different (it is now 169638 and was d5edb5 before).
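The change of blob ID can be observed with the sketch below. The file contents here are placeholders, so the IDs it prints will not be the ones quoted above. The :B.txt syntax asks git rev-parse for the ID of the blob recorded in the staging area for that path.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
echo "first file" > A.txt && git add A.txt && git commit -q -m "Add A.txt"
echo "second file" > B.txt && git add B.txt

# Blob ID recorded in the staging area before the modification.
old_id=$(git rev-parse :B.txt)

# Modify the file and stage it again.
echo "more content" >> B.txt
git add B.txt
new_id=$(git rev-parse :B.txt)

echo "before: $old_id"
echo "after:  $new_id"

# Still the same 2 paths in the staging area.
git ls-files
```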
That is the version of the file which will be committed now:
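A sketch of that commit, with placeholder contents and messages. It records the blob ID held in the staging area just before committing and compares it with the one stored in the new commit.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
echo "first file" > A.txt && git add A.txt && git commit -q -m "Add A.txt"
echo "second file" > B.txt && git add B.txt
echo "more content" >> B.txt && git add B.txt

# Blob currently recorded in the staging area for B.txt.
staged_id=$(git rev-parse :B.txt)

git commit -q -m "Add B.txt"

# Blob recorded for B.txt in the new commit.
committed_id=$(git rev-parse HEAD:B.txt)
[ "$staged_id" = "$committed_id" ] && echo "the staged version was committed"

# Content of the committed file.
git cat-file -p HEAD:B.txt
```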
We can also delete this file from the working directory.
The staging area has not changed, so git can report that there is an uncommitted change (the deletion of B.txt).
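A sketch of the deletion, assuming the file is removed with a plain rm (contents and commit messages are placeholders):

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
echo "first file" > A.txt && git add A.txt && git commit -q -m "Add A.txt"
echo "second file" > B.txt && git add B.txt && git commit -q -m "Add B.txt"

# Delete the file from the working directory only.
rm B.txt

# The staging area still lists both files.
git ls-files

# " D": nothing staged, file deleted in the working directory.
git status --short    # prints " D B.txt"
```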
Thanks to the staging area, we can revive that file:
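The original command is not shown here, so the sketch below uses one possible way to do it: git checkout -- B.txt, which rewrites the working-directory file from the copy recorded in the staging area. File contents and commit messages are placeholders.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
echo "first file" > A.txt && git add A.txt && git commit -q -m "Add A.txt"
echo "second file" > B.txt && git add B.txt && git commit -q -m "Add B.txt"
rm B.txt

# Restore the working-directory copy from the staging area.
# On Git 2.23 and later, "git restore B.txt" does the same thing.
git checkout -- B.txt

# The file is back with its staged content.
cat B.txt    # prints "second file"
```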
But we could also move forward with the deletion. The staging area is then back to 1 file, and we can update our repository accordingly:
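A sketch of that path, assuming the deletion is staged with git rm (which also works when the file is already gone from the working directory). Contents and commit messages are placeholders.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
echo "first file" > A.txt && git add A.txt && git commit -q -m "Add A.txt"
echo "second file" > B.txt && git add B.txt && git commit -q -m "Add B.txt"
rm B.txt

# Stage the deletion: the entry is removed from the staging area.
git rm -q B.txt
git ls-files          # prints "A.txt"
git status --short    # prints "D  B.txt"

# Record the deletion in the repository.
git commit -q -m "Remove B.txt"
git log --oneline
```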
Finally, let’s suppose that we have a branch branch01 with 2 commits.
As usual, the staging area is a copy of our current HEAD commit:
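The content of those 2 commits is not specified, so the sketch below builds a hypothetical branch01 whose commits each add one file. The file names, contents and messages are assumptions made for illustration.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"

# Hypothetical branch01 with 2 commits.
git checkout -q -b branch01
echo "first file"  > A.txt && git add A.txt && git commit -q -m "Add A.txt"
echo "second file" > B.txt && git add B.txt && git commit -q -m "Add B.txt"
git log --oneline

# Content of the staging area.
git ls-files --stage

# No difference between the staging area and HEAD.
git diff --cached --quiet && echo "staging area matches HEAD"
```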
We can visualize the fact that the staging area is actually some sort of copy of the HEAD commit (daac) by listing the content of that commit:
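A sketch using git cat-file -p, which pretty-prints any object of the git database. The repository is the same hypothetical branch01 setup with placeholder files, so the IDs printed will differ from the ones quoted in the text.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
git checkout -q -b branch01
echo "first file"  > A.txt && git add A.txt && git commit -q -m "Add A.txt"
echo "second file" > B.txt && git add B.txt && git commit -q -m "Add B.txt"

# Print the commit object: tree, parent, author, committer, message.
git cat-file -p HEAD

# The ID of the tree alone can be resolved with the ^{tree} suffix.
git rev-parse 'HEAD^{tree}'
```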
The key element here is the tree reference (2db45), which describes the tree associated with that commit.
We can also explore the content of that tree, which is exactly the content of our staging area:
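The comparison can be sketched as follows, in the same hypothetical branch01 setup. The two listings use slightly different column layouts, so the sketch keeps only the mode, object ID and path of each entry before comparing them; it assumes file names without spaces.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
git checkout -q -b branch01
echo "first file"  > A.txt && git add A.txt && git commit -q -m "Add A.txt"
echo "second file" > B.txt && git add B.txt && git commit -q -m "Add B.txt"

tree_id=$(git rev-parse 'HEAD^{tree}')

# Tree entries: mode, type, object ID, name.
git cat-file -p "$tree_id"

# Staging area entries: mode, object ID, stage number, name.
git ls-files --stage

# Keep (mode, ID, path) from both listings and compare them.
tree_entries=$(git ls-tree -r "$tree_id" | awk '{print $1, $3, $4}')
index_entries=$(git ls-files --stage | awk '{print $1, $2, $4}')
[ "$tree_entries" = "$index_entries" ] && echo "same content"
```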
And each time we switch branches, not only is the HEAD reference updated to point to the new current branch, but the staging area is also updated to match the content of that branch’s top commit:
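A sketch of a branch switch. The layout is hypothetical: an initial commit on the default branch (whatever its name is on your installation), then a branch01 that adds one more file.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
echo "first file" > A.txt && git add A.txt && git commit -q -m "Add A.txt"

# Remember the name of the default branch.
base=$(git symbolic-ref --short HEAD)

# Hypothetical branch01 with one extra file.
git checkout -q -b branch01
echo "second file" > B.txt && git add B.txt && git commit -q -m "Add B.txt"
git ls-files    # A.txt and B.txt

# Switch back: HEAD and the staging area both follow.
git checkout -q "$base"
git symbolic-ref --short HEAD
git ls-files    # only A.txt
```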