Git References in a Nutshell
Introduction
In a previous post, I explained the types and content of Git objects and how to reference them using SHA-1 checksums. However, checksums are hard to remember; it would be more convenient to associate a checksum with a meaningful name and then use that name to reference the object. Fortunately, Git provides such a mechanism through Git references, which this post covers. Although references can be defined for various kinds of objects, this post focuses on commit objects, as they are perhaps the most frequently referenced kind of object in practice.
Some background
Git references are stored as text files in three subdirectories (heads/, tags/ and remotes/) under .git/refs/. Since these files are all text files, you can inspect the content just by opening them with a text editor. As the names of the subdirectories suggest, there are three kinds of references, each of which will be introduced in one of the following sections.
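The layout described above can be explored with a few lines of Python. This is only a sketch: the function name list_refs is my own, the directory is a hand-made stand-in for .git/ rather than a real repository, and it assumes the references are stored as individual ("loose") files, which may not hold for every repository.

```python
import tempfile
from pathlib import Path


def list_refs(git_dir):
    """Map each loose reference name (e.g. 'refs/heads/master') to its checksum."""
    refs_root = Path(git_dir) / "refs"
    refs = {}
    for path in sorted(refs_root.rglob("*")):
        if path.is_file():
            # The reference name is the file's path relative to the .git directory.
            name = path.relative_to(git_dir).as_posix()
            refs[name] = path.read_text().strip()
    return refs


# Demonstration on a hand-made directory that mimics .git/ (not a real repository).
demo_git_dir = Path(tempfile.mkdtemp()) / ".git"
sha = "8d3dc140bf3fe82f050acef49d4be6e0e44d8016"
for name in ("refs/heads/master", "refs/remotes/origin/master"):
    ref_file = demo_git_dir / name
    ref_file.parent.mkdir(parents=True, exist_ok=True)
    ref_file.write_text(sha + "\n")

for name, checksum in list_refs(demo_git_dir).items():
    print(name, checksum)
```

Pointing list_refs at the .git/ directory of a real repository should print one line per loose reference, under the same assumption.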
In the following text, I will pick up the example created in the previous post (see the link above) for demonstration. You should finish reading that post before you continue with this one.
Heads
A head reference points to the last commit object of a branch. It is created whenever a branch is created, with the name of the branch (e.g., “master”) being the reference that points to the commit. To list existing head references, use the git branch command with no other options.
In the previous example, there is a branch named “master”, so there is also a file named “master” in .git/refs/heads/. (Note that “master” is not a special branch in Git; it is just the branch that the git init command creates by default.) If you open the file with an editor, you will see the checksum of the latest commit object in the branch:
8d3dc140bf3fe82f050acef49d4be6e0e44d8016
With this file, Git associates the name “master” with the commit object denoted by the above checksum. Therefore, if we use git checkout master, Git knows which commit to check out.
It should be emphasized again that a head reference always points to the last commit of a branch. This means that the associated checksum changes whenever a new commit is made to that branch, so you can’t depend on it to reference the same commit throughout the history of the project.
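To make the "moving pointer" behaviour concrete, here is a minimal sketch that reads a head reference before and after its file changes. The helper resolve_branch is a name I made up, the directory is a hand-made stand-in for .git/, and the second checksum is a placeholder rather than a real commit; in a real repository Git itself rewrites this file when you commit.

```python
import tempfile
from pathlib import Path


def resolve_branch(git_dir, branch):
    """Return the checksum stored in <git_dir>/refs/heads/<branch>."""
    return (Path(git_dir) / "refs" / "heads" / branch).read_text().strip()


# Hand-made stand-in for .git/; a real repository is managed by Git itself.
git_dir = Path(tempfile.mkdtemp()) / ".git"
master = git_dir / "refs" / "heads" / "master"
master.parent.mkdir(parents=True)

master.write_text("8d3dc140bf3fe82f050acef49d4be6e0e44d8016\n")
before = resolve_branch(git_dir, "master")

# Simulate the effect of a new commit: the file now holds a different checksum.
# The value below is a made-up placeholder, not a real commit.
master.write_text("1" * 40 + "\n")
after = resolve_branch(git_dir, "master")

print(before != after)  # True
```

The name “master” stays the same throughout, while the commit it resolves to does not.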
Remotes
The last type of reference is the remote reference. To list the remotes configured in the repository, use the git remote command with no other options; to list the remote references themselves, use git branch -r.
A remote can be added via the git remote add command. Then, when you use git push to push a branch to the remote, Git creates a file under a subdirectory in .git/refs/remotes/, with the name of the subdirectory being the name of the remote (e.g., origin/) and the name of the file being the remote branch (e.g., master). This file is similar to that for a head reference, except that it contains the checksum of the last commit that you pushed to (or fetched from) the remote branch instead of the last commit on the local branch.
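Because a head reference and a remote reference are both plain files holding a checksum, comparing them is straightforward. The sketch below does only that; read_ref and same_commit are names I chose for illustration, the directory is a hand-made stand-in for .git/, and the local checksum is a placeholder. Equality only tells you that both names point to the same commit; it says nothing about which side is ahead when they differ.

```python
import tempfile
from pathlib import Path


def read_ref(git_dir, ref_name):
    """Read a loose reference such as 'refs/remotes/origin/master'."""
    return (Path(git_dir) / ref_name).read_text().strip()


def same_commit(git_dir, branch, remote="origin"):
    """True if the local branch and its remote counterpart name the same commit."""
    local = read_ref(git_dir, f"refs/heads/{branch}")
    tracked = read_ref(git_dir, f"refs/remotes/{remote}/{branch}")
    return local == tracked


# Hand-made stand-in for .git/: the local branch has moved on (placeholder
# checksum), while the remote reference still holds the last pushed commit.
git_dir = Path(tempfile.mkdtemp()) / ".git"
for name, checksum in {
    "refs/heads/master": "2" * 40,
    "refs/remotes/origin/master": "8d3dc140bf3fe82f050acef49d4be6e0e44d8016",
}.items():
    ref_file = git_dir / name
    ref_file.parent.mkdir(parents=True, exist_ok=True)
    ref_file.write_text(checksum + "\n")

print(same_commit(git_dir, "master"))  # False
```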
Additional notes
A post that explains Git references would not be complete without mentioning the HEAD file, which is located in .git/. The content of this file points to the commit that is currently checked out. This information is needed when a new commit is made and Git needs to specify the parent of the new commit (see the previous post). If a commit is checked out via a head reference, HEAD points to that reference; in this case, when a new commit is made, the reference is updated to point to the new commit. However, if the same commit is checked out via a checksum, a tag or a remote reference, HEAD contains the checksum of that commit directly; in this case, HEAD is said to be in a “detached” state, and no head reference is updated when a new commit is made.
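The two cases can be told apart from the content of the HEAD file alone: when a branch is checked out, the file holds a line of the form "ref: refs/heads/master", and otherwise it holds a bare checksum. The following is a small sketch of that distinction; the function name describe_head and the "attached"/"detached" labels are my own, not Git terminology for an API.

```python
def describe_head(head_content):
    """Classify the content of .git/HEAD as attached to a reference or detached."""
    content = head_content.strip()
    prefix = "ref: "
    if content.startswith(prefix):
        # HEAD names a reference; new commits will move that reference.
        return ("attached", content[len(prefix):])
    # HEAD holds a checksum directly; no head reference follows new commits.
    return ("detached", content)


print(describe_head("ref: refs/heads/master\n"))
# ('attached', 'refs/heads/master')
print(describe_head("8d3dc140bf3fe82f050acef49d4be6e0e44d8016\n"))
# ('detached', '8d3dc140bf3fe82f050acef49d4be6e0e44d8016')
```

To try it on a real repository, pass in the text read from .git/HEAD.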
Another thing worth mentioning is that the git push command by default does not push tags to remote repositories. To do it manually, use git push <remote> <tag>, where <remote> is the targeted remote and <tag> is the tag to be pushed.
Summary
The previous post and this post complement each other in introducing two important aspects of Git: objects and references. This knowledge should help you understand how Git works and clear up some confusion when using various Git commands.