Git References in a Nutshell

hacking skills

Author

zenggyu

Published

2018-09-12

Abstract

A brief introduction to git references.

Introduction

In a previous post, I explained the types and content of Git objects and how to reference them using SHA-1 checksums. However, checksums are hard to remember; it would be more convenient to associate a checksum with a meaningful name and then use that name to reference the object. Fortunately, Git provides such a mechanism through Git references, which are the subject of this post. Although references can be defined for various kinds of objects, this post focuses on commit objects, as they are perhaps the most frequently referenced kind of object in practice.

Some background

Git references are stored as text files in three subdirectories (heads/, tags/ and remotes/) under .git/refs/. Since these files are all text files, you can inspect the content just by opening them with a text editor. As the names of the subdirectories suggest, there are three kinds of references, each of which will be introduced in one of the following sections.
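To make the layout concrete, here is a minimal sketch that builds a throwaway repository and lists its references both through Git and as files on disk. The repository location, the user identity and the tag name v0.1 are placeholders of mine, not part of the example from the previous post, and the sketch assumes Git's default file-based reference storage.

```shell
set -e
# Throwaway repository; every name below is a placeholder for this demo.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master   # pin the branch name to "master"
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "first commit"
git tag v0.1                              # a lightweight tag

# One line per reference: checksum, object type, full reference name.
git for-each-ref

# The same references as plain text files on disk.
find .git/refs -type f | sort
```

Since no remote has been added, only files under heads/ and tags/ show up at this point.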

In the following text, I will pick up the example created in the previous post (see the link above) for demonstration. You should finish reading that post before you continue with this one.

Heads

A head reference points to the last commit object of a branch. It is created whenever a branch is created, with the name of the branch (e.g., “master”) being the reference that points to the commit. To list existing head references, use the git branch command with no other options.

In the previous example where there is a branch named “master”¹, you can see there is also a file named “master” in .git/refs/heads/. If you open it with an editor, you will see the checksum that points to the latest commit object in the branch:

¹ Note that the “master” branch in Git is not a special branch; it is just a branch that is created by default by the git init command.

8d3dc140bf3fe82f050acef49d4be6e0e44d8016

With this file, Git associates the name “master” with the commit object denoted by the above checksum. Therefore, if we use git checkout master, Git knows which commit to check out.

It should be emphasized again that a head reference always points to the last commit of a branch. This means that the associated checksum changes whenever a new commit is made to that branch, so you can’t depend on it to reference the same commit throughout the history of the project.
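The moving nature of a head reference is easy to observe. The following sketch uses a fresh throwaway repository with empty commits (my own placeholders, not the example from the previous post) and reads the reference file before and after a second commit; it assumes the reference is stored as a loose file, which is what Git does by default right after a commit.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master   # pin the branch name to "master"
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "first commit"
before=$(cat .git/refs/heads/master)

git commit -q --allow-empty -m "second commit"
after=$(cat .git/refs/heads/master)

echo "before: $before"
echo "after:  $after"

# The file now holds the new commit; the old checksum is its parent.
git rev-parse master master~1
```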

Tags

If you need a name that always references the same commit, define a tag reference. This kind of reference is ideal for marking important development milestones like releases. To list existing tag references in the repository, use the git tag command with no other options.

There are two types of tags: lightweight and annotated. According to the book Pro Git:

A lightweight tag is very much like a branch that doesn’t change – it’s just a pointer to a specific commit. Annotated tags, however, are stored as full objects in the Git database. They’re checksummed; contain the tagger name, email, and date; have a tagging message; and can be signed and verified with GNU Privacy Guard (GPG). It’s generally recommended that you create annotated tags so you can have all this information; but if you want a temporary tag or for some reason don’t want to keep the other information, lightweight tags are available too.

Tag references can be created using the git tag command. To create an annotated tag named “v1.0” that points to the first commit (394606d39892d861cea4f1250cd6ddd9c1ae933b) in the example, use the git tag command with the -a option:

git tag -a -m 'This is a tag that points to the first commit.' v1.0 394606d39892d861cea4f1250cd6ddd9c1ae933b

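where the -m option should be followed by a tagging message. After the command is executed, you will find a file named v1.0 in .git/refs/tags/ of the project directory. Because the tag is annotated, this file contains the checksum of the newly created tag object, which in turn points to the commit above (for a lightweight tag, the file would contain the commit’s checksum directly). Now that I’ve added the tag, I can use it as a reference to check out the commit: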

git checkout v1.0

This is much more convenient than using the SHA1 checksum as a reference for the same purpose:

git checkout 394606d39892d861cea4f1250cd6ddd9c1ae933b
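The difference between the two tag types shows up in what the reference file contains. In the sketch below (a throwaway repository; the tag name light-v1.0 is hypothetical), the lightweight tag’s file holds the commit checksum itself, while the annotated tag’s file holds the checksum of a separate tag object that has to be “peeled” to reach the commit.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "first commit"
commit=$(git rev-parse HEAD)

git tag light-v1.0 "$commit"                       # lightweight tag
git tag -a -m "An annotated tag." v1.0 "$commit"   # annotated tag

light_ref=$(cat .git/refs/tags/light-v1.0)
annotated_ref=$(cat .git/refs/tags/v1.0)

git cat-file -t "$light_ref"       # prints: commit
git cat-file -t "$annotated_ref"   # prints: tag

# Peeling the annotated tag yields the commit it points to.
peeled=$(git rev-parse "v1.0^{commit}")
echo "$peeled"
```

Either tag name can be passed to git checkout; Git follows the tag object to the commit automatically.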

Remotes

The last type of reference is the remote reference. To list existing remote references in the repository, use the git branch -r command; git remote with no other options lists only the names of the configured remotes.

A remote can be added via the git remote add command. Then, when you use git push to push a branch to the remote, Git creates a file under a subdirectory in .git/refs/remotes/, with the name of the subdirectory being the name of the remote (e.g., origin/) and the name of the file being the remote branch (e.g., master). This file is similar to that for a head reference, except that it contains the checksum of the last commit known to be on the remote branch (as of your most recent push or fetch) instead of the last commit on the local branch.
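To try this without a hosting service, a bare repository on the local disk can stand in for the remote. The sketch below is only an illustration under that assumption; the directory names and the remote name origin are my own choices. It pushes one commit, makes a second commit locally, and shows that the file under .git/refs/remotes/origin/ still records the pushed commit.

```shell
set -e
work=$(mktemp -d)
git init -q --bare "$work/remote.git"   # stands in for a hosted remote
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "first commit"
pushed=$(git rev-parse master)

git remote add origin "$work/remote.git"
git push -q origin master

git commit -q --allow-empty -m "second commit (not pushed)"
local_tip=$(git rev-parse master)
remote_tip=$(cat .git/refs/remotes/origin/master)

echo "local master:  $local_tip"
echo "origin/master: $remote_tip"   # still the commit that was pushed
git branch -r                       # lists origin/master
```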

Additional notes

A post that explains Git references would not be complete without mentioning the HEAD file, which is located in .git/. The content of this file identifies the commit that is currently checked out. Git needs this information when a new commit is made, because it must record the parent of the new commit (see the previous post). If a commit is checked out via a head reference, HEAD contains the name of that reference rather than a checksum; in this case, when a new commit is made, the reference is updated to point to the new commit. However, if the same commit is checked out via a checksum, a tag or a remote reference, HEAD contains the checksum of that commit directly; in this case, HEAD is said to be in a “detached” state, and no head reference is updated when a new commit is made.
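Both states can be seen by reading the HEAD file directly. This is a minimal sketch in a throwaway repository (placeholder names again): HEAD first holds a symbolic pointer to the master reference, then a bare checksum after checking out by checksum, and a commit made in that detached state leaves master where it was.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "first commit"
commit=$(git rev-parse HEAD)

attached=$(cat .git/HEAD)
echo "$attached"               # ref: refs/heads/master

git checkout -q "$commit"      # check out by checksum: HEAD becomes detached
detached=$(cat .git/HEAD)
echo "$detached"               # the commit checksum itself

git commit -q --allow-empty -m "commit made while detached"
git rev-parse master HEAD      # master is unchanged; only HEAD moved
```

Running git checkout master afterwards re-attaches HEAD to the branch.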

Another thing worth mentioning is that the git push command by default does not push tags to remote repositories. To do it manually, use git push <remote> <tag> where <remote> is the targeted remote and <tag> is the tag to be pushed.
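This behaviour can be checked with a local bare repository standing in for the remote (an assumption for the sake of the demo; all names are placeholders). The sketch pushes the branch, confirms that the remote has no tags yet, and then pushes the tag explicitly.

```shell
set -e
work=$(mktemp -d)
git init -q --bare "$work/remote.git"   # stands in for a hosted remote
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "first commit"
git tag -a -m "First release." v1.0
git remote add origin "$work/remote.git"

git push -q origin master
tags_before=$(git ls-remote --tags origin)
echo "tags on remote before: [$tags_before]"   # empty: the tag stayed local

git push -q origin v1.0
tags_after=$(git ls-remote --tags origin)
echo "$tags_after"                             # now includes refs/tags/v1.0
```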

Summary

The previous post and this post complement each other in introducing two important aspects of Git: objects and references. Such knowledge should promote a better understanding of how Git works and clear up some confusion when using various Git commands.