- When working on a feature branch, it’s a good idea to rebase onto the latest
version of the source branch to keep the history of the feature branch clean (without
merge commits): `git rebase develop`.
`git pull --rebase` is useful in the above situation. It automates the whole thing.
- Git is able to automatically detect that a file was renamed, without an
explicit `git mv`. `git mv` automatically adds the change to the Staging Area (index).
- `git rebase -i` is powerful - the commits can be reordered!
- If something goes wrong during a rebase, we can recover our previous state by
moving the branch back to the previous HEAD (`git reflog HEAD` might help to find
the SHA-1 of the old HEAD commit), before it’s garbage collected.
git show <COMMIT>:
- `HEAD^^` - parent of the parent
- `HEAD~3` - 3rd commit behind
- `HEAD~2^2` - if the 2nd commit behind HEAD is a merge commit, it has 2 parents. This is how we refer to the second parent.
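The `^` and `~` suffixes can be illustrated with a small sketch. The commit graph below is invented for illustration (the names `A` to `E`, `X`, `M` are not from any real repository), and `resolve` is a hypothetical helper that only mimics how these two suffixes walk the parent links; real Git supports many more revision forms.

```python
# Toy commit graph: each commit maps to its parents, first parent first.
# All commit names are made up for this example.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "X": ["A"],
    "M": ["C", "X"],  # merge commit: first parent C, second parent X
    "D": ["M"],
    "E": ["D"],       # pretend this is HEAD
}


def resolve(start, suffix):
    """Resolve a suffix such as '^^', '~3' or '~2^2' starting at `start`."""
    commit = start
    i = 0
    while i < len(suffix):
        op = suffix[i]
        if op not in "~^":
            raise ValueError(f"unsupported character: {op!r}")
        i += 1
        digits = ""
        while i < len(suffix) and suffix[i].isdigit():
            digits += suffix[i]
            i += 1
        n = int(digits) if digits else 1
        if op == "~":
            # ~n: follow the first parent n times
            for _ in range(n):
                commit = PARENTS[commit][0]
        elif n > 0:
            # ^n: pick the n-th parent (^0 means the commit itself)
            commit = PARENTS[commit][n - 1]
    return commit


print(resolve("E", "^^"))    # parent of the parent
print(resolve("E", "~3"))    # 3rd commit behind
print(resolve("E", "~2^2"))  # second parent of the merge 2 commits behind
```

In this toy graph, `~2` lands on the merge commit `M`, so `~2^2` selects its second parent `X`, while `~3` keeps following first parents and lands on `C`.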
Git can be seen as a set of layers:
Persistent key-value map
Git stores objects. Key is SHA-1, and the value is the contents.
Each commit is a point in the history. Git creates a kind of file system built with objects. Different commits point to a different “shape” of our repository.
.git/objects folder is a database of all Git objects.
There are 4 types of objects in Git:
- blobs
- trees
- commits
- annotated tags
Every object in Git has its SHA-1.
For example, if a commit is created, a new file is created in `.git/objects`,
named after the commit’s SHA-1 (the first two characters form a subdirectory,
the rest form the file name).
It contains basic information about the commit, including the SHA-1 of a tree - a
directory object in Git. If the commit was not the first in history, there would
also be parent information (the SHA-1 of the parent commit or commits). If we look at the
contents of the tree object (with
`git cat-file -p 4e9e0b052b2f0c5aaeabd215eacbb62c6ea299bf`) we’d see the entries of the repo’s root folder.
The root folder of our repo contains two things:
directory. These 2 are shown above. Blob is an object that represents a
file. Contents of
This is exactly the same content as the file in our repo. Blobs store just the contents of the file. They do not store the file’s name or its permissions. These are stored in the tree that points to that blob.
The commit points to a tree object, which represents the state of the repo for this commit. A tree then points to its contents: files (blobs) and directories (trees).
If there were 2 different files with the same content, just one object would be created. It would be pointed to twice (by different trees, if different directories contained these files).
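The key-value idea can be sketched in a few lines. The snippet below assumes a SHA-1 repository, where a blob’s key is the SHA-1 of a short header (`blob <size>` followed by a NUL byte) plus the file contents. The `store` dictionary and the `put` helper are hypothetical stand-ins for the objects database, and the file contents are made up.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Compute the object id Git would assign to a blob with this content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical in-memory stand-in for the objects database.
store = {}


def put(content: bytes) -> str:
    """Store content under its id; identical content reuses the same key."""
    key = blob_id(content)
    store[key] = content
    return key


first = put(b"flour, eggs\n")   # imagine this is one file
second = put(b"flour, eggs\n")  # a different file with the same content

print(first == second)  # same key for both files
print(len(store))       # only one object is stored
```

Because the key depends only on the contents, the file name plays no role here, which is why names and permissions have to live in the tree.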
An example of Git objects:
There are 2 commits (red). In the last commit, one file (
modified. Because of that, a new object was created for that new state of this
file. Since the
recipes directory stayed the same, both commits point to the
Git might also, for optimization, work a bit differently if files are big. When we change just a line of such a file, Git will store only the difference, instead of copying the entire file into a new blob object. However, for simplicity, it can be ignored.
After cloning a project from a remote, the
`objects` directory will not contain
the individual objects. Instead, there will be a “pack” containing all the objects.
Branches are stored in
`.git/refs/heads`. Each branch has its own file, which contains
just the SHA-1 of the commit it points to. A branch is just a reference to some commit.
git cat-file -p <SHA-1 of an object> - displays the object’s content
git count-objects - show how many objects there are in the repo
The SHA-1 of a commit is globally unique. A collision is possible, but the chance of that is very low. The SHA-1 of a blob or tree is not unique to one repository, because it depends on the contents only: the same contents give the same SHA-1 anywhere. A commit’s SHA-1 also depends on the author and the timestamp.
When there’s an ongoing merge (e.g., one stopped by a conflict), the
`.git` directory will contain
`MERGE_*` files (such as `MERGE_HEAD`), which record what we are merging.
`.git/HEAD` contains the path to the current branch, e.g.
`ref: refs/heads/master`. HEAD is a reference to a branch. When checking out another
branch, the only reference that changes is the content of `.git/HEAD`.
If we check out a commit (“detached” HEAD) instead of a branch, `.git/HEAD` will
contain the SHA-1 of that commit.
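The two possible shapes of the HEAD file can be told apart with a simple check. This is a minimal sketch: `describe_head` is a hypothetical helper that takes the file’s text as a string, and the SHA-1 used below is a made-up placeholder.

```python
def describe_head(head_content: str):
    """Classify the text of .git/HEAD as a branch reference or a detached commit."""
    text = head_content.strip()
    prefix = "ref: "
    if text.startswith(prefix):
        # Symbolic reference: HEAD points to a branch file under refs/heads
        return ("branch", text[len(prefix):])
    # Otherwise the file holds a commit SHA-1 directly ("detached" HEAD)
    return ("detached", text)


print(describe_head("ref: refs/heads/master\n"))
print(describe_head("0123456789abcdef0123456789abcdef01234567\n"))
```

In a real repository the string could be read with `open(".git/HEAD").read()`, assuming the current directory is the repository root.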
We can still commit in the detached state. If we then switch back to some
branch, these commits will be unreachable. After some time Git will remove them
from the objects database. We can still use these commits until that happens.
`.git/index` contains Staging Area changes. It’s a binary file, so its content
is not human-readable.
Rebase moves commits around. It can do a fast-forward, just like
`git merge`. In this case,
`git rebase spaghetti` and
`git merge spaghetti` will have the
same result - “master” will be moved to the “spaghetti” commit.
When rebasing commits, NEW commits are created that are mostly the same as the
original ones. Parents change, and this changes the SHA-1 - new objects will be
created as a result. The original commits will be deleted at some point from the
objects database by Git (garbage collection).
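Why a changed parent forces a new commit object can be sketched by hashing a simplified commit body. This assumes a SHA-1 repository and the usual textual layout of a commit object (`tree`, `parent`, `author`, `committer`, blank line, message). The author string, message, and parent ids are invented, and a real rebase may also change the tree and the committer timestamp.

```python
import hashlib

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # id of Git's empty tree


def commit_id(tree, parents, author, message):
    """Hash a simplified commit object the way Git hashes commits."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines.append(f"author {author}")
    lines.append(f"committer {author}")
    body = ("\n".join(lines) + "\n\n" + message + "\n").encode()
    header = b"commit " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# Hypothetical values, for illustration only.
author = "A U Thor <author@example.com> 1700000000 +0000"
old_parent = "1" * 40
new_parent = "2" * 40

original = commit_id(EMPTY_TREE, [old_parent], author, "Add recipe")
rebased = commit_id(EMPTY_TREE, [new_parent], author, "Add recipe")

print(original)
print(rebased)
print(original != rebased)  # same change, different parent, different SHA-1
```

The tree, author, and message are identical in both calls; only the parent line differs, and that alone is enough to produce a different object.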
There are “normal” (lightweight) and “annotated” tags. Tags are similar to branches, with one difference - branches move with commits, tags do not move.
Normal tags are just like a label. They are stored in
`.git/refs/tags` and contain just
the SHA-1 of the commit that they point to. Such a tag could easily be
transformed into a branch, just by moving it to
`.git/refs/heads`, where all
branches are stored.
Annotated tags contain more information - a date, an author, a description. They are stored as an object.
.git/config contains the remotes configuration.
`git pull` is a combination of
`git fetch` and `git merge`.
Adding a part of a file
It might happen that we modify a few lines of a file, but we do not want to add
all of it to the commit. We want to add just a part of these changes.
`git add -p <file>` does that. It splits the changes into hunks. Git asks about each
hunk and we have to decide to include it (`y`) or not (`n`). We might also
decide to split the hunk into smaller ones (`s`) and decide for each of the
smaller hunks separately. After it’s done, we may commit.
A few other git commands have the
git reflog - show the log of things happening in the repo (switching branches,
new commits, rebases, etc.)
git reflog HEAD - show the log of where
`git show <SHA-1>` - show any object (e.g., a file’s blob) by its SHA-1
`git config --global --replace-all core.pager "less -iXFR"` - displays
`git log` (and others) like `cat`, instead of `less` (content does not disappear).
`git show <COMMIT>` - shows changes introduced by a commit
`git blame <FILE>` - shows, line by line, who changed each line and when.
^ means that the line was there since the
`git revert <COMMIT>` - creates a commit that undoes the changes
made by the supplied commit. It can also revert multiple commits.
Reverting merge commits is not so straightforward, and requires a slightly different approach.
`git reset` does different things in different contexts.
`git reset` moves the
branch to another commit. It can work in different modes:
- `--hard` - sets the working directory and index to the state of the new
commit - useful when working directory changes do not make sense and we want to
get back to where we started from
- `--mixed` (default) - does not change the
working directory, only sets the index to the new commit - useful when we have
some changes staged, but we want to unstage them. It optionally accepts a path
to a file that should be “reset”.
- `--soft` - just moves the branch, modifies neither
the working directory nor the index - useful when we want to move back to some
commit from the past, modify some stuff, and commit again (rewriting history)
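The difference between the three modes can be modelled with a toy repository. Everything below is a simplified sketch: the `Repo` class, the `SNAPSHOTS` table, the commit names `c1` and `c2`, and the file `notes.txt` are all hypothetical, and the real command handles many more cases (paths, merges, untracked files).

```python
from dataclasses import dataclass

# Hypothetical snapshots: the files each commit records.
SNAPSHOTS = {
    "c1": {"notes.txt": "v1"},
    "c2": {"notes.txt": "v2"},
}


@dataclass
class Repo:
    branch: str    # commit the current branch points to
    index: dict    # staged content
    workdir: dict  # content on disk


def fresh_repo():
    """Branch at c2, with a newer edit (v3) both staged and on disk."""
    return Repo(branch="c2", index={"notes.txt": "v3"}, workdir={"notes.txt": "v3"})


def reset(repo, target, mode="mixed"):
    """Toy model of what each reset mode touches."""
    if mode not in ("soft", "mixed", "hard"):
        raise ValueError(f"unknown mode: {mode}")
    repo.branch = target                        # every mode moves the branch
    if mode in ("mixed", "hard"):
        repo.index = dict(SNAPSHOTS[target])    # index now matches the target commit
    if mode == "hard":
        repo.workdir = dict(SNAPSHOTS[target])  # working directory is overwritten too
    return repo


for mode in ("soft", "mixed", "hard"):
    r = reset(fresh_repo(), "c1", mode)
    print(mode, r.branch, r.index["notes.txt"], r.workdir["notes.txt"])
```

Reading the printed rows side by side shows the pattern: all modes move the branch, `--mixed` additionally rewrites the index, and only `--hard` discards the edit on disk.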
`git switch` - an alternative to `checkout` for switching branches. Use
`-c` to create one.
`git restore` - an alternative to `checkout` for restoring files from the past.
Remove file from history
It is possible to remove some file entirely from Git history as if it never existed.
Entire history gets rewritten, new commits are created. This command might need to be installed on its own.
`git bisect` helps find the commit that introduced some issue into the
codebase. It can work manually or automatically. We specify the range of commits
that the tool will look in. It performs a binary search over that range, checking
one commit at a time, until it finds the first “bad” one.
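The search strategy can be sketched as a plain binary search. This sketch assumes a linear history in which every commit after the first bad one is also bad. The commit names and the `is_bad` check are invented, and `first_bad` is a hypothetical helper, not part of Git.

```python
def first_bad(commits, is_bad):
    """Find the first bad commit.

    `commits` is ordered oldest to newest; commits[0] is known to be good
    and commits[-1] is known to be bad.
    Returns the first bad commit and the number of commits that were tested.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] is good, commits[hi] is bad
    tested = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tested += 1
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi], tested


# Hypothetical history of 16 commits where the bug appears in c11.
commits = [f"c{i}" for i in range(16)]


def is_bad(commit):
    return int(commit[1:]) >= 11


print(first_bad(commits, is_bad))
```

With 16 commits in the range, only 4 need to be tested here, which is why bisecting scales well to long histories.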
The `.gitattributes` file lets us specify various per-path settings.
Clean and Smudge filters
Git is able to remove sensitive data from specified files when committing. It is also able to set the sensitive values when checking out the repository. It’s done using Git Attributes.
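The logic of such a filter pair can be sketched as two text transformations. The secret value, the placeholder, and the function names are hypothetical. A real filter would be a program that reads the file from standard input and writes the result to standard output, registered through the `filter.<name>.clean` and `filter.<name>.smudge` config keys and enabled for a path with a `filter=<name>` attribute.

```python
# Hypothetical values, for illustration only.
SECRET = "s3cr3t-value"
PLACEHOLDER = "@@API_KEY@@"


def clean(text: str) -> str:
    """Applied when staging: replace the secret before it reaches the repository."""
    return text.replace(SECRET, PLACEHOLDER)


def smudge(text: str) -> str:
    """Applied on checkout: put the secret back into the working copy."""
    return text.replace(PLACEHOLDER, SECRET)


working_copy = "api_key = s3cr3t-value\n"
stored = clean(working_copy)

print(stored)                          # what would be committed
print(smudge(stored) == working_copy)  # checkout restores the original
```

The two functions are inverses of each other for this file, which is the property a clean/smudge pair needs so that checking out and re-staging does not produce spurious changes.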
Submodules are a way to share code between repos. One repo may make use of many submodules.
There should be some directory for submodules (i.e.,
The submodule will be cloned into the selected path. A new file - `.gitmodules` -
will be created, listing all submodules.
More information about submodules
By default, `git status` and
`git diff` do not show information about
modifications to submodules.
With the relevant configuration enabled, `git status` gives more information (e.g., submodule added)
and `git diff` shows info about submodules.
Cloning a repo with submodules
The second way will work also if there are nested submodules.
Syncing repo with submodules
When working with a team, the following commands let us make sure that our whole repo is in sync with the remote:
When we update a submodule, we need to push it, together with the “main” repo that contains this submodule. By default, we need to do it manually. There is a config that automates that:
Now, when pushing the main repo, the submodules will be pushed as well.
There are client-side and server-side hooks that can be added. They will be run
automatically on specified actions. By default, client-side hooks are stored in
`.git/hooks`. Therefore they are not shared with the team. This can be changed via