Must-Know Git Commands and Practices for Software Developers

Rafi Muhammad Daffa
Python in Plain English
11 min read · May 4, 2021


Note: Some scenarios in this article are based on a true story from the development of the Crowd+ software by the Nice PeoPLe team.

Let’s face the truth. Almost every software project these days uses some form of version control system (VCS) in the development process. Adoption is accelerated by the growing complexity of software and the growing number of developers participating in its development, which make versioning more important than ever. One of the most popular VCSs out there is Git, which will be covered in this article.

At its core, there are four “spaces” that we should care about in Git:

  1. Workspace: This is the space where we do our work and is directly visible to and editable by us. All modifications will initially be stored here.
  2. Staging Area: This is the space where we “stage” the modifications that we want to commit. It comes in handy when we want to commit changes incrementally, and it is best practice to stage modifications explicitly before committing, even when every change is about to be committed.
  3. Repository: This is the persistent storage in Git, in which the changes between commits are tracked and each commit is addressable by its commit ID.
  4. Stashing Area: This is the area to which changes can be moved out of the way temporarily from the workspace without staging or committing them.

In addition to providing these four spaces, Git also provides mechanisms to transport files between the spaces and between commits (a commit is Git’s analogue of a file version, although certain commits can also be marked with a “tag” that carries a more familiar version number). Most of Git’s commands (including the ones elaborated in this article) serve this function.

The Greatest Common Divisor: CLI

In its purest form, Git’s main interface is the command line. Graphical interfaces generally act as adapters on top of the same operations. Although most IDEs these days provide integrated Git functionality with robust graphical interfaces, some scenarios may prevent you from using them, such as working on a CLI-only server terminal. Therefore, it is prudent to be able to operate Git in its most basic form: the CLI. This method will be used for this article. Usage through graphical interfaces can be intuitively inferred from the command-line counterparts.

Like most command-line software, Git can be installed either through an installer package (such as for Windows and macOS) or through a package manager (such as for Linux and FreeBSD). These installation methods also add Git to the PATH, which makes the git command available from any folder. This can be validated by checking Git’s version using this command:

git --version
When the installed Git’s version shows up upon execution of the command, then you are ready to go.

The Two Starting Points: Init and Clone

There are usually two scenarios in which developers may start using Git in the development lifecycle:

  1. The developer is starting the repository locally from scratch
  2. A remote repository already exists and the developer will continue work on it.

In the first scenario, the way to initiate a Git repository is through the init command, which creates an empty repository and its necessary support files in the current folder.

git init
This command will initiate an empty repository in which you can start working
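As a minimal sketch of the first scenario, the script below initializes a repository in a throwaway folder and asks Git to confirm it. The folder name my-project and the temporary location are assumptions made for illustration only.

```shell
set -eu

# Hypothetical scratch location; any empty folder works the same way.
project="$(mktemp -d)/my-project"
mkdir -p "$project"
cd "$project"

# Create the empty repository; Git stores its support files in .git
git init -q
ls -a

# Ask Git whether the current folder is inside a repository work tree.
inside="$(git rev-parse --is-inside-work-tree)"
echo "inside a repository: $inside"
```

The hidden .git folder is the repository itself, so deleting it turns the folder back into a plain folder.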

The above command is actually also valid for the second scenario, with some extra steps to make our local repository aware of a so-called “remote repository” and to pull the existing files. However, Git has simplified the process into a single clone command. Do note that cloning creates a new folder by default instead of creating a repository in the current folder like before.

git clone <remote repo url>
When cloning, Git will create a new folder named according to the repository’s name and set up a repository with files from the remote repository there.
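To try cloning without a network connection, a local bare repository can stand in for the remote one. This is only a sketch: the name crowd-demo.git is made up, and a real remote would normally be an HTTPS or SSH URL.

```shell
set -eu
base="$(mktemp -d)"
cd "$base"

# Stand-in for a hosted repository: a local bare repository (hypothetical name).
git init -q --bare crowd-demo.git

# Cloning creates a folder named after the repository, minus the .git suffix.
# The "empty repository" warning is expected here, so it is silenced.
git clone -q "$base/crowd-demo.git" 2>/dev/null
cd crowd-demo

# The clone already knows its remote under the name "origin".
remote_url="$(git remote get-url origin)"
echo "origin -> $remote_url"
```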

Tracking and Staging Changes: Status, Add, and Remove

Now that we have a functional Git repository, we can start working on our software. Other than doing it inside a Git repository folder, the working experience itself will not be different with or without Git. File I/O will work as usual until we decide to let Git know about the changes that we make in the repository’s workspace.

Let’s say that we add some files to our newly created repository. We can ask Git to produce a report on our repository, which lists the pending changes it detects in the workspace and the staging area. The report can be seen through the status command as follows:

git status
The status report picks up our new files as untracked files.

Untracked means Git has yet to track any changes to that file. This is the case for new files. Therefore, when we first stage the modification to this file, Git will also begin to track changes to the file.

Staging a file in preparation for a commit is done through the add command. It can be run repeatedly (even on the same file after further changes), and the staged changes accumulate:

git add <file/folder name>
# For example:
git add text.txt

Note: Staging a folder will trigger a recursive staging process in which every child of that folder will also be staged. Following this logic, staging . (dot, current directory) in the repository’s root directory stages every single change.
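The status and add steps can be tried together in a scratch repository. The sketch below uses the short --porcelain form of the status report, which is easier to read in a script than the full report; the temporary folder is an assumption for illustration.

```shell
set -eu
cd "$(mktemp -d)"
git init -q

echo "hello" > text.txt

# In the short format, "??" marks an untracked file.
before="$(git status --porcelain)"
echo "before add: $before"

git add text.txt

# "A" in the first column marks a file staged as a new addition.
after="$(git status --porcelain)"
echo "after add:  $after"
```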

Now, let’s say that we want to withdraw a staged change to a file. There are some cases that can warrant this, such as:

  1. Environment files in a software package (.env) are not intended to be committed to the VCS since doing so puts sensitive configuration at risk.
  2. Build artefacts such as Python caches are usually left out of the VCS since they will be generated again as part of the CI/CD process.

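To un-stage a newly added file (or folder), the reverse of the add command can be used: remove (rm). In this case, the --cached parameter is required so that the file is only removed from the staging area and stays in the workspace. Without this parameter, git rm also tries to delete the file from disk. For folders, the -r parameter is needed as well. Note that if the file has already been committed, git rm --cached stages its removal from the repository instead, which is the usual way to stop tracking a file such as .env.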

git rm --cached <file or folder name>
# For example:
git rm --cached text.txt
By applying the remove command, the original state is restored. In this case, the file becomes untracked again.
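A short sketch of the round trip, assuming a scratch repository and a file that has never been committed: the file is staged, then un-staged, and it is still present on disk afterwards.

```shell
set -eu
cd "$(mktemp -d)"
git init -q

echo "hello" > text.txt
git add text.txt

# Remove the file from the staging area only; the copy on disk is kept.
git rm -q --cached text.txt

state="$(git status --porcelain)"
echo "$state"
```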

Committing Changes

Once we have staged the appropriate changes to our repository, we can now commit the changes to our repository. This process is similar to creating a new version of our repository in the system (unrelated to the tags system in which commits can be given actual version numbering) which can be addressed by a unique ID. Once a commit command is sent, all files and folders currently in the staging area will have their changes recorded as part of a commit and the staging area will be cleared. To commit changes, the following command is used:

git commit -m "<commit message>"
# For example:
git commit -m "[DEMO] Add a text file"
Upon completion of the commit process, the changes in this commit are addressable through the commit ID (in this case, it is b677e24)
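The whole stage-and-commit cycle can be sketched as follows. The user name and email are placeholder values set only for this scratch repository, and the commit ID printed on your machine will differ from the one shown in this article.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
# Placeholder identity, local to this scratch repository.
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > text.txt
git add text.txt
git commit -q -m "[DEMO] Add a text file"

# The commit is now addressable by its ID (short form shown here).
short_id="$(git rev-parse --short HEAD)"
subject="$(git log -1 --format=%s)"
echo "$short_id $subject"

# After committing, nothing is left in the staging area.
staged="$(git diff --cached --name-only)"
```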

This process can be regarded as a point of no return since there is no “diplomatic” way to un-commit a change to the repository. Once a change is committed, it stays in the history as part of that commit. Although brute-force methods exist to remove a commit (they rewrite the history), policies can be implemented to prevent such methods from being used on shared branches, as is the case in GitHub and GitLab.

Undoing Committed Changes: Revert

While a committed change cannot be “diplomatically” uncommitted in the sense that the commit cannot be erased, undoing said changes is still possible. Rather than erasing the previous commit, the “diplomatic” way to undo the committed changes is to introduce a new commit whose changes reverse what the previous commit has done. It is important to note that the previous changes still exist and are addressable through their commit ID. This step does not magically make those changes disappear entirely. This is why thinking before committing is important (even in real life).

For example, when we commit changes in the incorrect workspace/folder, we can introduce a new commit to reverse those changes so that other people will not see our changes in their workspace, although they will still see the extra commits. 😄

To revert a change, we will need the ID of the “problematic” commit, which can be obtained from the log using the following command:

git log
The log command shows all commits in the repository, including details such as its long-form commit ID, author, date, and message.

The long-form commit ID obtained from the log can be copied and used to revert the commit using the following command:

git revert <commit ID>
# For example:
git revert b677e24158c4f84b9af80ba37512228d696e8b91
Since reverts are actually commits that reverse another commit’s changes, they can also have a commit message. In this case, a default message is generated which shows that this commit reverts another commit.
Reverts also have their own commit ID. By that logic, you can also revert a revert by doing the same command to this commit 🤣
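Here is a minimal sketch of a revert in a scratch repository. The file names and the placeholder identity are assumptions; the --no-edit flag simply accepts the generated message instead of opening an editor.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > README.txt
git add README.txt
git commit -q -m "Add README"

# The commit we will regret.
echo "oops" > text.txt
git add text.txt
git commit -q -m "[DEMO] Add a text file"
bad_commit="$(git rev-parse HEAD)"

# Add a new commit that reverses the problematic one.
git revert --no-edit "$bad_commit" > /dev/null

commit_count="$(git rev-list --count HEAD)"
git log --oneline
```

The history now holds three commits: the file is gone from the workspace, but the commit that added it is still in the log.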

Keeping Up with Remote Repositories: Push and Pull

Whenever we add changes and commit them on our own machine, the commits are added to our local repository, regardless of whether this repository was created from scratch or as a clone of a remote repository (for example: GitHub and GitLab repositories). In both cases, there will be a time when our commits need to be published to the remote repository. On the other hand, we also need to know about and use other people’s changes, which are also published in the remote repository.

In the case of a cloned repository, the sole remote repository can be addressed by the name origin. In either case, more remote repositories can be registered with the git remote add command. Also, since we are still working on the default branch, the branch name will most likely be master (for old repositories) or main (for newer repositories), depending on your team’s policy.

To publish our existing commits, we can use the push command:

git push <remote repo name> <branch whose changes are to be pushed>
# For example:
git push origin main
Since we are publishing the changes onto a remote repository, network connection is required
Our commits are now also visible in the remote repository (in this case, in GitHub). Note that no files are seen since we previously reversed the addition of the text file.

Now, let’s say that someone else has also published their changes to the remote repository and we need to use those changes on our local machine. For this example, I will be adding a file directly through GitHub’s interface as a simulation.

A new file has been introduced in the remote repository. Now, we will need to pull the file to our cloned local repository.

To obtain changes from a remote repository, we can use the pull command as follows:

git pull <remote repo name> <branch whose changes are to be pulled>
# For example:
git pull origin main
Upon completion of the pull command, the new file is also introduced in the local repository.
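Push and pull can be rehearsed offline with a local bare repository playing the role of GitHub. Everything named here (remote.git, dev-a, dev-b, the file names) is hypothetical; the second clone simulates the other person instead of GitHub’s web interface.

```shell
set -eu
base="$(mktemp -d)"
cd "$base"

# Stand-in for the hosted repository, with main as its default branch.
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/main

# Developer A clones, commits, and pushes.
git clone -q remote.git dev-a 2>/dev/null
cd dev-a
git symbolic-ref HEAD refs/heads/main
git config user.name "Dev A"
git config user.email "a@example.com"
echo "hello" > text.txt
git add text.txt
git commit -q -m "Add a text file"
git push -q origin main

# Developer B clones, then A publishes one more file.
cd "$base"
git clone -q remote.git dev-b
cd "$base/dev-a"
echo "new" > from_remote.txt
git add from_remote.txt
git commit -q -m "Add another file"
git push -q origin main

# Developer B pulls to receive the new commit.
cd "$base/dev-b"
git pull -q origin main
ls
```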

Branching

In a software development lifecycle, it is often the case that we need a separate line of work with its own history, apart from the repository’s main branch, such as when:

  1. Commits to the main branch are restricted because it contains production code
  2. We need to experiment with something without disturbing or risking damage to the main branch
  3. We need to prevent race conditions when people attempt to push at the same time in a fast-paced environment

In those cases, we can use Git’s branch feature. In a nutshell, a Git branch behaves like a tree’s branch which deviates from the “trunk” (main branch) at some point and has a separate history from there. The separate history solves the last two problems mentioned above.

For example, let’s say that we are at the tip of the staging branch and we need to create a new branch called PBI-10-mengisi_project to work on this PBI on a separate set of history. To do this, we can use the checkout command with an extra -b argument which creates a new branch which deviates from the main branch starting from the current tip.

git checkout -b "<new branch name>"
# For example:
git checkout -b "PBI-10-mengisi_project"
A checkout command followed by the -b argument creates a new branch

To be able to see the current list of branches and which branch we are currently in, we can use the branch command. Here, the current branch is highlighted and marked with an asterisk (*).

git branch
The highlighting process depends on the terminal configuration, but an asterisk is always available to mark the current branch.

Now, let’s say that we commit some new changes to this branch. Do those changes also apply to the main branch (staging)? To see this, we need to change the current position to the main branch. This can be done with the checkout command without the -b argument, specifying the destination branch.

git checkout <destination branch name>
Let’s try to commit a change to our new branch. In this case, a new file called “new_file_in_pbi_10.txt” is created.
When we change position to the main branch, we can no longer find the file since the change is recorded in our new branch, not this branch.
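The branching walkthrough above can be sketched end to end in a scratch repository. The staging branch is created here with git symbolic-ref only so that the example does not depend on the machine’s default branch name; the identity values are placeholders.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/staging   # name the first branch "staging"
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > README.txt
git add README.txt
git commit -q -m "Initial commit"

# Create the new branch from the current tip and switch to it.
git checkout -q -b PBI-10-mengisi_project
echo "work" > new_file_in_pbi_10.txt
git add new_file_in_pbi_10.txt
git commit -q -m "Add file for PBI 10"

git branch                                  # the asterisk marks the current branch
current="$(git symbolic-ref --short HEAD)"

# Back on staging, the new file is not part of the workspace.
git checkout -q staging
ls
```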

Merging Changes

At some point, the changes that we have made in our branch will need to be integrated into the main branch. In the case of a PBI branch, once a PBI task has been completed, we will need to combine it with the components from other PBIs to start integration testing and deployment. Therefore, we need to merge the changes in our new branch into the main branch.

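Merging applies all changes from the source branch (even when they are spread across several commits) to the destination branch. When both branches have new commits of their own, Git records the result as one new merge commit in the destination branch; when the destination branch has not moved since the source branch was created, Git by default simply fast-forwards the destination branch without an extra commit. To do this in the CLI, we will need to move to the destination branch (in this case, it is staging). After that, we can use the following command to merge changes: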

git merge <source branch>
# For example:
git merge PBI-10-mengisi_project
Now that we have merged the changes in our new branch, the new file is now also visible in the main branch.

Now that we have merged the changes into the main branch, we may no longer need the diverging branch. Once the changes are successfully merged, we can use the branch command with the -d argument and specify the branch that we want to delete. If we instead want to throw out the changes entirely before merging, Git refuses a plain -d on an unmerged branch, and the capital -D argument is needed to force the deletion.

git branch -d <branch to be deleted>
Once a branch is deleted, it won’t be shown in the branch list anymore.
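A compact sketch of merging and cleaning up, under the same assumptions as before (scratch repository, placeholder identity, staging as the destination branch). Because staging has no new commits of its own in this example, the merge is a fast-forward.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/staging
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > README.txt
git add README.txt
git commit -q -m "Initial commit"

git checkout -q -b PBI-10-mengisi_project
echo "work" > new_file_in_pbi_10.txt
git add new_file_in_pbi_10.txt
git commit -q -m "Add file for PBI 10"

# Move to the destination branch, then merge the source branch into it.
git checkout -q staging
git merge -q PBI-10-mengisi_project

# The branch is fully merged, so the safe -d form is enough.
git branch -d PBI-10-mengisi_project > /dev/null

branches="$(git for-each-ref --format='%(refname:short)' refs/heads)"
echo "remaining branches: $branches"
```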

Merging in Remote Repository Platforms

In a collaborative repository with a large number of developers, merging locally into the main branch is usually restricted by policy. For example, GitHub and GitLab both support “protected branches” in which direct commits (including merges) are restricted; GitLab protects the default branch out of the box, while on GitHub the protection rules have to be configured. Indeed, merging locally has some problems, such as:

  1. It does not solve the race condition problem since we will still need to push the main branch itself later on.
  2. When a merge conflict occurs, resolving it locally can be more difficult than using the remote repository platform’s tools.

Therefore, it is recommended that any merges, especially those targeting the main branch, be done in the remote repository. To do this, we need to push our new branch to the remote repository, then use the remote repository’s platform to merge. In both GitHub and GitLab, a merge must be requested first so that it can be checked and approved by others before the actual merge happens. In GitHub, this is called a pull request. GitLab, on the other hand, calls this feature a merge request.
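The local half of that workflow is just a push of the new branch. The sketch below again uses a local bare repository as a stand-in for the platform, so the pull request or merge request itself is not shown; the -u argument additionally records the remote branch as the upstream of the local one. All names are hypothetical.

```shell
set -eu
base="$(mktemp -d)"
cd "$base"
git init -q --bare remote.git              # stand-in for GitHub/GitLab

git clone -q remote.git work 2>/dev/null
cd work
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "base" > README.txt
git add README.txt
git commit -q -m "Initial commit"
git push -q origin main

# Do the work on a separate branch and publish that branch only.
git checkout -q -b PBI-10-mengisi_project
echo "work" > new_file_in_pbi_10.txt
git add new_file_in_pbi_10.txt
git commit -q -m "Add file for PBI 10"
git push -q -u origin PBI-10-mengisi_project > /dev/null

upstream="$(git rev-parse --abbrev-ref '@{u}')"
echo "tracking: $upstream"
```

From here, the request to merge PBI-10-mengisi_project would be opened in the platform’s web interface.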

Other Best Practices

In addition to the must-know commands, there are also some practices that will improve the effectiveness of our Git workflow, such as:

  1. Add a .gitignore file to list files that should not be tracked by Git, such as environment files and build artefacts.
  2. Use a descriptive commit message so that commits and the changes they introduce can be identified easily.
  3. Before every push, be sure to pull first to make sure that your local and remote repositories are in sync.
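The first practice can be sketched as follows. The two ignore patterns and the file names are only examples of an environment file and a Python cache folder; a real project would list whatever its own tooling generates.

```shell
set -eu
cd "$(mktemp -d)"
git init -q

# Example patterns: an environment file and Python's cache folder.
printf '%s\n' '.env' '__pycache__/' > .gitignore

echo "SECRET=1" > .env
mkdir __pycache__
echo "x" > __pycache__/mod.pyc
echo "print('hi')" > app.py

# Even "git add ." skips everything matched by .gitignore
git add .

staged="$(git ls-files)"
echo "$staged"
```

Only .gitignore and app.py end up in the staging area, so the environment file cannot be committed by accident.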
