PPL 2021 — Introduction to version control: Git

This article was made as a part of Individual Review for PPL 2021

GitLab, the Git hosting platform used in this course (Source: https://unsplash.com/photos/ZV_64LdGoao)

For most software developers, Git is a required tool, especially when a project has multiple versions or is built by a team. In this article, I will discuss what Git is and how I use it for PPL.

Git is a Distributed Version Control System (DVCS). The main principle of a DVCS is that every developer keeps a full copy of the repository, including its history, on their local machine. Developers commit their work locally, push it to the shared repository, and merge it with the work of other developers.

To put it simply, Git stores multiple versions of your work and your team's work and tracks each version, so you can quickly review what you or your teammates changed. It also lets you compare changes and catch conflicts before the code makes its way to the repository on the server.

DVCS Illustration (Source: https://medium.com/faun/centralized-vs-distributed-version-control-systems-a135091299f0)

Before we start working on our project, we need to initialize a repository. For this article, we will initialize Git inside the “PPL2021” folder. First, we create the folder and change our directory to it, then we run git init:

mkdir PPL2021 # make PPL2021 folder
cd PPL2021 # change our directory to PPL2021
git init # initialize an empty git repository

You should get this output :

Initialized empty Git repository in /home/fredypasaud/Documents/PPL2021/.git/

What exactly happened? We initialised an empty Git repository. The repository data is stored inside the .git folder, which is hidden because you rarely need to touch it by hand. This folder holds the version history of your repository, including every branch, object, and commit that you make.
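If you are curious about what git init actually created, here is a small sketch you can run in a throwaway directory. The temporary location from mktemp and the folder name demo are my own assumptions for illustration, not part of the PPL2021 project.

```shell
set -e
# Work in a throwaway directory so nothing real is touched (hypothetical location).
work=$(mktemp -d)
cd "$work"
git init -q demo        # like `git init`, but creates the folder "demo" for us
cd demo

# .git is hidden from a plain `ls`, so list it explicitly.
ls -A .git

# HEAD records which branch we are on, even before the first commit.
head_ref=$(cat .git/HEAD)
echo "HEAD -> $head_ref"
```

You should see entries such as HEAD, objects, and refs; the exact list can differ between Git versions.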

After we have our initial repository, we can start to write our first file. In this example, we will write our markdown file, and the file name will be README.md.

## Hello World
This is test for week 2 individual review.

Before we continue, I want to cover the Staging Area, or Index, in Git. One advantage of Git over many other version control systems is this intermediate layer between your working directory and the repository. In a system without such a layer, any file you change goes straight into the repository when you commit; if that change conflicts with what is already in the repo, the conflict is hard to review locally, and there is a chance that the files in the repo end up broken.

Git Staging Area ( Source: https://dev.to/sublimegeek/git-staging-area-explained-like-im-five-1anh)

With Git, any file you want to commit is first stored in the staging area, and it only becomes part of the repository's history when you explicitly commit it. To add a file to the staging area, we can use :

git add <filename> # to add one file
git add <filename1> <filename2> ... <filenameN> # if we want to add N files
git add . # to add all files on our working directory

So, when we want to add our README.md file, we use :

git add README.md

Don’t worry if there is no output. It does not mean that you did something wrong. If you want to check whether your file has been added to the staging area, use :

git status

The git status command shows which branch we are on (this will be helpful later), whether there are any commits yet, and whether any staged files are waiting to be committed. When we execute git status, the output should look like this :

On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file:   README.md

From that output, we can learn that we are on branch master right now, we haven’t committed anything yet, and there is one file (README.md) in the staging area.
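The hint in that output mentions git rm --cached for unstaging. As a rough sketch of how staging several files and then unstaging one of them plays out, you could try the following in a throwaway repository. The file notes.txt and the demo folder are hypothetical names I made up for this example.

```shell
set -e
work=$(mktemp -d)                  # throwaway location (assumption)
cd "$work"
git init -q demo
cd demo

echo "## Hello World" > README.md
echo "scratch notes" > notes.txt   # hypothetical second file

git add README.md notes.txt        # stage both files, separated by spaces
git rm -q --cached notes.txt       # unstage notes.txt; the file itself stays on disk

# Short status: "A " marks a staged new file, "??" marks an untracked file.
status=$(git status --short)
echo "$status"
```

The short format is handy once you know the codes, because it fits the whole staging area into a few lines.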

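There is another thing that is unique to the Git staging area. You can stage as many files as you want before committing, and you can even stage the same file more than once: each git add records the file's content at that moment. If you edit the file again afterwards, git status lists it both under the staged changes and under the changes not yet staged. Once versions are committed, you can bring back a particular version of a file with git checkout <commit> -- <filename>.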

When we add a file to the staging area, Git records its full content rather than only the delta from the previous version. This is why, if a conflict or problem appears later, we can quickly go back to a previous version without Git having to recompute the file from a chain of changes.
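To make the snapshot behaviour of the staging area a bit more concrete, here is a minimal sketch. It stages a file, edits it again without re-adding it, and then reads back what the staging area actually holds. The folder name and file contents are assumptions for the demo.

```shell
set -e
work=$(mktemp -d)                 # throwaway location (assumption)
cd "$work"
git init -q demo
cd demo

echo "first draft" > README.md
git add README.md                 # the first draft goes into the staging area

echo "second draft" > README.md   # edit the file again WITHOUT re-adding it

# "AM": Added to the index, then Modified in the working directory.
status=$(git status --short)
echo "$status"

# The index points at a blob object that holds the full staged content.
blob=$(git ls-files --stage README.md | awk '{print $2}')
staged=$(git cat-file -p "$blob")
echo "staged content: $staged"    # still the first draft
```

If you committed now, the commit would contain the first draft; the second draft only gets in after another git add.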

Once all the files we want to save are in the staging area, the next step is committing our work. This is where things start to get interesting. With Git, files in the staging area are not committed to the repo unless you explicitly tell Git to do that. To commit the files from the staging area to the repository, we can use :

git commit -m "<Commit Message>"

Every commit gets a unique identifier (a hash that Git generates automatically) and records the files that were committed. Now we will make our first commit with the commit message “Initial Commit”. We run :

git commit -m "Initial Commit"

And the output should be like this :

[master (root-commit) 28b33e8] Initial Commit
1 file changed, 2 insertions(+)
create mode 100644 README.md

What we can infer from that message is that one file changed and two lines were inserted (see the README.md content above: we wrote two lines in our README file). The 28b33e8 part is the abbreviated identifier of the commit.

The best practice is to keep each commit small and focused. For example, if you fix a bug and then find a typo, commit the typo fix separately so that each commit message stays short and the history is easier to follow. If a problem shows up later, you can easily track down which commit and which file caused it and fix it.
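Here is one way the "bug fix first, typo second" advice could look in practice. This is only a sketch: the files calc.txt and notes.txt, their contents, and the demo identity are all made up for illustration.

```shell
set -e
work=$(mktemp -d)                          # throwaway location (assumption)
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"

printf 'total = 1 + 1\n' > calc.txt        # hypothetical file with a "bug"
printf 'Calcualtor notes\n' > notes.txt    # hypothetical file with a typo
git add .
git commit -q -m "Initial Commit"

# Commit 1: only the bug fix
printf 'total = 1 + 2\n' > calc.txt
git add calc.txt
git commit -q -m "Fix wrong total in calc.txt"

# Commit 2: only the typo fix
printf 'Calculator notes\n' > notes.txt
git add notes.txt
git commit -q -m "Fix typo in notes.txt"

git --no-pager log --oneline               # one short line per commit
commit_count=$(git rev-list --count HEAD)
last_files=$(git diff-tree --no-commit-id --name-only -r HEAD)
echo "files in the last commit: $last_files"
```

Because each commit touches a single file for a single reason, the log reads like a to-do list that has been ticked off.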

After you are sure that all your changes have been committed, we can continue to the next step: pushing the work on our local machine to a remote repository so that the world (or your team) can see it.

We have initialised our repository, written our first file, added it to the staging area, and committed it. But so far, all of our changes are saved only on our local machine. If we work with a team, or we want the world to see our beautiful work, we should put our commits on a Git hosting service such as GitLab or GitHub.

To do that, we can use git remote. A remote is a shared repository through which all of our team members exchange code. Before we add a remote, we must first create a repository on our favourite Git hosting service. For this article, we will use GitHub. After we create the repository on GitHub, we can add it as a remote using:

git remote add <remotename> <gitlink>

For this purpose, I will use the SSH link of the repository, and I want my remote to be named origin (you can name it whatever you want, but you need to remember the name), so my command will look like this :

git remote add origin git@github.com:fredypasaud/ppl2021.git

Now we can push our committed changes to the remote using the git push command :

git push <remotename> <branchname>

Because my remote name is origin and I want to push the master branch, my command will look like this :

git push origin master

Output :

Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Delta compression using up to 8 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 275 bytes | 275.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To github.com:fredypasaud/ppl2021.git
* [new branch] master -> master

Done! Our local work has been added to the GitHub repository, where other people can see it or pull it to make changes.
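If you want to practise the remote and push steps without a GitHub account, a bare repository on your own disk can stand in for the hosting service. The sketch below assumes exactly that: server.git is a hypothetical local stand-in, not a real GitHub URL, and the identity is a placeholder.

```shell
set -e
work=$(mktemp -d)                          # throwaway location (assumption)
cd "$work"

# A bare repository on disk plays the role of the GitHub repository (assumption).
git init -q --bare server.git

git init -q PPL2021
cd PPL2021
git symbolic-ref HEAD refs/heads/master    # make sure the branch is called master
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"

printf '## Hello World\nThis is test for week 2 individual review.\n' > README.md
git add README.md
git commit -q -m "Initial Commit"

git remote add origin "$work/server.git"
git remote -v                              # lists the fetch and push URLs of origin
git push -q origin master

# The remote's master should now point at the same commit as our local master.
local_head=$(git rev-parse master)
remote_head=$(git ls-remote origin refs/heads/master | awk '{print $1}')
echo "local:  $local_head"
echo "remote: $remote_head"
```

With a real GitHub repository, only the URL passed to git remote add changes; the rest of the flow stays the same.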

Real Magic of Git in PPL 2021

In practice, using Git is rarely as simple as that (I wish it were). The magic of Git, and some of the headaches, start to show when a project gets bigger and the repository has many branches (or versions) and stages. Because this article is part of the evaluation for my PPL course, I will use my PPL repository as an example.

Branches

One of the magic features of Git is the branch. Simply put, a branch is a separate line of development inside the same repository, so each branch can hold a different version of your program. Imagine you’re developing an app and your whole team has to work on the same version of the program: accidents will definitely happen. You can overwrite your teammates' code, or they can overwrite yours. Yikes.

Generally speaking, our repository currently has three kinds of branches: development branches (our working area for implementing new features during the current sprint), staging (the branch that combines our different development branches), and production (the final version of our program that users can use).

Our development branch (noted by PBI-xxx) and Staging Branch

If you want to create and switch to a new branch (one that does not exist on the remote yet), you can use the command :

git checkout -b <your-new-branch-name>

If you want to work on a branch that already exists on the remote, you can use the command :

git checkout <name-of-existing-branch>

Usually (in my case) the newest remote branches do not automatically appear on my local machine. I use the remote update command to fetch all the branches on the remote to my local machine. The command is:

git remote update
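The branch commands above can be tried end to end with two local clones and a bare repository acting as the remote. This is a sketch under those assumptions: the names server.git, mine, teammate, and the branch PBI-1-login are hypothetical and only mimic the naming from my PPL repository.

```shell
set -e
work=$(mktemp -d)                                  # throwaway location (assumption)
cd "$work"
git init -q --bare server.git                      # local stand-in for the remote
git --git-dir=server.git symbolic-ref HEAD refs/heads/master

# My repository: create the first commit and push it.
git init -q mine
cd mine
git symbolic-ref HEAD refs/heads/master
git config user.name "Me"; git config user.email "me@example.com"
echo "base" > app.txt
git add app.txt
git commit -q -m "Initial Commit"
git remote add origin "$work/server.git"
git push -q origin master

# Teammate: clone, create a NEW branch, and push it.
cd "$work"
git clone -q server.git teammate
cd teammate
git config user.name "Teammate"; git config user.email "mate@example.com"
git checkout -q -b PBI-1-login                     # hypothetical branch name
echo "login" > login.txt
git add login.txt
git commit -q -m "Add login page"
git push -q origin PBI-1-login

# Me again: the new branch is unknown locally until I update from the remote.
cd "$work/mine"
git remote update >/dev/null 2>&1
git checkout -q PBI-1-login      # creates a local branch tracking origin/PBI-1-login
current=$(git symbolic-ref --short HEAD)
git --no-pager branch -a
```

After the checkout, the teammate's login.txt shows up in my working directory, and git branch -a lists both the local and the remote-tracking branches.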

Pull

Now that you have your repository, if you’re working with a team, there is a very good chance that you will need to pull (or fetch) data from the remote repository into your local repository.

You can tell Git to pull from the remote using:

git pull <remotename> <the-branch-name-you-want-to-pull>

If the pull succeeds (there is no conflict between your local and remote work), you can see the changes immediately in your favourite editor.

Successful git pull command
Conflict detected when pulling

Don’t worry if there is a conflict. You can resolve it using your favourite editor, which we will discuss next.
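A conflict-free pull can be rehearsed locally as well. In this sketch a teammate pushes a change and I pull it; the bare repository server.git, the two working copies, and the file app.txt are assumptions made up for the demo. I added --ff-only so that the pull refuses to do anything other than a plain fast-forward.

```shell
set -e
work=$(mktemp -d)                                  # throwaway location (assumption)
cd "$work"
git init -q --bare server.git                      # local stand-in for the remote
git --git-dir=server.git symbolic-ref HEAD refs/heads/master

git init -q mine
cd mine
git symbolic-ref HEAD refs/heads/master
git config user.name "Me"; git config user.email "me@example.com"
echo "version 1" > app.txt
git add app.txt
git commit -q -m "Initial Commit"
git remote add origin "$work/server.git"
git push -q origin master

# Teammate clones, changes the file, and pushes.
cd "$work"
git clone -q server.git teammate
cd teammate
git config user.name "Teammate"; git config user.email "mate@example.com"
echo "version 2" > app.txt
git commit -q -a -m "Update app.txt"
git push -q origin master

# Back in my copy: pull brings in the teammate's commit.
cd "$work/mine"
git pull -q --ff-only origin master
pulled=$(cat app.txt)
echo "app.txt now contains: $pulled"
```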

Resolving conflict

Most of the time, a conflict happens when the branch you pull has commits that changed the same lines you changed locally. But Git gives you PLENTY of room to resolve the conflict yourself. Usually, I resolve conflicts in my favourite code editor, VS Code, and accept whichever change I want to apply, like in the screenshot below.

Using VS Code, you can choose which change you want to apply to your code.
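The same resolution can be done by hand on the command line, which also shows what the editor is doing for you. The sketch below deliberately creates a conflict between two branches and then resolves it; config.txt, the branch name feature, and the colours are hypothetical values chosen for the demo.

```shell
set -e
work=$(mktemp -d)                          # throwaway location (assumption)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"; git config user.email "demo@example.com"

echo "color = blue" > config.txt           # hypothetical file
git add config.txt
git commit -q -m "Initial Commit"

git checkout -q -b feature
echo "color = green" > config.txt
git commit -q -a -m "Use green"

git checkout -q master
echo "color = red" > config.txt
git commit -q -a -m "Use red"

# Both branches changed the same line, so the merge stops with a conflict.
if git merge feature >/dev/null 2>&1; then
    echo "unexpected: merge succeeded"
    exit 1
fi
cat config.txt                             # shows the <<<<<<<, =======, >>>>>>> markers
markers=$(grep -c '^<<<<<<<' config.txt)

# Resolve by writing the version we want, then stage it and commit.
echo "color = green" > config.txt
git add config.txt
git commit -q -m "Merge feature, keep green"
parents=$(git rev-list --parents -n 1 HEAD | wc -w)   # commit id + 2 parent ids
```

Picking "accept incoming change" in VS Code amounts to the same thing as the echo line here: the markers are replaced by the version you chose.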

Merge and Rebase

Usually, the next step after pulling the latest code is merging your work with your team’s work. There are several ways to achieve this; I will show the method that I use the most.

NOTE : Git will show an error if you try to start a merge while there are still unresolved conflicts in your repository. Make sure that you have resolved all your conflicts first.

After you resolve all your conflicts, you can merge your changes into another branch. Usually, this is done when you want to put your changes into the staging or production branch. First check out the branch that should receive the changes, then tell Git to merge using:

git merge <branch-name>

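When you do this, Git compares the branch you are currently on with the branch you named. If no conflict is detected, Git combines the two automatically, either by fast-forwarding your branch or by creating a merge commit (which Git reports as made by the 'recursive' strategy, or 'ort' in newer versions).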
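As a rough illustration of merging a development branch into staging, the following sketch can be run in a throwaway repository. The branch names mirror the workflow described earlier, but PBI-1-login and the files are hypothetical.

```shell
set -e
work=$(mktemp -d)                          # throwaway location (assumption)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"; git config user.email "demo@example.com"

echo "home" > home.txt
git add home.txt
git commit -q -m "Initial Commit"
git branch staging                         # staging starts at the first commit

git checkout -q -b PBI-1-login             # hypothetical development branch
echo "login" > login.txt
git add login.txt
git commit -q -m "Add login page"

# Switch to the branch that should RECEIVE the changes, then merge.
git checkout -q staging
git merge -q PBI-1-login

staging_tip=$(git rev-parse staging)
feature_tip=$(git rev-parse PBI-1-login)
git --no-pager log --oneline
```

Because staging had no commits of its own, Git simply fast-forwards it, so both branches end up pointing at the same commit. If staging had moved on in the meantime, Git would create a merge commit instead.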

Another useful Git command is git rebase. Use git rebase when you want to move a sequence of commits onto a new base commit. The result looks as if you had created your branch from a different commit. Git achieves this by creating new commits and applying them on top of the specified base.

Rebase Illustration

You can use git rebase by typing :

# Create a feature branch based off of master
git checkout -b feature_branch master
# Edit files, then commit them
git commit -a -m "Adds new feature"
# Replay the feature commits on top of <base>, e.g. master
git rebase <base>
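For a version of the rebase steps that runs from start to finish, here is a sketch in a throwaway repository where master moves forward while the feature branch is being worked on. The file names and the hotfix commit are assumptions made up for the demo.

```shell
set -e
work=$(mktemp -d)                          # throwaway location (assumption)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"; git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial Commit"

git checkout -q -b feature_branch master
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Adds new feature"

# Meanwhile, master moves forward.
git checkout -q master
echo "hotfix" > hotfix.txt
git add hotfix.txt
git commit -q -m "Apply hotfix"

# Replay the feature commit on top of the new master tip.
git checkout -q feature_branch
git rebase -q master

master_tip=$(git rev-parse master)
feature_parent=$(git rev-parse feature_branch^)
git --no-pager log --oneline
```

After the rebase, the parent of the feature commit is the hotfix commit, so the history reads as a straight line, as if the branch had been created after the hotfix.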

And that’s it! That’s all you need to know about Git to make your life easier. Good luck!

Undergraduate student majoring in Computer Science