Git 101

Welcome to Git 101! This pico-course is going to take you from zero-to-hero with one of the greatest technologies of the last 20 years. We’ll explore a brief history of Version Control, look at all the Git essentials, and finally, you’ll learn a practical Git workflow to immediately improve your quality-of-life. If you feel like you’ve wandered here in error, note that Zoology 319: Searching for Sasquatch has been moved down the hall to room 210.

Why is This Course Needed?

Git is an awesome technology that has become the de-facto standard for version control in the software development industry. It is also a technology that most people never properly learn. Few University Computer Science programs cover it. Most new graduates and junior developers will teach themselves Git or other version control software as needed. You will be hard pressed finding a development job today that doesn’t require you to learn and use Git in some manor, and I’d argue that it’s impossible to find a development job worth keeping if they don’t use some kind of version control (and yes, they do exist).

Git is an essential tool of the modern developer. A tool they will use as often as a good text editor or noise-canceling headphones in an open office. Yet it remains a tool that is often brought up as an afterthought in a developer interview. Git is to developers what a good hammer is to a carpenter. Can you drive a nail into a board without it? Sure. But you sure as hell wouldn’t want to.

What is Version Control?

I’m not going to give you a history less on Version Control. If you want that, just check out the Wikipedia article.

Here’s a TL;DR of that page: Version control is an essential component of software development that tracks changes to source code files over the life of a project. As the name implies, version control software lets you keep different versions of the same source code files and permanently track changes to those files. It allows you to easily deploy an application to different environments (like testing or production) and it allows erroneous updates to be gracefully rolled back if needed. Perhaps most importantly, modern VCS (Version Control Systems) allow multiple people to independently contribute to a project in a (mostly) headache-free way.

Are you tired of your source code directories looking like this

$ ls
index.php                    index-new.php     index-new2.php
index-final.php              index-final2.php  index-final2-edited.php
index-final2.edited.php.bak  index-actually-final!.php

after a long days work?

Tired of using FTP to transfer individual files and constantly making and copying backups by hand?

Look no further! With proper version control, those are all relics of the past.

Git: A Brief history

Git was first released on April 7, 2005, following an incident with the source control management (SCM) software BitKeeper and Linux Kernel development. Linus Torvalds, perhaps best known for developing a little OS (Kernel) named Linux, created Git as a result of BitKeeper removing free use of their product. Failing to find a free product that worked similar to BitKeeper and met the requirements for developing the Linux Kernel, Torvalds decided to create his own SCM software, and Git was born.

The primary design goals for Git was something that:

  • Ran stupidly fast, due to the demanding nature of patching the Linux Kernel.
  • Prevented data corruption.
  • Allowed for distributed workflows.

Within a couple of weeks Torvalds had managed to achieve all of this, and version control as we know it was changed forever. Git was 100% free software, arguably better than everything else on the market, and was compatible with existing SCM workflows.

Git Basics

Before we Begin

Visualizing Git

It’s best to image a Git repository like a tree. It starts growing from the ground up. When a tree is new, it might only consist of a single branch (AKA the trunk). In most git repositories (and by default) this trunk is called master. The oldest bits of the tree will be near the base of the tree, and the newest bits will be at the top. Over time, the tree will grow. It will grow taller. It will grow branches. Those branches will grow branches. When trying to understand how a Git repository is laid out, imagine a picture of a tree where the oldest stuff is at the bottom, and the newest stuff is spread far out on branches or at the very top.

Terminology

I’m going to throw around a lot of terminology in this course, and some of it is not going to be explained in detail. If a term confuses you, just look it up on Google. At some point I might add a proper glossary. Until then, here’s a few important definitions. Also check out the official Git Glossary.

  • repository (repo) - A directory that contains all the files associated with a particular Git project.
  • HEAD - A reference to the currently checked out commit. Basically, the files that you see.

Disclaimer: All of the Git commands we’re going to learn today are for the command-line. By nature, Git works best via a command-line interface. Although there are many awesome GUI tools for working with git, it is essential to learn the basics via the command-line. This guide assumes that you know how to do basic operations from the command line in a Unix-like environment.

There are a couple of basic Git commands you must know in order to effectively use the tool. While the total number of git commands and options far exceeds the basics, knowing these will allow you to do most things you will ever need to.

The big five: clone, pull, push, add, commit. Learn these and you can start working with git. Of the big five, none is more important than the others, because you need all of them in order for Git to work.

The essential two: status, log. While knowing the big five commands above is the bare minimum to get you started with Git, you’ll also want to learn these essential two commands.

Before we dig into the commands, let’s look at a simple workflow for working on a software project. Let’s imagine you are working on a simple application and the source code is hosted on a server managed by your operations team. You need to make a copy of that source code on your local development machine, make a change to a file, and then upload that file to the server.

The Big Five

0) Make sure git is installed on your machine, and make sure you have done the first-time setup by setting your Username and Email:

  • $ git config --global user.name "<Your Name>"
  • $ git config --global user.email <Your Email>

1) Use the clone command to copy source code from a remote server to your machine.

  • Change into the directory you want to work in (eg ~/Projects)
  • Copy the source code with: $ git clone <URL-of-remote-git-repository>
    • :warning: A git repository URL will end in .git.
  • This will copy the entire source code repository onto your local machine in the location you specified above. (eg ~/Projects/repo-name)
    • Not only does this copy the source code files, but it copies every branch and the history of every file to your computer.
    • By default, you will be viewing the master branch. We’ll look at what a branch is more in the next section.
  • Some people will be upset that I started with clone instead of init. Most people that use Git these days work with a remote repository (GitHub, GitLab, etc).
    • If you don’t have a remote repository and want to create a git repo, you can use $ git init instead to create a new git repository in your current working directory.

2) Great, now you’ve got the source code, edited the file you needed to, and you’re ready to send that change back to the remote server. First, you’ll need to commit your changes.

  • Before you can commit your changes, you need to add them. These two commands go together like peanut-butter and jelly.
  • Use the add command to add all files you have created or modified to the staging area.
    • $ git add <list-files-delimited-by-spaces>
    • You can also use a catch-all like git add *
  • Use the commit command to record your change in the Git repository. No change you make is recorded in the Git history until you make a commit.
    • $ git commit
      • :warning: You have to create a message for every commit you make. Just typing git commit will open up your default text editor for you to write a commit message.
      • You can short-hand this by using the -m flag.
        • $ git commit -m "A commit message"
          • This is useful but it is good practice to write proper commit messages. A proper commit message will have a single line of 72 characters or less that summarizes the changes. Below that, you can write a more detailed explanation of what the changes in the commit do. It’s also good practice to write commit messages in the imperative form (eg “Fix Bug” not “Fixed a Bug”).
            • Here’s an example:
Update database references

We've decided to switch from SQLite to MySQL for our web application.
This commit changes all of the variable references from the SQLite DB
to the new MySQL DB.

3) Okay, you’ve got your changes made and committed. Congratulations - only you can see those changes on your local machine. You need to push these changes to the remote server.

  • Use the push command to publish your changes on the remote server.
  • $ git push <Remote-repository-name> <Branch-name>
    • For this example, you can use something like: $ git push origin master
      • The remote repository is named origin by default.
      • We haven’t yet taught you about branching, so you’ll likely still be on the master branch.
  • Your changes are now live on the remote server! :tada:

4) Now, let’s imagine another developer has been making changes and you want to update your local copy of the source code with those changes. You could blow away the entire directory and re-clone it, but that seems like a lot of work and wasted bandwidth. Let’s use the pull command instead.

  • Run: $ git pull
  • This will fetch those changes from the remote server and merge those changes with your local copy. What’s this? fetch and merge? You told me there were only five commands I had to know?! Well, we’re doing a little bit of git magic here. The pull command is actually equivalent to running both git fetch followed immediately by git merge <Fetch-head>
    • Ulimately, you’ll want to learn both git fetch and git merge on their own. For now, just knowing git pull will let you get by.

:tada: Congratulations! With these five commands, you can start working with Git. Let’s go home and have some beer :beer:, right? Wrong! There’s a lot more to Git than just pulls, commits, and pushes.

In particular, there are two commands you will find essential when using git.

The Essential Two

The commands I outlined above will let you start working with git, but they sure don’t provide a lot of information. While the following two commands are not technically required to use Git, they are all but essential for using Git effectively. Trying to do anything without them would be like trying to perfectly mount a frame on a wall without a level.

  • git status - The status command, exactly as you would imagine, tells you the status of your current working tree.
    • It tells you what files have been created or modified.
    • It tells you if new/modified files have already been added to the current working index with git add.
    • It tells you what branch you are working on.
    • It tells you if your current working branch is “ahead” or “behind” the branch you started from.
  • git log - The git log command prints out a log (history) of all commits made in a repository.
    • At a glance, this lets you take a brief but detailed look at the changes to a project over time.
    • By default, git log will print a little basic information about each commit in reverse chronological order (most recent changes are listed first). Each bit of information contains:
      • The commit hash (A SHA-1 checksum, a 40-digit hexadecimal string you can use to uniquely identify a particular commit).
      • The commit author.
      • The commit date.
      • The commit summary.

Beyond the Basics

Each of the commands I outlined above has about a million options. It would be well worth your time to check out the official documentation or to read some more advanced tutorials. I will give a couple of examples of “advanced” commands in this Course, but this is Git 101 after all. Maybe some day I’ll be able to make a Git 201 . :smile:

Branches

Remember how I told you to visualize a Git repository as a tree? What kind of tree doesn’t have branches? Branches are an essential part of Git or any modern version control system and Git’s branching model is particularly awesome.

In practice, you don’t want to be making your changes directly to the master branch of a repository. Let’s look at some commands to help you create your own branch or work with branches other than master.

  • branch
    • Use the branch command to list, create, or delete branches.
    • $ git branch - This command with no flags will list all branches you have on your local machine, and designates your current working branch with an asterisk (*).
    • $ git branch -r The -r flag shows all branches in the remote repository.
    • $ git branch <New-Branch-Name> will create a new branch based off of your current working branch.
  • checkout
    • Use the checkout command to switch between branches.
    • $ git checkout <Branch-name> immediately switches you to the desired branch.
    • $ git checkout -b <New-Branch-Name> will create a new branch and switch to it.
  • merge
    • When working between multiple branches, at some point you will probably copy the changes from one branch into another. You can accomplish this with the merge command.
    • $ git merge <Branch-name> will merge the branch (Branch-name) into your current branch. This is best explained with an example:
      • Let’s imagine that you have been working on a branch named “fix-bug” and you are ready to merge your changes back into the master branch.
      • The “tree” might look like this, where each letter is a commit:
      E---F---G    <-- fix-bug
     /
A---B---C---D      <-- master
  • To merge the fix-bug branch into master, first, checkout the master branch with $ git checkout master and then merge with git merge fix-bug.
  • If all goes well, the changes you made in fix-bug will now appear in master and the “tree” will look like this:
      E---F---G     <-- fix-bug
     /         \
A---B---C---D---H   <-- master
  • It is important to note that the fix-bug branch still exists, and you can still work on it independently from master or any other branches if you want to.

Solving a Merge Conflict

Inevitably you’re going to try to merge two branches together and see the dreaded words: merge conflict. This sounds a lot scarier than it is.

What is a merge conflict? A merge conflict occurs when git unable to automatically merge two files together. This typically occurs when a particular line or a file has been changed in two different branches. Git is unsure of which change you want to keep, so you will have to decide which change stays and which change gets deleted. In reality, merge conflicts are a regular part of using Git.

  • Run $ git status to see which file has the merge conflict.
  • Open that file with whatever text editor you prefer.
  • Inside the file, Git marks where the merge conflict happens. Here’s an example of what you will see.
<<<<<<< HEAD
some text
=======
text other text
>>>>>>> BRANCH-NAME
  • In this example, some text between <<<<<<< HEAD and ======= is the source code in the HEAD or current base branch.
  • some other text between ======= and >>>>>>> BRANCH-NAME is the source code in the BRANCH-NAME branch.
  • Simply delete whatever version you want to get rid of, leaving the version that you want to merge in. Also delete the conflict markers (<<<, ===, >>>) and associated text that Git inserted into the file.
    • commit the file that you fixed, and push it as necessary!

See, that’s not so scary! :ghost:

A Few More Things…

Here’s a couple more useful git tricks to get you started!

  • rebase
    • Use $ git rebase <Branch-to-Rebase-Onto> to move a commit, or set of commits, from one base commit to another.
    • Picture the Git tree we’ve talked about. Imagine if you could cut off a branch, and then seemly glue it back on somewhere else on the trunk. That is rebasing!
    • Visualy, it looks like this:
      E---F---G    <-- fix-bug
     /
A---B---C---D      <-- master
  • Run $ git rebase master fix-bug and it will now look like:
              E---F---G    <-- fix-bug
             /
A---B---C---D      <-- master
  • stash
    • Working on something and suddenly realize you have to switch branches to quickly fix a bug? You can quickly save your changes with $ git stash
      • This will store the current state of the repository, along with your file changes, and revert back to the HEAD commit. You can always re-apply these changes with $ git stash apply.
  • Use git diff to view changes in a commit.
    • Want to quickly see what files were changed from one commit to another? Use git diff!
    • $ git diff <HASH1> <HASH2>
      • HASH1 is the SHA-1 hash of the earlier commit, and HASH2 is the SHA-2 hash of the later commit. It will show you all of the changes in every file in classic diff style.
  • .gitignore
    • Don’t forget your .gitignore file! Use this to tell Git to ignore certain file types. This is useful for configuration files specific to your machine.
    • Create a .gitignore file with whatever text editor you want.
      • Simply list out the file extensions of whatever files you want to ignore.
      • You can also list specific files.
    • For example, a .gitignore file that contains .log will ignore every file in your repository that ends with .log.

Practical Git Workflow

Okay, now you know the basics. I promised you a no-nonsense method to use these tools to bring the best of Git to your life. Here goes! This is a basic Git workflow that will let you work both independently and with multiple people on projects that have source code hosted on a remote repository (e.g. Github).

This is a basic git Workflow. It incorporates most of what you just learned.

  1. Clone the master repository onto your local machine.
    • $ git clone [url]
  2. Create a new branch on your local machine to work on a feature.
    • $ git checkout -b [new_branch_name]
  3. Push the branch to the remote server.
    • $ git push origin [new_branch_name]
  4. Add a remote for your branch.
    • $ git remote add [new_remote_name] [git url]
  5. Make changes to local files.
  6. Stage the changes you want to commit.
    • $ git add [file_name]
  7. Commit the changes.
    • $ git commit
  8. Push the commit to your branch.
    • $ git push origin [new_branch_name/new_remote_name]
  9. Depending on your team organization, you have a couple options for merging these changes into the master branch.
    • If you’re using github, submit a pull request
    • Optionally, merge these into a staging/testing branch, or use a tool like Jenkins to automate the testing and deployment process.
  10. Once your changes have been accepted and you are done with this branch, you can delete it.
  11. Switch back to the master branch.
    • $ git checkout master
  12. Update your local master with the changes from the repository.
    • $ git pull
  13. Delete the branch you created.
    • $ git branch -d [new_branch_name]
  14. Delete the branch on the remote github repository.
    • $ git push origin :[new_branch_name]

This has gotten quite a bit longer than I originally intended. Looks like I’ll have to make a Git 201 pico-course some time soon to cover even more fun stuff you can do with Git.

That’s all for now! I hope you’ve learned a lot and you feel ready to start using Git with your own projects! Feel free to drop me a message on Twitter if you’ve got any questions or feedback. Thanks!

P.S. This is part of a week-long miniseries on Git. Make sure to check out the other articles!

Originally Posted:

Like what you're reading? Subscribe for email updates! No more than once a week - and probably much less often than that! No spam - I promise!