GIT & GITHUB
Basics of Distributed Version Control
          (some slides courtesy of Pro Git)
    Charles Liu
Overview
0. What is Git? Installation and setup
1. Introduction to version control; basic workflow in Git
2. Branching , merging, and rebasing
3. Working with remotes and Github
What is Git?
¨  A distributed version control system
¨  A few use cases:
      ¤  Keep a history of previous versions
      ¤  Develop simultaneously on different branches
        n  Easily
                try out new features, integrate them into production or
          throw them out
      ¤  Collaborate    with other developers
        n  “Push”   and “pull” code from hosted repositories such as Github
Key improvements
¨    A distributed version control system
      ¤  Everyone can act as the “server”
      ¤  Everyone mirrors the entire repository instead of simply
          checking out the latest version of the code (unlike svn)
¨    Many local operations
      ¤  Cheap to create new branches, merge, etc.
      ¤  Speed increases over non-distributed systems like svn
Installation and setup
¨    http://git-scm.com/download (try this first)
¨  Linux: apt-get install git-core
¨  Mac: http://code.google.com/p/git-osx-installer/
¨  Windows: http://msysgit.github.com/
      ¤    Git bash
¨    Eclipse extensions such as eGit
¨    Github GUI
First time setup
¨  git config --global user.name “Charles Liu”
¨  git config --global user.email “cliu2014@mit.edu”
      ¤  This    email should be registered in your Github (more on
            this later)
¨    Line breaks (\r\n in Windows vs. \n in Mac/Linux)
      ¤  Mac/Linux: git config --global core.autocrlf input
      ¤  Windows: git config --global core.autocrlf true
Use case #1: history of versions
q    Basic workflow
q    Adding and committing files
q    The git log
q    The staging area
q    Removing files
q    Viewing diffs of files
q    The .gitignore file
Big ideas
¨  Snapshots, not deltas
¨  Everything is confined to the .git directory
¨  Most operations are safe – they only add data
      ¤  We’ll   talk about two commands that are not safe today
¨    3 possible states for a file
      ¤  Changed
      ¤  Staged
      ¤  Committed
Basic workflow
¨    git init – create git project in existing directory
      ¤  Make   Git start to “watch” for changes in the directory
¨    The basic workflow:
Basic workflow
¨    Add files to be committed with git add <filename>
      ¤  Puts   the file in the “staging area”
¨  Create a commit (a “snapshot”) of added files with
    git commit, followed by a commit message
¨  Use git status to see the current status of your
    working tree
The git status output
The git log output
The staging area
¨    git add takes the snapshot of the file that will be
      committed à you can change the file after adding it
The staging area
¨    To unstage a file, but retain your changes in the
      working tree:
      ¤  git   reset HEAD <filename>
¨    To discard current changes in the working tree, and
      make it look like the last commit:
      ¤  gitcheckout -- <filename>
      ¤  Be careful! You will lose your changes and not get them
          back!
Removing a file
¨  To remove a file from the working tree and in the next
    commit, simply git rm <filename>
¨  To remove it from the next commit, but keep the file in
    the working tree, do git rm --cached <filename>
Viewing diffs of files
Viewing diffs of files
The .gitignore file
¨  Specifies files that you don’t want Git to track under
    version control
¨  Commonly used for compiled files, binaries, large
    asset files (e.g. images)
¨  Can use wildcards (e.g. *.pyc, *.png, Images/*, etc.)
¨  Be careful – if you add a file to .gitignore after it’s
    already been tracked, potential issues
¨  A list of recommended .gitignore files:
    https://github.com/github/gitignore
Use case #2: branching
q    What is a branch?
q    Branching commands
q    The HEAD pointer
q    Basics of merging
q    Basics of rebasing
q    Aside: the git reset command
What is a branch?
¨  Visualize a project’s development as a “linked list” of
    commits.
¨  When a development track splits, a new branch is
    created.
¨  In Git, branches are actually just a pointer to these
    commits
Branching commands
¨  List all branches in the project – git branch
¨  Create a new branch – git branch <branchname>
¨  Switch to a branch – git checkout <branchname>
¨  Create and immediately switch – git checkout –b
    <branchname>
¨  Delete a branch – git branch –d <branchname>
Stashing
¨  Working tree must be clean when switching branches
¨  Stash changes that you don’t want to commit at that
    time – git stash
      ¤  Puts   a stash onto the stack
¨    Later, apply the most recent stashed changes and
      remove that stash – git stash pop
The HEAD pointer
¨  Recall: all branches simply a pointer to a commit
¨  HEAD: special pointer to the current branch, moves
    around as you switch branches
Merging
Merging
                    HEAD
1. git checkout master
Merging
                          HEAD
1.  git checkout master
2.  git merge iss53
Merging
                            HEAD
1.  git checkout master
2.  git merge iss53
Note: merge creates a new
commit with 2 parents!
Merge commits
Before merge…master and new have diverged
(commit 3 on master vs. commit 1 on new branch)
Merge commits
                 Old tip of master
                Old tip of new branch
Merge conflicts
¨  Sometimes, two branches will edit the same piece of
    code in different ways.
¨  Must resolve the conflict manually, then add the
    conflicting files and explicitly commit.
                      Conflict markers
Merge conflicts
Rebasing
¨  Instead of a merge, which creates a new commit
    originating from both branches, a rebase takes the
    contents of one branch after the “split” and moves
    them to the end of the other branch.
¨  The command git rebase <basebranch> takes your
    currently checked out branch and replays the diffs on
    top of basebranch.
  Rebasing
1. git checkout server
                         HEAD
  Rebasing
1.  git checkout server
2.  git rebase master
                          HEAD
  Rebasing
1.  git checkout server
2.  git rebase master
Why rebase?
¨  Creates a linear history; often cleaner and easier to
    read.
¨  But…DO. NOT. EVER. rebase anything that has
    already been pushed to a repo someone else has
    access to
      ¤  Rebasingremoves commits and writes new ones; but
        someone else might have already based their work off
        your old commits!
An aside…the git reset command
¨    3 versions…and often the source of much confusion!
      ¤  git reset --soft <commit / pointer to commit>
      ¤  git reset --mixed <commit / pointer to commit> (or simply git
          reset)
      ¤  git reset --hard <commit / pointer to commit>
¨    Reset proceeds in 3 steps:
      1.    Move the HEAD pointer
      2.    Update the index/staging area to the new contents of HEAD
      3.    Update the working directory
3 steps to reset
1.    Move the HEAD pointer – soft stops here.
2.    Update the index/staging area to the new contents
      of HEAD – mixed stops here.
3.    Update the working directory – hard stops here
Note: reset --hard overwrites the working directory.
This is another command that can potentially cause
loss of data!
Use case #3: collaboration
q    Creating a repo on Github
q    Remotes
q    Remote-tracking branches
q    Push, fetch, and pull
q    The git clone command
Remotes
¨    A target computer that has Git repos that you can
      access
      ¤  Via   http(s), ssh, or git protocols
¨  git remote add <remotename> <remoteaddress>
¨  git remote –v (view remotes)
¨  git remote rm <remotename>
¨  Often, with one remote, we name it “origin”
Authenticating to Github
¨  Sometimes recommends HTTPS, but often SSH easier
¨  Need to generate a keypair:
    https://help.github.com/articles/generating-ssh-
    keys
Github – SSH keys
¨  cd ~/.ssh; ls
¨  If a file named id_dsa.pub or id_rsa.pub does not
    exist:
      ¤  ssh-keygen -t dsa -C “<your email here>”
      ¤  ssh-add id_dsa
¨    pbcopy < ~/.ssh/id_dsa.pub (on Macs)
    Github – SSH keys                    1. Click “account
                                         settings”
                                           3. Click “Add
                                           SSH key”
2. Click “SSH
Keys”
                4. Paste your key here
Create a repo
                Repo URL
Pushing and fetching
¨    git push <remotename> <branchname> sends your
      code in the branch up to the remote
      ¤  Often
              just git push: depends on settings but often
        equivalent to git push origin master
¨    git fetch <remotename>
Remote tracking branches
¨  When you do git fetch, you don’t immediately see the
    changes. Why?
¨  Changes are fetched to a “remote tracking branch”
      ¤  Branches associated with a remote, but treat them like a
          local branch
      ¤  Can merge with your current master (git checkout master;
          git merge origin/master)
      ¤  Even better…rebase
Remote tracking branches
Remote tracking branches
In summary…
q    Basic workflow in git
       q    Adding, committing, viewing diffs
q    Branches
       q    The HEAD pointer, merging, and rebasing
q    Remotes
       q    Pushing and fetching; quick introduction to Github
Lots of other topics
¨  Tags and version numbers
¨  Interactive rebase: squashing and amending commits
¨  Relative pointers from HEAD (e.g. HEAD^^, HEAD~3)
¨  Submodules
¨  Using your own server as a git server (bare repos)
¨  Git as a filesystem (git grep, git ls-files, etc.)
¨  GUIs to view trees and graphical merge tools
¨  …more!
For more information
¨    The book Pro Git (which I based much of this
      presentation on), available for free!
      ¤  https://github.s3.amazonaws.com/media/progit.en.pdf
      ¤  Covered
                Chapters 1-3 in detail, very simple ideas from
        Chapters 4-6
¨  Git documentation: do git help <commandname>
¨  Google, StackOverflow, etc.
That’s all for today!
¨  We’re done with client-side technologies!
¨  Office hours tonight, 7-9pm in 32-044 (basement of
    Stata)
¨    Tomorrow: 11am in 26-152 (8.01 teal room)
      ¤  Introduction   to server-side technologies, SQL databases,
        UI/UX