GIT: A BRIEF INTRODUCTION
Randal L. Schwartz, merlyn@stonehenge.com
Version 5.1.7 on 20 July 2017
This document is copyright 2006-2017 by Randal L. Schwartz, Stonehenge Consulting Services, Inc.
This work is licensed under Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License
http://creativecommons.org/licenses/by-nc-sa/3.0/
ABOUT ME
➤ Been tracking git since it was created (April 2005)
➤ Took Linus to lunch when he moved to Portland: “What are you working on?”
➤ Used git on everything from individual tasks to large teams
➤ Formerly participated in the mailing list
➤ But it’s too noisy now
➤ Provided some patches to git, and suggestions for user interface changes
➤ Still build git every day from master
➤ Gave the first Google TechTalk (October 2007) that talked about what Git actually is
➤ Linus spoke six months earlier about what Git isn’t
WHAT IS GIT?
➤ Git manages changes to a tree of files over time
➤ Git is optimized for:
➤ Distributed development
➤ Large file counts
➤ Complex merges
➤ Making trial branches
➤ Being very fast
➤ Being robust
BUT NOT FOR...
➤ Tracking file permissions and ownership
➤ Tracking individual files with separate history
➤ Making things painful
WHY GIT?
➤ Essential to Linux kernel development
➤ Created in 2005 as a replacement when BitKeeper suddenly became “unavailable”
➤ Now used by millions of projects
➤ Everyone can:
➤ Clone the tree
➤ Make and test local changes
➤ Submit the changes as patches via mail
➤ OR submit them as a published repository
➤ Track the upstream to revise if needed
HOW DOES GIT DO IT?
➤ Universal public identifiers
➤ Computed from the constituent bits, so they can be verified
➤ Multi-protocol transport: HTTP, SSH, GIT
➤ Efficient object storage
➤ Everyone has entire repo (disk is cheap)
➤ Easy branching and merging
➤ Common ancestors are computable
➤ Patches (and repo updates) can be transported or mailed
➤ Binary “patches” are supported
THE SHA1 IS KING
➤ Every object has a SHA1 to uniquely identify it
➤ Objects consist of:
➤ Blobs (the contents of a file)
➤ Trees (directories of blobs or other trees)
➤ Commits:
➤ A tree
➤ Plus zero or more parent commits
➤ Plus a message about why
➤ And tags
TAGS
➤ An object (usually a commit)
➤ Plus an optional subject (if anything else is given)
➤ Plus an optional payload to sign it off
➤ Plus an optional GPG signature
➤ Designed to be immobile
➤ Changes not tracked after cloning
➤ Use a branch if you want to move around
OBJECTS LIVE IN THE REPO
➤ Git efficiently creates new objects
➤ Objects are generally added, not destroyed
➤ Unreferenced objects will garbage collect
➤ Objects start “loose”, but can be “packed”
➤ Packs represent objects as deltas
➤ Packs are also created for push and pull
COMMITS RULE THE REPO
➤ One or more commits form the head of object chains
➤ Typically one head called “master”
➤ Nothing special about the name “master” though
➤ Others can be made at will (“branches”)
➤ Usually one commit in the repo that has no parent commit (“root” commit)
REACHING OUT
➤ From a commit, reaching the components:
➤ Chase down the tree object to get to directories and files as they existed at this
commit time
➤ Chase down the parent objects to get to earlier commits and their respective trees
➤ Do this recursively, and you have all of history
➤ And the SHA1 depends on all of that!
THE GIT REPO
➤ A “working tree” has a “.git” dir at the top level
➤ Unlike CVS, SVN: no deeper pollution
➤ This makes it friendly to recursive greps
➤ Even though “git grep” is also provided
THE .GIT DIR CONTAINS:
➤ config – Configuration file (.ini style)
➤ objects/* – The object repository
➤ refs/heads/* – branches (like “master”)
➤ refs/tags/* - tags
➤ logs/* - logs
➤ refs/remotes/* - tracking others
➤ index – the “index cache” (described shortly)
➤ HEAD – points to one of the branches (the “current branch”, where commits go)
THE INDEX (OR “CACHE”)
➤ A directory of blob objects
➤ Represents the “next commit”
➤ “Add files” to put current contents in
➤ “Commit” takes the current index and makes it a real commit object
➤ Diff between HEAD and index:
➤ changed things not yet committed
➤ Diff between index and working dir:
➤ changed things not yet added
➤ untracked things
WHAT’S IN A NAME?
➤ Git doesn’t record explicit renaming
➤ Nor expect you to declare it
➤ Exact renaming determined by SHA1
➤ Copy-paste-edits detected by similarity
➤ Computer better than you at that
➤ Explicit tracking will be wrong sometimes
➤ Being wrong breaks merges
GIT SPEAKS AND LISTENS
➤ Many protocols to transfer between repos
➤ http, https, git, ssh, local files
➤ In the core, git also has:
➤ import/export with CVS, SVN
➤ I use SVN import to have entire history of an SVN project at 30K feet
➤ Third party solutions handle others
➤ Git core also includes cvs-server
➤ A git repository can act like a CVS repository for legacy clients or humans
GETTING GIT
➤ Get the latest: https://git-scm.com/downloads
➤ Linux packages also exist
➤ Track the git-developer archive:
➤ git clone https://github.com/git/git
➤ Maintenance releases are very stable
➤ I install mine “prefix=/opt/git”
➤ add /opt/git/bin to PATH
GIT COMMANDS
➤ All git commands start with “git”
➤ “git MUMBLE-FOO bar” can also be written as “git-MUMBLE-FOO bar”
➤ This allows a single entry “git” to be added to the /usr/local/bin path
➤ This works for internal calls as well
➤ Manpages are still under “git-MUMBLE-FOO”
➤ Unless you use “git help MUMBLE-FOO”
➤ Or “git MUMBLE-FOO --help”
PORCELAIN AND PLUMBING
➤ Low-level git operations are called “plumbing”
➤ Higher level actions are called “porcelain”
➤ The git distro includes both
➤ Use porcelain from command line
➤ But don’t script with it
➤ Future releases might change things
➤ Past releases have definitely changed things
➤ Use plumbing for scripts
➤ Intended to be upward compatible (or at least versioned)
➤ Strangely, the --porcelain flag often enables plumbing mode (WTF?)
CREATING A REPO
➤ git init
➤ Creates a .git in the current dir
➤ Optional: edit .gitignore
➤ “git add .” to add all files (except .git!)
➤ Then “git commit” for the initial commit
➤ Creates current branch named “master”
➤ The name “master” is otherwise unspecial
➤ Could also do this on a tarball, once unpacked
CLONING
➤ Creates a git repo from an existing repo
➤ Generally creates a subdirectory
➤ Your workfiles and .git are in there
➤ Clone repo identified as “origin”
➤ But the name is otherwise unspecial
➤ Remote branches are “tracked”
➤ On fetch operations, the remote branch heads are copied to origin/$branchname
➤ Remote “HEAD” branch checked out as your initial “master” branch as well
COMMITTING
➤ Your work product is more commits
➤ These are always on a “branch”
➤ A branch is just a named commit
➤ When you commit, the former branch head becomes the parent
➤ The branch head moves to be the new commit
➤ Thus, you’re creating a directed acyclic graph
➤ ... rooted in branch heads
➤ A merge is just a commit with multiple parents
TYPICAL WORK FLOW
➤ Edit edit edit
➤ git add files/you/have changed/now
➤ This adds the files to the index
➤ “git add .” for adding all interesting files (except ignored files)
➤ “git add -p” selects patch by patch (my favorite)
➤ git status
➤ Tells you differences between HEAD, index, and working directory
MAKING THE COMMIT
➤ “git commit” pops you into a text editor for the commit message
➤ Good commit message rules:
➤ Separate subject from body with a blank line
➤ Limit the subject line to 50 characters
➤ Capitalize the subject line
➤ Do not end the subject line with a period
➤ Use the imperative mood in the subject line
➤ Wrap the body at 72 characters
➤ Use the body to explain what and why vs. how
➤ Current branch is moved forward
BUT WHICH BRANCH?
➤ Git encourages branching
➤ A branch is just 41 text bytes!
➤ Typical work flow:
➤ Think of something to do
➤ git checkout -b topic-name master
➤ work work work, commit to topic-name
➤ When your thing is done:
➤ git checkout master
➤ git merge topic-name
➤ git branch -d topic-name
WORKING IN PARALLEL
➤ You can have multiple topics active:
➤ git checkout -b topic1 master
➤ work work; commit; work work; commit
➤ git checkout -b topic2 master
➤ work work work; commit
➤ git checkout topic1; work work; commit
➤ Decide how to bring them together
➤ Merge: parallel histories
➤ Rebase: serial histories
➤ Each has pros and cons
THE MERGE
➤ git checkout master
➤ git merge topic1; git branch -d topic1
➤ This should be trivial (“fast forward”) merge
➤ git merge topic2
➤ Conflicts may arise:
➤ overlapping changes in text edits
➤ files renamed two different ways
➤ You need to resolve, and continue:
➤ git commit -a -m “describe the merge fix here”
THE REBASE
➤ Rewrites commits
➤ Breaks SHA1s: commits are lost!
➤ Don’t rebase if you’ve published commits!
➤ git checkout topic2; git rebase master
➤ topic2’s commits rewritten on top of master
➤ May result in merge conflicts:
➤ git rebase --continue or --abort or --skip
➤ git rebase -i (interactive) is helpful
➤ When rebased, merge is a fast forward:
➤ git checkout master; git merge topic2
READ THE HISTORY
➤ git log
➤ print the changes
➤ git log -p
➤ print the changes, including a diff between revisions
➤ git log --stat
➤ Summarize the changes with a diffstat
➤ git log -- file1 file2 dir3
➤ Show changes only for listed files or subdirs
WHAT’S THE DIFFERENCE?
➤ git diff
➤ Diff between index and working tree
➤ These are things you should “git add”
➤ “git commit -a” will also make this list empty
➤ git diff HEAD
➤ Difference between HEAD and working tree
➤ “git commit -a” will make this empty
➤ git diff --cached
➤ between HEAD and index
➤ “git commit” (without -a) makes this empty
OTHER DIFFS
➤ git diff OTHERBRANCH
➤ Other branch and working tree
➤ git diff BRANCH1 BRANCH2
➤ Difference between two branch heads
➤ git diff BRANCH1...BRANCH2
➤ changes only on branch2 relative to common ancestor
➤ git diff --stat (other options)
➤ Nice summary of changes
➤ git diff --dirstat (other options)
➤ Summarize directory changes
BARKING UP THE TREE
➤ Most commands take “tree-ish” args
➤ SHA1 picks something absolutely
➤ Can be abbreviated if not ambiguous
➤ HEAD, some-branch-name, some-tag-name, some-origin-name
➤ Optionally followed by @{historical}
➤ “historical” can be:
➤ yesterday, 2011-11-22, etc (date ref)
➤ 1, 2, 3, etc (prior version of this ref)
➤ “upstream”, “u” (upstream version of local)
MEET THE PARENTS
➤ Any of those on the prior slide, followed by:
➤ ^n - “the n-th parent of an item” (default 1)
➤ ~n - n ^1’s (so ~3 is ^1^1^1)
➤ :path - pick the object from the tree
➤ ^{/some text here} - first parent with a commit message matching the text
RANGES
➤ $tree1 - all commits reachable from $tree1 back to the root
➤ ^$tree1 $tree2 - all commits from $tree2 MINUS all commits from $tree1
➤ Common enough that it can be written $tree1..$tree2 (“from $tree1 to $tree2”)
➤ $tree1…$tree2 is $m..$tree1 with $m..$tree2, where $m is the merge base
➤ Omitting either operand in .. or … defaults to HEAD
➤ $tree1.. - all commits on current branch since $tree1
➤ ..$tree2 - all commits introduced on $tree2 since this branch forked
RANGE EXAMPLES
➤ git diff HEAD^ HEAD
➤ most recent change on current branch
➤ Also: git diff HEAD~ HEAD
➤ git diff HEAD~3 HEAD
➤ What damage did last three edits do?
SEEING THE CHANGES
➤ gitk mytopic origin
➤ Tk widget display of history
➤ Shows changes back to common ancestor
➤ gitk --all
➤ show everything
➤ To skip the tags: gitk HEAD --branches --remotes
➤ gitk from..to
➤ Just the changes in “to” that aren’t in “from”
➤ git show-branch from..to
➤ Same thing for the Tk-challenged
PLAYING WELL WITH OTHERS
➤ git clone creates “tracking” branches
➤ Typically named “origin/master” etc
➤ To share your work, first get up to date:
➤ git fetch origin
➤ Now rebase your changes on upstream:
➤ git rebase origin/master
➤ Or fetch/rebase in one step
➤ git pull --rebase
➤ To push upstream:
➤ git push
RESETTING
➤ git reset --soft
➤ Makes all files “updated but not checked in”
➤ git reset --hard # DANGER
➤ Forces working dir to look like last commit
➤ git reset --hard HEAD~3
➤ Tosses most recent 3 commits
➤ use “git revert” instead if you’ve published
➤ git checkout HEAD some/lost/file
➤ Recover the version of some/lost/file from the last commit
IGNORING THINGS
➤ Every directory can contain a .gitignore
➤ lines starting with “!” mean “not”
➤ lines without “/” are checked against basename
➤ otherwise, shell glob via fnmatch(3)
➤ Leading / means “the current directory”
➤ Checked into the repository and tracked
➤ Every repository can contain a .git/info/exclude
➤ Both of these work together
➤ But .git/info/exclude won’t be cloned
CONFIGURATION
➤ Many commands have configurations
➤ git config name value
➤ set name to value
➤ name can contain periods for sub-items
➤ git config name
➤ get current value
➤ git config --global name [value]
➤ Same, but with ~/.gitconfig
➤ This applies to all git repos for a user
THE STASH
➤ Creates temporary commits to represent:
➤ current index (git add ...)
➤ current working directory (git add .)
➤ Can rebase those onto new index later
➤ Many uses, such as pull into dirty workdir:
➤ git stash; git pull ...; git stash pop
➤ Might result in conflicts, of course
➤ Multiple stashes can be in play
➤ “git stash list” to show them
OTHER USEFUL PORCELAIN
➤ git archive: export a tree as a tar/zip
➤ git bisect: find the offensive commit
➤ git cherry-pick: selective merging
➤ git mv: rename a file/dir with the right index manipulations
➤ git rm: ditto for delete
➤ git push: write to an upstream
➤ git revert: add a commit that undoes a previous commit
➤ git blame: who wrote this?
COMMIT ADVICE
➤ Split changes into small logical steps
➤ Ideally ones that pass the test suite again
➤ This helps for “blame” and “bisect”.
➤ Easier to squash commits later than to break up
➤ “git rebase -i” can squash, omit, reorder
PICKING FROM BRANCHES
➤ Two main tools: “merge” and “cherry-pick”
➤ Merge brings in all commits
➤ Scales well for large workflows
➤ Cherry-pick brings in one or more
➤ Great when a single patch is needed
GIT.GIT’S WORKFLOW
➤ Four branches:
➤ maint: fixes to existing releases
➤ master: next release
➤ next: testing for next master
➤ pu: experimental features
➤ Each one is a descendent of the one above
➤ Commit to the oldest branch needing patch
➤ Then merge it upward:
➤ maint to master to next to pu
TOPIC BRANCHES
➤ Most features require several iterations
➤ Commit these to topic branches during design
➤ Easier to rehack or abandon this way
➤ Fork topic from the oldest main branch
➤ Refresh-merge from that branch if needed
➤ But don’t do that routinely
➤ Rebase topic branch if forked from wrong branch
➤ More details at “man 7 gitworkflows”
TESTING INTEGRATION
➤ Merge from base branch to topic branch
➤ ... on a new throw-away branch
➤ This branch is never merged back in
➤ Just for testing
➤ Can be published publicly, if you make it clear that it’s just for testing
➤ Otherwise, typically used only locally
➤ If integration fails, fix, and cherry-pick those back to the topic branch before final
merge
TIME TO “GIT” DIRTY
➤ Make a git repository:
➤ mkdir git-tutorial
➤ cd git-tutorial
➤ git init
➤ git config user.name “Randal Schwartz”
➤ git config user.email merlyn@stonehenge.com
➤ Add some content:
➤ echo "Hello World" >hello
➤ echo "Silly example" >example
WHAT’S UP?
➤ git status
➤ git add example hello
➤ git status
➤ git diff --cached
“GIT ADD” TIMING
➤ Change the content of “hello”
➤ echo "It's a new day for git" >>hello
➤ git status
➤ git diff
➤ Now commit the index (with old hello)
➤ git commit -m initial
➤ git status
➤ git diff
➤ git diff HEAD
GIT COMMIT -A
➤ Note that we committed the version of “hello” at the time we added it!
➤ Fix this by adding -a nearly always:
➤ git commit -a -m update
➤ git status
WHAT HAPPENED?
➤ Ask for logs:
➤ git log
➤ git log -p
➤ git log --stat --summary
➤ Tag, you’re it:
➤ git tag my-first-tag
➤ Now we can always get back to that version later
SHARING THE WORK
➤ Create the clone:
➤ cd ..
➤ git clone git-tutorial my-git
➤ cd my-git
➤ The git clone will often have some sort of transport path, like git: or rsync: or http:
➤ See what we’ve got:
➤ git log -p
➤ Note that we have the entire history
➤ And that the SHA1s are identical
BRANCHING OUT
➤ Create branch “my-branch”
➤ git checkout -b my-branch
➤ git status
➤ Make some changes:
➤ echo "Work, work, work" >>hello
➤ git commit -a -m 'Some work.'
CONFLICTS
➤ Switch back, and make other changes:
➤ git checkout master
➤ echo "Play, play, play" >>hello
➤ echo "Lots of fun" >>example
➤ git commit -a -m 'Some fun.'
➤ We now have conflicting commits
SEEING THE DAMAGE
➤ In an X11 display:
➤ gitk --all
➤ The --all means “all heads, branches, tags”
➤ For the X11 challenged:
➤ git show-branch --all
➤ git log --pretty=oneline —abbrev-commit --graph --decorate --all
➤ Handy for a mail message
MERGING
➤ We’re on “master”, and we want to merge in the changes from my-branch
➤ Select the merge:
➤ git merge my-branch
➤ This fails, because we have a conflict in “hello”
➤ See this with:
➤ git status
➤ Edit “hello”, and commit:
➤ git commit -a -m “Merge work in my-branch”
DID IT WORK?
➤ Verify the merge with:
➤ gitk --all
➤ git show-branch --all
➤ See changes back to the common ancestor:
➤ gitk master my-branch
➤ git show-branch master my-branch
➤ Note that master is only one edit from my-branch now (the merge patch-up)
➤ “git show” handy with merges:
➤ git show HEAD
MERGING THE UPSTREAM
➤ Master is now updated with my-branch changes
➤ But my-branch is now lagging
➤ We can merge back the other way:
➤ git checkout my-branch
➤ git merge master
➤ This will succeed as a “fast forward”
➤ This means that the merge-from branch already has all of our change history
➤ So it’s just adding linear history to the end
UPSTREAM CHANGES
➤ Let’s change origin a bit
➤ cd ../git-tutorial
➤ echo "some upstream change" >>other
➤ git add other
➤ git commit -a -m "upstream change"
➤ And now fetch it downstream
➤ cd ../my-git
➤ git fetch
➤ gitk --all
➤ git diff master..origin/master
MERGE IT IN
➤ Explicit merging
➤ git checkout master
➤ git merge origin/master
➤ Implicit fetch/merge
➤ git pull
➤ Eliminating the bushy tree
➤ git pull --rebase
➤ (Fails in our example.. sigh.)
SPLITTING UP A PATCH
➤ Sometimes, your changes are logically separate
➤ echo “this change” >>hello
➤ echo “unrelated change” >>example
➤ Now make two commits:
➤ git add -p # interactively select hello change
➤ git commit -m “fixed hello” # not -a!
➤ git commit -a -m “fixed example”
FIXING A COMMIT
➤ Oops, left out something on that last one
➤ echo "another unrelated" >>example
➤ Now “amend” the patch:
➤ git commit -a --amend
➤ This replaces the commit
➤ Be careful that you haven’t pushed it!
FOR FURTHER INFO
➤ See “Git (software)” in Wikipedia
➤ And the git homepage http://git-scm.com/
➤ Git wiki at https://git.wiki.kernel.org/
➤ Wonderful (but dated) Pro Git book: https://git-scm.com/book/en/v2
➤ Get on the mailing list
➤ Helpful people there
➤ You can submit bugs, patches, ideas
➤ And the #git IRC channel (on Freenode)
➤ Now “git” to it!

Git: a brief introduction

  • 1.
    GIT: A BRIEFINTRODUCTION Randal L. Schwartz, merlyn@stonehenge.com Version 5.1.7 on 20 July 2017 This document is copyright 2006-2017 by Randal L. Schwartz, Stonehenge Consulting Services, Inc. This work is licensed under Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License http://creativecommons.org/licenses/by-nc-sa/3.0/
  • 2.
    ABOUT ME ➤ Beentracking git since it was created (April 2005) ➤ Took Linus to lunch when he moved to Portland: “What are you working on?” ➤ Used git on everything from individual tasks to large teams ➤ Formerly participated in the mailing list ➤ But it’s too noisy now ➤ Provided some patches to git, and suggestions for user interface changes ➤ Still build git every day from master ➤ Gave the first Google TechTalk (October 2007) that talked about what Git actually is ➤ Linus spoke six months earlier about what Git isn’t
  • 3.
    WHAT IS GIT? ➤Git manages changes to a tree of files over time ➤ Git is optimized for: ➤ Distributed development ➤ Large file counts ➤ Complex merges ➤ Making trial branches ➤ Being very fast ➤ Being robust
  • 4.
    BUT NOT FOR... ➤Tracking file permissions and ownership ➤ Tracking individual files with separate history ➤ Making things painful
  • 5.
    WHY GIT? ➤ Essentialto Linux kernel development ➤ Created in 2005 as a replacement when BitKeeper suddenly became “unavailable” ➤ Now used by millions of projects ➤ Everyone can: ➤ Clone the tree ➤ Make and test local changes ➤ Submit the changes as patches via mail ➤ OR submit them as a published repository ➤ Track the upstream to revise if needed
  • 6.
    HOW DOES GITDO IT? ➤ Universal public identifiers ➤ Computed from the constituent bits, so they can be verified ➤ Multi-protocol transport: HTTP, SSH, GIT ➤ Efficient object storage ➤ Everyone has entire repo (disk is cheap) ➤ Easy branching and merging ➤ Common ancestors are computable ➤ Patches (and repo updates) can be transported or mailed ➤ Binary “patches” are supported
  • 7.
    THE SHA1 ISKING ➤ Every object has a SHA1 to uniquely identify it ➤ Objects consist of: ➤ Blobs (the contents of a file) ➤ Trees (directories of blobs or other trees) ➤ Commits: ➤ A tree ➤ Plus zero or more parent commits ➤ Plus a message about why ➤ And tags
  • 8.
    TAGS ➤ An object(usually a commit) ➤ Plus an optional subject (if anything else is given) ➤ Plus an optional payload to sign it off ➤ Plus an optional GPG signature ➤ Designed to be immobile ➤ Changes not tracked after cloning ➤ Use a branch if you want to move around
  • 9.
    OBJECTS LIVE INTHE REPO ➤ Git efficiently creates new objects ➤ Objects are generally added, not destroyed ➤ Unreferenced objects will garbage collect ➤ Objects start “loose”, but can be “packed” ➤ Packs represent objects as deltas ➤ Packs are also created for push and pull
  • 10.
    COMMITS RULE THEREPO ➤ One or more commits form the head of object chains ➤ Typically one head called “master” ➤ Nothing special about the name “master” though ➤ Others can be made at will (“branches”) ➤ Usually one commit in the repo that has no parent commit (“root” commit)
  • 11.
    REACHING OUT ➤ Froma commit, reaching the components: ➤ Chase down the tree object to get to directories and files as they existed at this commit time ➤ Chase down the parent objects to get to earlier commits and their respective trees ➤ Do this recursively, and you have all of history ➤ And the SHA1 depends on all of that!
  • 12.
    THE GIT REPO ➤A “working tree” has a “.git” dir at the top level ➤ Unlike CVS, SVN: no deeper pollution ➤ This makes it friendly to recursive greps ➤ Even though “git grep” is also provided
  • 13.
    THE .GIT DIRCONTAINS: ➤ config – Configuration file (.ini style) ➤ objects/* – The object repository ➤ refs/heads/* – branches (like “master”) ➤ refs/tags/* - tags ➤ logs/* - logs ➤ refs/remotes/* - tracking others ➤ index – the “index cache” (described shortly) ➤ HEAD – points to one of the branches (the “current branch”, where commits go)
  • 14.
    THE INDEX (OR“CACHE”) ➤ A directory of blob objects ➤ Represents the “next commit” ➤ “Add files” to put current contents in ➤ “Commit” takes the current index and makes it a real commit object ➤ Diff between HEAD and index: ➤ changed things not yet committed ➤ Diff between index and working dir: ➤ changed things not yet added ➤ untracked things
  • 15.
    WHAT’S IN ANAME? ➤ Git doesn’t record explicit renaming ➤ Nor expect you to declare it ➤ Exact renaming determined by SHA1 ➤ Copy-paste-edits detected by similarity ➤ Computer better than you at that ➤ Explicit tracking will be wrong sometimes ➤ Being wrong breaks merges
  • 16.
    GIT SPEAKS ANDLISTENS ➤ Many protocols to transfer between repos ➤ http, https, git, ssh, local files ➤ In the core, git also has: ➤ import/export with CVS, SVN ➤ I use SVN import to have entire history of an SVN project at 30K feet ➤ Third party solutions handle others ➤ Git core also includes cvs-server ➤ A git repository can act like a CVS repository for legacy clients or humans
  • 17.
    GETTING GIT ➤ Getthe latest: https://git-scm.com/downloads ➤ Linux packages also exist ➤ Track the git-developer archive: ➤ git clone https://github.com/git/git ➤ Maintenance releases are very stable ➤ I install mine “prefix=/opt/git” ➤ add /opt/git/bin to PATH
  • 18.
    GIT COMMANDS ➤ Allgit commands start with “git” ➤ “git MUMBLE-FOO bar” can also be written as “git-MUMBLE-FOO bar” ➤ This allows a single entry “git” to be added to the /usr/local/bin path ➤ This works for internal calls as well ➤ Manpages are still under “git-MUMBLE-FOO” ➤ Unless you use “git help MUMBLE-FOO” ➤ Or “git MUMBLE-FOO --help”
  • 19.
    PORCELAIN AND PLUMBING ➤Low-level git operations are called “plumbing” ➤ Higher level actions are called “porcelain” ➤ The git distro includes both ➤ Use porcelain from command line ➤ But don’t script with it ➤ Future releases might change things ➤ Past releases have definitely changed things ➤ Use plumbing for scripts ➤ Intended to be upward compatible (or at least versioned) ➤ Strangely, the --porcelain flag often enables plumbing mode (WTF?)
  • 20.
    CREATING A REPO ➤git init ➤ Creates a .git in the current dir ➤ Optional: edit .gitignore ➤ “git add .” to add all files (except .git!) ➤ Then “git commit” for the initial commit ➤ Creates current branch named “master” ➤ The name “master” is otherwise unspecial ➤ Could also do this on a tarball, once unpacked
  • 21.
    CLONING ➤ Creates agit repo from an existing repo ➤ Generally creates a subdirectory ➤ Your workfiles and .git are in there ➤ Clone repo identified as “origin” ➤ But the name is otherwise unspecial ➤ Remote branches are “tracked” ➤ On fetch operations, the remote branch heads are copied to origin/$branchname ➤ Remote “HEAD” branch checked out as your initial “master” branch as well
  • 22.
    COMMITTING ➤ Your workproduct is more commits ➤ These are always on a “branch” ➤ A branch is just a named commit ➤ When you commit, the former branch head becomes the parent ➤ The branch head moves to be the new commit ➤ Thus, you’re creating a directed acyclic graph ➤ ... rooted in branch heads ➤ A merge is just a commit with multiple parents
  • 23.
    TYPICAL WORK FLOW ➤Edit edit edit ➤ git add files/you/have changed/now ➤ This adds the files to the index ➤ “git add .” for adding all interesting files (except ignored files) ➤ “git add -p” selects patch by patch (my favorite) ➤ git status ➤ Tells you differences between HEAD, index, and working directory
  • 24.
    MAKING THE COMMIT ➤“git commit” pops you into a text editor for the commit message ➤ Good commit message rules: ➤ Separate subject from body with a blank line ➤ Limit the subject line to 50 characters ➤ Capitalize the subject line ➤ Do not end the subject line with a period ➤ Use the imperative mood in the subject line ➤ Wrap the body at 72 characters ➤ Use the body to explain what and why vs. how ➤ Current branch is moved forward
  • 25.
    BUT WHICH BRANCH? ➤Git encourages branching ➤ A branch is just 41 text bytes! ➤ Typical work flow: ➤ Think of something to do ➤ git checkout -b topic-name master ➤ work work work, commit to topic-name ➤ When your thing is done: ➤ git checkout master ➤ git merge topic-name ➤ git branch -d topic-name
  • 26.
    WORKING IN PARALLEL ➤You can have multiple topics active: ➤ git checkout -b topic1 master ➤ work work; commit; work work; commit ➤ git checkout -b topic2 master ➤ work work work; commit ➤ git checkout topic1; work work; commit ➤ Decide how to bring them together ➤ Merge: parallel histories ➤ Rebase: serial histories ➤ Each has pros and cons
  • 27.
    THE MERGE ➤ gitcheckout master ➤ git merge topic1; git branch -d topic1 ➤ This should be trivial (“fast forward”) merge ➤ git merge topic2 ➤ Conflicts may arise: ➤ overlapping changes in text edits ➤ files renamed two different ways ➤ You need to resolve, and continue: ➤ git commit -a -m “describe the merge fix here”
  • 28.
    THE REBASE ➤ Rewritescommits ➤ Breaks SHA1s: commits are lost! ➤ Don’t rebase if you’ve published commits! ➤ git checkout topic2; git rebase master ➤ topic2’s commits rewritten on top of master ➤ May result in merge conflicts: ➤ git rebase --continue or --abort or --skip ➤ git rebase -i (interactive) is helpful ➤ When rebased, merge is a fast forward: ➤ git checkout master; git merge topic2
  • 29.
    READ THE HISTORY ➤git log ➤ print the changes ➤ git log -p ➤ print the changes, including a diff between revisions ➤ git log --stat ➤ Summarize the changes with a diffstat ➤ git log -- file1 file2 dir3 ➤ Show changes only for listed files or subdirs
  • 30.
    WHAT’S THE DIFFERENCE? ➤git diff ➤ Diff between index and working tree ➤ These are things you should “git add” ➤ “git commit -a” will also make this list empty ➤ git diff HEAD ➤ Difference between HEAD and working tree ➤ “git commit -a” will make this empty ➤ git diff --cached ➤ between HEAD and index ➤ “git commit” (without -a) makes this empty
  • 31.
    OTHER DIFFS ➤ gitdiff OTHERBRANCH ➤ Other branch and working tree ➤ git diff BRANCH1 BRANCH2 ➤ Difference between two branch heads ➤ git diff BRANCH1...BRANCH2 ➤ changes only on branch2 relative to common ancestor ➤ git diff --stat (other options) ➤ Nice summary of changes ➤ git diff --dirstat (other options) ➤ Summarize directory changes
  • 32.
    BARKING UP THETREE ➤ Most commands take “tree-ish” args ➤ SHA1 picks something absolutely ➤ Can be abbreviated if not ambiguous ➤ HEAD, some-branch-name, some-tag-name, some-origin-name ➤ Optionally followed by @{historical} ➤ “historical” can be: ➤ yesterday, 2011-11-22, etc (date ref) ➤ 1, 2, 3, etc (prior version of this ref) ➤ “upstream”, “u” (upstream version of local)
  • 33.
    MEET THE PARENTS ➤Any of those on the prior slide, followed by: ➤ ^n - “the n-th parent of an item” (default 1) ➤ ~n - n ^1’s (so ~3 is ^1^1^1) ➤ :path - pick the object from the tree ➤ ^{/some text here} - first parent with a commit message matching the text
  • 34.
    RANGES ➤ $tree1 -all commits reachable from $tree1 back to the root ➤ ^$tree1 $tree2 - all commits from $tree2 MINUS all commits from $tree1 ➤ Common enough that it can be written $tree1..$tree2 (“from $tree1 to $tree2”) ➤ $tree1…$tree2 is $m..$tree1 with $m..$tree2, where $m is the merge base ➤ Omitting either operand in .. or … defaults to HEAD ➤ $tree1.. - all commits on current branch since $tree1 ➤ ..$tree2 - all commits introduced on $tree2 since this branch forked
  • 35.
    RANGE EXAMPLES ➤ gitdiff HEAD^ HEAD ➤ most recent change on current branch ➤ Also: git diff HEAD~ HEAD ➤ git diff HEAD~3 HEAD ➤ What damage did last three edits do?
  • 36.
    SEEING THE CHANGES ➤gitk mytopic origin ➤ Tk widget display of history ➤ Shows changes back to common ancestor ➤ gitk --all ➤ show everything ➤ To skip the tags: gitk HEAD --branches --remotes ➤ gitk from..to ➤ Just the changes in “to” that aren’t in “from” ➤ git show-branch from..to ➤ Same thing for the Tk-challenged
  • 37.
    PLAYING WELL WITHOTHERS ➤ git clone creates “tracking” branches ➤ Typically named “origin/master” etc ➤ To share your work, first get up to date: ➤ git fetch origin ➤ Now rebase your changes on upstream: ➤ git rebase origin/master ➤ Or fetch/rebase in one step ➤ git pull --rebase ➤ To push upstream: ➤ git push
  • 38.
    RESETTING ➤ git reset--soft ➤ Makes all files “updated but not checked in” ➤ git reset --hard # DANGER ➤ Forces working dir to look like last commit ➤ git reset --hard HEAD~3 ➤ Tosses most recent 3 commits ➤ use “git revert” instead if you’ve published ➤ git checkout HEAD some/lost/file ➤ Recover the version of some/lost/file from the last commit
  • 39.
    IGNORING THINGS ➤ Everydirectory can contain a .gitignore ➤ lines starting with “!” mean “not” ➤ lines without “/” are checked against basename ➤ otherwise, shell glob via fnmatch(3) ➤ Leading / means “the current directory” ➤ Checked into the repository and tracked ➤ Every repository can contain a .git/info/exclude ➤ Both of these work together ➤ But .git/info/exclude won’t be cloned
  • 40.
    CONFIGURATION ➤ Many commandshave configurations ➤ git config name value ➤ set name to value ➤ name can contain periods for sub-items ➤ git config name ➤ get current value ➤ git config --global name [value] ➤ Same, but with ~/.gitconfig ➤ This applies to all git repos for a user
  • 41.
    THE STASH ➤ Createstemporary commits to represent: ➤ current index (git add ...) ➤ current working directory (git add .) ➤ Can rebase those onto new index later ➤ Many uses, such as pull into dirty workdir: ➤ git stash; git pull ...; git stash pop ➤ Might result in conflicts, of course ➤ Multiple stashes can be in play ➤ “git stash list” to show them
  • 42.
    OTHER USEFUL PORCELAIN ➤git archive: export a tree as a tar/zip ➤ git bisect: find the offensive commit ➤ git cherry-pick: selective merging ➤ git mv: rename a file/dir with the right index manipulations ➤ git rm: ditto for delete ➤ git push: write to an upstream ➤ git revert: add a commit that undoes a previous commit ➤ git blame: who wrote this?
  • 43.
    COMMIT ADVICE ➤ Splitchanges into small logical steps ➤ Ideally ones that pass the test suite again ➤ This helps for “blame” and “bisect”. ➤ Easier to squash commits later than to break up ➤ “git rebase -i” can squash, omit, reorder
  • 44.
    PICKING FROM BRANCHES ➤Two main tools: “merge” and “cherry-pick” ➤ Merge brings in all commits ➤ Scales well for large workflows ➤ Cherry-pick brings in one or more ➤ Great when a single patch is needed
  • 45.
    GIT.GIT’S WORKFLOW ➤ Fourbranches: ➤ maint: fixes to existing releases ➤ master: next release ➤ next: testing for next master ➤ pu: experimental features ➤ Each one is a descendent of the one above ➤ Commit to the oldest branch needing patch ➤ Then merge it upward: ➤ maint to master to next to pu
  • 46.
    TOPIC BRANCHES ➤ Mostfeatures require several iterations ➤ Commit these to topic branches during design ➤ Easier to rehack or abandon this way ➤ Fork topic from the oldest main branch ➤ Refresh-merge from that branch if needed ➤ But don’t do that routinely ➤ Rebase topic branch if forked from wrong branch ➤ More details at “man 7 gitworkflows”
  • 47.
    TESTING INTEGRATION ➤ Mergefrom base branch to topic branch ➤ ... on a new throw-away branch ➤ This branch is never merged back in ➤ Just for testing ➤ Can be published publicly, if you make it clear that it’s just for testing ➤ Otherwise, typically used only locally ➤ If integration fails, fix, and cherry-pick those back to the topic branch before final merge
  • 48.
    TIME TO “GIT”DIRTY ➤ Make a git repository: ➤ mkdir git-tutorial ➤ cd git-tutorial ➤ git init ➤ git config user.name “Randal Schwartz” ➤ git config user.email merlyn@stonehenge.com ➤ Add some content: ➤ echo "Hello World" >hello ➤ echo "Silly example" >example
  • 49.
    WHAT’S UP? ➤ gitstatus ➤ git add example hello ➤ git status ➤ git diff --cached
  • 50.
    “GIT ADD” TIMING ➤Change the content of “hello” ➤ echo "It's a new day for git" >>hello ➤ git status ➤ git diff ➤ Now commit the index (with old hello) ➤ git commit -m initial ➤ git status ➤ git diff ➤ git diff HEAD
  • 51.
    GIT COMMIT -A ➤Note that we committed the version of “hello” at the time we added it! ➤ Fix this by adding -a nearly always: ➤ git commit -a -m update ➤ git status
  • 52.
    WHAT HAPPENED? ➤ Askfor logs: ➤ git log ➤ git log -p ➤ git log --stat --summary ➤ Tag, you’re it: ➤ git tag my-first-tag ➤ Now we can always get back to that version later
  • 53.
    SHARING THE WORK ➤Create the clone: ➤ cd .. ➤ git clone git-tutorial my-git ➤ cd my-git ➤ The git clone will often have some sort of transport path, like git: or rsync: or http: ➤ See what we’ve got: ➤ git log -p ➤ Note that we have the entire history ➤ And that the SHA1s are identical
  • 54.
    BRANCHING OUT ➤ Createbranch “my-branch” ➤ git checkout -b my-branch ➤ git status ➤ Make some changes: ➤ echo "Work, work, work" >>hello ➤ git commit -a -m 'Some work.'
  • 55.
    CONFLICTS ➤ Switch back,and make other changes: ➤ git checkout master ➤ echo "Play, play, play" >>hello ➤ echo "Lots of fun" >>example ➤ git commit -a -m 'Some fun.' ➤ We now have conflicting commits
  • 56.
    SEEING THE DAMAGE ➤In an X11 display: ➤ gitk --all ➤ The --all means “all heads, branches, tags” ➤ For the X11 challenged: ➤ git show-branch --all ➤ git log --pretty=oneline —abbrev-commit --graph --decorate --all ➤ Handy for a mail message
  • 57.
    MERGING ➤ We’re on“master”, and we want to merge in the changes from my-branch ➤ Select the merge: ➤ git merge my-branch ➤ This fails, because we have a conflict in “hello” ➤ See this with: ➤ git status ➤ Edit “hello”, and commit: ➤ git commit -a -m “Merge work in my-branch”
  • 58.
    DID IT WORK? ➤Verify the merge with: ➤ gitk --all ➤ git show-branch --all ➤ See changes back to the common ancestor: ➤ gitk master my-branch ➤ git show-branch master my-branch ➤ Note that master is only one edit from my-branch now (the merge patch-up) ➤ “git show” handy with merges: ➤ git show HEAD
  • 59.
    MERGING THE UPSTREAM ➤Master is now updated with my-branch changes ➤ But my-branch is now lagging ➤ We can merge back the other way: ➤ git checkout my-branch ➤ git merge master ➤ This will succeed as a “fast forward” ➤ This means that the merge-from branch already has all of our change history ➤ So it’s just adding linear history to the end
  • 60.
    UPSTREAM CHANGES ➤ Let’schange origin a bit ➤ cd ../git-tutorial ➤ echo "some upstream change" >>other ➤ git add other ➤ git commit -a -m "upstream change" ➤ And now fetch it downstream ➤ cd ../my-git ➤ git fetch ➤ gitk --all ➤ git diff master..origin/master
  • 61.
    MERGE IT IN ➤Explicit merging ➤ git checkout master ➤ git merge origin/master ➤ Implicit fetch/merge ➤ git pull ➤ Eliminating the bushy tree ➤ git pull --rebase ➤ (Fails in our example.. sigh.)
  • 62.
    SPLITTING UP APATCH ➤ Sometimes, your changes are logically separate ➤ echo “this change” >>hello ➤ echo “unrelated change” >>example ➤ Now make two commits: ➤ git add -p # interactively select hello change ➤ git commit -m “fixed hello” # not -a! ➤ git commit -a -m “fixed example”
  • 63.
    FIXING A COMMIT ➤Oops, left out something on that last one ➤ echo "another unrelated" >>example ➤ Now “amend” the patch: ➤ git commit -a --amend ➤ This replaces the commit ➤ Be careful that you haven’t pushed it!
  • 64.
    FOR FURTHER INFO ➤See “Git (software)” in Wikipedia ➤ And the git homepage http://git-scm.com/ ➤ Git wiki at https://git.wiki.kernel.org/ ➤ Wonderful (but dated) Pro Git book: https://git-scm.com/book/en/v2 ➤ Get on the mailing list ➤ Helpful people there ➤ You can submit bugs, patches, ideas ➤ And the #git IRC channel (on Freenode) ➤ Now “git” to it!