Git: a brief introduction
Randal L. Schwartz, merlyn@stonehenge.com
Version 4.0.6 on 5 Jan 2012
This document is copyright 2011, 2012 by Randal L. Schwartz, Stonehenge Consulting Services, Inc.
This work is licensed under Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License
http://creativecommons.org/licenses/by-nc-sa/3.0/
Monday, February 6, 12 1
About me
• Been tracking git since it was created
• Used git on small projects
• Used other systems on small and large projects
• Read a lot of people talk about git on the
mailing list
• Provided some patches to git, and suggestions
for user interface changes
• Worked on small and medium teams with git
• But not large ones
What is git?
• Git manages changes to a tree of files over time
• Git is optimized for:
• Distributed development
• Large file counts
• Complex merges
• Making trial branches
• Being very fast
• Being robust
But not for...
• Tracking file permissions and ownership
• Tracking individual files with separate history
• Making things painful
Why git?
• Essential to Linux kernel development
• Created as a replacement when BitKeeper
suddenly became “unavailable”
• Now used by thousands of projects
• Everybody has a “commit bit”
Everyone can...
• Clone the tree
• Make and test local changes
• Submit the changes as patches via mail
• OR submit them as a published repository
• Track the upstream to revise if needed
How does git do it?
• Universal public identifiers
• None of the SVK “my @245 is your @992”
• Multi-protocol transport: HTTP, SSH, GIT
• Efficient object storage
• Everyone has entire repo (disk is cheap)
• Easy branching and merging
• Common ancestors are computable
• Patches (and repo updates) can be transported
or mailed
• Binary “patches” are supported
The SHA1 is King
• Every “object” has a SHA1 to uniquely identify it
• “objects” consist of:
• Blobs (the contents of a file)
• Trees (directories of blobs or other trees)
• Commits:
• A tree
• Plus zero or more parent commits
• Plus a message about why
• And tags
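A minimal sketch of how an object gets its id, assuming a throwaway repository created with mktemp (the directory and the "Hello World" contents are placeholders, not from the slides). It uses the plumbing commands git hash-object and git cat-file to show that a blob's id is derived from its contents alone:

```shell
# Scratch repository; location and contents are illustrative assumptions.
cd "$(mktemp -d)" && git init -q

# Hash the same contents twice: once storing the blob (-w), once only computing the id.
id1=$(echo "Hello World" | git hash-object -w --stdin)
id2=$(echo "Hello World" | git hash-object --stdin)

echo "$id1"                 # the object id (40 hex digits with the default SHA-1 format)
git cat-file -t "$id1"      # object type: blob
git cat-file -p "$id1"      # object contents: Hello World
```

Identical contents always map to the same id, which is why no file name appears anywhere in the blob.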
Tags
• An object (usually a commit)
• Plus an optional subject (if anything else is given)
• Plus an optional payload to sign it off
• Plus an optional gpg signature
• Designed to be immobile
• Changes not tracked during cloning
• Use a branch if you want to move around
Objects live in the repo
• Git efficiently creates new objects
• Objects are generally added, not destroyed
• Unreferenced objects will be garbage collected
• Objects start “loose”, but can be “packed”
• “Packs” represent objects as deltas
• “Packs” are also created for repo transfer
Commits rule the repo
• One or more commits form the head of object
chains
• Typically one head called “master”
• Others can be made at will (“branches”)
• Usually one commit in the repo that has no
parent commit (“root” commit)
Reaching out
• From a commit, reaching the components:
• Chase down the tree object to get to
directories and files as they existed at this
commit time
• Chase down the parent objects to get to
earlier commits and their respective trees
• Do this recursively, and you have all of history
• And the SHA1 depends on all of that!
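The chase from a commit to its tree and parent can be sketched as follows. This assumes a scratch repository with two made-up commits; the identity exported below is a placeholder so that committing works on a machine with no git configuration:

```shell
# Scratch repository with a placeholder identity (assumptions for illustration).
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

echo one >file && git add file && git commit -q -m first
echo two >>file && git commit -q -a -m second

git cat-file -p HEAD                  # raw commit: tree line, parent line, author, message
tree=$(git rev-parse 'HEAD^{tree}')   # the tree this commit points at
parent=$(git rev-parse 'HEAD^')       # the commit it was built on
git ls-tree "$tree"                   # the directory listing: one blob named "file"
git cat-file -p "$parent"             # the root commit: a tree line, but no parent line
```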
The git repo
• A “working tree” has a “.git” dir at the top level
• Unlike CVS, SVN: no pollution of deeper
directories
• This makes it friendly to recursive greps
The .git dir contains:
• config – Configuration file (.ini style)
• objects/* – The object repository
• refs/heads/* – branches (like “master”)
• refs/tags/* – tags
• logs/* – logs
• refs/remotes/* – tracking others
• index – the “index cache” (described shortly)
• HEAD – points to one of the branches (the
“current branch”, where commits go)
The index (or “cache”)
• A directory of blob objects
• Represents the “next commit”
• “Add files” to put current contents in
• “Commit” takes the current index and makes it
a real commit object
• Diff between HEAD and index:
• changed things not yet committed
• Diff between index and working dir:
• changed things not yet added
• untracked things
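The three states above can be seen side by side in a small sketch. The repository, file names, and identity are assumptions made up for the example:

```shell
# Scratch repository with a placeholder identity (assumptions for illustration).
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
echo "Hello World" >hello && git add hello && git commit -q -m initial

echo "staged line" >>hello && git add hello   # change recorded in the index
echo "unstaged line" >>hello                  # change only in the working dir
echo "scratch" >notes                         # never added: untracked

staged=$(git diff --cached --name-only)       # HEAD vs index: not yet committed
unstaged=$(git diff --name-only)              # index vs working dir: not yet added
untracked=$(git ls-files --others --exclude-standard)
echo "staged=$staged unstaged=$unstaged untracked=$untracked"
```

Note that "hello" shows up in both diffs: the index holds the version from the time of "git add", not the current file.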
What’s in a name?
• Git doesn’t record explicit renaming
• Nor expect you to declare it
• Exact renaming determined by SHA1
• Copy-paste-edits detected by similarity
• Computer better than you at that
• Explicit tracking will be wrong sometimes
• Being wrong breaks merges
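A small sketch of rename detection after the fact, assuming a scratch repository and made-up file names. The file is moved with plain mv, so git is never told about the rename; the -M option of git diff asks for renames to be inferred:

```shell
# Scratch repository with a placeholder identity (assumptions for illustration).
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 >old-name
git add old-name && git commit -q -m initial

mv old-name new-name     # ordinary mv: nothing is declared to git
git add -A               # stage the deletion and the "new" file

status=$(git diff --cached -M --name-status)
echo "$status"           # an R (rename) entry; the score is 100 because the blob is identical
```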
Git speaks and listens
• Many protocols to transfer between repos
• rsync, http, https, git, ssh, local files
• In the core, git also has:
• import/export with CVS, SVN
• I use CVS/SVN import to have entire history of
a project at 30K feet
• Third party solutions handle others
• Git core also includes cvs-server
• A git repository can act like a CVS repository
for legacy clients or humans
Getting git
• Get the latest “git-*.tar.gz” from
code.google.com/p/git-core
• RPMs and Debian packages also exist
• Track the git-developer archive:
• git clone git://git.kernel.org/pub/scm/git/git.git
• Maintenance releases are very stable
• I install mine “prefix=/opt/git”
• add /opt/git/bin to PATH
Git commands
• All git commands start with “git”
• “git MUMBLE-FOO bar” was formerly also written
“git-MUMBLE-FOO bar”
• With the spaced form, only a single entry “git” has
to be on the /usr/local/bin path
• This works for internal calls as well
• Manpages are still under “git-MUMBLE-FOO”
• Unless you use “git help MUMBLE-FOO”
• Or “git MUMBLE-FOO --help”
Porcelain and plumbing
• Low-level git operations are called “plumbing”
• Higher level actions are called “porcelain”
• The git distro includes both
• Use porcelain from command line
• But don’t script with it
• Future releases might change things
• Use plumbing for scripts
• Intended to be upward compatible
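As a rough illustration of the plumbing layer, the sketch below builds a commit without "git add" or "git commit", using only hash-object, update-index, write-tree, commit-tree, and update-ref. The repository, the file name "hello", and the identity are assumptions for the example; it is one possible decomposition, not a description of how the porcelain is implemented:

```shell
# Scratch repository with a placeholder identity (assumptions for illustration).
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

blob=$(echo "Hello World" | git hash-object -w --stdin)   # store a blob object
git update-index --add --cacheinfo 100644 "$blob" hello   # list it in the index as "hello"
tree=$(git write-tree)                                    # index -> tree object
commit=$(echo "initial" | git commit-tree "$tree")        # tree -> parentless commit
git update-ref refs/heads/master "$commit"                # point the branch at the commit

git log --pretty=oneline master                           # one commit: "initial"
```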
Creating a repo
• git init
• Creates a .git in the current dir
• Optional: edit .gitignore
• “git add .” to add all files (except .git!)
• Then “git commit” for the initial commit
• Creates current branch named “master”
• Could also do this on a tarball
• tar xvfz some-tarball.tgz; cd some-tarball
• git init
• git add .
Cloning
• Creates a git repo from an existing repo
• Generally creates a subdirectory
• Your workfiles and .git are in there
• Remote branches are “tracked”
• Remote “HEAD” branch checked out as your
initial “master” branch as well
• Clone repo identified as “origin”
• But the name is otherwise unspecial
Committing
• Your work product is more commits
• These are always on a “branch”
• A branch is just a named commit
• When you commit, the former branch head
becomes the parent
• The branch head moves to be the new commit
• Thus, you’re creating a directed acyclic graph
• ... rooted in branch heads
• A merge is just a commit with multiple parents
Typical work flow
• Edit edit edit
• git add files/you/have changed/now
• This adds the files to the index
• “git add .” for adding all interesting files
• git status
• Tells you differences between HEAD, index,
and working directory
Making the commit
• “git commit”
• Popped into a text editor (or “-m msg”)
• First text line used for “short logs”
• Current branch is moved forward
• And you’re back to more editing
But which branch?
• Git encourages branching
• A branch is just 41 text bytes!
• Typical work flow:
• Think of something to do
• git checkout -b topic-name master
• work work work, commit to topic-name
• When your thing is done:
• git checkout master
• git merge topic-name
• git branch -d topic-name
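The topic-branch cycle above, run end to end in a scratch repository. The file contents and identity are placeholders; "topic-name" is the slide's own example name:

```shell
# Scratch repository with a placeholder identity (assumptions for illustration).
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
echo base >file && git add file && git commit -q -m base

git checkout -q -b topic-name master
echo feature >>file && git commit -q -a -m "work on topic"
topic=$(git rev-parse topic-name)

git checkout -q master
git merge -q topic-name          # master has not moved, so this is a fast forward
git branch -d topic-name         # safe: the commit is reachable from master

git rev-parse master | wc -c     # 41 bytes with SHA-1 ids: 40 hex digits plus a newline
```

After the merge, master names exactly the commit that topic-name named, which is all a branch is.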
Working in parallel
• You can have multiple topics active:
• git checkout -b topic1 master
• work work; commit; work work; commit
• git checkout -b topic2 master
• work work work; commit
• git checkout topic1; work work; commit
• Decide how to bring them together
• Merge: parallel histories
• Rebase: serial histories
• Each has pros and cons
The merge
• git checkout master
• git merge topic1; git branch -d topic1
• This should be a trivial (“fast forward”) merge
• git merge topic2
• Conflicts may arise:
• overlapping changes in text edits
• files renamed two different ways
• You need to resolve, and continue:
• git commit -a (describe the merge fix here)
The rebase
• Rewrites commits
• Changes SHA1s: the original commits are replaced!
• Don’t rebase if you’ve published commits!
• git checkout topic2; git rebase master
• topic2’s commits rewritten on top of master
• May result in merge conflicts:
• git rebase --continue or --abort or --skip
• git rebase -i (interactive) is helpful
• When rebased, merge is a fast forward:
• git checkout master; git merge topic2
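A sketch of the rebase-then-merge sequence in a scratch repository, showing that the rebased commit gets a new SHA1 and that the following merge is a fast forward. File names and identity are made up; the two branches touch different files so no conflict arises:

```shell
# Scratch repository with a placeholder identity (assumptions for illustration).
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
echo base >file && git add file && git commit -q -m base

git checkout -q -b topic2 master
echo topic >topic-file && git add topic-file && git commit -q -m "topic work"
before=$(git rev-parse topic2)

git checkout -q master
echo more >master-file && git add master-file && git commit -q -m "master work"
master_tip=$(git rev-parse master)

git checkout -q topic2
git rebase -q master             # replays "topic work" on top of "master work"
after=$(git rev-parse topic2)    # same change, new parent, therefore a new SHA1

git checkout -q master
git merge -q topic2              # fast forward: master is now an ancestor of topic2
echo "before=$before after=$after"
```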
Read the history
• git log
• print the changes
• git log -p
• print the changes, including a diff between
revisions
• git log --stat
• Summarize the changes with a diffstat
• git log -- file1 file2 dir3
• Show changes only for listed files or subdirs
What’s the difference?
• git diff
• Diff between index and working tree
• These are things you should “git add”
• “git commit -a” will also make this list empty
• git diff HEAD
• Difference between HEAD and working tree
• “git commit -a” will make this empty
• git diff --cached
• Difference between HEAD and index
• “git commit” (without -a) makes this empty
Other diffs
• git diff OTHERBRANCH
• Other branch and working tree
• git diff BRANCH1 BRANCH2
• Difference between two branch heads
• git diff BRANCH1...BRANCH2
• Changes on BRANCH2 since its common ancestor with BRANCH1
• git diff --stat (other options)
• Nice summary of changes
• git diff --dirstat (other options)
• Summarize directory changes
Barking up the tree
• Most commands take “tree-ish” args
• SHA1 picks something absolutely
• Can be abbreviated if not ambiguous
• HEAD, some-branch-name, some-tag-name,
some-origin-name
• Optionally followed by @{historical}
• “historical” can be:
• yesterday, 2011-11-22, etc (date ref)
• 1, 2, 3, etc (prior version of this ref)
• “upstream” (upstream version of local)
Meet the parents
• Any of those on the prior slide, followed by:
• ^n - “the n-th parent of an item” (default 1)
• ~n - n ^1’s (so ~3 is ^1^1^1)
• :path - pick the object from the tree
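These suffixes can be tried out with git rev-parse, which resolves any such expression to an object id. The sketch assumes a scratch repository with four made-up commits, each writing its own number into "file":

```shell
# Scratch repository with a placeholder identity (assumptions for illustration).
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
for n in 1 2 3 4; do
  echo "$n" >file && git add file && git commit -q -m "commit $n"
done

git rev-parse 'HEAD~3'             # three first-parent steps back: the root commit
git rev-parse 'HEAD^^^'            # the same object, spelled with ^
git cat-file -p 'HEAD~2:file'      # contents of "file" two commits ago: 2
git log --pretty=%s -1 'HEAD^'     # subject of the first parent: commit 3
```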
Tree Examples
• git diff HEAD^ HEAD
• most recent change on current branch
• Also: git diff HEAD~ HEAD
• git diff HEAD~3 HEAD
• What damage did the last three commits do?
Seeing the changes
• gitk mytopic origin
• Tk widget display of history
• Shows changes back to common ancestor
• gitk --all
• show everything
• gitk from..to
• Just the changes in “to” that aren’t in “from”
• git show-branch from..to
• Same thing for the Tk-challenged
Playing well with others
• git clone creates “tracking” branches
• Typically named “origin/master” etc
• To share your work, first get up to date:
• git fetch origin
• Now rebase your changes on upstream:
• git rebase origin/master
• Or fetch/rebase in one step
• git pull --rebase
• To push upstream:
• git push
Resetting
• git reset --soft
• Given an earlier commit, moves the branch head there;
later changes stay “updated but not checked in”
• git reset --hard # DANGER
• Forces working dir to look like last commit
• git reset --hard HEAD~3
• Tosses most recent 3 commits
• use “git revert” instead if you’ve published
• git checkout HEAD some/lost/file
• Recover the version of some/lost/file from
the last commit
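A sketch of the reset variants on a scratch repository with two made-up commits. It assumes "git reset --soft" is given an earlier commit (HEAD~1 here), and it uses the "--" separator in the checkout so that the path cannot be mistaken for a revision:

```shell
# Scratch repository with a placeholder identity (assumptions for illustration).
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
echo one >file && git add file && git commit -q -m first
echo two >>file && git commit -q -a -m second

git reset -q --soft 'HEAD~1'              # branch head moves back; index and file keep "two"
staged=$(git diff --cached --name-only)   # "file": the undone commit's change is still staged

git reset -q --hard                       # DANGER: index and working dir forced back to HEAD
after_hard=$(cat file)                    # only "one" is left

rm file                                   # lose the file by accident...
git checkout -q HEAD -- file              # ...and recover it from the last commit
```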
Ignoring things
• Every directory can contain a .gitignore
• lines starting with “!” mean “not”
• lines without “/” are checked against
basename
• otherwise, shell glob via fnmatch(3)
• Leading / means “the current directory”
• Checked into the repository and tracked
• Every repository can contain a .git/info/exclude
• Both of these work together
• But .git/info/exclude won’t be cloned
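The pattern rules above can be checked with a small sketch. The patterns and file names are invented for the example; "git ls-files --others --exclude-standard" lists the untracked files that survive the ignore rules:

```shell
# Scratch repository; patterns and file names are illustrative assumptions.
cd "$(mktemp -d)" && git init -q
printf '%s\n' '*.log' '!keep.log' '/build' >.gitignore
mkdir -p build sub/build
touch debug.log keep.log build/out sub/build/out sub/trace.log

# Untracked files that are NOT ignored:
visible=$(git ls-files --others --exclude-standard)
echo "$visible"
```

Here "*.log" has no slash, so it also hides sub/trace.log; "!keep.log" brings keep.log back; and "/build" hides only the top-level build directory, leaving sub/build/out visible.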
Configuration
• Many commands have configurations
• git config name value
• set name to value
• name can contain periods for sub-items
• git config name
• get current value
• git config --global name [value]
• Same, but with ~/.gitconfig
• This applies to all git repos from a user
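A sketch of the two configuration levels. As a precaution it points HOME at a scratch directory so that the real ~/.gitconfig is not touched; that sandboxing, the repository, and the names used are assumptions of the example:

```shell
# Sandbox HOME so "--global" writes to a scratch ~/.gitconfig (assumption for safety).
HOME=$(mktemp -d); export HOME
unset XDG_CONFIG_HOME
cd "$(mktemp -d)" && git init -q

git config --global user.name "Global Name"   # stored in $HOME/.gitconfig
from_global=$(git config user.name)           # no repo-level value yet: Global Name

git config user.name "Repo Name"              # stored in this repo's .git/config
from_repo=$(git config user.name)             # the repo-level value wins: Repo Name

echo "$from_global / $from_repo"
git config --global user.name                 # the global value itself is unchanged
```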
The stash
• Creates temporary commits to represent:
• current index (git add ...)
• current working directory (git add .)
• Can rebase those onto new index later
• Many uses, such as pull into dirty workdir:
• git stash; git pull ...; git stash pop
• Might result in conflicts, of course
• Multiple stashes can be in play
• “git stash list” to show them
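A minimal stash round trip, assuming a scratch repository with one made-up commit and a placeholder identity (the stash is stored as commits, so an identity is needed):

```shell
# Scratch repository with a placeholder identity (assumptions for illustration).
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
echo base >file && git add file && git commit -q -m base

echo "work in progress" >>file     # dirty working directory
git stash -q                       # tuck the change away
clean=$(git status --porcelain)    # empty: the working directory matches HEAD again
git stash list                     # one entry, stash@{0}

git stash pop -q                   # reapply the change and drop the stash entry
tail -n 1 file                     # work in progress
```

Between the stash and the pop is where a "git pull" would go in the dirty-workdir case.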
Other useful porcelain
• git archive: export a tree as a tar/zip
• git bisect: find the offensive commit
• git cherry-pick: selective merging
• git mv: rename a file/dir with the right index
manipulations
• git rm: ditto for delete
• git push: write to an upstream
• git revert: add a commit that undoes a previous
commit
• git blame: who wrote this?
Commit Advice
• Split changes into small logical steps
• Ideally ones that pass the test suite again
• This helps for “blame” and “bisect”.
• Easier to squash commits later than to break up
• “git rebase -i” can squash, omit, reorder
Picking from branches
• Two main tools: “merge” and “cherry-pick”
• Merge brings in all commits
• Scales well for large workflows
• Cherry-pick brings in one or more
• Great when a single patch is needed
git.git’s workflow
• Four branches:
• maint: fixes to existing releases
• master: next release
• next: testing for next master
• pu: experimental features
• Each one is a descendant of the one above
• Commit to the oldest branch needing patch
• Then merge it upward:
• maint to master to next to pu
Topic branches
• Most features require several iterations
• Commit these to topic branches during design
• Easier to rehack or abandon this way
• Fork topic from the oldest main branch
• Refresh-merge from that branch if needed
• But don’t do that routinely
• Rebase topic branch if forked from wrong
branch
• More details at “man 7 gitworkflows”
Testing integration
• Merge from base branch to topic branch
• ... on a new throw-away branch
• This branch is never merged back in
• Just for testing
• Can be published publicly, if you make that clear
• Otherwise, typically used only locally
• If integration fails, fix, and cherry-pick those
back to the topic branch before final merge
Time to “git” dirty
• Make a git repository:
• mkdir git-tutorial
• cd git-tutorial
• git init
• git config user.name "Randal Schwartz"
• git config user.email merlyn@stonehenge.com
• Add some content:
• echo "Hello World" >hello
• echo "Silly example" >example
What’s up?
• git status
• git add example hello
• git status
• git diff --cached
“git add” timing
• Change the content of “hello”
• echo "It's a new day for git" >>hello
• git status
• git diff
• Now commit the index (with old hello)
• git commit -m initial
• git status
• git diff
• git diff HEAD
git commit -a
• Note that we committed the version of “hello”
at the time we added it!
• Fix this by adding -a nearly always:
• git commit -a -m update
• git status
What happened?
• Ask for logs:
• git log
• git log -p
• git log --stat --summary
• Tag, you’re it:
• git tag my-first-tag
• Now we can always get back to that version
later
Sharing the work
• Create the clone:
• cd ..
• git clone git-tutorial my-git
• cd my-git
• The git clone will often have some sort of
transport path, like git: or rsync: or http:
• See what we’ve got:
• git log -p
• Note that we have the entire history
• And that the SHA1s are identical
Branching out
• Create branch “my-branch”
• git checkout -b my-branch
• git status
• Make some changes:
• echo "Work, work, work" >>hello
• git commit -a -m 'Some work.'
Conflicts
• Switch back, and make other changes:
• git checkout master
• echo "Play, play, play" >>hello
• echo "Lots of fun" >>example
• git commit -a -m 'Some fun.'
• We now have conflicting commits
Seeing the damage
• In an X11 display:
• gitk --all
• The --all means “all heads, branches, tags”
• For the X11 challenged:
• git show-branch --all
• git log --pretty=oneline --abbrev-commit --graph --decorate --all
• Handy for a mail message
Merging
• We’re on “master”, and we want to merge in
the changes from my-branch
• Select the merge:
• git merge my-branch
• This fails, because we have a conflict in “hello”
• See this with:
• git status
• Edit “hello”, and commit:
• git commit -a -m "Merge work in my-branch"
Did it work?
• Verify the merge with:
• gitk --all
• git show-branch --all
• See changes back to the common ancestor:
• gitk master my-branch
• git show-branch master my-branch
• Note that master is only one edit from my-branch
now (the merge patch-up)
• “git show” handy with merges:
• git show HEAD
Merging the upstream
• Master is now updated with my-branch changes
• But my-branch is now lagging
• We can merge back the other way:
• git checkout my-branch
• git merge master
• This will succeed as a “fast forward”
• This means that the merge-from branch already
has all of our change history
• So it’s just adding linear history to the end
Upstream changes
• Let’s change origin a bit
• cd ../git-tutorial
• echo "some upstream change" >>other
• git add other
• git commit -a -m "upstream change"
• And now fetch it downstream
• cd ../my-git
• git fetch
• gitk --all
• git diff master..origin/master
Merge it in
• Explicit merging
• git checkout master
• git merge origin/master
• Implicit fetch/merge
• git pull
• Eliminating the bushy tree
• git pull --rebase
• (Fails in our example.. sigh.)
Splitting up a patch
• Sometimes, your changes are logically separate
• echo "this change" >>hello
• echo "unrelated change" >>example
• Now make two commits:
• git add -p # interactively select hello change
• git commit -m "fixed hello" # not -a!
• git commit -a -m "fixed example"
Fixing a commit
• Oops, left out something on that last one
• echo "another unrelated" >>example
• Now “amend” the patch:
• git commit -a --amend
• This replaces the commit
• Be careful that you haven’t pushed it!
For further info
• See “Git (software)” in Wikipedia
• And the git homepage http://git-scm.com/
• Git wiki at https://git.wiki.kernel.org/
• Wonderful Pro Git book: http://progit.org/book/
• Get on the mailing list
• Helpful people there
• You can submit bugs, patches, ideas
• And the #git IRC channel (on Freenode)
• Now “git” to it!