2. GIT INSIDE OUT
MICHAEL NADEL
▸Developer @ Pine River Capital Management
▸New(ish) to .NET
▸3-year Git practitioner
▸Please reach out!
▸michael.nadel@gmail.com
▸@mnadel
3. GIT INSIDE OUT
GIT IS HARD
▸Linus Torvalds, creator of Git (and Linux)
▸Initial revision of “git”, the information manager from hell
▸I didn’t really expect anyone to use it because it’s so hard
to use.
▸Andrew Morton, lead Linux kernel developer
▸Git is expressly designed to make you feel less intelligent
than you thought you were.
4. GIT INSIDE OUT
THE CHALLENGE WITH GIT
▸Plenty of rope
▸Paradigm shifts
▸Distributed
▸Content-addressable filesystem
8. GIT INSIDE OUT
COMMITTING != SHARING
▸Separate concerns
▸Crafting your history
▸Publishing your history
▸Richer workflows
▸Commit, commit, commit, squash, push
▸Reorder, push subset
▸Enforced code reviews
9. GIT INSIDE OUT
CONTENT ADDRESSABLE
▸Version control is an abstraction on top of a primitive
key/value store
▸hash-object
▸cat-file
▸Prove
▸cat-file performs no magic
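A minimal sketch of the key/value idea, assuming an in-memory dict in place of Git's on-disk object directory. The function names mirror the plumbing commands but are hypothetical helpers, not Git's API. The header format (`blob <size>`, then a NUL byte) and the zlib compression follow how Git stores loose objects.

```python
import hashlib
import zlib

# Toy object store: key = SHA-1 hex digest, value = zlib-compressed bytes.
# (Assumption: a dict stands in for the .git/objects directory.)
store = {}

def hash_object(content: bytes, obj_type: str = "blob") -> str:
    """Store content under the hash of (header + content); return the key."""
    header = f"{obj_type} {len(content)}".encode() + b"\x00"
    raw = header + content
    key = hashlib.sha1(raw).hexdigest()
    store[key] = zlib.compress(raw)
    return key

def cat_file(key: str) -> bytes:
    """No magic: look up the key, decompress, strip the header."""
    raw = zlib.decompress(store[key])
    _header, _, content = raw.partition(b"\x00")
    return content

key = hash_object(b"hello\n")
print(key)            # the key is derived purely from the content
print(cat_file(key))  # the same bytes come back out
```

Storing the same bytes twice yields the same key, which is what makes the store content addressable.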
16. GIT INSIDE OUT
CONTENT ADDRESSABLE FILESYSTEM
▸Instead of text, how about your filesystem?
17. GIT INSIDE OUT
CONCEPTUAL MODELS
▸Git as a Database
▸Store, retrieve, search your source code & its history
▸Git as a Graph
▸CRUD operations are performed against a graph of
commits
18. GIT INSIDE OUT
GIT AS A DATABASE
▸CRUD, search operations
▸Data types
▸Commit
▸Tree
▸Blob
Structured text
byte[]
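The three data types can be sketched as follows, assuming the standard loose-object encoding. The file name, author identity, timestamp and helper names are placeholders invented for illustration.

```python
import hashlib

def object_id(obj_type: str, body: bytes) -> str:
    """Hash '<type> <size>\\0<body>', the way Git names its objects."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()

def tree_entry(mode: str, name: str, oid: str) -> bytes:
    """One tree entry: '<mode> <name>\\0' followed by the 20-byte binary id."""
    return f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(oid)

# Blob: raw bytes only. The file name is NOT part of the blob.
blob_id = object_id("blob", b"hello\n")

# Tree: maps names to blob (or sub-tree) ids. The name lives here.
tree_id = object_id("tree", tree_entry("100644", "hello.txt", blob_id))
renamed_tree_id = object_id("tree", tree_entry("100644", "renamed.txt", blob_id))

# Commit: structured text pointing at a tree (and at parents, if any).
# The identity and timestamp below are placeholders.
commit_body = (
    f"tree {tree_id}\n"
    "author A U Thor <author@example.com> 0 +0000\n"
    "committer A U Thor <author@example.com> 0 +0000\n"
    "\n"
    "Initial commit\n"
).encode()
commit_id = object_id("commit", commit_body)

print(blob_id, tree_id, commit_id, sep="\n")
```

Renaming the file changes the tree id but leaves the blob id untouched, because the blob never knew its name.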
28. GIT INSIDE OUT
GIT AS A GRAPH
▸What operations must I perform to get the graph to look the
way I want?
29. GIT INSIDE OUT
GIT COMMANDMENTS
▸Git is immutable
▸No updates, only appends
▸Git is a directed acyclic graph (DAG)
▸Directed: edges point one way, from a commit to its parent(s)
▸Acyclic: no cycles; following edges never leads back to a node
you have already left
▸Every command is an operation on the graph
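These rules can be sketched with a toy append-only graph. The single-letter commit ids and the helper names are hypothetical; real commits are keyed by their hashes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)          # frozen: a commit cannot be modified once created
class Commit:
    id: str
    parents: tuple = ()

commits = {}                     # append-only: entries are added, never replaced

def add_commit(cid: str, *parents: str) -> str:
    if cid in commits:
        raise ValueError("no updates, only appends")
    for parent in parents:
        if parent not in commits:
            # Parents must already exist, so an edge can never point
            # "forward" and no cycle can ever be formed.
            raise KeyError(parent)
    commits[cid] = Commit(cid, tuple(parents))
    return cid

def reachable(start: str) -> set:
    """Walk child -> parent edges, the only direction the graph allows."""
    seen, stack = set(), [start]
    while stack:
        cid = stack.pop()
        if cid not in seen:
            seen.add(cid)
            stack.extend(commits[cid].parents)
    return seen

add_commit("A")
add_commit("B", "A")
add_commit("C", "B")
head = "C"

before = reachable(head)
add_commit("D", "B")             # a new node that HEAD does not point at
after = reachable(head)
print(before == after)           # HEAD's view of history is unchanged
```

Appending a node that nothing reachable from HEAD points to leaves the history visible from HEAD exactly as it was.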
32. GIT INSIDE OUT
REFS, HEADS, BRANCHES
▸Ref is a pointer to a commit
▸Branch is a ref
▸HEAD is a pointer to your current branch
▸Branches have “namespaces”
35. GIT INSIDE OUT
BRANCH IMPLEMENTATION
▸Heads contain your branches
▸Remotes contain remote branches (e.g. origin)
▸“Namespaces” are directories
▸Branches are small files containing the 40-character
SHA-1 hash of a commit object
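A rough sketch of the layout described above, written against a throwaway directory rather than a real repository. The commit id is a placeholder and the helper names are hypothetical. A real repository may also keep refs in a packed-refs file, which this ignores.

```python
import tempfile
from pathlib import Path

# Placeholder id, not a real commit.
FAKE_COMMIT = "0123456789abcdef0123456789abcdef01234567"

def write_branch(git_dir: Path, name: str, commit_id: str) -> None:
    """Creating or moving a branch is just writing a hash to a small file."""
    ref = git_dir / "refs" / "heads" / name
    ref.parent.mkdir(parents=True, exist_ok=True)   # "namespaces" are directories
    ref.write_text(commit_id + "\n")

def resolve_head(git_dir: Path) -> str:
    """HEAD normally names a branch; follow it to the commit id."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        return (git_dir / head[len("ref: "):]).read_text().strip()
    return head                                     # detached HEAD: a bare commit id

with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    write_branch(git_dir, "master", FAKE_COMMIT)
    write_branch(git_dir, "feature/login", FAKE_COMMIT)  # lands in refs/heads/feature/
    (git_dir / "HEAD").write_text("ref: refs/heads/master\n")

    resolved = resolve_head(git_dir)
    namespace_is_dir = (git_dir / "refs" / "heads" / "feature").is_dir()

print(resolved)
print(namespace_is_dir)
```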
36. GIT INSIDE OUT
COMMIT (BEFORE)
▸A commit references its parent
▸HEAD, branch point at commit
51. GIT INSIDE OUT
RECAP - CONCEPTUAL MODELS
▸Duality of Git
▸As a database
▸As an immutable DAG
▸Reasoning through problems
▸Launch SmartGit & observe the result of commands
against the DAG
52. GIT INSIDE OUT
RECAP - “OH SHIT!” COMMANDS
▸git reflog
▸git reset
▸--soft won’t affect your workspace
▸--hard will make your workspace reflect where your
HEAD moved to (you can lose work)
▸git rebase -i
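A toy model of the difference between the two reset modes, and of why the reflog is a safety net. It deliberately ignores the index and the default --mixed mode. The class, the snapshot data and the oldest-first reflog list are all assumptions made for illustration, not how Git stores things.

```python
# Committed snapshots, keyed by hypothetical commit ids.
SNAPSHOTS = {
    "A": {"f.txt": "v1"},
    "B": {"f.txt": "v2"},
}

class ToyRepo:
    def __init__(self, start: str):
        self.branch_tip = start
        self.workspace = dict(SNAPSHOTS[start])
        self.reflog = [start]            # every position the tip has held

    def reset(self, target: str, hard: bool = False) -> None:
        self.branch_tip = target         # both modes move the pointer
        self.reflog.append(target)       # the old position stays recoverable
        if hard:
            # Only a hard reset overwrites the workspace; uncommitted edits are lost.
            self.workspace = dict(SNAPSHOTS[target])

repo = ToyRepo("B")
repo.workspace["f.txt"] = "v2 plus uncommitted edits"

repo.reset("A")                          # like --soft: pointer moves, files untouched
soft_workspace = dict(repo.workspace)

repo.reset("B")                          # move back using an id found in the reflog
repo.reset("A", hard=True)               # like --hard: files now match commit A
hard_workspace = dict(repo.workspace)

print(soft_workspace)
print(hard_workspace)
print(repo.reflog)
```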
How many people can relate?
I often found myself in this situation. Then I started learning more and more about Git’s internals, and found myself in this situation less and less. I started talking to other people about it and, it turns out, they had a similar experience.
This is why I want to take a depth-first approach with you folks tonight. I think it’s important to grok Git’s internals in order to be able to reason your way through situations you find yourself in. And I want to share that journey with you this evening.
NEXT: Distributed
Git is egalitarian
NEXT: Content addressable
Porcelain vs plumbing
NEXT: Conceptual models
NEXT: Git as a database
Note that the filename isn’t part of the blob
NEXT: Git as a graph
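A tree is the special case of a DAG in which each child has a single parent. Merge commits have more than one parent, so Git’s history is a DAG rather than a tree.
It’s immutable because of the key/value store: an object’s key is the hash of its content, so changed content is stored as a new object under a new key.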
NEXT: Dissect
Ruby on Rails SVN repo: 115M
Ruby on Rails Git repo: 13M
A branch is implemented by writing a 40-character hash to a file on your file system. This is why branching is blazing fast.
It's a *D*AG. Since new nodes aren't reachable from HEAD, your view of the graph hasn't changed, thus we haven't violated its immutability.