Git internals
Published in: Technology, Business
  • Internal model of the commit object. DEMO: git cat-file -p <commit sha1>
  • The most important Git object is the COMMIT. The most important thing about the commit is that it is IMMUTABLE. So why is it important?

    A commit is primarily defined by 3 things: a snapshot of the working directory (the "disk" state); a commit message; and, most importantly, a parent commit. Every commit has a pointer to its parent. This is what defines a history of commits: a linked list of commits.

    So, if you change
      - a single file -> different commit
      - the commit message -> different commit
      - the commit parent or parents -> different commit

    A commit is uniquely identified by its SHA1. A SHA1 is deterministic: a snapshot with the exact same content will have the same SHA1. A commit referring to the same snapshot, the same parent commit, the same commit message and the same author/committer information will be identified by the same SHA1.

    So why is this important? It matters because of the initial Git design choices. Git is primarily a content-addressable filesystem that stores every version of every object as a distinct object, accessible forever. Any file, any snapshot, any commit that has been archived in Git can be retrieved forever by its SHA1.

    These files are stored in full, every time. This is a major difference from other versioning systems such as Subversion and Perforce, which store diffs of files. Now, I'm making simplifications, but this is true to a first approximation.

    Why do we care? This means that any snapshot that has been committed once can always be retrieved. Keep this in mind, as it will be important later.
  • A commit has a pointer to a tree, which describes the entire content of the Git repo.
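The point made in the notes above, that changing the snapshot, the message or the parent yields a different commit, can be sketched in a few lines of Python. This is a minimal illustration, not Git's implementation: the helper names `git_object_id` and `commit_body` are hypothetical, the tree, parent and author values are copied from the commit slide purely as sample data, and the simplified body is not expected to reproduce the `dbaf944` hash shown in the deck.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 of '<type> <size>' + NUL + body, as Git does for object IDs."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parent: str, who: str, message: str) -> bytes:
    """Assemble a simplified commit body (one parent, author == committer)."""
    lines = [
        f"tree {tree}",
        f"parent {parent}",
        f"author {who}",
        f"committer {who}",
        "",
        message,
    ]
    return ("\n".join(lines) + "\n").encode()


# Sample values taken from the commit slide; used here only as illustration data.
TREE = "47ec7a250164a21cb14eb64618c3a903db0b7420"
PARENT = "402b26df0644f09fc62842c0a4a44a0a3345c530"
WHO = "Manu <m.cupcic@criteo.com> 1380977766 +0200"
MESSAGE = "This is a commit message."

original = git_object_id("commit", commit_body(TREE, PARENT, WHO, MESSAGE))

# Same tree, parent, identity and message: same ID.
same = git_object_id("commit", commit_body(TREE, PARENT, WHO, MESSAGE))

# Change any one ingredient: different ID.
reworded = git_object_id("commit", commit_body(TREE, PARENT, WHO, "Another message."))
reparented = git_object_id("commit", commit_body(TREE, TREE, WHO, MESSAGE))
refiled = git_object_id("commit", commit_body(PARENT, PARENT, WHO, MESSAGE))

print(original == same)                                     # True
print(len({original, reworded, reparented, refiled}) == 4)  # True
```

Because the parent ID is part of the hashed text, rewriting one commit changes the ID of every commit that descends from it, which is why a commit can be treated as immutable.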

    1. internals
    2. Git Objects
    3. Git objects
         $ ls .git/
         branches  config  description  HEAD  hooks  index  info  logs  objects  refs
    4. The blob
         $ git init
         Initialized empty Git repository in /tmp/test/.git/
         $ echo "test_content" | git hash-object -w -t blob --stdin
         915e94ff1ac3818f1e458534b0228a12a99cd6c5
         $ ls .git/objects/91/5e94ff1ac3818f1e458534b0228a12a99cd6c5
         .git/objects/91/5e94ff1ac3818f1e458534b0228a12a99cd6c5
         $ cat .git/objects/91/5e94ff1ac3818f1e458534b02* | zlib_inflate -d
         blob 13\0test_content
    5. The tree
         tree [content size]\0
         100644 blob a906cb README
         100755 blob 6f4e32 run
         040000 tree 1f7a4e src
    6. The commit
         $ git commit file
         [master dbaf944] This is a commit message.
         $ cat .git/objects/db/af944a4a9eb72af64042b1e3a128936000dfc2 | zlib_inflate -d
         commit 318
         tree 47ec7a250164a21cb14eb64618c3a903db0b7420
         parent 402b26df0644f09fc62842c0a4a44a0a3345c530
         author Manu <m.cupcic@criteo.com> 1380977766 +0200
         committer Manu <m.cupcic@criteo.com> 1380977766 +0200

         This is a commit message.
    7. The commit
         • Is identified by
             • a snapshot of the repo state (the tree)
             • parent commit(s)
             • a commit message
         • Is immutable
         • Has a deterministic hash (SHA1)
         • Commits form a linked list: the history
    8. Git References
         $ cat .git/refs/heads/master
         dbaf944a4a9eb72af64042b1e3a128936000dfc2
         $ cat .git/HEAD
         ref: refs/heads/master
         $ echo "dbaf944a4a9eb72af64042b1e3a128936000dfc2" > .git/refs/heads/newbranch
         $ git checkout newbranch
         Switched to branch 'newbranch'
    9. Git References
         $ git tag 1.0
         dbaf944a4a9eb72af64042b1e3a128936000dfc2
         $ cat .git/refs/tags/1.0
         dbaf944a4a9eb72af64042b1e3a128936000dfc2
    10. Take home message
         • Git stores a snapshot of the whole repo at each commit.
         • The SHA1 of a commit depends only on its content (the tree), message, author, committer and parent(s).
         • A git branch/tag is a 40-digit hex number stored in a file.
    11. Things we can play with
         git reflog
         git fsck
         git pack
         git config
         git rebase -i
         git reset
         git refspecs
         git stash
         git add -p
         git log (advanced stuff)
         git pull --rebase
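The blob and reference slides above can be mimicked without a Git checkout. The sketch below assumes the loose-object layout shown in the deck (a `<type> <size>` header, a NUL byte, then the content, zlib-compressed under `.git/objects/<first 2 hex digits>/<remaining 38>`) and models the files under `.git/` as a plain dictionary. `loose_object`, `resolve_ref` and `FILES` are hypothetical names introduced for illustration only.

```python
import hashlib
import zlib


def loose_object(obj_type: str, body: bytes):
    """Return (sha1, path, compressed bytes) for a loose object with this body."""
    raw = f"{obj_type} {len(body)}".encode() + b"\x00" + body
    sha = hashlib.sha1(raw).hexdigest()
    path = f".git/objects/{sha[:2]}/{sha[2:]}"
    return sha, path, zlib.compress(raw)


def resolve_ref(name: str, files: dict) -> str:
    """Follow 'ref: <path>' indirections until a plain hash is reached."""
    content = files[name].strip()
    while content.startswith("ref: "):
        content = files[content[len("ref: "):]].strip()
    return content


# `echo "test_content"` writes the text plus a trailing newline: 13 bytes.
sha, path, stored = loose_object("blob", b"test_content\n")
raw = zlib.decompress(stored)
print(raw)  # b'blob 13\x00test_content\n'

# A dictionary standing in for the files shown on the reference slides.
FILES = {
    "HEAD": "ref: refs/heads/master\n",
    "refs/heads/master": "dbaf944a4a9eb72af64042b1e3a128936000dfc2\n",
    "refs/tags/1.0": "dbaf944a4a9eb72af64042b1e3a128936000dfc2\n",
}
print(resolve_ref("HEAD", FILES))  # dbaf944a4a9eb72af64042b1e3a128936000dfc2
```

HEAD normally holds the name of a branch rather than a hash, so committing only has to rewrite the single branch file that HEAD points to.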
