Git internals

  • Internal model of the commit object. DEMO: git cat-file -p
  • The most important Git object is the COMMIT, and the most important thing about a commit is that it is IMMUTABLE. Why is that important?
    A commit is primarily defined by three things: a snapshot of the working directory (the "disk" state), a commit message and, most importantly, a parent commit. Every commit has a pointer to its parent. This is what defines a history of commits: a linked list of commits.
    So, if you change:
    - a single file -> different commit
    - the commit message -> different commit
    - the commit parent or parents -> different commit
    A commit is uniquely identified by its SHA1. A SHA1 is deterministic: a snapshot with the exact same content will have the same SHA1. A commit referring to the same snapshot, the same parent commit, the same commit message and the same author/committer metadata will be identified by the same SHA1.
    So why is this important? It matters because of Git's initial design choices. Git is primarily a content-addressable filesystem that stores every version of every object as a distinct object, accessible forever. Any file, any snapshot, any commit that has been archived in Git can be retrieved forever by its SHA1.
    These files are stored in full, every time. This is a major difference from other versioning systems such as Subversion and Perforce, which store diffs of files. I'm making simplifications here, but this is true to a first approximation.
    Why do we care? It means that every snapshot that has been committed once can always be retrieved. Keep this in mind, as it will be important later.
  • A commit has a pointer to a tree, which describes the entire content of the Git repo.

Git internals: Presentation Transcript

  • Git internals
  • Git Objects
  • Git objects
        $ ls .git/
        branches  config  description  HEAD  hooks  index  info  logs  objects  refs
  • The blob
        $ git init
        Initialized empty Git repository in /tmp/test/.git/
        $ echo "test_content" | git hash-object -w -t blob --stdin
        915e94ff1ac3818f1e458534b0228a12a99cd6c5
        $ ls .git/objects/91/5e94ff1ac3818f1e458534b0228a12a99cd6c5
        .git/objects/91/5e94ff1ac3818f1e458534b0228a12a99cd6c5
        $ cat .git/objects/91/5e94ff1ac3818f1e458534b02* | zlib_inflate -d
        blob 13\0test_content
    The inflated object is a header ("blob", the content size in bytes, then a NUL byte, shown here as \0) followed by the content itself.
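The blob slide above can be reproduced without Git at all. The following is a minimal Python sketch, assuming only the object layout shown on the slide (a `<type> <size>` header, a NUL byte, then the content). The function names `git_object`, `git_object_id` and `loose_object_path` are illustrative, not part of any Git API.

```python
import hashlib
import zlib


def git_object(obj_type: str, content: bytes) -> bytes:
    """Return the uncompressed object: '<type> <size>' + NUL + content."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return header + content


def git_object_id(obj_type: str, content: bytes) -> str:
    """The object id is the SHA-1 of header + content, as 40 hex digits."""
    return hashlib.sha1(git_object(obj_type, content)).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Loose objects live under .git/objects/<first 2 digits>/<other 38>."""
    return f".git/objects/{object_id[:2]}/{object_id[2:]}"


if __name__ == "__main__":
    content = b"test_content\n"  # the bytes that `echo "test_content"` emits
    raw = git_object("blob", content)
    print(raw)  # header and payload, before compression
    print(loose_object_path(git_object_id("blob", content)))
    # On disk the object is zlib-compressed; inflating gives the bytes back.
    assert zlib.decompress(zlib.compress(raw)) == raw
```

Because the id is computed from the bytes alone, hashing the same content twice always names the same object, which is what makes the store content-addressable.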
  • The tree
        tree [content size]\0
        100644 blob a906cb README
        100755 blob 6f4e32 run
        040000 tree 1f7a4e src
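The tree slide shows the pretty-printed form. As a rough sketch of the bytes behind it, assuming the usual layout of one `<mode> <name>` entry, a NUL byte and the 20 raw bytes of the entry's SHA1, a tree can be serialized and hashed as below. The slide only gives abbreviated ids, so the example entries reuse two well-known ids (the empty blob and the empty tree) as placeholders; `tree_body` and `tree_id` are illustrative names.

```python
import hashlib

EMPTY_BLOB = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"  # id of an empty file
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # id of an empty tree


def tree_body(entries):
    """Serialize (mode, name, hex_id) tuples into the tree's payload."""
    body = b""
    # Simplification: plain sort by name. Git's real ordering compares
    # directory names as if they ended with '/'.
    for mode, name, hex_id in sorted(entries, key=lambda e: e[1]):
        body += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(hex_id)
    return body


def tree_id(entries):
    """SHA-1 of 'tree <size>' + NUL + payload."""
    body = tree_body(entries)
    header = f"tree {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


if __name__ == "__main__":
    entries = [
        ("100644", "README", EMPTY_BLOB),  # regular file
        ("100755", "run", EMPTY_BLOB),     # executable file
        ("40000", "src", EMPTY_TREE),      # sub-directory
    ]
    print(tree_id(entries))
```

Note that the stored mode for a directory is `40000`; `git cat-file -p` pads it to `040000` for display.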
  • The commit
        $ git commit file
        [master dbaf944] This is a commit message.
        $ cat .git/objects/db/af944a4a9eb72af64042b1e3a128936000dfc2 | zlib_inflate -d
        commit 318\0tree 47ec7a250164a21cb14eb64618c3a903db0b7420
        parent 402b26df0644f09fc62842c0a4a44a0a3345c530
        author Manu <m.cupcic@criteo.com> 1380977766 +0200
        committer Manu <m.cupcic@criteo.com> 1380977766 +0200

        This is a commit message.
  • The commit
    • Is identified by
      • a snapshot of the repo state (the tree)
      • parent commit(s)
      • a commit message
    • Is immutable
    • Has a deterministic hash (SHA1)
    • Commits form a linked list: the history
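The properties listed on this slide follow directly from how the id is computed. Here is a small Python sketch, assuming the commit layout shown on the previous slide (`tree`, `parent`, `author` and `committer` lines, a blank line, then the message). The helper `commit_id` is hypothetical, and the printed id is not claimed to match the one on the slide, since the exact bytes of the original commit cannot be recovered from the transcript.

```python
import hashlib


def commit_id(tree, parents, author, committer, message):
    """Hash a commit built from its tree, parent(s), identities and message."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    body = "\n".join(lines).encode("utf-8")
    header = f"commit {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


if __name__ == "__main__":
    who = "Manu <m.cupcic@criteo.com> 1380977766 +0200"
    tree = "47ec7a250164a21cb14eb64618c3a903db0b7420"
    parent = "402b26df0644f09fc62842c0a4a44a0a3345c530"

    first = commit_id(tree, [parent], who, who, "This is a commit message.\n")
    again = commit_id(tree, [parent], who, who, "This is a commit message.\n")
    other = commit_id(tree, [parent], who, who, "A different message.\n")

    print(first == again)  # same inputs, same id
    print(first == other)  # any change to the inputs gives a new id
```

Since the parent id is part of the hashed bytes, rewriting one commit changes the id of every commit that descends from it. That is why a commit is effectively immutable: "editing" it really means creating a new one.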
  • Git References
        $ cat .git/refs/heads/master
        dbaf944a4a9eb72af64042b1e3a128936000dfc2
        $ cat .git/HEAD
        ref: refs/heads/master
        $ echo "dbaf944a4a9eb72af64042b1e3a128936000dfc2" > .git/refs/heads/newbranch
        $ git checkout newbranch
        Switched to branch 'newbranch'
  • Git References
        $ git tag 1.0 dbaf944a4a9eb72af64042b1e3a128936000dfc2
        $ cat .git/refs/tags/1.0
        dbaf944a4a9eb72af64042b1e3a128936000dfc2
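The two reference slides boil down to "a ref is a small file". The Python sketch below makes that concrete under a simplifying assumption: every ref is a loose file under the Git directory, so packed refs are ignored. `write_ref` and `resolve_ref` are illustrative helpers, not Git commands, and the demo runs against a throwaway directory rather than a real repository.

```python
import os
import tempfile


def write_ref(git_dir, name, value):
    """Create or overwrite a ref file such as HEAD or refs/heads/master."""
    path = os.path.join(git_dir, name)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(value + "\n")


def resolve_ref(git_dir, name="HEAD", max_depth=5):
    """Follow 'ref: <path>' indirections until an object id is reached."""
    for _ in range(max_depth):
        with open(os.path.join(git_dir, name)) as f:
            value = f.read().strip()
        if not value.startswith("ref: "):
            return value  # a 40-digit hex object id
        name = value[len("ref: "):]  # symbolic ref: look up the target file
    raise ValueError("too many levels of symbolic refs")


if __name__ == "__main__":
    commit = "dbaf944a4a9eb72af64042b1e3a128936000dfc2"
    git_dir = tempfile.mkdtemp()  # stands in for a real .git directory
    write_ref(git_dir, "refs/heads/master", commit)
    write_ref(git_dir, "refs/tags/1.0", commit)
    write_ref(git_dir, "HEAD", "ref: refs/heads/master")
    print(resolve_ref(git_dir))                   # HEAD -> master -> commit id
    print(resolve_ref(git_dir, "refs/tags/1.0"))  # a tag is resolved the same way
```

Creating a branch or a lightweight tag therefore costs one tiny file, however large the repository is.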
  • Take home message
    • Git stores a snapshot of the whole repo at each commit.
    • The SHA1 of a commit depends only on its content: the tree, the message, the author and committer (with their timestamps) and the parent(s).
    • A Git branch or tag is a 40-digit hex number stored in a file.
  • Things we can play with
    • git reflog
    • git fsck
    • git pack
    • git config
    • git rebase -i
    • git reset
    • git refspecs
    • git stash
    • git add -p
    • git log (advanced stuff)
    • git pull --rebase