Git Internals

An explanation of how a Git repo is organized, the types of objects it contains, and the relations between them.

Git Internals Presentation Transcript

  • GIT Internals Pedro Melo <{mailto,xmpp}:melo@simplicidade.org>
  • A short GIT History (personal take) • 2002 to Apr 2005: The BitKeeper Wars • Apr 2005: Episode IV - A New Hope • July 2005: Hamano is the new maintainer • Late 2008: GitHub hits the spotlight “I’m an egotistical bastard, and I name all my projects after myself. First Linux, now git.” – Linus
  • GIT rules (without !!) • Track content, not changes • Simple repository • Complex software • It's easier to update the software than to update all the repos created so far Git Mantra: http://bit.ly/git-phylosophy
  • In other words, I'm right. I'm always right, but sometimes I'm more right than other times. And dammit, when I say "files don't matter", I'm really really Right(tm). Linus
  • Strong Points • Non-Linear development • Distributed Development • Centralized development is a subcase • Efficiency • Toolkit Design
  • Objects • Git repositories store objects • Stored in the Object Database • Inside the Git directory • .git at the root of your project • Four major object types • Objects are compressed for storage (zlib) • The object ID is the SHA-1 of header + content
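As a rough illustration of the storage scheme on this slide, the sketch below writes one object the way Git stores "loose" objects: header plus content, hashed with SHA-1, compressed with zlib, and saved under a directory named after the first two hex characters of the ID. The function names `write_loose_object` and `read_loose_object` are made up for this example, and the sketch writes to a temporary directory rather than a real `.git/objects`.

```python
import hashlib
import os
import tempfile
import zlib
from typing import Tuple


def write_loose_object(objects_dir: str, obj_type: str, content: bytes) -> str:
    """Store one object as a zlib-compressed file named after its SHA-1."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    store = header + content
    object_id = hashlib.sha1(store).hexdigest()
    # Loose objects live at <objects_dir>/<first 2 hex chars>/<remaining 38>.
    folder = os.path.join(objects_dir, object_id[:2])
    os.makedirs(folder, exist_ok=True)
    with open(os.path.join(folder, object_id[2:]), "wb") as fh:
        fh.write(zlib.compress(store))
    return object_id


def read_loose_object(objects_dir: str, object_id: str) -> Tuple[str, bytes]:
    """Read an object back and split it into (type, content)."""
    path = os.path.join(objects_dir, object_id[:2], object_id[2:])
    with open(path, "rb") as fh:
        store = zlib.decompress(fh.read())
    header, _, content = store.partition(b"\0")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type, content


# Demo against a throwaway directory standing in for .git/objects.
with tempfile.TemporaryDirectory() as demo_dir:
    demo_id = write_loose_object(demo_dir, "blob", b"I like pizza with apples\n")
    demo_type, demo_content = read_loose_object(demo_dir, demo_id)
```

Because the file name is derived from the content, writing the same content twice lands on the same path, which is why identical files cost nothing extra to store.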
  • The Blob • Files are stored as blobs • Only content, no metadata
  • Meet the blob blob [content_size]\0 Your content goes here after the header I like pizza with apples
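The blob layout above can be checked with a few lines of Python. This is a minimal sketch, with `blob_id` as a name chosen for the example; it should agree with what `git hash-object` reports for the same bytes.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the ID of a blob: SHA-1 over 'blob <size>\\0' + content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


empty_id = blob_id(b"")                # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
hello_id = blob_id(b"hello world\n")   # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Note that the file name never enters the hash: a blob is content only, so two files with the same bytes share one blob.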
  • The tree • Trees store directories • Mode, type, pointer and name • Recursive, trees can contain trees • Displayed as a simple text listing (entries are stored in a compact binary form)
  • Meet the tree tree [content_size]\0 100644 blob b5f21a README 100644 blob afe433 Makefile.PL 040000 tree a42cd0 lib
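The listing on the slide is the human-readable view of a tree. As a sketch of what is hashed underneath, the code below serializes entries in the on-disk form (mode, space, name, NUL byte, then the raw 20-byte SHA-1) and computes the tree ID. The helper names `object_id` and `tree_content` are invented for this example, and full 40-character IDs are used because the abbreviated ones on the slide cannot be expanded.

```python
import hashlib


def object_id(obj_type: str, content: bytes) -> str:
    """SHA-1 over '<type> <size>\\0' + content."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def tree_content(entries) -> bytes:
    """Serialize (mode, name, hex_sha) entries into tree object content.

    Modes are written without a leading zero, so a directory is "40000".
    """
    def sort_key(entry):
        mode, name, _ = entry
        # Git orders directories as if their name ended with "/".
        return name.encode("utf-8") + (b"/" if mode == "40000" else b"")

    body = b""
    for mode, name, hex_sha in sorted(entries, key=sort_key):
        body += mode.encode("ascii") + b" " + name.encode("utf-8") + b"\0"
        body += bytes.fromhex(hex_sha)
    return body


EMPTY_BLOB = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
empty_tree_id = object_id("tree", tree_content([]))
readme_tree = tree_content([("100644", "README", EMPTY_BLOB)])
```

A tree that points at another tree is how directories nest, so changing one file deep in the hierarchy produces new tree objects only along the path up to the root.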
  • The commit • The object that makes history • Pointer to a tree and the parent(s) commits if any • Author, committer and commit message
  • Meet the commit... commit [content_size]\0 tree 23edfc author Pedro Melo <melo@mini.me> 1243036800 committer Pedro Melo <melo@mini.me> 1243036800 commit without a parent usually called first commit
  • ...and its child the other commit commit [content_size]\0 tree fde45c parent 3454df author Pedro Melo <melo@mini.me> 1243036932 committer Pedro Melo <melo@mini.me> 1243036932 and we fixed that nasty bug after all, they do tend to crop up
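To make the parent link concrete, here is a small sketch that builds the text of two commit objects and hashes them, so the child's content embeds the ID of the first commit. `commit_content` and `object_id` are names chosen for this example. The `+0000` timezone suffix is an assumption added here (the slides show only the timestamp), and the empty-tree ID stands in for a real tree.

```python
import hashlib


def object_id(obj_type: str, content: bytes) -> str:
    """SHA-1 over '<type> <size>\\0' + content."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def commit_content(tree, parents, who, timestamp, message, tz="+0000") -> bytes:
    """Build commit content: tree, zero or more parents, author, committer, message."""
    lines = [f"tree {tree}"]
    lines += [f"parent {parent}" for parent in parents]
    lines.append(f"author {who} {timestamp} {tz}")
    lines.append(f"committer {who} {timestamp} {tz}")
    # A blank line separates the headers from the commit message.
    return ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")


EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # placeholder tree
WHO = "Pedro Melo <melo@mini.me>"

first = commit_content(EMPTY_TREE, [], WHO, 1243036800, "first commit")
first_id = object_id("commit", first)

child = commit_content(EMPTY_TREE, [first_id], WHO, 1243036932, "fixed that nasty bug")
child_id = object_id("commit", child)
```

Since the parent ID is part of the hashed content, rewriting an old commit changes the ID of every commit that descends from it.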
  • The tag • A name for a particular commit • Can contain a message • Optionally GPG signed • Allows for cryptographically secure releases
  • Meet the tag tag [content_size]\0 object 123fec type commit tag v1 tagger Pedro Melo <melo@mini.me> 1243037423 made it to 1.0!
  • Git Data Model Recap • Immutable objects • A file per object • Repacked into object packs for efficiency • Organized as a directed acyclic graph
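The "directed acyclic graph" in the recap can be pictured as commits pointing at their parents. The sketch below walks such a graph breadth-first from one commit back to the root. The single-letter commit names and the `ancestors` function are hypothetical, chosen only to keep the example readable.

```python
from collections import deque


def ancestors(parents_of, start):
    """Return start and every commit reachable through parent links."""
    order = []
    visited = {start}
    queue = deque([start])
    while queue:
        commit = queue.popleft()
        order.append(commit)
        for parent in parents_of.get(commit, []):
            if parent not in visited:  # a commit may be reachable by two paths
                visited.add(parent)
                queue.append(parent)
    return order


# "d" is a merge commit with two parents; "a" is the first commit.
history = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
reachable = ancestors(history, "d")  # ['d', 'b', 'c', 'a']
```

Links only ever point backwards in time, which is why the graph has no cycles and why a single commit ID identifies its whole history.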
  • [Diagram sequence: the objects created for proj/ (Makefile.PL, lib/Cool.pm) over successive commits]
  • References • “Names” for commits • Mutable: they point to a specific commit and move to the new one after each commit • A branch is a reference, a name for a commit • Special HEAD reference: points to a reference
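A toy model may help show how references differ from objects: commits never change, while branch names move. The `ToyRepo` class below is purely illustrative and not how Git is implemented; it keeps branches in a dictionary and lets HEAD name the current branch.

```python
class ToyRepo:
    """In-memory model of branches and HEAD (illustrative only)."""

    def __init__(self):
        self.refs = {}        # branch name -> commit id
        self.parents = {}     # commit id -> parent commit id (or None)
        self.head = "master"  # HEAD points to a reference, not to a commit

    def commit(self, commit_id):
        # The new commit records the current tip as its parent,
        # then the branch named by HEAD moves to the new commit.
        self.parents[commit_id] = self.refs.get(self.head)
        self.refs[self.head] = commit_id

    def branch(self, name):
        # A new branch is just another name for the current commit.
        self.refs[name] = self.refs[self.head]

    def checkout(self, name):
        if name not in self.refs:
            raise KeyError(f"unknown branch: {name}")
        self.head = name


repo = ToyRepo()
repo.commit("c1")
repo.commit("c2")
repo.branch("test")
repo.checkout("test")
repo.commit("c3")   # only "test" moves; "master" stays at c2
```

Creating a branch copies no files and no history; it only adds one more name, which is why branching is cheap.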
  • [Diagram sequence: the master, test and HEAD references moving as commits are added, followed by Merge, Rebase, and Rebase + Merge examples]
  • Non-SCM uses for Git • Leverage strengths • immutable objects • network pulls transfer only missing objects • fast checkout (compared to a copy, less to read) • easy rollback
  • Beware of weak points • Always stores full copy of files • not good for backups of DB dumps • Full history means more disk space • this might change as “shallow clones” gain functionality...
  • Content distribution • Updates done in a master, central repository • Hierarchy of slave repositories • Fast sync between repositories, fast checkout • Can be automated with hooks • Useful if you have lots of static files, faster than rsync
  • Read-only filesystem • Design a web server that fetches objects directly from the object database • Compact storage, efficient retrieval • Packs of objects are also very VM friendly, mmap ready • Some open-source solutions already available
  • Wiki/Ticketing backend • Use a git repository as storage for wiki or ticketing systems • Good match for distributed development • Several open-source solutions already available • ... but similar to SCM usages
  • That’s all folks! • I’ll be around #codebits, feel free to ask me stuff • If you want a git as an SCM demo, let's get organized and I’ll do an impromptu presentation, or even private lapdan^H^H^H^H^Hdemos • After #codebits <{mailto,xmpp}:melo@simplicidade.org>
  • About Git: http://git-scm.com/ • Git Internals: http://peepcode.com/products/git-internals-pdf • Git book: http://progit.org/ • About Me: http://simplicidade.org/notes/ • @pedromelo • {mailto,xmpp}:melo@simplicidade.org • skype:melopt • http://github.com/melo • http://www.slideshare.net/melopt