FAQ #1
How Does Git Make Up The Working Directory?
How Does Git Know The Current Commit?
Answer: HEAD
HEAD Points To A Reference Using The ref Format (Not SHA-1)
$ cat .git/HEAD
ref: refs/he...
HEAD
$ cat .git/HEAD
ref: refs/heads/master
$ git branch
first
* master
$
$ git symbolic-ref HEAD refs/heads/first
$ cat ....
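The HEAD lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `resolve_head` is a hypothetical helper (not a Git API), it handles only loose ref files, and it runs against a throwaway directory that mimics the `.git` layout from the slides rather than a real repository.

```python
import os
import tempfile


def resolve_head(git_dir):
    """Return (ref_name, sha1); ref_name is None when HEAD holds a raw hash."""
    with open(os.path.join(git_dir, "HEAD")) as f:
        value = f.read().strip()
    if value.startswith("ref: "):
        # Symbolic HEAD: follow the ref file it names.
        ref = value[len("ref: "):]
        with open(os.path.join(git_dir, ref)) as f:
            return ref, f.read().strip()
    return None, value


# Build a throwaway directory that mimics the layout shown on the slides.
git_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(git_dir, "refs", "heads"))
with open(os.path.join(git_dir, "refs", "heads", "master"), "w") as f:
    f.write("003b5e66caa89a6228c7b4d91e0475e56bf1bdf6\n")
with open(os.path.join(git_dir, "HEAD"), "w") as f:
    f.write("ref: refs/heads/master\n")

print(resolve_head(git_dir))
```

Switching branches with `git symbolic-ref HEAD refs/heads/first` would, in this model, only rewrite the one-line HEAD file; the objects themselves stay untouched.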
Internal Data Structure
[Diagram: blob, tree, and commit objects linked by tree and parent pointers, with refs/heads/master and refs/heads/first ...]
FAQ #2
Cloned. Now Fetch Or Pull?
Fetch / Pull
Fetch Or Pull To Get The Latest Code?
Fetch
● Just Fetches The Remote Repository’s Objects And
References Into Local Git Internal Storage
● If You Need The Changes On Y...
Fetch
Refspec Describes Source / Destination
$ cat .git/config | grep remote -A3
[remote "origin"]
url = git://10.0.0.1/gi...
Fetch: Before
url = git://10.0.0.1/git/simpsons.git
fetch = +refs/heads/*:refs/remotes/origin/*
[Diagram: local object graph before the fetch: tree, blob, tree, blob, a134f ...]
Fetch: After
url = git://10.0.0.1/git/simpsons.git
fetch = +refs/heads/*:refs/remotes/origin/*
[Diagram: local object graph after the fetch: tree, blob, tree, blob, a134f ...]
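To make the refspec `+refs/heads/*:refs/remotes/origin/*` concrete, here is a rough Python sketch of how a source ref on the remote could be mapped to a destination ref locally. `map_ref` is a hypothetical helper written for this illustration; it assumes at most one trailing `*` on the source side and ignores everything else Git's real refspec handling does.

```python
def map_ref(refspec, remote_ref):
    """Map a remote ref name to its local destination, or None if it does not match."""
    force = refspec.startswith("+")  # leading "+" allows non-fast-forward updates
    src, dst = refspec.lstrip("+").split(":")
    if src.endswith("*"):
        prefix = src[:-1]
        if not remote_ref.startswith(prefix):
            return None
        # Whatever the "*" matched on the source side replaces "*" in the destination.
        return dst.replace("*", remote_ref[len(prefix):], 1), force
    return (dst, force) if remote_ref == src else None


spec = "+refs/heads/*:refs/remotes/origin/*"
print(map_ref(spec, "refs/heads/master"))
print(map_ref(spec, "refs/tags/v1.0"))
```

Under this model, the remote's `refs/heads/master` lands in `refs/remotes/origin/master`, which is why the local `master` does not move until a merge is done.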
git merge origin/master
[Diagram: object graph after the merge: trees and blobs (a134f son, 799cf pets, 7cc07 cat, 65464 son) and two commits ...]
Pull
Pull Is Just Shorthand For Fetch && Merge
Merge Conflicts May Occur…
Pull Is Sufficient For Simple Projects
Wrap-up
In Short,
Git Is A Content-Addressable File System
Blob, Tree, Commit, Reference. That’s It =3
http://www.juliagiff.com/wp...
Thank you :)
http://jeancharpentier.files.wordpress.com/2012/02/capture-plein-c3a9cran-01022012-230955.jpg
Slide-share
http://www.slideshare.net/SeongJaePark1/deep-darkside-ofgit
The Latest Version Of These Slides Will Be There
References
http://git-scm.com/book
http://www.youtube.com/watch?v=4XpnKHJAok8
http://en.wikipedia.org/wiki/The_Simpsons
This slide has been used for
Samsung Open Source CONference 2014
This work by SeongJae Park is licensed under the
Creative Commons Attribution-ShareAlike 3.0 Unported
License. To view a c...

Deep dark-side of git: How git works internally

Describes how git works internally using small and perfect plumbing commands.

These slides were used at GDG DevFest 2014 and SOSCON 2014.
The slides may be updated later; the latest version will always be available from this page.

Published in: Software
  1. 1. How GIT Works Internally SeongJae Park <sj38.park@gmail.com>
  2. 2. Nice To Meet You SeongJae Park sj38.park@gmail.com
  4. 4. Git DVCS (Distributed Version Control System) Made By Linus Torvalds For Linux http://git-scm.com/images/logos/downloads/Git-Logo-2Color.png http://cdn.memegenerator.net/instances/400x/37078331.jpg
  5. 5. Git Many Projects Use Git Because It’s Awesome http://blog.appliedis.com/wp-content/uploads/2013/11/android1.png http://upload.wikimedia.org/wikipedia/en/4/40/Octocat,_a_Mascot_of_Github.jpg http://upload.wikimedia.org/wikipedia/commons/thumb/3/35/Tux.svg/512px-Tux.svg.png http://git-scm.com/images/logos/downloads/Git-Logo-2Color.png
  6. 6. Git Hard To Learn Confusing For CVCS Users Push? Pull? Fetch? Rebase? HEAD??? http://www.quickmeme.com/img/fd/fd09e17b3393b2ea1cd7e52af1ad7c77f3c2d7a83e9f47d4b90ba3af52dde329.jpg http://git-scm.com/images/logos/downloads/Git-Logo-2Color.png
  9. 9. Git: The Information Manager From Hell
      That’s Why It’s So Confusing And Hard To Learn
      $ git log e83c516
      commit e83c5163316f89bfbde7d9ab23ca2e25604af290
      Author: Linus Torvalds <torvalds@ppc970.osdl.org>
      Date: Thu Apr 7 15:13:13 2005 -0700

          Initial revision of "git", the information manager from hell
      http://www.youblob.com/sites/default/files/styles/large/public/field/image/frontlego1.png?itok=XA5CXt84
  12. 12. This Time, We Will...
      See How Git Works From Scratch
      Just For Fun
      ...Or To Befriend Git
      Forget About The Complicated Commands This Time
      https://lh4.googleusercontent.com/gBpfuABUjSNi2RagtJrGi8TW-pmtgak_0qtGOGubihvKH-5-umreO9CwJgjX2kaA9E7RkLwtEwiDnoMtOgm4iMJ0IWhvXlzlKL1kNVUYWuNa-gLRtRoyNjkVYg
  14. 14. In Short, Git Is A Content-Addressable Storage System
      Blob, Tree, Commit, Reference. That’s It =3
      http://www.juliagiff.com/wp-content/uploads/2014/03/tldr_trollcat.jpg
  16. 16. Plumbers: The Unsung Heroes Behind Git
      ● Git Looks Graceful Thanks To The Plumbing Commands It Is Built From
        ○ The Wounded Feet Are What We Are Interested In
      http://cfile4.uf.tistory.com/image/182FF7244CFDDFB33CC999 http://cfile29.uf.tistory.com/image/18574F224CFDD89B163073
  17. 17. Again, From Scratch: VCS? Why? How?
  18. 18. Why VCS? Usual Life Of File: FileA ver 0, FileB ver 0
  19. 19. Why VCS? Usual Life Of File: FileA ver 0, FileB ver 1, FileB ver 0
  20. 20. Why VCS? Usual Life Of File: FileA ver 0, FileB ver 1
  21. 21. Why VCS? Usual Life Of File: FileB ver 1, FileA ver 1, FileA ver 0
  22. 22. Why VCS? Usual Life Of File: FileB ver 1, FileA ver 1
  23. 23. Why VCS? Usual Life Of File: FileB ver 2, FileA ver 1, FileB ver 1
  24. 24. Why VCS? Usual Life Of File: FileB ver 2, FileA ver 1
  25. 25. Why VCS? Usual Life Of File: FileB ver 2, FileA ver 1
  28. 28. We Need A Version Control System
      A VCS Would...
      Record Every Change Safely And Efficiently
      Be Able To Check Out Any Version
      Make History Easy To Read
  29. 29. Brute-force Idea Version Control Using File System
  35. 35. Brute-force Idea
      Rename / Back Up Every File Whenever A Change Is Made
      $ ls
      foo.c
      foo_20140111.c
      foo_final.c
      foo_realfinal.c
      foo_planb.c
      foo_finalfinal.c
  37. 37. Brute-force Idea + History Isolation
      Keep The Working And History Directories Separate.
      Better, But...
      $ find . -type f
      ./working/foo.c
      ./history/foo_20140111.c
      ./history/foo_final.c
      ./history/foo_realfinal.c
      ./history/foo_planb.c
      ./history/foo_finalfinal.c
  39. 39. TODOs From Version Control Using FS
      Use Storage Space-Efficiently
      Easy History Searching
  40. 40. Mission #1: Store History Space-Efficiently
  44. 44. Basic Idea: Avoid Duplicated Objects
      Content-Addressable Storage System
      Key: SHA-1 Hash Of Object’s Content
      Value: Compressed Content
      Same Content Never Saved Twice
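The key/value idea on this slide can be sketched as a toy in-memory store in Python. This is a simplified illustration, not Git's implementation: `ContentStore` is a hypothetical class, and it hashes the raw content only, whereas Git also prepends a small header before hashing.

```python
import hashlib
import zlib


class ContentStore:
    """Toy content-addressable store: key = SHA-1 of content, value = compressed content."""

    def __init__(self):
        self._objects = {}

    def put(self, content: bytes) -> str:
        key = hashlib.sha1(content).hexdigest()
        # Identical content maps to the same key, so it is stored only once.
        if key not in self._objects:
            self._objects[key] = zlib.compress(content)
        return key

    def get(self, key: str) -> bytes:
        return zlib.decompress(self._objects[key])

    def __len__(self):
        return len(self._objects)


store = ContentStore()
k1 = store.put(b"homer\n")
k2 = store.put(b"homer\n")  # same content, same key, no second copy
k3 = store.put(b"bart\n")
print(k1 == k2, len(store))
```

Because the key is derived from the content, saving the same file twice costs nothing extra, which is the space-efficiency property the slide is after.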
  47. 47. Save / Load ‘homer’
      $ mkdir simpsons; cd simpsons; git init
      Initialized empty Git repository in simpsons/.git/
      $ echo 'homer' | git hash-object -w --stdin
      4aa0bfa07f1680c50a1567ecc37bc3b6aa567b8f
      $ find .git/objects/ -type f
      .git/objects/4a/a0bfa07f1680c50a1567ecc37bc3b6aa567b8f
      $ git cat-file -p 4aa0b
      homer
      $ git cat-file -t 4aa0b
      blob
  51. 51. What `hash-object -w` did
      import hashlib, os, zlib

      # Save compressed header + content at sha1 path
      def hash_object_w(content):
          header = b'blob %d\0' % len(content)    # object type, size, NUL byte
          store = header + content
          sha1 = hashlib.sha1(store).hexdigest()
          dir = '.git/objects/' + sha1[0:2] + '/'  # first two hex digits name the directory
          filename = sha1[2:]
          os.makedirs(dir, exist_ok=True)
          with open(dir + filename, 'wb') as f:
              f.write(zlib.compress(store))
          return sha1

      hash_object_w(b'homer\n')
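The reverse direction, roughly what `git cat-file -t` and `-p` do for a loose blob, can be sketched the same way. The helpers `write_blob` and `read_object` are hypothetical names for this illustration; the sketch uses a temporary directory instead of a real `.git/objects`, needs the full 40-character hash (no abbreviations), and ignores packed objects.

```python
import hashlib
import os
import tempfile
import zlib


def write_blob(objects_dir: str, content: bytes) -> str:
    """Store header + content, zlib-compressed, under a path derived from its SHA-1."""
    store = b"blob %d\0" % len(content) + content
    sha1 = hashlib.sha1(store).hexdigest()
    path = os.path.join(objects_dir, sha1[:2], sha1[2:])
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(zlib.compress(store))
    return sha1


def read_object(objects_dir: str, sha1: str):
    """Return (type, content) for a loose object written by write_blob."""
    path = os.path.join(objects_dir, sha1[:2], sha1[2:])
    with open(path, "rb") as f:
        store = zlib.decompress(f.read())
    # The header ends at the first NUL byte: b"<type> <size>".
    header, _, content = store.partition(b"\0")
    obj_type, size = header.split(b" ")
    assert int(size) == len(content)
    return obj_type.decode(), content


objects_dir = tempfile.mkdtemp()
sha1 = write_blob(objects_dir, b"homer\n")
print(read_object(objects_dir, sha1))
```

The type and size live inside the compressed payload, so the file name alone says nothing about what kind of object it is.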
  54. 54. Version Control Using Hash Value
      $ echo "bart" > son
      $ git hash-object -w son
      e00ddae83bdab443f4267426623aa34636c935f2
      $ echo "hugo" > son
      $ git hash-object -w son
      8e1e2f09585e021c9727585af72e10871d7be7ce
      $ # Need former version, "bart"
      $ git cat-file -p e00dd > son
      $ cat son
      bart
  57. 57. Version Control Using Hash Value ● DONE ○ Efficient Space Usage ○ Safe Record / Checkout Of History ● TODO ○ Support Directory Structure ○ History Management ○ Better Reference Than Hash Value https://www.sciencenews.org/sites/default/files/main/articles/sad_opener.jpg
  59. 59. WAIT! Q: What If There Is A Small Change Inside A Big File?
      $ du -h bigfile.c
      188K bigfile.c
      $ du -sh
      408K .
      $ echo '/* small change */' >> bigfile.c
      $ git commit -as -m "small change, big difference"
      $ du -sh
      496K .
  60. 60. WAIT! Q: What If There Is A Small Change Inside A Big File?
      A: Git Picks Up Only The Diff If Necessary
      But, Don’t Forget To Keep It Small, Simple
      $ du -sh
      496K .
      $ git gc
      Counting objects: 6, done.
      Delta compression using up to 4 threads.
      Compressing objects: 100% (4/4), done.
      Writing objects: 100% (6/6), done.
      Total 6 (delta 1), reused 0 (delta 0)
      $ du -sh
      388K .
  61. 61. Mission #2: Store History Of Directories
  64. 64. tree Object
      Points To Other Objects (By Hash) With A Name
      “A Root tree Object Is A Snapshot”
      [Diagram: a root tree labelled “I’m a snapshot” with entries mommy (blob a113f2), son (blob b8934), and pets (tree c9240); the pets tree holds cat (blob d9b13)]
  67. 67. tree object
      $ mkdir pets; echo 'snowball' > pets/cat
      $ git update-index --add son pets/cat
      $ git write-tree
      15ee76ed3e744b6796950d07f26283d033ea3ea7
      $ git cat-file -p 15ee7
      040000 tree 85ab72cf1946dc56392718a1aafb3c6f66c02072    pets
      100644 blob 8e1e2f09585e021c9727585af72e10871d7be7ce    son
      $ git cat-file -p 85ab7
      100644 blob 6a1f952e1baedcb3db93a3ea5e3389e5a87941e9    cat
      $ git cat-file -p 6a1f9
      snowball
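How a tree object could be serialized and hashed can be sketched in Python as follows. `tree_object` is a hypothetical helper for illustration only. It assumes the raw format is `<mode> <name>\0<20-byte binary SHA-1>` per entry, with entries sorted by name and directories compared as if their name ended in `/`; note that the raw mode for a directory is `40000`, while `cat-file -p` pads it to `040000`.

```python
import hashlib


def tree_object(entries):
    """entries: list of (mode, name, hex_sha1). Returns (hex sha1 of tree, raw bytes)."""

    def sort_key(entry):
        mode, name, _ = entry
        # Directories sort as if their name had a trailing slash.
        return name.encode() + (b"/" if mode == "40000" else b"")

    body = b""
    for mode, name, hex_sha1 in sorted(entries, key=sort_key):
        body += mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(hex_sha1)
    store = b"tree %d\0" % len(body) + body
    return hashlib.sha1(store).hexdigest(), store


# Blob hashes are copied from the slides; the resulting tree IDs are not asserted here.
pets_sha, _ = tree_object([("100644", "cat", "6a1f952e1baedcb3db93a3ea5e3389e5a87941e9")])
root_sha, _ = tree_object([
    ("40000", "pets", pets_sha),
    ("100644", "son", "8e1e2f09585e021c9727585af72e10871d7be7ce"),
])
print(pets_sha, root_sha)
```

Since a tree stores only names and hashes, changing one file produces a new blob and a new root tree, while every untouched subtree is reused by hash.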
  69. 69. Internal Data Structure
      [Diagram: a root tree with entries son (blob 8e1e2) and pets (tree 85ab7); the pets tree holds cat (blob 6a1f9)]
  72. 72. Version Control Using tree Object
      $ echo "bart" > son
      $ git update-index --add son
      $ git write-tree
      661e6ad514a7f05c46c2931280cb78a339d34ee2
      $ git cat-file -p 661e6
      040000 tree 85ab72cf1946dc56392718a1aafb3c6f66c02072    pets
      100644 blob e00ddae83bdab443f4267426623aa34636c935f2    son
      $ git cat-file -p e00dd
      bart
  74. 74. Internal Data Structure
      [Diagram: two root trees. The new one has son (blob e00dd) and pets (tree 85ab7); the earlier one has son (blob 8e1e2) and pets (tree 85ab7). Both share the pets tree, which holds cat (blob 6a1f9)]
  76. 76. Version Control Using tree Object ● DONE ○ Efficient Space Usage ○ Safe Record / Checkout Of History ○ Support Directory Structure ● TODO ○ History Management ○ Better Reference Than Hash Value https://www.sciencenews.org/sites/default/files/main/articles/sad_opener.jpg
  77. 77. Mission #3: Commit Message
  79. 79. commit Object
      Describes Who / When / Why The Change Was Made
      Points To A tree Object, Along With The Information Above
      http://modthink.com/wp-content/uploads/2013/05/WhoWhatWhenWhereWHY.jpg
  82. 82. commit Object
      $ echo '1st commit' | git commit-tree 661e6
      0ca7304ad6f5a40f8a26ba05b10b514ff2d8d8a0
      $ git cat-file -p 0ca73
      tree 661e6ad514a7f05c46c2931280cb78a339d34ee2
      author SeongJae Park <s**@gmail.com> 1410527921 +0900
      committer SeongJae Park <s**@gmail.com> 1410527921 +0900

      1st commit
      Annotations: Who (author / committer), When (timestamp), Why (message)
  83. 83. Version Control Using commit Object
      $ echo '2nd commit' | git commit-tree 15ee7 -p 0ca73
      003b5e66caa89a6228c7b4d91e0475e56bf1bdf6
      $ git cat-file -p 003b5
      tree 15ee76ed3e744b6796950d07f26283d033ea3ea7
      parent 0ca7304ad6f5a40f8a26ba05b10b514ff2d8d8a0
      author SeongJae Park <s**@gmail.com> 1410528231 +0900
      committer SeongJae Park <s**@gmail.com> 1410528231 +0900

      2nd commit
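A commit object is plain text with the same header-plus-hash treatment as blobs and trees. The sketch below shows one way it might be assembled in Python; `commit_object` is a hypothetical helper, and the author identity is a made-up placeholder because the slides mask the real e-mail address, so the hashes it prints will not match the ones on the slides.

```python
import hashlib


def commit_object(tree, message, author, timestamp, tz, parents=()):
    """Return (hex sha1, body bytes) for a commit pointing at `tree`."""
    lines = ["tree " + tree]
    lines += ["parent " + p for p in parents]  # zero parents for a root commit
    ident = "%s %d %s" % (author, timestamp, tz)
    lines += ["author " + ident, "committer " + ident, "", message]
    body = ("\n".join(lines) + "\n").encode()
    store = b"commit %d\0" % len(body) + body
    return hashlib.sha1(store).hexdigest(), body


who = "Homer Simpson <homer@example.com>"  # placeholder identity, not from the slides
first, first_body = commit_object(
    "661e6ad514a7f05c46c2931280cb78a339d34ee2", "1st commit", who, 1410527921, "+0900")
second, second_body = commit_object(
    "15ee76ed3e744b6796950d07f26283d033ea3ea7", "2nd commit", who, 1410528231, "+0900",
    parents=[first])
print(second_body.decode())
```

Because the parent hash is part of the hashed text, each commit ID depends on its whole ancestry, which is what links the snapshots into a history.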
  84. 84. Internal Data Structure
      That’s Why People Say, “A Commit Is A Snapshot”
      [Diagram: two commits, each pointing to its root tree, the second pointing to the first as parent; trees and blobs as before: pets (tree 85ab7), son (blob 8e1e2), cat (blob 6a1f9), son (blob e00dd)]
  86. 86. Version Control Using commit Object ● DONE ○ Efficient Space Usage ○ Safe Record / Checkout Of History ○ Support Directory Structure ○ Manage History Well ● TODO ○ Better Reference Than Hash Value https://www.sciencenews.org/sites/default/files/main/articles/sad_opener.jpg
  87. 87. Mission #4: Human Readable Name
  90. 90. Git References
      A File With A Human-Readable Name
      Storing The SHA-1 Value Of A commit Object
      Residing In .git/refs/
  93. 93. Git References Using echo
      $ echo "0ca7304ad6f5a40f8a26ba05b10b514ff2d8d8a0" > .git/refs/heads/first
      $ git log --pretty=oneline first
      0ca7304ad6f5a40f8a26ba05b10b514ff2d8d8a0 1st commit
      $ find .git/refs/heads -type f
      .git/refs/heads/first
      .git/refs/heads/master
  96. 96. Git References Using update-ref $ git update-ref refs/heads/master 003b5 $ git log --pretty=oneline master 003b5e66caa89a6228c7b4d91e0475e56bf1bdf6 2nd commit 0ca7304ad6f5a40f8a26ba05b10b514ff2d8d8a0 1st commit $ $ find .git/refs/heads -type f .git/refs/heads/first .git/refs/heads/master $ $ cat .git/refs/heads/master 003b5e66caa89a6228c7b4d91e0475e56bf1bdf6
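The echo and update-ref approaches can be compared side by side. This is a minimal sketch in a throwaway repository; the identity, the file name `father`, and the branch names `first` and `second` are illustrative, and it assumes the default file-based ref storage under .git/refs/.

```shell
#!/bin/sh
# Sketch: create one branch by hand and one with update-ref, then compare.
set -e

# Hypothetical identity for the demo commit.
export GIT_AUTHOR_NAME='Example User' GIT_AUTHOR_EMAIL='user@example.com'
export GIT_COMMITTER_NAME='Example User' GIT_COMMITTER_EMAIL='user@example.com'

repo=$(mktemp -d)
cd "$repo"
git init -q .

echo 'Homer' > father
git update-index --add father
commit=$(echo 'demo commit' | git commit-tree "$(git write-tree)")

# By hand: a branch is a text file that holds the full commit hash.
echo "$commit" > .git/refs/heads/first

# With plumbing: update-ref writes the same kind of file, after checking
# that the value names a real object.
git update-ref refs/heads/second "$commit"

via_git=$(git rev-parse --verify refs/heads/first)   # git reads the hand-made file
via_cat=$(cat .git/refs/heads/second)                # we read git's file
echo "first  -> $via_git"
echo "second -> $via_cat"
```

One practical difference: the hand-written file must contain the full 40-character hash, while update-ref also accepts an abbreviation such as 003b5 and expands it.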
  97. 97. Internal Data Structure [Diagram: two commit objects joined by a parent link, each pointing to a root tree; trees point to blobs and subtrees. Labels: 85ab7 pets (in both trees), 8e1e2 son, e00dd son, 6a1f9 cat]
  98. 98. Internal Data Structure [Same diagram, now with refs/heads/master and refs/heads/first each pointing to a commit]
  100. 100. Version Control Using Reference ● DONE ○ Efficient Space Usage ○ Safe Record / Checkout Of History ○ Support Directory Structure ○ Manage History Well ○ Easy To Remember Specific Snapshot ● TODO ○ ...cooperation? https://www.sciencenews.org/sites/default/files/main/articles/sad_opener.jpg
  101. 101. FAQ #1 How Does Git Make Up The Working Directory?
  104. 104. How Does Git Know The Current Commit? Answer: HEAD ● HEAD Normally Points To A Reference Using The ref Format (Not A SHA-1; It Holds A Raw SHA-1 Only When Detached) $ cat .git/HEAD ref: refs/heads/master
  107. 107. HEAD $ cat .git/HEAD ref: refs/heads/master $ git branch first * master $ $ git symbolic-ref HEAD refs/heads/first $ cat .git/HEAD ref: refs/heads/first $ git branch * first master
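The lookup chain from HEAD to a commit can be followed by hand. This is a minimal sketch in a throwaway repository; the identity, the file name `daughter`, and the branch name `first` are illustrative, and it assumes the branch is stored as a loose file under .git/refs/heads/.

```shell
#!/bin/sh
# Sketch: resolve HEAD manually, the way the slides describe it.
set -e

# Hypothetical identity for the demo commit.
export GIT_AUTHOR_NAME='Example User' GIT_AUTHOR_EMAIL='user@example.com'
export GIT_COMMITTER_NAME='Example User' GIT_COMMITTER_EMAIL='user@example.com'

repo=$(mktemp -d)
cd "$repo"
git init -q .

echo 'Lisa' > daughter
git update-index --add daughter
commit=$(echo 'demo commit' | git commit-tree "$(git write-tree)")
git update-ref refs/heads/first "$commit"

# Point HEAD at the branch; this rewrites .git/HEAD and nothing else.
git symbolic-ref HEAD refs/heads/first

head_line=$(cat .git/HEAD)        # "ref: refs/heads/first"
target=${head_line#ref: }         # strip the "ref: " prefix
resolved=$(cat ".git/$target")    # the branch file holds the commit hash

echo "HEAD     : $head_line"
echo "resolved : $resolved"
```

Note that git symbolic-ref changes only which branch HEAD names; it does not update the index or the working directory the way a checkout does.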
  109. 109. Internal Data Structure [Diagram: the commit / tree / blob graph with refs/heads/master and refs/heads/first each pointing to a commit, and .git/HEAD pointing to a branch reference]
  110. 110. FAQ #2 Cloned. Now, Fetch Or Pull?
  111. 111. Fetch / Pull Fetch Or Pull To Get Latest Code?
  114. 114. Fetch ● Only Copies The Remote Repository’s Objects And References Into The Local Git Internal Storage; The Working Directory Is Left Untouched ● If You Need The Changes In Your Working Directory, ○ Merge Them Manually Using git merge, Or ○ Check Out The Fetched Reference
  115. 115. Fetch ● The Refspec Describes Source And Destination $ cat .git/config | grep remote -A3 [remote "origin"] url = git://10.0.0.1/git/simpsons.git fetch = +refs/heads/*:refs/remotes/origin/* (Source: refs/heads/* On The Remote; Destination: refs/remotes/origin/* In The Local Repository)
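How the refspec turns a remote branch name into a local remote-tracking name can be sketched in plain shell. The function `map_ref` below is hypothetical and written only for illustration; it handles just the common form with a single trailing `*` on each side, not everything git's refspec syntax allows.

```shell
#!/bin/sh
# Sketch: apply a "<src>/*:<dst>/*" refspec to one ref name.
# map_ref is an illustrative helper, not a git command.
map_ref() {
    spec=${1#+}               # a leading '+' only permits non-fast-forward updates
    src=${spec%%:*}           # left of the colon: pattern on the remote
    dst=${spec#*:}            # right of the colon: pattern in the local repository
    src_prefix=${src%\*}      # drop the trailing '*'
    dst_prefix=${dst%\*}
    case $2 in
        "$src_prefix"*)
            # Keep whatever the '*' matched and move it under the destination.
            printf '%s\n' "$dst_prefix${2#"$src_prefix"}"
            ;;
        *)
            return 1          # the ref is not covered by this refspec
            ;;
    esac
}

mapped=$(map_ref '+refs/heads/*:refs/remotes/origin/*' refs/heads/master)
echo "$mapped"
```

With the refspec from the config above, the remote's refs/heads/master is stored locally as refs/remotes/origin/master, which is why fetching never overwrites your own refs/heads/master.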
  116. 116. Fetch: Before url = git://10.0.0.1/git/simpsons.git fetch = +refs/heads/*:refs/remotes/origin/* [Diagram: the remote git://10.0.0.1/git/simpsons.git holds two commits (labels: a134f son, 799cf pets, 7cc07 cat, 65464 son) with refs/heads/master and HEAD; the local file:///home/sjpark/simpsons holds one commit (labels: a134f son, 799cf pets, 7cc07 cat) with refs/heads/master and .git/HEAD]
  117. 117. Fetch: After url = git://10.0.0.1/git/simpsons.git fetch = +refs/heads/*:refs/remotes/origin/* [Diagram: the remote is unchanged; the local repository now holds both commits and has a new refs/remotes/origin/master next to refs/heads/master and .git/HEAD]
  118. 118. git merge origin/master [Diagram: the local repository before and after the merge; both panels show the two commits with refs/remotes/origin/master, refs/heads/first, and .git/HEAD]
  119. 119. Pull ● Pull Is Just Shorthand For Fetch && Merge ● Merge Conflicts May Occur… ● Pull Is Sufficient For Simple Projects
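The fetch-then-merge sequence can be watched step by step with two local repositories. This is a minimal sketch: the directory names `remote` and `local`, the identity, and the file contents are placeholders, and a local path stands in for the git:// URL from the slides.

```shell
#!/bin/sh
# Sketch: fetch moves only the remote-tracking ref; merge moves the branch.
set -e

# Hypothetical identity for the demo commits.
export GIT_AUTHOR_NAME='Example User' GIT_AUTHOR_EMAIL='user@example.com'
export GIT_COMMITTER_NAME='Example User' GIT_COMMITTER_EMAIL='user@example.com'

work=$(mktemp -d)

# "Remote" repository with one commit on master.
git init -q "$work/remote"
cd "$work/remote"
git symbolic-ref HEAD refs/heads/master   # fix the branch name for the demo
echo 'Bart' > son
git add son
git commit -q -m '1st commit'

# Clone it, then let the remote move ahead by one commit.
git clone -q "$work/remote" "$work/local"
echo 'Bart Simpson' > son
git commit -q -a -m '2nd commit'
remote_tip=$(git rev-parse master)

cd "$work/local"
before=$(git rev-parse master)

git fetch -q origin                       # step 1 of a pull
after_fetch=$(git rev-parse master)       # local branch has not moved
tracking=$(git rev-parse origin/master)   # remote-tracking ref has

git merge -q --ff-only origin/master      # step 2 of a pull
after_merge=$(git rev-parse master)

echo "before fetch : $before"
echo "after fetch  : $after_fetch (origin/master: $tracking)"
echo "after merge  : $after_merge"
```

Running git pull in the clone would perform both steps in one command; splitting them lets you inspect origin/master before deciding how to integrate it.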
  120. 120. Wrap-up
  121. 121. In Short, Git Is A Content-Addressable File System Blob, Tree, Commit, Reference. That’s It =3 http://www.juliagiff.com/wp-content/uploads/2014/03/tldr_trollcat.jpg
  122. 122. Thank you :) http://jeancharpentier.files.wordpress.com/2012/02/capture-plein-c3a9cran-01022012-230955.jpg
  123. 123. SlideShare http://www.slideshare.net/SeongJaePark1/deep-darkside-ofgit The Latest Version Of These Slides Is Available There
  124. 124. References http://git-scm.com/book http://www.youtube.com/watch?v=4XpnKHJAok8 http://en.wikipedia.org/wiki/The_Simpsons
  125. 125. These slides were used at Samsung Open Source CONference 2014
  126. 126. This work by SeongJae Park is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/.
