How Git Works
Let’s Git Deeper :)
Git Folder Structure
It’s all in the .git folder; this is where all the magic happens. Every item in that folder has a definite role in the Git magic, but the objects and refs folders hold the most important roles regarding stored data.
Git Data Structure
Git is a content-addressable file system. What does that mean? It is simply a key-value store where the key is a hash and the value is a blob, or a group of blobs referenced by their hashes.
Git uses cryptographic hashes for keys. A cryptographic hash is an algorithm that constructs a short digest from a sequence of bytes of any length, e.g. SHA-1 (160 bits), SHA-256 from the SHA-2 family (256 bits), Skein (256 bits and up). Git uses SHA-1 by default.
* The longer the digest, the less likely two different contents end up with the same key.
When you commit a file into Git, it calculates and remembers the hash of the contents of the file. When you retrieve the file, Git can verify that the hash of the data being retrieved exactly matches the hash that was computed when it was stored.
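The key-value idea can be sketched in a few lines of Python. This is only a toy in-memory model, not Git's code: the `ContentStore` class and its method names are made up for illustration, and it hashes the raw content, whereas real Git hashes a small header plus the content.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: the key is the SHA-1 of the value."""

    def __init__(self):
        self._objects = {}  # hex digest -> bytes

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        # Identical content always produces the same key, so it is kept once.
        self._objects[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self._objects[key]
        # Re-hash on retrieval to check the data still matches its key.
        if hashlib.sha1(data).hexdigest() != key:
            raise ValueError(f"object {key} is corrupt")
        return data


store = ContentStore()
first_key = store.put(b"hello")
second_key = store.put(b"hello")  # same content, same key, no second copy
```

Because the key is derived from the content, storing the same bytes twice costs nothing extra, and any change to the stored bytes is detectable on read.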
Let’s Git Objective :)
Git uses different kinds of objects to store data related to revisions: the blob (the basic hashed object), the tree, the commit, and the tag. The tag is a hybrid: an annotated tag is a real object, while a lightweight tag is only a reference.
Everything in Git is organised around those objects, so most of the plumbing tools in Git deal with them directly to handle saving the revisions behind the scenes.
Git Hash Object
The blob is the basic unit in Git. Every revision uses it, either as a standalone object or as part of a group of objects. Note that hash-object only shows you the hash and stores the data as a blob; no other information, not even a file name, is saved.
All objects, including annotated tag objects, are stored in the objects folder; lightweight tags exist only as files under refs.
git hash-object (“blob object”) -> hashes a single unnamed data object and, with -w, saves it in the objects folder
$ echo 'test content' | git hash-object -w --stdin
d670460b4b4aece5915caf5c68d12f560a9fe3e4
* Hashing side effect: when Git stores the contents of somefile.txt and the hash matches an object it already has, it knows it already holds a copy of that data and does not store it again. This process is called deduplication.
That Blob: What Exactly Does It Look Like?
You know that Git stores revisions as objects, but what does such an object actually look like? Let's find out.
Object creation steps:
1 - content, e.g. $content = "Hello, I’m content".
2 - header, e.g. $header = "blob " . strlen($content) . "\0" (the type, a space, the size in bytes, then a NUL byte).
3 - concatenate header and content into a store, e.g. $store = $header . $content.
4 - calculate the hash: $hash = sha1($store).
5 - compress the store with zlib: $compressed = gzcompress($store).
6 - use the hash as the name for the object storage, using the first two characters as a folder name inside the objects folder and the rest as the file name:
$path = '.git/objects/' . substr($hash, 0, 2) . '/' . substr($hash, 2)
7 - write the compressed store to that location.
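The steps above can be sketched end to end in Python. This is a minimal illustration that writes into a throwaway directory rather than a real repository; the function names `write_blob` and `read_blob` are made up for this example. Git's loose objects use zlib compression, which is what `zlib.compress` produces.

```python
import hashlib
import os
import tempfile
import zlib


def write_blob(git_dir: str, content: bytes) -> str:
    """Store content as a loose blob object and return its hash."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    store = header + content
    sha = hashlib.sha1(store).hexdigest()
    # First two hex characters name the folder, the rest names the file.
    path = os.path.join(git_dir, "objects", sha[:2], sha[2:])
    if not os.path.exists(path):  # same content is already stored
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(zlib.compress(store))
    return sha


def read_blob(git_dir: str, sha: str) -> bytes:
    """Read a loose blob back and check its header."""
    path = os.path.join(git_dir, "objects", sha[:2], sha[2:])
    with open(path, "rb") as f:
        store = zlib.decompress(f.read())
    header, _, content = store.partition(b"\0")
    kind, size = header.split(b" ")
    if kind != b"blob" or int(size) != len(content):
        raise ValueError("malformed blob object")
    return content


demo_git_dir = tempfile.mkdtemp()  # stand-in for a real .git folder
demo_sha = write_blob(demo_git_dir, b"test content\n")
```

With the content `test content` followed by a newline, which is what `echo 'test content'` emits, `demo_sha` is the same `d670460b4b4aece5915caf5c68d12f560a9fe3e4` that `git hash-object` prints.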
Git Tree Object
Git stores content in a manner similar to a UNIX filesystem, but a bit simplified. All the content is stored as tree and blob objects, with trees corresponding to UNIX directory entries and blobs corresponding more or less to inodes or file contents. A single tree object contains one or more entries, each of which is the SHA-1 hash of a blob or subtree with its associated mode, type, and filename.
Trees can be nested: reading an old tree into the index before writing the new tree lets the new tree hold the old one as a subtree.
$ git cat-file -p master^{tree}
100644 blob a906cb2a4a904a152e80877d4088654daad0c859 README
100644 blob 8f94139338f9404f26296befa88755fc2598c289 Rakefile
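On disk a tree is not stored as the text that `git cat-file -p` prints. Each entry is the mode in ASCII, a space, the file name, a NUL byte, and then the 20 raw bytes of the hash. The sketch below encodes and decodes that layout; the function names are made up for illustration, and the sort is simplified to plain name order (Git's real ordering treats directory names specially).

```python
import hashlib


def encode_tree(entries):
    """entries: list of (mode, name, hex_sha) tuples -> raw tree body."""
    body = b""
    # Simplified: sort by name only.
    for mode, name, hex_sha in sorted(entries, key=lambda e: e[1].encode()):
        body += mode.encode() + b" " + name.encode() + b"\0"
        body += bytes.fromhex(hex_sha)  # 20 raw bytes, not 40 hex chars
    return body


def decode_tree(body):
    """Raw tree body -> list of (mode, name, hex_sha) tuples."""
    entries = []
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)
        nul = body.index(b"\0", space)
        mode = body[pos:space].decode()
        name = body[space + 1:nul].decode()
        hex_sha = body[nul + 1:nul + 21].hex()
        entries.append((mode, name, hex_sha))
        pos = nul + 21
    return entries


def tree_id(body):
    """Hash a tree body the same way blobs are hashed: header + body."""
    header = b"tree " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


demo_entries = [
    ("100644", "README", "a906cb2a4a904a152e80877d4088654daad0c859"),
    ("100644", "Rakefile", "8f94139338f9404f26296befa88755fc2598c289"),
]
demo_tree = encode_tree(demo_entries)
```

Because a tree entry can point at another tree's hash, the same structure expresses nested directories.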
Git Commit Object
Now you have saved your data, but to recall the snapshots you must remember all of their SHA-1 values. You also don’t have any information about who saved the snapshots, when they were saved, or why they were saved. This is the basic information that the commit object stores for you.
$ echo 'First commit' | git commit-tree d8329f
fdf4fc3344e67ab068f836878b6c4951e3b15f3d
The format of a commit object is simple: it specifies the top-level tree for the snapshot of the project at that point; the parent commits, if any (the commit object described above does not have any parents); the author/committer information (which uses your user.name and user.email configuration settings and a timestamp); a blank line; and then the commit message.
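As a rough sketch, a commit body with that layout can be assembled and hashed like this. The identity, timestamp, and the fake parent ID are invented for the example, and the tree ID used is the well-known hash of Git's empty tree; `build_commit` and `object_id` are illustrative names, not Git APIs.

```python
import hashlib

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # hash of an empty tree


def build_commit(tree, parents, author, committer, message):
    """Assemble the text body of a commit object."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # none for a root commit
    lines.append(f"author {author}")
    lines.append(f"committer {committer}")
    # A blank line separates the headers from the message.
    return ("\n".join(lines) + "\n\n" + message).encode()


def object_id(kind, body):
    """SHA-1 over '<kind> <size>' + NUL + body, as for every Git object."""
    header = kind.encode() + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# Hypothetical identity and timestamp (seconds since epoch, timezone).
ident = "Jane Doe <jane@example.com> 1243040974 -0700"
root_body = build_commit(EMPTY_TREE, [], ident, ident, "First commit\n")
root_id = object_id("commit", root_body)
child_body = build_commit(EMPTY_TREE, [root_id], ident, ident, "Second commit\n")
```

Since the parent's ID is part of the child's body, changing any earlier commit changes the ID of every commit after it.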
And then commits chain together through their parents, and the git log miracle is born.
Hmm, What About Branching?
This is where the refs folder comes into play. You can use hashes to navigate to commits, but you have to remember them, which is hard. Git therefore provides an easier way to access those values: references, or refs for short, are files whose names correspond to branches, e.g. .git/refs/heads/master, and whose content is a commit hash.
Reference types:
Local refs: stored locally and can be accessed using git log and the name of the reference, which usually corresponds to the name of the branch.
$ git log --pretty=oneline master
HEAD: a symbolic reference to the branch you are currently on, and through it to the last commit on that branch. Git uses it to determine the current SHA-1, and commands such as update-ref move the ref it names. It can also point directly at the SHA-1 of an object when you check out a tag, a commit, or a remote branch*.
Remotes: the same as local refs, except that they track another location, usually the upstream version of the same repo. They are read-only, but they can be navigated the same way.
$ cat .git/refs/remotes/origin/master
Tag: an annotated tag is technically an object, but it acts as a fixed reference to a single commit and contains extra information such as a date and a message.
$ git tag -a v1.1 1a410efbd13591db07496601ebc7a059dd55cfe9 -m 'Test tag'
* This is called detached HEAD mode and can be used to move manually to a specific commit or tag.
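How HEAD leads to a commit can be sketched as follows. A symbolic ref file contains `ref: ` followed by the name of another ref, while a plain ref file contains a hash. This toy resolver works on a throwaway directory and only looks at loose ref files; `resolve_ref` is an illustrative name, and real repositories may also keep refs in a packed-refs file, which this sketch ignores.

```python
import os
import tempfile


def resolve_ref(git_dir, name="HEAD", max_depth=5):
    """Follow symbolic refs until a hash is found."""
    for _ in range(max_depth):
        with open(os.path.join(git_dir, name)) as f:
            value = f.read().strip()
        if not value.startswith("ref: "):
            return value  # a plain hash: we are done
        name = value[len("ref: "):]  # symbolic: follow it
    raise ValueError("too many levels of symbolic refs")


def write_file(git_dir, name, text):
    path = os.path.join(git_dir, name)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(text)


commit = "1a410efbd13591db07496601ebc7a059dd55cfe9"
fake_git_dir = tempfile.mkdtemp()  # stand-in for a real .git folder

# Normal case: HEAD names a branch, the branch file holds the hash.
write_file(fake_git_dir, "refs/heads/master", commit + "\n")
write_file(fake_git_dir, "HEAD", "ref: refs/heads/master\n")
on_branch = resolve_ref(fake_git_dir)

# Detached HEAD: HEAD holds the hash directly.
write_file(fake_git_dir, "HEAD", commit + "\n")
detached = resolve_ref(fake_git_dir)
```

Both cases resolve to the same commit; the difference is only whether a branch file sits in between and will move on the next commit.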
But I Want to Talk to My Friends Outside!
Using remotes we can move changesets between multiple Git repositories. A remote uses a format called a refspec, which is stored in the .git/config file; you can list the remotes using git remote -v.
Refspec section example:
[remote "origin"]
url = https://github.com/schacon/simplegit-progit -> remote URL
fetch = +refs/heads/*:refs/remotes/origin/* -> +<src>:<dst>
Two main operations use the refspec, fetch and push:
Fetch:
We can filter by a single branch:
fetch = +refs/heads/master:refs/remotes/origin/master
The + sign is optional; it tells Git to update the reference even if the update is not a fast-forward.
You can put multiple fetch lines in the same section, or pass a refspec to the git fetch command.
You can use a glob (*) in the fetch line to create a pattern, which can be used to namespace and partition branches.
Push:
push = refs/heads/master:refs/heads/qa/master
You can add a push line to the section to push there automatically, or pass the refspec to the git push command.
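The `+<src>:<dst>` mapping can be illustrated with a small Python sketch. It handles only the simple forms shown here (an optional leading `+`, one `:` separator, at most one `*` on each side); the function names are made up and this is not how Git itself parses refspecs.

```python
def parse_refspec(spec):
    """Split '+<src>:<dst>' into (force, src, dst)."""
    force = spec.startswith("+")
    if force:
        spec = spec[1:]
    src, dst = spec.split(":", 1)
    return force, src, dst


def map_ref(spec, ref):
    """Return the destination ref for `ref`, or None if it does not match."""
    _, src, dst = parse_refspec(spec)
    if "*" not in src:
        return dst if ref == src else None
    prefix, suffix = src.split("*", 1)
    fits = len(ref) >= len(prefix) + len(suffix)
    if fits and ref.startswith(prefix) and ref.endswith(suffix):
        middle = ref[len(prefix):len(ref) - len(suffix)]
        # Whatever the * matched on the source side replaces * in dst.
        return dst.replace("*", middle, 1)
    return None


all_branches = "+refs/heads/*:refs/remotes/origin/*"
only_master = "+refs/heads/master:refs/remotes/origin/master"
push_to_qa = "refs/heads/master:refs/heads/qa/master"

mapped = map_ref(all_branches, "refs/heads/master")
skipped = map_ref(only_master, "refs/heads/experiment")
pushed = map_ref(push_to_qa, "refs/heads/master")
```

Here `mapped` is the remote-tracking name for master, `skipped` is `None` because the single-branch refspec filters the other branch out, and `pushed` shows the local master landing in the `qa` namespace on the remote.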
But That’s a Lot of Objects
Well, Git got your pack, literally :)
That’s where packfiles come in. The more revisions you create, the more objects are created. Since we don't work alone, those objects will at some point have to travel somewhere else, and sending a lot of objects one by one seems like a recipe for failure, so packfiles were created.
So what’s a packfile?
A packfile is basically a compressed form of all the committed objects, stored in two files, .pack and .idx. The pack file contains the objects and the index file contains offsets to the objects in the pack file.
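As a small taste of the format, a .pack file begins with a 12-byte header: the signature `PACK`, a version number, and the number of objects, the latter two as 4-byte big-endian integers. The sketch below builds and parses just that header; the function names are illustrative, and the object entries and trailing checksum that follow in a real pack are not modelled.

```python
import struct

PACK_HEADER = ">4sII"  # signature, version, object count (big-endian)


def build_pack_header(num_objects, version=2):
    return struct.pack(PACK_HEADER, b"PACK", version, num_objects)


def parse_pack_header(data):
    """Return (version, num_objects) from the first 12 bytes of a pack."""
    signature, version, count = struct.unpack(PACK_HEADER, data[:12])
    if signature != b"PACK":
        raise ValueError("not a packfile")
    return version, count


header = build_pack_header(18)
version, count = parse_pack_header(header)
```

The object count in the header is the same figure that shows up as the total when Git reports on writing a pack.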
What Happens When Travelling Up There
When we talked about how Git saves revisions as objects, that was the loose object format: each object is stored as its own file in the objects folder. But as we said before, when sending your changesets from one place to another, having thousands of separate objects isn’t very efficient, so Git has the gc command, or as we say in computer science lingo, a garbage collector. While a garbage collector usually cleans unused objects out of memory, git gc packs the objects into a neat, compressed format that can be sent anywhere. Git packs objects automatically when too many loose objects pile up and when you push to a remote, but you can also run git gc locally to clean up every now and then.
Example:
$ git cat-file -s b042a60ef7dff760008df33cee372b945b6e884e
22054
$ git gc
Counting objects: 18, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (14/14), done.
Writing objects: 100% (18/18), done.
Total 18 (delta 3), reused 0 (delta 0)
$ find .git/objects -type f
.git/objects/bd/9dbf5aae1a3862dd1526723246b20206e5fc37
.git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4
.git/objects/info/packs
.git/objects/pack/pack-978e03944f5c581011e6998cd0e9e30000905586.idx
.git/objects/pack/pack-978e03944f5c581011e6998cd0e9e30000905586.pack
git verify-pack: a plumbing command that lets you see what was packed up.
Cool, but How Do We Actually Travel?
Git uses two ways to send and receive data from the outside world:
Dumb Protocol:
The dumb protocol is what you are likely to get when a repository is served read-only over HTTP.
This protocol is called “dumb” because it requires no Git-specific code on the server side during the transport process; the fetch process is a series of HTTP GET requests.
With git clone, it starts by downloading info/refs (a file the server generates with the update-server-info command), then HEAD to know what to check out, and then it fetches the objects reachable from those refs in the loose object format.
Warning: this method is hard to secure and hardly anyone uses it, but it can still be useful in rare situations.
Smart Protocol:
The smart protocol is the more common method of transferring data, but it requires a process on the remote end that is intelligent about Git: it can read local data, figure out what the client has and needs, and generate a packfile for it.
There are two sets of processes for transferring data: a pair for uploading data and a pair for downloading data.
Uploading Data: PUSH
To upload data to a remote process, Git uses the send-pack and receive-pack processes. The send-pack process runs on the client and connects to a receive-pack process on the remote side.
Downloading Data: FETCH
When you download data, the fetch-pack and upload-pack processes are involved. The client initiates a fetch-pack process that connects to an upload-pack process on the remote side to negotiate what data will be transferred down.
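The processes in the smart protocol talk to each other in "pkt-line" framing: every line is prefixed with its total length, including the four prefix characters, written as four hex digits, and `0000` is a flush packet that marks the end of a section. The sketch below covers only that framing, with illustrative function names; the negotiation itself and the protocol's other special packets are left out.

```python
FLUSH = b"0000"  # flush packet: ends a section of the conversation


def pkt_line(payload: bytes) -> bytes:
    """Prefix payload with its length (payload + 4) as 4 hex digits."""
    return b"%04x" % (len(payload) + 4) + payload


def read_pkt_lines(stream: bytes):
    """Split a byte stream into payloads; None stands for a flush packet."""
    packets = []
    pos = 0
    while pos < len(stream):
        length = int(stream[pos:pos + 4], 16)
        if length == 0:
            packets.append(None)
            pos += 4
        elif length < 4:
            raise ValueError("unsupported special packet in this sketch")
        else:
            packets.append(stream[pos + 4:pos + length])
            pos += length
    return packets


# A client asking for one commit, then ending the request.
want = b"want 1a410efbd13591db07496601ebc7a059dd55cfe9\n"
request = pkt_line(want) + FLUSH
decoded = read_pkt_lines(request)
```

The length prefix is what lets both sides read a stream of variable-length lines without any other delimiter.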
I Lost Some Data, What Do I Do?
Git provides two ways to find missing revisions:
1 - git reflog / git log -g
Git silently records what your HEAD is every time you change it. Each time you commit or change branches, the reflog is updated. The reflog is also updated by the git update-ref command, which is another reason to use it instead of just writing the SHA-1 value to your ref files.
2 - git fsck --full
The git fsck utility checks your database for integrity. It shows any orphaned object, printing dangling before the type of the object and its hash.
Both ways give you the hash of the lost commit, which you can attach to a new branch:
$ git branch recover-branch ab1afef
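For a sense of what the reflog records, each line in a reflog file under .git/logs holds the old hash, the new hash, who made the change, a timestamp with timezone, and, after a tab, a message. The sketch below parses lines of that shape; the sample lines, the identity, and the second hash are invented for illustration, and `parse_reflog_line` is not a Git API.

```python
def parse_reflog_line(line):
    """Split one reflog line into its fields."""
    meta, _, message = line.rstrip("\n").partition("\t")
    old, new, rest = meta.split(" ", 2)
    who, timestamp, tz = rest.rsplit(" ", 2)
    return {
        "old": old,
        "new": new,
        "who": who,
        "timestamp": int(timestamp),
        "tz": tz,
        "message": message,
    }


def visited_commits(lines):
    """Hashes HEAD has pointed at, most recent first, without repeats."""
    seen = []
    for line in reversed(lines):
        new = parse_reflog_line(line)["new"]
        if new not in seen:
            seen.append(new)
    return seen


# Hypothetical reflog contents; the second hash is made up.
first = "1a410efbd13591db07496601ebc7a059dd55cfe9"
second = "a" * 40
sample = [
    f"{'0' * 40} {first} Jane Doe <jane@example.com> 1243040974 -0700\tcommit (initial): First commit\n",
    f"{first} {second} Jane Doe <jane@example.com> 1243041000 -0700\tcommit: Second commit\n",
    f"{second} {first} Jane Doe <jane@example.com> 1243041100 -0700\treset: moving to HEAD~1\n",
]
history = visited_commits(sample)
```

In this sample the reset moved HEAD away from the second commit, yet its hash is still listed in `history`, which is exactly what makes recovery through a new branch possible.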
How git works

  • 1. How Git Works ? Let’s GIT Deeper :)
  • 2. Git Folder Structure It’s All in The .git , this where all the magic happens,while every item on the list here has a definite role in git magic, the object and the refs folders hold the most important roles regarding stored data.
  • 3. Git Data Structure Git is a content-addressable file system,but what does that means well , it’s simply a key value store Where is the key is a hash and the value is a blob or a group of blob with hashes. Git uses cryptographic hashes for keys which is an algorithm which constructs a short digest from a sequence of bytes of any length. Ex [sha 1=>160 bit, sha2 > 256, skein > 256]. * the bigger the better When you commit a file into Git , it calculates and remembers the hash of the contents of the file. So when you retrieve it, it can verify that the hash of the data being retrieved exactly matches the hash that was computed when it was stored.
  • 4. Let’s Git Objective :) Git use a different kinds of objects to store data related to revisions for example we have the basic hash object and the tree also the commit and the tag “which is hybrid kind and not really an object”. Everything in git is organised around those objects so most of the plumbing tools in git deals with those objects directly to handle saving the revisions behind the scenes.
  • 5. Git Hash Object The Hash object is the basic unit in git, it’s used in every revision as a stand alone object or part of group of objects important note that it only saves shows you the hash and store the data as blob no other information is saved. All Objects except of tags are stored in objects folder. Git hash-object “blob object” -> used to hash and save a single unnamed data object in objects folder $ echo 'test content' | git hash-object -w --stdin d670460b4b4aece5915caf5c68d12f560a9fe3e4 *hashing side effect : When Git stores the contents of somefile.txt, it will realize that it already has a copy of that data when comparing hashes, There is no need to store it again,his process is called deduplication.
  • 6. That Blob , What Exactly it looks like You know that git store revision as object but what is that object looks like actually , let's find out . Object creations steps : 1 - content ,i.e $content = "Hello , I’m content" . 2 - header ,i.e $header = "blob ".strlen($content)."0". 3 - concatenate header and content into a store,i.e $store = $header.$content . 4 - calculate the hash, hash = sha1($store). 5 - zipping the store , $compressed = gzdeflate($string, 9). 6 - use the hash as name for the object storage , using two first characters as folder name inside objects folder and the rest as file name. path = '.git/objects/' + substr($hash,0,2) + '/' + substr($hash,2) 7 - write out the the zipped store to that location.
  • 7. Git Tree Object* Git stores content in a manner similar to a UNIX filesystem, but a bit simplified. All the content is stored as tree and blob objects, with trees corresponding to UNIX directory entries and blobs corresponding more or less to inodes or file contents. A single tree object contains one or more entries, each of which is the SHA-1 hash of a blob or subtree with its associated mode, type, and filename. The tree can be chained to hold different revisions in the same tree by reading the old tree before updating the new tree index $ git cat-file -p master^{tree} 100644 blob a906cb2a4a904a152e80877d4088654daad0c859 README 100644 blob 8f94139338f9404f26296befa88755fc2598c289 Rakefile
  • 8. Git Commit Object Now you saved your data but that’s it you must remember all SHA-1 values in order to recall the snapshots. You also don’t have any information about who saved the snapshots, when they were saved, or why they were saved. This is the basic information that the commit object stores for you $ echo 'First commit' | git commit-tree d8329f fdf4fc3344e67ab068f836878b6c4951e3b15f3d The format for a commit object is simple: it specifies the top-level tree for the snapshot of the project at that point; the parent commits if any (the commit object described above does not have any parents); the author/committer information (which uses your user.name and user.email configuration settings and a timestamp); a blank line, and then the commit message And Then Commit Chain Log Miracle was Birthed
  • 9. hmmm,What’s About Branching ? Here where the refs folder comes to play,While you can use the hashes to navigate to commits but you have to remember them which is hard thus git provides easier way to access those values,references or refs for short are filename which mirror to branches i.e .git/refs/heads/master References Types : Local refs : stored locally and can be accessed using git log and name of the reference which usually corresponds to the name of the branch. $ git log --pretty=oneline master The Head : a Symbolic reference to the last current commit on branch and is used to determine the sha-1 and update refs using update-refs and it do that by maintaining a sym ref to your current branch,but it also can can point to sha-1 of an object when you checkout tag,commit,remote branch*. Remotes : they same as local refs but the main difference they are on another location usually upstream version of the same repo and they are read only but they can be navigated the same. $ cat .git/refs/remotes/origin/master Tag : which is technically is an object but it’s actually a hard reference to a single commit and contains extra information like a date and a message . $ git tag -a v1.1 1a410efbd13591db07496601ebc7a059dd55cfe9 -m 'Test tag' *it’s called a detached head mode and can be used to manually moving to a specific commit/tag
  • 10. but ,I want to Talk To My Friends outside ! Using remotes we can move changesets between multiple git repositories. The remote uses format called refspec which is stored in config folder , you can view them using git remote -v Refspec section example: [remote "origin"] url = https://github.com/schacon/simplegit-progit -> remote url fetch = +refs/heads/*:refs/remotes/origin/* -> +<src>:<dst> We can filter by single branch We have two main operations to be used with the refspec fetch and push : Fetch: fetch = +refs/heads/master:refs/remotes/origin/master The + sign is optional , it tells git to update the remote refs even it’s n’t using fast forward You can use multiple fetch in the same section or you can the command git fetch You can use globe in the fetch to create a pattern , which can be used to namespace and partition the code Push : push = refs/heads/master:refs/heads/qa/master You can create a push section to push automatically or use the command
  • 11. But that’s a lot of objects. Well, Git got your pack, literally :) This is where packfiles come in. Every revision you create adds more objects, and since we don't work alone, those objects will at some point have to travel somewhere else. Sending a lot of objects one by one is a recipe for failure, so packfiles were created. So what is a packfile? A packfile is a compressed format that stores many objects in two files, .pack and .idx: the pack file contains the objects, and the index file contains the offsets of those objects in the pack file.
  • 12. What happens when travelling up there? When we talked about how Git saves revisions as objects, that was the loose object format, which means each object is stored as a single file in the objects folder. As we said before, when sending your changesets from one place to another, having thousands of separate objects isn’t very efficient, so Git has the gc command, or in computer science lingo a garbage collector. While a garbage collector usually cleans unused objects out of memory, git gc packs the objects into a compact format that can be compressed and sent anywhere neatly. Git packs objects automatically when there are too many loose objects or when you push to a remote, and you can run git gc locally to clean up every now and then. Example: $ git cat-file -s b042a60ef7dff760008df33cee372b945b6e884e 22054 $ git gc Counting objects: 18, done. Delta compression using up to 8 threads. Compressing objects: 100% (14/14), done. Writing objects: 100% (18/18), done. Total 18 (delta 3), reused 0 (delta 0) $ find .git/objects -type f .git/objects/bd/9dbf5aae1a3862dd1526723246b20206e5fc37 .git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4 .git/objects/info/packs .git/objects/pack/pack-978e03944f5c581011e6998cd0e9e30000905586.idx .git/objects/pack/pack-978e03944f5c581011e6998cd0e9e30000905586.pack The git verify-pack plumbing command allows you to see what was packed up.
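To see the move from loose objects to a packfile on a small scale, here is a sketch that assumes a fresh temporary repository with two commits to one file. The file name file.txt and the identity are made up for the demo.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example'               # repo-local identity, demo only
git config user.email 'example@example.com'

echo 'version 1' > file.txt
git add file.txt
git commit -q -m 'v1'
echo 'version 2' >> file.txt
git commit -q -am 'v2'

git count-objects -v        # before gc: blobs, trees and commits are all loose files
git gc --quiet              # pack everything that is reachable

loose_after=$(git count-objects -v | awk '/^count:/ {print $2}')
in_pack=$(git count-objects -v | awk '/^in-pack:/ {print $2}')

# Inspect what ended up in the pack (one line per object, then a summary)
for idx in .git/objects/pack/*.idx; do
    git verify-pack -v "$idx"
done
```

Two commits of one file produce two blobs, two trees and two commit objects, so the pack should hold six objects and no loose ones should remain.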
  • 13. Cool, but how do we actually travel? Git has two ways to send and receive data from the outside world. Dumb protocol: a simple, read-only way to serve a repository over HTTP. It is called “dumb” because it requires no Git-specific code on the server side during the transport process; the fetch process is a series of HTTP GET requests. During git clone, the client starts by downloading info/refs (a file the server generates with the update-server-info command), then HEAD to know what to check out, and then it walks the objects reachable from the listed refs, requesting them in the loose object format. Warning: this method is hard to secure and hardly anyone uses it, but it can still be useful in rare situations. Smart protocol: the smart protocol is the more common method of transferring data, but it requires a process on the remote end that is intelligent about Git: it can read local data, figure out what the client has and needs, and generate a packfile for it. There are two sets of processes for transferring data: a pair for uploading data and a pair for downloading data. Uploading data (push): to upload data to a remote process, Git uses the send-pack and receive-pack processes. The send-pack process runs on the client and connects to a receive-pack process on the remote side. Downloading data (fetch): when you download data, the fetch-pack and upload-pack processes are involved. The client initiates a fetch-pack process that connects to an upload-pack process on the remote side to negotiate what data will be transferred down.
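The first file a dumb-protocol client asks for can be generated and inspected locally. This sketch assumes a temporary repository and only shows the server-side file; it does not start an HTTP server.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name='Example' -c user.email='example@example.com' \
    commit -q --allow-empty -m 'first commit'
head_sha=$(git rev-parse HEAD)

# Generate the helper files a dumb HTTP client relies on
git update-server-info

info_refs=$(cat .git/info/refs)   # one "<sha><TAB><refname>" line per ref
echo "$info_refs"
cat .git/HEAD                     # tells the client which branch to check out
```

On a real server this command is typically run from a post-update hook so that info/refs stays current after every push.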
  • 14. I lost some data, what should I do? Git provides two ways to find missing revisions: 1- git reflog / git log -g: Git silently records where your HEAD is every time you change it. Each time you commit or change branches, the reflog is updated. The reflog is also updated by the git update-ref command, which is another reason to use it instead of just writing the SHA-1 value to your ref files. 2- git fsck --full: the git fsck utility checks your database for integrity and shows any orphaned object by printing dangling followed by the type of the object and its hash. Both ways give you the hash value, which you can then attach to a new branch: git branch recover-branch ab1afef
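The reflog route can be rehearsed safely in a scratch repository. The sketch below assumes two empty commits and "loses" the second with a hard reset. The helper function commit_empty and the branch name recover-branch are illustrative.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical helper: create an empty commit with a demo identity
commit_empty() {
    git -c user.name='Example' -c user.email='example@example.com' \
        commit -q --allow-empty -m "$1"
}

commit_empty 'first'
commit_empty 'second'
lost=$(git rev-parse HEAD)          # the commit we are about to lose

git reset -q --hard HEAD~1          # branch no longer points at 'second'

git reflog                          # lists earlier HEAD positions, newest first
found=$(git rev-parse 'HEAD@{1}')   # where HEAD was before the reset

git branch recover-branch "$found"  # attach a new branch to the lost commit
recovered=$(git rev-parse recover-branch)
```

If the reflog itself is gone, git fsck --full is the fallback for locating the dangling commit's hash.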