New Views on your History
with git replace
Christian Couder, Murex
chriscool@tuxfamily.org
OSDC.fr 2013
October 5, 2013

Git has become the most popular version control system in the Open Source world, and more and more companies are also using it.

The source code history when managed by Git is supposed to be immutable, because Git uses a content addressed database. The Git objects are indexed by their SHA-1 hash.

When mistakes have been made, or to make some history-based features more useful or more reliable, it can be useful to transform the Git source code history. A good way to do that is to use git replace.

  1. New Views on your History with git replace
     Christian Couder, Murex
     chriscool@tuxfamily.org
     OSDC.fr 2013
     October 5, 2013
  2. About Git
     A Distributed Version Control System (DVCS):
     ● created by Linus Torvalds
     ● maintained by Junio Hamano
     ● since 2005
     ● preferred VCS among open source developers
  3. Git Design
     Git is made of these things:
     ● “Objects”
     ● “Refs”
     ● config, indexes, logs, hooks, grafts, packs, ...
     Only “Objects” and “Refs” are transferred from one repository to another.
  4. Git Objects
     ● Blob: content of a file
     ● Tree: content of a directory
     ● Commit: state of the whole source code
     ● Tag: stamp on an object
  5. Git Objects Storage
     ● Git Objects are stored in a content addressable database.
     ● The key to retrieve each Object is the SHA-1 of the Object’s content.
     ● A SHA-1 is a 160-bit / 40-hex / 20-byte hash value which is considered unique.
  6. Blob (SHA1: e8455...)
     blob = content of a file
     Header: blob, size
     /* content of this blob, it can be anything like an image, a video, ...
        but most of the time it is source code like: */
     #include <stdio.h>
     int main(void)
     {
         printf("Hello world!\n");
         return 0;
     }
  7. Example of storing and retrieving a blob
     # echo "Whatever..." | git hash-object -w --stdin
     aa02989467eea6d8e0bc68f3663de51767a9f5b1
     # git cat-file -p aa02989467
     Whatever...
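The key used on slides 5 to 7 can be reproduced by hand: Git hashes a short header, `blob <size>` followed by a NUL byte, and then the raw content. The following is a minimal sketch, assuming SHA-1 object names and a `sha1sum` utility (GNU coreutils) on the PATH; the variable names are illustrative only.

```shell
#!/bin/sh
# Sketch: compute a blob's object name by hand and compare it with Git's answer.
set -e
content='Whatever...'

# Size in bytes of the content, including the trailing newline.
size=$(printf '%s\n' "$content" | wc -c | tr -d ' ')

# Hash the header "blob <size>\0" followed by the raw content.
manual=$( { printf 'blob %s\0' "$size"; printf '%s\n' "$content"; } | sha1sum | cut -d' ' -f1 )

# Ask Git for the object name (no -w, so nothing is written to a repository).
from_git=$(printf '%s\n' "$content" | git hash-object --stdin)

echo "manual:   $manual"
echo "from git: $from_git"
```

Both lines should print the same 40-character name, which is what makes the database content addressable.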
  8. Tree (SHA1: 0de24...)
     tree = content of a directory
     Header: tree, size
     blob  hello.c  e8455...
     tree  lib      10af9...
     It can point to blobs and other trees.
  9. Example of storing and retrieving a tree
     # BLOB=aa02989467eea6d8e0bc68f3663de51767a9f5b1
     # (printf "100644 whatever.txt\0"; echo $BLOB | xxd -r -p) | git hash-object -t tree -w --stdin
     0625da548ef0a7038c44b480f10d5550b2f2f962
     # git cat-file -p 0625da548e
     100644 blob aa02989467... whatever.txt
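The slide builds the raw tree bytes by hand. As an alternative that avoids the binary encoding, a tree can also be written with `git mktree`, which reads entries in the text format printed by `git ls-tree`. This is a sketch in a throwaway repository; the temporary directory and variable names are assumptions made for the example, not part of the talk.

```shell
#!/bin/sh
# Sketch: store a blob, then a tree that points to it, using git mktree.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

blob=$(echo 'Whatever...' | git hash-object -w --stdin)

# Each input line is "<mode> <type> <object>", a TAB, then the file name.
tree=$(printf '100644 blob %s\twhatever.txt\n' "$blob" | git mktree)

# Show the stored tree in readable form.
git cat-file -p "$tree"
```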
  10. Commit (SHA1: 98ca9...)
      commit = information about some changes
      Header: commit, size
      tree       0de24...
      parents    ()
      author     Christian <timestamp>
      committer  Christian <timestamp>
      My commit message
      It points to one tree and 0 or more parents.
  11. Example of storing and retrieving a commit (1)
      # TREE=0625da548ef0a7038c44b480f10d5550b2f2f962
      # ME="Christian Couder <chriscool@tuxfamily.org>"
      # DATE=$(date "+%s %z")
      # (echo -e "tree $TREE\nauthor $ME $DATE"; echo -e "committer $ME $DATE\n\nfirst commit") | git hash-object -t commit -w --stdin
      37449e955443883a0a888ee100cfd0a7ba7927b3
  12. Example of storing and retrieving a commit (2)
      # git cat-file -p 37449e9554
      tree 0625da548ef0a7038c44b480f10d5550b2f2f962
      author Christian Couder <chriscool@tuxfamily.org> 1380447450 +0200
      committer Christian Couder <chriscool@tuxfamily.org> 1380447450 +0200

      first commit
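The commit on slides 11 and 12 is assembled as raw text. The same kind of object can be created with the plumbing command `git commit-tree`, which fills in the author and committer lines itself. This is a sketch under stated assumptions: the identity values are placeholders, and the repository is a temporary one created only for the example.

```shell
#!/bin/sh
# Sketch: create a root commit (one tree, no parents) with git commit-tree.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Placeholder identity, passed through the environment.
export GIT_AUTHOR_NAME='Example User' GIT_AUTHOR_EMAIL='user@example.com'
export GIT_COMMITTER_NAME='Example User' GIT_COMMITTER_EMAIL='user@example.com'

blob=$(echo 'Whatever...' | git hash-object -w --stdin)
tree=$(printf '100644 blob %s\twhatever.txt\n' "$blob" | git mktree)

# The commit message is read from standard input.
commit=$(echo 'first commit' | git commit-tree "$tree")

# Prints the tree, author and committer lines, a blank line, then the message.
git cat-file -p "$commit"
```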
  13. Git Objects Relations
      [Diagram: two commits, "Initial commit" (author and committer Christian, no parents) and "Change hello.c" (author and committer Arnaud, parent e84c7...). Each commit points to a tree containing a hello.c blob and a doc subtree; the doc subtree holds the readme and install blobs and is shared by both trees, while hello.c points to a different blob in each.]
  14. Git Refs
      ● Head: branch, .git/refs/heads/
      ● Tag: lightweight tag, .git/refs/tags/
      ● Remote: distant repository, .git/refs/remotes/
      ● Note: note attached to an object, .git/refs/notes/
      ● Replace: replacement of an object, .git/refs/replace/
  15. Example of storing and retrieving a branch
      # git update-ref refs/heads/master 37449e9554
      # git rev-parse master
      37449e955443883a0a888ee100cfd0a7ba7927b3
      # git reset --hard master
      HEAD is now at 37449e9 first commit
      # cat whatever.txt
      Whatever...
  16. Result from previous examples
      master -> commit 37449e9554 -> tree 0625da548e -> blob aa02989467
  17. Commits in Git form a DAG (Directed Acyclic Graph)
      ● history direction is from left to right
      ● new commits point to their parents
  18. git bisect
      [Diagram: commit graph in which commit B is marked]
      ● B introduces a bad behavior called "bug" or "regression"
      ● red commits are called "bad"
      ● blue commits are called "good"
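To make the good/bad vocabulary concrete, here is a small sketch that bisects a toy history automatically with `git bisect run`. The repository, file name and the "bug" marker are invented for the example; commit 5 plays the role of B.

```shell
#!/bin/sh
# Sketch: find the first "bad" commit in a linear toy history.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example User'
git config user.email 'user@example.com'

# Build 8 commits; commit 5 introduces the "bug".
for i in 1 2 3 4 5 6 7 8; do
    echo "change $i" >> file.txt
    if [ "$i" -eq 5 ]; then echo 'bug' >> file.txt; fi
    git add file.txt
    git commit -q -m "commit $i"
    if [ "$i" -eq 1 ]; then good=$(git rev-parse HEAD); fi
    if [ "$i" -eq 5 ]; then culprit=$(git rev-parse HEAD); fi
done

# The last commit is known bad, the first one is known good.
git bisect start HEAD "$good" >/dev/null

# The test command must exit 0 for a good commit and 1 for a bad one.
git bisect run sh -c '! grep -q bug file.txt' >/dev/null

# When bisection ends, refs/bisect/bad points to the first bad commit.
found=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $found"
```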
  19. Problem when bisecting
      Sometimes the commit that introduced a bug will be in an untestable area of the graph. For example:
      [Diagram: commits W, X, X1, X2, X3, Y, Z in history order]
      Commit X introduced a breakage, later fixed by commit Y.
  20. Possible solutions
      Possible solutions to bisect anyway:
      ● apply a patch before testing and remove it afterwards (can be done using "git cherry-pick"), or
      ● create a fixed up branch (can be done with "git rebase -i"), for example:
      [Diagram: the original history W, X, X1, X2, X3, Y, Z, Z1 and a fixed up branch starting from W with commits X+Y, X1', X2', X3', Z']
  21. A good solution
      The idea is that we will replace Z with Z' so that we bisect from the beginning using the fixed up branch.
      $ git replace Z Z'
      [Diagram: the same two lines of history as on the previous slide, after replacing Z with Z']
  22. Grafts
      Created mostly for projects like the Linux kernel with old repositories.
      ● “.git/info/grafts” file
      ● each line describes the parents of a commit
      ● <commit> <parent> [<parent>]*
      ● this overrides the content in the commit
  23. Problem with Grafts
      They are neither objects nor refs, so they cannot be easily transferred.
      We need something that is either:
      ● an object, or
      ● a ref
  24. Solution, part 1: replace ref
      ● It is a ref in .git/refs/replace/
      ● Its name is the SHA-1 of the object that should be replaced.
      ● It contains, so it points to, the SHA-1 of the replacement object.
  25. Solution, part 2: git replace
      ● git replace [ -f ] <object> <replacement>: to create a replace ref
      ● git replace -d <object>: to delete a replace ref
      ● git replace [ -l [ pattern ] ]: to list some replace refs
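The three forms of the command can be tried in a throwaway repository. In this sketch the commit with the misspelled message and its corrected twin are invented for the example; the replacement is built with `git commit-tree` so that it has the same tree as the original.

```shell
#!/bin/sh
# Sketch: create, inspect, list and delete a replace ref.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example User'
git config user.email 'user@example.com'

echo 'hello' > file.txt
git add file.txt
git commit -q -m 'Frist commit'
old=$(git rev-parse HEAD)

# Replacement commit: same tree, corrected message.
new=$(echo 'First commit' | git commit-tree "$old^{tree}")

# Create the replace ref.
git replace "$old" "$new"

# The ref is named after the replaced object and points to the replacement.
ref_value=$(git rev-parse "refs/replace/$old")
listed=$(git replace -l)

# HEAD still names the old commit, but its content now comes from the new one.
subject=$(git log -1 --format=%s)

# Delete the replace ref; the original content is visible again.
git replace -d "$old" >/dev/null
subject_after_delete=$(git log -1 --format=%s)

echo "with replace ref:    $subject"
echo "after deleting it:   $subject_after_delete"
```

Note that the original commit is never modified or removed; only the lookup is redirected while the replace ref exists.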
  26. Replace ref transfer
      ● as with heads, tags, notes, remotes
      ● except that there are no shortcuts and you must be explicit
      ● refspec: refs/replace/*:refs/replace/*
      ● refspec can be configured (in .git/config), or used on the command line (after git push/fetch <remote>)
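The explicit refspec can be seen at work between two local repositories. This is a sketch only: the directory names `origin-repo` and `clone-repo` and the sample commits are assumptions made for the example.

```shell
#!/bin/sh
# Sketch: a clone does not receive replace refs until they are fetched explicitly.
set -e
work=$(mktemp -d)
cd "$work"

git init -q origin-repo
cd origin-repo
git config user.name 'Example User'
git config user.email 'user@example.com'
echo 'hello' > file.txt
git add file.txt
git commit -q -m 'Frist commit'
old=$(git rev-parse HEAD)
new=$(echo 'First commit' | git commit-tree "$old^{tree}")
git replace "$old" "$new"
cd ..

git clone -q origin-repo clone-repo
cd clone-repo

# Nothing under refs/replace/ was cloned.
before=$(git replace -l)

# Fetch the replace refs with the explicit refspec.
git fetch -q origin 'refs/replace/*:refs/replace/*'
after=$(git replace -l)
subject=$(git log -1 --format=%s)

echo "replace refs before fetch: '$before'"
echo "replace refs after fetch:  '$after'"
```

To avoid typing the refspec every time, it can be added to the remote's configuration, for instance with `git config --add remote.origin.fetch 'refs/replace/*:refs/replace/*'`.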
  27. Creating replacement objects
      When it is needed the following commands can help:
      ● git rebase [ -i ]
      ● git cherry-pick
      ● git hash-object
      ● git filter-branch
  28. What can it be used for?
      Create new views of your history.
      Right now only 2 views are possible:
      ● the view with all the replace refs enabled
      ● the view with all the replace refs disabled, using --no-replace-objects or the GIT_NO_REPLACE_OBJECTS environment variable
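The two views can be compared side by side. The sketch below reuses an invented commit with a misspelled message and reads it three ways: with replace refs enabled, with the `--no-replace-objects` option, and with the `GIT_NO_REPLACE_OBJECTS` environment variable.

```shell
#!/bin/sh
# Sketch: the same commit seen through the two available views.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example User'
git config user.email 'user@example.com'

echo 'hello' > file.txt
git add file.txt
git commit -q -m 'Frist commit'
old=$(git rev-parse HEAD)
new=$(echo 'First commit' | git commit-tree "$old^{tree}")
git replace "$old" "$new"

# View 1: replace refs enabled (the default).
view_enabled=$(git log -1 --format=%s)

# View 2: replace refs disabled, first with the option, then with the variable.
view_option=$(git --no-replace-objects log -1 --format=%s)
view_env=$(GIT_NO_REPLACE_OBJECTS=1 git log -1 --format=%s)

echo "enabled:               $view_enabled"
echo "--no-replace-objects:  $view_option"
echo "GIT_NO_REPLACE_OBJECTS: $view_env"
```

Note that `--no-replace-objects` is an option of `git` itself, so it goes before the subcommand name.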
  29. Why new views?
      ● split old and new history or merge them
      ● fix bugs to bisect on a clean history
      ● fix mistakes in author, committer, timestamps
      ● remove big files to have something lighter to use, when you don’t need them
      ● prepare a repo cleanup
      ● mask/unmask some steps
      ● ...
  30. Limitations
      ● everything is still in the repo
      ● so the repo is still big
      ● there are probably bugs
      ● confusing?
      ● ...
  31. Current and future work
      ● a script to replace grafts
      ● fix bugs
      ● allow subdirectories in .git/refs/replace/
      ● maybe allow “views” as set of active subdirectories
      ● ...
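On the first item: Git releases published after this talk include a `--graft` option for `git replace`, which rewrites the parents of a commit through a replace ref instead of a grafts file. It is not part of the slides; the sketch below assumes a Git version recent enough to have it, and the three sample commits A, B and C are invented for the example.

```shell
#!/bin/sh
# Sketch: graft commit C directly onto A, hiding B from the default view.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example User'
git config user.email 'user@example.com'

for msg in A B C; do
    echo "$msg" >> file.txt
    git add file.txt
    git commit -q -m "$msg"
done
c=$(git rev-parse HEAD)
a=$(git rev-parse HEAD~2)

# Create a replacement for C that is identical except that its parent is A.
git replace --graft "$c" "$a"

# Count the commits reachable from HEAD in each view.
with_graft=$(git rev-list --count HEAD)
without_graft=$(git --no-replace-objects rev-list --count HEAD)

echo "commits with the graft:    $with_graft"
echo "commits without the graft: $without_graft"
```

Because the result is an ordinary replace ref, it can be transferred with the same refspec as any other replacement.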
  32. Considerations
      ● best of both worlds: immutability and configurability of history
      ● no true view
      ● history is important for freedom
  33. Many thanks to:
      ● Junio Hamano (comments, help, discussions, reviews, improvements),
      ● Ingo Molnar,
      ● Linus Torvalds,
      ● many other great people in the Git and Linux communities, especially: Andreas Ericsson, Johannes Schindelin, H. Peter Anvin, Daniel Barkalow, Bill Lear, John Hawley, ...
      ● OSDC/OWF organizers and attendees,
      ● Murex, the company I am working for.
  34. Questions?