Version control with GIT


A presentation on the basics of version controlling your code with the help of GIT

  1. 1. For zombies…… By Zeeshan Khan
  2. 2.  A method for centrally storing files  Keeping a record of changes  Who did what, when in the system  Covering yourself when things inevitably go wrong  Another “trendy” word combination  Something that every software developer should deal with
  4. 4.  You can avoid using version control  But it can’t last long  You will need to collaborate eventually  It might be tricky sometimes  But you can avoid most problems  Recommendations:  Stick to basic working cycle  Learn basic working cycle commands  Practice on sandbox project
  5. 5.  Allows a team to share code  Maintains separate “production” versions of code that are always deployable  Allows simultaneous development of different features on the same codebase  Keeps track of all old versions of files  Prevents work being overwritten
  6. 6.  There are version control tools even for designers: Adobe Version Cue, PixelNovel Timeline  There is version control functionality embedded in: Microsoft Word, Writer
  7. 7.  Branch - a copy of a set of files under version control which may be developed at different speeds or in different ways  Checkout - to copy the latest version of (a file in) the repository to your working copy  Commit - to copy (a file in) your working copy back into the repository as a new version  Merge - to combine multiple changes made to different working copies of the same files in the repository  Repository - a (shared) database with the complete revision history of all files under version control  Trunk - the unique line of development that is not a branch  Update - to retrieve and integrate changes made in the repository since the last update  Working copy - your local copies of the files under version control you want to edit
  8. 8. Centralized (client-server model): CVS, Subversion, VSS, TFS, Vault, ClearCase, AccuRev, Perforce. Distributed: Git, Mercurial, Bazaar, BitKeeper.
  9. 9. CVS etc GIT etc.
  10. 10. Users commit changes to the central repository, and a new version is born to be checked out by other users
  11. 11. CENTRALIZED WORKFLOW  Branch by Release Release 1 Release 2 Branch V1.0 V1.1 V1.2 V2.0 V2.1 V2.2 Merge • Branch by Feature / Task Feature 1 Main Trunk Branch Merge BranchFeature 2 Merge
  12. 12. CENTRALIZED WORKFLOW  Access the central server and ‘pull’ down the changes others have made  Make your changes, and test them  Commit (*) your changes to the central server, so other programmers can see them.  (*) Work out the merge conflicts (windiff, built in tools etc.)
  13. 13. Canonical Repository Local Repository Jeff 1. Commits changes to the local repository 2. Pushes changes to the canonical repository 3. Local repository is updated from the canonical repository 4. Working copy is updated from the local repository Each user has a full local copy of the repository. Users commit changes locally and, when they want to share them, push them to the shared repository
  14. 14. DISTRIBUTED WORKFLOW • Simple Add Commit • Branch by Member/ Features Init/Clone Push Development Trunk V1.0 V1.1 Main Trunk Developer 1 Developer 2
  15. 15. DISTRIBUTED WORKFLOW  Each developer ‘clones’ a copy of a repository to their own machine. The full history of the project is on their own hard drive.  Two-phase commits: you commit first to your local repository, and then push to the shared repository.  A central repository is not mandatory, but you usually have one  Examples of distributed source control systems: Git, Mercurial, Bazaar
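
The clone, local commit, and push cycle described above can be tried without a server. The following is a minimal sketch, assuming `git` is installed; the bare repository `central.git`, the developer directories `alice` and `bob`, and the file `app.py` are hypothetical names chosen for illustration.

```shell
set -e
sandbox=$(mktemp -d)                 # throwaway directory for the experiment
cd "$sandbox"

# A bare repository stands in for the shared "central" server.
git init -q --bare central.git
git --git-dir=central.git symbolic-ref HEAD refs/heads/master

# Developer 1 clones, commits locally (no server contact), then pushes.
git clone -q central.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/master   # use the branch name from the slides
git config user.name "Alice"              # placeholder identity
git config user.email "alice@example.com"
echo "print('hello')" > app.py
git add app.py
git commit -q -m "Add app.py"
git push -q -u origin master
cd ..

# Developer 2 clones and receives the full history on their own disk.
git clone -q central.git bob
git -C bob log --oneline
```

Only the `git push` and `git clone` steps touch the shared repository; the commit itself is recorded entirely in Alice's local copy.
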
  16. 16. Centralized:  Single repository  Commit requires connection (there is no local repository)  Impossible to commit changes to another user  All history in one place  Reintegrating a branch might be a pain  Considered slower than DVCS. Distributed:  Multiple repositories  Commit does not require connection (commits go to the local repository)  Possible to commit changes to another user  History is spread across repositories rather than kept in one place  Easier branch management (especially reintegration)  Considered faster than CVCS
  17. 17. • Speed • Simple design • Strong support for thousands of parallel branches • Fully distributed • Able to handle large projects like the Linux kernel effectively • Ensures integrity
  18. 18. Snapshots of the filesystem are saved in every commit instead of saving the differences
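
One way to see the snapshot model is to compare the file listings of two consecutive commits. This is a minimal sketch, assuming `git` is installed; the file names `a.txt` and `b.txt` and the identity are placeholders. Every commit lists every file, and a file that did not change is recorded with the same blob hash as before.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "stable" > a.txt
echo "v1" > b.txt
git add a.txt b.txt
git commit -q -m "First snapshot"

echo "v2" > b.txt                         # only b.txt changes
git commit -q -am "Second snapshot"

# Both snapshots list both files.
git ls-tree HEAD^
git ls-tree HEAD

a_old=$(git rev-parse HEAD^:a.txt)        # blob hash of a.txt in each snapshot
a_new=$(git rev-parse HEAD:a.txt)
b_old=$(git rev-parse HEAD^:b.txt)
b_new=$(git rev-parse HEAD:b.txt)
echo "a.txt: $a_old -> $a_new"
echo "b.txt: $b_old -> $b_new"
```

The hash for `a.txt` is identical in both listings, so the unchanged content is stored once and simply referenced again.
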
  19. 19. • Fetch or clone (create a copy of the remote repository) (compare to cvs check out) • Modify the files in the local branch • Stage the files (no cvs comparison) • Commit the files locally (no cvs comparison) • Push changes to remote repository (compare to cvs commit)
  20. 20. • Git directory: stores the metadata and object database for your project. • Working directory: a single checkout of one version of the project • Staging area (Index): a file contained in your Git directory that stores information about what will go into the next commit
  21. 21. Untracked: files in your working directory that were not in the last snapshot and are not in staging area. Unmodified: tracked but not modified (initial clone) Modified: tracked and modified Staged: identified for next commit
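
The four file states can be observed with `git status --porcelain`, which prints a two-character code per file. A minimal sketch, assuming `git` is installed; `tracked.txt` and `untracked.txt` are hypothetical file names.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "one" > tracked.txt
git add tracked.txt
git commit -q -m "Initial commit"
s_unmodified=$(git status --porcelain tracked.txt)    # empty: nothing to report

echo "new" > untracked.txt
s_untracked=$(git status --porcelain untracked.txt)   # "?? untracked.txt"

echo "two" >> tracked.txt
s_modified=$(git status --porcelain tracked.txt)      # " M tracked.txt"

git add tracked.txt
s_staged=$(git status --porcelain tracked.txt)        # "M  tracked.txt"

printf '[%s]\n' "$s_unmodified" "$s_untracked" "$s_modified" "$s_staged"
```

In the two-character code, the first column describes the staging area and the second describes the working directory, which is why the `M` moves from the right column to the left after `git add`.
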
  22. 22.  There are four elementary object types in Git:  blob - a file.  tree - a directory.  commit - a particular state of the working directory.  tag - an annotated tag (we will ignore this one for now).
  23. 23.  A blob is simply the content of a particular file plus some metadata (a short header with the object type and size).  A tree is an object which contains a list of blobs and/or trees with their corresponding file modes and names.  A commit is a plain text object containing information about the author of the commit, a timestamp and references to the parent commit(s) and the corresponding tree.  All objects are compressed with the DEFLATE algorithm and stored in the Git object database under .git/objects.
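
The object types can be inspected directly with `git cat-file`. A minimal sketch, assuming `git` is installed and a freshly created repository (so objects are still stored as individual "loose" files rather than packed); `src/main.py` is a hypothetical file.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

mkdir src
echo "print('hi')" > src/main.py
git add src/main.py
git commit -q -m "Add main.py"

commit_id=$(git rev-parse HEAD)
tree_id=$(git rev-parse 'HEAD^{tree}')
blob_id=$(git rev-parse HEAD:src/main.py)

git cat-file -t "$commit_id"    # prints: commit
git cat-file -t "$tree_id"      # prints: tree
git cat-file -t "$blob_id"      # prints: blob

git cat-file -p "$commit_id"    # tree reference, author, committer, message
git cat-file -p "$tree_id"      # one entry per item: mode, type, hash, name
git cat-file -p "$blob_id"      # the file content

# A loose object is stored under .git/objects/<first 2 hex digits>/<the rest>.
object_path=".git/objects/$(echo "$commit_id" | cut -c1-2)/$(echo "$commit_id" | cut -c3-)"
ls "$object_path"
```
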
  24. 24.  Everything is check-summed before it is stored  Everything is referred to by that checksum.  The checksum is a SHA-1 hash.  Every commit is referred to by its SHA-1 hash.  Cannot change the contents of any file or directory without Git knowing about it
  25. 25.  SHA-1 (Secure Hash Algorithm 1) is a 160-bit cryptographic hash function used in TLS, SSH, PGP, ...  Every object is identified and referenced by its SHA-1 hash.  Every time Git accesses an object, it validates the hash.  Linus Torvalds: “Git uses SHA-1 in a way which has nothing at all to do with security. [...] It’s about the ability to trust your data.”  If you change only a single character in a single file, all hashes up to the commit change!
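
The effect of a one-character change can be checked with `git hash-object` and `git write-tree`. A minimal sketch, assuming `git` is installed and no line-ending conversion is configured; `greeting.txt` is a hypothetical file.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "hello world" > greeting.txt
git add greeting.txt
blob_before=$(git rev-parse :greeting.txt)   # blob hash recorded in the index
tree_before=$(git write-tree)                # hash of the directory listing

# Identical content always yields the identical hash.
same_content=$(printf 'hello world\n' | git hash-object --stdin)

echo "hello World" > greeting.txt            # change a single character
git add greeting.txt
blob_after=$(git rev-parse :greeting.txt)
tree_after=$(git write-tree)

echo "blob: $blob_before -> $blob_after"
echo "tree: $tree_before -> $tree_after"
```

Because the tree stores the blob's hash and a commit stores the tree's hash, the change propagates upward: a new blob hash forces a new tree hash, which in turn forces a new commit hash.
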
  26. 26. Creating a new repository:  $ git init Cloning from an existing repository:  $ git clone <url>
  27. 27. SPECIFIC CHANGES:  $ git add *.py  $ git add README.rst  $ git commit -m 'First commit' ALL CHANGES TO TRACKED FILES (untracked files still need git add):  $ git commit -am 'First commit'
  28. 28. FROM STAGING AREA  $ git rm --cached <file> FROM INDEX AND FILE SYSTEM  $ git rm <file>
  29. 29. Git tracks content, not files. Although there is a move command...  $ git mv file1 file2 ...this is the same as...  $ mv file1 file2  $ git rm file1  $ git add file2
  30. 30. SHOWING STATUS: $ git status SHOWING LOG (ENTIRE PAGED)  $ git log SHOWING LOG (DATE FILTERING)  $ git log --since=2.weeks  $ git log --since="2 years 1 day 3 minutes ago"
  31. 31. LAST COMMIT  $ git show SPECIFIC COMMIT  $ git show 1776f5  $ git show HEAD^
  32. 32. UNSTAGED CHANGES  $ git diff STAGED CHANGES  $ git diff --cached RELATIVE TO SPECIFIC REVISION  $ git diff 1776f5  $ git diff HEAD^
  33. 33. CHANGE LAST COMMIT  $ git commit --amend UNSTAGE STAGED FILE  $ git reset HEAD <file> UNMODIFY MODIFIED FILE  $ git checkout -- <file> REVERT A COMMIT  $ git revert 1776f5
  34. 34.  This is a file describing the files that are to be excluded from Git tracking  Blank lines or lines starting with # are ignored  Standard glob patterns work  End pattern with slash (/) to specify a directory  Negate pattern with exclamation point (!)
      $ cat .gitignore
      *.pyc
      /doc/[abc]*.txt
      .pypirc
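
To check which paths a set of patterns actually excludes, `git check-ignore` can be asked about each path. A minimal sketch, assuming `git` is installed; it reuses the three patterns from the slide, while the file names and the `is_ignored` helper are hypothetical.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

printf '%s\n' '*.pyc' '/doc/[abc]*.txt' '.pypirc' > .gitignore
mkdir -p doc sub/doc
touch module.pyc sub/module.pyc doc/api.txt doc/zebra.txt sub/doc/api.txt .pypirc

# Hypothetical helper: prints "yes" if Git would ignore the path, else "no".
is_ignored() {
    if git check-ignore -q "$1"; then echo yes; else echo no; fi
}

for path in module.pyc sub/module.pyc doc/api.txt doc/zebra.txt sub/doc/api.txt .pypirc; do
    echo "$path: $(is_ignored "$path")"
done
```

The pattern `*.pyc` has no slash, so it matches at any depth, while the leading slash in `/doc/[abc]*.txt` anchors it to the directory containing the `.gitignore` file; that is why `sub/doc/api.txt` is not ignored.
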
  35. 35.  Other clones of the same repository  Can be local (another checkout) or remote (coworker, central server)  There are default remotes for push and pull  $ git remote -v origin git:// (fetch) origin git:// (push)
  36. 36. WITHOUT DEFAULT  $ git push <remote> <branch> SETTING A DEFAULT  $ git push -u <remote> <branch> THEN...  $ git push
  37. 37. FETCH & MERGE  $ git pull [<remote> <branch>] FETCH & REBASE  $ git pull --rebase [<remote> <branch>] -> Rebasing should be done cautiously!
  38. 38.  Like most VCSs, Git has the ability to tag specific points in history as being important. Generally, people use this functionality to mark release points (v1.0, and so on)  Git uses two main types of tags: lightweight and annotated. A lightweight tag is very much like a branch that doesn’t change: it’s just a pointer to a specific commit. Annotated tags, however, are stored as full objects in the Git database: they are checksummed; contain the tagger name, e-mail, and date; have a tagging message; and can be signed and verified with GNU Privacy Guard (GPG).
  39. 39. LIGHTWEIGHT TAGS  $ git tag v0.1.0 ANNOTATED TAGS  $ git tag -a v0.1.0 -m 'Version 0.1.0'
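
The difference between the two tag types shows up when asking Git what kind of object each tag name refers to. A minimal sketch, assuming `git` is installed; the tag names `v0.1.0` and `v0.2.0` and the `VERSION` file are illustrative.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "release" > VERSION
git add VERSION
git commit -q -m "Release commit"

git tag v0.1.0                            # lightweight: just a pointer
git tag -a v0.2.0 -m 'Version 0.2.0'      # annotated: a separate tag object

light_type=$(git cat-file -t v0.1.0)      # prints: commit
annot_type=$(git cat-file -t v0.2.0)      # prints: tag
echo "$light_type $annot_type"

git cat-file -p v0.2.0                    # tagger, date, and message live here

# Both names still lead to the same commit.
light_commit=$(git rev-parse 'v0.1.0^{commit}')
annot_commit=$(git rev-parse 'v0.2.0^{commit}')
```
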
  40. 40. Branches are "Pointers" to commits.
  41. 41. Any reference is actually a text file which contains nothing more than the hash of the latest commit made on the branch:  $ cat .git/refs/heads/master 57be35615e5782705321e5025577828a0ebed13d HEAD is also a text file and contains only a pointer to the last object that was checked out:  $ cat .git/HEAD ref: refs/heads/master
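
That branches are small text files can be confirmed in a fresh repository. A minimal sketch, assuming `git` is installed and the default file-based ref storage; in older or garbage-collected repositories some refs may instead be listed in `.git/packed-refs`.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "x" > file.txt
git add file.txt
git commit -q -m "First commit"

cat .git/HEAD                  # prints: ref: refs/heads/master
cat .git/refs/heads/master     # prints the hash of the latest commit

# Creating a branch only writes one more small file with the same hash.
git branch iss53
cat .git/refs/heads/iss53
```

This is why creating and deleting branches in Git is cheap: no files are copied, only a 40-character hash is written.
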
  42. 42. Scenario 1 – Interrupted workflow You’re finished with part 1 of a new feature but you can’t continue with part 2 before part 1 is released and tested
  43. 43. Scenario 2 – Quick fixes While you’re busy implementing some feature suddenly you’re being told to drop everything and fix a newly discovered bug
  44. 44. Branches can diverge.
  45. 45. Branches can be merged.
  46. 46. Git has several auto-merge strategies, such as fast-forward and three-way merge. If the automatic merge fails, fix the conflicts by hand:
      $ git merge <branch>
      Auto-merging index.html
      CONFLICT (content): Merge conflict in index.html
      Automatic merge failed; fix conflicts and then commit the result.
      Then mark the file as resolved and create the merge commit:
      $ git add index.html
      $ git commit
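
The two common merge outcomes can be compared side by side. A minimal sketch, assuming `git` is installed and default merge settings; the branch names follow the slides, and the file names are hypothetical. The first merge is a fast-forward (no new commit is created), the second needs a three-way merge and produces a merge commit with two parents.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "base" > index.html
git add index.html
git commit -q -m "Base"

# Fast-forward: master has not moved since iss53 branched off.
git checkout -q -b iss53
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix issue 53"
git checkout -q master
git merge -q iss53                        # only moves the master pointer
merges_after_ff=$(git rev-list --merges --count HEAD)

# Three-way: both branches gained commits, so a merge commit is needed.
git checkout -q -b testing
echo "t" > test.txt
git add test.txt
git commit -q -m "Add test"
git checkout -q master
echo "m" > main.txt
git add main.txt
git commit -q -m "Work on master"
git merge -q -m "Merge branch 'testing'" testing
merges_after_3way=$(git rev-list --merges --count HEAD)

git log --oneline --graph
echo "merge commits: $merges_after_ff then $merges_after_3way"
```
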
  47. 47.  Linear alternative to merging  Rewrites history! Never rebase published code!
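
Why rebasing published code is risky becomes visible when comparing commit hashes before and after. A minimal sketch, assuming `git` is installed; the branch name `topic` and the file names are hypothetical.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

git checkout -q -b topic
echo "t" > topic.txt
git add topic.txt
git commit -q -m "Topic work"
old_tip=$(git rev-parse topic)

git checkout -q master
echo "m" > master.txt
git add master.txt
git commit -q -m "Master work"

git checkout -q topic
git rebase -q master                      # replay topic's commit on top of master
new_tip=$(git rev-parse topic)

git log --oneline --graph                 # a straight line, no merge commit
echo "topic tip: $old_tip -> $new_tip"
```

The rebased commit carries the same change but has a different parent and therefore a different hash. Anyone who already fetched the old commit now holds history that no longer matches the branch.
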
  48. 48.  Often, when you’ve been working on part of your project, things are in a messy state and you want to switch branches for a bit to work on something else. The problem is, you don’t want to do a commit of half-done work just so you can get back to this point later. The answer to this issue is $ git stash  Stashing takes the dirty state of your working directory (that is, your modified tracked files and staged changes) and saves it on a stack of unfinished changes that you can reapply at any time.
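
A stash round trip looks like this. A minimal sketch, assuming `git` is installed; `app.py` is a hypothetical file.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "v1" > app.py
git add app.py
git commit -q -m "Initial commit"

echo "half-done" >> app.py                # unfinished work on a tracked file
git stash -q                              # save it and clean the working directory
status_after_stash=$(git status --porcelain)   # empty: nothing left to commit
stash_entries=$(git stash list)                # one entry on the stack

# ... switch branches, do the urgent work, switch back ...

git stash pop -q                          # reapply and drop the stash entry
restored_line=$(tail -n 1 app.py)
echo "restored: $restored_line"
```

`git stash pop` removes the entry from the stack after applying it; `git stash apply` would reapply the changes and keep the entry.
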
  49. 49. CREATE NEW BRANCH  $ git branch iss53 CREATE NEW BRANCH FROM master AND SWITCH TO IT  $ git checkout -b iss53 master SWITCH BRANCH  $ git checkout iss53 DELETE BRANCH  $ git branch -d iss53
  50. 50. SHOW ALL BRANCHES
      $ git branch
        iss53
      * master
        testing
      SHOW LAST BRANCH COMMITS
      $ git branch -v
        iss53   93b412c fix javascript issue
      * master  7a98805 Merge branch 'iss53'
        testing 782fd34 add scott to the author list in the readmes
  51. 51. SHOW MERGED BRANCHES
      $ git branch --merged
        iss53
      * master
      SHOW UNMERGED BRANCHES
      $ git branch --no-merged
        testing
  52. 52.  AKA feature branches  For each feature, create a branch  Merge early, merge often  If desired, squash commits
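
A feature branch with a squashed merge can be sketched as follows, assuming `git` is installed; the branch name `feature-login` and the file `login.py` are hypothetical.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "base" > README.rst
git add README.rst
git commit -q -m "Base"

# Work happens in small, messy commits on the feature branch.
git checkout -q -b feature-login
echo "step1" > login.py
git add login.py
git commit -q -m "WIP: login form"
echo "step2" >> login.py
git commit -q -am "WIP: validation"

# Squash: stage the combined result on master, then record it as one commit.
git checkout -q master
git merge -q --squash feature-login
git commit -q -m "Add login feature"

# -D is needed because Git does not regard a squashed branch as merged.
git branch -q -D feature-login

git log --oneline
master_commits=$(git rev-list --count master)
```

After the squash, master holds a single tidy commit for the feature. The trade-off is that the intermediate commits are no longer reachable from master.
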
  53. 53. ?