Git! Why? How?


Why should you use git today instead of a centralized version control system like svn?

Published in: Technology

  • There was a time when you stored your versions manually. OK, for many of you this time wasn’t the 80s but a few years back, when you were at college giving every source-code archive a slightly different name. Believe it or not, there was a time without real SCMs. It was always dark and people were living in caves.

    RCS

    Then 1982 came and RCS was released. RCS is not a huge piece of technology, but you can still find it around in Unix distros. It is simple and straight to the point.

    One nice feature was that text changes were stored as deltas (pretty important, considering how small hard drives used to be!). Deltas are still used by most SCMs today.

    Some RCS drawbacks worth mentioning:

      • It is text only.
      • There is no central repository; each version-controlled file has its own repo, in the form of an RCS file stored near the file itself. For example, the RCS file for /usr/project/foo.c is /usr/project/foo.c,v or, a little better, /usr/project/RCS/foo.c,v in a subdirectory.
      • Developers make private workspaces by creating symbolic links to RCS subdirectories, say a symlink from /usr/home/john/RCS to /usr/project/RCS.
      • Naming of versions and branches is downright hostile. A version might be named 1.3, a branch might be named 1.3.1, and a version on that branch gets a longer dotted number still.

    The classic era

    In the SCM arena, the 90s are the classic era.

    CVS

    It all started with CVS (Concurrent Versions System) in 1990. It was able to handle multiple versions being developed concurrently on different machines and stored on a central server. The client-server age was upon us and developers took full advantage of it.

    CVS handled versions in a decent way. It even supported branching and merging, though it wasn’t very good at either. That’s one of the reasons many people are scared of the “B” word and the “M” word.

    CVS didn’t track directories or filename changes (no refactoring allowed here!) and relied heavily on locking the whole repository. It is outdated now, but it worked in the 90s!
    (If you have it, just walk away and go on to something else!)

    PVCS

    Polytron Version Control System (PVCS) was initially released in 1985 and then went through a series of mergers and acquisitions: Polytron, then Sage, Merant, and finally Serena.

    It’s an old, outdated system (initially designed to avoid branching and merging, using file locking instead), but it’s still supported by Serena Software.

    ClearCase

    In 1992, one of the major beasts in the SCM world was born. ClearCase was clearly ahead of its time, and for some it is still the most powerful SCM ever built.

    Outdated, slow-moving, overpriced, and overly complicated to administer (in the early days, you had to generate a new Unix kernel to run the beast!), good old CC isn’t the cool guy anymore; you can hardly find anything positive about it on the net. But it’s still very good at branching and merging and still has unique features, such as its legendary “dynamic views”. While powerful, CC came from a time when disk space was scarce and networks were mostly LANs, with no concerns for things like latency or working through firewalls.

    Atria (the developer of ClearCase) merged with Pure (which was run by Reed Hastings, now the head of Netflix), and the combined company was purchased by Rational and then IBM. And lo, the powerful CC stopped evolving. Well, it did evolve towards UCM in the early 2000s, which basically got rid of all the good things and left the weak ones, together with a huge price. Not a very good idea.

    ClearCase is still one of the most-used SCMs in the corporate world, and certainly one of the revenue leaders.

    VSS

    All the systems on my list had their moment and their clear advantages over previous systems. All except Visual SourceSafe. VSS was a weak system from day one, forcing developers to work with a “locking” approach, discouraging parallel development and creating a huge “fear of merging”.

    Slow, error-prone, and utterly limited, VSS has been one of the most-used systems by Windows developers around the world.
    It is still in use, spreading pain and fear among good-hearted coders. But VSS was ahead of its time in one sense: it more properly belongs in the “dark SCM middle ages” (see below) than in the classic era.

    VSS was entirely graphical, which was probably one of the reasons it was widely adopted (along with being closely tied in with Visual Studio distributions).

    Perforce

    Perforce (P4) is one of the independent vendors totally focused on SCM, battling for the SCM gold. It is still one of the market leaders among mid-range companies with huge teams, and it has a strong presence in some market niches, such as the gaming industry.

    When it was released in the mid 90s, P4 was one of the most affordable and powerful systems to date. Worlds ahead of VSS and CVS, it was never at the level of ClearCase. But it clearly beat CC in cost, performance, and ease of use.

    Being centralized and not very good at branching and merging (branches are implemented as subdirectory trees; didn’t they ever hear of metadata?), P4 doesn’t seem to be the best option for the future, but it is rock solid, mature, and well established. That will help it keep growing. At the time of this writing, P4 is the biggest code repository inside Google. Cool!

    Enter the middle ages

    A time of darkness, when most of the previous advances were lost and a degraded environment emerged…

    Subversion

    Subversion (SVN) was conceived as “enhanced CVS” and its developers hit their target: it is better than CVS. Period.

    Although systems like ClearCase were perfectly capable of branching and merging, SVN educated an entire developer generation in the following dogma: fear branching and merging at all costs! This caused environmental damage that persists to this day and is only starting to be healed by the new DVCS generation.

    SVN was close to P4 in features and spread like crazy: more than 5 million developers around the world use SVN on a daily basis.
    Huge!

    SVN is extremely simple to use and evangelized everyone on the “mainline development model”. Error-prone (break the build!) on non-toy projects, it helped develop techniques like “continuous integration” as a way to “avoid integrations”. While the idea is good, most of the surrounding concepts were clearly limited by the tool itself.

    Linus himself raged against SVN when he introduced Git, first released in 2005.

    During 2009 and 2010, many major open-source projects gravitated away from SVN: a good sign of how wrong SVN was. But it’s still big and won’t die for ages.

    AccuRev

    Born in an age of darkness, AccuRev was developed as an entirely new approach to source control. Its original way of doing things still seems new to lots of developers nowadays.

    AccuRev has strong support for branching (“streams” in its jargon) and merging. It has played a valuable role in helping the community move away from ClearCase and older tools like CVS.

    Enter the Renaissance

    After an age of darkness, an entirely new generation of SCM systems broke the established status quo. “SCM is a mature market” was the analysts’ conventional wisdom, but the new generation broke onto the scene and blew everything apart.

    Able to sever ties with the Internet and work unplugged (like cool rock stars), the new generation also excels at branching and merging, which was touted as the root of all evil during the “dark ages”. These new systems have successfully shifted the tide in the “branching/merging is good” direction.

    BitKeeper

    BitKeeper was one of the innovators in the DVCS field. Designed by Larry McVoy (who previously worked on TeamWare, Sun’s internal version control system, built on top of SCCS; long evolution story here…), it rose to fame in 2002 when the Linux kernel development team started using it. A huge flame war started, with some developers complaining about using commercial tools for the world’s premier open-source project.
    Things only got worse in 2005 when fights with the core kernel developers grew even bigger. BitMover, the company behind the product, became concerned about people reverse-engineering their code. They discontinued support for open-source development and, ironically, thus prompted the creation of Git to fill the gap.

    Git

    Linus Torvalds, the father of Linux himself, designed and implemented the first version of Git (almost over a weekend, in pure-hacker style) to give his kernel developers an alternative to BitKeeper. Linus not only did the original design (simple, clean, genius), but helped promote the project with his unique style. During his famous speech, he heavily criticized (OK, insulted) CVS, SVN, and Perforce: “Subversion has been the most pointless project ever started”, “If you like using CVS, you should be in some kind of mental institution or somewhere else” and finally “Get rid of Perforce, it is sad, but it is so, so true”. You can love him or hate him, but he definitely made his point: the Middle Ages were over and distributed systems were now to rule the world, including removing the arcane fear of branching and merging, a key concept behind every DVCS.

    Over the following years, many major open-source projects migrated away from Subversion towards Git (helped along by a really huge hosting service), making it the strongest and coolest SCM on earth.

    Git is based on a DAG structure (Directed Acyclic Graph), in which the main unit of change is the changeset. It implements full merge tracking, but at the commit level instead of the individual file-revision level (as, for instance, ClearCase does). It is extremely fast, with the only caveats being management of large binary files and the requirement to replicate repositories in their entirety.

    Git is clearly influenced by its kernel roots, and it’s obviously not the easiest thing on earth to use. But it will definitely be the SCM of the next decade.
    Check out this awesome book.

    Mercurial

    Mercurial (Hg) was first announced in April 2005, also rushing in after the BitMover decision to remove support for the free version. Hg is also one of the key open-source DVCSs, along with Git. They can even work together quite well: Scott Chacon, the Git evangelist and one of the best SCM tech writers ever, wrote a nice integration.

    Hg differs quite a bit from Git in terms of design. They share the concept of commit/changeset as the unit of change. In Git, each commit points to a tree snapshot and to its parent commit(s), hence the DAG. In Hg, every changeset points to a manifest, a flat list of the files in that revision; changesets, manifests, and file histories are all stored in a format called a revlog.

    Hg provides very strong merging, but it’s a bit different from other SCMs in its branching model: it has “named branches”, but the preference is to create a new repository as a separate branch instead of hosting “many heads” inside a single one.

    Joel Spolsky has written an extremely good Hg tutorial, which will help a lot of new users. Spolsky’s company, Fog Creek Software, has recently released Kiln, a commercial wrapper around the Hg core.

    Darcs

    Darcs (Darcs Advanced Revision Control System) is another open-source attempt to get rid of CVS and Subversion. It started in 2002 and has been continuously evolving since then, reaching version 2.5 in November 2010.

    The major shortcomings of Darcs have been performance and its different way of handling history: instead of managing “snapshots” (commits or changesets) it manages patches, but in a way that makes traversing history difficult to understand (a current status may never have been a real snapshot).

    Bazaar

    Bazaar (bzr) is another open-source DVCS, which tries to bring some fresh air to the SCM world. While less used than Git and Mercurial, Bazaar offers interesting features, such as the ability to work in a centralized way if needed.
    (The “pure” DVCSs didn’t include central servers in their original design.)

    Bazaar was developed by Canonical (yes, the Ubuntu company!) and became a GNU project in early 2008.

    Plastic SCM

    Plastic is a DVCS designed with commercial use in mind instead of open-source projects (unlike Git and Mercurial). Plastic was first released in late 2006, featuring strong branching and merging, including full merge tracking and rename support in merges. It provides a highly graphical working environment with many data-visualization capabilities, including a 3D revision tree. This distinguishes it from DVCSs that are oriented toward the hard-core, CLI-oriented hacker community.

    The motivation of Plastic’s developers (BTW, I’m one of them) is to target small and medium teams, closing the gap between expensive high-end systems like ClearCase and low-end ones like SVN.

    Plastic is built around the concept of parallel development, encouraging use of the “branch per task” pattern (feature branches). It can handle thousands of branches without breaking a sweat. Plastic is also distributed, supporting disconnected development, pushing and pulling of changesets on branches, and conflict resolution.

    A Community Edition of Plastic SCM was launched in November 2010.

    Team Foundation Server

    Microsoft, wanting to play a role in the SCM/ALM market, came up with Team Foundation Server (TFS). It’s an effort to heal the pain caused by its own VSS devil.

    While TFS is not very strong as a source-control system (kind of a new guy on the block, but using previous-generation technology), it comes fully packaged with a huge set of tools, from issue tracking to test management, in the pure “corporate-huge-integrated-thing” style.

    You won’t be doing branching, merging, or DVCS if you go for it, but maybe your company already purchased it, along with an MSDN subscription.
  • Enhanced CVS; Apache Software Foundation project; Apache HTTP Server; HTTP/WebDAV/svnserve protocol; stores deltas; file-system based (FSFS); doesn’t support tags; fear of branching
  • Tracking file permissions and ownership; tracking individual files with separate history
  • CVS = don’t do; distributed, BitKeeper-like workflow; strong safeguards against corruption; high performance
  • Git: Linux Kernel, Perl, Eclipse, Gnome, KDE, Qt, Ruby on
  • The staging area represents the next commit.
      • Add files: add them to the index
      • Commit: commit the index to the repository
      • Diff HEAD <> index: changed things not yet committed
      • Diff index <> working dir: changed things not added
      • Untracked files

    The basic Git workflow goes something like this:
      1. You modify files in your working directory.
      2. You stage the files, adding snapshots of them to your staging area.
      3. You do a commit, which takes the files as they are in the staging area and stores that snapshot permanently to your Git directory.
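The three workflow steps and the two diffs from this note can be tried in a throwaway repository. This is only a minimal sketch: the file names, the commit message, and the placeholder identity are made up for illustration, and everything happens in a temporary directory.

```shell
# Scratch repository in a temporary directory (all names are illustrative).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Dev"            # placeholder identity
git config user.email "dev@example.invalid"   # placeholder identity

# 1. Modify files in the working directory.
echo "first line" > notes.txt
echo "scratch" > todo.txt

# 2. Stage only notes.txt: a snapshot of it goes into the index.
git add notes.txt

# HEAD <> index: what the next commit will contain.
staged="$(git diff --cached --name-only)"

# Files Git does not track yet; they are not part of the next commit.
untracked="$(git ls-files --others --exclude-standard)"

# 3. Commit: store the index as a permanent snapshot.
git commit -q -m "Add notes"

# Change a tracked file without staging it.
echo "second line" >> notes.txt

# Index <> working directory: changed things not added.
unstaged="$(git diff --name-only)"

echo "staged: $staged | untracked: $untracked | unstaged: $unstaged"
```

Only notes.txt ends up in the commit; todo.txt stays untracked until it is added explicitly.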
  • Auto-Completion

    If you use the Bash shell, Git comes with a nice auto-completion script you can enable. Download the Git source code, and look in the contrib/completion directory; there should be a file called git-completion.bash. Copy this file to your home directory as .git-completion.bash, and add this to your .bashrc file:

    source ~/.git-completion.bash

    If you want to set up Git to automatically have Bash shell completion for all users, copy this script to the /opt/local/etc/bash_completion.d directory on Mac systems or to the /etc/bash_completion.d/ directory on Linux systems. This is a directory of scripts that Bash will automatically load to provide shell completions.

    If you’re using Windows with Git Bash, which is the default when installing Git on Windows with msysGit, auto-completion should be preconfigured.

    Press the Tab key when you’re writing a Git command, and it should return a set of suggestions for you to pick from:

    $ git co<tab><tab>
    commit config

    Git Aliases

    $ git config --global alias.<name> checkout
    $ git config --global alias.<name> branch
    $ git config --global alias.<name> commit
    $ git config --global alias.<name> status
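As a hedged illustration of the alias commands in this note: the short names used here (co, br, ci, st) are assumptions picked for the example, not names taken from the slides. To avoid touching your real settings, the sketch stores the aliases in a scratch repository's local config; adding --global would write them to ~/.gitconfig for every repository instead.

```shell
# Scratch repository so the example does not modify your global config.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Alias names (co, br, ci, st) are assumptions chosen for illustration.
git config alias.co checkout
git config alias.br branch
git config alias.ci commit
git config alias.st status

# "git st" now behaves like "git status".
git st --short

# Look up what an alias expands to.
co_target="$(git config --get alias.co)"
echo "alias.co -> $co_target"
```

Git aliases live in Git's own configuration, so they work in any shell, unlike Bash aliases.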
  • Why Git?

    Save Time
    Git is lightning fast. And although we're talking about only a few seconds per command, it quickly adds up in your work day. Use your time for something more useful than waiting for your version control system to get back to you.

    Work Offline
    What if you want to work while you're on the move? With a centralized VCS like Subversion or CVS, you're stranded if you're not connected to the central repository. With Git, almost everything is possible simply on your local machine: make a commit, browse your project's complete history, merge or create branches... Git lets you decide where and when you want to work.

    Undo Mistakes
    People make mistakes. A good thing about Git is that there's a little "undo" command for almost every situation. Correct your last commit because you forgot to include that small change. Revert a whole commit because that feature isn't necessary anymore. And when the going gets tough you can even restore disappeared commits with the reflog, because, behind the scenes, Git rarely really deletes something. This is peace of mind.

    Don't Worry
    Git gives you the confidence that you can't screw things up, and this is a great feeling. In Git, every clone of a project that one of your teammates might have on their local computer is a fully usable backup. Additionally, almost every action in Git only adds data (deleting is very rare). That means that losing data or breaking a repository beyond repair is really hard to do.

    Make Useful Commits
    A commit is only really useful if it contains only related changes. Imagine having a commit that contains something from feature A, a little bit of feature B, and bugfix C. This is hard for your teammates to understand and can't be rolled back easily if feature A is causing problems. Git helps you create granular commits with its unique "staging area" concept: you can determine exactly which changes shall be included in your next commit, even down to single lines.
    This is where version control starts to be useful.

    Work in Your Own Way
    When working with Git you can use your very own workflow, one that feels good for you. You don't have to use most of the advanced features to still benefit from all the others. Of course, you can connect with multiple remote repositories, rebase instead of merge, and work with submodules when you need them. But you can just as easily work with one central remote repository like in Subversion. Whatever works for you is fine with Git.

    Keep Order
    Separation of concerns is paramount to keeping track of things. While you're working on feature A, nothing (and no one) else should be affected by your unfinished code. What if it turns out the feature isn't necessary anymore? Or if, after 10 commits, you notice that you took the completely wrong approach? Branching is the answer to these problems. And while other version control systems also know branches, Git is the first one to make it work as it should: fast and easy.

    Go With the Flow
    Git is used by more and more well-known companies and open-source projects: Ruby on Rails, jQuery, Perl, Debian, the Linux kernel and many more. A large community is often an advantage by itself because an ecosystem evolves around the system. Lots of learning content, tools, and services make Git even more attractive.
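The three "undo" situations mentioned under Undo Mistakes (amend the last commit, revert a whole commit, recover a commit through the reflog) might look roughly like this. It is a sketch in a scratch repository; the file names, messages, and placeholder identity are invented for the example.

```shell
# Scratch repository (all names and contents are illustrative).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Dev"            # placeholder identity
git config user.email "dev@example.invalid"   # placeholder identity

echo "v1" > app.txt
git add app.txt
git commit -q -m "Add app"

# Forgot a small change: fold it into the last commit instead of adding a new one.
echo "docs" > README.txt
git add README.txt
git commit -q --amend --no-edit
count_after_amend="$(git rev-list --count HEAD)"

# A feature turns out to be unnecessary: revert adds a new commit that undoes it.
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"
git revert --no-edit HEAD >/dev/null

# A commit "disappears" after a hard reset...
echo "v2" > app.txt
git commit -q -am "Update app"
lost="$(git rev-parse HEAD)"
git reset -q --hard HEAD~1

# ...but the reflog still records where HEAD pointed one move ago.
git reset -q --hard "HEAD@{1}"
restored="$(git rev-parse HEAD)"

echo "lost: $lost"
echo "restored: $restored"
```

Note that amend rewrites the last commit, so it is best kept to commits that have not been pushed yet, while revert leaves existing history untouched.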
  • Git! Why? How?

    1. git! why? how? (October 2011)
    2. VCS history
    3. Svn revisited
    4. Basic svn workflow. So, what’s wrong with this?
    5. When… …did you last branch? …was your last merge? …were you stressed by a broken commit*? (* committed by someone else, of course)
    6. It began with the Linux kernel…
        • Birthday: April 3, 2005
        • Linus Torvalds invented it “overnight”, after BitKeeper’s free license for kernel developers was withdrawn
        • Tech Talk: Linus Torvalds on git
    7. Git is optimized for… distributed development, large file sets, merging complex structures, branching, fast operations, robustness
    8. Torvalds’ design criteria
    9. What’s git
    10. Git is better than svn!
    11. Core features
    12. How does it work?
        • SHA-1 is King
            – Universal public identifier
            – Every object has it (blobs, trees, commits, tags)
        • Multiple protocols: http, ssh, git
        • Efficient object store
        • Disk is cheap: everyone has the entire repo
        • Easy branching and merging
    13. Snapshots, Not Differences (Git vs. Svn)
    14. Not centralized… (Svn)
    15. …distributed! (Git)
    16. Working with git
    17. Staging area: “index” or “cache”
    18. Branching best practice
    19. Into the ring!
    20. Git @ Namics
        • know.namics: buildrun/Git, Git+bei+Namics, Schnipsel (snippets)
        • git.namics: ssh public key authentication; access? contact a git admin
        • (scm.namics): SVN, Jenkins, Bamboo, Jira, LDAP
    21. Cool Stuff
    22. Tons of Tools
        • git-archive: export a tree as tar/zip
        • git-bisect: find the broken commit
        • git-cherry-pick: selective merging
        • git-revert: add a second commit that reverts an earlier one
        • git-blame: who wrote this?
    23. Bash aliases - I’m lazy!
        alias ga='git add .'
        alias gcam='git commit -am'
        alias grh='git reset --hard HEAD'
        alias gs='git status'
        alias gb='git branch'
        alias gc='git checkout'
        alias gcm='git checkout master'
        alias gcs='git checkout staging'
        alias gcd='git checkout development'
        alias gph='git push'
        alias gpt='git push --tags'
        alias gpl='git pull'
        alias gm='git merge'
        alias gmm='git merge master'
        alias gms='git merge staging'
        alias gmd='git merge development'
        alias gba='git branch -a'
        alias gt='git tag'
        alias garc='git archive HEAD --format=zip > …'
    24. Autocomplete
        • Download file: https://…-completion.bash
        • Add to bash_profile: $ source ~/.git-completion.bash
    25. Graphical clients
    26. SourceTree - OSX (GET IT WHILE IT’S FREE!)
    27. Tower - OSX
    28. SmartGit - WIN
    29. What’s the gain for Namics? Save time, keep order, backup, save space, work offline
    30. Resources: Great book, and it’s free! Search engine of choice…
    31. drop svn and use git!
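Slide 12's "SHA-1 is King" can be checked by hand. The sketch below asks Git for the id of a small blob and then recomputes it as the SHA-1 of the header "blob <size>" plus a NUL byte plus the content. It assumes Git's default SHA-1 object format and the sha1sum tool from GNU coreutils; the content string is arbitrary.

```shell
# Arbitrary content for the demonstration.
content='hello'

# Git's own id for a blob holding "hello" plus a newline.
git_id="$(printf '%s\n' "$content" | git hash-object --stdin)"

# The same id by hand: SHA-1 over "blob <size>\0" followed by the content.
size="$(printf '%s\n' "$content" | wc -c | tr -d ' ')"
manual_id="$( { printf 'blob %s\0' "$size"; printf '%s\n' "$content"; } | sha1sum | cut -d' ' -f1 )"

echo "git:    $git_id"
echo "manual: $manual_id"
```

Because the id depends only on the content, identical files get the same id in every clone, which is what makes it usable as a universal identifier.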