Before talking about Git, let's talk about: What is Version Control?
About Version Control
• To keep a local backup of a working directory, the simplest approach is to maintain copies of the files in clearly named folders so that changes can be tracked.
• A smarter way is to maintain a local database that tracks these changes.
• Git does exactly this: it maintains a database of all the local changes, and goes further by letting that history be shared through a server so the code is available to others.

Operations in a VCS
• Commit changes
• Track file changes in the working directory
• Compare changes between versions
• Check out any earlier version
• Collaborate between many systems through a server accessible to all involved
These are a few basic operations that all VCSs are able to perform.
• Different from Subversion and the other VCSs in use.
• Very efficient and much more sophisticated.
• Snapshots, not differences
Snapshots and differences
• Differences: Most VCSs store each file and keep track of the changes made to it at each commit. That is, they store their information as a list of file-based changes.
• Snapshots: Git doesn't think of or store its data this way. Instead, Git thinks of its data more like a set of snapshots of a mini filesystem. Every time you commit, or save the state of your project in Git, it basically takes a picture of what all your files look like at that moment and stores a reference to that snapshot. To be efficient, if files have not changed, Git doesn't store the file again, just a link to the previous identical file it has already stored.
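The snapshot idea can be checked in a scratch repository. This is only a minimal sketch: the temporary directory, the file names stable.txt and notes.txt, and the identity values are assumptions made for illustration. It compares the object ID Git records for each file in two consecutive commits.

```shell
# Scratch repository in a temporary directory (all names are illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "John Doe"
git config user.email "johndoe@example.com"

echo "never changes" > stable.txt
echo "version 1" > notes.txt
git add stable.txt notes.txt
git commit -q -m "first snapshot"

echo "version 2" > notes.txt          # only notes.txt changes
git commit -q -a -m "second snapshot"

# Object ID recorded for each file in the previous and the current commit.
stable_old=$(git rev-parse HEAD~1:stable.txt)
stable_new=$(git rev-parse HEAD:stable.txt)
notes_old=$(git rev-parse HEAD~1:notes.txt)
notes_new=$(git rev-parse HEAD:notes.txt)

echo "stable.txt: $stable_old -> $stable_new"   # same ID: the stored file is reused
echo "notes.txt:  $notes_old -> $notes_new"     # different IDs: a new version was stored
```

The unchanged file keeps the same object ID in both commits, which is the "link to the previous identical file" described above.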
• Nearly every operation is local
• It generally adds data
• The Three Stages
The three stages of local operations
• Working directory: The working directory is a single checkout of one version of the project. These files are pulled out of the compressed database in the Git directory and placed on disk for you to use or modify.
• Staging area: The staging area is a simple file, generally contained in your Git directory, that stores information about what will go into your next commit. It's sometimes referred to as the index, but it's becoming standard to refer to it as the staging area.
• Git directory: The Git directory is where Git stores the metadata and object database for your project. This is the most important part of Git, and it is what is copied when you clone a repository from another computer.

The basic Git workflow goes something like this:
1. You modify files in your working directory.
2. You stage the files, adding snapshots of them to your staging area.
3. You do a commit, which takes the files as they are in the staging area and stores that snapshot permanently to your Git directory.
• If a particular version of a file is in the Git directory, it's considered committed.
• If it's modified but has been added to the staging area, it is staged.
• And if it was changed since it was checked out but has not been staged, it is modified.
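The three states can be watched with git status --porcelain, which prints a two-letter code per file (first column: staging area, second column: working directory). This is a rough sketch in a scratch repository; the directory, file content and identity are assumptions.

```shell
# Scratch repository with one committed file.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "John Doe"
git config user.email "johndoe@example.com"
echo "hello" > README
git add README
git commit -q -m "initial project version"

echo "more text" >> README                   # 1. modify the file
state_modified=$(git status --porcelain)     # second column is M: modified, not staged

git add README                               # 2. stage the file
state_staged=$(git status --porcelain)       # first column is M: staged

git commit -q -m "update README"             # 3. commit the staged snapshot
state_committed=$(git status --porcelain)    # empty: nothing left to commit

printf 'modified:  [%s]\nstaged:    [%s]\ncommitted: [%s]\n' \
  "$state_modified" "$state_staged" "$state_committed"
```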
First time Git setup
• Your identity
This is important because every Git commit uses this information.
$ git config --global user.name "John Doe"
$ git config --global user.email firstname.lastname@example.org
The --global option sets the default value that will be picked up. If a specific value is required for a particular project, the config can be set for that project alone by running the same command without --global inside it.
• Your diff tool and default editor (optional)
$ git config --global merge.tool vimdiff
$ git config --global core.editor emacs
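A per-project override might look like the sketch below. It only writes to a scratch repository's own configuration, so no global setting is touched; the name and email are made-up values.

```shell
# Scratch repository; nothing here changes the global configuration.
repo=$(mktemp -d) && cd "$repo"
git init -q

# Without --global the values are stored in this repository's .git/config only.
git config user.name "Jane Roe"
git config user.email "jane.roe@example.com"

# --local reads only the repository-level file.
project_name=$(git config --local --get user.name)
# Without a scope flag Git reports the effective value; the local one wins over the global one.
effective_email=$(git config --get user.email)

echo "name for this project:  $project_name"
echo "email for this project: $effective_email"
```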
git init
• Initializing Git in an existing directory
$ git init
• Start tracking files in the directory
$ git add *.c
$ git add README
$ git commit -m "initial project version"
Recording Changes to Repository
File Status Lifecycle
Each file in the working directory can be in one of two states:
• Tracked: files that were in the last snapshot. They can be in one of three states:
  o Unmodified
  o Modified
  o Staged
• Untracked: everything else, i.e. any files that were not in the last snapshot and are not in the staging area. To start tracking them we have to stage them with git add; they become part of the history with the next commit.
As you edit tracked files, Git sees them as modified, because you've changed them since your last commit. You stage these modified files and then commit all your staged changes, and the cycle repeats.
Checking status of files in Repo
git status
• No file modifications
$ git status
# On branch master
nothing to commit (working directory clean)
• Status for untracked file
$ vim README
$ git status
# On branch master
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#   README
nothing added to commit but untracked files present (use "git add" to track)
Tracking new files
git add
In order to track a new file README, the following command will suffice:
$ git add README
To see that it has been tracked we can check the status now:
$ git status
# On branch master
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#   new file: README
#
Staging Modified Files
git add
• For a modified, unstaged file, git status will show something like:
$ git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#
#   modified: benchmarks.rb
#

git add
• After git add
$ git add benchmarks.rb
$ git status
# On branch master
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#   modified: benchmarks.rb
#

• Modifying a staged file
$ git status
# On branch master
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#   modified: benchmarks.rb
#
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#
#   modified: benchmarks.rb
#
• The staged files are snapshots taken at the time of the earlier git add. To stage the new changes, git add needs to be called again.
$ git add benchmarks.rb
$ git status
# On branch master
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#   modified: benchmarks.rb
#
• Now all the modifications are finally staged, though not yet committed. Before committing, let's see what more Git has to offer at this stage.
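The situation where one file is both staged and modified can be reproduced in a scratch repository. The sketch below reuses the file name benchmarks.rb from the slides; the file content, directory and identity are assumptions. git status --porcelain is used because its two-letter codes are easy to compare.

```shell
# Scratch repository with benchmarks.rb already committed.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "John Doe"
git config user.email "johndoe@example.com"
echo "v1" > benchmarks.rb
git add benchmarks.rb
git commit -q -m "add benchmarks"

echo "v2" >> benchmarks.rb
git add benchmarks.rb                     # the staged snapshot is taken here
echo "v3" >> benchmarks.rb                # edited again after staging

both=$(git status --porcelain)            # both columns set: staged and modified
git add benchmarks.rb                     # stage the newer edit as well
restaged=$(git status --porcelain)        # only the staged column is set

echo "after editing a staged file: [$both]"
echo "after running git add again: [$restaged]"
```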
Ignoring files
.gitignore
The rules for the patterns you can put in the .gitignore file are as follows:
• Blank lines or lines starting with # are ignored.
• Standard glob patterns work.
• You can end patterns with a forward slash (/) to specify a directory.
• You can negate a pattern by starting it with an exclamation point (!).
• This file can be added to the root directory and/or to any subfolder.
• The rules in a subfolder's .gitignore take precedence over those in the parent folder's .gitignore.
• This file should be added to the Git repository; it gets tracked like any other file.
An example of .gitignore instructions (comments must sit on their own lines, since Git does not treat a trailing # on a pattern line as a comment):
# a comment - this is ignored
# no .a files
*.a
# but do track lib.a, even though you're ignoring .a files above
!lib.a
# only ignore the root TODO file, not subdir/TODO
/TODO
# ignore all files in the build/ directory
build/
# ignore doc/notes.txt, but not doc/server/arch.txt
doc/*.txt
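These patterns can be tried out in a scratch repository. The sketch below is illustrative only: the file names are made up to exercise each rule, and git ls-files --others --exclude-standard is used to list the untracked files that are not ignored.

```shell
# Scratch repository with one file per rule (all file names are illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
mkdir -p build doc/server subdir
touch foo.a lib.a TODO subdir/TODO build/out.o doc/notes.txt doc/server/arch.txt

cat > .gitignore <<'EOF'
# no .a files
*.a
# but do track lib.a
!lib.a
# only ignore the root TODO file, not subdir/TODO
/TODO
# ignore all files in the build/ directory
build/
# ignore doc/notes.txt, but not doc/server/arch.txt
doc/*.txt
EOF

# Untracked files that are NOT ignored, one path per line.
visible=$(git ls-files --others --exclude-standard)
echo "$visible"
```

Only .gitignore, lib.a, subdir/TODO and doc/server/arch.txt should be listed; the other files are matched by one of the ignore rules.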
Viewing differences
• To view all unstaged changes
$ git diff
• To view all staged changes with respect to the previous commit
$ git diff --cached
• To view all changes since the last commit (staged + unstaged)
$ git diff HEAD
• To compare the tips of two branches
$ git diff master..test
• To view the changes made on test since its common ancestor with master
$ git diff master...test
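The first three forms can be compared side by side in a scratch repository with one staged and one unstaged change. This is a small sketch: a.txt and b.txt are made-up files, and --name-only is added just to keep the output short.

```shell
# Scratch repository with two committed files.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "John Doe"
git config user.email "johndoe@example.com"
echo "a1" > a.txt
echo "b1" > b.txt
git add a.txt b.txt
git commit -q -m "initial"

echo "a2" >> a.txt
git add a.txt                                  # a.txt: staged change
echo "b2" >> b.txt                             # b.txt: unstaged change

unstaged=$(git diff --name-only)               # working directory vs staging area
staged=$(git diff --cached --name-only)        # staging area vs last commit
since_head=$(git diff --name-only HEAD)        # working directory vs last commit

echo "git diff:          $unstaged"
echo "git diff --cached: $staged"
echo "git diff HEAD:     $(echo $since_head)"
```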
Committing changes
git commit
• This commits all the staged changes to the local commit history.
• It launches the default editor so that a commit message can be entered, since by default Git aborts a commit that has an empty message.
• The -m option provides the commit message inline:
  $ git commit -m "adding inline commit message"
• The -a option skips the staging step and directly commits all changes to files that are already tracked (new, untracked files still need git add):
  $ git commit -a -m "inline commit message"
Removing files from being tracked
git rm
• To stop tracking a file and also delete it, use git rm followed by the file name.
  o $ git rm <file name>
• To stop tracking a file but keep a copy of it in the directory, use the --cached option.
  o $ git rm --cached <file name>
• Simply deleting a file from the directory is treated as an unstaged modification. git rm needs to be used to stage the removal so it can be committed.
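The difference between the two forms shows up on disk. The sketch below uses a scratch repository with two made-up files, local.cfg and main.c; everything here is assumed for illustration.

```shell
# Scratch repository tracking two files.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "John Doe"
git config user.email "johndoe@example.com"
echo "local settings" > local.cfg
echo "int main(void) { return 0; }" > main.c
git add local.cfg main.c
git commit -q -m "initial"

git rm -q --cached local.cfg     # stop tracking, but keep the file on disk
git rm -q main.c                 # stop tracking and delete the file
git commit -q -m "remove files from tracking"

tracked=$(git ls-files)          # files Git still tracks (none are left)
ls                               # local.cfg is still there, main.c is gone
```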
Viewing the commit history
git log
• This shows all the commits made in the repository in reverse chronological order.
• A few options come in handy with git log:
  o -p : shows the diff introduced in each commit.
  o -<n> : limits the output to the last n commits; for example, -p -2 shows the diffs of the last two commits only.
  o --stat : shows abbreviated stats for each commit.
Undoing changes
• Changing the last commit
Suppose we missed staging a particular change needed for the commit, but don't want to create one more commit for it. To tackle these situations we can use git commit --amend:
$ git add <forgotten file>
$ git commit --amend
This replaces the last commit with a new one that also contains the newly staged changes.
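A quick way to see that amending does not add a second commit is to count the commits afterwards. This is a sketch in a scratch repository with made-up files; --no-edit is added here only so that the existing message is reused without opening an editor.

```shell
# Scratch repository where README is forgotten in the first commit.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "John Doe"
git config user.email "johndoe@example.com"
echo "int main(void) { return 0; }" > main.c
echo "project notes" > README

git add main.c
git commit -q -m "initial project version"   # README was not staged

git add README                               # stage the forgotten file
git commit -q --amend --no-edit              # redo the last commit, same message

count=$(git rev-list --count HEAD)           # number of commits in the history
files=$(git ls-tree --name-only HEAD)        # files recorded in the last commit

echo "commits in history: $count"
echo "files in the commit: $(echo $files)"
```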
Undoing changes
• Unstaging a staged file
$ git reset HEAD <file name>
This unstages the file but keeps the modifications we made in it.
• Unmodifying a modified file
$ git checkout -- <file name>
This discards the unstaged changes in the file. If nothing is staged for it, the file goes back to its state in the last commit.
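Both undo steps can be followed in a scratch repository. A minimal sketch, assuming a made-up file notes.txt:

```shell
# Scratch repository with one committed file.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "John Doe"
git config user.email "johndoe@example.com"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "initial"

echo "v2" >> notes.txt
git add notes.txt                           # the change is staged

git reset -q HEAD notes.txt                 # unstage; the edit stays in the file
after_reset=$(git status --porcelain)       # modified, not staged

git checkout -- notes.txt                   # throw the edit away
after_checkout=$(git status --porcelain)    # clean again
content=$(cat notes.txt)                    # back to the committed content

echo "after reset:    [$after_reset]"
echo "after checkout: [$after_checkout], content: $content"
```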
Summarized flow of commands
git init
Add files.
git add/rm <file>
Modify files; they then have to be staged again.
git add/rm <file>
git commit -m "commit message"

Recap
• Starting a local repository
  o git init
• Staging
  o git add/rm
• Committing
  o git commit
• Comparing
  o git diff
• Undoing staging and committing
  o git reset/checkout
First let's see how Git stores data

Every new commit stores a pointer to its parent commit.

A branch in a Git repository is a lightweight movable pointer to one of the commits in the network.

Creating a new branch
$ git branch testing
This creates a new pointer pointing to the latest commit of the current branch.

Git keeps track of which branch we are currently working on through a special pointer called HEAD. It always points to the local branch we are presently working on.

To switch between branches
$ git checkout testing
We can also combine the two steps of creating and switching by adding the -b option:
$ git checkout -b <new branch name>
This creates the new branch and switches to it directly. After that, plain git checkout is enough for switching between branches that already exist.
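That a branch is only a pointer, and that HEAD names the current branch, can be seen by printing what each one resolves to. The sketch below uses a scratch repository; the call to git symbolic-ref right after git init is an addition that makes sure the first branch is called master regardless of the local default.

```shell
# Scratch repository with one commit on master.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "John Doe"
git config user.email "johndoe@example.com"
echo "v1" > file.txt
git add file.txt
git commit -q -m "first commit"

git branch testing                              # new pointer, no new commit
master_tip=$(git rev-parse master)
testing_tip=$(git rev-parse testing)            # same commit as master

head_before=$(git symbolic-ref --short HEAD)    # branch HEAD points to now
git checkout -q testing
head_after=$(git symbolic-ref --short HEAD)     # HEAD moved to testing

echo "master  -> $master_tip"
echo "testing -> $testing_tip"
echo "HEAD: $head_before, then $head_after"
```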
Now the network looks like this
Let's commit a change to this branch and see how the state of the network changes
$ git commit -a -m "made some change"

Now suppose we decide to go back to the master branch and make some changes there.
$ git checkout master

Now let's make some changes...
$ git commit -a -m "made master changes"

Branch merging
Let's say we have, at some point in time, a network similar to...
Merging the hotflix branch with master.
• We first have to switch to the branch that we want the other branch to be merged into.
$ git checkout master
$ git merge hotflix
Deleting a branch
Let's delete the hotflix branch.
$ git branch -d hotflix
The image here shows a later stage where we have made more changes to the iss53 branch after deleting the hotflix branch.

Now it becomes a bit more complex to merge iss53 and master, as their histories have diverged: each branch has commits that the other does not. But Git is smart in this case. It determines the best common ancestor of the two branches and uses it as the merge base. This is different from how other version control systems handle it.
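The merge base and the resulting merge commit can be inspected directly. The following is only a sketch in a scratch repository: the branch name iss53 comes from the slides, while the files and commit messages are made up, and --no-edit is added so the merge does not open an editor.

```shell
# Scratch repository where master and iss53 diverge from one base commit.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "John Doe"
git config user.email "johndoe@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "common base"
base=$(git rev-parse HEAD)

git checkout -q -b iss53
echo "issue work" > iss53.txt
git add iss53.txt
git commit -q -m "work on iss53"
iss53_tip=$(git rev-parse HEAD)

git checkout -q master
echo "master work" > master.txt
git add master.txt
git commit -q -m "work on master"
master_before=$(git rev-parse HEAD)

merge_base=$(git merge-base master iss53)   # best common ancestor
git merge -q --no-edit iss53                # three-way merge, new merge commit

first_parent=$(git rev-parse HEAD^1)        # previous tip of master
second_parent=$(git rev-parse HEAD^2)       # tip of iss53
echo "merge base: $merge_base"
echo "merge commit parents: $first_parent $second_parent"
```

The merge commit has two parents, one from each branch, and the merge base is the commit from which both branches started.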
The final state
Basic merge conflicts
• Auto merging after git merge stops when a conflict arises.
• In such a case we have to manually resolve the conflict and stage the changes.
• Without staging, the conflict is not marked as resolved and the merge cannot be completed.
• An example of how a file with a merge conflict looks:
<<<<<<< HEAD:index.html
<div id="footer">contact : email@example.com</div>
=======
<div id="footer"> please contact us at firstname.lastname@example.org</div>
>>>>>>> iss53:index.html
Here the section between "<<<<<<< HEAD" and "=======" is the code from the current branch, and the section between "=======" and ">>>>>>>" is the code from the branch that is being merged into the current branch.
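A conflict of this kind can be produced and resolved in a scratch repository. This is a rough sketch: the footer texts are made up, and the "manual" resolution is simulated by overwriting the file with the final content.

```shell
# Scratch repository where master and iss53 change the same line.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "John Doe"
git config user.email "johndoe@example.com"
printf '<div id="footer">contact</div>\n' > index.html
git add index.html
git commit -q -m "initial footer"

git checkout -q -b iss53
printf '<div id="footer">please contact us</div>\n' > index.html
git commit -q -a -m "iss53 footer"

git checkout -q master
printf '<div id="footer">contact : support</div>\n' > index.html
git commit -q -a -m "master footer"

# The merge stops with a non-zero exit status because of the conflict.
if git merge -q --no-edit iss53 >/dev/null 2>&1; then
  merge_status=clean
else
  merge_status=conflict
fi
markers=$(grep -c '^<<<<<<<' index.html)   # conflict markers were written into the file

# Resolve by hand (simulated), then stage and commit to finish the merge.
printf '<div id="footer">please contact us : support</div>\n' > index.html
git add index.html
git commit -q -m "merge iss53, resolving footer conflict"
remaining=$(git status --porcelain)        # empty once the merge is complete

echo "merge: $merge_status, conflict blocks: $markers"
```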
Working with remote server
Remote
• By remote we mean the server through which we collaborate.
• It holds the combined commit network of every system involved in the project.
• This brings us to GitHub.

GitHub
• It is a social code sharing network.
• Share your code
• Clone/download code from a repository
• Fork code from a read-only repository into your own repository
• Browse available public repositories
• Control the visibility of your repository
• Offers a nice interface to view code, the branch network and the commit history, and much more
• Very well maintained
Showing remote servers
git clone
$ git clone <remote git server url/ SSH url>
This creates a new local repository which is a clone of the server repository, along with all its branches. By default it adds the server URL as a remote named origin, and the branches on the server are represented locally as origin/<branch name>. The local master branch by default starts tracking the master branch of the server repo (origin/master).

Add remote repositories
• To check which remote repositories have already been added to our local repository:
$ git remote
We can also use the -v option to show the remote URLs too.
• If we want to add more remote repositories to our local repository:
$ git remote add [short name] [url]
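Cloning and listing remotes can be tried without a real server by cloning from a local path. In the sketch below a scratch repository plays the role of the server; the directory names and the second remote name upstream are assumptions made for illustration.

```shell
# A scratch repository that stands in for the server repository.
src=$(mktemp -d)
cd "$src"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "John Doe"
git config user.email "johndoe@example.com"
echo "v1" > file.txt
git add file.txt
git commit -q -m "first commit"

# Clone it; a local path is used here in place of a server URL.
clone_dir=$(mktemp -d)/project
git clone -q "$src" "$clone_dir"
cd "$clone_dir"

remotes=$(git remote)                  # the remote added by clone
remote_branches=$(git branch -r)       # remote branches as <remote>/<branch>

git remote add upstream "$src"         # a second remote under another short name
remotes_after=$(git remote)

git remote -v                          # short names together with their URLs
```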
Fork a repository
• This feature of GitHub allows us to make a writable copy of a read-only public repository in our own GitHub account and work with it.
• It also allows us to stay in sync with the original repository through simple git pull / git fetch commands.
• For that to work, we have to add the address of the original repository as a remote using
$ git remote add [short name] [url]
Fetching and Pulling from remote
git fetch
$ git fetch [remote name]
This fetches all the new data from the remote and keeps a local copy of it. It does not merge anything into the working directory; we have to merge it manually.
If our local working branch is tracking a branch on the server, it is easier to call git pull, which fetches and automatically merges.
$ git pull [remote name]

Pushing to remote
git push
This command pushes the latest commits of a branch in the local repository to the remote server.
$ git push [remote server name] [branch name]
This pushes the code without error if the remote server's latest commit on this branch is part of our commit history. Otherwise Git asks us to pull the server code first, after which we can push our code.
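The rejected push and the fetch-then-merge step can be played through with a bare repository on the local disk acting as the server. This is a sketch under assumptions: the two users alice and bob, the paths and the file names are all made up, and git merge origin/master is used to show by hand what git pull would do after fetching.

```shell
base=$(mktemp -d)

# alice starts the project.
git init -q "$base/alice"
cd "$base/alice"
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "alice"
git config user.email "alice@example.com"
echo "v1" > file.txt
git add file.txt
git commit -q -m "initial"

# A bare copy plays the role of the server; alice registers it as origin.
git clone -q --bare "$base/alice" "$base/server.git"
git remote add origin "$base/server.git"

# bob clones from the server, commits and pushes.
git clone -q "$base/server.git" "$base/bob"
cd "$base/bob"
git config user.name "bob"
git config user.email "bob@example.com"
echo "bob" > bob.txt
git add bob.txt
git commit -q -m "bob's change"
git push -q origin master

# alice commits without bob's work, so her push is rejected.
cd "$base/alice"
echo "alice" > alice.txt
git add alice.txt
git commit -q -m "alice's change"
if git push -q origin master 2>/dev/null; then first_push=accepted; else first_push=rejected; fi

git fetch -q origin                        # downloads bob's commit only
if [ -e bob.txt ]; then has_bob_after_fetch=yes; else has_bob_after_fetch=no; fi

git merge -q --no-edit origin/master       # manual merge of what was fetched
if git push -q origin master 2>/dev/null; then second_push=accepted; else second_push=rejected; fi

echo "first push: $first_push, bob.txt after fetch: $has_bob_after_fetch, second push: $second_push"
```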
Inspecting a remote
If we want to see information about a remote server that the local repository has a reference to, we can do so with the help of:
$ git remote show [remote name]
There are other operations like:
• renaming a remote
$ git remote rename [existing remote name] [new remote name]
• removing a remote
$ git remote rm [remote name]

Remote branches
List of branches present in our clone
$ git branch -a
This lists all the branches present in the local repository along with all the branches that have been cloned from the server ([remote name]/[branch name]).
To switch to a branch from the clone that our local repository does not have yet, we need to create a new local branch based on the remote branch:
$ git checkout -b [local branch name] [remote name]/[remote branch name]

• Tracking remote branches
By default a cloned repository's master branch tracks the remote server's [remote name]/master branch.
But we can choose another remote branch, and a new local branch will be created that tracks it.
$ git checkout --track [remote name]/[branch name]
• Pushing a particular branch to the server
$ git push [remote name] [local branch name]:[remote branch name]
• Deleting a remote branch
$ git push [remote name] :[remote branch name]
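The push, track and delete commands above can be followed end to end against a bare repository on the local disk. This is only a sketch: the branch names feature, shared-feature and review, as well as the paths, are made up for illustration.

```shell
base=$(mktemp -d)

# Local repository plus a bare copy that acts as the server.
git init -q "$base/work"
cd "$base/work"
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "John Doe"
git config user.email "johndoe@example.com"
echo "v1" > file.txt
git add file.txt
git commit -q -m "initial"
git clone -q --bare "$base/work" "$base/server.git"
git remote add origin "$base/server.git"

# Push a local branch to the server under a different name.
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "feature work"
git push -q origin feature:shared-feature
after_push=$(git ls-remote --heads origin)

# Create a local branch that tracks the remote branch.
git fetch -q origin
git checkout -q --track -b review origin/shared-feature
upstream=$(git rev-parse --abbrev-ref 'review@{upstream}')

# Delete the branch on the server by pushing "nothing" to it.
git push -q origin :shared-feature
after_delete=$(git ls-remote --heads origin)

echo "review tracks: $upstream"
echo "server branches after delete:"
echo "$after_delete"
```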
Thank you• Prepared by Robin Srivastava , Soumya Behera – email@example.com – firstname.lastname@example.org