Data science Git management
Arindam Banerjee
Data Scientist, Ericsson
8/7/2020 Arindam Banerjee 1
Version Control Systems (VCS)
• Version control is indispensable for any kind of development.
• Crucial for collaboration and teamwork.
• A VCS records changes (revisions) to a file or set of files over time so that a
specific version can be recalled later.
• Part of Software Configuration Management.
• Revisions are generally thought of as a directed tree; once merges are included, the history is a directed acyclic graph.
• Revisions occur in sequence over time and can be arranged in order by revision
number or timestamp.
• Examples: Git, Subversion, Mercurial, CVS, etc.
Git
• “Git is a distributed version-control system for tracking changes
in source code during software development.” — Wikipedia
• The most recognized and widely used modern version control system.
• Git is free and open-source software distributed under the GNU General
Public License version 2.
• Git is a mature, actively maintained open source project originally
developed in 2005 by Linus Torvalds.
• Designed with performance, security and flexibility in mind.
Characteristics
• Supports non-linear development: rapid branching and merging, and
tools for navigating a non-linear development history.
• Distributed: every clone is a complete local repository that can synchronize with a remote version.
• Fast and scalable
• Toolkit-based design – C and Shell Script for speed and portability.
• Primarily developed on Linux but supports macOS and Windows.
GitHub
• Founded in 2008, GitHub.com is the best-known and most widely
used cloud-based version control platform.
• It has been a subsidiary of Microsoft since 2018.
• Project files are stored in a repository at a remote cloud location.
• Once local changes are pushed to GitHub, the remote version is
updated.
• Every change is logged as a recorded commit.
• Allows rollback to a previous version.
• Anyone can access a public remote repository and download it.
• Only registered users can contribute, discuss, manage
repositories, and review code changes.
Branching
• One of the most important and efficient features of Git.
• A branch acts like a temporary copy of the project where changes are made first, without breaking anything in the production version.
For Data Scientists
• Software engineers, developers, and data scientists use GitHub to store
their work, read documentation of others’ work, and collaborate across
teams and organizations.
• For incremental development and deployment.
• CI/CD pipeline - DevOps operations.
• Faster release cycle - agile workflow with frequent smaller changes.
• Putting better and retrained models into production.
Creating a git repo
• Create your project directory and arrange it as you want it.
• Run git init inside the top directory of your project.
• A directory called .git is then created in that location.
• This is where Git records all the commits and keeps track of everything.
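The steps above can be tried safely in a throwaway location. This is a minimal sketch; the directory name my-project and the use of mktemp are assumptions made for illustration, not part of the original slides.

```shell
set -e

# Hypothetical project directory inside a temporary location,
# so nothing on the machine is touched outside of it.
project="$(mktemp -d)/my-project"
mkdir -p "$project"
cd "$project"

# Initialise the repository; Git creates the .git directory here.
git init -q

# List the directory where Git keeps commits and bookkeeping data.
ls -d .git
```

Deleting the .git directory removes the version history but leaves the project files in place.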
Cloning a git repo
• Instead of creating a repo from scratch, one can clone an existing repo
from a remote location.
• Use the git clone command with the URL of the remote repository:
git clone https://github.com/link/to/your/remote/repo
• Cloning creates a new directory and places the cloned Git repository in
it.
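Cloning can be sketched without network access by letting a local bare repository stand in for the hosted one. The name remote.git and the temporary directory are assumptions for illustration; with a real remote, its URL goes in place of the local path.

```shell
set -e

work="$(mktemp -d)"
cd "$work"

# Stand-in for a hosted repository (assumption: a local bare repo,
# so the example runs offline).
git init -q --bare remote.git

# Clone it. Git creates a new directory named after the repository
# (here "remote", with the .git suffix stripped).
git clone -q remote.git

ls -d remote/.git
```

Git may warn that the cloned repository is empty; that is expected here because nothing has been pushed to the stand-in remote yet.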
Check repo’s status
• The git status command is used to check the state of the repo.
• It is good practice to run it after any other Git command.
• It shows untracked new files created in the working directory, tracked
files that have been modified, and the branch you are currently on.
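A small sketch of what git status reports, using a throwaway repository. The file names and the placeholder identity are assumptions for illustration; --short is a standard option that prints a compact form of the same information.

```shell
set -e

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

# Commit one file so that it becomes tracked.
echo "v1" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked.txt"

echo "v2" > tracked.txt      # modify the tracked file
echo "new" > untracked.txt   # create a file Git does not track yet

# Long form: current branch, modified files, untracked files.
git status

# Compact form: " M" marks a modified tracked file, "??" an untracked one.
status_short="$(git status --short)"
printf '%s\n' "$status_short"
```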
Information about commits
• The git log command displays the commits in the history of the current branch.
• It shows the SHA (a unique ID), the author, the date, the commit message, etc.
• git log --stat shows the files modified in each commit, the
number of lines added and deleted, and a summary of the
modifications.
• git log --patch or git log -p shows the actual changes
made to each file.
• git show with no argument displays the most recent commit with its changes, like the first entry of git log -p.
• git show XXXXXX shows the changes made in the commit with
SHA XXXXXX.
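The commands above can be compared side by side on a two-commit history. This is a sketch under assumptions: the file name, commit messages, and identity are placeholders, and the -1 and --no-pager options (both standard Git options, not from the slides) are added only to keep the output short and non-interactive.

```shell
set -e

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

# Build a tiny history with two commits.
echo "alpha" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo "beta" >> notes.txt
git commit -q -am "Extend notes"

git --no-pager log               # SHA, author, date, message for each commit
git --no-pager log --stat -1     # files touched and line counts, latest commit only
git --no-pager log -p -1         # the actual diff, latest commit only

# Inspect one specific commit by its SHA.
sha="$(git rev-parse --short HEAD)"
git --no-pager show "$sha"

commit_count="$(git rev-list --count HEAD)"
last_subject="$(git log -1 --format=%s)"
```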
Add commits to repo
• Git’s staging step lets you keep making changes in the working
directory and, when you are ready to interact with version control,
record them in small commits.
• A file must be in the Staging Index before it can be committed.
• git add file1 file2 … stages files, copying their current content from
the Working Directory to the Staging Index.
• git add . stages all new and modified files under the current
directory.
• git rm --cached file1... unstages newly added files if
you added the wrong files by mistake. For files that are already tracked, it stages their removal instead, so use git restore --staged file1... (Git 2.23 or later) in that case.
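The staging workflow above can be sketched as follows. The file names are hypothetical; git diff --cached --name-only is a standard command used here only to list what is currently staged.

```shell
set -e

repo="$(mktemp -d)"
cd "$repo"
git init -q

echo "a" > file1.txt
echo "b" > file2.txt
echo "scratch" > wrong.txt

# Stage three files, one of them by mistake.
git add file1.txt file2.txt wrong.txt

# Unstage the newly added file; it stays on disk, untracked.
git rm -q --cached wrong.txt

# List what is staged for the next commit.
staged="$(git diff --cached --name-only)"
printf '%s\n' "$staged"
```

Only file1.txt and file2.txt remain staged, while wrong.txt is still present in the working directory.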
Git commit
• Use git status first.
• Configure at least a user name and an email address before committing to a Git
repository, because this information is stored in each commit. Use the
commands below:
git config --global user.name "Firstname Lastname"
git config --global user.email "your.email@example.com"
• Commit using the git commit command:
git commit -m "Your commit message goes here."
• Use the git log command to review the commit just made.
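A minimal end-to-end sketch of the commit steps. One deliberate difference from the slide: the identity is set without --global, so it applies to this throwaway repository only and leaves machine-wide settings alone. The file name and commit message are placeholders.

```shell
set -e

repo="$(mktemp -d)"
cd "$repo"
git init -q

# Repository-local identity (assumption: placeholder values from the slide).
git config user.name "Firstname Lastname"
git config user.email "your.email@example.com"

echo "print('hello')" > analysis.py
git add analysis.py
git status --short           # shows the staged new file

git commit -q -m "Add analysis script"

# Review the commit just made: author, email, and message.
last_commit="$(git log -1 --format='%an <%ae>: %s')"
printf '%s\n' "$last_commit"
```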
Git Branching
• A powerful feature of Git.
• A branch represents an independent line of development. New commits are recorded in the
history of the current branch, which creates a fork in the history of the project.
• git branch "new_branch_name" creates a new branch with the
given name.
• git checkout "new_branch_name" switches to that branch. Without this step,
commits are still made to the branch you were on (for example main), even after creating the new branch.
• git checkout -b "new_branch_name" creates the new branch and checks it out
in one step.
• git branch with no arguments lists the branches and marks the active one with an asterisk.
• git branch -d "new_branch_name" deletes the branch.
Git Merging
• The git merge command is used to combine branches in Git.
• git merge "new_branch_name" merges the named branch into the branch that is currently checked out.
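A short sketch of a merge, assuming a hypothetical branch called feature and placeholder file names. Because the base branch gains no commits of its own in this example, Git would normally complete the merge as a fast-forward.

```shell
set -e

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

echo "baseline" > model.txt
git add model.txt
git commit -q -m "Baseline"
base="$(git symbolic-ref --short HEAD)"    # default branch name

# Do some work on a separate branch.
git checkout -q -b feature
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Switch back to the base branch and merge the feature branch into it.
git checkout -q "$base"
git merge -q feature

git --no-pager log --oneline   # the base branch now contains both commits
```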
Communicating with remote repo
• The git fetch command downloads commits, files, and references from
a remote repository into the local repo.
• Git isolates fetched content from existing local content, so local
development work is in no way affected.
• git pull, on the other hand, fetches AND then merges those
changes into the current local branch.
• The git push command uploads local repository content to a
remote repository.
• Push exports commits to remote branches.
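The difference between fetch, pull, and push can be sketched with two clones of one shared repository. Everything named here is an assumption for illustration: a local bare repository stands in for the hosted remote, and alice and bob are hypothetical collaborators. The --ff-only option is a standard git pull flag used so that the pull only fast-forwards.

```shell
set -e

work="$(mktemp -d)"
cd "$work"
git init -q --bare remote.git              # stand-in for a hosted remote

# Alice clones, commits, and pushes the first version.
git clone -q remote.git alice
cd "$work/alice"
git config user.name "Alice"               # placeholder identity
git config user.email "alice@example.com"  # placeholder identity
echo "v1" > report.txt
git add report.txt
git commit -q -m "Add report"
branch="$(git symbolic-ref --short HEAD)"  # default branch name
git push -q origin "$branch"

# Bob clones the shared repository.
cd "$work"
git clone -q remote.git bob

# Alice pushes a second commit.
cd "$work/alice"
echo "v2" >> report.txt
git commit -q -am "Update report"
git push -q origin "$branch"

# Bob fetches: the remote-tracking branch moves, his own branch does not.
cd "$work/bob"
git fetch -q origin
fetched="$(git rev-list --count "origin/$branch")"
local_before="$(git rev-list --count HEAD)"

# Bob pulls: the fetched commits are merged into his current branch.
git pull -q --ff-only origin "$branch"
local_after="$(git rev-list --count HEAD)"

echo "fetched=$fetched before=$local_before after=$local_after"
```

After the fetch, Bob's working files are unchanged; only after the pull does his branch contain Alice's second commit.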
GitHub for your Personal Branding
• Make your work public.
• Participate in open-source projects.
• Recruiters search through GitHub for potential job candidates.
• GitHub activity, repos, commits, documentation etc. are evidence of a
skilled practitioner.
• GitHub Pages - Websites for you and your projects. Hosted directly from
your GitHub repository. Just edit, push, and your changes are live.
