Enter Data Version Control
Problem description
Index
● Explicit data and process dependencies
● Data and model caching
● Visualize metrics across model and data versions
● “One click” pipeline reproducibility
● 🍻 🍕
Explicit data and process dependencies
[Pipeline diagram: raw -> Prepare data -> prepared train, prepared test -> Extract features -> features train, features test -> Select model -> model -> Test model -> metrics]
$ dvc add data/raw
$ dvc run \
    --file prepare_data.dvc \
    --deps prepare_data.ipynb \
    --deps data/raw \
    --outs data/prepared \
    papermill \
        prepare_data.ipynb \
        prepare_data_out.ipynb
$ dvc run \
    --file extract_features.dvc \
    --deps extract_features.ipynb \
    --deps data/prepared \
    --outs data/features \
    papermill \
        extract_features.ipynb \
        extract_features_out.ipynb
$ dvc run \
    --file select_model.dvc \
    --deps select_model.ipynb \
    --deps model.py \
    --deps data/features \
    --outs model \
    papermill \
        select_model.ipynb \
        select_model_out.ipynb
$ dvc run \
    --file test_model.dvc \
    --deps test_model.ipynb \
    --deps data/features \
    --deps model \
    --deps model.py \
    --metrics test_metrics.json \
    papermill \
        test_model.ipynb \
        test_model_out.ipynb
$ dvc pipeline show --ascii select_model.dvc
$ dvc pipeline show --ascii --outs select_model.dvc
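The `dvc run` commands above implicitly define a dependency graph between stages. As a rough illustration only (this is not how DVC is implemented), the sketch below copies each stage's `--deps` and outputs into plain Python data and derives a valid execution order with the standard-library `graphlib` module (Python 3.9+). The `STAGES` dict and the `stage_order` helper are names invented for this sketch, and the papermill output notebooks are left out for brevity.

```python
from graphlib import TopologicalSorter

# Dependencies and outputs copied from the `dvc run` commands in the slides.
STAGES = {
    "prepare_data.dvc": {
        "deps": ["prepare_data.ipynb", "data/raw"],
        "outs": ["data/prepared"],
    },
    "extract_features.dvc": {
        "deps": ["extract_features.ipynb", "data/prepared"],
        "outs": ["data/features"],
    },
    "select_model.dvc": {
        "deps": ["select_model.ipynb", "model.py", "data/features"],
        "outs": ["model"],
    },
    "test_model.dvc": {
        "deps": ["test_model.ipynb", "data/features", "model", "model.py"],
        "outs": ["test_metrics.json"],
    },
}


def stage_order(stages):
    """Return stage names so that each stage follows the stages it depends on."""
    # Map every output path to the stage that produces it.
    producer = {out: name for name, s in stages.items() for out in s["outs"]}
    # A stage depends on another stage when one of its deps is that stage's output.
    graph = {
        name: {producer[d] for d in s["deps"] if d in producer}
        for name, s in stages.items()
    }
    return list(TopologicalSorter(graph).static_order())


print(stage_order(STAGES))
```

Files such as `model.py` or the notebooks have no producing stage, so they act as plain inputs: they create no edge in the graph but still count as dependencies of their stage.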
Data and model caching
[Pipeline diagram repeated, with a "CHANGE HERE" marker on the modified step]
$ dvc repro test_model.dvc
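The idea behind `dvc repro` reusing cached results is that a stage only needs to run again when one of its dependencies has changed since the last run. DVC tracks this with checksums recorded in the stage files; the sketch below imitates that check in a simplified form for single files. The helper names (`file_md5`, `snapshot`, `is_stale`) and the contents written to `model.py` are assumptions made up for this illustration, not part of DVC or of the original project.

```python
import hashlib
import tempfile
from pathlib import Path


def file_md5(path):
    """Hex MD5 digest of a file's contents."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def snapshot(deps):
    """Record the current checksum of every dependency."""
    return {str(d): file_md5(d) for d in deps}


def is_stale(deps, recorded):
    """True when any dependency's checksum differs from the recorded one."""
    return any(file_md5(d) != recorded.get(str(d)) for d in deps)


with tempfile.TemporaryDirectory() as tmp:
    model_py = Path(tmp) / "model.py"
    model_py.write_text("N_ESTIMATORS = 10\n")  # hypothetical file contents
    recorded = snapshot([model_py])  # checksums taken when the stage last ran

    unchanged = is_stale([model_py], recorded)  # nothing edited yet
    model_py.write_text("N_ESTIMATORS = 50\n")  # the "CHANGE HERE" step
    changed = is_stale([model_py], recorded)  # the edit is detected

print(unchanged, changed)
```

In this simplified picture, a stage whose dependencies are all unchanged is skipped and its cached outputs are reused, while an edit to a dependency such as `model.py` marks the stages that list it as needing a re-run.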
$ dvc metrics show -T
Visualize metrics across model and data versions
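`dvc metrics show -T` prints the tracked metrics file for every Git tag. To compare versions side by side, the per-tag metrics can be laid out as a small table. The sketch below is one possible way to do that in plain Python; the tag names (`v1`, `v2`), the metric names, and all values are placeholders invented for illustration, since the slides do not show the contents of `test_metrics.json`.

```python
import json


def metrics_table(metrics_by_tag):
    """Format {tag: {metric: value}} as an aligned text table, one row per tag."""
    names = sorted({m for metrics in metrics_by_tag.values() for m in metrics})
    header = ["tag"] + names
    rows = [
        [tag] + [str(metrics.get(n, "")) for n in names]
        for tag, metrics in metrics_by_tag.items()
    ]
    table = [header] + rows
    # Each column is as wide as its longest cell.
    widths = [max(len(r[i]) for r in table) for i in range(len(header))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(r, widths)).rstrip()
        for r in table
    )


# Placeholder contents of test_metrics.json at two hypothetical tags.
raw = {
    "v1": '{"accuracy": 0.5, "f1": 0.25}',
    "v2": '{"accuracy": 0.75, "f1": 0.5}',
}
table = metrics_table({tag: json.loads(text) for tag, text in raw.items()})
print(table)
```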
“One click” pipeline reproducibility
$ git clone git@gitlab.com:adsmurai/training/dvc-meetup.git
$ cd dvc-meetup
$ git pull --tags
$ dvc pull --all-tags --all-branches
$ dvc checkout
$ dvc repro test_model.dvc
Thank you
DVC meetup
