Taking Data
Science to
Enterprise level
Christos Charmatzis
@xarmatzis Athens
May 2018
Few things about me
• I am a GIS Solution Architect
• Microsoft Professional Program Data Science Certificate
• Working with Azure since 2010
• Open Source developer and contributor
  • React-Leaflet-Google (npm downloads > 26,500)
  • Geotrellis (geographic data processing engine for Spark)
  • Magellan (Geo Spatial Data Analytics on Spark)
Agenda
• Introduction
• Data Science Team
• Standardized project structure
• Execution of data science projects
• Azure Machine Learning Workbench
Introduction
What is Data Science?
Data science is an interdisciplinary field of scientific methods, processes,
algorithms and systems to extract knowledge or insights from data in various
forms, either structured or unstructured, similar to data mining.
Source: Wikipedia
Introduction
[Figure: data science diagram with the labels "Data wrangling (munging), retrieval + storage", "Data mining & machine learning", "Statistics" and "Big data"]
Introduction
What are the characteristics of a project at Enterprise stage?
1. I am not alone, I am part of a Team.
2. The deliverables should be reusable and production ready.
3. The project needs to scale up.
Introduction
How can we take Data Science to Enterprise level?
Follow the 3 principles:
1. The Team writes experiments
2. The Team members keep their work as simple as possible
3. The Team members collaborate and share experiments, ALL THE TIME!!!!
Data Science Team
Data science functions in enterprises are organized into:
1. Data science group(s)
2. Data science team(s) within the group(s)
Data Science Team
Roles in Data Science Group:
• Project Individual Contributor. Data Scientist, Business Analyst, Data
Engineer, Architect, etc. A project individual contributor executes a data
science project.
• Project Lead. A project lead manages the daily activities of individual data
scientists on a specific data science project.
• Team Lead. A team lead manages a team in the data science unit of an
enterprise.
• Group Manager. A group manager manages the entire data
science unit in an enterprise.
Data Science Team
Tasks in Data Science Group:
Roles: Group Manager, Team Lead, Project Lead, Data Scientist
1. Create Group Account on a Version Control Platform
2. Create Team Environment
3. Create Project
4. Add Storage/Analytics Resources to Project
• Merge Pull Request
5. Execute Project
Standardized project structure
Azure-TDSP-ProjectTemplate
Project Charter
• Business background
• Scope
• Personnel
• Metrics
• Plan
• Architecture
• Communication
Exit Report
• Overview
• Business Domain
• Business Problem
• Data Processing
• Modeling, Validation
• Benefits
• Learnings
Standardized project structure
We need standards
ONNX (http://onnx.ai/) is an open format to represent deep learning
models. With ONNX, AI developers can more easily move models
between state-of-the-art tools and choose the combination that is best
for them.
ONNX is developed and supported by a community of partners, including
Facebook and Microsoft.
Execution of data science projects
What is an experiment?
An experiment is a Study.
Execution of data science projects
Macroscopically
Introduction → Main Part → Conclusion
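A minimal sketch of how an experiment could be laid out as a small study with an introduction, a main part and a conclusion. The slide does not spell this mapping out, so it is an interpretation: the three function names, the synthetic data and the choice of a one-variable least-squares fit are all assumptions made for illustration, not part of the talk or of Azure Machine Learning Workbench.

```python
import random
import statistics


def introduction(seed=0, n=200):
    """Introduction: state the question and prepare the data.

    Synthetic data (y = 2x + 1 plus noise) stands in for a real dataset.
    """
    rng = random.Random(seed)  # fixed seed keeps the experiment repeatable
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]
    split = int(0.8 * n)  # 80% for training, 20% held out
    return (xs[:split], ys[:split]), (xs[split:], ys[split:])


def main_part(train):
    """Main part: fit a straight line by ordinary least squares."""
    xs, ys = train
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def conclusion(model, test):
    """Conclusion: evaluate on held-out data and report the findings."""
    slope, intercept = model
    xs, ys = test
    squared_errors = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    rmse = (sum(squared_errors) / len(xs)) ** 0.5
    return {"slope": slope, "intercept": intercept, "rmse": rmse}


def run_experiment(seed=0):
    """Run the three parts in order and return the reported results."""
    train, test = introduction(seed)
    return conclusion(main_part(train), test)


if __name__ == "__main__":
    print(run_experiment())
```

Keeping each part in its own function makes an experiment easy for team members to read, rerun and share, which is in line with the three principles listed earlier.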
Azure Machine Learning Workbench
What is that?
It’s an integrated end-to-end Data Science Solution.
Requirements
• Create an Azure Machine Learning services account
(https://bit.ly/2x1yWu0)
Azure Machine Learning Workbench
Let’s demo it!
Azure Machine Learning Workbench
Learn more
• https://channel9.msdn.com/events/Ignite/Microsoft-Ignite-Orlando-2017/BRK3319
• https://channel9.msdn.com/events/Build/2018/BRK3215
• http://onnx.ai/news
• https://docs.microsoft.com/en-us/azure/machine-learning/service/
Thank U!
Qs & As


Editor's Notes

  1. Typically, a data science project is done by a data science team, which may be composed of project leads (for project management and governance tasks) and data scientists or engineers (individual contributors / technical personnel) who will execute the data science and data engineering parts of the project.
  2. Definition of four TDSP roles. With the above assumption, we have specified four distinct roles for our team personnel:
     • Project Individual Contributor. Data Scientist, Business Analyst, Data Engineer, Architect, etc. A project individual contributor executes a data science project.
     • Project Lead. A project lead manages the daily activities of individual data scientists on a specific data science project.
     • Team Lead. A team lead manages a team in the data science unit of an enterprise. A team consists of multiple data scientists. For a data science unit with only a small number of data scientists, the Group Manager and the Team Lead might be the same person.
     • Group Manager. A group manager manages the entire data science unit in an enterprise. A data science unit might have multiple teams, each of which is working on multiple data science projects in distinct business verticals. A Group Manager might delegate their tasks to a surrogate, but the tasks associated with the role do not change.
     Note: Depending on the structure in an enterprise, a single person may play more than one role, or more than one person may work in a single role. This may frequently be the case in small enterprises or enterprises with a small number of personnel in their data science organization.
  3. This is a general project directory structure for Team Data Science Process developed by Microsoft. It also contains templates for various documents that are recommended as part of executing a data science project when using TDSP. Team Data Science Process (TDSP) is an agile, iterative, data science methodology to improve collaboration and team learning. It is supported through a lifecycle definition, standard project structure, artifact templates, and tools for productive data science. NOTE: In this directory structure, the Sample_Data folder is NOT supposed to contain LARGE raw or processed data. It is only supposed to contain small and sample data sets, which could be used to test the code.
  4. Code folder for hosting code for a Data Science Project. This folder hosts all code for a data science project. It has three sub-folders, belonging to 3 stages of the Data Science Lifecycle:
     • Data_Acquisition_and_Understanding
     • Modeling
     • Deployment
  5. Folder for hosting all documents for a Data Science Project. Documents will contain information about the following:
     • System architecture
     • Data dictionaries
     • Reports related to data understanding, modeling
     • Project management and planning docs
     • Information obtained from a business owner or client about the project
     • Docs and presentations prepared to share information about the project
     The two documents under Docs/Project, namely the Charter and Exit Report, are particularly important to consider. They help to define the project at the start of an engagement, and provide a final report to the customer or client.
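
The folder layout described in notes 3 to 5 can be scaffolded with a few lines of Python. This is only a sketch under stated assumptions: the three stage sub-folders, Sample_Data and Docs/Project come from the notes, while the top-level folder name Code, the function name scaffold_project and the two file names Charter.md and Exit_Report.md are hypothetical and may differ from the actual Azure-TDSP-ProjectTemplate repository.

```python
from pathlib import Path
import tempfile

# Sub-folder names as listed in the notes, one per lifecycle stage.
CODE_STAGES = ["Data_Acquisition_and_Understanding", "Modeling", "Deployment"]
# Hypothetical file names for the two key project documents.
PROJECT_DOCS = ["Charter.md", "Exit_Report.md"]


def scaffold_project(root):
    """Create the folder skeleton under root and return the created relative paths."""
    root = Path(root)
    folders = [Path("Code") / stage for stage in CODE_STAGES]
    folders += [Path("Docs") / "Project", Path("Sample_Data")]
    for folder in folders:
        (root / folder).mkdir(parents=True, exist_ok=True)

    files = [Path("Docs") / "Project" / name for name in PROJECT_DOCS]
    for file in files:
        (root / file).touch(exist_ok=True)  # empty placeholder documents

    return sorted(path.as_posix() for path in folders + files)


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        for path in scaffold_project(tmp):
            print(path)
```

As the note on Sample_Data says, that folder is meant for small sample data sets only, so the sketch leaves it empty rather than copying any data into it.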