DATA SCIENCE; WHY, WHAT, HOW?
MUHAMMAD SHAHID
Data Science with Dr Shahid
FACEBOOK.COM/DRSHAHID.PHD
SCOPE
• Fundamentals
• Data terms
• DS Lifecycle
• Pathway to DS
Statistics
• Traditionally concerned with analyzing primary (e.g., experimental) data collected for checking specific hypotheses (ideas)
• Primary data analysis, or top-down (confirmatory) analysis
• Hypothesis evaluation or testing
Data Science
• Typically concerned with analyzing secondary (e.g., observational) data collected for other reasons
• Secondary data analysis, or bottom-up (exploratory) analysis
• Hypothesis generation
• Knowledge discovery
Data science is an interdisciplinary field
It uses computing tools and statistical methods to extract knowledge from data
Multiple definitions exist because creating value from data requires cross-disciplinary skills
The ideal skill mix is often illustrated with Venn diagrams, e.g., Drew Conway's
Data science as portrayed by Drew Conway
Stephan Kolassa on StackExchange:
[Venn diagram: big data, artificial neural networks, machine learning, data mining, deep learning, artificial learning]
[Venn diagram: Machine Learning, Deep Learning, Data Science, Artificial Intelligence, Big Data]
Gregory Piatetsky-Shapiro, Ph.D.
Knowledge Discovery to Data Mining to Predictive Analytics and now to Data Science
Essence is always: discovery of what is true and useful
How?
• Asking the right questions!
• Requirements on data collection
• Analysis/Modeling
• Conveying results
MS Azure documentation
Business Understanding
Goals
• Specify key variables (model targets, metrics of success)
• Relevant data sources
How?
• Define *objectives (business problems, stakeholders)
• **SMART metrics
• Find the data
Artifacts
• Iterating charter
• Data sources
• Data dictionaries
Objectives
How much/many: Regression
Which category: Classification
Which group: Clustering
Is it weird: Anomaly Detection
Which option: Recommendation
SMART
Specific
Measurable
Achievable
Relevant
Time-bound
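The question-to-task pairing on this slide can be written as a small lookup. This is only an illustrative sketch: the dictionary `QUESTION_TO_TASK` and the helper `task_for` are hypothetical names, not part of the deck or of any library.

```python
# Hypothetical lookup from the business question to the kind of ML task.
QUESTION_TO_TASK = {
    "how much/many": "regression",
    "which category": "classification",
    "which group": "clustering",
    "is it weird": "anomaly detection",
    "which option": "recommendation",
}


def task_for(question: str) -> str:
    """Return the task type for a question, ignoring case and a trailing '?'."""
    key = question.strip().rstrip("?").strip().lower()
    return QUESTION_TO_TASK.get(key, "unknown")


print(task_for("Which group?"))
```

Phrasing the objective as one of these questions early on helps decide which target variable and success metric to specify.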
Data Acquisition
Goals
• Clean, high-quality data
• Architecture of data pipeline (refresh & score)
How?
• Data ingestion
• Explore the data (quality, EDA)
• Set up data pipeline (batch-based, streaming or real-time, or a hybrid)
Artifacts
• Data quality report
• Solution architecture
• Checkpoint decision (re-evaluate before full feature engineering/model building)
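A data quality report can start as a per-column summary of missing and distinct values. The sketch below uses only the Python standard library; the function name `data_quality_report`, the choice of what counts as "missing", and the sample records are assumptions made for illustration, not something taken from the deck or the Azure documentation.

```python
from collections import Counter


def data_quality_report(rows):
    """Summarise missing and distinct values per column.

    `rows` is a list of dicts; None or "" is treated as missing (an assumption).
    """
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None and v != ""]
        report[col] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "most_common": Counter(present).most_common(1)[0][0] if present else None,
        }
    return report


# Made-up sample records, for illustration only.
sample = [
    {"age": 34, "city": "Lahore"},
    {"age": None, "city": "Lahore"},
    {"age": 29, "city": ""},
]
report = data_quality_report(sample)
for column, stats in report.items():
    print(column, stats)
```

In practice a report like this would feed the checkpoint decision: if too many values are missing, it may be better to revisit data collection before investing in feature engineering.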
Modeling
Goals
• Optimal features
• Informative model
• Production-ready model
How?
• Feature engineering
• Model training
• Production ready?
Artifacts
• Feature sets
• Model report
• Checkpoint decision (evaluate for production)
Model Training
Raw data → Features → Starting data
• Training split (70-80%): model gets trained
• Validation split (10-15%): hyperparameter tuning
• Test split (10-15%): model gets evaluated
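The three-way split can be sketched in a few lines of standard-library Python. The helper name `train_val_test_split`, the 70/15/15 default fractions (one choice within the ranges on the slide), and the fixed seed are assumptions for illustration.

```python
import random


def train_val_test_split(items, train=0.7, val=0.15, seed=0):
    """Shuffle a copy of `items` and cut it into train/validation/test lists."""
    if not (0 < train < 1 and 0 <= val < 1 and train + val < 1):
        raise ValueError("train and val must leave room for a test split")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # whatever remains is the test split
    )


train_set, val_set, test_set = train_val_test_split(range(100))
print(len(train_set), len(val_set), len(test_set))
```

The model is fitted on the training split, hyperparameters are compared on the validation split, and the test split is kept aside for the final evaluation so that the reported score is not influenced by tuning.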
Deployment
Goals
• Deploy models with a data pipeline to a production environment
How?
• Operationalize the model
Artifacts
• Status dashboard (system health & KPIs)
• Final modeling report
• Final solution architecture document
Customer Acceptance
Goals
• Finalize project deliverables: confirm that the pipeline, the model, and their deployment in a production environment satisfy the customer's objectives.
How?
• System validation
• Project hand-off
Artifacts
• Exit report of the project for the customer
What does it take?
Math/Statistics
• Linear algebra, calculus
• Probability theory, graph theory
• Distributions, summary stats, hypothesis testing
Machine learning
• Supervised learning
• Unsupervised learning
• Validation, model comparison
Software engineering
• Algorithms and data structures
• Data visualization
• Data processing
• Data Scientist
• Data Analyst
• ML Engineer
• Data Engineer
• Data Architect
• BI Developer
Python for Data Science
Contact me!
https://www.facebook.com/drshahid.phd
https://www.linkedin.com/in/muhammad-shahid-67876212
muhammad.shahid@ieee.org
Thank You!
