Girls Who Code and Do Data Science
@EstherVasiete
Data Scientist
July 12th, 2016
Girls Who Code
Summer Immersion Program
About me
•  Born and raised in Barcelona
•  Bachelor’s Degree in Electrical Engineering
About me
•  Studied abroad in UK
- Best time of my life
- Developed an interest in image processing and computer vision
- Also developed an interest in machine learning; I just didn’t know it then
About me
•  Did my Master’s at CU Boulder
•  Officially, I received my diploma in EE
- Unofficially, I like to think about it as a CS degree
- I managed to cross-list most courses and my thesis advisor so that I could feed my growing interest in machine learning
About me
•  Once I graduated, I moved to San Francisco
- My first data science gig
So what is machine learning?
How does this…
…become this?
By recognizing this
Sensors + Other Structured and Unstructured Data
How can a machine learn that a cat is a cat?
What about these?
•  The cat from Shrek
•  Hairless cat
•  Baby panther and baby tiger
Can your model generalize to new, unseen data?
The importance of data
Messy data – the norm and not the exception
Basic Machine Learning Framework
Training examples → Machine Learning Algorithm → Cat Model
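The framework above (training examples go into a learning algorithm, which produces a model) can be sketched in a few lines of Python. This is only an illustrative toy, not the method used in the talk: the nearest-centroid algorithm, the two made-up features, and all of the numbers are assumptions chosen to keep the example small. The last lines check the model against examples it never saw during training, which is the generalization question raised earlier.

```python
from collections import defaultdict


def train_nearest_centroid(examples):
    """Learning algorithm: average the feature vectors seen for each label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    centroids = {
        label: [s / counts[label] for s in total] for label, total in sums.items()
    }

    def model(features):
        # Predict the label whose centroid is closest (squared Euclidean distance).
        return min(
            centroids,
            key=lambda label: sum(
                (c - f) ** 2 for c, f in zip(centroids[label], features)
            ),
        )

    return model


# Hypothetical toy features: (ear pointiness, whisker length), both scaled 0..1.
training_examples = [
    ((0.9, 0.8), "cat"),
    ((0.8, 0.9), "cat"),
    ((0.2, 0.1), "not cat"),
    ((0.1, 0.3), "not cat"),
]
# Held-out examples the model has not seen during training.
unseen_examples = [((0.85, 0.7), "cat"), ((0.3, 0.2), "not cat")]

cat_model = train_nearest_centroid(training_examples)
correct = sum(cat_model(x) == y for x, y in unseen_examples)
accuracy = correct / len(unseen_examples)
print(f"accuracy on unseen data: {accuracy:.2f}")
```

A real image model would learn from pixels rather than two hand-picked numbers, but the shape of the pipeline is the same.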
In all industries billions of data points represent opportunities for data science
•  Gene Sequencing: the cost to sequence one genome has fallen from $100M in 2001 to $10K in 2011 to $1K in 2014
•  Smart Grids: reading smart meters every 15 minutes is 3000x more data intensive
•  Stock Market
•  Social Media: Facebook uploads 250 million photos each day
•  Oil Exploration: oil rigs generate 25000 data points per second
•  Video Surveillance
•  Medical Imaging
•  Mobile Sensors
https://www.washingtonpost.com/posteverything/wp/2015/06/05/the-auto-industry-discriminates-against-women-so-i-quit-my-engineering-job-to-become-a-mechanic/
You can also transform a male-dominated industry with data science.
On-Board Diagnostics
Diagnostic Trouble Codes (DTC)
Unscheduled repairs
AB1029 – Power steering pump replacement
CT3408 – Wheel alignment
Data Sources for Predictive Maintenance
Vehicle Data
•  VIN
•  Timestamp
•  DTC Code
•  Odometer
•  Speed
•  Acceleration
•  Engine Temperature
•  Engine Torque
•  GPS Coordinates
•  etc.
Car Repairs Data
•  VIN
•  Date vehicle in
•  Date vehicle out
•  Repair code
•  Parts replaced
•  Warranty claims
•  Repair Comments
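Both data sources carry a VIN, so one plausible first step is to link each repair to the vehicle readings logged for the same VIN before the vehicle came in. The sketch below assumes this join; the slides do not say how the data was actually combined. The record layout, the field names, and every value except the repair code AB1029 are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical records; field names loosely follow the two data sources.
vehicle_data = [
    {"vin": "VIN001", "timestamp": datetime(2016, 3, 1, 8, 0), "dtc_code": "P0300"},
    {"vin": "VIN001", "timestamp": datetime(2016, 3, 9, 17, 30), "dtc_code": "B1000"},
    {"vin": "VIN002", "timestamp": datetime(2016, 3, 2, 12, 0), "dtc_code": "U0100"},
]
car_repairs_data = [
    {
        "vin": "VIN001",
        "date_in": datetime(2016, 3, 10),
        "date_out": datetime(2016, 3, 11),
        "repair_code": "AB1029",
    },
]


def readings_before_repairs(vehicle_rows, repair_rows):
    """Attach to each repair the readings logged for the same VIN before 'date_in'."""
    by_vin = defaultdict(list)
    for row in vehicle_rows:
        by_vin[row["vin"]].append(row)

    joined = []
    for repair in repair_rows:
        prior = [
            row
            for row in by_vin.get(repair["vin"], [])
            if row["timestamp"] < repair["date_in"]
        ]
        prior.sort(key=lambda row: row["timestamp"])
        joined.append({"repair": repair, "readings": prior})
    return joined


joined = readings_before_repairs(vehicle_data, car_repairs_data)
for item in joined:
    codes = [row["dtc_code"] for row in item["readings"]]
    print(item["repair"]["repair_code"], "preceded by", codes)
```

Keying on VIN and filtering by time keeps readings from other vehicles, or from after the repair, out of the training data.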
Predicting Job Type from Diagnostic Trouble Codes (DTCs)
[Figure: timeline of DTC events in categories B, P, C and U, followed by repair visits with Job Type "Transmission", "Transmission, Engine" and "Regular check"]
Can the DTCs observed here predict this Job Type?
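One way to turn the timeline question into something a model can consume is to count the DTCs of each category seen in a fixed window before a repair visit. The sketch below assumes that approach; the slides do not specify the features actually used. The 30-day window, the function name, and the event dates are hypothetical. The categories B, C, P and U are the standard first letters of OBD-II trouble codes (body, chassis, powertrain, network).

```python
from collections import Counter
from datetime import date, timedelta

DTC_CATEGORIES = ("B", "C", "P", "U")


def dtc_window_features(dtc_events, repair_date, lookback_days=30):
    """Count DTCs per category in the lookback window before a repair visit.

    dtc_events is a list of (day, code) pairs; only the first letter of each
    code is used, so both "P" and "P0300" count towards the P category.
    """
    start = repair_date - timedelta(days=lookback_days)
    counts = Counter(
        code[0] for day, code in dtc_events if start <= day < repair_date
    )
    return [counts[category] for category in DTC_CATEGORIES]


# Hypothetical DTC history for one vehicle.
events = [
    (date(2016, 4, 1), "B"),   # too old: outside the 30-day window
    (date(2016, 5, 20), "B"),
    (date(2016, 5, 25), "P"),
    (date(2016, 5, 28), "C"),
    (date(2016, 6, 5), "U"),   # after the repair: must not leak into features
]
features = dtc_window_features(events, repair_date=date(2016, 6, 1))
print(dict(zip(DTC_CATEGORIES, features)))
```

Each repair visit then becomes one row of counts paired with its Job Type label, which is the input a classifier needs.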
Predicting Job Type: a multi-class classification problem
[Diagram: Vehicle Features mapped to one of the following repair codes]
•  DF1210, DF1215, DF2980
•  AB1029, AB1622, AB1625, AB8622
•  CT3402, CT3408, CT3560, CT2409
Hierarchical Classification Framework
[Diagram: Vehicle Features feed a tree of repair codes, split level by level]
•  DF: 12 (10, 15), 29 (80)
•  AB: 10 (29), 16 (22, 25), 86 (22)
•  CT: 34 (02, 08), 35 (60), 24 (09)
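A hierarchical classifier predicts a coarse group first and then picks the specific code within that group. The sketch below is a simplified illustration under several assumptions: it uses only two levels (the two-letter prefix, then the full code) where the slide shows three, it uses a 1-nearest-neighbour stand-in rather than whatever model the talk used, and the feature values are made up. The repair codes themselves come from the slides.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_label(examples, features):
    """Stand-in classifier: label of the closest training example (1-NN)."""
    closest = min(examples, key=lambda example: squared_distance(example[0], features))
    return closest[1]


def train_hierarchical(examples):
    """examples: (features, repair_code) pairs; code[:2] is the coarse group."""
    top_level = [(features, code[:2]) for features, code in examples]
    per_group = {}
    for features, code in examples:
        per_group.setdefault(code[:2], []).append((features, code))

    def predict(features):
        group = nearest_label(top_level, features)         # level 1: DF / AB / CT
        return nearest_label(per_group[group], features)   # level 2: code in group

    return predict


# Hypothetical two-number feature vectors paired with codes from the slides.
training_examples = [
    ((0.1, 0.2), "AB1029"),
    ((0.2, 0.9), "AB1622"),
    ((0.9, 0.1), "CT3408"),
    ((0.8, 0.3), "CT3402"),
    ((0.5, 0.5), "DF1210"),
]
predict_repair = train_hierarchical(training_examples)
print(predict_repair((0.85, 0.15)))
print(predict_repair((0.15, 0.8)))
```

With this toy stand-in the two-level prediction matches what a flat classifier would return; the structure matters more when each level has its own model, since each model then only separates a handful of classes.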
Takeaways
•  Diagnostic Trouble Codes (DTCs) are not always symptomatic of an ensuing repair.
•  Hence, a rule-based approach that maps DTCs to repairs has been challenging to construct.
•  A machine learning approach could be a better way to infer the relationship between groups of DTCs and repairs.
•  Become a mechanic and solve a few car repairs, or become a data scientist and solve millions!
Other Data Science Use-Cases for Connected Cars
Other Data Science Use-Cases
Automated essay scoring
Drug/chemical discovery & analysis
Recommendation systems
Fraud detection
blog.pivotal.io/data-science-pivotal/case-studies/pivotal-for-good-with-crisis-text-line-using-text-analytics-to-better-serve-at-risk-teens
blog.pivotal.io/data-science-pivotal/features/pivotal-for-good-with-crisis-text-line-a-first-look
Data Scientist Profile
Ask me anything @EstherVasiete
