Chief Data Scientist, ADP
amit_sharma_ai
amit-sharma-ai
Payroll Tax Time HR Talent Benefits
• Small, Midsized, Large, Multinational markets
• Revenues: $11.7 billion in fiscal 2016
• ADP pays 24 million (1 in 6) workers in the U.S., and 12 million elsewhere
• FORTUNE 500®: Ranked 240
• Forbes® Global 2000: Ranked 334
• Human Capital Management
ADP
Ask a question (Leader)
Source the data (Engg)
Explore data (Analyst)
Model data (Scientist)
Share Results (Scientist)
(Binary Classification)
(Multi-class classification)
(Anomaly detection)
(Regression)
(Unsupervised)
(Reinforcement)
Ask a question
• What’s my Company & Product’s Goal/Vision?
• How can we make our products ‘smarter’?
• What data do we have or can get?
• Will this be useful to the end user?
Business/Domain Expert
DB and Data Expert
Data
Security
Legal
Velocity
Granularity
Explore distributions
• Are they telling a story?
Handle missing data
• Can the missing data be ignored?
• Does it need to be imputed?
Look for outliers
• Do we want to identify/filter/manage them?
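The missing-data and outlier checks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: median imputation and the 1.5 × IQR rule are common defaults, not methods named on the slide, and the helper names (`impute_median`, `iqr_outliers`) and the tenure values are made up for the example.

```python
from statistics import median, quantiles

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Made-up data: tenure in years, two missing entries and one extreme value.
tenure = [2, 3, None, 4, 3, 5, None, 4, 40]
filled = impute_median(tenure)
print(filled)                # missing entries replaced by the median
print(iqr_outliers(filled))  # values flagged for a closer look
```

Whether a flagged value is dropped, capped, or kept is a judgment call for the domain expert; the code only identifies candidates.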
[Figure: two charts of % Dropout (0 to 18) against No. of Questions; one with the x-axis binned from 1-3 to 27-30, one with the x-axis running from 0 to 400]
How removing outliers helped uncover the correlation
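A small sketch of the effect described in the caption, assuming Pearson correlation as the measure. The (questions, dropout %) pairs below are invented for illustration and do not reproduce the numbers behind the slide's charts; the cutoff of 30 questions is likewise an assumption.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented data: (number of questions, % dropout); the last point is an outlier.
points = [(5, 4), (10, 7), (15, 8), (20, 11), (25, 12), (30, 15), (400, 2)]

# Assumed cutoff: keep only forms with at most 30 questions.
trimmed = [(q, d) for q, d in points if q <= 30]

r_all = pearson(*zip(*points))
r_trimmed = pearson(*zip(*trimmed))
print(f"with outlier:    r = {r_all:.2f}")
print(f"without outlier: r = {r_trimmed:.2f}")
```

In this toy data a single extreme point is enough to flip the sign of the correlation, which is why the outlier check comes before any modelling.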
Bias (underfitting) Optimal Variance (overfitting)
Solution for high bias (underfitting)
• Add more features
• Use a more complex model
Solution for high variance (overfitting)
• Fewer features
• More data to reduce variance
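One common way to tell the two failure modes apart is to compare the error on the training set with the error on a held-out validation set. The sketch below is a rule of thumb only: the function name `diagnose_fit` and both thresholds are hypothetical and would need tuning for a real problem.

```python
def diagnose_fit(train_error, validation_error,
                 acceptable_error=0.10, max_gap=0.05):
    """Rule-of-thumb diagnosis; both thresholds are illustrative assumptions."""
    if train_error > acceptable_error:
        # The model cannot even fit the data it was trained on.
        return "high bias (underfitting): add features or use a more complex model"
    if validation_error - train_error > max_gap:
        # The model fits the training data but does not generalize.
        return "high variance (overfitting): use fewer features or gather more data"
    return "reasonable fit"

print(diagnose_fit(0.30, 0.32))  # poor on both sets
print(diagnose_fit(0.02, 0.25))  # good on training, poor on validation
print(diagnose_fit(0.08, 0.10))  # similar and acceptable on both
```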
Overall accuracy is not good enough

             People resigned      People not resigned
High Risk    30                   100 (Type 1 Error)
Low Risk     70 (Type 2 Error)    800

Overall Accuracy: 83%
Sensitivity (Recall): 30%
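The two figures follow directly from the four counts in the table. A minimal sketch of the arithmetic, treating "High Risk" as the positive prediction and "resigned" as the positive outcome; the function name `confusion_metrics` is hypothetical, and precision and specificity are added here for context and are not quoted on the slide.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Standard metrics derived from a 2x2 confusion matrix."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all predictions that are right
        "recall": tp / (tp + fn),        # share of resignations that were flagged
        "precision": tp / (tp + fp),     # share of High Risk flags that were right
        "specificity": tn / (tn + fp),   # share of stayers correctly left unflagged
    }

# Counts from the table: 30 flagged and resigned, 100 flagged but stayed,
# 70 missed resignations, 800 correctly left unflagged.
metrics = confusion_metrics(tp=30, fp=100, fn=70, tn=800)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Accuracy is dominated by the 800 people who stayed and were correctly left unflagged, so it looks healthy even though the model misses 70 of the 100 resignations.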
Intellectually Curious
Domain Knowledge
Great Communications
Statistical Knowledge
Hands-on Programming
Learn: Online and Curriculum
Compete: Analytics Challenges
Explore: Datasets
Follow: Trend/Tech
Go Deep
Network: Share and Learn (Forums, Meetups)
Ask a question (Leader)
Source the data (Engg)
Explore data (Analyst)
Model data (Scientist)
Share Results (Scientist)
- How can I use data to drive value for my users/customers?
- How to ensure Legality, Security, IP of data?
- Explore distributions
- Handle missing values and outliers
- Keep it Simple
- Progressive Disclosure
- Prefer simple, explicable, actionable models
- Avoid over/under fitting
- Validation is key
5 Thing Engineers Need to know about Data Science