Crack Data Science Challenge:
0 to 1
Data Analyst
Qifang Zhao | PhD
Data Science
Venn Diagram
• Data Science Challenge
– Hacking skills and Math and Statistics Knowledge used to be more
important
– Substantive expertise
• Used to matter less, because common sense and experience worked very well
and the choice of algorithm mattered most
• Is becoming more and more important, because algorithms are now off-the-shelf
• Feature engineering is very important
• Data Science -> all 3 skills are
equally important
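As a small illustration of why feature engineering draws on substantive expertise, the sketch below derives a new feature (an honorific such as "Mr" or "Miss") from a raw passenger name. It assumes names follow a "Surname, Title. Given names" layout, as in passenger lists like the Titanic data; the function name `extract_title` and the sample records are hypothetical and chosen only for illustration.

```python
import re


def extract_title(name):
    """Return the honorific between the comma and the first period.

    Assumes a 'Surname, Title. Given names' layout; returns 'Unknown'
    when the name does not match that layout.
    """
    match = re.search(r",\s*([^.]+)\.", name)
    return match.group(1).strip() if match else "Unknown"


# Illustrative records only (hypothetical input for this sketch).
names = [
    "Braund, Mr. Owen Harris",
    "Heikkinen, Miss. Laina",
    "Unnamed passenger",
]
titles = [extract_title(n) for n in names]
print(titles)
```

The raw name string is of little use to most off-the-shelf algorithms, while the derived title is a compact categorical feature. Knowing that the title may carry information is the kind of domain insight this slide refers to.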
0 to 1
• Hacking skills
– Learn Python or R
• Math and Statistics Knowledge
– Learn Machine Learning algorithms
• Substantive Expertise
– Practice data challenges in data challenge platforms like Kaggle and
DEXTRA, and learn from other data scientists
– Attend data science meetups like DataScience SG and R user group –
Singapore, and learn from speakers and attendees
Hacking skills (Python or R)
• Overall suggestion:
– First, pick one and master it
– Second, learn the other
– Finally, use them together to complement each other
• Learn Python:
– Google's Python Class, and the video series Introduction to Python on
YouTube
– Learn the Python packages numpy, sklearn, and pandas while practicing
data challenges
• Learn R:
– Coursera “Data Science Specialization by Johns Hopkins University”
Math and Statistics Knowledge
• To start:
– Coursera course “Machine Learning by Andrew Ng”
– YouTube videos on other algorithms not covered above
• To advance:
– Stanford Machine Learning course by Andrew Ng
– Practice data challenges in Kaggle or DEXTRA
Substantive Expertise
• To start:
– Not very important
– Usually common sense and existing experience will work
– Do some research on the data challenges
• To advance:
– Practice more data challenges and learn from other data
scientists
Get your hands dirty
-- first blood
• Start with the simplest data challenges on Kaggle and DEXTRA
• Kaggle
– Titanic: Machine Learning from Disaster
– Very detailed procedures on how to crack the challenge
– Follow those procedures, gain experience
• DEXTRA
– Knowledge & Practice: Titanic Survival Prediction Challenge
– Similar to Kaggle’s, but uses a different evaluation metric
– By comparing the two, you’ll understand the importance of evaluation
metrics
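To see why the choice of evaluation metric matters, the sketch below scores two hypothetical submissions under two metrics. Accuracy is used as one metric and log loss as the other. The slides do not say which metric DEXTRA uses, so log loss is an assumption made purely for illustration, and the labels and predicted probabilities are invented.

```python
import math


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of labels matched after thresholding the probabilities."""
    hits = sum((p >= threshold) == bool(t) for t, p in zip(y_true, y_prob))
    return hits / len(y_true)


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood; punishes confident wrong answers."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical data: 1 = survived, 0 = did not survive.
y_true = [1, 0, 1, 0]
model_a = [0.9, 0.1, 0.9, 0.99]   # three right, one confidently wrong
model_b = [0.6, 0.4, 0.45, 0.55]  # two right, every guess close to 0.5

for name, probs in [("A", model_a), ("B", model_b)]:
    print(name, accuracy(y_true, probs), round(log_loss(y_true, probs), 3))
```

Model A wins on accuracy (higher is better), while model B wins on log loss (lower is better), because A's single mistake was made with near-total confidence. The same predictions can therefore rank differently on two leaderboards.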
Try harder challenges
• Try different types of challenges
– Classification
– Regression
– Clustering
• Understand different types of evaluation metrics
– Classification
• Precision, Recall, F-Score, Accuracy, Log Loss
– Regression
• Root Mean Squared Error, Root Mean Squared Logarithmic Error
– Clustering
• Complicated
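The classification and regression metrics listed above can be sketched in a few lines of plain Python. This is a minimal illustration using the standard definitions, not a reference implementation: the function names and sample values are hypothetical, and in practice a library such as sklearn would normally be used instead.

```python
import math


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def rmse(y_true, y_pred):
    """Root Mean Squared Error: penalizes absolute differences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def rmsle(y_true, y_pred):
    """Root Mean Squared Logarithmic Error: penalizes relative differences."""
    n = len(y_true)
    return math.sqrt(
        sum((math.log1p(p) - math.log1p(t)) ** 2
            for t, p in zip(y_true, y_pred)) / n
    )


# Hypothetical classification labels and predictions.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
preds = [1, 1, 0, 1, 0, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(labels, preds)

# Hypothetical regression targets: both predictions are off by 100.
actual = [100, 1000]
predicted = [200, 1100]
print(precision, recall, f1, rmse(actual, predicted), rmsle(actual, predicted))
```

In the regression example both errors are 100 in absolute terms, so RMSE treats them alike. RMSLE weighs the first far more heavily because predicting 200 for 100 is a much larger relative miss than predicting 1100 for 1000.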
Find a data analyst/scientist job
• Practice data analytics with real, unmasked data
• Gain substantive expertise in your domain
• Work with colleagues of different specializations
• Understand the whole pipeline of data processing,
analytics, visualization and delivery of results
Thank You
• Final conclusion and suggestion
Practice, Practice, and Practice
• Slides available here:
http://www.slideshare.net/zhaoqf123
http://www.slideshare.net/zhaoqf123/crack-data-science-challenges-0-to-1


Editor's Notes

  1. Because algorithms are off-the-shelf, people are using the same algorithms.