Kaggle – the global community 
of Data Science professionals 
Anastasiia Kornilova
Who am I?
- MS in Applied Mathematics
- 3 years as a Data Scientist
What is Data Science?
- Scientific Method
- Math
- Statistics
- Data Engineering
- Domain Expertise
- Advanced Computing
- Visualization
- Hacker Mindset
What matters?
What is Kaggle?
2010 - founded in Melbourne, Australia 
by Anthony Goldbloom
What problem do they solve? 
Data problems 
Data solvers
In fact, a McKinsey Global Institute report 
estimates that by 2018, “the United States 
alone could face a shortage of 140,000 to 
190,000 people with deep analytical skills as 
well as 1.5 million managers and analysts 
with the know-how to use the analysis of 
big data to make effective decisions.” 
Between 2010 and 2020, the data 
scientist career path is projected to 
grow by 18.7 percent, beaten only by 
video game designers. The big data 
industry is expected to be a $53.4 billion 
industry by 2016.
Anyone with "data science" in his or 
her job title on a LinkedIn page is 
going to get "100 recruiter emails a 
day," said Josh Sullivan, who leads a 
500-person data-science group at the 
consulting firm Booz Allen Hamilton 
Holding
Are you good enough?
First Competition: 
Forecast Eurovision Song Contest Voting 
- $1,000 prize 
- 22 teams 
Outperformed prediction markets: 
the winning forecast predicted 7 of the Top 10 countries; 
prediction markets got only 5.
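The "7 of the Top 10" comparison can be read as a simple set overlap. The sketch below assumes that reading; the slide does not say how the competition was actually scored, and the letters are placeholder labels, not real countries or real results.

```python
def top_n_overlap(predicted, actual, n=10):
    """Count how many of the predicted top-n entries are in the actual top-n."""
    return len(set(predicted[:n]) & set(actual[:n]))


# Placeholder rankings (hypothetical labels, not Eurovision data).
actual_top10 = list("ABCDEFGHIJ")
model_top10 = list("ABCDEFGKLM")   # shares A-G with the actual list
market_top10 = list("ABCDENOPQR")  # shares A-E with the actual list

print(top_n_overlap(model_top10, actual_top10))   # 7
print(top_n_overlap(market_top10, actual_top10))  # 5
```

Order inside the Top 10 is ignored here; a rank-aware score would need a different metric.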
A short success story 
- 2011 - relocated to San Francisco 
- November 2011 - raised $11M in funding 
- July 2013 - 100,000 data scientists involved 
- February 2014 - more than 140,000 data 
scientists
How can you use Kaggle?
Types of reward 
- Knowledge 
- Money 
- Job interview
Competitions for knowledge 
(always open) 
- Digit Recognizer, CIFAR-10, First Steps with Julia 
- Titanic: Machine Learning from Disaster 
- Bike Sharing Demand 
- Learning Social Circles in Networks
Competitions with prize: 
Open: 
- American Epilepsy Society Seizure Prediction 
Challenge: $25,000 prize 
- Africa Soil Property Prediction Challenge: $8,000 prize 
- Tradeshift Text Classification: $5,000 prize
Completed competitions (170+) 
- Heritage Health Prize: $500,000 
- GE Flight Quest: $250,000 
- GE Hospital Quest: $100,000 
- Higgs Boson ML Challenge: $13,000 + invitation to 
CERN 
- Galaxy Zoo: $16,000 
- KDD Author Paper Identification Challenge 
- Job Recommendation Challenge
Job competitions (completed): 
Facebook: 
- recommend missing links in social graph (who to follow) 
- optimal graph path 
- predict text tags 
Yelp: 
- estimate the number of useful votes a review will receive 
Walmart: 
- predict store sales 
+ Job Board
How to win?
Dig into the data
Stay on track
Kaggle competition == Data science?
1. Understand 
2. Collect 
3. Data exploration 
4. Clean and transform 
5. Model 
6. Validate 
7. Communicate results 
Deploy
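The steps above can be sketched in a few lines of code. This is a minimal illustration only: the data is synthetic (not from any Kaggle competition), the function names are hypothetical, and the "model" is a single threshold chosen for brevity rather than anything a competition winner would use.

```python
import random


def collect(n=200, seed=0):
    """Step 2: stand-in for downloading data (synthetic rows)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        feature = None if rng.random() < 0.1 else x  # ~10% missing values
        rows.append({"feature": feature, "label": int(x > 5)})
    return rows


def explore(rows):
    """Step 3: count missing values and the share of positive labels."""
    missing = sum(r["feature"] is None for r in rows)
    positive_share = sum(r["label"] for r in rows) / len(rows)
    return {"rows": len(rows), "missing": missing,
            "positive_share": positive_share}


def clean(rows, fill):
    """Step 4: replace missing features with a fill value."""
    return [{"feature": fill if r["feature"] is None else r["feature"],
             "label": r["label"]} for r in rows]


def accuracy(rows, threshold):
    """Share of rows where 'feature > threshold' matches the label."""
    hits = sum(int(r["feature"] > threshold) == r["label"] for r in rows)
    return hits / len(rows)


def fit(rows):
    """Step 5: choose the cut-off with the best training accuracy."""
    return max((r["feature"] for r in rows),
               key=lambda t: accuracy(rows, t))


def run_pipeline(seed=0):
    rows = collect(seed=seed)
    summary = explore(rows)
    split = int(0.8 * len(rows))
    train, holdout = rows[:split], rows[split:]
    # Compute the fill value on the training part only, so nothing
    # from the holdout rows leaks into the model.
    known = [r["feature"] for r in train if r["feature"] is not None]
    fill = sum(known) / len(known)
    train, holdout = clean(train, fill), clean(holdout, fill)
    threshold = fit(train)
    # Step 6: validate on rows the model has never seen.
    return summary, threshold, accuracy(holdout, threshold)


summary, threshold, holdout_accuracy = run_pipeline()
print(summary, threshold, holdout_accuracy)
```

Steps 1 (understand), 7 (communicate results) and deployment have no code here; in a Kaggle competition the problem statement and data are already provided, which is one way to read the question on the previous slide.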
?
