Introduction to Data Science

8,290 views

Morning@Lohika talk.

Published in: Data & Analytics

  1. Anastasiia Kornilova
  2. WHO AM I? • 3+ years in Data Science • MS in Applied Mathematics • Professional interests: recommendation systems, natural language processing, scalable data science solutions • Author of two blogs: energyfirefox.blogspot.com, datascientistdiary.blogspot.com • Fan of online education (20+ finished MOOCs)
  3. AGENDA • What is Data Science and why do we need it? • Data Scientists: who are they and what do they do? • How to start? • Practical case
  4. DATA IS EVERYWHERE
  5. TONS OF DATA
  6. DATA EXCHANGE
  7. MAKING SENSE OF DATA
  8. USERS' FEEDBACK
  9. DATA IS THE NEW OIL
  10. WHAT IS DATA SCIENCE? (Andrew Conway)
  11. FRAUD DETECTION
  12. RECOMMENDATIONS
  13. OPTIMISATION
  14. SENTIMENT ANALYSIS
  15. IMAGE/SPEECH/TEXT RECOGNITION
  16. FUTURE PREDICTION
  17. FIND PATTERNS IN USER BEHAVIOUR/ACTIONS
  18. WHO USES DATA SCIENCE?
  19. DATA SCIENTISTS DEMAND
  20. WHO ARE DATA SCIENTISTS AND WHAT DO THEY DO?
  21. TYPES OF DATA SCIENTISTS • A - Analysis • B - Building (Robert Chang)
  22. DS TYPE “A” - ANALYSIS • making sense of data or working with it in a fairly static way • very similar to a statistician (and may be one) • knows all the practical details of working with data that aren’t taught in the statistics curriculum: data cleaning, methods for dealing with very large data sets, visualization, deep knowledge of a particular domain, writing well about data
  23. DS TYPE “B” - BUILDING • share some statistical background with Type A • very strong coders and may be trained software engineers • mainly interested in using data “in production” • build models which interact with users, often serving recommendations (products, people you may know, ads, movies, search results)
  24. WHAT DO THEY DO?
  25. WHAT DO THEY DO? • Understand • Collect • Data exploration • Clean and transform • Model • Validate • Communicating results • Deploy
  26. TYPICAL DATA SCIENCE WORKFLOW • Preparing to run a model (gathering, cleaning, transformation) • Running the model • Interpreting the results “80% of work” - Aaron Kimball “Other 80% of the work”
  27. REQUIRED SKILLS
  28. DOMAIN KNOWLEDGE AND SOFT SKILLS • Passionate about the business • Curious about data • Influence without authority • Hacker mindset • Problem solver • Strategic, proactive, creative, innovative and collaborative
  29. MATH AND STATISTICS • Machine learning • Statistical modelling • Experiment design • Supervised learning • Unsupervised learning • Optimisation
  30. PROGRAMMING AND DATABASES • Computer science fundamentals • Scripting language • Statistical computing language • Databases • Relational algebra • Distributed computations
  31. COMMUNICATION AND VISUALIZATION • Ability to engage with senior management • Storytelling skills • Visual art design • Knowledge of a visualization tool • Translate data-driven insights into decisions and actions
  32. WHERE TO OBTAIN SKILLS?
  33. HARD WAY
  34. TRADITIONAL WAY • LITS - Machine Learning • UCU - CS Master Degree • Data Science Degree • Kyivstar - Big Data University
  35. EASY WAY
  36. MACHINE LEARNING ≠ DATA SCIENCE
  37. AND NOW WHAT?
  38. ARE YOU GOOD ENOUGH?
  39. KAGGLE STORY
  40. Problem owners • Problem solvers
  41. WHAT CAN YOU FIND ON KAGGLE? • Knowledge • Money • Job • Reputation
  42. 1. Understand 2. Collect 3. Data exploration 4. Clean and transform 5. Model 6. Validate 7. Communicating results 8. Deploy
  43. PASSION AND PERSISTENCE
  44. TIME FOR FUN: San Francisco crimes analysis
  45. THANK YOU!
  46. LINKS • https://medium.com/@rchang/my-two-year-journey-as-a-data-scientist-at-twitter-f0c13298aee6#.49jdojamn • https://blog.kissmetrics.com/how-netflix-uses-analytics/ • http://recode.net/2015/10/07/jawbone-isnt-a-hardware-company-anymore-says-ceo-hosain-rahman/ • https://jawbone.com/blog/napa-earthquake-effect-on-sleep/ • http://cs.ucu.edu.ua/ • http://lits.com.ua/course/machine-learning/ • http://bigdata.kyivstar.ua/ • https://www.kaggle.com/ • http://inversquare.github.io/moon/mooncrime.html#part-2-crimes-of-passion • http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0059030#abstract0 • http://blog.babajob.com/wp-content/uploads/2015/08/manyhands-300x161.jpg • http://www.criminalelement.com/images/stories/-2015-Jul-Sep/sherlock-holmes-benedict-cumberbatch.jpg
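
The workflow on the "WHAT DO THEY DO?" slides (collect, clean and transform, model, validate) can be sketched in a few lines of plain Python. This is only an illustrative toy, not code from the talk: the synthetic data, the linear model, the 80/20 split, and every function name (`collect`, `clean`, `fit`, `validate`, `run_workflow`) are assumptions made for the example.

```python
import random
import statistics


def collect(n=200, seed=0):
    """Collect: synthetic (x, y) rows standing in for real data (assumed)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        y = 3.0 * x + 2.0 + rng.gauss(0, 1)  # hypothetical ground truth plus noise
        rows.append((x, y))
    # Blank out every 25th target so the cleaning step has work to do.
    for i in range(0, n, 25):
        rows[i] = (rows[i][0], None)
    return rows


def clean(rows):
    """Clean and transform: drop rows whose target value is missing."""
    return [(x, y) for x, y in rows if y is not None]


def fit(rows):
    """Model: ordinary least squares for y = slope * x + intercept."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    slope = sxy / sxx
    return slope, my - slope * mx


def validate(model, rows):
    """Validate: mean absolute error on rows the model has not seen."""
    slope, intercept = model
    return statistics.fmean(abs(y - (slope * x + intercept)) for x, y in rows)


def run_workflow():
    rows = clean(collect())
    split = int(0.8 * len(rows))  # simple 80/20 hold-out split
    train, test = rows[:split], rows[split:]
    model = fit(train)
    return model, validate(model, test), len(rows)


if __name__ == "__main__":
    (slope, intercept), mae, n_clean = run_workflow()
    # Communicating results: a one-line summary a stakeholder can read.
    print(f"rows after cleaning: {n_clean}")
    print(f"model: y = {slope:.2f} * x + {intercept:.2f}; hold-out MAE = {mae:.2f}")
```

Even in this toy, most of the code is data preparation and reporting rather than the model itself, which matches the point of the "TYPICAL DATA SCIENCE WORKFLOW" slide.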
