# How to Start Doing Data Science

This talk goes over what Data Science is and how
you can start working with data in
your role. This is for everyone interested in Data
Science who might be unsure about how to
start working with data. Learn the core
concepts of Data Science and how you can
start learning data science pain-free!

### How to Start Doing Data Science

1. 1. Ayodele Odubela: How to Start Doing Data Science
2. 2. About Me
    - Data Scientist @ CometML
    - MS in Data Science from Regis University
    - Teaching Explainable ML
    - Author of *Getting Started in Data Science*
    - Currently writing *Uncovering Bias in Machine Learning*
3. 3. Skills 01
4. 4. What is Data Science? Data science is an interdisciplinary field that uses (somewhat) scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data.
5. 5. Core skill areas
    - Coding: SQL, Python, R
    - Business Sense: understanding and creating new metrics; deciding which methods work in your industry
    - Math: Statistics, Probability, Linear Algebra
6. 6. What Data Projects Include
    - Identify a problem
    - Assess the org’s incentives
    - Gather & clean data
    - Data documentation
    - Exploratory analysis
    - Inferential statistics
    - Data storytelling
    - Harm identification and mitigation
    - Creating ML models
    - Building user recourse frameworks
7. 7. Relevant Roles
    - Data Scientist. Skills: Advanced SQL, Intermediate Python/R, Intermediate Statistics
    - Machine Learning Engineer. Skills: Advanced Python, TensorFlow/PyTorch/Keras, Intermediate Linear Algebra, Calculus, & Statistics
    - Research Scientist. Skills: Advanced Math, Science Communications
9. 9. Data Wrangling: Any language can be used to get data from databases and APIs.
10. 10. Data Cleaning
    - Dealing with missing values
    - Combining sparse categorical columns
11. 11. Data Transformations: Power transforms and scaling/normalization. We do this to make modeling structured data easier.
12. 12. Functional Programming: Applying and composing functions to make code more concise and reusable.
13. 13. Experimental Design: Understanding and making consistent experimental choices.
    - Hypothesis testing
    - A/B testing
14. 14. Concepts 03
15. 15. Goals
    - Predict future events given past data
    - Find anomalies in our datasets
    - Make recommendations based on someone’s interests
16. 16. Methods
    1. Clean data so it's in a format we can model
    2. Understand data distributions to inform model selection
    3. Perform Exploratory Data Analysis to grasp the data
    4. Choose modeling techniques that help us solve problems
    5. Measure how well our models perform and optimize them
    6. Iterate!
17. 17. Exploratory What? In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model may or may not be used, but EDA is primarily for seeing what the data can tell us beyond the formal modeling or hypothesis testing task.
18. 18. Regression is just a fancy word for predicting numbers
19. 19. Classification attempts to tell different things apart
20. 20. Clustering tries to identify groups of similar things based on how far apart they are
21. 21. Reinforcement Learning: autonomous agents learn from their environment and make new decisions based on whether they were rewarded or punished
22. 22. Hands-On Experience 04
23. 23. Practice Consistently
24. 24. Finding Data
    - Kaggle
    - UCI Data Repository
    - Data.World
    - Government & local open data
    - Web scraping
    - Public APIs
25. 25. Cleaning & Manipulating Data
    - Grasp the basic techniques
    - Build intuition for when to use certain methods
    - Understand the pros and cons of each
    - Tools: Excel, Python & R, SQL
26. 26. Getting Practice: GitHub, Medium Tutorials, Hackathons, MOOCs
27. 27. To get a formal education or not?
28. 28. Market Yourself 05 (even while you’re still learning)
29. 29. Communicate Your Value
    - How have you impacted past businesses?
    - How would your relevant projects help a company?
    - Do you know how to quantify your value?
30. 30. GitHub
    - Showing off code projects
    - Connecting with other developers
    - Collaborating and proving technical skills
31. 31. Blog / Personal Website
    - Share expertise
    - Show off your portfolio
    - Provide insight into your thought process
32. 32. Thank You!
    - 25% off *Getting Started in Data Science* with code VBROWNBAG
    - @DataSciBae
    - ayodeleodubela.com
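
The loop on the Methods slide (clean, transform, model, measure) can be sketched in a few lines of plain Python. This is a minimal illustration rather than anything taken from the talk: the function names, the toy hours-versus-scores data, and the specific choices (median imputation for missing values, z-score scaling, a straight-line regression, mean absolute error) are all assumptions picked for brevity. It also measures error on the same data it was fitted to, which a real project would replace with held-out data.

```python
from statistics import mean, median, pstdev


def fill_missing(values):
    """Data cleaning: replace None with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Data transformation: rescale to zero mean and unit standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    variance = sum((x - mx) ** 2 for x in xs)
    slope = covariance / variance
    return slope, my - slope * mx


def mean_absolute_error(actual, predicted):
    """Measurement: average size of the prediction errors."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical toy data: hours studied (one value missing) and exam scores.
hours = [1.0, 2.0, None, 4.0, 5.0, 6.0]
scores = [52.0, 55.0, 61.0, 64.0, 70.0, 73.0]

hours_clean = fill_missing(hours)            # 1. clean
hours_scaled = standardize(hours_clean)      # 2. transform
slope, intercept = fit_line(hours_scaled, scores)  # 3. model
predictions = [slope * x + intercept for x in hours_scaled]
mae = mean_absolute_error(scores, predictions)     # 4. measure

print(f"slope={slope:.2f} intercept={intercept:.2f} MAE={mae:.2f}")
```

Each step is a small function that takes data and returns new data, which is also the functional programming idea from slide 12: the steps can be swapped or recombined (for example, a different imputation rule) without touching the rest of the pipeline.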