Intro to Career in Data
Science
Content
1. FAQ - Summary of the course
2. Understanding the Role: Data Science Career Overview
3. Demand, supply and the Job Market
4. Salaries in Data Science
5. Typical Data Science Course
6. Excelling as Data Scientist
Data Science Career Overview
● Why choose a career in
Data Science?
● Data Science is
Interdisciplinary
● Data Science Position and
Titles
● Thoughts on Higher
Education
Intro to Data Science - What is Data Science
What is Data Science?
Why do we need Data Science
What does a Data Scientist do
What a Data Scientist's typical day looks like
Data Science Roles
Salary Range
How does a typical Data Science project
work
Future of Data Science
Will Data Science be on Demand
Prerequisites
How much Math / Statistics is required?
How many Machine Learning algorithms should I
know?
Importance of having a Master's or PhD
What are the minimum requirements to start
a career in Data Science?
What background do you need to start a career
in Data Science?
What is the most important habit for becoming a
good Data Scientist?
What are the best 3 qualities to have in Data
Science?
Pathways to study Data Science
What are the main skills to learn?
Statistics topics to learn
Which programming languages should you
know?
R vs. Python vs Scala
Do I need to know R & Python or will only
one do?
Recommended Books
Resources to learn
How to become a Top Level Data Scientist
Where to get practical experience
Portfolio & Resume Prep
Should I create a blog or portfolio in order to get a
DS job?
Best places to promote your skills
Facebook / LinkedIn Groups
GitHub, Kaggle
How to stand out from the crowd
How to prepare a CV / Interview
What questions should you expect
How to prepare yourself for the interview
Understanding the
Role
1. Data Science vs Artificial
Intelligence vs Machine
Learning
2. Data Science vs Data Analyst
vs Business Intelligence
5-10 Minutes
Data Scientist
The title “data scientist” is relatively new and not yet clearly defined. Because it
lacks specificity, it is sometimes perceived as an elevated synonym for “data analyst,”
but that is not the case. A data scientist combines analytic, machine learning, data
mining, and statistical skills with experience in algorithms and coding.
Data scientists also typically have expertise in tools and languages such as R, SAS,
Python, Matlab, SQL, Hive, Pig, and Spark. Perhaps the most important skill a data
scientist possesses is the ability to explain the significance of data in a way that
others can easily understand.
Data Science vs Data Analytics
Data Analytics vs Data Science
BI vs Data Science
Scope of Business Analytics
● Descriptive analytics
- uses data to understand the past and present
● Predictive analytics
- analyzes past performance to predict future outcomes
● Prescriptive analytics
- uses optimization techniques to identify the best decisions
Scope of Business Analytics
Example 1.1 Retail Markdown Decisions
● Most department stores clear seasonal inventory by
reducing prices.
● The question is:
When should the price be reduced, and by how much?
● Descriptive analytics: examine historical data for similar
products (prices, units sold, advertising, …)
● Predictive analytics: predict sales based on price
● Prescriptive analytics: find the best combination of pricing and
advertising to maximize sales revenue
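The three levels in the markdown example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the price and sales history is made up, demand is assumed to be linear in price, advertising is ignored, and the function names (`describe`, `fit_linear_demand`, `best_price`) are hypothetical.

```python
from statistics import mean

# Hypothetical weekly history for a similar product: (price, units sold).
history = [(50.0, 100), (45.0, 120), (40.0, 140), (35.0, 160), (30.0, 180)]

def describe(rows):
    """Descriptive analytics: summarize what happened in the past."""
    return {"avg_price": mean(p for p, _ in rows),
            "avg_units": mean(u for _, u in rows)}

def fit_linear_demand(rows):
    """Predictive analytics: least-squares fit of units = a + b * price."""
    p_bar = mean(p for p, _ in rows)
    u_bar = mean(u for _, u in rows)
    b = (sum((p - p_bar) * (u - u_bar) for p, u in rows)
         / sum((p - p_bar) ** 2 for p, _ in rows))
    a = u_bar - b * p_bar
    return a, b

def best_price(a, b, candidates):
    """Prescriptive analytics: candidate price with the highest predicted revenue."""
    return max(candidates, key=lambda p: p * max(a + b * p, 0.0))

summary = describe(history)
a, b = fit_linear_demand(history)
recommended = best_price(a, b, [30.0, 32.5, 35.0, 37.5, 40.0])
print(summary, a, b, recommended)
```

With this toy history the fitted line is units = 300 - 4 * price, so predicted revenue peaks at a price of 37.5 among the candidates tried.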
Data for Business Analytics
Four Types of Data Based on Measurement Scale:
● Categorical (nominal) data
● Ordinal data
● Interval data
● Ratio data
Data for Business Analytics
Example 1.3
Classifying Data Elements in a Purchasing Database
Data for Business Analytics
Example 1.3 (continued)
Classifying Data Elements in a Purchasing Database
Data for Business Analytics
Categorical (nominal) Data
● Data placed in categories according to a specified
characteristic
● Categories bear no quantitative relationship to one
another
● Examples:
- customer’s location (America, Europe, Asia)
- employee classification (manager, supervisor,
associate)
Data for Business Analytics
Ordinal Data
● Data that is ranked or ordered according to some
relationship with one another
● No fixed units of measurement
● Examples:
- college football rankings
- survey responses
(poor, average, good, very good, excellent)
Data for Business Analytics
Interval Data
● Ordinal data but with constant differences
between observations
● No true zero point
● Ratios are not meaningful
● Examples:
- temperature readings
- SAT scores
Data for Business Analytics
Ratio Data
● Continuous values that have a natural zero point
● Ratios are meaningful
● Examples:
- monthly sales
- delivery times
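The practical consequence of the four scales is which operations make sense on each. The sketch below uses made-up values and hypothetical variable names; it is meant only to illustrate the distinctions on the slides above.

```python
from statistics import mode

# Hypothetical records; every value here is invented for illustration.
regions = ["Europe", "Asia", "Europe", "America"]            # categorical (nominal)
ratings = ["good", "poor", "excellent", "good", "average"]   # ordinal
temps_f = [50.0, 100.0]                                      # interval (no true zero)
monthly_sales = [2000.0, 4000.0]                             # ratio (natural zero)

# Nominal: only counting (and therefore the mode) is meaningful.
most_common_region = mode(regions)

# Ordinal: order is meaningful, so a median category works; distances do not.
scale = ["poor", "average", "good", "very good", "excellent"]
ranks = sorted(scale.index(r) for r in ratings)
median_rating = scale[ranks[len(ranks) // 2]]

# Interval: differences are meaningful, but ratios depend on the unit chosen.
temp_difference_f = temps_f[1] - temps_f[0]
ratio_in_f = temps_f[1] / temps_f[0]
temps_c = [(t - 32.0) * 5.0 / 9.0 for t in temps_f]
ratio_in_c = temps_c[1] / temps_c[0]   # differs from ratio_in_f, so neither is meaningful

# Ratio: zero means "none", so "twice as much" is a valid statement.
sales_ratio = monthly_sales[1] / monthly_sales[0]

print(most_common_region, median_rating, temp_difference_f, ratio_in_f, ratio_in_c, sales_ratio)
```

The temperature ratio changes when the same readings are converted from Fahrenheit to Celsius, which is why ratios are not meaningful for interval data, while the sales ratio stays the same in any currency unit.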
Unstructured Data
MapReduce Big Data
NoSQL Databases
Cleaning and Wrangling
http://159.89.224.205/wp-content/uploads/2016/02/tumblr_inline_o21df5eSYo1sleek4_540.png
Big data draws from a number of sources, both structured and
unstructured. Structured data is organized, typically by
categories that make it easy for a computer to sort, read and organize
automatically.
Unstructured data, the fastest-growing form of big data, is more
likely to come from human input: customer reviews, emails, videos,
social media posts, etc.
Typically, businesses employ data scientists to handle this
unstructured data, whereas other IT personnel are responsible for
managing and maintaining structured data.
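To make the distinction concrete, here is a minimal sketch of turning unstructured text into structured rows. The reviews are invented, and `tokenize` and `to_structured` are hypothetical helper names; real pipelines usually need far more cleaning than this.

```python
import re
from collections import Counter

# Hypothetical customer reviews (unstructured free text).
reviews = [
    "Great battery life, great screen!",
    "Screen cracked after a week. Terrible.",
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def to_structured(texts):
    """Turn each free-text review into a row with fixed, sortable fields."""
    rows = []
    for i, text in enumerate(texts):
        tokens = tokenize(text)
        rows.append({
            "review_id": i,
            "n_words": len(tokens),
            "word_counts": Counter(tokens),
        })
    return rows

rows = to_structured(reviews)
for row in rows:
    print(row["review_id"], row["n_words"], row["word_counts"].most_common(2))
```

Once each review is a row with the same fields, it can be sorted, filtered and aggregated like any other structured data.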
How many Machine Learning algorithms should I
know?
Decision tree
Random forest
Logistic regression
Support vector machine
Naive Bayes
k-nearest neighbors
k-means
AdaBoost
Neural network
Markov
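As a taste of how small some of these algorithms are at their core, here is a minimal k-nearest neighbors classifier in plain Python. The training points and labels are made up, and `knn_predict` is a hypothetical name; in practice a library implementation would normally be used instead.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(train, key=lambda row: math.dist(row[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training data: (feature vector, label).
train = [
    ((1.0, 1.0), "low"), ((1.5, 2.0), "low"), ((2.0, 1.0), "low"),
    ((6.0, 6.0), "high"), ((7.0, 5.5), "high"), ((6.5, 7.0), "high"),
]

print(knn_predict(train, (1.2, 1.4)))
print(knn_predict(train, (6.4, 6.0)))
```

The whole method is "find the closest known examples and copy their most common label", which is why it is often taught first.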
Artificial Intelligence vs Machine Learning
Machines Will Do Half Our Work By 2025 (Forbes Sep 2018).
Artificial Intelligence is the broader concept of machines being able to carry out
tasks in a way that we would consider “smart”. Artificial intelligences are devices
designed to act intelligently. Related topics: ML and neural networks, Python automation.
Source: https://www.forbes.com/sites/patrickwwatson/2018/09/27/machines-will-do-half-our-work-by-2025/#204a1b255e2a
http://blogs-images.forbes.com/louiscolumbus/files/2017/05/Data-Science-and-Analytics-Demand-by-industry.jpg
Supply & Demand of
Data Science
Professionals
10-15 Minutes
Demand and Supply of Data Science Professionals
Bridging The Data Scientist Talent Gap Starts With Defining The Current Role
(Forbes June 2018). Demand for data science and analytics skills: new job postings
are projected to reach 2.72M in 2020 (BHEF/PwC 2017). Annual demand for the
fast-growing new roles of data scientist, data developer, and data engineer will reach
nearly 700,000 openings by 2020. By 2020, the number of jobs for all US
data professionals will increase by 364,000 openings to 2,720,000,
according to IBM.
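The two IBM figures quoted above imply a baseline and a growth rate that the slide does not state. A quick check of the arithmetic, using only those two numbers (the variable names are illustrative):

```python
projected_jobs_2020 = 2_720_000   # figure quoted on the slide
added_openings = 364_000          # figure quoted on the slide

# Implied starting point and relative growth.
baseline_jobs = projected_jobs_2020 - added_openings
growth_rate = added_openings / baseline_jobs

print(f"Implied baseline: {baseline_jobs:,} jobs")
print(f"Implied growth: {growth_rate:.1%}")
```

The implied baseline is 2,356,000 jobs, so the projected increase works out to roughly 15.4% over the period.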
IT Spending, Freelancing and Hiring Trends
IT spending is projected to reach about $3.85 trillion in 2019, up 2.8% from
2018. 36% of the workforce is contract-based or freelance talent, with
projections showing freelancers will outnumber non-freelancers in the U.S. by
2027. What's Coming: Tech Hiring Predictions For 2019 (Forbes Dec 2018).
At Verizon, predictive analytics algorithms monitor 3GB of data every
second streaming from millions of network interfaces. The Amazing Ways Verizon Uses
AI And Machine Learning To Improve Performance (Forbes June 2018).
Future Jobs - Machines taking away Jobs
Deep Learning is used by Google in its voice and image recognition
algorithms, and by Netflix and Amazon to decide what you want to watch or buy. ML is
described as a sub-discipline of AI. The Workforce Needs AI -- But AI Needs Human
Workers, Too (Forbes Nov 2018). AI is expected to be able to write a high school
essay and drive a truck better than a human can, to have a 50% chance of
outperforming humans at all tasks within 45 years, and to automate all jobs in the
next century. 14-54% of the U.S. workforce could see their jobs automated in
the next two decades. Let The Robots Take Over: How The Future Of AI Will Create
More Jobs (Forbes Dec 2018).
Future of Job Market
75% of finance departments will employ automation by 2020.
Jobs taken away by Artificial Intelligence. Robots Aren't Coming For Jobs: AI Is
Already Taking Them (Forbes Oct 2018).
Credit Suisse is using deep neural networks, random forests and NLP to
eliminate analyst jobs (Waterstechnology 2019). What Is The
Difference Between Deep Learning, Machine Learning and AI? (Forbes Dec 2016).
10 Amazing Examples Of How Deep Learning AI Is Used In Practice? (Forbes Aug
2018). Machine Learning And AI Will Disrupt All Careers. Eight Ways Big Data And
AI Are Changing The Business World (Forbes 2018).
NYU Center for Data Science
Salary of Data
Science
Professionals
5-10 Minutes
https://www.burning-glass.com/wp-content/uploads/The_Quant_Crunch.pdf
Data Science Job - Demand & Salary
Data Scientist has been named the best job in America for three years running,
with a median base salary of $110,000 and 4,524 job openings. Data Scientist
Is the Best Job In America According Glassdoor's 2018 Rankings (Forbes Jan 2018).
Data Science Jobs
Data science is a fast-growing and lucrative field, with the BLS predicting jobs in
this field will grow 11 percent by 2024. Data scientist is also shaping up to be a
satisfying long-term career path. According to data from Robert Half's 2018
Technology and IT Salary Guide, the average salary for data scientists, based on
experience, breaks down as follows:
25th percentile: $100,000
50th percentile: $119,000
75th percentile: $142,750
95th percentile: $168,000
http://blogs-images.forbes.com/louiscolumbus/files/2017/05/highest-paying-skills.jpg
Data Science
Courses &
Bootcamp
5-10 Minutes
Introduction
● The difference between Data Science, Machine Learning, Artificial Intelligence and
Data Analytics. How do industry and HR use these terms when writing job
descriptions?
● You will learn to use Python to help you acquire, parse and model your data.
● A significant portion of the course will be a hands-on approach to the
fundamental modeling techniques and machine learning algorithms that
enable you to build robust predictive models of real-world data and test their
validity.
● Scala, Hadoop and other technologies can be faster and may be one
level closer to production. The idea of the course remains to develop an analytical
thought process. Many data wrangling terms and concepts remain the same
because they are language agnostic.
What is inside the typical Data Science Course
● Mathematics
● Statistics
● Statistical techniques in Python & Data Visualization
● Machine Learning
● Big Data Engineering
● Deep Learning
Pre-work: Introductory Python (Optional), Data Analysis and Visualization with
Python, Statistics
How much Math / Statistics is required?
Logarithm, exponential, polynomial functions, rational numbers.
Basic geometry and theorems, trigonometric identities.
Real and complex numbers and basic properties.
Series, sums, and inequalities.
Graphing and plotting, Cartesian and polar coordinate systems, conic sections.
Linear algebra (and ideally basic multivariate calculus)
Regression: linear regression and the things that violate the assumptions of linear
models (e.g., autocorrelation in time series data, non-independent observations)
Probability theory, especially Bayes' Law and the Central Limit Theorem
Numerical analysis (e.g., time series analysis and forecasting)
Core machine learning methods (clustering, decision trees, k-NN)
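Of the topics listed, Bayes' Law is one that comes up constantly in practice and fits in a few lines. The sketch below uses an invented churn scenario with made-up rates, and `posterior` is a hypothetical function name.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Bayes' Law: P(H | E) = P(E | H) * P(H) / P(E)."""
    # P(E) sums over both ways the evidence can occur: H true or H false.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: 10% of customers churn; a warning flag fires for
# 80% of churners and for 20% of customers who do not churn.
p_churn_given_flag = posterior(prior=0.10, likelihood=0.80, false_positive_rate=0.20)
print(f"P(churn | flag) = {p_churn_given_flag:.3f}")
```

Even with a flag that catches 80% of churners, the probability that a flagged customer actually churns is only about 31% here, because non-churners are far more numerous.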
Excelling as Data
Scientist
5-10 Minutes
Excelling as Data Scientist
What Does It Take To Excel As A Data Scientist These Days? (Nov 2018).
Companies are only using about 12% of their data. Core Curriculum: Hadoop,
Spark, Machine Learning, Visualization. Specialization: Deep Learning, Data
Engineering & Big Data, Automation (DevOps)
Technical Skills for Data Scientists
Math (e.g. linear algebra, calculus and probability)
Statistics (e.g. hypothesis testing and summary statistics)
Machine learning tools and techniques (e.g. k-nearest neighbors, random forests, ensemble methods, etc.)
Software engineering skills (e.g. distributed computing, algorithms and data structures)
Data mining
Data cleaning and munging
Data visualization (e.g. ggplot and d3.js) and reporting techniques
Unstructured data techniques
R and/or SAS languages
SQL databases and database querying languages
Python (most common), C/C++, Java, Perl
Big data platforms like Hadoop, Hive & Pig
Cloud tools like Amazon S3


Editor's Notes

  • #10  https://www.discoverdatascience.org/career-information/
  • #35  https://www.cio.com/article/3217026/data-science/what-is-a-data-scientist-a-key-data-analytics-role-and-a-lucrative-career.html
  • #37 https://bigdata-madesimple.com/10-machine-learning-algorithms-know-2018/
  • #39 https://www.forbes.com/sites/patrickwwatson/2018/09/27/machines-will-do-half-our-work-by-2025/#204a1b255e2a https://www.forbes.com/sites/bernardmarr/2016/12/06/what-is-the-difference-between-artificial-intelligence-and-machine-learning/
  • #42 https://www.forbes.com/sites/louiscolumbus/2017/05/13/ibm-predicts-demand-for-data-scientists-will-soar-28-by-2020/#d6e399b7e3bd
  • #43  https://www.forbes.com/sites/bernardmarr/2018/06/22/the-amazing-ways-verizon-uses-ai-and-machine-learning-to-improve-performance/#6c07a9b27638 https://www.forbes.com/sites/janakirammsv/2018/10/01/the-key-factor-that-influences-the-adoption-of-cloud-based-machine-learning-platforms/#5a7b2b561ad1 https://www.forbes.com/sites/forbesbusinessdevelopmentcouncil/2018/12/28/whats-coming-tech-hiring-predictions-for-2019/#f07b9e14c1a7
  • #45 https://www.waterstechnology.com/data-management/4007716/deep-learning-the-evolution-is-here
  • #46 https://cds.nyu.edu/careersindatascience/
  • #49  http://www.bhef.com/sites/default/files/bhef_2017_investing_in_dsa.pdf Data Scientist Is the Best Job In America According Glassdoor's 2018 Rankings https://www.forbes.com/sites/louiscolumbus/2018/01/29/data-scientist-is-the-best-job-in-america-according-glassdoors-2018-rankings/#c8cd6555357e
  • #54 https://www.forbes.com/sites/bernardmarr/2018/08/20/10-amazing-examples-of-how-deep-learning-ai-is-used-in-practice/ https://www.forbes.com/sites/bernardmarr/2016/12/08/what-is-the-difference-between-deep-learning-machine-learning-and-ai/#6aa87e9126cf
  • #55  https://www.datascienceweekly.org/articles/how-much-math-stats-do-i-need-on-my-data-science-resume https://towardsdatascience.com/essential-math-for-data-science-why-and-how-e88271367fbd
  • #57 https://www.burning-glass.com/wp-content/uploads/The_Quant_Crunch.pdf https://www.forbes.com/sites/forbestechcouncil/2018/11/26/what-does-it-take-to-excel-as-a-data-scientist-these-days/1