Introduction to Data Analytics
Data Analytics, Data Science, and
Machine Learning
Learning Objectives
By the end of this lesson, you will be able to:
● Define data science and machine learning
● Differentiate between data science, machine learning, and data analytics
Introduction to Data Science
Data Science
Data science is the study of data, which involves gathering, storing, analyzing, and
plotting data, to effectively extract useful information.
Aim: Gain meaningful insights from both structured and unstructured data.
Data Science
Data science involves:
● Data cleansing
● Preparation and analysis
● Trend forecasting
● Machine learning and data analytics
Types of Data Science
Data science is an umbrella that covers:
● Data analytics
● Data mining
● Machine learning
Data Analytics
Data analytics is the process of examining and analyzing raw data sets to:
● Draw conclusions
● Derive information
● Derive insights from raw data sources
Machine Learning
● Learns from patterns in the past using a set of algorithms
● Predicts outcomes accurately
Data Mining
● Data mining is the process of analyzing data from
different perspectives.
● It summarizes data into useful information.
● It helps increase revenue and cut costs.
Data Science, Data Analytics, and Machine Learning
Data Science and Data Analytics
● Data scientist: forecasts the future based on past patterns
● Data analyst: extracts meaningful insights from various data sources
Machine Learning
Machine learning creates systems that can learn from the data.
It is the ability of machines to predict outcomes based on patterns in the past.
Machine Learning
● ML engineer: leverages various algorithms to train the machine
Data Science and Machine Learning
A data scientist:
● Gathers data from various sources
● Extracts useful information from collected data sets
● Understands data from a business point of view
● Provides accurate predictions to improve key business decisions
Understanding Data Science
Understanding Data Science
A data scientist combines both domain and technology perspectives.
Understanding Data Science
A data scientist:
● Knows multiple analytical functions
● Works with data from video and social media sources
● Has a sound knowledge of technologies such as Python, SAS, R, Scala, visualization libraries, SQL databases, and machine learning
Data Science: Process Flow
Why does car insurance cost less if you pay your bills on time?
Data scientists found that people who pay their bills promptly are less prone to accidents.
Data Science: Process Flow
Step 1: Data acquisition
Data scientists work with existing data sets or gather them from various sources.
The most important part of the whole process is to have the correct data.
Data Science: Process Flow
Step 2: Data wrangling
● Choose the right tools from Python, R, and SQL
● Derive a clean data set
● Apply pick-and-shovel algorithms
● Obtain meaningful data
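The wrangling step can be illustrated with a minimal Python sketch that uses only the standard library. The raw data, the column names, and the cleaning rules (trim whitespace, normalize casing, drop rows with a missing value, drop duplicates) are assumptions made for this example; real projects choose rules that fit their own data.

```python
import csv
import io

# Hypothetical raw export: inconsistent casing, stray spaces, a missing value,
# and a row that duplicates another once it is normalized.
RAW_CSV = """customer,city,monthly_spend
 Alice ,Boston,120.50
BOB,boston,
alice,Boston,120.50
Carol,Denver,80
"""


def wrangle(raw_text):
    """Return a clean list of records from raw CSV text."""
    cleaned = []
    seen = set()
    for row in csv.DictReader(io.StringIO(raw_text)):
        name = row["customer"].strip().title()
        city = row["city"].strip().title()
        spend_text = row["monthly_spend"].strip()
        if not spend_text:
            continue  # drop rows with a missing spend value
        record = (name, city, float(spend_text))
        if record in seen:
            continue  # drop duplicates that appear after normalization
        seen.add(record)
        cleaned.append({"customer": name, "city": city, "monthly_spend": record[2]})
    return cleaned


clean_rows = wrangle(RAW_CSV)
for row in clean_rows:
    print(row)
```

Of the four raw rows, two survive: the row with no spend value is dropped, and the second "alice" row is recognized as a duplicate only after normalization.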
Data Science: Process Flow
Step 3: Machine learning
● Validate the model
● Perform necessary statistical analysis
● Apply machine learning or recursive analysis
● Run regression testing
● Compare results against other techniques or sources
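Two of these points, validating the model and comparing results against another technique, can be sketched in plain Python. The data below is synthetic, and the choice of a straight-line fit, a hold-out split, and a mean-value baseline are assumptions for illustration rather than the method the lesson prescribes.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    return statistics.fmean([(a - p) ** 2 for a, p in zip(actual, predicted)])


# Synthetic data (an assumption): y is roughly 3x + 5 plus random noise.
rng = random.Random(0)
xs = [float(i) for i in range(40)]
ys = [3.0 * x + 5.0 + rng.gauss(0, 2.0) for x in xs]

# Hold out every fourth point for validation and train on the rest.
train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % 4 != 0]
valid = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % 4 == 0]

slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])

# Compare against a simpler technique: always predict the training mean.
baseline = statistics.fmean([y for _, y in train])
valid_y = [y for _, y in valid]
model_mse = mean_squared_error(valid_y, [slope * x + intercept for x, _ in valid])
baseline_mse = mean_squared_error(valid_y, [baseline] * len(valid))

print(f"slope={slope:.2f} intercept={intercept:.2f}")
print(f"validation MSE: model={model_mse:.2f} baseline={baseline_mse:.2f}")
```

The model is judged only on points it never saw during fitting, and it is worth keeping only if its validation error is clearly lower than the baseline's.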
Challenge of a Data Scientist
The most challenging part of being a data scientist is taking the results and presenting them to the
stakeholders in a form that is easy to consume.
Data Science and Business Strategy
Data Science and Business Strategy
Business owners used to measure their success based only on the profit and loss statement.
In the current era of technology, businesses leverage data science to predict efficiently what will work.
Data Science and Business Strategy
The process flow of a data-driven decision-making process:
1. Define business goals
2. Build a team of data scientists
3. Identify data sources and enable new sources of data capture
4. Design business dashboards to track goals
Data Scientist: Asset to the Business
A data scientist:
● Empowers management to make better decisions
● Enables strategic changes for better results
● Identifies and refines the target audience
● Identifies opportunities
● Identifies areas of improvement
● Provides insights on various KPIs and parameters
Companies Using Data Science
Successful Companies Using Data Science
A few successful companies that use data science:
Google Search Engine
Google Search Engine
Google uses data science to provide relevant search recommendations.
The influencing factors include:
● Query volume: unique and verifiable users
● Geographical locations
● Keyword or phrase matches on the web
● Scrubbing for inappropriate content
Facebook Tags
Facebook Tags
Facebook uses machine learning in every aspect, including:
● Scrolling the news feed
● Browsing images or videos
Facebook Tags
Facebook uses clustering algorithms to:
● Find mutual friends
● Send friend suggestions
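The idea behind mutual friends and friend suggestions can be sketched with plain set operations. This is a deliberately simple stand-in that counts mutual friends. It is not a clustering algorithm and not Facebook's actual method, and the friendship graph and function names are hypothetical.

```python
from collections import Counter

# Hypothetical friendship graph: each user maps to the set of their friends.
FRIENDS = {
    "ana": {"ben", "cara"},
    "ben": {"ana", "cara", "dev"},
    "cara": {"ana", "ben", "dev", "eli"},
    "dev": {"ben", "cara"},
    "eli": {"cara"},
}


def mutual_friends(graph, a, b):
    """Friends that users a and b have in common."""
    return graph[a] & graph[b]


def suggest_friends(graph, user, limit=3):
    """Rank non-friends by how many mutual friends they share with user."""
    counts = Counter()
    for friend in graph[user]:
        for candidate in graph[friend]:
            if candidate != user and candidate not in graph[user]:
                counts[candidate] += 1
    # Sort by count (descending), then by name, so the output is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


print(sorted(mutual_friends(FRIENDS, "ana", "dev")))
print(suggest_friends(FRIENDS, "ana"))
```

For "ana", the sketch ranks "dev" first because they share two friends, ahead of "eli" with one.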
Alibaba
Alibaba’s Aliloan
Aliloan is an automated online system that provides flexible microloans to entrepreneurial
online vendors.
Alibaba’s Aliloan
Aliloan:
● Collects data from e-commerce platforms
● Uses predictive models to analyze transaction records
● Determines merchants’ creditworthiness
● Analyzes trading records and evaluates risk
Travel Industry
Travel Industry
The sensors from different modes of transport provide real-time data on various parameters to predict
and prevent problems.
Travel companies use datasets from social media, itineraries, predictive analytics, and location tracking
to arrive at a 360-degree view.
Travel Industry
● Integrates historical data to ensure maximum yield
● Offers deals based on the user’s preferences or recommended local attractions
● Predictive algorithms help drivers predict fuel needs, ETAs, and delays.
Retail
Retail
RFM analysis is a marketing technique that leverages data to determine the target customer. RFM stands for:
● Recency
● Frequency
● Monetary
Retailers use data science to segment customers into RFM groups and to target marketing and promotions.
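A minimal sketch of RFM scoring in Python follows. The transactions, the reporting date, and the segment thresholds are all assumptions chosen for illustration; retailers typically tune such rules to their own data.

```python
from datetime import date

# Hypothetical transactions: (customer, purchase date, amount).
TRANSACTIONS = [
    ("c1", date(2024, 3, 28), 120.0),
    ("c1", date(2024, 3, 10), 80.0),
    ("c1", date(2024, 1, 5), 50.0),
    ("c2", date(2023, 11, 2), 300.0),
    ("c3", date(2024, 3, 1), 20.0),
    ("c3", date(2024, 2, 14), 25.0),
]
AS_OF = date(2024, 4, 1)  # assumed reporting date


def rfm_table(transactions, as_of):
    """Return {customer: (recency_days, frequency, monetary_total)}."""
    table = {}
    for customer, day, amount in transactions:
        recency, frequency, monetary = table.get(customer, (None, 0, 0.0))
        days_ago = (as_of - day).days
        # Recency is the number of days since the most recent purchase.
        recency = days_ago if recency is None else min(recency, days_ago)
        table[customer] = (recency, frequency + 1, monetary + amount)
    return table


def segment(recency, frequency, monetary):
    """Toy segmentation rule; the thresholds are illustrative assumptions."""
    if recency <= 30 and frequency >= 3 and monetary >= 200:
        return "loyal"
    if recency > 90:
        return "lapsed"
    return "occasional"


rfm = rfm_table(TRANSACTIONS, AS_OF)
for customer, values in sorted(rfm.items()):
    print(customer, values, segment(*values))
```

Each customer is reduced to three numbers, and marketing can then be targeted per segment, for example a win-back offer for "lapsed" customers.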
E-Commerce
E-Commerce
Amazon prefers an "everything under one roof" model.
Amazon is an e-commerce giant that leverages data science to the fullest extent.
E-Commerce
E-commerce companies use data science to upsell through their websites.
Amazon’s "People who viewed that product also liked this" feature uses
sophisticated mining techniques and boosts business.
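The general idea behind an "also viewed" feature can be sketched by counting how often two products appear in the same browsing session. This is a simple co-occurrence count with hypothetical session data and function names. It does not represent Amazon's actual technique, which the lesson describes only as sophisticated mining.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical browsing sessions: the set of products viewed in each session.
SESSIONS = [
    {"laptop", "mouse", "laptop bag"},
    {"laptop", "mouse"},
    {"laptop", "monitor"},
    {"mouse", "mouse pad"},
]


def build_co_views(sessions):
    """Count how often each ordered pair of products shares a session."""
    co_views = defaultdict(Counter)
    for session in sessions:
        for viewed, other in permutations(session, 2):
            co_views[viewed][other] += 1
    return co_views


def also_viewed(co_views, product, limit=2):
    """Products most often seen in the same session as the given product."""
    ranked = sorted(co_views[product].items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:limit]]


co_views = build_co_views(SESSIONS)
print(also_viewed(co_views, "laptop"))
```

Here "mouse" ranks first for "laptop" because the two share two sessions; ties are broken alphabetically so the result is deterministic.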
Crime Agencies
Crime Agencies
Analytics keeps crime in check by:
● Using identified patterns to derive
prediction techniques
● Analyzing previous data to prevent future
burglaries
Crime Agencies
● Data mining can help identify patterns in crimes ranging from
domestic violence to terrorism.
● Advanced analytics helps prevent crime by using
information from social media.
Crime Agencies
Crime prevention agencies use data science in deciding:
● Where to deploy police manpower?
● Who to search at a border crossing?
● Which intelligence to consider in counter-terrorism activities?
Analytical Platforms across Industries
Analytical Platforms across Industries
Machine learning algorithms
● Forecasting
● Regression
● Bayesian network
● Vector autoregression
Deep learning architectures
● Deep Belief Network (DBN)
● Convolutional Neural Network (CNN)
● Recurrent Neural Network (RNN)
Cloud storage platforms
● Amazon AWS
● Microsoft Azure
● Lambda
Tools
Reporting tools
● Tableau
● Splunk
● Power BI
● Kibana
Analytics tools
● Spark
● Python
● R
● Apache Pig
Key Takeaways
● Data science is the study of data, which involves gathering, storing, analyzing, and plotting data, to effectively extract useful information.
● Data science is an umbrella that contains data analytics, data mining, and machine learning.
● Data science is used by many successful companies such as Google, Facebook, and Alibaba.
● Analytical platforms across industries include algorithms, architectures, data storage platforms, and tools.
