Introduction to Data Science
By Dr. V. Sumathy,
Assistant Professor, Department of Data Science,
Loyola College, Chennai 600 034
Agenda
➔ Definition of Data Science
➔ Need for Data Science
➔ Big data vs. traditional data
➔ Application of Data Science
➔ Process of Data Science Project
➔ Introduction to predictive and prescriptive models
➔ Skillset to acquire
➔ Learning Resources
➔ Placement Opportunities
➔ Questions and Discussion
Definition of Data Science
➔ Data science is the study of data to extract meaningful insights for business.
➔ It is a multidisciplinary approach that combines domain expertise, programming skills, machine learning algorithms, and knowledge of maths and statistics.
Need for Data Science
➔ Data Abundance
➔ Business Value
➔ Complexity of Data
➔ Advancement of Technology
➔ Decision Making
➔ Personalisation
➔ Social Impact
Big Data
Streaming Data
➔ Volume
➔ Variety of data
➔ Velocity
➔ Veracity
➔ Value
Features involved in Pricing model
➔ Base Fare: A flat fee charged at the beginning of every ride, regardless of distance or time.
➔ Per-Mile Rate: The amount charged for the distance traveled during the trip.
➔ Per-Minute Rate: The amount charged for the duration of the trip, typically based on the time spent in the vehicle.
➔ Booking Fee: A fee charged to cover operational costs, such as insurance and customer support.
➔ Surge Pricing: During times of high demand, Uber may implement surge pricing, which increases fares to encourage more drivers to be available. Surge pricing multipliers can vary depending on the level of demand in the area.
➔ Tolls and Surcharges: Additional fees may be added for tolls, airport pickups, or other special circumstances.
➔ Promotions and Discounts: Uber frequently offers promotions, discounts, and referral bonuses that can affect the final price of a ride.
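The pricing features above can be combined into a single fare estimate. The sketch below is only an illustration: the rate values, the `FareRates` and `estimate_fare` names, and the choice to apply the surge multiplier to the metered part of the fare only are all assumptions, not Uber's actual pricing formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FareRates:
    """Hypothetical rates; real values differ by city and service."""
    base_fare: float = 2.00
    per_mile: float = 1.50
    per_minute: float = 0.25
    booking_fee: float = 1.75


def estimate_fare(miles, minutes, rates=FareRates(), surge=1.0, tolls=0.0, discount=0.0):
    # Metered part: base fare plus distance and time charges.
    metered = rates.base_fare + rates.per_mile * miles + rates.per_minute * minutes
    # Assumption: surge scales only the metered part, not fees or tolls.
    total = metered * surge + rates.booking_fee + tolls - discount
    # A discount should never push the fare below zero.
    return round(max(total, 0.0), 2)


quiet = estimate_fare(miles=5, minutes=12)
busy = estimate_fare(miles=5, minutes=12, surge=1.8, tolls=3.0, discount=2.0)
print(quiet, busy)
```

The same trip costs more in the second call because the surge multiplier and tolls outweigh the discount.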
Features involved in product recommendation
➔ User Preferences and History
➔ Item Attributes
➔ Implicit Feedback
➔ Explicit Feedback
➔ Collaborative Filtering
➔ Content-Based Filtering
➔ Contextual Information
➔ Seasonality and Trends
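As one way to picture collaborative filtering on explicit feedback, the sketch below scores items a user has not seen by the ratings of similar users. The user names, items, ratings, and the `recommend` helper are hypothetical, and cosine similarity is just one common choice of similarity measure.

```python
from math import sqrt

# Hypothetical explicit feedback: user -> {item: rating on a 1-5 scale}.
ratings = {
    "asha": {"laptop": 5, "mouse": 4, "keyboard": 4},
    "ben": {"laptop": 4, "mouse": 5},
    "chitra": {"novel": 5, "bookmark": 4},
}


def cosine(a, b):
    """Cosine similarity between two sparse rating vectors."""
    shared = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in shared)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(user, k=3):
    """User-based collaborative filtering over the ratings table."""
    seen = ratings[user]
    scores = {}
    for other, their in ratings.items():
        if other == user:
            continue
        sim = cosine(seen, their)
        # Unseen items inherit a score weighted by how similar the other user is.
        for item, rating in their.items():
            if item not in seen:
                scores[item] = scores.get(item, 0.0) + sim * rating
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, score in ranked[:k] if score > 0]


print(recommend("ben"))
```

Here "ben" shares tastes with "asha", so only her unseen item is suggested; a user with no overlap with anyone gets no recommendations, which is the cold-start limitation of this approach.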
Features involved in spam detection
Sources:
➔ Sender Reputation
➔ Keyword/content
➔ Whitelist/Blacklist
➔ HTML code
➔ Attachments
➔ Header analysis
➔ Images
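Several of the spam-detection sources listed above can be turned into simple numeric or boolean features before any model is trained. This is a minimal sketch: the address lists, keyword set, risky file extensions, and the `extract_features` function are made-up examples, not a production filter.

```python
import re

# Hypothetical lists, for illustration only.
BLACKLIST = {"promo@cheap-deals.example"}
WHITELIST = {"hod@college.example"}
SPAM_KEYWORDS = {"free", "winner", "lottery", "urgent"}
RISKY_EXTENSIONS = (".exe", ".js", ".scr")


def extract_features(sender, subject, body, attachments=()):
    """Map one email to a small dictionary of spam-related features."""
    words = re.findall(r"[a-z]+", f"{subject} {body}".lower())
    return {
        "blacklisted": sender.lower() in BLACKLIST,
        "whitelisted": sender.lower() in WHITELIST,
        "keyword_hits": sum(w in SPAM_KEYWORDS for w in words),
        "has_html": bool(re.search(r"<\s*(html|a|img|table)\b", body, re.I)),
        "image_count": len(re.findall(r"<\s*img\b", body, re.I)),
        "risky_attachment": any(
            name.lower().endswith(RISKY_EXTENSIONS) for name in attachments
        ),
    }


features = extract_features(
    sender="promo@cheap-deals.example",
    subject="You are a WINNER",
    body='<a href="#">Claim your free lottery prize</a><img src="x.png">',
    attachments=["prize.exe"],
)
print(features)
```

A classifier would then be trained on such feature dictionaries together with spam/not-spam labels.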
Credit card / Loan sanction analysis
Data science in defence
➔ Predictive Maintenance
➔ Mission Planning and Optimization
➔ Target Identification and Tracking
➔ Health Monitoring and Medical Research
➔ Cybersecurity and Information Assurance
and many more
Data science in Rocket launching
➔ Risk Assessment and Safety Analysis
➔ Real-Time Monitoring and Control
➔ Weather Forecasting and Environmental Conditions
➔ Launch Site Selection and Infrastructure Planning
and many more
Types of Data Analysis
➔ Predictive
➔ Prescriptive
➔ Diagnostic
➔ Descriptive
Obtain Data
Sources:
➔ Open source
➔ Real-time data
➔ Video
➔ Secondary/Primary data
➔ Text data
➔ Real-world data
➔ Images
Scrub Data
➔ Handle missing values
➔ Handle outliers
➔ Drop unwanted columns
➔ Data transformation
➔ Handle duplicate data
➔ Data discretization
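The scrubbing steps above can be sketched on a tiny in-memory table. Everything here is an assumption made for illustration: the records, the column names, median imputation, the median-absolute-deviation rule for outliers, min-max scaling as the transformation, and the age cut-off used for discretization. In practice a library such as pandas is typically used for the same steps.

```python
from statistics import median

# Hypothetical raw records; None marks a missing value.
raw = [
    {"id": 1, "age": 21, "income": 30000, "note": "a"},
    {"id": 2, "age": None, "income": 32000, "note": "b"},
    {"id": 2, "age": None, "income": 32000, "note": "b"},  # duplicate row
    {"id": 3, "age": 23, "income": 31000, "note": "c"},
    {"id": 4, "age": 22, "income": 900000, "note": "d"},   # income outlier
    {"id": 5, "age": 25, "income": 29000, "note": "e"},
    {"id": 6, "age": 24, "income": 33000, "note": "f"},
]


def scrub(rows, drop=("note",)):
    # Drop unwanted columns (this also copies the rows, so raw is untouched).
    rows = [{k: v for k, v in r.items() if k not in drop} for r in rows]

    # Handle duplicate data: keep the first occurrence of each row.
    seen, unique = set(), []
    for r in rows:
        key = tuple(r.items())
        if key not in seen:
            seen.add(key)
            unique.append(r)

    # Handle missing values: impute missing ages with the median age.
    age_median = median(r["age"] for r in unique if r["age"] is not None)
    for r in unique:
        if r["age"] is None:
            r["age"] = age_median

    # Handle outliers: replace incomes far from the median (MAD rule).
    incomes = [r["income"] for r in unique]
    center = median(incomes)
    mad = median(abs(x - center) for x in incomes)
    limit = 3 * 1.4826 * mad
    for r in unique:
        if abs(r["income"] - center) > limit:
            r["income"] = center

    # Data transformation (min-max scaling) and discretization (age groups).
    lo = min(r["income"] for r in unique)
    hi = max(r["income"] for r in unique)
    for r in unique:
        r["income_scaled"] = (r["income"] - lo) / (hi - lo) if hi > lo else 0.0
        r["age_group"] = "under_23" if r["age"] < 23 else "23_plus"
    return unique


clean = scrub(raw)
for row in clean:
    print(row)
```

The order of steps matters: duplicates are removed before computing medians so that a repeated row does not bias the imputed values.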
Explore Data
➔ Create histogram
➔ Create scatterplot
➔ Create boxplot
➔ Generate descriptive statistics
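A few of these exploration steps can be tried without any plotting library. The sketch below uses hypothetical exam scores to compute descriptive statistics, the quartiles that a boxplot is drawn from, and a text histogram with bins of width 10; drawing real charts would need a plotting library such as matplotlib.

```python
from collections import Counter
from statistics import mean, median, quantiles, stdev

# Hypothetical exam scores.
scores = [52, 58, 61, 64, 67, 67, 70, 73, 75, 78, 81, 95]

# Descriptive statistics.
summary = {
    "count": len(scores),
    "mean": round(mean(scores), 2),
    "median": median(scores),
    "stdev": round(stdev(scores), 2),
    "min": min(scores),
    "max": max(scores),
}
print(summary)

# Quartiles: the box in a boxplot spans q1 to q3, with a line at q2.
q1, q2, q3 = quantiles(scores, n=4)
print("quartiles:", q1, q2, q3)

# Text histogram: count how many scores fall in each bin of width 10.
bins = Counter((s // 10) * 10 for s in scores)
for start in sorted(bins):
    print(f"{start}-{start + 9}: {'#' * bins[start]}")
```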
Model Building
In simple terms, a model is an equation that helps in making decisions, whether the analysis is predictive, prescriptive, descriptive, or diagnostic.
Example:
Training and Test data set
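To make the idea of a model as an equation concrete, the sketch below splits a small data set into training and test parts and fits a straight line, score = slope × hours + intercept, on the training part only. The data, the 80/20 split ratio, the random seed, and the `fit_line` helper are assumptions chosen for illustration.

```python
import random

# Hypothetical data: hours studied -> exam score, roughly score = 5 * hours + 40.
hours = list(range(1, 11))
noise = [1, -1, 0.5, -0.5, 0, 1, -1, 0.5, -0.5, 0]
scores = [5 * h + 40 + n for h, n in zip(hours, noise)]
data = list(zip(hours, scores))

# Shuffle with a fixed seed, then keep 80% for training and 20% for testing.
rng = random.Random(42)
rng.shuffle(data)
split = int(0.8 * len(data))
train, test = data[:split], data[split:]


def fit_line(points):
    """Ordinary least squares for one input variable."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# The model is learned from the training set only.
slope, intercept = fit_line(train)

# The held-out test set shows how the equation behaves on unseen data.
for x, actual in test:
    predicted = slope * x + intercept
    print(f"hours={x} actual={actual} predicted={predicted:.1f}")
```

Keeping the test rows out of the fitting step is what makes the comparison between actual and predicted values a fair check of the model.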
Evaluation metrics
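The slides do not list specific metrics, so the sketch below assumes a binary classification task and computes four commonly used ones from true and predicted labels. The label lists and the `classification_metrics` function are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 score for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 1 = spam, 0 = not spam.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Precision answers "of the items flagged positive, how many were right?", while recall answers "of the truly positive items, how many were found?".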
Types of Machine Learning Algorithms
Skill set to acquire
➔ Statistics: descriptive and inferential statistics
➔ Mathematics: eigenvalues, eigenvectors, projection (linear algebra)
➔ Programming: Python, Spark
➔ Databases: DBMS, SQL, NoSQL
➔ Visualisation tools: Power BI/Tableau
➔ Cloud: Azure/GCP
➔ Web scraping
Explore data
➔ Kaggle
➔ UCI Repository
➔ Data.gov
➔ Twitter API
➔ MIMIC-III
➔ The World Bank Open Data
Learning resources
➔ Udemy
➔ Coursera
➔ NPTEL
➔ LinkedIn
➔ Medium.com
➔ YouTube videos by Krish Naik
Build Your Profile
Blocks:
➔ Aptitude skills
➔ Mini project
➔ Certifications
➔ LinkedIn profile
➔ Hackathon ranks
➔ USP
Thanks!
