Data Analytics
Outline
• Data Analytics Definition
• Steps of Data Analytics
• Types of Data Analytics
• Subsets of Data Analytics
• Applications of Data Analytics
• Concluding Remarks
What is Data Analytics?
• Analytics is the use of:
– Data
– Information technology
– Statistical analysis
– Quantitative methods
– Mathematical or computer-based models
• To help managers:
– Gain improved insight about their business operations
– Make better, fact-based decisions.
Ronald Coase - Economist & author, winner of the Nobel Memorial Prize in Economic Sciences.
Data Analytics Capabilities
Steps of Data Analytics
1. Goal setting
• Vital, understandable, simple, short, and measurable goals
2. Setting priorities for measurements
• Decide what to measure and which methods to use to measure it
3. Data gathering
• Available datasets, recording/generating data
4. Data cleansing
• Outlier rejection, missing values interpolation, data structuring
5. Data analysis
• Data mining, business intelligence, data visualization, exploratory data analysis
6. Precise interpretation of results
• Checking whether the results help meet the initial objectives, or are limiting or inconclusive
1. Goal Setting
• The business unit has to decide on objectives for the
data analytics.
• These objectives might be set out in question format
• For example, if a business is struggling to sell its
products, some relevant questions may be:
– Are we overpricing our goods?
– How is the competition’s product different to ours?
• To answer the question, “Are we overpricing our goods?” the company has to gather data on:
– Production costs
– Details about the price of similar goods on the market.
2. Setting Priorities for Measurements
• Determine what type of data is needed to answer the questions regarding the objectives.
• Decide how much time to allow for the analysis.
• Choose the units of measurement to be used.
3. Data Gathering
• Data can come from already available datasets
• Data can be generated by:
– The direct or interview method
• A company would interview “shoppers” regarding their favorite brand of toothpaste.
– The indirect or questionnaire method
• Questionnaires are distributed to the respondents either by personal delivery or by mail/email.
– The registration method
• Registration records kept by government organizations, e.g., NADRA.
– The experimental method
• Experimentation, simulation.
4. Data Cleansing
• The data cleansing process identifies parts of the data that are:
– Incomplete
– Incorrect
– Inaccurate
– Irrelevant
• The dirty or coarse data is then:
– Replaced
– Modified
– Or deleted.
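A minimal Python sketch of the replace / modify / delete options listed above. The record layout, the field names (`age`, `city`), the plausible age range, and the use of the median as the replacement value are illustrative assumptions, not something the slides prescribe.

```python
from statistics import median

def cleanse(records, lo=0, hi=120):
    """Return cleaned copies of `records` (a list of dicts; hypothetical layout)."""
    # Delete: drop records whose age is present but outside the plausible range.
    kept = [r for r in records if r["age"] is None or lo <= r["age"] <= hi]
    # Replace: fill missing ages with the median of the observed ages.
    fill = median(r["age"] for r in kept if r["age"] is not None)
    cleaned = []
    for r in kept:
        age = fill if r["age"] is None else r["age"]
        # Modify: normalise inconsistent text formatting.
        cleaned.append({"age": age, "city": r["city"].strip().title()})
    return cleaned

raw = [
    {"age": 34, "city": " lahore "},
    {"age": None, "city": "KARACHI"},  # incomplete: age is missing
    {"age": 420, "city": "Multan"},    # incorrect: implausible age
    {"age": 28, "city": "multan"},
]
clean = cleanse(raw)
```

Which of the three actions is appropriate depends on the data and on the question being asked; the sketch only shows that each action is a small, explicit rule.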
Data Cleansing Cycle
5. Data Analysis
• Data analysis is the process of:
– Evaluating data using:
• Analytical reasoning
• Logical reasoning
• To examine each component of the data provided.
Steps of Data Analysis
I Preprocessing
• Data cleaning
– Fill in missing values, smooth noisy data, identify or remove outliers,
and resolve inconsistencies
• Data integration
– Integration of multiple databases, data cubes, or files
• Data transformation
– Normalization/ scaling and aggregation
• Data reduction
– Obtains a representation that is reduced in volume but produces the same or similar analytical results
Data Normalization
• Min-max normalization
$$v' = \frac{v - \min_A}{\max_A - \min_A}\,(\mathit{new\_max}_A - \mathit{new\_min}_A) + \mathit{new\_min}_A$$
• Z-score normalization
$$v' = \frac{v - \mathit{mean}_A}{\mathit{stand\_dev}_A}$$
• Normalization by decimal scaling
$$v' = \frac{v}{10^{j}}$$
Where j is the smallest integer such that Max(|v'|) < 1
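The three normalizations can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function names and sample values are made up, the z-score uses the population standard deviation, and the input is assumed to contain at least two distinct values (otherwise min-max and z-score divide by zero).

```python
from statistics import mean, pstdev

def min_max(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly from [min, max] to [new_min, new_max]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]

def z_score(values):
    """Centre on the mean and divide by the (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def decimal_scaling(values):
    """Divide by 10**j, with j the smallest integer giving max(|v'|) < 1."""
    j = 0
    while max(abs(v) for v in values) / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]

incomes = [12000, 73600, 98000]   # made-up attribute values
scaled = min_max(incomes)
standardised = z_score(incomes)
shifted = decimal_scaling([-986, 917])
```

Min-max keeps the shape of the distribution but is sensitive to outliers, because a single extreme value sets the range; the z-score is usually the safer default when the minimum and maximum are not known in advance.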
II Feature Engineering (FE)
• “Feature engineering is the process of transforming
raw data into features that better represent the
underlying problem to the predictive models,
resulting in improved accuracy on unseen data.”
Jason Brownlee, Machine Learning Mastery.
• As the models are getting better and better, the
focus shifts to what is put into them.
• Transforming data to create model’s inputs.
Feature Extraction
• Dimension reduction
– Principal component analysis (PCA)
– Non-negative matrix factorization (NMF)
– Kernel PCA
– Graph-based kernel PCA
– Generalized discriminant analysis (GDA)
• Data smoothing
– Wavelet transform
– Ramer–Douglas–Peucker algorithm
– Kernel smoother
– Laplacian smoothing
– Local regression, …
Feature Selection
• Identifying features that are redundant or
irrelevant
• Improved model interpretability.
Feature Selection Approaches
• Wrapper – Search through the space of feature subsets: train a model for the current subset, evaluate it on held-out data, and iterate. This is computationally expensive. Simple greedy search heuristics:
– Forward selection - start with an empty set, gradually add the “strongest” features
• Random hill-climbing algorithm
– Backward selection - start with the full set, gradually remove the “weakest” features
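Greedy forward selection can be sketched as follows. The `score` callable stands in for "train a model on this subset and evaluate it on held-out data"; the toy scoring function, the feature names, and the gain values are hypothetical and only make the example self-contained.

```python
def forward_selection(features, score, max_features):
    """Greedy wrapper search: repeatedly add the feature that most improves `score`."""
    selected = []
    best = score(selected)
    while len(selected) < max_features:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        # Score every one-feature extension of the current subset.
        trial_score, trial = max((score(selected + [f]), f) for f in candidates)
        if trial_score <= best:
            break  # no remaining feature improves the score
        selected.append(trial)
        best = trial_score
    return selected, best

# Hypothetical stand-in for model training plus held-out evaluation.
GAIN = {"price": 0.30, "season": 0.20, "colour": 0.00}

def toy_score(subset):
    # Each feature contributes its gain; every feature costs a small penalty.
    return sum(GAIN[f] for f in subset) - 0.01 * len(subset)

chosen, value = forward_selection(list(GAIN), toy_score, max_features=3)
```

Each round calls `score` once per remaining feature, which is why wrapper methods become expensive when every call means training a real model.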
Feature Selection Approaches
• Filter – Use N most promising features according to
ranking resulting from a proxy measure, e.g. from
– Mutual information
– Pearson correlation coefficient
– ANOVA
– Chi-Square
• Embedded methods – Feature selection is a part of
model construction
• LASSO
• Ridge regression
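A filter method can be sketched by ranking features on the absolute Pearson correlation with the target and keeping the N best. The column names and numbers below are made up for illustration, and the sketch assumes no column is constant (a constant column would make the correlation undefined).

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def top_n_features(columns, target, n):
    """Rank features by |Pearson r| with the target and keep the n best."""
    ranked = sorted(columns,
                    key=lambda name: abs(pearson(columns[name], target)),
                    reverse=True)
    return ranked[:n]

columns = {  # made-up data
    "ad_spend": [1, 2, 3, 4, 5],
    "discount": [5, 3, 4, 1, 2],
    "weekday":  [2, 7, 1, 6, 3],
}
sales = [10, 20, 30, 40, 50]
best = top_n_features(columns, sales, n=2)
```

Unlike a wrapper, the filter never trains a model, so it is cheap, but it scores each feature in isolation and can miss features that are only useful in combination.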
Limitations on Feature Engineering
• Adding many correlated predictors can
decrease model performance.
• More variables make models less
interpretable.
• Models have to be generalizable to other data
– Too much feature engineering can lead to
overfitting.
– Close connection between feature engineering
and cross-validation.
III Model Training
• Model construction: Describing a set of
predetermined classes
– Each tuple/sample is assumed to belong to a
predefined class, as determined by the class label
attribute
– The set of tuples used for model construction is the training set.
– The model is represented as classification rules,
decision trees, or mathematical formulae.
• Model usage: For classifying future or unknown
objects.
Supervised vs. Unsupervised Learning
• Supervised learning (classification/ regression)
– Supervision: The training data (observations,
measurements, etc.) are accompanied by labels
indicating the class of the observations.
– New data is classified based on the training set.
• Unsupervised learning (clustering)
– The class labels of the training data are unknown.
– Given a set of measurements, observations, etc., the aim is to establish the existence of classes or clusters in the data.
Models for Analysis
• Approaches
– Classification
– Regression
• Techniques
– Data mining
– Machine learning
– Artificial Intelligence (AI)
Classification
• Each object (e.g. arrays or columns) is associated with a class label (or response) Y ∈ {1, 2, …, K} and a feature vector (vector of predictor variables) of G measurements: X = (X1, …, XG)
• Aim: Predict Y_new from X_new.

Feature   sample1  sample2  sample3  sample4  sample5  ...  New sample
1          0.46     0.30     0.80     1.51     0.90    ...   0.34
2         -0.10     0.49     0.24     0.06     0.46    ...   0.43
3          0.15     0.74     0.04     0.10     0.20    ...  -0.23
4         -0.45    -1.03    -0.79    -0.56    -0.32    ...  -0.91
5         -0.06     1.06     1.35     1.09    -1.09    ...   1.23
Y         Normal   Normal   Normal   Cancer   Cancer   ...  Unknown (= Y_new)

The sample1 to sample5 columns form X; the new sample's column is X_new.
Classifiers
• A predictor or classifier partitions the space of gene expression
profiles into K disjoint subsets, A1, ..., AK, such that for a sample
with expression profile X = (X1, ..., XG) ∈ Ak the predicted class is k.
• Classifiers are built from a learning set (LS)
L = (X1, Y1), ..., (Xn,Yn)
• Classifier C built from a learning set L:
C( . , L): X → {1, 2, ..., K}
• Predicted class for observation X:
C(X,L) = k if X is in Ak
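One simple way to obtain such a partition is a nearest-centroid rule, sketched here in Python. The choice of nearest-centroid is an assumption made for illustration (the slides do not name a specific classifier at this point), and the encoding 1 = Normal, 2 = Cancer is hypothetical. The numbers are the first two features of samples 1, 2, 4, 5 and of the new sample from the table on the Classification slide.

```python
from math import dist

def fit_centroids(learning_set):
    """learning_set: list of (X, Y) pairs; returns {class k: centroid of class k}."""
    sums, counts = {}, {}
    for x, y in learning_set:
        acc = sums.setdefault(y, [0.0] * len(x))
        for g, value in enumerate(x):
            acc[g] += value
        counts[y] = counts.get(y, 0) + 1
    return {k: [s / counts[k] for s in sums[k]] for k in sums}

def classify(x, learning_set):
    """C(X, L): return the class whose centroid is closest to X."""
    centroids = fit_centroids(learning_set)
    return min(centroids, key=lambda k: dist(x, centroids[k]))

# Learning set L = (X1, Y1), ..., (Xn, Yn); 1 = Normal, 2 = Cancer (assumed coding).
L = [((0.46, -0.10), 1), ((0.30, 0.49), 1),
     ((1.51, 0.06), 2), ((0.90, 0.46), 2)]
label = classify((0.34, 0.43), L)
```

Here the regions A1, ..., AK are implicit: Ak is the set of points that lie closer to the centroid of class k than to any other centroid.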
Classification vs. Prediction
Classification
• Definition: A classification is a division or category in a system which divides things into groups or types
• Model: Predicts categorical class labels (discrete or nominal)
• Methods: linear classifier (LDA), SVM, decision trees, Bayesian classifier, artificial neural network, kernel estimation (k-nearest neighbor)
• Applications: email spam filtering, cancer diagnosis, voice classification (for Siri-type applications), video classification (for uploaded videos on YouTube, etc.)
Prediction
• Definition: Prediction is a statement made about the future, forecasting unknown/future figures
• Model: Models continuous-valued functions, i.e., predicts unknown or missing values
• Methods: linear regression, nonlinear regression, Poisson regression, generalized linear model, log-linear models, regression trees
• Applications: credit approval, target marketing, fault avoidance, medical diagnosis, fraud detection
Regression
• Models the relationship between one or more independent or predictor
variables and a dependent or response variable
• Linear regression: Involves a response variable y and a single predictor variable x: y = w0 + w1 x,
where w0 (y-intercept) and w1 (slope) are regression coefficients
• Method of least squares: estimates the best-fitting straight line
• Multiple linear regression: Involves more than one predictor variable
– Training data is of the form (X1, y1), (X2, y2), …, (X|D|, y|D|)
– Ex. For 2-D data, we may have: y = w0 + w1 x1 + w2 x2
– Solvable by an extension of the least-squares method or with software such as SAS or S-Plus
– Many nonlinear functions can be transformed into the above
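• Least-squares estimates of the regression coefficients:
$$w_1 = \frac{\sum_{i=1}^{|D|} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{|D|} (x_i - \bar{x})^2}, \qquad w_0 = \bar{y} - w_1 \bar{x}$$
where \bar{x} and \bar{y} are the means of the x_i and y_i in the training data D.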
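For a single predictor, the least-squares estimates are w1 = Σ(x_i − x̄)(y_i − ȳ) / Σ(x_i − x̄)² and w0 = ȳ − w1·x̄. A short Python sketch of that calculation follows; the function name and the sample data are illustrative, and the x values are assumed not to be all identical.

```python
def fit_line(xs, ys):
    """Return (w0, w1) of the least-squares line y = w0 + w1 * x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    w1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    w0 = y_bar - w1 * x_bar
    return w0, w1

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]          # generated from y = 1 + 2x, so the fit is exact
w0, w1 = fit_line(xs, ys)
```

With noisy data the same function returns the line that minimises the sum of squared vertical distances to the points.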
Issues regarding Models for Analysis
• Accuracy
– Classifier accuracy and predictor accuracy
• Speed
– Time to construct the model (training time)
– Time to use the model (classification/prediction time)
• Robustness
– Handling noise and missing values
• Scalability
– Efficiency in disk-resident databases
• Interpretability
– Understanding and insight provided by the model
• Other measures, e.g., goodness of rules, such as decision tree size or
compactness of classification rules.
IV Model Optimization
• Tuning the model to reduce error
– Model parameter optimization
• Meta-heuristic approaches
• Particle swarm optimization (PSO)
• Genetic algorithm (GA)
• Artificial bee colony (ABC), …
– Validation
• K-fold cross-validation
• Monte Carlo method
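K-fold cross-validation splits the data into k folds and uses each fold once as the held-out test set. The sketch below only generates the index splits; the function name is hypothetical, the folds are contiguous, and no shuffling is done (in practice the rows are usually shuffled first).

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) for each of k contiguous folds."""
    # The first (n_samples % k) folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(10, 3))
```

A model would be trained on each `train_idx` and scored on the matching `test_idx`; averaging the k scores gives the cross-validated error used to compare parameter settings.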
V Performance Evaluation
• Model verification
• Accuracy measures
– MAPE, MAE, RMSE, MSE, …
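The four accuracy measures named above can be written directly from their usual definitions. The function names and sample numbers are illustrative; MAPE is reported in percent and is undefined when any actual value is zero.

```python
from math import sqrt

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    return sqrt(mse(actual, predicted))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

actual = [100, 200, 400]       # made-up observations
predicted = [110, 180, 400]    # made-up model output
```

MSE and RMSE penalise large errors more heavily than MAE, while MAPE is scale-free and therefore easier to compare across datasets.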
6. Results Interpretation
• The most important step.
• First, check:
– Do the results help you with any objections that may have been raised initially?
– Are any of the results limiting or inconclusive?
– If so, you may have to conduct further research.
– Have any new questions been revealed that weren’t obvious before?
• To be successful, every company needs experts who can interpret the analysis results.
Types of Data Analytics
Decision Models
Model:
• An abstraction or representation of a real system, idea, or object
• Captures the most important features
• Can be a written or verbal description, a visual display, a mathematical formula, or a spreadsheet representation
• A decision model is a model used to
understand, analyze, or facilitate decision
making.
• Types of model input
• Data
• Uncontrollable variables
• Decision variables (controllable).
• Descriptive Decision Models
• Simply tell “what is” and describe
relationships.
• Do not tell managers what to do.
Descriptive Analytics
What has occurred?
Descriptive analytics, such as data visualization, is important in helping users interpret the output from predictive and prescriptive analytics.
• Descriptive analytics, such as reporting/OLAP,
dashboards, and data visualization, have been widely
used for some time.
• They are the core of traditional BI.
• Predictive Decision Models often incorporate
uncertainty to help managers analyze risk.
• Aim to predict what will happen in the future.
• Uncertainty is imperfect knowledge of what
will happen in the future.
• Risk is associated with the consequences of
what actually happens.
Predictive Analytics
What will occur?
• Marketing is the target for many predictive analytics applications.
• Descriptive analytics, such as data visualization, is important in helping
users interpret the output from predictive and prescriptive analytics.
• Algorithms for predictive analytics, such as regression analysis, machine
learning, and artificial neural networks, have also been around for some time.
• Prescriptive analytics are often referred to as advanced analytics.
A Linear Demand Prediction Model
As price increases, demand falls.
A Nonlinear Demand Prediction Model
Assumes price elasticity (constant ratio of % change
in demand to % change in price)
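The transcript names the two demand models without giving their equations. A common way to write them, assumed here, is D = a − bP for the linear model and D = c·P^(−d) for the constant-elasticity model. All coefficient values below are hypothetical.

```python
def linear_demand(price, a=20000.0, b=10.0):
    """Linear model: demand falls by b units for every unit increase in price."""
    return a - b * price

def constant_elasticity_demand(price, c=1_000_000.0, d=1.5):
    """Nonlinear model: a 1% price increase changes demand by about -d percent."""
    return c * price ** (-d)

def elasticity(demand, price, step=1e-4):
    """Numerical ratio of % change in demand to % change in price."""
    raised = price * (1 + step)
    return ((demand(raised) - demand(price)) / demand(price)) / step

e_low = elasticity(constant_elasticity_demand, 50.0)
e_high = elasticity(constant_elasticity_demand, 400.0)
```

Evaluating `elasticity` at two very different prices returns roughly −1.5 both times for the nonlinear model, which is what "constant ratio" means; for the linear model the same ratio changes with price.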
• Prescriptive Decision Models help decision makers identify
the best solution.
• Optimization - finding values of decision variables that
minimize (or maximize) something such as cost (or profit).
• Objective function - the equation that minimizes (or
maximizes) the quantity of interest.
• Constraints - limitations or restrictions.
• Optimal solution - values of the decision variables at the
minimum (or maximum) point.
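These four terms can be made concrete with a small pricing sketch. Everything in it is an assumption for illustration: price is the decision variable, profit under a hypothetical linear demand curve is the objective function, a price range is the constraint, and a plain grid search stands in for a real optimization solver.

```python
def profit(price, unit_cost=40.0, a=20000.0, b=10.0):
    """Objective function: margin per unit times linear demand a - b * price."""
    return (price - unit_cost) * (a - b * price)

def best_price(lo, hi, step=1.0):
    """Grid search for the price in [lo, hi] (the constraint) that maximises profit."""
    n_steps = int((hi - lo) / step)
    candidates = [lo + i * step for i in range(n_steps + 1)]
    return max(candidates, key=profit)

unconstrained = best_price(40.0, 2000.0)   # wide range: the interior optimum is reachable
constrained = best_price(40.0, 900.0)      # constraint: price may not exceed 900
```

With the wide range the optimal solution sits at the peak of the profit curve; once the price cap is added, the best feasible price is the cap itself, showing how a constraint can change the optimal solution.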
Prescriptive Analytics
What should occur?
• For example, the use of mathematical programming for revenue management is
common for organizations that have “perishable” goods (e.g., rental cars, hotel
rooms, airline seats).
• Harrah’s has been using revenue management for hotel room pricing for some
time.
• Prescriptive analytics are often referred to as advanced analytics.
• Regression analysis, machine learning, and neural networks
• Often for the allocation of scarce resources
Data Analytics Cycle
Subsets of Data Analytics
• Business intelligence (BI)
• Big data analytics
Business Intelligence vs. Big Data Analytics
Business Intelligence Applications
• Analysis of clickstream data
• Customer profitability analysis
• Customer segmentation analysis
• Product recommendations
• Campaign management
• Pricing
• Forecasting
• Dashboards
Business Intelligence Applications
• Business intelligence is important because it helps to:
– Predict customer trends and behaviors
– Analyze, interpret and deliver data in meaningful ways
– Increase business productivity
– Drive effective decision-making
• It enables business experts to:
– Understand business direction and objectives
– Explore the meaning behind the numbers and figures in data
– Analyze the causes of certain events based on data findings
– Present technical insights using easy-to-understand language
– Contribute to business decision-making by offering educated opinions
Big Data Analytics Applications
• Information from multiple internal and external sources:
• Transactions
• Social media
• Enterprise content
• Sensors
• Mobile devices
• Companies leverage data to adapt products and services to:
• Meet customer needs
• Optimize operations
• Optimize infrastructure
• Find new sources of revenue
• Can reveal more patterns and anomalies
Applications of Big Data Analytics
Concluding Remarks
• Data analysis helps in getting useful insights
that help in:
– Better decision making
– Long term planning
Data Analytics vs. Statistical Analysis
Statistical Analysis
• Utilizes statistical and/or mathematical techniques
• Used based on theoretical foundation
• Seeks to identify a significance level to address hypotheses or research questions (RQs)
Data Analytics
• Utilizes data mining techniques
• Identifies inexplicable or novel relationships/trends
• Seeks to visualize the data to allow the observation of relationships/trends

More Related Content

Similar to DataAnalyticsIntroduction and its ci.pptx

Data Mining - The Big Picture!
Data Mining - The Big Picture!Data Mining - The Big Picture!
Data Mining - The Big Picture!Khalid Salama
 
Supply Chain Analytics, Supply Chain Management, Supply Chain Data Analytics
Supply Chain Analytics, Supply Chain Management, Supply Chain Data AnalyticsSupply Chain Analytics, Supply Chain Management, Supply Chain Data Analytics
Supply Chain Analytics, Supply Chain Management, Supply Chain Data AnalyticsMujtabaAliKhan12
 
Lect8 Classification & prediction
Lect8 Classification & predictionLect8 Classification & prediction
Lect8 Classification & predictionhktripathy
 
background.pptx
background.pptxbackground.pptx
background.pptxKabileshCm
 
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...The Art of Intelligence – A Practical Introduction Machine Learning for Orac...
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...Lucas Jellema
 
Data mining an introduction
Data mining an introductionData mining an introduction
Data mining an introductionDr-Dipali Meher
 
Additional themes of data mining for Msc CS
Additional themes of data mining for Msc CSAdditional themes of data mining for Msc CS
Additional themes of data mining for Msc CSThanveen
 
Data preprocessing 2
Data preprocessing 2Data preprocessing 2
Data preprocessing 2extraganesh
 
Machinr Learning and artificial_Lect1.pdf
Machinr Learning and artificial_Lect1.pdfMachinr Learning and artificial_Lect1.pdf
Machinr Learning and artificial_Lect1.pdfSaketBansal9
 
Introduction to data mining
Introduction to data miningIntroduction to data mining
Introduction to data miningUjjawal
 
Data preprocessing ppt1
Data preprocessing ppt1Data preprocessing ppt1
Data preprocessing ppt1meenas06
 
finalestkddfinalpresentation-111207021040-phpapp01.pptx
finalestkddfinalpresentation-111207021040-phpapp01.pptxfinalestkddfinalpresentation-111207021040-phpapp01.pptx
finalestkddfinalpresentation-111207021040-phpapp01.pptxshumPanwar
 
introduction to Statistical Theory.pptx
 introduction to Statistical Theory.pptx introduction to Statistical Theory.pptx
introduction to Statistical Theory.pptxDr.Shweta
 

Similar to DataAnalyticsIntroduction and its ci.pptx (20)

Data Mining - The Big Picture!
Data Mining - The Big Picture!Data Mining - The Big Picture!
Data Mining - The Big Picture!
 
Supply Chain Analytics, Supply Chain Management, Supply Chain Data Analytics
Supply Chain Analytics, Supply Chain Management, Supply Chain Data AnalyticsSupply Chain Analytics, Supply Chain Management, Supply Chain Data Analytics
Supply Chain Analytics, Supply Chain Management, Supply Chain Data Analytics
 
Lect8 Classification & prediction
Lect8 Classification & predictionLect8 Classification & prediction
Lect8 Classification & prediction
 
Machine learning
Machine learning Machine learning
Machine learning
 
background.pptx
background.pptxbackground.pptx
background.pptx
 
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...The Art of Intelligence – A Practical Introduction Machine Learning for Orac...
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...
 
Data mining an introduction
Data mining an introductionData mining an introduction
Data mining an introduction
 
Predictive Analysis
Predictive AnalysisPredictive Analysis
Predictive Analysis
 
Additional themes of data mining for Msc CS
Additional themes of data mining for Msc CSAdditional themes of data mining for Msc CS
Additional themes of data mining for Msc CS
 
Unit-1.ppt
Unit-1.pptUnit-1.ppt
Unit-1.ppt
 
Data preprocessing 2
Data preprocessing 2Data preprocessing 2
Data preprocessing 2
 
Machinr Learning and artificial_Lect1.pdf
Machinr Learning and artificial_Lect1.pdfMachinr Learning and artificial_Lect1.pdf
Machinr Learning and artificial_Lect1.pdf
 
Introduction to data mining
Introduction to data miningIntroduction to data mining
Introduction to data mining
 
Data preprocessing ppt1
Data preprocessing ppt1Data preprocessing ppt1
Data preprocessing ppt1
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data Mining
 
finalestkddfinalpresentation-111207021040-phpapp01.pptx
finalestkddfinalpresentation-111207021040-phpapp01.pptxfinalestkddfinalpresentation-111207021040-phpapp01.pptx
finalestkddfinalpresentation-111207021040-phpapp01.pptx
 
Chapter 1: Introduction to Data Mining
Chapter 1: Introduction to Data MiningChapter 1: Introduction to Data Mining
Chapter 1: Introduction to Data Mining
 
introduction to Statistical Theory.pptx
 introduction to Statistical Theory.pptx introduction to Statistical Theory.pptx
introduction to Statistical Theory.pptx
 
Data Analysis, Intepretation
Data Analysis, IntepretationData Analysis, Intepretation
Data Analysis, Intepretation
 
Data analysis
Data analysisData analysis
Data analysis
 

More from PrincePatel272012

More from PrincePatel272012 (7)

Iot-Internet-of-Things-how its work.pptx
Iot-Internet-of-Things-how its work.pptxIot-Internet-of-Things-how its work.pptx
Iot-Internet-of-Things-how its work.pptx
 
digital forensic in pharmacy science department
digital forensic in pharmacy science departmentdigital forensic in pharmacy science department
digital forensic in pharmacy science department
 
Pr-2_DM.pptx
Pr-2_DM.pptxPr-2_DM.pptx
Pr-2_DM.pptx
 
Pr-1_DM.pptx
Pr-1_DM.pptxPr-1_DM.pptx
Pr-1_DM.pptx
 
Pr-4_Vitualization.pptx
Pr-4_Vitualization.pptxPr-4_Vitualization.pptx
Pr-4_Vitualization.pptx
 
ns.project.pptx
ns.project.pptxns.project.pptx
ns.project.pptx
 
Windows Azure(Pr-1).ppt.pptx
Windows Azure(Pr-1).ppt.pptxWindows Azure(Pr-1).ppt.pptx
Windows Azure(Pr-1).ppt.pptx
 

Recently uploaded

WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)
WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)
WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)Delhi Call girls
 
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call GirlVIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girladitipandeya
 
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting High Prof...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting  High Prof...VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting  High Prof...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting High Prof...singhpriety023
 
On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024APNIC
 
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$kojalkojal131
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl ServiceRussian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl Servicegwenoracqe6
 
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service AvailableCall Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service AvailableSeo
 
Top Rated Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...
Top Rated  Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...Top Rated  Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...
Top Rated Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...Call Girls in Nagpur High Profile
 
Networking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOGNetworking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOGAPNIC
 
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...Neha Pandey
 
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersMoving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersDamian Radcliffe
 
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.soniya singh
 
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...Delhi Call girls
 
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024APNIC
 
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.soniya singh
 
INDIVIDUAL ASSIGNMENT #3 CBG, PRESENTATION.
INDIVIDUAL ASSIGNMENT #3 CBG, PRESENTATION.INDIVIDUAL ASSIGNMENT #3 CBG, PRESENTATION.
INDIVIDUAL ASSIGNMENT #3 CBG, PRESENTATION.CarlotaBedoya1
 

Recently uploaded (20)

WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)
WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)
WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)
 
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call GirlVIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
 
VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
 
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
 
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting High Prof...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting  High Prof...VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting  High Prof...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting High Prof...
 
On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024
 
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
 
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
 
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl ServiceRussian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
 
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service AvailableCall Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
 
Top Rated Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...
Top Rated  Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...Top Rated  Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...
Top Rated Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...
 

DataAnalyticsIntroduction and its ci.pptx

  • 2. Outline • Data Analytics Definition • Steps of Data Analytics • Types of Data Analytics • Subsets of Data Analytics • Applications of Data Analytics • Concluding Remarks
  • 3. What is Data Analytics? • Analytics is the use of: – Data – Information technology – Statistical analysis – Quantitative methods – Mathematical or computer-based models • To help managers: – Gain improved insight about their business operations – Make better, fact-based decisions.
  • 5. Ronald Coase - Economist & author, winner of the Nobel Memorial Prize in Economic Sciences.
  • 7. Steps of Data Analytics
  • 8. (1) Goal setting: vital, understandable, simple, short, and measurable goals • (2) Setting priorities for measurement: decide what to measure and which methods to use to measure it • (3) Data gathering: available datasets, recording/generating data • (4) Data cleansing: outlier rejection, missing-value interpolation, data structuring • (5) Data analysis: data mining, business intelligence, data visualization, exploratory data analysis • (6) Precise interpretation of results: checking whether they help meet the initial objectives, or are limiting or inconclusive
  • 9. 1. Goal Setting • The business unit has to decide on objectives for the data analytics. • These objectives might be set out in question format. • For example, if a business is struggling to sell its products, some relevant questions may be: – Are we overpricing our goods? – How is the competition’s product different from ours? • To answer the question “Are we overpricing our goods?”, the business has to gather data on: – Production costs – The prices of similar goods on the market.
  • 10. 2. Setting Priorities for Measurements • Determining what type of data is needed to answer the questions regarding the objectives. • How much time to allow for the analysis. • The units of measurement to be used.
  • 11. 3. Data Gathering • Data can come from already available datasets • Data can be generated by: – The direct or interview method • A company would interview shoppers about their favorite brand of toothpaste. – The indirect or questionnaire method • Questionnaires are distributed to respondents either by personal delivery or by mail/email. – The registration method • Registration records kept by government organizations, e.g., NADRA. – The experimental method • Experimentation, simulation.
  • 12. 4. Data Cleansing • The data cleansing process identifies parts of the data that are: – Incomplete – Incorrect – Inaccurate – Irrelevant • The dirty or coarse data is then: – Replaced – Modified – Or deleted.
  • 14. 5. Data Analysis • Data analysis is the process of evaluating data using: – Analytical reasoning – Logical reasoning • To examine each component of the data provided.
  • 17. I Preprocessing • Data cleaning – Fill in missing values, smooth noisy data, identify or remove outliers, and resolve inconsistencies • Data integration – Integration of multiple databases, data cubes, or files • Data transformation – Normalization/scaling and aggregation • Data reduction – Obtains a representation that is reduced in volume but produces the same or similar analytical results
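The data cleaning step can be illustrated with a minimal Python sketch. It assumes missing values are encoded as `None`, fills them with the median of the observed values, and drops outliers with the 1.5 × IQR rule. Both choices, the function names, and the sample values are illustrative assumptions; the slides do not prescribe a particular method.

```python
import statistics

def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def remove_outliers_iqr(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Hypothetical measurements: one missing value and one suspicious reading.
raw = [10.0, 12.0, None, 11.0, 13.0, 95.0, 12.0]
filled = fill_missing(raw)
cleaned = remove_outliers_iqr(filled)
print(filled)
print(cleaned)
```

Whether an outlier should be deleted, replaced, or kept is a judgment call that depends on the data; the rule above is only one common heuristic.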
  • 18. Data Normalization • Min-max normalization: $v' = \frac{v - \mathrm{min}_A}{\mathrm{max}_A - \mathrm{min}_A}\,(\mathrm{new\_max}_A - \mathrm{new\_min}_A) + \mathrm{new\_min}_A$ • Z-score normalization: $v' = \frac{v - \mathrm{mean}_A}{\mathrm{stand\_dev}_A}$ • Normalization by decimal scaling: $v' = \frac{v}{10^{j}}$, where $j$ is the smallest integer such that $\max(|v'|) < 1$
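A short Python sketch of the three normalizations named on this slide. The function names and sample values are assumptions made for illustration. The z-score version uses the population standard deviation (`statistics.pstdev`), and `decimal_scaling` restricts `j` to non-negative integers; neither detail is specified by the slide. None of the functions guards against a constant attribute (zero range or zero standard deviation).

```python
import statistics

def min_max(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly from [min, max] to [new_min, new_max]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]

def z_score(values):
    """Centre on the mean and divide by the (population) standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def decimal_scaling(values):
    """Divide by 10**j, with j the smallest non-negative integer giving max(|v'|) < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]

# Hypothetical attribute values.
incomes = [12000, 73600, 98000]
print(min_max(incomes))
print(z_score(incomes))
print(decimal_scaling([-986, 917]))
```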
  • 19. II Feature Engineering (FE) • “Feature engineering is the process of transforming raw data into features that better represent the underlying problem to the predictive models, resulting in improved accuracy on unseen data.” Jason Brownlee, Machine Learning Mastery. • As models get better and better, the focus shifts to what is put into them. • Transforming data to create the model’s inputs.
  • 20. Feature Extraction • Dimension reduction – Principal component analysis (PCA) – Non-negative matrix factorization (NMF) – Kernel PCA – Graph-based kernel PCA – Generalized discriminant analysis (GDA) • Data smoothing – Wavelet transform – Ramer–Douglas–Peucker algorithm – Kernel smoother – Laplacian smoothing – Local regression, …
  • 21. Feature Selection • Identifying features that are redundant or irrelevant • Improved model interpretability.
  • 22. Feature Selection Approaches • Wrapper – Search through the space of feature subsets: train a model for the current subset, evaluate it on held-out data, and iterate. – Computationally expensive. • Simple greedy search heuristics: – Forward selection: start with an empty set and gradually add the “strongest” features – Backward selection: start with the full set and gradually remove the “weakest” features – Random hill-climbing algorithm
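Greedy forward selection can be sketched in a few lines of Python. The `evaluate` callable is a stand-in: in a real wrapper it would train a model on the given feature subset and return its score on held-out data. Here it is replaced by a toy scoring function with made-up gains and a small per-feature penalty, purely so the sketch runs on its own.

```python
def forward_selection(candidates, evaluate, max_features=None):
    """Start from an empty set and repeatedly add the feature that most improves the score."""
    selected = []
    best_score = float("-inf")
    remaining = list(candidates)
    while remaining and (max_features is None or len(selected) < max_features):
        # Score every subset obtained by adding one more candidate feature.
        scored = [(evaluate(selected + [f]), f) for f in remaining]
        score, feature = max(scored)
        if score <= best_score:
            break  # no candidate improves the score, so stop
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score

# Toy stand-in for "train a model and score it on held-out data" (hypothetical numbers).
GAINS = {"price": 0.30, "season": 0.20, "noise": 0.00}

def toy_evaluate(subset):
    return sum(GAINS[f] for f in subset) - 0.05 * len(subset)

selected, score = forward_selection(GAINS, toy_evaluate)
print(selected, score)
```

Each round calls `evaluate` once per remaining feature, which is why wrapper methods become expensive when every call means training a model.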
  • 23. Feature Selection Approaches • Filter – Use the N most promising features according to a ranking from a proxy measure, e.g.: – Mutual information – Pearson correlation coefficient – ANOVA – Chi-square • Embedded methods – Feature selection is part of model construction, e.g.: • LASSO • Ridge regression
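As an illustration of the filter approach, the sketch below ranks features by the absolute value of their Pearson correlation with the target and keeps the top N. The feature names and values are invented for the example, and ranking by absolute correlation is one reasonable reading of "most promising", not the only one.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def top_n_features(features, target, n):
    """Return the names of the n features most correlated (in absolute value) with the target."""
    ranked = sorted(features, key=lambda name: abs(pearson(features[name], target)), reverse=True)
    return ranked[:n]

# Hypothetical data: "a" tracks the target, "b" moves against it, "c" is unrelated.
target = [1, 2, 3, 4, 5]
features = {
    "a": [2, 4, 6, 8, 10],
    "b": [5, 3, 4, 1, 2],
    "c": [1, -1, 1, -1, 1],
}
print(top_n_features(features, target, 2))
```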
  • 24. Limitations on Feature Engineering • Adding many correlated predictors can decrease model performance. • More variables make models less interpretable. • Models have to be generalizable to other data – Too much feature engineering can lead to overfitting. – Close connection between feature engineering and cross-validation.
  • 25. III Model Training • Model construction: Describing a set of predetermined classes – Each tuple/sample is assumed to belong to a predefined class, as determined by the class label attribute – The set of tuples used for model construction is the training set. – The model is represented as classification rules, decision trees, or mathematical formulae. • Model usage: For classifying future or unknown objects.
  • 26. Supervised vs. Unsupervised Learning • Supervised learning (classification/regression) – Supervision: The training data (observations, measurements, etc.) are accompanied by labels indicating the class of the observations. – New data is classified based on the training set. • Unsupervised learning (clustering) – The class labels of the training data are unknown. – Given a set of measurements, observations, etc., the aim is to establish the existence of classes or clusters in the data.
  • 27. Models for Analysis • Approaches – Classification – Regression • Techniques – Data mining – Machine learning – Artificial Intelligence (AI)
  • 28. Classification • Each object (e.g. arrays or columns) is associated with a class label (or response) Y ∈ {1, 2, …, K} and a feature vector (vector of predictor variables) of G measurements: X = (X1, …, XG) • Aim: Predict Y_new from X_new.
         sample1  sample2  sample3  sample4  sample5  …    New sample
    1      0.46     0.30     0.80     1.51     0.90   ...     0.34
    2     -0.10     0.49     0.24     0.06     0.46   ...     0.43
    3      0.15     0.74     0.04     0.10     0.20   ...    -0.23
    4     -0.45    -1.03    -0.79    -0.56    -0.32   ...    -0.91
    5     -0.06     1.06     1.35     1.09    -1.09   ...     1.23
    Y    Normal   Normal   Normal   Cancer   Cancer           Unknown (= Y_new)
    Rows 1 to 5 form X; the "New sample" column is X_new.
  • 29. Classifiers • A predictor or classifier partitions the space of gene expression profiles into K disjoint subsets, A1, ..., AK, such that for a sample with expression profile X = (X1, ..., XG) ∈ Ak the predicted class is k. • Classifiers are built from a learning set (LS) L = (X1, Y1), ..., (Xn, Yn) • Classifier C built from a learning set L: C( . , L): X → {1, 2, ..., K} • Predicted class for observation X: C(X, L) = k if X is in Ak
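To make the notation C(X, L) concrete, here is a minimal Python sketch that builds a classifier from the five labelled samples shown in the classification table and applies it to the new sample. The k-nearest-neighbour rule with Euclidean distance and k = 3 is chosen only for illustration; the slides do not say which classifier was used, and the samples elided by "..." in the table are simply left out.

```python
import math
from collections import Counter

# Learning set L: one (expression profile, class label) pair per sample in the table.
learning_set = [
    ((0.46, -0.10, 0.15, -0.45, -0.06), "Normal"),  # sample1
    ((0.30, 0.49, 0.74, -1.03, 1.06), "Normal"),    # sample2
    ((0.80, 0.24, 0.04, -0.79, 1.35), "Normal"),    # sample3
    ((1.51, 0.06, 0.10, -0.56, 1.09), "Cancer"),    # sample4
    ((0.90, 0.46, 0.20, -0.32, -1.09), "Cancer"),   # sample5
]
x_new = (0.34, 0.43, -0.23, -0.91, 1.23)

def knn_classify(x, learning_set, k=3):
    """C(x, L): majority label among the k profiles in L closest to x."""
    neighbours = sorted(learning_set, key=lambda pair: math.dist(x, pair[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

y_new = knn_classify(x_new, learning_set)
print(y_new)
```

With so few samples the prediction is only a demonstration of the mechanics, not a meaningful diagnosis.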
  • 31. Classification vs. Prediction
    Classification
    – Definition: A classification is a division or category in a system which divides things into groups or types
    – Model: Predicts categorical class labels (discrete or nominal)
    – Methods: Linear classifier, LDA, SVM, decision trees, Bayesian classifier, artificial neural network, kernel estimation, k-nearest neighbor
    – Applications: Email spam filtering, cancer diagnosis, voice classification (for Siri-type applications), video classification (for uploaded videos on YouTube, etc.)
    Prediction
    – Definition: Prediction is a statement made about the future, forecasting unknown/future figures
    – Model: Models continuous-valued functions, i.e., predicts unknown or missing values
    – Methods: Linear regression, nonlinear regression, Poisson regression, generalized linear model, log-linear models, regression trees
    – Applications: Credit approval, target marketing, fault avoidance, medical diagnosis, fraud detection
  • 32. Regression • Models the relationship between one or more independent or predictor variables and a dependent or response variable • Linear regression: involves a response variable y and a single predictor variable x: $y = w_0 + w_1 x$, where $w_0$ (y-intercept) and $w_1$ (slope) are regression coefficients • Method of least squares: estimates the best-fitting straight line: $w_1 = \frac{\sum_{i=1}^{|D|} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{|D|} (x_i - \bar{x})^2}$, $w_0 = \bar{y} - w_1 \bar{x}$ • Multiple linear regression: involves more than one predictor variable – Training data is of the form (X1, y1), (X2, y2), …, (X|D|, y|D|) – Ex. For 2-D data, we may have: $y = w_0 + w_1 x_1 + w_2 x_2$ – Solvable by an extension of the least squares method or using SAS, S-Plus – Many nonlinear functions can be transformed into the above
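The least squares estimates for simple linear regression translate directly into code. The sketch below computes w1 as the sum of products of deviations divided by the sum of squared deviations of x, and w0 as the mean of y minus w1 times the mean of x. The function name and the data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Least squares estimates (w0, w1) for y = w0 + w1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: sum of (x_i - x_bar)(y_i - y_bar) over sum of (x_i - x_bar)^2.
    w1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    # Intercept: the fitted line passes through (x_bar, y_bar).
    w0 = y_bar - w1 * x_bar
    return w0, w1

# Hypothetical points lying exactly on y = 1 + 2x.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
w0, w1 = fit_line(xs, ys)
print(w0, w1)
```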
  • 33. Issues regarding Models for Analysis • Accuracy – Classifier accuracy and predictor accuracy • Speed – Time to construct the model (training time) – Time to use the model (classification/prediction time) • Robustness – Handling noise and missing values • Scalability – Efficiency in disk-resident databases • Interpretability – Understanding and insight provided by the model • Other measures, e.g., goodness of rules, such as decision tree size or compactness of classification rules.
  • 34. IV Model Optimization • Tuning the model to reduce error – Model parameter optimization • Meta-heuristic approaches • PSO (particle swarm optimization) • GA (genetic algorithm) • ABC (artificial bee colony), … – Validation • K-fold cross-validation • Monte Carlo method
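K-fold cross-validation can be sketched without any library: split the sample indices into k folds, hold each fold out once, fit on the rest, and average the held-out error. In the sketch below the "model" is a deliberately trivial one that always predicts the training mean, and folds are formed by striding through the indices without shuffling. Both are simplifying assumptions, and the data are invented.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds (requires n_samples >= k)."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx

def cross_validate_mean_model(ys, k):
    """Average held-out MSE of a model that always predicts the training mean."""
    fold_errors = []
    for train_idx, test_idx in k_fold_splits(len(ys), k):
        prediction = sum(ys[j] for j in train_idx) / len(train_idx)
        mse = sum((ys[j] - prediction) ** 2 for j in test_idx) / len(test_idx)
        fold_errors.append(mse)
    return sum(fold_errors) / k

# Hypothetical response values.
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
cv_error = cross_validate_mean_model(ys, k=3)
print(cv_error)
```

In practice the data would usually be shuffled before splitting, and the same loop would wrap whatever model is being tuned.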
  • 35. V Performance Evaluation • Model verification • Accuracy measures – MAPE, MAE, RMSE, MSE, …
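The accuracy measures listed here have standard definitions: mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), and mean absolute percentage error (MAPE). A minimal Python sketch follows; the actual and predicted values are invented, and MAPE as written assumes no actual value is zero.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(mse(actual, predicted))

def mape(actual, predicted):
    """Mean absolute percentage error (undefined if any actual value is zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical observations and model predictions.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
for metric in (mae, mse, rmse, mape):
    print(metric.__name__, metric(actual, predicted))
```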
  • 36. 6. Results Interpretation • The most important step. • First, check: – Does it help you with any objections that may have been raised initially? – Are any of the results limiting or inconclusive? If so, you may have to conduct further research. – Have any new questions been revealed that weren’t obvious before? • To be successful, every company needs experts who can interpret the analysis results.
  • 38. Types of Data Analytics
  • 39. Decision Models • Model: – An abstraction or representation of a real system, idea, or object – Captures the most important features – Can be a written or verbal description, a visual display, a mathematical formula, or a spreadsheet representation
  • 41. Decision Models • A decision model is a model used to understand, analyze, or facilitate decision making. • Types of model input: – Data – Uncontrollable variables – Decision variables (controllable).
  • 42. Decision Models • Descriptive decision models: – Simply tell “what is” and describe relationships. – Do not tell managers what to do.
  • 43. Descriptive Analytics: What has occurred? • Descriptive analytics, such as data visualization, is important in helping users interpret the output from predictive and prescriptive analytics. • Descriptive analytics, such as reporting/OLAP, dashboards, and data visualization, have been widely used for some time. • They are the core of traditional BI.
  • 44. Decision Models • Predictive decision models often incorporate uncertainty to help managers analyze risk. • They aim to predict what will happen in the future. • Uncertainty is imperfect knowledge of what will happen in the future. • Risk is associated with the consequences of what actually happens.
  • 45. Predictive Analytics: What will occur? • Marketing is the target for many predictive analytics applications. • Descriptive analytics, such as data visualization, is important in helping users interpret the output from predictive and prescriptive analytics. • Algorithms for predictive analytics, such as regression analysis, machine learning, and artificial neural networks, have also been around for some time. • Prescriptive analytics are often referred to as advanced analytics.
  • 46. Decision Models • A Linear Demand Prediction Model: as price increases, demand falls.
  • 47. Decision Models • A Nonlinear Demand Prediction Model: assumes price elasticity (a constant ratio of % change in demand to % change in price).
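The two demand models can be contrasted with a small Python sketch. The slides show the models only as charts, so the functional forms used here are assumptions: a linear form D = a - b*P and a power form D = c * P^(-d), which is the usual way to express constant price elasticity. All parameter values are hypothetical.

```python
import math

def linear_demand(price, a=1000.0, b=5.0):
    """Assumed linear form D = a - b*P: demand falls by b units per unit of price."""
    return a - b * price

def power_demand(price, c=50000.0, d=1.5):
    """Assumed constant-elasticity form D = c * P**(-d)."""
    return c * price ** (-d)

def elasticity(demand_fn, p1, p2):
    """Log-log slope between two prices: % change in demand per % change in price."""
    return math.log(demand_fn(p2) / demand_fn(p1)) / math.log(p2 / p1)

# The power model has the same elasticity at every price level; the linear one does not.
print(elasticity(power_demand, 10, 20), elasticity(power_demand, 40, 80))
print(elasticity(linear_demand, 10, 20), elasticity(linear_demand, 40, 80))
```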
  • 48. Decision Models • Prescriptive decision models help decision makers identify the best solution. • Optimization - finding values of decision variables that minimize (or maximize) something such as cost (or profit). • Objective function - the equation for the quantity to be minimized (or maximized). • Constraints - limitations or restrictions. • Optimal solution - values of the decision variables at the minimum (or maximum) point.
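The vocabulary on this slide (decision variable, objective function, constraint, optimal solution) can be shown on a toy pricing problem. In the sketch below the decision variable is an integer price, the objective function is revenue under a hypothetical linear demand curve, and the constraint is a price cap. The exhaustive search over integer prices is chosen for clarity and is not a method the slides name.

```python
def demand(price):
    """Hypothetical linear demand curve."""
    return 1000 - 5 * price

def revenue(price):
    """Objective function: the quantity to maximize."""
    return price * demand(price)

def best_price(max_price):
    """Optimal solution: the feasible integer price (0..max_price) with the highest revenue."""
    feasible = range(0, max_price + 1)  # constraint: price may not exceed max_price
    return max(feasible, key=revenue)

unconstrained = best_price(200)  # cap high enough not to bind
constrained = best_price(80)     # cap of 80 limits the choice
print(unconstrained, revenue(unconstrained))
print(constrained, revenue(constrained))
```

When the cap is below the unconstrained optimum, the best feasible price sits on the constraint, which is typical of constrained optimization problems.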
  • 49. Prescriptive Analytics: What should occur? • For example, the use of mathematical programming for revenue management is common for organizations that have “perishable” goods (e.g., rental cars, hotel rooms, airline seats). • Harrah’s has been using revenue management for hotel room pricing for some time. • Prescriptive analytics are often referred to as advanced analytics. • Regression analysis, machine learning, and neural networks • Often for the allocation of scarce resources
  • 51. Subsets of Data Analytics
  • 52. Subsets of Data Analytics • Business intelligence (BI) • Big data analytics
  • 53. Business Intelligence vs. Big Data Analytics
  • 59. Business Intelligence Applications • Analysis of clickstream data • Customer profitability analysis • Customer segmentation analysis • Product recommendations • Campaign management • Pricing • Forecasting • Dashboards
  • 61. Business Intelligence Applications • Business intelligence is important to: – Predict customer trends and behaviors – Analyze, interpret and deliver data in meaningful ways – Increase business productivity – Drive effective decision-making • It enables business experts to: – Understand business direction and objectives – Explore the meaning behind the numbers and figures in data
  • 62. Business Intelligence Applications • It enables business experts to: – Analyze the causes of certain events based on data findings – Present technical insights using easy-to-understand language – Contribute to business decision-making by offering educated opinions
  • 64. Big Data Analytics Applications • Information from multiple internal and external sources: – Transactions – Social media – Enterprise content – Sensors – Mobile devices • Companies leverage data to adapt products and services to: – Meet customer needs – Optimize operations – Optimize infrastructure – Find new sources of revenue • Can reveal more patterns and anomalies
  • 65. Applications of Big Data Analytics
  • 67. Concluding Remarks • Data analysis yields useful insights that help in: – Better decision making – Long-term planning
  • 69. Data Analytics vs. Statistical Analysis
    Statistical Analysis
    – Utilizes statistical and/or mathematical techniques
    – Used based on a theoretical foundation
    – Seeks to identify a significance level to address hypotheses or research questions (RQs)
    Data Analytics
    – Utilizes data mining techniques
    – Identifies inexplicable or novel relationships/trends
    – Seeks to visualize the data to allow the observation of relationships/trends