1
Presented by:
Daniel & Vini
2
“We are drowning in data, but starving for knowledge!”
- John Naisbitt
3
TODAY'S SHOW
4
You will learn a few data analysis topics
Posing a question
Wrangling your data into a format you can use and fixing
any problems with it
Exploring the data, finding patterns in it, and building
your intuition about it
Drawing conclusions and/or making predictions
Communicating your findings
5
What is Big Data Analytics?
Data analytics is an emerging technique that dives into a
data set without a prior set of hypotheses
Accumulation of raw data captured from various sources
(e.g. discussion boards, emails, exam logs, chat logs in
e-learning systems) can be used to identify fruitful
patterns and relationships
Examining large amounts of data
6
Data drives performance
Big Data analytics drives results:
Increase revenue
Decrease costs
Increase productivity
Why Big Data Analytics??
7
Why Big Data Analytics??
8
Applications of Data analytics
Understanding and targeting Customers
Understanding and optimizing Business Processes
Improving Healthcare and Public Health
Optimizing Machine and Device Performance
Financial Trading
Improving and Optimizing Cities and Countries
Can you think of anything more??
How??
9
Reference Models
CRISP-DM
Agile methodology: ASD-DM
10
11
Cross Industry Standard Process for Data Mining
(CRISP-DM)
The CRISP-DM reference model
17
The BIG Four
Classification
Cluster Analysis
Association Rules
Prediction
18
Data Classification
Some Examples:
Separating customers based on gender
Sorting data based on content type, file type, size, etc.
Classifying data into restricted, public, or private data types
"Among all the customers of Zalando, which are likely to respond to a new offer?"
Will respond / Will not respond
19
Classification Methods
Decision trees (DT)
Build classification or regression models in the form of a tree
structure
20
Classification Methods
Decision Trees to Decision Rules
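A minimal sketch of how a decision tree can be read off as decision rules: each path from the root to a leaf becomes one IF-THEN rule. The tree below, its feature names (`age`, `past_purchases`) and its values are hypothetical and only echo the "will respond / will not respond" example; they are not taken from the slides.

```python
# Hypothetical toy tree. Internal nodes are (feature, {value: subtree});
# leaves are plain class labels.
tree = ("age", {
    "under_30": ("past_purchases", {
        "yes": "will respond",
        "no": "will not respond",
    }),
    "30_plus": "will not respond",
})


def tree_to_rules(node, conditions=()):
    """Return one IF-THEN rule per root-to-leaf path."""
    if isinstance(node, str):  # leaf: emit the conditions collected on the way down
        antecedent = " AND ".join(f"{f} = {v}" for f, v in conditions) or "TRUE"
        return [f"IF {antecedent} THEN {node}"]
    feature, branches = node
    rules = []
    for value, subtree in branches.items():
        rules.extend(tree_to_rules(subtree, conditions + ((feature, value),)))
    return rules


rules = tree_to_rules(tree)
for rule in rules:
    print(rule)
```

The rule set is equivalent to the tree: exactly one rule fires for any customer, because the paths of a tree are mutually exclusive.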
21
Classification Methods
Support Vector Machines (SVM)
Each data item is a point in n-dimensional space (n = number
of features)
Find the hyperplane that differentiates the two classes
22
Classification Methods
Which do you think are the separating
Hyperplanes?
23
Classification Methods
Select the hyperplane which segregates the two classes better
Ans: B
Maximise the distance between the hyperplane and the nearest data point (the margin)
Ans: C
Select the hyperplane which classifies accurately before maximising the margin
Ans: A
Ignores outliers
Introduce a new feature: z = x² + y²
In the original input space the hyperplane looks like a circle
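The last point can be illustrated with a small sketch: two classes that no straight line separates in (x, y) become separable by a single threshold once the extra feature x² + y² is added. The points are hypothetical, and the threshold is simply placed midway between the classes; this is not a full SVM solver, only the feature-mapping idea.

```python
# Hypothetical 2-D points: class 0 sits near the origin, class 1 lies on a ring around it.
inner = [(0.5, 0.0), (-0.3, 0.4), (0.0, -0.6), (0.2, 0.2)]
outer = [(2.0, 0.0), (-1.5, 1.5), (0.0, -2.2), (1.6, -1.4)]


def lift(point):
    """Map (x, y) to the extra feature z = x^2 + y^2."""
    x, y = point
    return x * x + y * y


z_inner = [lift(p) for p in inner]
z_outer = [lift(p) for p in outer]

# Along z the classes are split by one threshold. Placing it midway between the
# closest points of each class maximises the margin on the z axis.
threshold = (max(z_inner) + min(z_outer)) / 2


def classify(point):
    """Return 1 for the outer class, 0 for the inner class."""
    return 1 if lift(point) > threshold else 0


print(threshold, [classify(p) for p in inner + outer])
```

The boundary z = threshold is flat in the lifted space, but written in the original coordinates it is x² + y² = threshold, which is a circle.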
24
Classification Methods
Bayesian Networks
Dotted lines: Potential Links
Blue box: Additional nodes and links between input
and output
• Based on probability theory.
• Can mix expert opinion and data to build models.
• Backwards reasoning: in addition to predicting outputs given inputs, we can use output values to infer inputs.
• Supports missing data during learning and classification.
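Backwards reasoning can be sketched on the smallest possible network, one input node pointing at one output node. The node names (rain, wet grass) and all probabilities below are assumptions made up for illustration. Forward reasoning predicts the output; Bayes' rule then runs the same link in reverse.

```python
# Hypothetical two-node network: Rain -> WetGrass.
p_rain = 0.2                           # prior P(rain)
p_wet_given = {True: 0.9, False: 0.1}  # P(wet | rain) and P(wet | no rain)


def p_wet():
    """Forward reasoning: predict the output by summing over the input."""
    return p_wet_given[True] * p_rain + p_wet_given[False] * (1 - p_rain)


def p_rain_given_wet():
    """Backward reasoning: infer the input from an observed output (Bayes' rule)."""
    return p_wet_given[True] * p_rain / p_wet()


print(f"P(wet) = {p_wet():.2f}")
print(f"P(rain | wet) = {p_rain_given_wet():.3f}")
```

Observing wet grass raises the belief in rain from the prior of 0.2 to roughly 0.69 under these assumed numbers.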
25
Classification Methods
Bayesian Network Example
26
Association Rules
Discovering interesting relations between variables in
large databases
Example Problems
• Which products are frequently bought together by
customers? (Basket Analysis)
● DataTable = Receipts x Products
● Results could be used to change the placement of products in the market
• Which courses tend to be attended together?
● DataTable = Students x Courses
● Results could be used to avoid scheduling conflicts
27
Association Rules
Examples
• Bread, Cheese → Red Wine
Customers that buy bread and cheese also tend to buy red
wine
• Machine Learning → Web Mining, ML Praktikum
Students that take 'Machine Learning' also take 'Web Mining'
and the 'Machine Learning Praktikum'
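How strongly a rule such as Bread, Cheese → Red Wine holds is usually measured with support and confidence. The sketch below computes both on a handful of hypothetical receipts; the data is invented for illustration and is not from the slides.

```python
# Hypothetical receipts (one set of products per receipt).
receipts = [
    {"bread", "cheese", "red wine"},
    {"bread", "cheese", "red wine", "milk"},
    {"bread", "cheese"},
    {"bread", "milk"},
    {"cheese", "red wine"},
]


def support(itemset):
    """Fraction of receipts that contain every item in `itemset`."""
    return sum(1 for r in receipts if itemset <= r) / len(receipts)


def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent) = support(both) / support(antecedent)."""
    return support(antecedent | consequent) / support(antecedent)


rule_support = support({"bread", "cheese", "red wine"})
rule_confidence = confidence({"bread", "cheese"}, {"red wine"})
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

Support says how often the whole basket occurs; confidence says how often the right-hand side follows once the left-hand side is present.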
28
Association Rules
Apriori Principle illustration
If {c,d,e} is frequent then all
subsets of this itemset are
frequent
Support Based pruning illustration
If {a,b} is infrequent then all
supersets of this itemset are
infrequent
29
Association Rules: Apriori example
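A compact sketch of the Apriori idea, assuming a tiny hypothetical transaction table and an absolute minimum support of 2 (neither comes from the slides). Candidates of size k are built from frequent itemsets of size k-1, and any candidate with an infrequent subset is pruned before it is ever counted.

```python
from itertools import combinations

# Hypothetical transactions over items a..e.
transactions = [
    {"a", "c", "d"},
    {"b", "c", "e"},
    {"a", "b", "c", "e"},
    {"b", "e"},
]
MIN_SUPPORT = 2  # assumed threshold: minimum number of transactions


def count(itemset):
    """Number of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def apriori():
    """Return {frequent itemset: support count}, built level by level."""
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items if count(frozenset([i])) >= MIN_SUPPORT]
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = count(itemset)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: drop candidates that have any infrequent (k-1)-subset.
        previous = set(level)
        candidates = {
            c for c in candidates
            if all(frozenset(s) in previous for s in combinations(c, k - 1))
        }
        level = [c for c in candidates if count(c) >= MIN_SUPPORT]
    return frequent


frequent = apriori()
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

With this data {a,b} is infrequent, so the superset {a,b,c} is pruned without scanning the transactions, while {b,c,e} survives because all of its 2-item subsets are frequent.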
30
Cluster analysis
Task of grouping a set of objects in such a way that
objects in the same group (called a cluster) are more
similar (in some sense or another) to each other than to
those in other groups (clusters).
Examples
Biology: What is the taxonomy of the species?
Education: What are student groups that need special
attention?
Business: What are the customer segments?
31
Clustering workflow
32
Cluster analysis
Methodologies
K-Means Clustering
Hierarchical Clustering
And many more!!
33
K-means clustering
k-means clustering aims to partition n observations into k
clusters in which each observation belongs to the cluster
with the nearest mean, serving as a prototype of the
cluster
Unsupervised learning algorithm
Define k centroids, one for each cluster
Take each point in the data set and associate it to the
nearest centroid
Recalculate the centroids
Repeat until the centroids no longer move
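The steps above can be sketched in a few lines of plain Python. The points and the two starting centroids are hypothetical, and the starting centroids are fixed by hand so the run is reproducible; in practice they are usually chosen at random.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged: nothing moved
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical data: two visibly separate groups, k = 2.
points = [(1, 1), (1.5, 2), (1, 0), (8, 8), (9, 9), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (9, 9)])
print(centroids)
print(clusters)
```

The result depends on the starting centroids, which is why k-means is commonly run several times with different initialisations.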
34
Hierarchical clustering
Groups data over a variety of scales by creating a cluster
tree or dendrogram.
Find the similarity or dissimilarity between every pair of
objects in the data set.
Group the objects into a binary, hierarchical cluster
tree.
Determine where to cut the hierarchical tree into
clusters
35
Hierarchical clustering
Dissimilarity measures
Grouped (B,F), less dissimilarity
Grouped (A,E), less dissimilarity
36
Hierarchical clustering
37
Hierarchical clustering
Cutting the Tree
50% similarity = 50% dissimilarity
Take the clusters formed below 0.5 dissimilarity:
(B,F), (A,E,C,G), (D)
This creates 3 clusters, labelled 1, 2, 3
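A minimal sketch of building the tree bottom-up and cutting it at 0.5. The one-dimensional positions for A to G are hypothetical, chosen only so that the cut produces three groups like those listed above; the dissimilarity values behind the slide's dendrogram are not reproduced here. Single linkage (distance between the closest members) is an assumed choice of linkage.

```python
def single_linkage(points, cut):
    """Merge the two closest clusters repeatedly; stop once the smallest
    inter-cluster dissimilarity exceeds `cut`."""
    clusters = [[name] for name in points]
    merges = []  # record of (left, right, dissimilarity), i.e. the tree below the cut

    def dissimilarity(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(abs(points[a] - points[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        d, i, j = min(
            (dissimilarity(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > cut:
            break  # everything left is further apart than the cut height
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters, merges


# Hypothetical 1-D positions for the seven objects.
points = {"A": 1.0, "B": 0.0, "C": 1.5, "D": 3.0, "E": 1.15, "F": 0.1, "G": 1.9}
clusters, merges = single_linkage(points, cut=0.5)
print(sorted(sorted(c) for c in clusters))
```

Each merge is binary, so the recorded merges form the lower part of the dendrogram; raising or lowering `cut` changes how many clusters remain.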
38
Clustering workflow
Which algorithm fits my data?
Which parameters fit my data?
How good is the obtained result?
How to improve result quality?
39
Predictive Analytics
Make predictions about unknown future events based on
past happenings
Why now?
• Growing volumes and types of data, and more interest in using data to produce valuable insights.
• Faster, cheaper computers.
• Easier-to-use software.
• Tougher economic conditions and a need for competitive differentiation.
40
Predictive Analytics
Improve pattern detection and prevent criminal behavior.
Determine customer responses or purchases, and promote cross-sell opportunities.
Forecast inventory, manage resources, and set ticket prices.
Credit scores are used to assess a buyer’s likelihood of default on purchases.
41
Data Visualization
Data visualization is the process of converting raw data
into easily understood pictures of information that
enable fast and effective decisions.
Visualization plays a key role in the efficient
communication of information (especially with large
amounts of information).
Visualization is used as a "check" to verify / falsify
results of automatic data analysis.
42
Why Data Visualization?
Identify areas that need attention or improvement.
Clarify which factors influence customer behavior.
Help you understand which products to place where.
Predict sales volumes.
Data visualization is a quick, easy way to convey concepts in a
universal manner
43
Where does Visualization fit in CRISP-DM?
Visual Reporting
44
Visual Analytics Loop
Visual Analytics will foster the constructive evaluation, correction and rapid
improvement of our processes and models and - ultimately - the improvement of our
knowledge and our decisions
45
Visual Analytics: Human and Machine
46
Visual Analytics vs Information Visualization
Visual analytics is more than just visualization. It can rather be seen as an
integral approach to decision-making, combining visualization, human
factors and data analysis.
47
Various Data Visualization Techniques
48

Data analytics and visualization

  • 2. 2 “We are drowning in data, but starving for knowledge!” -John Naisbett
  • 4. 4 You will learn a few data analysis topics Posing a question Wrangling your data into a format you can use and fixing any problems with it Exploring the data, finding patterns in it, and building your intuition about it Drawing conclusions and/or making predictions Communicating your findings
  • 5. 5 What is Big Data Analytics? Data analytics is an emerging technique that dives into a data set without prior set of hypotheses Accumulation of raw data captured from various sources (i.e. discussion boards, emails, exam logs, chat logs in e- learning systems) can be used to identify fruitful patterns and relationships Examining large amount of data
  • 6. 6 Data Drives Performance Big Data Analytics Drives result Increase Revenue Decrese Costs Increse Productivity Why Big Data Analytics??
  • 7. 7 Why Big Data Analytics??
  • 8. 8 Applications of Data analytics Understanding and targetting Customers Understanding and optimizing Business Processes Improving Healthcare and Public Health Optimizing Machine and Device Performance Financial Trading Improving and Optimizing Cities and Countries Can you think of anything more?? How??
  • 10. 10
  • 11. 11 Cross Industry Standard Process for Data Mining (CRISP-DM) The CRISP-DM reference model
  • 12. 12 Cross Industry Standard Process for Data Mining (CRISP-DM) The CRISP-DM reference model
  • 13. 13 Cross Industry Standard Process for Data Mining (CRISP-DM) The CRISP-DM reference model
  • 14. 14 Cross Industry Standard Process for Data Mining (CRISP-DM) The CRISP-DM reference model
  • 15. 15 Cross Industry Standard Process for Data Mining (CRISP-DM) The CRISP-DM reference model
  • 16. 16 Cross Industry Standard Process for Data Mining (CRISP-DM) The CRISP-DM reference model
  • 17. 17 The BIG Four Classification Cluster Analysis Association Rules Prediction
  • 18. 18 Data Classification Some Examples: Separating Customer based on gender Data sorting based on content type/file type,size etc Classifying data into restricted, pubic or private data types "Among all the customers of Zalando, which are likely to respond to a new offer?" Will respond Will not respond
  • 19. 19 Decision trees (DT) Build classification or regression models in the form of Tree structure Classification Methods
  • 21. 21 Classification Methods Support Vector Machines(SVM) Each data item is a point in n-dimensional space(n number of features) Find the hyperplane that differentiate the two classes
  • 22. 22 Classification Methods Which do you think are the separating Hyperplanes?
  • 23. 23 Classification Methods Select the hyperplane which segragates two classes better Ans: B Maximising the distance between nearest data point (Margin) Ans: C Select hyper-plane which classifies accurately prior to maximising margin Ans: A Ignores outliers Introduce: Z=x²+y² In original input space hyperplane looks like a circle
  • 24. 24 Classification Methods Bayesian Networks Dotted lines: Potential Links Blue box: Additional nodes and links between input and output  Based on probability theory.  Can mix expert opinion and data to build models  Backwards reasoning - in addition to predicting outputs given inputs, we can use output values to infer inputs.  Support for missing data during learning and classification
  • 26. 26 Association Rules Discovering interesting realtions between variables in large DB Example Problems  Which products are frequently bought together by customers? (Basket Analysis) ● DataTable = Receipts x Products ● Results could be used to change the placements of products in the market  Which courses tend to be attended together? ● DataTable = Students x Courses ● Results could be used to avoid scheduling conflicts....
  • 27. 27 Association Rules Examples  Bread, Cheese → Red Wine. Customers that buy bread and cheese, also tend to buy red wine  Machine Learning → Web Mining, ML Praktikum Students that take 'Machine Learning' also take 'Web Mining' and the 'Machine Learning Praktikum'
  • 28. 28 Apriori Principle illustration If {c,d,e} is frequent then all subssets of this itemset are frequent Support Based pruning illustration If {a,b} is infrequent then all supersets of this itemset are infrequent Association Rules
  • 30. 30 Cluster analysis Task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). Examples Biology: What is the taxonomy of the species? Education: What are student groups that need special attention? Business: What are the customer segments?
  • 33. 33 K-means clustering k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster Unsupervised learning algorithm Define k centroids, one for each cluster Take each point in the data set and associate it to the nearest centroid Recalculate the centroids Repeat until the centroid doesnt move
  • 34. 34 Hierarchical clustering Groups data over a variety of scales by creating a cluster tree or dendrogram. Find the similarity or dissimilarity between every pair of objects in the data set. Group the objects into a binary, hierarchical cluster tree. Determine where to cut the hierarchical tree into clusters
  • 35. 35 Hierarchical clustering Dissimilarity measures Grouped (B,F), less dissimilarity Grouped (A,E), less dissimilarity
  • 37. 37 Hierarchical clustering Cutting the Tree 50% similarity=50% dissimilarity Take cluster samples below 0.5 dissimilarity (B,F),(A,E,C,G),(D) Creating 3 cluster labelled 1,2,3
  • 38. 38 Clustering workflow Which algorithm fits my data? Which parameters fit my data? How good is the obtained result? How to improve result quality?
  • 39. 39 Predictive Analytics Make predictions about unknown future events based on past happenings Why now?  Growing volumes and types of data, and more interest in using data to produce valuable insights.  Faster, cheaper computers.  Easier-to-use software.  Tougher economic conditions and a need for competitive differentiation.
  • 40. 40 Predictive Analytics improve pattern detection and prevent criminal behavior. determine customer responses or purchases, as well as promote cross-sell opportunities forecast inventory and manage resources, to set ticket prices. Credit scores are used to assess a buyer’s likelihood of default for purchases
  • 41. 41 Data Visualization Data visualization is the process of converting raw data into easily understood pictures of information that enable fast and effective decisions. Visualization plays the key role in the efficient communication of information (especially with large amounts of information). Visualization is used as a "check" to verify / falsify results of automatic data analysis.
  • 42. 42 Why Data Visualization? Identify areas that need attention or improvement. Clarify which factors influence customer behavior. Help you understand which products to place where. Predict sales volumes. Data visualization is a quick, easy way to convey concepts in a universal manner
  • 43. 43 Where does Visualization fit in CRISP-DM? Visual Reporting
  • 44. 44 Visual Analytics Loop Visual Analytics will foster the constructive evaluation, correction and rapid improvement of our processes and models and - ultimately - the improvement of our knowledge and our decisions
  • 45. 45 Visual Analytics: Human and Machine
  • 46. 46 Visual Analytics vs Information Visualization Visual analytics is more than just visualization. It can rather be seen as an integral approach to decision-making, combining visualization, human factors and data analysis.
  • 48. 48
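
The k-means steps on slide 33 (define k centroids, assign each point to the nearest centroid, recalculate, repeat until nothing moves) can be sketched in plain Python. This is a minimal illustration and not code from the presentation: the function names, the `initial_centroids` parameter, and the sample points are all assumptions made for the example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def nearest_index(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, initial_centroids=None, max_iter=100, seed=0):
    """Return (centroids, labels) for a list of point tuples.

    Hypothetical helper: if no initial centroids are given, k of the
    input points are picked at random as the starting centroids.
    """
    if initial_centroids is None:
        centroids = random.Random(seed).sample(points, k)
    else:
        centroids = list(initial_centroids)

    for _ in range(max_iter):
        # Step 1: associate each point with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_index(p, centroids)].append(p)

        # Step 2: recalculate each centroid as the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [mean_point(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]

        # Step 3: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids

    labels = [nearest_index(p, centroids) for p in points]
    return centroids, labels


# Made-up sample data: two visually separate groups of points.
sample = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, labels = kmeans(sample, k=2)
print(centroids, labels)
```

The result depends on the starting centroids, which is one reason the "which parameters fit my data?" question on slide 38 matters in practice.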

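The hierarchical clustering steps on slides 34 to 37 (measure pairwise dissimilarity, merge the closest groups into a tree, then cut the tree at 0.5) can be sketched as follows. The slides do not say which linkage rule or which coordinates were used, so single linkage, Euclidean distance, and the point positions below are assumptions; the positions were chosen by hand so that the cut gives the same grouping as slide 37.

```python
from itertools import combinations
from math import dist


def single_linkage(cluster_a, cluster_b, points):
    """Smallest distance between any member of one cluster and any of the other."""
    return min(dist(points[a], points[b]) for a in cluster_a for b in cluster_b)


def cut_hierarchy(points, threshold):
    """Merge the closest clusters until every remaining pair of clusters
    is at least `threshold` apart, then return the clusters.

    With single linkage this matches cutting the dendrogram at `threshold`.
    """
    clusters = [[name] for name in points]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest dissimilarity.
        (i, j), d = min(
            (((i, j), single_linkage(clusters[i], clusters[j], points))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d >= threshold:
            break  # every remaining merge lies above the cut
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical coordinates for the objects A..G (not taken from the slides).
objects = {
    "A": (1.0, 0.0), "B": (0.0, 0.0), "C": (1.5, 0.0), "D": (3.0, 0.0),
    "E": (1.15, 0.0), "F": (0.1, 0.0), "G": (1.8, 0.0),
}

# Number the resulting clusters 1, 2, 3, ... as on slide 37.
for label, members in enumerate(cut_hierarchy(objects, threshold=0.5), start=1):
    print(label, sorted(members))
```

Raising or lowering `threshold` moves the cut up or down the tree and so changes how many clusters come out.
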
Editor's Notes

  1. C04-0.01 room number Starting LMS registration >> BD2016 Groups Who are we repeat in brief What are we doing
  2. Interactive session
  3. Why are you sitting here? Why do you want to do data analysis? What data do you have? Or what data are you familiar with? // for business people Convert data into a preferred data format. Make others understand what you have found, especially business people.
  4. Vini Do in day to day life. Examining raw data with the purpose of drawing conclusions about that information. Allows a company to make better decisions. 3 types: Exploratory – new features in the data are discovered; Confirmatory – existing hypotheses are validated; Qualitative – draw conclusions from non-numerical data, like words.
  5. Why would you use big data analytics?
  6. Banks and credit cards companies: analyze withdrawal and spending patterns to prevent fraud or identity theft. Ecommerce companies examine Web site - buy a product or service based upon prior purchases or viewing trends. Predictive maintenance Virus signature Profit
  7. Digital advertisement (targeted advertisement); Recommender systems; Image recognition; Speech recognition; Gaming (motion gaming); Price comparison websites – pricerunner, pricegrabber, junglee; Airline route planning; Delivery logistics – find best routes to ship; Self-driving cars; Robots; Improving science and research; Improving sports performance; Cities – traffic monitoring
  8. danny
  9. danny
  10. Danny Determine business objectives Assess situation Determine data mining goals Produce project plan
  11. Danny Collect initial data Describe data Explore data Verify data
  12. Danny Select Clean Construct Integrate Format data
  13. Danny Select modelling technique Generate test design Build model Assess model
  14. Danny Evaluate results Review process Determine next steps
  15. Danny Plan deployment Monitoring and maintenance Review project
  16. Classification - we have a set of predefined classes and want to know which class a new object belongs to. Clustering - group a set of objects and find whether there is some relationship between the objects. Classification - supervised learning; clustering - unsupervised learning. Association: discovering interesting relations between variables.
  17. Learns a method for predicting the instance class from pre-labelled (classified) instances. Sorting data within a database or repository.
  18. Decision trees, support vector machines, Bayesian networks. DT: Clearly lay out the problem so that all options can be challenged. Allow us to analyze fully the possible consequences of a decision. Provide a framework to quantify the values of outcomes and the probabilities of achieving them. Help us to make the best decisions on the basis of existing information and best guesses.
  19. Apriori principle : Any subset of a frequent itemset must be frequent
  20. Medicine: What are the diagnostic clusters? Business: common needs, attitudes, behaviours, demographics. Student groups: what issues keep them from excelling in exams; what psychological, environmental, aptitudinal, affective, and attitudinal factors.
  21. danny
  22. Neural networks Fuzzy
  23. danny
  24. danny
  25. danny
  26. danny
  27. danny
  28. danny
  29. Detecting fraud Optimizing marketing campaigns Improving operations Reducing risk
  30. Forex – reducing risk Weather forecasting Spam filtering Disease propagation