Lamees Mahmoud El-Ghazoly
Dear fellow learners and researchers,
I want to express my deep passion for sharing knowledge and helping people find the information
they need. I believe that knowledge should be accessible to everyone, and by sharing it, we can
contribute to the advancement of our society.
However, I also want to emphasize the importance of creating your own research. While my work
may serve as a source of inspiration or a starting point for your own research, I urge you to
conduct your own exploration and draw your own conclusions. This not only ensures the
originality and authenticity of your work but also allows for the discovery of new ideas and
perspectives.
Let us all strive to create and share knowledge in an ethical and responsible manner.
By doing so, we can make a positive impact on our communities and the world.
Thank you for your attention, and I wish you all the best in your research and learning endeavors.
And remember Share is Care.
Sincerely,
Lamees El-Ghazoly.
Data Visualization vs. Data Mining
Part 1: https://www.slideshare.net/lameesmahmoud1/data-and-information-visualization-part-1part-1pptx
Data Mining and Data Visualization are two
complementary approaches to the analysis and
interpretation of data.
While they are distinct techniques, they
are often used together to gain valuable insights
from Data.
A large retail chain wanted to identify patterns in customer behavior to increase sales and customer satisfaction. The company had a vast amount of customer data, including purchasing histories, demographic information, and store location data. To analyze this data, the company used Data Mining techniques to identify patterns and relationships in it.
The Data Mining analysis revealed that customers who purchased certain products were more likely to purchase other related products. For example, customers who purchased diapers were also likely to purchase baby food and formula.
To communicate these insights to the company's stakeholders, Data Visualization was used to create a series of interactive dashboards. These dashboards provided a visual representation of the relationships between different products, as well as the demographics of the customers who purchased them.
The dashboards allowed the stakeholders to easily explore the data and gain a deeper understanding of customer behavior. For example, they were able to see that customers in certain age groups were more likely to purchase certain products, allowing them to tailor marketing campaigns to specific demographics.
By using Data Mining to identify patterns in the data and Data Visualization to communicate these insights, the retail chain was able to increase sales and customer satisfaction by tailoring its marketing and product offerings to better meet the needs and preferences of its customers.
Data Mining
Data Mining is the process of Discovering Patterns, Trends, and Insights in large Datasets using
various Computational Techniques such as Statistical Analysis, Machine Learning, and Artificial
Intelligence.
It involves Analyzing and Extracting useful information from large volumes of data, which can
then be used for various purposes, such as Making Informed Business Decisions, Improving
Processes, Identifying Opportunities, and Predicting Future Outcomes.
Also known as Knowledge Discovery in Data (KDD)
Data Mining processes include Sequence Analysis, Classification, Path Analysis, Clustering, and Forecasting.
Data Mining is the practice of automatically searching large stores of data to discover patterns and trends that go beyond simple analysis.
Data Mining uses sophisticated mathematical algorithms to segment the data and evaluate the probability of future events.
It has four stages: Data Sources, Data Gathering or Data Exploring, Data Modeling, and Deploying the Data Models.
10 Data Mining Techniques
Outlier Detection:
In some cases, you cannot interpret a data set merely by understanding its underlying trend. You must also be able to spot anomalies, or outliers, in the data.
For example, suppose your buyers are almost entirely male, yet there is a significant spike in female purchasers during one unusual week in July. You will want to investigate the spike and figure out what drove it, so you can either reproduce it or better understand your audience.
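One common way to flag a spike like this is the interquartile range (IQR) rule. The sketch below is only an illustration: the weekly counts are hypothetical, and the 1.5 multiplier is a conventional default rather than anything stated in these slides.

```python
from statistics import quantiles

# Hypothetical weekly counts of female purchasers (not from the slides).
weekly_female_buyers = [4, 5, 3, 6, 4, 5, 38, 4, 5, 3, 6, 5]


def iqr_outliers(values, k=1.5):
    """Return (index, value) pairs that fall outside the IQR fences."""
    q1, _, q3 = quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(i, v) for i, v in enumerate(values) if v < low or v > high]


outliers = iqr_outliers(weekly_female_buyers)
for index, value in outliers:
    print(f"Week {index + 1}: {value} purchasers looks unusual")
```

Flagging a week is only the first step; the reason behind the spike still has to be investigated by hand.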
Associations:
Association is related to tracking patterns, but it is specific to variables that are dependently linked. Here, you look for particular events or attributes that are closely related to another event or attribute.
For example, you might notice that when your customers purchase a particular item, they often also purchase a second, related item. This is what drives the "People also bought" suggestions on online platforms.
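Associations of this kind are often measured with support, confidence, and lift. The following is a minimal sketch of those three measures; the basket contents are made up for illustration, loosely echoing the diapers and baby food example from the retail case.

```python
# Hypothetical shopping baskets (illustrative only).
transactions = [
    {"diapers", "baby food", "formula"},
    {"diapers", "baby food"},
    {"diapers", "formula", "milk"},
    {"milk", "bread"},
    {"bread", "baby food"},
    {"diapers", "baby food", "bread"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing `antecedent`, the fraction also containing `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


def lift(antecedent, consequent, baskets):
    """Confidence relative to how common `consequent` is overall (above 1 suggests association)."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)


print("support:", support({"diapers", "baby food"}, transactions))
print("confidence:", confidence({"diapers"}, {"baby food"}, transactions))
print("lift:", lift({"diapers"}, {"baby food"}, transactions))
```

A lift above 1 means the two items appear together more often than they would if purchases were independent.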
Clustering is a common method used in the psychological, social, and physical sciences to identify subgroups or profiles of individuals within the larger population who share similar patterns on a set of variables.
Clustering algorithms employ unsupervised learning to find natural groups in an unlabeled dataset.
Traditional methods of clustering (e.g., K-means) attempt to place each individual case into a cluster with other observations with which it shares a similar score pattern.
Fuzzy clustering is considered soft clustering: each element has a probability of belonging to each cluster. In other words, each element has a set of membership coefficients corresponding to its degree of membership in each cluster.
This differs from K-means and K-medoids clustering, where each object is assigned to exactly one cluster. K-means and K-medoids clustering are known as hard or non-fuzzy clustering.
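The difference between hard and soft assignment can be sketched for a single point with two fixed cluster centers. The membership formula used here is the standard fuzzy c-means one, with a fuzzifier m = 2. The centers and the point are hypothetical, and a full clustering algorithm would also update the centers iteratively, which this sketch does not do.

```python
import math


def hard_assignment(point, centers):
    """Index of the nearest center: each point belongs to exactly one cluster."""
    distances = [math.dist(point, c) for c in centers]
    return distances.index(min(distances))


def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership coefficients of `point` for each center (they sum to 1)."""
    distances = [math.dist(point, c) for c in centers]
    if 0.0 in distances:
        # The point sits exactly on one or more centers: share full membership among them.
        zeros = distances.count(0.0)
        return [1.0 / zeros if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


centers = [(0.0, 0.0), (10.0, 0.0)]  # hypothetical cluster centers
point = (2.0, 0.0)                   # hypothetical observation

cluster = hard_assignment(point, centers)
memberships = fuzzy_memberships(point, centers)
print("hard cluster index:", cluster)
print("fuzzy memberships:", [round(u, 3) for u in memberships])
```

The hard assignment returns a single cluster index, while the fuzzy version reports how strongly the point belongs to each cluster.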
For example, you might group your customers into different clusters based on how much disposable income they have or how often they shop at your store.
Classification:
This analysis is used to retrieve important and relevant information about data and metadata. It helps sort data into different classes. Classification is a more complex data mining technique that requires you to collect various attributes together into distinguishable categories, which you can then use to draw further conclusions or serve some function.
For example, if you analyze the financial history or purchase records of each borrower, you might classify their loans as "low," "medium," or "high" risk. You could then use these classes to learn more about those customers.
Regression is used primarily for forecasting and modeling: it estimates the likely value of a particular variable given the values of other variables.
For example, a certain amount may be predicted from other factors such as availability, market demand, and competition. The main goal of regression is to help you identify the relationship between two (or more) variables in a given data set.
Data Warehousing
Data Mining is incomplete without Data Warehousing.
Data warehousing is a method used to store vast volumes of organized data safely. It is not only a matter of storage but also of data maintenance and security. Large-scale businesses rely on data warehousing to store their data safely.
Visualization
Data Visualization is the process of presenting data as graphs, charts, and digital images. This allows businesses to quantify and track their growth.
You may also compare your growth with that of your rivals and assess your position in the marketplace.
Data Visualization enables companies to make informed decisions because it gives them a simple, well-defined representation of their data.
Factor analysis: Determine which variables combine to generate a given factor.
For example, in many psychiatric data sets the factor of interest cannot be measured directly, but one can measure other quantities (such as test scores) that reflect it.
Statistical Techniques
As the name suggests, statistics such as the mean, mode, and median of the data are computed to help predict future trends.
For businesses, statistical analysis is instrumental because it paves the way for future profits. Using statistics, companies can make strategic choices, measure their ROI, and formulate a marketing plan that takes into account potential trends in the data.
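As a small illustration, the three summary statistics named above can be computed with Python's standard library. The monthly sales figures are hypothetical and include one unusually strong month to show how the mean and the median can diverge.

```python
from statistics import mean, median, mode

# Hypothetical monthly sales figures (illustrative only).
monthly_sales = [120, 135, 120, 150, 400, 120, 140]

average = mean(monthly_sales)      # pulled upward by the single 400 month
middle = median(monthly_sales)     # the middle value once sorted
most_common = mode(monthly_sales)  # the most frequent value

print(f"mean={average:.1f}, median={middle}, mode={most_common}")
```

When the mean sits well above the median, as it does here, a few large values are dominating the average, so quoting the mean alone would overstate a typical month.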
Discriminant analysis:
Predicts a categorical response variable; it is commonly used in social science.
It attempts to determine several discriminant functions (linear combinations of the independent variables) that discriminate among the groups defined by the response variable.
Tracking Patterns
One of the most critical strategies in Data Mining is finding patterns in your data sets. This usually means spotting an aberration in the data that happens at regular intervals, or the fluctuation of a particular variable over time. For example, you may notice that sales of a specific product tend to increase shortly before the holidays, or that hot weather brings more customers to your website.
Sequential Patterns
This technique applies when the order of the data is known. Sequential analysis is useful for businesses because it lets them track sales trends. It may also help organizations learn about the sequence of activities recorded in their databases.
• Mining Sequence Data
  o Mining Time Series
  o Mining Symbolic Sequences
  o Mining Biological Sequences
• Mining Graphs and Networks
A healthcare provider wants to identify patients who are at high risk of developing a particular disease.
They use Data Mining techniques to analyze patient records, including demographic data, medical histories, and test results. By using machine learning algorithms to identify patterns in the data, they can identify patients who are at high risk of developing the disease and take proactive measures to prevent it.
S.No | Type of disease | Data mining tool | Technique | Algorithm | Traditional method | Accuracy level (%) for DM application
1 | Tuberculosis | WEKA | Naïve Bayes Classifier | KNN | Probability, Statistics | 78
2 | Heart Disease | ODND, NCC2 | Classification | Naive | Probability | 60
3 | Kidney Dialysis | RST | Classification | Decision Making | Statistics | 76
4 | Diabetes Mellitus | ANN | Classification | C4.5 Algorithm | Neural Network | 82
5 | Blood Bank Sector | WEKA | Classification | J48 | - | 90
6 | Dengue | SPSS Modeler | - | C5.0 | Statistics | 80
7 | Hepatitis C | SNP | Information Gain | Decision rule | - | 74
The bar graph in the accompanying figure plots the accuracy levels from the table above for each healthcare problem.
It makes the differences in predicted accuracy among the various data mining applications easy to compare.
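A bar graph of this kind can be approximated in plain text. The sketch below takes the accuracy percentages from the table above; the function name, bar width, and sorting by accuracy are illustrative choices, not part of the original figure.

```python
# Accuracy levels (%) taken from the table above.
accuracy_by_disease = {
    "Tuberculosis": 78,
    "Heart Disease": 60,
    "Kidney Dialysis": 76,
    "Diabetes Mellitus": 82,
    "Blood Bank Sector": 90,
    "Dengue": 80,
    "Hepatitis C": 74,
}


def text_bar_chart(data, width=40, max_value=100):
    """Build one text line per entry, highest value first, with bars starting at zero."""
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in sorted(data.items(), key=lambda kv: kv[1], reverse=True):
        bar = "#" * round(value / max_value * width)
        lines.append(f"{label:<{label_width}} | {bar} {value}%")
    return lines


chart = text_bar_chart(accuracy_by_disease)
print("\n".join(chart))
```

Because every bar starts at zero and shares the same scale, the bar lengths stay proportional to the accuracy values.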
• Data Visualization is the process of displaying extracted data in graphical or visual formats such as statistical charts, pie charts, bar graphs, and other graphical images.
• Data Visualization involves processing, analyzing, and communicating the data.
• Data Visualization gives a clear view of the data and makes it easy for the human brain to take in and remember large chunks of data at a glance.
• Data Visualization has seven stages: acquiring, parsing, filtering, mining, representing, refining, and interacting.
• Data Visualization facilitates complex data analysis by converting numerical data into meaningful 3D pictures and other graphical images.
• The applications of Data Visualization include sonar measurements, satellite photos, computer simulations, surveys, etc.
Data Visualization
Data Visualization originated in statistics and the sciences. It conveys information clearly at a glance: a picture is worth a thousand words.
A main application of Data Visualization is geographical information systems, where important geographical information is represented as visual images that convey complex information as simply as possible.
Data Visualization has applications in many areas, such as retail, government, medicine and healthcare, transportation, telecommunication, insurance, capital markets, and asset management.
Data Visualization offers many techniques, developed over the past decades, that support the exploration of large data sets.
The infographic uses flow lines to show the geographical movements of migrant populations.
A company wants to analyze its sales data to identify trends and patterns. They use a
line chart to visualize the sales data over time, with different colors representing
different product lines. By looking at the chart, they can see which product lines are
performing well and which ones need improvement.
Data Visualization vs. Data Mining
Basis for Comparison | Data Mining | Data Visualization
Definition | Searches and produces relevant results from large data chunks. | Gives a simple overview of complex data.
Preference | This has different applications and is preferred for web search engines. | They are preferred for data forecasting and predictions.
Area | Comes under data science. | Comes under the area of data science.
Platform | It is operated with web software systems or applications. | Supports and works better in complex data analyses and applications.
Generality | New technology but underdeveloped. | More useful in real-time data forecasting.
Algorithm | Many algorithms exist in using data mining. | No need to use any algorithms.
Integration | It runs on any web-enabled platform or with any applications. | Irrespective of hardware or software, it provides visual information.
When working with Categorical Data and Numeric Data together, there are a few important
considerations to keep in mind:
Choosing the right visualization: When presenting Categorical Data and Numeric Data
together, it's important to choose a visualization that effectively communicates the
relationships between the Data.
For example, a scatter plot may be a good choice for visualizing the relationship between two
numeric variables, while a stacked bar chart may be more appropriate for comparing the
frequency of different categories in a categorical variable.
Categorical Data with Numeric Data
Scaling: When working with Numeric and Categorical Data together, it's important to ensure that
the data is scaled appropriately.
For example, if one variable has a much larger range of values than the other, it may be
necessary to Rescale the data to ensure that both variables are represented in the visualization.
There are different methods to rescale data, such as Standardization (Standard Scaling), Normalization, Percentile Transformation, and more. Each of them can be demonstrated with a few lines of code in R.
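The three rescaling methods can be sketched in a few lines. Note that this illustration uses Python rather than R, which the slide mentions. The function names and the sample prices are hypothetical, and normalization is assumed here to mean min-max scaling to the range 0 to 1.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-scores: subtract the mean, divide by the (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]  # assumes the values are not all equal


def normalize(values):
    """Min-max scaling onto the range 0..1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]  # assumes max > min


def percentile_rank(values):
    """For each value, the percentage of values less than or equal to it."""
    n = len(values)
    return [100.0 * sum(other <= v for other in values) / n for v in values]


prices = [10, 20, 30, 40, 100]  # hypothetical data with one large value

print("standardized:", [round(z, 2) for z in standardize(prices)])
print("normalized:  ", [round(x, 2) for x in normalize(prices)])
print("percentiles: ", percentile_rank(prices))
```

The percentile transform spreads the values evenly regardless of how extreme the largest one is, which is why it is often preferred for heavily skewed data.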
Statistical Analysis: When Analyzing Categorical Data and Numeric Data together, it's
important to use appropriate statistical techniques to identify relationships and patterns in
the Data.
For example, a chi-squared test may be used to determine whether there is a significant association between two categorical variables, while a t-test or ANOVA may be used to compare a numeric variable across the groups of a categorical variable.
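A chi-squared test of independence works on a contingency table of counts for two categorical variables. The sketch below computes only the test statistic and the degrees of freedom by hand. The counts are hypothetical, and 3.841 is the standard 5% critical value for one degree of freedom.

```python
def chi_squared_statistic(table):
    """Return (statistic, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts. Rows: payment method (card, cash).
# Columns: bought the promoted product (yes, no).
observed = [[30, 20],
            [10, 40]]

statistic, dof = chi_squared_statistic(observed)
CRITICAL_VALUE_5_PERCENT_DOF_1 = 3.841
print(f"chi-squared = {statistic:.2f}, dof = {dof}")
print("significant at 5%:", statistic > CRITICAL_VALUE_5_PERCENT_DOF_1)
```

In practice a statistics package would also report the p-value; this sketch stops at the comparison with the critical value.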
Interpretation: When interpreting the results of an analysis that includes both categorical
data and numeric data, it's important to consider the context of the data and the
relationships between the variables.
For example, a strong correlation between two variables may not necessarily imply a causal
relationship, and it may be important to consider other factors that may be influencing the
relationship.
The Popular Emergence of Data Visualization
Analyze Both Categorical & Numerical Data.
Create a PivotTable to Analyze Worksheet Data
Suppose you have an Excel sheet for a hypothetical retail company with many branches, and you need to know how the business did in the first quarter. That is a little bit difficult!
You would have to copy and paste each of the first-quarter sales numbers into another spreadsheet, or another part of this spreadsheet, and then write a formula to calculate the total.
That is a lot of work and effort. Or you can simply use a Pivot Table.
A Quick Definition Of A Pivot Table
A Pivot Table is an Excel tool that allows you to reorganize and summarize selected columns and rows of data in a spreadsheet. It not only reorganizes and summarizes that data, but also produces a report that is going to be helpful to you.
One important thing to recognize about Pivot Tables is that creating one does not change any data in the spreadsheet. Everything stays intact; the Pivot Table just helps you look at the data in a new way.
So, let's get started. Here are the first things to consider when you're about to create a Pivot Table:
Tip #1: Data should be listed vertically, with column titles.
Tip #2: Make sure there are no blank rows in your data.
Tip #3: Avoid having extra "data" in your spreadsheet, such as hidden notes.
Tip #4: Format your data as a table.
NOW LET'S CREATE A PIVOT TABLE
Mechanism
All you have to do is go to Insert and choose PivotTable. Right away, Excel asks for some information about the Pivot Table. The first question is whether the data is in a table or a range, or whether you would like to use an external data source.
In our example, you will use a table.
Next, choose where you want the Pivot Table report to be placed: in the existing worksheet or in a new worksheet.
A panel then opens on the right. This is the PivotTable Fields pane, which lists the column headings that you typed in the original spreadsheet.
Below that list are four areas: Filters, Columns, Rows, and Values.
What you place in each area depends on your purpose for the Pivot Table: what do you want to show in the rows, and what value do you care about in this report?
In this example:
Add to Columns: Customer City, Customer State, Delivery Postcode
Add to Rows: Payment Method, Customer Name
Add to Values: Final Price
Your Pivot Table report now appears the way you want it.
You can see the total for each payment method, and you can even create a Pivot Chart.
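The idea behind the report, totalling Final Price for each Payment Method without touching the source rows, can be sketched outside Excel as well. This is only an illustration of the grouping logic, not of how Excel implements it. The order records and the helper name are hypothetical; only the column titles come from the example above.

```python
from collections import defaultdict

# Hypothetical order rows using column titles from the example.
orders = [
    {"Customer Name": "Customer A", "Payment Method": "Card", "Final Price": 120.0},
    {"Customer Name": "Customer B", "Payment Method": "Cash", "Final Price": 80.0},
    {"Customer Name": "Customer C", "Payment Method": "Card", "Final Price": 45.5},
    {"Customer Name": "Customer D", "Payment Method": "Transfer", "Final Price": 200.0},
    {"Customer Name": "Customer E", "Payment Method": "Cash", "Final Price": 20.0},
]


def pivot_sum(rows, row_field, value_field):
    """Sum `value_field` for each distinct value of `row_field`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[row_field]] += row[value_field]
    return dict(totals)


totals = pivot_sum(orders, "Payment Method", "Final Price")
for method, total in sorted(totals.items()):
    print(f"{method:<10} {total:>8.2f}")
```

As with a Pivot Table, the summary is a new view: the `orders` list itself is left unchanged.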
Create Pivot Chart.
Data visualization, like any other form of communication, can be manipulated or misrepresented
to deceive or mislead the audience.
What is misrepresentation of data?
Data representation is the visual depiction of useful information. However, it is even more important to represent the insights correctly. Any misrepresentation of data will lead to errors of judgment.
For example, using a red color for a bar that represents a positive value can convey a negative message.
In the worst cases, the results could be catastrophic. In milder cases, they could still be an embarrassment at the workplace.
How can Data Visualization lie?
Different reasons for misleading visualization of data
Data is misrepresented if it meets one or more of the following criteria:
• Unethical manipulation of data in analysis phase
• Unethical manipulation of data in visualization phase
• Inconsistency errors
• Incompetency errors
Unethical manipulation of data in Analysis Phase
Data can be wrongly manipulated in the analysis phase itself. Sadly, this is far more common than we may imagine. Such manipulation stems from the need to force a particular ideology, perspective, or result. One may selectively collect data, or selectively filter the data for analysis. At times, someone may hide an unsupported hypothesis, or even fabricate data to show the desired results. All of these are considered unethical practices.
Unethical manipulation of data in Visualization Phase
The second layer of misrepresentation comes from the visualization phase. In this phase, the analyst already has the result. They may purposefully manipulate what insights are presented and how. Again, they may keep unfavorable findings to themselves. Alternatively, there are more technical ways to misrepresent data, which we shall discuss in detail later. The way we prepare charts and graphics has a very strong impact on the story being conveyed.
Misleading Data Visualization Examples
This exercise is for analysis only and not meant to criticize any outlets. Charts are for education
and not copyrighted by Management Weekly.
Gun deaths vs ‘stand your ground’ campaign
What is wrong with this chart?
The Y-axis is flipped on this chart, showing more gun-related deaths as you move down instead of up. This goes against the standard way of reading charts and can mislead viewers.
How do people interpret this chart?
This graph shows gun-related deaths in Florida over time. At first glance, the number of deaths appears to decrease after 2005, seemingly because of the "stand your ground" law.
The number of gun deaths increased from about 500 to 800 after the law was passed.
A correctly visualized graph would show this trend in the traditional form.
Correct representation
Accurate Data Visualizations require following conventions. Altered charts and selective
storytelling can promote biased viewpoints.
The chart uses a conventional trendline to make data interpretation less prone to errors. But,
correlation doesn't mean causation. More investigation is needed to confirm the impact of
gun laws on deaths.
Biggest covid worries
What is wrong with this chart?
A pie chart shows the proportion of each component as a percentage. All parts should add up
to 1, or 100%. For instance, around 48% of people may be concerned about getting the virus.
How do people interpret this chart?
This chart wrongly depicts the percentage of each of these components. If you want to represent data as a pie chart, you should always do so by computing each part's proportion of the whole. If your data has overlapping categories, as in this case, then you must represent it differently.
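A quick sanity check before drawing a pie chart is to confirm that the parts add up to 100%. The sketch below assumes survey answers given as percentages. The category names and all numbers are hypothetical apart from the 48% figure mentioned above, and the tolerance for rounding is an arbitrary choice.

```python
def can_use_pie_chart(percentages, tolerance=0.5):
    """True when the parts add up to 100%, allowing a small rounding tolerance."""
    return abs(sum(percentages.values()) - 100.0) <= tolerance


# Hypothetical multi-answer survey: respondents could pick several worries,
# so the categories overlap and the percentages exceed 100 in total.
overlapping_worries = {"Getting the virus": 48, "Worry B": 40, "Worry C": 45}

# Hypothetical single-answer survey: every respondent picked exactly one option.
exclusive_answers = {"Option A": 50, "Option B": 30, "Option C": 20}

print("overlapping:", can_use_pie_chart(overlapping_worries))
print("exclusive:  ", can_use_pie_chart(exclusive_answers))
```

When the check fails, a bar chart with one bar per category is usually the safer choice, since it does not imply that the parts form a whole.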
Correct representation
There are various ways to accurately represent this data. One option is to use a Venn diagram
to show the different categories and any potential overlap between them. Another option is to
create a bar chart. Both methods are effective in conveying the information.
Data visualization, like any other form of communication, can be manipulated or
misrepresented to deceive or mislead the audience.
Here are some ways in which data visualization can be used to lie:
Distorting the scale: By manipulating the scales on an axis, the visualization can be
made to appear more or less significant than it actually is.
For example, a bar graph that starts at a value greater than zero can make differences
between bars seem larger than they actually are.
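The effect of a truncated axis can be put into numbers by comparing the drawn heights of two bars. The values 50 and 55 and the baseline of 45 below are hypothetical, chosen only to illustrate the distortion.

```python
def apparent_ratio(smaller, larger, baseline=0.0):
    """Ratio of the drawn bar heights when the value axis starts at `baseline`."""
    return (larger - baseline) / (smaller - baseline)


# Two hypothetical values that differ by 10%.
honest = apparent_ratio(50, 55)                 # axis starts at zero
truncated = apparent_ratio(50, 55, baseline=45)  # axis starts at 45

print(f"axis from 0:  second bar looks {honest:.1f}x as tall")
print(f"axis from 45: second bar looks {truncated:.1f}x as tall")
```

A 10% difference in the data becomes a bar that looks twice as tall once the axis starts at 45.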
Money Raised
Cherry-picking data: By selectively choosing which data to include or exclude, the visualization can be made
to support a particular conclusion. This is known as "cherry-picking" and can be used to misrepresent the
overall trend or pattern of the data.
For Example: Emerging markets are a very volatile asset, and depending on when you look at them they
can be all over the map. Let’s look at the timeframes that each source used:
https://portfoliocharts.com/2016/03/29/the-avoidable-mistake-of-cherry-picking-data/
As you can see, each average return is
completely accurate, but no one average
return tells the full story.
The source with the most data has the
most representative long-term number,
but it hides the fact that emerging
markets grew massively in the decade
between 1983-1993 and have done pretty
poorly for the last 20 years.
The shortest source includes all data since
its index fund was founded, but excludes
the remarkable run that drove people to
start it in the first place.
So which number should you use for your
own decision making? That’s where the
cherry picking comes in.
Another example: below is an example using the number of "leads" generated over the course of 10 weeks. NOTE: Assume week 8 is simply an anomaly. There were no extra marketing efforts made, just one great, random week.
Week | Leads Generated
Week 1 | 20
Week 2 | 20
Week 3 | 30
Week 4 | 10
Week 5 | 10
Week 6 | 10
Week 7 | 10
Week 8 | 80
Week 9 | 10
Week 10 | 10
Two very different graphical representations were created to illustrate how formatting can be manipulated to badly misrepresent data.
This is a bar graph showing the number of leads generated per week:
Logically, this tells us that after a decent showing in weeks 1-3, and excluding an abnormal week 8, we've seen a downtrend in leads generated from 20-30 leads per week to 10.
The next slide illustrates the exact same data set using different formatting:
Here we're looking at a linear trend line of the above data. You'll notice that the actual data line has been removed and the Y-axis has been limited to show the maxi…
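The lead counts from the table above can be used to check how a linear trend line behaves on this data. The sketch below fits an ordinary least-squares line with and without the anomalous week 8. It is an independent illustration, since the slide's own description of its chart is cut off, and the function name is hypothetical.

```python
# Leads generated in weeks 1..10, as listed in the table above.
leads = [20, 20, 30, 10, 10, 10, 10, 80, 10, 10]
weeks = list(range(1, len(leads) + 1))


def trend_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


slope_all = trend_slope(weeks, leads)

# Same fit with the anomalous week 8 left out.
kept = [(w, n) for w, n in zip(weeks, leads) if w != 8]
slope_without_week8 = trend_slope([w for w, _ in kept], [n for _, n in kept])

print(f"slope with week 8:    {slope_all:+.2f} leads per week")
print(f"slope without week 8: {slope_without_week8:+.2f} leads per week")
```

With week 8 included the fitted line slopes upward, even though nine of the ten weeks show flat or falling numbers. A trend line shown without the underlying data can therefore suggest growth that a single anomaly created.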
Omitting context: By omitting important context or background information, the visualization can be made to appear more significant or less significant than it actually is.
A visualization of crime rates that does not account for changes in population over time can be misleading.
For example, if the population of a city increases over time but the number of crimes remains constant, a chart of raw crime counts will look flat even though the crime rate per person has actually fallen.
This is why it is important to use per capita rates when comparing crime rates over time or between different locations.
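A per capita rate is simply the count divided by the population, usually scaled to a convenient base such as 100,000 residents. The city figures below are hypothetical.

```python
def rate_per_100k(crimes, population):
    """Crimes per 100,000 residents."""
    return crimes / population * 100_000


# Hypothetical city: the number of crimes is constant while the population grows.
records = {
    2010: {"crimes": 500, "population": 100_000},
    2020: {"crimes": 500, "population": 125_000},
}

for year, row in records.items():
    rate = rate_per_100k(row["crimes"], row["population"])
    print(f"{year}: {row['crimes']} crimes, {rate:.0f} per 100,000 residents")
```

The raw count is identical in both years, but the per capita rate drops by a fifth, which is the context a chart of raw counts leaves out.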
Overgeneralizing: By presenting data in a way that overgeneralizes or oversimplifies
complex phenomena, the visualization can be used to support a particular conclusion or
narrative.
For example, using a single data point to represent the entire population can be misleading.
Overall, Data Visualization can be used to lie or mislead if the designer intentionally distorts
the data, omits important context, or misrepresents the data in a way that supports a
particular agenda or point of view. It is important to critically evaluate visualizations and to
verify the accuracy and validity of the data presented.
How to avoid data misrepresentation?
Unethical manipulation of data in analysis phase
Problem statement: Have you defined your problem clearly, with required variables?
Data Collection: Correct and right source; random sampling; correct representation
Data Analysis: Pre-determined criteria; avoid p-value hacking; uniform methodology
Unethical manipulation of data in visualization phase
Type of visualization: Use a visualization that enables correct inference
Visualization methodology: Unclutter the data with only top variables; use standard and meaningful scales for axes and data; don't hide unfavorable findings
Incompetency errors
Problem statement: Use labels for axes and titles for the visualization; use pie charts sparingly (may be appropriate for percentage data); normalize data that has a lot of variance
Data Analysis: Data that has random fluctuations must be averaged to eliminate these variations.
Inconsistency errors
Visualization technique: Use a consistent and commonly used scale; use the same scale for different charts having similar data
Data representation: Refrain from using too many variables; never represent unrelated variables as related ones
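One simple way to average out random fluctuations, as recommended above, is a moving average. The sketch below uses a window of three points, which is an arbitrary choice, and the daily values are hypothetical.

```python
def moving_average(values, window=3):
    """Average each run of `window` consecutive values to smooth random fluctuations."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


daily_visits = [10, 14, 9, 13, 11, 15, 10]  # hypothetical noisy series
smoothed = moving_average(daily_visits, window=3)
print(smoothed)
```

A wider window smooths more strongly but also hides short-lived changes, so the window size should be stated alongside the chart.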
Enhances Understanding
Data visualization can help people
understand complex Data by presenting it in
a more intuitive and accessible way. It
can reveal patterns, trends, and
relationships that may not be apparent in
Raw Data.
Improves Decision-making
Data visualization can help decision-makers
make more informed decisions by providing a
clear and concise representation of Data. It
enables them to quickly identify trends and
patterns, and make informed decisions based on
the insights gained from the Data.
Data Visualization plays a critical role in helping people to understand, analyze, and
communicate complex Data and information. It is an essential tool for Decision-
making, problem-solving, and exploring Data in a meaningful and impactful way.
Increases Engagement
Data visualization can make Data
more engaging and interesting by
presenting it in an interactive and
visually appealing way. This can
help to increase engagement and
encourage people to explore the
Data further.
Facilitates Communication
Data visualization can be used to
communicate complex data and
information to a wider audience.
It can help to simplify complex
concepts and ideas, making them
easier to understand
and communicate.
Enables Exploration
Data visualization can enable
people to explore Data in a more
interactive way. By providing tools
for filtering, sorting, and drilling
down into the Data, it can help
people to uncover insights and
gain a deeper understanding of
the Data.
• Data visualization is the graphical representation of information and data in a visual or graphic format (charts, graphs, and maps).
• Data visualization tools provide an accessible way to see and understand trends, patterns, and outliers in data.
• Data visualization tools and technologies are essential to analyzing massive amounts of information and making data-driven decisions.
• Pictures have been used to understand data for centuries. General types of Data Visualization are Charts, Tables, Graphs, Maps, and Dashboards.
Business Analysts
https://app.powerbi.com/groups/me/reports/1df8a057-a7e5-4a3b-a752-e75ae2025b55/ReportSection?bookmarkGuid=71d5e22b-e78e-488e-9379-22f7f83b9b69&bookmarkUsage=1&ctid=ea5fdf32-3f04-4813-8442-1c240c82b744&portalSessionId=906600d7-8612-4722-a890-a4476b3da322&fromEntryPoint=export
THANK YOU

Data Visualization vs. Data Mining: How a Retail Chain Used Both Techniques to Increase Sales

  • 2. Dear fellow learners and researchers, I want to express my deep passion for sharing knowledge and helping people find the information they need. I believe that knowledge should be accessible to everyone, and by sharing it, we can contribute to the advancement of our society. However, I also want to emphasize the importance of creating your own research. While my work may serve as a source of inspiration or a starting point for your own research, I urge you to conduct your own exploration and draw your own conclusions. This not only ensures the originality and authenticity of your work but also allows for the discovery of new ideas and perspectives. Let us all strive to create and share knowledge in an ethical and responsible manner. By doing so, we can make a positive impact on our communities and the world. Thank you for your attention, and I wish you all the best in your research and learning endeavors. And remember: Share is Care. Sincerely, Lamees El-Ghazoly.
  • 3. Data Visualization vs. Data Mining. Part 1: https://www.slideshare.net/lameesmahmoud1/data-and-information-visualization-part-1part-1pptx
  • 4. Data Mining and Data Visualization are two complementary approaches to the analysis and interpretation of data. While they are distinct techniques, they are often used together to gain valuable insights from data.
  • 5. A large retail chain wanted to identify patterns in customer behavior to increase sales and customer satisfaction. The company had a vast amount of customer data, including purchasing histories, demographic information, and store location data. To analyze this data, the company used Data Mining techniques to identify patterns and relationships in the data. To communicate these insights to the company's stakeholders, Data Visualization was used to create a series of interactive dashboards. These dashboards provided a visual representation of the relationships between different products, as well as the demographics of the customers who purchased them. The dashboards allowed the stakeholders to easily explore the data and gain a deeper understanding of customer behavior. For example, they were able to see that customers in certain age groups were more likely to purchase certain products, allowing them to tailor marketing campaigns to specific demographics. The Data Mining analysis revealed that customers who purchased certain products were more likely to purchase other related products. For example, customers who purchased diapers were also likely to purchase baby food and formula. By using Data Mining to identify patterns in the data and Data Visualization to communicate these insights, the retail chain was able to increase sales and customer satisfaction by tailoring its marketing and product offerings to better meet the needs and preferences of its customers.
  • 6. Data Mining: Data Mining is the process of discovering patterns, trends, and insights in large datasets using various computational techniques such as statistical analysis, machine learning, and artificial intelligence. It involves analyzing and extracting useful information from large volumes of data, which can then be used for various purposes, such as making informed business decisions, improving processes, identifying opportunities, and predicting future outcomes. It is also known as Knowledge Discovery in Data (KDD).
  • 7. Data Mining processes include sequence analysis, classification, path analysis, clustering, and forecasting. Data Mining is the practice of automatically searching large stores of data to discover patterns and trends that go beyond simple analysis. It uses sophisticated mathematical algorithms to segment the data and evaluate the probability of future events.
  • 8. Data Mining has four stages: data sources, data gathering (or data exploring), data modeling, and deploying the data models.
  • 9. 10 Data Mining Techniques. Outlier Detection: In certain cases, you can't interpret a data set merely by understanding the underlying trend; you must also be able to spot anomalies, or outliers, in the data. For example, if your buyers are almost entirely male but there is a significant spike in female purchasers during one unusual week in July, you'll want to investigate the spike and figure out what drove it, so that you can either reproduce it or better understand your audience.
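One simple way to flag a spike like the one described above is a z-score check. The sketch below is only an illustration under stated assumptions: the weekly figures are invented, the helper name `zscore_outliers` is hypothetical, and the threshold of 2 standard deviations is an arbitrary choice rather than anything prescribed by the slides.

```python
from statistics import mean, pstdev

# Hypothetical weekly share of female purchasers (invented for illustration).
weekly_female_share = [0.05, 0.06, 0.04, 0.05, 0.30, 0.05, 0.06, 0.04]

def zscore_outliers(values, threshold=2.0):
    """Return indices of values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

outliers = zscore_outliers(weekly_female_share)
print(outliers)  # indices of the weeks worth investigating
```

A z-score check is sensitive to the outliers themselves, since they inflate the standard deviation; on small samples a median-based rule may be a more robust choice.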
  • 10. Associations: Association is related to tracking patterns but is specific to variables that are dependently linked. In this case, you look for particular events or attributes that are closely correlated with another event or attribute. For example, you might find that when your customers purchase a particular item, they often also purchase a second, related item. This is what drives the “People also bought” suggestions on online platforms.
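Association rules are usually summarized with two numbers: support (how often the items appear together) and confidence (how often the second item appears given the first). The following is a minimal sketch, assuming a handful of invented shopping baskets; the function name `rule_stats` is hypothetical and not taken from the slides.

```python
# Hypothetical shopping baskets, invented for illustration only.
baskets = [
    {"diapers", "baby food", "formula"},
    {"diapers", "baby food"},
    {"diapers", "formula", "wipes"},
    {"bread", "milk"},
    {"diapers", "baby food", "milk"},
]

def rule_stats(baskets, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    n = len(baskets)
    with_antecedent = sum(1 for b in baskets if antecedent in b)
    with_both = sum(1 for b in baskets if antecedent in b and consequent in b)
    support = with_both / n
    confidence = with_both / with_antecedent if with_antecedent else 0.0
    return support, confidence

support, confidence = rule_stats(baskets, "diapers", "baby food")
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

Here 3 of the 5 baskets contain both items (support 0.60), and 3 of the 4 baskets with diapers also contain baby food (confidence 0.75).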
  • 11. Clustering is a common method used in the psychological, social, and physical sciences to identify subgroups or profiles of individuals within a larger population who share similar patterns on a set of variables. Clustering algorithms employ unsupervised learning to find natural groups in an unlabeled dataset. Traditional clustering methods (e.g., K-means) attempt to place each individual case into a cluster with other observations with which it shares a similar score pattern. Fuzzy clustering is considered soft clustering: each element has a probability of belonging to each cluster. In other words, each element has a set of membership coefficients corresponding to its degree of membership in each cluster. This differs from k-means and k-medoids clustering, where each object is assigned to exactly one cluster; k-means and k-medoids are known as hard, or non-fuzzy, clustering.
  • 12. For example, you might group your customers into different clusters based on how much disposable income they have or how often they shop in your store.
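To make hard clustering concrete, here is a minimal k-means sketch in plain Python. Everything specific in it is an assumption for illustration: the customer points (disposable income in thousands, store visits per month) are invented, the function name `kmeans` is hypothetical, and the centroids are naively initialized from the first k points rather than by a production-grade method.

```python
# Hypothetical customers: (disposable income in thousands, store visits per month).
customers = [(20, 2), (22, 3), (25, 2), (80, 10), (85, 12), (90, 11)]

def kmeans(points, k, max_iter=100):
    """Assign each point to exactly one of k clusters (hard clustering)."""
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centroids[c])))
            for pt in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

labels, centroids = kmeans(customers, k=2)
print(labels)
print(centroids)
```

Because the two features are on different scales, income dominates the distance here; in practice the features would usually be rescaled before clustering.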
  • 13. Classification: This analysis is used to obtain important and relevant information about data and metadata. This data mining method assists in classifying data into various groups. It is a more complex technique that requires you to collect various attributes together into distinguishable categories, which you can then use to draw further conclusions or serve some function. For example, if you analyze the financial history or purchase records of each borrower, you might classify them as “low,” “medium,” or “high” risk. These classifications can then be used to learn more about those customers.
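As one possible illustration of classification, the sketch below uses a k-nearest-neighbours vote to label a new borrower as “low,” “medium,” or “high.” This is a swapped-in technique chosen for brevity, not a method named in the slides; the training records, the two features (debt ratio, missed payments), and the function name `classify` are all hypothetical.

```python
from collections import Counter

# Hypothetical labeled borrowers: ((debt ratio, missed payments), category).
training = [
    ((0.10, 0), "low"),
    ((0.15, 1), "low"),
    ((0.40, 2), "medium"),
    ((0.45, 3), "medium"),
    ((0.80, 6), "high"),
    ((0.90, 7), "high"),
]

def classify(features, training, k=3):
    """Label `features` by majority vote among its k nearest training records."""
    by_distance = sorted(
        training,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(features, item[0])),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

print(classify((0.85, 6), training))
print(classify((0.12, 0), training))
```

The features are deliberately left unscaled to keep the sketch short, so the missed-payments count dominates the distance.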
  • 14. Regression is used primarily for forecasting and modeling: it estimates the likely value of a particular variable given the presence of other variables. For example, a certain amount may be predicted based on other factors such as availability, market demand, and competition. The main goal of regression is to help you identify the exact relationship between two (or more) variables in a given collection of data.
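The simplest case of regression is a straight-line fit between two variables using ordinary least squares. The sketch below assumes invented data points and a hypothetical helper named `fit_line`; it shows only the two-variable case, whereas the text above also covers models with several predictors.

```python
# Hypothetical observations: x could be a demand index, y the amount to predict.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(xs, ys)
prediction = intercept + slope * 6  # forecast for an unseen x value
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={prediction:.2f}")
```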
  • 15. Data Warehousing: Without data warehousing, Data Mining is incomplete. Data warehousing is a method used to store vast volumes of organized data safely. It matters not only for preserving data but also for data maintenance and security. Large-scale businesses require data warehousing to store their data safely.
  • 16. Visualization: Graphs, charts, and digital images are ways of presenting data visually. Visualization allows businesses to quantify and improve their growth. You may also compare your growth with that of your rivals and assess your position in the market. Data visualization enables companies to make informed decisions because they are working from a simple, well-defined representation of the data.
  • 17. Statistical Techniques: As the name suggests, statistics such as the mean, mode, and median of the data are computed to help predict future trends. For businesses, statistical analysis is instrumental because it paves the way for future profits. Using statistics, companies can make strategic choices, measure their ROI, and formulate a marketing plan that takes potential trends in the data into account. Factor analysis: determines which variables combine to generate a given factor. For example, with many psychiatric data sets, one cannot measure the factor of interest directly but can measure other quantities (such as test scores) that reflect it. Discriminant analysis: predicts a categorical response variable and is commonly used in social science. It attempts to determine several discriminant functions (linear combinations of the independent variables) that discriminate among the groups defined by the response variable.
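The mean, median, and mode mentioned above can be computed directly with Python's standard `statistics` module. The daily sales figures below are invented for illustration.

```python
from statistics import mean, median, mode

# Hypothetical daily sales counts (invented for illustration).
daily_sales = [12, 15, 15, 18, 40]

summary = {
    "mean": mean(daily_sales),      # arithmetic average
    "median": median(daily_sales),  # middle value when sorted
    "mode": mode(daily_sales),      # most frequent value
}
print(summary)
```

The gap between the mean and the median in this sample is itself informative: a single large value (40) pulls the mean up, while the median stays at the typical level.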
  • 18. Tracking Patterns: One of the most critical strategies in Data Mining is to find trends in the data sets. It typically involves detecting a specific aberration in the data that happens periodically, or the fluctuation of a particular variable over time. For example, you may notice that sales of a specific product tend to increase shortly before the holidays, or that hot weather brings more customers to your website. Sequential Patterns: Here the order of the data is known. Sequential analyses are useful for businesses because they can track selling trends. They may also help organizations learn about the sequence of activities taking place in their databases. Mining Sequence Data (Mining Time Series, Mining Symbolic Sequences, Mining Biological Sequences); Mining Graphs and Networks.
  • 20. A healthcare provider wants to identify patients who are at a high risk of developing a particular disease. They use Data Mining techniques to analyze patient records, including demographic data, medical histories, and test results. By using machine learning algorithms to identify patterns in the data, they can identify patients who are at a high risk of developing the disease and take proactive measures to prevent it.
    Table columns: S.No; Type of disease; Data mining tool; Technique; Algorithm; Traditional method; Accuracy level (%) for DM application
    1. Tuberculosis; WEKA; Naïve Bayes Classifier; KNN; Probability Statistics; 78%
    2. Heart Disease; ODND, NCC2; Classification; Naive; Probability; 60%
    3. Kidney Dialysis; RST; Classification; Decision Making; Statistics; 76%
    4. Diabetes Mellitus; ANN; Classification; C4.5 Algorithm; Neural Network; 82%
    5. Blood Bank Sector; WEKA; Classification; J48; 90%
    6. Dengue; SPSS Modeler; C5.0; Statistics; 80%
    7. Hepatitis C; SNP; Information Gain; Decision rule; 74%
  • 21. The bar graph formed from the table above, showing the percentage accuracy level for each health care problem, is illustrated in the given figure. In this bar graph, the predicted accuracy levels of the various data mining applications are distinguished.
  • 22. Data Visualization is the process of displaying extracted data in different graphical or visual formats such as statistical representations, pie charts, bar graphs, and graphical images. Data Visualization involves processing, analyzing, and communicating the data. It gives a clear view of the data and makes it easier for the human brain to take in and remember large chunks of data at a glance. Data Visualization has seven stages: acquiring, parsing, filtering, mining, representing, refining, and interacting. It facilitates complex data analysis by converting numerical data into meaningful 3D pictures and other graphical images. The applications of Data Visualization include sonar measurements, satellite photos, computer simulations, and surveys.
  • 23. Data Visualization originated in statistics and the sciences. It gives a clear picture at a glance, much as a picture is said to be worth a thousand words. One main application of Data Visualization is geographical information systems, where important geographical information can be represented as visual images that present complex information as simply as possible. Data Visualization has applications in retail, government, medicine and healthcare, transportation, telecommunication, insurance, capital markets, and asset management. It provides many visualization techniques, developed over the past decades, that support the exploration of large data sets. Figure: The infographic allows viewing, through the flows, geographical movements of migrant masses.
  • 24. A company wants to analyze its sales data to identify trends and patterns. They use a line chart to visualize the sales data over time, with different colors representing different product lines. By looking at the chart, they can see which product lines are performing well and which ones need improvement.
  • 25. Data Visualization vs. Data Mining, by basis for comparison:
    Definition. Data Mining: Searches and produces relevant results from large data chunks. Data Visualization: Gives a simple overview of complex data.
    Preference. Data Mining: Has different applications and is preferred for web search engines. Data Visualization: Preferred for data forecasting and predictions.
    Area. Data Mining: Comes under data science. Data Visualization: Comes under the area of data science.
    Platform. Data Mining: Operated with web software systems or applications. Data Visualization: Supports and works better in complex data analyses and applications.
    Generality. Data Mining: New technology but underdeveloped. Data Visualization: More useful in real-time data forecasting.
    Algorithm. Data Mining: Many algorithms exist for data mining. Data Visualization: No need to use any algorithms.
    Integration. Data Mining: Runs on any web-enabled platform or with any application. Data Visualization: Provides visual information irrespective of hardware or software.
  • 26. Categorical Data with Numeric Data: When working with categorical data and numeric data together, there are a few important considerations to keep in mind. Choosing the right visualization: When presenting categorical data and numeric data together, it's important to choose a visualization that effectively communicates the relationships in the data. For example, a scatter plot may be a good choice for visualizing the relationship between two numeric variables, while a stacked bar chart may be more appropriate for comparing the frequency of different categories in a categorical variable.
  • 27. Categorical Data with Numeric Data. Scaling: When working with numeric and categorical data together, it's important to ensure that the data is scaled appropriately. For example, if one variable has a much larger range of values than the other, it may be necessary to rescale the data to ensure that both variables are represented in the visualization. There are different methods for rescaling data, such as standard scaling (standardization), normalization, percentile transformation, and more. Code can be used to demonstrate how to standardize, normalize, and percentilize data in R.
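The three rescaling methods named above can be sketched in a few lines. The slide mentions R; this sketch uses Python instead, purely as an illustration. The sample values and the function names (`standardize`, `normalize`, `percentile_rank`) are hypothetical, and the percentile transform shown is one simple definition (the share of values less than or equal to each value) among several in common use.

```python
from statistics import mean, pstdev

# Hypothetical numeric column (invented for illustration).
values = [10, 20, 30, 40, 50]

def standardize(values):
    """Standard scaling: subtract the mean, divide by the standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def normalize(values):
    """Min-max normalization: map the smallest value to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def percentile_rank(values):
    """Percentile transform: share of values less than or equal to each value."""
    n = len(values)
    return [sum(1 for x in values if x <= v) / n for v in values]

standardized = standardize(values)
normalized = normalize(values)
percentiles = percentile_rank(values)
print(standardized)
print(normalized)
print(percentiles)
```

Both `standardize` and `normalize` assume the column is not constant; a constant column would cause a division by zero and needs to be handled separately.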
  • 28. Categorical Data with Numeric Data. Statistical analysis: When analyzing categorical data and numeric data together, it's important to use appropriate statistical techniques to identify relationships and patterns in the data. For example, a chi-squared test may be used to determine whether there is a significant relationship between two categorical variables, while a t-test or ANOVA may be used to compare a numeric variable across the groups of a categorical variable. Interpretation: When interpreting the results of an analysis that includes both categorical data and numeric data, it's important to consider the context of the data and the relationships between the variables. For example, a strong correlation between two variables does not necessarily imply a causal relationship, and it may be important to consider other factors that may be influencing the relationship.
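A chi-squared test of independence works on counts in a contingency table of two categorical variables (a numeric variable would first have to be binned into categories). The sketch below computes only the test statistic by hand; the counts and the function name `chi_squared` are invented for illustration.

```python
# Hypothetical 2x2 contingency table: rows are age groups, columns are
# "bought product" / "did not buy" (counts invented for illustration).
observed = [
    [30, 10],
    [20, 40],
]

def chi_squared(observed):
    """Pearson chi-squared statistic for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (count - expected) ** 2 / expected
    return stat

statistic = chi_squared(observed)
print(f"chi-squared = {statistic:.3f}")
```

For a 2x2 table there is 1 degree of freedom, and the critical value at the 5% significance level is about 3.841, so a statistic well above that suggests the two variables are not independent.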
  • 29. The Popular Emergence of Data Visualization: Analyze Both Categorical & Numerical Data.
  • 31. If you have an Excel sheet for your hypothetical retail company that has many branches, and you need to know how your business did in the first quarter, that is a little bit difficult! You would have to copy and paste each of the first-quarter sales numbers into another spreadsheet, or another part of this spreadsheet, and then write a formula to calculate the total. It's a lot of work and effort. Or you can simply use a Pivot Table.
  • 32. A Quick Definition of a Pivot Table. A Pivot Table is an Excel tool that allows you to reorganize and summarize selected columns and rows of data in a spreadsheet. It not only reorganizes and summarizes that data, but also produces a report that is going to be helpful to you. One important thing to recognize about Pivot Tables: creating one does not change any data in the spreadsheet. Everything stays intact; the Pivot Table just helps you look at the data in a new way. So, let's get started with the first things to consider when you're about to create a Pivot Table.
  • 33. Tip #1: Data should be listed vertically, with column titles. Tip #2: Make sure there are no blank rows in your data.
  • 34. Tip #3: Avoid having extra "data" in your spreadsheet, such as hidden notes. Tip #4: Format your data as a table.
  • 35. NOW LET’S CREATE A PIVOT TABLE
  • 36. Mechanism. All you have to do is go up to Insert and choose PivotTable. Right away, Excel asks for some information about the Pivot Table. First, it asks whether the data is a table or a range, or whether you would like to use an external data source; in our example, you will use a table. Next, choose where you want the Pivot Table report to be placed: somewhere in the existing worksheet or in a new worksheet.
  • 37. Mechanism. On the right, a panel opens: this is the PivotTable Fields pane. It lists the column headings (column titles) that you typed in the original spreadsheet. Below that list are four areas: Filters, Columns, Rows, and Values. How you fill them depends on the purpose of your Pivot Table: what do you want to show in the rows, and which value do you care about in this report?
  • 38. Mechanism. In this example: add Customer City, Customer State, and Delivery Postcode to Columns; add Payment Method and Customers Name to Rows; add Final Price to Values.
  • 39. Mechanism. Your Pivot Table report is now laid out the way you want it. You can see the total for each payment method, and you can even create a Pivot Chart from it.
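What the Pivot Table computes here (the total Final Price for each Payment Method) can be sketched outside Excel as a group-and-sum. This is only an illustration of the idea, not how Excel implements it; the rows and the `pivot_sum` helper are hypothetical, and only the field names come from the slides.

```python
from collections import defaultdict

# Hypothetical source rows; the field names follow the slide's example.
orders = [
    {"Customers Name": "A", "Payment Method": "Card", "Final Price": 120.0},
    {"Customers Name": "B", "Payment Method": "Cash", "Final Price": 50.0},
    {"Customers Name": "C", "Payment Method": "Card", "Final Price": 80.0},
]

def pivot_sum(rows, row_field, value_field):
    """Group rows by row_field and sum value_field, without modifying rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[row_field]] += row[value_field]
    return dict(totals)

report = pivot_sum(orders, "Payment Method", "Final Price")
```

As with a Pivot Table, the source rows are left untouched; `report` is a separate summary of them.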
  • 41. How can Data Visualization lie? Data visualization, like any other form of communication, can be manipulated or misrepresented to deceive or mislead the audience. What is misrepresentation of data? Data representation is the visual depiction of useful information, and it is even more important to represent the insights correctly. Any misrepresentation of data will lead to errors of judgment. For example, using a red color for a bar that represents a positive value can convey a negative message. In the worst cases, the results could be catastrophic; in milder cases, they could still be an embarrassment at the workplace. Different reasons for misleading visualization of data: data is misrepresented if it meets one or more of the following criteria: (1) unethical manipulation of data in the analysis phase; (2) unethical manipulation of data in the visualization phase; (3) inconsistency errors; (4) incompetency errors.
  • 42. Unethical manipulation of data in the Analysis Phase: data can be wrongly manipulated in the analysis phase itself. Sadly, this is far more common than we may imagine. Such manipulation stems from the need to force a particular ideology, perspective, or result. One may selectively collect data, or selectively filter the data for analysis. There are also times when someone may hide an unsupported hypothesis, and at times people may even fabricate data to show the desired results. All of these are considered unethical practices. Unethical manipulation of data in the Visualization Phase: the second layer of misrepresentation comes from the visualization phase. In this phase, the analyst already has the result and may purposefully manipulate what insights are presented and how. They may keep unfavorable findings to themselves, or use more technical ways to misrepresent data, which are discussed in detail later. The way we prepare charts and graphics has a very strong impact on the story that is conveyed.
  • 43. Misleading Data Visualization Examples. This exercise is for analysis only and is not meant to criticize any outlets. Charts are for education and not copyrighted by Management Weekly. Example: gun deaths vs. the "stand your ground" campaign. What is wrong with this chart? The Y-axis is flipped, so more gun-related deaths appear as you move down instead of up. This goes against the standard way of reading charts and can mislead viewers. How do people interpret this chart? The graph shows gun-related deaths in Florida over time. Read in the usual way, it suggests that the number of deaths decreased after 2005, possibly due to the "stand your ground" law, which aimed to reduce gun deaths.
  • 44. Misleading Data Visualization Examples. In fact, the number of gun deaths increased from about 500 to 800 after the law was passed. A correctly visualized graph would show this trend in the traditional form. Correct representation: accurate data visualizations require following conventions, while altered charts and selective storytelling can promote biased viewpoints. The corrected chart uses a conventional trend line to make data interpretation less prone to errors. Still, correlation does not mean causation: more investigation is needed to confirm the impact of gun laws on deaths.
  • 45. Misleading Data Visualization Examples. Example: biggest covid worries. What is wrong with this chart? A pie chart shows each component as a proportion of the whole, so all parts should add up to 1, or 100%. In this chart, for instance, around 48% of people may be concerned about getting the virus, but the categories overlap, so the percentages are not parts of a single whole. How do people interpret this chart? The chart wrongly depicts the percentage of each component. If you want to represent data as a pie chart, you should always do so by finding each part's proportion of the whole. If your data has an overlap, as in this case, you must represent it differently.
  • 46. Misleading Data Visualization Examples. Correct representation: there are various ways to accurately represent this data. One option is to use a Venn diagram to show the different categories and any potential overlap between them. Another option is to create a bar chart. Both methods are effective in conveying the information.
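Before choosing a pie chart, the rule stated above (all parts must add up to 100%) can be checked mechanically. A minimal sketch follows; the function name, the tolerance, and all survey values are hypothetical, except the 48% figure, which comes from the slide.

```python
def shares_fit_pie_chart(shares, tolerance=0.5):
    """Return True when percentage shares add up to 100 within the tolerance."""
    return abs(sum(shares.values()) - 100) <= tolerance

# Hypothetical multi-select survey: respondents could pick several worries,
# so the categories overlap and the percentages exceed 100 in total.
worries = {"Getting the virus": 48, "Family getting sick": 60, "Economy": 55}

# Hypothetical single-choice question: the answers are parts of one whole.
single_choice = {"Yes": 48, "No": 40, "Not sure": 12}

use_pie_for_worries = shares_fit_pie_chart(worries)
use_pie_for_single_choice = shares_fit_pie_chart(single_choice)
```

When the check fails, a bar chart (one bar per category) shows the same percentages without implying that they sum to a whole.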
  • 47. How can Data Visualization lie? Here are some ways in which data visualization can be used to lie. Distorting the scale: by manipulating the scale of an axis, differences in the data can be made to appear more or less significant than they actually are. For example, a bar graph whose axis starts at a value greater than zero can make differences between bars seem larger than they actually are. (Example chart: Money Raised.)
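The effect of a truncated axis can be quantified by comparing the ratio of the drawn bar heights with the ratio of the underlying values. The sketch below uses hypothetical "money raised" figures; the function name and numbers are illustrative only.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of drawn bar heights for values a and b when the axis starts at baseline."""
    return (b - baseline) / (a - baseline)

# Hypothetical amounts raised in two periods
last_year, this_year = 100.0, 110.0

honest = apparent_ratio(last_year, this_year)                 # axis starts at 0
truncated = apparent_ratio(last_year, this_year, baseline=90)  # axis starts at 90
```

With the axis starting at zero, the second bar is drawn 1.1 times as tall as the first, matching the real 10% difference. With the axis starting at 90, the second bar is drawn twice as tall.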
  • 48. How can Data Visualization lie? Cherry-picking data: by selectively choosing which data to include or exclude, the visualization can be made to support a particular conclusion. This is known as "cherry-picking" and can be used to misrepresent the overall trend or pattern of the data. For example: emerging markets are a very volatile asset, and depending on when you look at them they can be all over the map. Let’s look at the timeframes that each source used. As you can see, each average return is completely accurate, but no one average return tells the full story. The source with the most data has the most representative long-term number, but it hides the fact that emerging markets grew massively in the decade between 1983 and 1993 and have done pretty poorly for the last 20 years. The shortest source includes all data since its index fund was founded, but excludes the remarkable run that drove people to start it in the first place. So which number should you use for your own decision making? That’s where the cherry-picking comes in. (Source: https://portfoliocharts.com/2016/03/29/the-avoidable-mistake-of-cherry-picking-data/)
  • 49. How can Data Visualization lie? Another example: below is an example using the number of "leads" generated over the course of 10 weeks. NOTE: assume week 8 is simply an anomaly. There were no extra marketing efforts made, just one great, random week. Leads generated: Week 1: 20; Week 2: 20; Week 3: 30; Week 4: 10; Week 5: 10; Week 6: 10; Week 7: 10; Week 8: 80; Week 9: 10; Week 10: 10. Two very different graphical representations were created to illustrate how formatting can be manipulated to produce a large misrepresentation of data. The first is a bar graph showing the number of leads generated per week. Logically, this tells us that after a decent showing in weeks 1-3, and excluding an abnormal week 8, leads have trended down from 20-30 per week to 10. The next slide illustrates the exact same data set using different formatting.
  • 50. How can Data Visualization lie? Here we’re looking at a linear trend line of the above data. You’ll notice that the actual data line has been removed and the Y-axis has been limited to show the maxi…
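The way a single anomalous week changes a linear trend line can be checked with the leads data from the previous slide. This sketch assumes the trend line is an ordinary least-squares fit (the slide does not say how it was computed); the function name is hypothetical.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

weeks = list(range(1, 11))
leads = [20, 20, 30, 10, 10, 10, 10, 80, 10, 10]  # data from the slide

# Trend over all ten weeks, including the anomalous week 8
slope_all = least_squares_slope(weeks, leads)

# Same fit with week 8 left out
kept = [(w, l) for w, l in zip(weeks, leads) if w != 8]
slope_without_week_8 = least_squares_slope(
    [w for w, _ in kept], [l for _, l in kept]
)
```

With week 8 included, the slope is positive, so a trend line drawn alone suggests growth. Without week 8, the slope is negative, which matches the downtrend visible in the bar graph.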
  • 51. How can Data Visualization lie? Omitting context: by omitting important context or background information, the visualization can be made to appear more significant or less significant than it actually is. A visualization of crime that does not account for changes in population over time can be misleading. For example, if the population of a city increases over time but the number of crimes remains constant, a chart of raw counts suggests that nothing has changed, even though the crime rate per person has actually decreased. This is why it is important to use per capita rates when comparing crime rates over time or between different locations.
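The per capita adjustment described above is a single division. A minimal sketch with hypothetical figures (a constant number of crimes in a growing city); the function name and all numbers are illustrative.

```python
def rate_per_100k(crimes, population):
    """Crimes per 100,000 residents."""
    return crimes * 100_000 / population

# Hypothetical city: (number of crimes, population) per year
city = {2010: (500, 100_000), 2020: (500, 125_000)}

rates = {year: rate_per_100k(crimes, pop) for year, (crimes, pop) in city.items()}
```

The raw count is 500 in both years, while the rate falls from 500 to 400 per 100,000 residents, which is the change a raw-count chart would hide.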
  • 52. How can Data Visualization lie? Overgeneralizing: by presenting data in a way that overgeneralizes or oversimplifies complex phenomena, the visualization can be used to support a particular conclusion or narrative. For example, using a single data point to represent the entire population can be misleading. Overall, data visualization can be used to lie or mislead if the designer intentionally distorts the data, omits important context, or misrepresents the data in a way that supports a particular agenda or point of view. It is important to critically evaluate visualizations and to verify the accuracy and validity of the data presented.
  • 53. How to avoid data misrepresentation? Unethical manipulation of data in the analysis phase. Problem statement: have you defined your problem clearly, with the required variables? Data collection: use a correct and appropriate source; use random sampling; ensure correct representation. Data analysis: use pre-determined criteria; avoid p-value hacking; use a uniform methodology. Unethical manipulation of data in the visualization phase. Type of visualization: use a visualization that enables correct inference. Visualization methodology: unclutter the data by showing only the top variables; use standard and meaningful scales for axes and data; don’t hide unfavorable findings.
  • 54. How to avoid data misrepresentation? Incompetency errors. Problem statement: use labels for axes and titles for the visualization; use pie charts sparingly (they may be appropriate for percentage data); normalize data that has a lot of variance. Data analysis: data that has random fluctuations should be averaged to smooth out these variations. Inconsistency errors. Visualization technique: use a consistent and commonly used scale; use the same scale for different charts that show similar data. Data representation: refrain from using too many variables; never represent unrelated variables as related ones.
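One simple way to average out random fluctuations, as recommended above, is a moving average. The slide does not name a specific method, so this is only one possible choice; the function name, window size, and data are hypothetical.

```python
def moving_average(values, window):
    """Mean of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

daily_visits = [12, 18, 9, 15, 21, 11, 14]  # hypothetical noisy series
smoothed = moving_average(daily_visits, window=3)
```

The smoothed series has `window - 1` fewer points than the original, and a larger window smooths more but hides short-lived changes, so the window size should be stated alongside the chart.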
  • 56. Enhances Understanding: data visualization can help people understand complex data by presenting it in a more intuitive and accessible way. It can reveal patterns, trends, and relationships that may not be apparent in raw data. Improves Decision-making: data visualization provides decision-makers with a clear and concise representation of data, enabling them to quickly identify trends and patterns and to base their decisions on the insights gained. Increases Engagement: data visualization can make data more engaging and interesting by presenting it in an interactive and visually appealing way, which encourages people to explore the data further. Facilitates Communication: data visualization can be used to communicate complex data and information to a wider audience. It can help to simplify complex concepts and ideas, making them easier to understand and communicate. Enables Exploration: data visualization can enable people to explore data in a more interactive way. By providing tools for filtering, sorting, and drilling down into the data, it can help people uncover insights and gain a deeper understanding of the data. Overall, data visualization plays a critical role in helping people understand, analyze, and communicate complex data and information. It is an essential tool for decision-making, problem-solving, and exploring data in a meaningful and impactful way.
  • 57. Data visualization is the graphical representation of information and data in a visual format (charts, graphs, and maps). Data visualization tools provide an accessible way to see and understand trends, patterns, and outliers in data. Data visualization tools and technologies are essential for analyzing massive amounts of information and making data-driven decisions. Using pictures to understand data is a practice that goes back centuries. General types of data visualization are charts, tables, graphs, maps, and dashboards.