EDA Visualization
Orozco Hsu
2024-03-20
About me
• Education
• NCU (MIS), NCCU (CS)
• Experiences
• Telecom big data Innovation
• Retail Media Network (RMN)
• Customer Data Platform (CDP)
• Know-your-customer (KYC)
• Digital Transformation
• Research
• Data Ops (ML Ops)
• Business Data Analysis, AI
Tutorial content
• Storytelling and visualization
• What is data visualization?
• Exploratory data analysis and visualization
• Homework
Code
• Download materials:
• https://drive.google.com/drive/folders/1ibppjANnGy2RYe5CW805MwHrprm2nu5f?usp=sharing
Recommended book for learning Python
• 史上最強Python入門邁向頂尖高手之路王者歸來
https://www.books.com.tw/products/0010976050?sloc=main
Python visualization packages
https://jovian.com/aakashns/python-matplotlib-data-visualization
Get Orange 3 ready
• Open-source machine learning and data visualization
• Version: 3.36.2
• https://orangedatamining.com/
Storytelling with Data (SWD)
• Always remember data comparison!
• Focus on simplicity and ease of interpretation
• The takeaways!
https://www.storytellingwithdata.com
From touchdowns to takeaways
Sorting categories
A vertical bar chart can be a better choice if the data is ordinal.
Allow the labels to be written in a single, easily readable line.
Rainbow palette: overly distracting!
• If the goal is to observe the "fluctuation of commercials across categories over the five years", we could better achieve that by iterating to a different graph type.
• On the other hand, if we are meant simply to compare the overall category trends, "toning down the color" usage might be beneficial.
Color in only the year with the highest number of commercials in each category
This results in visual chaos!
"Over time" means a line graph
An overly complex visualization with numerous overlapping data series
In order of total number of commercials
across all five years of data
The number of commercial advertisers in each category, in each year, is a countable quantity.
By using bar charts instead of line graphs, we can intentionally emphasize that aspect of our data.
The area graph as a small-multiples chart
A visualization of this on social media.
It maintains visual interest while facilitating more straightforward
comparisons across categories over several years.
A combination of line graphs with descriptive
captions to convey these insights more clearly
Conclusion
• There is no singularly correct approach to data visualization.
• The key is to consider the audience's needs, the context of the
presentation, and the intended message.
• Visualizing data is as much an art as it is a science, requiring
experimentation, iteration, and feedback, rather than adherence to a
strict set of rules.
• It is all about communication!
https://www.storytellingwithdata.com/blog
What is data visualization?
• Data visualization is the graphical representation of information and data.
• It uses visual elements such as charts, graphs, and maps.
• It offers a way to see and understand trends, outliers, and patterns in data.
https://www.tableau.com/learn/articles/data-visualization#advantages-disadvantages
The Pyramid of Data Needs (and why it matters for your career) | by Hugh Williams | Medium
Static chart
• There are generally three steps in drawing a chart: observe the data, determine the relationship, and select the chart.
• Identify what type of data it is and what content you want to express:
• Category
• Numeric
• Text
• Datetime
• After clarifying the content to be expressed, choose the chart that expresses it best.
Pie chart
• You must have some kind of whole
amount that is divided into a number
of distinct parts.
• Your primary objective in a pie chart
should be to compare each group’s
contribution to the whole.
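The part-to-whole idea behind a pie chart can be sketched in a few lines of Python. This is a minimal illustration, not part of the course materials: the function name `shares_of_whole` and the sample port labels are made up for the example.

```python
from collections import Counter


def shares_of_whole(labels):
    """Return each category's percentage contribution to the whole."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


# Made-up category labels, for illustration only.
ports = ["S", "S", "C", "Q", "S", "C", "S", "S"]
shares = shares_of_whole(ports)

# Print the slices from largest to smallest, as a sorted pie would show them.
for port, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{port}: {pct:.1f}%")
```

The percentages always sum to 100, which is the property that makes a pie chart meaningful; data without a natural whole does not fit this chart type.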
Line chart
• Line charts provide the clearest
graphical representation of time-
related variables and are the
preferred mode for representing
trends or variables over time.
Histogram chart
• It is used to summarize discrete
or continuous data that are
measured on an interval scale.
• It is often used to illustrate the
major features of the distribution
of the data in a convenient form.
Bar chart
• It shows data values as bars so that multiple categories or data sets can be compared side by side.
Differences between histogram and bar chart
Comparison term | Bar chart | Histogram
Usage | To compare different categories of data. | To display the distribution of a variable.
Type of variable | Categorical variables | Numeric variables
Rendering | Each data point is rendered as a separate bar. | The data points are grouped and rendered based on the bin value. The entire range of data values is divided into a series of non-overlapping intervals.
Space between bars | Can have space. | No space.
Reordering bars | Can be reordered. | Cannot be reordered.
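The rendering difference in the comparison above can be made concrete with a short sketch. It assumes fixed-width bins starting at zero; the function names and the sample values are hypothetical and chosen only for illustration.

```python
from collections import Counter


def bar_counts(categories):
    """Bar chart input: one count per distinct category."""
    return Counter(categories)


def histogram_counts(values, bin_width):
    """Histogram input: counts per interval [start, start + bin_width)."""
    # Floor division maps every value to the left edge of its bin,
    # so the bins never overlap and cover the whole range.
    bins = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(bins.items()))


ages = [2, 15, 22, 24, 29, 31, 38, 45, 47, 63]  # made-up numeric values
sexes = ["male", "female", "male"]               # made-up categories

age_bins = histogram_counts(ages, 20)
sex_bars = bar_counts(sexes)
print(age_bins)
print(sex_bars)
```

The histogram keys have a fixed numeric order, which is why its bars cannot be reordered, whereas the bar chart categories can be sorted in any way that helps the comparison.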
Scatter Plot
• It uses dots to represent values of two numeric variables so that the relationship between them can be observed.
Pearson Correlation
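As a rough companion to the scatter plot, the Pearson correlation coefficient summarizes how close the dots lie to a straight line. The sketch below computes it directly from its definition; the function name `pearson_r` and the sample values are assumptions made for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


xs = [1, 2, 3, 4]          # made-up values
rising = [2, 4, 6, 8]      # increases with xs
falling = [8, 6, 4, 2]     # decreases with xs

r_rising = pearson_r(xs, rising)
r_falling = pearson_r(xs, falling)
print(r_rising, r_falling)
```

A value near +1 indicates a positive linear relationship, a value near -1 a negative one, and a value near 0 little or no linear relationship. The coefficient is undefined when either variable is constant.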
Box plot
• Q1: The first quartile (25th percentile).
• Q3: The third quartile (75th percentile).
• Interquartile range (IQR): Q3 − Q1.
• Whiskers: They extend at most 1.5 × IQR beyond Q1 and Q3 and mark the boundaries for the outliers.
• Outliers: Observations that fall below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR.
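The box plot quantities listed above can be computed with Python's standard `statistics` module. This is a sketch under one assumption worth stating: quartile conventions differ between tools, and the "inclusive" method used here may give slightly different Q1 and Q3 values than a given plotting tool. The sample values are made up.

```python
from statistics import quantiles


def box_plot_stats(values):
    """Return Q1, Q3, IQR, the 1.5 * IQR fences, and the outliers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return {"q1": q1, "q3": q3, "iqr": iqr,
            "lower": lower, "upper": upper, "outliers": outliers}


values = [1, 2, 3, 4, 5, 6, 7, 8, 100]  # made-up values with one extreme point
box = box_plot_stats(values)
print(box)
```

Only the value 100 falls outside the fences here, so it would be drawn as a separate outlier dot while the whiskers stop at the most extreme values inside the fences.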
New workflow
Add some widgets: File and Data Table
Open Orange workflow
• Double-click 01.ows
Modify your output file path
• Check each Python Script widget and change the old path to a path that exists on your machine.
Dataset description (titanic.csv)
• 12 columns in total.
• A training dataset for predicting whether a passenger survived the Titanic disaster.
Data Summary
• Load titanic.csv
• Data description
• Look at the Name, Type, Role, and Values columns in the table.
• Change the column configuration as needed.
Data Summary
• Missing values
• Use the Feature Statistics widget.
• What are the missing-value ratios?
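The missing-value ratio that the widget reports can also be checked by hand. The following is a minimal sketch, not the widget's implementation: the set of tokens treated as missing, the function name, and the sample rows are all assumptions for illustration.

```python
def missing_ratios(rows, missing_tokens=("", "?", "NA")):
    """Fraction of missing cells per column for a list of dict rows."""
    ratios = {}
    for column in rows[0].keys():
        missing = sum(
            1 for row in rows
            if row[column] is None or str(row[column]).strip() in missing_tokens
        )
        ratios[column] = missing / len(rows)
    return ratios


# Made-up rows; the column names are assumed, and empty strings mark missing cells.
rows = [
    {"Age": "22", "Cabin": "", "Embarked": "S"},
    {"Age": "", "Cabin": "C85", "Embarked": "C"},
    {"Age": "26", "Cabin": "", "Embarked": "S"},
    {"Age": "35", "Cabin": "", "Embarked": ""},
]
ratios = missing_ratios(rows)
for column, ratio in ratios.items():
    print(f"{column}: {ratio:.0%} missing")
```

A column with a very high ratio is a candidate for removal, while a column with a low ratio is usually a better candidate for imputation.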
Remove columns (data preprocessing)
• Use the Select Columns widget.
Impute columns (data preprocessing)
• Use the Impute widget.
• Set the default method.
• Set a method for each individual column.
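One common imputation strategy, filling numeric columns with the mean and categorical columns with the most frequent value, can be sketched as follows. This illustrates the general idea only and does not claim to match how the widget works internally; the function name and the sample columns are hypothetical.

```python
from statistics import mean, mode


def impute_column(values, strategy):
    """Replace None with the column mean ("mean") or most frequent value ("mode")."""
    present = [v for v in values if v is not None]
    fill = mean(present) if strategy == "mean" else mode(present)
    return [fill if v is None else v for v in values]


ages = [22, None, 26, 36]        # made-up numeric column
ports = ["S", "C", None, "S"]    # made-up categorical column

ages_filled = impute_column(ages, "mean")
ports_filled = impute_column(ports, "mode")
print(ages_filled)
print(ports_filled)
```

Choosing the strategy per column matters because a mean is meaningless for categories, and a most-frequent fill can distort a skewed numeric column.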
Pie chart
• Orange 3 has deprecated the Pie Chart widget.
• Use the Python Script widget instead.
Line chart
• Use the Line Plot widget.
• Trend analysis charts are typically presented with time-based data.
Distribution chart
• Use the Distributions widget to compare variables.
Scatter plot
• Use the Scatter Plot widget.
• It is used to observe the degree of correlation between features:
• positive correlation
• negative correlation
• no correlation
Box plot
• Use the Box Plot widget.
• Compare multiple features with each other.
Pivot Table
• Use the Pivot Table widget.
• It summarizes the data of a more extensive table into a table of statistics.
• The statistics can include sums, averages, counts, etc.
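What a pivot table does can be sketched as "group by two keys, then aggregate". The example below is an illustration only: the function name `pivot_table`, the column names, and the rows are assumptions and are not taken from titanic.csv.

```python
from collections import defaultdict
from statistics import mean


def pivot_table(rows, row_key, col_key, value_key, agg):
    """Group value_key by (row_key, col_key) and reduce each group with agg."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row[row_key], row[col_key])].append(row[value_key])
    return {cell: agg(values) for cell, values in groups.items()}


# Made-up rows for illustration.
rows = [
    {"Sex": "female", "Pclass": 1, "Fare": 80.0},
    {"Sex": "female", "Pclass": 1, "Fare": 60.0},
    {"Sex": "male", "Pclass": 3, "Fare": 8.0},
    {"Sex": "male", "Pclass": 3, "Fare": 7.0},
    {"Sex": "male", "Pclass": 1, "Fare": 50.0},
]

counts = pivot_table(rows, "Sex", "Pclass", "Fare", len)
averages = pivot_table(rows, "Sex", "Pclass", "Fare", mean)
for cell in sorted(counts):
    print(cell, counts[cell], averages[cell])
```

Swapping the `agg` argument (for example `sum`, `len`, or `mean`) changes the statistic in each cell without changing the grouping.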
1. Show me top 10 data rows
• Hint: Use Data Sampler widget
2. Show me dataset info
• How many rows?
• How many features?
• Any other summary information of this kind.
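For readers who want to cross-check the first two exercises outside Orange, the row count, the feature count, and the first rows can be read with the standard `csv` module. The inline sample below stands in for a real file; its column names and values are assumptions made for the example.

```python
import csv
import io
from itertools import islice

# Made-up sample standing in for a CSV file.
SAMPLE_CSV = """PassengerId,Survived,Pclass,Sex,Age
1,0,3,male,22
2,1,1,female,38
3,1,3,female,26
4,0,2,male,
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def dataset_info(rows):
    """Return the number of rows and the list of feature (column) names."""
    features = list(rows[0].keys()) if rows else []
    return {"rows": len(rows), "features": features}


rows = load_rows(SAMPLE_CSV)
info = dataset_info(rows)
print(f"{info['rows']} rows, {len(info['features'])} features")

# First 10 rows, or fewer if the data is shorter.
for row in islice(rows, 10):
    print(row)
```

To run this against a file on disk, the `io.StringIO(text)` argument could be replaced by a file object opened with `open(path, newline="")`.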
3. Get a count of the number of survivors
4. Survival Conclusion
• Features considered: SEX, PCLASS, SIBSP, PARCH, EMBARKED.
• Women had a higher chance of survival than men.
• First-class passengers had a higher chance of survival.
• Passengers with siblings or spouses aboard had a higher chance of survival.
• Passengers with children or parents aboard had a higher chance of survival.
• Embarking at port S may be associated with a lower cabin class and a lower chance of survival.
5. Show the survival rate by SEX
6. Look at survival rate by SEX and PCLASS
• Women in first class had a survival rate as high as 96.8%. In contrast, men in third class had only a 13.54% chance of survival.
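A survival rate per group is the number of survivors in the group divided by the group size. The sketch below shows that calculation for one or several grouping keys. The rows are made up, so the printed rates do not reproduce the percentages quoted on the slides, and the column names `Sex`, `Pclass`, and `Survived` are assumed.

```python
from collections import defaultdict


def survival_rate(rows, keys):
    """Return {group: share of survivors} for groups defined by the given keys."""
    totals = defaultdict(int)
    survived = defaultdict(int)
    for row in rows:
        group = tuple(row[key] for key in keys)
        totals[group] += 1
        survived[group] += row["Survived"]  # 1 = survived, 0 = did not
    return {group: survived[group] / totals[group] for group in totals}


# Made-up rows for illustration only.
rows = [
    {"Sex": "female", "Pclass": 1, "Survived": 1},
    {"Sex": "female", "Pclass": 1, "Survived": 1},
    {"Sex": "female", "Pclass": 3, "Survived": 0},
    {"Sex": "female", "Pclass": 3, "Survived": 1},
    {"Sex": "male", "Pclass": 1, "Survived": 0},
    {"Sex": "male", "Pclass": 3, "Survived": 0},
    {"Sex": "male", "Pclass": 3, "Survived": 0},
    {"Sex": "male", "Pclass": 3, "Survived": 1},
]

by_sex = survival_rate(rows, ["Sex"])
by_sex_class = survival_rate(rows, ["Sex", "Pclass"])
for group, rate in sorted(by_sex_class.items()):
    print(group, f"{rate:.1%}")
```

Adding a key to the `keys` list refines the breakdown, which mirrors moving from a one-variable chart to a chart split by a second variable.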
7. Look at survival rate by SEX, AGE, and PCLASS
• In the event of a disaster, women in first or second class have a 90% chance of survival regardless of age.
• On the other hand, a man in third class who is older than 18 has only a 13.36% chance of survival.
• To summarize, in a disaster scenario, girls and women have a higher chance of survival than boys and men.
• Additionally, the higher the class (such as first class), the higher the chance of survival.
8. The fare paid in each class
• Plot Fare against Pclass to visualize the data.
• Every class had someone who boarded for free, while others spent over 500 pounds for a first-class ticket. It's quite an interesting observation!
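The spread of fares within each class can be summarized numerically before plotting. This is a small sketch with made-up fares; the column names `Pclass` and `Fare` follow the slide, while the function name and the values are assumptions.

```python
from collections import defaultdict
from statistics import median


def fare_summary(rows):
    """Return the minimum, median, and maximum fare for each class."""
    fares = defaultdict(list)
    for row in rows:
        fares[row["Pclass"]].append(row["Fare"])
    return {
        pclass: {"min": min(values), "median": median(values), "max": max(values)}
        for pclass, values in sorted(fares.items())
    }


# Made-up fares for illustration only.
rows = [
    {"Pclass": 1, "Fare": 0.0}, {"Pclass": 1, "Fare": 80.0}, {"Pclass": 1, "Fare": 500.0},
    {"Pclass": 2, "Fare": 0.0}, {"Pclass": 2, "Fare": 13.0}, {"Pclass": 2, "Fare": 26.0},
    {"Pclass": 3, "Fare": 0.0}, {"Pclass": 3, "Fare": 8.0},
]

summary = fare_summary(rows)
for pclass, stats in summary.items():
    print(pclass, stats)
```

A minimum of zero next to a very large maximum is the kind of pattern that a box plot of Fare per Pclass makes visible at a glance.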
9. Visualize data and express your thoughts
• Using what was taught today and referencing Story_telling_with_data.pdf, please visualize and analyze this data (20240320_HW.csv) with the theme of sales.
• Based on your observations, explain the relationship between sales and these variables.

More Related Content

Similar to 資料視覺化_透過Orange3進行_無須寫程式直接使用_碩士學程_202403.pdf

Exploratory Data Analysis week 4
Exploratory Data Analysis week 4Exploratory Data Analysis week 4
Exploratory Data Analysis week 4Manzur Ashraf
 
Data Visualization in Data Science
Data Visualization in Data ScienceData Visualization in Data Science
Data Visualization in Data ScienceMaloy Manna, PMP®
 
EXPLORATORY DATA ANALYSIS
EXPLORATORY DATA ANALYSISEXPLORATORY DATA ANALYSIS
EXPLORATORY DATA ANALYSISBabasID2
 
Guidelines for data visualisation: eye vegetables and eye candy
Guidelines for data visualisation: eye vegetables and eye candyGuidelines for data visualisation: eye vegetables and eye candy
Guidelines for data visualisation: eye vegetables and eye candyJen Stirrup
 
Data Visualization Tips for Oracle BICS and DVCS
Data Visualization Tips for Oracle BICS and DVCSData Visualization Tips for Oracle BICS and DVCS
Data Visualization Tips for Oracle BICS and DVCSEdelweiss Kammermann
 
Data visualization.pptx
Data visualization.pptxData visualization.pptx
Data visualization.pptxnaveen shyam
 
Visual Analytics in Big Data
Visual Analytics in Big DataVisual Analytics in Big Data
Visual Analytics in Big DataSaurabh Shanbhag
 
IBANK - Big data www.ibank.uk.com 07474222079
IBANK - Big data www.ibank.uk.com 07474222079IBANK - Big data www.ibank.uk.com 07474222079
IBANK - Big data www.ibank.uk.com 07474222079ibankuk
 
Alex Korbonits, "AUC at what costs?" Seattle DAML June 2016
Alex Korbonits, "AUC at what costs?" Seattle DAML June 2016Alex Korbonits, "AUC at what costs?" Seattle DAML June 2016
Alex Korbonits, "AUC at what costs?" Seattle DAML June 2016Seattle DAML meetup
 
Quality Tools & Techniques Presentation.pptx
Quality Tools & Techniques Presentation.pptxQuality Tools & Techniques Presentation.pptx
Quality Tools & Techniques Presentation.pptxSAJIDAli83655
 
Data analytics and visualization
Data analytics and visualizationData analytics and visualization
Data analytics and visualizationVini Vasundharan
 
Descriptive Statistics
Descriptive StatisticsDescriptive Statistics
Descriptive StatisticsCIToolkit
 
Ml conference slides boston june 2019
Ml conference slides boston june 2019Ml conference slides boston june 2019
Ml conference slides boston june 2019QuantUniversity
 

Similar to 資料視覺化_透過Orange3進行_無須寫程式直接使用_碩士學程_202403.pdf (20)

Exploratory Data Analysis week 4
Exploratory Data Analysis week 4Exploratory Data Analysis week 4
Exploratory Data Analysis week 4
 
Data Visualization in Data Science
Data Visualization in Data ScienceData Visualization in Data Science
Data Visualization in Data Science
 
Data visualization
Data visualizationData visualization
Data visualization
 
EXPLORATORY DATA ANALYSIS
EXPLORATORY DATA ANALYSISEXPLORATORY DATA ANALYSIS
EXPLORATORY DATA ANALYSIS
 
Big data
Big dataBig data
Big data
 
Big data
Big dataBig data
Big data
 
Hadoop PDF
Hadoop PDFHadoop PDF
Hadoop PDF
 
Guidelines for data visualisation: eye vegetables and eye candy
Guidelines for data visualisation: eye vegetables and eye candyGuidelines for data visualisation: eye vegetables and eye candy
Guidelines for data visualisation: eye vegetables and eye candy
 
Data Visualization Tips for Oracle BICS and DVCS
Data Visualization Tips for Oracle BICS and DVCSData Visualization Tips for Oracle BICS and DVCS
Data Visualization Tips for Oracle BICS and DVCS
 
Data visualization.pptx
Data visualization.pptxData visualization.pptx
Data visualization.pptx
 
Skillwise Big data
Skillwise Big dataSkillwise Big data
Skillwise Big data
 
Visual Analytics in Big Data
Visual Analytics in Big DataVisual Analytics in Big Data
Visual Analytics in Big Data
 
IBANK - Big data www.ibank.uk.com 07474222079
IBANK - Big data www.ibank.uk.com 07474222079IBANK - Big data www.ibank.uk.com 07474222079
IBANK - Big data www.ibank.uk.com 07474222079
 
Alex Korbonits, "AUC at what costs?" Seattle DAML June 2016
Alex Korbonits, "AUC at what costs?" Seattle DAML June 2016Alex Korbonits, "AUC at what costs?" Seattle DAML June 2016
Alex Korbonits, "AUC at what costs?" Seattle DAML June 2016
 
Quality Tools & Techniques Presentation.pptx
Quality Tools & Techniques Presentation.pptxQuality Tools & Techniques Presentation.pptx
Quality Tools & Techniques Presentation.pptx
 
Data analytics and visualization
Data analytics and visualizationData analytics and visualization
Data analytics and visualization
 
EDA.pptx
EDA.pptxEDA.pptx
EDA.pptx
 
Big data
Big dataBig data
Big data
 
Descriptive Statistics
Descriptive StatisticsDescriptive Statistics
Descriptive Statistics
 
Ml conference slides boston june 2019
Ml conference slides boston june 2019Ml conference slides boston june 2019
Ml conference slides boston june 2019
 

More from FEG

Sequence Model pytorch at colab with gpu.pdf
Sequence Model pytorch at colab with gpu.pdfSequence Model pytorch at colab with gpu.pdf
Sequence Model pytorch at colab with gpu.pdfFEG
 
學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdf
學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdf學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdf
學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdfFEG
 
Pytorch cnn netowork introduction 20240318
Pytorch cnn netowork introduction 20240318Pytorch cnn netowork introduction 20240318
Pytorch cnn netowork introduction 20240318FEG
 
2023 Decision Tree analysis in business practices
2023 Decision Tree analysis in business practices2023 Decision Tree analysis in business practices
2023 Decision Tree analysis in business practicesFEG
 
2023 Clustering analysis using Python from scratch
2023 Clustering analysis using Python from scratch2023 Clustering analysis using Python from scratch
2023 Clustering analysis using Python from scratchFEG
 
2023 Data visualization using Python from scratch
2023 Data visualization using Python from scratch2023 Data visualization using Python from scratch
2023 Data visualization using Python from scratchFEG
 
2023 Supervised Learning for Orange3 from scratch
2023 Supervised Learning for Orange3 from scratch2023 Supervised Learning for Orange3 from scratch
2023 Supervised Learning for Orange3 from scratchFEG
 
2023 Supervised_Learning_Association_Rules
2023 Supervised_Learning_Association_Rules2023 Supervised_Learning_Association_Rules
2023 Supervised_Learning_Association_RulesFEG
 
202312 Exploration Data Analysis Visualization (English version)
202312 Exploration Data Analysis Visualization (English version)202312 Exploration Data Analysis Visualization (English version)
202312 Exploration Data Analysis Visualization (English version)FEG
 
Transfer Learning (20230516)
Transfer Learning (20230516)Transfer Learning (20230516)
Transfer Learning (20230516)FEG
 
Image Classification (20230411)
Image Classification (20230411)Image Classification (20230411)
Image Classification (20230411)FEG
 
Google CoLab (20230321)
Google CoLab (20230321)Google CoLab (20230321)
Google CoLab (20230321)FEG
 
Supervised Learning
Supervised LearningSupervised Learning
Supervised LearningFEG
 
UnSupervised Learning Clustering
UnSupervised Learning ClusteringUnSupervised Learning Clustering
UnSupervised Learning ClusteringFEG
 
6_Association_rule_碩士班第六次.pdf
6_Association_rule_碩士班第六次.pdf6_Association_rule_碩士班第六次.pdf
6_Association_rule_碩士班第六次.pdfFEG
 
5_Neural_network_碩士班第五次.pdf
5_Neural_network_碩士班第五次.pdf5_Neural_network_碩士班第五次.pdf
5_Neural_network_碩士班第五次.pdfFEG
 
4_Regression_analysis.pdf
4_Regression_analysis.pdf4_Regression_analysis.pdf
4_Regression_analysis.pdfFEG
 
3_Decision_tree.pdf
3_Decision_tree.pdf3_Decision_tree.pdf
3_Decision_tree.pdfFEG
 
2_Clustering.pdf
2_Clustering.pdf2_Clustering.pdf
2_Clustering.pdfFEG
 
1_大二班_資料視覺化_20221028.pdf
1_大二班_資料視覺化_20221028.pdf1_大二班_資料視覺化_20221028.pdf
1_大二班_資料視覺化_20221028.pdfFEG
 

More from FEG (20)

Sequence Model pytorch at colab with gpu.pdf
Sequence Model pytorch at colab with gpu.pdfSequence Model pytorch at colab with gpu.pdf
Sequence Model pytorch at colab with gpu.pdf
 
學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdf
學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdf學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdf
學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdf
 
Pytorch cnn netowork introduction 20240318
Pytorch cnn netowork introduction 20240318Pytorch cnn netowork introduction 20240318
Pytorch cnn netowork introduction 20240318
 
2023 Decision Tree analysis in business practices
2023 Decision Tree analysis in business practices2023 Decision Tree analysis in business practices
2023 Decision Tree analysis in business practices
 
2023 Clustering analysis using Python from scratch
2023 Clustering analysis using Python from scratch2023 Clustering analysis using Python from scratch
2023 Clustering analysis using Python from scratch
 
2023 Data visualization using Python from scratch
2023 Data visualization using Python from scratch2023 Data visualization using Python from scratch
2023 Data visualization using Python from scratch
 
2023 Supervised Learning for Orange3 from scratch
2023 Supervised Learning for Orange3 from scratch2023 Supervised Learning for Orange3 from scratch
2023 Supervised Learning for Orange3 from scratch
 
2023 Supervised_Learning_Association_Rules
2023 Supervised_Learning_Association_Rules2023 Supervised_Learning_Association_Rules
2023 Supervised_Learning_Association_Rules
 
202312 Exploration Data Analysis Visualization (English version)
202312 Exploration Data Analysis Visualization (English version)202312 Exploration Data Analysis Visualization (English version)
202312 Exploration Data Analysis Visualization (English version)
 
Transfer Learning (20230516)
Transfer Learning (20230516)Transfer Learning (20230516)
Transfer Learning (20230516)
 
Image Classification (20230411)
Image Classification (20230411)Image Classification (20230411)
Image Classification (20230411)
 
Google CoLab (20230321)
Google CoLab (20230321)Google CoLab (20230321)
Google CoLab (20230321)
 
Supervised Learning
Supervised LearningSupervised Learning
Supervised Learning
 
UnSupervised Learning Clustering
UnSupervised Learning ClusteringUnSupervised Learning Clustering
UnSupervised Learning Clustering
 
6_Association_rule_碩士班第六次.pdf
6_Association_rule_碩士班第六次.pdf6_Association_rule_碩士班第六次.pdf
6_Association_rule_碩士班第六次.pdf
 
5_Neural_network_碩士班第五次.pdf
5_Neural_network_碩士班第五次.pdf5_Neural_network_碩士班第五次.pdf
5_Neural_network_碩士班第五次.pdf
 
4_Regression_analysis.pdf
4_Regression_analysis.pdf4_Regression_analysis.pdf
4_Regression_analysis.pdf
 
3_Decision_tree.pdf
3_Decision_tree.pdf3_Decision_tree.pdf
3_Decision_tree.pdf
 
2_Clustering.pdf
2_Clustering.pdf2_Clustering.pdf
2_Clustering.pdf
 
1_大二班_資料視覺化_20221028.pdf
1_大二班_資料視覺化_20221028.pdf1_大二班_資料視覺化_20221028.pdf
1_大二班_資料視覺化_20221028.pdf
 

Recently uploaded

1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...Pooja Nehwal
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...fonyou31
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 

Recently uploaded (20)

1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 

資料視覺化_透過Orange3進行_無須寫程式直接使用_碩士學程_202403.pdf

  • 2. About me • Education • NCU (MIS)、NCCU (CS) • Experiences • Telecom big data Innovation • Retail Media Network (RMN) • Customer Data Platform (CDP) • Know-your-customer (KYC) • Digital Transformation • Research • Data Ops (ML Ops) • Business Data Analysis, AI 2
  • 3. Tutorial Content 3 Story telling and visualization Exploration Data Analysis and Visualization Home work What is data visualization?
  • 4. Code • Download materials: • https://drive.google.com/drive/folders/1ibppjANnGy2RYe5CW805MwHrprm2 nu5f?usp=sharing 4
  • 5. 學習 Python 的建議書籍 • 史上最強Python入門邁向頂尖高手之路王者歸來 5 https://www.books.com.tw/products/0010976050?sloc=main
  • 7. Get ready to your Orange 3 • Open source machine learning and data visualization • Version: 3.36.2 • https://orangedatamining.com/ 7
  • 8. Story telling With Data (SWD) • Always remember Data Comparison! • Focus on simplicity and ease of interpretation • The takeaways! 8 https://www.storytellingwithdata.com
  • 9. From touchdowns to takeaways 9
  • 10. Sorting categories 10 A vertical bar chart can be a better choice if data is ordinal
  • 11. Allow the labels to be written in a single, easily readable line 11
  • 12. Rainbow palette, overly distracting! • If the goal is to observe the「fluctuation of commercials across categories over the five years」, we could better achieve that by iterating to a different graph type. • On the other hand, if we’re meant simply to compare the overall category trends,「toning down the color」usage might be beneficial. 12
  • 13. Color in only the year with the highest number of commercials in each category 13 This results in a visually chaotic! 2023 2022 Over-Time
  • 14. The Over-Time means the Line-Graph 14 An overly complex visualization with numerous overlapping data series
  • 15. In order of total number of commercials across all five years of data 15
  • 16. Bar charts instead of line graphs, we can intentionally emphasize that aspect of our data 16 The number of commercial advertisers in each category, in each year, is a countable
  • 17. The area graph small multiple chart 17 A visualization of this on social media. It maintains visual interest while facilitating more straightforward comparisons across categories over several years.
  • 18. A combination of line graphs with descriptive captions to convey these insights more clearly 18
  • 19. A combination of line graphs with descriptive captions to convey these insights more clearly 19
  • 20. A combination of line graphs with descriptive captions to convey these insights more clearly 20
  • 21. Conclusion • There is no singularly correct approach to data visualization. • The key is to consider the audience's needs, the context of the presentation, and the intended message. • Visualizing data is as much an art as it is a science, requiring experimentation, iteration, and feedback, rather than adherence to a strict set of rules. •All about communications! 21 https://www.storytellingwithdata.com/blog
  • 22. What is data visualization? • Data visualization is the graphical representation of information and data. • By using visual elements like charts, graphs, and maps. • A way to see and understand trends, outliers, and patterns in data. 22
  • 23. What is data visualization? 23 https://www.tableau.com/learn/articles/data-visualization#advantages-disadvantages
  • 24. 24 The Pyramid of Data Needs (and why it matters for your career) | by Hugh Williams | Medium
  • 25. 25 The Pyramid of Data Needs (and why it matters for your career) | by Hugh Williams | Medium
  • 26. Static chart • There are generally THREE STEPS in drawing a chart: • Observing the data, determine the relationship, and select the chart. • What type of data it is, and what content you want to express. • Category • Numeric • Text • Datetime • After clarifying the content to be expressed, you can choose which chart to use to express it. 26
  • 27. Pie chart • You must have some kind of whole amount that is divided into a number of distinct parts. • Your primary objective in a pie chart should be to compare each group’s contribution to the whole. 27
  • 28. Line chart • Line charts provide the clearest graphical representation of time- related variables and are the preferred mode for representing trends or variables over time. 28
  • 29. Histogram chart • It is used to summarize discrete or continuous data that are measured on an interval scale. • It is often used to illustrate the major features of the distribution of the data in a convenient form. 29
  • 30. Bar chart • It shows data values as bars so that multiple categories or data sets can be compared side by side. 30
  • 31. Differences between histogram and bar chart 31
      Comparison terms | Bar chart | Histogram
      Usage | To compare different categories of data. | To display the distribution of a variable.
      Type of variable | Categorical variables | Numeric variables
      Rendering | Each data point is rendered as a separate bar. | The data points are grouped and rendered based on the bin value. The entire range of data values is divided into a series of non-overlapping intervals.
      Space between bars | Can have space. | No space.
      Reordering bars | Can be reordered. | Cannot be reordered.
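The difference in rendering can be sketched in a few lines of standard-library Python: bar-chart data is a count per distinct category, while histogram data is a count per numeric interval. This is only a minimal sketch; the sample values, the function names, and the bin width of 20 are assumptions made for illustration and are not taken from the course materials.

```python
from collections import Counter

# Hypothetical sample values, for illustration only.
embarked = ["S", "C", "S", "Q", "S", "C"]   # categorical variable
ages = [22, 38, 26, 35, 54, 2, 27, 14]      # numeric variable


def category_counts(values):
    """Bar-chart data: one count per distinct category."""
    return dict(Counter(values))


def histogram_counts(values, bin_width, start=0):
    """Histogram data: counts per non-overlapping bin [lo, lo + bin_width)."""
    counts = Counter()
    for v in values:
        lo = start + ((v - start) // bin_width) * bin_width
        counts[(lo, lo + bin_width)] += 1
    # Bins have a natural order, so they are returned sorted by lower edge.
    return dict(sorted(counts.items()))


bar_data = category_counts(embarked)
hist_data = histogram_counts(ages, bin_width=20)
print(bar_data)
print(hist_data)
```

The categories in `bar_data` can be reordered freely (for example by count), whereas the bins in `hist_data` only make sense in numeric order.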
  • 32. Scatter Plot • It uses dots to represent values for two different numeric variables and observe relationships between variables. 32 Pearson Correlation
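The Pearson correlation mentioned on this slide summarizes the linear relationship seen in a scatter plot as a number between -1 and 1. The following is a minimal sketch using only the standard library; the function name `pearson_r` and the input lists are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: a perfectly increasing and a perfectly decreasing pair.
positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
negative = pearson_r([1, 2, 3], [3, 2, 1])
print(positive, negative)
```

Values near 1 indicate positive correlation, values near -1 indicate negative correlation, and values near 0 indicate no linear correlation.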
  • 33. Box plot • Q1: The first quartile (25%) position. • Q3: The third quartile (75%) position. • Interquartile range (IQR): Q3 − Q1. • Lower and upper 1.5*IQR whiskers: These represent the limits and boundaries for the outliers. • Outliers: Defined as observations that fall below Q1 − 1.5 IQR or above Q3 + 1.5 IQR. 33
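The quantities on this slide can be computed directly. The sketch below uses `statistics.quantiles` from the standard library with the "inclusive" method; other tools may use a slightly different quartile convention, so the exact Q1 and Q3 values can differ from what a given plotting tool shows. The function name `box_plot_summary` and the sample data are hypothetical.

```python
import statistics


def box_plot_summary(values):
    """Return quartiles, IQR, outlier fences, and outliers for numeric data."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Observations outside the fences are flagged as outliers.
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


# Hypothetical data with one obvious extreme value.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(summary)
```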
  • 37. Add some widgets: File and Data Table 37
  • 38. Open Orange workflow • Double click 01.ows 38
  • 39. Modify your output file path • Check each Python Script widget and change the old path to a path that exists on your machine. 39
  • 40. Dataset description (titanic.csv) • 12 columns in total. • A training dataset for predicting whether a passenger survived the Titanic disaster. 40
  • 41. Data Summary • Load titanic.csv • Data description • Look at Names, Types, Role, Values in table. • Change the configurations of Columns. 41
  • 42. Data Summary • Missing values • Use the Feature Statistics widget. • What is the missing ratio of each column? 42
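Outside Orange, the missing ratio of each column can be checked with a short script. This is a minimal sketch that treats an empty cell as missing; the inline CSV text and its column names (`PassengerId`, `Age`, `Cabin`) are assumptions standing in for titanic.csv, which you would read from your own path instead.

```python
import csv
import io


def missing_ratios(rows):
    """Fraction of empty cells per column; rows is a list of dicts."""
    total = len(rows)
    if total == 0:
        return {}
    ratios = {}
    for column in rows[0]:
        missing = sum(1 for row in rows if (row[column] or "").strip() == "")
        ratios[column] = missing / total
    return ratios


# Hypothetical rows for illustration; replace with open("titanic.csv").
sample_csv = """PassengerId,Age,Cabin
1,22,
2,,C85
3,26,
4,35,C123
"""
rows = list(csv.DictReader(io.StringIO(sample_csv)))
ratios = missing_ratios(rows)
print(ratios)
```

Columns with a high missing ratio are candidates for removal, and columns with a low ratio are candidates for imputation.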
  • 43. Remove columns (called data preprocessing) • Using Select columns widget. 43
  • 44. Impute columns (called data preprocessing) • Using Impute columns widget. • For Default Method • For each column 44
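One common imputation strategy replaces missing numeric values with the column mean and missing categorical values with the most frequent category. The sketch below illustrates that idea in plain Python and is not a description of how the Orange widget is implemented; the function names and sample values are hypothetical, and `None` stands for a missing value.

```python
from collections import Counter


def impute_mean(values):
    """Replace None with the mean of the observed numeric values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def impute_most_frequent(values):
    """Replace None with the most frequent observed category."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


# Hypothetical columns for illustration only.
ages = impute_mean([20.0, None, 40.0])
ports = impute_most_frequent(["S", None, "C", "S"])
print(ages, ports)
```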
  • 45. Pie chart • Orange 3 has deprecated the Pie Chart widget. • Use the Python Script widget instead. 45
  • 46. Line chart • Using Line Plot widget. • Typically, trend analysis charts are presented together with time-based data. 46
  • 47. Distribution chart • Use the Distributions widget to compare the variables. 47
  • 48. Scatter plot • Using scatter plot widget. • It is used to observe the degree of correlation between features • positive correlation • negative correlation • no correlation 48
  • 49. Box plot • Using box plot widget. • Comparing multiple features with each other 49
  • 50. Pivot Table • Using pivot table widget. • It summarizes the data of a more extensive table into a table of statistics. • The statistics can include sums, averages, counts, etc. 50
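The summarizing step behind a pivot table can be sketched as grouping rows by two keys and aggregating a value. The example below computes the average per cell using only the standard library; the rows are made up for illustration and the column names (`Sex`, `Pclass`, `Survived`) are assumed to match titanic.csv.

```python
from collections import defaultdict


def pivot_mean(rows, row_key, col_key, value_key):
    """Average of value_key for every (row_key, col_key) combination."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        cell = (row[row_key], row[col_key])
        sums[cell] += row[value_key]
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


# Hypothetical rows, not real Titanic records.
rows = [
    {"Sex": "female", "Pclass": 1, "Survived": 1},
    {"Sex": "female", "Pclass": 1, "Survived": 1},
    {"Sex": "male", "Pclass": 3, "Survived": 0},
    {"Sex": "male", "Pclass": 3, "Survived": 1},
    {"Sex": "male", "Pclass": 3, "Survived": 0},
    {"Sex": "male", "Pclass": 3, "Survived": 0},
]
table = pivot_mean(rows, "Sex", "Pclass", "Survived")
print(table)
```

Because `Survived` is coded as 0 or 1 in this sketch, the average in each cell is the survival rate for that group.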
  • 51. 1. Show me the top 10 data rows • Hint: Use the Data Sampler widget 51
  • 52. 2. Show me dataset info • How many Rows? • How many Features? • All information like this! 52
  • 53. 3. Get a count of the number of survivors 53
  • 54. 4. Survival Conclusion • For the features SEX, PCLASS, SIBSP, PARCH, and EMBARKED: • Women had a higher chance of survival than men. • First-class passengers had a higher chance of survival. • Passengers with siblings or spouses aboard had a higher chance of survival. • Passengers with children or parents aboard had a higher chance of survival. • Embarking at port S was associated with a lower cabin class and lower chances of survival. 54
  • 55. 5. Show me the survival rate by sex 55
  • 56. 6. Look at survival rate by SEX and PCLASS • Women in first class had a survival rate as high as 96.8%. In contrast, men in economy class had only a 13.54% chance of survival. 56
  • 57. 7. Look at survival rate by SEX, AGE and PCLASS • In the event of a disaster, women in first class or business class have a 90% chance of survival regardless of age. • On the other hand, if a man is in economy class and older than 18, the chance of survival is only 13.36%. • To summarize, in a disaster scenario, girls and women have a higher chance of survival compared to boys and men. • Additionally, the higher the class (such as first class), the higher the chances of survival. 57
  • 58. 8. The price paid for each class • Try plotting a Pclass and Fare chart to visualize the data. • Every class had someone who boarded for free, while others spent over 500 pounds for a first-class ticket. It's quite an interesting observation! 58
  • 59. 9. Visualize data and express your thoughts • Using what was taught today and referencing Story_telling_with_data.pdf, please visualize and analyze this data (20240320_HW.csv) with the theme of sales. • Based on your observations, explain the relationship between sales and these variables. 59