1. Exploring Data and Visualizing
Demeke A. Ayele
demekeayele@gmail.com
School of Computer Science
EIC
2. Topics
Data Visualization
Data Queries: Using Sorting and Filtering
Statistical Methods for Summarizing Data
Exploring Data Using PivotTables
3. Creating Charts in Microsoft Excel
Highlight the data.
Select the Insert tab.
Click on chart type, then subtype.
Use chart tools to customize.
Figure 3.1
Figure 3.2
4. Example 3.1 Creating a Column Chart
Figure 3.3
Highlighted Cells
5. Example 3.1 (continued) Creating a Column Chart
Choose column chart (clustered or stacked).
Add chart title (Alabama Employment).
Rename Series1, Series2, and Series3
(ALL EMPLOYEES, Men, Women).
Figure 3.4
6. Example 3.1 (continued) Creating a Column Chart
Figure 3.5
Clustered Column Chart
7. Example 3.1 (continued) Creating a Column Chart
Figure 3.6
Stacked Column Chart
8. Example 3.2 Line Chart for U.S. Exports to China
Figure 3.7
9. Example 3.3 Pie Chart for Census Data
Figure 3.8
Figure 3.9
10. Example 3.4 Area Chart for Energy Consumption
Figure 3.10
11. Example 3.5 Scatter Chart for Real Estate Data
Figure 3.11
12. Example 3.6
Bubble Chart for Comparing Stock Characteristics
Figure 3.12
14. Example 3.7
Sorting Data in the Purchase Orders Database
Figure 3.13
Figure 3.14
Sort by Supplier
15. Pareto Analysis
An Italian economist, Vilfredo Pareto, observed in
1906 that a large proportion of the wealth in Italy
was owned by a small proportion of the people.
Similarly, businesses often find that a large
proportion of sales come from a small proportion
of customers.
A Pareto analysis involves sorting data and
calculating cumulative proportions.
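The two steps named above (sort, then accumulate proportions) can be sketched in Python. This is only an illustration: the function name `pareto_table` and the inventory values are made up for the example and do not come from the slides.

```python
from itertools import accumulate

def pareto_table(values):
    """Sort values from largest to smallest and attach cumulative proportions."""
    ordered = sorted(values, reverse=True)
    total = sum(ordered)
    running = accumulate(ordered)  # running sum of the sorted values
    return [(v, cum / total) for v, cum in zip(ordered, running)]

# Hypothetical inventory values (not taken from the slides).
inventory_values = [500, 50, 300, 25, 100, 25]
pareto = pareto_table(inventory_values)
for value, cum_share in pareto:
    print(f"{value:>6} {cum_share:6.1%}")
```

In this made-up data the two largest items already account for 80% of the total, which is the kind of concentration a Pareto analysis is meant to expose.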
16. Example 3.8 Applying the Pareto Principle
Figure 3.15
About 75% of the bicycle inventory value comes from fewer than 40% (9 of 24, or 37.5%) of the items.
Sort by
17. Example 3.9 Filtering Records by Item Description
Highlight A3:J97.
Data tab > Sort & Filter group > Filter
Click on the D3 dropdown arrow.
Select Bolt-nut package to filter out all other items.
Figure 3.16
18. Example 3.9 (continued)
Filtering Records by Item Description
Filter results for the bolt-nut package
Figure 3.17
19. Example 3.10 Filtering Records by Item Cost
To identify items that cost at least $200:
Click on the dropdown arrow for item cost.
Number Filters > Greater Than Or Equal To…
Figure 3.18
20. Example 3.10 (continued) Filtering by Item Cost
Custom AutoFilter dialog box
Click OK
Only items costing at least $200 are then displayed.
Figure 3.19
21. AutoFilter criteria are based on the data type.
Number Filters include numerical criteria.
Date Filters include tomorrow, next week, etc.
AutoFilter can be used sequentially.
First filter by one variable.
Then filter those data by another variable.
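Sequential filtering can be mimicked outside Excel with two passes over the records. The sketch below is an illustration only: the records and the field names `item` and `cost` are hypothetical stand-ins for the Purchase Orders columns.

```python
# Hypothetical purchase-order records; field names are illustrative.
orders = [
    {"item": "Bolt-nut package", "cost": 3.75},
    {"item": "Gasket", "cost": 210.00},
    {"item": "Bolt-nut package", "cost": 250.00},
    {"item": "Shielded cable", "cost": 1.05},
]

# First filter by one variable (item description)...
bolt_nut = [r for r in orders if r["item"] == "Bolt-nut package"]

# ...then filter those records by another variable (cost of at least $200).
expensive_bolt_nut = [r for r in bolt_nut if r["cost"] >= 200]

print(len(bolt_nut), expensive_bolt_nut)
```

As with AutoFilter, the second filter only sees the records that survived the first one.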
22. Analytics in Practice: Discovering Value
of Data Analysis at Alders International
Duty-free operations at airports, seaports, etc.
Maintain a data warehouse to track point-of-sale
information and inventory levels.
Pareto analysis revealed that 80% of profits were
generated from 20% of their product lines.
Allows selective elimination of less profitable items.
23. A statistic is a summary measure of data.
Descriptive statistics are methods that describe
and summarize data.
Microsoft Excel supports statistical analysis in two
ways:
1. Statistical functions
2. Analysis Toolpak add-in for PCs
(for Macs, StatPlus is similar)
Statistical methods are essential to Business Analytics
24. Example 3.11 Constructing a Frequency Distribution
for Items in the Purchase Order Database
Figure 3.20
Copy Column D (Item Description) to Column A in a new worksheet
25. Example 3.11 (continued) Constructing a Frequency
Distribution for Items in the Purchase Order Database
Figure 3.22
Figure 3.21
26. Example 3.11 (continued) Constructing a Frequency
Distribution for Items in the Purchase Order Database
Figure 3.23
27. Example 3.12 Constructing a Relative Frequency
Distribution for Items Purchased
Figure 3.24
Compute relative frequencies by dividing each frequency by 94.
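The same calculation (count each item, then divide by the number of observations) can be sketched in Python. The item list below is hypothetical and much shorter than the 94 purchase orders in the example; `frequency_table` is a name chosen for this illustration.

```python
from collections import Counter

def frequency_table(items):
    """Return {item: (frequency, relative frequency)} for a list of labels."""
    counts = Counter(items)
    n = len(items)  # total number of observations (94 in the slide's example)
    return {item: (count, count / n) for item, count in counts.items()}

# Hypothetical item descriptions, not the actual Purchase Orders data.
items = ["Gasket", "O-Ring", "Gasket", "Bolt", "Gasket", "O-Ring", "Bolt", "Gasket"]
freq_table = frequency_table(items)
for item, (freq, rel) in sorted(freq_table.items()):
    print(f"{item:<8} {freq:>3} {rel:.3f}")
```

The relative frequencies always sum to 1, which is a quick check that no observation was dropped.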
28. Example 3.13 Frequency and Relative Frequency
Distribution for A/P Terms
Figure 3.26
Figure 3.25
29. Excel’s Histogram Tool
Using the Analysis Toolpak:
Data > Data Analysis > Histogram
Fill in the Input Range and Bin Range (optional).
Choose Labels if the columns have header rows.
Choose Chart Output.
Figure 3.27
30. Example 3.14
Using the Histogram Tool for A/P Terms
A/P data in H3:H97
Bins below in H99:H103
Month
15
25
30
45
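A rough Python sketch of how bin limits such as 15, 25, 30, and 45 turn into counts. It assumes, as the Histogram tool is generally documented to do, that a value is counted in the first bin whose limit is greater than or equal to it, with anything above the last limit going to a "More" row. The A/P terms below are hypothetical.

```python
from bisect import bisect_left

def histogram_counts(data, bins):
    """Count values per bin; a value lands in the first bin whose upper limit is >= the value."""
    limits = sorted(bins)
    counts = [0] * (len(limits) + 1)  # the last slot plays the role of the "More" row
    for v in data:
        counts[bisect_left(limits, v)] += 1
    return counts

# Hypothetical A/P terms (in days), not the Purchase Orders data.
terms = [30, 30, 15, 25, 45, 30, 60, 15]
bin_counts = histogram_counts(terms, [15, 25, 30, 45])
print(bin_counts)
```

Here the value 60 exceeds every bin limit, so it is counted in the final "More" slot.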
Figure 3.28
31. Example 3.14 (continued)
Using the Histogram Tool for A/P Terms
Figure 3.29
Table above is not linked to chart.
32. Example 3.15 Constructing a Frequency
Distribution and Histogram for Cost Per Order
5 groups with a
$26,000 group width
Figure 3.30
33. Example 3.15 (continued) Constructing a Frequency
Distribution and Histogram for Cost Per Order
Figure 3.31
10 groups with a
$13,000 group width
34. Example 3.16 Computing Cumulative Relative
Frequencies for the Cost Per Order Data
Ogive Figure 3.33
Figure 3.32
35. Example 3.17 Computing Percentiles
Compute the 90th percentile for cost per order in the Purchase Orders Data.
Rank of kth percentile = nk/100 + 0.5
n = 94 observations
k = 90
Rank of 90th percentile = 94(90)/100 + 0.5 = 85.1 (round to 85)
Value of the 85th observation = $74,375
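The rank formula nk/100 + 0.5 is easy to check in code. The sketch below is illustrative: the function names are invented here, and the test data is simply the integers 1 to 94 rather than the actual cost-per-order values.

```python
def percentile_rank(n, k):
    """Rank of the kth percentile among n sorted observations: nk/100 + 0.5."""
    return n * k / 100 + 0.5

def percentile_value(data, k):
    """Value at the rounded rank of the kth percentile."""
    ordered = sorted(data)
    # Note: Python's round() sends exact .5 ties to the nearest even integer.
    rank = round(percentile_rank(len(ordered), k))
    rank = min(max(rank, 1), len(ordered))  # keep the rank inside the data
    return ordered[rank - 1]

rank_90 = percentile_rank(94, 90)
print(rank_90, round(rank_90))
```

With n = 94 and k = 90 the rank comes out at about 85.1, which rounds to the 85th observation, matching the slide.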
36. Example 3.18 Computing Percentiles in Excel
Compute the 90th percentile for cost per order.
Excel function for the kth percentile:
=PERCENTILE.INC(array, k)
=PERCENTILE.INC(G4:G97, 0.90) = $73,737.50
Excel does not use the formula on the previous slide; it interpolates between ranked values, so the result differs slightly.
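To see why the two answers differ, here is a minimal sketch of an inclusive percentile with linear interpolation. It assumes PERCENTILE.INC places the kth percentile at position k(n - 1) in the sorted data (counting from zero) and interpolates between neighbours; the function name `percentile_inc` is chosen for this illustration.

```python
import math

def percentile_inc(data, k):
    """Inclusive percentile with linear interpolation; k is a fraction in [0, 1]."""
    if not 0 <= k <= 1:
        raise ValueError("k must be between 0 and 1")
    ordered = sorted(data)
    position = k * (len(ordered) - 1)  # zero-based fractional position
    lower = math.floor(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

# Small hypothetical data set to show the interpolation.
print(percentile_inc([1, 2, 3, 4], 0.25))
```

Because the result can fall between two observations, it need not equal any value in the data, unlike the rounded-rank method.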
37. Example 3.19 Excel’s Rank and Percentile Tool
Data > Data Analysis > Rank and Percentile
90.3rd percentile = $74,375 (same result as manually computing the 90th percentile)
Figure 3.34
38. Example 3.20 Computing Quartiles in Excel
Compute the Quartiles of the Cost per Order data
Excel function for quartiles:
=QUARTILE.INC(array, quart)
=QUARTILE.INC(G4:G97, 1) = $6,757.81
=QUARTILE.INC(G4:G97, 2) = $15,656.25
=QUARTILE.INC(G4:G97, 3) = $27,593.75
=QUARTILE.INC(G4:G97, 4) = $127,500.00
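Python's standard library offers a comparable calculation. The sketch below assumes that `statistics.quantiles` with `method="inclusive"` corresponds to the interpolation used by QUARTILE.INC; the data is hypothetical. As the results above show, quart = 4 simply returns the maximum.

```python
from statistics import quantiles

# Hypothetical cost values, not the Purchase Orders data.
costs = [1, 2, 3, 4, 5]

# Three cut points: first quartile, median, third quartile.
q1, q2, q3 = quantiles(costs, n=4, method="inclusive")
q4 = max(costs)  # the fourth "quartile" is the largest value

print(q1, q2, q3, q4)
```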
39. Example 3.21 Constructing a Cross-Tabulation
Sales Transactions database
Identify the number (and percentage) of books
and DVDs ordered by region.
Figure 3.35
40. Example 3.21 (continued) Constructing a Cross-Tabulation
Table 3.1
Table 3.2
41. Example 3.21 (continued) Constructing a Cross-Tabulation
Excel’s PivotTable (covered next) makes this easy.
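For comparison, a cross-tabulation of counts and row percentages can be sketched in a few lines of Python. The transactions below are hypothetical (region, product) pairs, not the Sales Transactions database.

```python
from collections import Counter

# Hypothetical (region, product) pairs.
transactions = [
    ("East", "Book"), ("East", "DVD"), ("West", "Book"),
    ("West", "Book"), ("North", "DVD"), ("East", "Book"),
]

counts = Counter(transactions)
regions = sorted({region for region, _ in transactions})
products = sorted({product for _, product in transactions})

# Counts by region (rows) and product (columns); missing pairs count as 0.
crosstab = {r: {p: counts[(r, p)] for p in products} for r in regions}

# Share of each product within its region.
row_percent = {
    r: {p: crosstab[r][p] / sum(crosstab[r].values()) for p in products}
    for r in regions
}

for r in regions:
    print(r, crosstab[r], row_percent[r])
```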
Figure 3.36
Table 3.1
43. PivotTable Field List
Select the fields for:
Report Filter
Column Labels
Row Labels
Σ Values
Or, before choosing PivotTable, you can select a cell in the data and let Excel prepare a default PivotTable.
Figure 3.37
45. Example 3.22 (continued) Creating a PivotTable
PivotTable Tools > Options > Active Field > Field Settings
Change the summarization method in the Value Field Settings dialog box.
Select Count.
Figure 3.39
46. Example 3.22 (continued) Creating a PivotTable
Figure 3.40
Table 3.1
PivotTable for Count of Regional Sales by Product
PivotTable results match those shown earlier in Table 3.1.
47. Example 3.22 (continued) Creating a PivotTable
Drag Source into the Row Labels box.
PivotTable for Sales by Region, Product, and Order Source
Figure 3.41
48. Example 3.23 Using the PivotTable Report Filter
Drag Payment into Report Filter box.
PivotTable Filtered by Payment Type.
Figure 3.42
49. Example 3.23 (continued)
Using the PivotTable Report Filter
Click on the drop-down arrow in row 1.
Figure 3.43
Choose Credit-Card.
Obtain this cross-tabulation PivotTable for credit card transactions.
50. Example 3.24 A PivotChart for Sales Data
Create a chart using the PivotTable for
Sales by Region, Product, and Order Source.
Insert > Column Chart
To display only Book data, click on the Product button and deselect DVD.
Figure 3.44
51. Assignment I (20%)
(use MS Excel)
- Search and get required data
- Do the exploratory analysis:
- statistical analysis
- visualization
52. Key Terms
Area chart
Bar chart
Bubble chart
Column chart
Contingency table
Cross-tabulation
Cumulative relative frequency
Cumulative relative frequency distribution
Data profile (fractile)
Descriptive statistics
Doughnut chart
Frequency distribution
Histogram
kth percentile
Line chart
Ogive
Pareto analysis
Pie chart
54. Case Study: Performance Lawn Equipment (3)
Recall that PLE produces lawnmowers and a medium-size diesel-powered lawn tractor.
Create charts of the satisfaction data, sales data, delivery time data, and other variables of interest.
Compare shipping costs for existing and proposed plant locations.
Examine customer attributes by region and write a formal report summarizing your results.
56. Objectives
• Sort data and filter data
• Summarize an Excel table
• Insert subtotals into a range of data
• Use outline buttons to show or hide details
• Create and modify a PivotTable and PivotChart
57. Planning a Structured Range of Data
• A collection of similar data can be
structured in a range of columns and
rows, representing fields and records,
respectively
• A structured range of data is
commonly referred to as a list or
table
58. Creating an Effective Structured Range
of Data
• Enter field names in top row of range
• Use short, descriptive field names
• Format field names to distinguish
header row from data
• Enter same kind of data for a field in
each record
• Separate data (including header row)
from other information in the
worksheet by at least one blank row
and one blank column
59. Planning a Structured Range of Data
• Freezing a row or column keeps
headings visible as you work with
data in a large worksheet
60. Save Time with Excel Table Features
• Format quickly using a table style
• Add new rows and columns that
automatically expand the range
• Add a Total row to calculate a
summary function (SUM, AVERAGE,
COUNT, MIN, MAX)
• Enter a formula in a cell that is
copied to all other cells in the column
• Create formulas that reference cells
in a table by using table and column
names
62. Sorting Data
• Sort data in ascending or
descending order
• Use the Sort A to Z button or the
Sort Z to A button to sort data
quickly with one sort field
63. Sorting Data
• Use sort dialog box to sort multiple
columns
• Primary and secondary sort fields
• Up to 64 sort fields possible
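Primary and secondary sort fields map onto a sort key with more than one component. The sketch below is an illustration with hypothetical supplier names and costs; the second part uses the fact that Python's sort is stable to mix ascending and descending fields.

```python
from operator import itemgetter

# Hypothetical records; supplier names and costs are made up.
orders = [
    {"supplier": "Hulkey", "cost": 120.0},
    {"supplier": "Alum Sheeting", "cost": 75.0},
    {"supplier": "Hulkey", "cost": 40.0},
    {"supplier": "Alum Sheeting", "cost": 300.0},
]

# Primary field: supplier (A to Z); secondary field: cost (smallest to largest).
by_supplier_then_cost = sorted(orders, key=itemgetter("supplier", "cost"))

# Mixed directions: sort by the secondary field first, then by the primary field.
# A stable sort preserves the secondary order within each supplier.
largest_cost_first = sorted(orders, key=itemgetter("cost"), reverse=True)
by_supplier_cost_desc = sorted(largest_cost_first, key=itemgetter("supplier"))

print([r["cost"] for r in by_supplier_then_cost])
print([r["cost"] for r in by_supplier_cost_desc])
```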
64. Sorting Using a Custom List
• A custom list indicates sequence to order
data
– Four predefined custom sort lists
• Two days-of-the-week custom lists
• Two months-of-the-year custom lists
– Can also create a custom list to sort
records in a sequence you define
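A custom list is essentially a lookup from each value to its position in the desired sequence. A minimal Python sketch, with day abbreviations chosen for the example:

```python
# The custom sequence: position in this list defines the sort order.
DAY_ORDER = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
position = {day: index for index, day in enumerate(DAY_ORDER)}

records = ["Wed", "Mon", "Sat", "Sun"]  # hypothetical values to sort

custom_sorted = sorted(records, key=position.__getitem__)
alphabetical = sorted(records)

print(custom_sorted)
print(alphabetical)
```

An ordinary A to Z sort would put Mon before Sat and Sun; the lookup restores calendar order.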
65. Filtering Data
• Filtering data temporarily hides any
records that do not meet specified
criteria
• After data is filtered, it can be sorted,
copied, formatted, charted, and
printed
66. Using the Total Row to Calculate
Summary Statistics
• In a Total row, you can calculate the
sum, average, count, maximum, and
minimum for any column of a table or
a filtered table
67. Creating Subtotals
• Subtotals can be created on columnar
data
– The data must be sorted for subtotals to
be created
– Column headers must also appear in the
data
• Subtotal command
– Offers many kinds of summary
information (counts, sums, averages,
minimums, maximums)
– Inserts a subtotal row into range for each
group of data; adds grand total row
below last row of data
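The sort-then-subtotal pattern can be sketched with `itertools.groupby`, which, like the Subtotal command, only groups adjacent rows and therefore needs sorted input. The rows and field names are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def subtotals(rows, key, value):
    """Return (group, subtotal) pairs followed by a grand total row."""
    ordered = sorted(rows, key=itemgetter(key))  # grouping requires sorted data
    result = []
    grand_total = 0
    for group, members in groupby(ordered, key=itemgetter(key)):
        subtotal = sum(member[value] for member in members)
        result.append((group, subtotal))
        grand_total += subtotal
    result.append(("Grand Total", grand_total))
    return result

# Hypothetical rows.
rows = [
    {"supplier": "B", "cost": 10},
    {"supplier": "A", "cost": 5},
    {"supplier": "B", "cost": 20},
    {"supplier": "A", "cost": 15},
]
summary = subtotals(rows, "supplier", "cost")
print(summary)
```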
68. Inserting Subtotals
• Sort data so that records with the same
value in a specified field are grouped
together before using Subtotal command
– The Subtotal command cannot be used in
an Excel table
– First convert the Excel table to a range
• Click Subtotal on the Data tab
70. Using the Subtotal Outline View
• Control the level of detail with
buttons
– Level 3: Most detail
– Level 2: Subtotals and grand total,
but not individual records
– Level 1: Only the grand total
71. Pivot Tables
• Interactive table used to group and
summarize either a range of data or an
Excel table into a concise, tabular
format for easier reporting and analysis
• Dynamic organization; can be
“pivoted” to examine data from various
perspectives by rearranging its
structure
• Best used to analyze data that can be
summarized in multiple ways
• Pivot tables can be created from lists or
external data sources
72. Analyzing Data with PivotTables
• Provide ability to “pivot” the table
(rearrange, hide, and display
different category fields to provide
alternative views of the data)
73. Analyzing Data with PivotTables
• Summarize data into categories
using functions (COUNT, SUM,
AVERAGE, MAX, MIN)
• Values fields contain summary data
• Category fields group the values
74. Creating a PivotTable
• Use PivotTable dialog box to select
data to analyze and location of the
PivotTable report
75. Creating a PivotTable
• PivotTable Field List has two sections
– Upper field list section displays names of
each field; use check boxes to add fields
to PivotTable
– Lower layout section includes boxes for
four areas in which you can place fields
77. Creating a PivotTable
• Apply PivotTable styles by using a
preset style or modifying its
appearance
• Formatting PivotTable values fields
–Applying PivotTable styles does not
change the numeric formatting
78. Refreshing a PivotTable
• You cannot change data directly in
the PivotTable
• Instead, you must edit the Excel
table, and then refresh, or update,
the PivotTable to reflect the updated
data
79. Grouping PivotTable Items
• Grouping items combines dates or
numeric items into larger groups so
that the PivotTable can include the
desired level of summarization
80. Filtering and Slicing a PivotTable
• Filters can be applied to a PivotTable
• PivotTable filters can be based on:
– Field values
– Row and column label groupings
• PivotTable filters can be removed
81. Filtering and Slicing a PivotTable
• Slicer—small window that contains a
button for each item in a field
• Slicer—helpful when filtering a
PivotTable based on multiple tables
• Slicers can be customized
83. Creating a Calculated Field
• Custom calculation options:
– % of Grand Total
– % of Column Total
– % of Row Total
– % of Parent Row Total
– Running Total
– Rank Smallest to Largest
– Rank Largest to Smallest
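Three of these options (% of Grand Total, Running Total, and Rank Largest to Smallest) can be sketched in Python to show what each one computes. The regional sales figures are hypothetical, and ties in the ranking are not handled in this sketch.

```python
from itertools import accumulate

# Hypothetical summarized sales by region.
sales = {"East": 200, "North": 100, "South": 300, "West": 400}

grand_total = sum(sales.values())

# % of Grand Total: each value divided by the overall total.
pct_of_grand_total = {region: value / grand_total for region, value in sales.items()}

# Running Total: cumulative sum in the displayed order.
running_total = dict(zip(sales, accumulate(sales.values())))

# Rank Largest to Smallest: 1 for the largest value.
ordered_regions = sorted(sales, key=sales.get, reverse=True)
rank_largest_first = {region: rank for rank, region in enumerate(ordered_regions, start=1)}

print(pct_of_grand_total)
print(running_total)
print(rank_largest_first)
```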
84. Creating a PivotChart
• PivotChart—interactive graphical
representation of PivotTable data
• Changing the position of a field in the
PivotTable or the PivotChart changes
the corresponding object as well
• Create a PivotChart:
– Click in the PivotTable
– Click PivotChart in the Tools group on the
ANALYZE tab
88. Functions
SUMIFS: Adds the cells in a range that meet multiple criteria
COUNTIFS: Applies criteria to cells across multiple ranges and counts the number of times all criteria are met
The key difference between these and COUNTIF/SUMIF is that these allow the use of multiple criteria; COUNTIF and SUMIF accept only one.
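The idea behind both functions is "keep only the rows where every criterion holds, then add or count". The Python sketch below is a simplified illustration: the helper names `sumifs` and `countifs` are borrowed for readability, the rows are hypothetical, and only equality criteria are supported, whereas Excel also accepts comparison operators.

```python
def matches(row, criteria):
    """True when the row satisfies every field == value criterion."""
    return all(row[field] == wanted for field, wanted in criteria.items())

def sumifs(rows, sum_field, **criteria):
    """Add sum_field over the rows that meet all criteria."""
    return sum(row[sum_field] for row in rows if matches(row, criteria))

def countifs(rows, **criteria):
    """Count the rows that meet all criteria."""
    return sum(1 for row in rows if matches(row, criteria))

# Hypothetical sales rows.
rows = [
    {"region": "East", "product": "Book", "amount": 20.0},
    {"region": "East", "product": "DVD", "amount": 15.0},
    {"region": "West", "product": "Book", "amount": 30.0},
    {"region": "East", "product": "Book", "amount": 10.0},
]

east_book_total = sumifs(rows, "amount", region="East", product="Book")
east_book_count = countifs(rows, region="East", product="Book")
print(east_book_total, east_book_count)
```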
89. DATA TABLES
A data table is a range of cells that shows how
changing one or two variables in your formulas
will affect the results of those formulas
An Excel table (a formatted list) is a different feature.
To create an Excel table, select the data and click
Insert tab, Table (in the Tables group)
To convert an Excel table to a range of data, click
anywhere in the table, click on the Design tab, then
click Convert to Range in the Tools group.
90. DATA TABLES
Can be used to calculate the results for a range of options
In the example sheet, type =G3 in cell J2, then
select I2:J15
Click Data tab, What-If Analysis,
then Data Table
In the Data Table dialog box, click in Column input
cell, click D4, and click OK
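Conceptually, a one-variable data table re-evaluates a formula once for each candidate value of the input cell. The sketch below illustrates that idea only; the `profit` formula, its default parameters, and the candidate values are hypothetical and are not the contents of cells G3 or D4 in the example sheet.

```python
def profit(units, price=25.0, unit_cost=15.0, fixed_cost=1000.0):
    """Hypothetical formula standing in for the worksheet's result cell."""
    return units * (price - unit_cost) - fixed_cost

# Candidate values for the input cell (the left column of the data table).
candidate_units = [50, 100, 150, 200]

# One row per candidate: (input value, recalculated result).
data_table = [(units, profit(units)) for units in candidate_units]

for units, result in data_table:
    print(f"{units:>5} {result:>10.2f}")
```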
91. Protecting Worksheets
Two-step process: first unlock the cells you
want users to change, then protect the sheet
Select the cells you want unlocked
Home tab, Font group, click on the Dialog Box
Launcher, click on the Protection tab, and remove
the check mark from “Locked”
92. PROTECT SHEETS
REVIEW tab > CHANGES group > PROTECT SHEET button
Select the actions users should still be allowed to perform > OK
94. CONDITIONAL FORMATTING WITH A RULE cont.
Select a RULE TYPE:
Set your parameters:
Select the formatting you want by clicking on
the button at the bottom
95. SORT BY MULTIPLE FIELDS
HOME tab > EDITING group > SORT & FILTER button > CUSTOM SORT
For each category you want to sort by, click on the ADD LEVEL button
96. AUTOFILTER
Select a range of cells containing data.
HOME tab > EDITING group > SORT & FILTER button > FILTER
Drop-down arrows will now appear beside each column heading
Select the drop-down arrow and:
De-select: SELECT ALL
Then select the checkbox beside the option you wish to filter by
97. SUBTOTALS
DATA tab >
Note that data should be sorted to get best results
You can automatically calculate subtotals and grand totals
in a list for a column by using the Subtotal command in the
Outline group on the Data tab.
98. PIVOT TABLE
PivotTables are used to summarize, analyze, explore, and
present summary data
Select the range
INSERT > click on PivotTable
Confirm the data range and the location for the PivotTable > OK
99. Modify A PivotTable So That A Column Displays The MAXIMUM Value, Instead Of The SUM
Select the cell which has the desired COLUMN HEADING
OPTIONS tab > ACTIVE FIELD group > FIELD SETTINGS button
In the list, select the desired function > OK
102. PIVOT CHART BASED ON A PIVOT TABLE
PIVOT TABLE TOOLS > OPTIONS > TOOLS group > PivotChart button
In the PivotChart Filter Pane which pops up when you create the PivotChart:
Click on the drop-down arrow beside the 1st category name
De-select: SELECT ALL
Then select the categories you want to be able to view in your PivotChart > OK
103. GOAL SEEK
Automatically vary the contents of one cell
so that the value of the contents of another cell
equals a certain amount
Click DATA tab > DATA TOOLS group >
"WHAT-IF ANALYSIS" icon >
GOAL SEEK
In the SET CELL textbox, key in the cell that
contains the formula whose result you want to set
In the TO VALUE textbox, key in the amount
you want that result to equal
In the BY CHANGING CELL textbox,
key in the cell reference you want
changed in order to get the desired answer > OK
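The idea of Goal Seek (search for the input that makes a formula hit a target) can be sketched with a simple bisection search. This is an illustration of the concept, not Excel's own algorithm, which the slides do not describe; the `goal_seek` helper and the profit formula are hypothetical.

```python
def goal_seek(formula, target, low, high, tol=1e-9, max_iter=200):
    """Find x in [low, high] with formula(x) close to target, by bisection.

    Assumes formula(low) - target and formula(high) - target have opposite signs.
    """
    f_low = formula(low) - target
    f_high = formula(high) - target
    if f_low * f_high > 0:
        raise ValueError("target is not bracketed by low and high")
    for _ in range(max_iter):
        mid = (low + high) / 2
        f_mid = formula(mid) - target
        if abs(f_mid) <= tol or (high - low) / 2 <= tol:
            return mid
        if f_low * f_mid <= 0:
            high = mid          # the solution lies in the lower half
        else:
            low, f_low = mid, f_mid  # the solution lies in the upper half
    return (low + high) / 2

def profit(units):
    """Hypothetical 'set cell' formula: $10 margin per unit, $1,000 fixed cost."""
    return units * 10.0 - 1000.0

# 'To value' is 0 (break-even); 'by changing cell' is the number of units.
break_even_units = goal_seek(profit, target=0.0, low=0.0, high=1000.0)
print(round(break_even_units, 6))
```

Here `profit` plays the role of the set cell, `target` the value to reach, and the returned number the changing cell.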