data-exp-Viz-00-2.pdf

Exploring Data and Visualizing
Demeke A. Ayele
demekeayele@gmail.com
School of Computer Science
EIC
00-1

Topics
• Data Visualization
• Data Queries: Using Sorting and Filtering
• Statistical Methods for Summarizing Data
• Exploring Data Using PivotTables
3-2

Creating Charts in Microsoft Excel
• Select the Insert tab.
• Highlight the data.
• Click on chart type, then subtype.
• Use chart tools to customize.
Data Visualization
3-3
Figure 3.1
Figure 3.2

Example 3.1 Creating a Column Chart
Data Visualization
3-4
Figure 3.3
Highlighted Cells

Example 3.1 (continued) Creating a Column Chart
Choose column chart (clustered or stacked).
Add chart title (Alabama Employment).
Rename Series1, Series2, and Series3
(ALL EMPLOYEES, Men, Women).
Data Visualization
3-5
Figure 3.4

Data Visualization
3-6
Figure 3.5
Clustered
Column
Chart

Data Visualization
3-7
Figure 3.6
Stacked
Column
Chart

Example 3.2 Line Chart for U.S. Exports to China
Data Visualization
3-8
Figure 3.7

Example 3.3 Pie Chart for Census Data
Data Visualization
3-9
Figure 3.8
Figure 3.9

Example 3.4 Area Chart for Energy Consumption
Data Visualization
3-10
Figure 3.10

Example 3.5 Scatter Chart for Real Estate Data
Data Visualization
3-11
Figure 3.11

Example 3.6
Bubble Chart for Comparing Stock Characteristics
Data Visualization
3-12
Figure 3.12

Miscellaneous Excel Charts
Stock chart
Surface chart
Doughnut chart
Radar chart
Geographic mapping
Data Visualization
3-13

Example 3.7
Sorting Data in the Purchase Orders Database
3-14
Figure 3.13
Figure 3.14
Sort by Supplier
Data Queries: Using Sorting and Filtering

Pareto Analysis
 An Italian economist, Vilfredo Pareto, observed in
1906 that a large proportion of the wealth in Italy
was owned by a small proportion of the people.
 Similarly, businesses often find that a large
proportion of sales come from a small proportion
of customers.
 A Pareto analysis involves sorting data and
calculating cumulative proportions.
3-15
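
The sort-and-accumulate step behind a Pareto analysis can be sketched in Python. This is a minimal illustration, not the Excel procedure from the slides: the item names and values are hypothetical, and `pareto_table` is a helper name introduced here.

```python
def pareto_table(values):
    """Sort items by value (largest first) and attach cumulative proportions.

    Returns rows of (name, value, cumulative share of value, cumulative share of items).
    """
    total = sum(values.values())
    ranked = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    rows = []
    running = 0.0
    for rank, (name, value) in enumerate(ranked, start=1):
        running += value
        rows.append((name, value, running / total, rank / len(ranked)))
    return rows


# Hypothetical inventory values (not the figures from Example 3.8).
inventory = {"Item A": 500, "Item B": 300, "Item C": 100, "Item D": 60, "Item E": 40}
table = pareto_table(inventory)
for name, value, cum_value_share, cum_item_share in table:
    print(f"{name}: {value:>4}  cum. value {cum_value_share:.0%}  cum. items {cum_item_share:.0%}")
```

Reading down the two cumulative columns shows how much of the total value is covered by the top share of items.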

Example 3.8 Applying the Pareto Principle
3-16
Figure 3.15
75% of the bicycle inventory value comes from about 40% (9/24 = 37.5%) of items.
Sort by

Example 3.9 Filtering Records by Item Description
Highlight A3:J97
Data tab
Sort & Filter group
Filter
Click on the D3
dropdown arrow.
Select Bolt-nut
package to filter out
all other items.
3-17
Figure 3.16

Example 3.9 (continued)
Filtering Records by Item Description
Filter results for the bolt-nut package
3-18
Figure 3.17

Example 3.10 Filtering Records by Item Cost
To identify items that
cost at least $200
Click on dropdown
arrow for item cost
Number Filters
Greater Than Or
Equal To…
3-19
Figure 3.18

Example 3.10 (continued) Filtering by Item Cost
Custom AutoFilter dialog box
• Click OK
• Only items costing at least $200 are then displayed.
3-20
Figure 3.19

AutoFilter criteria are based on the data type.
• Number Filters include numerical criteria.
• Date Filters include tomorrow, next week, etc.
AutoFilter can be used sequentially.
• First filter by one variable.
• Then filter those data by another variable.
3-21
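
Sequential filtering can be illustrated outside Excel as two list filters applied one after the other. The records and field names below are hypothetical stand-ins for the Purchase Orders data, not values from the slides.

```python
# Hypothetical purchase-order records; field names are illustrative only.
orders = [
    {"item": "Bolt-nut package", "cost": 3.75, "supplier": "S1"},
    {"item": "Airframe fasteners", "cost": 4.25, "supplier": "S2"},
    {"item": "Control panel", "cost": 255.00, "supplier": "S1"},
    {"item": "Gasket", "cost": 210.00, "supplier": "S3"},
]

# First filter: keep items costing at least $200 (like Number Filters > Greater Than Or Equal To).
at_least_200 = [r for r in orders if r["cost"] >= 200]

# Second filter: applied only to the rows that survived the first filter.
from_s1 = [r for r in at_least_200 if r["supplier"] == "S1"]

print([r["item"] for r in at_least_200])
print([r["item"] for r in from_s1])
```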

Analytics in Practice: Discovering Value
of Data Analysis at Alders International
 Duty free operations at airports, seaports, etc.
 Maintain a data warehouse to track point-of-sale
information and inventory levels.
 Pareto analysis revealed that 80% of profits were
generated from 20% of their product lines.
 Allows selective elimination of less profitable items.
3-22
Data Queries:
Using Sorting and Filtering

 A statistic is a summary measure of data.
 Descriptive statistics are methods that describe
and summarize data.
 Microsoft Excel supports statistical analysis in two
ways:
1. Statistical functions
2. Analysis ToolPak add-in for PCs
(for Macs, StatPlus is similar)
3-23
Statistical Methods for Summarizing Data
Statistical methods are essential to Business Analytics

Example 3.11 Constructing a Frequency Distribution
for Items in the Purchase Order Database
3-24
Figure 3.20
Copy Column D (Item Description) to Column A in a new worksheet

Example 3.11 (continued) Constructing a Frequency
Distribution for Items in the Purchase Order Database
3-25
Figure 3.22
Figure 3.21

Example 3.11 (continued) Constructing a Frequency
Distribution for Items in the Purchase Order Database
3-26
Figure 3.23

Example 3.12 Constructing a Relative Frequency
Distribution for Items Purchased
3-27
Figure 3.24
Compute relative
frequencies by
dividing each
frequency by 94.
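
A frequency and relative frequency distribution can be sketched in a few lines of Python. The item list is hypothetical (8 records rather than the 94 in the Purchase Orders data), and `relative_frequencies` is a helper name introduced here.

```python
from collections import Counter


def relative_frequencies(values):
    """Return {category: (frequency, relative frequency)} for a list of categories."""
    counts = Counter(values)
    n = len(values)
    return {category: (count, count / n) for category, count in counts.items()}


# Hypothetical item descriptions.
items = ["Gasket", "Bolt", "Gasket", "Shim", "Bolt", "Gasket", "Bolt", "Gasket"]
dist = relative_frequencies(items)
for category, (freq, rel) in dist.items():
    print(f"{category}: frequency {freq}, relative frequency {rel:.3f}")
```

Each relative frequency is the frequency divided by the number of observations, so the relative frequencies sum to 1.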

Example 3.13 Frequency and Relative Frequency
Distribution for A/P Terms
3-28
Figure 3.26
Figure 3.25

Excel’s Histogram Tool
Using the Analysis ToolPak
Data > Data Analysis > Histogram
Fill in the Input Range and Bin Range (optional).
Choose Labels if the columns have header rows.
Choose Chart Output.
3-29
Figure 3.27

Example 3.14
Using the Histogram Tool for A/P Terms
A/P data in H3:H97
Bins below in H99:H103 (header: Month): 15, 25, 30, 45
3-30
Figure 3.28

Example 3.14 (continued)
Using the Histogram Tool for A/P Terms
3-31
Figure 3.29
Table above
is not linked
to chart.

Example 3.15 Constructing a Frequency
Distribution and Histogram for Cost Per Order
3-32
5 groups with a
$26,000 group width
Figure 3.30

Example 3.15 (continued) Constructing a Frequency
Distribution and Histogram for Cost Per Order
3-33
Figure 3.31
10 groups with a
$13,000 group width

Example 3.16 Computing Cumulative Relative
Frequencies for the Cost Per Order Data
3-34
Ogive Figure 3.33
Figure 3.32
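
The cumulative relative frequencies plotted in an ogive can be computed from grouped data as sketched below. The cost values are hypothetical, the upper limits are simply spaced $26,000 apart to echo the group width in Example 3.15, and the binning rule (a value belongs to a bin when it is above the previous limit and at most the current one) is an assumption about how the Histogram tool counts.

```python
def cumulative_relative_frequencies(values, upper_limits):
    """Return rows of (upper limit, frequency, relative freq., cumulative relative freq.)."""
    n = len(values)
    rows = []
    cumulative = 0
    lower = float("-inf")
    for upper in upper_limits:
        # Count values that fall in the interval (lower, upper].
        freq = sum(1 for v in values if lower < v <= upper)
        cumulative += freq
        rows.append((upper, freq, freq / n, cumulative / n))
        lower = upper
    return rows


# Hypothetical cost-per-order values.
costs = [1200, 8500, 15000, 27000, 30000, 51000, 60000, 74000, 90000, 127000]
limits = [26000, 52000, 78000, 104000, 130000]  # five groups, $26,000 wide
rows = cumulative_relative_frequencies(costs, limits)
for upper, freq, rel, cum in rows:
    print(f"<= {upper:>7}: freq {freq}, relative {rel:.2f}, cumulative {cum:.2f}")
```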

Example 3.17 Computing Percentiles
Compute the 90th percentile for cost per order in the
Purchase Orders Data.
Rank of kth percentile = nk/100 + 0.5
n = 94 observations
k = 90
Rank of 90th percentile = 94(90)/100 + 0.5
= 85.1 (round to 85)
Value of the 85th observation = $74,375
3-35
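
The rank calculation in Example 3.17 can be checked with a short Python sketch. `percentile_rank` is a helper name introduced here; the inputs n = 94 and k = 90 come from the example.

```python
def percentile_rank(n, k):
    """Rank of the kth percentile among n sorted observations: n*k/100 + 0.5."""
    return n * k / 100 + 0.5


rank = percentile_rank(94, 90)   # 94(90)/100 + 0.5
position = round(rank)           # nearest whole rank in the sorted data
print(rank, position)
```

The observation at that position in the sorted data is then read off as the percentile value.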

Example 3.18 Computing Percentiles in Excel
Compute the 90th percentile for cost per order.
Excel function for the kth percentile (k entered as a decimal between 0 and 1):
=PERCENTILE.INC(array, k)
=PERCENTILE.INC(G4:G97, 0.90)
= $73,737.50
Excel does not use the rank formula from the previous slide, so the two results differ slightly.
3-36
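
One way to see why Excel's answer differs from the rank formula is to sketch an interpolating percentile. The function below assumes that PERCENTILE.INC interpolates linearly at position (n - 1)k among the sorted values; that is an assumption about Excel's behavior, `percentile_inc` is a name introduced here, and the sample data are hypothetical.

```python
import math


def percentile_inc(values, p):
    """Interpolated percentile for 0 <= p <= 1 (assumed to mirror PERCENTILE.INC)."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    ordered = sorted(values)
    h = (len(ordered) - 1) * p              # zero-based fractional position
    lower = math.floor(h)
    upper = min(lower + 1, len(ordered) - 1)
    # Move the fractional part of the way from the lower value to the upper value.
    return ordered[lower] + (h - lower) * (ordered[upper] - ordered[lower])


sample = [10, 20, 30, 40, 50]   # hypothetical data
p90 = percentile_inc(sample, 0.90)
print(p90)
```

Because the result is interpolated between two observations, it need not equal any single value in the data, unlike the rounded-rank method.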

Example 3.19 Excel’s Rank and Percentile Tool
Data > Data Analysis > Rank and Percentile
90.3rd percentile = $74,375
(same result as manually computing the 90th percentile)
3-37
Figure 3.34

Example 3.20 Computing Quartiles in Excel
Compute the Quartiles of the Cost per Order data
 Excel function for quartiles:
=QUARTILE.INC(array, quart)
 =QUARTILE.INC(G4:G97, 1) = $6,757.81
 =QUARTILE.INC(G4:G97, 2) = $15,656.25
 =QUARTILE.INC(G4:G97, 3) = $27,593.75
 =QUARTILE.INC(G4:G97, 4) = $127,500.00
3-38
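
Quartiles can also be sketched with Python's standard library. `statistics.quantiles` with `method="inclusive"` is expected to agree with QUARTILE.INC for quart = 1, 2, 3 (an assumption worth checking against Excel on your own data); the minimum and maximum play the role of quart = 0 and quart = 4. The sample values are hypothetical.

```python
import statistics

sample = [1, 2, 3, 4, 5, 6, 7, 8]   # hypothetical data

# Three cut points that split the data into four groups, treating the data
# as including the population minimum and maximum ("inclusive" method).
q1, q2, q3 = statistics.quantiles(sample, n=4, method="inclusive")
q0, q4 = min(sample), max(sample)

print(q0, q1, q2, q3, q4)
```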

Example 3.21 Constructing a Cross-Tabulation
 Sales Transactions database
 Identify the number (and percentage) of books
and DVDs ordered by region.
3-39
Figure 3.35
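
A cross-tabulation of counts and row percentages can be sketched in Python as follows. The transactions are hypothetical and far fewer than in the Sales Transactions database; the variable names are introduced here for illustration.

```python
from collections import Counter

# Hypothetical (region, product) pairs, one per transaction.
transactions = [
    ("East", "Book"), ("East", "DVD"), ("West", "Book"),
    ("East", "Book"), ("North", "DVD"), ("West", "Book"),
]

counts = Counter(transactions)
regions = sorted({region for region, _ in transactions})
products = sorted({product for _, product in transactions})

# Counts by region (rows) and product (columns); missing combinations count as 0.
crosstab = {r: {p: counts[(r, p)] for p in products} for r in regions}

# Each cell as a share of its row total.
row_percent = {
    r: {p: crosstab[r][p] / sum(crosstab[r].values()) for p in products}
    for r in regions
}

for r in regions:
    print(r, crosstab[r], {p: f"{v:.0%}" for p, v in row_percent[r].items()})
```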

 Example 3.21 (continued) Constructing a Cross-
Tabulation
3-40
Table 3.1
Table 3.2

 Example 3.21 (continued) Constructing a Cross-
Tabulation
Excel’s PivotTable (covered next) makes this easy.
3-41
Figure 3.36
Table 3.1

Data
Tables
PivotTable
Follow wizard steps.
PivotTables allow:
 Quick creation of
cross tabulations
 Numerous custom-
made summary
tables and charts
3-42
Exploring Data Using PivotTables
Figure 3.37

PivotTable Field List
Select the fields for:
 Report Filter
 Column Labels
 Row Labels
 Σ Values
Or, before choosing
PivotTable, you can select
a cell in the data and let
Excel prepare a default
PivotTable.
3-43
Figure 3.37

Example 3.22 Creating a PivotTable
Default PivotTable for Regional Sales by Product
(sum of CustID is meaningless)
3-44
Figure 3.38

Example 3.22 (continued) Creating a PivotTable
PivotTable Tools > Options > Active Field > Field Settings
• Change summarization method in Value Field Settings dialog box
• Select Count
3-45
Figure 3.39

Example 3.22 (continued) Creating a PivotTable
3-46
Figure 3.40
Table 3.1
PivotTable for Count
of Regional Sales
by Product
PivotTable results
match those shown
earlier in Table 3.1.

Example 3.22 (continued) Creating a PivotTable
Drag Source into the Row Labels box.
PivotTable for Sales by Region, Product, and Order Source
3-47
Figure 3.41

Example 3.23 Using the PivotTable Report Filter
Drag Payment into the Report Filter box.
PivotTable filtered by payment type.
3-48
Figure 3.42

Example 3.23 (continued) Using the PivotTable Report Filter
Click on the drop-down arrow in row 1.
Choose Credit-Card.
Obtain this cross-tabulation PivotTable for credit card transactions.
3-49
Figure 3.43

Example 3.24 A PivotChart for Sales Data
Create a chart using the PivotTable for
Sales by Region, Product, and Order Source.
 Insert
 Column Chart
To display only Book
data, click on the
Product button and
deselect DVD.
3-50
Figure 3.44

Assignment I (20%)
(use MS Excel)
- Search and get required data
- Do the exploratory analysis:
- statistical analysis
- visualization

3-52
Key Terms
 Area chart
 Bar chart
 Bubble chart
 Column chart
 Contingency table
 Cross-tabulation
 Cumulative relative
frequency
 Cumulative relative
frequency distribution
 Data profile (fractile)
 Descriptive statistics
 Doughnut chart
 Frequency distribution
 Histogram
 kth
percentile
 Line chart
 Ogive
 Pareto analysis
 Pie chart

3-53
Key Terms (continued)
 PivotChart
 PivotTable
 Quartile
 Radar chart
 Relative frequency
 Relative frequency
distribution
 Scatter chart
 Statistic
 Statistics
 Stock chart
 Surface chart

 Recall that PLE produces lawnmowers and a
medium size diesel power lawn tractor.
 Create charts of the satisfaction data, sales data,
delivery time data, and other variables of interest.
 Compare shipping costs for existing and proposed
plant locations.
 Examine customer attributes by region and write a
formal report summarizing your results.
Case Study
Performance Lawn Equipment (3)
3-54

WORKING WITH EXCEL TABLES,
PIVOTTABLES, AND PIVOTCHARTS

Objectives
• Sort data and filter data
• Summarize an Excel table
• Insert subtotals into a range of data
• Use the Outline buttons to show or hide details
• Create and modify a PivotTable and PivotChart

Planning a Structured Range of Data
• A collection of similar data can be
structured in a range of columns and
rows, representing fields and records,
respectively
• A structured range of data is
commonly referred to as a list or
table

Creating an Effective Structured Range
of Data
• Enter field names in top row of range
• Use short, descriptive field names
• Format field names to distinguish
header row from data
• Enter same kind of data for a field in
each record
• Separate data (including header row)
from other information in the
worksheet by at least one blank row
and one blank column

Planning a Structured Range of Data
• Freezing a row or column keeps
headings visible as you work with
data in a large worksheet

Save Time with Excel Table Features
• Format quickly using a table style
• Add new rows and columns that
automatically expand the range
• Add a Total row to calculate a
summary function (SUM, AVERAGE,
COUNT, MIN, MAX)
• Enter a formula in a cell that is
copied to all other cells in the column
• Create formulas that reference cells
in a table by using table and column
names
Creating an Excel Table

Sorting Data
• Sort data in ascending or
descending order
• Use the Sort A to Z button or the
Sort Z to A button to sort data
quickly with one sort field

Sorting Data
• Use sort dialog box to sort multiple
columns
• Primary and secondary sort fields
• Up to 64 sort fields possible
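
Sorting on a primary and a secondary field can be illustrated in Python with a tuple sort key. The records and field names are hypothetical; the sketch sorts supplier A to Z and, within each supplier, cost from largest to smallest.

```python
# Hypothetical records.
records = [
    {"supplier": "Beta", "cost": 120.0},
    {"supplier": "Alpha", "cost": 75.0},
    {"supplier": "Beta", "cost": 40.0},
    {"supplier": "Alpha", "cost": 300.0},
]

# Primary sort field: supplier (ascending).
# Secondary sort field: cost (descending, achieved by negating the number).
by_supplier_then_cost = sorted(records, key=lambda r: (r["supplier"], -r["cost"]))

for r in by_supplier_then_cost:
    print(r["supplier"], r["cost"])
```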

Sorting Using a Custom
List
• A custom list indicates sequence to order
data
– Four predefined custom sort lists
• Two days-of-the-week custom lists
• Two months-of-the-year custom lists
– Can also create a custom list to sort
records in a sequence you define

Filtering Data
• Filtering data temporarily hides any
records that do not meet specified
criteria
• After data is filtered, it can be sorted,
copied, formatted, charted, and
printed

Using the Total Row to Calculate
Summary Statistics
• You can calculate sum, average,
count, maximum, and minimum on all
columns in a table or on a filtered
table in a Total row

Creating Subtotals
• Subtotals can be created on columnar
data
– The data must be sorted for subtotals to
be created
– Column headers must also appear in the
data
• Subtotal command
– Offers many kinds of summary
information (counts, sums, averages,
minimums, maximums)
– Inserts a subtotal row into range for each
group of data; adds grand total row
below last row of data
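
Why the data must be sorted first can be seen in a small Python sketch. `itertools.groupby` starts a new group whenever the key changes, much like a subtotal row inserted at each change in the grouping column. The rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical (region, amount) rows, deliberately unsorted.
rows = [("North", 100), ("South", 250), ("North", 50), ("South", 25)]

# Without sorting, each change of region starts a new group, so regions repeat.
unsorted_groups = [region for region, _ in groupby(rows, key=itemgetter(0))]

# Sort by the grouping field first, then compute one subtotal per region.
rows_sorted = sorted(rows, key=itemgetter(0))
subtotals = {
    region: sum(amount for _, amount in group)
    for region, group in groupby(rows_sorted, key=itemgetter(0))
}
grand_total = sum(subtotals.values())

print(unsorted_groups)
print(subtotals, grand_total)
```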

Inserting Subtotals
• Sort data so that records with the same
value in a specified field are grouped
together before using Subtotal command
– The Subtotal command cannot be used in an Excel table
– First convert the Excel table to a range
• Click Subtotal on the Data tab

Using the Subtotal Outline View
• Control the level of detail with
buttons
–Level 3: Most detail
–Level 2: Subtotals and grand total,
but not individual records
–Level 1: Only the grand total

Pivot Tables
• Interactive table used to group and
summarize either a range of data or an
Excel table into a concise, tabular
format for easier reporting and analysis
• Dynamic organization; can be
“pivoted” to examine data from various
perspectives by rearranging its
structure
• Best used to analyze data that can be
summarized in multiple ways
• Pivot tables can be created from lists or
external data sources

Analyzing Data with PivotTables
• Provide ability to “pivot” the table
(rearrange, hide, and display
different category fields to provide
alternative views of the data)

Analyzing Data with PivotTables
• Summarize data into categories
using functions (COUNT, SUM,
AVERAGE, MAX, MIN)
• Values fields contain summary data
• Category fields group the values

• Use PivotTable dialog box to select
data to analyze and location of the
PivotTable report

• PivotTable Field List has two sections
– Upper field list section displays names of
each field; use check boxes to add fields
to PivotTable
– Lower layout section includes boxes for
four areas in which you can place fields

Adding Fields to a
PivotTable

• Apply PivotTable styles by using a
preset style or modifying its
appearance
• Formatting PivotTable values fields
–Applying PivotTable styles does not
change the numeric formatting

Refreshing a PivotTable
• You cannot change data directly in
the PivotTable
• Instead, you must edit the Excel
table, and then refresh, or update,
the PivotTable to reflect the updated
data

Grouping PivotTable
Items
• Grouping items combines dates or
numeric items into larger groups so
that the PivotTable can include the
desired level of summarization

Filtering and Slicing a PivotTable
• Filters can be applied to a PivotTable
• PivotTable filters can be based on:
– Field values
– Row and column label groupings
• PivotTable filters can be removed

• Slicer—small window that contains a
button for each item in a field
• Slicer—helpful when filtering a
PivotTable based on multiple tables
• Slicers can be customized

Creating a Calculated
Field
• Custom calculation options:
– % of Grand Total
– % of Column Total
– % of Row Total
– % of Parent Row Total
– Running Total
– Rank Smallest to Largest
– Rank Largest to Smallest

Creating a PivotChart
• PivotChart—interactive graphical
representation of PivotTable data
• Changing the position of a field in the
PivotTable or the PivotChart changes
the corresponding object as well
• Create a PivotChart:
– Click in the PivotTable
– Click PivotChart in the Tools group on the
ANALYZE tab

87
Mathematical Operators for Excel
<   Less than
>   Greater than
=   Equal
>=  Greater than or equal
<=  Less than or equal
<>  Not equal
^   Power of

88
Functions
SUMIFS: Adds the cells in a range that meet multiple criteria
COUNTIFS: Applies criteria to cells across multiple ranges and counts the number of times all criteria are met
The key difference between these and COUNTIF/SUMIF is that these allow multiple criteria; COUNTIF/SUMIF accept only one.
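
The multiple-criteria logic can be sketched in Python. The functions `sumifs` and `countifs` below are illustrative helpers written for this example, not Excel's implementation; each criterion is a (range, test) pair and a row counts only when every test passes. The data are hypothetical.

```python
def sumifs(sum_range, *criteria):
    """Add the values in sum_range whose rows satisfy every (range, test) pair."""
    return sum(
        value
        for i, value in enumerate(sum_range)
        if all(test(rng[i]) for rng, test in criteria)
    )


def countifs(*criteria):
    """Count the rows that satisfy every (range, test) pair."""
    n = len(criteria[0][0])
    return sum(1 for i in range(n) if all(test(rng[i]) for rng, test in criteria))


# Hypothetical parallel columns.
regions = ["East", "West", "East", "East"]
products = ["Book", "Book", "DVD", "Book"]
amounts = [20, 30, 15, 10]

is_east = (regions, lambda r: r == "East")
is_book = (products, lambda p: p == "Book")

east_books_total = sumifs(amounts, is_east, is_book)
east_books_count = countifs(is_east, is_book)
print(east_books_total, east_books_count)
```

Passing a single (range, test) pair reduces each helper to the one-criterion case.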

89
DATA TABLES
A data table is a range of cells that shows how
changing one or two variables in your formulas
will affect the results of those formulas.
Do not confuse a data table with an Excel table.
To create an Excel table, select the
data and click Insert tab, Table
(in Tables group).
To convert an Excel table to a range of data, click
anywhere in the table, click the Design tab, then
click Convert to Range in the Tools group.

90
DATA TABLES
Can be used to Calculate Options
In example sheet in cell J2 type =G3 then
select I2:J15

Click Data tab, What-if-analysis,
then Data Table

In Data Table, Column input
cell, click D4, and click OK
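
A one-variable data table amounts to re-evaluating one formula for a column of input values. The Python sketch below illustrates the idea; the `profit` model and all of its numbers are hypothetical and unrelated to the example sheet's cells.

```python
def profit(price, units=1000, unit_cost=6.0, fixed_cost=2000.0):
    """Hypothetical model: the formula the data table re-evaluates."""
    return units * (price - unit_cost) - fixed_cost


# Column of input values (the role of the column input cell).
prices = [8.0, 9.0, 10.0, 11.0]

# One row per input value: (input, formula result).
data_table = [(p, profit(p)) for p in prices]

for price, result in data_table:
    print(f"price {price:>5.2f} -> profit {result:>8.2f}")
```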

91
Protecting Worksheets
Two-step process: first, unlock the cells you
want the user to change
• Select the cells you want unlocked
• Home tab, Font group, click the Dialog Box
Launcher, click the Protection tab, and remove the
check mark from “Locked”

92
PROTECT SHEETS
REVIEW tab > CHANGES group >
PROTECT SHEET button
• Select the actions users should still be allowed to perform
> OK

93
APPLY CONDITIONAL FORMATTING WITH A RULE
Select cell range
HOME tab > STYLES group >
CONDITIONAL FORMATTING > NEW
RULE

94
CONDITIONAL FORMATTING WITH A RULE cont.
Select a RULE TYPE:
Set your parameters:
Select the formatting you want by clicking on
the button at the bottom

95
SORT BY MULTIPLE FIELDS
HOME tab > EDITING group > SORT
& FILTER Button > CUSTOM SORT
For each category you want
to sort by, click on the
ADD LEVEL button

96
AUTOFILTER
Select a range of cells containing data.
HOME tab > EDITING group >
SORT & FILTER button > FILTER
Drop-down arrows will now appear beside each column heading.
Select the drop-down arrow and:
De-select: SELECT ALL
Then select the checkbox beside
the option you wish to filter by

97
SUBTOTALS
DATA tab >
 Note that data should be sorted to get best results
 You can automatically calculate subtotals and grand totals
in a list for a column by using the Subtotal command in the
Outline group on the Data tab.

98
PIVOT TABLE
PivotTables are used to summarize, analyze, explore, and
present summary data
Select the range
INSERT > click on
PivotTable
My table has headers is selected > OK

99
Modify A PivotTable So That A Column Displays The
MAXIMUM Value, Instead Of The SUM
Select the cell which has the desired
COLUMN HEADING
OPTIONS tab > ACTIVE FIELD group >
FIELD SETTINGS button
In the list, select the
Desired function > OK

100
PIVOT TABLE
Drag the fields you want
into the areas you want

101
PIVOT TABLE cont.
Format a PivotTable using a Pivot style
Click the DESIGN tab:
Light styles
Medium styles

102
PIVOT CHART BASED ON A PIVOT TABLE
PIVOT TABLE TOOLS > OPTIONS > TOOLS group >
PivotChart button
• In the PivotChart Filter Pane, which pops up
when you create the PivotChart:
Click on the drop-down arrow beside
the 1st category name
De-select: SELECT ALL
Then select the categories you want to be
able to view in your PivotChart > OK

103
GOAL SEEK
Automatically vary the contents of one cell
so that the value of another cell
equals a certain amount
Click DATA tab > DATA TOOLS group >
"WHAT-IF ANALYSIS" icon >
GOAL SEEK
• In the SET CELL textbox, key in the cell
that contains the formula whose result should reach the target
• In the TO VALUE textbox, key in the target amount
• In the BY CHANGING CELL textbox,
key in the cell reference you want
changed in order to get the desired answer > OK
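
The idea behind Goal Seek can be sketched in Python with a simple bisection search. This is a stand-in technique chosen for clarity, not Excel's own algorithm; `goal_seek` and the `profit` model are hypothetical, and the search assumes the target lies between the formula's values at `low` and `high`.

```python
def goal_seek(formula, target, low, high, tolerance=1e-9, max_iterations=200):
    """Find an input in [low, high] for which formula(input) is close to target."""
    f_low = formula(low) - target
    f_high = formula(high) - target
    if f_low * f_high > 0:
        raise ValueError("target is not bracketed by low and high")
    for _ in range(max_iterations):
        mid = (low + high) / 2
        f_mid = formula(mid) - target
        if abs(f_mid) <= tolerance or (high - low) / 2 <= tolerance:
            return mid
        if f_low * f_mid <= 0:
            high = mid                  # solution lies in the lower half
        else:
            low, f_low = mid, f_mid     # solution lies in the upper half
    return (low + high) / 2


def profit(price):
    """Hypothetical 'set cell' formula; price is the 'changing cell'."""
    return 1000 * (price - 6.0) - 2000.0


# Which price makes profit equal 0 (the 'to value')?
break_even_price = goal_seek(profit, target=0.0, low=0.0, high=100.0)
print(round(break_even_price, 4))
```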

104
FREE “TIP OF THE WEEK”
