2. Describing the Distribution of a Variable
Populations and Samples
Data Sets, Variables, and Observations
Data Types
Person  Age  Gender  Salary   Region  State    City     First Purchase  Amount Spent
1       1    Male    $16,400  South   Florida  Orlando  10/23/2011      $218
3       3    Female  $97,300  South   Florida  Orlando  8/18/2015       $3,048
3. Summarizing Categorical Variables Using Excel
• Identify and name the categories within categorical variables.
• Count the observations in each category using Excel's COUNTIF function.
• For example, to count the males in the Gender column, a formula like
=COUNTIF($D$2:$D$14060, "Male") might be used.
• Calculate the percentage of total observations for each category.
• Display the summarized data graphically using column charts or pie charts in Excel.
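The counting and percentage steps above can also be sketched outside Excel. The following is a minimal Python illustration, assuming a small hypothetical list of gender values rather than the actual data set; `Counter` plays the role that COUNTIF plays in the worksheet.

```python
from collections import Counter

# Hypothetical sample values; the real Gender column is not reproduced here.
genders = ["Male", "Female", "Female", "Male", "Female"]

# Count the observations in each category (analogous to one COUNTIF per category).
counts = Counter(genders)

# Convert each count to a percentage of all observations.
total = sum(counts.values())
percentages = {category: 100 * n / total for category, n in counts.items()}

for category in sorted(counts):
    print(f"{category}: {counts[category]} ({percentages[category]:.2f}%)")
```

The resulting counts and percentages are the same two columns that a summary table for a categorical variable would contain.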
4. Categorical Summaries
Gender  Count  Percentage
M       6889   49.00%
F       7170   51.00%
Total   14059  100.00%

Marital Status  Count  Percentage
S               7193   51.16%
M               6866   48.84%
Total           14059  100.00%
Annual Income   Count  Percentage
$10K - $30K     3090   21.98%
$30K - $50K     4601   32.73%
$50K - $70K     2370   16.86%
$70K - $90K     1709   12.16%
$90K - $110K    613    4.36%
$110K - $130K   643    4.57%
$130K - $150K   760    5.41%
$150K +         273    1.94%
Total           14059  100.00%
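The percentage column in the Annual Income table can be reproduced from the counts alone. The sketch below is one way to check the arithmetic in Python; the counts are copied from the table, while the variable names are illustrative choices.

```python
# Counts copied from the Annual Income table.
income_counts = {
    "$10K - $30K": 3090,
    "$30K - $50K": 4601,
    "$50K - $70K": 2370,
    "$70K - $90K": 1709,
    "$90K - $110K": 613,
    "$110K - $130K": 643,
    "$130K - $150K": 760,
    "$150K +": 273,
}

# Total number of observations across all income bands.
total = sum(income_counts.values())

# Percentage of the total in each band, rounded to two decimals as in the table.
percentages = {band: round(100 * n / total, 2) for band, n in income_counts.items()}

for band, pct in percentages.items():
    print(f"{band}: {pct:.2f}%")
```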
Summaries of Categorical Variables
[Figures: charts of Gender Count and Gender Percentage by category (M, F)]
5. Mean, Median, Mode,
Standard Deviation, Minimum, Maximum,
Quartile 0, 1st Quartile, 2nd Quartile,
3rd Quartile, 4th Quartile, 0 Percentile,
25 Percentile, 30 Percentile, 50 Percentile,
75 Percentile, 100 Percentile, Range,
Variance, Skewness, Kurtosis
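Most of the measures listed above can be computed with Python's standard `statistics` module. The sketch below uses a small hypothetical sample, not the data set from the slides. The quartiles use the "inclusive" interpolation method, and the skewness and kurtosis lines use the bias-adjusted sample formulas; both choices are assumptions made here to approximate what Excel's functions report.

```python
import statistics

# Hypothetical sample, chosen only so the results are easy to verify by hand.
data = [2, 4, 4, 5, 7, 9, 11]
n = len(data)

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
variance = statistics.variance(data)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)       # square root of the sample variance
minimum, maximum = min(data), max(data)
data_range = maximum - minimum

# 1st, 2nd, and 3rd quartiles (25th, 50th, and 75th percentiles).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Bias-adjusted sample skewness and excess kurtosis from standardized values.
z = [(x - mean) / std_dev for x in data]
skewness = n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)
kurtosis = (
    n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
    - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
)

print(mean, median, mode, variance, data_range, (q1, q2, q3))
```

Quartile 0 and the 0 percentile are the minimum, and the 4th quartile and 100 percentile are the maximum, so they need no separate calculation.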
6. 2.5: Time Series Data
• Time Series in Business Analysis:
• Trend Analysis
• Seasonal Analysis
• Sales Forecasting
• Performance Monitoring
• Financial Analysis
• Budgeting and Planning
7. Time Series Graph and Tips (in Excel):
Here are the steps to create a time series graph in Excel:
1. Select one or more columns with time series data.
2. Optionally, select a column with dates if there is one. These dates are used to label the horizontal axis.
3. Select one of the line chart types, such as the line chart with markers, from the Insert ribbon.
[Figure: time series line chart with two series, Property crime total and Violent crime total]
8. Outliers
• Outliers are data points that differ markedly from the other data points
in a dataset. Identifying them is important for data cleaning and for
ensuring the accuracy of analytical results.
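The slide does not say how outliers should be identified. One common rule of thumb, shown here purely as an assumption, flags any value more than 1.5 times the interquartile range (IQR) below the first quartile or above the third quartile. The sample values are hypothetical.

```python
import statistics

# Hypothetical sample with one value far from the rest.
values = [10, 12, 12, 13, 14, 15, 95]

# First and third quartiles using the "inclusive" interpolation method.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Fences of the 1.5 * IQR rule (an assumed convention, not taken from the slides).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Keep only the values that fall outside the fences.
outliers = [v for v in values if v < lower_fence or v > upper_fence]
print(outliers)
```

Other conventions exist, such as flagging values more than a chosen number of standard deviations from the mean, so the right threshold depends on the data.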
9. Missing Values
• Replace missing values with a specific value (e.g., mean, median, mode) or
use methods like forward or backward filling.
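The replacement strategies named above can be sketched as follows. This is a minimal illustration on a hypothetical list in which `None` marks a missing value; the helper names `forward_fill` and `backward_fill` are introduced here for illustration only.

```python
import statistics

# Hypothetical series; None marks a missing value.
values = [10, None, 14, None, None, 18]

# Strategy 1: replace each missing value with the mean of the observed values.
observed = [v for v in values if v is not None]
mean_value = statistics.mean(observed)
filled_with_mean = [mean_value if v is None else v for v in values]

def forward_fill(seq):
    """Replace each missing value with the most recent observed value."""
    filled, last = [], None
    for v in seq:
        if v is not None:
            last = v
        filled.append(last)  # stays None if nothing has been observed yet
    return filled

def backward_fill(seq):
    """Replace each missing value with the next observed value."""
    return forward_fill(seq[::-1])[::-1]

print(filled_with_mean)
print(forward_fill(values))
print(backward_fill(values))
```

Swapping `statistics.mean` for `statistics.median` or `statistics.mode` gives the other specific-value replacements mentioned above.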
10. Data Filtering
Data filtering is the process of selectively displaying information that meets defined criteria.
Here is the process of data filtering:
• Select Range
• Access Filter
• Enable Filter
• Column Dropdown
• Set Criteria
• Apply Filter
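The same idea, keeping only the rows that meet a criterion, can be sketched in Python. The two records below reuse values from the sample rows shown earlier in the deck; the field names and the salary threshold are illustrative assumptions.

```python
# Two records based on the sample rows shown earlier (field names are assumed).
records = [
    {"Person": 1, "Gender": "Male", "Salary": 16400, "City": "Orlando"},
    {"Person": 3, "Gender": "Female", "Salary": 97300, "City": "Orlando"},
]

# Criterion on one column: keep rows where Gender is "Female".
females = [row for row in records if row["Gender"] == "Female"]

# Criteria on two columns: keep Orlando rows with a salary above 50,000.
high_earners = [
    row for row in records
    if row["City"] == "Orlando" and row["Salary"] > 50000
]

print([row["Person"] for row in females])
print([row["Person"] for row in high_earners])
```

As with an Excel filter, the original `records` list is left unchanged; only the displayed subset differs.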
11. 2.7: Excel Tables for Filtering, Sorting, and Summarizing
Data Sorting
Data sorting is arranging data in a specific order, such as
alphabetically or numerically.
The process of data sorting:
• Selecting the Data
• Accessing the Sort Function
• Choosing the Sort Criteria
• Executing the Sort
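The sorting steps above map onto Python's built-in `sorted` function, where the `key` argument is the sort criterion. The records reuse values from the sample rows shown earlier in the deck; the field names are illustrative assumptions.

```python
# Two records based on the sample rows shown earlier (field names are assumed).
records = [
    {"Person": 1, "Gender": "Male", "Salary": 16400},
    {"Person": 3, "Gender": "Female", "Salary": 97300},
]

# Numeric sort: highest salary first.
by_salary_desc = sorted(records, key=lambda row: row["Salary"], reverse=True)

# Alphabetical sort: Gender from A to Z.
by_gender = sorted(records, key=lambda row: row["Gender"])

print([row["Person"] for row in by_salary_desc])
print([row["Person"] for row in by_gender])
```

`sorted` returns a new list and leaves `records` in its original order, which makes it easy to compare several orderings of the same data.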