Descriptive statistics are used to summarize and describe characteristics of a data set. It includes measures of central tendency like mean, median, and mode, measures of variability like range and standard deviation, and the distribution of data through histograms. Inferential statistics are used to generalize results from a sample to the population it represents through estimation of population parameters and hypothesis testing. Correlation and regression analysis are used to study relationships between two or more variables.
Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making.
Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making.
The two major areas of statistics are: descriptive statistics and inferential statistics. In this presentation, the difference between the two are shown including examples.
SAMPLING MEANDEFINITIONThe term sampling mean is a stati.docxanhlodge
SAMPLING MEAN:
DEFINITION:
The term sampling mean is a statistical term used to describe the properties of statistical distributions. In statistical terms, the sample meanfrom a group of observations is an estimate of the population mean. Given a sample of size n, consider n independent random variables X1, X2... Xn, each corresponding to one randomly selected observation. Each of these variables has the distribution of the population, with mean and standard deviation. The sample mean is defined to be
WHAT IT IS USED FOR:
It is also used to measure central tendency of the numbers in a database. It can also be said that it is nothing more than a balance point between the number and the low numbers.
HOW TO CALCULATE IT:
To calculate this, just add up all the numbers, then divide by how many numbers there are.
Example: what is the mean of 2, 7, and 9?
Add the numbers: 2 + 7 + 9 = 18
Divide by how many numbers (i.e., we added 3 numbers): 18 ÷ 3 = 6
So the Mean is 6
SAMPLE VARIANCE:
DEFINITION:
The sample variance, s2, is used to calculate how varied a sample is. A sample is a select number of items taken from a population. For example, if you are measuring American people’s weights, it wouldn’t be feasible (from either a time or a monetary standpoint) for you to measure the weights of every person in the population. The solution is to take a sample of the population, say 1000 people, and use that sample size to estimate the actual weights of the whole population.
WHAT IT IS USED FOR:
The sample variance helps you to figure out the spread out in the data you have collected or are going to analyze. In statistical terminology, it can be defined as the average of the squared differences from the mean.
HOW TO CALCULATE IT:
Given below are steps of how a sample variance is calculated:
· Determine the mean
· Then for each number: subtract the Mean and square the result
· Then work out the mean of those squared differences.
To work out the mean, add up all the values then divide by the number of data points.
First add up all the values from the previous step.
But how do we say "add them all up" in mathematics? We use the Roman letter Sigma: Σ
The handy Sigma Notation says to sum up as many terms as we want.
· Next we need to divide by the number of data points, which is simply done by multiplying by "1/N":
Statistically it can be stated by the following:
·
· This value is the variance
EXAMPLE:
Sam has 20 Rose Bushes.
The number of flowers on each bush is
9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
Work out the sample variance
Step 1. Work out the mean
In the formula above, μ (the Greek letter "mu") is the mean of all our values.
For this example, the data points are: 9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
The mean is:
(9+2+5+4+12+7+8+11+9+3+7+4+12+5+4+10+9+6+9+4) / 20 = 140/20 = 7
So:
μ = 7
Step 2. Then for each number: subtract the Mean and square the result
This is t.
The two major areas of statistics are: descriptive statistics and inferential statistics. In this presentation, the difference between the two are shown including examples.
SAMPLING MEANDEFINITIONThe term sampling mean is a stati.docxanhlodge
SAMPLING MEAN:
DEFINITION:
The term sampling mean is a statistical term used to describe the properties of statistical distributions. In statistical terms, the sample meanfrom a group of observations is an estimate of the population mean. Given a sample of size n, consider n independent random variables X1, X2... Xn, each corresponding to one randomly selected observation. Each of these variables has the distribution of the population, with mean and standard deviation. The sample mean is defined to be
WHAT IT IS USED FOR:
It is also used to measure central tendency of the numbers in a database. It can also be said that it is nothing more than a balance point between the number and the low numbers.
HOW TO CALCULATE IT:
To calculate this, just add up all the numbers, then divide by how many numbers there are.
Example: what is the mean of 2, 7, and 9?
Add the numbers: 2 + 7 + 9 = 18
Divide by how many numbers (i.e., we added 3 numbers): 18 ÷ 3 = 6
So the Mean is 6
SAMPLE VARIANCE:
DEFINITION:
The sample variance, s2, is used to calculate how varied a sample is. A sample is a select number of items taken from a population. For example, if you are measuring American people’s weights, it wouldn’t be feasible (from either a time or a monetary standpoint) for you to measure the weights of every person in the population. The solution is to take a sample of the population, say 1000 people, and use that sample size to estimate the actual weights of the whole population.
WHAT IT IS USED FOR:
The sample variance helps you to figure out the spread out in the data you have collected or are going to analyze. In statistical terminology, it can be defined as the average of the squared differences from the mean.
HOW TO CALCULATE IT:
Given below are steps of how a sample variance is calculated:
· Determine the mean
· Then for each number: subtract the Mean and square the result
· Then work out the mean of those squared differences.
To work out the mean, add up all the values then divide by the number of data points.
First add up all the values from the previous step.
But how do we say "add them all up" in mathematics? We use the Roman letter Sigma: Σ
The handy Sigma Notation says to sum up as many terms as we want.
· Next we need to divide by the number of data points, which is simply done by multiplying by "1/N":
Statistically it can be stated by the following:
·
· This value is the variance
EXAMPLE:
Sam has 20 Rose Bushes.
The number of flowers on each bush is
9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
Work out the sample variance
Step 1. Work out the mean
In the formula above, μ (the Greek letter "mu") is the mean of all our values.
For this example, the data points are: 9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
The mean is:
(9+2+5+4+12+7+8+11+9+3+7+4+12+5+4+10+9+6+9+4) / 20 = 140/20 = 7
So:
μ = 7
Step 2. Then for each number: subtract the Mean and square the result
This is t.
SAMPLING MEANDEFINITIONThe term sampling mean is a stati.docxagnesdcarey33086
SAMPLING MEAN:
DEFINITION:
The term sampling mean is a statistical term used to describe the properties of statistical distributions. In statistical terms, the sample meanfrom a group of observations is an estimate of the population mean. Given a sample of size n, consider n independent random variables X1, X2... Xn, each corresponding to one randomly selected observation. Each of these variables has the distribution of the population, with mean and standard deviation. The sample mean is defined to be
WHAT IT IS USED FOR:
It is also used to measure central tendency of the numbers in a database. It can also be said that it is nothing more than a balance point between the number and the low numbers.
HOW TO CALCULATE IT:
To calculate this, just add up all the numbers, then divide by how many numbers there are.
Example: what is the mean of 2, 7, and 9?
Add the numbers: 2 + 7 + 9 = 18
Divide by how many numbers (i.e., we added 3 numbers): 18 ÷ 3 = 6
So the Mean is 6
SAMPLE VARIANCE:
DEFINITION:
The sample variance, s2, is used to calculate how varied a sample is. A sample is a select number of items taken from a population. For example, if you are measuring American people’s weights, it wouldn’t be feasible (from either a time or a monetary standpoint) for you to measure the weights of every person in the population. The solution is to take a sample of the population, say 1000 people, and use that sample size to estimate the actual weights of the whole population.
WHAT IT IS USED FOR:
The sample variance helps you to figure out the spread out in the data you have collected or are going to analyze. In statistical terminology, it can be defined as the average of the squared differences from the mean.
HOW TO CALCULATE IT:
Given below are steps of how a sample variance is calculated:
· Determine the mean
· Then for each number: subtract the Mean and square the result
· Then work out the mean of those squared differences.
To work out the mean, add up all the values then divide by the number of data points.
First add up all the values from the previous step.
But how do we say "add them all up" in mathematics? We use the Roman letter Sigma: Σ
The handy Sigma Notation says to sum up as many terms as we want.
· Next we need to divide by the number of data points, which is simply done by multiplying by "1/N":
Statistically it can be stated by the following:
·
· This value is the variance
EXAMPLE:
Sam has 20 Rose Bushes.
The number of flowers on each bush is
9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
Work out the sample variance
Step 1. Work out the mean
In the formula above, μ (the Greek letter "mu") is the mean of all our values.
For this example, the data points are: 9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
The mean is:
(9+2+5+4+12+7+8+11+9+3+7+4+12+5+4+10+9+6+9+4) / 20 = 140/20 = 7
So:
μ = 7
Step 2. Then for each number: subtract the Mean and square the result
This is t.
Answer the questions in one paragraph 4-5 sentences. · Why did t.docxboyfieldhouse
Answer the questions in one paragraph 4-5 sentences.
· Why did the class collectively sign a blank check? Was this a wise decision; why or why not? we took a decision all the class without hesitation
· What is something that I said individuals should always do; what is it; why wasn't it done this time? Which mitigation strategies were used; what other strategies could have been used/considered? individuals should always participate in one group and take one decision
SAMPLING MEAN:
DEFINITION:
The term sampling mean is a statistical term used to describe the properties of statistical distributions. In statistical terms, the sample meanfrom a group of observations is an estimate of the population mean. Given a sample of size n, consider n independent random variables X1, X2... Xn, each corresponding to one randomly selected observation. Each of these variables has the distribution of the population, with mean and standard deviation. The sample mean is defined to be
WHAT IT IS USED FOR:
It is also used to measure central tendency of the numbers in a database. It can also be said that it is nothing more than a balance point between the number and the low numbers.
HOW TO CALCULATE IT:
To calculate this, just add up all the numbers, then divide by how many numbers there are.
Example: what is the mean of 2, 7, and 9?
Add the numbers: 2 + 7 + 9 = 18
Divide by how many numbers (i.e., we added 3 numbers): 18 ÷ 3 = 6
So the Mean is 6
SAMPLE VARIANCE:
DEFINITION:
The sample variance, s2, is used to calculate how varied a sample is. A sample is a select number of items taken from a population. For example, if you are measuring American people’s weights, it wouldn’t be feasible (from either a time or a monetary standpoint) for you to measure the weights of every person in the population. The solution is to take a sample of the population, say 1000 people, and use that sample size to estimate the actual weights of the whole population.
WHAT IT IS USED FOR:
The sample variance helps you to figure out the spread out in the data you have collected or are going to analyze. In statistical terminology, it can be defined as the average of the squared differences from the mean.
HOW TO CALCULATE IT:
Given below are steps of how a sample variance is calculated:
· Determine the mean
· Then for each number: subtract the Mean and square the result
· Then work out the mean of those squared differences.
To work out the mean, add up all the values then divide by the number of data points.
First add up all the values from the previous step.
But how do we say "add them all up" in mathematics? We use the Roman letter Sigma: Σ
The handy Sigma Notation says to sum up as many terms as we want.
· Next we need to divide by the number of data points, which is simply done by multiplying by "1/N":
Statistically it can be stated by the following:
·
· This value is the variance
EXAMPLE:
Sam has 20 Rose Bushes.
The number of flowers on each b.
SAMPLING MEAN DEFINITION The term sampling mean is.docxagnesdcarey33086
SAMPLING MEAN:
DEFINITION:
The term sampling mean is a statistical term used to describe the properties of statistical
distributions. In statistical terms, the sample mean from a group of observations is an
estimate of the population mean . Given a sample of size n, consider n independent random
variables X1, X2... Xn, each corresponding to one randomly selected observation. Each of these
variables has the distribution of the population, with mean and standard deviation . The
sample mean is defined to be
WHAT IT IS USED FOR:
It is also used to measure central tendency of the numbers in a database. It can also be said that
it is nothing more than a balance point between the number and the low numbers.
HOW TO CALCULATE IT:
To calculate this, just add up all the numbers, then divide by how many numbers there are.
Example: what is the mean of 2, 7, and 9?
Add the numbers: 2 + 7 + 9 = 18
Divide by how many numbers (i.e., we added 3 numbers): 18 ÷ 3 = 6
So the Mean is 6
SAMPLE VARIANCE:
DEFINITION:
The sample variance, s2, is used to calculate how varied a sample is. A sample is a select number
of items taken from a population. For example, if you are measuring American people’s weights,
it wouldn’t be feasible (from either a time or a monetary standpoint) for you to measure the
weights of every person in the population. The solution is to take a sample of the population, say
1000 people, and use that sample size to estimate the actual weights of the whole population.
WHAT IT IS USED FOR:
The sample variance helps you to figure out the spread out in the data you have collected or are
going to analyze. In statistical terminology, it can be defined as the average of the squared
differences from the mean.
HOW TO CALCULATE IT:
Given below are steps of how a sample variance is calculated:
• Determine the mean
• Then for each number: subtract the Mean and square the result
• Then work out the mean of those squared differences.
To work out the mean, add up all the values then divide by the number of data points.
First add up all the values from the previous step.
But how do we say "add them all up" in mathematics? We use the Roman letter Sigma: Σ
The handy Sigma Notation says to sum up as many terms as we want.
• Next we need to divide by the number of data points, which is simply done by
multiplying by "1/N":
Statistically it can be stated by the following:
•
• This value is the variance
EXAMPLE:
Sam has 20 Rose Bushes.
The number of flowers on each bush is
9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
Work out the sample variance
Step 1. Work out the mean
In the formula above, µ (the Greek letter "mu") is the mean of all our values.
For this example, the data points are: 9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
The mean is:
(9+2+5+4+12+7+8+11+9+3+7+4+12+5+4+10+9+6+9+4) / 20 = 140/20 = 7
So:
µ.
These is info only ill be attaching the questions work CJ 301 – .docxmeagantobias
These is info only ill be attaching the questions work CJ 301 –
Measures of Dispersion/Variability
Think back to the description of
measures of central tendency
that describes these statistics as measures of how the data in a distribution are clustered, around what summary measure are most of the data points clustered.
But when comes to descriptive statistics and describing the characteristics of a distribution, averages are only half story. The other half is measures of variability.
In the most simple of terms, variability reflects how scores differ from one another. For example, the following set of scores shows some variability:
7, 6, 3, 3, 1
The following set of scores has the same mean (4) and has less variability than the previous set:
3, 4, 4, 5, 4
The next set has no variability at all – the scores do not differ from one another – but it also has the same mean as the other two sets we just showed you.
4, 4, 4, 4, 4
Variability (also called spread or dispersion) can be thought of as a measure of how different scores are from one another. It is even more accurate (and maybe even easier) to think of variability as how different scores are from one particular score. And what “score” do you think that might be? Well, instead of comparing each score to every other score in a distribution, the one score that could be used as a comparison is – that is right- the mean. So, variability becomes a measure of how much each score in a group of scores differs from the mean.
Remember what you already know about computing averages – that an average (whether it is the mean, the median or the mode) is a representative score in a set of scores. Now, add your new knowledge about variability- that it reflects how different scores are from one another. Each is important descriptive statistic. Together, these two (average and variability) can be used to describe the characteristics of a distribution and show how distribution differ from one another.
Measures of dispersion/variability
describe how the data in a distribution a
re scattered or dispersed around, or from, the central point represented by the measure of central tendency.
We will discuss
four different measures of dispersion
, the
range
, the
mean deviation
, the
variance
, and the
standard deviation
.
RANGE
The
range
is a very simple measure of dispersion to calculate and interpret.
The
range
is simply the difference between the highest score and the lowest score in a distribution.
Consider the following distribution that measures the “Age” of a random sample of eight police officers in a small rural jurisdiction.
Officer
X = Age_
41
20
35
25
23
30
21
32
First, let’s calculate the mean as our measure of central tendency by adding the individual ages of each officer and dividing by the number of officers.
The calculation is 227/8 = 28.375 years.
In general, the formula for the range is:
R=h-l
Where:
r is the range
h.
Detail Description about Probability Distribution for Dummies. The contents are about random variables, its types(Discrete and Continuous) , it's distribution (Discrete probability distribution and probability density function), Expected value, Binomial, Poisson and Normal Distribution usage and solved example for each topic.
It gives detail description about probability, types of probability, difference between mutually exclusive events and independent events, difference between conditional and unconditional probability and Bayes' theorem
How to write research proposal?, How to write statement of the problem?, Difference between Research question and hypothesis?, Difference between internal and external validity. Difference between l
This slides gives knowledge about how to define a research question. what are the do's and don'ts while defining research question, steps to define a research questions.examples of research questions
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...pchutichetpong
M Capital Group (“MCG”) expects to see demand and the changing evolution of supply, facilitated through institutional investment rotation out of offices and into work from home (“WFH”), while the ever-expanding need for data storage as global internet usage expands, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
2. Role of statistics in research
Designing research
Analyzing data
Draw conclusion about research
3. Two major areas of statistics
Descriptive statistics
It concern with development of certain indices from the raw
data.
It summarizes collected/ classified data.
Inferential statistics
It adopts the process of generalization from small groups
(i.e., samples) to population.
It also known as sampling statistics.
It concern with two major problems
Estimation of population parameters.
Testing of statistical hypothesis.
4. Descriptive statistics
Measure of central tendency (Statistical
Average)
Measure of dispersion
Measure of asymmetry (Skewness)
Measure of relationship
Other measures
5. Measure of Central tendency
It also known as statistical average.
Mean, Median and Mode are the popular
averages.
Geometric and Harmonic mean are also
sometime used.
6. Mean
It also known as arithmetic average.
The mean is found by adding all the values in
the set, then dividing the sum by the number of
values.
7. Example
Find the mean of following numbers 8, 9, 10,
10, 10, 11, 11, 11, 12, 13
The mean is the usual average:
n= 10
(8 + 9 + 10 + 10 + 10 + 11 + 11 + 11 + 12 + 13)
÷ 10 = 105 ÷ 10 = 10.5
8. Weighted Mean
The weighted average formula is used to
calculate the average value of a particular set
of numbers with different levels of weight.
9. Example
Final marks has distribution of internal marks,
Mid semester marks and End semester marks.
Each one carries following weightage
Internal Mid-Sem End-Sem
30 20 50
Shakthi got 80 marks in internal, 75 marks in
Mid sem and 89 marks in End sem. Then what
is his final mark? Ans = 83.5
10. Frequency distribution Mean
Parkin
g
Spaces
Freque
ncy
1 15
2 27
3 8
Isabella went up and down the street to find out how many parking
spaces each house had. Here are her results: What is the mean
number of Parking Spaces?
Mean = 15×1 + 27×2 + 8×3 + 5×4
15+54+24+20 /15+27+8+5
Mean = 2.05
11. Median
Median is the value of the middle item of
series when it is arranged in ascending or
descending order of magnitude.
It divides the series into two halves; in one half
all items are less than median, whereas in the
other half all items have values higher than
median
12. Example
21, 18, 24, 19, 27
Arrange the number in ascending order
18, 19, 21, 24, 27 (21 is the median)
If there are two middle numbers,
18, 19, 21, 25, 27, 28
(21 + 25 )/2 = 23
13. Mode
The number that appears most frequently in a set
of numbers.
It useful in the studies of popular. (Popular shoe
size, popular cap, most demanded product etc.,)
Arrange the numbers in order from least to
greatest.
21, 18, 24, 19, 18
Find the number that is repeated the most.
18, 18, 19, 21, 24 ( 18 is the mode).
14. Geometric Mean
It is defined as the nth root of the product of the values of n times in
a given series.
The most frequently used application of this average is in the
determination of average percent of change.
15. Example
What is the geometric mean of 4,6 and 9 ?
Ans: 6
Your investment earns 20% during the first year, but
then realizes a loss of 10% in year 2, and another
10% in year 3
Calculate a growth factor for each year. (1+.2) for Year 1, (1-
.1) for year 2 and (1-.1) for Year 3
Multiply the 3 growth factors and take the 3rd root.
Thus, geometric mean = 0.990578-1=-0.009422.
So your investment losing roughly .9 percent of every year.
16. Harmonic Mean
It is defined as reciprocal of the average of
reciprocal of values of items of a series.
It has limited application particularly in time
and rate are involved.
It gives largest weight to smallest value and
smallest weight to largest value.
18. Example
For example, suppose that you have four 10 km
segments to your automobile trip. You drive your
car:
100 km/hr for the first 10 km
110 km/hr for the second 10 km
90 km/hr for the third 10 km
120 km/hr for the fourth 10 km.
Ans: 103.8 km/hr.
19. Measure of Dispersion
Average can’t reveal the entire story of study. It
fail to give ideas about data which distributed
around average.
Measures of dispersion measure how spread out
a set of data is.
Important measure of dispersion are
Range
Mean Deviation
Standard deviation
20. Range
It is difference between extreme values.
Range = Highest value – Lowest value
What is the range for following data?
2, 3, 1, 1, 0, 5, 3, 1, 2, 7, 4, 0, 2, 1, 2, 1, 6, 3, 2, 0,
0, 7, 4, 2, 1, 1, 2, 1, 3, 5, 12
Range = 12
Limitation: Its value is never stable, being based
on only two values of the variable.
21. Mean Deviation
It is the average of difference of the values of items
from some average of the series
Find the mean deviation of 3, 6, 6, 7, 8, 11, 15, 16
Find the mean
Find the absolute distance
from the mean.
Find mean of those
Distances.
Ans= 3.75
22. Coefficient of mean deviation
When mean deviation is divided by the
average used in finding out the mean deviation
itself, the resulting quantity is described as the
coefficient of mean deviation.
Coefficient of mean deviation is a relative
measure of dispersion and is comparable to
similar measure of other series.
23. Standard deviation
It is most widely used measure of dispersion of
a series and is commonly denoted by the
symbol ‘ 𝜎’ (pronounced as sigma).
24. Standard deviation
Find the standard deviation of 53,61,49,67,
55,63.
Steps:
Find the mean (58)
Find the deviation from mean, square it and sum it
(230)
Divide the above answer by sample size (38.333)
25. Other terms related to 𝜎
Variance : Square the standard deviation
Coefficient of standard deviation: Divide the
standard deviation by Mean of that data.
Coefficient of variation: Multiply the coefficient
of standard deviation with 100.
26. Distribution
The pattern of outcomes of a variable; it tells
us what values the variable takes and how
often it takes these values.
The distribution of data can be
find with help of histogram.
Histogram is bar chart.
27. Steps in making Distribution
Choose the classes by dividing the range of data
into classes of equal width (individuals fit into one
class).
Count the individuals in each class (this is the
height of the bar).
Draw the histogram:
The horizontal axis is marked off into equal class
widths.
The vertical axis contains the scale of counts
28. Histogram
The number of days of Maria’s last 15
vacations are listed below. Use the data to
make a frequency table with intervals.
4, 8, 6, 7, 5, 4, 10, 6, 7, 14, 12, 8, 10, 15, 12.
Step 1: Identify the least and greatest values.
Step 2: Divide the data into equal intervals.
29. Histogram
Step 3: List the intervals in the first column of
the table. Count the number of data values in
each interval and list the count in the last
column. Give the table a title.
30. Normal Distribution
The distribution of data happens to be
perfectly symmetrical.
It is perfectly bell shaped curve in which case
the value of mean 𝑋 = median M = mode Z.
31. Skewness
if the curve is distorted (whether on the right
side or on the left side), we have asymmetrical
distribution which indicates that there is
skewness.
If the curve is distorted on the right side, we
have positive skewness.
If the curve is distorted on the left side, we
have negative skewness.
33. Measure of Skewness
Skewness = 𝑋 - Z or 3 ( 𝑋 - M)
Coefficient of Skewness =
( 𝑋− 𝑍)
𝜎
If skewness value is positive/ negative/ zero,
then data are positively/negatively
skewed/symmetry.
34. Kurtosis
Kurtosis is the measure of flat-toppedness or
peakedness of a curve.
Normal curve is MesoKurtic..
Positive kurtosis is leptokurtic.
Negative kurtosis is platykurtic
35. Measure of Relationship
Univariate Analysis: The analysis is carried out
with the description of a single variable.
Bivariate Analysis: The analysis of two
variables simultaneously.
Multivariate Analysis: The analysis of multiple
variables simultaneously.
36. Measure of Relationship
Correlation
The word Correlation is made of Co- (meaning "together"),
and Relation.
It answers that Does there exist association or correlation between
the two (or more) variables ?
Regression
It answers that:
Is there any cause and effect relationship between the two variables
in case of the bivariate population or between one variable on one
side and two or more variables on the other side in case of
multivariate population? If yes, of what degree and in which direction?
37. Karl Pearson’s coefficient of
correlation
It is simple correlation and most widely used
method. Denotes by r.
It is also known as the product moment
correlation coefficient.
The value of r lies between ±1.
38. r value and interpretation
We can also say that for a unit change in independent variable, if there happens
to be a constant change in the dependent variable in the same direction, then
correlation will be termed as perfect positive.
40. Karl Pearson Correlation coefficient
formula
r
If substitute the value of sigma and derive, you will get below
equation
41. Example
A researcher want to know the relation between
advertisement expenditure and total sales. He took a sample
data of 7 companies for one year.
44. Regression
It is the study of the relationship between
variables.
It also used for prediction of dependent
variable.
Regression types
Simple Regression: single explanatory/independent
variable
45. Regression Analysis
Linear Regression: Straight-line relationship.
Non-linear: Implies curved relationships.
Regression is nothing but try to find out the equation
of line/curve.
Linear Non linear
y=mx+b
46. Linear Regression Equation
Unknown parameter a and b can be calculated by below
formulam = b = slope of line, a= c = intercept
Above formula gives best fit line by using least square method.
b = Σ [ (𝑥𝑖 - 𝑥)(𝑦𝑖 - 𝑦) ] / Σ [ (𝑥𝑖 − 𝑥)2
]
a = 𝑦 - b * 𝑥
47. Example for regression
In the table below, the X column shows scores
on the aptitude test. Similarly, the Y column
shows statistics grades.
Student X Y
1
2
3
4
5
95
85
80
70
60
85
95
70
65
70
49. Standard error
It provide an overall measure of how well the model
fits the data.
It represents the average distance that the
observed values fall from the regression line.
Smaller values are better because it indicates that
the observations are closer to the fitted line.
50. Standard error
Y 𝑌 (𝑌 − 𝑌) (𝑌 − 𝑌)2
85 81.508 3.492 12.19406
95 87.948 7.052 49.7307
70 71.848 -1.848 3.415104
65 68.628 -3.628 13.16238
70 71.848 -1.848 3.415104
Sum 81.91736
Standard error = 4.0475
52. Partial Correlation
In simple correlation, we measure the strength of the linear
relationship between two variables, without taking into
consideration the fact that both these variables may be
influenced by a third variable.
For Ex: when we study the correlation between price and demand,
we completely ignore the effect of other factors like money supply,
import and exports etc. which definitely have a bearing on the price.
The correlation co-efficient between two variables X1 and X2,
studied partially after eliminating the influence of the third
variable X3 from both of them, is the partial correlation co-
efficient.
53. Other Measures
Index number
Indicator of average percentage change in a series of figures
where one figure (called the base) is assigned an arbitrary
value of 100, and other figures are adjusted in proportion to
the base.
Time series analysis
Unlike the analyses of random samples of observations that
are discussed in the context of most other statistics, the
analysis of time series is based on the assumption that
successive values in the data file represent consecutive
measurements taken at equally spaced time intervals.