This document discusses two prominent methods for finding outliers in statistics: the interquartile range (IQR) method and the Tukey method. Both methods use quartiles to determine a range of values that are not outliers, and then identify outliers as any data points that fall above or below this range. The document provides examples of each method being applied to sample data sets to identify outlier values. It concludes by encouraging the use of these IQR and Tukey methods to solve problems involving outliers.
2. Today's
Discussion
What are outliers in statistics?
Examples of outliers in statistics
How to find outliers in statistics
using the Interquartile Range (IQR)?
How to find the outliers in statistics
using the Tukey method?
Specifications
Conclusion
3. An outlier in statistics is a data point
that lies an unusually large distance
from the other values in a data set. In
other words, it is a value that falls
outside the range of the rest of the
data. If Pinocchio were in a class of
teenagers, the length of his nose
would be an outlier compared with
the other children.
What are
outliers in
statistics?
4. In the following set of values, 5 and 199 are outliers:
5, 94, 95, 96, 99, 104, 105, 199
Here “5” is an extremely low value, whereas “199” is an
extremely high value. But outliers are not always this
easy to spot. Suppose one received the following
paychecks last month:
$220, $245, $20, $230.
The average paycheck is $178.75. But the small
paycheck ($20) may be there because that person went
on holiday, so the $178.75 average does not truly
represent what they earn. Their average is more like
$232 if one excludes the outlier ($20) from the data set.
That is why finding outliers might not be as simple as it
seems. A data set might look like this:
Examples of
outliers in
statistics
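The paycheck example can be reproduced with a short Python sketch. The helper name `mean_without` is hypothetical and is only meant to illustrate how dropping one suspected outlier changes the average.

```python
from statistics import mean

# The four paychecks from the example above.
paychecks = [220, 245, 20, 230]


def mean_without(values, excluded):
    """Return the mean of values after dropping every entry equal to excluded."""
    kept = [v for v in values if v != excluded]
    return mean(kept)


print("mean of all paychecks:", mean(paychecks))
print("mean without the $20 paycheck:", round(mean_without(paychecks, 20), 2))
```

Comparing the two printed values shows how strongly a single extreme value can pull the mean.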
5. 60, 9, 31, 18, 21, 28, 35, 13, 48, 2.
One might guess that 2 is an outlier, and possibly 60.
A guess is not reliable, though: under the 1.5 * IQR rule
described later, the limits for this set are -20 and 68, so
neither value is flagged as an outlier.
Box-and-whisker charts often show outliers:
Examples of
outliers in
statistics
6. However, one might not have access to a box-and-whisker
chart. Even if one does, some box plots do not show
outliers. For instance, a chart may have whiskers that
extend far enough to include the outliers.
That is why one should not rely on a box-and-whisker
chart to find outliers in statistics. That said, such charts
can be a valuable tool for presenting outliers after one
has determined what they are. An efficient way to find
all outliers is to use the interquartile range (IQR). The IQR
spans the middle 50% of the data; therefore, outliers can
be determined quickly once one knows the IQR.
Examples of
outliers in
statistics
7. An outlier is a data point that lies more than 1.5 IQRs below the
first quartile (Q1) or above the third quartile (Q3) of a set of data.
Low = Q1 – 1.5 * IQR
High = Q3 + 1.5 * IQR
Sample Problem: Find all of the outliers in the given data set:
10, 20, 30, 40, 50, 60, 70, 80, 90, 100.
Step 1: Find the interquartile range (IQR), Q1 (25th percentile), and
Q3 (75th percentile).
IQR = 50
Q1 (25th percentile) = 30
Q2 (50th percentile) = 55
Q3 (75th percentile) = 80
How to calculate the IQR of the above data set:
Put all the data values in order and draw a line in the middle to
split them into a lower half and an upper half.
[Lower half: (10, 20, 30, 40, 50) | Upper half: (60, 70, 80, 90, 100)]
Find the median of each half, which gives Q1 = 30 and Q3 = 80.
Subtract Q1 from Q3. [80 – 30 = 50]
How to find
outliers in
statistics
using the
Interquartile
Range (IQR)?
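The split-in-halves procedure from Step 1 can be sketched in Python as follows. The function name `quartiles_by_halves` is hypothetical, and leaving the middle value out of both halves when the count is odd is an assumption; other quartile conventions exist and can give slightly different results.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) by taking the median of each half of the sorted data.

    Assumption: with an odd number of values, the middle value belongs
    to neither half.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the dividing line
    upper = ordered[-half:]   # values above the dividing line
    return median(lower), median(ordered), median(upper)


data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
q1, q2, q3 = quartiles_by_halves(data)
iqr = q3 - q1
print(q1, q2, q3, iqr)  # 30 55.0 80 50
```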
8. Step 2: Multiply the IQR obtained in Step 1 by 1.5:
IQR * 1.5 = 50 * 1.5 = 75.
Step 3: Add the number from Step 2 to Q3 [calculated in Step 1]:
75 + 80 = 155.
This is the upper limit. Set this number aside for a moment.
Step 4: Subtract the number from Step 2 from Q1 [calculated in
Step 1]:
30 – 75 = -45.
This is the lower limit. Set this number aside for a moment.
Step 5: Put the values from the data set in order:
10, 20, 30, 40, 50, 60, 70, 80, 90, 100.
Step 6: Insert the low and high limits into the data set, in
order:
-45, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 155.
Step 7: Highlight any value that lies below the lower limit or above
the upper limit inserted in Step 6:
-45, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 155.
That is how to find outliers in statistics. In this example, no value
lies outside the limits, so the data set has no outliers.
How to find
outliers in
statistics
using the
Interquartile
Range (IQR)?
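Steps 2 to 7 can be sketched in Python as shown here. The names `iqr_limits` and `values_outside` are hypothetical helpers, and the quartiles are passed in by hand from Step 1 rather than computed.

```python
def iqr_limits(q1, q3, k=1.5):
    """Return (lower_limit, upper_limit) as Q1 - k*IQR and Q3 + k*IQR."""
    step = k * (q3 - q1)          # Step 2: multiply the IQR by 1.5
    return q1 - step, q3 + step   # Steps 4 and 3: lower and upper limits


def values_outside(values, lower, upper):
    """Return the values strictly below lower or strictly above upper."""
    return [v for v in values if v < lower or v > upper]


data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
lower, upper = iqr_limits(q1=30, q3=80)  # quartiles taken from Step 1
print("limits:", lower, upper)
print("outliers:", values_outside(data, lower, upper))
```

Filtering the list directly replaces Steps 5 to 7, where the limits are written into the ordered data and the values beyond them are picked out by eye.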
9. The Tukey method for finding outliers in statistics uses the
interquartile range to filter out very small or very large
numbers. It is equivalent to the method above, but the
formulas are written slightly differently and the terminology
is slightly different. For instance, the Tukey method uses the
idea of “fences.”
The specifications are as follows; values above the upper fence
are high outliers and values below the lower fence are low outliers:
Upper fence = Q3 + 1.5(Q3 – Q1) = Q3 + 1.5(IQR)
Lower fence = Q1 – 1.5(Q3 – Q1) = Q1 – 1.5(IQR)
Where:
Q1 = first quartile
Q2 = middle quartile
Q3 = third quartile
IQR = Interquartile range
The above equations give two values. These act as fences
that separate the outliers from the values that make up the
bulk of the data. Now, let’s check how to find outliers in
statistics with this method.
How to find
the outliers
in statistics
using the
Tukey
method?
10. Sample Problem: Use Tukey’s method to find the outliers in
the following data:
3, 4, 6, 8, 9, 11, 14, 17, 20, 21, 42.
Step 1: Calculate the interquartile range [follow the same
procedure shown above], which gives
Q1 = 6, Q3 = 20, IQR = 14
Step 2: Calculate 1.5 * IQR:
1.5 * IQR = 1.5 * 14 = 21
Step 3: Subtract the value from Step 2 from Q1 to obtain the
lower fence:
6 – 21 = -15
Step 4: Add the value from Step 2 to Q3 to obtain the upper fence:
20 + 21 = 41.
Step 5: Insert these fences into the data, in order, to find the
outliers:
-15, 3, 4, 6, 8, 9, 11, 14, 17, 20, 21, 41, 42.
Anything outside the fences is an outlier. For the given data
set, 42 is the only outlier.
How to find
the outliers
in statistics
using the
Tukey
method?
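The Tukey example can also be checked with Python's standard library. This is a minimal sketch: the function name `tukey_outliers` is hypothetical, and using `statistics.quantiles` with `method="exclusive"` is an assumption that happens to reproduce Q1 = 6 and Q3 = 20 for this data set; other quartile conventions can give slightly different fences.

```python
from statistics import quantiles


def tukey_outliers(values, k=1.5):
    """Return (lower_fence, upper_fence, outliers) using Tukey's fences."""
    # Assumption: "exclusive" quartiles from the standard library.
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers


data = [3, 4, 6, 8, 9, 11, 14, 17, 20, 21, 42]
print(tukey_outliers(data))  # (-15.0, 41.0, [42])
```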
11. Many students have difficulty finding outliers in
statistics, which is why we have described two
different methods for identifying them. Besides
these, there are other, more advanced methods
for detecting outliers, such as Dixon’s Q Test,
Generalized ESD, and more. Use the IQR and
Tukey methods described above to solve
problems involving outliers.
Conclusion