The Central Limit Theorem states that the distribution of sample means approximates a normal distribution as the sample size increases, regardless of the population distribution. It can be used to estimate the mean height of students in sports teams by taking random samples, calculating the mean of each sample, finding the mean of the sample means, and observing that the sample means form a bell curve. Discrete and continuous data can be summarized using tables, histograms, stem-and-leaf plots, and other graphs, depending on whether the values are countable or measured on a scale. Standard deviation is commonly used to measure the dispersion of samples from the same population.
2. The Central Limit Theorem
The theorem states that the mean of the sample means drawn from a population is approximately
the same as the mean of the population, provided the sample size is sufficiently large and the
population has finite variance. It is one of the main topics of statistics.
The Central Limit Theorem (CLT) states that the distribution of the sample mean approximates
the normal distribution as the sample size becomes larger, assuming that all the samples are
the same size, and no matter what the shape of the population distribution.
3. Suppose you have 10 sports teams in your school, and each team has 100 students in it. Now,
we want to measure the average height of the students in the sports teams. The simplest way to do
this would be to average their heights directly. The first step would be to measure the height of
all the students individually and then add the heights together. Then, divide the sum of their
heights by the total number of students. This way we will get the average height. But this method
does not scale well, because measuring every student is tiresome and takes a very long time.
4. So, we will use the CLT (Central Limit Theorem) to make the calculation easier. In this method, we will
randomly pick students from different teams to make samples. Each sample will include 20
students. Then, we will follow these steps.
1. Take all these samples and find the mean of each individual sample.
2. Find the mean of the sample means.
3. This gives the approximate mean height of the students in the sports teams.
4. If we plot a histogram of these sample mean heights, it will have a bell-curve shape.
5. Note: Each sample should be sufficiently large. As the sample size gets larger, the distribution
of the sample means obtained by repeated sampling gets closer to a normal distribution.
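The sampling steps above can be sketched as a small simulation. This is only an illustration under stated assumptions: the heights are made-up values drawn from a uniform (deliberately non-normal) distribution, and the function name `simulate_sample_means`, the number of repeated samples (500), and the seeds are all hypothetical choices, not part of the original example.

```python
import random
import statistics

def simulate_sample_means(population, sample_size=20, num_samples=500, seed=0):
    """Draw repeated random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]

# Hypothetical population: 10 teams x 100 students, heights in cm.
# The heights come from a uniform distribution, so the population
# itself is not bell-shaped.
population_rng = random.Random(42)
heights = [population_rng.uniform(150.0, 190.0) for _ in range(10 * 100)]

sample_means = simulate_sample_means(heights)
population_mean = statistics.fmean(heights)
mean_of_sample_means = statistics.fmean(sample_means)

print(f"population mean:          {population_mean:.2f}")
print(f"mean of sample means:     {mean_of_sample_means:.2f}")
print(f"std dev of heights:       {statistics.stdev(heights):.2f}")
print(f"std dev of sample means:  {statistics.stdev(sample_means):.2f}")
```

The mean of the sample means should land close to the population mean, and the sample means should be much less spread out than the individual heights. Plotting a histogram of `sample_means` would show the bell shape described in step 4.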
6. There are more ways to summarize quantitative data than qualitative data because numerical data
comes in two forms: discrete or continuous (as mentioned earlier). You will learn how to
create tables and histograms of each type of data. Two other summary methods for quantitative
data are stem-and-leaf plots and dot-plots. These plots are rarely used except as preliminary (quick-
and-dirty) techniques for understanding your data.
7. The methods for summarizing discrete data are similar to methods used for summarizing qualitative
data, since discrete data can be put into separate categories.
8. Discrete quantitative data can be presented in tables in several of the same ways as qualitative
data: by values listed in a table, by a frequency table, or by a relative frequency table. The only
difference is that instead of using category names, we use the discrete values taken by the data.
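As a minimal sketch of a frequency table and a relative frequency table for discrete data, the following uses a made-up data set (number of siblings reported by 10 students); the variable names are illustrative assumptions.

```python
from collections import Counter

# Hypothetical discrete data: number of siblings reported by 10 students.
siblings = [0, 1, 1, 2, 0, 3, 1, 2, 1, 0]

counts = Counter(siblings)
total = len(siblings)

# One row per discrete value, in ascending order:
# (value, frequency, relative frequency).
table = [(value, counts[value], counts[value] / total) for value in sorted(counts)]

print(f"{'value':>5} {'freq':>5} {'rel. freq':>10}")
for value, freq, rel_freq in table:
    print(f"{value:>5} {freq:>5} {rel_freq:>10.2f}")
```

The rows are keyed by the discrete values themselves (0, 1, 2, 3) rather than by category names, and the relative frequencies sum to 1.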
9. Discrete quantitative data can be presented in bar graphs in the same ways as qualitative data. A
bar graph for any type of quantitative data is called a histogram. The discrete values taken by the
data are labeled in ascending order across the horizontal axis, and a rectangle is drawn vertically
so that the height of each rectangle corresponds to the frequency or relative frequency of
that value. The main visual difference between a bar graph (qualitative data) and a histogram
(quantitative data) is that there should be no horizontal spacing between numerical values along the
horizontal axis. In other words, rectangles touch each other in a histogram.
10. A stem-and-leaf plot is a graph of quantitative data that is similar to a histogram in the way that it
visually displays the distribution. A stem-and-leaf plot retains the original data. The leaves are
usually the last digit in each data value and the stems are the remaining digits. A legend, sometimes
called a key, should be included so that the reader can interpret the information.
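A stem-and-leaf plot can be built in a few lines. The sketch below assumes non-negative integer data and uses made-up test scores; the helper name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group non-negative integers by stem (all but the last digit) and leaf (last digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical test scores.
scores = [72, 85, 91, 68, 77, 85, 93, 70, 88, 64]

plot = stem_and_leaf(scores)
for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
print("Key: 7 | 2 means 72")
```

Every original value can be read back from the plot, which is what distinguishes it from a histogram of the same data.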
11. Continuous data has an infinite number of possibilities (like weights, heights, and times). In terms of
summarizing techniques, the main difference between discrete data and continuous data is that
continuous data cannot directly be put into frequency tables since they do not have any obvious
categories (you cannot create a table or histogram with an infinite number of categories).
To get around this, categories are created using classes, or intervals (ranges) of numbers. Each
class has a lower class limit, which is the smallest value within the class, and an upper class
limit, which is the largest value within the class. The class width is the difference between the
upper class limit and the lower class limit. Finally, if a class does not have a
lower or upper class limit (e.g., “shorter than 4 feet” or “60 and older”), the class is said to be
open-ended.
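To illustrate grouping continuous data into classes, here is a minimal sketch using made-up heights. It assumes equal-width classes where each class runs from its lower bound up to, but not including, the lower bound of the next class; that boundary convention and the helper name `frequency_by_class` are assumptions made for this example.

```python
def frequency_by_class(values, lower, width, num_classes):
    """Count values falling in classes [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * num_classes
    for v in values:
        index = int((v - lower) // width)
        if 0 <= index < num_classes:  # values outside all classes are skipped
            counts[index] += 1
    classes = [(lower + i * width, lower + (i + 1) * width) for i in range(num_classes)]
    return list(zip(classes, counts))

# Hypothetical heights in centimetres.
heights_cm = [151.2, 158.9, 160.0, 163.4, 167.8, 169.9, 172.5, 175.1, 181.3, 188.7]

class_table = frequency_by_class(heights_cm, lower=150, width=10, num_classes=4)
for (low, high), count in class_table:
    print(f"{low} to under {high}: {count}")
```

Once the data are grouped this way, the class counts can be tabulated or drawn as a histogram just like discrete values.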
12. There are several different graphs that are used for qualitative data. These graphs include bar
graphs, Pareto charts, and pie charts. Pie charts and bar graphs are the most common ways of
displaying qualitative data.
13. Line Graphs – A line graph (or linear graph) is used to display continuous data and is useful
for showing trends over time, which can help in predicting future values. Bar Graphs – A bar graph
is used to display categorical data and compares quantities using solid bars.
14. What is discrete data?
Data is discrete if you can answer yes to the following questions about it:
• Is it countable?
• Can the data be divided into separate categories?
Discrete data can take only a finite or countable set of distinct values. One of its notable properties is that, unlike continuous data, it can’t be
measured, only counted.
Examples of discrete data: the number of players in a team, the number of planets in the Solar System.
Examples of non-discrete (continuous) data: height, weight, length, income, temperature.
The following charts work especially well for representing discrete data:
• Bar chart
• Stacked bar chart
• Column chart
• Stacked column chart
• Spider chart
15. Standard deviation is the most common measure of dispersion for any samples taken from the
same group of people (1). It’s the square root of the variance (3).
The standard deviation and variance are the most commonly used measures of
dispersion in the social sciences because both take into account the precise difference between
each score and the mean. The standard deviation is also the basis for
defining the standardized score, or "z-score": the number of standard deviations a score lies from the mean.
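The relationship between variance, standard deviation, and z-scores can be worked through on a small made-up data set. This sketch assumes the population formulas (dividing by n rather than n - 1); the function name `z_scores` and the scores are illustrative.

```python
import math

def z_scores(scores):
    """Return (mean, variance, standard deviation, z-scores) using population formulas."""
    n = len(scores)
    mean = sum(scores) / n
    # Variance: average squared difference between each score and the mean.
    variance = sum((x - mean) ** 2 for x in scores) / n
    # Standard deviation: square root of the variance.
    std_dev = math.sqrt(variance)
    # z-score: how many standard deviations each score lies from the mean.
    return mean, variance, std_dev, [(x - mean) / std_dev for x in scores]

# Hypothetical scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
mean, variance, std_dev, z = z_scores(scores)

print(f"mean={mean}, variance={variance}, std dev={std_dev}")
print("z-scores:", z)
```

For these scores the mean is 5, the variance is 4, and the standard deviation is 2, so a score of 9 has a z-score of 2.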