2. Hannah Masoner, Sept. 22, 2014, Assignment 2: Correlation Report 2
Abstract:
This paper examines the correlation between sets of random integers and whether or not they are
actually random. It describes an experiment made up of three studies with different sample sizes
and different variables. The purpose of the experiment was to show that a correlation can
sometimes occur between random objects or, in this case, samples. The sample sizes were 5, 30,
and 100. Each sample was repeated 30 times and the correlation of each one was recorded. Lastly,
a scatterplot of each study was made to visually identify whether or not the study had a
direction. The direction indicated whether or not there was stability within the sample.
Introduction:
Randomness is the unpredictable choosing of objects that each have an equal chance of
being chosen. “Pearson’s correlation coefficient varies from -1 (perfect negative correlation) to
+1 (perfect positive correlation), with a value of 0 indicating no linear relationship.” (Data
Analysis in Geosciences). If numbers are chosen at random, there should be no linear
relationship, because the numbers are in no way related to one another and there is no pattern in
how they are chosen. A coefficient near 0 does not mean that the variables are not linked in some
other way; it only means that they have no linear relationship. The closer the coefficient is to
-1 or +1, the stronger the correlation; if the coefficient is 0, there is no linear relationship.
“The strength has nothing to do with whether the number is positive or negative. A correlation of
-.88 is stronger than one that is +.56 the closer the number gets to zero (whether positive or
negative), the weaker the correlation.” (Correlation Research). The most common correlation
method used today is the Pearson product-moment correlation coefficient, which “measures the
strength of the linear association between variables.” (Linear Coefficient Correlation).
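The definition of the coefficient described above can be illustrated with a short sketch. This is not the calculation Excel performed in the study; it is a plain Python version of the standard product-moment formula, and the function name pearson_r and the two example columns are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's product-moment correlation for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of cross-products of deviations from each mean
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations for each column
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a column is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up columns: one perfectly increasing pair, one perfectly decreasing pair
print(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
print(pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2]))
```

The two example calls sit at the extremes of the scale, +1 and -1; columns of unrelated random numbers would be expected to land somewhere near 0.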
In this study, I used randomly generated numbers as my variables, in different sample
sizes, to see if the correlation coefficients would be closely related. I performed these tasks
using Random.org to generate my variables and Microsoft Excel as my statistical software.
Because all the variables and sample sizes were different and random, I knew that I should not
expect a correlation of exactly zero; however, I also knew that I would not be too far from it.
The measure of variability used in this study was the standard deviation, because it included
the whole population, or the entire study.
Methods:
For each study, I used Random.org as my number generator. There I was able to input my
specifics for the study. For study 1, I entered 10 numbers, between 1 and 100, in 2 columns. I
repeated this process a total of 30 times for this study.
For study 2, I entered 60 numbers, between 1 and 100, in 2 columns. I repeated this process 30
times for this study. For study 3, I entered 200 numbers, between 1 and 100, in 2 columns. I
repeated this process 30 times for this study.
I entered all of this data, each individual case of random numbers, into a spreadsheet in
Excel. Each case had its own sheet. Once the data had been entered into 90 or more sheets, in
three different workbooks, I began setting up the program to calculate a correlation coefficient
for each case. First, I had to go to the Data tab in each workbook and have the data split into
two separate columns, so that the program would calculate it correctly. I specified that
everything separated by a space should be put into different cells. Once that was done, I told
the program to calculate the correlation by clicking the Data Analysis tool. The program asked
me to choose an analysis tool, and I chose Correlation. Then I selected the cells to include in
the correlation, choosing all the cells that contained the randomly generated numbers, and chose
an empty cell for the program to put the correlation coefficient into. I repeated this process on
every sheet in every workbook. After I had all the data organized and calculated, I set up
scatter plots so that I could see the random points.
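The procedure described above can be sketched in code. This is only an approximation of the study, under several assumptions: Python's pseudo-random generator stands in for Random.org, the helper names pearson_r and run_study are hypothetical, and the population standard deviation (statistics.pstdev) is used because the report says the standard deviation covered the whole population.

```python
import math
import random
import statistics


def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of numbers."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def run_study(pairs, repetitions=30, rng=None):
    """Draw two columns of random integers (1 to 100) per repetition; return each r."""
    rng = rng or random.Random()
    coefficients = []
    while len(coefficients) < repetitions:
        col_a = [rng.randint(1, 100) for _ in range(pairs)]
        col_b = [rng.randint(1, 100) for _ in range(pairs)]
        # r is undefined if a column is constant; redraw in that very rare case
        if len(set(col_a)) > 1 and len(set(col_b)) > 1:
            coefficients.append(pearson_r(col_a, col_b))
    return coefficients


# Sample sizes of the three studies: 5, 30, and 100 pairs, 30 repetitions each
for pairs in (5, 30, 100):
    rs = run_study(pairs)
    print(
        f"n={pairs}: min={min(rs):.3f} max={max(rs):.3f} "
        f"mean={statistics.fmean(rs):.3f} sd={statistics.pstdev(rs):.3f}"
    )
```

Each run draws fresh numbers, so the printed values will differ from run to run and from the values reported in this paper.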
Results:
Study 1
The degrees of freedom (df) in study 1 were 3. The critical value was 3.182, with an alpha of
0.05. The correlation coefficients (r) ranged from -0.99306 to 0.800679. The mean correlation
was 0.00890 and the standard deviation was 0.241588273.
[Figure: scatterplot for study 1, titled "Chart Title"; vertical axis from -1.5 to 1, horizontal axis from 0 to 35.]
Study 2
The degrees of freedom in study 2 were 28. The critical value was 2.048, with an alpha of
0.05. The correlation coefficients ranged from -0.38607 to 0.45147. The mean correlation
was 0.003931 and the standard deviation was 0.191413943.
[Figure: scatterplot for study 2, titled "Chart Title"; vertical axis from -0.6 to 0.6, horizontal axis from 0 to 35.]
Study 3
The degrees of freedom in study 3 were 98. The critical value was 1.960, with an alpha of
0.05. The correlation coefficients ranged from -0.13097 to 0.17113. The mean correlation was
0.01355 and the standard deviation was 0.081343769.
[Figure: scatterplot for study 3, titled "Chart Title"; vertical axis from -0.15 to 0.2, horizontal axis from 0 to 35.]
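The critical values reported for the three studies can be put on the same scale as the correlation coefficients. The sketch below assumes that the reported critical values are critical t values at alpha 0.05 (the report does not say so explicitly) and uses the standard identity t = r * sqrt(df) / sqrt(1 - r^2), rearranged to give r. The function name critical_r is hypothetical.

```python
import math


def critical_r(t_critical, df):
    """Convert a critical t value into the matching critical |r| for the given df."""
    return t_critical / math.sqrt(t_critical ** 2 + df)


# (df, critical value) pairs as reported for the three studies
reported = {
    "study 1": (3, 3.182),
    "study 2": (28, 2.048),
    "study 3": (98, 1.960),
}

for name, (df, t_crit) in reported.items():
    r_crit = critical_r(t_crit, df)
    print(f"{name}: df={df}, critical t={t_crit}, critical |r| is about {r_crit:.3f}")
```

Under this reading, the most extreme coefficients reported for study 1 (-0.99306) and study 2 (0.45147) would fall beyond their thresholds, while the extremes for study 3 would not. That would fit the point that a correlation can sometimes occur between random samples.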
Conclusions:
The conclusion that I arrived at after this study was that even though the numbers are
random, the larger the sample size, the more of a linear relation to one another they have. In
study 1, the sample size was 10 random numbers in two columns, which produced 5 pairs.
This is a small sample. The variables were very unpredictable because they were random, and
when they were shown on a scatterplot, they were all over the place.
In study 2, things began to level out more because there were 60 random numbers in two
columns, which produced two sets of 30 numbers. The Central Limit Theorem is commonly taken
to mean that once a sample size reaches 30 or more, the results will generally represent the
population from which the sample came. This made the sample more stable and more like a
generalized population than a random sample. The scatterplot for study 2 supports this.
In study 3, the results made even more sense. There were 200 random numbers in two
columns, which gave me two sets of 100 numbers. These numbers seemed a lot less random
than the numbers in the other two studies because there were many more of them and, because
they were all between 1 and 100, some were repeated within their group. As I was
conducting my research, I came across an example of how random things can be related: they
may not cause one another, but they are related (Correlation Research). The random numbers in
study 3 may really be random, but because they are limited to the numbers 1 to 100, they are
not really random once they are all put together. They then point in a certain direction, or
correlation.
My mean correlations went against what I expected. I thought that the mean correlation
would get closer to zero, meaning no linear relationship, as the sample became larger. In my
studies, it did not: instead of becoming steadily smaller, the mean correlation was largest in
study 3. The mean correlations for studies 1, 2, and 3 were 0.00890, 0.003931, and 0.01355,
respectively.
References:
Garvey, Killian (n.d.). Psyc 4000-42209 Psychology Laboratory Syllabus. Retrieved from
<http://moodle.ulm.edu/course/view.php?id=50505#section-0>
Hall, Mary (n.d.). Psyc 4000- Textbooklet 2. Retrieved from
<https://mystudentmail.ulm.edu/zimbra/#1>
Random Number Generator. (2014). Random.org. Retrieved from
<http://www.random.org/integers/?num=200&min=1&max=100&col=2&base=10&format=html&rnd=new>
Anonymous (n.d.). Correlation Research. Retrieved from
<http://www.appsychology.com/Book/ResearchM/correlationalresearch.htm>
Holland, Steven (2013). Data Analysis in The Geosciences: GEOL 6370. Retrieved from
<http://strata.uga.edu/6370/lecturenotes/correlation.html>
Anonymous (2014). Stat Trek: Teach Yourself Statistics. Retrieved from
<http://stattrek.com/statistics/correlation.aspx>
Anonymous (n.d.). Randomness. Retrieved from
<http://dictionary.reference.com/browse/randomness?s=ts>