4. SCATTER POINT
• It is difficult to arrange interval or ratio data into a crosstabulation. Interval or ratio data
do not usually fall into a small number of discrete categories such as big or small, old or
young.
• There are usually many points on an interval or ratio level scale, which means that a
contingency table displaying these scales will have as many rows or columns as there are
values in the data.
• If we are looking at the distribution of growth rates of Gross Domestic Product (GDP) and
inflation (average price level), such data can of course be collapsed into a few categories,
but only at the cost of information. A scatter plot accommodates the greater range of values
we usually obtain from interval or ratio scales. It is therefore usually the best way to organize
such data to get an initial impression as to whether any correlation exists. A scatter plot shows
the combination of values that each case ‘scores’ on two variables simultaneously; that is, it
displays the joint distribution of two continuous variables. The coordinates of each point
indicate the values that case takes on each of the two variables.
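A small sketch, in Python, of why a crosstabulation struggles with this kind of data. It uses the seven quarterly observations tabulated later in this deck; the variable names are illustrative only.

```python
from collections import Counter

# Quarterly observations, 2017-Q1 to 2018-Q3 (values from the deck's data table).
gdp_growth = [9.3, 9.2, 9.6, 8.6, 9.6, 9.9, 10.6]   # nominal GDP growth rate
inflation = [2.9, 2.3, 2.6, 2.5, 4.1, 5.1, 6.8]     # inflation rate

# A crosstab needs one row per distinct X value and one column per distinct Y value.
rows = sorted(set(gdp_growth))
cols = sorted(set(inflation))
cells = Counter(zip(gdp_growth, inflation))          # cells that hold at least one case

total_cells = len(rows) * len(cols)
filled_cells = len(cells)
print(f"{len(rows)} rows x {len(cols)} columns = {total_cells} cells, {filled_cells} non-empty")
```

With only seven cases the table is already almost entirely empty, which is the practical reason for turning to a scatter plot instead.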
5. SCATTER POINT
• To illustrate the fundamental concept of rate of change, click the following
video:
6. SCATTER POINT
• To explain further, we are interested in the relationship between the nominal
GDP growth rate and the inflation rate in the Philippines from the first quarter
of 2017 up to the third quarter of 2018. From the data of the Philippine
Statistics Authority (PSA) we obtain the information given in table 1.1,
which shows the nominal GDP growth rate (the independent
variable, X) and the inflation rate (the dependent variable, Y) for
seven quarters. Arranging this information in a scatter plot, as in figure 1.1,
makes it simple to see whether an association
exists.
9. SCATTER POINT
• By convention, the dependent variable, Y, goes on the vertical axis and the independent variable, X,
on the horizontal axis when constructing a scatter plot. If we take any one of the
points and draw a straight line down to the horizontal axis, we can read off the growth rate of GDP.
• Likewise, by drawing a straight line across to the vertical axis we can read off the inflation rate. For
the Philippines in the first quarter of 2017, the growth rate of GDP is 9.3 percent and the inflation rate
stands at 2.9 percent. Looking at figure
1.1, it can readily be seen that a relationship exists, because we can visualize a sloping line
through the data that reflects it. The direction is indicated by whether this imaginary
line slopes up (positive) or down (negative).
• In this case the slope is positive, indicating that an increase in GDP growth rate is associated with
an increase in the inflation rate. We can give quantitative expression to this imaginary line
through the calculation of linear regression statistics. This extension of a scatter plot by
calculating regression statistics for variables measured on interval or ratio scales with many
points is directly analogous to the extension of crosstabulations by calculating measures of
association when working with categorical data.
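As a rough illustration of how the points of the scatter plot are formed and how the direction of the relationship can be checked numerically, here is a minimal Python sketch. The direction check (the sum of cross-deviations from the two means) is an assumption of this sketch, not a method stated in the slides; it simply has the same sign as the slope of the imaginary line.

```python
# Each case is one point: X (independent) on the horizontal axis,
# Y (dependent) on the vertical axis.
quarters = ["2017-Q1", "2017-Q2", "2017-Q3", "2017-Q4", "2018-Q1", "2018-Q2", "2018-Q3"]
gdp_growth = [9.3, 9.2, 9.6, 8.6, 9.6, 9.9, 10.6]   # X
inflation = [2.9, 2.3, 2.6, 2.5, 4.1, 5.1, 6.8]     # Y

points = list(zip(gdp_growth, inflation))
for quarter, (x, y) in zip(quarters, points):
    print(f"{quarter}: x = {x}, y = {y}")

# Direction check: positive when high X tends to go with high Y.
mean_x = sum(gdp_growth) / len(gdp_growth)
mean_y = sum(inflation) / len(inflation)
cross_dev = sum((x - mean_x) * (y - mean_y) for x, y in points)
direction = "positive" if cross_dev > 0 else "negative" if cross_dev < 0 else "none"
print("Direction of association:", direction)
```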
11. Simple Regression Analysis
• Every straight line that can be drawn in the area described by the
scatter plot has a single equation that distinguishes it from every other line.
Deriving this equation for a particular line is like recording a person's
fingerprint: it is a unique identifier that separates this person from
everybody else. The general formula for the line is given by
equation 1.1:
y = a ± bx
12. Simple Regression Analysis
• Thousands of straight lines can be drawn through the space marked out by
the vertical and horizontal axes of a scatter plot. To identify the
individual line that we think best fits the scatter plot, we need to provide it
with a unique equation. One feature that distinguishes a line is its point of
origin on the Y-axis, a. But
obviously this is not enough to distinguish it from the multitude of lines
that can start from the same point. This is shown in figure 2.2, which
exhibits just a few of the lines that share the same value for a in
their equation.
15. Simple Regression Analysis
• Nevertheless, if we identify both the point of origin on the Y-axis and the
slope of the line from that point, then we are able to identify uniquely
any line within the space.
• The technique is to find the unique combination of
values for a and b that identifies the line of best fit. Regression analysis is
simply the task of fitting a straight line through a scatter plot of cases that
‘best fits’ the data. Any straight line can be expressed in a mathematical
formula such as that given in equation 1.1 and repeated in the next video:
17. Simple Regression Analysis
• The general formula for a straight line is
y = a ± bx
Where:
y is the dependent variable,
x is the independent variable,
a is a parameter, the Y-intercept (the value of Y when X is zero),
b is a parameter, the slope of the line: positive values indicate a
positive relationship, and negative values indicate a negative relationship.
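The roles of the two parameters can be sketched in a few lines of Python. The function names and the three example lines are hypothetical; in code the sign is carried by b itself, so the formula is written as a + b * x.

```python
def line(a, b, x):
    """Value of y on the straight line with intercept a and slope b."""
    return a + b * x

def direction(b):
    """Describe the relationship implied by the sign of the slope b."""
    if b > 0:
        return "positive"
    if b < 0:
        return "negative"
    return "no relationship"

# Three hypothetical lines that share the same intercept a = 2 but differ in slope.
for b in (0.5, -0.5, 0.0):
    print(f"b = {b}: y(0) = {line(2, b, 0)}, y(4) = {line(2, b, 4)}, {direction(b)}")
```

All three lines start at the same point on the Y-axis, which is why a alone cannot identify a line; only the pair (a, b) does.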
18. Simple Regression Analysis
• This formula states that a line is described by two parameters. First is its
beginning point along the vertical axis, a, and the other is the slope of the
line from this point, b. It is the value of b that we are most interested in
since any slope, either positive or negative, indicates some relationship
between the two variables. In figure 1.4 we see three different lines
showing the value of b in the three alternative situations of positive,
negative, and no relationship.
22. Simple Regression Analysis
• To see how this simple regression is computed step by step
in Excel, the following video shows how to determine the parameters of the
equation:
23. Simple Regression Analysis
• Looking at the data for the growth rate of nominal GDP and the inflation
rate, we can draw many straight lines through this scatter plot,
and each of these lines will have its own unique formula. For example, we
can plot the data in table 1.1, given below, to produce the figure that follows.
PERIOD     NOMINAL GDP GROWTH RATE (%)    PHILIPPINE INFLATION RATE (%)
2017-Q1     9.3                           2.9
2017-Q2     9.2                           2.3
2017-Q3     9.6                           2.6
2017-Q4     8.6                           2.5
2018-Q1     9.6                           4.1
2018-Q2     9.9                           5.1
2018-Q3    10.6                           6.8
24. Simple Regression Analysis
[Figure: scatter plot with a fitted straight line. Horizontal axis: Nominal GDP Growth Rate (0 to 12); vertical axis: Philippine Inflation Rate (0 to 8).]
Figure 1.5. Straight line for nominal GDP growth rate and inflation rate from the first quarter of 2017 up to the third quarter of
2018
25. Simple Regression Analysis
• To determine the parameters of the regression model, there is a simple
procedure that we can follow. First, open Excel on Windows or Mac, and
make sure the Analysis ToolPak add-in is installed so that Data Analysis is available.
• The regression tool is part of Data Analysis.
• To illustrate, type the data from table 1.1 into the
spreadsheet, where the nominal GDP growth rate is our independent variable
(x) and the inflation rate is our dependent variable (y).
• Click Data Analysis, as shown in figure 1.6.
27. Simple Regression Analysis
• When Excel displays the Data Analysis dialog box, select the Regression tool
from the Analysis Tools list and then click OK.
• Identify your Y and X values.
• Use the Input Y Range text box to identify the worksheet range holding your
dependent variable. Then use the Input X Range text box to identify the
worksheet range holding your independent variable.
• Each of these input ranges must be a single column of values. In our example, to
use the Regression tool to explore the effect of the nominal GDP growth rate on
the inflation rate, enter $A$2:$A$8 into the Input X Range text box
and $B$2:$B$8 into the Input Y Range text box. If your input ranges include a
label, select the Labels check box.
28. Simple Regression Analysis
• Afterwards, click the regression analysis shown in figure 1.7.
Figure 1.7.The regression analysis
29. Simple Regression Analysis
• (Optional) Set the constant to zero.
If the regression line should start at zero — in other words, if the dependent
value should equal zero when the independent value equals zero — select the
Constant Is Zero check box.
30. Simple Regression Analysis
• Select a location for the regression analysis results.
Use the Output Options radio buttons and text boxes to specify where Excel
should place the results of the regression analysis. To place the regression
results into a range in the existing worksheet, for example, select the Output
Range radio button and then identify the range address in the Output Range
text box. To place the regression results someplace else, select one of the
other option radio buttons.
31. Simple Regression Analysis
• Identify what data you want returned.
Select from the Residuals check boxes to specify what residuals results you
want returned as part of the regression analysis. Similarly, select the Normal
Probability Plots check box to add residuals and normal probability
information to the regression analysis results.
32. Simple Regression Analysis
• Click OK.
Excel shows a portion of the regression analysis results, including three stacked visual plots
of data from the regression analysis.
There is a range that supplies some basic regression statistics, including the R-square value,
the standard error, and the number of observations. Below that information, the Regression
tool supplies analysis of variance (ANOVA) data, including the
degrees of freedom, sum-of-squares value, mean square value, the F-value, and the
significance of F.
Beneath the ANOVA information, the Regression tool supplies information about the
regression line calculated from the data, including the coefficient, standard error, t-stat, and
probability values for the intercept, as well as the same information for the independent
variable, which here is the nominal GDP growth rate. Excel also plots out some of the regression data using
simple scatter charts.
33. Simple Regression Analysis
• Click OK and you will have the result given in figure 1.8.
Figure 1.8. Spreadsheet showing the Result of Simple Regression Analysis
34. Simple Regression Analysis
• The equation for this simple regression analysis is given by:
y = -19.0704 + 2.39211x
The value for a (-19.0704) is the point on the Y-axis where the line ‘begins’.
This is the predicted rate of inflation when the growth rate of nominal GDP is zero. The
negative (–) sign on a means only that the line crosses the Y-axis below zero; it does not
describe the direction of the relationship, which is given by the sign of the slope, b.
35. Simple Regression Analysis
• The value for b (2.39211) is the slope coefficient of the regression line. This
regression coefficient indicates by how much the inflation rate will
increase if the growth rate of nominal GDP increases by 1 percentage point. The
slope of any straight line is “change in inflation rate / change in
growth rate of GDP”, or more generally ‘ΔY/ΔX’ (rise over run). For instance,
in figure 1.5, take two values of the inflation rate read off the line,
approximately 1.50 and 3.50: an increase (ΔY) of 2.0. Reading off
the corresponding increase in the growth rate of nominal GDP (ΔX)
gives a value of 0.84.
Dividing rise over run, the slope, as given in figure 3.1, will be approximately
2.0 / 0.84 ≈ 2.38, which agrees with b up to the precision of reading the graph.
37. Simple Regression Analysis
y = -19.0704 + 2.39211(10)
y = 4.8507
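• In addition, there are seven actual observations of the inflation rate, and
the gap between each actual value and the value expected from the line is the
error (e) term. For the first quarter of 2018, when nominal GDP growth was
9.6 percent, the expected value is -19.0704 + 2.39211(9.6) = 3.8939, so the
error term is:
e = yactual – yexpected = 4.1 − 3.8939 = 0.2061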
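The error term can be worked out for every quarter in the same way. The sketch below is illustrative only: it uses the rounded coefficients reported in the text, and each expected value is computed at that quarter's own X value (for 2018-Q1, X = 9.6).

```python
a, b = -19.0704, 2.39211   # intercept and slope reported in the text

quarters = ["2017-Q1", "2017-Q2", "2017-Q3", "2017-Q4", "2018-Q1", "2018-Q2", "2018-Q3"]
gdp_growth = [9.3, 9.2, 9.6, 8.6, 9.6, 9.9, 10.6]   # X
inflation = [2.9, 2.3, 2.6, 2.5, 4.1, 5.1, 6.8]     # Y (actual)

residuals = {}
for quarter, x, y in zip(quarters, gdp_growth, inflation):
    expected = a + b * x                 # value of Y on the regression line
    residuals[quarter] = y - expected    # e = actual - expected
    print(f"{quarter}: actual = {y}, expected = {expected:.4f}, e = {residuals[quarter]:+.4f}")

# With rounded coefficients the residuals sum to approximately zero.
print(f"Sum of residuals = {sum(residuals.values()):.4f}")
```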
38. Simple Regression Analysis
• Regression analysis uses this idea in a slightly more complicated form. The
logic is called ordinary least squares (OLS) regression: we require a line
such that the squared differences between the estimated values of Y and the actual
values of Y are, in total, as small as possible. We square
the residuals because the plain sum of residuals for any line that passes through
the point formed by the means of the dependent and independent variables
will equal zero. Squaring eliminates the effect of the positive and negative signs,
so that we are only dealing with non-negative
numbers.
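Both claims can be checked numerically. The following Python sketch is only an illustration: the candidate slopes are arbitrary choices, and each candidate line is forced through the point of means by setting its intercept to mean(Y) − b × mean(X).

```python
gdp_growth = [9.3, 9.2, 9.6, 8.6, 9.6, 9.9, 10.6]   # X
inflation = [2.9, 2.3, 2.6, 2.5, 4.1, 5.1, 6.8]     # Y
n = len(gdp_growth)
mean_x = sum(gdp_growth) / n
mean_y = sum(inflation) / n

def residuals(b):
    """Residuals for the line with slope b that passes through (mean_x, mean_y)."""
    a = mean_y - b * mean_x              # intercept forced by the point of means
    return [y - (a + b * x) for x, y in zip(gdp_growth, inflation)]

def sse(b):
    """Sum of squared residuals for that line."""
    return sum(e * e for e in residuals(b))

for b in (0.0, 1.0, 2.39211, 4.0):       # 2.39211 is the slope reported in the text
    print(f"b = {b}: sum of e = {sum(residuals(b)):.6f}, sum of e^2 = {sse(b):.4f}")
```

The plain sums are all zero up to floating-point rounding, so they cannot rank the lines, while the sums of squares differ and are smallest for the slope reported in the text.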
39. Simple Regression Analysis
• Ordinary least squares regression is a rule that tells us to draw the line
through a scatter plot that minimizes the sum of the squared residuals.
• The OLS regression line could be determined through a process of trial and error:
we could draw lines through the scatter plot, working out their
respective equations and residuals, until we finally hit on the one that
minimizes the sum of squared residuals. Fortunately, there is an alternative. If we use the
following two rules, we can derive the OLS regression line directly without
having to go through an indeterminate process of trial and error:
42. Simple Regression Analysis
• Although this formula still looks quite difficult, if we
work through it step by step, we will see that it is a rather uncomplicated
calculation. The estimation of the inflation rate equation is laid out in table 1.6
to show how the numbers are derived.
43. Simple Regression Analysis
Table 1.6. Estimation of the Slope of the OLS Regression Line
PERIOD     NOMINAL GDP GROWTH RATE (X)    PHILIPPINE INFLATION RATE (Y)    X²        Y²        XY
2017-Q1     9.3                           2.9                               86.49      8.41     26.97
2017-Q2     9.2                           2.3                               84.64      5.29     21.16
2017-Q3     9.6                           2.6                               92.16      6.76     24.96
2017-Q4     8.6                           2.5                               73.96      6.25     21.50
2018-Q1     9.6                           4.1                               92.16     16.81     39.36
2018-Q2     9.9                           5.1                               98.01     26.01     50.49
2018-Q3    10.6                           6.8                              112.36     46.24     72.08
SUM        66.8                          26.3                              639.78    115.77    256.52
MEAN        9.5429                        3.7571
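The column sums can be turned into the slope and intercept with a short Python sketch. The two rules themselves are not reproduced in this part of the deck, so the sketch assumes the standard textbook formulas b = (nΣXY − ΣXΣY) / (nΣX² − (ΣX)²) and a = mean(Y) − b × mean(X); the variable names are illustrative.

```python
gdp_growth = [9.3, 9.2, 9.6, 8.6, 9.6, 9.9, 10.6]   # X
inflation = [2.9, 2.3, 2.6, 2.5, 4.1, 5.1, 6.8]     # Y
n = len(gdp_growth)

# Column sums, as in the SUM row of the table.
sum_x = sum(gdp_growth)
sum_y = sum(inflation)
sum_x2 = sum(x * x for x in gdp_growth)
sum_y2 = sum(y * y for y in inflation)
sum_xy = sum(x * y for x, y in zip(gdp_growth, inflation))

# Assumed standard OLS formulas for the slope (b) and intercept (a).
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * (sum_x / n)

print(f"sums: X = {sum_x:.2f}, Y = {sum_y:.2f}, X^2 = {sum_x2:.2f}, "
      f"Y^2 = {sum_y2:.2f}, XY = {sum_xy:.2f}")
print(f"b = {b:.5f}, a = {a:.4f}")
```

Note that the Y² column is not needed for a or b; it is only required for measures such as the correlation coefficient.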
46. Simple Regression Analysis
• Consequently, we can express the line of best fit for this set of cases with
equation 2.1:
Y = -19.0704 + 2.39211X
• In figure 4.1, this regression line is drawn through the scatter plot. What
does this tell us about the relationship between nominal GDP growth rate
and inflation rate, for this set of cases? There is a positive relationship
between the two variables: an increase (decrease) in the nominal GDP growth rate
is associated with an increase (decrease) in the inflation rate. We
can quantify this positive relationship: an increase in the nominal GDP growth rate of 1
percentage point is associated with an increase in the inflation rate of 2.39211 percentage points.
47. Simple Regression Analysis
• We can use this formula for prediction: a given nominal GDP growth rate
is expected to go with a particular inflation
rate. To illustrate, if we were told that the nominal GDP growth rate would
be 8 percent, our prediction would be an inflation rate of
0.06648 percent, that is:
Y = -19.0704 + 2.39211(8) = 0.06648
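The same substitution can be wrapped in a small helper. This is a sketch: the function name is hypothetical, and the default coefficients are simply the rounded values of a and b from the fitted equation.

```python
def predict_inflation(gdp_growth_rate, a=-19.0704, b=2.39211):
    """Inflation rate expected from the fitted line Y = a + bX."""
    return a + b * gdp_growth_rate

# The two growth rates used as examples in the text.
for x in (8, 10):
    print(f"GDP growth {x}% -> predicted inflation {predict_inflation(x):.5f}%")
```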