SlideShare a Scribd company logo
1 of 18
Curve-fitting Project - Linear Model (due at the end of Week 5)
Instructions
For this assignment, collect data exhibiting a relatively linear
trend, find the line of best fit, plot the data and the line,
interpret the slope, and use the linear equation to make a
prediction. Also, find r2 (coefficient of determination)
and r (correlation coefficient). Discuss your findings. Your
topic may be that is related to sports, your work, a hobby, or
something you find interesting. If you choose, you may use the
suggestions described below.
A Linear Model Example and Technology Tips are provided in
separate documents.
Tasks for Linear Regression Model (LR)
(LR-1) Describe your topic, provide your data, and cite your
source. Collect at least 8 data points. Label appropriately.
(Highly recommended: Post this information in the Linear
Model Project discussion as well as in your completed project.
Include a brief informative description in the title of your
posting. Each student must use different data.)
The idea with the discussion posting is two-fold: (1) To share
your interesting project idea with your classmates, and (2) To
give me a chance to give you a brief thumbs-up or thumbs-down
about your proposed topic and data. Sometimes students get off
on the wrong foot or misunderstand the intent of the project,
and your posting provides an opportunity for some feedback.
Remark: Students may choose similar topics, but must have
different data sets. For example, several students may be
interested in a particular Olympic sport, and that is fine, but
they must collect different data, perhaps from different events
or different gender.
(LR-2) Plot the points (x, y) to obtain a scatterplot. Use an
appropriate scale on the horizontal and vertical axes and be sure
to label carefully. Visually judge whether the data points
exhibit a relatively linear trend. (If so, proceed. If not, try a
different topic or data set.)
(LR-3) Find the line of best fit (regression line) and graph it on
the scatterplot. State the equation of the line. (LR-4) State the
slope of the line of best fit. Carefully interpret the meaning of
the slope in a sentence or two.
(LR-5) Find and state the value of r2, the coefficient of
determination, and r, the correlation coefficient. Discuss your
findings in a few sentences. Is r positive or negative? Why? Is a
line a good curve to fit to this data? Why or why not? Is the
linear relationship very strong, moderately strong, weak, or
nonexistent?
(LR-6) Choose a value of interest and use the line of best fit to
make an estimate or prediction. Show calculation work.
(LR-7) Write a brief narrative of a paragraph or two. Summarize
your findings and be sure to mention any aspect of the linear
model project (topic, data, scatterplot, line, r, or estimate, etc.)
that you found particularly important or interesting.
You may submit all of your project in one document or a
combination of documents, which may consist of word
processing documents or spreadsheets or scanned handwritten
work, provided it is clearly labeled where each task can be
found. Be sure to include your name. Projects are graded on the
basis of completeness, correctness, ease in locating all of the
checklist items, and strength of the narrative portions.
Here are some possible topics:
Choose an Olympic sport -- an event that interests you. Go to
http://www.databaseolympics.com/and collect data for winners
in the event for at least 8 Olympic games (dating back to at
least 1980). (Example: Winning times in Men's 400 m dash).
Make a quick plot for yourself to "eyeball" whether the data
points exhibit a relatively linear trend. (If so, proceed. If not,
try a different event.) After you find the line of best fit, use
your line to make a prediction for the next Olympics (2014 for a
winter event, 2016 for a summer event ).
Choose a particular type of food. (Examples: Fish sandwich at
fast-food chains, cheese pizza, breakfast cereal) For at least 8
brands, look up the fat content and the associated calorie total
per serving. Make a quick plot for yourself to "eyeball" whether
the data exhibit a relatively linear trend. (If so, proceed. If not,
try a different type of food.) After you find the line of best fit,
use your line to make a prediction corresponding to a fat
amount not occurring in your data set.) Alternative: Look up
carbohydrate content and associated calorie total per serving.
Choose a sport that particularly interests you and find two
variables that may exhibit a linear relationship. For instance,
for each team for a particular season in baseball, find the total
runs scored and the number of wins. Excellent websites:
http://www.databasesports.com/ and http://www.baseball-
reference.com/
(Sample) Curve-Fitting Project - Linear Model: Men's 400
Meter Dash Submitted by Suzanne Sands
(LR-1) Purpose: To analyze the winning times for the Olympic
Men's 400 Meter Dash using a linear model
Data: The winning times were retrieved from
http://www.databaseolympics.com/sport/sportevent.htm?sp=AT
H&enum=130 The winning times were gathered for the most
recent 16 Summer Olympics, post-WWII. (More data was
available, back to 1896.)
(
Page
4
of
4
)
(
Summer Olympics: Men's 400 Meter Dash
Winning Times
Year
Time (seconds)
1948
46.20
1952
45.90
1956
46.70
1960
44.90
1964
45.10
1968
43.80
1972
44.66
1976
44.26
1980
44.60
1984
44.27
1988
43.87
1992
43.50
1996
43.49
2000
43.84
2004
44.00
2008
43.75
)DATA:
(LR-2) SCATTERPLOT:
(
2008
2000
1992
1984
1976
Year
1968
1960
1952
43.00
1944
43.50
44.00
44.50
45.00
45.50
46.00
46.50
47.00
Summer Olympics: Men's 400 Meter Dash Winning Times
) (
Time (seconds)
)As one would expect, the winning times generally show a
downward trend, as stronger competition and training methods
result in faster speeds. The trend is somewhat linear.
(
2008
2000
1992
1984
1976
Year
1968
1960
1952
43.00
1944
43.50
44.00
44.50
45.00
45.50
y = -0.0431x +
129.84
R² = 0.6991
46.00
46.50
47.00
Summer Olympics: Men's 400 Meter Dash Winning Times
) (
Time (seconds)
)(LR-3)
Line of Best Fit (Regression Line)
y =
(in seconds)
(LR-
times are generally decreasing.
The slope indicates that in general, the winning time decreases
by 0.0431 second a year, and so the winning time decreases at
an average rate of 4(0.0431) = 0.1724 second each 4-year
Olympic interval.
(LR-5) Values of r2 and r:
r2 = 0.6991
We know that the slope of the regression line is negative so the
correlation coefficient r must be negative.
�� = −√0.6991 = −0.84
correlation (relatively close to -1 but not very strong).
(LR-6) Prediction: For the 2012 Summer Olympics, substitute x
The regression line predicts a winning time of 43.1 seconds for
the Men's 400 Meter Dash in the 2012 Summer Olympics in
London.
(LR-7) Narrative:
The data consisted of the winning times for the men's 400m
event in the Summer Olympics, for 1948 through 2008. The data
exhibit a moderately strong downward linear trend, looking
overall at the 60 year period.
The regression line predicts a winning time of 43.1 seconds for
the 2012 Summer Olympics, which would be nearly 0.4 second
less than the existing Olympic record of 43.49 seconds, quite a
feat!
Will the regression line's prediction be accurate? In the last two
decades, there appears to be more of a cyclical (up and down)
trend. Could winning times continue to drop at the same average
rate? Extensive searches for talented potential athletes and
improved full-time training methods can lead to decreased
winning times, but ultimately, there will be a physical limit for
humans.
Note that there were some unusual data points of 46.7 seconds
in 1956 and 43.80 in 1968, which are far above and far below
the regression line.
If we restrict ourselves to looking just at the most recent
winning times, beyond 1968, for Olympic winning times in
1972 and beyond (10 winning times), we have the following
scatterplot and regression line.
(
Time (seconds)
)
(
Year
2008
2000
1992
1984
1976
44.20
44.00
43.80
43.60
43.40
1968
y = -0.025x + 93.834
R² = 0.5351
44.60
44.40
Summer Olympics: Men's 400 Meter Dash Winning Times
44.80
)Using the most recent ten winning times, our regression line is
43.5 seconds. This line predicts a winning time of 43.5
seconds for 2012 and that would indicate an excellent time
close to the existing record of 43.49 seconds, but not
dramatically below it.
Note too that for r2 = 0.5351 and for the negatively sloping
line, the correlation coefficient is �� = −√0.5351 = −0.73, not
as strong as when we considered the time period going back to
1948. The most recent set of 10 winning times do not visually
exhibit as strong a linear trend as the set of 16 winning times
dating back to 1948.
CONCLUSION:
I have examined two linear models, using different subsets of
the Olympic winning times for the men's 400 meter dash and
both have moderately strong negative correlation coefficients.
One model uses data extending back to 1948 and predicts a
winning time of 43.1 seconds for the 2012 Olympics, and the
other model uses data from the most recent 10 Olympic games
and predicts 43.5 seconds. My guess is that 43.5 will be closer
to the actual winning time. We will see what happens later this
summer!UPDATE: When the race was run in August, 2012, the
winning time was 43.94 seconds.
Scatterplots, Linear Regression, and Correlation
When we have a set of data, often we would like to develop a
model that fits the data.
First we graph the data points (x, y) to get a scatterplot. Take
the data, determine an appropriate scale on the horizontal axis
and the vertical axis, and plot the points, carefully labeling the
scale and axes.
(
Summer Olympics:
Men's 400 Meter Dash Winning Times
Year (x)
Time(y) (seconds)
1948
46.20
1952
45.90
1956
46.70
1960
44.90
1964
45.10
1968
43.80
1972
44.66
1976
44.26
1980
44.60
1984
44.27
1988
43.87
1992
43.50
1996
43.49
2000
43.84
2004
44.00
2008
43.75
)
Burger
Fat (x) (grams)
Calories (y)
Wendy's Single
20
420
BK Whopper Jr.
24
420
McDonald's Big Mac
28
530
Wendy's Big Bacon Classic
30
580
Hardee's The Works
30
530
McDonald's Arch Deluxe
34
610
BK King Double Cheeseburger
39
640
Jack in the Box Jumbo Jack
40
650
BK Big King
43
660
BK King Whopper
46
730
Data from 1997
If the scatterplot shows a relatively linear trend, we try to fit a
linear model, to find a line of best fit.
We could pick two arbitrary data points and find the line
through them, but that would not necessarily provide a good
linear model representative of all the data points.
A mathematical procedure that finds a line of "best fit" is called
linear regression. This procedure is also called the method of
least squares, as it minimizes the sum of the squares of the
deviations of the points from the line. In MATH 107, we use
software to find the regression line. (We can use Microsoft
Excel, or Open Office, or a hand-held calculator or an online
calculator --- more on this in the Technology Tips topic.)
Linear regression software also typically reports parameters
denoted by r or r2.
The real number r is called the correlation coefficient and
provides a measure of the strength of the linear relationship.
r = 1 indicates perfect positive correlation --- the regression
line has positive slope and all of the data points are on the line.
--- the regression
line has negative slope and all of the data points are on the line
The closer |r| is to 1, the stronger the linear correlation. If r = 0,
there is no correlation at all. The following examples provide a
sense of what an r value indicates.
Source: The Basic Practice of Statistics, David S. Moore, page
108.
Notice that a positive r value is associated with an increasing
trend and a negative r value is associated with a decreasing
trend. The strongest linear models have r values close to 1 or
The nonnegative real number r2 is called the coefficient of
determination and is the square of the correlation coefficient r.
is to 1, the stronger the indication of a linear relationship.
Some software packages (such as Excel) report r2, and so to get
r, take the square root of r2 and determine the sign of r by
RESOURCES: Desmos Graphing Calculator and Linear
Regression
You can use the free online Desmos Graphing Calculator to
produce a scatterplot and find the regression line and
correlation coefficient.
Go to https://www.desmos.com/calculatorand launch the
calculator.
Select "table" from the menu at the upper left.
(
Page
1
of 7
)
Data for Project Example (Men's 400 Meter Dash) has been
entered. Regression help can be accessed via the "?" icon.
Select "expression" from the menu at the upper left.
Type y1 ~ mx1 + b and the values of r2, r, m, and b
automatically appear.
Selecting the tool at the upper right, you can then adjust the
scales on the x and y axes and create labels.
You can give your graph a name. In order to save your graph,
sign in with a free account and click the share button. If you
share the given link, then by followiing the link, the graph can
be opened and manipulated. If you click the Image button, then
you can save the graph as a file.
After clicking the Image button, you can view the graph as a
stand-alone image, and select from several options to save.
To complete the Linear Model portion of the project, you will
need to use technology (or hand-drawing) to create a
scatterplot, find the regression line, plot the regression line, and
find r and r2.
Below are some options, together with some videos. Each video
is limited to 5 minutes or less. It takes a bit of time for the
video to initially download. When playing the video, if you
want to slow it down to read the text, hit the pause icon. (If you
run the mouse over the bottom of the video screen, the video
controls will appear.) You may need to adjust the volume.
The basic options are to:
(1) Generate by hand and scan.
(2) Use Microsoft Excel.
Visit Scatterplot - Start(VIDEO) to see how to create a scatter
plot using Microsoft Excel and format the axes.
Visit Scatterplot - Regression Line(VIDEO) to see how to add
labels and title to the scatterplot, how to generate and graph the
line of best fit (regression) and obtain the value of r2 in
Microsoft Excel.
Using Excel to obtain precise values of slope m and y-intercept
b of the regression line: Video, Spreadsheet
(3) Use Open Office.
(4) Use a hand-held graphing calculator (See section 2.5 in your
textbook for help with Texas Instruments hand-held
calculators.)
(5) Use a free online tool
Use the free Desmos calculator: See
DesmosLinearRegressionGuide.pdfto view how to generate a
scatterplot and carry out linear regression.
The result of the free tool might not be as nice looking as the
Microsoft Excel version, but it is free. The Linear Project
Example uses Microsoft Excel.

More Related Content

Similar to Curve-fitting Project - Linear Model (due at the end of Week 5)I.docx

Day 4 trouble-shooting. SPSS
Day 4 trouble-shooting. SPSSDay 4 trouble-shooting. SPSS
Day 4 trouble-shooting. SPSS
abir hossain
 
Representing verifiable statistical index computations as linked data
Representing verifiable statistical index computations as linked dataRepresenting verifiable statistical index computations as linked data
Representing verifiable statistical index computations as linked data
Jose Emilio Labra Gayo
 
PEShare.co.uk Shared Resource
PEShare.co.uk Shared ResourcePEShare.co.uk Shared Resource
PEShare.co.uk Shared Resource
peshare.co.uk
 

Similar to Curve-fitting Project - Linear Model (due at the end of Week 5)I.docx (9)

Applied 20S December 16, 2008
Applied 20S December 16, 2008Applied 20S December 16, 2008
Applied 20S December 16, 2008
 
Selectivity Estimation for SPARQL Triple Patterns with Shape Expressions
Selectivity Estimation for SPARQL Triple Patterns with Shape ExpressionsSelectivity Estimation for SPARQL Triple Patterns with Shape Expressions
Selectivity Estimation for SPARQL Triple Patterns with Shape Expressions
 
Day 4 trouble-shooting. SPSS
Day 4 trouble-shooting. SPSSDay 4 trouble-shooting. SPSS
Day 4 trouble-shooting. SPSS
 
Olympics Sports Data Analysis
Olympics Sports Data AnalysisOlympics Sports Data Analysis
Olympics Sports Data Analysis
 
Representing verifiable statistical index computations as linked data
Representing verifiable statistical index computations as linked dataRepresenting verifiable statistical index computations as linked data
Representing verifiable statistical index computations as linked data
 
PEShare.co.uk Shared Resource
PEShare.co.uk Shared ResourcePEShare.co.uk Shared Resource
PEShare.co.uk Shared Resource
 
Formato guía10
Formato guía10Formato guía10
Formato guía10
 
A Comparison of Propositionalization Strategies for Creating Features from Li...
A Comparison of Propositionalization Strategies for Creating Features from Li...A Comparison of Propositionalization Strategies for Creating Features from Li...
A Comparison of Propositionalization Strategies for Creating Features from Li...
 
Symmetry 12-00356-v2
Symmetry 12-00356-v2Symmetry 12-00356-v2
Symmetry 12-00356-v2
 

More from dorishigh

Cyber War versus Cyber Realities Cyber War v.docx
Cyber War versus Cyber Realities Cyber War v.docxCyber War versus Cyber Realities Cyber War v.docx
Cyber War versus Cyber Realities Cyber War v.docx
dorishigh
 
Cyber terrorism, by definition, is the politically motivated use.docx
Cyber terrorism, by definition, is the politically motivated use.docxCyber terrorism, by definition, is the politically motivated use.docx
Cyber terrorism, by definition, is the politically motivated use.docx
dorishigh
 
Cyber Security ThreatsYassir NourDr. Fonda IngramETCS-690 .docx
Cyber Security ThreatsYassir NourDr. Fonda IngramETCS-690 .docxCyber Security ThreatsYassir NourDr. Fonda IngramETCS-690 .docx
Cyber Security ThreatsYassir NourDr. Fonda IngramETCS-690 .docx
dorishigh
 
Cyber Security in Industry 4.0Cyber Security in Industry 4.0 (.docx
Cyber Security in Industry 4.0Cyber Security in Industry 4.0 (.docxCyber Security in Industry 4.0Cyber Security in Industry 4.0 (.docx
Cyber Security in Industry 4.0Cyber Security in Industry 4.0 (.docx
dorishigh
 
Cyber Security and the Internet of ThingsVulnerabilities, T.docx
Cyber Security and the Internet of ThingsVulnerabilities, T.docxCyber Security and the Internet of ThingsVulnerabilities, T.docx
Cyber Security and the Internet of ThingsVulnerabilities, T.docx
dorishigh
 
Cyber Security Gone too farCarlos Diego LimaExce.docx
Cyber Security Gone too farCarlos Diego LimaExce.docxCyber Security Gone too farCarlos Diego LimaExce.docx
Cyber Security Gone too farCarlos Diego LimaExce.docx
dorishigh
 
CW 1R Checklist and Feedback Sheet Student Copy Go through this.docx
CW 1R Checklist and Feedback Sheet Student Copy Go through this.docxCW 1R Checklist and Feedback Sheet Student Copy Go through this.docx
CW 1R Checklist and Feedback Sheet Student Copy Go through this.docx
dorishigh
 
CW 1 Car Industry and AIby Victoria StephensonSubmission.docx
CW 1 Car Industry and AIby Victoria StephensonSubmission.docxCW 1 Car Industry and AIby Victoria StephensonSubmission.docx
CW 1 Car Industry and AIby Victoria StephensonSubmission.docx
dorishigh
 
CWTS CWFT Module 7 Chapter 2 Eco-maps 1 ECO-MAPS .docx
CWTS CWFT Module 7 Chapter 2 Eco-maps 1 ECO-MAPS .docxCWTS CWFT Module 7 Chapter 2 Eco-maps 1 ECO-MAPS .docx
CWTS CWFT Module 7 Chapter 2 Eco-maps 1 ECO-MAPS .docx
dorishigh
 
Cw2 Marking Rubric Managerial Finance0Fail2(1-29) Fail.docx
Cw2 Marking Rubric Managerial Finance0Fail2(1-29) Fail.docxCw2 Marking Rubric Managerial Finance0Fail2(1-29) Fail.docx
Cw2 Marking Rubric Managerial Finance0Fail2(1-29) Fail.docx
dorishigh
 
Cyber AttacksProtecting National Infrastructure, 1st ed.Ch.docx
Cyber AttacksProtecting National Infrastructure, 1st ed.Ch.docxCyber AttacksProtecting National Infrastructure, 1st ed.Ch.docx
Cyber AttacksProtecting National Infrastructure, 1st ed.Ch.docx
dorishigh
 
CVPSales price per unit$75.00Variable Cost per unit$67.00Fixed C.docx
CVPSales price per unit$75.00Variable Cost per unit$67.00Fixed C.docxCVPSales price per unit$75.00Variable Cost per unit$67.00Fixed C.docx
CVPSales price per unit$75.00Variable Cost per unit$67.00Fixed C.docx
dorishigh
 
CYB207 v2Wk 4 – Assignment TemplateCYB205 v2Page 2 of 2.docx
CYB207 v2Wk 4 – Assignment TemplateCYB205 v2Page 2 of 2.docxCYB207 v2Wk 4 – Assignment TemplateCYB205 v2Page 2 of 2.docx
CYB207 v2Wk 4 – Assignment TemplateCYB205 v2Page 2 of 2.docx
dorishigh
 
CUSTOMERSERVICE-TRAINIGPROGRAM 2 TA.docx
CUSTOMERSERVICE-TRAINIGPROGRAM 2  TA.docxCUSTOMERSERVICE-TRAINIGPROGRAM 2  TA.docx
CUSTOMERSERVICE-TRAINIGPROGRAM 2 TA.docx
dorishigh
 
Customer Service Test (Chapter 6 - 10)Name Multiple Choice.docx
Customer Service Test (Chapter 6 - 10)Name Multiple Choice.docxCustomer Service Test (Chapter 6 - 10)Name Multiple Choice.docx
Customer Service Test (Chapter 6 - 10)Name Multiple Choice.docx
dorishigh
 
Customer requests areProposed Cloud Architecture (5 pages n.docx
Customer requests areProposed Cloud Architecture (5 pages n.docxCustomer requests areProposed Cloud Architecture (5 pages n.docx
Customer requests areProposed Cloud Architecture (5 pages n.docx
dorishigh
 
Customer Relationship Management Presented ByShan Gu Cris.docx
Customer Relationship Management Presented ByShan Gu Cris.docxCustomer Relationship Management Presented ByShan Gu Cris.docx
Customer Relationship Management Presented ByShan Gu Cris.docx
dorishigh
 
Custom Vans Inc. Custom Vans Inc. specializes in converting st.docx
Custom Vans Inc. Custom Vans Inc. specializes in converting st.docxCustom Vans Inc. Custom Vans Inc. specializes in converting st.docx
Custom Vans Inc. Custom Vans Inc. specializes in converting st.docx
dorishigh
 

More from dorishigh (20)

Cyber War versus Cyber Realities Cyber War v.docx
Cyber War versus Cyber Realities Cyber War v.docxCyber War versus Cyber Realities Cyber War v.docx
Cyber War versus Cyber Realities Cyber War v.docx
 
Cyber terrorism, by definition, is the politically motivated use.docx
Cyber terrorism, by definition, is the politically motivated use.docxCyber terrorism, by definition, is the politically motivated use.docx
Cyber terrorism, by definition, is the politically motivated use.docx
 
Cyber Security ThreatsYassir NourDr. Fonda IngramETCS-690 .docx
Cyber Security ThreatsYassir NourDr. Fonda IngramETCS-690 .docxCyber Security ThreatsYassir NourDr. Fonda IngramETCS-690 .docx
Cyber Security ThreatsYassir NourDr. Fonda IngramETCS-690 .docx
 
Cyber Security in Industry 4.0Cyber Security in Industry 4.0 (.docx
Cyber Security in Industry 4.0Cyber Security in Industry 4.0 (.docxCyber Security in Industry 4.0Cyber Security in Industry 4.0 (.docx
Cyber Security in Industry 4.0Cyber Security in Industry 4.0 (.docx
 
Cyber Security and the Internet of ThingsVulnerabilities, T.docx
Cyber Security and the Internet of ThingsVulnerabilities, T.docxCyber Security and the Internet of ThingsVulnerabilities, T.docx
Cyber Security and the Internet of ThingsVulnerabilities, T.docx
 
Cyber Security Gone too farCarlos Diego LimaExce.docx
Cyber Security Gone too farCarlos Diego LimaExce.docxCyber Security Gone too farCarlos Diego LimaExce.docx
Cyber Security Gone too farCarlos Diego LimaExce.docx
 
CW 1R Checklist and Feedback Sheet Student Copy Go through this.docx
CW 1R Checklist and Feedback Sheet Student Copy Go through this.docxCW 1R Checklist and Feedback Sheet Student Copy Go through this.docx
CW 1R Checklist and Feedback Sheet Student Copy Go through this.docx
 
CW 1 Car Industry and AIby Victoria StephensonSubmission.docx
CW 1 Car Industry and AIby Victoria StephensonSubmission.docxCW 1 Car Industry and AIby Victoria StephensonSubmission.docx
CW 1 Car Industry and AIby Victoria StephensonSubmission.docx
 
CWTS CWFT Module 7 Chapter 2 Eco-maps 1 ECO-MAPS .docx
CWTS CWFT Module 7 Chapter 2 Eco-maps 1 ECO-MAPS .docxCWTS CWFT Module 7 Chapter 2 Eco-maps 1 ECO-MAPS .docx
CWTS CWFT Module 7 Chapter 2 Eco-maps 1 ECO-MAPS .docx
 
Cw2 Marking Rubric Managerial Finance0Fail2(1-29) Fail.docx
Cw2 Marking Rubric Managerial Finance0Fail2(1-29) Fail.docxCw2 Marking Rubric Managerial Finance0Fail2(1-29) Fail.docx
Cw2 Marking Rubric Managerial Finance0Fail2(1-29) Fail.docx
 
Cyber AttacksProtecting National Infrastructure, 1st ed.Ch.docx
Cyber AttacksProtecting National Infrastructure, 1st ed.Ch.docxCyber AttacksProtecting National Infrastructure, 1st ed.Ch.docx
Cyber AttacksProtecting National Infrastructure, 1st ed.Ch.docx
 
CVPSales price per unit$75.00Variable Cost per unit$67.00Fixed C.docx
CVPSales price per unit$75.00Variable Cost per unit$67.00Fixed C.docxCVPSales price per unit$75.00Variable Cost per unit$67.00Fixed C.docx
CVPSales price per unit$75.00Variable Cost per unit$67.00Fixed C.docx
 
CYB207 v2Wk 4 – Assignment TemplateCYB205 v2Page 2 of 2.docx
CYB207 v2Wk 4 – Assignment TemplateCYB205 v2Page 2 of 2.docxCYB207 v2Wk 4 – Assignment TemplateCYB205 v2Page 2 of 2.docx
CYB207 v2Wk 4 – Assignment TemplateCYB205 v2Page 2 of 2.docx
 
CUSTOMERSERVICE-TRAINIGPROGRAM 2 TA.docx
CUSTOMERSERVICE-TRAINIGPROGRAM 2  TA.docxCUSTOMERSERVICE-TRAINIGPROGRAM 2  TA.docx
CUSTOMERSERVICE-TRAINIGPROGRAM 2 TA.docx
 
Customer Service Test (Chapter 6 - 10)Name Multiple Choice.docx
Customer Service Test (Chapter 6 - 10)Name Multiple Choice.docxCustomer Service Test (Chapter 6 - 10)Name Multiple Choice.docx
Customer Service Test (Chapter 6 - 10)Name Multiple Choice.docx
 
Customer Value Funnel Questions1. Identify the relevant .docx
Customer Value Funnel Questions1. Identify the relevant .docxCustomer Value Funnel Questions1. Identify the relevant .docx
Customer Value Funnel Questions1. Identify the relevant .docx
 
Customer service is something that we have all heard of and have som.docx
Customer service is something that we have all heard of and have som.docxCustomer service is something that we have all heard of and have som.docx
Customer service is something that we have all heard of and have som.docx
 
Customer requests areProposed Cloud Architecture (5 pages n.docx
Customer requests areProposed Cloud Architecture (5 pages n.docxCustomer requests areProposed Cloud Architecture (5 pages n.docx
Customer requests areProposed Cloud Architecture (5 pages n.docx
 
Customer Relationship Management Presented ByShan Gu Cris.docx
Customer Relationship Management Presented ByShan Gu Cris.docxCustomer Relationship Management Presented ByShan Gu Cris.docx
Customer Relationship Management Presented ByShan Gu Cris.docx
 
Custom Vans Inc. Custom Vans Inc. specializes in converting st.docx
Custom Vans Inc. Custom Vans Inc. specializes in converting st.docxCustom Vans Inc. Custom Vans Inc. specializes in converting st.docx
Custom Vans Inc. Custom Vans Inc. specializes in converting st.docx
 

Recently uploaded

Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
EADTU
 
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
AnaAcapella
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
heathfieldcps1
 

Recently uploaded (20)

OS-operating systems- ch05 (CPU Scheduling) ...
OS-operating systems- ch05 (CPU Scheduling) ...OS-operating systems- ch05 (CPU Scheduling) ...
OS-operating systems- ch05 (CPU Scheduling) ...
 
dusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learningdusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learning
 
Model Attribute _rec_name in the Odoo 17
Model Attribute _rec_name in the Odoo 17Model Attribute _rec_name in the Odoo 17
Model Attribute _rec_name in the Odoo 17
 
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
 
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
What is 3 Way Matching Process in Odoo 17.pptx
What is 3 Way Matching Process in Odoo 17.pptxWhat is 3 Way Matching Process in Odoo 17.pptx
What is 3 Way Matching Process in Odoo 17.pptx
 
Introduction to TechSoup’s Digital Marketing Services and Use Cases
Introduction to TechSoup’s Digital Marketing  Services and Use CasesIntroduction to TechSoup’s Digital Marketing  Services and Use Cases
Introduction to TechSoup’s Digital Marketing Services and Use Cases
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
PANDITA RAMABAI- Indian political thought GENDER.pptx
PANDITA RAMABAI- Indian political thought GENDER.pptxPANDITA RAMABAI- Indian political thought GENDER.pptx
PANDITA RAMABAI- Indian political thought GENDER.pptx
 

Curve-fitting Project - Linear Model (due at the end of Week 5)I.docx

  • 1. Curve-fitting Project - Linear Model (due at the end of Week 5) Instructions For this assignment, collect data exhibiting a relatively linear trend, find the line of best fit, plot the data and the line, interpret the slope, and use the linear equation to make a prediction. Also, find r2 (coefficient of determination) and r (correlation coefficient). Discuss your findings. Your topic may be that is related to sports, your work, a hobby, or something you find interesting. If you choose, you may use the suggestions described below. A Linear Model Example and Technology Tips are provided in separate documents. Tasks for Linear Regression Model (LR) (LR-1) Describe your topic, provide your data, and cite your source. Collect at least 8 data points. Label appropriately. (Highly recommended: Post this information in the Linear Model Project discussion as well as in your completed project. Include a brief informative description in the title of your posting. Each student must use different data.) The idea with the discussion posting is two-fold: (1) To share your interesting project idea with your classmates, and (2) To give me a chance to give you a brief thumbs-up or thumbs-down about your proposed topic and data. Sometimes students get off on the wrong foot or misunderstand the intent of the project, and your posting provides an opportunity for some feedback. Remark: Students may choose similar topics, but must have different data sets. For example, several students may be interested in a particular Olympic sport, and that is fine, but they must collect different data, perhaps from different events or different gender.
  • 2. (LR-2) Plot the points (x, y) to obtain a scatterplot. Use an appropriate scale on the horizontal and vertical axes and be sure to label carefully. Visually judge whether the data points exhibit a relatively linear trend. (If so, proceed. If not, try a different topic or data set.) (LR-3) Find the line of best fit (regression line) and graph it on the scatterplot. State the equation of the line. (LR-4) State the slope of the line of best fit. Carefully interpret the meaning of the slope in a sentence or two. (LR-5) Find and state the value of r2, the coefficient of determination, and r, the correlation coefficient. Discuss your findings in a few sentences. Is r positive or negative? Why? Is a line a good curve to fit to this data? Why or why not? Is the linear relationship very strong, moderately strong, weak, or nonexistent? (LR-6) Choose a value of interest and use the line of best fit to make an estimate or prediction. Show calculation work. (LR-7) Write a brief narrative of a paragraph or two. Summarize your findings and be sure to mention any aspect of the linear model project (topic, data, scatterplot, line, r, or estimate, etc.) that you found particularly important or interesting. You may submit all of your project in one document or a combination of documents, which may consist of word processing documents or spreadsheets or scanned handwritten work, provided it is clearly labeled where each task can be found. Be sure to include your name. Projects are graded on the basis of completeness, correctness, ease in locating all of the checklist items, and strength of the narrative portions. Here are some possible topics:
  • 3. Choose an Olympic sport -- an event that interests you. Go to http://www.databaseolympics.com/and collect data for winners in the event for at least 8 Olympic games (dating back to at least 1980). (Example: Winning times in Men's 400 m dash). Make a quick plot for yourself to "eyeball" whether the data points exhibit a relatively linear trend. (If so, proceed. If not, try a different event.) After you find the line of best fit, use your line to make a prediction for the next Olympics (2014 for a winter event, 2016 for a summer event ). Choose a particular type of food. (Examples: Fish sandwich at fast-food chains, cheese pizza, breakfast cereal) For at least 8 brands, look up the fat content and the associated calorie total per serving. Make a quick plot for yourself to "eyeball" whether the data exhibit a relatively linear trend. (If so, proceed. If not, try a different type of food.) After you find the line of best fit, use your line to make a prediction corresponding to a fat amount not occurring in your data set.) Alternative: Look up carbohydrate content and associated calorie total per serving. Choose a sport that particularly interests you and find two variables that may exhibit a linear relationship. For instance, for each team for a particular season in baseball, find the total runs scored and the number of wins. Excellent websites: http://www.databasesports.com/ and http://www.baseball- reference.com/ (Sample) Curve-Fitting Project - Linear Model: Men's 400 Meter Dash Submitted by Suzanne Sands (LR-1) Purpose: To analyze the winning times for the Olympic Men's 400 Meter Dash using a linear model Data: The winning times were retrieved from http://www.databaseolympics.com/sport/sportevent.htm?sp=AT H&enum=130 The winning times were gathered for the most recent 16 Summer Olympics, post-WWII. (More data was available, back to 1896.)
  • 4. ( Page 4 of 4 ) ( Summer Olympics: Men's 400 Meter Dash Winning Times Year Time (seconds) 1948 46.20 1952 45.90 1956 46.70 1960 44.90 1964 45.10 1968 43.80 1972 44.66 1976 44.26 1980 44.60 1984 44.27 1988 43.87
  • 6. 2000 1992 1984 1976 Year 1968 1960 1952 43.00 1944 43.50 44.00 44.50 45.00 45.50 46.00 46.50 47.00 Summer Olympics: Men's 400 Meter Dash Winning Times ) ( Time (seconds) )As one would expect, the winning times generally show a downward trend, as stronger competition and training methods result in faster speeds. The trend is somewhat linear. ( 2008 2000 1992 1984 1976 Year
  • 7. 1968 1960 1952 43.00 1944 43.50 44.00 44.50 45.00 45.50 y = -0.0431x + 129.84 R² = 0.6991 46.00 46.50 47.00 Summer Olympics: Men's 400 Meter Dash Winning Times ) ( Time (seconds) )(LR-3)
  • 8. Line of Best Fit (Regression Line) y = (in seconds) (LR- times are generally decreasing. The slope indicates that in general, the winning time decreases by 0.0431 second a year, and so the winning time decreases at an average rate of 4(0.0431) = 0.1724 second each 4-year Olympic interval. (LR-5) Values of r2 and r: r2 = 0.6991 We know that the slope of the regression line is negative so the correlation coefficient r must be negative. �� = −√0.6991 = −0.84 correlation (relatively close to -1 but not very strong). (LR-6) Prediction: For the 2012 Summer Olympics, substitute x The regression line predicts a winning time of 43.1 seconds for the Men's 400 Meter Dash in the 2012 Summer Olympics in London. (LR-7) Narrative:
  • 9. The data consisted of the winning times for the men's 400m event in the Summer Olympics, for 1948 through 2008. The data exhibit a moderately strong downward linear trend, looking overall at the 60 year period. The regression line predicts a winning time of 43.1 seconds for the 2012 Summer Olympics, which would be nearly 0.4 second less than the existing Olympic record of 43.49 seconds, quite a feat! Will the regression line's prediction be accurate? In the last two decades, there appears to be more of a cyclical (up and down) trend. Could winning times continue to drop at the same average rate? Extensive searches for talented potential athletes and improved full-time training methods can lead to decreased winning times, but ultimately, there will be a physical limit for humans. Note that there were some unusual data points of 46.7 seconds in 1956 and 43.80 in 1968, which are far above and far below the regression line. If we restrict ourselves to looking just at the most recent winning times, beyond 1968, for Olympic winning times in 1972 and beyond (10 winning times), we have the following scatterplot and regression line. ( Time (seconds) )
  • 10. ( Year 2008 2000 1992 1984 1976 44.20 44.00 43.80 43.60 43.40 1968 y = -0.025x + 93.834 R² = 0.5351 44.60 44.40 Summer Olympics: Men's 400 Meter Dash Winning Times 44.80
  • 11. )Using the most recent ten winning times, our regression line is 43.5 seconds. This line predicts a winning time of 43.5 seconds for 2012 and that would indicate an excellent time close to the existing record of 43.49 seconds, but not dramatically below it. Note too that for r2 = 0.5351 and for the negatively sloping line, the correlation coefficient is �� = −√0.5351 = −0.73, not as strong as when we considered the time period going back to 1948. The most recent set of 10 winning times do not visually exhibit as strong a linear trend as the set of 16 winning times dating back to 1948. CONCLUSION: I have examined two linear models, using different subsets of the Olympic winning times for the men's 400 meter dash and both have moderately strong negative correlation coefficients. One model uses data extending back to 1948 and predicts a winning time of 43.1 seconds for the 2012 Olympics, and the other model uses data from the most recent 10 Olympic games and predicts 43.5 seconds. My guess is that 43.5 will be closer to the actual winning time. We will see what happens later this summer!UPDATE: When the race was run in August, 2012, the winning time was 43.94 seconds. Scatterplots, Linear Regression, and Correlation When we have a set of data, often we would like to develop a model that fits the data. First we graph the data points (x, y) to get a scatterplot. Take the data, determine an appropriate scale on the horizontal axis and the vertical axis, and plot the points, carefully labeling the
  • 12. scale and axes. ( Summer Olympics: Men's 400 Meter Dash Winning Times Year (x) Time(y) (seconds) 1948 46.20 1952 45.90 1956 46.70 1960 44.90 1964 45.10 1968 43.80 1972 44.66 1976 44.26 1980 44.60 1984 44.27 1988 43.87 1992 43.50 1996 43.49 2000 43.84 2004 44.00
  • 13. 2008 43.75 ) Burger Fat (x) (grams) Calories (y) Wendy's Single 20 420 BK Whopper Jr. 24 420 McDonald's Big Mac 28 530 Wendy's Big Bacon Classic 30 580 Hardee's The Works 30 530 McDonald's Arch Deluxe 34 610 BK King Double Cheeseburger 39 640 Jack in the Box Jumbo Jack 40 650 BK Big King 43 660 BK King Whopper 46
  • 14. 730 Data from 1997 If the scatterplot shows a relatively linear trend, we try to fit a linear model, to find a line of best fit. We could pick two arbitrary data points and find the line through them, but that would not necessarily provide a good linear model representative of all the data points. A mathematical procedure that finds a line of "best fit" is called linear regression. This procedure is also called the method of least squares, as it minimizes the sum of the squares of the deviations of the points from the line. In MATH 107, we use software to find the regression line. (We can use Microsoft Excel, or Open Office, or a hand-held calculator or an online calculator --- more on this in the Technology Tips topic.) Linear regression software also typically reports parameters denoted by r or r2. The real number r is called the correlation coefficient and provides a measure of the strength of the linear relationship. r = 1 indicates perfect positive correlation --- the regression line has positive slope and all of the data points are on the line. --- the regression line has negative slope and all of the data points are on the line
  • 15. The closer |r| is to 1, the stronger the linear correlation. If r = 0, there is no correlation at all. The following examples provide a sense of what an r value indicates. Source: The Basic Practice of Statistics, David S. Moore, page 108. Notice that a positive r value is associated with an increasing trend and a negative r value is associated with a decreasing trend. The strongest linear models have r values close to 1 or The nonnegative real number r2 is called the coefficient of determination and is the square of the correlation coefficient r. is to 1, the stronger the indication of a linear relationship. Some software packages (such as Excel) report r2, and so to get r, take the square root of r2 and determine the sign of r by RESOURCES: Desmos Graphing Calculator and Linear Regression You can use the free online Desmos Graphing Calculator to produce a scatterplot and find the regression line and correlation coefficient. Go to https://www.desmos.com/calculatorand launch the calculator.
  • 16. Select "table" from the menu at the upper left. ( Page 1 of 7 ) Data for Project Example (Men's 400 Meter Dash) has been entered. Regression help can be accessed via the "?" icon. Select "expression" from the menu at the upper left. Type y1 ~ mx1 + b and the values of r2, r, m, and b automatically appear. Selecting the tool at the upper right, you can then adjust the scales on the x and y axes and create labels. You can give your graph a name. In order to save your graph, sign in with a free account and click the share button. If you share the given link, then by followiing the link, the graph can be opened and manipulated. If you click the Image button, then you can save the graph as a file.
  • 17. After clicking the Image button, you can view the graph as a stand-alone image, and select from several options to save. To complete the Linear Model portion of the project, you will need to use technology (or hand-drawing) to create a scatterplot, find the regression line, plot the regression line, and find r and r2. Below are some options, together with some videos. Each video is limited to 5 minutes or less. It takes a bit of time for the video to initially download. When playing the video, if you want to slow it down to read the text, hit the pause icon. (If you run the mouse over the bottom of the video screen, the video controls will appear.) You may need to adjust the volume. The basic options are to: (1) Generate by hand and scan. (2) Use Microsoft Excel. Visit Scatterplot - Start(VIDEO) to see how to create a scatter plot using Microsoft Excel and format the axes. Visit Scatterplot - Regression Line(VIDEO) to see how to add labels and title to the scatterplot, how to generate and graph the line of best fit (regression) and obtain the value of r2 in Microsoft Excel. Using Excel to obtain precise values of slope m and y-intercept b of the regression line: Video, Spreadsheet (3) Use Open Office. (4) Use a hand-held graphing calculator (See section 2.5 in your textbook for help with Texas Instruments hand-held calculators.)
  • 18. (5) Use a free online tool Use the free Desmos calculator: See DesmosLinearRegressionGuide.pdfto view how to generate a scatterplot and carry out linear regression. The result of the free tool might not be as nice looking as the Microsoft Excel version, but it is free. The Linear Project Example uses Microsoft Excel.