I am Sarah Reynolds, currently associated with statisticsassignmenthelp.com as a statistics assignment helper. After completing my master's at the University of London, UK, I was looking for an opportunity to expand my knowledge, so I decided to help students with their assignments. To date, I have written several statistics assignments to help students overcome the difficulties they face in linear regression analysis assignments.
1. Linear Regression Analysis Assignment Help
For More Visit:
https://www.statisticsassignmenthelp.com/
support@statisticsassignmenthelp.com
+1 (315) 557-6473
2. Part 1:
The police chief asks you to analyze the logs from emergency 911 calls in the city and then provide
a summary of that data.
A. Prepare a dataset from the data provided in the “Raw Data” spreadsheet, attached below. Remove
any potential errors or outliers, duplicate records, or data that are not necessary. Provide a clean
copy of the data in your submission.
The variable “At Scene Time” was removed because it had too many missing values. The other
variables have no missing values or errors, so they were retained.
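The cleaning step described above can be sketched in Python. This is a minimal illustration using only the standard library, not the procedure actually used for the submission. The sample rows, the "Event Number" and "Event Type" column names, and the 50% missing-value cutoff are assumptions made for the example; only the column name "At Scene Time" comes from the answer above.

```python
def drop_sparse_columns(rows, max_missing=0.5):
    """Remove columns whose share of empty values exceeds max_missing."""
    if not rows:
        return []
    keep = []
    for col in rows[0].keys():
        missing = sum(1 for r in rows if r[col] in ("", None))
        if missing / len(rows) <= max_missing:
            keep.append(col)
    return [{col: r[col] for col in keep} for r in rows]


def drop_duplicates(rows):
    """Keep the first copy of each identical record, preserving order."""
    seen = set()
    unique = []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


# Hypothetical sample; the real "Raw Data" spreadsheet is not reproduced here.
raw = [
    {"Event Number": "1", "Event Type": "THEFT", "At Scene Time": ""},
    {"Event Number": "2", "Event Type": "TRESPASS", "At Scene Time": ""},
    {"Event Number": "2", "Event Type": "TRESPASS", "At Scene Time": ""},
    {"Event Number": "3", "Event Type": "FRAUD CALLS", "At Scene Time": "10:15"},
]

clean = drop_duplicates(drop_sparse_columns(raw))
print(clean)
```

In this sample, "At Scene Time" is empty in three of four rows and is dropped, and the repeated record is then removed.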
B. Explain why you removed each column and row from the “Raw Data” spreadsheet, or why you
imputed data in empty fields as you prepared the data for analysis.
C. Using your cleaned dataset, create data sheets that provide each of the following to represent the
requested aggregated data.
• table: date and number of events
Date          Frequency
26th March    244
27th March    583
28th March    219
• bar graph: date and number of events
[Figure: Bar chart of Events by Day. x-axis: Date; y-axis: Frequency]
• table: number of incident occurrences by event type
Event Type           Frequency
ANIMAL COMPLAINTS    2
statisticsassignmenthelp.com
4. • table: sectors and number of events
District/Sectors    Frequency
B                   83
C                   44
D                   60
E                   86
F                   35
G                   39
H                   125
J                   41
K                   64
L                   38
M                   91
N                   53
O                   31
Q                   62
R                   60
S                   44
U                   52
W                   37
• bar graph: sectors and number of events
[Figure: Bar chart of Event Types. x-axis: Event Types; y-axis: Frequency]
5. D. Summarize your observations from reviewing the data sheets you have created
The dataset contains 1,046 events recorded over three days, from 26th March 2016
to 28th March 2016. More than 50% of the events (583 of 1,046) occurred on
27th March 2016. The most frequent event types were suspicious circumstances
(185 events), traffic-related events (156) and disturbances (135). District H
recorded the most events, at 125 incidents, followed by District M with 91.
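The daily shares behind this summary can be checked with a few lines of Python. The counts are copied from the date table above; the variable names are only illustrative.

```python
# Event counts per day, taken from the date/frequency table.
events_by_day = {"26th March": 244, "27th March": 583, "28th March": 219}

total = sum(events_by_day.values())

# Percentage of all events that fell on each day, rounded to one decimal.
share_by_day = {day: round(100 * n / total, 1) for day, n in events_by_day.items()}

# Day with the largest number of events.
busiest_day = max(events_by_day, key=events_by_day.get)

print(total)
print(share_by_day)
print(busiest_day)
```

The three daily counts add up to the 1046 events quoted in the summary, and 27th March accounts for a little under 56% of them.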
Part 2:
The state governor has offered an additional funding incentive for police
departments that are able to meet the standard of having a minimum of 2.5
officers onsite per incident. The police department has asked you to analyze
their data to determine if the department will be eligible for additional funding,
using the attached linear regression.
E. Describe the fit of the linear regression line to the data, providing graphical
representations or tables as evidence to support your description.
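The linear regression’s R² value is 0.8795, which means that the number of incidents
explains about 88% of the variation in officers at scene. This indicates that the
line fits the data well.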
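As a reminder of what the R² figure measures, the sketch below fits a least-squares line and computes R² as one minus the ratio of the residual sum of squares to the total sum of squares. The incident and officer values are made-up placeholders, since the district-level data behind the attached regression is not reproduced here; the function names are likewise illustrative.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y):
    """Share of the variance in y explained by the fitted line."""
    slope, intercept = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical values for illustration only (not the assignment's data).
incidents = [30, 40, 50, 60, 80, 90]
officers = [70, 78, 100, 108, 145, 150]

print(round(r_squared(incidents, officers), 4))
```

A value close to 1 means the points lie close to the fitted line; a value near 0 means the line explains little of the variation.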
F. Describe the impact of the outliers on the regression model, providing graphical
representations or tables as evidence to support your description.
[Figure: Bar chart of events by sector. x-axis: District/Sectors (B to W); y-axis: Frequency]
6. The scatterplot indicates that there is one possible outlier, located at the extreme
right of the graph, with 125 incidents and 165 officers at scene. Because this point lies
below the fitted line, it is likely to have reduced the slope of the equation and
flattened the regression model.
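A quick check with the fitted equation reported for the attached regression (y = 1.491x + 21.914) shows how far this point sits from the line. The coefficients and the point (125 incidents, 165 officers) come from the text; the variable names are illustrative.

```python
# Fitted line reported for the regression: officers = 1.491 * incidents + 21.914
SLOPE = 1.491
INTERCEPT = 21.914

# The suspected outlier described in the text.
incidents = 125
officers_observed = 165

officers_predicted = SLOPE * incidents + INTERCEPT
residual = officers_observed - officers_predicted

print(round(officers_predicted, 1))  # about 208.3
print(round(residual, 1))            # about -43.3
```

A residual of roughly -43 at the largest incident count is consistent with the point pulling the right-hand end of the line downward.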
G. Create a residual plot and explain how to improve the linear regression model based on
your interpretation of the plot.
The residual plot shows no pattern between the residuals and the number of
incidents, which suggests that the residuals are homoscedastic. One point in the
residual plot stands apart from the rest, which suggests that the outlier should be
investigated further before it is included in the model.
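One way to investigate such a point is to fit the line with and without it and compare the slopes. The sketch below does this on made-up numbers; only the last point (125 incidents, 165 officers) is taken from the text, and the other values and the helper name `ols_slope` are assumptions for illustration.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical districts, with the suspect point (125, 165) as the last entry.
incidents = [30, 40, 50, 60, 70, 80, 90, 125]
officers = [75, 92, 108, 125, 140, 158, 175, 165]

slope_all = ols_slope(incidents, officers)
slope_without = ols_slope(incidents[:-1], officers[:-1])

print(round(slope_all, 3), round(slope_without, 3))
```

If the slope is noticeably steeper once the point is left out, that supports the reading that the outlier flattens the line. Whether to exclude it would still depend on whether the record is a data error or a genuine observation.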
H. Using the linear regression analysis, explain if the department qualifies for additional
state funding, including any limitations posed by the available data to the assessment of
the department’s current funding eligibility.
[Figure: Officers at Scene (scatterplot with fitted line). x-axis: No. of Incidents; y-axis: Officers at Scene; y = 1.491x + 21.914, R² = 0.8795]
[Figure: Residuals of linear regression. x-axis: Number of incidents; y-axis: Residuals]
7. The regression coefficient for the number of incidents is 1.491, which indicates
that for each additional incident, 1.491 officers are deployed onsite. This falls well
below the state’s threshold of 2.5. Upon further review of the data, no district
sector met the governor’s threshold of 2.5 officers onsite per incident. The highest
value of officers per incident is 2.324 in district W, followed by 2.322 in district O.
Therefore, it is concluded that the department does not qualify for the additional
funding proposed by the state’s governor.
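The slope is the number of officers per additional incident, while the funding standard refers to officers per incident overall. Under the fitted line, the implied overall ratio is 1.491 + 21.914/x, which can be compared with the 2.5 threshold as sketched below. The coefficients and the threshold come from the text, and 31 and 125 are the smallest and largest sector counts in the table; the function name is illustrative, and the ratios are what the fitted line implies, not the observed district values.

```python
SLOPE = 1.491        # officers per additional incident (from the fitted line)
INTERCEPT = 21.914   # fitted officers at zero incidents
THRESHOLD = 2.5      # officers per incident required by the funding standard


def implied_ratio(incidents):
    """Officers per incident implied by the fitted line at a given incident count."""
    return (SLOPE * incidents + INTERCEPT) / incidents


# Incident count at which the implied ratio falls to exactly the threshold:
# SLOPE + INTERCEPT / x = THRESHOLD  ->  x = INTERCEPT / (THRESHOLD - SLOPE)
break_even = INTERCEPT / (THRESHOLD - SLOPE)

print(round(break_even, 1))          # about 21.7 incidents
print(round(implied_ratio(31), 2))   # smallest sector count in the table
print(round(implied_ratio(125), 2))  # largest sector count in the table
```

Because every sector in the table has more than 22 events, the fitted line implies a ratio below 2.5 across the whole observed range, which agrees with the conclusion above.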
I. Describe the precautions or behaviors that should be exercised when working
with and communicating about the sensitive data in this scenario.
When working with sensitive data, proper care must be taken to identify and remove
outliers and data errors, which may skew the regression models. Models should be
built with backing from the literature, so that there is theoretical support for the
model being proposed. This helps to avoid issues such as spurious correlation.
J. Acknowledge sources, using in-text citations and references, for content that
is quoted, paraphrased, or summarized.
K. Demonstrate professional communication in the content and presentation of
your submission.