NFL Injury Analysis:
Synthetic v Natural Fields
January 2, 2020
By: Elijah Hall
Contents
▪ Problem Statement
▪ Hypothesis and Results
▪ The Data
▪ My Features
▪ Exploratory Data Analysis
▪ Hypothesis 1
▪ Hypothesis 2
▪ Conclusions
Problem Statement
▪ In the NFL, 12 of the 31 stadiums have fields with synthetic turf.
▪ Investigations of lower limb injuries among football athletes have indicated significantly
higher injury rates on synthetic turf compared with natural turf
– (Mack et al., 2018; Loughran et al., 2019).
▪ Synthetic turf surfaces do not release cleats as readily as natural turf and
may contribute to the incidence of non-contact lower limb injuries
– (Kent et al., 2015).
▪ It has yet to be determined whether player movement patterns and other
measures of player performance differ across playing surfaces and how
these may contribute to the incidence of lower limb injury.
Problem Statement | Hypothesis and Results |The Data | My Features | Exploratory Data Analysis | Hypothesis 1 | Hypothesis 2 | Conclusion
Hypothesis & Summary of Results
▪ H01: Player movement patterns are the same on natural fields as synthetic.
– There is no strong evidence to reject this hypothesis. Players appear to move roughly the same
regardless of field type.
▪ H02: Player movement patterns are the same between injured and not injured players.
– The hypothesis tests do not reject this at the 0.05 level (p-values of 0.107 and 0.083), but they give some
evidence that injured players move differently from those who are not, which may suggest an increased risk of injury.
▪ Modeling Questions:
– Are player movement patterns significant in predicting risk of injury?
▪ The movement metric derived is significant in predicting injury in players.
– Is field type significant in predicting risk of injury?
▪ FieldType_Synthetic is significant for an increased risk of injury.
A combination of several features increases the overall risk of
injury. The leading ones of interest are movement patterns and field type.
The Data
▪ Injury Record: The injury record file contains information on 105
lower-limb injuries that occurred during regular season games over
the two seasons. There were 5 cases of multiple injuries.
▪ Play List: The play list file contains the details for the 267,005
player-plays that make up the dataset. Details about the game and
play include the player's assigned roster position, stadium type, field
type, weather, play type, position for the play, and position group.
▪ PlayerTrack Data: Player-level data that describes the location,
orientation, speed, and direction of each player during a play,
recorded at 10 Hz (i.e. 10 observations recorded per second).
My Features
▪ my_met: The log of the ratio of the total distance traveled along the
player track route to the straight-line distance between its start and end.
▪ diff_os: The change in orientation between consecutive observations,
multiplied by the velocity.
▪ os_met: This metric counts the number of times a route's diff_os
value falls outside the 95% threshold interval.
My Features: my_met
The log of the ratio of the total distance traveled along the
player track route to the straight-line distance between its
start and end points. This tells you more or less how straight
the player's path was. A large value indicates a very zig-zaggy
route like the one pictured (from start point X0Y0 to end point
XtYt along path l), and a very low one indicates a fairly
straight route.

$$\text{my\_met} = \log\left(\frac{dist(l)}{euc(X_0Y_0,\ X_tY_t)}\right)$$

where
X0Y0 is the path start point
XtYt is the path end point
l is the path traveled
dist() is the total distance traveled
euc() is the Euclidean distance between two points
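The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function name, the `eps` guard against a zero straight-line distance, and the two sample routes are all assumptions made for the example.

```python
import math


def my_met(path, eps=1e-9):
    """Log of (total distance traveled / straight-line start-to-end distance).

    `path` is a list of (x, y) points. `eps` is a hypothetical guard against
    division by zero and log(0); the deck does not say how this was handled.
    """
    if len(path) < 2:
        raise ValueError("a route needs at least two points")
    # dist(l): sum of the lengths of consecutive segments along the route
    total = sum(math.dist(p, q) for p, q in zip(path, path[1:]))
    # euc(X0Y0, XtYt): straight-line distance between first and last point
    straight = math.dist(path[0], path[-1])
    return math.log((total + eps) / (straight + eps))


# Illustrative routes (made up): one straight, one zig-zag
straight_route = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
zigzag_route = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (3.0, 1.0), (4.0, 0.0)]

print(my_met(straight_route), my_met(zigzag_route))
```

A perfectly straight route gives a ratio of 1 and therefore a value of 0, while any detour pushes the value above 0.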
My Features: diff_os
The change in orientation between consecutive observations, multiplied by the
velocity. My calculated velocity was found to differ from the speed metric. I
assumed that the X,Y coordinates were more likely to be accurate than the speed
and therefore chose to use my own calculated velocity. Additionally, there were
cases where zeros could be introduced, such as no recorded movement,
causing errors, and therefore a small noise term was used. The noise term is
a uniform random variable between {-1,1} * 0.01 to ensure it is small
enough. A positive value would indicate a left turn and a negative value a right turn.

$$\text{diff\_os}_t = \left(\mathrm{diff}(o_{t-1}, o_t) + \eta\right) \cdot v_t$$

where
o_t is the orientation at time t
η = 0.01 is the scale of the noise added to ensure no errors are caused by zeros
v_t is the velocity at time t
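One possible reading of this feature is sketched below. Several things here are assumptions rather than facts from the deck: that orientation is in degrees, that the difference is wrapped to [-180, 180), that the noise is added to the orientation change before multiplying by velocity, and that the noise is drawn uniformly from [-0.01, 0.01]. Whether a positive value corresponds to a left turn depends on the orientation convention of the tracking data, which this sketch does not verify.

```python
import random


def angle_diff(prev_deg, curr_deg):
    """Signed change in orientation, wrapped to [-180, 180) degrees (assumed units)."""
    return (curr_deg - prev_deg + 180.0) % 360.0 - 180.0


def diff_os(orientations, velocities, noise_scale=0.01, seed=0):
    """Orientation change times velocity for each consecutive pair of samples.

    The grouping (diff + noise) * velocity is one interpretation of the slide
    formula; noise_scale and seed are hypothetical parameters.
    """
    rng = random.Random(seed)
    values = []
    for t in range(1, len(orientations)):
        eta = rng.uniform(-1.0, 1.0) * noise_scale  # small uniform noise
        d = angle_diff(orientations[t - 1], orientations[t])
        values.append((d + eta) * velocities[t])
    return values


# Made-up samples: a 10 degree turn, then no change in orientation
sample = diff_os([0.0, 10.0, 10.0], [0.0, 2.0, 3.0])
print(sample)
```

Wrapping the difference matters because a move from 350 to 10 degrees is a 20 degree turn, not a 340 degree one.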
My Features: os_met
This metric counts the number of times a route's diff_os value falls outside
the 95% threshold interval. The thresholds are taken from the diff_os
distribution for the full comparison sample, at the 2.5th and 97.5th
percentiles.

$$\text{os\_met} = \sum_{i=0}^{t} \begin{cases} 1 & \text{if } \text{diff\_os}_i > 32.22 \\ 1 & \text{if } \text{diff\_os}_i < -29.88 \\ 0 & \text{otherwise} \end{cases}$$
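The count can be sketched as follows, using the two thresholds quoted on the slide as defaults. The helper that derives thresholds from a sample is an assumption about how the percentiles might be computed (here with the standard library's `statistics.quantiles`), not the author's procedure.

```python
from statistics import quantiles


def percentile_thresholds(sample):
    """Return (2.5th, 97.5th) percentile cut points of a comparison sample."""
    cuts = quantiles(sample, n=40)  # 39 cut points, spaced every 2.5%
    return cuts[0], cuts[-1]


def os_met(diff_os_values, lower=-29.88, upper=32.22):
    """Count observations whose diff_os lies strictly outside [lower, upper]."""
    return sum(1 for v in diff_os_values if v > upper or v < lower)


# Made-up route: two values fall outside the slide's thresholds
route = [0.0, 40.0, -35.0, 32.22, -29.88, 10.0]
print(os_met(route))

# Deriving thresholds from a (made-up) comparison sample instead
low, high = percentile_thresholds(list(range(1, 1001)))
print(low, high)
```

Values exactly equal to a threshold are not counted, matching the strict inequalities in the formula.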
Exploratory Analysis: Playtime by Last Play
Injured players play less per game regardless of field type
Permutation difference-of-means two-tailed test: p-value = 0.0882 (fail to reject H0)
Permutation difference-of-means two-tailed test: p-value = 0.0 (reject H0)
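A two-tailed permutation test on the difference of means can be sketched as below. This is a generic version of the technique under stated assumptions: the function name, the number of permutations, and the sample data are all hypothetical, and the deck does not show how its own test was implemented.

```python
import random


def permutation_test_diff_means(a, b, n_perm=10_000, seed=0):
    """Two-tailed permutation test for a difference in means.

    Returns (observed difference, p-value). The p-value is the share of random
    relabelings whose absolute mean difference is at least the observed one.
    """
    rng = random.Random(seed)
    observed = sum(a) / len(a) - sum(b) / len(b)
    pooled = list(a) + list(b)
    n_a = len(a)
    n_b = len(pooled) - n_a
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of group labels
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_perm


# Made-up playtime samples for two groups
group_1 = [10, 11, 12, 13, 14] * 4
group_2 = [0, 1, 2, 3, 4] * 4
obs, p_value = permutation_test_diff_means(group_1, group_2, n_perm=2000)
print(obs, p_value)
```

With this kind of test, a reported p-value of 0.0 means that none of the shuffled splits was as extreme as the observed one, so the underlying p-value is below roughly 1 / n_perm rather than exactly zero.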
Exploratory Analysis: Play Type
Injured plays: PlayType, Natural v Synthetic
                      natural  synthetic  natural %  synthetic %   Total
Kickoff                     4          4       0.50         0.50       8
Pass                       20         27       0.43         0.57      47
Punt                        6          3       0.67         0.33       9
Punt Not Returned           1          1       0.50         0.50       2
Punt Returned               2          2       0.50         0.50       4
Rush                       14         15       0.48         0.52      29

All plays: PlayType, Natural v Synthetic
                      natural  synthetic  natural %  synthetic %   Total
Extra Point              3400       2506       0.58         0.42    5906
Field Goal               2891       2024       0.59         0.41    4915
Kickoff                  3222       2532       0.56         0.44    5754
Kickoff Not Returned     2414       2211       0.52         0.48    4625
Kickoff Returned         1540       1233       0.56         0.44    2773
Pass                    82443      55636       0.60         0.40  138079
Punt                     3351       2395       0.58         0.42    5746
Punt Not Returned        2011       1475       0.58         0.42    3486
Punt Returned            1429       1040       0.58         0.42    2469
Rush                    53797      38809       0.58         0.42   92606
unk                       404        242       0.63         0.37     646
There are limited observations on injuries, but for the two
play types with more than 10 observations, Pass and Rush, we
can reasonably compare the proportions. Across all plays,
Pass is about 60:40 Natural to Synthetic, but among injuries
it is roughly the reverse, 43:57. This suggests Synthetic
fields have higher rates of injury on Pass plays. There is a
similar but weaker pattern on Rush plays, with 58:42 in all
plays and about 48:52 in injuries.
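The comparison can be made concrete with the counts from the two tables. The sketch below only rearranges those published counts; treating injuries per player-play as a rate, and the variable names, are choices made for this illustration.

```python
# (injured plays, all player-plays) per surface, copied from the tables above
counts = {
    "Pass": {"natural": (20, 82443), "synthetic": (27, 55636)},
    "Rush": {"natural": (14, 53797), "synthetic": (15, 38809)},
}

summary = {}
for play_type, by_surface in counts.items():
    inj_nat, all_nat = by_surface["natural"]
    inj_syn, all_syn = by_surface["synthetic"]
    summary[play_type] = {
        # Share of Synthetic among all plays vs among injured plays
        "synthetic_share_all": all_syn / (all_nat + all_syn),
        "synthetic_share_injured": inj_syn / (inj_nat + inj_syn),
        # Injuries per player-play on Synthetic relative to Natural
        "rate_ratio": (inj_syn / all_syn) / (inj_nat / all_nat),
    }

for play_type, row in summary.items():
    print(play_type, {k: round(v, 2) for k, v in row.items()})
```

A rate ratio above 1 means more injuries per player-play on Synthetic than on Natural for that play type. With so few injuries per cell, these ratios carry wide uncertainty.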
Exploratory Analysis: Weather
There are not enough observations of injured players
across the various weather categories to walk away with
any confident insights.
The only real takeaway is that most injuries
happened on cloudy or sunny days.
All plays: Weather, Natural v Synthetic
          natural  synthetic  natural %  synthetic %   Total
clear       25734       9749       0.73         0.27   35483
cloudy      67459      46093       0.59         0.41  113552
indoor       6352      28438       0.18         0.82   34790
rain         8284       4364       0.65         0.35   12648
sunny       43815      18546       0.70         0.30   62361
unk          4312        869       0.83         0.17    5181
snow          946        383       0.71         0.29    1329

Injured plays: Weather, Natural v Synthetic
          natural  synthetic  natural %  synthetic %   Total
clear           9          7       0.56         0.44      16
cloudy         17         22       0.44         0.56      39
indoor          3         14       0.18         0.82      17
rain            4          1       0.80         0.20       5
sunny          11         10       0.52         0.48      21
unk             3          1       0.75         0.25       4
snow            0          0       0.00         0.00       0
Exploratory Analysis: Play Type
[Figure: injuries by body part (Ankle, Knee, Foot, Toes, Heel), with axes labeled Mild to Severe and Unlikely to Likely]
        Count  Mode  Mean
Ankle      42     0  2.07
Foot        7     6  5.43
Heel        1     1  1.00
Knee       48     1  2.31
Toes        7     1  1.14
Exploratory Analysis: Base Rates of Injury
▪ Natural Injury Base Rate = 1.42%
▪ Synthetic Injury Base Rate = 2.33%
▪ Indoor Injury Base Rate = 2.43%
▪ Outdoor Injury Base Rate = 1.62%
Even though only 12 of the 31 fields are Synthetic, their injury
base rate is about 1.6 times that of Natural fields. A similar gap
is seen on indoor v outdoor, as they are near proxies for field
types.
           All Plays  Injured Plays  Base Rate
Natural         3311             47      1.42%
Synthetic       2401             56      2.33%
Indoor          1319             32      2.43%
Outdoor         4393             71      1.62%
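The base rates follow directly from the table, and the same arithmetic gives the ratio between surfaces. The snippet below is a small worked check using only the counts shown above; the variable names are illustrative.

```python
# (all plays, injured plays) as listed in the base-rate table
table = {
    "Natural": (3311, 47),
    "Synthetic": (2401, 56),
    "Indoor": (1319, 32),
    "Outdoor": (4393, 71),
}

# Base rate = injured plays / all plays
base_rates = {name: injured / total for name, (total, injured) in table.items()}

surface_ratio = base_rates["Synthetic"] / base_rates["Natural"]
venue_ratio = base_rates["Indoor"] / base_rates["Outdoor"]

for name, rate in base_rates.items():
    print(f"{name}: {rate:.2%}")
print(f"Synthetic / Natural: {surface_ratio:.2f}")
print(f"Indoor / Outdoor: {venue_ratio:.2f}")
```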
Exploratory Analysis: Speed v Velocity
There is some difference between the recorded speed and the
velocity I calculated from the X,Y coordinates. I chose my
velocity metric since it appears that the recorded speed may
be smoothed in some way.
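A velocity derived from positions can be sketched as the distance between consecutive samples divided by the sampling interval. The 10 Hz rate comes from the data description; the function name and the sample track are assumptions, and the result is in whatever distance unit the coordinates use, per second.

```python
import math


def velocities_from_track(xs, ys, hz=10.0):
    """Speed between consecutive (x, y) samples recorded at `hz` observations per second."""
    dt = 1.0 / hz  # 0.1 s between observations at 10 Hz
    return [
        math.hypot(x1 - x0, y1 - y0) / dt
        for x0, y0, x1, y1 in zip(xs, ys, xs[1:], ys[1:])
    ]


# Made-up track: the player moves 0.5 units per sample
xs = [0.0, 0.3, 0.6]
ys = [0.0, 0.4, 0.8]
track_velocities = velocities_from_track(xs, ys)
print(track_velocities)
```

Because each value depends on only two raw positions, this estimate is noisier than a smoothed speed signal, which may be part of the difference the slide describes.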
Exploratory Analysis: Velocity
There is a significant difference in velocity between Synthetic and Natural field
types: about a 10.2% drop in average velocity, with a mean velocity of 2.02 on
Synthetic and 2.25 on Natural. A Student's t-test gives a p-value of 0.000, which
is statistically significant at the 0.05 level. We therefore reject the null
hypothesis that velocities on Synthetic and Natural fields are sampled from the
same parent distribution.
This gets to the question about performance. Without points or other metrics,
this is the best available measure of performance.
Performance drops on Synthetic Fields
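The percentage drop and the form of the test can be sketched as follows. The two means are taken from the slide; everything else is an assumption for illustration, including the sample data and the use of a normal approximation for the p-value, which is only reasonable for large samples and stands in for the exact t distribution.

```python
import math
from statistics import NormalDist, mean, variance


def student_t(a, b):
    """Pooled-variance (Student's) two-sample t statistic and degrees of freedom."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / se, n_a + n_b - 2


def approx_two_sided_p(t_stat):
    """Large-sample normal approximation to the two-sided p-value (an assumption)."""
    return 2 * (1 - NormalDist().cdf(abs(t_stat)))


# Relative change in mean velocity, using the means reported on the slide
pct_change = (2.02 - 2.25) / 2.25 * 100
print(f"{pct_change:.1f}%")

# Made-up velocity samples, only to show the call pattern
synthetic = [2.0, 2.1, 1.9, 2.0]
natural = [2.3, 2.2, 2.4, 2.3]
t_stat, dof = student_t(synthetic, natural)
print(t_stat, dof, approx_two_sided_p(t_stat))
```

With hundreds of thousands of observations, even small differences in means produce very small p-values, so the size of the drop is worth reading alongside the significance result.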
Hypothesis 1: Field traveled
To the left is the heatmap of the most frequently
traveled areas of the field. It appears to have
two oval-like shapes that occur naturally because
of the way the game is played, with most
drives starting around the 20-yard line and
ending with decreasing probability with every
yard as they approach the end zone.
To the right you can see the same
heatmap for injury paths. Since there are
only 105, the heatmap is sparser, but
we can see that the paths are more
spread out and do not seem to follow the
same pattern as above.
Hypothesis 1: Field traveled
Above you can see the path run by a player on the play where he was injured. It is not
obvious where he was injured, but you can see the curviness of his route, and the plot
of his diff_os values to the right shows that he accelerates and decelerates.
Hypothesis 1: All diff_os
Looking at all diff_os paths, we see some obvious differences,
such as larger and more frequent spikes for injured players.
Hypothesis 1: Hypothesis testing for os_met and my_met
H0: os_met values for injured player track routes on
Synthetic fields are drawn from the same
population as routes on Natural fields.
Statistic = -1.073, p-value = 0.286 (fail to reject H0)
H0: my_met values for injured player track routes on
Synthetic fields are drawn from the same
population as routes on Natural fields.
Statistic = -0.404, p-value = 0.687 (fail to reject H0)
Movement patterns are not significantly different between field types
Hypothesis 1: Hypothesis testing for os_met and my_met
H0: os_met values for injured player track routes are
drawn from the same population as routes for
non-injured players.
Statistic = 1.621, p-value = 0.107 (fail to reject H0)
H0: my_met values for injured player track routes are
drawn from the same population as routes for
non-injured players.
Statistic = 1.742, p-value = 0.083 (fail to reject H0)
Even though there is not significant evidence to reject the null hypothesis, there is some evidence to suggest a
difference, which might need larger samples to see. I also would have expected the opposite of the distribution on
the right. The my_met measures the log-ratio of distance traveled, and this suggests that injured players have less
curvy routes, which might mean sharper turns.
Hypothesis 2: Modeling for Feature Importance
Models:
1. Logistic Regression
• Accuracy = 0.6747
• AUC = 0.67
2. XGBoostClassifier
• Accuracy = 0.95
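For reference, the two metrics quoted above can be computed as sketched here. This is a generic illustration of accuracy and ROC AUC on made-up labels and scores, not the evaluation code behind the reported numbers; the function names and the 0.5 threshold are assumptions.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = injured) and model scores
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical 0.5 threshold

print(accuracy(y_true, y_pred), roc_auc(y_true, scores))
```

If the injured class is rare in the evaluation set, accuracy can be high for a model that seldom predicts an injury, so AUC is a useful companion metric when comparing the two models.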
Hypothesis 2: Logistic Regression Feature Importance
FieldType_Synthetic is the most important feature for a linear
model. By contrast, my features seem not to be as important.
Hypothesis 2: Logistic Regression Feature Importance
Each prediction has different inputs, and therefore the various features have
varying importance to each prediction. To demonstrate this, we can see that for
"Player A" FieldType_Synthetic = 0 reduces the probability of this prediction
while PlayType_Rush = 1 increases it. You can contrast this with Player B.
[Figure: per-prediction feature contributions for Player A and Player B]
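One way to see how a feature value of 0 can still lower a prediction is to measure each feature's contribution relative to its average value. The sketch below does this for a linear model in log-odds space. The coefficients, feature means, and intercept are hypothetical numbers chosen for illustration, and the deck does not state which tool produced its per-prediction plots.

```python
import math


def sigmoid(z):
    """Logistic function mapping log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def contributions(coefs, means, x):
    """Per-feature shift in log-odds relative to the average input."""
    return {name: coefs[name] * (x[name] - means[name]) for name in coefs}


# Hypothetical model parameters and feature averages
coefs = {"FieldType_Synthetic": 0.8, "PlayType_Rush": 0.3}
means = {"FieldType_Synthetic": 0.42, "PlayType_Rush": 0.35}
intercept = -4.0

# "Player A": natural field, rush play
player_a = {"FieldType_Synthetic": 0, "PlayType_Rush": 1}

contrib_a = contributions(coefs, means, player_a)
base_logit = intercept + sum(coefs[n] * means[n] for n in coefs)
logit_a = base_logit + sum(contrib_a.values())

print(contrib_a)
print(sigmoid(base_logit), sigmoid(logit_a))
```

Here FieldType_Synthetic = 0 contributes a negative amount because it is below the average share of synthetic-field plays, while PlayType_Rush = 1 contributes a positive amount, which mirrors the Player A description.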
Hypothesis 2: XGBoost Feature Importance
The plot to the left shows the various importance values for each prediction. This is how the
feature importance values are generated. We see that for this tree-based model my_met and
os_met are very important, as well as FieldType_Synthetic and Temperature.
Hypothesis 2: XGBoost Feature Importance
Just as with the logistic regression, we can inspect the feature importance values for
individual predictions. For "Player A", Position_OLB = 1 and my_met = -0.06568 increase
the probability of this prediction while PlayType_Rush = 1 decreases it. You can contrast
this with Player B, where my_met = -0.9802 decreases the risk, alongside os_met = -0.7264.
These are a little hard to interpret directly since they are scaled.
[Figure: per-prediction feature contributions for Player A and Player B]
Conclusions
▪ Differences in player movements across field types were not statistically significant.
▪ Differences in player movements between injured and not injured players were not statistically significant, but
there appeared to be some evidence that warrants more investigation.
▪ Performance drops on Synthetic Fields by about 10%
▪ Even though only 12 of the 31 fields are Synthetic, their injury base rate is about 1.6 times
that of Natural Fields. A similar gap is seen on indoor v outdoor, as they are near proxies for
field types. Base Rates of injury:
– Natural Injury Base Rate = 1.42%
– Synthetic Injury Base Rate = 2.33%
– Indoor Injury Base Rate = 2.43%
– Outdoor Injury Base Rate = 1.62%
▪ Important features for predicting injuries (XGBoost Accuracy = 95%):
– my_met, representing route curviness
– FieldType_Synthetic
– Temperature
– os_met, representing extreme orientation and speed changes
Proposed Rule/Policy Changes
Since injury risk appears to come from the combination of player movements and field
type, there may be some rule or regulation changes that the NFL can act on.
1. Teams that want to convert to or maintain Synthetic fields must give
justifiable reasons as to why. These reasons must reasonably outweigh the
negative performance and risk of injury to players.
2. Sponsors of cleats should provide proof that their cleats do not negatively
impact player performance or increase risk of injury. This could be for a
single cleat that works for all field types or for two separate cleats that
perform best on their respective field types. All evidence should be research-based
and subject to NFL approval, which can be withdrawn if any signs emerge that
contradict these conditions.
3. Players can be educated about the risks of types of movements on specific
field types to help prevent more risky movement patterns.
Formulas dax para power bI de microsoft.pdf
 
一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理
 
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
 
What is Insertion Sort. Its basic information
What is Insertion Sort. Its basic informationWhat is Insertion Sort. Its basic information
What is Insertion Sort. Its basic information
 
Atlantic Grupa Case Study (Mintec Data AI)
Atlantic Grupa Case Study (Mintec Data AI)Atlantic Grupa Case Study (Mintec Data AI)
Atlantic Grupa Case Study (Mintec Data AI)
 

Nfl injury final deck

  • 1. NFL Injury Analysis: Synthetic v Natural Fields January 2, 2020 By: Elijah Hall
  • 2. Contents ▪ Problem Statement ▪ Hypothesis and Results ▪ The Data ▪ My Features ▪ Exploratory Data Analysis ▪ Hypothesis 1 ▪ Hypothesis 2 ▪ Conclusions
  • 3. Problem Statement ▪ In the NFL, 12 of the 31 stadiums have fields with synthetic turf. ▪ Studies of lower-limb injuries among football athletes have found significantly higher injury rates on synthetic turf than on natural turf (Mack et al., 2018; Loughran et al., 2019). ▪ Synthetic turf surfaces do not release cleats as readily as natural turf and may contribute to the incidence of non-contact lower-limb injuries (Kent et al., 2015). ▪ It has yet to be determined whether player movement patterns and other measures of player performance differ across playing surfaces and how these may contribute to the incidence of lower-limb injury.
  • 4. Hypothesis & Summary of Results ▪ H01: Player movement patterns are the same on natural fields as on synthetic fields. – There is no strong evidence to reject this hypothesis. Players appear to move roughly the same regardless of field type. ▪ H02: Player movement patterns are the same between injured and non-injured players. – There is strong evidence to reject this hypothesis. Players who are injured move differently from those who are not, which seems to suggest an increased risk of injury. ▪ Modeling Questions: – Are player movement patterns significant in predicting risk of injury? ▪ The derived movement metric is significant in predicting injury in players. – Is field type significant in predicting risk of injury? ▪ FieldType_Synthetic is a significant predictor of increased injury risk. It is a combination of several features that increases the overall risk of injury, the leading ones of interest being movement patterns and field type.
  • 5. The Data ▪ Injury Record: The injury record file contains information on 105 lower-limb injuries that occurred during regular-season games over the two seasons. There were 5 cases of multiple injuries. ▪ Play List: The play list file contains the details for the 267,005 player-plays that make up the dataset. Details about the game and play include the player's assigned roster position, stadium type, field type, weather, play type, position for the play, and position group. ▪ PlayerTrack Data: Player-level data that describes the location, orientation, speed, and direction of each player during a play, recorded at 10 Hz (i.e. 10 observations per second).
  • 6. My Features ▪ my_met: The log-ratio of the total distance traveled to the straight-line distance between the start and end of the player's tracked route. ▪ diff_os: The change in orientation between consecutive observations, multiplied by the velocity. ▪ os_met: The number of times a route's diff_os value falls outside the central 95% interval of the comparison sample.
  • 7. My Features: my_met The log-ratio of the total distance traveled to the straight-line distance between the start and end of the player's tracked route. This tells you, more or less, how straight the player's path was. A large value indicates a very zig-zaggy route, like the one pictured on the slide, and a very low value indicates a fairly straight route. $\text{my\_met} = \log\left(\dfrac{\operatorname{dist}(l)}{\operatorname{euc}(X_0Y_0,\, X_tY_t)}\right)$, where $X_0Y_0$ is the path start point, $X_tY_t$ is the path end point, $l$ is the path traveled, $\operatorname{dist}()$ is the total distance traveled, and $\operatorname{euc}()$ is the Euclidean distance between two points.
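A minimal Python sketch of the my_met calculation described on this slide, assuming a route is an ordered list of (x, y) positions. The function name, the input layout, and the choice to return None for degenerate routes are illustrative assumptions, not the author's code.

```python
import math


def my_met(path):
    """Log-ratio of total distance traveled to straight-line start-to-end distance.

    `path` is an ordered list of (x, y) points (assumed layout).
    Returns None when the ratio is undefined: fewer than two points,
    or a route that ends exactly where it started.
    """
    if len(path) < 2:
        return None
    # dist(l): sum of the segment lengths along the route
    traveled = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    # euc(X0Y0, XtYt): straight-line distance from start to end
    straight = math.dist(path[0], path[-1])
    if straight == 0:
        return None
    return math.log(traveled / straight)


# Made-up routes for illustration only
straight_route = [(0, 0), (1, 0), (2, 0), (3, 0)]
zigzag_route = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0)]

print("straight:", my_met(straight_route))
print("zig-zag: ", my_met(zigzag_route))
```

A perfectly straight route gives a ratio of 1 and therefore a value of 0; the more a route doubles back on itself, the larger the value.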
  • 8. My Features: diff_os The change in orientation between consecutive observations, multiplied by the velocity. My calculated velocity was found to differ from the recorded speed metric. I assumed that the X, Y coordinates were more likely to be accurate than the speed and therefore chose to use my own calculated velocity. Additionally, there were cases where zeros could be introduced, such as no recorded movement, which caused errors, so a small noise term was added. The noise term is a uniform random variable on [-1, 1] multiplied by 0.01 to ensure it is small enough. A positive value indicates a left turn and a negative value a right turn. $\text{diff\_os}_t = \big(\operatorname{diff}(o_{t-1},\, o_t) + \eta\big) \cdot v_t$, where $o_t$ is the orientation at time $t$, $\eta$ is the small noise term (scale 0.01) added to ensure no errors are caused by zeros, and $v_t$ is the velocity at time $t$.
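The following Python sketch illustrates one way the diff_os series could be computed from tracking data. It rests on several assumptions that the slide does not confirm: orientation is in degrees and its change is wrapped into [-180, 180), velocity is the distance between consecutive X, Y positions times the 10 Hz sampling rate, and the noise is added to the orientation change before multiplying. All names are hypothetical.

```python
import math
import random

HZ = 10  # sampling rate stated in the deck: 10 observations per second


def angle_diff(prev_deg, curr_deg):
    """Signed orientation change in degrees, wrapped into [-180, 180) (assumed)."""
    return (curr_deg - prev_deg + 180.0) % 360.0 - 180.0


def diff_os(xs, ys, orientations, noise_scale=0.01, rng=None):
    """Orientation change (plus small noise) times coordinate-derived velocity.

    Returns one value per consecutive pair of observations.
    """
    rng = rng or random.Random(0)  # seeded so the sketch is reproducible
    values = []
    for t in range(1, len(xs)):
        # velocity from positions rather than from the recorded speed column
        step = math.dist((xs[t - 1], ys[t - 1]), (xs[t], ys[t]))
        velocity = step * HZ
        # uniform noise on [-1, 1], scaled down to stay small
        eta = rng.uniform(-1.0, 1.0) * noise_scale
        change = angle_diff(orientations[t - 1], orientations[t])
        values.append((change + eta) * velocity)
    return values


# Made-up track for illustration only
xs = [0.0, 0.5, 1.0, 1.2]
ys = [0.0, 0.0, 0.2, 0.7]
orientations = [90.0, 85.0, 60.0, 20.0]
print(diff_os(xs, ys, orientations))
```

Whether a positive value corresponds to a left turn depends on the orientation convention of the tracking data, which is not checked here.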
  • 9. My Features: os_met This metric counts the number of times a route's diff_os value falls outside the central 95% interval. The thresholds are taken from the diff_os distribution of the full comparison sample, at the 2.5th and 97.5th percentiles. $\text{os\_met} = \sum_{i=0}^{t} \begin{cases} 1 & \text{if } \text{diff\_os}_i > 32.22 \\ 1 & \text{if } \text{diff\_os}_i < -29.88 \\ 0 & \text{otherwise} \end{cases}$
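As a rough illustration, the count can be sketched in Python as below. The default thresholds are the two values quoted on the slide; the `percentile` helper uses linear interpolation, which is an assumption about how the 2.5th and 97.5th percentiles were obtained. Function names are hypothetical.

```python
import math


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of an already sorted list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def thresholds_from_sample(all_diff_os):
    """2.5th and 97.5th percentiles of the pooled comparison sample."""
    ordered = sorted(all_diff_os)
    return percentile(ordered, 2.5), percentile(ordered, 97.5)


def os_met(route_diff_os, lower=-29.88, upper=32.22):
    """Number of diff_os values on one route that fall outside [lower, upper]."""
    return sum(1 for v in route_diff_os if v > upper or v < lower)


# Made-up diff_os values for one route, for illustration only
route = [0.0, 40.0, -35.0, 32.22, -29.88, 100.0]
print(os_met(route))
```

Values exactly equal to a threshold are not counted, matching the strict inequalities in the formula.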
  • 10. Exploratory Analysis: Playtime by Last Play Injured players play less per game regardless of field type. Permutation difference-of-means two-tailed test, Synthetic v Natural last play: p-value = 0.0882 (fail to reject H0). Permutation difference-of-means two-tailed test, injured v not injured last play: p-value = 0.0 (reject H0).
  • 11. Exploratory Analysis: Play Type
      Injured plays by PlayType, Natural v Synthetic:
      PlayType               natural synthetic natural % synthetic %  Total
      Kickoff                      4         4      0.50        0.50      8
      Pass                        20        27      0.43        0.57     47
      Punt                         6         3      0.67        0.33      9
      Punt Not Returned            1         1      0.50        0.50      2
      Punt Returned                2         2      0.50        0.50      4
      Rush                        14        15      0.48        0.52     29
      All plays by PlayType, Natural v Synthetic:
      PlayType               natural synthetic natural % synthetic %  Total
      Extra Point               3400      2506      0.58        0.42   5906
      Field Goal                2891      2024      0.59        0.41   4915
      Kickoff                   3222      2532      0.56        0.44   5754
      Kickoff Not Returned      2414      2211      0.52        0.48   4625
      Kickoff Returned          1540      1233      0.56        0.44   2773
      Pass                     82443     55636      0.60        0.40 138079
      Punt                      3351      2395      0.58        0.42   5746
      Punt Not Returned         2011      1475      0.58        0.42   3486
      Punt Returned             1429      1040      0.58        0.42   2469
      Rush                     53797     38809      0.58        0.42  92606
      unk                        404       242      0.63        0.37    646
      There are limited observations on injuries, but for the two play types with more than 10 observations, Pass and Rush, we can reasonably compare the proportions. Pass plays split nearly 60:40 between Natural and Synthetic fields across all plays, but among injuries the split is roughly reversed, at about 40:60. This suggests Synthetic fields have higher rates of injury on Pass plays. There is a similar pattern on Rush plays, with about 60:40 across all plays and about 50:50 among injuries.
  • 12. Exploratory Analysis: Play Type There are not enough observations of injured players across the various weather categories to draw any confident insights. The only real takeaway is that most injuries happened on cloudy or sunny days.
      All plays by weather, Natural v Synthetic:
      Weather    natural synthetic natural % synthetic %  Total
      clear        25734      9749      0.73        0.27  35483
      cloudy       67459     46093      0.59        0.41 113552
      indoor        6352     28438      0.18        0.82  34790
      rain          8284      4364      0.65        0.35  12648
      sunny        43815     18546      0.70        0.30  62361
      unk           4312       869      0.83        0.17   5181
      snow           946       383      0.71        0.29   1329
      Injured plays by weather, Natural v Synthetic:
      Weather    natural synthetic natural % synthetic %  Total
      clear            9         7      0.56        0.44     16
      cloudy          17        22      0.44        0.56     39
      indoor           3        14      0.18        0.82     17
      rain             4         1      0.80        0.20      5
      sunny           11        10      0.52        0.48     21
      unk              3         1      0.75        0.25      4
      snow             0         0      0.00        0.00      0
  • 13. Exploratory Analysis: Play Type [Figure: injuries by body part (Ankle, Knee, Foot, Toes, Heel), with axis labels Mild to Severe and Unlikely to Likely]
      Body part  Count  Mode  Mean
      Ankle         42     0  2.07
      Foot           7     6  5.43
      Heel           1     1  1.00
      Knee          48     1  2.31
      Toes           7     1  1.14
  • 14. Exploratory Analysis: Base Rates of Injury ▪ Natural Injury Base Rate = 1.42% ▪ Synthetic Injury Base Rate = 2.33% ▪ Indoor Injury Base Rate = 2.43% ▪ Outdoor Injury Base Rate = 1.62% Even though Synthetic fields represent only about 1/3 of the fields, their injury base rate is roughly 1.6 times that of Natural fields. The same pattern is seen for indoor v outdoor, as these are near proxies for field type.
      Field      All Plays  Injured Plays  Base Rate
      Natural         3311             47      1.42%
      Synthetic       2401             56      2.33%
      Indoor          1319             32      2.43%
      Outdoor         4393             71      1.62%
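The base rates on this slide follow directly from the counts in its table. A short Python check, using only the figures shown above (the variable names are illustrative):

```python
# (all plays, injured plays) per category, copied from the slide's table
counts = {
    "Natural": (3311, 47),
    "Synthetic": (2401, 56),
    "Indoor": (1319, 32),
    "Outdoor": (4393, 71),
}

# base rate = injured plays / all plays
base_rates = {name: injured / total for name, (total, injured) in counts.items()}

for name, rate in base_rates.items():
    print(f"{name:<10} {rate:.2%}")

# relative rates between the paired categories
synthetic_vs_natural = base_rates["Synthetic"] / base_rates["Natural"]
indoor_vs_outdoor = base_rates["Indoor"] / base_rates["Outdoor"]
print(f"Synthetic / Natural: {synthetic_vs_natural:.2f}")
print(f"Indoor / Outdoor:    {indoor_vs_outdoor:.2f}")
```

The ratio of the Synthetic to the Natural base rate comes out at about 1.64.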
  • 15. Exploratory Analysis: Speed v Velocity There is some difference between the recorded speed and the velocity I calculated from the X, Y coordinates. I chose my velocity metric because the recorded speed appears to be smoothed in some way.
  • 16. Exploratory Analysis: Velocity There is a significant difference in velocity between Synthetic and Natural field types. Mean velocity is 2.02 on Synthetic and 2.25 on Natural, a drop of approximately 10.2% on Synthetic. A Student's t-test gives a p-value of 0.000, which is statistically significant at a significance level of 0.05. Therefore we reject the null hypothesis that velocity on Synthetic and Natural fields is sampled from the same parent distribution. This gets to the question about performance. Without points or other metrics, velocity is the best available measure of performance. Performance drops on Synthetic fields.
  • 17. Hypothesis 1: Field traveled To the left is the heatmap of the most frequently traveled areas of the field. It appears to have two oval-like shapes, which occur naturally because of the way the game is played: most drives start around the 20-yard line and end with decreasing probability with every yard as they approach the end zone. To the right you can see the same heatmap for injury paths. Since there are only 105, the heatmap is more sparse, but we can see that the paths are more spread out and do not seem to follow the same pattern as above.
  • 18. Hypothesis 1: Field traveled Above you can see the path run by a player on the play where he was injured. It is not obvious where he was injured, but you can see the curviness of his route, and the plot of his diff_os values to the right shows that he accelerates and decelerates.
  • 19. Hypothesis 1: All diff_os Looking at all diff_os paths, we see some obvious differences, such as larger and more frequent spikes for injured players.
  • 20. Hypothesis 1: Hypothesis testing for os_met and my_met H0: os_met values for injured player track routes on Synthetic fields are drawn from the same population as those on Natural fields. Statistic = -1.073, p-value = 0.286 (fail to reject H0). H0: my_met values for injured player track routes on Synthetic fields are drawn from the same population as those on Natural fields. Statistic = -0.404, p-value = 0.687 (fail to reject H0). Movement patterns are not significantly different between field types.
  • 21. Hypothesis 1: Hypothesis testing for os_met and my_met H0: os_met values for injured player track routes are drawn from the same population as those for non-injured players. Statistic = 1.621, p-value = 0.107 (fail to reject H0). H0: my_met values for injured player track routes are drawn from the same population as those for non-injured players. Statistic = 1.742, p-value = 0.083 (fail to reject H0). Even though there is not significant evidence to reject the null hypothesis, there is some evidence of a difference, which might need larger samples to confirm. I also would have expected the opposite of the distribution on the right. my_met measures the log-ratio of distance traveled, and this suggests that injured players have less curvy routes, which might mean sharper turns.
  • 22. Hypothesis 2: Modeling for Feature Importance Models: 1. Logistic Regression • Accuracy = 0.6747 • AUC = 0.67 2. XGBoostClassifier • Accuracy = 0.95
  • 23. Hypothesis 2: Logistic Regression Feature Importance FieldType_Synthetic is the most important feature for a linear model. By contrast, my engineered features appear to be less important.
  • 24. Hypothesis 2: Logistic Regression Feature Importance Each prediction has different inputs, so the features vary in importance from one prediction to the next. To demonstrate this, for "Player A" FieldType_Synthetic = 0 reduces the probability of this prediction while PlayType_Rush = 1 increases it. You can contrast this with Player B. [Figures: Player A, Player B]
  • 25. Hypothesis 2: XGBoost Feature Importance The plot to the left shows the importance values for each individual prediction; the overall feature importance values are generated from these. For this tree-based model, my_met and os_met are very important, as are FieldType_Synthetic and Temperature.
  • 26. Hypothesis 2: XGBoost Feature Importance Just as with the logistic regression, we can inspect the feature importance values for individual predictions. For "Player A", Position_OLB = 1 and my_met = -0.06568 increase the probability of this prediction while PlayType_Rush = 1 decreases it. You can contrast this with Player B, where my_met = -0.9802 decreases the risk, with os_met = -0.7264. These values are a little hard to interpret directly since they are scaled. [Figures: Player A, Player B]
  • 27. Conclusions ▪ Differences in player movement across field types were not statistically significant. ▪ Differences in player movement between injured and non-injured players were not statistically significant, but there was some evidence that warrants more investigation. ▪ Performance drops on Synthetic fields by about 10%. ▪ Even though Synthetic fields represent only about 1/3 of the fields, their injury base rate is roughly 1.6 times that of Natural fields. The same pattern is seen for indoor v outdoor, as these are near proxies for field type. Base rates of injury: – Natural Injury Base Rate = 1.42% – Synthetic Injury Base Rate = 2.33% – Indoor Injury Base Rate = 2.43% – Outdoor Injury Base Rate = 1.62% ▪ Important features for predicting injuries (XGBoost Accuracy = 95%): – my_met, representing route curviness – FieldType_Synthetic – Temperature – os_met, representing extreme orientation and speed changes
  • 28. Proposed Rule/Policy Changes Since injury risk appears to come from the combination of player movement and field type, there may be rule or regulation changes that the NFL can act on. 1. Teams that want to convert to or maintain Synthetic fields must give justifiable reasons why. These reasons must reasonably outweigh the negative effect on performance and the risk of injury to players. 2. Sponsors of cleats should provide proof that their cleats do not negatively impact player performance or increase the risk of injury. This could be for a single cleat that works on all field types or for two separate cleats that each perform best on their respective field type. All evidence should be research-based and subject to NFL approval, which can be withdrawn if any evidence emerges that contradicts these conditions. 3. Players can be educated about the risks of certain types of movement on specific field types to help prevent riskier movement patterns.

Editor's Notes

  1. The median last play before injury is 23 and the median last play of all other players in all other games is 49. Therefore, we can say that those prone to injury are more likely to be injured in the first half, or roughly the first 20 plays, of their game. When we perform a permutation test on the difference of means, the empirical difference has a p-value of around 0.0, which is below a significance level of 0.05. There **is significant evidence** to reject the null hypothesis that the injured and not injured last plays are from the same parent distribution. The histograms for Synthetic and Natural appear similar enough to assume the risk of injury relative to play number is similar. When we perform a permutation test on the difference of means, the empirical difference has a p-value of around 0.089, which is above a significance level of 0.05. Additionally, the distribution suggests that a survival analysis would be worth investigating. Synthetic median last play = 27.0 and Natural median last play = 18.5, meaning those more likely to be injured get injured earlier on Natural than on Synthetic. However, as stated before, there **is not significant evidence** to reject the null hypothesis that the Synthetic and Natural last plays are from the same parent distribution.
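For readers unfamiliar with the permutation test on a difference of means mentioned in this note, here is a minimal two-tailed sketch in Python. It is a generic illustration, not the author's implementation: the function name, the number of permutations, the seed, and the sample values are all assumptions.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test_diff_means(a, b, n_permutations=5000, seed=0):
    """Two-tailed permutation test for a difference of means.

    Returns (observed difference, p-value). The p-value is the share of random
    relabelings whose absolute difference of means is at least as large as the
    observed one.
    """
    rng = random.Random(seed)
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    n_a = len(a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random reassignment of group labels
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_permutations


# Made-up "last play" counts, for illustration only (not the deck's data)
injured_last_play = [20, 22, 23, 25, 21, 24, 19, 23]
other_last_play = [48, 50, 49, 47, 52, 51, 46, 49]

observed, p_value = permutation_test_diff_means(injured_last_play, other_last_play)
print("observed difference of means:", observed)
print("p-value:", p_value)
```

A reported p-value of 0.0 from such a procedure means that none of the sampled permutations produced a difference as large as the observed one, so the true p-value is below the resolution of the number of permutations used.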