Data Analysis Project for the
Olympic Dataset
4th April 2022
Kosuke Konno
Introduction
Overview
• Dataset: Olympic dataset covering 120 years
• The dataset contains 271,116 samples in total.
• Each sample includes age, height, weight, year, sport, country, medal, and so on.
• Client: SportsStats, a sports analysis firm
• Our analysis may give the client interesting findings about Olympic medalists and
athletes. These findings should show how an athletic organization in each country can
improve the performance of its members in order to achieve good results at the Olympics.
Questions
1. What results has each country achieved?
   1. Number of athletes from each country
   2. Number of medalists from each country
   3. Ratio of medalists to athletes in each country
2. How have the Olympics changed over time?
   1. Number of athletes in each Olympic Games
   2. Ratio of female athletes to male athletes in each Games
   3. Number of sports in each Games
3. To what extent do physical characteristics influence each sport?
   1. Age
   2. Height
   3. Weight
Initial Hypothesis
1. What results has each country achieved?
   1. I expect the United States or China to have the largest numbers of both athletes and medalists.
   2. The ratio of medalists to athletes should not differ greatly between countries, although it
      could be close to zero for a number of countries.
2. How have the Olympics changed over time?
   1. Number of athletes in each Olympic Games: This should have increased since the first Games, mainly
      because of population growth.
   2. Ratio of female athletes to male athletes in each Games: This should also have increased.
   3. Number of sports in each Games: This might not have changed drastically, because some sports have
      been dropped while others have been newly adopted.
3. To what extent do physical characteristics influence each sport?
   1. Age: I do not expect any particular age to have an impact on performance.
   2. Height: This should positively affect outcomes in some sports, such as basketball.
   3. Weight: I believe competitions are separated by weight class in most cases, so we might not
      find anything interesting here.
Approach
• All the questions can be answered with simple SQL queries using aggregate functions.
• COUNT will be used for questions 1 and 2.
• AVG and VAR/STDEV will be used for question 3. If the variance or standard deviation of a
characteristic within a certain sport is small compared with other sports, it may be possible
to argue that the characteristic affects performance in that sport.
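The approach above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table name athlete_events, its column names, and the five sample rows are hypothetical and are not taken from the slides. SQLite's core function set has no STDEV or VAR aggregate, so the sketch derives the population variance from two AVG calls; other databases may offer a built-in function instead.

```python
import sqlite3

# Hypothetical miniature of the main table (the real dataset has 271,116 rows).
# Columns: athlete_id, age, country, sport, medal (None means no medal).
rows = [
    (1, 24, "USA", "Swimming", "Gold"),
    (1, 28, "USA", "Swimming", None),
    (2, 22, "USA", "Swimming", "Silver"),
    (3, 30, "JPN", "Archery", None),
    (4, 45, "JPN", "Archery", "Bronze"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE athlete_events "
    "(athlete_id INTEGER, age REAL, country TEXT, sport TEXT, medal TEXT)"
)
conn.executemany("INSERT INTO athlete_events VALUES (?, ?, ?, ?, ?)", rows)

# Questions 1 and 2 style: COUNT distinct athletes and medalists per country.
per_country = conn.execute(
    """
    SELECT country,
           COUNT(DISTINCT athlete_id) AS athletes,
           COUNT(DISTINCT CASE WHEN medal IS NOT NULL THEN athlete_id END) AS medalists
    FROM athlete_events
    GROUP BY country
    ORDER BY country
    """
).fetchall()

# Question 3 style: mean and population variance of age per sport.
# Variance is computed as E[x^2] - (E[x])^2 because SQLite has no VAR/STDEV.
per_sport = conn.execute(
    """
    SELECT sport,
           AVG(age) AS mean_age,
           AVG(age * age) - AVG(age) * AVG(age) AS var_age
    FROM athlete_events
    GROUP BY sport
    ORDER BY sport
    """
).fetchall()

for row in per_country:
    print(row)
for row in per_sport:
    print(row)
```

Counting distinct athlete IDs rather than rows matters because one athlete can appear in several events and several Games.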
Initial Analysis
Descriptive Stats
• Below are basic information about the main table and descriptive statistics for the
columns with numerical values.
-> These are what we should look at first to get an overview of the data we chose.
Numbers by country
• The following are the top 20 countries in terms of medals and athletes, together with
descriptive statistics for those categories.
-> The results for each country are the most important aspect of the Olympics, and they relate
directly to the hypotheses I set up.
[Panels: Top 20 countries; Descriptive stats]
Initial Findings
1. As for age, the mean is 25.6 years and the standard deviation is 6.4 years. It therefore
seems reasonable to argue that athletes are most likely to perform at their best in their
20s.
2. Regarding year, the 25th, 50th, and 75th percentiles are 1960, 1988, and 2002, even though
the data covers 120 years from 1896. The samples are therefore concentrated in the more recent Games.
3. The US is by far the strongest in all categories, with Russia and Germany completing the top
three. Moreover, since several countries from colder regions such as Russia, Norway, and
Sweden rank highly, different results are expected for the Summer and Winter
Olympics.
4. In terms of distribution, the standard deviations for silver and bronze medals are
nearly identical, while that for gold medals differs significantly. In addition, more than half
of the countries won no gold medals and no more than one silver and one bronze medal.
What we learned about the hypotheses
1. What results has each country achieved?
   1. The US is clearly the strongest country in the world, but China is not as dominant as I expected. It is likely that
      China has achieved notable results only recently, for political or economic reasons.
   2. Ratio of medalists to athletes: At first glance, the number of medals appears almost
      proportional to the number of athletes. However, the ratios for the US and Russia are clearly higher than
      those of other countries. This ratio should be calculated directly and examined further.
2. How have the Olympics changed over time?
   1. Number of athletes in each Olympic Games: Based on the percentiles, this figure has increased as I expected.
   2. Ratio of female athletes to male athletes in each Games: Further calculation is needed.
   3. Number of sports in each Games: Further calculation is needed.
3. To what extent do physical characteristics influence each sport?
   1. Age: It is still unclear whether a specific age affects performance, but we can at least say
      that athletes are most likely to perform at their best in their 20s.
   2. Height: Further calculation is needed.
   3. Weight: Further calculation is needed.
Further Analysis
Correlation between Athletes and Medals
• The following table shows the correlation coefficients between the number of athletes and
the number of each type of medal.
• We tend to focus on medal counts when measuring each country's outcome,
but these results are largely determined before the Olympic Games start, because medal counts
clearly correlate with how many athletes each country can send to the Games.
• Interestingly, the coefficient for gold medals is smaller than the others, so winning a gold medal may
require something beyond what the other medals require.
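The coefficients in such a table are presumably Pearson correlations, although the slides do not say which coefficient was used. A minimal sketch of that calculation follows; the per-country totals are made up for illustration and are not the values from the presentation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up per-country totals (one entry per country), for illustration only.
athletes = [100, 200, 300, 400]
gold = [1, 5, 4, 12]
silver = [2, 4, 6, 8]

r_gold = pearson(athletes, gold)
r_silver = pearson(athletes, silver)
print(round(r_gold, 3), round(r_silver, 3))
```

In this toy data the silver counts are exactly proportional to the athlete counts, so their coefficient is 1, while the less regular gold counts give a smaller value.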
Ratio of medalists to number of athletes
• As a new metric, I calculated the ratio of medalists to athletes for each country, excluding
countries with no medals, so that we can compare the level of athletes across countries.
• The maximum values are 3 to 5 times as high as the mean values.
-> We can conclude that Olympic athletes from particular countries are more likely to win
medals than those from other countries.
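The metric described above can be sketched as follows. The country labels and counts are hypothetical; only the steps (drop countries without medals, divide medalists by athletes, compare the maximum with the mean) follow the slide.

```python
from statistics import mean

# Hypothetical (athletes, medalists) counts per country, for illustration only.
counts = {
    "A": (200, 50),
    "B": (100, 5),
    "C": (50, 5),
    "D": (40, 0),  # no medals, so excluded from the metric
}

# Ratio of medalists to athletes, excluding countries without medals.
ratios = {
    country: medalists / athletes
    for country, (athletes, medalists) in counts.items()
    if medalists > 0
}

mean_ratio = mean(list(ratios.values()))
max_country = max(ratios, key=ratios.get)
max_over_mean = ratios[max_country] / mean_ratio

print(ratios)
print(max_country, round(max_over_mean, 3))
```

Excluding zero-medal countries raises the mean, so the max-to-mean multiple reported this way is smaller than it would be if all countries were included.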
Time Series - Summer
• The number of athletes increased rapidly until the 1980s and has remained flat since then.
• The number of sports has also increased, but not as fast as the number of athletes.
• It is possible that the Summer Olympics will not grow any further. One
reason might be the physical limits on setting up venues.
[Figures: Number of athletes; Number of sports]
Time Series - Winter
• Although the Winter Games are less than half the size of the Summer Games, they are still growing.
• The number of sports has remained almost unchanged from the beginning.
• The content might not change drastically, but the size is likely to keep expanding.
[Figures: Number of athletes; Number of sports]
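The two time series above (athletes and sports per Games, split by season) could be derived along these lines. The record layout and the sample rows are assumptions made for this sketch, not the author's actual query.

```python
from collections import defaultdict

# Hypothetical event rows: (year, season, athlete_id, sport).
# One athlete can appear in several rows for the same Games.
records = [
    (1996, "Summer", 1, "Swimming"),
    (1996, "Summer", 1, "Swimming"),
    (1996, "Summer", 2, "Archery"),
    (2000, "Summer", 1, "Swimming"),
    (2000, "Summer", 3, "Swimming"),
    (2000, "Summer", 4, "Archery"),
    (1998, "Winter", 5, "Biathlon"),
    (1998, "Winter", 6, "Biathlon"),
]

athletes = defaultdict(set)
sports = defaultdict(set)
for year, season, athlete_id, sport in records:
    athletes[(season, year)].add(athlete_id)  # sets remove duplicate rows
    sports[(season, year)].add(sport)

# (season, year) -> (number of distinct athletes, number of distinct sports)
summary = {key: (len(athletes[key]), len(sports[key])) for key in sorted(athletes)}

for (season, year), (n_athletes, n_sports) in summary.items():
    print(season, year, n_athletes, n_sports)
```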
Time Series - Gender
• The figures show the ratio of female athletes to male athletes in each Games (%).
• There appear to have been turning points in the 1930s and 1990s, when greater participation
by women was promoted.
• It may be possible to argue that women have more opportunity in the Winter Games.
[Figures: Summer; Winter]
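The female-to-male ratio per Games can be sketched as below. The rows are invented for illustration, and the sketch assumes the ratio is taken over distinct athletes rather than over event entries, which the slides do not specify.

```python
from collections import defaultdict

# Hypothetical rows: (year, athlete_id, sex), for illustration only.
records = [
    (1936, 1, "M"), (1936, 2, "M"), (1936, 3, "M"), (1936, 4, "M"), (1936, 5, "F"),
    (1996, 6, "M"), (1996, 7, "M"), (1996, 8, "F"), (1996, 8, "F"),
]

by_game = defaultdict(lambda: {"F": set(), "M": set()})
for year, athlete_id, sex in records:
    by_game[year][sex].add(athlete_id)

# Ratio of female to male athletes in percent; Games with no men are skipped
# to avoid dividing by zero.
ratio_pct = {
    year: 100 * len(groups["F"]) / len(groups["M"])
    for year, groups in sorted(by_game.items())
    if groups["M"]
}

print(ratio_pct)
```

A ratio of 100% would mean equal numbers of women and men; this differs from the female share of all athletes, which would be 50% in that case.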
Standard Deviation - Age
[Panels: Male; Female]
• Even excluding the old sports, athletes across a broader range of ages can play an active role in
some sports, such as Archery, Golf, and Shooting.
• On the other hand, athletes in Football, Boxing, Swimming, and so on fall within
a narrow range of ages.
Standard Deviation - Height
[Panels: Male; Female]
• Apart from Basketball, sports that are divided into classes tend to have high values.
• As is generally accepted, certain heights seem to be an advantage in gymnastics
competitions.
Standard Deviation - Weight
[Panels: Male; Female]
• Apart from sports that are divided into classes, the sports that do not require much
movement allow a wider range of weights.
• In addition to Gymnastics, a particular range of weights is an advantage in some winter
sports.
Conclusion on Hypothesis
1. What results has each country achieved?
   1. The US is clearly the strongest country in the world, but China is not as dominant as I expected. It is likely that
      China has achieved notable results only recently, for political or economic reasons.
   2. Ratio of medalists to athletes: The number of medals is almost proportional to the number of
      athletes. However, the ratios of some countries are 3 to 5 times higher than the average.
2. How have the Olympics changed over time?
   1. Number of athletes in each Olympic Games: Growth has stopped for the Summer Games, but the number keeps
      increasing for the Winter Games.
   2. Ratio of female athletes to male athletes in each Games: The ratio is still rising. There were also turning
      points in the 1930s and 1990s, when greater participation by women was promoted.
   3. Number of sports in each Games: For the Summer Games, the number of sports has increased, although not as fast
      as the number of athletes. For the Winter Games, it has remained almost unchanged from the beginning.
3. To what extent do physical characteristics influence each sport?
   1-3. Age, Height, Weight: For all three characteristics, some sports allow a wide range of
      people to participate, while other sports do not.
Extra Analysis
• The table below shows the correlation coefficients between the number of athletes, GDP per
capita, and population of each country.
• Since the coefficient between athletes and GDP per capita is higher than that between athletes
and population, it should be possible to argue that GDP per capita is the more important factor.
• In other words, the amount of resources a country can spare is likely to be more
important than the size of its population.
Summary
• If a country wants to achieve good results at the Olympic Games, you can
advise it to:
1. Focus on how many of its athletes can be selected, although special efforts
may be necessary to win gold medals.
2. Train more female athletes.
3. Recognize that committing more resources will directly contribute to the results.
4. Concentrate its resources on athletes whose characteristics are
suited to their sports.
