FOOTBALL NETWORK ANALYSIS
WITH GEPHI
TO DETERMINE A TEAM'S STRATEGY.
GROUP 2: ROBERT FERRO, SEAN JAMES, YOGESH
SHINDE, PRATIK DOSHI, MINGYANG CHEN AND
MICHAEL ABAHO
FORMALITIES
• Explain basic concepts in a readable form:
• Vertex – a player
• Edge – a passing relationship between two players.
• Objective: Analyse two opposing teams to see what tactics were used.
• Arsenal vs West Ham: 0-2
WHAT HAS BEEN UNDERTAKEN
• Chose between 3 ideas:
• Football domain – most useful with real world application
• Coffee & sandwich habits – JCR, coffee shop
• Library habits
• Met at regular intervals; coordinated via social media and Dropbox
• Initial research to find sites/resources of interest
WHY FOOTBALL AS A SUITABLE DATASET
• Other useful applications for this data
• How was data collected?
• Why is football domain a good choice? - Different tactics
• Defensive play: higher in-degree than out-degree.
• Good team cohesion: triangles
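The "triangles" idea can be made concrete by counting closed passing triangles. The following is a minimal sketch, assuming pass direction is ignored and using a hypothetical pass list (the player labels A to D are not taken from the match data):

```python
from itertools import combinations


def count_triangles(passes):
    """Count triangles in the undirected projection of a directed pass list."""
    neighbours = {}
    for passer, receiver in passes:
        if passer == receiver:
            continue  # ignore self-loops
        neighbours.setdefault(passer, set()).add(receiver)
        neighbours.setdefault(receiver, set()).add(passer)

    triangles = 0
    # Check every unordered triple of players once.
    for a, b, c in combinations(sorted(neighbours), 3):
        if b in neighbours[a] and c in neighbours[a] and c in neighbours[b]:
            triangles += 1
    return triangles


# Hypothetical passes for illustration only.
example_passes = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(count_triangles(example_passes))
```

Checking every triple is quadratic-to-cubic in the number of players, which is fine for a squad-sized graph; a larger network would call for a neighbour-intersection approach instead.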
HOW WE SPLIT THE TEAM
• Gephi construction team
• Retrieve relevant data and format it as .gml, ready for Gephi.
• 4-4-2.com
• Gephi graph
• How current is the data (year of creation)?
• Statistical analysis & visualisation team
• Presentation creation
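As an illustration of the .gml preparation step, here is a minimal sketch that writes a directed, weighted pass network as GML text. The player labels and pass counts are hypothetical, the `to_gml` helper is not part of the original project, and the use of a `weight` key for the pass count is an assumption about how the file would be read.

```python
def to_gml(players, passes):
    """Return GML text for a directed pass network.

    players: list of player names.
    passes:  dict mapping (passer, receiver) to a pass count.
    """
    ids = {name: index for index, name in enumerate(players)}
    lines = ["graph [", "  directed 1"]
    for name, index in ids.items():
        lines.append(f'  node [ id {index} label "{name}" ]')
    for (passer, receiver), count in passes.items():
        # "weight" as the attribute name is an assumption (see lead-in).
        lines.append(
            f"  edge [ source {ids[passer]} target {ids[receiver]} weight {count} ]"
        )
    lines.append("]")
    return "\n".join(lines)


# Hypothetical players and counts, for illustration only.
example_gml = to_gml(
    ["Player A", "Player B"],
    {("Player A", "Player B"): 5, ("Player B", "Player A"): 3},
)
print(example_gml)
```

The resulting text can be saved with a `.gml` extension and opened in a graph tool.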
WHAT IS GEPHI?
• Other tools available:
• Mathematica
• MATLAB
• How is it used
• Visual demonstration
WHAT WAS ACHIEVED
• Our working methodology
• Statistical analysis techniques
• Effective visualisations
• Ability to infer strategies and relationships between data.
OUR METHODOLOGY
• Define objective
• Research and collection of data
• Use Gephi to draw graphs that visualise the collected data
• Then statistical reports, visualisation and analysis
• Conclusions and evaluations
STATISTICAL ANALYSIS
Metric                           Arsenal    West Ham
Total edges                      110        96
Average degree                   8.462      13.714
Network diameter                 2          3
Network radius                   1          2
Weakly connected components      1          1
Strongly connected components    1          1
• Vertex degree chart
Vertex degree chart of Arsenal team:
Player                     Degree   In-degree   Out-degree
Theo Walcott               7        5           2
Petr Cech                  10       5           5
Alexis Sanchez             13       7           6
Olivier Giroud             16       10          6
Laurent Koscielny          17       9           8
Mathieu Debuchy            17       8           9
Francis Coquelin           18       9           9
Santiago Cazorla           19       8           11
Per Mertesacker            19       9           10
Alex Oxlade-Chamberlain    19       8           11
Mesut Ozil                 21       10          11
Nacho Monreal              21       11          10
Aaron Ramsey               23       11          12
Vertex degree chart of West Ham team:
Player                     Degree   In-degree   Out-degree
Modibo Maiga               4        3           1
Matthew Jarvis             8        5           3
Kevin Nolan                8        4           4
Diafra Sakho               10       7           3
Mauro Zarate               13       7           6
Angelo Ogbonna             14       8           6
Winston Reid               15       7           8
Aaron Cresswell            15       8           7
Reece Oxford               15       6           9
Cheikhou Kouyate           15       6           9
Adrian                     16       7           9
James Tomkins              18       7           11
Dimitri Payet              20       10          10
Mark Noble                 21       11          10
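The degree tables can be cross-checked against the summary statistics. The sketch below uses the (in-degree, out-degree) pairs from the two tables; the `summarise` helper and the two averaging conventions it reports are illustrative assumptions, not the project's own calculation. In a directed graph every edge contributes one in-degree and one out-degree, so the two column sums must each equal the edge count.

```python
# (in-degree, out-degree) per player, copied from the vertex degree tables.
arsenal = {
    "Walcott": (5, 2), "Cech": (5, 5), "Sanchez": (7, 6), "Giroud": (10, 6),
    "Koscielny": (9, 8), "Debuchy": (8, 9), "Coquelin": (9, 9),
    "Cazorla": (8, 11), "Mertesacker": (9, 10), "Oxlade-Chamberlain": (8, 11),
    "Ozil": (10, 11), "Monreal": (11, 10), "Ramsey": (11, 12),
}
west_ham = {
    "Maiga": (3, 1), "Jarvis": (5, 3), "Nolan": (4, 4), "Sakho": (7, 3),
    "Zarate": (7, 6), "Ogbonna": (8, 6), "Reid": (7, 8), "Cresswell": (8, 7),
    "Oxford": (6, 9), "Kouyate": (6, 9), "Adrian": (7, 9), "Tomkins": (7, 11),
    "Payet": (10, 10), "Noble": (11, 10),
}


def summarise(team):
    """Return node count, edge count and two common 'average degree' values."""
    total_in = sum(in_deg for in_deg, _ in team.values())
    total_out = sum(out_deg for _, out_deg in team.values())
    if total_in != total_out:
        raise ValueError("in- and out-degree sums must match in a directed graph")
    nodes, edges = len(team), total_in
    return {
        "nodes": nodes,
        "edges": edges,
        "edges_per_node": round(edges / nodes, 3),         # average in- (or out-) degree
        "mean_total_degree": round(2 * edges / nodes, 3),  # average of in + out
    }


print("Arsenal: ", summarise(arsenal))
print("West Ham:", summarise(west_ham))
```

Under these conventions the edge counts of 110 and 96 are reproduced. The reported Arsenal average of 8.462 matches edges per node, while the reported West Ham average of 13.714 matches the mean of in-degree plus out-degree, so the two averages are only comparable once the same convention is applied to both teams.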
• Degree distribution chart for Arsenal team
• Degree distribution chart for West Ham team
FEATURES OF THE GRAPHS
• Nature of the graph: directed
• In/out degrees
• Weighted
• Formation is signified by the graph: a 4-4-2 line-up visually looks like a 4-4-2 formation
STATISTICAL ANALYSIS TECHNIQUES
• Vertex degree
• Number of nodes joined to that node (popularity)
• Vertex degree
• Greater percentage for West Ham because they passed more as a team.
• Isomorphic relationships
• Connectivity
CONNECTIVITY
• In graph theory, connectivity indicates whether every node in a network can be reached from any other node.
• If the graph is strongly connected (every player reachable by following pass directions), it represents a high possession value for the team.
• If the graph is only weakly connected (connected only when pass direction is ignored), it represents a low possession value for the team.
• In our dataset, the graph analysis suggests that West Ham has a slightly greater possession value than Arsenal.
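The difference between weak and strong connectivity can be checked with a breadth-first search. This is a minimal sketch on a hypothetical three-player graph, not the project's Gephi workflow: a directed graph is strongly connected if one node can reach every node and every node can reach it, and weakly connected if it is connected once edge direction is ignored.

```python
from collections import deque


def reachable(start, adjacency):
    """Return the set of nodes reachable from start via breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def connectivity(nodes, edges):
    """Return (weakly_connected, strongly_connected) for a non-empty node set."""
    forward, backward, undirected = {}, {}, {}
    for source, target in edges:
        forward.setdefault(source, set()).add(target)
        backward.setdefault(target, set()).add(source)
        undirected.setdefault(source, set()).add(target)
        undirected.setdefault(target, set()).add(source)

    nodes = set(nodes)
    start = next(iter(nodes))
    weak = reachable(start, undirected) == nodes
    # Strong: start reaches everyone, and everyone reaches start.
    strong = reachable(start, forward) == nodes and reachable(start, backward) == nodes
    return weak, strong


# Hypothetical passes: A -> B -> C is only weakly connected;
# adding C -> A closes the loop and makes it strongly connected.
print(connectivity("ABC", [("A", "B"), ("B", "C")]))
print(connectivity("ABC", [("A", "B"), ("B", "C"), ("C", "A")]))
```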
DIFFERENT TYPES OF CENTRALITY
• Betweenness centrality
• Closeness centrality
• Radius
• Diameter
• PageRank algorithm
CENTRALITY - INDEPENDENT NODE ANALYSIS
ID   Label                     Eccentricity   Closeness centrality   Betweenness centrality   Degree centrality
9    Aaron Ramsey              1.0            1.0                    11.56                    23
8    Mesut Ozil                2.0            0.92                   8.08                     21
4    Nacho Monreal             2.0            0.86                   6.86                     21
6    Alex Oxlade-Chamberlain   2.0            0.92                   4.06                     19
2    Per Mertesacker           2.0            0.86                   3.44                     19
7    Santiago Cazorla          2.0            0.92                   2.74                     19
5    Francis Coquelin          2.0            0.8                    1.64                     18
3    Mathieu Debuchy           2.0            0.8                    2.143                    17
1    Laurent Koscielny         2.0            0.75                   2.06                     17
10   Olivier Giroud            2.0            0.66                   2.31                     16
12   Alexis Sanchez            2.0            0.66                   0.64                     13
0    Petr Cech                 2.0            0.631                  0.41                     10
11   Theo Walcott              2.0            0.54                   0.0                      7
CLOSENESS CENTRALITY
• More central player -> higher closeness centrality.
• Easy access to any other player making them key to tactics
• Why is closeness good?
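The closeness column of the Arsenal centrality table can be approximated from the out-degree column alone. The sketch below rests on two assumptions that are not stated on the slides: closeness is taken as the reciprocal of the mean shortest-path distance along outgoing passes, and every teammate is at most two passes away (consistent with the reported diameter of 2). The helper name is hypothetical.

```python
def closeness_from_out_degree(out_degree, n_players):
    """Closeness when every other player is either 1 or 2 passes away."""
    others = n_players - 1
    # Direct targets are 1 hop away; everyone else is assumed to be 2 hops away.
    total_distance = out_degree * 1 + (others - out_degree) * 2
    return others / total_distance


# Out-degrees taken from the Arsenal vertex degree table (13 players).
arsenal_out_degree = {"Ramsey": 12, "Ozil": 11, "Monreal": 10, "Koscielny": 8,
                      "Sanchez": 6, "Cech": 5, "Walcott": 2}

for player, out_degree in arsenal_out_degree.items():
    print(f"{player:10s} {closeness_from_out_degree(out_degree, 13):.3f}")
```

Under these assumptions a player who passes directly to every teammate scores 1.0, and the score falls as more teammates can only be reached through an intermediary, which is one way to read "easy access to any other player".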
BETWEENNESS CENTRALITY
• Number of times a node acts as a bridge on the shortest path between two other nodes
• Central players should ideally have high betweenness centrality
• E.g. midfielders: Ramsey, Ozil, Cazorla
• Extreme ends of network
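To show what "acting as a bridge" means numerically, here is a minimal sketch of unnormalised betweenness for an unweighted directed graph, using Brandes' accumulation scheme. It is an illustration on hypothetical nodes, not the routine Gephi runs internally, and it assumes each directed edge is listed once.

```python
from collections import deque


def betweenness(nodes, edges):
    """Unnormalised shortest-path betweenness for a directed, unweighted graph."""
    adjacency = {v: [] for v in nodes}
    for source, target in edges:
        adjacency[source].append(target)

    score = {v: 0.0 for v in nodes}
    for source in nodes:
        # Breadth-first search: count shortest paths and record predecessors.
        sigma = {v: 0 for v in nodes}
        sigma[source] = 1
        dist = {source: 0}
        preds = {v: [] for v in nodes}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # Walk back from the farthest nodes, sharing out path dependencies.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    return score


# Hypothetical chain A -> B -> C: only B sits between two other players.
print(betweenness(["A", "B", "C"], [("A", "B"), ("B", "C")]))
```

A player at an extreme end of the network, who lies on no shortest path between two teammates, scores zero, as in the chain above for A and C.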
RADIUS / DIAMETER
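• Radius: the smallest eccentricity, i.e. the number of hops the best-placed player needs to reach the player farthest from them
• Diameter: the largest eccentricity, i.e. the most hops needed between any two players along shortest paths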
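Radius and diameter both derive from eccentricity, the greatest shortest-path distance from a node to any other node. The following is a minimal sketch on a hypothetical hub-and-spoke graph, assuming an unweighted, strongly connected directed graph with unique node names:

```python
from collections import deque


def eccentricities(nodes, edges):
    """Return each node's eccentricity in a strongly connected directed graph."""
    adjacency = {v: set() for v in nodes}
    for source, target in edges:
        adjacency[source].add(target)

    result = {}
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        if len(dist) != len(adjacency):
            raise ValueError(f"{source} cannot reach every node")
        result[source] = max(dist.values())
    return result


# Hypothetical hub H exchanging passes with A, B and C.
hub_edges = [("H", "A"), ("H", "B"), ("H", "C"),
             ("A", "H"), ("B", "H"), ("C", "H")]
ecc = eccentricities(["H", "A", "B", "C"], hub_edges)
radius, diameter = min(ecc.values()), max(ecc.values())
print(ecc, radius, diameter)
```

The hub has eccentricity 1 and the spokes have eccentricity 2, giving radius 1 and diameter 2, the same pattern as the Arsenal centrality table, where one player has eccentricity 1.0 and the rest 2.0.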
PAGERANK ALGORITHM
• Adjacency matrix for Arsenal.
• Convert it to pass probabilities using a 'random surfer' model
• Summarises player importance:
• Ramsey is central to Arsenal.
PAGERANK CONT.
• Dead ends – pages with no out links
• How do we address this?
• Spider traps – groups of pages whose out-links never lead outside the group.
• Player who only gets passed the ball and is then tackled.
• Teleportation is a good compromise.
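• A simplified PageRank formula
• v′ = βMv + (1 − β)e/n (M: transition matrix, β: probability of following a link, e: all-ones vector, n: number of nodes)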
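PageRank with teleportation can be sketched as a short power iteration. This is an illustration rather than the project's implementation: the damping value β = 0.85, the iteration count, the choice to spread a dead end's rank evenly over all nodes, and the three-node example are all assumptions.

```python
def pagerank(nodes, edges, beta=0.85, iterations=100):
    """Power-iteration PageRank with teleportation on a directed graph."""
    n = len(nodes)
    out_links = {v: [] for v in nodes}
    for source, target in edges:
        out_links[source].append(target)

    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Teleportation: every node receives (1 - beta) / n.
        new_rank = {v: (1.0 - beta) / n for v in nodes}
        for v in nodes:
            targets = out_links[v]
            if targets:
                share = beta * rank[v] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dead end: spread its rank evenly so no probability leaks away.
                share = beta * rank[v] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: A and B both pass to C, and C passes back to A.
example_rank = pagerank(["A", "B", "C"], [("A", "C"), ("B", "C"), ("C", "A")])
print(example_rank)
```

In this example C, who receives from both other nodes, ranks highest, and B, who receives nothing, is left with only the teleportation share.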
VISUALISATION TECHNIQUES
• Formation-based graph (shows substitutes and players' positions in a visual way)
• A thicker edge means a more important relationship.
• Communities (defensive, midfield and offensive)
• Considered other layout techniques, but they were not that useful (11 nodes is quite easy to visualise)
VISUALISATION APPROACHES
TABLES AND FEATURES OF IN/OUT
• Explain in and out degrees
• Specific players as examples :
• Monreal: a good defender, hence more in-degree than out-degree (also trusted by teammates).
• Machine learning could facilitate this sort of analysis.
BENEFITS/LIMITATIONS
Benefits:
1. Good analysis tool to visualize formation
2. Infer what sort of position a player is playing
3. Find out about certain players and their roles in the team
4. Discover the football team style of play and its tactics
5. Compare the network graphs of two teams to analyse the differences in their tactics
6. Also applicable to many other suitable domains
Limitations:
1. Difficult to retrieve and extract relevant data
2. The data from the graph is just theoretical
3. Every match is different. Many uncontrollable factors can also affect the final result
4. Players may change
5. ……..
ASSUMPTIONS
• From graph analysis:
• Insufficient data
• Substitutes exist
• 100% successful passes are impossible
• From circumstances:
• Weather effects
• Home and away
• From players:
• Sports status
• From coaches:
• Changes of tactics
LESSONS LEARNT AND CONCLUSIONS
• Retrieving data is difficult
• Gephi is a powerful network tool
• Visualisation is an important part of analysis
• Graphs provide a very interesting way to visualise sports team cohesion
TEAM CONTRIBUTION
• Rob – Designed and implemented the Gephi graphs and PageRank algorithms to draw useful conclusions from them.
• Pratik – Statistical analysis & visualisation approach, maintaining Dropbox.
• Yogesh – Statistical analysis & visualisation approach, how to use Gephi, presentation speaker.
• Sean – Brought together the presentation, reviewed work, researched key areas, provided insight into the domain area and areas for future development.
• Chen – Key limitations and analysis, limitations (conclusion).
• Michael – Research into the domain area, presentation speaker and detailed analysis of football games and the domain.
OVERALL CONTRIBUTION


Editor's Notes

  1. Rob explain domain briefly.
  2. Rob: Objective: use Gephi and network analysis to find what similarities and differences can be inferred between the tactics employed by Arsenal and West Ham in their Premier League game, in which Arsenal surprisingly lost to West Ham 2-0.
  3. Rob: [Talk around the key themes and ideas we had.] Why we went for the football domain. What the sources of ideas were: being realistic about data retrieval while also choosing an area of interest that would work. The ideas were our own, from thinking about what could be an interesting research domain. We chose football because we wanted a domain with a real-world application.
  4. Rob and Michael: Read slide. Also explain defensive tactics, as well as lobbing and triangles (good team cohesion).
  5. Rob & Michael: embellish upon the slides. Rob: focus on the fourfourtwo site; maybe have the URL ready to show?
  6. Michael: a visual demonstration is important.
  7. Yogesh: Read off your paper, adding extra content where appropriate.
  8. Yogesh Shinde We firstly did defining the objective and selection of the match Then we did the research and collect the match ball possession detail through websites like fourfourtwo.com.
  9. Yogesh Shinde The number of edges represents here how many passes are made between players and the passes between the players shows up that how the players are connected to each other. By above number of edges analysis arsenal team has slightly greater passes than west ham. West ham team uses better passes than arsenal team, the average degree of arsenal team is much lesser than average degree of west ham team. West ham team applies long passing or pass with loop strategies because the graph diameter and radius of west ham team measured by graph is larger than arsenal team.
  10. Yogesh Shinde: In graph theory, the degree of a vertex is the number of edges connected to that node. In our dataset, a player's degree therefore represents how many passing links he has with other players.
  11. Yogesh Shinde: If a player has high in-degree and out-degree values, we can say that this player is probably playing in midfield and defining the team's style of play. These midfield players are central to the team: they carry more weight in defining the team's strategy and tactics.
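As a rough illustration of the in-degree and out-degree idea in the note above, here is a minimal Python sketch. The position labels and pass counts are hypothetical and are not taken from the match data; the sketch assumes each (passer, receiver) pair appears only once in the list.

```python
from collections import defaultdict

# Hypothetical directed passes: (passer, receiver, number of passes).
# These are illustrative values, not data from the Arsenal v West Ham match.
passes = [
    ("GK", "CB", 4),
    ("CB", "CM", 6),
    ("CM", "CB", 3),
    ("CM", "ST", 5),
    ("ST", "CM", 2),
]

def degrees(edges):
    """Return (in_degree, out_degree) dictionaries.

    In-degree counts the distinct teammates a player received from;
    out-degree counts the distinct teammates a player passed to.
    """
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for passer, receiver, _count in edges:
        out_deg[passer] += 1
        in_deg[receiver] += 1
    return dict(in_deg), dict(out_deg)

in_deg, out_deg = degrees(passes)
```

In this toy example the midfielder ("CM") is the only player with both an in-degree and an out-degree of 2, which is the pattern the note associates with midfield players.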
  12. Yogesh: Once again explain and embellish on notes.
  13. Yogesh: explain your section further in terms of graph theory. In graph theory, connectivity indicates whether every node in a network can be reached from any other node. If the graph is strongly connected (every player can be reached from every other by following the direction of the passes), we take it to represent a high possession value for the team. If the graph is only weakly connected (players are linked only when pass direction is ignored), we take it to represent a low possession value. In our dataset, the graph analysis gives West Ham a slightly greater possession value than Arsenal.
     Rob: add to what Yogesh says in terms of PageRank and anything he does not mention. Using PageRank to size the nodes and colouring them by their respective communities, we can try to gain an insight into the different passing and game styles of Arsenal and West Ham. What can we see? As we expect, Arsenal appear to play the ‘beautiful game’. Their most important passers are their central midfielders, including Aaron Ramsey and Mesut Ozil. Using the average formation as the layout and looking at the communities found, we find that the communities correspond to the distance between players on the pitch, leading us to believe that Arsenal play with a short passing style. From our analysis we could also conclude that Arsenal play with a narrow style and don’t like to use a lot of width.
     West Ham have many similarities to Arsenal when using PageRank and average formation as a layout. Of particular interest, however, is the community containing the goalkeeper and the striker; from this we could conclude that West Ham are more prone to playing the ‘long ball game’, certainly more so than Arsenal. The distribution of the PageRank scores also differs somewhat: West Ham are less dependent on their central midfielder, and their fullbacks and central defenders see more of the passing of the ball. Perhaps this means we can conclude that West Ham are more prone to a defensive long ball game, whilst also likely to include more width by getting their fullbacks more involved than Arsenal do.
     To be useful, this analysis must be combined with an expert analyst watching the game in question to confirm or rebut any claims inferred from it.
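The difference between strong and weak connectivity described in the note above can be checked with a simple reachability search. The following is a minimal sketch, not the Gephi implementation; the function names and the small example graph are assumptions made for illustration.

```python
from collections import deque

def reachable(adj, start):
    """Return the set of nodes reachable from start by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def is_strongly_connected(nodes, edges):
    """True if every node can reach every other node following edge direction."""
    fwd = {n: set() for n in nodes}
    rev = {n: set() for n in nodes}
    for u, v in edges:
        fwd[u].add(v)
        rev[v].add(u)
    start = nodes[0]
    # start reaches everyone, and everyone reaches start (checked on reversed edges)
    return reachable(fwd, start) == set(nodes) and reachable(rev, start) == set(nodes)

def is_weakly_connected(nodes, edges):
    """True if the graph is connected once edge direction is ignored."""
    und = {n: set() for n in nodes}
    for u, v in edges:
        und[u].add(v)
        und[v].add(u)
    return reachable(und, nodes[0]) == set(nodes)

# Hypothetical passing links (passer, receiver), not match data.
players = ["GK", "CB", "CM", "ST"]
one_way = [("GK", "CB"), ("CB", "CM"), ("CM", "CB"), ("CM", "ST")]
two_way = one_way + [("CB", "GK"), ("ST", "CM")]
```

With the `one_way` links nobody passes back to the goalkeeper, so the graph is only weakly connected; adding the return passes in `two_way` makes it strongly connected.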
  14. Yogesh Shinde
  15. Michael introduces: Explain how Rob will talk about pagerank.
  16. Micheal: Analysis of independent nodes and their significance in the network. Essentially, centrality defines how influential or significant any node can be within a network. Typically, these are the basic facts that centrality will tell us: the characteristics of an important node; how many other nodes a particular node is connected to (i.e. the degree of connectivity); how often a node appears in a particular path or trail; how fatal the consequences can be if an important node is eliminated. Degree centrality (degree of connectivity): this metric tells us how many other nodes connect to a particular node, whether by an inward or an outward connection. Nodes with many connections have a high degree of connectivity. Limitation: eliminating a node with high degree centrality doesn’t necessarily lead to a network partition, so such nodes might not be as significant as the metric suggests. E.g. a player like Debuchy (Arsenal’s right back) has a high degree centrality but, if taken out of the network, this doesn’t cause a massive network partition, because he fails to pass to three distinct players in the game, as opposed to the other players with a relatively similar degree centrality. Often, this metric is not used when determining a node’s significance.
  17. Michael: Closeness centrality is the reciprocal of the average shortest path length from a node to the other nodes. Closeness centrality is therefore inversely proportional to a node’s average shortest path length. The more centrally a player is positioned within the network, the higher their closeness centrality, which suggests that they can easily access any other player within the network, making them key to the team’s tactical set-up. Conclusion: the applicability of these metrics is very subjective in network analysis, and they are therefore used interchangeably depending on the context of a given study.
  18. Micheal: Betweenness centrality quantifies the number of times a node acts as a bridge along the shortest path between any other pair of nodes. Players that have a high probability of appearing along any shortest path have a high betweenness centrality. Ideally, centrally positioned players would be expected to have a high betweenness centrality, i.e. midfielders: Ramsey, Ozil, Francis and Cazorla. However, this analysis identifies a left-back amongst the players with a high value of this metric, which signals that particular player’s relevance within the network. As we would expect, players at the extreme ends of the network have a low value of this metric and so are not as relevant as the other players.
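The closeness definition in the note above (the reciprocal of the average shortest path length) can be sketched in a few lines of Python. This is an illustrative sketch only: it treats every link as length 1, assumes at least two players and that every player is reachable, and uses a hypothetical three-player graph rather than the match data.

```python
from collections import deque

def shortest_path_lengths(adj, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def closeness(adj, node):
    """Reciprocal of the average shortest path length from node to the others."""
    dist = shortest_path_lengths(adj, node)
    others = [d for n, d in dist.items() if n != node]
    if len(others) != len(adj) - 1:
        raise ValueError("not every player is reachable from " + node)
    return len(others) / sum(others)

# Hypothetical passing links: LB <-> CM <-> ST (not match data).
adj = {"LB": ["CM"], "CM": ["LB", "ST"], "ST": ["CM"]}
cm_score = closeness(adj, "CM")
lb_score = closeness(adj, "LB")
```

Here the midfielder is one pass away from both teammates, so the score is 1.0, while the left-back is one and two passes away, giving 2/3.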
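To make the "bridge along the shortest path" idea in the note above concrete, here is a minimal sketch of betweenness for an unweighted directed graph, following Brandes' algorithm. It is not necessarily the exact computation Gephi performs (for example, no normalisation is applied), and the three-player graph is hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Unnormalised betweenness: for each node, the number of ordered
    (source, target) pairs whose shortest paths pass through it, with
    ties shared equally between alternative shortest paths."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                      # nodes in order of discovery
        preds = {v: [] for v in adj}    # predecessors on shortest paths
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back towards s.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

# Hypothetical passing links: LB <-> CM <-> ST (not match data).
adj = {"LB": ["CM"], "CM": ["LB", "ST"], "ST": ["CM"]}
scores = betweenness(adj)
```

In this toy graph only the midfielder sits between the other two players, so it is the only node with a non-zero score (one for each direction of the LB and ST pair).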
  19. Rob: What does this mean for our graphs? So lobbing tactics etc.
  20. Rob: Adjacency matrix for Arsenal. PageRank uses this by converting it to the probabilities of where a ‘random surfer’ will end up after one step. The entry m_ij in row i and column j has value n/k if page j has k arcs out and n of them are to page i; otherwise m_ij = 0. Note that each column should always sum to 1. We don’t have any loops because it is assumed that no player passes to himself. PageRank has to overcome problems, such as needing the web to be a strongly connected graph (which it is not).
  21. Rob again: Dead ends – pages with no out links. Surfers reaching a dead end ‘disappear’, and after enough iterations of the algorithm no page connected to a dead end can have any PageRank at all. To address this, we add outgoing links from each dead end to all other pages in the web, with an equal weighting of probability. Spider traps – groups of pages that all have out links but never link to any pages outside the group. (Think of a community with only one in link and no out links.) If the surfer enters this community, he will spend every subsequent iteration of the algorithm in it. As a result, spider traps ‘suck’ all of the PageRank value into their pages. To overcome this, we introduce the notion of randomly ‘teleporting’ with a small probability at each iteration. The PageRank formula uses a constant vector of probabilities for this.
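The conversion from pass counts to the transition matrix described in the note above can be sketched as follows. This is a minimal illustration with hypothetical players and pass counts, not the Arsenal adjacency matrix; the function name `transition_matrix` is an assumption for this sketch.

```python
def transition_matrix(players, passes):
    """Build the column-stochastic matrix M where M[i][j] = n / k:
    player j made k passes in total, n of them to player i."""
    index = {p: i for i, p in enumerate(players)}
    size = len(players)
    counts = [[0.0] * size for _ in range(size)]
    for passer, receiver, n in passes:
        counts[index[receiver]][index[passer]] += n
    # Total passes made by each player (one total per column).
    col_totals = [sum(counts[i][j] for i in range(size)) for j in range(size)]
    return [
        [counts[i][j] / col_totals[j] if col_totals[j] else 0.0 for j in range(size)]
        for i in range(size)
    ]

# Hypothetical pass counts (passer, receiver, passes), not match data.
players = ["CB", "CM", "ST"]
passes = [("CB", "CM", 4), ("CM", "CB", 1), ("CM", "ST", 3), ("ST", "CM", 2)]
M = transition_matrix(players, passes)
column_sums = [sum(M[i][j] for i in range(len(players))) for j in range(len(players))]
```

In this example the midfielder made four passes, one to the centre-back and three to the striker, so the "CM" column holds 0.25 and 0.75 and every column sums to 1. A player with no outgoing passes would give an all-zero column, which is the dead-end case.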
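The two fixes in the note above (extra links out of dead ends, and random teleporting) can be combined in a short power-iteration sketch. This is an illustrative version rather than Gephi's implementation: the damping value of 0.85 (a teleport probability of 0.15) is a common choice assumed here, the example graph is hypothetical, and at least two nodes are assumed.

```python
def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Power iteration with teleporting and dead-end handling.

    damping is the probability of following a link; with probability
    1 - damping the surfer teleports to a uniformly random node.
    """
    n = len(nodes)
    out_links = {v: [] for v in nodes}
    for u, v in edges:
        out_links[u].append(v)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Constant teleport share given to every node on each iteration.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for u in nodes:
            # Dead end: behave as if u linked to all other nodes equally.
            targets = out_links[u] or [v for v in nodes if v != u]
            share = damping * rank[u] / len(targets)
            for v in targets:
                new_rank[v] += share
        rank = new_rank
    return rank

# Hypothetical passing links; "ST" never passes, so it is a dead end.
nodes = ["GK", "CM", "ST"]
edges = [("GK", "CM"), ("CM", "ST")]
ranks = pagerank(nodes, edges)
```

Because the dead end redistributes its rank and the teleport share is constant, the scores keep summing to 1 instead of draining away.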
  22. Rob and Yogesh: Formation graph. In our dataset, the team's players are the vertices and the passes between players are the edges. Players with a high weight in the team are represented by large nodes: a midfield player who controls the team's game and has high centrality, or more connectivity with other players, is shown with a large node. A thicker edge means that many passes were made between the two players. Communities (defensive, midfield and offensive): in our dataset we define three communities. Players in the defensive community, when they get the ball, only clear it or pass it to a safely positioned teammate. Players in the midfield community play an important role because they get the ball most of the time, and they define the play of the game by passing the ball to offensive or defensive players. Players in the offensive community play at the front of the team; when they get the ball, they pass to a player in a shooting position or take a shot themselves to create a goal chance.
  23. This layout uses the average formation over the match to position the nodes (as it makes most sense). Nodes are coloured by community (lots of links between nodes in the same community, and few links between communities). Nodes are sized by their PageRank value (indicating how ‘important’ a player is with regard to the passing style of the team).
  24. Yogesh or Rob: Table and explain analysis conducted.
  25. Chen: Read from prepared notes to detail the limitations and benefits of our solution
  26. Chen: Explain the assumptions we have made
  27. Rob concludes why this is really useful as well as what we have learned. Did we manage to infer anything useful?
  28. Mention contributions if he asks. Everyone worked hard though.