Soccer Trajectory



    1. Trajectory Analysis of Broadcast Soccer Videos
       Computer Science and Engineering Department, Indian Institute of Technology, Kharagpur
       by Prof. Jayanta Mukherjee [email_address]
    2. Collaborators
       - V. Pallavi, research scholar
       - Prof. A.K. Majumdar, CSE
       - Prof. Shamik Sural, SIT
    3. OUTLINE
       - Motivation and Objective
       - State Based Video Model
       - Extraction of Features
       - Trajectory Detection
       - States and Event Detection
    4. Motivation
       - Increasing availability of soccer videos
       - Soccer videos appeal to a large audience
       - Soccer videos need processing before delivery over narrow-band networks
       - The relevance of soccer videos drops significantly after a short period of time
       - Soccer video analysis therefore needs to be automatic, and its results must be semantically meaningful
    5. State Based Video Model
       Video data model: a representation of the information contained in unstructured video that supports users' queries.
       State based model: the states of soccer video objects and the transitions between them (caused by events).
    6. State Chart Diagram for Ball Possession
    7. Immediate Goal
       Our objective is to identify these states and their transitions by analyzing the unstructured video.
    8. Detection of States and Events
       In a soccer match, the ball possession state may be any of the following:
       - possession by Team A
       - possession by Team B
       - both teams fighting to possess the ball
       - ball possessed by neither team during a break
    9. Features Used
       - Cinematic features
         - Shot transitions
         - Shot types
         - Shot durations
       - Object based features
         - Players
         - Ball
         - Billboards
         - Field descriptors
    10. Feature Extraction: Cinematic Features
        - A shot is a continuous sequence of frames captured by the same camera.
        - Shot detection algorithms automatically segment a video into shots.
        - Shot classification algorithms partition a video stream into a set of meaningful and manageable segments.
    11. Shot Classification
        Shots can be classified into:
        - Long shot: captures a global view of the field
        - Medium shot: shows a close-up view of one or more players in a specific part of the field
        - Close shot: shows an above-waist view of a single player
    12. Cinematic Features: Shot Classification (contd.)
        - A soccer field has one distinct dominant color, green, which varies with the stadium and the lighting conditions.
        - In long views, either grass dominates the entire frame or the crowd covers its upper part.
    13. Typical long views in soccer videos: grass covering the entire frame; grass covering part of the frame.
    14. Shot Classification (contd.)
        For each frame of a soccer video sequence, if the dominant color is green:
        - dominant color ratio > 0.75 and <= 1.0: long shot
        - dominant color ratio > 0.5 and <= 0.75: medium shot
        - dominant color ratio > 0.25 and <= 0.5: close shot
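The thresholding above can be sketched as a small function; the 0.25/0.5/0.75 cut-offs are taken directly from the slide, while the function and label names are illustrative:

```python
def classify_shot(dominant_color_ratio):
    """Classify a frame by the fraction of its pixels having the
    dominant (grass) color, using the slide's thresholds."""
    r = dominant_color_ratio
    if 0.75 < r <= 1.0:
        return "long"
    if 0.5 < r <= 0.75:
        return "medium"
    if 0.25 < r <= 0.5:
        return "close"
    return "unclassified"
```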
    15. Shot Classification Results (number of frames per predicted class)

        True \ Predicted   Long Shot   Medium Shot   Close Shot   Unclassified
        Long Shot          5830        32            24           144
        Medium Shot        37          418           38           7
        Close Shot         78          110           1472         20

    16. Shot Classification Results

        Shot Type     % of True Classification
        Long Shot     96.68
        Medium Shot   83.76
        Close Shot    87.63
    17. Cinematic Features: Shot Detection
        Shot transitions in sports videos can be:
        - Wipe
        - Dissolve
        - Hard cut
        - Fade
    18. Proposed Shot Detection Method
        - Extends the approach proposed by Vadivel et al. to broadcast soccer videos
        - Combines the shot detection method of Vadivel et al. with the proposed shot classification method
        Limitation of Vadivel et al.'s method for broadcast soccer videos: hard cuts are missed.
    19. Proposed Shot Detection Method
        Each frame in a shot is classified with the shot classification algorithm. If a long shot is segmented into a sequence of long and medium view frames, and the number of medium view frames in the sequence is above a certain threshold, a hard cut exists within the shot.
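A minimal sketch of the run-length test above. The `min_run` threshold value is an assumption, since the slides do not state the actual threshold used:

```python
def hard_cut_in_long_shot(frame_labels, min_run=5):
    """Return True when a shot labelled 'long' contains a run of at
    least min_run 'medium' frames, which the method takes as evidence
    of a missed hard cut within the shot."""
    run = 0
    for label in frame_labels:
        run = run + 1 if label == "medium" else 0
        if run >= min_run:
            return True
    return False
```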
    20. Proposed Shot Detection Results
        Overall recall and precision:
        - Vadivel et al.'s method: 85.43%, 89.02%
        - Proposed method: 91.76%, 93.65%
    21. Shot detection improved by shot classification
    22. Object Based Features: Grass Pixel Detection
        Each frame is processed in the YIQ color space. It is found experimentally that grass pixels have I values between 25 and 55 and Q values between 0 and 12.
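A sketch of the grass test. The RGB-to-YIQ transform below is the standard NTSC one; note that the slide's I and Q ranges (25-55 and 0-12) presumably follow the authors' own scaling or sign convention for those channels, which the slides do not specify, so the range test is shown separately, applied to already-scaled values:

```python
def rgb_to_yiq(r, g, b):
    """Standard NTSC RGB -> YIQ transform (inputs in 0..255)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    i = 0.596 * r - 0.274 * g - 0.322 * b
    q = 0.211 * r - 0.523 * g + 0.312 * b
    return y, i, q

def is_grass(i, q):
    """Range test from the slide, on I and Q values already expressed
    in the authors' scale."""
    return 25 <= i <= 55 and 0 <= q <= 12
```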
    23. Grass pixels detected for a long view frame; playfield region detected.
    24. Object Based Features (contd.): Playfield Line Detection
        A playfield line separates the playfield from the non-playfield background, which usually consists of billboards (also called advertisement boards). The Hough transform is used to detect the playfield line.
    25. Object Based Features (contd.): Midfield Line Detection
        The midfield line divides the playfield in half along its width. The Hough transform is applied to detect it.
    26. Object Based Features (contd.): Ball Detection
        Challenges:
        - Features of the ball (color, size, shape) vary with time
        - The relative size of the ball is very small
        - The ball may not be an ideal circle because of fast motion and illumination conditions
        - Objects in the field or in the crowd may look similar to the ball
        - Field appearance changes from place to place and from time to time
        There is no definite property that uniquely identifies the ball in a frame.
    27. Detecting Ball Candidates in Long Shots
        - Obtain ball candidates by detecting circular regions with the circular Hough transform
        - Filter out non-ball candidates by:
          - Removing candidates from the channel's logo
          - Removing candidates from the gallery region
          - Removing candidates from the midfield line
          - Filtering out candidates moving against the camera
    28. Ball candidates before and after filtering.
    29. Detecting Players in Long Shots
        Challenges:
        - Features of the players (color, texture, size, motion) are neither static nor uniform
        - Players appear very small
        - The size of players changes with their position and the zooming of cameras
        - The color and texture of the jersey and shorts vary from team to team
        - Players in the field do not have constant motion
    30. Detecting Player Regions
        - Obtain player pixels by removing non-player pixels:
          - Removing grass pixels
          - Removing the broadcasting channel's logo
          - Removing the extra-field region (billboards and gallery)
          - Removing pixels from the midfield line
        - Segment the image containing player pixels into isolated player regions by:
          - A region growing algorithm
          - Taking the center of the bounding rectangle of each region as the location of the player
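The segmentation step above can be sketched as a flood-fill style region growing over a binary player-pixel mask, returning the centre of each region's bounding rectangle as the player location. The 4-connectivity used here is an assumption; the slides do not specify the connectivity:

```python
def player_regions(mask):
    """mask: list of rows of 0/1 player pixels.  Grows each unvisited
    player pixel into a 4-connected region and returns the centre
    (row, col) of each region's bounding rectangle."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    centres = []
    for sr in range(rows):
        for sc in range(cols):
            if mask[sr][sc] and not seen[sr][sc]:
                stack = [(sr, sc)]
                seen[sr][sc] = True
                rmin = rmax = sr
                cmin = cmax = sc
                while stack:
                    r, c = stack.pop()
                    rmin, rmax = min(rmin, r), max(rmax, r)
                    cmin, cmax = min(cmin, c), max(cmax, c)
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nr, nc = r + dr, c + dc
                        if 0 <= nr < rows and 0 <= nc < cols \
                           and mask[nr][nc] and not seen[nr][nc]:
                            seen[nr][nc] = True
                            stack.append((nr, nc))
                centres.append(((rmin + rmax) / 2, (cmin + cmax) / 2))
    return centres
```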
    31. A long shot view.
    32. Player pixels detected.
    33. Players detected in long shot views.
    34. Team Identification in Soccer Videos
        Players in soccer videos are classified using a supervised classification method. Mean I and Q values of the player regions are obtained from a few randomly selected frames, and the minimum and maximum I and Q values are set as the range for classifying player regions.
    35. Team Classification in Soccer Videos
        Experiments were performed on two different matches:
        - Real Madrid vs. Manchester United (UEFA Champions League 2003)
        - Chelsea vs. Liverpool (UEFA Champions League 2007)
    36. Team Classification Results: Real Madrid vs. Manchester United
        (number of players; % in parentheses)

        True \ Predicted   Team A      Team B   Unclassified
        Team A             725         0        173 (16.29)
        Team B             0           652      72 (11.51)
        Unclassified       3 (17.86)   0 (25)   5

    37. Team Classification Results: Chelsea vs. Liverpool
        (number of players; % in parentheses)

        True \ Predicted   Team A     Team B   Unclassified
        Team A             725        0        173 (19.27)
        Team B             0          652      72 (9.94)
        Unclassified       3 (37.5)   0        5
    38. Object Based Features (contd.): Camera Direction Estimation
        - Optical flow velocities and their directions are computed using Horn and Schunck's method.
        - Based on the sign of the horizontal flow component for the majority of pixels in a frame, the direction of movement (left or right) of the camera is estimated.
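The majority vote on the horizontal flow component can be sketched as follows. The mapping from flow sign to camera direction (scene pixels drifting left when the camera pans right) is an assumption about the sign convention, which the slides do not state:

```python
def camera_direction(u):
    """u: horizontal optical-flow components for the pixels of a frame.
    When most components are negative the scene drifts left, read here
    as the camera panning right, and vice versa."""
    neg = sum(1 for v in u if v < 0)
    pos = sum(1 for v in u if v > 0)
    if neg > pos:
        return "right"
    if pos > neg:
        return "left"
    return "static"
```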
    39. Camera Direction Estimation (contd.)
        Optical flow velocities for the camera moving towards the right.
    40. Tracking of Broadcast Video Objects
        Challenges:
        - Camera parameters are unknown
        - Cameras are not fixed
        - Cameras are zoomed and rotated
        - Broadcast video is an edited video
    41. Construction of a Directed Weighted Graph
        Objects in a frame form the nodes. An arc (edge) is formed between two correlated objects in two different frames. The measure of correlation or similarity provides the weight, and the temporal direction provides the direction of the edge.
    42. Directed Weighted Graph (contd.)
    43. Object Trajectory Detection
        Given a source node, the longest path in the graph, obtained by dynamic programming, gives the path of the object.
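Because the arcs only go forward in time, the graph is a DAG, so the longest path can be found with a single dynamic-programming sweep over the frames. A minimal sketch, where the similarity function is supplied by the caller and detections are plain tuples (both are illustrative choices, not the authors' exact formulation):

```python
def longest_trajectory(frames, similarity):
    """frames: one list of detections per frame; similarity(a, b) gives
    the weight of the arc from a detection in frame t to one in t+1.
    Returns (total path weight, chosen detection index per frame)."""
    # history[t][i] = (best accumulated weight ending at node i, back-pointer)
    history = [[(0.0, None) for _ in frames[0]]]
    for t in range(1, len(frames)):
        layer = []
        for det in frames[t]:
            w, back = max(
                (history[t - 1][j][0] + similarity(prev, det), j)
                for j, prev in enumerate(frames[t - 1])
            )
            layer.append((w, back))
        history.append(layer)
    # trace back from the heaviest final node
    end = max(range(len(frames[-1])), key=lambda i: history[-1][i][0])
    total, path = history[-1][end][0], [end]
    for t in range(len(frames) - 1, 0, -1):
        end = history[t][end][1]
        path.append(end)
    path.reverse()
    return total, path
```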
    44. Ball detection results for long shots
        Average recall is 96.75% and average precision is 94.42%.

        Frame Range            Total frames   Ball identified in   Ball present in   Recall   Precision
        23800-24200            20             18                   19                94.73    90
        30300-30760            23             21                   22                95.45    91.3
        34800-35400            30             28                   30                93.33    93.33
        35900-37000            55             51                   53                96.23    92.73
        40500-40900            20             20                   20                100      100
        37020-37400            19             18                   19                94.74    94.74
        41400-41940            27             27                   27                100      100
        Liang et al.* seq. 1   650            597                  609               98.03    91.85
        Liang et al.* seq. 2   719            677                  689               98.26    95.83

        * Liang D., Liu Y., Huang Q. and Gao W., A Scheme for Ball Detection and Tracking in Broadcast Soccer Video, Pacific Rim Conference on Multimedia, 2005, 1, LNCS 3767, pp. 864-875.
    45. Results for ball detection in long shots (contd.)
    46. Tracking a Single Player
        Given a source node (the player in the first frame), the longest path in the graph, obtained by dynamic programming, gives the path of the player over the whole sequence. Player being tracked.
    47. Tracking Multiple Players
        The longest path from each node of the graph (the players in the first frame), obtained by dynamic programming, gives the trajectories of the players over the sequence of frames.
        Limitations:
        - Occlusion between players
        - Players in contact
        - Similarity between players belonging to the same team
    48. Resolving Conflicting Player Trajectories
        - If more than one player trajectory shares more than two common nodes, only one of them is true.
        - The path having maximum weight is taken as the true trajectory.
        - Nodes constituting the paths of correctly detected players are removed and the graph is constructed again.
        - The mistracked players are then tracked again.
    49. Multiple Player Detection Results
        Average recall is 97.53% and average precision is 90.18%.

        Video file   No of Frames   SOP present   SOTP detected   SOP detected   Recall (%)   Precision (%)
        Soccer 1     180            2520          2520            3110           100          81.03
        Soccer 2     200            1800          1800            2200           100          81.81
        Soccer 3     60             420           360             360            85.71        100
        Soccer 4     100            560           560             670            100          83.58
        Soccer 5     68             505           505             530            100          95.28
        Soccer 6     317            1586          1578            1588           99.48        99.37

    50. Multi-Player Tracking Results
        Average accuracy is 94.05%.

        Video file   No of Frames   SOP present   SOTP tracked by tracking   SOTP tracked by tracking   Accuracy (%)
                                                  algorithm                  and retracking algorithm
        Soccer 1     180            2520          1800                       2160                       85.71
        Soccer 2     200            1800          1200                       1800                       100
        Soccer 3     60             420           300                        380                        90.48
        Soccer 4     100            560           540                        560                        100
        Soccer 5     68             505           356                        462                        91.49
        Soccer 6     317            1586          1290                       1532                       96.59

    51. Occlusion Results
        Average accuracy is 83.89%.

        Video file   No of occlusion and contact cases   No of cases solved   Accuracy (%)
        Soccer 1     36                                  20                   55.55
        Soccer 3     44                                  44                   100
        Soccer 5     20                                  16                   80
        Soccer 6     20                                  20                   100
    52. Multi-Player Tracking Results
    53. Multi-Player Tracking with Occlusion Results
    54. Multi-Player Tracking with Occlusion Results (contd.)
    55. Multi-Player Tracking with Occlusion Results (contd.)
    56. Multi-Player Tracking with Occlusion Results (contd.)
    57. Tracking the Mistracked Player
    58. Tracking the Mistracked Player (contd.)
    59. Tracking the Mistracked Player (contd.)
    60. Tracking the Mistracked Player (contd.)
    61. Detection of States and Events
        The features extracted and the trajectories detected are used to detect states and events based on the proposed state based video model.
        States identified: ball possession states. Events detected: ball passing events.
    62. State Chart Diagram for Ball Possession
    63. Play Break Detection
    64. State Detection
        Ball possession states are obtained by spatial proximity analysis, based on:
        - the distance between the nearest and the second nearest player to the ball
        - the spatial arrangement of the players and the ball
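A sketch of the proximity rule: the nearest player's team possesses the ball, unless the second nearest player belongs to the other team and is almost as close, in which case the state is a fight. The `fight_margin` threshold is an assumption; the slides do not give the actual distance criterion:

```python
def possession_state(ball, players, fight_margin=10.0):
    """ball: (x, y); players: list of (team, (x, y)), at least two.
    Returns the possessing team's label, or 'fight'."""
    def dist(pos):
        return ((pos[0] - ball[0]) ** 2 + (pos[1] - ball[1]) ** 2) ** 0.5
    ranked = sorted(players, key=lambda tp: dist(tp[1]))
    (team1, pos1), (team2, pos2) = ranked[0], ranked[1]
    if team1 != team2 and dist(pos2) - dist(pos1) <= fight_margin:
        return "fight"
    return team1
```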
    65. Ball Possession State Detection
        Ball in possession of player 1's team (both frames).
    66. Ball Possession State Detection
        Ball in possession of player 1's team; ball in a fight state.
    67. Ball Possession Results (number of frames; % of misclassified frames in parentheses)

        True \ Predicted   Team A      Team B       Fight
        Team A             206         2 (0.93)     8 (3.17)
        Team B             16 (3.85)   392          8 (1.92)
        Fight              4 (5.71)    12 (17.14)   54
    68. Edit Distance as a Performance Measure for Ball Possession States
        If the actual state sequence for a sequence of frames is
            AAAAFFFFFFFFFFBBBB
        and the state sequence obtained by the proposed algorithm is
            AAAAFFFFFFFFBBBBBB
        both sequences are represented as strings S1 and S2. The edit distance D(S1, S2) is the minimum number of point mutations required to change S1 into S2, where a point mutation is one of:
        - replacing a symbol
        - inserting a symbol
        - deleting a symbol
        The edit distance for the above sequences is 2, while the normalized edit distance is D(S1, S2)/|S1|.
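The measure above is the standard Levenshtein distance; a row-by-row dynamic-programming sketch:

```python
def edit_distance(s1, s2):
    """Minimum number of substitutions, insertions and deletions
    turning s1 into s2 (Levenshtein distance), computed with one
    rolling row of the DP table."""
    prev = list(range(len(s2) + 1))
    for i, a in enumerate(s1, 1):
        cur = [i]
        for j, b in enumerate(s2, 1):
            cur.append(min(prev[j] + 1,              # delete a
                           cur[j - 1] + 1,           # insert b
                           prev[j - 1] + (a != b)))  # substitute
        prev = cur
    return prev[-1]
```

For the slide's example, `edit_distance("AAAAFFFFFFFFFFBBBB", "AAAAFFFFFFFFBBBBBB")` is 2 (two F's substituted by B's), and the normalized distance is 2/18.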
    69. Shot-wise ball possession results
    70. Event Detection
        The event detected in this work is the ball passing event. It can be a:
        - Forward pass
        - Reverse pass
    71. Event Detection (contd.)
        The ball passing event cannot be detected from state transition graphs because:
        - The ball is usually passed between players of the same team
        - State transition graphs only show changes in the ball possession state between Team A, Team B and Fight (Team A - Team B, Team B - Team A, Team B - Fight, Fight - Team B, Team A - Fight, Fight - Team A)
    72. Schematic diagram for ball passing events
        The ball is said to be passed in a sequence of frames if:
        - the player nearest to the ball in the initial frames of the sequence is the second nearest player to the ball in the subsequent frames, and
        - the nearest and the second nearest players to the ball belong to the same team.
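The two conditions can be sketched directly; the dictionary shape used here for the per-frame nearest and second-nearest players is illustrative, not the authors' data structure:

```python
def ball_passed(initial, subsequent):
    """initial, subsequent: {'nearest': (player_id, team),
                             'second':  (player_id, team)}
    for the first and later frames of the sequence.  The ball counts
    as passed when the initially nearest player has become the second
    nearest, and the new nearest and second-nearest players are
    team-mates."""
    first_player, _ = initial["nearest"]
    _, nearest_team = subsequent["nearest"]
    second_player, second_team = subsequent["second"]
    return second_player == first_player and nearest_team == second_team
```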
    73. Example of a ball passing event.
    74. Example of a ball passing event (contd.)
    75. Example of a ball passing event (contd.)
    76. Example of a ball passing event (contd.)
    77. Classifying Ball Passing Events
        Forward pass: the camera moves towards the goal post of the team opposing that of the nearest player.
        Reverse pass: the camera moves towards the goal post of the nearest player's own team.
    78. Results for ball passing events
        Average recall = 100% and precision = 60%.
    79. Classification of ball passing events (true vs. false ball passes)

        True \ Predicted   Forward (no of passes)   Reverse (no of passes)
        Forward            3                        1
        Reverse            -                        5
    80. Graphs for ball possession and ball passing
        Graphs illustrating ball possession states and ball passing events for Sequence 7.
    81. Graphs for ball possession and ball passing
        Graphs illustrating ball possession states and ball passing events for Sequence 10.
    82. Publications
        - V. Pallavi, A. Vadivel, Shamik Sural, A.K. Majumdar and Jayanta Mukherjee, Identification of moving objects in a soccer video, Workshop on Computer Vision, Graphics and Image Processing 2006, Hyderabad, India, pp. 13-18.
        - V. Pallavi, J. Mukherjee, A.K. Majumdar and Shamik Sural, Shot classification in soccer videos, Proceedings of the National Conference on Recent Trends in Information Systems 2006, Kolkata, India, pp. 216-219.
        - V. Pallavi, J. Mukherjee, A.K. Majumdar and Shamik Sural, Identification of the team in possession of the ball in a soccer video using static and dynamic segmentation, Proceedings of the Sixth International Conference on Advances in Pattern Recognition 2007, Kolkata, India, pp. 249-255.
        - V. Pallavi, J. Mukherjee, A.K. Majumdar and Shamik Sural, Ball detection from broadcast soccer videos using static and dynamic features, Journal of Visual Communication and Image Representation (accepted for a second review).
    83. Thank You