
Fiducial Marker Tracking Using Machine Vision with Saurabh Ghanekar and Kazutaka Takahashi

Advanced machine vision is increasingly being used to investigate, diagnose, and identify potential remedies for complex health issues and to track their progression. In this study, a behavioral neuroscientist at the University of Chicago and his colleagues have collaborated with Kavi Global to characterize 3D feeding behavior and its potential changes caused by neurological conditions such as ALS, Parkinson’s disease, and stroke, or by oral environmental changes such as tooth extraction and dental implants.

Videos of rodents feeding on kibble are recorded by a high-speed biplanar videofluoroscopy technique (XROMM). Their feeding behavior is then analyzed by tracking radio-opaque fiducial markers implanted in their head region. The marker tracking process, until now, was manual and tedious, and was not designed to process massive amounts of longitudinal data. This session will highlight a near-automated, deep learning-based solution for detecting and tracking fiducial markers in the videos, resulting in a more efficient and robust process, with a 300+ times reduction in data processing time compared to manual use of the existing software.

Our approach involved the following steps: (i) Marker Detection: deep learning algorithms were used to identify the pixels corresponding to markers within each frame; (ii) Marker Tracking: Kalman filtering along with the Hungarian algorithm was used to track markers across frames; (iii) 2D-to-3D Conversion: sequences recorded by the two cameras were matched, and marker locations in 2D track coordinates were triangulated to generate 3D marker locations. The features extracted from the videos are used to characterize behaviorally relevant kinematic features such as rhythmic chewing and swallowing. The solution was built with TensorFlow's Python APIs and Spark.

  1. Fiducial Marker Tracking Using Machine Vision. Saurabh Ghanekar, Kavi Global; Kazutaka Takahashi, University of Chicago. #AISAIS14
  2. Outline
     • Motivation & Goals
     • Approach
     • Results
     • Next Steps
  3. Motivation
     • Feeding is a highly complex, life-sustaining behavior, essential for survival in all species
     • Certain neurological conditions such as Parkinson’s disease, ALS, and stroke can cause difficulty in chewing and swallowing, known as dysphagia
     • Affects quality of life
     • Dysphagia can lead to malnutrition, dehydration, and aspiration
  4. End Goal
     To characterize feeding dynamics and gain insights into feeding behavior changes caused by certain neurological conditions and changes in the oral environment.
  5. Current State
     • Study focused on rodents
     • XROMM videos of rodents feeding on kibble
     • Videos recorded from 2 camera angles simultaneously
     • Radio-opaque markers implanted in skull, mandible, and tongue
     • Movement of markers needs to be tracked and quantified
     • The marker tracking process is extremely tedious, as it is done using manual, frame-by-frame methods [1,2]
     • Consumes valuable time, thus delaying further research
  6. Immediate Goal
     A near-automated, deep learning-based solution for detecting and tracking markers, resulting in a more efficient and robust process. (Figure: Bunyak et al., 2017 [1])
  7. Approach: Key Steps
     • Data In: Read in videos frame by frame for the left and right cameras in 2D (x, y)
     • Head and Marker Detection: Utilize a neural network to identify the bounding box of the head and pinpoint unlabeled markers inside the bounding box
     • Marker Tracking: Employ Kalman filters along with the Hungarian algorithm to keep track of markers from frame to frame
     • Sequence Matching: Match sequence tracks from the left and right cameras
     • 2D to 3D Conversion: Feed 2D left and right coordinates, along with rotation matrices and a translation vector, to get final 3D coordinates (x, y, z)
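As a concrete starting point, here is a minimal sketch of the Data In step in Python with OpenCV; the filename is a hypothetical placeholder, and the downstream steps are sketched under the slides that describe them.

```python
import cv2

def read_frames(path):
    """Yield frames from one camera's video, one at a time (Data In)."""
    cap = cv2.VideoCapture(path)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:  # end of video or read error
                break
            yield frame
    finally:
        cap.release()

# Each frame feeds detection; detections feed the Kalman/Hungarian
# tracker; finished 2D tracks are sequence-matched across cameras and
# triangulated to 3D (see the sketches after the later slides).
n_frames = 0
for frame in read_frames('left_camera.mp4'):  # hypothetical filename
    n_frames += 1
print('frames read:', n_frames)
```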
  8. Data Description
     • 13 pairs of videos (left & right camera) available for training
     • 720 px by 1260 px videos, recorded at 250 fps, ~10 seconds each
     • Head and marker coordinates per frame used for model training & evaluation
     • 18-20 markers to be tracked in each video
     (Figure: sample frames from Camera 1 and Camera 2)
  9. Head and Marker Detection
     • TensorFlow's Object Detection API
     • Single Shot MultiBox Detector (SSD) with MobileNet, using transfer learning from the MS COCO dataset
     • Key model parameters:
       - Initial learning rate: 0.0004
       - Feature extractor type: ssd_mobilenet_v1
       - Minimum depth: 16
       - Depth multiplier: 1.0
       - conv_hyperparams: activation: RELU_6; regularizer: l2_regularizer; weight: 0.00004
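A minimal inference sketch, assuming the detector was exported as a TF 1.x frozen graph with the Object Detection API's standard tensor names; the graph path and score threshold here are illustrative assumptions, not values from the talk.

```python
import numpy as np
import tensorflow as tf  # TF 1.x, as used by the Object Detection API

# Load the exported frozen detection graph (path is hypothetical)
graph = tf.Graph()
with graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile('frozen_inference_graph.pb', 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

def detect(frame, sess, score_threshold=0.5):
    """Run SSD-MobileNet on one frame; return boxes above the threshold.
    `frame` is an HxWx3 uint8 array; boxes are normalized [y1, x1, y2, x2]."""
    image_tensor = graph.get_tensor_by_name('image_tensor:0')
    boxes = graph.get_tensor_by_name('detection_boxes:0')
    scores = graph.get_tensor_by_name('detection_scores:0')
    b, s = sess.run([boxes, scores], feed_dict={image_tensor: frame[None, ...]})
    return b[0][s[0] >= score_threshold]

with tf.Session(graph=graph) as sess:
    frame = np.zeros((720, 1260, 3), dtype=np.uint8)  # placeholder frame
    marker_boxes = detect(frame, sess)
```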
  10. Head and Marker Detection (example detection results; figure only)
  11. Marker Tracking
      Multi-object tracking involves three key components:
      • Predicting the object location in the next frame
      • Associating predictions with existing objects
      • Track management
      (Figure: Howe and Holcombe, 2012 [3])
  12. Prediction
      • A Kalman filter is used to predict each marker's location in the next frame
      • Position is estimated recursively in each frame, based on previous frames
      • Bayesian updating is used to estimate a joint probability distribution over position and velocity
      • Start with an initial velocity estimate and covariance matrix
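A minimal constant-velocity Kalman filter sketch in NumPy; the noise covariances and initial values are illustrative tuning assumptions, not the talk's settings.

```python
import numpy as np

# State: [x, y, vx, vy]; measurements are detected (x, y) marker centers.
dt = 1.0                                  # one frame per step
F = np.array([[1, 0, dt, 0],              # state transition (constant velocity)
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
H = np.array([[1, 0, 0, 0],               # we observe position only
              [0, 1, 0, 0]], dtype=float)
Q = np.eye(4) * 1e-2                      # process noise (tuning assumption)
R = np.eye(2) * 1.0                       # measurement noise (tuning assumption)

def predict(x, P):
    """Predict the marker state for the next frame."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    """Correct the prediction with a detected (x, y) measurement z."""
    y = z - H @ x                          # innovation
    S = H @ P @ H.T + R                    # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    return x, P

# Initialize from a (hypothetical) first detection with a rough covariance
x = np.array([100.0, 200.0, 0.0, 0.0])
P = np.eye(4) * 10.0
x, P = predict(x, P)
x, P = update(x, P, np.array([102.0, 198.0]))
```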
  13. Association
      • After prediction, an assignment cost matrix is computed from the bounding-box intersection-over-union (IoU)
      • The Hungarian algorithm is used to optimally associate predicted marker positions with true marker positions
      (Figure: example cost matrix between predicted and true marker positions)
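A sketch of the association step using SciPy's Hungarian solver on a (1 - IoU) cost matrix; the IoU threshold is an illustrative assumption (see Track Management on the next slide).

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(predicted, detected, iou_threshold=0.3):
    """Match predicted boxes to detections; cost = 1 - IoU."""
    cost = np.array([[1.0 - iou(p, d) for d in detected] for p in predicted])
    rows, cols = linear_sum_assignment(cost)   # Hungarian algorithm
    # Reject pairs whose IoU falls below the threshold (no assignment)
    return [(r, c) for r, c in zip(rows, cols)
            if 1.0 - cost[r, c] >= iou_threshold]

# Example: two predicted boxes vs. two detections (pixel coordinates)
pred = [(10, 10, 30, 30), (50, 50, 70, 70)]
det = [(52, 49, 71, 72), (11, 12, 29, 31)]
print(associate(pred, det))  # [(0, 1), (1, 0)]
```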
  14. Track Management
      • If the IoU is below a set threshold, there is no assignment
      • Also, not all potential tracks become actual tracks
      • As a result, tracks may die and new ones are born
      • The output of the Kalman filter and Hungarian algorithm can therefore be a large number of discontinuous tracks
      • These are "stitched" together by looking forward and backward a number of frames to find the best match based on closest Euclidean distance
      • At the end, we get one track per marker
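A greedy, forward-only sketch of the stitching idea, under assumed gap and distance limits; the talk's actual procedure also searches backward and tunes these parameters.

```python
import numpy as np

def stitch_tracks(fragments, max_gap=10, max_dist=15.0):
    """Greedily join track fragments whose end and start are close in
    time (<= max_gap frames) and space (<= max_dist pixels).
    Each fragment: dict with 'start' and 'end' frame indices and a
    'points' list of (x, y) positions."""
    fragments = sorted(fragments, key=lambda f: f['start'])
    merged = []
    for frag in fragments:
        best = None
        for m in merged:
            gap = frag['start'] - m['end']
            dist = np.linalg.norm(np.subtract(frag['points'][0],
                                              m['points'][-1]))
            if 0 < gap <= max_gap and dist <= max_dist:
                if best is None or dist < best[1]:
                    best = (m, dist)        # closest Euclidean match
        if best:
            best[0]['points'].extend(frag['points'])
            best[0]['end'] = frag['end']
        else:
            merged.append(frag)             # a new track is born
    return merged
```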
  15. Sequence Matching
      • After generating marker tracks separately for each camera, corresponding tracks from the two cameras must be matched
      • Tried distinct methodologies such as time-series clustering and different correlation measures
      • Spearman correlation on frame-to-frame changes in Y-coordinate values gave the best results (100% accuracy on manually tracked data)
      (Figure: predicted (x, y) tracks for the left and right cameras)
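A minimal sketch of the Spearman-based matching, assuming each track is an equal-length array of per-frame Y coordinates; a real implementation would also handle unequal lengths and enforce one-to-one matching (e.g. with the Hungarian algorithm again).

```python
import numpy as np
from scipy.stats import spearmanr

def match_tracks(left_tracks, right_tracks):
    """Pair each left-camera track with the right-camera track whose
    frame-to-frame Y-coordinate changes correlate best (Spearman)."""
    pairs = {}
    for i, ly in enumerate(left_tracks):
        scores = [spearmanr(np.diff(ly), np.diff(ry)).correlation
                  for ry in right_tracks]
        pairs[i] = int(np.argmax(scores))   # best-matching right track
    return pairs
```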
  16. 2D to 3D Conversion
      • P = K [R | T]: the 3x4 camera projection matrix for each camera
        - K: camera (intrinsic) matrix (3x3)
        - R: rotation matrix (3x3)
        - T: translation vector (3x1)
      • Results are a good match with the actual 3D coordinates
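A minimal triangulation sketch using OpenCV, assuming calibrated intrinsics and extrinsics for both cameras and already-matched 2D tracks:

```python
import numpy as np
import cv2

def triangulate(K1, R1, T1, K2, R2, T2, pts_left, pts_right):
    """Triangulate matched 2D marker coordinates from the two cameras
    into 3D. pts_left and pts_right are float arrays of shape (2, N)."""
    P1 = K1 @ np.hstack([R1, T1])          # 3x4 projection, left camera
    P2 = K2 @ np.hstack([R2, T2])          # 3x4 projection, right camera
    pts4d = cv2.triangulatePoints(P1, P2, pts_left, pts_right)
    return (pts4d[:3] / pts4d[3]).T        # Nx3 Euclidean (x, y, z)
```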
  17. Evaluation 1: % IoU Difference
      • Step 1: Calculate a perfect overlap: for each true stream, sum the box area over all frames (area = box area x number of frames)
      • Step 2: Calculate the percent difference between that perfect area and the actual IoU between the predicted and true streams
      (Figure: predicted vs. true streams)
  18. Evaluation 2: % Correctly Labeled
      • Step 1: For each frame of a predicted stream, determine the best-matching true marker label using the maximum IoU in that frame
      • Step 2: Calculate the percent of frames whose per-frame label matches the overall label the stream was given in the tracking phase
      (Figure: a predicted stream labeled Marker 2 with per-frame labels 0 0 2 2 2 1 2 2 2 scores 6/9 correct)
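A tiny sketch of the second metric, reusing the slide's worked example; the per-frame labels are assumed to come from the max-IoU matching in Step 1.

```python
import numpy as np

def percent_correctly_labeled(per_frame_labels, stream_label):
    """Fraction of frames whose best-IoU marker label agrees with the
    label assigned to the whole predicted stream."""
    labels = np.asarray(per_frame_labels)
    return float(np.mean(labels == stream_label))

# The slide's example: a stream labeled Marker 2, per-frame best matches
print(percent_correctly_labeled([0, 0, 2, 2, 2, 1, 2, 2, 2], 2))  # 0.666...
```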
  19. Results (figure only)
  20. Challenges
      • Unlike the ideal scenario, markers can leave the field of view (off-screen markers) or be hidden (occluded markers)
      • Even if the marker detection and tracking models perform well, these problems may negatively impact results, since at some point a marker may be assigned to the incorrect track
      (Figure: ideal scenario vs. off-screen markers vs. occluded markers)
  21. Next Steps
      • Detection
        - Tune marker detection thresholds and marker assignment thresholds
      • Kalman Filter
        - Tune initialization velocities, and acceleration and covariance matrices
        - Better initialization is known to produce better predictions
        - Explore non-linear methods (extended Kalman filters, particle filters)
      • Marker Detection Assignment to Kalman Tracks
        - Currently using Hungarian assignment; other options include probabilistic assignment and Markov chain Monte Carlo methods
      • Stitching
        - Tune parameters and the algorithm to better match disparate tracks together
  22. References
      [1] Bunyak F, Shiraishi N, Palaniappan K, Lever TE, Avivi-Arber L, Takahashi K. Development of semi-automatic procedure for detection and tracking of fiducial markers for orofacial kinematics during natural feeding. Conf Proc IEEE Eng Med Biol Soc. 2017;2017:580-583. doi:10.1109/EMBC.2017.8036891
      [2] Best MD, Nakamura Y, Kijak NA, et al. Semiautomatic marker tracking of tongue positions captured by videofluoroscopy during primate feeding. Conf Proc IEEE Eng Med Biol Soc. 2015;2015:5347-5350. doi:10.1109/EMBC.2015.7319599
      [3] Howe PDL, Holcombe AO. The effect of visual distinctiveness on multiple object tracking performance. Front Psychol. 2012;3:307. doi:10.3389/fpsyg.2012.00307
  23. Thank You!
      Funding Information:
      • National Center for Advancing Translational Sciences of the National Institutes of Health (UL1 TR000430)
      • JSPS Strategic Young Researcher Overseas Visits Program for Accelerating Brain Circulation (S2504)
      • JSPS KAKENHI (JP16K11589)
      Acknowledgements:
      • Dr. Naru Shiraishi, Niigata University, Japan (experimental procedure development and data collection)
      • Animal Research Center (ARC) staff at the University of Chicago
      Contact:
      • Saurabh Ghanekar, Principal Consultant, Kavi Global, saurabh@kaviglobal.com
      • Kazutaka Takahashi, Ph.D., Research Assistant Professor, Department of Organismal Biology and Anatomy, University of Chicago, kazutaka@uchicago.edu
