This course is devoted to the discussion of some interesting applications of machine learning methods to automatic analysis of medical images and physiologic signals. Medical images, acquired with specialized equipment, represent internal structures of the human body and/or processes within it. Among the most advanced technologies for acquiring such images are magnetic resonance imaging (MRI) and computed tomography (CT). Physiologic signals usually refer to cardiologic time series such as electrocardiograms (ECG), but can also represent other physiological data, for example, stride intervals of human gait.

Several important problems will be highlighted along with successful solutions involving machine learning methods, with examples drawn both from worldwide practice and from the author's own research. Descriptions of the basic principles of the algorithms used will provide a good opportunity to strengthen the knowledge acquired from the other courses of the school.


- 1. MACHINE LEARNING APPLICATIONS IN MEDICINE Olga Senyukova Graphics & Media Lab Faculty of Computational Mathematics and Cybernetics Lomonosov Moscow State University
- 2. Medical data Medical images Physiologic signals Other: narrative, textual, numerical, etc.
- 4. Medical images: X-Ray, MRI, CT, Ultrasound
- 5. Computed tomography (CT) 1972, Sir Godfrey Hounsfield X-rays are computer-processed to produce tomographic images https://en.wikipedia.org/wiki/CT_scan
- 6. Computed tomography (CT) insightci.com.au
- 7. Magnetic resonance imaging (MRI) 1973, Paul C. Lauterbur and Peter Mansfield Allows acquiring the image slice by slice Source: K. Toennies
- 8. Magnetic resonance imaging (MRI) www.raleighrad.com
- 9. Electrocardiography (ECG) 1901, Einthoven Recording of the electrical activity of the heart by electrodes placed on the body intensivecarehotline.com
- 10. RR time series RR time series (interbeat interval lengths) are widely used for ECG analysis www.elsevier.es
- 11. Human gait time series reylab.bidmc.harvard.edu
- 12. Analysis: what for? Normal or diseased? Where is the diseased area? What changes occur over time (especially after treatment)? Is a specific condition present (e.g., overtraining in an athlete)? … www.fresher.ru
- 13. Main tasks: images — Detection (e.g., of an aneurysm), Segmentation, Matching (Registration)
- 14. Main tasks: physiologic signals — Diagnostics (Healthy? Disease XXX? Disease YYY?), Template matching (Condition ZZZ: the same or not?)
- 15. Machine learning in medical imaging: challenges. Images are often 3D or 4D: the number of voxels and the number of extracted features is very large. The number of images for training is often limited: a "large" dataset typically means 100 to 1000 images — the "small sample size problem". Slide by D. Rueckert
- 16. Machine learning in medical imaging: challenges. Training data is expensive: annotation of images is resource intensive (manpower, cost, time); it is sometimes possible to augment training using unlabelled images. Training data is sometimes imperfect: training data may be wrongly labelled, e.g. diseases such as Alzheimer's require confirmation through pathology (difficult and costly to obtain). Slide by D. Rueckert
- 17. The InnerEye project Measuring brain tumors Localizing and identifying vertebrae Kinect for surgery Source: A. Criminisi & the InnerEye team @ M
- 18. Anatomy localization via regression forests A. Criminisi, et al. Med Image Analysis 2013
- 19. Decision forests Leo Breiman, 2001 A. Criminisi, J. Shotton (eds.). Decision Forests in Computer Vision and Medical Image Analysis // Advances in Computer Vision and Pattern Recognition. 2013. A decision forest consists of decision trees…
- 20. Decision tree Each internal node: a split (test) function Each leaf: class label (predictor) Source: A. Konushin
- 21. Regression tree: input value x, continuous label y • Green – high uncertainty • Red – low uncertainty • Thickness – the number of samples from the training set Source: A. Criminisi, J. Shotton
- 22. Regression tree: training • S_0 – the whole training set • S_j – the part of the training set reaching the jth node. Leaf model: p(y|x) = N(y; ȳ(x), σ_y²(x)). Source: A. Criminisi, J. Shotton
- 23. Regression tree: training. The split function parameters at the jth node maximize the information gain: θ_j* = argmax_θ I(S_j, θ), where I(S_j, θ) = Σ_{(x,y)∈S_j} log(σ_y(x)) − Σ_{i∈{L,R}} Σ_{(x,y)∈S_j^i} log(σ_y(x)). At each part (L, R): fit a line to the points (e.g. by least squares), so that for each x we have p(y|x) = N(y; ȳ(x), σ_y²(x)), where ȳ is the fitted (green) line. Source: A. Criminisi, J. Shotton
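The training criterion on this slide can be sketched in a few lines of Python. This is a minimal 1-D illustration with a constant-leaf Gaussian model (function and variable names here are my own; the slides use a line-fit leaf model, which only changes how σ_y(x) is computed):

```python
import math

def log_std(ys):
    """Log of the standard deviation of the labels (Gaussian leaf model).
    A small epsilon guards against zero variance."""
    n = len(ys)
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / n
    return 0.5 * math.log(var + 1e-12)

def information_gain(xs, ys, threshold):
    """I(S, theta): sum of log sigma_y over S minus the same sums over
    the left/right partitions induced by the split x < threshold."""
    left = [y for x, y in zip(xs, ys) if x < threshold]
    right = [y for x, y in zip(xs, ys) if x >= threshold]
    if not left or not right:
        return float("-inf")
    total = len(ys) * log_std(ys)
    return total - len(left) * log_std(left) - len(right) * log_std(right)

# Toy 1-D set with two flat pieces: the best split is at x = 0.7
xs = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
ys = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9]
best = max((t for t in xs), key=lambda t: information_gain(xs, ys, t))
```

Splitting between the two flat pieces shrinks the label spread in both children the most, so that threshold maximizes the gain.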
- 24. Example Source: A. Criminisi, J. Shotton
- 25. Example Source: A. Criminisi, J. Shotton
- 26. Different models. Predictor models: constant; polynomial and linear; probabilistic linear. Weak learners (split functions): axis-aligned; generic oriented hyperplane; conic section. Source: A. Criminisi, J. Shotton
- 27. Regression forest: input vector v = (x_1, …, x_d) ∈ ℝ^d. Source: A. Criminisi, J. Shotton
- 28. Randomness. Bagging: each tree is learned on a subset of the whole training set
- 29. Randomness. Randomized node optimization: optimize the split function at the jth node w.r.t. a small random subset T_j of parameter values: θ_j* = argmax_{θ∈T_j} I(S_j, θ). The split parameters θ_j = (φ_j, ψ_j, τ_j): φ_j selects features from the whole feature set, ψ_j is a weak learner type (axis-aligned, linear, etc.), τ_j is a set of splitting thresholds. Source: A. Criminisi, J. Shotton
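Randomized node optimization can be sketched as follows. This is an illustrative sketch (all names are mine), using variance reduction in place of the Gaussian information gain — both play the same role of scoring a candidate split:

```python
import random

def best_random_split(points, labels, n_candidates=10, rng=None):
    """Randomized node optimization: instead of searching all split
    parameters, score only a small random subset of (feature, threshold)
    candidates and keep the best one."""
    rng = rng or random.Random(0)
    n_features = len(points[0])

    def variance(ys):
        m = sum(ys) / len(ys)
        return sum((y - m) ** 2 for y in ys) / len(ys)

    best_score, best_params = float("-inf"), None
    for _ in range(n_candidates):
        f = rng.randrange(n_features)               # phi: feature selector
        t = rng.uniform(min(p[f] for p in points),  # tau: threshold
                        max(p[f] for p in points))
        left = [y for p, y in zip(points, labels) if p[f] < t]
        right = [y for p, y in zip(points, labels) if p[f] >= t]
        if not left or not right:
            continue
        # variance reduction stands in for the information gain
        score = variance(labels) - (len(left) * variance(left)
                + len(right) * variance(right)) / len(labels)
        if score > best_score:
            best_score, best_params = score, (f, t)
    return best_params, best_score

points = [(0.1, 5.0), (0.2, 4.0), (0.8, 6.0), (0.9, 5.0)]
labels = [1.0, 1.2, 9.0, 9.1]
params, score = best_random_split(points, labels)
```

Because each node sees a different random candidate set, the trees of the forest decorrelate even when trained on the same data.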
- 30. Forest vs tree Source: A. Criminisi, J. Shotton
- 31. The labeled database Source: A. Criminisi, J. Shotton
- 32. Anatomy localization. Key idea: all voxels in the image vote for the position of each organ. Each organ c ∈ C is defined by its 3D axis-aligned bounding box b_c = (b_c^L, b_c^R, b_c^A, b_c^P, b_c^H, b_c^F), where C = {liver, spleen, kidneyL, kidneyR, …}. Source: A. Criminisi, J. Shotton
- 33. Anatomy localization. For each input voxel v = (v_x, v_y, v_z) we obtain a distribution of relative displacements to the organ bounding box: d_c(v) = (d_c^L, d_c^R, d_c^A, d_c^P, d_c^H, d_c^F); f(v; θ) is the feature response. Source: A. Criminisi, J. Shotton
- 34. Context-rich features: mean intensity in randomly displaced boxes. Source: A. Criminisi, J. Shotton
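A minimal 2-D sketch of such a feature (the 3-D voxel case is analogous; the helper name and the toy image are my own illustrations, not part of the original system):

```python
import numpy as np

def box_mean(image, center, offset, half_size):
    """Mean intensity in a box displaced from `center` by `offset`.
    Boxes are clipped to the image border."""
    cy, cx = center[0] + offset[0], center[1] + offset[1]
    y0, y1 = max(cy - half_size, 0), min(cy + half_size + 1, image.shape[0])
    x0, x1 = max(cx - half_size, 0), min(cx + half_size + 1, image.shape[1])
    patch = image[y0:y1, x0:x1]
    return float(patch.mean()) if patch.size else 0.0

# Toy image: a bright square in the lower-right quadrant
img = np.zeros((20, 20))
img[10:, 10:] = 100.0

# For a pixel at (5, 5), a box displaced toward the bright region
# responds strongly; an un-displaced box does not.
f_displaced = box_mean(img, (5, 5), (10, 10), 2)
f_local = box_mean(img, (5, 5), (0, 0), 2)
```

The displacement lets a voxel "look at" distant anatomy, which is what makes these features context-rich rather than purely local.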
- 35. Features for CT and MRI. CT: we can rely on absolute intensity; MRI: only intensity differences make sense. Source: A. Criminisi, J. Shotton
- 36. Learning clinically useful information from medical images Biomedical Image Analysis Group Department of Computing Daniel Rueckert
- 37. Segmentation using registration Slide by D. Rueckert
- 38. Multi-atlas segmentation using classifier fusion
- 39. Multi-atlas segmentation using classifier fusion and selection
- 40. Selection of atlases. How to select the atlases most similar to our image? Atlases should be clustered by disease/population; manifold learning is used to discover such clusters efficiently
- 41. Manifold learning: find a manifold, then embed the data onto it (project to a lower-dimensional space). Source: D. Rueckert
- 42. Manifold learning: Laplacian eigenmaps. Given a graph G = (V, E): each vertex v_i corresponds to an image; each edge weight w_ij defines the similarity between images i and j. Define a diagonal matrix T containing the degree sums for each vertex: t_ii = Σ_j w_ij. Slide by D. Rueckert
- 43. Manifold learning: Laplacian eigenmaps. Normalized graph Laplacian: L = T^(−1/2)(T − W)T^(−1/2). The embedding minimizes Σ_{i,j} W_ij ||y_i − y_j||². The eigendecomposition of L provides the manifold coordinates y_i for each image. Source: D. Rueckert
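The Laplacian-eigenmap embedding above fits in a few lines of NumPy. A minimal sketch, assuming a symmetric similarity matrix W is given (the function name and toy similarity matrix are illustrative, not data from the paper):

```python
import numpy as np

def laplacian_eigenmap(W, dim=1):
    """Embed graph vertices from a symmetric similarity matrix W using
    the normalized Laplacian L = T^(-1/2) (T - W) T^(-1/2); eigenvectors
    of the smallest non-trivial eigenvalues give the coordinates."""
    t = W.sum(axis=1)                      # vertex degrees t_ii
    T_inv_sqrt = np.diag(1.0 / np.sqrt(t))
    L = T_inv_sqrt @ (np.diag(t) - W) @ T_inv_sqrt
    vals, vecs = np.linalg.eigh(L)         # eigenvalues ascending
    # skip the trivial constant eigenvector (eigenvalue ~ 0)
    return vecs[:, 1:1 + dim]

# Two tight clusters of "images" joined by one weak edge
W = np.array([
    [0, 1, 1, 0.01, 0, 0],
    [1, 0, 1, 0,    0, 0],
    [1, 1, 0, 0,    0, 0],
    [0.01, 0, 0, 0, 1, 1],
    [0, 0, 0, 1,    0, 1],
    [0, 0, 0, 1,    1, 0],
], dtype=float)
coords = laplacian_eigenmap(W)
# vertices in the same cluster land on the same side of the embedding
same_cluster = coords[0, 0] * coords[1, 0] > 0
cross_cluster = coords[0, 0] * coords[3, 0] < 0
```

The first non-trivial eigenvector separates the two clusters, which is exactly the property used to group atlases by disease/population.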
- 44. Manifold learning for multi-atlas segmentation. We have two sets of images: labeled (atlases) and unlabeled. We want to label all the unlabeled images. We can do it iteratively: label part of the unlabeled images using the most similar already-labeled ones; these images can then be used as atlases for the next iteration
- 45. Manifold learning for multi-atlas segmentation Wolz et al., Neuroimage, 2010
- 46. Example Wolz et al., Neuroimage, 2010
- 47. Segmentation of brain lesions in MRI Olga V. Senyukova, "Segmentation of blurred objects by classification of isolabel contours". Pattern Recognition, 2014. Data was provided by the Children's Clinical and Research Institute of Emergency Surgery and Trauma
- 48. The proposed algorithm. Each MRI slice is processed separately. In order to improve speed and robustness, the regions containing lesions can be specified manually; lesions inside these regions are segmented automatically
- 49. Algorithm overview: input region → isolabel contours I(x,y) = const → closed isolabel contours → nonlinear SVM classification
- 50. Isolabel contours. In geography, each isolabel contour (one color) has constant height: f(x,y) = h. In image processing, each isolabel contour (one color) has constant intensity: f(x,y) = I
- 51. How to distinguish lesion contours? Visually we can do it easily! Let’s use the same set of features for automatic classification of isolabel contours
- 52. Features of isolabel contours. In order to distinguish isolabel contours delineating lesions we propose 4 features: I_mean; I_mean inside the contour / I_mean inside the bounding box; I_max − I_min; I_variance
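A minimal sketch of computing these four features for one contour on a 2-D slice, assuming the contour interior is given as a boolean mask (the function name and toy data are my own illustration):

```python
import numpy as np

def contour_features(image, mask):
    """The four isolabel-contour features:
      1) mean intensity inside the contour,
      2) mean inside / mean inside the bounding box,
      3) intensity range I_max - I_min,
      4) intensity variance."""
    inside = image[mask]
    ys, xs = np.nonzero(mask)
    bbox = image[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    i_mean = float(inside.mean())
    return [
        i_mean,
        i_mean / float(bbox.mean()),
        float(inside.max() - inside.min()),
        float(inside.var()),
    ]

# Toy slice: a uniform bright region inside a darker background
img = np.full((10, 10), 10.0)
img[3:7, 3:7] = 50.0
mask = np.zeros((10, 10), dtype=bool)
mask[3:7, 3:7] = True
phi = contour_features(img, mask)   # the [phi1, phi2, phi3, phi4] vector
```

Each contour thus maps to a 4-dimensional feature vector [ɸ1, ɸ2, ɸ3, ɸ4], which is what the SVM on the following slides classifies.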
- 53. Labeled training base. Various regions on many images: a user clicks on lesion contours, which get the label "lesion"; all other isolabel contours automatically get "non-lesion". Each contour is represented by a feature vector [ɸ1, ɸ2, ɸ3, ɸ4] labeled "lesion" or "non-lesion"
- 54. Binary classification via SVM. We have a binary classification task: each isolabel contour belongs to one of two classes, lesion or non-lesion. One of the best classifiers is SVM – Support Vector Machine: original linear SVM: Vladimir Vapnik, Alexey Chervonenkis, 1963; applying a kernel trick results in nonlinear SVM: Bernhard Boser, Isabelle Guyon, Vladimir Vapnik, 1992
- 55. Linear SVM. Positive samples satisfy w·x_i + b ≥ +1 and negative samples w·x_i + b ≤ −1, i.e. y_i(w·x_i + b) ≥ 1. Maximizing the margin 2/||w|| means solving a quadratic optimization problem: minimize ½ wᵀw subject to y_i(w·x_i + b) ≥ 1. The solution is a hyperplane with w = Σ_i α_i y_i x_i, where the x_i with nonzero α_i are the support vectors and α_i are the learned weights
- 56. Nonlinear SVM. For linearly separable data linear SVM is excellent. What about data that is not linearly separable?.. We can make it linearly separable by mapping it to a higher-dimensional space
- 57. Nonlinear SVM: kernel trick. Instead of Σ_i α_i y_i (x_i·x) + b we have Σ_i α_i y_i K(x_i, x) + b, where K(x_i, x_j) = φ(x_i)·φ(x_j). For classification of isolabel contours I use nonlinear SVM with the RBF (radial basis function) kernel K(x_i, x_j) = exp(−γ ||x_i − x_j||²)
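The kernelized decision function can be sketched directly. Note the support vectors, weights α_i and bias b below are hand-picked for illustration, not the output of an actual training run:

```python
import math

def rbf_kernel(a, b, gamma=1.0):
    """K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)"""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def decision(x, support_vectors, labels, alphas, b, gamma=1.0):
    """Kernelized decision function: sum_i alpha_i y_i K(x_i, x) + b.
    The sign of the result gives the predicted class."""
    s = sum(a * y * rbf_kernel(sv, x, gamma)
            for sv, y, a in zip(support_vectors, labels, alphas))
    return s + b

# Two hand-picked "support vectors", one per class
svs = [(0.0, 0.0), (2.0, 2.0)]
ys = [+1, -1]
alphas = [1.0, 1.0]
score_near_pos = decision((0.1, 0.0), svs, ys, alphas, b=0.0)
score_near_neg = decision((1.9, 2.0), svs, ys, alphas, b=0.0)
```

With the RBF kernel each support vector contributes a localized "bump", so a test point is pulled toward the class of the nearby support vectors — exactly the implicit higher-dimensional separation described above.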
- 58. Ensemble-based analysis of RR and gait time series. Olga Senyukova; Valeriy Gavrishchaka, Department of Physics, West Virginia University. Springer, 2013, 2015
- 59. RR and gait time series Normal? Huntington’s disease? Parkinson’s disease? … Normal? Arrhythmia? Congestive heart failure? …
- 60. Ensemble learning techniques. An ensemble can work better than a single classifier: base classifier 1 (accuracy: 0.61), base classifier 2 (accuracy: 0.73), …, base classifier N (accuracy: 0.65) → ensemble of classifiers (accuracy: 0.9)
- 61. AdaBoost Freund and Schapire, 1997. On each iteration it focuses on the hardest-to-classify samples
- 62. AdaBoost. Training data x_i, i = 1, …, N; labels y_i ∈ {−1; +1}. Initial weights of all N items: w_i^(0) = 1/N. M iterations, from m = 1 to M: find T_m = argmin_{T_j} ε_m, where ε_m = Σ_i w_i^(m) [y_i ≠ T_j(x_i)]; if ε_m ≥ 1/2 then stop; set α_m = ½ log((1 − ε_m)/ε_m); update w_i^(m+1) = w_i^(m) exp(−α_m y_i T_m(x_i)) / Z_m. Classifier output: H(x) = sign(Σ_{m=1}^M α_m T_m(x))
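The update rules above can be sketched as a short training loop. This is a minimal discrete-AdaBoost illustration over a fixed pool of threshold stumps (all names and the toy data are mine):

```python
import math

def adaboost(stumps, X, y, M=10):
    """Discrete AdaBoost over a fixed pool of weak classifiers
    (each stump maps a sample to +1/-1). Returns (alpha, stump) pairs."""
    N = len(X)
    w = [1.0 / N] * N
    ensemble = []
    for _ in range(M):
        # pick the weak classifier with minimal weighted error
        errors = [sum(wi for wi, xi, yi in zip(w, X, y) if h(xi) != yi)
                  for h in stumps]
        m = min(range(len(stumps)), key=lambda j: errors[j])
        eps = errors[m]
        if eps >= 0.5:
            break
        alpha = 0.5 * math.log((1 - eps) / max(eps, 1e-12))
        ensemble.append((alpha, stumps[m]))
        # re-weight: up-weight misclassified samples, then normalize (Z_m)
        w = [wi * math.exp(-alpha * yi * stumps[m](xi))
             for wi, xi, yi in zip(w, X, y)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    return 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1

# Toy 1-D data and threshold stumps
X = [0.0, 1.0, 2.0, 3.0]
y = [1, 1, -1, -1]
stumps = [lambda x, t=t: 1 if x < t else -1 for t in (0.5, 1.5, 2.5)]
model = adaboost(stumps, X, y)
preds = [predict(model, x) for x in X]
```

The exponential re-weighting is what makes each subsequent round focus on the currently misclassified samples.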
- 63. Good classifier example
- 64. Iteration 1 of 3 T1
- 65. Iteration 2 of 3 T2
- 66. Iteration 3 of 3 STOP T3
- 67. Final model: sign[0.42·T_1(x) + 0.70·T_2(x) + 0.72·T_3(x)]
- 68. Ensemble decomposition learning. We apply an ensemble-based classifier H(x) = Σ_{m=1}^M α_m T_m(x) to a vector x. Each x can be described by its ensemble decomposition vector (EDL vector) D(x) = [α_1 T_1(x), α_2 T_2(x), …, α_M T_M(x)]. We can classify data points by comparing their EDL vectors
- 69. EDL: learning. All available data is labeled «normal/abnormal». Indicators from nonlinear dynamics (MSE, DFA, …) serve as base classifiers, and AdaBoost builds a general «normal/abnormal» ensemble of classifiers: α_1·MSE_1 + α_2·DFA_2 + … + α_N·MSE_N. Applying the ensemble to a training example x, each base classifier votes +1 (normal) or −1 (abnormal), which gives the EDL vector D(x) = [α_1·(±1), α_2·(±1), …, α_M·(±1)]
- 70. EDL: testing. Applying the ensemble to a testing example y gives its EDL vector D(y) = [α_1·(±1), α_2·(±1), …, α_M·(±1)]. Is D(x) ≈ D(y)? If yes, x and y represent the same condition (x = y); if no, they do not (x ≠ y). In a multi-class classification problem the class of y is the class C_k of the training example with the closest EDL vector: C_k : min_i ||D(x_i) − D(y)||
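The EDL learning and testing steps above can be sketched together. The ensemble below is hand-built with illustrative α values and thresholds, not fitted indicators from real RR data:

```python
import math

def edl_vector(ensemble, x):
    """Ensemble decomposition (EDL) vector: the individual weighted
    votes alpha_m * T_m(x) rather than just their signed sum."""
    return [a * h(x) for a, h in ensemble]

def classify_by_edl(ensemble, train, train_classes, x):
    """Multi-class classification via the nearest EDL vector among
    labeled training examples (Euclidean distance; illustrative)."""
    dx = edl_vector(ensemble, x)
    def dist(z):
        dz = edl_vector(ensemble, z)
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(dx, dz)))
    i = min(range(len(train)), key=lambda k: dist(train[k]))
    return train_classes[i]

# A hand-built "normal/abnormal" ensemble of two 1-D indicators
ensemble = [
    (0.7, lambda x: 1 if x < 1.0 else -1),
    (0.3, lambda x: 1 if x < 2.0 else -1),
]
train = [0.5, 1.5, 2.5]
train_classes = ["normal", "disease_A", "disease_B"]
cls = classify_by_edl(ensemble, train, train_classes, 1.4)
```

Even though the ensemble was only trained for a binary normal/abnormal decision, the pattern of its individual votes carries enough information to distinguish several classes — which is the point of ensemble decomposition learning.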
- 71. Results CHF/Arrhythmia classification Real data from http://www.physionet.org/physiobank
- 72. Thank you for your attention! knizhnayaraduga.ru
