Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Machine learning for Science and Society

1,333 views

Published on

Oplæg på DBC's Data Science Day den 14. januar 2016. v/ professor Christian Igel, diku, (Datalogisk Institut, Københavns universitet).

Published in: Technology
  • Be the first to comment

  • Be the first to like this

Machine learning for Science and Society

  1. 1. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Faculty of Science Machine Learning for Science and Society Some recent results Christian Igel Department of Computer Science igel@diku.dk Slide 1/30
  2. 2. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Why machine learning? • Sometimes problems cannot be solved in the traditional way because • the designer’s knowledge is limited, and/or • the sheer complexity and variability precludes an accurate description. • However, large amounts of data describing the task are often available or can be automatically obtained – and machine learning can solve the problem. Slide 4/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk
  3. 3. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Why machine learning? • Sometimes problems cannot be solved in the traditional way because • the designer’s knowledge is limited, and/or • the sheer complexity and variability precludes an accurate description. • However, large amounts of data describing the task are often available or can be automatically obtained – and machine learning can solve the problem. Slide 5/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk Machine learning turns data into knowledge Machine learning automates the process of inductive inference: 1 Observe a phenomenon 2 Construct a model of that phenomenon 3 Make predictions using this model
  4. 4. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Alzheimer’s Disease progression Disease progression Cells healthy neurons formation of A¡ plaques and NFTs neuronal death NFTs A  plaques MRI structure normal size/shape normal size/shape abnormal size/shape (atrophy) MRI intensity normal variation abnormal variation abnormal variation Scale Slide 7/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk
  5. 5. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Hippocampal texture from structural MRI • Image analysis of brain scans • Segmentation of brain regions (hippocampi) • Extraction of image features • Classification using SVMs Sørensen, Igel, Hansen, Osler, Lauritzen, Rostrup, Nielsen. Early detection of Alzheimer’s disease using MRI hippocampal texture. Human Brain Mapping, 2016 Slide 8/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk
  6. 6. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E CADDementia Challenge • Computer-Aided Diagnosis of Dementia based on structural MRI data (CADDementia) competition • Goal: Diagnosis into Alzheimer’s disease (AD), mild cognitive impairment (MCI), and normal controls (NC) • No labelled training data was provided, we trained on freely available data sets. • We combined several biomarkers (cortical thickness measurements, hippocampal shape, hippocampal texture, and volumetric measurements). Sørensen, Pai, Anker, Balas, Lillholm, Igel, Nielsen. Dementia Diagnosis using MRI Cortical Thickness, Shape, Texture, and Volumetry. MICCAI 2014 – Challenge on Computer-Aided Diagnosis of Dementia Based on Structural MRI Data, 2014 Bron et al. Standardized evaluation of algorithms for computer-aided diagnosis of dementia based on structural MRI: The CADDementia challenge. NeuroImage, 2015 Slide 9/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk
  7. 7. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E CADDementia: Results Slide 10/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk • Our biomarker outperforms the state-of-the-art. • The texture score remains significant if combined with other biomarkers. • The biomarker generalizes across patient populations.
  8. 8. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Deep learning • Deep learning refers to ML architectures composed of multiple levels of non-linear transformations. • Idea: Extracting more and more abstract features from input data, learning more abstract representations. • Representations can be learned in a supervised and/or unsupervised manner. • Example: Convolutional neural networks (CNNs) are popular deep learning architectures. LeCun, Bottou, Bengio, Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998 Bengio. Learning Deep Architectures for AI. Foundations and Trends in Ma- chine Learning 2(1): 1–127, 2009 Slide 12/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk
  9. 9. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Breast cancer • Breast cancer is a frequent cause of death among women • Screening programs halve the risk of death1 and reduce mortality by 28-36 %2 • Still: 33 % of cancers are missed,3 70 % of referrals are false positives,4 and 25 % of cancers could have been detected earlier5 • Goal: More accurate image-based biomarkers allowing personalized breast cancer screening 1Otto et al. Cancer Epidemiol Biomarkers Prev 21, 2012 2Broeders et al. J Med Screen 19, 2012 3Karssemeijer et al. Radiology 227, 2003 4Yankaskas et al. Am J Roentgenol 177, 2001 5Timmers et al. Eur J Public Health, 2012 Slide 13/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk
  10. 10. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Breast density scoring Breast density is related to breast cancer risk, the risk of missing breast cancer, and the risk of false positive referral. input manual CNN Current work: Better biomarkers using CNN density scores and CNN texture scores Petersen, Nielsen, Diao, Karssemeijer, Lillholm. Breast Tissue Segmentation and Mammographic Risk Scoring Using Deep Learning. Digital Mammography / IWDM, 2014 Slide 14/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk
  11. 11. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Knee cartilage segmentation • Cartilage segmentation in knee MRI is the method of choice for quantifying cartilage deterioration. • Cartilage deterioration implies osteoarthritis, one of the major reasons for work disability in the western world. input manual CNN Prasoon, Petersen, Igel, Lauze, Dam, Nielsen. Deep Feature Learning for Knee Cartilage Segmentation Using a Triplanar Convolutional Neural Network. Medical Image Computing and Computer Assisted Intervention (MICCAI), LNCS 8150, 2013 Slide 16/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk
  12. 12. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Nuclear test ban treaty verification • CTBTO (Peparatory Commission for the Comprehensive Nuclear-Test-Ban Treaty Organisation) watches out for nuclear weapon tests • Verification of the comprehensive nuclear-test-ban treaty • 15 GB of data every day from world-wide sensor network • Data analysis must be partly automatic Tuma et al. Integrated optimization of long-range underwater signal detection, feature ex- traction, and classification for nuclear treaty monitoring. IEEE Transactions on Geoscience and Remote Sensing, to appear Slide 18/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk
  13. 13. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Typical tasks in astronomy Slide 20/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk “Can you group all detected objects into classes? For instance, into stars, galaxies, and other objects . . . ’ ?’ “Can you find weird objects (which look different from other ones)?” “Can you find weird objects (which look different from other ones)?” “I am interested in these distant galaxies. Can you find more of them? Can you find the most distant ones?”
  14. 14. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Big data Today’s and future projects • Sloan Digital Sky Survey (SDSS) = 50TB in total • Large Synoptic Survey Telescope (LSST) = 30TB per night • Low Frequency Array (LOFAR) = 250TB per night • Square Kilometer Array (SKA) = 250TB per second Huge amounts of photometric data, however, detailed spectroscopic follow-ups are expensive. Slide 21/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk
  15. 15. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Specific star formation rate • Goal: Predict specific star formation rate based on photometric data • Approach: Machine learning (ML) augmenting standard features with image texture features Slide 22/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk
  16. 16. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Texture features Original image Scale-space Curvedness Shape Index Shape Index weighted by Curvedness Slide 23/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk
  17. 17. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Predicting the specific star formation rate • ML achieves higher accuracy than physical modelling • New data structures and GPU programming allow to process huge amounts of data Stensbo-Smidt, Igel, Zirm, Steenstrup Pedersen. Nearest Neighbour Regression Outperforms Model-based Prediction of Specific Star Formation Rate. IEEE Big Data 2013, 2013 Steenstrup Pedersen, Stensbo-Smidt, Zirm, Igel. Shape Index Descriptors Applied to Texture-Based Galaxy Analysis. International Conference on Computer Vision (ICCV), 2013 Gieseke, Heinermann, Oancea, Igel. Buffer k-d Trees: Processing Massive Nearest Neighbor Queries on GPUs. International Conference on Machine Learning (ICML), 2014 Slide 24/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk
  18. 18. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Outline 1 Machine Learning 2 Medicine Diagnosis and Prognosis of Alzheimer’s Disease Analysis of Mammograms Knee Cartilage Segmentation 3 Nuclear-Test-Ban Treaty Monitoring 4 Astronomy 5 End Titles Slide 26/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk
  19. 19. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Shark image.diku.dk/shark Igel, Glasmachers, Heidrich-Meisner: Shark, Journal of Machine Learning Research 9:993–996, 2008 Gold Prize at Open Source Software Wold Challenge 2011 Slide 27/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk
  20. 20. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Buffer k-d trees http://bufferkdtree.readthedocs.org Gieseke, Heinermann, Oancea, Igel. Buffer k-d Trees: Processing Massive Nearest Neighbor Queries on GPUs. International Conference on Machine Learning (ICML), 2014 Slide 28/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk
  21. 21. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Industrial Data Analysis Service (IDAS) Slide 29/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk
  22. 22. U N I V E R S I T Y O F C O P E N H A G E N D E P A R T M E N T O F C O M P U T E R S C I E N C E Thanks Machine Learning Lab http://image.diku.dk/MLLab Image Section http://www.diku.dk/english/research/imagesection Special thanks: Sune Darkner, Fabian Gieseke, Mads Nielsen, Cosmin Oancea, Akshay Pai, Kim Steenstrup Pedersen, Kai Lars Polsterer, Lauge Sørensen, Kristoffer Stensbo-Smidt Slide 30/30 — Christian Igel — Machine Learning for Science and Society — igel@diku.dk

×