
Shiva Amiri, CEO, BioSymetrics, at MLconf Seattle 2017

Distributed Analytics and Machine Learning for Large-Scale Medical Image Processing:
The scale of data being generated in medicine and research can easily overwhelm typical analytic capabilities. This is particularly true with MRI/fMRI scanning, where: 1) large file sizes often preclude studies of the magnitude needed to overcome the inherent noise, 2) no gold-standard protocol currently exists for extracting standardized characteristics from MRI/fMRI files, and 3) traditional methods for group-wise comparison can often produce spurious findings.

Here we address these challenges with an easily deployable, scalable image processing pipeline that quickly permutes multiple options for fMRI/MRI processing and determines the optimal set of parameters for each study. Uniquely, our approach leverages the rapid model-building capabilities of our real-time machine learning software to iterate through normalization parameters for each disease class. Our optimized pipeline exceeded the classification accuracy seen in previous analyses of comparable scope and integrated easily with other medical data types (genome sequence, phenotypic, and metabolic data), allowing generation of more comprehensive disease classification models.
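
The idea of permuting processing options and keeping the best-scoring combination can be sketched as an exhaustive search over a grid. This is a minimal illustration only, not the pipeline described in the talk: the option names (`slice_timing`, `smoothing_fwhm_mm`, `motion_correction`), their values, and the `toy_score` function are all assumptions. In practice the scoring function would run the preprocessing, train a model, and return something like cross-validated accuracy.

```python
from itertools import product


def enumerate_pipelines(options):
    """Yield one dict per combination of preprocessing options."""
    names = list(options)
    for values in product(*(options[name] for name in names)):
        yield dict(zip(names, values))


def select_best(options, score_fn):
    """Score every combination and return (best_params, best_score)."""
    best_params, best_score = None, float("-inf")
    for params in enumerate_pipelines(options):
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical option grid: names and values are illustrative only.
OPTIONS = {
    "slice_timing": [True, False],
    "smoothing_fwhm_mm": [0, 4, 8],
    "motion_correction": [True, False],
}


def toy_score(params):
    """Placeholder for 'preprocess, train a model, return its accuracy'."""
    score = 0.5
    score += 0.1 if params["motion_correction"] else 0.0
    score += 0.05 if params["slice_timing"] else 0.0
    score -= abs(params["smoothing_fwhm_mm"] - 4) * 0.01
    return score


best_params, best_score = select_best(OPTIONS, toy_score)
```

The search cost grows multiplicatively with each added option, which is one reason fast model building matters for this kind of approach.
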
As the CEO of BioSymetrics Inc., Shiva is working on delivering a unique real-time machine learning technology for the analysis of massive data in the biomedical space. Prior to BioSymetrics Inc. she was Chief Product Officer of Real Time Data Solutions Inc. Prior to RTDS Inc. she led the Informatics and Analytics team at the Ontario Brain Institute, where they developed Brain-CODE, a large-scale neuroinformatics platform for the management, processing, and analytics of big data in neuroscience across the province of Ontario. Shiva is also the President and CEO of Modecular Inc., a Computational Biochemistry start-up company developing next-generation drug screening methodologies.
She previously led the British High Commission’s Science and Innovation team in Canada, where she facilitated research, innovation and commercialization between the UK and Canada. Shiva completed her D.Phil. (Ph.D.) in Computational Biochemistry at the University of Oxford and her undergraduate degree in Computer Science and Human Biology at the University of Toronto. Shiva is involved with several organisations including Let’s Talk Science and Shabeh Jomeh International.

The ability to standardize and pre-process imaging data for machine learning, no matter the source and type, and effectively combine it with other data types is a powerful capability and holds promise for the future of diagnostics and precision medicine.
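
One simple way to combine imaging features with another data type is to join per-patient feature vectors and put every feature on a common scale. The sketch below is an assumption-laden illustration, not the method used in the talk: the patient IDs, feature values, and the helper names `zscore_columns` and `combine_modalities` are hypothetical, and z-scoring stands in for whatever standardization a real pipeline would apply.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column to mean 0 and unit variance (constant columns become 0)."""
    columns = list(zip(*rows))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [
        [(value - m) / s if s > 0 else 0.0 for value, (m, s) in zip(row, stats)]
        for row in rows
    ]


def combine_modalities(*modalities):
    """Join per-patient feature lists on the patient IDs present in every modality."""
    shared = sorted(set.intersection(*(set(m) for m in modalities)))
    rows = [[value for m in modalities for value in m[pid]] for pid in shared]
    return shared, zscore_columns(rows)


# Hypothetical toy data: two imaging features and one genomic feature per patient.
imaging = {"p1": [0.2, 10.0], "p2": [0.4, 14.0], "p3": [0.6, 18.0]}
genomic = {"p1": [0.0], "p2": [1.0], "p3": [1.0], "p4": [0.0]}

patient_ids, features = combine_modalities(imaging, genomic)
```

Patient `p4` is dropped because it has no imaging features. Whether to drop or impute such patients is a design choice this sketch does not address.
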


  1. Distributed Analytics and Machine Learning for Large-Scale Medical Image Processing. Shiva Amiri, PhD. MLconf Seattle, 19 May 2017
  2. Challenges in biomedical analysis workflows
     Data Variety/Heterogeneity
       • Different data types: EHR/EMR, MRI/fMRI, EEG, EKG, chemistry
       • Disparate data sources across locations
     Lack of Scalability
       • Difficulty incorporating large-scale analytics with growing datasets
       • Adding new data types can impact results/performance
     Lack of Standards
       • No standards for processing or interpreting medical data
       • Noise and artifacts can be patient-, group-, or site-specific
  3. The analytic framework
  4. Types of biomedical data
       • Imaging: MRI/fMRI, Ultrasound
       • Streaming: Telemetry, EKG/EEG
       • ‘Omics: Genomics, Metabolomics
       • Clinical: HL7/FHIR, Patient DBs
       • Compounds: Efficacy, Trial Success
  5. Types of questions that can be asked
       • How can we better diagnose disorders using multiple data types?
       • How does a patient’s health fingerprint (medical record, genetic markers, medical imaging, wearables, etc.) better our understanding of disorders and improve treatments?
       • How can we standardize and prepare raw medical data such that the source and location of the data don’t bias the findings?
  6. A closer look at a large-scale imaging analytics project
  7. MRI & fMRI imaging
       • MRI scans contain structural information and are largely used for anatomical analysis
       • fMRI scans contain a temporal domain, and are used to study brain activity
  8. MRI & fMRI imaging
       • In collaboration with researchers at Columbia University
       • Lack of a standard pipeline for processing of MRI/fMRI data has led to conflicting findings globally
       • We wanted to focus on ensuring our models are robust and can handle varied data sources
  9. Preprocessing of MRI & fMRI files
     fMRI Processing
       • Slice timing correction
       • Motion correction
       • Co-registration to MRI
       • Spatial normalization
       • Smoothing
     MRI Processing
       • Reorientation
       • Skull subtraction
       • Co-registration to fMRI
       • Segment
  10. A model for diagnosis?
       • Region-specific correlations allow us to determine the relationship between brain regions in the fMRI
       [Figure panels: Control Patients (Avg); Autistic Patients (Avg)]
  11. A model for diagnosis?
  12. Exciting results we couldn’t replicate
  13. Letting the data decide
  14. Addressing Lack of Standards
       • Using 1074 Autistic and Control patients, we identified an optimal set of parameters for autism classification
       • Focus on accuracy, removal of confounders, and speed
       • Differences between algorithms were smaller than the differences due to parameter choices
  15. Addressing Lack of Standards
       • We can examine the patient-to-patient variability of each iteration (left), and use anatomical maps (right) to identify regions of protocol-dependent variability
       [Figure label: Standard Deviation]
  16. Addressing Lack of Standards
       • Network analysis allows us to identify the effect of confounders (like measurement site) on prediction performance
       • This allows us to pick the combination of parameters that best minimizes these effects for a given problem
  17. Integrating imaging with genetics ..
  18. Integrating Datasets
       • To test the utility of Augusta™ on integrated data we selected a single NIH data set having genomic, imaging, and phenotypic information
       • We selected the UCLA Sigman/Bookheimer ACE and ARRA data set for a pilot test
  19. Integrating Datasets
       • Features extracted from medical images can be compared using the results of genomic analysis (shown below for Autism Spectrum Disorder)
       [Figure labels: Higher in Var- Group; Higher in Var+ Group]
  20. Integrating Datasets
       • Combining MRI with genomic features allowed better prediction performance than with either alone
       • Adding metabolic features increased prediction performance (shown using data from ADNI, Alzheimer’s Disease)
  21. Thank you. Shiva Amiri, CEO. Recipient of SAP’s AI and the Enterprise ‘Most Innovation Solution’ award, March 2017
  22. Addressing Lack of Standards
       • Permutation and network analysis are used to find optimal processing parameters for each problem
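
The region-specific correlations mentioned on slide 10 are commonly computed as pairwise Pearson correlations between the averaged signal of each brain region over time. The sketch below assumes that interpretation. The region names and signal values are made up, and the slides do not say which correlation measure or brain atlas was actually used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (both must vary)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(series):
    """Map each region pair to the correlation of their time series.

    `series` is a dict of region name -> list of signal values over time.
    """
    regions = sorted(series)
    return {a: {b: pearson(series[a], series[b]) for b in regions} for a in regions}


# Hypothetical toy signals, one value per time point.
toy = {
    "region_a": [1.0, 2.0, 3.0, 4.0],
    "region_b": [2.0, 4.0, 6.0, 8.0],  # rises with region_a
    "region_c": [4.0, 3.0, 2.0, 1.0],  # falls as region_a rises
}

matrix = correlation_matrix(toy)
```

Each patient yields one such matrix, and its entries can then serve as features for a classifier or be averaged within a group for comparison.
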