Building an informatics solution to sustain AI-guided cell profiling with high-content microscopy imaging

Presentation at SLAS Europe 2019 in Barcelona on 28 June 2019.

High-content microscopy in automated laboratories presents many challenges for storing and processing data, and for building AI models to aid decision making. We have established an informatics system to serve a robotized cell profiling setup with incubators, liquid handling and high-content microscopy for microplates. The informatics system consists of computational infrastructure (CPUs, GPUs, storage), middleware (Kubernetes), an imaging database and software (OMERO), and a workflow system (Pachyderm) to perform online prioritization of new data and to automate the process from acquired images to continuously updated and deployed AI models. The AI methodologies include deep learning models trained on image data, and conventional machine learning models trained on data from Cell Painting experiments. The microservice architecture makes the system scalable and expandable, and a key objective is to improve screening and toxicity assessment using AI-aided intelligent experimental design.
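
The online prioritization of new data mentioned above can be illustrated with a minimal Python sketch. It is not the group's implementation: the `ImageRecord` fields, the toy `interestingness` score, and both thresholds are hypothetical, chosen only to show how a per-image score could route images to hot storage, cold storage, or discard.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical metadata for one acquired image (assumed fields)."""
    plate: str
    well: str
    focus_score: float  # assumed QC feature in [0, 1]
    cell_count: int     # assumed output of an upstream segmentation step


def interestingness(img: ImageRecord) -> float:
    """Toy score in [0, 1]: in-focus images that contain cells rank higher."""
    if img.cell_count == 0:
        return 0.0
    density = min(img.cell_count / 100.0, 1.0)
    return img.focus_score * density


def route(
    images: Iterable[ImageRecord],
    score: Callable[[ImageRecord], float] = interestingness,
    hot_threshold: float = 0.5,   # assumed cut-off for hot storage
    keep_threshold: float = 0.1,  # assumed cut-off below which data is dropped
) -> Dict[str, List[ImageRecord]]:
    """Assign each image to a storage tier based on its score."""
    tiers: Dict[str, List[ImageRecord]] = {"hot": [], "cold": [], "discard": []}
    for img in images:
        s = score(img)
        if s >= hot_threshold:
            tiers["hot"].append(img)
        elif s >= keep_threshold:
            tiers["cold"].append(img)
        else:
            tiers["discard"].append(img)
    return tiers
```

In a real deployment the score would presumably come from a trained model rather than a hand-written rule; passing it in as the `score` argument keeps the routing logic independent of that choice.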


  1. Building an informatics solution to sustain AI-guided cell profiling with high-content microscopy imaging Ola Spjuth <ola.spjuth@farmbio.uu.se> Department of Pharmaceutical Biosciences, Uppsala University www.pharmb.io
  2. Who are we? • Academic research group at Uppsala University • Background in computational pharmacology (AI/ML) • Good at e-infrastructure, big data (data engineering) • Setting up a high-content imaging lab for cell profiling Research group website: http://pharmb.io
  3. Objective: Accelerate drug discovery using AI and intelligent design of experiments. • Predict safety concerns • Explain drug mechanisms • Screen for new drugs
  4. Hypothesis revise Insight • Iterative • Flexible • Mostly manual • Slow Experiments Analysis and interpretation Traditional hypothesis testing • Retrospective analysis • Hopefully predictive • Expensive • Limited for hypothesis testing more Predictive modeling Database Data generation Traditional Processing Stream Processing Data Data Query request response Real-Time Analytics Data Results Model Prediction Modeling and prediction
  5. Data-driven science Stream Processing Real-Time Analytics Data Results Data Models Evaluation Prediction/ Insight Hard problem! Poor accuracy? Hypothesis Hypothesis test generate Generate new data
  6. Closing the loop: Intelligent experimentation Data Stream Processing Real-Time Analytics Data Results Current fact finding Analyze data in motion, before it is stored Continuous Analytics Results Intelligent design of experiments Experiments Scientist • What experiments should we do and how? • Can we reduce search space? • How do we store only interesting data? • Can we replace experiments with predictions? Automation Informatics
  7. Genetic or chemical perturbations Experiments in multi-well plates Imaging Features Hypotheses Convolutional Neural Network Predictions Cell painting: HCI with multiplexed dyes Bray et al. (2016). “Cell Painting, a High-Content Image-Based Assay for Morphological Profiling Using Multiplexed Fluorescent Dyes.” Nature Protocols 11 (9): 1757–74.
  8. Holographic live cell imaging • Quantitative phase-contrast microscopy • Holographic phase-shift imaging • Label-free, live cell imaging • Used inside incubator HoloMonitor system
  9. Main focus area: Drug/chemical profiling with AI modeling Explore profiling with AI/ML • Target identification • Mechanism-of-Action predictions • Pathway enrichment actin disruption microtubule destabilization aurora kinase inhibition DNA replication Eg5 inhibition protein degradation cholesterol lowering DNA damage epithelial kinase inhibition protein synthesis microtubule stabilization. Microscopy image Deep Neural Network MoA profile prediction Cell treatment • 2D monolayer, cell lines (U2OS, MCF-7, A549, RKO, …) • Integrate HCI with other data model
  10. Protein degradation Cholesterol-lowering DNA replication Microtubule stabilizer Actin disruptor Kinase inhibitor Classify images into biological mechanisms Kensert A, Harrison PJ, Spjuth O. Transfer learning with deep convolutional neural network for classifying cellular morphological changes. SLAS DISCOVERY: Advancing Life Sciences R&D. 24, 4 (2019)
  11. • Fluorescent LNPs (lipids) • Fluorescent Cargo (mRNA) • Fluorescent Product (protein) No LNPs Partial LNP uptake LNP uptake and mRNA decoding
  12. Make predictions using available data External data Data warehouse Design new experiments AI Modeling Publish data and models Manual wet lab Hypothesis Verify using external protocol Automated lab Carry out new experiments Analysis pipeline Vision: Intelligent systems for drug/chemical profiling Hypothesis Hypothesis test generate
  13. Automating our cell-based lab Fixed setup (version 1) • ImageXpress XLS (Molecular Devices) • Plate robot (Preciseflex) • Plate incubator (Liconic), barcode reader • BioMek 4000 liquid handling (Beckman Coulter) • Green Button Go lab automation software (Biosero) Observations: • Quick to get up and running • Suitable for fixed protocols • Dependent on vendors to solve problems • Not easy to expand or configure for us Open source setup (under construction) • HoloMonitor (Phase Holographic Imaging) • OT-2 liquid handling (OpenTrons) • Plate robots (under procurement) • Open source lab automation (to be decided) • More components… (to be decided) Our priorities: • Flexibility to expand/adapt • Open source or good APIs • Low cost, serviceable by us • Configurable by us Collaborators wanted!
  14. Robotized lab images Automating our data processing ImageDB Image viewer File system Metadata Files (images) https://github.com/pharmbio/imagedb Cold storage Hot storage Online, intelligent processing Cell profiles QC workflows Interestingness models HASTE CORE and Cell Profiler Pipeline https://github.com/HASTE-project/cellprofiler-pipeline Avoid storing uninteresting data
  15. Robotized lab Data scientists Empowering our data scientists ImageDB File system Metadata Files (images) Models CPU/GPU/HPC cloud Notebooks Data Models External users Services Public services Publish
  16. Managing our software ecosystem • Scientists require many different software tools • Difficult and time-consuming to manage dependencies • Software Containers • Offers isolation on application level, share operating system • Portable, fast, smaller than virtual machine images • Docker • Microservices • Decompose functionality into smaller, loosely coupled, on-demand services • Improve resilience, agile development • Easy to scale • Kubernetes • Manage a cluster of machines running containers
  17. Building pipelines of containers • A suitable way of using containers is connecting them into a (scientific) workflow • Goal: Reproducible, fault-tolerant, scalable execution • Lampa S et al. SciPipe: A workflow library for agile development of complex and dynamic bioinformatics pipelines. Gigascience. 8, 5 (2019) • Spjuth O et al. Approaches for containerized scientific workflows in cloud environments with applications in life science. PeerJ Preprints. 6, e27141v1 (2018) • Capuccini M, et al. MaRe: Container-Based Parallel Computing with Data Locality. ArXiv. 1808.02318 (2018) • Novella JA et al. Container-based bioinformatics with Pachyderm. Bioinformatics. 35, 5, 839-846. (2018) • Lampa S et al. Towards agile large-scale predictive modelling in drug discovery with flow-based programming design principles. Journal of Cheminformatics. 8, 67. (2016)
  18. Dealing with large scale data • High volume, relatively high velocity • Continuously process data, train models, serve models • Embrace scalable virtual infrastructures (cloud) and microservices (containers) GPU cluster CPU server Storage Cloud HPC Online processing
  19. AI modeling life cycle Model Development ML studio ML workflow automation Package & Deploy Models Model Serving Model management Model serving Monitoring Explore Data and Develop Models Train at scale Register Model and Metadata for Serving Package and Publish Run in operations Monitor Logging Integrate Data scientist Data Engineer Data Engineer Promote Model Ship Model www.scaleoutsystems.com In collaboration with:
  20. Integrate with our other AI services Site-of-metabolism and reaction types http://ptp.service.pharmb.io/ https://metpred.service.pharmb.io/draw/ Target (safety) profiles
  21. Implications: Continuous Analytics • We can handle the continuous data processing from instruments with robust, resilient data pipelines • We can continuously re-train models as data is updated • We can (soon) continuously publish data and models Data Traditional Processing Stream Processing Data Query request response Real-Time Analytics Data Results Continuous Analytics Results Intelligent design of experiments Experiments Scientist Agile research group of different competencies • Scientists have access to necessary infrastructure • Data stored in structured databases • DevOps roles, no dedicated sysadmin / developer
  22. http://haste.research.it.uu.se/
  23. Research group website: http://pharmb.io - Thank you - Funding:
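
The "continuously re-train models as data is updated" point in the Implications slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `IncrementalCentroidModel` is a hypothetical nearest-centroid classifier standing in for the deep learning and machine learning models named in the abstract, and the feature vectors and class labels are placeholders.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


class IncrementalCentroidModel:
    """Hypothetical nearest-centroid classifier updated batch by batch."""

    def __init__(self) -> None:
        self._sums: Dict[str, List[float]] = {}  # per-class feature sums
        self._counts: Dict[str, int] = {}        # per-class sample counts
        self.version = 0                         # bumped on each re-training

    def update(self, batch: Iterable[Tuple[Sequence[float], str]]) -> None:
        """Fold a new batch of (features, label) pairs into the model."""
        seen = 0
        for features, label in batch:
            if label not in self._sums:
                self._sums[label] = [0.0] * len(features)
                self._counts[label] = 0
            sums = self._sums[label]
            for i, value in enumerate(features):
                sums[i] += value
            self._counts[label] += 1
            seen += 1
        if seen:
            self.version += 1

    def predict(self, features: Sequence[float]) -> str:
        """Return the label whose centroid is closest (squared Euclidean)."""
        if not self._sums:
            raise ValueError("model has not been trained yet")

        def squared_distance(label: str) -> float:
            n = self._counts[label]
            return sum(
                (f - s / n) ** 2 for f, s in zip(features, self._sums[label])
            )

        return min(self._sums, key=squared_distance)
```

Each call to `update` plays the role of one re-training step triggered by new data; the `version` counter hints at how a freshly trained model could be registered before being served.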
