Exascale Computing and
Experimental Sensor Data
Overview given at Brookhaven National Laboratory
April 18 2014
Joel Saltz
Stony Brook University
joel.saltz@stonybrook.edu
Integrate Information from
Sensors, Images, Cameras
• Multi-dimensional spatial-temporal datasets
– Radiology and Microscopy Image Analyses
– Oil Reservoir Simulation/Carbon Sequestration/Groundwater Pollution
Remediation
– Biomass monitoring and disaster surveillance using multiple types of
satellite imagery
– Weather prediction using satellite and ground sensor data
– Analysis of Results from Large Scale Simulations
– Square Kilometer Array
– Google Self Driving Car
• Correlative and cooperative analysis of data from multiple sensor
modalities and sources
• These applications are equivalent from the standpoint of data access
patterns – need to develop a new generation of data skeletons/mini-apps/data dwarfs
Spatio-temporal Sensor Integration,
Analysis, Classification
• Multi-scale material/tissue structural, molecular, functional
characterization. Design of materials with specific structural, energy
storage properties, brain, regenerative medicine, cancer
• Integrative multi-scale analyses of the earth, oceans, atmosphere, cities,
vegetation etc – cameras and sensors on satellites, aircraft, drones, land
vehicles, stationary cameras
• Digital astronomy
• Hydrocarbon exploration, exploitation, pollution remediation
• Aerospace – wind tunnels, acquisition of data during flight
• Solid printing integrative data analyses
• Autonomous vehicles, e.g. self driving cars
• Data generated by numerical simulation codes – PDEs, particle methods
• Fit model with data
Typical Computational/Analysis Tasks
Spatio-temporal Sensor Integration, Analysis, Classification
• Data Cleaning and Low Level Transformations
• Data Subsetting, Filtering, Subsampling
• Spatio-temporal Mapping and Registration
• Object Segmentation
• Feature Extraction
• Object/Region/Feature Classification
• Spatio-temporal Aggregation
• Diffeomorphism type mapping methods (e.g. optimal
mass transport)
• Particle filtering/prediction
• Change Detection, Comparison, and Quantification
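Three of the tasks listed above (object segmentation, feature extraction, classification) can be sketched on a toy image. This is a minimal illustration only, assuming a small grayscale image held as a list of lists, a fixed intensity threshold, and 4-connected flood fill. The function names and the area-based classification rule are hypothetical and are not taken from any pipeline described in the talk.

```python
from collections import deque

def segment(image, threshold):
    """Label 4-connected foreground regions (pixels >= threshold).

    Returns a dict mapping label -> list of (row, col) pixels.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = {}
    next_label = 1
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            objects[next_label] = pixels
            next_label += 1
    return objects

def features(image, pixels):
    """Per-object features: area, centroid, mean intensity."""
    area = len(pixels)
    cy = sum(y for y, _ in pixels) / area
    cx = sum(x for _, x in pixels) / area
    mean = sum(image[y][x] for y, x in pixels) / area
    return {"area": area, "centroid": (cy, cx), "mean_intensity": mean}

def classify(feat, min_area=3):
    """Toy rule: label objects by area (the cutoff is an arbitrary assumption)."""
    return "large" if feat["area"] >= min_area else "small"

image = [
    [0, 9, 9, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 0, 0, 0, 7],
    [0, 0, 0, 0, 0],
]
objects = segment(image, threshold=5)
summary = {label: features(image, px) for label, px in objects.items()}
labels = {label: classify(f) for label, f in summary.items()}
```

Here the 2x2 block and the isolated pixel come out as two separate objects; a real pipeline would replace the threshold and the area rule with domain-specific methods.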
Detect and track changes in data during production
Invert data for reservoir properties
Detect and track reservoir changes
Assimilate data & reservoir properties into the evolving reservoir model
Use simulation and optimization to guide future production
Coupled data acquisition, data analysis, modeling, prediction and
correction – data assimilation, particle filtering etc.
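The coupled acquire, predict, correct loop described above is the pattern a particle filter follows. Below is a minimal bootstrap particle filter sketch, assuming a scalar state with random-walk dynamics and Gaussian observation noise. The model, the noise levels, the observation values, and the function name are illustrative assumptions and do not represent the reservoir model from the talk.

```python
import math
import random

def particle_filter_step(particles, observation, rng,
                         process_std=0.5, obs_std=1.0):
    """One predict / weight / resample cycle of a bootstrap particle filter.

    Assumed model (illustrative only):
        x_t = x_{t-1} + N(0, process_std^2)
        y_t = x_t     + N(0, obs_std^2)
    Returns (resampled_particles, normalized_weights, weighted_mean_estimate).
    """
    # Predict: propagate each particle through the random-walk dynamics.
    predicted = [p + rng.gauss(0.0, process_std) for p in particles]
    # Weight: Gaussian likelihood of the observation given each particle.
    raw = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2)
           for p in predicted]
    total = sum(raw)
    if total == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in raw]
    estimate = sum(w * p for w, p in zip(weights, predicted))
    # Resample: draw particles in proportion to their weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, weights, estimate

rng = random.Random(0)  # fixed seed so the run is repeatable
particles = [rng.uniform(-5.0, 5.0) for _ in range(500)]
observations = [1.0, 1.2, 0.9, 1.1, 1.0]  # made-up sensor readings
estimates = []
for y in observations:
    particles, weights, estimate = particle_filter_step(particles, y, rng)
    estimates.append(estimate)
```

Each incoming observation pulls the particle cloud toward states consistent with the data, which is the "correction" step of the loop.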
Future State
• 100K – 1M pathology slides/hospital/year
• 2GB compressed per slide
• 1-10 slides used for Pathologist computer
aided diagnosis
• 100-10K slides used in hospital Quality control
• Groups of 100K+ slides used for clinical
research studies -- Combined with molecular,
outcome data
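The figures above imply a yearly storage volume per hospital that is worth working out. A back-of-the-envelope calculation, assuming decimal units (1 TB = 1000 GB) and treating the 2 GB compressed size as uniform across slides:

```python
GB_PER_SLIDE = 2                        # compressed size per slide, from the list above
SLIDES_PER_YEAR = (100_000, 1_000_000)  # low and high estimates per hospital

def yearly_storage_tb(slides, gb_per_slide=GB_PER_SLIDE):
    """Compressed storage per hospital per year, in terabytes (1 TB = 1000 GB)."""
    return slides * gb_per_slide / 1000

low_tb, high_tb = (yearly_storage_tb(n) for n in SLIDES_PER_YEAR)
print(f"{low_tb:,.0f} TB to {high_tb:,.0f} TB per hospital per year")
```

Under these assumptions the range is 200 TB to 2 PB per hospital per year, and a 100K-slide research cohort alone is about 200 TB.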
Brain Tumor Pipeline Scaling on GT/ORNL NSF
Keeneland (100 Nodes)
Runtime Support Objectives
• Coordinated mapping of data and computation to
complex memory hierarchies
• Hierarchical work assignment with flexibility capable
of dealing with data dependent computational
patterns, fluctuations in computational speed
associated with power management, faults
• Linked to comprehensible programming model –
model targeted at abstract application class but not
to application domain (In the sensor, image,
camera case -- Region Templates)
• Software stack including coordinated
compiler/runtime support/autotuning frameworks
HPC Segmentation and Feature Extraction
Pipeline
Tony Pan, George Teodoro,
Tahsin Kurc and Scott Klasky
Region Templates
• Provides a generic container template for common data structures, such
as points, arrays, regions, and object sets, within a spatial and temporal
bounding box
• Data region object is a storage materialization of data types and stores
the data elements in the region contained by a region template instance;
region template instance may have multiple data regions.
• Allows for different data I/O, storage, and management strategies and
implementations, while providing a homogeneous, unified interface to the
application developer.
• Application operations interact with data regions and region templates to
store and retrieve data elements, rather than explicitly handling the
management, staging, and distribution of the data elements.
• Current implementations on nodes with multi-core CPUs and GPUs,
distributed memory storage, and high bandwidth disk I/O.
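The container idea described above can be sketched in a few lines. This is a hypothetical Python illustration of the concept only, not the Region Templates API: the class names (`BoundingBox`, `StorageBackend`, `InMemoryBackend`, `DataRegion`, `RegionTemplate`) and all method names are assumptions chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Protocol

@dataclass(frozen=True)
class BoundingBox:
    """Spatial extent (x0, y0, x1, y1) plus a time interval (t0, t1)."""
    x0: float
    y0: float
    x1: float
    y1: float
    t0: float
    t1: float

    def contains(self, other: "BoundingBox") -> bool:
        return (self.x0 <= other.x0 and self.y0 <= other.y0
                and other.x1 <= self.x1 and other.y1 <= self.y1
                and self.t0 <= other.t0 and other.t1 <= self.t1)

class StorageBackend(Protocol):
    """Pluggable storage strategy (hypothetical interface)."""
    def write(self, key: str, data: Any) -> None: ...
    def read(self, key: str) -> Any: ...

class InMemoryBackend:
    """Simplest possible backend; others could target disk or remote memory."""
    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def write(self, key: str, data: Any) -> None:
        self._store[key] = data

    def read(self, key: str) -> Any:
        return self._store[key]

@dataclass
class DataRegion:
    """Storage materialization of one data type within a bounding box."""
    name: str
    box: BoundingBox
    backend: StorageBackend

    def store(self, data: Any) -> None:
        self.backend.write(self.name, data)

    def retrieve(self) -> Any:
        return self.backend.read(self.name)

@dataclass
class RegionTemplate:
    """Container for several data regions inside one spatio-temporal box."""
    box: BoundingBox
    regions: Dict[str, DataRegion] = field(default_factory=dict)

    def add_region(self, region: DataRegion) -> None:
        if not self.box.contains(region.box):
            raise ValueError("data region lies outside the template bounding box")
        self.regions[region.name] = region

    def get(self, name: str) -> DataRegion:
        return self.regions[name]

tile = BoundingBox(0, 0, 4096, 4096, 0, 0)
template = RegionTemplate(tile)
backend = InMemoryBackend()
template.add_region(DataRegion("rgb", tile, backend))
template.add_region(
    DataRegion("nuclei_mask", BoundingBox(0, 0, 1024, 1024, 0, 0), backend))
template.get("nuclei_mask").store([[0, 1], [1, 1]])
mask = template.get("nuclei_mask").retrieve()
```

The application code only calls `store` and `retrieve`; swapping `InMemoryBackend` for another backend would not change it, which is the point of the unified interface.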
Region Template: Preliminary
Experimental Evaluation
• Experimentally evaluated using pathology image analysis on the
Keeneland system
• This application consists of a pipeline with Segmentation and Feature
Computation Stages, and each of these stages is internally divided into
finer-grained tasks for better scheduling on heterogeneous CPU-GPU
equipped machines.
Large Scale Data Management
• Represented by a complex data model capturing
multi-faceted information including markups,
annotations, algorithm provenance, specimen, etc.
• Support for complex relationships and spatial
query: multi-level granularities, relationships
between markups and annotations, spatial and
nested relationships
• Highly optimized spatial query and analyses
• Implemented in a variety of ways including
optimized CPU/GPU, Hadoop/HDFS and IBM DB2
• Supported by two NLM R01 grants – Saltz/Foran
Spatial Centric – Sensor Data Feature “GIS”
Point query: human marked point
inside a nucleus
Window query: return markups
contained in a rectangle
Spatial join query: algorithm
validation/comparison
Containment query: nuclear feature
aggregation in tumor regions
Fusheng Wang
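The four query types above can be illustrated with axis-aligned rectangles. This is a minimal sketch, assuming each markup is reduced to its bounding box and scanned linearly; real markups are polygons, and a production system would use a spatial index. The `Rect` type, the function names, and the sample data are hypothetical.

```python
from typing import NamedTuple

class Rect(NamedTuple):
    x0: float
    y0: float
    x1: float
    y1: float

def contains_point(r, x, y):
    return r.x0 <= x <= r.x1 and r.y0 <= y <= r.y1

def contains_rect(outer, inner):
    return (outer.x0 <= inner.x0 and outer.y0 <= inner.y0
            and inner.x1 <= outer.x1 and inner.y1 <= outer.y1)

def intersects(a, b):
    return a.x0 <= b.x1 and b.x0 <= a.x1 and a.y0 <= b.y1 and b.y0 <= a.y1

def point_query(markups, x, y):
    """Point query: ids of markups that contain a marked point."""
    return [k for k, r in markups.items() if contains_point(r, x, y)]

def window_query(markups, window):
    """Window query: ids of markups fully contained in a rectangle."""
    return [k for k, r in markups.items() if contains_rect(window, r)]

def spatial_join(a, b):
    """Spatial join: pairs of intersecting markups from two result sets."""
    return [(i, j) for i, ra in a.items()
            for j, rb in b.items() if intersects(ra, rb)]

def containment_count(regions, markups):
    """Containment query: number of markups inside each region."""
    return {name: len(window_query(markups, reg))
            for name, reg in regions.items()}

# Nuclei reported by two hypothetical segmentation algorithms.
algo_a = {"a1": Rect(0, 0, 2, 2), "a2": Rect(5, 5, 7, 7),
          "a3": Rect(10, 10, 11, 11)}
algo_b = {"b1": Rect(1, 1, 3, 3), "b2": Rect(20, 20, 21, 21)}
tumor_regions = {"tumor_1": Rect(0, 0, 8, 8)}

hit = point_query(algo_a, 1.0, 1.0)
in_window = window_query(algo_a, Rect(4, 4, 12, 12))
pairs = spatial_join(algo_a, algo_b)
counts = containment_count(tumor_regions, algo_a)
```

The spatial join is the one used for algorithm comparison: pairs of overlapping objects from two result sets are the candidates whose agreement is then measured.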
Algorithm Validation: Intersection between Two
Result Sets (Spatial Join)
PAIS: Example Queries
PAIS (Pathology Analytical Imaging Standards)
• PAIS Logical Model
– 62 UML classes
– markups, annotations, imageReferences, provenance
• PAIS Data Representation
– XML (compressed) or HDF5
• PAIS Databases
– loading, managing, querying and sharing data
– Native XML DBMS or RDBMS + SDBMS
[Figure: PAIS domain model, UML class diagram. Classes shown: Annotation, GeometricShape, Calculation, Observation, Specimen, ImageReference, Provenance, User, PAIS, Equipment, Group, AnatomicEntity, Subject, Field, Project, MicroscopyImageReference, DICOMImageReference, TMAImageReference, Markup, Inference, Region, WholeSlideImageReference, Patient, Surface, Collection, AnnotationReference.]
VLDB 2012, 2013
Spatial Query, Change Detection, Comparison, and
Quantification
Soft real time and streaming Sensor
Data Analysis, Event Detection,
Decision Support
• Integrated analyses of patient data – physiological
streams, labs, medications, notes, Radiology, Pathology
images, mobile health data feeds
• High frequency trading, arbitrage
• Real-time monitoring of earthquakes, control of oilfields
• Control of industrial plants, aircraft engines
• Fusion – data capture, control, prediction of
disruptions
• Internet of things
• Twitter feeds
• Intensive care alarms
Typical Computational Analysis Tasks
Streaming Sensor Data Analysis, Event Detection, Decision
Support
• Prediction algorithms – Kalman, particle filtering
• Machine learning algorithms on aggregated data
to develop model, use of model on streaming
data for decision support
• Searching for rare events
• Statistical algorithms to distinguish signal from
noise
• On the fly integration of multiple complementary
data streams
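As one concrete instance of the prediction algorithms listed above, a scalar Kalman filter can be applied to a stream one reading at a time. This is a minimal sketch, assuming a slowly drifting level (random-walk) signal model; the noise variances and the sample readings are made-up illustrative values.

```python
def kalman_stream(readings, process_var=1e-3, obs_var=0.25,
                  init_mean=0.0, init_var=1.0):
    """Yield (mean, variance) after each reading from a scalar Kalman filter.

    Assumed model: x_t = x_{t-1} + w,  w ~ N(0, process_var)
                   y_t = x_t + v,      v ~ N(0, obs_var)
    """
    mean, var = init_mean, init_var
    for y in readings:
        # Predict: the state estimate carries over; uncertainty grows.
        var += process_var
        # Update: blend the prediction and the reading via the Kalman gain.
        gain = var / (var + obs_var)
        mean += gain * (y - mean)
        var *= (1.0 - gain)
        yield mean, var

readings = [1.1, 0.9, 1.0, 1.2, 0.8, 1.0]  # made-up sensor stream
states = list(kalman_stream(readings))
final_mean, final_var = states[-1]
```

Because the filter is written as a generator, it consumes readings as they arrive and keeps only the current mean and variance, which suits soft real-time use.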

Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowgargpaaro
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.pptibrahimabdi22
 

Recently uploaded (20)

Data Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdfData Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdf
 
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit RiyadhCytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
Sequential and reinforcement learning for demand side management by Margaux B...
Sequential and reinforcement learning for demand side management by Margaux B...Sequential and reinforcement learning for demand side management by Margaux B...
Sequential and reinforcement learning for demand side management by Margaux B...
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
 
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
PLE-statistics document for primary schs
PLE-statistics document for primary schsPLE-statistics document for primary schs
PLE-statistics document for primary schs
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 

Exascale Computing and Experimental Sensor Data

  • 1. Exascale Computing and Experimental Sensor Data Overview given at Brookhaven National Laboratory April 18 2014 Joel Saltz Stony Brook University joel.saltz@stonybrook.edu
  • 2. Integrate Information from Sensors, Images, Cameras • Multi-dimensional spatial-temporal datasets – Radiology and Microscopy Image Analyses – Oil Reservoir Simulation/Carbon Sequestration/Groundwater Pollution Remediation – Biomass monitoring and disaster surveillance using multiple types of satellite imagery – Weather prediction using satellite and ground sensor data – Analysis of Results from Large Scale Simulations – Square Kilometer Array – Google Self Driving Car • Correlative and cooperative analysis of data from multiple sensor modalities and sources • Equivalent from standpoint of data access patterns – need to develop new generation of data skeletons/mini-apps/data dwarfs
  • 3. Spatio-temporal Sensor Integration, Analysis, Classification • Multi-scale material/tissue structural, molecular, functional characterization. Design of materials with specific structural, energy storage properties, brain, regenerative medicine, cancer • Integrative multi-scale analyses of the earth, oceans, atmosphere, cities, vegetation etc – cameras and sensors on satellites, aircraft, drones, land vehicles, stationary cameras • Digital astronomy • Hydrocarbon exploration, exploitation, pollution remediation • Aerospace – wind tunnels, acquisition of data during flight • Solid printing integrative data analyses • Autonomous vehicles, e.g. self driving cars • Data generated by numerical simulation codes – PDEs, particle methods • Fit model with data
  • 4. Typical Computational/Analysis Tasks Spatio-temporal Sensor Integration, Analysis, Classification • Data Cleaning and Low Level Transformations • Data Subsetting, Filtering, Subsampling • Spatio-temporal Mapping and Registration • Object Segmentation • Feature Extraction • Object/Region/Feature Classification • Spatio-temporal Aggregation • Diffeomorphism type mapping methods (e.g. optimal mass transport) • Particle filtering/prediction • Change Detection, Comparison, and Quantification
  • 5. • Detect and track changes in data during production • Invert data for reservoir properties • Detect and track reservoir changes • Assimilate data & reservoir properties into the evolving reservoir model • Use simulation and optimization to guide future production • Coupled data acquisition, data analysis, modeling, prediction and correction – data assimilation, particle filtering etc.
  • 6.
  • 7. Future State • 100K – 1M pathology slides/hospital/year • 2GB compressed per slide • 1-10 slides used for Pathologist computer aided diagnosis • 100-10K slides used in hospital Quality control • Groups of 100K+ slides used for clinical research studies -- Combined with molecular, outcome data
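The per-hospital storage volume implied by the figures on slide 7 can be worked out directly. The sketch below is a back-of-the-envelope estimate only; it assumes decimal units (1 TB = 1000 GB), and the function name `yearly_storage_tb` is illustrative rather than taken from the talk.

```python
# Back-of-the-envelope storage estimate from the figures on the slide.
# Assumption: decimal units (1 TB = 1000 GB, 1 PB = 1000 TB).
GB_PER_SLIDE = 2  # compressed size per slide, as stated on the slide


def yearly_storage_tb(slides_per_year: int, gb_per_slide: float = GB_PER_SLIDE) -> float:
    """Return compressed storage per hospital per year, in terabytes."""
    return slides_per_year * gb_per_slide / 1000


low_tb = yearly_storage_tb(100_000)      # lower bound: 100K slides/year
high_tb = yearly_storage_tb(1_000_000)   # upper bound: 1M slides/year
print(f"{low_tb:.0f} TB to {high_tb / 1000:.0f} PB per hospital per year")
```

Under these assumptions a single hospital would produce on the order of 200 TB to 2 PB of compressed slide data per year.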
  • 8. Brain Tumor Pipeline Scaling on GT/ORNL NSF Keeneland (100 Nodes)
  • 9. Runtime Support Objectives • Coordinated mapping of data and computation to complex memory hierarchies • Hierarchical work assignment with flexibility capable of dealing with data dependent computational patterns, fluctuations in computational speed associated with power management, faults • Linked to comprehensible programming model – model targeted at abstract application class but not to application domain (In the sensor, image, camera case -- Region Templates) • Software stack including coordinated compiler/runtime support/autotuning frameworks
  • 10. HPC Segmentation and Feature Extraction Pipeline Tony Pan, George Teodoro, Tahsin Kurc and Scott Klasky
  • 11. Region Templates • Provides a generic container template for common data structures, such as points, arrays, regions, and object sets, within a spatial and temporal bounding box • Data region object is a storage materialization of data types and stores the data elements in the region contained by a region template instance; region template instance may have multiple data regions. • Allows for different data I/O, storage, and management strategies and implementations, while providing a homogeneous, unified interface to the application developer. • Application operations interact with data regions and region templates to store and retrieve data elements, rather than explicitly handling the management, staging, and distribution of the data elements. • Current implementations on nodes with multi-core CPUs and GPUs, distributed memory storage, and high bandwidth disk I/O.
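The container idea on slide 11 can be illustrated with a small sketch. This is not the authors' implementation: the class names `BoundingBox`, `DataRegion` and `RegionTemplate`, and the `put`/`get` methods, are hypothetical and only mirror the description above (one spatial and temporal bounding box, multiple data regions, and a uniform interface that hides how the data is stored).

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class BoundingBox:
    """Spatial extent (x0, y0, x1, y1) plus a time interval (t0, t1)."""
    x0: float
    y0: float
    x1: float
    y1: float
    t0: float = 0.0
    t1: float = 0.0


@dataclass
class DataRegion:
    """Hypothetical data region: named storage for one kind of data element."""
    name: str
    kind: str          # e.g. "points", "array", "object_set"
    data: Any = None   # a real system might hold a file or memory handle here


@dataclass
class RegionTemplate:
    """Hypothetical container: one bounding box, multiple data regions."""
    bbox: BoundingBox
    regions: Dict[str, DataRegion] = field(default_factory=dict)

    def put(self, region: DataRegion) -> None:
        """Store a data region under its name."""
        self.regions[region.name] = region

    def get(self, name: str) -> DataRegion:
        """Retrieve a data region by name."""
        return self.regions[name]


# One image tile with an RGB array and an (initially empty) set of nuclei.
tile = RegionTemplate(BoundingBox(0, 0, 4096, 4096))
tile.put(DataRegion("rgb", "array", data=[[0, 0], [0, 0]]))
tile.put(DataRegion("nuclei", "object_set", data=[]))
print(sorted(tile.regions))
```

Application code here only calls `put` and `get`; where the data lives (memory, disk, a remote node) would be a concern of the `DataRegion` implementation, which is the separation the slide describes.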
  • 12. Region Template: Preliminary Experimental Evaluation • Experimentally evaluated using pathology image analysis on the Keeneland system • This application consists of a pipeline with Segmentation and Feature Computation stages, and each of these stages is internally divided into finer-grained tasks for better scheduling on heterogeneous CPU-GPU equipped machines.
  • 13. Large Scale Data Management • Represented by a complex data model capturing multi-faceted information including markups, annotations, algorithm provenance, specimen, etc. • Support for complex relationships and spatial query: multi-level granularities, relationships between markups and annotations, spatial and nested relationships • Highly optimized spatial query and analyses • Implemented in a variety of ways including optimized CPU/GPU, Hadoop/HDFS and IBM DB2 • Supported by two NLM R01 grants – Saltz/Foran
  • 14. Spatial Centric – Sensor Data Feature “GIS” • Point query: human marked point inside a nucleus • Window query: return markups contained in a rectangle • Spatial join query: algorithm validation/comparison • Containment query: nuclear feature aggregation in tumor regions • Fusheng Wang
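Two of the query types listed on slide 14 can be sketched in plain Python. This is an illustrative sketch, not the PAIS query engine: markups are assumed to be simple polygons given as vertex lists, the point query uses a standard ray-casting test, and the names `point_in_polygon` and `window_query` are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def point_in_polygon(p: Point, poly: Polygon) -> bool:
    """Point query via ray casting; points exactly on an edge are not treated specially."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def window_query(markups: Dict[str, Polygon],
                 window: Tuple[float, float, float, float]) -> List[str]:
    """Window query: names of markups lying entirely inside (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = window
    hits = []
    for name, poly in markups.items():
        if all(xmin <= x <= xmax and ymin <= y <= ymax for x, y in poly):
            hits.append(name)
    return hits


markups = {
    "nucleus_a": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "nucleus_b": [(8, 8), (12, 8), (12, 12), (8, 12)],
}
print(point_in_polygon((2, 2), markups["nucleus_a"]))  # human-marked point inside nucleus_a
print(window_query(markups, (0, 0, 10, 10)))           # only nucleus_a fits in the window
```

A production system would use a spatial index rather than scanning every markup; the linear scan here is only meant to make the query semantics concrete.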
  • 15. Algorithm Validation: Intersection between Two Result Sets (Spatial Join) PAIS: Example Queries
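A spatial join between two segmentation result sets, as used for algorithm validation on slide 15, can be sketched as follows. The sketch makes simplifying assumptions that are not from the talk: each markup is reduced to an axis-aligned box `(xmin, ymin, xmax, ymax)`, the join is a naive nested loop, and overlap is scored with the Jaccard index (intersection area over union area).

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def area(b: Box) -> float:
    """Area of an axis-aligned box."""
    return float((b[2] - b[0]) * (b[3] - b[1]))


def intersection_area(a: Box, b: Box) -> float:
    """Overlap area of two boxes, or 0.0 if they do not overlap."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return float(w * h) if w > 0 and h > 0 else 0.0


def spatial_join(set_a: List[Box], set_b: List[Box]) -> List[Tuple[int, int, float]]:
    """Nested-loop join: (index_a, index_b, jaccard) for every intersecting pair."""
    pairs = []
    for i, a in enumerate(set_a):
        for j, b in enumerate(set_b):
            inter = intersection_area(a, b)
            if inter > 0:
                union = area(a) + area(b) - inter
                pairs.append((i, j, inter / union))
    return pairs


# Results of two hypothetical segmentation algorithms on the same image.
algo_1 = [(0, 0, 2, 2), (10, 10, 12, 12)]
algo_2 = [(1, 0, 3, 2), (20, 20, 22, 22)]
for i, j, score in spatial_join(algo_1, algo_2):
    print(f"algo_1[{i}] overlaps algo_2[{j}] with Jaccard {score:.3f}")
```

The nested loop is quadratic in the number of markups; at the scale described in the talk a partitioned or indexed join would be needed, which this sketch does not attempt.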
  • 16. AIS (Analytical Imaging Standards) • AIS Logical Model: 62 UML classes; markups, annotations, imageReferences, provenance • AIS Data Representation: XML (compressed) or HDF5 • AIS Databases: loading, managing, querying and sharing data; Native XML DBMS or RDBMS + SDBMS [UML class diagram "class Domain Mo...", classes shown: Annotation, GeometricShape, Calculation, Observation, Specimen, ImageReference, Provenance, User, PAIS, Equipment, Group, AnatomicEntity, Subject, Field, Project, MicroscopyImageReference, DICOMImageReference, TMAImageReference, Markup, Inference, Region, WholeSlideImageReference, Patient, Surface, Collection, AnnotationReference]
  • 17. VLDB 2012, 2013 Spatial Query, Change Detection, Comparison, and Quantification
  • 18. Soft real time and streaming Sensor Data Analysis, Event Detection, Decision Support • Integrated analyses of patient data – physiological streams, labs, medications, notes, Radiology, Pathology images, mobile health data feeds • High frequency trading, arbitrage • Real time monitoring of earthquakes, control of oilfields • Control of industrial plants, aircraft engines • Fusion – data capture, control, prediction of disruptions • Internet of things • Twitter feeds • Intensive care alarms
  • 19. Typical Computational Analysis Tasks Streaming Sensor Data Analysis, Event Detection, Decision Support • Prediction algorithms – Kalman, particle filtering • Machine learning algorithms on aggregated data to develop model, use of model on streaming data for decision support • Searching for rare events • Statistical algorithms to distinguish signal from noise • On the fly integration of multiple complementary data streams
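The Kalman prediction step named on slide 19 can be illustrated with a scalar filter. This is a minimal sketch under stated assumptions, not a method from the talk: the state is a single value modelled as a random walk, and the process noise `q`, measurement noise `r`, and initial values `x0` and `p0` are arbitrary illustrative parameters.

```python
from typing import List, Sequence


def kalman_1d(measurements: Sequence[float], q: float = 1e-3, r: float = 0.25,
              x0: float = 0.0, p0: float = 1.0) -> List[float]:
    """Scalar Kalman filter with a constant-state (random walk) model."""
    x, p = x0, p0          # state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the state is assumed unchanged; uncertainty grows by q.
        p = p + q
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1 - k) * p
        estimates.append(x)
    return estimates


# A noisy stream around a true value of 5.0 (values chosen by hand for illustration).
stream = [5.3, 4.8, 5.1, 4.9, 5.2, 5.0, 4.7, 5.1, 5.0, 4.9]
smoothed = kalman_1d(stream)
print([round(v, 2) for v in smoothed])
```

Each new sample updates the estimate in constant time, which is why this family of filters suits the streaming setting described on the slide; a particle filter would replace the single Gaussian estimate with a weighted set of samples.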

Editor's Notes

  1. 1) Metadata about images
     2) Metadata about image targets and how images are derived (patient, specimen, anatomicEntity, etc.)
     3) Metadata about analyses (the purpose of the analysis, who performed the analysis, etc.)
     4) Image markups: a markup delineates a spatial region (e.g., as points, lines, polygons, multi-polygons) in images
     5) Annotation, image features: a type of annotation calculated or derived from the markups
     6) Annotation, observation: an annotation associates semantic meaning with markup entities through coded or free-text terms that provide explanatory or descriptive information
     7) Provenance information, i.e., the derivation history of a markup or annotation, including algorithm information, parameters, and inputs

     Native XML database based approach
     • Small sized PAIS documents, e.g., organ, tissue, or region level annotations
     • No mapping needed; supports standard XML queries

     Relational and spatial database approach
     • For large scale PAIS documents, e.g., analysis results at cellular or subcellular level
     • Data mapped into relational tables and spatial objects
     • Highly efficient on storage and queries