Towards real-time analysis of large data volumes for synchrotron experiments

Martin Kunz, Nobumichi Tamura
Advanced Light Source, Lawrence Berkeley National Lab

Acknowledgements

- Jack Deslippe, David Skinner (NERSC)
- Abdelilah Essiari, Craig E. Tull (LBNL-CRD)
- Eli Dart (ESnet)
- Dula Parkinson (LBNL – ALS)

X-rays and Earth-Sciences; the story of a moving bottle-neck:
1960’s / 1970’s
X-ray Source

X-ray Detectors

Henry Levy with Picker 5-circle and PDP-5

Data Analysis

Publication

X-rays and Earth-Sciences; the story of a moving bottle-neck:
1980’s / 1990’s
X-ray Source

X-ray Detectors

1995: “MD Storm”: Readout time: 45 minutes

Data Analysis

Publication

X-rays and Earth-Sciences; the story of a moving bottle-neck:
2000’s / 2010’s
X-ray Source

X-ray Detectors

Data Analysis

Publication

X-rays and Earth-Sciences; the story of a moving bottle-neck:

Future:
X-ray Source

X-ray Detectors

Interactive access to supercomputers

Data Analysis

Publication

Examples of mineral physics related experiments with high data rates:
1) In situ powder diffraction with automated P-T stepping:

ALS BL 12.2.2 with Perkin Elmer detector (~ 0 read-out delay)

http://www.ltp-oldenburg.de

Data rate on the order of thousands of frames per day (i.e., tens of GB/day)

Examples of mineral physics related experiments with high data rates:
2) Micro-diffraction / phase/orientation/strain-mapping at high spatial resolution

Micro-diffraction set-up at ALS beamline 12.3.2 with Pilatus-1M detector.

Left: Distribution of Re3N (black) and Re (blue) grown in a laser-heated DAC.
Right: Relative orientation of Re3N grains.
Source: Friedrich et al. (2010), PRL 105, 085504.

Data rate on the order of tens of thousands of frames per day (i.e., hundreds of GB/day)

Examples of mineral physics related experiments with high data rates:
3) Tomography: 3D mapping of geo-materials:

Tomography set-up at ALS beamline 8.3.2 (diagram labels: X-rays, Scintillator)

Supercritical CO2 penetrating sandstone on ALS BL 8.3.2 (courtesy J. Ajo-Franklin)

Distribution of Fe-alloy melt prepared at 64 GPa, measured at SSRL. Shi et al. (2013), Nature Geoscience. DOI: 10.1038/NGEO1956

Data rate on the order of hundreds of thousands of frames per day (i.e., TBs/day)
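The frame counts quoted for the three experiment types only translate into daily volumes once a frame size is fixed. The following is a minimal sketch of that arithmetic. The frame counts and per-frame sizes in `EXAMPLES` are illustrative assumptions chosen to fall in the quoted ranges; the slides do not state them.

```python
# Rough daily data volume from frame count and frame size.
# All frame counts and frame sizes below are illustrative assumptions,
# not values taken from the talk.

SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_gb(frames_per_day: int, frame_size_mb: float) -> float:
    """Return the daily data volume in GB (using 1 GB = 1000 MB)."""
    return frames_per_day * frame_size_mb / 1000.0


def sustained_rate_mb_s(frames_per_day: int, frame_size_mb: float) -> float:
    """Return the average rate in MB/s if frames arrive evenly over 24 hours."""
    return frames_per_day * frame_size_mb / SECONDS_PER_DAY


# (label, assumed frames per day, assumed MB per frame)
EXAMPLES = [
    ("powder diffraction", 5_000, 8.0),
    ("micro-diffraction", 50_000, 4.0),
    ("tomography", 300_000, 10.0),
]

if __name__ == "__main__":
    for label, frames, size_mb in EXAMPLES:
        volume = daily_volume_gb(frames, size_mb)
        rate = sustained_rate_mb_s(frames, size_mb)
        print(f"{label}: {volume:.0f} GB/day, {rate:.1f} MB/s on average")
```

The average rate is the more useful number when judging whether storage, network, and analysis can keep up with acquisition.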

How do we tackle this at the ALS?
1) Not-quite-real-time - local cluster for micro-diffraction analysis
- 24 dual-socket nodes with AMD Opteron 248 2.2 GHz processors (48 CPUs)
- 48 GB aggregate memory
- 14 TB shared disk storage
- Gigabit Ethernet interconnect
- 212 GFLOPS (theoretical peak)
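The quoted theoretical peak can be roughly reproduced from the node count and clock rate. This is a minimal sketch, assuming each CPU completes 2 floating-point operations per clock cycle; that figure is an assumption about the processor and is not given on the slide.

```python
# Theoretical peak = number of CPUs x clock rate x floating-point operations per cycle.
# FLOPS_PER_CYCLE is an assumption; the other values come from the cluster list above.

N_NODES = 24
SOCKETS_PER_NODE = 2
CLOCK_GHZ = 2.2
FLOPS_PER_CYCLE = 2  # assumed, not stated in the slides
AGGREGATE_MEMORY_GB = 48


def peak_gflops(n_cpus: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Return the theoretical peak in GFLOPS for n_cpus identical CPUs."""
    return n_cpus * clock_ghz * flops_per_cycle


if __name__ == "__main__":
    n_cpus = N_NODES * SOCKETS_PER_NODE
    print(f"CPUs: {n_cpus}")
    print(f"Peak: {peak_gflops(n_cpus, CLOCK_GHZ, FLOPS_PER_CYCLE):.1f} GFLOPS")
    print(f"Memory per node: {AGGREGATE_MEMORY_GB / N_NODES:.1f} GB")
```

Under that assumption the estimate comes out at about 211 GFLOPS, close to the 212 GFLOPS listed above.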

How do we tackle this at the ALS?
1) Not-quite-real-time - local cluster for micro-diffraction analysis
1) User tunes parameters manually on some ‘typical’ patterns

How do we tackle this at the ALS?
1) Not-quite-real-time - local cluster for micro-diffraction analysis
1) Analysis parameters are written into an instruction file

How do we tackle this at the ALS?
1) Not-quite-real-time - local cluster for micro-diffraction analysis
2) Launch parsing script:
-> reads the instruction file and distributes the data files across the available CPUs
-> writes batch files which manage the individual CPUs
-> launches software on each node
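The distribution step described above can be sketched as follows. This is not the ALS script: the function names, the batch-file layout, and the `analyze_pattern` command are hypothetical placeholders, and the sketch stops at writing the batch files (launching them on each node would depend on the cluster's scheduler or remote-shell set-up).

```python
import tempfile
from pathlib import Path
from typing import List, Sequence


def split_evenly(items: Sequence[str], n_workers: int) -> List[List[str]]:
    """Split items into n_workers contiguous chunks whose sizes differ by at most one."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    base, extra = divmod(len(items), n_workers)
    chunks: List[List[str]] = []
    start = 0
    for worker in range(n_workers):
        # The first `extra` workers take one additional item each.
        size = base + (1 if worker < extra else 0)
        chunks.append(list(items[start:start + size]))
        start += size
    return chunks


def write_batch_files(chunks: Sequence[Sequence[str]], instruction_file: str,
                      out_dir: str, program: str = "analyze_pattern") -> List[Path]:
    """Write one shell batch file per worker; `program` is a placeholder command."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths: List[Path] = []
    for worker, chunk in enumerate(chunks):
        path = out / f"worker_{worker:03d}.sh"
        lines = ["#!/bin/sh"]
        lines += [f"{program} --instructions {instruction_file} {name}" for name in chunk]
        path.write_text("\n".join(lines) + "\n")
        paths.append(path)
    return paths


if __name__ == "__main__":
    files = [f"pattern_{i:05d}.tif" for i in range(10)]
    chunks = split_evenly(files, n_workers=4)
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_batch_files(chunks, "instructions.txt", tmp):
            n_patterns = len(path.read_text().splitlines()) - 1
            print(path.name, n_patterns, "patterns")
```

Contiguous chunks keep neighbouring patterns of a map on the same worker; a round-robin split would balance the load better if analysis time varies systematically across the sample.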

How do we tackle this at the ALS?
1) Not-quite-real-time - local cluster for micro-diffraction analysis
3) Results are written to a single file, which can be viewed, analyzed further, and published:
Relative lattice orientation: gives the domain structure.
Total color range from blue to red corresponds to 4° of rotation.

Average intensity: gives the high-resolution fine structure of the grain.

How do we tackle this at the ALS?
2) Real time – collaboration with National Energy Research Scientific Computing Center (NERSC)
(in development)
1) Data are sent directly to NERSC for analysis and storage during data collection

Data are packaged:
- After every n images, a ‘trigger file’ is deposited in a directory which is monitored by NERSC.
- A SPADE web-app wraps the data (512 files at a time) in HDF5 (Hierarchical Data Format) and ships them to NERSC via a Gigabit line (to be upgraded to a 10G line).
- At NERSC, the data are received by a SPADE instance, which places them in the target folder and on tape and sends an acknowledgment.
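The batching logic behind the trigger files can be sketched as below. This is not SPADE code: the function name, the trigger-file naming, and the `flush_remainder` option are hypothetical, and the sketch assumes the trigger interval n equals the 512-file package size, which the slides do not state. Writing the HDF5 container and the transfer itself are left out.

```python
from typing import Iterable, Iterator, List, Tuple


def package_batches(image_names: Iterable[str], batch_size: int = 512,
                    flush_remainder: bool = False) -> Iterator[Tuple[str, List[str]]]:
    """Yield (trigger_file_name, batch) each time batch_size images have accumulated.

    With flush_remainder=True, a final partial batch is yielded once the
    input is exhausted (for example at the end of a scan).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    index = 0
    for name in image_names:
        batch.append(name)
        if len(batch) == batch_size:
            yield f"batch_{index:05d}.trigger", batch
            batch = []
            index += 1
    if flush_remainder and batch:
        yield f"batch_{index:05d}.trigger", batch


if __name__ == "__main__":
    images = [f"img_{i:06d}.tif" for i in range(1100)]
    for trigger, batch in package_batches(images, flush_remainder=True):
        print(trigger, len(batch), "files:", batch[0], "to", batch[-1])
```

Depositing the trigger file only after a batch is complete means the monitoring side never sees a half-written package.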

How do we tackle this at the ALS?
2) Real time – collaboration with National Energy Research Scientific Computing Center (NERSC)
(in development)
1) Data are sent directly to NERSC for analysis and storage during data collection: up and running

Transfer control is web-based

How do we tackle this at the ALS?
2) Real time – collaboration with National Energy Research Scientific Computing Center (NERSC)
(in development)
2) Analysis parameters are set up with a web-app - under development

Jobs are launched manually by the user via the same web page.
Test runs indicate an analysis time on the order of the data collection time,
so analysis can in principle run synchronously with data collection.

How do we tackle this at the ALS?
2) Real time – collaboration with National Energy Research Scientific Computing Center (NERSC)
(in development)
3) Analysis jobs are executed on Carver - under development

Carver is an IBM iDataPlex cluster
- 1202 nodes with a total of 9984 processor cores
- 106 Tflop/sec peak performance
- largest allocated parallel job is 512 cores
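To put the 512-core job limit in perspective, the figures above can be turned into a per-core and per-job peak. This is a rough sketch that assumes peak performance scales linearly with core count, i.e. that all cores are identical; the slides do not say so.

```python
# Values from the Carver list above; linear scaling with core count is an assumption.

PEAK_TFLOPS = 106.0
TOTAL_CORES = 9984
MAX_JOB_CORES = 512


def per_core_gflops(peak_tflops: float, total_cores: int) -> float:
    """Return the average peak per core in GFLOPS."""
    return peak_tflops * 1000.0 / total_cores


def job_peak_tflops(peak_tflops: float, total_cores: int, job_cores: int) -> float:
    """Return the peak available to a job of job_cores cores, assuming linear scaling."""
    return peak_tflops * job_cores / total_cores


if __name__ == "__main__":
    print(f"Per core: {per_core_gflops(PEAK_TFLOPS, TOTAL_CORES):.2f} GFLOPS")
    print(f"512-core job: "
          f"{job_peak_tflops(PEAK_TFLOPS, TOTAL_CORES, MAX_JOB_CORES):.2f} TFLOPS")
    print(f"Share of machine: {100.0 * MAX_JOB_CORES / TOTAL_CORES:.1f} %")
```

Under that assumption the largest allocated job has access to roughly 5 Tflop/sec of peak, about 5 % of the machine.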

Summary:
- Data analysis is the new bottle-neck limiting progress in many aspects of experimental mineral physics.
- Real-time analysis with immediate feedback is increasingly important in experimental mineral physics.
- These challenges cannot always be met with traditional desktop machines: software has to be automated and parallelized, and collaboration with supercomputing centers is becoming important for experimental scientists too (at least for a few more iterations of the Moore’s law cycle).
- Data analysis on supercomputers, remotely controlled with web applications, is a very promising avenue, allowing big-data methods to enter mineral physics.
- Future developments may (must?) evolve away from supercomputers to highly parallelized (GPU) local computers and/or cloud computing.

More Related Content

What's hot

The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research Platform
Larry Smarr
 
GeoCENS Source Talk: Results from an Atlantic Rainforest Micrometeorology Sen...
GeoCENS Source Talk: Results from an Atlantic Rainforest Micrometeorology Sen...GeoCENS Source Talk: Results from an Atlantic Rainforest Micrometeorology Sen...
GeoCENS Source Talk: Results from an Atlantic Rainforest Micrometeorology Sen...
Cybera Inc.
 
Cyberinfrastructure to Support Ocean Observatories
Cyberinfrastructure to Support Ocean ObservatoriesCyberinfrastructure to Support Ocean Observatories
Cyberinfrastructure to Support Ocean Observatories
Larry Smarr
 
Creating High Performance Lambda Collaboratories
Creating High Performance Lambda CollaboratoriesCreating High Performance Lambda Collaboratories
Creating High Performance Lambda Collaboratories
Larry Smarr
 
Reusable Software and Open Data To Optimize Agriculture
Reusable Software and Open Data To Optimize AgricultureReusable Software and Open Data To Optimize Agriculture
Reusable Software and Open Data To Optimize Agriculture
David LeBauer
 
Research on Blue Waters
Research on Blue WatersResearch on Blue Waters
Research on Blue Waters
inside-BigData.com
 
The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research Platform
Larry Smarr
 
Security Challenges and the Pacific Research Platform
Security Challenges and the Pacific Research PlatformSecurity Challenges and the Pacific Research Platform
Security Challenges and the Pacific Research Platform
Larry Smarr
 
ADASS XXV: LSST DM - Building the Data System for the Era of Petascale Optica...
ADASS XXV: LSST DM - Building the Data System for the Era of Petascale Optica...ADASS XXV: LSST DM - Building the Data System for the Era of Petascale Optica...
ADASS XXV: LSST DM - Building the Data System for the Era of Petascale Optica...
Mario Juric
 
Applying Photonics to User Needs: The Application Challenge
Applying Photonics to User Needs: The Application ChallengeApplying Photonics to User Needs: The Application Challenge
Applying Photonics to User Needs: The Application Challenge
Larry Smarr
 
Ceoa Nov 2005 Final Small
Ceoa Nov 2005 Final SmallCeoa Nov 2005 Final Small
Ceoa Nov 2005 Final Small
Larry Smarr
 
AstroInformatics 2015: Large Sky Surveys: Entering the Era of Software-Bound ...
AstroInformatics 2015: Large Sky Surveys: Entering the Era of Software-Bound ...AstroInformatics 2015: Large Sky Surveys: Entering the Era of Software-Bound ...
AstroInformatics 2015: Large Sky Surveys: Entering the Era of Software-Bound ...
Mario Juric
 
LSST Solar System Science: MOPS Status, the Science, and Your Questions
LSST Solar System Science: MOPS Status, the Science, and Your QuestionsLSST Solar System Science: MOPS Status, the Science, and Your Questions
LSST Solar System Science: MOPS Status, the Science, and Your Questions
Mario Juric
 
Advanced Cyberinfrastructure Enabled Services and Applications in 2021
Advanced Cyberinfrastructure Enabled Services and Applications in 2021Advanced Cyberinfrastructure Enabled Services and Applications in 2021
Advanced Cyberinfrastructure Enabled Services and Applications in 2021
Larry Smarr
 
PRP, NRP, GRP & the Path Forward
PRP, NRP, GRP & the Path ForwardPRP, NRP, GRP & the Path Forward
PRP, NRP, GRP & the Path Forward
Larry Smarr
 
Peering The Pacific Research Platform With The Great Plains Network
Peering The Pacific Research Platform With The Great Plains NetworkPeering The Pacific Research Platform With The Great Plains Network
Peering The Pacific Research Platform With The Great Plains Network
Larry Smarr
 
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
Larry Smarr
 
PRP, CHASE-CI, TNRP and OSG
PRP, CHASE-CI, TNRP and OSGPRP, CHASE-CI, TNRP and OSG
PRP, CHASE-CI, TNRP and OSG
Larry Smarr
 
Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025
Larry Smarr
 
The Pacific Research Platform Enables Distributed Big-Data Machine-Learning
The Pacific Research Platform Enables Distributed Big-Data Machine-LearningThe Pacific Research Platform Enables Distributed Big-Data Machine-Learning
The Pacific Research Platform Enables Distributed Big-Data Machine-Learning
Larry Smarr
 

What's hot (20)

The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research Platform
 
GeoCENS Source Talk: Results from an Atlantic Rainforest Micrometeorology Sen...
GeoCENS Source Talk: Results from an Atlantic Rainforest Micrometeorology Sen...GeoCENS Source Talk: Results from an Atlantic Rainforest Micrometeorology Sen...
GeoCENS Source Talk: Results from an Atlantic Rainforest Micrometeorology Sen...
 
Cyberinfrastructure to Support Ocean Observatories
Cyberinfrastructure to Support Ocean ObservatoriesCyberinfrastructure to Support Ocean Observatories
Cyberinfrastructure to Support Ocean Observatories
 
Creating High Performance Lambda Collaboratories
Creating High Performance Lambda CollaboratoriesCreating High Performance Lambda Collaboratories
Creating High Performance Lambda Collaboratories
 
Reusable Software and Open Data To Optimize Agriculture
Reusable Software and Open Data To Optimize AgricultureReusable Software and Open Data To Optimize Agriculture
Reusable Software and Open Data To Optimize Agriculture
 
Research on Blue Waters
Research on Blue WatersResearch on Blue Waters
Research on Blue Waters
 
The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research Platform
 
Security Challenges and the Pacific Research Platform
Security Challenges and the Pacific Research PlatformSecurity Challenges and the Pacific Research Platform
Security Challenges and the Pacific Research Platform
 
ADASS XXV: LSST DM - Building the Data System for the Era of Petascale Optica...
ADASS XXV: LSST DM - Building the Data System for the Era of Petascale Optica...ADASS XXV: LSST DM - Building the Data System for the Era of Petascale Optica...
ADASS XXV: LSST DM - Building the Data System for the Era of Petascale Optica...
 
Applying Photonics to User Needs: The Application Challenge
Applying Photonics to User Needs: The Application ChallengeApplying Photonics to User Needs: The Application Challenge
Applying Photonics to User Needs: The Application Challenge
 
Ceoa Nov 2005 Final Small
Ceoa Nov 2005 Final SmallCeoa Nov 2005 Final Small
Ceoa Nov 2005 Final Small
 
AstroInformatics 2015: Large Sky Surveys: Entering the Era of Software-Bound ...
AstroInformatics 2015: Large Sky Surveys: Entering the Era of Software-Bound ...AstroInformatics 2015: Large Sky Surveys: Entering the Era of Software-Bound ...
AstroInformatics 2015: Large Sky Surveys: Entering the Era of Software-Bound ...
 
LSST Solar System Science: MOPS Status, the Science, and Your Questions
LSST Solar System Science: MOPS Status, the Science, and Your QuestionsLSST Solar System Science: MOPS Status, the Science, and Your Questions
LSST Solar System Science: MOPS Status, the Science, and Your Questions
 
Advanced Cyberinfrastructure Enabled Services and Applications in 2021
Advanced Cyberinfrastructure Enabled Services and Applications in 2021Advanced Cyberinfrastructure Enabled Services and Applications in 2021
Advanced Cyberinfrastructure Enabled Services and Applications in 2021
 
PRP, NRP, GRP & the Path Forward
PRP, NRP, GRP & the Path ForwardPRP, NRP, GRP & the Path Forward
PRP, NRP, GRP & the Path Forward
 
Peering The Pacific Research Platform With The Great Plains Network
Peering The Pacific Research Platform With The Great Plains NetworkPeering The Pacific Research Platform With The Great Plains Network
Peering The Pacific Research Platform With The Great Plains Network
 
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
LambdaGrids--Earth and Planetary Sciences Driving High Performance Networks a...
 
PRP, CHASE-CI, TNRP and OSG
PRP, CHASE-CI, TNRP and OSGPRP, CHASE-CI, TNRP and OSG
PRP, CHASE-CI, TNRP and OSG
 
Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025
 
The Pacific Research Platform Enables Distributed Big-Data Machine-Learning
The Pacific Research Platform Enables Distributed Big-Data Machine-LearningThe Pacific Research Platform Enables Distributed Big-Data Machine-Learning
The Pacific Research Platform Enables Distributed Big-Data Machine-Learning
 

Viewers also liked

Predictive analysis
Predictive analysisPredictive analysis
Predictive analysis
Dean Cousins
 
Fast Data: Achieving Real-Time Data Analysis Across the Financial Data Continuum
Fast Data: Achieving Real-Time Data Analysis Across the Financial Data ContinuumFast Data: Achieving Real-Time Data Analysis Across the Financial Data Continuum
Fast Data: Achieving Real-Time Data Analysis Across the Financial Data Continuum
VoltDB
 
Predictive Analytics: Big data lessons from big physics
Predictive Analytics: Big data lessons from big physicsPredictive Analytics: Big data lessons from big physics
Predictive Analytics: Big data lessons from big physics
Jake Bouma
 
COMPLEMENTING HADOOP WITH REAL-TIME DATA ANALYSIS from Structure:Data 2013
COMPLEMENTING HADOOP WITH REAL-TIME DATA ANALYSIS from Structure:Data 2013COMPLEMENTING HADOOP WITH REAL-TIME DATA ANALYSIS from Structure:Data 2013
COMPLEMENTING HADOOP WITH REAL-TIME DATA ANALYSIS from Structure:Data 2013
Gigaom
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real Time
Albert Bifet
 
IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...
IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...
IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...
In-Memory Computing Summit
 
Predictive Analysis
Predictive AnalysisPredictive Analysis
Predictive Analysis
Michael Bystry
 
Real-time Big Data Analytics: From Deployment to Production
Real-time Big Data Analytics: From Deployment to ProductionReal-time Big Data Analytics: From Deployment to Production
Real-time Big Data Analytics: From Deployment to Production
Revolution Analytics
 
Telco Big Data Workshop Sample
Telco Big Data Workshop SampleTelco Big Data Workshop Sample
Telco Big Data Workshop Sample
Alan Quayle
 
Predictive analysis and modelling
Predictive analysis and modellingPredictive analysis and modelling
Predictive analysis and modelling
lalit Lalitm7225
 
Streaming data for real time analysis
Streaming data for real time analysisStreaming data for real time analysis
Streaming data for real time analysis
Amazon Web Services
 
Real-Time Big Data Stream Analytics
Real-Time Big Data Stream AnalyticsReal-Time Big Data Stream Analytics
Real-Time Big Data Stream Analytics
Albert Bifet
 
Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...
Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...
Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...
Spark Summit
 
Introduction To Predictive Analytics Part I
Introduction To Predictive Analytics   Part IIntroduction To Predictive Analytics   Part I
Introduction To Predictive Analytics Part I
jayroy
 
Real Time Analytics: Algorithms and Systems
Real Time Analytics: Algorithms and SystemsReal Time Analytics: Algorithms and Systems
Real Time Analytics: Algorithms and Systems
Arun Kejariwal
 
Big Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case StudyBig Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case Study
Nati Shalom
 
Predictive Analytics - An Overview
Predictive Analytics - An OverviewPredictive Analytics - An Overview
Predictive Analytics - An Overview
MachinePulse
 
Predictive Analytics: Context and Use Cases
Predictive Analytics: Context and Use CasesPredictive Analytics: Context and Use Cases
Predictive Analytics: Context and Use Cases
Kimberley Mitchell
 
Predictive Analytics using R
Predictive Analytics using RPredictive Analytics using R
Predictive Analytics using R
Jeffrey Strickland, Ph.D., CMSP
 
A quick intro to In memory computing
A quick intro to In memory computingA quick intro to In memory computing
A quick intro to In memory computing
Neobric
 

Viewers also liked (20)

Predictive analysis
Predictive analysisPredictive analysis
Predictive analysis
 
Fast Data: Achieving Real-Time Data Analysis Across the Financial Data Continuum
Fast Data: Achieving Real-Time Data Analysis Across the Financial Data ContinuumFast Data: Achieving Real-Time Data Analysis Across the Financial Data Continuum
Fast Data: Achieving Real-Time Data Analysis Across the Financial Data Continuum
 
Predictive Analytics: Big data lessons from big physics
Predictive Analytics: Big data lessons from big physicsPredictive Analytics: Big data lessons from big physics
Predictive Analytics: Big data lessons from big physics
 
COMPLEMENTING HADOOP WITH REAL-TIME DATA ANALYSIS from Structure:Data 2013
COMPLEMENTING HADOOP WITH REAL-TIME DATA ANALYSIS from Structure:Data 2013COMPLEMENTING HADOOP WITH REAL-TIME DATA ANALYSIS from Structure:Data 2013
COMPLEMENTING HADOOP WITH REAL-TIME DATA ANALYSIS from Structure:Data 2013
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real Time
 
IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...
IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...
IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...
 
Predictive Analysis
Predictive AnalysisPredictive Analysis
Predictive Analysis
 
Real-time Big Data Analytics: From Deployment to Production
Real-time Big Data Analytics: From Deployment to ProductionReal-time Big Data Analytics: From Deployment to Production
Real-time Big Data Analytics: From Deployment to Production
 
Telco Big Data Workshop Sample
Telco Big Data Workshop SampleTelco Big Data Workshop Sample
Telco Big Data Workshop Sample
 
Predictive analysis and modelling
Predictive analysis and modellingPredictive analysis and modelling
Predictive analysis and modelling
 
Streaming data for real time analysis
Streaming data for real time analysisStreaming data for real time analysis
Streaming data for real time analysis
 
Real-Time Big Data Stream Analytics
Real-Time Big Data Stream AnalyticsReal-Time Big Data Stream Analytics
Real-Time Big Data Stream Analytics
 
Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...
Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...
Analytics at the Real-Time Speed of Business: Spark Summit East talk by Manis...
 
Introduction To Predictive Analytics Part I
Introduction To Predictive Analytics   Part IIntroduction To Predictive Analytics   Part I
Introduction To Predictive Analytics Part I
 
Real Time Analytics: Algorithms and Systems
Real Time Analytics: Algorithms and SystemsReal Time Analytics: Algorithms and Systems
Real Time Analytics: Algorithms and Systems
 
Big Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case StudyBig Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case Study
 
Predictive Analytics - An Overview
Predictive Analytics - An OverviewPredictive Analytics - An Overview
Predictive Analytics - An Overview
 
Predictive Analytics: Context and Use Cases
Predictive Analytics: Context and Use CasesPredictive Analytics: Context and Use Cases
Predictive Analytics: Context and Use Cases
 
Predictive Analytics using R
Predictive Analytics using RPredictive Analytics using R
Predictive Analytics using R
 
A quick intro to In memory computing
A quick intro to In memory computingA quick intro to In memory computing
A quick intro to In memory computing
 

Similar to Toward Real-Time Analysis of Large Data Volumes for Diffraction Studies by Martin Kunz, LBNL

Enabling Real Time Analysis & Decision Making - A Paradigm Shift for Experime...
Enabling Real Time Analysis & Decision Making - A Paradigm Shift for Experime...Enabling Real Time Analysis & Decision Making - A Paradigm Shift for Experime...
Enabling Real Time Analysis & Decision Making - A Paradigm Shift for Experime...
PyData
 
Toward a Global Interactive Earth Observing Cyberinfrastructure
Toward a Global Interactive Earth Observing CyberinfrastructureToward a Global Interactive Earth Observing Cyberinfrastructure
Toward a Global Interactive Earth Observing Cyberinfrastructure
Larry Smarr
 
Big Fast Data in High-Energy Particle Physics
Big Fast Data in High-Energy Particle PhysicsBig Fast Data in High-Energy Particle Physics
Big Fast Data in High-Energy Particle Physics
Andrew Lowe
 
Jarp big data_sydney_v7
Jarp big data_sydney_v7Jarp big data_sydney_v7
Jarp big data_sydney_v7
Suma Pria Tunggal
 
The Pacific Research Platform
 Two Years In
The Pacific Research Platform
 Two Years InThe Pacific Research Platform
 Two Years In
The Pacific Research Platform
 Two Years In
Larry Smarr
 
Computation and Knowledge
Computation and KnowledgeComputation and Knowledge
Computation and Knowledge
Ian Foster
 
201109021 mcguinness ska_meeting
201109021 mcguinness ska_meeting201109021 mcguinness ska_meeting
201109021 mcguinness ska_meeting
Deborah McGuinness
 
How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental science
inside-BigData.com
 
Science and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated EraScience and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated Era
Larry Smarr
 
Building an Information Infrastructure to Support Genetic Sciences
Building an Information Infrastructure to Support Genetic SciencesBuilding an Information Infrastructure to Support Genetic Sciences
Building an Information Infrastructure to Support Genetic Sciences
Larry Smarr
 
A National Big Data Cyberinfrastructure Supporting Computational Biomedical R...
A National Big Data Cyberinfrastructure Supporting Computational Biomedical R...A National Big Data Cyberinfrastructure Supporting Computational Biomedical R...
A National Big Data Cyberinfrastructure Supporting Computational Biomedical R...
Larry Smarr
 
The Transformation of Systems Biology Into A Large Data Science
The Transformation of Systems Biology Into A Large Data ScienceThe Transformation of Systems Biology Into A Large Data Science
The Transformation of Systems Biology Into A Large Data Science
Robert Grossman
 
Opportunities for X-Ray science in future computing architectures
Opportunities for X-Ray science in future computing architecturesOpportunities for X-Ray science in future computing architectures
Opportunities for X-Ray science in future computing architectures
Ian Foster
 
The Emerging Cyberinfrastructure for Earth and Ocean Sciences
The Emerging Cyberinfrastructure for Earth and Ocean SciencesThe Emerging Cyberinfrastructure for Earth and Ocean Sciences
The Emerging Cyberinfrastructure for Earth and Ocean Sciences
Larry Smarr
 
Genome Assembly
Genome AssemblyGenome Assembly
Genome Assembly
Aureliano Bombarely
 
Dynamic Data Center concept
Dynamic Data Center concept  Dynamic Data Center concept
Dynamic Data Center concept
Miha Ahronovitz
 
"Some Reflections on Data in the Public Sector" : Communia: The European Them...
"Some Reflections on Data in the Public Sector" : Communia: The European Them..."Some Reflections on Data in the Public Sector" : Communia: The European Them...
"Some Reflections on Data in the Public Sector" : Communia: The European Them...
Tom Moritz
 
Data Automation at Light Sources
Data Automation at Light SourcesData Automation at Light Sources
Data Automation at Light Sources
Ian Foster
 
Computational Training and Data Literacy for Domain Scientists
Computational Training and Data Literacy for Domain ScientistsComputational Training and Data Literacy for Domain Scientists
Computational Training and Data Literacy for Domain Scientists
Joshua Bloom
 
Data Capacitor II at Indiana University
Data Capacitor II at Indiana UniversityData Capacitor II at Indiana University
Data Capacitor II at Indiana University
inside-BigData.com
 

Toward Real-Time Analysis of Large Data Volumes for Diffraction Studies by Martin Kunz, LBNL

  • 1. Towards real-time analysis of large data volumes for synchrotron experiments Martin Kunz, Nobumichi Tamura Advanced Light Source, Lawrence Berkeley National Lab
  • 2. Acknowledgements: Jack Deslippe, David Skinner (NERSC); Abdelilah Essiari, Craig E. Tull (LBNL-CRD); Eli Dart (ESnet); Dula Parkinson (LBNL – ALS)
  • 3. X-rays and Earth sciences; the story of a moving bottleneck: 1960’s / 1970’s. X-ray Source; X-ray Detectors; Data Analysis; Publication. Henry Levy with Picker 5-circle and PDP-5.
  • 4. X-rays and Earth sciences; the story of a moving bottleneck: 1980’s / 1990’s. X-ray Source; X-ray Detectors; Data Analysis; Publication. 1995: “MD Storm”: Readout time: 45 minutes.
  • 5. X-rays and Earth sciences; the story of a moving bottleneck: 2000’s / 2010’s. X-ray Source; X-ray Detectors; Data Analysis; Publication.
  • 6. X-rays and Earth sciences; the story of a moving bottleneck: Future. X-ray Source; X-ray Detectors; Data Analysis; Publication. Interactive access to supercomputers.
  • 7. Examples of mineral physics related experiments with high data rates: 1) In situ powder diffraction with automated P-T stepping. ALS BL 12.2.2 with Perkin Elmer detector (~ 0 read-out delay). http://www.ltp-oldenburg.de Data rate on the order of 1000s of frames per day (i.e. 10s of GB/day).
  • 8. Examples of mineral physics related experiments with high data rates: 2) Micro-diffraction / phase, orientation and strain mapping at high spatial resolution. Micro-diffraction set-up at ALS beamline 12.3.2 with Pilatus-1M detector. Left: Distribution of Re3N (black) and Re (blue) grown in a laser-heated DAC. Right: Relative orientation of Re3N grains. Source: Friedrich et al. (2010), PRL 105, 085504. Data rate on the order of 10,000s of frames per day (i.e. 100s of GB/day).
  • 9. Examples of mineral physics related experiments with high data rates: 3) Tomography, 3D mapping of geo-materials. Tomography set-up at ALS beamline 8.3.2 (figure labels: X-rays, Scintillator). Supercritical CO2 penetrating sandstone on ALS BL 8.3.2 (courtesy J. Ajo-Franklin). Distribution of Fe-alloy melt prepared at 64 GPa, measured at SSRL. Source: Shi et al. (2013), Nature Geoscience, DOI: 10.1038/NGEO1956. Data rate on the order of 100,000s of frames per day (i.e. TBs/day).
  • 10. How do we tackle this at the ALS? 1) Not-quite-real-time: local cluster for micro-diffraction analysis. 24 dual-socket AMD Opteron 248 2.2 GHz processor nodes (48 CPUs); 48 GB aggregate memory; 14 TB shared disk storage; Gigabit Ethernet interconnect; 212 GFLOPS (theoretical peak).
  • 11. How do we tackle this at the ALS? 1) Not-quite-real-time: local cluster for micro-diffraction analysis. 1) User tunes parameters manually on some ‘typical’ patterns.
  • 12. How do we tackle this at the ALS? 1) Not-quite-real-time: local cluster for micro-diffraction analysis. 1) Analysis parameters are written into an instruction file.
  • 13. How do we tackle this at the ALS? 1) Not-quite-real-time: local cluster for micro-diffraction analysis. 1) Analysis parameters are written into an instruction file.
  • 14. How do we tackle this at the ALS? 1) Not-quite-real-time: local cluster for micro-diffraction analysis. 2) Launch parsing script: -> reads the instruction file and distributes the data files across the available CPUs -> writes batch files which manage the individual CPUs -> launches the software on each node.
  • 15. How do we tackle this at the ALS? 1) Not-quite-real-time: local cluster for micro-diffraction analysis. 3) Results are written to a single file which can be viewed, further analyzed and published. Relative lattice orientation: gives domain structure; the total color range from blue to red corresponds to 4 degrees of rotation. Average intensity: gives high-resolution fine structure of the grain.
  • 16. How do we tackle this at the ALS? 2) Real time: collaboration with National Energy Research Scientific Computing Center (NERSC) (in development). 1) Data are sent directly to NERSC for analysis and storage during data collection. Data are packaged: - after every n images a ‘trigger file’ is deposited in a directory which is monitored by NERSC. - a SPADE web-app wraps the data (512 files at a time) with HDF5 (Hierarchical Data Format) and ships them to NERSC via a Gigabit line (to be upgraded to a 10G line). - at NERSC, a SPADE instance receives the data, places them in the target folder and on tape, and sends an acknowledgment.
  • 17. How do we tackle this at the ALS? 2) Real time: collaboration with NERSC (in development). 1) Data are sent directly to NERSC for analysis and storage during data collection: up and running. Transfer control is web-based.
  • 18. How do we tackle this at the ALS? 2) Real time: collaboration with NERSC (in development). 1) Data are sent directly to NERSC for analysis and storage during data collection: up and running. Transfer control is web-based.
  • 19. How do we tackle this at the ALS? 2) Real time: collaboration with NERSC (in development). 1) Data are sent directly to NERSC for analysis and storage during data collection: up and running. Transfer control is web-based.
  • 20. How do we tackle this at the ALS? 2) Real time: collaboration with NERSC (in development). 2) Analysis parameters are set up with a web-app: under development.
  • 21. How do we tackle this at the ALS? 2) Real time: collaboration with NERSC (in development). 2) Analysis parameters are set up with a web-app: under development. Jobs are launched manually by the user via the same web page. Test runs indicate an analysis time on the order of the data collection time; the analysis can in principle run synchronously with data collection.
  • 22. How do we tackle this at the ALS? 2) Real time: collaboration with NERSC (in development). 3) Analysis jobs are executed on Carver: under development. Carver is an IBM iDataPlex cluster: - 1202 nodes with a total of 9984 processor cores - 106 Tflop/sec peak performance - largest allocated parallel job is 512 cores.
  • 23. Summary: - Data analysis is the new bottleneck limiting progress in many aspects of experimental mineral physics. - Real-time analysis with immediate feedback is increasingly important in experimental mineral physics. - These challenges cannot always be met with traditional desktop machines: software has to be automated and parallelized, and collaboration with supercomputing centers is becoming important for experimental scientists too (at least for a few more iterations of Moore’s cycle). - Data analysis on supercomputers, remotely controlled with web applications, is a very promising avenue, allowing big-data methods to enter mineral physics. - Future developments may (must?) evolve away from supercomputers to highly parallelized (GPU) local computers and/or cloud computing.
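
The parsing step on slide 14 (distribute the data files across the available CPUs, then write one batch file per CPU) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the ALS script: the function name `partition_frames`, the file-name pattern `frame_00000.tif` and the choice of contiguous, near-equal chunks are all hypothetical. The 48 workers and the 10000-frame dataset size are taken from the cluster description and the speaker notes.

```python
from typing import List


def partition_frames(frames: List[str], n_workers: int) -> List[List[str]]:
    """Split a list of frame files into n_workers contiguous chunks.

    Chunk sizes differ by at most one, so no CPU gets noticeably more
    frames than another. (Hypothetical helper; the real parsing script
    may use a different distribution strategy.)
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    base, extra = divmod(len(frames), n_workers)
    chunks: List[List[str]] = []
    start = 0
    for worker in range(n_workers):
        # The first `extra` workers take one additional frame each.
        size = base + (1 if worker < extra else 0)
        chunks.append(frames[start:start + size])
        start += size
    return chunks


# Assumed example: a 10000-frame map spread over the 48 CPUs of the cluster.
frames = [f"frame_{i:05d}.tif" for i in range(10000)]
chunks = partition_frames(frames, 48)
print(len(chunks), len(chunks[0]), len(chunks[-1]))
```

Each chunk would then be written to its own batch file, one per CPU. Contiguous chunks keep neighbouring map positions on the same worker; a round-robin split would be an equally valid choice if frame processing times vary strongly across the sample.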

Editor's Notes

  1. I would like to start off by giving a brief, slightly personalized historic perspective on the application of X-rays in mineral physics research: X-rays have been applied in Earth Sciences on a routine basis for about 50 years, so this story pretty much parallels my life. In the 60s and 70s, when I was just learning how to spell X-ray, the first automated diffractometers replaced fully manual film techniques…. The brightness of the X-rays available in those days meant that a data collection, whether powder or single crystal, took days or weeks.
  2. This changed most dramatically with the advent of dedicated light sources, in particular high-energy 3rd generation sources such as the ESRF in Grenoble, home of the first dedicated mineral physics beamline, ID30. I had meanwhile managed to spell X-rays and thus was fortunate enough to be involved in the early days of said dedicated beamline. The brilliance of the ID30 undulator enabled experiments through a diamond anvil cell to be performed in a matter of seconds. However, each data point required the physical transport of a 1 x 1 ft image plate to the one and only IP reader on the floor, plus a read-out time of about 45 minutes. Sadly, the tremendous increase in brightness and flux of the X-ray sources could only be utilized in a limited way.
  3. Another twenty years later - the age-appropriate number of light sources meanwhile doesn’t fit on my birthday cake anymore - we hail the advent of ultra-fast and ultra-low-noise direct detection X-ray detectors such as the Perkin-Elmer or Pilatus, which - in principle - allow data-point rates of up to 30 Hz. This leads to the possibility of large data rates. However, our capabilities to deal with these data are largely still on the level of high-end desktops and serial workflow software. The opportunity given to us by the combination of ever brighter light sources and fast detectors, i.e. to apply big-data methods to mineral physics research, can therefore not be fully harnessed.
  4. The way out of this bottleneck is to automate and parallelize the analysis workflow using - at least for the time being - massively parallel supercomputers. This is the approach we are presently taking at the Advanced Light Source in collaboration with the National Energy Research Scientific Computing Center.
  5. Let me quickly give you 3 examples of the order of magnitude of data rates we have to deal with: Intense X-rays and fast detectors, coupled with programmable T and P changes, allow a much denser coverage of the P-V-T surface and thus a much better description of the thermo-elastic properties of Earth materials and their phase transitions….
  6. Mineral physics experiments involving very high temperatures and pressures invariably force us to deal with large spatial and temporal gradients of pressure, temperature and chemical composition. High spatial or temporal resolution is therefore needed to explore these inhomogeneities. Fast detectors and bright X-rays thus allow us to collect spatially and/or temporally highly resolved maps of our sample…..
  7. Going beyond diffraction, various flavors of tomographic techniques now allow us to create 3-dimensional images of samples in- and ex-situ, if needed even with chemical or phase selectivity. Such experiments …..
  8. This solution works fairly well with medium-sized datasets of up to 10000 frames. With larger data volumes and/or tricky data, data analysis even on a 48 CPU cluster can take much more than the data collection