Enabling Real Time Analysis and Decision Making
A Paradigm Shift for Experimental Science
Kerstin Kleese van Dam
Advancing science and technology to
fulfill the DOE Mission
BNL is a Data Driven Science Laboratory
DOE has 27 User Facilities - 6 are operated by BNL.
BNL provides data-rich experimental facilities:
▪ RHIC - Relativistic Heavy Ion Collider - supporting over 1,000 scientists worldwide
▪ NSLS II - the newest and brightest synchrotron in the world, supporting a multitude of scientific research in academia, industry and national security
▪ CFN - Center for Functional Nanomaterials - combines theory and experiment to probe materials
▪ ATF - Accelerator Test Facility
▪ LHC ATLAS - largest Tier 1 center outside CERN
▪ ARM - Atmospheric Radiation Measurement program - partner in the multi-site facility, operating its External Data Center
BNL supports large scale experimental facilities:
▪ Belle II - computing for the Belle II flavor-physics experiment
▪ QCD - computing facilities for the BNL, RIKEN & US QCD communities
Science Today is Data Driven Discovery
(Figure: facility logos - ATLAS, QCD, CFN, NSLS II, RHIC)
BNL Data Statistics
● 2017 milestone: over 100 PB of catalogued data archived long term, with frequent reuse
● 2nd largest scientific archive in the US, 4th largest in the world (after ECMWF, UKMO, NOAA)
● 2016: 400 PB analyzed; 2017: 500 PB expected
● 2016: 37 PB exported
● BNL's PanDA software enabled processing of ~1.6 exabytes of data worldwide in 2016
Where we need Real Time Analysis and Decision Making
6/22/2015
When do we need streaming decision support?
• Data arrives at high velocity and volume
• Critical decisions have to be taken while the data
is still arriving
• Events of interest are rare
• Tacit knowledge is required for accurate
interpretations
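A minimal sketch of what such streaming decision support can look like, assuming a simple rolling-statistics detector. The slide's criteria say nothing about the algorithm, so this is purely illustrative; the class name, window size, and sigma threshold are made up:

```python
# Illustrative only (not BNL's production code): flag rare events
# while data is still arriving, using a rolling mean/std baseline.
from collections import deque
import math

class StreamingDetector:
    def __init__(self, window=100, n_sigma=5.0):
        self.window = deque(maxlen=window)  # recent non-anomalous samples
        self.n_sigma = n_sigma

    def push(self, x):
        """Return True if x is a rare event relative to recent history."""
        flagged = False
        if len(self.window) >= 10:  # need some history before deciding
            mean = sum(self.window) / len(self.window)
            var = sum((v - mean) ** 2 for v in self.window) / len(self.window)
            std = math.sqrt(var)
            if std > 0 and abs(x - mean) > self.n_sigma * std:
                flagged = True
        if not flagged:  # keep the baseline free of outliers
            self.window.append(x)
        return flagged
```

Because decisions are made per-sample as data arrives, a detector like this never needs the full stream in memory, which is the point of the criteria above.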
What does streaming analysis include?
Experimental Control:
• Analyze all the facts
as the situation is
evolving
• Consider background
information
• Evaluate alternatives
• Request more
information if
needed
• Set priorities
• Take decisions
Example 1: CFN TEM - A Widely Used Research Tool
• In Transmission Electron Microscopy (TEM), a beam of electrons is transmitted through an ultra-thin specimen, interacting with the specimen as it passes through.
• ~500 aberration-corrected (S)TEMs worldwide, with ~20-30 in National Laboratories
• In-situ observations generate from 10 GB to 10s of TB of data per experiment (e.g. at BNL), and volumes are still growing, at rates ranging from 100 images/s for basic instruments to 400 images/s (3 GB/s) for the state-of-the-art systems at BNL
• Real time analysis needed to steer the experiment
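A quick sanity check on the quoted rates (illustrative arithmetic only): 3 GB/s at 400 images/s implies about 7.5 MB per image, and an hour-long run at full rate produces roughly 10.8 TB, consistent with the 10s-of-TB-per-experiment figure:

```python
# Back-of-envelope check of the TEM data rates quoted above
# (illustrative arithmetic only).
rate_images = 400   # images per second, state-of-the-art at BNL
rate_bytes = 3e9    # aggregate rate: 3 GB/s

bytes_per_image = rate_bytes / rate_images  # 7.5e6 bytes ~ 7.5 MB/image
one_hour_tb = rate_bytes * 3600 / 1e12      # ~10.8 TB per hour-long run
```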
Nanoparticle detection in TEM images
• 400 frames/s
• 3 GB/s
• High noise
• Convolutional Neural Network (CNN) based deep learning framework: U-Net
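The detector on this slide is a U-Net CNN, which is beyond a short sketch. But the post-processing step that turns a per-pixel score map (such as U-Net output) into discrete particles can be illustrated with thresholding plus connected-component flood fill. This is a simplified stand-in, not the actual BNL pipeline:

```python
# Simplified stand-in for detection post-processing: threshold a
# 2D per-pixel score map and group touching pixels into particles
# via iterative flood fill. Illustrative only.
def detect_particles(score_map, threshold=0.5):
    """Return a list of particle pixel sets from a 2D score map."""
    h, w = len(score_map), len(score_map[0])
    seen = [[False] * w for _ in range(h)]
    particles = []
    for i in range(h):
        for j in range(w):
            if score_map[i][j] >= threshold and not seen[i][j]:
                stack, blob = [(i, j)], set()
                seen[i][j] = True
                while stack:
                    y, x = stack.pop()
                    blob.add((y, x))
                    # 4-connected neighbors above threshold join the blob
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                                and score_map[ny][nx] >= threshold):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                particles.append(blob)
    return particles
```

Particle counts and the size distribution shown on the next slide then follow directly from `len(particles)` and the blob sizes.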
TEM Detection results
(Figure: original frames, detection results, size distribution)
Protein structure detection in cryo-EM images
• Size: 4096×4096
• ~120 GB for one protein
• High noise
(Figure: cryo-EM images and protein detection result)
Protein structure detection in cryo-EM images (cont.)
Recall, precision and F1 for scenarios S1-S5 (columns 0.1-0.9 as in the original figure):

Recall
|    | 0.1  | 0.2  | 0.3  | 0.4  | 0.5  | 0.6  | 0.7  | 0.8  | 0.9  |
|----|------|------|------|------|------|------|------|------|------|
| S1 | 0.78 | 0.67 | 0.47 | 0.16 | 0.03 | 0.01 | 0.01 | 0.00 | 0.00 |
| S2 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.93 | 0.58 |
| S3 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.98 | 0.92 | 0.57 |
| S4 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.90 | 0.41 |
| S5 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.93 | 0.58 |

Precision
|    | 0.1  | 0.2  | 0.3  | 0.4  | 0.5  | 0.6  | 0.7  | 0.8  | 0.9  |
|----|------|------|------|------|------|------|------|------|------|
| S1 | 0.89 | 0.77 | 0.55 | 0.18 | 0.03 | 0.01 | 0.00 | 0.00 | 0.00 |
| S2 | 0.86 | 0.86 | 0.86 | 0.86 | 0.86 | 0.86 | 0.85 | 0.80 | 0.50 |
| S3 | 0.86 | 0.86 | 0.86 | 0.85 | 0.85 | 0.85 | 0.84 | 0.79 | 0.49 |
| S4 | 0.83 | 0.83 | 0.83 | 0.83 | 0.82 | 0.82 | 0.82 | 0.75 | 0.34 |
| S5 | 0.82 | 0.82 | 0.82 | 0.82 | 0.82 | 0.82 | 0.81 | 0.76 | 0.48 |

F1
|    | 0.1  | 0.2  | 0.3  | 0.4  | 0.5  | 0.6  | 0.7  | 0.8  | 0.9  |
|----|------|------|------|------|------|------|------|------|------|
| S1 | 0.83 | 0.72 | 0.51 | 0.17 | 0.03 | 0.01 | 0.00 | 0.00 | 0.00 |
| S2 | 0.92 | 0.92 | 0.92 | 0.92 | 0.92 | 0.92 | 0.91 | 0.86 | 0.54 |
| S3 | 0.92 | 0.92 | 0.92 | 0.92 | 0.92 | 0.92 | 0.90 | 0.85 | 0.53 |
| S4 | 0.90 | 0.90 | 0.91 | 0.90 | 0.90 | 0.90 | 0.89 | 0.82 | 0.37 |
| S5 | 0.90 | 0.90 | 0.90 | 0.90 | 0.90 | 0.90 | 0.89 | 0.84 | 0.53 |
S1: Directly using the pre-trained model
S2: Trained with random initialization, 54,000 samples
S3: Trained with pre-trained model, 54,000 samples
S4: Trained with random initialization, 1,000 samples
S5: Trained with pre-trained model, 1,000 samples
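The F1 values above are the harmonic mean of the corresponding precision and recall entries; e.g. S1 in the first column (P = 0.89, R = 0.78) gives F1 ≈ 0.83:

```python
# F1 is the harmonic mean of precision and recall.
def f1_score(precision, recall):
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```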
Example 2: NSLS II Light Source (1)
• Brightest light source in the world
• 10,000 times brighter than the original NSLS
• 2,000 users per year (4,000 in FY17)
• At full build-out, 60 unique beamlines, each with a variety of instrument configurations
• Enabling solutions to pressing energy challenges, e.g.:
  • Advanced electrical storage
  • High-temperature superconductors for the electric grid
  • Fuel cells based on nanocatalysts
  • Plant/environment interactions
Example 2: NSLS II Light Source (2)
• NSLS II Coherent Hard X-Ray (CHX) beamline: 4.5 GB/s sustained data rates
• NSLS II Hard X-Ray Nanoprobe (HXN): 50 GB/s sustained, 1-5 TB/s in bursts
• In-situ and in-operando experiments:
  • Observing and guiding physical, chemical and biological transformations as they happen
  • Observing these processes in working objects, e.g. batteries, microchips
• Real time analysis needed to steer the experiments
CHX - Healing X-ray Scattering Images
Scientific Achievement
A “physics-aware” algorithm was developed to heal
defects in x-ray scattering datasets
Significance and Impact
Healing x-ray data allows rapid automated analysis, and is a useful pre-processing step for machine learning
Research Details
Modern x-ray synchrotrons generate data at a massive rate;
however, defects in the data (gaps, artifacts) complicate
automated analysis
Traditional ‘inpainting’ fails on scientific data
By exploiting the known physics of x-ray scattering, a tuned
healing algorithm was developed; for instance, symmetry is
recognized and exploited
Healed images can be more easily analyzed using existing
data pipelines, including modern machine-learning methods
The image healing algorithm also outputs physically-useful
information (symmetry, type of ordering, etc.)
Jiliang Liu, et al., IUCrJ 4, 455 (2017).
Healing X-ray Images: X-ray diffraction/scattering images typically have
defects, including missing data and artifacts, which complicate subsequent
analysis. However, traditional image healing algorithms do not interpolate
scientific data in a physically-meaningful way. A “physics-aware” inpainting
method was developed, allowing x-ray scattering images to be healed in a
physically-rigorous way.
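The published algorithm (Liu et al., IUCrJ 2017) is considerably more sophisticated, but the core symmetry idea can be illustrated with a toy sketch that fills masked detector pixels from their point-symmetric counterparts about the beam center, since scattering patterns obey I(q) ≈ I(-q) to good approximation. Function and argument names here are illustrative, not from the paper:

```python
# Toy illustration of symmetry-based healing (NOT the published
# algorithm): fill masked pixels of a detector image from their
# point-symmetric counterparts about the beam center.
def heal_by_symmetry(image, mask, center):
    """image: 2D list of floats; mask: 2D list, True = missing pixel."""
    cy, cx = center
    h, w = len(image), len(image[0])
    healed = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                sy, sx = 2 * cy - y, 2 * cx - x  # point reflection
                # copy only if the mirror pixel exists and is valid
                if 0 <= sy < h and 0 <= sx < w and not mask[sy][sx]:
                    healed[y][x] = image[sy][sx]
    return healed
```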
CHX - Deep Learning Classification of X-ray Scattering Data
Scientific Achievement
Modern deep learning methods were applied to the automated recognition and classification of x-ray scattering data
Significance and Impact
Experimental scientific data can be automatically and accurately classified, allowing autonomous experimental decision-making
Research Details
Modern x-ray synchrotrons generate data at a massive rate; automated data classification is critically required to deal with these data rates
Modern deep learning methods were used to recognize and classify x-ray scattering data
Neural network training was augmented by inputting a large corpus of synthetic images; because the experimental physics are understood, these images are highly realistic and allow the network to implicitly learn the underlying physics
Boyu Wang, et al., New York Scientific Data Summit, 1 (2016).
Boyu Wang, et al., Applications of Computer Vision (WACV), 1, 697 (2017).
Jiliang Liu, et al., IUCrJ 4, 455 (2017).
Deep learning for scientific data: (Top) A convolutional neural network is used to convert input x-ray scattering images into a concise descriptor that can be used for classification. (Bottom) X-ray scattering images are complex and diverse. Owing to the limited amount of tagged experimental data, we generate highly realistic synthetic data to train the CNN.
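As a hedged illustration of the synthetic-data idea, one can generate noisy ring-shaped patterns of the kind that dominate x-ray scattering images. The parameters and the single-Gaussian-ring model are made up for this sketch; the real training corpus comes from far more detailed physics-based simulation:

```python
# Illustrative generator of synthetic ring-shaped "scattering"
# images for augmenting CNN training data (parameters made up).
import math
import random

def synthetic_ring_image(size=64, q_radius=20.0, width=2.0,
                         noise=0.05, seed=0):
    """Gaussian ring at radius q_radius plus Gaussian pixel noise."""
    rng = random.Random(seed)
    c = size / 2.0
    img = []
    for y in range(size):
        row = []
        for x in range(size):
            r = math.hypot(y - c, x - c)
            signal = math.exp(-((r - q_radius) ** 2) / (2 * width ** 2))
            row.append(signal + rng.gauss(0.0, noise))
        img.append(row)
    return img
```

Sweeping `q_radius`, `width`, and `noise` yields labeled images across the pattern class, which is how a synthetic corpus teaches the network the underlying parameter dependence.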
CHX - Multi-level Visual Data Exploration and Analysis for Heterogeneous Scientific Images
Scientific Achievement
A multi-level exploration and visualization tool for scientific image datasets.
Significance and Impact
Solves the challenge of exploring and visualizing an extremely large image set in which each image has a number of heterogeneous attributes.
Research Details
Presents a level-of-detail data exploration mechanism to interactively zoom in and out across multiple layers of representation of the image set: attribute layer, raw image layer and pixel layer.
Supports attribute comparison and correlation analysis.
Enables attribute statistics display and cross-filter operations to interactively select the data of interest.
Level-of-detail data exploration mechanism: we design a data exploration approach, similar to Google Maps, that allows users to iteratively zoom across multiple layers of the information associated with the image set: 0 - the scatterplot of the whole dataset with chosen X, Y axes and one attribute color coded; 1 - the zoomed-in view of the selected images with a pseudocolor plot showing one attribute; 2 - a further zoom of 1; 3 - the matrix of the raw images from which the attributes in 2 are generated; 4 - a single raw image; 5 - the pixel values of a selected region in 4.
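A hypothetical sketch of the zoom-level dispatch implied by the six layers above; the mapping, function name, and clamping behavior are assumptions for illustration, not the tool's API:

```python
# Hypothetical dispatch from a continuous zoom factor to the
# discrete representation layers described above.
LAYERS = {
    0: "scatterplot of whole dataset (one attribute color coded)",
    1: "pseudocolor plot of selected images",
    2: "finer pseudocolor plot",
    3: "matrix of raw images",
    4: "single raw image",
    5: "pixel values of selected region",
}

def layer_for_zoom(zoom, max_layer=5):
    """Clamp a continuous zoom factor into the discrete layer index."""
    return max(0, min(max_layer, int(zoom)))
```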
XANES - Solving the Structure of Nanoparticles by Machine Learning
Scientific Achievement
A new method was developed to decipher the structure of nanoparticle catalysts on-the-fly from their X-ray absorption spectra.
Significance and Impact
Tracking the structure of catalysts in real working
conditions is a challenge due to the paucity of
experimental techniques that can measure the number
of atoms and the distance between metal atoms with
high precision. Accurate structural analysis at the
nanoscale, now possible with the new method in the
harsh conditions of high temperature and pressure,
will enable new possibilities for tuning up reactivity of
catalysts at the high flux and high energy resolution
beamlines of NSLS-II and other advanced synchrotron
facilities.
Research Details
Using machine learning methods, previously "hidden" relationships between the features in the X-ray absorption spectra and the geometry of nanocatalysts were found.
J. Timoshenko, D. Lu, Y. Lin, A. I. Frenkel, J. Phys. Chem. Lett., DOI: 10.1021/acs.jpcc.7b06270
A sketch of the new method that enables fast, “on-the-fly”
determination of three-dimensional structure of nanocatalysts.
The colored curves are synchrotron X-ray absorption spectra
collected in real time, during catalytic reaction (in operando).
The neural network (white circles and lines) converts the
spectra into geometric information (such as nanoparticle sizes
and shapes) and the structural models are obtained for each
spectrum.
Computational Modeling Infrastructure
- Multi-Source Data Analysis
Scientific Goal
• Operational integration of multi-modal experimental and
numerical modeling results
Achievement
• Software DiffPy-CMI - determines stable structures for
complex materials using results from different
experimental modalities in combination with ab initio and
atomistic scale modeling (MatDeLab, NWChem)
Impact
• DiffPy-CMI has supported 20 materials studies since its first release
  - 2014 release downloaded ~1,300 times (several hundred users)
  - 2016 release downloaded ~130 times (~50 users)
• In use at NSLS-II XPD beamline
• Similar approaches under development, e.g., at the
NSLS-II ISS beamline to interpret in operando device
performance behavior; also of interest to NSLS-II FMX,
AMX, and NYX beamlines
www.diffpy.org
Results from complex modeling study
published in Nature Communications
XPCS - Python Multiprocessing for X-ray Photon Correlation Spectroscopy
Scientific Achievements
Using the Python multiprocessing module, we have implemented a parallel version
of the analysis code for X-ray Photon Correlation Spectroscopy (XPCS) at BNL’s NSLS
II. The implementation achieves up to 75% efficiency compared to ideal speedup for
the test cases we studied.
Significance and Impact
X-ray photon correlation spectroscopy is an important technique to study the nano-
and meso-scale dynamics of materials. The high resolutions and high incoming image
rates expected at the NSLS II beamlines make it essential to have a software toolkit
that can process the image data as fast as possible. The parallel implementation of
the Python XPCS analysis code is the first step towards the efficient use of the
available computing resources at the beamlines and other institutional computing
resources.
Research Details
The XPCS analysis code is implemented within the scikit-beam library being developed at BNL.
The Python multiprocessing module is used, with different processes computing the time correlations of different regions of interest (ROIs).
A first-in-first-out (FIFO) queue is used to consolidate results from the different processes.
S. K. Abeykoon, M. Lin and K. K. van Dam, Proceedings of New York Scientific Data Summit 2017
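The parallel pattern described above (one task per ROI, results consolidated through a FIFO queue) can be sketched with the standard library's multiprocessing module. The g2 computation below is a placeholder lag-1 autocorrelation, not the scikit-beam multi-tau algorithm, and all names are illustrative:

```python
# Sketch of the parallelization pattern described above: one process
# per region of interest (ROI), results consolidated via a queue.
from multiprocessing import Process, Queue

def fake_g2(intensities):
    """Placeholder lag-1 time autocorrelation for one ROI's signal."""
    mean = sum(intensities) / len(intensities)
    num = sum(a * b for a, b in zip(intensities, intensities[1:]))
    return num / ((len(intensities) - 1) * mean * mean)

def worker(roi_id, intensities, out_queue):
    out_queue.put((roi_id, fake_g2(intensities)))

def parallel_xpcs(rois):
    """rois: dict mapping roi_id -> list of per-frame mean intensities."""
    q = Queue()
    procs = [Process(target=worker, args=(rid, ts, q))
             for rid, ts in rois.items()]
    for p in procs:
        p.start()
    # one queue entry per process; get() blocks until each arrives
    results = dict(q.get() for _ in procs)
    for p in procs:
        p.join()
    return results
```

Per-ROI tasks are independent, so this decomposition scales with the number of ROIs until the queue consolidation or I/O becomes the bottleneck, which is consistent with the ~75% parallel efficiency reported above.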
Who we are
Computational Science Initiative
• Established in 2015
• An umbrella to bring together computing
and data expertise across the lab
• Aims to foster cross-disciplinary
collaborations in areas of computational
sciences, applied math, computer science
and data analytics.
• Aims to drive developments in
programming models, advanced
algorithms and novel computer
architectures to advance scientific
computing and data analysis.
Director:
Kerstin Kleese van Dam
Deputy Director:
Francis Alexander
Computational Science Initiative departments (including Computing for National Security)
Directors: Barbara Chapman, Eric Lancon, Shantenu Jha, Nick D'Imperio, Adolfy Hoisie
Key research initiatives: Making Sense
of Data at Exabyte-Scale and Beyond
Real-Time Analysis of Ultra-High Throughput Data Streams
Integrated, extreme-scale machine learning and visual analytics methods for real time analysis and
interpretation of streaming data.
New in situ and in operando experiments at large scale facilities (e.g., NSLS-II, CFN and RHIC)
Data intensive science workflows made possible by the Exascale Computing Project
Analysis on the Wire
Autonomous Optimal Experimental Design
Goal-driven capability that optimally leverages theory, modeling/simulation and experiments for the
autonomous design and execution of experiments
Complex Modeling Infrastructure
Interactive Exploration of Multi-Petabyte Scientific Data Sets
Common in nuclear physics, high energy physics, computational biology and climate science
Integrated research into the required novel hardware, system software, programming models, analysis
and visual analytics paradigms
Thank You!
