High performance computing and spatial data:
an overview of recent work at NCAR
Dorit Hammerling
Analytics and Integrative Machine Learning Group
Technology Development Division
National Center for Atmospheric Research (NCAR)
Joint work with Sophia Chen, Joseph Guinness, Marcin Jurek, Matthias Katzfuss,
Daniel Milroy, Douglas Nychka, Vinay Ramakrishnaiah, Yun Joon Soon and Brian
Vanderwende
February 13, 2018
Outline
1 Introduction and Motivation
2 Application benchmarking study example
3 Other examples and future work
Introduction and Motivation
The National Center for Atmospheric Research (NCAR)
• A federally funded research and development center
• Mission: To understand the behavior of the atmosphere and related
Earth and geospace systems
Introduction and Motivation
NCAR’s Community Earth System Model
• a “virtual laboratory” to study past, present and future climate states
• describes interactions of the atmosphere, land, river runoff, land-ice,
oceans and sea-ice
• complex! Large code base: approx. 1.5 million lines of code
Earth System Models
• Computationally very demanding: differential equations are solved for
millions of grid boxes. → Requires HPC infrastructure.
• Approximately 200 variables, many in 3-D, are saved to describe the
state of the atmosphere, land, river runoff, land-ice, oceans and
sea-ice for every grid cell. → Massive amounts of data and storage
requirements, lots of science questions.
Work with scientists to gain insights from massive data sets, ideally
without moving the data.
Analyzing large spatial data: initial considerations
• What is the scientific question and what statistical or machine
learning modeling framework can we use to answer it?
• Is the analysis inherently parallel or does the model allow for
parallelization?
• What software? Where and how is the data stored?
• How can we optimize the execution on our HPC infrastructure?
Application benchmarking study example
Study of Precipitation extremes: a typical example
• daily data for 35 years: 12,775 values per grid cell
• 288 longitudes × 192 latitudes: 55,296 grid cells
• 12,775 × 55,296 = 706,406,400 data points (2.83 GB)
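As a quick check on these numbers, the arithmetic is easy to reproduce in R; the 2.83 GB figure works out if the values are stored as 4-byte single-precision floats (an assumption here, not stated on the slide):

nTime <- 35 * 365        # daily data for 35 years (ignoring leap days): 12,775
nCell <- 288 * 192       # longitudes x latitudes: 55,296 grid cells
nTime * nCell            # 706,406,400 data points
nTime * nCell * 4 / 1e9  # approx. 2.83 GB if stored as 4-byte floats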
Fitting a Generalized Pareto distribution
• This is a complementary approach to block maxima for Extreme
Value Analysis
• For data above a given threshold (µ), fit a probability density of the form
f(x) = (1/σ) [1 + ξ(x − µ)/σ]^(−(1/ξ + 1)) for x ≥ µ.
• σ – scale parameter, ξ – shape parameter
• We are ignoring all the data below the threshold to just fit the tail.
• Having selected the threshold, estimate σ and ξ by maximum
likelihood.
Fitting a Generalized Pareto distribution: R code
library(extRemes)         # provides fevd() and return.level()
tailProb <- 0.01          # tail probability used in extremes fitting
returnLevelYear <- 100    # years used for the return level
Y <- dataset[lonindex, latindex]   # extract the daily series for one grid cell
threshold <- quantile(Y, 1 - tailProb)
frac <- sum(Y > threshold) / length(Y)
GPFit <- fevd(Y, threshold = threshold, type = "GP", method = "MLE")
ReturnLevel <- return.level(GPFit, returnLevelYear, do.ci = TRUE)
Depending on your machine, this takes somewhere between 0.3 and 1 second per grid cell.
Why use HPC systems for statistical computing?
Doing repetitive tasks can take a lot of time.
Even short tasks add up quickly:
• 0.33 seconds for one location corresponds to approx. 5 hours for
55,000 locations.
• 1 second for one location corresponds to approx. 15 hours for 55,000
locations.
And that is for a single data set. Often we want to analyze hundreds of
data sets and test different parameters.
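These back-of-the-envelope figures follow directly from the per-fit times; a minimal check in R (the per-fit times are the rough range quoted above, not new measurements):

perFit <- c(0.33, 1.0)    # seconds per grid cell (rough range)
nCell <- 55000            # approximate number of grid cells
perFit * nCell / 3600     # roughly 5 and 15 hours of serial compute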
NCAR’s high performance computing (HPC) systems
Yellowstone (previous system, decommissioned at the end of 2017):
• 1.5 petaflops peak
• 72,256 cores
• 145 TB total memory
• 56 Gb/s interconnects
Cheyenne (new system, an evolutionary increase):
• 5.34 petaflops peak
• 145,152 cores
• 313 TB total memory
• 100 Gb/s interconnects
Cores and nodes on HPC systems
• Usually cores on one node share memory (cache).
• Memory between nodes is typically not shared, but can be accessed.
• Understanding the basics of the architecture and interconnects can be
really helpful!
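A minimal R sketch of the distinction (assuming a Unix-like compute node; mclapply forks workers that share the node's memory, while spanning multiple nodes requires an explicit cluster such as the MPI setup shown later):

library(parallel)
detectCores()   # number of cores visible on this node
# shared-memory parallelism within a single node (fork-based)
res <- mclapply(1:8, function(i) i^2, mc.cores = 4)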
Relevant details: Memory and parallelization tools
Memory available on compute nodes
Two classes of nodes on Cheyenne:
• Standard nodes have 64 GB of memory (46 GB usable).
• Large memory nodes with 128 GB of memory (110 GB usable).
• Data Analysis cluster: 1TB (!) of memory (1000 GB usable)
• But very different network architecture, not meant for working across
nodes!
You need to know what is installed and how it is configured!
• Rmpi:
• Limits on workers? (see the sketch after this list)
• What physical interconnect is it using?
• MATLAB Distributed Computing Server
• Spark for Python or Scala
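For example, with Rmpi one way to check the worker limit in practice is to query how many MPI slots the batch job actually provides before requesting workers; a small sketch (assumes Rmpi is installed and the R session was launched under MPI):

library(Rmpi)
mpi.universe.size()   # total MPI slots the batch job provides
# request one fewer worker than slots so the master keeps its own process
nWorkers <- mpi.universe.size() - 1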
Application benchmarks
Even if one knows the architecture very well and has data on low-level
benchmarks, application benchmarking is critical.
Application benchmarking: benchmarking that uses code as close as
possible to the real production code (including I/O operations!).
For our application we use a quarter (approximately 14,000 grid cells) of
the full data for initial benchmarking.
Double for loop: code sketch
# outer loop over latitude
for (latindex in 1:dim(lat)) {
# inner loop over longitude
for (lonindex in 1:dim(lon)) {
Y<- dataset[latindex,lonindex,] # extract data
......
EXTREME VALUE ANALYSIS (EVA) CODE
outSummary[latindex,lonindex,]<-EVA RESULTS
......
print(lonindex) # Counter
} # end of inner loop
print(latindex) # Counter
} # end of outer loop
Setup for Application benchmarking
Experimental design:
• What kind of cluster, i.e., communication protocol and network?
• What loop to parallelize: inner or outer? Or nested?
• What to put in the inner/outer loop: latitude or longitude?
• How to read in the data? All at once, one latitude/longitude band at
a time, one grid cell at a time?
• How does the application scale with more cores and nodes?
Additional consideration:
• Do we want our code to run as fast as possible or as efficiently as
possible? Total execution time vs. time per core?
Physical networks and communication protocols
• TCP/IP is the protocol on which the internet is based. Connections
can be high bandwidth but also high latency, partially because the
protocol is designed to work in lossy networks. Logical endpoints of
TCP/IP connections are called sockets.
• Ethernet is a physical network designed to support TCP/IP
connections.
• InfiniBand is a physical network designed to support MPI message
passing. The physical connections are very high bandwidth, very low
latency, and very expensive.
• MPI (Message Passing Interface) is a library written to enable
message passing on compute clusters. It employs algorithms that
optimize communication efficiency and speed on clusters. It works
best with a high bandwidth, low latency and near lossless physical
network, but can work on Ethernet via TCP/IP as well.
Remote Direct Memory Access (RDMA)
• RDMA allows data to be written to and read from other nodes’ memory.
• In a sense, the nodes behave like a single aggregate node.
Cluster setup in R
To run foreach in parallel, a cluster needs to be set up.
• Starting an R PSOCK cluster sets up TCP/IP connections (without
RDMA) across the Ethernet network (and can be tricked to work over
InfiniBand).
• Starting an MPI cluster sets up MPI communications with RDMA
across the InfiniBand network.
library(Rmpi)
library(doMPI)
##### Cluster setup #####
cl <- startMPIcluster(numCores) # Create MPI cluster
registerDoMPI(cl) # Register parallel backend for foreach
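For comparison, the PSOCK (socket) alternative mentioned above could be set up as in the following sketch; the host names and counts are placeholders, not our actual configuration:

library(doParallel)
# PSOCK: one R worker per requested core, connected via TCP/IP sockets
hosts <- rep(c("node1", "node2"), each = 16)   # hypothetical host list
cl <- makePSOCKcluster(hosts)
registerDoParallel(cl)   # register parallel backend for foreach
# ... run the foreach loops ...
stopCluster(cl)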
Parallelizing for loops: the “foreach” package
• foreach provides a looping construct built around the %do% / %dopar%
binary operators, which can easily be parallelized.
• a hybrid between a standard for loop and lapply: it evaluates an
expression rather than applying a function (as lapply does).
• returns a value rather than “causing side-effects”.
• needs a parallel backend, most commonly doParallel
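A minimal, self-contained illustration of the construct with a doParallel backend on a single node (the toy computation is only for illustration):

library(doParallel)
cl <- makeCluster(2)   # two local workers
registerDoParallel(cl)
# foreach evaluates the expression once per index and combines the results
squares <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2
}
squares        # 1 4 9 16
stopCluster(cl)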
Code sketch for inner loop parallelization
library(doParallel) # loads foreach, parallel and iterators
# outer loop over latitude
for (latindex in 1:numLat) {
dataset <- getData(latindex) # load data for a specific latitude
# inner loop over longitude (executed in parallel)
outSummary[latindex,,]<- foreach (lonindex = 1:dim(lon),
.combine=rbind,.packages=c("extRemes")) %dopar% {
Y<- dataset[lonindex,] # extract Y for specific longitude
......
EXTREME VALUE ANALYSIS (EVA) CODE
......
c(threshold,GPFit$results$par,frac = frac,ReturnLevel )
#print(lonindex) # Counter DON’T use in parallel execution
}
print(latindex) # Counter to monitor progress
}
One latitude at a time: inner loop over longitude [figure]
One longitude at a time: inner loop over latitude [figure]
Code sketch for outer loop parallelization
# outer loop over latitude (executed in parallel)
outSummary <- foreach(latindex = 1:numLat, lat_count = icount(),
.combine = rbind,.packages=c("extRemes", "ncdf4","foreach",
"iterators")) %dopar% {
dataset <- getData(latindex)
# inner loop over longitude (executed sequentially)
foreach(lonindex = 1:dim(lon), lon_count = icount(),
.combine = rbind,.packages=c("extRemes", "foreach")) %do% {
Y<- dataset[lonindex,]
......
EXTREME VALUE ANALYSIS (EVA) CODE
......
c(threshold,GPFit$results$par,frac = frac,ReturnLevel)
}
}
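Not shown in the sketch is the corresponding cleanup once the loops finish; with the doMPI/Rmpi setup from the cluster-setup slide, this would typically be:

closeCluster(cl)   # shut down the doMPI cluster workers
mpi.quit()         # finalize MPI and exit the R session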
Options of reading in the data
• All the data at once
• One latitude or longitude band at a time
• One grid cell at a time
Trade-off between the number of I/O calls and filling up memory.
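With the data in NetCDF (the code sketches load the ncdf4 package), the three strategies differ only in the start/count arguments passed to ncvar_get; a hedged sketch, assuming a (lon, lat, time) dimension ordering and with placeholder file and variable names:

library(ncdf4)
nc <- nc_open("precip_daily.nc")   # placeholder file name
# (1) all the data at once: one I/O call, largest memory footprint
allData <- ncvar_get(nc, "PRECT")  # placeholder variable name
# (2) one latitude band at a time: one call per band
band <- ncvar_get(nc, "PRECT", start = c(1, latindex, 1), count = c(-1, 1, -1))
# (3) one grid cell at a time: many small calls, minimal memory
cell <- ncvar_get(nc, "PRECT", start = c(lonindex, latindex, 1), count = c(1, 1, -1))
nc_close(nc)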
Application Benchmarking Results
Experiment 1: Inner Loop Parallelization Across Longitude for 48
Latitudes reading in the data by latitudinal band
Cluster numCores ptile numNodes Run Time (seconds)
IPoIB 16 16 1 375
IPoIB 32 16 2 523
IPoIB 48 16 3 543
mpi 16 16 1 2955
mpi 32 16 2 3206
mpi 48 16 3 3277
Application Benchmarking Results cont.
Experiment 2: Inner Loop Parallelization Across Longitude for 48
Latitudes reading all the data at once
Experiment 3: Inner Loop Parallelization Across Latitude for 48
Longitudes reading in the data by longitudinal band
Experiment 4: Inner Loop Parallelization Across Latitude for 48
Longitudes reading all the data at once
• Results equivalent or worse for experiments 2 through 4.
• The inner loop parallelization does NOT scale across nodes and can
run out of memory when reading in all the data at once.
• R does not have variable slicing, meaning each worker needs to be
sent the full data set and then extracts its worker-specific portion.
[Different in Matlab! A chunking sketch using iterators follows below.]
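One common workaround, not used in the benchmarks above but sketched here under the assumption that dataset is a matrix with one row per spatial location and that an MPI cluster is already registered, is to hand each worker only its chunk via an iterator, so the full array is never shipped to every worker:

library(doMPI)   # also attaches foreach, iterators and Rmpi
# iterate over rows, so each task receives a single band rather than the full data
outSummary <- foreach(Yband = iter(dataset, by = "row"),
                      .combine = rbind) %dopar% {
  quantile(Yband, 0.99)   # stand-in for the EVA code applied to one band
}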
Application Benchmarking Results cont.
Experiment 5: Outer Loop Parallelization Across Latitude for 48 Latitude
Values reading in the data by latitudinal band
Cluster numCores ptile numNodes Run Time (seconds)
IPoIB 16 16 1 198
IPoIB 32 16 2 172
IPoIB 48 16 3 101
mpi 16 16 1 208
mpi 32 16 2 129
mpi 48 16 3 80
Much better scaling than inner loop parallelization.
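These timings make the earlier speed-versus-efficiency question concrete; a small R sketch computing speedup and parallel efficiency from the mpi rows of the table above:

cores <- c(16, 32, 48)
seconds <- c(208, 129, 80)             # Experiment 5, mpi cluster
speedup <- seconds[1] / seconds        # relative to the 16-core run: 1.0, 1.6, 2.6
efficiency <- speedup / (cores / 16)   # fraction of ideal scaling: 1.00, 0.81, 0.87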
Application Benchmarking Results cont.
Experiment 7: Outer Loop Parallelization Across Longitude for 72
Longitude Values reading in the data by longitudinal band
Cluster numCores ptile numNodes Run Time (seconds)
IPoIB 16 16 1 214
IPoIB 32 16 2 152
IPoIB 48 16 3 116
mpi 16 16 1 209
mpi 32 16 2 146
mpi 48 16 3 101
Much better scaling than inner loop parallelization.
Application Benchmarking Results cont.
Experiment 5: Outer Loop Parallelization Across Latitude for 48 Latitude
Values reading in the data by latitudinal band
Cluster numCores ptile numNodes Run Time (seconds)
mpi 16 16 1 208
mpi 32 16 2 129
mpi 48 16 3 80
Experiment 7: Outer Loop Parallelization Across Longitude for 72
Longitude Values reading in the data by longitudinal band
Cluster numCores ptile numNodes Run Time (seconds)
mpi 16 16 1 209
mpi 32 16 2 146
mpi 48 16 3 101
Outer loop parallelization over latitude somewhat faster.
Application Benchmarking Results cont.
Experiment 5: Outer Loop Parallelization Across Latitude for 48 Latitude
Values reading in the data by latitudinal band
Cluster numCores ptile numNodes Run Time (seconds)
mpi 16 16 1 208
mpi 32 16 2 129
mpi 48 16 3 80
Experiment 6: Outer Loop Parallelization Across Longitude for 72
Longitude Values reading in the data only once
Cluster numCores ptile numNodes Run Time (seconds)
mpi 16 16 1 220
mpi 32 16 2 123
mpi 48 16 3 93
Data read options provide similar results for this data size.
Application Benchmarking Results for FULL data
Experiment 5: Outer Loop Parallelization Across Latitude for 192 Latitude
Values reading in the data by latitudinal band
Cluster numCores ptile numNodes Run Time (seconds)
IPoIB 16 16 1 719
IPoIB 32 16 2 521
IPoIB 48 16 3 352
IPoIB 64 16 4 279
IPoIB 80 16 5 252
IPoIB 96 16 6 192
IP (Socket) clusters have a limit of 128 workers within R.
Application Benchmarking Results for FULL data
Experiment 5: Outer Loop Parallelization Across Latitude for 192 Latitude
Values reading in the data by latitudinal band
Cluster numCores ptile numNodes Run Time (seconds)
mpi 16 16 1 658
mpi 32 16 2 354
mpi 48 16 3 233
mpi 64 16 4 206
mpi 80 16 5 187
mpi 96 16 6 154
mpi 192 16 12 88
Fastest setup overall. About one and a half minutes for the entire data set.
Technical Report, Data and Code available
Other examples and future work
Think globally - act locally (Doug Nychka’s latest work)
• A global statistical model for a spatial field provides seamless
inference across a spatial domain
• A local analysis of spatial data avoids large memory requirements and
simplifies parallel computation
Goal is to combine these two ideas.
• Compute on local neighborhoods of the spatial field but assemble the
results into a global model.
• The local computations are embarrassingly parallel and so easily scale
to many cores.
Emulation of model output
• Pattern scaling is based on a linear relationship between local
temperatures and the global mean.
• Derived from a long coupled climate model run.
[Figures: mean scaling pattern; variation in 8 pattern scaling fields]
Goal: Simulate additional fields cheaply that reflect
the properties of the ensemble.
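Since pattern scaling is a linear relationship between local temperature and the global mean, the pattern itself can be estimated by a per-grid-cell regression; a minimal sketch (array names and shapes are illustrative, not the actual CESM processing):

# localT: matrix of local annual-mean temperatures, nYear x nCell
# globalT: vector of global-mean temperatures, length nYear
pattern <- apply(localT, 2, function(y) coef(lm(y ~ globalT))[2])
# one scaling slope per grid cell, i.e. the scaling pattern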
Parameters of local spatial models
[Maps of local parameter estimates: Range (degrees), Sigma (C), Tau (C)]
• Parallel fits using Rmpi on moving 11×11 pixel windows
• Demonstrated linear scaling to at least 1000 cores
• Highly nonstationary!
Emulation of the model output
• Encode the local parameter estimates into a global Markov Random
field model (LatticeKrig).
• Fast simulation due to sparsity of the LatticeKrig precision matrix and
basis functions
[Figure: top row, 4 model fields; bottom row, 4 simulated fields]
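The speed here comes from the standard trick of simulating a Gaussian field through the Cholesky factor of its precision matrix, which stays cheap when the precision is sparse; a generic base-R illustration of the idea (a toy tridiagonal precision, not the LatticeKrig API):

n <- 200
Q <- diag(2, n)                 # toy tridiagonal precision matrix
Q[cbind(1:(n - 1), 2:n)] <- -1
Q[cbind(2:n, 1:(n - 1))] <- -1
R <- chol(Q)                    # upper triangular, so Q = t(R) %*% R
x <- backsolve(R, rnorm(n))     # one draw with covariance solve(Q)
# with a sparse-matrix library the factor stays sparse, so large fields remain fast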
Other spatial work using HPC infrastructure
• Comparison of Python and Matlab implementations of the
Multi-resolution approximation
• Statistical compression algorithms using half-spectral models for
spatio-temporal data, parallelization over temporal frequencies
• Bayesian climate change detection and attribution models,
parallelization over number of basis functions
• . . .
Thanks! Any questions? Dorit Hammerling (dorith@ucar.edu)

More Related Content

What's hot

Accelerating Science with Cloud Technologies in the ABoVE Science Cloud
Accelerating Science with Cloud Technologies in the ABoVE Science CloudAccelerating Science with Cloud Technologies in the ABoVE Science Cloud
Accelerating Science with Cloud Technologies in the ABoVE Science CloudGlobus
 
5.1 mining data streams
5.1 mining data streams5.1 mining data streams
5.1 mining data streamsKrish_ver2
 
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...Otávio Carvalho
 
Moa: Real Time Analytics for Data Streams
Moa: Real Time Analytics for Data StreamsMoa: Real Time Analytics for Data Streams
Moa: Real Time Analytics for Data StreamsAlbert Bifet
 
SchemEX - Creating the Yellow Pages for the Linked Open Data Cloud
SchemEX - Creating the Yellow Pages for the Linked Open Data CloudSchemEX - Creating the Yellow Pages for the Linked Open Data Cloud
SchemEX - Creating the Yellow Pages for the Linked Open Data CloudAnsgar Scherp
 
Visualizing and Clustering Life Science Applications in Parallel 
Visualizing and Clustering Life Science Applications in Parallel Visualizing and Clustering Life Science Applications in Parallel 
Visualizing and Clustering Life Science Applications in Parallel Geoffrey Fox
 
“Materials Informatics and Big Data: Realization of 4th Paradigm of Science i...
“Materials Informatics and Big Data: Realization of 4th Paradigm of Science i...“Materials Informatics and Big Data: Realization of 4th Paradigm of Science i...
“Materials Informatics and Big Data: Realization of 4th Paradigm of Science i...aimsnist
 
Earth Science Platform
Earth Science PlatformEarth Science Platform
Earth Science PlatformTed Habermann
 
A Comparison of Different Strategies for Automated Semantic Document Annotation
A Comparison of Different Strategies for Automated Semantic Document AnnotationA Comparison of Different Strategies for Automated Semantic Document Annotation
A Comparison of Different Strategies for Automated Semantic Document AnnotationAnsgar Scherp
 
Applications of Machine Learning for Materials Discovery at NREL
Applications of Machine Learning for Materials Discovery at NRELApplications of Machine Learning for Materials Discovery at NREL
Applications of Machine Learning for Materials Discovery at NRELaimsnist
 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for ScienceIan Foster
 
AI at Scale for Materials and Chemistry
AI at Scale for Materials and ChemistryAI at Scale for Materials and Chemistry
AI at Scale for Materials and ChemistryIan Foster
 
Mining high speed data streams: Hoeffding and VFDT
Mining high speed data streams: Hoeffding and VFDTMining high speed data streams: Hoeffding and VFDT
Mining high speed data streams: Hoeffding and VFDTDavide Gallitelli
 
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATL
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATLParikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATL
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATLMLconf
 
Materials Data Facility: Streamlined and automated data sharing, discovery, ...
Materials Data Facility: Streamlined and automated data sharing,  discovery, ...Materials Data Facility: Streamlined and automated data sharing,  discovery, ...
Materials Data Facility: Streamlined and automated data sharing, discovery, ...Ian Foster
 
Efficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersEfficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersAlbert Bifet
 
DuraMat Data Management and Analytics
DuraMat Data Management and AnalyticsDuraMat Data Management and Analytics
DuraMat Data Management and AnalyticsAnubhav Jain
 
Efficient processing of Rank-aware queries in Map/Reduce
Efficient processing of Rank-aware queries in Map/ReduceEfficient processing of Rank-aware queries in Map/Reduce
Efficient processing of Rank-aware queries in Map/ReduceSpiros Oikonomakis
 
A Survey on Improve Efficiency And Scability vertical mining using Agriculter...
A Survey on Improve Efficiency And Scability vertical mining using Agriculter...A Survey on Improve Efficiency And Scability vertical mining using Agriculter...
A Survey on Improve Efficiency And Scability vertical mining using Agriculter...Editor IJMTER
 

What's hot (20)

Real Time Geodemographics
Real Time GeodemographicsReal Time Geodemographics
Real Time Geodemographics
 
Accelerating Science with Cloud Technologies in the ABoVE Science Cloud
Accelerating Science with Cloud Technologies in the ABoVE Science CloudAccelerating Science with Cloud Technologies in the ABoVE Science Cloud
Accelerating Science with Cloud Technologies in the ABoVE Science Cloud
 
5.1 mining data streams
5.1 mining data streams5.1 mining data streams
5.1 mining data streams
 
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...
 
Moa: Real Time Analytics for Data Streams
Moa: Real Time Analytics for Data StreamsMoa: Real Time Analytics for Data Streams
Moa: Real Time Analytics for Data Streams
 
SchemEX - Creating the Yellow Pages for the Linked Open Data Cloud
SchemEX - Creating the Yellow Pages for the Linked Open Data CloudSchemEX - Creating the Yellow Pages for the Linked Open Data Cloud
SchemEX - Creating the Yellow Pages for the Linked Open Data Cloud
 
Visualizing and Clustering Life Science Applications in Parallel 
Visualizing and Clustering Life Science Applications in Parallel Visualizing and Clustering Life Science Applications in Parallel 
Visualizing and Clustering Life Science Applications in Parallel 
 
“Materials Informatics and Big Data: Realization of 4th Paradigm of Science i...
“Materials Informatics and Big Data: Realization of 4th Paradigm of Science i...“Materials Informatics and Big Data: Realization of 4th Paradigm of Science i...
“Materials Informatics and Big Data: Realization of 4th Paradigm of Science i...
 
Earth Science Platform
Earth Science PlatformEarth Science Platform
Earth Science Platform
 
A Comparison of Different Strategies for Automated Semantic Document Annotation
A Comparison of Different Strategies for Automated Semantic Document AnnotationA Comparison of Different Strategies for Automated Semantic Document Annotation
A Comparison of Different Strategies for Automated Semantic Document Annotation
 
Applications of Machine Learning for Materials Discovery at NREL
Applications of Machine Learning for Materials Discovery at NRELApplications of Machine Learning for Materials Discovery at NREL
Applications of Machine Learning for Materials Discovery at NREL
 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for Science
 
AI at Scale for Materials and Chemistry
AI at Scale for Materials and ChemistryAI at Scale for Materials and Chemistry
AI at Scale for Materials and Chemistry
 
Mining high speed data streams: Hoeffding and VFDT
Mining high speed data streams: Hoeffding and VFDTMining high speed data streams: Hoeffding and VFDT
Mining high speed data streams: Hoeffding and VFDT
 
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATL
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATLParikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATL
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATL
 
Materials Data Facility: Streamlined and automated data sharing, discovery, ...
Materials Data Facility: Streamlined and automated data sharing,  discovery, ...Materials Data Facility: Streamlined and automated data sharing,  discovery, ...
Materials Data Facility: Streamlined and automated data sharing, discovery, ...
 
Efficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersEfficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream Classifiers
 
DuraMat Data Management and Analytics
DuraMat Data Management and AnalyticsDuraMat Data Management and Analytics
DuraMat Data Management and Analytics
 
Efficient processing of Rank-aware queries in Map/Reduce
Efficient processing of Rank-aware queries in Map/ReduceEfficient processing of Rank-aware queries in Map/Reduce
Efficient processing of Rank-aware queries in Map/Reduce
 
A Survey on Improve Efficiency And Scability vertical mining using Agriculter...
A Survey on Improve Efficiency And Scability vertical mining using Agriculter...A Survey on Improve Efficiency And Scability vertical mining using Agriculter...
A Survey on Improve Efficiency And Scability vertical mining using Agriculter...
 

Similar to NCAR HPC Spatial Data Analysis

Hadoop for High-Performance Climate Analytics - Use Cases and Lessons Learned
Hadoop for High-Performance Climate Analytics - Use Cases and Lessons LearnedHadoop for High-Performance Climate Analytics - Use Cases and Lessons Learned
Hadoop for High-Performance Climate Analytics - Use Cases and Lessons LearnedDataWorks Summit
 
Panel: NRP Science Impacts​
Panel: NRP Science Impacts​Panel: NRP Science Impacts​
Panel: NRP Science Impacts​Larry Smarr
 
Comparing Big Data and Simulation Applications and Implications for Software ...
Comparing Big Data and Simulation Applications and Implications for Software ...Comparing Big Data and Simulation Applications and Implications for Software ...
Comparing Big Data and Simulation Applications and Implications for Software ...Geoffrey Fox
 
Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...Anubhav Jain
 
Scientific Application Development and Early results on Summit
Scientific Application Development and Early results on SummitScientific Application Development and Early results on Summit
Scientific Application Development and Early results on SummitGanesan Narayanasamy
 
How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceinside-BigData.com
 
NASA Advanced Computing Environment for Science & Engineering
NASA Advanced Computing Environment for Science & EngineeringNASA Advanced Computing Environment for Science & Engineering
NASA Advanced Computing Environment for Science & Engineeringinside-BigData.com
 
Physics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learningPhysics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learningKAMAL CHOUDHARY
 
HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores inside-BigData.com
 
The Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningThe Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningRafael Ferreira da Silva
 
High Performance Computing and Big Data
High Performance Computing and Big Data High Performance Computing and Big Data
High Performance Computing and Big Data Geoffrey Fox
 
High Performance Data Analytics and a Java Grande Run Time
High Performance Data Analytics and a Java Grande Run TimeHigh Performance Data Analytics and a Java Grande Run Time
High Performance Data Analytics and a Java Grande Run TimeGeoffrey Fox
 
Data Automation at Light Sources
Data Automation at Light SourcesData Automation at Light Sources
Data Automation at Light SourcesIan Foster
 
Modern Computing: Cloud, Distributed, & High Performance
Modern Computing: Cloud, Distributed, & High PerformanceModern Computing: Cloud, Distributed, & High Performance
Modern Computing: Cloud, Distributed, & High Performanceinside-BigData.com
 
Matching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesMatching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesGeoffrey Fox
 
Matching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesMatching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesGeoffrey Fox
 
Slide 1
Slide 1Slide 1
Slide 1butest
 

Similar to NCAR HPC Spatial Data Analysis (20)

Hadoop for High-Performance Climate Analytics - Use Cases and Lessons Learned
Hadoop for High-Performance Climate Analytics - Use Cases and Lessons LearnedHadoop for High-Performance Climate Analytics - Use Cases and Lessons Learned
Hadoop for High-Performance Climate Analytics - Use Cases and Lessons Learned
 
Panel: NRP Science Impacts​
Panel: NRP Science Impacts​Panel: NRP Science Impacts​
Panel: NRP Science Impacts​
 
Comparing Big Data and Simulation Applications and Implications for Software ...
Comparing Big Data and Simulation Applications and Implications for Software ...Comparing Big Data and Simulation Applications and Implications for Software ...
Comparing Big Data and Simulation Applications and Implications for Software ...
 
Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...
 
AI Super computer update
AI Super computer update AI Super computer update
AI Super computer update
 
Scientific Application Development and Early results on Summit
Scientific Application Development and Early results on SummitScientific Application Development and Early results on Summit
Scientific Application Development and Early results on Summit
 
How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental science
 
NASA Advanced Computing Environment for Science & Engineering
NASA Advanced Computing Environment for Science & EngineeringNASA Advanced Computing Environment for Science & Engineering
NASA Advanced Computing Environment for Science & Engineering
 
DIET_BLAST
DIET_BLASTDIET_BLAST
DIET_BLAST
 
Physics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learningPhysics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learning
 
HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores 
 
The Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningThe Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource Provisioning
 
High Performance Computing and Big Data
High Performance Computing and Big Data High Performance Computing and Big Data
High Performance Computing and Big Data
 
High Performance Data Analytics and a Java Grande Run Time
High Performance Data Analytics and a Java Grande Run TimeHigh Performance Data Analytics and a Java Grande Run Time
High Performance Data Analytics and a Java Grande Run Time
 
Data Automation at Light Sources
Data Automation at Light SourcesData Automation at Light Sources
Data Automation at Light Sources
 
Modern Computing: Cloud, Distributed, & High Performance
Modern Computing: Cloud, Distributed, & High PerformanceModern Computing: Cloud, Distributed, & High Performance
Modern Computing: Cloud, Distributed, & High Performance
 
AI for Science
AI for ScienceAI for Science
AI for Science
 
Matching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesMatching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software Architectures
 
Matching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesMatching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software Architectures
 
Slide 1
Slide 1Slide 1
Slide 1
 

Application benchmarking study example
Outline
1 Introduction and Motivation
2 Application benchmarking study example
3 Other examples and future work
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 7 / 45
Application benchmarking study example
Study of precipitation extremes: a typical example
• daily data for 35 years: 12,775 values per grid cell
• 288 longitudes × 192 latitudes: 55,296 grid cells
• 12,775 × 55,296 = 706,406,400 data points (2.83 GB)
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 8 / 45
Application benchmarking study example
Fitting a Generalized Pareto distribution
• This is a complementary approach to block maxima for Extreme Value Analysis.
• For data above a given threshold (µ), fit a probability density of the form
  f(x) = (1/σ) [1 + ξ(x − µ)/σ]^(−(1/ξ + 1)) for x ≥ µ.
• σ – scale parameter, ξ – shape parameter.
• We ignore all the data below the threshold and fit only the tail.
• Having selected the threshold, estimate σ and ξ by maximum likelihood.
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 9 / 45
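As a quick sanity check of the density above, here is a toy snippet (not from the original slides) that evaluates f(x) directly from the formula for made-up parameter values; dgp and the chosen numbers are purely illustrative.

# Toy illustration: evaluate the Generalized Pareto density from the formula above.
dgp <- function(x, mu, sigma, xi) {
  z <- 1 + xi * (x - mu) / sigma
  ifelse(x >= mu & z > 0, (1 / sigma) * z^(-(1 / xi + 1)), 0)
}

x <- seq(10, 60, by = 10)               # exceedance levels (arbitrary units)
dgp(x, mu = 10, sigma = 5, xi = 0.2)    # heavier (unbounded) tail
dgp(x, mu = 10, sigma = 5, xi = -0.1)   # bounded tail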
Application benchmarking study example
Fitting a Generalized Pareto distribution: R code

library(extRemes)        # provides fevd() and return.level()

tailProb <- .01          # tail probability used in extremes fitting
returnLevelYear <- 100   # years used for return level

Y <- dataset[lonindex, latindex]
threshold <- quantile(Y, 1 - tailProb)
frac <- sum(Y > threshold) / length(Y)
GPFit <- fevd(Y, threshold = threshold, type = "GP", method = "MLE")
ReturnLevel <- return.level(GPFit, returnLevelYear, do.ci = TRUE)

Depending on your machine, this takes somewhere from 0.3 to 1 second per grid cell.
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 10 / 45
Application benchmarking study example
Why use HPC systems for statistical computing?
Repetitive tasks can take a lot of time, and even short tasks add up quickly:
• 0.33 seconds per location corresponds to approx. 5 hours for 55,000 locations.
• 1 second per location corresponds to approx. 15 hours for 55,000 locations.
And that is for a single data set. Often we want to analyze hundreds of data sets and test different parameters. (A small timing sketch follows below.)
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 11 / 45
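A minimal timing sketch, assuming the single-cell fit from the earlier R code is wrapped in a hypothetical function fitOneCell() and using synthetic data; it times one fit and extrapolates to the full 55,296-cell grid.

# Hypothetical helper wrapping the single-cell GPD fit shown earlier.
fitOneCell <- function(Y) {
  threshold <- quantile(Y, 0.99)
  extRemes::fevd(Y, threshold = threshold, type = "GP", method = "MLE")
}

Y <- rgamma(12775, shape = 0.5)            # synthetic stand-in for 35 years of daily data
tOne <- system.time(fitOneCell(Y))["elapsed"]
tOne * 55296 / 3600                        # rough serial estimate for the full grid, in hours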
Application benchmarking study example
NCAR's high performance computing (HPC) systems
Yellowstone (previous system, decommissioned at the end of 2017):
• 1.5 petaflops peak
• 72,256 cores
• 145 TB total memory
• 56 Gb/s interconnects
Cheyenne (new system, an evolutionary increase):
• 5.34 petaflops peak
• 145,152 cores
• 313 TB total memory
• 100 Gb/s interconnects
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 12 / 45
Application benchmarking study example
Cores and nodes on HPC systems
• Cores on one node usually share memory (cache).
• Memory between nodes is typically not shared, but can be accessed.
• Understanding the basics of the architecture and interconnects can be really helpful!
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 13 / 45
Application benchmarking study example
Relevant details: memory and parallelization tools
Memory available on compute nodes. Two classes of nodes on Cheyenne:
• Standard nodes have 64 GB of memory (46 GB usable).
• Large-memory nodes have 128 GB of memory (110 GB usable).
• The Data Analysis cluster has 1 TB (!) of memory (1000 GB usable), but a very different network architecture; it is not meant for working across nodes!
You also need to know which parallelization tools are installed and how they are configured (a small R sketch for checking the local setup follows below):
• Rmpi: Are there limits on the number of workers? What physical interconnect is it using?
• Matlab Distributed Computing Server
• Spark for Python or Scala
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 14 / 45
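A small R sketch, assuming the parallel package and (optionally) Rmpi are available, for checking what the R parallel stack actually sees before planning a run; the physical interconnect itself still has to be checked at the system level.

library(parallel)
detectCores()            # logical cores visible on this node

# If Rmpi is installed and the job was launched under MPI, the number of
# slots granted by the scheduler can be queried as well:
library(Rmpi)
mpi.universe.size()      # total MPI slots available to this job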
Application benchmarking study example
Application benchmarks
Even if one knows the architecture very well and has data on low-level benchmarks, application benchmarking is critical.
Application benchmarking: benchmarking that uses code as close as possible to the real production code (including I/O operations!).
For our application we use a quarter of the full data (approximately 14,000 grid cells) for initial benchmarking; a sketch of extracting such a subset follows below.
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 15 / 45
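A minimal sketch of pulling a benchmarking subset with the ncdf4 package, assuming the daily precipitation sits in a NetCDF file with dimensions (lon, lat, time); the file name "precip_daily.nc" and variable name "PRECT" are placeholders, not the actual data set.

library(ncdf4)

nc <- nc_open("precip_daily.nc")            # placeholder file name
# Read a 144 x 96 corner of the 288 x 192 grid (about a quarter), all time steps.
subset <- ncvar_get(nc, "PRECT",
                    start = c(1, 1, 1),
                    count = c(144, 96, -1))  # -1 = read the full time dimension
nc_close(nc)
dim(subset)                                  # 144 x 96 x 12775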
Application benchmarking study example
Double for loop: code sketch

# outer loop over latitude
for (latindex in 1:dim(lat)) {
  # inner loop over longitude
  for (lonindex in 1:dim(lon)) {
    Y <- dataset[latindex, lonindex, ]            # extract data
    ......
    EXTREME VALUE ANALYSIS (EVA) CODE
    outSummary[latindex, lonindex, ] <- EVA RESULTS
    ......
    print(lonindex)                               # counter
  }                                               # end of inner loop
  print(latindex)                                 # counter
}                                                 # end of outer loop

Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 16 / 45
Application benchmarking study example
Setup for application benchmarking
Experimental design:
• What kind of cluster, i.e., communication protocol and network?
• Which loop to parallelize: inner or outer? Or nested?
• What to put in the inner/outer loop: latitude or longitude?
• How to read in the data? All at once, one latitude/longitude band at a time, one grid cell at a time?
• How does the application scale with more cores and nodes?
Additional consideration:
• Do we want our code to run as fast as possible or as efficiently as possible? Total execution time vs. time per core?
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 17 / 45
Application benchmarking study example
Physical networks and communication protocols
• TCP/IP is the protocol on which the internet is based. Connections can be high bandwidth but also high latency, partially because the protocol is designed to work in lossy networks. Logical endpoints of TCP/IP connections are called sockets.
• Ethernet is a physical network designed to support TCP/IP connections.
• InfiniBand is a physical network designed to support MPI message passing. The physical connections are very high bandwidth, very low latency, and very expensive.
• MPI (Message Passing Interface) is a library written to enable message passing on compute clusters. It employs algorithms that optimize communication efficiency and speed on clusters. It works best with a high-bandwidth, low-latency, near-lossless physical network, but can work on Ethernet via TCP/IP as well.
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 19 / 45
Application benchmarking study example
Remote Direct Memory Access (RDMA)
• RDMA allows data to be written to and read from other nodes' memories.
• In a sense, the nodes behave like a single aggregate node.
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 20 / 45
Application benchmarking study example
Cluster setup in R
To run foreach in parallel, a cluster needs to be set up.
• Starting an R PSOCK cluster sets up TCP/IP connections (without RDMA) across the Ethernet network (and can be tricked into working over InfiniBand); a minimal PSOCK sketch follows below.
• Starting an MPI cluster sets up MPI communications with RDMA across the InfiniBand network:

library(Rmpi)
library(doMPI)

##### Cluster setup #####
cl <- startMPIcluster(numCores)   # create MPI cluster
registerDoMPI(cl)                 # register parallel backend for foreach

Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 21 / 45
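For comparison, a minimal sketch of the PSOCK (socket) alternative mentioned above, using the parallel and doParallel packages; the worker count is a placeholder, not a recommended setting.

library(parallel)
library(doParallel)

numCores <- 16                               # placeholder worker count
cl <- makeCluster(numCores, type = "PSOCK")  # TCP/IP socket cluster
registerDoParallel(cl)                       # register parallel backend for foreach
# ... run foreach(...) %dopar% { ... } here ...
stopCluster(cl)                              # shut the workers down when done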
Application benchmarking study example
Setup for application benchmarking
Experimental design:
• What kind of cluster, i.e., communication protocol?
• Which loop to parallelize: inner or outer? Or nested?
• What to put in the inner/outer loop: latitude or longitude?
• . . .
• . . .
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 22 / 45
Application benchmarking study example
Parallelizing for loops: the "foreach" package
• foreach provides a looping construct using binary operators, which can be easily parallelized.
• It is a hybrid between a standard for loop and lapply: it evaluates an expression, not a function (as lapply does).
• It returns a value rather than causing side effects.
• It needs a parallel backend, most commonly doParallel (a minimal example follows below).
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 23 / 45
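A minimal, self-contained foreach example with the doParallel backend (not from the slides), just to show the construct: each iteration returns a value, and .combine assembles the results.

library(doParallel)   # loads foreach, parallel and iterators

cl <- makeCluster(4)          # small local PSOCK cluster
registerDoParallel(cl)

# Each iteration returns one row; .combine = rbind stacks them into a matrix.
res <- foreach(i = 1:8, .combine = rbind) %dopar% {
  c(index = i, sqrtval = sqrt(i))
}
stopCluster(cl)
res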
Application benchmarking study example
Code sketch for inner loop parallelization

library(doParallel)   # loads foreach, parallel and iterators

# outer loop over latitude
for (latindex in 1:numLat) {
  dataset <- getData(latindex)   # load data for specific latitude
  # inner loop over longitude (executed in parallel)
  outSummary[latindex, , ] <-
    foreach(lonindex = 1:dim(lon), .combine = rbind,
            .packages = c("extRemes")) %dopar% {
      Y <- dataset[lonindex, ]   # extract Y for specific longitude
      ...... EXTREME VALUE ANALYSIS (EVA) CODE ......
      c(threshold, GPFit$results$par, frac = frac, ReturnLevel)
      # print(lonindex)   # counter; DON'T use inside parallel execution
    }
  print(latindex)   # counter to monitor progress
}

Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 24 / 45
Application benchmarking study example
One latitude at a time: inner loop over longitude
[Figure slide: illustration of processing one latitudinal band at a time, with the inner loop over longitude.]
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 25 / 45
Application benchmarking study example
One longitude at a time: inner loop over latitude
[Figure slide: illustration of processing one longitudinal band at a time, with the inner loop over latitude.]
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 26 / 45
Application benchmarking study example
Code sketch for outer loop parallelization

# outer loop over latitude (executed in parallel)
outSummary <- foreach(latindex = 1:numLat, lat_count = icount(),
                      .combine = rbind,
                      .packages = c("extRemes", "ncdf4", "foreach", "iterators")) %dopar% {
  dataset <- getData(latindex)
  # inner loop over longitude (executed sequentially)
  foreach(lonindex = 1:dim(lon), lon_count = icount(),
          .combine = rbind, .packages = c("extRemes", "foreach")) %do% {
    Y <- dataset[lonindex, ]
    ...... EXTREME VALUE ANALYSIS (EVA) CODE ......
    c(threshold, GPFit$results$par, frac = frac, ReturnLevel)
  }
}

Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 27 / 45
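One step the sketches above omit: once the foreach work is done, the doMPI cluster should be shut down explicitly. A minimal closing sequence, assuming the cluster object cl from the earlier setup, would look like this.

# After the parallel foreach loops have finished:
closeCluster(cl)   # release the doMPI workers
mpi.quit()         # finalize MPI and end the R session cleanly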
Application benchmarking study example
Setup for application benchmarking
Experimental design:
• What kind of cluster, i.e., communication protocol and network?
• Which loop to parallelize: inner or outer? Or nested?
• What to put in the inner/outer loop: latitude or longitude?
• How to read in the data? All at once, one latitude/longitude band at a time, one grid cell at a time?
• . . .
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 28 / 45
Application benchmarking study example
Options for reading in the data
• All the data at once
• One latitude or longitude band at a time
• One grid cell at a time
There is a trade-off between the number of I/O calls and filling up memory; a short ncdf4 sketch of the first two options follows below.
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 29 / 45
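A sketch of two of the read strategies with ncdf4, under the same placeholder file and variable names as before ("precip_daily.nc", "PRECT"); only the start/count arguments change between strategies.

library(ncdf4)
nc <- nc_open("precip_daily.nc")        # placeholder file name

# Option 1: all the data at once (one I/O call, large memory footprint)
allData <- ncvar_get(nc, "PRECT")

# Option 2: one latitudinal band at a time (one I/O call per band)
latindex <- 10                          # example band
band <- ncvar_get(nc, "PRECT",
                  start = c(1, latindex, 1),
                  count = c(-1, 1, -1)) # all longitudes, one latitude, all time

nc_close(nc)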
Application benchmarking study example
Application benchmarking results
Experiment 1: inner loop parallelization across longitude for 48 latitudes, reading in the data by latitudinal band

  Cluster   numCores   ptile   numNodes   Run time (seconds)
  IPoIB        16        16        1              375
  IPoIB        32        16        2              523
  IPoIB        48        16        3              543
  mpi          16        16        1             2955
  mpi          32        16        2             3206
  mpi          48        16        3             3277

Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 30 / 45
Application benchmarking study example
Application benchmarking results cont.
Experiment 2: inner loop parallelization across longitude for 48 latitudes, reading all the data at once
Experiment 3: inner loop parallelization across latitude for 48 longitudes, reading in the data by longitudinal band
Experiment 4: inner loop parallelization across latitude for 48 longitudes, reading all the data at once
• Results are equivalent or worse for experiments 2 through 4.
• The inner loop parallelization does NOT scale across nodes and can run out of memory when reading in all the data at once.
• R does not have variable slicing, meaning each worker needs to be sent the full data, from which the worker-specific data is then extracted. [Different in Matlab!] A chunked-iterator workaround is sketched below.
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 31 / 45
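One common workaround for the data-shipping issue above (not from the slides): iterate over the data object itself with the iterators package, so each foreach task receives only its own row rather than the full matrix. The toy matrix and the quantile call stand in for the real data and the EVA code.

library(doParallel)
library(iterators)

cl <- makeCluster(4)
registerDoParallel(cl)

dataset <- matrix(rnorm(48 * 12775), nrow = 48)   # toy stand-in: 48 bands of daily data

# iter(..., by = "row") hands each task a single row of 'dataset'
# instead of exporting the whole matrix to every worker.
res <- foreach(Y = iter(dataset, by = "row"), .combine = rbind) %dopar% {
  quantile(Y, 0.99)   # stand-in for the EVA code
}
stopCluster(cl)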
Application benchmarking study example
Application benchmarking results cont.
Experiment 5: outer loop parallelization across latitude for 48 latitude values, reading in the data by latitudinal band

  Cluster   numCores   ptile   numNodes   Run time (seconds)
  IPoIB        16        16        1              198
  IPoIB        32        16        2              172
  IPoIB        48        16        3              101
  mpi          16        16        1              208
  mpi          32        16        2              129
  mpi          48        16        3               80

Much better scaling than inner loop parallelization.
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 32 / 45
Application benchmarking study example
Application benchmarking results cont.
Experiment 7: outer loop parallelization across longitude for 72 longitude values, reading in the data by longitudinal band

  Cluster   numCores   ptile   numNodes   Run time (seconds)
  IPoIB        16        16        1              214
  IPoIB        32        16        2              152
  IPoIB        48        16        3              116
  mpi          16        16        1              209
  mpi          32        16        2              146
  mpi          48        16        3              101

Much better scaling than inner loop parallelization.
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 33 / 45
Application benchmarking study example
Application benchmarking results cont.
Experiment 5: outer loop parallelization across latitude for 48 latitude values, reading in the data by latitudinal band

  Cluster   numCores   ptile   numNodes   Run time (seconds)
  mpi          16        16        1              208
  mpi          32        16        2              129
  mpi          48        16        3               80

Experiment 7: outer loop parallelization across longitude for 72 longitude values, reading in the data by longitudinal band

  Cluster   numCores   ptile   numNodes   Run time (seconds)
  mpi          16        16        1              209
  mpi          32        16        2              146
  mpi          48        16        3              101

Outer loop parallelization over latitude is somewhat faster.
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 34 / 45
Application benchmarking study example
Application benchmarking results cont.
Experiment 5: outer loop parallelization across latitude for 48 latitude values, reading in the data by latitudinal band

  Cluster   numCores   ptile   numNodes   Run time (seconds)
  mpi          16        16        1              208
  mpi          32        16        2              129
  mpi          48        16        3               80

Experiment 6: outer loop parallelization across longitude for 72 longitude values, reading in the data only once

  Cluster   numCores   ptile   numNodes   Run time (seconds)
  mpi          16        16        1              220
  mpi          32        16        2              123
  mpi          48        16        3               93

The data read options provide similar results at this data size.
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 35 / 45
Application benchmarking study example
Application benchmarking results for the FULL data
Experiment 5: outer loop parallelization across latitude for 192 latitude values, reading in the data by latitudinal band

  Cluster   numCores   ptile   numNodes   Run time (seconds)
  IPoIB        16        16        1              719
  IPoIB        32        16        2              521
  IPoIB        48        16        3              352
  IPoIB        64        16        4              279
  IPoIB        80        16        5              252
  IPoIB        96        16        6              192

IP (socket) clusters have a limit of 128 workers within R.
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 36 / 45
Application benchmarking study example
Application benchmarking results for the FULL data
Experiment 5: outer loop parallelization across latitude for 192 latitude values, reading in the data by latitudinal band

  Cluster   numCores   ptile   numNodes   Run time (seconds)
  mpi          16        16        1              658
  mpi          32        16        2              354
  mpi          48        16        3              233
  mpi          64        16        4              206
  mpi          80        16        5              187
  mpi          96        16        6              154
  mpi         192        16       12               88

This is the fastest setup overall: about one and a half minutes for the entire data set. (A small speedup and efficiency calculation on these numbers follows below.)
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 37 / 45
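A few lines of R (not from the slides) that turn the MPI run times above into speedup and parallel efficiency relative to the 16-core run, which is the kind of number the earlier fast-vs-efficient question asks for.

cores <- c(16, 32, 48, 64, 80, 96, 192)
times <- c(658, 354, 233, 206, 187, 154, 88)   # seconds, mpi cluster, full data

speedup    <- times[1] / times                 # relative to the 16-core run
efficiency <- speedup / (cores / 16)           # 1.0 would be perfect scaling
round(data.frame(cores, times, speedup, efficiency), 2)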
Application benchmarking study example
Technical report, data and code available
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 38 / 45
Other examples and future work
Outline
1 Introduction and Motivation
2 Application benchmarking study example
3 Other examples and future work
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 39 / 45
Other examples and future work
Think globally - act locally (Doug Nychka's latest work)
• A global statistical model for a spatial field provides seamless inference across a spatial domain.
• A local analysis of spatial data avoids large memory requirements and simplifies parallel computation.
The goal is to combine these two ideas:
• Compute on local neighborhoods of the spatial field but assemble the results into a global model.
• The local computations are embarrassingly parallel and so easily scale to many cores.
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 40 / 45
Other examples and future work
Emulation of model output
• Pattern scaling is based on a linear relationship between local temperatures and the global mean.
• Derived from a long coupled climate model run.
[Figure: maps of the mean scaling pattern and of the variation in 8 pattern scaling fields.]
Goal: simulate additional fields cheaply that reflect the properties of the ensemble.
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 41 / 45
Other examples and future work
Parameters of local spatial models
[Figure: maps of the fitted local parameters: Range (degrees), Sigma (C), Tau (C).]
• Parallel fits using Rmpi on moving 11×11 pixel windows (a sketch of the moving-window loop follows below)
• Demonstrated linear scaling to at least 1000 cores
• Highly nonstationary!
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 42 / 45
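A rough sketch of the moving-window idea, assuming one field stored as a matrix and a hypothetical fitLocalModel() that returns local parameters for an 11×11 window; the real analysis used Rmpi, but doParallel stands in here to keep the example self-contained, and the "fit" below is only a placeholder summary, not an actual spatial-model fit.

library(doParallel)

cl <- makeCluster(4)
registerDoParallel(cl)

field <- matrix(rnorm(60 * 60), 60, 60)   # toy stand-in for one climate field
half  <- 5                                # 11 x 11 window -> half-width of 5

# Hypothetical local fit: replace with a real likelihood-based fit returning
# range, sigma and tau for the window.
fitLocalModel <- function(w) c(range = sd(c(w)), sigma = mean(c(w)), tau = var(c(w)))

centers <- expand.grid(i = (half + 1):(nrow(field) - half),
                       j = (half + 1):(ncol(field) - half))

fits <- foreach(k = 1:nrow(centers), .combine = rbind) %dopar% {
  i <- centers$i[k]; j <- centers$j[k]
  w <- field[(i - half):(i + half), (j - half):(j + half)]   # 11 x 11 window
  fitLocalModel(w)
}
stopCluster(cl)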
Other examples and future work
Emulation of the model output
• Encode the local parameter estimates into a global Markov random field model (LatticeKrig).
• Fast simulation due to the sparsity of the LatticeKrig precision matrix and basis functions; a toy sparse-precision simulation is sketched below.
[Figure: top row shows 4 model fields, bottom row shows 4 simulated fields, on a common color scale from about −0.5 to 0.5.]
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 43 / 45
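A toy illustration (not the LatticeKrig implementation) of why working with a precision matrix Q makes simulation cheap: draw x ~ N(0, Q^{-1}) by solving a triangular system with the Cholesky factor of Q. The tridiagonal Q below is made up purely for the example; the real model encodes the local parameter estimates instead.

set.seed(1)
n <- 500

# Toy sparse (tridiagonal) precision matrix, diagonally dominant so it is
# positive definite.
Q <- diag(2.01, n)
Q[cbind(1:(n - 1), 2:n)] <- -1
Q[cbind(2:n, 1:(n - 1))] <- -1

R <- chol(Q)            # upper-triangular factor, Q = t(R) %*% R
z <- rnorm(n)
x <- backsolve(R, z)    # x = R^{-1} z has covariance Q^{-1}
plot(x, type = "l")     # one simulated field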
Other examples and future work
Other spatial work using HPC infrastructure
• Comparison of Python and Matlab implementations of the multi-resolution approximation
• Statistical compression algorithms using half-spectral models for spatio-temporal data, with parallelization over temporal frequencies
• Bayesian climate change detection and attribution models, with parallelization over the number of basis functions
• . . .
Thanks! Any questions?
Dorit Hammerling (dorith@ucar.edu)
Hammerling et al. (NCAR) HPC for spatial data February 13, 2018 44 / 45