OpenTopography - Scalable Services for Geosciences Data
www.opentopography.org
@opentopography
info@opentopography.org
DOI / OGC
CSW
DATA USAGE ANALYTICS
HPC & CLOUD INTEGRATION
CYBERINFRASTRUCTURE
Spatiotemporal variations in data access illustrate that certain regions of a dataset can be "cold", while others are "hot". OT collects analytics which include user data selections through time.
We have developed tools that allow us to mine and visualize this information, and are exploring how to utilize these analytics to develop storage optimizations based on data value and cost.
For the hottest data, fast I/O and scalable access are required. In these cases, data stored on SSD and accessible through HPC systems such as Gordon are desirable. For "cooler" data
that is accessed less frequently, cheaper (and slower) storage systems such as the cloud can be used to lower data facility operating costs. A tiered storage system offers the potential
to dynamically manage data storage and associated system performance based on real analytical information about usage.
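A minimal sketch of activity-based tiering, assuming user selections have already been mapped to spatial tiles. The tile indexing scheme, the function names, and the `hot_threshold` value are illustrative assumptions, not part of the OT implementation.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical tile key: (column, row) index of a spatial tile within a dataset.
Tile = Tuple[int, int]


def count_tile_accesses(selections: Iterable[Iterable[Tile]]) -> Counter:
    """Count how many user selections touched each tile."""
    counts: Counter = Counter()
    for tiles in selections:
        counts.update(set(tiles))  # count a tile at most once per selection
    return counts


def assign_tiers(counts: Counter, hot_threshold: int = 3) -> Dict[Tile, str]:
    """Label each tile 'hot' (fast storage) or 'cold' (cheaper storage).

    The threshold is an assumed value chosen only for illustration.
    """
    return {
        tile: "hot" if n >= hot_threshold else "cold"
        for tile, n in counts.items()
    }


# Three user selections, each expressed as the tiles it intersects.
selections = [
    [(0, 0), (0, 1)],
    [(0, 0)],
    [(0, 0), (5, 5)],
]
tiers = assign_tiers(count_tile_accesses(selections))
```

In a real system the counts would likely be weighted by recency and by storage cost, rather than compared against a single fixed threshold.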
In the case of topographic data, earthquakes, floods, landslides, and other geophysical events are likely to cause an increase in demand for data that intersect the spatial
extent of the event. External feeds (e.g., USGS NEIC) could be monitored to proactively move data into high performance storage in anticipation of increased demand.
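The event-driven idea can be sketched as a bounding-box intersection test between an event footprint and dataset extents. The catalog entries, coordinates, and buffer size below are made-up placeholders, and parsing an actual event feed is out of scope. The sketch also ignores extents that cross the antimeridian.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box in decimal degrees."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def intersects(self, other: "BBox") -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        return not (
            self.max_lon < other.min_lon or other.max_lon < self.min_lon
            or self.max_lat < other.min_lat or other.max_lat < self.min_lat
        )


def event_footprint(lon: float, lat: float, buffer_deg: float) -> BBox:
    """Approximate an event's spatial extent as a square buffer (assumption)."""
    return BBox(lon - buffer_deg, lat - buffer_deg,
                lon + buffer_deg, lat + buffer_deg)


def datasets_to_promote(event: BBox, catalog: Dict[str, BBox]) -> List[str]:
    """Return dataset IDs whose extent intersects the event footprint."""
    return sorted(ds for ds, extent in catalog.items()
                  if extent.intersects(event))


# Hypothetical catalog of dataset extents.
catalog = {
    "dataset_a": BBox(-120.5, 35.0, -119.5, 36.0),
    "dataset_b": BBox(-111.0, 44.0, -110.0, 45.0),
}
event = event_footprint(lon=-120.0, lat=35.5, buffer_deg=0.25)
promote = datasets_to_promote(event, catalog)
```

Datasets returned by `datasets_to_promote` would be the candidates for staging onto high performance storage.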
Activity-based data ranking and tiered cloud & HPC integrated storage
1. On-demand job execution on Gordon (XSEDE HPC Resource)
OT has a dedicated Gordon I/O node XSEDE allocation with 48 GB memory/4.8 TB
flash memory + 16 compute nodes (256 cores) with 64 GB memory + QDR InfiniBand
interconnect.
Performance tests using a DEM generation use case showed 20x job speed-ups
when four concurrent jobs are executed on Gordon vs OT's standard compute cluster.
Test case: 208 million LIDAR returns gridded to a 20 cm grid.
2. Integration of cloud-based on-demand geospatial processing services
OT received a Microsoft Azure for Research Award (allocated $40k in Azure Resources) to
explore integration of cloud resources into our existing infrastructure.
A prototype OT image on Azure VM depot allows us (or others) to quickly deploy the OT
software stack on an appropriately sized resource.
Data can be pulled from OT’s storage on the SDSC Cloud for processing in Azure.
USE CASE: TauDEM hydrologic analysis of DEMs
TauDEM is an open source hydrologic analysis toolkit developed by David Tarboton
(USU), http://www.engineering.usu.edu/dtarb/.
As part of OT’s CyberGIS collaboration, we implemented TauDEM (MPI) on Gordon. We
dynamically scale the number of cores allocated to the job as a function of the size of
the input DEM.
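The TauDEM use case above scales the number of cores with the size of the input DEM. A minimal sketch of one such rule follows. The 16 cores per node and 256-core ceiling follow from the allocation figures above (256 cores across 16 compute nodes), while `cells_per_core` and the rounding to whole nodes are assumptions made for illustration, not the rule OT actually uses.

```python
import math

CORES_PER_NODE = 16   # 256 cores / 16 compute nodes in the stated allocation
MAX_CORES = 256       # size of the stated allocation


def cores_for_dem(n_cells: int, cells_per_core: int = 50_000_000) -> int:
    """Choose an MPI core count proportional to the DEM size.

    cells_per_core is an assumed tuning value. The result is rounded up
    to whole nodes and capped at the allocation size.
    """
    if n_cells <= 0:
        raise ValueError("DEM must contain at least one cell")
    wanted = math.ceil(n_cells / cells_per_core)
    nodes = math.ceil(wanted / CORES_PER_NODE)
    return min(nodes * CORES_PER_NODE, MAX_CORES)


small = cores_for_dem(10_000_000)        # fits on a single node
medium = cores_for_dem(2_000_000_000)    # 40 cores wanted, rounded up to 3 nodes
huge = cores_for_dem(1_000_000_000_000)  # capped at the allocation
```

The returned value would typically be passed as the process count to the MPI launcher when the job is submitted.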
The OpenTopography cyberinfrastructure employs a multi-tier service-oriented architecture (SOA) that is highly scalable, permitting upgrades to the infrastructure tier and
corresponding algorithms without the need to update the APIs and clients. The SOA has enabled the integration of compute intensive algorithms, like the TauDEM hydrology
suite running on the Gordon XSEDE resource, as a service made available to the OpenTopography user community. The pluggable services architecture allows researchers
to integrate their algorithms into the OpenTopography processing workflow. OpenTopography also interoperates with other CI systems like the NSF-funded CyberGIS
viewshed analysis application, NASA SSARA, etc.
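The pluggable services idea can be illustrated with a small registry in which algorithms are registered under a name and invoked through one stable entry point, so that implementations can change without changing the caller. All names here (`register`, `run_service`, `local_relief`) are hypothetical and do not reflect the actual OpenTopography API.

```python
from typing import Callable, Dict, List

Grid = List[List[float]]            # a DEM as rows of elevation values
Algorithm = Callable[[Grid], Grid]  # contract every plugin must satisfy

_REGISTRY: Dict[str, Algorithm] = {}


def register(name: str) -> Callable[[Algorithm], Algorithm]:
    """Decorator that plugs an algorithm into the service registry."""
    def decorator(func: Algorithm) -> Algorithm:
        if name in _REGISTRY:
            raise ValueError(f"service {name!r} is already registered")
        _REGISTRY[name] = func
        return func
    return decorator


def run_service(name: str, dem: Grid) -> Grid:
    """Stable entry point: clients only need the service name and input."""
    if name not in _REGISTRY:
        raise LookupError(f"unknown service {name!r}")
    return _REGISTRY[name](dem)


@register("local_relief")
def local_relief(dem: Grid) -> Grid:
    """Example plugin: elevation above the lowest cell in the grid."""
    lowest = min(min(row) for row in dem)
    return [[z - lowest for z in row] for row in dem]


result = run_service("local_relief", [[10.0, 12.0], [11.0, 15.0]])
```

A researcher's algorithm only has to satisfy the `Algorithm` signature to be callable through `run_service`, which is the property the architecture description emphasizes.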
OpenTopography implements a Catalog Service for the Web (CSW)
using the ISO 19115 metadata standard, which can be federated with
other environments, e.g., NSF EarthCube, Thomson Reuters Web of
Science, etc. All datasets served via OpenTopography are assigned a
DOI, which provides a persistent identifier for the dataset.
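As a sketch of how a federating client might query such a catalog, the snippet below builds an OGC CSW 2.0.2 GetRecords request using key-value parameters and asks for ISO metadata output. The endpoint URL is a placeholder, since the actual service address is not given here, and no request is sent.

```python
from urllib.parse import urlencode

# Placeholder endpoint; the real CSW service URL is not stated in the source.
CSW_ENDPOINT = "https://example.org/csw"


def build_getrecords_url(endpoint: str, max_records: int = 10) -> str:
    """Build a CSW 2.0.2 GetRecords URL requesting ISO (gmd) metadata records."""
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "resultType": "results",
        "elementSetName": "summary",
        "outputSchema": "http://www.isotc211.org/2005/gmd",
        "maxRecords": str(max_records),
    }
    return f"{endpoint}?{urlencode(params)}"


url = build_getrecords_url(CSW_ENDPOINT)
```

Whether a given server accepts this exact parameter combination depends on its configuration, so its GetCapabilities response should be checked first.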
A cover image of Science featured a 0.25 m digital elevation model
(DEM) and hillshade of offset channels along the San Andreas
Fault in the Carrizo Plain, produced by OpenTopography.
The OpenTopography facility was funded by the National Science Foundation (NSF) in 2009 to provide efficient online access to Earth science-oriented high-resolution lidar topography data, online processing tools, and derivative products.
Currently, OpenTopography serves 183 high resolution LIDAR (Light Detection and Ranging) point cloud datasets with over 820 billion returns covering approximately 179,153 sq. km. of important geologic
features such as the San Andreas Fault, Yellowstone, Tetons, Yosemite National Parks, etc., to a growing user community. Information collected from over 42,250 custom point cloud jobs that have processed
upwards of 1.4 trillion LIDAR returns, and from over 19,800 custom raster data jobs, is being analyzed to prioritize future development based on usage insights and to identify novel approaches to managing
the exponential growth in data.
Collaboration Opportunities
Analysis of user behavior and data usage for optimizing data location in deep storage/memory hierarchies
Pluggable services framework - Tracking software provenance / framework security
New data types - Full waveform LIDAR, Hyperspectral Imagery data
New processing algorithms - change detection, difference analysis and time series analysis. Algorithm optimizations/parallelization
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 

OpenTopography - Scalable Services for Geosciences Data

www.opentopography.org, @opentopography, info@opentopography.org

[Figure labels: Canopy Height (ft); DOI / OGC CSW]

DATA USAGE ANALYTICS

Spatiotemporal variations in data access illustrate that certain regions of a dataset can be "cold", while others are "hot". OT collects analytics that include user data selections through time. We have developed tools that allow us to mine and visualize this information, and are exploring how to use these analytics to develop storage optimizations based on data value and cost.

For the hottest data, fast I/O and scalable access are required. In these cases, data stored on SSD and accessible through HPC systems such as Gordon are desirable. For "cooler" data that are accessed less frequently, cheaper (and slower) storage systems such as the cloud can be used to lower data facility operating costs. A tiered storage system offers the potential to dynamically manage data storage and associated system performance based on real analytical information about usage.

In the case of topographic data, events such as earthquakes, floods, landslides, and other geophysical events are likely to cause an increase in demand for data that intersect the spatial extent of the event. External feeds (e.g., USGS NEIC) could be monitored to proactively move data into high-performance storage in anticipation of increased demand.

[Figure: Activity-based data ranking and tiered cloud & HPC integrated storage]

HPC & CLOUD INTEGRATION

1. On-demand job execution on Gordon (XSEDE HPC Resource)

OT has a dedicated Gordon I/O node XSEDE allocation with 48 GB memory / 4.8 TB flash memory + 16 compute nodes (256 cores) with 64 GB memory + QDR InfiniBand interconnect. Performance tests using a DEM generation use case showed 20x job speed-ups when four concurrent jobs were executed on Gordon vs. OT's standard compute cluster. Test case: 208 million LIDAR returns gridded to a 20 cm grid.

USE CASE: TauDEM hydrologic analysis of DEMs

TauDEM is an open source hydrologic analysis toolkit developed by David Tarboton (USU, http://www.engineering.usu.edu/dtarb/). As part of OT's CyberGIS collaboration, we implemented TauDEM (MPI) on Gordon. We dynamically scale the number of cores allocated to the job as a function of the size of the input DEM.

2. Integration of cloud-based on-demand geospatial processing services

OT received a Microsoft Azure for Research Award (allocated $40k in Azure resources) to explore integration of cloud resources into our existing infrastructure. A prototype OT image on Azure VM depot allows us (or others) to quickly deploy the OT software stack on an appropriately sized resource. Data can be pulled from OT's storage on the SDSC Cloud for processing in Azure.

CYBERINFRASTRUCTURE

The OpenTopography cyberinfrastructure employs a multi-tier service-oriented architecture (SOA) that is highly scalable, permitting upgrades to the infrastructure tier and corresponding algorithms without the need to update the APIs and clients. The SOA has enabled the integration of compute-intensive algorithms, like the TauDEM hydrology suite running on the Gordon XSEDE resource, as a service made available to the OpenTopography user community. The pluggable services architecture allows researchers to integrate their algorithms into the OpenTopography processing workflow. OpenTopography also interoperates with other CI systems like the NSF-funded CyberGIS viewshed analysis application, NASA SSARA, etc.

OpenTopography implements a Catalog Service for the Web (CSW), using the ISO 19115 metadata standard, that can be federated with other environments, e.g., NSF EarthCube, Thomson Reuters Web of Science, etc. All datasets served via OpenTopography are assigned a DOI, which provides a persistent identifier for the dataset.
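The TauDEM use case above scales the number of cores with the size of the input DEM. The poster does not give the actual rule, so the following is only a minimal sketch of one way such scaling could work. The function name `cores_for_dem` and the constant `CELLS_PER_CORE` are hypothetical; the 16-cores-per-node and 256-core figures follow from the allocation described above (16 compute nodes, 256 cores).

```python
import math

CORES_PER_NODE = 16           # 256 cores / 16 compute nodes, per the allocation above
MAX_CORES = 256               # upper bound set by the allocation above
CELLS_PER_CORE = 25_000_000   # hypothetical tuning constant, not an OT value


def cores_for_dem(rows: int, cols: int) -> int:
    """Return a core count that grows with DEM size, rounded up to whole nodes."""
    cells = rows * cols
    # At least one core, then one more for every CELLS_PER_CORE grid cells.
    wanted = max(1, math.ceil(cells / CELLS_PER_CORE))
    # Request whole nodes, and never exceed the allocation.
    nodes = math.ceil(wanted / CORES_PER_NODE)
    return min(nodes * CORES_PER_NODE, MAX_CORES)


if __name__ == "__main__":
    for shape in [(1_000, 1_000), (40_000, 40_000), (200_000, 200_000)]:
        print(shape, cores_for_dem(*shape))
```

The returned value could then be passed as the process count to an MPI launcher when the job is submitted. Rounding up to whole nodes is a design choice made here for simplicity and may not match how the real jobs are scheduled.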
[Figure: The cover image of Science featured a 0.25 m digital elevation model (DEM) and hillshade of offset channels along the San Andreas Fault in the Carrizo Plain, produced by OpenTopography.]

The OpenTopography facility was funded by the National Science Foundation (NSF) in 2009 to provide efficient online access to Earth science-oriented high-resolution lidar topography data, online processing tools, and derivative products. Currently, OpenTopography serves 183 high-resolution LIDAR (Light Detection and Ranging) point cloud datasets with over 820 billion returns covering approximately 179,153 sq. km. of important geologic features such as the San Andreas Fault, Yellowstone, Tetons, Yosemite National Parks, etc., to a growing user community. Information collected from over 42,250 custom point cloud jobs, which have processed upwards of 1.4 trillion LIDAR returns, and over 19,800 custom raster data jobs is being analyzed to prioritize future development based on usage insights and to identify novel approaches to managing the exponential growth in data.

Collaboration Opportunities

  • Analysis of user behavior and data usage for optimizing data location in deep storage/memory hierarchies
  • Pluggable services framework: tracking software provenance, framework security
  • New data types: full waveform LIDAR, hyperspectral imagery data
  • New processing algorithms: change detection, difference analysis, and time series analysis
  • Algorithm optimizations/parallelization
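The idea of using access analytics to decide where data lives (hot data on fast storage, cold data on cheaper storage, and proactive promotion of data that intersect an event) can be illustrated with a small sketch. This is not OT's implementation: the tile IDs, the tier names "fast" and "slow", the `fast_slots` capacity parameter, and the functions below are all hypothetical, and the event bounding box stands in for whatever an external feed would supply.

```python
from collections import Counter

# Bounding box as (min_lon, min_lat, max_lon, max_lat); a hypothetical convention.
BBox = tuple[float, float, float, float]


def assign_tiers(selections: list[str], fast_slots: int) -> dict[str, str]:
    """Rank tiles by how often users selected them; the top ones go to fast storage."""
    ranked = Counter(selections).most_common()  # hottest tiles first
    return {
        tile: ("fast" if rank < fast_slots else "slow")
        for rank, (tile, _count) in enumerate(ranked)
    }


def intersects(a: BBox, b: BBox) -> bool:
    """True when two axis-aligned bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def promote_for_event(
    tiers: dict[str, str], tile_bounds: dict[str, BBox], event_bbox: BBox
) -> dict[str, str]:
    """Return a copy of the tier map with every tile that intersects the event on 'fast'."""
    promoted = dict(tiers)
    for tile, bounds in tile_bounds.items():
        if intersects(bounds, event_bbox):
            promoted[tile] = "fast"
    return promoted


if __name__ == "__main__":
    log = ["t1", "t1", "t1", "t2", "t2", "t3"]  # made-up selection log
    tiers = assign_tiers(log, fast_slots=1)
    bounds = {
        "t1": (-120.0, 35.0, -119.0, 36.0),
        "t2": (-118.0, 34.0, -117.0, 35.0),
        "t3": (-110.0, 44.0, -109.0, 45.0),
    }
    print(promote_for_event(tiers, bounds, (-117.5, 34.2, -117.0, 34.8)))
```

Ranking by raw selection count is the simplest possible notion of "data value"; a real policy would likely also weigh recency, dataset size, and storage cost.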