SlideShare a Scribd company logo
1 of 25
Download to read offline
19/11/2020
DAOS : SEIS mapping
DUG’20
Mirna Moawad*, Amr Nasr (Brightskies)
Johann Lombardi, Mohamad Chaarawi, Philippe Thierry (Intel)
1
DAOS User Group 2020DAOS User Group 2020
Agenda
• Marine seismic acquisition
• Motivation
• DAOS-SEIS mapping Graph
• DAOS-SEIS API
• Benchmarking Results
2
DAOS User Group 2020DAOS User Group 2020 3
Marine seismic acquisition
· 1 shot :
· 8 streamers*1000 receivers*5001 samples*4 bytes=
0.15 GB
· 1 line :
· 20km / 25m = >> 800 shots per line = 120GB per line
· 2400 lines
· => 280 TB of Raw seismic data for about 1200 km²
acquisition
Raw data
Sorting
Mute
Filter
Filter 1
Filter 2 … to the final result
Depthkm.
Distance km.
DAOS User Group 2020DAOS User Group 2020
Agenda
• Marine seismic acquisition
• Motivation
• DAOS-SEIS mapping Graph
• DAOS-SEIS API
• Benchmarking Results
4
DAOS User Group 2020 5
Seg-Y Traditional Structure
Data is interleaved with metadataFile Headers Optional File
Headers
DAOS User Group 2020 6
Limitations
● A new copy of the data is created along with each processing step.
● Serial accessing of data in a segy file.
● Most processing applications access traces headers to decide the access pattern to the
data.
Goals
● Leverage object based storage system and inherent metadata/data separation.
● Reduce number of copies through utilizing daos snapshots.
● Store only updates through utilizing daos versioning object storage.
Motivation
DAOS User Group 2020DAOS User Group 2020
Agenda
• Marine seismic acquisition
• Motivation
• DAOS-SEIS mapping Graph
• DAOS-SEIS API
• Benchmarking Results
7
DAOS User Group 2020DAOS User Group 2020
What is DAOS-SEIS ?
Blue : Dkey
Red : Akey
Black : Object
8
DAOS User Group 2020DAOS User Group 2020
Trace Header & Trace Data
Blue : Dkey
Red : Akey
Black : Object
DAOS User Group 2020DAOS User Group 2020
Blue : Dkey
Red : Akey
Black : Object
Seismic Graph Representation
Graph
Properties
Indexing
Trace Data/MetaData
DAOS User Group 2020DAOS User Group 2020
Agenda
• Marine seismic acquisition
• Motivation
• DAOS-SEIS mapping Graph
• DAOS-SEIS API
• Benchmarking Results
11
DAOS User Group 2020DAOS User Group 2020
DAOS-SEIS API
12
DAOS User Group 2020DAOS User Group 2020
Agenda
• Marine seismic acquisition
• Motivation
• DAOS-SEIS mapping Graph
• DAOS-SEIS API
• Benchmarking Results
13
DAOS User Group 2020DAOS User Group 2020
Early Results : Seismic Unix vs DAOS-SEIS
14
● Comparison of the well known library of seismic unix with our DAOS seismic mapping
○ Test seismic unix on a parallel file system(PFS).
○ The parallel file system is not using the same hardware used for the DAOS
system.
● The daos system used to get these results is consisting 1 server node and 1 client node
DAOS User Group 2020DAOS User Group 2020
Early Results : Windowing
15
● Filtering data to read an arbitrary shot ID from the dataset.
● We can notice that the time taken for the DAOS is almost constant even when the size of the dataset
increased tenfold, due to the graph we’ve built when parsing, just increasing from 1.5 second to 2 second.
● Speedup from seismic unix would increase with the size of data since SU time scales with the size of the
data, on a 10 GB data, DAOS about 20x faster, while on 100 GB data, it was 103x faster than its equivalent.
DAOS User Group 2020DAOS User Group 2020
Early Results : Sorting
16
● We noticed a 4x reduction on sorting
from seismic unix.
● Read results slower than on PFS due to
sequential read, which indicates we
need to move to async API to increase
queue depth.
● Dataset was also small, so we suspect
a cache effect with the PFS.
● Further tuning is required.
DAOS User Group 2020DAOS User Group 2020
Next Steps
● Move to DAOS async API to increase queue depth.
● A more detailed comparison against different parallel file systems.
● Optimize current API implementation.
● Parallelize Parsing and sorting functionalities.
● Cover more Seismic processing functionalities in our API.
● Integrate DAOS-SEIS API in existing seismic processing frameworks.
DAOS User Group 2020DAOS User Group 2020
Resources
● Link to the DAOS-SEIS mapping wiki page:
○ https://wiki.hpdd.intel.com/display/DC/DAOS-SEGY+Mapping
● Link to the open-source github repository of the DAOS segy mapping:
○ https://github.com/daos-stack/segy-daos
Thanks
19
DAOS User Group 2020DAOS User Group 2020
Appendix: One level seismic gather object
Blue : Dkey
Red : Akey
Black : Object
DAOS User Group 2020DAOS User Group 2020
Appendix: Multi-Level seismic gather object
Blue : Dkey
Red : Akey
Black : Object
DAOS User Group 2020DAOS User Group 2020
Appendix: Complex gather object
Blue : Dkey
Red : Akey
Black : Object
DAOS User Group 2020DAOS User Group 2020
Appendix: File Header
Blue : Dkey
Red : Akey
Black : Object
DAOS User Group 2020DAOS User Group 2020
Appendix: Root Seismic Object
Blue : Dkey
Red : Akey
Black : Object
DAOS User Group 2020DAOS User Group 2020
Appendix: Window functionality
Data 1
Header
Shot
ID 601
Data 2
Header
Shot
ID 602
Data 3
Header
Shot
ID 603
Data 4
Header
Shot
ID 604
Data 2
Header
Shot
ID 602
Data 3
Header
Shot
ID 603Windowing on shot ID
from 602 to 603 :
Filter all traces in the
data passing by those
with specific values in a
header.

More Related Content

What's hot

IoFMT – Internet of Fleet Management Things
IoFMT – Internet of Fleet Management ThingsIoFMT – Internet of Fleet Management Things
IoFMT – Internet of Fleet Management ThingsDataWorks Summit
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiDatabricks
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeDatabricks
 
Diving into Delta Lake: Unpacking the Transaction Log
Diving into Delta Lake: Unpacking the Transaction LogDiving into Delta Lake: Unpacking the Transaction Log
Diving into Delta Lake: Unpacking the Transaction LogDatabricks
 
Acid ORC, Iceberg and Delta Lake
Acid ORC, Iceberg and Delta LakeAcid ORC, Iceberg and Delta Lake
Acid ORC, Iceberg and Delta LakeMichal Gancarski
 
Data Warehouse Evolution Roadshow
Data Warehouse Evolution RoadshowData Warehouse Evolution Roadshow
Data Warehouse Evolution RoadshowMapR Technologies
 

What's hot (19)

HDFCloud Workshop: HDF5 in the Cloud
HDFCloud Workshop: HDF5 in the CloudHDFCloud Workshop: HDF5 in the Cloud
HDFCloud Workshop: HDF5 in the Cloud
 
HDF5 Performance Enhancements with the Elimination of Unlimited Dimension
HDF5 Performance Enhancements with the Elimination of Unlimited DimensionHDF5 Performance Enhancements with the Elimination of Unlimited Dimension
HDF5 Performance Enhancements with the Elimination of Unlimited Dimension
 
HDF and netCDF Data Support in ArcGIS
HDF and netCDF Data Support in ArcGISHDF and netCDF Data Support in ArcGIS
HDF and netCDF Data Support in ArcGIS
 
IoFMT – Internet of Fleet Management Things
IoFMT – Internet of Fleet Management ThingsIoFMT – Internet of Fleet Management Things
IoFMT – Internet of Fleet Management Things
 
Utilizing HDF4 File Content Maps for the Cloud Computing
Utilizing HDF4 File Content Maps for the Cloud ComputingUtilizing HDF4 File Content Maps for the Cloud Computing
Utilizing HDF4 File Content Maps for the Cloud Computing
 
SPD and KEA: HDF5 based file formats for Earth Observation
SPD and KEA: HDF5 based file formats for Earth ObservationSPD and KEA: HDF5 based file formats for Earth Observation
SPD and KEA: HDF5 based file formats for Earth Observation
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
 
Big data applications
Big data applicationsBig data applications
Big data applications
 
Riak TS
Riak TSRiak TS
Riak TS
 
GDAL Enhancement for ESDIS Project
GDAL Enhancement for ESDIS ProjectGDAL Enhancement for ESDIS Project
GDAL Enhancement for ESDIS Project
 
NEON HDF5
NEON HDF5NEON HDF5
NEON HDF5
 
Bicod2017
Bicod2017Bicod2017
Bicod2017
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
 
Improved Methods for Accessing Scientific Data for the Masses
Improved Methods for Accessing Scientific Data for the MassesImproved Methods for Accessing Scientific Data for the Masses
Improved Methods for Accessing Scientific Data for the Masses
 
Diving into Delta Lake: Unpacking the Transaction Log
Diving into Delta Lake: Unpacking the Transaction LogDiving into Delta Lake: Unpacking the Transaction Log
Diving into Delta Lake: Unpacking the Transaction Log
 
Acid ORC, Iceberg and Delta Lake
Acid ORC, Iceberg and Delta LakeAcid ORC, Iceberg and Delta Lake
Acid ORC, Iceberg and Delta Lake
 
Data Warehouse Evolution Roadshow
Data Warehouse Evolution RoadshowData Warehouse Evolution Roadshow
Data Warehouse Evolution Roadshow
 
MATLAB and Scientific Data: New Features and Capabilities
MATLAB and Scientific Data: New Features and CapabilitiesMATLAB and Scientific Data: New Features and Capabilities
MATLAB and Scientific Data: New Features and Capabilities
 
Statistical data in RDF
Statistical data in RDFStatistical data in RDF
Statistical data in RDF
 

Similar to DUG'20: 08 - DAOS-SEGY Mapping

How to use NCI's national repository of big spatial data collections
How to use NCI's national repository of big spatial data collectionsHow to use NCI's national repository of big spatial data collections
How to use NCI's national repository of big spatial data collectionsARDC
 
rasdaman: from barebone Arrays to DataCubes
rasdaman: from barebone Arrays to DataCubesrasdaman: from barebone Arrays to DataCubes
rasdaman: from barebone Arrays to DataCubesEUDAT
 
State of Cassandra 2012
State of Cassandra 2012State of Cassandra 2012
State of Cassandra 2012jbellis
 
Unlocking Open Data in the Cloud
Unlocking Open Data in the CloudUnlocking Open Data in the Cloud
Unlocking Open Data in the CloudAmazon Web Services
 
Denodo Platform 7.0: Redefine Analytics with In-Memory Parallel Processing an...
Denodo Platform 7.0: Redefine Analytics with In-Memory Parallel Processing an...Denodo Platform 7.0: Redefine Analytics with In-Memory Parallel Processing an...
Denodo Platform 7.0: Redefine Analytics with In-Memory Parallel Processing an...Denodo
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
dashDB: the GIS professional’s bridge to mainstream IT systems
dashDB: the GIS professional’s bridge to mainstream IT systemsdashDB: the GIS professional’s bridge to mainstream IT systems
dashDB: the GIS professional’s bridge to mainstream IT systemsIBM Cloud Data Services
 
DUG'20: 06 - DAOS Adventures at CERN Openlab
DUG'20: 06 - DAOS Adventures at CERN OpenlabDUG'20: 06 - DAOS Adventures at CERN Openlab
DUG'20: 06 - DAOS Adventures at CERN OpenlabAndrey Kudryavtsev
 
State of GeoServer - FOSS4G 2016
State of GeoServer - FOSS4G 2016State of GeoServer - FOSS4G 2016
State of GeoServer - FOSS4G 2016GeoSolutions
 
In Memory Parallel Processing for Big Data Scenarios
In Memory Parallel Processing for Big Data ScenariosIn Memory Parallel Processing for Big Data Scenarios
In Memory Parallel Processing for Big Data ScenariosDenodo
 
#EarthOnAWS | AWS Public Sector Summit 2017
#EarthOnAWS | AWS Public Sector Summit 2017#EarthOnAWS | AWS Public Sector Summit 2017
#EarthOnAWS | AWS Public Sector Summit 2017Amazon Web Services
 
Hadoop & no sql new generation database systems
Hadoop & no sql   new generation database systemsHadoop & no sql   new generation database systems
Hadoop & no sql new generation database systemsramazan fırın
 
Complex Ephemeral Caching With Redis: Jeff Pollard
Complex Ephemeral Caching With Redis: Jeff PollardComplex Ephemeral Caching With Redis: Jeff Pollard
Complex Ephemeral Caching With Redis: Jeff PollardRedis Labs
 
State of GeoServer 2.10
State of GeoServer 2.10State of GeoServer 2.10
State of GeoServer 2.10Jody Garnett
 
DUG'20: 02 - Accelerating apache spark with DAOS on Aurora
DUG'20: 02 - Accelerating apache spark with DAOS on AuroraDUG'20: 02 - Accelerating apache spark with DAOS on Aurora
DUG'20: 02 - Accelerating apache spark with DAOS on AuroraAndrey Kudryavtsev
 
Scalable Data Analytics and Visualization with Cloud Optimized Services
Scalable Data Analytics and Visualization with Cloud Optimized ServicesScalable Data Analytics and Visualization with Cloud Optimized Services
Scalable Data Analytics and Visualization with Cloud Optimized ServicesGlobus
 

Similar to DUG'20: 08 - DAOS-SEGY Mapping (20)

How to use NCI's national repository of big spatial data collections
How to use NCI's national repository of big spatial data collectionsHow to use NCI's national repository of big spatial data collections
How to use NCI's national repository of big spatial data collections
 
rasdaman: from barebone Arrays to DataCubes
rasdaman: from barebone Arrays to DataCubesrasdaman: from barebone Arrays to DataCubes
rasdaman: from barebone Arrays to DataCubes
 
Geospatial Data Abstraction Library (GDAL) Enhancement for ESDIS (GEE)
Geospatial Data Abstraction Library (GDAL) Enhancement for ESDIS (GEE)Geospatial Data Abstraction Library (GDAL) Enhancement for ESDIS (GEE)
Geospatial Data Abstraction Library (GDAL) Enhancement for ESDIS (GEE)
 
State of Cassandra 2012
State of Cassandra 2012State of Cassandra 2012
State of Cassandra 2012
 
Unlocking Open Data in the Cloud
Unlocking Open Data in the CloudUnlocking Open Data in the Cloud
Unlocking Open Data in the Cloud
 
Denodo Platform 7.0: Redefine Analytics with In-Memory Parallel Processing an...
Denodo Platform 7.0: Redefine Analytics with In-Memory Parallel Processing an...Denodo Platform 7.0: Redefine Analytics with In-Memory Parallel Processing an...
Denodo Platform 7.0: Redefine Analytics with In-Memory Parallel Processing an...
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
dashDB: the GIS professional’s bridge to mainstream IT systems
dashDB: the GIS professional’s bridge to mainstream IT systemsdashDB: the GIS professional’s bridge to mainstream IT systems
dashDB: the GIS professional’s bridge to mainstream IT systems
 
DUG'20: 06 - DAOS Adventures at CERN Openlab
DUG'20: 06 - DAOS Adventures at CERN OpenlabDUG'20: 06 - DAOS Adventures at CERN Openlab
DUG'20: 06 - DAOS Adventures at CERN Openlab
 
Earth Science Data and Information System (ESDIS) Project Update
Earth Science Data and Information System (ESDIS) Project UpdateEarth Science Data and Information System (ESDIS) Project Update
Earth Science Data and Information System (ESDIS) Project Update
 
State of GeoServer - FOSS4G 2016
State of GeoServer - FOSS4G 2016State of GeoServer - FOSS4G 2016
State of GeoServer - FOSS4G 2016
 
Benefits of Hadoop as Platform as a Service
Benefits of Hadoop as Platform as a ServiceBenefits of Hadoop as Platform as a Service
Benefits of Hadoop as Platform as a Service
 
HDF & HDF-EOS Data & Support at NSIDC
HDF & HDF-EOS Data & Support at NSIDCHDF & HDF-EOS Data & Support at NSIDC
HDF & HDF-EOS Data & Support at NSIDC
 
In Memory Parallel Processing for Big Data Scenarios
In Memory Parallel Processing for Big Data ScenariosIn Memory Parallel Processing for Big Data Scenarios
In Memory Parallel Processing for Big Data Scenarios
 
#EarthOnAWS | AWS Public Sector Summit 2017
#EarthOnAWS | AWS Public Sector Summit 2017#EarthOnAWS | AWS Public Sector Summit 2017
#EarthOnAWS | AWS Public Sector Summit 2017
 
Hadoop & no sql new generation database systems
Hadoop & no sql   new generation database systemsHadoop & no sql   new generation database systems
Hadoop & no sql new generation database systems
 
Complex Ephemeral Caching With Redis: Jeff Pollard
Complex Ephemeral Caching With Redis: Jeff PollardComplex Ephemeral Caching With Redis: Jeff Pollard
Complex Ephemeral Caching With Redis: Jeff Pollard
 
State of GeoServer 2.10
State of GeoServer 2.10State of GeoServer 2.10
State of GeoServer 2.10
 
DUG'20: 02 - Accelerating apache spark with DAOS on Aurora
DUG'20: 02 - Accelerating apache spark with DAOS on AuroraDUG'20: 02 - Accelerating apache spark with DAOS on Aurora
DUG'20: 02 - Accelerating apache spark with DAOS on Aurora
 
Scalable Data Analytics and Visualization with Cloud Optimized Services
Scalable Data Analytics and Visualization with Cloud Optimized ServicesScalable Data Analytics and Visualization with Cloud Optimized Services
Scalable Data Analytics and Visualization with Cloud Optimized Services
 

More from Andrey Kudryavtsev

DUG'20: 13 - HPE’s DAOS Solution Plans
DUG'20: 13 - HPE’s DAOS Solution PlansDUG'20: 13 - HPE’s DAOS Solution Plans
DUG'20: 13 - HPE’s DAOS Solution PlansAndrey Kudryavtsev
 
DUG'20: 12 - DAOS in Lenovo’s HPC Innovation Center
DUG'20: 12 - DAOS in Lenovo’s HPC Innovation CenterDUG'20: 12 - DAOS in Lenovo’s HPC Innovation Center
DUG'20: 12 - DAOS in Lenovo’s HPC Innovation CenterAndrey Kudryavtsev
 
DUG'20: 11 - Platform Performance Evolution from bring-up to reaching link sa...
DUG'20: 11 - Platform Performance Evolution from bring-up to reaching link sa...DUG'20: 11 - Platform Performance Evolution from bring-up to reaching link sa...
DUG'20: 11 - Platform Performance Evolution from bring-up to reaching link sa...Andrey Kudryavtsev
 
DUG'20: 10 - Storage Orchestration for Composable Storage Architectures
DUG'20: 10 - Storage Orchestration for Composable Storage ArchitecturesDUG'20: 10 - Storage Orchestration for Composable Storage Architectures
DUG'20: 10 - Storage Orchestration for Composable Storage ArchitecturesAndrey Kudryavtsev
 
DUG'20: 09 - DAOS Middleware Update
DUG'20: 09 - DAOS Middleware UpdateDUG'20: 09 - DAOS Middleware Update
DUG'20: 09 - DAOS Middleware UpdateAndrey Kudryavtsev
 
DUG'20: 07 - Storing High-Energy Physics data in DAOS
DUG'20: 07 - Storing High-Energy Physics data in DAOSDUG'20: 07 - Storing High-Energy Physics data in DAOS
DUG'20: 07 - Storing High-Energy Physics data in DAOSAndrey Kudryavtsev
 
DUG'20: 05 - Very Early Experiences with a 0.5 PByte DAOS Testbed
DUG'20: 05 - Very Early Experiences with a 0.5 PByte DAOS TestbedDUG'20: 05 - Very Early Experiences with a 0.5 PByte DAOS Testbed
DUG'20: 05 - Very Early Experiences with a 0.5 PByte DAOS TestbedAndrey Kudryavtsev
 
DUG'20: 04 - DAOS Feature Update
DUG'20: 04 - DAOS Feature UpdateDUG'20: 04 - DAOS Feature Update
DUG'20: 04 - DAOS Feature UpdateAndrey Kudryavtsev
 
DUG'20: 03 - Online compression with QAT in DAOS
DUG'20: 03 - Online compression with QAT in DAOSDUG'20: 03 - Online compression with QAT in DAOS
DUG'20: 03 - Online compression with QAT in DAOSAndrey Kudryavtsev
 
DUG'20: 01 - Welcome & DAOS Update
DUG'20: 01 - Welcome & DAOS UpdateDUG'20: 01 - Welcome & DAOS Update
DUG'20: 01 - Welcome & DAOS UpdateAndrey Kudryavtsev
 

More from Andrey Kudryavtsev (11)

DUG'20: 13 - HPE’s DAOS Solution Plans
DUG'20: 13 - HPE’s DAOS Solution PlansDUG'20: 13 - HPE’s DAOS Solution Plans
DUG'20: 13 - HPE’s DAOS Solution Plans
 
DUG'20: 12 - DAOS in Lenovo’s HPC Innovation Center
DUG'20: 12 - DAOS in Lenovo’s HPC Innovation CenterDUG'20: 12 - DAOS in Lenovo’s HPC Innovation Center
DUG'20: 12 - DAOS in Lenovo’s HPC Innovation Center
 
DUG'20: 11 - Platform Performance Evolution from bring-up to reaching link sa...
DUG'20: 11 - Platform Performance Evolution from bring-up to reaching link sa...DUG'20: 11 - Platform Performance Evolution from bring-up to reaching link sa...
DUG'20: 11 - Platform Performance Evolution from bring-up to reaching link sa...
 
DUG'20: 10 - Storage Orchestration for Composable Storage Architectures
DUG'20: 10 - Storage Orchestration for Composable Storage ArchitecturesDUG'20: 10 - Storage Orchestration for Composable Storage Architectures
DUG'20: 10 - Storage Orchestration for Composable Storage Architectures
 
DUG'20: 09 - DAOS Middleware Update
DUG'20: 09 - DAOS Middleware UpdateDUG'20: 09 - DAOS Middleware Update
DUG'20: 09 - DAOS Middleware Update
 
DUG'20: 07 - Storing High-Energy Physics data in DAOS
DUG'20: 07 - Storing High-Energy Physics data in DAOSDUG'20: 07 - Storing High-Energy Physics data in DAOS
DUG'20: 07 - Storing High-Energy Physics data in DAOS
 
DUG'20: 05 - Very Early Experiences with a 0.5 PByte DAOS Testbed
DUG'20: 05 - Very Early Experiences with a 0.5 PByte DAOS TestbedDUG'20: 05 - Very Early Experiences with a 0.5 PByte DAOS Testbed
DUG'20: 05 - Very Early Experiences with a 0.5 PByte DAOS Testbed
 
DUG'20: 04 - DAOS Feature Update
DUG'20: 04 - DAOS Feature UpdateDUG'20: 04 - DAOS Feature Update
DUG'20: 04 - DAOS Feature Update
 
DUG'20: 03 - Online compression with QAT in DAOS
DUG'20: 03 - Online compression with QAT in DAOSDUG'20: 03 - Online compression with QAT in DAOS
DUG'20: 03 - Online compression with QAT in DAOS
 
DUG'20: 01 - Welcome & DAOS Update
DUG'20: 01 - Welcome & DAOS UpdateDUG'20: 01 - Welcome & DAOS Update
DUG'20: 01 - Welcome & DAOS Update
 
DAOS Middleware overview
DAOS Middleware overviewDAOS Middleware overview
DAOS Middleware overview
 

Recently uploaded

Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 

DUG'20: 08 - DAOS-SEGY Mapping

  • 1. 19/11/2020 DAOS : SEIS mapping DUG’20 Mirna Moawad*, Amr Nasr (Brightskies) Johann Lombardi, Mohamad Chaarawi, Philippe Thierry (Intel) 1
  • 2. DAOS User Group 2020DAOS User Group 2020 Agenda • Marine seismic acquisition • Motivation • DAOS-SEIS mapping Graph • DAOS-SEIS API • Benchmarking Results 2
  • 3. DAOS User Group 2020DAOS User Group 2020 3 Marine seismic acquisition · 1 shot : · 8 streamers*1000 receivers*5001 samples*4 bytes= 0.15 GB · 1 line : · 20km / 25m = >> 800 shots per line = 120GB per line · 2400 lines · => 280 TB of Raw seismic data for about 1200 km² acquisition Raw data Sorting Mute Filter Filter 1 Filter 2 … to the final result Depthkm. Distance km.
  • 4. DAOS User Group 2020DAOS User Group 2020 Agenda • Marine seismic acquisition • Motivation • DAOS-SEIS mapping Graph • DAOS-SEIS API • Benchmarking Results 4
  • 5. DAOS User Group 2020 5 Seg-Y Traditional Structure Data is interleaved with metadataFile Headers Optional File Headers
  • 6. DAOS User Group 2020 6 Limitations ● A new copy of the data is created along with each processing step. ● Serial accessing of data in a segy file. ● Most processing applications access traces headers to decide the access pattern to the data. Goals ● Leverage object based storage system and inherent metadata/data separation. ● Reduce number of copies through utilizing daos snapshots. ● Store only updates through utilizing daos versioning object storage. Motivation
  • 7. DAOS User Group 2020DAOS User Group 2020 Agenda • Marine seismic acquisition • Motivation • DAOS-SEIS mapping Graph • DAOS-SEIS API • Benchmarking Results 7
  • 8. DAOS User Group 2020DAOS User Group 2020 What is DAOS-SEIS ? Blue : Dkey Red : Akey Black : Object 8
  • 9. DAOS User Group 2020DAOS User Group 2020 Trace Header & Trace Data Blue : Dkey Red : Akey Black : Object
  • 10. DAOS User Group 2020DAOS User Group 2020 Blue : Dkey Red : Akey Black : Object Seismic Graph Representation Graph Properties Indexing Trace Data/MetaData
  • 11. DAOS User Group 2020DAOS User Group 2020 Agenda • Marine seismic acquisition • Motivation • DAOS-SEIS mapping Graph • DAOS-SEIS API • Benchmarking Results 11
  • 12. DAOS User Group 2020DAOS User Group 2020 DAOS-SEIS API 12
  • 13. DAOS User Group 2020DAOS User Group 2020 Agenda • Marine seismic acquisition • Motivation • DAOS-SEIS mapping Graph • DAOS-SEIS API • Benchmarking Results 13
  • 14. DAOS User Group 2020DAOS User Group 2020 Early Results : Seismic Unix vs DAOS-SEIS 14 ● Comparison of the well known library of seismic unix with our DAOS seismic mapping ○ Test seismic unix on a parallel file system(PFS). ○ The parallel file system is not using the same hardware used for the DAOS system. ● The daos system used to get these results is consisting 1 server node and 1 client node
  • 15. DAOS User Group 2020DAOS User Group 2020 Early Results : Windowing 15 ● Filtering data to read an arbitrary shot ID from the dataset. ● We can notice that the time taken for the DAOS is almost constant even when the size of the dataset increased tenfold, due to the graph we’ve built when parsing, just increasing from 1.5 second to 2 second. ● Speedup from seismic unix would increase with the size of data since SU time scales with the size of the data, on a 10 GB data, DAOS about 20x faster, while on 100 GB data, it was 103x faster than its equivalent.
  • 16. DAOS User Group 2020DAOS User Group 2020 Early Results : Sorting 16 ● We noticed a 4x reduction on sorting from seismic unix. ● Read results slower than on PFS due to sequential read, which indicates we need to move to async API to increase queue depth. ● Dataset was also small, so we suspect a cache effect with the PFS. ● Further tuning is required.
  • 17. DAOS User Group 2020DAOS User Group 2020 Next Steps ● Move to DAOS async API to increase queue depth. ● A more detailed comparison against different parallel file systems. ● Optimize current API implementation. ● Parallelize Parsing and sorting functionalities. ● Cover more Seismic processing functionalities in our API. ● Integrate DAOS-SEIS API in existing seismic processing frameworks.
  • 18. DAOS User Group 2020DAOS User Group 2020 Resources ● Link to the DAOS-SEIS mapping wiki page: ○ https://wiki.hpdd.intel.com/display/DC/DAOS-SEGY+Mapping ● Link to the open-source github repository of the DAOS segy mapping: ○ https://github.com/daos-stack/segy-daos
  • 20. DAOS User Group 2020DAOS User Group 2020 Appendix: One level seismic gather object Blue : Dkey Red : Akey Black : Object
  • 21. DAOS User Group 2020DAOS User Group 2020 Appendix: Multi-Level seismic gather object Blue : Dkey Red : Akey Black : Object
  • 22. DAOS User Group 2020DAOS User Group 2020 Appendix: Complex gather object Blue : Dkey Red : Akey Black : Object
  • 23. DAOS User Group 2020DAOS User Group 2020 Appendix: File Header Blue : Dkey Red : Akey Black : Object
  • 24. DAOS User Group 2020DAOS User Group 2020 Appendix: Root Seismic Object Blue : Dkey Red : Akey Black : Object
  • 25. DAOS User Group 2020DAOS User Group 2020 Appendix: Window functionality Data 1 Header Shot ID 601 Data 2 Header Shot ID 602 Data 3 Header Shot ID 603 Data 4 Header Shot ID 604 Data 2 Header Shot ID 602 Data 3 Header Shot ID 603Windowing on shot ID from 602 to 603 : Filter all traces in the data passing by those with specific values in a header.