ORNL is managed by UT-Battelle LLC for the US Department of Energy
Enabling Insight to Support World-Class
Supercomputing
Stefan Ceballos
Oak Ridge National Laboratory
National Center for Computational Sciences
Notice: This work was supported by the Oak Ridge Leadership Computing Facility (OLCF) at Oak Ridge National
Laboratory (ORNL) for the Department of Energy (DOE) under Prime Contract Number DE-AC05-00OR-22725
Topics
• Overview of Oak Ridge National Laboratory (ORNL)
– National Center for Computational Sciences (NCCS) overview
– History and future of high-performance computing (HPC) at ORNL
– External customers as an org, internal customers as a team
• Operational data challenges in HPC
• New team formation
• Lessons learned
• Current use-cases
Our vision: Sustain ORNL’s leadership and scientific
impact in computing and computational sciences
• Provide the world’s most powerful open
resources for:
– Scalable computing and simulation
– Data and analytics at any scale
– Scalable cyber-secure infrastructure for science
• Follow a well-defined path for maintaining
world leadership in these critical areas
• Deliver leading-edge science relevant to missions
of DOE and key federal and state agencies
• Build and exploit cross-cutting partnerships
• Attract the brightest talent
• Invest in education and training
National Center for Computational Sciences
Home to the OLCF, including Summit
• 65,000 ft² of DC space
– Distributed across 4 controlled-access areas
– 31,000 ft²: Very high load bearing (≥625 lb/ft²)
• 40 MW of high-efficiency, highly reliable power
– Tightly integrated into TVA’s
161 kV transmission system
– Diverse medium-voltage distribution system
– High-efficiency 3.0/4.0 MVA transformers
• 6,600 tons of chilled water
– 20 MW cooling capacity at 70 °F
– 7,700 tons evaporative cooling
• Expansion potential: Power infrastructure
to 60 MW, cooling infrastructure to 70 MW
ORNL has systematically delivered a series
of leadership-class systems
On scope • On budget • Within schedule
1000-fold improvement in 8 years
Project phases shown: OLCF-1, OLCF-2, OLCF-3
• 2004: Cray X1E Phoenix, 18.5 TF
• 2005: Cray XT3 Jaguar, 25 TF
• 2006: Cray XT3 Jaguar, 54 TF
• 2007: Cray XT4 Jaguar, 62 TF
• 2008: Cray XT4 Jaguar, 263 TF
• 2008: Cray XT5 Jaguar, 1 PF
• 2009: Cray XT5 Jaguar, 2.3 PF
• 2012: Cray XK7 Titan, 27 PF
Titan turned 6 years old in October 2018
and continued to deliver world-class
science research in support of our user
community until it was decommissioned
on August 2, 2019
We are building on this record of success
to enable exascale in 2021
• 2012: Cray XK7 Titan, 27 PF
• 2018: IBM Summit, 200 PF (OLCF-4)
• 2021: Frontier, 1.5 EF (OLCF-5)
System performance
• Peak performance
of 200 petaflops for M&S;
3.3 exaops (FP16) for data
analytics and AI
• Launched in June 2018;
ranked #1 on TOP500 list
System elements
• 4608 nodes
• Dual-rail Mellanox EDR InfiniBand
network
• 250 PB IBM Spectrum Scale
file system transferring
data at 2.5 TB/s
Node elements
• 2 IBM POWER9 processors
• 6 NVIDIA Tesla V100 GPUs
• 608 GB of fast memory
(96 GB HBM2 + 512 GB DDR4)
• 1.6 TB of non-volatile memory
Summit system overview
Summit is the fastest
supercomputer in the world and
has been #1 on the TOP500 list
since its launch in June 2018
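The node and system figures on this slide can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated above. The function names are illustrative, and the per-GPU and system-wide totals are derived values, not figures quoted from the slide.

```python
# Figures copied from the Summit overview slide.
NODES = 4608
GPUS_PER_NODE = 6
HBM2_GB_PER_NODE = 96
DDR4_GB_PER_NODE = 512


def node_fast_memory_gb() -> int:
    """Fast memory per node: HBM2 plus DDR4."""
    return HBM2_GB_PER_NODE + DDR4_GB_PER_NODE


def total_gpus() -> int:
    """GPU count across the whole system (derived)."""
    return NODES * GPUS_PER_NODE


def hbm2_gb_per_gpu() -> float:
    """HBM2 capacity per GPU implied by the per-node total (derived)."""
    return HBM2_GB_PER_NODE / GPUS_PER_NODE


def system_fast_memory_pb() -> float:
    """Aggregate fast memory in decimal petabytes (1 PB = 1,000,000 GB)."""
    return NODES * node_fast_memory_gb() / 1_000_000


if __name__ == "__main__":
    print("Fast memory per node (GB):", node_fast_memory_gb())
    print("GPUs in the system:", total_gpus())
    print("HBM2 per GPU (GB):", hbm2_gb_per_gpu())
    print("Aggregate fast memory (PB):", system_fast_memory_pb())
```

The per-node sum reproduces the 608 GB stated on the slide; the other values follow from multiplying or dividing the stated figures.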
CPU Nodes
• 512 Compute nodes
• Dual 8-Core Xeon (Sandy
Bridge) Processors
• 128 GB RAM
GPU and Large Mem Nodes
• 9 GPU/LargeMem nodes
• Dual 14-Core Xeon
Processors
• 1 TB RAM
• 2 NVIDIA K80 GPUs
Cluster elements
• Mellanox FDR 4x InfiniBand
(56 Gb/sec/direction)
• Mounts Alpine center-wide
GPFS file system
• Slurm resource manager
Rhea and Data Transfer Nodes Cluster system overview
Rhea is used for analysis, visualization,
and postprocessing
The Data Transfer Nodes (DTNs) cluster provides interactive and scheduled data transfer services
Frontier Overview
Partnership between ORNL, Cray, and AMD
The Frontier system will be delivered in 2021
Peak Performance greater than 1.5 EF
Composed of more than 100 Cray Shasta cabinets
– Connected by Slingshot™ interconnect with adaptive routing, congestion control, and quality of service
Node Architecture:
– An AMD EPYC™ processor and four Radeon Instinct™ GPU accelerators
purpose-built for exascale computing
– Fully connected with high speed AMD Infinity Fabric links
– Coherent memory across the node
– 100 GB/s injection bandwidth
– Near-node NVM storage
Researchers will harness Frontier to advance science in such applications as systems biology,
materials science, energy production, additive manufacturing and health data science.
National Climate-Computing Research Center
(NCRC)
• Strategic Partnership Project, currently in year 9
• 5-year periods. Current IAA effective through FY20
• Within ORNL’s National Center for Computational Sciences (NCCS)
• Service provided - DOE-titled equipment
• Secure network enclave; Department of Commerce access policies
• Agreement between NOAA and DOE’s Oak
Ridge National Laboratory for HPC services and
climate modeling support
Gaea system evolution, 2010–2019
[Figure: timeline of Gaea configurations with peak performance and storage capability labels]
• 2010: C1MS, Cray XT6
• 2011: C1MS + C2, Cray XE6
• 2012: C2 + C1, Cray XE6
• 2016: C3, Cray XC40
• 2017 (two entries), 2018, 2019: C3 + C4, Cray XC40
• Peak performance labels in the figure: 260 TF, 980 TF, 1.1 PF (four times), 1.77 PF, 4.02 PF, 4.27 PF, 4.78 PF, 5.29 PF
• Storage capability labels in the figure: 4.2 PB, 6.4 PB, 38 PB
ORNL has successfully led the effort
to procure, configure, and transition
to operation sustained petaflops-scale
computing, including infrastructure
for storage and networking,
in support of the NOAA
mission.
US Air Force: ORNL is supporting the 557th Weather Wing
[Diagram: the USAF 557th WW connects over a secure WAN to secure infrastructure at ORNL]
• Existing core services: RSA, LDAP, DNS, License, Monitoring, …
• Data Transfer Nodes
• Offeror solution:
– Sys11 Hall A
– Sys11 Hall B
– HPC11 services: front-end nodes, scheduler, management, …
– TDS
– Development filesystem(s)
– Production filesystem(s)
Highly reliable computing resources that deliver timely and frequent weather products,
supporting Air Force mission requirements worldwide
Summary
• Scientific data
– Data generated from scientific computing jobs
• Operational data
– Machine data such as metrics, logs, and user info
• A variety of machines that produce operational data
– A revolving door of supercomputers and supporting infrastructure
Operational Data Challenges in HPC
• Large amounts of machine data generated with possible bursts
• Long-term data retention for analytics
• Siloed data limits holistic view
• No standard approach to monitoring in HPC
– Each org has its own unique challenges
New Team Formation
• Increased organizational efforts for data-driven decision-making
• Expansion of an initially small Kafka project
• New Elasticsearch cluster for telemetry data
• Organizational need for a centralized single source of truth
• Increased usage of big data tools
NCCS Ops Big Data Platform
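[Diagram: operational data sources feed Kafka, which feeds downstream tools and storage]
• Data sources: Clusters Data, Security Data, File Systems Data, Application Data, Facilities Data, Network Data
• Message bus: Kafka
• Components shown: Logstash, Elasticsearch, Telegraf, Splunk, Kibana, Grafana, Prometheus, Faust, Nagios, Internal CRM Software, Lightweight Time Series Database, User DB/App, User-chosen Exporters
• Long-term Data Storage: GPFS, HPSS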
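The platform diagram places Kafka between the data sources and the downstream tools. One reason a log-based bus suits that role is that several consumers can read the same stream independently, each at its own pace. The sketch below is a toy in-memory stand-in written for illustration only. `MiniBus`, its methods, and the topic and group names are hypothetical; this is not Kafka's API and not the NCCS implementation.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


class MiniBus:
    """Toy append-only log per topic with one read offset per consumer group."""

    def __init__(self) -> None:
        self._logs: Dict[str, List[Any]] = defaultdict(list)
        self._offsets: Dict[Tuple[str, str], int] = defaultdict(int)

    def publish(self, topic: str, record: Any) -> None:
        """Append a record to the end of the topic's log."""
        self._logs[topic].append(record)

    def poll(self, topic: str, group: str) -> List[Any]:
        """Return records this group has not seen yet and advance its offset."""
        key = (topic, group)
        start = self._offsets[key]
        records = self._logs[topic][start:]
        self._offsets[key] = start + len(records)
        return records


if __name__ == "__main__":
    bus = MiniBus()
    # Hypothetical topic name and payloads.
    bus.publish("facilities", {"sensor": "chiller-1", "temp_f": 68.5})
    bus.publish("facilities", {"sensor": "chiller-2", "temp_f": 70.1})

    # Two hypothetical sinks read the same records without affecting each other.
    print("search sink:", bus.poll("facilities", "search-indexer"))
    print("archive sink:", bus.poll("facilities", "archiver"))
    print("search sink again:", bus.poll("facilities", "search-indexer"))
```

Because each group tracks its own offset, adding a new sink does not require any change to the producers or to the existing sinks.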
Lessons Learned
• Build it and they will come
– Scale as needed
• Work closely with a few initial
users who share vision
• Provide users with topic visibility
for quicker debugging
• A data dictionary offers users
potential new data streams to
explore
• Backbone of big data platform
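As a rough illustration of the data dictionary idea, the sketch below models a dictionary as a searchable catalog of topics. Every topic name, owner, description, and field here is made up for the example; the slides do not describe the actual format or contents of the NCCS data dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TopicEntry:
    """One catalog entry describing a data stream (all values hypothetical)."""
    topic: str
    owner: str
    description: str
    fields: Tuple[str, ...]


_ENTRIES = (
    TopicEntry("facilities.cooling", "facilities",
               "Cooling plant sensor readings",
               ("timestamp", "supply_temp_f", "return_temp_f")),
    TopicEntry("summit.jobs", "hpc-ops",
               "Job scheduler allocation events",
               ("timestamp", "job_id", "node_count")),
    TopicEntry("security.auth", "security",
               "Authentication log events",
               ("timestamp", "user", "result")),
)

DATA_DICTIONARY: Dict[str, TopicEntry] = {e.topic: e for e in _ENTRIES}


def find_topics(keyword: str) -> List[str]:
    """Return topic names whose description or field names contain the keyword."""
    kw = keyword.lower()
    return sorted(
        name
        for name, entry in DATA_DICTIONARY.items()
        if kw in entry.description.lower()
        or any(kw in field.lower() for field in entry.fields)
    )


if __name__ == "__main__":
    print(find_topics("job"))
    print(find_topics("temp"))
```

A keyword search of this kind is one way users could discover streams they did not know existed, which is the benefit the bullet above describes.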
Use Case – Summit Cooling Intelligence
• Intelligent real-time resource
control based on situational
awareness of the data center
• Automated control system is
monitored and controlled by our
facility engineers at a macro
scale
Use Case – Summit Cooling Intelligence
[Diagram: Summit cooling intelligence data flow]
• Data sources:
– Weather wet-bulb forecast (NOAA)
– Cooling plant MTW PLC data (C-TECH)
– Cooling plant MTW PLC outlet
– Summit OpenBMC telemetry streaming (IBM)
– Summit job scheduler LSF job allocation (IBM)
• Message bus (Kafka)
• Snapshot generator (state snapshot, histogram snapshot)
• Lightweight time series database
• Human engineer dashboard
• Data exporters: serialization, compression, long-term data storage (GPFS, HPSS)
• Elasticsearch
• Training, learning, model artifact
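The diagram labels mention a histogram snapshot and a serialization and compression step ahead of long-term storage. The sketch below shows one plausible shape for those two steps using only the Python standard library. The function names, the JSON layout, the bin width, and the sample temperatures are assumptions for illustration; the slides do not say which formats or tools the production pipeline uses.

```python
import json
import zlib
from collections import Counter
from typing import Dict, Iterable


def histogram_snapshot(values: Iterable[float], bin_width: int) -> Dict:
    """Count values in fixed-width bins; each bin is [lower_edge, count]."""
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    return {
        "bin_width": bin_width,
        "bins": [[edge, n] for edge, n in sorted(counts.items())],
    }


def pack(snapshot: Dict) -> bytes:
    """Serialize to JSON, then compress with zlib for archival."""
    payload = json.dumps(snapshot, sort_keys=True).encode("utf-8")
    return zlib.compress(payload)


def unpack(blob: bytes) -> Dict:
    """Reverse of pack(): decompress, then parse the JSON."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical outlet temperatures in degrees Fahrenheit.
    readings = [61.2, 63.9, 64.1, 70.0, 71.5]
    snap = histogram_snapshot(readings, bin_width=5)
    blob = pack(snap)
    print("snapshot:", snap)
    print("compressed size in bytes:", len(blob))
    print("round trip ok:", unpack(blob) == snap)
```

Storing bins as a list of pairs rather than a dictionary keeps the integer bin edges intact through JSON, which would otherwise turn dictionary keys into strings.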
Use Case – Summit I/O Metrics
Use Case – Lustre RPC Trace Data
Questions?
ceballossl@ornl.gov

 

Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak Ridge National Laboratory) Kafka Summit 2020

  • 1. ORNL is managed by UT-Battelle LLC for the US Department of Energy Enabling Insight to Support World-Class Supercomputing Stefan Ceballos Oak Ridge National Laboratory National Center for Computational Sciences Notice: This work was supported by the Oak Ridge Leadership Computing Facility (OLCF) at Oak Ridge National Laboratory (ORNL) for the Department of Energy (DOE) under Prime Contract Number DE-AC05-00OR-22725
  • 2. Topics • Overview of Oak Ridge National Laboratory (ORNL) – National Center for Computational Sciences (NCCS) overview – History and future of high-performance computing (HPC) at ORNL – External customers as an org, internal customers as a team • Operational data challenges in HPC • New team formation • Lessons learned • Current use-cases
  • 3. Our vision: Sustain ORNL’s leadership and scientific impact in computing and computational sciences • Provide the world’s most powerful open resources for: – Scalable computing and simulation – Data and analytics at any scale – Scalable cyber-secure infrastructure for science • Follow a well-defined path for maintaining world leadership in these critical areas • Deliver leading-edge science relevant to missions of DOE and key federal and state agencies • Build and exploit cross-cutting partnerships • Attract the brightest talent • Invest in education and training
  • 4. National Center for Computational Sciences Home to the OLCF, including Summit • 65,000 ft² of DC space – Distributed across 4 controlled-access areas – 31,000 ft²: Very high load bearing (≥625 lb/ft²) • 40 MW of high-efficiency highly reliable power – Tightly integrated into TVA’s 161 kV transmission system – Diverse medium-voltage distribution system – High-efficiency 3.0/4.0 MVA transformers • 6,600 tons of chilled water – 20 MW cooling capacity at 70 °F – 7,700 tons evaporative cooling • Expansion potential: Power infrastructure to 60 MW, cooling infrastructure to 70 MW
  • 5. OLCF-3 ORNL has systematically delivered a series of leadership-class systems On scope • On budget • Within schedule OLCF-1 OLCF-2 1000-fold improvement in 8 years 2012 Cray XK7 Titan 27 PF 18.5 TF 25 TF 54 TF 62 TF 263 TF 1 PF 2.3 PF 2004 Cray X1E Phoenix 2005 Cray XT3 Jaguar 2006 Cray XT3 Jaguar 2007 Cray XT4 Jaguar 2008 Cray XT4 Jaguar 2008 Cray XT5 Jaguar 2009 Cray XT5 Jaguar Titan turned 6 years old in October 2018 and continued to deliver world-class science research in support of our user community until it was decommissioned on August 2, 2019
  • 6. We are building on this record of success to enable exascale in 2021 OLCF-5 OLCF-4 1.5 EF 200 PF 27 PF 2012 Cray XK7 Titan 2021 Frontier 2018 IBM Summit
  • 7. System performance • Peak performance of 200 petaflops for M&S; 3.3 exaops (FP16) for data analytics and AI • Launched in June 2018; ranked #1 on TOP500 list System elements • 4608 nodes • Dual-rail Mellanox EDR InfiniBand network • 250 PB IBM Spectrum Scale file system transferring data at 2.5 TB/s Node elements • 2 IBM POWER9 processors • 6 NVIDIA Tesla V100 GPUs • 608 GB of fast memory (96 GB HBM2 + 512 GB DDR4) • 1.6 TB of non-volatile memory Summit system overview Summit is the fastest supercomputer in the world and has been #1 on the TOP500 list since its launch in June 2018
  • 8. CPU Nodes • 512 Compute nodes • Dual 8-Core Xeon (Sandy Bridge) Processors • 128 GB RAM GPU and Large Mem Nodes • 9 GPU/LargeMem nodes • Dual 14-Core Xeon Processors • 1 TB RAM • 2 NVIDIA K80 GPUs Cluster elements • Mellanox FDR 4x InfiniBand (56 Gb/sec/direction) • Mounts Alpine center-wide GPFS file system • Slurm resource manager Rhea and Data Transfer Nodes Cluster system overview Rhea is used for analysis, visualization, and postprocessing Data Transfer Cluster (DTNs) provide interactive and scheduled data transfer services
  • 9. Frontier Overview Partnership between ORNL, Cray, and AMD The Frontier system will be delivered in 2021 Peak Performance greater than 1.5 EF Composed of more than 100 Cray Shasta cabinets – Connected by Slingshot™ interconnect with adaptive routing, congestion control, and quality of service Node Architecture: – An AMD EPYC™ processor and four Radeon Instinct™ GPU accelerators purpose-built for exascale computing – Fully connected with high speed AMD Infinity Fabric links – Coherent memory across the node – 100 GB/s injection bandwidth – Near-node NVM storage Researchers will harness Frontier to advance science in such applications as systems biology, materials science, energy production, additive manufacturing and health data science.
  • 10. National Climate-Computing Research Center (NCRC) • Strategic Partnership Project, currently in year 9 • 5-year periods. Current IAA effective through FY20 • Within ORNL’s National Center for Computational Sciences (NCCS) • Service provided - DOE-titled equipment • Secure network enclave; Department of Commerce access policies • Agreement between NOAA and DOE’s Oak Ridge National Laboratory for HPC services and climate modeling support
  • 11. 980 TF 1.1 PF 1.1 PF 1.1 PF 1.1 PF 1.77 PF 4.02 PF 4.27 PF 4.78 PF 5.29 PF 260 TF 2010 C1MS, Cray XT6 2011 C1MS + C2 Cray XE6 2012 C2 + C1, Cray XE6 2016 C3, Cray XC40 2017 C3 + C4 Cray XC40 2017 C3 + C4 Cray XC40 2018 C3 + C4 Cray XC40 2019 C3 + C4 Cray XC40 Gaea system evolution, 2010–2019 4.2 PB storage capability 6.4 PB storage capability 38 PB storage capability ORNL has successfully led the effort to procure, configure, and transition to operation sustained petaflops-scale computing, including infrastructure for storage and networking, in support of the NOAA mission.
  • 12. US Air Force: ORNL is supporting the 557th Weather Wing Existing Core Services RSA LDAP DNS License Monitoring … Data Transfer Nodes USAF 557th WW Secure Infrastructure Secure WAN Offeror Solution Sys11 Hall A HPC11 Services Front End Nodes Scheduler Management … Sys11 Hall B TDS Development Filesystem(s) Production Filesystem(s) Highly reliable computing resources that deliver timely and frequent weather products, supporting Air Force mission requirements worldwide
  • 13. Summary • Scientific data – Data generated from scientific computing jobs • Operational data – Machine data such as metrics, logs, and user info • A variety of machines that produce operational data – A revolving door of supercomputers and supporting infrastructure
  • 14. Operational Data Challenges in HPC • Large amounts of machine data generated with possible bursts • Long-term data retention for analytics • Siloed data limits holistic view • No standard approach to monitoring in HPC – Each org has their own unique challenges
  • 15. New Team Formation • Increased organizational efforts for data-driven decision-making • Expansion of initially small Kafka project • New Elasticsearch cluster for telemetry data • Need in organization for centralized single source of truth • Increased usage of big data tools
  • 16. NCCS Ops Big Data Platform Kafka Clusters Data Security Data File Systems Data Application Data Facilities Data Network Data Logstash Elasticsearch User-chosen Exporters Long-term Data Storage GPFS, HPSS Telegraf Splunk Kibana Grafana Prometheus User DB/App Faust Lightweight Time Series Database Internal CRM Software Nagios
  • 17. Lessons Learned • Build it and they will come – Scale as needed • Work closely with a few initial users who share vision • Provide users with topic visibility for quicker debugging • A data dictionary offers users potential new data streams to explore • Backbone of big data platform
  • 18. Use Case – Summit Cooling Intelligence • Intelligent real-time resource control based on situational awareness of the data center • Automated control system is monitored and controlled by our facility engineers at a macro scale
  • 19. Use Case – Summit Cooling Intelligence Weather Wetbulb Forecast (NOAA) Cooling Plant MTW PLC Data (C-TECH) Cooling Plant MTW PLC Outlet Summit OpenBMC Telemetry Streaming (IBM) Summit Job Scheduler LSF Job Allocation (IBM) Message Bus (Kafka) Human Engineer Dashboard State Snapshot Histogram Snapshot Generator Lightweight Time Series Database Data Exporters Serialization Compression Long-term Data Storage GPFS, HPSS Elasticsearch Training Learning Model Artifact
  • 20. Use Case – Summit I/O Metrics
  • 21. Use Case – Lustre RPC Trace Data
  • 22. Questions? ceballossl@ornl.gov
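
The Summit Cooling Intelligence diagram on slide 19 places a "State Snapshot" and a "Histogram Snapshot Generator" between the Kafka message bus and the time series database, but the slides do not say how either is computed. As a rough illustration only, the sketch below assumes that a state snapshot is the latest reading per node and that a histogram snapshot is a count of nodes per temperature bin. The `Reading` fields, the function names, and the bin edges are all hypothetical, and a plain Python list stands in for messages consumed from Kafka.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Dict, Iterable, List, Sequence


@dataclass(frozen=True)
class Reading:
    """One telemetry sample (hypothetical schema, not taken from the slides)."""
    node: str         # node identifier
    timestamp: float  # seconds since epoch
    temp_c: float     # temperature in degrees Celsius


def state_snapshot(readings: Iterable[Reading]) -> Dict[str, Reading]:
    """Keep only the most recent reading for each node."""
    latest: Dict[str, Reading] = {}
    for r in readings:
        current = latest.get(r.node)
        if current is None or r.timestamp >= current.timestamp:
            latest[r.node] = r
    return latest


def histogram_snapshot(snapshot: Dict[str, Reading],
                       edges: Sequence[float]) -> List[int]:
    """Count nodes per temperature bin.

    Bin i covers [edges[i], edges[i + 1]). Values outside the overall
    range are clamped into the first or last bin (an assumption).
    """
    counts = [0] * (len(edges) - 1)
    for r in snapshot.values():
        i = bisect_right(edges, r.temp_c) - 1
        i = min(max(i, 0), len(counts) - 1)
        counts[i] += 1
    return counts


# Example: four samples from three nodes, binned at 20/40/60/80 °C.
readings = [
    Reading("n1", 1.0, 30.0),
    Reading("n1", 2.0, 45.0),  # newer sample replaces the 30.0 reading
    Reading("n2", 1.5, 32.0),
    Reading("n3", 1.0, 75.0),
]
snapshot = state_snapshot(readings)
print(histogram_snapshot(snapshot, [20, 40, 60, 80]))  # [1, 1, 1]
```

Reducing per-node telemetry to a fixed-size histogram keeps the volume written to the time series database independent of node count, which may be one reason for a component like this in the pipeline; the slides do not confirm that motivation.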