PRP, NRP, GRP & the Path Forward
1. “PRP, NRP, GRP,
& the Path Forward”
Presentation
2nd National Research Platform Workshop
Bozeman, MT
August 6, 2018
Dr. Larry Smarr
Director, California Institute for Telecommunications and Information Technology
Harry E. Gruber Professor,
Dept. of Computer Science and Engineering
Jacobs School of Engineering, UCSD
http://lsmarr.calit2.net
2. ESnet’s ScienceDMZ Accelerates Science Research:
DOE & NSF Partnering on Science Engagement and Technology Adoption
Science
DMZ
Data Transfer
Nodes
(DTN/FIONA)
Network
Architecture
(zero friction)
Performance
Monitoring
(perfSONAR)
ScienceDMZ Coined in 2010 by ESnet
Basis of PRP Architecture and Design
http://fasterdata.es.net/science-dmz/
DOE
NSF
NSF CC* program (2012+) Funded Deployment
of ScienceDMZ on 200 Univ. campuses
www.nsf.gov/funding/pgm_summ.jsp?pims_id=504748
Slide From Inder Monga, ESnet
See Talk by
Eli Dart &
Deep Dive #2
Tuesday
3. Logical Next Step: The Pacific Research Platform Networks Campus DMZs
to Create a Regional End-to-End Science-Driven “Big Data Superhighway” System
(GDC)
NSF CC*DNI Grant
$5M 10/2015-10/2020
PI: Larry Smarr, UC San Diego Calit2
Co-PIs:
• Camille Crittenden, UC Berkeley CITRIS,
• Tom DeFanti, UC San Diego Calit2/QI,
• Philip Papadopoulos, UCSD SDSC,
• Frank Wuerthwein, UCSD Physics and SDSC
Letters of Commitment from:
• 50 Researchers from 15 Campuses
• 32 IT/Network Organization Leaders
NSF Program Officer: Amy Walton
Source: John Hess, CENIC
4. PRP National-Scale Experimental Distributed Pilot:
Using CENIC & Internet2 to Connect Early-Adopter Quilt Regional R&E Networks
Announced May 8, 2018
Internet2 Global Summit
See
NRP Pilot
Monday;
Scaling
Tuesday
Original PRP
CENIC/PW Link
Extended PRP
Testbed
NSF CENIC Link
5. PRP Science DMZ Data Transfer Nodes (DTNs) -
Flash I/O Network Appliances (FIONAs)
UCSD Designed FIONAs
To Solve the Disk-to-Disk
Data Transfer Problem
at Full Speed
on 10G, 40G and
100G Networks
FIONAS—10/40G, $8,000
Phil Papadopoulos, SDSC &
Tom DeFanti, Joe Keefe & John Graham, Calit2
FIONette—1G, $250
Five Racked FIONAs at Calit2:
• Each Contains:
• Dual 12-Core CPUs
• 96GB RAM
• 1TB SSD
• 2 10GbE interfaces
• Total ~$10,500
• With 8 GPUs
• total ~$18,500
Report on
3-Day FIONA
Hands-On Workshop
For EPSCoR, MSI &
EPSCoR Deep Dive #3
Monday;
EPSCoR Talk Tuesday
6. GPN Becomes the First Multi-State Regional Network
to Peer with the PRP
Between the PRP-Contributed PWave DTN in Los Angeles
To GPN FIONA in UMC
Before PRP: 0.8 Gbps; In May: 3.7 Gbps Over PRP; Now: 11 Gbps
Source: John Hess, CENIC and George Robb III, UMissouri
May 30, 2018
See James Deaton
NRP Pilot Monday
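To put the throughput figures on this slide in perspective, the sketch below computes an idealized disk-to-disk transfer time at each reported rate. The 1 TB dataset size and the function name `transfer_hours` are assumptions chosen for illustration; real transfers also incur protocol and storage overhead that this ignores.

```python
def transfer_hours(size_tb: float, rate_gbps: float) -> float:
    """Idealized time in hours to move size_tb (decimal terabytes) at rate_gbps."""
    bits = size_tb * 1e12 * 8            # terabytes -> bits
    seconds = bits / (rate_gbps * 1e9)   # Gbps -> bits per second
    return seconds / 3600

# Rates reported on the slide (Gbps); the 1 TB dataset size is hypothetical.
rates = {"before PRP": 0.8, "May, over PRP": 3.7, "now": 11.0}

for label, gbps in rates.items():
    print(f"{label}: {transfer_hours(1.0, gbps):.2f} h per TB")
```

Under these assumptions, going from 0.8 Gbps to 11 Gbps cuts the time per terabyte from several hours to a fraction of an hour.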
7. Game Changer: Using Kubernetes
to Manage Containers Across the PRP
“Kubernetes is a way of stitching together
a collection of machines into, basically, a big computer,”
--Craig McLuckie, Google
and now CEO and Founder of Heptio
"Everything at Google runs in a container."
--Joe Beda, Google
“Kubernetes has emerged as
the container orchestration engine of choice
for many cloud providers including
Google, AWS, Rackspace, and Microsoft,
and is now being used in HPC and Science DMZs.”
--John Graham, Calit2/QI UC San Diego
Amazingly, I Didn’t
Mention Kubernetes
Last Year
Kubernetes
Tutorial
Sunday
8. Rook is Ceph Cloud-Native Object Storage
‘Inside’ Kubernetes
https://rook.io/
Source: John Graham, Calit2/QI
Kubernetes
Tutorial
Sunday
9. PRP is Deploying Distributed Petabytes of Storage for Posting/Staging Data
at $10/TB per Year by Leveraging our Base of Installed FIONAs
[Map of PRP storage and compute nodes by site]
Sites: Caltech*, UCAR, Calit2/UCI, Calit2/QI*/SIO, SDSC, UCR, USC*, UCLA, Stanford U, UCSB, UCSC*, U Hawaii, SDSU*
Node labels: 40G 160TB; 2x40G 160TB; 40G 160TB HPWREN; 2x40G 160TB HPWREN; 100G NVMe 6.4TB; 100G Epyc NVMe; 100G Gold NVMe; 100G Gold FIONA8; FIONA8; 2.5 FIONA8s; 3 FIONA8s; 6 FIONA8s; 2 FIONA4s; >50 FIONA2s; 10G FIONA $1K; sdx-controller
Kubernetes CentOS 7
Rook/Ceph - Block/Object/FS
Swift API compatible with
SDSC, AWS, and Rackspace
July 2018 John Graham, UCSD
Alex Szalay
Deep Dive #4
Monday
Rob Gardner
Tuesday
Dima Mishin
Sunday
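The $10/TB per year figure in this slide's title lends itself to a quick worked calculation. The sketch below simply multiplies that rate by a capacity; the function name is hypothetical, and the slide does not say how the figure was derived or what it includes.

```python
COST_PER_TB_YEAR = 10.0  # USD per TB per year, as stated in the slide title

def annual_cost_usd(capacity_tb: float) -> float:
    """Yearly storage cost at the slide's quoted rate."""
    return capacity_tb * COST_PER_TB_YEAR

print(annual_cost_usd(160))    # one 160TB node
print(annual_cost_usd(1000))   # one petabyte (decimal)
```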
10. Operational Metrics: Containerized Trace Route Tool
Allows Realtime Visualization of Status of Network Links
All Kubernetes Nodes on PRP
Source: Dmitry Mishin (SDSC),
John Graham (Calit2)
This node graph shows UCR
as the source of the flow
to the mesh
11. Operational Metrics: Containerized perfSONAR MaDDash Dashboards
For Realtime Measurements of PRP Number of Paths and Packet Loss
Source: Dmitry Mishin (SDSC),
John Graham (Calit2)
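A mesh dashboard of this kind reports one result per directed source-to-destination path. As a minimal sketch of the bookkeeping (not the perfSONAR or MaDDash API), the following enumerates the paths of a full mesh and flags those whose loss exceeds a threshold. The site names come from this deck; the loss values, the threshold, and both function names are hypothetical.

```python
from itertools import permutations

def mesh_paths(nodes):
    """All directed (source, destination) pairs in a full mesh."""
    return list(permutations(nodes, 2))

def lossy_paths(loss_by_path, threshold=0.001):
    """Paths whose measured packet-loss fraction exceeds the threshold."""
    return sorted(p for p, loss in loss_by_path.items() if loss > threshold)

nodes = ["UCSD", "UCR", "UCSC", "Caltech"]
paths = mesh_paths(nodes)
print(len(paths))  # N * (N - 1) directed paths for N nodes

# Hypothetical measurements: every path is clean except one.
loss = {p: 0.0 for p in paths}
loss[("UCR", "UCSD")] = 0.02
print(lossy_paths(loss))
```

The path count grows quadratically with the number of nodes, which is one reason a grid-style dashboard is a natural fit for this data.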
12. Quilt Members Have Built
Their Own perfSONAR MaDDash Inspired by PRP
http://quiltmesh.onenet.net/maddash-webui/
Source: Jen Leasure, Quilt
Aug. 4, 2018
13. Expanding to the Global Research Platform (GRP)
Via CENIC/Pacific Wave, Internet2, and International Links
PRP/
CENIC/PW
PRP’s Current
International
Partners
Korea Shows Distance is Not the Barrier
to Above 5Gb/s Disk-to-Disk Performance
Netherlands
Guam
Australia
Korea
Japan
Singapore
International-
Scale
Measurement
Technologies/
Techniques
Tuesday
14. PRP’s First 2.5 Years:
Connecting Multi-Campus Application Teams and Devices
Earth
Sciences
See Following
Panel: Science Drivers for NRP
15. PRP Science Application Class #1:
Providing High Performance Access to Distributed Data Analysis
16. Data Transfer Rates From 40 Gbps DTN in UCSD Physics Building,
Across Campus on PRISM DMZ, Then to Chicago’s Fermilab Over CENIC/ESnet
Based on This Success,
Würthwein Will Upgrade 40G DTN to 100G
For Bandwidth Tests & Kubernetes Integration
With OSG, Caltech, and UCSC
Source: Frank Würthwein, OSG, UCSD/SDSC, PRP
17. PRP Distributed Tier-2 Cache
Across Caltech & UCSD-Thousands of Flows Sustaining >10Gbps!
[Diagram labels: Redirector; Top Level Cache; Cache Server groups at UCSD and Caltech, each marked "Redirect or Cache Server"]
Global Data Federation of CMS
Provisioned pilot systems:
PRP UCSD: 9 x 12 SATA Disk of 2TB
@ 10Gbps for Each System
PRP Caltech: 2 x 30 SATA Disk of 6TB
@ 40Gbps for Each System
Source: Frank Würthwein, OSG, UCSD/SDSC, PRP; Harvey Newman, Caltech
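The provisioning figures above can be turned into raw capacity and aggregate network bandwidth per site. This sketch assumes that "9 x 12" means nine systems of twelve disks each (and likewise "2 x 30"); the class and field names are illustrative. The totals are raw disk capacity and ignore any redundancy or filesystem overhead.

```python
from dataclasses import dataclass

@dataclass
class CacheSite:
    name: str
    systems: int
    disks_per_system: int
    disk_tb: int
    nic_gbps: int  # network rate of each system

    @property
    def raw_tb(self) -> int:
        return self.systems * self.disks_per_system * self.disk_tb

    @property
    def aggregate_gbps(self) -> int:
        return self.systems * self.nic_gbps

sites = [
    CacheSite("PRP UCSD", systems=9, disks_per_system=12, disk_tb=2, nic_gbps=10),
    CacheSite("PRP Caltech", systems=2, disks_per_system=30, disk_tb=6, nic_gbps=40),
]

for s in sites:
    print(f"{s.name}: {s.raw_tb} TB raw, {s.aggregate_gbps} Gbps aggregate")
```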
18. Collaboration Opportunity with OSG/PRP/I2
on Distributed Storage
1.8PB 1.2PB 1.6PB
210TB
Total data volume pulled last year
is dominated by 4 caches.
OSG Is Operating a Distributed Caching CI.
At Present, 4 Caches Provide Significant Use
PRP Kubernetes Infrastructure Could Either
Grow Existing Caches by Adding Servers,
or by Adding Additional Locations
StashCache Users include:
LIGO
DES
Source: Frank Würthwein, OSG, UCSD/SDSC, PRP
See Talk
on OSG/PRP/I2
Tuesday
19. PRP Science Application Class #2:
Providing High Performance Access to Remote Supercomputers
20. Distributed Computation on PRP
Coupling SDSU Cluster and SDSC Comet Using Kubernetes Containers
Developed and Executed MPI-Based Runs on a PRP Kubernetes Cluster
[CO2,aq] 100 Year Simulation (panels at 4 days, 25 years, 75 years, and 100 years)
• 0.5 km x 0.5 km x 17.5 m
• Three sandstone layers
separated by two shale
layers
Simulating the Injection of CO2
in Brine-Saturated Reservoirs:
Poroelastic & Pressure-Velocity
Fields Solved In Parallel With MPI
Using Domain Decomposition
Across Containers
Source: Chris Paolini and Jose Castillo, SDSU
See Talk by
Chris Paolini
Sunday
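Domain decomposition, as named on this slide, means giving each MPI rank (here, each container) its own portion of the grid. The sketch below shows a one-dimensional block decomposition in plain Python, without MPI. It illustrates the general technique only: the slide does not describe the decomposition the SDSU code actually uses, and the function name and grid size are hypothetical.

```python
def decompose(n_cells: int, n_ranks: int):
    """Split n_cells into n_ranks contiguous [start, stop) ranges.

    Range sizes differ by at most one cell, so the load stays balanced.
    """
    base, extra = divmod(n_cells, n_ranks)
    ranges = []
    start = 0
    for rank in range(n_ranks):
        size = base + (1 if rank < extra else 0)  # first `extra` ranks get one more cell
        ranges.append((start, start + size))
        start += size
    return ranges

print(decompose(10, 4))  # [(0, 3), (3, 6), (6, 8), (8, 10)]
```

In a real solver, each rank would also exchange boundary values with its neighbors at every step, and that exchange is the traffic that crosses the network between containers.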
21. Speeding Downloads Using 100 Gbps PRP Link Over CENIC
Couples UC Santa Cruz Astrophysics Cluster to LBNL NERSC Supercomputer
CENIC 2018
Innovations in
Networking
Award for
Research
Applications
NSF-Funded Cyberengineer
Shaw Dong @UCSC
Receiving FIONA
Feb 7, 2017
22. The Great Plains Network
Has Many Campuses With Active Projects at SDSC
GPN Map Source: James Deaton, GPN Shawn Strande, SDSC
23. PRP Science Application Class #3:
Providing High Perf. Access to SensorNets Coupled to Realtime Computing
24. Church Fire, San Diego CA
Alert SD&E Cameras/HPWREN
October 21, 2017
New PRP Application:
Coupling Wireless Wildfire Sensors to Computing
Thomas Fire, Ventura, CA
Firemap Tool, WIFIRE
December 10, 2017
CENIC 2018
Innovations in Networking Award
for Experimental Applications
See HPWREN
Deep Dive #1
Tuesday
25. Once a Wildfire is Spotted, PRP Brings High-Resolution Weather Data
to Fire Modeling Workflows in WIFIRE
Real-Time
Meteorological Sensors
Weather Forecast
Landscape data
WIFIRE Firemap
Fire Perimeter
Work Flow
PRP
Source: Ilkay Altintas, SDSC
26. Fiber Optic Network Streams Images From
UC San Diego Jaffe Lab (SIO) Scripps Plankton Microscope Camera
27. Over 1 Billion Images So Far!
Requires Machine Learning for Automated Image Analysis and Classification
Phytoplankton: Diatoms
Zooplankton: Copepods
Zooplankton: Larvaceans
Source: Jules Jaffe, SIO
“We are using the FIONAs for image processing...
this includes doing Particle Tracking Velocimetry
that is very computationally intense.”
--Jules Jaffe
28. Adding Machine Learning to PRP:
Left & Right Brain Computing: Arithmetic vs. Pattern Recognition
Adapted from D-Wave
29. New NSF CHASE-CI Grant Creates a Community Cyberinfrastructure:
Adding a Machine Learning Layer Built on Top of the Pacific Research Platform
Caltech
UCB
UCI UCR
UCSD
UCSC
Stanford
MSU
UCM
SDSU
NSF Grant for High Speed “Cloud” of 256 GPUs
For 30 ML Faculty & Their Students at 10 Campuses
for Training AI Algorithms on Big Data
See Venkat Vishwanath,
Deep Dive #4
Tuesday
30. FIONA8: Adding GPUs to FIONAs
Supports Data Science Machine Learning
Multi-Tenant Containerized GPU JupyterHub
Running Kubernetes / CoreOS
Eight Nvidia GTX-1080 Ti GPUs
~$13K
32GB RAM, 3TB SSD, 40G & Dual 10G ports
Source: John Graham, Calit2
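As a rough illustration of how a containerized job might request one of these GPUs, the sketch below builds a Kubernetes Pod manifest as a Python dictionary and prints it as JSON, a format Kubernetes accepts. It assumes the cluster exposes GPUs through the NVIDIA device plugin under the resource name `nvidia.com/gpu`; the deck does not say how PRP schedules GPUs, and the pod name and image are hypothetical.

```python
import json

def gpu_pod_manifest(name: str, image: str, gpus: int) -> dict:
    """Build a minimal Pod manifest that requests `gpus` NVIDIA GPUs."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": name,
                "image": image,  # hypothetical image name
                # Extended resources such as GPUs are requested under limits.
                "resources": {"limits": {"nvidia.com/gpu": gpus}},
            }],
        },
    }

manifest = gpu_pod_manifest("ml-notebook", "example.org/ml-notebook:latest", 1)
print(json.dumps(manifest, indent=2))
```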
31. UCSD Adding >350 Game GPUs to Data Sciences Cyberinfrastructure -
Devoted to Data Analytics and Machine Learning
48 GPUs for
OSG Applications
SunCAVE 70 GPUs
WAVE + Vroom 48 GPUs
FIONA with
8-Game GPUs
95 GPUs
for Students
CHASE-CI Grant Provides
96 GPUs at UCSD
for Training AI Algorithms on Big Data
Plus 288 64-bit GPUs
On SDSC’s Comet
32. Next Step: Using Kubernetes to Surround the PRP Machine Learning Platform
With Clouds of CPUs, GPUs and Non-Von Neumann Processors
CHASE-CI
64-TrueNorth
Cluster
64-bit GPUs
4352x NVIDIA Tesla V100 GPUs
See Talks by
NSF Clouds,
Google, Amazon
Microsoft Installs Altera FPGAs
into Bing Servers &
384 into TACC for Academic Access
33. Calit2 Has Established Labs On Both UC San Diego and UC Irvine Campuses
For Exploring Machine Learning on von Neumann and NvN Processors
Charless Fowlkes, Director
Ken Kreutz Delgado, Director
34. Our Support:
• US National Science Foundation (NSF) awards
CNS-0821155, CNS-1338192, CNS-1456638, CNS-1730158,
ACI-1540112, & ACI-1541349
• University of California Office of the President CIO
• UCSD Chancellor’s Integrated Digital Infrastructure Program
• UCSD Next Generation Networking initiative
• Calit2 and Calit2 Qualcomm Institute
• CENIC, PacificWave and StarLight
• DOE ESnet