The Pacific Research Platform (PRP) aims to create a "Big Data freeway system" across research institutions in the western United States and Pacific region by leveraging high-bandwidth optical fiber networks. The PRP connects multiple universities and national laboratories, providing bandwidth up to 100Gbps for data-intensive science applications. Initial testing of the PRP demonstrated disk-to-disk transfer speeds exceeding 5Gbps between many sites. The PRP will be expanded with SDN/SDX capabilities to enable even higher performance for large-scale datasets from fields like astronomy, genomics, and particle physics.
A National Big Data Cyberinfrastructure Supporting Computational Biomedical R... - Larry Smarr
Invited Presentation
Symposium on Computational Biology and Bioinformatics:
Remembering John Wooley
National Institutes of Health
Bethesda, MD
July 29, 2016
Opening Keynote Lecture
15th Annual ON*VECTOR International Photonics Workshop
Calit2’s Qualcomm Institute
University of California, San Diego
February 29, 2016
High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Scien... - Larry Smarr
11.03.28
Remote Luncheon Presentation from Calit2@UCSD
National Science Board
Expert Panel Discussion on Data Policies
National Science Foundation
Title: High Performance Cyberinfrastructure is Needed to Enable Data-Intensive Science and Engineering
Arlington, Virginia
SC21: Larry Smarr on The Rise of Supernetwork Data Intensive Computing - Larry Smarr
Larry Smarr, founding director of Calit2 (now Distinguished Professor Emeritus at the University of California San Diego) and the first director of NCSA, is one of the seminal figures in the U.S. supercomputing community. What began as a personal drive, shared by others, to spur the creation of supercomputers in the U.S. for scientific use, later expanded into a drive to link those supercomputers with high-speed optical networks, and blossomed into the notion of building a distributed, high-performance computing infrastructure – replete with compute, storage and management capabilities – available broadly to the science community.
Toward a Global Interactive Earth Observing Cyberinfrastructure - Larry Smarr
05.01.12
Invited Talk to the 21st International Conference on Interactive Information Processing Systems (IIPS) for Meteorology, Oceanography, and Hydrology Held at the 85th AMS Annual Meeting
Title: Toward a Global Interactive Earth Observing Cyberinfrastructure
San Diego, CA
The Pacific Research Platform: A Regional-Scale Big Data Analytics Cyberinfra... - Larry Smarr
National Ocean Exploration Forum 2017
Ocean Exploration in a Sea of Data
Calit2’s Qualcomm Institute
University of California, San Diego
October 21, 2017
The Jump to Light Speed - Data Intensive Earth Sciences are Leading the Way t... - Larry Smarr
05.06.14
Keynote to the 15th Federation of Earth Science Information Partners Assembly Meeting: Linking Data and Information to Decision Makers
Title: The Jump to Light Speed - Data Intensive Earth Sciences are Leading the Way to the International LambdaGrid
San Diego, CA
Using the Pacific Research Platform for Earth Sciences Big Data - Larry Smarr
Grand Challenge Lecture
Big Data and the Earth Sciences: Grand Challenges Workshop
Calit2’s Qualcomm Institute
University of California, San Diego
May 31, 2017
The OptiPuter, Quartzite, and Starlight Projects: A Campus to Global-Scale Te... - Larry Smarr
05.03.09
Invited Talk
Optical Fiber Communication Conference (OFC2005)
Title: The OptiPuter, Quartzite, and Starlight Projects: A Campus to Global-Scale Testbed for Optical Technologies Enabling LambdaGrid Computing
Anaheim, CA
Why Researchers are Using Advanced Networks - Larry Smarr
07.07.03
Remote Talk from Calit2 to:
Building KAREN Communities for Collaboration Forum
KIWI Advanced Research and Education Network
University of Auckland, Auckland City, New Zealand
Title: Why Researchers are Using Advanced Networks
La Jolla, CA
Positioning University of California Information Technology for the Future: S... - Larry Smarr
05.02.15
Invited Talk
The Vice Chancellor of Research and Chief Information Officer Summit
“Information Technology Enabling Research at the University of California”
Title: Positioning University of California Information Technology for the Future: State, National, and International IT Infrastructure Trends and Directions
Oakland, CA
1. Pacific Wave and PRP Update
Big News for Big Data
John Hess
Dr. Larry Smarr
WESTNET 2016
FORT LEWIS COLLEGE
JUNE 16, 2016
2. Six Charter Associates:
• California K-12 System
• California Community Colleges
• California State University System
• Stanford, Caltech, USC
• University of California
• California Public Libraries
• CENIC is a 501(c)3 created to serve
California’s K-20 research & education
institutions with cost-effective,
high-bandwidth networking
3. Three networks operate simultaneously as
independent layers on a single infrastructure:
CalREN Digital California (DC) daily use for
e-mail, web browsing, videoconferencing, etc.
CalREN High Performance Research (HPR)
high-performance, data-intensive efforts
CalREN eXperimental Developmental (XD)
bleeding-edge research on the network itself
CENIC: California’s Research & Education Network
4. CENIC: California’s Research & Education Network
• 3,800+ miles of optical fiber
• Members in all 58 counties connect via
fiber-optic cable or leased circuits from
telecom carriers
• Over 10,000 sites connect to CENIC
• 20,000,000 Californians use CENIC
• Governed by members on the segmental
level
• Collaborate with over 500 private sector
partners
• 88 other peering partners
(Google, Microsoft, Amazon …)
• Enables worldwide collaboration
5. Pacific Wave
and WRN
• Pacific Wave and the Western Region Network provide
for a 100Gbps network spanning the Western United
States serving PNWGP, CENIC, FRGP, ABQGP and UH.
• Pacific Wave and NSF IRNC awardee PIREN (Univ of
Hawaii) work together supporting AARNet links to
California and Washington and expansion of
high-speed service through the Pacific Islands Region
www.pnw-gigapop.net
7. Pacific Wave
• Began as first geographically distributed exchange in
2004
• Pacific Wave is an open exchange supporting both
commercial and R&E peers
• Currently serves 29 countries peering across the Pacific
and Western United States
• With PNWGP and TransPac, announced the first
100Gbps Trans-Pacific link from Tokyo to Seattle in
2015
8. R&E Exchanges within R&E
• StarLight (Chicago, IL)
– StarLight Consortium/MREN
• MANLAN (New York, NY)
– NYSERnet
• WIX (Washington, DC)
– University of Maryland/MAX GigaPOP
• AmLight (Miami, Florida)
– Florida International University/Florida LambdaRail
• Pacific Wave (Western US)
– CENIC and PNWGP
9. National/Global Activities
• NSF provides support of the R&E exchange points
through the competitive IRNC (International Research
Network Connections) program with funding for
backbone, infrastructure and innovation
• The Global Lambda Integrated Facility
– The GLIF brings together some of the world’s premier
networking engineers who are working together to
develop an international infrastructure
12. Nx100G Across the Pacific
• CURRENT:
– TransPac/Pacific Wave (Tokyo-Seattle)
– SINGAREN/Internet2 (Singapore-Los Angeles)
– SINET/SoftBank/Pacific Wave (Tokyo-Los Angeles)
– AARNET/PIREN/Pacific Wave (Australia-SEA)
• FUTURE:
– AARNET/PIREN/Pacific Wave (Australia-LA) – end of June 2016
– UH/PIREN/Pacific Wave (Guam-Hawaii-LA)
13. Pacific Wave and NSF/IRNC
• Pacific Wave has been partially supported
through three separate five-year National
Science Foundation grants supporting growth,
connectivity and innovation
• Current award promotes 100G expansion and
implementation of SDX capabilities within
Pacific Wave (ACI-1451050)
14. SDX = SDN + IXP
14
AS A Router
AS C Router
AS B Router
BGP Session
SDN Switch
SDX Controller
SDX
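The SDX idea on this slide (an SDN controller layered over an IXP's BGP fabric) can be sketched in miniature: each participant AS announces prefixes over BGP, and the controller compiles per-participant forwarding preferences into match-action rules for the exchange switch. Everything below (names, rule format, policy shape) is an illustrative assumption, not the Pacific Wave implementation.

```python
# Hypothetical, simplified SDX sketch: policies are only turned into switch
# rules when the destination AS actually announced the prefix over BGP.

def compile_sdx_rules(announcements, policies):
    """announcements: {as_name: [prefixes]};
    policies: list of (src_as, dst_prefix, dst_as) forwarding preferences."""
    rules = []
    for src_as, dst_prefix, dst_as in policies:
        if dst_prefix in announcements.get(dst_as, []):
            rules.append({"match": {"in_port": src_as, "ip_dst": dst_prefix},
                          "action": {"out_port": dst_as}})
    return rules

announcements = {"AS_B": ["10.1.0.0/16"],
                 "AS_C": ["10.1.0.0/16", "10.2.0.0/16"]}
# AS A prefers AS B for 10.1.0.0/16; the second policy names a prefix
# AS B never announced, so no rule is installed for it.
policies = [("AS_A", "10.1.0.0/16", "AS_B"),
            ("AS_A", "10.9.0.0/16", "AS_B")]
rules = compile_sdx_rules(announcements, policies)
```

The key property shown is that the SDN layer stays consistent with BGP reachability: the controller cannot steer traffic toward a peer that never announced the destination.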
17. Next Step: The Pacific Research Platform Creates
a Regional End-to-End Science-Driven “Big Data Freeway System”
NSF CC*DNI Grant
$5M 10/2015-10/2020
PI: Larry Smarr, UC San Diego Calit2
Co-PIs:
• Camille Crittenden, UC Berkeley CITRIS,
• Tom DeFanti, UC San Diego Calit2,
• Philip Papadopoulos, UC San Diego SDSC,
• Frank Wuerthwein, UC San Diego Physics and
SDSC
18. The Pacific Research Platform (PRP)
• NSF CC-NIE and similar projects represent significant investments in campus
infrastructure, including SDN and Science DMZs (~130 projects)
• But scientists are still struggling with the complexity of using the network and
with interoperability between different implementations of Science DMZs
• PRP focuses on enabling the science communities across the Pacific region to
make effective use of the high-performance infrastructure
• Kick-off in December 2014: take advantage of the regional infrastructure;
perfSONAR for measurement/analysis and MaDDash for visualization
• Include DTNs: use a common software suite for data movement; reflect
disk-to-disk performance on MaDDash
• Demonstrated as a proof of concept at the CENIC Spring meeting (March 2015)
19. DOE ESnet’s Science DMZ: A Scalable Network
Design Model for Optimizing Science Data Transfers
A Science DMZ integrates four key concepts into a unified whole:
– A network architecture designed for high-performance applications, with the science
network distinct from the general-purpose network
– The use of dedicated systems for data transfer
– Performance measurement and network testing systems that are regularly used to
characterize and troubleshoot the network
– Security policies and enforcement mechanisms that are tailored for high performance
science environments
http://fasterdata.es.net/science-dmz/
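One concrete piece of the tuning that the fasterdata.es.net guidance covers is sizing TCP buffers to the bandwidth-delay product of a long, fast path; a minimal sketch (the 50 ms RTT is an assumed example value, not from the slides):

```python
# Bandwidth-delay product: the TCP buffer needed to keep a path full is
# (bottleneck bandwidth) x (round-trip time).

def bdp_bytes(bandwidth_gbps, rtt_ms):
    bits_in_flight = bandwidth_gbps * 1e9 * (rtt_ms / 1e3)
    return int(bits_in_flight / 8)

# An assumed 10 Gb/s path with 50 ms RTT needs ~62.5 MB of buffer;
# default OS buffers of a few MB would cap throughput far below 10 Gb/s,
# which is one reason a Science DMZ uses dedicated, tuned transfer hosts.
buf = bdp_bytes(10, 50)
```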
20. PRPv0 - An experiment including:
Caltech
CENIC / Pacific Wave
ESnet / LBNL
NASA Ames / NREN
San Diego State University
SDSC
Stanford University
University of Washington
USC
UC Berkeley
UC Davis
UC Irvine
UC Los Angeles
UC Riverside
UC San Diego
UC Santa Cruz
20
21. 21
PRPv0 Experiment
The PRPv0 experiment concentrated on the
regional aspects of the research data movement
challenge.
High-performance interconnection
among campus Science DMZs
A mesh of perfSONAR toolkit instances
perfSONAR MaDDash -- Measurement
and Debugging Dashboard
Flash I/O Network Appliances (FIONAs)
and Data Transfer Nodes (DTNs)
GridFTP file transfers to quantify
throughput, with results reflected on
MaDDash
CalREN HPR / AS2153
A partial mesh of bilateral BGP
sessions across the Pacific Wave
distributed exchange
22. FIONA – Flash I/O Network Appliance:
Linux PCs Optimized for Big Data on DMZs
FIONAs Are
Science DMZ Data Transfer Nodes (DTNs) &
Optical Network Termination Devices
UCSD CC-NIE Prism Award & UCOP
Phil Papadopoulos & Tom DeFanti
Joe Keefe & John Graham
Cost:               $8,000  /  $20,000
CPU:                Intel Xeon Haswell E5-1650 v3 6-Core  /  2x E5-2697 v3 14-Core
RAM:                128 GB  /  256 GB
SSD:                SATA 3.8 TB  /  SATA 3.8 TB
Network Interface:  10/40GbE Mellanox  /  2x40GbE Chelsio + Mellanox
GPU:                NVIDIA Tesla K80
RAID Drives:        0 to 112TB (add ~$100/TB)
UCOP Rack-Mount Build:
Source:
John Graham and Tom
DeFanti, Calit2
23. DTNs loaded with Globus
Connect Server suite to obtain
GridFTP tools.
cron-scheduled transfers using
globus-url-copy.
ESnet-contributed script parses
GridFTP transfer log and loads
results in an esmond
measurement archive.
FDT – developed by Caltech in
collaboration with Politehnica
Bucharest
23
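The log-parsing step above can be sketched as follows. This is a simplified stand-in for the ESnet-contributed script, assuming a hypothetical KEY=VALUE log line with epoch-second START/END fields rather than the exact GridFTP transfer-log format:

```python
# Hypothetical, simplified GridFTP-log parser: derive achieved throughput
# from bytes moved and elapsed time, as a stand-in for the step that loads
# results into an esmond measurement archive.

def parse_transfer(line):
    fields = dict(f.split("=", 1) for f in line.split())
    nbytes = int(fields["NBYTES"])
    seconds = float(fields["END"]) - float(fields["START"])
    gbps = nbytes * 8 / seconds / 1e9
    return {"dest": fields.get("DEST"), "gbps": round(gbps, 2)}

# Assumed example line: 12 GB moved in 10 seconds -> 9.6 Gb/s
line = "START=1425882000 END=1425882010 NBYTES=12000000000 DEST=dtn.example.edu"
result = parse_transfer(line)
```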
As of 3/9/15, the Pacific Research Platform (PRPv0) as a facility logs rather good
performance:

From                   To              Measured Bandwidth   Data Transfer Utility
San Diego State Univ.  UC Los Angeles  5Gb/s out of 10      GridFTP
UC Riverside           UC Los Angeles  9Gb/s out of 10      GridFTP
UC Berkeley            UC San Diego    9.6Gb/s out of 10    GridFTP
UC Davis               UC San Diego    9.6Gb/s out of 10    GridFTP
UC Irvine              UC Los Angeles  9.6Gb/s out of 10    GridFTP
UC Santa Cruz          UC San Diego    9.6Gb/s out of 10    FDT
Stanford               UC San Diego    12Gb/s out of 40     FDT
Univ. of Washington    UC San Diego    12Gb/s out of 40     FDT
UC Los Angeles         UC San Diego    36Gb/s out of 40     FDT
Caltech                UC San Diego    36Gb/s out of 40     FDT

Table I.2.1: Bandwidth of flash disk-to-flash disk file transfers shown between several sites
for the existing experimental facility “PRPv0.”
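To put these rates in context, a quick back-of-envelope calculation (plain arithmetic in decimal units; the 0.1 Gb/s shared-Internet figure is an assumed comparison point, not from the slides):

```python
def transfer_minutes(dataset_tb, rate_gbps):
    """Minutes to move a dataset at a sustained rate (decimal units)."""
    bits = dataset_tb * 1e12 * 8
    return bits / (rate_gbps * 1e9) / 60

# 1 TB at the 9.6 Gb/s measured between several UC pairs: ~14 minutes.
# The same terabyte at an assumed 0.1 Gb/s shared-Internet rate: ~22 hours.
fast = transfer_minutes(1, 9.6)
slow = transfer_minutes(1, 0.1)
```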
24. PRP Point-to-Point Bandwidth Map:
GridFTP File Transfers - Note Huge Improvement in Last Six Months
[Maps: January 29, 2016 PRPv1 (L3) vs. June 6, 2016 PRPv1 (L3);
green indicates disk-to-disk in excess of 5Gbps]
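The dashboard's color coding can be mimicked with a tiny classifier. The 5 Gb/s green threshold is the one stated on the slide; the lower cutoffs below are invented purely for illustration:

```python
# Hypothetical MaDDash-style cell coloring: only the 5 Gb/s "green"
# threshold comes from the slide; the 1 Gb/s cutoff is an assumption.

def cell_color(gbps):
    if gbps >= 5.0:
        return "green"
    if gbps >= 1.0:
        return "orange"
    return "red"

colors = [cell_color(g) for g in (9.6, 2.0, 0.3)]
```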
29. PRP Timeline
• PRPv1
– A routed Layer 3 architecture
– Tested, Measured, Optimized, With Multi-domain Science Data
– Bring Many Of Our Science Teams Up
– Each Community Thus Will Have Its Own Certificate-Based Access
To its Specific Federated Data Infrastructure.
• PRPv2
– Incorporating SDN/SDX, AutoGOLE / NSI
– Advanced IPv6-Only Version with Robust Security Features
– e.g. Trusted Platform Module Hardware and SDN/SDX Software
– Support Rates up to 100Gb/s in Bursts And Streams
– Develop Means to Operate a Shared Federation of Caches
– Cooperating Research Groups
30. Resources
www.pnw-gigapop.net
Pacific Wave
http://www.pacificwave.net/
https://ps-dashboard.pacificwave.net
CENIC
http://www.cenic.org/
https://ps-dashboard.cenic.net
Pacific Research Platform
http://prp.ucsd.edu/
http://cenic.org/files/publications/PRP_Overview_%C6%92.pdf
http://prp-maddash.calit2.optiputer.net/maddash-webui/
Calit2
http://www.calit2.net/
CITRIS
http://citris-uc.org/
ESnet
http://www.es.net/
http://fasterdata.es.net/
http://ps-dashboard.es.net/
31. Vision:
Creating a Pacific Research Platform
Use Optical Fiber Networks to Connect
All Data Generators and Consumers,
Creating a “Big Data” Freeway System
“The Bisection Bandwidth of a Cluster Interconnect,
but Deployed on a 20-Campus Scale.”
This Vision Has Been Building for 15 Years
32. Creating a “Big Data” Freeway on Campus:
NSF-Funded Prism@UCSD and CHeruB Grants
Prism@UCSD, Phil Papadopoulos, SDSC, Calit2, PI (2013-15)
CHERuB, Mike Norman, SDSC PI
CHERuB
These Are Two
of Over
100 NSF Campus
Cyberinfrastructure
Grants
Made in the
Last 4 Years
33. How Prism@UCSD Transforms Big Data Microbiome Science:
Preparing for Knight/Smarr 1 Million Core-Hour Analysis
[Diagram: Knight Lab FIONA (12 cores/GPU, 128 GB RAM, 3.5 TB SSD, 48TB disk,
10Gbps NIC) connected via Prism@UCSD and CHERuB (100Gbps) to Gordon,
Data Oasis (7.5PB, 200GB/s), the Knight 1024 cluster in the SDSC co-lo,
and a 64Mpixel data analysis wall (Emperor and other vis tools);
link rates shown range from 10Gbps to 1.3Tbps.]
34. For Big Data Science, One Needs Bandwidths Orders of Magnitude Higher
Than the Shared Internet Between Campuses
Bandwidth from My Office
in Calit2’s Qualcomm Institute
Bandwidth On
the Pacific Research Platform:
500 Times the Bandwidth of the Shared Internet!
35. Invitation-Only PRP Workshop Held in Calit2’s Qualcomm Institute
October 14-16, 2015
• 130 Attendees From 40 organizations
– Ten UC Campuses, as well as UCOP Plus 11 Additional US Universities
– Four International Organizations (from Amsterdam, Canada, Korea, and Japan)
– Five Members of Industry Plus NSF
36. GPU JupyterHub:
2 x 14-core CPUs
256GB RAM
1.2TB FLASH
3.8TB SSD
Nvidia K80 GPU
Dual 40GbE NICs
And a Trusted Platform
Module
GPU JupyterHub:
1 x 18-core CPU
128GB RAM
3.8TB SSD
Nvidia K80 GPU
Dual 40GbE NICs
And a Trusted Platform
Module
PRP UC-JupyterHub Backbone
UCB Next Step: Deploy Across PRP UCSD
Source: John Graham, Calit2
37.
38. Cancer Genomics Hub (UCSC) is Housed in SDSC:
Large Data Flows to End Users at UCSC, UCB, UCSF, …
[Chart: CGHub data flows to end users grew from 1G to 8G to 15G (Jan 2016),
on the order of 30,000 TB per year.
Data Source: David Haussler, Brad Smith, UCSC]
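Moving 30,000 TB per year implies a substantial round-the-clock rate; a quick check (plain arithmetic, decimal units, 365-day year):

```python
# Back-of-envelope: the sustained bandwidth needed to move a yearly
# data volume continuously.

def sustained_gbps(tb_per_year):
    bits = tb_per_year * 1e12 * 8
    return bits / (365 * 24 * 3600) / 1e9

# 30,000 TB/year works out to roughly 7.6 Gb/s sustained, which is why
# flows at this scale need dedicated high-bandwidth paths.
rate = sustained_gbps(30000)
```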
39. Two Automated Telescope Surveys
Creating Huge Datasets Will Drive PRP
• First survey: 300 images per night, 100MB per raw image;
30GB per night raw, 120GB per night when processed at NERSC
• Second survey: 250 images per night, 530MB per raw image;
150GB per night raw, 800GB per night when processed at NERSC
Processing at NERSC increases the nightly volume by ~4x
Source: Peter Nugent, Division Deputy for Scientific Engagement, LBL
Professor of Astronomy, UC Berkeley
Precursors to
LSST and NCSA
PRP Allows Researchers
to Bring Datasets from NERSC
to Their Local Clusters
for In-Depth Science Analysis
Data Flows Over HPWREN
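The nightly volumes above check out with simple arithmetic (decimal units; the 4x factor is the slide's processed-at-NERSC increase, which matches the first survey's 30GB-to-120GB growth):

```python
# Sanity check of the slide's nightly data volumes.

def nightly_gb(images_per_night, mb_per_image):
    return images_per_night * mb_per_image / 1000

raw = nightly_gb(300, 100)   # 300 raw images x 100 MB = 30 GB per night
processed = raw * 4          # slide's ~4x growth after NERSC processing
```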
40. Global Scientific Instruments Will Produce Ultralarge Datasets Continuously
Requiring Dedicated Optic Fiber and Supercomputers
Square Kilometer Array Large Synoptic Survey Telescope
https://tnc15.terena.org/getfile/1939 www.lsst.org/sites/default/files/documents/DM%20Introduction%20-%20Kantor.pdf
Tracks ~40B Objects,
Creates 10M Alerts/Night
Within 1 Minute of Observing
2x40Gb/s
41. community resources. This facility depends on a range of common services, support activities, software,
and operational principles that coordinate the production of scientific knowledge through the DHTC
model. In April 2012, the OSG project was extended until 2017; it is jointly funded by the Department of
Energy and the National Science Foundation.
OSG Federates Clusters in 40/50 States:
Creating a Scientific Compute and Storage “Cloud”
Source: Miron Livny, Frank Wuerthwein, OSG
42. We are Experimenting with the PRP for Large Hadron Collider Data Analysis
Using The West Coast Open Science Grid on 10-100Gbps Optical Networks
Crossed
100 Million
Core-Hours/Month
In Dec 2015
Over 1 Billion
Data Transfers
Moved
200 Petabytes
In 2015
Supported Over
200 Million Jobs
In 2015
Source: Miron Livny, Frank Wuerthwein, OSG
ATLAS
CMS
44. Dan Cayan
USGS Water Resources Discipline
Scripps Institution of Oceanography, UC San Diego
much support from Mary Tyree, Mike Dettinger, Guido Franco and other colleagues
NCAR Upgrading to 10Gbps Link Over Westnet
from Wyoming and Boulder to CENIC/PRP
Sponsors:
California Energy Commission
NOAA RISA program
California DWR, DOE, NSF
Planning for climate change in California
substantial shifts on top of already high climate variability
UCSD Campus Climate Researchers Need to Download
Results from NCAR Remote Supercomputer Simulations
to Make Regional Climate Change Forecasts
45. [Two map panels: average summer afternoon temperature]
Downscaling Supercomputer Climate Simulations
To Provide High Res Predictions for California Over Next 50 Years
45
Source: Hugo Hidalgo, Tapash Das, Mike Dettinger
46. Extending PRP/CENIC Optical Backplane
via High Speed Wireless Research and Education Network
[Map: links span approximately 50 miles; locations are approximate;
extends to CI and PEMEX]
47. Real-Time Network Cameras on Mountains
for Environmental Observations
Source: Hans Werner Braun,
HPWREN PI
48. 14 May 2014:
9 Simultaneous Active Fires in San Diego County
San Diego County Red Mountain Fire Cameras
• Southeast (left) “Highway” Fire
• Southwest (center rear) “Poinsettia” Fire
• West (right) “Tomahawk” Fire
49. Interactive Virtual Reality of San Diego County
Includes Live Feeds From 150 Met Stations
TourCAVE at Calit2’s Qualcomm Institute
50. HPWREN Users and Public Safety Clients
Gain Redundancy and Resilience from PRP Upgrade
San Diego Countywide
Sensors and Camera
Resources
UCSD & SDSU
Data & Compute
Resources
UCSD
UCR
SDSU
UCI
UCI & UCR
Data Replication
and PRP FIONA Anchors
as HPWREN Expands
Northward
10X Increase During Wildfires
Data From Hans-Werner Braun
• PRP CENIC 10G Link UCSD to SDSU
– DTN FIONAs Endpoints
– Data Redundancy
– Disaster Recovery
– High Availability
– Network Redundancy
51. NSF Has Funded Over 100 Campuses to Build Local Big Data Freeways:
Imagine Linking All of Them Like the Pacific Research Platform
Red 2012 CC-NIE Awardees
Yellow 2013 CC-NIE Awardees
Green 2014 CC*IIE Awardees
Blue 2015 CC*DNI Awardees
Purple Multiple Time Awardees
Source: NSF
52. Next Step: Global Research Platform
Building on CENIC/Pacific Wave and GLIF
Current
International
GRP Partners