The Pacific Research Platform: A Regional-Scale Big Data Analytics Cyberinfrastructure


Presentation to UC Radiologists
October 16, 2017

Published in: Health & Medicine

  1. 1. “The Pacific Research Platform: A Regional-Scale Big Data Analytics Cyberinfrastructure” Presentation to UC Radiologists October 16, 2017 Dr. Larry Smarr Director, California Institute for Telecommunications and Information Technology Harry E. Gruber Professor, Dept. of Computer Science and Engineering Jacobs School of Engineering, UCSD
  2. 2. Collaborations with UCSD Radiology
  3. 3. Using Enhanced MRI to Detect Regions of Inflammation in Joint MRI and Annotation Christine Chung, MD, UCSD Visualization by Jurgen Schulze, Calit2, UCSD Inflammation in Bursa in Front of Patella Edema at Soft Tissue Attachment to Bone 196 Slices, 512x512 Slice Resolution Edema Behind Patella
  4. 4. Images courtesy of Christine Chung MD, UCSD MSK Imaging Research Lab Advanced MRI: Locating Disc Degeneration and Spinal Stenosis
  5. 5. Mike Kurisu Examining Larry Smarr’s Spine in the Calit2 Virtual Reality CAVE Visualizations from MRI by Jurgen Schulze, Calit2, UCSD
  6. 6. Converting Abdominal MRI Slices to 3D Organ Segmentation for Surgical Pre-Planning MRI Slice from Dr. Cynthia Santillan 3D Organ Segmentation Made by Dr. Jurgen Schulze from Dr. Santillan’s 150-Slice MRI
  7. 7. Pre-Surgical Planning in QI Virtual Reality: Using Virtual Reality As Input for Positioning The Two Resection Cuts Colon visualization by Jurgen Schulze, Calit2; Photo credit Tom DeFanti, Calit2 Surgeon Sonia Ramamoorthy, MD in Calit2 Virtual Reality CAVE Monday November 21, 2016
  8. 8. Toward a UCSD Data Sciences Cyberinfrastructure for 15 Years: OptIPuter, Quartzite, Prism OptIPuter (2002-2009): PI Smarr, Co-PI DeFanti, Co-PI Papadopoulos Quartzite (2004-2007): PI Papadopoulos, Co-PI Smarr Prism (2013-2015): PI Papadopoulos, Co-PI Smarr Science DMZ
  9. 9. Based on Community Input and on ESnet’s Science DMZ Concept, NSF Has Funded Over 100 Campuses to Build Local Big Data Freeways Red 2012 CC-NIE Awardees Yellow 2013 CC-NIE Awardees Green 2014 CC*IIE Awardees Blue 2015 CC*DNI Awardees Purple Multiple Time Awardees Source: NSF
  10. 10. Logical Next Step: The Pacific Research Platform Creates a Regional End-to-End Science-Driven “Big Data Superhighway” System NSF CC*DNI Grant $5M 10/2015-10/2020 PI: Larry Smarr, UC San Diego Calit2 Co-PIs: • Camille Crittenden, UC Berkeley CITRIS • Tom DeFanti, UC San Diego Calit2 • Philip Papadopoulos, UCSD SDSC • Frank Wuerthwein, UCSD Physics and SDSC Letters of Commitment from: • 50 Researchers from 15 Campuses • 32 IT/Network Organization Leaders
  11. 11. Big Data Science Data Transfer Nodes (DTNs): Flash I/O Network Appliances (FIONAs) UCSD Designed FIONAs To Solve the Disk-to-Disk Data Transfer Problem at Full Speed on 10G, 40G and 100G Networks FIONAs: 10/40G, $8,000 FIONette: 1G, $1,000 Phil Papadopoulos, SDSC & Tom DeFanti, Joe Keefe & John Graham, Calit2
  12. 12. How UCSD DMZ Network Transforms Big Data Microbiome Science: Preparing for Knight/Smarr 1 Million Core-Hour Analysis Knight Lab FIONA 10Gbps Gordon Prism@UCSD Data Oasis 7.5PB, 200GB/s Knight 1024 Cluster In SDSC Co-Lo CHERuB 100Gbps Emperor & Other Vis Tools 64Mpixel Data Analysis Wall 120Gbps 40Gbps 1.3Tbps
  13. 13. We Measure Disk-to-Disk Throughput with 10GB File Transfer 4 Times Per Day in Both Directions for All PRP Sites January 29, 2016 From Start of Monitoring 12 DTNs to 24 DTNs Connected at 10-40G in 1 ½ Years July 21, 2017 Source: John Graham, Calit2
  14. 14. PRP’s First 2 Years: Connecting Multi-Campus Application Teams and Devices
  15. 15. Cryo-electron Microscopy (cryo-EM) Has Driven a “Resolution Revolution” in the Last Five Years Exposure (every 60 seconds): X & Y dimensions: 7420 x 7676 Pixels Frames per Movie: 10 - 50 Size: 3 - 10 GB per Movie Every 24 hours: Number of Movies: ~1400 Data Size: ~5 TB Typical Datasets: Length of Time: 2 - 6 Days Total size: 10 - 30 TB Each Cryo-EM ‘Image’ is Actually a Movie Source: Michael A. Cianfrocco, Elizabeth Villa, & Andres Leschziner, UCSD
  16. 16. Using PRP to Connect Cryo-EM across California With End Users and Computational Facilities Long term: ‣ Partner with Cryo-EM Facilities to Stream Data Straight from Microscopes (over PRP) to SDSC ‣ Perform All Cryo-EM Analysis (from Micrographs to 3D Models) via Web Browser on SDSC ‣ Expand Computing to Other XSEDE Resources (e.g. Xstream) and DOE’s NERSC Short term: ‣ Provide 2D and 3D Analysis on Particle Stacks on Comet at SDSC Source: Michael A. Cianfrocco, UCSD SDSC NERSC Xstream 3 Supercomputer Centers ~20 Microscopes in CA UCLA UC Davis UC Santa Cruz SF Bay UC Berkeley, LBNL, UCSF, Stanford San Diego UCSD, TSRI, Salk* Extending to MSU
  17. 17. New NSF CHASE-CI Grant Creates a Community Cyberinfrastructure Adding a Machine Learning Layer Built on Top of the Pacific Research Platform Caltech UCB UCI UCR UCSD UCSC Stanford MSU UCM SDSU NSF Grant for High Speed “Cloud” of 256 GPUs For 30 ML Faculty & Their Students at 10 Campuses for Training AI Algorithms on Big Data
  18. 18. Machine Learning Researchers Need a New Cyberinfrastructure “Until cloud providers are willing to find a solution to place commodity (32-bit) game GPUs into their servers and price services accordingly, I think we will not be able to leverage the cloud effectively.” “There is an actual scientific infrastructure need here, surprisingly unmet by the commercial market, and perhaps CHASE-CI is the perfect catalyst to break this logjam.” --UC Berkeley Professor Trevor Darrell
  19. 19. Adding GPUs to FIONAs Supports Data Science Machine Learning Eight Nvidia GTX-1080 Ti GPUs ~$13K 32GB RAM, 3TB SSD, 40G & Dual 10G ports Source: John Graham, Calit2
  20. 20. Single vs. Double Precision GPUs: Gaming vs. Supercomputing 8 x 1080 Ti: 1 Million GPU Core-Hours Every 2 Days, Cost of a Starbucks Latte. 500 Million GPU Core-Hours for $14K in 3yrs
  21. 21. 48 GPUs for OSG Applications UCSD Adding >350 Game GPUs to Data Sciences Cyberinfrastructure - Devoted to Data Analytics and Machine Learning SunCAVE 70 GPUs WAVE + Vroom 48 GPUs FIONA with 8-Game GPUs 88 GPUs for Students CHASE-CI Grant Provides 96 GPUs at UCSD for Training AI Algorithms on Big Data
  22. 22. Calit2’s Qualcomm Institute Has Established a Pattern Recognition Lab For Machine Learning on GPUs and von Neumann and NvN Processors Source: Dr. Dharmendra Modha Founding Director, IBM Cognitive Computing Group August 8, 2014 UCSD ECE Professor Ken Kreutz-Delgado Brings the IBM TrueNorth Chip to Start Calit2’s Qualcomm Institute Pattern Recognition Laboratory September 16, 2015
  23. 23. Our Pattern Recognition Lab is Exploring Mapping Machine Learning Algorithm Families Onto Novel Architectures Qualcomm Institute • Deep & Recurrent Neural Networks (DNN, RNN) • Graph Theoretic • Reinforcement Learning (RL) • Clustering and other neighborhood-based • Support Vector Machine (SVM) • Sparse Signal Processing and Source Localization • Dimensionality Reduction & Manifold Learning • Latent Variable Analysis (PCA, ICA) • Stochastic Sampling, Variational Approximation • Decision Tree Learning
  24. 24. For ¾ of a Century, Computing Has Relied on von Neumann’s Architecture
  25. 25. Next Step: Surrounding the UCSD Data Sciences Machine Learning Platform With Clouds of GPUs and Non-Von Neumann Processors Microsoft Installs Altera FPGAs into Bing Servers & 384 into TACC for Academic Access 64-TrueNorth Cluster CHASE-CI 64-bit GPUs
  26. 26. Our Support: • US National Science Foundation (NSF) awards CNS-0821155, CNS-1338192, CNS-1456638, CNS-1730158, ACI-1540112, & ACI-1541349 • University of California Office of the President CIO • UCSD Chancellor’s Integrated Digital Infrastructure Program • UCSD Next Generation Networking initiative • Calit2 and Calit2 Qualcomm Institute • CENIC, PacificWave and StarLight • DOE ESnet
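
Several slides quote figures that can be sanity-checked with simple arithmetic: the 10GB disk-to-disk test transfers (slide 13), the cryo-EM data volumes (slide 15), and the GPU core-hour estimates (slide 20). The sketch below is a back-of-envelope illustration, not a tool used by the PRP team. The function names are hypothetical, decimal units (1 GB = 10^9 bytes) are an assumption, and the 3,584 CUDA cores per GTX 1080 Ti comes from Nvidia's published specification rather than from the slides.

```python
# Back-of-envelope checks for figures quoted on slides 13, 15 and 20.
# Function names are hypothetical; they are not part of any PRP software.

GB = 10**9   # assumption: the slides use decimal gigabytes
TB = 10**12  # assumption: the slides use decimal terabytes

# Assumption (not on the slides): CUDA cores in one Nvidia GTX 1080 Ti.
CORES_PER_1080TI = 3584


def throughput_gbps(num_bytes: int, seconds: float) -> float:
    """Average throughput in gigabits per second for a single transfer."""
    return num_bytes * 8 / seconds / 1e9


def transfer_hours(num_bytes: int, link_gbps: float) -> float:
    """Hours needed to move num_bytes at a sustained rate of link_gbps."""
    return num_bytes * 8 / (link_gbps * 1e9) / 3600


def gpu_core_hours(gpus: int, cores_per_gpu: int, hours: float) -> float:
    """GPU core-hours delivered by fully utilized GPUs over a time span."""
    return gpus * cores_per_gpu * hours


if __name__ == "__main__":
    # Slide 13: a 10GB test file that takes 10 s (hypothetical timing).
    print("10GB in 10 s:", throughput_gbps(10 * GB, 10.0), "Gbps")

    # Slide 15: ~5 TB per day from ~1400 movies implies the mean movie size.
    print("Mean movie size (GB):", 5 * TB / 1400 / GB)

    # Slide 15: one day of cryo-EM data over a fully used 10 Gbps path.
    print("5 TB at 10 Gbps (hours):", transfer_hours(5 * TB, 10.0))

    # Slide 20: eight 1080 Ti cards over 2 days, then over 3 years.
    print("2 days:", gpu_core_hours(8, CORES_PER_1080TI, 48))
    print("3 years:", gpu_core_hours(8, CORES_PER_1080TI, 24 * 365 * 3))
```

Under these assumptions, eight cards running for 48 hours deliver 1,376,256 core-hours, consistent with the "1 Million GPU Core-Hours Every 2 Days" on slide 20. Three years at full utilization gives about 754 million core-hours, so the slide's 500 million figure leaves room for idle time. The implied mean cryo-EM movie size of roughly 3.6 GB falls inside the 3 to 10 GB range quoted on slide 15.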