Big Data Tech Forum: Big Data Enabling Technologies and Applications
San Diego Chinese American Science and Engineering Association (SDCASEA)
Sanford Consortium
La Jolla, CA
December 2, 2017
The Pacific Research Platform (PRP) is a multi-institutional cyberinfrastructure project that connects researchers across California and beyond to share large datasets. It spans the 10 University of California campuses, major private research universities, supercomputer centers, and some out-of-state universities. Fifteen multi-campus research teams in fields like physics, astronomy, earth sciences, biomedicine, and multimedia will drive the technical needs of the PRP over five years. The goal is to create a "big data freeway" to allow high-speed sharing of data between research labs, supercomputers, and repositories across multiple networks without performance loss over long distances.
CHASE-CI: A Distributed Big Data Machine Learning PlatformLarry Smarr
This document summarizes a talk given by Professor Ken Kreutz-Delgado on distributed machine learning platforms and brain-inspired computing. It discusses the Pacific Research Platform (PRP) which connects multiple universities and research institutions. The PRP uses FIONA appliances and Kubernetes to distribute storage and processing. A new NSF grant will add GPUs across 10 campuses for training AI algorithms on big data. The talk envisions connecting the PRP with clouds of GPUs and non-von Neumann processors like IBM's TrueNorth chip. Calit2's Pattern Recognition Lab uses different processors including TrueNorth to explore machine learning algorithms.
Building the Pacific Research Platform: Supernetworks for Big Data ScienceLarry Smarr
The document summarizes Dr. Larry Smarr's presentation on building the Pacific Research Platform (PRP) to enable big data science across research universities on the West Coast. The PRP provides 100-1000 times more bandwidth than today's internet to support research fields from particle physics to climate change. In under 2 years, the prototype PRP has connected researchers and datasets across California through optical networks and is now expanding nationally and globally. The next steps involve adding machine learning capabilities to the PRP through GPU clusters to enable new discoveries from massive datasets.
Opening Keynote Lecture
15th Annual ON*VECTOR International Photonics Workshop
Calit2’s Qualcomm Institute
University of California, San Diego
February 29, 2016
- The Pacific Research Platform (PRP) interconnects campus DMZs across multiple institutions to provide high-speed connectivity for data-intensive research.
- The PRP utilizes specialized data transfer nodes called FIONAs that provide disk-to-disk transfer speeds of 10-100Gbps.
- Early applications of the PRP include distributing telescope data between UC campuses, connecting particle physics experiments to computing resources, and enabling real-time wildfire sensor data analysis.
The Pacific Research Platform (PRP) is a multi-institutional cyberinfrastructure project that connects researchers across California and beyond to share large datasets. It spans the 10 University of California campuses, major private research universities, supercomputer centers, and some out-of-state universities. Fifteen multi-campus research teams in fields like physics, astronomy, earth sciences, biomedicine, and multimedia will drive the technical needs of the PRP over five years. The goal is to create a "big data freeway" to allow high-speed sharing of data between research labs, supercomputers, and repositories across multiple networks without performance loss over long distances.
CHASE-CI: A Distributed Big Data Machine Learning PlatformLarry Smarr
This document summarizes a talk given by Professor Ken Kreutz-Delgado on distributed machine learning platforms and brain-inspired computing. It discusses the Pacific Research Platform (PRP) which connects multiple universities and research institutions. The PRP uses FIONA appliances and Kubernetes to distribute storage and processing. A new NSF grant will add GPUs across 10 campuses for training AI algorithms on big data. The talk envisions connecting the PRP with clouds of GPUs and non-von Neumann processors like IBM's TrueNorth chip. Calit2's Pattern Recognition Lab uses different processors including TrueNorth to explore machine learning algorithms.
Building the Pacific Research Platform: Supernetworks for Big Data ScienceLarry Smarr
The document summarizes Dr. Larry Smarr's presentation on building the Pacific Research Platform (PRP) to enable big data science across research universities on the West Coast. The PRP provides 100-1000 times more bandwidth than today's internet to support research fields from particle physics to climate change. In under 2 years, the prototype PRP has connected researchers and datasets across California through optical networks and is now expanding nationally and globally. The next steps involve adding machine learning capabilities to the PRP through GPU clusters to enable new discoveries from massive datasets.
Opening Keynote Lecture
15th Annual ON*VECTOR International Photonics Workshop
Calit2’s Qualcomm Institute
University of California, San Diego
February 29, 2016
- The Pacific Research Platform (PRP) interconnects campus DMZs across multiple institutions to provide high-speed connectivity for data-intensive research.
- The PRP utilizes specialized data transfer nodes called FIONAs that provide disk-to-disk transfer speeds of 10-100Gbps.
- Early applications of the PRP include distributing telescope data between UC campuses, connecting particle physics experiments to computing resources, and enabling real-time wildfire sensor data analysis.
Machine Learning in Healthcare DiagnosticsLarry Smarr
Machine learning and artificial intelligence are rapidly transforming healthcare and medicine. Advances in genetic sequencing have enabled the mapping of human and microbial genomes at low costs. Researchers are using machine learning to analyze genomic and microbiome data to better understand health and disease. Non-von Neumann brain-inspired computing architectures are being developed for machine learning applications and could accelerate medical research and diagnostics. These technologies may help create personalized health coaching and move medicine from reactive sickcare to proactive healthcare.
Peering The Pacific Research Platform With The Great Plains NetworkLarry Smarr
The Pacific Research Platform (PRP) connects research institutions across the western United States with high-speed networks to enable data-intensive science collaborations. Key points:
- The PRP connects 15 campuses across California and links to the Great Plains Network, allowing researchers to access remote supercomputers, share large datasets, and collaborate on projects like analyzing data from the Large Hadron Collider.
- The PRP utilizes Science DMZ architectures with dedicated data transfer nodes called FIONAs to achieve high-speed transfer of large files. Kubernetes is used to manage distributed storage and computing resources.
- Early applications include distributed climate modeling, wildfire science, plankton imaging, and cancer genomics. The PR
The Pacific Research Platform: Building a Distributed Big Data Machine Learni...Larry Smarr
This document summarizes Dr. Larry Smarr's invited talk about the Pacific Research Platform (PRP) given at the San Diego Supercomputer Center in April 2019. The PRP is building a distributed big data machine learning supercomputer by connecting high-performance computing and data resources across multiple universities in California and beyond using high-speed networks. It provides researchers with petascale computing power, distributed storage, and tools like Kubernetes to enable collaborative data-intensive science across institutions.
Towards a High-Performance National Research Platform Enabling Digital ResearchLarry Smarr
The document summarizes Dr. Larry Smarr's keynote presentation on enabling a high-performance national research platform. It describes how multi-institutional research increasingly relies on access to large datasets, requiring new cyberinfrastructure. The Pacific Research Platform provides high-bandwidth networking between universities to support research collaborations across disciplines. The next steps involve scaling this model into a national and global platform. The presentation highlights how the PRP enables various scientific applications and drives innovation through improved data transfer capabilities and distributed computing resources.
The document summarizes the agenda for the NDM 2012 workshop on Network-aware Data Management to be held on November 11th, 2012 in Salt Lake City. The workshop will include keynote speeches, invited talks, paper presentations, and a panel discussion on new directions in networking and data management. Topics will include data-intensive applications, transport of big data over dedicated networks, and using networking techniques for data management. The workshop aims to foster collaboration between the network and data management communities.
This document discusses several projects related to connecting research institutions through high-speed networks:
1) The Pacific Research Platform connects campuses in California through a "big data superhighway" funded by NSF from 2015-2020.
2) CHASE-CI adds machine learning capabilities for researchers across 10 campuses in California using NSF-funded GPU resources.
3) A pilot project is using CENIC and Internet2 to connect regional research networks on a national scale, funded by NSF from 2018-2019.
The Pacific Research Platform Enables Distributed Big-Data Machine-LearningLarry Smarr
The Pacific Research Platform enables distributed big data machine learning by connecting scientific instruments, sensors, and supercomputers across California and the United States with high-speed optical networks. Key components include FIONA data transfer nodes that allow fast disk-to-disk transfers near the theoretical maximum, Kubernetes to orchestrate distributed computing resources, and the Nautilus hypercluster which aggregates thousands of CPU cores and GPUs into a unified platform. This infrastructure has accelerated many scientific workflows and supported cutting-edge research in fields such as astronomy, oceanography, climate science, and particle physics.
My talk at the Winter School on Big Data in Tarragona, Spain.
Abstract: We have made much progress over the past decade toward harnessing the collective power of IT resources distributed across the globe. In high-energy physics, astronomy, and climate, thousands work daily within virtual computing systems with global scope. But we now face a far greater challenge: Exploding data volumes and powerful simulation tools mean that many more--ultimately most?--researchers will soon require capabilities not so different from those used by such big-science teams. How are we to meet these needs? Must every lab be filled with computers and every researcher become an IT specialist? Perhaps the solution is rather to move research IT out of the lab entirely: to leverage the “cloud” (whether private or public) to achieve economies of scale and reduce cognitive load. I explore the past, current, and potential future of large-scale outsourcing and automation for science, and suggest opportunities and challenges for today’s researchers.
The document summarizes a workshop on network-aware data management held alongside the SC'11 conference. The workshop addressed challenges in managing large amounts of data across high-bandwidth networks and how to simplify data access and movement. It included keynote and paper presentations on these topics, and a panel discussion on data management challenges for exascale computing and terabit networks. The best paper award was given to a presentation on a fat-tree routing algorithm to alleviate congestion in InfiniBand networks.
06.07.26
Invited Talk
Cyberinfrastructure for Humanities, Arts, and Social Sciences, A Summer Institute, SDSC
Title: The OptIPuter and Its Applications
La Jolla, CA
The Pacific Research Platform: A Regional-Scale Big Data Analytics Cyberinfra...Larry Smarr
The document discusses the Pacific Research Platform (PRP), a regional big data cyberinfrastructure connecting researchers across California universities. PRP provides high-speed networks and data transfer nodes to enable sharing of large datasets for projects like medical imaging, cryo-electron microscopy, and machine learning. Recent grants are expanding PRP to add GPUs and non-von Neumann processors to support these computationally intensive applications.
Science Engagement: A Non-Technical Approach to the Technical DivideCybera Inc.
A presentation for the Future of Networking session at the 2014 Cyber Summit by Jason Zurawski, Science Engagement Engineer, ESnet (Lawrence Berkeley National Laboratory).
1) Scientists at the Advanced Photon Source use the Argonne Leadership Computing Facility for data reconstruction and analysis from experimental facilities in real-time or near real-time. This provides feedback during experiments.
2) Using the Swift parallel scripting language and ALCF supercomputers like Mira, scientists can process terabytes of data from experiments in minutes rather than hours or days. This enables errors to be detected and addressed during experiments.
3) Key applications discussed include near-field high-energy X-ray diffraction microscopy, X-ray nano/microtomography, and determining crystal structures from diffuse scattering images through simulation and optimization. The workflows developed provide significant time savings and improved experimental outcomes.
Accelerating Discovery via Science ServicesIan Foster
[A talk presented at Oak Ridge National Laboratory on October 15, 2015]
We have made much progress over the past decade toward harnessing the collective power of IT resources distributed across the globe. In big-science projects in high-energy physics, astronomy, and climate, thousands work daily within virtual computing systems with global scope. But we now face a far greater challenge: Exploding data volumes and powerful simulation tools mean that many more--ultimately most?--researchers will soon require capabilities not so different from those used by such big-science teams. How are we to meet these needs? Must every lab be filled with computers and every researcher become an IT specialist? Perhaps the solution is rather to move research IT out of the lab entirely: to develop suites of science services to which researchers can dispatch mundane but time-consuming tasks, and thus to achieve economies of scale and reduce cognitive load. I explore the past, current, and potential future of large-scale outsourcing and automation for science, and suggest opportunities and challenges for today’s researchers. I use examples from Globus and other projects to demonstrate what can be achieved.
The Pacific Research Platform: A Regional-Scale Big Data Analytics Cyberinfra...Larry Smarr
National Ocean Exploration Forum 2017
Ocean Exploration in a Sea of Data
Calit2’s Qualcomm Institute
University of California, San Diego
October 21, 2017
Machine Learning in Healthcare DiagnosticsLarry Smarr
Machine learning and artificial intelligence are rapidly transforming healthcare and medicine. Advances in genetic sequencing have enabled the mapping of human and microbial genomes at low costs. Researchers are using machine learning to analyze genomic and microbiome data to better understand health and disease. Non-von Neumann brain-inspired computing architectures are being developed for machine learning applications and could accelerate medical research and diagnostics. These technologies may help create personalized health coaching and move medicine from reactive sickcare to proactive healthcare.
Peering The Pacific Research Platform With The Great Plains NetworkLarry Smarr
The Pacific Research Platform (PRP) connects research institutions across the western United States with high-speed networks to enable data-intensive science collaborations. Key points:
- The PRP connects 15 campuses across California and links to the Great Plains Network, allowing researchers to access remote supercomputers, share large datasets, and collaborate on projects like analyzing data from the Large Hadron Collider.
- The PRP utilizes Science DMZ architectures with dedicated data transfer nodes called FIONAs to achieve high-speed transfer of large files. Kubernetes is used to manage distributed storage and computing resources.
- Early applications include distributed climate modeling, wildfire science, plankton imaging, and cancer genomics. The PR
The Pacific Research Platform: Building a Distributed Big Data Machine Learni...Larry Smarr
This document summarizes Dr. Larry Smarr's invited talk about the Pacific Research Platform (PRP) given at the San Diego Supercomputer Center in April 2019. The PRP is building a distributed big data machine learning supercomputer by connecting high-performance computing and data resources across multiple universities in California and beyond using high-speed networks. It provides researchers with petascale computing power, distributed storage, and tools like Kubernetes to enable collaborative data-intensive science across institutions.
Towards a High-Performance National Research Platform Enabling Digital ResearchLarry Smarr
The document summarizes Dr. Larry Smarr's keynote presentation on enabling a high-performance national research platform. It describes how multi-institutional research increasingly relies on access to large datasets, requiring new cyberinfrastructure. The Pacific Research Platform provides high-bandwidth networking between universities to support research collaborations across disciplines. The next steps involve scaling this model into a national and global platform. The presentation highlights how the PRP enables various scientific applications and drives innovation through improved data transfer capabilities and distributed computing resources.
The document summarizes the agenda for the NDM 2012 workshop on Network-aware Data Management to be held on November 11th, 2012 in Salt Lake City. The workshop will include keynote speeches, invited talks, paper presentations, and a panel discussion on new directions in networking and data management. Topics will include data-intensive applications, transport of big data over dedicated networks, and using networking techniques for data management. The workshop aims to foster collaboration between the network and data management communities.
This document discusses several projects related to connecting research institutions through high-speed networks:
1) The Pacific Research Platform connects campuses in California through a "big data superhighway" funded by NSF from 2015-2020.
2) CHASE-CI adds machine learning capabilities for researchers across 10 campuses in California using NSF-funded GPU resources.
3) A pilot project is using CENIC and Internet2 to connect regional research networks on a national scale, funded by NSF from 2018-2019.
The Pacific Research Platform Enables Distributed Big-Data Machine-LearningLarry Smarr
The Pacific Research Platform enables distributed big data machine learning by connecting scientific instruments, sensors, and supercomputers across California and the United States with high-speed optical networks. Key components include FIONA data transfer nodes that allow fast disk-to-disk transfers near the theoretical maximum, Kubernetes to orchestrate distributed computing resources, and the Nautilus hypercluster which aggregates thousands of CPU cores and GPUs into a unified platform. This infrastructure has accelerated many scientific workflows and supported cutting-edge research in fields such as astronomy, oceanography, climate science, and particle physics.
My talk at the Winter School on Big Data in Tarragona, Spain.
Abstract: We have made much progress over the past decade toward harnessing the collective power of IT resources distributed across the globe. In high-energy physics, astronomy, and climate, thousands work daily within virtual computing systems with global scope. But we now face a far greater challenge: Exploding data volumes and powerful simulation tools mean that many more--ultimately most?--researchers will soon require capabilities not so different from those used by such big-science teams. How are we to meet these needs? Must every lab be filled with computers and every researcher become an IT specialist? Perhaps the solution is rather to move research IT out of the lab entirely: to leverage the “cloud” (whether private or public) to achieve economies of scale and reduce cognitive load. I explore the past, current, and potential future of large-scale outsourcing and automation for science, and suggest opportunities and challenges for today’s researchers.
The document summarizes a workshop on network-aware data management held alongside the SC'11 conference. The workshop addressed challenges in managing large amounts of data across high-bandwidth networks and how to simplify data access and movement. It included keynote and paper presentations on these topics, and a panel discussion on data management challenges for exascale computing and terabit networks. The best paper award was given to a presentation on a fat-tree routing algorithm to alleviate congestion in InfiniBand networks.
06.07.26
Invited Talk
Cyberinfrastructure for Humanities, Arts, and Social Sciences, A Summer Institute, SDSC
Title: The OptIPuter and Its Applications
La Jolla, CA
The Pacific Research Platform: A Regional-Scale Big Data Analytics Cyberinfra...Larry Smarr
The document discusses the Pacific Research Platform (PRP), a regional big data cyberinfrastructure connecting researchers across California universities. PRP provides high-speed networks and data transfer nodes to enable sharing of large datasets for projects like medical imaging, cryo-electron microscopy, and machine learning. Recent grants are expanding PRP to add GPUs and non-von Neumann processors to support these computationally intensive applications.
Science Engagement: A Non-Technical Approach to the Technical DivideCybera Inc.
A presentation for the Future of Networking session at the 2014 Cyber Summit by Jason Zurawski, Science Engagement Engineer, ESnet (Lawrence Berkeley National Laboratory).
1) Scientists at the Advanced Photon Source use the Argonne Leadership Computing Facility for data reconstruction and analysis from experimental facilities in real-time or near real-time. This provides feedback during experiments.
2) Using the Swift parallel scripting language and ALCF supercomputers like Mira, scientists can process terabytes of data from experiments in minutes rather than hours or days. This enables errors to be detected and addressed during experiments.
3) Key applications discussed include near-field high-energy X-ray diffraction microscopy, X-ray nano/microtomography, and determining crystal structures from diffuse scattering images through simulation and optimization. The workflows developed provide significant time savings and improved experimental outcomes.
Accelerating Discovery via Science ServicesIan Foster
[A talk presented at Oak Ridge National Laboratory on October 15, 2015]
We have made much progress over the past decade toward harnessing the collective power of IT resources distributed across the globe. In big-science projects in high-energy physics, astronomy, and climate, thousands work daily within virtual computing systems with global scope. But we now face a far greater challenge: Exploding data volumes and powerful simulation tools mean that many more--ultimately most?--researchers will soon require capabilities not so different from those used by such big-science teams. How are we to meet these needs? Must every lab be filled with computers and every researcher become an IT specialist? Perhaps the solution is rather to move research IT out of the lab entirely: to develop suites of science services to which researchers can dispatch mundane but time-consuming tasks, and thus to achieve economies of scale and reduce cognitive load. I explore the past, current, and potential future of large-scale outsourcing and automation for science, and suggest opportunities and challenges for today’s researchers. I use examples from Globus and other projects to demonstrate what can be achieved.
The Pacific Research Platform: A Regional-Scale Big Data Analytics Cyberinfra...Larry Smarr
National Ocean Exploration Forum 2017
Ocean Exploration in a Sea of Data
Calit2’s Qualcomm Institute
University of California, San Diego
October 21, 2017
Using the Pacific Research Platform for Earth Sciences Big DataLarry Smarr
Grand Challenge Lecture
Big Data and the Earth Sciences: Grand Challenges Workshop
Calit2’s Qualcomm Institute
University of California, San Diego
May 31, 2017
The document discusses the Pacific Research Platform (PRP), a distributed cyberinfrastructure that connects researchers and data across multiple campuses in California and beyond using optical fiber networking. Key points:
- The PRP uses high-speed networking infrastructure like the CENIC network to connect data generators and consumers across 15+ campuses, creating an integrated "big data freeway system".
- It deploys specialized data transfer nodes called FIONAs to enable high-speed transfer of large datasets between sites at near the full network speed.
- Recent additions include using Kubernetes to orchestrate containers across the PRP infrastructure and integrating machine learning resources through the CHASE-CI grant to support data-intensive AI applications.
An Integrated Science Cyberinfrastructure for Data-Intensive ResearchLarry Smarr
This document summarizes Dr. Larry Smarr's vision for an integrated science cyberinfrastructure to support data-intensive research. It discusses the exponential growth of digital data and need for dedicated high-bandwidth networks and data repositories. Specific examples are provided of initiatives at UCSD, regional optical networks connecting research institutions, and national projects like the Open Science Grid and Cancer Genomics Hub that are creating cyberinfrastructure to enable data-intensive scientific discovery.
Looking Back, Looking Forward NSF CI Funding 1985-2025Larry Smarr
This document provides an overview of the development of national research platforms (NRPs) from 1985 to the present, with a focus on the Pacific Research Platform (PRP). It describes the evolution of the PRP from early NSF-funded supercomputing centers to today's distributed cyberinfrastructure utilizing optical networking, containers, Kubernetes, and distributed storage. The PRP now connects over 15 universities across the US and internationally to enable data-intensive science and machine learning applications across multiple domains. Going forward, the document discusses plans to further integrate regional networks and partner with new NSF-funded initiatives to develop the next generation of NRPs through 2025.
High Performance Cyberinfrastructure for Data-Intensive ResearchLarry Smarr
This document summarizes a lecture given by Dr. Larry Smarr on high performance cyberinfrastructure for data-intensive research. The summary discusses:
1) The need for dedicated high-bandwidth networks separate from the shared internet to enable big data research due to the increasing volume of digital scientific data.
2) Extensions being made to networks like CENIC in California to provide campus "Big Data Freeways" connecting instruments, computing resources, and remote facilities.
3) The use of networks like HPWREN to provide high-performance wireless access for data-intensive applications in rural areas like astronomy, wildfire detection, and more.
The Pacific Research Platform:a Science-Driven Big-Data Freeway SystemLarry Smarr
The Pacific Research Platform will create a regional "Big Data Freeway System" along the West Coast to support science. It will connect major research institutions with high-speed optical networks, allowing them to share vast amounts of data and computational resources. This will enable new forms of collaborative, data-intensive research for fields like particle physics, astronomy, biomedicine, and earth sciences. The first phase aims to establish a basic networked infrastructure, with later phases advancing capabilities to 100Gbps and beyond with security and distributed technologies.
Pacific Research Platform Science DriversLarry Smarr
The document discusses the vision and progress of the Pacific Research Platform (PRP) in creating a "big data freeway" across the West Coast to enable data-intensive science. It outlines how the PRP builds on previous NSF and DOE networking investments to provide dedicated high-performance computing resources, like GPU clusters and Jupyter hubs, connected by high-speed networks at multiple universities. Several science driver teams are highlighted, including particle physics, astronomy, microbiology, earth sciences, and visualization, that will leverage PRP resources for large-scale collaborative data analysis projects.
A National Big Data Cyberinfrastructure Supporting Computational Biomedical R...Larry Smarr
Invited Presentation
Symposium on Computational Biology and Bioinformatics:
Remembering John Wooley
National Institutes of Health
Bethesda, MD
July 29, 2016
Global Research Platforms: Past, Present, FutureLarry Smarr
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive functioning. Exercise boosts blood flow, releases endorphins, and promotes changes in the brain which help regulate emotions and stress levels.
A California-Wide Cyberinfrastructure for Data-Intensive ResearchLarry Smarr
The document discusses creating a California-wide cyberinfrastructure for data-intensive research. It outlines efforts to connect all UC campuses and other research institutions across California with high-speed optical networks. This would create a "big data plane" to share large datasets. Several campuses have received NSF grants to upgrade their networks and implement Science DMZ architectures with 10-100Gbps connections to CENIC. Connecting these resources would provide researchers access to high-performance computing, large scientific instruments, and datasets. This would support collaborative big data science across disciplines like physics, climate modeling, genomics and microscopy.
The document provides an overview of the Pacific Research Platform (PRP) and discusses its role in connecting researchers across institutions and enabling new applications. It summarizes the PRP's key components like Science DMZs, Data Transfer Nodes (FIONAs), and use of Kubernetes for container management. Several examples are given of how the PRP facilitates high-performance distributed data analysis, access to remote supercomputers, and sensor networks coupled to real-time computing. Upcoming work on machine learning applications and expanding the PRP internationally is also outlined.
Similar to Creating a Big Data Machine Learning Platform in California (20)
My Remembrances of Mike Norman Over The Last 45 YearsLarry Smarr
Mike Norman has been a leader in computational astrophysics for over 45 years. Some of his influential work includes:
- Cosmic jet simulations in the early 1980s which helped explain phenomena from galactic centers.
- Pioneering the use of adaptive mesh refinement in the 1990s to achieve dynamic load balancing on supercomputers.
- Massive cosmology simulations in the late 2000s with over 100 trillion particles using thousands of processors across multiple supercomputing sites, producing petabytes of data.
- Developing end-to-end workflows in the 2000s to couple supercomputers, high-speed networks, and large visualization systems to enable real-time analysis of extremely large astrophysics simulations.
Metagenics How Do I Quantify My Body and Try to Improve its Health? June 18 2019Larry Smarr
Larry Smarr discusses quantifying his body and health over time through extensive self-tracking. He measures various biomarkers through regular blood tests and analyzes his gut microbiome by sequencing stool samples. This revealed issues like chronic inflammation and an unhealthy microbiome. Smarr then took steps like a restricted eating window and increasing plant diversity in his diet, which reversed metabolic syndrome issues and correlated with shifts in his microbiome ecology. His goal is to continue precisely measuring factors like toxins, hormones, gut permeability and food/supplement impacts to further optimize his health.
Panel: Reaching More Minority Serving InstitutionsLarry Smarr
This document discusses engaging more minority serving institutions (MSIs) in cyberinfrastructure development through regional networks. It provides data showing the importance of MSIs like historically black colleges and universities (HBCUs) in educating underrepresented minority students in STEM fields. Regional networks can help equalize opportunities by assisting MSIs in overcoming barriers to resources through training, networking infrastructure support, and helping institutions obtain necessary staffing and funding. Strategies mentioned include collaborating with MSIs on grants and addressing issues identified in surveys like lack of vision for data use beyond compliance. The goal is to broaden participation in STEAM fields by leveraging the success MSIs have shown in supporting underrepresented students.
Global Network Advancement Group - Next Generation Network-Integrated SystemsLarry Smarr
This document summarizes a presentation on global petascale to exascale workflows for data intensive sciences. It discusses a partnership convened by the GNA-G Data Intensive Sciences Working Group with the mission of meeting challenges faced by data-intensive science programs. Cornerstone concepts that will be demonstrated include integrated network and site resource management, model-driven frameworks for resource orchestration, end-to-end monitoring with machine learning-optimized data transfers, and integrating Qualcomm's GradientGraph with network services to optimize applications and science workflows.
Wireless FasterData and Distributed Open Compute Opportunities and (some) Us...Larry Smarr
This document discusses opportunities for ESnet to support wireless edge computing through developing a strategy around self-guided field laboratories (SGFL). It outlines several potential science use cases that could benefit from wireless and distributed computing capabilities, both in the short term through technologies like 5G, LoRa and Starlink, and longer term through the vision of automated SGFL. The document proposes some initial ideas for deploying and testing wireless edge computing technologies through existing projects to help enable the SGFL vision and further scientific opportunities. It emphasizes that exploring these emerging areas could help drive new science possibilities if done at a reasonable scale.
The Asia Pacific and Korea Research Platforms: An Overview Jeonghoon MoonLarry Smarr
This document provides an overview of Asia Pacific and Korea research platforms. It discusses the Asia Pacific Research Platform working group in APAN, including its objectives to promote HPC ecosystems and engage members. It describes the Asi@Connect project which provides high-capacity internet connectivity for research across Asia-Pacific. It also discusses the Korea Research Platform and efforts to expand it to 25 national research institutes in Korea. New related projects on smart hospitals, agriculture, and environment are mentioned. The conclusion discusses enhancing APAN and the Korea Research Platform and expanding into new areas like disaster and AI education.
Panel: Reaching More Minority Serving InstitutionsLarry Smarr
This document discusses engaging more minority serving institutions (MSIs) in the National Research Platform (NRP). It provides data showing that MSIs serve a disproportionate number of underrepresented minority students and are important producers of STEM graduates from these groups. The NRP can help broaden participation in STEAM fields by providing MSIs access to advanced cyberinfrastructure resources, new learning modalities, and opportunities for collaborative research between MSIs and other institutions. Regional networks also have a role to play in helping MSIs overcome barriers and attracting them to collaborative grants. The goal is to tear down walls between research and teaching and reinvent the university experience for more inclusive learning and innovation.
Panel: The Global Research Platform: An OverviewLarry Smarr
The document provides an overview of the Global Research Platform (GRP), an international collaborative partnership creating a distributed environment for data-intensive global science. The GRP facilitates high-performance data gathering, analytics, transport up to terabits per second, computing, and storage to support large-scale global science cyberinfrastructure ecosystems. It aims to orchestrate research across multiple domains using international testbeds for investigating new technologies related to data-intensive science. Examples of instruments generating exabytes of data that would benefit include the Korea Superconducting Tokamak, the High Luminosity LHC, genomics, the SKA radio telescope, and the Vera Rubin Observatory.
Panel: Future Wireless Extensions of Regional Optical NetworksLarry Smarr
CENIC is a non-profit organization that operates an 8,000+ mile fiber optic network connecting over 12,000 sites across California, including K-12 schools, universities, libraries, and research organizations. It has over 750 private sector partners and contributes over $100 million annually to the California economy. CENIC's network enables research and education collaborations, innovation, and economic growth statewide. It also operates a wireless research network called PRP that connects wireless sensors to supercomputers, supporting applications like wildfire modeling.
Global Research Platform Workshops - Maxine BrownLarry Smarr
The document announces a workshop on global research platforms that will be held virtually in 2021 and in Salt Lake City in 2022, with topics including large-scale science, next-generation platforms, data transport, and international testbeds. It also announces the 4th Global Research Platform Workshop to be held in October 2023 in Limassol, Cyprus co-located with the IEEE eScience 2023 conference.
EPOC and NetSage provide engagement and network monitoring services to support research and education. NetSage collects anonymized network flow data to help understand traffic patterns and troubleshoot performance issues. It provides dashboards and analysis to answer common questions from network engineers and end users. Examples of NetSage deployments and use cases were shown for the CENIC network, including top sources and destinations of traffic, debugging slow flows, and analyzing international traffic patterns by country over time.
The document discusses accelerating science discovery with AI inference-as-a-service. It describes showcases using this approach for high energy physics and gravitational wave experiments. It outlines the vision of the A3D3 institute to unite domain scientists, computer scientists, and engineers to achieve real-time AI and transform science. Examples are provided of using AI inference-as-a-service to accelerate workflows for CMS, ProtoDUNE, LIGO, and other experiments.
Democratizing Science through Cyberinfrastructure - Manish ParasharLarry Smarr
This document summarizes a presentation by Manish Parashar on democratizing science through cyberinfrastructure. The key points are:
1) Broad, fair, and equitable access to advanced cyberinfrastructure is essential for democratizing 21st century science, but there are significant barriers related to knowledge, technical issues, social factors, and balancing capabilities.
2) An advanced cyberinfrastructure ecosystem for all requires integrated portals, access to local and national resources through high-speed networks, diverse allocation modes, embedded expertise networks, and broad training.
3) Realizing this vision will require a scalable federated ecosystem with diverse capabilities and incentives for partnerships to meet growing needs for cyberinfrastructure and
Panel: Building the NRP Ecosystem with the Regional Networks on their Campuses;Larry Smarr
This document summarizes a panel discussion on building the National Research Platform ecosystem with regional networks. The panelists discussed how their regional networks are connecting to and using the Nautilus nodes of the NRP. Examples included using NRP for deep learning and computer vision research at the University of Missouri, challenges of adoption in Nevada and potential solutions, and Georgia Tech's new involvement through the Southern Crossroads regional network. The regional networks see opportunities to expand NRP access and training to enable more researchers in their regions to take advantage of the platform.
Open Force Field: Scavenging pre-emptible CPU hours* in the age of COVID - Je...Larry Smarr
The document discusses Open Force Field (OpenFF), an open-source project that enables rapid development of molecular force fields through automated infrastructure, open data and software, and an open science approach. OpenFF provides access to large quantum chemical datasets, runs quantum chemistry calculations on pre-emptible cloud resources with minimal human intervention, and facilitates easy iteration and testing of new force field hypotheses through an open development model.
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...Larry Smarr
The document discusses open infrastructure for an open society and the role of commercial clouds. It describes how the National Research Platform (NRP), Open Science Grid (OSG), and Open Science Data Federation (OSDF) provide open infrastructure through open source components that anyone can contribute to and use. It then discusses how Southwestern Oklahoma State University leveraged NRP resources on their campus and engaged students and local teachers. Finally, it outlines the pros and cons of commercial clouds, when they may be suitable to use, and how tools like CloudBank and Kubernetes can help facilitate science users' access to cloud resources.
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...Larry Smarr
The document discusses open infrastructure for an open society and the role of commercial clouds. It describes how the National Research Platform (NRP), Open Science Grid (OSG), and Open Science Data Federation (OSDF) provide open infrastructure through open source components that anyone can contribute to and use. It then discusses how Southwestern Oklahoma State University leveraged NRP resources on their campus and engaged students and local teachers. Finally, it outlines the pros and cons of commercial clouds, noting they provide huge capacity and variety but are very expensive for regular use. Facilitating science users on clouds requires services like CloudBank and Kubernetes federation.
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...Larry Smarr
The document discusses open infrastructure for an open society and the role of commercial clouds. It describes how the National Research Platform (NRP), Open Science Grid (OSG), and Open Science Data Federation (OSDF) provide open infrastructure through open source components that anyone can contribute to and use. It then discusses how Southwestern Oklahoma State University leveraged NRP resources on their campus and engaged students and local teachers. Finally, it outlines the pros and cons of commercial clouds, noting they provide huge capacity and variety but are very expensive for regular use. Facilitating science users on clouds requires tools for account management, documentation, and integrating cloud resources through HTCondor and Kubernetes.
Frank Würthwein - NRP and the Path forwardLarry Smarr
NRP will replace PRP and aims to democratize access to national research cyberinfrastructure. The long term vision is to create an open national cyberinfrastructure by federating resources across research institutions. Key innovations include an innovative network fabric, application libraries for FPGAs, a "bring your own resource" model, and innovative scheduling and data infrastructure. The NSF has funded the Prototype National Research Platform project to support NRP for the next 5 years. NRP aims to grow resources, introduce new capabilities, and be driven by the research community.
Evaluation and Identification of J'BaFofi the Giant Spider of Congo and Moke...MrSproy
ABSTRACT
The J'BaFofi, or "Giant Spider," is a mainly legendary arachnid by reportedly inhabiting the dense rain forests of
the Congo. As despite numerous anecdotal accounts and cultural references, the scientific validation remains more elusive.
My study aims to proper evaluate the existence of the J'BaFofi through the analysis of historical reports,indigenous
testimonies and modern exploration efforts.
Anti-Universe And Emergent Gravity and the Dark UniverseSérgio Sacani
Recent theoretical progress indicates that spacetime and gravity emerge together from the entanglement structure of an underlying microscopic theory. These ideas are best understood in Anti-de Sitter space, where they rely on the area law for entanglement entropy. The extension to de Sitter space requires taking into account the entropy and temperature associated with the cosmological horizon. Using insights from string theory, black hole physics and quantum information theory we argue that the positive dark energy leads to a thermal volume law contribution to the entropy that overtakes the area law precisely at the cosmological horizon. Due to the competition between area and volume law entanglement the microscopic de Sitter states do not thermalise at sub-Hubble scales: they exhibit memory effects in the form of an entropy displacement caused by matter. The emergent laws of gravity contain an additional ‘dark’ gravitational force describing the ‘elastic’ response due to the entropy displacement. We derive an estimate of the strength of this extra force in terms of the baryonic mass, Newton’s constant and the Hubble acceleration scale a0 = cH0, and provide evidence for the fact that this additional ‘dark gravity force’ explains the observed phenomena in galaxies and clusters currently attributed to dark matter.
Compositions of iron-meteorite parent bodies constrainthe structure of the pr...Sérgio Sacani
Magmatic iron-meteorite parent bodies are the earliest planetesimals in the Solar System,and they preserve information about conditions and planet-forming processes in thesolar nebula. In this study, we include comprehensive elemental compositions andfractional-crystallization modeling for iron meteorites from the cores of five differenti-ated asteroids from the inner Solar System. Together with previous results of metalliccores from the outer Solar System, we conclude that asteroidal cores from the outerSolar System have smaller sizes, elevated siderophile-element abundances, and simplercrystallization processes than those from the inner Solar System. These differences arerelated to the formation locations of the parent asteroids because the solar protoplane-tary disk varied in redox conditions, elemental distributions, and dynamics at differentheliocentric distances. Using highly siderophile-element data from iron meteorites, wereconstruct the distribution of calcium-aluminum-rich inclusions (CAIs) across theprotoplanetary disk within the first million years of Solar-System history. CAIs, the firstsolids to condense in the Solar System, formed close to the Sun. They were, however,concentrated within the outer disk and depleted within the inner disk. Future modelsof the structure and evolution of the protoplanetary disk should account for this dis-tribution pattern of CAIs.
Rodents, Birds and locust_Pests of crops.pdfPirithiRaju
Mole rat or Lesser bandicoot rat, Bandicotabengalensis
•Head -round and broad muzzle
•Tail -shorter than head, body
•Prefers damp areas
•Burrows with scooped soil before entrance
•Potential rat, one pair can produce more than 800 offspringsin one year
BIRDS DIVERSITY OF SOOTEA BISWANATH ASSAM.ppt.pptxgoluk9330
Ahota Beel, nestled in Sootea Biswanath Assam , is celebrated for its extraordinary diversity of bird species. This wetland sanctuary supports a myriad of avian residents and migrants alike. Visitors can admire the elegant flights of migratory species such as the Northern Pintail and Eurasian Wigeon, alongside resident birds including the Asian Openbill and Pheasant-tailed Jacana. With its tranquil scenery and varied habitats, Ahota Beel offers a perfect haven for birdwatchers to appreciate and study the vibrant birdlife that thrives in this natural refuge.
Candidate young stellar objects in the S-cluster: Kinematic analysis of a sub...Sérgio Sacani
Context. The observation of several L-band emission sources in the S cluster has led to a rich discussion of their nature. However, a definitive answer to the classification of the dusty objects requires an explanation for the detection of compact Doppler-shifted Brγ emission. The ionized hydrogen in combination with the observation of mid-infrared L-band continuum emission suggests that most of these sources are embedded in a dusty envelope. These embedded sources are part of the S-cluster, and their relationship to the S-stars is still under debate. To date, the question of the origin of these two populations has been vague, although all explanations favor migration processes for the individual cluster members. Aims. This work revisits the S-cluster and its dusty members orbiting the supermassive black hole SgrA* on bound Keplerian orbits from a kinematic perspective. The aim is to explore the Keplerian parameters for patterns that might imply a nonrandom distribution of the sample. Additionally, various analytical aspects are considered to address the nature of the dusty sources. Methods. Based on the photometric analysis, we estimated the individual H−K and K−L colors for the source sample and compared the results to known cluster members. The classification revealed a noticeable contrast between the S-stars and the dusty sources. To fit the flux-density distribution, we utilized the radiative transfer code HYPERION and implemented a young stellar object Class I model. We obtained the position angle from the Keplerian fit results; additionally, we analyzed the distribution of the inclinations and the longitudes of the ascending node. Results. The colors of the dusty sources suggest a stellar nature consistent with the spectral energy distribution in the near and midinfrared domains. Furthermore, the evaporation timescales of dusty and gaseous clumps in the vicinity of SgrA* are much shorter ( 2yr) than the epochs covered by the observations (≈15yr). In addition to the strong evidence for the stellar classification of the D-sources, we also find a clear disk-like pattern following the arrangements of S-stars proposed in the literature. Furthermore, we find a global intrinsic inclination for all dusty sources of 60 ± 20◦, implying a common formation process. Conclusions. The pattern of the dusty sources manifested in the distribution of the position angles, inclinations, and longitudes of the ascending node strongly suggests two different scenarios: the main-sequence stars and the dusty stellar S-cluster sources share a common formation history or migrated with a similar formation channel in the vicinity of SgrA*. Alternatively, the gravitational influence of SgrA* in combination with a massive perturber, such as a putative intermediate mass black hole in the IRS 13 cluster, forces the dusty objects and S-stars to follow a particular orbital arrangement. Key words. stars: black holes– stars: formation– Galaxy: center– galaxies: star formation
Hariyalikart Case Study of helping farmers in Biharrajsaurav589
Helping farmers all across India through our latest technologies of modern farming like drones for irrigation and best pest control For more visit : https://www.hariyalikart.com/case-study
Discovery of Merging Twin Quasars at z=6.05Sérgio Sacani
We report the discovery of two quasars at a redshift of z = 6.05 in the process of merging. They were
serendipitously discovered from the deep multiband imaging data collected by the Hyper Suprime-Cam (HSC)
Subaru Strategic Program survey. The quasars, HSC J121503.42−014858.7 (C1) and HSC J121503.55−014859.3
(C2), both have luminous (>1043 erg s−1
) Lyα emission with a clear broad component (full width at half
maximum >1000 km s−1
). The rest-frame ultraviolet (UV) absolute magnitudes are M1450 = − 23.106 ± 0.017
(C1) and −22.662 ± 0.024 (C2). Our crude estimates of the black hole masses provide log 8.1 0. ( ) M M BH = 3
in both sources. The two quasars are separated by 12 kpc in projected proper distance, bridged by a structure in the
rest-UV light suggesting that they are undergoing a merger. This pair is one of the most distant merging quasars
reported to date, providing crucial insight into galaxy and black hole build-up in the hierarchical structure
formation scenario. A companion paper will present the gas and dust properties captured by Atacama Large
Millimeter/submillimeter Array observations, which provide additional evidence for and detailed measurements of
the merger, and also demonstrate that the two sources are not gravitationally lensed images of a single quasar.
Unified Astronomy Thesaurus concepts: Double quasars (406); Quasars (1319); Reionization (1383); High-redshift
galaxies (734); Active galactic nuclei (16); Galaxy mergers (608); Supermassive black holes (1663)
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...PsychoTech Services
A proprietary approach developed by bringing together the best of learning theories from Psychology, design principles from the world of visualization, and pedagogical methods from over a decade of training experience, that enables you to: Learn better, faster!
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...
Creating a Big Data Machine Learning Platform in California
1. “Creating a Big Data
Machine Learning Platform in California”
Big Data Tech Forum:
Big Data Enabling Technologies and Applications
San Diego Chinese American Science and Engineering Association (SDCASEA)
Sanford Consortium
La Jolla, CA
December 2, 2017
Dr. Larry Smarr
Director, California Institute for Telecommunications and Information Technology
Harry E. Gruber Professor,
Dept. of Computer Science and Engineering
Jacobs School of Engineering, UCSD
http://lsmarr.calit2.net
1
2. Vision:
Use Optical Fiber to Connect
Big Data Generators and Consumers,
Creating a “Big Data” Freeway System
“The Bisection Bandwidth of a Cluster Interconnect,
but Deployed on a 20-Campus Scale.”
This Vision Has Been Building for 15 Years
3. We Have Been Working Towards Distributed Big Data for 15 Years:
NSF OptIPuter, Quartzite, Prism Awards
PI Papadopoulos,
2013-2015
PI Smarr,
2002-2009
PI Papadopoulos,
2004-2007
4. Based on Community Input and on ESnet’s Science DMZ Concept,
NSF Has Funded Over 100 Campuses to Build Local Big Data Freeways
Red 2012 CC-NIE Awardees
Yellow 2013 CC-NIE Awardees
Green 2014 CC*IIE Awardees
Blue 2015 CC*DNI Awardees
Purple Multiple Time Awardees
Source: NSF
5. Terminating the Fiber Optics - Data Transfer Nodes (DTNs):
Flash I/O Network Appliances (FIONAs)
UCSD Designed FIONAs
To Solve the Disk-to-Disk
Data Transfer Problem
For Big Data
at Full Speed
on 10G, 40G and 100G Networks
FIONAS—10/40G, $8,000
FIONette—1G, $1,000
Phil Papadopoulos, SDSC &
Tom DeFanti, Joe Keefe & John Graham, Calit2
John Graham, Calit2
6. How UCSD DMZ Network Transforms Big Data Microbiome Science:
Preparing for Knight/Smarr 1 Million Core-Hour Analysis
Knight Lab
FIONA
10Gbps
Gordon
Prism@UCSD
Data Oasis
7.5PB,
200GB/s
Knight 1024 Cluster
In SDSC Co-Lo
CHERuB
100Gbps
Emperor & Other Vis Tools
64Mpixel Data Analysis Wall
120Gbps
40Gbps
1.3Tbps
7. (GDC)
Logical Next Step: The Pacific Research Platform Creates
a Regional End-to-End Science-Driven “Big Data Superhighway” System
NSF CC*DNI Grant
$5M 10/2015-10/2020
PI: Larry Smarr, UC San Diego; Calit2
Co-PIs:
• Camille Crittenden, UC Berkeley CITRIS,
• Tom DeFanti, UC San Diego Calit2,
• Philip Papadopoulos, UCSD SDSC,
• Frank Wuerthwein, UCSD Physics and SDSC
Letters of Commitment from:
• 50 Researchers from 15 Campuses
• 32 IT/Network Organization Leaders
8. We Measure Disk-to-Disk Throughput with 10GB File Transfer
4 Times Per Day in Both Directions for All PRP Sites
January 29, 2016
From Start of Monitoring 12 DTNs
to 24 DTNs Connected at 10-40G
in 1 ½ Years
July 21, 2017
Source: John Graham, Calit2
9. PRP’s First 2 Years:
Connecting Multi-Campus Application Teams and Devices
10. Data Transfer Rates From UCSD Physics Building Servers
Across Campus and Then To Chicago’s Fermilab
Utilizing UCSD Prism Campus
Optical Network
Source: Frank Wuerthwein, UCSD, SDSC
11. 100 Gbps FIONA at UCSC Allows for Downloads to the UCSC Hyades Cluster
from the LBNL NERSC Supercomputer for DESI Science Analysis
300 images per night.
100MB per raw image
30GB per night
120GB per night
250 images per night.
530MB per raw image
150 GB per night
800GB per night
Source: Peter Nugent, LBNL
Professor of Astronomy, UC Berkeley
Precursors to
LSST and NCSA
NSF-Funded Cyberengineer
Shaw Dong @UCSC
Receiving FIONA
Feb 7, 2017
12. Cancer Genomics Hub (UCSC) Was Housed in SDSC, But NIH Moved Dataset
From SDSC to Uchicago - So the PRP Deployed a FIONA to Chicago’s MREN
1G
8G
Data Source: David Haussler,
Brad Smith, UCSC
15G
Jan 2016
13. 40G FIONAs
20x40G PRP-connected
WAVE@UC San Diego
PRP Now Enables
Distributed Virtual Reality
PRP
WAVE @UC Merced
Transferring 5 CAVEcam Images from UCSD to UC Merced:
2 Gigabytes now takes 2 Seconds (8 Gb/sec)
14. Five Examples of Earth Sciences Research Teams
That Are in the Early Stages of Using PRP
Frank Vernon - Expansion of HPWREN
Dan Cayan, Mike Dettinger
Regional Downscaling of Climate Models
Scott Sellars, Marty Ralph
Center for Western Weather and Water Extremes
John Delaney-
Undersea Cabled Observatory
Jules Jaffe,
Ocean Microscope
15. Director: F. Martin Ralph Website: cw3e.ucsd.edu
Big Data Collaboration with:
Source: Scott Sellers, CW3E
Collaboration on Atmospheric Water in the West
Between UC San Diego and UC Irvine
Director, Soroosh Sorooshian, UCSD Website http://chrs.web.uci.edu
16. Calit2’s FIONA
SDSC’s COMET
Calit2’s FIONA
Pacific Research Platform (10-100 Gb/s)
GPUsGPUs
Complete workflow time: 20 days20 hrs20 Minutes!
UC, Irvine UC, San Diego
Major Speedup in Scientific Work Flow
Using the PRP
Source: Scott Sellers, CW3E
17. Jaffe Lab (SIO) Scripps Plankton Camera
Off the SIO Pier with Fiber Optic Network
18. Over 300 Million Images So Far!
Requires Machine Learning for Automated Image Analysis and Classification
Phytoplankton: Diatoms
Zooplankton: Copepods
Zooplankton: Larvaceans
Source: Jules Jaffe, SIO
”We are using the FIONAs for image processing...
this includes doing Particle Tracking Velocimetry
that is very computationally intense.”-Jules Jaffe
19. New NSF CHASE-CI Grant Creates a Community Cyberinfrastructure
Adding a Machine Learning Layer Built on Top of the Pacific Research Platform
Caltech
UCB
UCI UCR
UCSD
UCSC
Stanford
MSU
UCM
SDSU
NSF Grant for High Speed “Cloud” of 256 GPUs
For 30 ML Faculty & Their Students at 10 Campuses
for Training AI Algorithms on Big Data
20. Adding GPUs to FIONAs
Supports Data Science Machine Learning
Multi-Tenant Containerized GPU JupyterHub
Running Kubernetes / CoreOS
Eight Nvidia GTX-1080 Ti GPUs
~$13K
32GB RAM, 3TB SSD, 40G & Dual 10G ports
Source: John Graham, Calit2
21. 48 GPUs for
OSG Applications
UCSD Adding >350 Game GPUs to Data Sciences Cyberinfrastructure -
Devoted to Data Analytics and Machine Learning
SunCAVE 70 GPUs
WAVE + Vroom 48 GPUs
FIONA with
8-Game GPUs
88 GPUs
for Students
CHASE-CI Grant Provides
96 GPUs at UCSD
for Training AI Algorithms on Big Data
22. The Rise of Brain-Inspired Computers:
Left & Right Brain Computing: Arithmetic vs. Pattern Recognition
Adapted from D-Wave
23. Calit2’s Qualcomm Institute Has Established a Pattern Recognition Lab
For Machine Learning on GPUs and von Neumann and NvN Processors
Source: Dr. Dharmendra Modha
Founding Director, IBM Cognitive Computing Group
August 8, 2014
UCSD ECE Professor Ken Kreutz-Delgado Brings
the IBM TrueNorth Chip
to Start Calit2’s Qualcomm Institute
Pattern Recognition Laboratory
September 16, 2015
24. Brain-Inspired Processors
Are Accelerating the Non-von Neumann Architecture Era
“On the drawing board are collections of 64, 256, 1024, and 4096 chips.
‘It’s only limited by money, not imagination,’ Modha says.”
Source: Dr. Dharmendra Modha
Founding Director, IBM Cognitive Computing Group
August 8, 2014
25. Our Pattern Recognition Lab is Exploring Mapping
Machine Learning Algorithm Families Onto Novel Architectures
Qualcomm
Institute
• Deep & Recurrent Neural Networks (DNN, RNN)
• Graph Theoretic
• Reinforcement Learning (RL)
• Clustering and other neighborhood-based
• Support Vector Machine (SVM)
• Sparse Signal Processing and Source Localization
• Dimensionality Reduction & Manifold Learning
• Latent Variable Analysis (PCA, ICA)
• Stochastic Sampling, Variational Approximation
• Decision Tree Learning
26. Next Step: Surrounding the UCSD Data Sciences Machine Learning Platform
With Clouds of GPUs and Non-Von Neumann Processors
Microsoft Installs Altera FPGAs
into Bing Servers &
384 into TACC for Academic Access
64-TrueNorth
Cluster
CHASE-CI64-bit GPUs
27. PRP Hosted
The First National Research Platform Workshop on August 7-8, 2017
Announced in I2 Closing Keynote:
Larry Smarr “Toward a National Big Data Superhighway”
on Wednesday, April 26.
Co-Chairs:
Larry Smarr, Calit2
& Jim Bottum, Internet2
150 Attendees
28. Expanding to the Global Research Platform
Via CENIC/Pacific Wave, Internet2, and International Links
PRP
PRP’s Current
International
Partners
Korea Shows Distance is Not the Barrier
to Above 5Gb/s Disk-to-Disk Performance
Netherlands
Guam
Australia
Korea
Japan
29. Our Support:
• US National Science Foundation (NSF) awards
CNS 0821155, CNS-1338192, CNS-1456638, CNS-1730158,
ACI-1540112, & ACI-1541349
• University of California Office of the President CIO
• UCSD Chancellor’s Integrated Digital Infrastructure Program
• UCSD Next Generation Networking initiative
• Calit2 and Calit2 Qualcomm Institute
• CENIC, PacificWave and StarLight
• DOE ESnet